bash - Replacing VCG1 or VCG2 with VCG* in perl script -


With the help of jaypal in a previous question (https://stackoverflow.com/a/25735444/3767980), I am able to format my restraints for both the ambiguous and unambiguous cases. Let's consider the ambiguous case here, as it is more difficult.

I have restraints like:

    g6n-d5c-?: (116.663, 177.052, 29.149) k87cd/e85cb/e94cb/h32cb/q21cb
    l12n-t11c-?: (128.977, 175.109, 174.412) k158c/h60c/a152c/n127c/y159c(noth60c)
    k14n-e13c-?: (117.377, 176.474, 29.823) i187cg1/v78cg2
    a75n-q74c-?: (123.129, 177.253, 23.513) v131cg1/v135cg1/v78cg1

and I subjected them to the following Perl script:

    #!/usr/bin/perl

    use strict;
    use warnings;
    use autodie;

    open my $fh, '<', $ARGV[0];

    while (<$fh>) {
        # residue number and atom name pairs from the last field
        my @values = map { /.(\d+)(\w+)/; $1, $2 } split '/', (split)[-1];
        # residue number and atom name of the second atom in the label
        my ( $resid, $name ) = /^[^-]+-.(\d+)(\w+)-/;
        print "assign (resid $resid , name $name ) (";
        print join ( " or ",
            map  { "resid $values[$_] , name $values[$_ + 1]" }
            grep { not $_ % 2 } 0 .. $#values
        );
        print " ) 3.5 2.5 8.5 ! $_";
    }

with the output:

    assign (resid 5 , name c ) (resid 87 , name cd or resid 85 , name cb or resid 94 , name cb or resid 32 , name cb or resid 21 , name cb ) 3.5 2.5 8.5 ! g6n-d5c-?: (116.663, 177.052, 29.149) k87cd/e85cb/e94cb/h32cb/q21cb
    assign (resid 11 , name c ) (resid 158 , name c or resid 60 , name c or resid 152 , name c or resid 127 , name c or resid 159 , name c ) 3.5 2.5 8.5 ! l12n-t11c-?: (128.977, 175.109, 174.412) k158c/h60c/a152c/n127c/y159c(noth60c)
    assign (resid 13 , name c ) (resid 187 , name cg1 or resid 78 , name cg2 ) 3.5 2.5 8.5 ! k14n-e13c-?: (117.377, 176.474, 29.823) i187cg1/v78cg2
    assign (resid 74 , name c ) (resid 131 , name cg1 or resid 135 , name cg1 or resid 78 , name cg1 ) 3.5 2.5 8.5 ! a75n-q74c-?: (123.129, 177.253, 23.513) v131cg1/v135cg1/v78cg1

  • What I need is the lines containing entries that begin with v, followed by 2 or 3 digits, and then cg1 or cg2, after the !. Examples are v78cg2 or v135cg1.
  • I need the restraints corresponding to these entries to be treated as a wildcard. I need the restraints to be returned like:

    assign (resid 5 , name c ) (resid 87 , name cd or resid 85 , name cb or resid 94 , name cb or resid 32 , name cb or resid 21 , name cb ) 3.5 2.5 8.5 ! g6n-d5c-?: (116.663, 177.052, 29.149) k87cd/e85cb/e94cb/h32cb/q21cb
    assign (resid 11 , name c ) (resid 158 , name c or resid 60 , name c or resid 152 , name c or resid 127 , name c or resid 159 , name c ) 3.5 2.5 8.5 ! l12n-t11c-?: (128.977, 175.109, 174.412) k158c/h60c/a152c/n127c/y159c(noth60c)
    assign (resid 13 , name c ) (resid 187 , name cg1 or resid 78 , name cg* ) 3.5 2.5 8.5 ! k14n-e13c-?: (117.377, 176.474, 29.823) i187cg1/v78cg2
    assign (resid 74 , name c ) (resid 131 , name cg* or resid 135 , name cg* or resid 78 , name cg* ) 3.5 2.5 8.5 ! a75n-q74c-?: (123.129, 177.253, 23.513) v131cg1/v135cg1/v78cg1

I need advice on selecting the matching lines and applying the transformation to the cluster of input (before the !). I can find the lines that match with the basic regex v.*cg[1-2].

I would like a solution within the above Perl script.

If anything is unclear, please comment. I am still new to this. Thank you in advance for any advice.

Here is a modified version of your script with an explanation of what is going on. my @values = map { ... } split '/', (split)[-1]; is a little tricky to understand, so I'll explain it separately:

map takes an array, applies whatever is within the braces to every member of the array, and outputs a new array. The two splits are used to chop up the line. If used without arguments, split takes $_ as input and splits it on whitespace. Therefore, the first split takes $_, the current line, and splits it on spaces:

    input: 'g6n-d5c-?: (116.663, 177.052, 29.149) k87cd/e85cb/e94cb/h32cb/q21cb'

    array created by calling split:
    'g6n-d5c-?:', '(116.663,', '177.052,', '29.149)', 'k87cd/e85cb/e94cb/h32cb/q21cb'

The second split chops its input on /; for its input, it uses the last item in the array created by the first split. That is, (split) is shorthand for "the array created by splitting $_ on whitespace", and (split)[-1] is the last element of that array.

    input: 'k87cd/e85cb/e94cb/h32cb/q21cb'

    array created by calling split "/":
    'k87cd', 'e85cb', 'e94cb', 'h32cb', 'q21cb'
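As a quick check, the two splits can be run on one of the sample lines. This is only a minimal sketch: the names $line, @fields and @residues are illustrative and do not appear in the original script, and the whitespace split is written out explicitly rather than relying on $_.

```perl
use strict;
use warnings;

# One sample restraint line from the question.
my $line = 'g6n-d5c-?: (116.663, 177.052, 29.149) k87cd/e85cb/e94cb/h32cb/q21cb';

# First split: on whitespace (the explicit form of a bare "split").
my @fields = split ' ', $line;

# Second split: chop the last whitespace-separated field on '/'.
my @residues = split '/', $fields[-1];

print scalar(@fields), " fields, last one: $fields[-1]\n";
print join(', ', @residues), "\n";
```

Writing the intermediate arrays out like this is longer than (split)[-1], but it makes each stage easy to inspect.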

The map command applies a regex to every member of the array:

    /.(\d+)(\w+)/; # match any character (.) followed by 1 or more digits (\d+)
                   # followed by 1 or more alphanumeric (\w+) characters.

The brackets capture the results into the read-only variables $1 and $2. The second statement in the map adds those captured strings to the array being created by the map command. By default, Perl puts the result of the last statement into the array, like this:

    my @arr = (1, 2, 3, 4);
    my @two_times = map { $_ * 2 } @arr;
    # @two_times is (2, 4, 6, 8)

(The "results" of the pattern match are $1 and $2, so the statement $1, $2 to add them to the @values array is not strictly necessary.)

So my @values = map { /.(\d+)(\w+)/; $1, $2 } @array captures the matches for each element in @array and puts them in @values.
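The point about the explicit $1, $2 being optional can be checked with a small sketch. The array contents are sample tokens from the question, and the names @explicit and @implicit are illustrative only.

```perl
use strict;
use warnings;

my @array = ('k87cd', 'e85cb', 'v78cg2');

# With an explicit "$1, $2" as the last statement of the block.
my @explicit = map { /.(\d+)(\w+)/; $1, $2 } @array;

# Without it: a match in list context already returns its captures.
my @implicit = map { /.(\d+)(\w+)/ } @array;

print "@explicit\n";   # 87 cd 85 cb 78 cg2
print "@implicit\n";   # same list
```

Both forms build the same flat list of residue number and atom name pairs; the explicit form just makes it more obvious what ends up in the array.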

I hope the rest of the script is understandable; if not, I recommend taking apart each command and using Data::Dumper to examine the results so you can work out what is going on.
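One way to take the nested expression apart, as a rough sketch: each stage is stored in its own array and dumped with Data::Dumper. The intermediate names @fields and @tokens are assumptions made for illustration; only @values appears in the script.

```perl
use strict;
use warnings;
use Data::Dumper;

# A sample line from the question, placed in $_ as the while loop would.
$_ = 'k14n-e13c-?: (117.377, 176.474, 29.823) i187cg1/v78cg2';

# Take the nested expression apart and keep each intermediate result.
my @fields = split;                          # whitespace split of $_
my @tokens = split '/', $fields[-1];         # residues in the last field
my @values = map { /.(\d+)(\w+)/ } @tokens;  # (number, atom name) pairs

# Dumper accepts several references and prints them as $VAR1, $VAR2, ...
print Dumper(\@fields, \@tokens, \@values);
```

Comparing the three dumps side by side shows where a line stops matching if the input format changes.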

To alter the script to treat vNNcg1 / vNNcg2 entries differently, I added a line to the map command that finds any residue matching that pattern and replaces it with vNNcg*. I also altered the matching regex to grab the appropriate pieces of the residue name and not grab inappropriate data (such as (notb28dg)). Here is the new script with comments:

    #!/usr/bin/perl

    use strict;
    use warnings;
    use feature ':5.10';
    use autodie;

    open my $fh, '<', $ARGV[0];

    while (<$fh>) {
        chomp;    # "say" below supplies the newline

        # brief guide to regexps:
        # \d     = digits
        # \w     = digits or letters or _
        # [ ]    = match any of the characters within these brackets
        # ( )    = capture the value in these brackets, save it as $1, $2, $3, etc.
        #          (brackets are also used for alternation, but not in this case)
        # *      = match 0 or more times
        # +      = match 1 or more times
        # \*     = match the character *
        # s/ / / = search and replace
        # /x     = ignore whitespace

        my @values = map {
            # find the pattern
            s/v         # v
              (\d+)     # 1 or more digits; brackets mean capture the value,
                        # which gets saved as the first capture
              cg        # cg
              [12]      # either 1 or 2
             /v$1cg*/x; # replace with v, the captured digits, cg, *

            # find the pattern
            /.               # any character
             (\d+)           # 1 or more digits; first capture
             ([a-z][\w\*]*)  # a letter followed by 0 or more alphanumerics or *
            /x;              # this value is the second capture

            # put $1 and $2 into the array we're building
            $1, $2
        } split '/', (split)[-1];

        my ( $resid, $name ) = /^[^-]+-.(\d+)(\w+)-/;
        # compose the new string
        my $str = "assign (resid $resid , name $name ) ("
        . join ( " or ",
            map  { "resid $values[$_] , name $values[$_ + 1]" }
            grep { not $_ % 2 } 0 .. $#values
        )
        . " ) 3.5 2.5 8.5 ! $_";
        # "say" prints the string to STDOUT and automatically adds a newline
        say $str;
    }

A short version of the 'core' of the script without comments:

    foreach (@data) {
        my @values = map {
            s/v(\d+)cg[12]/v$1cg*/; /.(\d+)([a-z][\w\*]*)/;
        } split '/', (split)[-1];
        my ( $resid, $name ) = /^[^-]+-.(\d+)(\w+)-/;
        say "assign (resid $resid , name $name ) ("
        . join ( " or ",
            map  { "resid $values[$_] , name $values[$_ + 1]" }
            grep { not $_ % 2 } 0 .. $#values
        )
        . " ) 3.5 2.5 8.5 ! $_";
    }
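To see what the substitution and the tightened match do to individual entries, here is a small stand-alone sketch run on sample tokens from the question. The names @tokens, @pairs and $t are illustrative only, and each token is copied before the substitution so the original string is left as it was.

```perl
use strict;
use warnings;

# Sample tokens taken from the restraint lines in the question.
my @tokens = ('i187cg1', 'v78cg2', 'v131cg1', 'y159c(noth60c)');

my @pairs;
for my $token (@tokens) {
    # Copy the token, then wildcard vNNcg1 and vNNcg2 on the copy.
    (my $t = $token) =~ s/v(\d+)cg[12]/v$1cg*/;

    # Residue number, then an atom name made of a letter followed by
    # alphanumerics or *; this stops before "(noth60c)".
    my ($resid, $name) = $t =~ /.(\d+)([a-z][\w*]*)/;
    push @pairs, "resid $resid , name $name";
}

print join(' or ', @pairs), "\n";
```

Only the entries beginning with v are rewritten to cg*; i187cg1 keeps its atom name, and the parenthesised note after y159c is ignored. Note that the substitution pattern is unanchored, so it relies on each token containing a single residue.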
