For old or new-style pdb files, try:

awk '$1!~/ATOM|HETATM/ || $3!~/^H/' old.pdb > new.pdb

But I understand some new H-atom names don't start with H, then this won't work. And if the number of atoms is large enough to fill the second field and fuse with HETATM in the first field, it won't work unless you define fixed field widths:

awk '$1!~/ATOM|HETATM/ || $3!~/^H/' \
FIELDWIDTHS="6 5 5 4 2 4 4 8 8 8 6 6" \
old.pdb > new.pdb

Maybe a specialized tool like phenix.refine is safest!

Pavel Afonine wrote:
Hi Phil,
ok, now I tried this:
egrep -v '^ATOM|HETATM.*H$' m.pdb > m_noH.pdb
Input (m.pdb - file right from PDB):
ATOM 1 N GLY A 1 0.504 -0.494 0.924 1.00 7.85 N
ATOM 2 CA GLY A 1 1.272 0.589 0.277 1.00 6.79 C
ATOM 3 C GLY A 1 1.700 1.614 1.301 1.00 5.59 C
ATOM 4 O GLY A 1 1.434 1.460 2.496 1.00 6.04 O
ATOM 0 H1 GLY A 1 0.408 -1.171 0.354 1.00 7.85 H
ATOM 0 H2 GLY A 1 0.939 -0.775 1.648 1.00 7.85 H
ATOM 0 H3 GLY A 1 -0.298 -0.189 1.160 1.00 7.85 H
ATOM 0 HA2 GLY A 1 2.052 0.220 -0.166 1.00 6.79 H
ATOM 0 HA3 GLY A 1 0.731 1.013 -0.407 1.00 6.79 H
END
Output (m_noH.pdb):
END
Just to be clear: neither of the two commands suggested so far worked on a valid PDB file (above). So I thought it might be useful to point this out.
All the best, Pavel
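An editorial aside, offered as a minimal sketch and not as something posted in the thread: in an extended regular expression, alternation binds more loosely than concatenation, so '^ATOM|HETATM.*H$' reads as "starts with ATOM" or "contains HETATM and ends in H". With -v that drops every ATOM record, which is consistent with the END-only output above. Grouping the record names is one possible fix. The file names and the whitespace-collapsed records below are made up for the demonstration, and grep -E is used as the equivalent of egrep.

```shell
# Sample records with whitespace collapsed (not column-exact PDB).
cat > group_demo.pdb <<'EOF'
ATOM 1 N GLY A 1 0.504 -0.494 0.924 1.00 7.85 N
ATOM 2 CA GLY A 1 1.272 0.589 0.277 1.00 6.79 C
ATOM 0 H1 GLY A 1 0.408 -1.171 0.354 1.00 7.85 H
ATOM 0 HA2 GLY A 1 2.052 0.220 -0.166 1.00 6.79 H
END
EOF

# Ungrouped: parsed as (^ATOM) | (HETATM.*H$), so every ATOM line is dropped.
grep -E -v '^ATOM|HETATM.*H$' group_demo.pdb > group_ungrouped.pdb

# Grouped: only ATOM/HETATM lines whose last character is H are dropped.
grep -E -v '^(ATOM|HETATM).*H$' group_demo.pdb > group_grouped.pdb

cat group_ungrouped.pdb   # prints only: END
cat group_grouped.pdb     # prints the N and CA records, then END
```

Even when grouped, this pattern relies on the element symbol being the very last character on the line. Trailing blanks or a charge field after the element would defeat it, and an element whose symbol ends in H (Rh, for example) would be removed as well.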
On 2/6/14, 12:21 PM, Phil Jeffrey wrote:
So I guess your contribution at this point in the thread is just to be as difficult as possible? I dare say if you use it on PROTIN format it won't work either.
Try it on a file that actually conforms to the PDB standard, with the element column at the end, not something that almost conforms to the standard, has non-distinct index numbers, and is apparently missing spaces on the GLY:N atom.
I would hope, modulo the usual list of bugs, that phenix.refine actually writes out something more closely resembling the correct format, in which case Tim's regular expression would actually work.
From the original post:
Actually i refine my structure with phenix along all hydrogen now
Phil Jeffrey Princeton
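For files that do carry the element field mentioned above, one option is to test that field by position instead of matching the end of the line. The sketch below assumes the standard fixed-column layout, in which the element symbol sits right-justified in columns 77-78. The helper pdb_line, the file names, and the mercury record are all invented for the illustration; awk is used here only as one possible tool.

```shell
# pdb_line is a hypothetical helper that pads its arguments into the
# 80-column ATOM/HETATM layout, so the element lands in columns 77-78.
# Arguments: record serial name resname chain resseq x y z occ b element
pdb_line() {
  printf '%-6s%5s %-4s %3s %1s%4s    %8s%8s%8s%6s%6s          %2s  \n' "$@"
}

{
  pdb_line ATOM   1 ' N'   GLY A 1 0.504 -0.494  0.924 1.00  7.85 N
  pdb_line ATOM   2 ' CA'  GLY A 1 1.272  0.589  0.277 1.00  6.79 C
  pdb_line ATOM   3 ' H1'  GLY A 1 0.408 -1.171  0.354 1.00  7.85 H
  pdb_line ATOM   4 ' HA2' GLY A 1 2.052  0.220 -0.166 1.00  6.79 H
  # Invented mercury atom: its name starts with H but its element is HG.
  pdb_line HETATM 5 'HG'   HG  A 2 5.000  5.000  5.000 1.00 20.00 HG
  echo END
} > fixedcol_demo.pdb

# Drop ATOM/HETATM records whose element field (columns 77-78) is exactly H.
awk '!(/^(ATOM  |HETATM)/ && substr($0, 77, 2) ~ /^ *H *$/)' \
  fixedcol_demo.pdb > fixedcol_noH.pdb

cat fixedcol_noH.pdb   # N, CA, the HG record, and END remain
```

A file without the element column, or with lines shorter than 78 characters, passes through this filter unchanged, so its hydrogens would be kept. In that case the atom name in columns 13-16 is the only thing left to test, with the naming caveats raised at the top of the thread.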
On 2/6/14 3:09 PM, Pavel Afonine wrote:
Thanks Phil,
did this:
egrep -v '^ATOM|HETATM.*H$' m.pdb > m_noH.pdb
Result:
in input file (m.pdb) I have:
ATOM 1 N GLY A 1 0.504 -0.494 0.924 1.00 7.85
ATOM 2 CA GLY A 1 1.272 0.589 0.277 1.00 6.79
ATOM 3 C GLY A 1 1.700 1.614 1.301 1.00 5.59
ATOM 4 O GLY A 1 1.434 1.460 2.496 1.00 6.04
ATOM 0 H1 GLY A 1 0.452 -1.280 0.308 1.00 7.85
ATOM 0 H2 GLY A 1 0.959 -0.765 1.772 1.00 7.85
ATOM 0 H3 GLY A 1 -0.420 -0.171 1.131 1.00 7.85
ATOM 0 HA2 GLY A 1 2.157 0.171 -0.225 1.00 6.79
ATOM 0 HA3 GLY A 1 0.659 1.070 -0.499 1.00 6.79
END
Output file (m_noH.pdb) contains only:
END
Pavel
On 2/6/14, 12:03 PM, Phil Jeffrey wrote:
Of course, because in the shells that I use it will attempt to do variable name substitution in strings that are double-quoted. (I make no warranties about all possible shells). However if you use single quotes:
egrep -v '^ATOM|HETATM.*H$' your.pdb > your_noH.pdb
it should work just fine in tcsh and csh, at the very least.
Phil
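The general quoting rule can be sketched as follows. This is only an illustration and assumes a POSIX-style shell such as sh or bash; csh and tcsh differ in detail, as the error reported in this thread shows. The variable H and the output file names are hypothetical and exist only to make the expansion visible.

```shell
# Assumption: POSIX-style shell (sh, bash), not csh/tcsh.
H=oops   # hypothetical variable, defined only to make expansion visible

single='^ATOM.*$H'   # single quotes: every character stays literal
double="^ATOM.*$H"   # double quotes: the shell replaces $H before grep sees it

printf '%s\n' "$single" > quote_single.txt
printf '%s\n' "$double" > quote_double.txt

cat quote_single.txt   # the pattern exactly as typed
cat quote_double.txt   # the pattern with $H already substituted
```

Keeping regular expressions in single quotes avoids depending on how a particular shell treats a `$` inside double quotes.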
On 2/6/14 2:52 PM, Pavel Afonine wrote:
Hi Tim,
On 2/6/14, 10:52 AM, Tim Gruene wrote:
the simple and quick command
egrep -v "^ATOM|HETATM.*H$" your.pdb > your_noH.pdb
should also work.
just out of curiosity I did (copy-paste of your example)
egrep -v "^ATOM|HETATM.*H$" m.pdb > m_noH.pdb
and I got:
Illegal variable name.
Pavel
_______________________________________________ phenixbb mailing list [email protected] http://phenix-online.org/mailman/listinfo/phenixbb