Does refine restrain side chain rotamers to, say, the closest one from the rotamer dictionary or, perhaps, the starting coordinates (I find this less likely, of course)? I am particularly interested in what would happen if a side chain is disordered and there is no strong electron density to support a specific conformation. Do I understand correctly that the real-space refinement for the side chains is on by default, which means that disordered side chains would tend to stay within their biased density?
I apologize if I missed the description of this in the manual; if so, just point me to the right section.
Cheers,
Ed.
--
"I'd jump in myself, if I weren't so good at whistling."
Julian, King of Lemurs
On Wed, Mar 23, 2011 at 10:17 AM, Ed Pozharski wrote:
Does refine restrain side chain rotamers to, say, the closest one from the rotamer dictionary or, perhaps, the starting coordinates (I find this less likely, of course)?
No, except to the extent that the standard monomer library dihedral restraints will often drive sidechains to rotameric conformations. I've been playing with multi-angle restraints using e.g. the Molprobity distributions, similar to what we've done for Ramachandran restraints, but nothing working yet.
Do I understand correctly that the real-space refinement for the side chains is on by default, which means that disordered side chains would tend to stay within their biased density?
Real-space refinement for the entire structure is on by default (strategy=individual_sites_real_space, subject to resolution and R-factor cutoffs) - individual rotamer fitting needs to be explicitly enabled (fix_rotamers=True). -Nat
Hi Ed,
you are asking the right question. We discussed the issue of residue side chains partially or fully lacking density at the PHENIX developers meeting last week. This was brought up by an industrial consortium user. Some people tend to chop them off (which I do not not not like), some let the refinement programs do what they want, and the programs stupidly tend to stick these side chains into whatever density is available nearby (this is bad too).
I think we decided that I have to do something about it, probably the following: if a side chain lacks density and the full rotamer search does not find any density, then
a) set the occupancies of the affected atoms to zero, and
b) move the side chain to the most probable rotameric state that generates the fewest clashes with surrounding atoms and may still have a reasonable density fit (even though the density is bad).
I will work on this. That's for sure.
Pavel.
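Editorial sketch: the fallback Pavel outlines can be written down as a small selection rule. This is only an illustration under stated assumptions, not phenix.refine code. The `RotamerCandidate` fields, the `min_density_fit` threshold and the tie-breaking order (fewest clashes, then prior probability, then density fit) are all hypothetical choices made for the example, and the numbers are made up.

```python
from dataclasses import dataclass


@dataclass
class RotamerCandidate:
    name: str            # hypothetical rotamer label
    probability: float   # prior probability from a rotamer library
    clashes: int         # steric clashes with surrounding atoms
    density_fit: float   # e.g. a map correlation; higher is better


def plan_side_chain(candidates, min_density_fit=0.5):
    """Return (chosen candidate, occupancy) for one side chain.

    If any candidate fits the density, keep the best-fitting one at full
    occupancy. Otherwise fall back to the candidate with the fewest clashes,
    breaking ties by prior probability and then by density fit, and flag it
    with zero occupancy.
    """
    if not candidates:
        raise ValueError("no rotamer candidates supplied")
    fitted = [c for c in candidates if c.density_fit >= min_density_fit]
    if fitted:
        return max(fitted, key=lambda c: c.density_fit), 1.0
    fallback = min(candidates,
                   key=lambda c: (c.clashes, -c.probability, -c.density_fit))
    return fallback, 0.0


# Made-up numbers for a side chain with no convincing density.
demo = [
    RotamerCandidate("r1", probability=0.40, clashes=2, density_fit=0.10),
    RotamerCandidate("r2", probability=0.25, clashes=0, density_fit=0.05),
    RotamerCandidate("r3", probability=0.10, clashes=0, density_fit=0.20),
]
chosen, occupancy = plan_side_chain(demo)
print(chosen.name, occupancy)
```

Ranking clashes ahead of probability is one reading of "highest probable rotameric state that generates least clashes"; swapping the first two key entries gives the other reading.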
On Wed, 2011-03-23 at 14:39 -0700, Pavel Afonine wrote:
Some people tend to chop them off (which I do not not not like), some let the refinement programs do what they want, and the programs stupidly tend to stick these sidechains in whatever density is available around (this is bad too).
Yes, there are basically three options:
1) cut the side chain down to whatever is still visible in the density
2) let the refinement proceed as is
3) set side chain atom occupancies to zero
Personally, I have evolved from 2) to 1). The argument for 2) was that omitting atoms will confuse some end users, and that refinement will essentially take care of it by increasing the B-factors. Since then I have seen some quite convincing evidence that disordered side chains do have a detectable effect on the rest of the model, and thus leaving them in makes the model worse.
The other half of the argument was of a semantic nature and referred more to replacing disordered residues with alanines, which is silly because we know from the sequence it's not alanine and we know the atoms must be in the vicinity. But on the other hand we sometimes exclude whole segments of sequence, such as disordered loops and termini, so why not side chains? Missing atoms are just that: missing from the density. Chemically it is still a lysine, according to the residue name.
The third option (the one you gravitate towards) seems problematic to me for the following reasons. The meaning of the occupancy is that the atom distribution in space is multimodal, and the atom spends a certain fraction of time vibrating around the specified position. So what is the meaning of zero occupancy? This is the average atomic position, but it spends zero time here? That makes no physical sense, and in fact it is wrong, since there is some non-zero probability that the disordered side chain will occupy the designated conformation. Of course, a structural model may be considered a *mathematical* model, and it does not have to be strictly interpretable (or interpretable the way I see most logical, anyway).
As for the end-user argument, I would say that omitted atoms are better than high B-factors or zero occupancies, as they are more likely to attract attention to the fact that the side chain conformation is undefined (coot, however, does highlight zero occupancy atoms).
It is true though that the "end-user argument" is inherently weak, since nobody can be held responsible for self-inflicted damage due to lack of knowledge. Except McDonald's and microwave oven manufacturers, of course.
--
Hurry up, before we all come back to our senses!
Julian, King of Lemurs
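Editorial sketch: options 1) and 3) above differ only in what is done to the same set of ATOM records, which a few lines of Python can make concrete (option 2 leaves the file untouched). This is a minimal illustration, assuming standard fixed-column PDB records without hydrogens or alternate conformations. Cutting everything beyond CB is a simplification of "whatever is still visible", and the helper names and the lysine coordinates are invented for the example.

```python
BACKBONE_AND_CB = {"N", "CA", "C", "O", "OXT", "CB"}


def is_distal_side_chain_atom(line, residues):
    """True for ATOM records beyond CB in one of the selected residues."""
    if not line.startswith("ATOM"):
        return False
    residue = (line[21], int(line[22:26]))      # chain ID, residue number
    return residue in residues and line[12:16].strip() not in BACKBONE_AND_CB


def truncate(lines, residues):
    """Option 1: drop the side-chain atoms beyond CB."""
    return [ln for ln in lines if not is_distal_side_chain_atom(ln, residues)]


def zero_occupancy(lines, residues):
    """Option 3: keep the atoms but set occupancy (columns 55-60) to 0.00."""
    return [ln[:54] + "  0.00" + ln[60:]
            if is_distal_side_chain_atom(ln, residues) else ln
            for ln in lines]


def atom_record(serial, name, resname, chain, resseq, xyz, element):
    """Format a fixed-column ATOM record with occupancy 1.00 and B = 30.00."""
    return ("ATOM  %5d %-4s %3s %s%4d    %8.3f%8.3f%8.3f%6.2f%6.2f          %2s"
            % (serial, name, resname, chain, resseq, *xyz, 1.0, 30.0, element))


# A made-up partial lysine; the coordinates are placeholders, not real data.
names = [" N", " CA", " C", " O", " CB", " CG", " CD"]
demo = [atom_record(i + 1, n, "LYS", "A", 10, (float(i), 0.0, 0.0), n.strip()[0])
        for i, n in enumerate(names)]
selected = {("A", 10)}

print(len(truncate(demo, selected)), "atoms kept after truncation")
for record in zero_occupancy(demo, selected):
    print(record)
```

Either way the residue name stays LYS, which is the point of contention in the rest of the thread.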
Yes, there are basically three options:
1) cut the side chain down to whatever is still visible in the density
2) let the refinement proceed as is
3) set side chain atom occupancies to zero
Personally, I have evolved from 2) to 1). The argument was that omitting atoms will confuse some end users,
There is an endless list of ways to confuse the end user, so I guess I put that aside and assume I am dealing with an educated individual.
and refinement will essentially take care of it by increasing the B-factors.
Yes, stupid refinement would probably do it. phenix.refine will not do it since zero occupancy atoms will not contribute to the scattering, and their B-factors will be roughly similar to those of neighbor atoms.
I have seen some quite convincing evidence since that the disordered side chains do have a detectable effect on the rest of the model, and thus leaving them in makes a model worse.
I've seen both.
Other half of the argument was of semantic nature and referred more to replacing disordered residues with alanines, which is silly
It is silly but honest. If you call something like this TYR
ATOM    134  N   TYR A  19      21.657 -76.614  65.963  1.00 28.50      A    N
ATOM    135  CA  TYR A  19      23.064 -76.802  65.641  1.00 27.23      A    C
ATOM    136  CB  TYR A  19      23.231 -77.079  64.157  1.00 30.04      A    C
ATOM    137  C   TYR A  19      23.816 -75.537  66.027  1.00 27.07      A    C
ATOM    138  O   TYR A  19      23.265 -74.434  65.976  1.00 24.32      A    O
that would be weird too. Call it then "handicapped TYR" -;) And I guess seeing something like this is confusing for the end user too (especially one who is still learning). If I saw something like this, my first guess would be "someone messed up the file while doing copy-paste".
because we know from sequence it's not alanine
Yes, we know this. But before we really know it, we need to:
1) extract the sequence from the PDB file;
2) get the correct sequence;
3) align them and look for mismatches;
4) distinguish between occasional model-building errors and intentional changes (due to ALA truncation).
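Editorial sketch: steps 1) to 3) above are mechanical and can be illustrated with the Python standard library. This is a rough sketch, assuming fixed-column ATOM records with one CA per residue (no alternate conformations). `difflib` is a generic sequence matcher rather than a proper biological alignment tool, and the helper names and the five-residue example are invented for illustration. Step 4) still needs a human.

```python
import difflib

THREE_TO_ONE = {
    "ALA": "A", "ARG": "R", "ASN": "N", "ASP": "D", "CYS": "C",
    "GLN": "Q", "GLU": "E", "GLY": "G", "HIS": "H", "ILE": "I",
    "LEU": "L", "LYS": "K", "MET": "M", "PHE": "F", "PRO": "P",
    "SER": "S", "THR": "T", "TRP": "W", "TYR": "Y", "VAL": "V",
}


def sequence_from_pdb(lines, chain):
    """Step 1: one-letter sequence of a chain, read from its CA atoms."""
    seq = []
    for ln in lines:
        if ln.startswith("ATOM") and ln[12:16].strip() == "CA" and ln[21] == chain:
            seq.append(THREE_TO_ONE.get(ln[17:20], "X"))
    return "".join(seq)


def mismatches(model_seq, reference_seq):
    """Steps 2-3: align the two sequences and list the differing stretches."""
    matcher = difflib.SequenceMatcher(None, model_seq, reference_seq,
                                      autojunk=False)
    return [(tag, i1, model_seq[i1:i2], reference_seq[j1:j2])
            for tag, i1, i2, j1, j2 in matcher.get_opcodes() if tag != "equal"]


def ca_record(serial, resname, chain, resseq):
    """Minimal CA record; only the columns read above are filled in."""
    return "ATOM  %5d  CA  %3s %s%4d" % (serial, resname, chain, resseq)


# Invented example: a glutamate that was renamed to ALA after truncation.
residues = ["MET", "LYS", "ALA", "TYR", "LEU"]
demo_lines = [ca_record(i + 1, r, "A", i + 1) for i, r in enumerate(residues)]
model = sequence_from_pdb(demo_lines, "A")
print(model, mismatches(model, "MKEYL"))
```

The output only says where the model and the expected sequence disagree; whether a given ALA is a building error or a deliberate truncation is exactly the ambiguity Pavel describes.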
The third option (the one you gravitate towards) seems problematic to me for the following reasons. The meaning of the occupancy is that the atom distribution in space is multimodal, and it spends certain fraction of time vibrating around the specified position. So what is the meaning of zero occupancy? This is the average atomic position, but it spends zero time here? Makes no physical sense, and in fact is wrong since there is some non-zero probability that the disordered side chain will occupy the designated conformation. Of course, structural model may be considered a *mathematical* model, and it does not have to be strictly interpretable (or interpretable the way I see most logical, anyway).
This is a valid argument, I agree. Better, one would need to run a bunch of identical refinements and obtain an ensemble that would tell you (more or less) the uncertainty and degree of confidence for each atom: http://cci.lbl.gov/~afonine/p2.png That would be a step towards a better option than setting the occupancy to zero.
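Editorial sketch: one simple way to turn such an ensemble into a per-atom number is the RMS deviation of each atom from its mean position across the refined models. The code below is a minimal sketch under the assumption that every model lists the same atoms in the same order. How the refinements are made to differ (random seed, small starting perturbations) is not specified in the thread, and the function name and toy coordinates are invented here.

```python
import math


def per_atom_spread(models):
    """RMS deviation of each atom from its ensemble-mean position.

    models[m][i] is the (x, y, z) tuple of atom i in model m.
    """
    if not models:
        raise ValueError("need at least one model")
    n_models = len(models)
    n_atoms = len(models[0])
    if any(len(m) != n_atoms for m in models):
        raise ValueError("all models must contain the same atoms")
    spreads = []
    for i in range(n_atoms):
        mean = [sum(m[i][k] for m in models) / n_models for k in range(3)]
        msd = sum(sum((m[i][k] - mean[k]) ** 2 for k in range(3))
                  for m in models) / n_models
        spreads.append(math.sqrt(msd))
    return spreads


# Toy ensemble: atom 0 is well defined, atom 1 wanders between models.
models = [
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
    [(0.0, 0.0, 0.0), (-1.0, 0.0, 0.0)],
    [(0.0, 0.0, 0.0), (0.0, 0.0, 0.0)],
]
spreads = per_atom_spread(models)
print(["%.3f" % s for s in spreads])
```

A large spread flags atoms whose position the data do not pin down, without touching the occupancy field.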
As for end-user argument, I would say that omitted atoms are better than high B-factors or zero occupancies,
I exclude high B-factors as an option because if the program does it then it is a bug that must be fixed. I agree, making the occupancy zero is kind of abusing it in order to say "I do not see this atom". But so far I have no feeling about which is more confusing:
- setting the occupancy to zero in order to say "I don't see it in the map", or
- calling something TYR (or whatever else) that, according to its atom content, is NOT TYR but ALA.
Finally, when we model data at 4-6 Å resolution or so, why do we stick atoms into those tubes of density? Do these densities really tell you where a specific atom, or often even a residue, is?
Pavel.
I have used both the zero occupancy and removal of side-chain atom options for dealing with side chains with no/little observable density.
I can see that, from the non-crystallographer end-user perspective, the zero occupancy option is probably preferred, because at least the side-chain atoms are preserved and readily identifiable when working in PyMOL, CCP4MG, Coot etc.
However, I'm fairly sure that if you deposit data in the PDB with zero occupancy atoms, the curators remove the atoms anyway and stick them in one of the header comments - true?
Which kind of argues for the removal of the atoms anyway...
Tony.
_______________________________________________ phenixbb mailing list [email protected] http://phenix-online.org/mailman/listinfo/phenixbb
However I'm fairly sure though that if you deposit data in the PDB with zero occupancy atoms, the curators remove the atoms anyway and stick them in one of the header comments - true?
Interesting... I do not know whether it happens or not, but if it does happen then it is really terrible. As the author of the structure I am responsible for its content, and if someone else manipulates the structure I would not like to be responsible for that. So I guess this kind of manipulation (again, if it takes place) should be documented in the PDB file header along with the name of the person who did it (well, in that case he or she becomes a co-author! -:) )
Pavel.
The PDB does not modify the model in this way. They will run their validation checks on the deposited model and notify the depositor of anything thought to be strange, including missing atoms, but if the depositor signs off on it the file is put in the database. This is a debate that goes back many years and the PDB has no interest in taking sides.
Personally I place atoms when I have some idea where to put them. If someone needs the full side chain to perform electrostatic calculations, the method would probably lead towards some favored way of placing the floppy bits (perhaps taking into account the location of the other charges in the area). If someone needs to place atoms for a packing analysis, then they may want to create placements for the disordered atoms, but would have different criteria for the task than the person interested in electrostatics. I am a crystallographer, not a mind reader. I look at density (and other information) and do my best to build a model.
Of the possible ways to represent this common situation I think placing a side chain with occupancy set to zero is the worst. For the naive user (and that is not necessarily pejorative) it will lead them to view the placement of those atoms as unwarrantably confident. For those with a deep understanding of the meaning of the parameters it is flat wrong - it says that I know that this atom is never in this location.
If you want to be precise for the knowledgeable user you need to put in a non-zero occupancy but assign it a very large sigma on your SIGATM record for that atom. Of course when naive users see SIGATM records I expect their heads will explode.
Dale Tronrud
On 12:29 Thu 24 Mar , Dale Tronrud wrote:
Of the possible ways to represent this common situation I think placing a side chain with occupancy set to zero is the worst. For the naive user (and that is not necessarily pejorative) it will lead them to view the placement of those atoms as unwarrantably confident. For those with a deep understanding of the meaning of the parameters it is flat wrong - it says that I know that this atom is never in this location.
If you want to be precise for the knowledgeable user you need to put in a non-zero occupancy but assign it a very large sigma on your SIGATM record for that atom. Of course when naive users see SIGATM records I expect their heads will explode.
Has anyone thought about putting them into a second MODEL record instead? That way, they're present for people who want them, but aren't shown by default in programs like PyMol. No clue how refinement programs would deal with this.
--
Thanks,
Donnie
Donald S. Berkholz, Ph.D.
Research Fellow
James R. Thompson lab, Physiology & Biomedical Engineering
Grazia Isaya lab, Pediatric & Adolescent Medicine
Medical Sciences 2-66
Mayo Clinic College of Medicine
200 First Street SW
Rochester, MN 55905
office: 507-538-6924
cell: 612-991-1321
That solution has two problems. First, it would violate the definition of MODEL: each MODEL is supposed to be a complete, alternative model fitting the data. Second, what do you do when you do have alternative models, each with floppy bits? There isn't a second level of MODEL.
The more I think about it the better I like SIGATM cards. Their usage isn't as I suggested in my previous posting, of increasing the sigma of the occupancy. Instead you place the atom somewhere reasonable and give the xyz a large sigma. It has the advantage of using only long-established PDB syntax and using it for exactly what it was intended. In addition, it is the right answer.
The main argument against my previously favorite idea of not building floppy atoms was that the absence of these atoms from the model indicates that I have no idea where they are. In fact I do. They can't wander far from the ordered atoms they are covalently bonded to. Put each atom in a reasonable place and give its positional parameters a 5 A sigma.
Of course these sigmas would have to be calculated for all the atoms in the model. George Sheldrick would advise you on the calculation but, if you have an aversion to inverting big matrices, Cruickshank came up with an empirical formula about 10 years ago that might be useful.
The PDB format does not have to be changed, but we would have to crusade to get software to begin to pay attention to SIGATM. A program like Coot could have the option of showing spheres whose size is related to the atom's uncertainty in location. An electrostatic calculation program would have a direct indication of how far it could move each charged atom to optimize the electrostatics while still being consistent with the original x-ray model. This seems to be a win-win solution.
Dale Tronrud
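Editorial sketch: for readers who have never seen one, a SIGATM record mirrors the ATOM record it follows, with the coordinate, occupancy and B-factor fields replaced by their standard deviations. The function below is a hedged illustration of Dale's "place the atom and give the xyz a 5 A sigma" idea. The column layout is the one recalled from older versions of the PDB format guide and should be checked against the format version you target; the function name is invented. The atom is the TYR CB quoted earlier in the thread, re-laid out in standard columns.

```python
def sigatm_record(atom_line, sigma_xyz, sigma_occ=0.0, sigma_b=0.0):
    """Build a SIGATM record matching one ATOM or HETATM record.

    Columns 7-30 (serial, atom name, residue, chain, number) and 67-80
    (segment, element, charge) are copied; the coordinate, occupancy and
    B-factor fields are replaced by their standard deviations. The same
    sigma is used for x, y and z.
    """
    if not atom_line.startswith(("ATOM", "HETATM")):
        raise ValueError("expected an ATOM or HETATM record")
    padded = atom_line.ljust(80)
    sigmas = "%8.3f%8.3f%8.3f%6.2f%6.2f" % (
        sigma_xyz, sigma_xyz, sigma_xyz, sigma_occ, sigma_b)
    return "SIGATM" + padded[6:30] + sigmas + padded[66:80]


# The TYR CB from earlier in the thread, formatted in fixed columns.
atom = ("ATOM  %5d %-4s %3s %s%4d    %8.3f%8.3f%8.3f%6.2f%6.2f          %2s"
        % (136, " CB", "TYR", "A", 19, 23.231, -77.079, 64.157, 1.00, 30.04, "C"))
record = sigatm_record(atom, sigma_xyz=5.0)
print(atom)
print(record)
```

As Dale notes, the hard part is not writing the record but estimating sensible sigmas and getting downstream software to read them.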
Dear all, I have had some personal off-line communication with the PDB about the current policy on zero-occupancy atoms. I thought it would be useful to disseminate the information to the wider community... I have just taken the key points and listed them below...(hopefully without changing the intended meaning!)
The PDB deprecates the use of zero occupancy atoms, but they are certainly allowed. If present, they are annotated in REMARK 475 (zero occupancy residues) or REMARK 480 (zero occupancy atoms). The preferred action is for them to be removed, with the missing atoms then documented in REMARK 470. At no point should this be imposed on the authors; it is by agreement. The policy is just a preference, not an enforced change to the coordinates.
You can check the policies at the wwPDB, http://www.wwpdb.org/procedure.html (search for the section "Zero occupancy residues (REMARK 475) and atoms (REMARK 480)"), which indicates that zero occupancy atoms are allowed in the format.
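Editorial sketch: before depositing, it can help to list which atoms in a model currently carry zero occupancy and whether they make up whole residues or only parts of them, since those two cases are annotated under different remarks. The snippet below is a minimal sketch assuming fixed-column ATOM/HETATM records and ignoring alternate conformations and insertion codes; the function and helper names and the example residues are invented, and this is not a wwPDB tool.

```python
from collections import defaultdict


def zero_occupancy_report(lines):
    """Split zero-occupancy atoms into whole residues and partial residues."""
    atoms = defaultdict(list)           # residue -> [(atom name, occupancy)]
    for ln in lines:
        if ln.startswith(("ATOM", "HETATM")):
            residue = (ln[21], int(ln[22:26]), ln[17:20])
            atoms[residue].append((ln[12:16].strip(), float(ln[54:60])))
    whole, partial = [], {}
    for residue, entries in atoms.items():
        zeroed = [name for name, occ in entries if occ == 0.0]
        if zeroed and len(zeroed) == len(entries):
            whole.append(residue)       # candidate for REMARK 475
        elif zeroed:
            partial[residue] = zeroed   # candidate for REMARK 480
    return whole, partial


def atom_record(serial, name, resname, resseq, occ):
    """Fixed-column ATOM record in chain A with placeholder coordinates."""
    return ("ATOM  %5d %-4s %3s A%4d    %8.3f%8.3f%8.3f%6.2f%6.2f"
            % (serial, name, resname, resseq, 0.0, 0.0, 0.0, occ, 30.0))


# Invented example: a serine with a zeroed side chain, a fully zeroed glycine.
demo = [atom_record(i + 1, n, "SER", 5, occ) for i, (n, occ) in enumerate(
            [(" N", 1.0), (" CA", 1.0), (" C", 1.0), (" O", 1.0),
             (" CB", 0.0), (" OG", 0.0)])]
demo += [atom_record(i + 7, n, "GLY", 6, 0.0)
         for i, n in enumerate([" N", " CA", " C", " O"])]
whole, partial = zero_occupancy_report(demo)
print("whole residues:", whole)
print("partial residues:", partial)
```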
I keenly await the (hopefully) consensus opinion on what we, as crystallographers, should do with residues with poor or absent side-chain density.
Tony.
--
Dr Antony W. Oliver
Senior Research Fellow
Cancer Research UK DNA Repair Enzymes Group
Genome Damage and Stability Centre
Science Park Road
University of Sussex
Falmer, Brighton BN1 9QR
email: [email protected]
tel (office): 01273 678349
tel (lab): 01273 677512
On Sat, 2011-03-26 at 10:06 +0000, Antony Oliver wrote:
I keenly await the (hopefully) consensus opinion on what we, as crystallographers, should do with residues with poor or absent side-chain density.
Antony,
thanks for passing along the information. I won't hold my breath for the consensus though - this is one of the recurring discussions (myself being guilty of repeatedly stoking the fire) that never results in a clear conclusion, simply because the choice here has zero effect on the conclusions drawn from your data. It is simply a question of semantics. To get some idea of whether there is any indication of consensus, I've set up a survey at google docs
https://spreadsheets.google.com/viewform?hl=en&formkey=dHVNa3VodUtfbVQtZ2pnUFcxQkx6RHc6MQ#gid=0
I'll post the results once it levels off - please don't vote more than once (but feel free to get your lab/family members to have a say).
Cheers,
Ed.
--
"I'd jump in myself, if I weren't so good at whistling."
Julian, King of Lemurs
Dear Ed,
nice survey page! I have a few comments / questions:
- What do you mean by "no density", I mean in sigmas? Zero? 0.5? Multiple apparently alternative locations with 0.7 sigma? Or..? Most of the time it is not that the density is zero or negative due to artifacts, but "weak" (I used "" because that needs a definition, and probably another survey -:)). Therefore this raises another item for your questionnaire: model each such weak density state with an ensemble of alternative conformers (each corresponding to a plausible rotamer) and refine a group occupancy for these atoms (one occupancy for all atoms in question - the occupancy will typically refine to something less than 0.5 or so). The occupancies may not necessarily add up to 1 because you will still be missing a few conformers with truly no density.
- Item #3: I guess if a program lets you do this during restrained refinement of individual isotropic (or anisotropic) B-factors, then you should submit a bug report. All the flavors of restraints used in individual ADP refinement are essentially similarity restraints that make sure the B-factors of connected atoms are more or less similar. This trick with smearing out an atom by B-factor may only work for isolated (single) atoms such as waters because they are not bonded to anything through restraints.
All the best!
Pavel.
Pavel,
- what you mean by "no density",
Lack of confidence in the placement of the side chain. Everyone would have a somewhat different take on it, but the question is more about what to do, not how to decide if the side chain is disordered.
Therefore this raises another item for your questionnaire:
There is an "other" option; feel free to use it.
refine group occupancy for these atoms (one occupancy per all atoms in question - the occupancy typically will refine to something less than 0.5 or so).
This raises an entirely different question regarding reliability of occupancy refinement in general due to its correlation with the B-factors. Another can of worms.
This trick with smearing out an atom by B-factor may only work for isolated (single) atoms such as waters because they are not bonded to anything through restraints.
Certainly, the presence of restraints makes the B-factor increase less steep. I just looked at an instance of a disordered arginine (no density above 1 sigma for any side chain atoms), and B-factors jump from 30 at the backbone to 90 at the tip of the side chain. This would reduce the density level ~5x, which is probably quite sufficient for blending it into the solvent. There could be a bit of a problem in the middle, where B-factors are inflated/deflated, but it does take care of density reduction.
Things like atom-specific restraints and a modified restraint target may be of some help, but the effect on the final model may be too small to justify the effort.
--
"I'd jump in myself, if I weren't so good at whistling."
Julian, King of Lemurs
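Editorial sketch: the "~5x" figure can be reproduced with a back-of-the-envelope estimate. This is an assumption about where the number comes from, not something stated in the thread. If an atom is treated as a point smeared only by its isotropic B-factor, its density is a Gaussian whose peak height scales as (4*pi/B)**1.5, so tripling B lowers the peak by 3**1.5. The estimate ignores the intrinsic width of the atom and the resolution cutoff, both of which would likely make the real ratio smaller.

```python
def peak_density_ratio(b_low, b_high):
    """How many times lower the peak density is at b_high than at b_low.

    For a point atom smeared by an isotropic B-factor the density is a
    Gaussian whose peak height scales as (4*pi/B)**1.5, so the ratio of
    two peaks depends only on the ratio of the B-factors.
    """
    if b_low <= 0 or b_high <= 0:
        raise ValueError("B-factors must be positive")
    return (b_high / b_low) ** 1.5


# Backbone B of 30 versus side-chain tip B of 90, as in the message above.
ratio = peak_density_ratio(30.0, 90.0)
print("peak density lower by a factor of %.1f" % ratio)
```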
We have been doing a lot of parallel refinements where we are checking out the new options in PHENIX refinement, and one of the things we have observed is that sometimes the sidechains with no clear electron density end up in the main chain density, and distort the model. There is no clear pattern as to which options lead to this phenotype, as different combinations give different results. Until we can sort out what is causing this, it seems clear to me that it is better to delete the side chains. If you want to leave a side chain with no clear electron density, you have to make sure each one is not distorting the model. So leaving the side chains with no clear electron density requires much work, with a benefit that is not clear to me.
Kendall Nettles
Hi Kendall,
we've been discussing this off-list between developers and some users... I'm going to copy-paste one of my latest comments:
"""
Given the poll result I guess there is no way to make everyone happy with a unique solution, so having multiple options (keep/trim/do something else) is clearly the way to go. I guess another poll (or proper research) about "what's distinguishable" (or what you call poor density) would be of help too. Because I bet once we start doing trimming there will always be someone screaming "No! you are trimming/not building at too low/high CC/sigma!"
Personally, I'm not leaning towards one or another option. I believe all of them have approximately equal amounts of clear advantages and disadvantages. I was just thinking of a longer term research-like solution that might (or might not) bring a novel idea...
By the way, another alternative way to model them is to define a probability distribution mask around a side chain where its atoms are expected, and use that mask as a contribution to Fcalc. I guess (not 100% sure) this is what you can do in BUSTER-TNT (at least this is what is advertised in their paper). That would account for these missing atoms to some degree without actually including the ATOM records for them in the final PDB file. One wrinkle though is that in this case you would need to deposit that mask along with your PDB file, since you will have a mixed model - atomic model + nonatomic model. (I wonder how many users who ever used this option actually did deposit the masks? -:) )
"""
All the best!
Pavel.
We have been doing a lot of parallel refinements where we are checking out the new options in PHENIX refinement, and one of the things we have observed is that sometimes the sidechains with no clear electron density end up in the main chain density, and distort the model. There is no clear pattern as to which options lead to this phenotype, as different combinations give different results. Until we can sort out what is causing this, it seems clear to me that it is better to delete the side chains. If you want to leave a side chain with no clear electron density, you have to make sure each one is not distorting the model. So leaving the side chains with no clear electron density requires much work, with a benefit that is not clear to me.
Kendall Nettles
On Mar 28, 2011, at 1:04 PM, Ed Pozharski wrote:
Pavel,
- what you mean by "no density",
Lack of confidence in placement of the side chain. Everyone would have a somewhat different take on it, but the question is more about what to do, not how to decide if the side chain is disordered.
Therefore this raises another item for your questionnaire:
There is an "other" option, feel free to use it.
refine group occupancy for these atoms (one occupancy per all atoms in question - the occupancy typically will refine to something less than 0.5 or so).
This raises an entirely different question regarding reliability of occupancy refinement in general due to its correlation with the B-factors. Another can of worms.
This trick with smearing out an atom by B-factor may only work for isolated (single) atoms such as waters because they are not bonded to anything through restraints.
Certainly, the presence of restraints makes the B-factor increase less steep. I just looked at an instance of a disordered arginine (no density above 1 sigma for any side chain atoms), and B-factors jump from 30 at the backbone to 90 at the tip of the side chain. This would reduce the density level ~5x, which is probably quite sufficient for blending it into the solvent. There could be a bit of a problem in the middle, where B-factors are inflated/deflated, but it does take care of density reduction.
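As a rough check of the ~5x figure: for an isotropic Gaussian atom, the peak density scales as B^(-3/2) if the intrinsic width of the atomic form factor and the effect of resolution truncation are both ignored. Those are simplifying assumptions, and the function name below is purely illustrative. A minimal Python sketch:

```python
def peak_density_ratio(b_low: float, b_high: float) -> float:
    """Ratio of peak densities rho(b_low) / rho(b_high) for a point atom.

    Assumes a point scatterer smeared by an isotropic Gaussian, so the
    peak height is proportional to B**(-3/2).
    """
    if b_low <= 0 or b_high <= 0:
        raise ValueError("B-factors must be positive")
    return (b_high / b_low) ** 1.5


# Backbone B = 30 versus side-chain tip B = 90, as in the arginine example.
ratio = peak_density_ratio(30.0, 90.0)
print(f"{ratio:.2f}")
```

Under this simplified model the ratio for B = 30 versus B = 90 comes out slightly above 5. Adding the intrinsic form-factor width to both B values would make the ratio smaller, so this is probably best read as an upper estimate.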
Things like atom-specific restraints and modified restraint target may be of some help, but the effect on the final model may be too small to validate the effort.
-- "I'd jump in myself, if I weren't so good at whistling." Julian, King of Lemurs
_______________________________________________ phenixbb mailing list [email protected] http://phenix-online.org/mailman/listinfo/phenixbb
On Thu, Mar 24, 2011 at 2:41 AM, Antony Oliver wrote:
Although I can see from the non-crystallographer end-user perspective that the zero occupancy option is probably preferred, because at least the side-chain atoms are preserved, and readily identifiable when working in PyMOL, CCP4MG, Coot etc.
But if the naive end user sees an arginine sidechain sticking out, he/she will most likely assume that the coordinates reflect reality. I'm inclined to agree with Frank - we're not making theoretical models here, so we shouldn't be guessing coordinate positions. Even in a 4A structure, however poor the sidechain density is, the coordinates have been refined against the data, and the B-factors will still provide some information about disorder and accuracy. If end users demand complete sidechains, let them make their own - this is far less confusing (and less dangerous) than making something up and expecting them to figure out what the zero column means. -Nat
One catch with truncating side chains is that if you calculate electrostatic surfaces (for what they are worth) then you will probably lose some charges (I believe)
Phil
On Thu, Mar 24, 2011 at 8:04 AM, Phil Evans wrote:
One catch with truncating side chains is that if you calculate electrostatic surfaces (for what they are worth) then you will probably lose some charges (I believe)
Certainly, but the same is also true of missing loops - some of which are very important for forming binding surfaces, even if they aren't visible in the crystal. -Nat
On Thu, 2011-03-24 at 15:04 +0000, Phil Evans wrote:
One catch with truncating side chains is that if you calculate electrostatic surfaces (for what they are worth) then you will probably lose some charges (I believe)
True. But it is also true that for electrostatic potentials to be meaningful one has to carefully define protonation states. Thus model inspection is necessary anyway. Also, it is possible that at least some programs will use occupancy when calculating electrostatic potentials, and thus zero occupancy will have the same effect. Maybe the burden of using a corrected model should be on programs that calculate electrostatic maps, or on their users. Otherwise a crystallographer is forced to come up with model that is partially unsubstantiated. Cheers, Ed. -- "I'd jump in myself, if I weren't so good at whistling." Julian, King of Lemurs
I have seen some quite convincing evidence since that the disordered side chains do have a detectable effect on the rest of the model, and thus leaving them in makes a model worse.
I've seen both.
What do you mean? That removing disordered side chains affects the rest of the model too? But surely removing atoms not found in the density will reduce overall model error.
Other half of the argument was of semantic nature and referred more to replacing disordered residues with alanines, which is silly
It is silly but honest.
Again, not sure what you mean. Tyrosine is a tyrosine even if positions of some atoms are unknown/omitted. If I rename it to alanine, then I am claiming it's an alanine, which is wrong. (Of course, one can be wrong and honest at the same time).
I agree, making occupancy zero is kind of abusing it in order to say "I do not see this atom".
No, that is not what occ=0.00 says. 100% occupancy means in the standard model that an atom has an average position at (x,y,z) and moves around (dynamically and statically) in a harmonic fashion with the amplitude defined by B-factor. 50% occupancy means that it is found 50% of time in the "gaussian vicinity" of (x,y,z), and 50% of time it's somewhere else (in the second location present as alternate conformation or, if the latter is absent, everywhere else as allowed by geometry). 10% means it is only found at (x,y,z) 10% of the time, etc. What happens at 0% depends on one's take on continuity of the occupancy as a descriptor of physical reality. Unless we redefine occupancy of precisely 0.00 as a special case that actually means "I have no idea where this atom is other than what I can deduce from geometry and by the way, the coordinates provided are completely meaningless" then by continuity it means that the probability to find this atom at (x,y,z) is precisely 0. Which, for a disordered side chain, is most likely incorrect.
Finally, when we model data at 4-6A or so resolution, why do we stick atoms into those tubes of density? Do these densities really tell you where that specific atom or often even residue is?
With low precision, but yes. A correct analogy here is placing extra residues when there is no tube of density, e.g. missing loops and/or termini. But nobody does that. We just stick with a partial model. Why should side chains be any different? Cheers. -- "I'd jump in myself, if I weren't so good at whistling." Julian, King of Lemurs
The occupancy is a scalar weighting term in the structure factor equation (or if you prefer, the electron density equation). That's its actual meaning, and anything else is interpretation of the meaning.
For example, I think it's very unlikely that any refinement program takes the occupancy as a probability and weights the geometric terms accordingly.
Ergo occupancy zero surely should be interpreted as meaning "this atom doesn't contribute to the X-ray scattering in this model", which I personally interpret as "I've got no idea where it is". Geometry does limit the "no idea" part, naturally, but this might involve some complex analysis of the ambiguity level of the backbone placement.
I tend to go with putting the s/c in and letting the B-factors float, but given the nature of the PDB, whatever you do is wrong for some applications if used by someone unaware of this issue.
Cheers,
Phil Jeffrey
Princeton
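Phil Jeffrey's description of occupancy as a scalar weight in the structure factor sum can be illustrated with a small Python sketch. It is a toy calculation under stated assumptions, not how any refinement program is implemented: the scattering factor `f` is taken as a constant per atom, the resolution term `s_sq` is supplied directly instead of being derived from a unit cell, and all names and coordinates are made up for illustration.

```python
import cmath
import math


def structure_factor(hkl, atoms, s_sq):
    """F(hkl) = sum_j occ_j * f_j * exp(-B_j * s^2 / 4) * exp(2*pi*i * h.x_j).

    atoms: dicts with fractional 'xyz', occupancy 'occ', B-factor 'b' and a
    constant scattering factor 'f' (a simplification; real form factors
    fall off with resolution).
    s_sq: 1/d^2 in 1/Angstrom^2 for this reflection, passed in directly.
    """
    total = 0j
    for atom in atoms:
        phase = 2.0 * math.pi * sum(h * x for h, x in zip(hkl, atom["xyz"]))
        damping = math.exp(-atom["b"] * s_sq / 4.0)
        total += atom["occ"] * atom["f"] * damping * cmath.exp(1j * phase)
    return total


# Hypothetical two-atom model: one backbone atom, one side-chain tip atom.
backbone = {"xyz": (0.10, 0.20, 0.30), "occ": 1.0, "b": 30.0, "f": 6.0}
tip = {"xyz": (0.15, 0.25, 0.35), "occ": 1.0, "b": 90.0, "f": 7.0}
hkl, s_sq = (1, 2, 3), 1.0 / 3.0**2  # a reflection at 3 A, for illustration

f_full = structure_factor(hkl, [backbone, tip], s_sq)
f_zero = structure_factor(hkl, [backbone, dict(tip, occ=0.0)], s_sq)
f_trim = structure_factor(hkl, [backbone], s_sq)

# Zero occupancy and deletion give the same calculated structure factor.
print(abs(f_zero - f_trim) < 1e-12)
```

In this sketch an atom at occ=0 and a deleted atom are indistinguishable as far as Fcalc is concerned; the difference between the two choices lies only in what the coordinate file tells a downstream reader.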
This topic generates very strong feelings, and everybody has their favorite and valid arguments. I usually favor placing side chains, but with occupancies left at the chemical reality of 1.0, and letting B factors take care of disorder.
I would like to point out at least one case where placing side chain atoms with occ=1 makes very good sense: a high-resolution model placed in a low-resolution structure and map, where density is lacking for some side chains. This is a case where we have prior information about where the atoms are, but we may not see them in our 4-Angstrom map. Obviously, how you handle B factors for such structures is important.
Whether we place some atoms in our models depends on how much prior information (i.e. geometry, chemical connectivity, and previously determined models) we have for that particular set of atoms. If a long loop has no density for it, it will not be placed. However, this is usually not the case for side chains. Low-resolution structures are more common these days, especially for many large protein complexes (and it is great that we can get these structures).
I guess Pavel will be testing different scenarios (hopefully including 4 A-ish resolution), and will let us know of his findings...
Engin
-- Engin Özkan Post-doctoral Scholar Laboratory of K. Christopher Garcia Howard Hughes Medical Institute Dept of Molecular and Cellular Physiology 279 Campus Drive, Beckman Center B173 Stanford School of Medicine Stanford, CA 94305 ph: (650)-498-7111
On Mar 24, 2011, at 5:27 AM, Pavel Afonine wrote:
I agree, making occupancy zero is kind of abusing it in order to say "I do not see this atom". But so far I have no feeling about what is more confusing:
- set occupancy to zero in order to say "I don't see it in the map", or
- call TYR (or whatever else) something that is according to the atom content is NOT TYR but is ALA.
What about some kind of atom-type card (along the lines of "ANISOU") that specifies the disordered nature of the atom in question? So if everything past CB is disordered in a lysine, you would get something like:
ATOM    592  N   LYS A  84     -17.487   3.994  61.377  1.00 32.95           N
ATOM    593  CA  LYS A  84     -17.022   4.714  60.192  1.00 32.98           C
ATOM    594  CB  LYS A  84     -17.046   6.221  60.446  1.00 32.90           C
ATOM    595  CG  LYS A  84     -18.435   6.824  60.503  1.00 32.69           C
DISATM  595  CG  LYS A  84
ATOM    596  CD  LYS A  84     -18.405   8.207  61.132  1.00 37.38           C
DISATM  596  CD  LYS A  84
ATOM    597  CE  LYS A  84     -17.881   9.259  60.164  1.00 41.33           C
DISATM  597  CE  LYS A  84
ATOM    598  NZ  LYS A  84     -17.856  10.600  60.803  1.00 42.35           N
DISATM  598  NZ  LYS A  84
ATOM    599  C   LYS A  84     -15.620   4.293  59.759  1.00 32.36           C
ATOM    600  O   LYS A  84     -15.359   4.124  58.568  1.00 35.18           O
The "most probable" conformation could be included in the coordinates as is often the case now, but at the authors' discretion, certain atoms could be flagged as disordered. These atoms could then be handled in a way similar to those with occ=0 now (or whatever developers decide). Then we wouldn't be abusing the occupancy (an objection that seems to be fairly common) but there would be clear indication in the PDB file of the disorder. Of course, where to draw the line between ordered and disordered is highly subjective, but that's the case now, anyway. Also, this solution would perhaps not be as effective with missing residues as with missing side chains.
How to display this kind of difference in visualization programs would be an issue as well, but I could imagine (for example) a PyMOL command to do something like hide sticks, disordered being relatively straightforward to implement. And in Coot, perhaps a toolbar button like the existing "Delete" that would instead be "Flag atom as disordered", with a different color or transparency used to indicate the disordered atoms.
I realize altering the PDB format may not be the favored approach to this issue, and I'm not very well versed in the intricacies of crystallographic software design, but this all just occurred to me so I'm throwing it out there.
Cheers,
Jared
-- Jared Sampson Xiangpeng Kong Lab NYU Langone Medical Center New York, NY 10016 212-263-7898
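To make the proposal concrete, here is a minimal Python sketch of how a program might collect the atoms flagged by such a card. DISATM is the hypothetical record from the message above, not part of the PDB format, and the whitespace-based parsing and the function name are assumptions made only for illustration (real PDB records are fixed-column).

```python
def disordered_atoms(pdb_text: str) -> list[tuple[int, str, str, str, int]]:
    """Return (serial, atom, resname, chain, resseq) for each DISATM line.

    DISATM is a hypothetical record type; fields are split on whitespace
    for brevity rather than read from fixed columns.
    """
    flagged = []
    for line in pdb_text.splitlines():
        fields = line.split()
        if len(fields) >= 6 and fields[0] == "DISATM":
            serial, atom, resname, chain, resseq = fields[1:6]
            flagged.append((int(serial), atom, resname, chain, int(resseq)))
    return flagged


example = """\
ATOM 594 CB LYS A 84 -17.046 6.221 60.446 1.00 32.90 C
ATOM 595 CG LYS A 84 -18.435 6.824 60.503 1.00 32.69 C
DISATM 595 CG LYS A 84
ATOM 596 CD LYS A 84 -18.405 8.207 61.132 1.00 37.38 C
DISATM 596 CD LYS A 84
"""

for record in disordered_atoms(example):
    print(record)
```

A viewer could use such a list to hide or restyle the flagged atoms, while programs that do not know the card would simply ignore it and see ordinary ATOM records.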
Oh no!! Please please please don't bring that hoary old canard of building in side-chains - it's Very Bad Practice, we don't build missing amino acids or ligands, do we, so why are sidechains different? The "most likely rotamer" is still the WRONG rotamer: if it were right, you'd see it.
CNS used to do this, but as far as I could tell it was an artefact because CNS could not refine a residue if not all atoms were present, so to truncate something you had to call it ALA. That's obviously wrong, and phenix fixed this, thank god, and good riddance.
Now it's to come back...? *sob*
As for the option (b): that sounds like a very valid criterion for placing a side-chain. In fact, it's awesome - so why set occupancies to zero? Phenix already deals with validly modelled atoms very effectively: set occ=1 and relax the B-factor restraints.
(Sorry, you can tell I feel strongly... :)
phx.
(oops, got carried away: meant of course "of building MISSING SIDE-CHAINS AT OCC=0". Embarrassing :)
participants (13)
- Antony Oliver
- Antony Oliver
- Dale Tronrud
- Donnie Berkholz
- Ed Pozharski
- Engin Özkan
- Frank von Delft
- Kendall Nettles
- Nathaniel Echols
- Pavel Afonine
- Phil Evans
- Phil Jeffrey
- Sampson, Jared