Yes, there are basically three options:
1) cut the side chain down to whatever is still visible in the density
2) let the refinement proceed as is
3) set side chain atom occupancies to zero
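To make options 1) and 3) concrete, here is a rough sketch of what they amount to as edits of fixed-column PDB ATOM records. The helper names are made up for illustration, the CG atom in the usage note is invented, and "cut back to CB" is just a stand-in for "whatever is still visible in the density"; this is not how any particular program does it.

```python
# Backbone plus CB: what is left when a side chain is cut back to the CB atom.
KEEP = {"N", "CA", "C", "O", "CB"}

def fmt_atom(serial, name, resname, chain, resseq, x, y, z, occ, b):
    """Hypothetical helper: write one fixed-column PDB ATOM record (no element column)."""
    return "ATOM  %5d  %-3s %3s %s%4d    %8.3f%8.3f%8.3f%6.2f%6.2f" % (
        serial, name, resname, chain, resseq, x, y, z, occ, b)

def is_side_chain_atom(line, chain, resseq):
    """True for atoms beyond CB in the selected residue."""
    return (line.startswith("ATOM")
            and line[21] == chain
            and int(line[22:26]) == resseq
            and line[12:16].strip() not in KEEP)

def truncate_side_chain(lines, chain, resseq):
    """Option 1: remove the side-chain atoms beyond CB."""
    return [l for l in lines if not is_side_chain_atom(l, chain, resseq)]

def zero_side_chain_occupancy(lines, chain, resseq):
    """Option 3: keep the atoms but set occupancy (columns 55-60) to 0.00."""
    return [l[:54] + "  0.00" + l[60:] if is_side_chain_atom(l, chain, resseq) else l
            for l in lines]
```

For example, with `fmt_atom(136, "CB", "TYR", "A", 19, 23.231, -77.079, 64.157, 1.0, 30.04)` followed by an invented CG record, `truncate_side_chain` drops the CG line, while `zero_side_chain_occupancy` keeps it and rewrites its occupancy field. Option 2) needs no edit at all.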
Personally, I have evolved from 2) to 1). The argument was that omitting atoms will confuse some end users,
There is an endless list of ways to confuse the end user, so I guess I would put that argument aside and assume we are dealing with an educated individual.
and refinement will essentially take care of it by increasing the B-factors.
Yes, a stupid refinement program would probably do that. phenix.refine will not, since zero-occupancy atoms do not contribute to the scattering, and their B-factors will be roughly similar to those of neighboring atoms.
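Why a zero-occupancy atom cannot affect the calculated scattering is easy to see from the textbook structure factor sum, where every atom's term is multiplied by its occupancy. Below is a minimal sketch of that sum; it assumes a constant scattering factor per atom (real ones fall off with resolution), all names are made up, and it is in no way the phenix.refine code.

```python
import cmath
import math

def structure_factor(atoms, hkl, stol_sq):
    """Textbook sum F(hkl) = sum_j q_j * f_j * exp(-B_j * s^2) * exp(2*pi*i * h.x_j).

    atoms   : list of dicts with keys "occ", "f", "b", "xyz_frac" (hypothetical layout)
    hkl     : Miller indices (h, k, l)
    stol_sq : s^2 = (sin(theta)/lambda)^2 for this reflection
    """
    total = 0j
    for a in atoms:
        phase = 2.0 * math.pi * sum(h * x for h, x in zip(hkl, a["xyz_frac"]))
        damping = math.exp(-a["b"] * stol_sq)  # isotropic B-factor term
        total += a["occ"] * a["f"] * damping * cmath.exp(1j * phase)
    return total
```

An atom with `"occ": 0.0` adds exactly nothing to the sum, whatever its B-factor, so the data exert no pull on that B-factor during refinement.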
I have seen some quite convincing evidence since that the disordered side chains do have a detectable effect on the rest of the model, and thus leaving them in makes a model worse.
I've seen both.
The other half of the argument was of a semantic nature and referred more to replacing disordered residues with alanines, which is silly
It is silly but honest. If you call something like this TYR:

ATOM    134  N   TYR A  19      21.657 -76.614  65.963  1.00 28.50      A    N
ATOM    135  CA  TYR A  19      23.064 -76.802  65.641  1.00 27.23      A    C
ATOM    136  CB  TYR A  19      23.231 -77.079  64.157  1.00 30.04      A    C
ATOM    137  C   TYR A  19      23.816 -75.537  66.027  1.00 27.07      A    C
ATOM    138  O   TYR A  19      23.265 -74.434  65.976  1.00 24.32      A    O

that would be weird too. Call it a "handicapped TYR" then ;-) And I guess seeing something like this is confusing for the end user too (especially one who is still learning). If I saw something like this, my first guess would be "someone messed up the file while doing copy-paste".
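A "handicapped" residue like the TYR above is at least easy to spot automatically: count the non-hydrogen atoms present and compare with what the residue name promises. The sketch below assumes a file without hydrogens or alternate conformations, and the function name is made up.

```python
# Number of non-hydrogen atoms in each complete standard residue (OXT excluded).
EXPECTED_HEAVY_ATOMS = {
    "ALA": 5, "ARG": 11, "ASN": 8, "ASP": 8, "CYS": 6, "GLN": 9, "GLU": 9,
    "GLY": 4, "HIS": 10, "ILE": 8, "LEU": 8, "LYS": 9, "MET": 8, "PHE": 11,
    "PRO": 7, "SER": 6, "THR": 7, "TRP": 14, "TYR": 12, "VAL": 7,
}

RECORDS = [
    "ATOM    134  N   TYR A  19      21.657 -76.614  65.963  1.00 28.50      A    N",
    "ATOM    135  CA  TYR A  19      23.064 -76.802  65.641  1.00 27.23      A    C",
    "ATOM    136  CB  TYR A  19      23.231 -77.079  64.157  1.00 30.04      A    C",
    "ATOM    137  C   TYR A  19      23.816 -75.537  66.027  1.00 27.07      A    C",
    "ATOM    138  O   TYR A  19      23.265 -74.434  65.976  1.00 24.32      A    O",
]

def truncated_residues(lines):
    """Return (chain, resseq, resname, n_found, n_expected) for incomplete residues."""
    atoms = {}
    for line in lines:
        if not line.startswith("ATOM"):
            continue
        name = line[12:16].strip()
        if name == "OXT":  # C-terminal oxygen is not part of the per-residue count
            continue
        key = (line[21], int(line[22:26]), line[17:20])
        atoms.setdefault(key, set()).add(name)
    report = []
    for (chain, resseq, resname), names in sorted(atoms.items()):
        expected = EXPECTED_HEAVY_ATOMS.get(resname)
        if expected is not None and len(names) < expected:
            report.append((chain, resseq, resname, len(names), expected))
    return report
```

Running `truncated_residues(RECORDS)` flags residue A 19 as a TYR with 5 of 12 atoms. What it cannot tell you is whether the truncation was intentional.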
because we know from sequence it's not alanine
Yes, we know this. But before we really know it, we need to: 1) extract the sequence from the PDB file; 2) get the correct sequence; 3) align the two and look for mismatches; 4) distinguish occasional model-building errors from intentional changes (due to ALA truncation).
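Steps 1) to 3) can be scripted, which shows how much work the end user is being asked to do. Here is a minimal sketch using only the Python standard library; the function names are made up, alternate conformations are ignored, and difflib is used as a crude stand-in for a real sequence alignment. Step 4) remains a judgment call that no script makes for you.

```python
from difflib import SequenceMatcher

THREE_TO_ONE = {
    "ALA": "A", "ARG": "R", "ASN": "N", "ASP": "D", "CYS": "C", "GLN": "Q",
    "GLU": "E", "GLY": "G", "HIS": "H", "ILE": "I", "LEU": "L", "LYS": "K",
    "MET": "M", "PHE": "F", "PRO": "P", "SER": "S", "THR": "T", "TRP": "W",
    "TYR": "Y", "VAL": "V",
}

def sequence_from_pdb(lines):
    """Step 1: one letter per CA atom, in file order ("X" for unknown residues)."""
    seq = []
    for line in lines:
        if line.startswith("ATOM") and line[12:16].strip() == "CA":
            seq.append(THREE_TO_ONE.get(line[17:20], "X"))
    return "".join(seq)

def sequence_mismatches(reference, model):
    """Step 3: align and list differences as (tag, ref_position, ref_part, model_part).

    ref_position is 1-based; tag is "replace", "delete" or "insert" as in difflib.
    """
    matcher = SequenceMatcher(None, reference, model, autojunk=False)
    return [(tag, i1 + 1, reference[i1:i2], model[j1:j2])
            for tag, i1, i2, j1, j2 in matcher.get_opcodes()
            if tag != "equal"]
```

The reference sequence for step 2) has to come from somewhere else (the construct, a sequence database), and a TYR renamed to ALA shows up in the output exactly like a genuine building error.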
The third option (the one you gravitate towards) seems problematic to me for the following reasons. The meaning of the occupancy is that the atom distribution in space is multimodal, and it spends a certain fraction of time vibrating around the specified position. So what is the meaning of zero occupancy? This is the average atomic position, but it spends zero time here? Makes no physical sense, and in fact is wrong since there is some non-zero probability that the disordered side chain will occupy the designated conformation. Of course, a structural model may be considered a *mathematical* model, and it does not have to be strictly interpretable (or interpretable the way I see most logical, anyway).
This is a valid argument, I agree. Better still, one would run a bunch of identical refinements and obtain an ensemble that tells you (more or less) the uncertainty and degree of confidence for each atom: http://cci.lbl.gov/~afonine/p2.png That would be a step towards a better option than setting the occupancy to zero.
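Once you have such an ensemble, one simple way to turn it into a per-atom number is the RMS deviation of each atom from its mean position across the refined models. This is only a sketch of that idea: the data layout (one dict of atom label to xyz per model) and the function name are assumptions, and it says nothing about how the refinements themselves should be run.

```python
import math

def per_atom_spread(models):
    """RMS deviation of every atom from its mean position over an ensemble.

    models : list of dicts mapping an atom label to an (x, y, z) tuple,
             one dict per independently refined model (hypothetical layout).
    Only atoms present in all models are reported.
    """
    common = set(models[0])
    for model in models[1:]:
        common &= set(model)
    spread = {}
    for label in common:
        coords = [model[label] for model in models]
        n = len(coords)
        mean = [sum(c[i] for c in coords) / n for i in range(3)]
        sq_dev = [sum((c[i] - mean[i]) ** 2 for i in range(3)) for c in coords]
        spread[label] = math.sqrt(sum(sq_dev) / n)
    return spread
```

Well-ordered atoms should come out with a spread near zero, while the tip of a disordered side chain should scatter widely, which is more informative than either a missing atom or a zero in the occupancy column.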
As for end-user argument, I would say that omitted atoms are better than high B-factors or zero occupancies,
I exclude high B-factors as an option because if the program does that, it is a bug that must be fixed. I agree, setting the occupancy to zero is kind of abusing it in order to say "I do not see this atom". But so far I have no feeling for which is more confusing:
- setting the occupancy to zero in order to say "I don't see it in the map", or
- calling something TYR (or whatever else) that, according to its atom content, is NOT TYR but ALA.
Finally, when we model data at 4-6 Å resolution or so, why do we stick atoms into those tubes of density? Do these densities really tell you where a specific atom, or often even a residue, is?
Pavel.