Validation question: R-factor (all) and # of unique reflections (all)

Hi everyone,
I am trying to deposit a structure and have run into several things I am not sure about. I am in the <Refinement Statistics> section (Auto Dep Input Tool) at the PDB. Most of the values come from the PDB file written by phenix.refine, but the form also asks for "R-factor (all)" (for reflections that satisfy the limits established by the high and low resolution cutoffs) and "number of unique reflections (all)". My simple question is whether Phenix reports this kind of information and, if so, where I can find it.
Another question: if I changed waters (deleted them or changed their occupancy) based on the 'precheck', should I run phenix.refine again to get correct R/Rfree values? I am not sure whether the PDB recalculates these automatically. (It seems to at this point, but I am not 100% sure yet.)
With many thanks~
Young-Jin

On Tue, Jul 5, 2011 at 11:17 AM, Young-Jin Cho wrote:
> I am trying to deposit a structure and have run into several things I am not sure about. I am in the <Refinement Statistics> section (Auto Dep Input Tool) at the PDB. Most of the values come from the PDB file written by phenix.refine, but the form also asks for "R-factor (all)" (for reflections that satisfy the limits established by the high and low resolution cutoffs) and "number of unique reflections (all)". My simple question is whether Phenix reports this kind of information and, if so, where I can find it.
It looks like these aren't included in the PDB header of the files phenix.refine writes. I think "R-factor (all)" is an artifact from the era when crystallographers would do a final refinement with all reflections included (a Very Bad Idea), and I think I always left this blank in PDB depositions. Assuming you didn't truncate the resolution, "Number of unique reflections (all)" is probably whatever is in the MTZ file - depending on the outlier removal, it may be slightly more than phenix.refine reports.
> Another question: if I changed waters (deleted them or changed their occupancy) based on the 'precheck', should I run phenix.refine again to get correct R/Rfree values? I am not sure whether the PDB recalculates these automatically. (It seems to at this point, but I am not 100% sure yet.)
You can run phenix.model_vs_data or the validation GUI to determine new R-factors. Unless the PDB has made huge improvements in the last half-year, their recalculation of R-factors isn't at all robust.
-Nat

Hi Young-Jin,
> I am trying to deposit a structure and have run into several things I am not sure about. I am in the <Refinement Statistics> section (Auto Dep Input Tool) at the PDB. Most of the values come from the PDB file written by phenix.refine, but the form also asks for "R-factor (all)" (for reflections that satisfy the limits established by the high and low resolution cutoffs) and "number of unique reflections (all)". My simple question is whether Phenix reports this kind of information and, if so, where I can find it.
phenix.refine and phenix.model_vs_data report Rfree and Rwork computed for all reflections used in refinement (which may differ from the number of reflections in your input file because of the Iobs-to-Fobs conversion and outlier filtering). They do not report R(all), since I believe it is a useless number given that both Rfree and Rwork are reported. If you really want it, you can compute it from the MTZ file created by phenix.refine, which contains both the Fobs used in refinement and Fmodel. The R-factor is simply
    R = SUM ||Fobs| - scale*|Fmodel|| / SUM |Fobs|
where
    scale = SUM(|Fobs|*|Fmodel|) / SUM(|Fmodel|**2)
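As a rough illustration of the two formulas above, here is a minimal pure-Python sketch. It assumes the Fobs and Fmodel columns have already been read from the MTZ file into two equal-length lists (the reading step is not shown), and the function name `r_factor` and the sample amplitudes are made up for the example; it is not how phenix itself computes or scales R-factors.

```python
def r_factor(f_obs, f_model):
    """Return (R, scale) for paired lists of structure factors.

    Hypothetical helper: f_obs and f_model are assumed to be matched
    reflection by reflection. Values may be real amplitudes or complex
    numbers; abs() gives the amplitude in both cases.
    """
    if not f_obs or len(f_obs) != len(f_model):
        raise ValueError("need two non-empty lists of equal length")
    a_obs = [abs(f) for f in f_obs]
    a_mod = [abs(f) for f in f_model]
    denom = sum(m * m for m in a_mod)
    if denom == 0:
        raise ValueError("all Fmodel amplitudes are zero")
    # scale = SUM(|Fobs|*|Fmodel|) / SUM(|Fmodel|**2)
    scale = sum(o * m for o, m in zip(a_obs, a_mod)) / denom
    # R = SUM ||Fobs| - scale*|Fmodel|| / SUM |Fobs|
    r = sum(abs(o - scale * m) for o, m in zip(a_obs, a_mod)) / sum(a_obs)
    return r, scale


# Made-up amplitudes, for illustration only.
f_obs = [100.0, 50.0, 25.0, 10.0]
f_model = [48.0, 26.0, 12.0, 6.0]
r, scale = r_factor(f_obs, f_model)
print(f"scale = {scale:.4f}, R = {r:.4f}")
```

Passing every reflection in the file (working and free together) to such a function would give an R(all) in the sense described above.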
> Another question: if I changed waters (deleted them or changed their occupancy) based on the 'precheck', should I run phenix.refine again to get correct R/Rfree values?
Yes, definitely, preferably with the "ordered_solvent=true" flag. It is also a good idea to have a look at this paper, since it explains how bad post-refinement manipulation of PDB or data files can be:
phenix.model_vs_data: a high-level tool for the calculation of crystallographic model and data statistics. P.V. Afonine, R.W. Grosse-Kunstleve, V.B. Chen, J.J. Headd, N.W. Moriarty, J.S. Richardson, D.C. Richardson, A. Urzhumtsev, P.H. Zwart, P.D. Adams. J. Appl. Cryst. 43, 677-685 (2010).
All the best!
Pavel.

In the denominator of the R-factor equation, Sum(|Fobs|) can be replaced by Nref*<Fobs>. If <Fobs> is the same for the free and working reflections, which it should be to a good approximation, it is easy to show that the overall R-factor is the average of the free and working R-factors, weighted by the number of reflections in each. Thus if 5% of the reflections are free,
    Rall = 0.05*Rfree + 0.95*Rwork
The meaning of this is quite different, though, from the Rall you would get by doing a final refinement against all reflections.
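The weighted-average relation for Rall can be checked numerically. The sketch below uses made-up (|Fobs|, scale*|Fmodel|) pairs and assumes a single common scale has already been applied to both sets; the helper name `r_value` is hypothetical. It shows that weighting Rfree and Rwork by each set's share of Sum(|Fobs|) reproduces Rall exactly, while weighting by reflection counts is the approximation that holds when <Fobs> is about the same in both sets.

```python
def r_value(pairs):
    """R for a list of (|Fobs|, scaled |Fmodel|) pairs (hypothetical helper)."""
    return sum(abs(o - c) for o, c in pairs) / sum(o for o, _ in pairs)


# Made-up amplitudes, for illustration only: 4 working and 1 free reflection.
work = [(100.0, 95.0), (50.0, 53.0), (25.0, 24.0), (80.0, 78.0)]
free = [(60.0, 52.0)]
everything = work + free

r_work = r_value(work)
r_free = r_value(free)
r_all = r_value(everything)

# Exact: weight each R by that set's share of Sum(|Fobs|).
w_free = sum(o for o, _ in free) / sum(o for o, _ in everything)
r_all_exact = w_free * r_free + (1.0 - w_free) * r_work

# Approximate: weight by reflection counts, valid when <Fobs> is similar
# for the free and working sets.
n_free = len(free) / len(everything)
r_all_approx = n_free * r_free + (1.0 - n_free) * r_work

print(f"Rwork = {r_work:.4f}, Rfree = {r_free:.4f}, Rall = {r_all:.4f}")
print(f"Sum-weighted average   = {r_all_exact:.4f}")
print(f"Count-weighted average = {r_all_approx:.4f}")
```

With a realistic number of reflections the free and working sets have nearly the same mean |Fobs|, so the count-weighted value should be much closer to Rall than in this five-reflection toy case.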
_______________________________________________
phenixbb mailing list
participants (4)
Edward A. Berry
Nathaniel Echols
Pavel Afonine
Young-Jin Cho