Re: [phenixbb] Geometry Restraints - Anisotropic truncation

Message: 4 Date: Wed, 25 Apr 2012 09:05:57 -0400 From: Mario Sanches
To: PHENIX user mailing list Subject: [phenixbb] Geometry Restraints - UPDATE
.......... It turns out that the reason for my nightmares was a highly anisotropic dataset. When I tried to fix geometry-related problems in Coot, phenix.refine was twisting the geometry back again, probably because it was trying to fit a lot of noise due to the anisotropy. Everything was getting worse: geometry, clashscore, and Ramachandran. Pavel finally corrected it by using this server: http://services.mbi.ucla.edu/anisoscale/ ............

Hi, so will we see now the possibility to apply anisotropic resolution limits truncation in Phenix? Despite the fact that a lot of people use the UCLA server, there is still no possibility to do this in any major crystallographic software, AFAIK. Phaser does the scaling, but no truncation, and the noise sitting in those regions where there is no signal is a real culprit. Large SIGFs there do not resolve the problem completely. Thanks.

-- Dr. Leonid A. Sazanov, Research group leader, Medical Research Council Mitochondrial Biology Unit, Wellcome Trust / MRC Building, Hills Road, Cambridge CB2 0XY. WEB: www.mrc-mbu.cam.ac.uk Tel: +44-1223-252910 Fax: +44-1223-252915
On Sun, Apr 29, 2012 at 9:30 AM, Leonid Sazanov
Hi, so will we see now the possibility to apply anisotropic resolution limits truncation in Phenix?
It wouldn't be difficult to implement - maybe a few dozen lines of code. It certainly seems less dangerous than providing a general-purpose program for anisotropy-correcting data, which we've resisted adding for a long time because it's prone to abuse. I think a lot of people use the UCLA server blindly because they think it's magically going to make their refinement better, whether or not they actually need it. My feeling is that the truncation is really better left to the data processing software, however; I'm not sure what the options are there. -Nat
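For the sake of discussion, here is roughly what those "few dozen lines" could look like: a plain Python/NumPy sketch of an elliptical resolution cutoff. It is not taken from Phenix or from the UCLA server, and it is simplified to a cell with orthogonal axes; the function name, array layout and limits are illustrative assumptions.

import numpy as np

def ellipsoidal_keep(hkl, cell, d_limits):
    """Return a boolean mask selecting reflections inside an elliptical
    resolution cutoff.  hkl: (N, 3) Miller indices; cell: (a, b, c) axial
    lengths in Angstrom (angles taken as 90 degrees for this sketch);
    d_limits: (d_a, d_b, d_c) resolution limits along a*, b*, c*."""
    hkl = np.asarray(hkl, dtype=float)
    # reciprocal-space coordinates of each reflection (1/Angstrom),
    # valid for an orthogonal cell
    s = hkl / np.asarray(cell, dtype=float)
    # semi-axes of the cutoff ellipsoid in reciprocal space
    s_max = 1.0 / np.asarray(d_limits, dtype=float)
    # keep reflections whose normalized components fall inside the ellipsoid
    return np.sum((s / s_max) ** 2, axis=1) <= 1.0

# e.g. keep data to 2.0 A along a* and b*, but only to 2.8 A along c*:
# mask = ellipsoidal_keep(miller_indices, (60.0, 75.0, 90.0), (2.0, 2.0, 2.8))

A real implementation would of course use the full reciprocal metric and the principal axes of the fitted anisotropy tensor rather than assuming orthogonal cell axes.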
Dear All, Following is a table of x-ray data analysis. In it, is the "OuterShell" the same as "the highest resolution shell" in Table 1 of a crystallography paper? I am looking forward to getting a reply from you. Cheers, Dialing

                                Overall   InnerShell   OuterShell
Low resolution limit            77.381    77.381       1.669
High resolution limit           1.663     7.718        1.663
Rmerge                          0.059     0.031        0.586
Ranom                           0.057     0.028        0.561
Rmeas (within I+/I-)            0.062     0.031        0.625
Rmeas (all I+ & I-)             0.062     0.032        0.620
Rpim (within I+/I-)             0.023     0.012        0.271
Rpim (all I+ & I-)              0.017     0.010        0.196
Total number of observations    171554    1803         810
Total number unique             13699     181          88
Mean(I)/sd(I)                   32.4      46.5         4.6
Completeness                    98.0      100.0        70.4
Multiplicity                    12.5      10.0         9.2
Dear All, In Table 1 of some crystallography papers there is an item called "FOM". Will you please explain its meaning? Do we have a server, or some other route, to get this value? I am looking forward to getting a reply from you. Cheers, Dialing
On 29 Apr 2012, at 23:55, Dialing Pretty wrote:
In some table 1 of the crystallography paper, there is an item called "FOM". Will you please explain the meaning of it? Do we have a server to get it or some other route to get this value?
I'm guessing this is the "figure of merit", defined as the cosine of the phase error. Since the correct phase is generally unknown it's an estimate, which may be calculated during density modification (phase improvement). // Best wishes; Johan
On Mon, Apr 30, 2012 at 10:17 AM, Johan Hattne
I'm guessing this is the "figure of merit", defined as the cosine of the phase error. Since the correct phase is generally unknown it's an estimate, which may be calculated during density modification (phase improvement).
It will probably appear somewhere in the phenix.refine log file as well, in one of the tables at the end:

stage     <pher>   fom      alpha    beta
0     :   55.378   0.4528   0.8776   905662.442
1_bss :   49.163   0.5316   1.2928   650255.977
1_xyz :   46.167   0.5687   1.3802   559471.522
1_adp :   40.720   0.6353   1.4942   483440.556
2_bss :   41.137   0.6303   1.4612   505402.815
2_xyz :   36.144   0.6885   1.5241   412562.858
2_adp :   33.622   0.7173   1.4769   378709.335
3_bss :   33.992   0.7133   1.4808   392255.409
3_xyz :   33.219   0.7213   1.5110   375299.213
3_adp :   32.144   0.7339   1.3212   366007.489
3_bss :   32.351   0.7318   1.3162   372771.223

But like Johan says, it's an estimate, and not a very useful number for evaluating the final model quality. However, the FOM for experimental phasing is a relatively good estimate of the quality of the initial map, and I suspect most papers which include it may be referring to this, not to the value for the refined model. If you didn't use experimental phasing for your structure, I wouldn't bother reporting the FOM unless the journal (or a reviewer) requires it. -Nat
In some table 1 of the crystallography paper, there is an item called "FOM". Will you please explain the meaning of it? Do we have a server to get it or some other route to get this value?
This is the letter "m" in Randy's 2mFo-DFc map -;) phenix.refine, phenix.maps and other relevant tools estimate it in resolution shells using test reflections, as described in J. Appl. Cryst. (1996). 29, 741-744 and Acta Cryst. (1995). A51, 880-887. Of course there is plenty of other literature on this subject! Pavel
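Spelled out, the way m (together with D) enters the map coefficients is just the formula below. This is only an illustrative NumPy sketch of the standard 2mFo-DFc expression, not the Phenix code, and it assumes m and D have already been estimated (e.g. in resolution shells, as in the references above).

import numpy as np

def two_mfo_dfc(f_obs, f_calc, m, D):
    """f_obs: (N,) observed amplitudes; f_calc: (N,) complex calculated
    structure factors; m, D: (N,) per-reflection estimates.
    Returns complex coefficients (2*m*|Fo| - D*|Fc|) * exp(i*phi_calc)."""
    phases = f_calc / np.abs(f_calc)             # exp(i * phi_calc)
    amplitudes = 2.0 * m * np.asarray(f_obs) - D * np.abs(f_calc)
    return amplitudes * phases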
Correct, assuming that the high resolution used in data processing is
the same as the high resolution used for refinement.
-Nat
On Mon, Apr 30, 2012 at 1:22 AM, Dialing Pretty
Dear All,
Following is a table of x-ray data analysis. In it the "Outershell" is same as "the highest resolution shell" in the table 1 of crystallography paper, am I right?
IMO, reporting statistics in the "Outer shell" or "the highest resolution shell" doesn't really make sense, for obvious reasons which don't need to be explained unless one wants to summarize a crystallography textbook in an email. If you want to be minimalistic, just state the overall figures. A way better idea, though, is to report completeness (and other similar metrics) in relatively thin resolution bins, which would actually tell you something about your data. Pavel
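To make the suggestion concrete, here is a toy NumPy sketch of reporting completeness in thin resolution shells. The bin count, the equal-1/d^2 binning and the input arrays are illustrative assumptions, not how any particular data-processing program does it.

import numpy as np

def completeness_by_shell(d_observed, d_possible, n_bins=20):
    """d_observed: d-spacings (A) of the measured unique reflections;
    d_possible: d-spacings of all reflections possible to the same limit.
    Prints completeness in thin shells of equal width in 1/d^2."""
    s2_obs = 1.0 / np.asarray(d_observed) ** 2
    s2_all = 1.0 / np.asarray(d_possible) ** 2
    edges = np.linspace(s2_all.min(), s2_all.max(), n_bins + 1)
    n_obs, _ = np.histogram(s2_obs, bins=edges)
    n_all, _ = np.histogram(s2_all, bins=edges)
    for lo, hi, o, a in zip(edges[:-1], edges[1:], n_obs, n_all):
        if a:
            print("%6.2f - %5.2f A : %5.1f %%"
                  % (1.0 / np.sqrt(lo), 1.0 / np.sqrt(hi), 100.0 * o / a))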
On Mon, Apr 30, 2012 at 1:49 PM, Pavel Afonine
IMO reporting statistics in "Outer shell" or in "the highest resolution shell" doesn't really make sense for obvious reasons which don't need to be explained unless one wants to summarize a crystallography text book in an email.
I disagree - for the data processing it is very relevant to know what these statistics are, otherwise one has very little idea what the criteria used to determine "resolution" were. For refinement it is perhaps less essential; the R-factors in the outer shell will almost always be somewhat higher than the values for the entire dataset, but this rarely tells you anything. -Nat
Sure, but then at least one needs to clearly define what exactly "Outer shell" and "highest resolution shell" mean. I can come up with a gazillion ways of defining the "highest resolution shell", and depending on how I define it the numbers will be vastly different. In general it may be a good idea to define what exactly you want to calculate before cranking the machine to get some numbers. In this sense, reporting statistics in resolution bins contains both the definition and the desired numbers. Pavel
On Mon, Apr 30, 2012 at 2:17 PM, Pavel Afonine
Sure, but then at least one needs to clearly define what exactly is "Outer shell" and "highest resolution shell". I can come up with gazillions ways of defining the "highest resolution shell" and depending on how I define it the numbers will be vastly different.
This is true in theory, but in practice the most common convention by far is to divide the possible reflections into 10 bins. Whether this number is optimally chosen or not is largely irrelevant; most programs (for data processing, or further downstream) report something like this, and experienced crystallographers have some idea how to interpret such values when they see them. Reporting individual statistics for each bin would indeed give more information, at the expense of legibility - as I keep saying, humans are not text parsers. (Besides which, the "Table 1" program in Phenix does in fact report the actual resolution range for the outer bin.) -Nat
The problem is that journals sometimes specify the content of Table 1. There might be more room for this if it is a supplemental Table 1. I bet if you make it a GUI feature (Table 1 vs. extended supplemental Table 1 option) it will catch on fast. Kendall
Are anisotropic cutoffs desirable? Phil
On Mon, Apr 30, 2012 at 4:22 AM, Phil Evans
wrote: Are anisotropic cutoffs desirable?
Is there a peer-reviewed publication - perhaps from Acta Crystallographica - which describes precisely why scaling or refinement programs are inadequate to ameliorate the problem of anisotropy, and argues why the method applied in Strong et al., 2006 satisfies this need? -Bryan
I have seen dramatic improvements in maps and behavior during refinement following use of the UCLA anisotropy server in two different cases. For one of them the Rfree went from 33% to 28%. I don't think it would have been publishable otherwise. Kendall
On Tue, May 1, 2012 at 10:34 AM, Kendall Nettles
I have seen dramatic improvements in maps and behavior during refinement following use of the UCLA anisotropy server in two different cases. For one of them the Rfree went from 33% to 28%. I don't think it would have been publishable otherwise.
Was this just performing anisotropy correction, or also truncation? It would be interesting to see the data before and after treatment, in any case. I wonder if combining the anisotropy correction with B-factor sharpening of the map coefficients would make a difference. -Nat
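For reference, isotropic B-factor sharpening of map coefficients is just a resolution-dependent rescaling; the sketch below spells out the standard exp(B*s^2/4) factor in NumPy and is not meant to reflect what phenix.maps or the UCLA server do internally.

import numpy as np

def sharpen_coefficients(coeffs, d_spacings, b_sharp):
    """coeffs: (N,) complex map coefficients; d_spacings: (N,) in Angstrom;
    b_sharp: sharpening B-factor in A^2 (positive sharpens, negative blurs).
    Applies the inverse Debye-Waller factor exp(+B*s^2/4) with s = 1/d."""
    s_sq = 1.0 / np.asarray(d_spacings) ** 2
    return np.asarray(coeffs) * np.exp(b_sharp * s_sq / 4.0)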
I would think that an R-free drop was inevitable if one deletes a whole swath of really weak data from the free set. Did you monitor what happened to the subset of free-set reflections within the anisotropic cutoff ellipsoid - the conserved subset? Phil Jeffrey Princeton On 5/1/12 1:34 PM, Kendall Nettles wrote:
I have seen dramatic improvements in maps and behavior during refinement following use of the UCLA anisotropy server in two different cases. For one of them the Rfree went from 33% to 28%. I don't think it would have been publishable otherwise.
Hi Kendall, A caution on using the free R value to evaluate the utility of applying an anisotropic truncation to the data: As you remove data with low I/sigma, you will generally improve the free R regardless of whether your model is improved or not. This is of course true also if you simply remove all the data with I/sigma < 3 or apply any other truncation of that type. All the best, Tom T
While philosophically I see no difference between a spherical resolution cutoff and an elliptical one, a drop in the free R can't be the justification for the switch. A model cannot be made more "publishable" simply by discarding data.

We have a whole bunch of empirical guides for judging the quality of this and that in our field. We determine the resolution limit of a data set (and imposing a "limit" is another empirical choice) based on Rmerge, Rmeas, or Rpim getting too big, or I/sigI getting too small, and there is no agreement on how big or small is "too big" or "too small". We then have other empirical guides for judging the quality of the models we produce (e.g. Rwork, Rfree, rmsds of various sorts). Most people seem to recognize that these criteria need to be applied differently at different resolutions. A lower resolution model is allowed a higher Rfree, for example.

Isn't it also true that a model refined against data with an I/sigI cutoff of 1 would be expected to have a higher free R than a model refined against data with a cutoff of 2? Surely we cannot say that the decrease in free R that results from changing the cutoff criterion from 1 to 2 reflects an improved model. It is the same model, after all.

Sometimes this shifting application of empirical criteria enhances the adoption of new technology. Certainly the TLS parametrization of atomic motion has been widely accepted because it results in lower working and free Rs. I've seen it knock 3 to 5 percent off, and while that certainly means that the model fits the data better, I'm not sure that the quality of the hydrogen bond distances, van der Waals distances, or maps is any better. The latter details are what I really look for in a model.

On the other hand, there has been good evidence through the years that there is useful information in the data beyond an I/sigI of 2 or an Rmeas > 100%, but getting people to use these data has been a hard slog. The reason for this reluctance is that the R values of the resulting models are higher. Of course they are higher! That does not mean the models are of poorer quality, only that data with lower signal/noise have been used that were discarded in the models you used to develop your "gut feeling" for the meaning of R.

When you change your criteria for selecting data you have to discard your old notions about the acceptable values of empirical quality measures. You either have to normalize your measure, as Phil Jeffrey recommends, by ensuring that you calculate your R's with the same reflections, or make objective measures of map quality.

Dale Tronrud

P.S. It is entirely possible that refining a model against a very optimistic resolution cutoff and calculating the map to a lower resolution might be better than throwing out the data altogether.

On 5/1/2012 10:34 AM, Kendall Nettles wrote:
I have seen dramatic improvements in maps and behavior during refinement following use of the UCLA anisotropy server in two different cases. For one of them the Rfree went from 33% to 28%. I don't think it would have been publishable otherwise. Kendall
I didn't think the structure was publishable with an Rfree of 33% because I was expecting the reviewers to complain.

We have tested a number of data sets on the UCLA server and it usually doesn't make much difference. I wouldn't expect truncation alone to change Rfree by 5%, and it usually doesn't. The two times I have seen dramatic impacts on the maps (and Rfree), the highly anisotropic sets showed strong waves of difference density as well, which was fixed by throwing out the noise. We have moved to using loose data cutoffs for most structures, but I do think anisotropic truncation can be helpful in rare cases.
Kendall
Hi Kendall, I just did this quick test: calculated R-factors using the original and the anisotropy-corrected Mike Sawaya data (*).

Original:  r_work: 0.3026  r_free: 0.3591  number of reflections: 26944
Truncated: r_work: 0.2640  r_free: 0.3178  number of reflections: 18176

The difference in R-factors is not too surprising given how many reflections were removed (about 33%). Pavel

(*) Note, the data available in the PDB are anisotropy corrected. The original data set was kindly provided to me by the author.
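For readers following the numbers, the R-factor being compared here is simply sum|Fo - Fmodel| / sum Fo over a chosen set of reflections. The sketch below (plain NumPy, illustrative names, f_model assumed already scaled to f_obs) also shows how one would compute it on a common subset, which is the comparison Phil Jeffrey and Tom argue for above.

import numpy as np

def r_factor(f_obs, f_model, selection=None):
    """R = sum(|Fo - Fmodel|) / sum(Fo) over the selected reflections.
    Assumes f_model is already on the scale of f_obs."""
    f_obs = np.asarray(f_obs, dtype=float)
    f_model = np.asarray(f_model, dtype=float)
    if selection is not None:
        f_obs, f_model = f_obs[selection], f_model[selection]
    return np.sum(np.abs(f_obs - f_model)) / np.sum(f_obs)

# r_work = r_factor(f_obs, f_model, ~free_flags)
# r_free = r_factor(f_obs, f_model, free_flags)
# R-free restricted to the "conserved subset" inside the cutoff ellipsoid:
# r_free_conserved = r_factor(f_obs, f_model, free_flags & inside_ellipsoid)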
Is the explanation not simpler? The volumes of reciprocal space that were left out did not in fact contain signal, and it's by removing those noise non-reflections that the actual R is revealed. As James Holton posted a while ago, R-factors calculated for noise give randomly large values. So it seems less misleading to refer to it as "anisotropy-TRUNCATED". phx.
Hi Pavel,
What happens if you throw out that many reflections that have signal? Can you take out a random set of the same size?
Best,
Kendall
Hi Kendall, removing the same amount of data randomly gives Rwork/Rfree ~ 30/35%. Pavel
Sorry guys, I got a bit lost in this long thread. This test means that by truncating the data with the anisotropy server we get better R/Rfree statistics. We are throwing away "bad" data, because when the same number of randomly chosen reflections is thrown away the statistics don't improve. Hence, the server is indeed helping to lower the statistics and, if I got it right, it is also providing better maps. Is my understanding correct? Thanks, ciao, s
-- Sebastiano Pasqualato, PhD Crystallography Unit Department of Experimental Oncology European Institute of Oncology IFOM-IEO Campus via Adamello, 16 20139 - Milano Italy tel +39 02 9437 5167 fax +39 02 9437 5990
Hi Sebastiano, hm.. I don't know, I didn't have a chance to think about it; I just found it attractive and easy to do Kendall's experiment and post the results. Despite the obviousness of the result, something tells me that it's not that easy, but I'm not in the most convenient moment to carry out such a delicate process as thinking -;)

I still don't like the idea of throwing away data, even if some list members believe it is not data but junk. That requires thorough thinking and comprehensive testing: sounds like a project for someone to do. So, no - I would not take the result of this test as a green light to run all data sets through the aniso-truncation server.

What if you just start blindly removing reflections one by one, check whether removing each one decreases the R, and actually remove it if and only if it does? That would be a way to come up with a very robust and unjustifiable protocol for lowering R, but does anyone really want it? I think mathematicians should know methods to measure the information content of the data, and that would probably be a better route to take. Pavel
Hi Pavel, Could you use a similar approach to figure out where to cut your data in general? Could you compare the effects of throwing out reflections in different bins, based on I/sigma for example, and use this to determine what is truly noise? I might predict that as you throw out "noise" reflections you will see a larger drop in Rfree than from throwing out "signal" reflections, and the two should converge as you approach the "true" resolution. While we don't use I/sigma exclusively, we do tend towards cutting most of our data sets at the same I/sigma, around 1.5. It would be great if there were a more scientific approach. Best, Kendall
Hi Kendall,
removing same amount of data randomly gives Rwork/Rfree ~ 30/35%.
Pavel
On 5/3/12 4:13 AM, Kendall Nettles wrote:
Hi Pavel, What happens if you throw out that many reflections that have signal? Can you take out a random set of the same size? Best, Kendall
On May 3, 2012, at 2:41 AM, "Pavel Afonine"
wrote: Hi Kendall,
I just did this quick test: calculated R-factors using original and anisotropy-corrected Mike Sawaya's data (*)
Original: r_work : 0.3026 r_free : 0.3591 number of reflections: 26944
Truncated: r_work : 0.2640 r_free : 0.3178 number of reflections: 18176
The difference in R-factors is not too surprising given how many reflections was removed (about 33%).
Pavel
(*) Note, the data available in PDB is anisotropy corrected. The original data set was kindly provided to me by the author.
On 5/2/12 5:25 AM, Kendall Nettles wrote:
I didnt think the structure was publishable with Rfree of 33% because I was expecting the reviewers to complain.
We have tested a number of data sets on the UCLA server and it usually doesn't make much difference. I wouldn't expect truncation alone to change Rfree by 5%, and it usually doesn't. The two times I have seen dramatic impacts on the maps ( and Rfree ), the highly anisotrophic sets showed strong waves of difference density as well, which was fixed by throwing out the noise. We have moved to using loose data cutoffs for most structures, but I do think anisotropic truncation can be helpful in rare cases.
Kendall
On May 1, 2012, at 3:07 PM, "Dale Tronrud"
wrote: While philosophically I see no difference between a spherical resolution cutoff and an elliptical one, a drop in the free R can't be the justification for the switch. A model cannot be made more "publishable" simply by discarding data.
We have a whole bunch of empirical guides for judging the quality of this and that in our field. We determine the resolution limit of a data set (and imposing a "limit" is another empirical choice made) based on Rmrg, or Rmes, or Rpim getting too big or I/sigI getting too small and there is no agreement on how "too big/small" is too "too big/small".
We then have other empirical guides for judging the quality of the models we produce (e.g. Rwork, Rfree, rmsds of various sorts). Most people seem to recognize that the these criteria need to be applied differently for different resolutions. A lower resolution model is allowed a higher Rfree, for example.
Isn't is also true that a model refined to data with a cutoff of I/sigI of 1 would be expected to have a free R higher than a model refined to data with a cutoff of 2? Surely we cannot say that the decrease in free R that results from changing the cutoff criteria from 1 to 2 reflects an improved model. It is the same model after all.
Sometimes this shifting application of empirical criteria enhances the adoption of new technology. Certainly the TLS parametrization of atomic motion has been widely accepted because it results in lower working and free Rs. I've seen it knock 3 to 5 percent off, and while that certainly means that the model fits the data better, I'm not sure that the quality of the hydrogen bond distances, van der Waals distances, or maps are any better. The latter details are what I really look for in a model.
On the other hand, there has been good evidence through the years that there is useful information in the data beyond an I/sigI of 2 or an Rmeas > 100%, but getting people to use these data has been a hard slog. The reason for this reluctance is that the R values of the resulting models are higher. Of course they are higher! That does not mean the models are of poorer quality, only that data with a lower signal-to-noise ratio have been used - data that were discarded in the models you used to develop your "gut feeling" for the meaning of R.
When you change your criteria for selecting data, you have to discard your old notions about the acceptable values of empirical quality measures. You either have to normalize your measure, as Phil Jeffrey recommends, by calculating your R's with the same reflections, or you have to make objective measures of map quality.
Dale Tronrud
P.S. It is entirely possible that refining a model to a very optimistic resolution cutoff and calculating the map to a lower resolution might be better than throwing out the data altogether.
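One way to make the normalization Dale and Phil Jeffrey describe concrete: compute the R-factors of two refinements only over the reflections their data sets have in common. The sketch below is a generic NumPy illustration; the arrays of Miller indices and amplitudes are hypothetical inputs, not a Phenix API.

import numpy as np

# Hypothetical per-reflection tables: Miller indices plus observed and
# calculated amplitudes from two refinements that used different cutoffs.
def r_factor(f_obs, f_calc):
    # Standard crystallographic R: sum|Fobs - Fcalc| / sum Fobs.
    return np.sum(np.abs(f_obs - f_calc)) / np.sum(f_obs)

def common_r_factors(hkl_a, fobs_a, fcalc_a, hkl_b, fobs_b, fcalc_b):
    # hkl_* are (N, 3) integer arrays; use sets of index tuples to find the
    # reflections present in both data sets, then compute R on each run
    # restricted to that common subset.
    set_a = {tuple(h) for h in hkl_a}
    set_b = {tuple(h) for h in hkl_b}
    mask_a = np.array([tuple(h) in set_b for h in hkl_a])
    mask_b = np.array([tuple(h) in set_a for h in hkl_b])
    return (r_factor(fobs_a[mask_a], fcalc_a[mask_a]),
            r_factor(fobs_b[mask_b], fcalc_b[mask_b]))

The two R values returned are then at least computed over the same reflections, even though the refinements themselves used different amounts of data.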
On 5/1/2012 10:34 AM, Kendall Nettles wrote:
I have seen dramatic improvements in maps and behavior during refinement following use of the UCLA anisotropy server in two different cases. For one of them the Rfree went from 33% to 28%. I don't think it would have been publishable otherwise. Kendall
On May 1, 2012, at 11:10 AM, Bryan Lepore wrote:
> On Mon, Apr 30, 2012 at 4:22 AM, Phil Evans wrote:
>> Are anisotropic cutoffs desirable?
> Is there a peer-reviewed publication - perhaps from Acta Crystallographica - which describes precisely why scaling or refinement programs are inadequate to ameliorate the problem of anisotropy, and argues why the method applied in Strong et al., 2006 satisfies this need?
>
> -Bryan
Hi Kendall, This could work. You could define a fixed set of test reflections, and never touch these, and never include them in refinement, and always use this fixed set to calculate a free R. Then you could do whatever you want, throw away some work reflections, etc, refine, and evaluate how things are working with the fixed free R set. All the best, Tom T
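A small illustration of the fixed-test-set idea: derive the free flag deterministically from the Miller index, so a reflection keeps the same flag no matter which truncation or file it comes from. This is only a sketch - the hash scheme and the 5% fraction are arbitrary choices - and it is not how Phenix or CCP4 actually assign free-R flags.

import hashlib

def is_free(hkl, fraction=0.05):
    """Return True if reflection hkl belongs to the (fixed) free set."""
    # Hash the Miller index so the decision depends only on (h, k, l),
    # not on file order or on which reflections were discarded.
    digest = hashlib.sha256(repr(tuple(hkl)).encode()).digest()
    # Map the first 4 bytes of the hash to [0, 1) and compare to the fraction.
    return int.from_bytes(digest[:4], "big") / 2**32 < fraction

# The same reflection always gets the same answer, in any data selection.
print(is_free((1, 2, 3)), is_free((10, 0, -4)))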
Hi, we discussed this with Randy the other day. A couple of copy-pastes from that discussion. In general, given a highly anisotropic data set:
1) maps calculated using all (unmodified) data by phenix.refine, phenix.maps and similar tools are better than maps calculated using anisotropy truncated data. So, yes, for the purpose of map calculation there is no need to do anything: Phenix map calculation tools deal with anisotropy very well.
2) phenix.refine refinement may fail if one uses the original anisotropic data set. This is probably because the ML target does not use experimental sigmas (and anisotropy correction by the UCLA server is nothing but Miller-index-dependent removal of data by a sigma criterion - yeah, that old, well-criticized practice of throwing away data that you worked hard to measure!). Maybe using sigmas in the ML calculation could solve the problem, but that has to be proved.
Nat: I added the code that does anisotropy truncation a while ago (see the miller class - there are a couple of methods that do it), but this is not exposed at the user level.
Pavel
On 4/29/12 6:30 AM, Leonid Sazanov wrote:
Hi, so will we see now the possibility to apply anisotropic resolution limits truncation in Phenix? Despite the fact that a lot of people use UCLA server, there is still no possibility to do this in any major crystallographic software, AFAIK. Phaser does the scaling, but no truncation, and noise sitting in those regions where there is no signal, is a real culprit. Large SIGFs there do not resolve the problem completely. Thanks.
> Message: 4 Date: Wed, 25 Apr 2012 09:05:57 -0400 From: Mario Sanches
To: PHENIX user mailing list Subject: [phenixbb] Geometry Restraints - UPDATE .......... It turns out that the reason to my nightmares was a highly anisotropic dataset. When I tried to fix geometry related problems in coot, phenix.refine was twisting the geometry back again, probably because it was trying to fit a lot of noise due to the anisotropy. Everything was getting worst, geometry, clashscore, and Ramachandran. Pavel finally corrected it by using this server: http://services.mbi.ucla.edu/anisoscale/
............
1) maps calculated using all (unmodified) data by phenix.refine, phenix.maps and similar tools are better than maps calculated using anisotropy truncated data. So, yes, for the purpose of map calculation there is no need to do anything: Phenix map calculation tools deal with anisotropy very well.
That is not our experience in cases of really severe anisotropy.
2) phenix.refine refinement may fail if one uses the original anisotropic data set. This is probably because the ML target does not use experimental sigmas (and anisotropy correction by the UCLA server is nothing but Miller-index-dependent removal of data by a sigma criterion - yeah, that old, well-criticized practice of throwing away data that you worked hard to measure!). Maybe using sigmas in the ML calculation could solve the problem, but that has to be proved.
The UCLA server removes all the data beyond the set ellipsoid; it does not deal with individual reflections by sigma. Also, one can set one's own resolution limits on the UCLA server, depending on personally preferred criteria. As for throwing away data, what is the difference here from "throwing away data" when we cut the resolution isotropically (one high-resolution limit) using whatever criterion (I/sigma, R, correlation) one prefers? Cheers, Leonid
1) maps calculated using all (unmodified) data by phenix.refine, phenix.maps and similar tools are better than maps calculated using anisotropy truncated data. So, yes, for the purpose of map calculation there is no need to do anything: Phenix map calculation tools deal with anisotropy very well.
That is not our experience in cases of really severe anisotropy.
Can you send me an example off-list, please (need model and data files (before and after truncation)).
2) phenix.refine refinement may fail if one uses the original anisotropic data set. This is probably because the ML target does not use experimental sigmas (and anisotropy correction by the UCLA server is nothing but Miller-index-dependent removal of data by a sigma criterion - yeah, that old, well-criticized practice of throwing away data that you worked hard to measure!). Maybe using sigmas in the ML calculation could solve the problem, but that has to be proved.
The UCLA server removes all the data beyond the set ellipsoid; it does not deal with individual reflections by sigma.
Yes, it's not done per reflection, but indirectly, as part of the determination of the parameters of the ellipsoid that is then used to cut the data (as far as I understand it).
Also, one can set one's own resolution limits on the UCLA server, depending on personally preferred criteria.
Maybe it's just fine (given the current state of the art of methods used in refinement) if it's done carefully and thoughtfully. The trend, though, seems to be to use it blindly, "just in case it gives me a lower R", which I find dangerous (and yes, it is wrong to compare R-factors calculated using different amounts of data!). Pavel
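For what it's worth, a toy illustration of the geometric difference between a spherical and an ellipsoidal resolution cutoff, for an orthorhombic cell where the reciprocal axes can serve as the principal axes of the ellipsoid. The cell edges and the three per-axis limits are invented; the real server determines the ellipsoid orientation and limits from the data, which this sketch does not attempt.

import numpy as np

a, b, c = 60.0, 80.0, 120.0              # hypothetical cell edges (Angstrom)
d_limits = np.array([2.0, 2.3, 3.0])     # invented per-axis resolution limits

def d_star_components(hkl):
    # Reciprocal-space vector components for an orthorhombic cell:
    # 1/d^2 = (h/a)^2 + (k/b)^2 + (l/c)^2.
    h, k, l = hkl
    return np.array([h / a, k / b, l / c])

def keep_spherical(hkl, d_min=2.0):
    # Conventional isotropic cutoff: keep if |d*| <= 1/d_min.
    return np.linalg.norm(d_star_components(hkl)) <= 1.0 / d_min

def keep_ellipsoidal(hkl):
    # Keep the reflection if it lies inside the ellipsoid whose semi-axes
    # are 1/d_limit along each principal direction.
    s = d_star_components(hkl) * d_limits
    return float(np.sum(s * s)) <= 1.0

# (0, 0, 50) lies at 2.4 A along c*: it survives a spherical 2.0 A cutoff
# but is removed by the ellipsoid, whose limit along c* is only 3.0 A.
print(keep_spherical((0, 0, 50)), keep_ellipsoidal((0, 0, 50)))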
1) maps calculated using all (unmodified) data by phenix.refine, phenix.maps and similar tools are better than maps calculated using anisotropy truncated data. So, yes, for the purpose of map calculation there is no need to do anything: Phenix map calculation tools deal with anisotropy very well.
That is not our experience in cases of really severe anisotropy.
Can you send me an example off-list, please (need model and data files (before and after truncation)).
The improvement after truncation is consistent over many different datasets, but the effects are sometimes subtle, so I will have to look for a clear-cut example when I have time. Truncation also helps a lot for density modification. I don't think I have ever seen maps getting worse after truncation (although we usually use somewhat more generous limits than the UCLA server suggests).
2) phenix.refine refinement may fail if one uses the original anisotropic data set. This is probably because the ML target does not use experimental sigmas (and anisotropy correction by the UCLA server is nothing but Miller-index-dependent removal of data by a sigma criterion - yeah, that old, well-criticized practice of throwing away data that you worked hard to measure!). Maybe using sigmas in the ML calculation could solve the problem, but that has to be proved.
The UCLA server removes all the data beyond the set ellipsoid; it does not deal with individual reflections by sigma.
Yes, it's not done per reflection, but indirectly, as part of the determination of the parameters of the ellipsoid that is then used to cut the data (as far as I understand it).
Also, one can set one's own resolution limits on the UCLA server, depending on personally preferred criteria.
Maybe it's just fine (given the current state of the art of methods used in refinement) if it's done carefully and thoughtfully. The trend, though, seems to be to use it blindly, "just in case it gives me a lower R", which I find dangerous (and yes, it is wrong to compare R-factors calculated using different amounts of data!).
Pavel
Of course, the main criterion for using truncation is the improved (or not) appearance of the maps and the ability to continue building/refining into those maps. R-factors do not drop that much due to truncation per se. Leonid
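Since map appearance keeps coming up as the deciding criterion, here is a tiny sketch of one objective way to compare maps: the linear correlation between two real-space maps sampled on the same grid (for instance, maps computed from the original and the truncated data). The inputs are placeholder arrays; in practice one would use the map tools in Phenix itself.

import numpy as np

def map_correlation(map_a, map_b):
    # Pearson correlation between two maps on the same grid; values near 1
    # mean the two maps show essentially the same density features.
    a = np.asarray(map_a, dtype=float).ravel()
    b = np.asarray(map_b, dtype=float).ravel()
    a = a - a.mean()
    b = b - b.mean()
    return float(np.dot(a, b) / np.sqrt(np.dot(a, a) * np.dot(b, b)))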
participants (13)
- Bryan Lepore
- Dale Tronrud
- Dialing Pretty
- Frank von Delft
- Johan Hattne
- Kendall Nettles
- Leonid Sazanov
- Nathaniel Echols
- Pavel Afonine
- Phil Evans
- Phil Jeffrey
- Sebastiano Pasqualato
- Terwilliger, Thomas C