How to change the FOBS/SIGMA_FOBS value in phenix.refine
Dear PhenixBB,
Is there a place to set the I/SIGMA_I or FOBS/SIGMA_FOBS values in phenix.refine? I found that the default FOBS/SIGMA_FOBS threshold in phenix.refine is 1.35, which means any reflection with a ratio below 1.35 will be excluded from the refinement. This will generally result in lower completeness, especially for relatively weak datasets. When I use xray_data.remove_outliers=False, I get the error message "Sorry: Unknown command line parameter definition: remove_outliers = False". My PHENIX version is phenix-1.12-2829. Thanks.
Regards,
Jianxun
Dr. Jianxun Qi
A321, Institute of Microbiology, Chinese Academy of Sciences
No.1 West Beichen Road, Chaoyang District
Beijing 100101, China
Tel: 86-10-64806182
Hi Jianxu,
phenix.refine does not remove reflections by I/SIGI or F/SIGF criteria. What you see in the file is the result of taking all Fobs and all sigmas, dividing one by the other, taking the minimum of that, and reporting it in the PDB file header. So 1.35 is the actual min(FOBS/SIGFOBS) for the data in your file. If you want to apply such a cutoff, then use:
xray_data.sigma_fobs_rejection_criterion=VALUE
or
xray_data.sigma_iobs_rejection_criterion=VALUE
Pavel
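To illustrate the distinction made in this reply, here is a minimal Python sketch. It uses plain lists rather than the phenix/cctbx API, and the function names and sample numbers are made up for illustration. It only shows that reporting min(FOBS/SIGFOBS) describes the data, whereas a sigma cutoff actually discards reflections.

```python
def min_f_over_sigma(fobs, sigmas):
    """Smallest F/sigma ratio in the data: a statistic, not a filter."""
    return min(f / s for f, s in zip(fobs, sigmas) if s > 0)


def apply_sigma_cutoff(fobs, sigmas, cutoff):
    """Keep only reflections with F/sigma >= cutoff (hypothetical helper)."""
    return [(f, s) for f, s in zip(fobs, sigmas) if s > 0 and f / s >= cutoff]


# Made-up amplitudes and sigmas; the third reflection is the weakest.
fobs = [120.0, 45.0, 13.5, 9.0]
sigmas = [4.0, 3.0, 10.0, 3.0]

reported_min = min_f_over_sigma(fobs, sigmas)   # describes the data only
kept = apply_sigma_cutoff(fobs, sigmas, 2.0)    # actually removes data

print("reported minimum F/sigma:", reported_min)
print("reflections before cutoff:", len(fobs), "after cutoff:", len(kept))
```

Computing the minimum leaves all four reflections in place; only the explicit cutoff step changes how many are used.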
Dear Jianxu, What you're seeing here is the effect of the French and Wilson algorithm to turn intensities into amplitudes. As the SIGI gets larger and larger, the French and Wilson algorithm provides an F that becomes closer and closer to the mean amplitude in the Wilson distribution for amplitudes, and the SIGF becomes closer and closer to the rms deviation from the mean amplitude in the Wilson distribution. As a result the minimum F/SIGF that you can possibly find is about 1.913 for acentric reflections and 1.324 for centric reflections (http://journals.iucr.org/d/issues/2016/03/00/dz5382/index.html#FD29). The 1.35 that you see would be for a centric reflection that is close to having no information in the original intensity measurement. Best wishes, Randy Read
------
Randy J. Read
Department of Haematology, University of Cambridge
Cambridge Institute for Medical Research, Wellcome Trust/MRC Building
Hills Road, Cambridge CB2 0XY, U.K.
Tel: +44 1223 336500
Fax: +44 1223 336827
www-structmed.cimr.cam.ac.uk
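The two limiting values quoted in Randy's message can be reproduced with a short calculation. The sketch below is not the French and Wilson procedure itself; it only assumes the standard Wilson distributions for amplitudes (Rayleigh for acentric reflections, half-normal for centric ones) with the scale set to 1, and divides the mean amplitude by the rms deviation from that mean. The function names are illustrative.

```python
import math


def acentric_limit():
    # Acentric Wilson amplitudes: <F> = sqrt(pi)/2 and <F^2> = 1,
    # so the variance is 1 - pi/4.
    mean = math.sqrt(math.pi) / 2.0
    rms_deviation = math.sqrt(1.0 - math.pi / 4.0)
    return mean / rms_deviation


def centric_limit():
    # Centric Wilson amplitudes: <F> = sqrt(2/pi) and <F^2> = 1,
    # so the variance is 1 - 2/pi.
    mean = math.sqrt(2.0 / math.pi)
    rms_deviation = math.sqrt(1.0 - 2.0 / math.pi)
    return mean / rms_deviation


print("acentric minimum F/SIGF: %.3f" % acentric_limit())
print("centric minimum F/SIGF:  %.3f" % centric_limit())
```

Under these assumptions the ratios come out at about 1.913 and 1.324, matching the figures given above, which is why a reported minimum near 1.3 points to a centric reflection with almost no measured information.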
Dear Randy and Pavel,
There is one complication that arises from the report of the MIN(FOBS/SIGMA_FOBS) value in PDB files from phenix.refine. Nearly every PDB entry refined with phenix.refine reports an F/sig(F) cutoff value of 1.3x, while Buster- and Refmac-generated PDB files have 0 or -/None for that value (just checked again with this week's released PDBs). This is clearly not intended. Again, the value the Protein Data Bank is reporting is a cutoff, based on the minimum value phenix.refine appears to report. Since I use French-Wilson for I to F conversions, I have had to correct this cutoff value by communicating with the PDB with every deposition, but it appears that most users rarely go to the trouble.
The issue seems to arise from the lack of a cutoff value in the phenix.refine generated .pdb files; Refmac has a DATA CUTOFF (SIGMA(F)) record (set to NONE by default), which is picked up during structure deposition. So either the PDB has to be told that the phenix.refine min value is just a minimum value and not a cutoff, or phenix.refine might add another REMARK card for DATA CUTOFF (SIGMA(F)) under the DATA USED IN REFINEMENT section in REMARKS.
Hope this helps,
Engin
--
Engin Özkan, Ph.D.
Assistant Professor
Dept of Biochemistry and Molecular Biology
University of Chicago
http://ozkan.uchicago.edu
Hi Engin, thanks for feedback!
> There is one complication that arises from the report of the MIN(FOBS/SIGMA_FOBS) value in pdb files from phenix.refine. Nearly every PDB entry using phenix.refine reports a F/sig(F) cutoff value of 1.3x,
As I alluded to yesterday, this is not a cutoff but a reported fact about your data. No data are removed or otherwise manipulated in relation to this number.
> while Buster and Refmac-generated pdbs have 0 or -/None for that value (just checked again with this week's released PDBs). This is clearly not intended. Again, the value the Protein Data Bank is reporting is a cutoff, based on the minimum value phenix.refine appears to report. Since I use French-Wilson for I to F conversions, I have had to correct this cutoff value by communicating with PDB with every deposition, but it appears that most users rarely go through the trouble.
REMARK 3 records are free format. It's up to program authors to choose what to print there. Nowhere in the record in question produced by phenix.refine does it say "cutoff":
REMARK 3 MIN(FOBS/SIGMA_FOBS) : 1.380
Refmac and Buster print:
REMARK 3 DATA CUTOFF (SIGMA(F)) : 0.000
which clearly says "cutoff". So I guess we are fine as long as there is no wishful thinking involved and people carefully read what's written!
> The issue seems to arise from a lack of a cutoff value in the phenix.refine generated .pdb files; Refmac has a DATA CUTOFF (SIGMA(F)) (set to NONE by default), which is picked up during structure deposition. So, either PDB has to be told that phenix.refine min value is just a minimum value and not a cutoff, or phenix.refine might add another REMARK card for DATA CUTOFF (SIGMA(F)) under the DATA USED IN REFINEMENT section in REMARKS.
Sure, we can add a "cutoff" record if you think it is helpful. In general, there are way more facts to report about the data than this single number. As long as people deposit 1) the data actually used in refinement (which may have been truncated by sigma or resolution, subjected to automated outlier rejection, converted from Iobs to Fobs, converted from anomalous F+/- to non-anomalous Imean, etc.) and 2) the original data (not manipulated in any way), and as long as the PDB actually accepts these data, then all should be fine. Note: phenix.refine always outputs an MTZ file containing the original input data and the data actually used in refinement.
All the best,
Pavel
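The difference between the two REMARK 3 records quoted in this message can be made explicit in code. The following is a hedged Python sketch of how a downstream tool might tell them apart; the function name, the return labels, and the whitespace-tolerant pattern are assumptions for illustration, not part of phenix.refine, Refmac, Buster, or any PDB deposition software.

```python
import re

# The two record styles quoted in the thread.
HEADER = """\
REMARK 3 MIN(FOBS/SIGMA_FOBS) : 1.380
REMARK 3 DATA CUTOFF (SIGMA(F)) : 0.000
"""


def classify_remark3(line):
    """Return (kind, value) for the two record styles, else None."""
    m = re.match(r"REMARK\s+3\s+(.+?)\s*:\s*(\S+)\s*$", line)
    if m is None:
        return None
    label, raw = m.group(1), m.group(2)
    try:
        value = float(raw)
    except ValueError:
        value = None  # non-numeric entries such as NONE
    if label.startswith("MIN(FOBS/SIGMA_FOBS)"):
        return ("observed_minimum", value)  # a fact about the data
    if label.startswith("DATA CUTOFF (SIGMA(F))"):
        return ("cutoff", value)            # a filter that was applied
    return None


results = [classify_remark3(line) for line in HEADER.splitlines()]
for kind, value in results:
    print(kind, value)
```

Keeping the two kinds separate avoids recording an observed minimum as if it were a rejection threshold.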
Dear Pavel,
I agree with every single one of your points. As I mentioned, it is not phenix.refine that reports a cutoff, but the Protein Data Bank. My point was to have the Protein Data Bank record correctly what phenix.refine provides, and so prevent confusion (as was the case for the original post). The "wishful thinking" you have pointed out is being done by the PDB on every single phenix-generated structure, so I hope that they change this practice if they see this post. I understand that PHENIX developers cannot be responsible for what third parties, such as the Protein Data Bank, do with files provided by your software.
Thank you for the awesome software,
Engin
participants (4): Engin Özkan, Pavel Afonine, Randy Read, 齐建勋