Hi Tim,
That is a fair point, although the observed (very modest) improvement with
phases was for the final ensemble. My motivation for looking into this is
that some of our unpublished work has indicated that even weak phase
information that is properly handled with a maximum likelihood target
(hence my use of HL coeffs with MLHL) can improve the modeling of disorder
using "CNS-style" non-interacting multicopy ensemble models. These are
distinct from the ensembles implemented in PHENIX in that they are not
time-averaged MD models but rather a set of n full replicates of the model
that do not interact with each other and are allowed to collectively fit
the data. I wanted to see how time-averaged ensembles compared to this
approach when phase information was included.
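
To make the distinction between the two ensemble types concrete, here is a minimal Python sketch. It rests on assumptions and is not the CNS or PHENIX implementation: the function names are hypothetical, each replica in the multicopy model is taken to have occupancy 1/n, and the time-averaged model is assumed to use an exponentially weighted rolling average with a relaxation time tau.

```python
import math

def multicopy_fcalc(replica_fcalcs):
    """Non-interacting multicopy model (assumed form).

    replica_fcalcs: list of n replicas, each a list of complex F_calc
    values (one per reflection). With every replica at occupancy 1/n,
    the model structure factor is the mean over replicas.
    """
    n = len(replica_fcalcs)
    n_refl = len(replica_fcalcs[0])
    return [sum(rep[h] for rep in replica_fcalcs) / n for h in range(n_refl)]

def rolling_average_fcalc(snapshot_fcalcs, dt, tau):
    """Time-averaged model (assumed form).

    snapshot_fcalcs: F_calc lists from successive MD snapshots.
    Each new snapshot is blended into a running average with
    weight (1 - exp(-dt / tau)); older snapshots decay exponentially.
    """
    w = math.exp(-dt / tau)
    avg = list(snapshot_fcalcs[0])
    for snap in snapshot_fcalcs[1:]:
        avg = [w * a + (1.0 - w) * f for a, f in zip(avg, snap)]
    return avg
```

In the first function all n replicas are fit to the data at once. In the second, only one copy of the model exists at any moment and the comparison with the data goes through the running average.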
The overarching rationale for these tests is that phase information
provides a powerful set of additional experimental observations that may
counteract the intrinsic tendency of ensemble approaches to overfit data,
even if the phase information is noisy. In any event, I certainly did not
see a detrimental effect of added phase information on the R values of the
ensemble models with optimum choices of pTLS and using the MLHL target.
Your broader point, however, is a good one: perhaps the true value of
including phase data isn't evident unless unusually high quality
experimental phases are being used.
Best regards,
Mark
Mark A. Wilson
Associate Professor
Department of Biochemistry/Redox Biology Center
University of Nebraska
N118 Beadle Center
1901 Vine Street
Lincoln, NE 68588
(402) 472-3626
[email protected]
On 11/14/14 2:50 AM, "Tim Gruene" wrote:
Dear Mark,
For a 1 Å structure, at least when it is near completeness, I would expect the phase information from the model to be much, much more accurate than that from a SAD experiment; i.e., I would rather expect the R value to get worse when you include SAD phasing information. For example, when developers show the phase errors of their methods for phasing experiments, the error is generally calculated with respect to the final model.
Did you observe the drop at an early stage of refinement, with a model of low completeness?
Regards, Tim
On 11/13/2014 05:03 PM, Mark Wilson wrote:
Hi All,
As I am (I think) the other user that Nat is referring to, I'll comment. I requested support for experimental phase information in the PHENIX ensemble refinement target and can verify that it is accepted (in my case as HL coeffs) and will run. The result in my test case was not dramatically different than an amplitude-based target, but obviously a great many factors could affect this. What differences I saw were minor improvements in R/Rfree (~1%) with Se-Met SAD phases in a 1.05 Å resolution structure of a flexible protein. I've not dug too much more into this, but I can verify that phases are accepted and do influence the final ensemble.
Best regards,
Mark
On 11/13/14 9:53 AM, "Nathaniel Echols" wrote:
On Thu, Nov 13, 2014 at 6:46 AM, Joseph Brock wrote:
1. In the associated publication (Burnley et al., eLife 2012), the ensemble refinement is validated by comparing the correlation of the ensemble-generated map with the map generated from the experimental phases for PDB entry 1YTT... I am confused about how one computes an experimentally phased map from structure factors deposited in the PDB that contain only anomalous intensities/amplitudes and not Hendrickson-Lattman coefficients. Is there a program within the PHENIX package that can do this?
AutoSol can be used to re-solve such datasets, although in the case of 1YTT it requires additional information that wasn't deposited.
2. Is it possible to include experimental phases during the rolling average refinement process, and could this be beneficial (if the phases were of sufficient quality)?
It is possible, but completely untested aside from verifying that it doesn't crash. I added this a year ago at the request of another user but haven't looked into it since.
3. What is the function of the "nproc" keyword? If this is the number of CPU cores that can be used in parallel, what is the most efficient way of using phenix.ensemble_refinement on a cluster?
The only parallelization is in the optimization of the ptls parameter - i.e. if you try N values for ptls, you can run N jobs at once. For a single ptls it will run in serial. So on a cluster, you are better off running N different jobs separately.
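
The advice above (one separate job per ptls value) could be scripted roughly as follows. This is only a sketch under assumptions: the helper name, the file names, the working-directory names, and the `ptls=<value>` command-line form are all illustrative and should be checked against your PHENIX version. The script only builds the argument lists and does not launch anything.

```python
def build_ptls_jobs(model, data, ptls_values,
                    program="phenix.ensemble_refinement"):
    """Return one (workdir, argv) pair per pTLS value.

    Each pair describes an independent serial job, so N pTLS values
    give N jobs that a cluster scheduler can run side by side.
    The 'ptls=<value>' argument form is an assumption; verify it.
    If each job runs inside its own workdir, pass absolute paths
    for model and data.
    """
    jobs = []
    for ptls in ptls_values:
        workdir = f"ptls_{ptls}"          # hypothetical directory name
        argv = [program, model, data, f"ptls={ptls}"]
        jobs.append((workdir, argv))
    return jobs

if __name__ == "__main__":
    # Hypothetical input files and trial values, for illustration only.
    for workdir, argv in build_ptls_jobs("model.pdb", "data.mtz",
                                         [0.6, 0.8, 0.9, 1.0]):
        print(workdir, " ".join(argv))
```

Each printed line can then be wrapped in whatever submission script your cluster's scheduler expects.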
Finally, I noticed that I cannot run phenix.ensemble_refinement using a "my_parameters.eff" file; it is necessary to type the parameters on the command line.
That sounds like a bug...
-Nat
_______________________________________________ phenixbb mailing list [email protected] http://phenix-online.org/mailman/listinfo/phenixbb
-- Dr Tim Gruene Institut fuer anorganische Chemie Tammannstr. 4 D-37077 Goettingen
GPG Key ID = A46BEE1A