Re: [phenixbb] phenix.ensemble_refinement questions

14 Nov 2014

Hi Tim,
That is a fair point, although the observed (very modest) improvement with
phases was for the final ensemble. My motivation for looking into this is
that some of our unpublished work has indicated that even weak phase
information that is properly handled with a maximum likelihood target
(hence my use of HL coeffs with MLHL) can improve the modeling of disorder
using "CNS-style" non-interacting multicopy ensemble models.  These are
distinct from the ensembles implemented in PHENIX in that they are not
time-averaged MD models but rather a set of n full replicates of the model
that do not interact with each other and are allowed to collectively fit
the data.  I wanted to see how time-averaged ensembles compared to this
approach when phase information was included.
	The overarching rationale for these tests is that phase information
provides a powerful set of additional experimental observations that may
counteract the intrinsic tendency of ensemble approaches to overfit data,
even if the phase information is noisy.  In any event, I certainly did not
see a detrimental effect of added phase information on the R values of the
ensemble models with optimum choices of pTLS and using the MLHL target.
Your broader point, however, is a good one: perhaps the true value of
including phase data isn't evident unless unusually high quality
experimental phases are being used.
Best regards,
Mark

Mark A. Wilson
Associate Professor
Department of Biochemistry/Redox Biology Center
University of Nebraska
N118 Beadle Center
1901 Vine Street
Lincoln, NE 68588
(402) 472-3626
[email protected] 

On 11/14/14 2:50 AM, "Tim Gruene" wrote:
...
Dear Mark,
for a 1A structure, at least when it is near completeness, I would
expect the phase information from the model to be much, much more accurate
than that from a SAD experiment, i.e. I'd rather expect the R value to get
worse when you include SAD phasing information. For example, when
developers show the phase errors of their methods for phasing
experiments, the error is generally calculated with respect to the final
model.
Did you observe the drop at an early stage of refinement with a
less complete model?
Regards,
Tim
On 11/13/2014 05:03 PM, Mark Wilson wrote:
...
Hi All,
As I am (I think) the other user that Nat is referring to, I'll comment.
I requested support for experimental phase information in the PHENIX
ensemble refinement target and can verify that it is accepted (in my case
as HL coeffs) and will run.  The result in my test case was not
dramatically different than an amplitude-based target, but obviously a
great many factors could affect this.  What differences I saw were minor
improvements in R/Rfree (~1%) with Se-Met SAD phases in a 1.05 Å
resolution structure of a flexible protein.  I've not dug too much more
into this, but I can verify that phases are accepted and do influence the
final ensemble.
Best regards,
Mark
On 11/13/14 9:53 AM, "Nathaniel Echols" wrote:
...
On Thu, Nov 13, 2014 at 6:46 AM, Joseph Brock wrote:
1. In the associated publication (Burnley et al., eLife 2012), the
ensemble refinement is validated by comparing the correlation of the
ensemble-generated map with the map generated from the experimental
phases for PDB entry 1YTT... I am confused how one computes an
experimentally phased map from structure factors deposited in the PDB
that contain only anomalous intensities/amplitudes and not
Hendrickson-Lattman coefficients. Is there a program within the phenix
package that can do this?
AutoSol can be used to re-solve such datasets, although in the case of
1YTT it requires additional information that wasn't deposited.
2. Is it possible to include experimental phases during the rolling
average refinement process and could this be beneficial (if the phases
were of a sufficient quality)?
It is possible, but completely untested aside from verifying that it
doesn't crash.  I added this a year ago at the request of another user
but haven't looked into it since.
3. What is the function of the "nproc" keyword? If this is the number of
CPU cores that can be used in parallel, what is the most efficient way of
using phenix.ensemble_refinement on a cluster?
The only parallelization is in the optimization of the ptls parameter -
i.e. if you try N values for ptls, you can run N jobs at once.  For a
single ptls it will run in serial.  So on a cluster, you are better off
running N different jobs separately.
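The advice above (one serial job per ptls value) could be scripted roughly
as follows. This is only a sketch: the file paths, the directory naming and
the list of ptls values are made up for illustration, and the
`ptls=<value>` command-line form is assumed from the parameter name used in
this thread, so it should be checked against the installed version. The
script only builds the command strings; submitting them to a queue is left
to whatever scheduler the cluster uses.

```python
import shlex


def build_ptls_jobs(model, data, ptls_values):
    """Return one (job_dir, command) pair per pTLS value.

    Each command is a single serial phenix.ensemble_refinement run, so the
    N runs can be submitted as N independent cluster jobs.
    """
    jobs = []
    for ptls in ptls_values:
        # Hypothetical naming scheme: one working directory per pTLS value,
        # so that the outputs of the separate runs do not overwrite each other.
        job_dir = f"ptls_{ptls:.2f}"
        # Assumed command-line form: model file, data file, ptls=<value>.
        command = shlex.join(
            ["phenix.ensemble_refinement", model, data, f"ptls={ptls}"]
        )
        jobs.append((job_dir, command))
    return jobs


if __name__ == "__main__":
    # Placeholder paths; use absolute paths because each job runs in its
    # own directory.
    for job_dir, command in build_ptls_jobs(
        "/path/to/model.pdb", "/path/to/data.mtz", [0.6, 0.8, 0.9, 1.0]
    ):
        print(f"{job_dir}\t{command}")
```

Each printed line gives a working directory and the command to run in it;
a submission script would create the directory, change into it and hand the
command to the scheduler.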
Finally, I noticed that I cannot run phenix.ensemble_refinement using a
"my_parameters.eff" file; it is necessary to type the parameters on the
command line.
That sounds like a bug...
-Nat
_______________________________________________
phenixbb mailing list
[email protected]
http://phenix-online.org/mailman/listinfo/phenixbb
-- 
Dr Tim Gruene
Institut fuer anorganische Chemie
Tammannstr. 4
D-37077 Goettingen
GPG Key ID = A46BEE1A