phenix.refine questions: weights, docs, etc.
Hi, four questions on phenix.refine:

1) What's the reason for refining bss, xyz and adp in separate cycles? CNS does that too, while refmac (I believe) refines them simultaneously. I suspect I missed something, but I thought the power of ML refinement is that no refinable parameters are held constant if they're not supposed to be. (At least, wasn't that the big deal in Sharp when it came out?)

2) How does one figure out the right restraint weights (wxc and wxu)? In refmac, I've always calibrated the weight with rmsd(bonds), tightening it until rmsd(bonds) is ~0.018 (or below for lower resolution); invariably this is where Rfree will decrease furthest and smoothly as well (i.e. without that silly rise at the end). With phenix.refine I'm finding that I only get Rfree to drop monotonically when I tighten the weights (both wxc and wxu) so much that rmsd(bonds) goes down to ~0.002. Would there be a different way to do this?

3) There isn't currently torsional refinement, is there? For high-temperature simulated annealing, I mean. I'm having the same weight-adjustment question for SA, except it is harder to fiddle with because jobs take so long, but with torsional refinement that becomes moot. Or should one always put very tight weights on for SA?

4) Is there good documentation for all the loads and loads of keywords? The names themselves are just descriptive enough to be interesting, but obviously not enough to be informative if I don't just want the black box.

Nice program, though, got to say... phx.
Hi, thanks for your questions!
1) What's the reason for refining bss, xyz and adp in separate cycles? CNS does that too while refmac (I believe) refines them simultaneously. I suspect I missed something, but I thought the power of ML refinement is that no refinable parameters are held constant if they're not supposed to be. (At least, wasn't that the big deal in Sharp when it came out?)
It is common practice to refine bulk solvent and scale, coordinates, ADPs, and other parameters separately. There are numerical issues behind this (see for example: Acta Cryst. (1978). A34, 791-809; Acta Cryst. (2005). D61, 850-855; ...). CNS does exactly the same: it refines these parameters separately. To my knowledge, the same is true for REFMAC. This is not target specific. The power of ML refinement is that the ML target statistically models the missing scatterers in a model and the errors (Acta Cryst. (2002). A58, 270-282).
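For reference, the separate parameter classes map onto phenix.refine's strategy keyword; a hedged sketch of an invocation (option names as I recall them from the phenix.refine documentation; check them against your version):

```shell
# Alternate coordinate (xyz) and individual-ADP refinement within each
# macro-cycle; bulk solvent and scaling (bss) are updated as well.
phenix.refine model.pdb data.mtz \
  strategy=individual_sites+individual_adp \
  main.number_of_macro_cycles=3
```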
2) How does one figure out the right restraint weights (wxc and wxu)? In refmac, I've always calibrated the weight with rmsd(bonds), tightening it until rmsd(bonds) is ~0.018 (or below for lower resolution); invariably this is where Rfree will decrease furthest and smoothly as well (i.e. without that silly rise at the end).
With phenix.refine I'm finding that I only get Rfree to drop monotonically when I tighten the weights (both wxc and wxu) so much that rmsd(bonds) goes down to ~0.002. Would there be a different way to do this?
The weights wxc for coordinate refinement (refinement target E = wxc*Exray + Echem) and wxu for ADP refinement (refinement target E = wxu*Exray + Eadp) are determined automatically based on the gradient ratio (Proc. Natl. Acad. Sci. USA (1997). 94, 5018-5023). Again, this is very close to what CNS does and has been demonstrated to be relatively reliable. Since it is automatic, it may give different weights from macro-cycle to macro-cycle, which can cause some slight oscillation in R/Rfree. You can also scale the weights manually using wxc_scale and wxu_scale to achieve the desired gap between R and Rfree and the desired deviations of bonds and angles from ideal values (see the phenix.refine documentation).
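As a rough sketch of the gradient-ratio idea (my simplification for illustration, not the exact phenix.refine formula): the X-ray term is scaled so that its gradient is comparable in magnitude to the restraint gradient.

```python
import math

def rms(g):
    """Root-mean-square magnitude of a gradient vector."""
    return math.sqrt(sum(v * v for v in g) / len(g))

def gradient_ratio_weight(grad_xray, grad_chem):
    """Pick wxc in E = wxc*Exray + Echem so that the scaled X-ray
    gradient and the restraint gradient have comparable magnitudes.
    Simplified illustration; the actual scheme has more detail."""
    return rms(grad_chem) / rms(grad_xray)

# Toy gradient vectors: restraint gradients twice as large as X-ray ones,
# so the X-ray term gets weighted up by a factor of two.
wxc = gradient_ratio_weight([1.0, 1.0], [2.0, 2.0])  # -> 2.0
```

A manual wxc_scale then simply multiplies this automatically determined weight up or down.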
3) There isn't currently torsional refinement, is there?
It is under active development.
4) Is there good documentation for all the loads and loads of keywords? The names themselves are just descriptive enough to be interesting, but obviously not enough to be informative if I don't just want the black box.
Many of the important and popular scenarios and keywords are covered in the phenix.refine documentation. I plan to expand it before the next release. Please let me know if you have any further questions.

Pavel.
Pavel Afonine wrote:
> Hi,
> thanks for your questions!
>
> 1) What's the reason for refining bss, xyz and adp in separate cycles? CNS does that too while refmac (I believe) refines them simultaneously. I suspect I missed something, but I thought the power of ML refinement is that no refinable parameters are held constant if they're not supposed to be. (At least, wasn't that the big deal in Sharp when it came out?)
>
> It is common practice to refine bulk solvent and scale, coordinates, ADPs, and other parameters separately. There are numerical issues behind this (see for example: Acta Cryst. (1978). A34, 791-809; Acta Cryst. (2005). D61, 850-855; ...). CNS does exactly the same: it refines these parameters separately. To my knowledge, the same is true for REFMAC. This is not target specific. The power of ML refinement is that the ML target statistically models the missing scatterers in a model and the errors (Acta Cryst. (2002). A58, 270-282).
The set of parameters that can be refined simultaneously is a consequence of the choice of optimization method, not of the function being optimized. The crucial difference is how the second derivatives are handled. In the methods used in Phenix and CNS, the second derivatives of all parameters are assumed to be equal and uncorrelated. To ensure that this assumption approximately holds, only parameters of the same category can be varied in a single cycle: the coordinates can be varied, but the B factors and scale factors have to be held fixed; when the B factors are varied, the coordinates and scale factors must be constant; and when the scale factors are varied, everything else must be fixed.

Other refinement programs use second derivatives in a more explicit way than Phenix and do vary more types of parameters in a single cycle. SHELXL is the most powerful in this respect and, by default, varies all parameters of all classes in each cycle. Refmac, to the best of my knowledge, refines both coordinates and ADPs together, but does refine TLS parameters in a separate step.

Since all the parameters of our models are correlated with one another, it is better to refine as many of them at once as possible. Implementing the joint refinement of all these kinds of parameters is difficult, so to save programmers' time approximations are sometimes made.

Dale Tronrud
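A toy sketch of the point above, assuming a quadratic "target" E(x, b) = x^2 + b^2 + c*x*b, where x stands in for a coordinate, b for a B factor, and c couples them (their correlation). The names and the target are purely illustrative, not any program's actual function. Refining one parameter group at a time (the "separate cycles" strategy) amounts to block-coordinate descent: each step is optimal only with the other group frozen, yet the iteration still converges to the joint minimum when the coupling is not too strong.

```python
def alternate_minimize(x, b, c=1.0, n_cycles=50):
    """Exactly minimize E(x, b) = x^2 + b^2 + c*x*b over x with b
    fixed, then over b with x fixed, for a number of macro-cycles."""
    for _ in range(n_cycles):
        x = -c * b / 2.0  # dE/dx = 2x + c*b = 0 with b held constant
        b = -c * x / 2.0  # dE/db = 2b + c*x = 0 with x held constant
    return x, b

x, b = alternate_minimize(1.0, 1.0)
# both parameters end up near the joint minimum at (0, 0)
```

With stronger coupling (c approaching 2 here) the alternating scheme converges much more slowly, which is the cost of ignoring the cross terms of the second-derivative matrix.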
participants (3)
-
Dale Tronrud
-
Frank von Delft
-
Pavel Afonine