Re: [phenixbb] phenix.refine questions: weights, docs, etc.
Hi Dale,
The set of parameters that can be refined simultaneously is a result of the choice of optimization method, not the choice of the function being optimized. The crucial difference is how the second derivatives are handled. In the methods used in Phenix and CNS the second derivatives of all parameters are assumed to be equal and uncorrelated.
This is true only for the first iteration of a minimization in a phenix macro cycle. After that the L-BFGS method builds up an approximation of the second derivative information.
To ensure that this assumption is true only parameters of the same category can be varied in a single cycle. This means that the coordinates can be varied, but the B factors and scale factors have to be held fixed. When the B factors are varied the coordinates and scale factors must be constant. When the scale factors are varied everything else must be fixed.
Other refinement programs use second derivatives in a more explicit way than Phenix and do vary more types of parameters in a single cycle. Shelxl is the most powerful and, by default, varies all parameters of all classes each cycle. Refmac, to the best of my knowledge, refines both coordinates and ADPs together, but does refine TLS parameters in a separate step.
As far as I know, Shelx doesn't use second derivatives at all. Shelx is shaped completely by the least-squares approach, using the Jacobian. You couldn't plug another target function into Shelx as you can do in the common macromolecular programs. I.e. there is in fact a link between the minimization method and choice of the target function. REFMAC uses a block-diagonal approach that only takes the correlations of the parameters of one atom into account. phenix approximates these with the L-BFGS method. We have not actually tried to refine coordinates and B-factors simultaneously since we've been too busy with other priorities. In theory it should be possible, but I'm sure it will take a lot of fine-tuning.
Since all the parameters of our models are correlated with one another, it is better to refine as many of them at once as possible. Implementing the joint refinement of all these kinds of parameters is difficult, so to save programmers' time approximations are sometimes made.
We'll spend the time when we find it. :-) But I don't expect spectacular gains, no matter how long we beat on the second derivatives.
Ralf
This thread is evolving into a discussion by developers of the finer points of refinement package implementation. My original goal was simply to expand on the "common practice" part of Pavel's original answer. The original questioner was correct that other packages bundle their parameters to a greater extent than Phenix and CNS. For those of you who want to roll up your sleeves and get your hands dirty, read on...
Ralf W. Grosse-Kunstleve wrote:
Hi Dale,
The set of parameters that can be refined simultaneously is a result of the choice of optimization method, not the choice of the function being optimized. The crucial difference is how the second derivatives are handled. In the methods used in Phenix and CNS the second derivatives of all parameters are assumed to be equal and uncorrelated.
This is true only for the first iteration of a minimization in a phenix macro cycle. After that the L-BFGS method builds up an approximation of the second derivative information.
All of the refinement packages in current use employ some form of learning from history to overcome their inherent assumptions. Phenix assumes that all diagonal elements of the Normal matrix are equal to each other and all off-diagonal elements are equal to zero, and then uses the very powerful tools of the L-BFGS method to "learn" the corrections required to overcome those assumptions. Buster/TNT calculates approximations for the diagonal elements, assumes that the off-diagonal elements are zero, and uses the preconditioned conjugate gradient method to "learn" how to overcome those assumptions. Shelxl calculates the full Normal matrix but assumes that the second derivatives of |Fc| with respect to the parameters of the model are zero. George uses the history of his refinement to "learn" how to overcome that assumption. All of this "learning" about curvatures is done, basically, by comparing the gradient of the function before and after a cycle of refinement. Such a calculation can, at most, learn about one dimension of the Normal matrix in each cycle. Since the number of dimensions is equal to the number of parameters, it would take a lot of cycles to "learn" what George builds into his Normal matrix from the start.
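A minimal sketch of this "learning" step, assuming a toy quadratic target and the full BFGS update rather than the limited-memory variant. None of this is code from Phenix, Buster/TNT, or Shelx; the function names and the 4x4 test matrix are made up for illustration. Each cycle compares the gradient before and after the shift and folds that single (shift, gradient change) pair into the curvature estimate, starting from the identity matrix.

```python
import numpy as np

def bfgs_inverse_update(H, s, y):
    """Fold one (shift s, gradient change y) pair into the inverse-curvature estimate H."""
    rho = 1.0 / (y @ s)
    V = np.eye(len(s)) - rho * np.outer(s, y)
    return V @ H @ V.T + rho * np.outer(s, s)

def learn_curvature(A, x0, n_steps):
    """Minimize 0.5 * x^T A x, starting from H = identity (hypothetical toy target).

    Returns the final estimate H and, per step, its distance from the true
    inverse curvature inv(A).
    """
    H = np.eye(len(x0))            # assumption: equal, uncorrelated curvatures
    x = np.asarray(x0, dtype=float)
    g = A @ x                      # gradient of the quadratic
    A_inv = np.linalg.inv(A)
    errors = []
    for _ in range(n_steps):
        d = -H @ g                              # search direction
        alpha = -(g @ d) / (d @ A @ d)          # exact line search on a quadratic
        x_new = x + alpha * d
        g_new = A @ x_new
        # Compare gradient before and after the shift: one direction learned.
        H = bfgs_inverse_update(H, x_new - x, g_new - g)
        errors.append(np.linalg.norm(H - A_inv))
        x, g = x_new, g_new
    return H, errors

if __name__ == "__main__":
    A = np.array([[4.0, 1.0, 0.0, 0.0],
                  [1.0, 3.0, 1.0, 0.0],
                  [0.0, 1.0, 2.0, 1.0],
                  [0.0, 0.0, 1.0, 1.0]])
    H, errors = learn_curvature(A, np.ones(4), n_steps=4)
    for k, err in enumerate(errors, start=1):
        print(f"after step {k}: |H - inv(A)| = {err:.3e}")
```

On this 4-parameter toy problem the estimate only matches the true inverse curvature once as many steps have been taken as there are parameters. L-BFGS works from the same kind of pairs but keeps only the most recent few.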
To ensure that this assumption is true only parameters of the same category can be varied in a single cycle. This means that the coordinates can be varied, but the B factors and scale factors have to be held fixed. When the B factors are varied the coordinates and scale factors must be constant. When the scale factors are varied everything else must be fixed.
Other refinement programs use second derivatives in a more explicit way than Phenix and do vary more types of parameters in a single cycle. Shelxl is the most powerful and, by default, varies all parameters of all classes each cycle. Refmac, to the best of my knowledge, refines both coordinates and ADPs together, but does refine TLS parameters in a separate step.
As far as I know, Shelx doesn't use second derivatives at all. Shelx is shaped completely by the least-squares approach, using the Jacobian. You couldn't plug another target function into Shelx as you can do in the common macromolecular programs. I.e. there is in fact a link between the minimization method and choice of the target function.
This is a common point of confusion in least-squares refinement. There are two different "second derivatives" involved here. The Normal matrix IS the second derivatives of the least-squares target function. Usually when there is a non-linear relationship between the parameters of the model and the restraint targets (such as |Fc| and bond angles) the Normal matrix is calculated assuming a linearized version of each restraint target. This means that the calculation of second derivatives of the least-squares target is performed by assuming the second derivatives of |Fc| are zero. Even though the resulting Normal matrix elements are not exactly correct, they are far better approximations to the true values than a constant times the identity matrix, which is the starting Phenix approximation. I know nothing of the practical aspects of George's code in Shelx, but there is no problem in theory with putting a Maximum Likelihood target into this scheme for calculating the Normal matrix. You would simply be assuming that the second derivative of the expectation value of |Fc| is zero and use the successive cycles to "learn" the consequences of this assumption.
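The two kinds of "second derivatives" can be written out as follows. The notation (p for the model parameters, w_h for the weights, a factor of 1/2 in the target) is assumed here for illustration and is not taken from any particular program.

```latex
\begin{align}
L(\mathbf{p}) &= \tfrac{1}{2} \sum_h w_h \bigl( |F_o|_h - |F_c|_h(\mathbf{p}) \bigr)^2 \\
\frac{\partial^2 L}{\partial p_j \, \partial p_k}
  &= \sum_h w_h \frac{\partial |F_c|_h}{\partial p_j} \frac{\partial |F_c|_h}{\partial p_k}
   - \sum_h w_h \bigl( |F_o|_h - |F_c|_h \bigr) \frac{\partial^2 |F_c|_h}{\partial p_j \, \partial p_k} \\
N_{jk} &= \sum_h w_h \frac{\partial |F_c|_h}{\partial p_j} \frac{\partial |F_c|_h}{\partial p_k}
\end{align}
```

The Normal matrix N keeps only the first sum, which is what setting the second derivatives of |Fc| to zero amounts to. The dropped sum is weighted by the residuals |Fo| - |Fc|, so it shrinks as the fit improves. A constant times the identity matrix corresponds to replacing N by cI for some constant c.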
REFMAC uses a block-diagonal approach that only takes the correlations of the parameters of one atom into account. phenix approximates these with the L-BFGS method. We have not actually tried to refine coordinates and B-factors simultaneously since we've been too busy with other priorities. In theory it should be possible, but I'm sure it will take a lot of fine-tuning.
It is my understanding that Refmac also includes the off-diagonal elements for the geometrical restraints. This is again a big win for adding explicitly what would otherwise have to be "learned" over many cycles of refinement.
Since all the parameters of our models are correlated with one another, it is better to refine as many of them at once as possible. Implementing the joint refinement of all these kinds of parameters is difficult, so to save programmers' time approximations are sometimes made.
We'll spend the time when we find it. :-) But I don't expect spectacular gains, no matter how long we beat on the second derivatives.
And I expect you will be surprised when you do put them in. When I put just the diagonal elements into my code (which is pretty easy to do) I was amazed at how much it improved my refinements. Not only was the refinement faster and easier to use (because I could refine XYZ/B/Occ all at once), but a large number of other problems that I had been "fine tuning" to overcome fell away. I no longer had to fight the oscillation of heavy atoms, and atoms on special positions just did the right thing without me fighting them. I ended up deleting large sections of code that had been crafted to overcome particular problems because they were no longer needed.
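A toy sketch of why the diagonal elements matter once parameter classes are mixed. This is illustrative only and not code from TNT, Phenix, or any other package; the curvature scales (100, 10, 1 standing in for three parameter classes), the weak coupling, and the function names are all made up. It runs steepest descent with an exact line search on a quadratic target, once with every parameter scaled equally and once with each gradient component divided by the matching diagonal element of the curvature matrix.

```python
import numpy as np

def toy_problem():
    """Quadratic target with three hypothetical parameter classes of very different curvature."""
    scales = np.array([100.0, 100.0, 10.0, 10.0, 1.0, 1.0])
    C = 0.9 * np.eye(6) + 0.1 * np.ones((6, 6))   # weak coupling between all parameters
    root = np.sqrt(scales)
    A = C * np.outer(root, root)                  # diag(A) equals scales
    b = np.arange(1.0, 7.0)
    return A, b

def descent_iterations(A, b, precond_diag, tol=1e-8, max_iter=100000):
    """Count steepest-descent steps needed to minimize 0.5 x^T A x - b^T x.

    precond_diag holds the assumed diagonal curvatures; all ones means
    "every parameter has the same curvature".
    """
    x = np.zeros_like(b)
    for k in range(max_iter):
        g = A @ x - b                              # gradient
        if np.linalg.norm(g) < tol * np.linalg.norm(b):
            return k
        d = -g / precond_diag                      # scale each shift by its own curvature
        alpha = -(g @ d) / (d @ A @ d)             # exact line search
        x = x + alpha * d
    return max_iter

if __name__ == "__main__":
    A, b = toy_problem()
    n_plain = descent_iterations(A, b, np.ones_like(b))
    n_diag = descent_iterations(A, b, np.diag(A))
    print("equal curvatures assumed:", n_plain, "steps")
    print("diagonal elements used:  ", n_diag, "steps")
```

With the diagonal scaling the problem seen by the minimizer is only as badly conditioned as the coupling between parameters, not as the spread in curvature between classes, which is why all classes can be shifted in the same cycle.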
Ralf
_______________________________________________
phenixbb mailing list
http://www.phenix-online.org/mailman/listinfo/phenixbb
participants (2)
- Dale Tronrud
- Ralf W. Grosse-Kunstleve