Only one solution in AutoMR with very negative LLG


Zhang yu

14 Mar 2011

10:32 p.m.

Hi, I am working on a DNA-protein complex for which the apo protein structure is already known. I recently collected a dataset of the complex and tried to find a solution with AutoMR in Phenix, using the known protein coordinates as the search model. I got only one solution, with a very negative LLG (around -4000), and after rigid-body refinement both Rwork and Rfree are around 0.55. Has anyone been in the same situation? What does it mean when there is just one solution and it has a large negative LLG? Thanks


Ed Pozharski

15 Mar 2011

2:36 p.m.


Could mean several things, but one thing is for sure - R~55% suggests that molecular replacement did not work. One possibility is that your protein undergoes conformational change when it binds to DNA. If it has distinct domains, you may be able to get a solution if running them as separate models.

Negative LLG could mean that you did not correctly guess the unit cell content. Or maybe the space group is wrong. You may want to post the phaser log-file, since it's not obvious to me what you mean by only one solution. What were the Z-scores at rotation/translation steps?

--
"I'd jump in myself, if I weren't so good at whistling."
Julian, King of Lemurs
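The point about unit cell content can be checked roughly with a Matthews coefficient estimate. The sketch below is illustrative only: it uses the unit cell that is reported in the Phaser log later in this thread, assumes two symmetry operators (as in P2 or P21), and uses a made-up molecular mass (HYPOTHETICAL_MASS_DA) because the thread never states the mass of the complex.

```python
import math

def monoclinic_cell_volume(a, b, c, beta_deg):
    """Volume in cubic Angstrom of a monoclinic cell (alpha = gamma = 90)."""
    return a * b * c * math.sin(math.radians(beta_deg))

def matthews_coefficient(volume, n_symops, n_copies, mass_da):
    """V_M = cell volume / (symmetry operators * copies per ASU * mass)."""
    return volume / (n_symops * n_copies * mass_da)

def solvent_fraction(vm):
    """Matthews' estimate 1 - 1.23 / V_M; the 1.23 factor is for protein,
    so it is only approximate for a DNA-protein complex."""
    return 1.0 - 1.23 / vm

# Unit cell as printed in the Phaser log in this thread.
volume = monoclinic_cell_volume(186.376, 103.164, 295.884, 98.789)

# Assumption: placeholder mass, NOT taken from the thread.
HYPOTHETICAL_MASS_DA = 100_000.0

for copies in (4, 6, 8, 10, 12, 14):
    vm = matthews_coefficient(volume, 2, copies, HYPOTHETICAL_MASS_DA)
    print(f"{copies:2d} copies/ASU: V_M = {vm:5.2f} A^3/Da, "
          f"solvent ~ {100 * solvent_fraction(vm):4.1f} %")
```

With the real mass substituted, copy numbers that give an implausible solvent content can be ruled out before rerunning the search.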

Zhang yu

3:45 p.m.

Hi Pozharski,

I attached part of the log file.

Fast Rotation Function Table: 1
-------------------------------
#SET   Top        (Z)     Second     (Z)     Third      (Z)
1      -1129.09   32.53   ---        ---     ---        ---
----   ---------- -----   ---------- -----   ---------- -----

---------------
FINAL SELECTION
---------------

Mean used for final selection = -5302.48
Cutoff used for final selection = -2172.44
Number of sets stored before final selection = 1
Number of solutions stored before final selection = 1
Number of sets stored (deleted) after final selection = 1 (0)
Number of solutions stored (deleted) after final selection = 1 (0)

Select by Percentage of Top value: 75
Top TF = -5085.32
Mean TF used for final selection = -7813.43
LLG Cutoff used for final selection = -5767.35
Number of solutions stored before final selection = 1
Number of solutions stored (deleted) after final selection = 1 (0)
Top TFZ = 21.3024

-----------------
TABLES OF RESULTS
-----------------

Fast Translation Function Table: Space Group P 1 2 1
----------------------------------------------------
#SET   #TRIAL   Top        (Z)       Second (Z)   Third (Z)   Ensemble
1      1        -5085.32   (21.30)   -      -     -     -     1
----   ------

Solutions:
=====================
Solution #1: Likelihood Gain -4625.06
ENSE 1 - EULER 326.300, 61.849, 83.068 - FRAC 0.529, -0.001, 0.102

Unit cell: (186.376, 103.164, 295.884, 90, 98.789, 90)
Space group: P 1 2 1 (No. 3)

SPACE GROUP OF SOLUTION: 'P 1 2 1'

By the way, Xtriage in Phenix reports that translational pseudo-symmetry is very likely present. I have also attached that log file.

Twinning and intensity statistics summary (acentric data):

Statistics independent of twin laws
  <I^2>/<I>^2 : 2.148
  <F>^2/<F^2> : 0.803
  <|E^2-1|>   : 0.729
  <|L|>, <L^2>: 0.391, 0.215
  Multivariate Z score L-test: 9.366

The multivariate Z score is a quality measure of the given spread in intensities. Good to reasonable data are expected to have a Z score lower than 3.5. Large values can indicate twinning, but small values do not necessarily exclude it.

No (pseudo)merohedral twin laws were found.

Patterson analyses
  - Largest peak height : 56.002 (corresponding p value : 2.915e-05)

The analyses of the Patterson function reveals a significant off-origin peak that is 56.00 % of the origin peak, indicating pseudo translational symmetry. The chance of finding a peak of this or larger height by random in a structure without pseudo translational symmetry is equal to the 2.9153e-05. The detected tranlational NCS is most likely also responsible for the elevated intensity ratio. See the relevant section of the logfile for more details.

The results of the L-test indicate that the intensity statistics are significantly different than is expected from good to reasonable, untwinned data. As there are no twin laws possible given the crystal symmetry, there could be a number of reasons for the departure of the intensity statistics from normality. Overmerging pseudo-symmetric or twinned data, intensity to amplitude conversion problems as well as bad data quality might be possible reasons. It could be worthwhile considering reprocessing the data.

Yu Zhang


_______________________________________________ phenixbb mailing list [email protected] http://phenix-online.org/mailman/listinfo/phenixbb
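The "Cutoff used for final selection" lines in the log above follow from the "percentage of top" rule that Phaser prints elsewhere in this thread as 0.75*(top-mean)+mean. A minimal sketch that recomputes the two cutoffs from the posted numbers (the function name is invented for illustration):

```python
def selection_cutoff(top, mean, percent=75.0):
    """Score a peak must exceed to be kept: mean + percent% of (top - mean)."""
    return mean + (percent / 100.0) * (top - mean)

# Rotation function: top and mean LLG from the posted log.
rf_cutoff = selection_cutoff(top=-1129.09, mean=-5302.48)

# Translation function: top and mean TF from the posted log.
tf_cutoff = selection_cutoff(top=-5085.32, mean=-7813.43)

print(f"RF cutoff: {rf_cutoff:.2f} (log: -2172.44)")
print(f"TF cutoff: {tf_cutoff:.2f} (log: -5767.35)")
```

Because the cutoff is defined relative to the top and mean scores, a single peak can survive selection even when every LLG value is negative, so "only one solution" says little by itself about whether that solution is correct.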

Daniel Mattle

3:57 p.m.

Hey,

I played around with the RMSD value, and that helped, at least for me. Try making it higher.

Best,
Daniel


Francis E Reyes

3:59 p.m.

This is probably the most revealing to me. What did you do about this when you did the MR?

On Mar 15, 2011, at 9:45 AM, Zhang yu wrote:

...

The analyses of the Patterson function reveals a significant off-origin peak that is 56.00 % of the origin peak, indicating pseudo translational symmetry.

---------------------------------------------
Francis E. Reyes M.Sc.
215 UCB
University of Colorado at Boulder

gpg --keyserver pgp.mit.edu --recv-keys 67BA8D5D

8AE2 F2F4 90F7 9640 28BC 686F 78FD 6669 67BA 8D5D
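To illustrate why such a Patterson peak matters for MR, here is a small sketch of how an exact translation between two identically oriented copies modulates the intensities. It assumes the translation vector (1/2, 1/2, 0), which is the peak position xtriage reports later in this thread, and an idealised case in which the two copies are perfectly identical; real pseudo-translation only approximates this.

```python
import math

def tncs_intensity_factor(hkl, t):
    """|1 + exp(2*pi*i*h.t)|^2 for two identical copies related by a
    fractional translation t; equals 2 + 2*cos(2*pi*h.t)."""
    h, k, l = hkl
    phase = 2.0 * math.pi * (h * t[0] + k * t[1] + l * t[2])
    return 2.0 + 2.0 * math.cos(phase)

# Assumption: exact translation of (1/2, 1/2, 0) in fractional coordinates.
t = (0.5, 0.5, 0.0)

for hkl in [(1, 0, 0), (1, 1, 0), (2, 1, 3), (2, 2, 3), (5, 5, 62)]:
    parity = "even" if (hkl[0] + hkl[1]) % 2 == 0 else "odd"
    print(hkl, f"h+k {parity}: factor = {tncs_intensity_factor(hkl, t):.2f}")
```

In this idealised model, reflections with h+k odd are extinguished and those with h+k even are enhanced, which mimics C-centering. Intensity statistics and likelihood targets that assume a single Wilson distribution are then no longer a good description of the data.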

Zhang yu

7:13 p.m.

It indicates that pseudo-translational symmetry is present in the dataset. I didn't do anything about it when I first tried MR. Is it possible that I merged two lattices together during data processing, or should I reprocess the data in P1?

I attached the scale log file:

   Shell limit (Angstrom)   Average    Average          Norm.    Linear   Square
    Lower      Upper           I       error    stat.   Chi**2   R-fac    R-fac
    50.00      10.29        1030.9     76.3     20.0    1.095    0.105    0.118
    10.29       8.18         791.3     52.2     18.1    1.108    0.093    0.101
     8.18       7.15         350.8     23.5     11.3    1.160    0.108    0.106
     7.15       6.49         195.8     17.1     11.4    1.090    0.138    0.131
     6.49       6.03         143.7     15.7     12.1    1.029    0.167    0.154
     6.03       5.67         130.8     15.6     13.2    1.101    0.187    0.173
     5.67       5.39         129.9     16.8     14.6    1.074    0.198    0.184
     5.39       5.16         136.0     17.9     15.6    1.078    0.202    0.186
     5.16       4.96         138.3     19.2     16.8    1.044    0.207    0.192
     4.96       4.79         150.9     20.7     18.0    1.082    0.209    0.191
     4.79       4.64         151.3     21.1     18.8    1.140    0.225    0.210
     4.64       4.50         143.1     22.4     20.3    1.070    0.236    0.226
     4.50       4.39         127.5     22.5     21.1    1.027    0.262    0.251
     4.39       4.28         117.7     22.8     22.3    1.051    0.291    0.279
     4.28       4.18         105.8     24.2     23.7    0.988    0.322    0.317
     4.18       4.09          93.0     25.0     24.9    0.947    0.365    0.371
     4.09       4.01          84.9     25.3     25.2    0.884    0.409    0.403
     4.01       3.94          76.6     27.4     27.3    0.843    0.457    0.490
     3.94       3.87          66.5     27.5     27.5    0.778    0.505    0.559
     3.87       3.80          59.3     28.9     28.8    0.766    0.575    0.696
    All reflections          222.9     26.7     19.5    1.037    0.165    0.128


Francis E Reyes

8:18 p.m.

Go back to your xtriage log file... Post the section that looks similar to the one below...

The full list of Patterson peaks is:

  x      y      z            height   p-value(height)
( 0.500, 0.000, 0.233 ) :   26.198   (2.753e-03)
( 0.000, 0.338, 0.000 ) :    5.699   (7.800e-01)

If the observed pseudo translationals are crystallographic the following spacegroups and unit cells are possible:

space group           operator          unit cell of reference setting
C 2 2 21 (-a,c,2*b)   x+1/2, y, z+1/4   (55.47, 73.64, 81.46, 90.00, 90.00, 90.00)

Systematic absences

Also, post the results of the rotation function. What is the RFZ score, and how did it compare to any other rotation solutions?

F


Zhang yu

11:49 p.m.

Xtriage log file for P21

The full list of Patterson peaks is:

  x      y      z            height   p-value(height)
( 0.500, 0.500, 0.000 ) :   56.002   (2.915e-05)
( 0.483, 0.500, 0.034 ) :    6.152   (6.778e-01)

If the observed pseudo translationals are crystallographic the following spacegroups and unit cells are possible:

space group               operator          unit cell of reference setting
C 1 2 1 (a-1/4,b-1/4,c)   x+1/2, y+1/2, z   (186.38, 103.16, 295.88, 90.00, 98.79, 90.00)

Xtriage log file for P2

The full list of Patterson peaks is:

  x      y      z            height   p-value(height)
( 0.500, 0.500, 0.000 ) :   56.000   (2.916e-05)
( 0.483, 0.500, 0.034 ) :    6.151   (6.781e-01)

If the observed pseudo translationals are crystallographic the following spacegroups and unit cells are possible:

space group   operator          unit cell of reference setting
C 1 2 1       x+1/2, y+1/2, z   (186.38, 103.16, 295.88, 90.00, 98.79, 90.00)

Results from rotation functions for P2 (search two monomers in ASU)

*************************************************************************************
*** Phaser Module: MOLECULAR REPLACEMENT FAST ROTATION FUNCTION 2.2.4 ***
*************************************************************************************

---------------------
ANISOTROPY CORRECTION
---------------------
No refinement of parameters

-------------------------------
DATA FOR FAST ROTATION FUNCTION
-------------------------------
Outliers with a probability less than 1e-06 will be rejected
There were 2 (0.0000%) reflections rejected

  H    K    L   reso         F   probability
  5    5   62   4.48    42.404   4.049e-07
 18    2  -26   8.12   112.092   7.520e-09

Space-Group Name (Number): P 1 2 1 (3)
Resolution of All Data (Number): 3.82 49.18 (87989)
Resolution of Selected Data (Number): 3.82 49.18 (87985)

--------------------------
ENSEMBLE FOR DECOMPOSITION
--------------------------
This pdb file contains 1 models
The (input) RmsD of this model with respect to the real structure is 0.8
Unit Cell: 394.48 295.80 293.03 90.00 90.00 90.00
Electron Density Calculation
0% 100%
|=======| DONE

-------------------
WILSON DISTRIBUTION
-------------------
Parameters set for Wilson log-likelihood calculation
E = 0 and variance 1 for each reflection
Without correction for SigF to the variances,
Wilson log(likelihood) = - number of acentrics (83981) - half number of centrics (4004/2) = -85983
With correction for SigF,
Wilson log(likelihood) = -89650.5

Configuring ensembles
---------------------
This pdb file contains 1 models
The (input) RmsD of this model with respect to the real structure is 0.8
Unit Cell: 914.73 520.03 508.93 90.00 90.00 90.00
Electron Density Calculation
0% 100%
|=======| DONE

------------------------------
FAST ROTATION FUNCTION #1 OF 1
------------------------------
Search Ensemble: 1
Known MR solutions
(empty solution set - no components)
Fraction of asymmetric unit modelled: 1.0047 of which the "moving" fraction is 1.0047
Fraction of asymmetric unit modelled (after correction): 0.99 of which the "moving" fraction is 0.99

Spherical Harmonics
-------------------
Elmn for Data
Elmn Calculation for Data
0% 100%
|=========================================================================| DONE
Elmn for Search Ensemble
Elmn Calculation for Search Ensemble
0% 100%
|===========================================================================| DONE

Scanning the Range of Beta Angles
---------------------------------
Performing a 1.3513 degree search.
Clmn Calculation
0% 100%
|====================================================================| DONE
Top 59 rotations before clustering will be rescored
Calculating Likelihood for RF #1 of 1
0% 100%
|============================================================| DONE
Scoring 500 randomly sampled rotations
Generating Statistics for RF #1 of 1
0% 100%
|========================================================================| DONE
Mean Score (Sigma): -7144.27 (152.138)
Highest Score (Z-score): -2269.9 (32.0392)

Top Peaks With Clustering
-------------------------
#         Rank of the peak after rescoring search points
(#)       Rank of the peak before rescoring search points
LLG       Log-Likelihood Gain
Z-Score   Number of standard deviations of LLG above the mean
FSS       Fast Search Score
You requested peaks over 75% of top (i.e. 0.75*(top-mean)+mean)
There was 1 site over 75% of top
The sites over 75% are:

#   (#)   Euler1   Euler2   Euler3   LLG        Z-score   Split   #Group   raw/top
1   1     326.1    62.4     83.0     -2269.90   32.04     0.0     23       100.0/100.0
          213.9    117.6    263.0

Fast Rotation Function Table: 1
-------------------------------
#SET   Top        (Z)     Second     (Z)     Third      (Z)
1      -2269.90   32.04   ---        ---     ---        ---
----   ---------- -----   ---------- -----   ---------- -----

---------------
FINAL SELECTION
---------------
Mean used for final selection = -7144.27
Cutoff used for final selection = -3488.49
Number of sets stored before final selection = 1
Number of solutions stored before final selection = 1
Number of sets stored (deleted) after final selection = 1 (0)
Number of solutions stored (deleted) after final selection = 1 (0)
------------

Results from rotation functions for P2 (search one monomer in ASU)

*************************************************************************************
*** Phaser Module: MOLECULAR REPLACEMENT FAST ROTATION FUNCTION 2.2.4 ***
*************************************************************************************

---------------------
ANISOTROPY CORRECTION
---------------------
No refinement of parameters

-------------------------------
DATA FOR FAST ROTATION FUNCTION
-------------------------------
Outliers with a probability less than 1e-06 will be rejected
There were 2 (0.0000%) reflections rejected

  H    K    L   reso         F   probability
  5    5   62   4.48    42.404   4.049e-07
 18    2  -26   8.12   112.092   7.520e-09

Space-Group Name (Number): P 1 2 1 (3)
Resolution of All Data (Number): 3.82 49.18 (87989)
Resolution of Selected Data (Number): 3.82 49.18 (87985)

--------------------------
ENSEMBLE FOR DECOMPOSITION
--------------------------
This pdb file contains 1 models
The (input) RmsD of this model with respect to the real structure is 0.8
Unit Cell: 323.84 274.82 261.37 90.00 90.00 90.00
Electron Density Calculation
0% 100%
|=======| DONE

-------------------
WILSON DISTRIBUTION
-------------------
Parameters set for Wilson log-likelihood calculation
E = 0 and variance 1 for each reflection
Without correction for SigF to the variances,
Wilson log(likelihood) = - number of acentrics (83981) - half number of centrics (4004/2) = -85983
With correction for SigF,
Wilson log(likelihood) = -89650.5

Configuring ensembles
---------------------
This pdb file contains 1 models
The (input) RmsD of this model with respect to the real structure is 0.8
Unit Cell: 714.35 518.29 464.48 90.00 90.00 90.00
Electron Density Calculation
0% 100%
|=======| DONE

------------------------------
FAST ROTATION FUNCTION #1 OF 1
------------------------------
Search Ensemble: 1
Known MR solutions
(empty solution set - no components)
Fraction of asymmetric unit modelled: 0.831924 of which the "moving" fraction is 0.831924

Spherical Harmonics
-------------------
Elmn for Data
Elmn Calculation for Data
0% 100%
|=========================================================================| DONE
Elmn for Search Ensemble
Elmn Calculation for Search Ensemble
0% 100%
|===========================================================================| DONE

Scanning the Range of Beta Angles
---------------------------------
Performing a 1.5476 degree search.
Clmn Calculation
0% 100%
|===========================================================| DONE
Top 41 rotations before clustering will be rescored
Calculating Likelihood for RF #1 of 1
0% 100%
|==========================================| DONE
Scoring 500 randomly sampled rotations
Generating Statistics for RF #1 of 1
0% 100%
|========================================================================| DONE
Mean Score (Sigma): -4412.88 (151.158)
Highest Score (Z-score): -893.783 (23.2809)

Top Peaks With Clustering
-------------------------
#         Rank of the peak after rescoring search points
(#)       Rank of the peak before rescoring search points
LLG       Log-Likelihood Gain
Z-Score   Number of standard deviations of LLG above the mean
FSS       Fast Search Score
You requested peaks over 75% of top (i.e. 0.75*(top-mean)+mean)
There was 1 site over 75% of top
The sites over 75% are:

#   (#)   Euler1   Euler2   Euler3   LLG       Z-score   Split   #Group   raw/top
1   1     70.6     39.1     10.6     -893.78   23.28     0.0     13       100.0/100.0
          109.4    140.9    190.6

Fast Rotation Function Table: 1
-------------------------------
#SET   Top       (Z)     Second     (Z)     Third      (Z)
1      -893.78   23.28   ---        ---     ---        ---
----   ---------- -----   ---------- -----   ---------- -----

---------------
FINAL SELECTION
---------------
Mean used for final selection = -4412.88
Cutoff used for final selection = -1773.56
Number of sets stored before final selection = 1
Number of solutions stored before final selection = 1
Number of sets stored (deleted) after final selection = 1 (0)
Number of solutions stored (deleted) after final selection = 1 (0)
------------

Results from rotation functions for P21 (search two monomers in ASU)

*************************************************************************************
*** Phaser Module: MOLECULAR REPLACEMENT FAST ROTATION FUNCTION 2.2.4 ***
*************************************************************************************

---------------------
ANISOTROPY CORRECTION
---------------------
No refinement of parameters

-------------------------------
DATA FOR FAST ROTATION FUNCTION
-------------------------------
Outliers with a probability less than 1e-06 will be rejected
There were 2 (0.0000%) reflections rejected

  H    K    L   reso         F   probability
  5    5   62   4.48    42.404   4.046e-07
 18    2  -26   8.12   112.092   7.516e-09

Space-Group Name (Number): P 1 21 1 (4)
Resolution of All Data (Number): 3.82 49.18 (87983)
Resolution of Selected Data (Number): 3.82 49.18 (87981)

--------------------------
ENSEMBLE FOR DECOMPOSITION
--------------------------
This pdb file contains 1 models
The (input) RmsD of this model with respect to the real structure is 0.8
Unit Cell: 323.83 274.82 261.37 90.00 90.00 90.00
Electron Density Calculation
0% 100%
|=======| DONE

-------------------
WILSON DISTRIBUTION
-------------------
Parameters set for Wilson log-likelihood calculation
E = 0 and variance 1 for each reflection
Without correction for SigF to the variances,
Wilson log(likelihood) = - number of acentrics (83976) - half number of centrics (4005/2) = -85978
With correction for SigF,
Wilson log(likelihood) = -89646.3

Configuring ensembles
---------------------
This pdb file contains 1 models
The (input) RmsD of this model with respect to the real structure is 0.8
Unit Cell: 714.35 518.29 464.48 90.00 90.00 90.00
Electron Density Calculation
0% 100%
|=======| DONE

------------------------------
FAST ROTATION FUNCTION #1 OF 1
------------------------------
Search Ensemble: 1
Known MR solutions
(empty solution set - no components)
Fraction of asymmetric unit modelled: 0.737308 of which the "moving" fraction is 0.737308

Spherical Harmonics
-------------------
Elmn for Data
Elmn Calculation for Data
0% 100%
|=========================================================================| DONE
Elmn for Search Ensemble
Elmn Calculation for Search Ensemble
0% 100%
|===========================================================================| DONE

Scanning the Range of Beta Angles
---------------------------------
Performing a 1.54708 degree search.
Clmn Calculation
0% 100%
|===========================================================| DONE
Top 47 rotations before clustering will be rescored
Calculating Likelihood for RF #1 of 1
0% 100%
|================================================| DONE
Scoring 500 randomly sampled rotations
Generating Statistics for RF #1 of 1
0% 100%
|========================================================================| DONE
Mean Score (Sigma): -3258.27 (128.907)
Highest Score (Z-score): -242.259 (23.3968)

Top Peaks With Clustering
-------------------------
#         Rank of the peak after rescoring search points
(#)       Rank of the peak before rescoring search points
LLG       Log-Likelihood Gain
Z-Score   Number of standard deviations of LLG above the mean
FSS       Fast Search Score
You requested peaks over 75% of top (i.e. 0.75*(top-mean)+mean)
There was 1 site over 75% of top
The sites over 75% are:

#   (#)   Euler1   Euler2   Euler3   LLG       Z-score   Split   #Group   raw/top
1   1     70.7     39.1     10.6     -242.26   23.40     0.0     18       100.0/100.0
          109.3    140.9    190.6

Fast Rotation Function Table: 1
-------------------------------
#SET   Top       (Z)     Second     (Z)     Third      (Z)
1      -242.26   23.40   ---        ---     ---        ---
----   ---------- -----   ---------- -----   ---------- -----

---------------
FINAL SELECTION
---------------
Mean used for final selection = -3258.27
Cutoff used for final selection = -996.261
Number of sets stored before final selection = 1
Number of solutions stored before final selection = 1
Number of sets stored (deleted) after final selection = 1 (0)
Number of solutions stored (deleted) after final selection = 1 (0)
------------

Results from rotation functions for P21 (search one monomer in ASU)

*************************************************************************************
*** Phaser Module: MOLECULAR REPLACEMENT FAST ROTATION FUNCTION 2.2.4 ***
*************************************************************************************

---------------------
ANISOTROPY CORRECTION
---------------------
No refinement of parameters

-------------------------------
DATA FOR FAST ROTATION FUNCTION
-------------------------------
Outliers with a probability less than 1e-06 will be rejected
There were 2 (0.0000%) reflections rejected

  H    K    L   reso         F   probability
  5    5   62   4.48    42.404   4.046e-07
 18    2  -26   8.12   112.092   7.516e-09

Space-Group Name (Number): P 1 21 1 (4)
Resolution of All Data (Number): 3.82 49.18 (87983)
Resolution of Selected Data (Number): 3.82 49.18 (87981)

--------------------------
ENSEMBLE FOR DECOMPOSITION
--------------------------
This pdb file contains 1 models
The (input) RmsD of this model with respect to the real structure is 0.8
Unit Cell: 323.83 274.82 261.37 90.00 90.00 90.00
Electron Density Calculation
0% 100%
|=======| DONE

-------------------
WILSON DISTRIBUTION
-------------------
Parameters set for Wilson log-likelihood calculation
E = 0 and variance 1 for each reflection
Without correction for SigF to the variances,
Wilson log(likelihood) = - number of acentrics (83976) - half number of centrics (4005/2) = -85978
With correction for SigF,
Wilson log(likelihood) = -89646.3

Configuring ensembles
---------------------
This pdb file contains 1 models
The (input) RmsD of this model with respect to the real structure is 0.8
Unit Cell: 714.35 518.29 464.48 90.00 90.00 90.00
Electron Density Calculation
0% 100%
|=======| DONE

------------------------------
FAST ROTATION FUNCTION #1 OF 1
------------------------------
Search Ensemble: 1
Known MR solutions
(empty solution set - no components)
Fraction of asymmetric unit modelled: 0.737308 of which the "moving" fraction is 0.737308

Spherical Harmonics
-------------------
Elmn for Data
Elmn Calculation for Data
0% 100%
|=========================================================================| DONE
Elmn for Search Ensemble
Elmn Calculation for Search Ensemble
0% 100%
|===========================================================================| DONE

Scanning the Range of Beta Angles
---------------------------------
Performing a 1.54708 degree search.
Clmn Calculation
0% 100%
|===========================================================| DONE
Top 47 rotations before clustering will be rescored
Calculating Likelihood for RF #1 of 1
0% 100%
|================================================| DONE
Scoring 500 randomly sampled rotations
Generating Statistics for RF #1 of 1
0% 100%
|========================================================================| DONE
Mean Score (Sigma): -3258.27 (128.907)
Highest Score (Z-score): -242.259 (23.3968)

Top Peaks With Clustering
-------------------------
#         Rank of the peak after rescoring search points
(#)       Rank of the peak before rescoring search points
LLG       Log-Likelihood Gain
Z-Score   Number of standard deviations of LLG above the mean
FSS       Fast Search Score
You requested peaks over 75% of top (i.e. 0.75*(top-mean)+mean)
There was 1 site over 75% of top
The sites over 75% are:

#   (#)   Euler1   Euler2   Euler3   LLG       Z-score   Split   #Group   raw/top
1   1     70.7     39.1     10.6     -242.26   23.40     0.0     18       100.0/100.0
          109.3    140.9    190.6

Fast Rotation Function Table: 1
-------------------------------
#SET   Top       (Z)     Second     (Z)     Third      (Z)
1      -242.26   23.40   ---        ---     ---        ---
----   ---------- -----   ---------- -----   ---------- -----

---------------
FINAL SELECTION
---------------
Mean used for final selection = -3258.27
Cutoff used for final selection = -996.261
Number of sets stored before final selection = 1
Number of solutions stored before final selection = 1
Number of sets stored (deleted) after final selection = 1 (0)
Number of solutions stored (deleted) after final selection = 1 (0)
------------

Thanks a lot.
Yu
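The Z-scores in these logs are, per the legend Phaser prints, the number of standard deviations of the LLG above the mean of the randomly sampled rotations. A short sketch (the function and dictionary names are illustrative) that recomputes them from the "Mean Score (Sigma)" and "Highest Score" lines posted above:

```python
def z_score(top, mean, sigma):
    """Number of standard deviations by which the top score exceeds the mean."""
    return (top - mean) / sigma

# (top LLG, mean of random rotations, sigma, Z-score printed in the log)
rotation_runs = {
    "P2, two monomers": (-2269.9, -7144.27, 152.138, 32.0392),
    "P2, one monomer": (-893.783, -4412.88, 151.158, 23.2809),
    "P21": (-242.259, -3258.27, 128.907, 23.3968),
}

recomputed = {}
for label, (top, mean, sigma, reported) in rotation_runs.items():
    recomputed[label] = z_score(top, mean, sigma)
    print(f"{label}: Z = {recomputed[label]:.2f} (log reports {reported})")
```

The arithmetic shows that a large Z-score and a negative LLG are not contradictory: the Z-score only measures how far the top peak stands out from the random background, while the negative LLG indicates that even the best placement scores worse than the baseline the gain is measured against.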

...

Go back to your xtriage log file... Post the section that looks similar to the one below...

The full list of Patterson peaks is:

     x      y      z        height   p-value(height)
( 0.500, 0.000, 0.233 ) :   26.198   (2.753e-03)
( 0.000, 0.338, 0.000 ) :    5.699   (7.800e-01)

If the observed pseudo translationals are crystallographic the following spacegroups and unit cells are possible:

space group operator unit cell of reference setting

C 2 2 21 (-a,c,2*b) x+1/2, y, z+1/4 (55.47, 73.64, 81.46, 90.00, 90.00, 90.00)

Systematic absences

Also, post the results of the rotation function. What is the RFZ score, and how did it compare to any other rotation solutions?

F

On Mar 15, 2011, at 1:13 PM, Zhang yu wrote:

...
It indicates that pseudo translational symmetry is present in the dataset. I didn't do anything about it when I first tried MR. Is it possible that I merged two lattices together during data processing, or should I reprocess the data in P1?

I attached the scale log file,

  Shell limit (Angstrom)   Average   Average            Norm.   Linear   Square
     Lower     Upper             I     error   stat.   Chi**2    R-fac    R-fac
     50.00     10.29        1030.9      76.3    20.0    1.095    0.105    0.118
     10.29      8.18         791.3      52.2    18.1    1.108    0.093    0.101
      8.18      7.15         350.8      23.5    11.3    1.160    0.108    0.106
      7.15      6.49         195.8      17.1    11.4    1.090    0.138    0.131
      6.49      6.03         143.7      15.7    12.1    1.029    0.167    0.154
      6.03      5.67         130.8      15.6    13.2    1.101    0.187    0.173
      5.67      5.39         129.9      16.8    14.6    1.074    0.198    0.184
      5.39      5.16         136.0      17.9    15.6    1.078    0.202    0.186
      5.16      4.96         138.3      19.2    16.8    1.044    0.207    0.192
      4.96      4.79         150.9      20.7    18.0    1.082    0.209    0.191
      4.79      4.64         151.3      21.1    18.8    1.140    0.225    0.210
      4.64      4.50         143.1      22.4    20.3    1.070    0.236    0.226
      4.50      4.39         127.5      22.5    21.1    1.027    0.262    0.251
      4.39      4.28         117.7      22.8    22.3    1.051    0.291    0.279
      4.28      4.18         105.8      24.2    23.7    0.988    0.322    0.317
      4.18      4.09          93.0      25.0    24.9    0.947    0.365    0.371
      4.09      4.01          84.9      25.3    25.2    0.884    0.409    0.403
      4.01      3.94          76.6      27.4    27.3    0.843    0.457    0.490
      3.94      3.87          66.5      27.5    27.5    0.778    0.505    0.559
      3.87      3.80          59.3      28.9    28.8    0.766    0.575    0.696
  All reflections            222.9      26.7    19.5    1.037    0.165    0.128

2011/3/15 Francis E Reyes

This is probably the most revealing to me. What did you do about this when you did the MR?

On Mar 15, 2011, at 9:45 AM, Zhang yu wrote:

The analyses of the Patterson function reveals a significant off-origin peak that is 56.00 % of the origin peak, indicating pseudo translational symmetry.

--------------------------------------------- Francis E. Reyes M.Sc. 215 UCB University of Colorado at Boulder

gpg --keyserver pgp.mit.edu --recv-keys 67BA8D5D

8AE2 F2F4 90F7 9640 28BC 686F 78FD 6669 67BA 8D5D

_______________________________________________ phenixbb mailing list [email protected] http://phenix-online.org/mailman/listinfo/phenixbb


Ed Pozharski

16 Mar 16 Mar

1:38 p.m.

On Tue, 2011-03-15 at 19:49 -0400, Zhang yu wrote:

...

If the observed pseudo translationals are crystallographic the following spacegroups and unit cells are possible:

space group operator unit cell of reference setting

C 1 2 1 (a-1/4,b-1/4,c) x+1/2, y+1/2, z (186.38, 103.16, 295.88, 90.00, 98.79, 90.00)

I presume that you tried to process your data in C2 and it did not work?

-- "I'd jump in myself, if I weren't so good at whistling." Julian, King of Lemurs

Zhang yu

3:32 p.m.

Yes, I rescaled my data to C2 (186.38, 103.16, 295.88, 90.00, 98.79, 90.00), and AutoMR found a solution. After refinement, the R went down to ~0.38, and the electron density fits my model well except for some flexible parts. Although there is a small clash between symmetry-related molecules, that domain is flexible, so I don't think it is a problem.

But the indexing confused me a lot. When I try to re-index my space group in HKL2000, the initial indexing always suggests

P1 (103.32, 185.24, 291.75, 97.55, 90.18, 90.48) Distortion index 0.00%
P2 (186.38, 103.16, 295.88, 90.00, 98.79, 90.00) Distortion index 0.22%

(they are basically the same), while HKL2000 couldn't index the data in the correct C2. The only C2 it found is

C2 (592.23, 103.32, 185.24, 90.00, 97.52, 90.00) Distortion index 4.11%

This unit cell looks twice as big as the current one, and its distortion index is unacceptably high. It couldn't be refined (extremely high chi square and mosaicity).

So, I just rescaled the integrated data (P2) to C2. The scaling statistics are OK, and there are no obvious violations.

Could someone tell me why HKL2000 couldn't find the correct C2 SP?

Yu
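The "twice bigger" observation can be checked with a little arithmetic. The sketch below is only an illustration: it uses the standard triclinic volume formula and the cell parameters quoted in the post above, and the labels and the "lattice points per cell" bookkeeping (1 for a primitive cell, 2 for a C-centered one) are added here for the comparison.

```python
import math

def cell_volume(a, b, c, alpha, beta, gamma):
    """Volume (cubic angstroms) of a general triclinic cell; angles in degrees."""
    ca, cb, cg = (math.cos(math.radians(x)) for x in (alpha, beta, gamma))
    return a * b * c * math.sqrt(1 - ca * ca - cb * cb - cg * cg + 2 * ca * cb * cg)

# Cell parameters copied from the post; second item is lattice points per cell
# (P = 1, C = 2). The labels are only for this illustration.
cells = {
    "P1 (HKL2000)": ((103.32, 185.24, 291.75, 97.55, 90.18, 90.48), 1),
    "P2 (HKL2000)": ((186.38, 103.16, 295.88, 90.00, 98.79, 90.00), 1),
    "C2 (HKL2000)": ((592.23, 103.32, 185.24, 90.00, 97.52, 90.00), 2),
    "C2 (xtriage)": ((186.38, 103.16, 295.88, 90.00, 98.79, 90.00), 2),
}

volumes = {name: cell_volume(*params) for name, (params, _) in cells.items()}
per_point = {name: volumes[name] / cells[name][1] for name in cells}

for name in cells:
    print(f"{name:13s} V = {volumes[name]:.3e} A^3, "
          f"per lattice point = {per_point[name]:.3e} A^3")
```

With these inputs the C2 cell offered by HKL2000 should come out at about twice the volume of the P2 cell, but since a C-centered cell holds two lattice points, its volume per lattice point is close to that of the P1 and P2 cells. The C2 setting suggested by xtriage keeps the P2 axes, so its volume per lattice point is half of that.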


Francis E Reyes

3:55 p.m.

Yu,

...

Could someone tell me why HKL2000 couldn't find the correct C2 SP?

If you have a strong pseudotranslation (PST), you may see alternating weak/strong reflections along the axes. HKL2000 usually takes the most intense spots for indexing. Whether you include (or HKL2000 picks) the weak reflections determines whether you get the large unit cell (only the strong reflections) or the smaller unit cell (strong and weak reflections). LABELIT will work with PST quite well and I believe it picks up both the strong and weak reflections. It'll output a script suitable for mosflm. However, make sure you include images that include the strong/weak behavior (if it does exist in your images, b/c it can be xtal orientation dependent). Watch the integration in P2: do you notice that some (weak) reflections (usually in between lattice lines) are missed?

...

After refinement, the R went down to ~0.38, and the electron density fits well with my model except some flexible parts.

Congrats on solving the phase problem. If these flexible parts are important to your biochemical question (well, as a crystallographer I would probably try to fit these parts), you should explore other space groups.

My suggestion: solve and refine in C2. At the end of this you'll find that [1] the structure is completely built, the density fabulous, but the R/Rfree is still high, or [2] the structure is not completely built (there are chain breaks, other abnormalities), or [3] something I haven't come across. Any of these will probably cause your structure to get rejected or at least disputed.

Take that structure and molrep into the P2 data using AUTO for the PST (pseudotranslation) vector (molrep will generate the second copy based on the PST vector found in the Patterson and then use that for the TF function). Then take the solution and submit it to Zanuda (http://www.ysbl.york.ac.uk/YSBLPrograms/index.jsp) to find the proper origin (in case molrep found the wrong origin). This program will test other (potential) space groups as well.

If you try this, I'd love to hear the results.

F
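The alternating weak/strong pattern mentioned above follows from a textbook relation: if two identical copies are related by a pure translation t, each structure factor is multiplied by 1 + exp(2*pi*i*h.t), so intensities are scaled by the squared modulus of that term. The sketch below is a minimal illustration of that relation only. The exact vector (1/2, 1/2, 0) is taken from the xtriage operator x+1/2, y+1/2, z quoted in this thread; the slightly offset vector and the function name are hypothetical, chosen just to show the difference between a true and a pseudo translation.

```python
import cmath
import math

def pst_intensity_factor(hkl, t):
    """|1 + exp(2*pi*i*h.t)|^2 for two identical copies related by translation t."""
    phase = 2.0 * math.pi * sum(h * x for h, x in zip(hkl, t))
    return abs(1.0 + cmath.exp(1j * phase)) ** 2

exact = (0.5, 0.5, 0.0)         # operator x+1/2, y+1/2, z from the xtriage output
approximate = (0.5, 0.5, 0.01)  # hypothetical vector, slightly off the exact one

for hkl in [(1, 1, 0), (2, 0, 3), (1, 0, 0), (1, 0, 10), (2, 1, 10)]:
    print(hkl,
          round(pst_intensity_factor(hkl, exact), 3),
          round(pst_intensity_factor(hkl, approximate), 3))
```

For the exact translation the factor is 4 when h+k is even and 0 when h+k is odd, which is just C-centering. For an inexact translation the h+k odd reflections are weak rather than absent, and they grow stronger as the index along the offset direction increases.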

Ed Pozharski

4:33 p.m.

On Wed, 2011-03-16 at 09:55 -0600, Francis E Reyes wrote:

...

If you have a strong pseudotranslation (PST), you may see alternating weak/strong reflections along the axes.

Well, in C2 there is no pseudotranslation, right?

...

Watch the integration in P2, do you notice that some (weak) reflections (usually in between lattice lines) are missed?

Shouldn't it be the other way around - with half of the spots covering absent reflections?

-- "I'd jump in myself, if I weren't so good at whistling." Julian, King of Lemurs

Francis E Reyes

5:17 p.m.

Oops, I didn't catch that the PST doesn't cause a unit cell change (from xtriage). I imagine that the PST in P2 mimics a C2.

F


Ed Pozharski

4:29 p.m.

On Wed, 2011-03-16 at 11:32 -0400, Zhang yu wrote:

...

Could someone tell me why HKL2000 couldn't find the correct C2 SP?

Who knows. In my experience, autoindexing in denzo is sometimes finicky. Most tricks alter the set of peaks used for indexing, by restricting resolution, increasing/decreasing their number, even changing the spot radius. Since denzo picked a much larger unit cell, it may have been swayed by spots from a satellite crystal. In this case, the Patterson clearly indicated which space group is right, but you might have avoided the problem by running your images through another indexing program. LabelIT is, imho, a particularly robust tool. I am sure mosflm and/or xds could have worked too.

I hope you find the DNA in your maps,

Ed.

-- "I'd jump in myself, if I weren't so good at whistling." Julian, King of Lemurs

Edward A. Berry

6:06 p.m.

Zhang yu wrote:

...

Yes, I rescaled my data to C2 (186.38, 103.16, 295.88, 90.00, 98.79, 90.00), and AutoMR found a solution. After refinement, the R went down to ~0.38, and the electron density fits well with my model except some flexible parts. Although there is a small crash between symmetry related molecules, due to the flexible feature of that domain, I don't think it is a problem.

But the index confused me a lot. When I try to re-index my space group in HKL2000. The initial Index always suggests P1 (103.32, 185.24, 291.75, 97.55, 90.18, 90.48 ) Distortion Index 0.00% P2 (186.38, 103.16, 295.88, 90.00, 98.79, 90.00) Distortion Index 0.22% (they are basically the same) while HKL2000 couldn't index the data to correct C2, the only C2 it found is

C2 (592.23, 103.32, 185.24, 90.00, 97.52, 90.00) Distortion index 4.11%

the unit cell looks twice bigger than current one, and it has quite unacceptable distortion index. It couldn't be refined. (Extremely high chi square and mosaicity)

So, I just rescaled the integrated data (P2) to C2. The statistics of scale is OK, and there is no obvious violation.

Could someone tell me why HKL2000 couldn't find the correct C2 SP?

I am confused also. Your C2 cell is the same as the P2 cell. Since operators other than translation are the same in the two SGs, the data should scale just as well in C2; that doesn't prove anything. But now half the spots that were predicted in P2 will be systematically absent. If they are really absent, the space group could be C2 ("the observed pseudo translationals are crystallographic"), but in that case denzo would not have offered the primitive cell it did. If they are very weak, it could be translational pseudosymmetry masquerading as C2, but then the space group would be P2 or P2(1).

eab
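One way to look at the question raised here (are the reflections that C-centering forbids really absent, or just weak?) is to split the P2-integrated reflections by the parity of h+k and compare the mean I/sigma of the two classes. The sketch below is a minimal illustration under stated assumptions: the function name, the (h, k, l, I, sigma) tuple layout, and the toy numbers are all hypothetical and are not taken from this dataset. Reading a real scalepack or MTZ file is left out.

```python
from statistics import mean

def centering_check(reflections):
    """Return mean I/sigma for the h+k even and the h+k odd reflections.

    reflections: iterable of (h, k, l, intensity, sigma) tuples.
    """
    even, odd = [], []
    for h, k, l, intensity, sigma in reflections:
        if sigma <= 0:
            continue  # skip measurements without a usable sigma
        target = even if (h + k) % 2 == 0 else odd
        target.append(intensity / sigma)
    if not even or not odd:
        raise ValueError("need reflections in both parity classes")
    return mean(even), mean(odd)

# Made-up numbers for illustration only, not measured data.
toy_reflections = [
    (2, 0, 1, 900.0, 30.0),
    (1, 1, 3, 400.0, 20.0),
    (1, 0, 2, 12.0, 10.0),
    (0, 1, 5, 5.0, 10.0),
    (-3, 2, 1, 8.0, 10.0),
]

even_mean, odd_mean = centering_check(toy_reflections)
print(f"mean I/sigma, h+k even: {even_mean:.2f}")
print(f"mean I/sigma, h+k odd:  {odd_mean:.2f}")
```

If the h+k odd class sits at the noise level, that would be consistent with a true C-centered lattice; if it is clearly measurable but much weaker than the h+k even class, that would fit the pseudo-centering scenario in a primitive cell described above.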

Green, Todd

6:21 p.m.

Can you just rescale the data in C2 if you did the integration in P2? I know HKL2000 (at least the GUI) will flag you for trying to switch Bravais lattices. I feel like this is not exactly what you meant if you got good scaling statistics.

As for HKL2000 not auto-finding your C-centered monoclinic space group, you can play around with the number of peaks (add more or use fewer), the peak size, and the resolution limits. Some combination has worked for us when we had tricky crystals with alternating weak and heavy rows.

-Todd

-----Original Message-----
From: [email protected] on behalf of Zhang yu
Sent: Wed 3/16/2011 10:32 AM
To: PHENIX user mailing list
Subject: Re: [phenixbb] Only one solution in AutoMR with very negative LLG

So, I just rescaled the integrated data (P2) to C2. The statistics of scale is OK, and there is no obvious violation.

Yu

Ed Pozharski

15 Mar 15 Mar

4:42 p.m.

On Tue, 2011-03-15 at 11:45 -0400, Zhang yu wrote:

...

Fast Rotation Function Table: 1
-------------------------------
#SET         Top    (Z)      Second    (Z)       Third    (Z)
1       -1129.09  32.53         ---    ---         ---    ---
----  ---------- -----  ---------- -----  ---------- -----

Well, your Z-score is very promising. What size protein/DNA do you have? The likely asu content is over a megadalton, so unless the protein is huge, you probably have more than one copy - this would explain bad initial R-values. Patterson suggests that too. You probably need to play with the estimate of asu content a bit to get rid of negative LLG. What happens if you run a search looking for the second copy with the first one in place?

-- "I'd jump in myself, if I weren't so good at whistling." Julian, King of Lemurs
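For readers wondering how a strongly negative LLG can still carry a high Z-score: the legend in the Phaser log defines the Z-score as the number of standard deviations of the LLG above the mean of randomly sampled rotations, and gives the 75% selection cutoff as 0.75*(top-mean)+mean. The short sketch below simply redoes that arithmetic with the figures from the other Phaser log posted in this thread (mean -3258.27, sigma 128.907, top -242.259). The function names are made up for the illustration and are not part of Phaser.

```python
# Figures copied from the Phaser fast rotation function log quoted in this thread.
mean_llg = -3258.27   # mean LLG of the 500 randomly sampled rotations
sigma_llg = 128.907   # standard deviation of those random scores
top_llg = -242.259    # LLG of the top rotation peak

def z_score(llg, mean, sigma):
    """Number of standard deviations by which llg exceeds the mean."""
    return (llg - mean) / sigma

def selection_cutoff(top, mean, fraction=0.75):
    """Cutoff as printed in the log legend: fraction*(top-mean)+mean."""
    return fraction * (top - mean) + mean

rfz = z_score(top_llg, mean_llg, sigma_llg)
cutoff = selection_cutoff(top_llg, mean_llg)
print(f"RFZ = {rfz:.2f}")
print(f"75% cutoff = {cutoff:.2f}")
```

These should agree, to rounding, with the 23.3968 and -996.261 printed in that log. The point is that the Z-score measures how far the top peak stands out from random orientations, so it can be large even when every LLG is negative.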

Zhang yu

7:05 p.m.

Yeah, the protein is huge, around 450 kDa. Changing the ASU content did help a little bit. By searching for two copies in the ASU, I got a solution with a better LLG:

#+  #*  Initial LLG  Initial R  Refined LLG  Refined R  Unique  Tmplt
1   1       4885.54      61.38      4885.54      61.38  YES

But the R-factors are still high and stuck during refinement.

Regarding the possible effect of space group, I reindexed the dataset from P2 to P21. AutoMR also gave similar solutions with high R values.


Ed Pozharski

10:19 p.m.

On Tue, 2011-03-15 at 15:05 -0400, Zhang yu wrote:

...

Yeah, the protein is huge, around 450KD. Changing the ASU content did help a little bit. By searching two copies in ASU, I got a solution with better LLG

#+  #*  Initial LLG  Initial R  Refined LLG  Refined R  Unique  Tmplt
1   1       4885.54      61.38      4885.54      61.38  YES

But the Rfactor are still high and stuck during refinement.

Regarding the possible effect of space group, I reindexed the dataset from P2 to P21. AutoMR also gave similar solutions with high R values.

A little bit? Your LLG goes from -1000 to +5000; I'd say this solves it. The R-values are high because you are still missing a huge part of it. If you try searching for the second copy, do you get any extra solutions? Even with the DNA still missing (how big is it?), I'd expect the R to be a bit lower. Look at the density - see if you can start placing the DNA. Resolution is not too high though. Based on solvent content, you have room for the second copy - unfortunately in that case there won't be much left for DNA.

-- "I'd jump in myself, if I weren't so good at whistling." Julian, King of Lemurs
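The solvent-content reasoning can be sketched with the Matthews coefficient. The example below is a rough estimate under explicit assumptions: it uses the monoclinic cell quoted in the thread (186.38, 103.16, 295.88, beta 98.79), two asymmetric units per cell as in P2 or P21, the approximately 450 kDa protein mass from the thread, and the common protein-only approximation solvent = 1 - 1.23/Vm. The DNA is ignored, and the function names are made up for the illustration.

```python
import math

def monoclinic_cell_volume(a, b, c, beta_deg):
    """Volume (cubic angstroms) of a monoclinic cell (alpha = gamma = 90)."""
    return a * b * c * math.sin(math.radians(beta_deg))

def matthews(cell_volume, n_asu, mass_da):
    """Return (Vm in A^3/Da, approximate solvent fraction).

    n_asu:   number of asymmetric units in the cell (2 for P2 or P21)
    mass_da: total molecular mass per asymmetric unit, in daltons
    """
    vm = cell_volume / (n_asu * mass_da)
    solvent = 1.0 - 1.23 / vm  # usual approximation for protein; DNA is ignored
    return vm, solvent

volume = monoclinic_cell_volume(186.38, 103.16, 295.88, 98.79)
for copies in (1, 2):
    vm, solvent = matthews(volume, n_asu=2, mass_da=copies * 450_000)
    print(f"{copies} copy/copies per ASU: Vm = {vm:.2f} A^3/Da, "
          f"solvent ~ {100 * solvent:.0f}%")
```

Under these assumptions one copy per asymmetric unit leaves a very high solvent fraction, while two copies give a Vm a little above 3 A^3/Da, which is in line with the remark that there is room for a second copy. Note that with the same cell in C2 there are four asymmetric units, so one 450 kDa copy per asymmetric unit in C2 gives the same Vm as two copies in P2.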


18 comments

6 participants

participants (6)

Daniel Mattle
Ed Pozharski
Edward A. Berry
Francis E Reyes
Green, Todd
Zhang yu