Dear CCTBX developers, I found a big difference in CC1/2 value (say, 77.3 vs 68.3) between phenix.merging_statistics and XSCALE when more than one *anomalous* dataset was merged in XSCALE. phenix.merging_statistics seems to assume that equivalent (in case of anomalous, here I do not consider Friedel pairs as equivalent) reflections are sorted (equivalent reflections are not separeted) in the input file. Usually, this assumption should be correct. However, if multiple anomalous datasets were merged in XSCALE, this assumption is not correct. Because Friedel pairs are not necessarily separated. This is a real example (in P43212). These lines are extracted from .HKL file by XSCALE:
1 -41 1 1.749E+04 1.657E+05 2052.0 2467.7 78.8 -113.23 1 41 -1 -1 4.569E+03 6.014E+03 454.1 1112.8 132.2 116.90 1 41 1 -1 8.721E+03 2.637E+03 473.0 1064.4 130.4 116.11 2 41 1 1 7.634E+03 2.417E+03 498.2 1010.8 135.3 115.65 2 -1 -41 -1 4.847E+03 1.997E+03 1892.9 2543.0 85.9 -122.78 2
The last number indicates the dataset the reflection belongs to. Here, (1 -41 1), (41 -1 -1), (41, 1, 1), and (-1, -41, -1) are equivalent reflections, but (41, 1, -1) in the middle is not when anomalous data! Therefore, in phenix.merging_statistics, subsets calculation for (41,1,1) is performed twice (for first two and last two). When I changed the cctbx code to sort reflections after mapping to a.s.u., CC1/2 values by phenix.merging_statistics became closer to XSCALE. In merging_stats.__init__() in iotbx/merging_statistics.py, I inserted a line
array = array.sort("packed_indices") after array = array.customized_copy(anomalous_flag=anomalous).map_to_asu()
but I think there would be more correct way to fix this unexpected behavior. Best regards, Keitaro