Dear CCTBX developers,
I found a large difference in the CC1/2 value (for example, 77.3 vs
68.3) between phenix.merging_statistics and XSCALE when more than one
*anomalous* dataset was merged in XSCALE.
phenix.merging_statistics seems to assume that equivalent reflections
(for anomalous data, I do not consider Friedel pairs as equivalent
here) are sorted in the input file, i.e. that observations of
equivalent reflections are not separated by other reflections.
Usually, this assumption is correct.
However, if multiple anomalous datasets were merged in XSCALE, this
assumption does not hold, because the two members of a Friedel pair
are not necessarily kept apart and can be interleaved in the file.
This is a real example (in P43212). These lines are extracted from
the .HKL file written by XSCALE:
> 1 -41 1 1.749E+04 1.657E+05 2052.0 2467.7 78.8 -113.23 1
> 41 -1 -1 4.569E+03 6.014E+03 454.1 1112.8 132.2 116.90 1
> 41 1 -1 8.721E+03 2.637E+03 473.0 1064.4 130.4 116.11 2
> 41 1 1 7.634E+03 2.417E+03 498.2 1010.8 135.3 115.65 2
> -1 -41 -1 4.847E+03 1.997E+03 1892.9 2543.0 85.9 -122.78 2
The last number indicates the dataset the reflection belongs to.
Here, (1 -41 1), (41 -1 -1), (41 1 1), and (-1 -41 -1) are
equivalent reflections, but (41 1 -1) in the middle is not equivalent
to them when the data are treated as anomalous. Therefore, in
phenix.merging_statistics, the subset calculation for (41 1 1) is
performed twice (once for the first two observations and once for the
last two).
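The effect can be reproduced with a minimal Python sketch that does not
use cctbx at all. It assumes that merging groups only consecutive
observations sharing the same mapped index, which is what the behavior
above suggests. The helper names (OPS_422, canonical,
consecutive_group_sizes) are hypothetical, and the representative chosen
by canonical() is just the lexicographically largest member of the
orbit, not the a.s.u. convention used by cctbx.

```python
from itertools import groupby

# Rotations of point group 422 (the point group of P43212) acting on (h, k, l).
# No inversion is included, so Friedel mates stay distinct (anomalous case).
OPS_422 = [
    lambda h, k, l: (h, k, l),
    lambda h, k, l: (-h, -k, l),
    lambda h, k, l: (-k, h, l),
    lambda h, k, l: (k, -h, l),
    lambda h, k, l: (h, -k, -l),
    lambda h, k, l: (-h, k, -l),
    lambda h, k, l: (k, h, -l),
    lambda h, k, l: (-k, -h, -l),
]


def canonical(hkl):
    """Pick one representative per set of symmetry-equivalent indices.

    Assumption: the lexicographic maximum of the orbit is used here only
    for illustration; cctbx defines its own asymmetric unit.
    """
    return max(op(*hkl) for op in OPS_422)


def consecutive_group_sizes(keys):
    """Group only *adjacent* identical keys, as a stand-in for merging
    that relies on equivalent reflections being contiguous."""
    return [(key, len(list(group))) for key, group in groupby(keys)]


# The five observations from the XSCALE file, in file order.
observations = [(1, -41, 1), (41, -1, -1), (41, 1, -1), (41, 1, 1), (-1, -41, -1)]
keys = [canonical(hkl) for hkl in observations]

unsorted_groups = consecutive_group_sizes(keys)
sorted_groups = consecutive_group_sizes(sorted(keys))

print(unsorted_groups)  # (41, 1, 1) appears as two separate groups
print(sorted_groups)    # (41, 1, 1) appears once, with all four observations
```

Without sorting, the four equivalent observations are split into two
groups of two by the (41 1 -1) observation between them. After sorting,
they form a single group of four.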
When I changed the cctbx code to sort reflections after mapping to
the a.s.u., the CC1/2 values from phenix.merging_statistics became
closer to those from XSCALE.
In merging_stats.__init__() in iotbx/merging_statistics.py, I inserted a line
> array = array.sort("packed_indices")
after
> array = array.customized_copy(anomalous_flag=anomalous).map_to_asu()
but I think there may be a more correct way to fix this unexpected behavior.
Best regards,
Keitaro