Normally, I would also suggest that you take advantage of a resource in Phenix:
which contains model_vs_data results for all applicable PDB entries. However, this has not been updated since at least last August,
I will try to find time to update it next week. Since phenix.cif_as_mtz extracts all data arrays now (thanks Richard!), I will have to change my scripts to check these arrays and choose the one that was actually used to produce the final deposited structure.
and in the meantime, there have been thousands of new entries added to the PDB, plus Pavel recently changed the bulk solvent correction and scaling procedure, which tends to result in slightly lower R-factors, so I think it's officially obsolete right now.
I wouldn't say it's obsolete; it just reflects the state as of August last year, and that's 50+k entries - good enough for statistical exploration. But of course it's not suitable if you want to look at entries deposited within the past year. Pavel