New subject: Adequate size for Free R test set?

3 Aug 2010

      Pavel,

Thanks so much for the suggestions. They are really helpful! A few questions relative to Phenix. Is there a way to check the thin resolution shells with Phenix? I created the free-R flags in Phenix using the thin resolution shells option, but removed the 2000 limit and instead used 5%. Maybe I overdid it relative to your reply. I am just wondering how to check each of these thin shells. Alternatively, does Phenix ensure that there is an adequate number of test reflections in each "thin shell" if this option is set? If so, what should I use for the maximum number of reflections and/or the percentage?

Thanks again!

Joe
___________________________________________________________
Joseph P. Noel, Ph.D.
Investigator, Howard Hughes Medical Institute
Professor, The Jack H. Skirball Center for Chemical Biology and Proteomics
The Salk Institute for Biological Studies
10010 North Torrey Pines Road
La Jolla, CA  92037 USA

Phone: (858) 453-4100 extension 1442
Cell: (858) 349-4700
Fax: (858) 597-0855
E-mail: [email protected]

Web Site (Salk): http://www.salk.edu/faculty/faculty_details.php?id=37
Web Site (HHMI): http://hhmi.org/research/investigators/noel.html
___________________________________________________________

On Aug 3, 2010, at 11:02 AM, [email protected] wrote:
...
Message: 1
Date: Tue, 03 Aug 2010 10:38:57 -0700
From: Pavel Afonine 
To: [email protected]
Cc: Joseph Noel 
Subject: Re: [phenixbb] Adequate size for Free R test set?
Message-ID: <[email protected]>
Content-Type: text/plain; charset="iso-8859-1"; Format="flowed"
Hi Joe,
I think almost everyone has his/her own opinion on this... Here is what 
I think:
1) The test set should be such that each "relatively thin resolution 
shell" receives at least 50 reflections, and we empirically found that 
150 is "good enough" within the phenix.refine framework.
For "relatively thin resolution shell" definition see:
Lunin & Skovoroda. Acta Cryst. (1995). A51, 880-887. "R-free 
likelihood-based estimates of errors for phases calculated from atomic 
models".
This basically defines how many test reflections you need.
2) It is customary to set aside either 5 or 10% for test set, with the 
total maximum 2000. These are all "magic numbers", that I presume more 
or less satisfy "1)" so they became widely used.
3) The presence of high-order NCS, and selecting free flags with the 
"thin shells" algorithm, is a different story (Acta Cryst. (2006). D62, 
227-238). It is good to do that because it removes the cross-talk 
between test and work reflections due to NCS, but at the same time it 
invalidates requirement "1)". So, this is a gray area (for me at least).
4) Some people believe that the final refinement run should be done 
using all reflections, arguing that taking away 5-10% of the data as 
test reflections worsens the maps. There is some truth in this: removing 
data does worsen the maps, but:
a) it is noticeable (in the sense that it can reduce the interpretability 
of some parts of the map) only in extreme cases of rather low 
resolution or low-completeness data;
b) in most other cases it is simply negligible;
c) removing reflections randomly has a much smaller effect than removing 
them systematically (see page #40 here: 
http://www.phenix-online.org/presentations/latest/pavel_maps.pdf and 
some relevant references in the 2010 PHENIX paper in Acta D).
However, if you do that "final run", you will invalidate the final 
refinement statistics, Rfree and Rwork, and the resulting final structure 
can no longer have an Rfree associated with it.
Pavel.
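
Point 1) above can be checked directly once the resolution and the free
flag of every reflection are known. What follows is a minimal Python
sketch, not a Phenix tool. The function names count_test_per_shell and
sparse_shells are hypothetical, the shells are plain bins of equal
reciprocal-space volume (equal steps in 1/d^3) rather than the
"relatively thin" shells defined by Lunin & Skovoroda, and the two input
lists are assumed to have been exported from the reflection file by
some other means.

```python
def count_test_per_shell(d_spacings, is_test, n_shells=10):
    """Count test-set reflections in shells of equal reciprocal-space volume.

    d_spacings: resolution d (Angstrom) of each reflection
    is_test:    True where the reflection belongs to the test set
    Returns (d_max, d_min, n_test) tuples ordered from low to high resolution.
    """
    # Equal steps in 1/d^3 give shells of equal reciprocal-space volume.
    s3 = [1.0 / d ** 3 for d in d_spacings]
    lo, hi = min(s3), max(s3)
    if hi == lo:
        raise ValueError("all reflections have the same resolution")
    width = (hi - lo) / n_shells

    counts = [0] * n_shells
    for value, flag in zip(s3, is_test):
        # Clamp so the highest-resolution reflection lands in the last shell.
        index = min(int((value - lo) / width), n_shells - 1)
        if flag:
            counts[index] += 1

    shells = []
    for i, n_test in enumerate(counts):
        d_max = (lo + i * width) ** (-1.0 / 3.0)
        d_min = (lo + (i + 1) * width) ** (-1.0 / 3.0)
        shells.append((d_max, d_min, n_test))
    return shells


def sparse_shells(shells, minimum=50):
    """Return the shells that hold fewer than `minimum` test reflections."""
    return [shell for shell in shells if shell[2] < minimum]


if __name__ == "__main__":
    import random

    # Synthetic data only: 20000 reflections, about 5% flagged as test.
    rng = random.Random(0)
    d = [rng.uniform(1.2, 30.0) for _ in range(20000)]
    flags = [rng.random() < 0.05 for _ in d]
    for d_max, d_min, n_test in count_test_per_shell(d, flags):
        print(f"{d_max:7.2f} - {d_min:5.2f} A   test reflections: {n_test}")
```

Calling sparse_shells with minimum=50 lists the shells that fall below
the lower bound mentioned in 1); minimum=150 applies the "good enough"
figure instead. The number of shells is a free choice here, and the
counts depend on it.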
On 8/3/10 10:04 AM, Joseph Noel wrote:
...
Hi Folks,
It's been a while since I personally refined many structures. In the 
past, I used 5% of my unique reflections as the default for the Free R 
test set. I have a high resolution structure with 150,000 unique 
reflections and noticed that the Phenix defaults are 5% or 2000 
reflections, whichever is smaller. What is the current consensus on an 
adequate number of unique reflections to use for cross-validation?
Thanks!
Joe
P.S. I really, really love Phenix.
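
The "5% or 2000, whichever is smaller" rule described in this message
works out as follows for 150,000 unique reflections. This is a small
Python sketch of the arithmetic only. The function and parameter names
(default_test_set_size, fraction, max_free) are hypothetical and are not
taken from Phenix.

```python
def default_test_set_size(n_unique, fraction=0.05, max_free=2000):
    """Size of the test set: a fraction of the data, optionally capped."""
    n_test = int(round(fraction * n_unique))
    if max_free is None:
        # No cap: the fraction alone decides.
        return n_test
    return min(n_test, max_free)


print(default_test_set_size(150000))                 # 2000
print(default_test_set_size(150000, max_free=None))  # 7500
```

With the cap in place the test set is 2000 reflections, about 1.3% of
the data. Without it, 5% gives 7500.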
