
Re: [ccp4bb]: re: unmeasured data in map-calculations

***  For details on how to be removed from this list visit the  ***
***    CCP4 home page http://www.dl.ac.uk/CCP/CCP4/main.html    ***

i did an experiment some two years ago involving hundreds
of refinements of the same structure using systematically
perturbed data and different starting models. it's a very
long story (and, yes, i will write it up properly one
millennium), but the conclusion was:

         "if you have measured it, use it !"

this was done at both 2.0 and 2.5 A, with cns 0.5 using the
MLF target in most cases (i did do an "old fashioned" run
with target=resi ["remember TARGET=RESI, granddad ?"] for
comparison). i correlated all measures of data quality and
quantity with accuracy. the strongest correlation by far
was for "completeness" (which, because i used fixed resolution
limits, really means "number of reflections used in the refinement"),
with correlation coefficients of ~0.7. randy read's
measure of information content also correlated well (~0.5-0.6),
but all other measures (Rmerge, average I/sigma(I), average
multiplicity, Rmeas and PCV) correlated very weakly with the
accuracy of the final model (corr. coeffs. ~0.1-0.2).
this was really a bit of a shock - but at least it gives my
audiences something to disagree with me about :-)
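
for anyone who wants to try the same kind of comparison on their own
refinements, here is a minimal sketch of the bookkeeping, assuming you
have one value of each quality measure and one accuracy figure per
refinement run. the function names (pearson, rank_measures) and the
dictionary layout are hypothetical, chosen only for illustration; this
is not the script used for the experiment described above.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("a constant sequence has no defined correlation")
    return sxy / math.sqrt(sxx * syy)


def rank_measures(measures, accuracy):
    """Rank quality measures by |r| against model accuracy, strongest first.

    measures: dict mapping a measure name (hypothetical keys such as
              "completeness" or "Rmerge") to one value per refinement run.
    accuracy: one accuracy figure per run, in the same order.
    """
    scored = [(name, pearson(values, accuracy)) for name, values in measures.items()]
    return sorted(scored, key=lambda item: abs(item[1]), reverse=True)
```

the ranking uses the absolute value of r because some measures go down
as data get better and others go up, and because "accuracy" may be
expressed as an error (lower is better) or a score (higher is better).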

most of the other findings were surprisingly well in line
with the gospel according to alwyn and gerard, i.e. the
validation criteria that correlate best with model accuracy
are those that are "orthogonal" to the information used
in the refinement. in other words: rmsd bond lengths and
angles from ideal values are completely uninformative;
but rfree, ramachandran, DACA score etc. are very good.

as for what to call the resolution, i wouldn't worry
too much. eventually, we'll be quoting the ratio of
the number of bits of information in our experimental
data and the effective number of degrees of freedom
(or something similar).
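
until such a measure exists, a much cruder stand-in is the plain
ratio of reflections to refined parameters. the sketch below is that
simple count and nothing more: it is not the information-content ratio
suggested above, and the function name and the default of four
parameters per atom (x, y, z and an isotropic B) are assumptions made
for illustration.

```python
def observations_per_parameter(n_reflections, n_atoms, params_per_atom=4):
    """Crude observation-to-parameter ratio for a refinement.

    n_reflections:   number of reflections used in the refinement.
    n_atoms:         number of atoms being refined.
    params_per_atom: assumed 4 (x, y, z, isotropic B); use 9 for
                     anisotropic displacement parameters.
    Restraints and correlated data are deliberately ignored here.
    """
    if n_reflections < 0:
        raise ValueError("n_reflections must not be negative")
    n_params = n_atoms * params_per_atom
    if n_params <= 0:
        raise ValueError("need at least one refined parameter")
    return n_reflections / n_params
```

because this count depends on how many reflections were actually used
rather than on the nominal resolution limit, it drops when the outer
shells are incomplete.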

well, these were my last two cents for this millennium !


On Wed, 20 Dec 2000, Andy Arvai wrote:

> What do people do with good, but incomplete data in the highest
> resolution shells during refinement? Should this data be used in
> refinement? If not, how complete should the data be in a resolution
> shell before it is included in refinement? If yes, what is the quoted
> resolution of the structure? It wouldn't seem fair to refine to higher
> resolution (with low completeness) and then quote statistics to a lower
> resolution. Has anyone done any tests refining against partially
> complete data?
> My feeling is that you should use all the data, however I'm not sure
> how gracefully the various refinement programs handle missing data,
> nor what the "resolution" of the structure would be.
> Andy
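
the completeness per resolution shell that the question above turns on
can be tabulated along these lines. this is a minimal sketch, assuming
you already have the d-spacing (in A) of every possible unique
reflection and of every measured one; the function name and the shell
edges are hypothetical, and the data-processing programs report the
same table themselves.

```python
def shell_completeness(possible_d, measured_d, edges):
    """Completeness per resolution shell.

    possible_d: d-spacings (A) of all possible unique reflections.
    measured_d: d-spacings (A) of the reflections actually measured.
    edges:      shell limits in A, from low to high resolution,
                e.g. [50.0, 3.0, 2.5, 2.0]; a shell holds
                reflections with high <= d < low.
    Returns a list of ((low, high), n_measured, n_possible, fraction);
    fraction is None for a shell with no possible reflections.
    """
    table = []
    for low, high in zip(edges, edges[1:]):
        n_possible = sum(1 for d in possible_d if high <= d < low)
        n_measured = sum(1 for d in measured_d if high <= d < low)
        fraction = n_measured / n_possible if n_possible else None
        table.append(((low, high), n_measured, n_possible, fraction))
    return table
```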

                        Gerard J.  Kleywegt
Dept. of Cell & Molecular Biology  Bolshevik University of Uppsala
                Biomedical Centre  Box 596
                SE-751 24 Uppsala  SWEDEN

    http://xray.bmc.uu.se/gerard/  mailto:gerard@xray.bmc.uu.se
   The opinions in this message are fictional.  Any similarity
   to actual opinions, living or dead, is purely coincidental.