
Re: [ccp4bb]: ESD of distances!




On Monday 18 March 2002 11:49, lothar esser wrote:
> > > As a small molecule crystallographer in my previous birth, I used a
> > > program  called PARST (by Prof. Nardelli) to calculate the Min and MAxi
> > > distances between atoms taking into account the B-factors of atoms.  I
> > > cannot use PARST now as my molecule is too big.
> > >
> > > P.S:  I am now refining two more proteins at this resolution the sizes
> > > are from 60AA in asym, 325AA in Asym unit and 1500AA in asymm unit. 
> > > So, I do not want to be changing dimensions in PARST to do 1500AA.....
> >
> > Why not?  You apparently had enough memory to do full-matrix in shelx,
> > so you must therefore have ample memory for the PARST arrays.
>
>   My guess is he used CGLS as recommended for large molecules and not
>   full-matrix least-squares.

I had snipped the part where he said that he had used shelx to calculate
esds.  That option only exists in L.S. mode, not in CGLS.

It is possible to use L.S. refinement in block mode (BLOC) rather than
true full-matrix, and still ask for esds.  I would recommend against
doing this, however.  Or at least, if you do, be aware that the esds
estimated from a blocked-matrix treatment are under-estimates of the
"true" esds you would get from a full-matrix treatment.  This is because
the covariance between parameters in separate blocks is necessarily
omitted from the calculation.  If you partition into, say, 5 equal blocks
then you are only sampling about 1/5 of the terms in the full matrix.
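
To make the argument above concrete, here is a minimal Python sketch on a
toy problem.  Everything in it is an assumption made for illustration: the
4x4 "normal matrix" is invented, the function names are hypothetical, and
the overall scale factor that multiplies the inverse matrix is left out
because it is the same in both treatments.  It is not how shelx does the
calculation internally.  It only shows that inverting the diagonal blocks
separately never gives a larger esd than inverting the whole matrix.

```python
def invert(matrix):
    """Invert a small square matrix by Gauss-Jordan elimination."""
    n = len(matrix)
    # Augment each row with the matching row of the identity matrix.
    aug = [[float(v) for v in row] + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        # Partial pivoting: bring the largest remaining entry to the diagonal.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [v - f * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def esds_full(normal):
    """esds (up to a common scale) from the inverse of the full matrix."""
    inv = invert(normal)
    return [inv[i][i] ** 0.5 for i in range(len(normal))]


def esds_blocked(normal, block_sizes):
    """esds from inverting each diagonal block on its own."""
    out, start = [], 0
    for size in block_sizes:
        sub = [row[start:start + size] for row in normal[start:start + size]]
        inv = invert(sub)
        out.extend(inv[i][i] ** 0.5 for i in range(size))
        start += size
    return out


def retained_fraction(block_sizes):
    """Fraction of matrix elements kept by a block-diagonal partition."""
    n = sum(block_sizes)
    return sum(b * b for b in block_sizes) / (n * n)


# Invented symmetric, diagonally dominant (hence positive-definite) matrix.
normal = [[4.0, 1.0, 0.5, 0.2],
          [1.0, 3.0, 0.3, 0.4],
          [0.5, 0.3, 5.0, 1.0],
          [0.2, 0.4, 1.0, 4.0]]

full = esds_full(normal)
blocked = esds_blocked(normal, [2, 2])
for i, (f, b) in enumerate(zip(full, blocked)):
    print(f"parameter {i}: full {f:.4f}  blocked {b:.4f}  ratio {b / f:.3f}")
print("fraction of terms kept with 5 equal blocks:", retained_fraction([100] * 5))
```

The block sizes in the last line are counted in residues, which assumes
every residue contributes the same number of parameters.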

I have found for a ~500 residue protein that partitioning the calculation
into 5 blocks of 100 residues (plus a 6th for bound oligosaccharide and
solvent molecules) causes the esds to be underestimated
by roughly 10% relative to the values obtained from true full-matrix
[Merritt et al (1998) J. Mol. Biol. 282, 1043-1049].  I have subsequently
obtained similar numbers from other structure refinements.
For a more mathematical investigation of how covariance distributes itself in
the matrix you might also have a look at the paper by Kevin Cowtan
and Lynn Ten Eyck in Acta Cryst. D 56, 842-856 (2000).

Is a 10% underestimate significant?  [shrug] Well, I assume that if you are
going to the trouble of calculating esds then you care enough to want
accurate values.  And the amount of underestimation will in any case
depend on the actual distribution of covariance in your particular
matrix, so I cannot even predict how typical my value of 10% would
turn out to be.
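
The dependence on the matrix can be seen in the smallest possible case.
The sketch below is an illustrative assumption, not a result from any real
refinement: two parameters, a normal matrix with unit diagonal and a single
off-diagonal element c, and each parameter placed in its own block.  The
full-matrix variance is then 1/(1 - c^2) and the blocked variance is 1, so
the ratio of blocked to full esd is sqrt(1 - c^2).

```python
import math


def blocked_to_full_esd_ratio(c):
    """Ratio of blocked esd to full-matrix esd for the 2x2 toy case.

    c is the off-diagonal normal-matrix element (hypothetical, |c| < 1).
    """
    return math.sqrt(1.0 - c * c)


# The stronger the coupling that blocking throws away, the larger the
# underestimate; with no coupling the two treatments agree exactly.
for c in (0.0, 0.2, 0.4, 0.6):
    print(f"c = {c:.1f}  blocked/full esd = {blocked_to_full_esd_ratio(c):.3f}")
```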

On a related note - I have written a Perl script to extract the esds reported
in a shelx output log and reconstruct a PDB file containing the appropriate
SIGATM records.  I'd say it's in beta-test condition, if anyone wants me to
Email a copy for evaluation. 
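
For anyone who has not met the record type: in the PDB format a SIGATM
record repeats the atom identification fields of its ATOM/HETATM record
and carries standard deviations in the columns that normally hold x, y, z,
occupancy and B.  The following is a minimal Python sketch of that
formatting step only.  It is not the Perl script mentioned above, the
function name is hypothetical, the esd values are made up, and it does not
attempt to parse a shelx log.

```python
def sigatm_record(atom_line, sig_x, sig_y, sig_z, sig_occ, sig_b):
    """Build a SIGATM record from an ATOM/HETATM line and five esds."""
    line = atom_line.rstrip("\n").ljust(80)
    ident = line[6:30]   # serial, atom name, altLoc, residue, chain, resSeq
    tail = line[66:80]   # trailing columns, including element and charge
    return ("SIGATM" + ident
            + f"{sig_x:8.3f}{sig_y:8.3f}{sig_z:8.3f}"   # columns 31-54
            + f"{sig_occ:6.2f}{sig_b:6.2f}"             # columns 55-66
            + tail)


# Made-up atom and made-up esds, purely to show the layout.
atom = "ATOM      1  N   MET A   1      27.340  24.430   2.614  1.00  9.67           N  "
record = sigatm_record(atom, 0.012, 0.015, 0.011, 0.0, 0.21)
print(atom)
print(record)
```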


-- 
Ethan A Merritt       merritt@u.washington.edu
Biomolecular Structure Center Box 357742
University of Washington, Seattle, WA 98195
phone: (206)543-1421
FAX:   (206)685-7002