[ccp4bb]: Summary: Amore Rotation Function Scoring?
Dear CCP4 users & staff,
On September 23rd, I posted the following message on the CCP4BB:
"The Amore cross-rotation function basically calculates a correlation
coefficient between the observed and calculated Patterson function (CC_P).
However, the output of the cross-rotation search is for some dubious reason
sorted on the correlation coefficient between calculated and observed F
(CC_F). This doesn't make much sense to me for the following reasons:
(1) The search function is the CC_P, thus, from a methodological point of
view, the output should be sorted on this value and not on something else.
(2) Both the calculated F and I of the model only make sense after it has
been correctly positioned, which is not the case in the cross-rotation
search.
(3) Accordingly, the signal-to-noise must be much better for CC_P than for
either CC_F or CC_I. To illustrate this, I have run a cross-rotation search
with the refined protein-only model of the A. niger phytase (Kostrewa et al.,
NSB, 4, 185ff, 1995) against its observed data. The top 10 of the amore
cross-rotation output looks like this (I've removed the TX,TY,TZ columns for
better readability):
            ITAB   ALPHA   BETA   GAMMA  CC_F  RF_F  CC_I  CC_P  Icp
SOLUTIONRC     1    3.28  85.77  237.92  27.9  55.3  42.8  26.8    1
SOLUTIONRC     1  117.85  90.00   58.64  22.2  57.2  34.3  16.3    2
SOLUTIONRC     1   90.57  80.41  235.17  18.2  58.3  26.3   5.6    3
SOLUTIONRC     1   60.20  85.07  240.47  17.9  58.5  25.9   4.9    4
SOLUTIONRC     1   22.57  57.12  223.13  17.9  58.5  26.6   4.1    5
SOLUTIONRC     1   47.85  86.10  237.71  17.8  58.5  25.8   5.2    6
SOLUTIONRC     1   87.65  60.22   71.67  17.8  58.4  25.9   4.4    7
SOLUTIONRC     1   80.37  85.82  235.99  17.7  58.4  25.1   4.5    8
SOLUTIONRC     1   44.86  24.72   48.00  17.7  58.5  26.0   5.6    9
SOLUTIONRC     1   41.18  58.25   87.29  17.7  58.4  25.7   6.4   10
Interestingly, the correct top peak appears to be also the top peak in CC_F
and CC_I. However, as you can clearly see, the signal-to-noise ratio is MUCH
better for CC_P. Now, imagine that you do not have a perfect search model. In
this case, I think, the chances of finding the correct peak are much poorer if
the output is sorted on CC_F rather than on CC_P. I don't know what you other
users of CCP4 think about this, but I would strongly prefer sorting on the
real search function values rather than on something else, to have the best
chance of finding the correct molecular replacement solution.
Unfortunately, CCP4 Amore apparently does not let the user choose which value
the output is sorted on. Thus, my request to the CCP4 developers is to make
the sort key user-selectable, and to make CC_P, not CC_F, the default."
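The signal-to-noise comparison in point (3) can be checked roughly against the ten rows listed in the message. The sketch below is only an illustration: the "contrast" measure (top value divided by the mean of the other nine) is an assumption made here, not a quantity Amore reports, and ten rows pre-sorted on CC_F are a small and biased sample.

```python
from statistics import mean

# Values copied from the top-10 table in the quoted message (rows in CC_F order).
scores = {
    "CC_F": [27.9, 22.2, 18.2, 17.9, 17.9, 17.8, 17.8, 17.7, 17.7, 17.7],
    "CC_I": [42.8, 34.3, 26.3, 25.9, 26.6, 25.8, 25.9, 25.1, 26.0, 25.7],
    "CC_P": [26.8, 16.3, 5.6, 4.9, 4.1, 5.2, 4.4, 4.5, 5.6, 6.4],
}


def contrast(values):
    """Ratio of the highest value to the mean of all other values.

    This is an ad hoc stand-in for signal-to-noise, chosen for illustration.
    """
    top = max(values)
    rest = list(values)
    rest.remove(top)  # drop one occurrence of the top value
    return top / mean(rest)


contrasts = {name: contrast(vals) for name, vals in scores.items()}

for name, value in contrasts.items():
    print(f"{name}: top peak / mean of the rest = {value:.2f}")
```

With these numbers the ratio comes out clearly highest for CC_P, which is consistent with the argument made in the message.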
I received three replies (excerpts in ""):
(1) from Steve Soisson:
"You can always just grep out the SOLUTIONRC lines and sort them using shell
commands:
cat amore.output | grep SOLUTIONRC | sort -r +8 > sort.list"
I think that this only works if the true CC_P peak actually made it into the
CC_F-sorted list. It does not cure the underlying problem of sorting on the
wrong value in the first place.
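For readers who prefer a script to a shell pipeline, the same re-sorting idea might be sketched as follows. This is a minimal sketch under assumptions: it assumes the SOLUTIONRC lines are whitespace-separated and end with the columns CC_F, RF_F, CC_I, CC_P, Icp in that order (as in the table above), so the fields are counted from the end of the line and the presence or absence of TX, TY, TZ does not matter. The function names are hypothetical, and the limitation described above still applies: only peaks already present in the output can be re-ranked.

```python
def parse_solutions(text):
    """Collect SOLUTIONRC lines, reading the scores by position from the line end.

    Assumed trailing columns: CC_F RF_F CC_I CC_P Icp.
    """
    rows = []
    for line in text.splitlines():
        fields = line.split()
        if not fields or fields[0] != "SOLUTIONRC":
            continue  # skip everything that is not a rotation solution line
        rows.append({
            "line": line,
            "cc_f": float(fields[-5]),
            "cc_i": float(fields[-3]),
            "cc_p": float(fields[-2]),
        })
    return rows


def sort_by_cc_p(text):
    """Return the parsed solutions ordered by descending CC_P."""
    return sorted(parse_solutions(text), key=lambda row: row["cc_p"], reverse=True)


# Three rows from the table above, in their original CC_F order.
sample = """\
Some other log line
SOLUTIONRC 1 90.57 80.41 235.17 18.2 58.3 26.3 5.6 3
SOLUTIONRC 1 60.20 85.07 240.47 17.9 58.5 25.9 4.9 4
SOLUTIONRC 1 41.18 58.25 87.29 17.7 58.4 25.7 6.4 10
"""

ranked = sort_by_cc_p(sample)
for row in ranked:
    print(row["line"])
```

On a real log one would read the file contents and pass them to sort_by_cc_p; if the real output uses fixed-width columns that can run together, splitting on whitespace would not be reliable and the parsing would need to be adapted.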
(2) from David Borhani:
"I agree."
(3) from Alexandre Urzhumtsev:
"I agree with you. In my opinion, the advantage of the rot.function is that
it uses the Pattersons and does NOT make a comparison of Fs, which is
useless for unpositioned and especially for incomplete models."
Thus, I would like to repeat my request to the CCP4 staff to give the user the
choice on which value Amore's rotation function output should be sorted.
Best regards,
Dirk.
--
***************************************************************
Dirk Kostrewa
Paul Scherrer Institut E-mail: dirk.kostrewa@psi.ch
Life Sciences, OSRA/007 Phone: +41-56-310-4722
CH-5232 Villigen PSI Fax: +41-56-310-4556
Switzerland Internet: http://www.sb.psi.ch
***************************************************************