
Re: [ccp4bb]: structure factors etc




Dear BR:

> Structure factors are the primary experimental data,
> and it appears quite odd to me that crystallography
> enjoys the unusual privilege of simply suppressing
> those


Well, we both agree that structure factors should be deposited, but all 
you need to do is look at the average enzyme kinetics paper (for 
example) to see that what is reported is not the primary data but 
rather a kinetics plot, which is probably no more closely related to 
the primary data than an electron density map is to a set of structure 
factors.  This is too bad, because many of these are Lineweaver-Burk 
plots or some other linearized plot that is more error-prone than, say, 
a maximum-likelihood fit or a Cornish-Bowden direct linear plot.  There 
are some data sets I wouldn't mind replotting.  I think depositing an 
MIR map in an agreed format, or a complete experimental phase set with 
native and derivative Fobs, is probably much more useful than just 
depositing native Fobs with only model phases that are heavily biased 
by the refinement, but apart from a web page I made for my students, I 
haven't made those public myself.  For now at least I am willing to put 
up with a system where scientists are permitted to use their best 
judgement, and probably the best way to get people to cooperate is to 
pressure the journals to make depositing the maps, Fobs, etc. a 
precondition of publication.
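
To illustrate why a double-reciprocal plot can mislead, here is a 
minimal Python sketch that estimates Km and Vmax from the same data in 
two ways: an unweighted least-squares line through (1/[S], 1/v), as in 
a Lineweaver-Burk plot, and a simplified version of the Eisenthal and 
Cornish-Bowden direct linear plot, which takes medians over pairwise 
line intersections.  The function names and the sample numbers are made 
up for illustration, and the special handling of negative intersections 
in the published direct linear method is left out.

```python
import itertools
import statistics


def lineweaver_burk(s, v):
    """Unweighted least-squares line through (1/s, 1/v); returns (Vmax, Km)."""
    x = [1.0 / si for si in s]
    y = [1.0 / vi for vi in v]
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    slope = sxy / sxx              # slope = Km / Vmax
    intercept = my - slope * mx    # intercept = 1 / Vmax
    vmax = 1.0 / intercept
    return vmax, slope * vmax


def direct_linear(s, v):
    """Simplified direct linear plot; returns (Vmax, Km).

    Each observation defines the line Vmax = v + (v / s) * Km in
    (Km, Vmax) space.  The estimates are the medians of the pairwise
    intersections.  Assumption: negative intersections are kept as-is.
    """
    kms, vmaxs = [], []
    for (s1, v1), (s2, v2) in itertools.combinations(zip(s, v), 2):
        denom = v1 / s1 - v2 / s2
        if denom == 0:
            continue  # parallel lines never intersect
        km = (v2 - v1) / denom
        kms.append(km)
        vmaxs.append(v1 + (v1 / s1) * km)
    return statistics.median(vmaxs), statistics.median(kms)


# Hypothetical data: exact Michaelis-Menten rates with Vmax = 10, Km = 2,
# except that the lowest-substrate rate is measured 30% too low.
substrate = [0.5, 1.0, 2.0, 4.0, 8.0]
rates = [10.0 * si / (2.0 + si) for si in substrate]
rates[0] *= 0.7

print("Lineweaver-Burk (Vmax, Km):", lineweaver_burk(substrate, rates))
print("Direct linear   (Vmax, Km):", direct_linear(substrate, rates))
```

The reciprocal transform gives the low-rate point the largest 1/v value 
and the most leverage on the fitted line, so one bad measurement at low 
substrate concentration can move both estimates a long way, while the 
medians are much less sensitive to it.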

Molecular biologists are generally expected to make freely available 
the clones of genes they have published, not only for inspection, but 
also for others to learn from and build upon.  Why should it be any 
different with source code?  Richard Stallman, one of the principal 
authors of the GNU C compiler, Emacs, etc., and the founder of the GNU 
Project and the Free Software Foundation, from which we all derive 
enormous benefit, makes a pretty strong case for such restrictions 
being a "crime against humanity."  http://www.gnu.org/philosophy/

In my case, where I am just learning a few things about programming, it 
is enormously helpful to see how various tasks are implemented.  Others 
with more aptitude could well make major advances.  CCP4 and CNS both 
have restrictions that would probably put Stallman into an autistic 
rage, but with source code available, we at least have the luxury of 
looking to see how things are done, possibly correcting mistakes, and 
porting to "unsupported" platforms, and there exists the opportunity to 
make improvements and submit those.

> I think there is a conceptual difference of some significance

Sure, it isn't the same as making data available, but my point wasn't 
that the two are identical problems, but rather that the arguments for 
and against open source are much the same as those for and against 
open access to structure factors.

> That the tendency of structure factor absence
> correlates with poor quality is indeed a concerning fact (ibid.)

I assume you are referring to the figure in the paper that shows a 
linear correlation between Rfree and % of structures where Fobs were 
deposited.

Let's also remember that (a) R-factor correlates rather strongly with 
resolution, so that a similar plot might reveal that those with 
higher-resolution data were more likely to supply structure factors, 
and (b) there are a number of people, like myself, who follow the 
advice of the authors of that and previous papers to the letter: we 
tend to idealize geometry at the expense of R-factor, and we separate 
our test set from the working data set immediately rather than late in 
refinement, which gives a higher absolute value of Rfree.  We would 
like to think our structures are no worse for doing things in the 
recommended way.
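
For readers less familiar with the statistic, here is a minimal sketch 
of how R and Rfree are computed when the test set is flagged before any 
refinement is done.  It uses the conventional definition 
R = sum(||Fobs| - |Fcalc||) / sum(|Fobs|).  The function names, the 5% 
test fraction and the toy amplitudes are illustrative assumptions, and 
Fcalc is assumed to be on the same scale as Fobs already.

```python
import random


def r_factor(f_obs, f_calc):
    """R = sum(| |Fobs| - |Fcalc| |) / sum(|Fobs|), no scale factor applied."""
    num = sum(abs(abs(o) - abs(c)) for o, c in zip(f_obs, f_calc))
    den = sum(abs(o) for o in f_obs)
    return num / den


def split_test_set(n_reflections, fraction=0.05, seed=0):
    """Pick the indices of the test reflections once, before refinement.

    The 5% default is an assumed, illustrative value.
    """
    rng = random.Random(seed)
    indices = list(range(n_reflections))
    rng.shuffle(indices)
    n_test = max(1, round(n_reflections * fraction))
    return set(indices[:n_test])


def r_work_and_free(f_obs, f_calc, test):
    """Return (Rwork, Rfree) given the set of test reflection indices."""
    work_pairs = [(o, c) for i, (o, c) in enumerate(zip(f_obs, f_calc))
                  if i not in test]
    free_pairs = [(o, c) for i, (o, c) in enumerate(zip(f_obs, f_calc))
                  if i in test]
    r_work = r_factor(*zip(*work_pairs))
    r_free = r_factor(*zip(*free_pairs))
    return r_work, r_free


# Toy amplitudes, purely for illustration.
f_obs = [100.0, 50.0, 25.0, 10.0]
f_calc = [90.0, 55.0, 25.0, 10.0]
test = split_test_set(len(f_obs))
print("test reflections:", sorted(test))
print("Rwork, Rfree:", r_work_and_free(f_obs, f_calc, test))
```

Because the test reflections are chosen before the model has seen any 
data, they are never fitted during refinement, which is why Rfree 
computed this way tends to come out higher than when the split is made 
late.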

So it isn't clear to me that lack of structure factor deposition 
strongly correlates with poor quality.

In the cases of at least two of my structures where I deposited Fobs, I 
can't find them in the database although I have written in and been 
assured they are there.  I wonder if the higher-resolution datasets are 
being given higher priority or something.

>
> Source code is a different game. First, you don't find it in
> Science or Nature, and second, as long as a program
> verifiably does what it claims to do, I do not care
> for the source.

There are many who would say the same thing about a PDB file -- as long 
as it is consistent with the mutagenesis data and other biochemical 
results, who cares what it looks like in reciprocal space, apart from a 
few specialists?  That is not a compelling argument against supplying 
Fobs (the point is that you should have the freedom to check and use 
them if you want to), and for the same reason it isn't a compelling 
argument for not making source code available.

>
> With absent SFs you got nothing except circumstantial evidence
> to prove that the product is defective (say 1jky, for example).

I'm not aware of the story behind this, but presumably there is 
something wrong with the structure?  If deducing that is dependent upon 
having access to the Fobs, how is it then that we know there is a 
problem?  My experience with proprietary software like Micro$oft's 
stuff is that it crashes, and if you complain, they tell you the 
problem can't possibly be due to their code and they try to blame the 
victim.  It seems like these two situations might have a lot in 
common...

> And in tragic contrast to death row cases, the jury there tends
> to side with the suspect.
>

Free software.  Free R.  Free Mumia.

All the best,

Bill