Re: [ccp4bb]: structure factors etc
Dear BR:
> Structure factors are the primary experimental data,
> and it appears quite odd to me that crystallography
> enjoys the unusual privilege of simply suppressing
> those
Well we both agree that structure factors should be deposited, but all
you need to do is look at the average enzyme kinetics paper (for
example) to see that what is reported is not the primary data, but
rather a kinetics plot, (which is probably no more closely related to
the primary data than is an electron density map to a set of structure
factors). This is too bad, because many of these are Lineweaver-Burk
plots or some other plot that is more error-prone than, say, a
maximum-likelihood fit or a Cornish-Bowden plot. There are some
data-sets I wouldn't mind replotting. I think depositing an MIR map in
an agreed format or a complete experimental phase set with native and
derivative Fobs is probably much more useful than just depositing
native Fobs with only model phases that are heavily biased by the
refinement, but apart from a web page I made for my students, I haven't
made those public myself. For now at least I am willing to put up with
a system where scientists are permitted to use their best judgement,
and probably the best way to get people to cooperate is to pressure the
journals to make depositing the maps, Fobs, etc. a precondition of
publication.
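
To make the Lineweaver-Burk point concrete, here is a minimal Python
sketch. Everything in it is my own illustration: the substrate
concentrations and rates are invented, the function names are mine, and
the only "error" is a single low reading at the lowest substrate
concentration. It compares an unweighted double-reciprocal fit with the
direct linear plot of Eisenthal and Cornish-Bowden, which takes the
median of the pairwise line intersections.

```python
from itertools import combinations
from statistics import median


def michaelis_menten(s, km, vmax):
    """Rate predicted by the Michaelis-Menten equation."""
    return vmax * s / (km + s)


def lineweaver_burk(s_vals, v_vals):
    """Unweighted least-squares line through (1/s, 1/v); returns (Km, Vmax)."""
    x = [1.0 / s for s in s_vals]
    y = [1.0 / v for v in v_vals]
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    slope = sxy / sxx                    # slope is Km / Vmax
    intercept = y_mean - slope * x_mean  # intercept is 1 / Vmax
    vmax = 1.0 / intercept
    return slope * vmax, vmax


def direct_linear(s_vals, v_vals):
    """Direct linear plot; returns the medians of the pairwise (Km, Vmax)."""
    km_estimates, vmax_estimates = [], []
    # Each observation defines the line Vmax = v + (v / s) * Km.
    for (s1, v1), (s2, v2) in combinations(zip(s_vals, v_vals), 2):
        denom = v1 / s1 - v2 / s2
        if denom == 0:
            continue  # parallel lines never intersect
        km = (v2 - v1) / denom
        km_estimates.append(km)
        vmax_estimates.append(v1 + (v1 / s1) * km)
    return median(km_estimates), median(vmax_estimates)


# Invented data: true Km = 2, Vmax = 10, with one bad low-substrate point.
TRUE_KM, TRUE_VMAX = 2.0, 10.0
substrate = [0.5, 1.0, 2.0, 4.0, 8.0, 16.0]
rates = [michaelis_menten(s, TRUE_KM, TRUE_VMAX) for s in substrate]
rates[0] -= 0.4  # the single perturbed measurement

print("Lineweaver-Burk (Km, Vmax):", lineweaver_burk(substrate, rates))
print("Direct linear   (Km, Vmax):", direct_linear(substrate, rates))
```

With these made-up numbers the double-reciprocal fit overestimates Km
by more than half a unit, because taking 1/v inflates the error on the
smallest rate, while the median-based estimate stays at the true value.
Real data with noise on every point will not behave so cleanly.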
Molecular biologists are generally supposed to make clones of genes
they have published freely available, not only for inspection, but also
for others to learn from and build upon. Why should it be any
different with source code? Richard Stallman, one of the principal
authors of the GNU C compiler, Emacs, etc., and the founder of the GNU
project and the Free Software Foundation, from which we all derive
enormous benefit, makes a pretty strong case for such restrictions
being a "crime against humanity." http://www.gnu.org/philosophy/
In my case, where I am just learning a few things about programming, it
is enormously helpful to see how various tasks are implemented. Others
with more aptitude could well make major advances. CCP4 and CNS both
have restrictions that would probably put Stallman into an autistic
rage, but with source code available, we at least have the luxury of
looking to see how things are done, possibly correcting mistakes,
porting to "unsupported" platforms, and making improvements and
submitting those.
> I think there is a conceptual difference of some significance
Sure, it isn't the same as making data available, but my point wasn't
that the two are identical problems, but rather that similar arguments
for and against open source can be made about open access to
structure factors.
> That the tendency of structure factor absence
> correlates with poor quality is indeed a concerning fact (ibid.)
I assume you are referring to the figure in the paper that shows a
linear correlation between Rfree and % of structures where Fobs were
deposited.
Let's also remember that (a) R-factor correlates rather strongly with
resolution, so a similar plot might reveal that those with higher
resolution data were more likely to supply structure factors, and (b)
there are a number of people, like myself, who follow the advice of the
authors of that and previous papers to the letter. We tend to idealize
geometry at the expense of R-factor, and we separate our test from
working data sets immediately rather than late in refinement, which
gives a higher absolute value of Rfree. We would like to think our
structures are no worse for doing things in the recommended way.
So it isn't clear to me that lack of structure factor deposition
strongly correlates with poor quality.
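
For anyone who hasn't computed these by hand, here is a minimal sketch
of the R-factor and Rfree arithmetic. It assumes Fcalc is already on
the same scale as Fobs (real refinement programs fit a scale factor,
which I leave out), and the amplitudes, test-set flags, and function
names are all invented for illustration.

```python
def r_factor(f_obs, f_calc):
    """Conventional R = sum(| |Fobs| - |Fcalc| |) / sum(|Fobs|)."""
    numerator = sum(abs(abs(o) - abs(c)) for o, c in zip(f_obs, f_calc))
    denominator = sum(abs(o) for o in f_obs)
    return numerator / denominator


def r_work_and_free(f_obs, f_calc, is_test):
    """Split reflections by the test-set flag and return (R_work, R_free)."""
    work_obs = [o for o, t in zip(f_obs, is_test) if not t]
    work_calc = [c for c, t in zip(f_calc, is_test) if not t]
    free_obs = [o for o, t in zip(f_obs, is_test) if t]
    free_calc = [c for c, t in zip(f_calc, is_test) if t]
    return r_factor(work_obs, work_calc), r_factor(free_obs, free_calc)


# Invented amplitudes; the third reflection is held out as the test set.
f_obs = [100.0, 200.0, 50.0, 80.0]
f_calc = [90.0, 210.0, 60.0, 80.0]
is_test = [False, False, True, False]

r_work, r_free = r_work_and_free(f_obs, f_calc, is_test)
print(f"R_work = {r_work:.3f}, R_free = {r_free:.3f}")
```

The flags are fixed before anything else is computed, which is the
point of setting the test set aside at the start: those reflections
never contribute to the working R.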
In the cases of at least two of my structures where I deposited Fobs, I
can't find them in the database although I have written in and been
assured they are there. I wonder if the higher-resolution datasets are
being given higher priority or something.
>
> Source code is a different game. First, you don't find it in
> Science or Nature, and second, as long as a program
> verifiably does what it claims to do, I do not care
> for the source.
There are many who would say the same thing about a PDB file -- as long
as it is consistent with the mutagenesis data and other biochemical
results, who cares what it looks like in reciprocal space apart from a
few specialists? That is not a compelling argument against supplying
Fobs (the point is that you should have the freedom to check and use
them if you want to), and for the same reason it isn't a compelling
argument for not making source code available.
>
> With absent SFs you got nothing except circumstantial evidence
> to prove that the product is defective (say 1jky, for example).
I'm not aware of the story behind this, but presumably there is
something wrong with the structure? If deducing that is dependent upon
having access to the Fobs, how is it then that we know there is a
problem? My experience with proprietary software like Micro$oft's
stuff is that it crashes, and if you complain, they tell you the
problem can't possibly be due to their code and they try to blame the
victim. It seems like these two situations might have a lot in
common...
> And in tragic contrast to death row cases, the jury there tends
> to side with the suspect.
>
Free software. Free R. Free Mumia.
All the best,
Bill