Re: [ccp4bb]: Reducing the size of a FreeR-set



***  For details on how to be removed from this list visit the  ***
***          CCP4 home page http://www.ccp4.ac.uk         ***


I don't know if there is a standard way, but you can use the random number
generator in SFTOOLS to randomly select a subset of your original Rfree
reflections. See the CALC HELP and SELECT HELP commands for more details, but
it should work something like this:

read original.mtz
calc col random = RAN_U
select col random < 0.5
calc col rfree = 1
select all
delete col random
write modified.mtz

read original.mtz
Read in the original MTZ file

calc col random = RAN_U
Generates a column labeled "random" with uniformly distributed values
between 0 and 1 (RAN_G would give a Gaussian distribution instead).

select col random < 0.5
Select roughly half of all reflections (those with a random value below 0.5)

calc col rfree = 1
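This sets the rfree flag to 1 for the selected half of all reflections, and
therefore also for about half of your old rfree set, turning those into
"working set" reflections (this assumes the free set is flagged 0, as in the
usual CCP4 convention)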

select all
Reselect all reflections so that they will all be written out

delete col random
Remove the temporary column

write modified.mtz
Create a new MTZ file with the modified rfree set
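
To make the logic of the steps above concrete, here is a minimal Python
sketch of the same idea. It is not SFTOOLS and does not read MTZ files; it
only mimics the select/calc sequence on a plain list of flags. The function
name halve_free_set, the FREE/WORK constants and the seed parameter are
made up for illustration, and it assumes that free-set reflections carry
flag 0 and working-set reflections carry flag 1.

```python
import random

FREE = 0  # assumed convention: flag 0 marks the free (test) set
WORK = 1  # assumed convention: flag 1 marks the working set


def halve_free_set(flags, threshold=0.5, seed=None):
    """Return a new list of flags in which every reflection whose random
    number falls below `threshold` is forced into the working set."""
    rng = random.Random(seed)
    new_flags = []
    for flag in flags:
        # Mimics "calc col random = RAN_U": one uniform value per reflection
        r = rng.random()
        if r < threshold:
            # Mimics "select col random < 0.5" followed by "calc col rfree = 1"
            new_flags.append(WORK)
        else:
            # Unselected reflections keep whatever flag they already had
            new_flags.append(flag)
    return new_flags


if __name__ == "__main__":
    # Toy data set: 10 % free reflections, 90 % working reflections
    original = [FREE] * 1000 + [WORK] * 9000
    modified = halve_free_set(original, seed=1)
    print("free before:", original.count(FREE))
    print("free after: ", modified.count(FREE))
```

Because every reflection is drawn independently, the new free set is only
approximately half the size of the old one, not exactly half. Reflections
that were already in the working set are never moved into the free set, so
the new free set is always a subset of the old one.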


Maybe this needs a bit of a programmer's background, but by combining the
select and calc options you can do some amazingly flexible things in SFTOOLS.

Bart


On Sun, 15 Jul 2001, Wulf Blankenfeldt wrote:

> Dear all,
>
> I'm trying to reduce the FreeR-set of one of my files from 10 % to 5 %.
>
> To be politically correct and keep half of this original set for further
> refinement my first idea was to generate a second FreeR-column with
> 'uniqueify -p 0.5'. I thought that this should select half of my original
> FreeR-reflections as double-flagged. A few conversion, grep and awk steps
> should give me the desired new FreeR-set, I thought.
>
> While I now have a script that is technically doing the steps described
> above my initial assumption of getting half of the reflections
> double-flagged turned out to be wrong. Of the original say 15000 FreeR
> data, approx. 14300 got the double flag (0.00       0.00) and only 700
> don't (0.00       1.00). I am obviously missing something
> distribution-related here I think.
>
> Does anybody know how to split a FreeR set? Is there a standard procedure
> in CCP4 that I have overlooked?
>
>
>
> Thanks in advance,
>
>
>
>
> Wulf
>
>
> -----------------
> Dr. Wulf Blankenfeldt
>
> Structural Biology Group
> Biomolecular Sciences Building
> University of St. Andrews
>
> North Haugh
> St. Andrews, Fife
> KY16 9ST
> UK
>
> Tel:    +44 (0) 13 34 - 46 72 82
> Fax:    +44 (0) 13 34 - 46 25 95
> e-mail: wb6@st-andrews.ac.uk
> -----------------
>

===============================================================================

Dept. of Medical Microbiology & Immunology
University of Alberta
1-15 Medical Sciences Building
Edmonton, Alberta, T6G 2H7, Canada
phone:	1-780-492-0042
fax:	1-780-492-7521

===============================================================================