Basic Maths for Protein Crystallographers
Phasing

An X-ray experiment allows us to measure all intensities $I(\mathbf{h})$ to some resolution limit. If we knew both $|F(\mathbf{h})|$ and the phase $\varphi(\mathbf{h})$, we could generate a map of the unit cell having peaks at, and only at, each atom position using the Fourier summation

$$\rho(r_x, r_y, r_z) = \frac{1}{V} \sum_{\mathbf{h}} |F(\mathbf{h})|\, e^{2\pi i \varphi(\mathbf{h})}\, e^{-2\pi i (h r_x + k r_y + l r_z)}$$

This (the presence of peaks) is the fundamental property of crystal diffraction which underpins all structure solution methods.
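The Fourier summation can be tried out on a one-dimensional toy cell. The sketch below is only an illustration under stated assumptions: the three atoms, their scattering powers, the resolution limit `h_max` and the function names are all hypothetical, and the complex number `F[h]` stands for $|F(\mathbf{h})|\, e^{2\pi i \varphi(\mathbf{h})}$ (phase expressed in cycles). With both magnitudes and phases available, the map should peak at the atom positions.

```python
import cmath
import math

def structure_factors(atoms, h_max):
    """F(h) = sum_j f_j * exp(2*pi*i*h*x_j) for h = -h_max..h_max (1D toy cell)."""
    return {
        h: sum(f * cmath.exp(2j * math.pi * h * x) for f, x in atoms)
        for h in range(-h_max, h_max + 1)
    }

def density(F, x, volume=1.0):
    """rho(x) = (1/V) * sum_h F(h) * exp(-2*pi*i*h*x)."""
    total = sum(Fh * cmath.exp(-2j * math.pi * h * x) for h, Fh in F.items())
    return total.real / volume

# Hypothetical atoms: (scattering power, fractional coordinate).
atoms = [(6.0, 0.10), (8.0, 0.45), (7.0, 0.80)]
F = structure_factors(atoms, h_max=20)

# Sample the map on a grid of 100 points across the cell.
n = 100
rho = [density(F, i / n) for i in range(n)]

# Grid points that are local maxima and exceed half the highest density.
threshold = 0.5 * max(rho)
peaks = [
    i for i in range(n)
    if rho[i] > threshold and rho[i] > rho[i - 1] and rho[i] > rho[(i + 1) % n]
]
print(peaks)  # grid indices of the strong peaks
```

Replacing the phases in `F` with arbitrary values while keeping the magnitudes would destroy the peaks, which is why the phases have to be recovered.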

But the phases cannot be measured directly and have to be inferred from differences between sets of intensity measurements. The experimental techniques to find them are loosely labelled as MIR, MIRAS, SIR, SIRAS, MAD, SAD:

SIR
single isomorphous replacement. Measurements are taken from a "native" protein and one "derivative" where some additional atoms have been incorporated into the lattice.
MIR
multiple isomorphous replacement. Measurements are taken from a "native" protein and several derivatives.
SIRAS
single isomorphous replacement plus anomalous differences. As for SIR, but the anomalous differences between the measurements of F(h) and F(-h) for the derivative are also used.
MIRAS
multiple isomorphous replacement plus anomalous differences.
SAD
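single-wavelength anomalous dispersion. The "native" crystal contains some atoms which scatter anomalously, and the differences between F(h) and F(-h) are used in a similar way to the SIR treatment.
MAD
multi-wavelength anomalous dispersion. As for SAD, but with measurements taken at several wavelengths.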

We need to consider the structure factor equation in more detail before discussing these.

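$$F(\mathbf{h}) = \sum_{i} g(i,S)\, e^{2\pi i\, \mathbf{h}\cdot\mathbf{x}_i} = \sum_{\text{prot}} g(i,S)\, e^{2\pi i\, \mathbf{h}\cdot\mathbf{x}_i} + \sum_{\text{heavy or anom}} g(j,S)\, e^{2\pi i\, \mathbf{h}\cdot\mathbf{x}_j}$$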

In fact the scattering factor $g(i,S)$ is:

$$g(i,S) = f(i,S) + f'(i) + i f''(i)$$

where $f'$ and $f''$ describe the scattering from inner electron shells, which varies as a function of the wavelength but is more or less constant at all resolutions (i.e. $f''(i,S) = f''(i)$). For many elements (C, N and O in particular) $f'$ and $f''$ are very small at all accessible wavelengths. Others, such as S and Cl, have a small but detectable component at Cu Kα ($f'' \approx 0.5$). Heavier elements such as Se and Br have an observable $f''$ ($f'' \approx 3$ to $4$) at short wavelengths. Metals and other heavy elements such as Hg, Pt and I have quite large $f''$ and $f'$ contributions at most accessible wavelengths (at Cu Kα, $f''_{\mathrm{Hg}} \approx 8$).

It helps to rewrite the $F_H(\mathbf{h})$ or $F_A(\mathbf{h})$ component like this:

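$$\begin{aligned}
F_H(\mathbf{h}) &= \sum_{\text{heavy or anom}} g(j,S)\, e^{2\pi i\, \mathbf{h}\cdot\mathbf{x}_j} \\
&= \sum_{j} \left[ f(j,\mathbf{h}) + f'(j) + i f''(j) \right] e^{2\pi i\, \mathbf{h}\cdot\mathbf{x}_j} \\
&= F_{H,\mathrm{real}}(\mathbf{h})\, e^{i\varphi_H} + F''_{H,\mathrm{imag}}(\mathbf{h})\, e^{i(\varphi_H + 90^\circ)}
\end{aligned}$$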
[Figure: pictorial representation of the structure factor for h with anomalous scattering]

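The anomalous contribution is always 90 degrees in ADVANCE of the real contribution. For a single type of anomalous scatterer, the ratio of the two magnitudes is $|F''_{H,\mathrm{imag}}| / |F_{H,\mathrm{real}}| = f''(j) / \{ f(j,\mathbf{h}) + f'(j) \}$.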

Now

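$$\begin{aligned}
F_H(-\mathbf{h}) &= \sum_{\text{heavy or anom}} g(j,S)\, e^{2\pi i\, (-\mathbf{h}\cdot\mathbf{x}_j)} \\
&= \sum_{j} \left[ f(j,\mathbf{h}) + f'(j) + i f''(j) \right] e^{2\pi i\, (-\mathbf{h}\cdot\mathbf{x}_j)} \\
&= F_{H,\mathrm{real}}(\mathbf{h})\, e^{-i\varphi_H} + F''_{H,\mathrm{imag}}(\mathbf{h})\, e^{i(-\varphi_H + 90^\circ)}
\end{aligned}$$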
[Figure: pictorial representation of the structure factor for -h with anomalous scattering]

which means that, although the magnitudes of $F_H(\mathbf{h})$ and $F_H(-\mathbf{h})$ are equal, their phases are no longer equal and opposite: $F_H(-\mathbf{h})$ is no longer the complex conjugate of $F_H(\mathbf{h})$.

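And since $F_{PH}(\mathbf{h}) = F_H(\mathbf{h}) + F_P(\mathbf{h})$ and $F_{PH}(-\mathbf{h}) = F_H(-\mathbf{h}) + F_P(-\mathbf{h})$, it follows that the magnitudes $|F_{PH}(\mathbf{h})|$ and $|F_{PH}(-\mathbf{h})|$ are no longer equal, and the phase $\varphi_{PH}(\mathbf{h})$ is no longer equal to $-\varphi_{PH}(-\mathbf{h})$.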
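A small numerical sketch may make the breakdown of Friedel's law concrete. Everything here is an assumption made for illustration: a one-dimensional cell, three hypothetical protein atoms with real scattering factors, and one heavy atom whose values of $f$, $f'$ and $f''$ are only roughly Hg-like and are not taken from tables.

```python
import cmath
import math

def structure_factor(atoms, h):
    """Sum of g_j * exp(2*pi*i*h*x_j) over (g, x) pairs in a 1D toy cell."""
    return sum(g * cmath.exp(2j * math.pi * h * x) for g, x in atoms)

# Hypothetical protein atoms with purely real scattering factors.
protein = [(6.0, 0.12), (7.0, 0.37), (8.0, 0.71)]

# Hypothetical heavy atom: g = f + f' + i f'' with illustrative values.
f, f_prime, f_double_prime = 78.0, -5.0, 8.0
heavy = [(complex(f + f_prime, f_double_prime), 0.25)]

h = 3
FH_plus = structure_factor(heavy, h)
FH_minus = structure_factor(heavy, -h)
FPH_plus = structure_factor(protein, h) + FH_plus
FPH_minus = structure_factor(protein, -h) + FH_minus

# |F_H(h)| and |F_H(-h)| agree, but the derivative magnitudes do not.
print(abs(FH_plus), abs(FH_minus))
print(abs(FPH_plus), abs(FPH_minus))
# The derivative phases are no longer equal and opposite either.
print(math.degrees(cmath.phase(FPH_plus)), math.degrees(cmath.phase(FPH_minus)))
```

Setting `f_double_prime` to zero restores equal magnitudes and opposite phases for the pair.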

How does this help phasing?

Answer: In no way unless we can position the heavy (or anomalous) atoms.

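If they are known, the vector $F_H(\mathbf{h})$ can be calculated, and from the three magnitudes $|F_H(\mathbf{h})|$, $|F_P(\mathbf{h})|$ and $|F_{PH}(\mathbf{h})|$ plus the phase of $F_H(\mathbf{h})$, it is easy to show from a phase triangle that $\varphi_P$ must equal $\varphi_H \pm \varphi_{\mathrm{diff}}$, where $\varphi_{\mathrm{diff}}$ is the angle between $F_H(\mathbf{h})$ and $F_P(\mathbf{h})$.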

This is often represented with "phase circles" (or phasing diagrams) or "phase triangles":
[Figure: phase triangles]
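The phase triangle can be written out with the cosine rule. The following is a minimal sketch, not a phasing program: the function name `phase_candidates` and the test values are hypothetical, the magnitudes are assumed to be error-free, and the cosine is clamped only to guard against rounding.

```python
import cmath
import math

def phase_candidates(FP_mag, FPH_mag, FH):
    """Return the two possible protein phases (radians), phi_H +/- phi_diff.

    FP_mag and FPH_mag are measured magnitudes; FH is the complex heavy-atom
    structure factor calculated from the known heavy-atom positions.
    """
    FH_mag, phi_H = abs(FH), cmath.phase(FH)
    # Cosine rule: |FPH|^2 = |FH|^2 + |FP|^2 + 2 |FH| |FP| cos(phi_diff)
    cos_diff = (FPH_mag**2 - FH_mag**2 - FP_mag**2) / (2.0 * FH_mag * FP_mag)
    cos_diff = max(-1.0, min(1.0, cos_diff))  # guard against rounding error
    phi_diff = math.acos(cos_diff)
    return phi_H + phi_diff, phi_H - phi_diff

# Hypothetical reflection: true protein phase 40 degrees, heavy-atom phase 110.
FP_true = cmath.rect(100.0, math.radians(40.0))
FH_calc = cmath.rect(20.0, math.radians(110.0))
FPH_mag = abs(FP_true + FH_calc)

candidates = phase_candidates(abs(FP_true), FPH_mag, FH_calc)
print([round(math.degrees(c), 3) for c in candidates])
```

One of the two candidates is the true phase. A single derivative cannot say which one, which is why additional derivatives or anomalous differences are needed.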

Deviation into Patterson theory

Since there are usually only a few heavy atoms associated with many protein atoms, they can usually be positioned using Pattersons or direct methods. Both these techniques require only an estimate of the magnitude $|F_H(\mathbf{h})|$.

It may be worth summarising here the theory behind difference Pattersons.

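$F_{PH}(\mathbf{h}) = F_H(\mathbf{h}) + F_P(\mathbf{h})$.
The cosine rule gives:

$$|F_{PH}(\mathbf{h})|^2 = |F_H(\mathbf{h})|^2 + |F_P(\mathbf{h})|^2 + 2\, |F_H(\mathbf{h})|\, |F_P(\mathbf{h})| \cos\varphi_{\mathrm{diff}}$$

where $\varphi_{\mathrm{diff}}$ is the phase angle between vector $F_H(\mathbf{h})$ and vector $F_P(\mathbf{h})$. From this we can approximate: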

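$$\begin{aligned}
|F_{PH}(\mathbf{h})| &= \left\{ |F_H(\mathbf{h})|^2 + |F_P(\mathbf{h})|^2 + 2\, |F_H(\mathbf{h})|\, |F_P(\mathbf{h})| \cos\varphi_{\mathrm{diff}} \right\}^{1/2} \\
&= |F_P(\mathbf{h})| \left\{ 1 + 2\, \frac{|F_H(\mathbf{h})|}{|F_P(\mathbf{h})|} \cos\varphi_{\mathrm{diff}} + \left( \frac{|F_H(\mathbf{h})|}{|F_P(\mathbf{h})|} \right)^{2} \right\}^{1/2}
\end{aligned}$$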

The binomial theorem gives $(1+x)^{1/2} \approx 1 + x/2$ when $x$ is small, so

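$$\begin{aligned}
|F_{PH}(\mathbf{h})| &\approx |F_P(\mathbf{h})| \left\{ 1 + \frac{|F_H(\mathbf{h})|}{|F_P(\mathbf{h})|} \cos\varphi_{\mathrm{diff}} + \frac{1}{2} \left( \frac{|F_H(\mathbf{h})|}{|F_P(\mathbf{h})|} \right)^{2} \right\} \\
&= |F_P(\mathbf{h})| + |F_H(\mathbf{h})| \cos\varphi_{\mathrm{diff}} + \frac{1}{2}\, \frac{|F_H(\mathbf{h})|^2}{|F_P(\mathbf{h})|}
\end{aligned}$$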

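So $|F_{PH}(\mathbf{h})| - |F_P(\mathbf{h})| \approx |F_H(\mathbf{h})| \cos\varphi_{\mathrm{diff}}$ plus an even smaller term, provided $|F_H(\mathbf{h})|$ is small compared to $|F_P(\mathbf{h})|$, and a Patterson with coefficients $(|F_{PH}(\mathbf{h})| - |F_P(\mathbf{h})|)^2$ is approximately equivalent to one with coefficients $(|F_H(\mathbf{h})| \cos\varphi_{\mathrm{diff}})^2 = \tfrac{1}{2} |F_H(\mathbf{h})|^2 (1 + \cos 2\varphi_{\mathrm{diff}})$ (remember: $\cos^2 x = (1 + \cos 2x)/2$).
The summation of $\tfrac{1}{2} |F_H(\mathbf{h})|^2$ will give the normal Patterson distribution of vectors between related atoms, while the summation of $\tfrac{1}{2} |F_H(\mathbf{h})|^2 \cos 2\varphi_{\mathrm{diff}}$ will generate only noise.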
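The quality of the approximation is easy to check numerically. This sketch assumes hypothetical magnitudes with $|F_H|$ one tenth of $|F_P|$ and compares the exact isomorphous difference from the cosine rule with $|F_H| \cos\varphi_{\mathrm{diff}}$ over a range of angles.

```python
import math

def exact_difference(FP_mag, FH_mag, phi_diff):
    """|FPH| - |FP| computed exactly from the cosine rule."""
    FPH_mag = math.sqrt(
        FP_mag**2 + FH_mag**2 + 2.0 * FP_mag * FH_mag * math.cos(phi_diff)
    )
    return FPH_mag - FP_mag

# Hypothetical magnitudes: the heavy-atom contribution is 10% of the protein's.
FP_mag, FH_mag = 100.0, 10.0

errors = []
for degrees in range(0, 360, 15):
    phi_diff = math.radians(degrees)
    approx = FH_mag * math.cos(phi_diff)
    errors.append(exact_difference(FP_mag, FH_mag, phi_diff) - approx)

# The neglected term never exceeds |FH|^2 / (2 |FP|), here 0.5.
print(max(errors), 0.5 * FH_mag**2 / FP_mag)
```

The error is largest when the two vectors are perpendicular and vanishes when they are parallel or antiparallel.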

Deviation into Difference Fourier theory

Similar equations explain why a Fourier summation gives full weight peaks at the atomic positions which have been included in the phasing, and peaks at about half the expected height for atoms excluded from the phasing. Say $F_{PH}(\mathbf{h}) = F_P(\mathbf{h}) + F_H(\mathbf{h})$ where $F_H$ is much smaller than $F_P$; i.e. only a few atoms are excluded from the phasing. Then, as above,

$$|F_{PH}| \approx |F_P| + |F_H| \cos(\varphi_P - \varphi_H) + \text{small terms}$$

The Fourier summation

$$\sum |F_{PH}|\, e^{i\varphi_P} = \sum |F_P|\, e^{i\varphi_P} + \sum |F_H| \cos(\varphi_P - \varphi_H)\, e^{i\varphi_P}$$

Since $\cos x = (e^{ix} + e^{-ix})/2$,

$$\cos(\varphi_P - \varphi_H)\, e^{i\varphi_P} = \tfrac{1}{2} \left( e^{i\varphi_H} + e^{i(2\varphi_P - \varphi_H)} \right)$$

and the second term becomes

$$\sum |F_H|\, \tfrac{1}{2} \left( e^{i\varphi_H} + e^{i(2\varphi_P - \varphi_H)} \right)$$

giving the Fourier map for the atoms contributing to $F_H$ at half weight, plus noise, since the phase $2\varphi_P - \varphi_H$ is not related to these atoms at all.
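The split into a half-weight term and a noise term can be confirmed for a single reflection. This is only a check of the algebra with hypothetical values for the known part $F_P$ and the missing part $F_H$; it does not compute a map.

```python
import cmath
import math

# Hypothetical structure factors for one reflection.
FP = cmath.rect(100.0, math.radians(40.0))   # atoms included in the phasing
FH = cmath.rect(10.0, math.radians(110.0))   # atoms excluded from the phasing
phi_P, phi_H = cmath.phase(FP), cmath.phase(FH)

# Term entering the summation: |F_H| cos(phi_P - phi_H) exp(i phi_P)
map_term = abs(FH) * math.cos(phi_P - phi_H) * cmath.exp(1j * phi_P)

# Half of the true F_H, which builds the missing atoms at half weight ...
half_weight = 0.5 * FH
# ... plus a term whose phase 2 phi_P - phi_H is unrelated to those atoms.
noise = 0.5 * abs(FH) * cmath.exp(1j * (2.0 * phi_P - phi_H))

print(map_term, half_weight + noise)
```

Summed over many reflections, the `half_weight` terms add up coherently at the missing atom positions, while the `noise` terms do not.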




