X-ray crystallography determines the three-dimensional atomic structure of proteins and other macromolecules by measuring the angles and intensities of X-rays diffracted by a crystal, then computing electron density maps from which atomic models are built.
Principle
When X-rays strike a crystal, they are scattered by the electrons of its atoms. Because the crystal lattice is periodic, the scattered waves interfere constructively (diffract) only at specific angles given by Bragg’s law, nλ = 2d sin θ, where d is the spacing between lattice planes, θ is the angle between the incident beam and those planes, λ is the X-ray wavelength, and n is an integer. The positions and intensities of the diffraction spots, combined with phase information, allow reconstruction of the electron density distribution. This density map is then interpreted by fitting an atomic model, which is refined against the observed data.
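As a rough illustration of Bragg's law (n·λ = 2·d·sin θ), the sketch below converts between lattice-plane spacing and diffraction angle. The function names and the example values (a 1.0 Å wavelength and a 2.0 Å spacing) are assumptions chosen for illustration, not values from any particular experiment.

```python
import math

def bragg_angle_deg(d, wavelength, order=1):
    """Bragg angle theta in degrees for plane spacing d (same length unit as wavelength)."""
    s = order * wavelength / (2.0 * d)
    if not 0.0 < s <= 1.0:
        # sin(theta) cannot exceed 1, so spacings below wavelength/2 never diffract
        raise ValueError("n*lambda/(2d) must lie in (0, 1]")
    return math.degrees(math.asin(s))

def spacing_from_angle(theta_deg, wavelength, order=1):
    """Plane spacing d that diffracts at Bragg angle theta_deg."""
    return order * wavelength / (2.0 * math.sin(math.radians(theta_deg)))

# Hypothetical example: 1.0 A X-rays and planes 2.0 A apart
theta = bragg_angle_deg(2.0, 1.0)
print(f"theta = {theta:.2f} deg, scattering angle 2*theta = {2 * theta:.2f} deg")
```

Smaller spacings diffract at larger angles, which is why the outermost spots on a detector carry the finest structural detail.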
Protein Crystallization
Diffraction requires crystals with regular three-dimensional order. Protein is purified to >95% homogeneity and concentrated to 5–20 mg/mL. Crystallization screens test hundreds of conditions varying precipitant (PEG, ammonium sulfate), pH, buffer, salt, and additives. Sitting-drop and hanging-drop vapor diffusion are the standard methods. A drop containing a mixture of protein and reservoir solution equilibrates through the vapor phase against a much larger volume of reservoir solution; water leaves the drop, concentrating both protein and precipitant and promoting nucleation. Crystals suitable for diffraction grow over days to weeks and are harvested with cryo-loops.
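The concentrating effect of vapor diffusion can be estimated with a simple idealized calculation. The sketch below assumes that the precipitant is non-volatile and enters the drop only with the reservoir solution, that the reservoir is large enough to keep its concentration constant, and that the protein buffer contributes negligibly to vapor pressure. The function name and the example numbers are hypothetical.

```python
def equilibrate_drop(protein_stock, protein_vol, reservoir_vol, precipitant_reservoir):
    """Idealized start and end state of a vapor-diffusion drop.

    protein_stock: protein stock concentration (mg/mL)
    protein_vol, reservoir_vol: volumes mixed to form the drop (uL)
    precipitant_reservoir: precipitant concentration in the reservoir (any unit)
    """
    start_vol = protein_vol + reservoir_vol
    # Mixing dilutes both components
    start_precipitant = precipitant_reservoir * reservoir_vol / start_vol
    start_protein = protein_stock * protein_vol / start_vol
    # Water leaves the drop until its precipitant concentration matches the reservoir
    shrink = start_precipitant / precipitant_reservoir
    return {
        "start_protein": start_protein,
        "start_precipitant": start_precipitant,
        "final_volume": start_vol * shrink,
        "final_protein": start_protein / shrink,
    }

# Hypothetical 1 uL + 1 uL drop: 10 mg/mL protein against 20% (w/v) PEG
result = equilibrate_drop(10.0, 1.0, 1.0, 20.0)
print(result)
```

Under these assumptions a 1:1 drop shrinks to about half its starting volume, so the protein concentration roughly doubles from its diluted starting value while the precipitant rises to the reservoir concentration.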
Data Collection
Crystals are flash-cooled in liquid nitrogen and held at about 100 K in a cold nitrogen gas stream during data collection to reduce radiation damage. Diffraction data are most often collected at synchrotron beamlines, which provide intense, tunable X-ray beams. A complete dataset comprises hundreds to thousands of diffraction images, each recorded while the crystal rotates through a small angular range. Data processing with XDS, iMosflm, or DIALS indexes the reflections, integrates their intensities, and scales the measurements. The resolution limit is defined by the highest-angle reflections with measurable intensity; 2.0–3.0 Å is typical for protein structures.
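The link between diffraction angle and resolution can be made concrete with a small geometric sketch. It assumes a flat detector perpendicular to the beam, with the direct beam at the detector center; under that assumption a spot at radius r and crystal-to-detector distance D has scattering angle 2θ = atan(r/D), and Bragg's law gives the corresponding spacing. The function names and example numbers are hypothetical.

```python
import math

def resolution_at_radius(radius_mm, distance_mm, wavelength_A):
    """Resolution d (in angstroms) of a spot at a given radius on a flat detector."""
    two_theta = math.atan2(radius_mm, distance_mm)
    return wavelength_A / (2.0 * math.sin(two_theta / 2.0))

def images_needed(total_rotation_deg, oscillation_deg):
    """Number of images needed to cover a rotation range at a fixed angle per image."""
    # round() guards against floating-point noise before taking the ceiling
    return math.ceil(round(total_rotation_deg / oscillation_deg, 9))

# Hypothetical setup: 1.0 A X-rays, detector edge 150 mm from the beam center,
# crystal-to-detector distance 300 mm
d_edge = resolution_at_radius(150.0, 300.0, 1.0)
print(f"resolution at detector edge: {d_edge:.2f} A")
print(f"images for 180 deg at 0.1 deg each: {images_needed(180.0, 0.1)}")
```

Moving the detector closer or shortening the wavelength brings higher-resolution reflections onto the detector, at the cost of crowding the spots together.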
Phase Problem
Measured intensities yield structure-factor amplitudes (each amplitude is proportional to the square root of the intensity) but not phases, which are essential for the Fourier transform that reconstructs electron density. Molecular replacement obtains initial phases by orienting and positioning a homologous structure model in the crystallographic unit cell and calculating its predicted diffraction. When no good search model exists, experimental phasing uses heavy-atom derivatives (MIR/MIRAS) or anomalous scattering from selenium incorporated as selenomethionine (MAD/SAD). Modern pipelines automate most of this process.
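A one-dimensional toy model shows why amplitudes alone are not enough. The sketch below computes structure factors F(h) = Σ f·exp(2πi·h·x) for two hypothetical point atoms, then performs the Fourier synthesis once with the true phases and once with every phase set to zero. The atom positions, weights, and function names are assumptions for illustration; a real crystal is three-dimensional and the density here is on an arbitrary scale.

```python
import cmath
import math

def structure_factors(atoms, h_max):
    """F(h) for point atoms given as (fractional position x, scattering weight f)."""
    return {
        h: sum(f * cmath.exp(2j * math.pi * h * x) for x, f in atoms)
        for h in range(-h_max, h_max + 1)
    }

def density(factors, n_grid):
    """Fourier synthesis rho(x) = sum_h F(h) exp(-2*pi*i*h*x) on n_grid points."""
    rho = []
    for k in range(n_grid):
        x = k / n_grid
        total = sum(F * cmath.exp(-2j * math.pi * h * x) for h, F in factors.items())
        rho.append(total.real)
    return rho

# Hypothetical 1D "crystal": a light atom at x = 0.20 and a heavy atom at x = 0.55
atoms = [(0.20, 6.0), (0.55, 16.0)]
F_true = structure_factors(atoms, h_max=10)
# Keep the amplitudes |F| but discard the phases, as a diffraction experiment does
F_no_phase = {h: complex(abs(F), 0.0) for h, F in F_true.items()}

rho_true = density(F_true, 100)
rho_no_phase = density(F_no_phase, 100)
peak_true = rho_true.index(max(rho_true)) / 100
peak_no_phase = rho_no_phase.index(max(rho_no_phase)) / 100
print(f"strongest peak with true phases: x = {peak_true}")
print(f"strongest peak with phases set to zero: x = {peak_no_phase}")
```

With the true phases the strongest peak sits on the heavy atom; with the phases discarded the strongest peak moves to the origin, so the map no longer shows where the atoms are.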
Model Building and Refinement
The initial electron density map is interpreted in Coot or similar software, where the polypeptide chain is traced and side chains are placed. The model then undergoes iterative cycles of manual adjustment and computational refinement with phenix.refine or REFMAC5. Refinement adjusts the model to minimize the difference between calculated and observed structure factors. Quality is monitored by Rwork, computed from the reflections used in refinement, and Rfree, computed from a small test set of reflections excluded from refinement so that overfitting can be detected. As a rough guide, a reliable structure at about 2.5 Å resolution has Rfree below about 0.25.
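The R-factors can be written as R = Σ| |Fobs| − |Fcalc| | / Σ|Fobs|, with Rwork summed over the reflections used in refinement and Rfree over a test set held out from it. The sketch below is a minimal illustration that assumes observed and calculated amplitudes are already on the same scale (refinement programs determine a scale factor first). The function names and the five-reflection dataset are hypothetical.

```python
def r_factor(f_obs, f_calc):
    """R = sum(| |Fobs| - |Fcalc| |) / sum(|Fobs|) over the given reflections."""
    numerator = sum(abs(abs(o) - abs(c)) for o, c in zip(f_obs, f_calc))
    denominator = sum(abs(o) for o in f_obs)
    return numerator / denominator

def r_work_free(f_obs, f_calc, free_flags):
    """Split reflections by free flag and return (Rwork, Rfree)."""
    work = [(o, c) for o, c, free in zip(f_obs, f_calc, free_flags) if not free]
    test = [(o, c) for o, c, free in zip(f_obs, f_calc, free_flags) if free]
    if not work or not test:
        raise ValueError("both the working set and the test set must be non-empty")
    return r_factor(*zip(*work)), r_factor(*zip(*test))

# Hypothetical amplitudes; the last reflection belongs to the test set
f_obs = [100.0, 80.0, 60.0, 40.0, 20.0]
f_calc = [90.0, 85.0, 66.0, 38.0, 25.0]
free_flags = [False, False, False, False, True]

r_work, r_free = r_work_free(f_obs, f_calc, free_flags)
print(f"Rwork = {r_work:.3f}, Rfree = {r_free:.3f}")
```

A large gap between Rwork and Rfree suggests the model has been fitted to noise in the working set rather than to genuine signal.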
Validation
MolProbity checks backbone geometry, side-chain rotamers, all-atom clashes (the clashscore), and Ramachandran statistics. The local fit between the model and the experimental electron density is assessed by the real-space R-factor or the real-space correlation coefficient. The PDB validation report, generated for all depositions, provides a standardized quality summary. Recommended criteria: >90% Ramachandran favored, <0.3% Ramachandran outliers, <1% rotamer outliers.
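The recommended criteria listed above can be encoded as a simple automated check. The sketch below is illustrative only: the dictionary keys and function name are hypothetical and do not correspond to the output format of MolProbity or the PDB validation report, and the thresholds are simply those quoted in the text.

```python
# Thresholds taken from the criteria quoted in the text; keys are hypothetical names
CRITERIA = {
    "ramachandran_favored_pct": (">", 90.0),
    "ramachandran_outliers_pct": ("<", 0.3),
    "rotamer_outliers_pct": ("<", 1.0),
}

def failed_criteria(stats, criteria=CRITERIA):
    """Return the names of criteria that the given statistics do not satisfy."""
    failures = []
    for name, (op, limit) in criteria.items():
        value = stats[name]
        ok = value > limit if op == ">" else value < limit
        if not ok:
            failures.append(name)
    return failures

# Hypothetical statistics for two models
good_model = {
    "ramachandran_favored_pct": 97.5,
    "ramachandran_outliers_pct": 0.1,
    "rotamer_outliers_pct": 0.4,
}
poor_model = {
    "ramachandran_favored_pct": 88.0,
    "ramachandran_outliers_pct": 1.2,
    "rotamer_outliers_pct": 0.8,
}
print(failed_criteria(good_model))
print(failed_criteria(poor_model))
```

Keeping the thresholds in one table makes it straightforward to swap in stricter targets for high-resolution structures.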
Applications
X-ray crystallography has determined the majority of known protein structures in the Protein Data Bank. It has elucidated enzyme catalytic mechanisms, drug-target interactions including HIV protease and kinase inhibitors, and large macromolecular complexes such as the ribosome. Together with NMR spectroscopy and cryo-EM, it forms the core toolkit of structural biology.