Protein Structure Comparison and Alignment

Overview

Protein structure comparison is the computational process of aligning and superimposing three-dimensional protein structures to identify regions of similar fold and topology. Unlike sequence alignment, which operates on one-dimensional strings of residues, structure alignment compares atomic coordinates and can therefore capture evolutionary relationships that are undetectable at the sequence level once sequences have diverged. Because structural similarity often persists long after sequence similarity has eroded, structural alignment is a powerful tool for remote homology detection, functional annotation, and the study of protein evolution.

Methods

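Many structure alignment algorithms search for a residue correspondence and a rigid-body superposition that minimize the root-mean-square deviation (RMSD) of corresponding C-alpha atoms, or that optimize a related score. DALI instead compares intramolecular distance matrices: it breaks each protein's C-alpha distance matrix into hexapeptide-sized submatrices and assembles pairs with similar contact patterns into an alignment. TM-align builds initial alignments and then refines them iteratively by dynamic programming, using the rotation matrix that maximizes the TM-score. The TM-score normalizes alignment quality by protein length and ranges from 0 to 1, with scores above 0.5 typically indicating the same fold. GDT_TS (global distance test total score), which averages the percentage of residues that can be superposed within several distance cutoffs, is another metric; it is used to assess predicted models in the Critical Assessment of protein Structure Prediction (CASP) experiments.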
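The superposition and scoring steps can be illustrated with a minimal Python sketch. It assumes that NumPy is available, that the residue correspondence is already known (each row of one coordinate array is paired with the same row of the other), and that only C-alpha coordinates are used. The function names kabsch_superpose, rmsd, tm_score and gdt_ts are illustrative and do not come from DALI, TM-align or any other package. The superposition uses the Kabsch algorithm, which minimizes RMSD for a fixed correspondence. The TM-score and GDT_TS functions evaluate the standard formulas on that single superposition, whereas the actual programs search for the superposition that maximizes each score, so the values returned here should be read as lower-bound approximations. The 0.5 floor on d0 is a guard added for short chains and is likewise an assumption of this sketch.

```python
import numpy as np


def kabsch_superpose(mobile, target):
    """Rigid-body fit of `mobile` onto `target` (both N x 3, rows paired)."""
    mobile = np.asarray(mobile, dtype=float)
    target = np.asarray(target, dtype=float)
    target_center = target.mean(axis=0)
    p = mobile - mobile.mean(axis=0)   # centered mobile coordinates
    q = target - target_center         # centered target coordinates
    # SVD of the 3 x 3 covariance matrix yields the optimal rotation.
    u, _, vt = np.linalg.svd(p.T @ q)
    # Flip the last axis if needed so the result is a proper rotation
    # (determinant +1) rather than a reflection.
    d = np.sign(np.linalg.det(u @ vt))
    rotation = u @ np.diag([1.0, 1.0, d]) @ vt
    return p @ rotation + target_center


def rmsd(a, b):
    """Root-mean-square deviation between paired coordinates."""
    diff = np.asarray(a, dtype=float) - np.asarray(b, dtype=float)
    return float(np.sqrt((diff ** 2).sum(axis=1).mean()))


def tm_score(fitted, target, target_length=None):
    """TM-score formula for one fixed superposition (no score search)."""
    fitted = np.asarray(fitted, dtype=float)
    target = np.asarray(target, dtype=float)
    length = target_length if target_length is not None else len(target)
    # Length-dependent distance scale; the 0.5 floor is an assumption
    # of this sketch to keep d0 positive for short chains.
    d0 = max(1.24 * np.cbrt(length - 15) - 1.8, 0.5)
    dist = np.linalg.norm(fitted - target, axis=1)
    return float(np.sum(1.0 / (1.0 + (dist / d0) ** 2)) / length)


def gdt_ts(fitted, target):
    """Mean percentage of residues within 1, 2, 4 and 8 angstroms,
    computed on one superposition only (a simplification)."""
    dist = np.linalg.norm(np.asarray(fitted, dtype=float)
                          - np.asarray(target, dtype=float), axis=1)
    return float(np.mean([(dist <= c).mean() for c in (1.0, 2.0, 4.0, 8.0)]) * 100)


if __name__ == "__main__":
    # Synthetic example: a rotated, translated and slightly perturbed copy
    # of random coordinates, standing in for two related structures.
    rng = np.random.default_rng(0)
    target = rng.normal(scale=10.0, size=(60, 3))
    angle = 0.7
    rot_z = np.array([[np.cos(angle), -np.sin(angle), 0.0],
                      [np.sin(angle), np.cos(angle), 0.0],
                      [0.0, 0.0, 1.0]])
    mobile = target @ rot_z + np.array([5.0, -3.0, 2.0])
    mobile += rng.normal(scale=0.5, size=mobile.shape)
    fitted = kabsch_superpose(mobile, target)
    print("RMSD:", rmsd(fitted, target))
    print("TM-score:", tm_score(fitted, target))
    print("GDT_TS:", gdt_ts(fitted, target))
```

Because RMSD weights every residue pair equally, a few poorly fitting loops can dominate it, whereas the TM-score term 1 / (1 + (d / d0)^2) saturates for large distances. This is one reason length-normalized scores are often preferred when comparing structures of different sizes.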

Applications

Structure comparison is essential for protein classification: databases such as SCOP and CATH organize the structural universe into hierarchical levels such as folds, superfamilies, and families. It enables the transfer of functional annotations from characterized proteins to uncharacterized proteins with similar folds. The technique is used alongside experimental methods such as NMR spectroscopy to validate structures, and it underpins studies of protein structure evolution. Structure alignments also guide the interpretation of mutation effects by mapping sequence changes onto the three-dimensional structure, and they inform research on protein folding and chaperones by identifying conserved folding cores.