Chromatin Immunoprecipitation

Chromatin immunoprecipitation (ChIP) is a method for mapping protein-DNA interactions in the genome. It identifies the genomic binding sites of transcription factors, modified histones, and other chromatin-associated proteins, providing insight into gene regulation and chromatin structure.

Basic Principle

ChIP begins with crosslinking proteins to DNA in living cells, typically with formaldehyde, which forms covalent crosslinks between proteins and DNA that lie within a few angstroms of each other. The crosslinked chromatin is then sheared into fragments of 200 to 600 base pairs by sonication or enzymatic digestion. An antibody specific to the protein of interest immunoprecipitates the protein-DNA complexes. The crosslinks are then reversed and the DNA is purified. Finally, the enriched DNA fragments are identified by quantitative PCR, microarray hybridization, or sequencing.

Crosslinking

Formaldehyde crosslinking is the standard method. It creates methylene bridges between closely interacting amino groups, is reversible by heat, and is effective for proteins that directly contact DNA. For proteins that bind DNA indirectly or weakly, longer crosslinkers such as ethylene glycol bis(succinimidyl succinate) (EGS) can be used, often in combination with formaldehyde. Native ChIP omits crosslinking and works for tightly bound proteins such as histones.

Chromatin Shearing

Sonication produces random DNA fragments by mechanical shearing. The fragment size distribution is critical because smaller fragments localize binding sites with higher resolution. Sonication conditions must be optimized for each cell type to avoid over-shearing or under-shearing. Enzymatic digestion with micrococcal nuclease is an alternative commonly used for native ChIP; it cleaves the linker DNA between nucleosomes.

Immunoprecipitation

The success of ChIP depends on antibody quality. The antibody must recognize its target in crosslinked chromatin with high specificity and affinity, so antibodies described as ChIP-grade are those validated for this application. Agarose or magnetic beads coupled to protein A or protein G capture the antibody-chromatin complexes. Magnetic beads are often preferred because they give lower background and faster processing.

ChIP-qPCR

For known target genes, ChIP-qPCR (ChIP followed by quantitative PCR) quantifies enrichment at specific genomic regions. Primers are designed for the suspected binding site and for a negative control region where no binding is expected. Enrichment is reported either as a percentage of the input chromatin recovered or as the fold increase over the negative control region. ChIP-qPCR is sensitive, quantitative, and suitable for hypothesis-driven studies of selected loci.
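The enrichment calculation can be sketched as follows. This is a minimal illustration, not a prescribed protocol: it assumes perfect doubling of product per PCR cycle, and the function names, Ct values, and the 1 percent input fraction are hypothetical choices made for the example.

```python
import math


def percent_input(ct_ip, ct_input, input_fraction=0.01):
    """Percent of input recovered at one locus.

    Assumes 100% amplification efficiency (product doubles every cycle).
    input_fraction is the share of chromatin saved as input (0.01 = 1%).
    """
    # Shift the input Ct to the value expected if all chromatin had been used.
    adjusted_input_ct = ct_input - math.log2(1.0 / input_fraction)
    return 100.0 * 2.0 ** (adjusted_input_ct - ct_ip)


def fold_enrichment(target_percent, control_percent):
    """Ratio of recovery at the target site to recovery at the negative control region."""
    return target_percent / control_percent


# Hypothetical Ct values for a target site and a negative control region.
target = percent_input(ct_ip=25.0, ct_input=24.0)
control = percent_input(ct_ip=28.0, ct_input=24.0)
enrichment = fold_enrichment(target, control)
```

With these made-up values the target recovers about 0.5 percent of input and is enriched about 8-fold over the control region. Real primer pairs rarely amplify with perfect efficiency, so measured efficiencies would normally replace the base of 2.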

ChIP-on-Chip

Microarray-based ChIP (ChIP-on-chip) hybridizes the ChIP-enriched DNA to tiling microarrays covering genomic regions of interest. The input DNA is labeled with one fluorophore and the ChIP DNA with another, and both are hybridized to the same array. Genomic regions with a high ChIP-to-input signal ratio correspond to binding sites. The resolution is limited by fragment size and probe spacing. ChIP-on-chip has been largely replaced by ChIP-seq.
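The ratio-based readout can be illustrated with a short sketch. It assumes the two channel intensities have already been normalized against each other; the function names, the pseudocount, and the log2 threshold of 1 are hypothetical choices for the example, not part of any particular array platform.

```python
import math


def log2_ratios(chip_signal, input_signal, pseudocount=1.0):
    """Per-probe log2(ChIP / input); the pseudocount avoids division by zero."""
    return [
        math.log2((chip + pseudocount) / (inp + pseudocount))
        for chip, inp in zip(chip_signal, input_signal)
    ]


def enriched_probes(ratios, threshold=1.0):
    """Indices of probes whose log2 ratio meets the threshold (1.0 = 2-fold)."""
    return [index for index, ratio in enumerate(ratios) if ratio >= threshold]


# Hypothetical intensities for three neighbouring probes.
ratios = log2_ratios([120, 900, 110], [100, 100, 105])
hits = enriched_probes(ratios)  # only the middle probe exceeds 2-fold
```

In practice a binding site usually raises the ratio of several adjacent probes, so analyses tend to smooth over neighbouring probes rather than threshold each one independently.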

ChIP-Sequencing

ChIP-seq combines ChIP with next-generation sequencing. The immunoprecipitated DNA fragments are sequenced, and the reads are aligned to the reference genome. Regions with statistically significant read enrichment, called peaks, indicate protein binding sites. Compared with ChIP-on-chip, ChIP-seq provides higher resolution, lower noise, and genome-wide coverage without requiring pre-defined probes. Peak-calling algorithms such as MACS (Model-based Analysis of ChIP-Seq) identify regions where read density significantly exceeds the background distribution.
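The idea of testing read density against a background can be sketched with a simple Poisson test on fixed windows. This is a deliberately simplified illustration and not the MACS algorithm: the window counts, the function names, the choice of background rate, and the significance cutoff are all assumptions made for the example, and the control counts are assumed to be already scaled to the ChIP sequencing depth.

```python
import math


def poisson_sf(k, lam):
    """P(X >= k) for X ~ Poisson(lam), with lam > 0, summed over the upper tail."""
    if k <= 0:
        return 1.0
    total = 0.0
    i = k
    while True:
        # Probability mass at i, computed in log space to avoid overflow.
        term = math.exp(-lam + i * math.log(lam) - math.lgamma(i + 1))
        total += term
        # Past the mean the terms only shrink, so stop once they are negligible.
        if i > lam and term < total * 1e-15:
            break
        i += 1
    return min(total, 1.0)


def call_enriched_windows(chip_counts, control_counts, alpha=1e-5):
    """Return (window index, p-value) for windows with p-value below alpha."""
    # Average control count per window; assumes the control has at least one read.
    background = sum(control_counts) / len(control_counts)
    hits = []
    for index, (chip, control) in enumerate(zip(chip_counts, control_counts)):
        # Use the larger of the local control count and the overall average
        # as the expected count, which guards against sparse control windows.
        expected = max(control, background)
        p_value = poisson_sf(chip, expected)
        if p_value < alpha:
            hits.append((index, p_value))
    return hits


# Hypothetical read counts in five consecutive windows.
hits = call_enriched_windows([4, 6, 60, 5, 3], [5, 4, 6, 5, 5])
```

Here only the third window (index 2) is reported. Production peak callers additionally model fragment length, merge adjacent windows, and correct for multiple testing.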

Data Analysis

ChIP-seq data analysis begins with quality control, read alignment, and removal of PCR duplicates. Peak calling then identifies enriched regions by testing read counts against a Poisson or negative binomial model of the background. Because many regions are tested at once, multiple testing correction is applied to control the false discovery rate. Downstream analysis includes annotation of peaks to nearby genes, motif discovery to identify sequence preferences, and comparison across conditions.
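The multiple testing step can be illustrated with the Benjamini-Hochberg procedure, one common way to control the false discovery rate. The text does not say which procedure a given pipeline uses, so treat this as a sketch under that assumption; the function name and the p-values are hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, pvalues[index] * m / rank)
        adjusted[index] = running_min
    return adjusted


# Hypothetical p-values for four candidate peaks.
peak_pvalues = [0.01, 0.04, 0.03, 0.005]
peak_qvalues = benjamini_hochberg(peak_pvalues)
significant = [i for i, q in enumerate(peak_qvalues) if q <= 0.05]
```

Peaks whose adjusted value falls at or below the chosen threshold (0.05 here) are kept, which bounds the expected proportion of false positives among the reported peaks.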

Applications

ChIP-seq maps transcription factor binding sites genome-wide, revealing the regulatory networks that control gene expression. It maps histone modification patterns that define active promoters, enhancers, and repressed chromatin. ChIP-seq for RNA polymerase II identifies transcribed genes and paused polymerases. The ENCODE and Roadmap Epigenomics projects generated thousands of ChIP-seq datasets, creating comprehensive maps of the regulatory landscape of the human genome.