RNA sequencing (RNA-Seq) is a technique that uses next-generation sequencing to analyze the complete set of RNA transcripts in a cell or tissue sample. It provides a snapshot of gene expression, revealing which genes are active and at what levels.
How RNA-Seq Works
- RNA Extraction
Total RNA is extracted from the sample using methods similar to DNA isolation, but with additional precautions against RNase contamination, because RNA degrades far more readily than DNA. RNA integrity is then checked by gel electrophoresis or on a Bioanalyzer.
- mRNA Enrichment or rRNA Depletion
Because ribosomal RNA (rRNA) makes up most of the total RNA, the sample is either enriched for messenger RNA (mRNA) or depleted of rRNA. Enrichment uses poly-T beads that capture the poly-A tails of mRNA; depletion uses probes that selectively remove rRNA sequences.
- Library Preparation
The enriched mRNA is fragmented into small pieces and reverse transcribed into complementary DNA (cDNA) using random primers. Adapters are ligated to the cDNA fragments, and the library is amplified by PCR.
- Sequencing
The cDNA library is sequenced on an NGS platform, generating millions of short reads. Each read is a short sequence from one end of a cDNA fragment; in paired-end sequencing, both ends of each fragment are read.
- Data Analysis
The reads are aligned to a reference genome or transcriptome. The number of reads mapping to each gene is counted to quantify expression levels. Differential expression analysis identifies genes that are significantly up- or down-regulated between conditions.
- Applications
RNA-Seq is used to study gene expression changes in development, disease, drug treatment, and environmental responses. It can also detect alternative splicing, novel transcripts, and non-coding RNAs.
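The counting and comparison described under Data Analysis can be illustrated with a small sketch. It assumes per-gene read counts are already available for one control and one treated sample, scales them to counts per million (CPM), and computes a log2 fold change. The gene names, counts, and the pseudocount of 1 are hypothetical values chosen for illustration; tools such as DESeq2 and edgeR use more robust normalization and statistical models on replicated data.

```python
import math


def counts_per_million(counts):
    """Scale the raw per-gene read counts of one sample to counts per million."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sample has no counted reads")
    return {gene: n / total * 1_000_000 for gene, n in counts.items()}


def log2_fold_change(treated_cpm, control_cpm, pseudocount=1.0):
    """Return log2(treated / control) per gene.

    The pseudocount (an illustrative assumption) avoids division by zero
    and log(0) for genes with no reads in one of the samples.
    """
    return {
        gene: math.log2(
            (treated_cpm[gene] + pseudocount) / (control_cpm[gene] + pseudocount)
        )
        for gene in control_cpm
    }


# Hypothetical toy counts, not real data.
control = {"geneA": 500, "geneB": 300, "geneC": 200}
treated = {"geneA": 250, "geneB": 300, "geneC": 450}

lfc = log2_fold_change(counts_per_million(treated), counts_per_million(control))
for gene, value in sorted(lfc.items()):
    print(f"{gene}\t{value:+.2f}")
```

A negative value means lower expression in the treated sample, a positive value means higher expression. A single pair of samples like this cannot establish significance, which is why replicates and a statistical model are needed.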
Practical RNA-Seq Protocol
Extract total RNA using TRIzol or a column-based method with on-column DNase treatment. Assess RNA integrity on a Bioanalyzer or TapeStation: an RNA Integrity Number (RIN) > 7 is generally expected for poly-A-selected mRNA-Seq, while > 5 may suffice for total RNA-Seq with rRNA depletion.
For mRNA enrichment, incubate 0.1–1 µg of total RNA with poly-T oligo-attached magnetic beads to capture poly-adenylated mRNA. Alternatively, use a Ribo-Zero kit to deplete cytoplasmic and mitochondrial rRNA.
Fragment the enriched mRNA by incubation at 94°C for 8 minutes in fragmentation buffer, generating ~200 nt fragments. Synthesize first-strand cDNA using random hexamers and reverse transcriptase at 25°C for 10 minutes, 42°C for 50 minutes, and 70°C for 15 minutes. During second-strand synthesis, use dUTP in place of dTTP so that the second strand is marked with uracil; this preserves strand information, because only the first strand survives to be amplified. Perform end repair, A-tailing, and adapter ligation as in standard NGS library preparation. Degrade the dUTP-labeled second strand using USER enzyme before PCR enrichment (10–15 cycles).
Sequence the final library on an Illumina platform, generating 20–40 million paired-end reads per sample (2 × 75 bp or 2 × 150 bp).
Process raw FASTQ files through a bioinformatics pipeline: quality trim with cutadapt, align to the reference genome using STAR (2-pass mode for novel splice junctions), quantify gene-level counts with featureCounts or HTSeq, and identify differentially expressed genes with DESeq2 or edgeR using a false discovery rate threshold of 0.05. Include at least three biological replicates per condition for statistical power.
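The trimming, alignment, and counting steps of the pipeline above can be sketched as a small command builder. This is a minimal illustration rather than a validated workflow: the file names, thread count, quality and length cutoffs, and the adapter prefix are assumptions, and the function only assembles the commands without running them. The `-s 2` setting assumes a reverse-stranded library, which is what the dUTP method typically produces, and `--countReadPairs` assumes a recent featureCounts release (Subread 2.0.2 or later).

```python
import shlex


def build_pipeline(sample, r1, r2, star_index, gtf, threads=8):
    """Return cutadapt, STAR, and featureCounts commands for one paired-end sample."""
    trimmed_r1 = f"{sample}_R1.trimmed.fastq.gz"
    trimmed_r2 = f"{sample}_R2.trimmed.fastq.gz"
    # STAR appends "Aligned.sortedByCoord.out.bam" to the output prefix.
    bam = f"{sample}.Aligned.sortedByCoord.out.bam"
    # Assumed adapter: the common Illumina adapter prefix; check your kit.
    adapter = "AGATCGGAAGAGC"

    trim = [
        "cutadapt", "-q", "20", "-m", "20",  # assumed quality/length cutoffs
        "-a", adapter, "-A", adapter,
        "-o", trimmed_r1, "-p", trimmed_r2,
        r1, r2,
    ]
    align = [
        "STAR", "--runThreadN", str(threads),
        "--genomeDir", star_index,
        "--readFilesIn", trimmed_r1, trimmed_r2,
        "--readFilesCommand", "zcat",
        "--twopassMode", "Basic",  # 2-pass mode for novel splice junctions
        "--outSAMtype", "BAM", "SortedByCoordinate",
        "--outFileNamePrefix", f"{sample}.",
    ]
    count = [
        "featureCounts", "-T", str(threads),
        "-p", "--countReadPairs",  # count fragments, not individual reads
        "-s", "2",                 # reverse-stranded (assumed dUTP library)
        "-a", gtf,
        "-o", f"{sample}.counts.txt",
        bam,
    ]
    return [trim, align, count]


# Hypothetical sample and reference paths.
commands = build_pipeline(
    "ctrl_1", "ctrl_1_R1.fastq.gz", "ctrl_1_R2.fastq.gz", "star_index", "genes.gtf"
)
for cmd in commands:
    print(shlex.join(cmd))
```

Printing the commands first makes it easy to review them before execution; each list could then be passed to `subprocess.run(cmd, check=True)`.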
Real-World Application
In a study comparing drug-treated vs. control hepatocytes, RNA-Seq identifies 1,247 differentially expressed genes (FDR < 0.05). Pathway enrichment analysis reveals upregulation of lipid metabolism genes and downregulation of inflammatory pathways. The data are validated by RT-qPCR on 10 selected genes, confirming the direction and magnitude of fold changes.
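An FDR cutoff such as the one in this example is usually applied to p-values adjusted with the Benjamini-Hochberg procedure, which is the default adjustment in DESeq2. The sketch below shows that adjustment on a handful of hypothetical p-values; it is a plain-Python illustration of the procedure, not the code those packages run.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that adjusted values
    # never decrease as the raw p-values increase.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical p-values for five genes, not real data.
pvals = [0.001, 0.008, 0.039, 0.041, 0.60]
padj = benjamini_hochberg(pvals)
significant = [p < 0.05 for p in padj]
print(padj)
print(significant)
```

Here the third and fourth genes have raw p-values below 0.05 but adjusted values slightly above it, so they would not be reported at FDR < 0.05.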