Transcription and RNA Processing

Transcription is the synthesis of RNA from a DNA template, the first step in gene expression. In eukaryotes, the primary transcript undergoes extensive processing to produce functional RNA molecules. Transcripts can be analyzed by RNA sequencing and amplified by reverse transcription PCR.

RNA Polymerases

Prokaryotes have a single RNA polymerase, a multi-subunit enzyme that synthesizes all types of RNA. The core enzyme consists of two alpha subunits plus the beta, beta-prime, and omega subunits. It associates with a sigma factor to form the holoenzyme, and the sigma factor recognizes promoter sequences. The sigma-70 factor recognizes standard bacterial promoters, which carry consensus sequences centered near positions -10 and -35 relative to the transcription start site.
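
As a rough illustration of how the -35 and -10 elements together define a sigma-70 promoter, the sketch below scans a DNA sequence for the best match to two consensus hexamers. The consensus strings TTGACA and TATAAT, the 16 to 18 bp spacer range, and the function names are assumptions introduced here for illustration and are not taken from this article. Real promoter prediction usually relies on position weight matrices rather than simple mismatch counts.

```python
# Hypothetical helper: rank candidate sigma-70 promoters by mismatch count.
MINUS_35 = "TTGACA"  # textbook -35 consensus (assumption, not from the article)
MINUS_10 = "TATAAT"  # textbook -10 consensus (assumption, not from the article)


def mismatches(a, b):
    """Count positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def best_promoter(seq, spacers=(16, 17, 18)):
    """Return (total_mismatches, start_of_minus_35, spacer) for the best
    candidate on the given strand, or None if the sequence is too short."""
    seq = seq.upper()
    best = None
    for i in range(len(seq) - 6 + 1):
        for spacer in spacers:
            j = i + 6 + spacer  # start index of the -10 hexamer
            if j + 6 > len(seq):
                continue
            score = (mismatches(seq[i:i + 6], MINUS_35)
                     + mismatches(seq[j:j + 6], MINUS_10))
            candidate = (score, i, spacer)
            if best is None or candidate < best:
                best = candidate
    return best
```

A score of 0 means both hexamers match the assumed consensus exactly. Natural promoters rarely do, so in practice the score is only useful for ranking candidates.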

Eukaryotes have three nuclear RNA polymerases. RNA polymerase I transcribes the large ribosomal RNA genes in the nucleolus. RNA polymerase II transcribes protein-coding genes to produce mRNA, as well as many non-coding RNAs, including most snRNAs. The C-terminal domain (CTD) of its largest subunit contains heptad repeats that are phosphorylated during transcription. RNA polymerase III transcribes small RNAs including tRNA, 5S rRNA, and U6 snRNA.

Transcription Initiation

Transcription begins at promoter sequences that direct RNA polymerase to the correct start site. In eukaryotes, RNA polymerase II requires assembly of general transcription factors at the core promoter. TFIID, which contains TATA-binding protein, recognizes the TATA box, and the remaining general factors (TFIIA, TFIIB, TFIIE, TFIIF, and TFIIH) assemble with Pol II into the preinitiation complex. The mediator complex integrates regulatory signals from enhancer-bound transcription factors. Promoter clearance follows: the kinase activity of TFIIH phosphorylates the Pol II C-terminal domain (CTD) at serine 5, releasing Pol II from the initiation complex.

Elongation

During elongation, RNA polymerase moves along the template strand, reading it in the 3-prime to 5-prime direction and synthesizing RNA in the 5-prime to 3-prime direction. The active site adds ribonucleotides complementary to the template, and the growing RNA strand forms a short, temporary hybrid of about 8 to 9 base pairs with the DNA template. The transcription bubble moves with the polymerase, with DNA unwinding ahead and rewinding behind. As elongation proceeds, the CTD becomes phosphorylated at serine 2, recruiting RNA processing factors. Pausing and backtracking can occur, providing regulatory checkpoints.
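
The directionality described above can be made concrete with a minimal sketch. It assumes the template strand is supplied in the conventional 5-prime to 3-prime written orientation, so the code walks it in reverse to mimic the polymerase. The function name and the pairing table are illustrative, not part of any library.

```python
# Base-pairing rules for DNA template -> RNA product.
RNA_PAIR = {"A": "U", "T": "A", "G": "C", "C": "G"}


def transcribe(template_5_to_3):
    """Return the RNA (written 5'->3') made from a DNA template strand
    that is written 5'->3'."""
    rna = []
    # The polymerase reads the template 3'->5', so iterate in reverse.
    for base in reversed(template_5_to_3.upper()):
        rna.append(RNA_PAIR[base])  # append the complementary ribonucleotide
    return "".join(rna)


# The template GGCCAT is the reverse complement of the coding strand ATGGCC,
# so the transcript matches the coding strand with U in place of T.
print(transcribe("GGCCAT"))  # AUGGCC
```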

Termination

Termination mechanisms differ between polymerases. Bacterial RNA polymerase terminates either intrinsically, at a hairpin structure in the transcript that destabilizes the elongation complex, or with the help of the Rho factor, which actively dissociates the complex. RNA polymerase I terminates at specific terminator sequences recognized by a DNA-binding termination factor. RNA polymerase II termination is coupled to polyadenylation: cleavage of the pre-mRNA at the poly-A site frees the transcript, and the polymerase is released from the template further downstream. RNA polymerase III terminates at a run of thymine residues on the non-template strand, which appears as a run of uridines at the 3-prime end of the transcript.

5-Prime Capping

The 5-prime end of Pol II transcripts is modified co-transcriptionally, soon after the nascent RNA emerges from the polymerase. A 7-methylguanosine cap is added in reverse orientation through a 5-prime to 5-prime triphosphate linkage. The capping enzyme is recruited to the serine 5-phosphorylated CTD. The cap protects the transcript from 5-prime exonucleases, enhances translation by binding the initiation factor eIF4E, and facilitates nuclear export. The cap is also important for efficient splicing and polyadenylation.

RNA Splicing

Most eukaryotic genes contain introns that must be removed from pre-mRNA. Splicing is catalyzed by the spliceosome, a dynamic complex of five snRNPs (U1, U2, U4, U5, and U6) and numerous proteins. The reaction involves two transesterification steps. First, the 2-prime hydroxyl of the branch point adenosine attacks the phosphodiester bond at the 5-prime splice site, forming a lariat intermediate. Second, the free 3-prime hydroxyl of the 5-prime exon attacks the 3-prime splice site, joining the exons and releasing the lariat intron.
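
The net effect of the two steps, excising the introns and joining the flanking exons, can be sketched at the sequence level. The example assumes intron coordinates are already known and checks only the canonical GU...AG boundary dinucleotides. That boundary rule, the coordinate convention, and the function name are assumptions added for illustration, and the sketch does not model branch point selection or the lariat itself.

```python
def splice(pre_mrna, introns):
    """Remove introns from a pre-mRNA string.

    introns is a list of (start, end) pairs, 0-based and half-open,
    assumed not to overlap. Raises ValueError if an intron does not
    begin with GU and end with AG."""
    pre_mrna = pre_mrna.upper()
    exons = []
    position = 0
    for start, end in sorted(introns):
        intron = pre_mrna[start:end]
        if not (intron.startswith("GU") and intron.endswith("AG")):
            raise ValueError("intron at %d-%d lacks GU...AG boundaries" % (start, end))
        exons.append(pre_mrna[position:start])  # exon upstream of this intron
        position = end                          # resume after the 3' splice site
    exons.append(pre_mrna[position:])           # final exon
    return "".join(exons)


pre = "AUGGCC" + "GUAAGUCUAACAG" + "UUUUAA"
print(splice(pre, [(6, 19)]))  # AUGGCCUUUUAA
```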

Alternative splicing allows a single gene to produce multiple mRNA variants. An estimated 95% of human multi-exon genes undergo alternative splicing, greatly expanding the proteome. Splicing is regulated by enhancer and silencer sequences in exons and introns, which typically bind SR proteins and hnRNPs, respectively.
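
To see how quickly alternative splicing multiplies the number of possible transcripts, the sketch below enumerates isoforms for a gene in which some exons are cassette exons that may be included or skipped. It models exon skipping only. The exon names and the function name are hypothetical, and real genes also use alternative splice sites, mutually exclusive exons, and intron retention.

```python
from itertools import product


def isoforms(exons, cassette):
    """List every exon combination obtainable by independently including
    or skipping the cassette exons.

    exons is an ordered list of exon names; cassette is a set of indices
    into that list marking exons that may be skipped."""
    choices = [(True, False) if i in cassette else (True,)
               for i in range(len(exons))]
    result = []
    for keep in product(*choices):
        result.append([exon for exon, kept in zip(exons, keep) if kept])
    return result


# Two cassette exons give 2**2 = 4 isoforms.
for isoform in isoforms(["E1", "E2", "E3", "E4"], {1, 2}):
    print("-".join(isoform))
```

With n independent cassette exons the count is 2 to the power n, which is why even a handful of regulated exons can yield dozens of distinct mRNAs.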

3-Prime Processing

Pre-mRNA is cleaved at the polyadenylation site, and a poly-A tail of 100 to 250 adenosine residues is added by poly-A polymerase. The cleavage and polyadenylation specificity factor (CPSF) recognizes the AAUAAA polyadenylation signal upstream of the cleavage site, while cleavage stimulation factor (CstF) binds downstream GU-rich elements. Cleavage factors I and II are also required for the endonucleolytic cleavage, which is catalyzed by the CPSF73 subunit of CPSF. Poly-A binding protein coats the tail, protecting the mRNA from degradation and promoting translation.
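
The cleavage and tailing steps can be sketched as a simple string operation. The example finds the first AAUAAA signal, cuts a fixed distance downstream of it, and appends a tail. The 20-nucleotide offset between signal and cut is an illustrative assumption rather than a value from this article, the default tail length of 200 is simply a point inside the 100 to 250 range given above, and the function name is hypothetical. The sketch ignores the downstream GU-rich element.

```python
def cleave_and_polyadenylate(pre_mrna, signal="AAUAAA", offset=20, tail_length=200):
    """Cut the transcript `offset` nucleotides past the end of the first
    polyadenylation signal and append a poly-A tail.

    The offset is an assumed, fixed spacing used only for illustration."""
    rna = pre_mrna.upper().replace("T", "U")  # accept DNA-style input too
    signal_start = rna.find(signal)
    if signal_start == -1:
        raise ValueError("no polyadenylation signal found")
    cleavage_site = signal_start + len(signal) + offset
    if cleavage_site > len(rna):
        raise ValueError("transcript ends before the assumed cleavage site")
    return rna[:cleavage_site] + "A" * tail_length


pre = "GGG" + "AAUAAA" + "C" * 30 + "GUGUGU"
mature = cleave_and_polyadenylate(pre)
print(len(mature))  # 229: 29 nt retained plus a 200-nt tail
```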