Non-coding RNAs and Gene Regulation

May 31, 2026

The discovery that the majority of the human genome is transcribed into RNA that does not encode proteins has fundamentally changed our understanding of gene regulation. Non-coding RNAs (ncRNAs) constitute a diverse and abundant class of regulatory molecules that control gene expression at multiple levels, from chromatin remodeling and transcription to mRNA stability and translation. Their dysregulation contributes to a wide spectrum of diseases, including cancer, neurological disorders, and cardiovascular disease.

MicroRNAs

MicroRNAs (miRNAs) are small ncRNAs of approximately 22 nucleotides that regulate gene expression post-transcriptionally by binding to complementary sequences in the 3′ untranslated regions (UTRs) of target messenger RNAs. miRNA genes are transcribed by RNA polymerase II as primary miRNAs (pri-miRNAs) containing hairpin structures, which are cleaved in the nucleus by the Microprocessor complex (Drosha and DGCR8) to release precursor miRNAs (pre-miRNAs) of approximately 70 nucleotides. Pre-miRNAs are exported to the cytoplasm by Exportin-5 and further processed by Dicer, an RNase III endonuclease, to generate mature miRNA duplexes. One strand of the duplex is loaded into the RNA-induced silencing complex (RISC), where it guides Argonaute (AGO) proteins to target mRNAs through partial sequence complementarity, primarily within the seed region (nucleotides 2–8). miRNA binding typically represses translation and promotes mRNA deadenylation and degradation. A single miRNA can regulate hundreds of target mRNAs, and over 2,600 mature human miRNAs have been identified, collectively regulating approximately 60% of protein-coding genes. miRNAs are involved in virtually all biological processes, including cell proliferation, differentiation, apoptosis, and immune responses, and miRNA expression profiling has diagnostic and prognostic value in cancer.
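Seed-based target recognition can be illustrated with a minimal Python sketch that scans a 3′ UTR for sites complementary to miRNA nucleotides 2–8. The function names and the toy UTR are hypothetical, and the let-7a sequence is included only as example input that should be checked against a database before reuse. Real target prediction also weighs site context, conservation, and pairing outside the seed, none of which is modeled here.

```python
# Watson-Crick partners for RNA bases (G:U wobble pairs are ignored in this sketch).
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna: str) -> str:
    """Return the reverse complement of an RNA sequence written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(rna))


def seed_site(mirna: str) -> str:
    """Return the mRNA motif (5'->3') that pairs with miRNA nucleotides 2-8."""
    seed = mirna[1:8]  # 1-based positions 2..8
    return reverse_complement(seed)


def find_seed_matches(mirna: str, utr: str) -> list[int]:
    """Return 0-based start positions of every seed-complementary site in the UTR."""
    site = seed_site(mirna)
    positions = []
    start = utr.find(site)
    while start != -1:
        positions.append(start)
        start = utr.find(site, start + 1)  # allow overlapping sites
    return positions


# Example input: let-7a-5p and a made-up UTR fragment carrying two seed sites.
LET7A = "UGAGGUAGUAGGUUGUAUAGUU"
TOY_UTR = "AAACUACCUCAGGGCUACCUCUU"

print(seed_site(LET7A))                   # CUACCUC
print(find_seed_matches(LET7A, TOY_UTR))  # [3, 14]
```

Because only seven nucleotides need to match, such motifs occur frequently across the transcriptome, which is consistent with a single miRNA having hundreds of potential targets.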

Long Non-coding RNAs

Long non-coding RNAs (lncRNAs) are defined as transcripts longer than 200 nucleotides that lack protein-coding potential, though some contain small open reading frames or produce functional micropeptides. The human genome contains approximately 20,000 lncRNA genes, many of which are expressed in a cell type-specific and developmentally regulated manner. LncRNAs regulate gene expression through diverse mechanisms. In the nucleus, Xist (X-inactive specific transcript) mediates X-chromosome inactivation in female mammals by coating the future inactive X chromosome and recruiting chromatin-modifying complexes, including Polycomb repressive complex 2 (PRC2), which deposits H3K27me3 repressive marks. HOTAIR, a lncRNA from the HOXC locus, recruits PRC2 to specific genomic sites to silence HOXD genes in trans. LncRNAs can also function as molecular scaffolds, bringing together multiple proteins to form ribonucleoprotein complexes; as decoys, sequestering transcription factors or miRNAs; and as enhancer-associated RNAs (eRNAs) that promote enhancer-promoter looping and transcriptional activation. MALAT1 (metastasis-associated lung adenocarcinoma transcript 1) is a highly expressed nuclear lncRNA that regulates alternative splicing by modulating serine/arginine-rich splicing factor phosphorylation and is associated with metastasis in multiple cancer types.
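The working definition above (longer than 200 nucleotides, no substantial coding potential) can be approximated with a naive filter, sketched here in Python. The 100-codon ORF cutoff, the function names, and the restriction to the three forward frames are assumptions made for illustration. Dedicated coding-potential tools use far richer features, and a short ORF does not rule out a functional micropeptide.

```python
STOP_CODONS = {"UAA", "UAG", "UGA"}


def longest_orf_codons(rna: str) -> int:
    """Longest AUG-to-stop ORF in the three forward frames, in codons (stop excluded)."""
    best = 0
    for frame in range(3):
        start = None  # index of the AUG that opened the current ORF, if any
        for i in range(frame, len(rna) - 2, 3):
            codon = rna[i:i + 3]
            if start is None and codon == "AUG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                best = max(best, (i - start) // 3)
                start = None
    return best


def looks_like_lncrna(rna: str, min_length: int = 200, max_orf_codons: int = 100) -> bool:
    """Heuristic: long transcript whose longest ORF stays below an assumed cutoff."""
    return len(rna) > min_length and longest_orf_codons(rna) < max_orf_codons


# Toy transcripts (made-up sequences, not real genes).
noncoding_like = "GCU" * 80                 # 240 nt, no start codon in any frame
coding_like = "AUG" + "GCU" * 120 + "UAA"   # 366 nt with a 121-codon ORF

print(looks_like_lncrna(noncoding_like))  # True
print(looks_like_lncrna(coding_like))     # False
```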

Circular RNAs

Circular RNAs (circRNAs) are a class of covalently closed RNA molecules generated by back-splicing, in which a downstream 5′ splice site is joined to an upstream 3′ splice site, producing a circular transcript without free ends. CircRNAs are resistant to exonucleolytic degradation and are highly stable compared to linear RNAs. Most circRNAs are expressed at low levels, but some accumulate to high abundance, particularly in the brain, and their expression is often conserved across species. The best-characterized function of circRNAs is as miRNA sponges: by containing multiple binding sites for specific miRNAs, circRNAs sequester miRNAs and prevent them from repressing their target mRNAs. CDR1as (cerebellar degeneration-related protein 1 antisense), also called ciRS-7, contains over 60 conserved binding sites for miR-7 and has been linked to normal brain function in zebrafish and mouse studies. CircRNAs can also regulate transcription by interacting with RNA polymerase II, modulate splicing by competing with linear splicing, and some can be translated to produce proteins or peptides in a cap-independent manner. CircRNA dysregulation is implicated in cancer, neurological disorders, cardiovascular disease, and aging.
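Back-splicing and the sponge model can be sketched in a few lines of Python. The snippet builds the junction sequence created when the end of the last exon is joined to the start of the first, then counts miRNA-binding motifs on the closed circle, including any that span the junction and so have no counterpart in the linear transcript. The exon sequences, motif, and function names are hypothetical.

```python
def backsplice_junction(exons: list[str], flank: int = 5) -> str:
    """Sequence across the back-splice junction: tail of the last exon + head of the first."""
    circ = "".join(exons)
    return circ[-flank:] + circ[:flank]


def count_sites_circular(circ: str, site: str) -> int:
    """Count occurrences of `site` on a circular sequence, including junction-spanning ones."""
    # Append the first len(site) - 1 bases so windows can wrap around the junction.
    extended = circ + circ[: len(site) - 1]
    return sum(1 for i in range(len(circ)) if extended.startswith(site, i))


# Toy exons (made up); the motif is the let-7 seed-complementary site used as an example.
exons = ["CUCAAGG", "UUCUACCUCGA", "AUGCUAC"]
circ = "".join(exons)
site = "CUACCUC"

print(backsplice_junction(exons, flank=4))  # CUACCUCA
print(circ.count(site))                     # 1 site in the linear arrangement
print(count_sites_circular(circ, site))     # 2 sites once the ends are joined
```

Reads that cover a junction sequence of this kind are what circRNA detection methods typically look for, since that sequence is absent from the linear isoform.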

Piwi-Interacting RNAs

Piwi-interacting RNAs (piRNAs) are small ncRNAs of 24–31 nucleotides that are primarily expressed in germ cells and function in transposon silencing and genome integrity. piRNAs associate with PIWI proteins, a subclass of Argonaute proteins, and guide them to complementary transposon transcripts through base pairing, leading to transcriptional silencing through DNA methylation and histone modifications, or post-transcriptional cleavage. The piRNA pathway is essential for maintaining germline genome integrity by suppressing transposon activity, and its disruption leads to sterility and germ cell tumors in animal models. piRNAs are generated from long single-stranded RNA precursors transcribed from piRNA clusters, processed by the endonuclease Zucchini, and loaded into PIWI proteins. In the mouse, three PIWI proteins (MIWI, MILI, MIWI2) function at different stages of spermatogenesis. Emerging evidence suggests piRNAs and PIWI proteins may also function in somatic tissues, including cancer cells, where their expression correlates with poor prognosis in several cancer types.

Small Interfering RNAs

Small interfering RNAs (siRNAs) are double-stranded RNAs of 20–25 nucleotides that mediate RNA interference (RNAi), a conserved gene silencing mechanism triggered by exogenous double-stranded RNA, such as viral RNA or experimentally introduced constructs. Synthetic siRNAs are widely used as research tools to knock down gene expression and are being developed as therapeutics. In the RNAi pathway, long double-stranded RNA is cleaved by Dicer into siRNAs, which are loaded into RISC. Unlike miRNAs, siRNAs typically have perfect complementarity to their target, leading to Argonaute-mediated endonucleolytic cleavage of the target mRNA. Endogenous siRNAs, produced from transposons, repetitive elements, or convergent transcription, regulate transposon silencing in some organisms and may function in antiviral defense in plants and invertebrates.
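The contrast between siRNA-style and miRNA-style pairing can be made concrete with a small Python sketch. It derives a fully complementary guide strand for a target site, then labels a guide-target pair as "perfect" (the siRNA-like case associated with cleavage), "seed-only" (miRNA-like pairing), or "none". The labels, function names, and toy target are assumptions for illustration. The model ignores G:U wobble pairs, overhangs, and the position-specific tolerance of real Argonaute proteins.

```python
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def guide_for_target(target_site: str) -> str:
    """Guide (antisense) strand, 5'->3', fully complementary to the target site."""
    return "".join(COMPLEMENT[base] for base in reversed(target_site))


def count_mismatches(guide: str, target_site: str) -> int:
    """Count guide positions that fail to Watson-Crick pair with the target."""
    if len(guide) != len(target_site):
        raise ValueError("guide and target site must have equal length")
    # Guide position i (from its 5' end) pairs with target position -(i + 1).
    return sum(1 for g, t in zip(guide, reversed(target_site)) if COMPLEMENT[g] != t)


def pairing_class(guide: str, target_site: str) -> str:
    """Classify pairing as 'perfect', 'seed-only' (guide nt 2-8 paired), or 'none'."""
    if count_mismatches(guide, target_site) == 0:
        return "perfect"
    if count_mismatches(guide[1:8], target_site[-8:-1]) == 0:
        return "seed-only"
    return "none"


def mutate(seq: str, i: int) -> str:
    """Hypothetical helper: swap the base at index i for a different one."""
    return seq[:i] + ("A" if seq[i] != "A" else "C") + seq[i + 1:]


target = "GCAUGCUAGCUAGGAUCCAUG"  # made-up 21-nt target site
guide = guide_for_target(target)

print(pairing_class(guide, target))              # perfect
print(pairing_class(mutate(guide, 10), target))  # seed-only
print(pairing_class(mutate(guide, 3), target))   # none
```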

NcRNAs in Disease and Therapeutics

The dysregulation of ncRNAs is associated with many human diseases. In cancer, miRNAs can function as oncogenes (oncomiRs, such as miR-21, miR-155) or tumor suppressors (such as the let-7 family, miR-34), and miRNA expression signatures are used for tumor classification and prognosis. LncRNAs including PCA3 are used as diagnostic biomarkers, with PCA3 measured in urine for prostate cancer detection. Therapeutic strategies targeting ncRNAs are under active development. Antisense oligonucleotides (ASOs) targeting lncRNAs or pri-miRNAs can inhibit their function, while miRNA mimics restore tumor suppressor miRNA expression. The first RNAi-based therapeutic, patisiran (an siRNA targeting transthyretin), was approved in 2018 for hereditary transthyretin-mediated amyloidosis. Several miRNA-targeting therapies, including miravirsen (an anti-miR-122 for hepatitis C), have progressed to clinical trials. CRISPR-based approaches are being explored to edit ncRNA genes or modulate their expression, and engineered circRNAs show promise as sustained-expression vectors for therapeutic proteins.