Overview
Pathway and ontology databases provide structured representations of biological knowledge. They transform the complex, interconnected nature of cellular processes into computable formats that can be queried, analyzed, and visualized. The Gene Ontology (GO) provides a controlled vocabulary to describe gene products across three domains: cellular component, molecular function, and biological process. KEGG (Kyoto Encyclopedia of Genes and Genomes) maps genes to metabolic and signaling pathways. Reactome is an open-source, curated pathway database covering reactions in human biology.
Key Concepts
The Gene Ontology is structured as a directed acyclic graph in which a term may have more than one parent, and terms are connected by typed relationships (is-a, part-of, regulates). Annotations associate gene products with GO terms, and each annotation carries an evidence code recording how the association was established, for example by experiment or by computational inference. KEGG Pathway Maps are manually drawn diagrams where nodes represent genes, proteins, or compounds and edges represent interactions or reactions. KEGG Orthology (KO) groups functionally equivalent orthologous genes across species under shared identifiers, which lets a reference pathway map be applied to many organisms. Reactome represents pathways as ordered sets of molecular reactions, each with detailed input-output relationships, and includes tools for pathway analysis of omics data.
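The graph structure described above can be illustrated with a small sketch. It assumes a toy ontology stored as a Python dictionary; the term names, edges, and the helper functions ancestors and propagate are hypothetical and are not taken from the real GO or from any GO library. It shows how following is-a and part-of edges upward yields all ancestors of a term, and how a direct annotation can be expanded to those ancestors, which is the usual convention when counting annotations per term.

```python
from collections import deque

# Toy graph: child term -> list of (relation, parent term).
# Names and edges are illustrative only, not copied from the real GO.
TOY_GO = {
    "root_process": [],
    "metabolic_process": [("is_a", "root_process")],
    "cellular_process": [("is_a", "root_process")],
    "sugar_metabolism": [("is_a", "metabolic_process")],
    "sugar_breakdown": [
        ("is_a", "sugar_metabolism"),
        ("part_of", "cellular_process"),  # a term can have several parents
    ],
}


def ancestors(term, graph, relations=("is_a", "part_of")):
    """Return all terms reachable from `term` by following `relations` upward."""
    seen = set()
    queue = deque([term])
    while queue:
        current = queue.popleft()
        for relation, parent in graph.get(current, []):
            if relation in relations and parent not in seen:
                seen.add(parent)
                queue.append(parent)
    return seen


def propagate(annotations, graph):
    """Expand direct gene -> terms annotations to include every ancestor term."""
    expanded = {}
    for gene, terms in annotations.items():
        full = set(terms)
        for term in terms:
            full |= ancestors(term, graph)
        expanded[gene] = full
    return expanded


if __name__ == "__main__":
    direct = {"geneX": {"sugar_breakdown"}}  # hypothetical annotation
    print(sorted(propagate(direct, TOY_GO)["geneX"]))
```

Restricting the relations argument to ("is_a",) gives only the subclass ancestors. The regulates relation is left out of the default here as a simplifying assumption of this sketch.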
Applications
Ontology and pathway databases enable high-level interpretation of experimental data. Enrichment analysis tests whether GO terms or KEGG pathways are statistically overrepresented in a gene list, such as one from a metabolic pathway study or a differential expression experiment, relative to a background gene set. KEGG integrates enzyme classification (EC numbers) and nomenclature cross-references, which supports metabolic reconstruction. Signal transduction pathways in Reactome provide mechanistic context for phosphoproteomics and perturbation screens.
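As a rough illustration of enrichment analysis, the sketch below applies a one-sided hypergeometric test, one common choice for overrepresentation analysis, to each term. The function names hypergeom_sf and enrich, the gene identifiers, and the term-to-gene mapping are all hypothetical; real analyses usually rely on an established tool and apply a multiple-testing correction, which this sketch omits.

```python
from math import comb


def hypergeom_sf(k, N, K, n):
    """P(X >= k) when drawing n genes from N, of which K belong to the term."""
    total = comb(N, n)
    upper = min(K, n)
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return hits / total


def enrich(study, background, term_to_genes):
    """Return (term, overlap, p_value) tuples sorted by ascending p-value."""
    study = set(study) & set(background)  # ignore genes outside the background
    N, n = len(background), len(study)
    results = []
    for term, genes in term_to_genes.items():
        members = set(genes) & set(background)
        K = len(members)
        k = len(members & study)
        results.append((term, k, hypergeom_sf(k, N, K, n)))
    return sorted(results, key=lambda row: row[2])


if __name__ == "__main__":
    # Hypothetical toy data: 10 background genes and two terms.
    background = {f"g{i}" for i in range(1, 11)}
    term_to_genes = {
        "term_A": {"g1", "g2", "g3"},
        "term_B": {"g4", "g5", "g6", "g7", "g8", "g9"},
    }
    study = {"g1", "g2", "g3"}
    for term, overlap, p in enrich(study, background, term_to_genes):
        print(term, overlap, round(p, 4))
```

The choice of background matters: using all annotated genes rather than only those measured in the experiment changes N and K and can shift the p-values considerably.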