Break a sequence into all overlapping substrings of length k and count their frequencies to reveal the sequence’s compositional signature. Use this tool to assess sequencing quality, detect contamination, or compare genome assemblies. K-mer profiles differ markedly between species and can reveal sample mix-ups or GC bias. A smooth, near-Poisson distribution of k-mer frequencies indicates uniform coverage, while unexpected peaks suggest repeats, contamination, or sequencing artifacts that should be investigated before further analysis.
About K-mer Analysis
K-mers are substrings of length k within a biological sequence; a sequence of length n contains n - k + 1 overlapping k-mers. K-mer counting is fundamental to many bioinformatics applications, including genome assembly, sequence comparison, and metagenomics.
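As an illustration of the counting step described above, here is a minimal Python sketch. It is not this tool's actual implementation: the function name `count_kmers` is hypothetical, and converting the input to uppercase is an assumption made for the example. It slides a window of length k along the sequence one position at a time and tallies each substring.

```python
from collections import Counter


def count_kmers(sequence: str, k: int) -> Counter:
    """Count every overlapping substring of length k in `sequence`.

    Hypothetical helper for illustration only. Uppercasing the input is
    an assumption; a real tool may treat case differently.
    """
    if k <= 0:
        raise ValueError("k must be a positive integer")
    seq = sequence.upper()
    # A sequence of length n yields n - k + 1 windows (none if n < k).
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


counts = count_kmers("ATATAT", 2)
print(counts.most_common())  # [('AT', 3), ('TA', 2)]
```

This sketch counts k-mers on one strand only and treats ambiguous bases such as N like any other character. Real k-mer counters often merge each k-mer with its reverse complement and skip ambiguous bases; whether this tool does either is not stated here.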