Sequence Format & Type Converter

Convert DNA, RNA, or amino acid sequences between different sequence types and file formats with this converter.

Supported Conversions

This converter supports the following conversions:

  • DNA to RNA and amino acids
  • RNA to DNA and amino acids
  • Amino acids to DNA or RNA: not currently supported
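
As a rough illustration of the DNA to RNA and RNA to DNA conversions listed above, the sketch below swaps thymine (T) and uracil (U) while preserving case. The function names `dna_to_rna` and `rna_to_dna` are hypothetical and are not taken from the converter itself; the sketch assumes the conversion is a plain character substitution on the coding strand.

```python
def dna_to_rna(seq: str) -> str:
    """Replace thymine with uracil, keeping upper/lower case as given."""
    return seq.translate(str.maketrans("Tt", "Uu"))


def rna_to_dna(seq: str) -> str:
    """Replace uracil with thymine, keeping upper/lower case as given."""
    return seq.translate(str.maketrans("Uu", "Tt"))


if __name__ == "__main__":
    print(dna_to_rna("ATGGCCTAA"))
    print(rna_to_dna("AUGGCCUAA"))
```

All other characters, including IUPAC ambiguity codes such as N, pass through unchanged.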

How to Use

  1. Enter your DNA, RNA, or amino acid sequence in the input area.
  2. Select the input sequence type (DNA, RNA, or amino acids).
  3. Select the desired output sequence type.
  4. Enter the sequence name, description, and accession number (if applicable) in the corresponding input fields.
  5. Select the input format from the dropdown menu.
  6. Select the desired output format from the dropdown menu.
  7. Click the “Convert” button.
  8. The converted sequence will be displayed in the output area.
  9. Click the “Download” button to save the converted sequence.

Supported Formats

This converter supports the following sequence formats:

  • FASTA: A simple text-based format representing nucleotide or amino acid sequences. A sequence in FASTA format begins with a single-line description, followed by lines of sequence data. The description line must begin with a greater-than (“>”) symbol.
  • EMBL: A comprehensive format for storing nucleotide sequence data. An EMBL format file can contain multiple sequences, each with detailed annotations. Sequence data is preceded by ID, AC, DE, and SQ lines, and the sequence itself is often split into lines of 60 characters. Each entry ends with “//”.
  • GCG: A format used by the Genetics Computer Group (GCG) software package. A GCG format file usually contains a single sequence with annotations. The start of the sequence is marked by a line ending with two dot (“..”) characters.
  • GenBank: A widely used format for storing nucleotide and amino acid sequence data. Like EMBL files, GenBank files can contain multiple sequences with annotations. Sequence data begins after the “ORIGIN” keyword, and each entry ends with “//”.
  • IG/Stanford: A format used by the IntelliGenetics (IG) software. IG format files can contain multiple sequences, each with comments (lines beginning with “;”), a name line, and the sequence itself, terminated by “1” (linear) or “2” (circular).
  • Plain/Raw: A simple format containing only the sequence characters (IUPAC characters and spaces). No headers or annotations are included. A plain sequence file may contain only one sequence.
  • Pretty: The sequence is formatted for readability, typically by adding spaces every 10 characters.
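
To make the FASTA and Pretty descriptions above more concrete, here is a minimal sketch of reading and writing them. The helper names (`parse_fasta`, `to_fasta`, `to_pretty`) and the default line width of 60 characters are assumptions made for illustration; they do not describe how this converter is implemented.

```python
def parse_fasta(text: str) -> list[tuple[str, str]]:
    """Return (description, sequence) pairs from FASTA text."""
    records: list[list[str]] = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A ">" line starts a new record; the rest is the description.
            records.append([line[1:], ""])
        elif records:
            # Sequence lines are concatenated onto the current record.
            records[-1][1] += line
    return [(desc, seq) for desc, seq in records]


def to_fasta(name: str, seq: str, description: str = "", width: int = 60) -> str:
    """Write one record: a ">" header, then the sequence in fixed-width lines."""
    header = f">{name} {description}".rstrip()
    lines = [seq[i:i + width] for i in range(0, len(seq), width)]
    return "\n".join([header] + lines)


def to_pretty(seq: str, group: int = 10) -> str:
    """Insert a space after every `group` characters for readability."""
    return " ".join(seq[i:i + group] for i in range(0, len(seq), group))


if __name__ == "__main__":
    print(to_fasta("seq1", "ACGT" * 20, "demo sequence"))
    print(to_pretty("ACGTACGTACGTACG"))
```

The other formats (EMBL, GCG, GenBank, IG/Stanford) carry more header fields and are not covered by this sketch.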

Notes

  • This converter provides basic sequence type and format conversions. For more advanced manipulation or analysis of sequence data, use specialized bioinformatics tools.
  • Output in some formats (such as GCG) may need further adjustment to meet the requirements of specific software.
  • Checksums and other metadata may not be fully accurate. Always double-check the output, especially for critical applications.
  • Input format detection is basic and may not recognize every variation of a format, so it is best to select the input format explicitly.
  • Amino acid to DNA and amino acid to RNA conversions are not yet supported.
  • Amino acid translation uses a simplified codon table, so rare codons might not be represented accurately. Stop codons are represented by an asterisk (*).
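
The translation behaviour described above, with codons mapped to amino acids and stop codons shown as an asterisk, can be sketched as follows. This sketch assumes the standard genetic code (NCBI translation table 1), reading frame 1, and "X" for any codon that cannot be resolved; the converter's own simplified table may differ. The names `CODON_TABLE` and `translate` are hypothetical.

```python
from itertools import product

# Standard genetic code, listed in TCAG order for each codon position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(seq: str) -> str:
    """Translate DNA or RNA in frame 1; stop codons become '*'."""
    # Normalise: drop whitespace, upper-case, and treat RNA as DNA (U -> T).
    dna = "".join(seq.split()).upper().replace("U", "T")
    # Ignore a trailing partial codon of one or two bases.
    usable = len(dna) - len(dna) % 3
    # Codons with ambiguous or unknown bases are reported as 'X' (an assumption).
    return "".join(CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, usable, 3))


if __name__ == "__main__":
    print(translate("ATGGCCTAA"))
    print(translate("AUG GCC UAA"))
```

Translation here continues past stop codons rather than halting at the first one, which keeps the output length predictable; a real tool might offer either behaviour.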