Submit sequence data to NCBI

How to: Submit sequence data to NCBI

Starting with...	NOTES	SUBMISSION TOOLS & HELP DOCUMENTS
	Simple Sequence Submissions
Single nucleotide sequence, or several nucleotide sequences for different genes or loci	Contiguous bases of cDNA or genomic DNA. These should not be complete genomes, which should be submitted via the appropriate protocol indicated below. Records with simple annotation may be submitted with BankIt or Sequin; records with complicated annotation may be easier to submit with Sequin.	BankIt or Sequin
Group of nucleotide sequences for the same gene or locus	Includes: population studies (sequences for a single organism); phylogenetic studies (sequences for multiple organisms); environmental samples (such as cultured or uncultured bacteria or metagenomic samples)	BankIt or Sequin
Batches of Sequences	Includes: Expressed Sequence Tags (ESTs); Genome Survey Sequences (GSSs)	Batch submit guidance page
	Genomic Assembly Submissions
Small complete genomes	Includes chloroplasts, mitochondria, plasmids, phages, and viruses (Locus_tag or BioProject registration is NOT required.)	Sequin
Large complete genomes	Includes paired chromosome and plasmids, as well as bacterial or eukaryotic chromosomes. Questions regarding a specific submission that are not answered in the documented instructions can be sent to genomes@ncbi.nlm.nih.gov.	Prokaryotic Genomes submission; Eukaryotic Genomes submission
Incomplete genomes	These can be whole genome shotgun (WGS) sequences. WGS submissions should be prepared using the tbl2asn or Sequin tools. For assistance, contact genomes@ncbi.nlm.nih.gov.	Assembly submission information & Examples; WGS submissions
High Throughput Genome Sequences (HTGSs)	The clones (e.g. BACs) of large-scale clone-based genome sequencing projects that are to be released quickly into GenBank can be submitted via the HTGS system. Sequences that are to be kept confidential or are few in number should be submitted as described above for Single nucleotide sequences. HTGS submissions require prior communication with NCBI staff, so please read about the HTGS submission process for details.	HTGS submissions
	Other Submission Types
Barcode of Life sequences	Mitochondrial cytochrome oxidase I sequences that are part of the Barcode of Life initiative can be submitted using a customized BankIt.	Barcode submit page
New sequence annotation for a non-RefSeq record submitted to GenBank by someone else	Third Party Annotation (TPA) submissions can be created for annotation of existing GenBank records when the submitter has experimental or inferential evidence that will be published in a peer-reviewed biological journal. Please read about the TPA database and its submission policies before submitting.	TPA information; TPA FAQs
Computationally assembled transcript sequences	These records, based on those that have already been submitted to SRA or the Trace Archive, may be candidates for submission to the Transcriptome Shotgun Assembly (TSA) repository.	TSA information
Variations or Polymorphisms¹	Single nucleotide polymorphisms and short insertions and deletions (<50 bp) should be submitted to dbSNP, while large structural variations and copy number variation (CNV) data should be submitted to dbVar. Human variations or polymorphisms with clinical relevance should instead go through the specialized Human Variation Batch submission process, which uses HGVS nomenclature.	Variation Submission Portal
Primers, siRNAs, or probes	Primer or nucleotide-based probe sequences should be submitted to the Probe Database.	Probe submit page
High throughput sequences	The Sequence Read Archive (SRA) accepts reads from high throughput sequencing instruments. Some submissions include sets of SRA reads as part of a comprehensive package. For the specific datasets listed here, please initiate submissions with the appropriate archive: (1) human sequence or metagenome sequence data derived from clinical isolates or from sources with privacy concerns should be submitted to dbGaP; (2) functional genomics studies that examine gene expression, regulation or epigenomics (using methods such as RNA-Seq, miRNA-Seq, ChIP-Seq or methyl-Seq) should be submitted to GEO; (3) transcript survey sequence assemblies should go to the Transcriptome Shotgun Assembly (TSA) archive; (4) non-human and environmental metagenomics data should go to the Metagenome archive; (5) whole genome sequence assemblies should be submitted to WGS; (6) capillary traces should be deposited in the Trace Archive; (7) sequences from the Barcode of Life project should be submitted to Barcode. Curators of these resources will assist submitters in sending the data to SRA during the submission process.	For data types not mentioned to the left, submit directly to SRA: SRA submit page; SRA submission guidance

¹If you need a GenBank accession number for Variation or Polymorphism submissions, you will need to annotate the variations as SNPs, insertions/deletions, or microsatellite regions on a nucleotide sequence and submit this to GenBank using the appropriate mechanism for the sequence type.
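
The routing rule in the Variations or Polymorphisms row can be written out as a small decision function. The following Python sketch is illustrative only and is not an NCBI tool: the names Variant, variation_destination and SHORT_VARIANT_LIMIT_BP are hypothetical, and treating a variant of exactly 50 bp or longer as "large" is an assumption, since the table gives only the "<50 bp" bound for dbSNP.

```python
from dataclasses import dataclass

# From the table: short insertions and deletions are "<50 bp".
# Sending everything at or above this length to dbVar is an assumption.
SHORT_VARIANT_LIMIT_BP = 50


@dataclass(frozen=True)
class Variant:
    """Hypothetical minimal description of a variation to be submitted."""
    length_bp: int                     # 1 for a SNP, otherwise the indel/SV length
    is_human: bool = False
    clinically_relevant: bool = False


def variation_destination(v: Variant) -> str:
    """Return the destination named in the table for this variant."""
    if v.length_bp < 1:
        raise ValueError("length_bp must be at least 1")
    # Clinically relevant human variations use their own batch process.
    if v.is_human and v.clinically_relevant:
        return "Human Variation Batch submission (HGVS nomenclature)"
    # SNPs and short insertions/deletions go to dbSNP.
    if v.length_bp < SHORT_VARIANT_LIMIT_BP:
        return "dbSNP"
    # Large structural variations and CNV data go to dbVar.
    return "dbVar"
```

For example, variation_destination(Variant(length_bp=12)) yields "dbSNP", while a 5,000 bp deletion yields "dbVar". The footnote above still applies: a GenBank accession number requires a separate GenBank submission of the annotated nucleotide sequence.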

NCBI

National Center for Biotechnology Information
