RefSeq Collaborators and data sources

The RefSeq project is ambitious in scope, and we actively welcome opportunities to work with other groups to provide this collection. We value the information our collaborators contribute, which ranges from completely annotated genomes and advice on improving the sequence or annotation of individual RefSeq records to information about official nomenclature and function.

In addition to the significant information collected by collaboration, numerous NCBI staff are involved in database support, programmatic support, and curation.

We collaborate with many groups including:

Alliance of Genome Resources (AGR)
a consortium of 7 model organism databases (MODs) and the Gene Ontology (GO) Consortium that provides genomic information across species to facilitate comparative analysis
Chicken Gene Nomenclature Consortium (CGNC)
provides official nomenclature for chicken genes
Consensus CDS (CCDS) Project
consistent annotation of the human and mouse genomes is supported by a collaboration between NCBI, the Wellcome Trust Sanger Institute (WTSI) and the University of California, Santa Cruz (UCSC).
Cytochrome P450
Dr. Nelson curates gene content and representative sequences for this gene family.
Echinobase
resource for functional genomics data of echinoderm species.
EMBL's European Bioinformatics Institute (EMBL-EBI)
collaborates on the Matched Annotation from the NCBI and EMBL-EBI (MANE) project
FlyBase
FlyBase provides the Drosophila melanogaster RefSeq collection.
GENCODE
collaborates on the Matched Annotation from the NCBI and EMBL-EBI (MANE) project
Gene Ontology Consortium
provides consistent descriptions of gene products across databases
HUGO Gene Nomenclature Committee
provides official nomenclature for human genes and curates gene content and representative sequences.
Human Gene Mutation Database
contributed to the initial set of human RefSeq records
Human Protein Reference Database (HPRD)
curated proteomic information pertaining to human proteins
IMGT
International Immunogenetics Information System
Microbial Genomes
Microbial genomes are submitted to GenBank by several groups; we would like to acknowledge that their efforts add significant value to the RefSeq collection as we mine for experimentally supported data. NCBI collaborates with some groups to improve our Prokaryotic genome annotation pipeline, or to provide additional information for the genome, genes, or protein products.
miRBase - the microRNA database
this is the primary data source for vertebrate RefSeq and Gene records of this type of small RNA molecule.
Mouse Genome Informatics
MGI provides official nomenclature for mouse genes and curates gene content and representative sequences.
OMIM
Catalog of Human Genes and Genetic Disorders
Pseudogene.org
one source of pseudogene content represented in RefSeq and Gene.
Rat Genome Database
RGD provides official nomenclature for rat genes and identifies genes and representative sequences.
SGD
Saccharomyces Genome Database provides the annotated RefSeq records.
SwissProt/UniProt
NCBI and UniProt collaborate to provide cross-linking between protein datasets.
The Arabidopsis Information Network
TAIR provides the Arabidopsis thaliana RefSeq collection.
VectorBase
the source of genome annotation data represented in RefSeq and Gene for some of the invertebrate organisms that are vectors of human disease.
Vertebrate Gene Nomenclature Committee (VGNC)
provides official nomenclature for genes in vertebrate species that currently lack a nomenclature committee
VEuPathDB
Eukaryotic Pathogen, Vector and Host Informatics Resources
Viral Genome Advisors
the viral RefSeq collection is curated via an international collaboration and panel of viral advisors
WormBase
WormBase provides the Caenorhabditis elegans (nematode) RefSeq collection.
Xenbase
Xenbase provides official nomenclature for Xenopus species.
Zebrafish Model Organism Database (ZFIN)
provides official nomenclature for zebrafish genes and curates gene content and representative sequences.

In addition, numerous individuals have made valuable contributions by helping to curate data for specific genes, gene families, or organisms. While it is impossible to list them all here, their assistance is very much appreciated.

Last updated: 2024-07-26T19:27:35Z