MicroBIGG-E Map Documentation
Beta Release
What is MicroBIGG-E?
MicroBIGG-E, the Microbial Browser for Identification of Genetic and Genomic Elements, contains genetic and genomic elements identified in assemblies analyzed by AMRFinderPlus as part of the Pathogen Detection Pipeline. See the AMRFinderPlus wiki for more information on how AMRFinderPlus works and the Pathogen Detection Reference Gene Catalog for a list of the elements that AMRFinderPlus searches for. For additional information, see the MicroBIGG-E Help Documentation.
The context of the sequences shown in the MicroBIGG-E Map
Additional information, including a dataflow diagram, about the sequences used to generate the AMR gene data for the MicroBIGG-E Map can be found on the MicroBIGG-E Map Details page.
What is the MicroBIGG-E map?
The MicroBIGG-E Map has five components:
- The MicroBIGG-E International Map, which, for the alleles, genes, or point mutations (‘elements’) selected by the user, displays the number of times these elements occur in MicroBIGG-E, the number of isolates that contain one or more copies of these elements, and the proportion of total isolates that possess one or more copies of these elements. The International Map also allows users to limit the data used in other displays by selecting one or more countries.
- A histogram of counts for the alleles, genes, or point mutations selected by the user.
- A table displaying allele/gene/point mutation counts by country (only displayed if countries are selected using the MicroBIGG-E International Map).
- A display of the proportion and counts of selected alleles/genes/point mutations by year of addition to the Pathogen Detection system.
- A display of the proportion and counts of selected alleles/genes/point mutations by year of isolate collection.
This data resource uses assembled genome sequence data for bacterial pathogens in GenBank with acquired resistance alleles or genes identified by AMRFinderPlus and with geographic location metadata. The presence/absence of selected gene families and individual genes and alleles, with respect to geography, needs to be carefully interpreted in the context of the sequence data generated from each country. These sequences might not reflect global patterns of phenotypic resistance. Due to differences in collection and decisions to sequence genomes, within- and cross-country data should be interpreted with caution.
AMR genes in the ‘plus’ category, virulence genes, and stress response genes are not currently included. If you are interested in these additional categories, email us at: pd-help@ncbi.nlm.nih.gov.
Using the MicroBIGG-E Map
Selecting Organism Group
To begin, users should select a particular Organism Group from the dropdown menu (an Organism Group may consist of one or more species - use the "scientific_name" filter to view the species within that organism group). Users also may select “All” to examine the data for isolates from all of the Organism Groups. The total number of isolates for each selection is shown in parentheses in the selection box. More information on Organism Group is available here.
Filters
Once a set of isolates has been selected for analysis, the data can be filtered using the Available filters box by the following fields:
- Class of antibiotic: "Class" provides a broad definition of the phenotype affected by the gene or allele. More information about class and subclass fields can be found on the AMRFinderPlus wiki.
- Collection date: Date the sample was collected, in the format the submitter supplied. This may differ from the time the data were submitted to INSDC and from the time the data were added to the Pathogen Detection project. For real-time submissions of pathogen surveillance data, these dates will be close together. For legacy data or research projects, these dates may differ widely and be separated by years. Importantly, for some isolates these data are missing, unlike Create date, where every isolate has a first date of uptake into the Pathogen Detection system.
- Create date: The date on which this isolate was first seen by the Pathogen Detection system, in the format: YYYY-MM-DD. Note, these dates are in ISO format.
- Element symbol (gene name): The symbol assigned to the element by AMRFinderPlus. Examples include an allele symbol (blaKPC-2), a protein symbol (blaKPC), and a point mutation (gyrA_S83I).
- Host: The host species, if provided by the submitter. This field contains values exactly as they were entered by the data submitters. Some submitters might have entered a scientific name while others might have entered a common name; therefore, search for synonyms if you would like to retrieve more comprehensive results. Data field names and values are case sensitive.
- Isolate: Pathogen Detection accession of the isolate. The accession begins with the prefix "PDT," which stands for Pathogen Detection Target. This database is the primary resource issuing PDT accessions. Additional information can be found here.
- Isolation source: Describes the physical, environmental and/or local geographical source of the biological sample from which the sample was derived, if provided by the submitter. This field contains values exactly as they were entered by the data submitters. Data field names and values are case sensitive. Additional information can be found here.
- Isolation type: Isolation type of an isolate: clinical OR environmental/other OR NULL. Note, this field is derived from the attribute package selected by the isolate's submitter using one of the Pathogen templates in BioSample.
- If attribute_package=Pathogen.cl.1.0 then isolation type is clinical.
- If attribute_package=Pathogen.env.1.0 then isolation type is environmental/other, unless host or isolation_source indicates that it was isolated from a human subject in which case isolation type is clinical.
- If neither of these packages is used then isolation type is NULL. Additional information can be found here.
- Scientific name: Scientific name (in NCBI Taxonomy) of the isolate from the submitter. To search for specific taxa, you can enter the genus name (or the full genus and species name) of the pathogen, with the first letter of the genus capitalized.
- Subclass of antibiotic: Where it is known, "Subclass" provides a more specific definition of the particular antibiotics or classes that are affected by the gene or point mutation (e.g., that are resisted by the gene/allele). While most subclass designations are self-explanatory, a few others have particular meanings. Specifically, "CEPHALOSPORIN" is equivalent to the Lahey 2be definition; "CARBAPENEM" means the protein has carbapenemase activity, but it might or might not confer resistance to other beta-lactams. Where the phenotypic information is incomplete, contradictory, or unclear, the "Class" value is used for the "Subclass" value. More information about the class and subclass fields can be found on the AMRFinderPlus wiki.
- Subtype: Classification for the subtype of gene found. A more detailed description of the type and subtype fields is available on the AMRFinderPlus wiki. Data field names and values are case sensitive, and the values for this data field are written in upper case. Additional information can be found here.
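The Isolation type rule described in the filter list above can be sketched as a small function. This is only an illustration of the stated logic, not the Pathogen Detection implementation: the function name, argument names, and in particular the `HUMAN_HINTS` keyword check are assumptions, since the documentation does not say how a human origin is recognized from host or isolation_source.

```python
from typing import Optional

# Hypothetical keyword list: how the real pipeline decides that host or
# isolation_source "indicates a human subject" is not documented here.
HUMAN_HINTS = ("homo sapiens", "human")


def isolation_type(attribute_package: Optional[str],
                   host: Optional[str] = None,
                   isolation_source: Optional[str] = None) -> Optional[str]:
    """Return 'clinical', 'environmental/other', or None (NULL)."""
    if attribute_package == "Pathogen.cl.1.0":
        return "clinical"
    if attribute_package == "Pathogen.env.1.0":
        # An environmental package is overridden when the free-text
        # metadata points to a human subject.
        text = f"{host or ''} {isolation_source or ''}".lower()
        if any(hint in text for hint in HUMAN_HINTS):
            return "clinical"
        return "environmental/other"
    # Any other package (or none) gives NULL.
    return None
```

For example, `isolation_type("Pathogen.env.1.0", host="Homo sapiens")` returns `"clinical"`, while the same package with `host="Bos taurus"` returns `"environmental/other"`.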
Country Selection
To limit the display to individual countries, select the countries of interest on the MicroBIGG-E International Map panel. Each selected country appears as a row in the Selected countries table. Users can reset the country selection by clicking the Reset Countries button in the upper right-hand corner of the MicroBIGG-E International Map.
MicroBIGG-E Map Displays
MicroBIGG-E International Map Panel
Sliding the cursor over a country will display:
- the number of times that element occurs in MicroBIGG-E.
- the number of isolates that contain one or more copies of the element in MicroBIGG-E.
- the percentage of isolates that possess one or more copies of the element in MicroBIGG-E.
- the total number of isolates from that country in MicroBIGG-E that belong to the selected Organism Group.
Users can increase or decrease the size of the map within the window using the plus/minus button in the top left corner of the Map Panel.
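The four values shown for each country relate to one another as in the following sketch. It assumes a simplified, hypothetical input (one `(isolate, country, element_symbol)` tuple per element occurrence, plus a dictionary of per-country isolate totals for the selected Organism Group); the isolate identifiers and counts are made up, and the real map is computed by the Pathogen Detection backend, not by this code.

```python
from collections import defaultdict

# Toy data: one row per element occurrence (identifiers are made up).
rows = [
    ("isolate_A", "USA", "blaKPC-2"),
    ("isolate_A", "USA", "blaKPC-2"),   # second copy in the same isolate
    ("isolate_B", "USA", "blaKPC-2"),
    ("isolate_C", "China", "blaKPC-2"),
]
# Toy totals: all isolates per country in the selected Organism Group.
totals = {"USA": 4, "China": 2}


def country_summary(rows, totals, element):
    """Compute the four per-country values for one element symbol."""
    instances = defaultdict(int)   # every occurrence counts
    isolates = defaultdict(set)    # each isolate counts once
    for isolate, country, symbol in rows:
        if symbol == element:
            instances[country] += 1
            isolates[country].add(isolate)
    summary = {}
    for country, total in totals.items():
        n = len(isolates[country])
        summary[country] = {
            "instances": instances[country],
            "isolates_with_element": n,
            "percent": 100.0 * n / total if total else 0.0,
            "total_isolates": total,
        }
    return summary


summary = country_summary(rows, totals, "blaKPC-2")
```

With the toy data, the USA has 3 instances in 2 isolates out of 4, so the percentage is 50.0.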
Number of Isolates with a Given Element Panel
This panel displays a histogram in which the values reflect the number of isolates with that element. If an isolate has two or more copies of the element, it is only counted once. If an isolate has two different displayed alleles/genes, then it is counted once for each element. For beta-lactamases, the family-only label (e.g., "blaTEM") indicates the number of isolates with an unassigned member of that family; it does not indicate the total number of isolates with alleles belonging to that family. Note these counts include not only exact matches (Method: ALLELE and EXACT) or close BLAST matches to reference genes (Method: BLAST), but also partial sequences (Method: PARTIAL and PARTIAL_END_OF_CONTIG), sequences with internal stops (Method: INTERNAL_STOP), and sequences identified only by HMMs (Method: HMM).
Each bar is colored by the subclass assigned to it based on the Subclass field in MicroBIGG-E. If there are too many element symbols to fit in the display, you can use the arrows in the top left-hand corner of the display to paginate through multiple pages.
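The counting rule for the histogram (an isolate counts once per element, regardless of copy number, but once for each distinct element it carries) can be illustrated with a short sketch. The input shape, one `(isolate, element_symbol)` pair per occurrence, and the function name are assumptions made for the example.

```python
from collections import Counter


def isolates_per_element(rows):
    """Count isolates per element symbol.

    rows: iterable of (isolate, element_symbol), one per occurrence.
    Duplicate pairs (multiple copies in one isolate) collapse to one.
    """
    unique_pairs = set(rows)
    return Counter(symbol for _isolate, symbol in unique_pairs)


# Toy data with made-up isolate identifiers.
example = [
    ("isolate_A", "blaTEM-1"),
    ("isolate_A", "blaTEM-1"),   # second copy: not counted again
    ("isolate_A", "blaKPC-2"),   # different element: counted separately
    ("isolate_B", "blaKPC-2"),
]
counts = isolates_per_element(example)
```

Here `counts["blaTEM-1"]` is 1 and `counts["blaKPC-2"]` is 2.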
Selected Countries Panel
This is a table in which every country selected in the MicroBIGG-E International Map Panel is represented by a row, with columns representing the following:
- the number of times that element occurs in MicroBIGG-E.
- the number of isolates that contain one or more copies of the element in MicroBIGG-E.
- the percentage of isolates that possess one or more copies of the element in MicroBIGG-E.
- the total number of isolates from that country in MicroBIGG-E that belong to the selected Organism Group.
Proportion and counts of selected alleles/genes by year of addition to Pathogen Detection
These data are derived from the Create date column found in the Isolates Browser. The solid line indicates the percentage of isolates containing the selected elements, as a total number of selected isolates, in the year displayed on the x-axis, as a proportion of all years. The purple line indicates the total number of isolates from that year; note the purple line is not cumulative.
Proportion and counts of selected alleles/genes by year of collection
These data are derived from the Collection date column found in the Isolates Browser. The solid line indicates the percentage of isolates containing the selected elements, as a total number of selected isolates, in the year displayed on the x-axis, as a proportion of all years. The purple line indicates the total number of isolates from that year; note the purple line is not cumulative. If a submitter did not provide a collection date, then that isolate will not be included.
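A rough sketch of grouping isolates by collection year follows. Two points are assumptions rather than documented behavior: the year is pulled from the free-text collection date with a simple four-digit pattern, and the percentage is computed as the share of that year's isolates carrying the selected element, which is only one reading of the description above. Isolates without a usable collection date are skipped, as the text states.

```python
import re
from collections import Counter

# Assumed rule: the first standalone 4-digit year (19xx or 20xx) in the text.
YEAR = re.compile(r"(?<!\d)(?:19|20)\d{2}(?!\d)")


def collection_year(value):
    """Return the year as an int, or None if no year can be found."""
    if not value:
        return None
    match = YEAR.search(value)
    return int(match.group(0)) if match else None


def yearly_summary(records):
    """records: iterable of (collection_date, has_element).

    Returns {year: (total_isolates, percent_with_element)}.
    Totals are per year, not cumulative.
    """
    totals, hits = Counter(), Counter()
    for date, has_element in records:
        year = collection_year(date)
        if year is None:
            continue  # no collection date: isolate is not included
        totals[year] += 1
        if has_element:
            hits[year] += 1
    return {y: (totals[y], 100.0 * hits[y] / totals[y]) for y in sorted(totals)}


# Toy records with mixed submitter formats.
records = [
    ("2019-05-01", True),
    ("2019", False),
    ("", True),
    (None, False),
    ("2020-03", True),
    ("missing", False),
]
by_year = yearly_summary(records)
```

With the toy records, `by_year` is `{2019: (2, 50.0), 2020: (1, 100.0)}`; the three records without a parsable date are dropped.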
Use cases/sample searches of MicroBIGG-E
- How To: Find, using the MicroBIGG-E Map, blaKPC-containing Klebsiella pneumoniae isolates from China and the U.S. in the Isolates Browser (.pptx)
- How To: Download, using the MicroBIGG-E Map, blaKPC nucleotide sequences from Klebsiella pneumoniae isolates from China and the U.S. (.pptx)
Downloading Figures
Each of the four graphic panels (not the Selected Countries Panel) can be downloaded as a .png file by hovering over the panel of interest, right-clicking, and selecting “Download image”.
Interaction with other NCBI tools
Having selected a set of isolates using the filters (and the MicroBIGG-E International Map Panel, if desired), users can view those isolates either in the Isolates Browser or MicroBIGG-E. To use these features, users must be logged in to MyNCBI. Use the login button in the upper right of the screen to log in, or to set up an account if you don't have one already.
To view isolates with the selected genes in the Isolates Browser, select the buttons adjacent to “View selected isolates in…” in the blue bar at the top. This cross-browser selection will display the isolates containing the selected elements in the Isolates Browser, but for the MicroBIGG-E selection, the subset of rows will be for the isolate-element_symbol pairings (i.e., the rows displayed will be restricted to the elements indicated by the filter selections made in the map for the organisms of interest). To examine all elements for a set of isolates based on selections made in the map, first go to the Isolates Browser, and then to MicroBIGG-E.
Update Frequency
The database behind the MicroBIGG-E Map is updated daily. The contents of the MicroBIGG-E Map may not agree exactly with those shown in the MicroBIGG-E web browser due to update scheduling differences. If you find unexpected discrepancies, please let us know by emailing us at pd-help@ncbi.nlm.nih.gov.