Series GSE229791
Status Public on Aug 08, 2023
Title Multimodal hierarchical classification allows for efficient annotation of CITE-seq data
Organism Homo sapiens
Experiment type Expression profiling by high throughput sequencing
Summary Single-cell RNA sequencing (scRNA-seq) is an invaluable tool for profiling cells in complex tissues and dissecting activation states that lack well-defined surface protein expression. For immune cells, the transcriptomic profile captured by scRNA-seq cannot always identify cell states and subsets defined by conventional flow cytometry. Emerging technologies have enabled multimodal sequencing of single cells, such as paired sequencing of the transcriptome and surface proteome by CITE-seq, but integrating these high dimensional modalities for accurate cell type annotation remains a challenge in the field. Here, we describe a machine learning tool called MultiModal Classifier Hierarchy (MMoCHi) for the cell-type annotation of CITE-seq data. Our classifier involves several steps: 1) we use landmark registration to remove batch-related staining artifacts in CITE-seq protein expression, 2) the user defines a hierarchy of classifications based on cell type similarity and ontology and provides markers (protein or gene expression) for the identification of ground truth populations within the dataset by threshold gating, 3) progressing through this user-defined hierarchy, we train a random forest classifier using all available modalities (surface proteome and transcriptome data), and 4) we use these forests to predict cell types across the entire dataset. Applying MMoCHi to CITE-seq data of immune cells isolated from eight distinct tissue sites of two human organ donors yields high-purity cell type annotations encompassing the broad array of immune cell states in the dataset. This includes T and B cell memory subsets, macrophages and monocytes, and natural killer cells, as well as rare populations of plasmacytoid dendritic cells, innate T cells, and innate lymphoid cell subsets. We validate the use of feature importances extracted from the classifier hierarchy to select robust genes for improved identification of T cell memory subsets by scRNA-seq.
Together, MMoCHi provides a comprehensive system of tools for the batch-correction and cell-type annotation of CITE-seq data. Moreover, this tool provides flexibility in classification hierarchy design allowing for cell type annotations to reflect a researcher’s specific experimental design. This flexibility also renders MMoCHi readily extendable beyond immune cell annotation, and potentially adaptable to other sequencing modalities.
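The first two steps described in the summary (landmark registration and threshold gating along a user-defined hierarchy) can be illustrated with a minimal sketch. This is not MMoCHi's implementation: the two-landmark linear rescaling, the marker names, the toy hierarchy, the single shared threshold, and the function names are all assumptions chosen for illustration. Cells that cannot be gated unambiguously at a level keep the label of the deepest level reached, which is the kind of cell a trained classifier would later be asked to resolve.

```python
from typing import Dict, List, Optional


def register_landmarks(values: List[float], neg_peak: float, pos_peak: float,
                       target_neg: float = 0.0, target_pos: float = 1.0) -> List[float]:
    """Linearly rescale one batch so its two landmarks land on shared targets.

    Assumes the negative and positive peaks were already located for this batch.
    """
    if pos_peak == neg_peak:
        raise ValueError("landmarks must differ")
    scale = (target_pos - target_neg) / (pos_peak - neg_peak)
    return [target_neg + (v - neg_peak) * scale for v in values]


# Hypothetical hierarchy: each parent maps child names to marker gates.
HIERARCHY: Dict[str, Dict[str, Dict[str, str]]] = {
    "All": {
        "T cell": {"CD3": "pos"},
        "B cell": {"CD3": "neg", "CD19": "pos"},
    },
    "T cell": {
        "CD4 T": {"CD4": "pos", "CD8": "neg"},
        "CD8 T": {"CD4": "neg", "CD8": "pos"},
    },
}
THRESHOLD = 0.5  # assumed gate on registered (0..1) expression values


def passes(cell: Dict[str, float], gates: Dict[str, str]) -> bool:
    """Return True if the cell satisfies every pos/neg marker gate."""
    return all((cell[marker] > THRESHOLD) == (sign == "pos")
               for marker, sign in gates.items())


def label_ground_truth(cell: Dict[str, float], node: str = "All") -> Optional[str]:
    """Descend the hierarchy while exactly one child gate matches."""
    children = HIERARCHY.get(node)
    if not children:
        return node  # leaf reached
    matches = [name for name, gates in children.items() if passes(cell, gates)]
    if len(matches) != 1:
        # Ambiguous or ungated: keep the deepest confident label, if any.
        return None if node == "All" else node
    return label_ground_truth(cell, matches[0])


registered = register_landmarks([2.0, 4.0, 6.0], neg_peak=2.0, pos_peak=6.0)
cd4_cell = {"CD3": 0.9, "CD19": 0.1, "CD4": 0.8, "CD8": 0.2}
double_pos = {"CD3": 0.9, "CD19": 0.1, "CD4": 0.8, "CD8": 0.9}
ungated = {"CD3": 0.1, "CD19": 0.1, "CD4": 0.1, "CD8": 0.1}
```

In this toy setup, only cells with a full leaf label would serve as ground truth for training at that level; the remaining steps (random forest training and prediction) are not shown.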
Overall design We performed CITE-seq on immune cell populations from human blood and different human organ donor tissues.
Contributor(s) Caron D, Wells S, Szabo P, Chen D, Farber D, Sims PA
Citation(s) 37461466
Submission date Apr 14, 2023
Last update date Aug 08, 2023
Contact name Peter A Sims
Organization name Columbia University
Street address 3960 Broadway, Lasker 203AC
City New York
State/province NY
ZIP/Postal code 10032
Country USA
Platforms (2)
GPL18573 Illumina NextSeq 500 (Homo sapiens)
GPL24676 Illumina NovaSeq 6000 (Homo sapiens)
Samples (102)
GSM7177713 D496 GEX library 1
GSM7177714 D496 GEX library 2
GSM7177715 D496 GEX library 3
BioProject PRJNA955827

Download family                      Format
SOFT formatted family file(s)        SOFT
MINiML formatted family file(s)      MINiML
Series Matrix File(s)                TXT

Supplementary file                    Size      Download     File type/resource
GSE229791_D496.adt.matrix.txt.gz      11.9 Mb   (ftp)(http)  TXT
GSE229791_D496.gex.matrix.txt.gz      276.3 Mb  (ftp)(http)  TXT
GSE229791_D503.adt.matrix.txt.gz      11.0 Mb   (ftp)(http)  TXT
GSE229791_D503.gex.matrix.txt.gz      300.5 Mb  (ftp)(http)  TXT
GSE229791_PDC101.adt.matrix.txt.gz    1.7 Mb    (ftp)(http)  TXT
GSE229791_PDC101.gex.matrix.txt.gz    17.7 Mb   (ftp)(http)  TXT
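
The download links for the supplementary files did not survive in this listing. As a rough sketch, the URLs can be rebuilt from the accession and file names, assuming GEO's usual series directory layout (the last three digits of the accession replaced by "nnn", followed by a "suppl" folder). The base URL and the helper name `geo_suppl_url` are assumptions and should be checked against the live record before use; the code only builds strings and does not download anything.

```python
import re

# Assumed base of the GEO series download area.
GEO_BASE = "https://ftp.ncbi.nlm.nih.gov/geo/series"

# File names copied from the supplementary file table.
SUPPLEMENTARY_FILES = [
    "GSE229791_D496.adt.matrix.txt.gz",
    "GSE229791_D496.gex.matrix.txt.gz",
    "GSE229791_D503.adt.matrix.txt.gz",
    "GSE229791_D503.gex.matrix.txt.gz",
    "GSE229791_PDC101.adt.matrix.txt.gz",
    "GSE229791_PDC101.gex.matrix.txt.gz",
]


def geo_suppl_url(accession: str, filename: str) -> str:
    """Build the presumed URL of a series supplementary file (hypothetical helper)."""
    if not re.fullmatch(r"GSE\d+", accession):
        raise ValueError(f"not a GEO series accession: {accession!r}")
    digits = accession[3:]
    # Directory stub: drop the last three digits and append "nnn".
    stub = "GSE" + digits[:-3] + "nnn"
    return f"{GEO_BASE}/{stub}/{accession}/suppl/{filename}"


urls = [geo_suppl_url("GSE229791", name) for name in SUPPLEMENTARY_FILES]
```

Each entry in `urls` can then be passed to any HTTP client; the ADT and GEX matrices for a given donor share the same prefix, which makes pairing them straightforward.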
Raw data are available in SRA
Processed data are available on Series record
