Series GSE118704
Status Public on Aug 18, 2018
Title Designing a single cell RNA sequencing benchmark dataset to compare protocols and analysis methods (Cel_Seq)
Organism Homo sapiens
Experiment type Expression profiling by high throughput sequencing
Summary Single cell RNA sequencing (scRNA-seq) technology has undergone rapid development in recent years and brings new challenges in data processing and analysis. This has led to an explosion of tailored analysis methods for scRNA-seq to address various biological questions. However, the current lack of gold-standard benchmarking datasets makes it difficult for researchers to evaluate the performance of the many available methods in a systematic manner. Here, we designed and generated a cross-platform benchmark dataset that has built-in ground truth in various forms and varying levels of biological noise. We used this dataset to compare different protocols and data analysis methods. We found that data quality differs between protocols and that ERCC spike-ins behave independently of endogenous RNA. We found significant differences in the results from the methods compared, and we associated the results with data characteristics to identify methods that perform well in different situations. Our dataset and analysis provide a valuable resource for algorithm selection in different biological settings.
 
Overall design Our experiment used the 3 human lung adenocarcinoma cell lines H2228, H1975 and HCC827, and included mixtures of RNA and single cells from these cell lines. For the single cell designs, the three cell lines were mixed in equal proportions and processed by 10x Chromium, Drop-seq and CEL-seq2, referred to as sc_10X, sc_Drop-seq and sc_CEL-seq2 respectively in the analysis that follows. For the mixture designs, we used plate-based protocols to mix and dilute samples in 2 different ways. In the cell mixture experiment, mixtures of 9 cells from the 3 cell lines were sorted in different combinations and data were generated by CEL-seq2. The material pooled from 384 wells was subsampled to either 1/9 or 1/3 to simulate cells of different sizes, with PCR product clean-up ratios ranging from 0.7 to 0.9; these datasets are referred to as cellmix1 to cellmix4. For the cell mixture experiment, we also sorted wells with 10 times more cells (90 cells) to provide a pseudo-bulk reference for each mixture (referred to as cellmix5). In the RNA mixture experiment, distinct RNA mixtures were diluted to create single cell equivalents (3.75, 7.5, 15 or 30 pg per well) and processed using CEL-seq2 and SORT-seq (referred to as RNAmix_CEL-seq2 and RNAmix_Sort-seq). This is the RNAmix_CEL-seq2 dataset.
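The subsampling to 1/9 or 1/3 described in the overall design was performed on the pooled material itself. As a rough in-silico analogue only, the sketch below thins a vector of per-gene counts by keeping each molecule independently with a fixed probability (binomial thinning). The function name `thin_counts`, the seed handling and the example counts are all hypothetical and are not taken from the submitters' pipeline.

```python
import random


def thin_counts(counts, fraction, seed=0):
    """Binomially thin integer counts: keep each molecule with probability `fraction`.

    Hypothetical helper for illustration; the actual experiment subsampled
    physical material, not count matrices.
    """
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must be between 0 and 1")
    rng = random.Random(seed)  # seeded so the result is reproducible
    thinned = []
    for count in counts:
        # One Bernoulli draw per molecule; the number kept is binomial(count, fraction).
        kept = sum(1 for _ in range(count) if rng.random() < fraction)
        thinned.append(kept)
    return thinned


# Made-up counts for five genes in one well, thinned to the two fractions
# mentioned in the design.
example_counts = [90, 45, 9, 0, 300]
one_third = thin_counts(example_counts, 1 / 3)
one_ninth = thin_counts(example_counts, 1 / 9)
```

Each thinned value can be at most the original count, and genes with zero counts stay at zero. Comparing `one_third` and `one_ninth` gives a feel for how strongly lower capture depth affects lowly expressed genes.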
 
Contributor(s) Tian L, Amann-Zalcenstein D, Ritchie ME, Su S
Citation(s) 30096152, 31133762
Submission date Aug 17, 2018
Last update date Dec 01, 2021
Contact name Shian Su
E-mail(s) su.s@wehi.edu.au
Organization name Walter and Eliza Hall Institute of Medical Research
Department Molecular Medicine
Lab Ritchie Lab
Street address 1G Royal Parade
City Melbourne
State/province Victoria
ZIP/Postal code 3052
Country Australia
 
Platforms (1)
GPL18573 Illumina NextSeq 500 (Homo sapiens)
Samples (1)
GSM3336845 CelSeq2_SC_383_Samples
This SubSeries is part of SuperSeries:
GSE118767 Designing a single cell RNA sequencing benchmark dataset to compare protocols and analysis methods
Relations
BioProject PRJNA486514
SRA SRP158265

Download family Format
SOFT formatted family file(s) SOFT
MINiML formatted family file(s) MINiML
Series Matrix File(s) TXT

Supplementary file Size Download File type/resource
GSE118704_RAW.tar 2.4 Mb (http)(custom) TAR (of CSV, TXT)
SRA Run Selector
Raw data are available in SRA
Processed data provided as supplementary file
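Once GSE118704_RAW.tar has been downloaded, a first step is usually to see which CSV and TXT files it contains. The sketch below uses only Python's standard `tarfile` module and assumes the archive has already been saved locally; the function name `list_archive_members` and the local path in the usage comment are hypothetical, and the member file names are not listed on this page, so none are assumed here.

```python
import tarfile


def list_archive_members(tar_path):
    """Return (name, size_in_bytes) for every regular file in a tar archive."""
    with tarfile.open(tar_path) as archive:
        return [
            (member.name, member.size)
            for member in archive.getmembers()
            if member.isfile()  # skip directories and links
        ]


# Usage, assuming the archive was saved in the current directory:
# for name, size in list_archive_members("GSE118704_RAW.tar"):
#     print(name, size)
```

Listing members before extracting makes it easy to pick out only the count tables or annotation files that are needed.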
