|
|
|
Status |
Public on Dec 18, 2019 |
Title |
Bonn Dataset 3 of meta-analysis on AML classification, RNAseq dataset |
Sample organism |
Homo sapiens |
Experiment type |
Expression profiling by high throughput sequencing; Third-party reanalysis
|
Summary |
The present dataset ("dataset 3") is a subset of a large metastudy on AML classification. It contains normalized gene expression values of 1181 samples. In total, three datasets were generated, each containing data from a different platform: dataset 1 (Affymetrix HG-U133 A microarrays), dataset 2 (Affymetrix HG-U133 2.0 microarrays) and dataset 3 (RNA-seq). Dataset 3 was generated using the following strategy: All datasets published in the National Center for Biotechnology Information Gene Expression Omnibus (GEO) on 20 September 2017 were reviewed for inclusion in the present study. Basic criteria for inclusion were the cell type under study (human peripheral blood mononuclear cells (PBMCs) and/or bone marrow samples) as well as the species (Homo sapiens). Furthermore, GEO SuperSeries were excluded to avoid duplicated samples. We filtered the datasets for data generated with high-throughput RNA sequencing (RNA-seq) and excluded studies with very small sample sizes (< 10 samples). We then applied a disease-specific search, in which we filtered for acute myeloid leukemia, other leukemia and healthy or non-leukemia-related samples. The results of this search strategy were then internally reviewed and data were excluded based on the following criteria: (i) exclusion of duplicated samples, (ii) exclusion of studies that sorted single cell types (e.g. T cells or B cells) prior to gene expression profiling, (iii) exclusion of studies with inaccessible data. Other than that, no studies were excluded from our analysis. In total, the datasets contained samples from the following GSE Series: GSE63085, GSE32874, GSE58335, GSE86884, GSE63703, GSE63646, GSE63816, GSE72790, GSE81259, GSE85712, GSE45735, GSE64655, GSE87186, GSE49642, GSE52656, GSE62190, GSE66917, GSE67039, GSE61162, GSE67184, GSE49601, GSE78785, GSE79970. All raw data files were downloaded from GEO.
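The inclusion and exclusion criteria described above can be read as a sequence of boolean filters. The following is only a minimal sketch of that logic, not the authors' actual screening code: the `Study` record, its field names, and the placeholder accessions are hypothetical, the disease-specific search is omitted, and duplicate removal is approximated at the study level although the record describes it at the sample level.

```python
from dataclasses import dataclass


@dataclass
class Study:
    # Hypothetical record layout; field names are assumptions for illustration.
    accession: str
    organism: str
    tissue: str            # e.g. "PBMC" or "bone marrow"
    platform: str          # e.g. "RNA-seq" or "microarray"
    n_samples: int
    is_superseries: bool
    sorted_cell_type: bool  # True if single cell types were sorted before profiling
    data_accessible: bool


ALLOWED_TISSUES = {"PBMC", "bone marrow"}
MIN_SAMPLES = 10  # studies with fewer than 10 samples are excluded


def is_included(study: Study) -> bool:
    """Apply the basic criteria stated in the summary to one study."""
    return (
        study.organism == "Homo sapiens"
        and study.tissue in ALLOWED_TISSUES
        and not study.is_superseries
        and study.platform == "RNA-seq"
        and study.n_samples >= MIN_SAMPLES
        and not study.sorted_cell_type
        and study.data_accessible
    )


def select_studies(studies):
    """Keep studies that pass all criteria, dropping repeated accessions.

    Deduplication by accession is a stand-in for the sample-level
    duplicate check mentioned in the record.
    """
    seen = set()
    kept = []
    for study in studies:
        if is_included(study) and study.accession not in seen:
            seen.add(study.accession)
            kept.append(study)
    return kept
```

Calling `select_studies` on a list of such records returns the subset that passes every criterion, in input order.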
Transcript abundances were calculated using kallisto version 0.43.0 and all data were normalized with the R package DESeq2 (R version 3.2.4, DESeq2 version 1.12.4) with standard parameters. Genome build hg38 was used for read alignment. No filtering of low-expressed genes was performed.
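For readers unfamiliar with DESeq2, its default size-factor estimation uses the median-of-ratios method. The sketch below re-implements that idea in plain Python purely for illustration. It assumes the normalization referred to above is this size-factor step; the record does not say whether any further transformation was applied, and the actual processing was done in R, not with this code.

```python
import math
from statistics import median


def size_factors(counts):
    """Median-of-ratios size factors.

    counts: list of gene rows, each a list of per-sample counts.
    Genes with a zero count in any sample are skipped, because their
    geometric mean across samples is zero.
    """
    n_samples = len(counts[0])
    log_ratios = [[] for _ in range(n_samples)]
    for row in counts:
        if any(c <= 0 for c in row):
            continue
        # Log of the gene's geometric mean across samples
        log_geo_mean = sum(math.log(c) for c in row) / n_samples
        for j, c in enumerate(row):
            log_ratios[j].append(math.log(c) - log_geo_mean)
    # Per sample: median ratio of its counts to the per-gene reference
    return [math.exp(median(r)) for r in log_ratios]


def normalize(counts):
    """Divide every count by the size factor of its sample."""
    factors = size_factors(counts)
    return [[c / factors[j] for j, c in enumerate(row)] for row in counts]
```

With this scheme, a sample sequenced twice as deeply as another receives a size factor twice as large, so the normalized values of unchanged genes agree between the two samples.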
|
|
|
Overall design |
[Dataset_3_ensembl.txt] count matrix of 1181 samples and 37045 genes (ensembl IDs)
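A quick way to check a downloaded copy against the stated dimensions is to count rows and columns without loading the whole matrix into memory. This is a minimal sketch under assumptions the record does not confirm: the file is taken to be tab-delimited, with one header row of sample IDs and one gene per row, with the Ensembl ID in the first column. The function names are illustrative.

```python
import csv
import gzip


def matrix_shape(handle, delimiter="\t"):
    """Return (n_genes, n_samples) for a delimited expression matrix.

    Handles both header styles: one with a label above the gene ID
    column, and the R write.table style where the header has one
    field fewer than the data rows.
    """
    reader = csv.reader(handle, delimiter=delimiter)
    header = next(reader)
    n_samples = None
    n_genes = 0
    for row in reader:
        if not row:
            continue  # ignore blank lines
        if n_samples is None:
            if len(row) == len(header):
                n_samples = len(header) - 1
            elif len(row) == len(header) + 1:
                n_samples = len(header)
            else:
                raise ValueError("header and data rows do not line up")
        if len(row) - 1 != n_samples:
            raise ValueError(f"unexpected field count in data row {n_genes + 1}")
        n_genes += 1
    return n_genes, (n_samples or 0)


def matrix_shape_from_gz(path):
    """Same as matrix_shape, reading directly from a .gz file."""
    with gzip.open(path, "rt", newline="") as handle:
        return matrix_shape(handle)
```

If the layout assumptions hold, `matrix_shape_from_gz("GSE122515_Dataset_3_ensembl.txt.gz")` should report the 37045 genes and 1181 samples stated above.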
|
|
|
Contributor(s) |
Warnat-Herresthal S, Ulas T, Schultze JL |
Citation(s) |
31918046 |
|
Submission date |
Nov 14, 2018 |
Last update date |
Mar 25, 2020 |
Contact name |
Joachim Schultze |
E-mail(s) |
j.schultze@uni-bonn.de
|
Organization name |
LIMES (Life and Medical Sciences Center Genomics and Immunoregulation)
|
Department |
Genomics and Immunoregulation
|
Street address |
Carl-Troll-Strasse 31
|
City |
Bonn |
State/province |
NRW |
ZIP/Postal code |
53115 |
Country |
Germany |
|
|
This SubSeries is part of SuperSeries: |
GSE122517 |
Bonn Datasets of meta-analysis on AML classification |
|
Supplementary file | Size | Download | File type/resource |
GSE122515_Dataset_3_ensembl.txt.gz | 150.0 Mb | (ftp)(http) | TXT |
GSE122515_README_list_of_all_re-analyzed_GSMs.txt | 12.7 Kb | (ftp)(http) | TXT |
Processed data not provided for this record |
|
|
|
|
|