Series GSE268099
Status Public on May 28, 2024
Title Treehouse compendium of polyA selected RNA-Seq gene expression data from 12,747 tumors
Organism Homo sapiens
Experiment type Expression profiling by high throughput sequencing
Third-party reanalysis
Summary We uniformly analyze sequence data to generate a resource for comparative gene expression studies. Specifically, we obtained access to primary RNA sequence data from repositories and clinical partners, consistently processed the data, harmonized metadata, and released the expression values and metadata without access restrictions.
 
Overall design The data contain 12,747 consistently processed gene expression datasets from 34 studies. Gene expression in each sample is uniformly quantified using the dockerized TOIL RNA-Seq pipeline, versions 3.2 to 3.4.1 (Vivian et al., 2017); all of these versions produce bitwise identical RSEM gene expression outputs. The pipeline uses RSEM version 1.2.25 (Li and Dewey, 2011) for quantification after aligning reads with STAR v2.4.2a (Dobin et al., 2013) using indices generated from the human reference genome GRCh38 and the human gene models GENCODE 23, as described at https://github.com/UCSC-Treehouse/pipelines. Quality is assessed with the MEND pipeline, https://github.com/UCSC-Treehouse/mend_qc (Beale et al., 2021).
Data processing steps were as follows:

Adapters are removed with CutAdapt v1.9 (Martin, 2011).
Reads are aligned by STAR v2.4.2a using indices generated from the human reference genome GRCh38 and the human gene models GENCODE 23 (Dobin et al., 2013).
RSEM 1.2.25 is used to quantify gene expression (Li and Dewey, 2011).
Gene level expression in TPM is log transformed: log2(TPM+1).
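
The final transformation step can be illustrated with a minimal Python sketch, assuming plain floating-point TPM values. The function names are hypothetical and are not part of the Treehouse pipeline; the inverse is included because released values are on the log2(TPM+1) scale and may need converting back to TPM.

```python
import math


def tpm_to_log2(tpm: float) -> float:
    """Apply the log2(TPM+1) transform used for the released values."""
    if tpm < 0:
        raise ValueError("TPM must be non-negative")
    return math.log2(tpm + 1.0)


def log2_to_tpm(value: float) -> float:
    """Invert the transform: TPM = 2**value - 1."""
    return 2.0 ** value - 1.0


if __name__ == "__main__":
    # Print a few TPM values alongside their transformed equivalents.
    for tpm in (0.0, 1.0, 3.0, 1023.0):
        print(tpm, tpm_to_log2(tpm))
```

The +1 offset keeps genes with zero TPM at zero on the transformed scale instead of producing an undefined logarithm.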

genome build: GRCh38
processed data files format and content: Gene level expression in TPM is log transformed: log2(TPM+1)
 
Contributor(s) Beale HC, Learned K, Kephart E, Jariwala S, Antilla R, Cheney A, Lyle AG, Vasquez Y, Sanders L, Haussler D, Salama SR, Vaske OM
Submission date May 22, 2024
Last update date May 28, 2024
Contact name Olena Morozova Vaske
Organization name UCSC
Street address MCDB, Sinsheimer Labs, 1156 High Street
City Santa Cruz
State/province CA
ZIP/Postal code 95064
Country USA
 

Data table header descriptions
th_dataset_id
disease
age_at_dx
pedaya
sex
study_id
study_accession
study_donor_id
study_dataset_id
organism

Data table
th_dataset_id | disease | age_at_dx | pedaya | sex | study_id | study_accession | study_donor_id | study_dataset_id | organism
TH03_0010_S01 | acute leukemia of ambiguous lineage | 11 | Yes, age < 30 years | female | TH03 | unavailable | N/A | N/A | Homo sapiens
TH03_0010_S02 | acute leukemia of ambiguous lineage | 11 | Yes, age < 30 years | female | TH03 | unavailable | N/A | N/A | Homo sapiens
THR33_1000_S01 | medulloblastoma | 7 | Yes, age < 30 years | female | THR33 | EGAD00001003279 | N/A | N/A | Homo sapiens
THR33_1001_S01 | medulloblastoma | 5 | Yes, age < 30 years | male | THR33 | EGAD00001003279 | N/A | N/A | Homo sapiens
THR33_1002_S01 | medulloblastoma | 5 | Yes, age < 30 years | female | THR33 | EGAD00001003279 | N/A | N/A | Homo sapiens
THR33_1003_S01 | medulloblastoma | 3 | Yes, age < 30 years | female | THR33 | EGAD00001003279 | N/A | N/A | Homo sapiens
THR33_1004_S01 | medulloblastoma | 26 | Yes, age < 30 years | male | THR33 | EGAD00001003279 | N/A | N/A | Homo sapiens
THR33_1005_S01 | medulloblastoma | 10 | Yes, age < 30 years | male | THR33 | EGAD00001003279 | N/A | N/A | Homo sapiens
THR33_1006_S01 | medulloblastoma | 3 | Yes, age < 30 years | male | THR33 | EGAD00001003279 | N/A | N/A | Homo sapiens
THR33_1007_S01 | medulloblastoma | 27 | Yes, age < 30 years | male | THR33 | EGAD00001003279 | N/A | N/A | Homo sapiens
THR33_1008_S01 | medulloblastoma | 4 | Yes, age < 30 years | female | THR33 | EGAD00001003279 | N/A | N/A | Homo sapiens
TH03_0103_S01 | spindle cell/sclerosing rhabdomyosarcoma | 8 | Yes, age < 30 years | unknown | TH03 | unavailable | N/A | N/A | Homo sapiens
TH03_0104_S01 | hepatoblastoma | 0.33 | Yes, age < 30 years | unknown | TH03 | unavailable | N/A | N/A | Homo sapiens
TH03_0105_S01 | spindle cell/sclerosing rhabdomyosarcoma | 17 | Yes, age < 30 years | unknown | TH03 | unavailable | N/A | N/A | Homo sapiens
TH03_0106_S01 | Ewing sarcoma | 15 | Yes, age < 30 years | unknown | TH03 | unavailable | N/A | N/A | Homo sapiens
TH03_0107_S01 | hepatoblastoma | 1 | Yes, age < 30 years | unknown | TH03 | unavailable | N/A | N/A | Homo sapiens
TH03_0011_S01 | acute lymphoblastic leukemia | 0.2 | Yes, age < 30 years | male | TH03 | unavailable | N/A | N/A | Homo sapiens
THR33_1115_S01 | medulloblastoma | 20 | Yes, age < 30 years | male | THR33 | EGAD00001003279 | N/A | N/A | Homo sapiens
THR33_1116_S01 | medulloblastoma | 31 | No | female | THR33 | EGAD00001003279 | N/A | N/A | Homo sapiens
THR33_1117_S01 | medulloblastoma | 42 | No | female | THR33 | EGAD00001003279 | N/A | N/A | Homo sapiens

Total number of rows: 12747

Table truncated, full table size 1421 Kbytes.
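
In the rows shown above, each th_dataset_id begins with the value in the study_id column (for example, THR33_1000_S01 belongs to study THR33). The sketch below splits an identifier into three underscore-separated parts. The field names donor and sample, and the assumption that every identifier has exactly three parts, are inferred from the preview rows only and are not documented in this record.

```python
from typing import NamedTuple


class DatasetId(NamedTuple):
    study_id: str  # matches the study_id column in the preview rows
    donor: str     # hypothetical name for the middle component
    sample: str    # hypothetical name for the trailing component, e.g. "S01"


def parse_dataset_id(th_dataset_id: str) -> DatasetId:
    """Split an identifier such as 'THR33_1000_S01' into its three parts."""
    parts = th_dataset_id.split("_")
    if len(parts) != 3:
        raise ValueError(f"unexpected identifier layout: {th_dataset_id!r}")
    return DatasetId(*parts)


if __name__ == "__main__":
    print(parse_dataset_id("TH03_0010_S02"))
```

Under this assumed layout, TH03_0010_S01 and TH03_0010_S02 share the first two components and differ only in the last one.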




Download family | Format
SOFT formatted family file(s) | SOFT
MINiML formatted family file(s) | MINiML
Series Matrix File(s) | TXT

Supplementary file | Size | File type/resource
GSE268099_TumorCompendium_v11_PolyA_hugo_log2tpm_58581genes_2020-04-09.tsv.gz | 1.2 Gb | TSV
GSE268099_clinical_TumorCompendium_v11_PolyA_2020-04-09_updated_for_GEO_20240520_152626.tsv.gz | 144.7 Kb | TSV
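
Once downloaded, the clinical file could be read with a short Python sketch like the one below. It assumes the file is a gzip-compressed, tab-delimited table whose header row carries the column names listed under "Data table header descriptions"; the function names and the local path are hypothetical.

```python
import csv
import gzip
from collections import Counter
from typing import Dict, Iterable


def read_clinical(lines: Iterable[str]) -> Dict[str, Dict[str, str]]:
    """Parse tab-delimited rows into a dict keyed by th_dataset_id."""
    reader = csv.DictReader(lines, delimiter="\t")
    return {row["th_dataset_id"]: row for row in reader}


def load_clinical_gz(path: str) -> Dict[str, Dict[str, str]]:
    """Open a local .tsv.gz copy of the clinical file in text mode and parse it."""
    with gzip.open(path, "rt", encoding="utf-8", newline="") as handle:
        return read_clinical(handle)


def count_by_disease(clinical: Dict[str, Dict[str, str]]) -> Counter:
    """Tally samples per value of the disease column."""
    return Counter(row["disease"] for row in clinical.values())
```

For example, calling load_clinical_gz on a local copy of the clinical file and passing the result to count_by_disease would give the number of samples per disease label. All values are kept as strings, so age_at_dx needs an explicit conversion before any numeric comparison.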
