Sample GSM4054634
Status Public on Nov 27, 2020
Title iCell_ATAC_KCl-0hr_B1_1
Sample type SRA
 
Source name iCell neurons, ATAC-Seq, unstimulated, biorep 1, tech rep 1
Organism Homo sapiens
Characteristics tissue: cultured human iPSC-derived neurons at ~DIV16
time post-kcl stimulation: unstimulated
biological replicate: 1
technical replicate: 1
chip epitope: NA
Treatment protocol KCl membrane depolarization
Growth protocol Manufacturer's protocol.
Extracted molecule genomic DNA
Extraction protocol nuclear isolation from homogenized cultures
Nextera DNA Library Prep Kit (Illumina).
 
Library strategy ATAC-seq
Library source genomic
Library selection other
Instrument model Illumina NextSeq 500
 
Description iCell GABANeurons
iCell_ATAC_KCl-0hr_B1-2-3_ext200bp_norm10M_Boulting.bw
Data processing For total RNA sequencing of human iCell cultures, strand-specific single-end reads, nominally sequenced at 75 bp but occasionally shorter, were 3'-truncated to 70 bp and filtered to include only reads with unambiguous base calls (the ~0.1% of reads shorter than 70 bp were discarded). Reads were then aligned to the hg19 or hg38 genome using BWA (v0.7.8), allowing up to 2 mismatches and zero gaps, with otherwise default parameters. The usual 24 chromosomal and MT alignment targets were supplemented with ~6.1 million short sequences comprising all possible intragenic subsets of ordered exons from the GRCh37.p5 RefSeq annotation, such that each sequence is of minimal length yet accommodates a 70-bp read spanning all junctions between consecutive exons within a subset; likewise, ~9.6 million such splice-junction sequences were included for the GRCh38.p2 RefSeq annotation. Typically 85-95% of all reads in each library were mappable and ~85-90% of these aligned uniquely; nonuniquely mapped reads were discarded.
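The truncation and filtering step described above can be illustrated with a minimal Python sketch. It is not the submitters' code: the function name `truncate_and_filter`, the tuple layout of a read, and the choice to test for ambiguous bases after truncation (rather than before) are all assumptions made for illustration.

```python
from typing import Iterable, Iterator, Tuple

READ_LEN = 70  # truncation length stated in the record

Read = Tuple[str, str, str]  # (name, sequence, quality); hypothetical layout


def truncate_and_filter(reads: Iterable[Read], read_len: int = READ_LEN) -> Iterator[Read]:
    """Truncate reads at the 3' end to read_len and drop unusable reads.

    Reads shorter than read_len are discarded. Reads whose truncated
    sequence contains anything other than A, C, G, or T are discarded
    (assumption: ambiguity is checked after truncation).
    """
    for name, seq, qual in reads:
        if len(seq) < read_len:
            continue  # too short to truncate to the common length
        seq, qual = seq[:read_len], qual[:read_len]
        if set(seq.upper()) <= set("ACGT"):
            yield name, seq, qual
```

Under this assumption, a 75-bp read with an N only in its last five bases is kept, because the N is removed by truncation.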
Aligned total RNA-Seq reads were further analyzed using the in-house program MAPtoFeatures (M2F), which produces expression levels for individual genes and their exons. The following applies separately to the RefSeq annotations for human Build 37.3 assembly (7 Sept 2011) GRCh37.p5 and human assembly GRCh38.p2 (12 March 2015). The exons for all transcripts assigned to a gene based on RefSeq were merged (unioned); supplemented with the constructed library of splice-junction sequences, these defined each gene's complete mRNA target region. The total number of bases of all reads that overlapped a gene's exonic region was divided by the region's total length to yield an average read Density (coverage). Densities were normalized to a standard of 10M 35-bp reads by multiplying the raw Density by (10M/R)*(35/70) (for 70-bp reads), where R = total number of uniquely mapped reads in a sample that did not overlap any RepeatMasker rRNA elements. (RPKM units are proportional to normalized Density units: RPKM = Density/0.35.) Relevant genic sense expression levels in M2F tables appear in column S_EXN_Density.
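The Density arithmetic above can be written out as a short Python sketch. This is an illustration of the stated formulas only, not the M2F implementation; the function and parameter names are hypothetical.

```python
STANDARD_READS = 10_000_000  # "10M" reads in the record's standard
STANDARD_READ_LEN = 35       # standard read length in bp


def raw_density(overlapping_read_bases: float, exonic_length: int) -> float:
    """Average coverage: read bases overlapping the exonic region / region length."""
    return overlapping_read_bases / exonic_length


def normalization_factor(r: int, read_len: int = 70) -> float:
    """(10M/R)*(35/read_len), where r is the number of uniquely mapped,
    non-rRNA reads in the sample."""
    return (STANDARD_READS / r) * (STANDARD_READ_LEN / read_len)


def normalized_density(overlapping_read_bases: float, exonic_length: int,
                       r: int, read_len: int = 70) -> float:
    """Raw Density scaled to the 10M x 35-bp standard."""
    return raw_density(overlapping_read_bases, exonic_length) * normalization_factor(r, read_len)


def density_to_rpkm(density: float) -> float:
    """RPKM = Density / 0.35, as stated in the record."""
    return density / 0.35
```

As a check on the proportionality: a 1,000-bp exonic region covered by 100 reads of 70 bp (7,000 bases) in a sample with R = 20M has a raw Density of 7, a normalized Density of 1.75, and an RPKM of 5, which matches 100 reads per kilobase per 20 million reads.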
Differential-expression analyses were performed in R (v. 3.3.0) using edgeR to compare total RNA iCell samples that were stimulated with KCl to unstimulated samples. Stimulation time points at 15 minutes (6 samples), 1 hour (3), 2 hours (9), and 4 hours (3) were each compared to 9 samples at 0hr (unstim). To produce a balanced normalization of Densities for each comparison, all RefSeq-annotated genes (23,202 in hg19; 24,681 in hg38) in all samples at each later time point were quantile normalized together with all samples at 0hr. These processed Densities were then converted back to read counts for edgeR input as follows: divide each gene's Density by its sample's original normalization factor (10M/R)*(35/70), multiply by the gene's number of exonic bases, divide by the read length of 70, and round up to the nearest integer. These counts were input into edgeR, removing genes with zero expression and using all defaults and the BH correction for multiple comparisons, to obtain q-values to be measured against a desired false discovery rate (FDR).
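The conversion from processed Densities back to read counts can be sketched in Python as follows. The function name is hypothetical, and "round up" is interpreted here as a ceiling, which is an assumption; the small rounding guard against floating-point noise is likewise an addition for illustration and not part of the described procedure.

```python
import math


def density_to_count(density: float, r: int, exonic_bases: int, read_len: int = 70) -> int:
    """Convert a normalized Density back to an integer read count.

    Steps follow the record: divide by the sample's normalization factor
    (10M/R)*(35/read_len), multiply by the gene's exonic bases, divide by
    the read length, then round up (assumed here to mean ceiling).
    """
    factor = (10_000_000 / r) * (35 / read_len)
    reads = (density / factor) * exonic_bases / read_len
    # round() to 9 decimals guards against values such as 100.0000000001
    # being pushed to 101 by floating-point error (illustrative choice).
    return math.ceil(round(reads, 9))
```

For instance, a normalized Density of 1.75 for a gene with 1,000 exonic bases in a sample with R = 20M corresponds to 100 reads of 70 bp, reversing the normalization exactly.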
Single-cell RNA-seq reads were initially demultiplexed by Illumina BaseSpace to generate FASTQ files. Counts tables were created from the FASTQ files by an established pipeline: https://github.com/indrops/indrops. Counts tables were loaded into R (version 3.4.1) and analyzed using the Seurat package (versions 2.3.4 and 3.0.0). Cells were filtered by UMI, mitochondrial gene expression, and ribosomal gene expression. Data were log normalized and scaled and combined into one Seurat object. Cells were clustered and analyzed with PCA and UMAP. The final Seurat object contains 37,101 cells and 15 clusters.
For the single-cell samples, we used iCellBio v.2 beads, so read 1 (R1) is the biological read and read 2 (R2) is the metadata read, containing the cell barcode and UMI. There are two cell barcode sequences, and the combination of the two is the unique barcode for each cell. The structure of R2 is shown at the top left of the diagram (included as a JPEG on the series record).
Genome_build: GRCh37 (hg19; Feb, 2009)
Genome_build: GRCh38 (hg38; Dec, 2013)
Supplementary_files_format_and_content: MAPtoFeatures (M2F) tables have rows for genes and TAB-separated columns for various genic parameters, gene feature counts (for exons, introns, etc., individually and in total), and, for each feature and feature type, read counts, read-base counts, and normalized read Densities, all for reads that map sense or antisense to each gene. M2F LOG files document each run's read totals and summary statistics. Differential-expression tables from edgeR analyses have rows for genes and TAB-separated columns for: genic parameters, calculated fold-change (FC) ratios for KCl-stim at a time point, whether the mean change went up or down (UP.or.dn), inferred mean counts per million (CPM), reported p-value and BH-corrected q-value, various filtering results, numbers of samples with nonzero expression, mean and SE log-RPKM values converted back to linear scale at each time, and read counts per sample input into edgeR. Single-cell counts tables list genes in rows and individual cells in TAB-separated columns; each entry is a UMI count for a gene in a cell.
 
Submission date Aug 29, 2019
Last update date Nov 27, 2020
Contact name Gabriella Lutz Boulting
Organization name Harvard Medical School
Department Neurobiology
Lab Greenberg Lab
Street address 200 Longwood Ave.
City Boston
State/province MA
ZIP/Postal code 02115
Country USA
 
Platform ID GPL18573
Series (1)
GSE136656 Activity-dependent regulome and transcriptome of human GABAergic neurons reveal new patterns of gene regulation and neurological disease heritability
Relations
BioSample SAMN12661523
SRA SRX6780844

Supplementary data files not provided
SRA Run Selector
Raw data are available in SRA
Processed data are available on Series record
