GEO Accession viewer

Sample GSM4054634

Status

Public on Nov 27, 2020

Title

iCell_ATAC_KCl-0hr_B1_1

Sample type

SRA

Source name

iCell neurons, ATAC-Seq, unstimulated, biorep 1, tech rep 1

Organism

Homo sapiens

Characteristics

tissue: cultured human iPSC-derived neurons at ~DIV16
time post-kcl stimulation: unstimulated
biological replicate: 1
technical replicate: 1
chip epitope: NA

Treatment protocol

KCl membrane depolarization

Growth protocol

Manufacturer's protocol.

Extracted molecule

genomic DNA

Extraction protocol

nuclear isolation from homogenized cultures
Nextera DNA Library Prep Kit (Illumina).

Library strategy

ATAC-seq

Library source

genomic

Library selection

other

Instrument model

Illumina NextSeq 500

Description

iCell GABANeurons
iCell_ATAC_KCl-0hr_B1-2-3_ext200bp_norm10M_Boulting.bw

Data processing

For total RNA sequencing of human iCell cultures, strand-specific single-end reads, nominally sequenced at 75 bp but occasionally shorter, were 3'-truncated to 70 bp and filtered to include only reads with unambiguous base calls (the ~0.1% of reads shorter than 70 bp were discarded). Reads were then aligned to the hg19 or hg38 genome using BWA (v0.7.8), allowing up to 2 mismatches and zero gaps, with otherwise default parameters. The usual 24 chromosomal and MT alignment targets were supplemented with ~6.1 million short sequences comprising all possible intragenic subsets of ordered exons from the GRCh37.p5 RefSeq annotation, such that each sequence is of minimal length yet accommodates a 70-bp read spanning all junctions between consecutive exons within a subset; likewise, ~9.6 million such splice-junction sequences were included for the GRCh38.p2 RefSeq annotation. Typically 85-95% of all reads in each library were mappable and ~85-90% of these aligned uniquely; nonuniquely mapped reads were discarded.
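The read truncation and filtering step described above can be illustrated with a minimal sketch. The function name `truncate_and_filter` is hypothetical, reads are assumed to be plain sequence strings, and the ambiguity check is assumed to apply to the truncated read; the record does not state the order in which the checks were run or the tool that was used.

```python
def truncate_and_filter(reads, length=70):
    """Keep reads of at least `length` bases, truncated at the 3' end,
    and only if the truncated read has unambiguous base calls."""
    kept = []
    for seq in reads:
        if len(seq) < length:
            continue  # reads shorter than 70 bp are discarded
        trimmed = seq[:length]  # 3'-truncation keeps the 5'-most bases
        # Assumption: "unambiguous" means every base is A, C, G, or T.
        if set(trimmed.upper()) <= set("ACGT"):
            kept.append(trimmed)
    return kept


# Toy input, not real data: a 75-bp read, a 69-bp read, a read with an N
# inside the first 70 bases, and a read with Ns only after base 70.
example = ["A" * 75, "A" * 69, "A" * 69 + "N" + "A" * 5, "A" * 70 + "N" * 5]
filtered = truncate_and_filter(example)
```

In this toy input, the first and last reads survive because their first 70 bases are unambiguous.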
Aligned total RNA-Seq reads were further analyzed using an in-house program, MAPtoFeatures (M2F), that produces expression levels for individual genes and their exons. The following applies separately to the RefSeq annotations for human Build 37.3 (assembly GRCh37.p5; 7 Sept 2011) and human assembly GRCh38.p2 (12 March 2015). The exons for all transcripts assigned to a gene based on RefSeq were merged (unioned); supplemented with the constructed library of splice-junction sequences, these defined each gene's complete mRNA target region. The total number of bases of all reads that overlapped a gene's exonic region was divided by the region's total length to yield an average read Density (coverage). Normalization of Densities to a standard of 10M 35-bp reads was effected by multiplying the raw Density by (10M/R)*(35/70) (for 70-bp reads), where R = total number of uniquely mapped reads in a sample that did not overlap any RepeatMasker rRNA elements. RPKM units are proportional to normalized Density units: RPKM = Density/0.35. Relevant genic sense expression levels in M2F tables appear in column S_EXN_Density.
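The Density normalization and its relation to RPKM can be checked with a short worked sketch. This is not the M2F code; the function names are hypothetical, and the example numbers are invented for illustration only.

```python
def normalized_density(overlap_bases, exonic_length, unique_reads, read_length=70):
    """Raw Density (read bases per exonic base) scaled to a standard of
    10M 35-bp reads, i.e. multiplied by (10M/R)*(35/read_length)."""
    raw_density = overlap_bases / exonic_length
    factor = (10_000_000 / unique_reads) * (35 / read_length)
    return raw_density * factor


def density_to_rpkm(density):
    """Convert normalized Density to RPKM using RPKM = Density/0.35."""
    return density / 0.35


# Toy case: 100 reads of 70 bp lying entirely within a 2,000-base exonic
# region, in a sample with R = 20 million uniquely mapped non-rRNA reads.
density = normalized_density(100 * 70, 2000, 20_000_000)
rpkm = density_to_rpkm(density)
```

Here the normalized Density is 0.875 and the RPKM is approximately 2.5, which agrees with the direct calculation of 100 reads / (2 kb * 20 million reads).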
Differential-expression analyses were performed in R (v. 3.3.0) using edgeR to compare total RNA iCell samples that were stimulated with KCl to unstimulated samples. Stimulation time points at 15 minutes (6 samples), 1 hour (3), 2 hours (9), and 4 hours (3) were each compared to 9 samples at 0hr (unstimulated). To produce a balanced normalization of Densities for each comparison, all RefSeq-annotated genes (23,202 in hg19; 24,681 in hg38) in all samples at each later time point were quantile normalized together with all samples at 0hr. These processed Densities were then converted back to read counts for edgeR input as follows: divide each gene's Density by its sample's original normalization factor (10M/R)*(35/70), multiply by the gene's number of exonic bases, divide by the read length of 70, and round up to the nearest integer. These counts were input into edgeR, removing genes with zero expression and using all defaults and the BH correction for multiple comparisons, to obtain q-values to be measured against a desired false discovery rate (FDR).
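The conversion of a processed Density back to a read count can be sketched as follows. The function name `density_to_count` is hypothetical and the input values are invented; the steps follow the order given above.

```python
import math


def density_to_count(density, unique_reads, exonic_bases, read_length=70):
    """Undo the (10M/R)*(35/read_length) normalization, scale by the gene's
    exonic bases, divide by read length, and round up to an integer."""
    factor = (10_000_000 / unique_reads) * (35 / read_length)
    raw_density = density / factor
    return math.ceil(raw_density * exonic_bases / read_length)


# Toy case: a Density of 0.875 for a gene with 2,000 exonic bases in a
# sample with R = 20 million uniquely mapped non-rRNA reads.
count = density_to_count(0.875, 20_000_000, 2000)
```

With these toy inputs the result is 100 reads. Because the final step rounds up, any nonzero Density yields a count of at least 1.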
Single-cell RNA-seq reads were initially demultiplexed by Illumina BaseSpace to generate FASTQ files. Counts tables were created from the FASTQ files by an established pipeline: https://github.com/indrops/indrops. Counts tables were loaded into R (version 3.4.1) and analyzed using the Seurat package (versions 2.3.4 and 3.0.0). Cells were filtered by UMI count, mitochondrial gene expression, and ribosomal gene expression. Data were log normalized, scaled, and combined into one Seurat object. Cells were clustered and analyzed with PCA and UMAP. The final Seurat object contains 37,101 cells and 15 clusters.
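As a rough illustration of the log-normalization step, the sketch below applies the formula that Seurat documents for its LogNormalize method: each count is divided by the cell's total, multiplied by a scale factor, and passed through log(1 + x). The analysis itself was run in R; this Python function and its name are hypothetical, and the scale factor of 10,000 is Seurat's documented default, assumed here because the record does not state the value used.

```python
import math


def log_normalize(cell_counts, scale_factor=10_000):
    """Log-normalize the UMI counts of one cell:
    log1p(count / total_counts * scale_factor)."""
    total = sum(cell_counts)
    if total == 0:
        return [0.0 for _ in cell_counts]  # empty cell: nothing to scale
    return [math.log1p(c / total * scale_factor) for c in cell_counts]


# Toy cell with three genes (invented counts).
normalized = log_normalize([1, 0, 3])
```

Genes with zero counts remain zero after normalization, so sparsity of the counts table is preserved.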
For the single-cell samples, we used iCellBio v.2 beads, so read 1 (R1) is the biological read and read 2 (R2) is the metadata read, containing the cell barcode and UMI. There are two cell barcode sequences, and the combination of the two is the unique barcode for each cell. The structure of R2 is shown at the top left of the diagram (included as a JPEG on the series record).
Genome_build: GRCh37 (hg19; Feb, 2009)
Genome_build: GRCh38 (hg38; Dec, 2013)
Supplementary_files_format_and_content: MAPtoFeatures (M2F) tables have rows for genes and TAB-separated columns for various genic parameters, gene feature counts (for exons, introns, etc., individually and in total), and, for each feature and feature type, read counts, read-base counts, and normalized read Densities, all for reads that map sense or antisense to each gene. M2F LOG files document each run's read totals and summary statistics. Differential-expression tables from edgeR analyses have rows for genes and TAB-separated columns for: genic parameters, calculated fold-change (FC) ratios for KCl-stim at a time point, whether the mean change went up or down (UP.or.dn), inferred mean counts per million (CPM), reported p-value and BH-corrected q-value, various filtering results, numbers of samples with nonzero expression, mean and SE log-RPKM values converted back to linear scale at each time, and read counts per sample input into edgeR. Single-cell counts tables list genes in rows and individual cells in TAB-separated columns; each entry is a UMI count for a gene in a cell.
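A TAB-separated M2F table could be read with the Python standard library along these lines. Only the column name S_EXN_Density comes from this record; the function name, the assumption that the first column holds the gene identifier, and the toy table contents are all hypothetical.

```python
import csv
import io


def read_density_column(handle, column="S_EXN_Density"):
    """Return {gene: value} for one numeric column of a TAB-separated table.
    Assumption: the first column of each row identifies the gene."""
    reader = csv.reader(handle, delimiter="\t")
    header = next(reader)
    idx = header.index(column)  # raises ValueError if the column is absent
    return {row[0]: float(row[idx]) for row in reader if row}


# Toy table with invented gene names and values, not real M2F output.
toy_table = "gene\tS_EXN_Density\nGENE_A\t0.875\nGENE_B\t0\n"
densities = read_density_column(io.StringIO(toy_table))
```

For a real file, the same function can be called with an open file handle in place of the `io.StringIO` object.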

Submission date

Aug 29, 2019

Last update date

Nov 27, 2020

Contact name

Gabriella Lutz Boulting

Organization name

Harvard Medical School

Department

Neurobiology

Lab

Greenberg Lab