GEO Accession viewer

Sample GSM5592119

Status

Public on Sep 28, 2021

Title

ATACseq_SARSCoV2_Rep2

Sample type

SRA

Source name

A549-ACE2 cells

Organism

Homo sapiens

Characteristics

cell type: Lung adenocarcinoma
virus: SARS-CoV-2 (USA-WA1/2020), MOI=0.1
time point: 24 h
treatment: -

Treatment protocol

Cells were infected with SARS-CoV-2 (USA-WA1/2020) at the indicated MOI and for the indicated amount of time. Alternatively, mock infected cells were treated with

Extracted molecule

genomic DNA

Extraction protocol

Samples 1-42: Cells were homogenized in TRIzol reagent and total RNA was extracted using the Direct-zol RNA Miniprep kit according to the manufacturer's instructions.
Sample 43: Cells were trypsinized with 0.25% trypsin and collected as a single-cell suspension. Cells were then washed twice with ice-cold 1x PBS and filtered using a 40 μm Flowmi cell strainer (Bel-Art Scienceware). Cell count and viability were determined using trypan blue stain and a Countess II automated cell counter (Thermo Fisher Scientific). Based on this cell count, a target cell input volume of 3,000 cells was loaded into a Chromium Controller using the Chromium Next GEM (Gel Bead-In Emulsion) Single Cell 5’ Library & Gel Bead Kit v1.1 (10x Genomics) according to the manufacturer’s instructions.
Samples 44-48: Cells were trypsinized and collected as a single-cell suspension in complete DMEM. 50,000 cells were isolated and washed twice with ice-cold PBS. Nuclei from washed cell pellets were extracted using lysis buffer (10 mM Tris-HCl, 10 mM NaCl, 3 mM MgCl2, 0.1% IGEPAL CA-630). Transposition was performed at 37°C for 30 min using the Nextera DNA Library Prep Kit (Illumina) and transposed DNA was purified using the QIAGEN MinElute PCR Purification kit according to the manufacturers’ instructions.
Samples 1-42: TruSeq Stranded mRNA Library Prep Kit (Illumina); Sample 43: Chromium Single Cell 5’ Library Kit v1.1 (10x Genomics); Samples 44-48: Nextera DNA Library Prep Kit (Illumina)
Samples 1-42: RNA-seq; Sample 43: scRNA-seq; Samples 44-48: ATAC-seq

Library strategy

ATAC-seq

Library source

genomic

Library selection

other

Instrument model

NextSeq 2000

Description

Sample 47
Nextera DNA Library Prep Kit
ATACseq_SARSCoV2_2.bw

Data processing

cDNA libraries were sequenced using an Illumina NextSeq 500 (Samples 1-49) or NextSeq 2000 (Samples 50-54) platform.
Samples 1-42: Raw sequencing reads were aligned to the human genome (hg19) using the RNA-Seq Alignment App (v2.0.1) on BaseSpace (Illumina, CA) or to the SARS-CoV-2 genome using Bowtie2. Differential gene expression analysis was performed using DESeq2, comparing infected samples to mock-infected samples.
Sample 43: Sequencing data were processed with CellRanger v4.0.0 (10X Genomics, Inc). Reads were mapped to a combined human (GRCh38) and SARS-CoV-2 (WuhCor1, NC_045512.2, modified to reflect the USA-WA1/2020 strain, MT246667.1) genome reference using CellRanger count. Raw gene x cell count matrices were analyzed using Seurat (v4.0.1). After an initial filter to remove cells with fewer than 4,000 UMIs (empty droplets) or greater than 65,000 UMIs (doublets) and percent mitochondrial gene expression less than 5% or greater than 40%, count data were subjected to natural logarithm normalization, cell cycle scoring, highly variable gene selection, gene expression scaling, and dimensional reduction by principal component analysis using the developer’s defaults. UMIs, mitochondrial gene expression percentage, S cell cycle phase score, and G2M cell cycle phase score were regressed during gene expression scaling. Further processing included unsupervised clustering analysis using the FindClusters function (resolution: 0.4) and visualization with Uniform Manifold Approximation and Projection (developer’s defaults). All gene expression violin plots were plotted using natural logarithm normalized counts and heatmaps with z-scaled counts. Cells were classified as infected or uninfected by performing hierarchical clustering using Ward’s minimum variance method on a distance matrix of z-scaled, log-normalized viral gene expression per cell with k set to 2. Comparing total viral UMIs per cluster separated cells into high and low viral gene expressing cells, and the cluster with higher viral gene expression was classified as infected. Differential gene expression (DGE) analyses were conducted using edgeR v3.30.3, with additional modifications for scRNA-Seq data. Gene x cell count matrices were extracted from Seurat objects with infection information as metadata.
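The cell filter and the infected/uninfected call described for Sample 43 can be sketched in Python. This is an illustration only, not the submitters' code (the record states the analysis was run in Seurat). The function names, the scale factor of 10,000 used for log normalization, and the use of mean viral UMIs per cell to pick the "infected" cluster are assumptions made for this sketch.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage


def qc_mask(total_umis, pct_mito, umi_range=(4000, 65000), mito_range=(5.0, 40.0)):
    """Return True for cells that pass the UMI and mitochondrial-percentage filter."""
    total_umis = np.asarray(total_umis, dtype=float)
    pct_mito = np.asarray(pct_mito, dtype=float)
    return (
        (total_umis >= umi_range[0]) & (total_umis <= umi_range[1])
        & (pct_mito >= mito_range[0]) & (pct_mito <= mito_range[1])
    )


def classify_infected(viral_counts, total_umis, scale_factor=10_000):
    """Split cells into two Ward clusters on z-scaled, log-normalized viral expression.

    viral_counts: cells x viral-genes matrix of raw UMI counts.
    total_umis:   total UMIs per cell (host + virus).
    Returns a boolean array, True for cells in the cluster with more viral UMIs.
    """
    viral_counts = np.asarray(viral_counts, dtype=float)
    total_umis = np.asarray(total_umis, dtype=float)

    # Natural-log normalization (scale factor is an assumption of this sketch).
    log_norm = np.log1p(viral_counts / total_umis[:, None] * scale_factor)

    # Z-scale each viral gene across cells; guard against constant genes.
    sd = log_norm.std(axis=0, ddof=1)
    sd[sd == 0] = 1.0
    z = (log_norm - log_norm.mean(axis=0)) / sd

    # Ward linkage on Euclidean distances, tree cut into k = 2 clusters.
    tree = linkage(z, method="ward")
    labels = fcluster(tree, t=2, criterion="maxclust")

    # Assumption: the cluster with the higher mean viral UMIs per cell is "infected".
    viral_per_cell = viral_counts.sum(axis=1)
    means = {lab: viral_per_cell[labels == lab].mean() for lab in np.unique(labels)}
    infected_label = max(means, key=means.get)
    return labels == infected_label
```

Calling `qc_mask` first and passing only the retained cells to `classify_infected` mirrors the order of steps given in the record.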
SARS-CoV-2 viral genes were excluded from all differential gene expression analyses and cell cycle scores were calculated using Seurat’s CellCycleScoring function. For all analyses, genes expressed (i.e. greater than or equal to 1 UMI) in fewer than 10% of cells for at least one group were excluded from differential gene expression testing. To identify the transcriptional signature in infected cells, differential expression analysis was conducted comparing bystander cells versus infected cells. To mitigate the dramatic differences in host gene expression between infected cells and bystander cells, transcript counts from infected and bystander cells were randomly downsampled to the median transcript counts/cell of the infected cells group. All cells with counts below this value were excluded from differential expression analyses. edgeR linear models included factors for cell cycle score (S phase and G2M phase scores), cellular gene detection rate, and infection status. Significant differentially expressed genes were defined by an adjusted p-value < 0.0001 and an absolute log2 fold-change > 1. Gene set enrichment testing was conducted using the HALLMARK gene sets from the Molecular Signatures Database (MSigDB) with the CAMERA function. The parameter ‘use.ranks’ was set to TRUE to minimize assumptions about the data structure of scRNA-seq data as compared to bulk RNA-seq or microarray data. Gene set enrichment contrasts were set to compare infected cells versus bystander cells and significantly enriched gene sets were identified using an adjusted p-value of less than 0.001.
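The gene filter, depth downsampling, and significance cutoffs described above can be illustrated with a short Python sketch. It is not the submitters' edgeR workflow. The function names are hypothetical, sampling without replacement via a multivariate hypergeometric draw is one plausible reading of "randomly downsampled", and the median is truncated to an integer for sampling.

```python
import numpy as np


def expressed_gene_mask(counts, infected, min_frac=0.10):
    """Keep genes detected (>= 1 UMI) in at least min_frac of cells in BOTH groups."""
    counts = np.asarray(counts)
    infected = np.asarray(infected, dtype=bool)
    detected = counts >= 1
    frac_infected = detected[infected].mean(axis=0)
    frac_bystander = detected[~infected].mean(axis=0)
    return (frac_infected >= min_frac) & (frac_bystander >= min_frac)


def downsample_to_infected_median(counts, infected, seed=0):
    """Downsample every retained cell to the median depth of the infected group.

    Assumes at least one infected cell. Cells whose total count is below the
    target are dropped; the remaining cells are sampled without replacement.
    Returns (downsampled matrix, boolean mask of retained cells, target depth).
    """
    counts = np.asarray(counts, dtype=np.int64)
    infected = np.asarray(infected, dtype=bool)
    depth = counts.sum(axis=1)
    target = int(np.median(depth[infected]))  # truncated to an integer (assumption)
    keep = depth >= target
    rng = np.random.default_rng(seed)
    sampled = np.vstack(
        [rng.multivariate_hypergeometric(row, target) for row in counts[keep]]
    )
    return sampled, keep, target


def is_significant(adj_p, log2_fc, p_cut=1e-4, lfc_cut=1.0):
    """Apply the stated cutoffs: adjusted p < 0.0001 and |log2 fold-change| > 1."""
    adj_p = np.asarray(adj_p, dtype=float)
    log2_fc = np.asarray(log2_fc, dtype=float)
    return (adj_p < p_cut) & (np.abs(log2_fc) > lfc_cut)
```

Sampling without replacement keeps each downsampled gene count at or below its original value, which is why it was chosen here over a multinomial draw.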
Samples 44-48: Quality and adapter filtering was applied to raw reads using ‘trim_galore’ before aligning to human assembly hg38 with bowtie2 using the default parameters. The Picard tool MarkDuplicates (http://broadinstitute.github.io/picard/) was used to remove reads with the same start site and orientation. The BEDTools suite (http://bedtools.readthedocs.io) was used to create read density profiles. Enriched regions were discovered using MACS2 and scored against matched input libraries (fold change > 2 and FDR-adjusted p-value < 0.1). A consensus peak atlas was then created by filtering out blacklisted regions (http://mitra.stanford.edu/kundaje/akundaje/release/blacklists/hg38-human/) and then merging all peaks within 500 bp. A raw count matrix was computed over this atlas using featureCounts (http://subread.sourceforge.net/) with the ‘-p’ option for fragment counting. The count matrix and all genome browser tracks were normalized to a sequencing depth of ten million mapped fragments. DESeq2 was used to classify differential peaks between two conditions using fold change > 2 and FDR-adjusted p-value < 0.1. Peak-gene associations were made using linear genomic distance to the nearest transcription start site with Homer (http://homer.ucsd.edu).
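Two of the ATAC-seq steps above, merging peaks within 500 bp into a consensus atlas and normalizing counts to ten million mapped fragments, are simple enough to sketch in Python. This is a minimal illustration, not the pipeline used for this record. The function names are hypothetical, and treating "within 500 bp" as a gap of at most 500 bp between neighboring peaks is an assumption.

```python
import numpy as np


def merge_peaks(peaks, max_gap=500):
    """Merge (chrom, start, end) peaks on the same chromosome separated by <= max_gap bp."""
    merged = []
    for chrom, start, end in sorted(peaks):
        last = merged[-1] if merged else None
        if last is not None and last[0] == chrom and start - last[2] <= max_gap:
            # Close enough to the previous peak: extend it.
            last[2] = max(last[2], end)
        else:
            merged.append([chrom, start, end])
    return [tuple(p) for p in merged]


def normalize_to_depth(counts, mapped_fragments, target=10_000_000):
    """Scale a peaks x samples count matrix to `target` mapped fragments per sample."""
    counts = np.asarray(counts, dtype=float)
    mapped_fragments = np.asarray(mapped_fragments, dtype=float)
    return counts * (target / mapped_fragments)


atlas = merge_peaks(
    [("chr1", 100, 200), ("chr1", 650, 800), ("chr1", 1400, 1500), ("chr2", 100, 200)]
)
normalized = normalize_to_depth([[50, 10], [100, 20]], [20_000_000, 5_000_000])
```

In the example, the first two chr1 peaks are 450 bp apart and are merged, while the third is 600 bp from the merged peak and stays separate.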
Genome_build: human (hg19 or hg38), SARS-CoV-2 (MN985325.1 or MT246667.1)
Supplementary_files_format_and_content: Comma-separated value (CSV) files for raw read count table, raw feature-cell matrix files, and bigWig ATAC-seq peak files.

Submission date

Sep 21, 2021

Last update date

Sep 28, 2021

Contact name

Benjamin Erik Nilsson-Payant

E-mail(s)

benjamin.nilsson-payant@twincore.de

Organization name

TWINCORE Centre for Experimental and Clinical Infection Research

Department

Experimental Virology