Sample GSM6090101
Status Public on Jun 22, 2024
Title Ramos_D3_1_seq
Sample type SRA
 
Source name Ramos + S2
Organisms Drosophila melanogaster; Homo sapiens
Characteristics cell line: Ramos + S2
genotype: AID-/-_cD3
growth protocol: full RPMI
treatment: unstimulated
sorted: -
Extracted molecule total RNA
Extraction protocol Cells were harvested and nuclei were isolated. Transcription run-on was performed in the presence of 4 biotin-11-NTPs, followed by TRIzol extraction of total RNA.
Total RNA extraction with TRIzol, enrichment of labeled RNA, 3' and 5' linker attachment, cDNA preparation, and PCR.
 
Library strategy OTHER
Library source transcriptomic
Library selection other
Instrument model Illumina NovaSeq 6000
 
Description RNA was generated by performing transcription run-on on freshly isolated nuclei.
5' and 3' linkers with random 8-mers were added to allow PCR deduplication.
TruSeq barcoded RPI primers were used in the PCR step to generate barcoded libraries for multiplexing.
PRO-seq
Data processing PRO-seq
Reads were trimmed for standard adapters and low-quality (Q<30) 3'-end bases, then filtered for a remaining length of at least 20 nt (excluding the UMI, if applicable) using cutadapt v2.5.
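The trimming and length filter can be illustrated with a minimal Python sketch. It is not the cutadapt implementation: the running-sum rule below is one common way to trim low-quality 3' ends, and cutadapt's own behavior may differ in detail. The function names and the example read are hypothetical.

```python
def trim_low_quality_3prime(seq, quals, cutoff=30):
    """Trim the 3' end with a running-sum rule (illustrative, not cutadapt itself).

    Walking from the 3' end, accumulate (cutoff - quality) and cut at the
    position where that running sum is largest; stop once it turns negative.
    """
    running = 0
    best = 0
    cut = len(seq)
    for i in range(len(quals) - 1, -1, -1):
        running += cutoff - quals[i]
        if running < 0:
            break
        if running > best:
            best = running
            cut = i
    return seq[:cut], quals[:cut]


def passes_length_filter(seq, umi_len=0, min_len=20):
    """Keep a read only if at least min_len bases remain once the UMI is excluded."""
    return len(seq) - umi_len >= min_len


# Hypothetical read: the last three bases are dominated by low qualities.
seq = "ACGTACGTAC"
quals = [40, 40, 40, 40, 40, 40, 40, 10, 35, 5]
trimmed_seq, trimmed_quals = trim_low_quality_3prime(seq, quals)
```

A single high-quality base flanked by poor ones (the Q35 base above) is still trimmed, because the running sum keeps the cut point at the earliest position where the 3' tail is, on balance, below the cutoff.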
Alignment to the reference genome was done with Bowtie v1.2.3. The NCBI GRCm38.6 assembly was used as the mouse reference. To this we added the fixed sequence of the recombined VDJ locus as an additional chromosome. Similarly, for human data, the hg38 assembly was used, with the relevant fixed VDJ sequence added as a chromosome.
Where applicable, the dmr6 assembly of Drosophila melanogaster from FlyBase (release 6.27; https://wiki.flybase.org/wiki/FlyBase:About#Citing_FlyBase) was used as the spike-in reference and was added to the alignment index. During alignment, up to 3 mismatches were allowed.
To accommodate mapping to the repetitive IgH locus, a high degree of multimapping was allowed (194 potential V segments annotated for GRCm38.6). In the case of alignment with spike-ins, only reads mapping exclusively to either genome were considered.
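The spike-in rule (keep only reads that map exclusively to one genome) can be sketched as follows. This is an illustration under assumptions, not the authors' code: it assumes spike-in chromosomes can be recognized by name, and the `dm_` prefix and function name are hypothetical.

```python
def assign_genome(chroms, spikein_chroms):
    """Assign a read to 'target' or 'spikein' given every chromosome it aligns to.

    Returns None when the read aligns to both genomes (ambiguous) or to neither.
    """
    hits_spikein = any(c in spikein_chroms for c in chroms)
    hits_target = any(c not in spikein_chroms for c in chroms)
    if hits_spikein and hits_target:
        return None  # maps to both genomes: discarded
    if hits_spikein:
        return "spikein"
    if hits_target:
        return "target"
    return None  # no alignments at all


# Hypothetical chromosome names for the Drosophila spike-in.
SPIKEIN = {"dm_2L", "dm_2R", "dm_3L", "dm_3R", "dm_X"}
```

A read that multimaps several times within one genome is still kept by this rule; only cross-genome ambiguity removes it.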
For PCR deduplication, UMIs were identified (8 nucleotides at the 3' end) and filtered with umi-tools v1.0.0. No sequence differences were allowed in UMIs when collapsing the duplicates.
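Exact-match UMI collapsing can be sketched as below. This is a simplified stand-in for what umi-tools does, assuming each alignment is a dictionary with hypothetical keys `chrom`, `strand`, `pos`, and `umi`; two reads are treated as PCR duplicates only when all four agree, so a single-base UMI difference keeps both.

```python
def split_umi(seq, umi_len=8):
    """Split a read into (insert, UMI), taking the UMI from the 3' end."""
    return seq[:-umi_len], seq[-umi_len:]


def dedup_exact(alignments):
    """Keep the first alignment per (chrom, strand, pos, UMI); no UMI mismatches allowed."""
    seen = set()
    kept = []
    for aln in alignments:
        key = (aln["chrom"], aln["strand"], aln["pos"], aln["umi"])
        if key not in seen:
            seen.add(key)
            kept.append(aln)
    return kept


# Hypothetical alignments: the first two are exact duplicates,
# the third differs from them by one UMI base and is therefore kept.
alns = [
    {"chrom": "vdj", "strand": "+", "pos": 120, "umi": "ACGTACGT"},
    {"chrom": "vdj", "strand": "+", "pos": 120, "umi": "ACGTACGT"},
    {"chrom": "vdj", "strand": "+", "pos": 120, "umi": "ACGTACGA"},
]
unique = dedup_exact(alns)
```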
Multimapping reads were then filtered to identify those specific to the VDJ sequence, taking into account the repetitiveness of the reference IgH locus. Reads mapping to the fixed VDJ sequence were allowed to multimap only to the native IgH locus (mouse chr12:113572929-116009954, human chr14:105836764-106875071). The native loci were defined as the region between the earliest position belonging to an annotated V segment and the latest position belonging to an annotated J segment. Reads with a mapped location outside these areas were rejected. This filtering was implemented in a custom Python script. The qualifying reads were then classified as reads mapping uniquely to the particular VDJ sequence and reads mapping to both the VDJ sequence and to the native IgH region.
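The filtering logic described above can be sketched in Python. This is not the authors' script (that is in their repository); it is a minimal illustration that assumes each read is represented by a list of (chromosome, position) hits, that the added VDJ chromosome is named `vdj` (hypothetical), that the human chromosome is named `chr14`, and that the interval bounds are inclusive.

```python
# Native IgH loci as given in the record (inclusive bounds assumed).
NATIVE_IGH = {
    "mouse": ("chr12", 113572929, 116009954),
    "human": ("chr14", 105836764, 106875071),
}


def classify_read(hits, native, vdj_chrom="vdj"):
    """Classify one read from all of its mapped locations.

    Returns 'vdj_unique', 'vdj_and_native', or None if the read is rejected
    (a hit outside the VDJ sequence and native IgH locus, or no VDJ hit).
    """
    native_chrom, native_start, native_end = native
    on_vdj = False
    on_native = False
    for chrom, pos in hits:
        if chrom == vdj_chrom:
            on_vdj = True
        elif chrom == native_chrom and native_start <= pos <= native_end:
            on_native = True
        else:
            return None  # mapped somewhere outside the allowed regions
    if not on_vdj:
        return None
    return "vdj_and_native" if on_native else "vdj_unique"


human = NATIVE_IGH["human"]
```

Using a single position per hit is a simplification; a fuller version would test the whole aligned interval against the locus.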
Genome browser tracks were created by quantifying and scaling read coverage of the VDJ sequence using bedtools v2.29.0. Reads were split by strand, and strand labels for PRO-seq were inverted in order to match the strand designations of the respective PRO-cap reads. For both data types, only the first 5' base of the strand-adjusted reads was used for the tracks. Additionally, the 5' end of PRO-seq reads was shifted by 1 base downstream, to compensate for the last base in the mRNA being the exogenous termination base. For normalized tracks, read counts were scaled to RPM (reads per million). Finally, "-" strand coverage tracks were further scaled by -1, for visualization convenience.
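The track arithmetic for PRO-seq reads can be sketched as follows. This is one reading of the description, not the authors' bedtools workflow. It assumes 0-based, half-open alignment coordinates, takes "downstream" to mean the direction of the strand-adjusted read, and uses a caller-supplied `library_size` as the RPM denominator; all function names are hypothetical.

```python
from collections import Counter


def track_position(start, end, strand):
    """Return (adjusted strand, track position) for one PRO-seq alignment.

    The strand label is inverted, the 5' base of the strand-adjusted read is
    taken, and that base is shifted 1 nt downstream on the adjusted strand.
    """
    flipped = "-" if strand == "+" else "+"
    if flipped == "+":
        return flipped, start + 1      # 5' base is `start`; downstream is +1
    return flipped, (end - 1) - 1      # 5' base is `end - 1`; downstream is -1


def build_tracks(reads, library_size):
    """Count single-base positions, scale to RPM, and negate the '-' strand."""
    counts = Counter(track_position(*read) for read in reads)
    scale = 1_000_000 / library_size
    tracks = {"+": {}, "-": {}}
    for (strand, pos), n in counts.items():
        value = n * scale
        tracks[strand][pos] = -value if strand == "-" else value
    return tracks


# Hypothetical reads as (start, end, strand) and a hypothetical library size.
reads = [(100, 130, "+"), (100, 130, "+"), (100, 130, "-")]
tracks = build_tracks(reads, library_size=2_000_000)
```

Negating the "-" strand only affects display: in a genome browser the two strands then plot above and below the axis.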
Code for the workflow and the custom scripts is available on GitHub: https://github.com/PavriLab/IgH_VDJ_PROcapseq
Assembly: NCBI mm9, NCBI hg38, custom IGH genome (see GitHub)
Supplementary files format and content: bigWig of rpm-normalized read densities
Library strategy: PRO-seq
 
Submission date May 02, 2022
Last update date Jun 22, 2024
Contact name Maximilian Christian von der Linde
E-mail(s) maximilian.linde@imp.ac.at, max_vdl@outlook.com
Phone +4368181646556
Organization name Institute of Molecular Pathology
Lab Pavri GRP
Street address Campus-vienna-biocenter 1, IMP, Pavri GRP
City Vienna
State/province Vienna
ZIP/Postal code 1030
Country Austria
 
Platform ID GPL29805
Series (2)
GSE202041 High-resolution transcriptional analysis of immunoglobulin variable regions reveals the absence of direct relationships between somatic hypermutation, nascent transcription and epigenetic marks [PRO-seq]
GSE202042 High-resolution transcriptional analysis of immunoglobulin variable regions reveals the absence of direct relationships between somatic hypermutation, nascent transcription and epigenetic marks
Relations
BioSample SAMN28036225
SRA SRX15105575

Supplementary file Size Download File type/resource
GSM6090101_Proseq_D3_1rep_122230_trim_igh_libsize_3prime_fwd.bw 101.9 Kb (ftp)(http) BW
GSM6090101_Proseq_D3_1rep_122230_trim_igh_libsize_3prime_rev.bw 113.0 Kb (ftp)(http) BW
Raw data are available in SRA
Processed data provided as supplementary file
