Sample GSM6069341
Status Public on Apr 27, 2022
Title Plasma cfDNA HU005.12
Sample type SRA
 
Source name Plasma
Organism Homo sapiens
Characteristics disease state: Healthy control
Extracted molecule genomic DNA
Extraction protocol Blood was collected in EDTA tubes and centrifuged at 1500g for 10 minutes. cfDNA was extracted from 4 mL of plasma using the QIAamp Circulating Nucleic Acid Kit (QIAGEN, 55114) and stored at -80 °C before library construction.
Oxford Nanopore SQK-LSK109 (all samples except 19_326 also used the multiplex protocol EXP-NBD104)
 
Library strategy OTHER
Library source genomic
Library selection other
Instrument model MinION
 
Description circulating DNA
Data processing Base calling and alignment: Fast5 files were taken from a previous publication (EGAD00001006888); they were generated using real-time high-accuracy basecalling during the GridION run. Fast5 files were demultiplexed with guppy_barcoder (Version 5.0.16+b9fcd7b5b) using “--trim_barcodes --barcode_kits EXP-NBD104”. Individual barcode Fast5 files were processed into fastq and bam files using Megalodon v. 2.4.2 with the following command-line parameters: “--edge-buffer 0 --mod-min-prob 0 --guppy-params ‘-d /usr/local/hurcs/guppy/6.0.1/data --barcode_kits EXP-NBD104 --trim_barcodes’ --remora-modified-bases dna_r9.4.1_e8 hac 0.0.0 5mc CG 0 --guppy-config dna_r9.4.1_450bps_hac.cfg”. Internally, Megalodon used Guppy server version 6.0.1+652ffd1 and basecalling model r9.4.1_450bps_hac. By default, Megalodon filters out multi-mapping (supplementary) reads and uses the minimap2 “map-ont” mode to filter low quality mappings.
Anonymized BAM files: Individual tile Fast5 files were run individually, and the resulting mod_mapping.bam files were merged using samtools merge (v1.14). Samtools/HTSlib versions before v1.14 do not handle the Mm/Ml modification tags. Because Megalodon reports only the reference sequence in the BAM records and does not report any base substitutions, these are anonymous BAM files that contain no SNP information and thus no personally identifiable information. These are the primary files used for both fragmentomic and methylation analysis. These are the "meg242.remora1.edgefix.mod_mappings.sorted.hg38.bam" files deposited here.
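As a rough illustration of what the Mm/Ml tags carry, the sketch below decodes a single-modification Mm string and its Ml array into read positions and probabilities. It follows the general SAM tag convention (comma-separated skip counts over the named base, Ml values on a 0-255 scale) and is not taken from the authors' pipeline. The function name, the example read, and the choice of the bin midpoint for the probability are assumptions; reverse-strand handling is omitted.

```python
def decode_mm_ml(seq, mm, ml):
    """Decode the first modification entry of an Mm tag value.

    seq: read sequence in its original orientation
    mm:  Mm tag value, e.g. "C+m,1,0;"
    ml:  list of Ml values (0-255), one per call in mm
    Returns a list of (read_position, probability) tuples.
    """
    entry = mm.split(";")[0]
    fields = entry.split(",")
    base = fields[0][0]                      # e.g. "C" from "C+m"
    skips = [int(x) for x in fields[1:]]
    # Positions in the read of every occurrence of the named base
    base_positions = [i for i, b in enumerate(seq) if b == base]
    calls = []
    idx = -1
    for skip, q in zip(skips, ml):
        idx += skip + 1                      # skip `skip` bases, call the next
        # Ml value q covers the interval [q/256, (q+1)/256); the midpoint
        # is used here as an illustrative choice
        calls.append((base_positions[idx], (q + 0.5) / 256))
    return calls


# Hypothetical read with Cs at positions 1, 4, 5 and 8
example_calls = decode_mm_ml("ACGTCCGACG", "C+m,1,0;", [255, 10])
for pos, prob in example_calls:
    print(pos, round(prob, 3))
```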
Megalodon methyl bed file extraction: To extract (stranded) methylation information from the mod_mapping.bam files, we used modbam2bed (https://github.com/epi2me-labs/modbam2bed) v.0.4.5, specifying a minimum probability threshold of 0.667, and filtering out positions with 0 confident reads using awk. The full command line was “modbam2bed --cpg -t 4 -a 0.333 -b 0.667 | awk ‘($5>0){print}’ > out.bed”. All coordinates are in GRCh38 and are 0-based. These files are named “*.meg242.remora1.edgefix.modbam2bed.5mC.cut0.667.hg38.bed”. Column 11 corresponds to the percent of reads methylated. Modbam2bed does not provide a column for the actual number of reads that this percentage is based on, but it can be calculated from the other columns: readCount=(col5*col10)/1000. We also processed this into a simple bedgraph with just the methylation fraction (beta) values in files named "meg242.remora1.edgefix.modbam2bed.5mC.cut0.667.hg38.sorted.bedgraph". These can be loaded into any genome browser.
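The read-count formula above can be sketched in Python as follows. This is a minimal example, assuming whitespace-separated columns and using a made-up record. The function name and the rounding to the nearest integer are illustrative choices, not part of the deposited pipeline.

```python
def parse_modbam2bed_line(line):
    """Extract position, confident read count, and beta from one BED record."""
    cols = line.split()
    score = float(cols[4])          # column 5
    coverage = float(cols[9])       # column 10
    percent_meth = float(cols[10])  # column 11: percent of reads methylated
    # readCount = (col5 * col10) / 1000, rounded because column 5 is
    # assumed to be stored as a rounded value
    read_count = round(score * coverage / 1000)
    return {
        "chrom": cols[0],
        "start": int(cols[1]),      # 0-based, GRCh38
        "end": int(cols[2]),
        "strand": cols[5],
        "read_count": read_count,
        "beta": percent_meth / 100.0,
    }


# Hypothetical record, not taken from the deposited files
example = "chr1\t10468\t10469\t5mC\t750\t+\t10468\t10469\t0,0,0\t4\t66.67\t1\t2\t1"
print(parse_modbam2bed_line(example))
```

Dividing column 11 by 100 gives the same kind of 0-1 beta value that the bedgraph files store.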
DeepSignal methylation calling and processing: Raw Fast5 files were processed with DeepSignal Version 0.1.8 (4), with model “model.CpG.R9.4_1D.human_hx1.bn17.sn360.v0.1.7+/bn_17.sn_360.epoch_9.ckpt”, which was downloaded from the DeepSignal Google Drive (https://drive.google.com/open?id=1zkK8Q1gyfviWWnXUBMcIwEDw3SocJg7P). We used the DeepSignal call_mods (modification_call) output tsv file, extracting the (strand-specific) methylation calls for each CpG from column 9 (called_label field), and calculated a methylation beta value as the number of methylated reads (value 1) divided by the total number of reads (value 0 or value 1). These were collapsed into a bedgraph file with a value between 0 and 1 for every CpG covered. These are the "sorted.grouped.0based.bedgraph.gz" files deposited here.
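The beta calculation described above can be sketched as follows. This is a minimal example rather than the authors' script: it assumes a tab-separated call_mods file whose first three columns are chromosome, position and strand, with called_label in column 9 as stated. The function name and example rows are hypothetical.

```python
from collections import defaultdict


def deepsignal_betas(lines):
    """Return {(chrom, pos, strand): beta} from per-read call_mods rows."""
    counts = defaultdict(lambda: [0, 0])   # key -> [methylated, total]
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9:
            continue                       # skip malformed or empty rows
        key = (fields[0], int(fields[1]), fields[2])
        label = int(fields[8])             # column 9: called_label, 0 or 1
        counts[key][0] += label
        counts[key][1] += 1
    # beta = methylated reads / all reads covering the CpG
    return {key: meth / total for key, (meth, total) in counts.items()}


# Hypothetical per-read rows; columns 4-8 are placeholders
rows = [
    "chr1\t10468\t+\t10468\tread_a\tt\t0.1\t0.9\t1\tNNNNN",
    "chr1\t10468\t+\t10468\tread_b\tt\t0.8\t0.2\t0\tNNNNN",
    "chr1\t10468\t+\t10468\tread_c\tt\t0.2\t0.8\t1\tNNNNN",
    "chr1\t10470\t-\t10470\tread_a\tt\t0.7\t0.3\t0\tNNNNN",
]
for key, beta in sorted(deepsignal_betas(rows).items()):
    print(key, round(beta, 3))
```

Keying on strand keeps the calls strand-specific, as described; whether the deposited bedgraph merges the two strands of a CpG is not stated here.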
Assembly: GRCh38
Library strategy: WGS
 
Submission date Apr 27, 2022
Last update date Apr 27, 2022
Contact name Benjamin P Berman
E-mail(s) benbfly@gmail.com
Organization name Hebrew University of Jerusalem
Street address Hadassah University Medical Center, Bldg 3 rm 30
City Ein Kerem
ZIP/Postal code 91120
Country Israel
 
Platform ID GPL24106
Series (1)
GSE185307 Detecting cell-of-origin and cancer-specific methylation features of cell-free DNA from Nanopore sequencing
Relations
BioSample SAMN27913711
SRA SRX15014391

Supplementary file Size Download File type/resource
GSM6069341_HU005.12.hg38.sorted.grouped.0based.bedgraph.gz 19.3 Mb (ftp)(http) BEDGRAPH
GSM6069341_HU005.12.meg242.remora1.edgefix.modbam2bed.5mC.cut0.667.hg38.bed.gz 33.7 Mb (ftp)(http) BED
GSM6069341_HU005.12.meg242.remora1.edgefix.modbam2bed.5mC.cut0.667.hg38.sorted.bedgraph.gz 20.9 Mb (ftp)(http) BEDGRAPH
GSM6069341_HU005.12.meg242.remora1.edgefix.modbam2bed.5mC.cut0.667.hg38.sorted.bedgraph.gz.tbi.gz 1.4 Mb (ftp)(http) TBI
Raw data are available in SRA
Processed data provided as supplementary file
