Sample GSM2219503
Status Public on Mar 07, 2017
Title Single cell Hi-C cell 7
Sample type SRA
 
Source name mESCs
Organism Mus musculus
Characteristics cell type: haploid mESCs derived from strain 129/Ola
chip antibody: none
Growth protocol All mouse embryonic stem cells (mESCs) were cultured on 0.2 % gelatin in 2i media (NDiff B27 base medium, Stem Cell Sciences Ltd, catalogue no: SCS-SF-NB-02, supplemented with 1 uM PD0325901, 3 uM CHIR99021 and 20 ng/ml LIF). Haploid mESCs were sorted every 4 passages to enrich for haploid cells, as previously described in Leeb M & Wutz A, Nature, 2011.
Extracted molecule genomic DNA
Extraction protocol 5-10 million haploid mouse ES cells were fixed for 5 min with 2 % formaldehyde in PBS, and the reaction was then quenched for another 5 min in 0.125 M glycine. Nuclei were then extracted by incubating cells on ice for 30 min (inverting every 10 min) in 10 mM Tris pH 8.0, 10 mM NaCl, 0.2 % NP-40 (IGEPAL CA-630) and protease inhibitor cocktail (Roche).
Single cell Hi-C libraries were prepared using Illumina's Nextera XT sample kit with the following modifications. Briefly, biotinylated junctions were captured on streptavidin M-280 dynabeads and digested with AluI enzyme. Tagmentation was then carried out using a 1:10 to 1:100 dilution of Amplicon Tagment Mix. After tagmentation, DNA was PCR amplified with Illumina primers for 9 + 15 cycles, and library fragments of ~300-700 bp (insert plus adaptor and PCR primer sequences) were isolated using AMPure beads (Beckman). The purified DNA was captured on an Illumina flow cell for cluster generation. Libraries were sequenced on the MiSeq following the manufacturer's protocols.
 
Library strategy OTHER
Library source genomic
Library selection other
Instrument model Illumina MiSeq
 
Data processing Library strategy: Single cell Hi-C
Basecalls performed using Phred quality score (Q score)
ChIP-seq reads were aligned to the GRCm38/mm10 mouse genome reference using Bowtie 2 v2.1.0 and filtered to retain reads with mapping quality >30.
ChIP-seq peaks were called using MACS2 v2.1.0.20150731 with a minimum FDR cutoff of 0.01 (-q 0.01), except for broad features (H3K27me3, H3K36me3 and H3K9me3) where a cutoff of 0.05 was used (-q 0.05).
ChIP-seq peaks were filtered to remove those not corresponding to the canonical chromosomes.
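The chromosome filter described above can be sketched as follows. This is a minimal illustration, not the submitters' code: it assumes that "canonical" means mouse autosomes 1-19 plus X and Y, and that peak files use UCSC-style names such as chr1. The function name keep_canonical is hypothetical.

```python
# Assumption: canonical mouse chromosomes are chr1..chr19, chrX and chrY,
# written with a "chr" prefix. Adjust the set if the peak files differ.
CANONICAL = {f"chr{n}" for n in range(1, 20)} | {"chrX", "chrY"}


def keep_canonical(bed_lines, canonical=CANONICAL):
    """Return the BED lines whose first column is a canonical chromosome."""
    kept = []
    for line in bed_lines:
        # Skip blank lines and common BED header lines.
        if not line.strip() or line.startswith(("#", "track", "browser")):
            continue
        chrom = line.split("\t", 1)[0]
        if chrom in canonical:
            kept.append(line.rstrip("\n"))
    return kept
```

Unplaced scaffolds and random contigs (for example names containing "Un" or "random") drop out because they are not in the set.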
RNA-seq reads were aligned to the reference genome GRCm38/mm10, downloaded from the Ensembl database (ftp.ensembl.org), using Gsnap version gmap-2014-12-17. Only uniquely mapped reads were used for further analysis. Gene counts from SAM files were obtained using htseq-count version 0.6.1 in intersection-nonempty mode with -s reverse. The gene annotation was extracted from Ensembl Gene Release 75. Differential gene expression analysis was conducted using the Bioconductor package DESeq2 version 1.6.3 (R 3.1.2). An adjusted p-value threshold of 0.05 was used to determine differential gene expression. Expression values are FPKM (read number normalized across samples by DESeq size factor and across genes by gene length) calculated using custom R-scripts.
Single cell Hi-C reads were initially processed using the NucProcessing software package (available upon request). Processed reads were aligned to the GRCm38/mm10 mouse genome reference using Bowtie 2 and filtered to retain reads that formed a valid Hi-C contact junction between two RE1 restriction sites. Output files were then subjected to additional single-cell-specific processing and cleanup. First, pairs that represent only a single observation of a specific RE1-RE1 ligation junction after PCR amplification were removed: at least two separate, albeit sometimes identical, molecules must be paired-end sequenced to confirm a ligation junction. Next, the sequence pairs were filtered to remove those with promiscuous ends, where the RE1 fragment at either end was involved in more than one ligation event. Finally, the redundancy in amplified RE1-RE1 ligation events was removed to create a single list of paired RE1 fragment ends.
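The three single-cell cleanup steps can be sketched as below. This is only an illustration under stated assumptions, not the NucProcessing implementation: each read pair is assumed to be already reduced to a pair of hashable RE1 fragment-end identifiers (here hypothetical (chromosome, position) tuples), and the function name filter_contacts is invented for the example.

```python
from collections import Counter


def filter_contacts(read_pairs, min_observations=2):
    """Apply the three cleanup steps to a list of (end_a, end_b) read pairs."""
    # Treat (a, b) and (b, a) as the same ligation junction.
    junctions = Counter(tuple(sorted(pair)) for pair in read_pairs)

    # Step 1: drop junctions supported by fewer than two sequenced pairs.
    supported = [j for j, n in junctions.items() if n >= min_observations]

    # Step 2: drop junctions with a promiscuous end, i.e. a fragment end
    # that takes part in more than one remaining junction.
    usage = Counter(end for j in supported for end in set(j))
    unique = [j for j in supported if all(usage[end] == 1 for end in j)]

    # Step 3: the Counter keys are already non-redundant, so a sorted list
    # gives one entry per confirmed RE1-RE1 junction.
    return sorted(unique)
```

Counting promiscuity only over junctions that survived step 1 follows the order given in the text; whether the original software does the same is an assumption.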
Genome_build: mm10
Supplementary_files_format_and_content: A summary table of results (.txt) from the analysis of the RNA-seq samples includes expression estimates for all genes (Ensembl Gene Release 75; GRCm38/mm10). Expression values are FPKM (read number normalized across samples by DESeq size factor and across genes by gene length) calculated using custom R-scripts.
Supplementary_files_format_and_content: bed files (.bed) are tab-delimited text files containing chromosomal coordinates of all ChIP-seq peaks (chromosome name, start, end).
Supplementary_files_format_and_content: Single cell Hi-C txt file (.txt) is a tab-separated text file containing the Hi-C contacts. After the header line, each individual contact is represented on a line consisting of chromosome_1 seq_pos_1 chromosome_2 seq_pos_2.
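A reader for this contact format might look like the following sketch. It assumes exactly one header line, integer sequence positions, and that any columns beyond the first four can be ignored; the function names parse_contacts and load_contacts are hypothetical.

```python
import gzip


def parse_contacts(lines):
    """Parse contact lines into (chromosome_1, seq_pos_1, chromosome_2, seq_pos_2)."""
    rows = iter(lines)
    next(rows, None)  # skip the single header line
    contacts = []
    for line in rows:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 4:
            continue  # ignore blank or truncated lines
        chr1, pos1, chr2, pos2 = fields[:4]
        contacts.append((chr1, int(pos1), chr2, int(pos2)))
    return contacts


def load_contacts(path):
    """Read a gzip-compressed contact file such as the *_contact_pairs.txt.gz download."""
    with gzip.open(path, "rt") as handle:
        return parse_contacts(handle)
```

Intra-chromosomal contacts can then be selected with a comparison such as `c[0] == c[2]` on each returned tuple.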
Supplementary_files_format_and_content: Single cell genome structure pdb file (.pdb) is in Protein Data Bank format. Atom type N is used to indicate restrained particles. Atom type C is used to represent unrestrained backbone particles. Residue number corresponds to the particle and increases every 100 kb. Chain letter represents the chromosome. The last column represents the particle sequence position.
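One way to read a particle record from such a file is sketched below. It assumes the standard fixed-width PDB column layout for ATOM records, that "atom type" refers to the atom name column, and that the sequence position is the last whitespace-separated token on the line. None of these details is confirmed by the record, and parse_particle is a hypothetical name.

```python
def parse_particle(line):
    """Parse one ATOM/HETATM line into a dict, or return None for other lines."""
    if not line.startswith(("ATOM", "HETATM")):
        return None
    return {
        # Assumption: atom name "N" marks a restrained particle, "C" an unrestrained one.
        "restrained": line[12:16].strip() == "N",
        "chain": line[21],               # chain letter stands for the chromosome
        "particle": int(line[22:26]),    # residue number, one step per 100 kb
        "xyz": (float(line[30:38]), float(line[38:46]), float(line[46:54])),
        # Assumption: the final token holds the particle sequence position.
        "seq_pos": int(float(line.split()[-1])),
    }
```

The mapping from chain letter to chromosome name is not given in the record, so the sketch returns the letter unchanged.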
Supplementary_files_format_and_content: Single cell CENP-A coordinate file (.xyz). CENP-A superposition coordinates are represented in xyz format and may be viewed using molecular graphics software such as PyMOL. The xyz file contains centromere positions after fitting to the corresponding genome structure. Atom type O was used to represent centromere positions.
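A minimal reader for the centromere coordinates is sketched here. It assumes the conventional xyz layout (an atom count on the first line, a comment on the second, then one "element x y z" line per atom), which the record does not spell out; parse_xyz is a hypothetical name.

```python
def parse_xyz(text):
    """Return a list of (element, x, y, z) tuples from xyz-format text."""
    lines = text.strip().splitlines()
    count = int(lines[0].split()[0])  # first line: number of atoms
    atoms = []
    # The second line is a free-text comment and is skipped.
    for line in lines[2:2 + count]:
        element, x, y, z = line.split()[:4]
        atoms.append((element, float(x), float(y), float(z)))
    return atoms


def centromere_positions(text):
    """Keep only atoms of type O, which the record uses for centromeres."""
    return [(x, y, z) for element, x, y, z in parse_xyz(text) if element == "O"]
```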
Supplementary_files_format_and_content: HDF5 file (.hdf5) used to calculate distances between genomic loci and depth in the single cell structures. The hierarchy format is:
name :: String -- the name of the NucFrame
bin_size :: Int -- the bin_size of the nuc files
chrms :: ["X", "1", ..] -- all of the chromosomes that are present
bp_pos/chrm :: [Int] -- the start bp index of each particle in each chrm
position/chrm :: [[[Float]]] -- (model, bead_idx, xyz)
expr_contacts/chrm/chrm :: [[Int]] -- (bp, bp), raw contact count
dists/chrm/chrm :: [[Float]] -- (bead_idx, bead_idx), distances between beads
depths/i/alpha :: Float -- alpha value used to calculate depths
depths/i/chrm/ :: [Float] -- (bead_idx, ), depth of point from surface i
alpha_shape/k/simplices :: [[Int]] -- (n_simplicies, k), indices of k simplices
alpha_shape/k/ab :: [(a, b)] -- (n_simplicies, 2), a and b values for k-simplices
surface_dist/alpha_val/tag :: optional tag for this value of alpha
surface_dist/alpha_val/i/surface_size :: size of surface i for alpha
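To illustrate how the position and dists entries relate, the sketch below computes a bead-to-bead distance table from the coordinates of a single model. It assumes plain Euclidean distance and one model at a time; the record does not say how distances were combined across models. Nested lists stand in for arrays read from the file with an HDF5 library, and bead_distances is a hypothetical name.

```python
import math


def bead_distances(pos_a, pos_b):
    """Return a (len(pos_a), len(pos_b)) table of Euclidean distances.

    pos_a and pos_b are sequences of (x, y, z) bead coordinates for one
    model of two chromosomes (or the same chromosome twice).
    """
    return [[math.dist(a, b) for b in pos_b] for a in pos_a]
```

Row i, column j of the result corresponds to bead i of the first chromosome and bead j of the second, matching the (bead_idx, bead_idx) shape listed for dists/chrm/chrm.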
 
Submission date Jun 29, 2016
Last update date May 15, 2019
Contact name Andre J Faure
E-mail(s) andre.faure@crg.eu
Organization name Centre for Genomic Regulation (CRG)
Department EMBL-CRG Systems Biology Unit
Street address Dr. Aiguader 88
City Barcelona
ZIP/Postal code 08003
Country Spain
 
Platform ID GPL16417
Series (1)
GSE80280 3D structures of individual mammalian genomes reveal principles of nuclear organization
Relations
BioSample SAMN05323842
SRA SRX1884205

Supplementary file Size Download File type/resource
GSM2219503_Cell-7.hdf5.gz 2.4 Gb (ftp)(http) HDF5
GSM2219503_Cell_7_CENP-A_image_fitted.xyz.gz 348 b (ftp)(http) XYZ
GSM2219503_Cell_7_contact_pairs.txt.gz 483.8 Kb (ftp)(http) TXT
GSM2219503_Cell_7_genome_structure_model.pdb.gz 4.6 Mb (ftp)(http) PDB
Raw data are available in SRA
Processed data provided as supplementary file
