Amplicons representing 79 of 217 pXO1 genes, 41 of 122 pXO2 genes, and 3601 of 5753 chromosomal genes, as predicted by Glimmer (ref. 30; see also supplemental methods), were arrayed onto glass microscope slides (Telechem, Inc.). Redundant genes were generally represented once or a few times on the array. Genomic DNA was labeled with Cy3 and Cy5 according to the protocol of J. DeRisi (http://www.microarrays.org/Pdfs/GenomicDNALabel_B.pdf), except that the genomic DNA was not digested or sheared before labeling. Arrays were scanned with a GenePix 4000B scanner (Axon Inc.), and hybridization signals were quantitated using TIGR SPOTFINDER (software available at http://www.tigr.org/softlab). Hybridizations were competitive, using probes derived from B. anthracis Ames (reference) and a B. cereus group strain (query). Normalized signal intensities were used to generate relative hybridization ratios (query/reference), and data representing weak signal were removed. The ratios from a maximum of six data points per gene (duplicate spots, hybridizations performed in triplicate) were placed in three bins: <0.1, gene absent in the query strain; 0.1-0.3, gene present but diverged in the query strain; and >0.3, gene present in the query strain. A majority rule was applied for binning: more than 50% of the ratios had to agree on the assignment, and at least two data points had to be available (a minimum exceeded in 99% of the cases). Genes with fewer than two data points were treated as missing data. The numerical ranges of the bins were established in two ways. First, we determined the presence or absence of sequences homologous to 3601 B. anthracis genes in the sequence of B. cereus ATCC 14579 (Integrated Genomics, Inc.; http://www.integratedgenomics.com/) using BLASTN and compared the results with the assignments inferred from hybridization ratios.
A threshold of 0.1 was found to be suitable for classifying a gene as absent (agreement between the sequence data and the comparative genome hybridization (CGH) data in 99% of the cases), while a cutoff of 0.3 was conservative for gene presence (agreement in 92% of the cases). Second, we used a set of 65 genes conserved in 26 bacterial genomes, taken from the NCBI COG database (http://www.ncbi.nlm.nih.gov/COG/). Because these genes are expected to be present in every query strain, a call of "present" counts as correct; our selected cutoffs gave the correct call in 1225 of 1235 cases (99.2%). CGH tended to underpredict plasmid homologs relative to the sequence analysis. Two possible, not mutually exclusive, explanations are variability in plasmid copy number in B. cereus strains relative to B. anthracis and greater average divergence of plasmid genes than of chromosomal genes.
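The binning and majority rule described above can be illustrated with a minimal Python sketch. It is not the software used in the study. The function names, the label strings, and the treatment of ratios exactly equal to 0.1 or 0.3 (placed in the "diverged" bin) are assumptions made for illustration. The text does not say how a gene was handled when no bin held more than 50% of its ratios; the sketch assumes such genes are reported as missing.

```python
from collections import Counter
from typing import Optional, Sequence

ABSENT, DIVERGED, PRESENT, MISSING = "absent", "diverged", "present", "missing"


def bin_ratio(ratio: float, absent_below: float = 0.1,
              present_above: float = 0.3) -> str:
    """Assign one query/reference ratio to a bin.

    Assumption: the boundary values 0.1 and 0.3 fall in the 'diverged' bin.
    """
    if ratio < absent_below:
        return ABSENT
    if ratio > present_above:
        return PRESENT
    return DIVERGED


def call_gene(ratios: Sequence[Optional[float]], min_points: int = 2) -> str:
    """Combine up to six ratios for one gene into a single call.

    None marks a data point removed for weak signal.
    """
    usable = [r for r in ratios if r is not None]
    if len(usable) < min_points:
        return MISSING  # fewer than two data points: treated as missing
    counts = Counter(bin_ratio(r) for r in usable)
    label, n = counts.most_common(1)[0]
    if 2 * n > len(usable):  # strictly more than 50% agree
        return label
    return MISSING  # assumption: no majority is reported as missing


if __name__ == "__main__":
    # Duplicate spots from three hybridizations; one point lost to weak signal.
    print(call_gene([0.05, 0.08, None, 0.02, 0.45, 0.04]))
```

In this example five data points are usable and four of them fall below 0.1, so the majority rule assigns the gene to the absent bin.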