A practical exact maximum compatibility algorithm for reconstruction of recent evolutionary history

BMC Bioinformatics. 2017 Feb 23;18(1):127. doi: 10.1186/s12859-017-1520-4.

Abstract

Background: Maximum compatibility is a method of phylogenetic reconstruction that is seldom applied to molecular sequences. It may be ideal for certain applications, such as reconstructing phylogenies of closely related bacteria on the basis of whole-genome sequencing.

Results: Here I present an algorithm that rapidly computes phylogenies according to a compatibility criterion. Although based on solutions to the maximum clique problem, this algorithm deals properly with ambiguities in the data. The algorithm is applied to bacterial data sets containing up to nearly 2000 genomes with several thousand variable nucleotide sites. Run times are several seconds or less. Computational experiments show that maximum compatibility is less sensitive than maximum parsimony to the inclusion of nucleotide data that, though derived from actual sequence reads, has been identified as likely to be misleading.
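The link between compatibility and the maximum clique problem can be illustrated with a small sketch. It is not the algorithm presented in the paper. It assumes binary characters encoded as strings over '0', '1' and '?' (one position per taxon), tests each pair of characters with the four-gamete test, and finds a largest set of mutually compatible characters with a plain Bron-Kerbosch search. The function names and the encoding are hypothetical. Taxa with '?' in either character are simply skipped in the pairwise test, which is a naive treatment: once ambiguities are present, pairwise compatibility no longer guarantees that the whole set fits a single tree, so this sketch does not reproduce the proper handling of ambiguities described above.

```python
from itertools import combinations


def compatible(a, b):
    """Four-gamete test for two binary characters.

    a and b are strings over '0', '1', '?', one symbol per taxon.
    Taxa with '?' in either character are ignored (naive assumption).
    """
    seen = {(x, y) for x, y in zip(a, b) if x != "?" and y != "?"}
    # The pair is incompatible only if all four combinations occur.
    return len(seen) < 4


def max_clique(adj):
    """Return one maximum clique; adj maps vertex -> set of neighbours.

    Bron-Kerbosch with pivoting: exact, but exponential in the worst case.
    """
    best = set()

    def expand(r, p, x):
        nonlocal best
        if not p and not x:
            # r is a maximal clique; keep it if it is the largest so far.
            if len(r) > len(best):
                best = set(r)
            return
        pivot = max(p | x, key=lambda v: len(adj[v] & p))
        for v in list(p - adj[pivot]):
            expand(r | {v}, p & adj[v], x & adj[v])
            p = p - {v}
            x = x | {v}

    expand(set(), set(adj), set())
    return best


def largest_compatible_set(chars):
    """Indices of a largest set of pairwise compatible characters."""
    adj = {i: set() for i in range(len(chars))}
    for i, j in combinations(range(len(chars)), 2):
        if compatible(chars[i], chars[j]):
            adj[i].add(j)
            adj[j].add(i)
    return sorted(max_clique(adj))


# Five taxa, five characters; character 2 conflicts with characters 0 and 4.
example = ["11000", "11100", "10100", "00011", "110?0"]
print(largest_compatible_set(example))
```

In this toy input, characters 0, 1, 3 and 4 are mutually compatible, while character 2 shows all four state combinations against characters 0 and 4 and is therefore left out of the largest set.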

Conclusions: Maximum compatibility is a useful tool for certain phylogenetic problems, such as inferring the relationships among closely related bacteria from whole-genome sequence data. The algorithm presented here rapidly solves fairly large problems of this type, and provides robustness against misleading characters that can pollute large-scale sequencing data.

Keywords: Bacterial genomes; Homoplasy; Maximum compatibility; Phylogeny.

MeSH terms

  • Algorithms*
  • Evolution, Molecular*
  • Genome, Bacterial
  • Phylogeny
  • Salmonella enterica / classification
  • Salmonella enterica / genetics
  • Sequence Analysis, DNA
  • Software