NCBI C++ ToolKit
CDistMethods Class Reference

#include <algo/phy_tree/dist_methods.hpp>

Public Types

enum  EFastMePar { eOls = 1 , eBalanced = 2 }
 
typedef CNcbiMatrix< double > TMatrix
 
typedef TPhyTreeNode TTree
 

Static Public Member Functions

static void JukesCantorDist (const TMatrix &frac_diff, TMatrix &result)
 Jukes-Cantor distance calculation for DNA sequences: d = -3/4 ln(1 - (4/3)p).
 
static void PoissonDist (const TMatrix &frac_diff, TMatrix &result)
 Simple distance calculation for protein sequences: d = -ln(1 - p).
 
static void KimuraDist (const TMatrix &frac_diff, TMatrix &result)
 Kimura's distance for protein sequences: d = -ln(1 - p - 0.2p^2).
 
static void GrishinDist (const TMatrix &frac_diff, TMatrix &result)
 Grishin's distance for protein sequences: 1 - p = (1 - e^(-2*d)) / (2 * d), approximated with d = p(2 + p) / (2(1 - p)), as proposed in Grishin, J Mol Evol 41:675-679, 1995.
 
static void GrishinGeneralDist (const TMatrix &frac_diff, TMatrix &result)
 Grishin's distance for protein sequences: 1 - p = ln(1 + 2d) / (2d).
 
static double Divergence (const string &seq1, const string &seq2)
 Calculate pairwise fractions of non-identity.
 
static double FractionIdentity (const string &seq1, const string &seq2)
 Calculate pairwise fraction identity based on positions where both sequences have a base/residue.
 
static void Divergence (const objects::CAlnVec &avec_in, TMatrix &result)
 
static TTree * NjTree (const TMatrix &dist_mat, const vector< string > &labels=vector< string >())
 Compute a tree by neighbor joining, as per Hillis et al.
 
static TTree * FastMeTree (const TMatrix &dist_mat, const vector< string > &labels=vector< string >(), EFastMePar btype=eOls, EFastMePar wtype=eOls, EFastMePar ntype=eBalanced)
 Compute a tree using the fast minimum evolution algorithm.
 
static void ZeroNegativeBranches (TTree *node)
 Sets negative lengths of branches of a tree to zero.
 
static bool AllFinite (const TMatrix &mat)
 Check a matrix for NaNs and Infs.
 
static TTree * RerootTree (TTree *tree, TTree *node=NULL)
 Reroot tree; the new root is placed in the middle of the edge specified by node.
 

Static Private Member Functions

static TTree * x_FindLargestEdge (TTree *tree, TTree *best_node)
 Find node with the longest edge in the tree.
 

Detailed Description

Definition at line 52 of file dist_methods.hpp.

Member Typedef Documentation

◆ TMatrix

Definition at line 55 of file dist_methods.hpp.

◆ TTree

Definition at line 56 of file dist_methods.hpp.

Member Enumeration Documentation

◆ EFastMePar

Enumerator
eOls 
eBalanced 

Definition at line 58 of file dist_methods.hpp.

Member Function Documentation

◆ AllFinite()

bool CDistMethods::AllFinite ( const TMatrix &  mat)
static

Check a matrix for NaNs and Infs.
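
A minimal sketch of this kind of check, assuming the matrix elements are available as a flat sequence of doubles (the References line below mentions `CNcbiMatrix< T >::GetData()`). The function name `all_finite` and the use of `std::vector<double>` are illustrative assumptions, not the toolkit's code.

```cpp
#include <cmath>
#include <vector>

// Hypothetical stand-in for the check: returns true only if every
// element is a finite number (neither NaN nor +/- infinity).
bool all_finite(const std::vector<double>& data)
{
    for (double x : data) {
        if (!std::isfinite(x)) {
            return false;  // first NaN or Inf ends the scan
        }
    }
    return true;
}
```

Running such a check on a distance matrix before tree building catches the results of taking the logarithm of a non-positive number in the distance formulas.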

Definition at line 667 of file dist_methods.cpp.

References CNcbiMatrix< T >::GetData(), and ITERATE.

Referenced by s_ThrowIfNotAllFinite(), and CTreeBuilderJob::x_CreateProjectItems().

◆ Divergence() [1/2]

static void CDistMethods::Divergence ( const objects::CAlnVec &  avec_in,
TMatrix &  result 
)
static

◆ Divergence() [2/2]

double CDistMethods::Divergence ( const string &  seq1,
const string &  seq2 
)
static

Calculate pairwise fractions of non-identity.

Definition at line 485 of file dist_methods.cpp.

Referenced by CTreeBuilderJob::x_CreateProjectItems(), and CTreeBuilderJob::x_Divergence().

◆ FastMeTree()

CDistMethods::TTree * CDistMethods::FastMeTree ( const TMatrix &  dist_mat,
const vector< string > &  labels = vector<string>(),
EFastMePar  btype = eOls,
EFastMePar  wtype = eOls,
EFastMePar  ntype = eBalanced 
)
static

Compute a tree using the fast minimum evolution algorithm.

◆ FractionIdentity()

double CDistMethods::FractionIdentity ( const string &  seq1,
const string &  seq2 
)
static

Calculate pairwise fraction identity based on positions where both sequences have a base/residue.
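
As an illustration of the description above, the following sketch counts identical positions among those where neither aligned sequence has a gap. It assumes that `'-'` marks a gap, that both strings come from the same alignment, and that 0 is returned when no position can be compared. All three are assumptions of this sketch, and `fraction_identity` is a hypothetical name rather than the toolkit's implementation.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>

// Fraction of identical positions, counting only columns where both
// sequences have a base/residue.
double fraction_identity(const std::string& seq1, const std::string& seq2)
{
    const char kGap = '-';  // assumed gap symbol
    const std::size_t len = std::min(seq1.size(), seq2.size());
    std::size_t compared = 0;
    std::size_t identical = 0;
    for (std::size_t i = 0; i < len; ++i) {
        if (seq1[i] == kGap || seq2[i] == kGap) {
            continue;  // skip columns with a gap in either sequence
        }
        ++compared;
        if (seq1[i] == seq2[i]) {
            ++identical;
        }
    }
    // Assumption: report 0 when there is nothing to compare.
    return compared == 0 ? 0.0 : static_cast<double>(identical) / compared;
}
```

For example, `fraction_identity("ACGT", "ACGA")` compares four positions and finds three identical ones, giving 0.75.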

Definition at line 504 of file dist_methods.cpp.

References _ASSERT.

Referenced by CPhyTreeCalc::x_CalcDivergenceMatrix().

◆ GrishinDist()

void CDistMethods::GrishinDist ( const TMatrix &  frac_diff,
TMatrix &  result 
)
static

Grishin's distance for protein sequences: 1 - p = (1 - e^(-2*d)) / (2 * d), approximated with d = p(2 + p) / (2(1 - p)), as proposed in Grishin, J Mol Evol 41:675-679, 1995.

1 - p = (1 - e^(-2*d)) / (2 * d) using approximation d = p(2 - p) / (2(1 - p))
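
The relation between p and d is implicit in d, which is why a closed-form approximation is used. For comparison, here is a sketch that solves the relation numerically by bisection, taking the exponent as -2d (since 1 - p cannot be negative). This is not what the toolkit does, and the names `grishin_frac_diff` and `grishin_exact_dist` are hypothetical.

```cpp
#include <cmath>
#include <stdexcept>

// Fraction of differing sites implied by a distance d > 0 under
// 1 - p = (1 - e^(-2d)) / (2d).
double grishin_frac_diff(double d)
{
    // expm1(-2d) = e^(-2d) - 1, accurate for small d.
    return 1.0 + std::expm1(-2.0 * d) / (2.0 * d);
}

// Solve grishin_frac_diff(d) = p for d by bisection; p must be in [0, 1).
double grishin_exact_dist(double p)
{
    if (p < 0.0 || p >= 1.0) {
        throw std::invalid_argument("p must be in [0, 1)");
    }
    if (p == 0.0) {
        return 0.0;
    }
    double lo = 0.0;
    double hi = 1.0;
    while (grishin_frac_diff(hi) < p) {
        hi *= 2.0;  // grow the bracket until it contains the root
    }
    for (int iter = 0; iter < 200; ++iter) {
        const double mid = 0.5 * (lo + hi);
        if (grishin_frac_diff(mid) < p) {
            lo = mid;
        } else {
            hi = mid;
        }
    }
    return 0.5 * (lo + hi);
}
```

Bisection works here because the implied fraction of differences increases monotonically with d, from 0 toward 1.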

Definition at line 104 of file dist_methods.cpp.

References CNcbiMatrix< T >::GetCols(), CNcbiMatrix< T >::GetRows(), i, and result.

Referenced by CPhyTreeCalc::x_CalcDistMatrix().

◆ GrishinGeneralDist()

void CDistMethods::GrishinGeneralDist ( const TMatrix &  frac_diff,
TMatrix &  result 
)
static

Grishin's distance for protein sequences: 1 - p = ln(1 + 2d) / (2d).

For the general model, in which substitution rates vary among both amino acids and sites, proposed in Grishin N, J Mol Evol 41:675-679, 1995. Approximated with d = 0.65((1 - p)^(-1/0.65) - 1) (gamma distance) according to M Nei and S Kumar, Molecular Evolution and Phylogenetics.
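
The gamma-distance approximation above can be sketched for a single value of p as follows. The function name `grishin_general_dist` and the choice to reject p outside [0, 1) are assumptions of this sketch, not documented behavior of the toolkit.

```cpp
#include <cmath>
#include <stdexcept>

// d = a * ((1 - p)^(-1/a) - 1) with shape parameter a = 0.65.
double grishin_general_dist(double p)
{
    if (p < 0.0 || p >= 1.0) {
        throw std::invalid_argument("p must be in [0, 1)");
    }
    const double a = 0.65;  // gamma shape parameter from the formula
    return a * (std::pow(1.0 - p, -1.0 / a) - 1.0);
}
```

The formula inverts in closed form as p = 1 - (1 + d/a)^(-a), which gives a quick way to sanity-check a computed distance.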

Definition at line 121 of file dist_methods.cpp.

References CNcbiMatrix< T >::GetCols(), CNcbiMatrix< T >::GetRows(), i, and result.

Referenced by CPhyTreeCalc::x_CalcDistMatrix().

◆ JukesCantorDist()

void CDistMethods::JukesCantorDist ( const TMatrix &  frac_diff,
TMatrix &  result 
)
static

Jukes-Cantor distance calculation for DNA sequences: d = -3/4 ln(1 - (4/3)p).
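
A sketch of applying the formula element by element to a matrix of pairwise fractions of differences p. It uses nested `std::vector` in place of `TMatrix`. The names `jukes_cantor` and `jukes_cantor_matrix` and the range check are assumptions made for illustration, not the toolkit's behavior.

```cpp
#include <cmath>
#include <stdexcept>
#include <vector>

using Matrix = std::vector<std::vector<double>>;

// d = -3/4 ln(1 - (4/3)p); the logarithm is defined only for p < 0.75.
double jukes_cantor(double p)
{
    if (p < 0.0 || p >= 0.75) {
        throw std::invalid_argument("p must be in [0, 0.75)");
    }
    return -0.75 * std::log(1.0 - 4.0 * p / 3.0);
}

// Apply the correction to every entry of frac_diff.
Matrix jukes_cantor_matrix(const Matrix& frac_diff)
{
    Matrix result(frac_diff.size());
    for (std::size_t i = 0; i < frac_diff.size(); ++i) {
        result[i].reserve(frac_diff[i].size());
        for (double p : frac_diff[i]) {
            result[i].push_back(jukes_cantor(p));
        }
    }
    return result;
}
```

At p = 0.75, the expected difference between two random DNA sequences, the distance diverges, so values at or above that threshold have no finite Jukes-Cantor distance.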

Definition at line 64 of file dist_methods.cpp.

References CNcbiMatrix< T >::GetCols(), CNcbiMatrix< T >::GetRows(), i, log, and result.

Referenced by CPhyTreeCalc::x_CalcDistMatrix(), and CTreeBuilderJob::x_CreateProjectItems().

◆ KimuraDist()

void CDistMethods::KimuraDist ( const TMatrix &  frac_diff,
TMatrix &  result 
)
static

Kimura's distance for protein sequences: d = -ln(1 - p - 0.2p^2).
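
For a single fraction of differences p, the formula can be sketched as below. The name `kimura_dist` and the decision to throw when the logarithm's argument is not positive are assumptions of this sketch.

```cpp
#include <cmath>
#include <stdexcept>

// d = -ln(1 - p - 0.2 p^2)
double kimura_dist(double p)
{
    const double arg = 1.0 - p - 0.2 * p * p;
    if (p < 0.0 || arg <= 0.0) {
        throw std::invalid_argument("Kimura distance undefined for this p");
    }
    return -std::log(arg);
}
```

The argument of the logarithm reaches zero near p = 0.854, so the distance is finite only below that value.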

Definition at line 90 of file dist_methods.cpp.

References CNcbiMatrix< T >::GetCols(), CNcbiMatrix< T >::GetRows(), i, log, and result.

Referenced by CPhyTreeCalc::x_CalcDistMatrix(), and CTreeBuilderJob::x_CreateProjectItems().

◆ NjTree()

CDistMethods::TTree * CDistMethods::NjTree ( const TMatrix &  dist_mat,
const vector< string > &  labels = vector<string>() 
)
static

Compute a tree by neighbor joining, as per Hillis et al. (Ed.), Molecular Systematics, pg. 488-489.

Definition at line 138 of file dist_methods.cpp.

References CTreeNode< TValue, TKeyGetterP >::AddNode(), CNcbiMatrix< T >::GetRows(), CTreeNode< TValue, TKeyGetterP >::GetValue(), i, NStr::IntToString(), max(), n, r(), CNcbiMatrix< T >::Resize(), s_ThrowIfNotAllFinite(), and swap().

Referenced by BOOST_AUTO_TEST_CASE(), CTree::ComputeTree(), CPhyTreeCalc::x_ComputeTree(), and CTreeBuilderJob::x_CreateProjectItems().

◆ PoissonDist()

void CDistMethods::PoissonDist ( const TMatrix &  frac_diff,
TMatrix &  result 
)
static

Simple distance calculation for protein sequences: d = -ln(1 - p).
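
A one-value sketch of the Poisson correction. It uses `std::log1p`, which evaluates ln(1 + x) accurately for small x. The function name `poisson_dist` and the range check are assumptions of this sketch.

```cpp
#include <cmath>
#include <stdexcept>

// d = -ln(1 - p), written with log1p for accuracy when p is small.
double poisson_dist(double p)
{
    if (p < 0.0 || p >= 1.0) {
        throw std::invalid_argument("p must be in [0, 1)");
    }
    return -std::log1p(-p);
}
```

For small p the corrected distance is close to p itself, and it grows without bound as p approaches 1.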

Definition at line 77 of file dist_methods.cpp.

References CNcbiMatrix< T >::GetCols(), CNcbiMatrix< T >::GetRows(), i, log, and result.

Referenced by CPhyTreeCalc::x_CalcDistMatrix(), and CTreeBuilderJob::x_CreateProjectItems().

◆ RerootTree()

CDistMethods::TTree * CDistMethods::RerootTree ( CDistMethods::TTree *  tree,
CDistMethods::TTree *  node = NULL 
)
static

Reroot tree; the new root is placed in the middle of the edge specified by node.

Parameters
    tree    Tree root [in]
    node    New root; if NULL, the node with the longest edge will be used [in]
Returns
    New tree root

Definition at line 240 of file dist_methods.cpp.

References _ASSERT, CTreeNode< TValue, TKeyGetterP >::AddNode(), CTreeNode< TValue, TKeyGetterP >::DetachNode(), CTreeNode< TValue, TKeyGetterP >::GetParent(), CTreeNode< TValue, TKeyGetterP >::GetValue(), leaf(), and x_FindLargestEdge().

Referenced by CTree::ComputeTree(), and CPhyTreeCalc::x_ComputeTree().

◆ x_FindLargestEdge()

CDistMethods::TTree * CDistMethods::x_FindLargestEdge ( CDistMethods::TTree *  node,
CDistMethods::TTree *  best_node 
)
static private

Find node with the longest edge in the tree.

Parameters
    tree         Tree root [in]
    best_node    Node with the longest edge found so far (used in recursion) [in]
Returns
    Node with the longest edge
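
The recursion described by the parameters can be sketched on a simplified tree. The `Node` struct below, with a `dist` field holding the length of the edge to the parent, is a hypothetical stand-in for `TPhyTreeNode`, and `find_largest_edge` is not the toolkit's code.

```cpp
#include <memory>
#include <vector>

// Simplified tree node: dist is the length of the edge to the parent.
struct Node {
    double dist = 0.0;
    std::vector<std::unique_ptr<Node>> children;
};

// Depth-first search carrying the best node found so far.
Node* find_largest_edge(Node* node, Node* best)
{
    if (node == nullptr) {
        return best;
    }
    if (best == nullptr || node->dist > best->dist) {
        best = node;  // this node's edge is the longest seen so far
    }
    for (auto& child : node->children) {
        best = find_largest_edge(child.get(), best);
    }
    return best;
}
```

A caller starts the search with `find_largest_edge(root, nullptr)`, and the returned node identifies the edge to its parent.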

Definition at line 338 of file dist_methods.cpp.

References CTreeNode< TValue, TKeyGetterP >::GetValue(), CTreeNode< TValue, TKeyGetterP >::IsLeaf(), CTreeNode< TValue, TKeyGetterP >::SubNodeBegin(), and CTreeNode< TValue, TKeyGetterP >::SubNodeEnd().

Referenced by RerootTree().

◆ ZeroNegativeBranches()

void CDistMethods::ZeroNegativeBranches ( TTree *  node)
static

Sets negative lengths of branches of a tree to zero.
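
A sketch of this operation on a simplified tree, written as a recursive traversal. The `Node` struct and the function name `zero_negative_branches` are hypothetical stand-ins. The toolkit operates on `TTree` nodes, whose interface is not shown on this page.

```cpp
#include <memory>
#include <vector>

// Simplified tree node: dist is the length of the branch to the parent.
struct Node {
    double dist = 0.0;
    std::vector<std::unique_ptr<Node>> children;
};

// Clamp every negative branch length in the subtree to zero.
void zero_negative_branches(Node* node)
{
    if (node == nullptr) {
        return;
    }
    if (node->dist < 0.0) {
        node->dist = 0.0;
    }
    for (auto& child : node->children) {
        zero_negative_branches(child.get());
    }
}
```

Non-negative branch lengths are left unchanged, so the call is safe to apply to any tree.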

The documentation for this class was generated from the following files:
Modified on Sun Apr 14 05:29:09 2024 by modify_doxy.py rev. 669887