NCBI C++ ToolKit
High level definitions and declarations for the PSSM engine of PSI-BLAST. More...
#include <algo/blast/core/ncbi_std.h>
#include <algo/blast/core/blast_export.h>
#include <algo/blast/core/blast_options.h>
#include <algo/blast/core/blast_stat.h>
Classes | |
struct | PSIMsaCell |
Structure to describe the characteristics of a position in the multiple sequence alignment data structure. More... | |
struct | PSIMsaDimensions |
Structure representing the dimensions of the multiple sequence alignment data structure. More... | |
struct | PSIMsa |
Multiple sequence alignment (msa) data structure containing the raw data needed by the PSSM engine to create a PSSM. More... | |
struct | PSICdMsaCellData |
Data needed for PSSM computation stored in MSA cell for single column in CD aligned to a position in the query. More... | |
struct | PSICdMsaCell |
Alignment cell that represents one column of CD aligned to a position in the query. More... | |
struct | PSICdMsa |
Data structure representing a multiple alignment of CDs and the query sequence, along with data needed for PSSM computation. More... | |
struct | PSIMatrix |
This is the main return value from the PSSM engine. More... | |
struct | PSIDiagnosticsRequest |
Structure to allow requesting various diagnostics data to be collected by PSSM engine. More... | |
struct | PSIDiagnosticsResponse |
This structure contains the diagnostics information requested using the PSIDiagnosticsRequest structure. More... | |
Typedefs | |
typedef struct PSIMsaCell | PSIMsaCell |
Structure to describe the characteristics of a position in the multiple sequence alignment data structure. More... | |
typedef struct PSIMsaDimensions | PSIMsaDimensions |
Structure representing the dimensions of the multiple sequence alignment data structure. More... | |
typedef struct PSIMsa | PSIMsa |
Multiple sequence alignment (msa) data structure containing the raw data needed by the PSSM engine to create a PSSM. More... | |
typedef struct PSICdMsaCellData | PSICdMsaCellData |
Data needed for PSSM computation stored in MSA cell for single column in CD aligned to a position in the query. More... | |
typedef struct PSICdMsaCell | PSICdMsaCell |
Alignment cell that represents one column of CD aligned to a position in the query. More... | |
typedef struct PSICdMsa | PSICdMsa |
Data structure representing a multiple alignment of CDs and the query sequence, along with data needed for PSSM computation. More... | |
typedef struct PSIMatrix | PSIMatrix |
This is the main return value from the PSSM engine. More... | |
typedef struct PSIDiagnosticsRequest | PSIDiagnosticsRequest |
Structure to allow requesting various diagnostics data to be collected by PSSM engine. More... | |
typedef struct PSIDiagnosticsResponse | PSIDiagnosticsResponse |
This structure contains the diagnostics information requested using the PSIDiagnosticsRequest structure. More... | |
Functions | |
PSIMsa * | PSIMsaNew (const PSIMsaDimensions *dimensions) |
Allocates and initializes the multiple sequence alignment data structure for use as input to the PSSM engine. More... | |
PSIMsa * | PSIMsaFree (PSIMsa *msa) |
Deallocates the PSIMsa structure. More... | |
PSIMatrix * | PSIMatrixNew (Uint4 query_length, Uint4 alphabet_size) |
Allocates a new PSIMatrix structure. More... | |
PSIMatrix * | PSIMatrixFree (PSIMatrix *matrix) |
Deallocates the PSIMatrix structure passed in. More... | |
PSIDiagnosticsRequest * | PSIDiagnosticsRequestNew (void) |
Allocates a PSIDiagnosticsRequest structure, setting all fields to false. More... | |
PSIDiagnosticsRequest * | PSIDiagnosticsRequestNewEx (Boolean save_ascii_pssm) |
Allocates a PSIDiagnosticsRequest structure, setting fields to their default values for their use in the context of the PSI-BLAST application. More... | |
PSIDiagnosticsRequest * | PSIDiagnosticsRequestFree (PSIDiagnosticsRequest *diags_request) |
Deallocates the PSIDiagnosticsRequest structure passed in. More... | |
PSIDiagnosticsResponse * | PSIDiagnosticsResponseNew (Uint4 query_length, Uint4 alphabet_size, const PSIDiagnosticsRequest *request) |
Allocates a new PSI-BLAST diagnostics structure based on which fields of the PSIDiagnosticsRequest structure are TRUE. More... | |
PSIDiagnosticsResponse * | PSIDiagnosticsResponseFree (PSIDiagnosticsResponse *diags) |
Deallocates the PSIDiagnosticsResponse structure passed in. More... | |
int | PSICreatePssm (const PSIMsa *msap, const PSIBlastOptions *options, BlastScoreBlk *sbp, PSIMatrix **pssm) |
Main entry point to core PSSM engine to calculate the PSSM. More... | |
int | PSICreatePssmWithDiagnostics (const PSIMsa *msap, const PSIBlastOptions *options, BlastScoreBlk *sbp, const PSIDiagnosticsRequest *request, PSIMatrix **pssm, PSIDiagnosticsResponse **diagnostics) |
Main entry point to the core PSSM engine that allows requesting diagnostics information. More... | |
int | PSICreatePssmFromCDD (const PSICdMsa *cd_msa, const PSIBlastOptions *options, BlastScoreBlk *sbp, const PSIDiagnosticsRequest *request, PSIMatrix **pssm, PSIDiagnosticsResponse **diagnostics) |
Main entry point to core PSSM engine for computing CDD-based PSSMs. More... | |
int | PSICreatePssmFromFrequencyRatios (const Uint1 *query, Uint4 query_length, BlastScoreBlk *sbp, double **freq_ratios, double impala_scaling_factor, PSIMatrix **pssm) |
Top-level function to create a PSSM given a matrix of frequency ratios and perform scaling on the resulting PSSM. More... | |
High level definitions and declarations for the PSSM engine of PSI-BLAST.
Definition in file blast_psi.h.
typedef struct PSICdMsa PSICdMsa |
Data structure representing a multiple alignment of CDs and the query sequence, along with data needed for PSSM computation.
typedef struct PSICdMsaCell PSICdMsaCell |
Alignment cell that represents one column of CD aligned to a position in the query.
typedef struct PSICdMsaCellData PSICdMsaCellData |
Data needed for PSSM computation stored in MSA cell for single column in CD aligned to a position in the query.
typedef struct PSIDiagnosticsRequest PSIDiagnosticsRequest |
Structure to allow requesting various diagnostics data to be collected by PSSM engine.
typedef struct PSIDiagnosticsResponse PSIDiagnosticsResponse |
This structure contains the diagnostics information requested using the PSIDiagnosticsRequest structure.
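typedef struct PSIMatrix PSIMatrix |
This is the main return value from the PSSM engine.
typedef struct PSIMsa PSIMsa |
Multiple sequence alignment (msa) data structure containing the raw data needed by the PSSM engine to create a PSSM.
By convention, the first row of the data field contains the query sequence.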
typedef struct PSIMsaCell PSIMsaCell |
Structure to describe the characteristics of a position in the multiple sequence alignment data structure.
typedef struct PSIMsaDimensions PSIMsaDimensions |
Structure representing the dimensions of the multiple sequence alignment data structure.
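The layout described for PSIMsa, PSIMsaCell and PSIMsaDimensions can be illustrated with a minimal sketch. It does not use the toolkit headers: the `Demo*` types and `demo_*` functions are hypothetical stand-ins, and the choice of `num_seqs + 1` rows (so that row 0 can hold the query) is an assumption about how the dimensions are interpreted, not something stated on this page.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins; the real types are declared in blast_psi.h. */
typedef struct DemoMsaCell {
    unsigned char letter;     /* residue code at this position */
    int           is_aligned; /* nonzero if the position is aligned */
} DemoMsaCell;

typedef struct DemoMsaDimensions {
    unsigned int query_length; /* number of columns */
    unsigned int num_seqs;     /* aligned sequences, assumed to exclude the query */
} DemoMsaDimensions;

typedef struct DemoMsa {
    DemoMsaDimensions dims;
    DemoMsaCell**     data;    /* row 0 holds the query by convention */
} DemoMsa;

/* Releases a DemoMsa and returns NULL so callers can write m = demo_msa_free(m). */
DemoMsa* demo_msa_free(DemoMsa* msa)
{
    unsigned int r;
    if (!msa) return NULL;
    if (msa->data) {
        for (r = 0; r < msa->dims.num_seqs + 1; r++) free(msa->data[r]);
        free(msa->data);
    }
    free(msa);
    return NULL;
}

/* Allocates (num_seqs + 1) rows of query_length zeroed (unaligned) cells. */
DemoMsa* demo_msa_new(const DemoMsaDimensions* dims)
{
    unsigned int r;
    DemoMsa* msa;
    if (!dims) return NULL;
    msa = calloc(1, sizeof(*msa));
    if (!msa) return NULL;
    msa->dims = *dims;
    msa->data = calloc(dims->num_seqs + 1, sizeof(*msa->data));
    if (!msa->data) return demo_msa_free(msa);
    for (r = 0; r < dims->num_seqs + 1; r++) {
        msa->data[r] = calloc(dims->query_length, sizeof(**msa->data));
        if (!msa->data[r]) return demo_msa_free(msa);
    }
    return msa;
}

/* Copies the query residues into row 0 and marks every query cell aligned. */
void demo_msa_set_query(DemoMsa* msa, const unsigned char* query)
{
    unsigned int c;
    for (c = 0; c < msa->dims.query_length; c++) {
        msa->data[0][c].letter = query[c];
        msa->data[0][c].is_aligned = 1;
    }
}
```

Returning NULL from the free function mirrors the pointer-returning signature of the deallocation functions listed on this page; whether the toolkit functions always return NULL is assumed here.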
int PSICreatePssm (const PSIMsa *msap, const PSIBlastOptions *options, BlastScoreBlk *sbp, PSIMatrix **pssm)
Main entry point to core PSSM engine to calculate the PSSM.
msap | multiple sequence alignment data structure [in] |
options | options to the PSSM engine [in] |
sbp | BLAST score block structure [in|out] |
pssm | PSSM and statistical information (the latter is also returned in the sbp->kbp_gap_psi[0]) [out] |
Definition at line 95 of file blast_psi.c.
References NULL, and PSICreatePssmWithDiagnostics().
int PSICreatePssmFromCDD (const PSICdMsa *cd_msa, const PSIBlastOptions *options, BlastScoreBlk *sbp, const PSIDiagnosticsRequest *request, PSIMatrix **pssm, PSIDiagnosticsResponse **diagnostics)
Main entry point to core PSSM engine for computing CDD-based PSSMs.
cd_msa | information about CDs that match to query sequence [in] |
options | options to PSSM engine [in] |
sbp | BLAST score block structure [in|out] |
request | diagnostics information request [in] |
pssm | PSSM [out] |
diagnostics | diagnostics information response, expects a pointer to uninitialized structure [in|out] |
Definition at line 229 of file blast_psi.c.
References _PSIComputeFreqRatiosFromCDs(), _PSIComputeFrequenciesFromCDs(), _PSICreateAndScalePssmFromFrequencyRatios(), _PSIInternalPssmDataNew(), _PSISaveCDDiagnostics(), _PSISequenceWeightsNew(), _PSIValidateCdMSA(), BlastScoreBlk::alphabet_size, PSICdMsa::dimensions, PSIBlastOptions::impala_scaling_factor, NULL, PSIBlastOptions::pseudo_count, PSI_SUCCESS, PSIDiagnosticsResponseFree(), PSIDiagnosticsResponseNew(), PSIERR_BADPARAM, PSIERR_OUTOFMEM, PSIMatrixNew(), PSICdMsa::query, PSIMsaDimensions::query_length, s_PSICreatePssmCleanUp(), s_PSISavePssm(), and _PSISequenceWeights::std_prob.
Referenced by CPssmEngine::x_CreatePssmFromCDD().
int PSICreatePssmFromFrequencyRatios (const Uint1 *query, Uint4 query_length, BlastScoreBlk *sbp, double **freq_ratios, double impala_scaling_factor, PSIMatrix **pssm)
Top-level function to create a PSSM given a matrix of frequency ratios and perform scaling on the resulting PSSM (i.e., it performs the last two stages of the algorithm).
Note that no diagnostics can be returned, as those are calculated in earlier stages of the algorithm.
query | query sequence in ncbistdaa format, no sentinels needed [in] |
query_length | length of the query sequence [in] |
sbp | BLAST score block structure [in|out] |
freq_ratios | matrix of frequency ratios, dimensions are query_length by BLASTAA_SIZE [in] |
impala_scaling_factor | scaling factor used in IMPALA-style scaling if its value is NOT kPSSM_NoImpalaScaling (otherwise it performs standard PSI-BLAST scaling) [in] |
pssm | PSSM and statistical information [in|out] |
Definition at line 344 of file blast_psi.c.
References _PSICopyMatrix_double(), _PSICreateAndScalePssmFromFrequencyRatios(), _PSIInternalPssmDataNew(), BlastScoreBlk::alphabet_size, BLAST_GetStandardAaProbabilities(), _PSIInternalPssmData::freq_ratios, _PSIInternalPssmData::ncols, _PSIInternalPssmData::nrows, NULL, PSI_SUCCESS, PSIERR_OUTOFMEM, PSIMatrixNew(), query, s_PSICreatePssmFromFrequencyRatiosCleanUp(), and s_PSISavePssm().
Referenced by CPssmEngine::x_CreatePssmFromFreqRatios().
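The freq_ratios argument is documented as a matrix of query_length by BLASTAA_SIZE. A minimal sketch of allocating a matrix with that shape follows. `DEMO_AA_SIZE`, `DEMO_NO_IMPALA_SCALING` and the `demo_*` functions are hypothetical; their values stand in for BLASTAA_SIZE and kPSSM_NoImpalaScaling and are assumptions, so real code should use the toolkit constants.

```c
#include <assert.h>
#include <stdlib.h>

/* Assumed stand-in for BLASTAA_SIZE; use the toolkit constant in real code. */
#define DEMO_AA_SIZE 28
/* Assumed stand-in for kPSSM_NoImpalaScaling. */
#define DEMO_NO_IMPALA_SCALING 1.0

/* Frees a matrix built by demo_freq_ratios_new and returns NULL. */
double** demo_freq_ratios_free(double** m, unsigned int query_length)
{
    unsigned int i;
    if (!m) return NULL;
    for (i = 0; i < query_length; i++) free(m[i]);
    free(m);
    return NULL;
}

/* Allocates a query_length by DEMO_AA_SIZE matrix with every ratio set to fill. */
double** demo_freq_ratios_new(unsigned int query_length, double fill)
{
    unsigned int i, j;
    double** m = calloc(query_length, sizeof(*m));
    if (!m) return NULL;
    for (i = 0; i < query_length; i++) {
        m[i] = malloc(DEMO_AA_SIZE * sizeof(**m));
        if (!m[i]) return demo_freq_ratios_free(m, query_length);
        for (j = 0; j < DEMO_AA_SIZE; j++) m[i][j] = fill;
    }
    return m;
}

/* Nonzero when the factor differs from the "no IMPALA scaling" sentinel. */
int demo_uses_impala_scaling(double factor)
{
    return factor != DEMO_NO_IMPALA_SCALING;
}
```

The matrix is indexed as `m[position][residue]`, matching the query_length by alphabet ordering given in the parameter description.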
int PSICreatePssmWithDiagnostics (const PSIMsa *msap, const PSIBlastOptions *options, BlastScoreBlk *sbp, const PSIDiagnosticsRequest *request, PSIMatrix **pssm, PSIDiagnosticsResponse **diagnostics)
Main entry point to the core PSSM engine that allows requesting diagnostics information.
msap | multiple sequence alignment data structure [in] |
options | options to the PSSM engine [in] |
sbp | BLAST score block structure [in|out] |
request | diagnostics information request [in] |
pssm | PSSM and statistical information (the latter is also returned in the sbp->kbp_gap_psi[0]) [out] |
diagnostics | diagnostics information response, expects a pointer to an uninitialized structure which will be populated with the data requested in request [in|out] |
Definition at line 105 of file blast_psi.c.
References _PSIAlignedBlockNew(), _PSIComputeAlignmentBlocks(), _PSIComputeFreqRatios(), _PSIComputeSequenceWeights(), _PSICreateAndScalePssmFromFrequencyRatios(), _PSIInternalPssmDataNew(), _PSIMsaNew(), _PSIPackedMsaFree(), _PSIPackedMsaNew(), _PSIPurgeBiasedSegments(), _PSISaveDiagnostics(), _PSISequenceWeightsNew(), _PSIStructureGroupCustomization(), _PSIValidateMSA(), _PSIValidateMSA_StructureGroup(), BlastScoreBlk::alphabet_size, _PSIMsa::dimensions, PSIBlastOptions::ignore_unaligned_positions, PSIBlastOptions::impala_scaling_factor, PSIBlastOptions::nsg_compatibility_mode, NULL, PSIBlastOptions::pseudo_count, PSI_SUCCESS, PSIDiagnosticsResponseFree(), PSIDiagnosticsResponseNew(), PSIERR_BADPARAM, PSIERR_OUTOFMEM, PSIMatrixNew(), _PSIMsa::query, PSIMsaDimensions::query_length, s_PSICreatePssmCleanUp(), s_PSISavePssm(), and _PSISequenceWeights::std_prob.
Referenced by PSICreatePssm(), and CPssmEngine::x_CreatePssmFromMsa().
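The PSICreatePssm* functions return an int, and the reference lists above mention PSI_SUCCESS, PSIERR_BADPARAM and PSIERR_OUTOFMEM. A caller might translate such a status into a message along the following lines. This is only a sketch: the `DEMO_*` constants are placeholders with assumed numeric values, and `demo_psi_status_message` is a hypothetical helper, not part of the toolkit.

```c
#include <assert.h>
#include <string.h>

/* Placeholder status codes with assumed values; the real PSI_SUCCESS,
 * PSIERR_BADPARAM and PSIERR_OUTOFMEM come from the toolkit headers. */
enum {
    DEMO_PSI_SUCCESS     = 0,
    DEMO_PSIERR_BADPARAM = -1,
    DEMO_PSIERR_OUTOFMEM = -2
};

/* Maps a status code to a short human-readable description. */
const char* demo_psi_status_message(int status)
{
    switch (status) {
    case DEMO_PSI_SUCCESS:     return "success";
    case DEMO_PSIERR_BADPARAM: return "bad parameter";
    case DEMO_PSIERR_OUTOFMEM: return "out of memory";
    default:                   return "unrecognized status";
    }
}
```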
PSIDiagnosticsRequest* PSIDiagnosticsRequestFree (PSIDiagnosticsRequest *diags_request)
Deallocates the PSIDiagnosticsRequest structure passed in.
diags_request | structure to deallocate [in] |
Definition at line 611 of file blast_psi.c.
Referenced by CPsiBlastInputClustalW::~CPsiBlastInputClustalW().
PSIDiagnosticsRequest* PSIDiagnosticsRequestNew (void)
Allocates a PSIDiagnosticsRequest structure, setting all fields to false.
Definition at line 585 of file blast_psi.c.
References calloc().
Referenced by CPsiBlastInputClustalW::CPsiBlastInputClustalW(), PSIDiagnosticsRequestNewEx(), and CPsiBlastTestFixture::x_ComputePssmForNextIteration().
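Because the function references calloc(), the "all fields false" behavior presumably follows from zero-filled allocation. A minimal sketch of that pattern is shown here with a hypothetical `DemoDiagRequest` type. It carries only a subset of flags, named after fields referenced on this page, and is not the toolkit's structure.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical subset of request flags; not the real PSIDiagnosticsRequest. */
typedef struct DemoDiagRequest {
    int information_content;
    int residue_frequencies;
    int weighted_residue_frequencies;
    int frequency_ratios;
    int gapless_column_weights;
} DemoDiagRequest;

/* calloc zero-fills the structure, so every flag starts out false. */
DemoDiagRequest* demo_diag_request_new(void)
{
    return calloc(1, sizeof(DemoDiagRequest));
}

/* Counts how many diagnostics the caller has switched on. */
int demo_diag_request_count(const DemoDiagRequest* req)
{
    if (!req) return 0;
    return (req->information_content != 0)
         + (req->residue_frequencies != 0)
         + (req->weighted_residue_frequencies != 0)
         + (req->frequency_ratios != 0)
         + (req->gapless_column_weights != 0);
}
```

A caller would then set only the flags it needs before handing the request to the engine.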
PSIDiagnosticsRequest* PSIDiagnosticsRequestNewEx (Boolean save_ascii_pssm)
Allocates a PSIDiagnosticsRequest structure, setting fields to their default values for their use in the context of the PSI-BLAST application.
save_ascii_pssm | corresponds to the command line argument to save the PSSM in ASCII format [in] |
Definition at line 591 of file blast_psi.c.
References PSIDiagnosticsRequest::frequency_ratios, PSIDiagnosticsRequest::gapless_column_weights, PSIDiagnosticsRequest::information_content, PSIDiagnosticsRequest::interval_sizes, NULL, PSIDiagnosticsRequest::num_matching_seqs, PSIDiagnosticsRequestNew(), PSIDiagnosticsRequest::sigma, TRUE, and PSIDiagnosticsRequest::weighted_residue_frequencies.
Referenced by BOOST_AUTO_TEST_CASE(), CPsiBlastApp::ComputePssmForNextIteration(), CDeltaBlastApp::ComputePssmForNextPsiBlastIteration(), CDeltaBlast::Run(), and CPsiBlastArgs::x_CreatePssmFromMsa().
PSIDiagnosticsResponse* PSIDiagnosticsResponseFree (PSIDiagnosticsResponse *diags)
Deallocates the PSIDiagnosticsResponse structure passed in.
diags | structure to deallocate [in] |
Definition at line 717 of file blast_psi.c.
References _PSIDeallocateMatrix(), PSIDiagnosticsResponse::frequency_ratios, PSIDiagnosticsResponse::gapless_column_weights, PSIDiagnosticsResponse::independent_observations, PSIDiagnosticsResponse::information_content, PSIDiagnosticsResponse::interval_sizes, NULL, PSIDiagnosticsResponse::num_matching_seqs, PSIDiagnosticsResponse::query_length, PSIDiagnosticsResponse::residue_freqs, sfree, PSIDiagnosticsResponse::sigma, and PSIDiagnosticsResponse::weighted_residue_freqs.
Referenced by PSICreatePssmFromCDD(), PSICreatePssmWithDiagnostics(), and PSIDiagnosticsResponseNew().
PSIDiagnosticsResponse* PSIDiagnosticsResponseNew (Uint4 query_length, Uint4 alphabet_size, const PSIDiagnosticsRequest *request)
Allocates a new PSI-BLAST diagnostics structure based on which fields of the PSIDiagnosticsRequest structure are TRUE.
Note: this is declared here for consistency - this does not need to be called by client code of this API, it is called in the PSICreatePssm* functions to allocate the diagnostics response structure.
query_length | length of the query sequence [in] |
alphabet_size | length of the alphabet [in] |
request | diagnostics to retrieve from PSSM engine [in] |
Definition at line 618 of file blast_psi.c.
References _PSIAllocateMatrix(), PSIDiagnosticsResponse::alphabet_size, calloc(), PSIDiagnosticsRequest::frequency_ratios, PSIDiagnosticsResponse::frequency_ratios, PSIDiagnosticsRequest::gapless_column_weights, PSIDiagnosticsResponse::gapless_column_weights, PSIDiagnosticsRequest::independent_observations, PSIDiagnosticsResponse::independent_observations, PSIDiagnosticsRequest::information_content, PSIDiagnosticsResponse::information_content, PSIDiagnosticsRequest::interval_sizes, PSIDiagnosticsResponse::interval_sizes, NULL, PSIDiagnosticsRequest::num_matching_seqs, PSIDiagnosticsResponse::num_matching_seqs, PSIDiagnosticsResponseFree(), PSIDiagnosticsResponse::query_length, PSIDiagnosticsResponse::residue_freqs, PSIDiagnosticsRequest::residue_frequencies, PSIDiagnosticsRequest::sigma, PSIDiagnosticsResponse::sigma, PSIDiagnosticsResponse::weighted_residue_freqs, and PSIDiagnosticsRequest::weighted_residue_frequencies.
Referenced by PSICreatePssmFromCDD(), and PSICreatePssmWithDiagnostics().
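The idea of allocating only the diagnostics whose request flags are TRUE can be sketched as follows. The `DemoRequest` and `DemoResponse` types are hypothetical and carry just two fields. Treating each diagnostic as an array of query_length doubles is an assumption made for illustration.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical two-flag request; the real structure has more fields. */
typedef struct DemoRequest {
    int information_content;
    int sigma;
} DemoRequest;

/* Hypothetical response; a NULL array means "not requested". */
typedef struct DemoResponse {
    unsigned int query_length;
    double*      information_content;
    double*      sigma;
} DemoResponse;

/* Releases the response and any arrays it owns; returns NULL. */
DemoResponse* demo_response_free(DemoResponse* resp)
{
    if (!resp) return NULL;
    free(resp->information_content);
    free(resp->sigma);
    free(resp);
    return NULL;
}

/* Allocates only the arrays whose flags are set in the request. */
DemoResponse* demo_response_new(unsigned int query_length,
                                const DemoRequest* request)
{
    DemoResponse* resp;
    if (!request) return NULL;
    resp = calloc(1, sizeof(*resp));
    if (!resp) return NULL;
    resp->query_length = query_length;
    if (request->information_content) {
        resp->information_content = calloc(query_length, sizeof(double));
        if (!resp->information_content) return demo_response_free(resp);
    }
    if (request->sigma) {
        resp->sigma = calloc(query_length, sizeof(double));
        if (!resp->sigma) return demo_response_free(resp);
    }
    return resp;
}
```

On an allocation failure the constructor hands the partially built object to the free function, which is consistent with PSIDiagnosticsResponseNew listing PSIDiagnosticsResponseFree() among its references.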
PSIMatrix* PSIMatrixFree (PSIMatrix *matrix)
Deallocates the PSIMatrix structure passed in.
matrix | structure to deallocate [in] |
Definition at line 569 of file blast_psi.c.
References _PSIDeallocateMatrix(), PSIMatrix::ncols, NULL, PSIMatrix::pssm, and sfree.
Referenced by PSIMatrixNew(), s_PSICreatePssmCleanUp(), and s_PSICreatePssmFromFrequencyRatiosCleanUp().
PSIMatrix* PSIMatrixNew (Uint4 query_length, Uint4 alphabet_size)
Allocates a new PSIMatrix structure.
query_length | number of columns allocated for the PSSM [in] |
alphabet_size | number of rows allocated for the PSSM [in] |
Definition at line 541 of file blast_psi.c.
References _PSIAllocateMatrix(), PSIMatrix::h, PSIMatrix::kappa, PSIMatrix::lambda, malloc(), PSIMatrix::ncols, PSIMatrix::nrows, NULL, PSIMatrixFree(), PSIMatrix::pssm, PSIMatrix::ung_h, PSIMatrix::ung_kappa, and PSIMatrix::ung_lambda.
Referenced by PSICreatePssmFromCDD(), PSICreatePssmFromFrequencyRatios(), and PSICreatePssmWithDiagnostics().
PSIMsa* PSIMsaFree (PSIMsa *msa)
Deallocates the PSIMsa structure.
msa | multiple sequence alignment structure to deallocate [in] |
Definition at line 513 of file blast_psi.c.
References _PSIDeallocateMatrix(), PSIMsa::data, PSIMsa::dimensions, NULL, PSIMsaDimensions::num_seqs, and sfree.
Referenced by CPssmInputTestData::CPssmInputTestData(), PSIMsaNew(), CdPssmInput::~CdPssmInput(), CPsiBlastInputClustalW::~CPsiBlastInputClustalW(), CPsiBlastInputData::~CPsiBlastInputData(), CPssmInputFlankingGaps::~CPssmInputFlankingGaps(), CPssmInputTestData::~CPssmInputTestData(), and SU_PSSMInput::~SU_PSSMInput().
PSIMsa* PSIMsaNew (const PSIMsaDimensions *dimensions)
Allocates and initializes the multiple sequence alignment data structure for use as input to the PSSM engine.
dimensions | dimensions of multiple sequence alignment data structure to allocate [in] |
Definition at line 462 of file blast_psi.c.
References _PSIAllocateMatrix(), calloc(), PSIMsa::data, PSIMsa::dimensions, FALSE, PSIMsaCell::is_aligned, PSIMsaCell::letter, malloc(), NULL, PSIMsaDimensions::num_seqs, PSIMsaFree(), and PSIMsaDimensions::query_length.
Referenced by BOOST_AUTO_TEST_CASE(), CdPssmInput::CdPssmInput(), CPssmInputFlankingGaps::CPssmInputFlankingGaps(), CPsiBlastInputClustalW::Process(), CPsiBlastInputData::Process(), CPssmInputTestData::SetupDuplicateHit(), CPssmInputTestData::SetupHenikoffsPositionBasedSequenceWeights(), CPssmInputTestData::SetupMsaHasUnalignedRegion(), CPssmInputTestData::SetupQueryAlignedWithInternalGaps(), CPssmInputTestData::SetupSelfHit(), and SU_PSSMInput::SU_PSSMInput().