NCBI C++ ToolKit
blast_hits.h File Reference

Structures and API used for saving BLAST hits. More...

#include <algo/blast/core/ncbi_std.h>
#include <algo/blast/core/blast_export.h>
#include <algo/blast/core/blast_program.h>
#include <algo/blast/core/blast_query_info.h>
#include <algo/blast/core/blast_options.h>
#include <algo/blast/core/blast_parameters.h>
#include <algo/blast/core/blast_stat.h>
#include <algo/blast/core/gapinfo.h>
#include <algo/blast/core/blast_seqsrc.h>
#include <algo/blast/core/pattern.h>

Classes

struct  SBlastHitsParameters
 Keeps prelim_hitlist_size and HitSavingOptions together, mostly for use by hspstream. More...
 
struct  BlastSeg
 One sequence segment within an HSP. More...
 
struct  SPHIHspInfo
 In PHI BLAST: information about pattern match in a given HSP. More...
 
struct  BlastHSPMappingInfo
 Mapping information for an HSP. More...
 
struct  BlastHSP
 Structure holding all information about an HSP. More...
 
struct  BlastHSPList
 The structure to hold all HSPs for a given sequence after the gapped alignment. More...
 
struct  BlastHitList
 The structure to contain all BLAST results for one query sequence. More...
 
struct  BlastHSPResults
 The structure to contain all BLAST results, for multiple queries. More...
 

Macros

#define DBSEQ_CHUNK_OVERLAP   100
 By how much should the chunks of a subject sequence overlap if it is too long and has to be split. More...
 

Typedefs

typedef struct SequenceOverhangs SequenceOverhangs
 
typedef struct SBlastHitsParameters SBlastHitsParameters
 Keeps prelim_hitlist_size and HitSavingOptions together, mostly for use by hspstream. More...
 
typedef struct BlastSeg BlastSeg
 One sequence segment within an HSP. More...
 
typedef struct SPHIHspInfo SPHIHspInfo
 In PHI BLAST: information about pattern match in a given HSP. More...
 
typedef struct JumperEditsBlock JumperEditsBlock
 
typedef struct BlastHSPMappingInfo BlastHSPMappingInfo
 Mapping information for an HSP. More...
 
typedef struct BlastHSP BlastHSP
 Structure holding all information about an HSP. More...
 
typedef struct BlastHSPList BlastHSPList
 The structure to hold all HSPs for a given sequence after the gapped alignment. More...
 
typedef struct BlastHitList BlastHitList
 The structure to contain all BLAST results for one query sequence. More...
 
typedef struct BlastHSPResults BlastHSPResults
 The structure to contain all BLAST results, for multiple queries. More...
 

Functions

Int2 SBlastHitsParametersNew (const BlastHitSavingOptions *hit_options, const BlastExtensionOptions *ext_options, const BlastScoringOptions *scoring_options, SBlastHitsParameters **retval)
 Sets up small structures used by blast_hit.c for saving HSPs. More...
 
SBlastHitsParameters * SBlastHitsParametersDup (const SBlastHitsParameters *hit_params)
 Make a deep copy of the SBlastHitsParameters structure passed in. More...
 
SBlastHitsParameters * SBlastHitsParametersFree (SBlastHitsParameters *param)
 Deallocate SBlastHitsParameters. More...
 
BlastHSP * Blast_HSPFree (BlastHSP *hsp)
 Deallocate memory for an HSP structure. More...
 
BlastHSP * Blast_HSPNew (void)
 Allocate and zero out memory for an HSP structure. More...
 
Int2 Blast_HSPInit (Int4 query_start, Int4 query_end, Int4 subject_start, Int4 subject_end, Int4 query_gapped_start, Int4 subject_gapped_start, Int4 query_context, Int2 query_frame, Int2 subject_frame, Int4 score, GapEditScript **gap_edit, BlastHSP **ret_hsp)
 Allocates BlastHSP and inits with information from input. More...
 
BlastHSP * Blast_HSPClone (const BlastHSP *hsp)
 Make a deep copy of an HSP. More...
 
BlastHSPMappingInfo * BlastHSPMappingInfoFree (BlastHSPMappingInfo *info)
 Deallocate memory for an HSP's additional data structure. More...
 
BlastHSPMappingInfo * BlastHSPMappingInfoNew (void)
 Allocate memory for an HSP's additional data structure. More...
 
Boolean Blast_HSPReevaluateWithAmbiguitiesGapped (BlastHSP *hsp, const Uint1 *query_start, const Int4 query_length, const Uint1 *subject_start, const Int4 subject_length, const BlastHitSavingParameters *hit_params, const BlastScoringParameters *score_params, const BlastScoreBlk *sbp)
 Reevaluate the HSP's score and percent identity after taking into account the ambiguity information. More...
 
Boolean Blast_HSPReevaluateWithAmbiguitiesUngapped (BlastHSP *hsp, const Uint1 *query_start, const Uint1 *subject_start, const BlastInitialWordParameters *word_params, BlastScoreBlk *sbp, Boolean translated)
 Reevaluate the HSP's score and percent identity after taking into account the ambiguity information. More...
 
Int2 Blast_HSPGetNumIdentities (const Uint1 *query, const Uint1 *subject, BlastHSP *hsp, const BlastScoringOptions *score_options, Int4 *align_length_ptr)
 Calculate number of identities in an HSP and set the BlastHSP::num_ident field (unconditionally) More...
 
Int2 Blast_HSPGetNumIdentitiesAndPositives (const Uint1 *query, const Uint1 *subject, BlastHSP *hsp, const BlastScoringOptions *score_options, Int4 *align_length_ptr, const BlastScoreBlk *sbp)
 Calculate number of identities and positives in an HSP and set the BlastHSP::num_ident and BlastHSP::num_positives fields. More...
 
Boolean Blast_HSPTest (BlastHSP *hsp, const BlastHitSavingOptions *hit_options, Int4 align_length)
 Determines whether this HSP should be kept or deleted. More...
 
Boolean Blast_HSPTestIdentityAndLength (EBlastProgramType program_number, BlastHSP *hsp, const Uint1 *query, const Uint1 *subject, const BlastScoringOptions *score_options, const BlastHitSavingOptions *hit_options)
 Calculates number of identities and alignment lengths of an HSP via Blast_HSPGetNumIdentities and determines whether this HSP should be kept or deleted. More...
 
double Blast_HSPGetQueryCoverage (const BlastHSP *hsp, Int4 query_length)
 Calculate query coverage percentage of an hsp. More...
 
Boolean Blast_HSPQueryCoverageTest (BlastHSP *hsp, double min_query_coverage_pct, Int4 query_length)
 Calculate query coverage percentage of an hsp. More...
 
Int2 Blast_TrimHSPListByMaxHsps (BlastHSPList *hsp_list, const BlastHitSavingOptions *hit_options)
 
Int4 BlastHspNumMax (Boolean gapped_calculation, const BlastHitSavingOptions *options)
 Calculate the number of HSPs that should be saved. More...
 
void Blast_HSPCalcLengthAndGaps (const BlastHSP *hsp, Int4 *length, Int4 *gaps, Int4 *gap_opens)
 Calculate length of an HSP as length in query plus length of gaps in query. More...
 
void Blast_HSPGetAdjustedOffsets (EBlastProgramType program, BlastHSP *hsp, Int4 query_length, Int4 subject_length, Int4 *q_start, Int4 *q_end, Int4 *s_start, Int4 *s_end)
 Adjust HSP endpoint offsets according to strand/frame; return values in 1-offset coordinates instead of internal 0-offset. More...
 
Int2 Blast_HSPGetPartialSubjectTranslation (BLAST_SequenceBlk *subject_blk, BlastHSP *hsp, Boolean is_ooframe, const Uint1 *gen_code_string, Uint1 **translation_buffer_ptr, Uint1 **subject_ptr, Int4 *subject_length_ptr, Int4 *start_shift_ptr)
 Performs the translation and coordinates adjustment, if only part of the subject sequence is translated for gapped alignment. More...
 
void Blast_HSPAdjustSubjectOffset (BlastHSP *hsp, Int4 start_shift)
 Adjusts offsets if partial sequence was used for extension. More...
 
const Uint1 * Blast_HSPGetTargetTranslation (SBlastTargetTranslation *target_t, const BlastHSP *hsp, Int4 *translated_length)
 Returns a buffer with a protein translated from nucleotide. More...
 
BlastHSPList * Blast_HSPListFree (BlastHSPList *hsp_list)
 Deallocate memory for an HSP list structure as well as all its components. More...
 
BlastHSPList * Blast_HSPListNew (Int4 hsp_max)
 Creates HSP list structure with a default size HSP array. More...
 
Boolean Blast_HSPList_IsEmpty (const BlastHSPList *hsp_list)
 Returns true if the BlastHSPList contains no HSPs. More...
 
BlastHSPList * BlastHSPListDup (const BlastHSPList *hsp_list)
 Returns a duplicate (deep copy) of the given hsp list. More...
 
void Blast_HSPListSwap (BlastHSPList *list1, BlastHSPList *list2)
 Swaps the two HSP lists via structure assignment. More...
 
Int2 Blast_HSPListSaveHSP (BlastHSPList *hsp_list, BlastHSP *hsp)
 Saves HSP information into a BlastHSPList structure. More...
 
Int2 Blast_HSPListGetEvalues (EBlastProgramType program_number, const BlastQueryInfo *query_info, Int4 subject_length, BlastHSPList *hsp_list, Boolean gapped_calculation, Boolean RPS_prelim, const BlastScoreBlk *sbp, double gap_decay_rate, double scaling_factor)
 Calculate the expected values for all HSPs in a hit list, without using the sum statistics. More...
 
void Blast_HSPListPHIGetEvalues (BlastHSPList *hsp_list, BlastScoreBlk *sbp, const BlastQueryInfo *query_info, const SPHIPatternSearchBlk *pattern_blk)
 Calculate e-values for a PHI BLAST HSP list. More...
 
Int2 Blast_HSPListGetBitScores (BlastHSPList *hsp_list, Boolean gapped_calculation, const BlastScoreBlk *sbp)
 Calculate bit scores from raw scores in an HSP list. More...
 
void Blast_HSPListPHIGetBitScores (BlastHSPList *hsp_list, BlastScoreBlk *sbp)
 Calculate bit scores from raw scores in an HSP list for a PHI BLAST search. More...
 
Int2 Blast_HSPListReapByEvalue (BlastHSPList *hsp_list, const BlastHitSavingOptions *hit_options)
 Discard the HSPs above the e-value threshold from the HSP list. More...
 
Int2 Blast_HSPListReapByRawScore (BlastHSPList *hsp_list, const BlastHitSavingOptions *hit_options)
 Discard the HSPs below the raw score threshold from the HSP list. More...
 
Int2 Blast_HSPListReapByQueryCoverage (BlastHSPList *hsp_list, const BlastHitSavingOptions *hit_options, const BlastQueryInfo *query_info, EBlastProgramType program_number)
 Discard the HSPs below the min query coverage pct from the HSP list. More...
 
Int2 Blast_HSPListPurgeNullHSPs (BlastHSPList *hsp_list)
 Cleans out the NULLed out HSP's from the HSP array that is part of the BlastHSPList. More...
 
Int4 Blast_HSPListPurgeHSPsWithCommonEndpoints (EBlastProgramType program, BlastHSPList *hsp_list, Boolean purge)
 Check for an overlap of two different alignments and remove redundant HSPs. More...
 
Int4 Blast_HSPListSubjectBestHit (EBlastProgramType program, const BlastHSPSubjectBestHitOptions *subject_besthit_opts, const BlastQueryInfo *query_info, BlastHSPList *hsp_list)
 
Int2 Blast_HSPListReevaluateUngapped (EBlastProgramType program, BlastHSPList *hsp_list, BLAST_SequenceBlk *query_blk, BLAST_SequenceBlk *subject_blk, const BlastInitialWordParameters *word_params, const BlastHitSavingParameters *hit_params, const BlastQueryInfo *query_info, BlastScoreBlk *sbp, const BlastScoringParameters *score_params, const BlastSeqSrc *seq_src, const Uint1 *gen_code_string)
 Reevaluate all ungapped HSPs in an HSP list. More...
 
Int2 Blast_HSPListAppend (BlastHSPList **old_hsp_list_ptr, BlastHSPList **combined_hsp_list_ptr, Int4 hsp_num_max)
 Append one HSP list to the other. More...
 
Int2 Blast_HSPListsMerge (BlastHSPList **hsp_list, BlastHSPList **combined_hsp_list_ptr, Int4 hsp_num_max, Int4 *split_points, Int4 contexts_per_query, Int4 chunk_overlap_size, Boolean allow_gap, Boolean short_reads)
 Merge an HSP list from a chunk of the subject sequence into a previously computed HSP list. More...
 
void Blast_HSPListAdjustOffsets (BlastHSPList *hsp_list, Int4 offset)
 Adjust subject offsets in an HSP list if only part of the subject sequence was searched. More...
 
void Blast_HSPListAdjustOddBlastnScores (BlastHSPList *hsp_list, Boolean gapped_calculation, const BlastScoreBlk *sbp)
 For nucleotide BLAST, if the match reward score is equal to 2, random alignments are dominated by runs of exact matches, which all have even scores. More...
 
Boolean Blast_HSPListIsSortedByScore (const BlastHSPList *hsp_list)
 Check if HSP list is sorted by score. More...
 
void Blast_HSPListSortByScore (BlastHSPList *hsp_list)
 Sort the HSPs in an HSP list by score. More...
 
void Blast_HSPListSortByEvalue (BlastHSPList *hsp_list)
 Sort the HSPs in an HSP list by e-value, with scores and other criteria used to resolve ties. More...
 
BlastHitList * Blast_HitListNew (Int4 hitlist_size)
 Allocate memory for a hit list of a given size. More...
 
BlastHitList * Blast_HitListFree (BlastHitList *hitlist)
 Deallocate memory for the hit list. More...
 
Int2 Blast_HitListHSPListsFree (BlastHitList *hitlist)
 Deallocate memory for every HSP list on BlastHitList, as well as all their components. More...
 
Int2 Blast_HitListUpdate (BlastHitList *hit_list, BlastHSPList *hsp_list)
 Insert a new HSP list into the hit list. More...
 
Int2 Blast_HitListMerge (BlastHitList **old_hit_list_ptr, BlastHitList **combined_hit_list_ptr, Int4 contexts_per_query, Int4 *split_offsets, Int4 chunk_overlap_size, Boolean allow_gap)
 Combine two hitlists; both HitLists must contain HSPs that represent alignments to the same query sequence. More...
 
Int2 Blast_HitListPurgeNullHSPLists (BlastHitList *hit_list)
 Purges a BlastHitList of NULL HSP lists. More...
 
Int2 Blast_HitListSortByEvalue (BlastHitList *hit_list)
 Sort BlastHitList by e-value. More...
 
BlastHSPResults * Blast_HSPResultsNew (Int4 num_queries)
 Initialize the results structure. More...
 
BlastHSPResults * Blast_HSPResultsFree (BlastHSPResults *results)
 Deallocate memory for BLAST results. More...
 
Int2 Blast_HSPResultsSortByEvalue (BlastHSPResults *results)
 Sort each hit list in the BLAST results by best e-value. More...
 
Int2 Blast_HSPResultsReverseSort (BlastHSPResults *results)
 Sort each hit list in the BLAST results by best e-value, in reverse order. More...
 
Int2 Blast_HSPResultsReverseOrder (BlastHSPResults *results)
 Reverse order of HSP lists in each hit list in the BLAST results. More...
 
Int2 Blast_HSPResultsInsertHSPList (BlastHSPResults *results, BlastHSPList *hsp_list, Int4 hitlist_size)
 Insert an HSP list into the appropriate place in the results structure. More...
 
BlastHSPResults * Blast_HSPResultsFromHSPStream (struct BlastHSPStream *hsp_stream, size_t num_queries, SBlastHitsParameters *hit_param)
 Move all of the hits within an HSPStream into a BlastHSPResults structure. More...
 
BlastHSPResults * Blast_HSPResultsFromHSPStreamWithLimit (struct BlastHSPStream *hsp_stream, Uint4 num_queries, SBlastHitsParameters *hit_param, Uint4 max_num_hsps, Boolean *removed_hsps)
 As Blast_HSPResultsFromHSPStream, except the total number of HSPs kept for each query does not exceed an explicit limit. More...
 
BlastHSPResults * Blast_HSPResultsFromHSPStreamWithLimitEx (struct BlastHSPStream *hsp_stream, Uint4 num_queries, SBlastHitsParameters *hit_param, Uint4 max_num_hsps, Boolean *removed_hsps)
 As Blast_HSPResultsFromHSPStreamWithLimit, except accept and return an array of Boolean flags specifying which queries exceeded HSP limits. More...
 
BlastHSPResults ** PHIBlast_HSPResultsSplit (const BlastHSPResults *results, const SPHIQueryInfo *pattern_info)
 Splits the BlastHSPResults structure for a PHI BLAST search into an array of BlastHSPResults structures, corresponding to different pattern occurrences in query. More...
 
Int4 PhiBlastGetEffectiveNumberOfPatterns (const BlastQueryInfo *query_info)
 Count the number of occurrences of pattern in sequence, which do not overlap by more than half the pattern match length. More...
 
Int2 Blast_HSPResultsApplyMasklevel (BlastHSPResults *results, const BlastQueryInfo *query_info, Int4 masklevel, Int4 query_length)
 Apply Cross_match like masklevel to HSP list. More...
 
Int4 GetPrelimHitlistSize (Int4 hitlist_size, Int4 compositionBasedStats, Boolean gapped_calculation)
 

Detailed Description

Structures and API used for saving BLAST hits.

Definition in file blast_hits.h.

Macro Definition Documentation

◆ DBSEQ_CHUNK_OVERLAP

#define DBSEQ_CHUNK_OVERLAP   100

By how much should the chunks of a subject sequence overlap if it is too long and has to be split.

Definition at line 192 of file blast_hits.h.
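
As a rough illustration of what the overlap means, the sketch below steps through a long subject in fixed-size chunks, with each new chunk starting DBSEQ_CHUNK_OVERLAP residues before the end of the previous one. Only the overlap value comes from the macro above; the helper name next_chunk, the chunk length, and the return convention are assumptions made for this example and are not part of blast_hits.h.

```c
#include <assert.h>

#define DBSEQ_CHUNK_OVERLAP 100   /* value documented above */

/* Hypothetical helper, not part of the BLAST API: given the start of the
 * current chunk, store its end (exclusive) in *end and return the start of
 * the next chunk, or -1 when the current chunk reaches the subject end. */
static int next_chunk(int start, int chunk_len, int subject_len, int *end)
{
    /* A chunk must be longer than the overlap or no progress is made. */
    assert(chunk_len > DBSEQ_CHUNK_OVERLAP);

    *end = start + chunk_len;
    if (*end >= subject_len) {
        *end = subject_len;
        return -1;                       /* this was the last chunk */
    }
    return *end - DBSEQ_CHUNK_OVERLAP;   /* back up so chunks overlap */
}
```

With a hypothetical chunk length of 1000 and a subject of length 2500, the chunks would cover [0, 1000), [900, 1900) and [1800, 2500), so an alignment that straddles a chunk boundary by fewer than 100 residues is still seen whole in one chunk.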

Typedef Documentation

◆ BlastHitList

typedef struct BlastHitList BlastHitList

The structure to contain all BLAST results for one query sequence.

◆ BlastHSP

typedef struct BlastHSP BlastHSP

Structure holding all information about an HSP.

◆ BlastHSPList

typedef struct BlastHSPList BlastHSPList

The structure to hold all HSPs for a given sequence after the gapped alignment.

◆ BlastHSPMappingInfo

typedef struct BlastHSPMappingInfo BlastHSPMappingInfo

Mapping information for an HSP.

◆ BlastHSPResults

typedef struct BlastHSPResults BlastHSPResults

The structure to contain all BLAST results, for multiple queries.

◆ BlastSeg

typedef struct BlastSeg BlastSeg

One sequence segment within an HSP.

◆ JumperEditsBlock

typedef struct JumperEditsBlock JumperEditsBlock

Definition at line 90 of file blast_hits.h.

◆ SBlastHitsParameters

typedef struct SBlastHitsParameters SBlastHitsParameters

Keeps prelim_hitlist_size and HitSavingOptions together, mostly for use by hspstream.

◆ SequenceOverhangs

typedef struct SequenceOverhangs SequenceOverhangs

Definition at line 1 of file blast_hits.h.

◆ SPHIHspInfo

typedef struct SPHIHspInfo SPHIHspInfo

In PHI BLAST: information about pattern match in a given HSP.

Function Documentation

◆ Blast_HitListFree()

BlastHitList* Blast_HitListFree ( BlastHitList *  hitlist)

◆ Blast_HitListHSPListsFree()

Int2 Blast_HitListHSPListsFree ( BlastHitList *  hitlist)

Deallocate memory for every HSP list on BlastHitList, as well as all their components.

Parameters
hitlist  Contains the BlastHSPList array to be freed [in] [out]

Definition at line 3148 of file blast_hits.c.

References Blast_HSPListFree(), BlastHitList::hsplist_array, BlastHitList::hsplist_count, and sfree.

Referenced by Blast_HitListFree().

◆ Blast_HitListMerge()

Int2 Blast_HitListMerge ( BlastHitList **  old_hit_list_ptr,
BlastHitList **  combined_hit_list_ptr,
Int4  contexts_per_query,
Int4 *  split_offsets,
Int4  chunk_overlap_size,
Boolean  allow_gap 
)

Combine two hitlists; both HitLists must contain HSPs that represent alignments to the same query sequence.

Parameters
old_hit_list_ptr  Pointer to original HitList, will be NULLed out on return [in|out]
combined_hit_list_ptr  Pointer to the combined HitList [in|out]
contexts_per_query  The number of different contexts that can occur in hits from old_hit_list and combined_hit_list [in]
split_offsets  The query offset that marks the boundary between combined_hit_list and old_hit_list. HSPs in old_hit_list that hit to context i are assumed to lie to the right of split_offsets[i] [in]
chunk_overlap_size  The length of the overlap region between the sequence region containing old_hit_list and that containing combined_hit_list [in]
allow_gap  Allow merging HSPs at different diagonals [in]

Definition at line 2119 of file blast_hits.c.

References ASSERT, Blast_HitListFree(), Blast_HitListNew(), Blast_HitListUpdate(), Blast_HSPListAppend(), Blast_HSPListsMerge(), FALSE, BlastHSPList::hsp_max, BlastHitList::hsplist_array, BlastHitList::hsplist_count, BlastHitList::hsplist_max, i, NULL, BlastHSPList::oid, s_SortHSPListByOid(), and TRUE.

Referenced by BlastHSPStreamMerge().

◆ Blast_HitListNew()

BlastHitList* Blast_HitListNew ( Int4  hitlist_size)

◆ Blast_HitListPurgeNullHSPLists()

Int2 Blast_HitListPurgeNullHSPLists ( BlastHitList *  hit_list)

Purges a BlastHitList of NULL HSP lists.

Parameters
hit_list  BLAST hit list to purge. [in] [out]

Definition at line 3300 of file blast_hits.c.

References BlastHitList::hsplist_array, BlastHitList::hsplist_count, and NULL.

Referenced by Blast_HSPResultsApplyMasklevel(), and s_FilterBlastResults().
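
Purging NULL entries from a pointer array is usually done by compacting in place. The sketch below shows that general technique on a plain array of pointers; it is an illustration under that assumption, not the code in blast_hits.c, and the name purge_null_entries is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Compact an array of pointers in place so that all non-NULL entries end up
 * contiguous at the front, in their original order. Returns the new count.
 * Hypothetical stand-in for purging BlastHitList::hsplist_array. */
static int purge_null_entries(void **array, int count)
{
    int kept = 0;

    assert(array != NULL || count == 0);

    for (int i = 0; i < count; i++) {
        if (array[i] != NULL)
            array[kept++] = array[i];
    }
    /* Clear the tail so no stale pointers remain past the new count. */
    for (int i = kept; i < count; i++)
        array[i] = NULL;

    return kept;
}
```

A caller would then store the returned value as the new element count, which in the real structure corresponds to BlastHitList::hsplist_count.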

◆ Blast_HitListSortByEvalue()

Int2 Blast_HitListSortByEvalue ( BlastHitList *  hit_list)

Sort BlastHitList by e-value.

Parameters
hit_list  BLAST hit list to be sorted [in] [out]

Definition at line 3329 of file blast_hits.c.

References BlastHitList::hsplist_array, BlastHitList::hsplist_count, s_BlastHitListPurge(), and s_EvalueCompareHSPLists().

Referenced by BlastHSPCBSStreamClose(), s_BlastHSPBestHitFinal(), and s_BlastHSPCullingPipeRun().

◆ Blast_HitListUpdate()

Int2 Blast_HitListUpdate ( BlastHitList *  hit_list,
BlastHSPList *  hsp_list 
)

Insert a new HSP list into the hit list.

Before the capacity of the hit list is reached, just add to the end; after that, store in a heap to ensure efficient insertion and deletion. The heap order is reversed, with the worst e-value on top, for convenience of deletion.

Parameters
hit_list  Contains all HSP lists saved so far [in] [out]
hsp_list  A new HSP list to be inserted into the hit list [in]

Definition at line 3241 of file blast_hits.c.

References ASSERT, BlastHSPList::best_evalue, Blast_HSPListFree(), Blast_HSPListSortByEvalue(), BlastHitList::heapified, BlastHSPList::hsp_array, BlastHitList::hsplist_array, BlastHitList::hsplist_count, BlastHitList::hsplist_current, BlastHitList::hsplist_max, BlastHitList::low_score, MAX, MIN, s_Blast_HitListGrowHSPListArray(), s_BlastCheckBestEvalue(), s_BlastGetBestEvalue(), s_BlastHitListInsertHSPListInHeap(), s_CreateHeap(), s_EvalueCompareHSPLists(), BlastHSP::score, TRUE, and BlastHitList::worst_evalue.

Referenced by Blast_HitListMerge(), Blast_HSPResultsInsertHSPList(), BOOST_AUTO_TEST_CASE(), s_BlastHSPBestHitFinal(), s_BlastHSPCollectorRun(), s_BlastHSPCollectorRun_RPS(), s_ExportToHitlist(), and s_FillResultsFromCompoHeaps().
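
The append-then-heap strategy described for Blast_HitListUpdate() can be sketched on bare e-values. The code below is a simplified model under stated assumptions: the HitHeap type, the fixed capacity HITLIST_MAX, and the function names are hypothetical, and the real function stores whole BlastHSPList objects and applies additional tie-breaking that is not shown here.

```c
#include <assert.h>

#define HITLIST_MAX 4   /* hypothetical capacity, chosen for the sketch */

typedef struct {
    double evalue[HITLIST_MAX];  /* best e-value of each saved list */
    int    count;                /* number of entries in use */
    int    heapified;            /* set once the array is a heap */
} HitHeap;

/* Restore the max-heap property (worst, i.e. largest, e-value on top). */
static void sift_down(double *h, int n, int i)
{
    for (;;) {
        int l = 2 * i + 1, r = l + 1, big = i;
        if (l < n && h[l] > h[big]) big = l;
        if (r < n && h[r] > h[big]) big = r;
        if (big == i) return;
        double t = h[i]; h[i] = h[big]; h[big] = t;
        i = big;
    }
}

/* Returns 1 if the new e-value was kept, 0 if it was rejected. */
static int hit_heap_update(HitHeap *hh, double evalue)
{
    assert(hh->count <= HITLIST_MAX);

    if (hh->count < HITLIST_MAX) {          /* below capacity: append */
        hh->evalue[hh->count++] = evalue;
        return 1;
    }
    if (!hh->heapified) {                   /* first overflow: build heap */
        for (int i = HITLIST_MAX / 2 - 1; i >= 0; i--)
            sift_down(hh->evalue, HITLIST_MAX, i);
        hh->heapified = 1;
    }
    if (evalue >= hh->evalue[0])
        return 0;                           /* no better than the worst */
    hh->evalue[0] = evalue;                 /* replace the worst entry */
    sift_down(hh->evalue, HITLIST_MAX, 0);
    return 1;
}
```

Keeping the worst entry at the root means a full list can decide whether to accept a candidate with one comparison, and a replacement costs O(log n) rather than a re-sort.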

◆ Blast_HSPAdjustSubjectOffset()

void Blast_HSPAdjustSubjectOffset ( BlastHSP *  hsp,
Int4  start_shift 
)

Adjusts offsets if partial sequence was used for extension.

Parameters
hsp  The hit to work on [in][out]
start_shift  Amount of database sequence not used for extension. [in]

Definition at line 1316 of file blast_hits.c.

References BlastSeg::end, BlastSeg::gapped_start, BlastSeg::offset, and BlastHSP::subject.

Referenced by Blast_TracebackFromHSPList(), and s_GetTraceback().

◆ Blast_HSPCalcLengthAndGaps()

void Blast_HSPCalcLengthAndGaps ( const BlastHSP *  hsp,
Int4 *  length,
Int4 *  gaps,
Int4 *  gap_opens 
)

Calculate length of an HSP as length in query plus length of gaps in query.

If gap information is unavailable, return maximum between length in query and in subject.

Parameters
hsp  An HSP structure [in]
length  Length of this HSP [out]
gaps  Total number of gaps in this HSP [out]
gap_opens  Number of gap openings in this HSP [out]

Definition at line 1055 of file blast_hits.c.

References eGapAlignDel, eGapAlignIns, BlastSeg::end, BlastHSP::gap_info, GapEditScript::num, BlastSeg::offset, GapEditScript::op_type, BlastHSP::query, GapEditScript::size, and BlastHSP::subject.
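
The calculation described above can be sketched with a simplified edit script. Everything in the block is an assumption made for illustration: the EditOp values, the EditScript type, and the function name are hypothetical stand-ins, and no claim is made here about which of them corresponds to eGapAlignDel or eGapAlignIns in the real GapEditScript.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical edit operations for the sketch. */
typedef enum { OP_ALIGNED, OP_GAP_IN_QUERY, OP_GAP_IN_SUBJECT } EditOp;

typedef struct {
    const EditOp *op_type;  /* operation for each run */
    const int    *num;      /* length of each run */
    int           size;     /* number of runs */
} EditScript;

static void calc_length_and_gaps(int q_offset, int q_end,
                                 int s_offset, int s_end,
                                 const EditScript *script,
                                 int *length, int *gaps, int *gap_opens)
{
    int len = q_end - q_offset;      /* length in query */
    int s_len = s_end - s_offset;    /* length in subject */
    int total_gaps = 0, opens = 0;

    assert(length != NULL && gaps != NULL && gap_opens != NULL);

    if (script != NULL) {
        for (int i = 0; i < script->size; i++) {
            if (script->op_type[i] == OP_GAP_IN_QUERY) {
                len += script->num[i];        /* gap columns extend length */
                total_gaps += script->num[i];
                opens++;
            } else if (script->op_type[i] == OP_GAP_IN_SUBJECT) {
                total_gaps += script->num[i]; /* already counted in query */
                opens++;
            }
        }
    } else if (s_len > len) {
        len = s_len;                 /* no gap info: take the maximum */
    }

    *length = len;
    *gaps = total_gaps;
    *gap_opens = opens;
}
```

For a script of runs 5 aligned, 3 gap-in-query, 4 aligned, 1 gap-in-subject, 1 aligned, the query spans 11 residues and the subject 13, and the resulting alignment length is 14 columns with 4 gap positions in 2 gap openings.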

◆ Blast_HSPClone()

BlastHSP* Blast_HSPClone ( const BlastHSP *  hsp)

◆ Blast_HSPFree()

BlastHSP* Blast_HSPFree ( BlastHSP *  hsp)

◆ Blast_HSPGetAdjustedOffsets()

void Blast_HSPGetAdjustedOffsets ( EBlastProgramType  program,
BlastHSP *  hsp,
Int4  query_length,
Int4  subject_length,
Int4 *  q_start,
Int4 *  q_end,
Int4 *  s_start,
Int4 *  s_end 
)

Adjust HSP endpoint offsets according to strand/frame; return values in 1-offset coordinates instead of internal 0-offset.

Parameters
program  Type of BLAST program [in]
hsp  An HSP structure [in]
query_length  Length of query [in]
subject_length  Length of subject [in]
q_start  Start of alignment in query [out]
q_end  End of alignment in query [out]
s_start  Start of alignment in subject [out]
s_end  End of alignment in subject [out]

Definition at line 1109 of file blast_hits.c.

References Blast_QueryIsTranslated(), Blast_SubjectIsTranslated(), BlastSeg::end, BlastSeg::frame, BlastHSP::gap_info, BlastSeg::offset, BlastHSP::query, s_BlastSegGetTranslatedOffsets(), and BlastHSP::subject.

◆ Blast_HSPGetNumIdentities()

Int2 Blast_HSPGetNumIdentities ( const Uint1 *  query,
const Uint1 *  subject,
BlastHSP *  hsp,
const BlastScoringOptions *  score_options,
Int4 *  align_length_ptr 
)

Calculate number of identities in an HSP and set the BlastHSP::num_ident field (unconditionally).

Parameters
query  The query sequence [in]
subject  The uncompressed subject sequence [in]
hsp  All information about the HSP, the output of this function will be stored in its num_ident field [in|out]
score_options  Scoring options [in]
align_length_ptr  The alignment length, including gaps (optional) [out]
Returns
0 on success, -1 on invalid parameters or error

Definition at line 940 of file blast_hits.c.

References BlastScoringOptions::is_ooframe, NULL, BlastHSP::num_ident, BlastScoringOptions::program_number, query, s_Blast_HSPGetNumIdentitiesAndPositives(), s_Blast_HSPGetOOFNumIdentitiesAndPositives(), and subject.

Referenced by Blast_HSPTestIdentityAndLength(), and BOOST_AUTO_TEST_CASE().

◆ Blast_HSPGetNumIdentitiesAndPositives()

Int2 Blast_HSPGetNumIdentitiesAndPositives ( const Uint1 *  query,
const Uint1 *  subject,
BlastHSP *  hsp,
const BlastScoringOptions *  score_options,
Int4 *  align_length_ptr,
const BlastScoreBlk *  sbp 
)

Calculate number of identities and positives in an HSP and set the BlastHSP::num_ident and BlastHSP::num_positives fields.

Parameters
query  The query sequence [in]
subject  The uncompressed subject sequence [in]
hsp  All information about the HSP, the output of this function will be stored in its num_ident and num_positives fields [in|out]
score_options  Scoring options [in]
align_length_ptr  The alignment length, including gaps (optional) [out]
sbp  Score blk containing the matrix for counting positives [in]
Returns
0 on success, -1 on invalid parameters or error

Definition at line 966 of file blast_hits.c.

References BlastScoringOptions::is_ooframe, BlastHSP::num_ident, BlastHSP::num_positives, BlastScoringOptions::program_number, query, s_Blast_HSPGetNumIdentitiesAndPositives(), s_Blast_HSPGetOOFNumIdentitiesAndPositives(), and subject.

Referenced by Blast_HSPListReevaluateUngapped(), Blast_TracebackFromHSPList(), and s_ComputeNumIdentities().

◆ Blast_HSPGetPartialSubjectTranslation()

Int2 Blast_HSPGetPartialSubjectTranslation ( BLAST_SequenceBlk *  subject_blk,
BlastHSP *  hsp,
Boolean  is_ooframe,
const Uint1 *  gen_code_string,
Uint1 **  translation_buffer_ptr,
Uint1 **  subject_ptr,
Int4 *  subject_length_ptr,
Int4 *  start_shift_ptr 
)

Performs the translation and coordinates adjustment, if only part of the subject sequence is translated for gapped alignment.

Parameters
subject_blk  Subject sequence structure [in]
hsp  The HSP information [in] [out]
is_ooframe  Return a mixed-frame sequence if TRUE [in]
gen_code_string  Database genetic code [in]
translation_buffer_ptr  Pointer to buffer holding the translation [out]
subject_ptr  Pointer to sequence to be passed to the gapped alignment [out]
subject_length_ptr  Length of the translated sequence [out]
start_shift_ptr  How far the partial sequence is shifted w.r.t. the full sequence. [out]

Definition at line 1239 of file blast_hits.c.

References ASSERT, Blast_GetPartialTranslation(), CODON_LENGTH, BlastSeg::end, BlastSeg::frame, BlastSeg::gapped_start, BLAST_SequenceBlk::length, MAX, MAX_FULL_TRANSLATION, MIN, NULL, BlastSeg::offset, BLAST_SequenceBlk::sequence_start, sfree, BlastHSP::subject, and subject.

◆ Blast_HSPGetQueryCoverage()

double Blast_HSPGetQueryCoverage ( const BlastHSP *  hsp,
Int4  query_length 
)

Calculate query coverage percentage of an HSP.

Parameters
hsp  An HSP structure [in]
query_length  Length of query [in]
Returns
percentage query coverage of the input hsp

Definition at line 1034 of file blast_hits.c.

References BlastSeg::end, BlastSeg::offset, and BlastHSP::query.

Referenced by Blast_HSPQueryCoverageTest(), BOOST_AUTO_TEST_CASE(), and s_BuildScoreList().
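
Given that this function references only BlastSeg::end, BlastSeg::offset, and BlastHSP::query, a plausible reading is that coverage is the aligned query span divided by the query length, expressed as a percentage. The sketch below encodes that reading; the formula and the function name query_coverage_pct are assumptions, not a copy of the library code.

```c
#include <assert.h>

/* Percentage of the query covered by an alignment spanning
 * [q_offset, q_end) on a query of length query_length.
 * Hypothetical helper illustrating the assumed formula. */
static double query_coverage_pct(int q_offset, int q_end, int query_length)
{
    assert(query_length > 0);
    assert(q_end >= q_offset);
    return 100.0 * (double)(q_end - q_offset) / (double)query_length;
}
```

For example, an HSP covering query positions 10 through 59 (offset 10, end 60) on a 200-residue query would cover 25 percent of it under this definition.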

◆ Blast_HSPGetTargetTranslation()

const Uint1* Blast_HSPGetTargetTranslation ( SBlastTargetTranslation *  target_t,
const BlastHSP *  hsp,
Int4 *  translated_length 
)

◆ Blast_HSPInit()

Int2 Blast_HSPInit ( Int4  query_start,
Int4  query_end,
Int4  subject_start,
Int4  subject_end,
Int4  query_gapped_start,
Int4  subject_gapped_start,
Int4  query_context,
Int2  query_frame,
Int2  subject_frame,
Int4  score,
GapEditScript **  gap_edit,
BlastHSP **  ret_hsp 
)

Allocates a BlastHSP structure and initializes it with information from the input.

Parameters
query_start  Start of query alignment [in]
query_end  End of query alignment [in]
subject_start  Start of subject alignment [in]
subject_end  End of subject alignment [in]
query_gapped_start  Where gapped alignment started on query [in]
subject_gapped_start  Where gapped alignment started on subject [in]
query_context  The index of the query containing this HSP [in]
query_frame  Query frame: -3..3 for translated sequence, 1 or -1 for blastn, 0 for blastp [in]
subject_frame  Subject frame: -3..3 for translated sequence, 1 for blastn, 0 for blastp [in]
score  Score of alignment [in]
gap_edit  Will be transferred to the HSP and nulled out; may be NULL if a traceback was not calculated [in] [out]
ret_hsp  Allocated and filled in BlastHSP [out]

Definition at line 151 of file blast_hits.c.

References Blast_HSPNew(), BLASTERR_MEMORY, BlastHSP::context, BlastSeg::end, BlastSeg::frame, BlastHSP::gap_info, BlastSeg::gapped_start, NULL, BlastSeg::offset, BlastHSP::query, BlastHSP::score, and BlastHSP::subject.

Referenced by BLAST_GetGappedScore(), BLAST_GetUngappedHSPList(), BLAST_SmithWatermanGetGappedScore(), BlastNaExtendJumper(), BOOST_AUTO_TEST_CASE(), PHIGetGappedScore(), s_BlastHSPCopy(), s_CreateHSP(), s_CreateHSPForWordHit(), s_GetTraceback(), s_HSPListFromDistinctAlignments(), CRedoAlignmentTestFixture::setUpHSPList(), and ShortRead_IndexedWordFinder().

◆ Blast_HSPList_IsEmpty()

Boolean Blast_HSPList_IsEmpty ( const BlastHSPList *  hsp_list)

Returns true if the BlastHSPList contains no HSPs.

Parameters
hsp_list  List of HSPs to examine [in]

Definition at line 1578 of file blast_hits.c.

References FALSE, BlastHSPList::hspcnt, and TRUE.

Referenced by SThreadLocalDataArrayConsolidateResults().

◆ Blast_HSPListAdjustOddBlastnScores()

void Blast_HSPListAdjustOddBlastnScores ( BlastHSPList *  hsp_list,
Boolean  gapped_calculation,
const BlastScoreBlk *  sbp 
)

For nucleotide BLAST, if the match reward score is equal to 2, random alignments are dominated by runs of exact matches, which all have even scores.

This makes it impossible to estimate statistical parameters correctly for odd scores. Hence the raw score formula is adjusted: all scores are rounded down to the nearest even value in order to provide a conservative estimate.

Parameters
hsp_list  HSP list structure to adjust scores for. [in] [out]
gapped_calculation  Not an ungapped alignment [in]
sbp  Used for round_down Boolean

Definition at line 3051 of file blast_hits.c.

References Blast_HSPListSortByScore(), FALSE, BlastHSPList::hsp_array, BlastHSPList::hspcnt, BlastScoreBlk::round_down, and BlastHSP::score.

Referenced by Blast_HSPListReevaluateUngapped(), BLAST_LinkHsps(), BOOST_AUTO_TEST_CASE(), s_BlastSearchEngineOneContext(), and s_HSPListPostTracebackUpdate().
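
The rounding step itself is small enough to show directly. The sketch below rounds each score in a plain integer array down to the nearest even value; the function name is hypothetical, and the conditions under which the real function applies the adjustment (for example, the BlastScoreBlk::round_down flag listed in the references above) are not modeled.

```c
#include <assert.h>
#include <stddef.h>

/* Round every score down to the nearest even value, in place.
 * Hypothetical helper operating on bare scores instead of BlastHSP. */
static void round_scores_down_to_even(int *scores, int count)
{
    assert(scores != NULL || count == 0);

    for (int i = 0; i < count; i++)
        scores[i] -= (scores[i] & 1);   /* subtract 1 only from odd scores */
}
```

Because two HSPs with scores such as 7 and 6 become tied after rounding, a list that was sorted by score may need to be re-sorted afterwards, which is consistent with Blast_HSPListSortByScore() appearing in the references above.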

◆ Blast_HSPListAdjustOffsets()

void Blast_HSPListAdjustOffsets ( BlastHSPList *  hsp_list,
Int4  offset 
)

Adjust subject offsets in an HSP list if only part of the subject sequence was searched.

Used when a long subject sequence is split into more manageable chunks.

Parameters
hsp_list  List of HSPs from a chunk of a subject sequence [in]
offset  Offset where the chunk starts [in]

Definition at line 3035 of file blast_hits.c.

References BlastSeg::end, BlastSeg::gapped_start, BlastHSPList::hsp_array, BlastHSPList::hspcnt, BlastSeg::offset, offset, and BlastHSP::subject.

Referenced by s_BlastSearchEngineOneContext().
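
Judging from the fields referenced above, the adjustment amounts to shifting the subject segment coordinates of every HSP by the chunk start. The sketch below shows that idea on a reduced segment type; SubjectSeg and shift_subject_segments are hypothetical names, and the struct carries only the three fields named in the references.

```c
#include <assert.h>
#include <stddef.h>

/* Reduced stand-in for the subject BlastSeg of an HSP. */
typedef struct {
    int offset;        /* start of the alignment in the subject */
    int end;           /* end of the alignment in the subject */
    int gapped_start;  /* where the gapped extension started */
} SubjectSeg;

/* Convert chunk-relative subject coordinates to full-sequence coordinates. */
static void shift_subject_segments(SubjectSeg *segs, int count, int chunk_start)
{
    assert(segs != NULL || count == 0);

    for (int i = 0; i < count; i++) {
        segs[i].offset       += chunk_start;
        segs[i].end          += chunk_start;
        segs[i].gapped_start += chunk_start;
    }
}
```

An HSP found at subject positions 10 to 50 within a chunk that starts at position 900 would therefore be reported at 910 to 950 in the coordinates of the full subject.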

◆ Blast_HSPListAppend()

Int2 Blast_HSPListAppend ( BlastHSPList **  old_hsp_list_ptr,
BlastHSPList **  combined_hsp_list_ptr,
Int4  hsp_num_max 
)

Append one HSP list to the other.

Discard lower scoring HSPs if there is not enough space to keep all.

Parameters
old_hsp_list_ptr  List of HSPs, will be NULLed out on return [in|out]
combined_hsp_list_ptr  Pointer to the combined list of HSPs, possibly containing previously saved HSPs [in] [out]
hsp_num_max  Maximal allowed number of HSPs to save (unlimited if INT4_MAX) [in]
Returns
Status: 0 on success, -1 on failure.

Definition at line 2807 of file blast_hits.c.

References BlastHSPList::allocated, Blast_HSPListFree(), BlastHSPList::do_not_reallocate, BlastHSPList::hsp_array, BlastHSPList::hspcnt, MIN, NULL, s_BlastHSPListsCombineByScore(), and TRUE.

Referenced by Blast_HitListMerge(), and s_BlastSearchEngineCore().
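
The "append, then discard the lowest scores if there is not enough room" behavior can be illustrated on bare score arrays. This is a simplified model: append_scores is a hypothetical name, the caller is assumed to provide a combined buffer large enough for both inputs, and the sort-and-truncate approach is one straightforward way to keep the best entries rather than a description of s_BlastHSPListsCombineByScore().

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* qsort comparator: higher scores first. */
static int cmp_score_desc(const void *a, const void *b)
{
    int x = *(const int *)a, y = *(const int *)b;
    return (x < y) - (x > y);
}

/* Append old[0..old_count) to combined[0..combined_count). If the total
 * exceeds max_keep, keep only the max_keep highest scores. The combined
 * buffer must have room for combined_count + old_count entries.
 * Returns the resulting number of entries. */
static int append_scores(int *combined, int combined_count,
                         const int *old, int old_count, int max_keep)
{
    int total = combined_count + old_count;

    assert(max_keep >= 0);

    if (old_count > 0)
        memcpy(combined + combined_count, old,
               (size_t)old_count * sizeof *old);

    if (total > max_keep) {
        qsort(combined, (size_t)total, sizeof *combined, cmp_score_desc);
        total = max_keep;               /* drop the lowest-scoring tail */
    }
    return total;
}
```

When everything fits, the entries are simply concatenated and no sorting is needed, which keeps the common case cheap.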

◆ Blast_HSPListFree()

BlastHSPList* Blast_HSPListFree ( BlastHSPList hsp_list)

◆ Blast_HSPListGetBitScores()

Int2 Blast_HSPListGetBitScores ( BlastHSPList hsp_list,
Boolean  gapped_calculation,
const BlastScoreBlk sbp 
)

Calculate bit scores from raw scores in an HSP list.

Parameters
hsp_list  List of HSPs [in] [out]
gapped_calculation  Is this a gapped search? [in]
sbp  Scoring block with statistical parameters [in]

Definition at line 1907 of file blast_hits.c.

References ASSERT, BlastHSP::bit_score, BlastHSP::context, FALSE, BlastHSPList::hsp_array, BlastHSPList::hspcnt, BlastScoreBlk::kbp, BlastScoreBlk::kbp_gap, Blast_KarlinBlk::Lambda, Blast_KarlinBlk::logK, NCBIMATH_LN2, NULL, BlastScoreBlk::round_down, and BlastHSP::score.

Referenced by BLAST_ComputeTraceback_MT(), BLAST_PreliminarySearchEngine(), s_GetBitScores(), and s_HSPListPostTracebackUpdate().
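
For orientation, the conventional Karlin-Altschul conversion from a raw score S to a bit score is (Lambda * S - logK) / ln 2, which is consistent with the `Lambda`, `logK` and `NCBIMATH_LN2` references above. The sketch below assumes that formula and ignores any `round_down` handling; `raw_to_bit_score` and `LN2` are illustrative names, not toolkit symbols.

```c
#include <assert.h>

#define LN2 0.69314718055994530942   /* natural logarithm of 2 */

/* Convert a raw alignment score to a bit score.
 * lambda and logK play the role of Blast_KarlinBlk::Lambda and ::logK
 * (assumed to be the parameters for the HSP's context). */
double raw_to_bit_score(int score, double lambda, double logK)
{
    assert(lambda > 0.0);
    return (score * lambda - logK) / LN2;
}
```

Because the result is normalized by Lambda and K, bit scores computed this way can be compared across searches that use different scoring systems.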

◆ Blast_HSPListGetEvalues()

Int2 Blast_HSPListGetEvalues ( EBlastProgramType  program_number,
const BlastQueryInfo query_info,
Int4  subject_length,
BlastHSPList hsp_list,
Boolean  gapped_calculation,
Boolean  RPS_prelim,
const BlastScoreBlk sbp,
double  gap_decay_rate,
double  scaling_factor 
)

Calculate the expected values for all HSPs in a hit list, without using the sum statistics.

In case of multiple queries, the offsets are assumed to be already adjusted to individual query coordinates, and the contexts are set for each HSP.

Parameters
program_number  Type of BLAST program [in]
query_info  Auxiliary query information - needed only for effective search space calculation if it is not provided [in]
subject_length  Subject length - needed for Spouge's new FSC [in]
hsp_list  List of HSPs for one subject sequence [in] [out]
gapped_calculation  Is this for a gapped or ungapped search? [in]
RPS_prelim  Is this for an RPS preliminary search? [in]
sbp  Structure containing statistical information [in]
gap_decay_rate  Adjustment parameter to compensate for the effects of performing multiple tests when linking HSPs. No adjustment is made if 0. [in]
scaling_factor  Scaling factor by which Lambda should be divided. Used in RPS BLAST only; should be set to 1.0 in other cases. [in]

Definition at line 1811 of file blast_hits.c.

References ASSERT, BlastHSPList::best_evalue, BLAST_GapDecayDivisor(), Blast_HSPListIsSortedByScore(), BLAST_KarlinStoE_simple(), Blast_ProgramIsRpsBlast(), BLAST_SpougeStoE(), BlastHSP::context, BlastQueryInfo::contexts, BlastContextInfo::eff_searchsp, BlastHSP::evalue, FALSE, BlastScoreBlk::gbp, BlastHSPList::hsp_array, BlastHSPList::hspcnt, i, BlastScoreBlk::kbp, BlastScoreBlk::kbp_gap, Blast_KarlinBlk::Lambda, NULL, BlastScoreBlk::number_of_contexts, BlastContextInfo::query_length, BlastScoreBlk::round_down, s_BlastGetBestEvalue(), and BlastHSP::score.

Referenced by BLAST_LinkHsps(), BLAST_PreliminarySearchEngine(), s_BlastSearchEngineCore(), s_HitlistEvaluateAndPurge(), and s_HSPListPostTracebackUpdate().

◆ Blast_HSPListIsSortedByScore()

Boolean Blast_HSPListIsSortedByScore ( const BlastHSPList hsp_list)

◆ Blast_HSPListNew()

BlastHSPList* Blast_HSPListNew ( Int4  hsp_max)

◆ Blast_HSPListPHIGetBitScores()

void Blast_HSPListPHIGetBitScores ( BlastHSPList hsp_list,
BlastScoreBlk sbp 
)

Calculate bit scores from raw scores in an HSP list for a PHI BLAST search.

Parameters
hsp_list  List of HSPs [in] [out]
sbp  Scoring block with statistical parameters [in]

Definition at line 1934 of file blast_hits.c.

References ASSERT, BlastHSP::bit_score, BlastHSPList::hsp_array, BlastHSPList::hspcnt, BlastScoreBlk::kbp_gap, Blast_KarlinBlk::Lambda, lambda(), log, NCBIMATH_LN2, NULL, Blast_KarlinBlk::paramC, and BlastHSP::score.

Referenced by s_PHITracebackFromHSPList().

◆ Blast_HSPListPHIGetEvalues()

void Blast_HSPListPHIGetEvalues ( BlastHSPList hsp_list,
BlastScoreBlk sbp,
const BlastQueryInfo query_info,
const SPHIPatternSearchBlk pattern_blk 
)

Calculate e-values for a PHI BLAST HSP list.

Parameters
hsp_list  HSP list found by PHI BLAST [in] [out]
sbp  Scoring block with statistical parameters [in]
query_info  Structure containing information about pattern counts [in]
pattern_blk  Structure containing information about pattern hits in db [in]

Definition at line 1955 of file blast_hits.c.

References ASSERT, BlastHSPList::best_evalue, Blast_HSPListIsSortedByScore(), BlastHSPList::hsp_array, BlastHSPList::hspcnt, s_BlastGetBestEvalue(), and s_HSPPHIGetEvalue().

Referenced by BOOST_AUTO_TEST_CASE(), and s_PHITracebackFromHSPList().

◆ Blast_HSPListPurgeHSPsWithCommonEndpoints()

Int4 Blast_HSPListPurgeHSPsWithCommonEndpoints ( EBlastProgramType  program,
BlastHSPList hsp_list,
Boolean  purge 
)

Check for an overlap of two different alignments and remove redundant HSPs.

A sufficient overlap is when two alignments have the same start or end values. If an overlap is found, the HSP with the lowest score is removed; if both scores are the same, the first is removed.

Parameters
program  Type of BLAST program. For some programs (PHI BLAST), the purge should not be performed. [in]
hsp_list  Contains array of pointers to HSPs to purge [in]
purge  Should the hsp be purged? [in]
Returns
The number of valid alignments remaining.

Definition at line 2455 of file blast_hits.c.

References Blast_HSPFree(), Blast_HSPListPurgeNullHSPs(), Blast_ProgramIsPhiBlast(), context, eBlastTypeBlastn, BlastSeg::end, FALSE, BlastHSPList::hsp_array, BlastHSPList::hspcnt, i, NULL, BlastSeg::offset, BlastHSP::query, query, s_CutOffGapEditScript(), s_QueryEndCompareHSPs(), s_QueryOffsetCompareHSPs(), BlastHSP::subject, and TRUE.

Referenced by Blast_TracebackFromHSPList(), BOOST_AUTO_TEST_CASE(), and s_BlastSearchEngineOneContext().

◆ Blast_HSPListPurgeNullHSPs()

Int2 Blast_HSPListPurgeNullHSPs ( BlastHSPList hsp_list)

Cleans out the NULLed out HSPs from the HSP array that is part of the BlastHSPList.

Parameters
hsp_list  Contains array of pointers to HSP structures [in]
Returns
status of function call.

Definition at line 2225 of file blast_hits.c.

References BlastHSPList::hsp_array, BlastHSPList::hspcnt, and NULL.

Referenced by Blast_HSPListPurgeHSPsWithCommonEndpoints(), Blast_HSPListReevaluateUngapped(), Blast_HSPListsMerge(), Blast_HSPListSubjectBestHit(), Blast_TracebackFromHSPList(), BOOST_AUTO_TEST_CASE(), and s_PHITracebackFromHSPList().
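
Several of the callers listed above free individual HSPs by setting their array slots to NULL and then compact the array in a separate pass. A minimal sketch of such a compaction step, assuming simplified stand-in types (`Hsp`, `HspList` and `purge_null_hsps` are hypothetical names, not the toolkit's code):

```c
#include <assert.h>
#include <stddef.h>

typedef struct { int score; } Hsp;                        /* stand-in for BlastHSP */
typedef struct { Hsp **hsp_array; int hspcnt; } HspList;  /* stand-in for BlastHSPList */

/* Move non-NULL entries to the front (order preserved), NULL out the tail,
 * and update hspcnt. Returns 0 as a status code. */
int purge_null_hsps(HspList *hsp_list)
{
    int i, kept = 0;
    if (hsp_list == NULL || hsp_list->hspcnt == 0)
        return 0;
    for (i = 0; i < hsp_list->hspcnt; ++i) {
        if (hsp_list->hsp_array[i] != NULL)
            hsp_list->hsp_array[kept++] = hsp_list->hsp_array[i];
    }
    for (i = kept; i < hsp_list->hspcnt; ++i)
        hsp_list->hsp_array[i] = NULL;
    hsp_list->hspcnt = kept;
    return 0;
}
```

Deferring the compaction to one pass keeps each deletion O(1) and the whole purge O(n), rather than shifting the array on every removal.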

◆ Blast_HSPListReapByEvalue()

Int2 Blast_HSPListReapByEvalue ( BlastHSPList hsp_list,
const BlastHitSavingOptions hit_options 
)

Discard the HSPs above the e-value threshold from the HSP list.

Parameters
hsp_list  List of HSPs for one subject sequence [in] [out]
hit_options  Options block containing the e-value cut-off [in]

Definition at line 1976 of file blast_hits.c.

References ASSERT, Blast_HSPFree(), BlastHSP::evalue, BlastHitSavingOptions::expect_value, BlastHSPList::hsp_array, BlastHSPList::hspcnt, and NULL.

Referenced by BOOST_AUTO_TEST_CASE(), s_HitlistEvaluateAndPurge(), s_HSPListPostTracebackUpdate(), s_PHITracebackFromHSPList(), and LinkHspTestFixture::testUnevenGapLinkHsps().
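
The filtering described above can be sketched as a single in-place pass that frees HSPs above the threshold and keeps the rest in their original order. This is an illustration under simplified assumptions: `Hsp`, `HspList` and `reap_by_evalue` are hypothetical, HSPs are assumed to be heap-allocated with `malloc`, and `expect_value` stands for `BlastHitSavingOptions::expect_value`.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct { double evalue; } Hsp;                    /* stand-in for BlastHSP */
typedef struct { Hsp **hsp_array; int hspcnt; } HspList;  /* stand-in for BlastHSPList */

/* Free every HSP whose e-value exceeds expect_value and compact the array. */
int reap_by_evalue(HspList *hsp_list, double expect_value)
{
    int i, kept = 0;
    if (hsp_list == NULL)
        return 0;
    for (i = 0; i < hsp_list->hspcnt; ++i) {
        Hsp *hsp = hsp_list->hsp_array[i];
        assert(hsp != NULL);
        if (hsp->evalue > expect_value)
            free(hsp);                          /* discard */
        else
            hsp_list->hsp_array[kept++] = hsp;  /* keep, preserving order */
    }
    for (i = kept; i < hsp_list->hspcnt; ++i)
        hsp_list->hsp_array[i] = NULL;          /* clear stale slots */
    hsp_list->hspcnt = kept;
    return 0;
}
```

An HSP whose e-value equals the threshold is kept in this sketch; whether the toolkit treats the boundary the same way is not stated on this page.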

◆ Blast_HSPListReapByQueryCoverage()

Int2 Blast_HSPListReapByQueryCoverage ( BlastHSPList hsp_list,
const BlastHitSavingOptions hit_options,
const BlastQueryInfo query_info,
EBlastProgramType  program_number 
)

Discard the HSPs below the min query coverage pct from the HSP list.

Parameters
hsp_list  List of HSPs for one subject sequence [in] [out]
hit_options  Options block containing the min query coverage pct [in]
query_info  Structure containing information about the queries [in]
program_number  Type of BLAST program.

Definition at line 2010 of file blast_hits.c.

References ASSERT, BlastHSPList::best_evalue, Blast_HSPFree(), Blast_HSPQueryCoverageTest(), BlastHSP::context, BlastQueryInfo::contexts, FALSE, BlastHSPList::hsp_array, BlastHSPList::hspcnt, NULL, BlastHitSavingOptions::query_cov_hsp_perc, BlastContextInfo::query_length, s_BlastGetBestEvalue(), and TRUE.

Referenced by BLAST_PreliminarySearchEngine(), BOOST_AUTO_TEST_CASE(), and s_FilterBlastResults().

◆ Blast_HSPListReapByRawScore()

Int2 Blast_HSPListReapByRawScore ( BlastHSPList hsp_list,
const BlastHitSavingOptions hit_options 
)

Discard the HSPs above the raw threshold from the HSP list.

Parameters
hsp_list  List of HSPs for one subject sequence [in] [out]
hit_options  Options block containing the raw score cut-off (cutoff_score) [in] -RMH-

Definition at line 2076 of file blast_hits.c.

References ASSERT, Blast_HSPFree(), BlastHitSavingOptions::cutoff_score, BlastHSPList::hsp_array, BlastHSPList::hspcnt, NULL, and BlastHSP::score.

Referenced by BLAST_PreliminarySearchEngine(), BOOST_AUTO_TEST_CASE(), and s_BlastSearchEngineCore().

◆ Blast_HSPListReevaluateUngapped()

Int2 Blast_HSPListReevaluateUngapped ( EBlastProgramType  program,
BlastHSPList hsp_list,
BLAST_SequenceBlk query_blk,
BLAST_SequenceBlk subject_blk,
const BlastInitialWordParameters word_params,
const BlastHitSavingParameters hit_params,
const BlastQueryInfo query_info,
BlastScoreBlk sbp,
const BlastScoringParameters score_params,
const BlastSeqSrc seq_src,
const Uint1 gen_code_string 
)

Reevaluate all ungapped HSPs in an HSP list.

This is only done for an ungapped search, or if traceback is already available. Subject sequence is uncompressed and saved here (for nucleotide sequences). The number of identities is calculated for each HSP along the way, hence this function is called for all programs.

Parameters
program  Type of BLAST program [in]
hsp_list  The list of HSPs for one subject sequence [in] [out]
query_blk  The query sequence [in]
subject_blk  The subject sequence [in] [out]
word_params  Initial word parameters, containing ungapped cutoff score [in]
hit_params  Hit saving parameters, including cutoff score [in]
query_info  Auxiliary query information [in]
sbp  The statistical information [in]
score_params  Parameters related to scoring [in]
seq_src  The BLAST database structure (for retrieving uncompressed sequence) [in]
gen_code_string  Genetic code string in case of a translated database search. [in]

Definition at line 2607 of file blast_hits.c.

References ASSERT, Blast_HSPFree(), Blast_HSPGetNumIdentitiesAndPositives(), Blast_HSPGetTargetTranslation(), Blast_HSPListAdjustOddBlastnScores(), Blast_HSPListPurgeNullHSPs(), Blast_HSPListSortByScore(), Blast_HSPReevaluateWithAmbiguitiesUngapped(), Blast_HSPTest(), BLAST_SEQSRC_EXCLUDED, Blast_SubjectIsNucleotide(), Blast_SubjectIsTranslated(), BlastSeqSrcGetSequence(), BlastSeqSrcReleaseSequence(), BlastTargetTranslationFree(), BlastTargetTranslationNew(), BlastSeqSrcGetSeqArg::check_oid_exclusion, BlastHSP::context, context, BlastQueryInfo::contexts, eBlastEncodingNcbi4na, eBlastEncodingNucleotide, BlastSeqSrcGetSeqArg::encoding, FALSE, BlastScoringOptions::gapped_calculation, BlastHSPList::hsp_array, BlastHSPList::hspcnt, BlastScoringOptions::is_ooframe, NULL, BLAST_SequenceBlk::oid, BlastSeqSrcGetSeqArg::oid, BlastHitSavingParameters::options, BlastScoringParameters::options, BlastContextInfo::query_offset, BlastSeqSrcGetSeqArg::seq, BLAST_SequenceBlk::sequence, BLAST_SequenceBlk::sequence_nomask, BLAST_SequenceBlk::sequence_start, and TRUE.

Referenced by BLAST_PreliminarySearchEngine(), and BOOST_AUTO_TEST_CASE().

◆ Blast_HSPListSaveHSP()

Int2 Blast_HSPListSaveHSP ( BlastHSPList hsp_list,
BlastHSP hsp 
)

◆ Blast_HSPListsMerge()

Int2 Blast_HSPListsMerge ( BlastHSPList **  hsp_list,
BlastHSPList **  combined_hsp_list_ptr,
Int4  hsp_num_max,
Int4 split_points,
Int4  contexts_per_query,
Int4  chunk_overlap_size,
Boolean  allow_gap,
Boolean  short_reads 
)

Merge an HSP list from a chunk of the subject sequence into a previously computed HSP list.

Parameters
hsp_list  Contains HSPs from the new chunk [in]
combined_hsp_list_ptr  Contains HSPs from previous chunks [in] [out]
hsp_num_max  Maximal allowed number of HSPs to save (unlimited if INT4_MAX) [in]
split_points  The sequence offset (query or subject) that is the boundary between HSPs in combined_hsp_list and hsp_list. [in]
contexts_per_query  If positive, the number of query contexts that hits can contain. If negative, the (one) split point occurs on the subject sequence [in]
chunk_overlap_size  The length of the overlap region between the sequence region containing hsp_list and that containing combined_hsp_list [in]
allow_gap  Allow merging HSPs at different diagonals [in]
short_reads  Assume that queries are shorter than the database overlap region [in]
Returns
0 if HSP lists have been merged successfully, -1 otherwise.

Definition at line 2855 of file blast_hits.c.

References ABS, BlastHSPList::allocated, Blast_HSPFree(), Blast_HSPListFree(), Blast_HSPListPurgeNullHSPs(), BlastHSP::context, BlastHSPList::do_not_reallocate, BlastSeg::end, FALSE, BlastSeg::frame, BlastHSPList::hsp_array, BlastHSPList::hspcnt, MIN, NULL, BlastSeg::offset, OVERLAP_DIAG_CLOSE, BlastHSP::query, query, s_BlastHSPListsCombineByScore(), s_BlastMergeTwoHSPs(), s_HSPEndDiag(), s_HSPStartDiag(), BlastHSP::subject, and TRUE.

Referenced by Blast_HitListMerge(), BOOST_AUTO_TEST_CASE(), and s_BlastSearchEngineOneContext().

◆ Blast_HSPListSortByEvalue()

void Blast_HSPListSortByEvalue ( BlastHSPList hsp_list)

Sort the HSPs in an HSP list by e-value, with scores and other criteria used to resolve ties.

Checks if the HSP array is already sorted before proceeding with quicksort.

Parameters
hsp_list  Structure containing array of HSPs to be sorted. [in] [out]

Definition at line 1437 of file blast_hits.c.

References BlastHSPList::hsp_array, BlastHSPList::hspcnt, and s_EvalueCompareHSPs().

Referenced by Blast_HitListUpdate(), BlastHitList2SeqAlign_OMF(), BOOST_AUTO_TEST_CASE(), s_BLAST_OneSubjectResults2CSeqAlign(), and s_BlastHSPCullingPipeRun().

◆ Blast_HSPListSortByScore()

void Blast_HSPListSortByScore ( BlastHSPList hsp_list)

Sort the HSPs in an HSP list by score.

This type of sorting is done before the e-values are calculated, and also at the beginning of the traceback stage, where it is needed to eliminate the effects of wrong score order caused by the application of sum statistics. Checks if the HSP array is already sorted before proceeding with quicksort.

Parameters
hsp_list  Structure containing array of HSPs to be sorted. [in] [out]

Definition at line 1374 of file blast_hits.c.

References Blast_HSPListIsSortedByScore(), BlastHSPList::hsp_array, BlastHSPList::hspcnt, and ScoreCompareHSPs().

Referenced by BLAST_GetUngappedHSPList(), Blast_HSPListAdjustOddBlastnScores(), Blast_HSPListReevaluateUngapped(), Blast_HSPResultsApplyMasklevel(), BLAST_LinkHsps(), Blast_TracebackFromHSPList(), BlastHSPStreamMerge(), BOOST_AUTO_TEST_CASE(), PHIGetGappedScore(), CRedoAlignmentTestFixture::runRedoAlignmentCoreUnitTest(), s_BlastHSPBestHitFinal(), s_BlastHSPCullingFinal(), s_BlastHSPListRPSUpdate(), s_BlastHSPListsCombineByScore(), s_BlastSearchEngineOneContext(), s_HSPListFromDistinctAlignments(), s_HSPListRescaleScores(), s_PHITracebackFromHSPList(), CTracebackSearchTestFixture::x_GetSampleHspStream(), and CTracebackSearchTestFixture::x_GetSelfHitHspStream().
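
The check-then-sort pattern described for Blast_HSPListSortByScore() can be sketched as follows. This is an illustration only: `Hsp`, `HspList`, `score_compare`, `is_sorted_by_score` and `sort_by_score` are hypothetical names, and the comparator orders by score alone, whereas the toolkit's ScoreCompareHSPs() may apply further tie-breaking criteria.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct { int score; } Hsp;                        /* stand-in for BlastHSP */
typedef struct { Hsp **hsp_array; int hspcnt; } HspList;  /* stand-in for BlastHSPList */

/* qsort comparator: higher scores first (ties are left unresolved here). */
static int score_compare(const void *a, const void *b)
{
    const Hsp *h1 = *(Hsp *const *)a;
    const Hsp *h2 = *(Hsp *const *)b;
    return (h1->score < h2->score) - (h1->score > h2->score);
}

/* Return 1 if scores are already in non-increasing order, 0 otherwise. */
int is_sorted_by_score(const HspList *hsp_list)
{
    int i;
    for (i = 1; i < hsp_list->hspcnt; ++i)
        if (hsp_list->hsp_array[i - 1]->score < hsp_list->hsp_array[i]->score)
            return 0;
    return 1;
}

/* Sort by descending score, skipping qsort when the list is already sorted. */
void sort_by_score(HspList *hsp_list)
{
    if (hsp_list == NULL || hsp_list->hspcnt <= 1)
        return;
    if (!is_sorted_by_score(hsp_list))
        qsort(hsp_list->hsp_array, (size_t)hsp_list->hspcnt,
              sizeof(Hsp *), score_compare);
}
```

The linear pre-check is cheap compared with a sort, and it pays off when a function is called from many places that often pass an already sorted list.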

◆ Blast_HSPListSubjectBestHit()

Int4 Blast_HSPListSubjectBestHit ( EBlastProgramType  program,
const BlastHSPSubjectBestHitOptions subject_besthit_opts,
const BlastQueryInfo query_info,
BlastHSPList hsp_list 
)

◆ Blast_HSPListSwap()

void Blast_HSPListSwap ( BlastHSPList list1,
BlastHSPList list2 
)

Swaps the two HSP lists via structure assignment.

Definition at line 1614 of file blast_hits.c.

References tmp.

Referenced by Blast_RedoAlignmentCore_MT(), and Blast_TracebackFromHSPList().
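
Swapping by structure assignment means the two structs exchange their field values, including the pointer to the HSP array, without copying any HSPs. A minimal sketch with a hypothetical, reduced `HspList` type (the `oid` field stands in for the remaining bookkeeping fields):

```c
#include <assert.h>
#include <stddef.h>

typedef struct { void **hsp_array; int hspcnt; int oid; } HspList;  /* stand-in */

/* Exchange the contents of two lists by copying whole structs. */
void hsp_list_swap(HspList *list1, HspList *list2)
{
    HspList tmp;
    assert(list1 != NULL && list2 != NULL);
    tmp    = *list1;
    *list1 = *list2;
    *list2 = tmp;
}
```

After the call, pointers held by callers to `list1` and `list2` remain valid, but each now refers to the other list's contents.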

◆ Blast_HSPNew()

BlastHSP* Blast_HSPNew ( void  )

◆ Blast_HSPQueryCoverageTest()

Boolean Blast_HSPQueryCoverageTest ( BlastHSP hsp,
double  min_query_coverage_pct,
Int4  query_length 
)

Calculate query coverage percentage of an hsp.

Parameters
hsp  An HSP structure [in]
min_query_coverage_pct  Min query coverage pct for saving the hsp [in]
query_length  Length of query [in]
Returns
true if hsp's query coverage pct < min_query_coverage_pct (delete hsp)

Definition at line 1045 of file blast_hits.c.

References Blast_HSPGetQueryCoverage().

Referenced by Blast_HSPListReapByQueryCoverage(), and BOOST_AUTO_TEST_CASE().
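
Note the inverted sense of the return value: TRUE means the HSP fails the test and should be deleted. The sketch below illustrates this, assuming coverage is defined as the aligned query span divided by the query length; that definition, and the names `Seg`, `Hsp`, `query_coverage_pct` and `query_coverage_test`, are assumptions made for the example rather than the toolkit's Blast_HSPGetQueryCoverage().

```c
#include <assert.h>
#include <stddef.h>

typedef struct { int offset, end; } Seg;   /* stand-in for BlastSeg */
typedef struct { Seg query; } Hsp;         /* stand-in for BlastHSP */

/* Percentage of the query spanned by the HSP (assumed definition). */
double query_coverage_pct(const Hsp *hsp, int query_length)
{
    if (query_length <= 0)
        return 0.0;
    return 100.0 * (hsp->query.end - hsp->query.offset) / query_length;
}

/* Returns 1 (delete the HSP) if its coverage is below min_pct, else 0. */
int query_coverage_test(const Hsp *hsp, double min_pct, int query_length)
{
    assert(hsp != NULL);
    return query_coverage_pct(hsp, query_length) < min_pct;
}
```

With this convention a caller can write `if (query_coverage_test(hsp, min_pct, len)) { /* free hsp */ }`, which mirrors how the list-level reaping function above uses the test.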

◆ Blast_HSPReevaluateWithAmbiguitiesGapped()

Boolean Blast_HSPReevaluateWithAmbiguitiesGapped ( BlastHSP hsp,
const Uint1 query_start,
const Int4  query_length,
const Uint1 subject_start,
const Int4  subject_length,
const BlastHitSavingParameters hit_params,
const BlastScoringParameters score_params,
const BlastScoreBlk sbp 
)

Reevaluate the HSP's score and percent identity after taking into account the ambiguity information.

Used only for blastn after a greedy gapped extension with traceback. This function can remove part of the alignment at either end, if its score becomes negative after reevaluation. Traceback is also adjusted in that case.

Parameters
hsp  The HSP structure [in] [out]
query_start  Pointer to the start of the query sequence [in]
query_length  Length of the query sequence [in]
subject_start  Pointer to the start of the subject sequence [in]
subject_length  Length of the subject sequence [in]
hit_params  Hit saving parameters containing score cut-off [in]
score_params  Scoring parameters [in]
sbp  Score block with Karlin-Altschul parameters [in]
Returns
Should this HSP be deleted after the score reevaluation?

Definition at line 479 of file blast_hits.c.

References ASSERT, BlastHSP::context, BlastGappedCutoffs::cutoff_score, BlastHitSavingParameters::cutoffs, SBlastScoreMatrix::data, eGapAlignDel, eGapAlignIns, eGapAlignSub, BlastScoringParameters::gap_extend, BlastHSP::gap_info, BlastScoringParameters::gap_open, BlastScoreBlk::matrix, GapEditScript::num, BlastSeg::offset, GapEditScript::op_type, BlastScoringParameters::penalty, BlastHSP::query, query, BlastScoringParameters::reward, s_UpdateReevaluatedHSP(), GapEditScript::size, ncbi::grid::netcache::search::fields::size, BlastHSP::subject, subject, and TRUE.

Referenced by Blast_TracebackFromHSPList(), and BOOST_AUTO_TEST_CASE().

◆ Blast_HSPReevaluateWithAmbiguitiesUngapped()

Boolean Blast_HSPReevaluateWithAmbiguitiesUngapped ( BlastHSP hsp,
const Uint1 query_start,
const Uint1 subject_start,
const BlastInitialWordParameters word_params,
BlastScoreBlk sbp,
Boolean  translated 
)

Reevaluate the HSP's score and percent identity after taking into account the ambiguity information.

Used for ungapped searches with nucleotide database (blastn, tblastn, tblastx).

Parameters
hsp  The HSP structure [in] [out]
query_start  Pointer to the start of the query sequence [in]
subject_start  Pointer to the start of the subject sequence [in]
word_params  Initial word parameters with ungapped cutoff score [in]
sbp  Score block with Karlin-Altschul parameters [in]
translated  Are sequences protein (with a translated subject)? [in]
Returns
Should this HSP be deleted after the score reevaluation?

Definition at line 676 of file blast_hits.c.

References BlastHSP::context, BlastUngappedCutoffs::cutoff_score, BlastInitialWordParameters::cutoffs, SBlastScoreMatrix::data, BlastSeg::end, BlastScoreBlk::matrix, BlastSeg::offset, BlastHSP::query, query, s_UpdateReevaluatedHSPUngapped(), BlastHSP::subject, and subject.

Referenced by Blast_HSPListReevaluateUngapped(), and BOOST_AUTO_TEST_CASE().

◆ Blast_HSPResultsApplyMasklevel()

Int2 Blast_HSPResultsApplyMasklevel ( BlastHSPResults results,
const BlastQueryInfo query_info,
Int4  masklevel,
Int4  query_length 
)

◆ Blast_HSPResultsFree()

BlastHSPResults* Blast_HSPResultsFree ( BlastHSPResults results)

◆ Blast_HSPResultsFromHSPStream()

BlastHSPResults* Blast_HSPResultsFromHSPStream ( struct BlastHSPStream hsp_stream,
size_t  num_queries,
SBlastHitsParameters hit_param 
)

Move all of the hits within an HSPStream into a BlastHSPResults structure.

Parameters
hsp_stream  The HSPStream [in] [out]
num_queries  Number of queries in the search [in]
hit_param  Hit parameters [in]
Returns
The generated collection of HSP results

Definition at line 3633 of file blast_hits.c.

References Blast_HSPResultsInsertHSPList(), Blast_HSPResultsNew(), BlastHSPStreamRead(), kBlastHSPStream_Eof, NULL, SBlastHitsParameters::prelim_hitlist_size, and SBlastHitsParametersFree().

Referenced by Blast_HSPResultsFromHSPStreamWithLimit(), and Blast_HSPResultsFromHSPStreamWithLimitEx().

◆ Blast_HSPResultsFromHSPStreamWithLimit()

BlastHSPResults* Blast_HSPResultsFromHSPStreamWithLimit ( struct BlastHSPStream hsp_stream,
Uint4  num_queries,
SBlastHitsParameters hit_param,
Uint4  max_num_hsps,
Boolean removed_hsps 
)

As Blast_HSPResultsFromHSPStream, except the total number of HSPs kept for each query does not exceed an explicit limit.

The database sequences with the smallest number of hits are saved first, and hits are removed from query i if the average number of hits saved threatens to exceed (max_num_hsps / (number of DB sequences with hits to query i)).

Parameters
hsp_stream  The HSPStream [in] [out]
num_queries  Number of queries in the search [in]
hit_param  Hit parameters [in]
max_num_hsps  The limit on the number of HSPs to be kept for each query sequence [in]
removed_hsps  Set to TRUE if any hits were removed [out]
Returns
The generated collection of HSP results

Definition at line 3855 of file blast_hits.c.

References Blast_HSPResultsFromHSPStream(), FALSE, and s_TrimResultsByTotalHSPLimit().

◆ Blast_HSPResultsFromHSPStreamWithLimitEx()

BlastHSPResults* Blast_HSPResultsFromHSPStreamWithLimitEx ( struct BlastHSPStream hsp_stream,
Uint4  num_queries,
SBlastHitsParameters hit_param,
Uint4  max_num_hsps,
Boolean removed_hsps 
)

As Blast_HSPResultsFromHSPStreamWithLimit, except that it accepts and returns an array of Boolean flags specifying which queries exceeded HSP limits.

Definition at line 3873 of file blast_hits.c.

References Blast_HSPResultsFromHSPStream(), FALSE, and s_TrimResultsByTotalHSPLimitEx().

Referenced by CBlastPrelimSearch::ComputeBlastHSPResults().

◆ Blast_HSPResultsInsertHSPList()

Int2 Blast_HSPResultsInsertHSPList ( BlastHSPResults results,
BlastHSPList hsp_list,
Int4  hitlist_size 
)

Insert an HSP list into the appropriate place in the results structure.

All HSPs in this list must be from the same query and same subject; the oid and query_index fields must be set in the BlastHSPList input structure.

Parameters
results  The structure holding results for all queries [in] [out]
hsp_list  The results for one query-subject sequence pair. [in]
hitlist_size  Maximal allowed hit list size. [in]

Definition at line 3552 of file blast_hits.c.

References ASSERT, Blast_HitListNew(), Blast_HitListUpdate(), BlastHSPList::hspcnt, BlastHSPList::query_index, and results.

Referenced by BLAST_ComputeTraceback_MT(), Blast_HSPResultsFromHSPStream(), BOOST_AUTO_TEST_CASE(), PHIBlast_HSPResultsSplit(), s_RPSComputeTraceback(), s_TrimResultsByTotalHSPLimitEx(), and CPhiblastTestFixture::x_SetupResults().

◆ Blast_HSPResultsNew()

BlastHSPResults* Blast_HSPResultsNew ( Int4  num_queries)

◆ Blast_HSPResultsReverseOrder()

Int2 Blast_HSPResultsReverseOrder ( BlastHSPResults results)

Reverse order of HSP lists in each hit list in the BLAST results.

This allows HSP lists to be returned from the end of the arrays when reading from a collector HSP stream.

Definition at line 3418 of file blast_hits.c.

References BlastHitList::hsplist_array, BlastHitList::hsplist_count, and results.

Referenced by BlastHSPStreamClose(), and s_FillResultsFromCompoHeaps().
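
Reversing each hit list lets a reader take the next HSP list from the end of the array, which only requires decrementing a count. A minimal sketch of the reversal, assuming reduced stand-in types: `HspList`, `HitList`, `Results` and `results_reverse_order` are hypothetical, and the `hitlist_array` and `num_queries` members are assumed here rather than taken from this page.

```c
#include <assert.h>
#include <stddef.h>

typedef struct { int oid; } HspList;                                  /* stand-in */
typedef struct { HspList **hsplist_array; int hsplist_count; } HitList;
typedef struct { HitList **hitlist_array; int num_queries; } Results;

/* Reverse, in place, the order of HSP lists within every hit list. */
int results_reverse_order(Results *results)
{
    int q;
    if (results == NULL)
        return 0;
    for (q = 0; q < results->num_queries; ++q) {
        HitList *hl = results->hitlist_array[q];
        int lo, hi;
        if (hl == NULL)
            continue;                       /* query without hits */
        for (lo = 0, hi = hl->hsplist_count - 1; lo < hi; ++lo, --hi) {
            HspList *tmp = hl->hsplist_array[lo];
            hl->hsplist_array[lo] = hl->hsplist_array[hi];
            hl->hsplist_array[hi] = tmp;
        }
    }
    return 0;
}
```

If the hit list was sorted best-first before the reversal, the best HSP list ends up last and is therefore the first one popped from the end.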

◆ Blast_HSPResultsReverseSort()

Int2 Blast_HSPResultsReverseSort ( BlastHSPResults results)

Sort each hit list in the BLAST results by best e-value, in reverse order.

Definition at line 3402 of file blast_hits.c.

References BlastHitList::hsplist_array, BlastHitList::hsplist_count, results, s_BlastHitListPurge(), and s_EvalueCompareHSPListsRev().

Referenced by BlastHSPStreamClose().

◆ Blast_HSPResultsSortByEvalue()

Int2 Blast_HSPResultsSortByEvalue ( BlastHSPResults results)

◆ Blast_HSPTest()

Boolean Blast_HSPTest ( BlastHSP hsp,
const BlastHitSavingOptions hit_options,
Int4  align_length 
)

Determines whether this HSP should be kept or deleted.

Parameters
hsp  An HSP structure [in] [out]
hit_options  Hit saving options containing percent identity and HSP length thresholds.
align_length  alignment length including gaps
Returns
FALSE if HSP passes the test, TRUE if it should be deleted.

Definition at line 1027 of file blast_hits.c.

References s_HSPTest().

Referenced by Blast_HSPListReevaluateUngapped(), and Blast_TracebackFromHSPList().

◆ Blast_HSPTestIdentityAndLength()

Boolean Blast_HSPTestIdentityAndLength ( EBlastProgramType  program_number,
BlastHSP hsp,
const Uint1 query,
const Uint1 subject,
const BlastScoringOptions score_options,
const BlastHitSavingOptions hit_options 
)

Calculates number of identities and alignment lengths of an HSP via Blast_HSPGetNumIdentities and determines whether this HSP should be kept or deleted.

Parameters
program_number  Type of BLAST program [in]
hsp  An HSP structure [in] [out]
query  Query sequence [in]
subject  Subject sequence [in]
score_options  Scoring options, needed to distinguish the out-of-frame case. [in]
hit_options  Hit saving options containing percent identity and HSP length thresholds.
Returns
FALSE if HSP passes the test, TRUE if it should be deleted.

Definition at line 1004 of file blast_hits.c.

References ASSERT, Blast_HSPGetNumIdentities(), FALSE, query, s_HSPTest(), and subject.

Referenced by Blast_TracebackFromHSPList(), BOOST_AUTO_TEST_CASE(), and s_GetTraceback().

◆ Blast_TrimHSPListByMaxHsps()

Int2 Blast_TrimHSPListByMaxHsps ( BlastHSPList hsp_list,
const BlastHitSavingOptions hit_options 
)

◆ BlastHSPListDup()

BlastHSPList* BlastHSPListDup ( const BlastHSPList hsp_list)

Returns a duplicate (deep copy) of the given hsp list.

Definition at line 1583 of file blast_hits.c.

References BlastHSPList::hsp_array, BlastHSPList::hspcnt, and malloc().

Referenced by Blast_TracebackFromHSPList().

◆ BlastHSPMappingInfoFree()

BlastHSPMappingInfo* BlastHSPMappingInfoFree ( BlastHSPMappingInfo info)

Deallocate memory for an HSP's additional data structure.

Definition at line 192 of file blast_hits.c.

References info, JumperEditsBlockFree(), NULL, SequenceOverhangsFree(), and sfree.

Referenced by Blast_HSPFree().

◆ BlastHSPMappingInfoNew()

BlastHSPMappingInfo* BlastHSPMappingInfoNew ( void  )

Allocate memory for an HSP's additional data structure.

Definition at line 207 of file blast_hits.c.

References calloc().

Referenced by Blast_HSPClone(), BlastNaExtendJumper(), s_CreateHSP(), s_CreateHSPForWordHit(), and ShortRead_IndexedWordFinder().

◆ BlastHspNumMax()

Int4 BlastHspNumMax ( Boolean  gapped_calculation,
const BlastHitSavingOptions options 
)

Calculates the number of HSPs that should be saved.

Parameters
gapped_calculation  ungapped if false [in]
options  BlastHitSavingOptions object [in]
Returns
number of HSPs to save.

Definition at line 213 of file blast_hits.c.

References BlastHitSavingOptions::hsp_num_max, and INT4_MAX.

Referenced by BLAST_GetGappedScore(), BLAST_GetUngappedHSPList(), BLAST_SmithWatermanGetGappedScore(), BlastHSPBestHitParamsNew(), BlastHSPCollectorParamsNew(), JumperNaWordFinder(), PHIGetGappedScore(), s_BlastSearchEngineCore(), s_BlastSearchEngineOneContext(), SBlastHitsParametersNew(), and ShortRead_IndexedWordFinder().
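
Going by the references to `BlastHitSavingOptions::hsp_num_max` and `INT4_MAX`, one plausible reading is that a non-positive option value means "no limit". The sketch below encodes that assumption only; `HitSavingOptions` and `hsp_limit` are hypothetical names, `INT32_MAX` stands in for `INT4_MAX`, and the role of `gapped_calculation` is not modelled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct { int32_t hsp_num_max; } HitSavingOptions;  /* stand-in */

/* Number of HSPs to save per subject: the configured maximum,
 * or INT32_MAX when no positive maximum is set (assumed behaviour). */
int32_t hsp_limit(int gapped_calculation, const HitSavingOptions *options)
{
    (void)gapped_calculation;   /* not used in this sketch */
    assert(options != NULL);
    return options->hsp_num_max <= 0 ? INT32_MAX : options->hsp_num_max;
}
```

Returning the largest representable value for "unlimited" lets callers compare counts against the result without a special case, which matches the "unlimited if INT4_MAX" wording used for `hsp_num_max` parameters elsewhere on this page.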

◆ GetPrelimHitlistSize()

Int4 GetPrelimHitlistSize ( Int4  hitlist_size,
Int4  compositionBasedStats,
Boolean  gapped_calculation 
)

Definition at line 44 of file blast_hits.c.

References MAX, MIN, and NULL.

Referenced by BlastHSPBestHitParamsNew(), BlastHSPCollectorParamsNew(), and SBlastHitsParametersNew().

◆ PHIBlast_HSPResultsSplit()

BlastHSPResults** PHIBlast_HSPResultsSplit ( const BlastHSPResults results,
const SPHIQueryInfo pattern_info 
)

Splits the BlastHSPResults structure for a PHI BLAST search into an array of BlastHSPResults structures, corresponding to different pattern occurrences in query.

All HSPs are copied, so it is safe to free the returned BlastHSPResults structures independently of the input results structure.

Parameters
results  All results from a PHI BLAST search, with HSPs for different query pattern occurrences mixed together. [in]
pattern_info  Information about pattern occurrences in query. [in]
Returns
Array of pointers to BlastHSPResults structures, corresponding to different pattern occurrences.

Definition at line 3570 of file blast_hits.c.

References Blast_HSPListNew(), Blast_HSPListSaveHSP(), Blast_HSPResultsInsertHSPList(), Blast_HSPResultsNew(), Blast_HSPResultsSortByEvalue(), calloc(), BlastHSPList::hsp_array, BlastHSPList::hspcnt, BlastHitList::hsplist_array, BlastHitList::hsplist_count, BlastHitList::hsplist_max, SPHIHspInfo::index, NULL, BlastHSPList::oid, BlastHSP::pat_info, pattern_info(), results, s_BlastHSPCopy(), and sfree.

Referenced by BOOST_AUTO_TEST_CASE(), and PhiBlastResults2SeqAlign_OMF().

◆ PhiBlastGetEffectiveNumberOfPatterns()

Int4 PhiBlastGetEffectiveNumberOfPatterns ( const BlastQueryInfo query_info)

Count the number of occurrences of pattern in sequence, which do not overlap by more than half the pattern match length.

Parameters
query_info  Query information structure, containing pattern info. [in]

Definition at line 360 of file blast_hits.c.

References ASSERT, BlastQueryInfo::contexts, count, BlastContextInfo::length_adjustment, SPHIQueryInfo::num_patterns, SPHIQueryInfo::occurrences, SPHIPatternInfo::offset, and BlastQueryInfo::pattern_info.

Referenced by s_HSPPHIGetEvalue(), and s_PhiBlastCutoffScore().
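
One way to realize such a count is a single greedy pass over the occurrence offsets: an occurrence is counted only if it starts more than half a pattern length after the last counted one. The sketch below assumes offsets sorted in ascending order and a single pattern length; `effective_pattern_count` and its parameters are illustrative names, not the toolkit's code.

```c
#include <assert.h>
#include <stddef.h>

/* Count pattern occurrences, skipping any that start within half a
 * pattern length of the last counted occurrence.
 * offsets must be sorted in ascending order. */
int effective_pattern_count(const int *offsets, int num_patterns,
                            int pattern_length)
{
    int i, count, last;
    if (num_patterns <= 1)
        return num_patterns;            /* zero or one occurrence */
    assert(offsets != NULL);
    count = 1;
    last = offsets[0];
    for (i = 1; i < num_patterns; ++i) {
        if ((offsets[i] - last) * 2 > pattern_length) {
            last = offsets[i];          /* far enough apart: count it */
            ++count;
        }
    }
    return count;
}
```

For offsets 0, 2, 10, 12, 30 and a pattern length of 8, the occurrences at 2 and 12 are skipped because each starts within 4 positions of the previously counted one, so the effective count is 3.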

◆ SBlastHitsParametersDup()

SBlastHitsParameters* SBlastHitsParametersDup ( const SBlastHitsParameters hit_params)

Make a deep copy of the SBlastHitsParameters structure passed in.

Parameters
hit_params  source hit parameters structure [in]
Returns
NULL if out of memory, otherwise deep copy of first argument

Definition at line 101 of file blast_hits.c.

References malloc(), and NULL.

◆ SBlastHitsParametersFree()

SBlastHitsParameters* SBlastHitsParametersFree ( SBlastHitsParameters param)

Deallocates SBlastHitsParameters.

Parameters
param  object to be freed.
Returns
NULL pointer.

Definition at line 115 of file blast_hits.c.

References NULL, and sfree.

Referenced by Blast_HSPResultsFromHSPStream(), BOOST_AUTO_TEST_CASE(), CBlastTracebackSearch::Run(), and CBlastTracebackSearch::RunSimple().

◆ SBlastHitsParametersNew()

Int2 SBlastHitsParametersNew ( const BlastHitSavingOptions hit_options,
const BlastExtensionOptions ext_options,
const BlastScoringOptions scoring_options,
SBlastHitsParameters **  retval 
)

Sets up small structures used by blast_hits.c for saving HSPs.

Parameters
hit_options  fields hitlist_size and hsp_num_max are needed; a pointer to this structure will be stored in the resulting structure. [in]
ext_options  field compositionBasedStats needed here. [in]
scoring_options  gapped_calculation needed here. [in]
retval  the allocated SBlastHitsParameters*
Returns
zero on success, 1 on NULL parameter, 2 if calloc fails.

Definition at line 75 of file blast_hits.c.

References ASSERT, BlastHspNumMax(), BlastExtensionOptions::compositionBasedStats, BlastScoringOptions::gapped_calculation, GetPrelimHitlistSize(), BlastHitSavingOptions::hitlist_size, malloc(), and NULL.

Referenced by BOOST_AUTO_TEST_CASE(), CBlastPrelimSearch::ComputeBlastHSPResults(), CBlastTracebackSearch::Run(), and CBlastTracebackSearch::RunSimple().

Modified on Fri Sep 20 14:57:43 2024 by modify_doxy.py rev. 669887