NCBI C++ ToolKit
BLAST filtering functions.
#include <algo/blast/core/ncbi_std.h>
#include <algo/blast/core/blast_def.h>
#include <algo/blast/core/blast_program.h>
#include <algo/blast/core/blast_query_info.h>
#include <algo/blast/core/blast_message.h>
#include <algo/blast/core/blast_options.h>
Macros
#define | REPEATS_SEARCH_EVALUE 0.1 |
Repeats filtering default options.
#define | REPEATS_SEARCH_MINSCORE 26 |
Default score cutoff.
#define | REPEATS_SEARCH_PENALTY -1 |
Default mismatch penalty.
#define | REPEATS_SEARCH_REWARD 1 |
Default match reward.
#define | REPEATS_SEARCH_GAP_OPEN 2 |
Default gap opening cost.
#define | REPEATS_SEARCH_GAP_EXTEND 1 |
Default gap extension cost.
#define | REPEATS_SEARCH_WORD_SIZE 11 |
Default word size.
#define | REPEATS_SEARCH_XDROP_UNGAPPED 40 |
Default X-dropoff for ungapped extension.
#define | REPEATS_SEARCH_XDROP_FINAL 90 |
Default X-dropoff for gapped extension with traceback.
#define | REPEATS_SEARCH_FILTER_STRING "F" |
Default filter string - no filtering.
#define | REPEAT_MASK_LINK_VALUE 5 |
Largest gap allowed to be filled between repeat mask intervals.
Functions
BlastSeqLoc * | BlastSeqLocNew (BlastSeqLoc **head, Int4 from, Int4 to) |
Create and initialize a new sequence interval.
BlastSeqLoc * | BlastSeqLocAppend (BlastSeqLoc **head, BlastSeqLoc *node) |
Appends the BlastSeqLoc to the list of BlastSeqLoc-s pointed to by head.
BlastSeqLoc * | BlastSeqLocNodeFree (BlastSeqLoc *node) |
Deallocate a single BlastSeqLoc structure and its contents, without following its next pointer.
BlastSeqLoc * | BlastSeqLocFree (BlastSeqLoc *loc) |
Deallocate all BlastSeqLoc objects in a chain.
BlastSeqLoc * | BlastSeqLocListDup (BlastSeqLoc *head) |
Make a deep copy of the linked list of BlastSeqLoc-s pointed to by its argument.
void | BlastSeqLocReverse (BlastSeqLoc *masks, Int4 query_length) |
Converts reverse strand coordinates to forward strand in place.
void | BlastSeqLocCombine (BlastSeqLoc **mask_loc, Int4 link_value) |
Go through all mask locations in one sequence and combine any that overlap, deallocating the unneeded locations.
BlastMaskLoc * | BlastMaskLocNew (Int4 total) |
Allocate memory for a BlastMaskLoc.
BlastMaskLoc * | BlastMaskLocDup (const BlastMaskLoc *mask_loc) |
Perform a deep copy of the BlastMaskLoc structure passed to this function.
BlastMaskLoc * | BlastMaskLocFree (BlastMaskLoc *mask_loc) |
Deallocate memory for a BlastMaskLoc structure as well as the BlastSeqLoc's pointed to.
Int2 | BlastMaskLocDNAToProtein (BlastMaskLoc *mask_loc, const BlastQueryInfo *query_info) |
Given a BlastMaskLoc with an array of lists of DNA mask locations, substitutes that array by a new array of per-protein-frame mask location lists.
Int2 | BlastMaskLocProteinToDNA (BlastMaskLoc *mask_loc, const BlastQueryInfo *query_info) |
Given a BlastMaskLoc with an array of lists of mask locations per protein frame, recalculates all mask offsets in terms of the DNA sequence.
Int2 | BLAST_ComplementMaskLocations (EBlastProgramType program_number, const BlastQueryInfo *query_info, const BlastMaskLoc *mask_loc, BlastSeqLoc **complement_mask) |
This function takes the list of mask locations (i.e., regions that should not be searched or not added to lookup table) and makes up a set of SSeqRange*'s in the concatenated sequence built from a set of queries, that should be searched (that is, takes the complement).
Int2 | BlastSetUp_Filter (EBlastProgramType program_number, Uint1 *sequence, Int4 length, Int4 offset, const SBlastFilterOptions *filter_options, BlastSeqLoc **seqloc_retval, Blast_Message **blast_message) |
Runs seg filtering functions, according to the filtering options, returns BlastSeqLoc*.
Int2 | BlastSetUp_GetFilteringLocations (BLAST_SequenceBlk *query_blk, const BlastQueryInfo *query_info, EBlastProgramType program_number, const SBlastFilterOptions *filter_options, BlastMaskLoc **filter_out, Blast_Message **blast_message) |
Does preparation for filtering and then calls BlastSetUp_Filter.
void | Blast_MaskTheResidues (Uint1 *buffer, Int4 length, Boolean is_na, const BlastSeqLoc *mask_loc, Boolean reverse, Int4 offset) |
Masks the letters in buffer.
void | Blast_MaskUnsupportedAA (BLAST_SequenceBlk *seq, Uint1 min_invalid) |
Mask protein letters that are currently unsupported.
void | BlastSetUp_MaskQuery (BLAST_SequenceBlk *query_blk, const BlastQueryInfo *query_info, const BlastMaskLoc *filter_maskloc, EBlastProgramType program_number) |
Masks the sequence given a BlastMaskLoc.
Int2 | BlastFilteringOptionsFromString (EBlastProgramType program_number, const char *instructions, SBlastFilterOptions **filtering_options, Blast_Message **blast_message) |
Produces SBlastFilterOptions from a string that has been traditionally supported in blast.
char * | BlastFilteringOptionsToString (const SBlastFilterOptions *filtering_options) |
Convert the filtering options structure to a string.
static NCBI_INLINE Boolean | BlastIsReverseStrand (Boolean is_na, Int4 context) |
Determines whether this is a nucleotide query and whether this is a minus strand or not.
Variables
const Uint1 | kNuclMask |
BLASTNA element used to mask bases in BLAST.
const Uint1 | kProtMask |
NCBISTDAA element used to mask residues in BLAST.
BLAST filtering functions.
Definition in file blast_filter.h.
#define REPEAT_MASK_LINK_VALUE 5 |
Largest gap allowed to be filled between repeat mask intervals.
Definition at line 72 of file blast_filter.h.
#define REPEATS_SEARCH_EVALUE 0.1 |
Repeats filtering default options.
Default e-value threshold, keep for C toolkit
Definition at line 57 of file blast_filter.h.
#define REPEATS_SEARCH_FILTER_STRING "F" |
Default filter string - no filtering.
Definition at line 69 of file blast_filter.h.
#define REPEATS_SEARCH_GAP_EXTEND 1 |
Default gap extension cost.
Definition at line 62 of file blast_filter.h.
#define REPEATS_SEARCH_GAP_OPEN 2 |
Default gap opening cost.
Definition at line 61 of file blast_filter.h.
#define REPEATS_SEARCH_MINSCORE 26 |
Default score cutoff.
Definition at line 58 of file blast_filter.h.
#define REPEATS_SEARCH_PENALTY -1 |
Default mismatch penalty.
Definition at line 59 of file blast_filter.h.
#define REPEATS_SEARCH_REWARD 1 |
Default match reward.
Definition at line 60 of file blast_filter.h.
#define REPEATS_SEARCH_WORD_SIZE 11 |
Default word size.
Definition at line 63 of file blast_filter.h.
#define REPEATS_SEARCH_XDROP_FINAL 90 |
Default X-dropoff for gapped extension with traceback.
Definition at line 67 of file blast_filter.h.
#define REPEATS_SEARCH_XDROP_UNGAPPED 40 |
Default X-dropoff for ungapped extension.
Definition at line 65 of file blast_filter.h.
Int2 BLAST_ComplementMaskLocations | ( | EBlastProgramType | program_number, |
const BlastQueryInfo * | query_info, | ||
const BlastMaskLoc * | mask_loc, | ||
BlastSeqLoc ** | complement_mask | ||
) |
This function takes the list of mask locations (i.e., regions that should not be searched or not added to lookup table) and makes up a set of SSeqRange*'s in the concatenated sequence built from a set of queries, that should be searched (that is, takes the complement).
If all sequences in the query set are completely filtered, then an SSeqRange is created and both of its elements (left and right) are set to -1 to indicate this. If any of the mask_loc's is NULL, an SSeqRange for the full span of the respective query sequence is created.
program_number | Type of BLAST program [in] |
query_info | The query information structure [in] |
mask_loc | All mask locations [in] |
complement_mask | Linked list of SSeqRange*s in the concatenated sequence to be indexed in the lookup table. [out] |
Definition at line 1017 of file blast_filter.c.
References ASSERT, BlastIsReverseStrand(), BlastSeqLocListReverse(), BlastSeqLocNew(), context, BlastQueryInfo::contexts, eBlastTypeBlastn, eBlastTypeMapping, FALSE, first(), BlastQueryInfo::first_context, BlastContextInfo::is_valid, SSeqRange::left, BlastSeqLoc::next, NULL, BlastContextInfo::query_length, BlastContextInfo::query_offset, SSeqRange::right, BlastMaskLoc::seqloc_array, BlastSeqLoc::ssr, and TRUE.
Referenced by BLAST_MainSetUp().
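The complement step described for BLAST_ComplementMaskLocations can be illustrated for a single sequence. The following is a minimal sketch, not the library's implementation: the `Range` type and `complement_ranges` function are hypothetical stand-ins, the input is assumed to be sorted, non-overlapping, and 0-based inclusive, and the handling of multiple queries and strands is left out.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for SSeqRange: 0-based, inclusive bounds. */
typedef struct { int left; int right; } Range;

/* Writes the complement of `masks` within [0, length-1] into `out`.
 * `masks` is assumed sorted by start and non-overlapping; `out` must
 * have room for n_masks + 1 ranges. Returns the number of ranges
 * written. A fully masked sequence yields the single range {-1, -1},
 * mirroring the sentinel described above. */
size_t complement_ranges(const Range *masks, size_t n_masks,
                         int length, Range *out)
{
    size_t n_out = 0;
    int next_free = 0; /* first position not yet covered by a mask */

    assert(out != NULL && length > 0);

    for (size_t i = 0; i < n_masks; ++i) {
        if (masks[i].left > next_free) {
            /* Unmasked gap before this mask. */
            out[n_out].left = next_free;
            out[n_out].right = masks[i].left - 1;
            ++n_out;
        }
        if (masks[i].right + 1 > next_free)
            next_free = masks[i].right + 1;
    }
    if (next_free < length) {
        /* Unmasked tail after the last mask. */
        out[n_out].left = next_free;
        out[n_out].right = length - 1;
        ++n_out;
    }
    if (n_out == 0) {
        /* Everything is masked: emit the sentinel range. */
        out[0].left = -1;
        out[0].right = -1;
        n_out = 1;
    }
    return n_out;
}
```

With no masks at all (`n_masks == 0`), the sketch returns one range spanning the full sequence, which corresponds to the NULL mask_loc case described above.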
void Blast_MaskTheResidues | ( | Uint1 * | buffer, |
Int4 | length, | ||
Boolean | is_na, | ||
const BlastSeqLoc * | mask_loc, | ||
Boolean | reverse, | ||
Int4 | offset | ||
) |
Masks the letters in buffer.
This is a low-level routine and takes a raw buffer which it assumes to be in ncbistdaa (protein) or blastna (nucleotide).
buffer | the sequence to be masked (will be modified, cannot be NULL or undefined behavior will result). [in|out] |
length | length of the sequence to be masked. [in] |
is_na | nucleotide if TRUE [in] |
mask_loc | the BlastSeqLoc to use for masking [in] |
reverse | minus strand if TRUE [in] |
offset | how far along the sequence the first residue in buffer is [in] |
Definition at line 1307 of file blast_filter.c.
References ASSERT, buffer, kNuclMask, kProtMask, SSeqRange::left, BlastSeqLoc::next, offset, SSeqRange::right, and BlastSeqLoc::ssr.
Referenced by BlastSetUp_MaskQuery(), and s_DoSegSequenceData().
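To make the roles of reverse and offset concrete, here is a minimal sketch of buffer masking over a linked list of intervals. It is an illustration under stated assumptions, not the code in blast_filter.c: `MaskNode` and `mask_buffer` are hypothetical names, the mask letter is passed in explicitly rather than chosen from kNuclMask or kProtMask via is_na, coordinates are assumed 0-based and inclusive, and the clamping to the buffer bounds is a defensive choice made for this sketch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a BlastSeqLoc node and its SSeqRange. */
typedef struct MaskNode {
    int left;
    int right;
    const struct MaskNode *next;
} MaskNode;

/* Overwrites every masked position in `buffer` with `mask_letter`.
 * `reverse` flips each interval onto the minus strand; `offset` is the
 * position in the full sequence of the first letter in `buffer`. */
void mask_buffer(unsigned char *buffer, int length,
                 unsigned char mask_letter, const MaskNode *mask_loc,
                 int reverse, int offset)
{
    assert(buffer != NULL);

    for (; mask_loc != NULL; mask_loc = mask_loc->next) {
        int start, stop;
        if (reverse) {
            start = length - 1 - mask_loc->right;
            stop = length - 1 - mask_loc->left;
        } else {
            start = mask_loc->left;
            stop = mask_loc->right;
        }
        /* Translate full-sequence coordinates into buffer indices. */
        start -= offset;
        stop -= offset;
        /* Clamp so that the sketch never writes outside the buffer. */
        if (start < 0)
            start = 0;
        if (stop >= length)
            stop = length - 1;
        for (int i = start; i <= stop; ++i)
            buffer[i] = mask_letter;
    }
}
```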
void Blast_MaskUnsupportedAA | ( | BLAST_SequenceBlk * | seq, |
Uint1 | min_invalid | ||
) |
Mask protein letters that are currently unsupported.
This routine is used to make the core ignore letters within protein sequences that cannot (yet) be correctly handled.
seq | Protein sequence to be masked (ncbistdaa format required). Letters whose numerical value exceeds a cutoff are converted into kProtMask values [in|out] |
min_invalid | The first ncbistdaa value that is considered invalid. All sequence letters with numerical value >= this number are masked [in] |
Definition at line 1336 of file blast_filter.c.
References i, kProtMask, BLAST_SequenceBlk::length, and BLAST_SequenceBlk::sequence.
Referenced by LookupTableWrapInit_MT().
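The cutoff rule for min_invalid can be sketched on a raw byte array. This is an illustrative sketch only: `mask_unsupported` is a hypothetical helper operating on a plain buffer instead of a BLAST_SequenceBlk, and the mask letter is a parameter standing in for kProtMask.

```c
#include <assert.h>
#include <stddef.h>

/* Replaces every letter whose numerical value is >= min_invalid with
 * mask_letter, leaving all other letters untouched. */
void mask_unsupported(unsigned char *seq, size_t length,
                      unsigned char min_invalid,
                      unsigned char mask_letter)
{
    assert(seq != NULL || length == 0);

    for (size_t i = 0; i < length; ++i) {
        if (seq[i] >= min_invalid)
            seq[i] = mask_letter;
    }
}
```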
Int2 BlastFilteringOptionsFromString | ( | EBlastProgramType | program_number, |
const char * | instructions, | ||
SBlastFilterOptions ** | filtering_options, | ||
Blast_Message ** | blast_message | ||
) |
Produces SBlastFilterOptions from a string that has been traditionally supported in blast.
program_number | Type of BLAST program [in] |
instructions | the string describing the filtering to be done [in] |
filtering_options | the structure to be filled in [out] |
blast_message | optional field for error messages [out] |
Definition at line 436 of file blast_filter.c.
References Blast_MessageWrite(), buffer, calloc(), SRepeatFilterOptions::database, SWindowMaskerOptions::database, dbname(), eBlastSevError, eBlastTypeBlastn, eBlastTypeMapping, eEmpty, FALSE, SSegOptions::hicut, kBlastMessageNoContext, SDustOptions::level, SDustOptions::linker, SSegOptions::locut, NULL, NULLB, s_LoadOptionsToBuffer(), s_ParseDustOptions(), s_ParseRepeatOptions(), s_ParseSegOptions(), s_ParseWindowMaskerOptions(), SBlastFilterOptionsNew(), SDustOptionsFree(), SDustOptionsNew(), sfree, SRepeatFilterOptionsFree(), SRepeatFilterOptionsNew(), SSegOptionsFree(), SSegOptionsNew(), strcasecmp, SWindowMaskerOptionsFree(), SWindowMaskerOptionsNew(), SWindowMaskerOptions::taxid, TRUE, SDustOptions::window, and SSegOptions::window.
Referenced by BLAST_FillQuerySetUpOptions(), BLAST_MainSetUp(), BOOST_AUTO_TEST_CASE(), s_DoSegSequenceData(), and CBlastOptionsLocal::SetFilterString().
char* BlastFilteringOptionsToString | ( | const SBlastFilterOptions * | filtering_options | ) |
Convert the filtering options structure to a string.
filtering_options | filtering options structure, assumed to be correctly filled in [in] |
Definition at line 321 of file blast_filter.c.
References buffer, calloc(), SRepeatFilterOptions::database, SWindowMaskerOptions::database, SBlastFilterOptions::dustOptions, SSegOptions::hicut, kDustLevel, kDustLinker, kDustWindow, kSegHicut, kSegLocut, kSegWindow, SDustOptions::level, SDustOptions::linker, SSegOptions::locut, NULL, SBlastFilterOptions::repeatFilterOptions, s_SafeStrCat(), SBlastFilterOptionsMaskAtHash(), SBlastFilterOptions::segOptions, strdup, SWindowMaskerOptions::taxid, SDustOptions::window, SSegOptions::window, and SBlastFilterOptions::windowMaskerOptions.
Referenced by BOOST_AUTO_TEST_CASE(), and CBlastOptionsLocal::GetFilterString().
static NCBI_INLINE Boolean BlastIsReverseStrand | ( | Boolean | is_na, |
Int4 | context | ||
) |
Determines whether this is a nucleotide query and whether this is a minus strand or not.
is_na | the query is nucleotide |
context | offset in the QueryInfo array |
Definition at line 325 of file blast_filter.h.
References context.
Referenced by BLAST_ComplementMaskLocations(), BlastSetUp_MaskQuery(), and s_GetFilteringLocationsForOneContext().
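A minimal sketch of the strand test follows. It assumes that nucleotide queries occupy two consecutive contexts each, plus strand first and minus strand second, so that odd context indices are minus strands; that layout is an assumption of this sketch, and `is_reverse_strand` is a hypothetical name rather than the inline function defined in blast_filter.h.

```c
#include <assert.h>

/* Returns nonzero only for a nucleotide query whose context index is
 * odd, which under the assumed layout means the minus strand. */
int is_reverse_strand(int is_na, int context)
{
    assert(context >= 0);
    return is_na && (context % 2) != 0;
}
```

Under this assumption, protein queries never report a reverse strand, whatever the context index.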
Int2 BlastMaskLocDNAToProtein | ( | BlastMaskLoc * | mask_loc, |
const BlastQueryInfo * | query_info | ||
) |
Given a BlastMaskLoc with an array of lists of DNA mask locations, substitutes that array by a new array of per-protein-frame mask location lists.
mask_loc | Mask locations structure. This structure can have either masks for all frames in nucleotide coordinates (e.g., the results of translating protein masks to nucleotide) or a single mask per query (i.e., at location NUM_FRAMES*query_index). In the latter case, this mask will be used for all frames. [in|out] |
query_info | Query information structure, containing contexts data [in] |
Definition at line 806 of file blast_filter.c.
References ASSERT, BLAST_ContextToFrame(), BlastQueryInfoGetQueryLength(), BlastSeqLocFree(), BlastSeqLocNew(), CODON_LENGTH, context, BlastQueryInfo::contexts, eBlastTypeBlastx, BlastQueryInfo::last_context, SSeqRange::left, BlastSeqLoc::next, NULL, NUM_FRAMES, BlastQueryInfo::num_queries, BlastContextInfo::query_length, SSeqRange::right, BlastMaskLoc::seqloc_array, BlastSeqLoc::ssr, and BlastMaskLoc::total_size.
Referenced by BOOST_AUTO_TEST_CASE().
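The coordinate change from a DNA interval to a per-frame protein interval can be sketched as integer division by the codon length. The formulas below are an assumption made for illustration (0-based inclusive coordinates, frames numbered 1 to 3 and -1 to -3) and are not copied from blast_filter.c; `Range`, `SKETCH_CODON_LENGTH`, and `dna_to_protein_range` are hypothetical names.

```c
#include <assert.h>

/* Hypothetical stand-in for SSeqRange: 0-based, inclusive bounds. */
typedef struct { int left; int right; } Range;

#define SKETCH_CODON_LENGTH 3

/* Maps a DNA interval onto protein coordinates for one frame.
 * Minus frames are counted from the 3' end of the DNA sequence. */
Range dna_to_protein_range(Range dna, int frame, int dna_length)
{
    Range prot;
    int shift, prot_length;

    assert(frame != 0 && frame >= -3 && frame <= 3);

    shift = (frame < 0 ? -frame : frame) - 1; /* bases skipped by frame */
    prot_length = (dna_length - shift) / SKETCH_CODON_LENGTH;

    if (frame < 0) {
        prot.left = (dna_length + frame - dna.right) / SKETCH_CODON_LENGTH;
        prot.right = (dna_length + frame - dna.left) / SKETCH_CODON_LENGTH;
    } else {
        prot.left = (dna.left - frame + 1) / SKETCH_CODON_LENGTH;
        prot.right = (dna.right - frame + 1) / SKETCH_CODON_LENGTH;
    }

    /* Clamp to the translated sequence, a defensive choice here. */
    if (prot.left < 0)
        prot.left = 0;
    if (prot.right < 0)
        prot.right = 0;
    if (prot.left >= prot_length)
        prot.left = prot_length - 1;
    if (prot.right >= prot_length)
        prot.right = prot_length - 1;
    return prot;
}
```

For a 30-base sequence, the DNA interval 6 to 14 maps to residues 2 to 4 in frame 1 and to residues 5 to 7 in frame -1 under these formulas.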
BlastMaskLoc* BlastMaskLocDup | ( | const BlastMaskLoc * | mask_loc | ) |
Perform a deep copy of the BlastMaskLoc structure passed to this function.
mask_loc | Source masking location structure [in] |
Definition at line 770 of file blast_filter.c.
References BlastMaskLocNew(), BlastSeqLocListDup(), NULL, BlastMaskLoc::seqloc_array, and BlastMaskLoc::total_size.
BlastMaskLoc* BlastMaskLocFree | ( | BlastMaskLoc * | mask_loc | ) |
Deallocate memory for a BlastMaskLoc structure as well as the BlastSeqLoc's pointed to.
mask_loc | the object to be deleted [in] |
Definition at line 789 of file blast_filter.c.
References BlastSeqLocFree(), NULL, BlastMaskLoc::seqloc_array, sfree, and BlastMaskLoc::total_size.
Referenced by BLAST_MainSetUp(), BlastSequenceBlkFree(), and BOOST_AUTO_TEST_CASE().
BlastMaskLoc* BlastMaskLocNew | ( | Int4 | total | ) |
Allocate memory for a BlastMaskLoc.
total | number of contexts for which SSeqLocs should be allocated (result of number of queries * number of contexts for given program) [in] |
Definition at line 760 of file blast_filter.c.
References calloc(), BlastMaskLoc::seqloc_array, and BlastMaskLoc::total_size.
Referenced by BlastMaskLocDup(), BlastSetUp_GetFilteringLocations(), BOOST_AUTO_TEST_CASE(), SetupQueries_OMF(), and x_TestGetSeqLocInfoVector().
Int2 BlastMaskLocProteinToDNA | ( | BlastMaskLoc * | mask_loc, |
const BlastQueryInfo * | query_info | ||
) |
Given a BlastMaskLoc with an array of lists of mask locations per protein frame, recalculates all mask offsets in terms of the DNA sequence.
mask_loc | Mask locations structure [in|out] |
query_info | Query information structure, containing contexts data [in] |
Definition at line 892 of file blast_filter.c.
References ASSERT, BLAST_ContextToFrame(), BlastQueryInfoGetQueryLength(), CODON_LENGTH, eBlastTypeBlastx, BlastQueryInfo::last_context, SSeqRange::left, BlastSeqLoc::next, NUM_FRAMES, BlastQueryInfo::num_queries, SSeqRange::right, BlastMaskLoc::seqloc_array, BlastSeqLoc::ssr, and BlastMaskLoc::total_size.
Referenced by BLAST_MainSetUp(), and BOOST_AUTO_TEST_CASE().
BlastSeqLoc* BlastSeqLocAppend | ( | BlastSeqLoc ** | head, |
BlastSeqLoc * | node | ||
) |
Appends the BlastSeqLoc to the list of BlastSeqLoc-s pointed to by head.
head | Pointer to the head of the linked list of BlastSeqLoc-s [in] |
node | Pointer to the node to be added to the list. If this is NULL, this function does nothing. [in] |
Definition at line 621 of file blast_filter.c.
References head, NULL, and tmp.
Referenced by BlastSeqLocListDup(), BlastSeqLocNew(), and s_GetFilteringLocationsForOneContext().
void BlastSeqLocCombine | ( | BlastSeqLoc ** | mask_loc, |
Int4 | link_value | ||
) |
Go through all mask locations in one sequence and combine any that overlap, deallocating the unneeded locations.
mask_loc | The list of masks to be merged (in place) [in|out] |
link_value | Largest gap size between locations for which they should be linked together [in] |
Definition at line 972 of file blast_filter.c.
References ASSERT, BlastSeqLocNodeFree(), i, SSeqRange::left, MAX, BlastSeqLoc::next, NULL, SSeqRange::right, s_BlastSeqLocListToArrayOfPointers(), s_SeqRangeSortByStartPosition(), sfree, and BlastSeqLoc::ssr.
Referenced by BOOST_AUTO_TEST_CASE(), s_FillMaskLocFromBlastResults(), and s_GetFilteringLocationsForOneContext().
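The merge controlled by link_value can be sketched on a plain array of intervals: sort by start position, then fold each interval into the previous one when the two overlap or lie close enough together. This is a sketch under assumptions, not the library routine: `Range`, `by_left`, and `combine_ranges` are hypothetical names, and the exact boundary comparison (`right + link_value > left`) is assumed here and may differ from the one used in blast_filter.c.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for SSeqRange: 0-based, inclusive bounds. */
typedef struct { int left; int right; } Range;

/* qsort comparator: order ranges by start position. */
static int by_left(const void *a, const void *b)
{
    const Range *x = a;
    const Range *y = b;
    return (x->left > y->left) - (x->left < y->left);
}

/* Sorts `r` and merges ranges in place. Returns the new count. */
size_t combine_ranges(Range *r, size_t n, int link_value)
{
    size_t tail = 0; /* index of the last merged range */

    assert(r != NULL || n == 0);
    if (n == 0)
        return 0;

    qsort(r, n, sizeof *r, by_left);
    for (size_t i = 1; i < n; ++i) {
        if (r[tail].right + link_value > r[i].left) {
            /* Close enough: extend the current merged range. */
            if (r[i].right > r[tail].right)
                r[tail].right = r[i].right;
        } else {
            /* Too far apart: start a new merged range. */
            r[++tail] = r[i];
        }
    }
    return tail + 1;
}
```

With a link value of 5, the ranges 10 to 20 and 22 to 30 are joined into 10 to 30, while 0 to 5 and 100 to 110 stay separate.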
BlastSeqLoc* BlastSeqLocFree | ( | BlastSeqLoc * | loc | ) |
Deallocate all BlastSeqLoc objects in a chain.
loc | object to be freed [in] |
Definition at line 737 of file blast_filter.c.
References BlastSeqLocNodeFree(), BlastSeqLoc::next, and NULL.
Referenced by AascanTestFixture::AascanTestFixture(), BlastKmerGetKmerSet(), BlastMaskLocDNAToProtein(), BlastMaskLocFree(), BlastMBLookupTableDestruct(), BlastNaHashLookupTableDestruct(), BlastNaLookupTableDestruct(), BlastSmallNaLookupTableDestruct(), BOOST_AUTO_TEST_CASE(), CompressedAascanTestFixture::CompressedAascanTestFixture(), CSetupFactory::CreateScoreBlock(), CScore_SegPct::Get(), CSegMasker::operator()(), s_DoSegSequenceData(), s_FillMaskLocFromBlastResults(), s_GetInitialWordParameters(), s_SegsToBlastSeqLoc(), LinkHspTestFixture::setupCutoffScores(), TestFixture::TearDownQuery(), CPhiblastTestFixture::x_FindQueryOccurrences(), CMakeProfileDBApp::x_RPSUpdateLookup(), AalookupTestFixture::~AalookupTestFixture(), CBlastQueryFilteredFrames::~CBlastQueryFilteredFrames(), CBlastSeqLocWrap::~CBlastSeqLocWrap(), CompressedAalookupTestFixture::~CompressedAalookupTestFixture(), CTracebackTestFixture::~CTracebackTestFixture(), and NtlookupTestFixture::~NtlookupTestFixture().
BlastSeqLoc* BlastSeqLocListDup | ( | BlastSeqLoc * | head | ) |
Make a deep copy of the linked list of BlastSeqLoc-s pointed to by its argument.
head | head of the linked list [in] |
Definition at line 747 of file blast_filter.c.
References BlastSeqLocAppend(), head, NULL, and s_BlastSeqLocNodeDup().
Referenced by BlastMaskLocDup().
BlastSeqLoc* BlastSeqLocNew | ( | BlastSeqLoc ** | head, |
Int4 | from, | ||
Int4 | to | ||
) |
Create and initialize a new sequence interval.
head | Existing BlastSeqLoc list to append onto. If *head is NULL, it is set to the new BlastSeqLoc. head itself may be NULL [in|out] |
from | Start of the interval [in] |
to | End of the interval [in] |
Definition at line 608 of file blast_filter.c.
References BlastSeqLocAppend(), calloc(), head, SSeqRange::left, NULL, SSeqRange::right, and BlastSeqLoc::ssr.
Referenced by AascanTestFixture::AascanTestFixture(), CBlastQueryFilteredFrames::AddSeqLoc(), BLAST_ComplementMaskLocations(), BlastMaskLocDNAToProtein(), BOOST_AUTO_TEST_CASE(), CompressedAascanTestFixture::CompressedAascanTestFixture(), CSeqLoc2BlastSeqLoc(), NtlookupTestFixture::debruijnInit(), CompressedAalookupTestFixture::GetSeqBlk(), AalookupTestFixture::GetSeqBlk(), s_BlastSeqLocNodeDup(), s_GetInitialWordParameters(), s_RangeVector2BlastSeqLoc(), s_SeqAlignToBlastSeqLoc(), s_SeqLocListInvert(), LinkHspTestFixture::setupCutoffScores(), NtlookupTestFixture::SetUpQuery(), TestFixture::SetUpQuery(), CPhiblastTestFixture::x_FindQueryOccurrences(), CMakeProfileDBApp::x_RPSUpdateLookup(), and x_TestGetSeqLocInfoVector().
BlastSeqLoc* BlastSeqLocNodeFree | ( | BlastSeqLoc * | node | ) |
Deallocate a single BlastSeqLoc structure and its contents, without following its next pointer.
node | structure to deallocate [in] |
Definition at line 727 of file blast_filter.c.
References NULL, sfree, and BlastSeqLoc::ssr.
Referenced by BlastSeqLoc_RestrictToInterval(), BlastSeqLocCombine(), and BlastSeqLocFree().
void BlastSeqLocReverse | ( | BlastSeqLoc * | masks, |
Int4 | query_length | ||
) |
Converts reverse strand coordinates to forward strand in place.
masks | BlastSeqLoc to be reversed [in|out] |
query_length | length of query [in] |
Definition at line 1173 of file blast_filter.c.
References SSeqRange::left, BlastSeqLoc::next, SSeqRange::right, and BlastSeqLoc::ssr.
Referenced by s_GetFilteringLocationsForOneContext().
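The strand conversion amounts to mirroring each interval around the middle of the query. The sketch below assumes 0-based inclusive coordinates and works on a plain array; `Range` and `reverse_ranges` are hypothetical stand-ins for the linked-list types used by the library.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for SSeqRange: 0-based, inclusive bounds. */
typedef struct { int left; int right; } Range;

/* Mirrors every range in place within a query of `query_length`. */
void reverse_ranges(Range *r, size_t n, int query_length)
{
    assert(r != NULL || n == 0);

    for (size_t i = 0; i < n; ++i) {
        /* Compute both ends from the original values before writing,
         * otherwise the second assignment would use a stale bound. */
        int new_left = query_length - 1 - r[i].right;
        int new_right = query_length - 1 - r[i].left;
        r[i].left = new_left;
        r[i].right = new_right;
    }
}
```

Applying the function twice with the same query_length restores the original coordinates.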
Int2 BlastSetUp_Filter | ( | EBlastProgramType | program_number, |
Uint1 * | sequence, | ||
Int4 | length, | ||
Int4 | offset, | ||
const SBlastFilterOptions * | filter_options, | ||
BlastSeqLoc ** | seqloc_retval, | ||
Blast_Message ** | blast_message | ||
) |
Runs seg filtering functions, according to the filtering options, returns BlastSeqLoc*.
Should combine all SeqLocs so they are non-redundant.
program_number | Type of BLAST program [in] |
sequence | The sequence or part of the sequence to be filtered [in] |
length | Length of the (sub)sequence [in] |
offset | Offset into the full sequence [in] |
filter_options | specifies how filtering is to be done [in] |
seqloc_retval | Resulting locations for filtered region. [out] |
blast_message | error messages on error [out] |
Definition at line 1122 of file blast_filter.c.
References ASSERT, FilterQueriesForMapping(), SSegOptions::hicut, SegParameters::hicut, SSegOptions::locut, SegParameters::locut, NULL, offset, SegParameters::overlaps, SBlastFilterOptions::readQualityOptions, SBlastFilterOptionsValidate(), SBlastFilterOptions::segOptions, SegParametersFree(), SegParametersNewAa(), SeqBufferSeg(), TRUE, SSegOptions::window, and SegParameters::window.
Referenced by BOOST_AUTO_TEST_CASE(), s_DoSegSequenceData(), and s_GetFilteringLocationsForOneContext().
Int2 BlastSetUp_GetFilteringLocations | ( | BLAST_SequenceBlk * | query_blk, |
const BlastQueryInfo * | query_info, | ||
EBlastProgramType | program_number, | ||
const SBlastFilterOptions * | filter_options, | ||
BlastMaskLoc ** | filter_out, | ||
Blast_Message ** | blast_message | ||
) |
Does preparation for filtering and then calls BlastSetUp_Filter.
query_blk | sequence to be filtered [in] |
query_info | info on sequence to be filtered [in] |
program_number | one of blastn,blastp,blastx,etc. [in] |
filter_options | specifies how filtering is to be done [in] |
filter_out | resulting locations for filtered region. [out] |
blast_message | message that needs to be sent back to user. |
Definition at line 1262 of file blast_filter.c.
References ASSERT, BLAST_GetNumberOfContexts(), Blast_MessageWrite(), BlastMaskLocNew(), context, eBlastSevError, BlastQueryInfo::first_context, BlastQueryInfo::last_context, NULL, BlastQueryInfo::num_queries, and s_GetFilteringLocationsForOneContext().
Referenced by BLAST_MainSetUp(), and BOOST_AUTO_TEST_CASE().
void BlastSetUp_MaskQuery | ( | BLAST_SequenceBlk * | query_blk, |
const BlastQueryInfo * | query_info, | ||
const BlastMaskLoc * | filter_maskloc, | ||
EBlastProgramType | program_number | ||
) |
Masks the sequence given a BlastMaskLoc.
query_blk | sequence to be filtered [in] |
query_info | info on sequence to be filtered [in] |
filter_maskloc | Locations to filter [in] |
program_number | one of blastn,blastp,blastx,etc. [in] |
Definition at line 1350 of file blast_filter.c.
References ASSERT, Blast_MaskTheResidues(), BlastIsReverseStrand(), BlastMemDup(), buffer, context, BlastQueryInfo::contexts, eBlastTypeBlastn, eBlastTypeMapping, FALSE, BlastQueryInfo::first_context, BlastContextInfo::is_valid, BlastQueryInfo::last_context, BLAST_SequenceBlk::nomask_allocated, NULL, BlastContextInfo::query_length, BlastContextInfo::query_offset, BlastMaskLoc::seqloc_array, BLAST_SequenceBlk::sequence, BLAST_SequenceBlk::sequence_nomask, BLAST_SequenceBlk::sequence_start, BLAST_SequenceBlk::sequence_start_nomask, BlastMaskLoc::total_size, and TRUE.
Referenced by BLAST_MainSetUp(), and BOOST_AUTO_TEST_CASE().
const Uint1 kNuclMask |
BLASTNA element used to mask bases in BLAST.
Definition at line 38 of file blast_filter.c.
Referenced by Blast_MaskTheResidues(), and x_AreAllBasesMasked().
const Uint1 kProtMask |
NCBISTDAA element used to mask residues in BLAST.
Definition at line 39 of file blast_filter.c.
Referenced by Blast_MaskTheResidues(), Blast_MaskUnsupportedAA(), and CPsiBlastInputClustalW::x_ValidateQueryInMsa().