NCBI C++ ToolKit
CSeqMasker Class Reference
Main interface to window-based masker functionality.
#include <algo/winmask/seq_masker.hpp>
Classes | |
class | CSeqMaskerException |
Represents different error situations that can occur in the masking process.
struct | mitem |
Public Types | |
typedef pair< TSeqPos, TSeqPos > | TMaskedInterval |
Type representing a masked interval within a sequence.
typedef vector< TMaskedInterval > | TMaskList |
A type representing the total of masking information about a sequence.
Public Member Functions | |
CSeqMasker (const string &lstat_name, Uint1 arg_window_size, Uint4 arg_window_step, Uint1 arg_unit_step, Uint4 arg_textend, Uint4 arg_cutoff_score, Uint4 arg_max_score, Uint4 arg_min_score, Uint4 arg_set_max_score, Uint4 arg_set_min_score, bool arg_merge_pass, Uint4 arg_merge_cutoff_score, Uint4 arg_abs_merge_cutoff_dist, Uint4 arg_mean_merge_cutoff_dist, Uint1 arg_merge_unit_step, const string &arg_trigger, Uint1 tmin_count, bool arg_discontig, Uint4 arg_pattern, bool arg_use_ba, double min_pct=-1.0, double extend_pct=-1.0, double thres_pct=-1.0, double max_pct=-1.0) | |
Object constructor.
~CSeqMasker () | |
Object destructor.
TMaskList * | operator() (const objects::CSeqVector &data) const |
Sequence masking operator.
Static Public Member Functions | |
static void | MergeMaskInfo (TMaskList *dest, const TMaskList *src) |
Merge together two result lists.
Static Public Attributes | |
static CSeqMaskerVersion | AlgoVersion |
Version of window masking algorithm.
Private Types | |
enum | { eTrigger_Mean = 0 , eTrigger_Min } |
typedef list< mitem > | TMList |
Private Member Functions | |
TMaskList * | DoMask (const objects::CSeqVector &data, TSeqPos start, TSeqPos end) const |
double | MergeAvg (TMList::iterator mi, const TMList::iterator &umi, Uint4 unit_size) const |
void | Merge (TMList &m, TMList::iterator mi, TMList &um, TMList::iterator &umi) const |
Private Attributes | |
CRef< CSeqMaskerIstat > | ustat |
CSeqMaskerScore * | score |
CSeqMaskerScore * | score_p3 |
CSeqMaskerScore * | trigger_score |
Uint1 | window_size |
Uint4 | window_step |
Uint1 | unit_step |
bool | merge_pass |
Uint4 | merge_cutoff_score |
Uint4 | abs_merge_cutoff_dist |
Uint4 | mean_merge_cutoff_dist |
Uint1 | merge_unit_step |
enum CSeqMasker:: { ... } | trigger |
bool | discontig |
Uint4 | pattern |
Friends | |
struct | CSeqMasker::mitem |
Main interface to window-based masker functionality.
Definition at line 52 of file seq_masker.hpp.
typedef pair< TSeqPos, TSeqPos > CSeqMasker::TMaskedInterval |
Type representing a masked interval within a sequence.
If A is an object of type TMaskedInterval, then A.first is the offset (starting from 0) of the beginning of the interval; A.second is the offset of the end of the interval.
Definition at line 67 of file seq_masker.hpp.
typedef vector< TMaskedInterval > CSeqMasker::TMaskList |
A type representing the total of masking information about a sequence.
Definition at line 74 of file seq_masker.hpp.
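As an illustration of how these two types fit together, here is a minimal standalone sketch. It does not use the toolkit headers: `TSeqPos` is assumed to be an unsigned integer, the aliases below are local stand-ins for the class typedefs, and `MaskedLength` is a hypothetical helper. The page does not say whether the end offset is inclusive, so the sketch assumes closed intervals.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Local stand-ins for the toolkit types (assumption: TSeqPos is unsigned).
using TSeqPos = unsigned int;
using TMaskedInterval = std::pair<TSeqPos, TSeqPos>;
using TMaskList = std::vector<TMaskedInterval>;

// Hypothetical helper: total number of masked positions.
// ASSUMES closed intervals (both .first and .second are masked)
// and that the intervals in the list do not overlap.
std::size_t MaskedLength(const TMaskList& masks)
{
    std::size_t total = 0;
    for (const TMaskedInterval& iv : masks) {
        assert(iv.first <= iv.second);  // start must not exceed end
        total += static_cast<std::size_t>(iv.second - iv.first) + 1;
    }
    return total;
}
```

Under the closed-interval assumption, a list holding the intervals (0, 9) and (20, 24) covers 15 positions.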
typedef list< mitem > CSeqMasker::TMList | private |
Definition at line 234 of file seq_masker.hpp.
anonymous enum | private |
Enumerator | |
---|---|
eTrigger_Mean | Using the mean of the unit scores in the window. |
eTrigger_Min | Using the minimum score of k units in the window. |
Definition at line 354 of file seq_masker.hpp.
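The two trigger modes can be pictured with a small standalone sketch. This is not the toolkit's code: `TriggersByMean` and `TriggersByMin` are hypothetical helpers, and the use of `>=` rather than `>` for the threshold comparison is an assumption. The "min" variant follows the tmin_count description: masking triggers when at least k units in the window reach the threshold.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <numeric>
#include <vector>

// Hypothetical helper: true if the mean unit score in the window
// reaches the threshold. Compares sum >= threshold * n to avoid
// floating-point division.
bool TriggersByMean(const std::vector<std::uint32_t>& unit_scores,
                    std::uint32_t threshold)
{
    assert(!unit_scores.empty());
    const std::uint64_t sum = std::accumulate(
        unit_scores.begin(), unit_scores.end(), std::uint64_t{0});
    return sum >= static_cast<std::uint64_t>(threshold) * unit_scores.size();
}

// Hypothetical helper: true if at least k unit scores in the window
// reach the threshold (k plays the role of tmin_count).
bool TriggersByMin(const std::vector<std::uint32_t>& unit_scores,
                   std::size_t k, std::uint32_t threshold)
{
    std::size_t count = 0;
    for (std::uint32_t s : unit_scores) {
        if (s >= threshold) {
            ++count;
        }
    }
    return count >= k;
}
```

The "min" mode reacts to a few high-scoring units even when the window average is low, which is the practical difference between the two enumerators.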
CSeqMasker::CSeqMasker (
    const string & lstat_name,
    Uint1 arg_window_size,
    Uint4 arg_window_step,
    Uint1 arg_unit_step,
    Uint4 arg_textend,
    Uint4 arg_cutoff_score,
    Uint4 arg_max_score,
    Uint4 arg_min_score,
    Uint4 arg_set_max_score,
    Uint4 arg_set_min_score,
    bool arg_merge_pass,
    Uint4 arg_merge_cutoff_score,
    Uint4 arg_abs_merge_cutoff_dist,
    Uint4 arg_mean_merge_cutoff_dist,
    Uint1 arg_merge_unit_step,
    const string & arg_trigger,
    Uint1 tmin_count,
    bool arg_discontig,
    Uint4 arg_pattern,
    bool arg_use_ba,
    double min_pct = -1.0,
    double extend_pct = -1.0,
    double thres_pct = -1.0,
    double max_pct = -1.0
)
Object constructor.
Parameters to the constructor determine the behaviour of the window-based masking procedure.
lstat_name | the name of the file containing length statistics |
arg_window_size | the window size in base pairs |
arg_window_step | the window step |
arg_unit_step | the unit step |
arg_textend | the score above which masking, once started, is allowed to continue |
arg_cutoff_score | the unit score triggering the masking |
arg_max_score | maximum allowed unit score |
arg_min_score | minimum allowed unit score |
arg_set_max_score | score to use for units exceeding arg_max_score |
arg_set_min_score | score to use for units below arg_min_score |
arg_merge_pass | whether or not to perform an interval merging pass |
arg_merge_cutoff_score | combined average score at which intervals should be merged |
arg_abs_merge_cutoff_dist | maximum distance between intervals at which they can be merged unconditionally |
arg_mean_merge_cutoff_dist | maximum distance between intervals at which they can be merged if they satisfy the arg_merge_cutoff_score threshold |
arg_merge_unit_step | unit step to use for interval merging |
arg_trigger | determines which method to use to trigger masking |
tmin_count | if arg_trigger is "min", the number of units in a window that must score above the threshold in order to trigger masking |
arg_discontig | whether or not to use discontiguous units |
arg_pattern | base pattern to form discontiguous units |
arg_use_ba | use bit array optimization, if available |
Definition at line 71 of file seq_masker.cpp.
References eTrigger_Min, int, NCBI_THROW, score, score_p3, trigger, trigger_score, CSeqMaskerIstat::UnitSize(), ustat, and window_size.
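The four score-limit parameters (arg_max_score, arg_min_score, arg_set_max_score, arg_set_min_score) read as a substitution rule for out-of-range unit scores. The following standalone sketch shows one plausible reading. `ScoreLimits` and `AdjustUnitScore` are hypothetical names, not part of the toolkit, and the treatment of scores exactly equal to a limit (left unchanged here) is an assumption.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical container for the four score-limit parameters.
struct ScoreLimits {
    std::uint32_t max_score;      // arg_max_score
    std::uint32_t min_score;      // arg_min_score
    std::uint32_t set_max_score;  // arg_set_max_score
    std::uint32_t set_min_score;  // arg_set_min_score
};

// Hypothetical helper: replace out-of-range unit scores with the
// configured substitutes; in-range scores pass through unchanged.
std::uint32_t AdjustUnitScore(std::uint32_t raw, const ScoreLimits& lim)
{
    assert(lim.min_score <= lim.max_score);
    if (raw > lim.max_score) {
        return lim.set_max_score;
    }
    if (raw < lim.min_score) {
        return lim.set_min_score;
    }
    return raw;
}
```

Note that the substitute values need not equal the limits themselves; for example, units below the minimum can be mapped to 0 so that they contribute nothing to a window score.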
CSeqMasker::~CSeqMasker ()
Object destructor.
Definition at line 155 of file seq_masker.cpp.
References score, score_p3, and trigger_score.
TMaskList * CSeqMasker::DoMask (const objects::CSeqVector &data, TSeqPos start, TSeqPos end) const | private |
Definition at line 170 of file seq_masker.cpp.
References _TRACE, abs_merge_cutoff_dist, CSeqMaskerUtil::BitCount(), CSeqMaskerIstat::optimization_data::cba_, CSeqMaskerCacheBoost::Check(), count, data, discontig, CSeqMaskerWindow::End(), eTrigger_Min, CSeqMaskerIstat::get_optimization_data(), CSeqMaskerIstat::get_textend(), CSeqMaskerIstat::get_threshold(), i, l(), mask, mean_merge_cutoff_dist, Merge(), merge_cutoff_score, merge_pass, MergeAvg(), pattern, CSeqMaskerScore::PostAdvance(), score, CSeqMaskerScore::SetWindow(), CSeqMaskerWindow::Start(), tmp, CSeqMaskerIstat::total_, trigger, trigger_score, unit_step, CSeqMaskerIstat::UnitSize(), ustat, window_size, and window_step.
Referenced by operator()().
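double CSeqMasker::MergeAvg (TMList::iterator mi, const TMList::iterator &umi, Uint4 unit_size) const | private |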
Definition at line 412 of file seq_masker.cpp.
References merge_unit_step, N, and tmp.
Referenced by DoMask().
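The merge-related constructor parameters (arg_merge_cutoff_score, arg_abs_merge_cutoff_dist, arg_mean_merge_cutoff_dist) suggest a two-tier decision for the interval merging pass. The sketch below is only an interpretation of those parameter descriptions, not the logic in DoMask(). `MergeParams` and `ShouldMerge` are hypothetical names, and how the distance and the combined average score are actually computed is not stated on this page.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical container for the merge-related parameters.
struct MergeParams {
    std::uint32_t merge_cutoff_score;      // arg_merge_cutoff_score
    std::uint32_t abs_merge_cutoff_dist;   // arg_abs_merge_cutoff_dist
    std::uint32_t mean_merge_cutoff_dist;  // arg_mean_merge_cutoff_dist
};

// Hypothetical helper: decide whether two neighbouring masked intervals
// separated by `gap` positions should be merged.
bool ShouldMerge(std::uint32_t gap, double combined_avg_score,
                 const MergeParams& p)
{
    assert(combined_avg_score >= 0.0);  // unit scores are unsigned
    if (gap <= p.abs_merge_cutoff_dist) {
        return true;  // close enough to merge unconditionally
    }
    if (gap <= p.mean_merge_cutoff_dist) {
        // farther apart: merge only if the combined average is high enough
        return combined_avg_score >= p.merge_cutoff_score;
    }
    return false;  // too far apart to merge
}
```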
void CSeqMasker::MergeMaskInfo (TMaskList *dest, const TMaskList *src) | static |
Merge together two result lists.
Used to merge result lists obtained from the winmask and dust algorithms.
dest | this list will contain the merged data |
src | the other results list |
Definition at line 508 of file seq_masker.cpp.
References si.
Referenced by CWinMaskDemoApplication::Run(), and CWinMaskApplication::Run().
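To make the idea of merging two result lists concrete, here is a standalone sketch that combines two interval lists and coalesces overlapping intervals. It is not the toolkit implementation of MergeMaskInfo: `MergeIntervalLists` is a hypothetical stand-in, the type aliases are local, and the sketch sorts the combined list instead of relying on any ordering of the inputs.

```cpp
#include <algorithm>
#include <cassert>
#include <utility>
#include <vector>

// Local stand-ins for the toolkit types (assumption: TSeqPos is unsigned).
using TSeqPos = unsigned int;
using TMaskedInterval = std::pair<TSeqPos, TSeqPos>;
using TMaskList = std::vector<TMaskedInterval>;

// Hypothetical stand-in: after the call, *dest holds the union of both
// lists, sorted by start offset, with overlapping intervals coalesced.
void MergeIntervalLists(TMaskList* dest, const TMaskList* src)
{
    assert(dest != nullptr);
    TMaskList all(*dest);
    if (src != nullptr) {
        all.insert(all.end(), src->begin(), src->end());
    }
    std::sort(all.begin(), all.end());

    TMaskList merged;
    for (const TMaskedInterval& iv : all) {
        if (!merged.empty() && iv.first <= merged.back().second) {
            // overlaps the previous interval: extend it if needed
            merged.back().second = std::max(merged.back().second, iv.second);
        } else {
            merged.push_back(iv);
        }
    }
    dest->swap(merged);
}
```

As in the documented signature, the destination list is modified in place and the source list is left untouched.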
CSeqMasker::TMaskList * CSeqMasker::operator() (const objects::CSeqVector &data) const
Sequence masking operator.
CSeqMasker objects are function objects; the main processing is done by operator().
data | the original sequence data in iupacna format |
Definition at line 165 of file seq_masker.cpp.
friend struct CSeqMasker::mitem | friend |
Definition at line 229 of file seq_masker.hpp.
Uint4 CSeqMasker::abs_merge_cutoff_dist | private |
Definition at line 333 of file seq_masker.hpp.
Referenced by DoMask().
CSeqMaskerVersion CSeqMasker::AlgoVersion | static |
Version of window masking algorithm.
Definition at line 57 of file seq_masker.hpp.
Referenced by CWinMaskApplication::CWinMaskApplication().
bool CSeqMasker::discontig | private |
Definition at line 363 of file seq_masker.hpp.
Referenced by DoMask(), and CSeqMasker::mitem::mitem().
Uint4 CSeqMasker::mean_merge_cutoff_dist | private |
Definition at line 339 of file seq_masker.hpp.
Referenced by DoMask().
Uint4 CSeqMasker::merge_cutoff_score | private |
Definition at line 327 of file seq_masker.hpp.
Referenced by DoMask().
bool CSeqMasker::merge_pass | private |
Definition at line 321 of file seq_masker.hpp.
Referenced by DoMask().
Uint1 CSeqMasker::merge_unit_step | private |
Definition at line 349 of file seq_masker.hpp.
Referenced by MergeAvg(), and CSeqMasker::mitem::mitem().
Uint4 CSeqMasker::pattern | private |
Definition at line 368 of file seq_masker.hpp.
Referenced by DoMask(), and CSeqMasker::mitem::mitem().
CSeqMaskerScore * CSeqMasker::score | private |
Definition at line 285 of file seq_masker.hpp.
Referenced by CSeqMasker(), DoMask(), CSeqMasker::mitem::mitem(), and ~CSeqMasker().
CSeqMaskerScore * CSeqMasker::score_p3 | private |
Definition at line 290 of file seq_masker.hpp.
Referenced by CSeqMasker(), CSeqMasker::mitem::mitem(), and ~CSeqMasker().
enum { ... } CSeqMasker::trigger |
Referenced by CSeqMasker(), and DoMask().
CSeqMaskerScore * CSeqMasker::trigger_score | private |
Definition at line 295 of file seq_masker.hpp.
Referenced by CSeqMasker(), DoMask(), and ~CSeqMasker().
Uint1 CSeqMasker::unit_step | private |
Definition at line 316 of file seq_masker.hpp.
Referenced by DoMask().
CRef< CSeqMaskerIstat > CSeqMasker::ustat | private |
Definition at line 280 of file seq_masker.hpp.
Referenced by CSeqMasker(), DoMask(), and CSeqMasker::mitem::mitem().
Uint1 CSeqMasker::window_size | private |
Definition at line 300 of file seq_masker.hpp.
Referenced by CSeqMasker(), DoMask(), and CSeqMasker::mitem::mitem().
Uint4 CSeqMasker::window_step | private |
Definition at line 308 of file seq_masker.hpp.
Referenced by DoMask().