NCBI C++ ToolKit
CAgpToSeqEntry Class Reference
This class is used to turn an AGP file into a vector of Seq-entries.
#include <objtools/readers/agp_seq_entry.hpp>
Public Types | |
enum | EFlags { fSetSeqGap = (1 << 0) , fForceLocalId = (1 << 1) } |
typedef vector< CRef< objects::CSeq_entry > > | TSeqEntryRefVec |
This is how the results are returned. Each Seq-entry contains just one Bioseq, built from the AGP file(s). | |
typedef int | TFlags |
Public Types inherited from CAgpReader | |
enum | EFinalize { eFinalize_No , eFinalize_Yes } |
Whether the function should call Finalize() once it has completed successfully. | |
Public Member Functions | |
CAgpToSeqEntry (TFlags fFlags=0, EAgpVersion agp_version=eAgpVersion_auto, CAgpErr *arg=nullptr) | |
After construction, you probably want to call ReadStream and then GetResult. | |
TSeqEntryRefVec & | GetResult (void) |
Returns the results found so far; do not call it before finalizing. | |
Public Member Functions inherited from CAgpReader | |
CAgpReader (CAgpErr *arg, EAgpVersion agp_version=eAgpVersion_auto) | |
CAgpReader (EAgpVersion agp_version=eAgpVersion_auto) | |
virtual | ~CAgpReader () |
virtual int | ReadStream (CNcbiIstream &is, EFinalize eFinalize=eFinalize_Yes) |
Read an AGP file from the given input stream. | |
int | ReadStream (CNcbiIstream &is, bool bFinalize) |
Deprecated backward-compatibility wrapper. | |
virtual string | GetErrorMessage (const string &filename=NcbiEmptyString) |
Return a string with the one or two source lines (depending on the error) on which the error occurred, along with the error message(s). | |
bool | ProcessThisRow () |
Invoked from ReadStream() after the row has been parsed; users seldom need to call it directly. | |
virtual void | SetVersion (EAgpVersion ver) |
Change which AGP version to use for the next input that is read. | |
CAgpErr * | GetErrorHandler () |
void | SetErrorHandler (CAgpErr *arg) |
EAgpVersion | GetVersion () |
Static Public Member Functions | |
static CRef< objects::CSeq_id > | s_DefaultSeqIdFromStr (const std::string &str) |
This is the default method used to turn strings into Seq-ids in AGP contexts. | |
static CRef< objects::CSeq_id > | s_LocalSeqIdFromStr (const std::string &str) |
Turn a string into a local Seq-id (removing "lcl|" from the beginning if needed). | |
Protected Member Functions | |
virtual void | OnGapOrComponent (void) |
Builds a new part of the delta-seq in the current bioseq, or adds the bioseq and starts building a new one. | |
virtual int | Finalize (void) |
Runs the parent's Finalize(), then makes sure the last m_bioseq is added. | |
void | x_FinishedBioseq (void) |
Our own finalization after the parent's finalization. | |
virtual CRef< objects::CSeq_id > | x_GetSeqIdFromStr (const std::string &str) |
If you must change exactly how strings are turned into Seq-ids, you can override this in a subclass. | |
void | x_SetSeqGap (objects::CSeq_gap &out_gap_info) |
Fills in out_gap_info based on the current CAgpRow. | |
Protected Member Functions inherited from CAgpReader | |
virtual void | OnScaffoldEnd () |
virtual void | OnObjectChange () |
virtual bool | OnError () |
virtual void | OnComment () |
Protected Attributes | |
const TFlags | m_fFlags |
CRef< objects::CBioseq > | m_bioseq |
This is the bioseq currently being built. | |
vector< CRef< objects::CSeq_entry > > | m_entries |
Holds the results. | |
Protected Attributes inherited from CAgpReader | |
EAgpVersion | m_agp_version |
bool | m_at_beg |
bool | m_at_end |
bool | m_line_skipped |
bool | m_prev_line_skipped |
bool | m_new_obj |
bool | m_content_line_seen |
int | m_error_code |
CRef< CAgpRow > | m_prev_row |
CRef< CAgpRow > | m_this_row |
int | m_line_num |
int | m_prev_line_num |
string | m_line |
Private Member Functions | |
CAgpToSeqEntry (const CAgpToSeqEntry &) | |
CAgpToSeqEntry & | operator= (const CAgpToSeqEntry &) |
This class is used to turn an AGP file into a vector of Seq-entries.
Definition at line 50 of file agp_seq_entry.hpp.
typedef int CAgpToSeqEntry::TFlags |
Definition at line 65 of file agp_seq_entry.hpp.
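Because TFlags is a plain int, the EFlags values listed above can be combined with bitwise OR and tested with bitwise AND. The sketch below mirrors the two enumerators from this page in a standalone file so that it compiles without the Toolkit; the helpers `HasFlag` and `BothFlags` are hypothetical names introduced only for illustration and are not part of the class.

```cpp
#include <cassert>

// Standalone mirror of the EFlags values documented on this page.
enum EFlags { fSetSeqGap = (1 << 0), fForceLocalId = (1 << 1) };
typedef int TFlags;  // same underlying representation as CAgpToSeqEntry::TFlags

// Hypothetical helper: true if `flag` is set in the bitmask `flags`.
inline bool HasFlag(TFlags flags, EFlags flag) {
    return (flags & flag) != 0;
}

// Hypothetical helper: a bitmask with both documented flags set.
inline TFlags BothFlags() {
    return fSetSeqGap | fForceLocalId;
}
```

With the real class, a value built the same way would presumably be passed as the first constructor argument (fFlags).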
typedef vector< CRef<objects::CSeq_entry> > CAgpToSeqEntry::TSeqEntryRefVec |
This is how the results are returned. Each Seq-entry contains just one Bioseq, built from the AGP file(s).
Definition at line 55 of file agp_seq_entry.hpp.
enum CAgpToSeqEntry::EFlags |
Definition at line 57 of file agp_seq_entry.hpp.
CAgpToSeqEntry::CAgpToSeqEntry (TFlags fFlags=0, EAgpVersion agp_version=eAgpVersion_auto, CAgpErr *arg=nullptr) |
After construction, you probably want to call ReadStream and then GetResult.
agp_version | The AGP version of the input. The default is to auto-detect the AGP version, which is likely what the user wants most of the time. |
Definition at line 57 of file agp_seq_entry.cpp.
CAgpToSeqEntry::CAgpToSeqEntry (const CAgpToSeqEntry &) |
private |
virtual int CAgpToSeqEntry::Finalize (void) |
protected virtual |
Runs the parent's Finalize(), then makes sure the last m_bioseq is added.
Reimplemented from CAgpReader.
Definition at line 171 of file agp_seq_entry.cpp.
References CAgpReader::Finalize(), and x_FinishedBioseq().
TSeqEntryRefVec & CAgpToSeqEntry::GetResult (void) |
inline |
Returns the results found so far; do not call it before finalizing.
We are intentionally giving a non-const reference because the caller is free to take the seq-entries inside and do whatever they like with them. Each Seq-entry contains just one Bioseq, built from the AGP file(s).
Definition at line 81 of file agp_seq_entry.hpp.
References m_entries.
Referenced by CAgpObjectLoader::Execute(), CFileLoader::x_LoadAGP(), CAgpConverter::x_ReadAgpEntries(), CFormatGuessEx::x_TryAgp(), and CMultiReaderApp::xProcessAgp().
virtual void CAgpToSeqEntry::OnGapOrComponent (void) |
protected virtual |
Builds a new part of the delta-seq in the current bioseq, or adds the bioseq and starts building a new one.
Reimplemented from CAgpReader.
Definition at line 105 of file agp_seq_entry.cpp.
References CAgpRow::component_beg, CAgpRow::component_end, CAgpRow::component_type, CSeq_inst_Base::eMol_dna, eNa_strand_minus, eNa_strand_other, eNa_strand_plus, eNa_strand_unknown, CAgpRow::eOrientationIrrelevant, CAgpRow::eOrientationMinus, CAgpRow::eOrientationPlus, CAgpRow::eOrientationUnknown, CSeq_inst_Base::eRepr_delta, fSetSeqGap, CAgpRow::gap_length, CAgpRow::GetComponentId(), CAgpRow::GetObject(), NStr::IntToString(), CAgpRow::is_gap, m_bioseq, m_fFlags, CAgpReader::m_prev_row, CAgpReader::m_this_row, CAgpRow::orientation, CRef< C, Locker >::Reset(), s_LocalSeqIdFromStr(), CSeq_inst_Base::SetExt(), CSeq_loc::SetInt(), CSeq_inst_Base::SetLength(), CSeq_inst_Base::SetMol(), CSeq_inst_Base::SetRepr(), x_FinishedBioseq(), x_GetSeqIdFromStr(), and x_SetSeqGap().
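The interplay between ReadStream, OnGapOrComponent, Finalize and GetResult described above can be illustrated with a small standalone mock. This is a sketch only, not the Toolkit implementation: `MockEntry`, `MockAgpToEntries` and `ParseAgpText` are hypothetical names, the mock reads only the first tab-separated column of each AGP row (the object name), and it assumes that lines starting with `#` are comments and that a change of object name starts a new result entry.

```cpp
#include <cassert>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical stand-in for a Seq-entry that holds a single Bioseq.
struct MockEntry {
    std::string object_id;  // AGP column 1 (object name)
    int parts;              // number of component and gap rows seen
};

// Hypothetical mock that mimics the call sequence of the documented class.
class MockAgpToEntries {
public:
    MockAgpToEntries() : m_building(false) { m_current.parts = 0; }

    // Reads rows, then finalizes unless told not to. Returns 0 on success.
    int ReadStream(std::istream& is, bool finalize = true) {
        std::string line;
        while (std::getline(is, line)) {
            if (line.empty() || line[0] == '#') continue;  // skip comments
            OnGapOrComponent(line.substr(0, line.find('\t')));
        }
        if (finalize) Finalize();
        return 0;
    }

    // Makes sure the entry still being built is added to the results.
    void Finalize() { FinishedEntry(); }

    // Non-const reference: the caller may take the entries and modify them.
    std::vector<MockEntry>& GetResult() { return m_entries; }

private:
    // Extends the current entry, or closes it and starts a new one.
    void OnGapOrComponent(const std::string& object) {
        if (m_building && m_current.object_id != object) FinishedEntry();
        if (!m_building) {
            m_current.object_id = object;
            m_current.parts = 0;
            m_building = true;
        }
        ++m_current.parts;
    }

    void FinishedEntry() {
        if (m_building) {
            m_entries.push_back(m_current);
            m_building = false;
        }
    }

    bool m_building;
    MockEntry m_current;
    std::vector<MockEntry> m_entries;
};

// Convenience wrapper: parse AGP text held in a string.
inline std::vector<MockEntry> ParseAgpText(const std::string& text) {
    std::istringstream in(text);
    MockAgpToEntries reader;
    reader.ReadStream(in);
    return reader.GetResult();
}
```

The mock also shows why GetResult should not be called before finalizing: until Finalize runs, the last object is still being built and is missing from the result vector.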
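CAgpToSeqEntry & CAgpToSeqEntry::operator= (const CAgpToSeqEntry &) |
private |
CRef< objects::CSeq_id > CAgpToSeqEntry::s_DefaultSeqIdFromStr (const std::string &str) |
static |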
This is the default method used to turn strings into Seq-ids in AGP contexts.
Definition at line 66 of file agp_seq_entry.cpp.
References CRef< C, Locker >::Reset(), s_LocalSeqIdFromStr(), and str().
Referenced by x_GetSeqIdFromStr().
CRef< objects::CSeq_id > CAgpToSeqEntry::s_LocalSeqIdFromStr (const std::string &str) |
static |
Turn a string into a local Seq-id (removing "lcl|" from the beginning if needed).
Definition at line 80 of file agp_seq_entry.cpp.
References NStr::eNocase, NStr::fAllowLeadingSpaces, NStr::fAllowTrailingSpaces, NStr::fConvErr_NoThrow, CTempString::length(), CObject_id_Base::SetId(), CSeq_id_Base::SetLocal(), CObject_id_Base::SetStr(), NStr::StartsWith(), str(), NStr::StringToInt(), and CTempString::substr().
Referenced by OnGapOrComponent(), s_DefaultSeqIdFromStr(), and x_GetSeqIdFromStr().
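The prefix handling described above can be sketched without the Toolkit. The references list (NStr::StartsWith with NStr::eNocase, NStr::StringToInt, SetId, SetStr) suggests that the prefix match is case-insensitive and that the remainder becomes either an integer or a string local id; that reading is an assumption, and the exact rules of the real function may differ. `StripLclPrefix`, `MakeLocalId` and `MockLocalId` are hypothetical names, and this simplified version does not handle surrounding whitespace.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical stand-in for a local Seq-id: either an integer or a string.
struct MockLocalId {
    bool is_int;
    int id;
    std::string str;
};

// Removes a leading "lcl|" (matched case-insensitively) if present.
inline std::string StripLclPrefix(const std::string& s) {
    const std::string prefix = "lcl|";
    if (s.size() < prefix.size()) return s;
    for (std::size_t i = 0; i < prefix.size(); ++i) {
        if (std::tolower(static_cast<unsigned char>(s[i])) != prefix[i]) {
            return s;  // prefix does not match, keep the string as it is
        }
    }
    return s.substr(prefix.size());
}

// Assumed behavior: an all-digit remainder becomes an integer id,
// anything else is kept as a string id.
inline MockLocalId MakeLocalId(const std::string& raw) {
    const std::string s = StripLclPrefix(raw);
    MockLocalId out;
    out.is_int = false;
    out.id = 0;
    out.str = s;
    bool all_digits = !s.empty() && s.size() <= 9;  // length cap avoids int overflow
    for (std::size_t i = 0; all_digits && i < s.size(); ++i) {
        all_digits = std::isdigit(static_cast<unsigned char>(s[i])) != 0;
    }
    if (all_digits) {
        out.is_int = true;
        out.id = std::stoi(s);
        out.str.clear();
    }
    return out;
}
```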
void CAgpToSeqEntry::x_FinishedBioseq (void) |
protected |
Our own finalization after parent's finalization.
Definition at line 181 of file agp_seq_entry.cpp.
References m_bioseq, m_entries, CRef< C, Locker >::Reset(), and CSeq_entry_Base::SetSeq().
Referenced by Finalize(), and OnGapOrComponent().
virtual CRef< objects::CSeq_id > CAgpToSeqEntry::x_GetSeqIdFromStr (const std::string &str) |
protected virtual |
If you must change exactly how strings are turned into Seq-ids, you can override this in a subclass.
The default
Definition at line 193 of file agp_seq_entry.cpp.
References fForceLocalId, m_fFlags, s_DefaultSeqIdFromStr(), s_LocalSeqIdFromStr(), and str().
Referenced by OnGapOrComponent().
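The override point described above follows the usual virtual-hook pattern. The mock below is a standalone sketch, not the Toolkit class: `MockReader`, `CustomIdReader` and `IdFor` are hypothetical names, ids are modeled as tagged strings instead of CRef< objects::CSeq_id >, and the dispatch on fForceLocalId is inferred from the references list rather than confirmed from the source.

```cpp
#include <cassert>
#include <string>

// Standalone mirror of the EFlags values documented on this page.
enum EFlags { fSetSeqGap = (1 << 0), fForceLocalId = (1 << 1) };

// Hypothetical mock of the base class with the overridable hook.
class MockReader {
public:
    explicit MockReader(int flags = 0) : m_fFlags(flags) {}
    virtual ~MockReader() {}

    // Public entry point so the protected hook can be exercised.
    std::string IdFor(const std::string& str) { return x_GetSeqIdFromStr(str); }

protected:
    // Assumed default: force a local id if the flag is set,
    // otherwise fall back to the default conversion.
    virtual std::string x_GetSeqIdFromStr(const std::string& str) {
        if (m_fFlags & fForceLocalId) return "local:" + str;
        return "default:" + str;
    }

    const int m_fFlags;
};

// A subclass that replaces the conversion rule entirely.
class CustomIdReader : public MockReader {
protected:
    std::string x_GetSeqIdFromStr(const std::string& str) override {
        return "custom:" + str;
    }
};
```

Because the hook is virtual, code in the base class that calls x_GetSeqIdFromStr picks up the subclass rule without any other change.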
void CAgpToSeqEntry::x_SetSeqGap (objects::CSeq_gap &out_gap_info) |
protected |
Fills in out_gap_info based on the current CAgpRow.
Definition at line 202 of file agp_seq_entry.cpp.
References _ASSERT, DEFINE_STATIC_ARRAY_MAP, CAgpRow::eGapCentromere, CAgpRow::eGapClone, CAgpRow::eGapContig, CAgpRow::eGapFragment, CAgpRow::eGapHeterochromatin, CAgpRow::eGapRepeat, CAgpRow::eGapScaffold, CAgpRow::eGapShort_arm, CAgpRow::eGapTelomere, CSeq_gap_Base::eLinkage_linked, CSeq_gap_Base::eLinkage_unlinked, CLinkage_evidence_Base::eType_align_genus, CLinkage_evidence_Base::eType_align_trnscpt, CLinkage_evidence_Base::eType_align_xgenus, CSeq_gap_Base::eType_centromere, CSeq_gap_Base::eType_clone, CLinkage_evidence_Base::eType_clone_contig, CSeq_gap_Base::eType_contig, CSeq_gap_Base::eType_fragment, CSeq_gap_Base::eType_heterochromatin, CLinkage_evidence_Base::eType_map, CLinkage_evidence_Base::eType_paired_ends, CLinkage_evidence_Base::eType_pcr, CLinkage_evidence_Base::eType_proximity_ligation, CSeq_gap_Base::eType_repeat, CSeq_gap_Base::eType_scaffold, CSeq_gap_Base::eType_short_arm, CLinkage_evidence_Base::eType_strobe, CSeq_gap_Base::eType_telomere, CLinkage_evidence_Base::eType_unspecified, CLinkage_evidence_Base::eType_within_clone, CAgpRow::fLinkageEvidence_align_genus, CAgpRow::fLinkageEvidence_align_trnscpt, CAgpRow::fLinkageEvidence_align_xgenus, CAgpRow::fLinkageEvidence_clone_contig, CAgpRow::fLinkageEvidence_map, CAgpRow::fLinkageEvidence_na, CAgpRow::fLinkageEvidence_paired_ends, CAgpRow::fLinkageEvidence_pcr, CAgpRow::fLinkageEvidence_proximity_ligation, CAgpRow::fLinkageEvidence_strobe, CAgpRow::fLinkageEvidence_unspecified, CAgpRow::fLinkageEvidence_within_clone, CAgpRow::gap_type, ITERATE, CAgpRow::linkage, CAgpRow::linkage_evidence_flags, CAgpRow::linkage_evidences, CAgpReader::m_this_row, NCBI_USER_THROW_FMT, CSeq_gap_Base::SetLinkage(), CSeq_gap_Base::SetLinkage_evidence(), and CSeq_gap_Base::SetType().
Referenced by OnGapOrComponent().
CRef< objects::CBioseq > CAgpToSeqEntry::m_bioseq |
protected |
This is the bioseq currently being built.
Definition at line 114 of file agp_seq_entry.hpp.
Referenced by OnGapOrComponent(), and x_FinishedBioseq().
vector< CRef< objects::CSeq_entry > > CAgpToSeqEntry::m_entries |
protected |
Holds the results.
Definition at line 116 of file agp_seq_entry.hpp.
Referenced by GetResult(), and x_FinishedBioseq().
const TFlags CAgpToSeqEntry::m_fFlags |
protected |
Definition at line 93 of file agp_seq_entry.hpp.
Referenced by OnGapOrComponent(), and x_GetSeqIdFromStr().