NCBI C++ ToolKit
CAgpToSeqEntry Class Reference

This class is used to turn an AGP file into a vector of Seq-entries.

#include <objtools/readers/agp_seq_entry.hpp>

Public Types

enum  EFlags { fSetSeqGap = (1 << 0) , fForceLocalId = (1 << 1) }
 
typedef vector< CRef< objects::CSeq_entry > > TSeqEntryRefVec
 This is the way the results will be returned. Each Seq-entry contains just one Bioseq, built from the AGP file(s).
 
typedef int TFlags
 
- Public Types inherited from CAgpReader
enum  EFinalize { eFinalize_No , eFinalize_Yes }
 Whether or not the function should call Finalize() when it's done successfully.
 

Public Member Functions

 CAgpToSeqEntry (TFlags fFlags=0, EAgpVersion agp_version=eAgpVersion_auto, CAgpErr *arg=nullptr)
 After construction, you probably want to do something like call ReadStream and then GetResult.
 
TSeqEntryRefVec & GetResult (void)
 Returns the results found. Do not call this before finalizing.
 
- Public Member Functions inherited from CAgpReader
 CAgpReader (CAgpErr *arg, EAgpVersion agp_version=eAgpVersion_auto)
 
 CAgpReader (EAgpVersion agp_version=eAgpVersion_auto)
 
virtual ~CAgpReader ()
 
virtual int ReadStream (CNcbiIstream &is, EFinalize eFinalize=eFinalize_Yes)
 Read an AGP file from the given input stream.
 
int ReadStream (CNcbiIstream &is, bool bFinalize)
 Deprecated backward-compatibility wrapper.
 
virtual string GetErrorMessage (const string &filename=NcbiEmptyString)
 Return a string with the one or two source lines (depending on the error) on which the error occurred, along with the error message(s).
 
bool ProcessThisRow ()
 Invoked from ReadStream() after the row has been parsed; seldom needs to be invoked by the user.
 
virtual void SetVersion (EAgpVersion ver)
 Change what AGP version to use for the next input that's read.
 
CAgpErr * GetErrorHandler ()
 
void SetErrorHandler (CAgpErr *arg)
 
EAgpVersion GetVersion ()
 

Static Public Member Functions

static CRef< objects::CSeq_id > s_DefaultSeqIdFromStr (const std::string &str)
 This is the default method used to turn strings into Seq-ids in AGP contexts.
 
static CRef< objects::CSeq_id > s_LocalSeqIdFromStr (const std::string &str)
 Turn a string into a local Seq-id (removing "lcl|" from the beginning if needed).
 

Protected Member Functions

virtual void OnGapOrComponent (void)
 Builds new part of delta-seq in current bioseq, or adds bioseq and starts building a new one.
 
virtual int Finalize (void)
 Parent finalize plus making sure last m_bioseq is added.
 
void x_FinishedBioseq (void)
 Our own finalization after parent's finalization.
 
virtual CRef< objects::CSeq_id > x_GetSeqIdFromStr (const std::string &str)
 If you must change exactly how strings are turned into Seq-ids, you can override this in a subclass.
 
void x_SetSeqGap (objects::CSeq_gap &out_gap_info)
 Fills in out_gap_info based on current CAgpRow.
 
- Protected Member Functions inherited from CAgpReader
virtual void OnScaffoldEnd ()
 
virtual void OnObjectChange ()
 
virtual bool OnError ()
 
virtual void OnComment ()
 

Protected Attributes

const TFlags m_fFlags
 
CRef< objects::CBioseq > m_bioseq
 This is the bioseq currently being built.
 
vector< CRef< objects::CSeq_entry > > m_entries
 Holds the results.
 
- Protected Attributes inherited from CAgpReader
EAgpVersion m_agp_version
 
bool m_at_beg
 
bool m_at_end
 
bool m_line_skipped
 
bool m_prev_line_skipped
 
bool m_new_obj
 
bool m_content_line_seen
 
int m_error_code
 
CRef< CAgpRow > m_prev_row
 
CRef< CAgpRow > m_this_row
 
int m_line_num
 
int m_prev_line_num
 
string m_line
 

Private Member Functions

 CAgpToSeqEntry (const CAgpToSeqEntry &)
 
CAgpToSeqEntry & operator= (const CAgpToSeqEntry &)
 

Detailed Description

This class is used to turn an AGP file into a vector of Seq-entries.

Definition at line 50 of file agp_seq_entry.hpp.

Member Typedef Documentation

◆ TFlags

Definition at line 65 of file agp_seq_entry.hpp.

◆ TSeqEntryRefVec

typedef vector< CRef<objects::CSeq_entry> > CAgpToSeqEntry::TSeqEntryRefVec

This is the way the results will be returned. Each Seq-entry contains just one Bioseq, built from the AGP file(s).

Definition at line 55 of file agp_seq_entry.hpp.

Member Enumeration Documentation

◆ EFlags

Enumerator
fSetSeqGap 

Found gaps will not be given Seq-data such as Type and Linkage.

fForceLocalId 

All IDs will be treated as local IDs.

The default if this is NOT set is to first try to parse the ID, and only make local if parsing fails.

Definition at line 57 of file agp_seq_entry.hpp.
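
As a rough illustration of how the two EFlags bits combine into a TFlags value, the sketch below redeclares the enumerators with the bit values listed above inside a stand-in namespace, so it compiles without the toolkit. The flags_sketch namespace and the HasFlag helper are hypothetical and are not part of CAgpToSeqEntry.

```cpp
namespace flags_sketch {

// Stand-in copies of the documented enumerators (same bit values as above).
enum EFlags {
    fSetSeqGap    = (1 << 0),
    fForceLocalId = (1 << 1)
};
typedef int TFlags;

// Hypothetical helper: true if every bit of `flag` is set in `flags`.
inline bool HasFlag(TFlags flags, EFlags flag)
{
    return (flags & flag) == flag;
}

} // namespace flags_sketch
```

With the toolkit available, a value combined this way (for example fSetSeqGap | fForceLocalId) would presumably be passed as the fFlags constructor argument.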

Constructor & Destructor Documentation

◆ CAgpToSeqEntry() [1/2]

CAgpToSeqEntry::CAgpToSeqEntry ( TFlags  fFlags = 0,
EAgpVersion  agp_version = eAgpVersion_auto,
CAgpErr * arg = nullptr 
)

After construction, you probably want to do something like call ReadStream and then GetResult.

Parameters
agp_version: What is the AGP version of the input? The default is to auto-detect the AGP version, which is likely what the user wants most of the time.

Definition at line 57 of file agp_seq_entry.cpp.

◆ CAgpToSeqEntry() [2/2]

CAgpToSeqEntry::CAgpToSeqEntry ( const CAgpToSeqEntry & )
private

Member Function Documentation

◆ Finalize()

int CAgpToSeqEntry::Finalize ( void  )
protected virtual

Parent finalize plus making sure last m_bioseq is added.

Reimplemented from CAgpReader.

Definition at line 171 of file agp_seq_entry.cpp.

References CAgpReader::Finalize(), and x_FinishedBioseq().

◆ GetResult()

TSeqEntryRefVec& CAgpToSeqEntry::GetResult ( void  )
inline

Returns the results found. Do not call this before finalizing.

We are intentionally giving a non-const reference because the caller is free to take the seq-entries inside and do whatever they like with them. Each Seq-entry contains just one Bioseq, built from the AGP file(s).

Definition at line 81 of file agp_seq_entry.hpp.

References m_entries.

Referenced by CAgpObjectLoader::Execute(), CFileLoader::x_LoadAGP(), CAgpConverter::x_ReadAgpEntries(), CFormatGuessEx::x_TryAgp(), and CMultiReaderApp::xProcessAgp().
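
To illustrate why a non-const reference is convenient here, the following sketch uses stand-in types (Entry, Reader, and TakeResults are all hypothetical, with std::shared_ptr in place of CRef) and shows a caller moving the whole result vector out of the reader. It mirrors only the shape of GetResult(), not the toolkit implementation.

```cpp
#include <memory>
#include <string>
#include <vector>

namespace result_sketch {

// Stand-in for a Seq-entry; the real element type is objects::CSeq_entry.
struct Entry {
    std::string id;
};

typedef std::vector<std::shared_ptr<Entry>> TEntryVec;

// Stand-in reader that owns its results and exposes them by
// non-const reference, like GetResult() above.
class Reader {
public:
    void Add(const std::string& id)
    {
        m_entries.push_back(std::make_shared<Entry>(Entry{id}));
    }
    TEntryVec& GetResult() { return m_entries; }

private:
    TEntryVec m_entries;
};

// Moves all results out of the reader, leaving its vector empty.
inline TEntryVec TakeResults(Reader& reader)
{
    TEntryVec taken;
    taken.swap(reader.GetResult());
    return taken;
}

} // namespace result_sketch
```

A const reference would force the caller to copy the vector instead of swapping it out.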

◆ OnGapOrComponent()

void CAgpToSeqEntry::OnGapOrComponent ( void  )
protected virtual

◆ operator=()

CAgpToSeqEntry& CAgpToSeqEntry::operator= ( const CAgpToSeqEntry & )
private

◆ s_DefaultSeqIdFromStr()

CRef< CSeq_id > CAgpToSeqEntry::s_DefaultSeqIdFromStr ( const std::string & str)
static

This is the default method used to turn strings into Seq-ids in AGP contexts.

See also
x_GetSeqIdFromStr

Definition at line 66 of file agp_seq_entry.cpp.

References CRef< C, Locker >::Reset(), s_LocalSeqIdFromStr(), and str().

Referenced by x_GetSeqIdFromStr().

◆ s_LocalSeqIdFromStr()

CRef< objects::CSeq_id > CAgpToSeqEntry::s_LocalSeqIdFromStr ( const std::string & str)
static
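
The member summary describes this function as removing "lcl|" from the beginning of the string if needed. A minimal sketch of that prefix handling on plain strings follows. StripLocalPrefix is a hypothetical helper, not a toolkit function, and the case-sensitive match is an assumption.

```cpp
#include <string>

namespace lcl_sketch {

// Hypothetical helper: returns `str` without a leading "lcl|" prefix,
// or `str` unchanged when the prefix is absent.
// Assumption: the prefix is matched case-sensitively.
inline std::string StripLocalPrefix(const std::string& str)
{
    static const std::string kPrefix = "lcl|";
    if (str.compare(0, kPrefix.size(), kPrefix) == 0) {
        return str.substr(kPrefix.size());
    }
    return str;
}

} // namespace lcl_sketch
```

The real function goes on to build a local Seq-id from the remaining text, which this sketch leaves out.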

◆ x_FinishedBioseq()

void CAgpToSeqEntry::x_FinishedBioseq ( void  )
protected

Our own finalization after parent's finalization.

Definition at line 181 of file agp_seq_entry.cpp.

References m_bioseq, m_entries, CRef< C, Locker >::Reset(), and CSeq_entry_Base::SetSeq().

Referenced by Finalize(), and OnGapOrComponent().
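
The interplay between OnGapOrComponent(), Finalize(), and x_FinishedBioseq() follows a common "flush the last in-progress record" pattern. The sketch below shows that pattern with stand-in types only: Record, Grouper, OnRow, and FinishCurrent are hypothetical names, and the code is not the toolkit implementation. It shows why results read before finalizing can be missing the last object.

```cpp
#include <string>
#include <vector>

namespace flush_sketch {

// Stand-in for one finished object: its name and the parts collected for it.
struct Record {
    std::string name;
    std::vector<std::string> parts;
};

// Groups consecutive (object, part) rows into records. The record being
// built is appended to the results only when the object name changes or
// when Finalize() is called.
class Grouper {
public:
    void OnRow(const std::string& object, const std::string& part)
    {
        if (m_building && m_current.name != object) {
            FinishCurrent();  // the previous object is complete
        }
        if (!m_building) {
            m_current.name = object;
            m_building = true;
        }
        m_current.parts.push_back(part);
    }

    // Flushes the record still being built, if any. Safe to call twice.
    void Finalize()
    {
        if (m_building) {
            FinishCurrent();
        }
    }

    std::vector<Record>& GetResult() { return m_records; }

private:
    void FinishCurrent()
    {
        m_records.push_back(m_current);
        m_current = Record();
        m_building = false;
    }

    Record m_current;
    bool m_building = false;
    std::vector<Record> m_records;
};

} // namespace flush_sketch
```

In this sketch, reading GetResult() after three rows for two objects but before Finalize() yields only the first object.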

◆ x_GetSeqIdFromStr()

CRef< CSeq_id > CAgpToSeqEntry::x_GetSeqIdFromStr ( const std::string & str)
protected virtual

If you must change exactly how strings are turned into Seq-ids, you can override this in a subclass.

The default

Definition at line 193 of file agp_seq_entry.cpp.

References fForceLocalId, m_fFlags, s_DefaultSeqIdFromStr(), s_LocalSeqIdFromStr(), and str().

Referenced by OnGapOrComponent().
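
The override hook can be pictured with the stand-in classes below. Converter plays the role of the reader, and its virtual x_GetIdFromStr stands in for x_GetSeqIdFromStr(). All names are hypothetical. The rule used to decide whether a string "parses" is a placeholder assumption, not the toolkit's Seq-id parser. Only the flag-driven choice between a forced local ID and a parse-first fallback follows the fForceLocalId description above.

```cpp
#include <cctype>
#include <string>

namespace hook_sketch {

enum EFlags { fForceLocalId = (1 << 1) };

// Stand-in for a Seq-id: whether it is local, plus its text.
struct Id {
    bool        is_local;
    std::string text;
};

class Converter {
public:
    explicit Converter(int flags = 0) : m_fFlags(flags) {}
    virtual ~Converter() {}

    Id Convert(const std::string& str) { return x_GetIdFromStr(str); }

protected:
    // Overridable hook: force local if the flag is set, otherwise try to
    // "parse" first and fall back to a local ID.
    virtual Id x_GetIdFromStr(const std::string& str)
    {
        if (m_fFlags & fForceLocalId) {
            return Id{true, str};
        }
        // Placeholder parse rule (assumption): strings that start with an
        // uppercase letter count as successfully parsed, non-local IDs.
        if (!str.empty() && std::isupper(static_cast<unsigned char>(str[0]))) {
            return Id{false, str};
        }
        return Id{true, str};
    }

    const int m_fFlags;
};

// Subclass that replaces the conversion: always local, with a fixed prefix.
class PrefixingConverter : public Converter {
protected:
    Id x_GetIdFromStr(const std::string& str) override
    {
        return Id{true, "asm_" + str};
    }
};

} // namespace hook_sketch
```

Because the hook is virtual, callers keep using the same entry point and only the subclass changes how IDs are produced.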

◆ x_SetSeqGap()

void CAgpToSeqEntry::x_SetSeqGap ( objects::CSeq_gap &  out_gap_info)
protected

Fills in out_gap_info based on current CAgpRow.

Definition at line 202 of file agp_seq_entry.cpp.

References _ASSERT, DEFINE_STATIC_ARRAY_MAP, CAgpRow::eGapCentromere, CAgpRow::eGapClone, CAgpRow::eGapContig, CAgpRow::eGapFragment, CAgpRow::eGapHeterochromatin, CAgpRow::eGapRepeat, CAgpRow::eGapScaffold, CAgpRow::eGapShort_arm, CAgpRow::eGapTelomere, CSeq_gap_Base::eLinkage_linked, CSeq_gap_Base::eLinkage_unlinked, CLinkage_evidence_Base::eType_align_genus, CLinkage_evidence_Base::eType_align_trnscpt, CLinkage_evidence_Base::eType_align_xgenus, CSeq_gap_Base::eType_centromere, CSeq_gap_Base::eType_clone, CLinkage_evidence_Base::eType_clone_contig, CSeq_gap_Base::eType_contig, CSeq_gap_Base::eType_fragment, CSeq_gap_Base::eType_heterochromatin, CLinkage_evidence_Base::eType_map, CLinkage_evidence_Base::eType_paired_ends, CLinkage_evidence_Base::eType_pcr, CLinkage_evidence_Base::eType_proximity_ligation, CSeq_gap_Base::eType_repeat, CSeq_gap_Base::eType_scaffold, CSeq_gap_Base::eType_short_arm, CLinkage_evidence_Base::eType_strobe, CSeq_gap_Base::eType_telomere, CLinkage_evidence_Base::eType_unspecified, CLinkage_evidence_Base::eType_within_clone, CAgpRow::fLinkageEvidence_align_genus, CAgpRow::fLinkageEvidence_align_trnscpt, CAgpRow::fLinkageEvidence_align_xgenus, CAgpRow::fLinkageEvidence_clone_contig, CAgpRow::fLinkageEvidence_map, CAgpRow::fLinkageEvidence_na, CAgpRow::fLinkageEvidence_paired_ends, CAgpRow::fLinkageEvidence_pcr, CAgpRow::fLinkageEvidence_proximity_ligation, CAgpRow::fLinkageEvidence_strobe, CAgpRow::fLinkageEvidence_unspecified, CAgpRow::fLinkageEvidence_within_clone, CAgpRow::gap_type, ITERATE, CAgpRow::linkage, CAgpRow::linkage_evidence_flags, CAgpRow::linkage_evidences, CAgpReader::m_this_row, NCBI_USER_THROW_FMT, CSeq_gap_Base::SetLinkage(), CSeq_gap_Base::SetLinkage_evidence(), and CSeq_gap_Base::SetType().

Referenced by OnGapOrComponent().
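
The reference list above pairs each CAgpRow gap type with a CSeq_gap type and mentions a static map and a throwing macro. A table-driven translation of that kind might look like the sketch below. The enumerations are stand-ins whose numeric values are arbitrary and do not match the toolkit's, ToSeqGapType is a hypothetical function, and std::runtime_error stands in for whatever NCBI_USER_THROW_FMT raises.

```cpp
#include <map>
#include <stdexcept>
#include <string>

namespace gap_sketch {

// Stand-in for the AGP-side gap types (names follow the CAgpRow list above).
enum EAgpGap {
    eGapClone, eGapFragment, eGapRepeat, eGapScaffold, eGapContig,
    eGapCentromere, eGapShort_arm, eGapHeterochromatin, eGapTelomere
};

// Stand-in for the Seq-gap-side types (names follow the CSeq_gap_Base list).
enum ESeqGapType {
    eType_clone, eType_fragment, eType_repeat, eType_scaffold, eType_contig,
    eType_centromere, eType_short_arm, eType_heterochromatin, eType_telomere
};

// Looks the AGP gap type up in a static table; unmapped values throw.
inline ESeqGapType ToSeqGapType(int agp_gap)
{
    static const std::map<int, ESeqGapType> kMap = {
        { eGapClone,           eType_clone },
        { eGapFragment,        eType_fragment },
        { eGapRepeat,          eType_repeat },
        { eGapScaffold,        eType_scaffold },
        { eGapContig,          eType_contig },
        { eGapCentromere,      eType_centromere },
        { eGapShort_arm,       eType_short_arm },
        { eGapHeterochromatin, eType_heterochromatin },
        { eGapTelomere,        eType_telomere }
    };
    std::map<int, ESeqGapType>::const_iterator it = kMap.find(agp_gap);
    if (it == kMap.end()) {
        throw std::runtime_error("unknown AGP gap type: " +
                                 std::to_string(agp_gap));
    }
    return it->second;
}

} // namespace gap_sketch
```

Keeping the pairs in one table makes a missing mapping fail loudly at the lookup rather than silently producing a default type.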

Member Data Documentation

◆ m_bioseq

CRef<objects::CBioseq> CAgpToSeqEntry::m_bioseq
protected

This is the bioseq currently being built.

Definition at line 114 of file agp_seq_entry.hpp.

Referenced by OnGapOrComponent(), and x_FinishedBioseq().

◆ m_entries

vector< CRef<objects::CSeq_entry> > CAgpToSeqEntry::m_entries
protected

Holds the results.

Definition at line 116 of file agp_seq_entry.hpp.

Referenced by GetResult(), and x_FinishedBioseq().

◆ m_fFlags

const TFlags CAgpToSeqEntry::m_fFlags
protected

Definition at line 93 of file agp_seq_entry.hpp.

Referenced by OnGapOrComponent(), and x_GetSeqIdFromStr().


The documentation for this class was generated from the following files: agp_seq_entry.hpp and agp_seq_entry.cpp
Modified on Fri Sep 20 14:57:02 2024 by modify_doxy.py rev. 669887