Conserved domains on  [gi|398366157|ref|NP_058174|]

gag-pol fusion protein [Saccharomyces cerevisiae S288C]

Protein Classification

Ty1/Copia family ribonuclease HI (domain architecture ID 10470242)

Ty1/Copia family ribonuclease HI in long terminal repeat (LTR) retroelements is involved in the degradation of the original RNA template, generation of a polypurine tract (the primer for plus-strand DNA synthesis), and final removal of RNA primers from newly synthesized minus and plus strands.

List of domain hits

Name Accession Description Interval E-value
TYA pfam01021
Ty transposon capsid protein; Ty elements are yeast transposons. A 5.7 kb transcript codes for p3, a fusion protein of TYA and TYB. The TYA protein is analogous to the gag protein of retroviruses. TYA is cleaved to form a 46 kDa protein which can form mature virion-like particles. This entry corresponds to the capsid protein from Ty1 and Ty2 transposons.
17-396 0e+00

Pssm-ID: 425992  Cd Length: 384  Bit Score: 679.37  E-value: 0e+00
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 398366157    17 AYASVTSKEVPSNQDPLAVSASNLPEFDRDSTKVNSQQETTPGTSAVPENHHHVSPQPASVPPPQNGQYQQHGMMTPNKA 96
Cdd:pfam01021    1 ACASVTSKEVHTNQDPLDVSASKLQEYDKDSTKANSQQTTTPGSSAVPENHHHASPQPASVPPPQNGPYSQQCMMTPNQA 80
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 398366157    97 MASNWAHYQQPSMMTCSHYQTSPAYYQPDPHYPLPQYIP----PLSTSSPDPIDLKNQHSEIPQAKTKVGNNVLPPHTLT 172
Cdd:pfam01021   81 NPSGWPFYGHPSMMPYTPYQMSPMYFPPGPQSQFPQYPSsvgtPLSTPSPESGNTFTDSSSAKSDMTSTNKYVRPPPILT 160
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 398366157   173 SEENFSTWVKFYIRFLKNSNLGDIIPNDQGEIKRQMTYEEHAYIYNTFQAFAPFHLLPTWVKQILEINYADILTVLCKSV 252
Cdd:pfam01021  161 SPNDFLNWVKTYIKFLQNSNLGDIIPTATGKAVRQMTDDEHTFLYNTFQLFAPSQFLPTWVKDILSVDYTDIMKILSKSI 240
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 398366157   253 SKMQTNNQELKDWIALANLEYDGSTSADTFEITVSTIIQRLKENNINVSDRLACQLILKGLSGDFKYLRNQYRTKTNMKL 332
Cdd:pfam01021  241 EKMQSDTQEVNDIITLANLHYNGSTPADTFETTVTNIIDRLNNNGININDKVACQLIMRGLSGEYKFLRYTRHRHINMTV 320
                          330       340       350       360       370       380
                   ....*....|....*....|....*....|....*....|....*....|....*....|....
gi 398366157   333 SQLFAEIQLIYDENKIMNLNKPSQYKQHSEYKNVSRTSPNTTNTKVTTRNYHRTNSSKPRAAKA 396
Cdd:pfam01021  321 ADLFSDIHAIYEEQQESKRNKPTYRRNPSDEKNDSRTYTNTTKTKVITRNSQKTNNSKSRTAKA 384
RNase_HI_RT_Ty1 cd09272
Ty1/Copia family of RNase HI in long terminal repeat retroelements; Ribonuclease H (RNase H) enzymes are divided into two major families, Type 1 and Type 2, based on amino acid sequence similarities and biochemical properties. RNase H is an endonuclease that cleaves the RNA strand of an RNA/DNA hybrid in a sequence non-specific manner in the presence of divalent cations. RNase H is widely present in various organisms including bacteria, archaea, and eukaryotes. RNase HI has also been observed as an adjunct domain to the reverse transcriptase gene in retroviruses and in long terminal repeat (LTR)-bearing and non-LTR retrotransposons. RNase HI in LTR retrotransposons performs degradation of the original RNA template, generation of a polypurine tract (the primer for plus-strand DNA synthesis), and final removal of RNA primers from newly synthesized minus and plus strands. The catalytic residues for RNase H enzymatic activity, three aspartic acid residues and one glutamic acid residue (DEDD), are invariant across all RNase H domains. Phylogenetically, RNase HI of LTR retroelements is classified into five major families: Ty3/Gypsy, Ty1/Copia, Bel/Pao, DIRS1, and the vertebrate retroviruses. The Ty1/Copia family is widely distributed among the genomes of plants, fungi, and animals. RNase H inhibitors have been explored as anti-HIV drugs because RNase H inactivation inhibits reverse transcription.
1621-1757 2.15e-28

Pssm-ID: 260004  Cd Length: 140  Bit Score: 111.79  E-value: 2.15e-28
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 398366157 1621 VAISDASYGNQPY-YKSQIGNIFLLNGKVIGGKSTKASLTCTSTTEAEIHAVSEAIPLLNNLSHLVQEL---NKKPIIkg 1696
Cdd:cd09272     1 EGYSDADWAGDPDdRRSTSGYVFFLGGGPISWKSKKQTTVALSSTEAEYIALAEAAKEALWLRRLLEELgipLDGPTT-- 78
                          90       100       110       120       130       140
                  ....*....|....*....|....*....|....*....|....*....|....*....|...
gi 398366157 1697 LLTDSRSTISIIKStneEKF--RNRFFGTKAMRLRDEVSGNNLYVYYIETKKNIADVMTKPLP 1757
Cdd:cd09272    79 IYCDNQSAIALAKN---PVFhsRTKHIDIRYHFIREKVEKGEIKVEYVPTEDQLADILTKPLP 138
RVT_2 super family cl06662
Reverse transcriptase (RNA-dependent DNA polymerase); A reverse transcriptase gene is usually indicative of a mobile element such as a retrotransposon or retrovirus. Reverse transcriptases occur in a variety of mobile elements, including retrotransposons, retroviruses, group II introns, bacterial msDNAs, hepadnaviruses, and caulimoviruses. This Pfam entry includes reverse transcriptases not recognized by the pfam00078 model.
1285-1468 1.42e-22

The actual alignment was detected with superfamily member pfam07727:

Pssm-ID: 400190  Cd Length: 243  Bit Score: 98.82  E-value: 1.42e-22
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 398366157  1285 NKYYDRNDIdPK--KVINSMFIFNKK----RDGTHKARFVARGDIQHPDTYDSDMQSNTVHHYALMTSLSIALDNDYYIT 1358
Cdd:pfam07727    1 NETWTLVKL-PKnvKPIGTTWVHTHKindlKEVQYKARLVAQGFRQIAGEDYDKVFSPVIRLSSVRLLLAIAAEYEWPVH 79
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 398366157  1359 QLDISSAYLYADIKEELYIRPPPHL---GLNDKLLRLRKSLYGLKQSGANWYETIKSYLINCCDMQE-------VRGwsc 1428
Cdd:pfam07727   80 HMDVSSAFLNGDIDEEIYVKQPPGFnidNESGKVWQLNKSLYGLKQAPYMWNTCITKVLMDLNFEPDtaesgmyCRG--- 156
                          170       180       190       200
                   ....*....|....*....|....*....|....*....|
gi 398366157  1429 vFKNSQVTICLFVDDMILFSKDLNANKKIITTLKKQYDTK 1468
Cdd:pfam07727  157 -FGENKLIVGLYVDDMFITGSDITIINDFKLELAKHFKMK 195
rve pfam00665
Integrase core domain; Integrase mediates integration of a DNA copy of the viral genome into the host chromosome. Integrase is composed of three domains. The amino-terminal domain is a zinc-binding domain (pfam02022). This domain is the central catalytic domain. The carboxyl-terminal domain is a non-specific DNA-binding domain (pfam00552). The catalytic domain acts as an endonuclease when two nucleotides are removed from the 3' ends of the blunt-ended viral DNA made by reverse transcription. This domain also catalyzes the DNA strand transfer reaction of the 3' ends of the viral DNA to the 5' ends of the integration site.
660-761 3.41e-09

Pssm-ID: 459897 [Multi-domain]  Cd Length: 98  Bit Score: 55.78  E-value: 3.41e-09
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 398366157   660 PFQYLHTDIFgpvhHLPKSAP---SYFISFTDEKTRFQWVYPLhdRREESILNVFTSILAFIKNQFNaRVLVIQMDRGSE 736
Cdd:pfam00665    1 PNQLWQGDFT----YIRIPGGggkLYLLVIVDDFSREILAWAL--SSEMDAELVLDALERAIAFRGG-VPLIIHSDNGSE 73
                           90       100
                   ....*....|....*....|....*
gi 398366157   737 YTNKTLHKFFTNRGITACYTTTADS 761
Cdd:pfam00665   74 YTSKAFREFLKDLGIKPSFSRPGNP 98
gag_pre-integrs pfam13976
GAG-pre-integrase domain; This domain is found associated with retroviral insertion elements and lies just upstream of the integrase region on the polyproteins.
574-641 2.31e-06

Pssm-ID: 372857  Cd Length: 67  Bit Score: 46.59  E-value: 2.31e-06
                           10        20        30        40        50        60        70
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|.
gi 398366157   574 KLTINNVNKSKSV---NKYPYPLIHRMLGHANFRSIQKSLKKNAVTYLKESDIEwsnastyQCPDCLIGKS 641
Cdd:pfam13976    4 LLDLSSVANSSIAvasKDDETWLWHRRLGHPSFKGLKKLVKKGLLPGLPISKDL-------VCESCQLGKQ 67
 
Tra5 COG2801
Transposase InsO and inactivated derivatives [Mobilome: prophages, transposons]
727-778 1.10e-04

Pssm-ID: 442053 [Multi-domain]  Cd Length: 309  Bit Score: 46.30  E-value: 1.10e-04
                          10        20        30        40        50
                  ....*....|....*....|....*....|....*....|....*....|..
gi 398366157  727 LVIQMDRGSEYTNKTLHKFFTNRGITACYTTTADSRAHGVAERLNRTLLNDC 778
Cdd:COG2801   210 LILHSDNGSQYTSKAYQELLKKLGITQSMSRPGNPQDNAFIESFFGTLKYEL 261
Amelogenin smart00818
Amelogenins, cell adhesion proteins, play a role in the biomineralisation of teeth; They seem to regulate formation of crystallites during the secretory stage of tooth enamel development and are thought to play a major role in the structural organisation and mineralisation of developing enamel. The extracellular matrix of the developing enamel comprises two major classes of protein: the hydrophobic amelogenins and the acidic enamelins. Circular dichroism studies of porcine amelogenin have shown that the protein consists of 3 discrete folding units: the N-terminal region appears to contain beta-strand structures, while the C-terminal region displays characteristics of a random coil conformation. Subsequent studies on the bovine protein have indicated the amelogenin structure to contain a repetitive beta-turn segment and a "beta-spiral" between Gln112 and Leu138, which sequester a (Pro, Leu, Gln) rich region. The beta-spiral offers a probable site for interactions with Ca2+ ions. Mutations in the human amelogenin gene (AMGX) cause X-linked hypoplastic amelogenesis imperfecta, a disease characterised by defective enamel. A 9 bp deletion in exon 2 of AMGX results in the loss of codons for Ile5, Leu6, Phe7 and Ala8, and replacement by a new threonine codon, disrupting the 16-residue (Met1-Ala16) amelogenin signal peptide.
52-144 9.84e-04

Pssm-ID: 197891 [Multi-domain]  Cd Length: 165  Bit Score: 41.70  E-value: 9.84e-04
                            10        20        30        40        50        60        70        80
                    ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 398366157     52 SQQetTPGTSAVPEnHHHVSPQPASVP-PPQNGQYQQ--HGMMTPNKAMASNW-AHYQQPSMMTcshYQTSPAYYQPDPH 127
Cdd:smart00818   43 SQQ--HPPTHTLQP-HHHIPVLPAQQPvVPQQPLMPVpgQHSMTPTQHHQPNLpQPAQQPFQPQ---PLQPPQPQQPMQP 116
                            90
                    ....*....|....*..
gi 398366157    128 YPLPQYIPPLSTSSPDP 144
Cdd:smart00818  117 QPPVHPIPPLPPQPPLP 133
transpos_IS3 NF033516
IS3 family transposase
727-783 1.31e-03

Pssm-ID: 468052 [Multi-domain]  Cd Length: 369  Bit Score: 42.94  E-value: 1.31e-03
                          10        20        30        40        50        60
                  ....*....|....*....|....*....|....*....|....*....|....*....|...
gi 398366157  727 LVIQMDRGSEYTNKTLHKFFTNRGITACYTTTADSRAHGVAERLNRTL------LNDCRTLLH 783
Cdd:NF033516  277 LILHSDNGSQYTSKAYREWLKEHGITQSMSRPGNCWDNAVAESFFGTLkreclyRRRFRTLEE 339
transpos_IS481 NF033577
IS481 family transposase
648-776 1.91e-03

Pssm-ID: 468094 [Multi-domain]  Cd Length: 283  Bit Score: 42.19  E-value: 1.91e-03
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 398366157  648 KGSRLKYQESYePFQYLHTDIFGpVHHLPKSAPSYFISFTDEKTRFQWVYPLHDRREEsilnvfTSIlAFIKNQFNA--- 724
Cdd:NF033577  116 TGKVKRYERAH-PGELWHIDIKK-LGRIPDVGRLYLHTAIDDHSRFAYAELYPDETAE------TAA-DFLRRAFAEhgi 186
                          90       100       110       120       130
                  ....*....|....*....|....*....|....*....|....*....|....
gi 398366157  725 RVLVIQMDRGSEYTNKT--LHKFFTNRGITACYTTTADSRAHGVAERLNRTLLN 776
Cdd:NF033577  187 PIRRVLTDNGSEFRSRAhgFELALAELGIEHRRTRPYHPQTNGKVERFHRTLKD 240
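
The alignment blocks above pair a query row (label "gi 398366157") with a model row (label "Cdd:..."), each carrying a start coordinate, the aligned sequence, and an end coordinate. The Python sketch below, which is not part of the NCBI output, parses one such pair (the gag_pre-integrs block, copied verbatim) and checks that the number of non-gap residues agrees with the printed coordinates. The helper names parse_row, residue_count and identical_columns are hypothetical, and comparing residues case-insensitively is an assumption about how the lowercase letters should be treated.

```python
def parse_row(line):
    """Split an alignment row into (label, start, sequence, end)."""
    parts = line.split()
    # The last three fields are start, sequence, end; anything before
    # them is the row label (e.g. "gi 398366157" or "Cdd:pfam13976").
    start, seq, end = parts[-3], parts[-2], parts[-1]
    return " ".join(parts[:-3]), int(start), seq, int(end)


def residue_count(seq):
    """Count residues in an aligned row, ignoring '-' gap characters."""
    return sum(1 for ch in seq if ch != "-")


def identical_columns(seq_a, seq_b):
    """Count columns where both rows carry the same residue.

    Case is ignored here; that is an assumption, since the page does
    not state what lowercase letters mean.
    """
    return sum(
        1
        for a, b in zip(seq_a, seq_b)
        if a != "-" and b != "-" and a.upper() == b.upper()
    )


# Rows copied from the gag_pre-integrs (pfam13976) block.
query_line = "gi 398366157   574 KLTINNVNKSKSV---NKYPYPLIHRMLGHANFRSIQKSLKKNAVTYLKESDIEwsnastyQCPDCLIGKS 641"
model_line = "Cdd:pfam13976    4 LLDLSSVANSSIAvasKDDETWLWHRRLGHPSFKGLKKLVKKGLLPGLPISKDL-------VCESCQLGKQ 67"

q_label, q_start, q_seq, q_end = parse_row(query_line)
m_label, m_start, m_seq, m_end = parse_row(model_line)

# Both rows must span the same number of alignment columns.
assert len(q_seq) == len(m_seq)

# start + residues - 1 should reproduce the printed end coordinate.
q_consistent = q_start + residue_count(q_seq) - 1 == q_end
m_consistent = m_start + residue_count(m_seq) - 1 == m_end

matches = identical_columns(q_seq, m_seq)
print(q_label, q_consistent, m_label, m_consistent)
print(f"identical columns: {matches} of {len(q_seq)}")
```

Blocks that wrap over several lines would need their rows concatenated per label before the same check is applied.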
 
 
Blast search parameters
Data Source: Precalculated data, version = cdd.v.3.21
Preset Options: Database: CDSEARCH/cdd; Low complexity filter: no; Composition Based Adjustment: yes; E-value threshold: 0.01
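
As a rough illustration of how the E-value threshold relates to the hit list, the Python sketch below records the nine hits reported on this page (names, accessions, intervals and E-values copied from the list), keeps those at or below the threshold, and orders them along the protein. The Hit type and the use of "<=" for the cutoff are assumptions made for the example. The final greedy pass over non-overlapping intervals is only an observation about this data set, not a description of how CD-Search selects hits.

```python
from typing import NamedTuple


class Hit(NamedTuple):
    name: str
    accession: str
    start: int
    end: int
    evalue: float


# Values copied from the domain hit list on this page.
HITS = [
    Hit("TYA", "pfam01021", 17, 396, float("0e+00")),
    Hit("RNase_HI_RT_Ty1", "cd09272", 1621, 1757, 2.15e-28),
    Hit("RVT_2", "pfam07727", 1285, 1468, 1.42e-22),
    Hit("rve", "pfam00665", 660, 761, 3.41e-09),
    Hit("gag_pre-integrs", "pfam13976", 574, 641, 2.31e-06),
    Hit("Tra5", "COG2801", 727, 778, 1.10e-04),
    Hit("Amelogenin", "smart00818", 52, 144, 9.84e-04),
    Hit("transpos_IS3", "NF033516", 727, 783, 1.31e-03),
    Hit("transpos_IS481", "NF033577", 648, 776, 1.91e-03),
]

E_VALUE_THRESHOLD = 0.01  # from the search parameters above

# Assumption: a hit passes when its E-value is at or below the threshold.
passing = [h for h in HITS if h.evalue <= E_VALUE_THRESHOLD]

# Order the passing hits by position on the query protein.
by_position = sorted(passing, key=lambda h: (h.start, h.end))
for h in by_position:
    print(f"{h.start:>5}-{h.end:<5} {h.name:<16} {h.accession:<11} {h.evalue:.2e}")


def overlaps(a, b):
    """True when two closed residue intervals share at least one position."""
    return a.start <= b.end and b.start <= a.end


# Greedy pass: best E-value first, skipping hits that overlap a kept one.
kept = []
for h in sorted(passing, key=lambda h: h.evalue):
    if not any(overlaps(h, k) for k in kept):
        kept.append(h)

print("non-overlapping:", [h.name for h in sorted(kept, key=lambda h: h.start)])
```

On this data the greedy pass keeps TYA, gag_pre-integrs, rve, RVT_2 and RNase_HI_RT_Ty1; the other four hits each overlap a lower E-value hit.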

References:

  • Wang J et al. (2023), "The conserved domain database in 2023", Nucleic Acids Res. 51(D):384-8.
  • Lu S et al. (2020), "The conserved domain database in 2020", Nucleic Acids Res. 48(D):265-8.
  • Marchler-Bauer A et al. (2017), "CDD/SPARCLE: functional classification of proteins via subfamily domain architectures", Nucleic Acids Res. 45(D):200-3.