Conserved domains on [gi|1663694841|ref|XP_029179221|]

uncharacterized protein LOC114946738 [Nylanderia fulva]


List of domain hits

Name Accession Description Interval E-value
Peptidase_A17 pfam05380
Pao retrotransposon peptidase; Corresponds to Merops family A17. These proteins are homologous ...
884-1045 1.25e-69

Pao retrotransposon peptidase; Corresponds to Merops family A17. These proteins are homologous to aspartic proteinases encoded by retroposons and retroviruses.



Pssm-ID: 461634  Cd Length: 162  Bit Score: 230.86  E-value: 1.25e-69
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1663694841  884 TKRRILSDIAKFFDPLGWATPVIIRAKILMQRLWIAKCDWDEVAPPNLLEAWQQYHTHLKQLEEVMIPRWIQLGHHVLHL 963
Cdd:pfam05380    1 TKREVLSFIARIFDPLGLLSPVIVKGKILMQKLWQLKIDWDDPLPDELLEEWEKYRSELPELSTLTIPRQLTPPYDISSV 80
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1663694841  964 ELHGFSDASTKAYAAAVYIRVVTVDGMVSVNLLAAKSKVAPVKVMSVPRLELSAAQLLARLIHFIREALDFREVNVYCWT 1043
Cdd:pfam05380   81 ELHGFSDASEKAYGAVVYLRSEDTDGTVTVKLLCAKSKVAPLKKLTIPRLELCGALLLARLANYVIKELSLKISSVYAWS 160

                   ..
gi 1663694841 1044 DS 1045
Cdd:pfam05380  161 DS 162
RT_pepA17 cd01644
RT_pepA17: Reverse transcriptase (RTs) in retrotransposons. This subfamily represents the RT ...
653-869 1.10e-67

RT_pepA17: Reverse transcriptase (RTs) in retrotransposons. This subfamily represents the RT domain of a multifunctional enzyme. C-terminal to the RT domain is a domain homologous to aspartic proteinases (corresponding to Merops family A17) encoded by retrotransposons and retroviruses. RT catalyzes DNA replication from an RNA template and is responsible for the replication of retroelements.



Pssm-ID: 238822  Cd Length: 213  Bit Score: 227.19  E-value: 1.10e-67
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1663694841  653 YIPHHAVFRANSLTTRIRIVFNASSRTsNGTSLNDHLYPGPKLQKDLAAVILRWRTFRYVYSADVAKMYRQIQVDPRDRD 732
Cdd:cd01644      3 YLPHHAVIKPSKTTTKLRVVFDASARY-NGVSLNDMLLKGPDLLNSLFGVLLRFRQGKIAVSADIEKMFHQVKVRPEDRD 81
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1663694841  733 FQRIFWRPSPTHQ-VEEYRLCTVTYGMTSAPYLALKVMNQLAiDEGASFPLAtSVIDRQMYVDDFIFGADDKVLARQTRE 811
Cdd:cd01644     82 VLRFLWRKDGDEPkPIEYRMTVVPFGAASAPFLANRALKQHA-EDHPHEAAA-KIIKRNFYVDDILVSTDTLNEAVNVAK 159
                          170       180       190       200       210
                   ....*....|....*....|....*....|....*....|....*....|....*...
gi 1663694841  812 QIVSLLKRGGFTLRKWASNVSELLDEIEDTDHGLAQSRDLNEdeslKILGLTWQPDRD 869
Cdd:cd01644    160 RLIALLKKGGFNLRKWASNSQEVLDDLPEERVLLDRDSDVTE----KTLGLRWNPKTD 213
DUF5641 pfam18701
Family of unknown function (DUF5641); This presumed domain is found in a range of ...
1554-1647 1.40e-45

Family of unknown function (DUF5641); This presumed domain is found in a range of retrotransposon polyproteins.



Pssm-ID: 465838  Cd Length: 94  Bit Score: 159.19  E-value: 1.40e-45
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1663694841 1554 SRWQLLQHITEKFWKSWSNDYLLTLQQRPKWKIAQRLATVGRIVLLRNALAPPSHWELGRIIECHPGNDGLVRVVTVRTA 1633
Cdd:pfam18701    1 SRWQLVQQLRQHFWKRWSSEYLPQLQQRSKWTSSKPNLKVGDLVLVKEDNLPPLKWPLGRVIEVHPGSDGVVRVVTVRTA 80
                           90
                   ....*....|....
gi 1663694841 1634 RSQYKRPIVKLCFL 1647
Cdd:pfam18701   81 TGELKRPVVKLCPL 94
DUF1759 super family cl04160
Protein of unknown function (DUF1759); This is a family of proteins of unknown function. Most ...
101-215 2.88e-19

Protein of unknown function (DUF1759); This is a family of proteins of unknown function. Most of the members are gag-polyproteins.


The actual alignment was detected with superfamily member pfam03564:

Pssm-ID: 281552  Cd Length: 148  Bit Score: 85.96  E-value: 2.88e-19
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1663694841  101 FDGSFDKWESFRDRFQSMIIEEKSLSNVQKLHHLFSCLKGEALTAIEHLTITSDNFAVAWTILSSNFENERRLINSLDKS 180
Cdd:pfam03564    1 FSGDYKEWPAFWDLFESTIHSKPHLPKVQKFNYLKSLLKGEAANVVAHLAITASNYESAWEALKKRYDNPRVIKRSLLNE 80
                           90       100       110
                   ....*....|....*....|....*....|....*.
gi 1663694841  181 SREAWELKLGKTTVYPTF-DEIsaflDSRIRALDAL 215
Cdd:pfam03564   81 FMKLPSTNEDSVSQLRRFvDAA----NEIIRGLEAL 112
pepsin_retropepsin_like super family cl11403
Cellular and retroviral pepsin-like aspartate proteases; This family includes both cellular ...
362-522 7.93e-11

Cellular and retroviral pepsin-like aspartate proteases; This family includes both cellular and retroviral pepsin-like aspartate proteases. The cellular pepsin and pepsin-like enzymes are twice as long as their retroviral counterparts. The cellular pepsin-like aspartic proteases are found in mammals, plants, fungi and bacteria. These well-known and extensively characterized enzymes include pepsins, chymosin, rennin, cathepsins, and fungal aspartic proteases. Several have long been known to be medically (rennin, cathepsin D and E, pepsin) or commercially (chymosin) important. The eukaryotic pepsin-like proteases contain two domains possessing similar topological features. The N- and C-terminal domains, although structurally related by a 2-fold axis, have only limited sequence homology except in the vicinity of the active site. This suggests that the enzymes evolved by an ancient duplication event. The eukaryotic pepsin-like proteases have two active-site Asp residues, with each N- and C-terminal lobe contributing one residue. While the fungal and mammalian pepsins are bilobal proteins, retropepsins function as dimers and the monomer resembles the structure of the N- or C-terminal domain of the eukaryotic enzyme. The active site motif (Asp-Thr/Ser-Gly-Ser) is conserved between the retroviral and eukaryotic proteases and between the N- and C-terminal domains of eukaryotic pepsin-like proteases. The retropepsin-like family includes pepsin-like aspartate proteases from retroviruses, retrotransposons and retroelements, as well as eukaryotic DNA-damage-inducible proteins (DDIs) and bacterial aspartate peptidases. Retropepsin is synthesized as part of the POL polyprotein that contains an aspartyl protease, a reverse transcriptase, RNase H, and an integrase. The POL polyprotein undergoes specific enzymatic cleavage to yield the mature proteins. This family of aspartate proteases is classified by MEROPS as the peptidase family A1 (pepsin A) and A2 (retropepsin family).


The actual alignment was detected with superfamily member pfam05585:

Pssm-ID: 472175  Cd Length: 164  Bit Score: 62.25  E-value: 7.93e-11
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1663694841  362 VRVHSMHGRFQWARALIDQGSVSSFITENLVQSLRLPKLNTAVRVTGIGETQTSVRHAVHLTITPSNADTPVYRTTALIL 441
Cdd:pfam05585    1 VVVSNAQGARTKCRLLFDSGSELSYISERCINRLGLARTPSRILVIGISGDKAPQTRGSNRTVISSRLSNGTLAVRAHVL 80
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1663694841  442 KSLTRYLPNRLDIPEQWPHLNGLSLADPDPAGSDPIEIIIGADLFGSLILDGVRKGAVDEPIAQNTALGWIISGPIAQLR 521
Cdd:pfam05585   81 KKITSSLERHVIDDSILAVFNDLKPADLNFRKIAPIDILLGSDYFWDFITGTKIKDLGGGLIAISSIFGWVITSKQAEKK 160

                   .
gi 1663694841  522 P 522
Cdd:pfam05585  161 E 161
Integrase_H2C2 pfam17921
Integrase zinc binding domain; This zinc binding domain is found in a wide variety of ...
1273-1325 9.00e-08

Integrase zinc binding domain; This zinc binding domain is found in a wide variety of integrase proteins.



Pssm-ID: 465569 [Multi-domain]  Cd Length: 58  Bit Score: 50.32  E-value: 9.00e-08
                           10        20        30        40        50
                   ....*....|....*....|....*....|....*....|....*....|...
gi 1663694841 1273 DHTLVRMIIMDTHRRMLHAGSQLTLARIREKFWILRARSLVRAVLYKCVSCTR 1325
Cdd:pfam17921    2 PKSLRKEILKEAHDSGGHLGIEKTLARLRRRYWWPGMRKDVKKYVKSCETCQR 54
rve super family cl47583
Integrase core domain; Integrase mediates integration of a DNA copy of the viral genome into ...
1375-1428 8.57e-04

Integrase core domain; Integrase mediates integration of a DNA copy of the viral genome into the host chromosome. Integrase is composed of three domains. The amino-terminal domain is a zinc-binding domain (pfam02022). This domain is the central catalytic domain. The carboxyl-terminal domain is a non-specific DNA-binding domain (pfam00552). The catalytic domain acts as an endonuclease when two nucleotides are removed from the 3' ends of the blunt-ended viral DNA made by reverse transcription. This domain also catalyzes the DNA strand transfer reaction of the 3' ends of the viral DNA to the 5' ends of the integration site.


The actual alignment was detected with superfamily member pfam00665:

Pssm-ID: 459897 [Multi-domain]  Cd Length: 98  Bit Score: 40.38  E-value: 8.57e-04
                           10        20        30        40        50
                   ....*....|....*....|....*....|....*....|....*....|....*.
gi 1663694841 1375 KAYIAVFVCMTTKAIHLELVSD-YSSDAFLATLNRFVSRRGY-PASIYSDNGTTFQ 1428
Cdd:pfam00665   20 KLYLLVIVDDFSREILAWALSSeMDAELVLDALERAIAFRGGvPLIIHSDNGSEYT 75
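The intervals in the concise list above give the order of the domains along the 1,647+ residue query. As a minimal sketch (the tuple layout and the function names `architecture` and `overlapping_pairs` are illustrative choices, not part of CD-Search), the hits can be sorted by start position to read off the domain architecture and to check that no two concise hits overlap:

```python
# Hits copied from the concise list above: (name, start, end, e_value).
# The tuple layout is an assumption made for this sketch.
HITS = [
    ("Peptidase_A17", 884, 1045, 1.25e-69),
    ("RT_pepA17", 653, 869, 1.10e-67),
    ("DUF5641", 1554, 1647, 1.40e-45),
    ("DUF1759", 101, 215, 2.88e-19),
    ("pepsin_retropepsin_like", 362, 522, 7.93e-11),
    ("Integrase_H2C2", 1273, 1325, 9.00e-08),
    ("rve", 1375, 1428, 8.57e-04),
]


def architecture(hits):
    """Return the hits ordered from N-terminus to C-terminus."""
    return sorted(hits, key=lambda hit: hit[1])


def overlapping_pairs(hits):
    """Return name pairs of neighbouring hits whose intervals overlap."""
    ordered = architecture(hits)
    pairs = []
    for left, right in zip(ordered, ordered[1:]):
        # Intervals are inclusive, so equality of end and start is an overlap.
        if right[1] <= left[2]:
            pairs.append((left[0], right[0]))
    return pairs


ordered_names = [name for name, _, _, _ in architecture(HITS)]
for name, start, end, e_value in architecture(HITS):
    print(f"{start:>5}-{end:<5} {name:<24} E={e_value:.2e}")
```

Read from the N-terminus, the order (gag-like DUF1759, aspartic protease, reverse transcriptase, Peptidase_A17, integrase domains, DUF5641) is consistent with the retrotransposon polyprotein layout that the domain descriptions mention.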
 
Name Accession Description Interval E-value
Peptidase_A17 pfam05380
Pao retrotransposon peptidase; Corresponds to Merops family A17. These proteins are homologous ...
884-1045 1.25e-69

Pao retrotransposon peptidase; Corresponds to Merops family A17. These proteins are homologous to aspartic proteinases encoded by retroposons and retroviruses.


Pssm-ID: 461634  Cd Length: 162  Bit Score: 230.86  E-value: 1.25e-69
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1663694841  884 TKRRILSDIAKFFDPLGWATPVIIRAKILMQRLWIAKCDWDEVAPPNLLEAWQQYHTHLKQLEEVMIPRWIQLGHHVLHL 963
Cdd:pfam05380    1 TKREVLSFIARIFDPLGLLSPVIVKGKILMQKLWQLKIDWDDPLPDELLEEWEKYRSELPELSTLTIPRQLTPPYDISSV 80
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1663694841  964 ELHGFSDASTKAYAAAVYIRVVTVDGMVSVNLLAAKSKVAPVKVMSVPRLELSAAQLLARLIHFIREALDFREVNVYCWT 1043
Cdd:pfam05380   81 ELHGFSDASEKAYGAVVYLRSEDTDGTVTVKLLCAKSKVAPLKKLTIPRLELCGALLLARLANYVIKELSLKISSVYAWS 160

                   ..
gi 1663694841 1044 DS 1045
Cdd:pfam05380  161 DS 162
RT_pepA17 cd01644
RT_pepA17: Reverse transcriptase (RTs) in retrotransposons. This subfamily represents the RT ...
653-869 1.10e-67

RT_pepA17: Reverse transcriptase (RTs) in retrotransposons. This subfamily represents the RT domain of a multifunctional enzyme. C-terminal to the RT domain is a domain homologous to aspartic proteinases (corresponding to Merops family A17) encoded by retrotransposons and retroviruses. RT catalyzes DNA replication from an RNA template and is responsible for the replication of retroelements.


Pssm-ID: 238822  Cd Length: 213  Bit Score: 227.19  E-value: 1.10e-67
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1663694841  653 YIPHHAVFRANSLTTRIRIVFNASSRTsNGTSLNDHLYPGPKLQKDLAAVILRWRTFRYVYSADVAKMYRQIQVDPRDRD 732
Cdd:cd01644      3 YLPHHAVIKPSKTTTKLRVVFDASARY-NGVSLNDMLLKGPDLLNSLFGVLLRFRQGKIAVSADIEKMFHQVKVRPEDRD 81
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1663694841  733 FQRIFWRPSPTHQ-VEEYRLCTVTYGMTSAPYLALKVMNQLAiDEGASFPLAtSVIDRQMYVDDFIFGADDKVLARQTRE 811
Cdd:cd01644     82 VLRFLWRKDGDEPkPIEYRMTVVPFGAASAPFLANRALKQHA-EDHPHEAAA-KIIKRNFYVDDILVSTDTLNEAVNVAK 159
                          170       180       190       200       210
                   ....*....|....*....|....*....|....*....|....*....|....*...
gi 1663694841  812 QIVSLLKRGGFTLRKWASNVSELLDEIEDTDHGLAQSRDLNEdeslKILGLTWQPDRD 869
Cdd:cd01644    160 RLIALLKKGGFNLRKWASNSQEVLDDLPEERVLLDRDSDVTE----KTLGLRWNPKTD 213
DUF5641 pfam18701
Family of unknown function (DUF5641); This presumed domain is found in a range of ...
1554-1647 1.40e-45

Family of unknown function (DUF5641); This presumed domain is found in a range of retrotransposon polyproteins.


Pssm-ID: 465838  Cd Length: 94  Bit Score: 159.19  E-value: 1.40e-45
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1663694841 1554 SRWQLLQHITEKFWKSWSNDYLLTLQQRPKWKIAQRLATVGRIVLLRNALAPPSHWELGRIIECHPGNDGLVRVVTVRTA 1633
Cdd:pfam18701    1 SRWQLVQQLRQHFWKRWSSEYLPQLQQRSKWTSSKPNLKVGDLVLVKEDNLPPLKWPLGRVIEVHPGSDGVVRVVTVRTA 80
                           90
                   ....*....|....
gi 1663694841 1634 RSQYKRPIVKLCFL 1647
Cdd:pfam18701   81 TGELKRPVVKLCPL 94
DUF1759 pfam03564
Protein of unknown function (DUF1759); This is a family of proteins of unknown function. Most ...
101-215 2.88e-19

Protein of unknown function (DUF1759); This is a family of proteins of unknown function. Most of the members are gag-polyproteins.


Pssm-ID: 281552  Cd Length: 148  Bit Score: 85.96  E-value: 2.88e-19
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1663694841  101 FDGSFDKWESFRDRFQSMIIEEKSLSNVQKLHHLFSCLKGEALTAIEHLTITSDNFAVAWTILSSNFENERRLINSLDKS 180
Cdd:pfam03564    1 FSGDYKEWPAFWDLFESTIHSKPHLPKVQKFNYLKSLLKGEAANVVAHLAITASNYESAWEALKKRYDNPRVIKRSLLNE 80
                           90       100       110
                   ....*....|....*....|....*....|....*.
gi 1663694841  181 SREAWELKLGKTTVYPTF-DEIsaflDSRIRALDAL 215
Cdd:pfam03564   81 FMKLPSTNEDSVSQLRRFvDAA----NEIIRGLEAL 112
DUF1758 pfam05585
Putative peptidase (DUF1758); This is a family of nematode proteins of unknown function. ...
362-522 7.93e-11

Putative peptidase (DUF1758); This is a family of nematode proteins of unknown function. However, it seems likely that these proteins act as aspartic peptidases.


Pssm-ID: 147642  Cd Length: 164  Bit Score: 62.25  E-value: 7.93e-11
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1663694841  362 VRVHSMHGRFQWARALIDQGSVSSFITENLVQSLRLPKLNTAVRVTGIGETQTSVRHAVHLTITPSNADTPVYRTTALIL 441
Cdd:pfam05585    1 VVVSNAQGARTKCRLLFDSGSELSYISERCINRLGLARTPSRILVIGISGDKAPQTRGSNRTVISSRLSNGTLAVRAHVL 80
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1663694841  442 KSLTRYLPNRLDIPEQWPHLNGLSLADPDPAGSDPIEIIIGADLFGSLILDGVRKGAVDEPIAQNTALGWIISGPIAQLR 521
Cdd:pfam05585   81 KKITSSLERHVIDDSILAVFNDLKPADLNFRKIAPIDILLGSDYFWDFITGTKIKDLGGGLIAISSIFGWVITSKQAEKK 160

                   .
gi 1663694841  522 P 522
Cdd:pfam05585  161 E 161
Integrase_H2C2 pfam17921
Integrase zinc binding domain; This zinc binding domain is found in a wide variety of ...
1273-1325 9.00e-08

Integrase zinc binding domain; This zinc binding domain is found in a wide variety of integrase proteins.


Pssm-ID: 465569 [Multi-domain]  Cd Length: 58  Bit Score: 50.32  E-value: 9.00e-08
                           10        20        30        40        50
                   ....*....|....*....|....*....|....*....|....*....|...
gi 1663694841 1273 DHTLVRMIIMDTHRRMLHAGSQLTLARIREKFWILRARSLVRAVLYKCVSCTR 1325
Cdd:pfam17921    2 PKSLRKEILKEAHDSGGHLGIEKTLARLRRRYWWPGMRKDVKKYVKSCETCQR 54
rve pfam00665
Integrase core domain; Integrase mediates integration of a DNA copy of the viral genome into ...
1375-1428 8.57e-04

Integrase core domain; Integrase mediates integration of a DNA copy of the viral genome into the host chromosome. Integrase is composed of three domains. The amino-terminal domain is a zinc-binding domain (pfam02022). This domain is the central catalytic domain. The carboxyl-terminal domain is a non-specific DNA-binding domain (pfam00552). The catalytic domain acts as an endonuclease when two nucleotides are removed from the 3' ends of the blunt-ended viral DNA made by reverse transcription. This domain also catalyzes the DNA strand transfer reaction of the 3' ends of the viral DNA to the 5' ends of the integration site.


Pssm-ID: 459897 [Multi-domain]  Cd Length: 98  Bit Score: 40.38  E-value: 8.57e-04
                           10        20        30        40        50
                   ....*....|....*....|....*....|....*....|....*....|....*.
gi 1663694841 1375 KAYIAVFVCMTTKAIHLELVSD-YSSDAFLATLNRFVSRRGY-PASIYSDNGTTFQ 1428
Cdd:pfam00665   20 KLYLLVIVDDFSREILAWALSSeMDAELVLDALERAIAFRGGvPLIIHSDNGSEYT 75
retropepsin_like cd00303
Retropepsins; pepsin-like aspartate proteases; The family includes pepsin-like aspartate ...
372-445 4.04e-03

Retropepsins; pepsin-like aspartate proteases; The family includes pepsin-like aspartate proteases from retroviruses, retrotransposons and retroelements, as well as eukaryotic DNA-damage-inducible proteins (DDIs) and bacterial aspartate peptidases. While fungal and mammalian pepsins are bilobal proteins with structurally related N- and C-terminal domains, retropepsins are half as long as their fungal and mammalian counterparts. The monomers are structurally related to one lobe of the pepsin molecule, and retropepsins function as homodimers. The active site aspartate occurs within a motif (Asp-Thr/Ser-Gly), as it does in pepsin. Retroviral aspartyl protease is synthesized as part of the POL polyprotein that contains an aspartyl protease, a reverse transcriptase, RNase H, and an integrase. The POL polyprotein undergoes specific enzymatic cleavage to yield the mature proteins. In aspartate peptidases, Asp residues are ligands of an activated water molecule in all examples where catalytic residues have been identified. This group of aspartate peptidases is classified by MEROPS as the peptidase family A2 (retropepsin family, clan AA), subfamily A2A.


Pssm-ID: 133136  Cd Length: 92  Bit Score: 38.09  E-value: 4.04e-03
                           10        20        30        40        50        60        70
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*..
gi 1663694841  372 QWARALIDQGSVSSFITENLVQSLRLPK--LNTAVRVTGI-GETQTSVRHAVHLTITPSNADTPVyrtTALILKSLT 445
Cdd:cd00303      8 VPVRALVDSGASVNFISESLAKKLGLPPrlLPTPLKVKGAnGSSVKTLGVILPVTIGIGGKTFTV---DFYVLDLLS 81
 
RT_DIRS1 cd03714
RT_DIRS1: Reverse transcriptases (RTs) occurring in the DIRS1 group of retrotransposons. Members ...
716-824 8.38e-03

RT_DIRS1: Reverse transcriptases (RTs) occurring in the DIRS1 group of retrotransposons. Members of the subfamily include the Dictyostelium DIRS-1, Volvox carteri kangaroo, and Panagrellus redivivus PAT elements. These elements differ from LTR and conventional non-LTR retrotransposons. They contain split direct repeat (SDR) termini, and have been proposed to integrate via double-stranded closed-circle DNA intermediates assisted by an encoded recombinase which is similar to gamma-site-specific integrase.


Pssm-ID: 239684 [Multi-domain]  Cd Length: 119  Bit Score: 38.09  E-value: 8.38e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1663694841  716 DVAKMYRQIQVDPRDRDFQRIFWRPSPthqveeYRLCTVTYGMTSAPYLALKVMnqlaidEGASFPLATSVIDRQMYVDD 795
Cdd:cd03714      2 DLKDAYFHIPILPRSRDLLGFAWQGET------YQFKALPFGLSLAPRVFTKVV------EALLAPLRLLGVRIFSYLDD 69
                           90       100       110
                   ....*....|....*....|....*....|
gi 1663694841  796 F-IFGADDKVLARQTREQIVSLLKRGGFTL 824
Cdd:cd03714     70 LlIIASSIKTSEAVLRHLRATLLANLGFTL 99
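Each alignment block pairs the query row (`gi 1663694841`) with the domain model row (`Cdd:...`). A rough way to summarise a block is its percent identity. The sketch below uses the two Integrase_H2C2 rows (query 1273-1325, model 2-54) copied from the list above. It assumes, as the blocks above suggest, that dashes mark gaps and lowercase letters mark residues outside the aligned region, so only columns where both rows carry an uppercase letter are counted. The function name `identity_counts` is hypothetical.

```python
def identity_counts(query: str, subject: str) -> tuple[int, int]:
    """Count identical and aligned columns in two equal-length alignment rows.

    Assumption for this sketch: a column counts as aligned only when both
    rows hold an uppercase residue; '-' and lowercase letters are skipped.
    """
    if len(query) != len(subject):
        raise ValueError("alignment rows must have the same length")
    identical = 0
    aligned = 0
    for q, s in zip(query, subject):
        if q.isupper() and s.isupper():
            aligned += 1
            if q == s:
                identical += 1
    return identical, aligned


# Rows copied from the Integrase_H2C2 (pfam17921) alignment above.
QUERY = "DHTLVRMIIMDTHRRMLHAGSQLTLARIREKFWILRARSLVRAVLYKCVSCTR"
MODEL = "PKSLRKEILKEAHDSGGHLGIEKTLARLRRRYWWPGMRKDVKKYVKSCETCQR"

identical, aligned = identity_counts(QUERY, MODEL)
print(f"{identical}/{aligned} identical columns ({identical / aligned:.1%})")
```

For blocks that span several lines, the rows of each line would need to be concatenated per sequence before calling the function. Percent identity is only a coarse summary; the bit score and E-value reported with each hit come from the position-specific scoring matrix, not from identity.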
 
Blast search parameters
Data Source: Precalculated data, version = cdd.v.3.21
Preset Options: Database: CDSEARCH/cdd; Low complexity filter: no; Composition Based Adjustment: yes; E-value threshold: 0.01
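The E-value threshold above can be checked against the per-hit summary lines ("Pssm-ID: ... E-value: ...") that precede each alignment. The following is a minimal sketch, not an official parser: the regular expression, the dictionary keys, and the choice of `<=` for the comparison are assumptions made for illustration, and the three sample lines are copied from the hit list.

```python
import re

# Pattern for the per-hit summary line; the group names are illustrative.
SUMMARY = re.compile(
    r"Pssm-ID:\s*(?P<pssm_id>\d+)"
    r"(?P<multi>\s*\[Multi-domain\])?"
    r"\s+Cd Length:\s*(?P<cd_length>\d+)"
    r"\s+Bit Score:\s*(?P<bit_score>[\d.]+)"
    r"\s+E-value:\s*(?P<e_value>\S+)"
)

E_VALUE_THRESHOLD = 0.01  # from the preset options above


def parse_summary(line):
    """Parse one summary line into a dict, or return None if it does not match."""
    match = SUMMARY.search(line)
    if match is None:
        return None
    return {
        "pssm_id": int(match.group("pssm_id")),
        "multi_domain": match.group("multi") is not None,
        "cd_length": int(match.group("cd_length")),
        "bit_score": float(match.group("bit_score")),
        "e_value": float(match.group("e_value")),
    }


# Summary lines copied from the hit list above.
LINES = [
    "Pssm-ID: 461634  Cd Length: 162  Bit Score: 230.86  E-value: 1.25e-69",
    "Pssm-ID: 465569 [Multi-domain]  Cd Length: 58  Bit Score: 50.32  E-value: 9.00e-08",
    "Pssm-ID: 239684 [Multi-domain]  Cd Length: 119  Bit Score: 38.09  E-value: 8.38e-03",
]

records = [parse_summary(line) for line in LINES]
passing = [r for r in records if r is not None and r["e_value"] <= E_VALUE_THRESHOLD]
print(f"{len(passing)} of {len(records)} sample hits pass the threshold")
```

The weakest hit on the page, RT_DIRS1 at 8.38e-03, sits just under the 0.01 cutoff, which is why it is reported at all.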
