Conserved domains on  [gi|1868331368|ref|XP_035297551|]
LOW QUALITY PROTEIN: uncharacterized protein LOC113836537 isoform X2 [Cricetulus griseus]

List of domain hits

Name Accession Description Interval E-value
Gag_p30 super family cl03444
Gag P30 core shell protein; According to Swiss-Prot annotation this protein is the viral core ...
235-443 6.10e-121

Gag P30 core shell protein; According to Swiss-Prot annotation this protein is the viral core shell protein. P30 is essential for viral assembly.


The actual alignment was detected with superfamily member pfam02093:

Pssm-ID: 426597  Cd Length: 208  Bit Score: 375.93  E-value: 6.10e-121
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1868331368  235 QYWPFSSADLYNWKAHNPPFSQDPQALTGLIESILITHQPTWDDCQQLLQILLTTEERQRVLLEARKNVPGPDGRPTQLP 314
Cdd:pfam02093    1 QYWPFSSSDLYNWKNNNPSFSEDPGKLTALIESVLVTHQPTWDDCQQLLGTLLTGEEKQRVLLEARKAVRGNDGRPTQLP 80
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1868331368  315 NEIDEVFPLIRPtDWDINTAAGRERHRLYRQTLLAGLKGAGRRPTNLAKVRAVVQGLEETPAGFLERLMEAYRMYTPFDP 394
Cdd:pfam02093   81 NEVDAAFPLERP-DWDYTTPAGRNHLVLYRQLLLAGLQNAGRSPTNLAKVKGITQGPNESPSAFLERLKEAYRRYTPYDP 159
                          170       180       190       200
                   ....*....|....*....|....*....|....*....|....*....
gi 1868331368  395 QAQDREADIIMSFIGQSAPDIRTKLQRLEGLQGYTLRDLVKEAEKIFNK 443
Cdd:pfam02093  160 EDPGQETNVSMSFIWQSAPDIGRKLERLEDLKSKTLGDLVREAEKIFNK 208
RT_ZFREV_like cd03715
RT_ZFREV_like: A subfamily of reverse transcriptases (RTs) found in sequences similar to the ...
717-930 2.39e-105

RT_ZFREV_like: A subfamily of reverse transcriptases (RTs) found in sequences similar to the intact endogenous retrovirus ZFERV from zebrafish and to Moloney murine leukemia virus RT. An RT gene is usually indicative of a mobile element such as a retrotransposon or retrovirus. RTs occur in a variety of mobile elements, including retrotransposons, retroviruses, group II introns, bacterial msDNAs, hepadnaviruses, and caulimoviruses. These elements can be divided into two major groups. One group contains retroviruses and DNA viruses whose propagation involves an RNA intermediate. They are grouped together with transposable elements containing long terminal repeats (LTRs). The other group, also called poly(A)-type retrotransposons, contains fungal mitochondrial introns and transposable elements that lack LTRs. Phylogenetic analysis suggests that ZFERV belongs to a distinct group of retroviruses.


Pssm-ID: 239685 [Multi-domain]  Cd Length: 210  Bit Score: 333.16  E-value: 2.39e-105
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1868331368  717 PAAIKQYPLSQEAREGIRPHINRLLQQGILRPCHSPWNTPLLPIKKPGTGEYRPVQDLREVNRRTEDIHPTVPNPYNLLS 796
Cdd:cd03715      1 PVNQKQYPLPREAREGITPHIQELLEAGILVPCQSPWNTPILPVKKPGGNDYRMVQDLRLVNQAVLPIHPAVPNPYTLLS 80
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1868331368  797 MLPPSHIWYTVLDLKDTFFCLRLSPQSQPMFAFEWKDpetgfsGQLTWTSLPQGFKNSPTLFDEALHQDLADFRIHNPNL 876
Cdd:cd03715     81 LLPPKHQWYTVLDLANAFFSLPLAPDSQPLFAFEWEG------QQYTFTRLPQGFKNSPTLFHEALARDLAPFPLEHEGT 154
                          170       180       190       200       210
                   ....*....|....*....|....*....|....*....|....*....|....
gi 1868331368  877 ILLQYVDDLLLAAESEKDCLQGTGALLQKLGELGYRASAKKAQLYREQVTYLGY 930
Cdd:cd03715    155 ILLQYVDDLLLAADSEEDCLKGTDALLTHLGELGYKVSPKKAQICRAEVKFLGV 208
Gag_MA pfam01140
Matrix protein (MA), p15; The matrix protein, p15, is encoded by the gag gene. MA is involved ...
2-127 1.38e-63

Matrix protein (MA), p15; The matrix protein, p15, is encoded by the gag gene. MA is involved in pathogenicity.


Pssm-ID: 426076 [Multi-domain]  Cd Length: 126  Bit Score: 211.86  E-value: 1.38e-63
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1868331368    2 GQQLTTPLSLTLGHWQDVLSRARKESVEIRKKKWQTLCFSEWPALNTGWPRDGTFDLTTILQVKTRVFQPGPWGYPDQVP 81
Cdd:pfam01140    1 GQTVTTPLSLTLGHWSDVPSRACNQSVDVKKRRWVTFCSAEWPTLNVGWPRDGTFNLTTILQVKTRVFAPGPHGHPDQVP 80
                           90       100       110       120
                   ....*....|....*....|....*....|....*....|....*.
gi 1868331368   82 YIVTWESLSRDPPPWVRPFLPPRLSGPRPTPLPSPAPLIPTPSAPP 127
Cdd:pfam01140   81 YIVTWEALAADPPPWVRPFLTPKPPPPQPPAAPGLRPPLPPASAPP 126
RNase_HI_RT_Bel cd09273
Bel/Pao family of RNase HI in long terminal repeat retroelements; Ribonuclease H (RNase H) enzymes ...
1181-1321 6.29e-58

Bel/Pao family of RNase HI in long terminal repeat retroelements; Ribonuclease H (RNase H) enzymes are divided into two major families, Type 1 and Type 2, based on amino acid sequence similarities and biochemical properties. RNase H is an endonuclease that cleaves the RNA strand of an RNA/DNA hybrid in a sequence non-specific manner in the presence of divalent cations. RNase H is widely present in various organisms, including bacteria, archaea, and eukaryotes. RNase HI has also been observed as an adjunct domain to the reverse transcriptase gene in retroviruses, in long terminal repeat (LTR)-bearing retrotransposons, and in non-LTR retrotransposons. RNase HI in LTR retrotransposons performs degradation of the original RNA template, generation of a polypurine tract (the primer for plus-strand DNA synthesis), and final removal of RNA primers from newly synthesized minus and plus strands. The catalytic residues for RNase H enzymatic activity, three aspartic acid residues and one glutamic acid residue (DEDD), are invariant across all RNase H domains. Based on phylogenetic patterns, RNase HI of LTR retroelements is classified into five major families: Ty3/Gypsy, Ty1/Copia, Bel/Pao, DIRS1, and the vertebrate retroviruses. The Bel/Pao family has been described only in metazoan genomes. RNase H has been explored as an anti-HIV drug target because RNase H inactivation inhibits reverse transcription.


Pssm-ID: 260005 [Multi-domain]  Cd Length: 131  Bit Score: 195.63  E-value: 6.29e-58
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1868331368 1181 TWFTDGSSFlqdgirKAGAAVVDGQTTIWASALPPGTSAQRAELIALTQALKMAEGRRVNIYTDSRYAFATAHVHGEIYR 1260
Cdd:cd09273      1 TVFTDGSSF------KAGYAIVSGTEIVEAQPLPPGTSAQRAELIALIQALELAKGKPVNIYTDSAYAVHALHLLETIGI 74
                           90       100       110       120       130       140
                   ....*....|....*....|....*....|....*....|....*....|....*....|.
gi 1868331368 1261 RRGLLtsaeKDIKNKTEILELLQALFLPRRLSIIHCPGHQKGNDPVARGNRMADEEARKAA 1321
Cdd:cd09273     75 ERGFL----KSIKNLSLFLQLLEAVQRPKPVAIIHIRAHSKLPGPLAEGNAQADAAAKQAA 131
RP_RTVL_H_like cd06095
Retropepsin of the RTVL_H family of human endogenous retrovirus-like elements; This family ...
552-633 7.19e-25

Retropepsin of the RTVL_H family of human endogenous retrovirus-like elements; This family includes aspartate proteases from retroelements with LTR (long terminal repeats) including the RTVL_H family of human endogenous retrovirus-like elements. While fungal and mammalian pepsins are bilobal proteins with structurally related N- and C-termini, retropepsins are half as long as their fungal and mammalian counterparts. The monomers are structurally related to one lobe of the pepsin molecule and retropepsins function as homodimers. The active site aspartate occurs within a motif (Asp-Thr/Ser-Gly), as it does in pepsin. Retroviral aspartyl protease is synthesized as part of the POL polyprotein that contains an aspartyl protease, a reverse transcriptase, RNase H, and an integrase. The POL polyprotein undergoes specific enzymatic cleavage to yield the mature proteins. In aspartate peptidases, Asp residues are ligands of an activated water molecule in all examples where catalytic residues have been identified. This group of aspartate peptidases is classified by MEROPS as the peptidase family A2 (retropepsin family, clan AA), subfamily A2A.


Pssm-ID: 133159  Cd Length: 86  Bit Score: 99.71  E-value: 7.19e-25
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1868331368  552 VTLNVGGQPVTFLVDTGAQHSVLTQTQGP---LSTRTAWVQGATGGKLHRWTTERK-VHLSTGHVTHSFLLVPECPYPLL 627
Cdd:cd06095      1 VTITVEGVPIVFLVDTGATHSVLKSDLGPkqeLSTTSVLIRGVSGQSQQPVTTYRTlVDLGGHTVSHSFLVVPNCPDPLL 80

                   ....*.
gi 1868331368  628 GRDLLS 633
Cdd:cd06095     81 GRDLLS 86
zf-H2C2 super family cl07828
H2C2 zinc finger; This domain binds to histone upstream activating sequence (UAS) elements ...
1340-1432 2.11e-19

H2C2 zinc finger; This domain binds to histone upstream activating sequence (UAS) elements that are found in histone gene promoters.


The actual alignment was detected with superfamily member pfam16721:

Pssm-ID: 447530  Cd Length: 96  Bit Score: 84.39  E-value: 2.11e-19
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1868331368 1340 FEYTDSDLETVQSLGANYEQEANVWRYQGKILLPQGAAKELLSQLHRWTHLGHKKLKALLQREEQTYYIHNPNALIQQIT 1419
Cdd:pfam16721    3 FHYTVTDIKDLTKLGAIYDKTKKYWVYQGKPVMPDQFTFELLDFLHQLTHLSFSKMKALLERSHSPYYMLNRDRTLKNIT 82
                           90
                   ....*....|...
gi 1868331368 1420 STCTPCAKVNTGR 1432
Cdd:pfam16721   83 ETCKACAQVNASK 95
RT_RNaseH_2 pfam17919
RNase H-like domain found in reverse transcriptase;
995-1058 1.56e-17

RNase H-like domain found in reverse transcriptase;


Pssm-ID: 465567 [Multi-domain]  Cd Length: 100  Bit Score: 79.08  E-value: 1.56e-17
                           10        20        30        40        50        60
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*...
gi 1868331368  995 WTEEHQKAFDGIKKALLSAPALSLPNINKPFTLYVDekA---GMAkGVLTQ-QLGPWKRPVAYFSKKL 1058
Cdd:pfam17919    1 WTEECQKAFEKLKQALTSAPVLAHPDPDKPFILETD--AsdyGIG-AVLSQeDDDGGERPIAYASRKL 65
RT_RNaseH super family cl39037
RNase H-like domain found in reverse transcriptase; DNA polymerase and ribonuclease H (RNase H) ...
1022-1125 1.10e-12

RNase H-like domain found in reverse transcriptase; DNA polymerase and ribonuclease H (RNase H) activities allow reverse transcriptases to convert the single-stranded retroviral RNA genome into double-stranded DNA, which is integrated into the host chromosome during infection. This entry represents the RNase H like domain.


The actual alignment was detected with superfamily member pfam17917:

Pssm-ID: 465565  Cd Length: 104  Bit Score: 65.61  E-value: 1.10e-12
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1868331368 1022 NKPFTLYVDE-KAGMAkGVLTQQL-GPWKRPVAYFSKKLDNVAMGWPPCLQMVAAVAVLTKDEDRLTLGQQLTVIAPHAI 1099
Cdd:pfam17917    3 SKPFILETDAsDYGIG-AVLSQKDeDGKERPIAYASRKLTPAERNYSTTEKELLAIVWALKKFRHYLLGRKFTVYTDHKP 81
                           90       100
                   ....*....|....*....|....*.
gi 1868331368 1100 EAVVRQPpdrWLSNARMTHYQALLLN 1125
Cdd:pfam17917   82 LKYLFTP---KELNGRLARWALFLQE 104
PHA03247 super family cl33720
large tegument protein UL36; Provisional
81-222 3.23e-08

large tegument protein UL36; Provisional


The actual alignment was detected with superfamily member PHA03247:

Pssm-ID: 223021 [Multi-domain]  Cd Length: 3151  Bit Score: 58.80  E-value: 3.23e-08
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1868331368   81 PYIVTW----ESLSRD----PPPWVRPFLPPRLSGPR-PTPLPSPAPLIPT-------PSAPPSTSSLLPLTEPQSPNPR 144
Cdd:PHA03247  2531 PRMLTWirglEELASDdagdPPPPLPPAAPPAAPDRSvPPPRPAPRPSEPAvtsrarrPDAPPQSARPRAPVDDRGDPRG 2610
                           90       100       110       120       130       140       150
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*...
gi 1868331368  145 PKAPAVLPdtqEDLLLLDSPPPYPNAEQAASQLPAADQSPPPPSsttgDPGDPSPPSTRLRSRRDRTLEGPDSSGISQ 222
Cdd:PHA03247  2611 PAPPSPLP---PDTHAPDPPPPSPSPAANEPDPHPPPTVPPPER----PRDDPAPGRVSRPRRARRLGRAAQASSPPQ 2681
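
The per-hit statistics lines and the paired alignment rows above follow a regular layout, so they can be read programmatically. The following is a minimal Python sketch, not an NCBI tool: the names `HitStats`, `parse_hit_stats`, and `percent_identity` are hypothetical, and the code assumes that lowercase letters and `-` in an alignment row mark positions that are not aligned to the domain model.

```python
import re
from typing import NamedTuple


class HitStats(NamedTuple):
    """Values from one 'Pssm-ID: ... E-value: ...' line (hypothetical container)."""
    pssm_id: int
    cd_length: int
    bit_score: float
    e_value: float


# Matches e.g. "Pssm-ID: 239685 [Multi-domain]  Cd Length: 210  Bit Score: 333.16  E-value: 2.39e-105"
_STATS_RE = re.compile(
    r"Pssm-ID:\s*(\d+).*?Cd Length:\s*(\d+)\s+"
    r"Bit Score:\s*([\d.]+)\s+E-value:\s*(\S+)"
)


def parse_hit_stats(line: str) -> HitStats:
    """Extract the numeric fields from a hit statistics line."""
    m = _STATS_RE.search(line)
    if m is None:
        raise ValueError(f"not a hit statistics line: {line!r}")
    return HitStats(int(m.group(1)), int(m.group(2)),
                    float(m.group(3)), float(m.group(4)))


def percent_identity(query: str, subject: str) -> float:
    """Percent identity over columns where both rows have an uppercase residue.

    Assumption: lowercase letters, '-' and '*' are treated as unaligned
    and are skipped rather than counted as mismatches.
    """
    if len(query) != len(subject):
        raise ValueError("alignment rows must have equal length")
    aligned = [(q, s) for q, s in zip(query, subject)
               if q.isupper() and s.isupper()]
    if not aligned:
        return 0.0
    matches = sum(q == s for q, s in aligned)
    return 100.0 * matches / len(aligned)


if __name__ == "__main__":
    stats = parse_hit_stats(
        "Pssm-ID: 426597  Cd Length: 208  Bit Score: 375.93  E-value: 6.10e-121"
    )
    print(stats)
    # Sequence strings only, copied from the last Gag_p30 alignment block.
    q = "QAQDREADIIMSFIGQSAPDIRTKLQRLEGLQGYTLRDLVKEAEKIFNK"
    s = "EDPGQETNVSMSFIWQSAPDIGRKLERLEDLKSKTLGDLVREAEKIFNK"
    print(f"{percent_identity(q, s):.1f}% identity over this block")
```

Only the sequence strings are passed to `percent_identity`; the row label and coordinates (for example `gi 1868331368  395` and the trailing `443`) need to be stripped first. Identity computed this way is per block, so a whole-hit figure would require concatenating the rows of every block for that hit.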
 
Name Accession Description Interval E-value
Gag_p30 pfam02093
Gag P30 core shell protein; According to Swiss-Prot annotation this protein is the viral core ...
235-443 6.10e-121

Gag P30 core shell protein; According to Swiss-Prot annotation this protein is the viral core shell protein. P30 is essential for viral assembly.


Pssm-ID: 426597  Cd Length: 208  Bit Score: 375.93  E-value: 6.10e-121
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1868331368  235 QYWPFSSADLYNWKAHNPPFSQDPQALTGLIESILITHQPTWDDCQQLLQILLTTEERQRVLLEARKNVPGPDGRPTQLP 314
Cdd:pfam02093    1 QYWPFSSSDLYNWKNNNPSFSEDPGKLTALIESVLVTHQPTWDDCQQLLGTLLTGEEKQRVLLEARKAVRGNDGRPTQLP 80
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1868331368  315 NEIDEVFPLIRPtDWDINTAAGRERHRLYRQTLLAGLKGAGRRPTNLAKVRAVVQGLEETPAGFLERLMEAYRMYTPFDP 394
Cdd:pfam02093   81 NEVDAAFPLERP-DWDYTTPAGRNHLVLYRQLLLAGLQNAGRSPTNLAKVKGITQGPNESPSAFLERLKEAYRRYTPYDP 159
                          170       180       190       200
                   ....*....|....*....|....*....|....*....|....*....
gi 1868331368  395 QAQDREADIIMSFIGQSAPDIRTKLQRLEGLQGYTLRDLVKEAEKIFNK 443
Cdd:pfam02093  160 EDPGQETNVSMSFIWQSAPDIGRKLERLEDLKSKTLGDLVREAEKIFNK 208
RT_ZFREV_like cd03715
RT_ZFREV_like: A subfamily of reverse transcriptases (RTs) found in sequences similar to the ...
717-930 2.39e-105

RT_ZFREV_like: A subfamily of reverse transcriptases (RTs) found in sequences similar to the intact endogenous retrovirus ZFERV from zebrafish and to Moloney murine leukemia virus RT. An RT gene is usually indicative of a mobile element such as a retrotransposon or retrovirus. RTs occur in a variety of mobile elements, including retrotransposons, retroviruses, group II introns, bacterial msDNAs, hepadnaviruses, and caulimoviruses. These elements can be divided into two major groups. One group contains retroviruses and DNA viruses whose propagation involves an RNA intermediate. They are grouped together with transposable elements containing long terminal repeats (LTRs). The other group, also called poly(A)-type retrotransposons, contains fungal mitochondrial introns and transposable elements that lack LTRs. Phylogenetic analysis suggests that ZFERV belongs to a distinct group of retroviruses.


Pssm-ID: 239685 [Multi-domain]  Cd Length: 210  Bit Score: 333.16  E-value: 2.39e-105
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1868331368  717 PAAIKQYPLSQEAREGIRPHINRLLQQGILRPCHSPWNTPLLPIKKPGTGEYRPVQDLREVNRRTEDIHPTVPNPYNLLS 796
Cdd:cd03715      1 PVNQKQYPLPREAREGITPHIQELLEAGILVPCQSPWNTPILPVKKPGGNDYRMVQDLRLVNQAVLPIHPAVPNPYTLLS 80
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1868331368  797 MLPPSHIWYTVLDLKDTFFCLRLSPQSQPMFAFEWKDpetgfsGQLTWTSLPQGFKNSPTLFDEALHQDLADFRIHNPNL 876
Cdd:cd03715     81 LLPPKHQWYTVLDLANAFFSLPLAPDSQPLFAFEWEG------QQYTFTRLPQGFKNSPTLFHEALARDLAPFPLEHEGT 154
                          170       180       190       200       210
                   ....*....|....*....|....*....|....*....|....*....|....
gi 1868331368  877 ILLQYVDDLLLAAESEKDCLQGTGALLQKLGELGYRASAKKAQLYREQVTYLGY 930
Cdd:cd03715    155 ILLQYVDDLLLAADSEEDCLKGTDALLTHLGELGYKVSPKKAQICRAEVKFLGV 208
Gag_MA pfam01140
Matrix protein (MA), p15; The matrix protein, p15, is encoded by the gag gene. MA is involved ...
2-127 1.38e-63

Matrix protein (MA), p15; The matrix protein, p15, is encoded by the gag gene. MA is involved in pathogenicity.


Pssm-ID: 426076 [Multi-domain]  Cd Length: 126  Bit Score: 211.86  E-value: 1.38e-63
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1868331368    2 GQQLTTPLSLTLGHWQDVLSRARKESVEIRKKKWQTLCFSEWPALNTGWPRDGTFDLTTILQVKTRVFQPGPWGYPDQVP 81
Cdd:pfam01140    1 GQTVTTPLSLTLGHWSDVPSRACNQSVDVKKRRWVTFCSAEWPTLNVGWPRDGTFNLTTILQVKTRVFAPGPHGHPDQVP 80
                           90       100       110       120
                   ....*....|....*....|....*....|....*....|....*.
gi 1868331368   82 YIVTWESLSRDPPPWVRPFLPPRLSGPRPTPLPSPAPLIPTPSAPP 127
Cdd:pfam01140   81 YIVTWEALAADPPPWVRPFLTPKPPPPQPPAAPGLRPPLPPASAPP 126
RNase_HI_RT_Bel cd09273
Bel/Pao family of RNase HI in long terminal repeat retroelements; Ribonuclease H (RNase H) enzymes ...
1181-1321 6.29e-58

Bel/Pao family of RNase HI in long terminal repeat retroelements; Ribonuclease H (RNase H) enzymes are divided into two major families, Type 1 and Type 2, based on amino acid sequence similarities and biochemical properties. RNase H is an endonuclease that cleaves the RNA strand of an RNA/DNA hybrid in a sequence non-specific manner in the presence of divalent cations. RNase H is widely present in various organisms, including bacteria, archaea, and eukaryotes. RNase HI has also been observed as an adjunct domain to the reverse transcriptase gene in retroviruses, in long terminal repeat (LTR)-bearing retrotransposons, and in non-LTR retrotransposons. RNase HI in LTR retrotransposons performs degradation of the original RNA template, generation of a polypurine tract (the primer for plus-strand DNA synthesis), and final removal of RNA primers from newly synthesized minus and plus strands. The catalytic residues for RNase H enzymatic activity, three aspartic acid residues and one glutamic acid residue (DEDD), are invariant across all RNase H domains. Based on phylogenetic patterns, RNase HI of LTR retroelements is classified into five major families: Ty3/Gypsy, Ty1/Copia, Bel/Pao, DIRS1, and the vertebrate retroviruses. The Bel/Pao family has been described only in metazoan genomes. RNase H has been explored as an anti-HIV drug target because RNase H inactivation inhibits reverse transcription.


Pssm-ID: 260005 [Multi-domain]  Cd Length: 131  Bit Score: 195.63  E-value: 6.29e-58
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1868331368 1181 TWFTDGSSFlqdgirKAGAAVVDGQTTIWASALPPGTSAQRAELIALTQALKMAEGRRVNIYTDSRYAFATAHVHGEIYR 1260
Cdd:cd09273      1 TVFTDGSSF------KAGYAIVSGTEIVEAQPLPPGTSAQRAELIALIQALELAKGKPVNIYTDSAYAVHALHLLETIGI 74
                           90       100       110       120       130       140
                   ....*....|....*....|....*....|....*....|....*....|....*....|.
gi 1868331368 1261 RRGLLtsaeKDIKNKTEILELLQALFLPRRLSIIHCPGHQKGNDPVARGNRMADEEARKAA 1321
Cdd:cd09273     75 ERGFL----KSIKNLSLFLQLLEAVQRPKPVAIIHIRAHSKLPGPLAEGNAQADAAAKQAA 131
RNase_H pfam00075
RNase H; RNase H digests the RNA strand of an RNA/DNA hybrid. Important enzyme in retroviral ...
1181-1321 8.33e-36

RNase H; RNase H digests the RNA strand of an RNA/DNA hybrid. It is an important enzyme in the retroviral replication cycle and is often found as a domain associated with reverse transcriptases. The structure is a mixed alpha+beta fold with three a/b/a layers.


Pssm-ID: 395028 [Multi-domain]  Cd Length: 141  Bit Score: 132.89  E-value: 8.33e-36
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1868331368 1181 TWFTDGSSFLQDGIRKAGAaVVDGQTTIWASALPPGTSAQRAELIALTQALK-MAEGRRVNIYTDSRYAFATAH--VHGE 1257
Cdd:pfam00075    5 TVYTDGSCLGNPGPGGAGA-VLYRGHENISAPLPGRTTNNRAELQAVIEALKaLKSPSKVNIYTDSQYVIGGITqwVHGW 83
                           90       100       110       120       130       140
                   ....*....|....*....|....*....|....*....|....*....|....*....|....
gi 1868331368 1258 IYRRRGlLTSAEKDIKNKtEILELLQALFLPRRLSIIHCPGHqKGNdpvaRGNRMADEEARKAA 1321
Cdd:pfam00075   84 KKNGWP-TTSEGKPVKNK-DLWQLLKALCKKHQVYWQWVKGH-AGN----PGNEMADRLAKQGA 140
RVT_1 pfam00078
Reverse transcriptase (RNA-dependent DNA polymerase); A reverse transcriptase gene is usually ...
760-932 4.40e-30

Reverse transcriptase (RNA-dependent DNA polymerase); A reverse transcriptase gene is usually indicative of a mobile element such as a retrotransposon or retrovirus. Reverse transcriptases occur in a variety of mobile elements, including retrotransposons, retroviruses, group II introns, bacterial msDNAs, hepadnaviruses, and caulimoviruses.


Pssm-ID: 395031 [Multi-domain]  Cd Length: 189  Bit Score: 118.17  E-value: 4.40e-30
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1868331368  760 IKKPGTGEYRPV----QDLREVNRRTED-IHPTVPNPYNLLSMLPPSHI-----WYTVLDLKDTFFCLRLSPQSQPMFAF 829
Cdd:pfam00078    1 IPKKGKGKYRPIsllsIDYKALNKIIVKrLKPENLDSPPQPGFRPGLAKlkkakWFLKLDLKKAFDQVPLDELDRKLTAF 80
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1868331368  830 ------EWKDPETGfSGQLTWTSLPQGFKNSPTLFDEALHQDLADFRiHNPNLILLQYVDDLLLAAESEKDCLQGTGALL 903
Cdd:pfam00078   81 ttppinINWNGELS-GGRYEWKGLPQGLVLSPALFQLFMNELLRPLR-KRAGLTLVRYADDILIFSKSEEEHQEALEEVL 158
                          170       180       190
                   ....*....|....*....|....*....|.
gi 1868331368  904 QKLGELGYRASAKKAQLYR--EQVTYLGYRL 932
Cdd:pfam00078  159 EWLKESGLKINPEKTQFFLksKEVKYLGVTL 189
RP_RTVL_H_like cd06095
Retropepsin of the RTVL_H family of human endogenous retrovirus-like elements; This family ...
552-633 7.19e-25

Retropepsin of the RTVL_H family of human endogenous retrovirus-like elements; This family includes aspartate proteases from retroelements with LTR (long terminal repeats) including the RTVL_H family of human endogenous retrovirus-like elements. While fungal and mammalian pepsins are bilobal proteins with structurally related N- and C-termini, retropepsins are half as long as their fungal and mammalian counterparts. The monomers are structurally related to one lobe of the pepsin molecule and retropepsins function as homodimers. The active site aspartate occurs within a motif (Asp-Thr/Ser-Gly), as it does in pepsin. Retroviral aspartyl protease is synthesized as part of the POL polyprotein that contains an aspartyl protease, a reverse transcriptase, RNase H, and an integrase. The POL polyprotein undergoes specific enzymatic cleavage to yield the mature proteins. In aspartate peptidases, Asp residues are ligands of an activated water molecule in all examples where catalytic residues have been identified. This group of aspartate peptidases is classified by MEROPS as the peptidase family A2 (retropepsin family, clan AA), subfamily A2A.


Pssm-ID: 133159  Cd Length: 86  Bit Score: 99.71  E-value: 7.19e-25
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1868331368  552 VTLNVGGQPVTFLVDTGAQHSVLTQTQGP---LSTRTAWVQGATGGKLHRWTTERK-VHLSTGHVTHSFLLVPECPYPLL 627
Cdd:cd06095      1 VTITVEGVPIVFLVDTGATHSVLKSDLGPkqeLSTTSVLIRGVSGQSQQPVTTYRTlVDLGGHTVSHSFLVVPNCPDPLL 80

                   ....*.
gi 1868331368  628 GRDLLS 633
Cdd:cd06095     81 GRDLLS 86
RVP pfam00077
Retroviral aspartyl protease; Single domain aspartyl proteases from retroviruses, ...
548-640 1.87e-23

Retroviral aspartyl protease; Single domain aspartyl proteases from retroviruses, retrotransposons, and badnaviruses (plant dsDNA viruses). These proteases are generally part of a larger polyprotein; usually pol, more rarely gag. Retroviral proteases appear to be homologous to a single domain of the two-domain eukaryotic aspartyl proteases such as pepsins, cathepsins, and renins (pfam00026).


Pssm-ID: 425454  Cd Length: 101  Bit Score: 96.28  E-value: 1.87e-23
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1868331368  548 PEPRVTLNVGGQPVTFLVDTGAQHSVLTQTQGPLS----TRTAWVQGATGGKLHRWTTERKVHLSTGHVTH--SFLLVPE 621
Cdd:pfam00077    3 QRPLLTVKIGGKYFTALLDTGADDTVISQNDWPTNwpkqKATTNIQGIGGGINVRQSDQILILIGEDKFRGtvSPLILPT 82
                           90
                   ....*....|....*....
gi 1868331368  622 CPYPLLGRDLLSKVGAHIH 640
Cdd:pfam00077   83 CPVNIIGRDLLQQLGGRLT 101
zf-H3C2 pfam16721
Zinc-finger like, probable DNA-binding; This is a family of probably DNA-binding zinc-fingers ...
1340-1432 2.11e-19

Zinc-finger like, probable DNA-binding; This is a family of probable DNA-binding zinc fingers found in Gag-Pol polyproteins from mouse retroviruses. It was added to the clan to resolve overlaps with zf-H2C2, but neither is a true member.


Pssm-ID: 293326  Cd Length: 96  Bit Score: 84.39  E-value: 2.11e-19
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1868331368 1340 FEYTDSDLETVQSLGANYEQEANVWRYQGKILLPQGAAKELLSQLHRWTHLGHKKLKALLQREEQTYYIHNPNALIQQIT 1419
Cdd:pfam16721    3 FHYTVTDIKDLTKLGAIYDKTKKYWVYQGKPVMPDQFTFELLDFLHQLTHLSFSKMKALLERSHSPYYMLNRDRTLKNIT 82
                           90
                   ....*....|...
gi 1868331368 1420 STCTPCAKVNTGR 1432
Cdd:pfam16721   83 ETCKACAQVNASK 95
RT_RNaseH_2 pfam17919
RNase H-like domain found in reverse transcriptase;
995-1058 1.56e-17

RNase H-like domain found in reverse transcriptase;


Pssm-ID: 465567 [Multi-domain]  Cd Length: 100  Bit Score: 79.08  E-value: 1.56e-17
                           10        20        30        40        50        60
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*...
gi 1868331368  995 WTEEHQKAFDGIKKALLSAPALSLPNINKPFTLYVDekA---GMAkGVLTQ-QLGPWKRPVAYFSKKL 1058
Cdd:pfam17919    1 WTEECQKAFEKLKQALTSAPVLAHPDPDKPFILETD--AsdyGIG-AVLSQeDDDGGERPIAYASRKL 65
RnhA COG0328
Ribonuclease HI [Replication, recombination and repair];
1183-1321 6.24e-16

Ribonuclease HI [Replication, recombination and repair];


Pssm-ID: 440097 [Multi-domain]  Cd Length: 136  Bit Score: 76.04  E-value: 6.24e-16
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1868331368 1183 FTDGSSFLQDGIRKAGAAVVDGQTTIWASALPPGTSAQRAELIALTQALKMAE---GRRVNIYTDSRYAFATAHVHGEIY 1259
Cdd:COG0328      6 YTDGACRGNPGPGGWGAVIRYGGEEKELSGGLGDTTNNRAELTALIAALEALKelgPCEVEIYTDSQYVVNQITGWIHGW 85
                           90       100       110       120       130       140
                   ....*....|....*....|....*....|....*....|....*....|....*....|..
gi 1868331368 1260 RRRGlltsaEKDIKNKtEILELLQALFLPRRLSIIHCPGHQkGNdpvaRGNRMADEEARKAA 1321
Cdd:COG0328     86 KKNG-----WKPVKNP-DLWQRLDELLARHKVTFEWVKGHA-GH----PGNERADALANKAL 136
RT_RNaseH pfam17917
RNase H-like domain found in reverse transcriptase; DNA polymerase and ribonuclease H (RNase H) ...
1022-1125 1.10e-12

RNase H-like domain found in reverse transcriptase; DNA polymerase and ribonuclease H (RNase H) activities allow reverse transcriptases to convert the single-stranded retroviral RNA genome into double-stranded DNA, which is integrated into the host chromosome during infection. This entry represents the RNase H like domain.


Pssm-ID: 465565  Cd Length: 104  Bit Score: 65.61  E-value: 1.10e-12
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1868331368 1022 NKPFTLYVDE-KAGMAkGVLTQQL-GPWKRPVAYFSKKLDNVAMGWPPCLQMVAAVAVLTKDEDRLTLGQQLTVIAPHAI 1099
Cdd:pfam17917    3 SKPFILETDAsDYGIG-AVLSQKDeDGKERPIAYASRKLTPAERNYSTTEKELLAIVWALKKFRHYLLGRKFTVYTDHKP 81
                           90       100
                   ....*....|....*....|....*.
gi 1868331368 1100 EAVVRQPpdrWLSNARMTHYQALLLN 1125
Cdd:pfam17917   82 LKYLFTP---KELNGRLARWALFLQE 104
PHA03247 PHA03247
large tegument protein UL36; Provisional
81-222 3.23e-08

large tegument protein UL36; Provisional


Pssm-ID: 223021 [Multi-domain]  Cd Length: 3151  Bit Score: 58.80  E-value: 3.23e-08
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1868331368   81 PYIVTW----ESLSRD----PPPWVRPFLPPRLSGPR-PTPLPSPAPLIPT-------PSAPPSTSSLLPLTEPQSPNPR 144
Cdd:PHA03247  2531 PRMLTWirglEELASDdagdPPPPLPPAAPPAAPDRSvPPPRPAPRPSEPAvtsrarrPDAPPQSARPRAPVDDRGDPRG 2610
                           90       100       110       120       130       140       150
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*...
gi 1868331368  145 PKAPAVLPdtqEDLLLLDSPPPYPNAEQAASQLPAADQSPPPPSsttgDPGDPSPPSTRLRSRRDRTLEGPDSSGISQ 222
Cdd:PHA03247  2611 PAPPSPLP---PDTHAPDPPPPSPSPAANEPDPHPPPTVPPPER----PRDDPAPGRVSRPRRARRLGRAAQASSPPQ 2681
Gag_p12 pfam01141
Gag polyprotein, inner coat protein p12; The retroviral p12 is a virion structural protein. ...
144-224 4.19e-07

Gag polyprotein, inner coat protein p12; The retroviral p12 is a virion structural protein. p12 is proline rich. The function carried out by p12 in assembly and replication is unknown. p12 is associated with pathogenicity of the virus.


Pssm-ID: 279483 [Multi-domain]  Cd Length: 85  Bit Score: 48.98  E-value: 4.19e-07
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1868331368  144 RPKAPAVLPDTQEDL--LLLDSPPPYPNAEQAasqlPAADQSPPPPSSTTGDPGDPSPPSTRLRSRRDRTLEgpdSSGIS 221
Cdd:pfam01141   10 KPPKPQVLPDSGGPLidLLTEDPPPYRDAQPP----PSARDGNEEEAAPAGEAPDPSPMASRLRGRRDPPAA---DSTTS 82

                   ...
gi 1868331368  222 QAF 224
Cdd:pfam01141   83 QAF 85
COG3577 COG3577
Predicted aspartyl protease [General function prediction only];
523-593 5.24e-05

Predicted aspartyl protease [General function prediction only];


Pssm-ID: 442797  Cd Length: 152  Bit Score: 44.94  E-value: 5.24e-05
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1868331368  523 NPRRKPTPILPLEKGD*DSQGQVPPPE-----PRVTLNVGGQPVTFLVDTGAQHSVLTQT-------QGPLSTRTAWVQG 590
Cdd:COG3577     10 GGRGVRAQLNPGQAPVSTGGGEVVLKRdrdghFVVEGTINGQPVRFLVDTGASTVVLSESdarrlglDPEDLGRPVRVQT 89

                   ...
gi 1868331368  591 ATG 593
Cdd:COG3577     90 ANG 92
Not5 COG5665
CCR4-NOT transcriptional regulation complex, NOT5 subunit [Transcription];
108-201 9.09e-05

CCR4-NOT transcriptional regulation complex, NOT5 subunit [Transcription];


Pssm-ID: 444384 [Multi-domain]  Cd Length: 874  Bit Score: 46.96  E-value: 9.09e-05
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1868331368  108 PRPTPLPSPaPLIPTPSAPPS----TSSLLPLTEPQSPNPRPKA--PAVLPDTQEDLLLLDSPPPYPNAEQAASQLPAAD 181
Cdd:COG5665    256 SSQQPKSQP-TSPSGGTTPPStnqlTTSNTPTSTAKAQPQPPTKkqPAKEPPSDTASGNPSAPSVLINSDSPTSEDPATA 334
                           90       100
                   ....*....|....*....|..
gi 1868331368  182 QSPPPPSSTTG-DPGD-PSPPS 201
Cdd:COG5665    335 SVPTTEETTAFtTPSSvPSTPA 356
BimA_second NF040983
trimeric autotransporter actin-nucleating factor BimA; This HMM describes BimA (Burkholderia ...
108-202 1.99e-04

trimeric autotransporter actin-nucleating factor BimA; This HMM describes BimA (Burkholderia intracellular motility A), WP_004266405.1-like proteins in Burkholderia mallei or B. pseudomallei. The term BimA has also been used for WP_011205626.1-like homologs that have a very different N-terminal half.


Pssm-ID: 468913 [Multi-domain]  Cd Length: 382  Bit Score: 45.66  E-value: 1.99e-04
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1868331368  108 PRPTPLPSPAPLIPTPSAPPSTSSLLPLTEPQSPNPRPKAPavlpdtqedlllldsPPPYPnaeqaasqlpaadqSPPPP 187
Cdd:NF040983    86 PNKVPPPPPPPPPPPPPPPTPPPPPPPPPPPPPPSPPPPPP---------------PSPPP--------------SPPPP 136
                           90
                   ....*....|....*
gi 1868331368  188 SSTtgdPGDPSPPST 202
Cdd:NF040983   137 TTT---PPTRTTPST 148
rnhA PRK00203
ribonuclease H; Reviewed
1215-1321 4.56e-04

ribonuclease H; Reviewed


Pssm-ID: 178927 [Multi-domain]  Cd Length: 150  Bit Score: 42.12  E-value: 4.56e-04
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1868331368 1215 PGTSAQRAELIALTQALKM-AEGRRVNIYTDSRY---AFaTAHVHGeiYRRRGLLTSAEKDIKNKteilELLQALflpRR 1290
Cdd:PRK00203    39 ALTTNNRMELMAAIEALEAlKEPCEVTLYTDSQYvrqGI-TEWIHG--WKKNGWKTADKKPVKNV----DLWQRL---DA 108
                           90       100       110
                   ....*....|....*....|....*....|....*..
gi 1868331368 1291 LSIIH------CPGHQkGNdpvaRGNRMADEEARKAA 1321
Cdd:PRK00203   109 ALKRHqikwhwVKGHA-GH----PENERCDELARAGA 140
PLN02983 PLN02983
biotin carboxyl carrier protein of acetyl-CoA carboxylase
93-143 2.60e-03

biotin carboxyl carrier protein of acetyl-CoA carboxylase


Pssm-ID: 215533 [Multi-domain]  Cd Length: 274  Bit Score: 41.36  E-value: 2.60e-03
                           10        20        30        40        50
                   ....*....|....*....|....*....|....*....|....*....|....*...
gi 1868331368   93 PPPWVRPFLPPRLSGPRPTPLPSPAPLIPTPSAPPST-SSLLPLTEP------QSPNP 143
Cdd:PLN02983   157 PPHAMPPASPPAAQPAPSAPASSPPPTPASPPPAKAPkSSHPPLKSPmagtfyRSPAP 214
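Each hit in the list above carries a one-line summary of the form `Pssm-ID: ... Cd Length: ... Bit Score: ... E-value: ...`, optionally tagged `[Multi-domain]`. The following is a minimal sketch of how such a line could be parsed when post-processing a saved copy of this page. The function name `parse_summary` and the dictionary keys are hypothetical, chosen for illustration; they are not part of any NCBI tool.

```python
import re

# Pattern for the per-hit summary line as it appears in this page dump.
# The "[Multi-domain]" tag is optional and sits between Pssm-ID and Cd Length.
SUMMARY_RE = re.compile(
    r"Pssm-ID:\s*(?P<pssm_id>\d+)"
    r"(?:\s*\[(?P<flag>[^\]]+)\])?"
    r"\s+Cd Length:\s*(?P<cd_length>\d+)"
    r"\s+Bit Score:\s*(?P<bit_score>[\d.]+)"
    r"\s+E-value:\s*(?P<e_value>[\d.eE+-]+)"
)


def parse_summary(line):
    """Parse one summary line into a dict, or return None if it does not match."""
    m = SUMMARY_RE.search(line)
    if m is None:
        return None
    return {
        "pssm_id": int(m.group("pssm_id")),
        "multi_domain": m.group("flag") == "Multi-domain",
        "cd_length": int(m.group("cd_length")),
        "bit_score": float(m.group("bit_score")),
        "e_value": float(m.group("e_value")),
    }


# Two summary lines copied from the hits above.
plant_hit = parse_summary(
    "Pssm-ID: 215533 [Multi-domain]  Cd Length: 274  Bit Score: 41.36  E-value: 2.60e-03"
)
protease_hit = parse_summary(
    "Pssm-ID: 442797  Cd Length: 152  Bit Score: 44.94  E-value: 5.24e-05"
)
```

Parsed records like these can then be sorted or filtered by `e_value`, for example to separate the weak proline-rich matches from the strong retroviral domain hits.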
 
Name Accession Description Interval E-value
Gag_p30 pfam02093
Gag P30 core shell protein; According to Swiss-Prot annotation this protein is the viral core ...
235-443 6.10e-121

Gag P30 core shell protein; According to Swiss-Prot annotation this protein is the viral core shell protein. P30 is essential for viral assembly.


Pssm-ID: 426597  Cd Length: 208  Bit Score: 375.93  E-value: 6.10e-121
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1868331368  235 QYWPFSSADLYNWKAHNPPFSQDPQALTGLIESILITHQPTWDDCQQLLQILLTTEERQRVLLEARKNVPGPDGRPTQLP 314
Cdd:pfam02093    1 QYWPFSSSDLYNWKNNNPSFSEDPGKLTALIESVLVTHQPTWDDCQQLLGTLLTGEEKQRVLLEARKAVRGNDGRPTQLP 80
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1868331368  315 NEIDEVFPLIRPtDWDINTAAGRERHRLYRQTLLAGLKGAGRRPTNLAKVRAVVQGLEETPAGFLERLMEAYRMYTPFDP 394
Cdd:pfam02093   81 NEVDAAFPLERP-DWDYTTPAGRNHLVLYRQLLLAGLQNAGRSPTNLAKVKGITQGPNESPSAFLERLKEAYRRYTPYDP 159
                          170       180       190       200
                   ....*....|....*....|....*....|....*....|....*....
gi 1868331368  395 QAQDREADIIMSFIGQSAPDIRTKLQRLEGLQGYTLRDLVKEAEKIFNK 443
Cdd:pfam02093  160 EDPGQETNVSMSFIWQSAPDIGRKLERLEDLKSKTLGDLVREAEKIFNK 208
RT_ZFREV_like cd03715
RT_ZFREV_like: A subfamily of reverse transcriptases (RTs) found in sequences similar to the ...
717-930 2.39e-105

RT_ZFREV_like: A subfamily of reverse transcriptases (RTs) found in sequences similar to the intact endogenous retrovirus ZFERV from zebrafish and to Moloney murine leukemia virus RT. An RT gene is usually indicative of a mobile element such as a retrotransposon or retrovirus. RTs occur in a variety of mobile elements, including retrotransposons, retroviruses, group II introns, bacterial msDNAs, hepadnaviruses, and caulimoviruses. These elements can be divided into two major groups. One group contains retroviruses and DNA viruses whose propagation involves an RNA intermediate. They are grouped together with transposable elements containing long terminal repeats (LTRs). The other group, also called poly(A)-type retrotransposons, contains fungal mitochondrial introns and transposable elements that lack LTRs. Phylogenetic analysis suggests that ZFERV belongs to a distinct group of retroviruses.


Pssm-ID: 239685 [Multi-domain]  Cd Length: 210  Bit Score: 333.16  E-value: 2.39e-105
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1868331368  717 PAAIKQYPLSQEAREGIRPHINRLLQQGILRPCHSPWNTPLLPIKKPGTGEYRPVQDLREVNRRTEDIHPTVPNPYNLLS 796
Cdd:cd03715      1 PVNQKQYPLPREAREGITPHIQELLEAGILVPCQSPWNTPILPVKKPGGNDYRMVQDLRLVNQAVLPIHPAVPNPYTLLS 80
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1868331368  797 MLPPSHIWYTVLDLKDTFFCLRLSPQSQPMFAFEWKDpetgfsGQLTWTSLPQGFKNSPTLFDEALHQDLADFRIHNPNL 876
Cdd:cd03715     81 LLPPKHQWYTVLDLANAFFSLPLAPDSQPLFAFEWEG------QQYTFTRLPQGFKNSPTLFHEALARDLAPFPLEHEGT 154
                          170       180       190       200       210
                   ....*....|....*....|....*....|....*....|....*....|....
gi 1868331368  877 ILLQYVDDLLLAAESEKDCLQGTGALLQKLGELGYRASAKKAQLYREQVTYLGY 930
Cdd:cd03715    155 ILLQYVDDLLLAADSEEDCLKGTDALLTHLGELGYKVSPKKAQICRAEVKFLGV 208
Gag_MA pfam01140
Matrix protein (MA), p15; The matrix protein, p15, is encoded by the gag gene. MA is involved ...
2-127 1.38e-63

Matrix protein (MA), p15; The matrix protein, p15, is encoded by the gag gene. MA is involved in pathogenicity.


Pssm-ID: 426076 [Multi-domain]  Cd Length: 126  Bit Score: 211.86  E-value: 1.38e-63
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1868331368    2 GQQLTTPLSLTLGHWQDVLSRARKESVEIRKKKWQTLCFSEWPALNTGWPRDGTFDLTTILQVKTRVFQPGPWGYPDQVP 81
Cdd:pfam01140    1 GQTVTTPLSLTLGHWSDVPSRACNQSVDVKKRRWVTFCSAEWPTLNVGWPRDGTFNLTTILQVKTRVFAPGPHGHPDQVP 80
                           90       100       110       120
                   ....*....|....*....|....*....|....*....|....*.
gi 1868331368   82 YIVTWESLSRDPPPWVRPFLPPRLSGPRPTPLPSPAPLIPTPSAPP 127
Cdd:pfam01140   81 YIVTWEALAADPPPWVRPFLTPKPPPPQPPAAPGLRPPLPPASAPP 126
RNase_HI_RT_Bel cd09273
Bel/Pao family of RNase HI in long terminal repeat retroelements; Ribonuclease H (RNase H) enzymes ...
1181-1321 6.29e-58

Bel/Pao family of RNase HI in long terminal repeat retroelements; Ribonuclease H (RNase H) enzymes are divided into two major families, Type 1 and Type 2, based on amino acid sequence similarities and biochemical properties. RNase H is an endonuclease that cleaves the RNA strand of an RNA/DNA hybrid in a sequence non-specific manner in the presence of divalent cations. RNase H is widely present in various organisms, including bacteria, archaea and eukaryotes. RNase HI has also been observed as an adjunct domain to the reverse transcriptase gene in retroviruses, in long terminal repeat (LTR)-bearing retrotransposons and non-LTR retrotransposons. RNase HI in LTR retrotransposons performs degradation of the original RNA template, generation of a polypurine tract (the primer for plus-strand DNA synthesis), and final removal of RNA primers from newly synthesized minus and plus strands. The catalytic residues for RNase H enzymatic activity, three aspartic acid residues and one glutamic acid residue (DEDD), are invariant across all RNase H domains. Phylogenetically, RNase HI of LTR retroelements is classified into five major families: Ty3/Gypsy, Ty1/Copia, Bel/Pao, DIRS1 and the vertebrate retroviruses. The Bel/Pao family has been described only in metazoan genomes. RNase H inhibitors have been explored as an anti-HIV drug target because RNase H inactivation inhibits reverse transcription.


Pssm-ID: 260005 [Multi-domain]  Cd Length: 131  Bit Score: 195.63  E-value: 6.29e-58
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1868331368 1181 TWFTDGSSFlqdgirKAGAAVVDGQTTIWASALPPGTSAQRAELIALTQALKMAEGRRVNIYTDSRYAFATAHVHGEIYR 1260
Cdd:cd09273      1 TVFTDGSSF------KAGYAIVSGTEIVEAQPLPPGTSAQRAELIALIQALELAKGKPVNIYTDSAYAVHALHLLETIGI 74
                           90       100       110       120       130       140
                   ....*....|....*....|....*....|....*....|....*....|....*....|.
gi 1868331368 1261 RRGLLtsaeKDIKNKTEILELLQALFLPRRLSIIHCPGHQKGNDPVARGNRMADEEARKAA 1321
Cdd:cd09273     75 ERGFL----KSIKNLSLFLQLLEAVQRPKPVAIIHIRAHSKLPGPLAEGNAQADAAAKQAA 131
RT_Rtv cd01645
RT_Rtv: Reverse transcriptases (RTs) from retroviruses (Rtvs). RTs catalyze the conversion of ...
717-932 4.23e-44

RT_Rtv: Reverse transcriptases (RTs) from retroviruses (Rtvs). RTs catalyze the conversion of single-stranded RNA into double-stranded viral DNA for integration into host chromosomes. Proteins in this subfamily come from elements with long terminal repeats (LTRs) and are multifunctional enzymes with RNA-directed DNA polymerase, DNA-directed DNA polymerase, and ribonuclease H (RNase H) activities. The viral RNA genome enters the cytoplasm as part of a nucleoprotein complex, and reverse transcription in the cytoplasm generates a linear DNA duplex via an intricate series of steps. This duplex DNA is colinear with its RNA template, but contains terminal duplications known as LTRs that are not present in viral RNA. It has been proposed that two specialized template switches, known as strand-transfer reactions or "jumps", are required to generate the LTRs.


Pssm-ID: 238823 [Multi-domain]  Cd Length: 213  Bit Score: 159.37  E-value: 4.23e-44
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1868331368  717 PAAIKQYPLSQEAREGIRPHINRLLQQGILRPCHSPWNTPLLPIKKPgTGEYRPVQDLREVNRRTED---IHPTVPNPyn 793
Cdd:cd01645      1 PVWIKQWPLTEEKLEALTELVTEQLKEGHIEPSTSPWNTPVFVIKKK-SGKWRLLHDLRAVNAQTQDmgaLQPGLPHP-- 77
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1868331368  794 llSMLPPShiWY-TVLDLKDTFFCLRLSPQSQPMFAFEWkdPETGFSG---QLTWTSLPQGFKNSPTLFDEALHQDLADF 869
Cdd:cd01645     78 --AALPKG--WPlIVLDLKDCFFSIPLHPDDRERFAFTV--PSINNKGpakRYQWKVLPQGMKNSPTICQSFVAQALEPF 151
                          170       180       190       200       210       220
                   ....*....|....*....|....*....|....*....|....*....|....*....|...
gi 1868331368  870 RIHNPNLILLQYVDDLLLAAESEKDCLQGTGALLQKLGELGYRASAKKAQLYrEQVTYLGYRL 932
Cdd:cd01645    152 RKQYPDIVIYHYMDDILIASDLEGQLREIYEELRQTLLRWGLTIPPEKVQKE-PPFQYLGYEL 213
RNase_H pfam00075
RNase H; RNase H digests the RNA strand of an RNA/DNA hybrid. Important enzyme in retroviral ...
1181-1321 8.33e-36

RNase H; RNase H digests the RNA strand of an RNA/DNA hybrid. Important enzyme in retroviral replication cycle, and often found as a domain associated with reverse transcriptases. Structure is a mixed alpha+beta fold with three a/b/a layers.


Pssm-ID: 395028 [Multi-domain]  Cd Length: 141  Bit Score: 132.89  E-value: 8.33e-36
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1868331368 1181 TWFTDGSSFLQDGIRKAGAaVVDGQTTIWASALPPGTSAQRAELIALTQALK-MAEGRRVNIYTDSRYAFATAH--VHGE 1257
Cdd:pfam00075    5 TVYTDGSCLGNPGPGGAGA-VLYRGHENISAPLPGRTTNNRAELQAVIEALKaLKSPSKVNIYTDSQYVIGGITqwVHGW 83
                           90       100       110       120       130       140
                   ....*....|....*....|....*....|....*....|....*....|....*....|....
gi 1868331368 1258 IYRRRGlLTSAEKDIKNKtEILELLQALFLPRRLSIIHCPGHqKGNdpvaRGNRMADEEARKAA 1321
Cdd:pfam00075   84 KKNGWP-TTSEGKPVKNK-DLWQLLKALCKKHQVYWQWVKGH-AGN----PGNEMADRLAKQGA 140
RT_LTR cd01647
RT_LTR: Reverse transcriptases (RTs) from retrotransposons and retroviruses which have long ...
744-931 2.31e-35

RT_LTR: Reverse transcriptases (RTs) from retrotransposons and retroviruses which have long terminal repeats (LTRs) in their DNA copies but not in their RNA template. RT catalyzes DNA replication from an RNA template, and is responsible for the replication of retroelements. An RT gene is usually indicative of a mobile element such as a retrotransposon or retrovirus. RTs are present in a variety of mobile elements, including retrotransposons, retroviruses, group II introns, bacterial msDNAs, hepadnaviruses, and caulimoviruses.


Pssm-ID: 238825  Cd Length: 177  Bit Score: 133.10  E-value: 2.31e-35
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1868331368  744 GILRPCHSPWNTPLLPIKKPGtGEYRPVQDLREVNRRTE-DIHPtVPNPYNLLSMLPPSHiWYTVLDLKDTFFCLRLSPQ 822
Cdd:cd01647      1 GIIEPSSSPYASPVVVVKKKD-GKLRLCVDYRKLNKVTIkDRYP-LPTIDELLEELAGAK-VFSKLDLRSGYHQIPLAEE 77
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1868331368  823 SQPMFAFewkdpeTGFSGQLTWTSLPQGFKNSPTLFDEALHQDLADFRIHNpnliLLQYVDDLLLAAESEKDCLQGTGAL 902
Cdd:cd01647     78 SRPKTAF------RTPFGLYEYTRMPFGLKNAPATFQRLMNKILGDLLGDF----VEVYLDDILVYSKTEEEHLEHLREV 147
                          170       180
                   ....*....|....*....|....*....
gi 1868331368  903 LQKLGELGYRASAKKAQLYREQVTYLGYR 931
Cdd:cd01647    148 LERLREAGLKLNPEKCEFGVPEVEFLGHI 176
RVT_1 pfam00078
Reverse transcriptase (RNA-dependent DNA polymerase); A reverse transcriptase gene is usually ...
760-932 4.40e-30

Reverse transcriptase (RNA-dependent DNA polymerase); A reverse transcriptase gene is usually indicative of a mobile element such as a retrotransposon or retrovirus. Reverse transcriptases occur in a variety of mobile elements, including retrotransposons, retroviruses, group II introns, bacterial msDNAs, hepadnaviruses, and caulimoviruses.


Pssm-ID: 395031 [Multi-domain]  Cd Length: 189  Bit Score: 118.17  E-value: 4.40e-30
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1868331368  760 IKKPGTGEYRPV----QDLREVNRRTED-IHPTVPNPYNLLSMLPPSHI-----WYTVLDLKDTFFCLRLSPQSQPMFAF 829
Cdd:pfam00078    1 IPKKGKGKYRPIsllsIDYKALNKIIVKrLKPENLDSPPQPGFRPGLAKlkkakWFLKLDLKKAFDQVPLDELDRKLTAF 80
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1868331368  830 ------EWKDPETGfSGQLTWTSLPQGFKNSPTLFDEALHQDLADFRiHNPNLILLQYVDDLLLAAESEKDCLQGTGALL 903
Cdd:pfam00078   81 ttppinINWNGELS-GGRYEWKGLPQGLVLSPALFQLFMNELLRPLR-KRAGLTLVRYADDILIFSKSEEEHQEALEEVL 158
                          170       180       190
                   ....*....|....*....|....*....|.
gi 1868331368  904 QKLGELGYRASAKKAQLYR--EQVTYLGYRL 932
Cdd:pfam00078  159 EWLKESGLKINPEKTQFFLksKEVKYLGVTL 189
RP_RTVL_H_like cd06095
Retropepsin of the RTVL_H family of human endogenous retrovirus-like elements; This family ...
552-633 7.19e-25

Retropepsin of the RTVL_H family of human endogenous retrovirus-like elements; This family includes aspartate proteases from retroelements with LTR (long terminal repeats) including the RTVL_H family of human endogenous retrovirus-like elements. While fungal and mammalian pepsins are bilobal proteins with structurally related N- and C-termini, retropepsins are half as long as their fungal and mammalian counterparts. The monomers are structurally related to one lobe of the pepsin molecule and retropepsins function as homodimers. The active site aspartate occurs within a motif (Asp-Thr/Ser-Gly), as it does in pepsin. Retroviral aspartyl protease is synthesized as part of the POL polyprotein that contains an aspartyl protease, a reverse transcriptase, RNase H, and an integrase. The POL polyprotein undergoes specific enzymatic cleavage to yield the mature proteins. In aspartate peptidases, Asp residues are ligands of an activated water molecule in all examples where catalytic residues have been identified. This group of aspartate peptidases is classified by MEROPS as the peptidase family A2 (retropepsin family, clan AA), subfamily A2A.


Pssm-ID: 133159  Cd Length: 86  Bit Score: 99.71  E-value: 7.19e-25
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1868331368  552 VTLNVGGQPVTFLVDTGAQHSVLTQTQGP---LSTRTAWVQGATGGKLHRWTTERK-VHLSTGHVTHSFLLVPECPYPLL 627
Cdd:cd06095      1 VTITVEGVPIVFLVDTGATHSVLKSDLGPkqeLSTTSVLIRGVSGQSQQPVTTYRTlVDLGGHTVSHSFLVVPNCPDPLL 80

                   ....*.
gi 1868331368  628 GRDLLS 633
Cdd:cd06095     81 GRDLLS 86
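The cd06095 description above states that the active-site aspartate of retropepsins lies within an Asp-Thr/Ser-Gly motif. As a rough illustration, the sketch below searches the query residues 552-633 (copied from the alignment above, with gap characters removed) for that motif using a regular expression. The helper name `find_active_site_motifs` is hypothetical, and a pattern match alone does not establish catalytic activity; it only locates candidate positions.

```python
import re

# Query residues 552-633 from the cd06095 alignment, gaps ("-") removed.
QUERY_START = 552
query_segment = (
    "VTLNVGGQPVTFLVDTGAQHSVLTQTQGPLSTRTAWVQGATGGKLHRWTTERKVHLSTGHVTHSFLLVPECPYPLL"
    "GRDLLS"
)


def find_active_site_motifs(segment, start):
    """Return (position, motif) pairs for each Asp-Thr/Ser-Gly match.

    Positions are 1-based protein coordinates of the aspartate,
    assuming `segment` begins at residue `start`.
    """
    return [(start + m.start(), m.group()) for m in re.finditer(r"D[TS]G", segment)]


motifs = find_active_site_motifs(query_segment, QUERY_START)
print(motifs)  # [(566, 'DTG')]
```

The single match corresponds to the `LVDTGA` stretch that is aligned to `LVDTGA` in the cd06095 model row, which is consistent with the description of the motif.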
RVP pfam00077
Retroviral aspartyl protease; Single domain aspartyl proteases from retroviruses, ...
548-640 1.87e-23

Retroviral aspartyl protease; Single domain aspartyl proteases from retroviruses, retrotransposons, and badnaviruses (plant dsDNA viruses). These proteases are generally part of a larger polyprotein; usually pol, more rarely gag. Retroviral proteases appear to be homologous to a single domain of the two-domain eukaryotic aspartyl proteases such as pepsins, cathepsins, and renins (pfam00026).


Pssm-ID: 425454  Cd Length: 101  Bit Score: 96.28  E-value: 1.87e-23
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1868331368  548 PEPRVTLNVGGQPVTFLVDTGAQHSVLTQTQGPLS----TRTAWVQGATGGKLHRWTTERKVHLSTGHVTH--SFLLVPE 621
Cdd:pfam00077    3 QRPLLTVKIGGKYFTALLDTGADDTVISQNDWPTNwpkqKATTNIQGIGGGINVRQSDQILILIGEDKFRGtvSPLILPT 82
                           90
                   ....*....|....*....
gi 1868331368  622 CPYPLLGRDLLSKVGAHIH 640
Cdd:pfam00077   83 CPVNIIGRDLLQQLGGRLT 101
zf-H3C2 pfam16721
Zinc-finger like, probable DNA-binding; This is a family of probably DNA-binding zinc-fingers ...
1340-1432 2.11e-19

Zinc-finger like, probable DNA-binding; This is a family of probable DNA-binding zinc fingers found in Gag-Pol polyproteins from mouse retroviruses. Added to clan to resolve overlaps with zf-H2C2, but neither is a true member.


Pssm-ID: 293326  Cd Length: 96  Bit Score: 84.39  E-value: 2.11e-19
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1868331368 1340 FEYTDSDLETVQSLGANYEQEANVWRYQGKILLPQGAAKELLSQLHRWTHLGHKKLKALLQREEQTYYIHNPNALIQQIT 1419
Cdd:pfam16721    3 FHYTVTDIKDLTKLGAIYDKTKKYWVYQGKPVMPDQFTFELLDFLHQLTHLSFSKMKALLERSHSPYYMLNRDRTLKNIT 82
                           90
                   ....*....|...
gi 1868331368 1420 STCTPCAKVNTGR 1432
Cdd:pfam16721   83 ETCKACAQVNASK 95
RT_RNaseH_2 pfam17919
RNase H-like domain found in reverse transcriptase;
995-1058 1.56e-17

RNase H-like domain found in reverse transcriptase;


Pssm-ID: 465567 [Multi-domain]  Cd Length: 100  Bit Score: 79.08  E-value: 1.56e-17
                           10        20        30        40        50        60
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*...
gi 1868331368  995 WTEEHQKAFDGIKKALLSAPALSLPNINKPFTLYVDekA---GMAkGVLTQ-QLGPWKRPVAYFSKKL 1058
Cdd:pfam17919    1 WTEECQKAFEKLKQALTSAPVLAHPDPDKPFILETD--AsdyGIG-AVLSQeDDDGGERPIAYASRKL 65
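The pfam17919 alignment above ends at model position 65, while the reported model length (Cd Length) is 100, so the hit covers only part of the domain model. A small sketch of that arithmetic follows; the function name `model_coverage` is hypothetical, and the model start and end positions are read off the `Cdd:` rows of the alignments on this page.

```python
def model_coverage(model_start, model_end, cd_length):
    """Fraction of the domain model spanned by the aligned region (inclusive coordinates)."""
    if not (1 <= model_start <= model_end <= cd_length):
        raise ValueError("model coordinates must satisfy 1 <= start <= end <= cd_length")
    return (model_end - model_start + 1) / cd_length


# RT_RNaseH_2 (pfam17919): model positions 1-65 of 100.
partial = model_coverage(1, 65, 100)

# Gag_MA (pfam01140): model positions 1-126 of 126.
full = model_coverage(1, 126, 126)
```

Gaps inside the alignment are ignored here, so the value describes the span of the model that was aligned, not the number of matched residues.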
RNase_HI_prokaryote_like cd09278
RNase HI family found mainly in prokaryotes; Ribonuclease H (RNase H) is classified into two ...
1183-1321 3.46e-17

RNase HI family found mainly in prokaryotes; Ribonuclease H (RNase H) is classified into two evolutionarily unrelated families, type 1 (prokaryotic RNase HI, eukaryotic RNase H1 and viral RNase H) and type 2 (prokaryotic RNase HII and HIII, and eukaryotic RNase H2). RNase H is an endonuclease that cleaves the RNA strand of an RNA/DNA hybrid in a sequence non-specific manner. RNase H is involved in DNA replication, repair and transcription. RNase H is widely present in various organisms, including bacteria, archaea and eukaryotes, and most prokaryotic and eukaryotic genomes contain multiple RNase H genes. Despite the lack of amino acid sequence homology, type 1 and type 2 RNase H share a main-chain fold and steric configurations of the four acidic active-site (DEDD) residues and have the same catalytic mechanism and functions in cells. One of the important functions of RNase H is to remove Okazaki fragments during DNA replication. Prokaryotic RNase H varies greatly in domain structures and substrate specificities. Prokaryotes and some single-cell eukaryotes do not require RNase H for viability.


Pssm-ID: 260010 [Multi-domain]  Cd Length: 139  Bit Score: 79.45  E-value: 3.46e-17
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1868331368 1183 FTDGSSFLQDGirKAGAAVV--DGQTTIWASALPPGTSAQRAELIALTQALKMA-EGRRVNIYTDSRYAF--ATAHVHGe 1257
Cdd:cd09278      5 YTDGACLGNPG--PGGWAAVirYGDHEKELSGGEPGTTNNRMELTAAIEALEALkEPCPVTIYTDSQYVIngITKWIKG- 81
                           90       100       110       120       130       140
                   ....*....|....*....|....*....|....*....|....*....|....*....|....
gi 1868331368 1258 iYRRRGLLTSAEKDIKNKtEILELLQALFLPRRLSIIHCPGHQkGNdpvaRGNRMADEEARKAA 1321
Cdd:cd09278     82 -WKKNGWKTADGKPVKNR-DLWQELDALLAGHKVTWEWVKGHA-GH----PGNERADRLANKAA 138
RNase_HI_eukaryote_like cd09280
Eukaryotic RNase H is essential and is longer and more complex than its prokaryotic ...
1183-1321 3.87e-17

Eukaryotic RNase H is essential and is longer and more complex than its prokaryotic counterparts; Ribonuclease H (RNase H) is classified into two families, type 1 (prokaryotic RNase HI, eukaryotic RNase H1 and viral RNase H) and type 2 (prokaryotic RNase HII and HIII, and eukaryotic RNase H2). RNase H is an endonuclease that cleaves the RNA strand of an RNA/DNA hybrid in a sequence non-specific manner. RNase H is involved in DNA replication, repair and transcription. One of the important functions of RNase H is to remove Okazaki fragments during DNA replication. RNase H is widely present in various organisms, including bacteria, archaea and eukaryotes, and most prokaryotic and eukaryotic genomes contain multiple RNase H genes. Despite the lack of amino acid sequence homology, type 1 and type 2 RNase H share a main-chain fold and steric configurations of the four acidic active-site (DEDD) residues and have the same catalytic mechanism and functions in cells. Eukaryotic RNase H is longer and more complex than in prokaryotes. Almost all eukaryotic RNase HI enzymes have a highly conserved region at the N-terminus called the hybrid binding domain (HBD). It is speculated that the HBD contributes to binding the RNA/DNA hybrid. Prokaryotes and some single-cell eukaryotes do not require RNase H for viability, but RNase H is essential in higher eukaryotes. RNase H knockout mice lack mitochondrial DNA replication and die as embryos.


Pssm-ID: 260012 [Multi-domain]  Cd Length: 145  Bit Score: 79.53  E-value: 3.87e-17
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1868331368 1183 FTDGSsFLQDGIR--KAGAAVVDGQTTIWASALP-PGTSA--QRAELIALTQALKMA---EGRRVNIYTDSRYAFATAHV 1254
Cdd:cd09280      3 YTDGS-CLNNGKPgaRAGIGVYFGPGDPRNVSEPlPGRKQtnNRAELLAVIHALEQApeeGIRKLEIRTDSKYAINCITK 81
                           90       100       110       120       130       140
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*...
gi 1868331368 1255 HGEIYRRRGLLTSAEKDIKNKTEILELLQAL-FLPRRLSIIHCPGHQkgNDPvarGNRMADEEARKAA 1321
Cdd:cd09280     82 WIPKWKKNGWKTSKGKPVKNQDLIKELDKLLrKRGIKVKFEHVKGHS--GDP---GNEEADRLAREGA 144
RnhA COG0328
Ribonuclease HI [Replication, recombination and repair];
1183-1321 6.24e-16

Ribonuclease HI [Replication, recombination and repair];


Pssm-ID: 440097 [Multi-domain]  Cd Length: 136  Bit Score: 76.04  E-value: 6.24e-16
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1868331368 1183 FTDGSSFLQDGIRKAGAAVVDGQTTIWASALPPGTSAQRAELIALTQALKMAE---GRRVNIYTDSRYAFATAHVHGEIY 1259
Cdd:COG0328      6 YTDGACRGNPGPGGWGAVIRYGGEEKELSGGLGDTTNNRAELTALIAALEALKelgPCEVEIYTDSQYVVNQITGWIHGW 85
                           90       100       110       120       130       140
                   ....*....|....*....|....*....|....*....|....*....|....*....|..
gi 1868331368 1260 RRRGlltsaEKDIKNKtEILELLQALFLPRRLSIIHCPGHQkGNdpvaRGNRMADEEARKAA 1321
Cdd:COG0328     86 KKNG-----WKPVKNP-DLWQRLDELLARHKVTFEWVKGHA-GH----PGNERADALANKAL 136
RNase_H_like cd06222
Ribonuclease H-like superfamily, including RNase H, HI, HII, HIII, and RNase-like domain IV of ...
1182-1318 2.81e-14

Ribonuclease H-like superfamily, including RNase H, HI, HII, HIII, and RNase-like domain IV of spliceosomal protein Prp8; Ribonuclease H (RNase H) enzymes are divided into two major families, Type 1 and Type 2, based on amino acid sequence similarities and biochemical properties. RNase H is an endonuclease that cleaves the RNA strand of an RNA/DNA hybrid in a sequence non-specific manner in the presence of divalent cations. It is widely present in various organisms, including bacteria, archaea, and eukaryotes. Most prokaryotic and eukaryotic genomes contain multiple RNase H genes. Despite the lack of amino acid sequence homology, type 1 and type 2 RNase H share a main-chain fold and steric configurations of the four acidic active-site residues and have the same catalytic mechanism and functions in cells. RNase H is involved in DNA replication, repair and transcription. An important RNase H function is to remove Okazaki fragments during DNA replication. RNase H inhibitors have been explored as anti-HIV drug targets since RNase H inactivation inhibits reverse transcription. This model also includes the Prp8 domain IV, which adopts the RNase fold but shows low sequence homology; domain IV is implicated in key spliceosomal interactions.


Pssm-ID: 259998 [Multi-domain]  Cd Length: 121  Bit Score: 70.81  E-value: 2.81e-14
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1868331368 1182 WFTDGSSFLQDGIRKAGAAVVDGQTTI--WASALPPGTSAQRAELIALTQALKMA---EGRRVNIYTDSRYAFATahVHG 1256
Cdd:cd06222      1 INVDGSCRGNPGPAGIGGVLRDHEGGWlgGFALKIGAPTALEAELLALLLALELAldlGYLKVIIESDSKYVVDL--INS 78
                           90       100       110       120       130       140
                   ....*....|....*....|....*....|....*....|....*....|....*....|..
gi 1868331368 1257 EIYRRrglltsaekdIKNKTEILELLQALFLPRRLSIIHCPGHqkgndpvarGNRMADEEAR 1318
Cdd:cd06222     79 GSFKW----------SPNILLIEDILLLLSRFWSVKISHVPRE---------GNQVADALAK 121
Rnase_HI_RT_non_LTR cd09276
non-LTR RNase HI domain of reverse transcriptases; Ribonuclease H (RNase H) is classified into ...
1183-1321 5.06e-13

non-LTR RNase HI domain of reverse transcriptases; Ribonuclease H (RNase H) is classified into two families, type 1 (prokaryotic RNase HI, eukaryotic RNase H1 and viral RNase H) and type 2 (prokaryotic RNase HII and HIII, and eukaryotic RNase H2). Ribonuclease HI (RNase HI) is an endonuclease that cleaves the RNA strand of an RNA/DNA hybrid in a sequence non-specific manner. RNase H is widely present in various organisms, including bacteria, archaea and eukaryotes. RNase HI has also been observed as an adjunct domain to the reverse transcriptase gene in retroviruses, long terminal repeat (LTR)-bearing retrotransposons and non-LTR retrotransposons. RNase HI in LTR retrotransposons performs degradation of the original RNA template, generation of a polypurine tract (the primer for plus-strand DNA synthesis), and final removal of RNA primers from newly synthesized minus and plus strands. The catalytic residues for RNase H enzymatic activity, three aspartic acid residues and one glutamic acid residue (DEDD), are invariant across all RNase H domains. The RNase domain of non-LTR and LTR transposons is located at the carboxyl terminus of the reverse transcriptase (RT) domain, and their RNase domains group together, indicating a common evolutionary origin. Many non-LTR transposons have lost the RNase domain because their activity is in the nucleus, where cellular RNase may suffice; however, LTR retrotransposons always encode their own RNase domain because they require RNase activity in RNA-protein particles in the cytoplasm. RNase H inhibitors have been explored as an anti-HIV drug target because RNase H inactivation inhibits reverse transcription.


Pssm-ID: 260008 [Multi-domain]  Cd Length: 131  Bit Score: 67.24  E-value: 5.06e-13
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1868331368 1183 FTDGSsFLQDgirKAGAAVVDGQTTIWAS---ALPPGTSAQRAELIALTQALKMA-----EGRRVNIYTDSRYAFA---- 1250
Cdd:cd09276      3 YTDGS-KLEG---SVGAGFVIYRGGEVISrsyRLGTHASVFDAELEAILEALELAlatarRARKVTIFTDSQSALQalrn 78
                           90       100       110       120       130       140       150
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|.
gi 1868331368 1251 TAHVHGEIYRRRGLLTSAEkdIKNKteilellqalflPRRLSIIHCPGHQKgndpvARGNRMADEEARKAA 1321
Cdd:cd09276     79 PRRSSGQVILIRILRLLRL--LKAK------------GVKVRLRWVPGHVG-----IEGNEAADRLAKEAA 130
RT_like cd00304
RT_like: Reverse transcriptase (RT, RNA-dependent DNA polymerase)_like family. An RT gene is ...
846-932 8.07e-13

RT_like: Reverse transcriptase (RT, RNA-dependent DNA polymerase)_like family. An RT gene is usually indicative of a mobile element such as a retrotransposon or retrovirus. RTs occur in a variety of mobile elements, including retrotransposons, retroviruses, group II introns, bacterial msDNAs, hepadnaviruses, and caulimoviruses. These elements can be divided into two major groups. One group contains retroviruses and DNA viruses whose propagation involves an RNA intermediate. They are grouped together with transposable elements containing long terminal repeats (LTRs). The other group, also called poly(A)-type retrotransposons, contains fungal mitochondrial introns and transposable elements that lack LTRs.


Pssm-ID: 238185 [Multi-domain]  Cd Length: 98  Bit Score: 65.84  E-value: 8.07e-13
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1868331368  846 SLPQGFKNSPTLFDEALHQDLADFRIHNPNLILLQYVDDLLLAAESEkDCLQGTGALLQKLGELGYRASAKKAQLYREQ- 924
Cdd:cd00304     11 PLPQGSPLSPALANLYMEKLEAPILKQLLDITLIRYVDDLVVIAKSE-QQAVKKRELEEFLARLGLNLSDEKTQFTEKEk 89

                   ....*....
gi 1868331368  925 -VTYLGYRL 932
Cdd:cd00304     90 kFKFLGILV 98
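
Each hit in this listing carries a statistics line of the form "Pssm-ID: ... Cd Length: ... Bit Score: ... E-value: ...". The following is a minimal Python sketch of how such a line could be parsed into typed fields. The function name parse_hit_stats and the dictionary keys are hypothetical, chosen for illustration; the pattern assumes only the layout visible in this listing, not any official NCBI output specification.

```python
import re

# Pattern for the per-hit statistics line as it appears in this listing.
# The "[Multi-domain]" tag is optional and only present on some hits.
HIT_STATS = re.compile(
    r"Pssm-ID:\s*(?P<pssm_id>\d+)"
    r"(?P<multi>\s*\[Multi-domain\])?"
    r"\s+Cd Length:\s*(?P<cd_length>\d+)"
    r"\s+Bit Score:\s*(?P<bit_score>[\d.]+)"
    r"\s+E-value:\s*(?P<e_value>\S+)"
)


def parse_hit_stats(line):
    """Parse one statistics line; return a dict, or None if it does not match."""
    match = HIT_STATS.search(line)
    if match is None:
        return None
    return {
        "pssm_id": int(match.group("pssm_id")),
        "multi_domain": match.group("multi") is not None,
        "cd_length": int(match.group("cd_length")),
        "bit_score": float(match.group("bit_score")),
        "e_value": float(match.group("e_value")),
    }


if __name__ == "__main__":
    line = "Pssm-ID: 238185 [Multi-domain]  Cd Length: 98  Bit Score: 65.84  E-value: 8.07e-13"
    print(parse_hit_stats(line))
```

Parsed this way, hits can be sorted or filtered by E-value without re-reading the alignment blocks.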
RT_RNaseH pfam17917
RNase H-like domain found in reverse transcriptase; DNA polymerase and ribonuclease H (RNase H) ...
1022-1125 1.10e-12

RNase H-like domain found in reverse transcriptase; DNA polymerase and ribonuclease H (RNase H) activities allow reverse transcriptases to convert the single-stranded retroviral RNA genome into double-stranded DNA, which is integrated into the host chromosome during infection. This entry represents the RNase H like domain.


Pssm-ID: 465565  Cd Length: 104  Bit Score: 65.61  E-value: 1.10e-12
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1868331368 1022 NKPFTLYVDE-KAGMAkGVLTQQL-GPWKRPVAYFSKKLDNVAMGWPPCLQMVAAVAVLTKDEDRLTLGQQLTVIAPHAI 1099
Cdd:pfam17917    3 SKPFILETDAsDYGIG-AVLSQKDeDGKERPIAYASRKLTPAERNYSTTEKELLAIVWALKKFRHYLLGRKFTVYTDHKP 81
                           90       100
                   ....*....|....*....|....*.
gi 1868331368 1100 EAVVRQPpdrWLSNARMTHYQALLLN 1125
Cdd:pfam17917   82 LKYLFTP---KELNGRLARWALFLQE 104
RT_DIRS1 cd03714
RT_DIRS1: Reverse transcriptases (RTs) occurring in the DIRS1 group of retrotransposons. Members ...
808-932 8.70e-11

RT_DIRS1: Reverse transcriptases (RTs) occurring in the DIRS1 group of retrotransposons. Members of the subfamily include the Dictyostelium DIRS-1, Volvox carteri kangaroo, and Panagrellus redivivus PAT elements. These elements differ from LTR and conventional non-LTR retrotransposons. They contain split direct repeat (SDR) termini, and have been proposed to integrate via double-stranded closed-circle DNA intermediates assisted by an encoded recombinase which is similar to gamma-site-specific integrase.


Pssm-ID: 239684 [Multi-domain]  Cd Length: 119  Bit Score: 60.82  E-value: 8.70e-11
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1868331368  808 LDLKDTFFCLRLSPQSQPMFAFEWKDPetgfsgQLTWTSLPQGFKNSPTLFDEALHQDLADFRIHNPNLILlqYVDDLLL 887
Cdd:cd03714      1 VDLKDAYFHIPILPRSRDLLGFAWQGE------TYQFKALPFGLSLAPRVFTKVVEALLAPLRLLGVRIFS--YLDDLLI 72
                           90       100       110       120       130
                   ....*....|....*....|....*....|....*....|....*....|.
gi 1868331368  888 AAESEKDClqgtGALLQKLGE-----LGYRASAKKAQLY-REQVTYLGYRL 932
Cdd:cd03714     73 IASSIKTS----EAVLRHLRAtllanLGFTLNLEKSKLGpTQRITFLGLEL 119
retropepsin_like cd00303
Retropepsins; pepsin-like aspartate proteases; The family includes pepsin-like aspartate ...
552-632 1.24e-10

Retropepsins; pepsin-like aspartate proteases; The family includes pepsin-like aspartate proteases from retroviruses, retrotransposons and retroelements, as well as eukaryotic DNA-damage-inducible proteins (DDIs), and bacterial aspartate peptidases. While fungal and mammalian pepsins are bilobal proteins with structurally related N- and C-termini, retropepsins are half as long as their fungal and mammalian counterparts. The monomers are structurally related to one lobe of the pepsin molecule and retropepsins function as homodimers. The active site aspartate occurs within a motif (Asp-Thr/Ser-Gly), as it does in pepsin. Retroviral aspartyl protease is synthesized as part of the POL polyprotein that contains an aspartyl protease, a reverse transcriptase, RNase H, and an integrase. The POL polyprotein undergoes specific enzymatic cleavage to yield the mature proteins. In aspartate peptidases, Asp residues are ligands of an activated water molecule in all examples where catalytic residues have been identified. This group of aspartate peptidases is classified by MEROPS as the peptidase family A2 (retropepsin family, clan AA), subfamily A2A.


Pssm-ID: 133136  Cd Length: 92  Bit Score: 59.27  E-value: 1.24e-10
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1868331368  552 VTLNVGGQPVTFLVDTGAQHSVLTQT------QGPLSTRTAW-VQGATGGKLHRWTTERKVHLSTGH--VTHSFLLVPEC 622
Cdd:cd00303      1 LKGKINGVPVRALVDSGASVNFISESlakklgLPPRLLPTPLkVKGANGSSVKTLGVILPVTIGIGGktFTVDFYVLDLL 80
                           90
                   ....*....|.
gi 1868331368  623 PYP-LLGRDLL 632
Cdd:cd00303     81 SYDvILGRPWL 91
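
The retropepsin description above places the active-site aspartate within an Asp-Thr/Ser-Gly motif. As an illustration, the sketch below scans the query segment from the cd00303 alignment (residues 552-632) for that motif and reports the residue number in the full protein. The function name find_asp_motif is hypothetical. A regular-expression match only shows that the sequence pattern is present; it does not show that the residue is catalytically active.

```python
import re


def find_asp_motif(aligned_query, start):
    """Return (residue_number, motif) for each Asp-Thr/Ser-Gly match.

    aligned_query: query row of an alignment, possibly containing '-' gaps.
    start: residue number of the first query residue in the full protein.
    """
    # Drop alignment gaps and normalise case (lower case marks unaligned residues).
    seq = aligned_query.replace("-", "").upper()
    return [(start + m.start(), m.group()) for m in re.finditer(r"D[TS]G", seq)]


# Query rows copied from the cd00303 alignment above (interval 552-632).
query_552_632 = (
    "VTLNVGGQPVTFLVDTGAQHSVLTQT------QGPLSTRTAW-VQGATGGKLHRWTTERKVHLSTGH--VTHSFLLVPEC"
    "PYP-LLGRDLL"
)

if __name__ == "__main__":
    print(find_asp_motif(query_552_632, 552))  # [(566, 'DTG')]
```

The single match, DTG starting at residue 566, lines up with the DSG in the cd00303 consensus row of the alignment.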
zf-H2C2 pfam09337
H2C2 zinc finger; This domain binds to histone upstream activating sequence (UAS) elements ...
1385-1426 1.61e-08

H2C2 zinc finger; This domain binds to histone upstream activating sequence (UAS) elements that are found in histone gene promoters.


Pssm-ID: 430537  Cd Length: 39  Bit Score: 51.56  E-value: 1.61e-08
                           10        20        30        40
                   ....*....|....*....|....*....|....*....|..
gi 1868331368 1385 HRWTHLGHKKLKALLQREeqtYYIHNPNALIQQITSTCTPCA 1426
Cdd:pfam09337    1 HALTHLGINKLTALLARK---YHWLGIKETVSEVISSCVACQ 39
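
The alignment blocks in this listing show a query row and a Cdd row of equal width, with '-' marking gaps and lower-case query letters marking residues that have no counterpart in the domain model. A minimal sketch, assuming exactly that layout, of how identical positions could be counted for one pair of rows; the function name alignment_identity is hypothetical, and the count is not a statistic reported by CDD itself.

```python
def alignment_identity(query, subject):
    """Count identical and aligned (gap-free) columns in a pairwise alignment.

    Returns a tuple (identical, aligned). Columns where either row has a
    gap character '-' are skipped. Comparison is case-insensitive.
    """
    if len(query) != len(subject):
        raise ValueError("alignment rows must have the same length")
    identical = 0
    aligned = 0
    for q, s in zip(query, subject):
        if q == "-" or s == "-":
            continue  # gap column: not counted as aligned
        aligned += 1
        if q.upper() == s.upper():
            identical += 1
    return identical, aligned


if __name__ == "__main__":
    # Rows copied from the pfam09337 (zf-H2C2) alignment above.
    query = "HRWTHLGHKKLKALLQREeqtYYIHNPNALIQQITSTCTPCA"
    subject = "HALTHLGINKLTALLARK---YHWLGIKETVSEVISSCVACQ"
    identical, aligned = alignment_identity(query, subject)
    print(f"{identical}/{aligned} identical ({100 * identical / aligned:.1f}%)")
```

For alignments that span several blocks, the rows of each block would need to be concatenated before calling the function.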
PHA03247 PHA03247
large tegument protein UL36; Provisional
81-222 3.23e-08

large tegument protein UL36; Provisional


Pssm-ID: 223021 [Multi-domain]  Cd Length: 3151  Bit Score: 58.80  E-value: 3.23e-08
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1868331368   81 PYIVTW----ESLSRD----PPPWVRPFLPPRLSGPR-PTPLPSPAPLIPT-------PSAPPSTSSLLPLTEPQSPNPR 144
Cdd:PHA03247  2531 PRMLTWirglEELASDdagdPPPPLPPAAPPAAPDRSvPPPRPAPRPSEPAvtsrarrPDAPPQSARPRAPVDDRGDPRG 2610
                           90       100       110       120       130       140       150
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*...
gi 1868331368  145 PKAPAVLPdtqEDLLLLDSPPPYPNAEQAASQLPAADQSPPPPSsttgDPGDPSPPSTRLRSRRDRTLEGPDSSGISQ 222
Cdd:PHA03247  2611 PAPPSPLP---PDTHAPDPPPPSPSPAANEPDPHPPPTVPPPER----PRDDPAPGRVSRPRRARRLGRAAQASSPPQ 2681
Gag_p12 pfam01141
Gag polyprotein, inner coat protein p12; The retroviral p12 is a virion structural protein. ...
144-224 4.19e-07

Gag polyprotein, inner coat protein p12; The retroviral p12 is a virion structural protein. p12 is proline rich. The function carried out by p12 in assembly and replication is unknown. p12 is associated with pathogenicity of the virus.


Pssm-ID: 279483 [Multi-domain]  Cd Length: 85  Bit Score: 48.98  E-value: 4.19e-07
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1868331368  144 RPKAPAVLPDTQEDL--LLLDSPPPYPNAEQAasqlPAADQSPPPPSSTTGDPGDPSPPSTRLRSRRDRTLEgpdSSGIS 221
Cdd:pfam01141   10 KPPKPQVLPDSGGPLidLLTEDPPPYRDAQPP----PSARDGNEEEAAPAGEAPDPSPMASRLRGRRDPPAA---DSTTS 82

                   ...
gi 1868331368  222 QAF 224
Cdd:pfam01141   83 QAF 85
PHA03247 PHA03247
large tegument protein UL36; Provisional
93-201 7.38e-07

large tegument protein UL36; Provisional


Pssm-ID: 223021 [Multi-domain]  Cd Length: 3151  Bit Score: 54.17  E-value: 7.38e-07
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1868331368   93 PPPWVRPFLPPRLSGPRPTPLP---SPAPLIPTPSAPPSTSSL-LPLTEPQSPNPRPKAPAVLPDTQEDLLLLDSPPPYP 168
Cdd:PHA03247  2830 PPTSAQPTAPPPPPGPPPPSLPlggSVAPGGDVRRRPPSRSPAaKPAAPARPPVRRLARPAVSRSTESFALPPDQPERPP 2909
                           90       100       110
                   ....*....|....*....|....*....|...
gi 1868331368  169 NAEQAASQLPAAdQSPPPPSSTTGDPGDPSPPS 201
Cdd:PHA03247  2910 QPQAPPPPQPQP-QPPPPPQPQPPPPPPPRPQP 2941
gag-asp_proteas pfam13975
gag-polyprotein putative aspartyl protease; This family of putative aspartyl proteases is ...
552-634 7.58e-07

gag-polyprotein putative aspartyl protease; This family of putative aspartyl proteases is found predominantly in retroviral proteins.


Pssm-ID: 464060  Cd Length: 92  Bit Score: 48.73  E-value: 7.58e-07
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1868331368  552 VTLNVGGQPVTFLVDTGAQHSVLTQT-------QGPLSTRTAWVQGATGGKLHRWTTERKVHLSTGHVTHSFLLV--PEC 622
Cdd:pfam13975    1 VDVTINGRPVRFLVDTGASVTVISEAlaerlglDRLVDAYPVTVRTANGTVRAARVRLDSVKIGGIELRNVPAVVlpGDL 80
                           90
                   ....*....|..
gi 1868331368  623 PYPLLGRDLLSK 634
Cdd:pfam13975   81 DDVLLGMDFLKR 92
PHA03247 PHA03247
large tegument protein UL36; Provisional
71-201 2.32e-06

large tegument protein UL36; Provisional


Pssm-ID: 223021 [Multi-domain]  Cd Length: 3151  Bit Score: 52.63  E-value: 2.32e-06
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1868331368   71 PGPWGYPDQVPyIVTWESLSRDPPPWVRPFLPPRLSGPRPTPLPSPAPLIPTPSAPPS-TSSLLPLTEPQSPNPRPKAPA 149
Cdd:PHA03247  2712 PHALVSATPLP-PGPAAARQASPALPAAPAPPAVPAGPATPGGPARPARPPTTAGPPApAPPAAPAAGPPRRLTRPAVAS 2790
                           90       100       110       120       130
                   ....*....|....*....|....*....|....*....|....*....|....*.
gi 1868331368  150 VLPDTQEDLLLLDSPPPYPNAEQAASQLPA----ADQSPPPPSSTTGDPGDPSPPS 201
Cdd:PHA03247  2791 LSESRESLPSPWDPADPPAAVLAPAAALPPaaspAGPLPPPTSAQPTAPPPPPGPP 2846
PHA03247 PHA03247
large tegument protein UL36; Provisional
89-199 2.79e-06

large tegument protein UL36; Provisional


Pssm-ID: 223021 [Multi-domain]  Cd Length: 3151  Bit Score: 52.25  E-value: 2.79e-06
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1868331368   89 LSRDPPPWVRPFLPPRLSGPRPTPLPSPAPliptpsaPPSTSSlLPLTEPQSPNPRPKAPAVLPDTQEdlllldsPPPYP 168
Cdd:PHA03247  2862 VRRRPPSRSPAAKPAAPARPPVRRLARPAV-------SRSTES-FALPPDQPERPPQPQAPPPPQPQP-------QPPPP 2926
                           90       100       110
                   ....*....|....*....|....*....|.
gi 1868331368  169 NAEQAASQLPAADQSPPPPSSTTGDPGDPSP 199
Cdd:PHA03247  2927 PQPQPPPPPPPRPQPPLAPTTDPAGAGEPSG 2957
PHA03247 PHA03247
large tegument protein UL36; Provisional
71-203 3.08e-06

large tegument protein UL36; Provisional


Pssm-ID: 223021 [Multi-domain]  Cd Length: 3151  Bit Score: 52.25  E-value: 3.08e-06
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1868331368   71 PGPWGYPDQVPYIVTWESLSRDPPPWVRPFLPPRLSGPRPTPLPSPAPLIPTP-SAPPSTSSLLPLTEPQSPnPRPKAPA 149
Cdd:PHA03247  2633 PAANEPDPHPPPTVPPPERPRDDPAPGRVSRPRRARRLGRAAQASSPPQRPRRrAARPTVGSLTSLADPPPP-PPTPEPA 2711
                           90       100       110       120       130
                   ....*....|....*....|....*....|....*....|....*....|....
gi 1868331368  150 VLPDTQEDLLlldspPPYPNAEQAASQLPAADQSPPPPSSTTGDPGDPSPPSTR 203
Cdd:PHA03247  2712 PHALVSATPL-----PPGPAAARQASPALPAAPAPPAVPAGPATPGGPARPARP 2760
PHA02682 PHA02682
ORF080 virion core protein; Provisional
95-184 5.28e-06

ORF080 virion core protein; Provisional


Pssm-ID: 177464 [Multi-domain]  Cd Length: 280  Bit Score: 49.86  E-value: 5.28e-06
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1868331368   95 PWVRPFLPPRLSGPRPTPLPSPAPLIPTPSAPPSTSSLLPltEPQSPNPRPkAPAVLPDTQEDLLlldSPPPYPnaeqaA 174
Cdd:PHA02682   110 PAPAPACPPATAPTCPPPAVCPAPARPAPACPPSTRQCPP--APPLPTPKP-APAAKPIFLHNQL---PPPDYP-----A 178
                           90
                   ....*....|
gi 1868331368  175 SQLPAADQSP 184
Cdd:PHA02682   179 ASCPTIETAP 188
PHA03307 PHA03307
transcriptional regulator ICP4; Provisional
71-245 7.05e-06

transcriptional regulator ICP4; Provisional


Pssm-ID: 223039 [Multi-domain]  Cd Length: 1352  Bit Score: 50.94  E-value: 7.05e-06
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1868331368   71 PGPWGYPDQVPYIVTWESLSRDPPPwVRPFLPPRLSGPRPTPLPSPAPLIPTPSAP-----PSTSSLLPLTEPQSPNPRP 145
Cdd:PHA03307   260 PAPITLPTRIWEASGWNGPSSRPGP-ASSSSSPRERSPSPSPSSPGSGPAPSSPRAsssssSSRESSSSSTSSSSESSRG 338
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1868331368  146 kaPAVLPDTQEDLLLLDSPPPYPNAEQAASQLPAADQSPPPPSSTtgdPGDPSPPSTR-LRSRRDRTLEGPDSSGISQAF 224
Cdd:PHA03307   339 --AAVSPGPSPSRSPSPSRPPPPADPSSPRKRPRPSRAPSSPAAS---AGRPTRRRARaAVAGRARRRDATGRFPAGRPR 413
                          170       180
                   ....*....|....*....|.
gi 1868331368  225 PLRLMGEGRYQYWPFSSADLY 245
Cdd:PHA03307   414 PSPLDAGAASGAFYARYPLLT 434
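
Many of the hits above cover overlapping stretches of the same proline-rich N-terminal region. One way to see how much distinct sequence they cover is to merge the reported intervals. The sketch below is illustrative only: merge_intervals is a hypothetical helper, and the interval list is a hand-copied subset of the intervals listed above, not a complete parse of the page.

```python
def merge_intervals(intervals):
    """Merge overlapping or directly adjacent (start, end) residue intervals.

    Intervals are 1-based and inclusive, as in the hit list.
    """
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1] + 1:
            # Overlaps or touches the previous interval: extend it.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


# Subset of hit intervals copied by hand from the listing above.
hits = [
    (81, 222),   # PHA03247
    (144, 224),  # Gag_p12
    (93, 201),   # PHA03247
    (552, 634),  # gag-asp_proteas
    (71, 201),   # PHA03247
    (89, 199),   # PHA03247
    (71, 203),   # PHA03247
    (95, 184),   # PHA02682
    (71, 245),   # PHA03307
]

if __name__ == "__main__":
    print(merge_intervals(hits))  # [(71, 245), (552, 634)]
```

Merging makes it visible that the repeated PHA03247 and PHA03307 hits all fall inside one region, residues 71-245, rather than marking separate domains.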
Atrophin-1 pfam03154
Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian ...
102-226 1.31e-05

Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian atrophy (DRPLA) gene. DRPLA (OMIM:125370) is a progressive neurodegenerative disorder. It is caused by the expansion of a CAG repeat in the DRPLA gene on chromosome 12p. This results in an extended polyglutamine region in atrophin-1 that is thought to confer toxicity to the protein, possibly through altering its interactions with other proteins. The expansion of a CAG repeat is also the underlying defect in six other neurodegenerative disorders, including Huntington's disease. One interaction of expanded polyglutamine repeats that is thought to be pathogenic is that with the short glutamine repeat in the transcriptional coactivator CREB binding protein, CBP. This interaction draws CBP away from its usual nuclear location to the expanded polyglutamine repeat protein aggregates that are characteristic of the polyglutamine neurodegenerative disorders. This interferes with CBP-mediated transcription and causes cytotoxicity.


Pssm-ID: 460830 [Multi-domain]  Cd Length: 991  Bit Score: 49.77  E-value: 1.31e-05
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1868331368  102 PPRLSGPRPTPLPSPAPLIPT------------PSAPPSTSSLLPltEPQSPNPRPKAPAVLPDTQEDLLLLDSPPPYPN 169
Cdd:pfam03154  375 PPHLSGPSPFQMNSNLPPPPAlkplsslsthhpPSAHPPPLQLMP--QSQQLPPPPAQPPVLTQSQSLPPPAASHPPTSG 452
                           90       100       110       120       130
                   ....*....|....*....|....*....|....*....|....*....|....*..
gi 1868331368  170 AEQAASQLPAADQSPPPPSSTTGDPGDPSPPSTrlrSRRDRTLEGPDSSGISQAFPL 226
Cdd:pfam03154  453 LHQVPSQSPFPQHPFVPGGPPPITPPSGPPTST---SSAMPGIQPPSSASVSSSGPV 506
PHA03247 PHA03247
large tegument protein UL36; Provisional
92-202 1.72e-05

large tegument protein UL36; Provisional


Pssm-ID: 223021 [Multi-domain]  Cd Length: 3151  Bit Score: 49.94  E-value: 1.72e-05
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1868331368   92 DPPPWVRPFLPPRLSGPRPTPLPSP-----APLIPTPSAPPSTSSLLPLTEPQSPNPRPKAPAVLPDTQEDLLLLDSPPP 166
Cdd:PHA03247  2625 DPPPPSPSPAANEPDPHPPPTVPPPerprdDPAPGRVSRPRRARRLGRAAQASSPPQRPRRRAARPTVGSLTSLADPPPP 2704
                           90       100       110       120
                   ....*....|....*....|....*....|....*....|
gi 1868331368  167 YPNAEQAASQLPAADQSPPPPSSTTGD----PGDPSPPST 202
Cdd:PHA03247  2705 PPTPEPAPHALVSATPLPPGPAAARQAspalPAAPAPPAV 2744
PRK14950 PRK14950
DNA polymerase III subunits gamma and tau; Provisional
88-188 2.48e-05

DNA polymerase III subunits gamma and tau; Provisional


Pssm-ID: 237864 [Multi-domain]  Cd Length: 585  Bit Score: 48.65  E-value: 2.48e-05
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1868331368   88 SLSRDPPPWVRPFLPPRLSGPRPTPLPSPAPLIPTPSAPPSTSSLLPLTEPQSPNPRPKAPAVlPDTQEdlllldSPPPY 167
Cdd:PRK14950   359 LLVPVPAPQPAKPTAAAPSPVRPTPAPSTRPKAAAAANIPPKEPVRETATPPPVPPRPVAPPV-PHTPE------SAPKL 431
                           90       100
                   ....*....|....*....|.
gi 1868331368  168 PNAEQAASQLPAADQSPPPPS 188
Cdd:PRK14950   432 TRAAIPVDEKPKYTPPAPPKE 452
PHA03247 PHA03247
large tegument protein UL36; Provisional
92-202 3.06e-05

large tegument protein UL36; Provisional


Pssm-ID: 223021 [Multi-domain]  Cd Length: 3151  Bit Score: 48.78  E-value: 3.06e-05
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1868331368   92 DPPPwvrpflPPRLSGPRPTPLPSPAPLIPTPSAPPSTSSLLPLTEpqSPNPRPKAPAVlPDTqedllllDSPPPYPNAE 171
Cdd:PHA03247  2700 DPPP------PPPTPEPAPHALVSATPLPPGPAAARQASPALPAAP--APPAVPAGPAT-PGG-------PARPARPPTT 2763
                           90       100       110
                   ....*....|....*....|....*....|..
gi 1868331368  172 QA-ASQLPAADQSPPPPSSTTGDPGDPSPPST 202
Cdd:PHA03247  2764 AGpPAPAPPAAPAAGPPRRLTRPAVASLSESR 2795
PHA03247 PHA03247
large tegument protein UL36; Provisional
71-202 3.98e-05

large tegument protein UL36; Provisional


Pssm-ID: 223021 [Multi-domain]  Cd Length: 3151  Bit Score: 48.40  E-value: 3.98e-05
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1868331368   71 PGPWGyPDQVPYIVTWESLSRDPPPWVRPFLPPRLSGPRPTPLPSPAPLIPTPS--------------APPSTSSLLPLT 136
Cdd:PHA03247  2799 PSPWD-PADPPAAVLAPAAALPPAASPAGPLPPPTSAQPTAPPPPPGPPPPSLPlggsvapggdvrrrPPSRSPAAKPAA 2877
                           90       100       110       120       130       140       150
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1868331368  137 EPQSPNPRPKAPAVLPDTQEDLLLLDSPPPYPNAEQAASQLPAADQSPP----PPSSTTGDPGDPSPPST 202
Cdd:PHA03247  2878 PARPPVRRLARPAVSRSTESFALPPDQPERPPQPQAPPPPQPQPQPPPPpqpqPPPPPPPRPQPPLAPTT 2947
retropepsin_like_bacteria cd05483
Bacterial aspartate proteases, retropepsin-like protease family; This family of bacterial ...
550-593 4.12e-05

Bacterial aspartate proteases, retropepsin-like protease family; This family of bacterial aspartate proteases is a subfamily of the retropepsin-like protease family, which includes enzymes from retroviruses and retrotransposons. While fungal and mammalian pepsin-like aspartate proteases are bilobal proteins with structurally related N- and C-termini, the members of this family of bacterial aspartate proteases are half as long as their fungal and mammalian counterparts. The monomers are structurally related to one lobe of the pepsin molecule and function as homodimers. The active site aspartate occurs within a motif (Asp-Thr/Ser-Gly), as it does in pepsin. In aspartate peptidases, Asp residues are ligands of an activated water molecule in all examples where catalytic residues have been identified. This group of aspartate proteases is classified by MEROPS as the peptidase family A2 (retropepsin family, clan AA), subfamily A2A.


Pssm-ID: 133150  Cd Length: 96  Bit Score: 43.77  E-value: 4.12e-05
                           10        20        30        40        50
                   ....*....|....*....|....*....|....*....|....*....|
gi 1868331368  550 PRVTLNVGGQPVTFLVDTGAQHSVLTQT------QGPLSTRTAWVQGATG 593
Cdd:cd05483      3 FVVPVTINGQPVRFLLDTGASTTVISEElaerlgLPLTLGGKVTVQTANG 52
PHA03307 PHA03307
transcriptional regulator ICP4; Provisional
71-218 4.33e-05

transcriptional regulator ICP4; Provisional


Pssm-ID: 223039 [Multi-domain]  Cd Length: 1352  Bit Score: 48.24  E-value: 4.33e-05
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1868331368   71 PGPWGYPDQVPYIVTWESLSRDPPPWvrPFLPPRLSGPRPTPLPSPA--PLIPTPSAPPSTSSLLPLTEPQSPNPRPKAP 148
Cdd:PHA03307    65 FEPPTGPPPGPGTEAPANESRSTPTW--SLSTLAPASPAREGSPTPPgpSSPDPPPPTPPPASPPPSPAPDLSEMLRPVG 142
                           90       100       110       120       130       140       150
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1868331368  149 AVLPDTQEDLLLLDSPPPYPNAEQAASQLPAADQSPPPPSSTTGDPGDPSPPSTRLRSRRDRTLEGPDSS 218
Cdd:PHA03307   143 SPGPPPAASPPAAGASPAAVASDAASSRQAALPLSSPEETARAPSSPPAEPPPSTPPAAASPRPPRRSSP 212
PHA03307 PHA03307
transcriptional regulator ICP4; Provisional
103-223 4.63e-05

transcriptional regulator ICP4; Provisional


Pssm-ID: 223039 [Multi-domain]  Cd Length: 1352  Bit Score: 48.24  E-value: 4.63e-05
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1868331368  103 PRLSGPRPTPLPSPAPLIPTPSAPPSTSSLLPLTEPQSPNPRPKAPAVLP--------DTQEDLLLLDSPPPYPNAEQAA 174
Cdd:PHA03307   251 PENECPLPRPAPITLPTRIWEASGWNGPSSRPGPASSSSSPRERSPSPSPsspgsgpaPSSPRASSSSSSSRESSSSSTS 330
                           90       100       110       120
                   ....*....|....*....|....*....|....*....|....*....
gi 1868331368  175 SQLPAADQSPPPPSSTTGDPGDPSPPSTRLRSRRDRTLEGPDSSGISQA 223
Cdd:PHA03307   331 SSSESSRGAAVSPGPSPSRSPSPSRPPPPADPSSPRKRPRPSRAPSSPA 379
PHA03378 PHA03378
EBNA-3B; Provisional
50-225 4.70e-05

EBNA-3B; Provisional


Pssm-ID: 223065 [Multi-domain]  Cd Length: 991  Bit Score: 48.14  E-value: 4.70e-05
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1868331368   50 WPRDGTFDLTTILQVKTRVFQPGPWGYPDQVPYIvtweSLSRDPPPWVRPFLP---PRLSGPRPTPLPS-------PAPL 119
Cdd:PHA03378   624 WPMPLRPIPMRPLRMQPITFNVLVFPTPHQPPQV----EITPYKPTWTQIGHIpyqPSPTGANTMLPIQwapgtmqPPPR 699
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1868331368  120 IPTPSAPPSTSSLlPLTEPQSPNPRPKAPAVLPDTQEDLLLLDSPPPYPNAEQAASQLPAADQSP-PPPSSTTGDPGdPS 198
Cdd:PHA03378   700 APTPMRPPAAPPG-RAQRPAAATGRARPPAAAPGRARPPAAAPGRARPPAAAPGRARPPAAAPGRaRPPAAAPGAPT-PQ 777
                          170       180
                   ....*....|....*....|....*..
gi 1868331368  199 PPSTRLRSRRDRTLEGPDSSGISQAFP 225
Cdd:PHA03378   778 PPPQAPPAPQQRPRGAPTPQPPPQAGP 804
COG3577 COG3577
Predicted aspartyl protease [General function prediction only];
523-593 5.24e-05

Predicted aspartyl protease [General function prediction only];


Pssm-ID: 442797  Cd Length: 152  Bit Score: 44.94  E-value: 5.24e-05
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1868331368  523 NPRRKPTPILPLEKGD*DSQGQVPPPE-----PRVTLNVGGQPVTFLVDTGAQHSVLTQT-------QGPLSTRTAWVQG 590
Cdd:COG3577     10 GGRGVRAQLNPGQAPVSTGGGEVVLKRdrdghFVVEGTINGQPVRFLVDTGASTVVLSESdarrlglDPEDLGRPVRVQT 89

                   ...
gi 1868331368  591 ATG 593
Cdd:COG3577     90 ANG 92
PHA03247 PHA03247
large tegument protein UL36; Provisional
98-226 7.66e-05

large tegument protein UL36; Provisional


Pssm-ID: 223021 [Multi-domain]  Cd Length: 3151  Bit Score: 47.63  E-value: 7.66e-05
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1868331368   98 RPFLPPRLSGP-RPTPLPSPAPLIPTPSAPPSTSSLLPLTE--PQSPNPRPKAPAVLPDTQEDLLLLDSPPPYPNAEQAA 174
Cdd:PHA03247  2756 RPARPPTTAGPpAPAPPAAPAAGPPRRLTRPAVASLSESREslPSPWDPADPPAAVLAPAAALPPAASPAGPLPPPTSAQ 2835
                           90       100       110       120       130       140
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*
gi 1868331368  175 SQLPAADQSPPPPSSTTGD---PGDP--------SPPSTRLRSRRD--RTLEGPDSSGISQAFPL 226
Cdd:PHA03247  2836 PTAPPPPPGPPPPSLPLGGsvaPGGDvrrrppsrSPAAKPAAPARPpvRRLARPAVSRSTESFAL 2900
DUF5585 pfam17823
Family of unknown function (DUF5585); This is a family of unknown function found in chordata.
87-219 7.67e-05

Family of unknown function (DUF5585); This is a family of unknown function found in chordata.


Pssm-ID: 465521 [Multi-domain]  Cd Length: 506  Bit Score: 47.26  E-value: 7.67e-05
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1868331368   87 ESLSRDPPPWVRPFL-----------PPRLSGPRPTPLPSPAPLIP-TPSAPPSTSS-LLPLTEPQSPNPRPKAPAVLPD 153
Cdd:pfam17823  289 DTMARNPAAPMGAQAqgpiiqvstdqPVHNTAGEPTPSPSNTTLEPnTPKSVASTNLaVVTTTKAQAKEPSASPVPVLHT 368
                           90       100       110       120       130       140
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....
gi 1868331368  154 TQEdlllldspppyPNAEQAAsqlPAADQSPPPPSSTTGDPGDPSPP---STRLRSRRDRTLEGPDSSG 219
Cdd:pfam17823  369 SMI-----------PEVEATS---PTTQPSPLLPTQGAAGPGILLAPeqvATEATAGTASAGPTPRSSG 423
PRK12323 PRK12323
DNA polymerase III subunit gamma/tau;
102-203 8.01e-05

DNA polymerase III subunit gamma/tau;


Pssm-ID: 237057 [Multi-domain]  Cd Length: 700  Bit Score: 47.18  E-value: 8.01e-05
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1868331368  102 PPRLSGPRPTPLPSPAPLIPTPSAPPstsslLPLTEPQSPNPRPKAPAVLPDTQEdllllDSPPPYPNAEQAASQLPAAD 181
Cdd:PRK12323   444 PGGAPAPAPAPAAAPAAAARPAAAGP-----RPVAAAAAAAPARAAPAAAPAPAD-----DDPPPWEELPPEFASPAPAQ 513
                           90       100
                   ....*....|....*....|....*
gi 1868331368  182 QSPPPP---SSTTGDPGDPSPPSTR 203
Cdd:PRK12323   514 PDAAPAgwvAESIPDPATADPDDAF 538
RNase_HI_like cd09279
RNase HI family that includes archaeal, some bacterial as well as plant RNase HI; Ribonuclease ...
1183-1321 8.16e-05

RNase HI family that includes archaeal, some bacterial as well as plant RNase HI; Ribonuclease H (RNase H) is classified into two evolutionarily unrelated families, type 1 (prokaryotic RNase HI, eukaryotic RNase H1 and viral RNase H) and type 2 (prokaryotic RNase HII and HIII, and eukaryotic RNase H2). RNase H is an endonuclease that cleaves the RNA strand of an RNA/DNA hybrid in a sequence non-specific manner. RNase H is involved in DNA replication, repair and transcription. RNase H is widely present in various organisms, including bacteria, archaea and eukaryotes, and most prokaryotic and eukaryotic genomes contain multiple RNase H genes. Despite the lack of amino acid sequence homology, type 1 and type 2 RNase H share a main-chain fold and steric configurations of the four acidic active-site (DEDD) residues and have the same catalytic mechanism and functions in cells. One of the important functions of RNase H is to remove the RNA primers of Okazaki fragments during DNA replication. Most archaeal genomes contain only type 2 RNase H (RNase HII); however, a few contain RNase HI as well. Although archaeal RNase HI sequences conserve the DEDD active-site motif, they lack other common features important for catalytic function, such as the basic protrusion region. Archaeal RNase HI homologs are more closely related to retroviral RNase HI than to bacterial and eukaryotic type 1 RNase H in enzymatic properties.


Pssm-ID: 260011 [Multi-domain]  Cd Length: 128  Bit Score: 43.62  E-value: 8.16e-05
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1868331368 1183 FTDGSSFLQDGIrkAGAAVV----DGQTTIWASALPPGTSAQRAELIALTQALKMAEG---RRVNIYTDSRyaFATAHVH 1255
Cdd:cd09279      4 YFDGASRGNPGP--AGAGVViyspGGEVLELSERLGFPATNNEAEYEALIAGLELALElgaEKLEIYGDSQ--LVVNQLN 79
                           90       100       110       120       130       140       150
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|..
gi 1868331368 1256 GEIYrrrglltsaekdIKNK------TEILELLQALflpRRLSIIHCPGHQkgndpvargNRMADEEARKAA 1321
Cdd:cd09279     80 GEYK------------VKNErlkpllEKVLELLAKF---ELVELKWIPREQ---------NKEADALANQAL 127
PHA03307 PHA03307
transcriptional regulator ICP4; Provisional
87-205 8.57e-05

transcriptional regulator ICP4; Provisional


Pssm-ID: 223039 [Multi-domain]  Cd Length: 1352  Bit Score: 47.47  E-value: 8.57e-05
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1868331368   87 ESLSRDPPPWVRPFLPPRLSGPRPTPLPSPAPLIPTPSAPPSTSSLLPLTEPQSPnPRPKAPAVLPDTQEDLLLLDSPPP 166
Cdd:PHA03307   328 STSSSSESSRGAAVSPGPSPSRSPSPSRPPPPADPSSPRKRPRPSRAPSSPAASA-GRPTRRRARAAVAGRARRRDATGR 406
                           90       100       110       120
                   ....*....|....*....|....*....|....*....|....*.
gi 1868331368  167 YPNAEQAASQLPAADQS-----PPPPSSTTGD--PGDPSPPSTRLR 205
Cdd:PHA03307   407 FPAGRPRPSPLDAGAASgafyaRYPLLTPSGEpwPGSPPPPPGRVR 452
Not5 COG5665
CCR4-NOT transcriptional regulation complex, NOT5 subunit [Transcription];
108-201 9.09e-05

CCR4-NOT transcriptional regulation complex, NOT5 subunit [Transcription];


Pssm-ID: 444384 [Multi-domain]  Cd Length: 874  Bit Score: 46.96  E-value: 9.09e-05
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1868331368  108 PRPTPLPSPaPLIPTPSAPPS----TSSLLPLTEPQSPNPRPKA--PAVLPDTQEDLLLLDSPPPYPNAEQAASQLPAAD 181
Cdd:COG5665    256 SSQQPKSQP-TSPSGGTTPPStnqlTTSNTPTSTAKAQPQPPTKkqPAKEPPSDTASGNPSAPSVLINSDSPTSEDPATA 334
                           90       100
                   ....*....|....*....|..
gi 1868331368  182 QSPPPPSSTTG-DPGD-PSPPS 201
Cdd:COG5665    335 SVPTTEETTAFtTPSSvPSTPA 356
PHA03247 PHA03247
large tegument protein UL36; Provisional
87-260 1.02e-04

large tegument protein UL36; Provisional


Pssm-ID: 223021 [Multi-domain]  Cd Length: 3151  Bit Score: 47.24  E-value: 1.02e-04
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1868331368   87 ESLSRDPPPWVRPFLPPRLSGPRPTPLPSPAPLiPTPSAPPSTSSLLPLTEPQSPNPRPKAPAVLPDTQEDLLLLDSPP- 165
Cdd:PHA03247  2896 ESFALPPDQPERPPQPQAPPPPQPQPQPPPPPQ-PQPPPPPPPRPQPPLAPTTDPAGAGEPSGAVPQPWLGALVPGRVAv 2974
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1868331368  166 PYPNAEQAASQLPAADQSPPPPSSTTgDPG------------DPSPPSTRLRsrrdRTLEGPDSSGISQAFPLRLMGEGR 233
Cdd:PHA03247  2975 PRFRVPQPAPSREAPASSTPPLTGHS-LSRvsswasslalheETDPPPVSLK----QTLWPPDDTEDSDADSLFDSDSER 3049
                          170       180       190
                   ....*....|....*....|....*....|.
gi 1868331368  234 YQywpFSSADLYNWKAH----NPPFSQDPQA 260
Cdd:PHA03247  3050 SD---LEALDPLPPEPHdpfaHEPDPATPEA 3077
PHA03247 PHA03247
large tegument protein UL36; Provisional
90-218 1.11e-04

large tegument protein UL36; Provisional


Pssm-ID: 223021 [Multi-domain]  Cd Length: 3151  Bit Score: 47.24  E-value: 1.11e-04
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1868331368   90 SRDPPPWV-RPFLPPRLSGPRPTPLPS--PAPLIPTPSAPPSTSSLLPLTEPQSPNPRPKAPAVLPDTQEdlllLDSPPP 166
Cdd:PHA03247  2868 SRSPAAKPaAPARPPVRRLARPAVSRSteSFALPPDQPERPPQPQAPPPPQPQPQPPPPPQPQPPPPPPP----RPQPPL 2943
                           90       100       110       120       130
                   ....*....|....*....|....*....|....*....|....*....|..
gi 1868331368  167 YPNAEQAASQLPAAdqSPPPPSSTTGDPGDPSPPSTRLRSRRDrTLEGPDSS 218
Cdd:PHA03247  2944 APTTDPAGAGEPSG--AVPQPWLGALVPGRVAVPRFRVPQPAP-SREAPASS 2992
PHA03247 PHA03247
large tegument protein UL36; Provisional
94-208 1.27e-04

large tegument protein UL36; Provisional


Pssm-ID: 223021 [Multi-domain]  Cd Length: 3151  Bit Score: 46.86  E-value: 1.27e-04
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1868331368   94 PPWVRPFLPPRLSGPRPTPLPSPAPlIPTPSAPPSTSSLLPLTEPQSPNPRPKAPAVLPDTQ--EDLLLLDS---PPPYP 168
Cdd:PHA03247  2478 PVYRRPAEARFPFAAGAAPDPGGGG-PPDPDAPPAPSRLAPAILPDEPVGEPVHPRMLTWIRglEELASDDAgdpPPPLP 2556
                           90       100       110       120
                   ....*....|....*....|....*....|....*....|
gi 1868331368  169 naeqAASQLPAADQSPPPPSSTTgdpgDPSPPSTRLRSRR 208
Cdd:PHA03247  2557 ----PAAPPAAPDRSVPPPRPAP----RPSEPAVTSRARR 2588
Asp_protease_2 pfam13650
Aspartyl protease; This family consists of predicted aspartic proteases, typically from 180 to ...
552-575 1.36e-04

Aspartyl protease; This family consists of predicted aspartic proteases, typically from 180 to 230 amino acids in length, in MEROPS clan AA. This model describes the well-conserved 121-residue C-terminal region. The poorly conserved, variable length N-terminal region usually contains a predicted transmembrane helix.


Pssm-ID: 433378  Cd Length: 90  Bit Score: 42.27  E-value: 1.36e-04
                           10        20
                   ....*....|....*....|....
gi 1868331368  552 VTLNVGGQPVTFLVDTGAQHSVLT 575
Cdd:pfam13650    1 VPVTINGKPVRFLVDTGASGTVIS 24
PHA03247 PHA03247
large tegument protein UL36; Provisional
71-227 1.38e-04

large tegument protein UL36; Provisional


Pssm-ID: 223021 [Multi-domain]  Cd Length: 3151  Bit Score: 46.86  E-value: 1.38e-04
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1868331368   71 PGPWGYPD---------QVPYIVTWESLSRDPPP----WVRPF--LPPRLSGPRPTPLPsPAPLIPTP--SAPPStssll 133
Cdd:PHA03247  2498 PGGGGPPDpdappapsrLAPAILPDEPVGEPVHPrmltWIRGLeeLASDDAGDPPPPLP-PAAPPAAPdrSVPPP----- 2571
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1868331368  134 pltepqSPNPRPKAPAVLPDTQEDllllDSPPpypnaEQAASQLPAADQSPPP-PSSTTGDPGDPSPPSTRLRSRRDRTL 212
Cdd:PHA03247  2572 ------RPAPRPSEPAVTSRARRP----DAPP-----QSARPRAPVDDRGDPRgPAPPSPLPPDTHAPDPPPPSPSPAAN 2636
                          170
                   ....*....|....*
gi 1868331368  213 EGPDSSGISQAFPLR 227
Cdd:PHA03247  2637 EPDPHPPPTVPPPER 2651
PRK12323 PRK12323
DNA polymerase III subunit gamma/tau;
69-209 1.54e-04

DNA polymerase III subunit gamma/tau;


Pssm-ID: 237057 [Multi-domain]  Cd Length: 700  Bit Score: 46.41  E-value: 1.54e-04
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1868331368   69 FQPGPWGyPDQVPYIVTWESLSRDPPPWVrpflPPRLSGPRPTPLPSPAPLIPTPSAPPSTSSllpltepQSPNPRPKAP 148
Cdd:PRK12323   363 FRPGQSG-GGAGPATAAAAPVAQPAPAAA----APAAAAPAPAAPPAAPAAAPAAAAAARAVA-------AAPARRSPAP 430
                           90       100       110       120       130       140
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*
gi 1868331368  149 AVLPDTQEDLLL----LDSPPPYPNAEQAASQLPAADQSPPPPSSTTGDPGDPSPPSTRLRSRRD 209
Cdd:PRK12323   431 EALAAARQASARgpggAPAPAPAPAAAPAAAARPAAAGPRPVAAAAAAAPARAAPAAAPAPADDD 495
PHA02682 PHA02682
ORF080 virion core protein; Provisional
94-218 1.88e-04

ORF080 virion core protein; Provisional


Pssm-ID: 177464 [Multi-domain]  Cd Length: 280  Bit Score: 45.24  E-value: 1.88e-04
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1868331368   94 PPWVRPFLP-PRLSGPRP-TPLPSP---APLI----PTPSAPPSTSSLLPlTEPQSPNPRPKAPAVLPDTQEdlllLDSP 164
Cdd:PHA02682    76 PSGQSPLAPsPACAAPAPaCPACAPaapAPAVtcpaPAPACPPATAPTCP-PPAVCPAPARPAPACPPSTRQ----CPPA 150
                           90       100       110       120       130
                   ....*....|....*....|....*....|....*....|....*....|....*....
gi 1868331368  165 PPYPNAEQAASQLPA--ADQSPPPPSSTTGDPGDPSPP--STRLRSR-RDRTLEGPDSS 218
Cdd:PHA02682   151 PPLPTPKPAPAAKPIflHNQLPPPDYPAASCPTIETAPaaSPVLEPRiPDKIIDADNDD 209
BimA_second NF040983
trimeric autotransporter actin-nucleating factor BimA; This HMM describes BimA (Burkholderia ...
108-202 1.99e-04

trimeric autotransporter actin-nucleating factor BimA; This HMM describes BimA (Burkholderia intracellular motility A), WP_004266405.1-like proteins in Burkholderia mallei or B. pseudomallei. The term BimA has also been used for WP_011205626.1-like homologs that have a very different N-terminal half.


Pssm-ID: 468913 [Multi-domain]  Cd Length: 382  Bit Score: 45.66  E-value: 1.99e-04
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1868331368  108 PRPTPLPSPAPLIPTPSAPPSTSSLLPLTEPQSPNPRPKAPavlpdtqedlllldsPPPYPnaeqaasqlpaadqSPPPP 187
Cdd:NF040983    86 PNKVPPPPPPPPPPPPPPPTPPPPPPPPPPPPPPSPPPPPP---------------PSPPP--------------SPPPP 136
                           90
                   ....*....|....*
gi 1868331368  188 SSTtgdPGDPSPPST 202
Cdd:NF040983   137 TTT---PPTRTTPST 148
FAP pfam07174
Fibronectin-attachment protein (FAP); This family contains bacterial fibronectin-attachment ...
114-199 2.51e-04

Fibronectin-attachment protein (FAP); This family contains bacterial fibronectin-attachment proteins (FAP). Family members are rich in alanine and proline, are approximately 300 residues long, and seem to be restricted to mycobacteria. These proteins contain a fibronectin-binding motif that allows mycobacteria to bind to fibronectin in the extracellular matrix.


Pssm-ID: 429334  Cd Length: 301  Bit Score: 44.92  E-value: 2.51e-04
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1868331368  114 PSPAPLIPTPSAPPstssllPLTEPQSPNPRPKAPAVLPDTQEDllllDSPPPYPNAEQAASQLPAADQSPPPPSSttgD 193
Cdd:pfam07174   41 PEPAPPPPSTATAP------PAPPPPPPAPAAPAPPPPPAAPNA----PNAPPPPADPNAPPPPPADPNAPPPPAV---D 107

                   ....*.
gi 1868331368  194 PGDPSP 199
Cdd:pfam07174  108 PNAPEP 113
Integrase_H2C2 pfam17921
Integrase zinc binding domain; This zinc binding domain is found in a wide variety of ...
1372-1430 2.73e-04

Integrase zinc binding domain; This zinc binding domain is found in a wide variety of integrase proteins.


Pssm-ID: 465569 [Multi-domain]  Cd Length: 58  Bit Score: 40.31  E-value: 2.73e-04
                           10        20        30        40        50        60
                   ....*....|....*....|....*....|....*....|....*....|....*....|
gi 1868331368 1372 LPQGAAKELLSQLH-RWTHLGHKKLKALLQreeQTYYIHNPNALIQQITSTCTPCAKVNT 1430
Cdd:pfam17921    1 VPKSLRKEILKEAHdSGGHLGIEKTLARLR---RRYWWPGMRKDVKKYVKSCETCQRRKP 57
PHA03307 PHA03307
transcriptional regulator ICP4; Provisional
93-222 2.84e-04

transcriptional regulator ICP4; Provisional


Pssm-ID: 223039 [Multi-domain]  Cd Length: 1352  Bit Score: 45.55  E-value: 2.84e-04
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1868331368   93 PPPWVRPFLPPRLSGPRPTPLPSPAPLIPTPSAPPSTSSLLPLTEPQSPNPRPKAPAVLPDTQEDllLLDSPPPYPNAEQ 172
Cdd:PHA03307   117 PPPTPPPASPPPSPAPDLSEMLRPVGSPGPPPAASPPAAGASPAAVASDAASSRQAALPLSSPEE--TARAPSSPPAEPP 194
                           90       100       110       120       130
                   ....*....|....*....|....*....|....*....|....*....|
gi 1868331368  173 AASQLPAADQSPPPPSSTTGDPGdPSPPSTRLRSRRDRTLEGPDSSGISQ 222
Cdd:PHA03307   195 PSTPPAAASPRPPRRSSPISASA-SSPAPAPGRSAADDAGASSSDSSSSE 243
PRK14950 PRK14950
DNA polymerase III subunits gamma and tau; Provisional
111-202 3.03e-04

DNA polymerase III subunits gamma and tau; Provisional


Pssm-ID: 237864 [Multi-domain]  Cd Length: 585  Bit Score: 45.19  E-value: 3.03e-04
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1868331368  111 TPLPSPAPLIPTPSAP-PSTSSLLPLTEPQSPNPRPKAPA-VLPDTQedlllldsPPPYPNAEQAASQLPAAdQSPPPPS 188
Cdd:PRK14950   361 VPVPAPQPAKPTAAAPsPVRPTPAPSTRPKAAAAANIPPKePVRETA--------TPPPVPPRPVAPPVPHT-PESAPKL 431
                           90
                   ....*....|....
gi 1868331368  189 STTGDPGDPSPPST 202
Cdd:PRK14950   432 TRAAIPVDEKPKYT 445
PHA03247 PHA03247
large tegument protein UL36; Provisional
71-226 3.34e-04

large tegument protein UL36; Provisional


Pssm-ID: 223021 [Multi-domain]  Cd Length: 3151  Bit Score: 45.70  E-value: 3.34e-04
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1868331368   71 PGPWGYPDQVPYIVTWESLSRDPPPWVRPFLPPRLSGPrPTPLP-------SPAPLIPTPSAPPSTSSLLPLTEPQSPNP 143
Cdd:PHA03247  2680 PQRPRRRAARPTVGSLTSLADPPPPPPTPEPAPHALVS-ATPLPpgpaaarQASPALPAAPAPPAVPAGPATPGGPARPA 2758
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1868331368  144 RPKAPAVLPDTQEDLLLLDSPPPYPNAEQAASQLPAADQSPPPPSSTTGDPGDPSPPSTRLRSRRDRTLEGPDSSGISQA 223
Cdd:PHA03247  2759 RPPTTAGPPAPAPPAAPAAGPPRRLTRPAVASLSESRESLPSPWDPADPPAAVLAPAAALPPAASPAGPLPPPTSAQPTA 2838

                   ...
gi 1868331368  224 FPL 226
Cdd:PHA03247  2839 PPP 2841
TALPID3 pfam15324
Hedgehog signalling target; TALPID3 is a family of eukaryotic proteins that are targets for ...
108-202 3.59e-04

Hedgehog signalling target; TALPID3 is a family of eukaryotic proteins that are targets for Hedgehog signalling. Mutations in this gene, first noticed in chickens, lead to multiple developmental abnormalities.


Pssm-ID: 434634 [Multi-domain]  Cd Length: 1288  Bit Score: 45.26  E-value: 3.59e-04
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1868331368  108 PRPTPLPSPaPLIPTPSAPPSTSSLLPLTEPQSPNPRPKAPAVLPDTQEDLLLlDSPPPYPNA---------EQAASQLP 178
Cdd:pfam15324 1047 PTVTPIATP-PPAATPTPPLSENSIDKLKSPSPELPKPWEDSDLPLEEENPNS-EQEELHPRAvvmsvardeEPESVVLP 1124
                           90       100
                   ....*....|....*....|....
gi 1868331368  179 AADQSPPPPSSTTGDPGDPSPPST 202
Cdd:pfam15324 1125 ASPPEPKPLAPPPLGAAPPSPPQS 1148
PRK07764 PRK07764
DNA polymerase III subunits gamma and tau; Validated
93-196 3.77e-04

DNA polymerase III subunits gamma and tau; Validated


Pssm-ID: 236090 [Multi-domain]  Cd Length: 824  Bit Score: 44.98  E-value: 3.77e-04
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1868331368   93 PPPWVRPFLPPRLSGPRPTPLPSPAPLiptpsAPPSTSSLLPLTEPQSPNPRPKAPAVLPDTQEDllllDSPPPYPNAEQ 172
Cdd:PRK07764   412 PAAAAPAAAAAPAPAAAPQPAPAPAPA-----PAPPSPAGNAPAGGAPSPPPAAAPSAQPAPAPA----AAPEPTAAPAP 482
                           90       100
                   ....*....|....*....|....
gi 1868331368  173 AASQLPAADQSPPPPSSTTGDPGD 196
Cdd:PRK07764   483 APPAAPAPAAAPAAPAAPAAPAGA 506
TALPID3 pfam15324
Hedgehog signalling target; TALPID3 is a family of eukaryotic proteins that are targets for ...
91-213 3.91e-04

Hedgehog signalling target; TALPID3 is a family of eukaryotic proteins that are targets for Hedgehog signalling. Mutations in this gene, first noticed in chickens, lead to multiple developmental abnormalities.


Pssm-ID: 434634 [Multi-domain]  Cd Length: 1288  Bit Score: 45.26  E-value: 3.91e-04
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1868331368   91 RDPPPWVRPFLPPRLSGPRPTPLPSPAP------LIPTPSAPP--STSSLLPLTEPQSPNPRPKAPAVlpdtqedlllld 162
Cdd:pfam15324  974 PGDLPTKETLLPTPVPTPQPTPPCSPPSplkepsPVKTPDSSPcvSEHDFFPVKEIPPEKGADTGPAV------------ 1041
                           90       100       110       120       130
                   ....*....|....*....|....*....|....*....|....*....|.
gi 1868331368  163 SPPPYPNAEQAASQLPAADQSPPPPSSTTGDPGDPSPPSTRLRSRRDRTLE 213
Cdd:pfam15324 1042 SLVITPTVTPIATPPPAATPTPPLSENSIDKLKSPSPELPKPWEDSDLPLE 1092
PRK07764 PRK07764
DNA polymerase III subunits gamma and tau; Validated
90-260 4.43e-04

DNA polymerase III subunits gamma and tau; Validated


Pssm-ID: 236090 [Multi-domain]  Cd Length: 824  Bit Score: 44.98  E-value: 4.43e-04
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1868331368   90 SRDPPPWVRPFLPPRLSGPRPTPLPSPAPLIPTPSAPPSTSSLLPLTEPqsPNPRPKAPAVLPDTQEDllLLDSPPPYPN 169
Cdd:PRK07764   600 PPAPASSGPPEEAARPAAPAAPAAPAAPAPAGAAAAPAEASAAPAPGVA--APEHHPKHVAVPDASDG--GDGWPAKAGG 675
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1868331368  170 AEQAASQLPAADQSPPPPSSTTGDPGDPSPPSTRLRSRRDRTLEGPDSSGISQAfplRLMGEGRYQYWPFSSADLYNWKA 249
Cdd:PRK07764   676 AAPAAPPPAPAPAAPAAPAGAAPAQPAPAPAATPPAGQADDPAAQPPQAAQGAS---APSPAADDPVPLPPEPDDPPDPA 752
                          170
                   ....*....|.
gi 1868331368  250 HNPPFSQDPQA 260
Cdd:PRK07764   753 GAPAQPPPPPA 763
rnhA PRK00203
ribonuclease H; Reviewed
1215-1321 4.56e-04

ribonuclease H; Reviewed


Pssm-ID: 178927 [Multi-domain]  Cd Length: 150  Bit Score: 42.12  E-value: 4.56e-04
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1868331368 1215 PGTSAQRAELIALTQALKM-AEGRRVNIYTDSRY---AFaTAHVHGeiYRRRGLLTSAEKDIKNKteilELLQALflpRR 1290
Cdd:PRK00203    39 ALTTNNRMELMAAIEALEAlKEPCEVTLYTDSQYvrqGI-TEWIHG--WKKNGWKTADKKPVKNV----DLWQRL---DA 108
                           90       100       110
                   ....*....|....*....|....*....|....*..
gi 1868331368 1291 LSIIH------CPGHQkGNdpvaRGNRMADEEARKAA 1321
Cdd:PRK00203   109 ALKRHqikwhwVKGHA-GH----PENERCDELARAGA 140
PHA03247 PHA03247
large tegument protein UL36; Provisional
88-208 5.26e-04

large tegument protein UL36; Provisional


Pssm-ID: 223021 [Multi-domain]  Cd Length: 3151  Bit Score: 44.93  E-value: 5.26e-04
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1868331368   88 SLSRDPPPwvRPFLPPRLSGPRPTPLPSPAPliPTPSAPPSTSSLLPLTEPQSPNP-RPKAP-AVLPDTQEDLLLLDSPP 165
Cdd:PHA03247  2885 RLARPAVS--RSTESFALPPDQPERPPQPQA--PPPPQPQPQPPPPPQPQPPPPPPpRPQPPlAPTTDPAGAGEPSGAVP 2960
                           90       100       110       120
                   ....*....|....*....|....*....|....*....|...
gi 1868331368  166 PYPNAEQAASQLPAADQSPPPPSSTTGDPGDPSPPSTRLRSRR 208
Cdd:PHA03247  2961 QPWLGALVPGRVAVPRFRVPQPAPSREAPASSTPPLTGHSLSR 3003
PRK07764 PRK07764
DNA polymerase III subunits gamma and tau; Validated
102-214 6.21e-04

DNA polymerase III subunits gamma and tau; Validated


Pssm-ID: 236090 [Multi-domain]  Cd Length: 824  Bit Score: 44.21  E-value: 6.21e-04
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1868331368  102 PPRLSGPRPTPLPSPAPLIPTPSAPPST-SSLLPLTEPQSPNPRPKAPAVLPDTQEDLLLLD---SPPPYPNAEQAASQL 177
Cdd:PRK07764   398 APSAAAAAPAAAPAPAAAAPAAAAAPAPaAAPQPAPAPAPAPAPPSPAGNAPAGGAPSPPPAaapSAQPAPAPAAAPEPT 477
                           90       100       110
                   ....*....|....*....|....*....|....*..
gi 1868331368  178 PAADQSPPPPSSTTGDPGDPSPPSTRLRSRRDRTLEG 214
Cdd:PRK07764   478 AAPAPAPPAAPAPAAAPAAPAAPAAPAGADDAATLRE 514
PTZ00449 PTZ00449
104 kDa microneme/rhoptry antigen; Provisional
91-202 6.21e-04

104 kDa microneme/rhoptry antigen; Provisional


Pssm-ID: 185628 [Multi-domain]  Cd Length: 943  Bit Score: 44.30  E-value: 6.21e-04
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1868331368   91 RDPPPWVRPFLPPRLSGPRPTPLPSPaPLIPTPSAPPSTSSLL--PLTEPQSPNPRPKAPAVLPDTQEDLLLLDSPPPYP 168
Cdd:PTZ00449   634 KRPPPPQRPSSPERPEGPKIIKSPKP-PKSPKPPFDPKFKEKFydDYLDAAAKSKETKTTVVLDESFESILKETLPETPG 712
                           90       100       110
                   ....*....|....*....|....*....|....
gi 1868331368  169 NAEQAASQLPAadQSPPPPSSTTGDPGDPSPPST 202
Cdd:PTZ00449   713 TPFTTPRPLPP--KLPRDEEFPFEPIGDPDAEQP 744
PRK07764 PRK07764
DNA polymerase III subunits gamma and tau; Validated
65-221 6.71e-04

DNA polymerase III subunits gamma and tau; Validated


Pssm-ID: 236090 [Multi-domain]  Cd Length: 824  Bit Score: 44.21  E-value: 6.71e-04
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1868331368   65 KTRVFQPGPWGYPDQVPYIVTWESLSRDPPPWVRPflPPRLSGPRPTPLPSPAPLIPTPSAPPSTSSLLPLTEPQSPNPR 144
Cdd:PRK07764   654 PKHVAVPDASDGGDGWPAKAGGAAPAAPPPAPAPA--APAAPAGAAPAQPAPAPAATPPAGQADDPAAQPPQAAQGASAP 731
                           90       100       110       120       130       140       150
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*..
gi 1868331368  145 PKAPAVLPDTQEDLLLLDSPPPYPNAEQAASQLPAADQSPPPPSSTTGDPGDPSPPSTRlRSRRDRTLEGPDSSGIS 221
Cdd:PRK07764   732 SPAADDPVPLPPEPDDPPDPAGAPAQPPPPPAPAPAAAPAAAPPPSPPSEEEEMAEDDA-PSMDDEDRRDAEEVAME 807
PRK14951 PRK14951
DNA polymerase III subunits gamma and tau; Provisional
107-215 7.43e-04

DNA polymerase III subunits gamma and tau; Provisional


Pssm-ID: 237865 [Multi-domain]  Cd Length: 618  Bit Score: 43.93  E-value: 7.43e-04
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1868331368  107 GPRPTPLPSPAP---LIPTPSAPPSTSSLLPLTEPQSPNPRPK------APAVLPDTQEDLLLLDSPPPYPNAEQAASql 177
Cdd:PRK14951   384 PEAAAPAAAPVAqaaAAPAPAAAPAAAASAPAAPPAAAPPAPVaapaaaAPAAAPAAAPAAVALAPAPPAQAAPETVA-- 461
                           90       100       110
                   ....*....|....*....|....*....|....*...
gi 1868331368  178 PAADQSPPPPSSTTGDPGDPSPPSTRLRsrrdRTLEGP 215
Cdd:PRK14951   462 IPVRVAPEPAVASAAPAPAAAPAAARLT----PTEEGD 495
PRK12323 PRK12323
DNA polymerase III subunit gamma/tau;
93-199 1.14e-03

DNA polymerase III subunit gamma/tau;


Pssm-ID: 237057 [Multi-domain]  Cd Length: 700  Bit Score: 43.33  E-value: 1.14e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1868331368   93 PPPWVRPFLPPRLSGPRPTPLPSPAPLIPTPSAPPSTSSLLPLTEPQSPNPRPK-----APAVLPDTQEDLLLLDSPPPY 167
Cdd:PRK12323   400 AAPPAAPAAAPAAAAAARAVAAAPARRSPAPEALAAARQASARGPGGAPAPAPApaaapAAAARPAAAGPRPVAAAAAAA 479
                           90       100       110
                   ....*....|....*....|....*....|..
gi 1868331368  168 PNAEQAASQLPAADQSPPPPSSTTGDPGDPSP 199
Cdd:PRK12323   480 PARAAPAAAPAPADDDPPPWEELPPEFASPAP 511
Herpes_TAF50 pfam03326
Herpesvirus transcription activation factor (transactivator); This family includes EBV BRLF1 ...
94-215 1.21e-03

Herpesvirus transcription activation factor (transactivator); This family includes EBV BRLF1 and similar ORF 50 proteins from other herpesviruses.


Pssm-ID: 308764 [Multi-domain]  Cd Length: 568  Bit Score: 43.15  E-value: 1.21e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1868331368   94 PPWVRPFLPPrlsGPRPTPLPSPAPLIPTPSAPPST--SSLLPLTEPQSPNPRPKAPAVLPDTQEDlLLLDSPPPYpnaE 171
Cdd:pfam03326  421 PKRIRALHPP---GSPSANRPLPSSLAPTPTGPVHEpgSSLTPATVPQPLDAAPVATPEASHELQP-PDEETPQPL---D 493
                           90       100       110       120
                   ....*....|....*....|....*....|....*....|....
gi 1868331368  172 QAASQLPAADQSPPPPSSTTGDPGDPSPPSTrlrsrRDRTLEGP 215
Cdd:pfam03326  494 EDQALCGQQDASHPPPRGQLDELTTTLESMT-----EDLNLDSP 532
PRK12323 PRK12323
DNA polymerase III subunit gamma/tau;
90-244 1.47e-03

DNA polymerase III subunit gamma/tau;


Pssm-ID: 237057 [Multi-domain]  Cd Length: 700  Bit Score: 42.94  E-value: 1.47e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1868331368   90 SRDPPPWVRPFLPPRLSGPRPTPLPSPAPLIPTPSAPPSTSSLLPLTEPQSPNPRPKAPAV-----LPDTQEDLLLLDSP 164
Cdd:PRK12323   436 ARQASARGPGGAPAPAPAPAAAPAAAARPAAAGPRPVAAAAAAAPARAAPAAAPAPADDDPppweeLPPEFASPAPAQPD 515
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1868331368  165 PPYPNAEQAASQLPAADQSPPPPSSTTGDPGDPSPPSTRLRSRRDRTLEGPD--SSGISQAFPlrlmgegryQYWPFSSA 242
Cdd:PRK12323   516 AAPAGWVAESIPDPATADPDDAFETLAPAPAAAPAPRAAAATEPVVAPRPPRasASGLPDMFD---------GDWPALAA 586

                   ..
gi 1868331368  243 DL 244
Cdd:PRK12323   587 RL 588
PHA03247 PHA03247
large tegument protein UL36; Provisional
103-194 1.94e-03

large tegument protein UL36; Provisional


Pssm-ID: 223021 [Multi-domain]  Cd Length: 3151  Bit Score: 43.00  E-value: 1.94e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1868331368  103 PRLSGPRPTPLPSPAPLIPTPSAPPSTSSLLPLTEPQSPNPRPKAPAVLPDTQEdlllldSPPPYPNAEQAASQLPAADQ 182
Cdd:PHA03247   405 TRPAAPVPASVPTPAPTPVPASAPPPPATPLPSAEPGSDDGPAPPPERQPPAPA------TEPAPDDPDDATRKALDALR 478
                           90
                   ....*....|..
gi 1868331368  183 SPPPPSSTTGDP 194
Cdd:PHA03247   479 ERRPPEPPGADL 490
flhF PRK06995
flagellar biosynthesis protein FlhF;
110-231 2.19e-03

flagellar biosynthesis protein FlhF;


Pssm-ID: 235904 [Multi-domain]  Cd Length: 484  Bit Score: 42.26  E-value: 2.19e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1868331368  110 PTPLPSPAPLIPTPSAPPSTSSlLPLTEPQSPNPRPKAPAVLPDTQEDLLLLDS--PPPYPNAEQAASQLPAADQSPPPP 187
Cdd:PRK06995    53 PPAAAAPAAAQPPPAAAPAAVS-RPAAPAAEPAPWLVEHAKRLTAQREQLVARAaaPAAPEAQAPAAPAERAAAENAARR 131
                           90       100       110       120
                   ....*....|....*....|....*....|....*....|....
gi 1868331368  188 SSTTgdpgDPSPPSTRLRSRRDRTLEGPDSSGISQAFPLRLMGE 231
Cdd:PRK06995   132 LARA----AAAAPRPRVPADAAAAVADAVKARIERIVNDTVMQE 171
HIV_retropepsin_like cd05482
Retropepsins, pepsin-like aspartate proteases; This is a subfamily of retropepsins. The family ...
552-633 2.31e-03

Retropepsins, pepsin-like aspartate proteases; This is a subfamily of retropepsins. The family includes pepsin-like aspartate proteases from retroviruses, retrotransposons and retroelements. While fungal and mammalian pepsins are bilobal proteins with structurally related N- and C-termini, retropepsins are half as long as their fungal and mammalian counterparts. The monomers are structurally related to one lobe of the pepsin molecule and retropepsins function as homodimers. The active site aspartate occurs within a motif (Asp-Thr/Ser-Gly), as it does in pepsin. Retroviral aspartyl protease is synthesized as part of the POL polyprotein that contains an aspartyl protease, a reverse transcriptase, RNase H, and an integrase. The POL polyprotein undergoes specific enzymatic cleavage to yield the mature proteins. In aspartate peptidases, Asp residues are ligands of an activated water molecule in all examples where catalytic residues have been identified. This group of aspartate peptidases is classified by MEROPS as the peptidase family A2 (retropepsin family, clan AA), subfamily A2A.


Pssm-ID: 133149  Cd Length: 87  Bit Score: 38.40  E-value: 2.31e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1868331368  552 VTLNVGGQPVTFLVDTGAQHSVLTQTQGPLSTRTAW----VQGATGGKLHRWTTerKVHL---STGHVTHSFLLVPECPY 624
Cdd:cd05482      1 LTLYINGKLFEGLLDTGADVSIIAENDWPKNWPIQPapsnLTGIGGAITPSQSS--VLLLeidGEGHLGTILVYVLSLPV 78

                   ....*....
gi 1868331368  625 PLLGRDLLS 633
Cdd:cd05482     79 NLWGRDILS 87
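The retropepsin description above notes that the active-site aspartate occurs within an Asp-Thr/Ser-Gly motif. As a rough illustration, the sketch below scans the first, ungapped stretch of the query segment from this alignment (residues 552 onward) for that motif and reports its position in query numbering. The helper name `find_motif` is hypothetical, and a motif match by itself is not evidence of catalytic activity.

```python
import re

# First 24 query residues of the 552-633 interval, copied from the
# alignment on this page; this stretch contains no gap characters.
QUERY_START = 552
QUERY_SEGMENT = "VTLNVGGQPVTFLVDTGAQHSVLT"

# Asp-Thr/Ser-Gly, as stated in the cd05482 description.
MOTIF = re.compile(r"D[TS]G")


def find_motif(segment, start):
    """Return (query_position, matched_residues) for each motif occurrence.

    `start` is the 1-based query coordinate of the first residue in
    `segment`; the segment must be ungapped for the numbering to hold.
    """
    return [(start + m.start(), m.group()) for m in MOTIF.finditer(segment)]


hits = find_motif(QUERY_SEGMENT, QUERY_START)
print(hits)  # -> [(566, 'DTG')]
```

The match falls in the "LVDTGA" stretch that aligns with "LLDTGA" in cd05482 and with "LVDTGA" in the pfam13650 hit covering residues 552-575.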
PRK10263 PRK10263
DNA translocase FtsK; Provisional
68-208 2.39e-03

DNA translocase FtsK; Provisional


Pssm-ID: 236669 [Multi-domain]  Cd Length: 1355  Bit Score: 42.76  E-value: 2.39e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1868331368   68 VFQPGPWGYPDQVPYIVTWESLSRdppPWVRPFLPPRlsgPRPTPLPSPAPLIPTPSAPPSTSSLLPLTEPQSPNPRPKA 147
Cdd:PRK10263   372 VIAPAPEGYPQQSQYAQPAVQYNE---PLQQPVQPQQ---PYYAPAAEQPAQQPYYAPAPEQPAQQPYYAPAPEQPVAGN 445
                           90       100       110       120       130       140
                   ....*....|....*....|....*....|....*....|....*....|....*....|..
gi 1868331368  148 PAVLPDTQEdlllLDSPPPYPNAEQAASQ-LPAADQSPPPPSSTTGDPGDPSPPSTRLRSRR 208
Cdd:PRK10263   446 AWQAEEQQS----TFAPQSTYQTEQTYQQpAAQEPLYQQPQPVEQQPVVEPEPVVEETKPAR 503
PLN02983 PLN02983
biotin carboxyl carrier protein of acetyl-CoA carboxylase
93-143 2.60e-03

biotin carboxyl carrier protein of acetyl-CoA carboxylase


Pssm-ID: 215533 [Multi-domain]  Cd Length: 274  Bit Score: 41.36  E-value: 2.60e-03
                           10        20        30        40        50
                   ....*....|....*....|....*....|....*....|....*....|....*...
gi 1868331368   93 PPPWVRPFLPPRLSGPRPTPLPSPAPLIPTPSAPPST-SSLLPLTEP------QSPNP 143
Cdd:PLN02983   157 PPHAMPPASPPAAQPAPSAPASSPPPTPASPPPAKAPkSSHPPLKSPmagtfyRSPAP 214
PHA03247 PHA03247
large tegument protein UL36; Provisional
102-208 2.64e-03

large tegument protein UL36; Provisional


Pssm-ID: 223021 [Multi-domain]  Cd Length: 3151  Bit Score: 42.62  E-value: 2.64e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1868331368  102 PPRLSGPRPTPLPSPAPLIPTPSAPPSTS------SLLPLTEPQSPNPRPKAPAVLPDTQED---LLLLDSPPPYPNAEQ 172
Cdd:PHA03247   268 APETARGATGPPPPPEAAAPNGAAAPPDGvwgaalAGAPLALPAPPDPPPPAPAGDAEEEDDedgAMEVVSPLPRPRQHY 347
                           90       100       110       120
                   ....*....|....*....|....*....|....*....|..
gi 1868331368  173 AASQLPAADQSPPPPSS----TTGD--PGDPSPPSTRLRSRR 208
Cdd:PHA03247   348 PLGFPKRRRPTWTPPSSledlSAGRhhPKRASLPTRKRRSAR 389
PHA03378 PHA03378
EBNA-3B; Provisional
71-205 2.80e-03

EBNA-3B; Provisional


Pssm-ID: 223065 [Multi-domain]  Cd Length: 991  Bit Score: 42.36  E-value: 2.80e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1868331368   71 PGPWGYPDQVPYIVTWESL---SRDPPPWVRPFLPPRLSGPRPTPLPSPAPLIPTPSAPPSTSSLLPLTEPQSP-----N 142
Cdd:PHA03378   596 PWPVPHPSQTPEPPTTQSHipeTSAPRQWPMPLRPIPMRPLRMQPITFNVLVFPTPHQPPQVEITPYKPTWTQIghipyQ 675
                           90       100       110       120       130       140       150
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1868331368  143 PRPKAPAVL------PDTQEDLLLLDSPPPYPNAEQAASQLPAADQSP-PPPSSTTGDPGDPSPPSTRLR 205
Cdd:PHA03378   676 PSPTGANTMlpiqwaPGTMQPPPRAPTPMRPPAAPPGRAQRPAAATGRaRPPAAAPGRARPPAAAPGRAR 745
PTZ00441 PTZ00441
sporozoite surface protein 2 (SSP2); Provisional
108-201 3.10e-03

sporozoite surface protein 2 (SSP2); Provisional


Pssm-ID: 240420 [Multi-domain]  Cd Length: 576  Bit Score: 41.87  E-value: 3.10e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1868331368  108 PRPTPLPSPAPLIPTPsappstssllpltEPQSPNPRPKAPAVlPDTQEDLLLLDSP-PPYPNAEQAASQLPAADQSPPP 186
Cdd:PTZ00441   284 VEPEPLPVPAPVPPTP-------------EDDNPRPTDDEFAV-PNFNEGLDVPDNPqDPVPPPNEGKDGNPNEENLFPP 349
                           90
                   ....*....|....*..
gi 1868331368  187 PSSTTGD--PGDPSPPS 201
Cdd:PTZ00441   350 GDDEVPDesNVPPNPPN 366
SOBP pfam15279
Sine oculis-binding protein; SOBP is associated with syndromic and nonsyndromic intellectual ...
97-203 3.18e-03

Sine oculis-binding protein; SOBP is associated with syndromic and nonsyndromic intellectual disability. It carries a zinc-finger of the zf-C2H2 type at the N-terminus, and a highly characteristic C-terminal PhPhPhPhPhPh motif. The deduced 873-amino acid protein contains an N-terminal nuclear localization signal (NLS), followed by 2 FCS-type zinc finger motifs, a proline-rich region (PR1), a putative RNA-binding motif region, and a C-terminal NLS embedded in a second proline-rich motif. SOBP is expressed in various human tissues, including developing mouse brain at embryonic day 14. In postnatal and adult mouse brain SOBP is expressed in all neurons, with intense staining in the limbic system. Highest expression is in layer V cortical neurons, hippocampus, pyriform cortex, dorsomedial nucleus of thalamus, amygdala, and hypothalamus. Postnatal expression of SOBP in the limbic system corresponds to a time of active synaptogenesis. The family is also referred to as Jackson circler, JXC1. In seven affected siblings from a consanguineous Israeli Arab family with mental retardation, anterior maxillary protrusion, and strabismus, mutations were found in this protein.


Pssm-ID: 464609 [Multi-domain]  Cd Length: 325  Bit Score: 41.34  E-value: 3.18e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1868331368   97 VRPFLPPRLSGPRPTPLPS--------PAPLIPTPSAPPSTSSLLPLTEPQSPNP----RPKAPAVLPDTQEDLLLLDSP 164
Cdd:pfam15279  170 PRGLLGKPQQHPPPSPLPAfmepssmpPPFLRPPPSIPQPNSPLSNPMLPGIGPPpkppRNLGPPSNPMHRPPFSPHHPP 249
                           90       100       110
                   ....*....|....*....|....*....|....*....
gi 1868331368  165 PPYPNAEqaasqlPAADQSPPPPSSTTGDPGDPSPPSTR 203
Cdd:pfam15279  250 PPPTPPG------PPPGLPPPPPRGFTPPFGPPFPPVNM 282
PRK10819 PRK10819
transport protein TonB; Provisional
77-236 3.90e-03

transport protein TonB; Provisional


Pssm-ID: 236768 [Multi-domain]  Cd Length: 246  Bit Score: 40.82  E-value: 3.90e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1868331368   77 PDQvPYIVTWESLSRDPPPWVRPFLPPRLSGPRPTPLPSPAPLIPTPSAPPstsslLPLTEPQ-SPNPRPKAPAVLPDTQ 155
Cdd:PRK10819    44 PAQ-PISVTMVAPADLEPPQAVQPPPEPVVEPEPEPEPIPEPPKEAPVVIP-----KPEPKPKpKPKPKPKPVKKVEEQP 117
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1868331368  156 EDllllDSPPPYPNAEQAASQLPAADQSPPPPSSTTGDPGDPSPPSTRLRSRrdrtlegPDSSGISQAFPLRLMGEGRYQ 235
Cdd:PRK10819   118 KR----EVKPVEPRPASPFENTAPARPTSSTATAAASKPVTSVSSGPRALSR-------NQPQYPARAQALRIEGQVKVK 186

                   .
gi 1868331368  236 Y 236
Cdd:PRK10819   187 F 187
Drf_FH1 pfam06346
Formin Homology Region 1; This region is found in some of the Diaphanous related formins (Drfs) ...
93-200 4.03e-03

Formin Homology Region 1; This region is found in some of the Diaphanous related formins (Drfs). It consists of low complexity repeats of around 12 residues.


Pssm-ID: 461881 [Multi-domain]  Cd Length: 157  Bit Score: 39.47  E-value: 4.03e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1868331368   93 PPPWVRPFLPPRLsgPRPTPLPSPAPL-----IPTPSAPPSTSSLLPltepqsPNPRPKAPAVLPDTQEDLLLLDSPPPY 167
Cdd:pfam06346   27 LPGGGGPPPPPPL--PGSAAIPPPPPLpggtsIPPPPPLPGAASIPP------PPPLPGSTGIPPPPPLPGGAGIPPPPP 98
                           90       100       110
                   ....*....|....*....|....*....|...
gi 1868331368  168 PnaeqaasqLPAADQSPPPPSSTTGDPGDPSPP 200
Cdd:pfam06346   99 P--------LPGGAGVPPPPPPLPGGPGIPPPP 123
PLN02983 PLN02983
biotin carboxyl carrier protein of acetyl-CoA carboxylase
87-148 5.46e-03

biotin carboxyl carrier protein of acetyl-CoA carboxylase


Pssm-ID: 215533 [Multi-domain]  Cd Length: 274  Bit Score: 40.59  E-value: 5.46e-03
                           10        20        30        40        50        60
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*
gi 1868331368   87 ESLSRDPPPW-VRPFLPPRLSGPRPTPLP--SPAPLIPTPSAPPSTSSLLPLTEPQSPNPRPKAP 148
Cdd:PLN02983   139 EALPQPPPPApVVMMQPPPPHAMPPASPPaaQPAPSAPASSPPPTPASPPPAKAPKSSHPPLKSP 203
PHA03247 PHA03247
large tegument protein UL36; Provisional
91-224 5.78e-03

large tegument protein UL36; Provisional


Pssm-ID: 223021 [Multi-domain]  Cd Length: 3151  Bit Score: 41.46  E-value: 5.78e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1868331368   91 RDPPPWVRPFLPPRLSGPRPTPLPSPAPLIPTPSAPPSTSSLLPLTE-PQSPNPRPKAPAVLPDTQEDLLLLDSPPPYPN 169
Cdd:PHA03247   354 RRRPTWTPPSSLEDLSAGRHHPKRASLPTRKRRSARHAATPFARGPGgDDQTRPAAPVPASVPTPAPTPVPASAPPPPAT 433
                           90       100       110       120       130       140
                   ....*....|....*....|....*....|....*....|....*....|....*....|...
gi 1868331368  170 AEQAASqlPAADQSPPPPSS--TTGDPGDPSPPSTRLRSR------RDRTLEGPDSSGISQAF 224
Cdd:PHA03247   434 PLPSAE--PGSDDGPAPPPErqPPAPATEPAPDDPDDATRkaldalRERRPPEPPGADLAELL 494
Atrophin-1 pfam03154
Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian ...
99-201 6.00e-03

Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian atrophy (DRPLA) gene. DRPLA (OMIM:125370) is a progressive neurodegenerative disorder. It is caused by the expansion of a CAG repeat in the DRPLA gene on chromosome 12p. This results in an extended polyglutamine region in atrophin-1 that is thought to confer toxicity to the protein, possibly through altering its interactions with other proteins. The expansion of a CAG repeat is also the underlying defect in six other neurodegenerative disorders, including Huntington's disease. One interaction of expanded polyglutamine repeats that is thought to be pathogenic is that with the short glutamine repeat in the transcriptional coactivator CREB binding protein, CBP. This interaction draws CBP away from its usual nuclear location to the expanded polyglutamine repeat protein aggregates that are characteristic of the polyglutamine neurodegenerative disorders. This interferes with CBP-mediated transcription and causes cytotoxicity.


Pssm-ID: 460830 [Multi-domain]  Cd Length: 991  Bit Score: 41.29  E-value: 6.00e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1868331368   99 PFLPPRLSGPRPTPLPSPAPlipTPSAPPSTSSLLPLTEPQSPNPRPKA-PAVLPDTQEDLLLLDSPPPYPNAEQAASQL 177
Cdd:pfam03154  180 AASPPSPPPPGTTQAATAGP---TPSAPSVPPQGSPATSQPPNQTQSTAaPHTLIQQTPTLHPQRLPSPHPPLQPMTQPP 256
                           90       100
                   ....*....|....*....|....*
gi 1868331368  178 PAADQSPPP-PSSTTGDPGDPSPPS 201
Cdd:pfam03154  257 PPSQVSPQPlPQPSLHGQMPPMPHS 281
PRK07994 PRK07994
DNA polymerase III subunits gamma and tau; Validated
104-262 6.08e-03

DNA polymerase III subunits gamma and tau; Validated


Pssm-ID: 236138 [Multi-domain]  Cd Length: 647  Bit Score: 41.00  E-value: 6.08e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1868331368  104 RLSGPRPTPLPSPAPLIPTPSAPPSTSSLLPLTEPQsPNPRPKAPAVLPDTQEDLLlldSPPPYPNAEQAASQLPAADQS 183
Cdd:PRK07994   360 HPAAPLPEPEVPPQSAAPAASAQATAAPTAAVAPPQ-APAVPPPPASAPQQAPAVP---LPETTSQLLAARQQLQRAQGA 435
                           90       100       110       120       130       140       150
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....
gi 1868331368  184 PPPPSSTTGDPGDPSPPSTRLrsrrDRTLEGPDSSGISQAFPLrlmgegryqywpfsSADLYNWKAHNPPFSQDPQALT 262
Cdd:PRK07994   436 TKAKKSEPAAASRARPVNSAL----ERLASVRPAPSALEKAPA--------------KKEAYRWKATNPVEVKKEPVAT 496
PHA03247 PHA03247
large tegument protein UL36; Provisional
93-225 6.29e-03

large tegument protein UL36; Provisional


Pssm-ID: 223021 [Multi-domain]  Cd Length: 3151  Bit Score: 41.46  E-value: 6.29e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1868331368   93 PPPWVRPFLPPRLSG-PRPTPLPSPAPLIPTPSAPPSTSSLLPLTEPQSPNPRPKAPAVLPDTQEdlllldSPPPYPNAE 171
Cdd:PHA03247  2592 PPQSARPRAPVDDRGdPRGPAPPSPLPPDTHAPDPPPPSPSPAANEPDPHPPPTVPPPERPRDDP------APGRVSRPR 2665
                           90       100       110       120       130       140
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*...
gi 1868331368  172 QAASQLPAADQSPPP-------------PSSTTGDPGDP-SPPSTRLRSRRDRTLEGPDSSGISQAFP 225
Cdd:PHA03247  2666 RARRLGRAAQASSPPqrprrraarptvgSLTSLADPPPPpPTPEPAPHALVSATPLPPGPAAARQASP 2733
PHA03321 PHA03321
tegument protein VP11/12; Provisional
90-209 6.83e-03

tegument protein VP11/12; Provisional


Pssm-ID: 223041 [Multi-domain]  Cd Length: 694  Bit Score: 41.10  E-value: 6.83e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1868331368   90 SRDPPPwvrpflPPRlsgPRPTPLPSPA-------PLIPTPSAPPSTSSLLPLTEPQSPNPRPKAPavlPDTQEDLLLLD 162
Cdd:PHA03321   439 DNDPPP------PPR---ARPGSTPACArraraqrARDAGPEYVDPLGALRRLPAGAAPPPEPAAA---PSPATYYTRMG 506
                           90       100       110       120
                   ....*....|....*....|....*....|....*....|....*..
gi 1868331368  163 SPPPYPNAEQAASQLPAADQSPPPPSSTTGDPGDPSPPSTRLRSRRD 209
Cdd:PHA03321   507 GGPPRLPPRNRATETLRPDWGPPAAAPPEQMEDPYLEPDDDRFDRRD 553
Atrophin-1 pfam03154
Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian ...
71-224 6.86e-03

Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian atrophy (DRPLA) gene. DRPLA (OMIM:125370) is a progressive neurodegenerative disorder. It is caused by the expansion of a CAG repeat in the DRPLA gene on chromosome 12p. This results in an extended polyglutamine region in atrophin-1 that is thought to confer toxicity to the protein, possibly through altering its interactions with other proteins. The expansion of a CAG repeat is also the underlying defect in six other neurodegenerative disorders, including Huntington's disease. One interaction of expanded polyglutamine repeats that is thought to be pathogenic is that with the short glutamine repeat in the transcriptional coactivator CREB binding protein, CBP. This interaction draws CBP away from its usual nuclear location to the expanded polyglutamine repeat protein aggregates that are characteristic of the polyglutamine neurodegenerative disorders. This interferes with CBP-mediated transcription and causes cytotoxicity.


Pssm-ID: 460830 [Multi-domain]  Cd Length: 991  Bit Score: 40.91  E-value: 6.86e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1868331368   71 PGPWGYPDQVP---YIVTWESLSRDPPPWVRPflPPRLSGPRPTPLPSPAPLIP----TPSAPPSTSSLLPLTE-----P 138
Cdd:pfam03154  381 PSPFQMNSNLPpppALKPLSSLSTHHPPSAHP--PPLQLMPQSQQLPPPPAQPPvltqSQSLPPPAASHPPTSGlhqvpS 458
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1868331368  139 QSPNPR---------PKAPAVLPDTQEDLLLLDSPPPYPNAEQAASQLPAADQSPPPPSSTTGDPGDPS------PPSTR 203
Cdd:pfam03154  459 QSPFPQhpfvpggppPITPPSGPPTSTSSAMPGIQPPSSASVSSSGPVPAAVSCPLPPVQIKEEALDEAeepespPPPPR 538
                          170       180
                   ....*....|....*....|.
gi 1868331368  204 LRSRRDRTLEGPDSSGISQAF 224
Cdd:pfam03154  539 SPSPEPTVVNTPSHASQSARF 559
Atrophin-1 pfam03154
Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian ...
88-258 6.86e-03

Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian atrophy (DRPLA) gene. DRPLA (OMIM:125370) is a progressive neurodegenerative disorder. It is caused by the expansion of a CAG repeat in the DRPLA gene on chromosome 12p. This results in an extended polyglutamine region in atrophin-1 that is thought to confer toxicity to the protein, possibly through altering its interactions with other proteins. The expansion of a CAG repeat is also the underlying defect in six other neurodegenerative disorders, including Huntington's disease. One interaction of expanded polyglutamine repeats that is thought to be pathogenic is that with the short glutamine repeat in the transcriptional coactivator CREB binding protein, CBP. This interaction draws CBP away from its usual nuclear location to the expanded polyglutamine repeat protein aggregates that are characteristic of the polyglutamine neurodegenerative disorders. This interferes with CBP-mediated transcription and causes cytotoxicity.


Pssm-ID: 460830 [Multi-domain]  Cd Length: 991  Bit Score: 40.91  E-value: 6.86e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1868331368   88 SLSRDPPPWVRPfLPPRLSGPRPTPLPSPAPLIPTPSAPPSTSSL--LPLTEPQSPNPRPKAPAVLPDTQEDLLLLDSPP 165
Cdd:pfam03154  281 SLQTGPSHMQHP-VPPQPFPLTPQSSQSQVPPGPSPAAPGQSQQRihTPPSQSQLQSQQPPREQPLPPAPLSMPHIKPPP 359
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1868331368  166 PYPnaeqaASQLPAADQSPPPPSSTTGDPGD-PS--PPSTRLRSRRDRTLEGPDSsgiSQAFPLRLMGEGR-YQYWPFSS 241
Cdd:pfam03154  360 TTP-----IPQLPNPQSHKHPPHLSGPSPFQmNSnlPPPPALKPLSSLSTHHPPS---AHPPPLQLMPQSQqLPPPPAQP 431
                          170
                   ....*....|....*..
gi 1868331368  242 ADLYNWKAHNPPFSQDP 258
Cdd:pfam03154  432 PVLTQSQSLPPPAASHP 448
retropepsin_like_LTR_2 cd05484
Retropepsins_like_LTR, pepsin-like aspartate proteases; Retropepsin of retrotransposons with ...
552-632 7.03e-03

Retropepsins_like_LTR, pepsin-like aspartate proteases; Retropepsin of retrotransposons with long terminal repeats are pepsin-like aspartate proteases. While fungal and mammalian pepsins are bilobal proteins with structurally related N- and C-termini, retropepsins are half as long as their fungal and mammalian counterparts. The monomers are structurally related to one lobe of the pepsin molecule and retropepsins function as homodimers. The active site aspartate occurs within a motif (Asp-Thr/Ser-Gly), as it does in pepsin. Retroviral aspartyl protease is synthesized as part of the POL polyprotein that contains an aspartyl protease, a reverse transcriptase, RNase H, and an integrase. The POL polyprotein undergoes specific enzymatic cleavage to yield the mature proteins. In aspartate peptidases, Asp residues are ligands of an activated water molecule in all examples where catalytic residues have been identified. This group of aspartate peptidases is classified by MEROPS as the peptidase family A2 (retropepsin family, clan AA), subfamily A2A.


Pssm-ID: 133151  Cd Length: 91  Bit Score: 37.19  E-value: 7.03e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1868331368  552 VTLNVGGQPVTFLVDTGAQHSVLTQT------QGPLSTRTAWVQGATGGKLH---RWTTERKVHLSTGHVThsfLLVPEC 622
Cdd:cd05484      3 VTLLVNGKPLKFQLDTGSAITVISEKtwrklgSPPLKPTKKRLRTATGTKLSvlgQILVTVKYGGKTKVLT---LYVVKN 79
                           90
                   ....*....|.
gi 1868331368  623 PYP-LLGRDLL 632
Cdd:cd05484     80 EGLnLLGRDWL 90
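The cd05484 description above says the active-site aspartate sits within an Asp-Thr/Ser-Gly motif. As a rough illustration, the sketch below scans the query rows of this alignment for that motif and reports the query residue number of each match. The helper name `find_motif` and the constants are hypothetical, and lowercase letters in the alignment are simply treated as ordinary residues, which is an assumption about the report's notation.

```python
import re

# Query rows copied from the cd05484 alignment above (gaps and case preserved).
QUERY_ROWS = [
    "VTLNVGGQPVTFLVDTGAQHSVLTQT------QGPLSTRTAWVQGATGGKLH---RWTTERKVHLSTGHVThsfLLVPEC",
    "PYP-LLGRDLL",
]
QUERY_START = 552  # first query residue of the hit interval 552-632


def find_motif(rows, start, pattern=r"D[TS]G"):
    """Return (query_position, matched_text) for each motif occurrence.

    Gaps ('-') are removed and the sequence is upper-cased before matching,
    so positions are 1-based residue numbers in the query protein.
    """
    ungapped = "".join(rows).replace("-", "").upper()
    return [(start + m.start(), m.group()) for m in re.finditer(pattern, ungapped)]


if __name__ == "__main__":
    for position, motif in find_motif(QUERY_ROWS, QUERY_START):
        print(f"{motif} at query residue {position}")
```

On these rows the scan finds one occurrence, DTG starting at query residue 566. A sequence match alone does not show that the site is functional, particularly for a record flagged as a low-quality protein.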
PHA03379 PHA03379
EBNA-3A; Provisional
89-210 9.26e-03

EBNA-3A; Provisional


Pssm-ID: 223066 [Multi-domain]  Cd Length: 935  Bit Score: 40.43  E-value: 9.26e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1868331368   89 LSRDPPPWVRP--FLPPRLSGPRP--TPLPSPAPLIPTPSAPPSTSSLLPLTEPQSPNPRPKAPAVLPDTQEDLLLLDSP 164
Cdd:PHA03379   470 LPPGPLQDLEPgdQLPGVVQDGRPacAPVPAPAGPIVRPWEASLSQVPGVAFAPVMPQPMPVEPVPVPTVALERPVCPAP 549
                           90       100       110       120       130
                   ....*....|....*....|....*....|....*....|....*....|...
gi 1868331368  165 PpypnaeQAASQLPA-------ADQSPPPPSSTTGDPGDPSPPSTRLRSRRDR 210
Cdd:PHA03379   550 P------LIAMQGPGetsgivrVRERWRPAPWTPNPPRSPSQMSVRDRLARLR 596
PRK07003 PRK07003
DNA polymerase III subunit gamma/tau;
90-206 9.60e-03

DNA polymerase III subunit gamma/tau;


Pssm-ID: 235906 [Multi-domain]  Cd Length: 830  Bit Score: 40.60  E-value: 9.60e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1868331368   90 SRDPPPWVRPFLPP-RLSGPRPTPLPSPAPLIPTPSAPPSTSSLLPLTEPQSpNPRPKApAVLPDTQEDLLLLDSPPPYP 168
Cdd:PRK07003   421 TRAEAPPAAPAPPAtADRGDDAADGDAPVPAKANARASADSRCDERDAQPPA-DSGSAS-APASDAPPDAAFEPAPRAAA 498
                           90       100       110
                   ....*....|....*....|....*....|....*...
gi 1868331368  169 NAEQAASqlPAADQSPPPPSSTTGDPGDPSPPSTRLRS 206
Cdd:PRK07003   499 PSAATPA--AVPDARAPAAASREDAPAAAAPPAPEARP 534
 
Blast search parameters
Data Source: Precalculated data, version = cdd.v.3.21
Preset Options:
  Database: CDSEARCH/cdd
  Low complexity filter: no
  Composition Based Adjustment: yes
  E-value threshold: 0.01
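Each hit in this report carries a summary line of the form `Pssm-ID: ... Cd Length: ... Bit Score: ... E-value: ...`, and the search used an E-value threshold of 0.01. The sketch below shows one way such lines might be parsed and checked against that threshold. The regular expression and the function names are illustrative assumptions based only on the lines shown in this report, not an official CD-Search format specification.

```python
import re

# Pattern inferred from the summary lines in this report (an assumption).
SUMMARY_RE = re.compile(
    r"Pssm-ID:\s*(?P<pssm_id>\d+)"
    r"(?:\s*\[(?P<flag>[^\]]+)\])?"
    r"\s+Cd Length:\s*(?P<cd_length>\d+)"
    r"\s+Bit Score:\s*(?P<bit_score>[\d.]+)"
    r"\s+E-value:\s*(?P<e_value>[\d.eE+-]+)"
)


def parse_summary(line):
    """Parse one hit summary line into a dict, or return None if it does not match."""
    m = SUMMARY_RE.search(line)
    if m is None:
        return None
    return {
        "pssm_id": int(m.group("pssm_id")),
        "multi_domain": m.group("flag") == "Multi-domain",
        "cd_length": int(m.group("cd_length")),
        "bit_score": float(m.group("bit_score")),
        "e_value": float(m.group("e_value")),
    }


def passes_threshold(hit, threshold=0.01):
    """True when the hit's E-value is at or below the reporting threshold."""
    return hit["e_value"] <= threshold


if __name__ == "__main__":
    lines = [
        "Pssm-ID: 133151  Cd Length: 91  Bit Score: 37.19  E-value: 7.03e-03",
        "Pssm-ID: 235906 [Multi-domain]  Cd Length: 830  Bit Score: 40.60  E-value: 9.60e-03",
    ]
    for line in lines:
        hit = parse_summary(line)
        print(hit, passes_threshold(hit))
```

Both sample lines come from hits listed above. Their E-values (7.03e-03 and 9.60e-03) fall under the 0.01 cutoff, which is consistent with their appearing in the report.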

References:

  • Wang J et al. (2023), "The conserved domain database in 2023", Nucleic Acids Res. 51(D1):D384-D388.
  • Lu S et al. (2020), "The conserved domain database in 2020", Nucleic Acids Res. 48(D1):D265-D268.
  • Marchler-Bauer A et al. (2017), "CDD/SPARCLE: functional classification of proteins via subfamily domain architectures", Nucleic Acids Res. 45(D1):D200-D203.