Conserved domains on  [gi|126723547|ref|NP_079357|]

protein NYNRIN [Homo sapiens]


List of domain hits

Name Accession Description Interval E-value
PIN_N4BP1-like cd18728
PRORP-like PIN domain of NEDD4 binding protein 1 and related proteins; NEDD4-binding partner-1 ...
795-919 2.46e-74

PRORP-like PIN domain of NEDD4 binding protein 1 and related proteins; NEDD4-binding partner-1 (N4BP1) interacts with and is a substrate of NEDD4 ubiquitin ligase (neural precursor cell expressed, developmentally down-regulated 4, E3 ubiquitin protein ligase). It is also an inhibitor of the E3 ubiquitin-protein ligase ITCH, a NEDD4 structurally related E3. This subfamily additionally includes NYNRIN (NYN domain and retroviral integrase containing, also known as CGIN1/Cousin of GIN1), and KHNYN (KH and NYN domain containing) protein. N4BP1, CGIN1, and KHNYN proteins are probably of retroviral origin. This subfamily belongs to the Zc3h12a-N4BP1-like PIN subfamily of the PRORP-Zc3h12a-like PIN family, the latter of which additionally includes human PRORP, also known as proteinaceous RNase P and mitochondrial RNase P protein subunit 3 (MRPP3), and Arabidopsis thaliana PRORP1-3. The PIN (PilT N terminus) domain belongs to a large nuclease superfamily. The structure of the PIN domain points to an active center of three highly conserved catalytic residues that coordinate metal ions; some members have additional metal-coordinating residues, while others lack several of these key catalytic residues. The PIN active site is geometrically similar to the active centers of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons.



Pssm-ID: 350295  Cd Length: 127  Bit Score: 242.79  E-value: 2.46e-74
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 126723547  795 VVIDGSSVAMVHGLQHFFSCRGIAMAVQFFWNRGHREVTVFVPTWQLKKNRRVRESHFLTKLHSLKMLSITPSQLENGKK 874
Cdd:cd18728     1 IVIDGSNVAMVHGLQHFFSCRGIAIAVEYFWKRGHRNITVFVPQWRTKRDPNVTEQHFLTQLQELGILSLTPSRMVLGKR 80
                          90       100       110       120
                  ....*....|....*....|....*....|....*....|....*..
gi 126723547  875 ITTYDYRFMVKLAEETDGIIVTNEQIHILMNSS--KKLMVKDRLLPF 919
Cdd:cd18728    81 IASHDDRFLLHLAEKTGGIIVTNDNFREFVNESpsWREIIKERLLQY 127
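
Each hit carries a one-line summary (Pssm-ID, Cd Length, Bit Score, E-value) ahead of its alignment. The sketch below is one possible way to pull those fields out of such a line and to work out how much of the domain model an alignment spans. The function names `parse_summary` and `model_coverage` are illustrative, not part of any NCBI tool, and the regular expression assumes the field order seen on this page.

```python
import re

# Pattern for the per-hit summary line as it appears on this page.
# Assumption: fields always come in this order; "[Multi-domain]" is optional.
SUMMARY_RE = re.compile(
    r"Pssm-ID:\s*(?P<pssm_id>\d+)"
    r"(?P<multi>\s*\[Multi-domain\])?"
    r"\s*Cd Length:\s*(?P<cd_length>\d+)"
    r"\s*Bit Score:\s*(?P<bit_score>[\d.]+)"
    r"\s*E-value:\s*(?P<e_value>[\d.eE+-]+)"
)


def parse_summary(line):
    """Return the summary fields of one hit as a dict (hypothetical helper)."""
    match = SUMMARY_RE.search(line)
    if match is None:
        raise ValueError(f"not a hit summary line: {line!r}")
    return {
        "pssm_id": int(match.group("pssm_id")),
        "multi_domain": match.group("multi") is not None,
        "cd_length": int(match.group("cd_length")),
        "bit_score": float(match.group("bit_score")),
        "e_value": float(match.group("e_value")),
    }


def model_coverage(cd_from, cd_to, cd_length):
    """Fraction of the domain model spanned by the alignment (1-based, inclusive)."""
    return (cd_to - cd_from + 1) / cd_length


if __name__ == "__main__":
    hit = parse_summary(
        "Pssm-ID: 350295  Cd Length: 127  Bit Score: 242.79  E-value: 2.46e-74"
    )
    print(hit)
    # The cd18728 alignment above runs from model position 1 to 127.
    print(model_coverage(1, 127, hit["cd_length"]))
```

For the cd18728 hit the alignment rows start at model position 1 and end at 127, which is the full Cd Length reported in the summary line.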
KH-I_NYNRIN_like cd22477
type I K homology (KH) RNA-binding domain found in the subfamily of NYN domain and retroviral ...
75-140 2.26e-36

type I K homology (KH) RNA-binding domain found in the subfamily of NYN domain and retroviral integrase catalytic domain-containing protein (NYNRIN); The NYNRIN subfamily includes NYNRIN and KH and NYN domain-containing protein (KHNYN). NYNRIN, also known as CGIN1/Cousin of GIN1, may contribute to retroviral resistance in mammals by regulating the ubiquitination of viral proteins. KHNYN acts as a novel cofactor for zinc finger antiviral protein (ZAP) to target CpG-containing retroviral RNA for degradation.



Pssm-ID: 411905  Cd Length: 66  Bit Score: 132.14  E-value: 2.26e-36
                          10        20        30        40        50        60
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*.
gi 126723547   75 PELWKEVRYPPILHCAFLGAQGLFLDCLCWSTLAYLVPGPPGSLMVGGLTESFIMTQNWLEELVGR 140
Cdd:cd22477     1 PELTKEVLYPRDLHCIFLGAKGLFLDCLIWGTSAHIVPGAPGSLLISGLTEAFVMAQSRIEDLVEK 66
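
The alignment rows pair the query (gi 126723547) with the domain model's consensus row (Cdd:...). A rough percent identity can be computed from two such rows, as sketched below. The sketch assumes that `-` marks a gap and that lowercase letters mark residues outside the aligned columns, and it skips both; the function name `percent_identity` is illustrative only, and the result is not the statistic CD-Search itself reports (that is the bit score and E-value).

```python
def percent_identity(query_row, model_row):
    """Percent of aligned columns where query and model residues are identical.

    Columns containing a gap ('-') or a lowercase (assumed unaligned)
    residue in either row are ignored.
    """
    if len(query_row) != len(model_row):
        raise ValueError("alignment rows must have the same length")
    aligned = 0
    identical = 0
    for q, m in zip(query_row, model_row):
        if q == "-" or m == "-" or q.islower() or m.islower():
            continue
        aligned += 1
        if q == m:
            identical += 1
    if aligned == 0:
        return 0.0
    return 100.0 * identical / aligned


if __name__ == "__main__":
    # Rows copied from the cd22477 alignment above (query 75-140, model 1-66).
    query = "PELWKEVRYPPILHCAFLGAQGLFLDCLCWSTLAYLVPGPPGSLMVGGLTESFIMTQNWLEELVGR"
    model = "PELTKEVLYPRDLHCIFLGAKGLFLDCLIWGTSAHIVPGAPGSLLISGLTEAFVMAQSRIEDLVEK"
    print(f"{percent_identity(query, model):.1f}% identical over aligned columns")
```

For multi-block alignments such as the PHA03247 hit, the rows of each block would need to be concatenated before calling the function.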
RT_RNaseH_2 pfam17919
RNase H-like domain found in reverse transcriptase;
1130-1230 4.88e-20

RNase H-like domain found in reverse transcriptase;



Pssm-ID: 465567 [Multi-domain]  Cd Length: 100  Bit Score: 86.78  E-value: 4.88e-20
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 126723547  1130 WDQEHEEAFLALKRALVSALCLMAPNSQLPFRLEVTVSHVALTAILHQEHS-GRKHPIAYTSKPLLPDEESQgpqsggdS 1208
Cdd:pfam17919    1 WTEECQKAFEKLKQALTSAPVLAHPDPDKPFILETDASDYGIGAVLSQEDDdGGERPIAYASRKLSPAERNY-------S 73
                           90       100
                   ....*....|....*....|....*..
gi 126723547  1209 PY-----AVAWALKHFSRCIGDTPVVL 1230
Cdd:pfam17919   74 TTekellAIVFALKKFRHYLLGRKFTV 100
KH-I_N4BP1_like_rpt1 cd09032
first type I K homology (KH) RNA-binding domain found in the family of NEDD4-binding protein 1 ...
12-73 1.20e-18

first type I K homology (KH) RNA-binding domain found in the family of NEDD4-binding protein 1 (N4BP1); The N4BP1 family includes N4BP1, NYN domain and retroviral integrase catalytic domain-containing protein (NYNRIN) and KH and NYN domain-containing protein (KHNYN). These proteins are probably of retroviral origin. N4BP1 interacts with and is a substrate of NEDD4 ubiquitin ligase (neural precursor cell expressed, developmentally downregulated 4, E3 ubiquitin protein ligase). It is also an inhibitor of the E3 ubiquitin-protein ligase ITCH, a NEDD4 structurally related E3. N4BP1 acts by interacting with the second WW domain of ITCH, thereby competing with ITCH's substrates and impairing their ubiquitination. NYNRIN, also known as CGIN1/Cousin of GIN1, may contribute to retroviral resistance in mammals by regulating the ubiquitination of viral proteins. KHNYN acts as a novel cofactor for zinc finger antiviral protein (ZAP) to target CpG-containing retroviral RNA for degradation. Members of this family contain two type I K homology (KH) RNA-binding domains. The model corresponds to the first one. The KH1 domain is a divergent KH domain that lacks the RNA-binding GXXG motif.



Pssm-ID: 411808  Cd Length: 65  Bit Score: 81.56  E-value: 1.20e-18
                          10        20        30        40        50        60
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*
gi 126723547   12 EWFMVQTKSKPRVQRQRLQVQRIFRVKLN---AFQSRPDTPYFWLQLEGPRENMGKAKEYLKGLC 73
Cdd:cd09032     1 DEFVVPAEKVPLLERSRPRIERLFGVKVSlleELSKPKDGGKQWVQLEGDEEDVRKAKEYIKALC 65
Integrase_H2C2 pfam17921
Integrase zinc binding domain; This zinc binding domain is found in a wide variety of ...
1536-1589 2.91e-14

Integrase zinc binding domain; This zinc binding domain is found in a wide variety of integrase proteins.



Pssm-ID: 465569 [Multi-domain]  Cd Length: 58  Bit Score: 68.81  E-value: 2.91e-14
                           10        20        30        40        50
                   ....*....|....*....|....*....|....*....|....*....|....
gi 126723547  1536 VPTQLRRDLIFSVHDipLGAHQRPEETYKKLRLLGWWPGMQEHVKDYCRSCLFC 1589
Cdd:pfam17921    1 VPKSLRKEILKEAHD--SGGHLGIEKTLARLRRRYWWPGMRKDVKKYVKSCETC 52
PHA03247 super family cl33720
large tegument protein UL36; Provisional
396-726 1.67e-09

large tegument protein UL36; Provisional


The actual alignment was detected with superfamily member PHA03247:

Pssm-ID: 223021 [Multi-domain]  Cd Length: 3151  Bit Score: 63.42  E-value: 1.67e-09
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 126723547  396 PLGPIQLKLPGQNPLPLNLEWKQKELAPLPSAESPAGRPDGG--------------LGGEAALQNCPRPEISPKVTSLlv 461
Cdd:PHA03247 2618 PPDTHAPDPPPPSPSPAANEPDPHPPPTVPPPERPRDDPAPGrvsrprrarrlgraAQASSPPQRPRRRAARPTVGSL-- 2695
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 126723547  462 vpgssdvkdKVSSDLPQIGPPLTSTPQLQAGGEPGDQGSMQLDFKGLEEGPAPVLPTGQGKPVAQGGLTDQSVPGAQTVP 541
Cdd:PHA03247 2696 ---------TSLADPPPPPPTPEPAPHALVSATPLPPGPAAARQASPALPAAPAPPAVPAGPATPGGPARPARPPTTAGP 2766
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 126723547  542 ETlkvPMAAAVPKAENPSRTQVPSAAPkLPTSRMMLAVHTEPAAPEVPLAPTKPTAQLMATAQKTVVnQPVLVAQVEPTT 621
Cdd:PHA03247 2767 PA---PAPPAAPAAGPPRRLTRPAVAS-LSESRESLPSPWDPADPPAAVLAPAAALPPAASPAGPLP-PPTSAQPTAPPP 2841
                         250       260       270       280       290       300       310       320
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 126723547  622 PKTPQAQKM----------PVAKTSPAGPKTPKAQAGPAATVSKAPAASKAPAAPKVPVTPRVSRAPKTPAAQKVPTDAG 691
Cdd:PHA03247 2842 PPGPPPPSLplggsvapggDVRRRPPSRSPAAKPAAPARPPVRRLARPAVSRSTESFALPPDQPERPPQPQAPPPPQPQP 2921
                         330       340       350
                  ....*....|....*....|....*....|....*
gi 126723547  692 PTLDVARLLSEVQPTSRASVSLLKGQGQAGRQGPQ 726
Cdd:PHA03247 2922 QPPPPPQPQPPPPPPPRPQPPLAPTTDPAGAGEPS 2956
 
RNase_Zc3h12a pfam11977
Zc3h12a-like Ribonuclease NYN domain; This domain is found in the Zc3h12a protein which has ...
791-942 4.41e-70

Zc3h12a-like Ribonuclease NYN domain; This domain is found in the Zc3h12a protein, which has been shown to be a ribonuclease that controls the stability of a set of inflammatory genes. It has been suggested that this domain belongs to the PIN domain superfamily. This domain has also been identified as part of the NYN domain family.


Pssm-ID: 403256  Cd Length: 154  Bit Score: 231.83  E-value: 4.41e-70
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 126723547   791 GLRRVVIDGSSVAMVHGLQHFFSCRGIAMAVQFFWNRGHREVTVFVPTWQLKKNRRVRESHFLTKLHSLKMLSITPSQLE 870
Cdd:pfam11977    1 GLRPIVIDGSNVAMSHGRQKKFSVRGLAIAVDYFVKRGHEEITVFVPQWRKEADEKITDQHELLELERLGLIVFTPSRTL 80
                           90       100       110       120       130       140       150
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....
gi 126723547   871 NGKKITTYDYRFMVKLAEETDGIIVTNEQIHILMNSSKKLM--VKDRLLPFTFAGNLFMVPDDPLGRDGPTLDE 942
Cdd:pfam11977   81 DGKRIVSYDDRFILKLAEETDGVIVSNDNFRDLADENPEWIdiVEERLLMYTFVGDKFMPPDDPLGRVGPSLED 154
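
Several hits in this list cover overlapping stretches of the query; for example, the pfam11977 interval 791-942 contains the cd18728 interval 795-919. The sketch below shows one simple way to measure such overlaps from the Interval column. The `overlap` helper and the `hits` list layout are illustrative choices; the accessions, intervals, and E-values are copied from the table rows on this page.

```python
from itertools import combinations


def overlap(a_start, a_end, b_start, b_end):
    """Number of query residues shared by two 1-based, inclusive intervals."""
    return max(0, min(a_end, b_end) - max(a_start, b_start) + 1)


# (accession, query start, query end, E-value) taken from the hit list.
hits = [
    ("cd18728", 795, 919, 2.46e-74),
    ("pfam11977", 791, 942, 4.41e-70),
    ("cd22477", 75, 140, 2.26e-36),
    ("cd09032", 12, 73, 1.20e-18),
]

if __name__ == "__main__":
    # Report every pair of hits that shares at least one query residue.
    for (acc_a, s_a, e_a, _), (acc_b, s_b, e_b, _) in combinations(hits, 2):
        shared = overlap(s_a, e_a, s_b, e_b)
        if shared:
            print(f"{acc_a} and {acc_b} share {shared} residues")
```

Sorting the overlapping hits by E-value is a natural next step when only the strongest hit per region is of interest.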
DUF5585 pfam17823
Family of unknown function (DUF5585); This is a family of unknown function found in chordata.
477-710 2.19e-06

Family of unknown function (DUF5585); This is a family of unknown function found in chordata.


Pssm-ID: 465521 [Multi-domain]  Cd Length: 506  Bit Score: 52.65  E-value: 2.19e-06
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 126723547   477 PQIGPPLTSTPQLQAGGEPGDQGSMQLDFKGLEEGPAPVLP--------TGQGKPVAqgGLTDQSVPGAQTVPETLKVPM 548
Cdd:pfam17823  168 PHAASPAPRTAASSTTAASSTTAASSAPTTAASSAPATLTPargistaaTATGHPAA--GTALAAVGNSSPAAGTVTAAV 245
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 126723547   549 AAAVPKAENPSRTQVPSAAPKLPTSRMMLAVHTEPA---------APEVPLAPTKPTAQlmATAQKTVVNQPVLVAQVEP 619
Cdd:pfam17823  246 GTVTPAALATLAAAAGTVASAAGTINMGDPHARRLSpakhmpsdtMARNPAAPMGAQAQ--GPIIQVSTDQPVHNTAGEP 323
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 126723547   620 T-----------TPKTPQAQKMPVAKTSPAGPKTPKAQAGP---AATVSKAPAASKAPAAPKVPVTPRVSrAPKTP-AAQ 684
Cdd:pfam17823  324 TpspsnttlepnTPKSVASTNLAVVTTTKAQAKEPSASPVPvlhTSMIPEVEATSPTTQPSPLLPTQGAA-GPGILlAPE 402
                          250       260
                   ....*....|....*....|....*.
gi 126723547   685 KVPTDAGPTLDVArllsevQPTSRAS 710
Cdd:pfam17823  403 QVATEATAGTASA------GPTPRSS 422
RNase_HI_RT_Ty3 cd09274
Ty3/Gypsy family of RNase HI in long terminal repeat retroelements; Ribonuclease H (RNase H) ...
1161-1264 1.80e-05

Ty3/Gypsy family of RNase HI in long terminal repeat retroelements; Ribonuclease H (RNase H) enzymes are divided into two major families, Type 1 and Type 2, based on amino acid sequence similarities and biochemical properties. RNase H is an endonuclease that cleaves the RNA strand of an RNA/DNA hybrid in a sequence non-specific manner in the presence of divalent cations. RNase H is widely present in various organisms, including bacteria, archaea and eukaryotes. RNase HI has also been observed as an adjunct domain to the reverse transcriptase gene in retroviruses, in long terminal repeat (LTR)-bearing retrotransposons and in non-LTR retrotransposons. RNase HI in LTR retrotransposons performs degradation of the original RNA template, generation of a polypurine tract (the primer for plus-strand DNA synthesis), and final removal of RNA primers from newly synthesized minus and plus strands. The catalytic residues for RNase H enzymatic activity, three aspartic acid residues and one glutamic acid residue (DEDD), are invariant across all RNase H domains. RNase HI of LTR retroelements is classified phylogenetically into five major families: Ty3/Gypsy, Ty1/Copia, Bel/Pao, DIRS1 and the vertebrate retroviruses. The Ty3/Gypsy family is widely distributed among the genomes of plants, fungi and animals. RNase H inhibitors have been explored as anti-HIV drugs because RNase H inactivation inhibits reverse transcription.


Pssm-ID: 260006 [Multi-domain]  Cd Length: 121  Bit Score: 45.95  E-value: 1.80e-05
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 126723547 1161 RLEVTVSHVALTAILHQEHS-GRKHPIAYTSKPLLPDEESQgpqsggdSPY-----AVAWALKHFSRCIGDTPVVL--D- 1231
Cdd:cd09274     1 ILETDASDYGIGAVLSQEDDdGKERPIAFFSRKLTPAERNY-------STTekellAIVWALKKFRHYLLGRPFTVytDh 73
                          90       100       110
                  ....*....|....*....|....*....|....*
gi 126723547 1232 --LSYAsRTTADPEVRegrrvskawLIRWSLLVQD 1264
Cdd:cd09274    74 kaLKYL-LTQKDLNGR---------LARWLLLLSE 98
PspC_subgroup_2 NF033839
pneumococcal surface protein PspC, LPXTG-anchored form; The pneumococcal surface protein PspC, ...
535-696 2.37e-03

pneumococcal surface protein PspC, LPXTG-anchored form; The pneumococcal surface protein PspC, as described in Streptococcus pneumoniae, is a repetitive and highly variable protein, recognized by a conserved N-terminal domain and also by genomic location. This form, subgroup 2, is anchored covalently after cleavage by sortase at a C-terminal LPXTG site. The other form, subgroup 1, has variable numbers of a choline-binding repeat in the C-terminal region, and is also known as choline-binding protein A.


Pssm-ID: 468202 [Multi-domain]  Cd Length: 557  Bit Score: 42.83  E-value: 2.37e-03
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 126723547  535 PGAQTVPETLKVPMAaavPKAENPSrtqvPSAAPKLPTSRMMLAVHTEPAAPEVPLAPTKPTAQLMATAQKTvvnQPVLV 614
Cdd:NF033839  297 PGMQPSPQPEKKEVK---PEPETPK----PEVKPQLEKPKPEVKPQPEKPKPEVKPQLETPKPEVKPQPEKP---KPEVK 366
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 126723547  615 AQVEPTTPKTPQAQKMPVAKTSPAgPKTPKAQAGPAATVSKAPAASKAPAApkvpvTPRVSRAPKTPAAQKVPTDAGPTL 694
Cdd:NF033839  367 PQPEKPKPEVKPQPETPKPEVKPQ-PEKPKPEVKPQPEKPKPEVKPQPEKP-----KPEVKPQPEKPKPEVKPQPEKPKP 440

                  ..
gi 126723547  695 DV 696
Cdd:NF033839  441 EV 442
SP2_N cd22540
N-terminal domain of transcription factor Specificity Protein (SP) 2; Specificity Proteins ...
396-745 7.85e-03

N-terminal domain of transcription factor Specificity Protein (SP) 2; Specificity Proteins (SPs) are transcription factors that are involved in many cellular processes, including cell differentiation, cell growth, apoptosis, immune responses, response to DNA damage, and chromatin remodeling. SP2 contains the least conserved DNA-binding domain within the SP subfamily of proteins, and its DNA sequence specificity differs from the other SP proteins. It localizes primarily within subnuclear foci associated with the nuclear matrix, and can activate, or in some cases, repress expression from different promoters. The transcription factor SP2 serves as a paradigm for indirect genomic binding. It does not require its DNA-binding domain for genomic DNA binding and occupies target promoters independently of whether they contain a cognate DNA-binding motif. SP2 belongs to a family of proteins, called the SP/Kruppel or Krueppel-like Factor (KLF) family, characterized by a C-terminal DNA-binding domain of 81 amino acids consisting of three Kruppel-like C2H2 zinc fingers. These factors bind to a loose consensus motif, namely NNRCRCCYY (where N is any nucleotide; R is A/G, and Y is C/T), such as the recurring motifs in GC and GT boxes (5'-GGGGCGGGG-3' and 5'-GGTGTGGGG-3') that are present in promoters and more distal regulatory elements of mammalian genes. SP factors preferentially bind GC boxes, while KLFs bind CACCC boxes. Another characteristic hallmark of SP factors is the presence of the Buttonhead (BTD) box CXCPXC, just N-terminal to the zinc fingers. The function of the BTD box is unknown, but it is thought to play an important physiological role. Another feature of most SP factors is the presence of a conserved amino acid stretch, the so-called SP box, located close to the N-terminus. SP factors may be separated into three groups based on their domain architecture and the similarity of their N-terminal transactivation domains: SP1-4, SP5, and SP6-9. The transactivation domains of the three groups are not homologous to one another. SP1-4 have similar N-terminal transactivation domains characterized by glutamine-rich regions, which, in most cases, have adjacent serine/threonine-rich regions. This model represents the N-terminal domain of SP2.


Pssm-ID: 411776 [Multi-domain]  Cd Length: 511  Bit Score: 41.07  E-value: 7.85e-03
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 126723547  396 PLGPIQLKLPGQNPlplnlewkqkelAPLPSaeSPAGRPDGGLGGEAALQNCPRPEISPKVTS--LLVVPGSSDVKDKVS 473
Cdd:cd22540    51 PPQPTPRKLVPIKP------------APLPL--GPGKNSIGFLSAKGNIIQLQGSQLSSSAPGgqQVFAIQNPTMIIKGS 116
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 126723547  474 SDLPQIGPPLTSTPQLQAGGEPGDQGSMQL--------------------DFKGLEEGPAPVLPTGQGKPVAQGGLTDQS 533
Cdd:cd22540   117 QTRSSTNQQYQISPQIQAAGQINNSGQIQIipgtnqaiitpvqvlqqpqqAHKPVPIKPAPLQTSNTNSASLQVPGNVIK 196
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 126723547  534 VPGAQTVPETLkvPMAAAVPKAENPSRTQVPSAAPKlpTSRMMLAVHTEPAAPEVPLApTKPTAQLMATAQKTVV---NQ 610
Cdd:cd22540   197 LQSGGNVALTL--PVNNLVGTQDGATQLQLAAAPSK--PSKKIRKKSAQAAQPAVTVA-EQVETVLIETTADNIIqagNN 271
                         250       260       270       280       290       300       310       320
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 126723547  611 PVLVAQvePTTPKTPQAQKMPVAKtspagpktPKAQAGPAATVSKAPAASKAPAAPKVPVTPRVSRAPKTPAAQKVPTDA 690
Cdd:cd22540   272 LLIVQS--PGTGQPAVLQQVQVLQ--------PKQEQQVVQIPQQALRVVQAASATLPTVPQKPLQNIQIQNSEPTPTQV 341
                         330       340       350       360       370
                  ....*....|....*....|....*....|....*....|....*....|....*...
gi 126723547  691 ---GPTLDVARLLSEVQPTSRASVSLLKGQGQAgrQGPQSSGTLALSSKHQFQMEGLL 745
Cdd:cd22540   342 yikTPSGEVQTVLLQEAPAATATPSSSTSTVQQ--QVTANNGTGTSKPNYNVRKERTL 397
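
The SP2 description above writes its consensus motif in degenerate nucleotide codes (N is any nucleotide, R is A/G, Y is C/T). As a small illustration of how such a motif can be read, the sketch below expands it into a regular expression and scans a sequence for matches on the given strand. The helper names and the test sequence are made up for this example; only the three degenerate codes defined in the description are supported, and reverse-complement matches are not considered.

```python
import re

# Degenerate codes as defined in the description, plus the four plain bases.
CODES = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]",    # purine
    "Y": "[CT]",    # pyrimidine
    "N": "[ACGT]",  # any nucleotide
}


def motif_to_regex(motif):
    """Expand a degenerate motif such as NNRCRCCYY into a regex string."""
    return "".join(CODES[symbol] for symbol in motif.upper())


def find_sites(motif, sequence):
    """0-based start positions of all (possibly overlapping) motif matches."""
    # A lookahead lets re.finditer report overlapping occurrences.
    pattern = re.compile(f"(?=({motif_to_regex(motif)}))")
    return [m.start() for m in pattern.finditer(sequence.upper())]


if __name__ == "__main__":
    print(motif_to_regex("NNRCRCCYY"))
    # Made-up sequence, for illustration only.
    print(find_sites("NNRCRCCYY", "TTAAGCGCCTTAA"))
```

A motif symbol outside the table raises `KeyError`, which keeps the sketch from silently accepting codes the description does not define.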
 
Name Accession Description Interval E-value
PIN_N4BP1-like cd18728
PRORP-like PIN domain of NEDD4 binding protein 1 and related proteins; NEDD4-binding partner-1 ...
795-919 2.46e-74

PRORP-like PIN domain of NEDD4 binding protein 1 and related proteins; NEDD4-binding partner-1 (N4BP1) interacts with and is a substrate of NEDD4 ubiquitin ligase (neural precursor cell expressed, developmentally down-regulated 4, E3 ubiquitin protein ligase). It is also an inhibitor of the E3 ubiquitin-protein ligase ITCH, a NEDD4 structurally related E3. This subfamily additionally includes NYNRIN (NYN domain and retroviral integrase containing, also known as CGIN1/Cousin of GIN1), and KHNYN (KH and NYN domain containing) protein. N4BP1, CGIN1, and KHNYN proteins are probably of retroviral origin. This subfamily belongs to the Zc3h12a-N4BP1-like PIN subfamily of the PRORP-Zc3h12a-like PIN family, the latter of which additionally includes human PRORP, also known as proteinaceous RNase P and mitochondrial RNase P protein subunit 3 (MRPP3), and Arabidopsis thaliana PRORP1-3. The PIN (PilT N terminus) domain belongs to a large nuclease superfamily. The structural properties of the PIN domain indicate its active center, consisting of three highly conserved catalytic residues which coordinate metal ions; in some members, additional metal coordinating residues can be found while some others lack several of these key catalytic residues. The PIN active site is geometrically similar in the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons.


Pssm-ID: 350295  Cd Length: 127  Bit Score: 242.79  E-value: 2.46e-74
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 126723547  795 VVIDGSSVAMVHGLQHFFSCRGIAMAVQFFWNRGHREVTVFVPTWQLKKNRRVRESHFLTKLHSLKMLSITPSQLENGKK 874
Cdd:cd18728     1 IVIDGSNVAMVHGLQHFFSCRGIAIAVEYFWKRGHRNITVFVPQWRTKRDPNVTEQHFLTQLQELGILSLTPSRMVLGKR 80
                          90       100       110       120
                  ....*....|....*....|....*....|....*....|....*..
gi 126723547  875 ITTYDYRFMVKLAEETDGIIVTNEQIHILMNSS--KKLMVKDRLLPF 919
Cdd:cd18728    81 IASHDDRFLLHLAEKTGGIIVTNDNFREFVNESpsWREIIKERLLQY 127
RNase_Zc3h12a pfam11977
Zc3h12a-like Ribonuclease NYN domain; This domain is found in the Zc3h12a protein which has ...
791-942 4.41e-70

Zc3h12a-like Ribonuclease NYN domain; This domain is found in the Zc3h12a protein which has shown to be a ribonuclease that controls the stability of a set of inflammatory genes. It has been suggested that this domain belongs to the PIN domain superfamily. This domain has also been identified as part of the NYN domain family.


Pssm-ID: 403256  Cd Length: 154  Bit Score: 231.83  E-value: 4.41e-70
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 126723547   791 GLRRVVIDGSSVAMVHGLQHFFSCRGIAMAVQFFWNRGHREVTVFVPTWQLKKNRRVRESHFLTKLHSLKMLSITPSQLE 870
Cdd:pfam11977    1 GLRPIVIDGSNVAMSHGRQKKFSVRGLAIAVDYFVKRGHEEITVFVPQWRKEADEKITDQHELLELERLGLIVFTPSRTL 80
                           90       100       110       120       130       140       150
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....
gi 126723547   871 NGKKITTYDYRFMVKLAEETDGIIVTNEQIHILMNSSKKLM--VKDRLLPFTFAGNLFMVPDDPLGRDGPTLDE 942
Cdd:pfam11977   81 DGKRIVSYDDRFILKLAEETDGVIVSNDNFRDLADENPEWIdiVEERLLMYTFVGDKFMPPDDPLGRVGPSLED 154
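
The paired rows above (query `gi 126723547` over the `Cdd:` model sequence) can be compared column by column to estimate how similar the query is to the model. Below is a minimal Python sketch; `aligned_identity` is a hypothetical helper, and it assumes that `-` marks a gap and that lowercase letters mark residues outside the aligned region, so both are skipped.

```python
def aligned_identity(query_row, model_row):
    """Fraction of identical residues over the aligned columns of two rows.

    Assumption: '-' is a gap and lowercase residues are unaligned, so
    columns containing either are ignored.
    """
    if len(query_row) != len(model_row):
        raise ValueError("alignment rows must have equal length")
    aligned = 0
    identical = 0
    for q, m in zip(query_row, model_row):
        if q == "-" or m == "-" or q.islower() or m.islower():
            continue  # skip gaps and unaligned residues
        aligned += 1
        if q == m:
            identical += 1
    if aligned == 0:
        raise ValueError("no aligned columns to compare")
    return identical / aligned


# First ten columns of the pfam11977 alignment above (query, then model).
fraction = aligned_identity("GLRRVVIDGS", "GLRPIVIDGS")
```

For a whole hit, the rows of every alignment block would first be concatenated so that the fraction is taken over the full aligned interval.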
PIN_Zc3h12a-N4BP1-like cd18719
PRORP-like PIN domain of ribonuclease Zc3h12a, NEDD4-binding partner-1, and related proteins; ...
795-919 7.67e-56

PRORP-like PIN domain of ribonuclease Zc3h12a, NEDD4-binding partner-1, and related proteins; Zc3h12a (zinc finger CCCH-type containing 12A, also known as MCPIP1/MCP induced protein 1 and Regnase-1) is a critical regulator of the inflammatory response, with additional roles in defense against viruses and various stresses, cellular differentiation, and apoptosis. This subfamily also includes Caenorhabditis elegans REGE-1 (REGnasE-1), which also functions as a cytoplasmic endonuclease. Additionally, it includes three less-studied mammalian homologs: Zc3h12b-d/Regnase-2-4, as well as N4BP1 (NEDD4-binding partner-1), NYNRIN (NYN domain and retroviral integrase containing, also known as CGIN1/Cousin of GIN1), and KHNYN (KH and NYN domain containing) protein. N4BP1, CGIN1, and KHNYN proteins are probably of retroviral origin. This subfamily belongs to the PRORP-Zc3h12a-like PIN family, which in addition includes human PRORP, also known as proteinaceous RNase P and mitochondrial RNase P protein subunit 3 (MRPP3), and Arabidopsis thaliana PRORP1-3. The PIN (PilT N terminus) domain belongs to a large nuclease superfamily. The structure of the PIN domain reveals an active center consisting of three highly conserved catalytic residues that coordinate metal ions; some members have additional metal-coordinating residues, while others lack several of these key catalytic residues. The PIN active site is geometrically similar to the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons.


Pssm-ID: 350286  Cd Length: 127  Bit Score: 190.11  E-value: 7.67e-56
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 126723547  795 VVIDGSSVAMVHGLQHFFSCRGIAMAVQFFWNRGHrEVTVFVPTWQLKKNR-RVRESHFLTKLHSLKMLSITPSQLENGK 873
Cdd:cd18719     1 VVIDGSNVAMSHGNGKVFSCKGIQICVRYFLERGH-EVTAFVPQFRLESPNpNSTDQDILEELERLGILVFTPSRRVPGK 79
                          90       100       110       120
                  ....*....|....*....|....*....|....*....|....*...
gi 126723547  874 KITTYDYRFMVKLAEETDGIIVTNEQIHILMNSSK--KLMVKDRLLPF 919
Cdd:cd18719    80 RISSYDDRFILQLAEETDGVIVSNDNFRDLLNENPdwREIIEERLLPF 127
PIN_Zc3h12-like cd18729
PRORP-like PIN domain of ribonuclease Zc3h12a and related proteins; Zc3h12a (zinc finger ...
795-920 1.27e-41

PRORP-like PIN domain of ribonuclease Zc3h12a and related proteins; Zc3h12a (zinc finger CCCH-type containing 12A, also known as MCPIP1/MCP induced protein 1 and Regnase-1) is a critical regulator of the inflammatory response, with additional roles in defense against viruses and various stresses, cellular differentiation, and apoptosis. This subfamily also includes three less-studied mammalian homologs: Zc3h12b-d/Regnase-2-4. It belongs to the Zc3h12a-N4BP1-like PIN subfamily of the PRORP-Zc3h12a-like PIN family, the latter of which additionally includes human PRORP, also known as proteinaceous RNase P and mitochondrial RNase P protein subunit 3 (MRPP3), and Arabidopsis thaliana PRORP1-3. The PIN (PilT N terminus) domain belongs to a large nuclease superfamily. The structure of the PIN domain reveals an active center consisting of three highly conserved catalytic residues that coordinate metal ions; some members have additional metal-coordinating residues, while others lack several of these key catalytic residues. The PIN active site is geometrically similar to the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons.


Pssm-ID: 350296  Cd Length: 131  Bit Score: 149.44  E-value: 1.27e-41
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 126723547  795 VVIDGSSVAMVHGLQHFFSCRGIAMAVQFFWNRGHREVTVFVPTWQlKKNRR----VRESHFLTKLHSLKMLSITPSQLE 870
Cdd:cd18729     1 IVIDGSNVAMSHGNKEVFSCRGIQLAVDWFRERGHRDITVFVPSWR-KEQPRpdapITDQEILRELEKEKILVFTPSRRV 79
                          90       100       110       120       130
                  ....*....|....*....|....*....|....*....|....*....|..
gi 126723547  871 NGKKITTYDYRFMVKLAEETDGIIVTNEQIHILMNSSK--KLMVKDRLLPFT 920
Cdd:cd18729    80 GGKRVVCYDDRFILKLAYESDGIVVSNDNYRDLQNEKPewKKFIEERLLMYS 131
KH-I_NYNRIN_like cd22477
type I K homology (KH) RNA-binding domain found in the subfamily of NYN domain and retroviral ...
75-140 2.26e-36

type I K homology (KH) RNA-binding domain found in the subfamily of NYN domain and retroviral integrase catalytic domain-containing protein (NYNRIN); The NYNRIN subfamily includes NYNRIN and KH and NYN domain-containing protein (KHNYN). NYNRIN, also known as CGIN1/Cousin of GIN1, may contribute to retroviral resistance in mammals by regulating the ubiquitination of viral proteins. KHNYN acts as a novel cofactor for zinc finger antiviral protein (ZAP) to target CpG-containing retroviral RNA for degradation.


Pssm-ID: 411905  Cd Length: 66  Bit Score: 132.14  E-value: 2.26e-36
                          10        20        30        40        50        60
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*.
gi 126723547   75 PELWKEVRYPPILHCAFLGAQGLFLDCLCWSTLAYLVPGPPGSLMVGGLTESFIMTQNWLEELVGR 140
Cdd:cd22477     1 PELTKEVLYPRDLHCIFLGAKGLFLDCLIWGTSAHIVPGAPGSLLISGLTEAFVMAQSRIEDLVEK 66
KH-I_N4BP1_like_rpt2 cd22388
second type I K homology (KH) RNA-binding domain found in the family of NEDD4-binding protein ...
76-138 6.75e-34

second type I K homology (KH) RNA-binding domain found in the family of NEDD4-binding protein 1 (N4BP1); The N4BP1 family includes N4BP1, NYN domain and retroviral integrase catalytic domain-containing protein (NYNRIN) and KH and NYN domain-containing protein (KHNYN). These proteins are probably of retroviral origin. N4BP1 interacts with and is a substrate of NEDD4 ubiquitin ligase (neural precursor cell expressed, developmentally downregulated 4, E3 ubiquitin protein ligase). It is also an inhibitor of the E3 ubiquitin-protein ligase ITCH, an E3 structurally related to NEDD4. N4BP1 acts by interacting with the second WW domain of ITCH, thereby competing with ITCH's substrates and impairing their ubiquitination. NYNRIN, also known as CGIN1/Cousin of GIN1, may contribute to retroviral resistance in mammals by regulating the ubiquitination of viral proteins. KHNYN acts as a novel cofactor for zinc finger antiviral protein (ZAP) to target CpG-containing retroviral RNA for degradation. Members of this family contain two type I K homology (KH) RNA-binding domains. The model corresponds to the second one.


Pssm-ID: 411816  Cd Length: 63  Bit Score: 124.97  E-value: 6.75e-34
                          10        20        30        40        50        60
                  ....*....|....*....|....*....|....*....|....*....|....*....|...
gi 126723547   76 ELWKEVRYPPILHCAFLGAQGLFLDCLCWSTLAYLVPGPPGSLMVGGLTESFIMTQNWLEELV 138
Cdd:cd22388     1 ELWKEVRYPKDMHCIFLGAQGLFLDSLIWSTLAYLVPGPPGSLMIGGLTESVVMAQSWIQEFV 63
RT_RNaseH_2 pfam17919
RNase H-like domain found in reverse transcriptase;
1130-1230 4.88e-20

RNase H-like domain found in reverse transcriptase;


Pssm-ID: 465567 [Multi-domain]  Cd Length: 100  Bit Score: 86.78  E-value: 4.88e-20
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 126723547  1130 WDQEHEEAFLALKRALVSALCLMAPNSQLPFRLEVTVSHVALTAILHQEHS-GRKHPIAYTSKPLLPDEESQgpqsggdS 1208
Cdd:pfam17919    1 WTEECQKAFEKLKQALTSAPVLAHPDPDKPFILETDASDYGIGAVLSQEDDdGGERPIAYASRKLSPAERNY-------S 73
                           90       100
                   ....*....|....*....|....*..
gi 126723547  1209 PY-----AVAWALKHFSRCIGDTPVVL 1230
Cdd:pfam17919   74 TTekellAIVFALKKFRHYLLGRKFTV 100
KH-I_N4BP1_like_rpt1 cd09032
first type I K homology (KH) RNA-binding domain found in the family of NEDD4-binding protein 1 ...
12-73 1.20e-18

first type I K homology (KH) RNA-binding domain found in the family of NEDD4-binding protein 1 (N4BP1); The N4BP1 family includes N4BP1, NYN domain and retroviral integrase catalytic domain-containing protein (NYNRIN) and KH and NYN domain-containing protein (KHNYN). These proteins are probably of retroviral origin. N4BP1 interacts with and is a substrate of NEDD4 ubiquitin ligase (neural precursor cell expressed, developmentally downregulated 4, E3 ubiquitin protein ligase). It is also an inhibitor of the E3 ubiquitin-protein ligase ITCH, an E3 structurally related to NEDD4. N4BP1 acts by interacting with the second WW domain of ITCH, thereby competing with ITCH's substrates and impairing their ubiquitination. NYNRIN, also known as CGIN1/Cousin of GIN1, may contribute to retroviral resistance in mammals by regulating the ubiquitination of viral proteins. KHNYN acts as a novel cofactor for zinc finger antiviral protein (ZAP) to target CpG-containing retroviral RNA for degradation. Members of this family contain two type I K homology (KH) RNA-binding domains. The model corresponds to the first one. The KH1 domain is a divergent KH domain that lacks the RNA-binding GXXG motif.


Pssm-ID: 411808  Cd Length: 65  Bit Score: 81.56  E-value: 1.20e-18
                          10        20        30        40        50        60
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*
gi 126723547   12 EWFMVQTKSKPRVQRQRLQVQRIFRVKLN---AFQSRPDTPYFWLQLEGPRENMGKAKEYLKGLC 73
Cdd:cd09032     1 DEFVVPAEKVPLLERSRPRIERLFGVKVSlleELSKPKDGGKQWVQLEGDEEDVRKAKEYIKALC 65
PIN_PRORP-Zc3h12a-like cd18671
PIN domain of protein-only RNase P (PRORP), ribonuclease Zc3h12a, and related proteins; PRORPs ...
795-899 1.64e-17

PIN domain of protein-only RNase P (PRORP), ribonuclease Zc3h12a, and related proteins; PRORPs catalyze the maturation of the 5' end of precursor tRNAs in eukaryotes. This family includes human PRORP, also known as proteinaceous RNase P and mitochondrial RNase P protein subunit 3 (MRPP3), and Arabidopsis thaliana PRORP1-3; PRORP1 localizes to the chloroplast and the mitochondria, and PRORP2 and PRORP3 localize to the nucleus. Zc3h12a (zinc finger CCCH-type containing 12A, also known as MCPIP1/MCP induced protein 1 and Regnase-1) is a critical regulator of the inflammatory response, with additional roles in defense against viruses and various stresses, cellular differentiation, and apoptosis. This PIN_PRORP-Zc3h12a-like family also includes Caenorhabditis elegans REGE-1 (REGnasE-1), which also functions as a cytoplasmic endonuclease. Additionally, it includes three less-studied mammalian homologs: Zc3h12b-d/Regnase-2-4, as well as N4BP1 (NEDD4-binding partner-1), NYNRIN (NYN domain and retroviral integrase containing, also known as CGIN1/Cousin of GIN1), and KHNYN (KH and NYN domain containing) protein. N4BP1, CGIN1, and KHNYN proteins are probably of retroviral origin. The PIN (PilT N terminus) domain belongs to a large nuclease superfamily. The structure of the PIN domain reveals an active center consisting of three highly conserved catalytic residues that coordinate metal ions; some members have additional metal-coordinating residues, while others lack several of these key catalytic residues. The PIN active site is geometrically similar to the active center of structure-specific 5' nucleases, PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons.


Pssm-ID: 350238  Cd Length: 126  Bit Score: 80.37  E-value: 1.64e-17
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 126723547  795 VVIDGSSVAMVHGLQHFFSCRGIAMAVQFFWNRGH--REVTVFVPTWQLKKNRRV--RESHFLTKLHSLKMLSITPSqle 870
Cdd:cd18671     1 AVIDGANVGLSHQNKESFSCRQLLLAVNWFLERSHnnTDPLVFLHKWRVEQPRPVppTDRHLLEEWEKKGILYATPP--- 77
                          90       100
                  ....*....|....*....|....*....
gi 126723547  871 ngkkiTTYDYRFMVKLAEETDGIIVTNEQ 899
Cdd:cd18671    78 -----GSNDDWYWLYAAYESKCLLVTNDE 101
Integrase_H2C2 pfam17921
Integrase zinc binding domain; This zinc binding domain is found in a wide variety of ...
1536-1589 2.91e-14

Integrase zinc binding domain; This zinc binding domain is found in a wide variety of integrase proteins.


Pssm-ID: 465569 [Multi-domain]  Cd Length: 58  Bit Score: 68.81  E-value: 2.91e-14
                           10        20        30        40        50
                   ....*....|....*....|....*....|....*....|....*....|....
gi 126723547  1536 VPTQLRRDLIFSVHDipLGAHQRPEETYKKLRLLGWWPGMQEHVKDYCRSCLFC 1589
Cdd:pfam17921    1 VPKSLRKEILKEAHD--SGGHLGIEKTLARLRRRYWWPGMRKDVKKYVKSCETC 52
PHA03247 PHA03247
large tegument protein UL36; Provisional
396-726 1.67e-09

large tegument protein UL36; Provisional


Pssm-ID: 223021 [Multi-domain]  Cd Length: 3151  Bit Score: 63.42  E-value: 1.67e-09
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 126723547  396 PLGPIQLKLPGQNPLPLNLEWKQKELAPLPSAESPAGRPDGG--------------LGGEAALQNCPRPEISPKVTSLlv 461
Cdd:PHA03247 2618 PPDTHAPDPPPPSPSPAANEPDPHPPPTVPPPERPRDDPAPGrvsrprrarrlgraAQASSPPQRPRRRAARPTVGSL-- 2695
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 126723547  462 vpgssdvkdKVSSDLPQIGPPLTSTPQLQAGGEPGDQGSMQLDFKGLEEGPAPVLPTGQGKPVAQGGLTDQSVPGAQTVP 541
Cdd:PHA03247 2696 ---------TSLADPPPPPPTPEPAPHALVSATPLPPGPAAARQASPALPAAPAPPAVPAGPATPGGPARPARPPTTAGP 2766
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 126723547  542 ETlkvPMAAAVPKAENPSRTQVPSAAPkLPTSRMMLAVHTEPAAPEVPLAPTKPTAQLMATAQKTVVnQPVLVAQVEPTT 621
Cdd:PHA03247 2767 PA---PAPPAAPAAGPPRRLTRPAVAS-LSESRESLPSPWDPADPPAAVLAPAAALPPAASPAGPLP-PPTSAQPTAPPP 2841
                         250       260       270       280       290       300       310       320
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 126723547  622 PKTPQAQKM----------PVAKTSPAGPKTPKAQAGPAATVSKAPAASKAPAAPKVPVTPRVSRAPKTPAAQKVPTDAG 691
Cdd:PHA03247 2842 PPGPPPPSLplggsvapggDVRRRPPSRSPAAKPAAPARPPVRRLARPAVSRSTESFALPPDQPERPPQPQAPPPPQPQP 2921
                         330       340       350
                  ....*....|....*....|....*....|....*
gi 126723547  692 PTLDVARLLSEVQPTSRASVSLLKGQGQAGRQGPQ 726
Cdd:PHA03247 2922 QPPPPPQPQPPPPPPPRPQPPLAPTTDPAGAGEPS 2956
KH-I_N4BP1 cd22476
type I K homology (KH) RNA-binding domain found in NEDD4-binding protein 1 (N4BP1) and similar ...
75-138 6.00e-08

type I K homology (KH) RNA-binding domain found in NEDD4-binding protein 1 (N4BP1) and similar proteins; N4BP1 interacts with and is a substrate of NEDD4 ubiquitin ligase (neural precursor cell expressed, developmentally downregulated 4, E3 ubiquitin protein ligase). It is also an inhibitor of the E3 ubiquitin-protein ligase ITCH, an E3 structurally related to NEDD4. N4BP1 acts by interacting with the second WW domain of ITCH, thereby competing with ITCH's substrates and impairing their ubiquitination.


Pssm-ID: 411904  Cd Length: 68  Bit Score: 51.28  E-value: 6.00e-08
                          10        20        30        40        50        60
                  ....*....|....*....|....*....|....*....|....*....|....*....|....
gi 126723547   75 PELWKEVRYPPILHCAFLGAQGLFLDCLCWSTLAYLVPGPPGSLMVGGLTESFIMTQNWLEELV 138
Cdd:cd22476     1 PELEEKEYYPKDMHCIFAGAQGLFLNSLIQDTCADVSVLDIGVLGIKGGAEAVVMAQSRIQQFV 64
PHA03247 PHA03247
large tegument protein UL36; Provisional
423-698 1.11e-07

large tegument protein UL36; Provisional


Pssm-ID: 223021 [Multi-domain]  Cd Length: 3151  Bit Score: 57.26  E-value: 1.11e-07
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 126723547  423 PLPSAESPAGrPDGGLggeAALQNCPRPeISPKVTSLLVVPGSsdvkdkvssdlpqigPPLTSTPQLqAGGEPGDQGSMQ 502
Cdd:PHA03247 2554 PLPPAAPPAA-PDRSV---PPPRPAPRP-SEPAVTSRARRPDA---------------PPQSARPRA-PVDDRGDPRGPA 2612
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 126723547  503 LDFKGLEEGPAPVLPTGQGKPVAqgglTDQSVPGAQTVPETLKVPMAAAVPK------AENPSRTQVPSAAPKLPTSRMM 576
Cdd:PHA03247 2613 PPSPLPPDTHAPDPPPPSPSPAA----NEPDPHPPPTVPPPERPRDDPAPGRvsrprrARRLGRAAQASSPPQRPRRRAA 2688
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 126723547  577 ------LAVHTEPAAPEVPLAPtKPTAQLMATAQKTVvnqPVLVAQVEPTTPKTPQAQKMPVAKTSPAGPKTPKAQAGPA 650
Cdd:PHA03247 2689 rptvgsLTSLADPPPPPPTPEP-APHALVSATPLPPG---PAAARQASPALPAAPAPPAVPAGPATPGGPARPARPPTTA 2764
                         250       260       270       280
                  ....*....|....*....|....*....|....*....|....*...
gi 126723547  651 AtvskapaaskapaapkvPVTPRVSRAPKTPAAQKVPTDAGPTLDVAR 698
Cdd:PHA03247 2765 G-----------------PPAPAPPAAPAAGPPRRLTRPAVASLSESR 2795
RT_RNaseH pfam17917
RNase H-like domain found in reverse transcriptase; DNA polymerase and ribonuclease H (RNase H) ...
1155-1264 1.25e-07

RNase H-like domain found in reverse transcriptase; DNA polymerase and ribonuclease H (RNase H) activities allow reverse transcriptases to convert the single-stranded retroviral RNA genome into double-stranded DNA, which is integrated into the host chromosome during infection. This entry represents the RNase H like domain.


Pssm-ID: 465565  Cd Length: 104  Bit Score: 51.36  E-value: 1.25e-07
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 126723547  1155 NSQLPFRLEVTVSHVALTAILHQ-EHSGRKHPIAYTSKPLLPDE--------ESqgpqsggdspYAVAWALKHFSRCIGD 1225
Cdd:pfam17917    1 DPSKPFILETDASDYGIGAVLSQkDEDGKERPIAYASRKLTPAErnysttekEL----------LAIVWALKKFRHYLLG 70
                           90       100       110       120
                   ....*....|....*....|....*....|....*....|....
gi 126723547  1226 TPVVL--D---LSYASRTtadpevregrRVSKAWLIRWSLLVQD 1264
Cdd:pfam17917   71 RKFTVytDhkpLKYLFTP----------KELNGRLARWALFLQE 104
PHA03247 PHA03247
large tegument protein UL36; Provisional
425-693 1.61e-07

large tegument protein UL36; Provisional


Pssm-ID: 223021 [Multi-domain]  Cd Length: 3151  Bit Score: 56.87  E-value: 1.61e-07
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 126723547  425 PSAESPAGRPDGglggeaALQNCPRPEISPKVTSLLVVPGSSDVKDKVSSDL-PQIGPPLTSTPqlqAGGEPGDQGSMQL 503
Cdd:PHA03247 2767 PAPAPPAAPAAG------PPRRLTRPAVASLSESRESLPSPWDPADPPAAVLaPAAALPPAASP---AGPLPPPTSAQPT 2837
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 126723547  504 DFKGLEEGPAPVLPTGQGkpVAQGGLTDQSVPGAQTVPetlkVPMAAAVPKAENPSRTQVPSAAPKLPTSRMmlavhtEP 583
Cdd:PHA03247 2838 APPPPPGPPPPSLPLGGS--VAPGGDVRRRPPSRSPAA----KPAAPARPPVRRLARPAVSRSTESFALPPD------QP 2905
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 126723547  584 AAPEVPLAPTKPTAQlmATAQKTVVNQPvlvaqvEPTTPKTPQAQKMPVAKTSPAGPKTPKAQAGPAATVskapaaskap 663
Cdd:PHA03247 2906 ERPPQPQAPPPPQPQ--PQPPPPPQPQP------PPPPPPRPQPPLAPTTDPAGAGEPSGAVPQPWLGAL---------- 2967
                         250       260       270
                  ....*....|....*....|....*....|
gi 126723547  664 aapkVPVTPRVSRAPKTPAAQKVPTDAGPT 693
Cdd:PHA03247 2968 ----VPGRVAVPRFRVPQPAPSREAPASST 2993
PHA03247 PHA03247
large tegument protein UL36; Provisional
419-713 6.12e-07

large tegument protein UL36; Provisional


Pssm-ID: 223021 [Multi-domain]  Cd Length: 3151  Bit Score: 54.94  E-value: 6.12e-07
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 126723547  419 KELAPLPSAESPA-GRPDGGLGGEAALQNCPRPEISPKVTSLLVVPGSsdvKDKVSSDLPQIGPPLTSTPQLQAGGEP-- 495
Cdd:PHA03247 2706 PTPEPAPHALVSAtPLPPGPAAARQASPALPAAPAPPAVPAGPATPGG---PARPARPPTTAGPPAPAPPAAPAAGPPrr 2782
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 126723547  496 GDQGSMQLDFKGLEEGPAPVLPTGQGKPV--AQGGLTDQSVPGAQTVPETLKVPMAAAVPKAENPSRTQVP-SAAPKLPT 572
Cdd:PHA03247 2783 LTRPAVASLSESRESLPSPWDPADPPAAVlaPAAALPPAASPAGPLPPPTSAQPTAPPPPPGPPPPSLPLGgSVAPGGDV 2862
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 126723547  573 SRMmlavHTEPAAPEVPLAPTKPTAQLMATAQKTVVNQPVLVAQVEPTTPKTPQAQKMPV--------AKTSPAGPKTPK 644
Cdd:PHA03247 2863 RRR----PPSRSPAAKPAAPARPPVRRLARPAVSRSTESFALPPDQPERPPQPQAPPPPQpqpqppppPQPQPPPPPPPR 2938
                         250       260       270       280       290       300
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....
gi 126723547  645 AQAGPAATvskapaASKAPAAPKVPVTPRVSRAPKTPAAQKVPTDAGPTLDVARLLSEVQPTSRASVSL 713
Cdd:PHA03247 2939 PQPPLAPT------TDPAGAGEPSGAVPQPWLGALVPGRVAVPRFRVPQPAPSREAPASSTPPLTGHSL 3001
DUF5585 pfam17823
Family of unknown function (DUF5585); This is a family of unknown function found in chordata.
477-710 2.19e-06

Family of unknown function (DUF5585); This is a family of unknown function found in chordata.


Pssm-ID: 465521 [Multi-domain]  Cd Length: 506  Bit Score: 52.65  E-value: 2.19e-06
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 126723547   477 PQIGPPLTSTPQLQAGGEPGDQGSMQLDFKGLEEGPAPVLP--------TGQGKPVAqgGLTDQSVPGAQTVPETLKVPM 548
Cdd:pfam17823  168 PHAASPAPRTAASSTTAASSTTAASSAPTTAASSAPATLTPargistaaTATGHPAA--GTALAAVGNSSPAAGTVTAAV 245
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 126723547   549 AAAVPKAENPSRTQVPSAAPKLPTSRMMLAVHTEPA---------APEVPLAPTKPTAQlmATAQKTVVNQPVLVAQVEP 619
Cdd:pfam17823  246 GTVTPAALATLAAAAGTVASAAGTINMGDPHARRLSpakhmpsdtMARNPAAPMGAQAQ--GPIIQVSTDQPVHNTAGEP 323
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 126723547   620 T-----------TPKTPQAQKMPVAKTSPAGPKTPKAQAGP---AATVSKAPAASKAPAAPKVPVTPRVSrAPKTP-AAQ 684
Cdd:pfam17823  324 TpspsnttlepnTPKSVASTNLAVVTTTKAQAKEPSASPVPvlhTSMIPEVEATSPTTQPSPLLPTQGAA-GPGILlAPE 402
                          250       260
                   ....*....|....*....|....*.
gi 126723547   685 KVPTDAGPTLDVArllsevQPTSRAS 710
Cdd:pfam17823  403 QVATEATAGTASA------GPTPRSS 422
PHA03247 PHA03247
large tegument protein UL36; Provisional
421-693 5.21e-06

large tegument protein UL36; Provisional


Pssm-ID: 223021 [Multi-domain]  Cd Length: 3151  Bit Score: 51.86  E-value: 5.21e-06
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 126723547  421 LAPLPSAESPAGRPDGGLGGEAALQNCPRPEISPKVTSLLVVPGSSDvkDKVSSDLPQIGPPLT-------STPQLQAGG 493
Cdd:PHA03247 2560 PPAAPDRSVPPPRPAPRPSEPAVTSRARRPDAPPQSARPRAPVDDRG--DPRGPAPPSPLPPDThapdpppPSPSPAANE 2637
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 126723547  494 EPGDQGSMQLDFKGLEEGPAPVLPTGQGKPVAQGGLTDQSVPGAQTVPETLKVPMAAAVPKAENPSRTQVPSAAPKLPTS 573
Cdd:PHA03247 2638 PDPHPPPTVPPPERPRDDPAPGRVSRPRRARRLGRAAQASSPPQRPRRRAARPTVGSLTSLADPPPPPPTPEPAPHALVS 2717
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 126723547  574 RMMLAVHTEPAAPEVPLAPTKPTAQLMATAQKTVVNQPVLVAQVEPTTPKTPQAQKMPVAKTSPAGPKTPKAQAGPAATV 653
Cdd:PHA03247 2718 ATPLPPGPAAARQASPALPAAPAPPAVPAGPATPGGPARPARPPTTAGPPAPAPPAAPAAGPPRRLTRPAVASLSESRES 2797
                         250       260       270       280
                  ....*....|....*....|....*....|....*....|
gi 126723547  654 SKAPAASKAPAAPKVPVTPRVSRAPKTPAAQKVPTDAGPT 693
Cdd:PHA03247 2798 LPSPWDPADPPAAVLAPAAALPPAASPAGPLPPPTSAQPT 2837
PTZ00449 PTZ00449
104 kDa microneme/rhoptry antigen; Provisional
469-689 9.39e-06

104 kDa microneme/rhoptry antigen; Provisional


Pssm-ID: 185628 [Multi-domain]  Cd Length: 943  Bit Score: 50.84  E-value: 9.39e-06
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 126723547  469 KDKVSSDLPQIGPPLTSTPQLQAGGEPGDQGSMQlDFKGLEEgpapvlPTGQGKP--VAQGGLTDQSVPGAQTVPEtlKV 546
Cdd:PTZ00449  502 EDSDKHDEPPEGPEASGLPPKAPGDKEGEEGEHE-DSKESDE------PKEGGKPgeTKEGEVGKKPGPAKEHKPS--KI 572
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 126723547  547 PMAAAVPKAENPSRTQVPSAAPKLPTSRMMLAVHTEPAAPEVPLAPTKPTAQLMATAQKTVVNQPVLVAQVEPTTPKTPQ 626
Cdd:PTZ00449  573 PTLSKKPEFPKDPKHPKDPEEPKKPKRPRSAQRPTRPKSPKLPELLDIPKSPKRPESPKSPKRPPPPQRPSSPERPEGPK 652
                         170       180       190       200       210       220       230
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*.
gi 126723547  627 AQKMPVAKTSPAGPKTPK-------------AQAGPAATVSKAPAASKAPAAPKVPVTPRVSRAPKTPAAQKVPTD 689
Cdd:PTZ00449  653 IIKSPKPPKSPKPPFDPKfkekfyddyldaaAKSKETKTTVVLDESFESILKETLPETPGTPFTTPRPLPPKLPRD 728
PHA03247 PHA03247
large tegument protein UL36; Provisional
448-649 9.78e-06

large tegument protein UL36; Provisional


Pssm-ID: 223021 [Multi-domain]  Cd Length: 3151  Bit Score: 51.09  E-value: 9.78e-06
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 126723547  448 PRPEISPKVTSLLVVPGSSDvkdkvssdlPQIGPPLTSTPQLQAGGEPGDQGSMQLDFKGLEEGPAPVLPTGQGKPVAQG 527
Cdd:PHA03247 2887 ARPAVSRSTESFALPPDQPE---------RPPQPQAPPPPQPQPQPPPPPQPQPPPPPPPRPQPPLAPTTDPAGAGEPSG 2957
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 126723547  528 GLTDQS----VPGAQTVPETLKVPMAAAVPKAENPSRTQVPSAAPKLPTSRMMLAVHTEPAAPEVPLAPT-KPTAQLMAT 602
Cdd:PHA03247 2958 AVPQPWlgalVPGRVAVPRFRVPQPAPSREAPASSTPPLTGHSLSRVSSWASSLALHEETDPPPVSLKQTlWPPDDTEDS 3037
                         170       180       190       200       210
                  ....*....|....*....|....*....|....*....|....*....|..
gi 126723547  603 AQKTVVNQPVLVAQVEPTTPKTPQAQKMPVAKTSPAGP-----KTPKAQAGP 649
Cdd:PHA03247 3038 DADSLFDSDSERSDLEALDPLPPEPHDPFAHEPDPATPeagarESPSSQFGP 3089
RNase_HI_RT_Ty3 cd09274
Ty3/Gypsy family of RNase HI in long terminal repeat retroelements; Ribonuclease H (RNase H) ...
1161-1264 1.80e-05

Ty3/Gypsy family of RNase HI in long terminal repeat retroelements; Ribonuclease H (RNase H) enzymes are divided into two major families, Type 1 and Type 2, based on amino acid sequence similarities and biochemical properties. RNase H is an endonuclease that cleaves the RNA strand of an RNA/DNA hybrid in a sequence non-specific manner in the presence of divalent cations. RNase H is widely present in various organisms, including bacteria, archaea and eukaryotes. RNase HI has also been observed as an adjunct domain to the reverse transcriptase gene in retroviruses, in long terminal repeat (LTR)-bearing retrotransposons and non-LTR retrotransposons. RNase HI in LTR retrotransposons performs degradation of the original RNA template, generation of a polypurine tract (the primer for plus-strand DNA synthesis), and final removal of RNA primers from newly synthesized minus and plus strands. The catalytic residues for RNase H enzymatic activity, three aspartic acid residues and one glutamic acid residue (DEDD), are invariant across all RNase H domains. Based on phylogenetic patterns, RNase HI of LTR retroelements is classified into five major families: Ty3/Gypsy, Ty1/Copia, Bel/Pao, DIRS1 and the vertebrate retroviruses. The Ty3/Gypsy family is widely distributed among the genomes of plants, fungi and animals. RNase H inhibitors have been explored as anti-HIV drugs because RNase H inactivation inhibits reverse transcription.


Pssm-ID: 260006 [Multi-domain]  Cd Length: 121  Bit Score: 45.95  E-value: 1.80e-05
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 126723547 1161 RLEVTVSHVALTAILHQEHS-GRKHPIAYTSKPLLPDEESQgpqsggdSPY-----AVAWALKHFSRCIGDTPVVL--D- 1231
Cdd:cd09274     1 ILETDASDYGIGAVLSQEDDdGKERPIAFFSRKLTPAERNY-------STTekellAIVWALKKFRHYLLGRPFTVytDh 73
                          90       100       110
                  ....*....|....*....|....*....|....*
gi 126723547 1232 --LSYAsRTTADPEVRegrrvskawLIRWSLLVQD 1264
Cdd:cd09274    74 kaLKYL-LTQKDLNGR---------LARWLLLLSE 98
PRK07003 PRK07003
DNA polymerase III subunit gamma/tau;
517-720 4.98e-05

DNA polymerase III subunit gamma/tau;


Pssm-ID: 235906 [Multi-domain]  Cd Length: 830  Bit Score: 48.31  E-value: 4.98e-05
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 126723547  517 PTGQGKPvaqGGLTDQSVPGAqtvpetlkVPMAAAVPKAENPSRTQVPSAAPKLPTSRmmlAVHTEPAAPEVPLAPTKPT 596
Cdd:PRK07003  362 VTGGGAP---GGGVPARVAGA--------VPAPGARAAAAVGASAVPAVTAVTGAAGA---ALAPKAAAAAAATRAEAPP 427
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 126723547  597 AQLMATAQKTVVNQPVLVAQVEPTTPKTPQAQKMPVAKTSPAGPKTPKAQAGPAATVSKAPAA-SKAPAAPKVPVTPRVS 675
Cdd:PRK07003  428 AAPAPPATADRGDDAADGDAPVPAKANARASADSRCDERDAQPPADSGSASAPASDAPPDAAFePAPRAAAPSAATPAAV 507
                         170       180       190       200
                  ....*....|....*....|....*....|....*....|....*
gi 126723547  676 RAPKTPAAQKVPTDAGPTLDVARLLSEVQPTSRASVSLLKGQGQA 720
Cdd:PRK07003  508 PDARAPAAASREDAPAAAAPPAPEARPPTPAAAAPAARAGGAAAA 552
PRK10905 PRK10905
cell division protein DamX; Validated
481-652 5.09e-05

cell division protein DamX; Validated


Pssm-ID: 236792 [Multi-domain]  Cd Length: 328  Bit Score: 47.63  E-value: 5.09e-05
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 126723547  481 PPLTSTPQlQAGGEPGDQGSMQLDFKGlEEGPAPVLPTGQGKpvAQGGLTDQSVPgaqTVPETLkvpmaAAVPKAENPSR 560
Cdd:PRK10905   74 PPISSTPT-QGQTPVATDGQQRVEVQG-DLNNALTQPQNQQQ--LNNVAVNSTLP---TEPATV-----APVRNGNASRQ 141
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 126723547  561 TQVPSAAPKLPTSRmmlavhtePAAPEVPLAPTKPtaqlMATAQKTVVNQPVLVAQVEPTTPKTPqaQKMPVAKTSPAGP 640
Cdd:PRK10905  142 TAKTQTAERPATTR--------PARKQAVIEPKKP----QATAKTEPKPVAQTPKRTEPAAPVAS--TKAPAATSTPAPK 207
                         170
                  ....*....|....*.
gi 126723547  641 KT----PKAQAGPAAT 652
Cdd:PRK10905  208 ETattaPVQTASPAQT 223
PRK07003 PRK07003
DNA polymerase III subunit gamma/tau;
487-698 9.81e-05

DNA polymerase III subunit gamma/tau;


Pssm-ID: 235906 [Multi-domain]  Cd Length: 830  Bit Score: 47.54  E-value: 9.81e-05
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 126723547  487 PQLQAGGEPGDQGSmqldfkgleegPAPVLPTGQGKPVAQGGLTDQSVPGAQTVPETlkvPMAAAVPKAENPSRTQVPSA 566
Cdd:PRK07003  360 PAVTGGGAPGGGVP-----------ARVAGAVPAPGARAAAAVGASAVPAVTAVTGA---AGAALAPKAAAAAAATRAEA 425
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 126723547  567 APKLPTSRMMLAVHTEPAAPEVPLAPTKPTAQLMATAQKTVVNQPVL--VAQVEPTTPKTPQAQKMPVAKTSPAGPKTPK 644
Cdd:PRK07003  426 PPAAPAPPATADRGDDAADGDAPVPAKANARASADSRCDERDAQPPAdsGSASAPASDAPPDAAFEPAPRAAAPSAATPA 505
                         170       180       190       200       210
                  ....*....|....*....|....*....|....*....|....*....|....*..
gi 126723547  645 AQAGPAAtvskaPAASKAPAAPKVPVTPRVSRAPKTPAAQKVPTDAG---PTLDVAR 698
Cdd:PRK07003  506 AVPDARA-----PAAASREDAPAAAAPPAPEARPPTPAAAAPAARAGgaaAALDVLR 557
PHA03247 PHA03247
large tegument protein UL36; Provisional
481-778 1.52e-04

large tegument protein UL36; Provisional


Pssm-ID: 223021 [Multi-domain]  Cd Length: 3151  Bit Score: 47.24  E-value: 1.52e-04
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 126723547  481 PPLTSTPQLQAGGEPGD--QGSMQLDFKGLEE-------GPAPVLPtgqgkPVAQGGLTDQSVPGAQTVPETLK------ 545
Cdd:PHA03247 2511 APSRLAPAILPDEPVGEpvHPRMLTWIRGLEElasddagDPPPPLP-----PAAPPAAPDRSVPPPRPAPRPSEpavtsr 2585
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 126723547  546 -----VPMAAAVPKA-----ENPSRTQVPSAAPKL----------PTSRMMLAVHTEPAAPEVPLAPTKPTAQLMATAQK 605
Cdd:PHA03247 2586 arrpdAPPQSARPRApvddrGDPRGPAPPSPLPPDthapdppppsPSPAANEPDPHPPPTVPPPERPRDDPAPGRVSRPR 2665
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 126723547  606 TVVNQPVLVAQVEPTTPKTPQAQKMPVAK-TSPAGPKTPKAQAGPAATVSKAPAASKAPAAPKVPVTPRVSRAPKTPaaq 684
Cdd:PHA03247 2666 RARRLGRAAQASSPPQRPRRRAARPTVGSlTSLADPPPPPPTPEPAPHALVSATPLPPGPAAARQASPALPAAPAPP--- 2742
                         250       260       270       280       290       300       310       320
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 126723547  685 kvPTDAGPTLDVARLLSEVQPTSRASVSLLKGQGQAGrqGPQSSGTLALSSKHQFQMEGLLGAWEGAPrQPPRHLQANST 764
Cdd:PHA03247 2743 --AVPAGPATPGGPARPARPPTTAGPPAPAPPAAPAA--GPPRRLTRPAVASLSESRESLPSPWDPAD-PPAAVLAPAAA 2817
                         330
                  ....*....|....
gi 126723547  765 VTSFQRYHEALNTP 778
Cdd:PHA03247 2818 LPPAASPAGPLPPP 2831
rne PRK10811
ribonuclease E; Reviewed
523-709 2.85e-04

ribonuclease E; Reviewed


Pssm-ID: 236766 [Multi-domain]  Cd Length: 1068  Bit Score: 46.19  E-value: 2.85e-04
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 126723547  523 PVAQggltdqsvPGAQTVPETLKVPMAAAVPKAENpsrTQVPSAAPKLPTSRMMLAVHTEPAAPEVPLAPTKPTAQLMAT 602
Cdd:PRK10811  846 PVVR--------PQDVQVEEQREAEEVQVQPVVAE---VPVAAAVEPVVSAPVVEAVAEVVEEPVVVAEPQPEEVVVVET 914
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 126723547  603 AQ-----KTVVNQPVLVAQVEPTTPKTPQAQKMPVakTSPAGPKTPKAQAGPAATVSKAPAASKAPAAPKVPVTPRVSrA 677
Cdd:PRK10811  915 THpeviaAPVTEQPQVITESDVAVAQEVAEHAEPV--VEPQDETADIEEAAETAEVVVAEPEVVAQPAAPVVAEVAAE-V 991
                         170       180       190
                  ....*....|....*....|....*....|..
gi 126723547  678 PKTPAAQKVPTDAGPTLDVARLLSEVQPTSRA 709
Cdd:PRK10811  992 ETVTAVEPEVAPAQVPEATVEHNHATAPMTRA 1023
rne PRK10811
ribonuclease E; Reviewed
509-653 3.08e-04

ribonuclease E; Reviewed


Pssm-ID: 236766 [Multi-domain]  Cd Length: 1068  Bit Score: 45.80  E-value: 3.08e-04
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 126723547  509 EEGPAPVLPTGQGKPVAQGGLTDQSVPGAQTVPETLKVPMAAAVPKAENPSRTQVPSAAPKLPTSRMMLAVHTEPAAPEV 588
Cdd:PRK10811  863 EVQVQPVVAEVPVAAAVEPVVSAPVVEAVAEVVEEPVVVAEPQPEEVVVVETTHPEVIAAPVTEQPQVITESDVAVAQEV 942
                          90       100       110       120       130       140
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*..
gi 126723547  589 P--LAPTKPTAQLMATAQKTVVNQPVLVaqVEPTTPKTPQAQKMPVAKTSPAGPKTPKAQAGPAATV 653
Cdd:PRK10811  943 AehAEPVVEPQDETADIEEAAETAEVVV--AEPEVVAQPAAPVVAEVAAEVETVTAVEPEVAPAQVP 1007
DUF5585 pfam17823
Family of unknown function (DUF5585); This is a family of unknown function found in chordata.
457-682 3.86e-04

Family of unknown function (DUF5585); This is a family of unknown function found in chordata.


Pssm-ID: 465521 [Multi-domain]  Cd Length: 506  Bit Score: 45.34  E-value: 3.86e-04
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 126723547   457 TSLLVVPGSSDVKDKVSSDLPQIGPPLTSTPQLQAGGEPGDQGSMQL-DFKGLEEGPAPVLP--TGQGKPVAQGGLTDQs 533
Cdd:pfam17823  226 TALAAVGNSSPAAGTVTAAVGTVTPAALATLAAAAGTVASAAGTINMgDPHARRLSPAKHMPsdTMARNPAAPMGAQAQ- 304
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 126723547   534 vpgAQTVPETLKVPMAAAVPK-AENPSRTQVPSAAPKLPTSRMMLAVHT------EPAAPEVPLAPTKptaqlmataqkt 606
Cdd:pfam17823  305 ---GPIIQVSTDQPVHNTAGEpTPSPSNTTLEPNTPKSVASTNLAVVTTtkaqakEPSASPVPVLHTS------------ 369
                          170       180       190       200       210       220       230
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*...
gi 126723547   607 vvnqpvLVAQVEPTTPKTPQAQKMPVAKTspAGPKTPKA--QAGPAATVSKAPAASkapaapkvpvTPRVSRAPKTPA 682
Cdd:pfam17823  370 ------MIPEVEATSPTTQPSPLLPTQGA--AGPGILLApeQVATEATAGTASAGP----------TPRSSGDPKTLA 429
PRK12323 PRK12323
DNA polymerase III subunit gamma/tau;
522-810 7.12e-04

DNA polymerase III subunit gamma/tau;


Pssm-ID: 237057 [Multi-domain]  Cd Length: 700  Bit Score: 44.48  E-value: 7.12e-04
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 126723547  522 KPVAQGGltDQSVPGAQTVPETLKVPMAAAVPKAENPSRTQVPSAAPKLPTSRMMLAVHTEPAAPEVPLAPTKPTAQLMA 601
Cdd:PRK12323  364 RPGQSGG--GAGPATAAAAPVAQPAPAAAAPAAAAPAPAAPPAAPAAAPAAAAAARAVAAAPARRSPAPEALAAARQASA 441
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 126723547  602 TAQKTVVNQPVLVAQVEPTTPKTPQAQKMPVAKTSPAGPKTPKAQAGPAATVSKAPAASKAPAAPKVPVTPRVSRAPKTP 681
Cdd:PRK12323  442 RGPGGAPAPAPAPAAAPAAAARPAAAGPRPVAAAAAAAPARAAPAAAPAPADDDPPPWEELPPEFASPAPAQPDAAPAGW 521
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 126723547  682 AAQKV-------PTDAGPTLDVARLLSEVQPTSRAS---VSLLKGQGQAGRQGPQSSG---TLA-------LSSKHQFQM 741
Cdd:PRK12323  522 VAESIpdpatadPDDAFETLAPAPAAAPAPRAAAATepvVAPRPPRASASGLPDMFDGdwpALAarlpvrgLAQQLARQS 601
                         250       260       270       280       290       300       310
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*.
gi 126723547  742 EglLGAWEGAP---RQPPRHLQANSTVtsfQRYHEALNTPFelnlsGEPgnqglRRVVID----GSSVAMVHGLQH 810
Cdd:PRK12323  602 E--LAGVEGDTvrlRVPVPALAEAEVV---ERLQAALTEHF-----GQP-----VRVVCEvgavGATAAAVDAEER 662
PHA03378 PHA03378
EBNA-3B; Provisional
538-732 1.13e-03

EBNA-3B; Provisional


Pssm-ID: 223065 [Multi-domain]  Cd Length: 991  Bit Score: 43.90  E-value: 1.13e-03
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 126723547  538 QTVPETLKVPMAA---AVPKAENPSRTQVPSAAP-KLPTSRMMLAVHTEPAAPEVPLAPTKPTAQLMATAQKTVVNQPVL 613
Cdd:PHA03378  688 QWAPGTMQPPPRAptpMRPPAAPPGRAQRPAAATgRARPPAAAPGRARPPAAAPGRARPPAAAPGRARPPAAAPGRARPP 767
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 126723547  614 VAQVEPTTPkTPQAQKMPVAKTSPAGPKTPKA--QAGPAAtvskapaaskapaapkvpvtprVSRAPKTPAAQKVPTDAG 691
Cdd:PHA03378  768 AAAPGAPTP-QPPPQAPPAPQQRPRGAPTPQPppQAGPTS----------------------MQLMPRAAPGQQGPTKQI 824
                         170       180       190       200
                  ....*....|....*....|....*....|....*....|.
gi 126723547  692 PTLDVARLLSEVQPTSRASVSLLKgQGQAGRQGPQSSGTLA 732
Cdd:PHA03378  825 LRQLLTGGVKRGRPSLKKPAALER-QAAAGPTPSPGSGTSD 864
PTZ00449 PTZ00449
104 kDa microneme/rhoptry antigen; Provisional
418-687 1.14e-03

104 kDa microneme/rhoptry antigen; Provisional


Pssm-ID: 185628 [Multi-domain]  Cd Length: 943  Bit Score: 43.91  E-value: 1.14e-03
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 126723547  418 QKELAPLPSAESPA-GRPDGGLGGEAAlqncprPEISPKVTSllvvpGSSDVK-DKVSSDLPQIGPPLTSTPQLQAGGEP 495
Cdd:PTZ00449  493 KKKLAPIEEEDSDKhDEPPEGPEASGL------PPKAPGDKE-----GEEGEHeDSKESDEPKEGGKPGETKEGEVGKKP 561
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 126723547  496 G----DQGSMQLDFKGLEEGPA-------PVLPTGQGKPVAQGGLTDQSVPgaqTVPETLKVPMAAAVPKAENPSRTQVP 564
Cdd:PTZ00449  562 GpakeHKPSKIPTLSKKPEFPKdpkhpkdPEEPKKPKRPRSAQRPTRPKSP---KLPELLDIPKSPKRPESPKSPKRPPP 638
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 126723547  565 S---AAPKLPTSRMMLAVHTEPAAPEVPLAPT---KPTAQLMATAQKT--VVNQPVLVAQVEPTTPKT-PQAQKMPVAKT 635
Cdd:PTZ00449  639 PqrpSSPERPEGPKIIKSPKPPKSPKPPFDPKfkeKFYDDYLDAAAKSkeTKTTVVLDESFESILKETlPETPGTPFTTP 718
                         250       260       270       280       290
                  ....*....|....*....|....*....|....*....|....*....|..
gi 126723547  636 SPAGPKTPKAQAGPAATVSKAPAASKAPAAPKVPVTPRVSRAPKTPAAQKVP 687
Cdd:PTZ00449  719 RPLPPKLPRDEEFPFEPIGDPDAEQPDDIEFFTPPEEERTFFHETPADTPLP 770
PRK14951 PRK14951
DNA polymerase III subunits gamma and tau; Provisional
503-653 1.28e-03

DNA polymerase III subunits gamma and tau; Provisional


Pssm-ID: 237865 [Multi-domain]  Cd Length: 618  Bit Score: 43.55  E-value: 1.28e-03
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 126723547  503 LDFKglEEGPAPVLPTGQGKPVAQGGltdQSVPGAQTVPETLKVPMAAAVPKAENPSRTQVPSAAPKLPTSRmmlavhte 582
Cdd:PRK14951  362 LAFK--PAAAAEAAAPAEKKTPARPE---AAAPAAAPVAQAAAAPAPAAAPAAAASAPAAPPAAAPPAPVAA-------- 428
                          90       100       110       120       130       140       150
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|...
gi 126723547  583 PAAPEVPLAPTKPTAQLMATAQKTVVNQPVLVAQVEPTTPKTPQAQKMPVAKTSPAGPKTPKAQAGPA--ATV 653
Cdd:PRK14951  429 PAAAAPAAAPAAAPAAVALAPAPPAQAAPETVAIPVRVAPEPAVASAAPAPAAAPAAARLTPTEEGDVwhATV 501
PHA03379 PHA03379
EBNA-3A; Provisional
494-791 1.38e-03

EBNA-3A; Provisional


Pssm-ID: 223066 [Multi-domain]  Cd Length: 935  Bit Score: 43.89  E-value: 1.38e-03
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 126723547  494 EPG-----DQGSMQlDFKGLEEGPAPVLPTGQGKPVAQGGLTDQSVPGAQTVPETLKVPMAAAVPKAEnPSRTQVPSAAP 568
Cdd:PHA03379  444 EPPpvhdlEPGPLH-DQHSMAPCPVAQLPPGPLQDLEPGDQLPGVVQDGRPACAPVPAPAGPIVRPWE-ASLSQVPGVAF 521
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 126723547  569 KlPTSRMMLAVHTEP---AAPEVPLAPTKPTAQLMATAQKTVVNQPVLVAQVEPTTPKTPQA-QKMPVAKTSPAGPKTPK 644
Cdd:PHA03379  522 A-PVMPQPMPVEPVPvptVALERPVCPAPPLIAMQGPGETSGIVRVRERWRPAPWTPNPPRSpSQMSVRDRLARLRAEAQ 600
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 126723547  645 AQAGPAATVSKAPAASKAPAAPKVPVTPRVSRAPKTPAAQKVPTDAGPTLDV----ARLLSEVQP-TSRASVSLLK---- 715
Cdd:PHA03379  601 PYQASVEVQPPQLTQVSPQQPMEYPLEPEQQMFPGSPFSQVADVMRAGGVPAmqpqYFDLPLQQPiSQGAPLAPLRasmg 680
                         250       260       270       280       290       300       310       320
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 126723547  716 GQGQAGRQGPQ----------SSGTLALSSKHQFQMEGllgawegaPRQPPRHLQANSTVTSFQRYHEALNTPFELNLSg 785
Cdd:PHA03379  681 PVPPVPATQPQyfdipltepiNQGASAAHFLPQQPMEG--------PLVPERWMFQGATLSQSVRPGVAQSQYFDLPLT- 751

                  ....*.
gi 126723547  786 EPGNQG 791
Cdd:PHA03379  752 QPINHG 757
PRK07994 PRK07994
DNA polymerase III subunits gamma and tau; Validated
503-651 1.39e-03

DNA polymerase III subunits gamma and tau; Validated


Pssm-ID: 236138 [Multi-domain]  Cd Length: 647  Bit Score: 43.70  E-value: 1.39e-03
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 126723547  503 LDFKGLEEGPAPVLPTGQGKPVAQGGLTdqSVPGAQTVPetlkvPMAAAVPKAENPSRTQVPSAAPKLPTSRMMLAVHTE 582
Cdd:PRK07994  357 LAFHPAAPLPEPEVPPQSAAPAASAQAT--AAPTAAVAP-----PQAPAVPPPPASAPQQAPAVPLPETTSQLLAARQQL 429
                          90       100       110       120       130       140       150
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 126723547  583 PAAPE-VPLAPTKPTAQLMATAQKTVVNQPVLVAQVEPTTPKTPQAQKMPVAKTSPAGPKTPKAQAGPAA 651
Cdd:PRK07994  430 QRAQGaTKAKKSEPAAASRARPVNSALERLASVRPAPSALEKAPAKKEAYRWKATNPVEVKKEPVATPKA 499
PspC_subgroup_2 NF033839
pneumococcal surface protein PspC, LPXTG-anchored form; The pneumococcal surface protein PspC, ...
535-696 2.37e-03

pneumococcal surface protein PspC, LPXTG-anchored form; The pneumococcal surface protein PspC, as described in Streptococcus pneumoniae, is a repetitive and highly variable protein, recognized by a conserved N-terminal domain and also by genomic location. This form, subgroup 2, is anchored covalently after cleavage by sortase at a C-terminal LPXTG site. The other form, subgroup 1, has variable numbers of a choline-binding repeat in the C-terminal region, and is also known as choline-binding protein A.


Pssm-ID: 468202 [Multi-domain]  Cd Length: 557  Bit Score: 42.83  E-value: 2.37e-03
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 126723547  535 PGAQTVPETLKVPMAaavPKAENPSrtqvPSAAPKLPTSRMMLAVHTEPAAPEVPLAPTKPTAQLMATAQKTvvnQPVLV 614
Cdd:NF033839  297 PGMQPSPQPEKKEVK---PEPETPK----PEVKPQLEKPKPEVKPQPEKPKPEVKPQLETPKPEVKPQPEKP---KPEVK 366
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 126723547  615 AQVEPTTPKTPQAQKMPVAKTSPAgPKTPKAQAGPAATVSKAPAASKAPAApkvpvTPRVSRAPKTPAAQKVPTDAGPTL 694
Cdd:NF033839  367 PQPEKPKPEVKPQPETPKPEVKPQ-PEKPKPEVKPQPEKPKPEVKPQPEKP-----KPEVKPQPEKPKPEVKPQPEKPKP 440

                  ..
gi 126723547  695 DV 696
Cdd:NF033839  441 EV 442
PRK14951 PRK14951
DNA polymerase III subunits gamma and tau; Provisional
576-693 3.61e-03

DNA polymerase III subunits gamma and tau; Provisional


Pssm-ID: 237865 [Multi-domain]  Cd Length: 618  Bit Score: 42.01  E-value: 3.61e-03
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 126723547  576 MLAVHTEPAAPEVPLAPTKPTAQLMATAQKTVvnqPVLVAQVePTTPKTPQAQKMPVAKTSPA----GPKTPKAQAGPAA 651
Cdd:PRK14951  361 LLAFKPAAAAEAAAPAEKKTPARPEAAAPAAA---PVAQAAA-APAPAAAPAAAASAPAAPPAaappAPVAAPAAAAPAA 436
                          90       100       110       120
                  ....*....|....*....|....*....|....*....|..
gi 126723547  652 TVSKAPAASKAPAAPKVPVTPRVSRAPKTPAAQKVPTDAGPT 693
Cdd:PRK14951  437 APAAAPAAVALAPAPPAQAAPETVAIPVRVAPEPAVASAAPA 478
PHA03377 PHA03377
EBNA-3C; Provisional
417-646 4.66e-03

EBNA-3C; Provisional


Pssm-ID: 177614 [Multi-domain]  Cd Length: 1000  Bit Score: 41.96  E-value: 4.66e-03
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 126723547  417 KQKELAPLPSAESPAGRPDGGLGGEAALQNCPRPEISPKVTSLLVVPGSSDVKD--KVSSDLPQIGPPLTSTPQLQAGGE 494
Cdd:PHA03377  546 RQKRATPPKVSPSDRGPPKASPPVMAPPSTGPRVMATPSTGPRDMAPPSTGPRQqaKCKDGPPASGPHEKQPPSSAPRDM 625
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 126723547  495 PGDQGSMQLDFKGLEE--GPAPV----------LPTGQGKPVAQGGLTDQSVPGAQT-VPETLKVP-MAAAVPKAENPSR 560
Cdd:PHA03377  626 APSVVRMFLRERLLEQstGPKPKsfwemragrdGSGIQQEPSSRRQPATQSTPPRPSwLPSVFVLPsVDAGRAQPSEESH 705
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 126723547  561 TQvpSAAPKLPTSRMMLAVHTEPAAP-EVPL----APTKPTAQLMATAQKTVVNQPVLVAQVEPTTP-------KTPQAQ 628
Cdd:PHA03377  706 LS--SMSPTQPISHEEQPRYEDPDDPlDLSLhpdqAPPPSHQAPYSGHEEPQAQQAPYPGYWEPRPPqapylgyQEPQAQ 783
                         250
                  ....*....|....*....
gi 126723547  629 KMPVAK-TSPAGPKTPKAQ 646
Cdd:PHA03377  784 GVQVSSyPGYAGPWGLRAQ 802
PHA03247 PHA03247
large tegument protein UL36; Provisional
521-726 5.14e-03

large tegument protein UL36; Provisional


Pssm-ID: 223021 [Multi-domain]  Cd Length: 3151  Bit Score: 42.23  E-value: 5.14e-03
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 126723547  521 GKPVAQGGLTDQ--SVPGAQTVPETLKVPMAAAVPKAENPSRTQVPSAAPKLPTSRMMLA--------VHTEPAAPEVPL 590
Cdd:PHA03247 2476 GAPVYRRPAEARfpFAAGAAPDPGGGGPPDPDAPPAPSRLAPAILPDEPVGEPVHPRMLTwirgleelASDDAGDPPPPL 2555
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 126723547  591 APTKPTAqlmATAQKTVVNQPVlvaqVEPTTPKTPQAQKMPVAKTSPAGPKTPKAQAGPAATVSKAPAASKAPAAPKVPV 670
Cdd:PHA03247 2556 PPAAPPA---APDRSVPPPRPA----PRPSEPAVTSRARRPDAPPQSARPRAPVDDRGDPRGPAPPSPLPPDTHAPDPPP 2628
                         170       180       190       200       210
                  ....*....|....*....|....*....|....*....|....*....|....*.
gi 126723547  671 TPRVSRAPKTPAAQKVPTDAGPTLDVARLLSEVQPTSRASVSLLKGQGQAGRQGPQ 726
Cdd:PHA03247 2629 PSPSPAANEPDPHPPPTVPPPERPRDDPAPGRVSRPRRARRLGRAAQASSPPQRPR 2684
SP2_N cd22540
N-terminal domain of transcription factor Specificity Protein (SP) 2; Specificity Proteins ...
396-745 7.85e-03

N-terminal domain of transcription factor Specificity Protein (SP) 2; Specificity Proteins (SPs) are transcription factors that are involved in many cellular processes, including cell differentiation, cell growth, apoptosis, immune responses, response to DNA damage, and chromatin remodeling. SP2 contains the least conserved DNA-binding domain within the SP subfamily of proteins, and its DNA sequence specificity differs from the other SP proteins. It localizes primarily within subnuclear foci associated with the nuclear matrix, and can activate, or in some cases, repress expression from different promoters. The transcription factor SP2 serves as a paradigm for indirect genomic binding. It does not require its DNA-binding domain for genomic DNA binding and occupies target promoters independently of whether they contain a cognate DNA-binding motif. SP2 belongs to a family of proteins, called the SP/Kruppel or Krueppel-like Factor (KLF) family, characterized by a C-terminal DNA-binding domain of 81 amino acids consisting of three Kruppel-like C2H2 zinc fingers. These factors bind to a loose consensus motif, namely NNRCRCCYY (where N is any nucleotide; R is A/G, and Y is C/T), such as the recurring motifs in GC and GT boxes (5'-GGGGCGGGG-3' and 5-GGTGTGGGG-3') that are present in promoters and more distal regulatory elements of mammalian genes. SP factors preferentially bind GC boxes, while KLFs bind CACCC boxes. Another characteristic hallmark of SP factors is the presence of the Buttonhead (BTD) box CXCPXC, just N-terminal to the zinc fingers. The function of the BTD box is unknown, but it is thought to play an important physiological role. Another feature of most SP factors is the presence of a conserved amino acid stretch, the so-called SP box, located close to the N-terminus. SP factors may be separated into three groups based on their domain architecture and the similarity of their N-terminal transactivation domains: SP1-4, SP5, and SP6-9. 
The transactivation domains between the three groups are not homologous to one another. SP1-4 have similar N-terminal transactivation domains characterized by glutamine-rich regions, which, in most cases, have adjacent serine/threonine-rich regions. This model represents the N-terminal domain of SP2.


Pssm-ID: 411776 [Multi-domain]  Cd Length: 511  Bit Score: 41.07  E-value: 7.85e-03
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 126723547  396 PLGPIQLKLPGQNPlplnlewkqkelAPLPSaeSPAGRPDGGLGGEAALQNCPRPEISPKVTS--LLVVPGSSDVKDKVS 473
Cdd:cd22540    51 PPQPTPRKLVPIKP------------APLPL--GPGKNSIGFLSAKGNIIQLQGSQLSSSAPGgqQVFAIQNPTMIIKGS 116
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 126723547  474 SDLPQIGPPLTSTPQLQAGGEPGDQGSMQL--------------------DFKGLEEGPAPVLPTGQGKPVAQGGLTDQS 533
Cdd:cd22540   117 QTRSSTNQQYQISPQIQAAGQINNSGQIQIipgtnqaiitpvqvlqqpqqAHKPVPIKPAPLQTSNTNSASLQVPGNVIK 196
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 126723547  534 VPGAQTVPETLkvPMAAAVPKAENPSRTQVPSAAPKlpTSRMMLAVHTEPAAPEVPLApTKPTAQLMATAQKTVV---NQ 610
Cdd:cd22540   197 LQSGGNVALTL--PVNNLVGTQDGATQLQLAAAPSK--PSKKIRKKSAQAAQPAVTVA-EQVETVLIETTADNIIqagNN 271
                         250       260       270       280       290       300       310       320
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 126723547  611 PVLVAQvePTTPKTPQAQKMPVAKtspagpktPKAQAGPAATVSKAPAASKAPAAPKVPVTPRVSRAPKTPAAQKVPTDA 690
Cdd:cd22540   272 LLIVQS--PGTGQPAVLQQVQVLQ--------PKQEQQVVQIPQQALRVVQAASATLPTVPQKPLQNIQIQNSEPTPTQV 341
                         330       340       350       360       370
                  ....*....|....*....|....*....|....*....|....*....|....*...
gi 126723547  691 ---GPTLDVARLLSEVQPTSRASVSLLKGQGQAgrQGPQSSGTLALSSKHQFQMEGLL 745
Cdd:cd22540   342 yikTPSGEVQTVLLQEAPAATATPSSSTSTVQQ--QVTANNGTGTSKPNYNVRKERTL 397
PHA03247 PHA03247
large tegument protein UL36; Provisional
423-714 8.03e-03

large tegument protein UL36; Provisional


Pssm-ID: 223021 [Multi-domain]  Cd Length: 3151  Bit Score: 41.46  E-value: 8.03e-03
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 126723547  423 PLPSAESPAGRPDGGLGGeAALQNCPRPeispkvtsllvvpgssdvkdkvssdLPQIGPPLTSTPQLQAGGEPGDQGSMQ 502
Cdd:PHA03247  282 PEAAAPNGAAAPPDGVWG-AALAGAPLA-------------------------LPAPPDPPPPAPAGDAEEEDDEDGAME 335
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 126723547  503 LdfkgleegpapVLPTGQGKPVAQGGLTDQSVPgAQTVPETLKVPMAA--AVPKAENPSRTQVPSAAPKLPTSRMMLAVH 580
Cdd:PHA03247  336 V-----------VSPLPRPRQHYPLGFPKRRRP-TWTPPSSLEDLSAGrhHPKRASLPTRKRRSARHAATPFARGPGGDD 403
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 126723547  581 TEPAAPEVPLAPTKPTAQLMATAQktvvnqpvlvaqvePTTPKTPQAQKMPVAKTSPAGPKTPKaqagPAATVSKAPAAS 660
Cdd:PHA03247  404 QTRPAAPVPASVPTPAPTPVPASA--------------PPPPATPLPSAEPGSDDGPAPPPERQ----PPAPATEPAPDD 465
                         250       260       270       280       290
                  ....*....|....*....|....*....|....*....|....*....|....
gi 126723547  661 KAPAAPKVPVTPRVSRAPKTPAAqkvptdagptlDVARLLsEVQPTSRASVSLL 714
Cdd:PHA03247  466 PDDATRKALDALRERRPPEPPGA-----------DLAELL-GRHPDTAGTVVRL 507
 
Blast search parameters
Data Source: Precalculated data, version = cdd.v.3.21
Preset Options: Database: CDSEARCH/cdd   Low complexity filter: no   Composition Based Adjustment: yes   E-value threshold: 0.01

References:

  • Wang J et al. (2023), "The conserved domain database in 2023", Nucleic Acids Res. 51(D)384-8.
  • Lu S et al. (2020), "The conserved domain database in 2020", Nucleic Acids Res. 48(D)265-8.
  • Marchler-Bauer A et al. (2017), "CDD/SPARCLE: functional classification of proteins via subfamily domain architectures", Nucleic Acids Res. 45(D)200-3.