|
Name |
Accession |
Description |
Interval |
E-value |
| ADNP_N |
pfam19627 |
Activity-dependent neuroprotector homeobox protein N-terminal; This entry represents the ... |
1-962 |
1.55e-63 |
|
Activity-dependent neuroprotector homeobox protein N-terminal; This entry represents the N-terminal domain of Activity-dependent neuroprotector homeobox protein (ADNP, also known as Activity-dependent neuroprotective protein), which contains zinc finger motifs. It is involved in transcriptional regulation and is vital for mammalian brain formation. In humans, de novo mutations result in a syndromic form of autism-like spectrum disorder (ASD), including cognitive and motor deficits, the ADNP syndrome. This protein is also related to autophagy and the pathophysiology of schizophrenia.
Pssm-ID: 466132 [Multi-domain] Cd Length: 744 Bit Score: 230.51 E-value: 1.55e-63
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 767998433 1 MFQIPVENLDNIRK----------------------DLKGFDPGEKYFHNTSWGDVSLWEPSGKKVR-YRTKPYCCGLCK 57
Cdd:pfam19627 1 MFQLPVNNLGSLRKarknvkkilsdigleyckehieDFKDFEPNDFYIKNTSWDDVCLWDPSLTKNQdYRTKPFCCSGCP 80
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 767998433 58 YSTKVLTSFKNHLHRYHEDEIDQELVIPCPNCVFASQPKVVGRHFRMFHAPVRKVQNYTVNILGETKSSRSDVIS----- 132
Cdd:pfam19627 81 FSSKFFSAYKSHFRNVHSEDFENRILLNCPYCTYNGNKKTLETHIKLFHMPNNVVRQPSGGPVGFKDKSKQDSLKpkqgd 160
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 767998433 133 ------FTCLKCNFSNTLYYSMKKHVLVAHFHYLINSYFGlRTEEmgeqpKT-NDTVSIEKIPPPDKYYCKKCNANASSQ 205
Cdd:pfam19627 161 sveqavYYCKKCTYRDPLYNVVRKHIYREHFQHVAAPYVA-KPGE-----KSvNGAVASSNTRDDGSIHCKRCLFMPRTY 234
|
250 260 270 280 290 300 310 320
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 767998433 206 DALMYHiltsdihrdlenklrsVISEHIKrtgllkqthiapkpaahlaapangsapsapaqppcfhlalpqnspspaAGQ 285
Cdd:pfam19627 235 EALVQH----------------VIEDHER------------------------------------------------IGY 250
|
330 340 350 360 370 380 390 400
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 767998433 286 PVTVAQGapgslthsppaagqshmtlvssplpvgqnsltlqppapqpvflshgvplHQSVnppVLPLSQPvgpvnksvgt 365
Cdd:pfam19627 251 QVTAMIG-------------------------------------------------HTNV---VVPRSKP---------- 268
|
410 420 430 440 450 460 470 480
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 767998433 366 svlpinqtvrpgvLPLTQPVGPINRPVGpgvLPVSPSVTPGVLQAvspgvlsvsravpsgvlpagqmtpagqmtpagvip 445
Cdd:pfam19627 269 -------------LMLIAPKPQDKKSLG---VTQKGGLVTGNVRS----------------------------------- 297
|
490 500 510 520 530 540 550 560
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 767998433 446 gqtatsgvLPTGQMvqsgvlpvgqtapSRVLPPGqtaplrvisagqvvpSGLLSPnqtvsssavVPVNQGvnsgvlQLSQ 525
Cdd:pfam19627 298 --------LSSQQM-------------NRLSIPK---------------ANLLSN---------VHLKQG------SYGL 326
|
570 580 590 600 610 620 630 640
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 767998433 526 PVVSGVLPVGQPVRPGvlqlnqtvgtniLPVNQPVR-PGASQNttfltsgsiLRQLIPTGkqvNGIPTYTLAPVSVTLP- 603
Cdd:pfam19627 327 KSMPSFYVLGQQVRLS------------LPGNAQVSvPQQSQT---------VKQLLPGG---NGRPSTVGSSQSGQQPa 382
|
650 660 670 680 690 700 710 720
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 767998433 604 ---VPPGGLATVAPPQM-PIQLLPSGAAAPMAGSMPGMPSppvlvnaaqsvfvqASSSAADTNQVLKqakqWKTCPVCNE 679
Cdd:pfam19627 383 rfsVQSGNSASSSSSQLkSPPLSSSVAATRALGQGPSKSS--------------ASAAGLNTSYTQK----WKICTICNE 444
|
730 740 750 760 770 780 790 800
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 767998433 680 LFPSNVYQVHMEVAHKhsesksgeklePEKLAACAPFLKWMREKTVRCLSCKCLVSEEELIHHLLMHGLGCLFCPCTFHD 759
Cdd:pfam19627 445 LFPENVYSAHFEKEHK-----------AEKVPAVANYIMKIHNFTSKCLYCNRYLPSDTLLNHMLIHGLSCPYCRSTFND 513
|
810 820 830 840 850 860 870 880
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 767998433 760 IKGLSEHSRNRHLGKKKLPMDYSNRGFQLDVDaNGNLLFPHLDFITILPKEKLGEREVYLA---------ILAGIHSKSL 830
Cdd:pfam19627 514 VEKMVAHMRMVHPDEEVGPRTDSPLTFDLTLQ-QGNPKNIQLLVTTYNMRDAPEESVAFHAqnnspqpkkPKPKVQEKSD 592
|
890 900 910 920 930 940 950 960
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 767998433 831 VPvyVKVRPQaegTPGSTGKRV--STCPFCF----GPFvtTEAYELHLKERHHIMPTVHTVLKSPAFKCIHCCGVYTGNM 904
Cdd:pfam19627 593 VP--VKSSPQ---AAVPYKKDVgkTLCPLCFsilkGPI--SDALAHHLRERHQVIQTVHPVEKKLTYKCIHCLGVYTSNM 665
|
970 980 990 1000 1010 1020
....*....|....*....|....*....|....*....|....*....|....*....|.
gi 767998433 905 TLAAIAVHLVRCRSAPK--DSSSDLQAQPGFIHNSELLLVSGEVMH-DSSFSVKRKLPDGH 962
Cdd:pfam19627 666 TASTITLHLVHCRGVGKtqNGQDKSAPSPRVTQSPGAAPLKRELEHvDPALPKKRKLDDEE 726
|
|
| PHA03247 |
PHA03247 |
large tegument protein UL36; Provisional |
248-641 |
5.40e-15 |
|
large tegument protein UL36; Provisional
Pssm-ID: 223021 [Multi-domain] Cd Length: 3151 Bit Score: 80.37 E-value: 5.40e-15
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 767998433 248 PAAHLAAPANGSAPSAPAQPPcfhlalpqnsPSPAAGQPVTVAQGAPGSLTHSPPAA--GQSHMTLVSSPLPVGQNSLTL 325
Cdd:PHA03247 2557 PAAPPAAPDRSVPPPRPAPRP----------SEPAVTSRARRPDAPPQSARPRAPVDdrGDPRGPAPPSPLPPDTHAPDP 2626
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 767998433 326 QPPAPQPVFLSHGVPLHQSVNPPVLPLSQPVGPV------NKSVGTSVLPINQTVRPGVLPLTQPVGPINRPVGPGVLPV 399
Cdd:PHA03247 2627 PPPSPSPAANEPDPHPPPTVPPPERPRDDPAPGRvsrprrARRLGRAAQASSPPQRPRRRAARPTVGSLTSLADPPPPPP 2706
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 767998433 400 SPSVTPGVLQAVSPgvLSVSRAVPSGVLPAGQMTPAGQMTPAG-VIPGQTATSGVLPTGQMVQSGVLPVGQ-TAPSRVLP 477
Cdd:PHA03247 2707 TPEPAPHALVSATP--LPPGPAAARQASPALPAAPAPPAVPAGpATPGGPARPARPPTTAGPPAPAPPAAPaAGPPRRLT 2784
|
250 260 270 280 290 300 310 320
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 767998433 478 PGQTAPLRVISAGQVVPSGLLSPNQTVSS-SAVVPVNQGVNSGVlqlsqPVVSGVLPVGQPVRPGVLQLNQTVGTNILPV 556
Cdd:PHA03247 2785 RPAVASLSESRESLPSPWDPADPPAAVLApAAALPPAASPAGPL-----PPPTSAQPTAPPPPPGPPPPSLPLGGSVAPG 2859
|
330 340 350 360 370 380 390 400
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 767998433 557 NQPVRPGASQNTTFLTSGsilrqliPTGKQVNGIPTYTLAPVSVTLPVPPGGLATVAPPQMPIQLLPSGAAAPMAGSMPG 636
Cdd:PHA03247 2860 GDVRRRPPSRSPAAKPAA-------PARPPVRRLARPAVSRSTESFALPPDQPERPPQPQAPPPPQPQPQPPPPPQPQPP 2932
|
....*
gi 767998433 637 MPSPP 641
Cdd:PHA03247 2933 PPPPP 2937
|
|
| PHA03247 |
PHA03247 |
large tegument protein UL36; Provisional |
246-492 |
2.16e-09 |
|
large tegument protein UL36; Provisional
Pssm-ID: 223021 [Multi-domain] Cd Length: 3151 Bit Score: 62.26 E-value: 2.16e-09
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 767998433 246 PKPAAhLAAPANGSAPSAPAQPPCFHLALPQNSP-SPAAGQPVTVAQGAPGSLTHSPPAAGQSHMTLVSSPLPVGQNSLT 324
Cdd:PHA03247 2758 ARPPT-TAGPPAPAPPAAPAAGPPRRLTRPAVASlSESRESLPSPWDPADPPAAVLAPAAALPPAASPAGPLPPPTSAQP 2836
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 767998433 325 LQPPAPqPVFLSHGVPLHQSVNP--PVLPLSQPVGPVNKSVGTSVLPINQTVRPGVLPLTQPVGpiNRPVGPGVLPVSPS 402
Cdd:PHA03247 2837 TAPPPP-PGPPPPSLPLGGSVAPggDVRRRPPSRSPAAKPAAPARPPVRRLARPAVSRSTESFA--LPPDQPERPPQPQA 2913
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 767998433 403 VTPGVLQAVSPGVLSVSRAVPSGVLPAGQMTP----AGQMTPAGVIPgqTATSGVLPTGQmVQSGVLPVGQTAPSRVLPP 478
Cdd:PHA03247 2914 PPPPQPQPQPPPPPQPQPPPPPPPRPQPPLAPttdpAGAGEPSGAVP--QPWLGALVPGR-VAVPRFRVPQPAPSREAPA 2990
|
250
....*....|....
gi 767998433 479 GQTAPLRVISAGQV 492
Cdd:PHA03247 2991 SSTPPLTGHSLSRV 3004
|
|
| SP1-4_arthropods_N |
cd22553 |
N-terminal domain of transcription factor Specificity Protein (SP) 1-4 from arthropods; ... |
362-679 |
1.58e-07 |
|
N-terminal domain of transcription factor Specificity Protein (SP) 1-4 from arthropods; Specificity Proteins (SPs) are transcription factors that are involved in many cellular processes, including cell differentiation, cell growth, apoptosis, immune responses, response to DNA damage, and chromatin remodeling. There are many SPs in vertebrates (9 SPs in humans and mice, 7 SPs in the chicken, and 11 SPs in teleost fish), but arthropods only have 3 SPs. One SP is clade SP1-4, which is expressed ubiquitously throughout development. SP1-4 belongs to a family of proteins, called the SP/Kruppel or Krueppel-like Factor (KLF) family, characterized by a C-terminal DNA-binding domain of 81 amino acids consisting of three Kruppel-like C2H2 zinc fingers. These factors bind to a loose consensus motif, namely NNRCRCCYY (where N is any nucleotide; R is A/G, and Y is C/T), such as the recurring motifs in GC and GT boxes (5'-GGGGCGGGG-3' and 5'-GGTGTGGGG-3') that are present in promoters and more distal regulatory elements of mammalian genes. SP factors preferentially bind GC boxes, while KLFs bind CACCC boxes. Another characteristic hallmark of SP factors is the presence of the Buttonhead (BTD) box CXCPXC, just N-terminal to the zinc fingers. The function of the BTD box is unknown, but it is thought to play an important physiological role. Another feature of most SP factors is the presence of a conserved amino acid stretch, the so-called SP box, located close to the N-terminus. This model represents the N-terminal domain of SP1-4 from arthropods.
Pssm-ID: 411778 [Multi-domain] Cd Length: 384 Bit Score: 55.03 E-value: 1.58e-07
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 767998433 362 SVGTSVLPINQTVRPGVLPLTQPVGPINRPVGPGVLPVSPSVTPGVLQAVSpgVLSVSRAVPSGVLPAGQMTPAG-QMTP 440
Cdd:cd22553 35 ETHDPLILSPPLSQPQQIITAQSSGSAAGGVAYSVSPAVQTVTVDGHEAIF--IPANSGLLQTNNQQAIQLAPGGtQAIL 112
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 767998433 441 AGvipGQTATSGVLPTGQMVQSGVLPV-GQTAPSRV---LPP---GQTAPLRV-ISA--GQVVPSGLLSPNQTVSSSAVV 510
Cdd:cd22553 113 AN---QQTLIRPNTVQGQANASNVLQNiAQIASGGNavqLPLnnmTQTIPVQVpVSTanGQTVYQTIQVPIQAIQSGNAG 189
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 767998433 511 PVNQGVNSGVL-QLSQPvvsgvlpvgqpvrpGVLQLNQTVGTNILPVNQPVRPGASQNTTFL------TSGSILRQLIPT 583
Cdd:cd22553 190 GGNQALQAQVIpQLAQA--------------AQLQPQQLAQVSSQGYIQQIPANASQQQPQMvqqgpnQSGQIIGQVASA 255
|
250 260 270 280 290 300 310 320
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 767998433 584 -GKQVNGIPTYTLApvSVTLPVPPGGLATVAP-PQMPIQLLPSGAAAPMAGSMPGMPSPPVLVNAAQSVFVQASSSAADT 661
Cdd:cd22553 256 sSIQAAAIPLTVYT--GALAGQNGSNQQQVGQiVTSPIQGMTQGLTAPASSSIPTVVQQQAIQGNPLPPGTQIIAAGQQL 333
|
330 340 350
....*....|....*....|....*....|....*.
gi 767998433 662 NQVLKQAKQWK------------------TCPVCNE 679
Cdd:cd22553 334 QQDPNDPTKWQvvadgtpgskkrlrrvacTCPNCRD 369
|
|
| HOX |
smart00389 |
Homeodomain; DNA-binding factors that are involved in the transcriptional regulation of key ... |
1021-1074 |
1.58e-06 |
|
Homeodomain; DNA-binding factors that are involved in the transcriptional regulation of key developmental processes
Pssm-ID: 197696 [Multi-domain] Cd Length: 57 Bit Score: 46.09 E-value: 1.58e-06
10 20 30 40 50
....*....|....*....|....*....|....*....|....*....|....
gi 767998433 1021 PKKYEGRSYEEKKQFLKDYFHKKPYPSKKEIELLSSLFWVWKIDVASFFGKRRY 1074
Cdd:smart00389 1 KRRKRTSFTPEQLEELEKEFQKNPYPSREEREELAKKLGLSERQVKVWFQNRRA 54
|
|
| Soli_cterm |
TIGR03437 |
Solibacter uncharacterized C-terminal domain; This model describes a protein domain found in ... |
443-645 |
4.34e-06 |
|
Solibacter uncharacterized C-terminal domain; This model describes a protein domain found in 90 proteins of Solibacter usitatus Ellin6076, nearly always as the C-terminal domain of a much larger protein. No homologs to this domain are detected outside of S. usitatus, a member of the Acidobacteria.
Pssm-ID: 274578 [Multi-domain] Cd Length: 215 Bit Score: 48.81 E-value: 4.34e-06
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 767998433 443 VIPGQTAT---SGVLPTGQMVQSGVLPVgQTAPSRVLPPGQTAPLRVISAGQV---VPSGLLSPNQTVsssaVVpVNQGV 516
Cdd:TIGR03437 2 VAPGSIVSifgTNLAPATLTAAGGPLPT-SLGGVSVTVNGVAAPLLYVSPGQInaqVPYEVAPGAATV----TV-TYNGG 75
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 767998433 517 NSGVLQLS-QPVVSGVLPVGQ-PVRPGVLQLNQtvGTNILPVNQPVRPGaSQNTTFLTSGSILRQLIPTGKQVNGIPTY- 593
Cdd:TIGR03437 76 ASAAVTVTvAAAAPGIFTLDGsGTGQAAALNNQ--DGSVNSAANPAAPG-DVVVLYATGLGPTSPAVADGAPAPSSPLAp 152
|
170 180 190 200 210 220
....*....|....*....|....*....|....*....|....*....|....*....|.
gi 767998433 594 TLAPVSVTL-----PVPPGGLATVAPPQMPIQL-LPSGAAA---PMAGSMPGMPSPPVLVN 645
Cdd:TIGR03437 153 ALAPVTVTIggvpaTVLYAGLAPGFVGLYQVNVrVPAGLATgavPVVITVGGVTSNAVTIA 213
|
|
| homeodomain |
cd00086 |
Homeodomain; DNA binding domains involved in the transcriptional regulation of key eukaryotic ... |
1022-1078 |
7.71e-06 |
|
Homeodomain; DNA binding domains involved in the transcriptional regulation of key eukaryotic developmental processes; may bind to DNA as monomers or as homo- and/or heterodimers, in a sequence-specific manner.
Pssm-ID: 238039 [Multi-domain] Cd Length: 59 Bit Score: 44.16 E-value: 7.71e-06
10 20 30 40 50
....*....|....*....|....*....|....*....|....*....|....*..
gi 767998433 1022 KKYEGRSYEEKKQFLKDYFHKKPYPSKKEIELLSSLFWVWKIDVASFFGKRRYICMK 1078
Cdd:cd00086 1 RRKRTRFTPEQLEELEKEFEKNPYPSREEREELAKELGLTERQVKIWFQNRRAKLKR 57
|
|
| PPE |
COG5651 |
PPE-repeat protein [Function unknown]; |
340-542 |
3.17e-05 |
|
PPE-repeat protein [Function unknown];
Pssm-ID: 444372 [Multi-domain] Cd Length: 385 Bit Score: 47.58 E-value: 3.17e-05
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 767998433 340 PLHQSVNPPVLPLSQPVGPVNKSVGTSV--LPINQTVRPGVLPLTQPVGPINRPVGPGVLPVSPSVTPGVLQAVSPGVLS 417
Cdd:COG5651 170 PPPTITNPGGLLGAQNAGSGNTSSNPGFanLGLTGLNQVGIGGLNSGSGPIGLNSGPGNTGFAGTGAAAGAAAAAAAAAA 249
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 767998433 418 VSRAVPSGVLPAGQMTPAGQMTPAGVIPGQTATSGVLPTGQMVQSGVLPVGQTAPSRVLPPGQTAPLRVISAGqVVPSGL 497
Cdd:COG5651 250 AAGAGASAALASLAATLLNASSLGLAATAASSAATNLGLAGSPLGLAGGGAGAAAATGLGLGAGGAAGAAGAT-GAGAAL 328
|
170 180 190 200
....*....|....*....|....*....|....*....|....*
gi 767998433 498 LSPNQTVSSSAVVPVNQGVNSGVLQLSQPVVSGVLPVGQPVRPGV 542
Cdd:COG5651 329 GAGAAAAAAGAAAGAGAAAAAAAGGAGGGGGGALGAGGGGGSAGA 373
|
|
| Atrophin-1 |
pfam03154 |
Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian ... |
242-412 |
2.64e-04 |
|
Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian atrophy (DRPLA) gene. DRPLA (OMIM:125370) is a progressive neurodegenerative disorder. It is caused by the expansion of a CAG repeat in the DRPLA gene on chromosome 12p. This results in an extended polyglutamine region in atrophin-1 that is thought to confer toxicity to the protein, possibly through altering its interactions with other proteins. The expansion of a CAG repeat is also the underlying defect in six other neurodegenerative disorders, including Huntington's disease. One interaction of expanded polyglutamine repeats that is thought to be pathogenic is that with the short glutamine repeat in the transcriptional coactivator CREB binding protein, CBP. This interaction draws CBP away from its usual nuclear location to the expanded polyglutamine repeat protein aggregates that are characteristic of the polyglutamine neurodegenerative disorders. This interferes with CBP-mediated transcription and causes cytotoxicity.
Pssm-ID: 460830 [Multi-domain] Cd Length: 991 Bit Score: 45.14 E-value: 2.64e-04
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 767998433 242 THIAPKPAAH-LAAPANGSAPSApaQPPCFHLaLPQNS--PSPAAgQPVTVAQgapgSLTHSPPAAgqshmtlvSSPLPV 318
Cdd:pfam03154 388 SNLPPPPALKpLSSLSTHHPPSA--HPPPLQL-MPQSQqlPPPPA-QPPVLTQ----SQSLPPPAA--------SHPPTS 451
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 767998433 319 GQNSLTLQPPAPQPVFLSHGVPL------HQSVNPPVLPLSQPVGPVNKSVGTSVLPINQTVRPGVLPLTQPVGPINRPV 392
Cdd:pfam03154 452 GLHQVPSQSPFPQHPFVPGGPPPitppsgPPTSTSSAMPGIQPPSSASVSSSGPVPAAVSCPLPPVQIKEEALDEAEEPE 531
|
170 180
....*....|....*....|
gi 767998433 393 GPGVLPVSPSVTPGVLQAVS 412
Cdd:pfam03154 532 SPPPPPRSPSPEPTVVNTPS 551
|
|
| PHA03247 |
PHA03247 |
large tegument protein UL36; Provisional |
246-640 |
7.02e-14 |
|
large tegument protein UL36; Provisional
Pssm-ID: 223021 [Multi-domain] Cd Length: 3151 Bit Score: 76.90 E-value: 7.02e-14
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 767998433 246 PKPAAHLAAPANGSA---PSAPAQP--PCFHLALPQNSPSPAAGQPVTVAQGAPgsltHSPPAAGQSHmtlvSSPLPVGQ 320
Cdd:PHA03247 2571 PRPAPRPSEPAVTSRarrPDAPPQSarPRAPVDDRGDPRGPAPPSPLPPDTHAP----DPPPPSPSPA----ANEPDPHP 2642
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 767998433 321 NSLTLQPPAPQPVFLSHGVPLHQSVNP---PVLPLSQPVGPVNKSVGTSVLPINQTVRPGvlPLTQPVGPINRPVGPGVl 397
Cdd:PHA03247 2643 PPTVPPPERPRDDPAPGRVSRPRRARRlgrAAQASSPPQRPRRRAARPTVGSLTSLADPP--PPPPTPEPAPHALVSAT- 2719
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 767998433 398 PVSPSVTPGVLQAVSPGVLSVSRAVPSG-VLPAGQMTPAGQMTPAGviPGQTATSGVLPTGQMVQSGVLPVGQTAPSRVL 476
Cdd:PHA03247 2720 PLPPGPAAARQASPALPAAPAPPAVPAGpATPGGPARPARPPTTAG--PPAPAPPAAPAAGPPRRLTRPAVASLSESRES 2797
|
250 260 270 280 290 300 310 320
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 767998433 477 PPGQTAPLRViSAGQVVPSGLLSPNQTVSS-----SAVVPVNQGVNSGVLQLSQPVVSGVLPVGQPVRPGVLQlnQTVGT 551
Cdd:PHA03247 2798 LPSPWDPADP-PAAVLAPAAALPPAASPAGplpppTSAQPTAPPPPPGPPPPSLPLGGSVAPGGDVRRRPPSR--SPAAK 2874
|
330 340 350 360 370 380 390 400
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 767998433 552 NILPVNQPVR----PGASQNTTFLTSGSILRQLIPTGK-QVNGIPTYTLAPVSVTLPVPPgglatvAPPQMPIQLLPSGA 626
Cdd:PHA03247 2875 PAAPARPPVRrlarPAVSRSTESFALPPDQPERPPQPQaPPPPQPQPQPPPPPQPQPPPP------PPPRPQPPLAPTTD 2948
|
410
....*....|....
gi 767998433 627 AAPMAGSMPGMPSP 640
Cdd:PHA03247 2949 PAGAGEPSGAVPQP 2962
|
|
| PHA03247 |
PHA03247 |
large tegument protein UL36; Provisional |
245-578 |
9.00e-09 |
|
large tegument protein UL36; Provisional
Pssm-ID: 223021 [Multi-domain] Cd Length: 3151 Bit Score: 59.95 E-value: 9.00e-09
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 767998433 245 APKPAAHLAAPANGSAPSAPAQPPcfhlALPQNSPSPAAGQPVTVAQGAPGS-LTHSPPAAGQSHMTLVSsPLPVGQNSL 323
Cdd:PHA03247 2688 ARPTVGSLTSLADPPPPPPTPEPA----PHALVSATPLPPGPAAARQASPALpAAPAPPAVPAGPATPGG-PARPARPPT 2762
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 767998433 324 TLQPPAPQPVFLSHGVPLHQSVNPPVLPLSQPVGPVNKSVGTSVLPINQTVRPGVLPLTQ-PVGPINRPvgPGVLPVSPS 402
Cdd:PHA03247 2763 TAGPPAPAPPAAPAAGPPRRLTRPAVASLSESRESLPSPWDPADPPAAVLAPAAALPPAAsPAGPLPPP--TSAQPTAPP 2840
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 767998433 403 VTPGVLQAvspgvlsvSRAVPSGVLPAGqmtPAGQMTPAGVIPGQTATSGVLPTGQMVQSgvlPVGQTAPSRVLPPGQTA 482
Cdd:PHA03247 2841 PPPGPPPP--------SLPLGGSVAPGG---DVRRRPPSRSPAAKPAAPARPPVRRLARP---AVSRSTESFALPPDQPE 2906
|
250 260 270 280 290 300 310 320
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 767998433 483 PLRVISAGQvvPSGLLSPNQTVSSSAVVPVNQGVNSGVLQ-----LSQPVVSGVLPVGQ--PVRPGVLQLNQTvgtnILP 555
Cdd:PHA03247 2907 RPPQPQAPP--PPQPQPQPPPPPQPQPPPPPPPRPQPPLApttdpAGAGEPSGAVPQPWlgALVPGRVAVPRF----RVP 2980
|
330 340
....*....|....*....|...
gi 767998433 556 VNQPVRPGASQNTTFLTSGSILR 578
Cdd:PHA03247 2981 QPAPSREAPASSTPPLTGHSLSR 3003
|
|
| PHA03379 |
PHA03379 |
EBNA-3A; Provisional |
233-564 |
1.05e-07 |
|
EBNA-3A; Provisional
Pssm-ID: 223066 [Multi-domain] Cd Length: 935 Bit Score: 56.22 E-value: 1.05e-07
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 767998433 233 IKRTGllKQTHIAPKPAAHLAAPANGSaPSAPAQPPcfHLALPQNSPSPAAGQPVTVAQGAPGSLTHSPPAAGQSHMtlv 312
Cdd:PHA03379 391 LMRAG--KLTERAREALEKASEPTYGT-PRPPVEKP--RPEVPQSLETATSHGSAQVPEPPPVHDLEPGPLHDQHSM--- 462
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 767998433 313 sSPLPVGQNsltlqPPAP----QPVFLSHGVPlhQSVNPPVLPLSQPVGPVNKSVGTSVLPINQTVRPGVLPLTQPVGPI 388
Cdd:PHA03379 463 -APCPVAQL-----PPGPlqdlEPGDQLPGVV--QDGRPACAPVPAPAGPIVRPWEASLSQVPGVAFAPVMPQPMPVEPV 534
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 767998433 389 NRPVGPGVLPVSPSVTPGVLQAvsPGVLSVSRAV----------PSGVLPAGQMT---------PAGQMTPAGV------ 443
Cdd:PHA03379 535 PVPTVALERPVCPAPPLIAMQG--PGETSGIVRVrerwrpapwtPNPPRSPSQMSvrdrlarlrAEAQPYQASVevqppq 612
|
250 260 270 280 290 300 310 320
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 767998433 444 ---IPGQTATSGVL-PTGQM------------VQSGVLPVGQtAPSRVLPPGQ-------TAPLRViSAGQVVPSGLLSP 500
Cdd:PHA03379 613 ltqVSPQQPMEYPLePEQQMfpgspfsqvadvMRAGGVPAMQ-PQYFDLPLQQpisqgapLAPLRA-SMGPVPPVPATQP 690
|
330 340 350 360 370 380 390
....*....|....*....|....*....|....*....|....*....|....*....|....*....|...
gi 767998433 501 nQTVSSSAVVPVNQGVNSGVLQLSQPVVSGVLP---------VGQPVRPGVLQlNQTVGtniLPVNQPVRPGA 564
Cdd:PHA03379 691 -QYFDIPLTEPINQGASAAHFLPQQPMEGPLVPerwmfqgatLSQSVRPGVAQ-SQYFD---LPLTQPINHGA 758
|
|
| Atrophin-1 |
pfam03154 |
Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian ... |
279-642 |
1.91e-07 |
|
Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian atrophy (DRPLA) gene. DRPLA OMIM:125370 is a progressive neurodegenerative disorder. It is caused by the expansion of a CAG repeat in the DRPLA gene on chromosome 12p. This results in an extended polyglutamine region in atrophin-1, which is thought to confer toxicity to the protein, possibly through altering its interactions with other proteins. The expansion of a CAG repeat is also the underlying defect in six other neurodegenerative disorders, including Huntington's disease. One interaction of expanded polyglutamine repeats that is thought to be pathogenic is that with the short glutamine repeat in the transcriptional coactivator CREB binding protein, CBP. This interaction draws CBP away from its usual nuclear location to the expanded polyglutamine repeat protein aggregates that are characteristic of the polyglutamine neurodegenerative disorders. This interferes with CBP-mediated transcription and causes cytotoxicity.
Pssm-ID: 460830 [Multi-domain] Cd Length: 991 Bit Score: 55.54 E-value: 1.91e-07
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 767998433 279 PSPAAGQPVTVAQGAPGSLTHSPPAAGQSHMTLVSSPLPVGQNSlTLQPPAPQPVFLSHGVPLHQSVNPPVLPLSQPVGP 358
Cdd:pfam03154 149 PSPQDNESDSDSSAQQQILQTQPPVLQAQSGAASPPSPPPPGTT-QAATAGPTPSAPSVPPQGSPATSQPPNQTQSTAAP 227
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 767998433 359 VNKSVGTSVLPINQ--TVRPGVLPLTQPVGPINRPVGPGVLPVSPSVTPGVLQAVSPGVLSVSRAVPSGVLPAGQMTPAG 436
Cdd:pfam03154 228 HTLIQQTPTLHPQRlpSPHPPLQPMTQPPPPSQVSPQPLPQPSLHGQMPPMPHSLQTGPSHMQHPVPPQPFPLTPQSSQS 307
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 767998433 437 QM--TPAGVIPGQTATSGVLPTGQ-MVQSGVLPVGQTAPSRVLP-----PGQTAPLRVISAGQ--------VVPSGLLSP 500
Cdd:pfam03154 308 QVppGPSPAAPGQSQQRIHTPPSQsQLQSQQPPREQPLPPAPLSmphikPPPTTPIPQLPNPQshkhpphlSGPSPFQMN 387
|
250 260 270 280 290 300 310 320
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 767998433 501 NQTVSSSAVVPVNQGVNSGVLQLSQPVVSgVLPVGQPVRPGVLQlnqtvgTNILPVNQPVRPGASQNTTFLTSGSILRQl 580
Cdd:pfam03154 388 SNLPPPPALKPLSSLSTHHPPSAHPPPLQ-LMPQSQQLPPPPAQ------PPVLTQSQSLPPPAASHPPTSGLHQVPSQ- 459
|
330 340 350 360 370 380
....*....|....*....|....*....|....*....|....*....|....*....|..
gi 767998433 581 iptgkqvNGIPTYTLAPVSVTLPVPPGGLATVAPPQMPIQLLPSGAAAPMAGSMPGMPSPPV 642
Cdd:pfam03154 460 -------SPFPQHPFVPGGPPPITPPSGPPTSTSSAMPGIQPPSSASVSSSGPVPAAVSCPL 514
|
|
| HOX |
smart00389 |
Homeodomain; DNA-binding factors that are involved in the transcriptional regulation of key ... |
1021-1074 |
1.58e-06 |
|
Homeodomain; DNA-binding factors that are involved in the transcriptional regulation of key developmental processes
Pssm-ID: 197696 [Multi-domain] Cd Length: 57 Bit Score: 46.09 E-value: 1.58e-06
10 20 30 40 50
....*....|....*....|....*....|....*....|....*....|....
gi 767998433 1021 PKKYEGRSYEEKKQFLKDYFHKKPYPSKKEIELLSSLFWVWKIDVASFFGKRRY 1074
Cdd:smart00389 1 KRRKRTSFTPEQLEELEKEFQKNPYPSREEREELAKKLGLSERQVKVWFQNRRA 54
|
|
| Atrophin-1 |
pfam03154 |
Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian ... |
241-658 |
1.65e-06 |
|
Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian atrophy (DRPLA) gene. DRPLA OMIM:125370 is a progressive neurodegenerative disorder. It is caused by the expansion of a CAG repeat in the DRPLA gene on chromosome 12p. This results in an extended polyglutamine region in atrophin-1, which is thought to confer toxicity to the protein, possibly through altering its interactions with other proteins. The expansion of a CAG repeat is also the underlying defect in six other neurodegenerative disorders, including Huntington's disease. One interaction of expanded polyglutamine repeats that is thought to be pathogenic is that with the short glutamine repeat in the transcriptional coactivator CREB binding protein, CBP. This interaction draws CBP away from its usual nuclear location to the expanded polyglutamine repeat protein aggregates that are characteristic of the polyglutamine neurodegenerative disorders. This interferes with CBP-mediated transcription and causes cytotoxicity.
Pssm-ID: 460830 [Multi-domain] Cd Length: 991 Bit Score: 52.46 E-value: 1.65e-06
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 767998433 241 QTHIAPKPAAHLAAPANGSAPSaPAQPPCFHLALPQNSPSPAAGQPVTVAQGAPGSLTHSPPaagQSHMTLVSSPLPVGQ 320
Cdd:pfam03154 175 QAQSGAASPPSPPPPGTTQAAT-AGPTPSAPSVPPQGSPATSQPPNQTQSTAAPHTLIQQTP---TLHPQRLPSPHPPLQ 250
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 767998433 321 nSLTLQPPAPQPVFLSHGVPLHQSVNPPvLPLSQPVGP--VNKSVGTSVLPI-NQTVRPGVLPLTQPVGPI---NRPVGP 394
Cdd:pfam03154 251 -PMTQPPPPSQVSPQPLPQPSLHGQMPP-MPHSLQTGPshMQHPVPPQPFPLtPQSSQSQVPPGPSPAAPGqsqQRIHTP 328
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 767998433 395 GVLPVSPSVTPGVLQAVSPGVLSVSRAVPSGVLPAGQM-TPAGQMTPAGVipgqtatSGVLPTgQMvqsgvlpvgqtaPS 473
Cdd:pfam03154 329 PSQSQLQSQQPPREQPLPPAPLSMPHIKPPPTTPIPQLpNPQSHKHPPHL-------SGPSPF-QM------------NS 388
|
250 260 270 280 290 300 310 320
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 767998433 474 RVLPPGQTAPLRVISAGQVvPSGLLSPnqtvsssavvpvnqgvnsgvLQLsqpvvsgvLPVGQPVRPgvlqlnqtvgtni 553
Cdd:pfam03154 389 NLPPPPALKPLSSLSTHHP-PSAHPPP--------------------LQL--------MPQSQQLPP------------- 426
|
330 340 350 360 370 380 390 400
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 767998433 554 lPVNQPvrPGASQNTTFLTSGSilrqliptgkqvNGIPTYTLAPVSVTLPVPPGGLATVAPPQMpiqLLPSGAAAPMAGS 633
Cdd:pfam03154 427 -PPAQP--PVLTQSQSLPPPAA------------SHPPTSGLHQVPSQSPFPQHPFVPGGPPPI---TPPSGPPTSTSSA 488
|
410 420
....*....|....*....|....*
gi 767998433 634 MPGMpSPPVLVNAAQSVFVQASSSA 658
Cdd:pfam03154 489 MPGI-QPPSSASVSSSGPVPAAVSC 512
|
|
| PHA03247 |
PHA03247 |
large tegument protein UL36; Provisional |
255-659 |
2.87e-06 |
|
large tegument protein UL36; Provisional
Pssm-ID: 223021 [Multi-domain] Cd Length: 3151 Bit Score: 51.86 E-value: 2.87e-06
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 767998433 255 PANGSAPSAPAQPPCfhlalPQNSPSPAAGQPVTVAQGAPGSLTHSPPA-AGQSHMTLVSSPLPVGQNSLTLQPPAPQPV 333
Cdd:PHA03247 2483 PAEARFPFAAGAAPD-----PGGGGPPDPDAPPAPSRLAPAILPDEPVGePVHPRMLTWIRGLEELASDDAGDPPPPLPP 2557
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 767998433 334 FLSHGVPlHQSVnPPVLPLSQPVGPVNKSvgtsvlpinQTVRPGVLPltQPvgpiNRPVGPGVLPVSPsvtpgvlqavsP 413
Cdd:PHA03247 2558 AAPPAAP-DRSV-PPPRPAPRPSEPAVTS---------RARRPDAPP--QS----ARPRAPVDDRGDP-----------R 2609
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 767998433 414 GVLSVSRAVPSGVLPAgqmTPAGQMTPAGVIPGQTATSGVLPTGQmvqsgvlPVGQTAPSRVLPPgqtapLRVISAGQvv 493
Cdd:PHA03247 2610 GPAPPSPLPPDTHAPD---PPPPSPSPAANEPDPHPPPTVPPPER-------PRDDPAPGRVSRP-----RRARRLGR-- 2672
|
250 260 270 280 290 300 310 320
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 767998433 494 PSGLLSPNQTVSSSAVVPVNQGVNSgvlqLSQPVVSGVLPVGQPvRPGVLQLNQTVGTNILPVNQPVRPGASqnttfLTS 573
Cdd:PHA03247 2673 AAQASSPPQRPRRRAARPTVGSLTS----LADPPPPPPTPEPAP-HALVSATPLPPGPAAARQASPALPAAP-----APP 2742
|
330 340 350 360 370 380 390 400
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 767998433 574 GSILRQLIPTGKQVNGIPTYTLAPVSvtlPVPPGGLATVAPPQMPIQLLPSGAAAPMAGSMPGMPSPPVLVNAAQSVFVQ 653
Cdd:PHA03247 2743 AVPAGPATPGGPARPARPPTTAGPPA---PAPPAAPAAGPPRRLTRPAVASLSESRESLPSPWDPADPPAAVLAPAAALP 2819
|
....*.
gi 767998433 654 ASSSAA 659
Cdd:PHA03247 2820 PAASPA 2825
|
|
| Soli_cterm |
TIGR03437 |
Solibacter uncharacterized C-terminal domain; This model describes a protein domain found in ... |
443-645 |
4.34e-06 |
|
Solibacter uncharacterized C-terminal domain; This model describes a protein domain found in 90 proteins of Solibacter usitatus Ellin6076, nearly always as the C-terminal domain of a much larger protein. No homologs to this domain are detected outside of S. usitatus, a member of the Acidobacteria.
Pssm-ID: 274578 [Multi-domain] Cd Length: 215 Bit Score: 48.81 E-value: 4.34e-06
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 767998433 443 VIPGQTAT---SGVLPTGQMVQSGVLPVgQTAPSRVLPPGQTAPLRVISAGQV---VPSGLLSPNQTVsssaVVpVNQGV 516
Cdd:TIGR03437 2 VAPGSIVSifgTNLAPATLTAAGGPLPT-SLGGVSVTVNGVAAPLLYVSPGQInaqVPYEVAPGAATV----TV-TYNGG 75
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 767998433 517 NSGVLQLS-QPVVSGVLPVGQ-PVRPGVLQLNQtvGTNILPVNQPVRPGaSQNTTFLTSGSILRQLIPTGKQVNGIPTY- 593
Cdd:TIGR03437 76 ASAAVTVTvAAAAPGIFTLDGsGTGQAAALNNQ--DGSVNSAANPAAPG-DVVVLYATGLGPTSPAVADGAPAPSSPLAp 152
|
170 180 190 200 210 220
....*....|....*....|....*....|....*....|....*....|....*....|.
gi 767998433 594 TLAPVSVTL-----PVPPGGLATVAPPQMPIQL-LPSGAAA---PMAGSMPGMPSPPVLVN 645
Cdd:TIGR03437 153 ALAPVTVTIggvpaTVLYAGLAPGFVGLYQVNVrVPAGLATgavPVVITVGGVTSNAVTIA 213
|
|
| homeodomain |
cd00086 |
Homeodomain; DNA binding domains involved in the transcriptional regulation of key eukaryotic ... |
1022-1078 |
7.71e-06 |
|
Homeodomain; DNA binding domains involved in the transcriptional regulation of key eukaryotic developmental processes; may bind to DNA as monomers or as homo- and/or heterodimers, in a sequence-specific manner.
Pssm-ID: 238039 [Multi-domain] Cd Length: 59 Bit Score: 44.16 E-value: 7.71e-06
10 20 30 40 50
....*....|....*....|....*....|....*....|....*....|....*..
gi 767998433 1022 KKYEGRSYEEKKQFLKDYFHKKPYPSKKEIELLSSLFWVWKIDVASFFGKRRYICMK 1078
Cdd:cd00086 1 RRKRTRFTPEQLEELEKEFEKNPYPSREEREELAKELGLTERQVKIWFQNRRAKLKR 57
|
|
| DUF4813 |
pfam16072 |
Domain of unknown function (DUF4813); This family of proteins is functionally uncharacterized. ... |
423-659 |
1.13e-05 |
|
Domain of unknown function (DUF4813); This family of proteins is functionally uncharacterized. This family of proteins is found in eukaryotes. Proteins in this family are typically between 345 and 672 amino acids in length.
Pssm-ID: 435117 [Multi-domain] Cd Length: 288 Bit Score: 48.60 E-value: 1.13e-05
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 767998433 423 PSGVLPAGqmtpaGQMTPAGVIPGqtaTSGVLPTGqmvqsGVlPVGQT----APSRVLPPGQTaplrVISAGQVVPSGLL 498
Cdd:pfam16072 13 PGGYAPAG-----ATYHPAGQVPA---GATYYPSG-----GV-PHGATyypqAPVAAVPAGAT----YLPAGAAIPAGAT 74
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 767998433 499 SPNQTVSSSAVVPVNQGVNSG---------VLQLSQPVVSGVLPVGQPVRPGVLQLNQTVGTNILPVNQPvrPGASQNTT 569
Cdd:pfam16072 75 YYPQAPKSSSGLGLGTGLIAGalggailghALTPTQTRVVEHAPSSGGGGGGGGYSNGNNEDKIIIINNG--PPGSVTTT 152
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 767998433 570 FLTSGSilrQLIPTGKQVNGiptytlAPVSVTLPVPPGGLATVAPPQMPIQlLPSGAAAPMAGSMPGMPSPPVLVNAAQS 649
Cdd:pfam16072 153 SAGSGT---TVINAGGQQPA------APAAPAYPVAPAAYPAQAPAAAPAP-APGAPQTPLAPLNPVAAAPAAAAGAAAA 222
|
250
....*....|
gi 767998433 650 VFVQASSSAA 659
Cdd:pfam16072 223 PVVAAAAPAA 232
|
|
| PHA03247 |
PHA03247 |
large tegument protein UL36; Provisional |
244-412 |
1.44e-05 |
|
large tegument protein UL36; Provisional
Pssm-ID: 223021 [Multi-domain] Cd Length: 3151 Bit Score: 49.55 E-value: 1.44e-05
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 767998433 244 IAPKPAAHLAAPANGSAPSAPAQPPCFHLA--------LPQNSP--SPAAGQPVTVAQGAPGSLTHSPPAAGQSHMTLVS 313
Cdd:PHA03247 2828 LPPPTSAQPTAPPPPPGPPPPSLPLGGSVApggdvrrrPPSRSPaaKPAAPARPPVRRLARPAVSRSTESFALPPDQPER 2907
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 767998433 314 SPLPVGQNSLTLQPPAPQPVFLSHGVPLHQSVNPPVLPLSQPvGPVNKSVGTSVLPINQTVRPGVLPLTQPVGPINRPVG 393
Cdd:PHA03247 2908 PPQPQAPPPPQPQPQPPPPPQPQPPPPPPPRPQPPLAPTTDP-AGAGEPSGAVPQPWLGALVPGRVAVPRFRVPQPAPSR 2986
|
170
....*....|....*....
gi 767998433 394 PGVLPVSPSVTPGVLQAVS 412
Cdd:PHA03247 2987 EAPASSTPPLTGHSLSRVS 3005
|
|
| SP1-4_arthropods_N |
cd22553 |
N-terminal domain of transcription factor Specificity Protein (SP) 1-4 from arthropods; ... |
238-579 |
1.57e-05 |
|
N-terminal domain of transcription factor Specificity Protein (SP) 1-4 from arthropods; Specificity Proteins (SPs) are transcription factors that are involved in many cellular processes, including cell differentiation, cell growth, apoptosis, immune responses, response to DNA damage, and chromatin remodeling. There are many SPs in vertebrates (9 SPs in humans and mice, 7 SPs in the chicken, and 11 SPs in teleost fish), but arthropods only have 3 SPs. One SP is clade SP1-4, which is expressed ubiquitously throughout development. SP1-4 belongs to a family of proteins, called the SP/Kruppel or Krueppel-like Factor (KLF) family, characterized by a C-terminal DNA-binding domain of 81 amino acids consisting of three Kruppel-like C2H2 zinc fingers. These factors bind to a loose consensus motif, namely NNRCRCCYY (where N is any nucleotide; R is A/G, and Y is C/T), such as the recurring motifs in GC and GT boxes (5'-GGGGCGGGG-3' and 5'-GGTGTGGGG-3') that are present in promoters and more distal regulatory elements of mammalian genes. SP factors preferentially bind GC boxes, while KLFs bind CACCC boxes. Another characteristic hallmark of SP factors is the presence of the Buttonhead (BTD) box CXCPXC, just N-terminal to the zinc fingers. The function of the BTD box is unknown, but it is thought to play an important physiological role. Another feature of most SP factors is the presence of a conserved amino acid stretch, the so-called SP box, located close to the N-terminus. This model represents the N-terminal domain of SP1-4 from arthropods.
Pssm-ID: 411778 [Multi-domain] Cd Length: 384 Bit Score: 48.48 E-value: 1.57e-05
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 767998433 238 LLKQTHIAPKPAAHLAAPANGSAPSAPAQPPCFHLALPQNSPSPAAGQP--VTVAQ---GAPGSLTHSPPAAGQShMTL- 311
Cdd:cd22553 1 FNQSQQVAPSELAQVATTASNIGGQQKQAQSDSSETHDPLILSPPLSQPqqIITAQssgSAAGGVAYSVSPAVQT-VTVd 79
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 767998433 312 ----VSSPLPVGQNSLTLQPPAPQPVFLSHGVPLHQSVnppvlpLSQPVGPVNKSVGTSVLPINQTVRPGVLPLTQPVGP 387
Cdd:cd22553 80 gheaIFIPANSGLLQTNNQQAIQLAPGGTQAILANQQT------LIRPNTVQGQANASNVLQNIAQIASGGNAVQLPLNN 153
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 767998433 388 INRPVgPGVLPVSPSVTPGVLQAVSPGVLSVSRAVPSGVLPAGQMTPAGQMTPAGVI-PGQTATsgvlPTGQMVQSGVLP 466
Cdd:cd22553 154 MTQTI-PVQVPVSTANGQTVYQTIQVPIQAIQSGNAGGGNQALQAQVIPQLAQAAQLqPQQLAQ----VSSQGYIQQIPA 228
|
250 260 270 280 290 300 310 320
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 767998433 467 VGQTAPSRVLPPGQTaplrviSAGQVVPSGL-LSPNQTVSSSAVVPVNQGVNSGVLQLSQPVVSGVLPVGQPVRPGVLQL 545
Cdd:cd22553 229 NASQQQPQMVQQGPN------QSGQIIGQVAsASSIQAAAIPLTVYTGALAGQNGSNQQQVGQIVTSPIQGMTQGLTAPA 302
|
330 340 350
....*....|....*....|....*....|....
gi 767998433 546 NQTVGTNILPvNQPVRPGASQNTTFLTSGSILRQ 579
Cdd:cd22553 303 SSSIPTVVQQ-QAIQGNPLPPGTQIIAAGQQLQQ 335
|
|
| PPE |
COG5651 |
PPE-repeat protein [Function unknown]; |
340-542 |
3.17e-05 |
|
PPE-repeat protein [Function unknown];
Pssm-ID: 444372 [Multi-domain] Cd Length: 385 Bit Score: 47.58 E-value: 3.17e-05
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 767998433 340 PLHQSVNPPVLPLSQPVGPVNKSVGTSV--LPINQTVRPGVLPLTQPVGPINRPVGPGVLPVSPSVTPGVLQAVSPGVLS 417
Cdd:COG5651 170 PPPTITNPGGLLGAQNAGSGNTSSNPGFanLGLTGLNQVGIGGLNSGSGPIGLNSGPGNTGFAGTGAAAGAAAAAAAAAA 249
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 767998433 418 VSRAVPSGVLPAGQMTPAGQMTPAGVIPGQTATSGVLPTGQMVQSGVLPVGQTAPSRVLPPGQTAPLRVISAGqVVPSGL 497
Cdd:COG5651 250 AAGAGASAALASLAATLLNASSLGLAATAASSAATNLGLAGSPLGLAGGGAGAAAATGLGLGAGGAAGAAGAT-GAGAAL 328
|
170 180 190 200
....*....|....*....|....*....|....*....|....*
gi 767998433 498 LSPNQTVSSSAVVPVNQGVNSGVLQLSQPVVSGVLPVGQPVRPGV 542
Cdd:COG5651 329 GAGAAAAAAGAAAGAGAAAAAAAGGAGGGGGGALGAGGGGGSAGA 373
|
|
| PHA02682 |
PHA02682 |
ORF080 virion core protein; Provisional |
246-353 |
4.72e-05 |
|
ORF080 virion core protein; Provisional
Pssm-ID: 177464 [Multi-domain] Cd Length: 280 Bit Score: 46.39 E-value: 4.72e-05
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 767998433 246 PKPAAHLAAPANGSAPSAPAQPPCFHLALPQNSPSPAAGQPvtvAQGAPGSLTHSPPAagqshmtlvsSPLPvgqnslTL 325
Cdd:PHA02682 96 PACAPAAPAPAVTCPAPAPACPPATAPTCPPPAVCPAPARP---APACPPSTRQCPPA----------PPLP------TP 156
|
90 100
....*....|....*....|....*....
gi 767998433 326 QP-PAPQPVFlshgvpLHQSVNPPVLPLS 353
Cdd:PHA02682 157 KPaPAAKPIF------LHNQLPPPDYPAA 179
|
|
| PRK10263 |
PRK10263 |
DNA translocase FtsK; Provisional |
246-478 |
9.91e-05 |
|
DNA translocase FtsK; Provisional
Pssm-ID: 236669 [Multi-domain] Cd Length: 1355 Bit Score: 46.62 E-value: 9.91e-05
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 767998433 246 PKPAAHLAAPAngSAPSAPAQPPCFHLALPQNSPSPAAGQPVTVAQGAPGSLTHSPPAAGQ---SHMTLVSSPLPVGQNS 322
Cdd:PRK10263 362 PVPGPQTGEPV--IAPAPEGYPQQSQYAQPAVQYNEPLQQPVQPQQPYYAPAAEQPAQQPYyapAPEQPAQQPYYAPAPE 439
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 767998433 323 LTL-----QPPAPQPVFLSHgvPLHQSVNPPVLPLSQPVG-----PVNKSVGTSVLPINQTVRPGVLPL----------- 381
Cdd:PRK10263 440 QPVagnawQAEEQQSTFAPQ--STYQTEQTYQQPAAQEPLyqqpqPVEQQPVVEPEPVVEETKPARPPLyyfeeveekra 517
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 767998433 382 ---------TQPV-GPI--NRPVGPGVLPVSPSVTPGVLQAvsPGVLSVSRAVPSGVLPAGqmTPAGQMTPAGvipgqTA 449
Cdd:PRK10263 518 rereqlaawYQPIpEPVkePEPIKSSLKAPSVAAVPPVEAA--AAVSPLASGVKKATLATG--AAATVAAPVF-----SL 588
|
250 260
....*....|....*....|....*....
gi 767998433 450 TSGVLPTGQmVQSGVLPvGQTAPSRVLPP 478
Cdd:PRK10263 589 ANSGGPRPQ-VKEGIGP-QLPRPKRIRVP 615
|
|
| Atrophin-1 |
pfam03154 |
Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian ... |
242-412 |
2.64e-04 |
|
Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian atrophy (DRPLA) gene. DRPLA OMIM:125370 is a progressive neurodegenerative disorder. It is caused by the expansion of a CAG repeat in the DRPLA gene on chromosome 12p. This results in an extended polyglutamine region in atrophin-1, which is thought to confer toxicity to the protein, possibly through altering its interactions with other proteins. The expansion of a CAG repeat is also the underlying defect in six other neurodegenerative disorders, including Huntington's disease. One interaction of expanded polyglutamine repeats that is thought to be pathogenic is that with the short glutamine repeat in the transcriptional coactivator CREB binding protein, CBP. This interaction draws CBP away from its usual nuclear location to the expanded polyglutamine repeat protein aggregates that are characteristic of the polyglutamine neurodegenerative disorders. This interferes with CBP-mediated transcription and causes cytotoxicity.
Pssm-ID: 460830 [Multi-domain] Cd Length: 991 Bit Score: 45.14 E-value: 2.64e-04
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 767998433 242 THIAPKPAAH-LAAPANGSAPSApaQPPCFHLaLPQNS--PSPAAgQPVTVAQgapgSLTHSPPAAgqshmtlvSSPLPV 318
Cdd:pfam03154 388 SNLPPPPALKpLSSLSTHHPPSA--HPPPLQL-MPQSQqlPPPPA-QPPVLTQ----SQSLPPPAA--------SHPPTS 451
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 767998433 319 GQNSLTLQPPAPQPVFLSHGVPL------HQSVNPPVLPLSQPVGPVNKSVGTSVLPINQTVRPGVLPLTQPVGPINRPV 392
Cdd:pfam03154 452 GLHQVPSQSPFPQHPFVPGGPPPitppsgPPTSTSSAMPGIQPPSSASVSSSGPVPAAVSCPLPPVQIKEEALDEAEEPE 531
|
170 180
....*....|....*....|
gi 767998433 393 GPGVLPVSPSVTPGVLQAVS 412
Cdd:pfam03154 532 SPPPPPRSPSPEPTVVNTPS 551
|
|
| PAT1 |
pfam09770 |
Topoisomerase II-associated protein PAT1; Members of this family are necessary for accurate ... |
245-391 |
2.66e-04 |
|
Topoisomerase II-associated protein PAT1; Members of this family are necessary for accurate chromosome transmission during cell division.
Pssm-ID: 401645 [Multi-domain] Cd Length: 846 Bit Score: 45.03 E-value: 2.66e-04
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 767998433 245 APKPAAhlaaPANGSAPSAPAQPPCFHLALPQNSPSPAAGQ--PVTVAQGAPGSLTHSPPAA----GQSHMTLVSSPLPV 318
Cdd:pfam09770 207 AKKPAQ----QPAPAPAQPPAAPPAQQAQQQQQFPPQIQQQqqPQQQPQQPQQHPGQGHPVTilqrPQSPQPDPAQPSIQ 282
|
90 100 110 120 130 140 150
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....
gi 767998433 319 GQNSLTLQPPAPQPVflshgVPLHQSVNPPVLPLSQPVGPVNKSVGTSVLPINQTVR-PGVLPltQPVGPINRP 391
Cdd:pfam09770 283 PQAQQFHQQPPPVPV-----QPTQILQNPNRLSAARVGYPQNPQPGVQPAPAHQAHRqQGSFG--RQAPIITHP 349
|
|
| PHA03247 |
PHA03247 |
large tegument protein UL36; Provisional |
248-643 |
3.94e-04 |
|
large tegument protein UL36; Provisional
Pssm-ID: 223021 [Multi-domain] Cd Length: 3151 Bit Score: 44.93 E-value: 3.94e-04
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 767998433 248 PAAHLAAP-ANGSAPSAPAQPPCFHLALPQNSPSPAAGQPVTV----------------AQGAPGSLTHS--PPAAGQSH 308
Cdd:PHA03247 2489 PFAAGAAPdPGGGGPPDPDAPPAPSRLAPAILPDEPVGEPVHPrmltwirgleelasddAGDPPPPLPPAapPAAPDRSV 2568
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 767998433 309 MTLVSSPLPVGqnsltlqpPAPQPVFLSHGVPLHQsvNPPVLPLSQPVGPVNKSVGTSVLPINQTVRPgvlPLTQPVGPI 388
Cdd:PHA03247 2569 PPPRPAPRPSE--------PAVTSRARRPDAPPQS--ARPRAPVDDRGDPRGPAPPSPLPPDTHAPDP---PPPSPSPAA 2635
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 767998433 389 NRPVGPGVLPVSPSVTPGvlQAVSPGVLSVSRAVPSGVLPAGQMTPAGQMTPAGVIPGqtatsgVLPTGQMVQSGVLPVG 468
Cdd:PHA03247 2636 NEPDPHPPPTVPPPERPR--DDPAPGRVSRPRRARRLGRAAQASSPPQRPRRRAARPT------VGSLTSLADPPPPPPT 2707
|
250 260 270 280 290 300 310 320
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 767998433 469 QTAPSRVLPPGQTAPLRVISAGQVVPSGLLSPnqtvsssavvpvnqgvnsgvlqLSQPVVSGVLPVGQPVRPGVLQLNQT 548
Cdd:PHA03247 2708 PEPAPHALVSATPLPPGPAAARQASPALPAAP----------------------APPAVPAGPATPGGPARPARPPTTAG 2765
|
330 340 350 360 370 380 390 400
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 767998433 549 VGTNILPVNQPVRPGASQNTTFLTSGSILRQLIPTGKQVngiptytlAPVSVTLPVPPGGLATVAPPQMPiqLLPSGAAA 628
Cdd:PHA03247 2766 PPAPAPPAAPAAGPPRRLTRPAVASLSESRESLPSPWDP--------ADPPAAVLAPAAALPPAASPAGP--LPPPTSAQ 2835
|
410
....*....|....*
gi 767998433 629 PMAGSMPGMPSPPVL 643
Cdd:PHA03247 2836 PTAPPPPPGPPPPSL 2850
|
|
| PRK12323 |
PRK12323 |
DNA polymerase III subunit gamma/tau; |
241-419 |
7.34e-04 |
|
DNA polymerase III subunit gamma/tau;
Pssm-ID: 237057 [Multi-domain] Cd Length: 700 Bit Score: 43.71 E-value: 7.34e-04
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 767998433 241 QTHIAPKPAAHLAAPANGSAPSAPAQPPCFHLALPqnSPSPAAGQPVTVAQGAPGSLTHSPPAAgqshmtlvSSPLPVGQ 320
Cdd:PRK12323 392 PAAAAPAPAAPPAAPAAAPAAAAAARAVAAAPARR--SPAPEALAAARQASARGPGGAPAPAPA--------PAAAPAAA 461
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 767998433 321 NSLTLQPPAPQPVFLSHGVPLHQSVNPPV-----------LPLSQPV-GPVNKSVGTSVLPINQTVRPGVLPL-----TQ 383
Cdd:PRK12323 462 ARPAAAGPRPVAAAAAAAPARAAPAAAPApadddpppweeLPPEFASpAPAQPDAAPAGWVAESIPDPATADPddafeTL 541
|
170 180 190
....*....|....*....|....*....|....*.
gi 767998433 384 PVGPINRPVGPGVLPVSPSVTPGVLQAVSPGVLSVS 419
Cdd:PRK12323 542 APAPAAAPAPRAAAATEPVVAPRPPRASASGLPDMF 577
|
|
| Med15 |
pfam09606 |
ARC105 or Med15 subunit of Mediator complex non-fungal; The approx. 70 residue Med15 domain of ... |
291-689 |
8.68e-04 |
|
ARC105 or Med15 subunit of Mediator complex non-fungal; The approx. 70 residue Med15 domain of the ARC-Mediator co-activator is a three-helix bundle with marked similarity to the KIX domain. The sterol regulatory element binding protein (SREBP) family of transcription activators use the ARC105 subunit to activate target genes in the regulation of cholesterol and fatty acid homeostasis. In addition, Med15 is a critical transducer of gene activation signals that control early metazoan development.
Pssm-ID: 312941 [Multi-domain] Cd Length: 732 Bit Score: 43.46 E-value: 8.68e-04
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 767998433 291 QGAPGSLTHSPPAAGQSHMtlVSSPLPVGQN--SLTLQPPAPQPVFLSHGVPLHQSVNPPVL--PLSQPVGPVNKSVGTS 366
Cdd:pfam09606 60 QQQPQGGQGNGGMGGGQQG--MPDPINALQNlaGQGTRPQMMGPMGPGPGGPMGQQMGGPGTasNLLASLGRPQMPMGGA 137
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 767998433 367 VLPINQTvrpGVLPLTQPVGpinrpVGPGVLPVSPSVTPGVLQAvspgvlsvsravpsgvlPAGQMTPAGQMTPaGVIPG 446
Cdd:pfam09606 138 GFPSQMS---RVGRMQPGGQ-----AGGMMQPSSGQPGSGTPNQ-----------------MGPNGGPGQGQAG-GMNGG 191
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 767998433 447 QTATSGVLPTGQMVQSGVL-------PVGQTAPSRVLPP---GQTAPLRVISAGQVVPSGllsPNQTVSSSAVVPVNQgV 516
Cdd:pfam09606 192 QQGPMGGQMPPQMGVPGMPgpadagaQMGQQAQANGGMNpqqMGGAPNQVAMQQQQPQQQ---GQQSQLGMGINQMQQ-M 267
|
250 260 270 280 290 300 310 320
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 767998433 517 NSGVLQLSQPVVSGVLPVGQPVRPGVLQLNQTVGTNILPVNQPVRPgasqnttfltsgsilRQlipTGKQVNGIPTytlA 596
Cdd:pfam09606 268 PQGVGGGAGQGGPGQPMGPPGQQPGAMPNVMSIGDQNNYQQQQTRQ---------------QQ---QQQGGNHPAA---H 326
|
330 340 350 360 370 380 390 400
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 767998433 597 PVSVTLPVPPGGLATVAPPQMPIQLLPSGA-----AAPMAGSMPGMPSPPVLVNAAQSVFVQasssaadTNQVLKQAKQw 671
Cdd:pfam09606 327 QQQMNQSVGQGGQVVALGGLNHLETWNPGNfgglgANPMQRGQPGMMSSPSPVPGQQVRQVT-------PNQFMRQSPQ- 398
|
410
....*....|....*...
gi 767998433 672 ktcpvcnelfPSNVYQVH 689
Cdd:pfam09606 399 ----------PSVPSPQG 406
|
|
| SP4_N |
cd22536 |
N-terminal domain of transcription factor Specificity Protein (SP) 4; Specificity Proteins ... |
255-648 |
1.13e-03 |
|
N-terminal domain of transcription factor Specificity Protein (SP) 4; Specificity Proteins (SPs) are transcription factors that are involved in many cellular processes, including cell differentiation, cell growth, apoptosis, immune responses, response to DNA damage, and chromatin remodeling. Human SP4 is a risk gene of multiple psychiatric disorders including schizophrenia, bipolar disorder, and major depression. SP4 belongs to a family of proteins, called the SP/Kruppel or Krueppel-like Factor (KLF) family, characterized by a C-terminal DNA-binding domain of 81 amino acids consisting of three Kruppel-like C2H2 zinc fingers. These factors bind to a loose consensus motif, namely NNRCRCCYY (where N is any nucleotide; R is A/G, and Y is C/T), such as the recurring motifs in GC and GT boxes (5'-GGGGCGGGG-3' and 5'-GGTGTGGGG-3') that are present in promoters and more distal regulatory elements of mammalian genes. SP factors preferentially bind GC boxes, while KLFs bind CACCC boxes. Another characteristic hallmark of SP factors is the presence of the Buttonhead (BTD) box CXCPXC, just N-terminal to the zinc fingers. The function of the BTD box is unknown, but it is thought to play an important physiological role. Another feature of most SP factors is the presence of a conserved amino acid stretch, the so-called SP box, located close to the N-terminus. SP factors may be separated into three groups based on their domain architecture and the similarity of their N-terminal transactivation domains: SP1-4, SP5, and SP6-9. The transactivation domains between the three groups are not homologous to one another. SP1-4 have similar N-terminal transactivation domains characterized by glutamine-rich regions, which, in most cases, have adjacent serine/threonine-rich regions. This model represents the N-terminal domain of SP4.
Pssm-ID: 411773 [Multi-domain] Cd Length: 623 Bit Score: 42.98 E-value: 1.13e-03
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 767998433 255 PANGSAPSAPAQppcFHLALPQNSPSPAAGQPVTVA---QGAPGSLTHSPPAAGQSHMTLVSSP--LPVGQNSLTLQPPA 329
Cdd:cd22536 115 KAGNSNASAPGQ---FQVIQVQNMQNPSGSVQYQVIpqiQTVEGQQIQISPANATALQDLQGQIqlIPAGNNQAILTTPN 191
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 767998433 330 PQP-------VFLSHGVPLHqsVNPPV-LPLSQPVGPVNKSVGTSVLPINQtvrpGVLPLTQPVgpINRPVGPG-----V 396
Cdd:cd22536 192 RTAsgniiaqNLANQTVPVQ--IRPGVsIPLQLQTIPGAQAQVVTTLPINI----GGVTLALPV--INNVAAGGgsgqlV 263
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 767998433 397 LPVSPSVTPGVLQAVSPGVLSVSRAVPSgvlpagqmtpagqmTPAGVIPGQTATSGVLPTGQMVQSGVLPVGQ--TAPSR 474
Cdd:cd22536 264 QPSDGGVSNGNQLVSTPITTASVSTMPE--------------SPSSSTTCTTTASTSLTSSDTLVSSAETGQYasTAASS 329
|
250 260 270 280 290 300 310 320
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 767998433 475 VL----PPGQTAPLRVISAGQVVPSGLLS-PNQTVSSSAVVPVNQGVNSgVLQLSQPVVSgVLPVGQPVRPgVLQLNQTV 549
Cdd:cd22536 330 ERteeePQTSAAESEAQSSSQLQSNGLQNvQDQSNSLQQVQIVGQPILQ-QIQIQQPQQQ-IIQAIQPQSF-QLQSGQTI 406
|
330 340 350 360 370 380 390 400
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 767998433 550 GTNILPVNQPVRPGASQNTT-------FLT-SGSI---------LRQLIPTGKQVNGIPTY-TLAPVSVTlpvppGGLAT 611
Cdd:cd22536 407 QTIQQQPLQNVQLQAVQSPTqvlirapTLTpSGQIswqtvqvqnIQSLSNLQVQNAGLPQQlTLTPVSSS-----AGGTT 481
|
410 420 430
....*....|....*....|....*....|....*..
gi 767998433 612 VAppqmpiQLLPsgaaAPMAGSmpgmpspPVLVNAAQ 648
Cdd:cd22536 482 IA------QIAP----VAVAGT-------PITLNAAQ 501
|
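The SP4 description above gives the binding consensus in IUPAC notation (NNRCRCCYY, with N any nucleotide, R = A/G, Y = C/T). As a minimal sketch of how such a degenerate consensus can be checked against a candidate site, the Python below expands the three codes into a regular expression and tests both strands. The function names and the example site `CCACACCCT` are hypothetical and chosen only for illustration; they do not come from the CDD record.

```python
import re

# IUPAC codes named in the description (N, R, Y) plus the four plain bases.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]",    # purine
    "Y": "[CT]",    # pyrimidine
    "N": "[ACGT]",  # any nucleotide
}

def consensus_to_regex(consensus):
    """Expand an IUPAC consensus string into a compiled regular expression."""
    return re.compile("".join(IUPAC[code] for code in consensus.upper()))

def reverse_complement(seq):
    """Return the reverse complement of an unambiguous DNA sequence."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]

def matches_either_strand(consensus, site):
    """True if the site, or its reverse complement, fits the consensus exactly."""
    pattern = consensus_to_regex(consensus)
    site = site.upper()
    return bool(pattern.fullmatch(site) or pattern.fullmatch(reverse_complement(site)))

# Hypothetical CACCC-type site, used only to exercise the functions.
print(matches_either_strand("NNRCRCCYY", "CCACACCCT"))
```

Exact matching of this kind is stricter than the "loose consensus" the description refers to, so a real binding site need not fit the pattern letter for letter.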
|
| Atrophin-1 |
pfam03154 |
Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian ... |
225-480 |
1.50e-03 |
|
Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian atrophy (DRPLA) gene. DRPLA (OMIM:125370) is a progressive neurodegenerative disorder. It is caused by the expansion of a CAG repeat in the DRPLA gene on chromosome 12p. This results in an extended polyglutamine region in atrophin-1 that is thought to confer toxicity to the protein, possibly by altering its interactions with other proteins. The expansion of a CAG repeat is also the underlying defect in six other neurodegenerative disorders, including Huntington's disease. One interaction of expanded polyglutamine repeats that is thought to be pathogenic is that with the short glutamine repeat in the transcriptional coactivator CREB binding protein, CBP. This interaction draws CBP away from its usual nuclear location to the expanded polyglutamine repeat protein aggregates that are characteristic of the polyglutamine neurodegenerative disorders. This interferes with CBP-mediated transcription and causes cytotoxicity.
Pssm-ID: 460830 [Multi-domain] Cd Length: 991 Bit Score: 42.83 E-value: 1.50e-03
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 767998433 225 LRSVISEHIKRtglLKQTHIAPKPAAHLAAPANgsAPSAPAQPPCFHLALP------QNSPS----PAAGQPVTV----- 289
Cdd:pfam03154 231 IQQTPTLHPQR---LPSPHPPLQPMTQPPPPSQ--VSPQPLPQPSLHGQMPpmphslQTGPShmqhPVPPQPFPLtpqss 305
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 767998433 290 -AQGAPGSLTHSPPAAGQSHMTLVSSPLPVGQNSLTLQPPAPQPVFL------------------SHGVPLHQSVNPPV- 349
Cdd:pfam03154 306 qSQVPPGPSPAAPGQSQQRIHTPPSQSQLQSQQPPREQPLPPAPLSMphikpppttpipqlpnpqSHKHPPHLSGPSPFq 385
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 767998433 350 LPLSQPVGPVNK---SVGTSVLPINQTVRPGVLPLTQPVGPinRPVGPGVLPVSPSVTPGVLQAVSPGvlSVSRAVPSGV 426
Cdd:pfam03154 386 MNSNLPPPPALKplsSLSTHHPPSAHPPPLQLMPQSQQLPP--PPAQPPVLTQSQSLPPPAASHPPTS--GLHQVPSQSP 461
|
250 260 270 280 290
....*....|....*....|....*....|....*....|....*....|....*....
gi 767998433 427 LPAGQMTPAG--QMTPAGVIPgqTATSGVLPTGQMVQSGVLPVGQTAP---SRVLPPGQ 480
Cdd:pfam03154 462 FPQHPFVPGGppPITPPSGPP--TSTSSAMPGIQPPSSASVSSSGPVPaavSCPLPPVQ 518
|
|
| PRK07994 |
PRK07994 |
DNA polymerase III subunits gamma and tau; Validated |
247-411 |
1.78e-03 |
|
DNA polymerase III subunits gamma and tau; Validated
Pssm-ID: 236138 [Multi-domain] Cd Length: 647 Bit Score: 42.55 E-value: 1.78e-03
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 767998433 247 KPAAHLAAPANGSAPSAPAqppcfhlALPQNSPSPAAGQPVTVAQGAPGSLTHSPPAAGQSHMTLVSSPLPVGQNSLTLQ 326
Cdd:PRK07994 360 HPAAPLPEPEVPPQSAAPA-------ASAQATAAPTAAVAPPQAPAVPPPPASAPQQAPAVPLPETTSQLLAARQQLQRA 432
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 767998433 327 PPAPQPvflshgvplhqsvnppvlPLSQPVGPVNKSVGTSVLPINQTVRPgvLPLTQPVGPIN------RPVGPGVLPVS 400
Cdd:PRK07994 433 QGATKA------------------KKSEPAAASRARPVNSALERLASVRP--APSALEKAPAKkeayrwKATNPVEVKKE 492
|
170
....*....|.
gi 767998433 401 PSVTPGVLQAV 411
Cdd:PRK07994 493 PVATPKALKKA 503
|
|
| PPE |
COG5651 |
PPE-repeat protein [Function unknown]; |
409-662 |
1.84e-03 |
|
PPE-repeat protein [Function unknown];
Pssm-ID: 444372 [Multi-domain] Cd Length: 385 Bit Score: 41.80 E-value: 1.84e-03
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 767998433 409 QAVSpgvLSVSRAVPSGVLPAGQMTPAGQMTPAGVIPGQTATS---GVLPTGQMVQSGVLPVGQTAPSRVLPPGQTAPLR 485
Cdd:COG5651 155 AAAS---AAAVALTPFTQPPPTITNPGGLLGAQNAGSGNTSSNpgfANLGLTGLNQVGIGGLNSGSGPIGLNSGPGNTGF 231
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 767998433 486 VISAGQVVPSGLLSPNQTVSSSAVVPVNQGVNSGVLQLSQPVVSGVLPvgqpvrpgvlqlNQTVGTNILPVNQPVRPGAS 565
Cdd:COG5651 232 AGTGAAAGAAAAAAAAAAAAGAGASAALASLAATLLNASSLGLAATAA------------SSAATNLGLAGSPLGLAGGG 299
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 767998433 566 QNTTFLTSGSilrqliptgkqvNGIPTYTLAPVSVTLPVPPGGLATVAPPQMPIQLLPSGAAAPMAGSMPGMPSPPVLVN 645
Cdd:COG5651 300 AGAAAATGLG------------LGAGGAAGAAGATGAGAALGAGAAAAAAGAAAGAGAAAAAAAGGAGGGGGGALGAGGG 367
|
250
....*....|....*..
gi 767998433 646 AAQSVFVQASSSAADTN 662
Cdd:COG5651 368 GGSAGAAAGAASGGGAA 384
|
|
| half-pint |
TIGR01645 |
poly-U binding splicing factor, half-pint family; The proteins represented by this model ... |
384-500 |
7.47e-03 |
|
poly-U binding splicing factor, half-pint family; The proteins represented by this model contain three RNA recognition motifs (rrm: pfam00076) and have been characterized as poly-pyrimidine tract binding proteins associated with RNA splicing factors. PUF60 (GP|6176532), in complex with p54 and in the presence of U2AF, facilitates association of U2 snRNP with pre-mRNA.
Pssm-ID: 130706 [Multi-domain] Cd Length: 612 Bit Score: 40.44 E-value: 7.47e-03
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 767998433 384 PVGPINRPVGPGVLPVSPSVTPGvlqAVSPGVLSVSRAVPSGVLPAGQMTPAgqmTPAGVIPGQTATSGVLPTGQMVQSG 463
Cdd:TIGR01645 284 PPDALLQPATVSAIPAAAAVAAA---AATAKIMAAEAVAGAAVLGPRAQSPA---TPSSSLPTDIGNKAVVSSAKKEAEE 357
|
90 100 110
....*....|....*....|....*....|....*..
gi 767998433 464 VLPVGQTAPSRVLPPGQTAPLRVISAGQVVPSGLLSP 500
Cdd:TIGR01645 358 VPPLPQAAPAVVKPGPMEIPTPVPPPGLAIPSLVAPP 394
|
|
| PHA03378 |
PHA03378 |
EBNA-3B; Provisional |
245-466 |
8.76e-03 |
|
EBNA-3B; Provisional
Pssm-ID: 223065 [Multi-domain] Cd Length: 991 Bit Score: 40.44 E-value: 8.76e-03
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 767998433 245 APKPAAHLAA-------PANGSAPSAPAQPPCFHLALPQNSPSPA---AGQPVTVAQGAPGSLTHSPPAAGQSHMTlvSS 314
Cdd:PHA03378 700 APTPMRPPAAppgraqrPAAATGRARPPAAAPGRARPPAAAPGRArppAAAPGRARPPAAAPGRARPPAAAPGAPT--PQ 777
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 767998433 315 PLPVGQNSLTLQP---PAPQPVflSHGVPLHQSVNPPVLPLSQpvGPVNKSVGTSVLPINQTVRPGVlpLTQPVGPINRP 391
Cdd:PHA03378 778 PPPQAPPAPQQRPrgaPTPQPP--PQAGPTSMQLMPRAAPGQQ--GPTKQILRQLLTGGVKRGRPSL--KKPAALERQAA 851
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 767998433 392 VGPGVLPVS---------PSVTPGVLQAVS-PGVLSVSRAVPSGVLP------AGQMTPAGQMTPAGVIPGQTATSGVLP 455
Cdd:PHA03378 852 AGPTPSPGSgtsdkivqaPVFYPPVLQPIQvMRQLGSVRAAAASTVTqapteyTGERRGVGPMHPTDIPPSKRAKTDAYV 931
|
250
....*....|.
gi 767998433 456 TGQMVQSGVLP 466
Cdd:PHA03378 932 ESQPPHGGQSH 942
|
|
|