|
Name |
Accession |
Description |
Interval |
E-value |
| PTZ00121 |
PTZ00121 |
MAEBL; Provisional |
381-1042 |
3.48e-14 |
|
MAEBL; Provisional
Pssm-ID: 173412 [Multi-domain] Cd Length: 2084 Bit Score: 79.03 E-value: 3.48e-14
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 381 ELENIVDEVQRKETkdsgIKWDSTISYTAQAERT-PDLTELRQQPVASEDISEDSTKDNVSLKKGDFYQEDETDEYQSWK 459
Cdd:PTZ00121 1221 EDAKKAEAVKKAEE----AKKDAEEAKKAEEERNnEEIRKFEEARMAHFARRQAAIKAEEARKADELKKAEEKKKADEAK 1296
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 460 RSH--KKATYVYETSGPNLSDNKSGQKVSEAKPsqyyELQVLKKKRKEMKSFSEDKsKSPTEAKRKHLSLTETKSQGGKS 537
Cdd:PTZ00121 1297 KAEekKKADEAKKKAEEAKKADEAKKKAEEAKK----KADAAKKKAEEAKKAAEAA-KAEAEAAADEAEAAEEKAEAAEK 1371
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 538 GTSmmmlEQFRK---VKRESPFDKRPTAAEIKVEPTTESLD--KEGKGEIRSLVEPLSMIQFDDTAEPQKGKIKGKKHHI 612
Cdd:PTZ00121 1372 KKE----EAKKKadaAKKKAEEKKKADEAKKKAEEDKKKADelKKAAAAKKKADEAKKKAEEKKKADEAKKKAEEAKKAD 1447
|
250 260 270 280 290 300 310 320
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 613 SSGTITSKEEKTEEKEELTKQVKSHQLVKSLSRVAKETSESTRVLESPDGKSEQsnLEEFQEAimaflKQKIDNIGKAFD 692
Cdd:PTZ00121 1448 EAKKKAEEAKKAEEAKKKAEEAKKADEAKKKAEEAKKADEAKKKAEEAKKKADE--AKKAAEA-----KKKADEAKKAEE 1520
|
330 340 350 360 370 380 390 400
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 693 KKtvpKEEELLKRAEAEKLGiiKAKMEEYFQKVAEtvtkiLRKYKDTKKEEQVgeKPIKQKKVVSFMPGLHFQKSPISAK 772
Cdd:PTZ00121 1521 AK---KADEAKKAEEAKKAD--EAKKAEEKKKADE-----LKKAEELKKAEEK--KKAEEAKKAEEDKNMALRKAEEAKK 1588
|
410 420 430 440 450 460 470 480
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 773 SESSTLLSYESTDPVINNLIQMILAEIESERdiPTVSTVQKDHKEKEKQRQEQYLQEgQEQMSGMSLKQQllGERNLLKE 852
Cdd:PTZ00121 1589 AEEARIEEVMKLYEEEKKMKAEEAKKAEEAK--IKAEELKKAEEEKKKVEQLKKKEA-EEKKKAEELKKA--EEENKIKA 1663
|
490 500 510 520 530 540 550 560
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 853 HYEKISENWEEKKAwlqmKEGKQEQQSQKQWQEEEMWKEEQKQATPKQAEQEEKQKQRGQE---EEELPKSSLQRLEEGT 929
Cdd:PTZ00121 1664 AEEAKKAEEDKKKA----EEAKKAEEDEKKAAEALKKEAEEAKKAEELKKKEAEEKKKAEElkkAEEENKIKAEEAKKEA 1739
|
570 580 590 600 610 620 630 640
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 930 QKMKTQGLLLEKENGQMRQIQ---KEAKHLGPHRRREKG---KE--KQKPERGLEDLERQIK-TKDQMQMKETQPKELEK 1000
Cdd:PTZ00121 1740 EEDKKKAEEAKKDEEEKKKIAhlkKEEEKKAEEIRKEKEaviEEelDEEDEKRRMEVDKKIKdIFDNFANIIEGGKEGNL 1819
|
650 660 670 680
....*....|....*....|....*....|....*....|....
gi 224458301 1001 MVIQTPMTLSPRWKSVL--KDVQRSyEGKEFQRNLKTLENLPDE 1042
Cdd:PTZ00121 1820 VINDSKEMEDSAIKEVAdsKNMQLE-EADAFEKHKFNKNNENGE 1862
|
|
| PHA03247 |
PHA03247 |
large tegument protein UL36; Provisional |
1602-1953 |
3.30e-09 |
|
large tegument protein UL36; Provisional
Pssm-ID: 223021 [Multi-domain] Cd Length: 3151 Bit Score: 62.65 E-value: 3.30e-09
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1602 PLTPQQAQAQGIPlTPQQAQAlgISLTPQQAQAQGITLTPQQAQalgvPITPVN-----AWVSAVTLTSEQTHALESPMN 1676
Cdd:PHA03247 2557 PAAPPAAPDRSVP-PPRPAPR--PSEPAVTSRARRPDAPPQSAR----PRAPVDdrgdpRGPAPPSPLPPDTHAPDPPPP 2629
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1677 LEQAQEQLLKLGVPLTLDKAHTLGSPLTLKQVQWSHRPFQKSKASLPTG--QSIISRLSPSLRLSLASSA--PTAEKSSi 1752
Cdd:PHA03247 2630 SPSPAANEPDPHPPPTVPPPERPRDDPAPGRVSRPRRARRLGRAAQASSppQRPRRRAARPTVGSLTSLAdpPPPPPTP- 2708
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1753 fgVSSTPLQISRVPLNQGPFAPGKplemgilSEPGKLGAPQTLRSSGQTLVYGGQSTSAQFPAPQAPPSPG--QLPISRA 1830
Cdd:PHA03247 2709 --EPAPHALVSATPLPPGPAAARQ-------ASPALPAAPAPPAVPAGPATPGGPARPARPPTTAGPPAPAppAAPAAGP 2779
|
250 260 270 280 290 300 310 320
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1831 PPTPGQPFIAGVPPTSGQIPSLWAPLSPGQPLVPEASSIPGDLLESGPLTfseqlqefqPPATAEQ--SPYLQAPSTPGQ 1908
Cdd:PHA03247 2780 PRRLTRPAVASLSESRESLPSPWDPADPPAAVLAPAAALPPAASPAGPLP---------PPTSAQPtaPPPPPGPPPPSL 2850
|
330 340 350 360
....*....|....*....|....*....|....*....|....*
gi 224458301 1909 HLATWTLPGrasslwiPPTSRHPPTLWPSPAPGKPQKSWSPSVAK 1953
Cdd:PHA03247 2851 PLGGSVAPG-------GDVRRRPPSRSPAAKPAAPARPPVRRLAR 2888
|
|
| Atrophin-1 |
pfam03154 |
Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian ... |
1512-1940 |
5.64e-09 |
|
Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian atrophy (DRPLA) gene. DRPLA (OMIM:125370) is a progressive neurodegenerative disorder. It is caused by the expansion of a CAG repeat in the DRPLA gene on chromosome 12p. This results in an extended polyglutamine region in atrophin-1 that is thought to confer toxicity to the protein, possibly by altering its interactions with other proteins. The expansion of a CAG repeat is also the underlying defect in six other neurodegenerative disorders, including Huntington's disease. One interaction of expanded polyglutamine repeats that is thought to be pathogenic is with the short glutamine repeat in the transcriptional coactivator CREB binding protein, CBP. This interaction draws CBP away from its usual nuclear location to the expanded polyglutamine repeat protein aggregates that are characteristic of the polyglutamine neurodegenerative disorders. This interferes with CBP-mediated transcription and causes cytotoxicity.
Pssm-ID: 460830 [Multi-domain] Cd Length: 991 Bit Score: 61.71 E-value: 5.64e-09
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1512 AQELGIPLTPQQAQALGIPLIPPQAQELGIPLTPQQVQALGIPLIPPQAQELEIPLtPQQAQALGIPLTPQQAqelGIPL 1591
Cdd:pfam03154 162 AQQQILQTQPPVLQAQSGAASPPSPPPPGTTQAATAGPTPSAPSVPPQGSPATSQP-PNQTQSTAAPHTLIQQ---TPTL 237
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1592 TPQQ-------AQELGIPLTPQQAQAQG---------IPLTPQQAQAlGISLTPQQAQAQGITLTPQQAQAlGVPITPVN 1655
Cdd:pfam03154 238 HPQRlpsphppLQPMTQPPPPSQVSPQPlpqpslhgqMPPMPHSLQT-GPSHMQHPVPPQPFPLTPQSSQS-QVPPGPSP 315
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1656 AWVSAVTLTSEQTHALESPMNLEQAQEQLLKlgvPLTLDKAHTLGSPLT-LKQVQWSHRPFQKSKASLPTGQSIISRLSP 1734
Cdd:pfam03154 316 AAPGQSQQRIHTPPSQSQLQSQQPPREQPLP---PAPLSMPHIKPPPTTpIPQLPNPQSHKHPPHLSGPSPFQMNSNLPP 392
|
250 260 270 280 290 300 310 320
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1735 SLRLSLASSAPTAEKSSifgVSSTPLQIsrvpLNQGPFAPGKPLEMGILSepgklgapqtlrssgqtlvyggQSTSAQFP 1814
Cdd:pfam03154 393 PPALKPLSSLSTHHPPS---AHPPPLQL----MPQSQQLPPPPAQPPVLT----------------------QSQSLPPP 443
|
330 340 350 360 370 380 390 400
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1815 APQAPPSPGQLPISRAPPTPGQPFIAGVPPTSgqipslwapLSPGQPLVPEASSIPGdllesgpltfseqlqeFQPPATA 1894
Cdd:pfam03154 444 AASHPPTSGLHQVPSQSPFPQHPFVPGGPPPI---------TPPSGPPTSTSSAMPG----------------IQPPSSA 498
|
410 420 430 440
....*....|....*....|....*....|....*....|....*.
gi 224458301 1895 EQSPYLQAPSTPGQHLATWTLPGRASSLWIPPTSRHPPTLWPSPAP 1940
Cdd:pfam03154 499 SVSSSGPVPAAVSCPLPPVQIKEEALDEAEEPESPPPPPRSPSPEP 544
|
|
| PHA03379 |
PHA03379 |
EBNA-3A; Provisional |
1216-1621 |
9.16e-09 |
|
EBNA-3A; Provisional
Pssm-ID: 223066 [Multi-domain] Cd Length: 935 Bit Score: 61.23 E-value: 9.16e-09
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1216 GIPLTPQQAQALGITLTLQQAQQLGIPLTPQQAQALGITLTPKQVQELGIPLTPQQAQALGITLTPKQAQELGIPLNPQQ 1295
Cdd:PHA03379 414 GTPRPPVEKPRPEVPQSLETATSHGSAQVPEPPPVHDLEPGPLHDQHSMAPCPVAQLPPGPLQDLEPGDQLPGVVQDGRP 493
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1296 AQT--------LGIPLTPKQAQALGIPFTPQQAQALGIPLTPQQAQTQEITLTPQQAQAL----GMPLTTQQAQE--LGI 1361
Cdd:PHA03379 494 ACApvpapagpIVRPWEASLSQVPGVAFAPVMPQPMPVEPVPVPTVALERPVCPAPPLIAmqgpGETSGIVRVRErwRPA 573
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1362 PLTPQHAQALG-MPLTTQQAQeLGIPLTPQQAQALGMPLTTQQA---QELGIPLTPQQAQELGIPFTPQQAQAQEITLTP 1437
Cdd:PHA03379 574 PWTPNPPRSPSqMSVRDRLAR-LRAEAQPYQASVEVQPPQLTQVspqQPMEYPLEPEQQMFPGSPFSQVADVMRAGGVPA 652
|
250 260 270 280 290 300 310 320
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1438 QQAQALGMPLtaqqaqelgitltpQQAQELGIPLTPQQAQALGIPLIPP-QAQELGIPLTPQQAQALGILLIPPQAQELG 1516
Cdd:PHA03379 653 MQPQYFDLPL--------------QQPISQGAPLAPLRASMGPVPPVPAtQPQYFDIPLTEPINQGASAAHFLPQQPMEG 718
|
330 340 350 360 370 380 390 400
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1517 iPLTPQQAQALGIPLIP------PQAQELGIPLTpqQVQALGIPLIP-PQAQELEIPLTPQQAQALGIPLTPqqaqelGI 1589
Cdd:PHA03379 719 -PLVPERWMFQGATLSQsvrpgvAQSQYFDLPLT--QPINHGAPAAHfLHQPPMEGPWVPEQWMFQGAPPSQ------GT 789
|
410 420 430
....*....|....*....|....*....|..
gi 224458301 1590 PLTPQQAQELGIPltPQQAQAQGIPLTPQQAQ 1621
Cdd:PHA03379 790 DVVQHQLDALGYV--LHVLNHPGVPVSPAVNQ 819
|
|
| SMC_N |
pfam02463 |
RecF/RecN/SMC N terminal domain; This domain is found at the N terminus of SMC proteins. The ... |
291-1039 |
2.55e-07 |
|
RecF/RecN/SMC N terminal domain; This domain is found at the N terminus of SMC proteins. The SMC (structural maintenance of chromosomes) superfamily proteins have ATP-binding domains at the N- and C-termini, and two extended coiled-coil domains separated by a hinge in the middle. The eukaryotic SMC proteins form two kinds of heterodimers: the SMC1/SMC3 and the SMC2/SMC4 types. These heterodimers constitute an essential part of higher-order complexes, which are involved in chromatin and DNA dynamics. This family also includes the RecF and RecN proteins that are involved in DNA metabolism and recombination.
Pssm-ID: 426784 [Multi-domain] Cd Length: 1161 Bit Score: 56.52 E-value: 2.55e-07
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 291 AHETSEAEKELSLKIIRDLSNENEMLQQKLQDAEEKCEQLIRSKIVIEQLYAKLSTSSTLKVLPGPSPQSSRAIIKVGDT 370
Cdd:pfam02463 254 ESSKQEIEKEEEKLAQVLKENKEEEKEKKLQEEELKLLAKEEEELKSELLKLERRKVDDEEKLKESEKEKKKAEKELKKE 333
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 371 EDNMDN---------ILDKELENIVDEVQRKETKDSGIKWDSTISYTAQAERTPDLTELRQQPVASEDISEDSTKDNVSL 441
Cdd:pfam02463 334 KEEIEElekelkeleIKREAEEEEEEELEKLQEKLEQLEEELLAKKKLESERLSSAAKLKEEELELKSEEEKEAQLLLEL 413
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 442 KKGDFYQEDETDEYQSWKRSHKKatyvyetsgpNLSDNKSGQKVSEAKPSQYYELQVLKKKRKEMKSFSEDKSKSPTEAK 521
Cdd:pfam02463 414 ARQLEDLLKEEKKEELEILEEEE----------ESIELKQGKLTEEKEELEKQELKLLKDELELKKSEDLLKETQLVKLQ 483
|
250 260 270 280 290 300 310 320
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 522 RKHLSLTETKSQGG---KSGTSMMMLEQFRKVKRESPFDKRPTAAEIKVEPTTESLDKEGKGEIRSLVEPLSMIQFDDTA 598
Cdd:pfam02463 484 EQLELLLSRQKLEErsqKESKARSGLKVLLALIKDGVGGRIISAHGRLGDLGVAVENYKVAISTAVIVEVSATADEVEER 563
|
330 340 350 360 370 380 390 400
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 599 EP-QKGKIKGKKHHISSGTITSKEEKTEEKEELTKQVKSHQLVKSLSRVAKETSESTRVLESPDG---KSEQSNLEEFQE 674
Cdd:pfam02463 564 QKlVRALTELPLGARKLRLLIPKLKLPLKSIAVLEIDPILNLAQLDKATLEADEDDKRAKVVEGIlkdTELTKLKESAKA 643
|
410 420 430 440 450 460 470 480
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 675 AIMAFLKQKIDNIGKAFDKKTVPKEEELLKRAEAEKLGIIKAKMEEYFQKVAETVTKILRKYKDTKKEEQVgekpIKQKK 754
Cdd:pfam02463 644 KESGLRKGVSLEEGLAEKSEVKASLSELTKELLEIQELQEKAESELAKEEILRRQLEIKKKEQREKEELKK----LKLEA 719
|
490 500 510 520 530 540 550 560
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 755 VVSFMPGLHFQKSPISaksesstllsyestdpVINNLIQMILAEIESERDIPTVSTVQKDHKEKEKQRQEQYLQEGQEQm 834
Cdd:pfam02463 720 EELLADRVQEAQDKIN----------------EELKLLKQKIDEEEEEEEKSRLKKEEKEEEKSELSLKEKELAEEREK- 782
|
570 580 590 600 610 620 630 640
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 835 sgmslkqQLLGERNLLKEHYEKI---SENWEEKKAWLQMKEGKQEQqsqkqwqeeemwkEEQKQATPKQAEQEEKQKQRG 911
Cdd:pfam02463 783 -------TEKLKVEEEKEEKLKAqeeELRALEEELKEEAELLEEEQ-------------LLIEQEEKIKEEELEELALEL 842
|
650 660 670 680 690 700 710 720
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 912 QEEEELPKSSLQRLEEgtQKMKTQGLLLEKENGQMRQIQKEAKHlgphrrREKGKEKQKPERGLEDLERQIKTKDQMQMK 991
Cdd:pfam02463 843 KEEQKLEKLAEEELER--LEEEITKEELLQELLLKEEELEEQKL------KDELESKEEKEKEEKKELEEESQKLNLLEE 914
|
730 740 750 760
....*....|....*....|....*....|....*....|....*...
gi 224458301 992 ETQPKELEKMVIQTPMTLSPRWKSVLKDVQRSYEGKEFQrNLKTLENL 1039
Cdd:pfam02463 915 KENEIEERIKEEAEILLKYEEEPEELLLEEADEKEKEEN-NKEEEEER 961
|
|
| SP2_N |
cd22540 |
N-terminal domain of transcription factor Specificity Protein (SP) 2; Specificity Proteins ... |
1046-1487 |
3.09e-06 |
|
N-terminal domain of transcription factor Specificity Protein (SP) 2; Specificity Proteins (SPs) are transcription factors that are involved in many cellular processes, including cell differentiation, cell growth, apoptosis, immune responses, response to DNA damage, and chromatin remodeling. SP2 contains the least conserved DNA-binding domain within the SP subfamily of proteins, and its DNA sequence specificity differs from the other SP proteins. It localizes primarily within subnuclear foci associated with the nuclear matrix, and can activate, or in some cases, repress expression from different promoters. The transcription factor SP2 serves as a paradigm for indirect genomic binding. It does not require its DNA-binding domain for genomic DNA binding and occupies target promoters independently of whether they contain a cognate DNA-binding motif. SP2 belongs to a family of proteins, called the SP/Kruppel or Krueppel-like Factor (KLF) family, characterized by a C-terminal DNA-binding domain of 81 amino acids consisting of three Kruppel-like C2H2 zinc fingers. These factors bind to a loose consensus motif, namely NNRCRCCYY (where N is any nucleotide; R is A/G, and Y is C/T), such as the recurring motifs in GC and GT boxes (5'-GGGGCGGGG-3' and 5'-GGTGTGGGG-3') that are present in promoters and more distal regulatory elements of mammalian genes. SP factors preferentially bind GC boxes, while KLFs bind CACCC boxes. Another characteristic hallmark of SP factors is the presence of the Buttonhead (BTD) box CXCPXC, just N-terminal to the zinc fingers. The function of the BTD box is unknown, but it is thought to play an important physiological role. Another feature of most SP factors is the presence of a conserved amino acid stretch, the so-called SP box, located close to the N-terminus. SP factors may be separated into three groups based on their domain architecture and the similarity of their N-terminal transactivation domains: SP1-4, SP5, and SP6-9.
The transactivation domains between the three groups are not homologous to one another. SP1-4 have similar N-terminal transactivation domains characterized by glutamine-rich regions, which, in most cases, have adjacent serine/threonine-rich regions. This model represents the N-terminal domain of SP2.
Pssm-ID: 411776 [Multi-domain] Cd Length: 511 Bit Score: 52.24 E-value: 3.09e-06
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1046 ISITPPPSLQYSLPGALPISGQPLTkcihlTPQQAQEVGITltpqqaQAQGITLTLQQAQELGIPLTPQQAQALEILFTP 1125
Cdd:cd22540 44 AAVTPPAPPQPTPRKLVPIKPAPLP-----LGPGKNSIGFL------SAKGNIIQLQGSQLSSSAPGGQQVFAIQNPTMI 112
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1126 QQAQALGIPLTPQ---QTQVQGITLTPQQDQAPGISlTTQQAQKLGIPLTPQQAQA-LGIPLTPQQAQELGIPLTPQQAQ 1201
Cdd:cd22540 113 IKGSQTRSSTNQQyqiSPQIQAAGQINNSGQIQIIP-GTNQAIITPVQVLQQPQQAhKPVPIKPAPLQTSNTNSASLQVP 191
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1202 ALRVSLTPQQAQELGIPLTPQQAQALGITLTL-------------QQAQQLGIPLTP--QQAQALGITLTPKQVQELGIP 1266
Cdd:cd22540 192 GNVIKLQSGGNVALTLPVNNLVGTQDGATQLQlaaapskpskkirKKSAQAAQPAVTvaEQVETVLIETTADNIIQAGNN 271
|
250 260 270 280 290 300 310 320
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1267 LTPQQAQALGITLTPKQAQELGIPLNPQQAQTLGIPLTPKQAQALGIPFTPQQAQALGIPLTPQQAQTQEITLTPQ-QAQ 1345
Cdd:cd22540 272 LLIVQSPGTGQPAVLQQVQVLQPKQEQQVVQIPQQALRVVQAASATLPTVPQKPLQNIQIQNSEPTPTQVYIKTPSgEVQ 351
|
330 340 350 360 370 380 390 400
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1346 ALGM---PLTTQQAQELGIPLTPQHAQALGMPLTTQQAQELGIPLTPQQAQALGmpLTTQQAQELGIplTPQQAQELGIP 1422
Cdd:cd22540 352 TVLLqeaPAATATPSSSTSTVQQQVTANNGTGTSKPNYNVRKERTLPKIAPAGG--IISLNAAQLAA--AAQAIQTININ 427
|
410 420 430 440 450 460
....*....|....*....|....*....|....*....|....*....|....*....|....*
gi 224458301 1423 FTPQQAQAQEITLTPQQAQalgmpLTAQQAQELGITLTPQQAQElgipLTPQQAQALGIPLIPPQ 1487
Cdd:cd22540 428 GVQVQGVPVTITNAGGQQQ-----LTVQTVSSNNLTISGLSPTQ----IQLQMEQALEIETQPGE 483
|
|
| AvrBs3 |
NF041308 |
type III secretion system effector avirulence protein AvrBs3; |
1087-1646 |
1.65e-05 |
|
type III secretion system effector avirulence protein AvrBs3;
Pssm-ID: 469205 [Multi-domain] Cd Length: 1179 Bit Score: 50.34 E-value: 1.65e-05
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1087 LTPQQ----AQAQGITLTLQQAQELgIPLTPQQAQALeilfTPQQAQALGIPLTPQQTQVQGITLTPQQDQAPgISLTTQ 1162
Cdd:NF041308 326 LTPEQvvaiASNDGGKQALETVQRL-LPVLCQAEHGL----TPDQVVAIASNIGGKPALETVQRLLPVLCQPP-HGLTPD 399
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1163 QAQKLGIPLTPQQAQALGIPLTPQQAQELGiPLTPQQAQALRVSLTPQQAQELGIPLTPQQAQALGITltlqQAQQLGIP 1242
Cdd:NF041308 400 QVVAIASNDGGKQALETVQRLLPVLCQAPH-GLTPDQVVAIASNDGGKQALETVQRLLPELCQAHGLT----PDQVVAIA 474
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1243 LTPQQAQALGIT--LTPKQVQeLGIPLTPQQAQALGITLTPKQAQELGIPLNPQQAQTLGiPLTPKQAQALGIPFTPQQA 1320
Cdd:NF041308 475 SNGGGKQALETVqrLLPVLCQ-PPHGLTPEQVVAIASNGGGKQALETVQRLLPVLCQPPH-GLTPEQVVAIASHDGGKQA 552
|
250 260 270 280 290 300 310 320
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1321 QALGIPLTPQQAQTQEiTLTPQQAQALGMPLTTQQA----QELgIP--------LTPQHAQALGMPLTTQQAQELGIPLT 1388
Cdd:NF041308 553 LETVHRLLPVLCQAPH-GLTPEQVVAIASHNGGKQAletvQRL-LPvlcqrpygLTPNQVVAIASNDGGKQALETVQRLL 630
|
330 340 350 360 370 380 390 400
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1389 PQQAQAlGMPLTTQQAQELGIPLTPQQAQELGIPFTPQQAQAQEiTLTPQQAQALGMPLTAQQAQELGITLTPQQAQElG 1468
Cdd:NF041308 631 PVLCQA-PHGLTPDQVVAIASNGGGKQALETVQRLLPVLCQRPH-GLTPHQVVAIASNDGGKQALETVQRLLPVLCQP-P 707
|
410 420 430 440 450 460 470 480
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1469 IPLTPQQAQALGIPLIPPQAQELGIPLTPqqaqalgILLIPPQAqelgipLTPQQAQALGIPLIPPQA----QELgIP-- 1542
Cdd:NF041308 708 YGLTPEQVVAIASNNGGKQALETVQRLLP-------VLCQRPHG------LTPDQVVAIASNDGGKQAletvQRL-LPvl 773
|
490 500 510 520 530 540 550 560
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1543 ------LTPQQVQALGIPLIPPQAQELEIPLTPQQAQALGiPLTPQQAQELGIPLTPQQAQELGIPLTPQQAQAQGipLT 1616
Cdd:NF041308 774 cqpphgLTPDQVVAIASNDGGKQALETVQRLLPVLCDAPH-GLTPHQVVAIASNIGGRQALETVQRLLPVLCQAHG--LT 850
|
570 580 590
....*....|....*....|....*....|
gi 224458301 1617 PQQAQALGISLTPQQAQAQGITLTPQQAQA 1646
Cdd:NF041308 851 PDQVVAIASNNGGKQALETVQRLLPVLCQP 880
|
|
| KREPA2 |
cd23959 |
Kinetoplastid RNA Editing Protein A2 (KREPA2); The KREPA2 (TbMP63) protein is a component of ... |
1770-1908 |
1.91e-04 |
|
Kinetoplastid RNA Editing Protein A2 (KREPA2); The KREPA2 (TbMP63) protein is a component of the parasitic protozoan's KREPA RNA editing catalytic complex (RECC). Kinetoplastid RNA editing (KRE) proteins occur as pairs or sets of related proteins in multiple complexes. The KREPA complex is composed of six components (KREPA1-6), which share a conserved C-terminal region containing an oligonucleotide-binding (OB)-fold-like domain. KREPAs are responsible for the site-specific insertion and deletion of U nucleotides in the kinetoplastid mitochondrial pre-messenger RNA. Apart from the conserved C-terminal OB-fold domain, KREPA1, KREPA2, and KREPA3 contain two conserved C2H2 zinc-finger domains. KREPA2 and kinetoplastid RNA editing ligase 1 (KREL1) are specific for ligation post-U-deletion and are paralogous to KREL2 and KREPA1, which are specific for ligation post-U-insertion. KREPA2 is critical for RECC stability and KREL1 integration into the complex.
Pssm-ID: 467780 [Multi-domain] Cd Length: 424 Bit Score: 46.40 E-value: 1.91e-04
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1770 GPFAPGKPLEMGILSePGKLGAPQTLRSSGQTLVYGGQSTSAQFPAPQAPPSP--GQLPI--SRAPPTPGQPFIAGVPPT 1845
Cdd:cd23959 97 DAFAMAPDESLGPFR-AARVPNPFSASSSTQRETHKTAQVAPPKAEPQTAPVTpfGQLPMfgQHPPPAKPLPAAAAAQQS 175
|
90 100 110 120 130 140
....*....|....*....|....*....|....*....|....*....|....*....|....*..
gi 224458301 1846 S---GQIPSLWAPLSP-GQPLVPEASSIPGDLLESGPLTFSEQLQEFQPPATAEQSPylQAPSTPGQ 1908
Cdd:cd23959 176 SaspGEVASPFASGTVsASPFATATDTAPSSGAPDGFPAEASAPSPFAAPASAASFP--AAPVANGE 240
|
|
| PAT1 |
pfam09770 |
Topoisomerase II-associated protein PAT1; Members of this family are necessary for accurate ... |
1292-1453 |
1.13e-03 |
|
Topoisomerase II-associated protein PAT1; Members of this family are necessary for accurate chromosome transmission during cell division.
Pssm-ID: 401645 [Multi-domain] Cd Length: 846 Bit Score: 44.26 E-value: 1.13e-03
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1292 NPQQAQTLGIPLTPKQAQAlgipfTPQQAQALGIPLTPQQAQTQEITLTPQQAQALGMPLTTQQAqelgiPLTPQHAQAL 1371
Cdd:pfam09770 209 KPAQQPAPAPAQPPAAPPA-----QQAQQQQQFPPQIQQQQQPQQQPQQPQQHPGQGHPVTILQR-----PQSPQPDPAQ 278
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1372 GMPLTTQQAQELGIPLTPQQ-AQALGMPlTTQQAQELGIPLTPQQAQElgiPFTPQQAQAQeitltpQQAQALGMPLTAQ 1450
Cdd:pfam09770 279 PSIQPQAQQFHQQPPPVPVQpTQILQNP-NRLSAARVGYPQNPQPGVQ---PAPAHQAHRQ------QGSFGRQAPIITH 348
|
...
gi 224458301 1451 QAQ 1453
Cdd:pfam09770 349 PQQ 351
|
|
| PspC_subgroup_1 |
NF033838 |
pneumococcal surface protein PspC, choline-binding form; The pneumococcal surface protein PspC, ... |
639-1006 |
1.29e-03 |
|
pneumococcal surface protein PspC, choline-binding form; The pneumococcal surface protein PspC, as described in Streptococcus pneumoniae, is a repetitive and highly variable protein, recognized by a conserved N-terminal domain and also by genomic location. This form, subgroup 1, has variable numbers of a choline-binding repeat in the C-terminal region, and is also known as choline-binding protein A. The other form, subgroup 2, is anchored covalently after cleavage by sortase at a C-terminal LPXTG site.
Pssm-ID: 468201 [Multi-domain] Cd Length: 684 Bit Score: 43.85 E-value: 1.29e-03
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 639 LVKSLSRVAKETSESTRVLESpdgKSEQSnleefqeaIMAFLKQKIDNIGKAFDKKTVPKEEellKRAEAEKlgiikakm 718
Cdd:NF033838 89 LNKKLSDIKTEYLYELNVLKE---KSEAE--------LTSKTKKELDAAFEQFKKDTLEPGK---KVAEATK-------- 146
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 719 eeyfqKVAETVtkilRKYKDTKKEEQVGEKPIKQKKVvsfmpGLHFQKSPISAKSESSTLLSYESTDPVINNLIQMILAE 798
Cdd:NF033838 147 -----KVEEAE----KKAKDQKEEDRRNYPTNTYKTL-----ELEIAESDVEVKKAELELVKEEAKEPRDEEKIKQAKAK 212
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 799 IESERDIPT----VSTVQKDHKEKEKQRQEQYLQEGQEQMSGMSLKQQLLG--ERNLLKEHYEKISENWEEKKAWLQMKE 872
Cdd:NF033838 213 VESKKAEATrlekIKTDREKAEEEAKRRADAKLKEAVEKNVATSEQDKPKRraKRGVLGEPATPDKKENDAKSSDSSVGE 292
|
250 260 270 280 290 300 310 320
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 873 GKQEQQSQkqwqeeemwKEEQKQATPKQAEQEEKQKQRGQEEEE---LPKSSLQRLE----EGTQKMKTQGLLLEKEngq 945
Cdd:NF033838 293 ETLPSPSL---------KPEKKVAEAEKKVEEAKKKAKDQKEEDrrnYPTNTYKTLEleiaESDVKVKEAELELVKE--- 360
|
330 340 350 360 370 380
....*....|....*....|....*....|....*....|....*....|....*....|....*
gi 224458301 946 mrqiqkEAKHlgpHRRREKGKE-KQKPERGLEDLERQIKTKDQMQMKETQPK---ELEKMVIQTP 1006
Cdd:NF033838 361 ------EAKE---PRNEEKIKQaKAKVESKKAEATRLEKIKTDRKKAEEEAKrkaAEEDKVKEKP 416
|
|
| DamX |
COG3266 |
Cell division protein DamX, binds to the septal ring, contains C-terminal SPOR domain [Cell ... |
1322-1622 |
1.36e-03 |
|
Cell division protein DamX, binds to the septal ring, contains C-terminal SPOR domain [Cell cycle control, cell division, chromosome partitioning];
Pssm-ID: 442497 [Multi-domain] Cd Length: 455 Bit Score: 43.69 E-value: 1.36e-03
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1322 ALGIPLTPQQAQTQEITLTPQQAQALgMPLTTQQAQELGIPLTPQHAQALGMPLTTQQAQELGIPLTPQQAQALGMPLTT 1401
Cdd:COG3266 53 LLAGLLLLLIRLLSEAVDLGALASAA-LLLALASLALLGILLLALLALLLDLLLLADLLRAAALLLLKLLLLLLTLLLLV 131
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1402 QQAQELGIPLTPQQAQELGIPFTPQQAQAQEITLTPQQAQALGMPLTAQQAQELgiTLTPQQAQELGIPLTPQQAQALGI 1481
Cdd:COG3266 132 LLLLLALLLALLLDLPLLTLLIVLPLLEEQLLLLALQDIQGTLQALGAVAALLG--LRKAEEALALRAGSAAADALALLL 209
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1482 PLIPPQAQELGIPLTPQQ--AQALGILLIPPQAQELGIPLTPQQAQALgipliPPQAQELGIPLTPQQVQALGIPLIPPQ 1559
Cdd:COG3266 210 LLLASALGEAVAAAAELAalALLAAGAAEVLTARLVLLLLIIGSALKA-----PSQASSASAPATTSLGEQQEVSLPPAV 284
|
250 260 270 280 290 300
....*....|....*....|....*....|....*....|....*....|....*....|...
gi 224458301 1560 AQELEiPLTPQQAQALGIPLTPQQAQELGIPLTPQQAQELGIPLTPQQAQAQGIPlTPQQAQA 1622
Cdd:COG3266 285 AAQPA-AAAAAQPSAVALPAAPAAAAAAAAPAEAAAPQPTAAKPVVTETAAPAAP-APEAAAA 345
|
|
| SMC_prok_A |
TIGR02169 |
chromosome segregation protein SMC, primarily archaeal type; SMC (structural maintenance of ... |
817-1000 |
4.66e-03 |
|
chromosome segregation protein SMC, primarily archaeal type; SMC (structural maintenance of chromosomes) proteins bind DNA and act in organizing and segregating chromosomes for partition. SMC proteins are found in bacteria, archaea, and eukaryotes. It is found in a single copy and is homodimeric in prokaryotes, but six paralogs (excluded from this family) are found in eukaryotes, where SMC proteins are heterodimeric. This family represents the SMC protein of archaea and a few bacteria (Aquifex, Synechocystis, etc.); the SMC of other bacteria is described by TIGR02168. The N- and C-terminal domains of this protein are well conserved, but the central hinge region is skewed in composition and highly divergent. [Cellular processes, Cell division, DNA metabolism, Chromosome-associated proteins]
Pssm-ID: 274009 [Multi-domain] Cd Length: 1164 Bit Score: 42.36 E-value: 4.66e-03
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 817 EKEKQRQEqyLQEGQEQMSGMSLK--------QQLLGERNLlKEHYEKISENWEEKKAWLQMKEGKQEqqsqkqwqeeem 888
Cdd:TIGR02169 171 KKEKALEE--LEEVEENIERLDLIidekrqqlERLRREREK-AERYQALLKEKREYEGYELLKEKEAL------------ 235
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 889 wkEEQKQATPKQ-AEQEEKQKQRGQEEEELPK---SSLQRLEEGTQKMKtqglllEKENGQMRQIQKEAKHLGPHRRREK 964
Cdd:TIGR02169 236 --ERQKEAIERQlASLEEELEKLTEEISELEKrleEIEQLLEELNKKIK------DLGEEEQLRVKEKIGELEAEIASLE 307
|
170 180 190
....*....|....*....|....*....|....*..
gi 224458301 965 GKEKQKpERGLEDLERQI-KTKDQMQMKETQPKELEK 1000
Cdd:TIGR02169 308 RSIAEK-ERELEDAEERLaKLEAEIDKLLAEIEELER 343
|
|
|
|
Name |
Accession |
Description |
Interval |
E-value |
| PTZ00121 |
PTZ00121 |
MAEBL; Provisional |
381-1042 |
3.48e-14 |
|
MAEBL; Provisional
Pssm-ID: 173412 [Multi-domain] Cd Length: 2084 Bit Score: 79.03 E-value: 3.48e-14
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 381 ELENIVDEVQRKETkdsgIKWDSTISYTAQAERT-PDLTELRQQPVASEDISEDSTKDNVSLKKGDFYQEDETDEYQSWK 459
Cdd:PTZ00121 1221 EDAKKAEAVKKAEE----AKKDAEEAKKAEEERNnEEIRKFEEARMAHFARRQAAIKAEEARKADELKKAEEKKKADEAK 1296
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 460 RSH--KKATYVYETSGPNLSDNKSGQKVSEAKPsqyyELQVLKKKRKEMKSFSEDKsKSPTEAKRKHLSLTETKSQGGKS 537
Cdd:PTZ00121 1297 KAEekKKADEAKKKAEEAKKADEAKKKAEEAKK----KADAAKKKAEEAKKAAEAA-KAEAEAAADEAEAAEEKAEAAEK 1371
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 538 GTSmmmlEQFRK---VKRESPFDKRPTAAEIKVEPTTESLD--KEGKGEIRSLVEPLSMIQFDDTAEPQKGKIKGKKHHI 612
Cdd:PTZ00121 1372 KKE----EAKKKadaAKKKAEEKKKADEAKKKAEEDKKKADelKKAAAAKKKADEAKKKAEEKKKADEAKKKAEEAKKAD 1447
|
250 260 270 280 290 300 310 320
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 613 SSGTITSKEEKTEEKEELTKQVKSHQLVKSLSRVAKETSESTRVLESPDGKSEQsnLEEFQEAimaflKQKIDNIGKAFD 692
Cdd:PTZ00121 1448 EAKKKAEEAKKAEEAKKKAEEAKKADEAKKKAEEAKKADEAKKKAEEAKKKADE--AKKAAEA-----KKKADEAKKAEE 1520
|
330 340 350 360 370 380 390 400
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 693 KKtvpKEEELLKRAEAEKLGiiKAKMEEYFQKVAEtvtkiLRKYKDTKKEEQVgeKPIKQKKVVSFMPGLHFQKSPISAK 772
Cdd:PTZ00121 1521 AK---KADEAKKAEEAKKAD--EAKKAEEKKKADE-----LKKAEELKKAEEK--KKAEEAKKAEEDKNMALRKAEEAKK 1588
|
410 420 430 440 450 460 470 480
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 773 SESSTLLSYESTDPVINNLIQMILAEIESERdiPTVSTVQKDHKEKEKQRQEQYLQEgQEQMSGMSLKQQllGERNLLKE 852
Cdd:PTZ00121 1589 AEEARIEEVMKLYEEEKKMKAEEAKKAEEAK--IKAEELKKAEEEKKKVEQLKKKEA-EEKKKAEELKKA--EEENKIKA 1663
|
490 500 510 520 530 540 550 560
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 853 HYEKISENWEEKKAwlqmKEGKQEQQSQKQWQEEEMWKEEQKQATPKQAEQEEKQKQRGQE---EEELPKSSLQRLEEGT 929
Cdd:PTZ00121 1664 AEEAKKAEEDKKKA----EEAKKAEEDEKKAAEALKKEAEEAKKAEELKKKEAEEKKKAEElkkAEEENKIKAEEAKKEA 1739
|
570 580 590 600 610 620 630 640
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 930 QKMKTQGLLLEKENGQMRQIQ---KEAKHLGPHRRREKG---KE--KQKPERGLEDLERQIK-TKDQMQMKETQPKELEK 1000
Cdd:PTZ00121 1740 EEDKKKAEEAKKDEEEKKKIAhlkKEEEKKAEEIRKEKEaviEEelDEEDEKRRMEVDKKIKdIFDNFANIIEGGKEGNL 1819
|
650 660 670 680
....*....|....*....|....*....|....*....|....
gi 224458301 1001 MVIQTPMTLSPRWKSVL--KDVQRSyEGKEFQRNLKTLENLPDE 1042
Cdd:PTZ00121 1820 VINDSKEMEDSAIKEVAdsKNMQLE-EADAFEKHKFNKNNENGE 1862
|
|
| PHA03247 | PHA03247 | large tegument protein UL36; Provisional | 1602-1953 | 3.30e-09 |
|
large tegument protein UL36; Provisional
Pssm-ID: 223021 [Multi-domain] Cd Length: 3151 Bit Score: 62.65 E-value: 3.30e-09
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1602 PLTPQQAQAQGIPlTPQQAQAlgISLTPQQAQAQGITLTPQQAQalgvPITPVN-----AWVSAVTLTSEQTHALESPMN 1676
Cdd:PHA03247 2557 PAAPPAAPDRSVP-PPRPAPR--PSEPAVTSRARRPDAPPQSAR----PRAPVDdrgdpRGPAPPSPLPPDTHAPDPPPP 2629
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1677 LEQAQEQLLKLGVPLTLDKAHTLGSPLTLKQVQWSHRPFQKSKASLPTG--QSIISRLSPSLRLSLASSA--PTAEKSSi 1752
Cdd:PHA03247 2630 SPSPAANEPDPHPPPTVPPPERPRDDPAPGRVSRPRRARRLGRAAQASSppQRPRRRAARPTVGSLTSLAdpPPPPPTP- 2708
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1753 fgVSSTPLQISRVPLNQGPFAPGKplemgilSEPGKLGAPQTLRSSGQTLVYGGQSTSAQFPAPQAPPSPG--QLPISRA 1830
Cdd:PHA03247 2709 --EPAPHALVSATPLPPGPAAARQ-------ASPALPAAPAPPAVPAGPATPGGPARPARPPTTAGPPAPAppAAPAAGP 2779
|
250 260 270 280 290 300 310 320
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1831 PPTPGQPFIAGVPPTSGQIPSLWAPLSPGQPLVPEASSIPGDLLESGPLTfseqlqefqPPATAEQ--SPYLQAPSTPGQ 1908
Cdd:PHA03247 2780 PRRLTRPAVASLSESRESLPSPWDPADPPAAVLAPAAALPPAASPAGPLP---------PPTSAQPtaPPPPPGPPPPSL 2850
|
330 340 350 360
....*....|....*....|....*....|....*....|....*
gi 224458301 1909 HLATWTLPGrasslwiPPTSRHPPTLWPSPAPGKPQKSWSPSVAK 1953
Cdd:PHA03247 2851 PLGGSVAPG-------GDVRRRPPSRSPAAKPAAPARPPVRRLAR 2888
|
|
| PHA03247 | PHA03247 | large tegument protein UL36; Provisional | 1422-1937 | 4.57e-09 |
|
large tegument protein UL36; Provisional
Pssm-ID: 223021 [Multi-domain] Cd Length: 3151 Bit Score: 62.26 E-value: 4.57e-09
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1422 PFTPQQAQAQEITlTPQQAQALGMPLTAQQAQELGITLTPQQAQELGIPLTPQQAQALGIPLiPPQAQELGIPltPQQAQ 1501
Cdd:PHA03247 2557 PAAPPAAPDRSVP-PPRPAPRPSEPAVTSRARRPDAPPQSARPRAPVDDRGDPRGPAPPSPL-PPDTHAPDPP--PPSPS 2632
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1502 ALGILLIPPQAQELGIPLTPQQAQALGIPLIPPQAQELGIPLTPQQvqalgiPLIPPQAQELEIPLTPQQAQALGIPLTP 1581
Cdd:PHA03247 2633 PAANEPDPHPPPTVPPPERPRDDPAPGRVSRPRRARRLGRAAQASS------PPQRPRRRAARPTVGSLTSLADPPPPPP 2706
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1582 QqaqelgiPLTPQQAQELGIPLTPQQAQAQGI-------PLTPQQAQALGISLTPQQAQAQGITLTPQQAQALGVPIT-- 1652
Cdd:PHA03247 2707 T-------PEPAPHALVSATPLPPGPAAARQAspalpaaPAPPAVPAGPATPGGPARPARPPTTAGPPAPAPPAAPAAgp 2779
|
250 260 270 280 290 300 310 320
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1653 PVNAWVSAVTLTSEQTHALESPMNLEQAQEQLL--KLGVPLTLDKAHTLGSPLTLKQVQwSHRPFQKSKASLPTGQSI-- 1728
Cdd:PHA03247 2780 PRRLTRPAVASLSESRESLPSPWDPADPPAAVLapAAALPPAASPAGPLPPPTSAQPTA-PPPPPGPPPPSLPLGGSVap 2858
|
330 340 350 360 370 380 390 400
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1729 ---ISRLSPSLRLSLASSAPTAEKSSifGVSSTPLQISRVPLNQGPFAPGKPlemgilSEPGKLGAPQTlrssgqtlvyg 1805
Cdd:PHA03247 2859 ggdVRRRPPSRSPAAKPAAPARPPVR--RLARPAVSRSTESFALPPDQPERP------PQPQAPPPPQP----------- 2919
|
410 420 430 440 450 460 470 480
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1806 gQSTSAQFPAPQAPPSPGQLPISRAPPTPGQpfiAGVPPTSGQIPSLW-APLSPGQPLVPeassipgdllesgpltfseq 1884
Cdd:PHA03247 2920 -QPQPPPPPQPQPPPPPPPRPQPPLAPTTDP---AGAGEPSGAVPQPWlGALVPGRVAVP-------------------- 2975
|
490 500 510 520 530
....*....|....*....|....*....|....*....|....*....|....*...
gi 224458301 1885 lqEFQPPATAEQSPYLQAPSTPGQHLATWTLPGRASSLWIPPTSRHPP-----TLWPS 1937
Cdd:PHA03247 2976 --RFRVPQPAPSREAPASSTPPLTGHSLSRVSSWASSLALHEETDPPPvslkqTLWPP 3031
|
|
| Atrophin-1 | pfam03154 | Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian ... | 1512-1940 | 5.64e-09 |
|
Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian atrophy (DRPLA) gene. DRPLA OMIM:125370 is a progressive neurodegenerative disorder. It is caused by the expansion of a CAG repeat in the DRPLA gene on chromosome 12p. This results in an extended polyglutamine region in atrophin-1, that is thought to confer toxicity to the protein, possibly through altering its interactions with other proteins. The expansion of a CAG repeat is also the underlying defect in six other neurodegenerative disorders, including Huntington's disease. One interaction of expanded polyglutamine repeats that is thought to be pathogenic is that with the short glutamine repeat in the transcriptional coactivator CREB binding protein, CBP. This interaction draws CBP away from its usual nuclear location to the expanded polyglutamine repeat protein aggregates that are characteristic of the polyglutamine neurodegenerative disorders. This interferes with CBP-mediated transcription and causes cytotoxicity.
Pssm-ID: 460830 [Multi-domain] Cd Length: 991 Bit Score: 61.71 E-value: 5.64e-09
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1512 AQELGIPLTPQQAQALGIPLIPPQAQELGIPLTPQQVQALGIPLIPPQAQELEIPLtPQQAQALGIPLTPQQAqelGIPL 1591
Cdd:pfam03154 162 AQQQILQTQPPVLQAQSGAASPPSPPPPGTTQAATAGPTPSAPSVPPQGSPATSQP-PNQTQSTAAPHTLIQQ---TPTL 237
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1592 TPQQ-------AQELGIPLTPQQAQAQG---------IPLTPQQAQAlGISLTPQQAQAQGITLTPQQAQAlGVPITPVN 1655
Cdd:pfam03154 238 HPQRlpsphppLQPMTQPPPPSQVSPQPlpqpslhgqMPPMPHSLQT-GPSHMQHPVPPQPFPLTPQSSQS-QVPPGPSP 315
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1656 AWVSAVTLTSEQTHALESPMNLEQAQEQLLKlgvPLTLDKAHTLGSPLT-LKQVQWSHRPFQKSKASLPTGQSIISRLSP 1734
Cdd:pfam03154 316 AAPGQSQQRIHTPPSQSQLQSQQPPREQPLP---PAPLSMPHIKPPPTTpIPQLPNPQSHKHPPHLSGPSPFQMNSNLPP 392
|
250 260 270 280 290 300 310 320
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1735 SLRLSLASSAPTAEKSSifgVSSTPLQIsrvpLNQGPFAPGKPLEMGILSepgklgapqtlrssgqtlvyggQSTSAQFP 1814
Cdd:pfam03154 393 PPALKPLSSLSTHHPPS---AHPPPLQL----MPQSQQLPPPPAQPPVLT----------------------QSQSLPPP 443
|
330 340 350 360 370 380 390 400
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1815 APQAPPSPGQLPISRAPPTPGQPFIAGVPPTSgqipslwapLSPGQPLVPEASSIPGdllesgpltfseqlqeFQPPATA 1894
Cdd:pfam03154 444 AASHPPTSGLHQVPSQSPFPQHPFVPGGPPPI---------TPPSGPPTSTSSAMPG----------------IQPPSSA 498
|
410 420 430 440
....*....|....*....|....*....|....*....|....*.
gi 224458301 1895 EQSPYLQAPSTPGQHLATWTLPGRASSLWIPPTSRHPPTLWPSPAP 1940
Cdd:pfam03154 499 SVSSSGPVPAAVSCPLPPVQIKEEALDEAEEPESPPPPPRSPSPEP 544
|
|
| PHA03379 | PHA03379 | EBNA-3A; Provisional | 1216-1621 | 9.16e-09 |
|
EBNA-3A; Provisional
Pssm-ID: 223066 [Multi-domain] Cd Length: 935 Bit Score: 61.23 E-value: 9.16e-09
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1216 GIPLTPQQAQALGITLTLQQAQQLGIPLTPQQAQALGITLTPKQVQELGIPLTPQQAQALGITLTPKQAQELGIPLNPQQ 1295
Cdd:PHA03379 414 GTPRPPVEKPRPEVPQSLETATSHGSAQVPEPPPVHDLEPGPLHDQHSMAPCPVAQLPPGPLQDLEPGDQLPGVVQDGRP 493
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1296 AQT--------LGIPLTPKQAQALGIPFTPQQAQALGIPLTPQQAQTQEITLTPQQAQAL----GMPLTTQQAQE--LGI 1361
Cdd:PHA03379 494 ACApvpapagpIVRPWEASLSQVPGVAFAPVMPQPMPVEPVPVPTVALERPVCPAPPLIAmqgpGETSGIVRVRErwRPA 573
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1362 PLTPQHAQALG-MPLTTQQAQeLGIPLTPQQAQALGMPLTTQQA---QELGIPLTPQQAQELGIPFTPQQAQAQEITLTP 1437
Cdd:PHA03379 574 PWTPNPPRSPSqMSVRDRLAR-LRAEAQPYQASVEVQPPQLTQVspqQPMEYPLEPEQQMFPGSPFSQVADVMRAGGVPA 652
|
250 260 270 280 290 300 310 320
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1438 QQAQALGMPLtaqqaqelgitltpQQAQELGIPLTPQQAQALGIPLIPP-QAQELGIPLTPQQAQALGILLIPPQAQELG 1516
Cdd:PHA03379 653 MQPQYFDLPL--------------QQPISQGAPLAPLRASMGPVPPVPAtQPQYFDIPLTEPINQGASAAHFLPQQPMEG 718
|
330 340 350 360 370 380 390 400
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1517 iPLTPQQAQALGIPLIP------PQAQELGIPLTpqQVQALGIPLIP-PQAQELEIPLTPQQAQALGIPLTPqqaqelGI 1589
Cdd:PHA03379 719 -PLVPERWMFQGATLSQsvrpgvAQSQYFDLPLT--QPINHGAPAAHfLHQPPMEGPWVPEQWMFQGAPPSQ------GT 789
|
410 420 430
....*....|....*....|....*....|..
gi 224458301 1590 PLTPQQAQELGIPltPQQAQAQGIPLTPQQAQ 1621
Cdd:PHA03379 790 DVVQHQLDALGYV--LHVLNHPGVPVSPAVNQ 819
|
|
| PHA03247 | PHA03247 | large tegument protein UL36; Provisional | 1362-1679 | 2.55e-07 |
|
large tegument protein UL36; Provisional
Pssm-ID: 223021 [Multi-domain] Cd Length: 3151 Bit Score: 56.49 E-value: 2.55e-07
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1362 PLTPQHAQALGMPLTTQQAQELGI-------PLTPQQAQALGMPLTTQQAQELGIPLTPQQAQELGIPFTPQQAQAQEIT 1434
Cdd:PHA03247 2708 PEPAPHALVSATPLPPGPAAARQAspalpaaPAPPAVPAGPATPGGPARPARPPTTAGPPAPAPPAAPAAGPPRRLTRPA 2787
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1435 LTPQQAQALGMPLTAQQAQELGITLTPQQAqelgipLTPQQAQALGIPlIPPQAQELGIPLTPQqaqalgilliPPQAqe 1514
Cdd:PHA03247 2788 VASLSESRESLPSPWDPADPPAAVLAPAAA------LPPAASPAGPLP-PPTSAQPTAPPPPPG----------PPPP-- 2848
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1515 lgiPLTPQQAQALGIPLI--PPQAQELGIPLTPQQVQALGIPLIPPQAQELEIPLTPQQAQALGIPLTPQQAQELgiPLT 1592
Cdd:PHA03247 2849 ---SLPLGGSVAPGGDVRrrPPSRSPAAKPAAPARPPVRRLARPAVSRSTESFALPPDQPERPPQPQAPPPPQPQ--PQP 2923
|
250 260 270 280 290 300 310 320
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1593 PQQAQELGIPLTPQQAQAQGIPLTPQQA--QALGISLTPQQ-AQAQGI-----TLTPQQAQALGVPITP----------- 1653
Cdd:PHA03247 2924 PPPPQPQPPPPPPPRPQPPLAPTTDPAGagEPSGAVPQPWLgALVPGRvavprFRVPQPAPSREAPASStppltghslsr 3003
|
330 340
....*....|....*....|....*.
gi 224458301 1654 VNAWVSAVTLTSEQTHAlesPMNLEQ 1679
Cdd:PHA03247 3004 VSSWASSLALHEETDPP---PVSLKQ 3026
|
|
| SMC_N | pfam02463 | RecF/RecN/SMC N terminal domain; This domain is found at the N terminus of SMC proteins. The ... | 291-1039 | 2.55e-07 |
|
RecF/RecN/SMC N terminal domain; This domain is found at the N terminus of SMC proteins. The SMC (structural maintenance of chromosomes) superfamily proteins have ATP-binding domains at the N- and C-termini, and two extended coiled-coil domains separated by a hinge in the middle. The eukaryotic SMC proteins form two kinds of heterodimers: the SMC1/SMC3 and the SMC2/SMC4 types. These heterodimers constitute an essential part of higher order complexes, which are involved in chromatin and DNA dynamics. This family also includes the RecF and RecN proteins that are involved in DNA metabolism and recombination.
Pssm-ID: 426784 [Multi-domain] Cd Length: 1161 Bit Score: 56.52 E-value: 2.55e-07
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 291 AHETSEAEKELSLKIIRDLSNENEMLQQKLQDAEEKCEQLIRSKIVIEQLYAKLSTSSTLKVLPGPSPQSSRAIIKVGDT 370
Cdd:pfam02463 254 ESSKQEIEKEEEKLAQVLKENKEEEKEKKLQEEELKLLAKEEEELKSELLKLERRKVDDEEKLKESEKEKKKAEKELKKE 333
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 371 EDNMDN---------ILDKELENIVDEVQRKETKDSGIKWDSTISYTAQAERTPDLTELRQQPVASEDISEDSTKDNVSL 441
Cdd:pfam02463 334 KEEIEElekelkeleIKREAEEEEEEELEKLQEKLEQLEEELLAKKKLESERLSSAAKLKEEELELKSEEEKEAQLLLEL 413
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 442 KKGDFYQEDETDEYQSWKRSHKKatyvyetsgpNLSDNKSGQKVSEAKPSQYYELQVLKKKRKEMKSFSEDKSKSPTEAK 521
Cdd:pfam02463 414 ARQLEDLLKEEKKEELEILEEEE----------ESIELKQGKLTEEKEELEKQELKLLKDELELKKSEDLLKETQLVKLQ 483
|
250 260 270 280 290 300 310 320
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 522 RKHLSLTETKSQGG---KSGTSMMMLEQFRKVKRESPFDKRPTAAEIKVEPTTESLDKEGKGEIRSLVEPLSMIQFDDTA 598
Cdd:pfam02463 484 EQLELLLSRQKLEErsqKESKARSGLKVLLALIKDGVGGRIISAHGRLGDLGVAVENYKVAISTAVIVEVSATADEVEER 563
|
330 340 350 360 370 380 390 400
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 599 EP-QKGKIKGKKHHISSGTITSKEEKTEEKEELTKQVKSHQLVKSLSRVAKETSESTRVLESPDG---KSEQSNLEEFQE 674
Cdd:pfam02463 564 QKlVRALTELPLGARKLRLLIPKLKLPLKSIAVLEIDPILNLAQLDKATLEADEDDKRAKVVEGIlkdTELTKLKESAKA 643
|
410 420 430 440 450 460 470 480
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 675 AIMAFLKQKIDNIGKAFDKKTVPKEEELLKRAEAEKLGIIKAKMEEYFQKVAETVTKILRKYKDTKKEEQVgekpIKQKK 754
Cdd:pfam02463 644 KESGLRKGVSLEEGLAEKSEVKASLSELTKELLEIQELQEKAESELAKEEILRRQLEIKKKEQREKEELKK----LKLEA 719
|
490 500 510 520 530 540 550 560
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 755 VVSFMPGLHFQKSPISaksesstllsyestdpVINNLIQMILAEIESERDIPTVSTVQKDHKEKEKQRQEQYLQEGQEQm 834
Cdd:pfam02463 720 EELLADRVQEAQDKIN----------------EELKLLKQKIDEEEEEEEKSRLKKEEKEEEKSELSLKEKELAEEREK- 782
|
570 580 590 600 610 620 630 640
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 835 sgmslkqQLLGERNLLKEHYEKI---SENWEEKKAWLQMKEGKQEQqsqkqwqeeemwkEEQKQATPKQAEQEEKQKQRG 911
Cdd:pfam02463 783 -------TEKLKVEEEKEEKLKAqeeELRALEEELKEEAELLEEEQ-------------LLIEQEEKIKEEELEELALEL 842
|
650 660 670 680 690 700 710 720
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 912 QEEEELPKSSLQRLEEgtQKMKTQGLLLEKENGQMRQIQKEAKHlgphrrREKGKEKQKPERGLEDLERQIKTKDQMQMK 991
Cdd:pfam02463 843 KEEQKLEKLAEEELER--LEEEITKEELLQELLLKEEELEEQKL------KDELESKEEKEKEEKKELEEESQKLNLLEE 914
|
730 740 750 760
....*....|....*....|....*....|....*....|....*...
gi 224458301 992 ETQPKELEKMVIQTPMTLSPRWKSVLKDVQRSYEGKEFQrNLKTLENL 1039
Cdd:pfam02463 915 KENEIEERIKEEAEILLKYEEEPEELLLEEADEKEKEEN-NKEEEEER 961
|
|
| PHA03247 | PHA03247 | large tegument protein UL36; Provisional | 1554-1956 | 4.11e-07 |
|
large tegument protein UL36; Provisional
Pssm-ID: 223021 [Multi-domain] Cd Length: 3151 Bit Score: 56.10 E-value: 4.11e-07
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1554 PLIPPQAQELEIPlTPQQAQALGIPLTPQQAQELGIPLTPQQAQELGIPLTPQQAQAQGIPLTPQQAQAlgiSLTPQQAQ 1633
Cdd:PHA03247 2557 PAAPPAAPDRSVP-PPRPAPRPSEPAVTSRARRPDAPPQSARPRAPVDDRGDPRGPAPPSPLPPDTHAP---DPPPPSPS 2632
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1634 AQGITLTPQQAQALGVPITPVNAWVSAVTLTSEQTHALESPMNLEQAQEQLLKLGVPLTLDKAHTLGSPLTLKQVQWSHR 1713
Cdd:PHA03247 2633 PAANEPDPHPPPTVPPPERPRDDPAPGRVSRPRRARRLGRAAQASSPPQRPRRRAARPTVGSLTSLADPPPPPPTPEPAP 2712
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1714 PFQKSKASLPTGQSIISRLSPSLRLSLASSAPTAEKSsifgVSSTPLQISRVPLNQGPFAPGKPlemgilSEPGKLGAPQ 1793
Cdd:PHA03247 2713 HALVSATPLPPGPAAARQASPALPAAPAPPAVPAGPA----TPGGPARPARPPTTAGPPAPAPP------AAPAAGPPRR 2782
|
250 260 270 280 290 300 310 320
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1794 TLRSSGQTLVYGGQS---------------------TSAQFPAPQAPPSPGQLPIsrAPPTPGQPFIAGVPPTSGQIP-- 1850
Cdd:PHA03247 2783 LTRPAVASLSESRESlpspwdpadppaavlapaaalPPAASPAGPLPPPTSAQPT--APPPPPGPPPPSLPLGGSVAPgg 2860
|
330 340 350 360 370 380 390 400
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1851 -------SLWAPLSPGQPLVPEASSIPGDLLESGPLTFSeqlqefQPPATAEQSPYLQAPSTPgqhLATWTLPGRASSLW 1923
Cdd:PHA03247 2861 dvrrrppSRSPAAKPAAPARPPVRRLARPAVSRSTESFA------LPPDQPERPPQPQAPPPP---QPQPQPPPPPQPQP 2931
|
410 420 430
....*....|....*....|....*....|....
gi 224458301 1924 IPPTS-RHPPTLWPSPAPgKPQKSWSPSVAKKRL 1956
Cdd:PHA03247 2932 PPPPPpRPQPPLAPTTDP-AGAGEPSGAVPQPWL 2964
|
|
| PHA03247 | PHA03247 | large tegument protein UL36; Provisional | 1509-1952 | 1.03e-06 |
|
large tegument protein UL36; Provisional
Pssm-ID: 223021 [Multi-domain] Cd Length: 3151 Bit Score: 54.56 E-value: 1.03e-06
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1509 PPQAQELGIPlTPQQAQALGIPLIPPQAQELGIPLTPQQVQALGIPLIPPQAQELEIPLTPQQAQALGIPLTPQ-QAQEL 1587
Cdd:PHA03247 2560 PPAAPDRSVP-PPRPAPRPSEPAVTSRARRPDAPPQSARPRAPVDDRGDPRGPAPPSPLPPDTHAPDPPPPSPSpAANEP 2638
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1588 GIPLTPQQAQELGIPLTPQQAQAQGIPLTPQQAQALGISLTPQQAQAQGITLTPQQAQALGVPITPVNAWVSAVTLTSEQ 1667
Cdd:PHA03247 2639 DPHPPPTVPPPERPRDDPAPGRVSRPRRARRLGRAAQASSPPQRPRRRAARPTVGSLTSLADPPPPPPTPEPAPHALVSA 2718
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1668 THALESPMNLEQAQEQL-LKLGVPLTLDKAHTLGSPLTLKQVQWSHRPFQKSKASLPTG-------QSIISRLSPSLRLS 1739
Cdd:PHA03247 2719 TPLPPGPAAARQASPALpAAPAPPAVPAGPATPGGPARPARPPTTAGPPAPAPPAAPAAgpprrltRPAVASLSESRESL 2798
|
250 260 270 280 290 300 310 320
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1740 LASSAPTAEKSSIFGVSSTplqisrVPLNQGPFAPGKPLEMGILSEPGKLGAP-QTLRSSGQTLVYGG--QSTSAQFPAP 1816
Cdd:PHA03247 2799 PSPWDPADPPAAVLAPAAA------LPPAASPAGPLPPPTSAQPTAPPPPPGPpPPSLPLGGSVAPGGdvRRRPPSRSPA 2872
|
330 340 350 360 370 380 390 400
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1817 QAPPSPGQLPISR----APPTPGQPFiAGVPPTSGQIPSLWAPLSPGQPLVPEASSIPGDLLESGPLTFSEQLQEFQPPA 1892
Cdd:PHA03247 2873 AKPAAPARPPVRRlarpAVSRSTESF-ALPPDQPERPPQPQAPPPPQPQPQPPPPPQPQPPPPPPPRPQPPLAPTTDPAG 2951
|
410 420 430 440 450 460
....*....|....*....|....*....|....*....|....*....|....*....|....*
gi 224458301 1893 TAEQSPYLQAPS----TPGQHLATWTL-PGRASSLWIPPTSRHPPTLWPSPApgkpQKSWSPSVA 1952
Cdd:PHA03247 2952 AGEPSGAVPQPWlgalVPGRVAVPRFRvPQPAPSREAPASSTPPLTGHSLSR----VSSWASSLA 3012
|
|
| SP2_N | cd22540 | N-terminal domain of transcription factor Specificity Protein (SP) 2; Specificity Proteins ... | 1046-1487 | 3.09e-06 |
|
N-terminal domain of transcription factor Specificity Protein (SP) 2; Specificity Proteins (SPs) are transcription factors that are involved in many cellular processes, including cell differentiation, cell growth, apoptosis, immune responses, response to DNA damage, and chromatin remodeling. SP2 contains the least conserved DNA-binding domain within the SP subfamily of proteins, and its DNA sequence specificity differs from the other SP proteins. It localizes primarily within subnuclear foci associated with the nuclear matrix, and can activate, or in some cases, repress expression from different promoters. The transcription factor SP2 serves as a paradigm for indirect genomic binding. It does not require its DNA-binding domain for genomic DNA binding and occupies target promoters independently of whether they contain a cognate DNA-binding motif. SP2 belongs to a family of proteins, called the SP/Kruppel or Krueppel-like Factor (KLF) family, characterized by a C-terminal DNA-binding domain of 81 amino acids consisting of three Kruppel-like C2H2 zinc fingers. These factors bind to a loose consensus motif, namely NNRCRCCYY (where N is any nucleotide; R is A/G, and Y is C/T), such as the recurring motifs in GC and GT boxes (5'-GGGGCGGGG-3' and 5'-GGTGTGGGG-3') that are present in promoters and more distal regulatory elements of mammalian genes. SP factors preferentially bind GC boxes, while KLFs bind CACCC boxes. Another characteristic hallmark of SP factors is the presence of the Buttonhead (BTD) box CXCPXC, just N-terminal to the zinc fingers. The function of the BTD box is unknown, but it is thought to play an important physiological role. Another feature of most SP factors is the presence of a conserved amino acid stretch, the so-called SP box, located close to the N-terminus. SP factors may be separated into three groups based on their domain architecture and the similarity of their N-terminal transactivation domains: SP1-4, SP5, and SP6-9.
The transactivation domains between the three groups are not homologous to one another. SP1-4 have similar N-terminal transactivation domains characterized by glutamine-rich regions, which, in most cases, have adjacent serine/threonine-rich regions. This model represents the N-terminal domain of SP2.
Pssm-ID: 411776 [Multi-domain] Cd Length: 511 Bit Score: 52.24 E-value: 3.09e-06
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1046 ISITPPPSLQYSLPGALPISGQPLTkcihlTPQQAQEVGITltpqqaQAQGITLTLQQAQELGIPLTPQQAQALEILFTP 1125
Cdd:cd22540 44 AAVTPPAPPQPTPRKLVPIKPAPLP-----LGPGKNSIGFL------SAKGNIIQLQGSQLSSSAPGGQQVFAIQNPTMI 112
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1126 QQAQALGIPLTPQ---QTQVQGITLTPQQDQAPGISlTTQQAQKLGIPLTPQQAQA-LGIPLTPQQAQELGIPLTPQQAQ 1201
Cdd:cd22540 113 IKGSQTRSSTNQQyqiSPQIQAAGQINNSGQIQIIP-GTNQAIITPVQVLQQPQQAhKPVPIKPAPLQTSNTNSASLQVP 191
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1202 ALRVSLTPQQAQELGIPLTPQQAQALGITLTL-------------QQAQQLGIPLTP--QQAQALGITLTPKQVQELGIP 1266
Cdd:cd22540 192 GNVIKLQSGGNVALTLPVNNLVGTQDGATQLQlaaapskpskkirKKSAQAAQPAVTvaEQVETVLIETTADNIIQAGNN 271
|
250 260 270 280 290 300 310 320
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1267 LTPQQAQALGITLTPKQAQELGIPLNPQQAQTLGIPLTPKQAQALGIPFTPQQAQALGIPLTPQQAQTQEITLTPQ-QAQ 1345
Cdd:cd22540 272 LLIVQSPGTGQPAVLQQVQVLQPKQEQQVVQIPQQALRVVQAASATLPTVPQKPLQNIQIQNSEPTPTQVYIKTPSgEVQ 351
|
330 340 350 360 370 380 390 400
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1346 ALGM---PLTTQQAQELGIPLTPQHAQALGMPLTTQQAQELGIPLTPQQAQALGmpLTTQQAQELGIplTPQQAQELGIP 1422
Cdd:cd22540 352 TVLLqeaPAATATPSSSTSTVQQQVTANNGTGTSKPNYNVRKERTLPKIAPAGG--IISLNAAQLAA--AAQAIQTININ 427
|
410 420 430 440 450 460
....*....|....*....|....*....|....*....|....*....|....*....|....*
gi 224458301 1423 FTPQQAQAQEITLTPQQAQalgmpLTAQQAQELGITLTPQQAQElgipLTPQQAQALGIPLIPPQ 1487
Cdd:cd22540 428 GVQVQGVPVTITNAGGQQQ-----LTVQTVSSNNLTISGLSPTQ----IQLQMEQALEIETQPGE 483
|
|
| PHA03379 | PHA03379 | EBNA-3A; Provisional | 1108-1525 | 8.20e-06 |
|
EBNA-3A; Provisional
Pssm-ID: 223066 [Multi-domain] Cd Length: 935 Bit Score: 51.21 E-value: 8.20e-06
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1108 GIPLTPQQAQALEILFTPQQAQALGIPLTPQQTQVQGITLTPQQDQAPGISLTTQQAQKLGIPLTPQQAQALGIPLT--- 1184
Cdd:PHA03379 414 GTPRPPVEKPRPEVPQSLETATSHGSAQVPEPPPVHDLEPGPLHDQHSMAPCPVAQLPPGPLQDLEPGDQLPGVVQDgrp 493
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1185 -----PQQAQELGIPLTPQQAQALRVSLTPQQAQELGIPLTPQQAQALgitltlqqaQQLGIPLTPQQA-QALGITLTPK 1258
Cdd:PHA03379 494 acapvPAPAGPIVRPWEASLSQVPGVAFAPVMPQPMPVEPVPVPTVAL---------ERPVCPAPPLIAmQGPGETSGIV 564
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1259 QVQE--LGIPLTPQQAQAL-------GITLTPKQAQELGIPLNPQQAQtlgIPLTPKQaQALGIPFTPQQAQALGIPLTP 1329
Cdd:PHA03379 565 RVRErwRPAPWTPNPPRSPsqmsvrdRLARLRAEAQPYQASVEVQPPQ---LTQVSPQ-QPMEYPLEPEQQMFPGSPFSQ 640
|
250 260 270 280 290 300 310 320
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1330 QQAQTQEITLTPQQAQALGMPLttQQAQELGIPLTPQHAQALGM-PLTTQQAQELGIPLTPQQAQALGMPLTTQQAQELG 1408
Cdd:PHA03379 641 VADVMRAGGVPAMQPQYFDLPL--QQPISQGAPLAPLRASMGPVpPVPATQPQYFDIPLTEPINQGASAAHFLPQQPMEG 718
|
330 340 350 360 370 380 390 400
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1409 iPLTPQQAqelgipFTPQQAQAQEITLTPQQAQALGMPLTAQQAQELGITLTPQQAQELGiPLTPQQAQALGIPLIPpqa 1488
Cdd:PHA03379 719 -PLVPERW------MFQGATLSQSVRPGVAQSQYFDLPLTQPINHGAPAAHFLHQPPMEG-PWVPEQWMFQGAPPSQ--- 787
|
410 420 430
....*....|....*....|....*....|....*..
gi 224458301 1489 qelGIPLTPQQAQALGilLIPPQAQELGIPLTPQQAQ 1525
Cdd:PHA03379 788 ---GTDVVQHQLDALG--YVLHVLNHPGVPVSPAVNQ 819
|
|
| AvrBs3 | NF041308 | type III secretion system effector avirulence protein AvrBs3; | 1087-1646 | 1.65e-05 |
|
type III secretion system effector avirulence protein AvrBs3;
Pssm-ID: 469205 [Multi-domain] Cd Length: 1179 Bit Score: 50.34 E-value: 1.65e-05
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1087 LTPQQ----AQAQGITLTLQQAQELgIPLTPQQAQALeilfTPQQAQALGIPLTPQQTQVQGITLTPQQDQAPgISLTTQ 1162
Cdd:NF041308 326 LTPEQvvaiASNDGGKQALETVQRL-LPVLCQAEHGL----TPDQVVAIASNIGGKPALETVQRLLPVLCQPP-HGLTPD 399
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1163 QAQKLGIPLTPQQAQALGIPLTPQQAQELGiPLTPQQAQALRVSLTPQQAQELGIPLTPQQAQALGITltlqQAQQLGIP 1242
Cdd:NF041308 400 QVVAIASNDGGKQALETVQRLLPVLCQAPH-GLTPDQVVAIASNDGGKQALETVQRLLPELCQAHGLT----PDQVVAIA 474
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1243 LTPQQAQALGIT--LTPKQVQeLGIPLTPQQAQALGITLTPKQAQELGIPLNPQQAQTLGiPLTPKQAQALGIPFTPQQA 1320
Cdd:NF041308 475 SNGGGKQALETVqrLLPVLCQ-PPHGLTPEQVVAIASNGGGKQALETVQRLLPVLCQPPH-GLTPEQVVAIASHDGGKQA 552
|
250 260 270 280 290 300 310 320
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1321 QALGIPLTPQQAQTQEiTLTPQQAQALGMPLTTQQA----QELgIP--------LTPQHAQALGMPLTTQQAQELGIPLT 1388
Cdd:NF041308 553 LETVHRLLPVLCQAPH-GLTPEQVVAIASHNGGKQAletvQRL-LPvlcqrpygLTPNQVVAIASNDGGKQALETVQRLL 630
|
330 340 350 360 370 380 390 400
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1389 PQQAQAlGMPLTTQQAQELGIPLTPQQAQELGIPFTPQQAQAQEiTLTPQQAQALGMPLTAQQAQELGITLTPQQAQElG 1468
Cdd:NF041308 631 PVLCQA-PHGLTPDQVVAIASNGGGKQALETVQRLLPVLCQRPH-GLTPHQVVAIASNDGGKQALETVQRLLPVLCQP-P 707
|
410 420 430 440 450 460 470 480
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1469 IPLTPQQAQALGIPLIPPQAQELGIPLTPqqaqalgILLIPPQAqelgipLTPQQAQALGIPLIPPQA----QELgIP-- 1542
Cdd:NF041308 708 YGLTPEQVVAIASNNGGKQALETVQRLLP-------VLCQRPHG------LTPDQVVAIASNDGGKQAletvQRL-LPvl 773
|
490 500 510 520 530 540 550 560
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1543 ------LTPQQVQALGIPLIPPQAQELEIPLTPQQAQALGiPLTPQQAQELGIPLTPQQAQELGIPLTPQQAQAQGipLT 1616
Cdd:NF041308 774 cqpphgLTPDQVVAIASNDGGKQALETVQRLLPVLCDAPH-GLTPHQVVAIASNIGGRQALETVQRLLPVLCQAHG--LT 850
|
570 580 590
....*....|....*....|....*....|
gi 224458301 1617 PQQAQALGISLTPQQAQAQGITLTPQQAQA 1646
Cdd:NF041308 851 PDQVVAIASNNGGKQALETVQRLLPVLCQP 880
|
|
| PRK10263 | PRK10263 | DNA translocase FtsK; Provisional | 1040-1538 | 2.70e-05 |
|
DNA translocase FtsK; Provisional
Pssm-ID: 236669 [Multi-domain] Cd Length: 1355 Bit Score: 49.70 E-value: 2.70e-05
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1040 PDEKEPISITPPPSLQYSLPGALPISGQPLTKCiHLTPQQAQEVGITLTPQQAQAQGITLTLQQAQELGIPLTPQQAQAL 1119
Cdd:PRK10263 377 PEGYPQQSQYAQPAVQYNEPLQQPVQPQQPYYA-PAAEQPAQQPYYAPAPEQPAQQPYYAPAPEQPVAGNAWQAEEQQST 455
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1120 ---EILFTPQQAQALGIPLTPQQTQVQGITLTPQQDQAPGISLTTqqaqklgiPLTP----------------QQAQALG 1180
Cdd:PRK10263 456 fapQSTYQTEQTYQQPAAQEPLYQQPQPVEQQPVVEPEPVVEETK--------PARPplyyfeeveekrarerEQLAAWY 527
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1181 IPLtPQQAQElGIPLTPQQAQALRVSLTPQQAQELGIPLTPQQAQAlgiTLTLQQAQQLGIPL---------TPQQAQAL 1251
Cdd:PRK10263 528 QPI-PEPVKE-PEPIKSSLKAPSVAAVPPVEAAAAVSPLASGVKKA---TLATGAAATVAAPVfslansggpRPQVKEGI 602
|
250 260 270 280 290 300 310 320
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1252 GITLT-PKQVQelgIPlTPQQAQALGITLTPKQAQELGIPLNPQQAQTLGIPLTPKQAQA-----LGIPFTPQQAQALG- 1324
Cdd:PRK10263 603 GPQLPrPKRIR---VP-TRRELASYGIKLPSQRAAEEKAREAQRNQYDSGDQYNDDEIDAmqqdeLARQFAQTQQQRYGe 678
|
330 340 350 360 370 380 390 400
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1325 -----IPLTPQQAQT-QEITLTPQQAQalgmpltTQQAQELGipltPQHAQALGMPLTTQQAQELGIPLTPQQAQALGMP 1398
Cdd:PRK10263 679 qyqhdVPVNAEDADAaAEAELARQFAQ-------TQQQRYSG----EQPAGANPFSLDDFEFSPMKALLDDGPHEPLFTP 747
|
410 420 430 440 450 460 470 480
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1399 LTTQQAQELGIPLTPQQAQELGIPFTPQQAQAQeitltPQQAQAlgmplTAQQAQELGITLTPQ-QAQELGIPLTPQ-QA 1476
Cdd:PRK10263 748 IVEPVQQPQQPVAPQQQYQQPQQPVAPQPQYQQ-----PQQPVA-----PQPQYQQPQQPVAPQpQYQQPQQPVAPQpQY 817
|
490 500 510 520 530 540
....*....|....*....|....*....|....*....|....*....|....*....|....*
gi 224458301 1477 QALGIPLIP-PQAQELGIPLTPQQAQAL--GILLIPPQAQELGIPLTPQQAQALGIPliPPQAQE 1538
Cdd:PRK10263 818 QQPQQPVAPqPQYQQPQQPVAPQPQDTLlhPLLMRNGDSRPLHKPTTPLPSLDLLTP--PPSEVE 880
|
|
| PHA03307 | PHA03307 | transcriptional regulator ICP4; Provisional | 1738-1975 | 5.75e-05 |
|
transcriptional regulator ICP4; Provisional
Pssm-ID: 223039 [Multi-domain] Cd Length: 1352 Bit Score: 48.63 E-value: 5.75e-05
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1738 LSLASSAPTAEKSSIFGVSSTPLQISRVPLNQGPFAPGKPlemgILSEPGKLGAPQTLRSSGQTLVYG---GQSTSAQFP 1814
Cdd:PHA03307 93 STLAPASPAREGSPTPPGPSSPDPPPPTPPPASPPPSPAP----DLSEMLRPVGSPGPPPAASPPAAGaspAAVASDAAS 168
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1815 APQ-APPSPGQLPISRAPPTPGQPFIAGVPPTSGQiPSLWAPLSPGQPLVPEASSIPGDLLESGPLTFSEQLQEFQPPAT 1893
Cdd:PHA03307 169 SRQaALPLSSPEETARAPSSPPAEPPPSTPPAAAS-PRPPRRSSPISASASSPAPAPGRSAADDAGASSSDSSSSESSGC 247
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1894 AEQspylQAPSTPGQHLATWTLPGR--ASSLWIPPTSRHPPTLwPSPAPGKPQKSWSPSVAKKRLAIISSLKSKSV-LIH 1970
Cdd:PHA03307 248 GWG----PENECPLPRPAPITLPTRiwEASGWNGPSSRPGPAS-SSSSPRERSPSPSPSSPGSGPAPSSPRASSSSsSSR 322
|
....*
gi 224458301 1971 PSAPD 1975
Cdd:PHA03307 323 ESSSS 327
|
|
| Atrophin-1 | pfam03154 | Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian ... | 1630-1950 | 1.40e-04 |
|
Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian atrophy (DRPLA) gene. DRPLA OMIM:125370 is a progressive neurodegenerative disorder. It is caused by the expansion of a CAG repeat in the DRPLA gene on chromosome 12p. This results in an extended polyglutamine region in atrophin-1, that is thought to confer toxicity to the protein, possibly through altering its interactions with other proteins. The expansion of a CAG repeat is also the underlying defect in six other neurodegenerative disorders, including Huntington's disease. One interaction of expanded polyglutamine repeats that is thought to be pathogenic is that with the short glutamine repeat in the transcriptional coactivator CREB binding protein, CBP. This interaction draws CBP away from its usual nuclear location to the expanded polyglutamine repeat protein aggregates that are characteristic of the polyglutamine neurodegenerative disorders. This interferes with CBP-mediated transcription and causes cytotoxicity.
Pssm-ID: 460830 [Multi-domain] Cd Length: 991 Bit Score: 47.45 E-value: 1.40e-04
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1630 QQAQAQGITLTPQQAQALGVPITPVNAWVSAVTLTSEQTHALESPMNLEQAQEQLLKLGVPLTldkahtlgSPLTLKQVQ 1709
Cdd:pfam03154 163 QQQILQTQPPVLQAQSGAASPPSPPPPGTTQAATAGPTPSAPSVPPQGSPATSQPPNQTQSTA--------APHTLIQQT 234
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1710 WSHRPfqkskASLPTGQSIISRLSPSLRLSLASSAPTAEKSSifgvsSTPLQISRVPLNQGPFA---PGKPLEMGILSEP 1786
Cdd:pfam03154 235 PTLHP-----QRLPSPHPPLQPMTQPPPPSQVSPQPLPQPSL-----HGQMPPMPHSLQTGPSHmqhPVPPQPFPLTPQS 304
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1787 GKLGAPqtlrSSGQTLVYGGQSTSAQFPAPQAPPSPGQLPISRA-PPTP-GQPFIAgvPPTSGQIPSLWAPLS---PGQP 1861
Cdd:pfam03154 305 SQSQVP----PGPSPAAPGQSQQRIHTPPSQSQLQSQQPPREQPlPPAPlSMPHIK--PPPTTPIPQLPNPQShkhPPHL 378
|
250 260 270 280 290 300 310 320
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1862 LVPEASSIPGDLLESGPLTFSEQLQEFQPPAT-------AEQSPYLQA-PSTPGQHLATWTLPgrasslwiPPTSRHPPT 1933
Cdd:pfam03154 379 SGPSPFQMNSNLPPPPALKPLSSLSTHHPPSAhppplqlMPQSQQLPPpPAQPPVLTQSQSLP--------PPAASHPPT 450
|
330 340
....*....|....*....|
gi 224458301 1934 LWPSPAPGK---PQKSWSPS 1950
Cdd:pfam03154 451 SGLHQVPSQspfPQHPFVPG 470
|
|
| SMC_N |
pfam02463 |
RecF/RecN/SMC N terminal domain; This domain is found at the N terminus of SMC proteins. The ... |
635-993 |
1.47e-04 |
|
RecF/RecN/SMC N terminal domain; This domain is found at the N terminus of SMC proteins. The SMC (structural maintenance of chromosomes) superfamily proteins have ATP-binding domains at the N- and C-termini, and two extended coiled-coil domains separated by a hinge in the middle. The eukaryotic SMC proteins form two kinds of heterodimers: the SMC1/SMC3 and the SMC2/SMC4 types. These heterodimers constitute an essential part of higher-order complexes, which are involved in chromatin and DNA dynamics. This family also includes the RecF and RecN proteins that are involved in DNA metabolism and recombination.
Pssm-ID: 426784 [Multi-domain] Cd Length: 1161 Bit Score: 47.27 E-value: 1.47e-04
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 635 KSHQLVKSLSRVAKETSESTRVLESPDGKSEQSNLEEFQEAIMAfLKQKIDNIGKAFDKKTVPKEEELLKRAEAEKLGII 714
Cdd:pfam02463 174 ALKKLIEETENLAELIIDLEELKLQELKLKEQAKKALEYYQLKE-KLELEEEYLLYLDYLKLNEERIDLLQELLRDEQEE 252
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 715 KAKMEEYFQKVAETVTKILRKYKDTKKEEQVGEKPIKQKKvvsfmpglhfqkspISAKSESSTLLSYEST-----DPVIN 789
Cdd:pfam02463 253 IESSKQEIEKEEEKLAQVLKENKEEEKEKKLQEEELKLLA--------------KEEEELKSELLKLERRkvddeEKLKE 318
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 790 NLIQMILAEIESERDIPTVStvQKDHKEKEKQRQEQYLQEGQEQMSgmslkQQLLGERNLLKEHYEKISENWEEKKAwlq 869
Cdd:pfam02463 319 SEKEKKKAEKELKKEKEEIE--ELEKELKELEIKREAEEEEEEELE-----KLQEKLEQLEEELLAKKKLESERLSS--- 388
|
250 260 270 280 290 300 310 320
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 870 mkegkqeqqsqkqwQEEEMWKEEQKQATPKQAEQEEKQKQRGQEEEELPKSSLQRLEEGTQKmKTQGLLLEKENGQMRQI 949
Cdd:pfam02463 389 --------------AAKLKEEELELKSEEEKEAQLLLELARQLEDLLKEEKKEELEILEEEE-ESIELKQGKLTEEKEEL 453
|
330 340 350 360
....*....|....*....|....*....|....*....|....
gi 224458301 950 QKEAKHLGPHRRREKGKEKQKPERGLEDLERQIKTKDQMQMKET 993
Cdd:pfam02463 454 EKQELKLLKDELELKKSEDLLKETQLVKLQEQLELLLSRQKLEE 497
|
|
| KREPA2 |
cd23959 |
Kinetoplastid RNA Editing Protein A2 (KREPA2); The KREPA2 (TbMP63) protein is a component of ... |
1770-1908 |
1.91e-04 |
|
Kinetoplastid RNA Editing Protein A2 (KREPA2); The KREPA2 (TbMP63) protein is a component of the parasitic protozoan's KREPA RNA editing catalytic complex (RECC). Kinetoplastid RNA editing (KRE) proteins occur as pairs or sets of related proteins in multiple complexes. The KREPA complex is composed of six components (KREPA1-6), which share a conserved C-terminal region containing an oligonucleotide-binding (OB)-fold-like domain. KREPAs are responsible for the site-specific insertion and deletion of U nucleotides in the kinetoplastid mitochondrial pre-messenger RNA. Apart from the conserved C-terminal OB-fold domain, KREPA1, KREPA2, and KREPA3 contain two conserved C2H2 zinc-finger domains. KREPA2 and kinetoplastid RNA editing ligase 1 (KREL1) are specific for ligation post-U-deletion and are paralogous to KREL2 and KREPA1, which are specific for ligation post-U-insertion. KREPA2 is critical for RECC stability and KREL1 integration into the complex.
Pssm-ID: 467780 [Multi-domain] Cd Length: 424 Bit Score: 46.40 E-value: 1.91e-04
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1770 GPFAPGKPLEMGILSePGKLGAPQTLRSSGQTLVYGGQSTSAQFPAPQAPPSP--GQLPI--SRAPPTPGQPFIAGVPPT 1845
Cdd:cd23959 97 DAFAMAPDESLGPFR-AARVPNPFSASSSTQRETHKTAQVAPPKAEPQTAPVTpfGQLPMfgQHPPPAKPLPAAAAAQQS 175
|
90 100 110 120 130 140
....*....|....*....|....*....|....*....|....*....|....*....|....*..
gi 224458301 1846 S---GQIPSLWAPLSP-GQPLVPEASSIPGDLLESGPLTFSEQLQEFQPPATAEQSPylQAPSTPGQ 1908
Cdd:cd23959 176 SaspGEVASPFASGTVsASPFATATDTAPSSGAPDGFPAEASAPSPFAAPASAASFP--AAPVANGE 240
|
|
| PTZ00121 |
PTZ00121 |
MAEBL; Provisional |
646-1001 |
2.69e-04 |
|
MAEBL; Provisional
Pssm-ID: 173412 [Multi-domain] Cd Length: 2084 Bit Score: 46.67 E-value: 2.69e-04
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 646 VAKETSESTRVLESPDGKSEQSNLEEFQEAIMAFLK----QKIDNIGKAFDkktVPKEEELlKRAEAEKLGIIKAKMEEy 721
Cdd:PTZ00121 1088 RADEATEEAFGKAEEAKKTETGKAEEARKAEEAKKKaedaRKAEEARKAED---ARKAEEA-RKAEDAKRVEIARKAED- 1162
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 722 fQKVAEtvtkILRKYKDTKKEEQVgekpikqKKVVSFMPGLHFQKSPISAKSESSTllSYESTDPVINnliqmiLAEIES 801
Cdd:PTZ00121 1163 -ARKAE----EARKAEDAKKAEAA-------RKAEEVRKAEELRKAEDARKAEAAR--KAEEERKAEE------ARKAED 1222
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 802 ERDIPTVSTVQKDHKEKEKQRQEQYLQEGQEQMSGMSLKQQLLGERnllkehyeKISENWEEKKAWLQMKEGKQEQQSQK 881
Cdd:PTZ00121 1223 AKKAEAVKKAEEAKKDAEEAKKAEEERNNEEIRKFEEARMAHFARR--------QAAIKAEEARKADELKKAEEKKKADE 1294
|
250 260 270 280 290 300 310 320
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 882 QWQEEEMWKEEQKQATPKQAEQEEKQKQRGQEEEELPKSSLQRLEEGTQKMKTQGLLLEKENGQMRQIQKEAKHLGPHRR 961
Cdd:PTZ00121 1295 AKKAEEKKKADEAKKKAEEAKKADEAKKKAEEAKKKADAAKKKAEEAKKAAEAAKAEAEAAADEAEAAEEKAEAAEKKKE 1374
|
330 340 350 360
....*....|....*....|....*....|....*....|....*....
gi 224458301 962 REKGK---------EKQKPERGLEDLERQIKTKDQMQMKETQPKELEKM 1001
Cdd:PTZ00121 1375 EAKKKadaakkkaeEKKKADEAKKKAEEDKKKADELKKAAAAKKKADEA 1423
|
|
| PHA03247 |
PHA03247 |
large tegument protein UL36; Provisional |
1792-1982 |
3.50e-04 |
|
large tegument protein UL36; Provisional
Pssm-ID: 223021 [Multi-domain] Cd Length: 3151 Bit Score: 46.08 E-value: 3.50e-04
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1792 PQTLRSSGQTLVYGGQSTSAQFPAPQAPPSPGQLPISRAPPTPGQPFIAGVPPTSGQIPSLWAPLS--------PGQPLV 1863
Cdd:PHA03247 2628 PPSPSPAANEPDPHPPPTVPPPERPRDDPAPGRVSRPRRARRLGRAAQASSPPQRPRRRAARPTVGsltsladpPPPPPT 2707
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1864 PEASsiPGDLLESGPLTFSEQL-QEFQPPATAEQSPylqaPSTPgqhlATWTLPGRASSLWIPPTSRHPPTLWPSPAPGK 1942
Cdd:PHA03247 2708 PEPA--PHALVSATPLPPGPAAaRQASPALPAAPAP----PAVP----AGPATPGGPARPARPPTTAGPPAPAPPAAPAA 2777
|
170 180 190 200
....*....|....*....|....*....|....*....|
gi 224458301 1943 PqkswsPSVAKKRLAIISSLKSKSVLIHPSAPDFKVAQVP 1982
Cdd:PHA03247 2778 G-----PPRRLTRPAVASLSESRESLPSPWDPADPPAAVL 2812
|
|
| SP1-4_arthropods_N |
cd22553 |
N-terminal domain of transcription factor Specificity Protein (SP) 1-4 from arthropods; ... |
1284-1552 |
5.03e-04 |
|
N-terminal domain of transcription factor Specificity Protein (SP) 1-4 from arthropods; Specificity Proteins (SPs) are transcription factors that are involved in many cellular processes, including cell differentiation, cell growth, apoptosis, immune responses, response to DNA damage, and chromatin remodeling. There are many SPs in vertebrates (9 SPs in humans and mice, 7 SPs in the chicken, and 11 SPs in teleost fish), but arthropods only have 3 SPs. One SP is clade SP1-4, which is expressed ubiquitously throughout development. SP1-4 belongs to a family of proteins, called the SP/Kruppel or Krueppel-like Factor (KLF) family, characterized by a C-terminal DNA-binding domain of 81 amino acids consisting of three Kruppel-like C2H2 zinc fingers. These factors bind to a loose consensus motif, namely NNRCRCCYY (where N is any nucleotide; R is A/G, and Y is C/T), such as the recurring motifs in GC and GT boxes (5'-GGGGCGGGG-3' and 5'-GGTGTGGGG-3') that are present in promoters and more distal regulatory elements of mammalian genes. SP factors preferentially bind GC boxes, while KLFs bind CACCC boxes. Another characteristic hallmark of SP factors is the presence of the Buttonhead (BTD) box CXCPXC, just N-terminal to the zinc fingers. The function of the BTD box is unknown, but it is thought to play an important physiological role. Another feature of most SP factors is the presence of a conserved amino acid stretch, the so-called SP box, located close to the N-terminus. This model represents the N-terminal domain of SP1-4 from arthropods.
Pssm-ID: 411778 [Multi-domain] Cd Length: 384 Bit Score: 45.02 E-value: 5.03e-04
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1284 AQELGIPLNPQQAqtlgIPLTPKQAQALgipFTPQQAQALGIPLTPQQAQTQEITLTPQQAQALGMPLTTQQAQELGIPL 1363
Cdd:cd22553 88 ANSGLLQTNNQQA----IQLAPGGTQAI---LANQQTLIRPNTVQGQANASNVLQNIAQIASGGNAVQLPLNNMTQTIPV 160
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1364 TpqhaqalgMPLTTQQAQELgipltpqqAQALGMPLTTQQAQELGIPLTPQQAQELGIPFTPQQAQAQEITLTPQQAQAL 1443
Cdd:cd22553 161 Q--------VPVSTANGQTV--------YQTIQVPIQAIQSGNAGGGNQALQAQVIPQLAQAAQLQPQQLAQVSSQGYIQ 224
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1444 GMPLTAQQAQELGITLTPQQAQELGIPLTPQQAQALGIPLIPPQAQELGIPLTPQQAQALGILLIPPQAQELGIPLTPQQ 1523
Cdd:cd22553 225 QIPANASQQQPQMVQQGPNQSGQIIGQVASASSIQAAAIPLTVYTGALAGQNGSNQQQVGQIVTSPIQGMTQGLTAPASS 304
|
250 260 270
....*....|....*....|....*....|
gi 224458301 1524 AqalgIPLIPPQAQELGIPLTP-QQVQALG 1552
Cdd:cd22553 305 S----IPTVVQQQAIQGNPLPPgTQIIAAG 330
|
|
| PHA03378 |
PHA03378 |
EBNA-3B; Provisional |
1499-1951 |
5.41e-04 |
|
EBNA-3B; Provisional
Pssm-ID: 223065 [Multi-domain] Cd Length: 991 Bit Score: 45.44 E-value: 5.41e-04
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1499 QAQALGILLIPPQAQELGIPLTPQQAQALGIPLIP-PQAQELGIPL---TPQQVQALG--IPLIPPQAQELEipltpQQA 1572
Cdd:PHA03378 446 HSQAPTVVLHRPPTQPLEGPTGPLSVQAPLEPWQPlPHPQVTPVILhqpPAQGVQAHGsmLDLLEKDDEDME-----QRV 520
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1573 QALGIPLTPQQaqelgiPLTPQQA-----QELGI----PLTPQQAQAQGIP---LTPQQAQALGISLTPQQAQ-----AQ 1635
Cdd:PHA03378 521 MATLLPPSPPQ------PRAGRRApcvytEDLDIesdePASTEPVHDQLLPapgLGPLQIQPLTSPTTSQLASsapsyAQ 594
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1636 GITLTPQQAQALGVPITPVNAwvsAVTLTSEQTHALESPMNLEQAQEQLLKLGVPLTLDKAHTLGSPLTLKQVQWS---H 1712
Cdd:PHA03378 595 TPWPVPHPSQTPEPPTTQSHI---PETSAPRQWPMPLRPIPMRPLRMQPITFNVLVFPTPHQPPQVEITPYKPTWTqigH 671
|
250 260 270 280 290 300 310 320
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1713 RPFQKSkaslPTGQSIISRLSPS-LRLSLASSAPTAEKSSifGVSSTPLQISRVPLNQGPFAPGKPLEMG-ILSEPGKLG 1790
Cdd:PHA03378 672 IPYQPS----PTGANTMLPIQWApGTMQPPPRAPTPMRPP--AAPPGRAQRPAAATGRARPPAAAPGRARpPAAAPGRAR 745
|
330 340 350 360 370 380 390 400
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1791 APQTLRSSGQTLV---------YGGQSTSAQFPAPQAPPSPGQLPisRAPPTPGQPfiAGVPPTSGQI------------ 1849
Cdd:PHA03378 746 PPAAAPGRARPPAaapgrarppAAAPGAPTPQPPPQAPPAPQQRP--RGAPTPQPP--PQAGPTSMQLmpraapgqqgpt 821
|
410 420 430 440 450 460 470 480
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1850 ----------------PSLWAP--LSPGQPLVPEAS--SIPGDLLESGPLTFSEQLQEFQPPATAEQSPYLQA---PSTP 1906
Cdd:PHA03378 822 kqilrqlltggvkrgrPSLKKPaaLERQAAAGPTPSpgSGTSDKIVQAPVFYPPVLQPIQVMRQLGSVRAAAAstvTQAP 901
|
490 500 510 520
....*....|....*....|....*....|....*....|....*
gi 224458301 1907 GQHLATWTLPGRASSLWIPPTSRHPPTLWPSPAPGKPQKSWSPSV 1951
Cdd:PHA03378 902 TEYTGERRGVGPMHPTDIPPSKRAKTDAYVESQPPHGGQSHSFSV 946
|
|
| SP2_N |
cd22540 |
N-terminal domain of transcription factor Specificity Protein (SP) 2; Specificity Proteins ... |
1319-1656 |
7.37e-04 |
|
N-terminal domain of transcription factor Specificity Protein (SP) 2; Specificity Proteins (SPs) are transcription factors that are involved in many cellular processes, including cell differentiation, cell growth, apoptosis, immune responses, response to DNA damage, and chromatin remodeling. SP2 contains the least conserved DNA-binding domain within the SP subfamily of proteins, and its DNA sequence specificity differs from the other SP proteins. It localizes primarily within subnuclear foci associated with the nuclear matrix, and can activate, or in some cases, repress expression from different promoters. The transcription factor SP2 serves as a paradigm for indirect genomic binding. It does not require its DNA-binding domain for genomic DNA binding and occupies target promoters independently of whether they contain a cognate DNA-binding motif. SP2 belongs to a family of proteins, called the SP/Kruppel or Krueppel-like Factor (KLF) family, characterized by a C-terminal DNA-binding domain of 81 amino acids consisting of three Kruppel-like C2H2 zinc fingers. These factors bind to a loose consensus motif, namely NNRCRCCYY (where N is any nucleotide; R is A/G, and Y is C/T), such as the recurring motifs in GC and GT boxes (5'-GGGGCGGGG-3' and 5'-GGTGTGGGG-3') that are present in promoters and more distal regulatory elements of mammalian genes. SP factors preferentially bind GC boxes, while KLFs bind CACCC boxes. Another characteristic hallmark of SP factors is the presence of the Buttonhead (BTD) box CXCPXC, just N-terminal to the zinc fingers. The function of the BTD box is unknown, but it is thought to play an important physiological role. Another feature of most SP factors is the presence of a conserved amino acid stretch, the so-called SP box, located close to the N-terminus. SP factors may be separated into three groups based on their domain architecture and the similarity of their N-terminal transactivation domains: SP1-4, SP5, and SP6-9. The transactivation domains between the three groups are not homologous to one another. SP1-4 have similar N-terminal transactivation domains characterized by glutamine-rich regions, which, in most cases, have adjacent serine/threonine-rich regions. This model represents the N-terminal domain of SP2.
Pssm-ID: 411776 [Multi-domain] Cd Length: 511 Bit Score: 44.53 E-value: 7.37e-04
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1319 QAQALGIPLTPQQAQTQEITLTPQQAQALGMPLTTQQAQelgiplTPQHAQALGMPLTTQQAQelGIPLTPQQAQALGMP 1398
Cdd:cd22540 89 QGSQLSSSAPGGQQVFAIQNPTMIIKGSQTRSSTNQQYQ------ISPQIQAAGQINNSGQIQ--IIPGTNQAIITPVQV 160
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1399 LTTQQAQELGIPLTPQQAQELGIPFTPQQAQAQEITLTPQQAQALGMPLTAQQAQELGITLTPQQAQELGIPLTPQQAQA 1478
Cdd:cd22540 161 LQQPQQAHKPVPIKPAPLQTSNTNSASLQVPGNVIKLQSGGNVALTLPVNNLVGTQDGATQLQLAAAPSKPSKKIRKKSA 240
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1479 LGIPLIPPQAQELGIPL----TPQQAQALGILLIpPQAQELGIPLTPQQAQAL------GIPLIPP------QAQELGIP 1542
Cdd:cd22540 241 QAAQPAVTVAEQVETVLiettADNIIQAGNNLLI-VQSPGTGQPAVLQQVQVLqpkqeqQVVQIPQqalrvvQAASATLP 319
|
250 260 270 280 290 300 310 320
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1543 LTPQQV-QALGIPLIPPQAQELEIPLTPQQAQALGI---PLTPQQAQELGIPLTPQQAQELGIPLTPQQAQAQGIPLTPQ 1618
Cdd:cd22540 320 TVPQKPlQNIQIQNSEPTPTQVYIKTPSGEVQTVLLqeaPAATATPSSSTSTVQQQVTANNGTGTSKPNYNVRKERTLPK 399
|
330 340 350 360
....*....|....*....|....*....|....*....|..
gi 224458301 1619 QAQALG-ISL--TPQQAQAQGI-TLTPQQAQALGVPITPVNA 1656
Cdd:cd22540 400 IAPAGGiISLnaAQLAAAAQAIqTININGVQVQGVPVTITNA 441
|
|
| SP1-4_arthropods_N |
cd22553 |
N-terminal domain of transcription factor Specificity Protein (SP) 1-4 from arthropods; ... |
1431-1660 |
8.45e-04 |
|
N-terminal domain of transcription factor Specificity Protein (SP) 1-4 from arthropods; Specificity Proteins (SPs) are transcription factors that are involved in many cellular processes, including cell differentiation, cell growth, apoptosis, immune responses, response to DNA damage, and chromatin remodeling. There are many SPs in vertebrates (9 SPs in humans and mice, 7 SPs in the chicken, and 11 SPs in teleost fish), but arthropods only have 3 SPs. One SP is clade SP1-4, which is expressed ubiquitously throughout development. SP1-4 belongs to a family of proteins, called the SP/Kruppel or Krueppel-like Factor (KLF) family, characterized by a C-terminal DNA-binding domain of 81 amino acids consisting of three Kruppel-like C2H2 zinc fingers. These factors bind to a loose consensus motif, namely NNRCRCCYY (where N is any nucleotide; R is A/G, and Y is C/T), such as the recurring motifs in GC and GT boxes (5'-GGGGCGGGG-3' and 5'-GGTGTGGGG-3') that are present in promoters and more distal regulatory elements of mammalian genes. SP factors preferentially bind GC boxes, while KLFs bind CACCC boxes. Another characteristic hallmark of SP factors is the presence of the Buttonhead (BTD) box CXCPXC, just N-terminal to the zinc fingers. The function of the BTD box is unknown, but it is thought to play an important physiological role. Another feature of most SP factors is the presence of a conserved amino acid stretch, the so-called SP box, located close to the N-terminus. This model represents the N-terminal domain of SP1-4 from arthropods.
Pssm-ID: 411778 [Multi-domain] Cd Length: 384 Bit Score: 44.25 E-value: 8.45e-04
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1431 QEITLTPQQAQALgmpLTAQQAQELGITLTPQQAQELGIPLTPQQAQALGIPLIPPQAQELGIPLTPQQAQALG-----I 1505
Cdd:cd22553 99 QAIQLAPGGTQAI---LANQQTLIRPNTVQGQANASNVLQNIAQIASGGNAVQLPLNNMTQTIPVQVPVSTANGqtvyqT 175
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1506 LLIPPQAQELGIPLTPQQAqaLGIPLIPPQAQelgipltPQQVQALGIPLIPPQAQELEIPLTPQQAQALGIPLTPQQAQ 1585
Cdd:cd22553 176 IQVPIQAIQSGNAGGGNQA--LQAQVIPQLAQ-------AAQLQPQQLAQVSSQGYIQQIPANASQQQPQMVQQGPNQSG 246
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1586 ELGIPLTPQQAQELGIPLTPQQAQAQGIPLTPQQAQALGISLTPQQAQAQGITLTP--------QQAQALGVPITPVNAW 1657
Cdd:cd22553 247 QIIGQVASASSIQAAAIPLTVYTGALAGQNGSNQQQVGQIVTSPIQGMTQGLTAPAsssiptvvQQQAIQGNPLPPGTQI 326
|
...
gi 224458301 1658 VSA 1660
Cdd:cd22553 327 IAA 329
|
|
| PAT1 |
pfam09770 |
Topoisomerase II-associated protein PAT1; Members of this family are necessary for accurate ... |
1292-1453 |
1.13e-03 |
|
Topoisomerase II-associated protein PAT1; Members of this family are necessary for accurate chromosome transmission during cell division.
Pssm-ID: 401645 [Multi-domain] Cd Length: 846 Bit Score: 44.26 E-value: 1.13e-03
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1292 NPQQAQTLGIPLTPKQAQAlgipfTPQQAQALGIPLTPQQAQTQEITLTPQQAQALGMPLTTQQAqelgiPLTPQHAQAL 1371
Cdd:pfam09770 209 KPAQQPAPAPAQPPAAPPA-----QQAQQQQQFPPQIQQQQQPQQQPQQPQQHPGQGHPVTILQR-----PQSPQPDPAQ 278
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1372 GMPLTTQQAQELGIPLTPQQ-AQALGMPlTTQQAQELGIPLTPQQAQElgiPFTPQQAQAQeitltpQQAQALGMPLTAQ 1450
Cdd:pfam09770 279 PSIQPQAQQFHQQPPPVPVQpTQILQNP-NRLSAARVGYPQNPQPGVQ---PAPAHQAHRQ------QGSFGRQAPIITH 348
|
...
gi 224458301 1451 QAQ 1453
Cdd:pfam09770 349 PQQ 351
|
|
| flk |
PRK10715 |
flagella biosynthesis regulator Flk; |
1291-1491 |
1.14e-03 |
|
flagella biosynthesis regulator Flk;
Pssm-ID: 182670 Cd Length: 335 Bit Score: 43.51 E-value: 1.14e-03
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1291 LNPQQAQTLgipLTPKQAQALGIPFtPQQAQALGIPLTP--QQAQTQEIT----LTPQQAQALGMPLTTQQAQELGIPLT 1364
Cdd:PRK10715 130 LSPEQLKQV---LTLLQNGQLSIPQ-PQQRPATDRPLLPaeHNALNQLVTklaaATGEQPKKIWQSMLELSGVKSGELIP 205
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1365 PQHAQALGMPLTTQQAqelgipLTPQQAQALGMPLTTqqaqeLGIPLTPQQAQELgIPFTPQQAQAqeitlTPQqaqalg 1444
Cdd:PRK10715 206 AKHFPLLSQWLQARQT------LSQQHAPTLESLQAA-----LKQPLDAQEQQLL-SDYAQQRFQA-----SPQ------ 262
|
170 180 190 200
....*....|....*....|....*....|....*....|....*..
gi 224458301 1445 MPLTAQQAQELGITLTPQQAQELGIPLTPQQAQALGIPLIPPQAQEL 1491
Cdd:PRK10715 263 TPLTPAQVQDLLNQLFQRRVERIQEALEPRPLQPLINPLIAPLPDTL 309
|
|
| PRK10263 |
PRK10263 |
DNA translocase FtsK; Provisional |
1368-1916 |
1.20e-03 |
|
DNA translocase FtsK; Provisional
Pssm-ID: 236669 [Multi-domain] Cd Length: 1355 Bit Score: 44.31 E-value: 1.20e-03
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1368 AQALGMPL--TTQQAQELGIPLTPQQAQALGMPLTTQQAQELGIPLTPQqaqelGIPFTPQQAQAQEITLTPQQ-----A 1440
Cdd:PRK10263 330 TQSWAAPVepVTQTPPVASVDVPPAQPTVAWQPVPGPQTGEPVIAPAPE-----GYPQQSQYAQPAVQYNEPLQqpvqpQ 404
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1441 QALGMPLTAQQAQELGITLTPQQAQELGIPLTPQQAQALGIPLIPPQAQELGIPLTPQQAQALGILLIPPQAQELGIPLT 1520
Cdd:PRK10263 405 QPYYAPAAEQPAQQPYYAPAPEQPAQQPYYAPAPEQPVAGNAWQAEEQQSTFAPQSTYQTEQTYQQPAAQEPLYQQPQPV 484
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1521 PQQAQALGIPLIppqaqELGIPLTPqqvqalgiPLIPPQAQELEIPLTPQQAQALGIPL-TPQQAQELGIPLTPQQAQEL 1599
Cdd:PRK10263 485 EQQPVVEPEPVV-----EETKPARP--------PLYYFEEVEEKRAREREQLAAWYQPIpEPVKEPEPIKSSLKAPSVAA 551
|
250 260 270 280 290 300 310 320
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1600 GIPLTPQQAQA---QGIPLTPQQAQALGISLTPQQAQAQGITLTPQQAQALGvPITPVNAWVSAVTLTSEQTHALESP-- 1674
Cdd:PRK10263 552 VPPVEAAAAVSplaSGVKKATLATGAAATVAAPVFSLANSGGPRPQVKEGIG-PQLPRPKRIRVPTRRELASYGIKLPsq 630
|
330 340 350 360 370 380 390 400
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1675 -MNLEQAQE---QLLKLGVPLTLDKAHTLGSPLTLKQV--QWSHRPFQKSKASLPTgQSIISRLSPSLRLSLASSAPTAE 1748
Cdd:PRK10263 631 rAAEEKAREaqrNQYDSGDQYNDDEIDAMQQDELARQFaqTQQQRYGEQYQHDVPV-NAEDADAAAEAELARQFAQTQQQ 709
|
410 420 430 440 450 460 470 480
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1749 KSSifgvSSTPLQISRVPLNQGPFAPGKPLEMGILSEPgkLGAPQTLRSSGQTLVYGGQSTSAQFPAPQAPPSPGQLPIS 1828
Cdd:PRK10263 710 RYS----GEQPAGANPFSLDDFEFSPMKALLDDGPHEP--LFTPIVEPVQQPQQPVAPQQQYQQPQQPVAPQPQYQQPQQ 783
|
490 500 510 520 530 540 550 560
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1829 RAPPTP--GQPFIAGVPPTSGQipslwaplSPGQPLVPEassiPGDLLESGPLTFSEQLQEFQPPATAE----------- 1895
Cdd:PRK10263 784 PVAPQPqyQQPQQPVAPQPQYQ--------QPQQPVAPQ----PQYQQPQQPVAPQPQYQQPQQPVAPQpqdtllhpllm 851
|
570 580
....*....|....*....|....
gi 224458301 1896 ---QSPYLQAPSTPGQHLATWTLP 1916
Cdd:PRK10263 852 rngDSRPLHKPTTPLPSLDLLTPP 875
|
|
| PspC_subgroup_1 |
NF033838 |
pneumococcal surface protein PspC, choline-binding form; The pneumococcal surface protein PspC, ... |
639-1006 |
1.29e-03 |
|
pneumococcal surface protein PspC, choline-binding form; The pneumococcal surface protein PspC, as described in Streptococcus pneumoniae, is a repetitive and highly variable protein, recognized by a conserved N-terminal domain and also by genomic location. This form, subgroup 1, has variable numbers of a choline-binding repeat in the C-terminal region, and is also known as choline-binding protein A. The other form, subgroup 2, is anchored covalently after cleavage by sortase at a C-terminal LPXTG site.
Pssm-ID: 468201 [Multi-domain] Cd Length: 684 Bit Score: 43.85 E-value: 1.29e-03
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 639 LVKSLSRVAKETSESTRVLESpdgKSEQSnleefqeaIMAFLKQKIDNIGKAFDKKTVPKEEellKRAEAEKlgiikakm 718
Cdd:NF033838 89 LNKKLSDIKTEYLYELNVLKE---KSEAE--------LTSKTKKELDAAFEQFKKDTLEPGK---KVAEATK-------- 146
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 719 eeyfqKVAETVtkilRKYKDTKKEEQVGEKPIKQKKVvsfmpGLHFQKSPISAKSESSTLLSYESTDPVINNLIQMILAE 798
Cdd:NF033838 147 -----KVEEAE----KKAKDQKEEDRRNYPTNTYKTL-----ELEIAESDVEVKKAELELVKEEAKEPRDEEKIKQAKAK 212
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 799 IESERDIPT----VSTVQKDHKEKEKQRQEQYLQEGQEQMSGMSLKQQLLG--ERNLLKEHYEKISENWEEKKAWLQMKE 872
Cdd:NF033838 213 VESKKAEATrlekIKTDREKAEEEAKRRADAKLKEAVEKNVATSEQDKPKRraKRGVLGEPATPDKKENDAKSSDSSVGE 292
|
250 260 270 280 290 300 310 320
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 873 GKQEQQSQkqwqeeemwKEEQKQATPKQAEQEEKQKQRGQEEEE---LPKSSLQRLE----EGTQKMKTQGLLLEKEngq 945
Cdd:NF033838 293 ETLPSPSL---------KPEKKVAEAEKKVEEAKKKAKDQKEEDrrnYPTNTYKTLEleiaESDVKVKEAELELVKE--- 360
|
330 340 350 360 370 380
....*....|....*....|....*....|....*....|....*....|....*....|....*
gi 224458301 946 mrqiqkEAKHlgpHRRREKGKE-KQKPERGLEDLERQIKTKDQMQMKETQPK---ELEKMVIQTP 1006
Cdd:NF033838 361 ------EAKE---PRNEEKIKQaKAKVESKKAEATRLEKIKTDRKKAEEEAKrkaAEEDKVKEKP 416
|
|
| PRK12323 |
PRK12323 |
DNA polymerase III subunit gamma/tau; |
1791-1952 |
1.29e-03 |
|
DNA polymerase III subunit gamma/tau;
Pssm-ID: 237057 [Multi-domain] Cd Length: 700 Bit Score: 44.10 E-value: 1.29e-03
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1791 APQTLRSSGQTLVYGGQSTSAQFPAPQAPPSPGQLPisRAPPTPGQPFIAGVPPTSGQIPSLWAPLSPGQPLVPEAssip 1870
Cdd:PRK12323 429 APEALAAARQASARGPGGAPAPAPAPAAAPAAAARP--AAAGPRPVAAAAAAAPARAAPAAAPAPADDDPPPWEEL---- 502
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1871 gdllesgpltfseqlqefqPPATAEQSPYLQAPSTPGQHLATWTLPGRASSLWIPPTSRHPPTLWPSPAPGKPQKSWSPS 1950
Cdd:PRK12323 503 -------------------PPEFASPAPAQPDAAPAGWVAESIPDPATADPDDAFETLAPAPAAAPAPRAAAATEPVVAP 563
|
..
gi 224458301 1951 VA 1952
Cdd:PRK12323 564 RP 565
|
|
| DamX |
COG3266 |
Cell division protein DamX, binds to the septal ring, contains C-terminal SPOR domain [Cell ... |
1322-1622 |
1.36e-03 |
|
Cell division protein DamX, binds to the septal ring, contains C-terminal SPOR domain [Cell cycle control, cell division, chromosome partitioning];
Pssm-ID: 442497 [Multi-domain] Cd Length: 455 Bit Score: 43.69 E-value: 1.36e-03
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1322 ALGIPLTPQQAQTQEITLTPQQAQALgMPLTTQQAQELGIPLTPQHAQALGMPLTTQQAQELGIPLTPQQAQALGMPLTT 1401
Cdd:COG3266 53 LLAGLLLLLIRLLSEAVDLGALASAA-LLLALASLALLGILLLALLALLLDLLLLADLLRAAALLLLKLLLLLLTLLLLV 131
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1402 QQAQELGIPLTPQQAQELGIPFTPQQAQAQEITLTPQQAQALGMPLTAQQAQELgiTLTPQQAQELGIPLTPQQAQALGI 1481
Cdd:COG3266 132 LLLLLALLLALLLDLPLLTLLIVLPLLEEQLLLLALQDIQGTLQALGAVAALLG--LRKAEEALALRAGSAAADALALLL 209
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1482 PLIPPQAQELGIPLTPQQ--AQALGILLIPPQAQELGIPLTPQQAQALgipliPPQAQELGIPLTPQQVQALGIPLIPPQ 1559
Cdd:COG3266 210 LLLASALGEAVAAAAELAalALLAAGAAEVLTARLVLLLLIIGSALKA-----PSQASSASAPATTSLGEQQEVSLPPAV 284
|
250 260 270 280 290 300
....*....|....*....|....*....|....*....|....*....|....*....|...
gi 224458301 1560 AQELEiPLTPQQAQALGIPLTPQQAQELGIPLTPQQAQELGIPLTPQQAQAQGIPlTPQQAQA 1622
Cdd:COG3266 285 AAQPA-AAAAAQPSAVALPAAPAAAAAAAAPAEAAAPQPTAAKPVVTETAAPAAP-APEAAAA 345
|
|
| SP1-4_arthropods_N |
cd22553 |
N-terminal domain of transcription factor Specificity Protein (SP) 1-4 from arthropods; ... |
1217-1452 |
1.60e-03 |
|
N-terminal domain of transcription factor Specificity Protein (SP) 1-4 from arthropods; Specificity Proteins (SPs) are transcription factors that are involved in many cellular processes, including cell differentiation, cell growth, apoptosis, immune responses, response to DNA damage, and chromatin remodeling. There are many SPs in vertebrates (9 SPs in humans and mice, 7 SPs in the chicken, and 11 SPs in teleost fish), but arthropods only have 3 SPs. One SP is clade SP1-4, which is expressed ubiquitously throughout development. SP1-4 belongs to a family of proteins, called the SP/Kruppel or Krueppel-like Factor (KLF) family, characterized by a C-terminal DNA-binding domain of 81 amino acids consisting of three Kruppel-like C2H2 zinc fingers. These factors bind to a loose consensus motif, namely NNRCRCCYY (where N is any nucleotide; R is A/G, and Y is C/T), such as the recurring motifs in GC and GT boxes (5'-GGGGCGGGG-3' and 5'-GGTGTGGGG-3') that are present in promoters and more distal regulatory elements of mammalian genes. SP factors preferentially bind GC boxes, while KLFs bind CACCC boxes. Another characteristic hallmark of SP factors is the presence of the Buttonhead (BTD) box CXCPXC, just N-terminal to the zinc fingers. The function of the BTD box is unknown, but it is thought to play an important physiological role. Another feature of most SP factors is the presence of a conserved amino acid stretch, the so-called SP box, located close to the N-terminus. This model represents the N-terminal domain of SP1-4 from arthropods.
Pssm-ID: 411778 [Multi-domain] Cd Length: 384 Bit Score: 43.48 E-value: 1.60e-03
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1217 IPLTPQQAQAL---GITLTLQQAQQL------GIPLTPQQAQALGITLTPKQVQELGIPLTPQQAQALGITLTpkqaQEL 1287
Cdd:cd22553 101 IQLAPGGTQAIlanQQTLIRPNTVQGqanasnVLQNIAQIASGGNAVQLPLNNMTQTIPVQVPVSTANGQTVY----QTI 176
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1288 GIPLNPQQAQTLGIPLTPKQAQALgipftPQQAQalgipltPQQAQTQEITLTPQQAQALGMPLTTQQAQELGIPLTP-Q 1366
Cdd:cd22553 177 QVPIQAIQSGNAGGGNQALQAQVI-----PQLAQ-------AAQLQPQQLAQVSSQGYIQQIPANASQQQPQMVQQGPnQ 244
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1367 HAQALGMPLTTQQAQELGIPLTPQQaQALGMPLTTQQAQELGIPLTPQQAQELGIPFTPqqaqAQEITLTPQQAQALGMP 1446
Cdd:cd22553 245 SGQIIGQVASASSIQAAAIPLTVYT-GALAGQNGSNQQQVGQIVTSPIQGMTQGLTAPA----SSSIPTVVQQQAIQGNP 319
|
....*.
gi 224458301 1447 LTAQQA 1452
Cdd:cd22553 320 LPPGTQ 325
|
|
| DUF5401 |
pfam17380 |
Family of unknown function (DUF5401); This is a family of unknown function found in ... |
799-1055 |
1.71e-03 |
|
Family of unknown function (DUF5401); This is a family of unknown function found in Chromadorea.
Pssm-ID: 375164 [Multi-domain] Cd Length: 722 Bit Score: 43.57 E-value: 1.71e-03
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 799 IESERDIPTVSTVQKDhKEKEKQRQEQYLQEGQE--QMSGMSLKQQLLGER---NLLKEHYEKISENWEEKKAWLQMKEG 873
Cdd:pfam17380 344 MERERELERIRQEERK-RELERIRQEEIAMEISRmrELERLQMERQQKNERvrqELEAARKVKILEEERQRKIQQQKVEM 422
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 874 KQEQQSqkqwqeeemwKEEQKQATPKQAEQE-EKQKQRGQEEEELPKSSLQRLEEGTQKMKTQGLLLEKENGQMRQIQKE 952
Cdd:pfam17380 423 EQIRAE----------QEEARQREVRRLEEErAREMERVRLEEQERQQQVERLRQQEEERKRKKLELEKEKRDRKRAEEQ 492
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 953 -----AKHLGPHRRR--EKGKEKQKPERGLEDLERQIKTKDQMQMKE----TQPKELEKMVIQTPMTLSPRWKSVLKDVQ 1021
Cdd:pfam17380 493 rrkilEKELEERKQAmiEEERKRKLLEKEMEERQKAIYEEERRREAEeerrKQQEMEERRRIQEQMRKATEERSRLEAME 572
|
250 260 270
....*....|....*....|....*....|....
gi 224458301 1022 RSyegKEFQRNLKTLENLPDEKEpiSITPPPSLQ 1055
Cdd:pfam17380 573 RE---REMMRQIVESEKARAEYE--ATTPITTIK 601
|
|
| PRK03918 |
PRK03918 |
DNA double-strand break repair ATPase Rad50; |
781-1051 |
1.81e-03 |
|
DNA double-strand break repair ATPase Rad50;
Pssm-ID: 235175 [Multi-domain] Cd Length: 880 Bit Score: 43.51 E-value: 1.81e-03
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 781 YESTDPVINNLIQMILAEIESERD-IPTVSTVQKDHKEKEKqRQEQYLQEGQEQMSGM-SLKQQLLGERNLLKEhYEKIS 858
Cdd:PRK03918 160 YENAYKNLGEVIKEIKRRIERLEKfIKRTENIEELIKEKEK-ELEEVLREINEISSELpELREELEKLEKEVKE-LEELK 237
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 859 ENWEEKKAWLQMKEGKQEQQsqkqwqeeemwkEEQKQATPKQAEQEEKqkqrgqEEEELpKSSLQRLEEgtqkmktqgll 938
Cdd:PRK03918 238 EEIEELEKELESLEGSKRKL------------EEKIRELEERIEELKK------EIEEL-EEKVKELKE----------- 287
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 939 LEKENGQMRQIQKEakhlgphrRREKGKEKQKPERGLEDLERQIKT-KDQMQMKETQPKELEKmviqtpmtLSPRWKSVL 1017
Cdd:PRK03918 288 LKEKAEEYIKLSEF--------YEEYLDELREIEKRLSRLEEEINGiEERIKELEEKEERLEE--------LKKKLKELE 351
|
250 260 270
....*....|....*....|....*....|....*
gi 224458301 1018 KDVQRSYE-GKEFQRNLKTLENLPDEKEPISITPP 1051
Cdd:PRK03918 352 KRLEELEErHELYEEAKAKKEELERLKKRLTGLTP 386
|
|
| PHA03307 |
PHA03307 |
transcriptional regulator ICP4; Provisional |
1739-1946 |
1.83e-03 |
|
transcriptional regulator ICP4; Provisional
Pssm-ID: 223039 [Multi-domain] Cd Length: 1352 Bit Score: 43.62 E-value: 1.83e-03
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1739 SLASSAPTAEKSSIFG----VSSTPLQISRVPLNQGPfAPGKPLEMGILSEPGKLGAPQTLRSSGQTLVYGGQSTSAQF- 1813
Cdd:PHA03307 23 RPPATPGDAADDLLSGsqgqLVSDSAELAAVTVVAGA-AACDRFEPPTGPPPGPGTEAPANESRSTPTWSLSTLAPASPa 101
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1814 -------PAPQAPPSPGQLPISRAPPTPGQPFIAGVPPTSGQIPSLWAPLSPGQPLVPEASsiPGDLLESGPLTFSEQLq 1886
Cdd:PHA03307 102 regsptpPGPSSPDPPPPTPPPASPPPSPAPDLSEMLRPVGSPGPPPAASPPAAGASPAAV--ASDAASSRQAALPLSS- 178
|
170 180 190 200 210 220
....*....|....*....|....*....|....*....|....*....|....*....|....
gi 224458301 1887 efqPPATAeqspylQAPSTPGQHLATWTLPGRASSlwiPPTSRHPP----TLWPSPAPGKPQKS 1946
Cdd:PHA03307 179 ---PEETA------RAPSSPPAEPPPSTPPAAASP---RPPRRSSPisasASSPAPAPGRSAAD 230
|
|
| PRK07764 |
PRK07764 |
DNA polymerase III subunits gamma and tau; Validated |
1770-1944 |
1.85e-03 |
|
DNA polymerase III subunits gamma and tau; Validated
Pssm-ID: 236090 [Multi-domain] Cd Length: 824 Bit Score: 43.44 E-value: 1.85e-03
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1770 GPFAPGKPLEMGILSEPGKLGAPqtlrssgqtlvyGGQSTSAQFPAPQAPPSPGQLPISRAPPTPGQPFIAGVPPTSGQI 1849
Cdd:PRK07764 599 GPPAPASSGPPEEAARPAAPAAP------------AAPAAPAPAGAAAAPAEASAAPAPGVAAPEHHPKHVAVPDASDGG 666
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1850 PSLWAPLSPGQPLVPEASSIPGDllESGPLTFSEQLQEFQPPATAEQSPYLQAPSTPGQHLATWTLPGRASSLWIPPTSR 1929
Cdd:PRK07764 667 DGWPAKAGGAAPAAPPPAPAPAA--PAAPAGAAPAQPAPAPAATPPAGQADDPAAQPPQAAQGASAPSPAADDPVPLPPE 744
|
170
....*....|....*
gi 224458301 1930 HPPTLWPSPAPGKPQ 1944
Cdd:PRK07764 745 PDDPPDPAGAPAQPP 759
|
|
| Atrophin-1 |
pfam03154 |
Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian ... |
1797-1974 |
1.94e-03 |
|
Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian atrophy (DRPLA) gene. DRPLA (OMIM:125370) is a progressive neurodegenerative disorder. It is caused by the expansion of a CAG repeat in the DRPLA gene on chromosome 12p. This results in an extended polyglutamine region in atrophin-1 that is thought to confer toxicity to the protein, possibly through altering its interactions with other proteins. The expansion of a CAG repeat is also the underlying defect in six other neurodegenerative disorders, including Huntington's disease. One interaction of expanded polyglutamine repeats that is thought to be pathogenic is that with the short glutamine repeat in the transcriptional coactivator CREB binding protein, CBP. This interaction draws CBP away from its usual nuclear location to the expanded polyglutamine repeat protein aggregates that are characteristic of the polyglutamine neurodegenerative disorders. This interferes with CBP-mediated transcription and causes cytotoxicity.
Pssm-ID: 460830 [Multi-domain] Cd Length: 991 Bit Score: 43.60 E-value: 1.94e-03
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1797 SSGQTLVYGGQSTSAQFPAPQAPPSPgqlpisRAPPTPGQPFIAGVPPTSGQIPSLWAPLSPGQPLVPEASSIPGDLLES 1876
Cdd:pfam03154 160 SSAQQQILQTQPPVLQAQSGAASPPS------PPPPGTTQAATAGPTPSAPSVPPQGSPATSQPPNQTQSTAAPHTLIQQ 233
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1877 GPLTFSEQLQEFQPPATAEQSP----YLQAPSTPGQHLATWTLPGRASSLWIPPTSRHPPTLWPSPAPGKPQKSWSPSVA 1952
Cdd:pfam03154 234 TPTLHPQRLPSPHPPLQPMTQPpppsQVSPQPLPQPSLHGQMPPMPHSLQTGPSHMQHPVPPQPFPLTPQSSQSQVPPGP 313
|
170 180
....*....|....*....|..
gi 224458301 1953 KKRLAIISSLKSKSVLIHPSAP 1974
Cdd:pfam03154 314 SPAAPGQSQQRIHTPPSQSQLQ 335
|
|
| PHA03377 |
PHA03377 |
EBNA-3C; Provisional |
1790-1975 |
2.28e-03 |
|
EBNA-3C; Provisional
Pssm-ID: 177614 [Multi-domain] Cd Length: 1000 Bit Score: 43.50 E-value: 2.28e-03
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1790 GAPQTLRSSGQTLVYGGQSTSAQ----FPAPQAPPSPGQLPISRAPPTPGQPFIAGVPPTSGQIP--SLWAPLSPGQPLV 1863
Cdd:PHA03377 696 GRAQPSEESHLSSMSPTQPISHEeqprYEDPDDPLDLSLHPDQAPPPSHQAPYSGHEEPQAQQAPypGYWEPRPPQAPYL 775
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1864 ----PEA-----SSIPGDlleSGPLTFSEQLQEFQPP-ATAEQSPYLQAPSTPgqhlatwtlpgrasslWIPPTSRHPPT 1933
Cdd:PHA03377 776 gyqePQAqgvqvSSYPGY---AGPWGLRAQHPRYRHSwAYWSQYPGHGHPQGP----------------WAPRPPHLPPQ 836
|
170 180 190 200
....*....|....*....|....*....|....*....|..
gi 224458301 1934 LWPSPAPGKPQKSWSPSVAKKRLAIISSLKSKSVLIHPSAPD 1975
Cdd:PHA03377 837 WDGSAGHGQDQVSQFPHLQSETGPPRLQLSQVPQLPYSQTLV 878
|
|
| PRK07764 |
PRK07764 |
DNA polymerase III subunits gamma and tau; Validated |
1803-1950 |
2.65e-03 |
|
DNA polymerase III subunits gamma and tau; Validated
Pssm-ID: 236090 [Multi-domain] Cd Length: 824 Bit Score: 43.05 E-value: 2.65e-03
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1803 VYGGQSTSAQFPAPQAPPSPGQLPISRAPPTPGQPfIAGVPPTSGQIPSLWAPLSPGQPLVPEASSIPGDLLESGPLTFS 1882
Cdd:PRK07764 587 VVGPAPGAAGGEGPPAPASSGPPEEAARPAAPAAP-AAPAAPAPAGAAAAPAEASAAPAPGVAAPEHHPKHVAVPDASDG 665
|
90 100 110 120 130 140 150
....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1883 EQLQ--EFQPPATAEQSPylqAPSTPGQHLATWTLPGRASSlwiPPTSRHPPTLWPSPAPGKPQKSWSPS 1950
Cdd:PRK07764 666 GDGWpaKAGGAAPAAPPP---APAPAAPAAPAGAAPAQPAP---APAATPPAGQADDPAAQPPQAAQGAS 729
|
|
| PHA03378 |
PHA03378 |
EBNA-3B; Provisional |
1153-1539 |
4.25e-03 |
|
EBNA-3B; Provisional
Pssm-ID: 223065 [Multi-domain] Cd Length: 991 Bit Score: 42.36 E-value: 4.25e-03
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1153 QAPGISLTTQQAQKLGIPLTPQQAQAlgiPLTPQQA----QELGIPLTPQQAQALRV------------SLTPQQAQELG 1216
Cdd:PHA03378 448 QAPTVVLHRPPTQPLEGPTGPLSVQA---PLEPWQPlphpQVTPVILHQPPAQGVQAhgsmldllekddEDMEQRVMATL 524
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1217 IPLTPQQAQAlGITLTLQQAQQLGI---------PLTPQQAQALGitLTPKQVQELGIPLTPQQAqalgiTLTPKQAQEL 1287
Cdd:PHA03378 525 LPPSPPQPRA-GRRAPCVYTEDLDIesdepastePVHDQLLPAPG--LGPLQIQPLTSPTTSQLA-----SSAPSYAQTP 596
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1288 GipLNPQQAQTLGIPLTPKQAQALGipfTPQQAQALGIPLTPQQAQTQEITLTP-------QQAQALGMPLTTQQAQELG 1360
Cdd:PHA03378 597 W--PVPHPSQTPEPPTTQSHIPETS---APRQWPMPLRPIPMRPLRMQPITFNVlvfptphQPPQVEITPYKPTWTQIGH 671
|
250 260 270 280 290 300 310 320
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1361 IPLTPQHAQALGM------PLTTQQAQELGIPLTPQQAQalgmPLTTQQAQELGIPLTPQQAqelgipfTPQQAQAQEIT 1434
Cdd:PHA03378 672 IPYQPSPTGANTMlpiqwaPGTMQPPPRAPTPMRPPAAP----PGRAQRPAAATGRARPPAA-------APGRARPPAAA 740
|
330 340 350 360 370 380 390 400
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1435 LTPQQAQAlGMPLTAQQAQELGITLTPQQAQElGIPLTPQQAQALGIPLIPPQAQElgiplTPQQaqalgilliPPQAQE 1514
Cdd:PHA03378 741 PGRARPPA-AAPGRARPPAAAPGRARPPAAAP-GAPTPQPPPQAPPAPQQRPRGAP-----TPQP---------PPQAGP 804
|
410 420
....*....|....*....|....*
gi 224458301 1515 LGIPLTPQQAQALGIPLIPPQAQEL 1539
Cdd:PHA03378 805 TSMQLMPRAAPGQQGPTKQILRQLL 829
|
|
| Borrelia_P83 |
pfam05262 |
Borrelia P83/100 protein; This family consists of several Borrelia P83/P100 antigen proteins. |
890-1030 |
4.41e-03 |
|
Borrelia P83/100 protein; This family consists of several Borrelia P83/P100 antigen proteins.
Pssm-ID: 114011 [Multi-domain] Cd Length: 489 Bit Score: 42.30 E-value: 4.41e-03
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 890 KEEQKQATPKQAEQEEKQKQRGQEEEELPKSSLQRLEEGtqkmktqgllLEKENGQMRQIQKEAKHLGPHRRREKGKEKQ 969
Cdd:pfam05262 204 KERESQEDAKRAQQLKEELDKKQIDADKAQQKADFAQDN----------ADKQRDEVRQKQQEAKNLPKPADTSSPKEDK 273
|
90 100 110 120 130 140 150
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....
gi 224458301 970 K----PERGLEDLERQIKTKDQMQMKET---------QPKELEKMVIQTPMTLSPRWKSVLKDVQRSYEGKEFQ 1030
Cdd:pfam05262 274 QvaenQKREIEKAQIEIKKNDEEALKAKdhkafdlkqESKASEKEAEDKELEAQKKREPVAEDLQKTKPQVEAQ 347
|
|
| SMC_prok_A |
TIGR02169 |
chromosome segregation protein SMC, primarily archaeal type; SMC (structural maintenance of ... |
817-1000 |
4.66e-03 |
|
chromosome segregation protein SMC, primarily archaeal type; SMC (structural maintenance of chromosomes) proteins bind DNA and act in organizing and segregating chromosomes for partition. SMC proteins are found in bacteria, archaea, and eukaryotes. SMC is found in a single copy and is homodimeric in prokaryotes, but six paralogs (excluded from this family) are found in eukaryotes, where SMC proteins are heterodimeric. This family represents the SMC protein of archaea and a few bacteria (Aquifex, Synechocystis, etc.); the SMC of other bacteria is described by TIGR02168. The N- and C-terminal domains of this protein are well conserved, but the central hinge region is skewed in composition and highly divergent. [Cellular processes, Cell division, DNA metabolism, Chromosome-associated proteins]
Pssm-ID: 274009 [Multi-domain] Cd Length: 1164 Bit Score: 42.36 E-value: 4.66e-03
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 817 EKEKQRQEqyLQEGQEQMSGMSLK--------QQLLGERNLlKEHYEKISENWEEKKAWLQMKEGKQEqqsqkqwqeeem 888
Cdd:TIGR02169 171 KKEKALEE--LEEVEENIERLDLIidekrqqlERLRREREK-AERYQALLKEKREYEGYELLKEKEAL------------ 235
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 889 wkEEQKQATPKQ-AEQEEKQKQRGQEEEELPK---SSLQRLEEGTQKMKtqglllEKENGQMRQIQKEAKHLGPHRRREK 964
Cdd:TIGR02169 236 --ERQKEAIERQlASLEEELEKLTEEISELEKrleEIEQLLEELNKKIK------DLGEEEQLRVKEKIGELEAEIASLE 307
|
170 180 190
....*....|....*....|....*....|....*..
gi 224458301 965 GKEKQKpERGLEDLERQI-KTKDQMQMKETQPKELEK 1000
Cdd:TIGR02169 308 RSIAEK-ERELEDAEERLaKLEAEIDKLLAEIEELER 343
|
|
| PAT1 |
pfam09770 |
Topoisomerase II-associated protein PAT1; Members of this family are necessary for accurate ... |
1389-1551 |
4.67e-03 |
|
Topoisomerase II-associated protein PAT1; Members of this family are necessary for accurate chromosome transmission during cell division.
Pssm-ID: 401645 [Multi-domain] Cd Length: 846 Bit Score: 42.33 E-value: 4.67e-03
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1389 PQQAQALGMPLTTQQaqelgIPLTPQQAQELGIPFTPQQAQAQEITLTPQQAQALGMPLTAQQaqelgitlTPQQAQelg 1468
Cdd:pfam09770 210 PAQQPAPAPAQPPAA-----PPAQQAQQQQQFPPQIQQQQQPQQQPQQPQQHPGQGHPVTILQ--------RPQSPQ--- 273
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1469 ipltPQQAQalgiPLIPPQAQELGIPLTPQQAQALGILLIP--PQAQELGIPLTPQQAQALGIPLIPPQAQ-----ELGI 1541
Cdd:pfam09770 274 ----PDPAQ----PSIQPQAQQFHQQPPPVPVQPTQILQNPnrLSAARVGYPQNPQPGVQPAPAHQAHRQQgsfgrQAPI 345
|
170
....*....|
gi 224458301 1542 PLTPQQVQAL 1551
Cdd:pfam09770 346 ITHPQQLAQL 355
|
|
| PRK03918 |
PRK03918 |
DNA double-strand break repair ATPase Rad50; |
670-980 |
6.05e-03 |
|
DNA double-strand break repair ATPase Rad50;
Pssm-ID: 235175 [Multi-domain] Cd Length: 880 Bit Score: 41.97 E-value: 6.05e-03
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 670 EEFQEAIMAFLKQKIDNIGKAFdKKTVPKEEELlkRAEAEKLGIIKAKMEEYF--QKVAETVTKI---LRKY------KD 738
Cdd:PRK03918 447 EEHRKELLEEYTAELKRIEKEL-KEIEEKERKL--RKELRELEKVLKKESELIklKELAEQLKELeekLKKYnleeleKK 523
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 739 TKKEEQVGEKPIKQKKVVSfmpglhfqkspiSAKSESSTLLSYESTDPVINNLIQMI---LAEIESERDIPTVSTVQKDh 815
Cdd:PRK03918 524 AEEYEKLKEKLIKLKGEIK------------SLKKELEKLEELKKKLAELEKKLDELeeeLAELLKELEELGFESVEEL- 590
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 816 kEKEKQRQEQYLQEGQEQMSGMSLKQQLLGERNLLKEHYEKISENWEEKKAWLQMKEGKQeqqsqkqwqeeemwkEEQKQ 895
Cdd:PRK03918 591 -EERLKELEPFYNEYLELKDAEKELEREEKELKKLEEELDKAFEELAETEKRLEELRKEL---------------EELEK 654
|
250 260 270 280 290 300 310 320
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 896 atpKQAEQEEKQKqrgqEEEELPKSS-LQRLEEGtqkmktqgllLEKENGQMRQIQKEAKHLGPHR--RREKGKEKQKPE 972
Cdd:PRK03918 655 ---KYSEEEYEEL----REEYLELSReLAGLRAE----------LEELEKRREEIKKTLEKLKEELeeREKAKKELEKLE 717
|
....*...
gi 224458301 973 RGLEDLER 980
Cdd:PRK03918 718 KALERVEE 725
|
|
| PAT1 |
pfam09770 |
Topoisomerase II-associated protein PAT1; Members of this family are necessary for accurate ... |
1520-1679 |
6.22e-03 |
|
Topoisomerase II-associated protein PAT1; Members of this family are necessary for accurate chromosome transmission during cell division.
Pssm-ID: 401645 [Multi-domain] Cd Length: 846 Bit Score: 41.95 E-value: 6.22e-03
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1520 TPQQAQALGIPLIPPQAQelgiPLTPQQVQALGIPLIPPQAQELEIPLTPQQAQALGIPLTPQQAQELGiPLTPQQAQel 1599
Cdd:pfam09770 208 KKPAQQPAPAPAQPPAAP----PAQQAQQQQQFPPQIQQQQQPQQQPQQPQQHPGQGHPVTILQRPQSP-QPDPAQPS-- 280
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1600 giPLTPQQAQAQGIPLTPQQ-------------AQALGISLTPQQAQAQGITLTPQQAQALGVPITPVNAWVSAVTLTSE 1666
Cdd:pfam09770 281 --IQPQAQQFHQQPPPVPVQptqilqnpnrlsaARVGYPQNPQPGVQPAPAHQAHRQQGSFGRQAPIITHPQQLAQLSEE 358
|
170
....*....|...
gi 224458301 1667 QTHAlespmNLEQ 1679
Cdd:pfam09770 359 EKAA-----YLDE 366
|
|
| PHA03379 |
PHA03379 |
EBNA-3A; Provisional |
1550-1998 |
7.96e-03 |
|
EBNA-3A; Provisional
Pssm-ID: 223066 [Multi-domain] Cd Length: 935 Bit Score: 41.58 E-value: 7.96e-03
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1550 ALGIPLIPPQAQELEIPLTPQQAQALGIPLTPQQAQELGIPLTPQQAQELGIPLTPQQAQAQGIPLTPQQAQALGISLTP 1629
Cdd:PHA03379 412 TYGTPRPPVEKPRPEVPQSLETATSHGSAQVPEPPPVHDLEPGPLHDQHSMAPCPVAQLPPGPLQDLEPGDQLPGVVQDG 491
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1630 QQAqaqgitltPQQAQALGVPItpVNAWVSAVTLTSEQTHALESPMNLEQA-----QEQLLKLGVPLTLDKAHT-LGSPL 1703
Cdd:PHA03379 492 RPA--------CAPVPAPAGPI--VRPWEASLSQVPGVAFAPVMPQPMPVEpvpvpTVALERPVCPAPPLIAMQgPGETS 561
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1704 TLKQVQWSHRPfqksKASLPTGQSIISRLSPSLRLSLASSAPTAEKSSifgVSSTPLQISRVPlnqgpfaPGKPLEMGIl 1783
Cdd:PHA03379 562 GIVRVRERWRP----APWTPNPPRSPSQMSVRDRLARLRAEAQPYQAS---VEVQPPQLTQVS-------PQQPMEYPL- 626
|
250 260 270 280 290 300 310 320
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1784 sEPGKLGAPQTLRSSGQTLVYGGQSTSAQfpaPQAPPSPGQLPIS-RAPPTPGQPFIAGVPPTSGQIPS-----LWAPLS 1857
Cdd:PHA03379 627 -EPEQQMFPGSPFSQVADVMRAGGVPAMQ---PQYFDLPLQQPISqGAPLAPLRASMGPVPPVPATQPQyfdipLTEPIN 702
|
330 340 350 360 370 380 390 400
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1858 PGQ-------------PLVPEASSIPGDLLESGPLTFSEQLQEFQPPAT------AEQSPYLQAPSTPGQhlatwtlpgr 1918
Cdd:PHA03379 703 QGAsaahflpqqpmegPLVPERWMFQGATLSQSVRPGVAQSQYFDLPLTqpinhgAPAAHFLHQPPMEGP---------- 772
|
410 420 430 440 450 460 470 480
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1919 asslWIPPTsrhpptlWP-SPAPGKPQKSWSPSVAKKRLAIISSLKSKSVLIHPSAPDFKVAQVPF--TTKKFQMSEVSD 1995
Cdd:PHA03379 773 ----WVPEQ-------WMfQGAPPSQGTDVVQHQLDALGYVLHVLNHPGVPVSPAVNQYHVSQAAFglPIDEDESGEGSD 841
|
...
gi 224458301 1996 TSE 1998
Cdd:PHA03379 842 TSE 844
|
|
| PAT1 |
pfam09770 |
Topoisomerase II-associated protein PAT1; Members of this family are necessary for accurate ... |
1496-1643 |
8.51e-03 |
|
Topoisomerase II-associated protein PAT1; Members of this family are necessary for accurate chromosome transmission during cell division.
Pssm-ID: 401645 [Multi-domain] Cd Length: 846 Bit Score: 41.56 E-value: 8.51e-03
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1496 TPQQAQALGILLIPPQAQelgiPLTPQQAQALGIPLIPPQAQELGIPLTPQQVQALGIPlippqAQELEIPLTPQQAQAL 1575
Cdd:pfam09770 208 KKPAQQPAPAPAQPPAAP----PAQQAQQQQQFPPQIQQQQQPQQQPQQPQQHPGQGHP-----VTILQRPQSPQPDPAQ 278
|
90 100 110 120 130 140 150
....*....|....*....|....*....|....*....|....*....|....*....|....*....|...
gi 224458301 1576 GIPLTPQQAQELGIPLTPQQ-AQELGIPLTPQQAQAQGIPLTPQQAQALGISLTPQQAQAQG----ITLTPQQ 1643
Cdd:pfam09770 279 PSIQPQAQQFHQQPPPVPVQpTQILQNPNRLSAARVGYPQNPQPGVQPAPAHQAHRQQGSFGrqapIITHPQQ 351
|
|
| PAT1 |
pfam09770 |
Topoisomerase II-associated protein PAT1; Members of this family are necessary for accurate ... |
1476-1631 |
8.80e-03 |
|
Topoisomerase II-associated protein PAT1; Members of this family are necessary for accurate chromosome transmission during cell division.
Pssm-ID: 401645 [Multi-domain] Cd Length: 846 Bit Score: 41.56 E-value: 8.80e-03
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1476 AQALGIPliPPQAQElgIPLTPQQAQAlgillIPPQAQELGIPLTPQQAQALGIPLIPPQAQELGIPLT----PQQVQAL 1551
Cdd:pfam09770 205 AQAKKPA--QQPAPA--PAQPPAAPPA-----QQAQQQQQFPPQIQQQQQPQQQPQQPQQHPGQGHPVTilqrPQSPQPD 275
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1552 -GIPLIPPQAQELEIPLTPQQAQALGIPLTP--QQAQELGIPLTPQQAQElgiPLTPQQAQAQGipltPQQAQALGISLT 1628
Cdd:pfam09770 276 pAQPSIQPQAQQFHQQPPPVPVQPTQILQNPnrLSAARVGYPQNPQPGVQ---PAPAHQAHRQQ----GSFGRQAPIITH 348
|
...
gi 224458301 1629 PQQ 1631
Cdd:pfam09770 349 PQQ 351
|
|
| PHA03377 |
PHA03377 |
EBNA-3C; Provisional |
1805-1951 |
9.26e-03 |
|
EBNA-3C; Provisional
Pssm-ID: 177614 [Multi-domain] Cd Length: 1000 Bit Score: 41.19 E-value: 9.26e-03
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1805 GGQSTSAQFPAPQAPPSPgqlPISRAPPTPGQPFIAgvPPTSGqiPSLWAPLSPGQPLVPEASSIPGDLLESGPLTFSEQ 1884
Cdd:PHA03377 539 GFQRSGRRQKRATPPKVS---PSDRGPPKASPPVMA--PPSTG--PRVMATPSTGPRDMAPPSTGPRQQAKCKDGPPASG 611
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1885 LQEFQPPATAeqsPYLQAPSTPGQHLATWTLP---GRA-SSLW------------IPPTSRHPPTLWPSPapgkPQKSWS 1948
Cdd:PHA03377 612 PHEKQPPSSA---PRDMAPSVVRMFLRERLLEqstGPKpKSFWemragrdgsgiqQEPSSRRQPATQSTP----PRPSWL 684
|
...
gi 224458301 1949 PSV 1951
Cdd:PHA03377 685 PSV 687
|
|
|
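The cd22553 description above quotes a loose DNA consensus motif, NNRCRCCYY, with N = any nucleotide, R = A/G and Y = C/T. The following is a minimal sketch of how such a degenerate motif can be expanded into a regular expression and scanned against a DNA string. The function name `find_motif`, the `IUPAC` lookup table and the example sequence are illustrative assumptions, not part of the CD-Search output, and only the three ambiguity codes named in the description are handled.

```python
import re

# Ambiguity codes named in the cd22553 description (N, R, Y) plus the four
# plain bases. This is a deliberately partial table, assumed for illustration.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]",    # purine
    "Y": "[CT]",    # pyrimidine
    "N": "[ACGT]",  # any nucleotide
}


def find_motif(sequence, motif="NNRCRCCYY"):
    """Return (start, matched_text) for every occurrence of the motif.

    A zero-width lookahead is used so that overlapping occurrences are
    reported as well. Only the strand given in `sequence` is scanned.
    """
    body = "".join(IUPAC[base] for base in motif.upper())
    lookahead = re.compile("(?=(" + body + "))")
    return [(m.start(), m.group(1)) for m in lookahead.finditer(sequence.upper())]


if __name__ == "__main__":
    # Hypothetical test sequence containing a CACCC-type site.
    print(find_motif("TTCCACACCCTAA"))
```

Because the sketch scans only the strand it is given, a site that lies on the opposite strand would be found only if the reverse complement of the sequence were scanned too.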