Conserved domains on  [gi|224458301|ref|NP_001138947|]

protein FAM186A [Homo sapiens]

List of domain hits

Name Accession Description Interval E-value
PTZ00121 super family cl31754
MAEBL; Provisional
381-1042 3.48e-14

MAEBL; Provisional


The actual alignment was detected with superfamily member PTZ00121:

Pssm-ID: 173412 [Multi-domain]  Cd Length: 2084  Bit Score: 79.03  E-value: 3.48e-14
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301  381 ELENIVDEVQRKETkdsgIKWDSTISYTAQAERT-PDLTELRQQPVASEDISEDSTKDNVSLKKGDFYQEDETDEYQSWK 459
Cdd:PTZ00121 1221 EDAKKAEAVKKAEE----AKKDAEEAKKAEEERNnEEIRKFEEARMAHFARRQAAIKAEEARKADELKKAEEKKKADEAK 1296
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301  460 RSH--KKATYVYETSGPNLSDNKSGQKVSEAKPsqyyELQVLKKKRKEMKSFSEDKsKSPTEAKRKHLSLTETKSQGGKS 537
Cdd:PTZ00121 1297 KAEekKKADEAKKKAEEAKKADEAKKKAEEAKK----KADAAKKKAEEAKKAAEAA-KAEAEAAADEAEAAEEKAEAAEK 1371
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301  538 GTSmmmlEQFRK---VKRESPFDKRPTAAEIKVEPTTESLD--KEGKGEIRSLVEPLSMIQFDDTAEPQKGKIKGKKHHI 612
Cdd:PTZ00121 1372 KKE----EAKKKadaAKKKAEEKKKADEAKKKAEEDKKKADelKKAAAAKKKADEAKKKAEEKKKADEAKKKAEEAKKAD 1447
                         250       260       270       280       290       300       310       320
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301  613 SSGTITSKEEKTEEKEELTKQVKSHQLVKSLSRVAKETSESTRVLESPDGKSEQsnLEEFQEAimaflKQKIDNIGKAFD 692
Cdd:PTZ00121 1448 EAKKKAEEAKKAEEAKKKAEEAKKADEAKKKAEEAKKADEAKKKAEEAKKKADE--AKKAAEA-----KKKADEAKKAEE 1520
                         330       340       350       360       370       380       390       400
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301  693 KKtvpKEEELLKRAEAEKLGiiKAKMEEYFQKVAEtvtkiLRKYKDTKKEEQVgeKPIKQKKVVSFMPGLHFQKSPISAK 772
Cdd:PTZ00121 1521 AK---KADEAKKAEEAKKAD--EAKKAEEKKKADE-----LKKAEELKKAEEK--KKAEEAKKAEEDKNMALRKAEEAKK 1588
                         410       420       430       440       450       460       470       480
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301  773 SESSTLLSYESTDPVINNLIQMILAEIESERdiPTVSTVQKDHKEKEKQRQEQYLQEgQEQMSGMSLKQQllGERNLLKE 852
Cdd:PTZ00121 1589 AEEARIEEVMKLYEEEKKMKAEEAKKAEEAK--IKAEELKKAEEEKKKVEQLKKKEA-EEKKKAEELKKA--EEENKIKA 1663
                         490       500       510       520       530       540       550       560
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301  853 HYEKISENWEEKKAwlqmKEGKQEQQSQKQWQEEEMWKEEQKQATPKQAEQEEKQKQRGQE---EEELPKSSLQRLEEGT 929
Cdd:PTZ00121 1664 AEEAKKAEEDKKKA----EEAKKAEEDEKKAAEALKKEAEEAKKAEELKKKEAEEKKKAEElkkAEEENKIKAEEAKKEA 1739
                         570       580       590       600       610       620       630       640
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301  930 QKMKTQGLLLEKENGQMRQIQ---KEAKHLGPHRRREKG---KE--KQKPERGLEDLERQIK-TKDQMQMKETQPKELEK 1000
Cdd:PTZ00121 1740 EEDKKKAEEAKKDEEEKKKIAhlkKEEEKKAEEIRKEKEaviEEelDEEDEKRRMEVDKKIKdIFDNFANIIEGGKEGNL 1819
                         650       660       670       680
                  ....*....|....*....|....*....|....*....|....
gi 224458301 1001 MVIQTPMTLSPRWKSVL--KDVQRSyEGKEFQRNLKTLENLPDE 1042
Cdd:PTZ00121 1820 VINDSKEMEDSAIKEVAdsKNMQLE-EADAFEKHKFNKNNENGE 1862
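
Each hit in this listing carries a one-line summary of the form "Pssm-ID: ... Cd Length: ... Bit Score: ... E-value: ...". The following is a minimal sketch of how such a line could be pulled apart into typed fields. The function name `parse_hit_summary` and the returned dictionary keys are illustrative assumptions, not part of any NCBI tool, and the pattern only assumes the layout seen on this page.

```python
import re

# Pattern for the per-hit summary line as it appears on this page.
# The bracketed tag (e.g. "[Multi-domain]") is treated as optional.
SUMMARY_RE = re.compile(
    r"Pssm-ID:\s*(?P<pssm_id>\d+)\s*(?:\[(?P<kind>[^\]]+)\])?\s*"
    r"Cd Length:\s*(?P<cd_length>\d+)\s*"
    r"Bit Score:\s*(?P<bit_score>[\d.]+)\s*"
    r"E-value:\s*(?P<e_value>[\d.eE+-]+)"
)


def parse_hit_summary(line):
    """Return the summary fields as a dict, or None if the line does not match.

    Hypothetical helper for illustration only.
    """
    match = SUMMARY_RE.search(line)
    if match is None:
        return None
    return {
        "pssm_id": int(match.group("pssm_id")),
        "kind": match.group("kind"),  # None when no bracketed tag is present
        "cd_length": int(match.group("cd_length")),
        "bit_score": float(match.group("bit_score")),
        "e_value": float(match.group("e_value")),
    }


if __name__ == "__main__":
    example = ("Pssm-ID: 173412 [Multi-domain]  Cd Length: 2084  "
               "Bit Score: 79.03  E-value: 3.48e-14")
    print(parse_hit_summary(example))
```

Converting the E-value to a float makes it straightforward to sort or filter hits by significance, for example keeping only hits below a chosen threshold.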
PHA03247 super family cl33720
large tegument protein UL36; Provisional
1602-1953 3.30e-09

large tegument protein UL36; Provisional


The actual alignment was detected with superfamily member PHA03247:

Pssm-ID: 223021 [Multi-domain]  Cd Length: 3151  Bit Score: 62.65  E-value: 3.30e-09
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1602 PLTPQQAQAQGIPlTPQQAQAlgISLTPQQAQAQGITLTPQQAQalgvPITPVN-----AWVSAVTLTSEQTHALESPMN 1676
Cdd:PHA03247 2557 PAAPPAAPDRSVP-PPRPAPR--PSEPAVTSRARRPDAPPQSAR----PRAPVDdrgdpRGPAPPSPLPPDTHAPDPPPP 2629
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1677 LEQAQEQLLKLGVPLTLDKAHTLGSPLTLKQVQWSHRPFQKSKASLPTG--QSIISRLSPSLRLSLASSA--PTAEKSSi 1752
Cdd:PHA03247 2630 SPSPAANEPDPHPPPTVPPPERPRDDPAPGRVSRPRRARRLGRAAQASSppQRPRRRAARPTVGSLTSLAdpPPPPPTP- 2708
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1753 fgVSSTPLQISRVPLNQGPFAPGKplemgilSEPGKLGAPQTLRSSGQTLVYGGQSTSAQFPAPQAPPSPG--QLPISRA 1830
Cdd:PHA03247 2709 --EPAPHALVSATPLPPGPAAARQ-------ASPALPAAPAPPAVPAGPATPGGPARPARPPTTAGPPAPAppAAPAAGP 2779
                         250       260       270       280       290       300       310       320
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1831 PPTPGQPFIAGVPPTSGQIPSLWAPLSPGQPLVPEASSIPGDLLESGPLTfseqlqefqPPATAEQ--SPYLQAPSTPGQ 1908
Cdd:PHA03247 2780 PRRLTRPAVASLSESRESLPSPWDPADPPAAVLAPAAALPPAASPAGPLP---------PPTSAQPtaPPPPPGPPPPSL 2850
                         330       340       350       360
                  ....*....|....*....|....*....|....*....|....*
gi 224458301 1909 HLATWTLPGrasslwiPPTSRHPPTLWPSPAPGKPQKSWSPSVAK 1953
Cdd:PHA03247 2851 PLGGSVAPG-------GDVRRRPPSRSPAAKPAAPARPPVRRLAR 2888
PHA03379 super family cl33730
EBNA-3A; Provisional
1216-1621 9.16e-09

EBNA-3A; Provisional


The actual alignment was detected with superfamily member PHA03379:

Pssm-ID: 223066 [Multi-domain]  Cd Length: 935  Bit Score: 61.23  E-value: 9.16e-09
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1216 GIPLTPQQAQALGITLTLQQAQQLGIPLTPQQAQALGITLTPKQVQELGIPLTPQQAQALGITLTPKQAQELGIPLNPQQ 1295
Cdd:PHA03379  414 GTPRPPVEKPRPEVPQSLETATSHGSAQVPEPPPVHDLEPGPLHDQHSMAPCPVAQLPPGPLQDLEPGDQLPGVVQDGRP 493
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1296 AQT--------LGIPLTPKQAQALGIPFTPQQAQALGIPLTPQQAQTQEITLTPQQAQAL----GMPLTTQQAQE--LGI 1361
Cdd:PHA03379  494 ACApvpapagpIVRPWEASLSQVPGVAFAPVMPQPMPVEPVPVPTVALERPVCPAPPLIAmqgpGETSGIVRVRErwRPA 573
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1362 PLTPQHAQALG-MPLTTQQAQeLGIPLTPQQAQALGMPLTTQQA---QELGIPLTPQQAQELGIPFTPQQAQAQEITLTP 1437
Cdd:PHA03379  574 PWTPNPPRSPSqMSVRDRLAR-LRAEAQPYQASVEVQPPQLTQVspqQPMEYPLEPEQQMFPGSPFSQVADVMRAGGVPA 652
                         250       260       270       280       290       300       310       320
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1438 QQAQALGMPLtaqqaqelgitltpQQAQELGIPLTPQQAQALGIPLIPP-QAQELGIPLTPQQAQALGILLIPPQAQELG 1516
Cdd:PHA03379  653 MQPQYFDLPL--------------QQPISQGAPLAPLRASMGPVPPVPAtQPQYFDIPLTEPINQGASAAHFLPQQPMEG 718
                         330       340       350       360       370       380       390       400
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1517 iPLTPQQAQALGIPLIP------PQAQELGIPLTpqQVQALGIPLIP-PQAQELEIPLTPQQAQALGIPLTPqqaqelGI 1589
Cdd:PHA03379  719 -PLVPERWMFQGATLSQsvrpgvAQSQYFDLPLT--QPINHGAPAAHfLHQPPMEGPWVPEQWMFQGAPPSQ------GT 789
                         410       420       430
                  ....*....|....*....|....*....|..
gi 224458301 1590 PLTPQQAQELGIPltPQQAQAQGIPLTPQQAQ 1621
Cdd:PHA03379  790 DVVQHQLDALGYV--LHVLNHPGVPVSPAVNQ 819
 
Name Accession Description Interval E-value
PTZ00121 PTZ00121
MAEBL; Provisional
381-1042 3.48e-14

MAEBL; Provisional


PHA03247 PHA03247
large tegument protein UL36; Provisional
1602-1953 3.30e-09

large tegument protein UL36; Provisional


Atrophin-1 pfam03154
Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian ...
1512-1940 5.64e-09

Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian atrophy (DRPLA) gene. DRPLA OMIM:125370 is a progressive neurodegenerative disorder. It is caused by the expansion of a CAG repeat in the DRPLA gene on chromosome 12p. This results in an extended polyglutamine region in atrophin-1 that is thought to confer toxicity to the protein, possibly through altering its interactions with other proteins. The expansion of a CAG repeat is also the underlying defect in six other neurodegenerative disorders, including Huntington's disease. One interaction of expanded polyglutamine repeats that is thought to be pathogenic is that with the short glutamine repeat in the transcriptional coactivator CREB binding protein, CBP. This interaction draws CBP away from its usual nuclear location to the expanded polyglutamine repeat protein aggregates that are characteristic of the polyglutamine neurodegenerative disorders. This interferes with CBP-mediated transcription and causes cytotoxicity.


Pssm-ID: 460830 [Multi-domain]  Cd Length: 991  Bit Score: 61.71  E-value: 5.64e-09
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301  1512 AQELGIPLTPQQAQALGIPLIPPQAQELGIPLTPQQVQALGIPLIPPQAQELEIPLtPQQAQALGIPLTPQQAqelGIPL 1591
Cdd:pfam03154  162 AQQQILQTQPPVLQAQSGAASPPSPPPPGTTQAATAGPTPSAPSVPPQGSPATSQP-PNQTQSTAAPHTLIQQ---TPTL 237
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301  1592 TPQQ-------AQELGIPLTPQQAQAQG---------IPLTPQQAQAlGISLTPQQAQAQGITLTPQQAQAlGVPITPVN 1655
Cdd:pfam03154  238 HPQRlpsphppLQPMTQPPPPSQVSPQPlpqpslhgqMPPMPHSLQT-GPSHMQHPVPPQPFPLTPQSSQS-QVPPGPSP 315
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301  1656 AWVSAVTLTSEQTHALESPMNLEQAQEQLLKlgvPLTLDKAHTLGSPLT-LKQVQWSHRPFQKSKASLPTGQSIISRLSP 1734
Cdd:pfam03154  316 AAPGQSQQRIHTPPSQSQLQSQQPPREQPLP---PAPLSMPHIKPPPTTpIPQLPNPQSHKHPPHLSGPSPFQMNSNLPP 392
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301  1735 SLRLSLASSAPTAEKSSifgVSSTPLQIsrvpLNQGPFAPGKPLEMGILSepgklgapqtlrssgqtlvyggQSTSAQFP 1814
Cdd:pfam03154  393 PPALKPLSSLSTHHPPS---AHPPPLQL----MPQSQQLPPPPAQPPVLT----------------------QSQSLPPP 443
                          330       340       350       360       370       380       390       400
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301  1815 APQAPPSPGQLPISRAPPTPGQPFIAGVPPTSgqipslwapLSPGQPLVPEASSIPGdllesgpltfseqlqeFQPPATA 1894
Cdd:pfam03154  444 AASHPPTSGLHQVPSQSPFPQHPFVPGGPPPI---------TPPSGPPTSTSSAMPG----------------IQPPSSA 498
                          410       420       430       440
                   ....*....|....*....|....*....|....*....|....*.
gi 224458301  1895 EQSPYLQAPSTPGQHLATWTLPGRASSLWIPPTSRHPPTLWPSPAP 1940
Cdd:pfam03154  499 SVSSSGPVPAAVSCPLPPVQIKEEALDEAEEPESPPPPPRSPSPEP 544
PHA03379 PHA03379
EBNA-3A; Provisional
1216-1621 9.16e-09

EBNA-3A; Provisional


SMC_N pfam02463
RecF/RecN/SMC N terminal domain; This domain is found at the N terminus of SMC proteins. The ...
291-1039 2.55e-07

RecF/RecN/SMC N terminal domain; This domain is found at the N terminus of SMC proteins. The SMC (structural maintenance of chromosomes) superfamily proteins have ATP-binding domains at the N- and C-termini, and two extended coiled-coil domains separated by a hinge in the middle. The eukaryotic SMC proteins form two kinds of heterodimers: the SMC1/SMC3 and the SMC2/SMC4 types. These heterodimers constitute an essential part of higher order complexes, which are involved in chromatin and DNA dynamics. This family also includes the RecF and RecN proteins that are involved in DNA metabolism and recombination.


Pssm-ID: 426784 [Multi-domain]  Cd Length: 1161  Bit Score: 56.52  E-value: 2.55e-07
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301   291 AHETSEAEKELSLKIIRDLSNENEMLQQKLQDAEEKCEQLIRSKIVIEQLYAKLSTSSTLKVLPGPSPQSSRAIIKVGDT 370
Cdd:pfam02463  254 ESSKQEIEKEEEKLAQVLKENKEEEKEKKLQEEELKLLAKEEEELKSELLKLERRKVDDEEKLKESEKEKKKAEKELKKE 333
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301   371 EDNMDN---------ILDKELENIVDEVQRKETKDSGIKWDSTISYTAQAERTPDLTELRQQPVASEDISEDSTKDNVSL 441
Cdd:pfam02463  334 KEEIEElekelkeleIKREAEEEEEEELEKLQEKLEQLEEELLAKKKLESERLSSAAKLKEEELELKSEEEKEAQLLLEL 413
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301   442 KKGDFYQEDETDEYQSWKRSHKKatyvyetsgpNLSDNKSGQKVSEAKPSQYYELQVLKKKRKEMKSFSEDKSKSPTEAK 521
Cdd:pfam02463  414 ARQLEDLLKEEKKEELEILEEEE----------ESIELKQGKLTEEKEELEKQELKLLKDELELKKSEDLLKETQLVKLQ 483
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301   522 RKHLSLTETKSQGG---KSGTSMMMLEQFRKVKRESPFDKRPTAAEIKVEPTTESLDKEGKGEIRSLVEPLSMIQFDDTA 598
Cdd:pfam02463  484 EQLELLLSRQKLEErsqKESKARSGLKVLLALIKDGVGGRIISAHGRLGDLGVAVENYKVAISTAVIVEVSATADEVEER 563
                          330       340       350       360       370       380       390       400
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301   599 EP-QKGKIKGKKHHISSGTITSKEEKTEEKEELTKQVKSHQLVKSLSRVAKETSESTRVLESPDG---KSEQSNLEEFQE 674
Cdd:pfam02463  564 QKlVRALTELPLGARKLRLLIPKLKLPLKSIAVLEIDPILNLAQLDKATLEADEDDKRAKVVEGIlkdTELTKLKESAKA 643
                          410       420       430       440       450       460       470       480
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301   675 AIMAFLKQKIDNIGKAFDKKTVPKEEELLKRAEAEKLGIIKAKMEEYFQKVAETVTKILRKYKDTKKEEQVgekpIKQKK 754
Cdd:pfam02463  644 KESGLRKGVSLEEGLAEKSEVKASLSELTKELLEIQELQEKAESELAKEEILRRQLEIKKKEQREKEELKK----LKLEA 719
                          490       500       510       520       530       540       550       560
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301   755 VVSFMPGLHFQKSPISaksesstllsyestdpVINNLIQMILAEIESERDIPTVSTVQKDHKEKEKQRQEQYLQEGQEQm 834
Cdd:pfam02463  720 EELLADRVQEAQDKIN----------------EELKLLKQKIDEEEEEEEKSRLKKEEKEEEKSELSLKEKELAEEREK- 782
                          570       580       590       600       610       620       630       640
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301   835 sgmslkqQLLGERNLLKEHYEKI---SENWEEKKAWLQMKEGKQEQqsqkqwqeeemwkEEQKQATPKQAEQEEKQKQRG 911
Cdd:pfam02463  783 -------TEKLKVEEEKEEKLKAqeeELRALEEELKEEAELLEEEQ-------------LLIEQEEKIKEEELEELALEL 842
                          650       660       670       680       690       700       710       720
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301   912 QEEEELPKSSLQRLEEgtQKMKTQGLLLEKENGQMRQIQKEAKHlgphrrREKGKEKQKPERGLEDLERQIKTKDQMQMK 991
Cdd:pfam02463  843 KEEQKLEKLAEEELER--LEEEITKEELLQELLLKEEELEEQKL------KDELESKEEKEKEEKKELEEESQKLNLLEE 914
                          730       740       750       760
                   ....*....|....*....|....*....|....*....|....*...
gi 224458301   992 ETQPKELEKMVIQTPMTLSPRWKSVLKDVQRSYEGKEFQrNLKTLENL 1039
Cdd:pfam02463  915 KENEIEERIKEEAEILLKYEEEPEELLLEEADEKEKEEN-NKEEEEER 961
SP2_N cd22540
N-terminal domain of transcription factor Specificity Protein (SP) 2; Specificity Proteins ...
1046-1487 3.09e-06

N-terminal domain of transcription factor Specificity Protein (SP) 2; Specificity Proteins (SPs) are transcription factors that are involved in many cellular processes, including cell differentiation, cell growth, apoptosis, immune responses, response to DNA damage, and chromatin remodeling. SP2 contains the least conserved DNA-binding domain within the SP subfamily of proteins, and its DNA sequence specificity differs from the other SP proteins. It localizes primarily within subnuclear foci associated with the nuclear matrix, and can activate, or in some cases, repress expression from different promoters. The transcription factor SP2 serves as a paradigm for indirect genomic binding. It does not require its DNA-binding domain for genomic DNA binding and occupies target promoters independently of whether they contain a cognate DNA-binding motif. SP2 belongs to a family of proteins, called the SP/Kruppel or Krueppel-like Factor (KLF) family, characterized by a C-terminal DNA-binding domain of 81 amino acids consisting of three Kruppel-like C2H2 zinc fingers. These factors bind to a loose consensus motif, namely NNRCRCCYY (where N is any nucleotide; R is A/G, and Y is C/T), such as the recurring motifs in GC and GT boxes (5'-GGGGCGGGG-3' and 5'-GGTGTGGGG-3') that are present in promoters and more distal regulatory elements of mammalian genes. SP factors preferentially bind GC boxes, while KLFs bind CACCC boxes. Another characteristic hallmark of SP factors is the presence of the Buttonhead (BTD) box CXCPXC, just N-terminal to the zinc fingers. The function of the BTD box is unknown, but it is thought to play an important physiological role. Another feature of most SP factors is the presence of a conserved amino acid stretch, the so-called SP box, located close to the N-terminus. SP factors may be separated into three groups based on their domain architecture and the similarity of their N-terminal transactivation domains: SP1-4, SP5, and SP6-9. The transactivation domains between the three groups are not homologous to one another. SP1-4 have similar N-terminal transactivation domains characterized by glutamine-rich regions, which, in most cases, have adjacent serine/threonine-rich regions. This model represents the N-terminal domain of SP2.
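
The consensus motif NNRCRCCYY quoted above uses degenerate nucleotide codes (N is any nucleotide, R is A/G, Y is C/T). As a minimal sketch, such a motif can be expanded into a regular expression for scanning a DNA string. The names `IUPAC` and `motif_to_regex` are illustrative assumptions, only the codes mentioned in the description are covered, and the test sequence is synthetic rather than a real promoter.

```python
import re

# Degenerate codes used in the description above, plus the four plain bases.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]",     # purine: A or G
    "Y": "[CT]",     # pyrimidine: C or T
    "N": "[ACGT]",   # any nucleotide
}


def motif_to_regex(motif):
    """Expand a degenerate DNA motif into a compiled regular expression.

    Hypothetical helper; raises KeyError for codes not listed in IUPAC.
    """
    return re.compile("".join(IUPAC[base] for base in motif.upper()))


SP_KLF_CONSENSUS = motif_to_regex("NNRCRCCYY")

if __name__ == "__main__":
    print(SP_KLF_CONSENSUS.pattern)
    # Synthetic 9-mer that satisfies every position of the consensus.
    print(bool(SP_KLF_CONSENSUS.fullmatch("AAGCACCTT")))
```

A scan of a longer sequence would use `SP_KLF_CONSENSUS.finditer(sequence)`. Checking the reverse complement as well is a common design choice, since binding sites can sit on either strand.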


Pssm-ID: 411776 [Multi-domain]  Cd Length: 511  Bit Score: 52.24  E-value: 3.09e-06
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1046 ISITPPPSLQYSLPGALPISGQPLTkcihlTPQQAQEVGITltpqqaQAQGITLTLQQAQELGIPLTPQQAQALEILFTP 1125
Cdd:cd22540    44 AAVTPPAPPQPTPRKLVPIKPAPLP-----LGPGKNSIGFL------SAKGNIIQLQGSQLSSSAPGGQQVFAIQNPTMI 112
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1126 QQAQALGIPLTPQ---QTQVQGITLTPQQDQAPGISlTTQQAQKLGIPLTPQQAQA-LGIPLTPQQAQELGIPLTPQQAQ 1201
Cdd:cd22540   113 IKGSQTRSSTNQQyqiSPQIQAAGQINNSGQIQIIP-GTNQAIITPVQVLQQPQQAhKPVPIKPAPLQTSNTNSASLQVP 191
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1202 ALRVSLTPQQAQELGIPLTPQQAQALGITLTL-------------QQAQQLGIPLTP--QQAQALGITLTPKQVQELGIP 1266
Cdd:cd22540   192 GNVIKLQSGGNVALTLPVNNLVGTQDGATQLQlaaapskpskkirKKSAQAAQPAVTvaEQVETVLIETTADNIIQAGNN 271
                         250       260       270       280       290       300       310       320
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1267 LTPQQAQALGITLTPKQAQELGIPLNPQQAQTLGIPLTPKQAQALGIPFTPQQAQALGIPLTPQQAQTQEITLTPQ-QAQ 1345
Cdd:cd22540   272 LLIVQSPGTGQPAVLQQVQVLQPKQEQQVVQIPQQALRVVQAASATLPTVPQKPLQNIQIQNSEPTPTQVYIKTPSgEVQ 351
                         330       340       350       360       370       380       390       400
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1346 ALGM---PLTTQQAQELGIPLTPQHAQALGMPLTTQQAQELGIPLTPQQAQALGmpLTTQQAQELGIplTPQQAQELGIP 1422
Cdd:cd22540   352 TVLLqeaPAATATPSSSTSTVQQQVTANNGTGTSKPNYNVRKERTLPKIAPAGG--IISLNAAQLAA--AAQAIQTININ 427
                         410       420       430       440       450       460
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*
gi 224458301 1423 FTPQQAQAQEITLTPQQAQalgmpLTAQQAQELGITLTPQQAQElgipLTPQQAQALGIPLIPPQ 1487
Cdd:cd22540   428 GVQVQGVPVTITNAGGQQQ-----LTVQTVSSNNLTISGLSPTQ----IQLQMEQALEIETQPGE 483
AvrBs3 NF041308
type III secretion system effector avirulence protein AvrBs3;
1087-1646 1.65e-05

type III secretion system effector avirulence protein AvrBs3;


Pssm-ID: 469205 [Multi-domain]  Cd Length: 1179  Bit Score: 50.34  E-value: 1.65e-05
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1087 LTPQQ----AQAQGITLTLQQAQELgIPLTPQQAQALeilfTPQQAQALGIPLTPQQTQVQGITLTPQQDQAPgISLTTQ 1162
Cdd:NF041308  326 LTPEQvvaiASNDGGKQALETVQRL-LPVLCQAEHGL----TPDQVVAIASNIGGKPALETVQRLLPVLCQPP-HGLTPD 399
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1163 QAQKLGIPLTPQQAQALGIPLTPQQAQELGiPLTPQQAQALRVSLTPQQAQELGIPLTPQQAQALGITltlqQAQQLGIP 1242
Cdd:NF041308  400 QVVAIASNDGGKQALETVQRLLPVLCQAPH-GLTPDQVVAIASNDGGKQALETVQRLLPELCQAHGLT----PDQVVAIA 474
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1243 LTPQQAQALGIT--LTPKQVQeLGIPLTPQQAQALGITLTPKQAQELGIPLNPQQAQTLGiPLTPKQAQALGIPFTPQQA 1320
Cdd:NF041308  475 SNGGGKQALETVqrLLPVLCQ-PPHGLTPEQVVAIASNGGGKQALETVQRLLPVLCQPPH-GLTPEQVVAIASHDGGKQA 552
                         250       260       270       280       290       300       310       320
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1321 QALGIPLTPQQAQTQEiTLTPQQAQALGMPLTTQQA----QELgIP--------LTPQHAQALGMPLTTQQAQELGIPLT 1388
Cdd:NF041308  553 LETVHRLLPVLCQAPH-GLTPEQVVAIASHNGGKQAletvQRL-LPvlcqrpygLTPNQVVAIASNDGGKQALETVQRLL 630
                         330       340       350       360       370       380       390       400
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1389 PQQAQAlGMPLTTQQAQELGIPLTPQQAQELGIPFTPQQAQAQEiTLTPQQAQALGMPLTAQQAQELGITLTPQQAQElG 1468
Cdd:NF041308  631 PVLCQA-PHGLTPDQVVAIASNGGGKQALETVQRLLPVLCQRPH-GLTPHQVVAIASNDGGKQALETVQRLLPVLCQP-P 707
                         410       420       430       440       450       460       470       480
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1469 IPLTPQQAQALGIPLIPPQAQELGIPLTPqqaqalgILLIPPQAqelgipLTPQQAQALGIPLIPPQA----QELgIP-- 1542
Cdd:NF041308  708 YGLTPEQVVAIASNNGGKQALETVQRLLP-------VLCQRPHG------LTPDQVVAIASNDGGKQAletvQRL-LPvl 773
                         490       500       510       520       530       540       550       560
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1543 ------LTPQQVQALGIPLIPPQAQELEIPLTPQQAQALGiPLTPQQAQELGIPLTPQQAQELGIPLTPQQAQAQGipLT 1616
Cdd:NF041308  774 cqpphgLTPDQVVAIASNDGGKQALETVQRLLPVLCDAPH-GLTPHQVVAIASNIGGRQALETVQRLLPVLCQAHG--LT 850
                         570       580       590
                  ....*....|....*....|....*....|
gi 224458301 1617 PQQAQALGISLTPQQAQAQGITLTPQQAQA 1646
Cdd:NF041308  851 PDQVVAIASNNGGKQALETVQRLLPVLCQP 880
KREPA2 cd23959
Kinetoplastid RNA Editing Protein A2 (KREPA2); The KREPA2 (TbMP63) protein is a component of ...
1770-1908 1.91e-04

Kinetoplastid RNA Editing Protein A2 (KREPA2); The KREPA2 (TbMP63) protein is a component of the parasitic protozoan's KREPA RNA editing catalytic complex (RECC). Kinetoplastid RNA editing (KRE) proteins occur as pairs or sets of related proteins in multiple complexes. The KREPA complex is composed of six components (KREPA1-6), which share a conserved C-terminal region containing an oligonucleotide-binding (OB)-fold-like domain. KREPAs are responsible for the site-specific insertion and deletion of U nucleotides in the kinetoplastid mitochondrial pre-messenger RNA. Apart from the conserved C-terminal OB-fold domain, KREPA1, KREPA2, and KREPA3 contain two conserved C2H2 zinc-finger domains. KREPA2 and kinetoplastid RNA editing ligase 1 (KREL1) are specific for ligation post-U-deletion and are paralogous to KREL2 and KREPA1, which are specific for ligation post-U-insertion. KREPA2 is critical for RECC stability and KREL1 integration into the complex.


Pssm-ID: 467780 [Multi-domain]  Cd Length: 424  Bit Score: 46.40  E-value: 1.91e-04
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1770 GPFAPGKPLEMGILSePGKLGAPQTLRSSGQTLVYGGQSTSAQFPAPQAPPSP--GQLPI--SRAPPTPGQPFIAGVPPT 1845
Cdd:cd23959    97 DAFAMAPDESLGPFR-AARVPNPFSASSSTQRETHKTAQVAPPKAEPQTAPVTpfGQLPMfgQHPPPAKPLPAAAAAQQS 175
                          90       100       110       120       130       140
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*..
gi 224458301 1846 S---GQIPSLWAPLSP-GQPLVPEASSIPGDLLESGPLTFSEQLQEFQPPATAEQSPylQAPSTPGQ 1908
Cdd:cd23959   176 SaspGEVASPFASGTVsASPFATATDTAPSSGAPDGFPAEASAPSPFAAPASAASFP--AAPVANGE 240
PAT1 pfam09770
Topoisomerase II-associated protein PAT1; Members of this family are necessary for accurate ...
1292-1453 1.13e-03

Topoisomerase II-associated protein PAT1; Members of this family are necessary for accurate chromosome transmission during cell division.


Pssm-ID: 401645 [Multi-domain]  Cd Length: 846  Bit Score: 44.26  E-value: 1.13e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301  1292 NPQQAQTLGIPLTPKQAQAlgipfTPQQAQALGIPLTPQQAQTQEITLTPQQAQALGMPLTTQQAqelgiPLTPQHAQAL 1371
Cdd:pfam09770  209 KPAQQPAPAPAQPPAAPPA-----QQAQQQQQFPPQIQQQQQPQQQPQQPQQHPGQGHPVTILQR-----PQSPQPDPAQ 278
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301  1372 GMPLTTQQAQELGIPLTPQQ-AQALGMPlTTQQAQELGIPLTPQQAQElgiPFTPQQAQAQeitltpQQAQALGMPLTAQ 1450
Cdd:pfam09770  279 PSIQPQAQQFHQQPPPVPVQpTQILQNP-NRLSAARVGYPQNPQPGVQ---PAPAHQAHRQ------QGSFGRQAPIITH 348

                   ...
gi 224458301  1451 QAQ 1453
Cdd:pfam09770  349 PQQ 351
PspC_subgroup_1 NF033838
pneumococcal surface protein PspC, choline-binding form; The pneumococcal surface protein PspC, ...
639-1006 1.29e-03

pneumococcal surface protein PspC, choline-binding form; The pneumococcal surface protein PspC, as described in Streptococcus pneumoniae, is a repetitive and highly variable protein, recognized by a conserved N-terminal domain and also by genomic location. This form, subgroup 1, has variable numbers of a choline-binding repeat in the C-terminal region, and is also known as choline-binding protein A. The other form, subgroup 2, is anchored covalently after cleavage by sortase at a C-terminal LPXTG site.


Pssm-ID: 468201 [Multi-domain]  Cd Length: 684  Bit Score: 43.85  E-value: 1.29e-03
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301  639 LVKSLSRVAKETSESTRVLESpdgKSEQSnleefqeaIMAFLKQKIDNIGKAFDKKTVPKEEellKRAEAEKlgiikakm 718
Cdd:NF033838   89 LNKKLSDIKTEYLYELNVLKE---KSEAE--------LTSKTKKELDAAFEQFKKDTLEPGK---KVAEATK-------- 146
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301  719 eeyfqKVAETVtkilRKYKDTKKEEQVGEKPIKQKKVvsfmpGLHFQKSPISAKSESSTLLSYESTDPVINNLIQMILAE 798
Cdd:NF033838  147 -----KVEEAE----KKAKDQKEEDRRNYPTNTYKTL-----ELEIAESDVEVKKAELELVKEEAKEPRDEEKIKQAKAK 212
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301  799 IESERDIPT----VSTVQKDHKEKEKQRQEQYLQEGQEQMSGMSLKQQLLG--ERNLLKEHYEKISENWEEKKAWLQMKE 872
Cdd:NF033838  213 VESKKAEATrlekIKTDREKAEEEAKRRADAKLKEAVEKNVATSEQDKPKRraKRGVLGEPATPDKKENDAKSSDSSVGE 292
                         250       260       270       280       290       300       310       320
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301  873 GKQEQQSQkqwqeeemwKEEQKQATPKQAEQEEKQKQRGQEEEE---LPKSSLQRLE----EGTQKMKTQGLLLEKEngq 945
Cdd:NF033838  293 ETLPSPSL---------KPEKKVAEAEKKVEEAKKKAKDQKEEDrrnYPTNTYKTLEleiaESDVKVKEAELELVKE--- 360
                         330       340       350       360       370       380
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*
gi 224458301  946 mrqiqkEAKHlgpHRRREKGKE-KQKPERGLEDLERQIKTKDQMQMKETQPK---ELEKMVIQTP 1006
Cdd:NF033838  361 ------EAKE---PRNEEKIKQaKAKVESKKAEATRLEKIKTDRKKAEEEAKrkaAEEDKVKEKP 416
DamX COG3266
Cell division protein DamX, binds to the septal ring, contains C-terminal SPOR domain [Cell ...
1322-1622 1.36e-03

Cell division protein DamX, binds to the septal ring, contains C-terminal SPOR domain [Cell cycle control, cell division, chromosome partitioning];


Pssm-ID: 442497 [Multi-domain]  Cd Length: 455  Bit Score: 43.69  E-value: 1.36e-03
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1322 ALGIPLTPQQAQTQEITLTPQQAQALgMPLTTQQAQELGIPLTPQHAQALGMPLTTQQAQELGIPLTPQQAQALGMPLTT 1401
Cdd:COG3266    53 LLAGLLLLLIRLLSEAVDLGALASAA-LLLALASLALLGILLLALLALLLDLLLLADLLRAAALLLLKLLLLLLTLLLLV 131
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1402 QQAQELGIPLTPQQAQELGIPFTPQQAQAQEITLTPQQAQALGMPLTAQQAQELgiTLTPQQAQELGIPLTPQQAQALGI 1481
Cdd:COG3266   132 LLLLLALLLALLLDLPLLTLLIVLPLLEEQLLLLALQDIQGTLQALGAVAALLG--LRKAEEALALRAGSAAADALALLL 209
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1482 PLIPPQAQELGIPLTPQQ--AQALGILLIPPQAQELGIPLTPQQAQALgipliPPQAQELGIPLTPQQVQALGIPLIPPQ 1559
Cdd:COG3266   210 LLLASALGEAVAAAAELAalALLAAGAAEVLTARLVLLLLIIGSALKA-----PSQASSASAPATTSLGEQQEVSLPPAV 284
                         250       260       270       280       290       300
                  ....*....|....*....|....*....|....*....|....*....|....*....|...
gi 224458301 1560 AQELEiPLTPQQAQALGIPLTPQQAQELGIPLTPQQAQELGIPLTPQQAQAQGIPlTPQQAQA 1622
Cdd:COG3266   285 AAQPA-AAAAAQPSAVALPAAPAAAAAAAAPAEAAAPQPTAAKPVVTETAAPAAP-APEAAAA 345
SMC_prok_A TIGR02169
chromosome segregation protein SMC, primarily archaeal type; SMC (structural maintenance of ...
817-1000 4.66e-03

chromosome segregation protein SMC, primarily archaeal type; SMC (structural maintenance of chromosomes) proteins bind DNA and act in organizing and segregating chromosomes for partition. SMC proteins are found in bacteria, archaea, and eukaryotes. SMC is found in a single copy and is homodimeric in prokaryotes, but six paralogs (excluded from this family) are found in eukaryotes, where SMC proteins are heterodimeric. This family represents the SMC protein of archaea and a few bacteria (Aquifex, Synechocystis, etc.); the SMC of other bacteria is described by TIGR02168. The N- and C-terminal domains of this protein are well conserved, but the central hinge region is skewed in composition and highly divergent. [Cellular processes, Cell division, DNA metabolism, Chromosome-associated proteins]


Pssm-ID: 274009 [Multi-domain]  Cd Length: 1164  Bit Score: 42.36  E-value: 4.66e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301   817 EKEKQRQEqyLQEGQEQMSGMSLK--------QQLLGERNLlKEHYEKISENWEEKKAWLQMKEGKQEqqsqkqwqeeem 888
Cdd:TIGR02169  171 KKEKALEE--LEEVEENIERLDLIidekrqqlERLRREREK-AERYQALLKEKREYEGYELLKEKEAL------------ 235
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301   889 wkEEQKQATPKQ-AEQEEKQKQRGQEEEELPK---SSLQRLEEGTQKMKtqglllEKENGQMRQIQKEAKHLGPHRRREK 964
Cdd:TIGR02169  236 --ERQKEAIERQlASLEEELEKLTEEISELEKrleEIEQLLEELNKKIK------DLGEEEQLRVKEKIGELEAEIASLE 307
                          170       180       190
                   ....*....|....*....|....*....|....*..
gi 224458301   965 GKEKQKpERGLEDLERQI-KTKDQMQMKETQPKELEK 1000
Cdd:TIGR02169  308 RSIAEK-ERELEDAEERLaKLEAEIDKLLAEIEELER 343
 
Name Accession Description Interval E-value
PTZ00121 PTZ00121
MAEBL; Provisional
381-1042 3.48e-14

MAEBL; Provisional


Pssm-ID: 173412 [Multi-domain]  Cd Length: 2084  Bit Score: 79.03  E-value: 3.48e-14
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301  381 ELENIVDEVQRKETkdsgIKWDSTISYTAQAERT-PDLTELRQQPVASEDISEDSTKDNVSLKKGDFYQEDETDEYQSWK 459
Cdd:PTZ00121 1221 EDAKKAEAVKKAEE----AKKDAEEAKKAEEERNnEEIRKFEEARMAHFARRQAAIKAEEARKADELKKAEEKKKADEAK 1296
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301  460 RSH--KKATYVYETSGPNLSDNKSGQKVSEAKPsqyyELQVLKKKRKEMKSFSEDKsKSPTEAKRKHLSLTETKSQGGKS 537
Cdd:PTZ00121 1297 KAEekKKADEAKKKAEEAKKADEAKKKAEEAKK----KADAAKKKAEEAKKAAEAA-KAEAEAAADEAEAAEEKAEAAEK 1371
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301  538 GTSmmmlEQFRK---VKRESPFDKRPTAAEIKVEPTTESLD--KEGKGEIRSLVEPLSMIQFDDTAEPQKGKIKGKKHHI 612
Cdd:PTZ00121 1372 KKE----EAKKKadaAKKKAEEKKKADEAKKKAEEDKKKADelKKAAAAKKKADEAKKKAEEKKKADEAKKKAEEAKKAD 1447
                         250       260       270       280       290       300       310       320
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301  613 SSGTITSKEEKTEEKEELTKQVKSHQLVKSLSRVAKETSESTRVLESPDGKSEQsnLEEFQEAimaflKQKIDNIGKAFD 692
Cdd:PTZ00121 1448 EAKKKAEEAKKAEEAKKKAEEAKKADEAKKKAEEAKKADEAKKKAEEAKKKADE--AKKAAEA-----KKKADEAKKAEE 1520
                         330       340       350       360       370       380       390       400
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301  693 KKtvpKEEELLKRAEAEKLGiiKAKMEEYFQKVAEtvtkiLRKYKDTKKEEQVgeKPIKQKKVVSFMPGLHFQKSPISAK 772
Cdd:PTZ00121 1521 AK---KADEAKKAEEAKKAD--EAKKAEEKKKADE-----LKKAEELKKAEEK--KKAEEAKKAEEDKNMALRKAEEAKK 1588
                         410       420       430       440       450       460       470       480
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301  773 SESSTLLSYESTDPVINNLIQMILAEIESERdiPTVSTVQKDHKEKEKQRQEQYLQEgQEQMSGMSLKQQllGERNLLKE 852
Cdd:PTZ00121 1589 AEEARIEEVMKLYEEEKKMKAEEAKKAEEAK--IKAEELKKAEEEKKKVEQLKKKEA-EEKKKAEELKKA--EEENKIKA 1663
                         490       500       510       520       530       540       550       560
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301  853 HYEKISENWEEKKAwlqmKEGKQEQQSQKQWQEEEMWKEEQKQATPKQAEQEEKQKQRGQE---EEELPKSSLQRLEEGT 929
Cdd:PTZ00121 1664 AEEAKKAEEDKKKA----EEAKKAEEDEKKAAEALKKEAEEAKKAEELKKKEAEEKKKAEElkkAEEENKIKAEEAKKEA 1739
                         570       580       590       600       610       620       630       640
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301  930 QKMKTQGLLLEKENGQMRQIQ---KEAKHLGPHRRREKG---KE--KQKPERGLEDLERQIK-TKDQMQMKETQPKELEK 1000
Cdd:PTZ00121 1740 EEDKKKAEEAKKDEEEKKKIAhlkKEEEKKAEEIRKEKEaviEEelDEEDEKRRMEVDKKIKdIFDNFANIIEGGKEGNL 1819
                         650       660       670       680
                  ....*....|....*....|....*....|....*....|....
gi 224458301 1001 MVIQTPMTLSPRWKSVL--KDVQRSyEGKEFQRNLKTLENLPDE 1042
Cdd:PTZ00121 1820 VINDSKEMEDSAIKEVAdsKNMQLE-EADAFEKHKFNKNNENGE 1862
PHA03247 PHA03247
large tegument protein UL36; Provisional
1602-1953 3.30e-09

large tegument protein UL36; Provisional


Pssm-ID: 223021 [Multi-domain]  Cd Length: 3151  Bit Score: 62.65  E-value: 3.30e-09
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1602 PLTPQQAQAQGIPlTPQQAQAlgISLTPQQAQAQGITLTPQQAQalgvPITPVN-----AWVSAVTLTSEQTHALESPMN 1676
Cdd:PHA03247 2557 PAAPPAAPDRSVP-PPRPAPR--PSEPAVTSRARRPDAPPQSAR----PRAPVDdrgdpRGPAPPSPLPPDTHAPDPPPP 2629
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1677 LEQAQEQLLKLGVPLTLDKAHTLGSPLTLKQVQWSHRPFQKSKASLPTG--QSIISRLSPSLRLSLASSA--PTAEKSSi 1752
Cdd:PHA03247 2630 SPSPAANEPDPHPPPTVPPPERPRDDPAPGRVSRPRRARRLGRAAQASSppQRPRRRAARPTVGSLTSLAdpPPPPPTP- 2708
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1753 fgVSSTPLQISRVPLNQGPFAPGKplemgilSEPGKLGAPQTLRSSGQTLVYGGQSTSAQFPAPQAPPSPG--QLPISRA 1830
Cdd:PHA03247 2709 --EPAPHALVSATPLPPGPAAARQ-------ASPALPAAPAPPAVPAGPATPGGPARPARPPTTAGPPAPAppAAPAAGP 2779
                         250       260       270       280       290       300       310       320
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1831 PPTPGQPFIAGVPPTSGQIPSLWAPLSPGQPLVPEASSIPGDLLESGPLTfseqlqefqPPATAEQ--SPYLQAPSTPGQ 1908
Cdd:PHA03247 2780 PRRLTRPAVASLSESRESLPSPWDPADPPAAVLAPAAALPPAASPAGPLP---------PPTSAQPtaPPPPPGPPPPSL 2850
                         330       340       350       360
                  ....*....|....*....|....*....|....*....|....*
gi 224458301 1909 HLATWTLPGrasslwiPPTSRHPPTLWPSPAPGKPQKSWSPSVAK 1953
Cdd:PHA03247 2851 PLGGSVAPG-------GDVRRRPPSRSPAAKPAAPARPPVRRLAR 2888
PHA03247 PHA03247
large tegument protein UL36; Provisional
1422-1937 4.57e-09

large tegument protein UL36; Provisional


Pssm-ID: 223021 [Multi-domain]  Cd Length: 3151  Bit Score: 62.26  E-value: 4.57e-09
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1422 PFTPQQAQAQEITlTPQQAQALGMPLTAQQAQELGITLTPQQAQELGIPLTPQQAQALGIPLiPPQAQELGIPltPQQAQ 1501
Cdd:PHA03247 2557 PAAPPAAPDRSVP-PPRPAPRPSEPAVTSRARRPDAPPQSARPRAPVDDRGDPRGPAPPSPL-PPDTHAPDPP--PPSPS 2632
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1502 ALGILLIPPQAQELGIPLTPQQAQALGIPLIPPQAQELGIPLTPQQvqalgiPLIPPQAQELEIPLTPQQAQALGIPLTP 1581
Cdd:PHA03247 2633 PAANEPDPHPPPTVPPPERPRDDPAPGRVSRPRRARRLGRAAQASS------PPQRPRRRAARPTVGSLTSLADPPPPPP 2706
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1582 QqaqelgiPLTPQQAQELGIPLTPQQAQAQGI-------PLTPQQAQALGISLTPQQAQAQGITLTPQQAQALGVPIT-- 1652
Cdd:PHA03247 2707 T-------PEPAPHALVSATPLPPGPAAARQAspalpaaPAPPAVPAGPATPGGPARPARPPTTAGPPAPAPPAAPAAgp 2779
                         250       260       270       280       290       300       310       320
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1653 PVNAWVSAVTLTSEQTHALESPMNLEQAQEQLL--KLGVPLTLDKAHTLGSPLTLKQVQwSHRPFQKSKASLPTGQSI-- 1728
Cdd:PHA03247 2780 PRRLTRPAVASLSESRESLPSPWDPADPPAAVLapAAALPPAASPAGPLPPPTSAQPTA-PPPPPGPPPPSLPLGGSVap 2858
                         330       340       350       360       370       380       390       400
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1729 ---ISRLSPSLRLSLASSAPTAEKSSifGVSSTPLQISRVPLNQGPFAPGKPlemgilSEPGKLGAPQTlrssgqtlvyg 1805
Cdd:PHA03247 2859 ggdVRRRPPSRSPAAKPAAPARPPVR--RLARPAVSRSTESFALPPDQPERP------PQPQAPPPPQP----------- 2919
                         410       420       430       440       450       460       470       480
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1806 gQSTSAQFPAPQAPPSPGQLPISRAPPTPGQpfiAGVPPTSGQIPSLW-APLSPGQPLVPeassipgdllesgpltfseq 1884
Cdd:PHA03247 2920 -QPQPPPPPQPQPPPPPPPRPQPPLAPTTDP---AGAGEPSGAVPQPWlGALVPGRVAVP-------------------- 2975
                         490       500       510       520       530
                  ....*....|....*....|....*....|....*....|....*....|....*...
gi 224458301 1885 lqEFQPPATAEQSPYLQAPSTPGQHLATWTLPGRASSLWIPPTSRHPP-----TLWPS 1937
Cdd:PHA03247 2976 --RFRVPQPAPSREAPASSTPPLTGHSLSRVSSWASSLALHEETDPPPvslkqTLWPP 3031
Atrophin-1 pfam03154
Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian ...
1512-1940 5.64e-09

Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian atrophy (DRPLA) gene. DRPLA (OMIM:125370) is a progressive neurodegenerative disorder. It is caused by the expansion of a CAG repeat in the DRPLA gene on chromosome 12p. This results in an extended polyglutamine region in atrophin-1, which is thought to confer toxicity to the protein, possibly through altering its interactions with other proteins. The expansion of a CAG repeat is also the underlying defect in six other neurodegenerative disorders, including Huntington's disease. One interaction of expanded polyglutamine repeats that is thought to be pathogenic is that with the short glutamine repeat in the transcriptional coactivator CREB binding protein, CBP. This interaction draws CBP away from its usual nuclear location to the expanded polyglutamine repeat protein aggregates that are characteristic of the polyglutamine neurodegenerative disorders. This interferes with CBP-mediated transcription and causes cytotoxicity.


Pssm-ID: 460830 [Multi-domain]  Cd Length: 991  Bit Score: 61.71  E-value: 5.64e-09
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301  1512 AQELGIPLTPQQAQALGIPLIPPQAQELGIPLTPQQVQALGIPLIPPQAQELEIPLtPQQAQALGIPLTPQQAqelGIPL 1591
Cdd:pfam03154  162 AQQQILQTQPPVLQAQSGAASPPSPPPPGTTQAATAGPTPSAPSVPPQGSPATSQP-PNQTQSTAAPHTLIQQ---TPTL 237
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301  1592 TPQQ-------AQELGIPLTPQQAQAQG---------IPLTPQQAQAlGISLTPQQAQAQGITLTPQQAQAlGVPITPVN 1655
Cdd:pfam03154  238 HPQRlpsphppLQPMTQPPPPSQVSPQPlpqpslhgqMPPMPHSLQT-GPSHMQHPVPPQPFPLTPQSSQS-QVPPGPSP 315
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301  1656 AWVSAVTLTSEQTHALESPMNLEQAQEQLLKlgvPLTLDKAHTLGSPLT-LKQVQWSHRPFQKSKASLPTGQSIISRLSP 1734
Cdd:pfam03154  316 AAPGQSQQRIHTPPSQSQLQSQQPPREQPLP---PAPLSMPHIKPPPTTpIPQLPNPQSHKHPPHLSGPSPFQMNSNLPP 392
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301  1735 SLRLSLASSAPTAEKSSifgVSSTPLQIsrvpLNQGPFAPGKPLEMGILSepgklgapqtlrssgqtlvyggQSTSAQFP 1814
Cdd:pfam03154  393 PPALKPLSSLSTHHPPS---AHPPPLQL----MPQSQQLPPPPAQPPVLT----------------------QSQSLPPP 443
                          330       340       350       360       370       380       390       400
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301  1815 APQAPPSPGQLPISRAPPTPGQPFIAGVPPTSgqipslwapLSPGQPLVPEASSIPGdllesgpltfseqlqeFQPPATA 1894
Cdd:pfam03154  444 AASHPPTSGLHQVPSQSPFPQHPFVPGGPPPI---------TPPSGPPTSTSSAMPG----------------IQPPSSA 498
                          410       420       430       440
                   ....*....|....*....|....*....|....*....|....*.
gi 224458301  1895 EQSPYLQAPSTPGQHLATWTLPGRASSLWIPPTSRHPPTLWPSPAP 1940
Cdd:pfam03154  499 SVSSSGPVPAAVSCPLPPVQIKEEALDEAEEPESPPPPPRSPSPEP 544
PHA03379 PHA03379
EBNA-3A; Provisional
1216-1621 9.16e-09

EBNA-3A; Provisional


Pssm-ID: 223066 [Multi-domain]  Cd Length: 935  Bit Score: 61.23  E-value: 9.16e-09
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1216 GIPLTPQQAQALGITLTLQQAQQLGIPLTPQQAQALGITLTPKQVQELGIPLTPQQAQALGITLTPKQAQELGIPLNPQQ 1295
Cdd:PHA03379  414 GTPRPPVEKPRPEVPQSLETATSHGSAQVPEPPPVHDLEPGPLHDQHSMAPCPVAQLPPGPLQDLEPGDQLPGVVQDGRP 493
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1296 AQT--------LGIPLTPKQAQALGIPFTPQQAQALGIPLTPQQAQTQEITLTPQQAQAL----GMPLTTQQAQE--LGI 1361
Cdd:PHA03379  494 ACApvpapagpIVRPWEASLSQVPGVAFAPVMPQPMPVEPVPVPTVALERPVCPAPPLIAmqgpGETSGIVRVRErwRPA 573
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1362 PLTPQHAQALG-MPLTTQQAQeLGIPLTPQQAQALGMPLTTQQA---QELGIPLTPQQAQELGIPFTPQQAQAQEITLTP 1437
Cdd:PHA03379  574 PWTPNPPRSPSqMSVRDRLAR-LRAEAQPYQASVEVQPPQLTQVspqQPMEYPLEPEQQMFPGSPFSQVADVMRAGGVPA 652
                         250       260       270       280       290       300       310       320
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1438 QQAQALGMPLtaqqaqelgitltpQQAQELGIPLTPQQAQALGIPLIPP-QAQELGIPLTPQQAQALGILLIPPQAQELG 1516
Cdd:PHA03379  653 MQPQYFDLPL--------------QQPISQGAPLAPLRASMGPVPPVPAtQPQYFDIPLTEPINQGASAAHFLPQQPMEG 718
                         330       340       350       360       370       380       390       400
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1517 iPLTPQQAQALGIPLIP------PQAQELGIPLTpqQVQALGIPLIP-PQAQELEIPLTPQQAQALGIPLTPqqaqelGI 1589
Cdd:PHA03379  719 -PLVPERWMFQGATLSQsvrpgvAQSQYFDLPLT--QPINHGAPAAHfLHQPPMEGPWVPEQWMFQGAPPSQ------GT 789
                         410       420       430
                  ....*....|....*....|....*....|..
gi 224458301 1590 PLTPQQAQELGIPltPQQAQAQGIPLTPQQAQ 1621
Cdd:PHA03379  790 DVVQHQLDALGYV--LHVLNHPGVPVSPAVNQ 819
PHA03247 PHA03247
large tegument protein UL36; Provisional
1362-1679 2.55e-07

large tegument protein UL36; Provisional


Pssm-ID: 223021 [Multi-domain]  Cd Length: 3151  Bit Score: 56.49  E-value: 2.55e-07
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1362 PLTPQHAQALGMPLTTQQAQELGI-------PLTPQQAQALGMPLTTQQAQELGIPLTPQQAQELGIPFTPQQAQAQEIT 1434
Cdd:PHA03247 2708 PEPAPHALVSATPLPPGPAAARQAspalpaaPAPPAVPAGPATPGGPARPARPPTTAGPPAPAPPAAPAAGPPRRLTRPA 2787
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1435 LTPQQAQALGMPLTAQQAQELGITLTPQQAqelgipLTPQQAQALGIPlIPPQAQELGIPLTPQqaqalgilliPPQAqe 1514
Cdd:PHA03247 2788 VASLSESRESLPSPWDPADPPAAVLAPAAA------LPPAASPAGPLP-PPTSAQPTAPPPPPG----------PPPP-- 2848
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1515 lgiPLTPQQAQALGIPLI--PPQAQELGIPLTPQQVQALGIPLIPPQAQELEIPLTPQQAQALGIPLTPQQAQELgiPLT 1592
Cdd:PHA03247 2849 ---SLPLGGSVAPGGDVRrrPPSRSPAAKPAAPARPPVRRLARPAVSRSTESFALPPDQPERPPQPQAPPPPQPQ--PQP 2923
                         250       260       270       280       290       300       310       320
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1593 PQQAQELGIPLTPQQAQAQGIPLTPQQA--QALGISLTPQQ-AQAQGI-----TLTPQQAQALGVPITP----------- 1653
Cdd:PHA03247 2924 PPPPQPQPPPPPPPRPQPPLAPTTDPAGagEPSGAVPQPWLgALVPGRvavprFRVPQPAPSREAPASStppltghslsr 3003
                         330       340
                  ....*....|....*....|....*.
gi 224458301 1654 VNAWVSAVTLTSEQTHAlesPMNLEQ 1679
Cdd:PHA03247 3004 VSSWASSLALHEETDPP---PVSLKQ 3026
SMC_N pfam02463
RecF/RecN/SMC N terminal domain; This domain is found at the N terminus of SMC proteins. The ...
291-1039 2.55e-07

RecF/RecN/SMC N terminal domain; This domain is found at the N terminus of SMC proteins. The SMC (structural maintenance of chromosomes) superfamily proteins have ATP-binding domains at the N- and C-termini, and two extended coiled-coil domains separated by a hinge in the middle. The eukaryotic SMC proteins form two kinds of heterodimers: the SMC1/SMC3 and the SMC2/SMC4 types. These heterodimers constitute an essential part of higher-order complexes, which are involved in chromatin and DNA dynamics. This family also includes the RecF and RecN proteins that are involved in DNA metabolism and recombination.


Pssm-ID: 426784 [Multi-domain]  Cd Length: 1161  Bit Score: 56.52  E-value: 2.55e-07
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301   291 AHETSEAEKELSLKIIRDLSNENEMLQQKLQDAEEKCEQLIRSKIVIEQLYAKLSTSSTLKVLPGPSPQSSRAIIKVGDT 370
Cdd:pfam02463  254 ESSKQEIEKEEEKLAQVLKENKEEEKEKKLQEEELKLLAKEEEELKSELLKLERRKVDDEEKLKESEKEKKKAEKELKKE 333
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301   371 EDNMDN---------ILDKELENIVDEVQRKETKDSGIKWDSTISYTAQAERTPDLTELRQQPVASEDISEDSTKDNVSL 441
Cdd:pfam02463  334 KEEIEElekelkeleIKREAEEEEEEELEKLQEKLEQLEEELLAKKKLESERLSSAAKLKEEELELKSEEEKEAQLLLEL 413
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301   442 KKGDFYQEDETDEYQSWKRSHKKatyvyetsgpNLSDNKSGQKVSEAKPSQYYELQVLKKKRKEMKSFSEDKSKSPTEAK 521
Cdd:pfam02463  414 ARQLEDLLKEEKKEELEILEEEE----------ESIELKQGKLTEEKEELEKQELKLLKDELELKKSEDLLKETQLVKLQ 483
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301   522 RKHLSLTETKSQGG---KSGTSMMMLEQFRKVKRESPFDKRPTAAEIKVEPTTESLDKEGKGEIRSLVEPLSMIQFDDTA 598
Cdd:pfam02463  484 EQLELLLSRQKLEErsqKESKARSGLKVLLALIKDGVGGRIISAHGRLGDLGVAVENYKVAISTAVIVEVSATADEVEER 563
                          330       340       350       360       370       380       390       400
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301   599 EP-QKGKIKGKKHHISSGTITSKEEKTEEKEELTKQVKSHQLVKSLSRVAKETSESTRVLESPDG---KSEQSNLEEFQE 674
Cdd:pfam02463  564 QKlVRALTELPLGARKLRLLIPKLKLPLKSIAVLEIDPILNLAQLDKATLEADEDDKRAKVVEGIlkdTELTKLKESAKA 643
                          410       420       430       440       450       460       470       480
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301   675 AIMAFLKQKIDNIGKAFDKKTVPKEEELLKRAEAEKLGIIKAKMEEYFQKVAETVTKILRKYKDTKKEEQVgekpIKQKK 754
Cdd:pfam02463  644 KESGLRKGVSLEEGLAEKSEVKASLSELTKELLEIQELQEKAESELAKEEILRRQLEIKKKEQREKEELKK----LKLEA 719
                          490       500       510       520       530       540       550       560
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301   755 VVSFMPGLHFQKSPISaksesstllsyestdpVINNLIQMILAEIESERDIPTVSTVQKDHKEKEKQRQEQYLQEGQEQm 834
Cdd:pfam02463  720 EELLADRVQEAQDKIN----------------EELKLLKQKIDEEEEEEEKSRLKKEEKEEEKSELSLKEKELAEEREK- 782
                          570       580       590       600       610       620       630       640
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301   835 sgmslkqQLLGERNLLKEHYEKI---SENWEEKKAWLQMKEGKQEQqsqkqwqeeemwkEEQKQATPKQAEQEEKQKQRG 911
Cdd:pfam02463  783 -------TEKLKVEEEKEEKLKAqeeELRALEEELKEEAELLEEEQ-------------LLIEQEEKIKEEELEELALEL 842
                          650       660       670       680       690       700       710       720
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301   912 QEEEELPKSSLQRLEEgtQKMKTQGLLLEKENGQMRQIQKEAKHlgphrrREKGKEKQKPERGLEDLERQIKTKDQMQMK 991
Cdd:pfam02463  843 KEEQKLEKLAEEELER--LEEEITKEELLQELLLKEEELEEQKL------KDELESKEEKEKEEKKELEEESQKLNLLEE 914
                          730       740       750       760
                   ....*....|....*....|....*....|....*....|....*...
gi 224458301   992 ETQPKELEKMVIQTPMTLSPRWKSVLKDVQRSYEGKEFQrNLKTLENL 1039
Cdd:pfam02463  915 KENEIEERIKEEAEILLKYEEEPEELLLEEADEKEKEEN-NKEEEEER 961
PHA03247 PHA03247
large tegument protein UL36; Provisional
1554-1956 4.11e-07

large tegument protein UL36; Provisional


Pssm-ID: 223021 [Multi-domain]  Cd Length: 3151  Bit Score: 56.10  E-value: 4.11e-07
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1554 PLIPPQAQELEIPlTPQQAQALGIPLTPQQAQELGIPLTPQQAQELGIPLTPQQAQAQGIPLTPQQAQAlgiSLTPQQAQ 1633
Cdd:PHA03247 2557 PAAPPAAPDRSVP-PPRPAPRPSEPAVTSRARRPDAPPQSARPRAPVDDRGDPRGPAPPSPLPPDTHAP---DPPPPSPS 2632
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1634 AQGITLTPQQAQALGVPITPVNAWVSAVTLTSEQTHALESPMNLEQAQEQLLKLGVPLTLDKAHTLGSPLTLKQVQWSHR 1713
Cdd:PHA03247 2633 PAANEPDPHPPPTVPPPERPRDDPAPGRVSRPRRARRLGRAAQASSPPQRPRRRAARPTVGSLTSLADPPPPPPTPEPAP 2712
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1714 PFQKSKASLPTGQSIISRLSPSLRLSLASSAPTAEKSsifgVSSTPLQISRVPLNQGPFAPGKPlemgilSEPGKLGAPQ 1793
Cdd:PHA03247 2713 HALVSATPLPPGPAAARQASPALPAAPAPPAVPAGPA----TPGGPARPARPPTTAGPPAPAPP------AAPAAGPPRR 2782
                         250       260       270       280       290       300       310       320
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1794 TLRSSGQTLVYGGQS---------------------TSAQFPAPQAPPSPGQLPIsrAPPTPGQPFIAGVPPTSGQIP-- 1850
Cdd:PHA03247 2783 LTRPAVASLSESRESlpspwdpadppaavlapaaalPPAASPAGPLPPPTSAQPT--APPPPPGPPPPSLPLGGSVAPgg 2860
                         330       340       350       360       370       380       390       400
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1851 -------SLWAPLSPGQPLVPEASSIPGDLLESGPLTFSeqlqefQPPATAEQSPYLQAPSTPgqhLATWTLPGRASSLW 1923
Cdd:PHA03247 2861 dvrrrppSRSPAAKPAAPARPPVRRLARPAVSRSTESFA------LPPDQPERPPQPQAPPPP---QPQPQPPPPPQPQP 2931
                         410       420       430
                  ....*....|....*....|....*....|....
gi 224458301 1924 IPPTS-RHPPTLWPSPAPgKPQKSWSPSVAKKRL 1956
Cdd:PHA03247 2932 PPPPPpRPQPPLAPTTDP-AGAGEPSGAVPQPWL 2964
PHA03247 PHA03247
large tegument protein UL36; Provisional
1509-1952 1.03e-06

large tegument protein UL36; Provisional


Pssm-ID: 223021 [Multi-domain]  Cd Length: 3151  Bit Score: 54.56  E-value: 1.03e-06
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1509 PPQAQELGIPlTPQQAQALGIPLIPPQAQELGIPLTPQQVQALGIPLIPPQAQELEIPLTPQQAQALGIPLTPQ-QAQEL 1587
Cdd:PHA03247 2560 PPAAPDRSVP-PPRPAPRPSEPAVTSRARRPDAPPQSARPRAPVDDRGDPRGPAPPSPLPPDTHAPDPPPPSPSpAANEP 2638
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1588 GIPLTPQQAQELGIPLTPQQAQAQGIPLTPQQAQALGISLTPQQAQAQGITLTPQQAQALGVPITPVNAWVSAVTLTSEQ 1667
Cdd:PHA03247 2639 DPHPPPTVPPPERPRDDPAPGRVSRPRRARRLGRAAQASSPPQRPRRRAARPTVGSLTSLADPPPPPPTPEPAPHALVSA 2718
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1668 THALESPMNLEQAQEQL-LKLGVPLTLDKAHTLGSPLTLKQVQWSHRPFQKSKASLPTG-------QSIISRLSPSLRLS 1739
Cdd:PHA03247 2719 TPLPPGPAAARQASPALpAAPAPPAVPAGPATPGGPARPARPPTTAGPPAPAPPAAPAAgpprrltRPAVASLSESRESL 2798
                         250       260       270       280       290       300       310       320
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1740 LASSAPTAEKSSIFGVSSTplqisrVPLNQGPFAPGKPLEMGILSEPGKLGAP-QTLRSSGQTLVYGG--QSTSAQFPAP 1816
Cdd:PHA03247 2799 PSPWDPADPPAAVLAPAAA------LPPAASPAGPLPPPTSAQPTAPPPPPGPpPPSLPLGGSVAPGGdvRRRPPSRSPA 2872
                         330       340       350       360       370       380       390       400
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1817 QAPPSPGQLPISR----APPTPGQPFiAGVPPTSGQIPSLWAPLSPGQPLVPEASSIPGDLLESGPLTFSEQLQEFQPPA 1892
Cdd:PHA03247 2873 AKPAAPARPPVRRlarpAVSRSTESF-ALPPDQPERPPQPQAPPPPQPQPQPPPPPQPQPPPPPPPRPQPPLAPTTDPAG 2951
                         410       420       430       440       450       460
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*
gi 224458301 1893 TAEQSPYLQAPS----TPGQHLATWTL-PGRASSLWIPPTSRHPPTLWPSPApgkpQKSWSPSVA 1952
Cdd:PHA03247 2952 AGEPSGAVPQPWlgalVPGRVAVPRFRvPQPAPSREAPASSTPPLTGHSLSR----VSSWASSLA 3012
SP2_N cd22540
N-terminal domain of transcription factor Specificity Protein (SP) 2; Specificity Proteins ...
1046-1487 3.09e-06

N-terminal domain of transcription factor Specificity Protein (SP) 2; Specificity Proteins (SPs) are transcription factors that are involved in many cellular processes, including cell differentiation, cell growth, apoptosis, immune responses, response to DNA damage, and chromatin remodeling. SP2 contains the least conserved DNA-binding domain within the SP subfamily of proteins, and its DNA sequence specificity differs from the other SP proteins. It localizes primarily within subnuclear foci associated with the nuclear matrix, and can activate, or in some cases, repress expression from different promoters. The transcription factor SP2 serves as a paradigm for indirect genomic binding. It does not require its DNA-binding domain for genomic DNA binding and occupies target promoters independently of whether they contain a cognate DNA-binding motif. SP2 belongs to a family of proteins, called the SP/Kruppel or Krueppel-like Factor (KLF) family, characterized by a C-terminal DNA-binding domain of 81 amino acids consisting of three Kruppel-like C2H2 zinc fingers. These factors bind to a loose consensus motif, namely NNRCRCCYY (where N is any nucleotide; R is A/G, and Y is C/T), such as the recurring motifs in GC and GT boxes (5'-GGGGCGGGG-3' and 5'-GGTGTGGGG-3') that are present in promoters and more distal regulatory elements of mammalian genes. SP factors preferentially bind GC boxes, while KLFs bind CACCC boxes. Another characteristic hallmark of SP factors is the presence of the Buttonhead (BTD) box CXCPXC, just N-terminal to the zinc fingers. The function of the BTD box is unknown, but it is thought to play an important physiological role. Another feature of most SP factors is the presence of a conserved amino acid stretch, the so-called SP box, located close to the N-terminus. SP factors may be separated into three groups based on their domain architecture and the similarity of their N-terminal transactivation domains: SP1-4, SP5, and SP6-9. The transactivation domains between the three groups are not homologous to one another. SP1-4 have similar N-terminal transactivation domains characterized by glutamine-rich regions, which, in most cases, have adjacent serine/threonine-rich regions. This model represents the N-terminal domain of SP2.


Pssm-ID: 411776 [Multi-domain]  Cd Length: 511  Bit Score: 52.24  E-value: 3.09e-06
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1046 ISITPPPSLQYSLPGALPISGQPLTkcihlTPQQAQEVGITltpqqaQAQGITLTLQQAQELGIPLTPQQAQALEILFTP 1125
Cdd:cd22540    44 AAVTPPAPPQPTPRKLVPIKPAPLP-----LGPGKNSIGFL------SAKGNIIQLQGSQLSSSAPGGQQVFAIQNPTMI 112
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1126 QQAQALGIPLTPQ---QTQVQGITLTPQQDQAPGISlTTQQAQKLGIPLTPQQAQA-LGIPLTPQQAQELGIPLTPQQAQ 1201
Cdd:cd22540   113 IKGSQTRSSTNQQyqiSPQIQAAGQINNSGQIQIIP-GTNQAIITPVQVLQQPQQAhKPVPIKPAPLQTSNTNSASLQVP 191
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1202 ALRVSLTPQQAQELGIPLTPQQAQALGITLTL-------------QQAQQLGIPLTP--QQAQALGITLTPKQVQELGIP 1266
Cdd:cd22540   192 GNVIKLQSGGNVALTLPVNNLVGTQDGATQLQlaaapskpskkirKKSAQAAQPAVTvaEQVETVLIETTADNIIQAGNN 271
                         250       260       270       280       290       300       310       320
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1267 LTPQQAQALGITLTPKQAQELGIPLNPQQAQTLGIPLTPKQAQALGIPFTPQQAQALGIPLTPQQAQTQEITLTPQ-QAQ 1345
Cdd:cd22540   272 LLIVQSPGTGQPAVLQQVQVLQPKQEQQVVQIPQQALRVVQAASATLPTVPQKPLQNIQIQNSEPTPTQVYIKTPSgEVQ 351
                         330       340       350       360       370       380       390       400
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1346 ALGM---PLTTQQAQELGIPLTPQHAQALGMPLTTQQAQELGIPLTPQQAQALGmpLTTQQAQELGIplTPQQAQELGIP 1422
Cdd:cd22540   352 TVLLqeaPAATATPSSSTSTVQQQVTANNGTGTSKPNYNVRKERTLPKIAPAGG--IISLNAAQLAA--AAQAIQTININ 427
                         410       420       430       440       450       460
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*
gi 224458301 1423 FTPQQAQAQEITLTPQQAQalgmpLTAQQAQELGITLTPQQAQElgipLTPQQAQALGIPLIPPQ 1487
Cdd:cd22540   428 GVQVQGVPVTITNAGGQQQ-----LTVQTVSSNNLTISGLSPTQ----IQLQMEQALEIETQPGE 483
PHA03379 PHA03379
EBNA-3A; Provisional
1108-1525 8.20e-06

EBNA-3A; Provisional


Pssm-ID: 223066 [Multi-domain]  Cd Length: 935  Bit Score: 51.21  E-value: 8.20e-06
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1108 GIPLTPQQAQALEILFTPQQAQALGIPLTPQQTQVQGITLTPQQDQAPGISLTTQQAQKLGIPLTPQQAQALGIPLT--- 1184
Cdd:PHA03379  414 GTPRPPVEKPRPEVPQSLETATSHGSAQVPEPPPVHDLEPGPLHDQHSMAPCPVAQLPPGPLQDLEPGDQLPGVVQDgrp 493
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1185 -----PQQAQELGIPLTPQQAQALRVSLTPQQAQELGIPLTPQQAQALgitltlqqaQQLGIPLTPQQA-QALGITLTPK 1258
Cdd:PHA03379  494 acapvPAPAGPIVRPWEASLSQVPGVAFAPVMPQPMPVEPVPVPTVAL---------ERPVCPAPPLIAmQGPGETSGIV 564
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1259 QVQE--LGIPLTPQQAQAL-------GITLTPKQAQELGIPLNPQQAQtlgIPLTPKQaQALGIPFTPQQAQALGIPLTP 1329
Cdd:PHA03379  565 RVRErwRPAPWTPNPPRSPsqmsvrdRLARLRAEAQPYQASVEVQPPQ---LTQVSPQ-QPMEYPLEPEQQMFPGSPFSQ 640
                         250       260       270       280       290       300       310       320
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1330 QQAQTQEITLTPQQAQALGMPLttQQAQELGIPLTPQHAQALGM-PLTTQQAQELGIPLTPQQAQALGMPLTTQQAQELG 1408
Cdd:PHA03379  641 VADVMRAGGVPAMQPQYFDLPL--QQPISQGAPLAPLRASMGPVpPVPATQPQYFDIPLTEPINQGASAAHFLPQQPMEG 718
                         330       340       350       360       370       380       390       400
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1409 iPLTPQQAqelgipFTPQQAQAQEITLTPQQAQALGMPLTAQQAQELGITLTPQQAQELGiPLTPQQAQALGIPLIPpqa 1488
Cdd:PHA03379  719 -PLVPERW------MFQGATLSQSVRPGVAQSQYFDLPLTQPINHGAPAAHFLHQPPMEG-PWVPEQWMFQGAPPSQ--- 787
                         410       420       430
                  ....*....|....*....|....*....|....*..
gi 224458301 1489 qelGIPLTPQQAQALGilLIPPQAQELGIPLTPQQAQ 1525
Cdd:PHA03379  788 ---GTDVVQHQLDALG--YVLHVLNHPGVPVSPAVNQ 819
AvrBs3 NF041308
type III secretion system effector avirulence protein AvrBs3;
1087-1646 1.65e-05

type III secretion system effector avirulence protein AvrBs3;


Pssm-ID: 469205 [Multi-domain]  Cd Length: 1179  Bit Score: 50.34  E-value: 1.65e-05
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1087 LTPQQ----AQAQGITLTLQQAQELgIPLTPQQAQALeilfTPQQAQALGIPLTPQQTQVQGITLTPQQDQAPgISLTTQ 1162
Cdd:NF041308  326 LTPEQvvaiASNDGGKQALETVQRL-LPVLCQAEHGL----TPDQVVAIASNIGGKPALETVQRLLPVLCQPP-HGLTPD 399
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1163 QAQKLGIPLTPQQAQALGIPLTPQQAQELGiPLTPQQAQALRVSLTPQQAQELGIPLTPQQAQALGITltlqQAQQLGIP 1242
Cdd:NF041308  400 QVVAIASNDGGKQALETVQRLLPVLCQAPH-GLTPDQVVAIASNDGGKQALETVQRLLPELCQAHGLT----PDQVVAIA 474
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1243 LTPQQAQALGIT--LTPKQVQeLGIPLTPQQAQALGITLTPKQAQELGIPLNPQQAQTLGiPLTPKQAQALGIPFTPQQA 1320
Cdd:NF041308  475 SNGGGKQALETVqrLLPVLCQ-PPHGLTPEQVVAIASNGGGKQALETVQRLLPVLCQPPH-GLTPEQVVAIASHDGGKQA 552
                         250       260       270       280       290       300       310       320
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1321 QALGIPLTPQQAQTQEiTLTPQQAQALGMPLTTQQA----QELgIP--------LTPQHAQALGMPLTTQQAQELGIPLT 1388
Cdd:NF041308  553 LETVHRLLPVLCQAPH-GLTPEQVVAIASHNGGKQAletvQRL-LPvlcqrpygLTPNQVVAIASNDGGKQALETVQRLL 630
                         330       340       350       360       370       380       390       400
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1389 PQQAQAlGMPLTTQQAQELGIPLTPQQAQELGIPFTPQQAQAQEiTLTPQQAQALGMPLTAQQAQELGITLTPQQAQElG 1468
Cdd:NF041308  631 PVLCQA-PHGLTPDQVVAIASNGGGKQALETVQRLLPVLCQRPH-GLTPHQVVAIASNDGGKQALETVQRLLPVLCQP-P 707
                         410       420       430       440       450       460       470       480
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1469 IPLTPQQAQALGIPLIPPQAQELGIPLTPqqaqalgILLIPPQAqelgipLTPQQAQALGIPLIPPQA----QELgIP-- 1542
Cdd:NF041308  708 YGLTPEQVVAIASNNGGKQALETVQRLLP-------VLCQRPHG------LTPDQVVAIASNDGGKQAletvQRL-LPvl 773
                         490       500       510       520       530       540       550       560
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1543 ------LTPQQVQALGIPLIPPQAQELEIPLTPQQAQALGiPLTPQQAQELGIPLTPQQAQELGIPLTPQQAQAQGipLT 1616
Cdd:NF041308  774 cqpphgLTPDQVVAIASNDGGKQALETVQRLLPVLCDAPH-GLTPHQVVAIASNIGGRQALETVQRLLPVLCQAHG--LT 850
                         570       580       590
                  ....*....|....*....|....*....|
gi 224458301 1617 PQQAQALGISLTPQQAQAQGITLTPQQAQA 1646
Cdd:NF041308  851 PDQVVAIASNNGGKQALETVQRLLPVLCQP 880
PRK10263 PRK10263
DNA translocase FtsK; Provisional
1040-1538 2.70e-05

DNA translocase FtsK; Provisional


Pssm-ID: 236669 [Multi-domain]  Cd Length: 1355  Bit Score: 49.70  E-value: 2.70e-05
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1040 PDEKEPISITPPPSLQYSLPGALPISGQPLTKCiHLTPQQAQEVGITLTPQQAQAQGITLTLQQAQELGIPLTPQQAQAL 1119
Cdd:PRK10263  377 PEGYPQQSQYAQPAVQYNEPLQQPVQPQQPYYA-PAAEQPAQQPYYAPAPEQPAQQPYYAPAPEQPVAGNAWQAEEQQST 455
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1120 ---EILFTPQQAQALGIPLTPQQTQVQGITLTPQQDQAPGISLTTqqaqklgiPLTP----------------QQAQALG 1180
Cdd:PRK10263  456 fapQSTYQTEQTYQQPAAQEPLYQQPQPVEQQPVVEPEPVVEETK--------PARPplyyfeeveekrarerEQLAAWY 527
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1181 IPLtPQQAQElGIPLTPQQAQALRVSLTPQQAQELGIPLTPQQAQAlgiTLTLQQAQQLGIPL---------TPQQAQAL 1251
Cdd:PRK10263  528 QPI-PEPVKE-PEPIKSSLKAPSVAAVPPVEAAAAVSPLASGVKKA---TLATGAAATVAAPVfslansggpRPQVKEGI 602
                         250       260       270       280       290       300       310       320
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1252 GITLT-PKQVQelgIPlTPQQAQALGITLTPKQAQELGIPLNPQQAQTLGIPLTPKQAQA-----LGIPFTPQQAQALG- 1324
Cdd:PRK10263  603 GPQLPrPKRIR---VP-TRRELASYGIKLPSQRAAEEKAREAQRNQYDSGDQYNDDEIDAmqqdeLARQFAQTQQQRYGe 678
                         330       340       350       360       370       380       390       400
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1325 -----IPLTPQQAQT-QEITLTPQQAQalgmpltTQQAQELGipltPQHAQALGMPLTTQQAQELGIPLTPQQAQALGMP 1398
Cdd:PRK10263  679 qyqhdVPVNAEDADAaAEAELARQFAQ-------TQQQRYSG----EQPAGANPFSLDDFEFSPMKALLDDGPHEPLFTP 747
                         410       420       430       440       450       460       470       480
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1399 LTTQQAQELGIPLTPQQAQELGIPFTPQQAQAQeitltPQQAQAlgmplTAQQAQELGITLTPQ-QAQELGIPLTPQ-QA 1476
Cdd:PRK10263  748 IVEPVQQPQQPVAPQQQYQQPQQPVAPQPQYQQ-----PQQPVA-----PQPQYQQPQQPVAPQpQYQQPQQPVAPQpQY 817
                         490       500       510       520       530       540
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*
gi 224458301 1477 QALGIPLIP-PQAQELGIPLTPQQAQAL--GILLIPPQAQELGIPLTPQQAQALGIPliPPQAQE 1538
Cdd:PRK10263  818 QQPQQPVAPqPQYQQPQQPVAPQPQDTLlhPLLMRNGDSRPLHKPTTPLPSLDLLTP--PPSEVE 880
PHA03307 PHA03307
transcriptional regulator ICP4; Provisional
1738-1975 5.75e-05

transcriptional regulator ICP4; Provisional


Pssm-ID: 223039 [Multi-domain]  Cd Length: 1352  Bit Score: 48.63  E-value: 5.75e-05
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1738 LSLASSAPTAEKSSIFGVSSTPLQISRVPLNQGPFAPGKPlemgILSEPGKLGAPQTLRSSGQTLVYG---GQSTSAQFP 1814
Cdd:PHA03307   93 STLAPASPAREGSPTPPGPSSPDPPPPTPPPASPPPSPAP----DLSEMLRPVGSPGPPPAASPPAAGaspAAVASDAAS 168
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1815 APQ-APPSPGQLPISRAPPTPGQPFIAGVPPTSGQiPSLWAPLSPGQPLVPEASSIPGDLLESGPLTFSEQLQEFQPPAT 1893
Cdd:PHA03307  169 SRQaALPLSSPEETARAPSSPPAEPPPSTPPAAAS-PRPPRRSSPISASASSPAPAPGRSAADDAGASSSDSSSSESSGC 247
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1894 AEQspylQAPSTPGQHLATWTLPGR--ASSLWIPPTSRHPPTLwPSPAPGKPQKSWSPSVAKKRLAIISSLKSKSV-LIH 1970
Cdd:PHA03307  248 GWG----PENECPLPRPAPITLPTRiwEASGWNGPSSRPGPAS-SSSSPRERSPSPSPSSPGSGPAPSSPRASSSSsSSR 322

                  ....*
gi 224458301 1971 PSAPD 1975
Cdd:PHA03307  323 ESSSS 327
Atrophin-1 pfam03154
Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian ...
1630-1950 1.40e-04

Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian atrophy (DRPLA) gene. DRPLA OMIM:125370 is a progressive neurodegenerative disorder. It is caused by the expansion of a CAG repeat in the DRPLA gene on chromosome 12p. This results in an extended polyglutamine region in atrophin-1, that is thought to confer toxicity to the protein, possibly through altering its interactions with other proteins. The expansion of a CAG repeat is also the underlying defect in six other neurodegenerative disorders, including Huntington's disease. One interaction of expanded polyglutamine repeats that is thought to be pathogenic is that with the short glutamine repeat in the transcriptional coactivator CREB binding protein, CBP. This interaction draws CBP away from its usual nuclear location to the expanded polyglutamine repeat protein aggregates that are characteristic of the polyglutamine neurodegenerative disorders. This interferes with CBP-mediated transcription and causes cytotoxicity.


Pssm-ID: 460830 [Multi-domain]  Cd Length: 991  Bit Score: 47.45  E-value: 1.40e-04
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301  1630 QQAQAQGITLTPQQAQALGVPITPVNAWVSAVTLTSEQTHALESPMNLEQAQEQLLKLGVPLTldkahtlgSPLTLKQVQ 1709
Cdd:pfam03154  163 QQQILQTQPPVLQAQSGAASPPSPPPPGTTQAATAGPTPSAPSVPPQGSPATSQPPNQTQSTA--------APHTLIQQT 234
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301  1710 WSHRPfqkskASLPTGQSIISRLSPSLRLSLASSAPTAEKSSifgvsSTPLQISRVPLNQGPFA---PGKPLEMGILSEP 1786
Cdd:pfam03154  235 PTLHP-----QRLPSPHPPLQPMTQPPPPSQVSPQPLPQPSL-----HGQMPPMPHSLQTGPSHmqhPVPPQPFPLTPQS 304
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301  1787 GKLGAPqtlrSSGQTLVYGGQSTSAQFPAPQAPPSPGQLPISRA-PPTP-GQPFIAgvPPTSGQIPSLWAPLS---PGQP 1861
Cdd:pfam03154  305 SQSQVP----PGPSPAAPGQSQQRIHTPPSQSQLQSQQPPREQPlPPAPlSMPHIK--PPPTTPIPQLPNPQShkhPPHL 378
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301  1862 LVPEASSIPGDLLESGPLTFSEQLQEFQPPAT-------AEQSPYLQA-PSTPGQHLATWTLPgrasslwiPPTSRHPPT 1933
Cdd:pfam03154  379 SGPSPFQMNSNLPPPPALKPLSSLSTHHPPSAhppplqlMPQSQQLPPpPAQPPVLTQSQSLP--------PPAASHPPT 450
                          330       340
                   ....*....|....*....|
gi 224458301  1934 LWPSPAPGK---PQKSWSPS 1950
Cdd:pfam03154  451 SGLHQVPSQspfPQHPFVPG 470
SMC_N pfam02463
RecF/RecN/SMC N terminal domain; This domain is found at the N terminus of SMC proteins. The ...
635-993 1.47e-04

RecF/RecN/SMC N terminal domain; This domain is found at the N terminus of SMC proteins. The SMC (structural maintenance of chromosomes) superfamily proteins have ATP-binding domains at the N- and C-termini, and two extended coiled-coil domains separated by a hinge in the middle. The eukaryotic SMC proteins form two kinds of heterodimers: the SMC1/SMC3 and the SMC2/SMC4 types. These heterodimers constitute an essential part of higher order complexes, which are involved in chromatin and DNA dynamics. This family also includes the RecF and RecN proteins that are involved in DNA metabolism and recombination.


Pssm-ID: 426784 [Multi-domain]  Cd Length: 1161  Bit Score: 47.27  E-value: 1.47e-04
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301   635 KSHQLVKSLSRVAKETSESTRVLESPDGKSEQSNLEEFQEAIMAfLKQKIDNIGKAFDKKTVPKEEELLKRAEAEKLGII 714
Cdd:pfam02463  174 ALKKLIEETENLAELIIDLEELKLQELKLKEQAKKALEYYQLKE-KLELEEEYLLYLDYLKLNEERIDLLQELLRDEQEE 252
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301   715 KAKMEEYFQKVAETVTKILRKYKDTKKEEQVGEKPIKQKKvvsfmpglhfqkspISAKSESSTLLSYEST-----DPVIN 789
Cdd:pfam02463  253 IESSKQEIEKEEEKLAQVLKENKEEEKEKKLQEEELKLLA--------------KEEEELKSELLKLERRkvddeEKLKE 318
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301   790 NLIQMILAEIESERDIPTVStvQKDHKEKEKQRQEQYLQEGQEQMSgmslkQQLLGERNLLKEHYEKISENWEEKKAwlq 869
Cdd:pfam02463  319 SEKEKKKAEKELKKEKEEIE--ELEKELKELEIKREAEEEEEEELE-----KLQEKLEQLEEELLAKKKLESERLSS--- 388
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301   870 mkegkqeqqsqkqwQEEEMWKEEQKQATPKQAEQEEKQKQRGQEEEELPKSSLQRLEEGTQKmKTQGLLLEKENGQMRQI 949
Cdd:pfam02463  389 --------------AAKLKEEELELKSEEEKEAQLLLELARQLEDLLKEEKKEELEILEEEE-ESIELKQGKLTEEKEEL 453
                          330       340       350       360
                   ....*....|....*....|....*....|....*....|....
gi 224458301   950 QKEAKHLGPHRRREKGKEKQKPERGLEDLERQIKTKDQMQMKET 993
Cdd:pfam02463  454 EKQELKLLKDELELKKSEDLLKETQLVKLQEQLELLLSRQKLEE 497
KREPA2 cd23959
Kinetoplastid RNA Editing Protein A2 (KREPA2); The KREPA2 (TbMP63) protein is a component of ...
1770-1908 1.91e-04

Kinetoplastid RNA Editing Protein A2 (KREPA2); The KREPA2 (TbMP63) protein is a component of the parasitic protozoan's KREPA RNA editing catalytic complex (RECC). Kinetoplastid RNA editing (KRE) proteins occur as pairs or sets of related proteins in multiple complexes. KREPA complex is composed of six components (KREPA1-6), which share a conserved C-terminal region containing an oligonucleotide-binding (OB)-fold-like domain. KREPAs are responsible for the site-specific insertion and deletion of U nucleotides in the kinetoplastid mitochondria pre-messenger RNA. Apart from the conserved C-terminal OB-fold domain, KREPA1, KREPA2, and KREPA3 contain two conserved C2H2 zinc-finger domains. KREPA2 and kinetoplastid RNA editing ligase 1 (KREL1) are specific for ligation post-U-deletion and are paralogous to KREL2 and KREPA1 that are specific for ligation post-U-insertion. KREPA2 is critical for RECC stability and KREL1 integration into the complex.


Pssm-ID: 467780 [Multi-domain]  Cd Length: 424  Bit Score: 46.40  E-value: 1.91e-04
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1770 GPFAPGKPLEMGILSePGKLGAPQTLRSSGQTLVYGGQSTSAQFPAPQAPPSP--GQLPI--SRAPPTPGQPFIAGVPPT 1845
Cdd:cd23959    97 DAFAMAPDESLGPFR-AARVPNPFSASSSTQRETHKTAQVAPPKAEPQTAPVTpfGQLPMfgQHPPPAKPLPAAAAAQQS 175
                          90       100       110       120       130       140
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*..
gi 224458301 1846 S---GQIPSLWAPLSP-GQPLVPEASSIPGDLLESGPLTFSEQLQEFQPPATAEQSPylQAPSTPGQ 1908
Cdd:cd23959   176 SaspGEVASPFASGTVsASPFATATDTAPSSGAPDGFPAEASAPSPFAAPASAASFP--AAPVANGE 240
PTZ00121 PTZ00121
MAEBL; Provisional
646-1001 2.69e-04

MAEBL; Provisional


Pssm-ID: 173412 [Multi-domain]  Cd Length: 2084  Bit Score: 46.67  E-value: 2.69e-04
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301  646 VAKETSESTRVLESPDGKSEQSNLEEFQEAIMAFLK----QKIDNIGKAFDkktVPKEEELlKRAEAEKLGIIKAKMEEy 721
Cdd:PTZ00121 1088 RADEATEEAFGKAEEAKKTETGKAEEARKAEEAKKKaedaRKAEEARKAED---ARKAEEA-RKAEDAKRVEIARKAED- 1162
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301  722 fQKVAEtvtkILRKYKDTKKEEQVgekpikqKKVVSFMPGLHFQKSPISAKSESSTllSYESTDPVINnliqmiLAEIES 801
Cdd:PTZ00121 1163 -ARKAE----EARKAEDAKKAEAA-------RKAEEVRKAEELRKAEDARKAEAAR--KAEEERKAEE------ARKAED 1222
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301  802 ERDIPTVSTVQKDHKEKEKQRQEQYLQEGQEQMSGMSLKQQLLGERnllkehyeKISENWEEKKAWLQMKEGKQEQQSQK 881
Cdd:PTZ00121 1223 AKKAEAVKKAEEAKKDAEEAKKAEEERNNEEIRKFEEARMAHFARR--------QAAIKAEEARKADELKKAEEKKKADE 1294
                         250       260       270       280       290       300       310       320
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301  882 QWQEEEMWKEEQKQATPKQAEQEEKQKQRGQEEEELPKSSLQRLEEGTQKMKTQGLLLEKENGQMRQIQKEAKHLGPHRR 961
Cdd:PTZ00121 1295 AKKAEEKKKADEAKKKAEEAKKADEAKKKAEEAKKKADAAKKKAEEAKKAAEAAKAEAEAAADEAEAAEEKAEAAEKKKE 1374
                         330       340       350       360
                  ....*....|....*....|....*....|....*....|....*....
gi 224458301  962 REKGK---------EKQKPERGLEDLERQIKTKDQMQMKETQPKELEKM 1001
Cdd:PTZ00121 1375 EAKKKadaakkkaeEKKKADEAKKKAEEDKKKADELKKAAAAKKKADEA 1423
PHA03247 PHA03247
large tegument protein UL36; Provisional
1792-1982 3.50e-04

large tegument protein UL36; Provisional


Pssm-ID: 223021 [Multi-domain]  Cd Length: 3151  Bit Score: 46.08  E-value: 3.50e-04
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1792 PQTLRSSGQTLVYGGQSTSAQFPAPQAPPSPGQLPISRAPPTPGQPFIAGVPPTSGQIPSLWAPLS--------PGQPLV 1863
Cdd:PHA03247 2628 PPSPSPAANEPDPHPPPTVPPPERPRDDPAPGRVSRPRRARRLGRAAQASSPPQRPRRRAARPTVGsltsladpPPPPPT 2707
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1864 PEASsiPGDLLESGPLTFSEQL-QEFQPPATAEQSPylqaPSTPgqhlATWTLPGRASSLWIPPTSRHPPTLWPSPAPGK 1942
Cdd:PHA03247 2708 PEPA--PHALVSATPLPPGPAAaRQASPALPAAPAP----PAVP----AGPATPGGPARPARPPTTAGPPAPAPPAAPAA 2777
                         170       180       190       200
                  ....*....|....*....|....*....|....*....|
gi 224458301 1943 PqkswsPSVAKKRLAIISSLKSKSVLIHPSAPDFKVAQVP 1982
Cdd:PHA03247 2778 G-----PPRRLTRPAVASLSESRESLPSPWDPADPPAAVL 2812
SP1-4_arthropods_N cd22553
N-terminal domain of transcription factor Specificity Protein (SP) 1-4 from arthropods; ...
1284-1552 5.03e-04

N-terminal domain of transcription factor Specificity Protein (SP) 1-4 from arthropods; Specificity Proteins (SPs) are transcription factors that are involved in many cellular processes, including cell differentiation, cell growth, apoptosis, immune responses, response to DNA damage, and chromatin remodeling. There are many SPs in vertebrates (9 SPs in humans and mice, 7 SPs in the chicken, and 11 SPs in teleost fish), but arthropods only have 3 SPs. One SP is clade SP1-4, which is expressed ubiquitously throughout development. SP1-4 belongs to a family of proteins, called the SP/Kruppel or Krueppel-like Factor (KLF) family, characterized by a C-terminal DNA-binding domain of 81 amino acids consisting of three Kruppel-like C2H2 zinc fingers. These factors bind to a loose consensus motif, namely NNRCRCCYY (where N is any nucleotide; R is A/G, and Y is C/T), such as the recurring motifs in GC and GT boxes (5'-GGGGCGGGG-3' and 5'-GGTGTGGGG-3') that are present in promoters and more distal regulatory elements of mammalian genes. SP factors preferentially bind GC boxes, while KLFs bind CACCC boxes. Another characteristic hallmark of SP factors is the presence of the Buttonhead (BTD) box CXCPXC, just N-terminal to the zinc fingers. The function of the BTD box is unknown, but it is thought to play an important physiological role. Another feature of most SP factors is the presence of a conserved amino acid stretch, the so-called SP box, located close to the N-terminus. This model represents the N-terminal domain of SP1-4 from arthropods.


Pssm-ID: 411778 [Multi-domain]  Cd Length: 384  Bit Score: 45.02  E-value: 5.03e-04
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1284 AQELGIPLNPQQAqtlgIPLTPKQAQALgipFTPQQAQALGIPLTPQQAQTQEITLTPQQAQALGMPLTTQQAQELGIPL 1363
Cdd:cd22553    88 ANSGLLQTNNQQA----IQLAPGGTQAI---LANQQTLIRPNTVQGQANASNVLQNIAQIASGGNAVQLPLNNMTQTIPV 160
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1364 TpqhaqalgMPLTTQQAQELgipltpqqAQALGMPLTTQQAQELGIPLTPQQAQELGIPFTPQQAQAQEITLTPQQAQAL 1443
Cdd:cd22553   161 Q--------VPVSTANGQTV--------YQTIQVPIQAIQSGNAGGGNQALQAQVIPQLAQAAQLQPQQLAQVSSQGYIQ 224
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1444 GMPLTAQQAQELGITLTPQQAQELGIPLTPQQAQALGIPLIPPQAQELGIPLTPQQAQALGILLIPPQAQELGIPLTPQQ 1523
Cdd:cd22553   225 QIPANASQQQPQMVQQGPNQSGQIIGQVASASSIQAAAIPLTVYTGALAGQNGSNQQQVGQIVTSPIQGMTQGLTAPASS 304
                         250       260       270
                  ....*....|....*....|....*....|
gi 224458301 1524 AqalgIPLIPPQAQELGIPLTP-QQVQALG 1552
Cdd:cd22553   305 S----IPTVVQQQAIQGNPLPPgTQIIAAG 330
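Each hit in this listing carries a one-line summary of the form "Pssm-ID: ... Cd Length: ... Bit Score: ... E-value: ...". The following is a minimal sketch of how such a line could be pulled apart in Python. The function name `parse_hit_summary` and the dictionary keys are illustrative assumptions, not part of any NCBI tool, and the pattern is based only on the lines shown on this page.

```python
import re

# Pattern inferred from the summary lines on this page (an assumption,
# not an official format specification). The bracketed flag such as
# "[Multi-domain]" is optional because some hits omit it.
SUMMARY_RE = re.compile(
    r"Pssm-ID:\s*(?P<pssm_id>\d+)"
    r"(?:\s*\[(?P<flag>[^\]]+)\])?"
    r"\s+Cd Length:\s*(?P<cd_length>\d+)"
    r"\s+Bit Score:\s*(?P<bit_score>[\d.]+)"
    r"\s+E-value:\s*(?P<e_value>\S+)"
)


def parse_hit_summary(line):
    """Return the fields of one 'Pssm-ID: ...' summary line as a dict."""
    match = SUMMARY_RE.search(line)
    if match is None:
        raise ValueError("not a hit summary line: %r" % line)
    return {
        "pssm_id": int(match.group("pssm_id")),
        "multi_domain": match.group("flag") == "Multi-domain",
        "cd_length": int(match.group("cd_length")),
        "bit_score": float(match.group("bit_score")),
        "e_value": float(match.group("e_value")),
    }


if __name__ == "__main__":
    example = ("Pssm-ID: 411778 [Multi-domain]  Cd Length: 384  "
               "Bit Score: 45.02  E-value: 5.03e-04")
    print(parse_hit_summary(example))
```

Parsing the E-value as a float makes it straightforward to sort or filter hits numerically, for example to keep only those below a chosen threshold.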
PHA03378 PHA03378
EBNA-3B; Provisional
1499-1951 5.41e-04

EBNA-3B; Provisional


Pssm-ID: 223065 [Multi-domain]  Cd Length: 991  Bit Score: 45.44  E-value: 5.41e-04
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1499 QAQALGILLIPPQAQELGIPLTPQQAQALGIPLIP-PQAQELGIPL---TPQQVQALG--IPLIPPQAQELEipltpQQA 1572
Cdd:PHA03378  446 HSQAPTVVLHRPPTQPLEGPTGPLSVQAPLEPWQPlPHPQVTPVILhqpPAQGVQAHGsmLDLLEKDDEDME-----QRV 520
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1573 QALGIPLTPQQaqelgiPLTPQQA-----QELGI----PLTPQQAQAQGIP---LTPQQAQALGISLTPQQAQ-----AQ 1635
Cdd:PHA03378  521 MATLLPPSPPQ------PRAGRRApcvytEDLDIesdePASTEPVHDQLLPapgLGPLQIQPLTSPTTSQLASsapsyAQ 594
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1636 GITLTPQQAQALGVPITPVNAwvsAVTLTSEQTHALESPMNLEQAQEQLLKLGVPLTLDKAHTLGSPLTLKQVQWS---H 1712
Cdd:PHA03378  595 TPWPVPHPSQTPEPPTTQSHI---PETSAPRQWPMPLRPIPMRPLRMQPITFNVLVFPTPHQPPQVEITPYKPTWTqigH 671
                         250       260       270       280       290       300       310       320
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1713 RPFQKSkaslPTGQSIISRLSPS-LRLSLASSAPTAEKSSifGVSSTPLQISRVPLNQGPFAPGKPLEMG-ILSEPGKLG 1790
Cdd:PHA03378  672 IPYQPS----PTGANTMLPIQWApGTMQPPPRAPTPMRPP--AAPPGRAQRPAAATGRARPPAAAPGRARpPAAAPGRAR 745
                         330       340       350       360       370       380       390       400
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1791 APQTLRSSGQTLV---------YGGQSTSAQFPAPQAPPSPGQLPisRAPPTPGQPfiAGVPPTSGQI------------ 1849
Cdd:PHA03378  746 PPAAAPGRARPPAaapgrarppAAAPGAPTPQPPPQAPPAPQQRP--RGAPTPQPP--PQAGPTSMQLmpraapgqqgpt 821
                         410       420       430       440       450       460       470       480
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1850 ----------------PSLWAP--LSPGQPLVPEAS--SIPGDLLESGPLTFSEQLQEFQPPATAEQSPYLQA---PSTP 1906
Cdd:PHA03378  822 kqilrqlltggvkrgrPSLKKPaaLERQAAAGPTPSpgSGTSDKIVQAPVFYPPVLQPIQVMRQLGSVRAAAAstvTQAP 901
                         490       500       510       520
                  ....*....|....*....|....*....|....*....|....*
gi 224458301 1907 GQHLATWTLPGRASSLWIPPTSRHPPTLWPSPAPGKPQKSWSPSV 1951
Cdd:PHA03378  902 TEYTGERRGVGPMHPTDIPPSKRAKTDAYVESQPPHGGQSHSFSV 946
SP2_N cd22540
N-terminal domain of transcription factor Specificity Protein (SP) 2; Specificity Proteins ...
1319-1656 7.37e-04

N-terminal domain of transcription factor Specificity Protein (SP) 2; Specificity Proteins (SPs) are transcription factors that are involved in many cellular processes, including cell differentiation, cell growth, apoptosis, immune responses, response to DNA damage, and chromatin remodeling. SP2 contains the least conserved DNA-binding domain within the SP subfamily of proteins, and its DNA sequence specificity differs from the other SP proteins. It localizes primarily within subnuclear foci associated with the nuclear matrix, and can activate, or in some cases, repress expression from different promoters. The transcription factor SP2 serves as a paradigm for indirect genomic binding. It does not require its DNA-binding domain for genomic DNA binding and occupies target promoters independently of whether they contain a cognate DNA-binding motif. SP2 belongs to a family of proteins, called the SP/Kruppel or Krueppel-like Factor (KLF) family, characterized by a C-terminal DNA-binding domain of 81 amino acids consisting of three Kruppel-like C2H2 zinc fingers. These factors bind to a loose consensus motif, namely NNRCRCCYY (where N is any nucleotide; R is A/G, and Y is C/T), such as the recurring motifs in GC and GT boxes (5'-GGGGCGGGG-3' and 5'-GGTGTGGGG-3') that are present in promoters and more distal regulatory elements of mammalian genes. SP factors preferentially bind GC boxes, while KLFs bind CACCC boxes. Another characteristic hallmark of SP factors is the presence of the Buttonhead (BTD) box CXCPXC, just N-terminal to the zinc fingers. The function of the BTD box is unknown, but it is thought to play an important physiological role. Another feature of most SP factors is the presence of a conserved amino acid stretch, the so-called SP box, located close to the N-terminus. SP factors may be separated into three groups based on their domain architecture and the similarity of their N-terminal transactivation domains: SP1-4, SP5, and SP6-9. The transactivation domains of the three groups are not homologous to one another. SP1-4 have similar N-terminal transactivation domains characterized by glutamine-rich regions, which, in most cases, have adjacent serine/threonine-rich regions. This model represents the N-terminal domain of SP2.


Pssm-ID: 411776 [Multi-domain]  Cd Length: 511  Bit Score: 44.53  E-value: 7.37e-04
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1319 QAQALGIPLTPQQAQTQEITLTPQQAQALGMPLTTQQAQelgiplTPQHAQALGMPLTTQQAQelGIPLTPQQAQALGMP 1398
Cdd:cd22540    89 QGSQLSSSAPGGQQVFAIQNPTMIIKGSQTRSSTNQQYQ------ISPQIQAAGQINNSGQIQ--IIPGTNQAIITPVQV 160
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1399 LTTQQAQELGIPLTPQQAQELGIPFTPQQAQAQEITLTPQQAQALGMPLTAQQAQELGITLTPQQAQELGIPLTPQQAQA 1478
Cdd:cd22540   161 LQQPQQAHKPVPIKPAPLQTSNTNSASLQVPGNVIKLQSGGNVALTLPVNNLVGTQDGATQLQLAAAPSKPSKKIRKKSA 240
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1479 LGIPLIPPQAQELGIPL----TPQQAQALGILLIpPQAQELGIPLTPQQAQAL------GIPLIPP------QAQELGIP 1542
Cdd:cd22540   241 QAAQPAVTVAEQVETVLiettADNIIQAGNNLLI-VQSPGTGQPAVLQQVQVLqpkqeqQVVQIPQqalrvvQAASATLP 319
                         250       260       270       280       290       300       310       320
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1543 LTPQQV-QALGIPLIPPQAQELEIPLTPQQAQALGI---PLTPQQAQELGIPLTPQQAQELGIPLTPQQAQAQGIPLTPQ 1618
Cdd:cd22540   320 TVPQKPlQNIQIQNSEPTPTQVYIKTPSGEVQTVLLqeaPAATATPSSSTSTVQQQVTANNGTGTSKPNYNVRKERTLPK 399
                         330       340       350       360
                  ....*....|....*....|....*....|....*....|..
gi 224458301 1619 QAQALG-ISL--TPQQAQAQGI-TLTPQQAQALGVPITPVNA 1656
Cdd:cd22540   400 IAPAGGiISLnaAQLAAAAQAIqTININGVQVQGVPVTITNA 441
SP1-4_arthropods_N cd22553
N-terminal domain of transcription factor Specificity Protein (SP) 1-4 from arthropods; ...
1431-1660 8.45e-04

N-terminal domain of transcription factor Specificity Protein (SP) 1-4 from arthropods; Specificity Proteins (SPs) are transcription factors that are involved in many cellular processes, including cell differentiation, cell growth, apoptosis, immune responses, response to DNA damage, and chromatin remodeling. There are many SPs in vertebrates (9 SPs in humans and mice, 7 SPs in the chicken, and 11 SPs in teleost fish), but arthropods only have 3 SPs. One SP is clade SP1-4, which is expressed ubiquitously throughout development. SP1-4 belongs to a family of proteins, called the SP/Kruppel or Krueppel-like Factor (KLF) family, characterized by a C-terminal DNA-binding domain of 81 amino acids consisting of three Kruppel-like C2H2 zinc fingers. These factors bind to a loose consensus motif, namely NNRCRCCYY (where N is any nucleotide; R is A/G, and Y is C/T), such as the recurring motifs in GC and GT boxes (5'-GGGGCGGGG-3' and 5'-GGTGTGGGG-3') that are present in promoters and more distal regulatory elements of mammalian genes. SP factors preferentially bind GC boxes, while KLFs bind CACCC boxes. Another characteristic hallmark of SP factors is the presence of the Buttonhead (BTD) box CXCPXC, just N-terminal to the zinc fingers. The function of the BTD box is unknown, but it is thought to play an important physiological role. Another feature of most SP factors is the presence of a conserved amino acid stretch, the so-called SP box, located close to the N-terminus. This model represents the N-terminal domain of SP1-4 from arthropods.


Pssm-ID: 411778 [Multi-domain]  Cd Length: 384  Bit Score: 44.25  E-value: 8.45e-04
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1431 QEITLTPQQAQALgmpLTAQQAQELGITLTPQQAQELGIPLTPQQAQALGIPLIPPQAQELGIPLTPQQAQALG-----I 1505
Cdd:cd22553    99 QAIQLAPGGTQAI---LANQQTLIRPNTVQGQANASNVLQNIAQIASGGNAVQLPLNNMTQTIPVQVPVSTANGqtvyqT 175
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1506 LLIPPQAQELGIPLTPQQAqaLGIPLIPPQAQelgipltPQQVQALGIPLIPPQAQELEIPLTPQQAQALGIPLTPQQAQ 1585
Cdd:cd22553   176 IQVPIQAIQSGNAGGGNQA--LQAQVIPQLAQ-------AAQLQPQQLAQVSSQGYIQQIPANASQQQPQMVQQGPNQSG 246
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1586 ELGIPLTPQQAQELGIPLTPQQAQAQGIPLTPQQAQALGISLTPQQAQAQGITLTP--------QQAQALGVPITPVNAW 1657
Cdd:cd22553   247 QIIGQVASASSIQAAAIPLTVYTGALAGQNGSNQQQVGQIVTSPIQGMTQGLTAPAsssiptvvQQQAIQGNPLPPGTQI 326

                  ...
gi 224458301 1658 VSA 1660
Cdd:cd22553   327 IAA 329
PAT1 pfam09770
Topoisomerase II-associated protein PAT1; Members of this family are necessary for accurate ...
1292-1453 1.13e-03

Topoisomerase II-associated protein PAT1; Members of this family are necessary for accurate chromosome transmission during cell division.


Pssm-ID: 401645 [Multi-domain]  Cd Length: 846  Bit Score: 44.26  E-value: 1.13e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301  1292 NPQQAQTLGIPLTPKQAQAlgipfTPQQAQALGIPLTPQQAQTQEITLTPQQAQALGMPLTTQQAqelgiPLTPQHAQAL 1371
Cdd:pfam09770  209 KPAQQPAPAPAQPPAAPPA-----QQAQQQQQFPPQIQQQQQPQQQPQQPQQHPGQGHPVTILQR-----PQSPQPDPAQ 278
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301  1372 GMPLTTQQAQELGIPLTPQQ-AQALGMPlTTQQAQELGIPLTPQQAQElgiPFTPQQAQAQeitltpQQAQALGMPLTAQ 1450
Cdd:pfam09770  279 PSIQPQAQQFHQQPPPVPVQpTQILQNP-NRLSAARVGYPQNPQPGVQ---PAPAHQAHRQ------QGSFGRQAPIITH 348

                   ...
gi 224458301  1451 QAQ 1453
Cdd:pfam09770  349 PQQ 351
flk PRK10715
flagella biosynthesis regulator Flk;
1291-1491 1.14e-03

flagella biosynthesis regulator Flk;


Pssm-ID: 182670  Cd Length: 335  Bit Score: 43.51  E-value: 1.14e-03
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1291 LNPQQAQTLgipLTPKQAQALGIPFtPQQAQALGIPLTP--QQAQTQEIT----LTPQQAQALGMPLTTQQAQELGIPLT 1364
Cdd:PRK10715  130 LSPEQLKQV---LTLLQNGQLSIPQ-PQQRPATDRPLLPaeHNALNQLVTklaaATGEQPKKIWQSMLELSGVKSGELIP 205
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1365 PQHAQALGMPLTTQQAqelgipLTPQQAQALGMPLTTqqaqeLGIPLTPQQAQELgIPFTPQQAQAqeitlTPQqaqalg 1444
Cdd:PRK10715  206 AKHFPLLSQWLQARQT------LSQQHAPTLESLQAA-----LKQPLDAQEQQLL-SDYAQQRFQA-----SPQ------ 262
                         170       180       190       200
                  ....*....|....*....|....*....|....*....|....*..
gi 224458301 1445 MPLTAQQAQELGITLTPQQAQELGIPLTPQQAQALGIPLIPPQAQEL 1491
Cdd:PRK10715  263 TPLTPAQVQDLLNQLFQRRVERIQEALEPRPLQPLINPLIAPLPDTL 309
PRK10263 PRK10263
DNA translocase FtsK; Provisional
1368-1916 1.20e-03

DNA translocase FtsK; Provisional


Pssm-ID: 236669 [Multi-domain]  Cd Length: 1355  Bit Score: 44.31  E-value: 1.20e-03
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1368 AQALGMPL--TTQQAQELGIPLTPQQAQALGMPLTTQQAQELGIPLTPQqaqelGIPFTPQQAQAQEITLTPQQ-----A 1440
Cdd:PRK10263  330 TQSWAAPVepVTQTPPVASVDVPPAQPTVAWQPVPGPQTGEPVIAPAPE-----GYPQQSQYAQPAVQYNEPLQqpvqpQ 404
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1441 QALGMPLTAQQAQELGITLTPQQAQELGIPLTPQQAQALGIPLIPPQAQELGIPLTPQQAQALGILLIPPQAQELGIPLT 1520
Cdd:PRK10263  405 QPYYAPAAEQPAQQPYYAPAPEQPAQQPYYAPAPEQPVAGNAWQAEEQQSTFAPQSTYQTEQTYQQPAAQEPLYQQPQPV 484
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1521 PQQAQALGIPLIppqaqELGIPLTPqqvqalgiPLIPPQAQELEIPLTPQQAQALGIPL-TPQQAQELGIPLTPQQAQEL 1599
Cdd:PRK10263  485 EQQPVVEPEPVV-----EETKPARP--------PLYYFEEVEEKRAREREQLAAWYQPIpEPVKEPEPIKSSLKAPSVAA 551
                         250       260       270       280       290       300       310       320
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1600 GIPLTPQQAQA---QGIPLTPQQAQALGISLTPQQAQAQGITLTPQQAQALGvPITPVNAWVSAVTLTSEQTHALESP-- 1674
Cdd:PRK10263  552 VPPVEAAAAVSplaSGVKKATLATGAAATVAAPVFSLANSGGPRPQVKEGIG-PQLPRPKRIRVPTRRELASYGIKLPsq 630
                         330       340       350       360       370       380       390       400
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1675 -MNLEQAQE---QLLKLGVPLTLDKAHTLGSPLTLKQV--QWSHRPFQKSKASLPTgQSIISRLSPSLRLSLASSAPTAE 1748
Cdd:PRK10263  631 rAAEEKAREaqrNQYDSGDQYNDDEIDAMQQDELARQFaqTQQQRYGEQYQHDVPV-NAEDADAAAEAELARQFAQTQQQ 709
                         410       420       430       440       450       460       470       480
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1749 KSSifgvSSTPLQISRVPLNQGPFAPGKPLEMGILSEPgkLGAPQTLRSSGQTLVYGGQSTSAQFPAPQAPPSPGQLPIS 1828
Cdd:PRK10263  710 RYS----GEQPAGANPFSLDDFEFSPMKALLDDGPHEP--LFTPIVEPVQQPQQPVAPQQQYQQPQQPVAPQPQYQQPQQ 783
                         490       500       510       520       530       540       550       560
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1829 RAPPTP--GQPFIAGVPPTSGQipslwaplSPGQPLVPEassiPGDLLESGPLTFSEQLQEFQPPATAE----------- 1895
Cdd:PRK10263  784 PVAPQPqyQQPQQPVAPQPQYQ--------QPQQPVAPQ----PQYQQPQQPVAPQPQYQQPQQPVAPQpqdtllhpllm 851
                         570       580
                  ....*....|....*....|....
gi 224458301 1896 ---QSPYLQAPSTPGQHLATWTLP 1916
Cdd:PRK10263  852 rngDSRPLHKPTTPLPSLDLLTPP 875
PspC_subgroup_1 NF033838
pneumococcal surface protein PspC, choline-binding form; The pneumococcal surface protein PspC, ...
639-1006 1.29e-03

pneumococcal surface protein PspC, choline-binding form; The pneumococcal surface protein PspC, as described in Streptococcus pneumoniae, is a repetitive and highly variable protein, recognized by a conserved N-terminal domain and also by genomic location. This form, subgroup 1, has variable numbers of a choline-binding repeat in the C-terminal region, and is also known as choline-binding protein A. The other form, subgroup 2, is anchored covalently after cleavage by sortase at a C-terminal LPXTG site.


Pssm-ID: 468201 [Multi-domain]  Cd Length: 684  Bit Score: 43.85  E-value: 1.29e-03
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301  639 LVKSLSRVAKETSESTRVLESpdgKSEQSnleefqeaIMAFLKQKIDNIGKAFDKKTVPKEEellKRAEAEKlgiikakm 718
Cdd:NF033838   89 LNKKLSDIKTEYLYELNVLKE---KSEAE--------LTSKTKKELDAAFEQFKKDTLEPGK---KVAEATK-------- 146
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301  719 eeyfqKVAETVtkilRKYKDTKKEEQVGEKPIKQKKVvsfmpGLHFQKSPISAKSESSTLLSYESTDPVINNLIQMILAE 798
Cdd:NF033838  147 -----KVEEAE----KKAKDQKEEDRRNYPTNTYKTL-----ELEIAESDVEVKKAELELVKEEAKEPRDEEKIKQAKAK 212
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301  799 IESERDIPT----VSTVQKDHKEKEKQRQEQYLQEGQEQMSGMSLKQQLLG--ERNLLKEHYEKISENWEEKKAWLQMKE 872
Cdd:NF033838  213 VESKKAEATrlekIKTDREKAEEEAKRRADAKLKEAVEKNVATSEQDKPKRraKRGVLGEPATPDKKENDAKSSDSSVGE 292
                         250       260       270       280       290       300       310       320
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301  873 GKQEQQSQkqwqeeemwKEEQKQATPKQAEQEEKQKQRGQEEEE---LPKSSLQRLE----EGTQKMKTQGLLLEKEngq 945
Cdd:NF033838  293 ETLPSPSL---------KPEKKVAEAEKKVEEAKKKAKDQKEEDrrnYPTNTYKTLEleiaESDVKVKEAELELVKE--- 360
                         330       340       350       360       370       380
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*
gi 224458301  946 mrqiqkEAKHlgpHRRREKGKE-KQKPERGLEDLERQIKTKDQMQMKETQPK---ELEKMVIQTP 1006
Cdd:NF033838  361 ------EAKE---PRNEEKIKQaKAKVESKKAEATRLEKIKTDRKKAEEEAKrkaAEEDKVKEKP 416
PRK12323 PRK12323
DNA polymerase III subunit gamma/tau;
1791-1952 1.29e-03

DNA polymerase III subunit gamma/tau;


Pssm-ID: 237057 [Multi-domain]  Cd Length: 700  Bit Score: 44.10  E-value: 1.29e-03
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1791 APQTLRSSGQTLVYGGQSTSAQFPAPQAPPSPGQLPisRAPPTPGQPFIAGVPPTSGQIPSLWAPLSPGQPLVPEAssip 1870
Cdd:PRK12323  429 APEALAAARQASARGPGGAPAPAPAPAAAPAAAARP--AAAGPRPVAAAAAAAPARAAPAAAPAPADDDPPPWEEL---- 502
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1871 gdllesgpltfseqlqefqPPATAEQSPYLQAPSTPGQHLATWTLPGRASSLWIPPTSRHPPTLWPSPAPGKPQKSWSPS 1950
Cdd:PRK12323  503 -------------------PPEFASPAPAQPDAAPAGWVAESIPDPATADPDDAFETLAPAPAAAPAPRAAAATEPVVAP 563

                  ..
gi 224458301 1951 VA 1952
Cdd:PRK12323  564 RP 565
DamX COG3266
Cell division protein DamX, binds to the septal ring, contains C-terminal SPOR domain [Cell ...
1322-1622 1.36e-03

Cell division protein DamX, binds to the septal ring, contains C-terminal SPOR domain [Cell cycle control, cell division, chromosome partitioning];


Pssm-ID: 442497 [Multi-domain]  Cd Length: 455  Bit Score: 43.69  E-value: 1.36e-03
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1322 ALGIPLTPQQAQTQEITLTPQQAQALgMPLTTQQAQELGIPLTPQHAQALGMPLTTQQAQELGIPLTPQQAQALGMPLTT 1401
Cdd:COG3266    53 LLAGLLLLLIRLLSEAVDLGALASAA-LLLALASLALLGILLLALLALLLDLLLLADLLRAAALLLLKLLLLLLTLLLLV 131
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1402 QQAQELGIPLTPQQAQELGIPFTPQQAQAQEITLTPQQAQALGMPLTAQQAQELgiTLTPQQAQELGIPLTPQQAQALGI 1481
Cdd:COG3266   132 LLLLLALLLALLLDLPLLTLLIVLPLLEEQLLLLALQDIQGTLQALGAVAALLG--LRKAEEALALRAGSAAADALALLL 209
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1482 PLIPPQAQELGIPLTPQQ--AQALGILLIPPQAQELGIPLTPQQAQALgipliPPQAQELGIPLTPQQVQALGIPLIPPQ 1559
Cdd:COG3266   210 LLLASALGEAVAAAAELAalALLAAGAAEVLTARLVLLLLIIGSALKA-----PSQASSASAPATTSLGEQQEVSLPPAV 284
                         250       260       270       280       290       300
                  ....*....|....*....|....*....|....*....|....*....|....*....|...
gi 224458301 1560 AQELEiPLTPQQAQALGIPLTPQQAQELGIPLTPQQAQELGIPLTPQQAQAQGIPlTPQQAQA 1622
Cdd:COG3266   285 AAQPA-AAAAAQPSAVALPAAPAAAAAAAAPAEAAAPQPTAAKPVVTETAAPAAP-APEAAAA 345
SP1-4_arthropods_N cd22553
N-terminal domain of transcription factor Specificity Protein (SP) 1-4 from arthropods; ...
1217-1452 1.60e-03

N-terminal domain of transcription factor Specificity Protein (SP) 1-4 from arthropods; Specificity Proteins (SPs) are transcription factors that are involved in many cellular processes, including cell differentiation, cell growth, apoptosis, immune responses, response to DNA damage, and chromatin remodeling. There are many SPs in vertebrates (9 SPs in humans and mice, 7 SPs in the chicken, and 11 SPs in teleost fish), but arthropods only have 3 SPs. One SP is clade SP1-4, which is expressed ubiquitously throughout development. SP1-4 belongs to a family of proteins, called the SP/Kruppel or Krueppel-like Factor (KLF) family, characterized by a C-terminal DNA-binding domain of 81 amino acids consisting of three Kruppel-like C2H2 zinc fingers. These factors bind to a loose consensus motif, namely NNRCRCCYY (where N is any nucleotide; R is A/G, and Y is C/T), such as the recurring motifs in GC and GT boxes (5'-GGGGCGGGG-3' and 5'-GGTGTGGGG-3') that are present in promoters and more distal regulatory elements of mammalian genes. SP factors preferentially bind GC boxes, while KLFs bind CACCC boxes. Another characteristic hallmark of SP factors is the presence of the Buttonhead (BTD) box CXCPXC, just N-terminal to the zinc fingers. The function of the BTD box is unknown, but it is thought to play an important physiological role. Another feature of most SP factors is the presence of a conserved amino acid stretch, the so-called SP box, located close to the N-terminus. This model represents the N-terminal domain of SP1-4 from arthropods.


Pssm-ID: 411778 [Multi-domain]  Cd Length: 384  Bit Score: 43.48  E-value: 1.60e-03
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1217 IPLTPQQAQAL---GITLTLQQAQQL------GIPLTPQQAQALGITLTPKQVQELGIPLTPQQAQALGITLTpkqaQEL 1287
Cdd:cd22553   101 IQLAPGGTQAIlanQQTLIRPNTVQGqanasnVLQNIAQIASGGNAVQLPLNNMTQTIPVQVPVSTANGQTVY----QTI 176
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1288 GIPLNPQQAQTLGIPLTPKQAQALgipftPQQAQalgipltPQQAQTQEITLTPQQAQALGMPLTTQQAQELGIPLTP-Q 1366
Cdd:cd22553   177 QVPIQAIQSGNAGGGNQALQAQVI-----PQLAQ-------AAQLQPQQLAQVSSQGYIQQIPANASQQQPQMVQQGPnQ 244
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1367 HAQALGMPLTTQQAQELGIPLTPQQaQALGMPLTTQQAQELGIPLTPQQAQELGIPFTPqqaqAQEITLTPQQAQALGMP 1446
Cdd:cd22553   245 SGQIIGQVASASSIQAAAIPLTVYT-GALAGQNGSNQQQVGQIVTSPIQGMTQGLTAPA----SSSIPTVVQQQAIQGNP 319

                  ....*.
gi 224458301 1447 LTAQQA 1452
Cdd:cd22553   320 LPPGTQ 325
DUF5401 pfam17380
Family of unknown function (DUF5401); This is a family of unknown function found in ...
799-1055 1.71e-03

Family of unknown function (DUF5401); This is a family of unknown function found in Chromadorea.


Pssm-ID: 375164 [Multi-domain]  Cd Length: 722  Bit Score: 43.57  E-value: 1.71e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301   799 IESERDIPTVSTVQKDhKEKEKQRQEQYLQEGQE--QMSGMSLKQQLLGER---NLLKEHYEKISENWEEKKAWLQMKEG 873
Cdd:pfam17380  344 MERERELERIRQEERK-RELERIRQEEIAMEISRmrELERLQMERQQKNERvrqELEAARKVKILEEERQRKIQQQKVEM 422
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301   874 KQEQQSqkqwqeeemwKEEQKQATPKQAEQE-EKQKQRGQEEEELPKSSLQRLEEGTQKMKTQGLLLEKENGQMRQIQKE 952
Cdd:pfam17380  423 EQIRAE----------QEEARQREVRRLEEErAREMERVRLEEQERQQQVERLRQQEEERKRKKLELEKEKRDRKRAEEQ 492
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301   953 -----AKHLGPHRRR--EKGKEKQKPERGLEDLERQIKTKDQMQMKE----TQPKELEKMVIQTPMTLSPRWKSVLKDVQ 1021
Cdd:pfam17380  493 rrkilEKELEERKQAmiEEERKRKLLEKEMEERQKAIYEEERRREAEeerrKQQEMEERRRIQEQMRKATEERSRLEAME 572
                          250       260       270
                   ....*....|....*....|....*....|....
gi 224458301  1022 RSyegKEFQRNLKTLENLPDEKEpiSITPPPSLQ 1055
Cdd:pfam17380  573 RE---REMMRQIVESEKARAEYE--ATTPITTIK 601
PRK03918 PRK03918
DNA double-strand break repair ATPase Rad50;
781-1051 1.81e-03

DNA double-strand break repair ATPase Rad50;


Pssm-ID: 235175 [Multi-domain]  Cd Length: 880  Bit Score: 43.51  E-value: 1.81e-03
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301  781 YESTDPVINNLIQMILAEIESERD-IPTVSTVQKDHKEKEKqRQEQYLQEGQEQMSGM-SLKQQLLGERNLLKEhYEKIS 858
Cdd:PRK03918  160 YENAYKNLGEVIKEIKRRIERLEKfIKRTENIEELIKEKEK-ELEEVLREINEISSELpELREELEKLEKEVKE-LEELK 237
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301  859 ENWEEKKAWLQMKEGKQEQQsqkqwqeeemwkEEQKQATPKQAEQEEKqkqrgqEEEELpKSSLQRLEEgtqkmktqgll 938
Cdd:PRK03918  238 EEIEELEKELESLEGSKRKL------------EEKIRELEERIEELKK------EIEEL-EEKVKELKE----------- 287
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301  939 LEKENGQMRQIQKEakhlgphrRREKGKEKQKPERGLEDLERQIKT-KDQMQMKETQPKELEKmviqtpmtLSPRWKSVL 1017
Cdd:PRK03918  288 LKEKAEEYIKLSEF--------YEEYLDELREIEKRLSRLEEEINGiEERIKELEEKEERLEE--------LKKKLKELE 351
                         250       260       270
                  ....*....|....*....|....*....|....*
gi 224458301 1018 KDVQRSYE-GKEFQRNLKTLENLPDEKEPISITPP 1051
Cdd:PRK03918  352 KRLEELEErHELYEEAKAKKEELERLKKRLTGLTP 386
PHA03307 PHA03307
transcriptional regulator ICP4; Provisional
1739-1946 1.83e-03

transcriptional regulator ICP4; Provisional


Pssm-ID: 223039 [Multi-domain]  Cd Length: 1352  Bit Score: 43.62  E-value: 1.83e-03
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1739 SLASSAPTAEKSSIFG----VSSTPLQISRVPLNQGPfAPGKPLEMGILSEPGKLGAPQTLRSSGQTLVYGGQSTSAQF- 1813
Cdd:PHA03307   23 RPPATPGDAADDLLSGsqgqLVSDSAELAAVTVVAGA-AACDRFEPPTGPPPGPGTEAPANESRSTPTWSLSTLAPASPa 101
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1814 -------PAPQAPPSPGQLPISRAPPTPGQPFIAGVPPTSGQIPSLWAPLSPGQPLVPEASsiPGDLLESGPLTFSEQLq 1886
Cdd:PHA03307  102 regsptpPGPSSPDPPPPTPPPASPPPSPAPDLSEMLRPVGSPGPPPAASPPAAGASPAAV--ASDAASSRQAALPLSS- 178
                         170       180       190       200       210       220
                  ....*....|....*....|....*....|....*....|....*....|....*....|....
gi 224458301 1887 efqPPATAeqspylQAPSTPGQHLATWTLPGRASSlwiPPTSRHPP----TLWPSPAPGKPQKS 1946
Cdd:PHA03307  179 ---PEETA------RAPSSPPAEPPPSTPPAAASP---RPPRRSSPisasASSPAPAPGRSAAD 230
PRK07764 PRK07764
DNA polymerase III subunits gamma and tau; Validated
1770-1944 1.85e-03

DNA polymerase III subunits gamma and tau; Validated


Pssm-ID: 236090 [Multi-domain]  Cd Length: 824  Bit Score: 43.44  E-value: 1.85e-03
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1770 GPFAPGKPLEMGILSEPGKLGAPqtlrssgqtlvyGGQSTSAQFPAPQAPPSPGQLPISRAPPTPGQPFIAGVPPTSGQI 1849
Cdd:PRK07764  599 GPPAPASSGPPEEAARPAAPAAP------------AAPAAPAPAGAAAAPAEASAAPAPGVAAPEHHPKHVAVPDASDGG 666
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1850 PSLWAPLSPGQPLVPEASSIPGDllESGPLTFSEQLQEFQPPATAEQSPYLQAPSTPGQHLATWTLPGRASSLWIPPTSR 1929
Cdd:PRK07764  667 DGWPAKAGGAAPAAPPPAPAPAA--PAAPAGAAPAQPAPAPAATPPAGQADDPAAQPPQAAQGASAPSPAADDPVPLPPE 744
                         170
                  ....*....|....*
gi 224458301 1930 HPPTLWPSPAPGKPQ 1944
Cdd:PRK07764  745 PDDPPDPAGAPAQPP 759
Atrophin-1 pfam03154
Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian ...
1797-1974 1.94e-03

Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian atrophy (DRPLA) gene. DRPLA (OMIM:125370) is a progressive neurodegenerative disorder. It is caused by the expansion of a CAG repeat in the DRPLA gene on chromosome 12p. This results in an extended polyglutamine region in atrophin-1 that is thought to confer toxicity to the protein, possibly through altering its interactions with other proteins. The expansion of a CAG repeat is also the underlying defect in six other neurodegenerative disorders, including Huntington's disease. One interaction of expanded polyglutamine repeats that is thought to be pathogenic is that with the short glutamine repeat in the transcriptional coactivator CREB binding protein, CBP. This interaction draws CBP away from its usual nuclear location to the expanded polyglutamine repeat protein aggregates that are characteristic of the polyglutamine neurodegenerative disorders. This interferes with CBP-mediated transcription and causes cytotoxicity.


Pssm-ID: 460830 [Multi-domain]  Cd Length: 991  Bit Score: 43.60  E-value: 1.94e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301  1797 SSGQTLVYGGQSTSAQFPAPQAPPSPgqlpisRAPPTPGQPFIAGVPPTSGQIPSLWAPLSPGQPLVPEASSIPGDLLES 1876
Cdd:pfam03154  160 SSAQQQILQTQPPVLQAQSGAASPPS------PPPPGTTQAATAGPTPSAPSVPPQGSPATSQPPNQTQSTAAPHTLIQQ 233
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301  1877 GPLTFSEQLQEFQPPATAEQSP----YLQAPSTPGQHLATWTLPGRASSLWIPPTSRHPPTLWPSPAPGKPQKSWSPSVA 1952
Cdd:pfam03154  234 TPTLHPQRLPSPHPPLQPMTQPpppsQVSPQPLPQPSLHGQMPPMPHSLQTGPSHMQHPVPPQPFPLTPQSSQSQVPPGP 313
                          170       180
                   ....*....|....*....|..
gi 224458301  1953 KKRLAIISSLKSKSVLIHPSAP 1974
Cdd:pfam03154  314 SPAAPGQSQQRIHTPPSQSQLQ 335
PHA03377 PHA03377
EBNA-3C; Provisional
1790-1975 2.28e-03



Pssm-ID: 177614 [Multi-domain]  Cd Length: 1000  Bit Score: 43.50  E-value: 2.28e-03
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1790 GAPQTLRSSGQTLVYGGQSTSAQ----FPAPQAPPSPGQLPISRAPPTPGQPFIAGVPPTSGQIP--SLWAPLSPGQPLV 1863
Cdd:PHA03377  696 GRAQPSEESHLSSMSPTQPISHEeqprYEDPDDPLDLSLHPDQAPPPSHQAPYSGHEEPQAQQAPypGYWEPRPPQAPYL 775
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1864 ----PEA-----SSIPGDlleSGPLTFSEQLQEFQPP-ATAEQSPYLQAPSTPgqhlatwtlpgrasslWIPPTSRHPPT 1933
Cdd:PHA03377  776 gyqePQAqgvqvSSYPGY---AGPWGLRAQHPRYRHSwAYWSQYPGHGHPQGP----------------WAPRPPHLPPQ 836
                         170       180       190       200
                  ....*....|....*....|....*....|....*....|..
gi 224458301 1934 LWPSPAPGKPQKSWSPSVAKKRLAIISSLKSKSVLIHPSAPD 1975
Cdd:PHA03377  837 WDGSAGHGQDQVSQFPHLQSETGPPRLQLSQVPQLPYSQTLV 878
PRK07764 PRK07764
DNA polymerase III subunits gamma and tau; Validated
1803-1950 2.65e-03



Pssm-ID: 236090 [Multi-domain]  Cd Length: 824  Bit Score: 43.05  E-value: 2.65e-03
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1803 VYGGQSTSAQFPAPQAPPSPGQLPISRAPPTPGQPfIAGVPPTSGQIPSLWAPLSPGQPLVPEASSIPGDLLESGPLTFS 1882
Cdd:PRK07764  587 VVGPAPGAAGGEGPPAPASSGPPEEAARPAAPAAP-AAPAAPAPAGAAAAPAEASAAPAPGVAAPEHHPKHVAVPDASDG 665
                          90       100       110       120       130       140       150
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1883 EQLQ--EFQPPATAEQSPylqAPSTPGQHLATWTLPGRASSlwiPPTSRHPPTLWPSPAPGKPQKSWSPS 1950
Cdd:PRK07764  666 GDGWpaKAGGAAPAAPPP---APAPAAPAAPAGAAPAQPAP---APAATPPAGQADDPAAQPPQAAQGAS 729
PHA03378 PHA03378
EBNA-3B; Provisional
1153-1539 4.25e-03



Pssm-ID: 223065 [Multi-domain]  Cd Length: 991  Bit Score: 42.36  E-value: 4.25e-03
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1153 QAPGISLTTQQAQKLGIPLTPQQAQAlgiPLTPQQA----QELGIPLTPQQAQALRV------------SLTPQQAQELG 1216
Cdd:PHA03378  448 QAPTVVLHRPPTQPLEGPTGPLSVQA---PLEPWQPlphpQVTPVILHQPPAQGVQAhgsmldllekddEDMEQRVMATL 524
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1217 IPLTPQQAQAlGITLTLQQAQQLGI---------PLTPQQAQALGitLTPKQVQELGIPLTPQQAqalgiTLTPKQAQEL 1287
Cdd:PHA03378  525 LPPSPPQPRA-GRRAPCVYTEDLDIesdepastePVHDQLLPAPG--LGPLQIQPLTSPTTSQLA-----SSAPSYAQTP 596
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1288 GipLNPQQAQTLGIPLTPKQAQALGipfTPQQAQALGIPLTPQQAQTQEITLTP-------QQAQALGMPLTTQQAQELG 1360
Cdd:PHA03378  597 W--PVPHPSQTPEPPTTQSHIPETS---APRQWPMPLRPIPMRPLRMQPITFNVlvfptphQPPQVEITPYKPTWTQIGH 671
                         250       260       270       280       290       300       310       320
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1361 IPLTPQHAQALGM------PLTTQQAQELGIPLTPQQAQalgmPLTTQQAQELGIPLTPQQAqelgipfTPQQAQAQEIT 1434
Cdd:PHA03378  672 IPYQPSPTGANTMlpiqwaPGTMQPPPRAPTPMRPPAAP----PGRAQRPAAATGRARPPAA-------APGRARPPAAA 740
                         330       340       350       360       370       380       390       400
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1435 LTPQQAQAlGMPLTAQQAQELGITLTPQQAQElGIPLTPQQAQALGIPLIPPQAQElgiplTPQQaqalgilliPPQAQE 1514
Cdd:PHA03378  741 PGRARPPA-AAPGRARPPAAAPGRARPPAAAP-GAPTPQPPPQAPPAPQQRPRGAP-----TPQP---------PPQAGP 804
                         410       420
                  ....*....|....*....|....*
gi 224458301 1515 LGIPLTPQQAQALGIPLIPPQAQEL 1539
Cdd:PHA03378  805 TSMQLMPRAAPGQQGPTKQILRQLL 829
Borrelia_P83 pfam05262
Borrelia P83/100 protein; This family consists of several Borrelia P83/P100 antigen proteins.
890-1030 4.41e-03



Pssm-ID: 114011 [Multi-domain]  Cd Length: 489  Bit Score: 42.30  E-value: 4.41e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301   890 KEEQKQATPKQAEQEEKQKQRGQEEEELPKSSLQRLEEGtqkmktqgllLEKENGQMRQIQKEAKHLGPHRRREKGKEKQ 969
Cdd:pfam05262  204 KERESQEDAKRAQQLKEELDKKQIDADKAQQKADFAQDN----------ADKQRDEVRQKQQEAKNLPKPADTSSPKEDK 273
                           90       100       110       120       130       140       150
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....
gi 224458301   970 K----PERGLEDLERQIKTKDQMQMKET---------QPKELEKMVIQTPMTLSPRWKSVLKDVQRSYEGKEFQ 1030
Cdd:pfam05262  274 QvaenQKREIEKAQIEIKKNDEEALKAKdhkafdlkqESKASEKEAEDKELEAQKKREPVAEDLQKTKPQVEAQ 347
SMC_prok_A TIGR02169
chromosome segregation protein SMC, primarily archaeal type; SMC (structural maintenance of ...
817-1000 4.66e-03

chromosome segregation protein SMC, primarily archaeal type; SMC (structural maintenance of chromosomes) proteins bind DNA and act in organizing and segregating chromosomes for partition. SMC proteins are found in bacteria, archaea, and eukaryotes. It is found in a single copy and is homodimeric in prokaryotes, but six paralogs (excluded from this family) are found in eukaryotes, where SMC proteins are heterodimeric. This family represents the SMC protein of archaea and a few bacteria (Aquifex, Synechocystis, etc.); the SMC of other bacteria is described by TIGR02168. The N- and C-terminal domains of this protein are well conserved, but the central hinge region is skewed in composition and highly divergent. [Cellular processes, Cell division, DNA metabolism, Chromosome-associated proteins]


Pssm-ID: 274009 [Multi-domain]  Cd Length: 1164  Bit Score: 42.36  E-value: 4.66e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301   817 EKEKQRQEqyLQEGQEQMSGMSLK--------QQLLGERNLlKEHYEKISENWEEKKAWLQMKEGKQEqqsqkqwqeeem 888
Cdd:TIGR02169  171 KKEKALEE--LEEVEENIERLDLIidekrqqlERLRREREK-AERYQALLKEKREYEGYELLKEKEAL------------ 235
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301   889 wkEEQKQATPKQ-AEQEEKQKQRGQEEEELPK---SSLQRLEEGTQKMKtqglllEKENGQMRQIQKEAKHLGPHRRREK 964
Cdd:TIGR02169  236 --ERQKEAIERQlASLEEELEKLTEEISELEKrleEIEQLLEELNKKIK------DLGEEEQLRVKEKIGELEAEIASLE 307
                          170       180       190
                   ....*....|....*....|....*....|....*..
gi 224458301   965 GKEKQKpERGLEDLERQI-KTKDQMQMKETQPKELEK 1000
Cdd:TIGR02169  308 RSIAEK-ERELEDAEERLaKLEAEIDKLLAEIEELER 343
PAT1 pfam09770
Topoisomerase II-associated protein PAT1; Members of this family are necessary for accurate ...
1389-1551 4.67e-03

Topoisomerase II-associated protein PAT1; Members of this family are necessary for accurate chromosome transmission during cell division.


Pssm-ID: 401645 [Multi-domain]  Cd Length: 846  Bit Score: 42.33  E-value: 4.67e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301  1389 PQQAQALGMPLTTQQaqelgIPLTPQQAQELGIPFTPQQAQAQEITLTPQQAQALGMPLTAQQaqelgitlTPQQAQelg 1468
Cdd:pfam09770  210 PAQQPAPAPAQPPAA-----PPAQQAQQQQQFPPQIQQQQQPQQQPQQPQQHPGQGHPVTILQ--------RPQSPQ--- 273
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301  1469 ipltPQQAQalgiPLIPPQAQELGIPLTPQQAQALGILLIP--PQAQELGIPLTPQQAQALGIPLIPPQAQ-----ELGI 1541
Cdd:pfam09770  274 ----PDPAQ----PSIQPQAQQFHQQPPPVPVQPTQILQNPnrLSAARVGYPQNPQPGVQPAPAHQAHRQQgsfgrQAPI 345
                          170
                   ....*....|
gi 224458301  1542 PLTPQQVQAL 1551
Cdd:pfam09770  346 ITHPQQLAQL 355
PRK03918 PRK03918
DNA double-strand break repair ATPase Rad50;
670-980 6.05e-03



Pssm-ID: 235175 [Multi-domain]  Cd Length: 880  Bit Score: 41.97  E-value: 6.05e-03
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301  670 EEFQEAIMAFLKQKIDNIGKAFdKKTVPKEEELlkRAEAEKLGIIKAKMEEYF--QKVAETVTKI---LRKY------KD 738
Cdd:PRK03918  447 EEHRKELLEEYTAELKRIEKEL-KEIEEKERKL--RKELRELEKVLKKESELIklKELAEQLKELeekLKKYnleeleKK 523
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301  739 TKKEEQVGEKPIKQKKVVSfmpglhfqkspiSAKSESSTLLSYESTDPVINNLIQMI---LAEIESERDIPTVSTVQKDh 815
Cdd:PRK03918  524 AEEYEKLKEKLIKLKGEIK------------SLKKELEKLEELKKKLAELEKKLDELeeeLAELLKELEELGFESVEEL- 590
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301  816 kEKEKQRQEQYLQEGQEQMSGMSLKQQLLGERNLLKEHYEKISENWEEKKAWLQMKEGKQeqqsqkqwqeeemwkEEQKQ 895
Cdd:PRK03918  591 -EERLKELEPFYNEYLELKDAEKELEREEKELKKLEEELDKAFEELAETEKRLEELRKEL---------------EELEK 654
                         250       260       270       280       290       300       310       320
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301  896 atpKQAEQEEKQKqrgqEEEELPKSS-LQRLEEGtqkmktqgllLEKENGQMRQIQKEAKHLGPHR--RREKGKEKQKPE 972
Cdd:PRK03918  655 ---KYSEEEYEEL----REEYLELSReLAGLRAE----------LEELEKRREEIKKTLEKLKEELeeREKAKKELEKLE 717

                  ....*...
gi 224458301  973 RGLEDLER 980
Cdd:PRK03918  718 KALERVEE 725
PAT1 pfam09770
Topoisomerase II-associated protein PAT1; Members of this family are necessary for accurate ...
1520-1679 6.22e-03

Topoisomerase II-associated protein PAT1; Members of this family are necessary for accurate chromosome transmission during cell division.


Pssm-ID: 401645 [Multi-domain]  Cd Length: 846  Bit Score: 41.95  E-value: 6.22e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301  1520 TPQQAQALGIPLIPPQAQelgiPLTPQQVQALGIPLIPPQAQELEIPLTPQQAQALGIPLTPQQAQELGiPLTPQQAQel 1599
Cdd:pfam09770  208 KKPAQQPAPAPAQPPAAP----PAQQAQQQQQFPPQIQQQQQPQQQPQQPQQHPGQGHPVTILQRPQSP-QPDPAQPS-- 280
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301  1600 giPLTPQQAQAQGIPLTPQQ-------------AQALGISLTPQQAQAQGITLTPQQAQALGVPITPVNAWVSAVTLTSE 1666
Cdd:pfam09770  281 --IQPQAQQFHQQPPPVPVQptqilqnpnrlsaARVGYPQNPQPGVQPAPAHQAHRQQGSFGRQAPIITHPQQLAQLSEE 358
                          170
                   ....*....|...
gi 224458301  1667 QTHAlespmNLEQ 1679
Cdd:pfam09770  359 EKAA-----YLDE 366
PHA03379 PHA03379
EBNA-3A; Provisional
1550-1998 7.96e-03



Pssm-ID: 223066 [Multi-domain]  Cd Length: 935  Bit Score: 41.58  E-value: 7.96e-03
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1550 ALGIPLIPPQAQELEIPLTPQQAQALGIPLTPQQAQELGIPLTPQQAQELGIPLTPQQAQAQGIPLTPQQAQALGISLTP 1629
Cdd:PHA03379  412 TYGTPRPPVEKPRPEVPQSLETATSHGSAQVPEPPPVHDLEPGPLHDQHSMAPCPVAQLPPGPLQDLEPGDQLPGVVQDG 491
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1630 QQAqaqgitltPQQAQALGVPItpVNAWVSAVTLTSEQTHALESPMNLEQA-----QEQLLKLGVPLTLDKAHT-LGSPL 1703
Cdd:PHA03379  492 RPA--------CAPVPAPAGPI--VRPWEASLSQVPGVAFAPVMPQPMPVEpvpvpTVALERPVCPAPPLIAMQgPGETS 561
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1704 TLKQVQWSHRPfqksKASLPTGQSIISRLSPSLRLSLASSAPTAEKSSifgVSSTPLQISRVPlnqgpfaPGKPLEMGIl 1783
Cdd:PHA03379  562 GIVRVRERWRP----APWTPNPPRSPSQMSVRDRLARLRAEAQPYQAS---VEVQPPQLTQVS-------PQQPMEYPL- 626
                         250       260       270       280       290       300       310       320
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1784 sEPGKLGAPQTLRSSGQTLVYGGQSTSAQfpaPQAPPSPGQLPIS-RAPPTPGQPFIAGVPPTSGQIPS-----LWAPLS 1857
Cdd:PHA03379  627 -EPEQQMFPGSPFSQVADVMRAGGVPAMQ---PQYFDLPLQQPISqGAPLAPLRASMGPVPPVPATQPQyfdipLTEPIN 702
                         330       340       350       360       370       380       390       400
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1858 PGQ-------------PLVPEASSIPGDLLESGPLTFSEQLQEFQPPAT------AEQSPYLQAPSTPGQhlatwtlpgr 1918
Cdd:PHA03379  703 QGAsaahflpqqpmegPLVPERWMFQGATLSQSVRPGVAQSQYFDLPLTqpinhgAPAAHFLHQPPMEGP---------- 772
                         410       420       430       440       450       460       470       480
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1919 asslWIPPTsrhpptlWP-SPAPGKPQKSWSPSVAKKRLAIISSLKSKSVLIHPSAPDFKVAQVPF--TTKKFQMSEVSD 1995
Cdd:PHA03379  773 ----WVPEQ-------WMfQGAPPSQGTDVVQHQLDALGYVLHVLNHPGVPVSPAVNQYHVSQAAFglPIDEDESGEGSD 841

                  ...
gi 224458301 1996 TSE 1998
Cdd:PHA03379  842 TSE 844
PAT1 pfam09770
Topoisomerase II-associated protein PAT1; Members of this family are necessary for accurate ...
1496-1643 8.51e-03

Topoisomerase II-associated protein PAT1; Members of this family are necessary for accurate chromosome transmission during cell division.


Pssm-ID: 401645 [Multi-domain]  Cd Length: 846  Bit Score: 41.56  E-value: 8.51e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301  1496 TPQQAQALGILLIPPQAQelgiPLTPQQAQALGIPLIPPQAQELGIPLTPQQVQALGIPlippqAQELEIPLTPQQAQAL 1575
Cdd:pfam09770  208 KKPAQQPAPAPAQPPAAP----PAQQAQQQQQFPPQIQQQQQPQQQPQQPQQHPGQGHP-----VTILQRPQSPQPDPAQ 278
                           90       100       110       120       130       140       150
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|...
gi 224458301  1576 GIPLTPQQAQELGIPLTPQQ-AQELGIPLTPQQAQAQGIPLTPQQAQALGISLTPQQAQAQG----ITLTPQQ 1643
Cdd:pfam09770  279 PSIQPQAQQFHQQPPPVPVQpTQILQNPNRLSAARVGYPQNPQPGVQPAPAHQAHRQQGSFGrqapIITHPQQ 351
PAT1 pfam09770
Topoisomerase II-associated protein PAT1; Members of this family are necessary for accurate ...
1476-1631 8.80e-03

Topoisomerase II-associated protein PAT1; Members of this family are necessary for accurate chromosome transmission during cell division.


Pssm-ID: 401645 [Multi-domain]  Cd Length: 846  Bit Score: 41.56  E-value: 8.80e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301  1476 AQALGIPliPPQAQElgIPLTPQQAQAlgillIPPQAQELGIPLTPQQAQALGIPLIPPQAQELGIPLT----PQQVQAL 1551
Cdd:pfam09770  205 AQAKKPA--QQPAPA--PAQPPAAPPA-----QQAQQQQQFPPQIQQQQQPQQQPQQPQQHPGQGHPVTilqrPQSPQPD 275
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301  1552 -GIPLIPPQAQELEIPLTPQQAQALGIPLTP--QQAQELGIPLTPQQAQElgiPLTPQQAQAQGipltPQQAQALGISLT 1628
Cdd:pfam09770  276 pAQPSIQPQAQQFHQQPPPVPVQPTQILQNPnrLSAARVGYPQNPQPGVQ---PAPAHQAHRQQ----GSFGRQAPIITH 348

                   ...
gi 224458301  1629 PQQ 1631
Cdd:pfam09770  349 PQQ 351
PHA03377 PHA03377
EBNA-3C; Provisional
1805-1951 9.26e-03



Pssm-ID: 177614 [Multi-domain]  Cd Length: 1000  Bit Score: 41.19  E-value: 9.26e-03
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1805 GGQSTSAQFPAPQAPPSPgqlPISRAPPTPGQPFIAgvPPTSGqiPSLWAPLSPGQPLVPEASSIPGDLLESGPLTFSEQ 1884
Cdd:PHA03377  539 GFQRSGRRQKRATPPKVS---PSDRGPPKASPPVMA--PPSTG--PRVMATPSTGPRDMAPPSTGPRQQAKCKDGPPASG 611
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 224458301 1885 LQEFQPPATAeqsPYLQAPSTPGQHLATWTLP---GRA-SSLW------------IPPTSRHPPTLWPSPapgkPQKSWS 1948
Cdd:PHA03377  612 PHEKQPPSSA---PRDMAPSVVRMFLRERLLEqstGPKpKSFWemragrdgsgiqQEPSSRRQPATQSTP----PRPSWL 684

                  ...
gi 224458301 1949 PSV 1951
Cdd:PHA03377  685 PSV 687
 
Blast search parameters
Data Source: Precalculated data, version = cdd.v.3.21
Preset Options:
Database: CDSEARCH/cdd
Low complexity filter: no
Composition Based Adjustment: yes
E-value threshold: 0.01
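
The per-hit summary lines in this report ("Pssm-ID: ... Cd Length: ... Bit Score: ... E-value: ...") can be read programmatically and checked against the E-value threshold listed above. The following is a minimal Python sketch, not part of the NCBI tooling: the function names `parse_hit` and `passes_threshold` are hypothetical, the regular expression is inferred only from the lines shown in this report, and treating the threshold as inclusive is an assumption.

```python
import re

# Pattern inferred from the summary lines in this report (an assumption,
# not an official format specification).
HIT_RE = re.compile(
    r"Pssm-ID:\s*(?P<pssm_id>\d+)\s*(?:\[(?P<kind>[^\]]+)\])?\s*"
    r"Cd Length:\s*(?P<cd_length>\d+)\s*"
    r"Bit Score:\s*(?P<bit_score>[\d.]+)\s*"
    r"E-value:\s*(?P<e_value>[\d.eE+-]+)"
)


def parse_hit(line):
    """Return the fields of one summary line as a dict, or None if no match."""
    m = HIT_RE.search(line)
    if m is None:
        return None
    return {
        "pssm_id": int(m["pssm_id"]),
        "kind": m["kind"],  # e.g. "Multi-domain"; None if the bracket is absent
        "cd_length": int(m["cd_length"]),
        "bit_score": float(m["bit_score"]),
        "e_value": float(m["e_value"]),
    }


def passes_threshold(hit, threshold=0.01):
    """Assumed inclusive comparison against the report's E-value threshold."""
    return hit["e_value"] <= threshold


# Summary line copied from the PRK07764 hit in this report.
sample = "Pssm-ID: 236090 [Multi-domain]  Cd Length: 824  Bit Score: 43.44  E-value: 1.85e-03"
hit = parse_hit(sample)
```

Every hit listed in this report has an E-value below 0.01, so each would pass this check.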

References:

  • Wang J et al. (2023), "The conserved domain database in 2023", Nucleic Acids Res. 51(D):384-8.
  • Lu S et al. (2020), "The conserved domain database in 2020", Nucleic Acids Res. 48(D):265-8.
  • Marchler-Bauer A et al. (2017), "CDD/SPARCLE: functional classification of proteins via subfamily domain architectures", Nucleic Acids Res. 45(D):200-3.