Conserved domains on  [gi|767998433|ref|XP_011524185|]
activity-dependent neuroprotector homeobox protein 2 isoform X1 [Homo sapiens]

List of domain hits

Name Accession Description Interval E-value
ADNP_N super family cl45031
Activity-dependent neuroprotector homeobox protein N-terminal; This entry represents the ...
1-962 1.55e-63

Activity-dependent neuroprotector homeobox protein N-terminal; This entry represents the N-terminal domain of Activity-dependent neuroprotector homeobox protein (ADNP, also known as Activity-dependent neuroprotective protein), which contains zinc finger motifs. It is involved in transcriptional regulation and is vital for mammalian brain formation. In humans, de novo mutations result in ADNP syndrome, a syndromic form of autism-like spectrum disorder (ASD) that includes cognitive and motor deficits. This protein is also related to autophagy and the pathophysiology of schizophrenia.


The actual alignment was detected with superfamily member pfam19627:

Pssm-ID: 466132 [Multi-domain]  Cd Length: 744  Bit Score: 230.51  E-value: 1.55e-63
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 767998433     1 MFQIPVENLDNIRK----------------------DLKGFDPGEKYFHNTSWGDVSLWEPSGKKVR-YRTKPYCCGLCK 57
Cdd:pfam19627    1 MFQLPVNNLGSLRKarknvkkilsdigleyckehieDFKDFEPNDFYIKNTSWDDVCLWDPSLTKNQdYRTKPFCCSGCP 80
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 767998433    58 YSTKVLTSFKNHLHRYHEDEIDQELVIPCPNCVFASQPKVVGRHFRMFHAPVRKVQNYTVNILGETKSSRSDVIS----- 132
Cdd:pfam19627   81 FSSKFFSAYKSHFRNVHSEDFENRILLNCPYCTYNGNKKTLETHIKLFHMPNNVVRQPSGGPVGFKDKSKQDSLKpkqgd 160
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 767998433   133 ------FTCLKCNFSNTLYYSMKKHVLVAHFHYLINSYFGlRTEEmgeqpKT-NDTVSIEKIPPPDKYYCKKCNANASSQ 205
Cdd:pfam19627  161 sveqavYYCKKCTYRDPLYNVVRKHIYREHFQHVAAPYVA-KPGE-----KSvNGAVASSNTRDDGSIHCKRCLFMPRTY 234
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 767998433   206 DALMYHiltsdihrdlenklrsVISEHIKrtgllkqthiapkpaahlaapangsapsapaqppcfhlalpqnspspaAGQ 285
Cdd:pfam19627  235 EALVQH----------------VIEDHER------------------------------------------------IGY 250
                          330       340       350       360       370       380       390       400
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 767998433   286 PVTVAQGapgslthsppaagqshmtlvssplpvgqnsltlqppapqpvflshgvplHQSVnppVLPLSQPvgpvnksvgt 365
Cdd:pfam19627  251 QVTAMIG-------------------------------------------------HTNV---VVPRSKP---------- 268
                          410       420       430       440       450       460       470       480
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 767998433   366 svlpinqtvrpgvLPLTQPVGPINRPVGpgvLPVSPSVTPGVLQAvspgvlsvsravpsgvlpagqmtpagqmtpagvip 445
Cdd:pfam19627  269 -------------LMLIAPKPQDKKSLG---VTQKGGLVTGNVRS----------------------------------- 297
                          490       500       510       520       530       540       550       560
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 767998433   446 gqtatsgvLPTGQMvqsgvlpvgqtapSRVLPPGqtaplrvisagqvvpSGLLSPnqtvsssavVPVNQGvnsgvlQLSQ 525
Cdd:pfam19627  298 --------LSSQQM-------------NRLSIPK---------------ANLLSN---------VHLKQG------SYGL 326
                          570       580       590       600       610       620       630       640
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 767998433   526 PVVSGVLPVGQPVRPGvlqlnqtvgtniLPVNQPVR-PGASQNttfltsgsiLRQLIPTGkqvNGIPTYTLAPVSVTLP- 603
Cdd:pfam19627  327 KSMPSFYVLGQQVRLS------------LPGNAQVSvPQQSQT---------VKQLLPGG---NGRPSTVGSSQSGQQPa 382
                          650       660       670       680       690       700       710       720
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 767998433   604 ---VPPGGLATVAPPQM-PIQLLPSGAAAPMAGSMPGMPSppvlvnaaqsvfvqASSSAADTNQVLKqakqWKTCPVCNE 679
Cdd:pfam19627  383 rfsVQSGNSASSSSSQLkSPPLSSSVAATRALGQGPSKSS--------------ASAAGLNTSYTQK----WKICTICNE 444
                          730       740       750       760       770       780       790       800
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 767998433   680 LFPSNVYQVHMEVAHKhsesksgeklePEKLAACAPFLKWMREKTVRCLSCKCLVSEEELIHHLLMHGLGCLFCPCTFHD 759
Cdd:pfam19627  445 LFPENVYSAHFEKEHK-----------AEKVPAVANYIMKIHNFTSKCLYCNRYLPSDTLLNHMLIHGLSCPYCRSTFND 513
                          810       820       830       840       850       860       870       880
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 767998433   760 IKGLSEHSRNRHLGKKKLPMDYSNRGFQLDVDaNGNLLFPHLDFITILPKEKLGEREVYLA---------ILAGIHSKSL 830
Cdd:pfam19627  514 VEKMVAHMRMVHPDEEVGPRTDSPLTFDLTLQ-QGNPKNIQLLVTTYNMRDAPEESVAFHAqnnspqpkkPKPKVQEKSD 592
                          890       900       910       920       930       940       950       960
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 767998433   831 VPvyVKVRPQaegTPGSTGKRV--STCPFCF----GPFvtTEAYELHLKERHHIMPTVHTVLKSPAFKCIHCCGVYTGNM 904
Cdd:pfam19627  593 VP--VKSSPQ---AAVPYKKDVgkTLCPLCFsilkGPI--SDALAHHLRERHQVIQTVHPVEKKLTYKCIHCLGVYTSNM 665
                          970       980       990      1000      1010      1020
                   ....*....|....*....|....*....|....*....|....*....|....*....|.
gi 767998433   905 TLAAIAVHLVRCRSAPK--DSSSDLQAQPGFIHNSELLLVSGEVMH-DSSFSVKRKLPDGH 962
Cdd:pfam19627  666 TASTITLHLVHCRGVGKtqNGQDKSAPSPRVTQSPGAAPLKRELEHvDPALPKKRKLDDEE 726
PHA03247 super family cl33720
large tegument protein UL36; Provisional
246-492 2.16e-09

large tegument protein UL36; Provisional


The actual alignment was detected with superfamily member PHA03247:

Pssm-ID: 223021 [Multi-domain]  Cd Length: 3151  Bit Score: 62.26  E-value: 2.16e-09
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 767998433  246 PKPAAhLAAPANGSAPSAPAQPPCFHLALPQNSP-SPAAGQPVTVAQGAPGSLTHSPPAAGQSHMTLVSSPLPVGQNSLT 324
Cdd:PHA03247 2758 ARPPT-TAGPPAPAPPAAPAAGPPRRLTRPAVASlSESRESLPSPWDPADPPAAVLAPAAALPPAASPAGPLPPPTSAQP 2836
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 767998433  325 LQPPAPqPVFLSHGVPLHQSVNP--PVLPLSQPVGPVNKSVGTSVLPINQTVRPGVLPLTQPVGpiNRPVGPGVLPVSPS 402
Cdd:PHA03247 2837 TAPPPP-PGPPPPSLPLGGSVAPggDVRRRPPSRSPAAKPAAPARPPVRRLARPAVSRSTESFA--LPPDQPERPPQPQA 2913
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 767998433  403 VTPGVLQAVSPGVLSVSRAVPSGVLPAGQMTP----AGQMTPAGVIPgqTATSGVLPTGQmVQSGVLPVGQTAPSRVLPP 478
Cdd:PHA03247 2914 PPPPQPQPQPPPPPQPQPPPPPPPRPQPPLAPttdpAGAGEPSGAVP--QPWLGALVPGR-VAVPRFRVPQPAPSREAPA 2990
                         250
                  ....*....|....
gi 767998433  479 GQTAPLRVISAGQV 492
Cdd:PHA03247 2991 SSTPPLTGHSLSRV 3004
HOX smart00389
Homeodomain; DNA-binding factors that are involved in the transcriptional regulation of key ...
1021-1074 1.58e-06

Homeodomain; DNA-binding factors that are involved in the transcriptional regulation of key developmental processes


Pssm-ID: 197696 [Multi-domain]  Cd Length: 57  Bit Score: 46.09  E-value: 1.58e-06
                            10        20        30        40        50
                    ....*....|....*....|....*....|....*....|....*....|....
gi 767998433   1021 PKKYEGRSYEEKKQFLKDYFHKKPYPSKKEIELLSSLFWVWKIDVASFFGKRRY 1074
Cdd:smart00389    1 KRRKRTSFTPEQLEELEKEFQKNPYPSREEREELAKKLGLSERQVKVWFQNRRA 54
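The pairwise alignments in this listing can be summarized as a fraction of identical columns. The following is a minimal sketch, not part of the CD-Search output: the function name `alignment_identity` is hypothetical, and it assumes that '-' marks a gap and that lowercase letters mark residues the tool left unaligned, so only columns with an uppercase letter in both rows are counted. The two rows are copied from the HOX (smart00389) alignment above.

```python
def alignment_identity(query: str, subject: str) -> tuple[int, int]:
    """Return (identical, aligned_columns) for two equal-length alignment rows.

    Assumption: '-' is a gap and lowercase letters are unaligned residues,
    so a column is counted only when both rows carry an uppercase letter.
    """
    if len(query) != len(subject):
        raise ValueError("alignment rows must have the same length")
    identical = aligned = 0
    for q, s in zip(query, subject):
        if q.isupper() and s.isupper():
            aligned += 1
            if q == s:
                identical += 1
    return identical, aligned


# Rows copied from the HOX (smart00389) alignment, query residues 1021-1074.
hox_query = "PKKYEGRSYEEKKQFLKDYFHKKPYPSKKEIELLSSLFWVWKIDVASFFGKRRY"
hox_subject = "KRRKRTSFTPEQLEELEKEFQKNPYPSREEREELAKKLGLSERQVKVWFQNRRA"

identical, aligned = alignment_identity(hox_query, hox_subject)
print(f"{identical}/{aligned} identical ({100 * identical / aligned:.1f}%)")
```

For alignments that span several 80-column blocks, the rows of each block would need to be concatenated per sequence before calling the function.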
 
Name Accession Description Interval E-value
ADNP_N pfam19627
Activity-dependent neuroprotector homeobox protein N-terminal; This entry represents the ...
1-962 1.55e-63

PHA03247 PHA03247
large tegument protein UL36; Provisional
248-641 5.40e-15

large tegument protein UL36; Provisional


Pssm-ID: 223021 [Multi-domain]  Cd Length: 3151  Bit Score: 80.37  E-value: 5.40e-15
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 767998433  248 PAAHLAAPANGSAPSAPAQPPcfhlalpqnsPSPAAGQPVTVAQGAPGSLTHSPPAA--GQSHMTLVSSPLPVGQNSLTL 325
Cdd:PHA03247 2557 PAAPPAAPDRSVPPPRPAPRP----------SEPAVTSRARRPDAPPQSARPRAPVDdrGDPRGPAPPSPLPPDTHAPDP 2626
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 767998433  326 QPPAPQPVFLSHGVPLHQSVNPPVLPLSQPVGPV------NKSVGTSVLPINQTVRPGVLPLTQPVGPINRPVGPGVLPV 399
Cdd:PHA03247 2627 PPPSPSPAANEPDPHPPPTVPPPERPRDDPAPGRvsrprrARRLGRAAQASSPPQRPRRRAARPTVGSLTSLADPPPPPP 2706
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 767998433  400 SPSVTPGVLQAVSPgvLSVSRAVPSGVLPAGQMTPAGQMTPAG-VIPGQTATSGVLPTGQMVQSGVLPVGQ-TAPSRVLP 477
Cdd:PHA03247 2707 TPEPAPHALVSATP--LPPGPAAARQASPALPAAPAPPAVPAGpATPGGPARPARPPTTAGPPAPAPPAAPaAGPPRRLT 2784
                         250       260       270       280       290       300       310       320
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 767998433  478 PGQTAPLRVISAGQVVPSGLLSPNQTVSS-SAVVPVNQGVNSGVlqlsqPVVSGVLPVGQPVRPGVLQLNQTVGTNILPV 556
Cdd:PHA03247 2785 RPAVASLSESRESLPSPWDPADPPAAVLApAAALPPAASPAGPL-----PPPTSAQPTAPPPPPGPPPPSLPLGGSVAPG 2859
                         330       340       350       360       370       380       390       400
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 767998433  557 NQPVRPGASQNTTFLTSGsilrqliPTGKQVNGIPTYTLAPVSVTLPVPPGGLATVAPPQMPIQLLPSGAAAPMAGSMPG 636
Cdd:PHA03247 2860 GDVRRRPPSRSPAAKPAA-------PARPPVRRLARPAVSRSTESFALPPDQPERPPQPQAPPPPQPQPQPPPPPQPQPP 2932

                  ....*
gi 767998433  637 MPSPP 641
Cdd:PHA03247 2933 PPPPP 2937
PHA03247 PHA03247
large tegument protein UL36; Provisional
246-492 2.16e-09

SP1-4_arthropods_N cd22553
N-terminal domain of transcription factor Specificity Protein (SP) 1-4 from arthropods; ...
362-679 1.58e-07

N-terminal domain of transcription factor Specificity Protein (SP) 1-4 from arthropods; Specificity Proteins (SPs) are transcription factors that are involved in many cellular processes, including cell differentiation, cell growth, apoptosis, immune responses, response to DNA damage, and chromatin remodeling. There are many SPs in vertebrates (9 SPs in humans and mice, 7 SPs in the chicken, and 11 SPs in teleost fish), but arthropods only have 3 SPs. One of these belongs to clade SP1-4, which is expressed ubiquitously throughout development. SP1-4 belongs to a family of proteins, called the SP/Kruppel or Krueppel-like Factor (KLF) family, characterized by a C-terminal DNA-binding domain of 81 amino acids consisting of three Kruppel-like C2H2 zinc fingers. These factors bind to a loose consensus motif, namely NNRCRCCYY (where N is any nucleotide, R is A/G, and Y is C/T), such as the recurring motifs in GC and GT boxes (5'-GGGGCGGGG-3' and 5'-GGTGTGGGG-3') that are present in promoters and more distal regulatory elements of mammalian genes. SP factors preferentially bind GC boxes, while KLFs bind CACCC boxes. Another characteristic hallmark of SP factors is the presence of the Buttonhead (BTD) box CXCPXC, just N-terminal to the zinc fingers. The function of the BTD box is unknown, but it is thought to play an important physiological role. Another feature of most SP factors is the presence of a conserved amino acid stretch, the so-called SP box, located close to the N-terminus. This model represents the N-terminal domain of SP1-4 from arthropods.


Pssm-ID: 411778 [Multi-domain]  Cd Length: 384  Bit Score: 55.03  E-value: 1.58e-07
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 767998433  362 SVGTSVLPINQTVRPGVLPLTQPVGPINRPVGPGVLPVSPSVTPGVLQAVSpgVLSVSRAVPSGVLPAGQMTPAG-QMTP 440
Cdd:cd22553    35 ETHDPLILSPPLSQPQQIITAQSSGSAAGGVAYSVSPAVQTVTVDGHEAIF--IPANSGLLQTNNQQAIQLAPGGtQAIL 112
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 767998433  441 AGvipGQTATSGVLPTGQMVQSGVLPV-GQTAPSRV---LPP---GQTAPLRV-ISA--GQVVPSGLLSPNQTVSSSAVV 510
Cdd:cd22553   113 AN---QQTLIRPNTVQGQANASNVLQNiAQIASGGNavqLPLnnmTQTIPVQVpVSTanGQTVYQTIQVPIQAIQSGNAG 189
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 767998433  511 PVNQGVNSGVL-QLSQPvvsgvlpvgqpvrpGVLQLNQTVGTNILPVNQPVRPGASQNTTFL------TSGSILRQLIPT 583
Cdd:cd22553   190 GGNQALQAQVIpQLAQA--------------AQLQPQQLAQVSSQGYIQQIPANASQQQPQMvqqgpnQSGQIIGQVASA 255
                         250       260       270       280       290       300       310       320
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 767998433  584 -GKQVNGIPTYTLApvSVTLPVPPGGLATVAP-PQMPIQLLPSGAAAPMAGSMPGMPSPPVLVNAAQSVFVQASSSAADT 661
Cdd:cd22553   256 sSIQAAAIPLTVYT--GALAGQNGSNQQQVGQiVTSPIQGMTQGLTAPASSSIPTVVQQQAIQGNPLPPGTQIIAAGQQL 333
                         330       340       350
                  ....*....|....*....|....*....|....*.
gi 767998433  662 NQVLKQAKQWK------------------TCPVCNE 679
Cdd:cd22553   334 QQDPNDPTKWQvvadgtpgskkrlrrvacTCPNCRD 369
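The SP1-4 description above gives the consensus NNRCRCCYY in IUPAC nucleotide codes (N any base, R A/G, Y C/T). As a hedged illustration of how such a motif can be scanned for, the sketch below translates it into a regular expression. The helper names `motif_to_regex` and `find_sites` and the test sequence are made up for this example. Because the description calls the consensus "loose", a strict match like this one is only an approximation of real binding-site preferences.

```python
import re

# Subset of IUPAC nucleotide codes needed for the motif quoted above.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]",    # purine
    "Y": "[CT]",    # pyrimidine
    "N": "[ACGT]",  # any base
}


def motif_to_regex(motif: str) -> re.Pattern:
    """Translate an IUPAC motif into a compiled regular expression."""
    return re.compile("".join(IUPAC[base] for base in motif.upper()))


def find_sites(pattern: re.Pattern, sequence: str) -> list[int]:
    """Return 0-based start positions of all (possibly overlapping) matches."""
    lookahead = re.compile(f"(?=({pattern.pattern}))")
    return [m.start() for m in lookahead.finditer(sequence.upper())]


sp_motif = motif_to_regex("NNRCRCCYY")

# Hypothetical DNA fragment, chosen only to exercise the code.
example_sequence = "TTAAGCACCTTGG"
print(sp_motif.pattern)
print(find_sites(sp_motif, example_sequence))
```

Only the forward strand is scanned here; a fuller search would also scan the reverse complement.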
HOX smart00389
Homeodomain; DNA-binding factors that are involved in the transcriptional regulation of key ...
1021-1074 1.58e-06

Soli_cterm TIGR03437
Solibacter uncharacterized C-terminal domain; This model describes a protein domain found in ...
443-645 4.34e-06

Solibacter uncharacterized C-terminal domain; This model describes a protein domain found in 90 proteins of Solibacter usitatus Ellin6076, nearly always as the C-terminal domain of a much larger protein. No homologs to this domain are detected outside of S. usitatus, a member of the Acidobacteria.


Pssm-ID: 274578 [Multi-domain]  Cd Length: 215  Bit Score: 48.81  E-value: 4.34e-06
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 767998433   443 VIPGQTAT---SGVLPTGQMVQSGVLPVgQTAPSRVLPPGQTAPLRVISAGQV---VPSGLLSPNQTVsssaVVpVNQGV 516
Cdd:TIGR03437    2 VAPGSIVSifgTNLAPATLTAAGGPLPT-SLGGVSVTVNGVAAPLLYVSPGQInaqVPYEVAPGAATV----TV-TYNGG 75
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 767998433   517 NSGVLQLS-QPVVSGVLPVGQ-PVRPGVLQLNQtvGTNILPVNQPVRPGaSQNTTFLTSGSILRQLIPTGKQVNGIPTY- 593
Cdd:TIGR03437   76 ASAAVTVTvAAAAPGIFTLDGsGTGQAAALNNQ--DGSVNSAANPAAPG-DVVVLYATGLGPTSPAVADGAPAPSSPLAp 152
                          170       180       190       200       210       220
                   ....*....|....*....|....*....|....*....|....*....|....*....|.
gi 767998433   594 TLAPVSVTL-----PVPPGGLATVAPPQMPIQL-LPSGAAA---PMAGSMPGMPSPPVLVN 645
Cdd:TIGR03437  153 ALAPVTVTIggvpaTVLYAGLAPGFVGLYQVNVrVPAGLATgavPVVITVGGVTSNAVTIA 213
homeodomain cd00086
Homeodomain; DNA binding domains involved in the transcriptional regulation of key eukaryotic ...
1022-1078 7.71e-06

Homeodomain; DNA binding domains involved in the transcriptional regulation of key eukaryotic developmental processes; may bind to DNA as monomers or as homo- and/or heterodimers, in a sequence-specific manner.


Pssm-ID: 238039 [Multi-domain]  Cd Length: 59  Bit Score: 44.16  E-value: 7.71e-06
                          10        20        30        40        50
                  ....*....|....*....|....*....|....*....|....*....|....*..
gi 767998433 1022 KKYEGRSYEEKKQFLKDYFHKKPYPSKKEIELLSSLFWVWKIDVASFFGKRRYICMK 1078
Cdd:cd00086     1 RRKRTRFTPEQLEELEKEFEKNPYPSREEREELAKELGLTERQVKIWFQNRRAKLKR 57
PPE COG5651
PPE-repeat protein [Function unknown];
340-542 3.17e-05

PPE-repeat protein [Function unknown];


Pssm-ID: 444372 [Multi-domain]  Cd Length: 385  Bit Score: 47.58  E-value: 3.17e-05
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 767998433  340 PLHQSVNPPVLPLSQPVGPVNKSVGTSV--LPINQTVRPGVLPLTQPVGPINRPVGPGVLPVSPSVTPGVLQAVSPGVLS 417
Cdd:COG5651   170 PPPTITNPGGLLGAQNAGSGNTSSNPGFanLGLTGLNQVGIGGLNSGSGPIGLNSGPGNTGFAGTGAAAGAAAAAAAAAA 249
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 767998433  418 VSRAVPSGVLPAGQMTPAGQMTPAGVIPGQTATSGVLPTGQMVQSGVLPVGQTAPSRVLPPGQTAPLRVISAGqVVPSGL 497
Cdd:COG5651   250 AAGAGASAALASLAATLLNASSLGLAATAASSAATNLGLAGSPLGLAGGGAGAAAATGLGLGAGGAAGAAGAT-GAGAAL 328
                         170       180       190       200
                  ....*....|....*....|....*....|....*....|....*
gi 767998433  498 LSPNQTVSSSAVVPVNQGVNSGVLQLSQPVVSGVLPVGQPVRPGV 542
Cdd:COG5651   329 GAGAAAAAAGAAAGAGAAAAAAAGGAGGGGGGALGAGGGGGSAGA 373
Atrophin-1 pfam03154
Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian ...
242-412 2.64e-04

Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian atrophy (DRPLA) gene. DRPLA (OMIM:125370) is a progressive neurodegenerative disorder. It is caused by the expansion of a CAG repeat in the DRPLA gene on chromosome 12p. This results in an extended polyglutamine region in atrophin-1 that is thought to confer toxicity to the protein, possibly by altering its interactions with other proteins. The expansion of a CAG repeat is also the underlying defect in six other neurodegenerative disorders, including Huntington's disease. One interaction of expanded polyglutamine repeats that is thought to be pathogenic is that with the short glutamine repeat in the transcriptional coactivator CREB binding protein, CBP. This interaction draws CBP away from its usual nuclear location to the expanded polyglutamine repeat protein aggregates that are characteristic of the polyglutamine neurodegenerative disorders. This interferes with CBP-mediated transcription and causes cytotoxicity.


Pssm-ID: 460830 [Multi-domain]  Cd Length: 991  Bit Score: 45.14  E-value: 2.64e-04
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 767998433   242 THIAPKPAAH-LAAPANGSAPSApaQPPCFHLaLPQNS--PSPAAgQPVTVAQgapgSLTHSPPAAgqshmtlvSSPLPV 318
Cdd:pfam03154  388 SNLPPPPALKpLSSLSTHHPPSA--HPPPLQL-MPQSQqlPPPPA-QPPVLTQ----SQSLPPPAA--------SHPPTS 451
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 767998433   319 GQNSLTLQPPAPQPVFLSHGVPL------HQSVNPPVLPLSQPVGPVNKSVGTSVLPINQTVRPGVLPLTQPVGPINRPV 392
Cdd:pfam03154  452 GLHQVPSQSPFPQHPFVPGGPPPitppsgPPTSTSSAMPGIQPPSSASVSSSGPVPAAVSCPLPPVQIKEEALDEAEEPE 531
                          170       180
                   ....*....|....*....|
gi 767998433   393 GPGVLPVSPSVTPGVLQAVS 412
Cdd:pfam03154  532 SPPPPPRSPSPEPTVVNTPS 551
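Each hit in the listing above carries a one-line summary of the form "Pssm-ID: ... Cd Length: ... Bit Score: ... E-value: ...". When the page is processed as plain text, those lines can be pulled into records and ranked. This is a minimal sketch that assumes the field order and labels shown on this page. The names `HitSummary` and `parse_summary` are hypothetical and not part of any NCBI tool.

```python
import re
from typing import NamedTuple, Optional


class HitSummary(NamedTuple):
    pssm_id: int
    cd_length: int
    bit_score: float
    e_value: float


# Assumed layout, taken from the summary lines on this page.
SUMMARY_RE = re.compile(
    r"Pssm-ID:\s*(\d+).*?Cd Length:\s*(\d+)\s+"
    r"Bit Score:\s*([\d.]+)\s+E-value:\s*([\d.eE+-]+)"
)


def parse_summary(line: str) -> Optional[HitSummary]:
    """Parse one summary line; return None if the line has another shape."""
    match = SUMMARY_RE.search(line)
    if match is None:
        return None
    pssm_id, cd_length, bit_score, e_value = match.groups()
    return HitSummary(int(pssm_id), int(cd_length), float(bit_score), float(e_value))


# Three summary lines copied from the listing above.
lines = [
    "Pssm-ID: 460830 [Multi-domain]  Cd Length: 991  Bit Score: 45.14  E-value: 2.64e-04",
    "Pssm-ID: 466132 [Multi-domain]  Cd Length: 744  Bit Score: 230.51  E-value: 1.55e-63",
    "Pssm-ID: 223021 [Multi-domain]  Cd Length: 3151  Bit Score: 80.37  E-value: 5.40e-15",
]

parsed = [parse_summary(line) for line in lines]
hits = sorted((hit for hit in parsed if hit is not None), key=lambda hit: hit.e_value)
for hit in hits:
    print(hit.pssm_id, hit.bit_score, hit.e_value)
```

Sorting by E-value in ascending order puts the most significant hit first, which here is the ADNP_N model (Pssm-ID 466132).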
 
Cdd:pfam19627  514 VEKMVAHMRMVHPDEEVGPRTDSPLTFDLTLQ-QGNPKNIQLLVTTYNMRDAPEESVAFHAqnnspqpkkPKPKVQEKSD 592
                          890       900       910       920       930       940       950       960
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 767998433   831 VPvyVKVRPQaegTPGSTGKRV--STCPFCF----GPFvtTEAYELHLKERHHIMPTVHTVLKSPAFKCIHCCGVYTGNM 904
Cdd:pfam19627  593 VP--VKSSPQ---AAVPYKKDVgkTLCPLCFsilkGPI--SDALAHHLRERHQVIQTVHPVEKKLTYKCIHCLGVYTSNM 665
                          970       980       990      1000      1010      1020
                   ....*....|....*....|....*....|....*....|....*....|....*....|.
gi 767998433   905 TLAAIAVHLVRCRSAPK--DSSSDLQAQPGFIHNSELLLVSGEVMH-DSSFSVKRKLPDGH 962
Cdd:pfam19627  666 TASTITLHLVHCRGVGKtqNGQDKSAPSPRVTQSPGAAPLKRELEHvDPALPKKRKLDDEE 726
PHA03247 PHA03247
large tegument protein UL36; Provisional
248-641 5.40e-15

Pssm-ID: 223021 [Multi-domain]  Cd Length: 3151  Bit Score: 80.37  E-value: 5.40e-15
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 767998433  248 PAAHLAAPANGSAPSAPAQPPcfhlalpqnsPSPAAGQPVTVAQGAPGSLTHSPPAA--GQSHMTLVSSPLPVGQNSLTL 325
Cdd:PHA03247 2557 PAAPPAAPDRSVPPPRPAPRP----------SEPAVTSRARRPDAPPQSARPRAPVDdrGDPRGPAPPSPLPPDTHAPDP 2626
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 767998433  326 QPPAPQPVFLSHGVPLHQSVNPPVLPLSQPVGPV------NKSVGTSVLPINQTVRPGVLPLTQPVGPINRPVGPGVLPV 399
Cdd:PHA03247 2627 PPPSPSPAANEPDPHPPPTVPPPERPRDDPAPGRvsrprrARRLGRAAQASSPPQRPRRRAARPTVGSLTSLADPPPPPP 2706
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 767998433  400 SPSVTPGVLQAVSPgvLSVSRAVPSGVLPAGQMTPAGQMTPAG-VIPGQTATSGVLPTGQMVQSGVLPVGQ-TAPSRVLP 477
Cdd:PHA03247 2707 TPEPAPHALVSATP--LPPGPAAARQASPALPAAPAPPAVPAGpATPGGPARPARPPTTAGPPAPAPPAAPaAGPPRRLT 2784
                         250       260       270       280       290       300       310       320
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 767998433  478 PGQTAPLRVISAGQVVPSGLLSPNQTVSS-SAVVPVNQGVNSGVlqlsqPVVSGVLPVGQPVRPGVLQLNQTVGTNILPV 556
Cdd:PHA03247 2785 RPAVASLSESRESLPSPWDPADPPAAVLApAAALPPAASPAGPL-----PPPTSAQPTAPPPPPGPPPPSLPLGGSVAPG 2859
                         330       340       350       360       370       380       390       400
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 767998433  557 NQPVRPGASQNTTFLTSGsilrqliPTGKQVNGIPTYTLAPVSVTLPVPPGGLATVAPPQMPIQLLPSGAAAPMAGSMPG 636
Cdd:PHA03247 2860 GDVRRRPPSRSPAAKPAA-------PARPPVRRLARPAVSRSTESFALPPDQPERPPQPQAPPPPQPQPQPPPPPQPQPP 2932

                  ....*
gi 767998433  637 MPSPP 641
Cdd:PHA03247 2933 PPPPP 2937
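Each hit in this listing carries a statistics line of the form `Pssm-ID: … Cd Length: … Bit Score: … E-value: …`. The following is a minimal Python sketch of how such a line could be turned into typed fields. The field layout is inferred only from the lines shown on this page, and the names `STAT_LINE` and `parse_stat_line` are hypothetical helpers, not part of any NCBI tool.

```python
import re

# Pattern for the per-hit statistics line as it appears in this listing.
# Assumption: the layout is inferred from the text above, not from an NCBI spec.
STAT_LINE = re.compile(
    r"Pssm-ID:\s*(?P<pssm_id>\d+)"
    r"(?:\s*\[(?P<hit_type>[^\]]+)\])?"
    r"\s+Cd Length:\s*(?P<cd_length>\d+)"
    r"\s+Bit Score:\s*(?P<bit_score>[\d.]+)"
    r"\s+E-value:\s*(?P<e_value>[\d.eE+-]+)"
)


def parse_stat_line(line):
    """Return the numeric fields of one statistics line, or None if it does not match."""
    m = STAT_LINE.search(line)
    if m is None:
        return None
    return {
        "pssm_id": int(m.group("pssm_id")),
        "hit_type": m.group("hit_type"),  # e.g. "Multi-domain"; None if absent
        "cd_length": int(m.group("cd_length")),
        "bit_score": float(m.group("bit_score")),
        "e_value": float(m.group("e_value")),
    }


# Statistics line copied from the first PHA03247 hit above.
example = "Pssm-ID: 223021 [Multi-domain]  Cd Length: 3151  Bit Score: 80.37  E-value: 5.40e-15"
hit = parse_stat_line(example)
print(hit)
```

Parsed this way, the hits can be sorted or filtered by E-value without re-reading the alignment blocks.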
PHA03247 PHA03247
large tegument protein UL36; Provisional
246-640 7.02e-14

Pssm-ID: 223021 [Multi-domain]  Cd Length: 3151  Bit Score: 76.90  E-value: 7.02e-14
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 767998433  246 PKPAAHLAAPANGSA---PSAPAQP--PCFHLALPQNSPSPAAGQPVTVAQGAPgsltHSPPAAGQSHmtlvSSPLPVGQ 320
Cdd:PHA03247 2571 PRPAPRPSEPAVTSRarrPDAPPQSarPRAPVDDRGDPRGPAPPSPLPPDTHAP----DPPPPSPSPA----ANEPDPHP 2642
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 767998433  321 NSLTLQPPAPQPVFLSHGVPLHQSVNP---PVLPLSQPVGPVNKSVGTSVLPINQTVRPGvlPLTQPVGPINRPVGPGVl 397
Cdd:PHA03247 2643 PPTVPPPERPRDDPAPGRVSRPRRARRlgrAAQASSPPQRPRRRAARPTVGSLTSLADPP--PPPPTPEPAPHALVSAT- 2719
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 767998433  398 PVSPSVTPGVLQAVSPGVLSVSRAVPSG-VLPAGQMTPAGQMTPAGviPGQTATSGVLPTGQMVQSGVLPVGQTAPSRVL 476
Cdd:PHA03247 2720 PLPPGPAAARQASPALPAAPAPPAVPAGpATPGGPARPARPPTTAG--PPAPAPPAAPAAGPPRRLTRPAVASLSESRES 2797
                         250       260       270       280       290       300       310       320
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 767998433  477 PPGQTAPLRViSAGQVVPSGLLSPNQTVSS-----SAVVPVNQGVNSGVLQLSQPVVSGVLPVGQPVRPGVLQlnQTVGT 551
Cdd:PHA03247 2798 LPSPWDPADP-PAAVLAPAAALPPAASPAGplpppTSAQPTAPPPPPGPPPPSLPLGGSVAPGGDVRRRPPSR--SPAAK 2874
                         330       340       350       360       370       380       390       400
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 767998433  552 NILPVNQPVR----PGASQNTTFLTSGSILRQLIPTGK-QVNGIPTYTLAPVSVTLPVPPgglatvAPPQMPIQLLPSGA 626
Cdd:PHA03247 2875 PAAPARPPVRrlarPAVSRSTESFALPPDQPERPPQPQaPPPPQPQPQPPPPPQPQPPPP------PPPRPQPPLAPTTD 2948
                         410
                  ....*....|....
gi 767998433  627 AAPMAGSMPGMPSP 640
Cdd:PHA03247 2949 PAGAGEPSGAVPQP 2962
PHA03247 PHA03247
large tegument protein UL36; Provisional
246-492 2.16e-09

Pssm-ID: 223021 [Multi-domain]  Cd Length: 3151  Bit Score: 62.26  E-value: 2.16e-09
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 767998433  246 PKPAAhLAAPANGSAPSAPAQPPCFHLALPQNSP-SPAAGQPVTVAQGAPGSLTHSPPAAGQSHMTLVSSPLPVGQNSLT 324
Cdd:PHA03247 2758 ARPPT-TAGPPAPAPPAAPAAGPPRRLTRPAVASlSESRESLPSPWDPADPPAAVLAPAAALPPAASPAGPLPPPTSAQP 2836
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 767998433  325 LQPPAPqPVFLSHGVPLHQSVNP--PVLPLSQPVGPVNKSVGTSVLPINQTVRPGVLPLTQPVGpiNRPVGPGVLPVSPS 402
Cdd:PHA03247 2837 TAPPPP-PGPPPPSLPLGGSVAPggDVRRRPPSRSPAAKPAAPARPPVRRLARPAVSRSTESFA--LPPDQPERPPQPQA 2913
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 767998433  403 VTPGVLQAVSPGVLSVSRAVPSGVLPAGQMTP----AGQMTPAGVIPgqTATSGVLPTGQmVQSGVLPVGQTAPSRVLPP 478
Cdd:PHA03247 2914 PPPPQPQPQPPPPPQPQPPPPPPPRPQPPLAPttdpAGAGEPSGAVP--QPWLGALVPGR-VAVPRFRVPQPAPSREAPA 2990
                         250
                  ....*....|....
gi 767998433  479 GQTAPLRVISAGQV 492
Cdd:PHA03247 2991 SSTPPLTGHSLSRV 3004
PHA03247 PHA03247
large tegument protein UL36; Provisional
245-578 9.00e-09

Pssm-ID: 223021 [Multi-domain]  Cd Length: 3151  Bit Score: 59.95  E-value: 9.00e-09
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 767998433  245 APKPAAHLAAPANGSAPSAPAQPPcfhlALPQNSPSPAAGQPVTVAQGAPGS-LTHSPPAAGQSHMTLVSsPLPVGQNSL 323
Cdd:PHA03247 2688 ARPTVGSLTSLADPPPPPPTPEPA----PHALVSATPLPPGPAAARQASPALpAAPAPPAVPAGPATPGG-PARPARPPT 2762
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 767998433  324 TLQPPAPQPVFLSHGVPLHQSVNPPVLPLSQPVGPVNKSVGTSVLPINQTVRPGVLPLTQ-PVGPINRPvgPGVLPVSPS 402
Cdd:PHA03247 2763 TAGPPAPAPPAAPAAGPPRRLTRPAVASLSESRESLPSPWDPADPPAAVLAPAAALPPAAsPAGPLPPP--TSAQPTAPP 2840
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 767998433  403 VTPGVLQAvspgvlsvSRAVPSGVLPAGqmtPAGQMTPAGVIPGQTATSGVLPTGQMVQSgvlPVGQTAPSRVLPPGQTA 482
Cdd:PHA03247 2841 PPPGPPPP--------SLPLGGSVAPGG---DVRRRPPSRSPAAKPAAPARPPVRRLARP---AVSRSTESFALPPDQPE 2906
                         250       260       270       280       290       300       310       320
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 767998433  483 PLRVISAGQvvPSGLLSPNQTVSSSAVVPVNQGVNSGVLQ-----LSQPVVSGVLPVGQ--PVRPGVLQLNQTvgtnILP 555
Cdd:PHA03247 2907 RPPQPQAPP--PPQPQPQPPPPPQPQPPPPPPPRPQPPLApttdpAGAGEPSGAVPQPWlgALVPGRVAVPRF----RVP 2980
                         330       340
                  ....*....|....*....|...
gi 767998433  556 VNQPVRPGASQNTTFLTSGSILR 578
Cdd:PHA03247 2981 QPAPSREAPASSTPPLTGHSLSR 3003
PHA03379 PHA03379
EBNA-3A; Provisional
233-564 1.05e-07

Pssm-ID: 223066 [Multi-domain]  Cd Length: 935  Bit Score: 56.22  E-value: 1.05e-07
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 767998433  233 IKRTGllKQTHIAPKPAAHLAAPANGSaPSAPAQPPcfHLALPQNSPSPAAGQPVTVAQGAPGSLTHSPPAAGQSHMtlv 312
Cdd:PHA03379  391 LMRAG--KLTERAREALEKASEPTYGT-PRPPVEKP--RPEVPQSLETATSHGSAQVPEPPPVHDLEPGPLHDQHSM--- 462
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 767998433  313 sSPLPVGQNsltlqPPAP----QPVFLSHGVPlhQSVNPPVLPLSQPVGPVNKSVGTSVLPINQTVRPGVLPLTQPVGPI 388
Cdd:PHA03379  463 -APCPVAQL-----PPGPlqdlEPGDQLPGVV--QDGRPACAPVPAPAGPIVRPWEASLSQVPGVAFAPVMPQPMPVEPV 534
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 767998433  389 NRPVGPGVLPVSPSVTPGVLQAvsPGVLSVSRAV----------PSGVLPAGQMT---------PAGQMTPAGV------ 443
Cdd:PHA03379  535 PVPTVALERPVCPAPPLIAMQG--PGETSGIVRVrerwrpapwtPNPPRSPSQMSvrdrlarlrAEAQPYQASVevqppq 612
                         250       260       270       280       290       300       310       320
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 767998433  444 ---IPGQTATSGVL-PTGQM------------VQSGVLPVGQtAPSRVLPPGQ-------TAPLRViSAGQVVPSGLLSP 500
Cdd:PHA03379  613 ltqVSPQQPMEYPLePEQQMfpgspfsqvadvMRAGGVPAMQ-PQYFDLPLQQpisqgapLAPLRA-SMGPVPPVPATQP 690
                         330       340       350       360       370       380       390
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|...
gi 767998433  501 nQTVSSSAVVPVNQGVNSGVLQLSQPVVSGVLP---------VGQPVRPGVLQlNQTVGtniLPVNQPVRPGA 564
Cdd:PHA03379  691 -QYFDIPLTEPINQGASAAHFLPQQPMEGPLVPerwmfqgatLSQSVRPGVAQ-SQYFD---LPLTQPINHGA 758
SP1-4_arthropods_N cd22553
N-terminal domain of transcription factor Specificity Protein (SP) 1-4 from arthropods; ...
362-679 1.58e-07

N-terminal domain of transcription factor Specificity Protein (SP) 1-4 from arthropods; Specificity Proteins (SPs) are transcription factors that are involved in many cellular processes, including cell differentiation, cell growth, apoptosis, immune responses, response to DNA damage, and chromatin remodeling. There are many SPs in vertebrates (9 SPs in humans and mice, 7 SPs in the chicken, and 11 SPs in teleost fish), but arthropods only have 3 SPs. One of these is the SP1-4 clade, which is expressed ubiquitously throughout development. SP1-4 belongs to a family of proteins, called the SP/Kruppel or Krueppel-like Factor (KLF) family, characterized by a C-terminal DNA-binding domain of 81 amino acids consisting of three Kruppel-like C2H2 zinc fingers. These factors bind to a loose consensus motif, namely NNRCRCCYY (where N is any nucleotide; R is A/G, and Y is C/T), such as the recurring motifs in GC and GT boxes (5'-GGGGCGGGG-3' and 5'-GGTGTGGGG-3') that are present in promoters and more distal regulatory elements of mammalian genes. SP factors preferentially bind GC boxes, while KLFs bind CACCC boxes. Another characteristic hallmark of SP factors is the presence of the Buttonhead (BTD) box CXCPXC, just N-terminal to the zinc fingers. The function of the BTD box is unknown, but it is thought to play an important physiological role. Another feature of most SP factors is the presence of a conserved amino acid stretch, the so-called SP box, located close to the N-terminus. This model represents the N-terminal domain of SP1-4 from arthropods.


Pssm-ID: 411778 [Multi-domain]  Cd Length: 384  Bit Score: 55.03  E-value: 1.58e-07
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 767998433  362 SVGTSVLPINQTVRPGVLPLTQPVGPINRPVGPGVLPVSPSVTPGVLQAVSpgVLSVSRAVPSGVLPAGQMTPAG-QMTP 440
Cdd:cd22553    35 ETHDPLILSPPLSQPQQIITAQSSGSAAGGVAYSVSPAVQTVTVDGHEAIF--IPANSGLLQTNNQQAIQLAPGGtQAIL 112
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 767998433  441 AGvipGQTATSGVLPTGQMVQSGVLPV-GQTAPSRV---LPP---GQTAPLRV-ISA--GQVVPSGLLSPNQTVSSSAVV 510
Cdd:cd22553   113 AN---QQTLIRPNTVQGQANASNVLQNiAQIASGGNavqLPLnnmTQTIPVQVpVSTanGQTVYQTIQVPIQAIQSGNAG 189
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 767998433  511 PVNQGVNSGVL-QLSQPvvsgvlpvgqpvrpGVLQLNQTVGTNILPVNQPVRPGASQNTTFL------TSGSILRQLIPT 583
Cdd:cd22553   190 GGNQALQAQVIpQLAQA--------------AQLQPQQLAQVSSQGYIQQIPANASQQQPQMvqqgpnQSGQIIGQVASA 255
                         250       260       270       280       290       300       310       320
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 767998433  584 -GKQVNGIPTYTLApvSVTLPVPPGGLATVAP-PQMPIQLLPSGAAAPMAGSMPGMPSPPVLVNAAQSVFVQASSSAADT 661
Cdd:cd22553   256 sSIQAAAIPLTVYT--GALAGQNGSNQQQVGQiVTSPIQGMTQGLTAPASSSIPTVVQQQAIQGNPLPPGTQIIAAGQQL 333
                         330       340       350
                  ....*....|....*....|....*....|....*.
gi 767998433  662 NQVLKQAKQWK------------------TCPVCNE 679
Cdd:cd22553   334 QQDPNDPTKWQvvadgtpgskkrlrrvacTCPNCRD 369
Atrophin-1 pfam03154
Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian ...
279-642 1.91e-07

Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian atrophy (DRPLA) gene. DRPLA (OMIM:125370) is a progressive neurodegenerative disorder. It is caused by the expansion of a CAG repeat in the DRPLA gene on chromosome 12p. This results in an extended polyglutamine region in atrophin-1 that is thought to confer toxicity to the protein, possibly through altering its interactions with other proteins. The expansion of a CAG repeat is also the underlying defect in six other neurodegenerative disorders, including Huntington's disease. One interaction of expanded polyglutamine repeats that is thought to be pathogenic is that with the short glutamine repeat in the transcriptional coactivator CREB binding protein, CBP. This interaction draws CBP away from its usual nuclear location to the expanded polyglutamine repeat protein aggregates that are characteristic of the polyglutamine neurodegenerative disorders. This interferes with CBP-mediated transcription and causes cytotoxicity.


Pssm-ID: 460830 [Multi-domain]  Cd Length: 991  Bit Score: 55.54  E-value: 1.91e-07
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 767998433   279 PSPAAGQPVTVAQGAPGSLTHSPPAAGQSHMTLVSSPLPVGQNSlTLQPPAPQPVFLSHGVPLHQSVNPPVLPLSQPVGP 358
Cdd:pfam03154  149 PSPQDNESDSDSSAQQQILQTQPPVLQAQSGAASPPSPPPPGTT-QAATAGPTPSAPSVPPQGSPATSQPPNQTQSTAAP 227
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 767998433   359 VNKSVGTSVLPINQ--TVRPGVLPLTQPVGPINRPVGPGVLPVSPSVTPGVLQAVSPGVLSVSRAVPSGVLPAGQMTPAG 436
Cdd:pfam03154  228 HTLIQQTPTLHPQRlpSPHPPLQPMTQPPPPSQVSPQPLPQPSLHGQMPPMPHSLQTGPSHMQHPVPPQPFPLTPQSSQS 307
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 767998433   437 QM--TPAGVIPGQTATSGVLPTGQ-MVQSGVLPVGQTAPSRVLP-----PGQTAPLRVISAGQ--------VVPSGLLSP 500
Cdd:pfam03154  308 QVppGPSPAAPGQSQQRIHTPPSQsQLQSQQPPREQPLPPAPLSmphikPPPTTPIPQLPNPQshkhpphlSGPSPFQMN 387
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 767998433   501 NQTVSSSAVVPVNQGVNSGVLQLSQPVVSgVLPVGQPVRPGVLQlnqtvgTNILPVNQPVRPGASQNTTFLTSGSILRQl 580
Cdd:pfam03154  388 SNLPPPPALKPLSSLSTHHPPSAHPPPLQ-LMPQSQQLPPPPAQ------PPVLTQSQSLPPPAASHPPTSGLHQVPSQ- 459
                          330       340       350       360       370       380
                   ....*....|....*....|....*....|....*....|....*....|....*....|..
gi 767998433   581 iptgkqvNGIPTYTLAPVSVTLPVPPGGLATVAPPQMPIQLLPSGAAAPMAGSMPGMPSPPV 642
Cdd:pfam03154  460 -------SPFPQHPFVPGGPPPITPPSGPPTSTSSAMPGIQPPSSASVSSSGPVPAAVSCPL 514
HOX smart00389
Homeodomain; DNA-binding factors that are involved in the transcriptional regulation of key ...
1021-1074 1.58e-06

Homeodomain; DNA-binding factors that are involved in the transcriptional regulation of key developmental processes


Pssm-ID: 197696 [Multi-domain]  Cd Length: 57  Bit Score: 46.09  E-value: 1.58e-06
                            10        20        30        40        50
                    ....*....|....*....|....*....|....*....|....*....|....
gi 767998433   1021 PKKYEGRSYEEKKQFLKDYFHKKPYPSKKEIELLSSLFWVWKIDVASFFGKRRY 1074
Cdd:smart00389    1 KRRKRTSFTPEQLEELEKEFQKNPYPSREEREELAKKLGLSERQVKVWFQNRRA 54
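The paired rows in each alignment block can be compared column by column. Below is a minimal Python sketch that counts identical residues between a query row and a domain-model row, using the two homeodomain rows shown above. It assumes that `-` marks a gap and that lowercase letters mark residues outside the aligned region, which is how the blocks on this page appear to be laid out; the function name `aligned_identity` is hypothetical.

```python
def aligned_identity(query_row, subject_row):
    """Count identical residues over columns where both rows hold an aligned residue.

    Assumptions (inferred from the listing, not from an NCBI specification):
    '-' is a gap, and lowercase letters are unaligned residues to be skipped.
    """
    if len(query_row) != len(subject_row):
        raise ValueError("alignment rows must have equal length")
    aligned = 0
    identical = 0
    for q, s in zip(query_row, subject_row):
        # Gaps and lowercase letters both fail isupper(), so they are skipped.
        if not (q.isupper() and s.isupper()):
            continue
        aligned += 1
        if q == s:
            identical += 1
    return identical, aligned


# Rows copied from the smart00389 (HOX) alignment above.
query = "PKKYEGRSYEEKKQFLKDYFHKKPYPSKKEIELLSSLFWVWKIDVASFFGKRRY"
subject = "KRRKRTSFTPEQLEELEKEFQKNPYPSREEREELAKKLGLSERQVKVWFQNRRA"
identical, aligned = aligned_identity(query, subject)
print(f"{identical} identical residues over {aligned} aligned columns")
```

For alignments that span several 80-column blocks, the rows of each block would need to be concatenated before calling the function.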
Atrophin-1 pfam03154
Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian ...
241-658 1.65e-06

Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian atrophy (DRPLA) gene. DRPLA (OMIM:125370) is a progressive neurodegenerative disorder. It is caused by the expansion of a CAG repeat in the DRPLA gene on chromosome 12p. This results in an extended polyglutamine region in atrophin-1 that is thought to confer toxicity to the protein, possibly through altering its interactions with other proteins. The expansion of a CAG repeat is also the underlying defect in six other neurodegenerative disorders, including Huntington's disease. One interaction of expanded polyglutamine repeats that is thought to be pathogenic is that with the short glutamine repeat in the transcriptional coactivator CREB binding protein, CBP. This interaction draws CBP away from its usual nuclear location to the expanded polyglutamine repeat protein aggregates that are characteristic of the polyglutamine neurodegenerative disorders. This interferes with CBP-mediated transcription and causes cytotoxicity.


Pssm-ID: 460830 [Multi-domain]  Cd Length: 991  Bit Score: 52.46  E-value: 1.65e-06
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 767998433   241 QTHIAPKPAAHLAAPANGSAPSaPAQPPCFHLALPQNSPSPAAGQPVTVAQGAPGSLTHSPPaagQSHMTLVSSPLPVGQ 320
Cdd:pfam03154  175 QAQSGAASPPSPPPPGTTQAAT-AGPTPSAPSVPPQGSPATSQPPNQTQSTAAPHTLIQQTP---TLHPQRLPSPHPPLQ 250
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 767998433   321 nSLTLQPPAPQPVFLSHGVPLHQSVNPPvLPLSQPVGP--VNKSVGTSVLPI-NQTVRPGVLPLTQPVGPI---NRPVGP 394
Cdd:pfam03154  251 -PMTQPPPPSQVSPQPLPQPSLHGQMPP-MPHSLQTGPshMQHPVPPQPFPLtPQSSQSQVPPGPSPAAPGqsqQRIHTP 328
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 767998433   395 GVLPVSPSVTPGVLQAVSPGVLSVSRAVPSGVLPAGQM-TPAGQMTPAGVipgqtatSGVLPTgQMvqsgvlpvgqtaPS 473
Cdd:pfam03154  329 PSQSQLQSQQPPREQPLPPAPLSMPHIKPPPTTPIPQLpNPQSHKHPPHL-------SGPSPF-QM------------NS 388
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 767998433   474 RVLPPGQTAPLRVISAGQVvPSGLLSPnqtvsssavvpvnqgvnsgvLQLsqpvvsgvLPVGQPVRPgvlqlnqtvgtni 553
Cdd:pfam03154  389 NLPPPPALKPLSSLSTHHP-PSAHPPP--------------------LQL--------MPQSQQLPP------------- 426
                          330       340       350       360       370       380       390       400
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 767998433   554 lPVNQPvrPGASQNTTFLTSGSilrqliptgkqvNGIPTYTLAPVSVTLPVPPGGLATVAPPQMpiqLLPSGAAAPMAGS 633
Cdd:pfam03154  427 -PPAQP--PVLTQSQSLPPPAA------------SHPPTSGLHQVPSQSPFPQHPFVPGGPPPI---TPPSGPPTSTSSA 488
                          410       420
                   ....*....|....*....|....*
gi 767998433   634 MPGMpSPPVLVNAAQSVFVQASSSA 658
Cdd:pfam03154  489 MPGI-QPPSSASVSSSGPVPAAVSC 512
PHA03247 PHA03247
large tegument protein UL36; Provisional
255-659 2.87e-06

Pssm-ID: 223021 [Multi-domain]  Cd Length: 3151  Bit Score: 51.86  E-value: 2.87e-06
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 767998433  255 PANGSAPSAPAQPPCfhlalPQNSPSPAAGQPVTVAQGAPGSLTHSPPA-AGQSHMTLVSSPLPVGQNSLTLQPPAPQPV 333
Cdd:PHA03247 2483 PAEARFPFAAGAAPD-----PGGGGPPDPDAPPAPSRLAPAILPDEPVGePVHPRMLTWIRGLEELASDDAGDPPPPLPP 2557
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 767998433  334 FLSHGVPlHQSVnPPVLPLSQPVGPVNKSvgtsvlpinQTVRPGVLPltQPvgpiNRPVGPGVLPVSPsvtpgvlqavsP 413
Cdd:PHA03247 2558 AAPPAAP-DRSV-PPPRPAPRPSEPAVTS---------RARRPDAPP--QS----ARPRAPVDDRGDP-----------R 2609
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 767998433  414 GVLSVSRAVPSGVLPAgqmTPAGQMTPAGVIPGQTATSGVLPTGQmvqsgvlPVGQTAPSRVLPPgqtapLRVISAGQvv 493
Cdd:PHA03247 2610 GPAPPSPLPPDTHAPD---PPPPSPSPAANEPDPHPPPTVPPPER-------PRDDPAPGRVSRP-----RRARRLGR-- 2672
                         250       260       270       280       290       300       310       320
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 767998433  494 PSGLLSPNQTVSSSAVVPVNQGVNSgvlqLSQPVVSGVLPVGQPvRPGVLQLNQTVGTNILPVNQPVRPGASqnttfLTS 573
Cdd:PHA03247 2673 AAQASSPPQRPRRRAARPTVGSLTS----LADPPPPPPTPEPAP-HALVSATPLPPGPAAARQASPALPAAP-----APP 2742
                         330       340       350       360       370       380       390       400
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 767998433  574 GSILRQLIPTGKQVNGIPTYTLAPVSvtlPVPPGGLATVAPPQMPIQLLPSGAAAPMAGSMPGMPSPPVLVNAAQSVFVQ 653
Cdd:PHA03247 2743 AVPAGPATPGGPARPARPPTTAGPPA---PAPPAAPAAGPPRRLTRPAVASLSESRESLPSPWDPADPPAAVLAPAAALP 2819

                  ....*.
gi 767998433  654 ASSSAA 659
Cdd:PHA03247 2820 PAASPA 2825
Soli_cterm TIGR03437
Solibacter uncharacterized C-terminal domain; This model describes a protein domain found in ...
443-645 4.34e-06

Solibacter uncharacterized C-terminal domain; This model describes a protein domain found in 90 proteins of Solibacter usitatus Ellin6076, nearly always as the C-terminal domain of a much larger protein. No homologs to this domain are detected outside of S. usitatus, a member of the Acidobacteria.


Pssm-ID: 274578 [Multi-domain]  Cd Length: 215  Bit Score: 48.81  E-value: 4.34e-06
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 767998433   443 VIPGQTAT---SGVLPTGQMVQSGVLPVgQTAPSRVLPPGQTAPLRVISAGQV---VPSGLLSPNQTVsssaVVpVNQGV 516
Cdd:TIGR03437    2 VAPGSIVSifgTNLAPATLTAAGGPLPT-SLGGVSVTVNGVAAPLLYVSPGQInaqVPYEVAPGAATV----TV-TYNGG 75
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 767998433   517 NSGVLQLS-QPVVSGVLPVGQ-PVRPGVLQLNQtvGTNILPVNQPVRPGaSQNTTFLTSGSILRQLIPTGKQVNGIPTY- 593
Cdd:TIGR03437   76 ASAAVTVTvAAAAPGIFTLDGsGTGQAAALNNQ--DGSVNSAANPAAPG-DVVVLYATGLGPTSPAVADGAPAPSSPLAp 152
                          170       180       190       200       210       220
                   ....*....|....*....|....*....|....*....|....*....|....*....|.
gi 767998433   594 TLAPVSVTL-----PVPPGGLATVAPPQMPIQL-LPSGAAA---PMAGSMPGMPSPPVLVN 645
Cdd:TIGR03437  153 ALAPVTVTIggvpaTVLYAGLAPGFVGLYQVNVrVPAGLATgavPVVITVGGVTSNAVTIA 213
homeodomain cd00086
Homeodomain; DNA binding domains involved in the transcriptional regulation of key eukaryotic ...
1022-1078 7.71e-06

Homeodomain; DNA binding domains involved in the transcriptional regulation of key eukaryotic developmental processes; may bind to DNA as monomers or as homo- and/or heterodimers, in a sequence-specific manner.


Pssm-ID: 238039 [Multi-domain]  Cd Length: 59  Bit Score: 44.16  E-value: 7.71e-06
                          10        20        30        40        50
                  ....*....|....*....|....*....|....*....|....*....|....*..
gi 767998433 1022 KKYEGRSYEEKKQFLKDYFHKKPYPSKKEIELLSSLFWVWKIDVASFFGKRRYICMK 1078
Cdd:cd00086     1 RRKRTRFTPEQLEELEKEFEKNPYPSREEREELAKELGLTERQVKIWFQNRRAKLKR 57
DUF4813 pfam16072
Domain of unknown function (DUF4813); This family of proteins is functionally uncharacterized. ...
423-659 1.13e-05

Domain of unknown function (DUF4813); This family of functionally uncharacterized proteins is found in eukaryotes. Proteins in this family are typically between 345 and 672 amino acids in length.


Pssm-ID: 435117 [Multi-domain]  Cd Length: 288  Bit Score: 48.60  E-value: 1.13e-05
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 767998433   423 PSGVLPAGqmtpaGQMTPAGVIPGqtaTSGVLPTGqmvqsGVlPVGQT----APSRVLPPGQTaplrVISAGQVVPSGLL 498
Cdd:pfam16072   13 PGGYAPAG-----ATYHPAGQVPA---GATYYPSG-----GV-PHGATyypqAPVAAVPAGAT----YLPAGAAIPAGAT 74
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 767998433   499 SPNQTVSSSAVVPVNQGVNSG---------VLQLSQPVVSGVLPVGQPVRPGVLQLNQTVGTNILPVNQPvrPGASQNTT 569
Cdd:pfam16072   75 YYPQAPKSSSGLGLGTGLIAGalggailghALTPTQTRVVEHAPSSGGGGGGGGYSNGNNEDKIIIINNG--PPGSVTTT 152
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 767998433   570 FLTSGSilrQLIPTGKQVNGiptytlAPVSVTLPVPPGGLATVAPPQMPIQlLPSGAAAPMAGSMPGMPSPPVLVNAAQS 649
Cdd:pfam16072  153 SAGSGT---TVINAGGQQPA------APAAPAYPVAPAAYPAQAPAAAPAP-APGAPQTPLAPLNPVAAAPAAAAGAAAA 222
                          250
                   ....*....|
gi 767998433   650 VFVQASSSAA 659
Cdd:pfam16072  223 PVVAAAAPAA 232
PHA03247 PHA03247
large tegument protein UL36; Provisional
244-412 1.44e-05

Pssm-ID: 223021 [Multi-domain]  Cd Length: 3151  Bit Score: 49.55  E-value: 1.44e-05
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 767998433  244 IAPKPAAHLAAPANGSAPSAPAQPPCFHLA--------LPQNSP--SPAAGQPVTVAQGAPGSLTHSPPAAGQSHMTLVS 313
Cdd:PHA03247 2828 LPPPTSAQPTAPPPPPGPPPPSLPLGGSVApggdvrrrPPSRSPaaKPAAPARPPVRRLARPAVSRSTESFALPPDQPER 2907
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 767998433  314 SPLPVGQNSLTLQPPAPQPVFLSHGVPLHQSVNPPVLPLSQPvGPVNKSVGTSVLPINQTVRPGVLPLTQPVGPINRPVG 393
Cdd:PHA03247 2908 PPQPQAPPPPQPQPQPPPPPQPQPPPPPPPRPQPPLAPTTDP-AGAGEPSGAVPQPWLGALVPGRVAVPRFRVPQPAPSR 2986
                         170
                  ....*....|....*....
gi 767998433  394 PGVLPVSPSVTPGVLQAVS 412
Cdd:PHA03247 2987 EAPASSTPPLTGHSLSRVS 3005
SP1-4_arthropods_N cd22553
N-terminal domain of transcription factor Specificity Protein (SP) 1-4 from arthropods; ...
238-579 1.57e-05

N-terminal domain of transcription factor Specificity Protein (SP) 1-4 from arthropods; Specificity Proteins (SPs) are transcription factors that are involved in many cellular processes, including cell differentiation, cell growth, apoptosis, immune responses, response to DNA damage, and chromatin remodeling. There are many SPs in vertebrates (9 SPs in humans and mice, 7 SPs in the chicken, and 11 SPs in teleost fish), but arthropods only have 3 SPs. One SP is clade SP1-4, which is expressed ubiquitously throughout development. SP1-4 belongs to a family of proteins, called the SP/Kruppel or Krueppel-like Factor (KLF) family, characterized by a C-terminal DNA-binding domain of 81 amino acids consisting of three Kruppel-like C2H2 zinc fingers. These factors bind to a loose consensus motif, namely NNRCRCCYY (where N is any nucleotide; R is A/G, and Y is C/T), such as the recurring motifs in GC and GT boxes (5'-GGGGCGGGG-3' and 5'-GGTGTGGGG-3') that are present in promoters and more distal regulatory elements of mammalian genes. SP factors preferentially bind GC boxes, while KLFs bind CACCC boxes. Another characteristic hallmark of SP factors is the presence of the Buttonhead (BTD) box CXCPXC, just N-terminal to the zinc fingers. The function of the BTD box is unknown, but it is thought to play an important physiological role. Another feature of most SP factors is the presence of a conserved amino acid stretch, the so-called SP box, located close to the N-terminus. This model represents the N-terminal domain of SP1-4 from arthropods.
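As an illustration of the consensus notation used in the description above, the following minimal sketch expands the degenerate motif NNRCRCCYY into a regular expression and scans a DNA string for it. The function names and the test sequence are hypothetical and are not part of the CDD record; only the three degenerate codes defined in the description (N, R, Y) are handled, and only the given strand is scanned.

```python
import re

# Degenerate codes as defined in the description: N = any nucleotide, R = A/G, Y = C/T.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "N": "[ACGT]",
}


def consensus_to_regex(consensus: str) -> str:
    """Translate a consensus string such as NNRCRCCYY into a regex pattern."""
    return "".join(IUPAC[base] for base in consensus.upper())


def find_sites(consensus: str, dna: str):
    """Return (start, matched_site) pairs, including overlapping matches."""
    # A lookahead is used so that overlapping sites are all reported.
    pattern = re.compile("(?=(" + consensus_to_regex(consensus) + "))")
    return [(m.start(), m.group(1)) for m in pattern.finditer(dna.upper())]


if __name__ == "__main__":
    # Made-up sequence for illustration only.
    print(consensus_to_regex("NNRCRCCYY"))
    print(find_sites("NNRCRCCYY", "TTATGCGCCCTGG"))
```

A real binding-site search would normally also scan the reverse complement and use a position weight matrix rather than a strict consensus; this sketch only shows how the notation reads.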


Pssm-ID: 411778 [Multi-domain]  Cd Length: 384  Bit Score: 48.48  E-value: 1.57e-05
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 767998433  238 LLKQTHIAPKPAAHLAAPANGSAPSAPAQPPCFHLALPQNSPSPAAGQP--VTVAQ---GAPGSLTHSPPAAGQShMTL- 311
Cdd:cd22553     1 FNQSQQVAPSELAQVATTASNIGGQQKQAQSDSSETHDPLILSPPLSQPqqIITAQssgSAAGGVAYSVSPAVQT-VTVd 79
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 767998433  312 ----VSSPLPVGQNSLTLQPPAPQPVFLSHGVPLHQSVnppvlpLSQPVGPVNKSVGTSVLPINQTVRPGVLPLTQPVGP 387
Cdd:cd22553    80 gheaIFIPANSGLLQTNNQQAIQLAPGGTQAILANQQT------LIRPNTVQGQANASNVLQNIAQIASGGNAVQLPLNN 153
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 767998433  388 INRPVgPGVLPVSPSVTPGVLQAVSPGVLSVSRAVPSGVLPAGQMTPAGQMTPAGVI-PGQTATsgvlPTGQMVQSGVLP 466
Cdd:cd22553   154 MTQTI-PVQVPVSTANGQTVYQTIQVPIQAIQSGNAGGGNQALQAQVIPQLAQAAQLqPQQLAQ----VSSQGYIQQIPA 228
                         250       260       270       280       290       300       310       320
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 767998433  467 VGQTAPSRVLPPGQTaplrviSAGQVVPSGL-LSPNQTVSSSAVVPVNQGVNSGVLQLSQPVVSGVLPVGQPVRPGVLQL 545
Cdd:cd22553   229 NASQQQPQMVQQGPN------QSGQIIGQVAsASSIQAAAIPLTVYTGALAGQNGSNQQQVGQIVTSPIQGMTQGLTAPA 302
                         330       340       350
                  ....*....|....*....|....*....|....
gi 767998433  546 NQTVGTNILPvNQPVRPGASQNTTFLTSGSILRQ 579
Cdd:cd22553   303 SSSIPTVVQQ-QAIQGNPLPPGTQIIAAGQQLQQ 335
PPE COG5651
PPE-repeat protein [Function unknown];
340-542 3.17e-05

PPE-repeat protein [Function unknown];


Pssm-ID: 444372 [Multi-domain]  Cd Length: 385  Bit Score: 47.58  E-value: 3.17e-05
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 767998433  340 PLHQSVNPPVLPLSQPVGPVNKSVGTSV--LPINQTVRPGVLPLTQPVGPINRPVGPGVLPVSPSVTPGVLQAVSPGVLS 417
Cdd:COG5651   170 PPPTITNPGGLLGAQNAGSGNTSSNPGFanLGLTGLNQVGIGGLNSGSGPIGLNSGPGNTGFAGTGAAAGAAAAAAAAAA 249
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 767998433  418 VSRAVPSGVLPAGQMTPAGQMTPAGVIPGQTATSGVLPTGQMVQSGVLPVGQTAPSRVLPPGQTAPLRVISAGqVVPSGL 497
Cdd:COG5651   250 AAGAGASAALASLAATLLNASSLGLAATAASSAATNLGLAGSPLGLAGGGAGAAAATGLGLGAGGAAGAAGAT-GAGAAL 328
                         170       180       190       200
                  ....*....|....*....|....*....|....*....|....*
gi 767998433  498 LSPNQTVSSSAVVPVNQGVNSGVLQLSQPVVSGVLPVGQPVRPGV 542
Cdd:COG5651   329 GAGAAAAAAGAAAGAGAAAAAAAGGAGGGGGGALGAGGGGGSAGA 373
PHA02682 PHA02682
ORF080 virion core protein; Provisional
246-353 4.72e-05

ORF080 virion core protein; Provisional


Pssm-ID: 177464 [Multi-domain]  Cd Length: 280  Bit Score: 46.39  E-value: 4.72e-05
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 767998433  246 PKPAAHLAAPANGSAPSAPAQPPCFHLALPQNSPSPAAGQPvtvAQGAPGSLTHSPPAagqshmtlvsSPLPvgqnslTL 325
Cdd:PHA02682   96 PACAPAAPAPAVTCPAPAPACPPATAPTCPPPAVCPAPARP---APACPPSTRQCPPA----------PPLP------TP 156
                          90       100
                  ....*....|....*....|....*....
gi 767998433  326 QP-PAPQPVFlshgvpLHQSVNPPVLPLS 353
Cdd:PHA02682  157 KPaPAAKPIF------LHNQLPPPDYPAA 179
PRK10263 PRK10263
DNA translocase FtsK; Provisional
246-478 9.91e-05

DNA translocase FtsK; Provisional


Pssm-ID: 236669 [Multi-domain]  Cd Length: 1355  Bit Score: 46.62  E-value: 9.91e-05
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 767998433  246 PKPAAHLAAPAngSAPSAPAQPPCFHLALPQNSPSPAAGQPVTVAQGAPGSLTHSPPAAGQ---SHMTLVSSPLPVGQNS 322
Cdd:PRK10263  362 PVPGPQTGEPV--IAPAPEGYPQQSQYAQPAVQYNEPLQQPVQPQQPYYAPAAEQPAQQPYyapAPEQPAQQPYYAPAPE 439
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 767998433  323 LTL-----QPPAPQPVFLSHgvPLHQSVNPPVLPLSQPVG-----PVNKSVGTSVLPINQTVRPGVLPL----------- 381
Cdd:PRK10263  440 QPVagnawQAEEQQSTFAPQ--STYQTEQTYQQPAAQEPLyqqpqPVEQQPVVEPEPVVEETKPARPPLyyfeeveekra 517
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 767998433  382 ---------TQPV-GPI--NRPVGPGVLPVSPSVTPGVLQAvsPGVLSVSRAVPSGVLPAGqmTPAGQMTPAGvipgqTA 449
Cdd:PRK10263  518 rereqlaawYQPIpEPVkePEPIKSSLKAPSVAAVPPVEAA--AAVSPLASGVKKATLATG--AAATVAAPVF-----SL 588
                         250       260
                  ....*....|....*....|....*....
gi 767998433  450 TSGVLPTGQmVQSGVLPvGQTAPSRVLPP 478
Cdd:PRK10263  589 ANSGGPRPQ-VKEGIGP-QLPRPKRIRVP 615
Atrophin-1 pfam03154
Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian ...
242-412 2.64e-04

Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian atrophy (DRPLA) gene. DRPLA (OMIM:125370) is a progressive neurodegenerative disorder. It is caused by the expansion of a CAG repeat in the DRPLA gene on chromosome 12p. This results in an extended polyglutamine region in atrophin-1 that is thought to confer toxicity to the protein, possibly through altering its interactions with other proteins. The expansion of a CAG repeat is also the underlying defect in six other neurodegenerative disorders, including Huntington's disease. One interaction of expanded polyglutamine repeats that is thought to be pathogenic is that with the short glutamine repeat in the transcriptional coactivator CREB binding protein, CBP. This interaction draws CBP away from its usual nuclear location to the expanded polyglutamine repeat protein aggregates that are characteristic of the polyglutamine neurodegenerative disorders. This interferes with CBP-mediated transcription and causes cytotoxicity.


Pssm-ID: 460830 [Multi-domain]  Cd Length: 991  Bit Score: 45.14  E-value: 2.64e-04
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 767998433   242 THIAPKPAAH-LAAPANGSAPSApaQPPCFHLaLPQNS--PSPAAgQPVTVAQgapgSLTHSPPAAgqshmtlvSSPLPV 318
Cdd:pfam03154  388 SNLPPPPALKpLSSLSTHHPPSA--HPPPLQL-MPQSQqlPPPPA-QPPVLTQ----SQSLPPPAA--------SHPPTS 451
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 767998433   319 GQNSLTLQPPAPQPVFLSHGVPL------HQSVNPPVLPLSQPVGPVNKSVGTSVLPINQTVRPGVLPLTQPVGPINRPV 392
Cdd:pfam03154  452 GLHQVPSQSPFPQHPFVPGGPPPitppsgPPTSTSSAMPGIQPPSSASVSSSGPVPAAVSCPLPPVQIKEEALDEAEEPE 531
                          170       180
                   ....*....|....*....|
gi 767998433   393 GPGVLPVSPSVTPGVLQAVS 412
Cdd:pfam03154  532 SPPPPPRSPSPEPTVVNTPS 551
PAT1 pfam09770
Topoisomerase II-associated protein PAT1; Members of this family are necessary for accurate ...
245-391 2.66e-04

Topoisomerase II-associated protein PAT1; Members of this family are necessary for accurate chromosome transmission during cell division.


Pssm-ID: 401645 [Multi-domain]  Cd Length: 846  Bit Score: 45.03  E-value: 2.66e-04
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 767998433   245 APKPAAhlaaPANGSAPSAPAQPPCFHLALPQNSPSPAAGQ--PVTVAQGAPGSLTHSPPAA----GQSHMTLVSSPLPV 318
Cdd:pfam09770  207 AKKPAQ----QPAPAPAQPPAAPPAQQAQQQQQFPPQIQQQqqPQQQPQQPQQHPGQGHPVTilqrPQSPQPDPAQPSIQ 282
                           90       100       110       120       130       140       150
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....
gi 767998433   319 GQNSLTLQPPAPQPVflshgVPLHQSVNPPVLPLSQPVGPVNKSVGTSVLPINQTVR-PGVLPltQPVGPINRP 391
Cdd:pfam09770  283 PQAQQFHQQPPPVPV-----QPTQILQNPNRLSAARVGYPQNPQPGVQPAPAHQAHRqQGSFG--RQAPIITHP 349
PHA03247 PHA03247
large tegument protein UL36; Provisional
248-643 3.94e-04

large tegument protein UL36; Provisional


Pssm-ID: 223021 [Multi-domain]  Cd Length: 3151  Bit Score: 44.93  E-value: 3.94e-04
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 767998433  248 PAAHLAAP-ANGSAPSAPAQPPCFHLALPQNSPSPAAGQPVTV----------------AQGAPGSLTHS--PPAAGQSH 308
Cdd:PHA03247 2489 PFAAGAAPdPGGGGPPDPDAPPAPSRLAPAILPDEPVGEPVHPrmltwirgleelasddAGDPPPPLPPAapPAAPDRSV 2568
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 767998433  309 MTLVSSPLPVGqnsltlqpPAPQPVFLSHGVPLHQsvNPPVLPLSQPVGPVNKSVGTSVLPINQTVRPgvlPLTQPVGPI 388
Cdd:PHA03247 2569 PPPRPAPRPSE--------PAVTSRARRPDAPPQS--ARPRAPVDDRGDPRGPAPPSPLPPDTHAPDP---PPPSPSPAA 2635
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 767998433  389 NRPVGPGVLPVSPSVTPGvlQAVSPGVLSVSRAVPSGVLPAGQMTPAGQMTPAGVIPGqtatsgVLPTGQMVQSGVLPVG 468
Cdd:PHA03247 2636 NEPDPHPPPTVPPPERPR--DDPAPGRVSRPRRARRLGRAAQASSPPQRPRRRAARPT------VGSLTSLADPPPPPPT 2707
                         250       260       270       280       290       300       310       320
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 767998433  469 QTAPSRVLPPGQTAPLRVISAGQVVPSGLLSPnqtvsssavvpvnqgvnsgvlqLSQPVVSGVLPVGQPVRPGVLQLNQT 548
Cdd:PHA03247 2708 PEPAPHALVSATPLPPGPAAARQASPALPAAP----------------------APPAVPAGPATPGGPARPARPPTTAG 2765
                         330       340       350       360       370       380       390       400
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 767998433  549 VGTNILPVNQPVRPGASQNTTFLTSGSILRQLIPTGKQVngiptytlAPVSVTLPVPPGGLATVAPPQMPiqLLPSGAAA 628
Cdd:PHA03247 2766 PPAPAPPAAPAAGPPRRLTRPAVASLSESRESLPSPWDP--------ADPPAAVLAPAAALPPAASPAGP--LPPPTSAQ 2835
                         410
                  ....*....|....*
gi 767998433  629 PMAGSMPGMPSPPVL 643
Cdd:PHA03247 2836 PTAPPPPPGPPPPSL 2850
PRK12323 PRK12323
DNA polymerase III subunit gamma/tau;
241-419 7.34e-04

DNA polymerase III subunit gamma/tau;


Pssm-ID: 237057 [Multi-domain]  Cd Length: 700  Bit Score: 43.71  E-value: 7.34e-04
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 767998433  241 QTHIAPKPAAHLAAPANGSAPSAPAQPPCFHLALPqnSPSPAAGQPVTVAQGAPGSLTHSPPAAgqshmtlvSSPLPVGQ 320
Cdd:PRK12323  392 PAAAAPAPAAPPAAPAAAPAAAAAARAVAAAPARR--SPAPEALAAARQASARGPGGAPAPAPA--------PAAAPAAA 461
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 767998433  321 NSLTLQPPAPQPVFLSHGVPLHQSVNPPV-----------LPLSQPV-GPVNKSVGTSVLPINQTVRPGVLPL-----TQ 383
Cdd:PRK12323  462 ARPAAAGPRPVAAAAAAAPARAAPAAAPApadddpppweeLPPEFASpAPAQPDAAPAGWVAESIPDPATADPddafeTL 541
                         170       180       190
                  ....*....|....*....|....*....|....*.
gi 767998433  384 PVGPINRPVGPGVLPVSPSVTPGVLQAVSPGVLSVS 419
Cdd:PRK12323  542 APAPAAAPAPRAAAATEPVVAPRPPRASASGLPDMF 577
Med15 pfam09606
ARC105 or Med15 subunit of Mediator complex non-fungal; The approx. 70 residue Med15 domain of ...
291-689 8.68e-04

ARC105 or Med15 subunit of Mediator complex non-fungal; The approx. 70 residue Med15 domain of the ARC-Mediator co-activator is a three-helix bundle with marked similarity to the KIX domain. The sterol regulatory element binding protein (SREBP) family of transcription activators use the ARC105 subunit to activate target genes in the regulation of cholesterol and fatty acid homeostasis. In addition, Med15 is a critical transducer of gene activation signals that control early metazoan development.


Pssm-ID: 312941 [Multi-domain]  Cd Length: 732  Bit Score: 43.46  E-value: 8.68e-04
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 767998433   291 QGAPGSLTHSPPAAGQSHMtlVSSPLPVGQN--SLTLQPPAPQPVFLSHGVPLHQSVNPPVL--PLSQPVGPVNKSVGTS 366
Cdd:pfam09606   60 QQQPQGGQGNGGMGGGQQG--MPDPINALQNlaGQGTRPQMMGPMGPGPGGPMGQQMGGPGTasNLLASLGRPQMPMGGA 137
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 767998433   367 VLPINQTvrpGVLPLTQPVGpinrpVGPGVLPVSPSVTPGVLQAvspgvlsvsravpsgvlPAGQMTPAGQMTPaGVIPG 446
Cdd:pfam09606  138 GFPSQMS---RVGRMQPGGQ-----AGGMMQPSSGQPGSGTPNQ-----------------MGPNGGPGQGQAG-GMNGG 191
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 767998433   447 QTATSGVLPTGQMVQSGVL-------PVGQTAPSRVLPP---GQTAPLRVISAGQVVPSGllsPNQTVSSSAVVPVNQgV 516
Cdd:pfam09606  192 QQGPMGGQMPPQMGVPGMPgpadagaQMGQQAQANGGMNpqqMGGAPNQVAMQQQQPQQQ---GQQSQLGMGINQMQQ-M 267
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 767998433   517 NSGVLQLSQPVVSGVLPVGQPVRPGVLQLNQTVGTNILPVNQPVRPgasqnttfltsgsilRQlipTGKQVNGIPTytlA 596
Cdd:pfam09606  268 PQGVGGGAGQGGPGQPMGPPGQQPGAMPNVMSIGDQNNYQQQQTRQ---------------QQ---QQQGGNHPAA---H 326
                          330       340       350       360       370       380       390       400
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 767998433   597 PVSVTLPVPPGGLATVAPPQMPIQLLPSGA-----AAPMAGSMPGMPSPPVLVNAAQSVFVQasssaadTNQVLKQAKQw 671
Cdd:pfam09606  327 QQQMNQSVGQGGQVVALGGLNHLETWNPGNfgglgANPMQRGQPGMMSSPSPVPGQQVRQVT-------PNQFMRQSPQ- 398
                          410
                   ....*....|....*...
gi 767998433   672 ktcpvcnelfPSNVYQVH 689
Cdd:pfam09606  399 ----------PSVPSPQG 406
SP4_N cd22536
N-terminal domain of transcription factor Specificity Protein (SP) 4; Specificity Proteins ...
255-648 1.13e-03

N-terminal domain of transcription factor Specificity Protein (SP) 4; Specificity Proteins (SPs) are transcription factors that are involved in many cellular processes, including cell differentiation, cell growth, apoptosis, immune responses, response to DNA damage, and chromatin remodeling. Human SP4 is a risk gene of multiple psychiatric disorders including schizophrenia, bipolar disorder, and major depression. SP4 belongs to a family of proteins, called the SP/Kruppel or Krueppel-like Factor (KLF) family, characterized by a C-terminal DNA-binding domain of 81 amino acids consisting of three Kruppel-like C2H2 zinc fingers. These factors bind to a loose consensus motif, namely NNRCRCCYY (where N is any nucleotide; R is A/G, and Y is C/T), such as the recurring motifs in GC and GT boxes (5'-GGGGCGGGG-3' and 5'-GGTGTGGGG-3') that are present in promoters and more distal regulatory elements of mammalian genes. SP factors preferentially bind GC boxes, while KLFs bind CACCC boxes. Another characteristic hallmark of SP factors is the presence of the Buttonhead (BTD) box CXCPXC, just N-terminal to the zinc fingers. The function of the BTD box is unknown, but it is thought to play an important physiological role. Another feature of most SP factors is the presence of a conserved amino acid stretch, the so-called SP box, located close to the N-terminus. SP factors may be separated into three groups based on their domain architecture and the similarity of their N-terminal transactivation domains: SP1-4, SP5, and SP6-9. The transactivation domains between the three groups are not homologous to one another. SP1-4 have similar N-terminal transactivation domains characterized by glutamine-rich regions, which, in most cases, have adjacent serine/threonine-rich regions. This model represents the N-terminal domain of SP4.


Pssm-ID: 411773 [Multi-domain]  Cd Length: 623  Bit Score: 42.98  E-value: 1.13e-03
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 767998433  255 PANGSAPSAPAQppcFHLALPQNSPSPAAGQPVTVA---QGAPGSLTHSPPAAGQSHMTLVSSP--LPVGQNSLTLQPPA 329
Cdd:cd22536   115 KAGNSNASAPGQ---FQVIQVQNMQNPSGSVQYQVIpqiQTVEGQQIQISPANATALQDLQGQIqlIPAGNNQAILTTPN 191
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 767998433  330 PQP-------VFLSHGVPLHqsVNPPV-LPLSQPVGPVNKSVGTSVLPINQtvrpGVLPLTQPVgpINRPVGPG-----V 396
Cdd:cd22536   192 RTAsgniiaqNLANQTVPVQ--IRPGVsIPLQLQTIPGAQAQVVTTLPINI----GGVTLALPV--INNVAAGGgsgqlV 263
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 767998433  397 LPVSPSVTPGVLQAVSPGVLSVSRAVPSgvlpagqmtpagqmTPAGVIPGQTATSGVLPTGQMVQSGVLPVGQ--TAPSR 474
Cdd:cd22536   264 QPSDGGVSNGNQLVSTPITTASVSTMPE--------------SPSSSTTCTTTASTSLTSSDTLVSSAETGQYasTAASS 329
                         250       260       270       280       290       300       310       320
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 767998433  475 VL----PPGQTAPLRVISAGQVVPSGLLS-PNQTVSSSAVVPVNQGVNSgVLQLSQPVVSgVLPVGQPVRPgVLQLNQTV 549
Cdd:cd22536   330 ERteeePQTSAAESEAQSSSQLQSNGLQNvQDQSNSLQQVQIVGQPILQ-QIQIQQPQQQ-IIQAIQPQSF-QLQSGQTI 406
                         330       340       350       360       370       380       390       400
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 767998433  550 GTNILPVNQPVRPGASQNTT-------FLT-SGSI---------LRQLIPTGKQVNGIPTY-TLAPVSVTlpvppGGLAT 611
Cdd:cd22536   407 QTIQQQPLQNVQLQAVQSPTqvlirapTLTpSGQIswqtvqvqnIQSLSNLQVQNAGLPQQlTLTPVSSS-----AGGTT 481
                         410       420       430
                  ....*....|....*....|....*....|....*..
gi 767998433  612 VAppqmpiQLLPsgaaAPMAGSmpgmpspPVLVNAAQ 648
Cdd:cd22536   482 IA------QIAP----VAVAGT-------PITLNAAQ 501
Atrophin-1 pfam03154
Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian ...
225-480 1.50e-03

Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian atrophy (DRPLA) gene. DRPLA (OMIM:125370) is a progressive neurodegenerative disorder. It is caused by the expansion of a CAG repeat in the DRPLA gene on chromosome 12p. This results in an extended polyglutamine region in atrophin-1 that is thought to confer toxicity to the protein, possibly through altering its interactions with other proteins. The expansion of a CAG repeat is also the underlying defect in six other neurodegenerative disorders, including Huntington's disease. One interaction of expanded polyglutamine repeats that is thought to be pathogenic is that with the short glutamine repeat in the transcriptional coactivator CREB binding protein, CBP. This interaction draws CBP away from its usual nuclear location to the expanded polyglutamine repeat protein aggregates that are characteristic of the polyglutamine neurodegenerative disorders. This interferes with CBP-mediated transcription and causes cytotoxicity.


Pssm-ID: 460830 [Multi-domain]  Cd Length: 991  Bit Score: 42.83  E-value: 1.50e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 767998433   225 LRSVISEHIKRtglLKQTHIAPKPAAHLAAPANgsAPSAPAQPPCFHLALP------QNSPS----PAAGQPVTV----- 289
Cdd:pfam03154  231 IQQTPTLHPQR---LPSPHPPLQPMTQPPPPSQ--VSPQPLPQPSLHGQMPpmphslQTGPShmqhPVPPQPFPLtpqss 305
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 767998433   290 -AQGAPGSLTHSPPAAGQSHMTLVSSPLPVGQNSLTLQPPAPQPVFL------------------SHGVPLHQSVNPPV- 349
Cdd:pfam03154  306 qSQVPPGPSPAAPGQSQQRIHTPPSQSQLQSQQPPREQPLPPAPLSMphikpppttpipqlpnpqSHKHPPHLSGPSPFq 385
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 767998433   350 LPLSQPVGPVNK---SVGTSVLPINQTVRPGVLPLTQPVGPinRPVGPGVLPVSPSVTPGVLQAVSPGvlSVSRAVPSGV 426
Cdd:pfam03154  386 MNSNLPPPPALKplsSLSTHHPPSAHPPPLQLMPQSQQLPP--PPAQPPVLTQSQSLPPPAASHPPTS--GLHQVPSQSP 461
                          250       260       270       280       290
                   ....*....|....*....|....*....|....*....|....*....|....*....
gi 767998433   427 LPAGQMTPAG--QMTPAGVIPgqTATSGVLPTGQMVQSGVLPVGQTAP---SRVLPPGQ 480
Cdd:pfam03154  462 FPQHPFVPGGppPITPPSGPP--TSTSSAMPGIQPPSSASVSSSGPVPaavSCPLPPVQ 518
PRK07994 PRK07994
DNA polymerase III subunits gamma and tau; Validated
247-411 1.78e-03

DNA polymerase III subunits gamma and tau; Validated


Pssm-ID: 236138 [Multi-domain]  Cd Length: 647  Bit Score: 42.55  E-value: 1.78e-03
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 767998433  247 KPAAHLAAPANGSAPSAPAqppcfhlALPQNSPSPAAGQPVTVAQGAPGSLTHSPPAAGQSHMTLVSSPLPVGQNSLTLQ 326
Cdd:PRK07994  360 HPAAPLPEPEVPPQSAAPA-------ASAQATAAPTAAVAPPQAPAVPPPPASAPQQAPAVPLPETTSQLLAARQQLQRA 432
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 767998433  327 PPAPQPvflshgvplhqsvnppvlPLSQPVGPVNKSVGTSVLPINQTVRPgvLPLTQPVGPIN------RPVGPGVLPVS 400
Cdd:PRK07994  433 QGATKA------------------KKSEPAAASRARPVNSALERLASVRP--APSALEKAPAKkeayrwKATNPVEVKKE 492
                         170
                  ....*....|.
gi 767998433  401 PSVTPGVLQAV 411
Cdd:PRK07994  493 PVATPKALKKA 503
PPE COG5651
PPE-repeat protein [Function unknown];
409-662 1.84e-03

PPE-repeat protein [Function unknown];


Pssm-ID: 444372 [Multi-domain]  Cd Length: 385  Bit Score: 41.80  E-value: 1.84e-03
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 767998433  409 QAVSpgvLSVSRAVPSGVLPAGQMTPAGQMTPAGVIPGQTATS---GVLPTGQMVQSGVLPVGQTAPSRVLPPGQTAPLR 485
Cdd:COG5651   155 AAAS---AAAVALTPFTQPPPTITNPGGLLGAQNAGSGNTSSNpgfANLGLTGLNQVGIGGLNSGSGPIGLNSGPGNTGF 231
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 767998433  486 VISAGQVVPSGLLSPNQTVSSSAVVPVNQGVNSGVLQLSQPVVSGVLPvgqpvrpgvlqlNQTVGTNILPVNQPVRPGAS 565
Cdd:COG5651   232 AGTGAAAGAAAAAAAAAAAAGAGASAALASLAATLLNASSLGLAATAA------------SSAATNLGLAGSPLGLAGGG 299
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 767998433  566 QNTTFLTSGSilrqliptgkqvNGIPTYTLAPVSVTLPVPPGGLATVAPPQMPIQLLPSGAAAPMAGSMPGMPSPPVLVN 645
Cdd:COG5651   300 AGAAAATGLG------------LGAGGAAGAAGATGAGAALGAGAAAAAAGAAAGAGAAAAAAAGGAGGGGGGALGAGGG 367
                         250
                  ....*....|....*..
gi 767998433  646 AAQSVFVQASSSAADTN 662
Cdd:COG5651   368 GGSAGAAAGAASGGGAA 384
half-pint TIGR01645
poly-U binding splicing factor, half-pint family; The proteins represented by this model ...
384-500 7.47e-03

poly-U binding splicing factor, half-pint family; The proteins represented by this model contain three RNA recognition motifs (rrm: pfam00076) and have been characterized as poly-pyrimidine tract binding proteins associated with RNA splicing factors. PUF60 (GP|6176532), in complex with p54 and in the presence of U2AF, facilitates association of U2 snRNP with pre-mRNA.


Pssm-ID: 130706 [Multi-domain]  Cd Length: 612  Bit Score: 40.44  E-value: 7.47e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 767998433   384 PVGPINRPVGPGVLPVSPSVTPGvlqAVSPGVLSVSRAVPSGVLPAGQMTPAgqmTPAGVIPGQTATSGVLPTGQMVQSG 463
Cdd:TIGR01645  284 PPDALLQPATVSAIPAAAAVAAA---AATAKIMAAEAVAGAAVLGPRAQSPA---TPSSSLPTDIGNKAVVSSAKKEAEE 357
                           90       100       110
                   ....*....|....*....|....*....|....*..
gi 767998433   464 VLPVGQTAPSRVLPPGQTAPLRVISAGQVVPSGLLSP 500
Cdd:TIGR01645  358 VPPLPQAAPAVVKPGPMEIPTPVPPPGLAIPSLVAPP 394
PHA03378 PHA03378
EBNA-3B; Provisional
245-466 8.76e-03

EBNA-3B; Provisional


Pssm-ID: 223065 [Multi-domain]  Cd Length: 991  Bit Score: 40.44  E-value: 8.76e-03
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 767998433  245 APKPAAHLAA-------PANGSAPSAPAQPPCFHLALPQNSPSPA---AGQPVTVAQGAPGSLTHSPPAAGQSHMTlvSS 314
Cdd:PHA03378  700 APTPMRPPAAppgraqrPAAATGRARPPAAAPGRARPPAAAPGRArppAAAPGRARPPAAAPGRARPPAAAPGAPT--PQ 777
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 767998433  315 PLPVGQNSLTLQP---PAPQPVflSHGVPLHQSVNPPVLPLSQpvGPVNKSVGTSVLPINQTVRPGVlpLTQPVGPINRP 391
Cdd:PHA03378  778 PPPQAPPAPQQRPrgaPTPQPP--PQAGPTSMQLMPRAAPGQQ--GPTKQILRQLLTGGVKRGRPSL--KKPAALERQAA 851
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 767998433  392 VGPGVLPVS---------PSVTPGVLQAVS-PGVLSVSRAVPSGVLP------AGQMTPAGQMTPAGVIPGQTATSGVLP 455
Cdd:PHA03378  852 AGPTPSPGSgtsdkivqaPVFYPPVLQPIQvMRQLGSVRAAAASTVTqapteyTGERRGVGPMHPTDIPPSKRAKTDAYV 931
                         250
                  ....*....|.
gi 767998433  456 TGQMVQSGVLP 466
Cdd:PHA03378  932 ESQPPHGGQSH 942
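The alignment blocks above pair each query row (gi 767998433) with a row from the matched model (Cdd:...). As a minimal sketch of how such a pair of rows can be read, the code below counts identical residues over the aligned columns. It assumes that uppercase letters mark aligned residues while lowercase letters and dashes mark unaligned positions and gaps; the function name is hypothetical, and the two strings are copied from the last row pair of the PHA03378 hit.

```python
from typing import Tuple


def count_identities(query: str, subject: str) -> Tuple[int, int]:
    """Return (identical, aligned) column counts for one pair of alignment rows.

    Assumption: a column counts as aligned only if both rows show an
    uppercase residue; lowercase residues and '-' are skipped.
    """
    if len(query) != len(subject):
        raise ValueError("alignment rows must have the same length")
    identical = 0
    aligned = 0
    for q, s in zip(query, subject):
        if q.isupper() and s.isupper():
            aligned += 1
            if q == s:
                identical += 1
    return identical, aligned


if __name__ == "__main__":
    # Last row pair of the PHA03378 alignment (query residues 456-466).
    identical, aligned = count_identities("TGQMVQSGVLP", "ESQPPHGGQSH")
    print(identical, aligned)
```

Summing these counts over all row pairs of a hit gives a rough percent identity. Low identity across long proline- and glutamine-rich stretches is consistent with the weak E-values reported for these hits.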
 
Blast search parameters
Data Source: Precalculated data, version = cdd.v.3.21
Preset Options: Database: CDSEARCH/cdd; Low complexity filter: no; Composition Based Adjustment: yes; E-value threshold: 0.01
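The per-hit summary lines in this report ("Pssm-ID: ... Cd Length: ... Bit Score: ... E-value: ...") can be checked against the E-value threshold listed above. The following is a minimal sketch, not an NCBI tool: the regular expression, the function names, and the dictionary keys are assumptions based only on the line layout shown in this report.

```python
import re
from typing import Dict, Optional

# Layout assumed from the summary lines in this report, e.g.
# "Pssm-ID: 444372 [Multi-domain]  Cd Length: 385  Bit Score: 41.80  E-value: 1.84e-03"
SUMMARY_RE = re.compile(
    r"Pssm-ID:\s*(\d+)\s*\[([^\]]*)\]\s*"
    r"Cd Length:\s*(\d+)\s*"
    r"Bit Score:\s*([\d.]+)\s*"
    r"E-value:\s*(\S+)"
)


def parse_summary(line: str) -> Optional[Dict[str, object]]:
    """Parse one hit summary line; return None if the line does not match."""
    m = SUMMARY_RE.search(line)
    if m is None:
        return None
    return {
        "pssm_id": int(m.group(1)),
        "hit_type": m.group(2),
        "cd_length": int(m.group(3)),
        "bit_score": float(m.group(4)),
        "evalue": float(m.group(5)),
    }


def passes_threshold(hit: Dict[str, object], threshold: float = 0.01) -> bool:
    """True if the hit's E-value is at or below the search threshold."""
    return float(hit["evalue"]) <= threshold


if __name__ == "__main__":
    line = ("Pssm-ID: 444372 [Multi-domain]  Cd Length: 385  "
            "Bit Score: 41.80  E-value: 1.84e-03")
    hit = parse_summary(line)
    print(hit, passes_threshold(hit))
```

Every hit listed in this section has an E-value below 0.01, which is what the threshold setting would lead one to expect.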
