|
Name |
Accession |
Description |
Interval |
E-value |
| WD40 |
cd00200 |
WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions ... |
13-332 |
2.34e-28 |
|
WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel beta-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment.
Pssm-ID: 238121 [Multi-domain] Cd Length: 289 Bit Score: 116.28 E-value: 2.34e-28
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 14149696 13 AWSPASQYplyLATGtsaqqldssfSTNGTLEIFEVDFRDPSLDLK-HRGVLSALSRFHklvwgsfgsglleSSGVIVGG 91
Cdd:cd00200 16 AFSPDGKL---LATG----------SGDGTIKVWDLETGELLRTLKgHTGPVRDVAASA-------------DGTYLASG 69
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 14149696 92 GDNGMLILYNVthilsSGKEPVIAQKQkHTGAVRALDLNPfQGNLLASGASDSEIFIWDLNNLNVPMTLGSKSQqppeDI 171
Cdd:cd00200 70 SSDKTIRLWDL-----ETGECVRTLTG-HTSYVSSVAFSP-DGRILSSSSRDKTIKVWDVETGKCLTTLRGHTD----WV 138
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 14149696 172 KALSWNrQAQHILSSAHPSGKAVVWDLRKNEPIIKVSDHSNRMHCsgLAWHPDiATQLVLCSEDDrlpVIQLWDLRfASS 251
Cdd:cd00200 139 NSVAFS-PDGTFVASSSQDGTIKLWDLRTGKCVATLTGHTGEVNS--VAFSPD-GEKLLSSSSDG---TIKLWDLS-TGK 210
|
250 260 270 280 290 300 310 320
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 14149696 252 PLKVLESHSRGILSVSWSQaDAELLLTSAKDSQILCRNLGSSEVVYKLPTQSSWCFDVQWCPRDPSVFSaASFNGWISLY 331
Cdd:cd00200 211 CLGTLRGHENGVNSVAFSP-DGYLLASGSEDGTIRVWDLRTGECVQTLSGHTNSVTSLAWSPDGKRLAS-GSADGTIRIW 288
|
.
gi 14149696 332 S 332
Cdd:cd00200 289 D 289
|
|
| WD40 |
COG2319 |
WD40 repeat [General function prediction only]; |
13-333 |
6.41e-25 |
|
WD40 repeat [General function prediction only];
Pssm-ID: 441893 [Multi-domain] Cd Length: 403 Bit Score: 108.85 E-value: 6.41e-25
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 14149696 13 AWSPASQYplyLATGtsaqqldssfSTNGTLEIFEVDFRDPSLDLK-HRGVLSALSrFHklvwgsfgsglleSSG-VIVG 90
Cdd:COG2319 127 AFSPDGKT---LASG----------SADGTVRLWDLATGKLLRTLTgHSGAVTSVA-FS-------------PDGkLLAS 179
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 14149696 91 GGDNGMLILYNVThilsSGKEpvIAQKQKHTGAVRALDLNPfQGNLLASGASDSEIFIWDLNNLNVPMTLGSKSQQpped 170
Cdd:COG2319 180 GSDDGTVRLWDLA----TGKL--LRTLTGHTGAVRSVAFSP-DGKLLASGSADGTVRLWDLATGKLLRTLTGHSGS---- 248
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 14149696 171 IKALSWNRQAQHILsSAHPSGKAVVWDLRKNEPIIKVSDHSNRMHcsGLAWHPDiATQLVLCSEDDRlpvIQLWDLRfAS 250
Cdd:COG2319 249 VRSVAFSPDGRLLA-SGSADGTVRLWDLATGELLRTLTGHSGGVN--SVAFSPD-GKLLASGSDDGT---VRLWDLA-TG 320
|
250 260 270 280 290 300 310 320
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 14149696 251 SPLKVLESHSRGILSVSWSqADAELLLTSAKDSQILCRNLGSSEVVYKLPTQSSWCFDVQWCPrDPSVFSAASFNGWISL 330
Cdd:COG2319 321 KLLRTLTGHTGAVRSVAFS-PDGKTLASGSDDGTVRLWDLATGELLRTLTGHTGAVTSVAFSP-DGRTLASGSADGTVRL 398
|
...
gi 14149696 331 YSV 333
Cdd:COG2319 399 WDL 401
|
|
| Atrophin-1 |
pfam03154 |
Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian ... |
728-1076 |
5.17e-12 |
|
Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian atrophy (DRPLA) gene. DRPLA (OMIM:125370) is a progressive neurodegenerative disorder. It is caused by the expansion of a CAG repeat in the DRPLA gene on chromosome 12p. This results in an extended polyglutamine region in atrophin-1 that is thought to confer toxicity to the protein, possibly through altering its interactions with other proteins. The expansion of a CAG repeat is also the underlying defect in six other neurodegenerative disorders, including Huntington's disease. One interaction of expanded polyglutamine repeats that is thought to be pathogenic is that with the short glutamine repeat in the transcriptional coactivator CREB binding protein, CBP. This interaction draws CBP away from its usual nuclear location to the expanded polyglutamine repeat protein aggregates that are characteristic of the polyglutamine neurodegenerative disorders. This interferes with CBP-mediated transcription and causes cytotoxicity.
Pssm-ID: 460830 [Multi-domain] Cd Length: 991 Bit Score: 70.57 E-value: 5.17e-12
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 14149696 728 LRGPHGVSPGPATTYRVTQYANLLAAQGSLATAMSFLPRDCAQPPVQQlrdrlfhaqgsavlgqQSPPFPFPRIVVGATL 807
Cdd:pfam03154 174 LQAQSGAASPPSPPPPGTTQAATAGPTPSAPSVPPQGSPATSQPPNQT----------------QSTAAPHTLIQQTPTL 237
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 14149696 808 HSKETSSYRLGSQPSHQVPTPSPRPRVFTPQSSPAMPLAP------------SHPSPYQG-PRTQNISDYRAP-GPQAIQ 873
Cdd:pfam03154 238 HPQRLPSPHPPLQPMTQPPPPSQVSPQPLPQPSLHGQMPPmphslqtgpshmQHPVPPQPfPLTPQSSQSQVPpGPSPAA 317
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 14149696 874 PLPLSPGVRPASSQPQLlggqrvQVPNPvgfPGTWPLPGSPLPMacPGIMRPGSTSLPETPRLFP---LLPLRPLGPGRM 950
Cdd:pfam03154 318 PGQSQQRIHTPPSQSQL------QSQQP---PREQPLPPAPLSM--PHIKPPPTTPIPQLPNPQShkhPPHLSGPSPFQM 386
|
250 260 270 280 290 300 310 320
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 14149696 951 VSHTPAPPASFPVPYLPGD--PGA--------PCSSVLPTTG----ILTPHPGPQDSWKEAPAPRGNLQRNKLP----ET 1012
Cdd:pfam03154 387 NSNLPPPPALKPLSSLSTHhpPSAhppplqlmPQSQQLPPPPaqppVLTQSQSLPPPAASHPPTSGLHQVPSQSpfpqHP 466
|
330 340 350 360 370 380 390
....*....|....*....|....*....|....*....|....*....|....*....|....*....|.
gi 14149696 1013 FMP--PAPITAPVMSLTPELQGILPSQPPVSSVSHAPPGVPGELS-----LQLQHLPPEKMERKELPPEHQ 1076
Cdd:pfam03154 467 FVPggPPPITPPSGPPTSTSSAMPGIQPPSSASVSSSGPVPAAVScplppVQIKEEALDEAEEPESPPPPP 537
|
|
| PHA03247 |
PHA03247 |
large tegument protein UL36; Provisional |
771-1076 |
1.63e-11 |
|
large tegument protein UL36; Provisional
Pssm-ID: 223021 [Multi-domain] Cd Length: 3151 Bit Score: 69.20 E-value: 1.63e-11
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 14149696 771 PPVQQLRDRLFHAQGSAVLGQQSPPFPFPRIVVGATLHSKETSsyRLGSQPSHQVPTPSPRPRVFTPQSSPAMPLA---P 847
Cdd:PHA03247 2626 PPPPSPSPAANEPDPHPPPTVPPPERPRDDPAPGRVSRPRRAR--RLGRAAQASSPPQRPRRRAARPTVGSLTSLAdppP 2703
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 14149696 848 SHPSPYQGPRTQNISDYRAPGPQAIQ----PLPLSPGVRPASSQPQLLGGqrvqvPNPVGFPGTWPLPGSPLPMACPGIM 923
Cdd:PHA03247 2704 PPPTPEPAPHALVSATPLPPGPAAARqaspALPAAPAPPAVPAGPATPGG-----PARPARPPTTAGPPAPAPPAAPAAG 2778
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 14149696 924 RPGSTSLPEtprlfpllplrplgpgrmVSHTPAPPASFPVPYLPGDPGA----PCSSVLPTTGILTPHPGPQDSWKEAPA 999
Cdd:PHA03247 2779 PPRRLTRPA------------------VASLSESRESLPSPWDPADPPAavlaPAAALPPAASPAGPLPPPTSAQPTAPP 2840
|
250 260 270 280 290 300 310
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....
gi 14149696 1000 -PRGNLQRNKLPETFMPP-APITAPVMSLTPELQGILPSQPPVSSVSHAPPGVPGElSLQLQHLPPEKMERKELPPEHQ 1076
Cdd:PHA03247 2841 pPPGPPPPSLPLGGSVAPgGDVRRRPPSRSPAAKPAAPARPPVRRLARPAVSRSTE-SFALPPDQPERPPQPQAPPPPQ 2918
|
|
| ACE1-Sec16-like |
cd09233 |
Ancestral coatomer element 1 (ACE1) of COPII coat complex assembly protein Sec16; COPII coat ... |
561-691 |
5.24e-05 |
|
Ancestral coatomer element 1 (ACE1) of COPII coat complex assembly protein Sec16; the COPII coat complex plays an important role in vesicular traffic of newly synthesized proteins from the endoplasmic reticulum (ER) to the Golgi apparatus by mediating the formation of transport vesicles. COPII consists of an outer coat, made up of the scaffold proteins Sec31 and Sec13, and the cargo adaptor complex, Sec23 and Sec24, which are recruited by the small GTPase Sar1. Sec16 is involved in the early steps of the assembly process. Sec16 forms elongated heterotetramers with Sec13, Sec13-(Sec16)2-Sec13. It interacts with Sec13 by insertion of a single beta-blade to close the six-bladed beta propeller of Sec13. Sec13 interacts in the same way with Sec31 and with Nup145C, a nuclear pore protein; all of these contain a structurally related ancestral coatomer element 1 (ACE1). Sec16 is believed to be a key component in maintaining the integrity of the ER exit site.
Pssm-ID: 187750 [Multi-domain] Cd Length: 314 Bit Score: 46.87 E-value: 5.24e-05
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 14149696 561 KDIDGLLSQA-------LLLGELGPAVELCLKEERFADAIILAQAGGTDLLKQTQERYlAKKKTKISSLLACVVQ---KN 630
Cdd:cd09233 56 VGTDIAEQKAlnrfrnlLLTGNRKEALELALDNGLWAHALLLASSLGKETWAEVVSRF-ARSESKLNDPLQTLYQlfsGN 134
|
90 100 110 120 130 140 150
....*....|....*....|....*....|....*....|....*....|....*....|....*....|.
gi 14149696 631 WKDVVCTCS---------LKNWREALALLLTySGTEKFPELCDM-LGTRMEQEGsraLTSEARLCYVCSGS 691
Cdd:cd09233 135 SPEAITELAdnpaeaewaLGNWREHLAIILS-NRTSNLDLEALVeLGDLLAQRG---LVEAAHICYLLAGV 201
|
|
| PTZ00420 |
PTZ00420 |
coronin; Provisional |
80-202 |
1.11e-04 |
|
coronin; Provisional
Pssm-ID: 240412 [Multi-domain] Cd Length: 568 Bit Score: 46.48 E-value: 1.11e-04
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 14149696 80 GLLESSGVIV------GGGDNGMLILYNVTHilssgKEPVIAQKqKHTGAVRALDLNPFQGNLLASGASDSEIFIWDL-- 151
Cdd:PTZ00420 33 GIACSSGFVAvpweveGGGLIGAIRLENQMR-----KPPVIKLK-GHTSSILDLQFNPCFSEILASGSEDLTIRVWEIph 106
|
90 100 110 120 130
....*....|....*....|....*....|....*....|....*....|....*...
gi 14149696 152 NNLNV-----PMTL--GSKSQqppedIKALSWNRQAQHILSSAHPSGKAVVWDLrKNE 202
Cdd:PTZ00420 107 NDESVkeikdPQCIlkGHKKK-----ISIIDWNPMNYYIMCSSGFDSFVNIWDI-ENE 158
|
|
| WD40 |
smart00320 |
WD40 repeats; Note that these repeats are permuted with respect to the structural repeats ... |
120-150 |
2.37e-04 |
|
WD40 repeats; Note that these repeats are permuted with respect to the structural repeats (blades) of the beta propeller domain.
Pssm-ID: 197651 [Multi-domain] Cd Length: 40 Bit Score: 39.60 E-value: 2.37e-04
10 20 30
....*....|....*....|....*....|.
gi 14149696 120 HTGAVRALDLNPfQGNLLASGASDSEIFIWD 150
Cdd:smart00320 11 HTGPVTSVAFSP-DGKYLASGSDDGTIKLWD 40
|
|
| WD40 |
pfam00400 |
WD domain, G-beta repeat; |
120-150 |
1.00e-03 |
|
WD domain, G-beta repeat;
Pssm-ID: 459801 [Multi-domain] Cd Length: 39 Bit Score: 37.71 E-value: 1.00e-03
10 20 30
....*....|....*....|....*....|.
gi 14149696 120 HTGAVRALDLNPfQGNLLASGASDSEIFIWD 150
Cdd:pfam00400 10 HTGSVTSLAFSP-DGKLLASGSDDGTVKVWD 39
|
|
| PABP-1234 |
TIGR01628 |
polyadenylate binding protein, human types 1, 2, 3, 4 family; These eukaryotic proteins ... |
865-990 |
1.12e-03 |
|
polyadenylate binding protein, human types 1, 2, 3, 4 family; These eukaryotic proteins recognize the poly-A of mRNA and consist of four tandem RNA recognition domains at the N-terminus (rrm: pfam00076) followed by a PABP-specific domain (pfam00658) at the C-terminus. The protein is involved in the transport of mRNAs from the nucleus to the cytoplasm. There are four paralogs in Homo sapiens: one expressed in testis, one in platelets, one broadly expressed, and one of unknown tissue range.
Pssm-ID: 130689 [Multi-domain] Cd Length: 562 Bit Score: 42.87 E-value: 1.12e-03
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 14149696 865 RAPGPQAIQPLPLSPgvrpASSQPQLLG-GQRVQVP-NPVGFPGTWPLPGSplpmacpgiMRPGSTSLPETPRLFPLLPL 942
Cdd:TIGR01628 379 QPRMRQLPMGSPMGG----AMGQPPYYGqGPQQQFNgQPLGWPRMSMMPTP---------MGPGGPLRPNGLAPMNAVRA 445
|
90 100 110 120
....*....|....*....|....*....|....*....|....*...
gi 14149696 943 RPLGPGRMvshtPAPPASFPVPYLPGDPGAPCSSVLPTTGILTPHPGP 990
Cdd:TIGR01628 446 PSRNAQNA----AQKPPMQPVMYPPNYQSLPLSQDLPQPQSTASQGGQ 489
|
|
| PspC_subgroup_2 |
NF033839 |
pneumococcal surface protein PspC, LPXTG-anchored form; The pneumococcal surface protein PspC, ... |
810-1038 |
1.18e-03 |
|
pneumococcal surface protein PspC, LPXTG-anchored form; The pneumococcal surface protein PspC, as described in Streptococcus pneumoniae, is a repetitive and highly variable protein, recognized by a conserved N-terminal domain and also by genomic location. This form, subgroup 2, is anchored covalently after cleavage by sortase at a C-terminal LPXTG site. The other form, subgroup 1, has variable numbers of a choline-binding repeat in the C-terminal region, and is also known as choline-binding protein A.
Pssm-ID: 468202 [Multi-domain] Cd Length: 557 Bit Score: 42.83 E-value: 1.18e-03
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 14149696 810 KETSSYRLGSQPSHQ--VPTPSPRPRVFTPQSSPAMPLAPSHPSPY-QGPRTQNISDYRAPGPQA-IQPLPLSPGVRPAS 885
Cdd:NF033839 290 KKPSAPKPGMQPSPQpeKKEVKPEPETPKPEVKPQLEKPKPEVKPQpEKPKPEVKPQLETPKPEVkPQPEKPKPEVKPQP 369
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 14149696 886 SQPQllggqrvqvpnpvgfPGTWPLPGSPLPMACPGIMRPGSTSLPETPRLFPLLPLRPLGPGRMVSHTPAPPAsfpvPY 965
Cdd:NF033839 370 EKPK---------------PEVKPQPETPKPEVKPQPEKPKPEVKPQPEKPKPEVKPQPEKPKPEVKPQPEKPK----PE 430
|
170 180 190 200 210 220 230
....*....|....*....|....*....|....*....|....*....|....*....|....*....|...
gi 14149696 966 LPGDPGAPCSSVLPTTGILTPHPGPQdswKEAPAPRGNLQRNKlPETFMPPAPITAPVMSLTPELQGILPSQP 1038
Cdd:NF033839 431 VKPQPEKPKPEVKPQPEKPKPEVKPQ---PETPKPEVKPQPEK-PKPEVKPQPEKPKPDNSKPQADDKKPSTP 499
|
|
| PBP1 |
COG5180 |
PAB1-binding protein, interacts with poly(A)-binding protein [RNA processing and modification]; ... |
736-1074 |
8.58e-03 |
|
PAB1-binding protein, interacts with poly(A)-binding protein [RNA processing and modification];
Pssm-ID: 444064 [Multi-domain] Cd Length: 548 Bit Score: 40.05 E-value: 8.58e-03
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 14149696 736 PGPATTYRVTQYANLLAAQGS--------LATAMSFLPRDCAQPPVQQLRDRLFHAQGSavLGQQSPPFPFPRIVVGATL 807
Cdd:COG5180 133 PKAKVTREATSASAGVALAAAllqrsdpiLAKDPDGDSASTLPPPAEKLDKVLTEPRDA--LKDSPEKLDRPKVEVKDEA 210
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 14149696 808 HSKETSSYRLGSQPSHQVPTPSPRPRVFTPQSSPAMPLAPSHPSPyqGPRTQNISDYRAPGPQAIQPLPLSPGVRPASSQ 887
Cdd:COG5180 211 QEEPPDLTGGADHPRPEAASSPKVDPPSTSEARSRPATVDAQPEM--RPPADAKERRRAAIGDTPAAEPPGLPVLEAGSE 288
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 14149696 888 PQLlGGQRVQVPNPVGFPGTWPLPGSPLPMACPGIMRPGSTSLPEtprLFPLLPLRPLGPGRMVSHTPA--PPASFPVPY 965
Cdd:COG5180 289 PQS-DAPEAETARPIDVKGVASAPPATRPVRPPGGARDPGTPRPG---QPTERPAGVPEAASDAGQPPSayPPAEEAVPG 364
|
250 260 270 280 290 300 310 320
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 14149696 966 LPGDPGAP--CSSVLPTTGILTPHPGPQDSWKEAPAPRGNLQRNKLPEtFMPPAPITAPVMSLTPELQGILPSQPPVSSV 1043
Cdd:COG5180 365 KPLEQGAPrpGSSGGDGAPFQPPNGAPQPGLGRRGAPGPPMGAGDLVQ-AALDGGGRETASLGGAAGGAGQGPKADFVPG 443
|
330 340 350
....*....|....*....|....*....|.
gi 14149696 1044 SHAPPGVPGELSLQLQHLPPEKMERKELPPE 1074
Cdd:COG5180 444 DAESVSGPAGLADQAGAAASTAMADFVAPVT 474
|
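Each hit in the listing above carries a one-line summary of the form `Pssm-ID: ... [Multi-domain] Cd Length: ... Bit Score: ... E-value: ...`. The following is a minimal sketch of how such a line could be read into typed fields, assuming the line layout stays exactly as it appears in this listing. The names `HitSummary` and `parse_summary` are hypothetical helpers introduced for illustration and are not part of any NCBI tool.

```python
import re
from typing import NamedTuple, Optional


class HitSummary(NamedTuple):
    """Fields of one per-hit summary line (hypothetical container)."""
    pssm_id: int
    cd_length: int
    bit_score: float
    e_value: float


# Assumed layout, copied from the listing:
# "Pssm-ID: 238121 [Multi-domain] Cd Length: 289 Bit Score: 116.28 E-value: 2.34e-28"
_SUMMARY_RE = re.compile(
    r"Pssm-ID:\s*(\d+)\s*\[[^\]]*\]\s*"
    r"Cd Length:\s*(\d+)\s*"
    r"Bit Score:\s*([\d.]+)\s*"
    r"E-value:\s*([\d.eE+-]+)"
)


def parse_summary(line: str) -> Optional[HitSummary]:
    """Return the parsed fields, or None if the line is not a summary line."""
    match = _SUMMARY_RE.search(line)
    if match is None:
        return None
    return HitSummary(
        pssm_id=int(match.group(1)),
        cd_length=int(match.group(2)),
        bit_score=float(match.group(3)),
        e_value=float(match.group(4)),
    )
```

Lines that do not match, such as alignment rows or description text, yield `None`, so the function can be applied to every line of the listing and the non-`None` results collected.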
|
|
|
Name |
Accession |
Description |
Interval |
E-value |
| WD40 |
cd00200 |
WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions ... |
13-332 |
2.34e-28 |
|
WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel beta-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment.
Pssm-ID: 238121 [Multi-domain] Cd Length: 289 Bit Score: 116.28 E-value: 2.34e-28
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 14149696 13 AWSPASQYplyLATGtsaqqldssfSTNGTLEIFEVDFRDPSLDLK-HRGVLSALSRFHklvwgsfgsglleSSGVIVGG 91
Cdd:cd00200 16 AFSPDGKL---LATG----------SGDGTIKVWDLETGELLRTLKgHTGPVRDVAASA-------------DGTYLASG 69
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 14149696 92 GDNGMLILYNVthilsSGKEPVIAQKQkHTGAVRALDLNPfQGNLLASGASDSEIFIWDLNNLNVPMTLGSKSQqppeDI 171
Cdd:cd00200 70 SSDKTIRLWDL-----ETGECVRTLTG-HTSYVSSVAFSP-DGRILSSSSRDKTIKVWDVETGKCLTTLRGHTD----WV 138
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 14149696 172 KALSWNrQAQHILSSAHPSGKAVVWDLRKNEPIIKVSDHSNRMHCsgLAWHPDiATQLVLCSEDDrlpVIQLWDLRfASS 251
Cdd:cd00200 139 NSVAFS-PDGTFVASSSQDGTIKLWDLRTGKCVATLTGHTGEVNS--VAFSPD-GEKLLSSSSDG---TIKLWDLS-TGK 210
|
250 260 270 280 290 300 310 320
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 14149696 252 PLKVLESHSRGILSVSWSQaDAELLLTSAKDSQILCRNLGSSEVVYKLPTQSSWCFDVQWCPRDPSVFSaASFNGWISLY 331
Cdd:cd00200 211 CLGTLRGHENGVNSVAFSP-DGYLLASGSEDGTIRVWDLRTGECVQTLSGHTNSVTSLAWSPDGKRLAS-GSADGTIRIW 288
|
.
gi 14149696 332 S 332
Cdd:cd00200 289 D 289
|
|
| WD40 |
COG2319 |
WD40 repeat [General function prediction only]; |
13-333 |
6.41e-25 |
|
WD40 repeat [General function prediction only];
Pssm-ID: 441893 [Multi-domain] Cd Length: 403 Bit Score: 108.85 E-value: 6.41e-25
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 14149696 13 AWSPASQYplyLATGtsaqqldssfSTNGTLEIFEVDFRDPSLDLK-HRGVLSALSrFHklvwgsfgsglleSSG-VIVG 90
Cdd:COG2319 127 AFSPDGKT---LASG----------SADGTVRLWDLATGKLLRTLTgHSGAVTSVA-FS-------------PDGkLLAS 179
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 14149696 91 GGDNGMLILYNVThilsSGKEpvIAQKQKHTGAVRALDLNPfQGNLLASGASDSEIFIWDLNNLNVPMTLGSKSQQpped 170
Cdd:COG2319 180 GSDDGTVRLWDLA----TGKL--LRTLTGHTGAVRSVAFSP-DGKLLASGSADGTVRLWDLATGKLLRTLTGHSGS---- 248
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 14149696 171 IKALSWNRQAQHILsSAHPSGKAVVWDLRKNEPIIKVSDHSNRMHcsGLAWHPDiATQLVLCSEDDRlpvIQLWDLRfAS 250
Cdd:COG2319 249 VRSVAFSPDGRLLA-SGSADGTVRLWDLATGELLRTLTGHSGGVN--SVAFSPD-GKLLASGSDDGT---VRLWDLA-TG 320
|
250 260 270 280 290 300 310 320
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 14149696 251 SPLKVLESHSRGILSVSWSqADAELLLTSAKDSQILCRNLGSSEVVYKLPTQSSWCFDVQWCPrDPSVFSAASFNGWISL 330
Cdd:COG2319 321 KLLRTLTGHTGAVRSVAFS-PDGKTLASGSDDGTVRLWDLATGELLRTLTGHTGAVTSVAFSP-DGRTLASGSADGTVRL 398
|
...
gi 14149696 331 YSV 333
Cdd:COG2319 399 WDL 401
|
|
| WD40 |
cd00200 |
WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions ... |
120-334 |
1.86e-24 |
|
WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel beta-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment.
Pssm-ID: 238121 [Multi-domain] Cd Length: 289 Bit Score: 104.72 E-value: 1.86e-24
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 14149696 120 HTGAVRALDLNPfQGNLLASGASDSEIFIWDLNNLNVPMTLGSKSqQPPEDIKALSWNRQ-------------------- 179
Cdd:cd00200 8 HTGGVTCVAFSP-DGKLLATGSGDGTIKVWDLETGELLRTLKGHT-GPVRDVAASADGTYlasgssdktirlwdletgec 85
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 14149696 180 ------------------AQHILSSAHPSGKAVVWDLRKNEPIIKVSDHSNRMHCsgLAWHPDiaTQLVLCSEDDRLpvI 241
Cdd:cd00200 86 vrtltghtsyvssvafspDGRILSSSSRDKTIKVWDVETGKCLTTLRGHTDWVNS--VAFSPD--GTFVASSSQDGT--I 159
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 14149696 242 QLWDLRfASSPLKVLESHSRGILSVSWSqADAELLLTSAKDSQILCRNLGSSEVVYKLPTQSSWCFDVQWCPrDPSVFSA 321
Cdd:cd00200 160 KLWDLR-TGKCVATLTGHTGEVNSVAFS-PDGEKLLSSSSDGTIKLWDLSTGKCLGTLRGHENGVNSVAFSP-DGYLLAS 236
|
250
....*....|...
gi 14149696 322 ASFNGWISLYSVM 334
Cdd:cd00200 237 GSEDGTIRVWDLR 249
|
|
| WD40 |
COG2319 |
WD40 repeat [General function prediction only]; |
51-333 |
5.06e-19 |
|
WD40 repeat [General function prediction only];
Pssm-ID: 441893 [Multi-domain] Cd Length: 403 Bit Score: 90.74 E-value: 5.06e-19
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 14149696 51 RDPSLDLKHRGVLSALSRFHKLVWGSFGSGLLESSGVIVGGGDNGMLILYNVTHILSSGKEPVIAQKQKHTGAVRALDLN 130
Cdd:COG2319 8 ALAAASADLALALLAAALGALLLLLLGLAAAVASLAASPDGARLAAGAGDLTLLLLDAAAGALLATLLGHTAAVLSVAFS 87
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 14149696 131 PfQGNLLASGASDSEIFIWDLNNLNVPMTLgsksQQPPEDIKALSWNRQAQHILSSAHPsGKAVVWDLRKNEPIIKVSDH 210
Cdd:COG2319 88 P-DGRLLASASADGTVRLWDLATGLLLRTL----TGHTGAVRSVAFSPDGKTLASGSAD-GTVRLWDLATGKLLRTLTGH 161
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 14149696 211 SNRMHCsgLAWHPDiATQLVLCSEDDRlpvIQLWDLRfASSPLKVLESHSRGILSVSWSqADAELLLTSAKDSQILCRNL 290
Cdd:COG2319 162 SGAVTS--VAFSPD-GKLLASGSDDGT---VRLWDLA-TGKLLRTLTGHTGAVRSVAFS-PDGKLLASGSADGTVRLWDL 233
|
250 260 270 280
....*....|....*....|....*....|....*....|...
gi 14149696 291 GSSEVVYKLPTQSSWCFDVQWCPrDPSVFSAASFNGWISLYSV 333
Cdd:COG2319 234 ATGKLLRTLTGHSGSVRSVAFSP-DGRLLASGSADGTVRLWDL 275
|
|
| Atrophin-1 |
pfam03154 |
Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian ... |
728-1076 |
5.17e-12 |
|
Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian atrophy (DRPLA) gene. DRPLA (OMIM:125370) is a progressive neurodegenerative disorder. It is caused by the expansion of a CAG repeat in the DRPLA gene on chromosome 12p. This results in an extended polyglutamine region in atrophin-1 that is thought to confer toxicity to the protein, possibly through altering its interactions with other proteins. The expansion of a CAG repeat is also the underlying defect in six other neurodegenerative disorders, including Huntington's disease. One interaction of expanded polyglutamine repeats that is thought to be pathogenic is that with the short glutamine repeat in the transcriptional coactivator CREB binding protein, CBP. This interaction draws CBP away from its usual nuclear location to the expanded polyglutamine repeat protein aggregates that are characteristic of the polyglutamine neurodegenerative disorders. This interferes with CBP-mediated transcription and causes cytotoxicity.
Pssm-ID: 460830 [Multi-domain] Cd Length: 991 Bit Score: 70.57 E-value: 5.17e-12
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 14149696 728 LRGPHGVSPGPATTYRVTQYANLLAAQGSLATAMSFLPRDCAQPPVQQlrdrlfhaqgsavlgqQSPPFPFPRIVVGATL 807
Cdd:pfam03154 174 LQAQSGAASPPSPPPPGTTQAATAGPTPSAPSVPPQGSPATSQPPNQT----------------QSTAAPHTLIQQTPTL 237
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 14149696 808 HSKETSSYRLGSQPSHQVPTPSPRPRVFTPQSSPAMPLAP------------SHPSPYQG-PRTQNISDYRAP-GPQAIQ 873
Cdd:pfam03154 238 HPQRLPSPHPPLQPMTQPPPPSQVSPQPLPQPSLHGQMPPmphslqtgpshmQHPVPPQPfPLTPQSSQSQVPpGPSPAA 317
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 14149696 874 PLPLSPGVRPASSQPQLlggqrvQVPNPvgfPGTWPLPGSPLPMacPGIMRPGSTSLPETPRLFP---LLPLRPLGPGRM 950
Cdd:pfam03154 318 PGQSQQRIHTPPSQSQL------QSQQP---PREQPLPPAPLSM--PHIKPPPTTPIPQLPNPQShkhPPHLSGPSPFQM 386
|
250 260 270 280 290 300 310 320
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 14149696 951 VSHTPAPPASFPVPYLPGD--PGA--------PCSSVLPTTG----ILTPHPGPQDSWKEAPAPRGNLQRNKLP----ET 1012
Cdd:pfam03154 387 NSNLPPPPALKPLSSLSTHhpPSAhppplqlmPQSQQLPPPPaqppVLTQSQSLPPPAASHPPTSGLHQVPSQSpfpqHP 466
|
330 340 350 360 370 380 390
....*....|....*....|....*....|....*....|....*....|....*....|....*....|.
gi 14149696 1013 FMP--PAPITAPVMSLTPELQGILPSQPPVSSVSHAPPGVPGELS-----LQLQHLPPEKMERKELPPEHQ 1076
Cdd:pfam03154 467 FVPggPPPITPPSGPPTSTSSAMPGIQPPSSASVSSSGPVPAAVScplppVQIKEEALDEAEEPESPPPPP 537
|
|
| PHA03247 |
PHA03247 |
large tegument protein UL36; Provisional |
771-1076 |
1.63e-11 |
|
large tegument protein UL36; Provisional
Pssm-ID: 223021 [Multi-domain] Cd Length: 3151 Bit Score: 69.20 E-value: 1.63e-11
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 14149696 771 PPVQQLRDRLFHAQGSAVLGQQSPPFPFPRIVVGATLHSKETSsyRLGSQPSHQVPTPSPRPRVFTPQSSPAMPLA---P 847
Cdd:PHA03247 2626 PPPPSPSPAANEPDPHPPPTVPPPERPRDDPAPGRVSRPRRAR--RLGRAAQASSPPQRPRRRAARPTVGSLTSLAdppP 2703
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 14149696 848 SHPSPYQGPRTQNISDYRAPGPQAIQ----PLPLSPGVRPASSQPQLLGGqrvqvPNPVGFPGTWPLPGSPLPMACPGIM 923
Cdd:PHA03247 2704 PPPTPEPAPHALVSATPLPPGPAAARqaspALPAAPAPPAVPAGPATPGG-----PARPARPPTTAGPPAPAPPAAPAAG 2778
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 14149696 924 RPGSTSLPEtprlfpllplrplgpgrmVSHTPAPPASFPVPYLPGDPGA----PCSSVLPTTGILTPHPGPQDSWKEAPA 999
Cdd:PHA03247 2779 PPRRLTRPA------------------VASLSESRESLPSPWDPADPPAavlaPAAALPPAASPAGPLPPPTSAQPTAPP 2840
|
250 260 270 280 290 300 310
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....
gi 14149696 1000 -PRGNLQRNKLPETFMPP-APITAPVMSLTPELQGILPSQPPVSSVSHAPPGVPGElSLQLQHLPPEKMERKELPPEHQ 1076
Cdd:PHA03247 2841 pPPGPPPPSLPLGGSVAPgGDVRRRPPSRSPAAKPAAPARPPVRRLARPAVSRSTE-SFALPPDQPERPPQPQAPPPPQ 2918
|
|
| PHA03247 |
PHA03247 |
large tegument protein UL36; Provisional |
794-1078 |
4.27e-11 |
|
large tegument protein UL36; Provisional
Pssm-ID: 223021 [Multi-domain] Cd Length: 3151 Bit Score: 68.04 E-value: 4.27e-11
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 14149696 794 PPFPFPRIvvgatlHSKETSSYRLGSQPSHQVPTPSPRP-RVFTPQSSPAMPLAPSHPSPYQGPRTQ-------NISDYR 865
Cdd:PHA03247 2626 PPPPSPSP------AANEPDPHPPPTVPPPERPRDDPAPgRVSRPRRARRLGRAAQASSPPQRPRRRaarptvgSLTSLA 2699
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 14149696 866 APGPQAIQPLPLSPGVRPASSQPQLLGGQRVQVPNPVGFPGTWPLPGSP-LPMACPGIMRPGSTSLPEtprlfpllplrp 944
Cdd:PHA03247 2700 DPPPPPPTPEPAPHALVSATPLPPGPAAARQASPALPAAPAPPAVPAGPaTPGGPARPARPPTTAGPP------------ 2767
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 14149696 945 lgpgrmvshTPAPPASFPVPYLPGDPGAPCSSVLPTTGILTPHPGPQDSWKEAPAPRGNLQRNKLPETFMPPAPITAPVM 1024
Cdd:PHA03247 2768 ---------APAPPAAPAAGPPRRLTRPAVASLSESRESLPSPWDPADPPAAVLAPAAALPPAASPAGPLPPPTSAQPTA 2838
|
250 260 270 280 290 300
....*....|....*....|....*....|....*....|....*....|....*....|.
gi 14149696 1025 SLTPElQGILPSQPPVSSVS-------HAPPGVPGELSLQLQHLPPEKMERKELPPEHQSL 1078
Cdd:PHA03247 2839 PPPPP-GPPPPSLPLGGSVApggdvrrRPPSRSPAAKPAAPARPPVRRLARPAVSRSTESF 2898
|
|
| PHA03247 |
PHA03247 |
large tegument protein UL36; Provisional |
821-1052 |
1.90e-10 |
|
large tegument protein UL36; Provisional
Pssm-ID: 223021 [Multi-domain] Cd Length: 3151 Bit Score: 65.73 E-value: 1.90e-10
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 14149696 821 PSHQVPTP--SPRP-------RVFTP-----QSSPAMPLAPSHPSPYQGPRTQNISDYRAPGPQAIQPLPLS---PGVRP 883
Cdd:PHA03247 2564 PDRSVPPPrpAPRPsepavtsRARRPdappqSARPRAPVDDRGDPRGPAPPSPLPPDTHAPDPPPPSPSPAAnepDPHPP 2643
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 14149696 884 ASSQPQLL-----GGQRVQVPNPVGFPGTWPLPGSPLPMACPGIMRPGSTSLPETPRLFPLLPLRPLGPGRMVSHTPAPP 958
Cdd:PHA03247 2644 PTVPPPERprddpAPGRVSRPRRARRLGRAAQASSPPQRPRRRAARPTVGSLTSLADPPPPPPTPEPAPHALVSATPLPP 2723
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 14149696 959 -------ASFPVPYLPGDPGAPCSSVLP-----------TTGILTPHP------GPQDSWKEAPAPRGNLQRNKLPETFM 1014
Cdd:PHA03247 2724 gpaaarqASPALPAAPAPPAVPAGPATPggparparpptTAGPPAPAPpaapaaGPPRRLTRPAVASLSESRESLPSPWD 2803
|
250 260 270 280
....*....|....*....|....*....|....*....|.
gi 14149696 1015 P---PAPITAPVMSLTPELQGILPSQPPVSSVSHAPPGVPG 1052
Cdd:PHA03247 2804 PadpPAAVLAPAAALPPAASPAGPLPPPTSAQPTAPPPPPG 2844
|
|
| PHA03247 |
PHA03247 |
large tegument protein UL36; Provisional |
729-1081 |
4.16e-10 |
|
large tegument protein UL36; Provisional
Pssm-ID: 223021 [Multi-domain] Cd Length: 3151 Bit Score: 64.57 E-value: 4.16e-10
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 14149696 729 RGPHGVSPGPATTYRV-------TQYANLLAAQGSLATAMSFLPRDCAQPPVQQLRDRLfHAQGSAVLGQQSPPFPFPRI 801
Cdd:PHA03247 2609 RGPAPPSPLPPDTHAPdppppspSPAANEPDPHPPPTVPPPERPRDDPAPGRVSRPRRA-RRLGRAAQASSPPQRPRRRA 2687
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 14149696 802 VVGATlhSKETSSYRlgsqPSHQVPTPSPRPRVFTP------------QSSPAMPLAPSHPSPYQGPRTqNISDYRAPGP 869
Cdd:PHA03247 2688 ARPTV--GSLTSLAD----PPPPPPTPEPAPHALVSatplppgpaaarQASPALPAAPAPPAVPAGPAT-PGGPARPARP 2760
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 14149696 870 QAIQ--PLPLSPGVRPASSQPQL--------------LGGQRVQVPNPVGFPG-------------TWPLPGSPLPMACP 920
Cdd:PHA03247 2761 PTTAgpPAPAPPAAPAAGPPRRLtrpavaslsesresLPSPWDPADPPAAVLApaaalppaaspagPLPPPTSAQPTAPP 2840
|
250 260 270 280 290 300 310 320
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 14149696 921 GIMRPGSTSLPETPRLFPLLPLRPLGPGRMVSHTPAPPASFPVPYLPGDPGAPCSSVLPttgilTPHPGPQdswkeaPAP 1000
Cdd:PHA03247 2841 PPPGPPPPSLPLGGSVAPGGDVRRRPPSRSPAAKPAAPARPPVRRLARPAVSRSTESFA-----LPPDQPE------RPP 2909
|
330 340 350 360 370 380 390 400
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 14149696 1001 RGNLQRNKLPETFMPPAPITAPVMSLTPELQGILPSQPPVSSVSHAPPGVPGElslQLQHLPPEKME--RKELPPEHQSL 1078
Cdd:PHA03247 2910 QPQAPPPPQPQPQPPPPPQPQPPPPPPPRPQPPLAPTTDPAGAGEPSGAVPQP---WLGALVPGRVAvpRFRVPQPAPSR 2986
|
...
gi 14149696 1079 KSS 1081
Cdd:PHA03247 2987 EAP 2989
|
|
| Atrophin-1 |
pfam03154 |
Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian ... |
793-1063 |
2.14e-08 |
|
Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian atrophy (DRPLA) gene. DRPLA OMIM:125370 is a progressive neurodegenerative disorder. It is caused by the expansion of a CAG repeat in the DRPLA gene on chromosome 12p. This results in an extended polyglutamine region in atrophin-1 that is thought to confer toxicity to the protein, possibly through altering its interactions with other proteins. The expansion of a CAG repeat is also the underlying defect in six other neurodegenerative disorders, including Huntington's disease. One interaction of expanded polyglutamine repeats that is thought to be pathogenic is that with the short glutamine repeat in the transcriptional coactivator CREB binding protein, CBP. This interaction draws CBP away from its usual nuclear location to the expanded polyglutamine repeat protein aggregates that are characteristic of the polyglutamine neurodegenerative disorders. This interferes with CBP-mediated transcription and causes cytotoxicity.
Pssm-ID: 460830 [Multi-domain] Cd Length: 991 Bit Score: 58.63 E-value: 2.14e-08
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 14149696 793 SPPFPFPRIVVGATLHSKETSSYRlGSQPSHQVPT--------------------PSPRPRVFTPQSSPAMPLAPSHPSP 852
Cdd:pfam03154 145 SPSIPSPQDNESDSDSSAQQQILQ-TQPPVLQAQSgaasppsppppgttqaatagPTPSAPSVPPQGSPATSQPPNQTQS 223
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 14149696 853 YQGP-----RTQNISDYRAPGPQ-AIQPLPLSP---GVRPASSQPQLLGGQRVQVPNPVGF-PGTWPLPGSPLPMACPGI 922
Cdd:pfam03154 224 TAAPhtliqQTPTLHPQRLPSPHpPLQPMTQPPppsQVSPQPLPQPSLHGQMPPMPHSLQTgPSHMQHPVPPQPFPLTPQ 303
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 14149696 923 MRPGSTSLPETPRLFPLLPLRplgpgrmvSHTPAPPASFPvpylPGDPgaPCSSVLPTTGILTPH--PGPQDSWKEAPAP 1000
Cdd:pfam03154 304 SSQSQVPPGPSPAAPGQSQQR--------IHTPPSQSQLQ----SQQP--PREQPLPPAPLSMPHikPPPTTPIPQLPNP 369
|
250 260 270 280 290 300
....*....|....*....|....*....|....*....|....*....|....*....|....*..
gi 14149696 1001 rgnlQRNKLPETFMPPAPITAPVmSLTPElqgilPSQPPVSSVS-HAPPGV---PGELSLQLQHLPP 1063
Cdd:pfam03154 370 ----QSHKHPPHLSGPSPFQMNS-NLPPP-----PALKPLSSLStHHPPSAhppPLQLMPQSQQLPP 426
|
|
| PRK12323 |
PRK12323 |
DNA polymerase III subunit gamma/tau; |
817-1053 |
1.49e-06 |
|
DNA polymerase III subunit gamma/tau;
Pssm-ID: 237057 [Multi-domain] Cd Length: 700 Bit Score: 52.57 E-value: 1.49e-06
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 14149696 817 LGSQPSHQ-----VPTPSPRPRVFTPQSSPAMPLAPSHPSPYQGPRTQNISDYRAPGPQAIQPLPLSPGVRPASSQPQLL 891
Cdd:PRK12323 361 LAFRPGQSgggagPATAAAAPVAQPAPAAAAPAAAAPAPAAPPAAPAAAPAAAAAARAVAAAPARRSPAPEALAAARQAS 440
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 14149696 892 GGQRVQVPNPVGFPGTWPLPGSPLPMACPGIMRPGSTSLPETPRLFPLLPlrplgpgrmvshtPAPPASFPVPYLPGDPG 971
Cdd:PRK12323 441 ARGPGGAPAPAPAPAAAPAAAARPAAAGPRPVAAAAAAAPARAAPAAAPA-------------PADDDPPPWEELPPEFA 507
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 14149696 972 APcssvlpttGILTPHPGPQDsWKEAPAPRGNLQRNKLPETFMPPAPITAPVMSLTPELQGILPSQPPVSSVSHAPPGVP 1051
Cdd:PRK12323 508 SP--------APAQPDAAPAG-WVAESIPDPATADPDDAFETLAPAPAAAPAPRAAAATEPVVAPRPPRASASGLPDMFD 578
|
..
gi 14149696 1052 GE 1053
Cdd:PRK12323 579 GD 580
|
|
| PHA03247 |
PHA03247 |
large tegument protein UL36; Provisional |
791-1095 |
2.64e-06 |
|
large tegument protein UL36; Provisional
Pssm-ID: 223021 [Multi-domain] Cd Length: 3151 Bit Score: 52.25 E-value: 2.64e-06
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 14149696 791 QQSPPFPFPRIVVGATLHSKETSSYRLGSQPSHQVPTPSPRP----------RVFTPQS-----SPAMPLAPSHPSPYQG 855
Cdd:PHA03247 2704 PPPTPEPAPHALVSATPLPPGPAAARQASPALPAAPAPPAVPagpatpggpaRPARPPTtagppAPAPPAAPAAGPPRRL 2783
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 14149696 856 PR----TQNISDYRAPGPQAIQPLP---------LSPGVRPASSQPQLLGGQRVQVPNPVGFP------GTWPLPGSPLP 916
Cdd:PHA03247 2784 TRpavaSLSESRESLPSPWDPADPPaavlapaaaLPPAASPAGPLPPPTSAQPTAPPPPPGPPppslplGGSVAPGGDVR 2863
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 14149696 917 MACPGIMRPGSTSLPETPRLFPLLPLRPLGPGRMVSHTPAPPASFPVPYLPGDPGAPcsSVLPTTGILTPHPGPQDSWKE 996
Cdd:PHA03247 2864 RRPPSRSPAAKPAAPARPPVRRLARPAVSRSTESFALPPDQPERPPQPQAPPPPQPQ--PQPPPPPQPQPPPPPPPRPQP 2941
|
250 260 270 280 290 300 310 320
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 14149696 997 APAPRGNLQRNKLPETFMP--------PAPITAPVMSLTPELQGI-LPSQPPVSSVSHAPPGV---PGELSLQLQHLPPE 1064
Cdd:PHA03247 2942 PLAPTTDPAGAGEPSGAVPqpwlgalvPGRVAVPRFRVPQPAPSReAPASSTPPLTGHSLSRVsswASSLALHEETDPPP 3021
|
330 340 350
....*....|....*....|....*....|....*
gi 14149696 1065 KMERKELPP----EHQSLKSSFEALLQRCSLSATD 1095
Cdd:PHA03247 3022 VSLKQTLWPpddtEDSDADSLFDSDSERSDLEALD 3056
|
|
| PRK10263 |
PRK10263 |
DNA translocase FtsK; Provisional |
792-1048 |
1.82e-05 |
|
DNA translocase FtsK; Provisional
Pssm-ID: 236669 [Multi-domain] Cd Length: 1355 Bit Score: 49.31 E-value: 1.82e-05
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 14149696 792 QSPPFPFPRIVVgatlhSKETSSYRLGSQPSHQVPTPSPRPRVFTPQSSPAMPLAPsHPSPYQGPRTQNISDYRAPGPQA 871
Cdd:PRK10263 342 QTPPVASVDVPP-----AQPTVAWQPVPGPQTGEPVIAPAPEGYPQQSQYAQPAVQ-YNEPLQQPVQPQQPYYAPAAEQP 415
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 14149696 872 IQPLPLSPGVRPASSQPQLLGgqrvQVPNPVGFPGTWPLPGSPLpmacpgiMRPGSTSLPETPRLFpllplrplgpgrmv 951
Cdd:PRK10263 416 AQQPYYAPAPEQPAQQPYYAP----APEQPVAGNAWQAEEQQST-------FAPQSTYQTEQTYQQ-------------- 470
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 14149696 952 shtpapPASFPVPYLPGDPGAPCSSVLPTTGILTPHPG--PQDSWKEAPAPRGNlQRNKLPETFMP-PAPITAPVMSLTP 1028
Cdd:PRK10263 471 ------PAAQEPLYQQPQPVEQQPVVEPEPVVEETKPArpPLYYFEEVEEKRAR-EREQLAAWYQPiPEPVKEPEPIKSS 543
|
250 260
....*....|....*....|
gi 14149696 1029 ELQGILPSQPPVSSVSHAPP 1048
Cdd:PRK10263 544 LKAPSVAAVPPVEAAAAVSP 563
|
|
| ACE1-Sec16-like |
cd09233 |
Ancestral coatomer element 1 (ACE1) of COPII coat complex assembly protein Sec16; COPII coat ... |
561-691 |
5.24e-05 |
|
Ancestral coatomer element 1 (ACE1) of COPII coat complex assembly protein Sec16; COPII coat complex plays an important role in vesicular traffic of newly synthesized proteins from the endoplasmic reticulum (ER) to the Golgi apparatus by mediating the formation of transport vesicles. COPII consists of an outer coat, made up of the scaffold proteins Sec31 and Sec13, and the cargo adaptor complex, Sec23 and Sec24, which are recruited by the small GTPase Sar1. Sec16 is involved in the early steps of the assembly process. Sec16 forms elongated heterotetramers with Sec13, Sec13-(Sec16)2-Sec13. It interacts with Sec13 by insertion of a single beta-blade to close the six-bladed beta propeller of Sec13. In the same way Sec13 interacts with Sec31 and Nup145C, a nuclear pore protein; all of these contain a structurally related ancestral coatomer element 1 (ACE1). Sec16 is believed to be a key component in maintaining the integrity of the ER exit site.
Pssm-ID: 187750 [Multi-domain] Cd Length: 314 Bit Score: 46.87 E-value: 5.24e-05
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 14149696 561 KDIDGLLSQA-------LLLGELGPAVELCLKEERFADAIILAQAGGTDLLKQTQERYlAKKKTKISSLLACVVQ---KN 630
Cdd:cd09233 56 VGTDIAEQKAlnrfrnlLLTGNRKEALELALDNGLWAHALLLASSLGKETWAEVVSRF-ARSESKLNDPLQTLYQlfsGN 134
|
90 100 110 120 130 140 150
....*....|....*....|....*....|....*....|....*....|....*....|....*....|.
gi 14149696 631 WKDVVCTCS---------LKNWREALALLLTySGTEKFPELCDM-LGTRMEQEGsraLTSEARLCYVCSGS 691
Cdd:cd09233 135 SPEAITELAdnpaeaewaLGNWREHLAIILS-NRTSNLDLEALVeLGDLLAQRG---LVEAAHICYLLAGV 201
|
|
| PHA03378 |
PHA03378 |
EBNA-3B; Provisional |
808-1076 |
5.73e-05 |
|
EBNA-3B; Provisional
Pssm-ID: 223065 [Multi-domain] Cd Length: 991 Bit Score: 47.37 E-value: 5.73e-05
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 14149696 808 HSKETSSYRLGSQPSHQVPTPSPR-------PRVF-TPQSSPAMPLAPSHPS-------PYQgPRTQNISDYRAP--GPQ 870
Cdd:PHA03378 614 HIPETSAPRQWPMPLRPIPMRPLRmqpitfnVLVFpTPHQPPQVEITPYKPTwtqighiPYQ-PSPTGANTMLPIqwAPG 692
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 14149696 871 AIQPLPLSPG-VRPASSQPqllggqrVQVPNPVGFPGTWPLPGSPLPMACPGIMRPGSTSLPETPRLFPLLPLRPLGPGR 949
Cdd:PHA03378 693 TMQPPPRAPTpMRPPAAPP-------GRAQRPAAATGRARPPAAAPGRARPPAAAPGRARPPAAAPGRARPPAAAPGRAR 765
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 14149696 950 MVSHTPAPPASFPVPYLPGDP-----GAPCSS----VLPTTGILTPH--PGPQDSWKEA-------------PAPRGNLQ 1005
Cdd:PHA03378 766 PPAAAPGAPTPQPPPQAPPAPqqrprGAPTPQpppqAGPTSMQLMPRaaPGQQGPTKQIlrqlltggvkrgrPSLKKPAA 845
|
250 260 270 280 290 300 310 320
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 14149696 1006 RNKLPETFMPPAP--------ITAPVM---SLTP-ELQGILPSQPPV--SSVSHAPPGVPGE----LSLQLQHLPPEKME 1067
Cdd:PHA03378 846 LERQAAAGPTPSPgsgtsdkiVQAPVFyppVLQPiQVMRQLGSVRAAaaSTVTQAPTEYTGErrgvGPMHPTDIPPSKRA 925
|
....*....
gi 14149696 1068 RKELPPEHQ 1076
Cdd:PHA03378 926 KTDAYVESQ 934
|
|
| WD40 |
COG2319 |
WD40 repeat [General function prediction only]; |
13-153 |
6.10e-05 |
|
WD40 repeat [General function prediction only];
Pssm-ID: 441893 [Multi-domain] Cd Length: 403 Bit Score: 46.83 E-value: 6.10e-05
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 14149696 13 AWSPASQYplyLATGtsaqqldssfSTNGTLEIFevdfrdpslDLKHRGVLSALSRFHKLVWG-SFGSglleSSGVIVGG 91
Cdd:COG2319 295 AFSPDGKL---LASG----------SDDGTVRLW---------DLATGKLLRTLTGHTGAVRSvAFSP----DGKTLASG 348
|
90 100 110 120 130 140
....*....|....*....|....*....|....*....|....*....|....*....|..
gi 14149696 92 GDNGMLILYNvthiLSSGKEpvIAQKQKHTGAVRALDLNPfQGNLLASGASDSEIFIWDLNN 153
Cdd:COG2319 349 SDDGTVRLWD----LATGEL--LRTLTGHTGAVTSVAFSP-DGRTLASGSADGTVRLWDLAT 403
|
|
| PTZ00420 |
PTZ00420 |
coronin; Provisional |
80-202 |
1.11e-04 |
|
coronin; Provisional
Pssm-ID: 240412 [Multi-domain] Cd Length: 568 Bit Score: 46.48 E-value: 1.11e-04
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 14149696 80 GLLESSGVIV------GGGDNGMLILYNVTHilssgKEPVIAQKqKHTGAVRALDLNPFQGNLLASGASDSEIFIWDL-- 151
Cdd:PTZ00420 33 GIACSSGFVAvpweveGGGLIGAIRLENQMR-----KPPVIKLK-GHTSSILDLQFNPCFSEILASGSEDLTIRVWEIph 106
|
90 100 110 120 130
....*....|....*....|....*....|....*....|....*....|....*...
gi 14149696 152 NNLNV-----PMTL--GSKSQqppedIKALSWNRQAQHILSSAHPSGKAVVWDLrKNE 202
Cdd:PTZ00420 107 NDESVkeikdPQCIlkGHKKK-----ISIIDWNPMNYYIMCSSGFDSFVNIWDI-ENE 158
|
|
| WD40 |
smart00320 |
WD40 repeats; Note that these repeats are permuted with respect to the structural repeats ... |
120-150 |
2.37e-04 |
|
WD40 repeats; Note that these repeats are permuted with respect to the structural repeats (blades) of the beta propeller domain.
Pssm-ID: 197651 [Multi-domain] Cd Length: 40 Bit Score: 39.60 E-value: 2.37e-04
10 20 30
....*....|....*....|....*....|.
gi 14149696 120 HTGAVRALDLNPfQGNLLASGASDSEIFIWD 150
Cdd:smart00320 11 HTGPVTSVAFSP-DGKYLASGSDDGTIKLWD 40
|
|
| PTZ00421 |
PTZ00421 |
coronin; Provisional |
114-312 |
6.01e-04 |
|
coronin; Provisional
Pssm-ID: 173611 [Multi-domain] Cd Length: 493 Bit Score: 43.73 E-value: 6.01e-04
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 14149696 114 IAQKQKHTGAVRALDLNPFQGNLLASGASDSEIFIWDLNNLNVPMTLGSKSQQppedIKALSWNRQAQhILSSAHPSGKA 193
Cdd:PTZ00421 118 IVHLQGHTKKVGIVSFHPSAMNVLASAGADMVVNVWDVERGKAVEVIKCHSDQ----ITSLEWNLDGS-LLCTTSKDKKL 192
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 14149696 194 VVWDLRKNEPIIKVSDH-SNRMHCSGLAWHPDIATQLVLCSEDDRLpvIQLWDLRFASSPLKVLESHSRGILSVSWSQAD 272
Cdd:PTZ00421 193 NIIDPRDGTIVSSVEAHaSAKSQRCLWAKRKDLIITLGCSKSQQRQ--IMLWDTRKMASPYSTVDLDQSSALFIPFFDED 270
|
170 180 190 200 210
....*....|....*....|....*....|....*....|....*....|..
gi 14149696 273 AELLLTSAKDSQIL------------CRNLGSSEVVYKLPTQSSWCFDVQWC 312
Cdd:PTZ00421 271 TNLLYIGSKGEGNIrcfelmnerltfCSSYSSVEPHKGLCMMPKWSLDTRKC 322
|
|
| PHA03377 |
PHA03377 |
EBNA-3C; Provisional |
765-1051 |
9.65e-04 |
|
EBNA-3C; Provisional
Pssm-ID: 177614 [Multi-domain] Cd Length: 1000 Bit Score: 43.50 E-value: 9.65e-04
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 14149696 765 PRDCAQPPVQQ-LRDRLFHAQgsavLGQQSPPFPFPRIVVGATLHSKETSSYRlgsQPSHQVPTPSPR--PRVFTPQSSP 841
Cdd:PHA03377 622 PRDMAPSVVRMfLRERLLEQS----TGPKPKSFWEMRAGRDGSGIQQEPSSRR---QPATQSTPPRPSwlPSVFVLPSVD 694
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 14149696 842 A---MPLAPSHPSPYQGprTQNISDYRAP---GPQAIQPLPLSPGVRPASSQPQLLGG----QRVQVPnpvgFPGTWPLP 911
Cdd:PHA03377 695 AgraQPSEESHLSSMSP--TQPISHEEQPryeDPDDPLDLSLHPDQAPPPSHQAPYSGheepQAQQAP----YPGYWEPR 768
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 14149696 912 GSPLPMAcpGIMRPGSTSLPETPRLFPLLPLRPLGPGRMVSHTPAPPASFPVPYLPGDPGAPCSsvlpttgiltPHPGPQ 991
Cdd:PHA03377 769 PPQAPYL--GYQEPQAQGVQVSSYPGYAGPWGLRAQHPRYRHSWAYWSQYPGHGHPQGPWAPRP----------PHLPPQ 836
|
250 260 270 280 290 300
....*....|....*....|....*....|....*....|....*....|....*....|....
gi 14149696 992 dsWKEAPAPrGNLQRNKLPETFMPPAPITaPVMSLTPELQGILP----SQPPVSSVSHAPPGVP 1051
Cdd:PHA03377 837 --WDGSAGH-GQDQVSQFPHLQSETGPPR-LQLSQVPQLPYSQTlvssSAPSWSSPQPRAPIRP 896
|
|
| WD40 |
pfam00400 |
WD domain, G-beta repeat; |
120-150 |
1.00e-03 |
|
WD domain, G-beta repeat;
Pssm-ID: 459801 [Multi-domain] Cd Length: 39 Bit Score: 37.71 E-value: 1.00e-03
10 20 30
....*....|....*....|....*....|.
gi 14149696 120 HTGAVRALDLNPfQGNLLASGASDSEIFIWD 150
Cdd:pfam00400 10 HTGSVTSLAFSP-DGKLLASGSDDGTVKVWD 39
|
|
| PABP-1234 |
TIGR01628 |
polyadenylate binding protein, human types 1, 2, 3, 4 family; These eukaryotic proteins ... |
865-990 |
1.12e-03 |
|
polyadenylate binding protein, human types 1, 2, 3, 4 family; These eukaryotic proteins recognize the poly-A of mRNA and consist of four tandem RNA recognition domains at the N-terminus (rrm: pfam00076) followed by a PABP-specific domain (pfam00658) at the C-terminus. The protein is involved in the transport of mRNAs from the nucleus to the cytoplasm. There are four paralogs in Homo sapiens, which are expressed in testis, expressed in platelets, broadly expressed, and of unknown tissue range.
Pssm-ID: 130689 [Multi-domain] Cd Length: 562 Bit Score: 42.87 E-value: 1.12e-03
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 14149696 865 RAPGPQAIQPLPLSPgvrpASSQPQLLG-GQRVQVP-NPVGFPGTWPLPGSplpmacpgiMRPGSTSLPETPRLFPLLPL 942
Cdd:TIGR01628 379 QPRMRQLPMGSPMGG----AMGQPPYYGqGPQQQFNgQPLGWPRMSMMPTP---------MGPGGPLRPNGLAPMNAVRA 445
|
90 100 110 120
....*....|....*....|....*....|....*....|....*...
gi 14149696 943 RPLGPGRMvshtPAPPASFPVPYLPGDPGAPCSSVLPTTGILTPHPGP 990
Cdd:TIGR01628 446 PSRNAQNA----AQKPPMQPVMYPPNYQSLPLSQDLPQPQSTASQGGQ 489
|
|
| PspC_subgroup_2 |
NF033839 |
pneumococcal surface protein PspC, LPXTG-anchored form; The pneumococcal surface protein PspC, ... |
810-1038 |
1.18e-03 |
|
pneumococcal surface protein PspC, LPXTG-anchored form; The pneumococcal surface protein PspC, as described in Streptococcus pneumoniae, is a repetitive and highly variable protein, recognized by a conserved N-terminal domain and also by genomic location. This form, subgroup 2, is anchored covalently after cleavage by sortase at a C-terminal LPXTG site. The other form, subgroup 1, has variable numbers of a choline-binding repeat in the C-terminal region, and is also known as choline-binding protein A.
Pssm-ID: 468202 [Multi-domain] Cd Length: 557 Bit Score: 42.83 E-value: 1.18e-03
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 14149696 810 KETSSYRLGSQPSHQ--VPTPSPRPRVFTPQSSPAMPLAPSHPSPY-QGPRTQNISDYRAPGPQA-IQPLPLSPGVRPAS 885
Cdd:NF033839 290 KKPSAPKPGMQPSPQpeKKEVKPEPETPKPEVKPQLEKPKPEVKPQpEKPKPEVKPQLETPKPEVkPQPEKPKPEVKPQP 369
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 14149696 886 SQPQllggqrvqvpnpvgfPGTWPLPGSPLPMACPGIMRPGSTSLPETPRLFPLLPLRPLGPGRMVSHTPAPPAsfpvPY 965
Cdd:NF033839 370 EKPK---------------PEVKPQPETPKPEVKPQPEKPKPEVKPQPEKPKPEVKPQPEKPKPEVKPQPEKPK----PE 430
|
170 180 190 200 210 220 230
....*....|....*....|....*....|....*....|....*....|....*....|....*....|...
gi 14149696 966 LPGDPGAPCSSVLPTTGILTPHPGPQdswKEAPAPRGNLQRNKlPETFMPPAPITAPVMSLTPELQGILPSQP 1038
Cdd:NF033839 431 VKPQPEKPKPEVKPQPEKPKPEVKPQ---PETPKPEVKPQPEK-PKPEVKPQPEKPKPDNSKPQADDKKPSTP 499
|
|
| Atrophin-1 |
pfam03154 |
Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian ... |
792-1011 |
1.57e-03 |
|
Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian atrophy (DRPLA) gene. DRPLA OMIM:125370 is a progressive neurodegenerative disorder. It is caused by the expansion of a CAG repeat in the DRPLA gene on chromosome 12p. This results in an extended polyglutamine region in atrophin-1 that is thought to confer toxicity to the protein, possibly through altering its interactions with other proteins. The expansion of a CAG repeat is also the underlying defect in six other neurodegenerative disorders, including Huntington's disease. One interaction of expanded polyglutamine repeats that is thought to be pathogenic is that with the short glutamine repeat in the transcriptional coactivator CREB binding protein, CBP. This interaction draws CBP away from its usual nuclear location to the expanded polyglutamine repeat protein aggregates that are characteristic of the polyglutamine neurodegenerative disorders. This interferes with CBP-mediated transcription and causes cytotoxicity.
Pssm-ID: 460830 [Multi-domain] Cd Length: 991 Bit Score: 42.83 E-value: 1.57e-03
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 14149696 792 QSPPFPFPRIVVGA--TLHSKETSSYRLGSQPSHQVP---TPSPRPRVFTPQSSPAMPLA-------PSH---PSPYQGP 856
Cdd:pfam03154 308 QVPPGPSPAAPGQSqqRIHTPPSQSQLQSQQPPREQPlppAPLSMPHIKPPPTTPIPQLPnpqshkhPPHlsgPSPFQMN 387
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 14149696 857 RT----------QNISDYRAPG---------PQAiQPLPLSPGVRPASSQPQLLGGQRVQVPNPVGF---PGTWPLPGSP 914
Cdd:pfam03154 388 SNlppppalkplSSLSTHHPPSahppplqlmPQS-QQLPPPPAQPPVLTQSQSLPPPAASHPPTSGLhqvPSQSPFPQHP 466
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 14149696 915 LPMACPGIMRPGSTSLPETPRLFPllplrplgpgrmvshTPAPPASFPVPYLPGDPGAPcSSVLPTTGILTPHPGPQDSW 994
Cdd:pfam03154 467 FVPGGPPPITPPSGPPTSTSSAMP---------------GIQPPSSASVSSSGPVPAAV-SCPLPPVQIKEEALDEAEEP 530
|
250
....*....|....*..
gi 14149696 995 KEAPAPrgnlQRNKLPE 1011
Cdd:pfam03154 531 ESPPPP----PRSPSPE 543
|
|
| PHA03307 |
PHA03307 |
transcriptional regulator ICP4; Provisional |
819-1052 |
1.85e-03 |
|
transcriptional regulator ICP4; Provisional
Pssm-ID: 223039 [Multi-domain] Cd Length: 1352 Bit Score: 42.47 E-value: 1.85e-03
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 14149696 819 SQPSHQVPTPSPRPRvFTPQSSPAMPLAPSHPsPYQGPRTQNISDYRAPGPQaiqplPLSPGVRPASSQPQLLGGQRVQV 898
Cdd:PHA03307 73 PGPGTEAPANESRST-PTWSLSTLAPASPARE-GSPTPPGPSSPDPPPPTPP-----PASPPPSPAPDLSEMLRPVGSPG 145
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 14149696 899 PNPVGFPGTWPLPGSPLPMA---CPGIMRPgSTSLPETPRLFPLLPLRPLGPGRMVSHTPAPPAsfpvPYLPGDPGAPCS 975
Cdd:PHA03307 146 PPPAASPPAAGASPAAVASDaasSRQAALP-LSSPEETARAPSSPPAEPPPSTPPAAASPRPPR----RSSPISASASSP 220
|
170 180 190 200 210 220 230
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*...
gi 14149696 976 SVLPT-TGILTPHPGPQDSWKEAPAPRGNLQRNKLPETfmPPAPITAPVMSLTPElQGILPSQPPVSSVSHAPPGVPG 1052
Cdd:PHA03307 221 APAPGrSAADDAGASSSDSSSSESSGCGWGPENECPLP--RPAPITLPTRIWEAS-GWNGPSSRPGPASSSSSPRERS 295
|
|
| PHA03378 |
PHA03378 |
EBNA-3B; Provisional |
867-1076 |
1.94e-03 |
|
EBNA-3B; Provisional
Pssm-ID: 223065 [Multi-domain] Cd Length: 991 Bit Score: 42.36 E-value: 1.94e-03
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 14149696 867 PGPQAIQPLPLSPGVRPASSQPQLlggqrVQVPNPVGFPGT-------------------WPLPGSPLPMAcPGIMRPGS 927
Cdd:PHA03378 569 LGPLQIQPLTSPTTSQLASSAPSY-----AQTPWPVPHPSQtpeppttqshipetsaprqWPMPLRPIPMR-PLRMQPIT 642
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 14149696 928 TSL---PETPRLFPLLPLRPLGPGRMVSHTPAPP------ASFPVPYLPGD---PGAPCSSVLPTTGILTPHPGPQDSWK 995
Cdd:PHA03378 643 FNVlvfPTPHQPPQVEITPYKPTWTQIGHIPYQPsptganTMLPIQWAPGTmqpPPRAPTPMRPPAAPPGRAQRPAAATG 722
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 14149696 996 EAPAPRGNLQRNKLPETfmppAPITAPVMSLTPElqgilPSQPPVSSVSHAPP--GVPGELSLQLQ-HLPPEKMERKELP 1072
Cdd:PHA03378 723 RARPPAAAPGRARPPAA----APGRARPPAAAPG-----RARPPAAAPGRARPpaAAPGAPTPQPPpQAPPAPQQRPRGA 793
|
....
gi 14149696 1073 PEHQ 1076
Cdd:PHA03378 794 PTPQ 797
|
|
| WD40 |
cd00200 |
WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions ... |
252-335 |
2.09e-03 |
|
WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment.
Pssm-ID: 238121 [Multi-domain] Cd Length: 289 Bit Score: 41.55 E-value: 2.09e-03
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 14149696 252 PLKVLESHSRGILSVSWSqADAELLLTSAKDSQILCRNLGSSEVVYKLPTQSSWCFDVQWCPRDPSVFSaASFNGWISLY 331
Cdd:cd00200 1 LRRTLKGHTGGVTCVAFS-PDGKLLATGSGDGTIKVWDLETGELLRTLKGHTGPVRDVAASADGTYLAS-GSSDKTIRLW 78
|
....
gi 14149696 332 SVMG 335
Cdd:cd00200 79 DLET 82
|
|
| half-pint |
TIGR01645 |
poly-U binding splicing factor, half-pint family; The proteins represented by this model ... |
829-1070 |
3.00e-03 |
|
poly-U binding splicing factor, half-pint family; The proteins represented by this model contain three RNA recognition motifs (rrm: pfam00076) and have been characterized as poly-pyrimidine tract binding proteins associated with RNA splicing factors. PUF60 (GP|6176532), in complex with p54 and in the presence of U2AF, facilitates association of U2 snRNP with pre-mRNA.
Pssm-ID: 130706 [Multi-domain] Cd Length: 612 Bit Score: 41.59 E-value: 3.00e-03
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 14149696 829 SPRPRVFTPQSSPAMPLAPSHPSpyqGPRTQNISDYRAPGPQAIQPLPLSPGVRPASSQPQLLGGQRVqvpnpVGFPGTW 908
Cdd:TIGR01645 283 TPPDALLQPATVSAIPAAAAVAA---AAATAKIMAAEAVAGAAVLGPRAQSPATPSSSLPTDIGNKAV-----VSSAKKE 354
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 14149696 909 PLPGSPLPMACPGIMRPGSTSLPetprlfpllplrplgpgrmvshTPAPPASFPVPYLPGDPGApcssVLPTtgilTPHP 988
Cdd:TIGR01645 355 AEEVPPLPQAAPAVVKPGPMEIP----------------------TPVPPPGLAIPSLVAPPGL----VAPT----EINP 404
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 14149696 989 GPQDSwkeapaPRGNLQRNKLPETFMPpapitapvmsLTPELQGILPSQPPVSSVSHAPPGVPGELSLQLQHLPPEKMER 1068
Cdd:TIGR01645 405 SFLAS------PRKKMKREKLPVTFGA----------LDDTLAWKEPSKEDQTSEDGKMLAIMGEAAAALALEPKKKKKE 468
|
..
gi 14149696 1069 KE 1070
Cdd:TIGR01645 469 KE 470
|
|
| PHA03247 |
PHA03247 |
large tegument protein UL36; Provisional |
758-1049 |
3.28e-03 |
|
large tegument protein UL36; Provisional
Pssm-ID: 223021 [Multi-domain] Cd Length: 3151 Bit Score: 41.85 E-value: 3.28e-03
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 14149696 758 ATAMSF-LPRDCAQPPVQQLRDRLFHAQGSAVLGQQSPPFPFPRIVVGATLHSKETssyrLGSQPSHQVPTPSPRPRVFT 836
Cdd:PHA03247 198 AGAMVFfVPSGPGPAAPADLTAAALHLYGASETYLQDEPFVERRVVISHPLRGDIA----APAPPPVVGEGADRAPETAR 273
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 14149696 837 PQSSPAMPLAPSHPSPYQGPRTQNISDYRAPGPQAIQPLPLSPGVRPASSQPQL---LGGQRVQVP-------NPVGFPG 906
Cdd:PHA03247 274 GATGPPPPPEAAAPNGAAAPPDGVWGAALAGAPLALPAPPDPPPPAPAGDAEEEddeDGAMEVVSPlprprqhYPLGFPK 353
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 14149696 907 ----TWPlPGSPLPMACPGIMRPGSTSLPETPRLFPLLPLRPLGPGRMVSHTPAP--PASFPVPyLPGDPGAPCSSVLPT 980
Cdd:PHA03247 354 rrrpTWT-PPSSLEDLSAGRHHPKRASLPTRKRRSARHAATPFARGPGGDDQTRPaaPVPASVP-TPAPTPVPASAPPPP 431
|
250 260 270 280 290 300
....*....|....*....|....*....|....*....|....*....|....*....|....*....
gi 14149696 981 TgilTPHPGPQDSWKEAPAPRGNLQrnklPETFMPPAPITAPVMSLTPELQGILPSQPPvssvshAPPG 1049
Cdd:PHA03247 432 A---TPLPSAEPGSDDGPAPPPERQ----PPAPATEPAPDDPDDATRKALDALRERRPP------EPPG 487
|
|
| PHA03379 |
PHA03379 |
EBNA-3A; Provisional |
804-1081 |
7.25e-03 |
|
EBNA-3A; Provisional
Pssm-ID: 223066 [Multi-domain] Cd Length: 935 Bit Score: 40.43 E-value: 7.25e-03
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 14149696 804 GATLHSKET---SSYRLGSQPSHQVPTPSPRPRVFTPQSSPAMPLAPSHPSPYQGPRTQNISDYRAPGPQAIQPLPLSPg 880
Cdd:PHA03379 396 KLTERAREAlekASEPTYGTPRPPVEKPRPEVPQSLETATSHGSAQVPEPPPVHDLEPGPLHDQHSMAPCPVAQLPPGP- 474
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 14149696 881 VRPASSQPQLLGgqrvQVPNPVGFPGTWPLPGSPlpmacpgIMRPGSTSLPETPRLFPLLplrplgpgrmVSHTPAPPAS 960
Cdd:PHA03379 475 LQDLEPGDQLPG----VVQDGRPACAPVPAPAGP-------IVRPWEASLSQVPGVAFAP----------VMPQPMPVEP 533
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 14149696 961 FPVPYLPGD-PGAPCSSVLPTTGILTPHPGPQDSWKEAPAPrgnlqrnklpetfMPPAPITAPV-MSLTPELQGILP-SQ 1037
Cdd:PHA03379 534 VPVPTVALErPVCPAPPLIAMQGPGETSGIVRVRERWRPAP-------------WTPNPPRSPSqMSVRDRLARLRAeAQ 600
|
250 260 270 280
....*....|....*....|....*....|....*....|....
gi 14149696 1038 PPVSSVSHAPPgvpgelslQLQHLPPEKMERKELPPEHQSLKSS 1081
Cdd:PHA03379 601 PYQASVEVQPP--------QLTQVSPQQPMEYPLEPEQQMFPGS 636
|
|
| PBP1 |
COG5180 |
PAB1-binding protein, interacts with poly(A)-binding protein [RNA processing and modification]; ... |
736-1074 |
8.58e-03 |
|
PAB1-binding protein, interacts with poly(A)-binding protein [RNA processing and modification];
Pssm-ID: 444064 [Multi-domain] Cd Length: 548 Bit Score: 40.05 E-value: 8.58e-03
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 14149696 736 PGPATTYRVTQYANLLAAQGS--------LATAMSFLPRDCAQPPVQQLRDRLFHAQGSavLGQQSPPFPFPRIVVGATL 807
Cdd:COG5180 133 PKAKVTREATSASAGVALAAAllqrsdpiLAKDPDGDSASTLPPPAEKLDKVLTEPRDA--LKDSPEKLDRPKVEVKDEA 210
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 14149696 808 HSKETSSYRLGSQPSHQVPTPSPRPRVFTPQSSPAMPLAPSHPSPyqGPRTQNISDYRAPGPQAIQPLPLSPGVRPASSQ 887
Cdd:COG5180 211 QEEPPDLTGGADHPRPEAASSPKVDPPSTSEARSRPATVDAQPEM--RPPADAKERRRAAIGDTPAAEPPGLPVLEAGSE 288
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 14149696 888 PQLlGGQRVQVPNPVGFPGTWPLPGSPLPMACPGIMRPGSTSLPEtprLFPLLPLRPLGPGRMVSHTPA--PPASFPVPY 965
Cdd:COG5180 289 PQS-DAPEAETARPIDVKGVASAPPATRPVRPPGGARDPGTPRPG---QPTERPAGVPEAASDAGQPPSayPPAEEAVPG 364
|
250 260 270 280 290 300 310 320
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 14149696 966 LPGDPGAP--CSSVLPTTGILTPHPGPQDSWKEAPAPRGNLQRNKLPEtFMPPAPITAPVMSLTPELQGILPSQPPVSSV 1043
Cdd:COG5180 365 KPLEQGAPrpGSSGGDGAPFQPPNGAPQPGLGRRGAPGPPMGAGDLVQ-AALDGGGRETASLGGAAGGAGQGPKADFVPG 443
|
330 340 350
....*....|....*....|....*....|.
gi 14149696 1044 SHAPPGVPGELSLQLQHLPPEKMERKELPPE 1074
Cdd:COG5180 444 DAESVSGPAGLADQAGAAASTAMADFVAPVT 474
|
|
| PAT1 |
pfam09770 |
Topoisomerase II-associated protein PAT1; Members of this family are necessary for accurate ... |
704-901 |
8.74e-03 |
|
Topoisomerase II-associated protein PAT1; Members of this family are necessary for accurate chromosome transmission during cell division.
Pssm-ID: 401645 [Multi-domain] Cd Length: 846 Bit Score: 40.40 E-value: 8.74e-03
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 14149696 704 QALSPMALQDLMEKVMVLNRSLEQLRGPHGVSPGPATTYRVTQYANLLAAQGSLATAMSFLPRDCAQPPVQQLRDRLFHA 783
Cdd:pfam09770 179 PAAQPASLPAPSRKMMSLEEVEAAMRAQAKKPAQQPAPAPAQPPAAPPAQQAQQQQQFPPQIQQQQQPQQQPQQPQQHPG 258
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 14149696 784 QGSAVLGQQSPPFPFPrivvgatlhsketssyrlgsqpshQVPTPSPRPR-VFTPQSSPAMPLAPSHpsPYQGPRTQniS 862
Cdd:pfam09770 259 QGHPVTILQRPQSPQP------------------------DPAQPSIQPQaQQFHQQPPPVPVQPTQ--ILQNPNRL--S 310
|
170 180 190
....*....|....*....|....*....|....*....
gi 14149696 863 DYRAPGPQAIQPLPLSPGVRPASSQPQLLGGQRVQVPNP 901
Cdd:pfam09770 311 AARVGYPQNPQPGVQPAPAHQAHRQQGSFGRQAPIITHP 349
|
|
|