Conserved domains on  [gi|1318663272|ref|NP_001346252|]

protein transport protein Sec31A isoform a [Mus musculus]


List of domain hits

Name Accession Description Interval E-value
WD40 super family cl29593
WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions ...
13-332 8.70e-25

WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel beta-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment.


The actual alignment was detected with superfamily member cd00200:

Pssm-ID: 475233 [Multi-domain]  Cd Length: 289  Bit Score: 105.88  E-value: 8.70e-25
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1318663272   13 AWSPAQNhpiYLATGtsaqqldatfSTNASLEIFELD-------LSDPSLDMKSCATFSSSHRyhkliwgphkmdskgdv 85
Cdd:cd00200     16 AFSPDGK---LLATG----------SGDGTIKVWDLEtgellrtLKGHTGPVRDVAASADGTY----------------- 65
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1318663272   86 sgvLIAGGENGNIILYDPSKiiagdKEVVIAQKDkHTGPVRALDvniFQTN--LVASGANESEIYIWDLNNFATPMTPGA 163
Cdd:cd00200     66 ---LASGSSDKTIRLWDLET-----GECVRTLTG-HTSYVSSVA---FSPDgrILSSSSRDKTIKVWDVETGKCLTTLRG 133
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1318663272  164 KTQPpedISCIAWNrQVQHILASASPSGRATVWDLRKNEPIIKVSDHSNRMHCsgLAWHPDvATQMVLASEDDrlpVIQM 243
Cdd:cd00200    134 HTDW---VNSVAFS-PDGTFVASSSQDGTIKLWDLRTGKCVATLTGHTGEVNS--VAFSPD-GEKLLSSSSDG---TIKL 203
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1318663272  244 WDLRfASSPLRVLENHARGILAVAWSmADPELLLSCGKDAKILCSNPNTGEVLYELPTNTQWCFDIQWCPRNPAVLSaAS 323
Cdd:cd00200    204 WDLS-TGKCLGTLRGHENGVNSVAFS-PDGYLLASGSEDGTIRVWDLRTGECVQTLSGHTNSVTSLAWSPDGKRLAS-GS 280

                   ....*....
gi 1318663272  324 FDGRISVYS 332
Cdd:cd00200    281 ADGTIRIWD 289
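
Each hit in this listing carries a one-line summary (Pssm-ID, Cd Length, Bit Score, E-value). The following is a minimal Python sketch, not an NCBI tool, that parses such a line and estimates how much of the domain model the alignment spans. The names `parse_summary` and `model_coverage` are hypothetical, and the regular expression assumes the exact field layout seen on this page.

```python
import re

# Pattern for the per-hit summary line as it appears in this listing.
# The "[Multi-domain]" flag is optional (some hits omit it).
SUMMARY_RE = re.compile(
    r"Pssm-ID:\s*(?P<pssm_id>\d+)"
    r"(?:\s*\[(?P<flag>[^\]]+)\])?"
    r"\s+Cd Length:\s*(?P<cd_length>\d+)"
    r"\s+Bit Score:\s*(?P<bit_score>[\d.]+)"
    r"\s+E-value:\s*(?P<evalue>[\d.]+(?:e[+-]?\d+)?)",
    re.IGNORECASE,
)


def parse_summary(line):
    """Return the summary fields as a dict, or None if the line does not match."""
    m = SUMMARY_RE.search(line)
    if m is None:
        return None
    return {
        "pssm_id": int(m["pssm_id"]),
        "multi_domain": m["flag"] == "Multi-domain",
        "cd_length": int(m["cd_length"]),
        "bit_score": float(m["bit_score"]),
        "evalue": float(m["evalue"]),
    }


def model_coverage(model_start, model_end, cd_length):
    """Fraction of the domain model spanned by the alignment (1-based, inclusive)."""
    return (model_end - model_start + 1) / cd_length


hit = parse_summary(
    "Pssm-ID: 475233 [Multi-domain]  Cd Length: 289  Bit Score: 105.88  E-value: 8.70e-25"
)
# The cd00200 rows above run from model position 16 to 289.
coverage = model_coverage(16, 289, hit["cd_length"])
print(hit, round(coverage, 3))
```

For the WD40 hit above, the alignment spans model positions 16-289 of a 289-residue model, which is roughly 95% of the model.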
PHA03247 super family cl33720
large tegument protein UL36; Provisional
750-1104 1.25e-12

large tegument protein UL36; Provisional


The actual alignment was detected with superfamily member PHA03247:

Pssm-ID: 223021 [Multi-domain]  Cd Length: 3151  Bit Score: 73.05  E-value: 1.25e-12
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1318663272  750 SQYASLLAAQGSIAAALAFLPDNTNQPNIVQLRDRlCKAQGKPVSGQESSQSPyeRQPLSKGRPGPVAGHSQMPRvqtqq 829
Cdd:PHA03247  2632 SPAANEPDPHPPPTVPPPERPRDDPAPGRVSRPRR-ARRLGRAAQASSPPQRP--RRRAARPTVGSLTSLADPPP----- 2703
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1318663272  830 yyPHVRIAPTVTTWSNKTPTAL-PSHPPAASPSDTQGENPPPP--GFIMQGNVIPNPAAPLPTAPGH-MPSQLPPYPQPQ 905
Cdd:PHA03247  2704 --PPPTPEPAPHALVSATPLPPgPAAARQASPALPAAPAPPAVpaGPATPGGPARPARPPTTAGPPApAPPAAPAAGPPR 2781
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1318663272  906 PYQPAQQYSFGTGGAAAYRPQQPVAPPASNAYPNTPYISPVASYSGQPQMYTAQQASSPTSSSAASfPPPSSGASFQHGG 985
Cdd:PHA03247  2782 RLTRPAVASLSESRESLPSPWDPADPPAAVLAPAAALPPAASPAGPLPPPTSAQPTAPPPPPGPPP-PSLPLGGSVAPGG 2860
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1318663272  986 PGA--PPSSSAYALPPGTTGPQNGWNDPPALNRVPKKKKMPENFM--PPVPITSPIMNPSGDPQSQGLQQQPSTPGPLSS 1061
Cdd:PHA03247  2861 DVRrrPPSRSPAAKPAAPARPPVRRLARPAVSRSTESFALPPDQPerPPQPQAPPPPQPQPQPPPPPQPQPPPPPPPRPQ 2940
                          330       340       350       360
                   ....*....|....*....|....*....|....*....|....
gi 1318663272 1062 HASFPQQHLAG-GQPFHGVQQPLAQTGMPPSFSKPNTEGAPGAP 1104
Cdd:PHA03247  2941 PPLAPTTDPAGaGEPSGAVPQPWLGALVPGRVAVPRFRVPQPAP 2984
ACE1-Sec16-like super family cl14807
Ancestral coatomer element 1 (ACE1) of COPII coat complex assembly protein Sec16; COPII coat ...
572-766 1.79e-09

Ancestral coatomer element 1 (ACE1) of COPII coat complex assembly protein Sec16; the COPII coat complex plays an important role in vesicular traffic of newly synthesized proteins from the endoplasmic reticulum (ER) to the Golgi apparatus by mediating the formation of transport vesicles. COPII consists of an outer coat, made up of the scaffold proteins Sec31 and Sec13, and the cargo adaptor complex, Sec23 and Sec24, which are recruited by the small GTPase Sar1. Sec16 is involved in the early steps of the assembly process. Sec16 forms elongated heterotetramers with Sec13, Sec13-(Sec16)2-Sec13. It interacts with Sec13 by insertion of a single beta-blade to close the six-bladed beta propeller of Sec13. Sec13 interacts in the same way with Sec31 and with Nup145C, a nuclear pore protein; all of these contain a structurally related ancestral coatomer element 1 (ACE1). Sec16 is believed to be a key component in maintaining the integrity of the ER exit site.


The actual alignment was detected with superfamily member cd09233:

Pssm-ID: 449359 [Multi-domain]  Cd Length: 314  Bit Score: 60.73  E-value: 1.79e-09
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1318663272  572 ITRALLTGNFESAVDLCLhDNRM-ADAIILAIAGGQELLAQTQKKyFAKSQSKIT---RLITAVVMKNWREIVESC---- 643
Cdd:cd09233     69 FRNLLLTGNRKEALELAL-DNGLwAHALLLASSLGKETWAEVVSR-FARSESKLNdplQTLYQLFSGNSPEAITELadnp 146
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1318663272  644 -----DLKNWREALAAVLTYAKPD-EFSALCdLLGTRLEREGDSLlrtQACLCYICAGnverlvacwtkAQDGSSPLS-- 715
Cdd:cd09233    147 aeaewALGNWREHLAIILSNRTSNlDLEALV-ELGDLLAQRGLVE---AAHICYLLAG-----------VPLGPYPSSps 211
                          170       180       190       200       210       220
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*.
gi 1318663272  716 -----LQDLIEKVVILR--KAVQLT------QALDTNTVG--ALLAEKMsQYASLLAAQGSIAAAL 766
Cdd:cd09233    212 scllgGAVHNKSPRTFAtpEAIQLTeiyeyaLSLGNPQFGlpHLQPYKL-IHAARLAELGLVSEAL 276
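
The Interval column gives each hit's footprint on the query, which is read here as 1-based and inclusive. Under that assumption, a short Python sketch can check whether two footprints share residues; the helper name `overlap` and the `hits` dictionary are illustrative only, with the intervals copied from the three hits listed above.

```python
def overlap(a, b):
    """Number of residues shared by two 1-based, inclusive (start, end) intervals."""
    start = max(a[0], b[0])
    end = min(a[1], b[1])
    return max(0, end - start + 1)


# Query intervals copied from the concise hit list above.
hits = {
    "WD40 super family": (13, 332),
    "PHA03247 super family": (750, 1104),
    "ACE1-Sec16-like super family": (572, 766),
}

names = list(hits)
for i, first in enumerate(names):
    for second in names[i + 1:]:
        shared = overlap(hits[first], hits[second])
        if shared:
            print(f"{first} / {second}: {shared} shared residues")
```

With these intervals, only the ACE1-Sec16-like and PHA03247 footprints overlap, sharing residues 750-766 of the query.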
 
Name Accession Description Interval E-value
WD40 cd00200
WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions ...
13-332 8.70e-25

WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel beta-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment.


Pssm-ID: 238121 [Multi-domain]  Cd Length: 289  Bit Score: 105.88  E-value: 8.70e-25
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1318663272   13 AWSPAQNhpiYLATGtsaqqldatfSTNASLEIFELD-------LSDPSLDMKSCATFSSSHRyhkliwgphkmdskgdv 85
Cdd:cd00200     16 AFSPDGK---LLATG----------SGDGTIKVWDLEtgellrtLKGHTGPVRDVAASADGTY----------------- 65
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1318663272   86 sgvLIAGGENGNIILYDPSKiiagdKEVVIAQKDkHTGPVRALDvniFQTN--LVASGANESEIYIWDLNNFATPMTPGA 163
Cdd:cd00200     66 ---LASGSSDKTIRLWDLET-----GECVRTLTG-HTSYVSSVA---FSPDgrILSSSSRDKTIKVWDVETGKCLTTLRG 133
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1318663272  164 KTQPpedISCIAWNrQVQHILASASPSGRATVWDLRKNEPIIKVSDHSNRMHCsgLAWHPDvATQMVLASEDDrlpVIQM 243
Cdd:cd00200    134 HTDW---VNSVAFS-PDGTFVASSSQDGTIKLWDLRTGKCVATLTGHTGEVNS--VAFSPD-GEKLLSSSSDG---TIKL 203
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1318663272  244 WDLRfASSPLRVLENHARGILAVAWSmADPELLLSCGKDAKILCSNPNTGEVLYELPTNTQWCFDIQWCPRNPAVLSaAS 323
Cdd:cd00200    204 WDLS-TGKCLGTLRGHENGVNSVAFS-PDGYLLASGSEDGTIRVWDLRTGECVQTLSGHTNSVTSLAWSPDGKRLAS-GS 280

                   ....*....
gi 1318663272  324 FDGRISVYS 332
Cdd:cd00200    281 ADGTIRIWD 289
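
The paired rows above can be compared column by column. The sketch below counts identical residues, assuming that uppercase letters mark aligned columns while lowercase letters and dashes mark unaligned residues and gaps. That reading of the display is an assumption here, and `aligned_identity` is a hypothetical helper rather than part of any NCBI software.

```python
def aligned_identity(query_row, model_row):
    """Return (identical, aligned) column counts for one pair of alignment rows.

    A column counts as aligned only when both rows hold an uppercase residue;
    lowercase letters and '-' are skipped.
    """
    if len(query_row) != len(model_row):
        raise ValueError("alignment rows must have equal length")
    aligned = 0
    identical = 0
    for q, m in zip(query_row, model_row):
        if q.isupper() and m.isupper():
            aligned += 1
            if q == m:
                identical += 1
    return identical, aligned


# Final rows of the cd00200 alignment above (query residues 324-332).
identical, aligned = aligned_identity("FDGRISVYS", "ADGTIRIWD")
print(f"{identical} identical of {aligned} aligned columns")
```

Summing the two counts over every row pair of a hit gives an overall identity fraction for that alignment.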
WD40 COG2319
WD40 repeat [General function prediction only];
89-333 1.40e-23

WD40 repeat [General function prediction only];


Pssm-ID: 441893 [Multi-domain]  Cd Length: 403  Bit Score: 104.99  E-value: 1.40e-23
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1318663272   89 LIAGGENGNIILYDpskiIAGDKEvvIAQKDKHTGPVRALDVNiFQTNLVASGANESEIYIWDLNNFATPMTPGAKTQPp 168
Cdd:COG2319    177 LASGSDDGTVRLWD----LATGKL--LRTLTGHTGAVRSVAFS-PDGKLLASGSADGTVRLWDLATGKLLRTLTGHSGS- 248
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1318663272  169 edISCIAWNRQVQHiLASASPSGRATVWDLRKNEPIIKVSDHSNRMHcsGLAWHPDvATQMVLASEDDRlpvIQMWDLRf 248
Cdd:COG2319    249 --VRSVAFSPDGRL-LASGSADGTVRLWDLATGELLRTLTGHSGGVN--SVAFSPD-GKLLASGSDDGT---VRLWDLA- 318
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1318663272  249 ASSPLRVLENHARGILAVAWSmADPELLLSCGKDAKILCSNPNTGEVLYELPTNTQWCFDIQWCPRNPAVLSaASFDGRI 328
Cdd:COG2319    319 TGKLLRTLTGHTGAVRSVAFS-PDGKTLASGSDDGTVRLWDLATGELLRTLTGHTGAVTSVAFSPDGRTLAS-GSADGTV 396

                   ....*
gi 1318663272  329 SVYSI 333
Cdd:COG2319    397 RLWDL 401
PHA03247 PHA03247
large tegument protein UL36; Provisional
750-1104 1.25e-12

large tegument protein UL36; Provisional


Pssm-ID: 223021 [Multi-domain]  Cd Length: 3151  Bit Score: 73.05  E-value: 1.25e-12
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1318663272  750 SQYASLLAAQGSIAAALAFLPDNTNQPNIVQLRDRlCKAQGKPVSGQESSQSPyeRQPLSKGRPGPVAGHSQMPRvqtqq 829
Cdd:PHA03247  2632 SPAANEPDPHPPPTVPPPERPRDDPAPGRVSRPRR-ARRLGRAAQASSPPQRP--RRRAARPTVGSLTSLADPPP----- 2703
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1318663272  830 yyPHVRIAPTVTTWSNKTPTAL-PSHPPAASPSDTQGENPPPP--GFIMQGNVIPNPAAPLPTAPGH-MPSQLPPYPQPQ 905
Cdd:PHA03247  2704 --PPPTPEPAPHALVSATPLPPgPAAARQASPALPAAPAPPAVpaGPATPGGPARPARPPTTAGPPApAPPAAPAAGPPR 2781
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1318663272  906 PYQPAQQYSFGTGGAAAYRPQQPVAPPASNAYPNTPYISPVASYSGQPQMYTAQQASSPTSSSAASfPPPSSGASFQHGG 985
Cdd:PHA03247  2782 RLTRPAVASLSESRESLPSPWDPADPPAAVLAPAAALPPAASPAGPLPPPTSAQPTAPPPPPGPPP-PSLPLGGSVAPGG 2860
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1318663272  986 PGA--PPSSSAYALPPGTTGPQNGWNDPPALNRVPKKKKMPENFM--PPVPITSPIMNPSGDPQSQGLQQQPSTPGPLSS 1061
Cdd:PHA03247  2861 DVRrrPPSRSPAAKPAAPARPPVRRLARPAVSRSTESFALPPDQPerPPQPQAPPPPQPQPQPPPPPQPQPPPPPPPRPQ 2940
                          330       340       350       360
                   ....*....|....*....|....*....|....*....|....
gi 1318663272 1062 HASFPQQHLAG-GQPFHGVQQPLAQTGMPPSFSKPNTEGAPGAP 1104
Cdd:PHA03247  2941 PPLAPTTDPAGaGEPSGAVPQPWLGALVPGRVAVPRFRVPQPAP 2984
Atrophin-1 pfam03154
Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian ...
775-1111 1.68e-10

Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian atrophy (DRPLA) gene. DRPLA (OMIM:125370) is a progressive neurodegenerative disorder. It is caused by the expansion of a CAG repeat in the DRPLA gene on chromosome 12p. This results in an extended polyglutamine region in atrophin-1 that is thought to confer toxicity to the protein, possibly through altering its interactions with other proteins. The expansion of a CAG repeat is also the underlying defect in six other neurodegenerative disorders, including Huntington's disease. One interaction of expanded polyglutamine repeats that is thought to be pathogenic is that with the short glutamine repeat in the transcriptional coactivator CREB binding protein, CBP. This interaction draws CBP away from its usual nuclear location to the expanded polyglutamine repeat protein aggregates that are characteristic of the polyglutamine neurodegenerative disorders. This interferes with CBP-mediated transcription and causes cytotoxicity.


Pssm-ID: 460830 [Multi-domain]  Cd Length: 991  Bit Score: 65.56  E-value: 1.68e-10
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1318663272  775 QPNIVQLRDRLCKAQGKPVSGQESSQSPYERQPLSKGRPGPVAGHSQMPRVQTQQYYPHVRIAPTVTTWSNKTPT---AL 851
Cdd:pfam03154  170 QPPVLQAQSGAASPPSPPPPGTTQAATAGPTPSAPSVPPQGSPATSQPPNQTQSTAAPHTLIQQTPTLHPQRLPSphpPL 249
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1318663272  852 PSHPPAASPSDTQGENPPPPGFIMQGNVIPNPaapLPTAPGHMP----------------SQLPPYPQPQPYQPAQQYSF 915
Cdd:pfam03154  250 QPMTQPPPPSQVSPQPLPQPSLHGQMPPMPHS---LQTGPSHMQhpvppqpfpltpqssqSQVPPGPSPAAPGQSQQRIH 326
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1318663272  916 GTGGAAAYRPQQPVA----PPASNAYPNT--PYISPVASY-SGQPQMYTAQQASSPTSSSAASFPPPS-----SGASFQH 983
Cdd:pfam03154  327 TPPSQSQLQSQQPPReqplPPAPLSMPHIkpPPTTPIPQLpNPQSHKHPPHLSGPSPFQMNSNLPPPPalkplSSLSTHH 406
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1318663272  984 GGPGAPPS----------SSAYALPPGTTGPQN----GWNDPP--ALNRVPKKKKMPEN-FMP--PVPITSPIMNPSGDP 1044
Cdd:pfam03154  407 PPSAHPPPlqlmpqsqqlPPPPAQPPVLTQSQSlpppAASHPPtsGLHQVPSQSPFPQHpFVPggPPPITPPSGPPTSTS 486
                          330       340       350       360       370       380
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*..
gi 1318663272 1045 QSQGLQQQPSTpGPLSSHASFPQQHLAGGQPFHGVQQPLAQTGMPPSFSKPNTEGAPGAPIGNTIQH 1111
Cdd:pfam03154  487 SAMPGIQPPSS-ASVSSSGPVPAAVSCPLPPVQIKEEALDEAEEPESPPPPPRSPSPEPTVVNTPSH 552
ACE1-Sec16-like cd09233
Ancestral coatomer element 1 (ACE1) of COPII coat complex assembly protein Sec16; COPII coat ...
572-766 1.79e-09

Ancestral coatomer element 1 (ACE1) of COPII coat complex assembly protein Sec16; the COPII coat complex plays an important role in vesicular traffic of newly synthesized proteins from the endoplasmic reticulum (ER) to the Golgi apparatus by mediating the formation of transport vesicles. COPII consists of an outer coat, made up of the scaffold proteins Sec31 and Sec13, and the cargo adaptor complex, Sec23 and Sec24, which are recruited by the small GTPase Sar1. Sec16 is involved in the early steps of the assembly process. Sec16 forms elongated heterotetramers with Sec13, Sec13-(Sec16)2-Sec13. It interacts with Sec13 by insertion of a single beta-blade to close the six-bladed beta propeller of Sec13. Sec13 interacts in the same way with Sec31 and with Nup145C, a nuclear pore protein; all of these contain a structurally related ancestral coatomer element 1 (ACE1). Sec16 is believed to be a key component in maintaining the integrity of the ER exit site.


Pssm-ID: 187750 [Multi-domain]  Cd Length: 314  Bit Score: 60.73  E-value: 1.79e-09
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1318663272  572 ITRALLTGNFESAVDLCLhDNRM-ADAIILAIAGGQELLAQTQKKyFAKSQSKIT---RLITAVVMKNWREIVESC---- 643
Cdd:cd09233     69 FRNLLLTGNRKEALELAL-DNGLwAHALLLASSLGKETWAEVVSR-FARSESKLNdplQTLYQLFSGNSPEAITELadnp 146
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1318663272  644 -----DLKNWREALAAVLTYAKPD-EFSALCdLLGTRLEREGDSLlrtQACLCYICAGnverlvacwtkAQDGSSPLS-- 715
Cdd:cd09233    147 aeaewALGNWREHLAIILSNRTSNlDLEALV-ELGDLLAQRGLVE---AAHICYLLAG-----------VPLGPYPSSps 211
                          170       180       190       200       210       220
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*.
gi 1318663272  716 -----LQDLIEKVVILR--KAVQLT------QALDTNTVG--ALLAEKMsQYASLLAAQGSIAAAL 766
Cdd:cd09233    212 scllgGAVHNKSPRTFAtpEAIQLTeiyeyaLSLGNPQFGlpHLQPYKL-IHAARLAELGLVSEAL 276
Sec16_C pfam12931
Sec23-binding domain of Sec16; Sec16 is a multi-domain vesicle coat protein. The C-terminal ...
572-766 3.43e-07

Sec23-binding domain of Sec16; Sec16 is a multi-domain vesicle coat protein. The C-terminal region is the part that binds to Sec23, a COPII vesicle coat protein. This association is part of the transport vesicle coat structure.


Pssm-ID: 432884  Cd Length: 279  Bit Score: 53.33  E-value: 3.43e-07
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1318663272  572 ITRALLTGNFESAVDLCLhDNRM-ADAIILAIAGGQELLAQTQKKY----FAKSQSKITRLItAVVMK----NWREIVE- 641
Cdd:pfam12931    1 IRALLLTGDREKALWLAL-DKKLwAHALLIASTLGKEKWKEVVQEFvrseFKGSNNKSGESL-AALYQvfagNSEEAVDe 78
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1318663272  642 --------SCDLKNWREALAAVLTYAKPDEFSALCDlLGTRLEREGdslLRTQACLCYICAgNVERLVACWTKAQDGSSP 713
Cdd:pfam12931   79 lvppsknaLWALDNWRETLALVLSNRSPGDVEALLA-LGDLLAQYG---RTEAAHICFLLA-GLPLSQTVLLGADHVRFP 153
                          170       180       190       200       210       220
                   ....*....|....*....|....*....|....*....|....*....|....*....|....
gi 1318663272  714 LSLQDLIEkvvilrkAVQLTQ----ALDTNTVGA-------LLAEKMsQYASLLAAQGSIAAAL 766
Cdd:pfam12931  154 STFGNDLE-------SILLTEiyeyALSLSPPQPpfvglphLLPYKL-QHAAVLAEYGLVSEAQ 209
PABP-1234 TIGR01628
polyadenylate binding protein, human types 1, 2, 3, 4 family; These eukaryotic proteins ...
925-1057 3.33e-04

polyadenylate binding protein, human types 1, 2, 3, 4 family; These eukaryotic proteins recognize the poly-A of mRNA and consist of four tandem RNA recognition domains at the N-terminus (rrm: pfam00076) followed by a PABP-specific domain (pfam00658) at the C-terminus. The protein is involved in the transport of mRNAs from the nucleus to the cytoplasm. There are four paralogs in Homo sapiens: one expressed in testis, one in platelets, one broadly expressed, and one of unknown tissue range.


Pssm-ID: 130689 [Multi-domain]  Cd Length: 562  Bit Score: 44.80  E-value: 3.33e-04
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1318663272  925 PQQPVAPPASNAYPNTPYIS--PVASYSGQPQMYTAQQAssptsssaasFPPPSsgasfqhgGPGAPPSSSAYALPPgtt 1002
Cdd:TIGR01628  383 RQLPMGSPMGGAMGQPPYYGqgPQQQFNGQPLGWPRMSM----------MPTPM--------GPGGPLRPNGLAPMN--- 441
                           90       100       110       120       130
                   ....*....|....*....|....*....|....*....|....*....|....*
gi 1318663272 1003 gpQNGWNDPPALNRVPKKKKMPENFMPpvpitspimNPSGDPQSQGLQQQPSTPG 1057
Cdd:TIGR01628  442 --AVRAPSRNAQNAAQKPPMQPVMYPP---------NYQSLPLSQDLPQPQSTAS 485
 
Name Accession Description Interval E-value
WD40 cd00200
WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions ...
13-332 8.70e-25

WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel beta-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment.


Pssm-ID: 238121 [Multi-domain]  Cd Length: 289  Bit Score: 105.88  E-value: 8.70e-25
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1318663272   13 AWSPAQNhpiYLATGtsaqqldatfSTNASLEIFELD-------LSDPSLDMKSCATFSSSHRyhkliwgphkmdskgdv 85
Cdd:cd00200     16 AFSPDGK---LLATG----------SGDGTIKVWDLEtgellrtLKGHTGPVRDVAASADGTY----------------- 65
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1318663272   86 sgvLIAGGENGNIILYDPSKiiagdKEVVIAQKDkHTGPVRALDvniFQTN--LVASGANESEIYIWDLNNFATPMTPGA 163
Cdd:cd00200     66 ---LASGSSDKTIRLWDLET-----GECVRTLTG-HTSYVSSVA---FSPDgrILSSSSRDKTIKVWDVETGKCLTTLRG 133
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1318663272  164 KTQPpedISCIAWNrQVQHILASASPSGRATVWDLRKNEPIIKVSDHSNRMHCsgLAWHPDvATQMVLASEDDrlpVIQM 243
Cdd:cd00200    134 HTDW---VNSVAFS-PDGTFVASSSQDGTIKLWDLRTGKCVATLTGHTGEVNS--VAFSPD-GEKLLSSSSDG---TIKL 203
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1318663272  244 WDLRfASSPLRVLENHARGILAVAWSmADPELLLSCGKDAKILCSNPNTGEVLYELPTNTQWCFDIQWCPRNPAVLSaAS 323
Cdd:cd00200    204 WDLS-TGKCLGTLRGHENGVNSVAFS-PDGYLLASGSEDGTIRVWDLRTGECVQTLSGHTNSVTSLAWSPDGKRLAS-GS 280

                   ....*....
gi 1318663272  324 FDGRISVYS 332
Cdd:cd00200    281 ADGTIRIWD 289
WD40 COG2319
WD40 repeat [General function prediction only];
89-333 1.40e-23

WD40 repeat [General function prediction only];


Pssm-ID: 441893 [Multi-domain]  Cd Length: 403  Bit Score: 104.99  E-value: 1.40e-23
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1318663272   89 LIAGGENGNIILYDpskiIAGDKEvvIAQKDKHTGPVRALDVNiFQTNLVASGANESEIYIWDLNNFATPMTPGAKTQPp 168
Cdd:COG2319    177 LASGSDDGTVRLWD----LATGKL--LRTLTGHTGAVRSVAFS-PDGKLLASGSADGTVRLWDLATGKLLRTLTGHSGS- 248
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1318663272  169 edISCIAWNRQVQHiLASASPSGRATVWDLRKNEPIIKVSDHSNRMHcsGLAWHPDvATQMVLASEDDRlpvIQMWDLRf 248
Cdd:COG2319    249 --VRSVAFSPDGRL-LASGSADGTVRLWDLATGELLRTLTGHSGGVN--SVAFSPD-GKLLASGSDDGT---VRLWDLA- 318
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1318663272  249 ASSPLRVLENHARGILAVAWSmADPELLLSCGKDAKILCSNPNTGEVLYELPTNTQWCFDIQWCPRNPAVLSaASFDGRI 328
Cdd:COG2319    319 TGKLLRTLTGHTGAVRSVAFS-PDGKTLASGSDDGTVRLWDLATGELLRTLTGHTGAVTSVAFSPDGRTLAS-GSADGTV 396

                   ....*
gi 1318663272  329 SVYSI 333
Cdd:COG2319    397 RLWDL 401
WD40 COG2319
WD40 repeat [General function prediction only];
89-333 4.83e-22

WD40 repeat [General function prediction only];


Pssm-ID: 441893 [Multi-domain]  Cd Length: 403  Bit Score: 100.37  E-value: 4.83e-22
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1318663272   89 LIAGGENGNIILYDpskiIAGDKEvvIAQKDKHTGPVRALDvniFQTN--LVASGANESEIYIWDLNNFATPMTPGAKTQ 166
Cdd:COG2319    135 LASGSADGTVRLWD----LATGKL--LRTLTGHSGAVTSVA---FSPDgkLLASGSDDGTVRLWDLATGKLLRTLTGHTG 205
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1318663272  167 PpedISCIAWNRQvQHILASASPSGRATVWDLRKNEPIIKVSDHSNRMHCsgLAWHPDvATQMVLASEDDRlpvIQMWDL 246
Cdd:COG2319    206 A---VRSVAFSPD-GKLLASGSADGTVRLWDLATGKLLRTLTGHSGSVRS--VAFSPD-GRLLASGSADGT---VRLWDL 275
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1318663272  247 RfASSPLRVLENHARGILAVAWSmADPELLLSCGKDAKILCSNPNTGEVLYELPTNTQWCFDIQWCPRNPAVLSaASFDG 326
Cdd:COG2319    276 A-TGELLRTLTGHSGGVNSVAFS-PDGKLLASGSDDGTVRLWDLATGKLLRTLTGHTGAVRSVAFSPDGKTLAS-GSDDG 352

                   ....*..
gi 1318663272  327 RISVYSI 333
Cdd:COG2319    353 TVRLWDL 359
WD40 COG2319
WD40 repeat [General function prediction only];
121-338 6.94e-20

WD40 repeat [General function prediction only];


Pssm-ID: 441893 [Multi-domain]  Cd Length: 403  Bit Score: 93.82  E-value: 6.94e-20
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1318663272  121 HTGPVRALDVNiFQTNLVASGANESEIYIWDLnnfATPMTPGAKTQPPEDISCIAWNRQvQHILASASPSGRATVWDLRK 200
Cdd:COG2319     77 HTAAVLSVAFS-PDGRLLASASADGTVRLWDL---ATGLLLRTLTGHTGAVRSVAFSPD-GKTLASGSADGTVRLWDLAT 151
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1318663272  201 NEPIIKVSDHSNRMHCsgLAWHPDvATQMVLASEDDRlpvIQMWDLRfASSPLRVLENHARGILAVAWSmADPELLLSCG 280
Cdd:COG2319    152 GKLLRTLTGHSGAVTS--VAFSPD-GKLLASGSDDGT---VRLWDLA-TGKLLRTLTGHTGAVRSVAFS-PDGKLLASGS 223
                          170       180       190       200       210
                   ....*....|....*....|....*....|....*....|....*....|....*...
gi 1318663272  281 KDAKILCSNPNTGEVLYELPTNTQWCFDIQWCPrNPAVLSAASFDGRISVYSIMGGSI 338
Cdd:COG2319    224 ADGTVRLWDLATGKLLRTLTGHSGSVRSVAFSP-DGRLLASGSADGTVRLWDLATGEL 280
WD40 cd00200
WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions ...
171-337 1.11e-16

WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel beta-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment.


Pssm-ID: 238121 [Multi-domain]  Cd Length: 289  Bit Score: 82.00  E-value: 1.11e-16
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1318663272  171 ISCIAWNRQvQHILASASPSGRATVWDLRKNEPIIKVSDHSNRMhcSGLAWHPDvATQMVLASEDDrlpVIQMWDLRfAS 250
Cdd:cd00200     12 VTCVAFSPD-GKLLATGSGDGTIKVWDLETGELLRTLKGHTGPV--RDVAASAD-GTYLASGSSDK---TIRLWDLE-TG 83
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1318663272  251 SPLRVLENHARGILAVAWSmADPELLLSCGKDAKILCSNPNTGEVLYELPTNTQWCFDIQWCPrNPAVLSAASFDGRISV 330
Cdd:cd00200     84 ECVRTLTGHTSYVSSVAFS-PDGRILSSSSRDKTIKVWDVETGKCLTTLRGHTDWVNSVAFSP-DGTFVASSSQDGTIKL 161

                   ....*..
gi 1318663272  331 YSIMGGS 337
Cdd:cd00200    162 WDLRTGK 168
PHA03247 PHA03247
large tegument protein UL36; Provisional
750-1104 1.25e-12

large tegument protein UL36; Provisional


Pssm-ID: 223021 [Multi-domain]  Cd Length: 3151  Bit Score: 73.05  E-value: 1.25e-12
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1318663272  750 SQYASLLAAQGSIAAALAFLPDNTNQPNIVQLRDRlCKAQGKPVSGQESSQSPyeRQPLSKGRPGPVAGHSQMPRvqtqq 829
Cdd:PHA03247  2632 SPAANEPDPHPPPTVPPPERPRDDPAPGRVSRPRR-ARRLGRAAQASSPPQRP--RRRAARPTVGSLTSLADPPP----- 2703
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1318663272  830 yyPHVRIAPTVTTWSNKTPTAL-PSHPPAASPSDTQGENPPPP--GFIMQGNVIPNPAAPLPTAPGH-MPSQLPPYPQPQ 905
Cdd:PHA03247  2704 --PPPTPEPAPHALVSATPLPPgPAAARQASPALPAAPAPPAVpaGPATPGGPARPARPPTTAGPPApAPPAAPAAGPPR 2781
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1318663272  906 PYQPAQQYSFGTGGAAAYRPQQPVAPPASNAYPNTPYISPVASYSGQPQMYTAQQASSPTSSSAASfPPPSSGASFQHGG 985
Cdd:PHA03247  2782 RLTRPAVASLSESRESLPSPWDPADPPAAVLAPAAALPPAASPAGPLPPPTSAQPTAPPPPPGPPP-PSLPLGGSVAPGG 2860
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1318663272  986 PGA--PPSSSAYALPPGTTGPQNGWNDPPALNRVPKKKKMPENFM--PPVPITSPIMNPSGDPQSQGLQQQPSTPGPLSS 1061
Cdd:PHA03247  2861 DVRrrPPSRSPAAKPAAPARPPVRRLARPAVSRSTESFALPPDQPerPPQPQAPPPPQPQPQPPPPPQPQPPPPPPPRPQ 2940
                          330       340       350       360
                   ....*....|....*....|....*....|....*....|....
gi 1318663272 1062 HASFPQQHLAG-GQPFHGVQQPLAQTGMPPSFSKPNTEGAPGAP 1104
Cdd:PHA03247  2941 PPLAPTTDPAGaGEPSGAVPQPWLGALVPGRVAVPRFRVPQPAP 2984
PHA03247 PHA03247
large tegument protein UL36; Provisional
790-1106 6.40e-11

large tegument protein UL36; Provisional


Pssm-ID: 223021 [Multi-domain]  Cd Length: 3151  Bit Score: 67.27  E-value: 6.40e-11
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1318663272  790 GKPVSGQESSQSPYERQPLSKGRPGPVAGHSQMPRVQTQQYYPHVRIAP-----------TVTTWSNKTPTALPSHPPAA 858
Cdd:PHA03247  2631 PSPAANEPDPHPPPTVPPPERPRDDPAPGRVSRPRRARRLGRAAQASSPpqrprrraarpTVGSLTSLADPPPPPPTPEP 2710
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1318663272  859 SPSDTQGENPPPPGFIMQGNVIP-NPAAPLPTAPGHMPSQLPPYPQPQPYQPAqqysfgTGGAAAYRPQQPVAPPASNAy 937
Cdd:PHA03247  2711 APHALVSATPLPPGPAAARQASPaLPAAPAPPAVPAGPATPGGPARPARPPTT------AGPPAPAPPAAPAAGPPRRL- 2783
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1318663272  938 PNTPYISPVASYSGQPQMYTAQQASSPTSSSAASFPPPSSGAsfqhgGPGAPPSSSAYALPPGTTGP------QNGWNDP 1011
Cdd:PHA03247  2784 TRPAVASLSESRESLPSPWDPADPPAAVLAPAAALPPAASPA-----GPLPPPTSAQPTAPPPPPGPpppslpLGGSVAP 2858
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1318663272 1012 --PALNRVPKKKKMPENFMPPVPITSPIMNPSGDPQSQGLQQQPSTPGPLSSHASFPQ-QHLAGGQPFHGVQQPLAQTGM 1088
Cdd:PHA03247  2859 ggDVRRRPPSRSPAAKPAAPARPPVRRLARPAVSRSTESFALPPDQPERPPQPQAPPPpQPQPQPPPPPQPQPPPPPPPR 2938
                          330
                   ....*....|....*....
gi 1318663272 1089 PPSFSKPNTEGAP-GAPIG 1106
Cdd:PHA03247  2939 PQPPLAPTTDPAGaGEPSG 2957
Atrophin-1 pfam03154
Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian ...
775-1111 1.68e-10

Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian atrophy (DRPLA) gene. DRPLA (OMIM:125370) is a progressive neurodegenerative disorder. It is caused by the expansion of a CAG repeat in the DRPLA gene on chromosome 12p. This results in an extended polyglutamine region in atrophin-1 that is thought to confer toxicity to the protein, possibly through altering its interactions with other proteins. The expansion of a CAG repeat is also the underlying defect in six other neurodegenerative disorders, including Huntington's disease. One interaction of expanded polyglutamine repeats that is thought to be pathogenic is that with the short glutamine repeat in the transcriptional coactivator CREB binding protein, CBP. This interaction draws CBP away from its usual nuclear location to the expanded polyglutamine repeat protein aggregates that are characteristic of the polyglutamine neurodegenerative disorders. This interferes with CBP-mediated transcription and causes cytotoxicity.


Pssm-ID: 460830 [Multi-domain]  Cd Length: 991  Bit Score: 65.56  E-value: 1.68e-10
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1318663272  775 QPNIVQLRDRLCKAQGKPVSGQESSQSPYERQPLSKGRPGPVAGHSQMPRVQTQQYYPHVRIAPTVTTWSNKTPT---AL 851
Cdd:pfam03154  170 QPPVLQAQSGAASPPSPPPPGTTQAATAGPTPSAPSVPPQGSPATSQPPNQTQSTAAPHTLIQQTPTLHPQRLPSphpPL 249
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1318663272  852 PSHPPAASPSDTQGENPPPPGFIMQGNVIPNPaapLPTAPGHMP----------------SQLPPYPQPQPYQPAQQYSF 915
Cdd:pfam03154  250 QPMTQPPPPSQVSPQPLPQPSLHGQMPPMPHS---LQTGPSHMQhpvppqpfpltpqssqSQVPPGPSPAAPGQSQQRIH 326
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1318663272  916 GTGGAAAYRPQQPVA----PPASNAYPNT--PYISPVASY-SGQPQMYTAQQASSPTSSSAASFPPPS-----SGASFQH 983
Cdd:pfam03154  327 TPPSQSQLQSQQPPReqplPPAPLSMPHIkpPPTTPIPQLpNPQSHKHPPHLSGPSPFQMNSNLPPPPalkplSSLSTHH 406
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1318663272  984 GGPGAPPS----------SSAYALPPGTTGPQN----GWNDPP--ALNRVPKKKKMPEN-FMP--PVPITSPIMNPSGDP 1044
Cdd:pfam03154  407 PPSAHPPPlqlmpqsqqlPPPPAQPPVLTQSQSlpppAASHPPtsGLHQVPSQSPFPQHpFVPggPPPITPPSGPPTSTS 486
                          330       340       350       360       370       380
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*..
gi 1318663272 1045 QSQGLQQQPSTpGPLSSHASFPQQHLAGGQPFHGVQQPLAQTGMPPSFSKPNTEGAPGAPIGNTIQH 1111
Cdd:pfam03154  487 SAMPGIQPPSS-ASVSSSGPVPAAVSCPLPPVQIKEEALDEAEEPESPPPPPRSPSPEPTVVNTPSH 552
ACE1-Sec16-like cd09233
Ancestral coatomer element 1 (ACE1) of COPII coat complex assembly protein Sec16; COPII coat ...
572-766 1.79e-09

Ancestral coatomer element 1 (ACE1) of COPII coat complex assembly protein Sec16; the COPII coat complex plays an important role in vesicular traffic of newly synthesized proteins from the endoplasmic reticulum (ER) to the Golgi apparatus by mediating the formation of transport vesicles. COPII consists of an outer coat, made up of the scaffold proteins Sec31 and Sec13, and the cargo adaptor complex, Sec23 and Sec24, which are recruited by the small GTPase Sar1. Sec16 is involved in the early steps of the assembly process. Sec16 forms elongated heterotetramers with Sec13, Sec13-(Sec16)2-Sec13. It interacts with Sec13 by insertion of a single beta-blade to close the six-bladed beta propeller of Sec13. Sec13 interacts in the same way with Sec31 and Nup145C, a nuclear pore protein; all of these contain a structurally related ancestral coatomer element 1 (ACE1). Sec16 is believed to be a key component in maintaining the integrity of the ER exit site.


Pssm-ID: 187750 [Multi-domain]  Cd Length: 314  Bit Score: 60.73  E-value: 1.79e-09
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1318663272  572 ITRALLTGNFESAVDLCLhDNRM-ADAIILAIAGGQELLAQTQKKyFAKSQSKIT---RLITAVVMKNWREIVESC---- 643
Cdd:cd09233     69 FRNLLLTGNRKEALELAL-DNGLwAHALLLASSLGKETWAEVVSR-FARSESKLNdplQTLYQLFSGNSPEAITELadnp 146
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1318663272  644 -----DLKNWREALAAVLTYAKPD-EFSALCdLLGTRLEREGDSLlrtQACLCYICAGnverlvacwtkAQDGSSPLS-- 715
Cdd:cd09233    147 aeaewALGNWREHLAIILSNRTSNlDLEALV-ELGDLLAQRGLVE---AAHICYLLAG-----------VPLGPYPSSps 211
                          170       180       190       200       210       220
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*.
gi 1318663272  716 -----LQDLIEKVVILR--KAVQLT------QALDTNTVG--ALLAEKMsQYASLLAAQGSIAAAL 766
Cdd:cd09233    212 scllgGAVHNKSPRTFAtpEAIQLTeiyeyaLSLGNPQFGlpHLQPYKL-IHAARLAELGLVSEAL 276
WD40 COG2319
WD40 repeat [General function prediction only];
89-247 3.51e-09

WD40 repeat [General function prediction only];


Pssm-ID: 441893 [Multi-domain]  Cd Length: 403  Bit Score: 60.31  E-value: 3.51e-09
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1318663272   89 LIAGGENGNIILYDpskiIAGDKEVVIaqKDKHTGPVRALDVNiFQTNLVASGANESEIYIWDLNNFATPMTPGAKTqpp 168
Cdd:COG2319    261 LASGSADGTVRLWD----LATGELLRT--LTGHSGGVNSVAFS-PDGKLLASGSDDGTVRLWDLATGKLLRTLTGHT--- 330
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1318663272  169 EDISCIAWNRQVQhILASASPSGRATVWDLRKNEPIIKVSDHSNRMHcsGLAWHPD---VATqmvlASEDDRlpvIQMWD 245
Cdd:COG2319    331 GAVRSVAFSPDGK-TLASGSDDGTVRLWDLATGELLRTLTGHTGAVT--SVAFSPDgrtLAS----GSADGT---VRLWD 400

                   ..
gi 1318663272  246 LR 247
Cdd:COG2319    401 LA 402
Sec16_C pfam12931
Sec23-binding domain of Sec16; Sec16 is a multi-domain vesicle coat protein. The C-terminal ...
572-766 3.43e-07

Sec23-binding domain of Sec16; Sec16 is a multi-domain vesicle coat protein. The C-terminal region is the part that binds to Sec23, a COPII vesicle coat protein. This association is part of the transport vesicle coat structure.


Pssm-ID: 432884  Cd Length: 279  Bit Score: 53.33  E-value: 3.43e-07
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1318663272  572 ITRALLTGNFESAVDLCLhDNRM-ADAIILAIAGGQELLAQTQKKY----FAKSQSKITRLItAVVMK----NWREIVE- 641
Cdd:pfam12931    1 IRALLLTGDREKALWLAL-DKKLwAHALLIASTLGKEKWKEVVQEFvrseFKGSNNKSGESL-AALYQvfagNSEEAVDe 78
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1318663272  642 --------SCDLKNWREALAAVLTYAKPDEFSALCDlLGTRLEREGdslLRTQACLCYICAgNVERLVACWTKAQDGSSP 713
Cdd:pfam12931   79 lvppsknaLWALDNWRETLALVLSNRSPGDVEALLA-LGDLLAQYG---RTEAAHICFLLA-GLPLSQTVLLGADHVRFP 153
                          170       180       190       200       210       220
                   ....*....|....*....|....*....|....*....|....*....|....*....|....
gi 1318663272  714 LSLQDLIEkvvilrkAVQLTQ----ALDTNTVGA-------LLAEKMsQYASLLAAQGSIAAAL 766
Cdd:pfam12931  154 STFGNDLE-------SILLTEiyeyALSLSPPQPpfvglphLLPYKL-QHAAVLAEYGLVSEAQ 209
PHA03247 PHA03247
large tegument protein UL36; Provisional
799-1060 1.06e-06

large tegument protein UL36; Provisional


Pssm-ID: 223021 [Multi-domain]  Cd Length: 3151  Bit Score: 53.40  E-value: 1.06e-06
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1318663272  799 SQSPYERQPLSKGRPGPVAghsqmPRVQTQqyyphvriAPTVTTWSNKTPTALPSHPPAASPSDTQGENPPPPGFIMQGN 878
Cdd:PHA03247  2753 GPARPARPPTTAGPPAPAP-----PAAPAA--------GPPRRLTRPAVASLSESRESLPSPWDPADPPAAVLAPAAALP 2819
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1318663272  879 VIPNPAAPLPTAPGHMPSQLPPYPQPQPYQPAQQYSFGTGGAAAYRP--QQPVAPPASNAYPNT-----PYISPVASYSG 951
Cdd:PHA03247  2820 PAASPAGPLPPPTSAQPTAPPPPPGPPPPSLPLGGSVAPGGDVRRRPpsRSPAAKPAAPARPPVrrlarPAVSRSTESFA 2899
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1318663272  952 QPQMYTAQQASSPTSSSAASFPPPSSGASFQhggPGAPPSSSAYALPPGTTGPQ----------NGWNDPPALNRVPKKK 1021
Cdd:PHA03247  2900 LPPDQPERPPQPQAPPPPQPQPQPPPPPQPQ---PPPPPPPRPQPPLAPTTDPAgagepsgavpQPWLGALVPGRVAVPR 2976
                          250       260       270       280
                   ....*....|....*....|....*....|....*....|....*..
gi 1318663272 1022 KMPENFMPPVPITSPIMNPSGDPQSQGLQQQPS--------TPGPLS 1060
Cdd:PHA03247  2977 FRVPQPAPSREAPASSTPPLTGHSLSRVSSWASslalheetDPPPVS 3023
WD40 cd00200
WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions ...
252-336 2.57e-06

WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel beta-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment.


Pssm-ID: 238121 [Multi-domain]  Cd Length: 289  Bit Score: 50.80  E-value: 2.57e-06
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1318663272  252 PLRVLENHARGILAVAWSmADPELLLSCGKDAKILCSNPNTGEVLYELPTNTQWCFDIQWCPRNPAVLSaASFDGRISVY 331
Cdd:cd00200      1 LRRTLKGHTGGVTCVAFS-PDGKLLATGSGDGTIKVWDLETGELLRTLKGHTGPVRDVAASADGTYLAS-GSSDKTIRLW 78

                   ....*
gi 1318663272  332 SIMGG 336
Cdd:cd00200     79 DLETG 83
Med15 pfam09606
ARC105 or Med15 subunit of Mediator complex non-fungal; The approx. 70 residue Med15 domain of ...
796-1168 2.84e-06

ARC105 or Med15 subunit of Mediator complex non-fungal; The approximately 70-residue Med15 domain of the ARC-Mediator co-activator is a three-helix bundle with marked similarity to the KIX domain. The sterol regulatory element binding protein (SREBP) family of transcription activators uses the ARC105 subunit to activate target genes in the regulation of cholesterol and fatty acid homeostasis. In addition, Med15 is a critical transducer of gene activation signals that control early metazoan development.


Pssm-ID: 312941 [Multi-domain]  Cd Length: 732  Bit Score: 51.55  E-value: 2.84e-06
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1318663272  796 QESSQSPYERQPLSKGrPGPVAGHSQMPRVQTQQYYPHVRIAPTVTTWSNKtptALPSHPPAASPSDTQGENPPPPGFIM 875
Cdd:pfam09606  160 QPSSGQPGSGTPNQMG-PNGGPGQGQAGGMNGGQQGPMGGQMPPQMGVPGM---PGPADAGAQMGQQAQANGGMNPQQMG 235
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1318663272  876 QG--NVIPNPAAPLPTApghmpsqlPPYPQPQPYQPAQQYSFGTGGAA-AYRPQQPVAPPasnayPNTPYISPVASYSGQ 952
Cdd:pfam09606  236 GApnQVAMQQQQPQQQG--------QQSQLGMGINQMQQMPQGVGGGAgQGGPGQPMGPP-----GQQPGAMPNVMSIGD 302
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1318663272  953 PQMYTAQQASSPTSSSAASFPPPSSGASFQHGGPGAPPSSSAYALPPGTTGPQN--GWNDPPALNRVPKKKKMPEnfmpP 1030
Cdd:pfam09606  303 QNNYQQQQTRQQQQQQGGNHPAAHQQQMNQSVGQGGQVVALGGLNHLETWNPGNfgGLGANPMQRGQPGMMSSPS----P 378
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1318663272 1031 VPITSPI-MNPsgdPQSQGLQQQPSTPGPLSSHASFPQQHLAGGQPF-HGVQQPLAQTG-MPPSFSKPNTEGaPGAPIGN 1107
Cdd:pfam09606  379 VPGQQVRqVTP---NQFMRQSPQPSVPSPQGPGSQPPQSHPGGMIPSpALIPSPSPQMSqQPAQQRTIGQDS-PGGSLNT 454
                          330       340       350       360       370       380       390
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|...
gi 1318663272 1108 TIQ---------HVQALPTEK---ITKKPIPEEHLILKTTfedliqrclssaTDPQTKRKLDDASKRLEFLYD 1168
Cdd:pfam09606  455 PGQsavnsplnpQEEQLYREKyrqLTKYIEPLKRMIAKME------------NDPGDIDKMNKMKRLLEILSN 515
PHA03247 PHA03247
large tegument protein UL36; Provisional
813-1127 4.74e-06

large tegument protein UL36; Provisional


Pssm-ID: 223021 [Multi-domain]  Cd Length: 3151  Bit Score: 51.48  E-value: 4.74e-06
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1318663272  813 PGP-VAGHSQMPRVQTQQYYPHVRIAPtvttwsnKTPTALPShPPAASPSDTQGENPPPPGFIMQGNVIPNPAaPLPTAP 891
Cdd:PHA03247  2578 SEPaVTSRARRPDAPPQSARPRAPVDD-------RGDPRGPA-PPSPLPPDTHAPDPPPPSPSPAANEPDPHP-PPTVPP 2648
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1318663272  892 GHMPSQLPPYPQPQPYQPAQQYSFGTGGAAA-YRPQQPVAPPAsnaypntpyISPVASYSGQPqmytaqqassptsssaa 970
Cdd:PHA03247  2649 PERPRDDPAPGRVSRPRRARRLGRAAQASSPpQRPRRRAARPT---------VGSLTSLADPP----------------- 2702
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1318663272  971 sfPPPSSGAsfqhggPGAPPSSSAYALPPGttgPQNGWNDPPALNRVPKKKKMPENFMPPVPITSPIM--NPSGDPQSQG 1048
Cdd:PHA03247  2703 --PPPPTPE------PAPHALVSATPLPPG---PAAARQASPALPAAPAPPAVPAGPATPGGPARPARppTTAGPPAPAP 2771
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1318663272 1049 LQQQPSTPGP-LSSHASFPQQHLAGGQPFHGVQQPLAQTGMPPSFSKPNTEgAPGAPIGNTIQHVQALPTekITKKPIPE 1127
Cdd:PHA03247  2772 PAAPAAGPPRrLTRPAVASLSESRESLPSPWDPADPPAAVLAPAAALPPAA-SPAGPLPPPTSAQPTAPP--PPPGPPPP 2848
Atrophin-1 pfam03154
Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian ...
950-1129 4.02e-05

Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian atrophy (DRPLA) gene. DRPLA (OMIM:125370) is a progressive neurodegenerative disorder. It is caused by the expansion of a CAG repeat in the DRPLA gene on chromosome 12p. This results in an extended polyglutamine region in atrophin-1 that is thought to confer toxicity to the protein, possibly through altering its interactions with other proteins. The expansion of a CAG repeat is also the underlying defect in six other neurodegenerative disorders, including Huntington's disease. One interaction of expanded polyglutamine repeats that is thought to be pathogenic is that with the short glutamine repeat in the transcriptional coactivator CREB binding protein, CBP. This interaction draws CBP away from its usual nuclear location to the expanded polyglutamine repeat protein aggregates that are characteristic of the polyglutamine neurodegenerative disorders. This interferes with CBP-mediated transcription and causes cytotoxicity.


Pssm-ID: 460830 [Multi-domain]  Cd Length: 991  Bit Score: 48.22  E-value: 4.02e-05
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1318663272  950 SGQPQMYTAQQASSPTSSSAASFPPPSSGASFQHGGPGAPPSS---SAYALPPGTTGPQNGWNDPPALNRV--------- 1017
Cdd:pfam03154  161 SAQQQILQTQPPVLQAQSGAASPPSPPPPGTTQAATAGPTPSApsvPPQGSPATSQPPNQTQSTAAPHTLIqqtptlhpq 240
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1318663272 1018 ----------PKKKKMPENFMPPVPITSPIMNPSGDPQSQGLQQQPS--------TPGPLSSHASFPQ-----QHLAGGQ 1074
Cdd:pfam03154  241 rlpsphpplqPMTQPPPPSQVSPQPLPQPSLHGQMPPMPHSLQTGPShmqhpvppQPFPLTPQSSQSQvppgpSPAAPGQ 320
                          170       180       190       200       210
                   ....*....|....*....|....*....|....*....|....*....|....*
gi 1318663272 1075 PFHGVQQPLAQTgMPPSFSKPNTEGAPGAPIgnTIQHVQALPTEKITKKPIPEEH 1129
Cdd:pfam03154  321 SQQRIHTPPSQS-QLQSQQPPREQPLPPAPL--SMPHIKPPPTTPIPQLPNPQSH 372
PHA03247 PHA03247
large tegument protein UL36; Provisional
795-1106 6.78e-05

large tegument protein UL36; Provisional


Pssm-ID: 223021 [Multi-domain]  Cd Length: 3151  Bit Score: 47.63  E-value: 6.78e-05
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1318663272  795 GQESSQSPYERQPLSKGRPGPVAGHSQMPRvqtqqyyPHVRIAPtvttwSNKTPTALPSHPP--AASP------------ 860
Cdd:PHA03247  2476 GAPVYRRPAEARFPFAAGAAPDPGGGGPPD-------PDAPPAP-----SRLAPAILPDEPVgePVHPrmltwirgleel 2543
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1318663272  861 -SDTQGENPPPPgfimqgnvipnPAAPLPTAPGHmpSQLPPYPQPQPYQPAQQYSFGTGGAAAY--RPQQPVAPPASNAY 937
Cdd:PHA03247  2544 aSDDAGDPPPPL-----------PPAAPPAAPDR--SVPPPRPAPRPSEPAVTSRARRPDAPPQsaRPRAPVDDRGDPRG 2610
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1318663272  938 PNTPYISPVASYSGQPQMYTAQQASSPTSSSAASFPPPSSGASFQHGGPGAPPSSSAYAL--PPGTTGPQNGWNdPPALN 1015
Cdd:PHA03247  2611 PAPPSPLPPDTHAPDPPPPSPSPAANEPDPHPPPTVPPPERPRDDPAPGRVSRPRRARRLgrAAQASSPPQRPR-RRAAR 2689
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1318663272 1016 rvPKKKKMPENFMPPVPITSPimNPSGDPQSQGLqqqPSTPGPLSSHASFPQQHLAGGQPF--HGVQQPLAQT--GMPPS 1091
Cdd:PHA03247  2690 --PTVGSLTSLADPPPPPPTP--EPAPHALVSAT---PLPPGPAAARQASPALPAAPAPPAvpAGPATPGGPArpARPPT 2762
                          330
                   ....*....|....*
gi 1318663272 1092 FSKPNTEGAPGAPIG 1106
Cdd:PHA03247  2763 TAGPPAPAPPAAPAA 2777
PAT1 pfam09770
Topoisomerase II-associated protein PAT1; Members of this family are necessary for accurate ...
921-1085 1.07e-04

Topoisomerase II-associated protein PAT1; Members of this family are necessary for accurate chromosome transmission during cell division.


Pssm-ID: 401645 [Multi-domain]  Cd Length: 846  Bit Score: 46.57  E-value: 1.07e-04
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1318663272  921 AAYRPQQPVAPPAsnaypntpyisPVASYSGQPQMYTAQQASSPTSssaasFPPPSSGASFQHGGPGAPPSSSAYALPPG 1000
Cdd:pfam09770  201 AAMRAQAKKPAQQ-----------PAPAPAQPPAAPPAQQAQQQQQ-----FPPQIQQQQQPQQQPQQPQQHPGQGHPVT 264
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1318663272 1001 T-TGPQNGWNDPPALNRVPKKKKMPENfMPPVPI--TSPIMNPS--GDPQSQGLQQQPSTPGPLSSHASFPQQHLAGGQ- 1074
Cdd:pfam09770  265 IlQRPQSPQPDPAQPSIQPQAQQFHQQ-PPPVPVqpTQILQNPNrlSAARVGYPQNPQPGVQPAPAHQAHRQQGSFGRQa 343
                          170
                   ....*....|.
gi 1318663272 1075 PFHGVQQPLAQ 1085
Cdd:pfam09770  344 PIITHPQQLAQ 354
PHA03378 PHA03378
EBNA-3B; Provisional
807-1118 1.63e-04

EBNA-3B; Provisional


Pssm-ID: 223065 [Multi-domain]  Cd Length: 991  Bit Score: 46.21  E-value: 1.63e-04
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1318663272  807 PLSKGRPGPVAGHSQMPRVQTQQYYPHVRIAPTvtTWSNKTpTALPSHPPAASPSDTQGENPPPPGFIMQgnVIPNPAAP 886
Cdd:PHA03378   590 PSYAQTPWPVPHPSQTPEPPTTQSHIPETSAPR--QWPMPL-RPIPMRPLRMQPITFNVLVFPTPHQPPQ--VEITPYKP 664
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1318663272  887 LPTAPGHMPSQLPPYPQPQPYQPAQQYSFGTGGAAAYRPQQPVAPPASNAYPntpyisPVASYSGQPQMYTAQQASSPTS 966
Cdd:PHA03378   665 TWTQIGHIPYQPSPTGANTMLPIQWAPGTMQPPPRAPTPMRPPAAPPGRAQR------PAAATGRARPPAAAPGRARPPA 738
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1318663272  967 SSAASFPPPSSGASFQHGGPGAPPSSSAYALPPGTTGPQNGWNDPPALNRVPKKKKMPENfMPPVPITSPIMNPSGDPQS 1046
Cdd:PHA03378   739 AAPGRARPPAAAPGRARPPAAAPGRARPPAAAPGAPTPQPPPQAPPAPQQRPRGAPTPQP-PPQAGPTSMQLMPRAAPGQ 817
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1318663272 1047 QGLQQQ--------------PSTPGPLS---SHASFPQQHLAGGQPFHGVQQPLAqtgMPPSFSKPNTEGAPGAPIGNTI 1109
Cdd:PHA03378   818 QGPTKQilrqlltggvkrgrPSLKKPAAlerQAAAGPTPSPGSGTSDKIVQAPVF---YPPVLQPIQVMRQLGSVRAAAA 894

                   ....*....
gi 1318663272 1110 QHVQALPTE 1118
Cdd:PHA03378   895 STVTQAPTE 903
PABP-1234 TIGR01628
polyadenylate binding protein, human types 1, 2, 3, 4 family; These eukaryotic proteins ...
925-1057 3.33e-04

polyadenylate binding protein, human types 1, 2, 3, 4 family; These eukaryotic proteins recognize the poly-A of mRNA and consist of four tandem RNA recognition domains at the N-terminus (rrm: pfam00076) followed by a PABP-specific domain (pfam00658) at the C-terminus. The protein is involved in the transport of mRNAs from the nucleus to the cytoplasm. There are four paralogs in Homo sapiens: one expressed in testis, one expressed in platelets, one broadly expressed, and one of unknown tissue range.


Pssm-ID: 130689 [Multi-domain]  Cd Length: 562  Bit Score: 44.80  E-value: 3.33e-04
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1318663272  925 PQQPVAPPASNAYPNTPYIS--PVASYSGQPQMYTAQQAssptsssaasFPPPSsgasfqhgGPGAPPSSSAYALPPgtt 1002
Cdd:TIGR01628  383 RQLPMGSPMGGAMGQPPYYGqgPQQQFNGQPLGWPRMSM----------MPTPM--------GPGGPLRPNGLAPMN--- 441
                           90       100       110       120       130
                   ....*....|....*....|....*....|....*....|....*....|....*
gi 1318663272 1003 gpQNGWNDPPALNRVPKKKKMPENFMPpvpitspimNPSGDPQSQGLQQQPSTPG 1057
Cdd:TIGR01628  442 --AVRAPSRNAQNAAQKPPMQPVMYPP---------NYQSLPLSQDLPQPQSTAS 485
PHA03379 PHA03379
EBNA-3A; Provisional
791-1104 4.10e-04

EBNA-3A; Provisional


Pssm-ID: 223066 [Multi-domain]  Cd Length: 935  Bit Score: 44.66  E-value: 4.10e-04
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1318663272  791 KPVSGQESSQSPYERQ------PLSKGRPGPVAGHS---QMPRV-QTQQYYPHVRIAPT---VTTWSnKTPTALPSHPPA 857
Cdd:PHA03379   444 EPPPVHDLEPGPLHDQhsmapcPVAQLPPGPLQDLEpgdQLPGVvQDGRPACAPVPAPAgpiVRPWE-ASLSQVPGVAFA 522
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1318663272  858 A-SPSDTQGENPPPPGFIMQGNVIPNPAAPLPTAPG-----------HMPSQLPPYPQPQPYQPAQQYSFGTGGAAAYRP 925
Cdd:PHA03379   523 PvMPQPMPVEPVPVPTVALERPVCPAPPLIAMQGPGetsgivrvrerWRPAPWTPNPPRSPSQMSVRDRLARLRAEAQPY 602
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1318663272  926 QQPVA---PPASNAYPNTPYISP-VASYSGQPQMYTAQQASSPTSSSAASFPPPSSGASFQH----GGPGAPPSSSAYAL 997
Cdd:PHA03379   603 QASVEvqpPQLTQVSPQQPMEYPlEPEQQMFPGSPFSQVADVMRAGGVPAMQPQYFDLPLQQpisqGAPLAPLRASMGPV 682
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1318663272  998 PP-GTTGPQngWNDppalnrvpkkkkmpenfmppVPITSPImnPSGDPQSQGLQQQPSTPGPLSSHASFPqqhlaGGQPF 1076
Cdd:PHA03379   683 PPvPATQPQ--YFD--------------------IPLTEPI--NQGASAAHFLPQQPMEGPLVPERWMFQ-----GATLS 733
                          330       340       350
                   ....*....|....*....|....*....|.
gi 1318663272 1077 HGVQQPLAQT---GMPpsFSKPNTEGAPGAP 1104
Cdd:PHA03379   734 QSVRPGVAQSqyfDLP--LTQPINHGAPAAH 762
PRK12323 PRK12323
DNA polymerase III subunit gamma/tau;
838-1084 5.20e-04

Pssm-ID: 237057 [Multi-domain]  Cd Length: 700  Bit Score: 44.48  E-value: 5.20e-04
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1318663272  838 PTVTTWSNKTPTALPSHPPAASPSDTQGENPPPPgfimqgnvipnPAAPLPTAPGHMPSQLPPYPQPQPYQPAQQYSFGT 917
Cdd:PRK12323   375 ATAAAAPVAQPAPAAAAPAAAAPAPAAPPAAPAA-----------APAAAAAARAVAAAPARRSPAPEALAAARQASARG 443
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1318663272  918 GGAAAYRPQQPVAPPASNAYPNTPYISPVASYSGQPQmytAQQASSPTSSSAASFPPPSSGASFQHGGPGAPPSSSAYAL 997
Cdd:PRK12323   444 PGGAPAPAPAPAAAPAAAARPAAAGPRPVAAAAAAAP---ARAAPAAAPAPADDDPPPWEELPPEFASPAPAQPDAAPAG 520
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1318663272  998 PPGTTGPQNGWNDPPAlNRVPKKKKMPENFMPPVPITSPIMNPSGDPQsqglQQQPSTPGplSSHASFPQqhLAGGQPFH 1077
Cdd:PRK12323   521 WVAESIPDPATADPDD-AFETLAPAPAAAPAPRAAAATEPVVAPRPPR----ASASGLPD--MFDGDWPA--LAARLPVR 591

                   ....*..
gi 1318663272 1078 GVQQPLA 1084
Cdd:PRK12323   592 GLAQQLA 598
PHA03307 PHA03307
transcriptional regulator ICP4; Provisional
805-1066 1.38e-03

Pssm-ID: 223039 [Multi-domain]  Cd Length: 1352  Bit Score: 43.24  E-value: 1.38e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1318663272  805 RQPLSKGRPGPVAGHSQMPRVQTQQyyphvRIAPTVTTWSNKTPTAlPSHPPAASPSDTQGENPPPPGfimqgnviPNPA 884
Cdd:PHA03307    59 AAACDRFEPPTGPPPGPGTEAPANE-----SRSTPTWSLSTLAPAS-PAREGSPTPPGPSSPDPPPPT--------PPPA 124
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1318663272  885 APLPTAPghmpsqlppypqpqpyqpaqqysfgTGGAAAYRPQQPVAPPASNAYPNTPyISPVASYSGQPqmyTAQQASSP 964
Cdd:PHA03307   125 SPPPSPA-------------------------PDLSEMLRPVGSPGPPPAASPPAAG-ASPAAVASDAA---SSRQAALP 175
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1318663272  965 TSSSAASFPPPSSGAS----------FQHGGPGAPPSSSAYALPPGTTGPQNGWNDPPALN------RVPKKKKMPENfM 1028
Cdd:PHA03307   176 LSSPEETARAPSSPPAepppstppaaASPRPPRRSSPISASASSPAPAPGRSAADDAGASSsdssssESSGCGWGPEN-E 254
                          250       260       270
                   ....*....|....*....|....*....|....*...
gi 1318663272 1029 PPVPITSPIMNPSGDPQSQGLQQQPSTPGPLSSHASFP 1066
Cdd:PHA03307   255 CPLPRPAPITLPTRIWEASGWNGPSSRPGPASSSSSPR 292
PRK10263 PRK10263
DNA translocase FtsK; Provisional
875-989 2.22e-03

Pssm-ID: 236669 [Multi-domain]  Cd Length: 1355  Bit Score: 42.38  E-value: 2.22e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1318663272  875 MQGNVIPNPAAPLPTaPGHMPSQLPPYPQPQPYQPAQQYSFGTGGAAAYRPQQPVAPPASNAYPNTPyISPVASYSgQPQ 954
Cdd:PRK10263   732 MKALLDDGPHEPLFT-PIVEPVQQPQQPVAPQQQYQQPQQPVAPQPQYQQPQQPVAPQPQYQQPQQP-VAPQPQYQ-QPQ 808
                           90       100       110
                   ....*....|....*....|....*....|....*
gi 1318663272  955 MYTAQQASSPTSSSAASFPPPSSgasfQHGGPGAP 989
Cdd:PRK10263   809 QPVAPQPQYQQPQQPVAPQPQYQ----QPQQPVAP 839
PRK07764 PRK07764
DNA polymerase III subunits gamma and tau; Validated
848-1032 2.63e-03

Pssm-ID: 236090 [Multi-domain]  Cd Length: 824  Bit Score: 41.90  E-value: 2.63e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1318663272  848 PTAlPSHPPAASPSDTQGENPPPPGFIMQGNVIPNPAAPLPTAPGHMPSQLPPYPQPQPYQPAQQYSFGTGGAAAYRPQQ 927
Cdd:PRK07764   615 PAA-PAAPAAPAAPAPAGAAAAPAEASAAPAPGVAAPEHHPKHVAVPDASDGGDGWPAKAGGAAPAAPPPAPAPAAPAAP 693
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1318663272  928 PVAPPASNAY--PNTPYISPVASYSGQPQMYTAQQASSPTSSSAASFPPPSSGAsfqHGGPGAPPSSSAyalPPGTTGPQ 1005
Cdd:PRK07764   694 AGAAPAQPAPapAATPPAGQADDPAAQPPQAAQGASAPSPAADDPVPLPPEPDD---PPDPAGAPAQPP---PPPAPAPA 767
                          170       180
                   ....*....|....*....|....*..
gi 1318663272 1006 NGWNDPPALNRVPKKKKMPENFMPPVP 1032
Cdd:PRK07764   768 AAPAAAPPPSPPSEEEEMAEDDAPSMD 794
PHA03379 PHA03379
EBNA-3A; Provisional
852-1105 2.69e-03

Pssm-ID: 223066 [Multi-domain]  Cd Length: 935  Bit Score: 41.97  E-value: 2.69e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1318663272  852 PSHPPAASPSDTQGENPPPPGFIMQGNVIPNPAAPLPTAPghmpsqlpPYPQPQPYQPAQQYSFGTGGAA-AYRPQQPVA 930
Cdd:PHA03379   435 TSHGSAQVPEPPPVHDLEPGPLHDQHSMAPCPVAQLPPGP--------LQDLEPGDQLPGVVQDGRPACApVPAPAGPIV 506
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1318663272  931 PPASNAYPNTPYISPvASYSGQPqMYTAQQASSPTSSSAASFPPPSSGASfqhGGPGAPPS----SSAYALPPGTTGPQN 1006
Cdd:PHA03379   507 RPWEASLSQVPGVAF-APVMPQP-MPVEPVPVPTVALERPVCPAPPLIAM---QGPGETSGivrvRERWRPAPWTPNPPR 581
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1318663272 1007 GWNDPPALNRVPKKKKMPENFMPPVPITSPIMnpsgdpqSQGLQQQPSTpGPLSshasfPQQHLAGGQPFHGVQQPLAQT 1086
Cdd:PHA03379   582 SPSQMSVRDRLARLRAEAQPYQASVEVQPPQL-------TQVSPQQPME-YPLE-----PEQQMFPGSPFSQVADVMRAG 648
                          250       260
                   ....*....|....*....|....*..
gi 1318663272 1087 GM----PPSFS----KPNTEGAPGAPI 1105
Cdd:PHA03379   649 GVpamqPQYFDlplqQPISQGAPLAPL 675
DUF4813 pfam16072
Domain of unknown function (DUF4813); This family of proteins is functionally uncharacterized. ...
917-1013 4.61e-03

Domain of unknown function (DUF4813); This family of proteins is functionally uncharacterized. This family of proteins is found in eukaryotes. Proteins in this family are typically between 345 and 672 amino acids in length.


Pssm-ID: 435117 [Multi-domain]  Cd Length: 288  Bit Score: 40.51  E-value: 4.61e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1318663272  917 TGGAAAYRPQQPVAPPASNAYPNTPYISPVASYSGQPQMYTAQQASSPtsssaasfPPPSSGASFQHGGPGAPPSSSAYA 996
Cdd:pfam16072  163 AGGQQPAAPAAPAYPVAPAAYPAQAPAAAPAPAPGAPQTPLAPLNPVA--------AAPAAAAGAAAAPVVAAAAPAAAA 234
                           90
                   ....*....|....*..
gi 1318663272  997 LPPGTTGPQNGWNDPPA 1013
Cdd:pfam16072  235 PPPPAPAAPPADAAPPA 251
SSDP pfam04503
Single-stranded DNA binding protein, SSDP; This is a family of eukaryotic single-stranded DNA ...
857-1104 8.90e-03

Single-stranded DNA binding protein, SSDP; This is a family of eukaryotic single-stranded DNA binding proteins with specificity to a pyrimidine-rich element found in the promoter region of the alpha2(I) collagen gene.


Pssm-ID: 461334 [Multi-domain]  Cd Length: 293  Bit Score: 39.55  E-value: 8.90e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1318663272  857 AASPSDTQGENPPPPGfiMQGNviPNPAAPLPTAPGHMPSQLPPYPQPQPYQPAQQYsfGTGGAAAYRPQQPVAPPASNA 936
Cdd:pfam04503   18 AAAPSPVMGQMPPGDG--MPGG--PMPPGFFQSPPSHPSSQPSPHAQPPPHNPATMM--GPHSQPFMGPRYPGGPRPSVR 91
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1318663272  937 YPNT-------PYISPVASYSGQPqmyTAQQASSPTSSSAASFPPPS----SGASFQHGGPG---APPSSSA--YALPPG 1000
Cdd:pfam04503   92 MPQQgndfngpPGQQPMMPNSMDP---TRPGGHPNMGGPMQRMNPPRgpgmGPMGPQSYGPGmrgPPPNSTDgpGGMPPM 168
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1318663272 1001 TTGPQNG--WNDPPALNRVPKKKKMPENFMPP-----VPITSPIMNPSGDPQSQG--LQQQPSTPGPLSSHASFPQ---- 1067
Cdd:pfam04503  169 NMGPGGRrpWPQPNASNPLPYSSSSPGSYGGPpggggPPGPTPIMPSPQDSTNSGenMYTLMNPVGPGGNRANFPMgpgl 248
                          250       260       270       280
                   ....*....|....*....|....*....|....*....|.
gi 1318663272 1068 --QHLAGGQPFHGVQQPLAQTGMPPSFSKPNT--EGAPGAP 1104
Cdd:pfam04503  249 egPMGPNGMEPHHSNGSLGSGDMDGMKNSPANvlSNGPGTP 289
PRK10263 PRK10263
DNA translocase FtsK; Provisional
927-1101 8.93e-03

Pssm-ID: 236669 [Multi-domain]  Cd Length: 1355  Bit Score: 40.45  E-value: 8.93e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1318663272  927 QPVAPPASNAYPNTPYISPVASYSGQPQMYTAQQASSPTSSSAASFPPPssgasfQHGGPGAPPSSSAYALPPGTTGPQN 1006
Cdd:PRK10263   318 EPVAVAAAATTATQSWAAPVEPVTQTPPVASVDVPPAQPTVAWQPVPGP------QTGEPVIAPAPEGYPQQSQYAQPAV 391
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1318663272 1007 GWNDPPALNRVPKKKKMPENFMPPVPITSPIMNPSGDPQSQGLQQQPSTPGPLSSHASFPQQHLAGGQPFHGVQQPLAQ- 1085
Cdd:PRK10263   392 QYNEPLQQPVQPQQPYYAPAAEQPAQQPYYAPAPEQPAQQPYYAPAPEQPVAGNAWQAEEQQSTFAPQSTYQTEQTYQQp 471
                          170
                   ....*....|....*.
gi 1318663272 1086 TGMPPSFSKPNTEGAP 1101
Cdd:PRK10263   472 AAQEPLYQQPQPVEQQ 487
PHA03247 PHA03247
large tegument protein UL36; Provisional
850-1104 9.33e-03

Pssm-ID: 223021 [Multi-domain]  Cd Length: 3151  Bit Score: 40.69  E-value: 9.33e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1318663272  850 ALPSHPPAASPSDTQGENPPPPGFIMQGNVIPNPAAPLP---TAPGHMPSQLPPYPQPQPYQP-----AQQYSFGTGG-- 919
Cdd:PHA03247  2473 LFPGAPVYRRPAEARFPFAAGAAPDPGGGGPPDPDAPPApsrLAPAILPDEPVGEPVHPRMLTwirglEELASDDAGDpp 2552
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1318663272  920 ---AAAYRPQQP---VAPPASNAYPNTPYISPVASYSGQPQMYTAQQASSPTSSSAASFPPPSSGASFQHGGPGAPPSSS 993
Cdd:PHA03247  2553 pplPPAAPPAAPdrsVPPPRPAPRPSEPAVTSRARRPDAPPQSARPRAPVDDRGDPRGPAPPSPLPPDTHAPDPPPPSPS 2632
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1318663272  994 AYALPPG-----TTGPQNGWNDPPALNRVPKKKKMPENFMPPVPiTSPIMNP--------------SGDPQSQGLQQQPS 1054
Cdd:PHA03247  2633 PAANEPDphpppTVPPPERPRDDPAPGRVSRPRRARRLGRAAQA-SSPPQRPrrraarptvgsltsLADPPPPPPTPEPA 2711
                          250       260       270       280       290
                   ....*....|....*....|....*....|....*....|....*....|
gi 1318663272 1055 tPGPLSSHASFPQQHLAGGQPFhgvQQPLAQTGMPPSFSKPNTEGAPGAP 1104
Cdd:PHA03247  2712 -PHALVSATPLPPGPAAARQAS---PALPAAPAPPAVPAGPATPGGPARP 2757
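The alignment blocks above pair each query row (`gi 1318663272 ...`) with a subject row (`Cdd:...`), each carrying start and end coordinates. As a rough sketch, rows like these could be parsed and compared column by column as shown below. The helper names `parse_row` and `column_stats` are hypothetical, and treating lowercase residues the same as uppercase ones is an assumption made only for this illustration.

```python
import re

# Hypothetical pattern for one alignment row, e.g.
# "gi 1318663272 1078 GVQQPLA 1084" or "Cdd:PRK12323   592 GLAQQLA 598".
ROW = re.compile(
    r"^(?P<label>gi \d+|Cdd:\S+)\s+(?P<start>\d+)\s+(?P<seq>\S+)\s+(?P<end>\d+)$"
)

def parse_row(line):
    """Split one alignment row into (label, start, sequence, end)."""
    m = ROW.match(line.strip())
    if m is None:
        raise ValueError(f"not an alignment row: {line!r}")
    return m["label"], int(m["start"]), m["seq"], int(m["end"])

def column_stats(query_seq, subject_seq):
    """Return (aligned_columns, identical_columns) for two gapped rows.

    Columns where either row has a gap ('-') are skipped. Case is ignored,
    which is an assumption of this sketch.
    """
    if len(query_seq) != len(subject_seq):
        raise ValueError("rows must have the same number of columns")
    aligned = identical = 0
    for q, s in zip(query_seq, subject_seq):
        if q == "-" or s == "-":
            continue
        aligned += 1
        if q.upper() == s.upper():
            identical += 1
    return aligned, identical

# Last block of the PRK12323 alignment shown earlier in this listing.
_, q_start, q_seq, q_end = parse_row("gi 1318663272 1078 GVQQPLA 1084")
_, s_start, s_seq, s_end = parse_row("Cdd:PRK12323   592 GLAQQLA 598")
aligned, identical = column_stats(q_seq, s_seq)
```

For a full hit, the same counting would be applied to each block in turn and the totals summed.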
 
Blast search parameters
Data Source: Precalculated data, version = cdd.v.3.21
Preset Options: Database: CDSEARCH/cdd; Low complexity filter: no; Composition Based Adjustment: yes; E-value threshold: 0.01
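
Each hit above carries a summary line of the form `Pssm-ID: ... Cd Length: ... Bit Score: ... E-value: ...`, and the preset options list an E-value threshold of 0.01. The following is a minimal sketch of how such a line might be parsed and checked against that threshold. The function names are hypothetical, and treating the threshold as inclusive is an assumption, not something the page states.

```python
import re

# Hypothetical pattern for the per-hit summary line shown in this listing.
SUMMARY = re.compile(
    r"Pssm-ID:\s*(?P<pssm_id>\d+)\s*"
    r"(?:\[(?P<kind>[^\]]*)\])?\s*"
    r"Cd Length:\s*(?P<cd_length>\d+)\s*"
    r"Bit Score:\s*(?P<bit_score>[\d.]+)\s*"
    r"E-value:\s*(?P<e_value>\S+)"
)

def parse_summary(line):
    """Extract the numeric fields from one summary line as a dict."""
    m = SUMMARY.search(line)
    if m is None:
        raise ValueError(f"not a summary line: {line!r}")
    return {
        "pssm_id": int(m["pssm_id"]),
        "kind": m["kind"],  # e.g. "Multi-domain"; None if the bracket is absent
        "cd_length": int(m["cd_length"]),
        "bit_score": float(m["bit_score"]),
        "e_value": float(m["e_value"]),
    }

def passes_threshold(hit, threshold=0.01):
    """Assumed rule: keep a hit when its E-value is at or below the threshold."""
    return hit["e_value"] <= threshold

line = ("Pssm-ID: 223066 [Multi-domain]  Cd Length: 935  "
        "Bit Score: 44.66  E-value: 4.10e-04")
hit = parse_summary(line)
keep = passes_threshold(hit)
```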

References:

  • Wang J et al. (2023), "The conserved domain database in 2023", Nucleic Acids Res. 51(D1):D384-D388.
  • Lu S et al. (2020), "The conserved domain database in 2020", Nucleic Acids Res. 48(D1):D265-D268.
  • Marchler-Bauer A et al. (2017), "CDD/SPARCLE: functional classification of proteins via subfamily domain architectures", Nucleic Acids Res. 45(D1):D200-D203.