Conserved domains on  [gi|19112374|ref|NP_595582|]

putative COPII-coated vesicle-associated protein Sec31 [Schizosaccharomyces pombe]


List of domain hits

Name Accession Description Interval E-value
WD40 COG2319
WD40 repeat [General function prediction only];
11-335 8.89e-24

WD40 repeat [General function prediction only];



Pssm-ID: 441893 [Multi-domain]  Cd Length: 403  Bit Score: 105.38  E-value: 8.89e-24
                         10        20        30        40        50        60        70        80
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 19112374   11 TLAWSPRGvndnqALLALGGYTGTegsknsdtlLELWNenPESQKPVGSIDVKTRF-YDLAWekSLDkpmG-VIAGSLED 88
Cdd:COG2319  125 SVAFSPDG-----KTLASGSADGT---------VRLWD--LATGKLLRTLTGHSGAvTSVAF--SPD---GkLLASGSDD 183
                         90       100       110       120       130       140       150       160
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 19112374   89 GGIGFWDPAAilksdeaSASIATYKSENGSILGpLDFNrlqPN--LLASGDNKGDVWVWDIKHPQQPFALPKQnrSSEVH 166
Cdd:COG2319  184 GTVRLWDLAT-------GKLLRTLTGHTGAVRS-VAFS---PDgkLLASGSADGTVRLWDLATGKLLRTLTGH--SGSVR 250
                        170       180       190       200       210       220       230       240
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 19112374  167 VVSWNNKvSHILASGNATEYTTVWDVKLKRQVLNLSylgaagvsAATGAVNSIAWHPNNaTRLATAIDDNRnpiILTWDL 246
Cdd:COG2319  251 SVAFSPD-GRLLASGSADGTVRLWDLATGELLRTLT--------GHSGGVNSVAFSPDG-KLLASGSDDGT---VRLWDL 317
                        250       260       270       280       290       300       310       320
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 19112374  247 RQPTVPQnILTGHQKAALSLSWCPeDPTFLLSSGKDGRAMVWNVETGESLGSFPRSGNWYTKSSWCPsNSNRVAVASLEG 326
Cdd:COG2319  318 ATGKLLR-TLTGHTGAVRSVAFSP-DGKTLASGSDDGTVRLWDLATGELLRTLTGHTGAVTSVAFSP-DGRTLASGSADG 394

                 ....*....
gi 19112374  327 KVSIFSIQS 335
Cdd:COG2319  395 TVRLWDLAT 403
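The paired rows above can be compared column by column. The following is a minimal sketch, assuming the display convention that uppercase letters mark aligned residues while lowercase letters and dashes mark unaligned or gap positions. The function name `aligned_identity` is hypothetical and not part of any NCBI tool. It is applied here to the final block of the alignment (query residues 327-335).

```python
def aligned_identity(query_row: str, subject_row: str) -> tuple[int, int]:
    """Return (identical, aligned) column counts for one pair of alignment rows.

    Assumption: uppercase letters are aligned residues; lowercase letters
    and '-' are unaligned or gap positions and are skipped.
    """
    if len(query_row) != len(subject_row):
        raise ValueError("rows must have the same number of columns")
    identical = aligned = 0
    for q, s in zip(query_row, subject_row):
        if q.isupper() and s.isupper():
            aligned += 1
            if q == s:
                identical += 1
    return identical, aligned


# Final block of the COG2319 alignment shown above.
identical, aligned = aligned_identity("KVSIFSIQS", "TVRLWDLAT")
print(f"{identical} of {aligned} aligned columns are identical")
```

Summing the two counts over every block of a hit would give a rough identity figure for the whole interval; this is only an illustration and does not reproduce how the reported bit score is computed.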
PHA03247 super family cl33720
large tegument protein UL36; Provisional
780-1110 8.20e-19

large tegument protein UL36; Provisional


The actual alignment was detected with superfamily member PHA03247:

Pssm-ID: 223021 [Multi-domain]  Cd Length: 3151  Bit Score: 93.46  E-value: 8.20e-19
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 19112374   780 PGAKEEIQRLTMLLEPHAVPPIHQIKQTGYAPVQPKTSQASSILPTVPRTTSYTSPYATTSSHITPADVHPLPPPSTSTT 859
Cdd:PHA03247 2686 RAARPTVGSLTSLADPPPPPPTPEPAPHALVSATPLPPGPAAARQASPALPAAPAPPAVPAGPATPGGPARPARPPTTAG 2765
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 19112374   860 AGWNDAPmlgqlpmrRAAPSMAPVRSPFPGASSAQPaamSRTSSVSTLPPPPPTASMTASAPAIASPPPPKVGEtyhPPT 939
Cdd:PHA03247 2766 PPAPAPP--------AAPAAGPPRRLTRPAVASLSE---SRESLPSPWDPADPPAAVLAPAAALPPAASPAGPL---PPP 2831
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 19112374   940 ASGTRVPPVQQPSHPNPYTP----VAPQSPVAAASRISSSPNMPPSNPYTPiavASSTVNPAHTYKPHggSQIVPPPKQP 1015
Cdd:PHA03247 2832 TSAQPTAPPPPPGPPPPSLPlggsVAPGGDVRRRPPSRSPAAKPAAPARPP---VRRLARPAVSRSTE--SFALPPDQPE 2906
                         250       260       270       280       290       300       310       320
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 19112374  1016 anrvvPLPPTASQrasayEPPTVSVPSPSALSPSVTPQLPPVS-SRLPPVSATRPQiPQPPPVSTA------LPSSSAVS 1088
Cdd:PHA03247 2907 -----RPPQPQAP-----PPPQPQPQPPPPPQPQPPPPPPPRPqPPLAPTTDPAGA-GEPSGAVPQpwlgalVPGRVAVP 2975
                         330       340
                  ....*....|....*....|..
gi 19112374  1089 RPPIATSAGRSSTAASTSAPLT 1110
Cdd:PHA03247 2976 RFRVPQPAPSREAPASSTPPLT 2997
SRA1 super family cl06366
Steroid receptor RNA activator (SRA1); This family consists of several hypothetical mammalian ...
1072-1217 8.33e-07

Steroid receptor RNA activator (SRA1); This family consists of several hypothetical mammalian steroid receptor RNA activator proteins. SRA-RNAs likely to encode stable proteins are widely expressed in breast cancer cell lines. SRA-RNA is a steroid receptor co-activator which acts as a functional RNA and is classified as belonging to the growing family of functional non-coding RNAs.


The actual alignment was detected with superfamily member pfam07304:

Pssm-ID: 429397 [Multi-domain]  Cd Length: 147  Bit Score: 49.82  E-value: 8.33e-07
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 19112374   1072 PQPPPVSTALPsssavsrPPIATSAGRSSTAASTSAPLTYPAGDrshIPGNLRPiyemLNAELQRVSQSLPPQmsrVVHD 1151
Cdd:pfam07304    2 MGPPPPSVAPP-------PPPPGGPGPLPQVEPTDSPVSESEPV---VEDVMSV----LNQALDACRGTVRKQ---VCDD 64
                           90       100       110       120       130       140
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*.
gi 19112374   1152 TEKRLNMLFDRLNSNVLSKPLTDELLALATSLNAHDYQTASNIQTNIVTTLGDQCEHWIVGVTRLI 1217
Cdd:pfam07304   65 VSKRLRLLEDSWRSGKLSLPVRRRMDTLSQELQSGHWDAADDIHRSLMVDHVTEVSQWMVGVKRLI 130
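Each hit in this list carries a one-line summary (Pssm-ID, Cd Length, Bit Score, E-value) and an inclusive query interval. As a minimal sketch of reading those fields from saved page text, the helper names `parse_summary` and `interval_length` below are hypothetical, and the regular expression assumes the exact line layout shown on this page rather than any documented export format.

```python
import re

# Assumed layout: the "Pssm-ID: ..." summary line exactly as printed on this page.
SUMMARY_RE = re.compile(
    r"Pssm-ID:\s*(?P<pssm_id>\d+)\s*\[(?P<kind>[^\]]+)\]\s*"
    r"Cd Length:\s*(?P<cd_length>\d+)\s*"
    r"Bit Score:\s*(?P<bit_score>[\d.]+)\s*"
    r"E-value:\s*(?P<evalue>[\d.eE+-]+)"
)


def parse_summary(line: str) -> dict:
    """Parse one 'Pssm-ID: ...' summary line into typed fields."""
    m = SUMMARY_RE.search(line)
    if m is None:
        raise ValueError(f"not a hit summary line: {line!r}")
    return {
        "pssm_id": int(m["pssm_id"]),
        "kind": m["kind"],
        "cd_length": int(m["cd_length"]),
        "bit_score": float(m["bit_score"]),
        "evalue": float(m["evalue"]),
    }


def interval_length(interval: str) -> int:
    """Length in residues of an inclusive 'start-end' query interval."""
    start, end = (int(x) for x in interval.split("-"))
    return end - start + 1


hit = parse_summary(
    "Pssm-ID: 429397 [Multi-domain]  Cd Length: 147  Bit Score: 49.82  E-value: 8.33e-07"
)
print(hit["pssm_id"], hit["evalue"], interval_length("1072-1217"))
```

Parsing the E-value as a float makes it straightforward to sort or filter hits by significance once several summary lines have been collected.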
 
WD40 cd00200
WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions ...
81-332 1.35e-20

WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel β-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment.


Pssm-ID: 238121 [Multi-domain]  Cd Length: 289  Bit Score: 93.55  E-value: 1.35e-20
                         10        20        30        40        50        60        70        80
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 19112374   81 VIAGSlEDGGIGFWDPaailksdEASASIATYKSENGSILGpLDFNRLQPNLLASGDNKgDVWVWDIKHPQQPFALPKQN 160
Cdd:cd00200   66 LASGS-SDKTIRLWDL-------ETGECVRTLTGHTSYVSS-VAFSPDGRILSSSSRDK-TIKVWDVETGKCLTTLRGHT 135
                         90       100       110       120       130       140       150       160
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 19112374  161 RSseVHVVSWNnKVSHILASGNATEYTTVWDVKLKRQVLNLSylgaagvsAATGAVNSIAWHPNNAtRLATAIDDNrnpI 240
Cdd:cd00200  136 DW--VNSVAFS-PDGTFVASSSQDGTIKLWDLRTGKCVATLT--------GHTGEVNSVAFSPDGE-KLLSSSSDG---T 200
                        170       180       190       200       210       220       230       240
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 19112374  241 ILTWDLRQPTVPQnILTGHQKAALSLSWCPeDPTFLLSSGKDGRAMVWNVETGESLGSFPRSGNWYTKSSWCPsNSNRVA 320
Cdd:cd00200  201 IKLWDLSTGKCLG-TLRGHENGVNSVAFSP-DGYLLASGSEDGTIRVWDLRTGECVQTLSGHTNSVTSLAWSP-DGKRLA 277
                        250
                 ....*....|..
gi 19112374  321 VASLEGKVSIFS 332
Cdd:cd00200  278 SGSADGTIRIWD 289
Atrophin-1 pfam03154
Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian ...
793-1121 6.70e-17

Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian atrophy (DRPLA) gene. DRPLA OMIM:125370 is a progressive neurodegenerative disorder. It is caused by the expansion of a CAG repeat in the DRPLA gene on chromosome 12p. This results in an extended polyglutamine region in atrophin-1, that is thought to confer toxicity to the protein, possibly through altering its interactions with other proteins. The expansion of a CAG repeat is also the underlying defect in six other neurodegenerative disorders, including Huntington's disease. One interaction of expanded polyglutamine repeats that is thought to be pathogenic is that with the short glutamine repeat in the transcriptional coactivator CREB binding protein, CBP. This interaction draws CBP away from its usual nuclear location to the expanded polyglutamine repeat protein aggregates that are characteristic of the polyglutamine neurodegenerative disorders. This interferes with CBP-mediated transcription and causes cytotoxicity.


Pssm-ID: 460830 [Multi-domain]  Cd Length: 991  Bit Score: 86.36  E-value: 6.70e-17
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 19112374    793 LEPHAVPPIHQIKQTGYAPVQPKTSQASSILPTVPRTTSYTSPyattsshITPAdvhPLPPPSTSTTAGWNDAPML-GQL 871
Cdd:pfam03154  206 VPPQGSPATSQPPNQTQSTAAPHTLIQQTPTLHPQRLPSPHPP-------LQPM---TQPPPPSQVSPQPLPQPSLhGQM 275
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 19112374    872 PMRRAAPSMAPVRSPFPGASsaQPAAMSRTSSVSTLPPPPPTASmtasapaiasppppkvgetyhPPTASGTRVPPVQQP 951
Cdd:pfam03154  276 PPMPHSLQTGPSHMQHPVPP--QPFPLTPQSSQSQVPPGPSPAA---------------------PGQSQQRIHTPPSQS 332
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 19112374    952 SHPNPYTPVAPQSPVAAASRISSSPnmPPSNPYTPIAVASSTVNPAHTYKPHG---GSQIVPPP--KQPANRVVPLPPTA 1026
Cdd:pfam03154  333 QLQSQQPPREQPLPPAPLSMPHIKP--PPTTPIPQLPNPQSHKHPPHLSGPSPfqmNSNLPPPPalKPLSSLSTHHPPSA 410
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 19112374   1027 SQRASAYEPPTVSVPSPSALSPSVT--PQLPPVSSRLPPVSATRPQIPQPP-PVSTALPSSSAVSRPPIATSAGRSSTAA 1103
Cdd:pfam03154  411 HPPPLQLMPQSQQLPPPPAQPPVLTqsQSLPPPAASHPPTSGLHQVPSQSPfPQHPFVPGGPPPITPPSGPPTSTSSAMP 490
                          330
                   ....*....|....*...
gi 19112374   1104 STSAPLTYPAGDRSHIPG 1121
Cdd:pfam03154  491 GIQPPSSASVSSSGPVPA 508
PLN00181 PLN00181
protein SPA1-RELATED; Provisional
211-335 5.25e-05

protein SPA1-RELATED; Provisional


Pssm-ID: 177776 [Multi-domain]  Cd Length: 793  Bit Score: 47.77  E-value: 5.25e-05
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 19112374   211 AATGAVNSIAWHPNNATRLATAiddNRNPIILTWDL-RQPTVPQniLTGHQKAALSLSWCPEDPTFLLSSGKDGRAMVWN 289
Cdd:PLN00181  530 ASRSKLSGICWNSYIKSQVASS---NFEGVVQVWDVaRSQLVTE--MKEHEKRVWSIDYSSADPTLLASGSDDGSVKLWS 604
                          90       100       110       120       130
                  ....*....|....*....|....*....|....*....|....*....|.
gi 19112374   290 VETGESLGSFPrsgnwyTKSSWC----PSNSNR-VAVASLEGKVSIFSIQS 335
Cdd:PLN00181  605 INQGVSIGTIK------TKANICcvqfPSESGRsLAFGSADHKVYYYDLRN 649
KLF10_11_N cd21974
N-terminal domain of Kruppel-like factor (KLF) 10, KLF11, and similar proteins; This subfamily ...
950-1084 1.83e-04

N-terminal domain of Kruppel-like factor (KLF) 10, KLF11, and similar proteins; This subfamily is composed of Kruppel-like factor or Krueppel-like factor (KLF) 10, KLF11, and similar proteins. KLF10 was first identified in human osteoblasts and plays a role in mediating estrogen (E2) signaling in bone and skeletal homeostasis and a regulatory role in tumor formation and metastasis. KLF11 is involved in cell growth, apoptosis, cellular inflammation and differentiation, endometriosis, and cholesterol, prostaglandin, neurotransmitter, fat, and sugar metabolism. KLF9, KLF10, KLF11, KLF13, KLF14, and KLF16 share a conserved α-helical motif AA/VXXL that mediates their binding to Sin3A and their activities as transcriptional repressors. KLF10/11 belong to a family of proteins, called the Specificity Protein (SP)/KLF family, characterized by a C-terminal DNA-binding domain of 81 amino acids consisting of three Kruppel-like C2H2 zinc fingers. These factors bind to a loose consensus motif, namely NNRCRCCYY (where N is any nucleotide; R is A/G, and Y is C/T), such as the recurring motifs in GC and GT boxes (5'-GGGGCGGGG-3' and 5'-GGTGTGGGG-3') that are present in promoters and more distal regulatory elements of mammalian genes. Members of the KLF family can act as activators or repressors of transcription depending on cell and promoter context. KLFs regulate various cellular functions, such as proliferation, differentiation, and apoptosis, as well as the development and homeostasis of several types of tissue. In addition to the C-terminal DNA-binding domain, each KLF also has a unique N-terminal activation/repression domain that confers specificity and allows it to bind specifically to a certain partner, leading to distinct activities in vivo. This model represents the N-terminal domain of KLF10, KLF11, and similar proteins.


Pssm-ID: 409243 [Multi-domain]  Cd Length: 229  Bit Score: 44.54  E-value: 1.83e-04
                         10        20        30        40        50        60        70        80
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 19112374  950 QPSHPNPYTPV--------APQSPVAAASriSSSPNMPPsnPYTPIAVASSTVNPAHTYKPHGGSQIVPPPKQPANRVvp 1021
Cdd:cd21974   24 RLRKPRPLTPSsdssdeddAPESPKDFHS--LSSLCMTP--PYSPPFFEASHSPSVASLHPPSAASSQPPPEPESSEP-- 97
                         90       100       110       120       130       140       150
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 19112374 1022 lPPTASQRASAyepptVSV-----PSPSALSPSVTPQLPPVSSRLPPVSA--TRPQIPQPPPVSTALPSS 1084
Cdd:cd21974   98 -PAASPQRAQA-----TSVirhtaDPVPVSPPPVLCQMLPVSSSSGVIVAflKAPQQPSPQPQKPALPQP 161
half-pint TIGR01645
poly-U binding splicing factor, half-pint family; The proteins represented by this model ...
930-1079 6.53e-04

poly-U binding splicing factor, half-pint family; The proteins represented by this model contain three RNA recognition motifs (rrm: pfam00076) and have been characterized as poly-pyrimidine tract binding proteins associated with RNA splicing factors. PUF60 (GP|6176532), in complex with p54 and in the presence of U2AF, facilitates association of U2 snRNP with pre-mRNA.


Pssm-ID: 130706 [Multi-domain]  Cd Length: 612  Bit Score: 43.91  E-value: 6.53e-04
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 19112374    930 KVGETYHPPTasgtrvpPVQQPSHPNPYTPVAPQSPVAAASRI--------SSSPNMPPSNPYTPIAVASSTVnpahtyk 1001
Cdd:TIGR01645  277 RVGKCVTPPD-------ALLQPATVSAIPAAAAVAAAAATAKImaaeavagAAVLGPRAQSPATPSSSLPTDI------- 342
                           90       100       110       120       130       140       150
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*...
gi 19112374   1002 phGGSQIVPPPKQPANRVVPLPPTAsqrASAYEPPTVSVPSPsalspsVTPQLPPVSSRLPPVSATRPQIPQPPPVST 1079
Cdd:TIGR01645  343 --GNKAVVSSAKKEAEEVPPLPQAA---PAVVKPGPMEIPTP------VPPPGLAIPSLVAPPGLVAPTEINPSFLAS 409
WD40 smart00320
WD40 repeats; Note that these repeats are permuted with respect to the structural repeats ...
252-289 7.72e-04

WD40 repeats; Note that these repeats are permuted with respect to the structural repeats (blades) of the beta propeller domain.


Pssm-ID: 197651 [Multi-domain]  Cd Length: 40  Bit Score: 38.06  E-value: 7.72e-04
                            10        20        30
                    ....*....|....*....|....*....|....*...
gi 19112374     252 PQNILTGHQKAALSLSWCPeDPTFLLSSGKDGRAMVWN 289
Cdd:smart00320    4 LLKTLKGHTGPVTSVAFSP-DGKYLASGSDDGTIKLWD 40
WD40 pfam00400
WD domain, G-beta repeat;
252-289 8.38e-04

WD domain, G-beta repeat;


Pssm-ID: 459801 [Multi-domain]  Cd Length: 39  Bit Score: 38.10  E-value: 8.38e-04
                           10        20        30
                   ....*....|....*....|....*....|....*...
gi 19112374    252 PQNILTGHQKAALSLSWCPeDPTFLLSSGKDGRAMVWN 289
Cdd:pfam00400    3 LLKTLEGHTGSVTSLAFSP-DGKLLASGSDDGTVKVWD 39
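The cd00200 description earlier in this list characterizes a WD40 repeat by a GH dipeptide near its N-terminus and a WD dipeptide at its C-terminus. As a minimal sketch, the hypothetical helper `repeat_landmarks` below checks both rows of the single-repeat hit above for those two landmarks. It is a plain string check under the assumption that dashes are gaps and lowercase letters are unaligned residues, not a substitute for the profile search that produced the hit.

```python
def repeat_landmarks(row: str) -> dict:
    """Report GH/WD landmarks in one alignment row.

    Gap characters are dropped and unaligned (lowercase) residues are
    upper-cased, so offsets count residues of the underlying sequence.
    """
    seq = row.replace("-", "").upper()
    return {
        "length": len(seq),
        "gh_offset": seq.find("GH"),  # -1 when no GH dipeptide is present
        "ends_with_wd": seq.endswith("WD"),
    }


# Rows copied from the pfam00400 hit above (query 252-289, model 3-39).
query = repeat_landmarks("PQNILTGHQKAALSLSWCPeDPTFLLSSGKDGRAMVWN")
model = repeat_landmarks("LLKTLEGHTGSVTSLAFSP-DGKLLASGSDDGTVKVWD")
print(query)
print(model)
```

Both rows place GH at the same offset, while the query segment ends in WN where the model row ends in WD.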
Amelogenin smart00818
Amelogenins, cell adhesion proteins, play a role in the biomineralisation of teeth; They seem ...
986-1076 1.32e-03

Amelogenins, cell adhesion proteins, play a role in the biomineralisation of teeth; They seem to regulate formation of crystallites during the secretory stage of tooth enamel development and are thought to play a major role in the structural organisation and mineralisation of developing enamel. The extracellular matrix of the developing enamel comprises two major classes of protein: the hydrophobic amelogenins and the acidic enamelins. Circular dichroism studies of porcine amelogenin have shown that the protein consists of 3 discrete folding units: the N-terminal region appears to contain beta-strand structures, while the C-terminal region displays characteristics of a random coil conformation. Subsequent studies on the bovine protein have indicated the amelogenin structure to contain a repetitive beta-turn segment and a "beta-spiral" between Gln112 and Leu138, which sequester a (Pro, Leu, Gln) rich region. The beta-spiral offers a probable site for interactions with Ca2+ ions. Mutations in the human amelogenin gene (AMGX) cause X-linked hypoplastic amelogenesis imperfecta, a disease characterised by defective enamel. A 9bp deletion in exon 2 of AMGX results in the loss of codons for Ile5, Leu6, Phe7 and Ala8, and replacement by a new threonine codon, disrupting the 16-residue (Met1-Ala16) amelogenin signal peptide.


Pssm-ID: 197891 [Multi-domain]  Cd Length: 165  Bit Score: 40.93  E-value: 1.32e-03
                            10        20        30        40        50        60        70        80
                    ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 19112374     986 PIAVASSTVNPAHTYKPHGGSQIVPP--PKQPANRVVPLPP----TASQRASAYEPPTVSVPS-PSALSPsvtPQLPPVS 1058
Cdd:smart00818   38 QIIPVSQQHPPTHTLQPHHHIPVLPAqqPVVPQQPLMPVPGqhsmTPTQHHQPNLPQPAQQPFqPQPLQP---PQPQQPM 114
                            90
                    ....*....|....*...
gi 19112374    1059 SRLPPVSATRPQIPQPPP 1076
Cdd:smart00818  115 QPQPPVHPIPPLPPQPPL 132
DamX COG3266
Cell division protein DamX, binds to the septal ring, contains C-terminal SPOR domain [Cell ...
936-1104 6.92e-03

Cell division protein DamX, binds to the septal ring, contains C-terminal SPOR domain [Cell cycle control, cell division, chromosome partitioning];


Pssm-ID: 442497 [Multi-domain]  Cd Length: 455  Bit Score: 40.60  E-value: 6.92e-03
                         10        20        30        40        50        60        70        80
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 19112374  936 HPPTASGTRVPPVQQPSHPNPYTPVAPQSPVAAASRISSSPNMPPSNPYTPiAVASSTVNPAHTYKPHGGSQIVPPPKQP 1015
Cdd:COG3266  208 LLLLLASALGEAVAAAAELAALALLAAGAAEVLTARLVLLLLIIGSALKAP-SQASSASAPATTSLGEQQEVSLPPAVAA 286
                         90       100       110       120       130       140       150       160
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 19112374 1016 AnrvvPLPPTASQRASAyEPPTVSVPSPSALSPsvTPQLPPVSSRLPPVSATRPQIPQPPPV---STALPSSSAVSRPPI 1092
Cdd:COG3266  287 Q----PAAAAAAQPSAV-ALPAAPAAAAAAAAP--AEAAAPQPTAAKPVVTETAAPAAPAPEaaaAAAAPAAPAVAKKLA 359
                        170
                 ....*....|..
gi 19112374 1093 ATSAGRSSTAAS 1104
Cdd:COG3266  360 ADEQWLASQPAS 371
 
WD40 COG2319
WD40 repeat [General function prediction only];
80-335 4.28e-20

WD40 repeat [General function prediction only];


Pssm-ID: 441893 [Multi-domain]  Cd Length: 403  Bit Score: 94.21  E-value: 4.28e-20
                         10        20        30        40        50        60        70        80
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 19112374   80 GVIAGSLEDGGIGFWDPAAilksdeaSASIATYKSENGSILGpLDFNrlqPN--LLASGDNKGDVWVWDIKHPQQPFALP 157
Cdd:COG2319   91 RLLASASADGTVRLWDLAT-------GLLLRTLTGHTGAVRS-VAFS---PDgkTLASGSADGTVRLWDLATGKLLRTLT 159
                         90       100       110       120       130       140       150       160
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 19112374  158 KQnrSSEVHVVSWNNKvSHILASGNATEYTTVWDVKLKRQVLNLSylgaagvsAATGAVNSIAWHPNNaTRLATAIDDNR 237
Cdd:COG2319  160 GH--SGAVTSVAFSPD-GKLLASGSDDGTVRLWDLATGKLLRTLT--------GHTGAVRSVAFSPDG-KLLASGSADGT 227
                        170       180       190       200       210       220       230       240
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 19112374  238 npiILTWDLRQPTVPQnILTGHQKAALSLSWCPeDPTFLLSSGKDGRAMVWNVETGESLGSFPRSGNWYTKSSWCPsNSN 317
Cdd:COG2319  228 ---VRLWDLATGKLLR-TLTGHSGSVRSVAFSP-DGRLLASGSADGTVRLWDLATGELLRTLTGHSGGVNSVAFSP-DGK 301
                        250
                 ....*....|....*...
gi 19112374  318 RVAVASLEGKVSIFSIQS 335
Cdd:COG2319  302 LLASGSDDGTVRLWDLAT 319
WD40 COG2319
WD40 repeat [General function prediction only];
130-335 2.81e-19

WD40 repeat [General function prediction only];


Pssm-ID: 441893 [Multi-domain]  Cd Length: 403  Bit Score: 91.90  E-value: 2.81e-19
                         10        20        30        40        50        60        70        80
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 19112374  130 PNLLASGDNKGDVWVWDIKHPQQPFALpkQNRSSEVHVVSWnNKVSHILASGNATEYTTVWDVKLKRQVLNLsylgaagv 209
Cdd:COG2319   90 GRLLASASADGTVRLWDLATGLLLRTL--TGHTGAVRSVAF-SPDGKTLASGSADGTVRLWDLATGKLLRTL-------- 158
                         90       100       110       120       130       140       150       160
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 19112374  210 SAATGAVNSIAWHPNNaTRLATAIDDNRnpiILTWDLRQPTVPQnILTGHQKAALSLSWCPeDPTFLLSSGKDGRAMVWN 289
Cdd:COG2319  159 TGHSGAVTSVAFSPDG-KLLASGSDDGT---VRLWDLATGKLLR-TLTGHTGAVRSVAFSP-DGKLLASGSADGTVRLWD 232
                        170       180       190       200
                 ....*....|....*....|....*....|....*....|....*.
gi 19112374  290 VETGESLGSFPRSGNWYTKSSWCPsNSNRVAVASLEGKVSIFSIQS 335
Cdd:COG2319  233 LATGKLLRTLTGHSGSVRSVAFSP-DGRLLASGSADGTVRLWDLAT 277
PHA03247 PHA03247
large tegument protein UL36; Provisional
780-1110 8.20e-19

large tegument protein UL36; Provisional


Pssm-ID: 223021 [Multi-domain]  Cd Length: 3151  Bit Score: 93.46  E-value: 8.20e-19
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 19112374   780 PGAKEEIQRLTMLLEPHAVPPIHQIKQTGYAPVQPKTSQASSILPTVPRTTSYTSPYATTSSHITPADVHPLPPPSTSTT 859
Cdd:PHA03247 2686 RAARPTVGSLTSLADPPPPPPTPEPAPHALVSATPLPPGPAAARQASPALPAAPAPPAVPAGPATPGGPARPARPPTTAG 2765
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 19112374   860 AGWNDAPmlgqlpmrRAAPSMAPVRSPFPGASSAQPaamSRTSSVSTLPPPPPTASMTASAPAIASPPPPKVGEtyhPPT 939
Cdd:PHA03247 2766 PPAPAPP--------AAPAAGPPRRLTRPAVASLSE---SRESLPSPWDPADPPAAVLAPAAALPPAASPAGPL---PPP 2831
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 19112374   940 ASGTRVPPVQQPSHPNPYTP----VAPQSPVAAASRISSSPNMPPSNPYTPiavASSTVNPAHTYKPHggSQIVPPPKQP 1015
Cdd:PHA03247 2832 TSAQPTAPPPPPGPPPPSLPlggsVAPGGDVRRRPPSRSPAAKPAAPARPP---VRRLARPAVSRSTE--SFALPPDQPE 2906
                         250       260       270       280       290       300       310       320
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 19112374  1016 anrvvPLPPTASQrasayEPPTVSVPSPSALSPSVTPQLPPVS-SRLPPVSATRPQiPQPPPVSTA------LPSSSAVS 1088
Cdd:PHA03247 2907 -----RPPQPQAP-----PPPQPQPQPPPPPQPQPPPPPPPRPqPPLAPTTDPAGA-GEPSGAVPQpwlgalVPGRVAVP 2975
                         330       340
                  ....*....|....*....|..
gi 19112374  1089 RPPIATSAGRSSTAASTSAPLT 1110
Cdd:PHA03247 2976 RFRVPQPAPSREAPASSTPPLT 2997
PHA03247 PHA03247
large tegument protein UL36; Provisional
810-1147 5.20e-18

large tegument protein UL36; Provisional


Pssm-ID: 223021 [Multi-domain]  Cd Length: 3151  Bit Score: 90.77  E-value: 5.20e-18
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 19112374   810 APVQPKTsqassilPTVPRTTSyTSPYATTSSHITPADVHPLPPPSTSTTAGWNDAPMLGQLPMR--------------- 874
Cdd:PHA03247 2591 APPQSAR-------PRAPVDDR-GDPRGPAPPSPLPPDTHAPDPPPPSPSPAANEPDPHPPPTVPpperprddpapgrvs 2662
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 19112374   875 ---------RAAPSMAPVRSPFPGAssAQPAAMSRTSSVSTLPPPPPTASMTASAPAIASPPPPKVGETYHPPTASGTRV 945
Cdd:PHA03247 2663 rprrarrlgRAAQASSPPQRPRRRA--ARPTVGSLTSLADPPPPPPTPEPAPHALVSATPLPPGPAAARQASPALPAAPA 2740
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 19112374   946 PPvqqpshPNPYTPVAPQSPVAAASRISSSPNMPPSNPYTPIAVASSTVNPAHTYKPHGGSQIVPPPKQPANRVVPLP-- 1023
Cdd:PHA03247 2741 PP------AVPAGPATPGGPARPARPPTTAGPPAPAPPAAPAAGPPRRLTRPAVASLSESRESLPSPWDPADPPAAVLap 2814
                         250       260       270       280       290       300       310       320
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 19112374  1024 ----PTASQRASAYEPPTVSVPSPSALSPsvtpqlPPVSSRLPPVSATRPQIP--QPPPVSTALPSSSAVSRPPIATSAG 1097
Cdd:PHA03247 2815 aaalPPAASPAGPLPPPTSAQPTAPPPPP------GPPPPSLPLGGSVAPGGDvrRRPPSRSPAAKPAAPARPPVRRLAR 2888
                         330       340       350       360       370
                  ....*....|....*....|....*....|....*....|....*....|
gi 19112374  1098 RSSTAASTSAPLTYPAGDRSHIPGNLRPIYEMLNAELQRVSQSLPPQMSR 1147
Cdd:PHA03247 2889 PAVSRSTESFALPPDQPERPPQPQAPPPPQPQPQPPPPPQPQPPPPPPPR 2938
Atrophin-1 pfam03154
Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian ...
793-1121 6.70e-17

Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian atrophy (DRPLA) gene. DRPLA (OMIM:125370) is a progressive neurodegenerative disorder. It is caused by the expansion of a CAG repeat in the DRPLA gene on chromosome 12p. This results in an extended polyglutamine region in atrophin-1 that is thought to confer toxicity to the protein, possibly through altering its interactions with other proteins. The expansion of a CAG repeat is also the underlying defect in six other neurodegenerative disorders, including Huntington's disease. One interaction of expanded polyglutamine repeats that is thought to be pathogenic is that with the short glutamine repeat in the transcriptional coactivator CREB binding protein, CBP. This interaction draws CBP away from its usual nuclear location to the expanded polyglutamine repeat protein aggregates that are characteristic of the polyglutamine neurodegenerative disorders. This interferes with CBP-mediated transcription and causes cytotoxicity.


Pssm-ID: 460830 [Multi-domain]  Cd Length: 991  Bit Score: 86.36  E-value: 6.70e-17
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 19112374    793 LEPHAVPPIHQIKQTGYAPVQPKTSQASSILPTVPRTTSYTSPyattsshITPAdvhPLPPPSTSTTAGWNDAPML-GQL 871
Cdd:pfam03154  206 VPPQGSPATSQPPNQTQSTAAPHTLIQQTPTLHPQRLPSPHPP-------LQPM---TQPPPPSQVSPQPLPQPSLhGQM 275
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 19112374    872 PMRRAAPSMAPVRSPFPGASsaQPAAMSRTSSVSTLPPPPPTASmtasapaiasppppkvgetyhPPTASGTRVPPVQQP 951
Cdd:pfam03154  276 PPMPHSLQTGPSHMQHPVPP--QPFPLTPQSSQSQVPPGPSPAA---------------------PGQSQQRIHTPPSQS 332
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 19112374    952 SHPNPYTPVAPQSPVAAASRISSSPnmPPSNPYTPIAVASSTVNPAHTYKPHG---GSQIVPPP--KQPANRVVPLPPTA 1026
Cdd:pfam03154  333 QLQSQQPPREQPLPPAPLSMPHIKP--PPTTPIPQLPNPQSHKHPPHLSGPSPfqmNSNLPPPPalKPLSSLSTHHPPSA 410
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 19112374   1027 SQRASAYEPPTVSVPSPSALSPSVT--PQLPPVSSRLPPVSATRPQIPQPP-PVSTALPSSSAVSRPPIATSAGRSSTAA 1103
Cdd:pfam03154  411 HPPPLQLMPQSQQLPPPPAQPPVLTqsQSLPPPAASHPPTSGLHQVPSQSPfPQHPFVPGGPPPITPPSGPPTSTSSAMP 490
                          330
                   ....*....|....*...
gi 19112374   1104 STSAPLTYPAGDRSHIPG 1121
Cdd:pfam03154  491 GIQPPSSASVSSSGPVPA 508
Atrophin-1 pfam03154
Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian ...
795-1098 1.94e-16

Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian atrophy (DRPLA) gene. DRPLA (OMIM:125370) is a progressive neurodegenerative disorder. It is caused by the expansion of a CAG repeat in the DRPLA gene on chromosome 12p. This results in an extended polyglutamine region in atrophin-1 that is thought to confer toxicity to the protein, possibly through altering its interactions with other proteins. The expansion of a CAG repeat is also the underlying defect in six other neurodegenerative disorders, including Huntington's disease. One interaction of expanded polyglutamine repeats that is thought to be pathogenic is that with the short glutamine repeat in the transcriptional coactivator CREB binding protein, CBP. This interaction draws CBP away from its usual nuclear location to the expanded polyglutamine repeat protein aggregates that are characteristic of the polyglutamine neurodegenerative disorders. This interferes with CBP-mediated transcription and causes cytotoxicity.


Pssm-ID: 460830 [Multi-domain]  Cd Length: 991  Bit Score: 85.20  E-value: 1.94e-16
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 19112374    795 PHAVPP--IHQIKQTGYAPVQPKTSQAssiLPTVPRTTSYTSPYATTSSHITPADVHPLPPpststtagwndAPMlgqlP 872
Cdd:pfam03154  290 QHPVPPqpFPLTPQSSQSQVPPGPSPA---APGQSQQRIHTPPSQSQLQSQQPPREQPLPP-----------APL----S 351
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 19112374    873 MRRAAPsmaPVRSPFPGASSAQ----PAAMSRTSSV---STLPPPP---PTASMTasapaiasppppkvgeTYHPPTAsg 942
Cdd:pfam03154  352 MPHIKP---PPTTPIPQLPNPQshkhPPHLSGPSPFqmnSNLPPPPalkPLSSLS----------------THHPPSA-- 410
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 19112374    943 tRVPPVQQPSHPNPYTPVAPQSPVaaasrISSSPNMPPSNPYTPIAVASSTVNPAHTYKPH----GGSQIVPPPKQPanr 1018
Cdd:pfam03154  411 -HPPPLQLMPQSQQLPPPPAQPPV-----LTQSQSLPPPAASHPPTSGLHQVPSQSPFPQHpfvpGGPPPITPPSGP--- 481
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 19112374   1019 vvplPPTASQRASAYEPPTVSVPSPSALSP-SVTPQLPPVSSR-LPPVSATRPQIPQPPPVSTAlPSSSAVSRPPIATSA 1096
Cdd:pfam03154  482 ----PTSTSSAMPGIQPPSSASVSSSGPVPaAVSCPLPPVQIKeEALDEAEEPESPPPPPRSPS-PEPTVVNTPSHASQS 556

                   ..
gi 19112374   1097 GR 1098
Cdd:pfam03154  557 AR 558
WD40 cd00200
WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions ...
108-353 3.31e-16

WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel β-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment.


Pssm-ID: 238121 [Multi-domain]  Cd Length: 289  Bit Score: 80.46  E-value: 3.31e-16
                         10        20        30        40        50        60        70        80
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 19112374  108 SIATYKSENGSILGpLDFNRlQPNLLASGDNKGDVWVWDIKHPQQPFALpkQNRSSEVHVVSWNNKVSHILASGNATeYT 187
Cdd:cd00200    1 LRRTLKGHTGGVTC-VAFSP-DGKLLATGSGDGTIKVWDLETGELLRTL--KGHTGPVRDVAASADGTYLASGSSDK-TI 75
                         90       100       110       120       130       140       150       160
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 19112374  188 TVWDVKLKRQVLNLSylgaagvsAATGAVNSIAWHPNNaTRLATAIDDNRnpIILtWDLRQPTvPQNILTGHQKAALSLS 267
Cdd:cd00200   76 RLWDLETGECVRTLT--------GHTSYVSSVAFSPDG-RILSSSSRDKT--IKV-WDVETGK-CLTTLRGHTDWVNSVA 142
                        170       180       190       200       210       220       230       240
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 19112374  268 WCPeDPTFLLSSGKDGRAMVWNVETGESLGSFPRSGNWYTKSSWCPsNSNRVAVASLEGKVSIFSIQSTNTDKSQEASIK 347
Cdd:cd00200  143 FSP-DGTFVASSSQDGTIKLWDLRTGKCVATLTGHTGEVNSVAFSP-DGEKLLSSSSDGTIKLWDLSTGKCLGTLRGHEN 220

                 ....*.
gi 19112374  348 GATSID 353
Cdd:cd00200  221 GVNSVA 226
Atrophin-1 pfam03154
Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian ...
802-1125 3.71e-14

Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian atrophy (DRPLA) gene. DRPLA (OMIM:125370) is a progressive neurodegenerative disorder. It is caused by the expansion of a CAG repeat in the DRPLA gene on chromosome 12p. This results in an extended polyglutamine region in atrophin-1 that is thought to confer toxicity to the protein, possibly through altering its interactions with other proteins. The expansion of a CAG repeat is also the underlying defect in six other neurodegenerative disorders, including Huntington's disease. One interaction of expanded polyglutamine repeats that is thought to be pathogenic is that with the short glutamine repeat in the transcriptional coactivator CREB binding protein, CBP. This interaction draws CBP away from its usual nuclear location to the expanded polyglutamine repeat protein aggregates that are characteristic of the polyglutamine neurodegenerative disorders. This interferes with CBP-mediated transcription and causes cytotoxicity.


Pssm-ID: 460830 [Multi-domain]  Cd Length: 991  Bit Score: 77.50  E-value: 3.71e-14
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 19112374    802 HQIKQTGYAPVQPKTSQASSILPTVPRTTSYTSPYATTSSHITPADVHPL--PPPSTSTtagwndaPMLGQLPMRRAAPS 879
Cdd:pfam03154  164 QQILQTQPPVLQAQSGAASPPSPPPPGTTQAATAGPTPSAPSVPPQGSPAtsQPPNQTQ-------STAAPHTLIQQTPT 236
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 19112374    880 MAPVRSPFPgASSAQPAamsrtssvsTLPPPPPTASMTASAPAIASPPPPKVGEtyhpPTASGTRVPPVQQPSHPNPYTP 959
Cdd:pfam03154  237 LHPQRLPSP-HPPLQPM---------TQPPPPSQVSPQPLPQPSLHGQMPPMPH----SLQTGPSHMQHPVPPQPFPLTP 302
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 19112374    960 VAPQSPV------AAASRISSSPNMPPSNPYTPIAVA--SSTVNPAHTYKPHggsqIVPPPKQPanrVVPLPPTASQRas 1031
Cdd:pfam03154  303 QSSQSQVppgpspAAPGQSQQRIHTPPSQSQLQSQQPprEQPLPPAPLSMPH----IKPPPTTP---IPQLPNPQSHK-- 373
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 19112374   1032 ayEPPTVSVPSP----------------SALS----PSVTP---QLPPVSSRLPPVSATRPQIPQ----PPPVSTALPSS 1084
Cdd:pfam03154  374 --HPPHLSGPSPfqmnsnlppppalkplSSLSthhpPSAHPpplQLMPQSQQLPPPPAQPPVLTQsqslPPPAASHPPTS 451
                          330       340       350       360
                   ....*....|....*....|....*....|....*....|....
gi 19112374   1085 S---AVSRPPIATSAGRSSTAASTSAPLTYPAGDRSHIPGNLRP 1125
Cdd:pfam03154  452 GlhqVPSQSPFPQHPFVPGGPPPITPPSGPPTSTSSAMPGIQPP 495
PHA03247 PHA03247
large tegument protein UL36; Provisional
884-1125 5.99e-13

large tegument protein UL36; Provisional


Pssm-ID: 223021 [Multi-domain]  Cd Length: 3151  Bit Score: 74.20  E-value: 5.99e-13
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 19112374   884 RSPFPGASSAQPAAMSRTSSVST-------LPPPPPTASMTASAPAIASPPPPKVGETYHP------------------- 937
Cdd:PHA03247 2471 GELFPGAPVYRRPAEARFPFAAGaapdpggGGPPDPDAPPAPSRLAPAILPDEPVGEPVHPrmltwirgleelasddagd 2550
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 19112374   938 ----------PTASGTRVPPVQ---QPSHPN----------PYTPVAPQSPVA--AASRISSSPNMPPSNPYTPIAVASS 992
Cdd:PHA03247 2551 pppplppaapPAAPDRSVPPPRpapRPSEPAvtsrarrpdaPPQSARPRAPVDdrGDPRGPAPPSPLPPDTHAPDPPPPS 2630
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 19112374   993 TvNPAHTYKPHGGSQIVPPPKQPANRVVPLPPTASQRASAYEPPTVSVPSPSALSPSVTPqlPPVSsrlPPVSATRPQIP 1072
Cdd:PHA03247 2631 P-SPAANEPDPHPPPTVPPPERPRDDPAPGRVSRPRRARRLGRAAQASSPPQRPRRRAAR--PTVG---SLTSLADPPPP 2704
                         250       260       270       280       290
                  ....*....|....*....|....*....|....*....|....*....|...
gi 19112374  1073 QPPPVSTALPSSSAVSRPPiATSAGRSSTAASTSAPLTYPAGDRSHIPGNLRP 1125
Cdd:PHA03247 2705 PPTPEPAPHALVSATPLPP-GPAAARQASPALPAAPAPPAVPAGPATPGGPAR 2756
PHA03307 PHA03307
transcriptional regulator ICP4; Provisional
795-1115 1.96e-12

transcriptional regulator ICP4; Provisional


Pssm-ID: 223039 [Multi-domain]  Cd Length: 1352  Bit Score: 72.13  E-value: 1.96e-12
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 19112374   795 PHAVPPIHQIKQTGYAPVQPkTSQASSILPTVPRT----TSYTSPYATTSSHITPADVHPLPPPStsttagwndaPMLGQ 870
Cdd:PHA03307  127 PPSPAPDLSEMLRPVGSPGP-PPAASPPAAGASPAavasDAASSRQAALPLSSPEETARAPSSPP----------AEPPP 195
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 19112374   871 LPMRRAAPSMAPVRSPFPGASSAQPAAMSRTSSVStlppPPPTASMTASAPAIASPPPPKVGETYHPPTASGTRVPPVQQ 950
Cdd:PHA03307  196 STPPAAASPRPPRRSSPISASASSPAPAPGRSAAD----DAGASSSDSSSSESSGCGWGPENECPLPRPAPITLPTRIWE 271
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 19112374   951 PSHPNPYTPVAPQSPVAAASRISS---SPNMPPSNPYTPIAVASSTVNPAHTYKPHGGSQIVPPPKQPANRVVPlPPTAS 1027
Cdd:PHA03307  272 ASGWNGPSSRPGPASSSSSPRERSpspSPSSPGSGPAPSSPRASSSSSSSRESSSSSTSSSSESSRGAAVSPGP-SPSRS 350
                         250       260       270       280       290       300       310       320
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 19112374  1028 QRASAYEPPTVSVPSPSALSPSVTPQLPPVSSRLPPVSATRPQIPQPPPVSTALPSSSAVSRPPIATSAGRSSTAASTSA 1107
Cdd:PHA03307  351 PSPSRPPPPADPSSPRKRPRPSRAPSSPAASAGRPTRRRARAAVAGRARRRDATGRFPAGRPRPSPLDAGAASGAFYARY 430

                  ....*...
gi 19112374  1108 PLTYPAGD 1115
Cdd:PHA03307  431 PLLTPSGE 438
PHA03307 PHA03307
transcriptional regulator ICP4; Provisional
811-1121 1.22e-11

transcriptional regulator ICP4; Provisional


Pssm-ID: 223039 [Multi-domain]  Cd Length: 1352  Bit Score: 69.43  E-value: 1.22e-11
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 19112374   811 PVQPKTSQASSILPTVPRTTSYTSPYATTSSHITPADVHPLPPPSTSTTAGWNDAPMLGQLPMRRAAPSMAPV---RSPF 887
Cdd:PHA03307   75 PGTEAPANESRSTPTWSLSTLAPASPAREGSPTPPGPSSPDPPPPTPPPASPPPSPAPDLSEMLRPVGSPGPPpaaSPPA 154
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 19112374   888 PGASSAQPAAMSRTSSVSTLPPPPPTASmTASAPAIASPPPPKVGETYHPPTASGTRVPPVQQPSHPNPYTPVAPQSPVA 967
Cdd:PHA03307  155 AGASPAAVASDAASSRQAALPLSSPEET-ARAPSSPPAEPPPSTPPAAASPRPPRRSSPISASASSPAPAPGRSAADDAG 233
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 19112374   968 AASRISSSP----------NMPPSNPYTPIAVASSTVNPAH---------TYKPHGGSQIVPPPKQPANRVVPLPPTASq 1028
Cdd:PHA03307  234 ASSSDSSSSessgcgwgpeNECPLPRPAPITLPTRIWEASGwngpssrpgPASSSSSPRERSPSPSPSSPGSGPAPSSP- 312
                         250       260       270       280       290       300       310       320
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 19112374  1029 RASAYEPPTVSVPSPSALSPSVTPQLPPVSSRLPPVSATRPQIPQPPPVSTALPSSSAVSRPPIATSAGRSSTAASTSAP 1108
Cdd:PHA03307  313 RASSSSSSSRESSSSSTSSSSESSRGAAVSPGPSPSRSPSPSRPPPPADPSSPRKRPRPSRAPSSPAASAGRPTRRRARA 392
                         330
                  ....*....|...
gi 19112374  1109 LTYPAGDRSHIPG 1121
Cdd:PHA03307  393 AVAGRARRRDATG 405
Atrophin-1 pfam03154
Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian ...
850-1091 3.64e-11

Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian atrophy (DRPLA) gene. DRPLA (OMIM:125370) is a progressive neurodegenerative disorder. It is caused by the expansion of a CAG repeat in the DRPLA gene on chromosome 12p. This results in an extended polyglutamine region in atrophin-1 that is thought to confer toxicity to the protein, possibly through altering its interactions with other proteins. The expansion of a CAG repeat is also the underlying defect in six other neurodegenerative disorders, including Huntington's disease. One interaction of expanded polyglutamine repeats that is thought to be pathogenic is that with the short glutamine repeat in the transcriptional coactivator CREB binding protein, CBP. This interaction draws CBP away from its usual nuclear location to the expanded polyglutamine repeat protein aggregates that are characteristic of the polyglutamine neurodegenerative disorders. This interferes with CBP-mediated transcription and causes cytotoxicity.


Pssm-ID: 460830 [Multi-domain]  Cd Length: 991  Bit Score: 67.87  E-value: 3.64e-11
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 19112374    850 PLPPPSTSTTAGWNDAPMLGQLPMRRAAPSMAPVRSPFPGASSAQPAAMSRTSSVSTLPP-PPPTASMTASAPAIASppp 928
Cdd:pfam03154  149 PSPQDNESDSDSSAQQQILQTQPPVLQAQSGAASPPSPPPPGTTQAATAGPTPSAPSVPPqGSPATSQPPNQTQSTA--- 225
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 19112374    929 pkvgetyhPPTASGTRVPPVQQPSHPNPYTPVAPQSPVAAASRISSSPnMPPSNPYTPIAVASSTVNPAHTYKPH-GGSQ 1007
Cdd:pfam03154  226 --------APHTLIQQTPTLHPQRLPSPHPPLQPMTQPPPPSQVSPQP-LPQPSLHGQMPPMPHSLQTGPSHMQHpVPPQ 296
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 19112374   1008 IVPPPKQPANRVVPLPPTASQRASAYEPPTVSVPSPSAlspsvTPQLPPVSSRLPPVSATRPQIpQPPPVS--TALPSSS 1085
Cdd:pfam03154  297 PFPLTPQSSQSQVPPGPSPAAPGQSQQRIHTPPSQSQL-----QSQQPPREQPLPPAPLSMPHI-KPPPTTpiPQLPNPQ 370

                   ....*.
gi 19112374   1086 AVSRPP 1091
Cdd:pfam03154  371 SHKHPP 376
Herpes_BLLF1 pfam05109
Herpes virus major outer envelope glycoprotein (BLLF1); This family consists of the BLLF1 ...
807-1117 8.31e-11

Herpes virus major outer envelope glycoprotein (BLLF1); This family consists of the BLLF1 viral late glycoprotein, also termed gp350/220. It is the most abundantly expressed glycoprotein in the viral envelope of the Herpesviruses and is the major antigen responsible for stimulating the production of neutralising antibodies in vivo.


Pssm-ID: 282904 [Multi-domain]  Cd Length: 886  Bit Score: 66.48  E-value: 8.31e-11
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 19112374    807 TGYAPVQPKTSQASSilPTVPRTTSYTSPyatTSSHITPADVHPLPPPSTSTTAGWNDAPMLGqlpmrRAAPSMApVRSP 886
Cdd:pfam05109  483 SGASPVTPSPSPRDN--GTESKAPDMTSP---TSAVTTPTPNATSPTPAVTTPTPNATSPTLG-----KTSPTSA-VTTP 551
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 19112374    887 FPGASSAQPAAMSRT--SSVSTLPPPPPTASMTASAPAIASPPppkVGETYHPP-----TASGTRVPPVQQPSHPNPYTP 959
Cdd:pfam05109  552 TPNATSPTPAVTTPTpnATIPTLGKTSPTSAVTTPTPNATSPT---VGETSPQAnttnhTLGGTSSTPVVTSPPKNATSA 628
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 19112374    960 VAP-QSPVAAASriSSSPNMPPSNPYTPIAVASSTVNPAH-----TYKPHGGSQI--VPPPKQPANRVvplpptaSQRAS 1031
Cdd:pfam05109  629 VTTgQHNITSSS--TSSMSLRPSSISETLSPSTSDNSTSHmplltSAHPTGGENItqVTPASTSTHHV-------STSSP 699
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 19112374   1032 AYEPPTVSVPS-PSALSPSVTPQLPPVSSRLPPVSATRPQIPQPPpvSTALPSSSAVSRPPIATSAGRSST---AASTSA 1107
Cdd:pfam05109  700 APRPGTTSQASgPGNSSTSTKPGEVNVTKGTPPKNATSPQAPSGQ--KTAVPTVTSTGGKANSTTGGKHTTghgARTSTE 777
                          330
                   ....*....|
gi 19112374   1108 PLTYPAGDRS 1117
Cdd:pfam05109  778 PTTDYGGDST 787
PHA03378 PHA03378
EBNA-3B; Provisional
938-1114 5.13e-10

EBNA-3B; Provisional


Pssm-ID: 223065 [Multi-domain]  Cd Length: 991  Bit Score: 63.93  E-value: 5.13e-10
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 19112374   938 PTASGTRVPPVQQPSHPNPYTPVApQSPVAAASRISSSPNMPPsnPYTPIAVASSTVNPAHTYKPHGGSQIVPPPKQP-- 1015
Cdd:PHA03378  590 PSYAQTPWPVPHPSQTPEPPTTQS-HIPETSAPRQWPMPLRPI--PMRPLRMQPITFNVLVFPTPHQPPQVEITPYKPtw 666
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 19112374  1016 -ANRVVPLPPTASQRASAYEP---PTVSVPSPSALSPSVTPQLPPVSSRLPPVSATRPQIPQPPPVSTALPSSSAVSRPP 1091
Cdd:PHA03378  667 tQIGHIPYQPSPTGANTMLPIqwaPGTMQPPPRAPTPMRPPAAPPGRAQRPAAATGRARPPAAAPGRARPPAAAPGRARP 746
                         170       180
                  ....*....|....*....|...
gi 19112374  1092 IATSAGRSSTAASTSAPLTYPAG 1114
Cdd:PHA03378  747 PAAAPGRARPPAAAPGRARPPAA 769
PHA03377 PHA03377
EBNA-3C; Provisional
776-1118 8.22e-10

EBNA-3C; Provisional


Pssm-ID: 177614 [Multi-domain]  Cd Length: 1000  Bit Score: 63.53  E-value: 8.22e-10
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 19112374   776 PTEFPGAKEEIQRLTMLLEPHAVPPIHQIKQTGYAP------VQPKTSQASSILPTVPRTTsytspyatTSSHITPADVH 849
Cdd:PHA03377  573 PSTGPRVMATPSTGPRDMAPPSTGPRQQAKCKDGPPasgpheKQPPSSAPRDMAPSVVRMF--------LRERLLEQSTG 644
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 19112374   850 PLPPPSTSTTAGwNDAPMLGQLPMRRAAPSMA-----PVRSPFPGASSAQPAAMSRTSSVSTLPPPPPTASMTASAPAIA 924
Cdd:PHA03377  645 PKPKSFWEMRAG-RDGSGIQQEPSSRRQPATQstpprPSWLPSVFVLPSVDAGRAQPSEESHLSSMSPTQPISHEEQPRY 723
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 19112374   925 SPPPPKVGETYHPPTA---------SGTRVPPVQQPSHPNPYTPVAPQSP-----VAAASRISSSPNMPPSNPYTPiavA 990
Cdd:PHA03377  724 EDPDDPLDLSLHPDQApppshqapySGHEEPQAQQAPYPGYWEPRPPQAPylgyqEPQAQGVQVSSYPGYAGPWGL---R 800
                         250       260       270       280       290       300       310       320
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 19112374   991 SSTVNPAHTYKPHGGSQIVPPPKQP-ANRVVPLPPT------ASQRASAYEPPTVSVPSPSALSPSVTPQLP----PVSS 1059
Cdd:PHA03377  801 AQHPRYRHSWAYWSQYPGHGHPQGPwAPRPPHLPPQwdgsagHGQDQVSQFPHLQSETGPPRLQLSQVPQLPysqtLVSS 880
                         330       340       350       360       370
                  ....*....|....*....|....*....|....*....|....*....|....*....
gi 19112374  1060 RLPPVSATRPQIPQpPPVSTALPSSSAvsrpPIATSAGRSSTAASTSAPLTYPAGDRSH 1118
Cdd:PHA03377  881 SAPSWSSPQPRAPI-RPIPTRFPPPPM----PLQDSMAVGCDSSGTACPSMPFASDYSQ 934
PHA03247 PHA03247
large tegument protein UL36; Provisional
797-1078 9.56e-10

large tegument protein UL36; Provisional


Pssm-ID: 223021 [Multi-domain]  Cd Length: 3151  Bit Score: 63.42  E-value: 9.56e-10
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 19112374   797 AVPPIHQIKQTGYAPVQPKTSQASSILPTVPRTTSYTSPYATTSSHITPADVHPLPPPSTSTTAGWNDAPMLGQLPMRRA 876
Cdd:PHA03247 2776 AAGPPRRLTRPAVASLSESRESLPSPWDPADPPAAVLAPAAALPPAASPAGPLPPPTSAQPTAPPPPPGPPPPSLPLGGS 2855
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 19112374   877 APSMAPV-RSPFPGASSAQPAAMSRTSsVSTLPPPPPtasmtasapaiasppppkvgetyhPPTASGTRVPPVQQPSHPN 955
Cdd:PHA03247 2856 VAPGGDVrRRPPSRSPAAKPAAPARPP-VRRLARPAV------------------------SRSTESFALPPDQPERPPQ 2910
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 19112374   956 PYTPVAPQSPvaAASRISSSPNMPPSNPYTPIAVASSTVNPAHTYKPHGGsqiVPPPKQPAnrVVPLPPTASQRASAYEP 1035
Cdd:PHA03247 2911 PQAPPPPQPQ--PQPPPPPQPQPPPPPPPRPQPPLAPTTDPAGAGEPSGA---VPQPWLGA--LVPGRVAVPRFRVPQPA 2983
                         250       260       270       280
                  ....*....|....*....|....*....|....*....|...
gi 19112374  1036 PTVSVPSPSALSPSVTPqLPPVSSRLPpvSATRPQIPQPPPVS 1078
Cdd:PHA03247 2984 PSREAPASSTPPLTGHS-LSRVSSWAS--SLALHEETDPPPVS 3023
PHA03378 PHA03378
EBNA-3B; Provisional
773-1144 2.10e-09

EBNA-3B; Provisional


Pssm-ID: 223065 [Multi-domain]  Cd Length: 991  Bit Score: 62.01  E-value: 2.10e-09
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 19112374   773 NLVPTEFPGAKEEIQRLTMLLEP----HAV-PPIHQIKQTGYAPVQPKTSQASSIlPTVPRTTSYTS----PYATTSSHI 843
Cdd:PHA03378  618 TSAPRQWPMPLRPIPMRPLRMQPitfnVLVfPTPHQPPQVEITPYKPTWTQIGHI-PYQPSPTGANTmlpiQWAPGTMQP 696
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 19112374   844 TPADVHPLPPPSTSttagwndapmlgqlPMRRAAPSMAPVRSPFPGASsaqPAAMSRTSSVSTlPPPPPTASMTASAPAI 923
Cdd:PHA03378  697 PPRAPTPMRPPAAP--------------PGRAQRPAAATGRARPPAAA---PGRARPPAAAPG-RARPPAAAPGRARPPA 758
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 19112374   924 ASPPPPKvgetyhPPTASGTRVPPVQQPSHPnpytPVAPQSPVAAasrisSSPNMPPSNPYTPIAVAsstvnpahtykph 1003
Cdd:PHA03378  759 AAPGRAR------PPAAAPGAPTPQPPPQAP----PAPQQRPRGA-----PTPQPPPQAGPTSMQLM------------- 810
                         250       260       270       280       290       300       310       320
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 19112374  1004 ggsqivppPKQPANRVVPLPPTASQRASAyeppTVSVPSPSALSPSVTPQLPPVSSRLPPVSATRPQIPQP----PPVST 1079
Cdd:PHA03378  811 --------PRAAPGQQGPTKQILRQLLTG----GVKRGRPSLKKPAALERQAAAGPTPSPGSGTSDKIVQApvfyPPVLQ 878
                         330       340       350       360       370       380
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*..
gi 19112374  1080 --ALPSSSAVSRPPIATSAgrsstaasTSAPLTYPAGDRSHIPGNLRPIYEMLNAELQRVSQSLPPQ 1144
Cdd:PHA03378  879 piQVMRQLGSVRAAAASTV--------TQAPTEYTGERRGVGPMHPTDIPPSKRAKTDAYVESQPPH 937
PRK10263 PRK10263
DNA translocase FtsK; Provisional
830-1147 2.42e-09

DNA translocase FtsK; Provisional


Pssm-ID: 236669 [Multi-domain]  Cd Length: 1355  Bit Score: 62.02  E-value: 2.42e-09
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 19112374   830 TSYTSPYATTSSHITPADVHPLP-PPSTSTTAGWNDAPmlgqlPMRRAAPSMAPVRSPFPGASS-AQPAAMSRTSSVSTL 907
Cdd:PRK10263  327 TTATQSWAAPVEPVTQTPPVASVdVPPAQPTVAWQPVP-----GPQTGEPVIAPAPEGYPQQSQyAQPAVQYNEPLQQPV 401
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 19112374   908 PPPPPTASmtasAPAIASPPPPKVGETYHPPTASGTRVPPVQQPSHPNPYTPVAPQSPVAAASRISSSPNMPPSNPYTPI 987
Cdd:PRK10263  402 QPQQPYYA----PAAEQPAQQPYYAPAPEQPAQQPYYAPAPEQPVAGNAWQAEEQQSTFAPQSTYQTEQTYQQPAAQEPL 477
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 19112374   988 AVASSTVNPAHTYKPHGGSQIVPPPKQPANRVVPLPPTAS----QRASAYEPPTVSVPSPSALSPSVTPQLPPVssrLPP 1063
Cdd:PRK10263  478 YQQPQPVEQQPVVEPEPVVEETKPARPPLYYFEEVEEKRArereQLAAWYQPIPEPVKEPEPIKSSLKAPSVAA---VPP 554
                         250       260       270       280       290       300       310       320
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 19112374  1064 VSatrpqipqppPVSTALPSSSAVSRPPIATSAgrsstAASTSAPLTYPAGDrshipGNLRPiyemlnaelqRVSQSLPP 1143
Cdd:PRK10263  555 VE----------AAAAVSPLASGVKKATLATGA-----AATVAAPVFSLANS-----GGPRP----------QVKEGIGP 604

                  ....
gi 19112374  1144 QMSR 1147
Cdd:PRK10263  605 QLPR 608
PHA03307 PHA03307
transcriptional regulator ICP4; Provisional
834-1118 3.90e-09

transcriptional regulator ICP4; Provisional


Pssm-ID: 223039 [Multi-domain]  Cd Length: 1352  Bit Score: 61.34  E-value: 3.90e-09
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 19112374   834 SPYATTSSHITPADVHPLPPPSTSTTAGWNDAPML--GQLPMRRAAPSMAPVRSPFPGASSAQPAAMSRTSSVSTLPPPP 911
Cdd:PHA03307   21 FPRPPATPGDAADDLLSGSQGQLVSDSAELAAVTVvaGAAACDRFEPPTGPPPGPGTEAPANESRSTPTWSLSTLAPASP 100
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 19112374   912 PTASMTASAPAIASPPPPKVGETYHPPTASGTRVPPVQQPSHPNPYTPVAPQSPVAAASRISSSPNMPPSNPYTPIAVAS 991
Cdd:PHA03307  101 AREGSPTPPGPSSPDPPPPTPPPASPPPSPAPDLSEMLRPVGSPGPPPAASPPAAGASPAAVASDAASSRQAALPLSSPE 180
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 19112374   992 STVNPahtykPHGGSQIVPPPKQPANRVVPLPPTASQRASAYEPPTVSVPSPSALSPSVTPQLPPVSSRLPPVSATRPQI 1071
Cdd:PHA03307  181 ETARA-----PSSPPAEPPPSTPPAAASPRPPRRSSPISASASSPAPAPGRSAADDAGASSSDSSSSESSGCGWGPENEC 255
                         250       260       270       280
                  ....*....|....*....|....*....|....*....|....*..
gi 19112374  1072 PQPPPVSTALPSSSAVSRPPIATSAGRSSTAASTSAPLTYPAGDRSH 1118
Cdd:PHA03307  256 PLPRPAPITLPTRIWEASGWNGPSSRPGPASSSSSPRERSPSPSPSS 302
Herpes_BLLF1 pfam05109
Herpes virus major outer envelope glycoprotein (BLLF1); This family consists of the BLLF1 ...
807-1108 8.03e-09

Herpes virus major outer envelope glycoprotein (BLLF1); This family consists of the BLLF1 viral late glycoprotein, also termed gp350/220. It is the most abundantly expressed glycoprotein in the viral envelope of the Herpesviruses and is the major antigen responsible for stimulating the production of neutralising antibodies in vivo.


Pssm-ID: 282904 [Multi-domain]  Cd Length: 886  Bit Score: 59.93  E-value: 8.03e-09
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 19112374    807 TGYAPVQPKTSQASSILPTVpRTTSYTSPYATT---SSHITPADVhpLPPPSTSTTAGWNDAPMLGQLPMRRAAPSMAPV 883
Cdd:pfam05109  415 TTHKVIFSKAPESTTTSPTL-NTTGFAAPNTTTglpSSTHVPTNL--TAPASTGPTVSTADVTSPTPAGTTSGASPVTPS 491
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 19112374    884 RSPFPGASSAQPAAMSRTSSVSTLPPPPPTASMTASAPAIASPPPPKVGETyHPPTASGTRVPPVQQPShPNPYTPVAPQ 963
Cdd:pfam05109  492 PSPRDNGTESKAPDMTSPTSAVTTPTPNATSPTPAVTTPTPNATSPTLGKT-SPTSAVTTPTPNATSPT-PAVTTPTPNA 569
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 19112374    964 S-PVAAASRISSSPNMPPSNPYTPiAVASSTVNPAHTYKPHGGSQIVPPPKQPANRVVPLPPTASQRASAYEPPTVSVpS 1042
Cdd:pfam05109  570 TiPTLGKTSPTSAVTTPTPNATSP-TVGETSPQANTTNHTLGGTSSTPVVTSPPKNATSAVTTGQHNITSSSTSSMSL-R 647
                          250       260       270       280       290       300       310
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....
gi 19112374   1043 PSALSPSVTPQLPPVS-SRLPPVSATRP----QIPQPPPVSTA---LPSSSAVSRPPIATSAGRSSTAASTSAP 1108
Cdd:pfam05109  648 PSSISETLSPSTSDNStSHMPLLTSAHPtggeNITQVTPASTSthhVSTSSPAPRPGTTSQASGPGNSSTSTKP 721
PRK12323 PRK12323
DNA polymerase III subunit gamma/tau;
876-1115 2.06e-08

DNA polymerase III subunit gamma/tau;


Pssm-ID: 237057 [Multi-domain]  Cd Length: 700  Bit Score: 58.73  E-value: 2.06e-08
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 19112374   876 AAPSMAPVRSPFPGASSAQPAAMSRTSSVSTLPPPPPTASMTasapaiasppppkvgetyHPPTASGTRVPPVQQPshpn 955
Cdd:PRK12323  375 ATAAAAPVAQPAPAAAAPAAAAPAPAAPPAAPAAAPAAAAAA------------------RAVAAAPARRSPAPEA---- 432
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 19112374   956 pyTPVAPQSPVAAASRISSSPNMPPSNPY--TPIAVASSTVNPAHTYKPhggsqivPPPKQPANRVVPLPPTASQRASAy 1033
Cdd:PRK12323  433 --LAAARQASARGPGGAPAPAPAPAAAPAaaARPAAAGPRPVAAAAAAA-------PARAAPAAAPAPADDDPPPWEEL- 502
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 19112374  1034 ePPTVSVPSPSALSPSvtpqLPPVSSRLPPVSATRPQIPQPPPVSTAlPSSSAVSRPPIATSAGRSSTAASTSAPLTYPA 1113
Cdd:PRK12323  503 -PPEFASPAPAQPDAA----PAGWVAESIPDPATADPDDAFETLAPA-PAAAPAPRAAAATEPVVAPRPPRASASGLPDM 576

                  ..
gi 19112374  1114 GD 1115
Cdd:PRK12323  577 FD 578
Atrophin-1 pfam03154
Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian ...
874-1125 2.84e-08

Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian atrophy (DRPLA) gene. DRPLA (OMIM:125370) is a progressive neurodegenerative disorder. It is caused by the expansion of a CAG repeat in the DRPLA gene on chromosome 12p. This results in an extended polyglutamine region in atrophin-1, which is thought to confer toxicity to the protein, possibly through altering its interactions with other proteins. The expansion of a CAG repeat is also the underlying defect in six other neurodegenerative disorders, including Huntington's disease. One interaction of expanded polyglutamine repeats that is thought to be pathogenic is that with the short glutamine repeat in the transcriptional coactivator CREB binding protein, CBP. This interaction draws CBP away from its usual nuclear location to the expanded polyglutamine repeat protein aggregates that are characteristic of the polyglutamine neurodegenerative disorders. This interferes with CBP-mediated transcription and causes cytotoxicity.


Pssm-ID: 460830 [Multi-domain]  Cd Length: 991  Bit Score: 58.24  E-value: 2.84e-08
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 19112374    874 RRAAPSMapvrsPFPGASSAQPAAMSRTSSVSTLPPPPPTASMTASAPAIASPPPPKVGETyhPPTASGTRVPPvqQPSH 953
Cdd:pfam03154  142 RSTSPSI-----PSPQDNESDSDSSAQQQILQTQPPVLQAQSGAASPPSPPPPGTTQAATA--GPTPSAPSVPP--QGSP 212
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 19112374    954 PNPYTPVAPQSPVAAASRISSSPNMPPS---NPYTPIAVAS-----STVNPAHTYKP--HGGSQIVPPPKQ--PANRVVP 1021
Cdd:pfam03154  213 ATSQPPNQTQSTAAPHTLIQQTPTLHPQrlpSPHPPLQPMTqppppSQVSPQPLPQPslHGQMPPMPHSLQtgPSHMQHP 292
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 19112374   1022 LPP----TASQRASAYEPPTvsvPSPSALSPS-VTPQLPPVSSRLPPVSATRPQiPQPPPvstalPSSSAVSRPPIATSA 1096
Cdd:pfam03154  293 VPPqpfpLTPQSSQSQVPPG---PSPAAPGQSqQRIHTPPSQSQLQSQQPPREQ-PLPPA-----PLSMPHIKPPPTTPI 363
                          250       260
                   ....*....|....*....|....*....
gi 19112374   1097 GRSSTAASTSAPLTYPAGDRSHIPGNLRP 1125
Cdd:pfam03154  364 PQLPNPQSHKHPPHLSGPSPFQMNSNLPP 392
PRK07003 PRK07003
DNA polymerase III subunit gamma/tau;
937-1121 4.58e-08

DNA polymerase III subunit gamma/tau;


Pssm-ID: 235906 [Multi-domain]  Cd Length: 830  Bit Score: 57.55  E-value: 4.58e-08
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 19112374   937 PPTASGTRVPPVQQPSHPNPYTPVAPQSPVAAASRISSSPNMPPSNPYTPIAVASSTVNPAHTYKPHGGSQIVPPPKQPA 1016
Cdd:PRK07003  387 AAAAVGASAVPAVTAVTGAAGAALAPKAAAAAAATRAEAPPAAPAPPATADRGDDAADGDAPVPAKANARASADSRCDER 466
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 19112374  1017 NRVVPLPP-TASQRASAYEPPTVSVPSPSALSPSvTPQLPPVSSRLPPVSATRPQIPQPPpvstALPSSSAVSRPPIATS 1095
Cdd:PRK07003  467 DAQPPADSgSASAPASDAPPDAAFEPAPRAAAPS-AATPAAVPDARAPAAASREDAPAAA----APPAPEARPPTPAAAA 541
                         170       180       190
                  ....*....|....*....|....*....|.
gi 19112374  1096 AGRSSTAASTSAPLTYPAG-----DRSHIPG 1121
Cdd:PRK07003  542 PAARAGGAAAALDVLRNAGmrvssDRGARAA 572
PRK07003 PRK07003
DNA polymerase III subunit gamma/tau;
795-1076 4.99e-08

DNA polymerase III subunit gamma/tau;


Pssm-ID: 235906 [Multi-domain]  Cd Length: 830  Bit Score: 57.55  E-value: 4.99e-08
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 19112374   795 PHAVPPIHQIKQTGYAPVQPKTSQASSILPTVPRTTSYTSPYATTSSHITPADvhplPPPSTSTTAGwnDAPMLGQLPMR 874
Cdd:PRK07003  377 AGAVPAPGARAAAAVGASAVPAVTAVTGAAGAALAPKAAAAAAATRAEAPPAA----PAPPATADRG--DDAADGDAPVP 450
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 19112374   875 RAAPSMAPVRSPfPGASSAQPAAMSRTSSVSTLPPPPPTASmtasapaiasppppkvgetyhPPTASGTRVPPVQQPSHP 954
Cdd:PRK07003  451 AKANARASADSR-CDERDAQPPADSGSASAPASDAPPDAAF---------------------EPAPRAAAPSAATPAAVP 508
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 19112374   955 NPYTPVAPQSPVAAASRISSSPNMPPSNPYTPIAVASSTVNPAHTYKPHGGSQIVPPPKQPANRVVPLPPTASQRASAYE 1034
Cdd:PRK07003  509 DARAPAAASREDAPAAAAPPAPEARPPTPAAAAPAARAGGAAAALDVLRNAGMRVSSDRGARAAAAAKPAAAPAAAPKPA 588
                         250       260       270       280
                  ....*....|....*....|....*....|....*....|....*.
gi 19112374  1035 PPTVSVPSPSALSPSVTPQLPPVSSRLPPVSATRPQIPQP----PP 1076
Cdd:PRK07003  589 APRVAVQVPTPRARAATGDAPPNGAARAEQAAESRGAPPPwediPP 634
PRK07764 PRK07764
DNA polymerase III subunits gamma and tau; Validated
932-1118 6.06e-08

DNA polymerase III subunits gamma and tau; Validated


Pssm-ID: 236090 [Multi-domain]  Cd Length: 824  Bit Score: 57.30  E-value: 6.06e-08
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 19112374   932 GETYHPPTASGTRVPPVQQPSHPNPYTPVAPQSPVAAASRISSSPNMPPSNPYTPIAVASSTVNPAHTYKPHGGSQIVPP 1011
Cdd:PRK07764  596 GGEGPPAPASSGPPEEAARPAAPAAPAAPAAPAPAGAAAAPAEASAAPAPGVAAPEHHPKHVAVPDASDGGDGWPAKAGG 675
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 19112374  1012 PKQPAnrVVPLPPTASQRASAYEPPTVSVPSPSALSPSVTPQLPPVSSRLPPVSATRPQ------IPQPP------PVST 1079
Cdd:PRK07764  676 AAPAA--PPPAPAPAAPAAPAGAAPAQPAPAPAATPPAGQADDPAAQPPQAAQGASAPSpaaddpVPLPPepddppDPAG 753
                         170       180       190       200
                  ....*....|....*....|....*....|....*....|....*
gi 19112374  1080 ALPSSSAVSRPPIATSAGRSSTAASTSAP------LTYPAGDRSH 1118
Cdd:PRK07764  754 APAQPPPPPAPAPAAAPAAAPPPSPPSEEeemaedDAPSMDDEDR 798
DUF5585 pfam17823
Family of unknown function (DUF5585); This is a family of unknown function found in chordata.
816-1115 1.23e-07

Family of unknown function (DUF5585); This is a family of unknown function found in chordata.


Pssm-ID: 465521 [Multi-domain]  Cd Length: 506  Bit Score: 55.74  E-value: 1.23e-07
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 19112374    816 TSQASSILPTVPRTTSYTSPYATTSSHITPADVHPLPP----PSTSTTAGwnDAPMLGQlPMRRAAPSMAPVRSPFPGAS 891
Cdd:pfam17823  116 AAAASSSPSSAAQSLPAAIAALPSEAFSAPRAAACRANasaaPRAAIAAA--SAPHAAS-PAPRTAASSTTAASSTTAAS 192
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 19112374    892 SAQPAAMSrtSSVSTLPPPPPTASMTASAPAIASPPPPKVGETYHPPTASGTRVPPVQQPSHPNPYTPVAPQSPVAAASR 971
Cdd:pfam17823  193 SAPTTAAS--SAPATLTPARGISTAATATGHPAAGTALAAVGNSSPAAGTVTAAVGTVTPAALATLAAAAGTVASAAGTI 270
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 19112374    972 ISSSPNMPPSNPYTPIAVASSTVNPAHTYKPHG-GSQIVPPPKQPANRVVPlPPTASQRASAYEPPTVSVPSPSALS--- 1047
Cdd:pfam17823  271 NMGDPHARRLSPAKHMPSDTMARNPAAPMGAQAqGPIIQVSTDQPVHNTAG-EPTPSPSNTTLEPNTPKSVASTNLAvvt 349
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 19112374   1048 --------PSVTPQLPPVSSRLPPVSATRPQI---PQPPPVSTALPSSsavsrPPIATSAGRSSTAASTSA-PLTYPAGD 1115
Cdd:pfam17823  350 ttkaqakePSASPVPVLHTSMIPEVEATSPTTqpsPLLPTQGAAGPGI-----LLAPEQVATEATAGTASAgPTPRSSGD 424
PRK14951 PRK14951
DNA polymerase III subunits gamma and tau; Provisional
967-1091 1.87e-07

DNA polymerase III subunits gamma and tau; Provisional


Pssm-ID: 237865 [Multi-domain]  Cd Length: 618  Bit Score: 55.49  E-value: 1.87e-07
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 19112374   967 AAASRISSSPNMPPSNPYTPIAVASSTVNPAHTYKPHGGSqivPPPKQPANRVVPlPPTASQRASAYEPPTVSVPSPSAL 1046
Cdd:PRK14951  372 AAAPAEKKTPARPEAAAPAAAPVAQAAAAPAPAAAPAAAA---SAPAAPPAAAPP-APVAAPAAAAPAAAPAAAPAAVAL 447
                          90       100       110       120
                  ....*....|....*....|....*....|....*....|....*
gi 19112374  1047 SPSVTPQLPPVSSRLPPVSATRPQIPQPPPVSTALPSSSAVSRPP 1091
Cdd:PRK14951  448 APAPPAQAAPETVAIPVRVAPEPAVASAAPAPAAAPAAARLTPTE 492
PRK12323 PRK12323
DNA polymerase III subunit gamma/tau;
775-1062 2.52e-07

DNA polymerase III subunit gamma/tau;


Pssm-ID: 237057 [Multi-domain]  Cd Length: 700  Bit Score: 55.27  E-value: 2.52e-07
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 19112374   775 VPTEFPGAkEEIQRLTMLLEPHAVPPIHQIKQTGYAPVQPKTSQASSILPTVPRT-----------TSYTSPYATTSSHI 843
Cdd:PRK12323  307 VQDDWPEA-DDIRRLAGRFDAQEVQLFYQIANLGRSELALAPDEYAGFTMTLLRMlafrpgqsgggAGPATAAAAPVAQP 385
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 19112374   844 TPADVHPLPPPSTSTTAGWNDAPMLGQLPMRRAAPSMAPVRSPFPGASSAQPAAMSRTSSVSTLPPPPPTASMTASAPAI 923
Cdd:PRK12323  386 APAAAAPAAAAPAPAAPPAAPAAAPAAAAAARAVAAAPARRSPAPEALAAARQASARGPGGAPAPAPAPAAAPAAAARPA 465
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 19112374   924 ASPPPPKVGETYHPPTASGTRVPPVQQPSHPNPYTPVAPQSPvaaasrissSPNMPPSNPYTPIAVASSTVNPAHTYKPH 1003
Cdd:PRK12323  466 AAGPRPVAAAAAAAPARAAPAAAPAPADDDPPPWEELPPEFA---------SPAPAQPDAAPAGWVAESIPDPATADPDD 536
                         250       260       270       280       290
                  ....*....|....*....|....*....|....*....|....*....|....*....
gi 19112374  1004 GGSQIVPPPKQPAnrvVPLPPTASQRASAYEPPTVSVPSpsaLSPSVTPQLPPVSSRLP 1062
Cdd:PRK12323  537 AFETLAPAPAAAP---APRAAAATEPVVAPRPPRASASG---LPDMFDGDWPALAARLP 589
PRK14951 PRK14951
DNA polymerase III subunits gamma and tau; Provisional
973-1118 5.34e-07

DNA polymerase III subunits gamma and tau; Provisional


Pssm-ID: 237865 [Multi-domain]  Cd Length: 618  Bit Score: 53.95  E-value: 5.34e-07
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 19112374   973 SSSPNMPPSNPYTPIAVASSTVNPAhtykphggsqivPPPKQPANRVVPLPPTASQRASAyEPPTVSVPSPSALSPSVTP 1052
Cdd:PRK14951  368 AAAEAAAPAEKKTPARPEAAAPAAA------------PVAQAAAAPAPAAAPAAAASAPA-APPAAAPPAPVAAPAAAAP 434
                          90       100       110       120       130       140
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....
gi 19112374  1053 QLPPvssrlPPVSATRPQIPQPPPVSTALPSSSA--VSRPPIATSAGRSSTAASTSAPLT-YPAGDRSH 1118
Cdd:PRK14951  435 AAAP-----AAAPAAVALAPAPPAQAAPETVAIPvrVAPEPAVASAAPAPAAAPAAARLTpTEEGDVWH 498
PHA03247 PHA03247
large tegument protein UL36; Provisional
937-1114 6.27e-07

large tegument protein UL36; Provisional


Pssm-ID: 223021 [Multi-domain]  Cd Length: 3151  Bit Score: 54.17  E-value: 6.27e-07
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 19112374   937 PPTASG-TRVPPVQQPSHPNPytPVAPQSPVAAASRISSSPNMPPSNPYTPIAVASstvNPAHTYKPHGGSQIVPPPKQP 1015
Cdd:PHA03247  269 PETARGaTGPPPPPEAAAPNG--AAAPPDGVWGAALAGAPLALPAPPDPPPPAPAG---DAEEEDDEDGAMEVVSPLPRP 343
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 19112374  1016 ANRVvPL-----------PPTASQRASAYEPPTVSVPSPSALSPSVTPQLPPVSSRLPPVSATRPQIPQPPPVSTALPSS 1084
Cdd:PHA03247  344 RQHY-PLgfpkrrrptwtPPSSLEDLSAGRHHPKRASLPTRKRRSARHAATPFARGPGGDDQTRPAAPVPASVPTPAPTP 422
                         170       180       190
                  ....*....|....*....|....*....|
gi 19112374  1085 SAVSRPPIATSAGRSSTAASTSAPLTYPAG 1114
Cdd:PHA03247  423 VPASAPPPPATPLPSAEPGSDDGPAPPPER 452
SRA1 pfam07304
Steroid receptor RNA activator (SRA1); This family consists of several hypothetical mammalian ...
1072-1217 8.33e-07

Steroid receptor RNA activator (SRA1); This family consists of several hypothetical mammalian steroid receptor RNA activator proteins. SRA-RNAs likely to encode stable proteins are widely expressed in breast cancer cell lines. SRA-RNA is a steroid receptor co-activator which acts as a functional RNA and is classified as belonging to the growing family of functional non-coding RNAs.


Pssm-ID: 429397 [Multi-domain]  Cd Length: 147  Bit Score: 49.82  E-value: 8.33e-07
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 19112374   1072 PQPPPVSTALPsssavsrPPIATSAGRSSTAASTSAPLTYPAGDrshIPGNLRPiyemLNAELQRVSQSLPPQmsrVVHD 1151
Cdd:pfam07304    2 MGPPPPSVAPP-------PPPPGGPGPLPQVEPTDSPVSESEPV---VEDVMSV----LNQALDACRGTVRKQ---VCDD 64
                           90       100       110       120       130       140
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*.
gi 19112374   1152 TEKRLNMLFDRLNSNVLSKPLTDELLALATSLNAHDYQTASNIQTNIVTTLGDQCEHWIVGVTRLI 1217
Cdd:pfam07304   65 VSKRLRLLEDSWRSGKLSLPVRRRMDTLSQELQSGHWDAADDIHRSLMVDHVTEVSQWMVGVKRLI 130
PLN03209 PLN03209
translocon at the inner envelope of chloroplast subunit 62; Provisional
845-1062 1.15e-06

translocon at the inner envelope of chloroplast subunit 62; Provisional


Pssm-ID: 178748 [Multi-domain]  Cd Length: 576  Bit Score: 53.01  E-value: 1.15e-06
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 19112374   845 PADVHPLPPPSTSTTAGwNDAPMLGQLPMRRAAPSMA-----PVRSPFPGASSAQPAAMSRTSSVStLPPPPPTASMTAS 919
Cdd:PLN03209  341 PVPTKPVTPEAPSPPIE-EEPPQPKAVVPRPLSPYTAyedlkPPTSPIPTPPSSSPASSKSVDAVA-KPAEPDVVPSPGS 418
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 19112374   920 APAIASPPPPKVGETYHPPTASGTRVPPVQQPSHPNPYTPVAPQSPVAAASRISSSPNMPPSNPYTPIAVASSTVNPAHT 999
Cdd:PLN03209  419 ASNVPEVEPAQVEAKKTRPLSPYARYEDLKPPTSPSPTAPTGVSPSVSSTSSVPAVPDTAPATAATDAAAPPPANMRPLS 498
                         170       180       190       200       210       220       230
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....
gi 19112374  1000 YKPHGGSQIVPPPKQPANRVVPLPPTASQRAS---AYEPPTVSV-------PSPSALSP-SVTPQLPPVSSRLP 1062
Cdd:PLN03209  499 PYAVYDDLKPPTSPSPAAPVGKVAPSSTNEVVkvgNSAPPTALAdeqhhaqPKPRPLSPyTMYEDLKPPTSPTP 572
PHA03307 PHA03307
transcriptional regulator ICP4; Provisional
884-1108 1.92e-06

transcriptional regulator ICP4; Provisional


Pssm-ID: 223039 [Multi-domain]  Cd Length: 1352  Bit Score: 52.48  E-value: 1.92e-06
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 19112374   884 RSPFPGASSAQPAAMSRTSSVSTLPPPP-PTASMTASAPAIASPPPPKVGEtyhPPTASGTRVPPVQqpshPNPYTPVAP 962
Cdd:PHA03307   20 FFPRPPATPGDAADDLLSGSQGQLVSDSaELAAVTVVAGAAACDRFEPPTG---PPPGPGTEAPANE----SRSTPTWSL 92
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 19112374   963 QSPVAAASRISSSPNMPPSNPYTPIAVASSTVNPAHTYKP-HGGSQIVPPPKQPANRVVPLPPTASQRASAYEPPT---- 1037
Cdd:PHA03307   93 STLAPASPAREGSPTPPGPSSPDPPPPTPPPASPPPSPAPdLSEMLRPVGSPGPPPAASPPAAGASPAAVASDAASsrqa 172
                         170       180       190       200       210       220       230
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|..
gi 19112374  1038 -VSVPSPSALSPSVTPQLPPVSSRLPPVSATRPqiPQPPPVSTALPSSSAVSRPPIATSAGRSSTAASTSAP 1108
Cdd:PHA03307  173 aLPLSSPEETARAPSSPPAEPPPSTPPAAASPR--PPRRSSPISASASSPAPAPGRSAADDAGASSSDSSSS 242
PAT1 pfam09770
Topoisomerase II-associated protein PAT1; Members of this family are necessary for accurate ...
781-1073 2.28e-06

Topoisomerase II-associated protein PAT1; Members of this family are necessary for accurate chromosome transmission during cell division.


Pssm-ID: 401645 [Multi-domain]  Cd Length: 846  Bit Score: 51.96  E-value: 2.28e-06
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 19112374    781 GAKEEIQRLTMLLEPHAVPPIHQIKQTGYAPVQPKTSQASSILPTVPRTTSYtSPYATTsshiTP-ADVHPlpppststt 859
Cdd:pfam09770   93 DAIEEEQVRFNRQQPAARAAQSSAQPPASSLPQYQYASQQSQQPSKPVRTGY-EKYKEP----EPiPDLQV--------- 158
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 19112374    860 agwnDAPMLGQLPMRRAAPSMAPVRSPFPGASSAQP----------AAMSRTSSVSTLPPPPPTASMTASAPAIASPPPP 929
Cdd:pfam09770  159 ----DASLWGVAPKKAAAPAPAPQPAAQPASLPAPSrkmmsleeveAAMRAQAKKPAQQPAPAPAQPPAAPPAQQAQQQQ 234
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 19112374    930 KVGETYHPPTASGTRvpPVQQPSHPNPYTPVA----PQSPvaaaSRISSSPNMPPSNPYTPIAVASSTVNPahtykphgg 1005
Cdd:pfam09770  235 QFPPQIQQQQQPQQQ--PQQPQQHPGQGHPVTilqrPQSP----QPDPAQPSIQPQAQQFHQQPPPVPVQP--------- 299
                          250       260       270       280       290       300
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*...
gi 19112374   1006 SQIVPPPKQPANRVVPLPPTAsqrasayepptvsVPSPSALSPSVTPQLPPVSSRLPPVSATRPQIPQ 1073
Cdd:pfam09770  300 TQILQNPNRLSAARVGYPQNP-------------QPGVQPAPAHQAHRQQGSFGRQAPIITHPQQLAQ 354
KAR9 pfam08580
Yeast cortical protein KAR9; The KAR9 protein in Saccharomyces cerevisiae is a cytoskeletal ...
820-1112 2.32e-06

Yeast cortical protein KAR9; The KAR9 protein in Saccharomyces cerevisiae is a cytoskeletal protein required for karyogamy, correct positioning of the mitotic spindle and for orientation of cytoplasmic microtubules. KAR9 localizes at the shmoo tip in mating cells and at the tip of the growing bud in anaphase.


Pssm-ID: 430088 [Multi-domain]  Cd Length: 684  Bit Score: 51.75  E-value: 2.32e-06
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 19112374    820 SSILPTVPRTTSYTSPyATTSSHITPAdvhpLPPPSTSTTAGWNDAPmlgQLPMRRAAPSmapvrspFPGASSAQPAAMS 899
Cdd:pfam08580  406 SSIFEDKNMHDTEDSP-ATLVANKTPG----SSPPSSVIMTPVNKGS---KTPSSRRGSS-------FDFGSSSERVINS 470
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 19112374    900 RTSSVSTLPPPPPTASMTASAPAIasppppkvgetyhpPTASGTRvpPVQQPSHPNPYTPVAPQSPVAAASRISSSPnmP 979
Cdd:pfam08580  471 KLRRESKLPQIASTLKQTKRPSKI--------------PRASPNH--SGFLSTPSNTATSETPTPALRPPSRPQPPP--P 532
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 19112374    980 PSNPYTPIAVASSTVNPAHTYKPHGGSQIVPPPKQPANRVVPLPPTASQRASAYEPPTVSVPSPSALSPSVTPQLPPVSS 1059
Cdd:pfam08580  533 GNRPRWNASTNTNDLDVGHNFKPLTLTTPSPTPSRSSRSSSTLPPVSPLSRDKSRSPAPTCRSVSRASRRRASRKPTRIG 612
                          250       260       270       280       290
                   ....*....|....*....|....*....|....*....|....*....|....*...
gi 19112374   1060 RLPPvsatRPQIPQPPPVSTA-----LPSSSAVSRPPIATSAGRSSTAASTSAPLTYP 1112
Cdd:pfam08580  613 SPNS----RTSLLDEPPYPKLtlskgLPRTPRNRQSYAGTSPSRSVSVSSGLGPQTRP 666
SOBP pfam15279
Sine oculis-binding protein; SOBP is associated with syndromic and nonsyndromic intellectual ...
954-1143 3.04e-06

Sine oculis-binding protein; SOBP is associated with syndromic and nonsyndromic intellectual disability. It carries a zinc-finger of the zf-C2H2 type at the N-terminus, and a highly characteristic C-terminal PhPhPhPhPhPh motif. The deduced 873-amino acid protein contains an N-terminal nuclear localization signal (NLS), followed by 2 FCS-type zinc finger motifs, a proline-rich region (PR1), a putative RNA-binding motif region, and a C-terminal NLS embedded in a second proline-rich motif. SOBP is expressed in various human tissues, including developing mouse brain at embryonic day 14. In postnatal and adult mouse brain SOBP is expressed in all neurons, with intense staining in the limbic system. Highest expression is in layer V cortical neurons, hippocampus, pyriform cortex, dorsomedial nucleus of thalamus, amygdala, and hypothalamus. Postnatal expression of SOBP in the limbic system corresponds to a time of active synaptogenesis. The family is also referred to as Jackson circler, JXC1. In seven affected siblings from a consanguineous Israeli Arab family with mental retardation, anterior maxillary protrusion, and strabismus, mutations were found in this protein.


Pssm-ID: 464609 [Multi-domain]  Cd Length: 325  Bit Score: 50.58  E-value: 3.04e-06
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 19112374    954 PNPYTP--VAPQSPVAAASRISSSPNmpPSNPYTP-IAVASSTVNPAHtyKPHGGSQIVPPP-------------KQPAN 1017
Cdd:pfam15279   83 ASPASTrsESVSPGPSSSASPSSSPT--SSNSSKPlISVASSSKLLAP--KPHEPPSLPPPPlppkkgrrhrpglHPPLG 158
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 19112374   1018 RVVPLPPTASQRASAYEPPTvSVPSPSALSP------------SVTPQLPPVSSRLPPVSATrPQIPQPPPVSTALPSSS 1085
Cdd:pfam15279  159 RPPGSPPMSMTPRGLLGKPQ-QHPPPSPLPAfmepssmpppflRPPPSIPQPNSPLSNPMLP-GIGPPPKPPRNLGPPSN 236
                          170       180       190       200       210       220
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*.
gi 19112374   1086 AVSRPPIATSAGRSSTAASTSAPLTYPAGDRSHIP--------GNLRPIYEMLNAELQRVSQSLPP 1143
Cdd:pfam15279  237 PMHRPPFSPHHPPPPPTPPGPPPGLPPPPPRGFTPpfgppfppVNMMPNPPEMNFGLPSLAPLVPP 302
PLN03209 PLN03209
translocon at the inner envelope of chloroplast subunit 62; Provisional
867-1126 3.96e-06

translocon at the inner envelope of chloroplast subunit 62; Provisional


Pssm-ID: 178748 [Multi-domain]  Cd Length: 576  Bit Score: 51.08  E-value: 3.96e-06
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 19112374   867 MLGQLPMRRAAPSmAPVRSPFPGASSAQPAAMSRTSSVSTLPPPPPTASMTASAPAIASPPPPKvgetyhPPTAsgtrvp 946
Cdd:PLN03209  319 LLAKIPSQRVPPK-ESDAADGPKPVPTKPVTPEAPSPPIEEEPPQPKAVVPRPLSPYTAYEDLK------PPTS------ 385
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 19112374   947 pvqqPShPNPYTPVAPQSPVAAASRISSSPNMPPSnPYTPIAVASSTVNPAHTYK-----PHGGSQIVPPPKQPAnrvvP 1021
Cdd:PLN03209  386 ----PI-PTPPSSSPASSKSVDAVAKPAEPDVVPS-PGSASNVPEVEPAQVEAKKtrplsPYARYEDLKPPTSPS----P 455
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 19112374  1022 LPPTASQrasayepPTVSVPSPSALSPSVTPQLPPVSSRLPPVSATRPQIPQPPPVSTALPSS--SAVSRPPIATSAGRS 1099
Cdd:PLN03209  456 TAPTGVS-------PSVSSTSSVPAVPDTAPATAATDAAAPPPANMRPLSPYAVYDDLKPPTSpsPAAPVGKVAPSSTNE 528
                         250       260
                  ....*....|....*....|....*..
gi 19112374  1100 STAASTSAPLTYPAGDRSHIPGNLRPI 1126
Cdd:PLN03209  529 VVKVGNSAPPTALADEQHHAQPKPRPL 555
PTZ00449 PTZ00449
104 kDa microneme/rhoptry antigen; Provisional
814-1113 4.25e-06

104 kDa microneme/rhoptry antigen; Provisional


Pssm-ID: 185628 [Multi-domain]  Cd Length: 943  Bit Score: 51.23  E-value: 4.25e-06
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 19112374   814 PKTSQASSILPTVPRTTSYTSPYATTSSHITPADVHPLPPPSTSTTAGWNDAPMlgqlpmRRAAPSMAPV---RSPFPGA 890
Cdd:PTZ00449  511 PEGPEASGLPPKAPGDKEGEEGEHEDSKESDEPKEGGKPGETKEGEVGKKPGPA------KEHKPSKIPTlskKPEFPKD 584
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 19112374   891 SSAQPAAMSRTSSVSTLPPPPPTAsmtasapaiasPPPPKVGETYHPPTASGTRVPPVQQPSHPNPYTPVAPQSPVAAAS 970
Cdd:PTZ00449  585 PKHPKDPEEPKKPKRPRSAQRPTR-----------PKSPKLPELLDIPKSPKRPESPKSPKRPPPPQRPSSPERPEGPKI 653
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 19112374   971 RISSSPNMPPSNPYTP----------IAVAS------STVNPAHTYKPHGGSQIVPPPKQPANRVVPLPPT-ASQRASAY 1033
Cdd:PTZ00449  654 IKSPKPPKSPKPPFDPkfkekfyddyLDAAAksketkTTVVLDESFESILKETLPETPGTPFTTPRPLPPKlPRDEEFPF 733
                         250       260       270       280       290       300       310       320
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 19112374  1034 EPPTvsvpSPSALSPSVTPQLPPVSSRLPPVSATRPQIPQPPPVSTALPS---SSAVSRPPIATSAGRSSTAASTSAPLT 1110
Cdd:PTZ00449  734 EPIG----DPDAEQPDDIEFFTPPEEERTFFHETPADTPLPDILAEEFKEediHAETGEPDEAMKRPDSPSEHEDKPPGD 809

                  ...
gi 19112374  1111 YPA 1113
Cdd:PTZ00449  810 HPS 812
PRK14971 PRK14971
DNA polymerase III subunit gamma/tau;
1009-1113 4.51e-06

DNA polymerase III subunit gamma/tau;


Pssm-ID: 237874 [Multi-domain]  Cd Length: 614  Bit Score: 50.93  E-value: 4.51e-06
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 19112374  1009 VPPPKQPANrvvPLPPTASQRASAYEPPTVSVPSPSALSPSVTPQlPPVSSRLPPVSATRPQIPQPPPVSTALPSSSAV- 1087
Cdd:PRK14971  369 ASGGRGPKQ---HIKPVFTQPAAAPQPSAAAAASPSPSQSSAAAQ-PSAPQSATQPAGTPPTVSVDPPAAVPVNPPSTAp 444
                          90       100       110
                  ....*....|....*....|....*....|....*
gi 19112374  1088 ---------SRPPIATSaGRSSTAASTSAPLTYPA 1113
Cdd:PRK14971  445 qavrpaqfkEEKKIPVS-KVSSLGPSTLRPIQEKA 478
PRK07764 PRK07764
DNA polymerase III subunits gamma and tau; Validated
864-1088 5.77e-06

DNA polymerase III subunits gamma and tau; Validated


Pssm-ID: 236090 [Multi-domain]  Cd Length: 824  Bit Score: 50.75  E-value: 5.77e-06
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 19112374   864 DAPMLGQLPMRRAAPSMAPVRSPFPGASSAQPA-AMSRTSSVSTLPPPPPTASMTASAPAIASPPPPKVGETYHPPTASG 942
Cdd:PRK07764  591 APGAAGGEGPPAPASSGPPEEAARPAAPAAPAApAAPAPAGAAAAPAEASAAPAPGVAAPEHHPKHVAVPDASDGGDGWP 670
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 19112374   943 TRVPPVQQPSHPNPYTPVAPQSPVAAASRISSspnmpPSNPYTPIAVASSTVNPAHTYKPHGGSqivpPPKQPANRVVPL 1022
Cdd:PRK07764  671 AKAGGAAPAAPPPAPAPAAPAAPAGAAPAQPA-----PAPAATPPAGQADDPAAQPPQAAQGAS----APSPAADDPVPL 741
                         170       180       190       200       210       220
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*.
gi 19112374  1023 PPTASQRASAYEPPTVSVPSPSALSPSVTPQLPPVSSRLPPVSATRPQIPQPPPVSTALPSSSAVS 1088
Cdd:PRK07764  742 PPEPDDPPDPAGAPAQPPPPPAPAPAAAPAAAPPPSPPSEEEEMAEDDAPSMDDEDRRDAEEVAME 807
PRK07764 PRK07764
DNA polymerase III subunits gamma and tau; Validated
810-1150 1.11e-05

DNA polymerase III subunits gamma and tau; Validated


Pssm-ID: 236090 [Multi-domain]  Cd Length: 824  Bit Score: 49.98  E-value: 1.11e-05
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 19112374   810 APVQPKTSQASSILPTVPRTTSYTSPYATTSSHITPADVHPLP---PPSTSTTAGWNDAPMLGQLPMRRAAPSMAPVRSP 886
Cdd:PRK07764  389 GGAGAPAAAAPSAAAAAPAAAPAPAAAAPAAAAAPAPAAAPQPapaPAPAPAPPSPAGNAPAGGAPSPPPAAAPSAQPAP 468
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 19112374   887 FPGASSAQPAAMSRTSSVSTLPPPPPTASMTASAPAIASPPPP----------KVGET---YHPPTASGTRVPPVQQP-- 951
Cdd:PRK07764  469 APAAAPEPTAAPAPAPPAAPAPAAAPAAPAAPAAPAGADDAATlrerwpeilaAVPKRsrkTWAILLPEATVLGVRGDtl 548
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 19112374   952 --SHPNP--------------------------YTPVAPQSPVAAASRISSSPNMPPSNPYTPIAVASSTVNPAHTYKPH 1003
Cdd:PRK07764  549 vlGFSTGglarrfaspgnaevlvtalaeelggdWQVEAVVGPAPGAAGGEGPPAPASSGPPEEAARPAAPAAPAAPAAPA 628
                         250       260       270       280       290       300       310       320
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 19112374  1004 GGSQIVPP-PKQPANRVVPLPPTASQRASAYEPPTVSVP--SPSALSPSVTPQLPPVSSRLPPVSATRPQIPQPPPVSTA 1080
Cdd:PRK07764  629 PAGAAAAPaEASAAPAPGVAAPEHHPKHVAVPDASDGGDgwPAKAGGAAPAAPPPAPAPAAPAAPAGAAPAQPAPAPAAT 708
                         330       340       350       360       370       380       390
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 19112374  1081 LPSSSAVSRPPIATSAGRSSTAASTSAPLTYPAGDRSHIPGNLRPIYEMLNAELQRVSQSLPPQMSRVVH 1150
Cdd:PRK07764  709 PPAGQADDPAAQPPQAAQGASAPSPAADDPVPLPPEPDDPPDPAGAPAQPPPPPAPAPAAAPAAAPPPSP 778
PRK07764 PRK07764
DNA polymerase III subunits gamma and tau; Validated
858-1117 2.40e-05

DNA polymerase III subunits gamma and tau; Validated


Pssm-ID: 236090 [Multi-domain]  Cd Length: 824  Bit Score: 48.83  E-value: 2.40e-05
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 19112374   858 TTAGWNDAPMLGQLPMRRAAPSMAPVRSPFPGASSAQPAAMSRTSSVSTLPPPPPTASMTASAPAIASPPPPKVGETYHP 937
Cdd:PRK07764  386 GVAGGAGAPAAAAPSAAAAAPAAAPAPAAAAPAAAAAPAPAAAPQPAPAPAPAPAPPSPAGNAPAGGAPSPPPAAAPSAQ 465
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 19112374   938 PTASGTRVPPVQQPSHPNPYTPVAPQSPVAAASRISSSPNMPP-----------------SNPYTPIAVAS--------- 991
Cdd:PRK07764  466 PAPAPAAAPEPTAAPAPAPPAAPAPAAAPAAPAAPAAPAGADDaatlrerwpeilaavpkRSRKTWAILLPeatvlgvrg 545
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 19112374   992 STVNPAHTYKPHGGS----------------------QI---VPPPKQPANRVVPLPPTASQRASAYEPPTvSVPSPSAL 1046
Cdd:PRK07764  546 DTLVLGFSTGGLARRfaspgnaevlvtalaeelggdwQVeavVGPAPGAAGGEGPPAPASSGPPEEAARPA-APAAPAAP 624
                         250       260       270       280       290       300       310
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....
gi 19112374  1047 SPSVTPQLPPVSSRLPPVSATRPQIPQPPPVSTALPSSSA---VSRPPIATSAGRSSTAASTSAPLTYPAGDRS 1117
Cdd:PRK07764  625 AAPAPAGAAAAPAEASAAPAPGVAAPEHHPKHVAVPDASDggdGWPAKAGGAAPAAPPPAPAPAAPAAPAGAAP 698
PHA03379 PHA03379
EBNA-3A; Provisional
937-1148 2.77e-05

EBNA-3A; Provisional


Pssm-ID: 223066 [Multi-domain]  Cd Length: 935  Bit Score: 48.52  E-value: 2.77e-05
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 19112374   937 PPTASGTRVPPVQQPShpnpytPVAPQSPVAAASRISS-SPNMPPSNPYTPiavasSTVNPAHTYKPHGGSQIVPPPKQP 1015
Cdd:PHA03379  409 SEPTYGTPRPPVEKPR------PEVPQSLETATSHGSAqVPEPPPVHDLEP-----GPLHDQHSMAPCPVAQLPPGPLQD 477
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 19112374  1016 ANR--VVPLPPTASQRASAYEP-PTVSVPSPSALSPSVTPQLPPVSSRLPPVSAtrpqipQPPPVSTALPSSSAVSRPPI 1092
Cdd:PHA03379  478 LEPgdQLPGVVQDGRPACAPVPaPAGPIVRPWEASLSQVPGVAFAPVMPQPMPV------EPVPVPTVALERPVCPAPPL 551
                         170       180       190       200       210       220
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*
gi 19112374  1093 ATSAGRSSTAASTSAPLTY------PAGDRSHIPGNLRPIYEMLNAELQRVSQSL---PPQMSRV 1148
Cdd:PHA03379  552 IAMQGPGETSGIVRVRERWrpapwtPNPPRSPSQMSVRDRLARLRAEAQPYQASVevqPPQLTQV 616
PRK13042 PRK13042
superantigen-like protein SSL4; Reviewed;
1025-1095 3.40e-05

superantigen-like protein SSL4; Reviewed;


Pssm-ID: 183854 [Multi-domain]  Cd Length: 291  Bit Score: 47.32  E-value: 3.40e-05
                          10        20        30        40        50        60        70
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|.
gi 19112374  1025 TASQRASAYEPPTVSVPSPSALSPSVTPQLPPVSSRLPPVSATRPQIPQPPPVSTAlPSSSAVSRPPIATS 1095
Cdd:PRK13042   23 TTTQAANATTPSSTKVEAPQSTPPSTKVEAPQSKPNATTPPSTKVEAPQQTPNATT-PSSTKVETPQSPTT 92
PRK07994 PRK07994
DNA polymerase III subunits gamma and tau; Validated
999-1115 3.70e-05

DNA polymerase III subunits gamma and tau; Validated


Pssm-ID: 236138 [Multi-domain]  Cd Length: 647  Bit Score: 47.94  E-value: 3.70e-05
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 19112374   999 TYKPHGGSQIVPPPKQPANRVVPLPPTASQRASAYEPPTVSVPSPSALSPSVTPQLPPVSSRlPPVSATRPQIPQPP--- 1075
Cdd:PRK07994  358 AFHPAAPLPEPEVPPQSAAPAASAQATAAPTAAVAPPQAPAVPPPPASAPQQAPAVPLPETT-SQLLAARQQLQRAQgat 436
                          90       100       110       120
                  ....*....|....*....|....*....|....*....|
gi 19112374  1076 PVSTALPSSSAVSRPPIATSAGRSSTAASTSAPLTYPAGD 1115
Cdd:PRK07994  437 KAKKSEPAAASRARPVNSALERLASVRPAPSALEKAPAKK 476
PRK12727 PRK12727
flagellar biosynthesis protein FlhF;
838-1035 3.86e-05

flagellar biosynthesis protein FlhF;


Pssm-ID: 237182 [Multi-domain]  Cd Length: 559  Bit Score: 47.68  E-value: 3.86e-05
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 19112374   838 TTSSHITPADVHPLPPPSTSTTAgwndaPMLGQLPMRRAAPSMAPVRSPFPGASSAQPAAMSRTSSVS---TLPPPPPTA 914
Cdd:PRK12727   57 TARSDTPATAAAPAPAPQAPTKP-----AAPVHAPLKLSANANMSQRQRVASAAEDMIAAMALRQPVSvprQAPAAAPVR 131
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 19112374   915 SMTASAPAIASPPPPKVGETYHPPTASGTRVP-----------PVQQPSHPNPYTPVAPQSPVAAAsrissspnmPPSNP 983
Cdd:PRK12727  132 AASIPSPAAQALAHAAAVRTAPRQEHALSAVPeqlfadflttaPVPRAPVQAPVVAAPAPVPAIAA---------ALAAH 202
                         170       180       190       200       210
                  ....*....|....*....|....*....|....*....|....*....|..
gi 19112374   984 YTPIAVASSTVNPAHTYKPHGGSQIVPPPKQPANRVVPLPPTASQRASAYEP 1035
Cdd:PRK12727  203 AAYAQDDDEQLDDDGFDLDDALPQILPPAALPPIVVAPAAPAALAAVAAAAP 254
PRK11901 PRK11901
hypothetical protein; Reviewed
816-985 4.15e-05

hypothetical protein; Reviewed


Pssm-ID: 237015 [Multi-domain]  Cd Length: 327  Bit Score: 46.99  E-value: 4.15e-05
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 19112374   816 TSQASSILPTVPRTTSYTSPYATTSSHITPADVHPLPPPSTSTTAgwndAPmlgqlpmrrAAPSMAPVRSPFPG----AS 891
Cdd:PRK11901   87 LSSGNQSSPSAANNTSDGHDASGVKNTAPPQDISAPPISPTPTQA----AP---------PQTPNGQQRIELPGnisdAL 153
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 19112374   892 SAQ-----PAAMSRTSSVSTLPPPPPTASMTASAPAIASPPPPKVgetyHPPTASGTRVPPVQQPSHPNPYTPVAPQSPV 966
Cdd:PRK11901  154 SQQqgqvnAASQNAQGNTSTLPTAPATVAPSKGAKVPATAETHPT----PPQKPATKKPAVNHHKTATVAVPPATSGKPK 229
                         170
                  ....*....|....*....
gi 19112374   967 AAASRISSSPNMPPSNpYT 985
Cdd:PRK11901  230 SGAASARALSSAPASH-YT 247
PRK12323 PRK12323
DNA polymerase III subunit gamma/tau;
1000-1129 4.22e-05

DNA polymerase III subunit gamma/tau;


Pssm-ID: 237057 [Multi-domain]  Cd Length: 700  Bit Score: 47.95  E-value: 4.22e-05
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 19112374  1000 YKPHGGSQIVPPPKQPANRVVPLPPTASQRASAYEPPTVSVPSPSALSPSVTPQLPPVS--SRLPPVSATRPQIPQ---- 1073
Cdd:PRK12323  363 FRPGQSGGGAGPATAAAAPVAQPAPAAAAPAAAAPAPAAPPAAPAAAPAAAAAARAVAAapARRSPAPEALAAARQasar 442
                          90       100       110       120       130       140
                  ....*....|....*....|....*....|....*....|....*....|....*....|.
gi 19112374  1074 -----PPPVSTALPSSSAVSRPPIATSAGRSSTAASTSAPlTYPAGDRSHIPGNLRPIYEM 1129
Cdd:PRK12323  443 gpggaPAPAPAPAAAPAAAARPAAAGPRPVAAAAAAAPAR-AAPAAAPAPADDDPPPWEEL 502
PLN03209 PLN03209
translocon at the inner envelope of chloroplast subunit 62; Provisional
776-1043 4.35e-05

translocon at the inner envelope of chloroplast subunit 62; Provisional


Pssm-ID: 178748 [Multi-domain]  Cd Length: 576  Bit Score: 47.61  E-value: 4.35e-05
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 19112374   776 PTEFPGAKEEIQRLTMLLEPHA-VPPIH----QIKQTGYAPVQPKTSQASSILPTVPRTTSYTS-PYATTSSHITPADVH 849
Cdd:PLN03209  330 PKESDAADGPKPVPTKPVTPEApSPPIEeeppQPKAVVPRPLSPYTAYEDLKPPTSPIPTPPSSsPASSKSVDAVAKPAE 409
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 19112374   850 PLPPPSTSTTAGWnDAPMLGQLPMRRAAP--------SMAPVRSPFPGASSAQPAAMSRTSSVSTLPPPPPTASmtasap 921
Cdd:PLN03209  410 PDVVPSPGSASNV-PEVEPAQVEAKKTRPlspyaryeDLKPPTSPSPTAPTGVSPSVSSTSSVPAVPDTAPATA------ 482
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 19112374   922 aiasppppkVGETYHPPTASGTRVPPVQQPSHPNPYTPVAPQSPVAAASRISSSPNMPPSNPYTPIAVASstvnpahtyk 1001
Cdd:PLN03209  483 ---------ATDAAAPPPANMRPLSPYAVYDDLKPPTSPSPAAPVGKVAPSSTNEVVKVGNSAPPTALAD---------- 543
                         250       260       270       280
                  ....*....|....*....|....*....|....*....|....*
gi 19112374  1002 phGGSQIVPPPKqpanrvvPLPPTasqraSAYE---PPTVSVPSP 1043
Cdd:PLN03209  544 --EQHHAQPKPR-------PLSPY-----TMYEdlkPPTSPTPSP 574
PLN00181 PLN00181
protein SPA1-RELATED; Provisional
211-335 5.25e-05

protein SPA1-RELATED; Provisional


Pssm-ID: 177776 [Multi-domain]  Cd Length: 793  Bit Score: 47.77  E-value: 5.25e-05
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 19112374   211 AATGAVNSIAWHPNNATRLATAiddNRNPIILTWDL-RQPTVPQniLTGHQKAALSLSWCPEDPTFLLSSGKDGRAMVWN 289
Cdd:PLN00181  530 ASRSKLSGICWNSYIKSQVASS---NFEGVVQVWDVaRSQLVTE--MKEHEKRVWSIDYSSADPTLLASGSDDGSVKLWS 604
                          90       100       110       120       130
                  ....*....|....*....|....*....|....*....|....*....|.
gi 19112374   290 VETGESLGSFPrsgnwyTKSSWC----PSNSNR-VAVASLEGKVSIFSIQS 335
Cdd:PLN00181  605 INQGVSIGTIK------TKANICcvqfPSESGRsLAFGSADHKVYYYDLRN 649
PRK14950 PRK14950
DNA polymerase III subunits gamma and tau; Provisional
1015-1107 6.40e-05

DNA polymerase III subunits gamma and tau; Provisional


Pssm-ID: 237864 [Multi-domain]  Cd Length: 585  Bit Score: 47.11  E-value: 6.40e-05
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 19112374  1015 PANRVVPLPPTASQRASAYEPPTVSVPsPSALSPSVTPQLPPVSSRLPP-VSATRPQIPQPPPVSTALPSSS-------- 1085
Cdd:PRK14950  362 PVPAPQPAKPTAAAPSPVRPTPAPSTR-PKAAAAANIPPKEPVRETATPpPVPPRPVAPPVPHTPESAPKLTraaipvde 440
                          90       100
                  ....*....|....*....|...
gi 19112374  1086 -AVSRPPIATSAGRSSTAASTSA 1107
Cdd:PRK14950  441 kPKYTPPAPPKEEEKALIADGDV 463
PRK07003 PRK07003
DNA polymerase III subunit gamma/tau;
778-1120 6.65e-05

DNA polymerase III subunit gamma/tau;


Pssm-ID: 235906 [Multi-domain]  Cd Length: 830  Bit Score: 47.15  E-value: 6.65e-05
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 19112374   778 EFPGAKEeIQRLTMLLEPHAVPPIHQIKQTGYAPVQPKTSQASSILPTVPRTTSYtSPYATTSShiTPADVHPLPPPSTS 857
Cdd:PRK07003  305 EWPEAAD-LRRFAELLSPEQVQLFYQIATVGRGELGLAPDEYAGFTMTLLRMLAF-EPAVTGGG--APGGGVPARVAGAV 380
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 19112374   858 TTAGWNDAPMLGQlpmrRAAPSMAPVRSPfPGASSAQPAAMSRTSSVSTLPP--PPPTASMTASAPAIASPPPPKVGETY 935
Cdd:PRK07003  381 PAPGARAAAAVGA----SAVPAVTAVTGA-AGAALAPKAAAAAAATRAEAPPaaPAPPATADRGDDAADGDAPVPAKANA 455
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 19112374   936 HPPTASGTRVPPVQQPSHPNPYTPVAPQSPVAAASRISsspnmPPSNPYTPIAVASSTVNPAHTYKPHGgsQIVPPPKQP 1015
Cdd:PRK07003  456 RASADSRCDERDAQPPADSGSASAPASDAPPDAAFEPA-----PRAAAPSAATPAAVPDARAPAAASRE--DAPAAAAPP 528
                         250       260       270       280       290       300       310       320
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 19112374  1016 ANRVVPLPPTASQrasayePPTVSVPSPSALSPSVTPQLPPVSSRLPPVSATRPQIPQPPPVSTALPSSSAVSRPPIATS 1095
Cdd:PRK07003  529 APEARPPTPAAAA------PAARAGGAAAALDVLRNAGMRVSSDRGARAAAAAKPAAAPAAAPKPAAPRVAVQVPTPRAR 602
                         330       340
                  ....*....|....*....|....*
gi 19112374  1096 AGRSSTAASTSAPLTYPAGDRSHIP 1120
Cdd:PRK07003  603 AATGDAPPNGAARAEQAAESRGAPP 627
PRK07764 PRK07764
DNA polymerase III subunits gamma and tau; Validated
876-1105 9.11e-05

DNA polymerase III subunits gamma and tau; Validated


Pssm-ID: 236090 [Multi-domain]  Cd Length: 824  Bit Score: 46.90  E-value: 9.11e-05
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 19112374   876 AAPSMAPVRSPFPGASSAQPAAMSRTSSVSTLPPPPPTASmtasapaiasppppkvGETYHPPTASGTRVPPVQQPSHPN 955
Cdd:PRK07764  591 APGAAGGEGPPAPASSGPPEEAARPAAPAAPAAPAAPAPA----------------GAAAAPAEASAAPAPGVAAPEHHP 654
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 19112374   956 PYTPVAPQSPVAAASRISSSPNMPPSNPYTPIAVASSTVNPAHTYKPHGGSQIVPPPKQPANRVVPLPPTASQRASAYEP 1035
Cdd:PRK07764  655 KHVAVPDASDGGDGWPAKAGGAAPAAPPPAPAPAAPAAPAGAAPAQPAPAPAATPPAGQADDPAAQPPQAAQGASAPSPA 734
                         170       180       190       200       210       220       230
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 19112374  1036 PTVSVPSPSALSPSVTPQLPPVSSRLPPVSATRPQIPQPPPVSTalPSSSAVSRPPIATSAGRSSTAAST 1105
Cdd:PRK07764  735 ADDPVPLPPEPDDPPDPAGAPAQPPPPPAPAPAAAPAAAPPPSP--PSEEEEMAEDDAPSMDDEDRRDAE 802
PHA02030 PHA02030
hypothetical protein
1004-1093 1.00e-04

hypothetical protein


Pssm-ID: 222843 [Multi-domain]  Cd Length: 336  Bit Score: 46.13  E-value: 1.00e-04
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 19112374  1004 GGSQIVPPPKQPANRVVPLPPTASQRASAYEPPTVSVPSPSALSPSVTPQLPPVssrlpPVSATRPQIPQPPPVST-ALP 1082
Cdd:PHA02030  248 GGGEDLIIKPKSKAAGSNLPAVPNVAADAGSAAAPAVPAAAAAVAQAAPSVPQV-----PNVAVLPDVPQVAPVAApAAP 322
                          90
                  ....*....|.
gi 19112374  1083 SSSAVSRPPIA 1093
Cdd:PHA02030  323 EVPAVPVVPAA 333
PRK14951 PRK14951
DNA polymerase III subunits gamma and tau; Provisional
868-1025 1.08e-04

DNA polymerase III subunits gamma and tau; Provisional


Pssm-ID: 237865 [Multi-domain]  Cd Length: 618  Bit Score: 46.63  E-value: 1.08e-04
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 19112374   868 LGQLPMRRAAPSMAPVRS-PFPGASSAQPAAMSRTSSVSTLPPPPPTAsmtasapaiASPPPPKVGETYHPPTASGTRVP 946
Cdd:PRK14951  344 LGLAPDEYAALTMVLLRLlAFKPAAAAEAAAPAEKKTPARPEAAAPAA---------APVAQAAAAPAPAAAPAAAASAP 414
                          90       100       110       120       130       140       150
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....
gi 19112374   947 PVQQPSHPNPytPVAPQSPVAAASRISSSPNMPPSNPYTPIAVASSTVNPAHTYKPHGGSQIVPPPKQPANRVVPLPPT 1025
Cdd:PRK14951  415 AAPPAAAPPA--PVAAPAAAAPAAAPAAAPAAVALAPAPPAQAAPETVAIPVRVAPEPAVASAAPAPAAAPAAARLTPT 491
Herpes_BLLF1 pfam05109
Herpes virus major outer envelope glycoprotein (BLLF1); This family consists of the BLLF1 ...
811-1031 1.13e-04

Herpes virus major outer envelope glycoprotein (BLLF1); This family consists of the BLLF1 viral late glycoprotein, also termed gp350/220. It is the most abundantly expressed glycoprotein in the viral envelope of the Herpesviruses and is the major antigen responsible for stimulating the production of neutralising antibodies in vivo.


Pssm-ID: 282904 [Multi-domain]  Cd Length: 886  Bit Score: 46.45  E-value: 1.13e-04
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 19112374    811 PVQPKTSQASSILPTVPRTTS----YTSPYATTSSHI------TPADVHPLPPPSTSTTAGWNDAPMLGQLPMRRAAPSM 880
Cdd:pfam05109  572 PTLGKTSPTSAVTTPTPNATSptvgETSPQANTTNHTlggtssTPVVTSPPKNATSAVTTGQHNITSSSTSSMSLRPSSI 651
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 19112374    881 APVRSP---------FPGASSAQPA----------AMSRTSSVSTLPPPPPTASMTASAPAIASPPPPKVGET-----YH 936
Cdd:pfam05109  652 SETLSPstsdnstshMPLLTSAHPTggenitqvtpASTSTHHVSTSSPAPRPGTTSQASGPGNSSTSTKPGEVnvtkgTP 731
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 19112374    937 PPTASGTRVPPVQQPSHPNPYTPVAPQSPVAAASRIS---SSPNMPPSNPYTPIAVASSTVNPAHTY-KPHGGSQIVP-- 1010
Cdd:pfam05109  732 PKNATSPQAPSGQKTAVPTVTSTGGKANSTTGGKHTTghgARTSTEPTTDYGGDSTTPRTRYNATTYlPPSTSSKLRPrw 811
                          250       260
                   ....*....|....*....|....*
gi 19112374   1011 ----PPKQPANRVVPLPPTASQRAS 1031
Cdd:pfam05109  812 tftsPPVTTAQATVPVPPTSQPRFS 836
dnaA PRK14086
chromosomal replication initiator protein DnaA;
938-1114 1.48e-04

chromosomal replication initiator protein DnaA;


Pssm-ID: 237605 [Multi-domain]  Cd Length: 617  Bit Score: 45.97  E-value: 1.48e-04
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 19112374   938 PTASGTRVPPVQQPSHPNPYTPVAPQSPVAAASR---------ISSSPNMPPSNPYTPiavasstvnpahTYKPHGGSQI 1008
Cdd:PRK14086   90 PSAGEPAPPPPHARRTSEPELPRPGRRPYEGYGGpraddrppgLPRQDQLPTARPAYP------------AYQQRPEPGA 157
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 19112374  1009 VPPPkqPANRVVPLPPTASQRASAYEPPTVSVPSPSALSPSVTPQLPPVSSRLPPVSATRP----------QIPQPPPVS 1078
Cdd:PRK14086  158 WPRA--ADDYGWQQQRLGFPPRAPYASPASYAPEQERDREPYDAGRPEYDQRRRDYDHPRPdwdrprrdrtDRPEPPPGA 235
                         170       180       190       200
                  ....*....|....*....|....*....|....*....|.
gi 19112374  1079 -----TALPSSSAVSRPPIATSAGRSSTAASTSAPLTYPAG 1114
Cdd:PRK14086  236 ghvhrGGPGPPERDDAPVVPIRPSAPGPLAAQPAPAPGPGE 276
PHA03247 PHA03247
large tegument protein UL36; Provisional
849-1091 1.65e-04

large tegument protein UL36; Provisional


Pssm-ID: 223021 [Multi-domain]  Cd Length: 3151  Bit Score: 46.08  E-value: 1.65e-04
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 19112374   849 HPL------PPPSTSTTAGWNDAPMlgqlPMRRAAPSMAPVRSPFPGASSAQPA---AMSRTSSVSTLPPPPPTASMTAS 919
Cdd:PHA03247  246 HPLrgdiaaPAPPPVVGEGADRAPE----TARGATGPPPPPEAAAPNGAAAPPDgvwGAALAGAPLALPAPPDPPPPAPA 321
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 19112374   920 APAIASPPPPKVGETYHPptasgtrVPpvqqpsHPNPYTPVA-PQ------SPVAAASRISSSPNMPPSNPyTPIAVASS 992
Cdd:PHA03247  322 GDAEEEDDEDGAMEVVSP-------LP------RPRQHYPLGfPKrrrptwTPPSSLEDLSAGRHHPKRAS-LPTRKRRS 387
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 19112374   993 TVNPAHTYKPHGGSQIVPPPKQPANRVVPLPPTASQRASAYEPPTVSVPSPSALSPSvTPQLPPvsSRLPPVSATRPQIP 1072
Cdd:PHA03247  388 ARHAATPFARGPGGDDQTRPAAPVPASVPTPAPTPVPASAPPPPATPLPSAEPGSDD-GPAPPP--ERQPPAPATEPAPD 464
                         250
                  ....*....|....*....
gi 19112374  1073 QPPPVSTALPSSSAVSRPP 1091
Cdd:PHA03247  465 DPDDATRKALDALRERRPP 483
PHA02682 PHA02682
ORF080 virion core protein; Provisional
893-1035 1.70e-04

ORF080 virion core protein; Provisional


Pssm-ID: 177464 [Multi-domain]  Cd Length: 280  Bit Score: 44.85  E-value: 1.70e-04
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 19112374   893 AQPAAMSRTSSVSTLPPPPPTASMTASAPAIASPPPPKVGETYHPPTAsgtrVPPVQQPSHPNPYTPVAPQSPVAAAsri 972
Cdd:PHA02682   68 ANSACMQRPSGQSPLAPSPACAAPAPACPACAPAAPAPAVTCPAPAPA----CPPATAPTCPPPAVCPAPARPAPAC--- 140
                          90       100       110       120       130       140
                  ....*....|....*....|....*....|....*....|....*....|....*....|...
gi 19112374   973 ssspnmPPSNPYTPIAVASSTVNPAHTYKPHGGSQIVPPPKQPANRvVPLPPTASQRASAYEP 1035
Cdd:PHA02682  141 ------PPSTRQCPPAPPLPTPKPAPAAKPIFLHNQLPPPDYPAAS-CPTIETAPAASPVLEP 196
KLF10_11_N cd21974
N-terminal domain of Kruppel-like factor (KLF) 10, KLF11, and similar proteins; This subfamily ...
950-1084 1.83e-04

N-terminal domain of Kruppel-like factor (KLF) 10, KLF11, and similar proteins; This subfamily is composed of Kruppel-like factor or Krueppel-like factor (KLF) 10, KLF11, and similar proteins. KLF10 was first identified in human osteoblasts and plays a role in mediating estrogen (E2) signaling in bone and skeletal homeostasis and a regulatory role in tumor formation and metastasis. KLF11 is involved in cell growth, apoptosis, cellular inflammation and differentiation, endometriosis, and cholesterol, prostaglandin, neurotransmitter, fat, and sugar metabolism. KLF9, KLF10, KLF11, KLF13, KLF14, and KLF16 share a conserved alpha-helical motif AA/VXXL that mediates their binding to Sin3A and their activities as transcriptional repressors. KLF10/11 belong to a family of proteins, called the Specificity Protein (SP)/KLF family, characterized by a C-terminal DNA-binding domain of 81 amino acids consisting of three Kruppel-like C2H2 zinc fingers. These factors bind to a loose consensus motif, namely NNRCRCCYY (where N is any nucleotide, R is A/G, and Y is C/T), such as the recurring motifs in GC and GT boxes (5'-GGGGCGGGG-3' and 5'-GGTGTGGGG-3') that are present in promoters and more distal regulatory elements of mammalian genes. Members of the KLF family can act as activators or repressors of transcription depending on cell and promoter context. KLFs regulate various cellular functions, such as proliferation, differentiation, and apoptosis, as well as the development and homeostasis of several types of tissue. In addition to the C-terminal DNA-binding domain, each KLF also has a unique N-terminal activation/repression domain that confers specificity and allows it to bind specifically to a certain partner, leading to distinct activities in vivo. This model represents the N-terminal domain of KLF10, KLF11, and similar proteins.
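
The degenerate motif quoted in the description above (NNRCRCCYY, with N = any nucleotide, R = A/G, Y = C/T) can be read as a small pattern over DNA. The following is only an illustrative sketch of that reading, not part of the CDD record: the names `IUPAC`, `motif_to_regex`, `KLF_MOTIF`, and `find_sites` are hypothetical, and the scan covers one strand only and requires a strict match, whereas the description calls the consensus "loose".

```python
import re

# IUPAC codes used in the description (N, R, Y) plus the four plain bases.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]",    # purine
    "Y": "[CT]",    # pyrimidine
    "N": "[ACGT]",  # any nucleotide
}

def motif_to_regex(motif):
    """Translate a degenerate DNA motif into a compiled regular expression."""
    return re.compile("".join(IUPAC[code] for code in motif))

MOTIF = "NNRCRCCYY"
KLF_MOTIF = motif_to_regex(MOTIF)

def find_sites(seq):
    """Return 0-based start positions of all (possibly overlapping) strict matches."""
    seq = seq.upper()
    width = len(MOTIF)
    return [
        start
        for start in range(len(seq) - width + 1)
        if KLF_MOTIF.fullmatch(seq, start, start + width)
    ]
```

For example, `find_sites("TTAAGCACCTTGG")` reports a single site starting at offset 2, where the window `AAGCACCTT` satisfies every position of the motif.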


Pssm-ID: 409243 [Multi-domain]  Cd Length: 229  Bit Score: 44.54  E-value: 1.83e-04
                         10        20        30        40        50        60        70        80
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 19112374  950 QPSHPNPYTPV--------APQSPVAAASriSSSPNMPPsnPYTPIAVASSTVNPAHTYKPHGGSQIVPPPKQPANRVvp 1021
Cdd:cd21974   24 RLRKPRPLTPSsdssdeddAPESPKDFHS--LSSLCMTP--PYSPPFFEASHSPSVASLHPPSAASSQPPPEPESSEP-- 97
                         90       100       110       120       130       140       150
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 19112374 1022 lPPTASQRASAyepptVSV-----PSPSALSPSVTPQLPPVSSRLPPVSA--TRPQIPQPPPVSTALPSS 1084
Cdd:cd21974   98 -PAASPQRAQA-----TSVirhtaDPVPVSPPPVLCQMLPVSSSSGVIVAflKAPQQPSPQPQKPALPQP 161
PRK10263 PRK10263
DNA translocase FtsK; Provisional
806-1136 2.03e-04

DNA translocase FtsK; Provisional


Pssm-ID: 236669 [Multi-domain]  Cd Length: 1355  Bit Score: 45.85  E-value: 2.03e-04
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 19112374   806 QTGYAPVQPKTSQAS----SILPTVPRTTSYTSPYATTSSH-ITPADVHPLPPPSTSTTAGWNDAPMlgQLPMRRAAPSM 880
Cdd:PRK10263  331 QSWAAPVEPVTQTPPvasvDVPPAQPTVAWQPVPGPQTGEPvIAPAPEGYPQQSQYAQPAVQYNEPL--QQPVQPQQPYY 408
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 19112374   881 APVRSPFPgaSSAQPAAMSRTSSVSTLPPPPPTASMTASAPAIASPPPPKVGETYHPPTAsgTRVPPVQQPSHPNPYTPV 960
Cdd:PRK10263  409 APAAEQPA--QQPYYAPAPEQPAQQPYYAPAPEQPVAGNAWQAEEQQSTFAPQSTYQTEQ--TYQQPAAQEPLYQQPQPV 484
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 19112374   961 APQSPVAAASRISSS-PNMPPSNPYTPIAV--ASSTVNPAHTYKPhggsqiVPppkQPANRVVPLPPTASQRASAYEPPT 1037
Cdd:PRK10263  485 EQQPVVEPEPVVEETkPARPPLYYFEEVEEkrAREREQLAAWYQP------IP---EPVKEPEPIKSSLKAPSVAAVPPV 555
                         250       260       270       280       290       300       310       320
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 19112374  1038 VSVPSPSALSPSV---TPQLPPVSSRLPPV------SATRPQI-----PQ-PPPVSTALPSSSAVSRPPIATSAGR-SST 1101
Cdd:PRK10263  556 EAAAAVSPLASGVkkaTLATGAAATVAAPVfslansGGPRPQVkegigPQlPRPKRIRVPTRRELASYGIKLPSQRaAEE 635
                         330       340       350
                  ....*....|....*....|....*....|....*
gi 19112374  1102 AASTSAPLTYPAGDRShipgNLRPIYEMLNAELQR 1136
Cdd:PRK10263  636 KAREAQRNQYDSGDQY----NDDEIDAMQQDELAR 666
PRK12323 PRK12323
DNA polymerase III subunit gamma/tau;
772-983 2.18e-04

DNA polymerase III subunit gamma/tau;


Pssm-ID: 237057 [Multi-domain]  Cd Length: 700  Bit Score: 45.64  E-value: 2.18e-04
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 19112374   772 LNLVPTEFPGakeeiqrLTM-LLEPHAVPPIHQIKQTG-----YAPVQPKTSQASSILPTVPRTTSYTSPYA----TTSS 841
Cdd:PRK12323  343 LALAPDEYAG-------FTMtLLRMLAFRPGQSGGGAGpataaAAPVAQPAPAAAAPAAAAPAPAAPPAAPAaapaAAAA 415
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 19112374   842 HITPADVHPLPPPSTSTTAGWNDAPMLGQLPMRRAAPSMAPVrspfPGASSAQPAAMSRTSSVSTLPPPPPTASMTASAP 921
Cdd:PRK12323  416 ARAVAAAPARRSPAPEALAAARQASARGPGGAPAPAPAPAAA----PAAAARPAAAGPRPVAAAAAAAPARAAPAAAPAP 491
                         170       180       190       200       210       220       230
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*
gi 19112374   922 AI-------------ASPPPPKVGETYHPPTASGTRVPPVQQPSHPNPYTPVAPQSPVAAASRISSSPNMPPSNP 983
Cdd:PRK12323  492 ADddpppweelppefASPAPAQPDAAPAGWVAESIPDPATADPDDAFETLAPAPAAAPAPRAAAATEPVVAPRPP 566
Gag_spuma pfam03276
Spumavirus gag protein;
937-1113 2.43e-04

Spumavirus gag protein;


Pssm-ID: 460872 [Multi-domain]  Cd Length: 614  Bit Score: 45.51  E-value: 2.43e-04
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 19112374    937 PPTASGTRVPPVqqPSHPNPYTPvAPQSPVAAasriSSSPNMPPSNPYTPIAVASSTVNPAHTYKPHGGSQIVPPPKQPa 1016
Cdd:pfam03276  187 PPGASFSGLPSL--PAIGGIHLP-AIPGIHAR----APPGNIARSLGDDIMPSLGDAGMPQPRFAFHPGNPFAEAEGHP- 258
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 19112374   1017 nrvvPLPPTASQRASAYEPPTVSVPSPSALS---PSVTPQLPPVSSrLPPVSATRPQIPQPPPVSTALPSSSAVSRPPIa 1093
Cdd:pfam03276  259 ----FAEAEGERPRDIPRAPRIDAPSAPAIPaiqPIAPPMIPPIGA-PIPIPHGASIPGEHIRNPREEPIRLGREAPAI- 332
                          170       180
                   ....*....|....*....|
gi 19112374   1094 tsAGRSSTAASTSAPLTYPA 1113
Cdd:pfam03276  333 --DGRFAPAIDDLFCRIINA 350
PHA03379 PHA03379
EBNA-3A; Provisional
930-1112 3.00e-04

EBNA-3A; Provisional


Pssm-ID: 223066 [Multi-domain]  Cd Length: 935  Bit Score: 45.05  E-value: 3.00e-04
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 19112374   930 KVGETY------------HPPTASGTRVPPVQQ-------PSHPNPYTPVAPQSP------VAAASRISSSPNMPPSNPY 984
Cdd:PHA03379  426 EVPQSLetatshgsaqvpEPPPVHDLEPGPLHDqhsmapcPVAQLPPGPLQDLEPgdqlpgVVQDGRPACAPVPAPAGPI 505
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 19112374   985 TPIAVASSTVNPAHTYKPHGgsqivppPKQPANRVVPLPPTASQRASAYEPPTVSVPSPSALSPSVTPQLppvSSRLPPV 1064
Cdd:PHA03379  506 VRPWEASLSQVPGVAFAPVM-------PQPMPVEPVPVPTVALERPVCPAPPLIAMQGPGETSGIVRVRE---RWRPAPW 575
                         170       180       190       200       210
                  ....*....|....*....|....*....|....*....|....*....|
gi 19112374  1065 SATRPQIPQPPPVSTALPSSSAVSRPPIATSAGRSS--TAASTSAPLTYP 1112
Cdd:PHA03379  576 TPNPPRSPSQMSVRDRLARLRAEAQPYQASVEVQPPqlTQVSPQQPMEYP 625
KLF10_N cd21572
N-terminal domain of Kruppel-like factor 10; Kruppel-like factor 10 (KLF10; also known as ...
933-1118 3.15e-04

N-terminal domain of Kruppel-like factor 10; Kruppel-like factor 10 (KLF10; also known as Krueppel-like factor 10; early growth response (EGR)-alpha/EGRA; TGFbeta inducible early gene-1/TIEG1) is a protein that in humans is encoded by the KLF10 gene. KLF10 was first identified in human osteoblasts and plays a role in mediating estrogen (E2) signaling in bone and skeletal homeostasis and a regulatory role in tumor formation and metastasis. It may also play a role in adipocyte differentiation and adipose tissue function. KLF9, KLF10, KLF11, KLF13, KLF14, and KLF16 share a conserved alpha-helical motif AA/VXXL that mediates their binding to Sin3A and their activities as transcriptional repressors. KLF10 belongs to a family of proteins, called the Specificity Protein (SP)/KLF family, characterized by a C-terminal DNA-binding domain of 81 amino acids consisting of three Kruppel-like C2H2 zinc fingers. These factors bind to a loose consensus motif, namely NNRCRCCYY (where N is any nucleotide, R is A/G, and Y is C/T), such as the recurring motifs in GC and GT boxes (5'-GGGGCGGGG-3' and 5'-GGTGTGGGG-3') that are present in promoters and more distal regulatory elements of mammalian genes. Members of the KLF family can act as activators or repressors of transcription depending on cell and promoter context. KLFs regulate various cellular functions, such as proliferation, differentiation, and apoptosis, as well as the development and homeostasis of several types of tissue. In addition to the C-terminal DNA-binding domain, each KLF also has a unique N-terminal activation/repression domain that confers specificity and allows it to bind specifically to a certain partner, leading to distinct activities in vivo. This model represents the N-terminal domain of KLF10.


Pssm-ID: 409241 [Multi-domain]  Cd Length: 245  Bit Score: 43.82  E-value: 3.15e-04
                         10        20        30        40        50        60        70        80
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 19112374  933 ETYHPPTASGTRVPPVQQP--SHPNPYTPVAPQSPVA-------------AASRISSSPNMPPSNPYTPIAVASSTVNpa 997
Cdd:cd21572   68 EATHPPSAATLHPPAAQPPeeQHLSAETAASQQRFQCtsvirhtadaqpcSCSSCPSSPSVVPSVPAGVAGVSPVPVY-- 145
                         90       100       110       120       130       140       150       160
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 19112374  998 htykphggSQIVPppkqpanrVVPLPPTASQrasayeppTVSVPSPSALSPSVTPQLPPVSSRLP--PVSATRPQ-IPQP 1074
Cdd:cd21572  146 --------CQILP--------VSSSSTTVVA--------AQAPLPQPQQQAASPAQVFLMGGQVPkgPVMFLVPQpVVPT 201
                        170       180       190       200
                 ....*....|....*....|....*....|....*....|....
gi 19112374 1075 PPVSTALPSSSAVSRPPIATSAGRSSTAASTSAPLTYPAGDRSH 1118
Cdd:cd21572  202 LYVQPTLVTPGGTKLAAIAPAPGHTPSEQRKSPPQPEVSRVRSH 245
Cornifin pfam02389
Cornifin (SPRR) family; SPRR genes (formerly SPR) encode a novel class of polypeptides (small ...
948-1083 3.79e-04

Cornifin (SPRR) family; SPRR genes (formerly SPR) encode a novel class of polypeptides (small proline rich proteins) that are strongly induced during differentiation of human epidermal keratinocytes in vitro and in vivo. The most characteristic feature of the SPRR gene family resides in the structure of the central segments of the encoded polypeptides, which are built up from tandemly repeated units of either eight (SPRR1 and SPRR3) or nine (SPRR2) amino acids with the general consensus XKXPEPXX, where X is any amino acid. To avoid bacterial contamination due to the highly polar nature of the HMM, the threshold has been set very high.
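
The eight-residue consensus XKXPEPXX quoted above (X = any amino acid) can be scanned for in a protein sequence with a simple pattern. This is a hedged sketch for illustration only: `SPRR_UNIT` and `find_sprr_units` are hypothetical names, the input below is a made-up two-repeat string rather than a real SPRR sequence, and the nine-residue SPRR2 variant is not handled.

```python
import re

# X (any amino acid) is written as [A-Z]; K, P, E, P are fixed positions.
SPRR_UNIT = re.compile(r"[A-Z]K[A-Z]PEP[A-Z][A-Z]")

def find_sprr_units(protein):
    """Return 0-based start positions of every window matching XKXPEPXX."""
    protein = protein.upper()
    return [
        start
        for start in range(len(protein) - 7)
        if SPRR_UNIT.fullmatch(protein, start, start + 8)
    ]

# Two tandem copies of a hypothetical unit, "PKVPEPCC".
starts = find_sprr_units("PKVPEPCCPKVPEPCC")
```

Here `starts` holds the offsets 0 and 8, one per tandem unit.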


Pssm-ID: 280537 [Multi-domain]  Cd Length: 135  Bit Score: 41.96  E-value: 3.79e-04
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 19112374    948 VQQPSHPNPYTPVAPQSPVAAASRISSspnmpPSNPYTPIAVASSTVNPAHTYKPHGGSQIVPPPKQPAnrvVPLP--PT 1025
Cdd:pfam02389    5 VKQPCQPPPQEPCVPTTKEPCHSKVPE-----PCNPKVPEPCCPKVPEPCCPKVPEPCCPKVPEPCCPK---VPEPcyPK 76
                           90       100       110       120       130
                   ....*....|....*....|....*....|....*....|....*....|....*...
gi 19112374   1026 ASQRASAYEPPTVSVPSPSALSPSVTPQLPPVSsrlpPVSAtRPQIPQPPPvSTALPS 1083
Cdd:pfam02389   77 VPEPCSPKVPEPCHPKAPEPCHPKVPEPCYPKA----PEPC-QPKVPEPCP-STVTPG 128
PRK12323 PRK12323
DNA polymerase III subunit gamma/tau;
938-1108 4.13e-04

DNA polymerase III subunit gamma/tau;


Pssm-ID: 237057 [Multi-domain]  Cd Length: 700  Bit Score: 44.48  E-value: 4.13e-04
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 19112374   938 PTASGTRVPPVQQPSHPNPYTPVAPQSPVAAASRISSSPNMPPSNPytpiAVASSTVNPAHtykphggsqiVPPPKQPAn 1017
Cdd:PRK12323  365 PGQSGGGAGPATAAAAPVAQPAPAAAAPAAAAPAPAAPPAAPAAAP----AAAAAARAVAA----------APARRSPA- 429
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 19112374  1018 rvvPLPPTASQRASAYEPPTVSVPSPSAlspsvtpqlPPVssrlpPVSATRPQIPQPPPvstalpsssavsRPPIATSAG 1097
Cdd:PRK12323  430 ---PEALAAARQASARGPGGAPAPAPAP---------AAA-----PAAAARPAAAGPRP------------VAAAAAAAP 480
                         170
                  ....*....|.
gi 19112374  1098 RSSTAASTSAP 1108
Cdd:PRK12323  481 ARAAPAAAPAP 491
PHA03307 PHA03307
transcriptional regulator ICP4; Provisional
951-1126 4.59e-04

transcriptional regulator ICP4; Provisional


Pssm-ID: 223039 [Multi-domain]  Cd Length: 1352  Bit Score: 44.78  E-value: 4.59e-04
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 19112374   951 PSHPNPYTPV--APQSPVAAASRISSSPNMPPSNPYTPIAVASSTVNPAHTYKPHGGSQIVPPPKQPANRVVPLPPTA-- 1026
Cdd:PHA03307   19 EFFPRPPATPgdAADDLLSGSQGQLVSDSAELAAVTVVAGAAACDRFEPPTGPPPGPGTEAPANESRSTPTWSLSTLApa 98
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 19112374  1027 --------SQRASAYEPPTVSVPSPSALSPSVTPQLPPVSSRLPPVSATRPQIPQPPPVS-----TALPSSSAVSRP-PI 1092
Cdd:PHA03307   99 sparegspTPPGPSSPDPPPPTPPPASPPPSPAPDLSEMLRPVGSPGPPPAASPPAAGASpaavaSDAASSRQAALPlSS 178
                         170       180       190
                  ....*....|....*....|....*....|....
gi 19112374  1093 ATSAGRSSTAASTSAPLTYPAGDRSHIPGNLRPI 1126
Cdd:PHA03307  179 PEETARAPSSPPAEPPPSTPPAAASPRPPRRSSP 212
WD40 COG2319
WD40 repeat [General function prediction only];
178-335 6.25e-04

WD40 repeat [General function prediction only];


Pssm-ID: 441893 [Multi-domain]  Cd Length: 403  Bit Score: 43.75  E-value: 6.25e-04
                         10        20        30        40        50        60        70        80
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 19112374  178 LASGNATEYTTVWDVKLKRQVLNLSYLGAAGVSAATGAVNSIAWHPNNATRLATAIDDNRnpiilTWDLRQPTVPQNILT 257
Cdd:COG2319    1 ALSADGAALAAASADLALALLAAALGALLLLLLGLAAAVASLAASPDGARLAAGAGDLTL-----LLLDAAAGALLATLL 75
                         90       100       110       120       130       140       150
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*...
gi 19112374  258 GHQKAALSLSWCPeDPTFLLSSGKDGRAMVWNVETGESLGSFPRSGNWYTKSSWCPsNSNRVAVASLEGKVSIFSIQS 335
Cdd:COG2319   76 GHTAAVLSVAFSP-DGRLLASASADGTVRLWDLATGLLLRTLTGHTGAVRSVAFSP-DGKTLASGSADGTVRLWDLAT 151
half-pint TIGR01645
poly-U binding splicing factor, half-pint family; The proteins represented by this model ...
930-1079 6.53e-04

poly-U binding splicing factor, half-pint family; The proteins represented by this model contain three RNA recognition motifs (rrm: pfam00076) and have been characterized as poly-pyrimidine tract binding proteins associated with RNA splicing factors. PUF60 (GP|6176532), in complex with p54 and in the presence of U2AF, facilitates association of U2 snRNP with pre-mRNA.


Pssm-ID: 130706 [Multi-domain]  Cd Length: 612  Bit Score: 43.91  E-value: 6.53e-04
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 19112374    930 KVGETYHPPTasgtrvpPVQQPSHPNPYTPVAPQSPVAAASRI--------SSSPNMPPSNPYTPIAVASSTVnpahtyk 1001
Cdd:TIGR01645  277 RVGKCVTPPD-------ALLQPATVSAIPAAAAVAAAAATAKImaaeavagAAVLGPRAQSPATPSSSLPTDI------- 342
                           90       100       110       120       130       140       150
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*...
gi 19112374   1002 phGGSQIVPPPKQPANRVVPLPPTAsqrASAYEPPTVSVPSPsalspsVTPQLPPVSSRLPPVSATRPQIPQPPPVST 1079
Cdd:TIGR01645  343 --GNKAVVSSAKKEAEEVPPLPQAA---PAVVKPGPMEIPTP------VPPPGLAIPSLVAPPGLVAPTEINPSFLAS 409
WD40 smart00320
WD40 repeats; Note that these repeats are permuted with respect to the structural repeats ...
252-289 7.72e-04

WD40 repeats; Note that these repeats are permuted with respect to the structural repeats (blades) of the beta propeller domain.


Pssm-ID: 197651 [Multi-domain]  Cd Length: 40  Bit Score: 38.06  E-value: 7.72e-04
                            10        20        30
                    ....*....|....*....|....*....|....*...
gi 19112374     252 PQNILTGHQKAALSLSWCPeDPTFLLSSGKDGRAMVWN 289
Cdd:smart00320    4 LLKTLKGHTGPVTSVAFSP-DGKYLASGSDDGTIKLWD 40
WD40 pfam00400
WD domain, G-beta repeat;
252-289 8.38e-04

WD domain, G-beta repeat;


Pssm-ID: 459801 [Multi-domain]  Cd Length: 39  Bit Score: 38.10  E-value: 8.38e-04
                           10        20        30
                   ....*....|....*....|....*....|....*...
gi 19112374    252 PQNILTGHQKAALSLSWCPeDPTFLLSSGKDGRAMVWN 289
Cdd:pfam00400    3 LLKTLEGHTGSVTSLAFSP-DGKLLASGSDDGTVKVWD 39
PRK07764 PRK07764
DNA polymerase III subunits gamma and tau; Validated
1002-1150 8.43e-04

DNA polymerase III subunits gamma and tau; Validated


Pssm-ID: 236090 [Multi-domain]  Cd Length: 824  Bit Score: 43.82  E-value: 8.43e-04
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 19112374  1002 PHGGSQIVPPPKQPANRvvPLPPTASQRASAYEPPTVSVPSPSALSPSVTPQLPPVSSRLPPVSATRPQIPQPPPVSTAL 1081
Cdd:PRK07764  386 GVAGGAGAPAAAAPSAA--AAAPAAAPAPAAAAPAAAAAPAPAAAPQPAPAPAPAPAPPSPAGNAPAGGAPSPPPAAAPS 463
                          90       100       110       120       130       140
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....
gi 19112374  1082 PSSSAVSRPPIATSAGRSSTAASTSAPLTYPAGDRSHIPGNLRPIYEMLNAELQRVSQSLpPQMSRVVH 1150
Cdd:PRK07764  464 AQPAPAPAAAPEPTAAPAPAPPAAPAPAAAPAAPAAPAAPAGADDAATLRERWPEILAAV-PKRSRKTW 531
SP5_N cd22541
N-terminal domain of transcription factor Specificity Protein (SP) 5; Specificity Proteins ...
952-1068 8.96e-04

N-terminal domain of transcription factor Specificity Protein (SP) 5; Specificity Proteins (SPs) are transcription factors that are involved in many cellular processes, including cell differentiation, cell growth, apoptosis, immune responses, response to DNA damage, and chromatin remodeling. There are many SPs in vertebrates (9 SPs in humans and mice, 7 SPs in the chicken, and 11 SPs in teleost fish), but arthropods only have 3 SPs. All of them contain clade SP5, which plays a potential role in human cancers and was found in several human tumors including hepatocellular carcinoma, gastric cancer, and colon cancer. Leukemia inhibitor factor/Stat3 and Wnt/beta-catenin signaling pathways converge on SP5 to promote mouse embryonic stem cell self-renewal. SP5 belongs to a family of proteins, called the SP/Kruppel or Krueppel-like Factor (KLF) family, characterized by a C-terminal DNA-binding domain of 81 amino acids consisting of three Kruppel-like C2H2 zinc fingers. These factors bind to a loose consensus motif, namely NNRCRCCYY (where N is any nucleotide, R is A/G, and Y is C/T), such as the recurring motifs in GC and GT boxes (5'-GGGGCGGGG-3' and 5'-GGTGTGGGG-3') that are present in promoters and more distal regulatory elements of mammalian genes. SP factors preferentially bind GC boxes, while KLFs bind CACCC boxes. Another characteristic hallmark of SP factors is the presence of the Buttonhead (BTD) box CXCPXC, just N-terminal to the zinc fingers. The function of the BTD box is unknown, but it is thought to play an important physiological role. Another feature of most SP factors is the presence of a conserved amino acid stretch, the so-called SP box, located close to the N-terminus. SP factors may be separated into three groups based on their domain architecture and the similarity of their N-terminal transactivation domains: SP1-4, SP5, and SP6-9. The transactivation domains between the three groups are not homologous to one another. This model represents the N-terminal domain of SP5.


Pssm-ID: 412096 [Multi-domain]  Cd Length: 143  Bit Score: 41.01  E-value: 8.96e-04
                         10        20        30        40        50        60        70        80
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 19112374  952 SHPNPYTPVA--PQSPVAAASRISSSPNMPPSNPYTPiavASSTVNPAHTYKPHGGSQIVPPPkqpaNRVVPLPPTASQR 1029
Cdd:cd22541    3 PHELPLTPPAepSFHQSLAYSFELSPVKMLPTPAPAP---AASAPPHPSPVSSPTQQPQQLPP----NPADDIPWWSIQQ 75
                         90       100       110       120
                 ....*....|....*....|....*....|....*....|..
gi 19112374 1030 ASAYEPPTVSVP---SPSALSPSVTPQLPPVSSRLPPVSATR 1068
Cdd:cd22541   76 SNPAHPPSTSTPlghPTFAGYQPQIAALLQTKSPAASLSTTR 117
PRK07994 PRK07994
DNA polymerase III subunits gamma and tau; Validated
986-1096 9.77e-04

DNA polymerase III subunits gamma and tau; Validated


Pssm-ID: 236138 [Multi-domain]  Cd Length: 647  Bit Score: 43.32  E-value: 9.77e-04
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 19112374   986 PIAVASSTVNPAHTYKPHGGSQIVPPPKQPANRVVPLPPTASQRASAYEPPTVSVPSPSALSPSVTPQLPPVSSRLPPVS 1065
Cdd:PRK07994  361 PAAPLPEPEVPPQSAAPAASAQATAAPTAAVAPPQAPAVPPPPASAPQQAPAVPLPETTSQLLAARQQLQRAQGATKAKK 440
                          90       100       110
                  ....*....|....*....|....*....|.
gi 19112374  1066 ATRPQIPQPPPVSTALPSSSAVSRPPIATSA 1096
Cdd:PRK07994  441 SEPAAASRARPVNSALERLASVRPAPSALEK 471
PRK07764 PRK07764
DNA polymerase III subunits gamma and tau; Validated
797-986 1.04e-03

DNA polymerase III subunits gamma and tau; Validated


Pssm-ID: 236090 [Multi-domain]  Cd Length: 824  Bit Score: 43.44  E-value: 1.04e-03
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 19112374   797 AVPPIHQIKQTGYAPVQPKTSQAssilpTVPRTTSYTSPYATTSSHITPADVHPLPPPSTSTTAGWNDAPMLGQLPMRRA 876
Cdd:PRK07764  598 EGPPAPASSGPPEEAARPAAPAA-----PAAPAAPAPAGAAAAPAEASAAPAPGVAAPEHHPKHVAVPDASDGGDGWPAK 672
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 19112374   877 APSMAPVRSPFPGASSAQPAAMSRTSSVSTLPPPPPTASMTASAPAIASPPppkVGETYHPPTASGTRVPPVQQPSHPNP 956
Cdd:PRK07764  673 AGGAAPAAPPPAPAPAAPAAPAGAAPAQPAPAPAATPPAGQADDPAAQPPQ---AAQGASAPSPAADDPVPLPPEPDDPP 749
                         170       180       190
                  ....*....|....*....|....*....|
gi 19112374   957 YTPVAPQSPVAAASRISSSPNMPPSNPYTP 986
Cdd:PRK07764  750 DPAGAPAQPPPPPAPAPAAAPAAAPPPSPP 779
Amelogenin smart00818
Amelogenins, cell adhesion proteins, play a role in the biomineralisation of teeth; They seem ...
986-1076 1.32e-03

Amelogenins, cell adhesion proteins, play a role in the biomineralisation of teeth; They seem to regulate formation of crystallites during the secretory stage of tooth enamel development and are thought to play a major role in the structural organisation and mineralisation of developing enamel. The extracellular matrix of the developing enamel comprises two major classes of protein: the hydrophobic amelogenins and the acidic enamelins. Circular dichroism studies of porcine amelogenin have shown that the protein consists of 3 discrete folding units: the N-terminal region appears to contain beta-strand structures, while the C-terminal region displays characteristics of a random coil conformation. Subsequent studies on the bovine protein have indicated that the amelogenin structure contains a repetitive beta-turn segment and a "beta-spiral" between Gln112 and Leu138, which sequester a (Pro, Leu, Gln) rich region. The beta-spiral offers a probable site for interactions with Ca2+ ions. Mutations in the human amelogenin gene (AMGX) cause X-linked hypoplastic amelogenesis imperfecta, a disease characterised by defective enamel. A 9bp deletion in exon 2 of AMGX results in the loss of codons for Ile5, Leu6, Phe7 and Ala8, and replacement by a new threonine codon, disrupting the 16-residue (Met1-Ala16) amelogenin signal peptide.


Pssm-ID: 197891 [Multi-domain]  Cd Length: 165  Bit Score: 40.93  E-value: 1.32e-03
                            10        20        30        40        50        60        70        80
                    ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 19112374     986 PIAVASSTVNPAHTYKPHGGSQIVPP--PKQPANRVVPLPP----TASQRASAYEPPTVSVPS-PSALSPsvtPQLPPVS 1058
Cdd:smart00818   38 QIIPVSQQHPPTHTLQPHHHIPVLPAqqPVVPQQPLMPVPGqhsmTPTQHHQPNLPQPAQQPFqPQPLQP---PQPQQPM 114
                            90
                    ....*....|....*...
gi 19112374    1059 SRLPPVSATRPQIPQPPP 1076
Cdd:smart00818  115 QPQPPVHPIPPLPPQPPL 132
PRK14951 PRK14951
DNA polymerase III subunits gamma and tau; Provisional
1023-1147 1.40e-03

DNA polymerase III subunits gamma and tau; Provisional


Pssm-ID: 237865 [Multi-domain]  Cd Length: 618  Bit Score: 42.78  E-value: 1.40e-03
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 19112374  1023 PPTASQRASAYEPPTVSVPSPSALSPSVTPQLPPVssrlPPVSATRPQiPQPPPVSTALPSSSAVSRPPIATSAGRSSTA 1102
Cdd:PRK14951  371 EAAAPAEKKTPARPEAAAPAAAPVAQAAAAPAPAA----APAAAASAP-AAPPAAAPPAPVAAPAAAAPAAAPAAAPAAV 445
                          90       100       110       120
                  ....*....|....*....|....*....|....*....|....*
gi 19112374  1103 ASTSAPLTYPAGDRSHIPGNLRPIYEMLNAELQRVSQSLPPQMSR 1147
Cdd:PRK14951  446 ALAPAPPAQAAPETVAIPVRVAPEPAVASAAPAPAAAPAAARLTP 490
PRK07994 PRK07994
DNA polymerase III subunits gamma and tau; Validated
876-1070 1.83e-03

DNA polymerase III subunits gamma and tau; Validated


Pssm-ID: 236138 [Multi-domain]  Cd Length: 647  Bit Score: 42.55  E-value: 1.83e-03
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 19112374   876 AAPSMAPVRSPFPGASSAQPAAMSRTSSVSTLPPPPPTASMTASAPAIASPPPpkvgetyhPPTASGTRVPPVQQPSHpn 955
Cdd:PRK07994  366 PEPEVPPQSAAPAASAQATAAPTAAVAPPQAPAVPPPPASAPQQAPAVPLPET--------TSQLLAARQQLQRAQGA-- 435
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 19112374   956 pyTPVAPQSPVAAASRISSSPNMPPSNPYTPIAVASSTvnpahtykphggsqivPPPKQPANRVVPLPPTasqrasayEP 1035
Cdd:PRK07994  436 --TKAKKSEPAAASRARPVNSALERLASVRPAPSALEK----------------APAKKEAYRWKATNPV--------EV 489
                         170       180       190
                  ....*....|....*....|....*....|....*.
gi 19112374  1036 PTVSVPSPSALSPSVTPQ-LPPVSSRLPPVSATRPQ 1070
Cdd:PRK07994  490 KKEPVATPKALKKALEHEkTPELAAKLAAEAIERDP 525
PRK08691 PRK08691
DNA polymerase III subunits gamma and tau; Validated
819-1043 1.87e-03

DNA polymerase III subunits gamma and tau; Validated


Pssm-ID: 236333 [Multi-domain]  Cd Length: 709  Bit Score: 42.39  E-value: 1.87e-03
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 19112374   819 ASSILPTVPRTTSYTSPYATTSSHITPAdVHPLPPPSTSTTAgwndAPMlgQLPMRRAAPSMAPVRSPFPGASSAQPAAM 898
Cdd:PRK08691  364 ASCDANAVIENTELQSPSAQTAEKETAA-KKPQPRPEAETAQ----TPV--QTASAAAMPSEGKTAGPVSNQENNDVPPW 436
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 19112374   899 SRTSSVSTLPPPPPTASMTASAPAIASPPPPKVGETYHPPTASGTRVPPVQQPShPNPyTPVAPQSPVAAASRISSSPNM 978
Cdd:PRK08691  437 EDAPDEAQTAAGTAQTSAKSIQTASEAETPPENQVSKNKAADNETDAPLSEVPS-ENP-IQATPNDEAVETETFAHEAPA 514
                         170       180       190       200       210       220
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*...
gi 19112374   979 PPSNPYTPiavasstvnPAHTYKPHGGSQIVPPP---KQPANRVVPLPPTASQRASAYEPPTVSVPSP 1043
Cdd:PRK08691  515 EPFYGYGF---------PDNDCPPEDGAEIPPPDwehAAPADTAGGGADEEAEAGGIGGNNTPSAPPP 573
PHA03379 PHA03379
EBNA-3A; Provisional
776-1109 2.02e-03

EBNA-3A; Provisional


Pssm-ID: 223066 [Multi-domain]  Cd Length: 935  Bit Score: 42.35  E-value: 2.02e-03
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 19112374   776 PTEFPG-AKEEIQRLTMLLEPHAVPPIHQIKqtgyaPVQPKTSQASSILPtvprttSYTSPYATTSSHITPADVHPLPPP 854
Cdd:PHA03379  513 LSQVPGvAFAPVMPQPMPVEPVPVPTVALER-----PVCPAPPLIAMQGP------GETSGIVRVRERWRPAPWTPNPPR 581
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 19112374   855 STS------------TTAGWNDAPMLGQLPMRRAAPSMAPVRSP-------FPGASSAQPAAMSRTSSVSTLPPPPPTAS 915
Cdd:PHA03379  582 SPSqmsvrdrlarlrAEAQPYQASVEVQPPQLTQVSPQQPMEYPlepeqqmFPGSPFSQVADVMRAGGVPAMQPQYFDLP 661
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 19112374   916 MTASAPaiasppppkVGETYHPPTASGTRVPPVqqPSHPNPYTPVAPQSPVA----AASRISSSPNMPPSNP---YTPIA 988
Cdd:PHA03379  662 LQQPIS---------QGAPLAPLRASMGPVPPV--PATQPQYFDIPLTEPINqgasAAHFLPQQPMEGPLVPerwMFQGA 730
                         250       260       270       280       290       300       310       320
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 19112374   989 VASSTVNPAHTYKPHGGSQIVPP--PKQPANRVVPLPPTASQRASAYEPPTVSVPSPSAlspSVTP-QLPPVssRLPPVS 1065
Cdd:PHA03379  731 TLSQSVRPGVAQSQYFDLPLTQPinHGAPAAHFLHQPPMEGPWVPEQWMFQGAPPSQGT---DVVQhQLDAL--GYVLHV 805
                         330       340       350       360
                  ....*....|....*....|....*....|....*....|....
gi 19112374  1066 ATRPQIPQPPPVSTALPSSSAVSRPPIATSAGRSSTAASTSAPL 1109
Cdd:PHA03379  806 LNHPGVPVSPAVNQYHVSQAAFGLPIDEDESGEGSDTSEPCEAL 849
PTZ00421 PTZ00421
coronin; Provisional
214-293 2.63e-03

coronin; Provisional


Pssm-ID: 173611 [Multi-domain]  Cd Length: 493  Bit Score: 41.80  E-value: 2.63e-03
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 19112374   214 GAVNSIAWHPNNATRLATAIDDNRnpiILTWDLRQPTVPQNI------LTGHQKAALSLSWCPEDPTFLLSSGKDGRAMV 287
Cdd:PTZ00421   76 GPIIDVAFNPFDPQKLFTASEDGT---IMGWGIPEEGLTQNIsdpivhLQGHTKKVGIVSFHPSAMNVLASAGADMVVNV 152

                  ....*.
gi 19112374   288 WNVETG 293
Cdd:PTZ00421  153 WDVERG 158
PRK12727 PRK12727
flagellar biosynthesis protein FlhF;
941-1106 3.62e-03

flagellar biosynthesis protein FlhF;


Pssm-ID: 237182 [Multi-domain]  Cd Length: 559  Bit Score: 41.51  E-value: 3.62e-03
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 19112374   941 SGTRVPPVQQPshPNPYTPVAPQSPVAAASRISSSPNMPPSNPytpIAVASSTVNPAHTYKphggsqivpppkQPANRVV 1020
Cdd:PRK12727   60 SDTPATAAAPA--PAPQAPTKPAAPVHAPLKLSANANMSQRQR---VASAAEDMIAAMALR------------QPVSVPR 122
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 19112374  1021 PLPPTASQRASAYEPPTVSVPSPSA---LSPSVTPQLPPVSSRLPPVSATRPQIPQPppvstALPSSSAVSRPPIATSAG 1097
Cdd:PRK12727  123 QAPAAAPVRAASIPSPAAQALAHAAavrTAPRQEHALSAVPEQLFADFLTTAPVPRA-----PVQAPVVAAPAPVPAIAA 197

                  ....*....
gi 19112374  1098 RSSTAASTS 1106
Cdd:PRK12727  198 ALAAHAAYA 206
PHA03201 PHA03201
uracil DNA glycosylase; Provisional
1013-1144 3.74e-03

uracil DNA glycosylase; Provisional


Pssm-ID: 165468  Cd Length: 318  Bit Score: 41.03  E-value: 3.74e-03
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 19112374  1013 KQPANRVVPLPPTASqrasayePPTVSVPSPSALSPSVTPQLPPVSSRLPPVSATrpqipQPPPVSTALP---------S 1083
Cdd:PHA03201    2 KRARSRSPSPPRRPS-------PPRPTPPRSPDASPEETPPSPPGPGAEPPPGRA-----AGPAAPRRRPrgcpagvtfS 69
                          90       100       110       120       130       140       150
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....
gi 19112374  1084 SSAVSRPPIatsaGRSSTAASTSAPLTYPAGDRSHIPGNL-RPIYE---------MLNAELQRVSQS---LPPQ 1144
Cdd:PHA03201   70 SSAPPRPPL----GLDDAPAATPPPLDWTEFRRRFLVGDAwRPLLEpelanpltaRLMAEYERRCRTeevLPPR 139
SP5_N cd22541
N-terminal domain of transcription factor Specificity Protein (SP) 5; Specificity Proteins ...
977-1108 3.88e-03

N-terminal domain of transcription factor Specificity Protein (SP) 5; Specificity Proteins (SPs) are transcription factors that are involved in many cellular processes, including cell differentiation, cell growth, apoptosis, immune responses, response to DNA damage, and chromatin remodeling. There are many SPs in vertebrates (9 SPs in humans and mice, 7 SPs in the chicken, and 11 SPs in teleost fish), but arthropods only have 3 SPs. All of them contain clade SP5, which plays a potential role in human cancers and was found in several human tumors including hepatocellular carcinoma, gastric cancer, and colon cancer. Leukemia inhibitor factor/Stat3 and Wnt/beta-catenin signaling pathways converge on SP5 to promote mouse embryonic stem cell self-renewal. SP5 belongs to a family of proteins, called the SP/Kruppel or Krueppel-like Factor (KLF) family, characterized by a C-terminal DNA-binding domain of 81 amino acids consisting of three Kruppel-like C2H2 zinc fingers. These factors bind to a loose consensus motif, namely NNRCRCCYY (where N is any nucleotide; R is A/G, and Y is C/T), such as the recurring motifs in GC and GT boxes (5'-GGGGCGGGG-3' and 5-GGTGTGGGG-3') that are present in promoters and more distal regulatory elements of mammalian genes. SP factors preferentially bind GC boxes, while KLFs bind CACCC boxes. Another characteristic hallmark of SP factors is the presence of the Buttonhead (BTD) box CXCPXC, just N-terminal to the zinc fingers. The function of the BTD box is unknown, but it is thought to play an important physiological role. Another feature of most SP factors is the presence of a conserved amino acid stretch, the so-called SP box, located close to the N-terminus. SP factors may be separated into three groups based on their domain architecture and the similarity of their N-terminal transactivation domains: SP1-4, SP5, and SP6-9. The transactivation domains between the three groups are not homologous to one another. 
This model represents the N-terminal domain of SP5.


Pssm-ID: 412096 [Multi-domain]  Cd Length: 143  Bit Score: 39.08  E-value: 3.88e-03
                         10        20        30        40        50        60        70        80
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 19112374  977 NMPPSNPYTPIAVASSTVNPAHTYKphggsqiVPPPKQPANRVVPLPPTASQRASAYEPPTVSVPSPSalspsvtpqlPP 1056
Cdd:cd22541    1 MAPHELPLTPPAEPSFHQSLAYSFE-------LSPVKMLPTPAPAPAASAPPHPSPVSSPTQQPQQLP----------PN 63
                         90       100       110       120       130
                 ....*....|....*....|....*....|....*....|....*....|...
gi 19112374 1057 VSSRLPPVSATRPQIPQPPPVSTALPSSSAVSRPP-IATSAGRSSTAASTSAP 1108
Cdd:cd22541   64 PADDIPWWSIQQSNPAHPPSTSTPLGHPTFAGYQPqIAALLQTKSPAASLSTT 116
PHA03291 PHA03291
envelope glycoprotein I; Provisional
845-1088 4.37e-03

envelope glycoprotein I; Provisional


Pssm-ID: 223033 [Multi-domain]  Cd Length: 401  Bit Score: 41.09  E-value: 4.37e-03
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 19112374   845 PADVHPLPPPSTSTTAgwnDAPMLGQLPMRraAPSMAPVRSPFPGASSAQPAAmsrTSSVSTLPPPPPTASMTASApaia 924
Cdd:PHA03291  167 PAEGTLAAPPLGEGSA---DGSCDPALPLS--APRLGPADVFVPATPRPTPRT---TASPETTPTPSTTTSPPSTT---- 234
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 19112374   925 sppppkvgetyhPPTASGTRVPPVQQPSHPNPYTPvAPQSPVAAASRISSSPNMPPSNPY--TPIAVASSTVNPAHTYKP 1002
Cdd:PHA03291  235 ------------IPAPSTTIAAPQAGTTPEAEGTP-APPTPGGGEAPPANATPAPEASRYelTVTQIIQIAIPASIIACV 301
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 19112374  1003 HGGSQIVPPPKQPANRVVPlpptasqRASAYEPPTVSVPSPSALSPSVTPQLPPVSSRLPPVSatrPQIPQPPPVSTALP 1082
Cdd:PHA03291  302 FLGSCACCLHRRCRRRRRR-------PARIYRPPSPVAPSISAVNEAALARLGDELKRHPPES---PRRSKRRSSQTMVP 371

                  ....*.
gi 19112374  1083 SSSAVS 1088
Cdd:PHA03291  372 SLTAIS 377
Retinal pfam15449
Retinal protein; This family of proteins is found in the photoreceptor cells of the retina. ...
881-1113 5.11e-03

Retinal protein; This family of proteins is found in the photoreceptor cells of the retina. Mutations of the gene encoding this protein have been associated with retinal disorders such as retinitis pigmentosa and late-onset progressive retinal atrophy. The function of this family of proteins is unknown, but it is likely to be important in the development and function of the retina.


Pssm-ID: 464722 [Multi-domain]  Cd Length: 1293  Bit Score: 41.30  E-value: 5.11e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 19112374    881 APVRSPFPGASSAQPAAMSRTSSVST--LPPPPPTASMTASAPAIASPPPPKVGETyhppTASGTRVPPVQQ-------- 950
Cdd:pfam15449  814 APIFPPLPTAEASKSEDTNCETEEDLehLPPPPLEILMDKSFTSLEPPESSKPAGS----SPEGTPVPGLGEagptrrtw 889
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 19112374    951 ---------------PSHPNPyTPVAPQSPVAAASRISSSP-------NMPPSNPYTPIA--VASSTV------------ 994
Cdd:pfam15449  890 aspklrasmspidllPSKSTA-SPTRPRSTGPGSSKSGCNPrklaldlNHPPAASHNPEAegGAQSQAqaeeaaslskqp 968
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 19112374    995 -------NPAHT---------------YKPHG-------------GSQIVPPPK----QPANRVVPLPPTASQRASAYEP 1035
Cdd:pfam15449  969 rkaipwhHSSHTsgqsrtsepslarptRGPHSpeaprqsqersppLVRKASPTRahwaPRADKRHPSLPSSHRPAQPSLP 1048
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 19112374   1036 PTVSVPSPSaLSPSVTPqlPPVSSRL--PPVSATR----PQIPQPPPVSTALPSSSAVSRPPIATSAGRSSTAASTSAPL 1109
Cdd:pfam15449 1049 TVQRSPSPP-LSPRAPS--PPRSPRVlsPPTSKKRtsppPQHKLPSPPPESPPAQHKLSSPPTQRTEASSPSSGPSPSPP 1125

                   ....
gi 19112374   1110 TYPA 1113
Cdd:pfam15449 1126 TSPS 1129
SOBP pfam15279
Sine oculis-binding protein; SOBP is associated with syndromic and nonsyndromic intellectual ...
811-1043 5.86e-03

Sine oculis-binding protein; SOBP is associated with syndromic and nonsyndromic intellectual disability. It carries a zinc-finger of the zf-C2H2 type at the N-terminus, and a highly characteristic C-terminal PhPhPhPhPhPh motif. The deduced 873-amino acid protein contains an N-terminal nuclear localization signal (NLS), followed by 2 FCS-type zinc finger motifs, a proline-rich region (PR1), a putative RNA-binding motif region, and a C-terminal NLS embedded in a second proline-rich motif. SOBP is expressed in various human tissues, including developing mouse brain at embryonic day 14. In postnatal and adult mouse brain SOBP is expressed in all neurons, with intense staining in the limbic system. Highest expression is in layer V cortical neurons, hippocampus, pyriform cortex, dorsomedial nucleus of thalamus, amygdala, and hypothalamus. Postnatal expression of SOBP in the limbic system corresponds to a time of active synaptogenesis. The family is also referred to as Jackson circler, JXC1. In seven affected siblings from a consanguineous Israeli Arab family with mental retardation, anterior maxillary protrusion, and strabismus, mutations were found in this protein.


Pssm-ID: 464609 [Multi-domain]  Cd Length: 325  Bit Score: 40.18  E-value: 5.86e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 19112374    811 PVQPKTSQASSILPTVPRTTSYTSPYATTSShitPADVHPLPPPSTSTTAGwndapmlgQLPMRRAAPSMAPVrSPFPGA 890
Cdd:pfam15279  125 LLAPKPHEPPSLPPPPLPPKKGRRHRPGLHP---PLGRPPGSPPMSMTPRG--------LLGKPQQHPPPSPL-PAFMEP 192
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 19112374    891 SSAQPAAMSRTSSV----STLPPPPPTASMTasapaiasppppkvgetyhPPTASGTRVPPvQQPSHPNPYTPVAPQSPv 966
Cdd:pfam15279  193 SSMPPPFLRPPPSIpqpnSPLSNPMLPGIGP-------------------PPKPPRNLGPP-SNPMHRPPFSPHHPPPP- 251
                          170       180       190       200       210       220       230
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*..
gi 19112374    967 aaASRISSSPNMPPSNPYTPIAVASSTVNPahtykphggsQIVPPPKQPANRVVPLPptasqrASAYEPPTVSVPSP 1043
Cdd:pfam15279  252 --PTPPGPPPGLPPPPPRGFTPPFGPPFPP----------VNMMPNPPEMNFGLPSL------APLVPPVTVLVPYP 310
rad23 TIGR00601
UV excision repair protein Rad23; All proteins in this family for which functions are known ...
1042-1108 6.15e-03

UV excision repair protein Rad23; All proteins in this family for which functions are known are components of a multiprotein complex used for targeting nucleotide excision repair to specific parts of the genome. In humans, Rad23 complexes with the XPC protein. This family is based on the phylogenomic analysis of JA Eisen (1999, Ph.D. Thesis, Stanford University). [DNA metabolism, DNA replication, recombination, and repair]


Pssm-ID: 273167 [Multi-domain]  Cd Length: 378  Bit Score: 40.26  E-value: 6.15e-03
                           10        20        30        40        50        60
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*..
gi 19112374   1042 SPSALSPSVTPQLPPVSSRLPPVSATRPqiPQPPPVSTALPSSSaVSRPPIATSAGRSSTAASTSAP 1108
Cdd:TIGR00601   81 TGKVAPPAATPTSAPTPTPSPPASPASG--MSAAPASAVEEKSP-SEESATATAPESPSTSVPSSGS 144
PHA02682 PHA02682
ORF080 virion core protein; Provisional
995-1112 6.20e-03

ORF080 virion core protein; Provisional


Pssm-ID: 177464 [Multi-domain]  Cd Length: 280  Bit Score: 40.23  E-value: 6.20e-03
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 19112374   995 NPAHTYKPHGGSQIVPPPKQPANRV---VPLPP------TASQRASAYEPPTVSVPSPSALSPS---VTPQLPPVSSRLP 1062
Cdd:PHA02682   69 NSACMQRPSGQSPLAPSPACAAPAPacpACAPAapapavTCPAPAPACPPATAPTCPPPAVCPAparPAPACPPSTRQCP 148
                          90       100       110       120       130
                  ....*....|....*....|....*....|....*....|....*....|
gi 19112374  1063 PVsatrPQIPQPPPVSTALPSSSAVSRPPIATSAgrSSTAASTSAPLTYP 1112
Cdd:PHA02682  149 PA----PPLPTPKPAPAAKPIFLHNQLPPPDYPA--ASCPTIETAPAASP 192
DamX COG3266
Cell division protein DamX, binds to the septal ring, contains C-terminal SPOR domain [Cell ...
936-1104 6.92e-03

Cell division protein DamX, binds to the septal ring, contains C-terminal SPOR domain [Cell cycle control, cell division, chromosome partitioning];


Pssm-ID: 442497 [Multi-domain]  Cd Length: 455  Bit Score: 40.60  E-value: 6.92e-03
                         10        20        30        40        50        60        70        80
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 19112374  936 HPPTASGTRVPPVQQPSHPNPYTPVAPQSPVAAASRISSSPNMPPSNPYTPiAVASSTVNPAHTYKPHGGSQIVPPPKQP 1015
Cdd:COG3266  208 LLLLLASALGEAVAAAAELAALALLAAGAAEVLTARLVLLLLIIGSALKAP-SQASSASAPATTSLGEQQEVSLPPAVAA 286
                         90       100       110       120       130       140       150       160
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 19112374 1016 AnrvvPLPPTASQRASAyEPPTVSVPSPSALSPsvTPQLPPVSSRLPPVSATRPQIPQPPPV---STALPSSSAVSRPPI 1092
Cdd:COG3266  287 Q----PAAAAAAQPSAV-ALPAAPAAAAAAAAP--AEAAAPQPTAAKPVVTETAAPAAPAPEaaaAAAAPAAPAVAKKLA 359
                        170
                 ....*....|..
gi 19112374 1093 ATSAGRSSTAAS 1104
Cdd:COG3266  360 ADEQWLASQPAS 371
PHA03132 PHA03132
thymidine kinase; Provisional
1022-1165 7.93e-03

thymidine kinase; Provisional


Pssm-ID: 222997 [Multi-domain]  Cd Length: 580  Bit Score: 40.52  E-value: 7.93e-03
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 19112374  1022 LPPTASQRASAYE----PPTVSVPSPSALSPSVTPQLPpvssRLPPVSATRPQIPQPPPVSTALPSSSAV-----SRPPI 1092
Cdd:PHA03132   38 TPLGSTSEATSEDdddlYPPRETGSGGGVATSTIYTVP----RPPRGPEQTLDKPDSLPASRELPPGPTPvppggFRGAS 113
                          90       100       110       120       130       140       150
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|...
gi 19112374  1093 ATSAGRSSTAASTSAPLTYPAgdrSHIPGNLRPIYEMLNAELQRVSQSLPPQMSRVVHDTEKRLNMLFDRLNS 1165
Cdd:PHA03132  114 SPRLGADSTSPRFLYQVNFPV---ILAPIGESNSSSEELSEEEEHSRPPPSESLKVKNGGKVYPKGFSKHKTH 183
PHA03269 PHA03269
envelope glycoprotein C; Provisional
966-1108 8.60e-03

envelope glycoprotein C; Provisional


Pssm-ID: 165527 [Multi-domain]  Cd Length: 566  Bit Score: 40.10  E-value: 8.60e-03
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 19112374   966 VAAASRISSSPNMPPSNPYTPIAVASSTVNPAhtykphggsqivPPPKQPANRVvPLPPTASQRASAYEPPTVSVPSPSA 1045
Cdd:PHA03269   12 IACINLIIANLNTNIPIPELHTSAATQKPDPA------------PAPHQAASRA-PDPAVAPTSAASRKPDLAQAPTPAA 78
                          90       100       110       120       130       140
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*
gi 19112374  1046 L-SPSVTPQLPPVSSRLP-PVSAtrPQIPQPPPVSTALPSSSAVSRPPIATSAGrSSTAASTSAP 1108
Cdd:PHA03269   79 SeKFDPAPAPHQAASRAPdPAVA--PQLAAAPKPDAAEAFTSAAQAHEAPADAG-TSAASKKPDP 140
PRK14950 PRK14950
DNA polymerase III subunits gamma and tau; Provisional
1023-1116 8.60e-03

DNA polymerase III subunits gamma and tau; Provisional


Pssm-ID: 237864 [Multi-domain]  Cd Length: 585  Bit Score: 40.18  E-value: 8.60e-03
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 19112374  1023 PPTASQRAsayePPTVSVPSPSALSPSV-TPQLPPVSSRLPPVSATR-PQIPQPPPVSTALPSSSAVSR--PPIATSAGR 1098
Cdd:PRK14950  362 PVPAPQPA----KPTAAAPSPVRPTPAPsTRPKAAAAANIPPKEPVReTATPPPVPPRPVAPPVPHTPEsaPKLTRAAIP 437
                          90
                  ....*....|....*...
gi 19112374  1099 SSTAASTSAPLTYPAGDR 1116
Cdd:PRK14950  438 VDEKPKYTPPAPPKEEEK 455
MISS pfam15822
MAPK-interacting and spindle-stabilising protein-like; MISS is a family of eukaryotic ...
877-1108 8.75e-03

MAPK-interacting and spindle-stabilising protein-like; MISS is a family of eukaryotic MAPK-interacting and spindle-stabilising protein-like proteins. MISS is rich in prolines and has four potential MAPK-phosphorylation sites, a MAPK-docking site, a PEST sequence (PEST motif) and a bipartite nuclear localization signal. The endogenous protein accumulates during mouse meiotic maturation and is found as discrete dots on the MII spindle. MISS is the first example of a physiological MAPK-substrate that is stabilized in MII that specifically regulates MII spindle integrity during the CSF arrest.


Pssm-ID: 318115 [Multi-domain]  Cd Length: 238  Bit Score: 39.20  E-value: 8.75e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 19112374    877 APSMAPVRSPF--PGASSAQPAAMSRTSSVSTLPPPPPTASMTASApaiasppppkvgetyhPPTASGTRVPPVQQPSHP 954
Cdd:pfam15822   26 PPQGWPGSNPWnnPSAPPAVPSGLPPSTAPSTVPFGPAPTGMYPSI----------------PLTGPSPGPPAPFPPSGP 89
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 19112374    955 NPYTPVAPQSPVAAASRISSSPNMPPSNPYTPIAvasstvnpahtyKPHGGsqivppPKQPAnRVVPLPPTASQRASAYE 1034
Cdd:pfam15822   90 SCPPPGGPYPAPTVPGPGPIGPYPTPNMPFPELP------------RPYGA------PTDPA-AAAPSGPWGSMSSGPWA 150
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 19112374   1035 P--------PTVSVPSP---SALSPSVTPQ-LPPVSSRLPPVSATRPQIPQPPPV-STALPSSSAVSRPPIATSAGRSST 1101
Cdd:pfam15822  151 PgmggqypaPNMPYPSPgpyPAVPPPQSPGaAPPVPWGTVPPGPWGPPAPYPDPTgSYPMPGLYPTPNNPFQVPSGPSGA 230

                   ....*..
gi 19112374   1102 AASTSAP 1108
Cdd:pfam15822  231 PPMPGGP 237
PRK07994 PRK07994
DNA polymerase III subunits gamma and tau; Validated
934-1117 9.74e-03

DNA polymerase III subunits gamma and tau; Validated


Pssm-ID: 236138 [Multi-domain]  Cd Length: 647  Bit Score: 40.23  E-value: 9.74e-03
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 19112374   934 TYHPPTASGT-RVPPVQQPSHPNPYTPVAPQSPVAAASRISSSPNMPPSNPYTPIAVASSTVNPAhtykPHGGSQIvpPP 1012
Cdd:PRK07994  358 AFHPAAPLPEpEVPPQSAAPAASAQATAAPTAAVAPPQAPAVPPPPASAPQQAPAVPLPETTSQL----LAARQQL--QR 431
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 19112374  1013 KQPANRVVPLPPTASQRASayepptvsvPSPSALSPSVTPQLPPVSSRLPPVSATRPQIPQPPPVSTALPSssaVSRPPI 1092
Cdd:PRK07994  432 AQGATKAKKSEPAAASRAR---------PVNSALERLASVRPAPSALEKAPAKKEAYRWKATNPVEVKKEP---VATPKA 499
                         170       180
                  ....*....|....*....|....*
gi 19112374  1093 ATSAGRSSTAASTSAPLTYPAGDRS 1117
Cdd:PRK07994  500 LKKALEHEKTPELAAKLAAEAIERD 524
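
The alignment rows above pair a label, a start coordinate, the aligned residues, and an end coordinate. As a rough sanity check when copying these rows out of the page, the sketch below verifies that the end coordinate equals the start plus the number of non-gap residues minus one. The regular expression and the name `row_is_consistent` are illustrative assumptions about this text layout, not part of any NCBI tool.

```python
import re

# Assumed row layout: "<label> <start> <aligned residues> <end>",
# where the label is either "gi <number>" or "Cdd:<accession>".
ROW = re.compile(
    r"^(?P<label>gi \d+|Cdd:\S+)\s+(?P<start>\d+)\s+"
    r"(?P<seq>[A-Za-z-]+)\s+(?P<end>\d+)\s*$"
)

def row_is_consistent(line):
    """Return True/False for an alignment row, or None for any other line."""
    m = ROW.match(line.strip())
    if m is None:
        return None
    # Gap characters ('-') do not consume a sequence position.
    residues = sum(1 for c in m["seq"] if c != "-")
    return int(m["start"]) + residues - 1 == int(m["end"])
```

Lowercase residues (unaligned positions) still count toward the coordinates; only `-` is skipped. Ruler and numbering lines do not match the pattern and yield `None`.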
 
Blast search parameters
Data Source: Precalculated data, version = cdd.v.3.21
Preset Options: Database: CDSEARCH/cdd   Low complexity filter: no   Composition Based Adjustment: yes   E-value threshold: 0.01
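
Each hit in the list above carries a one-line summary of the form "Pssm-ID: ... Cd Length: ... Bit Score: ... E-value: ...". The following is a minimal sketch, assuming that exact text layout, of how such a line could be parsed and compared against the 0.01 E-value threshold listed in the preset options. The names `parse_hit_summary` and `passes_threshold` are hypothetical, and whether the server treats the threshold as inclusive is an assumption.

```python
import re

# Assumed layout of the per-hit summary line; the "[Multi-domain]" flag is optional.
HIT_SUMMARY = re.compile(
    r"Pssm-ID:\s*(?P<pssm_id>\d+)"
    r"(?:\s*\[(?P<flag>[^\]]+)\])?"
    r"\s+Cd Length:\s*(?P<cd_length>\d+)"
    r"\s+Bit Score:\s*(?P<bit_score>[\d.]+)"
    r"\s+E-value:\s*(?P<e_value>[\d.eE+-]+)"
)

def parse_hit_summary(line):
    """Parse one summary line into a dict, or return None if it does not match."""
    m = HIT_SUMMARY.search(line)
    if m is None:
        return None
    return {
        "pssm_id": int(m["pssm_id"]),
        "multi_domain": m["flag"] == "Multi-domain",
        "cd_length": int(m["cd_length"]),
        "bit_score": float(m["bit_score"]),
        "e_value": float(m["e_value"]),
    }

def passes_threshold(hit, threshold=0.01):
    """Assumption: a hit is kept when its E-value is at or below the threshold."""
    return hit["e_value"] <= threshold
```

For example, `parse_hit_summary("Pssm-ID: 165468  Cd Length: 318  Bit Score: 41.03  E-value: 3.74e-03")` yields a record with `multi_domain` set to `False`, since that hit has no bracketed flag.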

References:

  • Wang J et al. (2023), "The conserved domain database in 2023", Nucleic Acids Res. 51(D1):D384-8.
  • Lu S et al. (2020), "The conserved domain database in 2020", Nucleic Acids Res. 48(D1):D265-8.
  • Marchler-Bauer A et al. (2017), "CDD/SPARCLE: functional classification of proteins via subfamily domain architectures", Nucleic Acids Res. 45(D1):D200-3.