Conserved domains on [gi|190343021|ref|NP_055637|]

protein transport protein Sec24D isoform 1 [Homo sapiens]

Protein Classification

Sec24-like domain-containing protein (domain architecture ID 1019850)

Sec24-like domain-containing protein similar to yeast Sec24p, a component of yeast coat protein II (COPII), the coat protein complex responsible for vesicle budding from the endoplasmic reticulum.

List of domain hits

Name Accession Description Interval E-value
COG5028 super family cl34873
Vesicle coat complex COPII, subunit SEC24/subunit SFB2/subunit SFB3 [Intracellular trafficking ...
132-1027 1.39e-166

Vesicle coat complex COPII, subunit SEC24/subunit SFB2/subunit SFB3 [Intracellular trafficking and secretion];


The actual alignment was detected with superfamily member COG5028:

Pssm-ID: 227361 [Multi-domain]  Cd Length: 861  Bit Score: 511.26  E-value: 1.39e-166
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 190343021  132 YGSGMAPPSQGPPGPlSATSLQTPPRPPQPSILQPGSQVLPPPPTTlngpgaSPLPLPMYRPDGLSGPPPPNAQYQPPPl 211
Cdd:COG5028     4 HKKGVYPQAQSQVHT-GAASSKKSARPHRAYANFSAGQMGMPPYTT------PPLQQQSRRQIDQAATAMHNTGANNPA- 75
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 190343021  212 pgQTLGAGYPPQQANsgpqmagaqlsYPGGFPGGPAQMAGPPQpqkkldpdSIPSPIQVIENDRASRGGQVYATnTRGQI 291
Cdd:COG5028    76 --PSVMSPAFQSQQK-----------FSSPYGGSMADGTAPKP--------TNPLVPVDLFEDQPPPISDLFLP-PPPIV 133
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 190343021  292 PPLvTTDCMIQDQGNASPRFIRCTTYCFPCTSDMAKQAQIPLAAVIKPFATIPSNESPLYLVNHGEsgPVRCNRCKAYMC 371
Cdd:COG5028   134 PPL-TTNFVGSEQSNCSPKYVRSTMYAIPETNDLLKKSKIPFGLVIRPFLELYPEEDPVPLVEDGS--IVRCRRCRSYIN 210
                         250       260       270       280       290       300       310       320
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 190343021  372 PFMQFIEGGRRYQCGFCNCVNDVPPFYFQHLDHIGRRLDHYEKPELSLGSYEYVATLDYcrKSKPPNPPAFIFMIDVSYS 451
Cdd:COG5028   211 PFVQFIEQGRKWRCNICRSKNDVPEGFDNPSGPNDPRSDRYSRPELKSGVVDFLAPKEY--SLRQPPPPVYVFLIDVSFE 288
                         330       340       350       360       370       380       390       400
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 190343021  452 NIKNGLVKLICEELKTMLEKIPKEEQEetsaIRVGFITYNKVLHFFNVKSNLaQPQMMVVTDVGEVFVPLLDG-FLVNYQ 530
Cdd:COG5028   289 AIKNGLVKAAIRAILENLDQIPNFDPR----TKIAIICFDSSLHFFKLSPDL-DEQMLIVSDLDEPFLPFPSGlFVLPLK 363
                         410       420       430       440       450       460       470       480
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 190343021  531 ESQSVIHNLLDQIPDMFADSNENETVFAPVIQAGMEALKAadCPGKLFIFHSSLPTAeAPGKLKNRDDKklvntdkEKIL 610
Cdd:COG5028   364 SCKQIIETLLDRVPRIFQDNKSPKNALGPALKAAKSLIGG--TGGKIIVFLSTLPNM-GIGKLQLREDK-------ESSL 433
                         490       500       510       520       530       540       550       560
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 190343021  611 FQPQTNVYDSLAKDCVAHGCSVTLFLFPSQYVDVASLGLVPQLTGGTLYKYNNFQM--HLDRQQFLNDLRNDIEKKIGFD 688
Cdd:COG5028   434 LSCKDSFYKEFAIECSKVGISVDLFLTSEDYIDVATLSHLCRYTGGQTYFYPNFSAtrPNDATKLANDLVSHLSMEIGYE 513
                         570       580       590       600       610       620       630       640
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 190343021  689 AIMRVRTSTGFRATDFFGGILMNNTTDVEMAAIDCDKAVTVEFKHDDKLSeDSGALIQCAVLYTTISGQRRLRIHNLGLN 768
Cdd:COG5028   514 AVMRVRCSTGLRVSSFYGNFFNRSSDLCAFSTMPRDTSLLVEFSIDEKLM-TSDVYFQVALLYTLNDGERRIRVVNLSLP 592
                         650       660       670       680       690       700       710       720
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 190343021  769 CSSQLADLYKSCETDALINFFAKSAFKAVLHQPLKVIREILVNQTAHMLACYRKNCASPSAASQLILPDSMKVLPVYMNC 848
Cdd:COG5028   593 TSSSIREVYASADQLAIACILAKKASTKALNSSLKEARVLINKSMVDILKAYKKELVKSNTSTQLPLPANLKLLPLLMLA 672
                         730       740       750       760       770       780       790       800
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 190343021  849 LLKNcVLLSRPEISTDERAYQRQLVMTMGVADSQLFFYPQLLPIHTL-------DVKSTMLPAAVRCSESRLSEEGIFLL 921
Cdd:COG5028   673 LLKS-SAFRSGSTPSDIRISALNRLTSLPLKQLMRNIYPTLYALHDMpieaglpDEGLLVLPSPINATSSLLESGGLYLI 751
                         810       820       830       840       850       860       870       880
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 190343021  922 ANGLHMFLWLGVSSPPELIQGIFNVPSFAHINTDMTLLPEVGNPYSQQLRMIMGIIQQKRPYS-MKLTIVKQREQP--EM 998
Cdd:COG5028   752 DTGQKIFLWFGKDAVPSLLQDLFGVDSLSDIPSGKFTLPPTGNEFNERVRNIIGELRSVNDDStLPLVLVRGGGDPslRL 831
                         890       900
                  ....*....|....*....|....*....
gi 190343021  999 VFRQFLVEDKgLYGGSSYVDFLCCVHKEI 1027
Cdd:COG5028   832 WFFSTLVEDK-TLNIPSYLDYLQILHEKI 859
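The pairwise blocks above can be summarized numerically. The following is a minimal sketch and is not part of the CD-Search output: the helper names `parse_row` and `alignment_stats` are hypothetical, and the code assumes that lowercase letters and `-` mark positions that fall outside the aligned columns, which is how this display appears to render them.

```python
def parse_row(line):
    """Split one alignment row into (start, sequence, end).

    Handles both 'gi 190343021  999 SEQ 1027' and 'Cdd:COG5028   832 SEQ 859',
    because the last three whitespace-separated fields are always the
    start coordinate, the sequence, and the end coordinate.
    """
    fields = line.split()
    return int(fields[-3]), fields[-2], int(fields[-1])


def alignment_stats(query, subject):
    """Return (aligned_columns, identical_columns) for two aligned rows.

    Assumption: a column counts as aligned only when both rows carry an
    uppercase residue; gaps ('-') and lowercase residues are skipped.
    """
    if len(query) != len(subject):
        raise ValueError("aligned rows must have the same length")
    aligned = 0
    identical = 0
    for q, s in zip(query, subject):
        if q == "-" or s == "-" or q.islower() or s.islower():
            continue
        aligned += 1
        if q == s:
            identical += 1
    return aligned, identical


# Last block of the COG5028 alignment, copied from the report.
q_start, q_seq, q_end = parse_row("gi 190343021  999 VFRQFLVEDKgLYGGSSYVDFLCCVHKEI 1027")
s_start, s_seq, s_end = parse_row("Cdd:COG5028   832 WFFSTLVEDK-TLNIPSYLDYLQILHEKI 859")
aligned, identical = alignment_stats(q_seq, s_seq)
print(aligned, identical)  # 28 aligned columns, 12 of them identical
```

Applying the same two helpers to every block of a hit and summing the counts would give an overall identity figure for that hit, under the same assumption about lowercase residues.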
Med15 super family cl26621
ARC105 or Med15 subunit of Mediator complex non-fungal; The approx. 70 residue Med15 domain of ...
3-240 4.19e-03

ARC105 or Med15 subunit of Mediator complex non-fungal; The approx. 70 residue Med15 domain of the ARC-Mediator co-activator is a three-helix bundle with marked similarity to the KIX domain. The sterol regulatory element binding protein (SREBP) family of transcription activators uses the ARC105 subunit to activate target genes in the regulation of cholesterol and fatty acid homeostasis. In addition, Med15 is a critical transducer of gene activation signals that control early metazoan development.


The actual alignment was detected with superfamily member pfam09606:

Pssm-ID: 312941 [Multi-domain]  Cd Length: 732  Bit Score: 41.15  E-value: 4.19e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 190343021     3 QQGYVATPPYSQPQPGIGLSPPHYGHYGDPShtASPTGMmkPAGPLGATATR-----------GMLPPGPPPPGPHQFGQ 71
Cdd:pfam09606  243 MQQQQPQQQGQQSQLGMGINQMQQMPQGVGG--GAGQGG--PGQPMGPPGQQpgampnvmsigDQNNYQQQQTRQQQQQQ 318
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 190343021    72 NGAHATGHPPQRFPGPPPVNNVASshAPYQPSAQSSYPGPISTSSVTQLGSQLSAMqinsygsgMAPPSQGPPGPL-SAT 150
Cdd:pfam09606  319 GGNHPAAHQQQMNQSVGQGGQVVA--LGGLNHLETWNPGNFGGLGANPMQRGQPGM--------MSSPSPVPGQQVrQVT 388
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 190343021   151 SLQTPPRPPQPSILQPGsqvlppppttlnGPGASPLPLPMYRPdglsgPPPPNAQYQPPPLPGQTlgagyPPQQANSGPQ 230
Cdd:pfam09606  389 PNQFMRQSPQPSVPSPQ------------GPGSQPPQSHPGGM-----IPSPALIPSPSPQMSQQ-----PAQQRTIGQD 446
                          250
                   ....*....|
gi 190343021   231 MAGAQLSYPG 240
Cdd:pfam09606  447 SPGGSLNTPG 456
 
Name Accession Description Interval E-value
COG5028 COG5028
Vesicle coat complex COPII, subunit SEC24/subunit SFB2/subunit SFB3 [Intracellular trafficking ...
132-1027 1.39e-166

Vesicle coat complex COPII, subunit SEC24/subunit SFB2/subunit SFB3 [Intracellular trafficking and secretion];


Pssm-ID: 227361 [Multi-domain]  Cd Length: 861  Bit Score: 511.26  E-value: 1.39e-166
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 190343021  132 YGSGMAPPSQGPPGPlSATSLQTPPRPPQPSILQPGSQVLPPPPTTlngpgaSPLPLPMYRPDGLSGPPPPNAQYQPPPl 211
Cdd:COG5028     4 HKKGVYPQAQSQVHT-GAASSKKSARPHRAYANFSAGQMGMPPYTT------PPLQQQSRRQIDQAATAMHNTGANNPA- 75
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 190343021  212 pgQTLGAGYPPQQANsgpqmagaqlsYPGGFPGGPAQMAGPPQpqkkldpdSIPSPIQVIENDRASRGGQVYATnTRGQI 291
Cdd:COG5028    76 --PSVMSPAFQSQQK-----------FSSPYGGSMADGTAPKP--------TNPLVPVDLFEDQPPPISDLFLP-PPPIV 133
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 190343021  292 PPLvTTDCMIQDQGNASPRFIRCTTYCFPCTSDMAKQAQIPLAAVIKPFATIPSNESPLYLVNHGEsgPVRCNRCKAYMC 371
Cdd:COG5028   134 PPL-TTNFVGSEQSNCSPKYVRSTMYAIPETNDLLKKSKIPFGLVIRPFLELYPEEDPVPLVEDGS--IVRCRRCRSYIN 210
                         250       260       270       280       290       300       310       320
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 190343021  372 PFMQFIEGGRRYQCGFCNCVNDVPPFYFQHLDHIGRRLDHYEKPELSLGSYEYVATLDYcrKSKPPNPPAFIFMIDVSYS 451
Cdd:COG5028   211 PFVQFIEQGRKWRCNICRSKNDVPEGFDNPSGPNDPRSDRYSRPELKSGVVDFLAPKEY--SLRQPPPPVYVFLIDVSFE 288
                         330       340       350       360       370       380       390       400
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 190343021  452 NIKNGLVKLICEELKTMLEKIPKEEQEetsaIRVGFITYNKVLHFFNVKSNLaQPQMMVVTDVGEVFVPLLDG-FLVNYQ 530
Cdd:COG5028   289 AIKNGLVKAAIRAILENLDQIPNFDPR----TKIAIICFDSSLHFFKLSPDL-DEQMLIVSDLDEPFLPFPSGlFVLPLK 363
                         410       420       430       440       450       460       470       480
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 190343021  531 ESQSVIHNLLDQIPDMFADSNENETVFAPVIQAGMEALKAadCPGKLFIFHSSLPTAeAPGKLKNRDDKklvntdkEKIL 610
Cdd:COG5028   364 SCKQIIETLLDRVPRIFQDNKSPKNALGPALKAAKSLIGG--TGGKIIVFLSTLPNM-GIGKLQLREDK-------ESSL 433
                         490       500       510       520       530       540       550       560
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 190343021  611 FQPQTNVYDSLAKDCVAHGCSVTLFLFPSQYVDVASLGLVPQLTGGTLYKYNNFQM--HLDRQQFLNDLRNDIEKKIGFD 688
Cdd:COG5028   434 LSCKDSFYKEFAIECSKVGISVDLFLTSEDYIDVATLSHLCRYTGGQTYFYPNFSAtrPNDATKLANDLVSHLSMEIGYE 513
                         570       580       590       600       610       620       630       640
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 190343021  689 AIMRVRTSTGFRATDFFGGILMNNTTDVEMAAIDCDKAVTVEFKHDDKLSeDSGALIQCAVLYTTISGQRRLRIHNLGLN 768
Cdd:COG5028   514 AVMRVRCSTGLRVSSFYGNFFNRSSDLCAFSTMPRDTSLLVEFSIDEKLM-TSDVYFQVALLYTLNDGERRIRVVNLSLP 592
                         650       660       670       680       690       700       710       720
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 190343021  769 CSSQLADLYKSCETDALINFFAKSAFKAVLHQPLKVIREILVNQTAHMLACYRKNCASPSAASQLILPDSMKVLPVYMNC 848
Cdd:COG5028   593 TSSSIREVYASADQLAIACILAKKASTKALNSSLKEARVLINKSMVDILKAYKKELVKSNTSTQLPLPANLKLLPLLMLA 672
                         730       740       750       760       770       780       790       800
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 190343021  849 LLKNcVLLSRPEISTDERAYQRQLVMTMGVADSQLFFYPQLLPIHTL-------DVKSTMLPAAVRCSESRLSEEGIFLL 921
Cdd:COG5028   673 LLKS-SAFRSGSTPSDIRISALNRLTSLPLKQLMRNIYPTLYALHDMpieaglpDEGLLVLPSPINATSSLLESGGLYLI 751
                         810       820       830       840       850       860       870       880
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 190343021  922 ANGLHMFLWLGVSSPPELIQGIFNVPSFAHINTDMTLLPEVGNPYSQQLRMIMGIIQQKRPYS-MKLTIVKQREQP--EM 998
Cdd:COG5028   752 DTGQKIFLWFGKDAVPSLLQDLFGVDSLSDIPSGKFTLPPTGNEFNERVRNIIGELRSVNDDStLPLVLVRGGGDPslRL 831
                         890       900
                  ....*....|....*....|....*....
gi 190343021  999 VFRQFLVEDKgLYGGSSYVDFLCCVHKEI 1027
Cdd:COG5028   832 WFFSTLVEDK-TLNIPSYLDYLQILHEKI 859
Sec24-like cd01479
Sec24-like: Protein and membrane traffic in eukaryotes is mediated at least in part by the ...
437-696 7.39e-119

Sec24-like: Protein and membrane traffic in eukaryotes is mediated at least in part by the budding and fusion of intracellular transport vesicles that selectively carry cargo proteins and lipids from donor to acceptor organelles. The two main classes of vesicular carriers within the endocytic and the biosynthetic pathways are COP- and clathrin-coated vesicles. Formation of COPII vesicles requires the ordered assembly of the coat, which is built from several cytosolic components: the GTPase Sar1 and the Sec23-Sec24 and Sec13-Sec31 complexes. The process is initiated by the conversion of GDP to GTP by the GTPase Sar1, which then recruits the heterodimeric complex of Sec23 and Sec24. This heterodimeric complex generates the pre-budding complex. The final step, leading to membrane deformation and budding of COPII-coated vesicles, is carried out by the heterodimeric complex Sec13-Sec31. The members of this CD belong to the Sec23-like family. Sec24 is very similar to Sec23. The Sec23 and Sec24 polypeptides fold into five distinct domains: a beta-barrel, a zinc finger, a vWA or trunk domain, an all-helical region, and a carboxy-terminal gelsolin domain. The members of this subgroup carry a partial MIDAS motif and have the overall Para-Rossmann type fold that is characteristic of this superfamily.


Pssm-ID: 238756 [Multi-domain]  Cd Length: 244  Bit Score: 364.29  E-value: 7.39e-119
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 190343021  437 PNPPAFIFMIDVSYSNIKNGLVKLICEELKTMLEKIPKEEqeetSAIRVGFITYNKVLHFFNVKSNLAQPQMMVVTDVGE 516
Cdd:cd01479     1 PQPAVYVFLIDVSYNAIKSGLLATACEALLSNLDNLPGDD----PRTRVGFITFDSTLHFFNLKSSLEQPQMMVVSDLDD 76
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 190343021  517 VFVPLLDGFLVNYQESQSVIHNLLDQIPDMFADSNENETVFAPVIQAGMEALKaaDCPGKLFIFHSSLPTAEApGKLKNR 596
Cdd:cd01479    77 PFLPLPDGLLVNLKESRQVIEDLLDQIPEMFQDTKETESALGPALQAAFLLLK--ETGGKIIVFQSSLPTLGA-GKLKSR 153
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 190343021  597 DDKKLVNTDKEKILFQPQTNVYDSLAKDCVAHGCSVTLFLFPSQYVDVASLGLVPQLTGGTLYKYNNFQmhldrqqflND 676
Cdd:cd01479   154 EDPKLLSTDKEKQLLQPQTDFYKKLALECVKSQISVDLFLFSNQYVDVATLGCLSRLTGGQVYYYPSFN---------FS 224
                         250       260
                  ....*....|....*....|
gi 190343021  677 LRNDIEKKIGFDAIMRVRTS 696
Cdd:cd01479   225 APNDVEKLVNELARYLTRKI 244
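Each hit carries a one-line summary (Pssm-ID, Cd Length, Bit Score, E-value), and the alignment rows give the first and last model positions that were matched. A minimal sketch of reading those values and computing how much of the model the alignment spans is shown below. The names `parse_summary` and `model_coverage` are hypothetical, and the regular expression is only an assumption about the layout seen in this report, not a documented format.

```python
import re

# Assumed layout, taken from the summary lines printed in this report.
SUMMARY_RE = re.compile(
    r"Pssm-ID:\s*(?P<pssm_id>\d+)\s*"
    r"(?:\[(?P<flag>[^\]]+)\])?\s*"
    r"Cd Length:\s*(?P<cd_length>\d+)\s*"
    r"Bit Score:\s*(?P<bit_score>[\d.]+)\s*"
    r"E-value:\s*(?P<e_value>[\d.eE+-]+)"
)


def parse_summary(line):
    """Turn one 'Pssm-ID: ...' summary line into a dictionary of typed values."""
    match = SUMMARY_RE.search(line)
    if match is None:
        raise ValueError("not a Pssm-ID summary line: " + repr(line))
    return {
        "pssm_id": int(match["pssm_id"]),
        "multi_domain": match["flag"] == "Multi-domain",
        "cd_length": int(match["cd_length"]),
        "bit_score": float(match["bit_score"]),
        "e_value": float(match["e_value"]),
    }


def model_coverage(model_start, model_end, cd_length):
    """Fraction of the domain model spanned by the alignment (ends inclusive)."""
    return (model_end - model_start + 1) / cd_length


summary = parse_summary(
    "Pssm-ID: 238756 [Multi-domain]  Cd Length: 244  Bit Score: 364.29  E-value: 7.39e-119"
)
# The cd01479 rows above run from model position 1 to 244.
print(summary["cd_length"], model_coverage(1, 244, summary["cd_length"]))
```

For cd01479 the alignment spans the whole 244-position model. The same calculation for COG5028 (model positions 4 to 859 of 861) gives a span just under the full model length.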
Sec23_trunk pfam04811
Sec23/Sec24 trunk domain; COPII-coated vesicles carry proteins from the endoplasmic reticulum ...
437-681 6.32e-111

Sec23/Sec24 trunk domain; COPII-coated vesicles carry proteins from the endoplasmic reticulum to the Golgi complex. This vesicular transport can be reconstituted by using three cytosolic components containing five proteins: the small GTPase Sar1p, the Sec23p/24p complex, and the Sec13p/Sec31p complex. This domain is known as the trunk domain and has an alpha/beta vWA fold and forms the dimer interface.


Pssm-ID: 398467 [Multi-domain]  Cd Length: 241  Bit Score: 343.46  E-value: 6.32e-111
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 190343021   437 PNPPAFIFMIDVSYSNIKNGLVKLICEELKTMLEKIPKEeqeetSAIRVGFITYNKVLHFFNVKSNLAQPQMMVVTDVGE 516
Cdd:pfam04811    1 PQPPVFLFVIDVSYNAIKSGLLAALKESLLQSLDLLPGD-----PRARVGFITFDSTVHFFNLGSSLRQPQMLVVSDLQD 75
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 190343021   517 VFVPLLDGFLVNYQESQSVIHNLLDQIPDMFADSNENETVFAPVIQAGMEALKAADCPGKLFIFHSSLPTAEAPGKLKNR 596
Cdd:pfam04811   76 MFLPLPDRFLVPLSECRFVLEDLLEQLPPMFPVTKRPERCLGPALQAAFLLLKAAFTGGKIMVFQGGLPTVGPGGKLKSR 155
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 190343021   597 DDKKLVNTDKEKILFQPQTN-VYDSLAKDCVAHGCSVTLFLFPSQYVDVASLGLVPQLTGGTLYKYNNFQMHLDRQQFLN 675
Cdd:pfam04811  156 LDESHHGTDKEKAKLVKKADkFYKSLAKECVKQGHSVDLFAFSLDYVDVATLGQLSRLTGGQVYLYPSFQADVDGSKFKQ 235

                   ....*.
gi 190343021   676 DLRNDI 681
Cdd:pfam04811  236 DLQRYF 241
PTZ00395 PTZ00395
Sec24-related protein; Provisional
439-1025 1.45e-50

Sec24-related protein; Provisional


Pssm-ID: 185594 [Multi-domain]  Cd Length: 1560  Bit Score: 195.29  E-value: 1.45e-50
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 190343021  439 PPAFIFMIDVSYSNIKNGLVKLICEELKTMLEKIpkeeqeETSAIRVGFITYNKVLHFFNVKSNLAQP------------ 506
Cdd:PTZ00395  952 PPYFVFVVECSYNAIYNNITYTILEGIRYAVQNV------KCPQTKIAIITFNSSIYFYHCKGGKGVSgeegdggggsgn 1025
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 190343021  507 -QMMVVTDVGEVFVPL-LDGFLVNYQESQSVIHNLLDQIPDMFADSNENETVFAPVIQAGMEALKAADCPGKLFIFHSSL 584
Cdd:PTZ00395 1026 hQVIVMSDVDDPFLPLpLEDLFFGCVEEIDKINTLIDTIKSVSTTMQSYGSCGNSALKIAMDMLKERNGLGSICMFYTTT 1105
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 190343021  585 PTAeAPGKLKnrddkKLVNTDKEKILFQPQTNVYDSLAKDCVAHGCSVTLFLFPSQYVDVA--SLGLVPQLTGGTLYKYN 662
Cdd:PTZ00395 1106 PNC-GIGAIK-----ELKKDLQENFLEVKQKIFYDSLLLDLYAFNISVDIFIISSNNVRVCvpSLQYVAQNTGGKILFVE 1179
                         250       260       270       280       290       300       310       320
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 190343021  663 NFQMHLDRQQ-FLNDLRNDIEKKIGFDAIMRVRTSTG------FRATDFFGGILMNNTtdVEMAAIDCDKAVTVEFKHDD 735
Cdd:PTZ00395 1180 NFLWQKDYKEiYMNIMDTLTSEDIAYCCELKLRYSHHmsvkklFCCNNNFNSIISVDT--IKIPKIRHDQTFAFLLNYSD 1257
                         330       340       350       360       370       380       390       400
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 190343021  736 KLSEDSGALIQCAVLYTTISGQRRLRIHNLGLNCSSQLADLYKSCETDALINFFAKSAFKAVLHQplKVIREILVNQTAH 815
Cdd:PTZ00395 1258 ISESKKQIYFQCACIYTNLWGDRFVRLHTTHMNLTSSLSTVFRYTDAEALMNILIKQLCTNILHN--DNYSKIIIDNLAA 1335
                         410       420       430       440       450       460       470       480
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 190343021  816 MLACYRKNCASPSAASQLILPDSMKVLPVYMNCLLKNCVllSRPEISTDERAYQRQLVMTMGVADSQLFFYPQLLPIH-- 893
Cdd:PTZ00395 1336 ILFSYRINCASSAHSGQLILPDTLKLLPLFTSSLLKHNV--TKKEILHDLKVYSLIKLLSMPIISSLLYVYPVMYVIHik 1413
                         490       500       510       520       530       540       550       560
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 190343021  894 -------TLDVKSTM-LPAAVRCSESRLSEEGIFLLANGLHMFLWLGVSSPPELIQGIF-NVPSFAHINTdmtlLPEVGN 964
Cdd:PTZ00395 1414 gktneidSMDVDDDLfIPKTIPSSAEKIYSNGIYLLDACTHFYLYFGFHSDANFAKEIVgDIPTEKNAHE----LNLTDT 1489
                         570       580       590       600       610       620
                  ....*....|....*....|....*....|....*....|....*....|....*....|...
gi 190343021  965 PYSQQLRMIMGIIQQKRPYS--MKLTIVKQREQPEMVFRQFLVEDKGlYGGSSYVDFLCCVHK 1025
Cdd:PTZ00395 1490 PNAQKVQRIIKNLSRIHHFNkyVPLVMVAPKSNEEEHLISLCVEDKA-DKEYSYVNFLCFIHK 1551
BimA_second NF040983
trimeric autotransporter actin-nucleating factor BimA; This HMM describes BimA (Burkholderia ...
143-246 6.58e-05

trimeric autotransporter actin-nucleating factor BimA; This HMM describes BimA (Burkholderia intracellular motility A), WP_004266405.1-like proteins in Burkholderia mallei or B. pseudomallei. The term BimA has also been used for WP_011205626.1-like homologs that have a very different N-terminal half.


Pssm-ID: 468913 [Multi-domain]  Cd Length: 382  Bit Score: 46.43  E-value: 6.58e-05
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 190343021  143 PPGPLSATSLQTPPRPPQPSilqpgsqvlPPPPTTLNGPGASPLPLPmyrPDGLSGPPPPNAQYQPPPLPGQTLGAGYPP 222
Cdd:NF040983   79 PVGDRTLPNKVPPPPPPPPP---------PPPPPPTPPPPPPPPPPP---PPPSPPPPPPPSPPPSPPPPTTTPPTRTTP 146
                          90       100
                  ....*....|....*....|....
gi 190343021  223 QQANSGPQMAGAQlsyPGGFPGGP 246
Cdd:NF040983  147 STTTPTPSMHPIQ---PTQLPSIP 167
PABP-1234 TIGR01628
polyadenylate binding protein, human types 1, 2, 3, 4 family; These eukaryotic proteins ...
102-231 1.22e-04

polyadenylate binding protein, human types 1, 2, 3, 4 family; These eukaryotic proteins recognize the poly-A of mRNA and consist of four tandem RNA recognition domains at the N-terminus (rrm: pfam00076) followed by a PABP-specific domain (pfam00658) at the C-terminus. The protein is involved in the transport of mRNAs from the nucleus to the cytoplasm. There are four paralogs in Homo sapiens, which are, respectively, expressed in testis, expressed in platelets, broadly expressed, and of unknown tissue range.


Pssm-ID: 130689 [Multi-domain]  Cd Length: 562  Bit Score: 45.95  E-value: 1.22e-04
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 190343021   102 PSAQSSYPGPISTS-SVTQLGSQLSAMQINSY-----GSGMAPPSQGppgplsatslqtPPRPPQPSILQPGSQVLPPPP 175
Cdd:TIGR01628  381 RMRQLPMGSPMGGAmGQPPYYGQGPQQQFNGQplgwpRMSMMPTPMG------------PGGPLRPNGLAPMNAVRAPSR 448
                           90       100       110       120       130
                   ....*....|....*....|....*....|....*....|....*....|....*..
gi 190343021   176 TTLNGPGASPLPLPMYRPDGLSGPPPPNAQyQPPPLPGQTLGAGYPPQQ-ANSGPQM 231
Cdd:TIGR01628  449 NAQNAAQKPPMQPVMYPPNYQSLPLSQDLP-QPQSTASQGGQNKKLAQVlASATPQM 504
BimA_first NF040984
trimeric autotransporter actin-nucleating factor BimA; BimA (B. pseudomallei intracellular ...
94-211 2.45e-04

trimeric autotransporter actin-nucleating factor BimA; BimA (B. pseudomallei intracellular motility protein A) is a trimeric autotransporter, homologous in its C-terminal half to a number of trimeric autotransporter adhesins. It is a virulence factor that nucleates actin, so that actin polymerization can drive escape by B. pseudomallei out of one cell and into a neighboring cell. HMM NF040983 describes a homolog with similar activity but substantial difference in sequence architecture in the N-terminal region.


Pssm-ID: 468914 [Multi-domain]  Cd Length: 517  Bit Score: 44.86  E-value: 2.45e-04
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 190343021   94 ASSHAPYQPSaqssyPGPISTSSVTQLGSQLSAMQINSYGSgmappsqgPPGPLSATSLQTPPrpPQPSilqPGSQVLPP 173
Cdd:NF040984    6 SSSHAPDAPK-----PSSIATTLCRALASLSLGLSMDAEAN--------PPEPPGGTNIPVPP--PMPG---GGANIPVP 67
                          90       100       110
                  ....*....|....*....|....*....|....*...
gi 190343021  174 PPTTLNGPGASPLPLPmyrPDGLSGPPPpnaqyQPPPL 211
Cdd:NF040984   68 PPMPGGGANIPPPPPP---PGGIGGATP-----SPPPL 97
Med15 pfam09606
ARC105 or Med15 subunit of Mediator complex non-fungal; The approx. 70 residue Med15 domain of ...
3-240 4.19e-03

ARC105 or Med15 subunit of Mediator complex non-fungal; The approx. 70 residue Med15 domain of the ARC-Mediator co-activator is a three-helix bundle with marked similarity to the KIX domain. The sterol regulatory element binding protein (SREBP) family of transcription activators uses the ARC105 subunit to activate target genes in the regulation of cholesterol and fatty acid homeostasis. In addition, Med15 is a critical transducer of gene activation signals that control early metazoan development.


Pssm-ID: 312941 [Multi-domain]  Cd Length: 732  Bit Score: 41.15  E-value: 4.19e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 190343021     3 QQGYVATPPYSQPQPGIGLSPPHYGHYGDPShtASPTGMmkPAGPLGATATR-----------GMLPPGPPPPGPHQFGQ 71
Cdd:pfam09606  243 MQQQQPQQQGQQSQLGMGINQMQQMPQGVGG--GAGQGG--PGQPMGPPGQQpgampnvmsigDQNNYQQQQTRQQQQQQ 318
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 190343021    72 NGAHATGHPPQRFPGPPPVNNVASshAPYQPSAQSSYPGPISTSSVTQLGSQLSAMqinsygsgMAPPSQGPPGPL-SAT 150
Cdd:pfam09606  319 GGNHPAAHQQQMNQSVGQGGQVVA--LGGLNHLETWNPGNFGGLGANPMQRGQPGM--------MSSPSPVPGQQVrQVT 388
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 190343021   151 SLQTPPRPPQPSILQPGsqvlppppttlnGPGASPLPLPMYRPdglsgPPPPNAQYQPPPLPGQTlgagyPPQQANSGPQ 230
Cdd:pfam09606  389 PNQFMRQSPQPSVPSPQ------------GPGSQPPQSHPGGM-----IPSPALIPSPSPQMSQQ-----PAQQRTIGQD 446
                          250
                   ....*....|
gi 190343021   231 MAGAQLSYPG 240
Cdd:pfam09606  447 SPGGSLNTPG 456
SAV_2336_NTERM NF041121
SAV_2336 family N-terminal domain; This HMM describes an N-terminal domain shared by SAV_2336 ...
138-205 5.36e-03

SAV_2336 family N-terminal domain; This HMM describes an N-terminal domain shared by SAV_2336 (BAC70047.1) whose C-terminal region suggests restriction enzyme activity (PMID: 18456708), and with other proteins with unrelated C-terminal regions. A member protein was also identified in a kanamycin biosynthetic gene cluster (PMID:16766657), while N-terminal regions of two other member proteins were named Trypco1 in a bioinformatic study (PMID:32101166) of predicted bacterial conflict systems.


Pssm-ID: 469044 [Multi-domain]  Cd Length: 473  Bit Score: 40.37  E-value: 5.36e-03
                          10        20        30        40        50        60
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*...
gi 190343021  138 PPSQGPPGPLSATSLQTPPRPPQPSILQPGSQVLPPPPTTLNGPGASPLPLPMYRPDglsGPPPPNAQ 205
Cdd:NF041121   39 PPPAAPPSPPGDPPEPPAPEPAPLPAPYPGSLAPPPPPPPGPAGAAPGAALPVRVPA---PPALPNPL 103
SAV_2336_NTERM NF041121
SAV_2336 family N-terminal domain; This HMM describes an N-terminal domain shared by SAV_2336 ...
137-214 7.43e-03

SAV_2336 family N-terminal domain; This HMM describes an N-terminal domain shared by SAV_2336 (BAC70047.1) whose C-terminal region suggests restriction enzyme activity (PMID: 18456708), and with other proteins with unrelated C-terminal regions. A member protein was also identified in a kanamycin biosynthetic gene cluster (PMID:16766657), while N-terminal regions of two other member proteins were named Trypco1 in a bioinformatic study (PMID:32101166) of predicted bacterial conflict systems.


Pssm-ID: 469044 [Multi-domain]  Cd Length: 473  Bit Score: 39.99  E-value: 7.43e-03
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 190343021  137 APPSQGPPGPLSATSLQTPPRPPQPSiLQPGSQVLPPPPTTLNGPGASP--LPLPMYRPDGLSGPPPPNAQ----YQPPP 210
Cdd:NF041121   20 APPSPEGPAPTAASQPATPPPPAAPP-SPPGDPPEPPAPEPAPLPAPYPgsLAPPPPPPPGPAGAAPGAALpvrvPAPPA 98

                  ....
gi 190343021  211 LPGQ 214
Cdd:NF041121   99 LPNP 102
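Several of the hits listed above overlap on the query, so it can help to merge their intervals before reading off which stretch of the protein is covered. The sketch below is illustrative only: `merge_intervals` is a hypothetical helper, and the interval values are copied from the Interval column of the hit list.

```python
def merge_intervals(intervals):
    """Merge overlapping or directly adjacent (start, end) intervals, ends inclusive."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1] + 1:
            # Extend the previous interval when this one overlaps or touches it.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


# (name, query interval) pairs as listed in the report.
hits = [
    ("COG5028", (132, 1027)),
    ("Sec24-like", (437, 696)),
    ("Sec23_trunk", (437, 681)),
    ("PTZ00395", (439, 1025)),
    ("BimA_second", (143, 246)),
    ("PABP-1234", (102, 231)),
    ("BimA_first", (94, 211)),
    ("Med15", (3, 240)),
    ("SAV_2336_NTERM", (138, 205)),
    ("SAV_2336_NTERM", (137, 214)),
]

all_merged = merge_intervals(interval for _, interval in hits)
trunk_merged = merge_intervals([(437, 696), (437, 681)])
print(all_merged)    # [(3, 1027)]
print(trunk_merged)  # [(437, 696)]
```

Merging treats every hit equally. In practice one would probably filter by E-value first, since the weaker hits (E-values around 1e-03 to 1e-05) all fall in the proline-rich N-terminal region.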
 
Cdd:COG5028   752 DTGQKIFLWFGKDAVPSLLQDLFGVDSLSDIPSGKFTLPPTGNEFNERVRNIIGELRSVNDDStLPLVLVRGGGDPslRL 831
                         890       900
                  ....*....|....*....|....*....
gi 190343021  999 VFRQFLVEDKgLYGGSSYVDFLCCVHKEI 1027
Cdd:COG5028   832 WFFSTLVEDK-TLNIPSYLDYLQILHEKI 859
Sec24-like cd01479
Sec24-like: Protein and membrane traffic in eukaryotes is mediated at least in part by the ...
437-696 7.39e-119

Sec24-like: Protein and membrane traffic in eukaryotes is mediated at least in part by the budding and fusion of intracellular transport vesicles that selectively carry cargo proteins and lipids from donor to acceptor organelles. The two main classes of vesicular carriers within the endocytic and the biosynthetic pathways are COP- and clathrin-coated vesicles. Formation of COPII vesicles requires the ordered assembly of a coat built from several cytosolic components: the GTPase Sar1 and the Sec23-Sec24 and Sec13-Sec31 complexes. The process is initiated by the exchange of GDP for GTP on the GTPase Sar1, which then recruits the heterodimeric complex of Sec23 and Sec24. This heterodimeric complex generates the pre-budding complex. The final step, leading to membrane deformation and budding of COPII-coated vesicles, is carried out by the heterodimeric complex Sec13-Sec31. The members of this CD belong to the Sec23-like family. Sec24 is very similar to Sec23. The Sec23 and Sec24 polypeptides fold into five distinct domains: a beta-barrel, a zinc finger, a vWA or trunk domain, an all-helical region, and a carboxy-terminal gelsolin domain. The members of this subgroup carry a partial MIDAS motif and have the overall Para-Rossmann type fold that is characteristic of this superfamily.


Pssm-ID: 238756 [Multi-domain]  Cd Length: 244  Bit Score: 364.29  E-value: 7.39e-119
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 190343021  437 PNPPAFIFMIDVSYSNIKNGLVKLICEELKTMLEKIPKEEqeetSAIRVGFITYNKVLHFFNVKSNLAQPQMMVVTDVGE 516
Cdd:cd01479     1 PQPAVYVFLIDVSYNAIKSGLLATACEALLSNLDNLPGDD----PRTRVGFITFDSTLHFFNLKSSLEQPQMMVVSDLDD 76
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 190343021  517 VFVPLLDGFLVNYQESQSVIHNLLDQIPDMFADSNENETVFAPVIQAGMEALKaaDCPGKLFIFHSSLPTAEApGKLKNR 596
Cdd:cd01479    77 PFLPLPDGLLVNLKESRQVIEDLLDQIPEMFQDTKETESALGPALQAAFLLLK--ETGGKIIVFQSSLPTLGA-GKLKSR 153
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 190343021  597 DDKKLVNTDKEKILFQPQTNVYDSLAKDCVAHGCSVTLFLFPSQYVDVASLGLVPQLTGGTLYKYNNFQmhldrqqflND 676
Cdd:cd01479   154 EDPKLLSTDKEKQLLQPQTDFYKKLALECVKSQISVDLFLFSNQYVDVATLGCLSRLTGGQVYYYPSFN---------FS 224
                         250       260
                  ....*....|....*....|
gi 190343021  677 LRNDIEKKIGFDAIMRVRTS 696
Cdd:cd01479   225 APNDVEKLVNELARYLTRKI 244
Sec23_trunk pfam04811
Sec23/Sec24 trunk domain; COPII-coated vesicles carry proteins from the endoplasmic reticulum ...
437-681 6.32e-111

Sec23/Sec24 trunk domain; COPII-coated vesicles carry proteins from the endoplasmic reticulum to the Golgi complex. This vesicular transport can be reconstituted by using three cytosolic components containing five proteins: the small GTPase Sar1p, the Sec23p/24p complex, and the Sec13p/Sec31p complex. This domain is known as the trunk domain and has an alpha/beta vWA fold and forms the dimer interface.


Pssm-ID: 398467 [Multi-domain]  Cd Length: 241  Bit Score: 343.46  E-value: 6.32e-111
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 190343021   437 PNPPAFIFMIDVSYSNIKNGLVKLICEELKTMLEKIPKEeqeetSAIRVGFITYNKVLHFFNVKSNLAQPQMMVVTDVGE 516
Cdd:pfam04811    1 PQPPVFLFVIDVSYNAIKSGLLAALKESLLQSLDLLPGD-----PRARVGFITFDSTVHFFNLGSSLRQPQMLVVSDLQD 75
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 190343021   517 VFVPLLDGFLVNYQESQSVIHNLLDQIPDMFADSNENETVFAPVIQAGMEALKAADCPGKLFIFHSSLPTAEAPGKLKNR 596
Cdd:pfam04811   76 MFLPLPDRFLVPLSECRFVLEDLLEQLPPMFPVTKRPERCLGPALQAAFLLLKAAFTGGKIMVFQGGLPTVGPGGKLKSR 155
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 190343021   597 DDKKLVNTDKEKILFQPQTN-VYDSLAKDCVAHGCSVTLFLFPSQYVDVASLGLVPQLTGGTLYKYNNFQMHLDRQQFLN 675
Cdd:pfam04811  156 LDESHHGTDKEKAKLVKKADkFYKSLAKECVKQGHSVDLFAFSLDYVDVATLGQLSRLTGGQVYLYPSFQADVDGSKFKQ 235

                   ....*.
gi 190343021   676 DLRNDI 681
Cdd:pfam04811  236 DLQRYF 241
trunk_domain cd01468
trunk domain. COPII-coated vesicles carry proteins from the endoplasmic reticulum to the Golgi ...
437-677 7.13e-98

trunk domain. COPII-coated vesicles carry proteins from the endoplasmic reticulum to the Golgi complex. This vesicular transport can be reconstituted by using three cytosolic components containing five proteins: the small GTPase Sar1p, the Sec23p/24p complex, and the Sec13p/Sec31p complex. This domain is known as the trunk domain and has an alpha/beta vWA fold and forms the dimer interface. Some members of this family possess a partial MIDAS motif that is a characteristic feature of most vWA domain proteins.


Pssm-ID: 238745 [Multi-domain]  Cd Length: 239  Bit Score: 308.79  E-value: 7.13e-98
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 190343021  437 PNPPAFIFMIDVSYSNIKNGLVKLICEELKTMLEKIPKEeqeetSAIRVGFITYNKVLHFFNVKSNLAQPQMMVVTDVGE 516
Cdd:cd01468     1 PQPPVFVFVIDVSYEAIKEGLLQALKESLLASLDLLPGD-----PRARVGLITYDSTVHFYNLSSDLAQPKMYVVSDLKD 75
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 190343021  517 VFVPLLDGFLVNYQESQSVIHNLLDQIPDMFAD--SNENETVFAPVIQAGMEALKAADCPGKLFIFHSSLPTAEaPGKLK 594
Cdd:cd01468    76 VFLPLPDRFLVPLSECKKVIHDLLEQLPPMFWPvpTHRPERCLGPALQAAFLLLKGTFAGGRIIVFQGGLPTVG-PGKLK 154
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 190343021  595 NRDDKKLVNTDKEKILFQPQTNVYDSLAKDCVAHGCSVTLFLFPSQYVDVASLGLVPQLTGGTLYKYNNFQMHLDRQQFL 674
Cdd:cd01468   155 SREDKEPIRSHDEAQLLKPATKFYKSLAKECVKSGICVDLFAFSLDYVDVATLKQLAKSTGGQVYLYDSFQAPNDGSKFK 234

                  ...
gi 190343021  675 NDL 677
Cdd:cd01468   235 QDL 237
PTZ00395 PTZ00395
Sec24-related protein; Provisional
439-1025 1.45e-50

Sec24-related protein; Provisional


Pssm-ID: 185594 [Multi-domain]  Cd Length: 1560  Bit Score: 195.29  E-value: 1.45e-50
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 190343021  439 PPAFIFMIDVSYSNIKNGLVKLICEELKTMLEKIpkeeqeETSAIRVGFITYNKVLHFFNVKSNLAQP------------ 506
Cdd:PTZ00395  952 PPYFVFVVECSYNAIYNNITYTILEGIRYAVQNV------KCPQTKIAIITFNSSIYFYHCKGGKGVSgeegdggggsgn 1025
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 190343021  507 -QMMVVTDVGEVFVPL-LDGFLVNYQESQSVIHNLLDQIPDMFADSNENETVFAPVIQAGMEALKAADCPGKLFIFHSSL 584
Cdd:PTZ00395 1026 hQVIVMSDVDDPFLPLpLEDLFFGCVEEIDKINTLIDTIKSVSTTMQSYGSCGNSALKIAMDMLKERNGLGSICMFYTTT 1105
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 190343021  585 PTAeAPGKLKnrddkKLVNTDKEKILFQPQTNVYDSLAKDCVAHGCSVTLFLFPSQYVDVA--SLGLVPQLTGGTLYKYN 662
Cdd:PTZ00395 1106 PNC-GIGAIK-----ELKKDLQENFLEVKQKIFYDSLLLDLYAFNISVDIFIISSNNVRVCvpSLQYVAQNTGGKILFVE 1179
                         250       260       270       280       290       300       310       320
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 190343021  663 NFQMHLDRQQ-FLNDLRNDIEKKIGFDAIMRVRTSTG------FRATDFFGGILMNNTtdVEMAAIDCDKAVTVEFKHDD 735
Cdd:PTZ00395 1180 NFLWQKDYKEiYMNIMDTLTSEDIAYCCELKLRYSHHmsvkklFCCNNNFNSIISVDT--IKIPKIRHDQTFAFLLNYSD 1257
                         330       340       350       360       370       380       390       400
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 190343021  736 KLSEDSGALIQCAVLYTTISGQRRLRIHNLGLNCSSQLADLYKSCETDALINFFAKSAFKAVLHQplKVIREILVNQTAH 815
Cdd:PTZ00395 1258 ISESKKQIYFQCACIYTNLWGDRFVRLHTTHMNLTSSLSTVFRYTDAEALMNILIKQLCTNILHN--DNYSKIIIDNLAA 1335
                         410       420       430       440       450       460       470       480
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 190343021  816 MLACYRKNCASPSAASQLILPDSMKVLPVYMNCLLKNCVllSRPEISTDERAYQRQLVMTMGVADSQLFFYPQLLPIH-- 893
Cdd:PTZ00395 1336 ILFSYRINCASSAHSGQLILPDTLKLLPLFTSSLLKHNV--TKKEILHDLKVYSLIKLLSMPIISSLLYVYPVMYVIHik 1413
                         490       500       510       520       530       540       550       560
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 190343021  894 -------TLDVKSTM-LPAAVRCSESRLSEEGIFLLANGLHMFLWLGVSSPPELIQGIF-NVPSFAHINTdmtlLPEVGN 964
Cdd:PTZ00395 1414 gktneidSMDVDDDLfIPKTIPSSAEKIYSNGIYLLDACTHFYLYFGFHSDANFAKEIVgDIPTEKNAHE----LNLTDT 1489
                         570       580       590       600       610       620
                  ....*....|....*....|....*....|....*....|....*....|....*....|...
gi 190343021  965 PYSQQLRMIMGIIQQKRPYS--MKLTIVKQREQPEMVFRQFLVEDKGlYGGSSYVDFLCCVHK 1025
Cdd:PTZ00395 1490 PNAQKVQRIIKNLSRIHHFNkyVPLVMVAPKSNEEEHLISLCVEDKA-DKEYSYVNFLCFIHK 1551
Sec23_helical pfam04815
Sec23/Sec24 helical domain; COPII-coated vesicles carry proteins from the endoplasmic ...
783-883 1.57e-32

Sec23/Sec24 helical domain; COPII-coated vesicles carry proteins from the endoplasmic reticulum to the Golgi complex. This vesicular transport can be reconstituted by using three cytosolic components containing five proteins: the small GTPase Sar1p, the Sec23p/24p complex, and the Sec13p/Sec31p complex. This domain is composed of five alpha helices.


Pssm-ID: 461441 [Multi-domain]  Cd Length: 103  Bit Score: 121.84  E-value: 1.57e-32
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 190343021   783 DALINFFAKSAFKAVLHQPLKVIREILVNQTAHMLACYRKNCASPSAASQLILPDSMKVLPVYMNCLLKNCVLLSRPEIS 862
Cdd:pfam04815    3 EAIAVLLAKKAVEKALSSSLSDAREALDNKLVDILAAYRKYCASSSSPGQLILPESLKLLPLYMLALLKSPALRGGNSSP 82
                           90       100
                   ....*....|....*....|.
gi 190343021   863 TDERAYQRQLVMTMGVADSQL 883
Cdd:pfam04815   83 SDERAYARHLLLSLPVEELLL 103
Sec23_BS pfam08033
Sec23/Sec24 beta-sandwich domain;
686-770 4.52e-29

Sec23/Sec24 beta-sandwich domain;


Pssm-ID: 429794 [Multi-domain]  Cd Length: 86  Bit Score: 111.09  E-value: 4.52e-29
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 190343021   686 GFDAIMRVRTSTGFRATDFFGGILMNNTTD-VEMAAIDCDKAVTVEFKHDDKLSEDSGALIQCAVLYTTISGQRRLRIHN 764
Cdd:pfam08033    1 GFNAVLRVRTSKGLKVSGFIGNFVSRSSGDtWKLPSLDPDTSYAFEFDIDEPLPNGSNAYIQFALLYTHSSGERRIRVTT 80

                   ....*.
gi 190343021   765 LGLNCS 770
Cdd:pfam08033   81 VALPVT 86
SEC23 COG5047
Vesicle coat complex COPII, subunit SEC23 [Intracellular trafficking and secretion];
312-900 1.71e-21

Vesicle coat complex COPII, subunit SEC23 [Intracellular trafficking and secretion];


Pssm-ID: 227380 [Multi-domain]  Cd Length: 755  Bit Score: 100.73  E-value: 1.71e-21
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 190343021  312 IRCTTYCFPCTSDMAKQAQIPLAAVIKPFatipsNESPLYLVNHGEsgPVRCNR-CKAYMCPFMQFIEGGRRYQCGFCNC 390
Cdd:COG5047    12 IRLTWNVFPATRGDATRTVIPIACLYTPL-----HEDDALTVNYYE--PVKCTApCKAVLNPYCHIDERNQSWICPFCNQ 84
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 190343021  391 VNDVPPFYfqhldhigRRLDHYEKP-ELSLGS--YEYVAtldycrkSKPPN-PPAFIFMIDVSYSNIKNGLVKlicEELK 466
Cdd:COG5047    85 RNTLPPQY--------RDISNANLPlELLPQSstIEYTL-------SKPVIlPPVFFFVVDACCDEEELTALK---DSLI 146
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 190343021  467 TMLEKIPKEeqeetsAIrVGFITYNKVLHFFNV------KSNLAQP----QMMVVTDVGEVFVPLLDG------------ 524
Cdd:COG5047   147 VSLSLLPPE------AL-VGLITYGTSIQVHELnaenhrRSYVFSGnkeyTKENLQELLALSKPTKSGgfeskisgigqf 219
                         250       260       270       280       290       300       310       320
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 190343021  525 ----FLVNYQESQSVIHNLLDQI-PDMFADSNENE----TVFAPVIQAGMEALKAADCPGKLFIFHSSlPTAEAPGKLKN 595
Cdd:COG5047   220 assrFLLPTQQCEFKLLNILEQLqPDPWPVPAGKRplrcTGSALNIASSLLEQCFPNAGCHIVLFAGG-PCTVGPGTVVS 298
                         330       340       350       360       370       380       390       400
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 190343021  596 RDDKK------LVNTDKEKiLFQPQTNVYDSLAKDCVAHGCSVTLFLfpSQYVDVASLGLVP--QLTGGTLYKYNNFQMH 667
Cdd:COG5047   299 TELKEpmrshhDIESDSAQ-HSKKATKFYKGLAERVANQGHALDIFA--GCLDQIGIMEMEPltTSTGGALVLSDSFTTS 375
                         410       420       430       440       450       460       470       480
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 190343021  668 LDRQQFLN--DLRNDIEKKIGFDAIMRVRTSTGFRATDFFG---------------GILMNNTTDVEMAAIDCDKAVTVE 730
Cdd:COG5047   376 IFKQSFQRifNRDSEGYLKMGFNANMEVKTSKNLKIKGLIGhavsvkkkannisdsEIGIGATNSWKMASLSPKSNYALY 455
                         490       500       510       520       530       540       550       560
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 190343021  731 FK-----HDDKLSEDSGALIQCAVLYTTISGQRRLRIHNLGLNCSSQLADL-YKSCETDALINFFAK-SAFKAVLHQPLK 803
Cdd:COG5047   456 FEialgaASGSAQRPAEAYIQFITTYQHSSGTYRIRVTTVARMFTDGGLPKiNRSFDQEAAAVFMARiAAFKAETEDIID 535
                         570       580       590       600       610       620       630       640
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 190343021  804 VIREI---LVNQTAHmLACYRKNcaspsAASQLILPDSMKVLPVYMnCLLKNCVLLSRPEISTDERAYQRQLVMTMGVAD 880
Cdd:COG5047   536 VFRWIdrnLIRLCQK-FADYRKD-----DPSSFRLDPNFTLYPQFM-YHLRRSPFLSVFNNSPDETAFYRHMLNNADVND 608
                         650       660
                  ....*....|....*....|....*...
gi 190343021  881 SQLFFYPQLLPIH--------TLDVKST 900
Cdd:COG5047   609 SLIMIQPTLQSYSfekggvpvLLDSVSV 636
zf-Sec23_Sec24 pfam04810
Sec23/Sec24 zinc finger; COPII-coated vesicles carry proteins from the endoplasmic reticulum ...
360-396 1.06e-15

Sec23/Sec24 zinc finger; COPII-coated vesicles carry proteins from the endoplasmic reticulum to the Golgi complex. This vesicular transport can be reconstituted by using three cytosolic components containing five proteins: the small GTPase Sar1p, the Sec23p/24p complex, and the Sec13p/Sec31p complex. This domain is a zinc-binding domain.


Pssm-ID: 461437 [Multi-domain]  Cd Length: 38  Bit Score: 71.71  E-value: 1.06e-15
                           10        20        30
                   ....*....|....*....|....*....|....*..
gi 190343021   360 PVRCNRCKAYMCPFMQFIEGGRRYQCGFCNCVNDVPP 396
Cdd:pfam04810    1 PVRCRRCRAYLNPFCQFDFGGKKWTCNFCGTRNPVPP 37
PLN00162 PLN00162
transport protein sec23; Provisional
312-656 6.13e-11

transport protein sec23; Provisional


Pssm-ID: 215083 [Multi-domain]  Cd Length: 761  Bit Score: 66.50  E-value: 6.13e-11
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 190343021  312 IRCTTYCFPCTSDMAKQAQIPLAAVIKPFAtiPSNESPL--YlvnhgesGPVRCNRCKAYMCPFMQFIEGGRRYQCGFCN 389
Cdd:PLN00162   12 VRMSWNVWPSSKIEASKCVIPLAALYTPLK--PLPELPVlpY-------DPLRCRTCRAVLNPYCRVDFQAKIWICPFCF 82
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 190343021  390 CVNDVPPFYFQhldhIGrrlDHYEKPELslgsYEYVATLDY---CRKSKPPNPPAFIFMIDVSYSNIKNGLVKlicEELK 466
Cdd:PLN00162   83 QRNHFPPHYSS----IS---ETNLPAEL----FPQYTTVEYtlpPGSGGAPSPPVFVFVVDTCMIEEELGALK---SALL 148
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 190343021  467 TMLEKIPkeeqeETSaiRVGFITY----------------------------NKVLHFFNVKSNLAQPQMMVVTDVGEVF 518
Cdd:PLN00162  149 QAIALLP-----ENA--LVGLITFgthvhvhelgfsecsksyvfrgnkevskDQILEQLGLGGKKRRPAGGGIAGARDGL 221
                         250       260       270       280       290       300       310       320
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 190343021  519 VPL-LDGFLVNYQESQSVIHNLLDQI-PDMFADSNENE----TVFAPVIQAGMEALKAADCPGKLFIFHSSlPTAEAPGK 592
Cdd:PLN00162  222 SSSgVNRFLLPASECEFTLNSALEELqKDPWPVPPGHRparcTGAALSVAAGLLGACVPGTGARIMAFVGG-PCTEGPGA 300
                         330       340       350       360       370       380
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....
gi 190343021  593 LKNRDDKKLVNTDKEKI-----LFQPQTNVYDSLAKDCVAHGCSVTLFLFPSQYVDVASLGLVPQLTGG 656
Cdd:PLN00162  301 IVSKDLSEPIRSHKDLDkdaapYYKKAVKFYEGLAKQLVAQGHVLDVFACSLDQVGVAEMKVAVERTGG 369
PHA03378 PHA03378
EBNA-3B; Provisional
96-269 2.47e-10

EBNA-3B; Provisional


Pssm-ID: 223065 [Multi-domain]  Cd Length: 991  Bit Score: 64.70  E-value: 2.47e-10
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 190343021   96 SHAPyQPSAQSSYPGPISTSSVTQLGSQLSAMQINSYGSGMAPPsQGPPGPLSATSLQTPPRPPQPSILQPGSQVLPP-P 174
Cdd:PHA03378  613 SHIP-ETSAPRQWPMPLRPIPMRPLRMQPITFNVLVFPTPHQPP-QVEITPYKPTWTQIGHIPYQPSPTGANTMLPIQwA 690
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 190343021  175 PTTLNGPGASPLPL--------PMYRPDGLSGPPPPNAQYQPPPLPGQTLGAGYPPQQANSG----PQMAGAQLSYPGGF 242
Cdd:PHA03378  691 PGTMQPPPRAPTPMrppaappgRAQRPAAATGRARPPAAAPGRARPPAAAPGRARPPAAAPGrarpPAAAPGRARPPAAA 770
                         170       180       190
                  ....*....|....*....|....*....|
gi 190343021  243 PGGPAQM---AGPPQPQKKldPDSIPSPIQ 269
Cdd:PHA03378  771 PGAPTPQpppQAPPAPQQR--PRGAPTPQP 798
PHA03247 PHA03247
large tegument protein UL36; Provisional
8-293 6.09e-10

large tegument protein UL36; Provisional


Pssm-ID: 223021 [Multi-domain]  Cd Length: 3151  Bit Score: 63.80  E-value: 6.09e-10
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 190343021    8 ATPPYSQPQPGIGLSPPHYGhygDPSHTASPTGMMKPAGPlGATATRGMLPPGPPPPGPhqfgqnGAHATGHPPQRFPGP 87
Cdd:PHA03247 2718 ATPLPPGPAAARQASPALPA---APAPPAVPAGPATPGGP-ARPARPPTTAGPPAPAPP------AAPAAGPPRRLTRPA 2787
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 190343021   88 PPVNNVASSHAPyQPSAQSSYPGPISTSSVTQLGSQLSAmqinsygSGMAPPSQGPPgplsaTSLQTPPRPPQPSiLQPG 167
Cdd:PHA03247 2788 VASLSESRESLP-SPWDPADPPAAVLAPAAALPPAASPA-------GPLPPPTSAQP-----TAPPPPPGPPPPS-LPLG 2853
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 190343021  168 SQVLPPPPTTLNGPGASPLPLP----------MYRPdGLSGPPPPNAQYQPPPLPGQTlgagyPPQQANSGPQMAGAQLS 237
Cdd:PHA03247 2854 GSVAPGGDVRRRPPSRSPAAKPaaparppvrrLARP-AVSRSTESFALPPDQPERPPQ-----PQAPPPPQPQPQPPPPP 2927
                         250       260       270       280       290
                  ....*....|....*....|....*....|....*....|....*....|....*.
gi 190343021  238 YPGGFPGGPAQMAGPPQPQKKLDPDSIPSPIQVIENDRASRGGQVYAtnTRGQIPP 293
Cdd:PHA03247 2928 QPQPPPPPPPRPQPPLAPTTDPAGAGEPSGAVPQPWLGALVPGRVAV--PRFRVPQ 2981
Gelsolin pfam00626
Gelsolin repeat;
899-974 1.06e-09

Gelsolin repeat;


Pssm-ID: 395501 [Multi-domain]  Cd Length: 76  Bit Score: 55.78  E-value: 1.06e-09
                           10        20        30        40        50        60        70
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*...
gi 190343021   899 STMLPAAVRCSESRLSEEGIFLLANGLHMFLWLGVSSppELIQGIFNVPSFAHINTDM-TLLPEVGN-PYSQQLRMIM 974
Cdd:pfam00626    1 KFVLPPPVPLSQESLNSGDCYLLDNGFTIFLWVGKGS--SLLEKLFAALLAAQLDDDErFPLPEVIRvPQGKEPARFL 76
Atrophin-1 pfam03154
Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian ...
96-255 1.50e-09

Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian atrophy (DRPLA) gene. DRPLA (OMIM:125370) is a progressive neurodegenerative disorder. It is caused by the expansion of a CAG repeat in the DRPLA gene on chromosome 12p. This results in an extended polyglutamine region in atrophin-1 that is thought to confer toxicity to the protein, possibly through altering its interactions with other proteins. The expansion of a CAG repeat is also the underlying defect in six other neurodegenerative disorders, including Huntington's disease. One interaction of expanded polyglutamine repeats that is thought to be pathogenic is that with the short glutamine repeat in the transcriptional coactivator CREB binding protein, CBP. This interaction draws CBP away from its usual nuclear location to the expanded polyglutamine repeat protein aggregates that are characteristic of the polyglutamine neurodegenerative disorders. This interferes with CBP-mediated transcription and causes cytotoxicity.


Pssm-ID: 460830 [Multi-domain]  Cd Length: 991  Bit Score: 62.09  E-value: 1.50e-09
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 190343021    96 SHAPYQPSAQSSYPGPISTSSVTQLGSQLSAMQINSygsgmAPPSQGPPGPLSATSLQTP-PRPPQPSILQPGSQVLPPP 174
Cdd:pfam03154  143 STSPSIPSPQDNESDSDSSAQQQILQTQPPVLQAQS-----GAASPPSPPPPGTTQAATAgPTPSAPSVPPQGSPATSQP 217
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 190343021   175 PTTLNGPgASPLPL----PMYRPDGLSGPPPPNAQYQPPPLPGQTlgagypPQQANSGPQMAGAQLSYPGGFPGGPAQM- 249
Cdd:pfam03154  218 PNQTQST-AAPHTLiqqtPTLHPQRLPSPHPPLQPMTQPPPPSQV------SPQPLPQPSLHGQMPPMPHSLQTGPSHMq 290

                   ....*..
gi 190343021   250 -AGPPQP 255
Cdd:pfam03154  291 hPVPPQP 297
Atrophin-1 pfam03154
Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian ...
93-273 1.83e-08

Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian atrophy (DRPLA) gene. DRPLA (OMIM:125370) is a progressive neurodegenerative disorder. It is caused by the expansion of a CAG repeat in the DRPLA gene on chromosome 12p. This results in an extended polyglutamine region in atrophin-1 that is thought to confer toxicity to the protein, possibly through altering its interactions with other proteins. The expansion of a CAG repeat is also the underlying defect in six other neurodegenerative disorders, including Huntington's disease. One interaction of expanded polyglutamine repeats that is thought to be pathogenic is that with the short glutamine repeat in the transcriptional coactivator CREB binding protein, CBP. This interaction draws CBP away from its usual nuclear location to the expanded polyglutamine repeat protein aggregates that are characteristic of the polyglutamine neurodegenerative disorders. This interferes with CBP-mediated transcription and causes cytotoxicity.


Pssm-ID: 460830 [Multi-domain]  Cd Length: 991  Bit Score: 58.63  E-value: 1.83e-08
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 190343021    93 VASSHAPYQPSAQSSYPGPISTSSVTQLGSQLSAMQINSYGSGMAPPS---QGP-------PGPLSATSLQTPPRPPQPS 162
Cdd:pfam03154  182 SPPSPPPPGTTQAATAGPTPSAPSVPPQGSPATSQPPNQTQSTAAPHTliqQTPtlhpqrlPSPHPPLQPMTQPPPPSQV 261
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 190343021   163 ILQPGSQV-----LPPPPTTLN-GPGASPLPLPmyrPDGLsGPPPPNAQYQPPPLPGQTLGAGYPPQQANSGPQMAGAQL 236
Cdd:pfam03154  262 SPQPLPQPslhgqMPPMPHSLQtGPSHMQHPVP---PQPF-PLTPQSSQSQVPPGPSPAAPGQSQQRIHTPPSQSQLQSQ 337
                          170       180       190       200       210
                   ....*....|....*....|....*....|....*....|....*....|..
gi 190343021   237 SYPGGFPGGPAQMAGP-------------PQPQKKLDPD--SIPSPIQVIEN 273
Cdd:pfam03154  338 QPPREQPLPPAPLSMPhikpppttpipqlPNPQSHKHPPhlSGPSPFQMNSN 389
PHA03247 PHA03247
large tegument protein UL36; Provisional
7-288 5.42e-08

large tegument protein UL36; Provisional


Pssm-ID: 223021 [Multi-domain]  Cd Length: 3151  Bit Score: 57.26  E-value: 5.42e-08
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 190343021    7 VATPPYSQPQPGIGLSPPHYGHYGDPSHTASPTGMMKPAGPLGATATRgmlppgppppgphqfgqngahatghppqrfpg 86
Cdd:PHA03247 2776 AAGPPRRLTRPAVASLSESRESLPSPWDPADPPAAVLAPAAALPPAAS-------------------------------- 2823
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 190343021   87 pppvnnvASSHAPYQPSAQSSYPGPISTSSVTQLgsqlsamqinSYGSGMAP----PSQGPPGPlSATSLQTPPRPPQPS 162
Cdd:PHA03247 2824 -------PAGPLPPPTSAQPTAPPPPPGPPPPSL----------PLGGSVAPggdvRRRPPSRS-PAAKPAAPARPPVRR 2885
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 190343021  163 ILQPgsqvlPPPPTTlngpgaSPLPLPmyrPDGLSGPPPPNAQYQP---PPLPGQTLGAGYPPQQANSGPQMAGAQLSYP 239
Cdd:PHA03247 2886 LARP-----AVSRST------ESFALP---PDQPERPPQPQAPPPPqpqPQPPPPPQPQPPPPPPPRPQPPLAPTTDPAG 2951
                         250       260       270       280       290
                  ....*....|....*....|....*....|....*....|....*....|
gi 190343021  240 GGFPGGpaqmaGPPQPQ-KKLDPDSIPSPIQVIENDRASRGGQVYATNTR 288
Cdd:PHA03247 2952 AGEPSG-----AVPQPWlGALVPGRVAVPRFRVPQPAPSREAPASSTPPL 2996
Atrophin-1 pfam03154
Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian ...
99-289 6.90e-08

Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian atrophy (DRPLA) gene. DRPLA (OMIM:125370) is a progressive neurodegenerative disorder. It is caused by the expansion of a CAG repeat in the DRPLA gene on chromosome 12p. This results in an extended polyglutamine region in atrophin-1 that is thought to confer toxicity to the protein, possibly through altering its interactions with other proteins. The expansion of a CAG repeat is also the underlying defect in six other neurodegenerative disorders, including Huntington's disease. One interaction of expanded polyglutamine repeats that is thought to be pathogenic is that with the short glutamine repeat in the transcriptional coactivator CREB binding protein, CBP. This interaction draws CBP away from its usual nuclear location to the expanded polyglutamine repeat protein aggregates that are characteristic of the polyglutamine neurodegenerative disorders. This interferes with CBP-mediated transcription and causes cytotoxicity.


Pssm-ID: 460830 [Multi-domain]  Cd Length: 991  Bit Score: 56.70  E-value: 6.90e-08
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 190343021    99 PYQPSAQS-SYPGPISTSSVTQLGSQLS---AMQINSYGSGMAPPSQGPPgPLS--ATSLQTPPRPPQPSILQPgSQVLP 172
Cdd:pfam03154  364 PQLPNPQShKHPPHLSGPSPFQMNSNLPpppALKPLSSLSTHHPPSAHPP-PLQlmPQSQQLPPPPAQPPVLTQ-SQSLP 441
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 190343021   173 P-----PPTTLNGPGASPLPLPMYrPDGLSGPPPPNAQYQPPPLPGQTLGAGYPPQQAN---SGPQMAGAQLSYPggfpg 244
Cdd:pfam03154  442 PpaashPPTSGLHQVPSQSPFPQH-PFVPGGPPPITPPSGPPTSTSSAMPGIQPPSSASvssSGPVPAAVSCPLP----- 515
                          170       180       190       200       210
                   ....*....|....*....|....*....|....*....|....*....|..
gi 190343021   245 gPAQMAGPP-----QPQKKLDPDSIPSPIQVIEN--DRASRGGQVYATNTRG 289
Cdd:pfam03154  516 -PVQIKEEAldeaeEPESPPPPPRSPSPEPTVVNtpSHASQSARFYKHLDRG 566
PHA03247 PHA03247
large tegument protein UL36; Provisional
10-261 8.34e-08

large tegument protein UL36; Provisional


Pssm-ID: 223021 [Multi-domain]  Cd Length: 3151  Bit Score: 56.87  E-value: 8.34e-08
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 190343021   10 PPYSQPQPGIGLSPPHYGHYGDPSHTASPTGMMKPAGPLGATATRGMLPPGPPPPGPHQFGQNGAHATGHPPQRFPGPPP 89
Cdd:PHA03247 2741 PPAVPAGPATPGGPARPARPPTTAGPPAPAPPAAPAAGPPRRLTRPAVASLSESRESLPSPWDPADPPAAVLAPAAALPP 2820
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 190343021   90 VNNVASSHAP---YQPSAQSSYPGPISTSSVTQlGSQLSAMQINSYGSGMAPPSQ-------------GPPGPLSATSLQ 153
Cdd:PHA03247 2821 AASPAGPLPPptsAQPTAPPPPPGPPPPSLPLG-GSVAPGGDVRRRPPSRSPAAKpaaparppvrrlaRPAVSRSTESFA 2899
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 190343021  154 TPPRPPQPsilqPGSQVLPPPPTTLNGPGASPLPLPMYRPDGLSGPP-PPNAQYQPPPLPGqtlGAGYPPQQANSGP-QM 231
Cdd:PHA03247 2900 LPPDQPER----PPQPQAPPPPQPQPQPPPPPQPQPPPPPPPRPQPPlAPTTDPAGAGEPS---GAVPQPWLGALVPgRV 2972
                         250       260       270
                  ....*....|....*....|....*....|
gi 190343021  232 AGAQLSYPGGFPGGPAQMAGPPQPQKKLDP 261
Cdd:PHA03247 2973 AVPRFRVPQPAPSREAPASSTPPLTGHSLS 3002
dnaA PRK14086
chromosomal replication initiator protein DnaA;
98-261 2.52e-07

chromosomal replication initiator protein DnaA;


Pssm-ID: 237605 [Multi-domain]  Cd Length: 617  Bit Score: 54.83  E-value: 2.52e-07
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 190343021   98 APYQPSAQSSYPGPISTSSVTQLGSQLSAMQINSYGSGMAPPSQGPPGPLSATSLQTPPRP----PQPSILQPGSQVLPP 173
Cdd:PRK14086   95 PAPPPPHARRTSEPELPRPGRRPYEGYGGPRADDRPPGLPRQDQLPTARPAYPAYQQRPEPgawpRAADDYGWQQQRLGF 174
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 190343021  174 PPTTLNGPGASPLPLPMY--------RPDGLSGPPP---PNAQYQPP-----PLPGQTLGAGYPPQQANSGPQMAGAQL- 236
Cdd:PRK14086  175 PPRAPYASPASYAPEQERdrepydagRPEYDQRRRDydhPRPDWDRPrrdrtDRPEPPPGAGHVHRGGPGPPERDDAPVv 254
                         170       180
                  ....*....|....*....|....*....
gi 190343021  237 ----SYPGGFPGGPAQMAGPPQPQKKLDP 261
Cdd:PRK14086  255 pirpSAPGPLAAQPAPAPGPGEPTARLNP 283
Pro-rich pfam15240
Proline-rich protein; This family includes several eukaryotic proline-rich proteins.
103-248 3.84e-07

Proline-rich protein; This family includes several eukaryotic proline-rich proteins.


Pssm-ID: 464580 [Multi-domain]  Cd Length: 167  Bit Score: 51.19  E-value: 3.84e-07
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 190343021   103 SAQSSYPGPISTSSVTQLGSQLSAMQINSYGSGMAPPSQGPPGPLSATSLQTPPRP--PQPSILQPGS---QVLPPPPTT 177
Cdd:pfam15240   15 SAQSSSEDVSQEDSPSLISEEEGQSQQGGQGPQGPPPGGFPPQPPASDDPPGPPPPggPQQPPPQGGKqkpQGPPPQGGP 94
                           90       100       110       120       130       140       150
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|...
gi 190343021   178 LNGPGASPLPLPMYRPDGLSGPPPPNAQYQPPPLPGQTLGAGYP--PQQANSGPQMAGAQLSYPGGFPGGPAQ 248
Cdd:pfam15240   95 RPPPGKPQGPPPQGGNQQQGPPPPGKPQGPPPQGGGPPPQGGNQqgPPPPPPGNPQGPPQRPPQPGNPQGPPQ 167
PAT1 pfam09770
Topoisomerase II-associated protein PAT1; Members of this family are necessary for accurate ...
94-258 9.55e-07

Topoisomerase II-associated protein PAT1; Members of this family are necessary for accurate chromosome transmission during cell division.


Pssm-ID: 401645 [Multi-domain]  Cd Length: 846  Bit Score: 53.12  E-value: 9.55e-07
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 190343021    94 ASSHAPYQPSAQSSYPGPISTSSVTQLGSQLSAMQINSYGSGMAPPSQGPPGPLSATSLQTPPRPPQPSILQPGSQVLPP 173
Cdd:pfam09770  205 AQAKKPAQQPAPAPAQPPAAPPAQQAQQQQQFPPQIQQQQQPQQQPQQPQQHPGQGHPVTILQRPQSPQPDPAQPSIQPQ 284
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 190343021   174 PPTTLNGPGASPlPLPMY------RPDGLSGPPPPNAQYQPPPLPGQtlgagYPPQQANSGPQMAGAQLsypggfpgGPA 247
Cdd:pfam09770  285 AQQFHQQPPPVP-VQPTQilqnpnRLSAARVGYPQNPQPGVQPAPAH-----QAHRQQGSFGRQAPIIT--------HPQ 350
                          170
                   ....*....|.
gi 190343021   248 QMAGPPQPQKK 258
Cdd:pfam09770  351 QLAQLSEEEKA 361
PHA03247 PHA03247
large tegument protein UL36; Provisional
138-349 1.21e-06

large tegument protein UL36; Provisional


Pssm-ID: 223021 [Multi-domain]  Cd Length: 3151  Bit Score: 53.02  E-value: 1.21e-06
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 190343021  138 PPSQGP---PGPLSATS-LQTPPRPPQPSILQPGSQVLPPPPTTLNGPGASPLPLPMYRPDGLSGPP---PPNAQYQPPP 210
Cdd:PHA03247 2701 PPPPPPtpePAPHALVSaTPLPPGPAAARQASPALPAAPAPPAVPAGPATPGGPARPARPPTTAGPPapaPPAAPAAGPP 2780
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 190343021  211 -----LPGQTLGAGYPPQQANSGPQMAGAQLSYPGG---FPGGPAQMAGPPQPQKKLDPDSIPSPIQVIENDRAS--RGG 280
Cdd:PHA03247 2781 rrltrPAVASLSESRESLPSPWDPADPPAAVLAPAAalpPAASPAGPLPPPTSAQPTAPPPPPGPPPPSLPLGGSvaPGG 2860
                         170       180       190       200       210       220       230
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 190343021  281 QVYATNTRGQIPPLVTTDCMIQDQGNASPRFIRcTTYCFPCTSD-MAKQAQIPLAAVIKPFATIPSNESP 349
Cdd:PHA03247 2861 DVRRRPPSRSPAAKPAAPARPPVRRLARPAVSR-STESFALPPDqPERPPQPQAPPPPQPQPQPPPPPQP 2929
SOBP pfam15279
Sine oculis-binding protein; SOBP is associated with syndromic and nonsyndromic intellectual ...
90-253 1.02e-05

Sine oculis-binding protein; SOBP is associated with syndromic and nonsyndromic intellectual disability. It carries a zinc-finger of the zf-C2H2 type at the N-terminus, and a highly characteristic C-terminal PhPhPhPhPhPh motif. The deduced 873-amino acid protein contains an N-terminal nuclear localization signal (NLS), followed by 2 FCS-type zinc finger motifs, a proline-rich region (PR1), a putative RNA-binding motif region, and a C-terminal NLS embedded in a second proline-rich motif. SOBP is expressed in various human tissues, including developing mouse brain at embryonic day 14. In postnatal and adult mouse brain SOBP is expressed in all neurons, with intense staining in the limbic system. Highest expression is in layer V cortical neurons, hippocampus, pyriform cortex, dorsomedial nucleus of thalamus, amygdala, and hypothalamus. Postnatal expression of SOBP in the limbic system corresponds to a time of active synaptogenesis. The family is also referred to as Jackson circler, JXC1. In seven affected siblings from a consanguineous Israeli Arab family with mental retardation, anterior maxillary protrusion, and strabismus, mutations were found in this protein.


Pssm-ID: 464609 [Multi-domain]  Cd Length: 325  Bit Score: 48.66  E-value: 1.02e-05
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 190343021    90 VNNVASSHAPYQPSAQSSYPGPISTSSVT----------QLGSQLSAMQINSYGSGMAPPSQGPPGPLSAT----SLQTP 155
Cdd:pfam15279  119 VASSSKLLAPKPHEPPSLPPPPLPPKKGRrhrpglhpplGRPPGSPPMSMTPRGLLGKPQQHPPPSPLPAFmepsSMPPP 198
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 190343021   156 PRPPQPSILQPGSQVLPP------PPTTLNGPGASPlPLPMYRPD-GLSGPPPPNAQYQPPPLPGQTLGAGYPPQQANSG 228
Cdd:pfam15279  199 FLRPPPSIPQPNSPLSNPmlpgigPPPKPPRNLGPP-SNPMHRPPfSPHHPPPPPTPPGPPPGLPPPPPRGFTPPFGPPF 277
                          170       180
                   ....*....|....*....|....*
gi 190343021   229 PQMAGAQLSYPGGFPGGPAQMAGPP 253
Cdd:pfam15279  278 PPVNMMPNPPEMNFGLPSLAPLVPP 302
Pro-rich pfam15240
Proline-rich protein; This family includes several eukaryotic proline-rich proteins.
115-269 1.02e-05

Proline-rich protein; This family includes several eukaryotic proline-rich proteins.


Pssm-ID: 464580 [Multi-domain]  Cd Length: 167  Bit Score: 46.95  E-value: 1.02e-05
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 190343021   115 SSVTQLGSQLSAMQINSYGSGMAPPSQGPpgplsATSLQTPPRPPQPSilQPGSQVLPPPPTTLNGPGASPLPLPMYRPD 194
Cdd:pfam15240   14 SSAQSSSEDVSQEDSPSLISEEEGQSQQG-----GQGPQGPPPGGFPP--QPPASDDPPGPPPPGGPQQPPPQGGKQKPQ 86
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 190343021   195 GLSGP----PPPNAQYQPPPLPGQTLGAGYPPQQANSGPQMAGAQLSYPGGFPGGPAQMAGPPQ--PQKKLDPDSIPSPI 268
Cdd:pfam15240   87 GPPPQggprPPPGKPQGPPPQGGNQQQGPPPPGKPQGPPPQGGGPPPQGGNQQGPPPPPPGNPQgpPQRPPQPGNPQGPP 166

                   .
gi 190343021   269 Q 269
Cdd:pfam15240  167 Q 167
MISS pfam15822
MAPK-interacting and spindle-stabilising protein-like; MISS is a family of eukaryotic ...
139-255 1.74e-05

MAPK-interacting and spindle-stabilising protein-like; MISS is a family of eukaryotic MAPK-interacting and spindle-stabilising protein-like proteins. MISS is rich in prolines and has four potential MAPK-phosphorylation sites, a MAPK-docking site, a PEST sequence (PEST motif) and a bipartite nuclear localization signal. The endogenous protein accumulates during mouse meiotic maturation and is found as discrete dots on the MII spindle. MISS is the first example of a physiological MAPK substrate that is stabilised in MII and that specifically regulates MII spindle integrity during the CSF arrest.


Pssm-ID: 318115 [Multi-domain]  Cd Length: 238  Bit Score: 47.29  E-value: 1.74e-05
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 190343021   139 PSQGPPGPLSATSLQTPPRPPQ--PSILQPGSQVLPPPPT----TLNGPGASPLPLPMYRPDGLSGPPPpNAQYQPPPLP 212
Cdd:pfam15822   26 PPQGWPGSNPWNNPSAPPAVPSglPPSTAPSTVPFGPAPTgmypSIPLTGPSPGPPAPFPPSGPSCPPP-GGPYPAPTVP 104
                           90       100       110       120
                   ....*....|....*....|....*....|....*....|...
gi 190343021   213 GQTLGAGYPPqqansgPQMAGAQLSYPGGFPGGPAQmAGPPQP 255
Cdd:pfam15822  105 GPGPIGPYPT------PNMPFPELPRPYGAPTDPAA-AAPSGP 140
PRK07764 PRK07764
DNA polymerase III subunits gamma and tau; Validated
30-284 1.94e-05

DNA polymerase III subunits gamma and tau; Validated


Pssm-ID: 236090 [Multi-domain]  Cd Length: 824  Bit Score: 48.83  E-value: 1.94e-05
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 190343021   30 GDPSHTASPTGMMKPAGPLGATATRGMLPPGPPPPGPHQFGQNGAHAtghppqrfpgpppvnnvasshapyqPSAQSSYP 109
Cdd:PRK07764  589 GPAPGAAGGEGPPAPASSGPPEEAARPAAPAAPAAPAAPAPAGAAAA-------------------------PAEASAAP 643
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 190343021  110 GPISTSSVTQLGSqlSAMQINSYGSGMAPPSQGPPGPLSATSLQTPPRPPQPSILQPGSQVLPPPPTTLNGPGASPLPLP 189
Cdd:PRK07764  644 APGVAAPEHHPKH--VAVPDASDGGDGWPAKAGGAAPAAPPPAPAPAAPAAPAGAAPAQPAPAPAATPPAGQADDPAAQP 721
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 190343021  190 MYRPDGLSGPPPPNAQYQPPPlpgqtLGAGYPPQQANSGPQMAGAQLSYPGGFPGGPAQMAGPPQPQKKLDPDsIPSPiq 269
Cdd:PRK07764  722 PQAAQGASAPSPAADDPVPLP-----PEPDDPPDPAGAPAQPPPPPAPAPAAAPAAAPPPSPPSEEEEMAEDD-APSM-- 793
                         250
                  ....*....|....*
gi 190343021  270 vieNDRASRGGQVYA 284
Cdd:PRK07764  794 ---DDEDRRDAEEVA 805
PHA03247 PHA03247
large tegument protein UL36; Provisional
86-267 2.24e-05

large tegument protein UL36; Provisional


Pssm-ID: 223021 [Multi-domain]  Cd Length: 3151  Bit Score: 48.78  E-value: 2.24e-05
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 190343021   86 GPPPVNNVASSHAPYQPSA--------QSSYPGPISTSSVTQLGSQLSAMQINSYGSGMAPPSQGPP----GPLSATSLQ 153
Cdd:PHA03247 2598 PRAPVDDRGDPRGPAPPSPlppdthapDPPPPSPSPAANEPDPHPPPTVPPPERPRDDPAPGRVSRPrrarRLGRAAQAS 2677
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 190343021  154 TPPRPPQPSILQPGSQVL-------PPPPTTLNGPGA--SPLPLPMYRPDGLSGPPPPNAQYQPPPLPGQTLGAGYPPQQ 224
Cdd:PHA03247 2678 SPPQRPRRRAARPTVGSLtsladppPPPPTPEPAPHAlvSATPLPPGPAAARQASPALPAAPAPPAVPAGPATPGGPARP 2757
                         170       180       190       200
                  ....*....|....*....|....*....|....*....|....*
gi 190343021  225 AnSGPQMAGAQLSYPGGFPGGPAQMAGPPQPQKKLDP--DSIPSP 267
Cdd:PHA03247 2758 A-RPPTTAGPPAPAPPAAPAAGPPRRLTRPAVASLSEsrESLPSP 2801
PRK13729 PRK13729
conjugal transfer pilus assembly protein TraB; Provisional
117-214 2.35e-05

conjugal transfer pilus assembly protein TraB; Provisional


Pssm-ID: 184281 [Multi-domain]  Cd Length: 475  Bit Score: 48.28  E-value: 2.35e-05
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 190343021  117 VTQLGSQLSAMQINSYGSGMAPP-SQGPPGPLSATSLQTPPRPPQPSilqPGSQVLPPppttlNGPGASPLPLPMYrpDG 195
Cdd:PRK13729  106 IEKLGQDNAALAEQVKALGANPVtATGEPVPQMPASPPGPEGEPQPG---NTPVSFPP-----QGSVAVPPPTAFY--PG 175
                          90
                  ....*....|....*....
gi 190343021  196 LSGPPPPNAQYQPPPLPGQ 214
Cdd:PRK13729  176 NGVTPPPQVTYQSVPVPNR 194
PHA03307 PHA03307
transcriptional regulator ICP4; Provisional
137-280 2.56e-05

transcriptional regulator ICP4; Provisional


Pssm-ID: 223039 [Multi-domain]  Cd Length: 1352  Bit Score: 48.63  E-value: 2.56e-05
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 190343021  137 APPSQGPPGPLSATSLQTPPRPPQPsilqpgsqVLPPPPTtlnGPGASPLPLPMYRPDGLSGPPPPNAQYQPPPLPGQTL 216
Cdd:PHA03307   99 SPAREGSPTPPGPSSPDPPPPTPPP--------ASPPPSP---APDLSEMLRPVGSPGPPPAASPPAAGASPAAVASDAA 167
                          90       100       110       120       130       140
                  ....*....|....*....|....*....|....*....|....*....|....*....|....
gi 190343021  217 GAGYPPQQANSGPQMAGAQLSYPGGFPGGPAQMAGPPQPQKKLDPDSIPSPIQVIENDRASRGG 280
Cdd:PHA03307  168 SSRQAALPLSSPEETARAPSSPPAEPPPSTPPAAASPRPPRRSSPISASASSPAPAPGRSAADD 231
FAP pfam07174
Fibronectin-attachment protein (FAP); This family contains bacterial fibronectin-attachment ...
137-230 2.99e-05

Fibronectin-attachment protein (FAP); This family contains bacterial fibronectin-attachment proteins (FAP). Family members are rich in alanine and proline, are approximately 300 residues long, and seem to be restricted to mycobacteria. These proteins contain a fibronectin-binding motif that allows mycobacteria to bind to fibronectin in the extracellular matrix.


Pssm-ID: 429334  Cd Length: 301  Bit Score: 47.23  E-value: 2.99e-05
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 190343021   137 APPSQGPPGPLSATSLQTPPRPPQPsilqPGSQVLPPPPTTLNGPGASPlplpmyrpdglsGPPPPNAQYQPPPLPgqtl 216
Cdd:pfam07174   39 ADPEPAPPPPSTATAPPAPPPPPPA----PAAPAPPPPPAAPNAPNAPP------------PPADPNAPPPPPADP---- 98
                           90
                   ....*....|....
gi 190343021   217 GAGYPPQQANSGPQ 230
Cdd:pfam07174   99 NAPPPPAVDPNAPE 112
PRK12323 PRK12323
DNA polymerase III subunit gamma/tau;
134-293 3.27e-05

DNA polymerase III subunit gamma/tau;


Pssm-ID: 237057 [Multi-domain]  Cd Length: 700  Bit Score: 47.95  E-value: 3.27e-05
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 190343021  134 SGMAPPSQGPPGPLSATSLQTPPRPPQPSILQPGSQVLPPPPTTLNGPGA---SPLPLPMYRPDGLSGPPPPNAqyqPPP 210
Cdd:PRK12323  376 TAAAAPVAQPAPAAAAPAAAAPAPAAPPAAPAAAPAAAAAARAVAAAPARrspAPEALAAARQASARGPGGAPA---PAP 452
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 190343021  211 LPGQTLGAGYPPQQANSGPQMAGAQLSYPGGFPGGPAQMAGPPQPQKKLDPDSIPSPiQVIENDRASRGGQVYATNTRGQ 290
Cdd:PRK12323  453 APAAAPAAAARPAAAGPRPVAAAAAAAPARAAPAAAPAPADDDPPPWEELPPEFASP-APAQPDAAPAGWVAESIPDPAT 531

                  ...
gi 190343021  291 IPP 293
Cdd:PRK12323  532 ADP 534
Med15 pfam09606
ARC105 or Med15 subunit of Mediator complex non-fungal; The approx. 70 residue Med15 domain of ...
123-256 3.98e-05

ARC105 or Med15 subunit of Mediator complex non-fungal; The approx. 70 residue Med15 domain of the ARC-Mediator co-activator is a three-helix bundle with marked similarity to the KIX domain. The sterol regulatory element binding protein (SREBP) family of transcription activators uses the ARC105 subunit to activate target genes in the regulation of cholesterol and fatty acid homeostasis. In addition, Med15 is a critical transducer of gene activation signals that control early metazoan development.


Pssm-ID: 312941 [Multi-domain]  Cd Length: 732  Bit Score: 47.70  E-value: 3.98e-05
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 190343021   123 QLSAMQINSYGSGMAPPSQGPPGPLSAtsLQTPP----RPPQPSILQPGSQvlPPPPTTLNGPG-ASPLPLPMYRPD--- 194
Cdd:pfam09606   59 QQQQPQGGQGNGGMGGGQQGMPDPINA--LQNLAgqgtRPQMMGPMGPGPG--GPMGQQMGGPGtASNLLASLGRPQmpm 134
                           90       100       110       120       130       140
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....
gi 190343021   195 -GLSGPPPPNA--QYQP----PPLPGQTLGAGYPPQQANSGPQMAGAQLSYPGGFPGGPAQMAGPPQPQ 256
Cdd:pfam09606  135 gGAGFPSQMSRvgRMQPggqaGGMMQPSSGQPGSGTPNQMGPNGGPGQGQAGGMNGGQQGPMGGQMPPQ 203
dnaA PRK14086
chromosomal replication initiator protein DnaA;
139-280 4.35e-05

chromosomal replication initiator protein DnaA;


Pssm-ID: 237605 [Multi-domain]  Cd Length: 617  Bit Score: 47.51  E-value: 4.35e-05
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 190343021  139 PSQGPPGPLSatslQTPPRPPQPSILQPGSQvlppPPTTLNGPGASPLPLPMYRPDGLSGPPPPNAQYQPPPLPGQtlga 218
Cdd:PRK14086   90 PSAGEPAPPP----PHARRTSEPELPRPGRR----PYEGYGGPRADDRPPGLPRQDQLPTARPAYPAYQQRPEPGA---- 157
                          90       100       110       120       130       140
                  ....*....|....*....|....*....|....*....|....*....|....*....|..
gi 190343021  219 gYPPQQANSGPQMAgaqlsyPGGFPgGPAQMAGPPQPQKKLDPDSIPSPIQVIENDRASRGG 280
Cdd:PRK14086  158 -WPRAADDYGWQQQ------RLGFP-PRAPYASPASYAPEQERDREPYDAGRPEYDQRRRDY 211
Med15 pfam09606
ARC105 or Med15 subunit of Mediator complex non-fungal; The approx. 70 residue Med15 domain of ...
148-295 6.09e-05

ARC105 or Med15 subunit of Mediator complex non-fungal; The approx. 70 residue Med15 domain of the ARC-Mediator co-activator is a three-helix bundle with marked similarity to the KIX domain. The sterol regulatory element binding protein (SREBP) family of transcription activators uses the ARC105 subunit to activate target genes in the regulation of cholesterol and fatty acid homeostasis. In addition, Med15 is a critical transducer of gene activation signals that control early metazoan development.


Pssm-ID: 312941 [Multi-domain]  Cd Length: 732  Bit Score: 46.93  E-value: 6.09e-05
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 190343021   148 SATSLQTPPRPPQPSILQPGSQVLPPPPTTL-NGPGASPLPlPMYRPDGLSGPPPPNAQYQPPPLPGQTLGAGYPPQQAN 226
Cdd:pfam09606   56 KAAQQQQPQGGQGNGGMGGGQQGMPDPINALqNLAGQGTRP-QMMGPMGPGPGGPMGQQMGGPGTASNLLASLGRPQMPM 134
                           90       100       110       120       130       140
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....
gi 190343021   227 SGPQMAGAQLSYPGGFPGGpaQMAGPPQPQKKLDPDSIPspiQVIENDRASRGGQVYATNTRGQIPPLV 295
Cdd:pfam09606  135 GGAGFPSQMSRVGRMQPGG--QAGGMMQPSSGQPGSGTP---NQMGPNGGPGQGQAGGMNGGQQGPMGG 198
BimA_second NF040983
trimeric autotransporter actin-nucleating factor BimA; This HMM describes BimA (Burkholderia ...
143-246 6.58e-05

trimeric autotransporter actin-nucleating factor BimA; This HMM describes BimA (Burkholderia intracellular motility A), WP_004266405.1-like proteins in Burkholderia mallei or B. pseudomallei. The term BimA has also been used for WP_011205626.1-like homologs that have a very different N-terminal half.


Pssm-ID: 468913 [Multi-domain]  Cd Length: 382  Bit Score: 46.43  E-value: 6.58e-05
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 190343021  143 PPGPLSATSLQTPPRPPQPSilqpgsqvlPPPPTTLNGPGASPLPLPmyrPDGLSGPPPPNAQYQPPPLPGQTLGAGYPP 222
Cdd:NF040983   79 PVGDRTLPNKVPPPPPPPPP---------PPPPPPTPPPPPPPPPPP---PPPSPPPPPPPSPPPSPPPPTTTPPTRTTP 146
                          90       100
                  ....*....|....*....|....
gi 190343021  223 QQANSGPQMAGAQlsyPGGFPGGP 246
Cdd:NF040983  147 STTTPTPSMHPIQ---PTQLPSIP 167
DUF3729 pfam12526
Protein of unknown function (DUF3729); This family of proteins is found in viruses. Proteins ...
136-221 7.11e-05

Protein of unknown function (DUF3729); This family of proteins is found in viruses. Proteins in this family are typically between 145 and 1707 amino acids in length. The family is found in association with pfam01443, pfam01661, pfam05417, pfam01660, pfam00978. There is a single completely conserved residue L that may be functionally important.


Pssm-ID: 372164 [Multi-domain]  Cd Length: 115  Bit Score: 43.14  E-value: 7.11e-05
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 190343021   136 MAPPSQGPPGPLSATSLQTPPRPPQPSilqPGSQVLPPPPTTLNGPGASPLPLPMYRPDGLSGPPppnaqyQPPPLPGQT 215
Cdd:pfam12526   39 PPPPVGDPRPPVVDTPPPVSAVWVLPP---PSEPAAPEPDLVPPVTGPAGPPSPLAPPAPAQKPP------LPPPRPQRR 109

                   ....*.
gi 190343021   216 LGAGYP 221
Cdd:pfam12526  110 LLHTYP 115
PRK13729 PRK13729
conjugal transfer pilus assembly protein TraB; Provisional
181-256 1.10e-04

conjugal transfer pilus assembly protein TraB; Provisional


Pssm-ID: 184281 [Multi-domain]  Cd Length: 475  Bit Score: 45.97  E-value: 1.10e-04
                          10        20        30        40        50        60        70
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*..
gi 190343021  181 PGASP-LPLPMYRPDGLSGPPPPNAQYQPPPLPgqtlgAGYPPQQANSGPQMAGAqlsYPGGFPGGPAQMAGPPQPQ 256
Cdd:PRK13729  123 LGANPvTATGEPVPQMPASPPGPEGEPQPGNTP-----VSFPPQGSVAVPPPTAF---YPGNGVTPPPQVTYQSVPV 191
PRK12323 PRK12323
DNA polymerase III subunit gamma/tau;
94-267 1.21e-04

DNA polymerase III subunit gamma/tau;


Pssm-ID: 237057 [Multi-domain]  Cd Length: 700  Bit Score: 46.02  E-value: 1.21e-04
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 190343021   94 ASSHAPYQPSAQSSYPGPISTSSVTQLGSQLSAMQINSYGSGMAPPSQGPPGPLSATSLQTPPRPPQPSILQPGSQVLPP 173
Cdd:PRK12323  400 AAPPAAPAAAPAAAAAARAVAAAPARRSPAPEALAAARQASARGPGGAPAPAPAPAAAPAAAARPAAAGPRPVAAAAAAA 479
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 190343021  174 PPttLNGPGASPLPLPMYRPDGLSGPPPPnaqyqPPPLPGQTlGAGYPPQQANSGPQMAGAQLSYPggFPGGPAQMAGPP 253
Cdd:PRK12323  480 PA--RAAPAAAPAPADDDPPPWEELPPEF-----ASPAPAQP-DAAPAGWVAESIPDPATADPDDA--FETLAPAPAAAP 549
                         170
                  ....*....|....
gi 190343021  254 QPQKKLDPDSIPSP 267
Cdd:PRK12323  550 APRAAAATEPVVAP 563
PRK07764 PRK07764
DNA polymerase III subunits gamma and tau; Validated
137-241 1.21e-04

DNA polymerase III subunits gamma and tau; Validated


Pssm-ID: 236090 [Multi-domain]  Cd Length: 824  Bit Score: 46.13  E-value: 1.21e-04
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 190343021  137 APPSQGPPGPLSATSLQTPPRPPQPSILQPGSQVLPPPPTtlnGPGASPLPLPMYRPDGLSGPPPPNAQYQPPPLPGQTL 216
Cdd:PRK07764  404 AAPAAAPAPAAAAPAAAAAPAPAAAPQPAPAPAPAPAPPS---PAGNAPAGGAPSPPPAAAPSAQPAPAPAAAPEPTAAP 480
                          90       100
                  ....*....|....*....|....*
gi 190343021  217 GAGYPPQQANSGPQMAGAQLSYPGG 241
Cdd:PRK07764  481 APAPPAAPAPAAAPAAPAAPAAPAG 505
PABP-1234 TIGR01628
polyadenylate binding protein, human types 1, 2, 3, 4 family; These eukaryotic proteins ...
102-231 1.22e-04

polyadenylate binding protein, human types 1, 2, 3, 4 family; These eukaryotic proteins recognize the poly-A of mRNA and consist of four tandem RNA recognition domains at the N-terminus (rrm: pfam00076) followed by a PABP-specific domain (pfam00658) at the C-terminus. The protein is involved in the transport of mRNAs from the nucleus to the cytoplasm. There are four paralogs in Homo sapiens, which are, respectively, expressed in testis, expressed in platelets, broadly expressed, and of unknown tissue range.


Pssm-ID: 130689 [Multi-domain]  Cd Length: 562  Bit Score: 45.95  E-value: 1.22e-04
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 190343021   102 PSAQSSYPGPISTS-SVTQLGSQLSAMQINSY-----GSGMAPPSQGppgplsatslqtPPRPPQPSILQPGSQVLPPPP 175
Cdd:TIGR01628  381 RMRQLPMGSPMGGAmGQPPYYGQGPQQQFNGQplgwpRMSMMPTPMG------------PGGPLRPNGLAPMNAVRAPSR 448
                           90       100       110       120       130
                   ....*....|....*....|....*....|....*....|....*....|....*..
gi 190343021   176 TTLNGPGASPLPLPMYRPDGLSGPPPPNAQyQPPPLPGQTLGAGYPPQQ-ANSGPQM 231
Cdd:TIGR01628  449 NAQNAAQKPPMQPVMYPPNYQSLPLSQDLP-QPQSTASQGGQNKKLAQVlASATPQM 504
PHA03379 PHA03379
EBNA-3A; Provisional
98-332 1.24e-04

EBNA-3A; Provisional


Pssm-ID: 223066 [Multi-domain]  Cd Length: 935  Bit Score: 46.20  E-value: 1.24e-04
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 190343021   98 APYQPSAQSSYPGPISTSSVTQLGSQlsamqinsygSGMAP--PSQGPPGPLSATSL--QTP-----PRP-PQPSILQPG 167
Cdd:PHA03379  434 ATSHGSAQVPEPPPVHDLEPGPLHDQ----------HSMAPcpVAQLPPGPLQDLEPgdQLPgvvqdGRPaCAPVPAPAG 503
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 190343021  168 SQVLPPPPTTLNGPGASPLPlpmYRPDGLSGP--PPPNAQYQPPPLPGQTLGAGYPPQQANSGPQmagAQLSYpggfpGG 245
Cdd:PHA03379  504 PIVRPWEASLSQVPGVAFAP---VMPQPMPVEpvPVPTVALERPVCPAPPLIAMQGPGETSGIVR---VRERW-----RP 572
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 190343021  246 PAQMAGPPQPqkkldpdsiPSPIQVieNDRASRG---GQVYATNTRGQIPPLVTTDCMIQDQGNASPRFIRCTTYCFPCT 322
Cdd:PHA03379  573 APWTPNPPRS---------PSQMSV--RDRLARLraeAQPYQASVEVQPPQLTQVSPQQPMEYPLEPEQQMFPGSPFSQV 641
                         250
                  ....*....|
gi 190343021  323 SDMAKQAQIP 332
Cdd:PHA03379  642 ADVMRAGGVP 651
PBP1 COG5180
PAB1-binding protein, interacts with poly(A)-binding protein [RNA processing and modification]; ...
99-275 1.95e-04

PAB1-binding protein, interacts with poly(A)-binding protein [RNA processing and modification];


Pssm-ID: 444064 [Multi-domain]  Cd Length: 548  Bit Score: 45.44  E-value: 1.95e-04
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 190343021   99 PYQPSAQSSyPGPISTSSvtqlgSQLSAMQINSYGSGMAPPSQGPPGPLSATSLQTPPRPPQ------------------ 160
Cdd:COG5180   278 PGLPVLEAG-SEPQSDAP-----EAETARPIDVKGVASAPPATRPVRPPGGARDPGTPRPGQpterpagvpeaasdagqp 351
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 190343021  161 PSILQPGSQVLP--------PPPTTLNGPGASPLPLPMYRPDGLSGPPPPNAqyqPPPLPGQTLGAGYPPQQANSGPQMA 232
Cdd:COG5180   352 PSAYPPAEEAVPgkpleqgaPRPGSSGGDGAPFQPPNGAPQPGLGRRGAPGP---PMGAGDLVQAALDGGGRETASLGGA 428
                         170       180       190       200
                  ....*....|....*....|....*....|....*....|...
gi 190343021  233 GAQLSYPGGFPGGPAQMAGPPQPQKKLDPDSIPSPIQVIENDR 275
Cdd:COG5180   429 AGGAGQGPKADFVPGDAESVSGPAGLADQAGAAASTAMADFVA 471
COG3416 COG3416
Uncharacterized conserved protein, DUF2076 domain [Function unknown];
117-250 2.01e-04

Uncharacterized conserved protein, DUF2076 domain [Function unknown];


Pssm-ID: 442642 [Multi-domain]  Cd Length: 237  Bit Score: 44.24  E-value: 2.01e-04
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 190343021  117 VTQLGSQLSAMQinsygsgmAPPSQGPPGPLSATSLQTPPRPPQPSILQPGSQVLPPPPttlngpgasplplpmyrpdgl 196
Cdd:COG3416    64 IQELEAQLAQLQ--------QQQPQSSGGFLSGLFGGGQRPPPAPQPSQPGPQQQPAPP--------------------- 114
                          90       100       110       120       130
                  ....*....|....*....|....*....|....*....|....*....|....
gi 190343021  197 sgPPPPNAQYQPPPlpgqtlGAGYPPQQAnsgPQMAGAQlsyPGGFPGGPAQMA 250
Cdd:COG3416   115 --SGPWGQAAPQQP------GYGQPQYGQ---PAAGPSG---GGGFLGGALQTA 154
PRK07764 PRK07764
DNA polymerase III subunits gamma and tau; Validated
132-264 2.05e-04

DNA polymerase III subunits gamma and tau; Validated


Pssm-ID: 236090 [Multi-domain]  Cd Length: 824  Bit Score: 45.36  E-value: 2.05e-04
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 190343021  132 YGSGMAPPSQGPPGPLSATslqTPPRPPQPSILQPGSQVLPPPPTTLNGPGASPLPlpmyRPDGLSGPPPPNAQYQPPPL 211
Cdd:PRK07764  387 VAGGAGAPAAAAPSAAAAA---PAAAPAPAAAAPAAAAAPAPAAAPQPAPAPAPAP----APPSPAGNAPAGGAPSPPPA 459
                          90       100       110       120       130
                  ....*....|....*....|....*....|....*....|....*....|...
gi 190343021  212 PGQTLGAGYPPQQANSGPQMAGAQLSYPGGFPGGPAQMAGPPQPQKKLDPDSI 264
Cdd:PRK07764  460 AAPSAQPAPAPAAAPEPTAAPAPAPPAAPAPAAAPAAPAAPAAPAGADDAATL 512
Med25_SD1 pfam11235
Mediator complex subunit 25 synapsin 1; The overall function of the full-length Med25 is ...
102-248 2.20e-04

Mediator complex subunit 25 synapsin 1; The overall function of the full-length Med25 is to coordinate efficiently the transcriptional activation of RAR/RXR (retinoic acid receptor/retinoic X receptor) in higher eukaryotic cells. Human Med25 consists of several domains with different binding properties: the N-terminal VWA domain, this SD1 (synapsin 1) domain from residues 229-381, a PTOV(B) or ACID domain from 395-545, an SD2 domain from residues 564-645, and a C-terminal NR box-containing domain (646-650) from 646-747. The function of the SD domains is unclear.


Pssm-ID: 463244 [Multi-domain]  Cd Length: 157  Bit Score: 42.85  E-value: 2.20e-04
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 190343021   102 PSAQSSYPGPISTSSVTQL------GSQLSAMQINSYGSgmAPPSQGPPGPLSATSL--QTPPRPPQPSILQPGSQVLPP 173
Cdd:pfam11235    2 PVGGGSAPGPLQSKQPVPLppaapsGATLSAAPQQPLPP--VPPQYQVPGNLSAAQVaaQNAVEAAKNQKAGLGPRFSPI 79
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 190343021   174 PPTTLNGPGASPlplPMYRPDGLSGPP-PPNAQYQPPPLPGQTL------GAGYPPQQANSGPQMAGAqlSYPGGFPG-G 245
Cdd:pfam11235   80 TPLQQAAPGVGP---PFSQAPAPQLPPgPPGAPKPVPPASQPSLvstvapGSGLAPTAQPGAPSMAGT--VAPGGVSGpS 154

                   ...
gi 190343021   246 PAQ 248
Cdd:pfam11235  155 PAQ 157
Drf_FH1 pfam06346
Formin Homology Region 1; This region is found in some of the Diaphanous related formins (Drfs) ...
143-212 2.41e-04

Formin Homology Region 1; This region is found in some of the Diaphanous related formins (Drfs). It consists of low complexity repeats of around 12 residues.


Pssm-ID: 461881 [Multi-domain]  Cd Length: 157  Bit Score: 42.55  E-value: 2.41e-04
                           10        20        30        40        50        60        70
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 190343021   143 PPGPLSATSLQTPPRPPQPsilqpGSQVLPPPPTTLngPGASPLPLPMYRPDGLSGPPPPNAQYQPPPLP 212
Cdd:pfam06346   83 PPPPLPGGAGIPPPPPPLP-----GGAGVPPPPPPL--PGGPGIPPPPPFPGGPGIPPPPPGMGMPPPPP 145
BimA_first NF040984
trimeric autotransporter actin-nucleating factor BimA; BimA (B. pseudomallei intracellular ...
94-211 2.45e-04

trimeric autotransporter actin-nucleating factor BimA; BimA (B. pseudomallei intracellular motility protein A) is a trimeric autotransporter, homologous in its C-terminal half to a number of trimeric autotransporter adhesins. It is a virulence factor that nucleates actin, so that actin polymerization can drive escape by B. pseudomallei out of one cell and into a neighboring cell. HMM NF040983 describes a homolog with similar activity but substantial difference in sequence architecture in the N-terminal region.


Pssm-ID: 468914 [Multi-domain]  Cd Length: 517  Bit Score: 44.86  E-value: 2.45e-04
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 190343021   94 ASSHAPYQPSaqssyPGPISTSSVTQLGSQLSAMQINSYGSgmappsqgPPGPLSATSLQTPPrpPQPSilqPGSQVLPP 173
Cdd:NF040984    6 SSSHAPDAPK-----PSSIATTLCRALASLSLGLSMDAEAN--------PPEPPGGTNIPVPP--PMPG---GGANIPVP 67
                          90       100       110
                  ....*....|....*....|....*....|....*...
gi 190343021  174 PPTTLNGPGASPLPLPmyrPDGLSGPPPpnaqyQPPPL 211
Cdd:NF040984   68 PPMPGGGANIPPPPPP---PGGIGGATP-----SPPPL 97
DUF3824 pfam12868
Domain of unknown function (DUF3824); This is a repeating domain found in fungal proteins. It ...
172-255 3.25e-04

Domain of unknown function (DUF3824); This is a repeating domain found in fungal proteins. It is proline-rich, and the function is not known.


Pssm-ID: 372351 [Multi-domain]  Cd Length: 145  Bit Score: 42.04  E-value: 3.25e-04
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 190343021   172 PPPPTtlnGPGASPLPlpmYRPDGLSGPPPPNAqYQPPPLPGQTLGAGYPPQQANSGPQMAGAQLSYPGGFPGGPAQMAG 251
Cdd:pfam12868   62 PPSPA---GPYASQGQ---YYPETNYFPPPPGS-TPQPPVDPQPNAPPPPYNPADYPPPPGAAPPPQPYQYPPPPGPDPY 134

                   ....
gi 190343021   252 PPQP 255
Cdd:pfam12868  135 APRP 138
PRK07764 PRK07764
DNA polymerase III subunits gamma and tau; Validated
137-293 3.84e-04

DNA polymerase III subunits gamma and tau; Validated


Pssm-ID: 236090 [Multi-domain]  Cd Length: 824  Bit Score: 44.59  E-value: 3.84e-04
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 190343021  137 APPSQGPPGPLSATSLQTPPRPPQPSILQPGSQVLPPPPTtlngPGASPLPLPMYRPDG---LSGPPPPNAQYQPPPLPG 213
Cdd:PRK07764  614 RPAAPAAPAAPAAPAPAGAAAAPAEASAAPAPGVAAPEHH----PKHVAVPDASDGGDGwpaKAGGAAPAAPPPAPAPAA 689
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 190343021  214 QTLGAGYPPQQANSGP--QMAGAQLSYPGGFPGGPAQMAG----------PPQPQKKLDPDSIPSPIQVIENDRASRGGQ 281
Cdd:PRK07764  690 PAAPAGAAPAQPAPAPaaTPPAGQADDPAAQPPQAAQGASapspaaddpvPLPPEPDDPPDPAGAPAQPPPPPAPAPAAA 769
                         170
                  ....*....|..
gi 190343021  282 VYATNTRGQIPP 293
Cdd:PRK07764  770 PAAAPPPSPPSE 781
PRK14971 PRK14971
DNA polymerase III subunit gamma/tau;
121-228 5.93e-04

DNA polymerase III subunit gamma/tau;


Pssm-ID: 237874 [Multi-domain]  Cd Length: 614  Bit Score: 43.61  E-value: 5.93e-04
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 190343021  121 GSQLSAMQINSYGSGMAPPSQGPPGPLSATSLQTPPRPPQPSILQPGSQVLP-PPPTTLNGPGASPLPLPMYRPDglSGP 199
Cdd:PRK14971  371 GGRGPKQHIKPVFTQPAAAPQPSAAAAASPSPSQSSAAAQPSAPQSATQPAGtPPTVSVDPPAAVPVNPPSTAPQ--AVR 448
                          90       100       110
                  ....*....|....*....|....*....|....*
gi 190343021  200 PPPNAQYQPPPLPGQ------TLGAGYPPQQANSG 228
Cdd:PRK14971  449 PAQFKEEKKIPVSKVsslgpsTLRPIQEKAEQATG 483
DUF3729 pfam12526
Protein of unknown function (DUF3729); This family of proteins is found in viruses. Proteins ...
139-214 7.12e-04

Protein of unknown function (DUF3729); This family of proteins is found in viruses. Proteins in this family are typically between 145 and 1707 amino acids in length. The family is found in association with pfam01443, pfam01661, pfam05417, pfam01660, pfam00978. There is a single completely conserved residue L that may be functionally important.


Pssm-ID: 372164 [Multi-domain]  Cd Length: 115  Bit Score: 40.45  E-value: 7.12e-04
                           10        20        30        40        50        60        70
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*.
gi 190343021   139 PSQGPPGPLSATSLQTPPRPPQPSILQPGSQVLPPPPTTLNGPGASPLPLPMYRPDGLSGPPPPNAQYQPPPLPGQ 214
Cdd:pfam12526   29 FSPPESAHPDPPPPVGDPRPPVVDTPPPVSAVWVLPPPSEPAAPEPDLVPPVTGPAGPPSPLAPPAPAQKPPLPPP 104
ARS2 pfam04959
Arsenite-resistance protein 2; Arsenite is a carcinogenic compound which can act as a ...
152-246 7.17e-04

Arsenite-resistance protein 2; Arsenite is a carcinogenic compound which can act as a co-mutagen by inhibiting DNA repair. Arsenite-resistance protein 2 is thought to play a role in arsenite resistance.


Pssm-ID: 461498  Cd Length: 195  Bit Score: 41.79  E-value: 7.17e-04
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 190343021   152 LQTPPRPPQPSiLQPgsqvlPPPPTTLNGPGASPLPLPMYRPDGLSGppppnAQYQPPPLPGQTLGAGYPPQQANSGPQM 231
Cdd:pfam04959  109 LADAKRPATPE-LKP-----KPPPRPANRRERPGRAFPSQRPQGQMS-----DGHPRPPMDGPGGGPPFPPNQYGGGRGN 177
                           90
                   ....*....|....*.
gi 190343021   232 AGA-QLSYPGGFPGGP 246
Cdd:pfam04959  178 YDNfRGQGGGGYPPKP 193
PAT1 pfam09770
Topoisomerase II-associated protein PAT1; Members of this family are necessary for accurate ...
94-293 7.88e-04

Topoisomerase II-associated protein PAT1; Members of this family are necessary for accurate chromosome transmission during cell division.


Pssm-ID: 401645 [Multi-domain]  Cd Length: 846  Bit Score: 43.49  E-value: 7.88e-04
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 190343021    94 ASSHAPYQPsAQSSY-----PGPIS----TSSVTQLGSQLSAMQINSYGSGMAPPSQGPPGP--LS------ATSLQTPP 156
Cdd:pfam09770  131 QQSQQPSKP-VRTGYekykePEPIPdlqvDASLWGVAPKKAAAPAPAPQPAAQPASLPAPSRkmMSleeveaAMRAQAKK 209
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 190343021   157 RPPQPSILQPGSQVLPPPPttlngpgasplplpmyrpdglSGPPPPNAQYQPPPLPGQTLGAGYPPQQANSGPQMagAQL 236
Cdd:pfam09770  210 PAQQPAPAPAQPPAAPPAQ---------------------QAQQQQQFPPQIQQQQQPQQQPQQPQQHPGQGHPV--TIL 266
                          170       180       190       200       210       220
                   ....*....|....*....|....*....|....*....|....*....|....*....|
gi 190343021   237 SYPGGFPGGPAQMAGPPQPQ--KKLDPDSIPSPIQVIEN-DRASRGGQVYATNTRGQIPP 293
Cdd:pfam09770  267 QRPQSPQPDPAQPSIQPQAQqfHQQPPPVPVQPTQILQNpNRLSAARVGYPQNPQPGVQP 326
PRK14951 PRK14951
DNA polymerase III subunits gamma and tau; Provisional
93-217 1.06e-03

DNA polymerase III subunits gamma and tau; Provisional


Pssm-ID: 237865 [Multi-domain]  Cd Length: 618  Bit Score: 42.78  E-value: 1.06e-03
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 190343021   93 VASSHAPYQPSAQSSYPGPIStssVTQLGSQLSAMQINSYGSGMAPPSQGPPGPLSATSLQTPPRPPQPsilqpGSQVLP 172
Cdd:PRK14951  375 PAEKKTPARPEAAAPAAAPVA---QAAAAPAPAAAPAAAASAPAAPPAAAPPAPVAAPAAAAPAAAPAA-----APAAVA 446
                          90       100       110       120
                  ....*....|....*....|....*....|....*....|....*...
gi 190343021  173 PPPTTLNGPGASPLPLPMY---RPDGLSGPPPPNAQYQPPPLPGQTLG 217
Cdd:PRK14951  447 LAPAPPAQAAPETVAIPVRvapEPAVASAAPAPAAAPAAARLTPTEEG 494
Prog_receptor pfam02161
Progesterone receptor;
99-200 1.74e-03

Progesterone receptor;


Pssm-ID: 460470  Cd Length: 564  Bit Score: 42.22  E-value: 1.74e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 190343021    99 PYQPSAQSSYPGPIS------TSSVTQLGSQLSAMQINSYGSGMAPPSQGPPGPLsatslqtPPRPPQPSILQPGSQVLP 172
Cdd:pfam02161  424 PLPPRAPSSRPGEAAvaaapaSASVSSASSSGSTLECILYKAEGAPPQQGPFAPP-------PCKPPGAGACLLPRDGLP 496
                           90       100
                   ....*....|....*....|....*...
gi 190343021   173 PPPTTLNGPGASPlplPMYRPDGLSGPP 200
Cdd:pfam02161  497 STSASAAAAGAAP---ALYPPLGLNGLP 521
PHA03264 PHA03264
envelope glycoprotein D; Provisional
172-265 1.89e-03

envelope glycoprotein D; Provisional


Pssm-ID: 223029 [Multi-domain]  Cd Length: 416  Bit Score: 41.91  E-value: 1.89e-03
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 190343021  172 PPPPTTLNGPGASPLPLPMYRPDGLSGPPPPNAQ--YQPPPLPGQTLGAGYPPQQANSGPQMAGAQLSYPGGFPGGPAQM 249
Cdd:PHA03264  266 PPPAPSGGSPAPPGDDRPEAKPEPGPVEDGAPGRetGGEGEGPEPAGRDGAAGGEPKPGPPRPAPDADRPEGWPSLEAIT 345
                          90
                  ....*....|....*.
gi 190343021  250 AGPPQPQKKLDPDSIP 265
Cdd:PHA03264  346 FPPPTPATPAVPRARP 361
PRK07764 PRK07764
DNA polymerase III subunits gamma and tau; Validated
137-236 2.47e-03

DNA polymerase III subunits gamma and tau; Validated


Pssm-ID: 236090 [Multi-domain]  Cd Length: 824  Bit Score: 41.90  E-value: 2.47e-03
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 190343021  137 APPSQGPPGPLSATSLQTPPRPPQPSILQPGSQVLPPPPTTLNGPGASPLPLPM----YRPDGLSGPPPPNAQYQPPPLP 212
Cdd:PRK07764  416 APAAAAAPAPAAAPQPAPAPAPAPAPPSPAGNAPAGGAPSPPPAAAPSAQPAPApaaaPEPTAAPAPAPPAAPAPAAAPA 495
                          90       100
                  ....*....|....*....|....
gi 190343021  213 GQTLGAGYPPQQANSGPQMAGAQL 236
Cdd:PRK07764  496 APAAPAAPAGADDAATLRERWPEI 519
Gag_spuma pfam03276
Spumavirus gag protein;
134-271 3.22e-03

Spumavirus gag protein;


Pssm-ID: 460872 [Multi-domain]  Cd Length: 614  Bit Score: 41.27  E-value: 3.22e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 190343021   134 SGMAPPSQGPPGPLSATSlqtppRPPQPSILQPGSQVLPPPPTTLNGPGASPLPLPMYRPDGLSGPPPPNAQYQPPPLPG 213
Cdd:pfam03276  176 AEISPGAQGGIPPGASFS-----GLPSLPAIGGIHLPAIPGIHARAPPGNIARSLGDDIMPSLGDAGMPQPRFAFHPGNP 250
                           90       100       110       120       130
                   ....*....|....*....|....*....|....*....|....*....|....*...
gi 190343021   214 QTLGAGYPPQQAnSGPQMAGAQLSYPGGFPGGPAQMAGPPQPQKKLDPDSIPSPIQVI 271
Cdd:pfam03276  251 FAEAEGHPFAEA-EGERPRDIPRAPRIDAPSAPAIPAIQPIAPPMIPPIGAPIPIPHG 307
PHA03247 PHA03247
large tegument protein UL36; Provisional
137-210 4.07e-03

large tegument protein UL36; Provisional


Pssm-ID: 223021 [Multi-domain]  Cd Length: 3151  Bit Score: 41.46  E-value: 4.07e-03
                          10        20        30        40        50        60        70
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*...
gi 190343021  137 APPSQGPPGPLSATSLQTPPRPPQPSIlqpgsqvlPPPPTTLNGPGASPL----PLPMYRPDGLSGPPPPNAQYQPPP 210
Cdd:PHA03247  388 ARHAATPFARGPGGDDQTRPAAPVPAS--------VPTPAPTPVPASAPPppatPLPSAEPGSDDGPAPPPERQPPAP 457
Med15 pfam09606
ARC105 or Med15 subunit of Mediator complex non-fungal; The approx. 70 residue Med15 domain of ...
3-240 4.19e-03

ARC105 or Med15 subunit of Mediator complex non-fungal; The approx. 70 residue Med15 domain of the ARC-Mediator co-activator is a three-helix bundle with marked similarity to the KIX domain. The sterol regulatory element binding protein (SREBP) family of transcription activators use the ARC105 subunit to activate target genes in the regulation of cholesterol and fatty acid homeostasis. In addition, Med15 is a critical transducer of gene activation signals that control early metazoan development.


Pssm-ID: 312941 [Multi-domain]  Cd Length: 732  Bit Score: 41.15  E-value: 4.19e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 190343021     3 QQGYVATPPYSQPQPGIGLSPPHYGHYGDPShtASPTGMmkPAGPLGATATR-----------GMLPPGPPPPGPHQFGQ 71
Cdd:pfam09606  243 MQQQQPQQQGQQSQLGMGINQMQQMPQGVGG--GAGQGG--PGQPMGPPGQQpgampnvmsigDQNNYQQQQTRQQQQQQ 318
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 190343021    72 NGAHATGHPPQRFPGPPPVNNVASshAPYQPSAQSSYPGPISTSSVTQLGSQLSAMqinsygsgMAPPSQGPPGPL-SAT 150
Cdd:pfam09606  319 GGNHPAAHQQQMNQSVGQGGQVVA--LGGLNHLETWNPGNFGGLGANPMQRGQPGM--------MSSPSPVPGQQVrQVT 388
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 190343021   151 SLQTPPRPPQPSILQPGsqvlppppttlnGPGASPLPLPMYRPdglsgPPPPNAQYQPPPLPGQTlgagyPPQQANSGPQ 230
Cdd:pfam09606  389 PNQFMRQSPQPSVPSPQ------------GPGSQPPQSHPGGM-----IPSPALIPSPSPQMSQQ-----PAQQRTIGQD 446
                          250
                   ....*....|
gi 190343021   231 MAGAQLSYPG 240
Cdd:pfam09606  447 SPGGSLNTPG 456
SP6_N cd22544
N-terminal domain of transcription factor Specificity Protein (SP) 6; Specificity Proteins ...
114-253 4.44e-03

N-terminal domain of transcription factor Specificity Protein (SP) 6; Specificity Proteins (SPs) are transcription factors that are involved in many cellular processes, including cell differentiation, cell growth, apoptosis, immune responses, response to DNA damage, and chromatin remodeling. SP6, also known as epiprofin, shows a specific expression pattern in hair follicles and the apical ectodermal ridge (AER) of the developing limbs. SP6 null mice are nude and show defects in skin, teeth, limbs (syndactyly and oligodactyly), and lung alveoli. SP6 belongs to a family of proteins, called the SP/Kruppel or Krueppel-like Factor (KLF) family, characterized by a C-terminal DNA-binding domain of 81 amino acids consisting of three Kruppel-like C2H2 zinc fingers. These factors bind to a loose consensus motif, namely NNRCRCCYY (where N is any nucleotide; R is A/G, and Y is C/T), such as the recurring motifs in GC and GT boxes (5'-GGGGCGGGG-3' and 5'-GGTGTGGGG-3') that are present in promoters and more distal regulatory elements of mammalian genes. SP factors preferentially bind GC boxes, while KLFs bind CACCC boxes. Another characteristic hallmark of SP factors is the presence of the Buttonhead (BTD) box CXCPXC, just N-terminal to the zinc fingers. The function of the BTD box is unknown, but it is thought to play an important physiological role. Another feature of most SP factors is the presence of a conserved amino acid stretch, the so-called SP box, located close to the N-terminus. SP factors may be separated into three groups based on their domain architecture and the similarity of their N-terminal transactivation domains: SP1-4, SP5, and SP6-9. The transactivation domains between the three groups are not homologous to one another. This model represents the N-terminal domain of SP6.


Pssm-ID: 411693 [Multi-domain]  Cd Length: 245  Bit Score: 39.90  E-value: 4.44e-03
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 190343021  114 TSSVTQLGSQLSAMQINSYGSGMAPPSQ-----------GPPGPLSATSLQTPPRPPQPSILQPGSQVlPPPPTTLNGPG 182
Cdd:cd22544     3 TAVCGSLGNQHSETPRASPPTLDLQPLQpyqihsspeagDYPSPLQPTELQSLPLGPGVDFSARESYE-PHSSRRTCLDL 81
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 190343021  183 ASPLPLPMYRPdgLSGPPPPNAQ-YQP---PPLPGQTLGAG-----------------YPPQQANSGPQMAGAQLSYPGG 241
Cdd:cd22544    82 ESDLPLGPFPK--LLHPPPDMAHpYESwfrPPHPGGSGEEGgvpswwdlhagsswmdlQHGQGGLQSPGPPGGLQPPLGG 159
                         170
                  ....*....|..
gi 190343021  242 FpGGPAQMAGPP 253
Cdd:cd22544   160 Y-GSEHQLCGPP 170
SAV_2336_NTERM NF041121
SAV_2336 family N-terminal domain; This HMM describes an N-terminal domain shared by SAV_2336 ...
138-205 5.36e-03

SAV_2336 family N-terminal domain; This HMM describes an N-terminal domain shared by SAV_2336 (BAC70047.1) whose C-terminal region suggests restriction enzyme activity (PMID: 18456708), and with other proteins with unrelated C-terminal regions. A member protein was also identified in a kanamycin biosynthetic gene cluster (PMID:16766657), while N-terminal regions of two other member proteins were named Trypco1 in a bioinformatic study (PMID:32101166) of predicted bacterial conflict systems.


Pssm-ID: 469044 [Multi-domain]  Cd Length: 473  Bit Score: 40.37  E-value: 5.36e-03
                          10        20        30        40        50        60
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*...
gi 190343021  138 PPSQGPPGPLSATSLQTPPRPPQPSILQPGSQVLPPPPTTLNGPGASPLPLPMYRPDglsGPPPPNAQ 205
Cdd:NF041121   39 PPPAAPPSPPGDPPEPPAPEPAPLPAPYPGSLAPPPPPPPGPAGAAPGAALPVRVPA---PPALPNPL 103
PHA03307 PHA03307
transcriptional regulator ICP4; Provisional
132-267 6.44e-03

transcriptional regulator ICP4; Provisional


Pssm-ID: 223039 [Multi-domain]  Cd Length: 1352  Bit Score: 40.54  E-value: 6.44e-03
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 190343021  132 YGS-GMAPPSQGP---PGPLSATSLQTPPRPPQ-----PSILQPGSQVLPPPPTTLNGPGASPLPLPMYRPDGLSGPPPP 202
Cdd:PHA03307   37 SGSqGQLVSDSAElaaVTVVAGAAACDRFEPPTgpppgPGTEAPANESRSTPTWSLSTLAPASPAREGSPTPPGPSSPDP 116
                          90       100       110       120       130       140
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*
gi 190343021  203 NAQYQPPPLPGQTLGAGYPPQQANSGPQMAGAQLSYPGGFPGGPAQMAGPPQPQKKLDPDSIPSP 267
Cdd:PHA03307  117 PPPTPPPASPPPSPAPDLSEMLRPVGSPGPPPAASPPAAGASPAAVASDAASSRQAALPLSSPEE 181
SSDP pfam04503
Single-stranded DNA binding protein, SSDP; This is a family of eukaryotic single-stranded DNA ...
87-255 7.09e-03

Single-stranded DNA binding protein, SSDP; This is a family of eukaryotic single-stranded DNA binding proteins with specificity to a pyrimidine-rich element found in the promoter region of the alpha2(I) collagen gene.


Pssm-ID: 461334 [Multi-domain]  Cd Length: 293  Bit Score: 39.55  E-value: 7.09e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 190343021    87 PPPVNNVASSHAPYQPSAQSSYP-GPISTSSVTQLGSQLSamqinsygsgmAPPSQGPPGPLSAtslqTPPRPPQPSILQ 165
Cdd:pfam04503   61 PPPHNPATMMGPHSQPFMGPRYPgGPRPSVRMPQQGNDFN-----------GPPGQQPMMPNSM----DPTRPGGHPNMG 125
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 190343021   166 PGSQVLPPPpttlNGPGASPLPLPMYRPdGLSGpPPPNAQYQPPPLPGQTLGAGYPPQQANSGpqmAGAQLSYPGGFPG- 244
Cdd:pfam04503  126 GPMQRMNPP----RGPGMGPMGPQSYGP-GMRG-PPPNSTDGPGGMPPMNMGPGGRRPWPQPN---ASNPLPYSSSSPGs 196
                          170
                   ....*....|...
gi 190343021   245 --GPAQMAGPPQP 255
Cdd:pfam04503  197 ygGPPGGGGPPGP 209
SAV_2336_NTERM NF041121
SAV_2336 family N-terminal domain; This HMM describes an N-terminal domain shared by SAV_2336 ...
137-214 7.43e-03

SAV_2336 family N-terminal domain; This HMM describes an N-terminal domain shared by SAV_2336 (BAC70047.1) whose C-terminal region suggests restriction enzyme activity (PMID: 18456708), and with other proteins with unrelated C-terminal regions. A member protein was also identified in a kanamycin biosynthetic gene cluster (PMID:16766657), while N-terminal regions of two other member proteins were named Trypco1 in a bioinformatic study (PMID:32101166) of predicted bacterial conflict systems.


Pssm-ID: 469044 [Multi-domain]  Cd Length: 473  Bit Score: 39.99  E-value: 7.43e-03
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 190343021  137 APPSQGPPGPLSATSLQTPPRPPQPSiLQPGSQVLPPPPTTLNGPGASP--LPLPMYRPDGLSGPPPPNAQ----YQPPP 210
Cdd:NF041121   20 APPSPEGPAPTAASQPATPPPPAAPP-SPPGDPPEPPAPEPAPLPAPYPgsLAPPPPPPPGPAGAAPGAALpvrvPAPPA 98

                  ....
gi 190343021  211 LPGQ 214
Cdd:NF041121   99 LPNP 102
PAT1 pfam09770
Topoisomerase II-associated protein PAT1; Members of this family are necessary for accurate ...
64-226 7.68e-03

Topoisomerase II-associated protein PAT1; Members of this family are necessary for accurate chromosome transmission during cell division.


Pssm-ID: 401645 [Multi-domain]  Cd Length: 846  Bit Score: 40.40  E-value: 7.68e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 190343021    64 PGPHQfGQNGAHATGHPPQRFPgpppvnnvASSHAPYQPSAQSSYPGPISTSSVTQLGSQLSAMQinsygSGMAPPSQGP 143
Cdd:pfam09770  215 APAPA-QPPAAPPAQQAQQQQQ--------FPPQIQQQQQPQQQPQQPQQHPGQGHPVTILQRPQ-----SPQPDPAQPS 280
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 190343021   144 PGPLSATSLQTPPRPPQpsilQPgSQVLPPPpttlNGPGASPLPLPMYRPDGLSGPPPPNAQYQPPPLPGQTLGAGYPPQ 223
Cdd:pfam09770  281 IQPQAQQFHQQPPPVPV----QP-TQILQNP----NRLSAARVGYPQNPQPGVQPAPAHQAHRQQGSFGRQAPIITHPQQ 351

                   ...
gi 190343021   224 QAN 226
Cdd:pfam09770  352 LAQ 354
PHA03201 PHA03201
uracil DNA glycosylase; Provisional
156-225 7.91e-03

uracil DNA glycosylase; Provisional


Pssm-ID: 165468  Cd Length: 318  Bit Score: 39.49  E-value: 7.91e-03
                          10        20        30        40        50        60        70
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*..
gi 190343021  156 PRPPQPSilqPGSQVLPPPPTTLNGPGASPLPLPMYRPDGLSGPPPPNAQYQPPPL-------PGQTLGAGYPPQQA 225
Cdd:PHA03201    4 ARSRSPS---PPRRPSPPRPTPPRSPDASPEETPPSPPGPGAEPPPGRAAGPAAPRrrprgcpAGVTFSSSAPPRPP 77
PHA03247 PHA03247
large tegument protein UL36; Provisional
11-267 8.41e-03

large tegument protein UL36; Provisional


Pssm-ID: 223021 [Multi-domain]  Cd Length: 3151  Bit Score: 40.31  E-value: 8.41e-03
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 190343021   11 PYSQPQPGiGLSPPHYGHYGDPSHTA------SPTGmmKPAGPLGATATRGM--------------LPPGPPPPGPHQfg 70
Cdd:PHA03247 2492 AGAAPDPG-GGGPPDPDAPPAPSRLApailpdEPVG--EPVHPRMLTWIRGLeelasddagdppppLPPAAPPAAPDR-- 2566
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 190343021   71 qngahatghppqrfpgpppvnnvasSHAPYQPSAQSsyPGPISTSSVTQLGSQLSAmqinsyGSGMAP--PSQGPPGPLS 148
Cdd:PHA03247 2567 -------------------------SVPPPRPAPRP--SEPAVTSRARRPDAPPQS------ARPRAPvdDRGDPRGPAP 2613
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 190343021  149 ATSL---QTPPRPPQPS-------ILQPGSQVLPPPPTTLNGPGASPLPLP-----MYRPDGLSG--------------- 198
Cdd:PHA03247 2614 PSPLppdTHAPDPPPPSpspaanePDPHPPPTVPPPERPRDDPAPGRVSRPrrarrLGRAAQASSppqrprrraarptvg 2693
                         250       260       270       280       290       300       310       320
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 190343021  199 -------PPPPNAQYQPPPLP----------GQTLGAGYPPQQANSGPQMAGAQLSYPGG--FPGGPAQMAGPPQPQKKL 259
Cdd:PHA03247 2694 sltsladPPPPPPTPEPAPHAlvsatplppgPAAARQASPALPAAPAPPAVPAGPATPGGpaRPARPPTTAGPPAPAPPA 2773

                  ....*...
gi 190343021  260 DPDSIPSP 267
Cdd:PHA03247 2774 APAAGPPR 2781
PHA03379 PHA03379
EBNA-3A; Provisional
94-278 8.66e-03

EBNA-3A; Provisional


Pssm-ID: 223066 [Multi-domain]  Cd Length: 935  Bit Score: 40.04  E-value: 8.66e-03
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 190343021   94 ASSHAPYQ----PSAQSSYPGPISTSS------VTQL----------GSQLSAmqINSYGSGMAPPSQGPPGPL----SA 149
Cdd:PHA03379  434 ATSHGSAQvpepPPVHDLEPGPLHDQHsmapcpVAQLppgplqdlepGDQLPG--VVQDGRPACAPVPAPAGPIvrpwEA 511
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 190343021  150 TSLQTPPRPPQPSILQPGSQVLPPPPTTLNGPGASPLP--LPMYRPDGLSG---------PPP--PNAQYQPPPLPGQT- 215
Cdd:PHA03379  512 SLSQVPGVAFAPVMPQPMPVEPVPVPTVALERPVCPAPplIAMQGPGETSGivrvrerwrPAPwtPNPPRSPSQMSVRDr 591
                         170       180       190       200       210       220
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*
gi 190343021  216 LGAGYPPQQANSGPqmAGAQlsyPGGFPGGPAQ--MAGPPQPQKKLDPDSIPSpiQVIENDRASR 278
Cdd:PHA03379  592 LARLRAEAQPYQAS--VEVQ---PPQLTQVSPQqpMEYPLEPEQQMFPGSPFS--QVADVMRAGG 649
Drf_FH1 pfam06346
Formin Homology Region 1; This region is found in some of the Diaphanous related formins (Drfs) ...
143-267 8.98e-03

Formin Homology Region 1; This region is found in some of the Diaphanous related formins (Drfs). It consists of low complexity repeats of around 12 residues.


Pssm-ID: 461881 [Multi-domain]  Cd Length: 157  Bit Score: 37.93  E-value: 8.98e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 190343021   143 PPGPLSATSLQTPPRPPQPSIlqpgsqvlpPPPTTLNGPGASPLPLPMYRPDGLSGPPP-PNAQY--QPPPLPGqtlGAG 219
Cdd:pfam06346    1 PPPPPLPGDSSTIPLPPGACI---------PTPPPLPGGGGPPPPPPLPGSAAIPPPPPlPGGTSipPPPPLPG---AAS 68
                           90       100       110       120       130
                   ....*....|....*....|....*....|....*....|....*....|....
gi 190343021   220 YPPQQANSGPQMAGAQLSYPG--GFPGGPAQMAG----PPQPQKKLDPDSIPSP 267
Cdd:pfam06346   69 IPPPPPLPGSTGIPPPPPLPGgaGIPPPPPPLPGgagvPPPPPPLPGGPGIPPP 122
 
Blast search parameters
Data Source: Precalculated data, version = cdd.v.3.21
Preset Options: Database: CDSEARCH/cdd   Low complexity filter: no   Composition Based Adjustment: yes   E-value threshold: 0.01
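
Each hit in the list above carries an interval and an E-value on one line (for example "99-275 1.95e-04"), and every reported hit falls at or below the 0.01 threshold stated in the search parameters. The following is a minimal sketch of how such lines might be parsed and checked against that threshold. The helper names `parse_hit` and `passes_threshold` are hypothetical and not part of any NCBI tool, and treating the cutoff as inclusive is an assumption.

```python
import re

# Hypothetical pattern for a hit summary line such as "99-275 1.95e-04":
# start-end interval on the query, then the E-value in scientific notation.
HIT_LINE = re.compile(r"^(\d+)-(\d+)\s+([0-9.]+e[+-]?\d+)$")


def parse_hit(line):
    """Return (start, end, evalue) for an interval/E-value line, else None."""
    m = HIT_LINE.match(line.strip())
    if m is None:
        return None
    return int(m.group(1)), int(m.group(2)), float(m.group(3))


def passes_threshold(evalue, threshold=0.01):
    """True when the E-value is at or below the threshold (assumed inclusive)."""
    return evalue <= threshold


# Two interval/E-value lines copied from the hit list above.
hits = [parse_hit(s) for s in ("99-275 1.95e-04", "143-267 8.98e-03")]
kept = [h for h in hits if h is not None and passes_threshold(h[2])]
```

Lines that do not match the interval/E-value shape (names, accessions, descriptions, alignment rows) return `None` and are skipped, so the sketch can be run over the hit list line by line.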
