Conserved domains on  [gi|353906527|gb|EHE81931|]

endo-beta-N-acetylglucosaminidase D [Streptococcus pneumoniae GA13338]

Protein Classification

Graphical summary


List of domain hits

Name Accession Description Interval E-value
COG4724 super family cl44115
Endo-beta-N-acetylglucosaminidase D [Carbohydrate transport and metabolism];
172-786 2.93e-174

Endo-beta-N-acetylglucosaminidase D [Carbohydrate transport and metabolism];


The actual alignment was detected with superfamily member COG4724:

Pssm-ID: 443759 [Multi-domain]  Cd Length: 662  Bit Score: 540.05  E-value: 2.93e-174
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 353906527  172 EELLKWEPGAREDDAINRGSVVLASRRTGHL--VNEKASKEAKVQALSNTNSKAKDHASVGGEEFKAYAFDYWQYLDSMV 249
Cdd:COG4724    43 EDLLNWSPETDPDARYNRSRVPLAPRFTGSAtqINPTLSPDAKVMSLAIDNPNTSGNPSQGGSDFNVYTFTYWQYIDYLV 122
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 353906527  250 FW-----EGLV--PTPDVIDAGHRNGVPVYGTLFFNWSNSIADQERFAEALKQDADGSFPIARKLVDMAKYYGYDGYFIN 322
Cdd:COG4724   123 YWggsagEGIIvpPSPDVIDAAHKNGVKVLGTVFFPPGAYGGKIEWVDAFLEKDEDGSFPVADKLIEIAQYYGFDGWFIN 202
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 353906527  323 QETTGdLVKPLGEKMRQFMLYSKEyaaKVNHPIKYSWYDAMTYNYGRYHQDGLGEYNYQFMQpEGDKVPADNFFANFNWD 402
Cdd:COG4724   203 QETNG-TDPELAKKMKEFLEYLKE---KSPENMEIMWYDSMLENGSVSWQNALNEKNDAFLQ-DGNKKVSDSMFLNFWWT 277
                         250       260       270       280       290       300       310       320
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 353906527  403 KAKN-DYTIATANWIGRNPYDVFAGLELQQGGsYKTKVKWNDILDENGKLRLSLGLFAPDTITSLGKTGEDYHKNEDIFF 481
Cdd:COG4724   278 GGSLlEKSRDTAKSLGRSPYDLYAGIDVQQNG-YNTRINWDALLDDNKKPPTSLGLYCPNWTFNSSKNPDDFYDNEQKFW 356
                         330       340       350       360       370       380       390       400
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 353906527  482 TGYQGDPTgQKPGDKDWYGIANLVADRTPAVGNTFTTSFNTGHGKKWFVDGKVSKDSEWNYRSVSGVLPTWRWWQTSTGE 561
Cdd:COG4724   357 VGPDGDPA-NTTDSNGWKGISTYVVEKSPVTSLPFVTNFNTGHGYKFYINGQQVSDGEWNNRSLQDVLPTWQWIVDSEGN 435
                         410       420       430       440       450       460       470       480
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 353906527  562 KLRAEYDFTDAYNGGNSLKFSGDVAGKTDQDVRLYSTKLEVTEKTKLRVAHKGGKGSKVYMAFSTTPDYK-FDDADAwkE 640
Cdd:COG4724   436 SLTPSFDYTDAYNGGSSLKLEGKLKAGGETTIKLYKTDLPITDDTKLSVVYKTDAKVKLSLGLTFKDGPTeFITFDL--G 513
                         490       500       510       520       530       540       550       560
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 353906527  641 LTLSDNWTNEEFDLSSLAGKTIYAVKLFFEHEGAVKDYQFNLGQLTISDNHQEPQSPTSFSVV--KQSLKNAQEAEAVVQ 718
Cdd:COG4724   514 TTSNNGWTTVTVDLSAYAGKTIAAISLKFSSTTDVDNYKINLGQLAIFNGTTPPSAPPNNTTVsgQTLVDASASAFRLNW 593
                         570       580       590       600       610       620
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*...
gi 353906527  719 FKGNKDADFYEVYEKDGDSWKLLTGSSSTTIYLPKVSRSASAQGTTQELKVVAVGKNGVRSEAATTTF 786
Cdd:COG4724   594 WSDASAYGEYHVLQVTNPNAKWLGTNTNNAATVAKTSDRTVNAILFTITPIGAESISTPNKTATTTTI 661
Big_3 pfam07523
Bacterial Ig-like domain (group 3); This family consists of bacterial domains with an Ig-like ...
1144-1212 1.29e-12

Bacterial Ig-like domain (group 3); This family consists of bacterial domains with an Ig-like fold. Members of this family are found in a variety of bacterial surface proteins.


Pssm-ID: 400072 [Multi-domain]  Cd Length: 67  Bit Score: 64.24  E-value: 1.29e-12
                           10        20        30        40        50        60        70
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 353906527  1144 QEPKKDYLVGDSLDLSEGRFAVAYSNDTMEehsFTDEGVEISG-YDAQKTGRQTLTLRYQGHEVNFDVLV 1212
Cdd:pfam07523    1 QPHKTTYYVGDSWDAEDNFVSATYKDGDAE---VPFDDVEVSGtVDSTKAGEYTVTYTYKGVSATFTVTV 67
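The interval reported for a hit can be cross-checked against its alignment rows: the number of residues in a row (every character except the `-` gap symbol, lowercase unaligned residues included) should equal end minus start plus one. The following is a minimal sketch of that check using the two rows above; the helper names `residue_count` and `row_is_consistent` are hypothetical and are not part of any NCBI tool.

```python
def residue_count(aligned: str) -> int:
    """Count residues in one alignment row, ignoring '-' gap characters.

    Lowercase letters mark unaligned residues; they still occupy
    sequence positions, so they are counted.
    """
    return sum(1 for ch in aligned if ch != "-")


def row_is_consistent(start: int, aligned: str, end: int) -> bool:
    """True if the row's residue count matches its start/end coordinates."""
    return residue_count(aligned) == end - start + 1


# Rows copied from the Big_3 (pfam07523) alignment above.
query_row = (
    1144,
    "QEPKKDYLVGDSLDLSEGRFAVAYSNDTMEehsFTDEGVEISG-YDAQKTGRQTLTLRYQGHEVNFDVLV",
    1212,
)
model_row = (
    1,
    "QPHKTTYYVGDSWDAEDNFVSATYKDGDAE---VPFDDVEVSGtVDSTKAGEYTVTYTYKGVSATFTVTV",
    67,
)

if __name__ == "__main__":
    for label, row in (("query", query_row), ("model", model_row)):
        print(label, residue_count(row[1]), row_is_consistent(*row))
```

The same check applies to any row pair on this page, which makes it a quick way to detect rows damaged during copy and paste.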
YabE super family cl34636
Uncharacterized conserved protein YabE, contains G5 and tandem DUF348 domains [Function ...
1512-1607 1.81e-12

Uncharacterized conserved protein YabE, contains G5 and tandem DUF348 domains [Function unknown];


The actual alignment was detected with superfamily member COG3583:

Pssm-ID: 442802 [Multi-domain]  Cd Length: 335  Bit Score: 70.67  E-value: 1.81e-12
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 353906527 1512 KLEVQEEKVAFHRQEHENTEMLVGEQRVIIQGRDGLLRHVFEVD-ENGQ---RRLRSTEVIQEAIPEIVEIGTKVKTVPA 1587
Cdd:COG3583   152 KTVTEEEPIPFETVRKEDPSLPKGETKVVQEGVPGVKEVTYRVTyENGKevsREVVSEKVTKEPVDEVVAVGTKPRPAPA 231
                          90       100
                  ....*....|....*....|
gi 353906527 1588 VVATQEKPAQNTAVKSEEAS 1607
Cdd:COG3583   232 PVPAGSGSGGGGSSTGSGGY 251
Big_3 pfam07523
Bacterial Ig-like domain (group 3); This family consists of bacterial domains with an Ig-like ...
1055-1126 7.56e-12

Bacterial Ig-like domain (group 3); This family consists of bacterial domains with an Ig-like fold. Members of this family are found in a variety of bacterial surface proteins.


Pssm-ID: 400072 [Multi-domain]  Cd Length: 67  Bit Score: 61.93  E-value: 7.56e-12
                           10        20        30        40        50        60        70
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|...
gi 353906527  1055 GPKKTSYAEGEDLDLRGGVLRVQYEGGTEdeliRLTHAGVSVSG-FDTHHKGEQNLTLQYLGqpVNANLSVTV 1126
Cdd:pfam07523    1 QPHKTTYYVGDSWDAEDNFVSATYKDGDA----EVPFDDVEVSGtVDSTKAGEYTVTYTYKG--VSATFTVTV 67
YSIRK_signal TIGR01168
Gram-positive signal peptide, YSIRK family; Many surface proteins found in Streptococcus, ...
2-40 2.23e-10

Gram-positive signal peptide, YSIRK family; Many surface proteins found in Streptococcus, Staphylococcus, and related lineages share apparently homologous signal sequences. A motif resembling [YF]SIRKxxxGxxS[VIA] appears at the start of the transmembrane domain. The GxxS motif appears perfectly conserved, suggesting a specific function and not just homology. There is a strong correlation between proteins carrying this region at the N-terminus and those carrying the Gram-positive anchor domain with the LPXTG sortase processing site at the C-terminus.
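The motif quoted above, [YF]SIRKxxxGxxS[VIA], translates directly into a regular expression. The sketch below is illustrative only, assuming `x` means any single residue; it is not how the TIGR01168 model itself scores sequences. Applied to the rows of the YSIRK_signal alignment on this page, the strict pattern matches the model's consensus row, while the query carries Leu at the final position and matches only when that last position is dropped. The pattern names `STRICT` and `CORE` are hypothetical.

```python
import re

# Motif as written in the family description: [YF]SIRKxxxGxxS[VIA]
STRICT = re.compile(r"[YF]SIRK.{3}G.{2}S[VIA]")
# Relaxed variant (an assumption for illustration): drop the final [VIA] position.
CORE = re.compile(r"[YF]SIRK.{3}G.{2}S")

# Rows copied from the YSIRK_signal (TIGR01168) alignment on this page.
QUERY_START = 2  # the query row begins at residue 2 of the protein
query = "KNPFFERRCRYSIRKLSVGACSLMIGAVLFAGPALAEET"
consensus = "AKKFNEKQQKYSIRKLSVGVASVLVASLFFGGGVAAAES"

if __name__ == "__main__":
    print("consensus, strict:", STRICT.search(consensus))
    print("query, strict:", STRICT.search(query))
    hit = CORE.search(query)
    if hit:
        # Convert the 0-based match offset to a protein coordinate.
        print("query, core motif starts at residue", QUERY_START + hit.start())
```

A strict miss here is consistent with the description's wording that the motif only "resembles" this pattern.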


Pssm-ID: 273479 [Multi-domain]  Cd Length: 39  Bit Score: 57.11  E-value: 2.23e-10
                           10        20        30
                   ....*....|....*....|....*....|....*....
gi 353906527     2 KNPFFERRCRYSIRKLSVGACSLMIGAVLFAGPALAEET 40
Cdd:TIGR01168    1 AKKFNEKQQKYSIRKLSVGVASVLVASLFFGGGVAAAES 39
endo_SpGH101 super family cl45865
SpGH101 family endo-alpha-N-acetylgalactosaminidase; Members of this family are streptococcal ...
1-162 2.17e-09

SpGH101 family endo-alpha-N-acetylgalactosaminidase; Members of this family are streptococcal surface proteins with a complex (and somewhat variable) architecture that includes a crosswall-targeting N-terminal YSIRK domain, a C-terminal cell wall-anchoring LPXTG domain, and a central endo-alpha-N-acetylgalactosaminidase that removes an O-linked disaccharide from host glycoproteins.


The actual alignment was detected with superfamily member NF040533:

Pssm-ID: 439743 [Multi-domain]  Cd Length: 1694  Bit Score: 62.67  E-value: 2.17e-09
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 353906527    1 MKNPFFERRCRYSIRKLSVGACSLMIGAVLFA-GPALAEETAVPENSGANTELVSGESEHSTNEA-DKQNEGEHTRENKL 78
Cdd:NF040533    1 MDKGLFEKRCKYSIRKFSLGVASVMIGASFFGtSPVLADTAQVGSTANLPADLADALAKAKDDNGrDFEAPKAGENQGSP 80
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 353906527   79 EKAEGvatasetaeaasaAKPEEKASEVVAETPSAEaKPKSDKETEAKPEatnqgdESKPAAEANKT--EKEVQPDVPKN 156
Cdd:NF040533   81 EVTDG-------------PKTEEELLALEKEKSATE-KPKENKPAEAKPE------TAKTVTPEWQTvaRKEQQGTVEIR 140

                  ....*.
gi 353906527  157 TEKTLK 162
Cdd:NF040533  141 EENGVR 146
FIVAR pfam07554
FIVAR domain; This domain is found in a wide variety of contexts, but mostly occurring in cell ...
1223-1287 1.52e-06

FIVAR domain; This domain is found in a wide variety of contexts, but mostly occurring in cell wall associated proteins. A lack of conserved catalytic residues suggests that it is a binding domain. From context, possible substrates are hyaluronate or fibronectin (personal obs: C Yeats). This is further evidenced by. Possibly the exact substrate is N-acetyl glucosamine. Finding it in the same protein as pfam05089 further supports this proposal. It is found in the C-terminal part of Swiss:O82833, which is removed during maturation. Some of the proteins it is found in are involved in methicillin resistance. The name FIVAR derives from Found In Various Architectures.


Pssm-ID: 400096 [Multi-domain]  Cd Length: 69  Bit Score: 47.31  E-value: 1.52e-06
                           10        20        30        40        50        60
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*..
gi 353906527  1223 LKQKLAEVEAAKNKVVYNFASPEVKEAFLKAIEAAEQVLK--DHETSTQDQVNDRLNKLTEAHKALN 1287
Cdd:pfam07554    3 LKTSINDKNATKTSSNYINADNDKKAAYNNAITAAKAILNktNNPNATQEEVNQALTKLNTAINALN 69
Gram_pos_anchor pfam00746
LPXTG cell wall anchor motif;
1606-1645 8.55e-05

LPXTG cell wall anchor motif;


Pssm-ID: 366278 [Multi-domain]  Cd Length: 43  Bit Score: 41.37  E-value: 8.55e-05
                           10        20        30        40
                   ....*....|....*....|....*....|....*....|
gi 353906527  1606 ASKQLPNTGTADANEALIAGLASLgLASLALTLRRKREDK 1645
Cdd:pfam00746    5 KKKTLPKTGENSNIFLTAAGLLAL-LGGLLLLVKRRKKEK 43
F5_F8_type_C pfam00754
F5/8 type C domain; This domain is also known as the discoidin (DS) domain family.
820-935 1.26e-03

F5/8 type C domain; This domain is also known as the discoidin (DS) domain family.


Pssm-ID: 459925 [Multi-domain]  Cd Length: 127  Bit Score: 40.51  E-value: 1.26e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 353906527   820 EGGEGIEGMLNGTITSlsdKWSSAQLSGSVDIR--LTKPRTVVRWVMDHAGAGGEsvndglMNTKDFDLYYKDADGEWK- 896
Cdd:pfam00754    9 SGEGPAAAALDGDPNT---AWSAWSGDDPQWIQvdLGKPKKITGVVTQGRQDGSN------GYVTSYKIEYSLDGENWTt 79
                           90       100       110       120
                   ....*....|....*....|....*....|....*....|..
gi 353906527   897 -LAKEVRGNKA--HVTDITLDKPITAQDWRLNVVTSDNGTPW 935
Cdd:pfam00754   80 vKDEKIPGNNDnnTPVTNTFDPPIKARYVRIVPTSWNGGNGI 121
 
Name Accession Description Interval E-value
COG4724 COG4724
Endo-beta-N-acetylglucosaminidase D [Carbohydrate transport and metabolism];
172-786 2.93e-174

Endo-beta-N-acetylglucosaminidase D [Carbohydrate transport and metabolism];


Pssm-ID: 443759 [Multi-domain]  Cd Length: 662  Bit Score: 540.05  E-value: 2.93e-174
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 353906527  172 EELLKWEPGAREDDAINRGSVVLASRRTGHL--VNEKASKEAKVQALSNTNSKAKDHASVGGEEFKAYAFDYWQYLDSMV 249
Cdd:COG4724    43 EDLLNWSPETDPDARYNRSRVPLAPRFTGSAtqINPTLSPDAKVMSLAIDNPNTSGNPSQGGSDFNVYTFTYWQYIDYLV 122
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 353906527  250 FW-----EGLV--PTPDVIDAGHRNGVPVYGTLFFNWSNSIADQERFAEALKQDADGSFPIARKLVDMAKYYGYDGYFIN 322
Cdd:COG4724   123 YWggsagEGIIvpPSPDVIDAAHKNGVKVLGTVFFPPGAYGGKIEWVDAFLEKDEDGSFPVADKLIEIAQYYGFDGWFIN 202
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 353906527  323 QETTGdLVKPLGEKMRQFMLYSKEyaaKVNHPIKYSWYDAMTYNYGRYHQDGLGEYNYQFMQpEGDKVPADNFFANFNWD 402
Cdd:COG4724   203 QETNG-TDPELAKKMKEFLEYLKE---KSPENMEIMWYDSMLENGSVSWQNALNEKNDAFLQ-DGNKKVSDSMFLNFWWT 277
                         250       260       270       280       290       300       310       320
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 353906527  403 KAKN-DYTIATANWIGRNPYDVFAGLELQQGGsYKTKVKWNDILDENGKLRLSLGLFAPDTITSLGKTGEDYHKNEDIFF 481
Cdd:COG4724   278 GGSLlEKSRDTAKSLGRSPYDLYAGIDVQQNG-YNTRINWDALLDDNKKPPTSLGLYCPNWTFNSSKNPDDFYDNEQKFW 356
                         330       340       350       360       370       380       390       400
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 353906527  482 TGYQGDPTgQKPGDKDWYGIANLVADRTPAVGNTFTTSFNTGHGKKWFVDGKVSKDSEWNYRSVSGVLPTWRWWQTSTGE 561
Cdd:COG4724   357 VGPDGDPA-NTTDSNGWKGISTYVVEKSPVTSLPFVTNFNTGHGYKFYINGQQVSDGEWNNRSLQDVLPTWQWIVDSEGN 435
                         410       420       430       440       450       460       470       480
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 353906527  562 KLRAEYDFTDAYNGGNSLKFSGDVAGKTDQDVRLYSTKLEVTEKTKLRVAHKGGKGSKVYMAFSTTPDYK-FDDADAwkE 640
Cdd:COG4724   436 SLTPSFDYTDAYNGGSSLKLEGKLKAGGETTIKLYKTDLPITDDTKLSVVYKTDAKVKLSLGLTFKDGPTeFITFDL--G 513
                         490       500       510       520       530       540       550       560
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 353906527  641 LTLSDNWTNEEFDLSSLAGKTIYAVKLFFEHEGAVKDYQFNLGQLTISDNHQEPQSPTSFSVV--KQSLKNAQEAEAVVQ 718
Cdd:COG4724   514 TTSNNGWTTVTVDLSAYAGKTIAAISLKFSSTTDVDNYKINLGQLAIFNGTTPPSAPPNNTTVsgQTLVDASASAFRLNW 593
                         570       580       590       600       610       620
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*...
gi 353906527  719 FKGNKDADFYEVYEKDGDSWKLLTGSSSTTIYLPKVSRSASAQGTTQELKVVAVGKNGVRSEAATTTF 786
Cdd:COG4724   594 WSDASAYGEYHVLQVTNPNAKWLGTNTNNAATVAKTSDRTVNAILFTITPIGAESISTPNKTATTTTI 661
Glyco_hydro_85 pfam03644
Glycosyl hydrolase family 85; Family of endo-beta-N-acetylglucosaminidases. These enzymes work ...
230-525 5.39e-107

Glycosyl hydrolase family 85; Family of endo-beta-N-acetylglucosaminidases. These enzymes work on a broad spectrum of substrates.


Pssm-ID: 461002  Cd Length: 292  Bit Score: 342.73  E-value: 5.39e-107
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 353906527   230 GGEEFKAYAFDYWQYLDSMVFWEG---LVPTPDVIDAGHRNGVPVYGTLFFNWSNSIADQERFaeaLKQDADGSFPIARK 306
Cdd:pfam03644    1 GGNDFDAYTFYYWQYVDTFVYFSHsrvTIPPPGWINAAHRNGVPVLGTFIFEWDEGGEWLEEL---LEKDEDGAFPVADK 77
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 353906527   307 LVDMAKYYGYDGYFINQETTGDLVKPLGEKMRQFMLYSKEYAAKVNHPIKYSWYDAMTYNYGRYHQDGLGEYNYQFMQpe 386
Cdd:pfam03644   78 LVEIAKYYGFDGWLINIETAFLLDPELAENLKEFLRYLREELHERVPGSEVIWYDSVTTDGKLSWQNELNEKNAPFFQ-- 155
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 353906527   387 gdkvPADNFFANFNWDKAKNDYTIATANWIGRNPYDVFAGLEL-----QQGGSYKTKVKWNDIldenGKLRLSLGLFAPD 461
Cdd:pfam03644  156 ----AADSIFLNYWWTESNLESSAELAGSLGRRPYDVYVGIDVfgrgtVGGGGFNTNVALDLI----AKAGLSAALFAPG 227
                          250       260       270       280       290       300
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*.
gi 353906527   462 TI--TSLGKTGEDYHKNEDIFFTGYQGDPTgQKPGDKDWYGIANLVADRTPAVGNTFTTSFNTGHG 525
Cdd:pfam03644  228 WTyeTFQSGSTPDFLERERRFWVGPKGDPD-PDSSDNSWKGIANYVAERSAISSLPFYTNFNTGSG 292
GH85_ENGase cd06547
Endo-beta-N-acetylglucosaminidase (ENGase) hydrolyzes the N-N'-diacetylchitobiosyl core of ...
225-560 3.57e-93

Endo-beta-N-acetylglucosaminidase (ENGase) hydrolyzes the N-N'-diacetylchitobiosyl core of N-glycosylproteins. The beta-1,4-glycosyl bond located between two N-acetylglucosamine residues is hydrolyzed such that N-acetylglucosamine 1 remains with the protein and N-acetylglucosamine 2 forms the reducing end of the released glycan. ENGase is a key enzyme in the processing of free oligosaccharides in the cytosol of eukaryotes. Oligosaccharides formed in the lumen of the endoplasmic reticulum are transported into the cytosol where they are catabolized by cytosolic ENGases and other enzymes, possibly to maximize the reutilization of the component sugars. ENGases have an eight-stranded alpha/beta barrel topology and are classified as a family 85 glycosyl hydrolase (GH85) domain. The GH85 ENGases are sequence-similar to the family 18 glycosyl hydrolases, also known as GH18 chitinases. An ENGase-like protein is also found in bacteria and is included in this alignment model.


Pssm-ID: 119364  Cd Length: 339  Bit Score: 305.38  E-value: 3.57e-93
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 353906527  225 DHASVGGEEFKAYAFDYWQYLDSMVFWEG---LVPTPDVIDAGHRNGVPVYGTLFFNWSNSIADQERFaeaLKQDADGSF 301
Cdd:cd06547    13 DRPSQGSNSFNAYTFSYWQYVDTFVYFSHsavTIPPADWINAAHRNGVPVLGTFIFEWTGQVEWLEDF---LKKDEDGSF 89
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 353906527  302 PIARKLVDMAKYYGYDGYFINQETTGdLVKPLGEKMRQFMLYSKEYAAKVNHPIKYSWYDAMTYNyGRYH-QDGLGEYNY 380
Cdd:cd06547    90 PVADKLVEVAKYYGFDGWLINIETEL-GDAEKAKRLIAFLRYLKAKLHENVPGSLVIWYDSMTED-GKLSwQNELNSKNK 167
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 353906527  381 QFMqpegdKVpADNFFANFNWDKAKNDYTIATANWIGRNPYDVFAGLELQQGGsYKTKVKWND--ILDENGKLRLSLGLF 458
Cdd:cd06547   168 PFF-----DV-CDGIFLNYWWTEESLERSVQLAEGLGRSPYDVYVGVDVWGRG-TKGGGGWNSdkALDEIKKAGLSVALF 240
                         250       260       270       280       290       300       310       320
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 353906527  459 APD-TITSLGKTGEDYHKNEDIFFTGYQGDPTGqkpgDKDWYGIANLVADRTPAVGNTFTTSFNTGHGKKWFVDGKVSKD 537
Cdd:cd06547   241 APGwTYESFEEPDFFVKNESRFGESGDPFLTND----DKFWSGLATYVPEKSPITSLPFVTNFNTGSGYAFYVNGKKVSD 316
                         330       340
                  ....*....|....*....|...
gi 353906527  538 SEWNYRSVSGVLPTWRWWQTSTG 560
Cdd:cd06547   317 SPWNNLSLQDILPTYRWIVSSNG 339
Big_3 pfam07523
Bacterial Ig-like domain (group 3); This family consists of bacterial domains with an Ig-like ...
1144-1212 1.29e-12

Bacterial Ig-like domain (group 3); This family consists of bacterial domains with an Ig-like fold. Members of this family are found in a variety of bacterial surface proteins.


Pssm-ID: 400072 [Multi-domain]  Cd Length: 67  Bit Score: 64.24  E-value: 1.29e-12
                           10        20        30        40        50        60        70
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 353906527  1144 QEPKKDYLVGDSLDLSEGRFAVAYSNDTMEehsFTDEGVEISG-YDAQKTGRQTLTLRYQGHEVNFDVLV 1212
Cdd:pfam07523    1 QPHKTTYYVGDSWDAEDNFVSATYKDGDAE---VPFDDVEVSGtVDSTKAGEYTVTYTYKGVSATFTVTV 67
YabE COG3583
Uncharacterized conserved protein YabE, contains G5 and tandem DUF348 domains [Function ...
1512-1607 1.81e-12

Uncharacterized conserved protein YabE, contains G5 and tandem DUF348 domains [Function unknown];


Pssm-ID: 442802 [Multi-domain]  Cd Length: 335  Bit Score: 70.67  E-value: 1.81e-12
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 353906527 1512 KLEVQEEKVAFHRQEHENTEMLVGEQRVIIQGRDGLLRHVFEVD-ENGQ---RRLRSTEVIQEAIPEIVEIGTKVKTVPA 1587
Cdd:COG3583   152 KTVTEEEPIPFETVRKEDPSLPKGETKVVQEGVPGVKEVTYRVTyENGKevsREVVSEKVTKEPVDEVVAVGTKPRPAPA 231
                          90       100
                  ....*....|....*....|
gi 353906527 1588 VVATQEKPAQNTAVKSEEAS 1607
Cdd:COG3583   232 PVPAGSGSGGGGSSTGSGGY 251
Big_3 pfam07523
Bacterial Ig-like domain (group 3); This family consists of bacterial domains with an Ig-like ...
1055-1126 7.56e-12

Bacterial Ig-like domain (group 3); This family consists of bacterial domains with an Ig-like fold. Members of this family are found in a variety of bacterial surface proteins.


Pssm-ID: 400072 [Multi-domain]  Cd Length: 67  Bit Score: 61.93  E-value: 7.56e-12
                           10        20        30        40        50        60        70
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|...
gi 353906527  1055 GPKKTSYAEGEDLDLRGGVLRVQYEGGTEdeliRLTHAGVSVSG-FDTHHKGEQNLTLQYLGqpVNANLSVTV 1126
Cdd:pfam07523    1 QPHKTTYYVGDSWDAEDNFVSATYKDGDA----EVPFDDVEVSGtVDSTKAGEYTVTYTYKG--VSATFTVTV 67
YSIRK_signal TIGR01168
Gram-positive signal peptide, YSIRK family; Many surface proteins found in Streptococcus, ...
2-40 2.23e-10

Gram-positive signal peptide, YSIRK family; Many surface proteins found in Streptococcus, Staphylococcus, and related lineages share apparently homologous signal sequences. A motif resembling [YF]SIRKxxxGxxS[VIA] appears at the start of the transmembrane domain. The GxxS motif appears perfectly conserved, suggesting a specific function and not just homology. There is a strong correlation between proteins carrying this region at the N-terminus and those carrying the Gram-positive anchor domain with the LPXTG sortase processing site at the C-terminus.


Pssm-ID: 273479 [Multi-domain]  Cd Length: 39  Bit Score: 57.11  E-value: 2.23e-10
                           10        20        30
                   ....*....|....*....|....*....|....*....
gi 353906527     2 KNPFFERRCRYSIRKLSVGACSLMIGAVLFAGPALAEET 40
Cdd:TIGR01168    1 AKKFNEKQQKYSIRKLSVGVASVLVASLFFGGGVAAAES 39
G5 pfam07501
G5 domain; This domain is found in a wide range of extracellular proteins. It is found ...
1512-1581 3.03e-10

G5 domain; This domain is found in a wide range of extracellular proteins. It is found tandemly repeated in up to 8 copies. It is found in the N-terminus of peptidases belonging to the M26 family which cleave human IgA. The domain is also found in proteins involved in metabolism of bacterial cell walls suggesting this domain may have an adhesive function.


Pssm-ID: 462185 [Multi-domain]  Cd Length: 75  Bit Score: 57.95  E-value: 3.03e-10
                           10        20        30        40        50        60        70
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....
gi 353906527  1512 KLEVQEEKVAFHRQEHENTEMLVGEQRVIIQGRDGLLRHVFEVD-ENGQ---RRLRSTEVIQEAIPEIVEIGTK 1581
Cdd:pfam07501    2 KTVTEEEEIPFETVTKEDPSLPKGEEKVVQEGKPGEKEVTYKVTyVNGKevsREVVSEEVTKEPVDEVVAVGTK 75
endo_SpGH101 NF040533
SpGH101 family endo-alpha-N-acetylgalactosaminidase; Members of this family are streptococcal ...
1-162 2.17e-09

SpGH101 family endo-alpha-N-acetylgalactosaminidase; Members of this family are streptococcal surface proteins with a complex (and somewhat variable) architecture that includes a crosswall-targeting N-terminal YSIRK domain, a C-terminal cell wall-anchoring LPXTG domain, and a central endo-alpha-N-acetylgalactosaminidase that removes an O-linked disaccharide from host glycoproteins.


Pssm-ID: 439743 [Multi-domain]  Cd Length: 1694  Bit Score: 62.67  E-value: 2.17e-09
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 353906527    1 MKNPFFERRCRYSIRKLSVGACSLMIGAVLFA-GPALAEETAVPENSGANTELVSGESEHSTNEA-DKQNEGEHTRENKL 78
Cdd:NF040533    1 MDKGLFEKRCKYSIRKFSLGVASVMIGASFFGtSPVLADTAQVGSTANLPADLADALAKAKDDNGrDFEAPKAGENQGSP 80
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 353906527   79 EKAEGvatasetaeaasaAKPEEKASEVVAETPSAEaKPKSDKETEAKPEatnqgdESKPAAEANKT--EKEVQPDVPKN 156
Cdd:NF040533   81 EVTDG-------------PKTEEELLALEKEKSATE-KPKENKPAEAKPE------TAKTVTPEWQTvaRKEQQGTVEIR 140

                  ....*.
gi 353906527  157 TEKTLK 162
Cdd:NF040533  141 EENGVR 146
YSIRK_signal pfam04650
YSIRK type signal peptide; Many surface proteins found in Streptococcus, Staphylococcus, and ...
7-31 6.12e-08

YSIRK type signal peptide; Many surface proteins found in Streptococcus, Staphylococcus, and related lineages share apparently homologous signal sequences. A motif resembling [YF]SIRKxxxGxxS[VIA] appears at the start of the transmembrane domain. The GxxS motif appears perfectly conserved, suggesting a specific function and not just homology. There is a strong correlation between proteins carrying this region at the N-terminus and those carrying the Gram-positive anchor domain with the LPXTG sortase processing site at the C-terminus.


Pssm-ID: 428049 [Multi-domain]  Cd Length: 26  Bit Score: 49.69  E-value: 6.12e-08
                           10        20
                   ....*....|....*....|....*
gi 353906527     7 ERRCRYSIRKLSVGACSLMIGAVLF 31
Cdd:pfam04650    1 EKKQRYSIRKLSVGVASVLIGTLLF 25
FIVAR pfam07554
FIVAR domain; This domain is found in a wide variety of contexts, but mostly occurring in cell ...
1223-1287 1.52e-06

FIVAR domain; This domain is found in a wide variety of contexts, but mostly occurring in cell wall associated proteins. A lack of conserved catalytic residues suggests that it is a binding domain. From context, possible substrates are hyaluronate or fibronectin (personal obs: C Yeats). This is further evidenced by. Possibly the exact substrate is N-acetyl glucosamine. Finding it in the same protein as pfam05089 further supports this proposal. It is found in the C-terminal part of Swiss:O82833, which is removed during maturation. Some of the proteins it is found in are involved in methicillin resistance. The name FIVAR derives from Found In Various Architectures.


Pssm-ID: 400096 [Multi-domain]  Cd Length: 69  Bit Score: 47.31  E-value: 1.52e-06
                           10        20        30        40        50        60
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*..
gi 353906527  1223 LKQKLAEVEAAKNKVVYNFASPEVKEAFLKAIEAAEQVLK--DHETSTQDQVNDRLNKLTEAHKALN 1287
Cdd:pfam07554    3 LKTSINDKNATKTSSNYINADNDKKAAYNNAITAAKAILNktNNPNATQEEVNQALTKLNTAINALN 69
PspC_subgroup_1 NF033838
pneumococcal surface protein PspC, choline-binding form; The pneumococcal surface protein PspC, ...
7-215 4.14e-05

pneumococcal surface protein PspC, choline-binding form; The pneumococcal surface protein PspC, as described in Streptococcus pneumoniae, is a repetitive and highly variable protein, recognized by a conserved N-terminal domain and also by genomic location. This form, subgroup 1, has variable numbers of a choline-binding repeat in the C-terminal region, and is also known as choline-binding protein A. The other form, subgroup 2, is anchored covalently after cleavage by sortase at a C-terminal LPXTG site.


Pssm-ID: 468201 [Multi-domain]  Cd Length: 684  Bit Score: 48.47  E-value: 4.14e-05
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 353906527    7 ERRCRYSIRKLSVGACSLMIGAVLFAGPALAEetavpENSGANTELVSGESEHSTNEADKQNEG---------------- 70
Cdd:NF033838    7 ERKVHYSIRKFSIGVASVVVASLFLGGVVHAE-----EVRGGNNPTVTSSGNESQKEHAKEVEShlekilseiqksldkr 81
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 353906527   71 EHTRENKL-EKAEGVATASETAEAASAAKPE-EKASEVVAETPSAEAKPKSD--KETEAKPEATNQGDESKPAAEANKTE 146
Cdd:NF033838   82 KHTQNVALnKKLSDIKTEYLYELNVLKEKSEaELTSKTKKELDAAFEQFKKDtlEPGKKVAEATKKVEEAEKKAKDQKEE 161
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 353906527  147 KevQPDVPKNTEKTLK----PKEIKFNSWE-ELLKWEP-GAREDDAINRGSVVLASRRTGHLV-------NEKASKEAKV 213
Cdd:NF033838  162 D--RRNYPTNTYKTLEleiaESDVEVKKAElELVKEEAkEPRDEEKIKQAKAKVESKKAEATRlekiktdREKAEEEAKR 239

                  ..
gi 353906527  214 QA 215
Cdd:NF033838  240 RA 241
Gram_pos_anchor pfam00746
LPXTG cell wall anchor motif;
1606-1645 8.55e-05

LPXTG cell wall anchor motif;


Pssm-ID: 366278 [Multi-domain]  Cd Length: 43  Bit Score: 41.37  E-value: 8.55e-05
                           10        20        30        40
                   ....*....|....*....|....*....|....*....|
gi 353906527  1606 ASKQLPNTGTADANEALIAGLASLgLASLALTLRRKREDK 1645
Cdd:pfam00746    5 KKKTLPKTGENSNIFLTAAGLLAL-LGGLLLLVKRRKKEK 43
PTZ00121 PTZ00121
MAEBL; Provisional
56-238 1.64e-04

MAEBL; Provisional


Pssm-ID: 173412 [Multi-domain]  Cd Length: 2084  Bit Score: 46.67  E-value: 1.64e-04
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 353906527   56 ESEHSTNEADKQNEGEHTRENKLEKAEgvataSETAEAASAAKPEEKASEVVAETPSAEAKPKSDK-----ETEAKPEAT 130
Cdd:PTZ00121 1441 EEAKKADEAKKKAEEAKKAEEAKKKAE-----EAKKADEAKKKAEEAKKADEAKKKAEEAKKKADEakkaaEAKKKADEA 1515
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 353906527  131 NQGDESKPAAEANKTEKEVQPDVPKNTEKTLKPKEIK----FNSWEELLKWEPGAREDDAINrgsvvLASRRTGHLVNEK 206
Cdd:PTZ00121 1516 KKAEEAKKADEAKKAEEAKKADEAKKAEEKKKADELKkaeeLKKAEEKKKAEEAKKAEEDKN-----MALRKAEEAKKAE 1590
                         170       180       190
                  ....*....|....*....|....*....|..
gi 353906527  207 ASKEAKVQALSNTNSKAKDHASVGGEEFKAYA 238
Cdd:PTZ00121 1591 EARIEEVMKLYEEEKKMKAEEAKKAEEAKIKA 1622
LPXTG_anchor TIGR01167
LPXTG-motif cell wall anchor domain; This model describes the LPXTG motif-containing region ...
1609-1642 2.99e-04

LPXTG-motif cell wall anchor domain; This model describes the LPXTG motif-containing region found at the C-terminus of many surface proteins of Streptococcus and Streptomyces species. Cleavage between the Thr and Gly by sortase or a related enzyme leads to covalent anchoring at the new C-terminal Thr to the cell wall. Hits that do not lie at the C-terminus or are not found in Gram-positive bacteria are probably false-positive. A common feature of proteins containing this domain appears to be a high proportion of charged and zwitterionic residues immediately upstream of the LPXTG motif. This model differs from other descriptions of the LPXTG region by including a portion of that upstream charged region. [Cell envelope, Other]
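The cleavage rule described above (sortase cuts between the Thr and the Gly of LPXTG) can be turned into a coordinate lookup. The sketch below is a plain pattern search under the assumption that `X` is any residue; it is not the TIGR01167 model and says nothing about whether cleavage actually occurs. The segment is the query row of the LPXTG_anchor alignment for this hit (residues 1609-1642), and `sortase_cleavage_site` is a hypothetical helper name.

```python
import re

# LPXTG with X = any single residue (an assumption for illustration).
LPXTG = re.compile(r"LP.TG")


def sortase_cleavage_site(segment: str, segment_start: int):
    """Return (Thr position, Gly position) flanking the predicted cut.

    Positions are 1-based protein coordinates; returns None when the
    segment contains no LPXTG-like motif.
    """
    match = LPXTG.search(segment)
    if match is None:
        return None
    thr_pos = segment_start + match.start() + 3  # Thr is the 4th motif residue
    return thr_pos, thr_pos + 1


# Query row of the LPXTG_anchor (TIGR01167) alignment, residues 1609-1642.
segment = "QLPNTGTADANEALIAGLASLGLASLALTLRRKR"

if __name__ == "__main__":
    print(sortase_cleavage_site(segment, 1609))
```

For this protein the motif is LPNTG, so the predicted cut falls between Thr and Gly immediately after the motif's Asn.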


Pssm-ID: 273478 [Multi-domain]  Cd Length: 34  Bit Score: 39.38  E-value: 2.99e-04
                           10        20        30
                   ....*....|....*....|....*....|....
gi 353906527  1609 QLPNTGTADANEALIAGLASLGLASLALTLRRKR 1642
Cdd:TIGR01167    1 KLPKTGESGNSLLLLLGLLLLGLGGLLLRKRKKK 34
F5_F8_type_C pfam00754
F5/8 type C domain; This domain is also known as the discoidin (DS) domain family.
820-935 1.26e-03

F5/8 type C domain; This domain is also known as the discoidin (DS) domain family.


Pssm-ID: 459925 [Multi-domain]  Cd Length: 127  Bit Score: 40.51  E-value: 1.26e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 353906527   820 EGGEGIEGMLNGTITSlsdKWSSAQLSGSVDIR--LTKPRTVVRWVMDHAGAGGEsvndglMNTKDFDLYYKDADGEWK- 896
Cdd:pfam00754    9 SGEGPAAAALDGDPNT---AWSAWSGDDPQWIQvdLGKPKKITGVVTQGRQDGSN------GYVTSYKIEYSLDGENWTt 79
                           90       100       110       120
                   ....*....|....*....|....*....|....*....|..
gi 353906527   897 -LAKEVRGNKA--HVTDITLDKPITAQDWRLNVVTSDNGTPW 935
Cdd:pfam00754   80 vKDEKIPGNNDnnTPVTNTFDPPIKARYVRIVPTSWNGGNGI 121
 
Name Accession Description Interval E-value
COG4724 COG4724
Endo-beta-N-acetylglucosaminidase D [Carbohydrate transport and metabolism];
172-786 2.93e-174

Endo-beta-N-acetylglucosaminidase D [Carbohydrate transport and metabolism];


Pssm-ID: 443759 [Multi-domain]  Cd Length: 662  Bit Score: 540.05  E-value: 2.93e-174
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 353906527  172 EELLKWEPGAREDDAINRGSVVLASRRTGHL--VNEKASKEAKVQALSNTNSKAKDHASVGGEEFKAYAFDYWQYLDSMV 249
Cdd:COG4724    43 EDLLNWSPETDPDARYNRSRVPLAPRFTGSAtqINPTLSPDAKVMSLAIDNPNTSGNPSQGGSDFNVYTFTYWQYIDYLV 122
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 353906527  250 FW-----EGLV--PTPDVIDAGHRNGVPVYGTLFFNWSNSIADQERFAEALKQDADGSFPIARKLVDMAKYYGYDGYFIN 322
Cdd:COG4724   123 YWggsagEGIIvpPSPDVIDAAHKNGVKVLGTVFFPPGAYGGKIEWVDAFLEKDEDGSFPVADKLIEIAQYYGFDGWFIN 202
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 353906527  323 QETTGdLVKPLGEKMRQFMLYSKEyaaKVNHPIKYSWYDAMTYNYGRYHQDGLGEYNYQFMQpEGDKVPADNFFANFNWD 402
Cdd:COG4724   203 QETNG-TDPELAKKMKEFLEYLKE---KSPENMEIMWYDSMLENGSVSWQNALNEKNDAFLQ-DGNKKVSDSMFLNFWWT 277
                         250       260       270       280       290       300       310       320
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 353906527  403 KAKN-DYTIATANWIGRNPYDVFAGLELQQGGsYKTKVKWNDILDENGKLRLSLGLFAPDTITSLGKTGEDYHKNEDIFF 481
Cdd:COG4724   278 GGSLlEKSRDTAKSLGRSPYDLYAGIDVQQNG-YNTRINWDALLDDNKKPPTSLGLYCPNWTFNSSKNPDDFYDNEQKFW 356
                         330       340       350       360       370       380       390       400
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 353906527  482 TGYQGDPTgQKPGDKDWYGIANLVADRTPAVGNTFTTSFNTGHGKKWFVDGKVSKDSEWNYRSVSGVLPTWRWWQTSTGE 561
Cdd:COG4724   357 VGPDGDPA-NTTDSNGWKGISTYVVEKSPVTSLPFVTNFNTGHGYKFYINGQQVSDGEWNNRSLQDVLPTWQWIVDSEGN 435
                         410       420       430       440       450       460       470       480
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 353906527  562 KLRAEYDFTDAYNGGNSLKFSGDVAGKTDQDVRLYSTKLEVTEKTKLRVAHKGGKGSKVYMAFSTTPDYK-FDDADAwkE 640
Cdd:COG4724   436 SLTPSFDYTDAYNGGSSLKLEGKLKAGGETTIKLYKTDLPITDDTKLSVVYKTDAKVKLSLGLTFKDGPTeFITFDL--G 513
                         490       500       510       520       530       540       550       560
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 353906527  641 LTLSDNWTNEEFDLSSLAGKTIYAVKLFFEHEGAVKDYQFNLGQLTISDNHQEPQSPTSFSVV--KQSLKNAQEAEAVVQ 718
Cdd:COG4724   514 TTSNNGWTTVTVDLSAYAGKTIAAISLKFSSTTDVDNYKINLGQLAIFNGTTPPSAPPNNTTVsgQTLVDASASAFRLNW 593
                         570       580       590       600       610       620
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*...
gi 353906527  719 FKGNKDADFYEVYEKDGDSWKLLTGSSSTTIYLPKVSRSASAQGTTQELKVVAVGKNGVRSEAATTTF 786
Cdd:COG4724   594 WSDASAYGEYHVLQVTNPNAKWLGTNTNNAATVAKTSDRTVNAILFTITPIGAESISTPNKTATTTTI 661
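Each hit in this listing carries a one-line summary of the form "Pssm-ID: ... Cd Length: ... Bit Score: ... E-value: ...". The following is a minimal sketch, assuming that exact plain-text layout, of how such a line could be parsed into typed fields. The function name `parse_summary` and the field names are hypothetical and are not part of any NCBI tool.

```python
import re

# Matches summary lines as they appear in this listing, e.g.
# "Pssm-ID: 443759 [Multi-domain]  Cd Length: 662  Bit Score: 540.05  E-value: 2.93e-174"
# The bracketed "[Multi-domain]" flag is optional.
SUMMARY_RE = re.compile(
    r"Pssm-ID:\s*(?P<pssm_id>\d+)"
    r"(?:\s*\[(?P<flag>[^\]]+)\])?"
    r"\s+Cd Length:\s*(?P<cd_length>\d+)"
    r"\s+Bit Score:\s*(?P<bit_score>[\d.]+)"
    r"\s+E-value:\s*(?P<e_value>[\d.eE+-]+)"
)


def parse_summary(line):
    """Return the summary fields as a dict, or None if the line does not match."""
    m = SUMMARY_RE.search(line)
    if m is None:
        return None
    return {
        "pssm_id": int(m.group("pssm_id")),
        "multi_domain": m.group("flag") == "Multi-domain",
        "cd_length": int(m.group("cd_length")),
        "bit_score": float(m.group("bit_score")),
        "e_value": float(m.group("e_value")),
    }


if __name__ == "__main__":
    line = ("Pssm-ID: 443759 [Multi-domain]  Cd Length: 662  "
            "Bit Score: 540.05  E-value: 2.93e-174")
    print(parse_summary(line))
```

Parsed E-values can then be compared against the page's reporting threshold (0.01 in the search parameters listed at the end of this page).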
Glyco_hydro_85 pfam03644
Glycosyl hydrolase family 85; Family of endo-beta-N-acetylglucosaminidases. These enzymes work ...
230-525 5.39e-107

Glycosyl hydrolase family 85; Family of endo-beta-N-acetylglucosaminidases. These enzymes work on a broad spectrum of substrates.


Pssm-ID: 461002  Cd Length: 292  Bit Score: 342.73  E-value: 5.39e-107
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 353906527   230 GGEEFKAYAFDYWQYLDSMVFWEG---LVPTPDVIDAGHRNGVPVYGTLFFNWSNSIADQERFaeaLKQDADGSFPIARK 306
Cdd:pfam03644    1 GGNDFDAYTFYYWQYVDTFVYFSHsrvTIPPPGWINAAHRNGVPVLGTFIFEWDEGGEWLEEL---LEKDEDGAFPVADK 77
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 353906527   307 LVDMAKYYGYDGYFINQETTGDLVKPLGEKMRQFMLYSKEYAAKVNHPIKYSWYDAMTYNYGRYHQDGLGEYNYQFMQpe 386
Cdd:pfam03644   78 LVEIAKYYGFDGWLINIETAFLLDPELAENLKEFLRYLREELHERVPGSEVIWYDSVTTDGKLSWQNELNEKNAPFFQ-- 155
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 353906527   387 gdkvPADNFFANFNWDKAKNDYTIATANWIGRNPYDVFAGLEL-----QQGGSYKTKVKWNDIldenGKLRLSLGLFAPD 461
Cdd:pfam03644  156 ----AADSIFLNYWWTESNLESSAELAGSLGRRPYDVYVGIDVfgrgtVGGGGFNTNVALDLI----AKAGLSAALFAPG 227
                          250       260       270       280       290       300
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*.
gi 353906527   462 TI--TSLGKTGEDYHKNEDIFFTGYQGDPTgQKPGDKDWYGIANLVADRTPAVGNTFTTSFNTGHG 525
Cdd:pfam03644  228 WTyeTFQSGSTPDFLERERRFWVGPKGDPD-PDSSDNSWKGIANYVAERSAISSLPFYTNFNTGSG 292
GH85_ENGase cd06547
Endo-beta-N-acetylglucosaminidase (ENGase) hydrolyzes the N-N'-diacetylchitobiosyl core of ...
225-560 3.57e-93

Endo-beta-N-acetylglucosaminidase (ENGase) hydrolyzes the N-N'-diacetylchitobiosyl core of N-glycosylproteins. The beta-1,4-glycosyl bond located between two N-acetylglucosamine residues is hydrolyzed such that N-acetylglucosamine 1 remains with the protein and N-acetylglucosamine 2 forms the reducing end of the released glycan. ENGase is a key enzyme in the processing of free oligosaccharides in the cytosol of eukaryotes. Oligosaccharides formed in the lumen of the endoplasmic reticulum are transported into the cytosol where they are catabolized by cytosolic ENGases and other enzymes, possibly to maximize the reutilization of the component sugars. ENGases have an eight-stranded alpha/beta barrel topology and are classified as a family 85 glycosyl hydrolase (GH85) domain. The GH85 ENGases are sequence-similar to the family 18 glycosyl hydrolases, also known as GH18 chitinases. An ENGase-like protein is also found in bacteria and is included in this alignment model.


Pssm-ID: 119364  Cd Length: 339  Bit Score: 305.38  E-value: 3.57e-93
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 353906527  225 DHASVGGEEFKAYAFDYWQYLDSMVFWEG---LVPTPDVIDAGHRNGVPVYGTLFFNWSNSIADQERFaeaLKQDADGSF 301
Cdd:cd06547    13 DRPSQGSNSFNAYTFSYWQYVDTFVYFSHsavTIPPADWINAAHRNGVPVLGTFIFEWTGQVEWLEDF---LKKDEDGSF 89
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 353906527  302 PIARKLVDMAKYYGYDGYFINQETTGdLVKPLGEKMRQFMLYSKEYAAKVNHPIKYSWYDAMTYNyGRYH-QDGLGEYNY 380
Cdd:cd06547    90 PVADKLVEVAKYYGFDGWLINIETEL-GDAEKAKRLIAFLRYLKAKLHENVPGSLVIWYDSMTED-GKLSwQNELNSKNK 167
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 353906527  381 QFMqpegdKVpADNFFANFNWDKAKNDYTIATANWIGRNPYDVFAGLELQQGGsYKTKVKWND--ILDENGKLRLSLGLF 458
Cdd:cd06547   168 PFF-----DV-CDGIFLNYWWTEESLERSVQLAEGLGRSPYDVYVGVDVWGRG-TKGGGGWNSdkALDEIKKAGLSVALF 240
                         250       260       270       280       290       300       310       320
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 353906527  459 APD-TITSLGKTGEDYHKNEDIFFTGYQGDPTGqkpgDKDWYGIANLVADRTPAVGNTFTTSFNTGHGKKWFVDGKVSKD 537
Cdd:cd06547   241 APGwTYESFEEPDFFVKNESRFGESGDPFLTND----DKFWSGLATYVPEKSPITSLPFVTNFNTGSGYAFYVNGKKVSD 316
                         330       340
                  ....*....|....*....|...
gi 353906527  538 SEWNYRSVSGVLPTWRWWQTSTG 560
Cdd:cd06547   317 SPWNNLSLQDILPTYRWIVSSNG 339
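The intervals reported for the two GH85 models (230-525 and 225-560) both fall inside the COG4724 interval (172-786). As a rough sketch, assuming intervals are 1-based and inclusive as shown in the Interval column, nesting and overlap between hits can be checked like this. The helper names are hypothetical.

```python
def overlap_length(a, b):
    """Number of residues shared by two inclusive (start, end) intervals."""
    start = max(a[0], b[0])
    end = min(a[1], b[1])
    return max(0, end - start + 1)


def contains(outer, inner):
    """True if the inner interval lies entirely within the outer interval."""
    return outer[0] <= inner[0] and inner[1] <= outer[1]


if __name__ == "__main__":
    # Intervals copied from the hit list on this page.
    hits = {
        "COG4724": (172, 786),
        "Glyco_hydro_85": (230, 525),
        "GH85_ENGase": (225, 560),
    }
    for name in ("Glyco_hydro_85", "GH85_ENGase"):
        print(name, "inside COG4724:", contains(hits["COG4724"], hits[name]))
```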
Big_3 pfam07523
Bacterial Ig-like domain (group 3); This family consists of bacterial domains with an Ig-like ...
1144-1212 1.29e-12

Bacterial Ig-like domain (group 3); This family consists of bacterial domains with an Ig-like fold. Members of this family are found in a variety of bacterial surface proteins.


Pssm-ID: 400072 [Multi-domain]  Cd Length: 67  Bit Score: 64.24  E-value: 1.29e-12
                           10        20        30        40        50        60        70
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 353906527  1144 QEPKKDYLVGDSLDLSEGRFAVAYSNDTMEehsFTDEGVEISG-YDAQKTGRQTLTLRYQGHEVNFDVLV 1212
Cdd:pfam07523    1 QPHKTTYYVGDSWDAEDNFVSATYKDGDAE---VPFDDVEVSGtVDSTKAGEYTVTYTYKGVSATFTVTV 67
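The paired rows above can be compared column by column to estimate how similar the query is to the domain model's representative sequence. The sketch below assumes that "-" marks a gap and that lowercase letters mark residues outside the aligned block, and it skips both. The function name `aligned_identity` is hypothetical, and the result is only a rough fraction, not a score reported by CD-Search.

```python
def aligned_identity(query_row, subject_row):
    """Fraction of identical residues over columns where both rows are aligned.

    Columns containing a gap ('-') or a lowercase (assumed unaligned) residue
    in either row are ignored.
    """
    if len(query_row) != len(subject_row):
        raise ValueError("alignment rows must have the same number of columns")
    aligned = 0
    identical = 0
    for q, s in zip(query_row, subject_row):
        if q == "-" or s == "-" or q.islower() or s.islower():
            continue
        aligned += 1
        if q == s:
            identical += 1
    return identical / aligned if aligned else 0.0


if __name__ == "__main__":
    # Rows copied from the Big_3 (pfam07523) alignment at residues 1144-1212.
    query = "QEPKKDYLVGDSLDLSEGRFAVAYSNDTMEehsFTDEGVEISG-YDAQKTGRQTLTLRYQGHEVNFDVLV"
    model = "QPHKTTYYVGDSWDAEDNFVSATYKDGDAE---VPFDDVEVSGtVDSTKAGEYTVTYTYKGVSATFTVTV"
    print(f"identity over aligned columns: {aligned_identity(query, model):.2f}")
```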
YabE COG3583
Uncharacterized conserved protein YabE, contains G5 and tandem DUF348 domains [Function ...
1512-1607 1.81e-12

Uncharacterized conserved protein YabE, contains G5 and tandem DUF348 domains [Function unknown];


Pssm-ID: 442802 [Multi-domain]  Cd Length: 335  Bit Score: 70.67  E-value: 1.81e-12
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 353906527 1512 KLEVQEEKVAFHRQEHENTEMLVGEQRVIIQGRDGLLRHVFEVD-ENGQ---RRLRSTEVIQEAIPEIVEIGTKVKTVPA 1587
Cdd:COG3583   152 KTVTEEEPIPFETVRKEDPSLPKGETKVVQEGVPGVKEVTYRVTyENGKevsREVVSEKVTKEPVDEVVAVGTKPRPAPA 231
                          90       100
                  ....*....|....*....|
gi 353906527 1588 VVATQEKPAQNTAVKSEEAS 1607
Cdd:COG3583   232 PVPAGSGSGGGGSSTGSGGY 251
Big_3 pfam07523
Bacterial Ig-like domain (group 3); This family consists of bacterial domains with an Ig-like ...
1055-1126 7.56e-12

Bacterial Ig-like domain (group 3); This family consists of bacterial domains with an Ig-like fold. Members of this family are found in a variety of bacterial surface proteins.


Pssm-ID: 400072 [Multi-domain]  Cd Length: 67  Bit Score: 61.93  E-value: 7.56e-12
                           10        20        30        40        50        60        70
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|...
gi 353906527  1055 GPKKTSYAEGEDLDLRGGVLRVQYEGGTEdeliRLTHAGVSVSG-FDTHHKGEQNLTLQYLGqpVNANLSVTV 1126
Cdd:pfam07523    1 QPHKTTYYVGDSWDAEDNFVSATYKDGDA----EVPFDDVEVSGtVDSTKAGEYTVTYTYKG--VSATFTVTV 67
YSIRK_signal TIGR01168
Gram-positive signal peptide, YSIRK family; Many surface proteins found in Streptococcus, ...
2-40 2.23e-10

Gram-positive signal peptide, YSIRK family; Many surface proteins found in Streptococcus, Staphylococcus, and related lineages share apparently homologous signal sequences. A motif resembling [YF]SIRKxxxGxxS[VIA] appears at the start of the transmembrane domain. The GxxS motif appears perfectly conserved, suggesting a specific function and not just homology. There is a strong correlation between proteins carrying this region at the N-terminus and those carrying the Gram-positive anchor domain with the LPXTG sortase processing site at the C-terminus.


Pssm-ID: 273479 [Multi-domain]  Cd Length: 39  Bit Score: 57.11  E-value: 2.23e-10
                           10        20        30
                   ....*....|....*....|....*....|....*....
gi 353906527     2 KNPFFERRCRYSIRKLSVGACSLMIGAVLFAGPALAEET 40
Cdd:TIGR01168    1 AKKFNEKQQKYSIRKLSVGVASVLVASLFFGGGVAAAES 39
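The motif written in the description as [YF]SIRKxxxGxxS[VIA] translates directly into a regular expression. The sketch below applies it to the two rows of the alignment above. The description only says the motif "resembles" this pattern, and in the query row the residue after the conserved Ser is Leu rather than V, I, or A, so a relaxed pattern without the final position is also tried. The names `STRICT`, `RELAXED`, and `find_motif` are hypothetical.

```python
import re

# [YF]SIRKxxxGxxS[VIA], with each x written as a single wildcard position.
STRICT = re.compile(r"[YF]SIRK.{3}G.{2}S[VIA]")
# Same pattern without the final [VIA] position (an assumption made here
# only to illustrate a looser match).
RELAXED = re.compile(r"[YF]SIRK.{3}G.{2}S")


def find_motif(seq, pattern, first_residue=1):
    """Return (residue number of the motif start, matched text), or None."""
    m = pattern.search(seq)
    if m is None:
        return None
    return first_residue + m.start(), m.group()


if __name__ == "__main__":
    # Rows copied from the TIGR01168 alignment; the query row starts at residue 2.
    query = "KNPFFERRCRYSIRKLSVGACSLMIGAVLFAGPALAEET"
    model = "AKKFNEKQQKYSIRKLSVGVASVLVASLFFGGGVAAAES"
    print("model, strict:  ", find_motif(model, STRICT))
    print("query, strict:  ", find_motif(query, STRICT, first_residue=2))
    print("query, relaxed: ", find_motif(query, RELAXED, first_residue=2))
```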
G5 pfam07501
G5 domain; This domain is found in a wide range of extracellular proteins. It is found ...
1512-1581 3.03e-10

G5 domain; This domain is found in a wide range of extracellular proteins. It is found tandemly repeated in up to 8 copies. It is found in the N-terminus of peptidases belonging to the M26 family which cleave human IgA. The domain is also found in proteins involved in metabolism of bacterial cell walls suggesting this domain may have an adhesive function.


Pssm-ID: 462185 [Multi-domain]  Cd Length: 75  Bit Score: 57.95  E-value: 3.03e-10
                           10        20        30        40        50        60        70
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....
gi 353906527  1512 KLEVQEEKVAFHRQEHENTEMLVGEQRVIIQGRDGLLRHVFEVD-ENGQ---RRLRSTEVIQEAIPEIVEIGTK 1581
Cdd:pfam07501    2 KTVTEEEEIPFETVTKEDPSLPKGEEKVVQEGKPGEKEVTYKVTyVNGKevsREVVSEEVTKEPVDEVVAVGTK 75
endo_SpGH101 NF040533
SpGH101 family endo-alpha-N-acetylgalactosaminidase; Members of this family are streptococcal ...
1-162 2.17e-09

SpGH101 family endo-alpha-N-acetylgalactosaminidase; Members of this family are streptococcal surface proteins with a complex (and somewhat variable) architecture that includes a crosswall-targeting N-terminal YSIRK domain, a C-terminal cell wall-anchoring LPXTG domain, and a central endo-alpha-N-acetylgalactosaminidase that removes an O-linked disaccharide from host glycoproteins.


Pssm-ID: 439743 [Multi-domain]  Cd Length: 1694  Bit Score: 62.67  E-value: 2.17e-09
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 353906527    1 MKNPFFERRCRYSIRKLSVGACSLMIGAVLFA-GPALAEETAVPENSGANTELVSGESEHSTNEA-DKQNEGEHTRENKL 78
Cdd:NF040533    1 MDKGLFEKRCKYSIRKFSLGVASVMIGASFFGtSPVLADTAQVGSTANLPADLADALAKAKDDNGrDFEAPKAGENQGSP 80
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 353906527   79 EKAEGvatasetaeaasaAKPEEKASEVVAETPSAEaKPKSDKETEAKPEatnqgdESKPAAEANKT--EKEVQPDVPKN 156
Cdd:NF040533   81 EVTDG-------------PKTEEELLALEKEKSATE-KPKENKPAEAKPE------TAKTVTPEWQTvaRKEQQGTVEIR 140

                  ....*.
gi 353906527  157 TEKTLK 162
Cdd:NF040533  141 EENGVR 146
YSIRK_signal pfam04650
YSIRK type signal peptide; Many surface proteins found in Streptococcus, Staphylococcus, and ...
7-31 6.12e-08

YSIRK type signal peptide; Many surface proteins found in Streptococcus, Staphylococcus, and related lineages share apparently homologous signal sequences. A motif resembling [YF]SIRKxxxGxxS[VIA] appears at the start of the transmembrane domain. The GxxS motif appears perfectly conserved, suggesting a specific function and not just homology. There is a strong correlation between proteins carrying this region at the N-terminus and those carrying the Gram-positive anchor domain with the LPXTG sortase processing site at the C-terminus.


Pssm-ID: 428049 [Multi-domain]  Cd Length: 26  Bit Score: 49.69  E-value: 6.12e-08
                           10        20
                   ....*....|....*....|....*
gi 353906527     7 ERRCRYSIRKLSVGACSLMIGAVLF 31
Cdd:pfam04650    1 EKKQRYSIRKLSVGVASVLIGTLLF 25
FIVAR pfam07554
FIVAR domain; This domain is found in a wide variety of contexts, but mostly occurring in cell ...
1223-1287 1.52e-06

FIVAR domain; This domain is found in a wide variety of contexts, but mostly occurring in cell wall associated proteins. A lack of conserved catalytic residues suggests that it is a binding domain. From context, possible substrates are hyaluronate or fibronectin (personal obs: C Yeats). This is further evidenced by. Possibly the exact substrate is N-acetyl glucosamine. Finding it in the same protein as pfam05089 further supports this proposal. It is found in the C-terminal part of Swiss:O82833, which is removed during maturation. Some of the proteins it is found in are involved in methicillin resistance. The name FIVAR derives from Found In Various Architectures.


Pssm-ID: 400096 [Multi-domain]  Cd Length: 69  Bit Score: 47.31  E-value: 1.52e-06
                           10        20        30        40        50        60
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*..
gi 353906527  1223 LKQKLAEVEAAKNKVVYNFASPEVKEAFLKAIEAAEQVLK--DHETSTQDQVNDRLNKLTEAHKALN 1287
Cdd:pfam07554    3 LKTSINDKNATKTSSNYINADNDKKAAYNNAITAAKAILNktNNPNATQEEVNQALTKLNTAINALN 69
PspC_subgroup_1 NF033838
pneumococcal surface protein PspC, choline-binding form; The pneumococcal surface protein PspC, ...
7-215 4.14e-05

pneumococcal surface protein PspC, choline-binding form; The pneumococcal surface protein PspC, as described in Streptococcus pneumoniae, is a repetitive and highly variable protein, recognized by a conserved N-terminal domain and also by genomic location. This form, subgroup 1, has variable numbers of a choline-binding repeat in the C-terminal region, and is also known as choline-binding protein A. The other form, subgroup 2, is anchored covalently after cleavage by sortase at a C-terminal LPXTG site.


Pssm-ID: 468201 [Multi-domain]  Cd Length: 684  Bit Score: 48.47  E-value: 4.14e-05
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 353906527    7 ERRCRYSIRKLSVGACSLMIGAVLFAGPALAEetavpENSGANTELVSGESEHSTNEADKQNEG---------------- 70
Cdd:NF033838    7 ERKVHYSIRKFSIGVASVVVASLFLGGVVHAE-----EVRGGNNPTVTSSGNESQKEHAKEVEShlekilseiqksldkr 81
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 353906527   71 EHTRENKL-EKAEGVATASETAEAASAAKPE-EKASEVVAETPSAEAKPKSD--KETEAKPEATNQGDESKPAAEANKTE 146
Cdd:NF033838   82 KHTQNVALnKKLSDIKTEYLYELNVLKEKSEaELTSKTKKELDAAFEQFKKDtlEPGKKVAEATKKVEEAEKKAKDQKEE 161
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 353906527  147 KevQPDVPKNTEKTLK----PKEIKFNSWE-ELLKWEP-GAREDDAINRGSVVLASRRTGHLV-------NEKASKEAKV 213
Cdd:NF033838  162 D--RRNYPTNTYKTLEleiaESDVEVKKAElELVKEEAkEPRDEEKIKQAKAKVESKKAEATRlekiktdREKAEEEAKR 239

                  ..
gi 353906527  214 QA 215
Cdd:NF033838  240 RA 241
Gram_pos_anchor pfam00746
LPXTG cell wall anchor motif;
1606-1645 8.55e-05

LPXTG cell wall anchor motif;


Pssm-ID: 366278 [Multi-domain]  Cd Length: 43  Bit Score: 41.37  E-value: 8.55e-05
                           10        20        30        40
                   ....*....|....*....|....*....|....*....|
gi 353906527  1606 ASKQLPNTGTADANEALIAGLASLgLASLALTLRRKREDK 1645
Cdd:pfam00746    5 KKKTLPKTGENSNIFLTAAGLLAL-LGGLLLLVKRRKKEK 43
PTZ00121 PTZ00121
MAEBL; Provisional
56-238 1.64e-04

MAEBL; Provisional


Pssm-ID: 173412 [Multi-domain]  Cd Length: 2084  Bit Score: 46.67  E-value: 1.64e-04
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 353906527   56 ESEHSTNEADKQNEGEHTRENKLEKAEgvataSETAEAASAAKPEEKASEVVAETPSAEAKPKSDK-----ETEAKPEAT 130
Cdd:PTZ00121 1441 EEAKKADEAKKKAEEAKKAEEAKKKAE-----EAKKADEAKKKAEEAKKADEAKKKAEEAKKKADEakkaaEAKKKADEA 1515
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 353906527  131 NQGDESKPAAEANKTEKEVQPDVPKNTEKTLKPKEIK----FNSWEELLKWEPGAREDDAINrgsvvLASRRTGHLVNEK 206
Cdd:PTZ00121 1516 KKAEEAKKADEAKKAEEAKKADEAKKAEEKKKADELKkaeeLKKAEEKKKAEEAKKAEEDKN-----MALRKAEEAKKAE 1590
                         170       180       190
                  ....*....|....*....|....*....|..
gi 353906527  207 ASKEAKVQALSNTNSKAKDHASVGGEEFKAYA 238
Cdd:PTZ00121 1591 EARIEEVMKLYEEEKKMKAEEAKKAEEAKIKA 1622
LPXTG_anchor TIGR01167
LPXTG-motif cell wall anchor domain; This model describes the LPXTG motif-containing region ...
1609-1642 2.99e-04

LPXTG-motif cell wall anchor domain; This model describes the LPXTG motif-containing region found at the C-terminus of many surface proteins of Streptococcus and Streptomyces species. Cleavage between the Thr and Gly by sortase or a related enzyme leads to covalent anchoring of the new C-terminal Thr to the cell wall. Hits that do not lie at the C-terminus or are not found in Gram-positive bacteria are probably false positives. A common feature of proteins containing this domain appears to be a high proportion of charged and zwitterionic residues immediately upstream of the LPXTG motif. This model differs from other descriptions of the LPXTG region by including a portion of that upstream charged region. [Cell envelope, Other]


Pssm-ID: 273478 [Multi-domain]  Cd Length: 34  Bit Score: 39.38  E-value: 2.99e-04
                           10        20        30
                   ....*....|....*....|....*....|....
gi 353906527  1609 QLPNTGTADANEALIAGLASLGLASLALTLRRKR 1642
Cdd:TIGR01167    1 KLPKTGESGNSLLLLLGLLLLGLGGLLLRKRKKK 34
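Because the description states that sortase cleaves between the Thr and Gly of the LPXTG motif, the predicted cleavage position can be read off the query row of the alignment above. The following is a minimal sketch, assuming a simple `LP.TG` pattern is an adequate stand-in for the motif. It is not a sortase-substrate predictor, and the function name `sortase_site` is hypothetical.

```python
import re

# LPXTG, with X as any single residue.
LPXTG = re.compile(r"LP.TG")


def sortase_site(seq, first_residue=1):
    """Return (motif, Thr residue number, Gly residue number), or None.

    Cleavage is taken to occur between the returned Thr and Gly positions.
    """
    m = LPXTG.search(seq)
    if m is None:
        return None
    thr = first_residue + m.start() + 3  # T is the fourth residue of the motif
    return m.group(), thr, thr + 1


if __name__ == "__main__":
    # Query row copied from the TIGR01167 alignment, which starts at residue 1609.
    segment = "QLPNTGTADANEALIAGLASLGLASLALTLRRKR"
    print(sortase_site(segment, first_residue=1609))
```

For the segment shown, the motif is LPNTG, which places the predicted cleavage between Thr 1613 and Gly 1614 in the query numbering.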
F5_F8_type_C pfam00754
F5/8 type C domain; This domain is also known as the discoidin (DS) domain family.
820-935 1.26e-03

F5/8 type C domain; This domain is also known as the discoidin (DS) domain family.


Pssm-ID: 459925 [Multi-domain]  Cd Length: 127  Bit Score: 40.51  E-value: 1.26e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 353906527   820 EGGEGIEGMLNGTITSlsdKWSSAQLSGSVDIR--LTKPRTVVRWVMDHAGAGGEsvndglMNTKDFDLYYKDADGEWK- 896
Cdd:pfam00754    9 SGEGPAAAALDGDPNT---AWSAWSGDDPQWIQvdLGKPKKITGVVTQGRQDGSN------GYVTSYKIEYSLDGENWTt 79
                           90       100       110       120
                   ....*....|....*....|....*....|....*....|..
gi 353906527   897 -LAKEVRGNKA--HVTDITLDKPITAQDWRLNVVTSDNGTPW 935
Cdd:pfam00754   80 vKDEKIPGNNDnnTPVTNTFDPPIKARYVRIVPTSWNGGNGI 121
PTZ00121 PTZ00121
MAEBL; Provisional
37-167 1.34e-03

MAEBL; Provisional


Pssm-ID: 173412 [Multi-domain]  Cd Length: 2084  Bit Score: 43.59  E-value: 1.34e-03
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 353906527   37 AEETAVPENSGANTELVSGESEHSTNEADKQNEG---EHTRENKLEKAEGVATASETAEAASAAKPEEKASEVVAETPSA 113
Cdd:PTZ00121 1239 AEEAKKAEEERNNEEIRKFEEARMAHFARRQAAIkaeEARKADELKKAEEKKKADEAKKAEEKKKADEAKKKAEEAKKAD 1318
                          90       100       110       120       130
                  ....*....|....*....|....*....|....*....|....*....|....
gi 353906527  114 EAKPKSDKETEAKPEATNQGDESKPAAEANKTEKEVQPDVPKNTEKTLKPKEIK 167
Cdd:PTZ00121 1319 EAKKKAEEAKKKADAAKKKAEEAKKAAEAAKAEAEAAADEAEAAEEKAEAAEKK 1372
COG4412 COG4412
Bacillopeptidase F, M6 metalloprotease family [Posttranslational modification, protein ...
614-663 9.80e-03

Bacillopeptidase F, M6 metalloprotease family [Posttranslational modification, protein turnover, chaperones];


Pssm-ID: 443532 [Multi-domain]  Cd Length: 524  Bit Score: 40.44  E-value: 9.80e-03
                          10        20        30        40        50
                  ....*....|....*....|....*....|....*....|....*....|
gi 353906527  614 GGKGSKVYMAFSTTPDYKFDDADAWKELTLSDNWTNEEFDLSSLAGKTIY 663
Cdd:COG4412   301 GGATWTPLPGNVTGTPDPNGSGLGPGITGTSNGWVDLSFDLSAYAGQTVQ 350
 
Blast search parameters
Data Source: Precalculated data, version = cdd.v.3.21
Preset Options: Database: CDSEARCH/cdd; Low complexity filter: no; Composition Based Adjustment: yes; E-value threshold: 0.01
