Conserved domains on  [gi|1928115501|ref|NP_001375382|]

glutamine-rich protein 2 isoform 2 [Homo sapiens]

List of domain hits

Name Accession Description Interval E-value
DUF4795 pfam16043
Domain of unknown function (DUF4795); This family of proteins is functionally uncharacterized. ...
1419-1598 1.73e-77

Domain of unknown function (DUF4795); This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria and eukaryotes. Proteins in this family are typically between 285 and 978 amino acids in length.



Pssm-ID: 464990 [Multi-domain]  Cd Length: 181  Bit Score: 254.15  E-value: 1.73e-77
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1928115501 1419 QRQDEELLGRVQSAILQVQGDCEKLNITTSNLIEDHRQKQKDIAMLYQGLEKLEKEKANREHLEMEIDVKADKSALATKV 1498
Cdd:pfam16043    2 KVEDAELLDQLQALILDLQEELEKLSETTSELSERLQQRQKHLEALYQQIEKLEKVKADKEVVEEELDEKADKEALASKV 81
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1928115501 1499 SRVQFDATTEQLNHMMQELVAKMSGQEQDWQKMLDRLLTEMDNKLDRLELDPVKQLLEDRWKSLRQQLRERPPLYQADEA 1578
Cdd:pfam16043   82 SRDQFDETLEELNQMLQELLDKLEGQEDAWKKALETLSEELDTKLDRLELDPLKELLERRIKALQKLLQEGSEELDEAEA 161
                          170       180
                   ....*....|....*....|
gi 1928115501 1579 AAMRRQLLAHFHCLSCDRPL 1598
Cdd:pfam16043  162 AGFRKKLLERFHCISCDRPV 181
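
The aligned rows above can be compared position by position to get a rough identity figure. The following is a minimal sketch, not part of the CD-Search output: the function name `identity` is hypothetical, the two strings are copied from the last alignment block (query residues 1579-1598 against pfam16043 positions 162-181), and letter case is ignored as a simplifying assumption.

```python
def identity(query_row: str, subject_row: str) -> tuple[int, int]:
    """Return (identical_positions, aligned_positions) for two aligned rows.

    Columns where either row has a gap character ('-') are skipped.
    Letter case is ignored (an assumption made for this sketch).
    """
    if len(query_row) != len(subject_row):
        raise ValueError("aligned rows must have equal length")
    identical = 0
    aligned = 0
    for q, s in zip(query_row, subject_row):
        if q == "-" or s == "-":
            continue  # gap column: not counted as an aligned pair
        aligned += 1
        if q.upper() == s.upper():
            identical += 1
    return identical, aligned


# Last block of the DUF4795 alignment, copied from the rows above.
query = "AAMRRQLLAHFHCLSCDRPL"    # gi 1928115501, residues 1579-1598
subject = "AGFRKKLLERFHCISCDRPV"  # pfam16043, positions 162-181

identical, aligned = identity(query, subject)
print(f"{identical}/{aligned} identical ({100 * identical / aligned:.0f}%)")
```

This counts exact residue matches only; it does not reproduce the bit score or E-value reported in the summary line, which come from the position-specific scoring matrix.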
Glutenin_hmw super family cl26620
High molecular weight glutenin subunit; Members of this family include high molecular weight ...
448-980 2.91e-16

High molecular weight glutenin subunit; Members of this family include high molecular weight subunits of glutenin. This group of gluten proteins is thought to be largely responsible for the elastic properties of gluten, and hence, doughs. Indeed, glutenin high molecular weight subunits are classified as elastomeric proteins, because the glutenin network can withstand significant deformations without breaking, and return to the original conformation when the stress is removed. Elastomeric proteins differ considerably in amino acid sequence, but they are all polymers whose subunits consist of elastomeric domains, composed of repeated motifs, and non-elastic domains that mediate cross-linking between the subunits. The elastomeric domain motifs are all rich in glycine residues in addition to other hydrophobic residues. High molecular weight glutenin subunits have an extensive central elastomeric domain, flanked by two terminal non-elastic domains that form disulphide cross-links. The central elastomeric domain is characterized by the following three repeated motifs: PGQGQQ, GYYPTS[P/L]QQ, GQQ. It possesses overlapping beta-turns within and between the repeated motifs, and assumes a regular helical secondary structure with a diameter of approx. 1.9 nm and a pitch of approx. 1.5 nm.


The actual alignment was detected with superfamily member pfam03157:

Pssm-ID: 367362 [Multi-domain]  Cd Length: 786  Bit Score: 85.00  E-value: 2.91e-16
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1928115501  448 PGRDQQGLELPSTDQHGLVSVSAYQHGMTFPGTDQRSMEPLGMDQrgcviSGMGQQGLVP-----PGIDQQGLTLPVVDQ 522
Cdd:pfam03157  194 SGQGQPGYYPTSSQQPGQLQQTGQGQQGQQPERGQQGQQPGQGQQ-----PGQGQQGQQPgqpqqLGQGQQGYYPISPQQ 268
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1928115501  523 HGLVlpftDQHGLVSPGLMPISADQQGFVQPSLEATGFIQPGTEQHD--LIQSGRFQRALVQRGAYQPGLVQPGADQRGL 600
Cdd:pfam03157  269 PRQW----QQSGQGQQGYYPTSLQQPGQGQSGYYPTSQQQAGQLQQEqqLGQEQQDQQPGQGRQGQQPGQGQQGQQPAQG 344
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1928115501  601 VRPGMDQSGLAQPGADQRGLVWPGMDQSGLAQPGRDQhgliQPGTGQHDLvQSGTGQGVLVQPGVDQPGMVQPGRFQRAL 680
Cdd:pfam03157  345 QQPGQGQPGYYPTSPQQPGQGQPGYYPTSQQQPQQGQ----QPEQGQQGQ-QQGQGQQGQQPGQGQQPGQGQPGYYPTSP 419
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1928115501  681 VQPGAYQPGLVQPGADQIDVVQPGADQHGLVQSGADQSDlaQPGAVQHGlVQPGVDQRGlaQPRADHQRGLVPPGADQRG 760
Cdd:pfam03157  420 QQSGQGQPGYYPTSPQQSGQGQQPGQGQQPGQEQPGQGQ--QPGQGQQG-QQPGQPEQG--QQPGQGQPGYYPTSPQQSG 494
                          330       340       350       360       370       380       390       400
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1928115501  761 -------LVQPGADQHGLVQPGVDQHGLAQPGEVQRSLVQPGIVQRGLVQPGAVQRGLVQPGAVQRGLVQPGVDQRGlVQ 833
Cdd:pfam03157  495 qgqqlgqWQQQGQGQPGYYPTSPLQPGQGQPGYYPTSPQQPGQGQQLGQLQQPTQGQQGQQSGQGQQGQQPGQGQQG-QQ 573
                          410       420       430       440       450       460       470       480
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1928115501  834 PGAVQRGLVQPGAVQHGLVQPGADQRGLVQPGVDQrglvQPGVDQrglvQPGMDQRGLIQPGADQPGLVQPGAGQLGMVQ 913
Cdd:pfam03157  574 PGQGQQGQQPGQGQQPGQGQPGYYPTSPQQSGQGQ----QPGQWQ----QPGQGQPGYYPTSSLQLGQGQQGYYPTSPQQ 645
                          490       500       510       520       530       540
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*..
gi 1928115501  914 PGIGQQgmvQPQADPHGLVQPGAYPLGLVQPGAylHDLSQSGTYPRGLVQPGMDQYGLRQPGAYQPG 980
Cdd:pfam03157  646 PGQGQQ---PGQWQQSGQGQQGYYPTSPQQSGQ--AQQPGQGQQPGQWLQPGQGQQGYYPTSPQQPG 707
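
The family description above names three repeated motifs for the central elastomeric domain: PGQGQQ, GYYPTS[P/L]QQ and GQQ. As a minimal sketch, assuming the bracket notation means "P or L at that position", the motifs can be counted in any sequence with regular expressions. The helper name `count_motifs` and the toy sequence are hypothetical and are not taken from the alignment.

```python
import re

# Motifs as written in the family description; [P/L] becomes the
# regex character class [PL] (assumed meaning: either residue).
MOTIFS = {
    "PGQGQQ": re.compile(r"PGQGQQ"),
    "GYYPTS[P/L]QQ": re.compile(r"GYYPTS[PL]QQ"),
    "GQQ": re.compile(r"GQQ"),
}


def count_motifs(sequence: str) -> dict[str, int]:
    """Count non-overlapping occurrences of each motif.

    Gap characters are removed and case is ignored. Each motif is
    counted independently, so a GQQ inside a PGQGQQ is counted for both.
    """
    seq = sequence.replace("-", "").upper()
    return {name: len(pattern.findall(seq)) for name, pattern in MOTIFS.items()}


# Toy sequence built from one copy of each motif (illustration only).
toy = "PGQGQQGYYPTSPQQGQQ"
print(count_motifs(toy))
```

Running the same function over the gi 1928115501 rows and over the pfam03157 rows gives a quick view of how much of the hit reflects shared Q/G-rich repeat composition rather than conserved motif order.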
SMC_prok_B super family cl37069
chromosome segregation protein SMC, common bacterial type; SMC (structural maintenance of ...
1176-1544 4.42e-06

chromosome segregation protein SMC, common bacterial type; SMC (structural maintenance of chromosomes) proteins bind DNA and act in organizing and segregating chromosomes for partition. SMC proteins are found in bacteria, archaea, and eukaryotes. This family represents the SMC protein of most bacteria. The smc gene is often associated with scpB (TIGR00281) and scpA genes, where scp stands for segregation and condensation protein. SMC was shown (in Caulobacter crescentus) to be induced early in S phase but present and bound to DNA throughout the cell cycle. [Cellular processes, Cell division, DNA metabolism, Chromosome-associated proteins]


The actual alignment was detected with superfamily member TIGR02168:

Pssm-ID: 274008 [Multi-domain]  Cd Length: 1179  Bit Score: 51.98  E-value: 4.42e-06
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1928115501 1176 LSERRNSLRRMSSSFPTAVETF-HLMGELSSLYVGLKESMKDLDEEQ-AGQTDLEKIQFLLAQMVKRTIppELQEQLKTV 1253
Cdd:TIGR02168  728 ISALRKDLARLEAEVEQLEERIaQLSKELTELEAEIEELEERLEEAEeELAEAEAEIEELEAQIEQLKE--ELKALREAL 805
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1928115501 1254 KTLAKEVWQEKAKVERLQRILEGEGNQEAGKELKAGELRLQLGVLRVTVADIEKELAELRESQDRGKAAME------NSV 1327
Cdd:TIGR02168  806 DELRAELTLLNEEAANLRERLESLERRIAATERRLEDLEEQIEELSEDIESLAAEIEELEELIEELESELEallnerASL 885
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1928115501 1328 SEASLYLQDQLDKLRMIIEsmltssstllsmsmaphkahtlapgqidpeatcpacslDVSHQVSTLVRRYEQLQDMVNSL 1407
Cdd:TIGR02168  886 EEALALLRSELEELSEELR--------------------------------------ELESKRSELRRELEELREKLAQL 927
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1928115501 1408 AVSRPSKKAKLQRQDEELLGRVQSAiLQVQGDCEKLNITTSNLIEDH-RQKQKDIAMLyqG------LEKLEKEKANREH 1480
Cdd:TIGR02168  928 ELRLEGLEVRIDNLQERLSEEYSLT-LEEAEALENKIEDDEEEARRRlKRLENKIKEL--GpvnlaaIEEYEELKERYDF 1004
                          330       340       350       360       370       380       390
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|.
gi 1928115501 1481 LEMEID--VKADKSALAT-----KVSRVQFDATTEQLNHMMQELVAKMSGQEQDWQKmldrlLTEMDNKLD 1544
Cdd:TIGR02168 1005 LTAQKEdlTEAKETLEEAieeidREARERFKDTFDQVNENFQRVFPKLFGGGEAELR-----LTDPEDLLE 1070
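
The intervals in the hit list can overlap on the query, which matters when deciding which hit best explains a region. The sketch below assumes the three name/accession/interval/E-value rows listed above; the `Hit` class and `overlap` function are hypothetical helpers, not part of any NCBI tool.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class Hit:
    name: str
    accession: str
    start: int   # first query residue of the hit (1-based, inclusive)
    end: int     # last query residue of the hit (inclusive)
    evalue: float


def overlap(a: Hit, b: Hit) -> int:
    """Number of query residues covered by both hits (0 if disjoint)."""
    return max(0, min(a.end, b.end) - max(a.start, b.start) + 1)


# Rows copied from the domain hit list above.
HITS = [
    Hit("DUF4795", "pfam16043", 1419, 1598, 1.73e-77),
    Hit("Glutenin_hmw", "cl26620", 448, 980, 2.91e-16),
    Hit("SMC_prok_B", "cl37069", 1176, 1544, 4.42e-06),
]

for a, b in combinations(HITS, 2):
    shared = overlap(a, b)
    if shared:
        # Report the hit with the lower E-value as the stronger one.
        stronger = a if a.evalue < b.evalue else b
        print(f"{a.name} and {b.name} share {shared} residues; "
              f"lower E-value: {stronger.name}")
```

Here the DUF4795 and SMC_prok_B intervals share the query region 1419-1544, while the Glutenin_hmw interval is disjoint from both.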
 
SMC_prok_B TIGR02168
chromosome segregation protein SMC, common bacterial type; SMC (structural maintenance of ...
1284-1569 7.20e-06

chromosome segregation protein SMC, common bacterial type; SMC (structural maintenance of chromosomes) proteins bind DNA and act in organizing and segregating chromosomes for partition. SMC proteins are found in bacteria, archaea, and eukaryotes. This family represents the SMC protein of most bacteria. The smc gene is often associated with scpB (TIGR00281) and scpA genes, where scp stands for segregation and condensation protein. SMC was shown (in Caulobacter crescentus) to be induced early in S phase but present and bound to DNA throughout the cell cycle. [Cellular processes, Cell division, DNA metabolism, Chromosome-associated proteins]


Pssm-ID: 274008 [Multi-domain]  Cd Length: 1179  Bit Score: 51.21  E-value: 7.20e-06
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1928115501 1284 KELKagELRLQLGVLRVTVADIEKELAELRESQDRgkaamensvseaslyLQDQLDKLRMIIESMLTSSSTLLSMSMAPH 1363
Cdd:TIGR02168  677 REIE--ELEEKIEELEEKIAELEKALAELRKELEE---------------LEEELEQLRKELEELSRQISALRKDLARLE 739
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1928115501 1364 KAHTLAPGQIDpeatcpacslDVSHQVSTLVRRYEQLQDMVNSL---AVSRPSKKAKLQRQDEELLGRVQ---SAILQVQ 1437
Cdd:TIGR02168  740 AEVEQLEERIA----------QLSKELTELEAEIEELEERLEEAeeeLAEAEAEIEELEAQIEQLKEELKalrEALDELR 809
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1928115501 1438 GDCEKLNITTSNLIEDHRQKQKDIAMLYQGLEKLEKEK-----------ANREHLEMEIDVKADKSALATKVSRVQFDAt 1506
Cdd:TIGR02168  810 AELTLLNEEAANLRERLESLERRIAATERRLEDLEEQIeelsedieslaAEIEELEELIEELESELEALLNERASLEEA- 888
                          250       260       270       280       290       300
                   ....*....|....*....|....*....|....*....|....*....|....*....|...
gi 1928115501 1507 TEQLNHMMQELVAKMSGQEQDWQKmLDRLLTEMDNKLDRLELDpvKQLLEDRWKSLRQQLRER 1569
Cdd:TIGR02168  889 LALLRSELEELSEELRELESKRSE-LRRELEELREKLAQLELR--LEGLEVRIDNLQERLSEE 948
YhaN COG4717
Uncharacterized conserved protein YhaN, contains AAA domain [Function unknown];
1167-1586 7.90e-05

Uncharacterized conserved protein YhaN, contains AAA domain [Function unknown];


Pssm-ID: 443752 [Multi-domain]  Cd Length: 641  Bit Score: 47.45  E-value: 7.90e-05
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1928115501 1167 EGSEVSSEVLSERRNSLRRMSSSFPTAVETFHLMGELSSL---YVGLKESMKDL----DEEQAGQTDLEKIQFLLAQMVK 1239
Cdd:COG4717    105 EELEAELEELREELEKLEKLLQLLPLYQELEALEAELAELperLEELEERLEELreleEELEELEAELAELQEELEELLE 184
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1928115501 1240 RT---IPPELQEQLKTVKTLAKEVWQEKAKVERLQRILEgegnqEAGKELKAGELRLQlgvlrvtVADIEKELAELRESq 1316
Cdd:COG4717    185 QLslaTEEELQDLAEELEELQQRLAELEEELEEAQEELE-----ELEEELEQLENELE-------AAALEERLKEARLL- 251
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1928115501 1317 drgkAAMENSVSEASLYLQDQLDKLRMIIESMLTSSSTLLSMSMAPHKAHTLAPGQIDPEATCPACSldvshqvstlVRR 1396
Cdd:COG4717    252 ----LLIAAALLALLGLGGSLLSLILTIAGVLFLVLGLLALLFLLLAREKASLGKEAEELQALPALE----------ELE 317
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1928115501 1397 YEQLQDMVNSLAVSRPSKKAKLqRQDEELLGRVQSAILQVQGDCEKLNIttsNLIEDHRQ------KQKDIAMLYQGLEK 1470
Cdd:COG4717    318 EEELEELLAALGLPPDLSPEEL-LELLDRIEELQELLREAEELEEELQL---EELEQEIAallaeaGVEDEEELRAALEQ 393
                          330       340       350       360       370       380       390       400
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1928115501 1471 LEKEKANREHLEmeiDVKADKSALATKVSRVQFDATTEQLNHMMQELVAKMsgqeQDWQKMLDRLLTEM---DNKLDRLE 1547
Cdd:COG4717    394 AEEYQELKEELE---ELEEQLEELLGELEELLEALDEEELEEELEELEEEL----EELEEELEELREELaelEAELEQLE 466
                          410       420       430       440
                   ....*....|....*....|....*....|....*....|.
gi 1928115501 1548 LDPVKQLLEDRWKSLRQQLRErpplyQADEAAAMR--RQLL 1586
Cdd:COG4717    467 EDGELAELLQELEELKAELRE-----LAEEWAALKlaLELL 502
PRK02224 PRK02224
DNA double-strand break repair Rad50 ATPase;
1245-1576 2.32e-03

DNA double-strand break repair Rad50 ATPase;


Pssm-ID: 179385 [Multi-domain]  Cd Length: 880  Bit Score: 43.11  E-value: 2.32e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1928115501 1245 ELQEQLKTVKTLAKEVWQEkakVERL-QRILEGEGNQEAGKElKAGELRLQLGVLRVTVADIEKELAELRESQDRGKAAM 1323
Cdd:PRK02224   325 ELRDRLEECRVAAQAHNEE---AESLrEDADDLEERAEELRE-EAAELESELEEAREAVEDRREEIEELEEEIEELRERF 400
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1928115501 1324 ENS---VSEASLYLQDQLDKLRMIIESMLTSSSTLLSMSMAPHKAHTLApgqidPEATCPACSLDV---SH--------- 1388
Cdd:PRK02224   401 GDApvdLGNAEDFLEELREERDELREREAELEATLRTARERVEEAEALL-----EAGKCPECGQPVegsPHvetieedre 475
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1928115501 1389 QVSTLVRRYEQLQDMVNSLA--VSRPSKKAKLQRQDEELLGRVQsailqvqgdceklniTTSNLIEDHR----QKQKDIA 1462
Cdd:PRK02224   476 RVEELEAELEDLEEEVEEVEerLERAEDLVEAEDRIERLEERRE---------------DLEELIAERRetieEKRERAE 540
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1928115501 1463 MLYQGLEKLEKEKANREHLEMEIDVKADKSALATKV---SRVQFDATTEQLNHmMQELVAKMSGQEQDWQKMLDRL--LT 1537
Cdd:PRK02224   541 ELRERAAELEAEAEEKREAAAEAEEEAEEAREEVAElnsKLAELKERIESLER-IRTLLAAIADAEDEIERLREKReaLA 619
                          330       340       350       360
                   ....*....|....*....|....*....|....*....|....*.
gi 1928115501 1538 EM-DNKLDRLE--LDPVKQLL----EDRWKSLRQQlRERPPLYQAD 1576
Cdd:PRK02224   620 ELnDERRERLAekRERKRELEaefdEARIEEARED-KERAEEYLEQ 664
DR0291 COG1579
Predicted nucleic acid-binding protein DR0291, contains C4-type Zn-ribbon domain [General ...
1413-1569 6.12e-03

Predicted nucleic acid-binding protein DR0291, contains C4-type Zn-ribbon domain [General function prediction only];


Pssm-ID: 441187 [Multi-domain]  Cd Length: 236  Bit Score: 40.29  E-value: 6.12e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1928115501 1413 SKKAKLQRQDEEL---LGRVQSAILQVQGDCEKLNITTSNLIEDHRQKQKDIAMLYQGLEKLEKE----KANRE--HLEM 1483
Cdd:COG1579     17 SELDRLEHRLKELpaeLAELEDELAALEARLEAAKTELEDLEKEIKRLELEIEEVEARIKKYEEQlgnvRNNKEyeALQK 96
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1928115501 1484 EID-VKADKSALATKVSRV--QFDATTEQLNHmMQELVAKMSGQEQDWQKMLDRLLTEMDNKLDRL--ELDPVKQLLEDR 1558
Cdd:COG1579     97 EIEsLKRRISDLEDEILELmeRIEELEEELAE-LEAELAELEAELEEKKAELDEELAELEAELEELeaEREELAAKIPPE 175
                          170
                   ....*....|.
gi 1928115501 1559 WKSLRQQLRER 1569
Cdd:COG1579    176 LLALYERIRKR 186
 
Glutenin_hmw pfam03157
High molecular weight glutenin subunit; Members of this family include high molecular weight ...
642-1150 1.36e-13

High molecular weight glutenin subunit; Members of this family include high molecular weight subunits of glutenin. This group of gluten proteins is thought to be largely responsible for the elastic properties of gluten, and hence, doughs. Indeed, glutenin high molecular weight subunits are classified as elastomeric proteins, because the glutenin network can withstand significant deformations without breaking, and return to the original conformation when the stress is removed. Elastomeric proteins differ considerably in amino acid sequence, but they are all polymers whose subunits consist of elastomeric domains, composed of repeated motifs, and non-elastic domains that mediate cross-linking between the subunits. The elastomeric domain motifs are all rich in glycine residues in addition to other hydrophobic residues. High molecular weight glutenin subunits have an extensive central elastomeric domain, flanked by two terminal non-elastic domains that form disulphide cross-links. The central elastomeric domain is characterized by the following three repeated motifs: PGQGQQ, GYYPTS[P/L]QQ, GQQ. It possesses overlapping beta-turns within and between the repeated motifs, and assumes a regular helical secondary structure with a diameter of approx. 1.9 nm and a pitch of approx. 1.5 nm.
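The three repeat motifs named in the description above can be counted in any sequence string with regular expressions. The following is a minimal sketch, not part of the CDD output: the function name `count_motifs` and the dictionary `MOTIFS` are hypothetical, and `[P/L]` is read as "P or L". Each motif is counted independently, so a `GQQ` that sits inside a `PGQGQQ` is counted for both.

```python
import re

# Motifs taken from the pfam03157 description; [P/L] means P or L.
MOTIFS = {
    "PGQGQQ": re.compile(r"PGQGQQ"),
    "GYYPTS[P/L]QQ": re.compile(r"GYYPTS[PL]QQ"),
    "GQQ": re.compile(r"GQQ"),
}


def count_motifs(sequence):
    """Count non-overlapping occurrences of each motif in a sequence.

    Gap characters ('-') are dropped and lowercase (unaligned) residues
    are uppercased, so a row copied from an alignment can be passed in.
    """
    seq = sequence.replace("-", "").upper()
    return {name: len(pattern.findall(seq)) for name, pattern in MOTIFS.items()}


if __name__ == "__main__":
    # Consensus row copied from one alignment block of this page.
    row = "PTSPQQPGQGQPGYYPTSQQQPQQGQQPEQGQQGQQQGQGQQGQQPGQGQQPGQGQPGYYPTSPQQSGQ-----GQPGYY"
    for name, n in count_motifs(row).items():
        print(name, n)
```

Counts from a single 80-column row are only illustrative; a full domain sequence would be needed for a meaningful motif census.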


Pssm-ID: 367362 [Multi-domain]  Cd Length: 786  Bit Score: 76.52  E-value: 1.36e-13
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1928115501  642 QPGTGQhdlvQSGTGQGVLVQPGVDQPGMVQpgrfQRALVQPGAYQPGLVQPGADQidvvQPGADQHGLVQSGADQSDLA 721
Cdd:pfam03157  130 RPGQGQ----QPGQGQQWYYPTSPQQPGQWQ----QPGQGQQGYYPTSPQQSGQRQ----QPGQGQQLRQGQQGQQSGQG 197
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1928115501  722 QPGAVQHGLVQPGVDQRgLAQPRADHQRGLVPPGADQRGLVQPGADQHGLVQPGVDQHGLAQPGEVQRSLVQPGIVQR-G 800
Cdd:pfam03157  198 QPGYYPTSSQQPGQLQQ-TGQGQQGQQPERGQQGQQPGQGQQPGQGQQGQQPGQPQQLGQGQQGYYPISPQQPRQWQQsG 276
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1928115501  801 LVQPGAVQRGLVQPGAVQRGLVQPGVDQRG-LVQPGAVQRGLV--QPGAVQHGLvQPGADQRGLVQPGVDQRGLVQPGVD 877
Cdd:pfam03157  277 QGQQGYYPTSLQQPGQGQSGYYPTSQQQAGqLQQEQQLGQEQQdqQPGQGRQGQ-QPGQGQQGQQPAQGQQPGQGQPGYY 355
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1928115501  878 QRGLVQPGMDQRGLIQPGADQPGLVQPGAGQLGMVQPGIGQQGMVQPQADPHGLVQPGAYPLGLVQPGAylhdlSQSGTY 957
Cdd:pfam03157  356 PTSPQQPGQGQPGYYPTSQQQPQQGQQPEQGQQGQQQGQGQQGQQPGQGQQPGQGQPGYYPTSPQQSGQ-----GQPGYY 430
                          330       340       350       360       370       380       390       400
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1928115501  958 PRGLVQPGMDQyglrQPG-AYQPGLIAPGTKLRGSSTFQADSTGfisvRPYQHGMVPPGREQYGQVSPLLASQGLASPGI 1036
Cdd:pfam03157  431 PTSPQQSGQGQ----QPGqGQQPGQEQPGQGQQPGQGQQGQQPG----QPEQGQQPGQGQPGYYPTSPQQSGQGQQLGQW 502
                          410       420       430       440       450       460       470       480
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1928115501 1037 DRRSLVPPETYQQGLMHPGTDQHSPIPLSTGLGSTHPDQQHVASPGPGEHDQVYPDAAQHGHAFSLFDSHDSMYPGY-RG 1115
Cdd:pfam03157  503 QQQGQGQPGYYPTSPLQPGQGQPGYYPTSPQQPGQGQQLGQLQQPTQGQQGQQSGQGQQGQQPGQGQQGQQPGQGQQgQQ 582
                          490       500       510
                   ....*....|....*....|....*....|....*.
gi 1928115501 1116 PGYLSADQHGQEGLDPNRTRASDRHGIPAQ-KAPGQ 1150
Cdd:pfam03157  583 PGQGQQPGQGQPGYYPTSPQQSGQGQQPGQwQQPGQ 618
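Each alignment block above pairs a query row (`gi ...`) with a model row (`Cdd:...`), and each row ends with a start coordinate, the aligned text, and an end coordinate. A minimal sketch of reading one such pair follows; the helper names `parse_row`, `residue_count`, and `identity` are hypothetical, and treating lowercase letters as ordinary residues for the comparison is an assumption of this sketch.

```python
def parse_row(line):
    """Split one alignment row into (label, start, aligned_text, end)."""
    *label, start, aligned, end = line.split()
    return " ".join(label), int(start), aligned, int(end)


def residue_count(aligned):
    """Number of residues in an aligned string, ignoring '-' gaps."""
    return sum(1 for ch in aligned if ch != "-")


def identity(query_aligned, subject_aligned):
    """Return (identical_columns, compared_columns), skipping gap columns."""
    pairs = [
        (q.upper(), s.upper())
        for q, s in zip(query_aligned, subject_aligned)
        if q != "-" and s != "-"
    ]
    identical = sum(1 for q, s in pairs if q == s)
    return identical, len(pairs)


if __name__ == "__main__":
    # Last block of the pfam03157 alignment shown above.
    q = parse_row("gi 1928115501 1116 PGYLSADQHGQEGLDPNRTRASDRHGIPAQ-KAPGQ 1150")
    s = parse_row("Cdd:pfam03157  583 PGQGQQPGQGQPGYYPTSPQQSGQGQQPGQwQQPGQ 618")
    # Sanity check: residues in a row should span start..end inclusive.
    print(residue_count(q[2]) == q[3] - q[1] + 1)
    print(identity(q[2], s[2]))
```

The coordinate check is a cheap way to catch rows that were damaged when the page was copied as text.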
SMC_prok_B TIGR02168
chromosome segregation protein SMC, common bacterial type; SMC (structural maintenance of ...
1176-1544 4.42e-06

chromosome segregation protein SMC, common bacterial type; SMC (structural maintenance of chromosomes) proteins bind DNA and act in organizing and segregating chromosomes for partition. SMC proteins are found in bacteria, archaea, and eukaryotes. This family represents the SMC protein of most bacteria. The smc gene is often associated with scpB (TIGR00281) and scpA genes, where scp stands for segregation and condensation protein. SMC was shown (in Caulobacter crescentus) to be induced early in S phase but present and bound to DNA throughout the cell cycle. [Cellular processes, Cell division, DNA metabolism, Chromosome-associated proteins]


Pssm-ID: 274008 [Multi-domain]  Cd Length: 1179  Bit Score: 51.98  E-value: 4.42e-06
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1928115501 1176 LSERRNSLRRMSSSFPTAVETF-HLMGELSSLYVGLKESMKDLDEEQ-AGQTDLEKIQFLLAQMVKRTIppELQEQLKTV 1253
Cdd:TIGR02168  728 ISALRKDLARLEAEVEQLEERIaQLSKELTELEAEIEELEERLEEAEeELAEAEAEIEELEAQIEQLKE--ELKALREAL 805
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1928115501 1254 KTLAKEVWQEKAKVERLQRILEGEGNQEAGKELKAGELRLQLGVLRVTVADIEKELAELRESQDRGKAAME------NSV 1327
Cdd:TIGR02168  806 DELRAELTLLNEEAANLRERLESLERRIAATERRLEDLEEQIEELSEDIESLAAEIEELEELIEELESELEallnerASL 885
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1928115501 1328 SEASLYLQDQLDKLRMIIEsmltssstllsmsmaphkahtlapgqidpeatcpacslDVSHQVSTLVRRYEQLQDMVNSL 1407
Cdd:TIGR02168  886 EEALALLRSELEELSEELR--------------------------------------ELESKRSELRRELEELREKLAQL 927
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1928115501 1408 AVSRPSKKAKLQRQDEELLGRVQSAiLQVQGDCEKLNITTSNLIEDH-RQKQKDIAMLyqG------LEKLEKEKANREH 1480
Cdd:TIGR02168  928 ELRLEGLEVRIDNLQERLSEEYSLT-LEEAEALENKIEDDEEEARRRlKRLENKIKEL--GpvnlaaIEEYEELKERYDF 1004
                          330       340       350       360       370       380       390
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|.
gi 1928115501 1481 LEMEID--VKADKSALAT-----KVSRVQFDATTEQLNHMMQELVAKMSGQEQDWQKmldrlLTEMDNKLD 1544
Cdd:TIGR02168 1005 LTAQKEdlTEAKETLEEAieeidREARERFKDTFDQVNENFQRVFPKLFGGGEAELR-----LTDPEDLLE 1070
SMC_prok_B TIGR02168
chromosome segregation protein SMC, common bacterial type; SMC (structural maintenance of ...
1284-1569 7.20e-06

chromosome segregation protein SMC, common bacterial type; SMC (structural maintenance of chromosomes) proteins bind DNA and act in organizing and segregating chromosomes for partition. SMC proteins are found in bacteria, archaea, and eukaryotes. This family represents the SMC protein of most bacteria. The smc gene is often associated with scpB (TIGR00281) and scpA genes, where scp stands for segregation and condensation protein. SMC was shown (in Caulobacter crescentus) to be induced early in S phase but present and bound to DNA throughout the cell cycle. [Cellular processes, Cell division, DNA metabolism, Chromosome-associated proteins]


Pssm-ID: 274008 [Multi-domain]  Cd Length: 1179  Bit Score: 51.21  E-value: 7.20e-06
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1928115501 1284 KELKagELRLQLGVLRVTVADIEKELAELRESQDRgkaamensvseaslyLQDQLDKLRMIIESMLTSSSTLLSMSMAPH 1363
Cdd:TIGR02168  677 REIE--ELEEKIEELEEKIAELEKALAELRKELEE---------------LEEELEQLRKELEELSRQISALRKDLARLE 739
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1928115501 1364 KAHTLAPGQIDpeatcpacslDVSHQVSTLVRRYEQLQDMVNSL---AVSRPSKKAKLQRQDEELLGRVQ---SAILQVQ 1437
Cdd:TIGR02168  740 AEVEQLEERIA----------QLSKELTELEAEIEELEERLEEAeeeLAEAEAEIEELEAQIEQLKEELKalrEALDELR 809
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1928115501 1438 GDCEKLNITTSNLIEDHRQKQKDIAMLYQGLEKLEKEK-----------ANREHLEMEIDVKADKSALATKVSRVQFDAt 1506
Cdd:TIGR02168  810 AELTLLNEEAANLRERLESLERRIAATERRLEDLEEQIeelsedieslaAEIEELEELIEELESELEALLNERASLEEA- 888
                          250       260       270       280       290       300
                   ....*....|....*....|....*....|....*....|....*....|....*....|...
gi 1928115501 1507 TEQLNHMMQELVAKMSGQEQDWQKmLDRLLTEMDNKLDRLELDpvKQLLEDRWKSLRQQLRER 1569
Cdd:TIGR02168  889 LALLRSELEELSEELRELESKRSE-LRRELEELREKLAQLELR--LEGLEVRIDNLQERLSEE 948
Mplasa_alph_rch TIGR04523
helix-rich Mycoplasma protein; Members of this family occur strictly within a subset of ...
1202-1568 1.23e-05

helix-rich Mycoplasma protein; Members of this family occur strictly within a subset of Mycoplasma species. Members average 750 amino acids in length, including signal peptide. Sequences are predicted (Jpred 3) to be almost entirely alpha-helical. These sequences show strong periodicity (consistent with long alpha helical structures) and low complexity rich in D,E,N,Q, and K. Genes encoding these proteins are often found in tandem. The function is unknown.
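The description above characterizes this family as low-complexity sequence rich in D, E, N, Q, and K. A minimal sketch of measuring that composition for any aligned row is shown here; the function name `fraction_in_set` and the constant `DENQK` are hypothetical, and no cutoff for "rich" is implied by the source.

```python
DENQK = frozenset("DENQK")


def fraction_in_set(aligned, residues=DENQK):
    """Fraction of residues in `aligned` that belong to `residues`.

    Gap characters are ignored and lowercase (unaligned) residues are
    counted like uppercase ones. Returns 0.0 for an empty sequence.
    """
    seq = [ch.upper() for ch in aligned if ch.isalpha()]
    if not seq:
        return 0.0
    return sum(1 for ch in seq if ch in residues) / len(seq)


if __name__ == "__main__":
    # Query row copied from the TIGR04523 alignment on this page.
    row = "MMQELVAKMSGQEQDWQKmldrLLTEMDNKLDRL-ELDPVKQLLEDRWKSLRQQLRE"
    print(round(fraction_in_set(row), 3))
```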


Pssm-ID: 275316 [Multi-domain]  Cd Length: 745  Bit Score: 50.40  E-value: 1.23e-05
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1928115501 1202 ELSSLYVGLKESMKDL-DEEQAGQTDLEKIQfllaqmvkrTIPPELQEQLKTVKTLakevwQEKAKVERLQRILEGEGNQ 1280
Cdd:TIGR04523  215 SLESQISELKKQNNQLkDNIEKKQQEINEKT---------TEISNTQTQLNQLKDE-----QNKIKKQLSEKQKELEQNN 280
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1928115501 1281 EAGKELKAG--ELRLQLGVLR-VTVADIEKELAELRESQDRGKAAMENSVSEAslylQDQLDKLRMIIEsmltssstlls 1357
Cdd:TIGR04523  281 KKIKELEKQlnQLKSEISDLNnQKEQDWNKELKSELKNQEKKLEEIQNQISQN----NKIISQLNEQIS----------- 345
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1928115501 1358 msmaphkahtlapgQIDPEatcpacsldVSHQVSTLVRRYEQLQDMVNSLAvsrpskkaKLQRQDEELLGRVQSAILQVQ 1437
Cdd:TIGR04523  346 --------------QLKKE---------LTNSESENSEKQRELEEKQNEIE--------KLKKENQSYKQEIKNLESQIN 394
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1928115501 1438 gdceKLNITTSNLIEDHRQKQKDIAMLYQGLEKLEKE----KANREHLEMEI-DVKADKSALATKVSrvQFDATTEQLNH 1512
Cdd:TIGR04523  395 ----DLESKIQNQEKLNQQKDEQIKKLQQEKELLEKEierlKETIIKNNSEIkDLTNQDSVKELIIK--NLDNTRESLET 468
                          330       340       350       360       370
                   ....*....|....*....|....*....|....*....|....*....|....*..
gi 1928115501 1513 MMQELVAKMSGQEQDWQKmldrLLTEMDNKLDRL-ELDPVKQLLEDRWKSLRQQLRE 1568
Cdd:TIGR04523  469 QLKVLSRSINKIKQNLEQ----KQKELKSKEKELkKLNEEKKELEEKVKDLTKKISS 521
YhaN COG4717
Uncharacterized conserved protein YhaN, contains AAA domain [Function unknown];
1167-1586 7.90e-05

Uncharacterized conserved protein YhaN, contains AAA domain [Function unknown];


Pssm-ID: 443752 [Multi-domain]  Cd Length: 641  Bit Score: 47.45  E-value: 7.90e-05
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1928115501 1167 EGSEVSSEVLSERRNSLRRMSSSFPTAVETFHLMGELSSL---YVGLKESMKDL----DEEQAGQTDLEKIQFLLAQMVK 1239
Cdd:COG4717    105 EELEAELEELREELEKLEKLLQLLPLYQELEALEAELAELperLEELEERLEELreleEELEELEAELAELQEELEELLE 184
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1928115501 1240 RT---IPPELQEQLKTVKTLAKEVWQEKAKVERLQRILEgegnqEAGKELKAGELRLQlgvlrvtVADIEKELAELRESq 1316
Cdd:COG4717    185 QLslaTEEELQDLAEELEELQQRLAELEEELEEAQEELE-----ELEEELEQLENELE-------AAALEERLKEARLL- 251
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1928115501 1317 drgkAAMENSVSEASLYLQDQLDKLRMIIESMLTSSSTLLSMSMAPHKAHTLAPGQIDPEATCPACSldvshqvstlVRR 1396
Cdd:COG4717    252 ----LLIAAALLALLGLGGSLLSLILTIAGVLFLVLGLLALLFLLLAREKASLGKEAEELQALPALE----------ELE 317
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1928115501 1397 YEQLQDMVNSLAVSRPSKKAKLqRQDEELLGRVQSAILQVQGDCEKLNIttsNLIEDHRQ------KQKDIAMLYQGLEK 1470
Cdd:COG4717    318 EEELEELLAALGLPPDLSPEEL-LELLDRIEELQELLREAEELEEELQL---EELEQEIAallaeaGVEDEEELRAALEQ 393
                          330       340       350       360       370       380       390       400
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1928115501 1471 LEKEKANREHLEmeiDVKADKSALATKVSRVQFDATTEQLNHMMQELVAKMsgqeQDWQKMLDRLLTEM---DNKLDRLE 1547
Cdd:COG4717    394 AEEYQELKEELE---ELEEQLEELLGELEELLEALDEEELEEELEELEEEL----EELEEELEELREELaelEAELEQLE 466
                          410       420       430       440
                   ....*....|....*....|....*....|....*....|.
gi 1928115501 1548 LDPVKQLLEDRWKSLRQQLRErpplyQADEAAAMR--RQLL 1586
Cdd:COG4717    467 EDGELAELLQELEELKAELRE-----LAEEWAALKlaLELL 502
DUF4515 pfam14988
Domain of unknown function (DUF4515); This family of proteins is found in bacteria and ...
1391-1588 4.50e-04

Domain of unknown function (DUF4515); This family of proteins is found in bacteria and eukaryotes. Proteins in this family are typically between 198 and 469 amino acids in length. There are two completely conserved L residues that may be functionally important.


Pssm-ID: 405647 [Multi-domain]  Cd Length: 206  Bit Score: 43.60  E-value: 4.50e-04
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1928115501 1391 STLVRRYEQLQDMVNSLAVSRPSKKAKLQRQDEELLGRVQSAILQVQGDCEklnittsnliedhrQKQKDIAMLYQGLEK 1470
Cdd:pfam14988    7 EYLAKKTEEKQKKIEKLWNQYVQECEEIERRRQELASRYTQQTAELQTQLL--------------QKEKEQASLKKELQA 72
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1928115501 1471 LEKEKANREHLEMEID-----VKADKSALATKVSRVQFDATTEQlNHMMQELvakmsgQEQDWQKMLDRLLTEMDNKLDR 1545
Cdd:pfam14988   73 LRPFAKLKESQEREIQdleeeKEKVRAETAEKDREAHLQFLKEK-ALLEKQL------QELRILELGERATRELKRKAQA 145
                          170       180       190       200       210
                   ....*....|....*....|....*....|....*....|....*....|....*
gi 1928115501 1546 LELdPVKQLLEDRWKSLR---QQLRERpPLYQADEAAA---------MRRQLLAH 1588
Cdd:pfam14988  146 LKL-AAKQALSEFCRSIKrenRQLQKE-LLQLIQETQAleaikskleNRKQRLKE 198
COG1340 COG1340
Uncharacterized coiled-coil protein, contains DUF342 domain [Function unknown];
1202-1342 1.36e-03

Uncharacterized coiled-coil protein, contains DUF342 domain [Function unknown];


Pssm-ID: 440951 [Multi-domain]  Cd Length: 297  Bit Score: 42.98  E-value: 1.36e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1928115501 1202 ELSSLYVGLKESMKDLDEEQAGQTDLEKIQFLLAQMVKR----TIPPELQEQL-KTVKTLAKEVwqEKAKVERlqrileg 1276
Cdd:COG1340     86 KLNELREELDELRKELAELNKAGGSIDKLRKEIERLEWRqqteVLSPEEEKELvEKIKELEKEL--EKAKKAL------- 156
                           90       100       110       120       130       140
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*.
gi 1928115501 1277 egnqEAGKELKagELRLQLGVLRVTVADIEKELAELRESQDRGKAAMensvseASLYlqDQLDKLR 1342
Cdd:COG1340    157 ----EKNEKLK--ELRAELKELRKEAEEIHKKIKELAEEAQELHEEM------IELY--KEADELR 208
SMC_prok_A TIGR02169
chromosome segregation protein SMC, primarily archaeal type; SMC (structural maintenance of ...
1153-1431 2.09e-03

chromosome segregation protein SMC, primarily archaeal type; SMC (structural maintenance of chromosomes) proteins bind DNA and act in organizing and segregating chromosomes for partition. SMC proteins are found in bacteria, archaea, and eukaryotes. It is found in a single copy and is homodimeric in prokaryotes, but six paralogs (excluded from this family) are found in eukaryotes, where SMC proteins are heterodimeric. This family represents the SMC protein of archaea and a few bacteria (Aquifex, Synechocystis, etc); the SMC of other bacteria is described by TIGR02168. The N- and C-terminal domains of this protein are well conserved, but the central hinge region is skewed in composition and highly divergent. [Cellular processes, Cell division, DNA metabolism, Chromosome-associated proteins]


Pssm-ID: 274009 [Multi-domain]  Cd Length: 1164  Bit Score: 43.13  E-value: 2.09e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1928115501 1153 TLFRSPDSVDRVLSEGS---EVSSEVLSERRNSLRRMSSSFPTAVETF-HLMGELSSLYVGL---KESMKDLDEE-QAGQ 1224
Cdd:TIGR02169  692 SLQSELRRIENRLDELSqelSDASRKIGEIEKEIEQLEQEEEKLKERLeELEEDLSSLEQEIenvKSELKELEARiEELE 771
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1928115501 1225 TDLEKIQFLLA---QMVKRTIPPELQEQLKTVKT-------------------LAKEVWQEKAKVERLQRILEGEGNQEA 1282
Cdd:TIGR02169  772 EDLHKLEEALNdleARLSHSRIPEIQAELSKLEEevsriearlreieqklnrlTLEKEYLEKEIQELQEQRIDLKEQIKS 851
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1928115501 1283 -GKELKAGELRL-----QLGVLRVTVADIEKELAELRESQDRGKA---AMENSVSEASLylqdQLDKLRMIIESMLTSSS 1353
Cdd:TIGR02169  852 iEKEIENLNGKKeeleeELEELEAALRDLESRLGDLKKERDELEAqlrELERKIEELEA----QIEKKRKRLSELKAKLE 927
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1928115501 1354 TLLSMSmaphKAHTLAPGQIDPEATCPACSLDVSHQVSTLVRRYEQLQDmVNSLAV-----------SRPSKKAKLQRQD 1422
Cdd:TIGR02169  928 ALEEEL----SEIEDPKGEDEEIPEEELSLEDVQAELQRVEEEIRALEP-VNMLAIqeyeevlkrldELKEKRAKLEEER 1002

                   ....*....
gi 1928115501 1423 EELLGRVQS 1431
Cdd:TIGR02169 1003 KAILERIEE 1011
PRK02224 PRK02224
DNA double-strand break repair Rad50 ATPase;
1245-1576 2.32e-03

DNA double-strand break repair Rad50 ATPase;


Pssm-ID: 179385 [Multi-domain]  Cd Length: 880  Bit Score: 43.11  E-value: 2.32e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1928115501 1245 ELQEQLKTVKTLAKEVWQEkakVERL-QRILEGEGNQEAGKElKAGELRLQLGVLRVTVADIEKELAELRESQDRGKAAM 1323
Cdd:PRK02224   325 ELRDRLEECRVAAQAHNEE---AESLrEDADDLEERAEELRE-EAAELESELEEAREAVEDRREEIEELEEEIEELRERF 400
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1928115501 1324 ENS---VSEASLYLQDQLDKLRMIIESMLTSSSTLLSMSMAPHKAHTLApgqidPEATCPACSLDV---SH--------- 1388
Cdd:PRK02224   401 GDApvdLGNAEDFLEELREERDELREREAELEATLRTARERVEEAEALL-----EAGKCPECGQPVegsPHvetieedre 475
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1928115501 1389 QVSTLVRRYEQLQDMVNSLA--VSRPSKKAKLQRQDEELLGRVQsailqvqgdceklniTTSNLIEDHR----QKQKDIA 1462
Cdd:PRK02224   476 RVEELEAELEDLEEEVEEVEerLERAEDLVEAEDRIERLEERRE---------------DLEELIAERRetieEKRERAE 540
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1928115501 1463 MLYQGLEKLEKEKANREHLEMEIDVKADKSALATKV---SRVQFDATTEQLNHmMQELVAKMSGQEQDWQKMLDRL--LT 1537
Cdd:PRK02224   541 ELRERAAELEAEAEEKREAAAEAEEEAEEAREEVAElnsKLAELKERIESLER-IRTLLAAIADAEDEIERLREKReaLA 619
                          330       340       350       360
                   ....*....|....*....|....*....|....*....|....*.
gi 1928115501 1538 EM-DNKLDRLE--LDPVKQLL----EDRWKSLRQQlRERPPLYQAD 1576
Cdd:PRK02224   620 ELnDERRERLAekRERKRELEaefdEARIEEARED-KERAEEYLEQ 664
SMC_prok_A TIGR02169
chromosome segregation protein SMC, primarily archaeal type; SMC (structural maintenance of ...
1164-1498 2.47e-03

chromosome segregation protein SMC, primarily archaeal type; SMC (structural maintenance of chromosomes) proteins bind DNA and act in organizing and segregating chromosomes for partition. SMC proteins are found in bacteria, archaea, and eukaryotes. It is found in a single copy and is homodimeric in prokaryotes, but six paralogs (excluded from this family) are found in eukaryotes, where SMC proteins are heterodimeric. This family represents the SMC protein of archaea and a few bacteria (Aquifex, Synechocystis, etc); the SMC of other bacteria is described by TIGR02168. The N- and C-terminal domains of this protein are well conserved, but the central hinge region is skewed in composition and highly divergent. [Cellular processes, Cell division, DNA metabolism, Chromosome-associated proteins]


Pssm-ID: 274009 [Multi-domain]  Cd Length: 1164  Bit Score: 43.13  E-value: 2.47e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1928115501 1164 VLSEGSEVSSEVLSERRNSLRRMSSSFPTA------------VETFHLMGELSSLYVGLKESMKDLDEEQAgqtDLEKIQ 1231
Cdd:TIGR02169  181 EVEENIERLDLIIDEKRQQLERLRREREKAeryqallkekreYEGYELLKEKEALERQKEAIERQLASLEE---ELEKLT 257
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1928115501 1232 FLLAQMVKRT-------------IPPELQEQLKTVKTLAKEVwqeKAKVERLQRILEgEGNQEAGK-ELKAGELRLQLGV 1297
Cdd:TIGR02169  258 EEISELEKRLeeieqlleelnkkIKDLGEEEQLRVKEKIGEL---EAEIASLERSIA-EKERELEDaEERLAKLEAEIDK 333
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1928115501 1298 LRVTVADIEKELAELResqdRGKAAMENSVSEaslyLQDQLDKLRMIIEsmltssstllsmsmaphkahtlapgqidpea 1377
Cdd:TIGR02169  334 LLAEIEELEREIEEER----KRRDKLTEEYAE----LKEELEDLRAELE------------------------------- 374
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1928115501 1378 tcpacSLDVSHQ-----VSTLVRRYEQLQDMVNSLAVSRPSKKAKLQRQDEEL------LGRVQSAILQVQGDCE----- 1441
Cdd:TIGR02169  375 -----EVDKEFAetrdeLKDYREKLEKLKREINELKRELDRLQEELQRLSEELadlnaaIAGIEAKINELEEEKEdkale 449
                          330       340       350       360       370       380
                   ....*....|....*....|....*....|....*....|....*....|....*....|
gi 1928115501 1442 --KLNITTSNLIEDHRQKQKDIAMLYQGLEKLEKEkanREHLEMEID-VKADKSALATKV 1498
Cdd:TIGR02169  450 ikKQEWKLEQLAADLSKYEQELYDLKEEYDRVEKE---LSKLQRELAeAEAQARASEERV 506
DR0291 COG1579
Predicted nucleic acid-binding protein DR0291, contains C4-type Zn-ribbon domain [General ...
1413-1569 6.12e-03

Predicted nucleic acid-binding protein DR0291, contains C4-type Zn-ribbon domain [General function prediction only];


Pssm-ID: 441187 [Multi-domain]  Cd Length: 236  Bit Score: 40.29  E-value: 6.12e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1928115501 1413 SKKAKLQRQDEEL---LGRVQSAILQVQGDCEKLNITTSNLIEDHRQKQKDIAMLYQGLEKLEKE----KANRE--HLEM 1483
Cdd:COG1579     17 SELDRLEHRLKELpaeLAELEDELAALEARLEAAKTELEDLEKEIKRLELEIEEVEARIKKYEEQlgnvRNNKEyeALQK 96
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1928115501 1484 EID-VKADKSALATKVSRV--QFDATTEQLNHmMQELVAKMSGQEQDWQKMLDRLLTEMDNKLDRL--ELDPVKQLLEDR 1558
Cdd:COG1579     97 EIEsLKRRISDLEDEILELmeRIEELEEELAE-LEAELAELEAELEEKKAELDEELAELEAELEELeaEREELAAKIPPE 175
                          170
                   ....*....|.
gi 1928115501 1559 WKSLRQQLRER 1569
Cdd:COG1579    176 LLALYERIRKR 186
SMC_prok_B TIGR02168
chromosome segregation protein SMC, common bacterial type; SMC (structural maintenance of ...
1173-1342 7.16e-03

chromosome segregation protein SMC, common bacterial type; SMC (structural maintenance of chromosomes) proteins bind DNA and act in organizing and segregating chromosomes for partition. SMC proteins are found in bacteria, archaea, and eukaryotes. This family represents the SMC protein of most bacteria. The smc gene is often associated with scpB (TIGR00281) and scpA genes, where scp stands for segregation and condensation protein. SMC was shown (in Caulobacter crescentus) to be induced early in S phase but present and bound to DNA throughout the cell cycle. [Cellular processes, Cell division, DNA metabolism, Chromosome-associated proteins]


Pssm-ID: 274008 [Multi-domain]  Cd Length: 1179  Bit Score: 41.58  E-value: 7.16e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1928115501 1173 SEVLSERRNSLRRMSSSFPTAVETFHLMGELSSLYVGLKESMKDLDEE------------QAGQTDLEKIQFLLAQMV-K 1239
Cdd:TIGR02168  757 TELEAEIEELEERLEEAEEELAEAEAEIEELEAQIEQLKEELKALREAldelraeltllnEEAANLRERLESLERRIAaT 836
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1928115501 1240 RTIPPELQEQLK----TVKTLAKEVWQEKAKVERLQRILEGEGNQEAGKELKAGELRLQLGVLRVTVADIEKELAELR-- 1313
Cdd:TIGR02168  837 ERRLEDLEEQIEelseDIESLAAEIEELEELIEELESELEALLNERASLEEALALLRSELEELSEELRELESKRSELRre 916
                          170       180       190
                   ....*....|....*....|....*....|.
gi 1928115501 1314 --ESQDRgKAAMENSVSEASLYLQDQLDKLR 1342
Cdd:TIGR02168  917 leELREK-LAQLELRLEGLEVRIDNLQERLS 946
 
Blast search parameters
Data Source: Precalculated data, version = cdd.v.3.21
Preset Options: Database: CDSEARCH/cdd; Low complexity filter: no; Composition Based Adjustment: yes; E-value threshold: 0.01
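
Every hit on this page carries a summary line of the form `Pssm-ID: ... Cd Length: ... Bit Score: ... E-value: ...`, and the preset E-value threshold is 0.01. A minimal sketch of parsing such lines from the page text and keeping hits at or below a chosen threshold is given below. The names `SUMMARY`, `parse_summary`, and `filter_hits` are hypothetical, and comparing with `<=` is an assumption of this sketch rather than a statement of how CD-Search applies its threshold.

```python
import re

# Matches the per-hit summary line as it appears in this text dump.
SUMMARY = re.compile(
    r"Pssm-ID:\s*(?P<pssm_id>\d+)\s*(?:\[(?P<kind>[^\]]+)\])?\s*"
    r"Cd Length:\s*(?P<cd_length>\d+)\s*"
    r"Bit Score:\s*(?P<bit_score>[\d.]+)\s*"
    r"E-value:\s*(?P<e_value>[\d.eE+-]+)"
)


def parse_summary(line):
    """Parse one summary line into a dict, or return None if it does not match."""
    m = SUMMARY.search(line)
    if m is None:
        return None
    return {
        "pssm_id": int(m.group("pssm_id")),
        "kind": m.group("kind"),
        "cd_length": int(m.group("cd_length")),
        "bit_score": float(m.group("bit_score")),
        "e_value": float(m.group("e_value")),
    }


def filter_hits(lines, threshold=0.01):
    """Return parsed hits whose E-value is at or below `threshold`."""
    hits = (parse_summary(line) for line in lines)
    return [h for h in hits if h is not None and h["e_value"] <= threshold]


if __name__ == "__main__":
    page_lines = [
        "Pssm-ID: 367362 [Multi-domain]  Cd Length: 786  Bit Score: 76.52  E-value: 1.36e-13",
        "Pssm-ID: 441187 [Multi-domain]  Cd Length: 236  Bit Score: 40.29  E-value: 6.12e-03",
    ]
    for hit in filter_hits(page_lines, threshold=1e-5):
        print(hit["pssm_id"], hit["e_value"])
```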

References:

  • Wang J et al. (2023), "The conserved domain database in 2023", Nucleic Acids Res. 51(D):384-8.
  • Lu S et al. (2020), "The conserved domain database in 2020", Nucleic Acids Res. 48(D):265-8.
  • Marchler-Bauer A et al. (2017), "CDD/SPARCLE: functional classification of proteins via subfamily domain architectures", Nucleic Acids Res. 45(D):200-3.