NCBI Bookshelf. A service of the National Library of Medicine, National Institutes of Health.
Madame Curie Bioscience Database [Internet]. Austin (TX): Landes Bioscience; 2000-2013.
In this chapter we review recent studies of repeat proteins, a class of proteins consisting of tandem arrays of small structural motifs that stack approximately linearly to produce elongated structures. We discuss the observation that, despite lacking the long-range tertiary interactions that are thought to be the hallmark of globular protein stability, repeat proteins can be as stable and as cooperatively folded as their globular counterparts. The symmetry inherent in the structures of repeat arrays, however, means that there can be many partly folded species (whether intermediates or transition states) that have similar stabilities. Consequently, repeat proteins do have distinct folding properties compared with globular proteins, and these are manifest in their behaviour both at equilibrium and under kinetic conditions. Thus, when studying repeat proteins one appears to be probing a moving target: a relatively small perturbation, by mutation for example, can result in a shift to a different intermediate or transition state. The growing literature on these proteins illustrates how their modular architecture can be adapted to a remarkable array of biological and physical roles, both in vivo and in vitro. Further, their simple architecture makes them uniquely amenable to redesign (of their stability, folding and function), promising exciting possibilities for future research.
Introduction
Repeat proteins, such as ankyrin repeats, leucine rich repeats (LRR) and tetratricopeptide repeats (TPR), consist of tandem arrays of a small structural motif of ∼20-40 amino acids that stack in a roughly linear fashion, creating elongated and superhelical architectures. They are mostly found in arrays of between four and several tens of repeats, and they can comprise an entire protein or a domain within a larger protein. Repeat protein structures have simple topologies composed entirely of short-range interactions between residues within a repeat or between adjacent repeats, in striking contrast to the more complex topologies of globular proteins that are stabilized by contacts between residues distant in sequence. They are ubiquitous and highly versatile, mediating molecular recognition in many different biological processes. A picture is emerging, from work carried out in the last five years, of the simplicity of this modular architecture, which has enabled the following: (1) proteins containing consensus repeat motifs have been produced with enhanced thermodynamic stabilities compared with their natural counterparts; (2) designed consensus repeat proteins have been used successfully as scaffolds for engineering novel binding specificities; (3) repeat proteins have folding pathways that are amenable to design. None of these features has been found with any consistency for globular proteins. By far the most work has been carried out on ankyrin repeat proteins, followed by TPR proteins, and these two will therefore be the focus of this review.
Repeat Protein Structures
As with globular proteins, the repeat protein family can be broken down further into sub-families based on their secondary structure content. A range of different repeat proteins is shown in Figure 1. The most common forms of α-helical repeat proteins are ankyrin repeats and TPR repeats. Ankyrin repeats (named after the protein Ankyrin in which these repeats were first identified1) have an L-shaped profile, with a long loop/β-hairpin sitting at right angles to a pair of antiparallel α-helices (Fig. 1a and b). The helices of one repeat pack against the helices of an adjacent repeat, with the interface between repeats consisting of mainly hydrophobic interactions stabilising the helices and hydrogen bonding stabilising the long loop region. The two helices of the protein are of different lengths, causing curvature in the structure with each repeat rotated 2-3° counterclockwise relative to the previous one. The tetratricopeptide repeat (TPR) consists of a 34-residue motif2,3 that adopts an antiparallel helix-turn-helix arrangement without any long connecting loops, unlike ankyrin repeats. The packing of a pattern of small and large hydrophobic residues results in a rather splayed arrangement of the α-helices. The crystal structure of protein phosphatase 5 contains three TPR repeats and forms a right-handed superhelical structure with a continuous helical groove4 (Fig. 1c). The groove is the proposed site of protein-protein interactions. The curvature that results from arrays of large numbers of repeats is particularly well-illustrated by the structure of the porcine ribonuclease inhibitor5 (Fig. 1g).
Interestingly, not all repeat proteins form linear elongated structures. The prokaryotic TPR protein NlpI forms a globular structure in which repeats from the N-terminus form a central hydrophobic core and the C-terminal repeats form the outer solvent-exposed surface, with the entire protein resembling a spiral.6 Further, WD40 repeats form propeller-like structures with each repeat folding into a β-sheet resembling a blade of the propeller7 (Fig. 1h).
Repeat Proteins as Mediators in Molecular Recognition
The large and regular surfaces of repeat proteins make them ideally suited to mediating molecular recognition events. Repeat proteins provide a central scaffold with a conserved sequence motif and typically have highly variable sequences in loops presented on the surface. In this way, the loops can provide varied interaction surfaces in terms of geometry as well as functional group content. Repeat domains are frequently found in proteins containing other domains and they may serve to recruit substrates in these cases. An important class of such proteins is the F-box proteins.8 SCF complexes are an important class of E3 ubiquitin ligases that target key cell cycle regulators and transcription factors to the proteasome for degradation. SCF complexes consist of two invariable, core subunits and a third, variable, subunit, the F-box protein that recruits substrates for ubiquitination by an associated E2 enzyme. F-box proteins have an F-box domain that binds to the core SCF subunits and a substrate recognition domain, in many cases composed of LRRs or WD40 repeats.
Ankyrin repeats have been extensively studied in terms of their protein-protein interactions. For example, the INK4 family of ankyrin repeat proteins specifically inhibit the cyclin-dependent kinases (CDK) of the cell cycle (Fig. 1a). There are four members of the INK4 family of CDK inhibitors: p15INK4B, p16INK4A, p18INK4C, p19INK4D. All four proteins are biochemically indistinguishable with respect to inhibition of CDK4/6. Understanding the mechanism of molecular recognition of these proteins is made more important by the fact that a common event in tumorigenesis is mutation of the regulatory proteins involved in the G1-S transition. The INK4 family of proteins specifically inhibit the cyclin-dependent kinases CDK4 and CDK6 responsible for initiating this transition. Insertions in the ankyrin repeat motif are common and usually found in the loop regions. In the case of the INK4 proteins, the first helix in the second repeat is shorter than the consensus one.
The ankyrin repeats of the protein IκBα are an interesting example of the phenomenon of protein folding induced by binding, as is seen in an increasing number of protein-protein interactions. IκBα is an inhibitor of the transcription factor NF-κB and the NF-κB-binding domain of IκBα has six ankyrin repeats (Fig. 1b). The structure of IκBα is only known in complex with NF-κB but extensive biophysical analysis revealed that although the protein is compactly folded and has all of its secondary structural elements it nevertheless has some highly dynamic, molten-globule-like regions in both free and complexed forms.9 Subsequent analysis using hydrogen/deuterium exchange monitored by mass spectrometry showed that the β-hairpins of ankyrin repeats 5 and 6 only fold upon binding to NF-κB.10,11 The authors propose that this feature of IκBα may be important for several aspects of its function that are critical for the tight regulation of NF-κB activity. In particular, the flexibility of the hairpins may help it to bind to different NF-κB dimers resulting in the transcription of different genes and also facilitate the dissociation of NF-κB from DNA.
Another example of the molecular recognition capabilities of repeat proteins is found in nuclear transport (reviewed in 12,13). The import of cargoes into the nucleus of cells is dependent on Armadillo repeat- and HEAT repeat-containing proteins known as the importins. The Armadillo repeat was first identified in the Drosophila melanogaster gene product Armadillo.14 These proteins consist of tandem repeats of an approximately 40-residue motif which folds into a three-helix bundle. The approximately 40-residue HEAT motif consists of only two helices. The HEAT motif was initially found in a diverse range of proteins, including the four from which its name derives: huntingtin, elongation factor 3, the PR65/A subunit of protein phosphatase 2A and the lipid kinase Tor (target of rapamycin).15 In both armadillo- and HEAT-repeat proteins, the motifs stack together to form an elongated super-helix. Cargoes destined for the nucleus and displaying the nuclear localisation signal (NLS),16-19 a short predominantly positively charged peptide, bind importin-α, a protein containing 10 armadillo repeats in which the super-helix creates a narrow internal groove that envelops the NLS and displays its own positively charged peptide20 (Fig. 1e). This importin-α peptide interacts with a 19-HEAT repeat protein, importin-β/β-karyopherin21 (Fig. 1d). The resulting large complex enables the cargo to translocate through the nuclear pore. Not only are these interactions extremely specific for the NLS peptide, but the outer surface of these proteins also allows the complex to dissolve into the central channel of the pore itself in a highly selective fashion. A few cargo proteins bind directly to importin-β rather than via importin-α. Once in the nucleus, RanGTP binds to importin-β, thereby dissociating the importin-αβ complex and leading to release of the cargo. Another 19-HEAT repeat protein, CAS, then exports importin-α out of the nucleus so that it can be used again.
Crystal structures reveal that the HEAT repeats of both importin-β and CAS function as tightly wound springs with the curvature created by the packing of the individual α-helices and thereby the twist of the super-helix, changing dramatically depending on whether they are in the free or complexed form. It is thought that this flexibility enables them to bind to a range of different partners by wrapping around them with varying helicoidal pitch.
The regular presentation of surface residues provided by the repeat protein architecture lends itself to roles other than just mediating interactions between proteins. For example, the antifreeze proteins of some invertebrates form fascinating structures that are able to directly bind to ice crystals. These antifreeze proteins, found in invertebrates such as the Tenebrio molitor beetle, are composed of tandem 12-residue repeats (TCTxSxxCxxAx) which form linear beta-helical structures which are either square or triangular in cross section22 (Fig. 1f). Remarkably, these beta-helical structures present a regular array of optimally orientated binding pockets which bind water molecules at the edges of ice crystals, thereby preventing further ice crystal growth and therefore damage to the organism.
Designing Repeat Proteins
A number of groups have taken advantage of the symmetry in linearly repeating structures to successfully design novel proteins that have been found to have increased stability relative to the natural counterparts. To date, they have all applied, at least in part, the concept of consensus design,23 that is, engineering a protein to have the most common residue at each position as determined by multiple sequence alignment. The rationale is that the residues that are most highly conserved among proteins of a particular fold will be those that are important for the stability of that fold, although there are a number of reasons why this rationale may be imperfect.24 Two groups have designed consensus ankyrin repeat proteins, making use of the relatively high sequence homology within ankyrin repeats. Pluckthun's group used libraries assembled from N- and C-terminal capping repeats and an internal repeat consensus sequence with 27 fixed positions and the remaining 6 randomized. Six of the library members were chosen at random, containing between four and six repeats.25-27 Crystallography of one of the proteins showed that it adopted the correct fold.28,29 The proteins were more stable (as measured by equilibrium chemical denaturation experiments) than natural ankyrin repeat proteins of similar sizes and five of the six were monomeric. The most stable protein had six repeats and a free energy of unfolding that was greater than 20 kcal.mol−1.25 A comparison of the hydrophobic core packing of the designed ankyrin repeat structure with natural ankyrin repeats did not reveal any significant differences. The authors suggest that elimination of insertions, commonly found in the loop regions and used for molecular recognition, may contribute to the enhanced stability of these designed proteins relative to their natural counterparts; the loop regions form a greatly improved, regular network of hydrogen bonds.
Increased stability is also likely to come from the conserved TPLH motif in the first helix of the ankyrin repeat which again results in a network of hydrogen bonds extending throughout the molecule, as well as from the tight packing of hydrophobic residues between the α-helices. A similar approach was used to design leucine-rich repeat protein libraries. As for the consensus ankyrin repeat proteins, the leucine-rich repeat proteins were found to be expressed at very high levels in E. coli, soluble, monomeric and stable.30 Peng's group designed a consensus ankyrin repeat sequence that differed in four positions from Pluckthun's.31 The other major difference was that they did not use capping repeats that were different from the internal repeats. They built proteins consisting of 1-4 ankyrin repeats and X-ray crystal structures of the two larger proteins showed that they adopted the correct fold.31 The designed proteins were also found to have high thermal stability but they were only soluble in acidic conditions. The solubility at neutral pH was improved by replacing surface leucine residues with arginines.
Pluckthun's group have since gone on to show that the idea of using consensus ankyrin repeats as scaffolds on which to graft novel binding specificities can have great success. The residues to be used for molecular recognition are located on the β-turn and first helix of each repeat, creating a large and modular surface. Noteworthy are designed 4-ankyrin repeat proteins (DARPins) selected against human epidermal growth factor receptor 2 with high specificity and low nanomolar affinities.32 The interaction was subsequently improved to a 90 pM affinity using error-prone PCR and stringent off-rate selection.33
Regan's group have made novel TPR proteins by consensus design. They constructed proteins containing 1.5-3.5 TPR motifs34 and additionally incorporated an N-capping, helix-stabilising GNS sequence at the N-terminus and an extra solvating helix at the C-terminus. The rationale for the latter addition was that similar helices were present in all of the structures of TPR-containing proteins and were found to increase solubility. The 2.5- and 3.5-TPR repeat proteins folded into the correct structures and had high thermal stabilities.35 Whereas the enhanced stability of the consensus-designed ankyrin repeat proteins appears to come from the regularised hydrogen bonding networks, the stabilising interactions in the consensus TPR structures are predominantly the large hydrophobic residues that force the α-helices apart into the characteristic elongated architecture. This group then took the consensus TPR repeats and redesigned them to bind to the C-terminal peptide of the molecular chaperone Hsp90 (a natural TPR binding partner).36 The designed protein had greater specificity than the natural TPR proteins.
Several studies have recently shown that the stabilities of naturally occurring ankyrin repeat proteins can be enhanced by using the consensus sequence.37-39 For example, for the 7-ankyrin repeat domain of Notch, replacement of the two C-terminal repeats by consensus repeats increased the stability of the protein by almost 6 kcal.mol−1.38 Substitution by consensus residues of two residues in the N-terminal repeat of the 4-ankyrin repeat protein myotrophin increased the stability by over 2 kcal.mol−1.39 The 6-ankyrin repeat region of IκBα has only marginal stability; engineering of three residues, one in each of three of the repeats, to conform to the extended motif GXTPLHLA of the consensus (that creates a hydrogen bonding network as described above) resulted in an increase in stability of 4.5 kcal.mol−1.37
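The consensus design concept used throughout this section (taking the most common residue at each position of a multiple sequence alignment) can be illustrated with a minimal sketch. The function name, the gap convention and the toy alignment are assumptions made here for illustration; the toy sequences are invented variations on the GXTPLHLA motif and are not real ankyrin repeat data, and the published designs involved considerations well beyond this simple majority rule.

```python
from collections import Counter


def consensus_sequence(aligned_seqs, gap="-"):
    """Return the most common non-gap residue at each alignment column."""
    if not aligned_seqs:
        raise ValueError("need at least one aligned sequence")
    length = len(aligned_seqs[0])
    if any(len(seq) != length for seq in aligned_seqs):
        raise ValueError("all aligned sequences must have the same length")

    consensus = []
    for column in zip(*aligned_seqs):
        counts = Counter(res for res in column if res != gap)
        # An all-gap column stays a gap; ties go to the residue seen first.
        consensus.append(counts.most_common(1)[0][0] if counts else gap)
    return "".join(consensus)


# Toy alignment, invented for illustration only (hypothetical sequences).
toy_alignment = [
    "GKTPLHLA",
    "GRTPLHIA",
    "GNTALHLA",
    "GKTPLHLA",
    "-KSPLHLA",
]

print(consensus_sequence(toy_alignment))
```

In practice a per-column frequency table is usually more informative than the single majority residue, since weakly conserved positions are the natural candidates for randomisation in a library.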
Biophysical Properties of Repeat Proteins
The distinctive architecture of repeat proteins compared with globular proteins makes them a particularly interesting target for biophysical analysis. The simple, modular nature of the structures and the lack of long-range interactions (which are thought to contribute to the cooperative folding of globular proteins) pose a number of questions. First, are repeat proteins more or less stable than globular proteins of similar size? Second, are repeat structures less cooperatively folded given the lack of long-range contacts? Third, are the differences in the structures of repeat proteins compared with globular proteins reflected in their folding mechanisms? In particular, it has been shown that there is a correlation between folding rate and proportion of short-range contacts in the native structure because there is a smaller entropic cost of closing a short loop to form an interaction between residues close in sequence than closing a long loop between sequence-distant residues;40 so, do repeat proteins fold faster than globular ones because of the predominance of short-range contacts? And, are there multiple folding pathways accessible due to the linear symmetry of their structures?
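One widely used way of quantifying the proportion of short-range contacts is the relative contact order, the mean sequence separation of contacting residue pairs divided by the chain length. Whether this is exactly the measure used in the cited study is an assumption here. The sketch below computes it for two invented contact lists on a hypothetical 12-residue chain, one with only local contacts (as in a repeat-like topology) and one with contacts between sequence-distant residues.

```python
def relative_contact_order(contacts, n_residues):
    """Mean sequence separation of contacting pairs, divided by chain length.

    contacts   -- iterable of (i, j) residue-number pairs that are in contact
    n_residues -- total number of residues in the chain
    """
    contacts = list(contacts)
    if not contacts or n_residues <= 0:
        raise ValueError("need at least one contact and a positive chain length")
    total_separation = sum(abs(j - i) for i, j in contacts)
    return total_separation / (len(contacts) * n_residues)


# Hypothetical contact lists for a 12-residue chain (illustrative only).
local_contacts = [(1, 4), (2, 5), (5, 8), (6, 9), (9, 12)]
long_range_contacts = [(1, 12), (2, 11), (3, 10), (4, 9), (5, 8)]

print(relative_contact_order(local_contacts, 12))
print(relative_contact_order(long_range_contacts, 12))
```

The local-contact topology gives the lower value, which on the basis of the correlation described above would be expected to correspond to faster folding.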
Repeat Protein Stability
Perhaps surprisingly given the lack of long-range interactions, the natural ankyrin repeat proteins and designed TPR proteins studied to date have stabilities within the range found for globular proteins. Several studies have shown that both the free energy of unfolding and the m-value (a measure of the size of the structural unit that is unfolding in a cooperative step) increase with increasing number of repeats, indicating that there is cooperativity of folding, although cooperativity does break down after a certain number of repeats is reached (see below).25,35 Truncations of p1641 and the Notch42 ankyrin domain indicate that a minimum of two ankyrin repeats is required for a cooperative unit of structure, whereas Regan and colleagues were able to design a cooperatively folded polypeptide consisting of only one and a half TPR motifs. These findings confirm the observation from contact maps of ankyrin repeat and TPR proteins indicating that the TPR module is a more independently folded unit of structure that makes more contacts within the module than between modules, whereas the ankyrin module makes many contacts with adjacent modules.43 In nature, ankyrin repeat and TPR-containing domains rarely contain fewer than three tandem motifs.
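The relationship between the free energy of unfolding, the m-value and the shape of a chemical denaturation curve can be sketched with the standard two-state linear extrapolation model, in which the free energy of unfolding falls linearly with denaturant concentration. The parameter values below are hypothetical and chosen only to show that, for the same midpoint, a larger free energy and m-value give a steeper (more cooperative) transition; they are not taken from any of the studies cited.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal mol^-1 K^-1


def fraction_unfolded(denaturant, dg_water, m_value, temp_k=298.15):
    """Two-state fraction unfolded at a given denaturant concentration (M).

    dg_water -- free energy of unfolding in water (kcal/mol), assumed value
    m_value  -- dependence of that free energy on denaturant (kcal/mol/M)
    """
    dg = dg_water - m_value * denaturant  # linear extrapolation model
    return 1.0 / (1.0 + math.exp(dg / (R_KCAL * temp_k)))


def midpoint(dg_water, m_value):
    """Denaturant concentration at which half the molecules are unfolded."""
    return dg_water / m_value


# Two hypothetical proteins sharing a 2.5 M midpoint.
small = {"dg_water": 5.0, "m_value": 2.0}
large = {"dg_water": 10.0, "m_value": 4.0}

for conc in (2.0, 2.5, 3.0):
    print(conc, fraction_unfolded(conc, **small), fraction_unfolded(conc, **large))
```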
The biophysical properties of proteins consisting of 4-7 ankyrin repeats have been studied the most extensively. The data show that, despite their modular architecture, these ankyrin repeat proteins unfold in as cooperative a manner as do globular proteins. Studies of deletion variants of the Notch ankyrin domain suggest that cooperativity arises because the interaction between neighbouring repeats is highly stabilising whereas the interactions made by residues within a repeat are destabilising.44 The observed cooperativity is particularly impressive when one considers that the 7-ankyrin repeat proteins Notch and gankyrin each comprise over 200 residues and yet they appear to unfold in a two-state manner. From the globular proteins studied to date, two-state unfolding appears to break down well before this size is reached. The less well-studied repeat proteins containing β-strand structure have also been found to unfold cooperatively,45 although a series of designed LRR proteins of increasing length showed deviation from two-state behaviour30 and unfolding of the very large, β-helical protein pertactin also occurred via a clearly observed intermediate.46
One feature of repeat proteins that does distinguish them from globular proteins, and that arises from the modularity of their structures, is the capacity to make them bigger or smaller by simply inserting or deleting modules. Whereas the insertion of large structural elements into globular proteins would be expected to be highly destabilising, this is not the case for repeat proteins. These types of changes have been made most comprehensively to the Notch ankyrin domain.38,42,44,47 The variants in which an internal repeat was duplicated were somewhat more stable than the wild type, but a breakdown in cooperativity of unfolding was observed when more than one duplicated internal repeat was inserted. The inter-repeat interfaces were not optimised in these variants, which could explain their reduced cooperativity. However, other results suggest an alternative explanation. First, the consensus ankyrin repeat proteins designed by Pluckthun and Peng did not explicitly include a consideration of inter-repeat packing, but these proteins did show two-state unfolding. It is likely, therefore, that the consensus sequence is optimised for inter-repeat stability, as is borne out by the crystal structures of the consensus repeat proteins which reveal extended H-bonding networks. Further, recent studies of Notch have shown that two-state behaviour is maintained when two of its repeats are replaced by consensus repeats, although when the two consensus repeats are added to the protein then cooperativity breaks down.38
These various studies indicate that cooperative unfolding of ankyrin repeat proteins (and probably other repeat proteins also) is governed predominantly by the number of repeats in the array. In addition to this primary factor, there will be more subtle, sequence-dependent effects. Thus, both the 7-ankyrin repeat proteins Notch47,48 and gankyrin (R. Hutton and LSI, manuscript in preparation) show a break-down in cooperativity on mutation at certain sites but not at others. Both proteins unfold in a non-two-state manner when mutations are made in the two C-terminal repeats, but not in the five N-terminal repeats. The results indicate that stability is distributed unevenly between the repeats of a protein (as expected given the non-identity of repeat sequences in a natural protein) and suggest that an even energy distribution is required for cooperativity.47
Ising Models and Repeat Protein Folding
The inter-repeat coupling in the context of cooperativity has also been explored using one-dimensional Ising models.49 In these models, each repeat is represented as having one of two states or spins (s_i = ±1) which correspond to that repeat being either folded or unfolded. Further, each repeat has an intrinsic equilibrium constant for folding and there is an additional equilibrium constant which describes the interactions or coupling between adjacent repeats. Mello and Barrick applied this method to the equilibrium unfolding of the Notch ankyrin domain, utilising values for the equilibrium constants derived from a series of variants having deletions of the internal repeats. Their analysis indicated that fully folded and fully unfolded conformations comprised the majority of all species across the entire denaturant range used in the experiment. Even at the midpoint of the unfolding transition, a maximum of only 2% of all conformations were partly folded molecules with fewer than six of the seven repeats folded.44 Again, these results provide compelling evidence for the high level of coupling between adjacent repeats in these natural structures.
In contrast, Kajander et al. utilised the one-dimensional Ising model to study the cooperativity of folding in designed TPR repeat proteins. Here, several repeat proteins with varying numbers of consensus repeats were synthesised. By simultaneously fitting the data for all of the proteins with varying numbers of TPR repeats, Kajander et al. were able to extract the equilibrium constants for folding and also coupling. For TPR proteins containing multiple copies of the consensus sequence, it appears that the probability of partially unfolded conformations at the unfolding transition midpoint is high; up to 25% of the population was found to be in a partially unfolded state for a construct containing three repeats.50 These results indicate that there may be some optimisation of the coupling for cooperative folding in naturally occurring proteins which is absent in proteins containing multiple repeats of an identical sequence.
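The way in which nearest-neighbour coupling suppresses partly folded species can be illustrated with a minimal sketch of a homogeneous one-dimensional Ising-type model. Each repeat is folded (1) or unfolded (0), every folded repeat contributes an intrinsic folding equilibrium constant (called kappa here) and every pair of adjacent folded repeats contributes a coupling constant (tau). All names and parameter values are assumptions for illustration: identical repeats are assumed, the midpoint is defined here as the condition under which the fully folded and fully unfolded states have equal weight, and the numbers are not the fitted values from the Notch or TPR studies.

```python
from itertools import product


def state_weight(state, kappa, tau):
    """Statistical weight of one configuration of folded (1)/unfolded (0) repeats."""
    n_folded = sum(state)
    n_coupled = sum(1 for a, b in zip(state, state[1:]) if a and b)
    return kappa ** n_folded * tau ** n_coupled


def partly_folded_fraction(n_repeats, kappa, tau):
    """Population that is neither fully folded nor fully unfolded."""
    states = list(product((0, 1), repeat=n_repeats))
    weights = {s: state_weight(s, kappa, tau) for s in states}
    partition = sum(weights.values())
    end_states = weights[(1,) * n_repeats] + weights[(0,) * n_repeats]
    return 1.0 - end_states / partition


def kappa_at_midpoint(n_repeats, tau):
    """Intrinsic constant at which fully folded and fully unfolded weights are equal."""
    return tau ** (-(n_repeats - 1) / n_repeats)


# Hypothetical coupling strengths for a 7-repeat and a 3-repeat array.
for n_repeats in (7, 3):
    for tau in (10.0, 1000.0):
        kappa = kappa_at_midpoint(n_repeats, tau)
        print(n_repeats, tau, partly_folded_fraction(n_repeats, kappa, tau))
```

Exhaustive enumeration of the 2^n configurations is used because it is transparent and cheap for the array sizes discussed here; a transfer-matrix formulation would be the usual choice for much longer arrays or for fitting experimental data.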
Characterisation of Equilibrium (Un)Folding Intermediates
The studies of Notch, involving a series of variants having deletions of the internal repeats, indicate that the protein folds at equilibrium through an intermediate comprising folded central repeats and the terminal repeats unfolded.44 Although comprising only five ankyrin repeats, the protein p19 unfolds at equilibrium via an intermediate which was recently characterised by NMR-monitored hydrogen/deuterium exchange.51 This intermediate was found to have the three C-terminal repeats folded and the two N-terminal repeats unfolded.
Computer simulations of ankyrin repeats have made a number of predictions about their properties, including that the cooperativity breaks down when the protein contains more than six or seven repeats.52 The sensitivity of the two-state unfolding of 7-ankyrin repeat proteins Notch and gankyrin to single amino acid substitutions bears out this prediction. However, recent studies of the much larger, 12-ankyrin repeat protein, D34, suggest that the potential degree of cooperativity in ankyrin repeat proteins could be much greater than was first envisaged. Wild-type D34 unfolds in two parts via an intermediate in which the N-terminal approximately six repeats are unfolded and the C-terminal six repeats are folded, a result that concurs with the idea that a cooperatively folded ankyrin ‘domain’ consists of six or seven repeats.53 However, mutants can be made in which as many as eleven ankyrin repeats unfold in a single step. Mutations in the C-terminal repeats destabilise the wild-type intermediate, causing different intermediates to be populated. The closer to the C-terminus the mutation, the fewer repeats are structured in the intermediate; thus, structure in the intermediate frays from the site of the mutation, a domino-like effect.
Mechanisms of Repeat Protein Folding
As described above for the equilibrium behaviour, the basic kinetic folding characteristics of ankyrin repeat proteins appear also to mirror those of globular proteins. Thus, the refolding kinetics of myotrophin,39,54 p16,55,56 Notch ankyrin domain,57 gankyrin (R. Hutton and LSI, manuscript in preparation) and p19,51,58 comprising between 118 and 238 residues, are multi-phasic when monitored by CD or fluorescence and proceed via partly folded intermediates. Only for the designed 3-ankyrin repeat protein E1_5 were the refolding and unfolding kinetics monophasic.59 The kinetic folding or unfolding mechanisms of five ankyrin repeat proteins have been studied in detail using a protein engineering approach. These are p16,56 Notch ankyrin domain,60 myotrophin,39 D34 (N. Werbeck and LSI, submitted) and gankyrin (R. Hutton and LSI, manuscript in preparation). In the first of these studies, on p16, eighteen mutations spread throughout the four repeats showed a strikingly clear pattern of unfolding and refolding behaviour which indicated a transition state for unfolding with structure polarised in the C-terminal repeats.56 Subsequent studies of myotrophin and gankyrin also indicated a structurally polarised folding mechanism.39 However, a different picture emerged for the Notch ankyrin domain. Analysis of seven mutants, one in each repeat, showed that folding is initiated at three internal repeats.60 On the one hand, the terminal repeats might be expected to fold first since the entropic cost of fixing the end of the polypeptide chain would be less than for fixing the internal repeats. On the other hand the internal repeats might be expected to fold first since these have the potential to make stabilising interactions with more neighbouring repeats than the terminal repeats. However, it is likely that the fine details of each protein sequence also play a critical role.
For globular proteins, there is a balance between the entropic cost of closing a loop in order to bring distant residues together and the enthalpic gain of the resulting contacts that are made. The region of a globular protein that folds first is therefore the one that uses the best contacts for which the entropy penalty is minimal. The linear symmetry of repeat proteins, however, means that equivalent positions in each repeat of a protein pay the same entropic cost of bringing other residues into contact with it. It is only the enthalpic gain that varies between repeats, being determined by the individual sequences of the different repeats. Therefore, the repeat(s) that folds first will be the ones that are the most stable, assuming that the lowest energy folding route (i.e., transition state) corresponds to the lowest energy folded repeat(s). This simple picture of folding the most stable repeats first is borne out by experiments, described below, that show the folding pathway can be changed in a predictable way simply by manipulating the stabilities of individual repeats. Finally, for each of p16, myotrophin, Notch and gankyrin, the structure of the rate-determining transition states comprises more than a single ankyrin motif. This result is as expected, since the ankyrin motif in isolation has been shown to be intrinsically unstable whereas the interaction between motifs is highly favourable.
Multiple Folding Pathways of Ankyrin Repeat Proteins
The most striking feature of both myotrophin and gankyrin (and also D34, see below) is that there are alternative pathways accessible to the native state in which folding is initiated at one or other end of the structure.39 The authors were able to redesign the folding pathway of myotrophin very simply. By taking advantage of the modular structure and manipulating the stabilities of the individual repeats, they could switch the folding initiation site from one end of the structure to the other.39 Molecular dynamics simulations carried out on a number of ankyrin repeat proteins, before the experimental data were available, also predicted pathway heterogeneity.61 The concept of energy landscapes suggests that a protein can follow multiple folding pathways. However, there has been very little experimental evidence to support this view for globular proteins. Experimental studies indicate that folding transition states of small globular proteins are represented by relatively homogeneous ensembles of structures (excluding parallel folding reactions that result from heterogeneity in the denatured state, due to proline isomerisation for example). Moreover, the folding mechanisms of globular proteins are generally robust and therefore only a drastic change in the energetic balance, by circular permutation for example, can in some cases shift the folding nucleus from one part of the structure to another. By contrast, the potential to initiate folding at more than one site may be a general feature of repeat proteins that arises from the symmetry inherent in their structures.
Alternative pathways were, however, not observed for p16 or Notch (although a Notch variant containing two consensus-stabilised repeats showed evidence of a shift to an alternative folding pathway38). Presumably the different behaviour reflects the distribution of stability across the repeats. Thus, when the repeats within a protein have similar stabilities, multiple pathways may be accessible if the energy barriers to their folding are also of similar height; by contrast, when some repeats have significantly greater stabilities than others, there is a unique folding pathway. Interestingly, preliminary studies on the 12-ankyrin repeat D34 indicate that the two halves of the protein each display unfolding mechanisms resembling one or other of these two scenarios (N. Werbeck and LSI, manuscript submitted). The C-terminal half, which folds and unfolds in the absence of a folded N-terminal half, can follow alternative pathways; in contrast, the N-terminal half folds and unfolds in the presence of a folded C-terminal half, which therefore acts as a “seed” and consequently directs folding along a unique pathway.
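The dependence of pathway choice on barrier height can be illustrated with a short calculation. This is a sketch under simplifying assumptions rather than an analysis taken from the studies cited: the pathways are treated as independent and parallel, each rate is taken to be proportional to exp(-ΔG‡/RT) with a common prefactor, and the barrier values (in kcal/mol) are hypothetical.

```python
import math
from typing import List

R_KCAL = 1.987e-3  # gas constant in kcal/(mol K)


def pathway_fractions(barriers: List[float], temperature: float = 298.0) -> List[float]:
    """Fraction of folding flux through each parallel pathway.

    Assumes each pathway's rate is proportional to exp(-barrier / RT)
    with the same prefactor, so only barrier differences matter.
    """
    rt = R_KCAL * temperature
    lowest = min(barriers)
    # Subtract the lowest barrier before exponentiating for numerical stability.
    weights = [math.exp(-(b - lowest) / rt) for b in barriers]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical barriers for folding initiated at the N- or C-terminal end.
similar = pathway_fractions([10.0, 10.0])    # repeats of similar stability
different = pathway_fractions([10.0, 13.0])  # one end markedly more stable

print("similar barriers:  ", [round(f, 3) for f in similar])
print("different barriers:", [round(f, 3) for f in different])
```

Under these assumptions, equal barriers split the flux evenly between the two initiation sites, whereas a difference of a few kcal/mol sends almost all molecules along a single route, consistent with the distinction drawn above between proteins with multiple pathways and those with a unique one.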
Mechanical Properties of Ankyrin Repeats
In addition to the solution folding properties of repeat proteins, there has been recent interest in their folding and unfolding under mechanical perturbation. Whereas the curvature created in smaller repeat proteins provides a concave face that is ideal as an interaction surface, structures composed of a greater number of repeats resemble solenoids, forming a continuous spiral or spring-like topology. Large ankyrin repeat proteins consisting of many tens of repeats have been postulated to have mechano-signal transduction roles in processes such as hearing. Stacks of ankyrin repeats found in the cytoplasmic domains of the transient receptor potential (TRP) ion channels have been proposed as springs which provide a resting tension.62-64 These ion channels are mechanically gated by deflections of the stereocilia in response to sound. Recent work using single molecule atomic force microscopy (AFM) has demonstrated that ankyrin repeat proteins work as “nanosprings” within this role; their modular nature enables them to behave as a reversible spring that is capable of generating force upon refolding.65 Further, the repeats unfold sequentially under mechanical perturbation.66 These observations are consistent with those from the solution studies of folding and demonstrate the remarkable range of cellular roles of repeat proteins.
Self-Assembly and Higher-Order Structures in Repeat Proteins
One of the most interesting new classes of repeat proteins to emerge in recent years is the reflectin family of proteins, isolated from the Hawaiian bobtail squid. Although no structural information is currently available, the proteins are known to contain five repeats of an 18-20-residue sequence motif [M/FD(X)5MD(X)5MD(X)3/4] which is unusually rich in methionine.67 These proteins have the remarkable capability of self-assembling into regular higher-order structures that act as biological diffraction gratings, reflecting light in the visible wavelengths.68 Large platelets of these proteins form photonic structures which can be used to modulate macroscale changes such as overall body colouration in the squid. These proteins are providing a unique challenge to biophysicists to understand their folding and assembly as well as their physical properties.
Summary
Repeat proteins have turned out to be a fascinating class of structures, with particular appeal to those interested in protein folding, engineering and design. They are modular and therefore much more amenable to redesign than are globular proteins. Yet this feature does not appear to compromise cooperativity, a hallmark of globular proteins that is thought to be important in making them sufficiently robust to maintain their native structures over their functional lifetimes. The recent studies reviewed here reveal designability in every aspect of repeat proteins, including their stability, folding and function. It will be interesting to see whether these striking properties, observed to date for the relatively small ankyrin and TPR repeats, are recapitulated in the behaviour of larger repeat motifs such as Armadillo and HEAT. We also look forward with anticipation to seeing how these properties will be exploited and manipulated in the future.
References
- 1.
- Bennett V, Stenbuck PJ. Identification and partial purification of ankyrin, the high affinity membrane attachment site for human erythrocyte spectrin. J Biol Chem. 1979;254(7):2533–41. [PubMed: 372182]
- 2.
- Hirano T, Kinoshita N, Morikawa K. et al. Snap helix with knob and hole: essential repeats in S. pombe nuclear protein nuc2+. Cell. 1990;60(2):319–28. [PubMed: 2297790]
- 3.
- Sikorski RS, Boguski MS, Goebl M. et al. A repeating amino acid motif in CDC23 defines a family of proteins and a new relationship among genes required for mitosis and RNA synthesis. Cell. 1990;60(2):307–17. [PubMed: 2404612]
- 4.
- Das AK, Cohen PW, Barford D. The structure of the tetratricopeptide repeats of protein phosphatase 5: implications for TPR-mediated protein-protein interactions. EMBO J. 1998;17(5):1192–9. [PMC free article: PMC1170467] [PubMed: 9482716]
- 5.
- Kobe B, Deisenhofer J. Crystal structure of porcine ribonuclease inhibitor, a protein with leucine-rich repeats. Nature. 1993;366(6457):751–6. [PubMed: 8264799]
- 6.
- Wilson CG, Kajander T, Regan L. The crystal structure of NlpI. A prokaryotic tetratricopeptide repeat protein with a globular fold. FEBS J. 2005;272(1):166–79. [PubMed: 15634341]
- 7.
- Jawad Z, Paoli M. Novel sequences propel familiar folds. Structure. 2002;10(4):447–454. [PubMed: 11937049]
- 8.
- Willems AR, Goh T, Taylor L. et al. SCF ubiquitin protein ligases and phosphorylation-dependent proteolysis. Philos Trans R Soc Lond B Biol Sci. 1999;354(1389):1533–50. [PMC free article: PMC1692661] [PubMed: 10582239]
- 9.
- Croy CH, Bergqvist S, Huxford T. et al. Biophysical characterization of the free IkappaBalpha ankyrin repeat domain in solution. Protein Sci. 2004;13(7):1767–1777. [PMC free article: PMC2279933] [PubMed: 15215520]
- 10.
- Bergqvist S, Croy CH, Kjaergaard M. et al. Thermodynamics reveal that helix four in the NLS of NF-kappaB p65 anchors IkappaBalpha, forming a very stable complex. J Mol Biol. 2006;360(2):421–34. [PMC free article: PMC2680085] [PubMed: 16756995]
- 11.
- Truhlar SM, Torpey JW, Komives EA. Regions of IkappaBalpha that are critical for its inhibition of NF-kappaB. DNA interaction fold upon binding to NF-kappaB. Proc Natl Acad Sci USA. 2006;103(50):18951–6. [PMC free article: PMC1748158] [PubMed: 17148610]
- 12.
- Conti E, Muller CW, Stewart M. Karyopherin flexibility in nucleocytoplasmic transport. Curr Opin Struct Biol. 2006;16(2):237–44. [PubMed: 16567089]
- 13.
- Stewart M. Molecular mechanism of the nuclear protein import cycle. Nat Rev Mol Cell Biol. 2007;8(3):195–208. [PubMed: 17287812]
- 14.
- Riggleman B, Wieschaus E, Schedl P. Molecular analysis of the armadillo locus: uniformly distributed transcripts and a protein with novel internal repeats are associated with a Drosophila segment polarity gene. Genes Dev. 1989;3(1):96–113. [PubMed: 2707602]
- 15.
- Andrade MA, Bork P. HEAT repeats in the Huntington's disease protein. Nat Genet. 1995;11(2):115–6. [PubMed: 7550332]
- 16.
- Chi NC, Adam EJH, Adam SA. Sequence and characterization of cytoplasmic nuclear-protein import factor P97. J Cell Biol. 1995;130(2):265–274. [PMC free article: PMC2199936] [PubMed: 7615630]
- 17.
- Gorlich D, Kostka S, Kraft R. et al. 2 Different subunits of importin cooperate to recognize nuclear-localization signals and bind them to the nuclear-envelope. Curr Biol. 1995;5(4):383–392. [PubMed: 7627554]
- 18.
- Imamoto N, Shimamoto T, Kose S. et al. The nuclear pore-targeting complex binds to nuclear-pores after association with a karyophile. FEBS Lett. 1995;368(3):415–419. [PubMed: 7635189]
- 19.
- Radu A, Blobel G, Moore MS. Identification of a protein complex that is required for nuclear-protein import and mediates docking of import substrate to distinct nucleoporins. Proc Natl Acad Sci USA. 1995;92(5):1769–1773. [PMC free article: PMC42601] [PubMed: 7878057]
- 20.
- Andrade MA, Petosa C, O'Donoghue SI. et al. Comparison of ARM and HEAT protein repeats. J Mol Biol. 2001;309(1):1–18. [PubMed: 11491282]
- 21.
- Cingolani G, Petosa C, Weis K. et al. Structure of importin-beta bound to the IBB domain of importin-alpha. Nature. 1999;399(6733):221–229. [PubMed: 10353244]
- 22.
- Liou YC, Tocilj A, Davies PL. et al. Mimicry of ice structure by surface hydroxyls and water of a beta-helix antifreeze protein. Nature. 2000;406(6793):322–324. [PubMed: 10917536]
- 23.
- Desjarlais JR, Berg JM. Use of a zinc-finger consensus sequence framework and specificity rules to design specific DNA binding proteins. Proc Natl Acad Sci USA. 1993;90(6):2256–60. [PMC free article: PMC46065] [PubMed: 8460130]
- 24.
- Magliery TJ, Regan L. Beyond consensus: statistical free energies reveal hidden interactions in the design of a TPR motif. J Mol Biol. 2004;343(3):731–45. [PubMed: 15465058]
- 25.
- Binz HK, Stumpp MT, Forrer P. et al. Designing repeat proteins: well-expressed, soluble and stable proteins from combinatorial libraries of consensus ankyrin repeat proteins. J Mol Biol. 2003;332(2):489–503. [PubMed: 12948497]
- 26.
- Forrer P, Stumpp MT, Binz HK. et al. A novel strategy to design binding molecules harnessing the modular nature of repeat proteins. FEBS Lett. 2003;539(1-3):2–6. [PubMed: 12650916]
- 27.
- Kohl A, Binz HK, Forrer P. et al. Designed to be stable: crystal structure of a consensus ankyrin repeat protein. Proc Natl Acad Sci USA. 2003;100(4):1700–5. [PMC free article: PMC149896] [PubMed: 12566564]
- 28.
- Binz HK, Amstutz P, Kohl A. et al. High-affinity binders selected from designed ankyrin repeat protein libraries. Nat Biotechnol. 2004;22(5):575–82. [PubMed: 15097997]
- 29.
- Binz HK, Kohl A, Pluckthun A. et al. Crystal structure of a consensus-designed ankyrin repeat protein: implications for stability. Proteins. 2006;65(2):280–4. [PubMed: 16493627]
- 30.
- Stumpp MT, Forrer P, Binz HK. et al. Designing repeat proteins: modular leucine-rich repeat protein libraries based on the mammalian ribonuclease inhibitor family. J Mol Biol. 2003;332(2):471–87. [PubMed: 12948496]
- 31.
- Mosavi LK, Minor DL Jr, Peng ZY. Consensus-derived structural determinants of the ankyrin repeat motif. Proc Natl Acad Sci USA. 2002;99(25):16029–34. [PMC free article: PMC138559] [PubMed: 12461176]
- 32.
- Zahnd C, Pecorari F, Straumann N. et al. Selection and characterization of Her2 binding-designed ankyrin repeat proteins. J Biol Chem. 2006;281(46):35167–75. [PubMed: 16963452]
- 33.
- Zahnd C, Wyler E, Schwenk JM. et al. A designed ankyrin repeat protein evolved to picomolar affinity to Her2. J Mol Biol. 2007;369(4):1015–28. [PubMed: 17466328]
- 34.
- Main ER, Xiong Y, Cocco MJ. et al. Design of stable alpha-helical arrays from an idealized TPR motif. Structure. 2003;11(5):497–508. [PubMed: 12737816]
- 35.
- Main ER, Stott K, Jackson SE. et al. Local and long-range stability in tandemly arrayed tetratricopeptide repeats. Proc Natl Acad Sci USA. 2005;102(16):5721–6. [PMC free article: PMC556279] [PubMed: 15824314]
- 36.
- Cortajarena AL, Kajander T, Pan W. et al. Protein design to understand peptide ligand recognition by tetratricopeptide repeat proteins. Protein Eng Des Sel. 2004;17(4):399–409. [PubMed: 15166314]
- 37.
- Ferreiro DU, Cervantes CF, Truhlar SM. et al. Stabilizing IkappaBalpha by “consensus” design. J Mol Biol. 2007;365(4):1201–16. [PMC free article: PMC1866275] [PubMed: 17174335]
- 38.
- Tripp KW, Barrick D. Enhancing the stability and folding rate of a repeat protein through the addition of consensus repeats. J Mol Biol. 2007;365:1187–200. [PMC free article: PMC1851695] [PubMed: 17067634]
- 39.
- Lowe AR, Itzhaki LS. Rational redesign of the folding pathway of a modular protein. Proc Natl Acad Sci USA. 2007;104(8):2679–84. [PMC free article: PMC1815241] [PubMed: 17299057]
- 40.
- Plaxco KW, Simons KT, Baker D. Contact order, transition state placement and the refolding rates of single domain proteins. J Mol Biol. 1998;277(4):985–94. [PubMed: 9545386]
- 41.
- Zhang B, Peng Z. A minimum folding unit in the ankyrin repeat protein p16(INK4). J Mol Biol. 2000;299(4):1121–32. [PubMed: 10843863]
- 42.
- Tripp KW, Barrick D. The tolerance of a modular protein to duplication and deletion of internal repeats. J Mol Biol. 2004;344(1):169–78. [PubMed: 15504409]
- 43.
- Main ER, Jackson SE, Regan L. The folding and design of repeat proteins: reaching a consensus. Curr Opin Struct Biol. 2003;13(4):482–9. [PubMed: 12948778]
- 44.
- Mello CC, Barrick D. An experimentally determined protein folding energy landscape. Proc Natl Acad Sci USA. 2004;101(39):14102–7. [PMC free article: PMC521126] [PubMed: 15377792]
- 45.
- Freiberg A, Machner MP, Pfeil W. et al. Folding and stability of the leucine-rich repeat domain of internalin B from Listeria monocytogenes. J Mol Biol. 2004;337(2):453–61. [PubMed: 15003459]
- 46.
- Junker M, Schuster CC, McDonnell AV. et al. Pertactin beta-helix folding mechanism suggests common themes for the secretion and folding of autotransporter proteins. Proc Natl Acad Sci USA. 2006;103(13):4918–23. [PMC free article: PMC1458770] [PubMed: 16549796]
- 47.
- Street TO, Bradley CM, Barrick D. Predicting coupling limits from an experimentally determined energy landscape. Proc Natl Acad Sci USA. 2007;104(12):4907–12. [PMC free article: PMC1829238] [PubMed: 17360387]
- 48.
- Bradley CM, Barrick D. Limits of cooperativity in a structurally modular protein: response of the Notch ankyrin domain to analogous alanine substitutions in each repeat. J Mol Biol. 2002;324(2):373–86. [PubMed: 12441114]
- 49.
- Zimm BH, Bragg JK. Theory of the phase transition between helix and random coil in polypeptide chains. J Chem Phys. 1959;31(2):526–535.
- 50.
- Kajander T, Cortajarena AL, Main ERG. et al. A new folding paradigm for repeat proteins. J Am Chem Soc. 2005;127(29):10188–10190. [PubMed: 16028928]
- 51.
- Low C, Weininger U, Zeeb M. et al. Folding mechanism of an ankyrin repeat protein: scaffold and active site formation of human CDK inhibitor p19(INK4d). J Mol Biol. 2007;373(1):219–31. [PubMed: 17804013]
- 52.
- Ferreiro DU, Cho SS, Komives EA. et al. The energy landscape of modular repeat proteins: Topology determines folding mechanism in the ankyrin family. J Mol Biol. 2005;354(3):679–692. [PubMed: 16257414]
- 53.
- Werbeck ND, Itzhaki LS. Probing a moving target with a plastic unfolding intermediate of an ankyrin-repeat protein. Proc Natl Acad Sci USA. 2007;104(19):7863–8. [PMC free article: PMC1876538] [PubMed: 17483458]
- 54.
- Lowe AR, Itzhaki LS. Biophysical characterisation of the small ankyrin repeat protein myotrophin. J Mol Biol. 2007;365(4):1245–55. [PubMed: 17113103]
- 55.
- Tang KS, Guralnick BJ, Wang WK. et al. Stability and folding of the tumour suppressor protein p16. J Mol Biol. 1999;285(4):1869–1886. [PubMed: 9917418]
- 56.
- Tang KS, Fersht AR, Itzhaki LS. Sequential unfolding of ankyrin repeats in tumor suppressor p16. Structure. 2003;11(1):67–73. [PubMed: 12517341]
- 57.
- Mello CC, Bradley CM, Tripp KW. et al. Experimental characterization of the folding kinetics of the notch ankyrin domain. J Mol Biol. 2005;352(2):266–81. [PubMed: 16095609]
- 58.
- Zeeb M, Rosner H, Zeslawski W. et al. Protein folding and stability of human CDK inhibitor p19(INK4d). J Mol Biol. 2002;315(3):447–457. [PubMed: 11786024]
- 59.
- Devi VS, Binz HK, Stumpp MT. et al. Folding of a designed simple ankyrin repeat protein. Protein Sci. 2004;13(11):2864–70. [PMC free article: PMC2286595] [PubMed: 15498935]
- 60.
- Bradley CM, Barrick D. The notch ankyrin domain folds via a discrete, centralized pathway. Structure. 2006;14(8):1303–12. [PubMed: 16905104]
- 61.
- Cho SS, Levy Y, Wolynes PG. P versus Q: Structural reaction coordinates capture protein folding on smooth landscapes. Proc Natl Acad Sci USA. 2006;103(3):586–591. [PMC free article: PMC1334664] [PubMed: 16407126]
- 62.
- Howard J, Bechstedt S. Hypothesis: A helix of ankyrin repeats of the NOMPIC-TRP ion channel is the gating spring of mechanoreceptors. Curr Biol. 2004;14(6):R224–R226. [PubMed: 15043829]
- 63.
- Gillespie PG, Dumont RA, Kachar B. Have we found the tip link, transduction channel and gating spring of the hair cell? Curr Opin Neurobiol. 2005;15(4):389–396. [PubMed: 16009547]
- 64.
- Sotomayor M, Corey DP, Schulten K. In search of the hair-cell gating spring: Elastic properties of ankyrin and cadherin repeats. Structure. 2005;13(4):669–682. [PubMed: 15837205]
- 65.
- Lee G, Abdi K, Jiang J. et al. Nanospring behaviour of ankyrin repeats. Nature. 2006;440(7081):246–249. [PubMed: 16415852]
- 66.
- Li LW, Wetzel S, Pluckthun A. et al. Stepwise unfolding of ankyrin repeats in a single protein revealed by atomic force microscopy. Biophys J. 2006;90(4):L30–L32. [PMC free article: PMC1367297] [PubMed: 16387766]
- 67.
- Crookes WJ, Ding LL, Huang QL. et al. Reflectins: The unusual proteins of squid reflective tissues. Science. 2004;303(5655):235–238. [PubMed: 14716016]
- 68.
- Kramer RM, Crookes-Goodson WJ, Naik RR. The self-organizing properties of squid reflectin protein. Nat Mater. 2007;6(7):533–8. [PubMed: 17546036]