Distinct contribution of electrostatics, initial conformational ensemble, and macromolecular stability in RNA folding
Edited by Donald M. Crothers, Yale University, New Haven, CT, and approved March 5, 2007 (received for review October 3, 2006)
Abstract
We distinguish the contribution of the electrostatic environment, initial conformational ensemble, and macromolecular stability on the folding mechanism of a large RNA using a combination of time-resolved “Fast Fenton” hydroxyl radical footprinting and exhaustive kinetic modeling. This integrated approach allows us to define the folding landscape of the L-21 Tetrahymena thermophila group I intron structurally and kinetically from its earliest steps with unprecedented accuracy. Distinct parallel pathways leading the RNA to its native form upon its Mg2+-induced folding are observed. The structures of the intermediates populating the pathways are not affected by variation of the concentration and type of background monovalent ions (electrostatic environment) but are altered by a mutation that destabilizes one domain of the ribozyme. Experiments starting from different conformational ensembles but folding under identical conditions show that whereas the electrostatic environment modulates molecular flux through different pathways, the initial conformational ensemble determines the partitioning of the flux. This study showcases a robust approach for the development of kinetic models from collections of local structural probes.
Most of the studied RNA folding pathways are populated with multiple kinetic intermediates (1–9) whose prevalence suggests their important role in the regulation of RNA function (10–14). The structural properties of folding intermediates and the factors that influence their population and lifetime are incompletely understood. Among the known factors are intermediate stability, electrostatic environment, native-state topology, and the initial conformational ensemble (15). RNA's negative charge results in counterion concentration strongly influencing the folding landscape and the relative stability of kinetic intermediates (16–20). The native-state topology of structured RNA consisting of Watson–Crick duplexes connected by semiflexible junctions and specific long-range tertiary contacts constrains the biopolymer's conformational degrees of freedom (21–23). The ensemble of macromolecular conformations present at the initial state directs the RNA commitment to a particular folding pathway (16): a compact ensemble with significant tertiary structure may be more favorable to conformational search but can also yield barriers that impede resolution of misfolded molecules. The distinct roles of each of these influences in the determination of the structure, population, and lifetime of RNA folding intermediates have yet to be established.
In this study we investigate the interplay among electrostatics, initial conformation, and macromolecular stability in RNA folding. We analyze the solution condition dependence of the intermediate structures and molecular flux through them on the Mg2+-mediated folding of the L-21 Tetrahymena thermophila group I intron and a mutant of reduced stability by varying the background monovalent cation type and concentration. Although changes in folding environment and severely destabilizing mutations are not likely to occur in vivo, their use as a tool to perturb the RNA folding landscape in vitro yields significant insight into the folding mechanism.
We have combined recent technological advances in time-resolved hydroxyl radical (·OH) footprinting (24) and analysis (25) with automated kinetic and structural modeling (26) to construct folding landscapes for the T. thermophila ribozyme under a variety of different conditions. Quantitative comparisons of these landscapes show that macromolecular stability, order of tertiary contact formation, and the RNA native structure affect the structures of RNA folding intermediates, whereas the electrostatic environment governs the relative flux through the different pathways. The initial partitioning of the reaction flux depends on the initial conformational ensemble, whereas the reaction rates depend on the final solution conditions.
Results
Fig. 1 summarizes the Mg2+-mediated folding reactions that were analyzed in this study. Alteration of the concentration and type of the monovalent cations affects both the electrostatic environment during folding and the initial conformational ensemble as evidenced by changes in compaction and extent of tertiary structure formation of the RNA; the final structure formed by the addition of Mg2+ is independent of the analyzed reaction conditions [supporting information (SI) Figs. 5 and 6]. To separate initial conformational ensemble contribution from that of the electrostatic environment during folding we compare the results of folding experiments in which only the Mg2+ is jumped to a final concentration of 10 mM and where the Na+ or K+ concentration is also jumped in concert with the Mg2+ (Fig. 1; compare reaction B with reaction C). Although folding occurs under identical conditions, it is initiated from conformational ensembles whose global compaction and tertiary contacts are different. We analyzed the role of the RNA stability by folding a ribozyme bearing the UUCG mutation of the L5b tetraloop, a molecule with native-state topology comparable if not identical to the wild type but with diminished stability of the P4–P6 domain and the periphery (SI Figs. 5–7) (27, 28).
Time–Progress Clusters.
Nucleic acid ·OH footprinting separately reports on the solvent accessibility of the phosphodiester backbone of each nucleotide. Most reactivity changes are protections, in which formation of tertiary structure buries backbone in the interior of the molecule during folding. Sometimes the reverse occurs, in which a region is more accessible in the final compared with the initial state (“hypersensitivity”) (SI Fig. 8, nucleotide 122). Thus, each folding experiment yields a collection of local measures of solvent-accessibility change that are distributed throughout the ribozymes (Fig. 2 A Insets and C Insets). Time–progress curves are obtained in the quench flow mixer for each local site by sampling the mixtures of unfolded and folded molecules with a short pulse of ·OH at each time point along a transition (24). Each curve reflects the change in solvent accessibility induced by addition of Mg2+. As noted above, the final states of all of the reactions are the same. However, the initial state depends on the monovalent cation concentration that is present (29). The initial ensemble present in 600 mM M+ has a greater probability of possessing tertiary contacts compared with 200 mM M+ (Fig. 1; compare the cartoons). Thus, high salt curves span a smaller range of ·OH reactivity change than if the initial reference state were fully unstructured RNA.
For the reactions shown in Fig. 1, time–progress curves were acquired for 25 discrete sites commencing 4 msec after the initiation of folding (SI Fig. 8) (24, 30). The k-means clustering implemented in KinFold (26) yields a model-free assessment of the similarity and differences among the time–progress curves (26). This grouping identifies regions of the molecule with comparable time-dependent behavior. Based on a Gap Statistic analysis (31), three clusters characterized by fast, medium, and slow initial rates are necessary and sufficient to describe the collections of local measures obtained for the wild-type and mutated ribozyme folding reactions analyzed (Figs. 1 and 2 A and C). In cases where progress curves exhibit multiple transitions, the fast, medium, and slow transitions are assigned based on the first transition.
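The grouping of time–progress curves described above can be illustrated with a small sketch. This is not the KinFold implementation: it is a minimal k-means-style loop that uses a Manhattan distance, with a coordinate-wise median as the centroid update (an assumption made here because the median is the natural center for that metric). The time points, rates, and single-exponential curves are hypothetical stand-ins for footprinting data, and the number of clusters is fixed at three rather than estimated with the Gap Statistic.

```python
import math
from statistics import median

def manhattan(a, b):
    """Sum of absolute differences between two equally sampled curves."""
    return sum(abs(x - y) for x, y in zip(a, b))

def cluster_curves(curves, centers, n_iter=20):
    """Assign each curve to its nearest center, then move each center to the
    coordinate-wise median of its members; repeat until centers stop moving."""
    centers = [list(c) for c in centers]
    labels = [0] * len(curves)
    for _ in range(n_iter):
        labels = [
            min(range(len(centers)), key=lambda k: manhattan(curve, centers[k]))
            for curve in curves
        ]
        new_centers = []
        for k in range(len(centers)):
            members = [c for c, lab in zip(curves, labels) if lab == k]
            if members:
                new_centers.append([median(col) for col in zip(*members)])
            else:
                new_centers.append(centers[k])  # keep an empty cluster's center
        if new_centers == centers:
            break
        centers = new_centers
    return labels, centers

# Hypothetical sampling times (s) and apparent rates (1/s); not measured values.
times = [0.004, 0.01, 0.03, 0.1, 0.3, 1.0, 3.0, 10.0]
rates = [80.0, 60.0, 5.0, 4.0, 0.3, 0.2]
curves = [[1.0 - math.exp(-r * t) for t in times] for r in rates]

# Seed one center per expected class (fast, medium, slow).
labels, centers = cluster_curves(curves, centers=[curves[0], curves[2], curves[4]])
print(labels)
```

With these synthetic inputs the fast, medium, and slow curves separate into three groups; real data would also require a principled choice of k and of the initial centers.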
The clustering of the progress curves reveals a striking consistency among the different folding reactions (illustrated as red, green, or blue boxes in Fig. 2 B). In general, the assignment of each site of ·OH protection to a cluster is independent of the folding conditions: the color of the columns is uniform for most sites of the wild-type and mutant ribozymes. Only nucleotides 109–112 and 224–225 demonstrate monovalent cation type specificity in their cluster assignment for the wild type of the ribozyme. Although not a prerequisite, the clusters mainly correspond to defined structural domains of the ribozyme, the P4–P6 domain, the periphery, and the catalytic core (Fig. 2 A Insets). The structural characteristics of the clusters are in turn used to define the intermediate species of the folding reaction when the data are modeled kinetically. The assignment of sites to particular clusters is affected by destabilization of the P4–P6 domain in the L5b mutant reflecting an altered folding hierarchy (Fig. 2 A Insets and C Insets). The different cluster assignments in turn require alternative folding intermediate structures as presented below.
Kinetic Modeling Identifies Common Folding Configurations.
KinFold was independently applied to each data set to exhaustively search kinetic model configuration space and determine the optimum mapping of intermediates to the time–progress curve clusters (26). [The term “kinetic model configuration” is used in this article rather than “kinetic model topology” (29) to avoid confusion with “native-state topology”.] The algorithm identified the best fitting model without the imposition of constraints. Three intermediates were required to satisfy the required root mean square error criteria (26) for both the wild-type and mutant ribozymes (SI Tables 1 and 2). (A statistical analysis of the root mean square error is presented in SI Text supporting the three intermediate models for all the folding reactions.) One kinetic model configuration best describes folding of the wild-type ribozyme at all six experimental conditions (Fig. 3 A). For this model, intermediate I1 has the P4–P6 domain structured, I2 has the peripheral contacts formed, and I3 has both P4–P6 and periphery native tertiary organization. The I1 and I3 intermediates are structurally identical to those observed previously when the ribozyme was Mg2+-folded from very low salt concentrations (26). Slight monovalent cation-type dependency of the intermediate structures is evidenced by the different affiliation of the sites that cluster differently to either the I1 or I2 intermediate. The resolved rates for wild-type ribozyme folding in Na+ containing solutions are schematically summarized by arrow type and thickness in Fig. 3 A. The complete set of resolved rates is presented in SI Table 3 and SI Fig. 15. The monovalent cation type dependency of the rate constants is greatest for the U→I3 transition but is also apparent for I2→I3 and I2→F.
One kinetic model also describes the folding of the mutant ribozyme at the four experimental conditions studied. Although this model also contains three intermediates, their structures are different from those of the wild-type model: I1 has a formed catalytic core, I2 has P5abc and the peripheral contact P13 formed, and I3 has P5abc, P13, and a second peripheral contact, P14, formed in the mutant ribozyme folding model (Fig. 3 B). Whereas most of the rate constants for the mutant ribozyme model depend on the monovalent ion concentration, only the rate constants describing intermediate interconversion demonstrate clear monovalent cation-type dependence (SI Table 3 and SI Fig. 16).
The structural models shown in Fig. 3 represent one RNA conformation that satisfies the observed ·OH protection profile. Although multiple conformations satisfy the experimental constraints, the single structure shown embraces the tertiary contacts that have a high probability of being present in that reaction intermediate.
Time Evolution of the Reaction Intermediates and Folded RNA.
The 4-msec time resolution of our progress curves revealed transitions that were previously hidden (32), allowing robust determination of the fastest rate constants (SI Fig. 8). For example, the bootstrapping error estimates for the U→Ii and U→F transitions are ≈15% (SI Table 3). Fig. 4 plots the time evolution of the intermediates and final state predicted by the best-fit rate constants of the model. The widths of the curves reflect the propagation of errors in the individual rate constants by bootstrap analysis. The more rapid accumulation of I1 and I2 at 200 mM Na+ in reaction condition A (Fig. 1) reflects the correspondingly 2-fold greater value of the rate constants measured for the U→I1 and U→I2 transitions. The time-evolution of I3 is distinctly different from I1 and I2. Only a small fraction of the RNA molecules pass through it late in the reaction, most slowly for reaction A.
For the wild-type ribozyme, the intermediate interconversion rates depend on the electrostatic environment. No appreciable interconversion is observed when the ribozyme folds in the presence of 200 mM K+, revealing one of the major differences between folding in the presence of either K+ or Na+. In contrast, interconversion is comparable to Ii→F rates when the ribozyme folds at 600 mM of monovalent cations (SI Table 3 and SI Fig. 15). Very low to no intermediate interconversion is observed for the mutant ribozyme at all of the reaction conditions studied (SI Table 3 and SI Fig. 16).
The time evolution of the folded ribozymes (F) is a complex process with multiple kinetic phases that reflect the differences in the conversion rates and relative stability of the intermediate species. The lifetime of all three intermediates is shorter when folding occurs at 600 compared with 200 mM salt (Fig. 4, compare the cyan and magenta curves), resulting in faster accumulation of folded molecules at the higher monovalent concentration. Formation of F continues unimpeded in reaction C (Fig. 4 Lower Right, black curve) because of the high flux from I1 and I2 to F and the short lifetimes of the intermediates. These characteristics result in the fastest completion of the folding for reaction C. A rationale for these results is that a higher fraction of molecules present in the compact initial ensemble at 600 mM salt fold directly to F in contrast to the dominance of the indirect pathways (U→I1 and U→I2) when folding initiates from a lower monovalent salt concentration. These general trends are independent of monovalent cation type (see SI Fig. 17).
The Initial Conformational Ensemble Modulates Pathway Partitioning.
The availability of precise data at short folding times allows us to meaningfully estimate the partitioning of the reaction flux during the earliest stages of the folding reaction from the ratio of the initial rate constants. The relative initial reaction flux of the wild-type ribozyme through the four initial pathways (U→Ii and U→F, respectively) for the reactions carried out in Na+-containing solutions are summarized in the histograms in Fig. 4 Insets. A majority of the RNA molecules commence folding by traversing either the U→I1 or U→I2 pathways under all of the solution conditions. Only a small fraction of the molecules traverse U→I3. The U→I1 and U→I2 initial fluxes are the same within error for reactions A and C but lower for B. Therefore, the initial conditions, and not the folding conditions, dictate the initial partitioning of molecules into the folding pathways. Conversely, the rate of appearance of I1 and I2 as well as their lifetimes depend on the folding, not the initial, condition. The initial flux analysis of the K+ experiments reveals similar behavior, although the differences in flux are smaller than the corresponding values observed in the presence of Na+ (compare SI Fig. 17 with Fig. 4 Insets).
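The estimate of initial flux from the ratio of the initial rate constants can be sketched as follows. The sketch assumes parallel first-order steps out of U, so that the fraction of molecules entering each pathway is that pathway's rate constant divided by the sum of all rate constants leaving U. The function name and the numerical rate constants are hypothetical and are not taken from SI Table 3.

```python
def initial_flux_fractions(rate_constants):
    """Fraction of molecules entering each pathway out of U, assuming
    parallel first-order steps: fraction_i = k_i / sum(k)."""
    total = sum(rate_constants.values())
    if total <= 0:
        raise ValueError("at least one rate constant must be positive")
    return {path: k / total for path, k in rate_constants.items()}

# Hypothetical initial rate constants (1/s), for illustration only.
example_rates = {"U->I1": 40.0, "U->I2": 30.0, "U->I3": 5.0, "U->F": 25.0}
fractions = initial_flux_fractions(example_rates)

for path, fraction in fractions.items():
    print(f"{path}: {fraction:.2f}")
```

Because the fractions depend only on ratios, a uniform scaling of all rate constants out of U changes how fast the intermediates appear without changing how the flux is partitioned.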
The initial conformational ensemble significantly influences the initial flux to F as well (Fig. 4 Lower Right Inset); a remarkably high percentage (≈36%) of the RNA folds through the direct pathway (U→F) at 600 mM Na+ (reaction B). This behavior agrees with the higher compaction and more native-like protections of the initial ensemble at 600 mM Na+. When folding is initiated from and carried out in 200 mM salt, a smaller fraction of the molecules (19%) folds directly to F. The fraction of fast folding is even smaller (10%) when folding is initiated from 200 mM salt but carried out in 600 mM salt (reaction C).
Discussion
The ability of RNA to fold into unique three-dimensional structures is central to its biological function (6, 14, 17). The few RNA folding mechanisms that have been studied are characterized by an abundance of long-lived intermediate species, kinetic traps, and parallel pathways (3, 5, 6, 9, 33, 34). RNA folding landscapes have been referred to as “rugged” (35), although exceptions have been observed (7, 36). The results of our study are consistent with competition between the energetically defined bias toward the folded state and trapping because of ruggedness in the folding landscape, as proposed for protein folding (37–39) and suggested to also apply to RNA folding (15). Our study also demonstrates the presence and importance of discrete pathways through which the molecular flux moves.
Our global analysis of the collection of discrete measures of local solvent-accessibility changes provided by time-resolved ·OH footprinting additionally provides the structural nature of the reaction intermediates. The result is a unique “structural–kinetic” description of the RNA folding landscape from its earliest discernible steps. The identification of the intermediate structures and the flux partitioning among them allow the relative contributions of the electrostatic environment and initial conformational ensemble on the reaction to be quantitatively distinguished. Although this study focuses on a single RNA, confirmation of the general nature of our conclusions will come through the application of the methods showcased here to other RNAs.
Our exploration of the contributions of the electrostatic environment and initial conformation on the folding landscape of the Tetrahymena ribozyme was stimulated by reports of the importance of native-state topology (21) and initial conditions (16) to RNA folding. For both the wild-type and mutant ribozymes, model-independent clustering assigns the individual tertiary contacts to one of three groups during folding under all of the analyzed solution conditions (Fig. 2). These commonalities suggest that the structural nature of the intermediates is primarily dependent on the native-state topology and macromolecular stability and is minimally dependent on the folding conditions. This conclusion is supported by the consistency of the kinetic models resolved for the wild-type and mutant ribozymes for all of the solution conditions analyzed (Fig. 3). More extensive studies of reaction space and tertiary contact mutations that do and do not alter the native-state topology will test the generality of this hypothesis.
The kinetic models also show that the electrostatic environment that is set by the solution conditions modulates the flux through the alternative pathways of the folding landscape. Both clustering and kinetic modeling demonstrate that the intermediate structures are independent of the solution conditions. Their relative abundance and interconversion depend on the electrostatic environment set by the folding conditions. The experiments in which monovalent and divalent cations are concurrently adjusted (Fig. 1, reaction C) demonstrate that the initial partitioning of the reaction flux depends on the ensemble of RNA conformations present at the initial condition whereas the subsequent steps are defined by the folding conditions (16). Our separation of the roles of electrostatics and initial conformational ensemble in RNA folding does not imply that they are completely independent. The initial conformational ensemble is defined by the electrostatic environment, but each factor distinctly influences RNA folding.
Although the topology of the folding landscape is different for each experimental condition, the presence and number of intermediates are constant features. The three pathways are populated with the I1, I2, and I3 intermediates, whereas some molecules traverse directly to F. The degree to which these pathways are channels, i.e., where the intermediate interconversion rates are significantly less than the Ii→F rates, is highly dependent on the folding conditions. The isolation of the intermediates from each other is uniformly higher for the less stable mutant ribozyme.
The observation that early partitioning of the reaction flux depends on the prefolding conditions suggests the presence of a structurally heterogeneous mix of “unfolded” RNA molecules (40). This hypothesis is supported by single-molecule measurements (16) suggesting that the interconversion rates among these populations are much slower than the initial commitment rates to the folding pathways (50–100 s−1) (Fig. 4). Higher monovalent ion concentrations increase the relative abundance of molecules that are native-like (toward the right of the funnel), thus increasing the observed flux through the direct folding channel. Because an ensemble method such as ·OH footprinting cannot directly observe the structurally heterogeneous mix of RNA molecules in the prefolding conditions, this hypothesis is being tested by incorporating single-molecule measurements into our novel footprinting/computation/perturbation approach to understanding folding.
Our analysis of the classic model system of RNA folding, the Tetrahymena ribozyme, establishes a general approach for quantitatively defining the folding landscape of large RNA molecules and the structural nature of the populated intermediate species. Fast Fenton footprinting (24) allows rapid collection of the structural–kinetic data required to illuminate the intermediate structures. The computational tools SAFA for radiogram analysis (25) and KinFold for clustering and kinetic modeling (26) allow the footprinting data to be readily and robustly distilled into structural–kinetic models such as those shown in Fig. 3. The computational tools can also incorporate alternative views of reactions, such as global conformation by time-resolved small angle x-ray scattering (23, 41) and single-molecule methods, into a common kinetic framework. We are only beginning to exploit the high-throughput aspect of this approach, which will allow rigorous characterization of the energetics of the individual steps of complex folding reactions. Structural–kinetic–energetic models provide unique insight into the physical forces that drive RNA folding and the assembly of RNA complexes. This integrated approach is generally applicable to questions of protein, DNA and RNA folding and binding, assembly, and enzymological reactions involving these macromolecules.
Materials and Methods
Time-Resolved Footprinting.
The L-21 ribozyme from T. thermophila was prepared by T7 transcription (29) and was either 5′- or 3′-labeled (42–44) with 32P. The RNA was gel-purified, precipitated, and resuspended in CE buffer (10 mM Na+ or K+ cacodylate, pH 7.3/0.1 mM EDTA).
Before initiating an experiment, 32P-labeled RNA was heated in solution containing the indicated concentration of monovalent salt at 95°C for 3 min, cooled slowly, and incubated at 42°C. Folding at 42°C was initiated in the rapid mixer by mixing 15–20 μl of sample with an equal volume of CE buffer, monovalent salt, and 20 mM MgCl2. After aging, the samples were either exposed to the x-ray beam (44, 45) or mixed with Fe(II)-EDTA (24) solution for several milliseconds, expelled from the mixer, gel-separated, and imaged (44, 45). The ·OH reactivity of the nucleotides comprising tertiary contacts that changed during folding was quantitated by either “block” (46) or SAFA individual peak fitting analysis (25). The data for each protection were individually scaled to fractional saturation (44).
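Scaling a protection to fractional saturation can be sketched as a two-point normalization. The exact procedure is given in ref. 44 and is not reproduced here; the sketch below assumes the common convention in which the initial-state reactivity maps to 0 and the final-state reactivity maps to 1. The function name and the example values are hypothetical.

```python
def to_fractional_saturation(values, initial, final):
    """Rescale band intensities so that the initial-state value maps to 0 and
    the final-state value maps to 1 (assumed two-point normalization)."""
    span = final - initial
    if span == 0:
        raise ValueError("initial and final reference values must differ")
    return [(v - initial) / span for v in values]

# Hypothetical relative reactivities for one protected site during folding:
# reactivity falls from 1.0 (unfolded) to 0.5 (folded).
raw = [1.0, 0.75, 0.5]
scaled = to_fractional_saturation(raw, initial=1.0, final=0.5)
print(scaled)
```

The same expression handles hypersensitive sites, where reactivity rises during folding, because the sign of the span is carried through the division.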
KinFold Analysis of Time–Progress Curves.
The collection of time–progress curves from each experimental condition was binned and clustered by k-means with a Manhattan distance metric (26). The Gap Statistic was used to estimate the number of clusters, k (31). Exhaustive enumeration of all possible kinetic model configurations was carried out on a distributed computing grid (Stanford BioXCluster). For each experimental condition, the 28 (three clusters, two intermediates) and 84 (three clusters, three intermediates) model configurations were tested. The best fitting models, identified based on an analysis of the RMSE, were subjected to numerical flux analysis simulations (26) simulating 10^4 folding pathways. The initial fluxes were calculated as the ratio of the initial rate constants and validated by comparison to the numerical flux analysis. The resulting sets of ordinary differential equations defining the best fitting kinetic model were integrated by using an explicit Runge-Kutta pair (47) to determine the time evolution of the different species in solution. Coarse-grained structural cartoons were generated by using NAST (Nucleic Acid Simulation Tool, https://simtk.org/home/nast) and rendered by using VMD (48). Molecules were constrained to correspond to the accessible surface area profile that matched the footprinting data.
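The integration of a kinetic model to obtain the time evolution of each species can be sketched as below. Several simplifications are assumed: the analysis described above used an explicit Runge-Kutta pair (an adaptive scheme), whereas this sketch uses the classical fixed-step fourth-order Runge-Kutta method for brevity; the model contains only U→Ii, U→F, and Ii→F steps with no intermediate interconversion; and all rate constants are hypothetical rather than fitted values.

```python
SPECIES = ["U", "I1", "I2", "I3", "F"]

def derivatives(conc, rates):
    """First-order kinetics: each step (src, dst) moves k * [src] per unit time."""
    d = {s: 0.0 for s in SPECIES}
    for (src, dst), k in rates.items():
        flow = k * conc[src]
        d[src] -= flow
        d[dst] += flow
    return d

def rk4_step(conc, rates, dt):
    """One classical fourth-order Runge-Kutta step."""
    def shifted(slope, h):
        return {s: conc[s] + h * slope[s] for s in SPECIES}
    k1 = derivatives(conc, rates)
    k2 = derivatives(shifted(k1, dt / 2), rates)
    k3 = derivatives(shifted(k2, dt / 2), rates)
    k4 = derivatives(shifted(k3, dt), rates)
    return {
        s: conc[s] + dt / 6 * (k1[s] + 2 * k2[s] + 2 * k3[s] + k4[s])
        for s in SPECIES
    }

def simulate(rates, t_end, dt):
    """Integrate from a fully unfolded population; return (time, fractions) pairs."""
    conc = {s: 0.0 for s in SPECIES}
    conc["U"] = 1.0
    trajectory = [(0.0, dict(conc))]
    n_steps = int(round(t_end / dt))
    for i in range(1, n_steps + 1):
        conc = rk4_step(conc, rates, dt)
        trajectory.append((i * dt, dict(conc)))
    return trajectory

# Hypothetical rate constants (1/s), chosen only to illustrate the method.
HYPOTHETICAL_RATES = {
    ("U", "I1"): 40.0, ("U", "I2"): 30.0, ("U", "I3"): 5.0, ("U", "F"): 25.0,
    ("I1", "F"): 1.0, ("I2", "F"): 0.5, ("I3", "F"): 0.1,
}

trajectory = simulate(HYPOTHETICAL_RATES, t_end=20.0, dt=0.001)
final_time, final_conc = trajectory[-1]
print(final_time, {s: round(v, 4) for s, v in final_conc.items()})
```

The fixed step is chosen to be small relative to the fastest rate constant; an adaptive method adjusts the step automatically and is preferable when the rates span several orders of magnitude.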
Acknowledgments
We thank Dan Herschlag, Rick Russell, and Vijay Pande for stimulating discussions and critical readings of the manuscript. This work was funded by National Institutes of Health Grants P01-GM66275, U54-GM072970, and P41-EB0001979. A.L. is supported by a Damon Runyon Cancer Research Foundation postdoctoral fellowship.
Footnotes
- §To whom correspondence may be addressed. E-mail: russ.altman{at}stanford.edu or brenowit{at}aecom.yu.edu
- Author contributions: A.L. and I.S. contributed equally to this work; A.L., I.S., R.B.A., and M.B. designed research; A.L., I.S., and M.A.J. performed research; A.L., I.S., and M.A.J. analyzed data; and A.L., I.S., R.B.A., and M.B. wrote the paper.
- The authors declare no conflict of interest.
- This article is a PNAS direct submission.
- This article contains supporting information online at www.pnas.org/cgi/content/full/0608765104/DC1.
- © 2007 by The National Academy of Sciences of the USA
References
- ↵
-
↵
- Su LJ ,
- Waldsich C ,
- Pyle AM
- ↵
- ↵
-
↵
- Onoa B ,
- Dumont S ,
- Liphardt J ,
- Smith SB ,
- Tinoco I, Jr ,
- Bustamante C
- ↵
-
↵
- Fang XW ,
- Thiyagarajan P ,
- Sosnick TR ,
- Pan T
- ↵
-
↵
- Treiber DK ,
- Rook MS ,
- Zarrinkar PP ,
- Williamson JR
- ↵
- ↵
- ↵
-
↵
- Koduvayur SP ,
- Woodson SA
- ↵
- ↵
-
↵
- Russell R ,
- Zhuang X ,
- Babcock HP ,
- Millett IS ,
- Doniach S ,
- Chu S ,
- Herschlag D
- ↵
-
↵
- Rook MS ,
- Treiber DK ,
- Williamson JR
- ↵
-
↵
- Zarrinkar PP ,
- Williamson JR
- ↵
- ↵
-
↵
- Russell R ,
- Millett IS ,
- Tate MW ,
- Kwok LW ,
- Nakatani B ,
- Gruner SM ,
- Mochrie SG ,
- Pande V ,
- Doniach S ,
- Herschlag D ,
- Pollack L
-
↵
- Shcherbakova I ,
- Mitra S ,
- Beer RH ,
- Brenowitz M
-
↵
- Das R ,
- Laederach A ,
- Pearlman SM ,
- Herschlag D ,
- Altman RB
- ↵
- ↵
- ↵
- ↵
- ↵
- ↵
- ↵
- ↵
- ↵
- ↵
- ↵
-
↵
- Camacho CJ ,
- Thirumalai D
-
↵
- Brooks CL, III ,
- Gruebele M ,
- Onuchic JN ,
- Wolynes PG
- ↵
- ↵
- ↵
- ↵
-
↵
- Huang Z ,
- Szostak JW
- ↵
- ↵
- ↵
- ↵
-
↵
- Humphrey W ,
- Dalke A ,
- Schulten K
- ↵