Concept: Secondary structure
Decoding post-transcriptional regulatory programs in RNA is a critical step towards the larger goal of developing predictive dynamical models of cellular behaviour. Despite recent efforts, the vast landscape of RNA regulatory elements remains largely uncharacterized. A long-standing obstacle is the contribution of local RNA secondary structure to the definition of interaction partners in a variety of regulatory contexts, including, but not limited to, transcript stability, alternative splicing and localization. There are many documented instances where the presence of a structural regulatory element dictates alternative splicing patterns (for example, human cardiac troponin T) or affects other aspects of RNA biology. Thus, a full characterization of post-transcriptional regulatory programs requires capturing information provided by both local secondary structures and the underlying sequence. Here we present a computational framework based on context-free grammars and mutual information that systematically explores the immense space of small structural elements and reveals motifs that are significantly informative of genome-wide measurements of RNA behaviour. By applying this framework to genome-wide human mRNA stability data, we reveal eight highly significant elements with substantial structural information, for the strongest of which we show a major role in global mRNA regulation. Through biochemistry, mass spectrometry and in vivo binding studies, we identified human HNRPA2B1 (heterogeneous nuclear ribonucleoprotein A2/B1, also known as HNRNPA2B1) as the key regulator that binds this element and stabilizes a large number of its target genes. We created a global post-transcriptional regulatory map based on the identity of the discovered linear and structural cis-regulatory elements, their regulatory interactions and their target pathways. This approach could also be used to reveal the structural elements that modulate other aspects of RNA behaviour.
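As a rough illustration of what it means for a motif to be "informative" of a genome-wide measurement, the sketch below computes the mutual information between motif presence and a discretized stability label. It is not the framework described in the abstract: the function name, the binning into "high" and "low", and the toy data are all assumptions made for the example.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Mutual information, in bits, between two equal-length discrete sequences."""
    if len(xs) != len(ys):
        raise ValueError("xs and ys must have the same length")
    n = len(xs)
    if n == 0:
        return 0.0
    joint = Counter(zip(xs, ys))  # counts of each (x, y) combination
    px = Counter(xs)              # marginal counts of x
    py = Counter(ys)              # marginal counts of y
    mi = 0.0
    for (x, y), c in joint.items():
        # p(x, y) * log2( p(x, y) / (p(x) * p(y)) ), written with raw counts
        mi += (c / n) * math.log2(c * n / (px[x] * py[y]))
    return mi


if __name__ == "__main__":
    # Invented toy data: 1 = transcript contains the (hypothetical) motif.
    motif_present = [1, 1, 1, 0, 0, 0, 1, 0]
    stability_bin = ["high", "high", "high", "low", "low", "low", "high", "low"]
    print(mutual_information(motif_present, stability_bin))
```

In this toy case motif presence determines the stability bin exactly, so the score reaches its maximum for a balanced binary label; a motif that is independent of the measurement scores zero.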
Designing RNAs that form specific secondary structures is enabling better understanding and control of living systems through RNA-guided silencing, genome editing and protein organization. Little is known, however, about which RNA secondary structures might be tractable for downstream sequence design, increasing the time and expense of design efforts due to inefficient secondary structure choices. Here, we present insights into specific structural features that increase the difficulty of finding sequences that fold into a target RNA secondary structure, summarizing the design efforts of tens of thousands of human participants and three automated algorithms (RNAInverse, INFO-RNA and RNA-SSD) in the Eterna massive open laboratory. Subsequent tests through three independent RNA design algorithms (NUPACK, DSS-Opt and MODENA) confirmed the hypothesized importance of several features in determining design difficulty, including sequence length, mean stem length, symmetry and specific difficult-to-design motifs such as zigzags. Based on these results, we have compiled an Eterna100 benchmark of 100 secondary structure design challenges that span a large range in design difficulty to help test future efforts. Our in silico results suggest new routes for improving computational RNA design methods and for extending these insights to assess “designability” of single RNA structures, as well as of switches for in vitro and in vivo applications.
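Two of the features named above, sequence length and mean stem length, can be read directly from a target structure. The sketch below assumes the structure is given in dot-bracket notation without pseudoknots, and it defines a stem as a maximal run of directly stacked base pairs; both the function names and that definition are assumptions for illustration, not necessarily those used in the study.

```python
def base_pairs(structure):
    """Return sorted (i, j) index pairs for a dot-bracket string."""
    stack, pairs = [], []
    for i, ch in enumerate(structure):
        if ch == "(":
            stack.append(i)
        elif ch == ")":
            if not stack:
                raise ValueError("unbalanced ')' at position %d" % i)
            pairs.append((stack.pop(), i))
        elif ch != ".":
            raise ValueError("unexpected character %r" % ch)
    if stack:
        raise ValueError("unbalanced '(' at position %d" % stack[-1])
    return sorted(pairs)


def stem_lengths(structure):
    """Lengths of maximal runs of stacked pairs (i, j), (i+1, j-1), ..."""
    pairs = set(base_pairs(structure))
    lengths = []
    for i, j in sorted(pairs):
        if (i - 1, j + 1) in pairs:
            continue  # not the outermost pair of its stem
        length = 1
        while (i + length, j - length) in pairs:
            length += 1
        lengths.append(length)
    return lengths


def mean_stem_length(structure):
    """Mean stem length, or 0.0 for a structure with no base pairs."""
    lengths = stem_lengths(structure)
    return sum(lengths) / len(lengths) if lengths else 0.0


if __name__ == "__main__":
    target = "((..((...))..))"
    print(len(target), stem_lengths(target), mean_stem_length(target))
```

Structures made of many short stems give a low mean under this definition, which is the kind of summary statistic that can be compared against design difficulty.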
- Proceedings of the National Academy of Sciences of the United States of America
- Published almost 4 years ago
Self-assembling RNA molecules present compelling substrates for the rational interrogation and control of living systems. However, imperfect in silico models, even at the secondary structure level, hinder the design of new RNAs that function properly when synthesized. Here, we present a unique and potentially general approach to such empirical problems: the Massive Open Laboratory. The EteRNA project connects 37,000 enthusiasts to RNA design puzzles through an online interface. Uniquely, EteRNA participants not only manipulate simulated molecules but also control a remote experimental pipeline for high-throughput RNA synthesis and structure mapping. We show herein that the EteRNA community leveraged dozens of cycles of continuous wet laboratory feedback to learn strategies for solving in vitro RNA design problems on which automated methods fail. The top strategies, including several previously unrecognized negative design rules, were distilled by machine learning into an algorithm, EteRNABot. Over a rigorous 1-year testing phase, both the EteRNA community and EteRNABot significantly outperformed prior algorithms in a dozen RNA secondary structure design tests, including the creation of dendrimer-like structures and scaffolds for small molecule sensors. These results show that an online community can carry out large-scale experiments, hypothesis generation, and algorithm design to create practical advances in empirical science.
The cephalochordate Amphioxus naturally co-expresses fluorescent proteins (FPs) of different brightness, which offers a rare opportunity to identify the molecular features of FPs that are associated with higher or lower fluorescence intensity. Here, we describe the spectral and structural characteristics of a green FP (bfloGFPa1) with perfect (100%) quantum efficiency, which yields unprecedentedly high brightness, and compare them to those of the co-expressed bfloGFPc1, which is extremely dim because of its low (0.1%) quantum efficiency. This direct comparison of structure-function relationships indicated that in the bright bfloGFPa1, a tyrosine (Tyr159) promotes a ring flip of a tryptophan (Trp157) that in turn allows a cis-trans transformation of a proline (Pro55). Consequently, the FP chromophore is pushed up, which comes with a slight tilt and increased stability. FPs are continuously engineered for improved biochemical and/or photonic properties, and this study provides new insight into the challenge of establishing a clear mechanistic link between the chromophore's structural environment and brightness level.
Unlike random heteropolymers, natural proteins fold into unique ordered structures. Understanding how these are encoded in amino-acid sequences is complicated by energetically unfavourable non-ideal features (for example, kinked α-helices, bulged β-strands, strained loops and buried polar groups) that arise in proteins from evolutionary selection for biological function or from neutral drift. Here we describe an approach to designing ideal protein structures stabilized by completely consistent local and non-local interactions. The approach is based on a set of rules relating secondary structure patterns to protein tertiary motifs, which make possible the design of funnel-shaped protein folding energy landscapes leading into the target folded state. Guided by these rules, we designed sequences predicted to fold into ideal protein structures consisting of α-helices, β-strands and minimal loops. Designs for five different topologies were found to be monomeric and very stable and to adopt structures in solution nearly identical to the computational models. These results illuminate how the folding funnels of natural proteins arise and provide the foundation for engineering a new generation of functional proteins free from natural evolution.
The response of living systems to nanoparticles is thought to depend on the protein corona, which forms shortly after exposure to physiological fluids and which is linked to a wide array of pathophysiologies. A mechanistic understanding of the dynamic interaction between proteins and nanoparticles and thus the biological fate of nanoparticles and associated proteins is, however, often missing mainly due to the inadequacies in current ensemble experimental approaches. Through the application of a variety of single molecule and single particle spectroscopic techniques in combination with ensemble level characterization tools, we have been able to identify different interaction pathways between gold nanorods and bovine serum albumin depending on the protein concentration. Overall, we found that local changes in protein concentration influence everything from cancer cell uptake to nanoparticle stability and even protein secondary structure. We envision that our findings and methods will lead to strategies to control the associated pathophysiology of nanoparticle exposure in vivo.
TIM15/Zim17 in yeast and its mammalian ortholog Hep are Zn(2+) finger (Cys4) proteins that assist mtHsp70 in protein import into the mitochondrial matrix.
Attenuated total reflectance Fourier transform infrared spectroscopy (ATR-FTIR) was used to study the conformation of aggregated proteins both in vivo and in vitro. Several different protein aggregates including amyloid fibrils from several peptides and polypeptides, inclusion bodies, folding aggregates, soluble oligomers, and protein extracts from stressed cells were examined in this study. All protein aggregates demonstrate a characteristic new β-structure with lower frequency band positions. All protein aggregates acquire this new β-band following the aggregation process involving inter-molecular interactions. The β-sheets, in some proteins, arise from regions of polypeptide that are helical or non-beta in the native conformation. For a given protein, all types of aggregates (e.g. inclusion bodies, folding aggregates, thermal aggregates) showed similar spectra, indicating they arose from a common partially-folded species. All the aggregates have some native-like secondary structure as well as non-periodic structure, along with the specific new β-structure. The new β-band could most likely be attributed to stronger hydrogen bonds in the intermolecular β-sheet structure present in protein aggregates.
The α subunit of β-conglycinin is a major allergen in soybean. The objective of this study was to predict and identify the linear immunoglobulin (Ig)E epitopes of the soybean α subunit of β-conglycinin. Three immunoinformatics tools were used to predict the potential epitopes, which were then confirmed by dot-blot inhibition using sera from soybean-allergic subjects. As a result, 15 peptides were predicted and assembled by solid-phase synthesis. Eleven epitopes were identified by the dot-blot inhibition test. Moreover, peptide 3 had IgE binding capability with all sera (5/5) tested, while peptides 1, 4, 6, 8 and 12 could bind to 4/5 of the sera samples. Secondary structure prediction and a circular dichroism test both indicated that the structure of peptide 3 was a random coil.
Dot plots were originally introduced in bioinformatics as dot-containing images used to compare biological sequences and identify regions of close similarity between them. In addition to similarity, dot plots were extended to possibly represent interactions between building blocks of biological sequences, where the dots can vary in size or color according to desired features. In this survey, we first review their use in representing an RNA secondary structure, which has mostly been applied for displaying the output secondary structures as a result of running RNA folding prediction algorithms. Such a result may often contain suboptimal solutions in addition to the optimal one, which can be easily incorporated in the dot plot. We then proceed from their passive use of providing RNA secondary structure snapshots to their active use of illustrating RNA secondary structure manipulations in beneficial ways. While comparison between RNA secondary structures can mostly be done efficiently using a string representation, there are notable advantages in using dot plots for analyzing the suboptimal solutions that convey important information about the structure of the RNA molecule. In addition, structure-based alignment of dot plots has been advanced considerably and the filtering of dot plots that considers chemical and enzymatic data from structure determination experiments has been suggested. We discuss these procedures and how they can be enhanced in the future by using an image representation to analyze RNA secondary structures and examine their manipulations. WIREs RNA 2013. doi: 10.1002/wrna.1154
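To make the link between the string representation and the dot plot concrete, the sketch below converts one dot-bracket string into a symmetric matrix with a dot wherever two positions are paired. It is a simplified, binary version for a single pseudoknot-free structure; dot plots produced by folding programs typically scale each dot by pairing probability, which is not modelled here. The function names are assumptions made for the example.

```python
def dot_plot_matrix(structure):
    """Symmetric boolean matrix: cell [i][j] is True if positions i and j pair."""
    n = len(structure)
    matrix = [[False] * n for _ in range(n)]
    stack = []
    for j, ch in enumerate(structure):
        if ch == "(":
            stack.append(j)
        elif ch == ")":
            if not stack:
                raise ValueError("unbalanced ')' at position %d" % j)
            i = stack.pop()
            matrix[i][j] = matrix[j][i] = True
        elif ch != ".":
            raise ValueError("unexpected character %r" % ch)
    if stack:
        raise ValueError("unbalanced '(' at position %d" % stack[-1])
    return matrix


def render(matrix):
    """Text rendering: '*' marks a base pair, '.' marks an empty cell."""
    return "\n".join("".join("*" if cell else "." for cell in row) for row in matrix)


if __name__ == "__main__":
    print(render(dot_plot_matrix("((..))")))
```

A stem appears as a run of dots along an anti-diagonal, which is why helices are easy to spot by eye in such a plot. Overlaying the matrices of several suboptimal structures, for example by counting how often each cell is set, would be one simple way to view alternatives together.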