Binding specificity of Cas9-guide RNA complexes to DNA is important for genome-engineering applications; however, how mismatches influence target recognition/rejection kinetics is not well understood. Here we used single-molecule FRET to probe real-time interactions between Cas9-RNA and DNA targets. The bimolecular association rate is only weakly dependent on sequence; however, the dissociation rate greatly increases from <0.006 s(-1) to >2 s(-1) upon introduction of mismatches proximal to protospacer-adjacent motif (PAM), demonstrating that mismatches encountered early during heteroduplex formation induce rapid rejection of off-target DNA. In contrast, PAM-distal mismatches up to 11 base pairs in length, which prevent DNA cleavage, still allow formation of a stable complex (dissociation rate <0.006 s(-1)), suggesting that extremely slow rejection could sequester Cas9-RNA, increasing the Cas9 expression level necessary for genome-editing, thereby aggravating off-target effects. We also observed at least two different bound FRET states that may represent distinct steps in target search and proofreading.
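To put the reported dissociation-rate bounds on a time scale, the sketch below converts a rate constant into a mean bound lifetime and a half-life. It assumes simple first-order (single-exponential) dissociation, which is a simplification given that at least two bound FRET states were observed; the function and constant names are hypothetical and are not taken from the study.

```python
import math

# Bounds quoted in the abstract (units: s^-1).
STABLE_K_OFF = 0.006   # upper bound for matched / PAM-distal-mismatched targets
REJECT_K_OFF = 2.0     # lower bound for PAM-proximal mismatches


def mean_bound_lifetime(k_off: float) -> float:
    """Mean dwell time (s) for first-order dissociation: tau = 1 / k_off."""
    return 1.0 / k_off


def half_life(k_off: float) -> float:
    """Time (s) for half of the complexes to dissociate: ln(2) / k_off."""
    return math.log(2) / k_off


def fraction_still_bound(k_off: float, t: float) -> float:
    """Fraction of complexes remaining bound after t seconds: exp(-k_off * t)."""
    return math.exp(-k_off * t)


if __name__ == "__main__":
    for label, k in (("stable", STABLE_K_OFF), ("rejected", REJECT_K_OFF)):
        print(label, mean_bound_lifetime(k), half_life(k))
```

Under this assumption, a rate below 0.006 s(-1) corresponds to a mean lifetime longer than about 167 s, whereas a rate above 2 s(-1) corresponds to a mean lifetime shorter than 0.5 s.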
The molecular recognition and discrimination of very similar ligand moieties by proteins are important subjects in protein-ligand interaction studies. Specificity in the recognition of molecules is determined by the arrangement of protein and ligand atoms in space. The three pyrimidine bases, viz. cytosine, thymine, and uracil, are structurally similar, but the proteins that bind to them are able to discriminate among them and form specific interactions. Since nonbonded interactions are responsible for molecular recognition processes in biological systems, our work attempts to understand some of the underlying principles of such recognition of pyrimidine molecular structures by proteins. The preferences of the amino acid residues to contact the pyrimidine bases in terms of nonbonded interactions; amino acid residue-ligand atom preferences; main chain and side chain atom contributions of amino acid residues; and solvent-accessible surface area of ligand atoms when forming complexes are analyzed. Our analysis shows that the amino acid residues tyrosine and phenylalanine are highly involved in the pyrimidine interactions. Arginine prefers contacts with the cytosine base. The similarities and differences that exist between the interactions of the amino acid residues with each of the three pyrimidine base atoms in our analysis provide insights that can be exploited in designing specific inhibitors competitive to the ligands.
N6-Methyladenosine (m6A) is the most prevalent internal RNA modification in eukaryotes. ALKBH5 belongs to the AlkB family of dioxygenases and has been shown to specifically demethylate m6A in single-stranded RNA. Here we report crystal structures of ALKBH5 in the presence of either its cofactors or the ALKBH5 inhibitor citrate. Catalytic assays demonstrate that the ALKBH5 catalytic domain can demethylate both ssRNA and ssDNA. We identify the tricarboxylic acid (TCA) cycle intermediate citrate as a modest inhibitor of ALKBH5 (IC50: ~488 μM). The structural analysis reveals that a loop region of ALKBH5 is immobilized by a disulfide bond which apparently excludes the binding of dsDNA to ALKBH5. We identify the m6A binding pocket of ALKBH5 and the key residues involved in m6A recognition using mutagenesis and ITC binding experiments.
N6-methyladenosine (m6A) modification is hypothesized to control processes such as RNA degradation, localization and splicing. However, the molecular mechanisms by which this occurs are unclear. Here we measured structures of an RNA duplex containing m6A in the GGACU consensus, along with an unmodified RNA control, by 2D-NMR. The data show that m6A-U pairing in the double-stranded context is accompanied by the methylamino group rotating from its energetically preferred syn geometry on the Watson-Crick face to the higher-energy anti conformation, positioning the methyl group in the major groove. Thermodynamic measurements of m6A in duplexes reveal that it is destabilizing by 0.5-1.7 kcal/mol. In contrast, we show that m6A in unpaired positions base stacks considerably more strongly than the unmodified base, adding substantial stabilization in single-stranded locations. Transcriptome-wide nuclease mapping of methylated RNA secondary structure from human cells reveals a structural transition at methylated adenosines, with a tendency to single-stranded structure adjacent to the modified base.
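As a rough guide to what a destabilization of 0.5-1.7 kcal/mol means, the sketch below converts a free-energy penalty into a fold change in the duplex equilibrium constant using exp(ΔΔG / RT). The temperature of 37 °C is an assumption (the abstract does not state one), and the function name is hypothetical.

```python
import math

# Gas constant in kcal/(mol*K).
R_KCAL_PER_MOL_K = 1.98720425864083e-3


def fold_change_in_equilibrium(ddg_kcal_per_mol: float, temp_celsius: float = 37.0) -> float:
    """Fold reduction in the duplex equilibrium constant for a destabilizing
    free-energy penalty ddG (kcal/mol): exp(ddG / (R * T)).

    The default temperature of 37 C is an assumption, not a value from the source.
    """
    temp_kelvin = temp_celsius + 273.15
    return math.exp(ddg_kcal_per_mol / (R_KCAL_PER_MOL_K * temp_kelvin))


if __name__ == "__main__":
    for ddg in (0.5, 1.7):
        print(ddg, round(fold_change_in_equilibrium(ddg), 1))
```

At the assumed temperature, the reported range corresponds to roughly a 2-fold to 16-fold reduction in the duplex equilibrium constant.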
Corn, a 28-nucleotide RNA, increases yellow fluorescence of its cognate ligand 3,5-difluoro-4-hydroxybenzylidene-imidazolinone-2-oxime (DFHO) by >400-fold. Corn was selected in vitro to overcome limitations of other fluorogenic RNAs, particularly rapid photobleaching. We now report the Corn-DFHO co-crystal structure, discovering that the functional species is a quasisymmetric homodimer. Unusually, the dimer interface, in which six unpaired adenosines break overall two-fold symmetry, lacks any intermolecular base pairs. The homodimer encapsulates one DFHO at its interprotomer interface, sandwiching it with a G-quadruplex from each protomer. Corn and the green-fluorescent Spinach RNA are structurally unrelated. Their convergent use of G-quadruplexes underscores the usefulness of this motif for RNA-induced small-molecule fluorescence. The asymmetric dimer interface of Corn could provide a basis for the development of mutants that only fluoresce as heterodimers. Such variants would be analogous to Split GFP, and may be useful for analyzing RNA co-expression or association, or for designing self-assembling RNA nanostructures.
We recently developed base editing, the programmable conversion of target C:G base pairs to T:A without inducing double-stranded DNA breaks (DSBs) or requiring homology-directed repair, using engineered fusions of Cas9 variants and cytidine deaminases. Over the past year, the third-generation base editor (BE3) and related technologies have been successfully used by many researchers in a wide range of organisms. The product distribution of base editing (the frequency with which the target C:G is converted to mixtures of undesired by-products, along with the desired T:A product) varies in a target site-dependent manner. We characterize determinants of base editing outcomes in human cells and establish that the formation of undesired products is dependent on uracil N-glycosylase (UNG) and is more likely to occur at target sites containing only a single C within the base editing activity window. We engineered CDA1-BE3 and AID-BE3, which use cytidine deaminase homologs that increase base editing efficiency for some sequences. On the basis of these observations, we engineered fourth-generation base editors (BE4 and SaBE4) that increase the efficiency of C:G to T:A base editing by approximately 50%, while halving the frequency of undesired by-products compared to BE3. Fusing BE3, BE4, SaBE3, or SaBE4 to Gam, a bacteriophage Mu protein that binds DSBs, greatly reduces indel formation during base editing, in most cases to below 1.5%, and further improves product purity. BE4, SaBE4, BE4-Gam, and SaBE4-Gam represent the state of the art in C:G-to-T:A base editing, and we recommend their use in future efforts.
The B-DNA double helix can dynamically accommodate G-C and A-T base pairs in either Watson-Crick or Hoogsteen configurations. Here, we show that G-C(+) (in which + indicates protonation) and A-U Hoogsteen base pairs are strongly disfavored in A-RNA. As a result, N(1)-methyladenosine and N(1)-methylguanosine, which occur in DNA as a form of alkylation damage and in RNA as post-transcriptional modifications, have dramatically different consequences. Whereas they create G-C(+) and A-T Hoogsteen base pairs in duplex DNA, thereby maintaining the structural integrity of the double helix, they block base-pairing and induce local duplex melting in RNA. These observations provide a mechanism for disrupting RNA structure through post-transcriptional modifications. The different propensities to form Hoogsteen base pairs in B-DNA and A-RNA may help cells meet the opposing requirements of maintaining genome stability, on the one hand, and of dynamically modulating the structure of the epitranscriptome, on the other.
N(6)-methyladenine is the most widespread mRNA modification. A subset of human box C/D snoRNA species have target GAC sequences that lead to formation of N(6)-methyladenine at a key trans Hoogsteen-sugar A·G base pair, of which half are methylated in vivo. The GAC target is conserved only in those that are methylated. Methylation prevents binding of the 15.5-kDa protein and the induced folding of the RNA. Thus, the assembly of the box C/D snoRNP could in principle be regulated by RNA methylation at its critical first stage. Crystallography reveals that N(6)-methylation of adenine prevents the formation of trans Hoogsteen-sugar A·G base pairs, explaining why the box C/D RNA cannot adopt its kinked conformation. More generally, our data indicate that sheared A·G base pairs (but not Watson-Crick base pairs) are more susceptible to disruption by N(6)mA methylation and are therefore possible regulatory sites. The human signal recognition particle RNA and many related Alu retrotransposon RNA species are also methylated at N6 of an adenine that forms a sheared base pair with guanine and mediates a key tertiary interaction.
The oligonucleotide d(TX)9, which consists of an octadecamer sequence with alternating non-canonical 7-deazaadenine (X) and canonical thymine (T) as the nucleobases, was synthesized and shown to hybridize into double-stranded DNA through the formation of hydrogen-bonded Watson-Crick base pairs. dsDNA with metal-mediated base pairs was then obtained by selectively replacing W-C hydrogen bonds by coordination bonds to central silver(I) ions. The oligonucleotide I adopts a duplex structure in the absence of Ag(+) ions, and its stability is significantly enhanced in the presence of Ag(+) ions while its double-helix structure is retained. Temperature-dependent UV spectroscopy, circular dichroism spectroscopy, and ESI mass spectrometry were used to confirm the selective formation of the silver(I)-mediated base pairs. This strategy could become useful for preparing stable metallo-DNA-based nanostructures.
Accurate thermodynamic parameters improve RNA structure predictions and thus accelerate understanding of RNA function and the identification of RNA drug binding sites. Many viral RNA structures, such as internal ribosome entry sites, have internal loops and bulges that are potential drug target sites. Current models used to predict internal loops are biased towards small, symmetric purine loops, and thus poorly predict asymmetric, pyrimidine-rich loops with more than 6 nucleotides that occur frequently in viral RNA. This paper presents new thermodynamic data for 40 pyrimidine loops, many of which can form UU or protonated CC base pairs. Protonated cytosine and uracil base pairs stabilize asymmetric internal loops. Accurate prediction rules are presented that account for all thermodynamic measurements of RNA asymmetric internal loops. New loop initiation terms for loops with more than 6 nucleotides are presented that do not follow previous assumptions that increasing asymmetry destabilizes loops. Since the last update in 2004 (Mathews 2004), 126 new loops with asymmetry or sizes greater than 2x2 have been measured. These new measurements significantly deepen and diversify the thermodynamic database for RNA. These results will help better predict internal loops that are larger, pyrimidine-rich, and occur within viral structures such as internal ribosome entry sites.