Chemically modified proteins are invaluable tools for studying the molecular details of biological processes, and they also hold great potential as new therapeutic agents. Several methods have been developed for the site-specific modification of proteins, one of the most widely used being expressed protein ligation (EPL) in which a recombinant α-thioester is ligated to an N-terminal Cys-containing peptide. Despite the widespread use of EPL, the generation and isolation of the required recombinant protein α-thioesters remain challenging. We describe here a new method for the preparation and purification of recombinant protein α-thioesters using engineered versions of naturally split DnaE inteins. This family of autoprocessing enzymes is closely related to the inteins currently used for protein α-thioester generation, but they feature faster kinetics and are split into two inactive polypeptides that need to associate to become active. Taking advantage of the strong affinity between the two split intein fragments, we devised a streamlined procedure for the purification and generation of protein α-thioesters from cell lysates and applied this strategy for the semisynthesis of a variety of proteins including an acetylated histone and a site-specifically modified monoclonal antibody.
The Chlamydomonas reinhardtii chloroplast-localized poly(A)-binding protein RB47 is predicted to contain a non-conserved linker (NCL) sequence flanked by highly conserved N- and C-terminal sequences, based on the corresponding cDNA. RB47 was purified from chloroplasts in association with an endoribonuclease activity; however, protein sequencing failed to detect the NCL. Furthermore, while recombinant RB47 including the NCL did not display endoribonuclease activity in vitro, versions lacking the NCL displayed strong activity. Both full-length and shorter forms of RB47 could be detected in chloroplasts, with conversion to the shorter form occurring in chloroplasts isolated from cells grown in the light. This conversion could be replicated in vitro in chloroplast extracts in a light-dependent manner, where epitope tags and protein sequencing showed that the NCL was excised from a full-length recombinant substrate, together with splicing of the flanking sequences. The requirement for endogenous factors and light differentiates this protein splicing from autocatalytic inteins, and may allow the chloroplast to regulate the activation of RB47 endoribonuclease activity. We speculate that this protein splicing activity arose to post-translationally repair proteins that had been inactivated by deleterious insertions or extensions.
An intein from Halobacterium salinarum can be isolated as an unspliced precursor protein with exogenous exteins after over-expression in Escherichia coli. The intein promotes protein splicing and uncoupled N-terminal cleavage in vitro, conditional on incubation with NaCl or KCl at concentrations greater than 1.5 M. The protein splicing reaction is also conditional on reduction of a disulfide bond between two active site cysteines. Conditional protein splicing under these relatively mild conditions may lead to advances in intein-based biotechnology applications and hints at the possibility that this H. salinarum intein could serve as a switch to control extein activity under physiologically relevant conditions.
Inteins, also called protein introns, are self-splicing mobile elements found in all domains of life. A bioinformatic survey of genomic data highlights a biased distribution of inteins among functional categories of proteins in both bacteria and archaea, with a strong preference for a single network of functions containing replisome proteins. Many non-orthologous, functionally equivalent replicative proteins in bacteria and archaea carry inteins, suggesting a selective retention of inteins in proteins of particular functions across domains of life. Inteins cluster not only in proteins with related roles, but also in specific functional units of those proteins, like ATPase domains. This peculiar bias does not fully fit the models describing inteins exclusively as parasitic elements. In such models, evolutionary dynamics of inteins is viewed primarily through their mobility with the intein homing endonuclease (HEN) as the major factor of intein acquisition and loss. Although the HEN is essential for intein invasion and spread in populations, HEN dynamics does not explain the observed biased distribution of inteins among proteins in specific functional categories. We propose that the protein splicing domain of the intein can act as an environmental sensor that adapts to a particular niche and could potentially increase the chance of the intein becoming fixed in a population. We argue that selective retention of some inteins might be beneficial under certain environmental stresses, to act as panic buttons that reversibly inhibit specific networks, consistent with the observed intein distribution.
Inteins are intervening proteins that undergo an autocatalytic splicing reaction that ligates flanking host protein sequences termed exteins. Some intein-containing proteins have evolved to couple splicing to environmental signals; this represents a new form of posttranslational regulation. Of particular interest is RadA from the archaeon Pyrococcus horikoshii, for which long-range intein-extein interactions block splicing, requiring temperature and single-stranded DNA (ssDNA) substrate to splice rapidly and accurately. Here, we report that splicing of the intein-containing RadA from another archaeon, Thermococcus sibericus, is activated by significantly lower temperatures than is P. horikoshii RadA, consistent with differences in their growth environments. Investigation into variations between T. sibericus and P. horikoshii RadA inteins led to the discovery that a nonconserved region (NCR) of the intein, a flexible loop where a homing endonuclease previously resided, is critical to splicing. Deletion of the NCR leads to a substantial loss in the rate and accuracy of P. horikoshii RadA splicing only within native exteins. The influence of the NCR deletion can be largely overcome by ssDNA, demonstrating that the splicing-competent conformation can be achieved. We present a model whereby the NCR is a flexible hinge which acts as a switch by controlling distant intein-extein interactions that inhibit active site assembly. These results speak to the repurposing of the vestigial endonuclease loop to control an intein-extein partnership, which ultimately allows exquisite adaptation of protein splicing upon changes in the environment.
IMPORTANCE Inteins are mobile genetic elements that interrupt coding sequences (exteins) and are removed by protein splicing. They are abundant elements in microbes, and recent work has demonstrated that protein splicing can be controlled by environmental cues, including the substrate of the intein-containing protein. Here, we describe an intein-extein collaboration that controls temperature-induced splicing of RadA from two archaea and how variation in this intein-extein partnership results in fine-tuning of splicing to closely match the environment. Specifically, we found that a small sequence difference between the two inteins, a flexible loop that likely once housed a homing endonuclease used for intein mobility, acts as a switch to control intein-extein interactions that block splicing. Our results argue strongly that some inteins have evolved away from a purely parasitic lifestyle to control the activity of host proteins, representing a new form of posttranslational regulation that is potentially widespread in the microbial world.
Split inteins play an important role in modern protein semisynthesis techniques. These naturally occurring protein splicing domains can be used for in vitro and in vivo protein modification, peptide and protein cyclization, segmental isotopic labeling, and the construction of biosensors. The best-characterized family of split inteins, the cyanobacterial DnaE inteins, shows particular promise, as many of these can splice proteins in under one minute. Despite this speed, the activity of these inteins is context-dependent: certain peptide sequences surrounding their ligation junction (called local N- and C-exteins) are strongly preferred, while other sequences cause a dramatic reduction in splicing kinetics and yields. These sequence constraints limit the utility of inteins, and thus a more detailed understanding of their participation in protein splicing is needed. Here, we present a thorough kinetic analysis of the relationship between C-extein composition and split intein activity. The results of these experiments were used to guide structural and molecular dynamics studies, which revealed that the motions of catalytic residues are constrained by the second C-extein residue, likely forcing them into an active conformation that promotes rapid protein splicing. Together, our structural and functional studies also highlight a key region of the intein structure that can be re-engineered to increase intein promiscuity.
Inteins are naturally occurring intervening sequences that catalyze a protein splicing reaction resulting in intein excision and concatenation of the flanking polypeptides (exteins) with a native peptide bond. Inteins display a diversity of catalytic mechanisms within a highly conserved fold, which is shared with hedgehog autoprocessing proteins. The unusual chemistry of inteins has afforded powerful biotechnology tools for controlling enzyme function upon splicing and allowing peptides of different origins to be coupled in a specific, time-defined manner. The extein sequences immediately flanking the intein affect splicing and can be defined as the intein’s substrate. Due to the enormous potential complexity of all possible flanking sequences, studying intein substrate specificity has been difficult. Therefore, we developed a genetic selection for splicing-dependent kanamycin resistance that showed no significant bias when six amino acids that immediately flank the intein insertion site were randomized. We applied this selection to examine the sequence space of residues flanking the Nostoc punctiforme Npu DnaE intein and found that this intein efficiently splices a much wider range of sequences than previously thought, with little N-extein specificity and only two important C-extein positions. The novel selected extein sequences were sufficient to promote splicing in three unrelated proteins, confirming the generalizable nature of the specificity data and defining new potential insertion sites for any target. Kinetic analysis showed splicing rates with the selected exteins that were as fast as or faster than those with the native extein, refuting past assumptions that the naturally selected flanking extein sequences are optimal for splicing.
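To give a sense of the "enormous potential complexity" mentioned in the abstract above, the short sketch below counts the number of distinct peptide sequences when six flanking positions are randomized. It assumes that each position can be any of the 20 canonical amino acids; the abstract does not state the randomization scheme, so the alphabet size and the function name `library_size` are illustrative assumptions only.

```python
# Illustrative arithmetic only: size of a fully randomized extein library.
# Assumption (not stated in the abstract): each randomized position can be
# any of the 20 canonical amino acids.

CANONICAL_AMINO_ACIDS = 20


def library_size(positions: int, alphabet: int = CANONICAL_AMINO_ACIDS) -> int:
    """Return the number of distinct sequences for `positions` randomized sites."""
    if positions < 0:
        raise ValueError("positions must be non-negative")
    return alphabet ** positions


if __name__ == "__main__":
    # Six randomized positions flanking the intein insertion site.
    print(library_size(6))
```

Under this assumption, six positions already correspond to tens of millions of possible sequences, which is consistent with the abstract's use of a genetic selection to survey the space.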
Biologics, such as antibody-drug conjugates, are becoming mainstream therapeutics. Consequently, methods to functionalize biologics without disrupting their native properties are essential for identifying, characterizing, and translating candidate biologics from the bench to clinical practice. Here, we present a method for site-specific, carboxy-terminal modification of single-chain antibody fragments (scFvs). ScFvs displayed on the surface of yeast were isolated and functionalized by combining intein-mediated expressed protein ligation (EPL) with inverse electron-demand Diels-Alder (IEDDA) cycloaddition using a styrene-tetrazine pair. The high thiol concentration required to trigger EPL can hinder the subsequent chemoselective ligation reactions; therefore, the EPL reaction was used to append styrene to the scFv, limiting tetrazine exposure to damaging thiols. Subsequently, the styrene-functionalized scFv was reacted with tetrazine-conjugated compounds in an IEDDA cycloaddition to generate functionalized scFvs that retain their native binding activity. Rapid functionalization of yeast surface-derived scFv in a site-directed manner could find utility in many downstream laboratory and pre-clinical applications.
- Proceedings of the National Academy of Sciences of the United States of America
- Published over 2 years ago
The facile rearrangement of “S-acyl isopeptides” to native peptide bonds via S,N-acyl shift is central to the success of native chemical ligation, the widely used approach for protein total synthesis. Proximity-driven amide bond formation via acyl transfer reactions in other contexts has proven generally less effective. Here, we show that under neutral aqueous conditions, “O-acyl isopeptides” derived from hydroxy-asparagine [aspartic acid-β-hydroxamic acid; Asp(β-HA)] rearrange to form native peptide bonds via an O,N-acyl shift. This process constitutes a rare example of an O,N-acyl shift that proceeds rapidly across a medium-size ring (t1/2 ∼ 15 min), and takes place in water with minimal interference from hydrolysis. In contrast to serine/threonine or tyrosine, which form O-acyl isopeptides only by the use of highly activated acyl donors and appropriate protecting groups in organic solvent, Asp(β-HA) is sufficiently reactive to form O-acyl isopeptides by treatment with an unprotected peptide-α-thioester, at low mM concentration, in water. These findings were applied to an acyl transfer-based chemical ligation strategy, in which an unprotected N-terminal Asp(β-HA)-peptide and peptide-α-thioester react under aqueous conditions to give a ligation product ultimately linked by a native peptide bond.
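As a rough illustration of what the reported half-life (t1/2 ∼ 15 min) implies, the sketch below converts a half-life into a rate constant and estimates how much O-acyl isopeptide would remain over time. It assumes simple first-order kinetics for the intramolecular rearrangement, which the abstract does not explicitly state, and ignores competing hydrolysis; the function names are hypothetical.

```python
import math

# Illustrative sketch only. Assumptions (not from the abstract):
#   - the O,N-acyl shift follows simple first-order kinetics
#   - competing hydrolysis is negligible
# The 15 min half-life is the approximate value quoted in the abstract.

HALF_LIFE_MIN = 15.0


def rate_constant(half_life: float) -> float:
    """First-order rate constant k = ln(2) / t_half (units: 1/time)."""
    if half_life <= 0:
        raise ValueError("half_life must be positive")
    return math.log(2) / half_life


def fraction_remaining(t: float, half_life: float = HALF_LIFE_MIN) -> float:
    """Fraction of unrearranged isopeptide left after time t: exp(-k * t)."""
    return math.exp(-rate_constant(half_life) * t)


if __name__ == "__main__":
    for minutes in (15, 30, 60):
        print(minutes, round(fraction_remaining(minutes), 4))
```

Under these assumptions, each additional half-life halves the remaining isopeptide, so about four half-lives (roughly an hour) would leave only a few percent unrearranged.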
Harnessing and controlling self-assembly is an important step in developing proteins as novel biomaterials. With this goal, here we report the design of a general genetically programmed system that covalently concatenates multiple distinct protein domains into specific assembled arrays. It is driven by iterative intein-mediated Native Chemical Ligation (NCL) under mild native conditions. The system uses a series of initially inert recombinant protein fusions that sandwich the protein modules to be ligated between one of a number of different affinity tags and an intein protein domain. Orthogonal activation at opposite termini of compatible protein fusions, via protease and intein cleavage, coupled with sequential mixing directs an irreversible and traceless stepwise assembly process. This gives total control over the composition and arrangement of component proteins within the final product. It also enabled the limits of the system (reaction efficiency and yield) to be investigated and led to the production of “functional” assemblies.