Concept: Base pair
Modern whole-organism genome analysis, in combination with biomass estimates, allows us to estimate a lower bound on the total information content in the biosphere: 5.3 × 10^31 (±3.6 × 10^31) megabases (Mb) of DNA. Given conservative estimates regarding DNA transcription rates, this information content suggests biosphere processing speeds exceeding yottaNOPS values (10^24 Nucleotide Operations Per Second). Although prokaryotes evolved at least 3 billion years before plants and animals, we find that the information content of prokaryotes is similar to that of plants and animals at the present day. This information-based approach offers a new way to quantify anthropogenic and natural processes in the biosphere and its information diversity over time.
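As a rough illustration of the scale involved, the sketch below converts the DNA estimate (read as 5.3 × 10^31 Mb, with an uncertainty of 3.6 × 10^31 Mb) into bases and bits. The figure of 2 bits per base is an assumption that is not stated in the abstract: it treats the four nucleotides as equally likely and so gives an upper bound per base. The function name `biosphere_dna_bits` is hypothetical.

```python
import math

BASES_PER_MEGABASE = 1e6
# Assumption: four equiprobable nucleotides, so log2(4) = 2 bits per base.
BITS_PER_BASE = math.log2(4)


def biosphere_dna_bits(megabases: float) -> float:
    """Convert a DNA quantity in megabases to bits under the 2-bit assumption."""
    return megabases * BASES_PER_MEGABASE * BITS_PER_BASE


estimate_mb = 5.3e31      # central estimate from the abstract
uncertainty_mb = 3.6e31   # quoted uncertainty from the abstract

print(f"bases: {estimate_mb * BASES_PER_MEGABASE:.2e}")
print(f"bits:  {biosphere_dna_bits(estimate_mb):.2e} "
      f"(+/- {biosphere_dna_bits(uncertainty_mb):.2e})")
```

Under a different per-base information assumption, only `BITS_PER_BASE` would change.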
In order to explore the diversity and selective signatures of duplication and deletion human copy number variants (CNVs), we sequenced 236 individuals from 125 distinct human populations. We observed that duplications exhibit fundamentally different population genetic and selective signatures than deletions and are more likely to be stratified between human populations. Through reconstruction of the ancestral human genome, we identify megabases of DNA lost in different human lineages and pinpoint large duplications that introgressed from the extinct Denisova lineage now found at high frequency exclusively in Oceanic populations. We find that the proportion of CNV base pairs to single nucleotide variant base pairs is greater among non-Africans than it is among African populations, but we conclude that this difference is likely due to unique aspects of non-African population history as opposed to differences in CNV load.
Genomes are composed of long strings of nucleotide monomers (A, C, G and T) that are either scavenged from the organism’s environment or built from metabolic precursors. The biosynthesis of each nucleotide differs in atomic requirements with different nucleotides requiring different quantities of nitrogen atoms. However, the impact of the relative availability of dietary nitrogen on genome composition and codon bias is poorly understood.
Single-molecule techniques facilitate analysis of mechanical transitions within nucleic acids and proteins. Here, we describe an integrated fluorescence and magnetic tweezers instrument that permits detection of nanometer-scale DNA structural rearrangements together with the application of a wide range of stretching forces to individual DNA molecules. We have analyzed the force-dependent equilibrium and rate constants for telomere DNA G-quadruplex (GQ) folding and unfolding, and have determined the location of the transition state barrier along the well-defined DNA-stretching reaction coordinate. Our results reveal the mechanical unfolding pathway of the telomere DNA GQ is characterized by a short distance (<1 nm) to the transition state for the unfolding reaction. This mechanical unfolding response reflects a critical contribution of long-range interactions to the global stability of the GQ fold, and suggests that telomere-associated proteins need only disrupt a few base pairs to destabilize GQ structures. Comparison of the GQ unfolded state with a single-stranded polyT DNA revealed the unfolded GQ exhibits a compacted non-native conformation reminiscent of the protein molten globule. We expect the capacity to interrogate macromolecular structural transitions with high spatial resolution under conditions of low forces will have broad application in analyses of nucleic acid and protein folding.
Sensitive and specific methodologies for detecting pathogenic genes at the point of care are still urgently needed for rapid diagnosis of infectious diseases. This work develops a simple and pragmatic electrochemical biosensing strategy for ultrasensitive and specific detection of pathogenic nucleic acids directly, by integrating homogeneous target-initiated transcription amplification (HTITA) with an interfacial sensing process in a single analysis system. The homogeneous recognition and specific binding of target DNA with the designed hairpin probe triggered a circular primer extension reaction to form DNA double strands, which contained the T7 RNA polymerase promoter and served as templates for in vitro transcription amplification. The HTITA protocol produced numerous single-stranded RNA products that could simultaneously hybridize with the detection probes and the immobilized capture probes for enzyme-amplified electrochemical detection on the biosensor surface. The proposed electrochemical biosensing strategy showed very high sensitivity and selectivity for target DNA, with a dynamic response range from 1 fM to 100 pM. Using Salmonella as a model, the established strategy was successfully applied to directly detect the invA gene from a genomic DNA extract. The proposed strategy presents a simple, pragmatic platform for ultrasensitive nucleic acid detection and could become a versatile and powerful tool for point-of-care pathogen identification.
Organisms are defined by the information encoded in their genomes, and since the origin of life this information has been encoded using a two-base-pair genetic alphabet (A-T and G-C). In vitro, the alphabet has been expanded to include several unnatural base pairs (UBPs). We have developed a class of UBPs formed between nucleotides bearing hydrophobic nucleobases, exemplified by the pair formed between d5SICS and dNaM (d5SICS-dNaM), which is efficiently PCR-amplified and transcribed in vitro, and whose unique mechanism of replication has been characterized. However, expansion of an organism’s genetic alphabet presents new and unprecedented challenges: the unnatural nucleoside triphosphates must be available inside the cell; endogenous polymerases must be able to use the unnatural triphosphates to faithfully replicate DNA containing the UBP within the complex cellular milieu; and finally, the UBP must be stable in the presence of pathways that maintain the integrity of DNA. Here we show that an exogenously expressed algal nucleotide triphosphate transporter efficiently imports the triphosphates of both d5SICS and dNaM (d5SICSTP and dNaMTP) into Escherichia coli, and that the endogenous replication machinery uses them to accurately replicate a plasmid containing d5SICS-dNaM. Neither the presence of the unnatural triphosphates nor the replication of the UBP introduces a notable growth burden. Lastly, we find that the UBP is not efficiently excised by DNA repair pathways. Thus, the resulting bacterium is the first organism to propagate stably an expanded genetic alphabet.
We report the sequencing and assembly of a reference genome for the human GM12878 Utah/Ceph cell line using the MinION (Oxford Nanopore Technologies) nanopore sequencer. 91.2 Gb of sequence data, representing ∼30× theoretical coverage, were produced. Reference-based alignment enabled detection of large structural variants and epigenetic modifications. De novo assembly of nanopore reads alone yielded a contiguous assembly (NG50 ∼3 Mb). We developed a protocol to generate ultra-long reads (N50 > 100 kb, read lengths up to 882 kb). Incorporating an additional 5× coverage of these ultra-long reads more than doubled the assembly contiguity (NG50 ∼6.4 Mb). The final assembled genome was 2,867 million bases in size, covering 85.8% of the reference. Assembly accuracy, after incorporating complementary short-read sequencing data, exceeded 99.8%. Ultra-long reads enabled assembly and phasing of the 4-Mb major histocompatibility complex (MHC) locus in its entirety, measurement of telomere repeat length, and closure of gaps in the reference human genome assembly GRCh38.
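The coverage and contiguity figures quoted above follow standard definitions, which the sketch below illustrates. The function names are hypothetical, and the genome size of 3.04 Gb is an assumption chosen only because it makes 91.2 Gb correspond to 30× coverage; the abstract does not state the genome size it used. N50 is the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the total; NG50 uses half of an assumed genome size instead of half of the assembly length.

```python
from typing import Iterable, Optional


def coverage(total_sequenced_bases: float, genome_size: float) -> float:
    """Theoretical depth of coverage: sequenced bases divided by genome size."""
    return total_sequenced_bases / genome_size


def n50(lengths: Iterable[int], target_total: Optional[int] = None) -> Optional[int]:
    """Return N50 of the lengths, or NG50 when target_total is a genome size.

    Returns None if the lengths never reach half of the target total.
    """
    lengths = list(lengths)
    total = sum(lengths) if target_total is None else target_total
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length
    return None


# Assumed genome size (not given in the abstract).
ASSUMED_GENOME_SIZE = 3.04e9
print(f"coverage: {coverage(91.2e9, ASSUMED_GENOME_SIZE):.1f}x")

# Toy contig lengths, purely illustrative.
toy_contigs = [10, 8, 6, 4, 2]
print("N50: ", n50(toy_contigs))
print("NG50:", n50(toy_contigs, target_total=40))
```

The same `n50` function applies to read lengths, which is how a read N50 such as the one quoted for the ultra-long reads would be computed.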
- Proceedings of the National Academy of Sciences of the United States of America
Benzo[a]pyrene (BaP), a polycyclic aromatic hydrocarbon, is the major cause of lung cancer. BaP forms covalent DNA adducts after metabolic activation and induces mutations. We have developed a method for capturing oligonucleotides carrying bulky base adducts, including UV-induced cyclobutane pyrimidine dimers (CPDs) and BaP diol epoxide-deoxyguanosine (BPDE-dG), which are removed from the genome by nucleotide excision repair. The isolated oligonucleotides are ligated to adaptors, and after damage-specific immunoprecipitation, the adaptor-ligated oligonucleotides are converted to dsDNA with an appropriate translesion DNA synthesis (TLS) polymerase, followed by PCR amplification and next-generation sequencing (NGS) to generate genome-wide repair maps. We have termed this method translesion excision repair-sequencing (tXR-seq). In contrast to our previously described XR-seq method, tXR-seq does not depend on repair/removal of the damage in the excised oligonucleotides, and thus it is applicable to essentially all types of DNA damage processed by nucleotide excision repair. Here we present the excision repair maps for CPDs and BPDE-dG adducts generated by tXR-seq for the human genome. In addition, we report the sequence specificity of BPDE-dG excision repair using tXR-seq.
A framework for open discourse on the use of CRISPR-Cas9 technology to manipulate the human genome is urgently needed.
The spontaneous deamination of cytosine is a major source of transitions from C•G to T•A base pairs, which account for half of known pathogenic point mutations in humans. The ability to efficiently convert targeted A•T base pairs to G•C could therefore advance the study and treatment of genetic diseases. The deamination of adenine yields inosine, which is treated as guanine by polymerases, but no enzymes are known to deaminate adenine in DNA. Here we describe adenine base editors (ABEs) that mediate the conversion of A•T to G•C in genomic DNA. We evolved a transfer RNA adenosine deaminase to operate on DNA when fused to a catalytically impaired CRISPR-Cas9 mutant. Extensive directed evolution and protein engineering resulted in seventh-generation ABEs that convert targeted A•T base pairs efficiently to G•C (approximately 50% efficiency in human cells) with high product purity (typically at least 99.9%) and low rates of indels (typically no more than 0.1%). ABEs introduce point mutations more efficiently and cleanly, and with less off-target genome modification, than a current Cas9 nuclease-based method, and can install disease-correcting or disease-suppressing mutations in human cells. Together with previous base editors, ABEs enable the direct, programmable introduction of all four transition mutations without double-stranded DNA cleavage.
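The "four transition mutations" mentioned above are the purine-to-purine and pyrimidine-to-pyrimidine substitutions (A↔G and C↔T). A minimal sketch of that classification, written in the base-pair notation the abstract uses, is shown below; the function names are hypothetical and the code only illustrates the terminology, not any part of the editing method itself.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}
# Watson-Crick partner of each base.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def is_transition(ref: str, alt: str) -> bool:
    """True for purine<->purine or pyrimidine<->pyrimidine substitutions."""
    if ref == alt:
        raise ValueError("ref and alt must differ")
    pair = {ref, alt}
    return pair <= PURINES or pair <= PYRIMIDINES


def base_pair_change(ref: str, alt: str) -> str:
    """Describe a substitution as a change of the whole base pair."""
    return f"{ref}•{COMPLEMENT[ref]} to {alt}•{COMPLEMENT[alt]}"


# Enumerate all twelve single-base substitutions and label each one.
for ref in "ACGT":
    for alt in "ACGT":
        if ref != alt:
            kind = "transition" if is_transition(ref, alt) else "transversion"
            print(f"{base_pair_change(ref, alt)}: {kind}")
```

Of the twelve possible substitutions, four are transitions and eight are transversions; the A•T to G•C change targeted by ABEs is one of the four.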