Journal: American Journal of Human Genetics
Human genes governing innate immunity provide a valuable tool for the study of the selective pressure imposed by microorganisms on host genomes. A comprehensive, genome-wide study of how selective constraints and adaptations have driven the evolution of innate immunity genes is missing. Using full-genome sequence variation from the 1000 Genomes Project, we first show that innate immunity genes have globally evolved under stronger purifying selection than the remainder of protein-coding genes. We identify a gene set under the strongest selective constraints, mutations in which are likely to predispose individuals to life-threatening disease, as illustrated by STAT1 and TRAF3. We then evaluate the occurrence of local adaptation and detect 57 high-scoring signals of positive selection at innate immunity genes, variation in which has been associated with susceptibility to common infectious or autoimmune diseases. Furthermore, we show that most adaptations targeting coding variation have occurred in the last 6,000-13,000 years, the period at which populations shifted from hunting and gathering to farming. Finally, we show that innate immunity genes present higher Neandertal introgression than the remainder of the coding genome. Notably, among the genes presenting the highest Neandertal ancestry, we find the TLR6-TLR1-TLR10 cluster, which also contains functional adaptive variation in Europeans. This study identifies highly constrained genes that fulfill essential, non-redundant functions in host survival and reveals others that are more permissive to change (containing variation acquired from archaic hominins or adaptive variants in specific populations), improving our understanding of the relative biological importance of innate immunity pathways in natural conditions.
Over the past 500 years, North America has been the site of ongoing mixing of Native Americans, European settlers, and Africans (brought largely by the trans-Atlantic slave trade), shaping the early history of what became the United States. We studied the genetic ancestry of 5,269 self-described African Americans, 8,663 Latinos, and 148,789 European Americans who are 23andMe customers and show that the legacy of these historical interactions is visible in the genetic ancestry of present-day Americans. We document pervasive mixed ancestry and asymmetrical male and female ancestry contributions in all groups studied. We show that regional ancestry differences reflect historical events, such as early Spanish colonization, waves of immigration from many regions of Europe, and forced relocation of Native Americans within the US. This study sheds light on the fine-scale differences in ancestry within and across the United States and informs our understanding of the relationship between racial and ethnic identities and genetic ancestry.
The past five years have seen many scientific and biological discoveries made through the experimental design of genome-wide association studies (GWASs). These studies were aimed at detecting variants at genomic loci that are associated with complex traits in the population and, in particular, at detecting associations between common single-nucleotide polymorphisms (SNPs) and common diseases such as heart disease, diabetes, auto-immune diseases, and psychiatric disorders. We start by giving a number of quotes from scientists and journalists about perceived problems with GWASs. We will then briefly give the history of GWASs and focus on the discoveries made through this experimental design, what those discoveries tell us and do not tell us about the genetics and biology of complex traits, and what immediate utility has come out of these studies. Rather than giving an exhaustive review of all reported findings for all diseases and other complex traits, we focus on the results for auto-immune diseases and metabolic diseases. We return to the perceived failure or disappointment about GWASs in the concluding section.
The magnitude of the human antibody response to viral antigens is highly variable. To explore the human genetic contribution to this variability, we performed genome-wide association studies of the immunoglobulin G response to 14 pathogenic viruses in 2,363 immunocompetent adults. Significant associations were observed in the major histocompatibility complex region on chromosome 6 for influenza A virus, Epstein-Barr virus (EBV), JC polyomavirus, and Merkel cell polyomavirus. Using local imputation and fine mapping, we identified specific amino acid residues in human leucocyte antigen (HLA) class II proteins as the most probable causal variants underlying these association signals. Common HLA-DRβ1 haplotypes showed virus-specific patterns of humoral-response regulation. We observed an overlap between variants affecting the humoral response to influenza A and EBV and variants previously associated with autoimmune diseases related to these viruses. The results of this study emphasize the central and pathogen-specific role of HLA class II variation in the modulation of humoral immune response to viral antigens in humans.
During neurotransmission, synaptic vesicles undergo multiple rounds of exo-endocytosis, involving recycling and/or degradation of synaptic proteins. While ubiquitin signaling at synapses is essential for neural function, it has been assumed that synaptic proteostasis requires the ubiquitin-proteasome system (UPS). We demonstrate here that turnover of synaptic membrane proteins via the endolysosomal pathway is essential for synaptic function. In both human and mouse, hypomorphic mutations in the ubiquitin adaptor protein PLAA cause an infantile-lethal neurodysfunction syndrome with seizures. Resulting from perturbed endolysosomal degradation, Plaa mutant neurons accumulate K63-polyubiquitylated proteins and synaptic membrane proteins, disrupting synaptic vesicle recycling and neurotransmission. Through characterization of this neurological intracellular trafficking disorder, we establish the importance of ubiquitin-mediated endolysosomal trafficking at the synapse.
Uncombable hair syndrome (UHS), also known as “spun glass hair syndrome,” “pili trianguli et canaliculi,” or “cheveux incoiffables,” is a rare anomaly of the hair shaft that occurs in children and improves with age. UHS is characterized by dry, frizzy, spangly, and often fair hair that is resistant to being combed flat. Until now, both simplex and familial UHS-affected case subjects with autosomal-dominant as well as -recessive inheritance have been reported. However, none of these case subjects were linked to a molecular genetic cause. Here, we report the identification of UHS-causative mutations located in the three genes PADI3 (peptidylarginine deiminase 3), TGM3 (transglutaminase 3), and TCHH (trichohyalin) in a total of 11 children. All of these individuals carry homozygous or compound heterozygous mutations in one of these three genes, indicating an autosomal-recessive inheritance pattern in the majority of UHS case subjects. The two enzymes PADI3 and TGM3, responsible for posttranslational protein modifications, and their target structural protein TCHH are all involved in hair shaft formation. Elucidation of the molecular outcomes of the disease-causing mutations by cell culture experiments and tridimensional protein models demonstrated clear differences in the structural organization and activity of mutant and wild-type proteins. Scanning electron microscopy observations revealed morphological alterations in the hair coat of Padi3 knockout mice. Taken together, these findings elucidate the molecular genetic causes of UHS and shed light on its pathophysiology and hair physiology in general.
With recent rapid advances in genomic technologies, precise delineation of structural chromosome rearrangements at the nucleotide level is becoming increasingly feasible. In this era of “next-generation cytogenetics” (i.e., an integration of traditional cytogenetic techniques and next-generation sequencing), a consensus nomenclature is essential for accurate communication and data sharing. Currently, nomenclature for describing the sequencing data of these aberrations is lacking. Herein, we present a system called Next-Gen Cytogenetic Nomenclature, which is concordant with the International System for Human Cytogenetic Nomenclature (2013). This system starts with the alignment of rearrangement sequences by BLAT or BLAST (alignment tools) and arrives at a concise and detailed description of chromosomal changes. To facilitate usage and implementation of this nomenclature, we are developing a program designated BLA(S)T Output Sequence Tool of Nomenclature (BOSToN), a demonstrative version of which is accessible online. A standardized characterization of structural chromosomal rearrangements is essential both for research analyses and for application in the clinical setting.
Nemaline myopathy (NEM) is a common congenital myopathy. At the very severe end of the NEM clinical spectrum are genetically unresolved cases of autosomal-recessive fetal akinesia sequence. We studied a multinational cohort of 143 severe-NEM-affected families lacking genetic diagnosis. We performed whole-exome sequencing of six families and targeted gene sequencing of additional families. We identified 19 mutations in KLHL40 (kelch-like family member 40) in 28 apparently unrelated NEM kindreds of various ethnicities. Accounting for up to 28% of the tested individuals in the Japanese cohort, KLHL40 mutations were found to be the most common cause of this severe form of NEM. Clinical features of affected individuals were severe and distinctive and included fetal akinesia or hypokinesia and contractures, fractures, respiratory failure, and swallowing difficulties at birth. Molecular modeling suggested that the missense substitutions would destabilize the protein. Protein studies showed that KLHL40 is a striated-muscle-specific protein that is absent in KLHL40-associated NEM skeletal muscle. In zebrafish, klhl40a and klhl40b expression is largely confined to the myotome and skeletal muscle, and knockdown of these isoforms results in disruption of muscle structure and loss of movement. We identified KLHL40 mutations as a frequent cause of severe autosomal-recessive NEM and showed that KLHL40 plays a key role in muscle development and function. Screening of KLHL40 should be a priority in individuals who are affected by autosomal-recessive NEM and who present with prenatal symptoms and/or contractures and in all Japanese individuals with severe NEM.
Hemophilia B, or the “royal disease,” arises from mutations in coagulation factor IX (F9). Mutations within the F9 promoter are associated with a remarkable hemophilia B subtype, termed hemophilia B Leyden, in which symptoms ameliorate after puberty. Mutations at the -5/-6 site (nucleotides -5 and -6 relative to the transcription start site, designated +1) account for the majority of Leyden cases and have been postulated to disrupt the binding of a transcriptional activator, the identity of which has remained elusive for more than 20 years. Here, we show that ONECUT transcription factors (ONECUT1 and ONECUT2) bind to the -5/-6 site. The various hemophilia B Leyden mutations that have been reported in this site inhibit ONECUT binding to varying degrees, which correlate well with their associated clinical severities. In addition, expression of F9 is crucially dependent on ONECUT factors in vivo, and as such, mice deficient in ONECUT1, ONECUT2, or both exhibit depleted levels of F9. Taken together, our findings establish ONECUT transcription factors as the missing hemophilia B Leyden regulators that operate through the -5/-6 site.
Whole-genome sequencing across multiple samples in a population provides an unprecedented opportunity for comprehensively characterizing the polymorphic variants in the population. Although the 1000 Genomes Project (1KGP) has offered brief insights into the value of population-level sequencing, the low coverage has compromised the ability to confidently detect rare and low-frequency variants. In addition, the composition of populations in the 1KGP is not complete, despite the fact that the study design has been extended to more than 2,500 samples from more than 20 population groups. The Malays are one of the Austronesian groups predominantly present in Southeast Asia and Oceania, and the Singapore Sequencing Malay Project (SSMP) aims to perform deep whole-genome sequencing of 100 healthy Malays. By sequencing at a minimum of 30× coverage, we have illustrated the higher sensitivity in detecting low-frequency and rare variants and the ability to investigate the presence of hotspots of functional mutations. Compared to the low-pass sequencing in the 1KGP, the deeper coverage allows more functional variants to be identified for each person. A comparison of the fidelity of genotype imputation of Malays indicated that a population-specific reference panel, such as the SSMP, outperforms a cosmopolitan panel with a larger number of individuals for common SNPs. For lower-frequency (<5%) markers, a larger number of individuals might have to be whole-genome sequenced so that the accuracy currently afforded by the 1KGP can be achieved. The SSMP data are expected to be the benchmark for evaluating the value of deep population-level sequencing versus low-pass sequencing, especially in populations that are poorly represented in population-genetics studies.