Journal: American Journal of Human Genetics
Pathogens and the diseases they cause have been among the most important selective forces experienced by humans during their evolutionary history. Although adaptive alleles generally arise by mutation, introgression can also be a valuable source of beneficial alleles. Archaic humans, who lived in Europe and Western Asia for more than 200,000 years, were probably well adapted to this environment and its local pathogens. It is therefore conceivable that modern humans entering Europe and Western Asia who admixed with them obtained a substantial immune advantage from the introgression of archaic alleles. Here we document a cluster of three Toll-like receptors (TLR6-TLR1-TLR10) in modern humans that carries three distinct archaic haplotypes, indicating repeated introgression from archaic humans. Two of these haplotypes are most similar to the Neandertal genome, and the third haplotype is most similar to the Denisovan genome. The Toll-like receptors are key components of innate immunity and provide an important first line of immune defense against bacteria, fungi, and parasites. The unusually high allele frequencies and unexpected levels of population differentiation indicate that there has been local positive selection on multiple haplotypes at this locus. We show that the introgressed alleles have clear functional effects in modern humans; archaic-like alleles underlie differences in the expression of the TLR genes and are associated with reduced microbial resistance and increased allergic disease in large cohorts. This provides strong evidence for recurrent adaptive introgression at the TLR6-TLR1-TLR10 locus, resulting in differences in disease phenotypes in modern humans.
Sequencing the genomes of extinct hominids has reshaped our understanding of modern human origins. Here, we analyze ∼120 kb of exome-captured Y-chromosome DNA from a Neandertal individual from El Sidrón, Spain. We investigate its divergence from orthologous chimpanzee and modern human sequences and find strong support for a model that places the Neandertal lineage as an outgroup to modern human Y chromosomes, including A00, the highly divergent basal haplogroup. We estimate that the time to the most recent common ancestor (TMRCA) of Neandertal and modern human Y chromosomes is ∼588 thousand years ago (kya) (95% confidence interval [CI]: 447-806 kya). This is ∼2.1 (95% CI: 1.7-2.9) times longer than the TMRCA of A00 and other extant modern human Y-chromosome lineages. This estimate suggests that the Y-chromosome divergence mirrors the population divergence of Neandertals and modern human ancestors, and it refutes alternative scenarios of a relatively recent or super-archaic origin of Neandertal Y chromosomes. The fact that the Neandertal Y we describe has never been observed in modern humans suggests that the lineage is most likely extinct. We identify protein-coding differences between Neandertal and modern human Y chromosomes, including potentially damaging changes to PCDH11Y, TMSB4Y, USP9Y, and KDM5D. Three of these changes are missense mutations in genes that produce male-specific minor histocompatibility (H-Y) antigens. Antigens derived from KDM5D, for example, are thought to elicit a maternal immune response during gestation. It is possible that incompatibilities at one or more of these genes played a role in the reproductive isolation of the two groups.
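The two point estimates quoted above (a TMRCA of ∼588 kya and a ∼2.1-fold ratio) together imply an approximate TMRCA for A00 and the other extant modern human Y-chromosome lineages. The following is a minimal arithmetic sketch of that implication only. The function name is hypothetical, and the confidence intervals are deliberately not propagated, because dividing interval endpoints would not give a valid interval.

```python
def implied_tmrca_kya(older_tmrca_kya: float, fold_ratio: float) -> float:
    """Divide the older TMRCA by the fold ratio to get the younger TMRCA."""
    if fold_ratio <= 0:
        raise ValueError("fold_ratio must be positive")
    return older_tmrca_kya / fold_ratio


# Point estimates quoted in the abstract.
neandertal_human_tmrca_kya = 588.0
fold_ratio = 2.1

a00_tmrca_kya = implied_tmrca_kya(neandertal_human_tmrca_kya, fold_ratio)
print(f"Implied TMRCA of A00 and other modern Y lineages: ~{a00_tmrca_kya:.0f} kya")
```

On these point estimates the implied value is roughly 280 kya. It is a back-of-the-envelope figure, not an estimate reported in the abstract.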
The Canaanites inhabited the Levant region during the Bronze Age and established a culture that became influential in the Near East and beyond. However, the Canaanites, unlike most other ancient Near Easterners of this period, left few surviving textual records and thus their origin and relationship to ancient and present-day populations remain unclear. In this study, we sequenced five whole genomes from ∼3,700-year-old individuals from the city of Sidon, a major Canaanite city-state on the Eastern Mediterranean coast. We also sequenced the genomes of 99 individuals from present-day Lebanon to catalog modern Levantine genetic diversity. We find that a Bronze Age Canaanite-related ancestry was widespread in the region, shared among urban populations inhabiting the coast (Sidon) and inland populations (Jordan) who likely lived in farming societies or were pastoral nomads. This Canaanite-related ancestry derived from mixture between local Neolithic populations and eastern migrants genetically related to Chalcolithic Iranians. We estimate, using linkage-disequilibrium decay patterns, that admixture occurred 6,600-3,550 years ago, coinciding with recorded massive population movements in Mesopotamia during the mid-Holocene. We show that present-day Lebanese derive most of their ancestry from a Canaanite-related population, which therefore implies substantial genetic continuity in the Levant since at least the Bronze Age. In addition, we find Eurasian ancestry in the Lebanese not present in Bronze Age or earlier Levantines. We estimate that this Eurasian ancestry arrived in the Levant around 3,750-2,170 years ago during a period of successive conquests by distant populations.
Men have a shorter life expectancy than women, but the underlying factors are not clear. Late-onset, sporadic Alzheimer disease (AD) is a common and lethal neurodegenerative disorder, and many germline inherited variants have been found to influence the risk of developing AD. Our previous results show that a fundamentally different genetic variant, i.e., lifetime-acquired loss of chromosome Y (LOY) in blood cells, is associated with all-cause mortality and an increased risk of non-hematological tumors and that LOY could be induced by tobacco smoking. Here we tested the hypothesis that men with LOY are more susceptible to AD and show that LOY is associated with AD in three independent studies of different types. In a case-control study, males with an AD diagnosis had a higher degree of LOY mosaicism (adjusted odds ratio = 2.80, p = 0.0184, AD events = 606). Furthermore, in two prospective studies, men with LOY at blood sampling had a greater risk of incident AD diagnosis during follow-up (hazard ratio [HR] = 6.80, 95% confidence interval [95% CI] = 2.16-21.43, AD events = 140, p = 0.0011). Thus, LOY in blood is associated with risks of both AD and cancer, suggesting a role of LOY in blood cells in disease processes in other tissues, possibly via defective immunosurveillance. As a male-specific risk factor, LOY might explain why males on average live shorter lives than females.
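The hazard ratio, its 95% CI, and the p value reported above can be checked against each other. The sketch below assumes a Wald-type test on the log scale with a symmetric normal interval. The abstract does not state how the p value was computed, so this is a consistency check under that assumption and not a reconstruction of the authors' analysis. The function name `wald_from_ci` is hypothetical.

```python
import math


def wald_from_ci(estimate, lower, upper, z_crit=1.959964):
    """Recover the log-scale SE, z statistic, and two-sided p value
    from a ratio estimate and its 95% CI (normal approximation assumed)."""
    se = (math.log(upper) - math.log(lower)) / (2 * z_crit)
    z = math.log(estimate) / se
    # Two-sided tail probability of a standard normal: erfc(|z| / sqrt(2)).
    p = math.erfc(abs(z) / math.sqrt(2))
    return se, z, p


# Values quoted in the abstract for the prospective studies.
se, z, p = wald_from_ci(6.80, 2.16, 21.43)
print(f"SE(log HR) = {se:.3f}, z = {z:.2f}, two-sided p = {p:.4f}")
```

Under this assumption the recovered p value is close to 0.001, which agrees with the reported p = 0.0011. The wide interval reflects the large standard error on the log scale.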
Erectile dysfunction (ED) is a common condition affecting more than 20% of men over 60 years, yet little is known about its genetic architecture. We performed a genome-wide association study of ED in 6,175 case subjects among 223,805 European men and identified one locus at 6q16.3 (lead variant rs57989773, OR 1.20 per C-allele; p = 5.71 × 10⁻¹⁴), located between MCHR2 and SIM1. In silico analysis suggests SIM1 to confer ED risk through hypothalamic dysregulation. Mendelian randomization provides evidence that genetic risk of type 2 diabetes mellitus is a cause of ED (OR 1.11 per 1-log unit higher risk of type 2 diabetes). These findings provide insights into the biological underpinnings and the causes of ED and may help prioritize the development of future therapies for this common disorder.
During the medieval period, hundreds of thousands of Europeans migrated to the Near East to take part in the Crusades, and many of them settled in the newly established Christian states along the Eastern Mediterranean coast. Here, we present a genetic snapshot of these events and their aftermath by sequencing the whole genomes of 13 individuals who lived in what is today known as Lebanon between the 3rd and 13th centuries CE. These include nine individuals from the “Crusaders' pit” in Sidon, a mass burial in South Lebanon identified from the archaeology as the grave of Crusaders killed during a battle in the 13th century CE. We show that all of the Crusaders' pit individuals were males; some were Western Europeans from diverse origins, some were locals (genetically indistinguishable from present-day Lebanese), and two individuals were a mixture of European and Near Eastern ancestries, providing direct evidence that the Crusaders admixed with the local population. However, these mixtures appear to have had limited genetic consequences since signals of admixture with Europeans are not significant in any Lebanese group today; in particular, Lebanese Christians are today genetically similar to local people who lived during the Roman period, which preceded the Crusades by more than four centuries.
Phenome-wide association studies (PheWASs) have been a useful tool for testing associations between genetic variations and multiple complex traits or diagnoses. Linking PheWAS-based associations between phenotypes and a variant or a genomic region into a network provides a new way to investigate cross-phenotype associations, and it might broaden the understanding of genetic architecture that exists between diagnoses, genes, and pleiotropy. We created a network of associations from one of the largest PheWASs on electronic health record (EHR)-derived phenotypes across 38,682 unrelated samples from the Geisinger’s biobank; the samples were genotyped through the DiscovEHR project. We computed associations between 632,574 common variants and 541 diagnosis codes. Using these associations, we constructed a “disease-disease” network (DDN) wherein pairs of diseases were connected on the basis of shared associations with a given genetic variant. The DDN provides a landscape of intra-connections within the same disease classes, as well as inter-connections across disease classes. We identified clusters of diseases with known biological connections, such as autoimmune disorders (type 1 diabetes, rheumatoid arthritis, and multiple sclerosis) and cardiovascular disorders. Previously unreported relationships between multiple diseases were identified on the basis of genetic associations as well. The network approach applied in this study can be used to uncover interactions between diseases as a result of their shared, potentially pleiotropic SNPs. Additionally, this approach might advance clinical research and even clinical practice by accelerating our understanding of disease mechanisms on the basis of similar underlying genetic associations.
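The construction of a disease-disease network from shared variant associations can be sketched as follows. This is a minimal illustration, assuming the input is a list of (variant, diagnosis) pairs that have already passed some significance threshold. The function name `build_ddn`, the variant IDs, and the toy associations are all made up for illustration and are not results from the study.

```python
from collections import defaultdict
from itertools import combinations


def build_ddn(associations):
    """Connect two diagnoses whenever they share an associated variant.

    associations: iterable of (variant_id, diagnosis) pairs.
    Returns a dict mapping a sorted (diagnosis_a, diagnosis_b) tuple
    to the set of variants supporting that edge.
    """
    diagnoses_by_variant = defaultdict(set)
    for variant, diagnosis in associations:
        diagnoses_by_variant[variant].add(diagnosis)

    edges = defaultdict(set)
    for variant, diagnoses in diagnoses_by_variant.items():
        for a, b in combinations(sorted(diagnoses), 2):
            edges[(a, b)].add(variant)
    return dict(edges)


# Hypothetical toy input; the variant IDs are placeholders, not real SNPs.
toy_associations = [
    ("var_A", "type 1 diabetes"),
    ("var_A", "rheumatoid arthritis"),
    ("var_A", "multiple sclerosis"),
    ("var_B", "rheumatoid arthritis"),
    ("var_B", "multiple sclerosis"),
    ("var_C", "hypertension"),
]

ddn = build_ddn(toy_associations)
for (a, b), variants in sorted(ddn.items()):
    print(f"{a} -- {b}: {len(variants)} shared variant(s)")
```

Keeping the set of supporting variants on each edge, instead of a bare count, makes it possible to inspect which potentially pleiotropic variants drive a given disease pair. A diagnosis with no shared variant (here the hypothetical `hypertension` entry) simply has no edge.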
Genomic technologies such as next-generation sequencing (NGS) are revolutionizing molecular diagnostics and clinical medicine. However, these approaches have proven inefficient at identifying pathogenic repeat expansions. Here, we apply a collection of bioinformatics tools that can be utilized to identify either known or novel expanded repeat sequences in NGS data. We performed genetic studies of a cohort of 35 individuals from 22 families with a clinical diagnosis of cerebellar ataxia with neuropathy and bilateral vestibular areflexia syndrome (CANVAS). Analysis of whole-genome sequence (WGS) data with five independent algorithms identified a recessively inherited intronic repeat expansion [(AAGGG)exp] in the gene encoding Replication Factor C1 (RFC1). This motif, not reported in the reference sequence, localized to an Alu element and replaced the reference (AAAAG)11 short tandem repeat. Genetic analyses confirmed the pathogenic expansion in 18 of 22 CANVAS-affected families and identified a core ancestral haplotype, estimated to have arisen in Europe more than twenty-five thousand years ago. WGS of the four RFC1-negative CANVAS-affected families identified plausible variants in three, with genomic re-diagnosis of SCA3, spastic ataxia of the Charlevoix-Saguenay type, and SCA45. This study identified the genetic basis of CANVAS and demonstrated that these improved bioinformatics tools increase the diagnostic utility of WGS to determine the genetic basis of a heterogeneous group of clinically overlapping neurogenetic disorders.
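The core signal behind repeat-expansion detection, a run of tandem motif copies in a sequence, can be illustrated with a toy string scan. This is only a sketch under simplifying assumptions (exact matches, a single uninterrupted sequence, no read alignment). It is not one of the five algorithms used in the study, and the function name and flanking bases are hypothetical.

```python
def longest_tandem_run(sequence: str, motif: str) -> int:
    """Return the largest number of back-to-back exact copies of motif."""
    if not motif:
        raise ValueError("motif must be non-empty")
    seq = sequence.upper()
    unit = motif.upper()
    k = len(unit)
    best = 0
    # Try every start position and count consecutive copies from there.
    for start in range(len(seq) - k + 1):
        copies = 0
        while seq.startswith(unit, start + copies * k):
            copies += 1
        best = max(best, copies)
    return best


# Hypothetical flanking bases around the motifs named in the abstract.
reference_like = "TTC" + "AAAAG" * 11 + "GCA"
expanded_like = "TTC" + "AAGGG" * 40 + "GCA"  # 40 copies chosen arbitrarily

print(longest_tandem_run(reference_like, "AAAAG"))
print(longest_tandem_run(reference_like, "AAGGG"))
print(longest_tandem_run(expanded_like, "AAGGG"))
```

Real NGS data complicate this picture: an expansion can be longer than a read, so tools typically combine evidence from partially spanning and fully repetitive reads instead of counting copies in one contiguous string.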
The predominantly African origin of all modern human populations is well established, but the route taken out of Africa is still unclear. Two alternative routes, via Egypt and Sinai or across the Bab el Mandeb strait into Arabia, have traditionally been proposed as feasible gateways in light of geographic, paleoclimatic, archaeological, and genetic evidence. Distinguishing among these alternatives has been difficult. We generated 225 whole-genome sequences (225 at 8× depth, of which 8 were increased to 30×; Illumina HiSeq 2000) from six modern Northeast African populations (100 Egyptians and five Ethiopian populations each represented by 25 individuals). West Eurasian components were masked out, and the remaining African haplotypes were compared with a panel of sub-Saharan African and non-African genomes. We showed that masked Northeast African haplotypes overall were more similar to non-African haplotypes and more frequently present outside Africa than were any sets of haplotypes derived from a West African population. Furthermore, the masked Egyptian haplotypes showed these properties more markedly than the masked Ethiopian haplotypes, pointing to Egypt as the more likely gateway in the exodus to the rest of the world. Using five Ethiopian and three Egyptian high-coverage masked genomes and the multiple sequentially Markovian coalescent (MSMC) approach, we estimated the genetic split times of Egyptians and Ethiopians from non-African populations at 55,000 and 65,000 years ago, respectively, whereas that of West Africans was estimated to be 75,000 years ago. Both the haplotype and MSMC analyses thus suggest a predominant northern route out of Africa via Egypt.
Five classical designations of sickle haplotypes are made on the basis of the presence or absence of restriction sites and are named after the ethno-linguistic groups or geographic regions from which the individuals with sickle cell anemia originated. Each haplotype is thought to represent an independent occurrence of the sickle mutation rs334 (c.20A>T [p.Glu7Val] in HBB). We investigated the origins of the sickle mutation by using whole-genome-sequence data. We identified 156 carriers from the 1000 Genomes Project, the African Genome Variation Project, and Qatar. We classified haplotypes by using 27 polymorphisms in linkage disequilibrium with rs334. Network analysis revealed a common haplotype that differed from the ancestral haplotype only by the derived sickle mutation at rs334 and correlated collectively with the Central African Republic (CAR), Cameroon, and Arabian/Indian haplotypes. Other haplotypes were derived from this haplotype and fell into two clusters, one composed of Senegal haplotypes and the other composed of Benin and Senegal haplotypes. The near-exclusive presence of the original sickle haplotype in the CAR, Kenya, Uganda, and South Africa is consistent with this haplotype predating the Bantu expansions. Modeling of balancing selection indicated that the heterozygote advantage was 15.2%, an equilibrium frequency of 12.0% was reached after 87 generations, and the selective environment predated the mutation. The posterior distribution of the ancestral recombination graph yielded a sickle mutation age of 259 generations, corresponding to 7,300 years ago during the Holocene Wet Phase. These results clarify the origin of the sickle allele and improve and simplify the classification of sickle haplotypes.
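The link between a heterozygote advantage and an equilibrium allele frequency can be illustrated with the textbook one-locus overdominance model. The sketch below is not the authors' model. The fitness parameterization (non-carriers at 1, carriers at 1.152) and the homozygote fitness of 0 are assumptions chosen for illustration, since the abstract does not give these details.

```python
def equilibrium_freq(w_aa: float, w_as: float, w_ss: float) -> float:
    """Analytic equilibrium frequency of the S allele under overdominance."""
    return (w_as - w_aa) / (2 * w_as - w_aa - w_ss)


def next_freq(q: float, w_aa: float, w_as: float, w_ss: float) -> float:
    """One generation of deterministic selection with random mating."""
    p = 1.0 - q
    mean_w = p * p * w_aa + 2 * p * q * w_as + q * q * w_ss
    return (p * q * w_as + q * q * w_ss) / mean_w


# Assumed fitnesses: 15.2% heterozygote advantage relative to non-carriers,
# and (hypothetically) zero fitness for sickle homozygotes.
W_AA, W_AS, W_SS = 1.0, 1.152, 0.0

q_eq = equilibrium_freq(W_AA, W_AS, W_SS)

q = 0.001  # hypothetical starting frequency of the new allele
for _ in range(2000):
    q = next_freq(q, W_AA, W_AS, W_SS)

print(f"analytic equilibrium: {q_eq:.4f}, after iteration: {q:.4f}")
```

Under these assumed fitnesses the equilibrium is about 11.7%, in the neighborhood of the reported 12.0%. The exact value depends on the homozygote fitness and other model details that the abstract does not specify.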