Journal: American Journal of Human Genetics
Sequencing the genomes of extinct hominids has reshaped our understanding of modern human origins. Here, we analyze ∼120 kb of exome-captured Y-chromosome DNA from a Neandertal individual from El Sidrón, Spain. We investigate its divergence from orthologous chimpanzee and modern human sequences and find strong support for a model that places the Neandertal lineage as an outgroup to modern human Y chromosomes, including A00, the highly divergent basal haplogroup. We estimate that the time to the most recent common ancestor (TMRCA) of Neandertal and modern human Y chromosomes is ∼588 thousand years ago (kya) (95% confidence interval [CI]: 447-806 kya). This is ∼2.1 (95% CI: 1.7-2.9) times longer than the TMRCA of A00 and other extant modern human Y-chromosome lineages. This estimate suggests that the Y-chromosome divergence mirrors the population divergence of Neandertals and modern human ancestors, and it refutes alternative scenarios of a relatively recent or super-archaic origin of Neandertal Y chromosomes. The fact that the Neandertal Y we describe has never been observed in modern humans suggests that the lineage is most likely extinct. We identify protein-coding differences between Neandertal and modern human Y chromosomes, including potentially damaging changes to PCDH11Y, TMSB4Y, USP9Y, and KDM5D. Three of these changes are missense mutations in genes that produce male-specific minor histocompatibility (H-Y) antigens. Antigens derived from KDM5D, for example, are thought to elicit a maternal immune response during gestation. It is possible that incompatibilities at one or more of these genes played a role in the reproductive isolation of the two groups.
Pathogens and the diseases they cause have been among the most important selective forces experienced by humans during their evolutionary history. Although adaptive alleles generally arise by mutation, introgression can also be a valuable source of beneficial alleles. Archaic humans, who lived in Europe and Western Asia for more than 200,000 years, were probably well adapted to this environment and its local pathogens. It is therefore conceivable that modern humans entering Europe and Western Asia who admixed with them obtained a substantial immune advantage from the introgression of archaic alleles. Here we document a cluster of three Toll-like receptors (TLR6-TLR1-TLR10) in modern humans that carries three distinct archaic haplotypes, indicating repeated introgression from archaic humans. Two of these haplotypes are most similar to the Neandertal genome, and the third haplotype is most similar to the Denisovan genome. The Toll-like receptors are key components of innate immunity and provide an important first line of immune defense against bacteria, fungi, and parasites. The unusually high allele frequencies and unexpected levels of population differentiation indicate that there has been local positive selection on multiple haplotypes at this locus. We show that the introgressed alleles have clear functional effects in modern humans; archaic-like alleles underlie differences in the expression of the TLR genes and are associated with reduced microbial resistance and increased allergic disease in large cohorts. This provides strong evidence for recurrent adaptive introgression at the TLR6-TLR1-TLR10 locus, resulting in differences in disease phenotypes in modern humans.
The Canaanites inhabited the Levant region during the Bronze Age and established a culture that became influential in the Near East and beyond. However, the Canaanites, unlike most other ancient Near Easterners of this period, left few surviving textual records and thus their origin and relationship to ancient and present-day populations remain unclear. In this study, we sequenced five whole genomes from ∼3,700-year-old individuals from the city of Sidon, a major Canaanite city-state on the Eastern Mediterranean coast. We also sequenced the genomes of 99 individuals from present-day Lebanon to catalog modern Levantine genetic diversity. We find that a Bronze Age Canaanite-related ancestry was widespread in the region, shared among urban populations inhabiting the coast (Sidon) and inland populations (Jordan) who likely lived in farming societies or were pastoral nomads. This Canaanite-related ancestry derived from mixture between local Neolithic populations and eastern migrants genetically related to Chalcolithic Iranians. We estimate, using linkage-disequilibrium decay patterns, that admixture occurred 6,600-3,550 years ago, coinciding with recorded massive population movements in Mesopotamia during the mid-Holocene. We show that present-day Lebanese derive most of their ancestry from a Canaanite-related population, which therefore implies substantial genetic continuity in the Levant since at least the Bronze Age. In addition, we find Eurasian ancestry in the Lebanese not present in Bronze Age or earlier Levantines. We estimate that this Eurasian ancestry arrived in the Levant around 3,750-2,170 years ago during a period of successive conquests by distant populations.
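The general principle behind dating admixture from linkage-disequilibrium decay can be sketched as follows. This is a minimal illustration, not the authors' pipeline: it assumes admixture LD between markers separated by a genetic distance d (in Morgans) decays as A·exp(−t·d), where t is the number of generations since admixture. The function name, the synthetic noise-free curve, and the 28-year generation time are all assumptions made for the example.

```python
import math

def fit_admixture_generations(distances_morgans, ld_values):
    """Least-squares fit of ln(LD) = ln(A) - t * d; returns t in generations."""
    logs = [math.log(v) for v in ld_values]
    n = len(distances_morgans)
    mean_d = sum(distances_morgans) / n
    mean_y = sum(logs) / n
    sxy = sum((d - mean_d) * (y - mean_y)
              for d, y in zip(distances_morgans, logs))
    sxx = sum((d - mean_d) ** 2 for d in distances_morgans)
    return -sxy / sxx  # the slope of the log-linear fit is -t

# Synthetic, noise-free decay curve (illustrative values, not data from the study).
TRUE_GENERATIONS = 150.0
AMPLITUDE = 0.02
YEARS_PER_GENERATION = 28.0  # assumption; the abstract does not state the value used

distances = [0.001 * k for k in range(1, 51)]  # 0.1 cM to 5 cM
ld_curve = [AMPLITUDE * math.exp(-TRUE_GENERATIONS * d) for d in distances]

generations = fit_admixture_generations(distances, ld_curve)
years_before_sampling = generations * YEARS_PER_GENERATION
print(f"{generations:.1f} generations, about {years_before_sampling:.0f} years before sampling")
```

For ancient individuals, an estimate of this kind counts back from the time the individuals lived, so the sample's own age has to be added to obtain a calendar date.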
Men have a shorter life expectancy than women, but the underlying factor(s) are not clear. Late-onset, sporadic Alzheimer disease (AD) is a common and lethal neurodegenerative disorder, and many germline inherited variants have been found to influence the risk of developing AD. Our previous results show that a fundamentally different genetic variant, i.e., lifetime-acquired loss of chromosome Y (LOY) in blood cells, is associated with all-cause mortality and an increased risk of non-hematological tumors and that LOY could be induced by tobacco smoking. Here we tested the hypothesis that men with LOY are more susceptible to AD and show that LOY is associated with AD in three independent studies of different types. In a case-control study, males with an AD diagnosis had a higher degree of LOY mosaicism (adjusted odds ratio = 2.80, p = 0.0184, AD events = 606). Furthermore, in two prospective studies, men with LOY at blood sampling had a greater risk of incident AD diagnosis during follow-up (hazard ratio [HR] = 6.80, 95% confidence interval [95% CI] = 2.16-21.43, AD events = 140, p = 0.0011). Thus, LOY in blood is associated with risks of both AD and cancer, suggesting that LOY in blood cells influences disease processes in other tissues, possibly via defective immunosurveillance. As a male-specific risk factor, LOY might explain why males on average live shorter lives than females.
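The reported hazard ratio, confidence interval, and p value can be checked against one another with a short calculation. The sketch below assumes a Wald-type interval that is symmetric on the log scale (an assumption, since the abstract does not state how the interval was computed); the function name is hypothetical.

```python
import math

def p_value_from_ratio_ci(estimate, lower, upper, z_crit=1.959964):
    """Back out the standard error, z score, and two-sided p value
    from a ratio estimate and its 95% CI, assuming normality of log(ratio)."""
    se = (math.log(upper) - math.log(lower)) / (2 * z_crit)
    z = math.log(estimate) / se
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail probability
    return se, z, p

# Values quoted in the abstract for the prospective studies.
se, z, p = p_value_from_ratio_ci(6.80, 2.16, 21.43)
print(f"SE(log HR) = {se:.3f}, z = {z:.2f}, p = {p:.4f}")
```

Under that assumption the recovered p value is close to the reported 0.0011, and the geometric mean of the interval bounds is close to the point estimate of 6.80, so the three figures are mutually consistent.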
The predominantly African origin of all modern human populations is well established, but the route taken out of Africa is still unclear. Two alternative routes, via Egypt and Sinai or across the Bab el Mandeb strait into Arabia, have traditionally been proposed as feasible gateways in light of geographic, paleoclimatic, archaeological, and genetic evidence. Distinguishing among these alternatives has been difficult. We generated 225 whole-genome sequences (225 at 8× depth, of which 8 were increased to 30×; Illumina HiSeq 2000) from six modern Northeast African populations (100 Egyptians and five Ethiopian populations each represented by 25 individuals). West Eurasian components were masked out, and the remaining African haplotypes were compared with a panel of sub-Saharan African and non-African genomes. We showed that masked Northeast African haplotypes overall were more similar to non-African haplotypes and more frequently present outside Africa than were any sets of haplotypes derived from a West African population. Furthermore, the masked Egyptian haplotypes showed these properties more markedly than the masked Ethiopian haplotypes, pointing to Egypt as the more likely gateway in the exodus to the rest of the world. Using five Ethiopian and three Egyptian high-coverage masked genomes and the multiple sequentially Markovian coalescent (MSMC) approach, we estimated the genetic split times of Egyptians and Ethiopians from non-African populations at 55,000 and 65,000 years ago, respectively, whereas that of West Africans was estimated to be 75,000 years ago. Both the haplotype and MSMC analyses thus suggest a predominant northern route out of Africa via Egypt.
Five classical designations of sickle haplotypes are made on the basis of the presence or absence of restriction sites and are named after the ethno-linguistic groups or geographic regions from which the individuals with sickle cell anemia originated. Each haplotype is thought to represent an independent occurrence of the sickle mutation rs334 (c.20A>T [p.Glu7Val] in HBB). We investigated the origins of the sickle mutation by using whole-genome-sequence data. We identified 156 carriers from the 1000 Genomes Project, the African Genome Variation Project, and Qatar. We classified haplotypes by using 27 polymorphisms in linkage disequilibrium with rs334. Network analysis revealed a common haplotype that differed from the ancestral haplotype only by the derived sickle mutation at rs334 and correlated collectively with the Central African Republic (CAR), Cameroon, and Arabian/Indian haplotypes. Other haplotypes were derived from this haplotype and fell into two clusters, one composed of Senegal haplotypes and the other composed of Benin and Senegal haplotypes. The near-exclusive presence of the original sickle haplotype in the CAR, Kenya, Uganda, and South Africa is consistent with this haplotype predating the Bantu expansions. Modeling of balancing selection indicated that the heterozygote advantage was 15.2%, an equilibrium frequency of 12.0% was reached after 87 generations, and the selective environment predated the mutation. The posterior distribution of the ancestral recombination graph yielded a sickle mutation age of 259 generations, corresponding to 7,300 years ago during the Holocene Wet Phase. These results clarify the origin of the sickle allele and improve and simplify the classification of sickle haplotypes.
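The balancing-selection result can be illustrated with the textbook one-locus overdominance model. This is a sketch under stated assumptions, not the authors' model: it reads the 15.2% heterozygote advantage as a carrier fitness of 1.152 relative to non-carriers, treats the homozygous sickle genotype as lethal, and starts from an arbitrary low allele frequency. Because the abstract does not give the authors' parametrization, the equilibrium obtained here need not equal the reported 12.0%.

```python
def next_allele_frequency(q, w_aa, w_as, w_ss):
    """One generation of viability selection on the sickle allele frequency q."""
    p = 1.0 - q
    mean_fitness = p * p * w_aa + 2 * p * q * w_as + q * q * w_ss
    return (p * q * w_as + q * q * w_ss) / mean_fitness

def equilibrium_frequency(w_aa, w_as, w_ss):
    """Stable equilibrium under heterozygote advantage (w_as is the largest fitness)."""
    return (w_as - w_aa) / ((w_as - w_aa) + (w_as - w_ss))

W_AA = 1.0    # non-carriers, reference fitness
W_AS = 1.152  # assumed reading of the 15.2% heterozygote advantage
W_SS = 0.0    # assumption: homozygotes do not reproduce

q = 1e-4      # assumption: arbitrary low starting frequency
for _ in range(2000):
    q = next_allele_frequency(q, W_AA, W_AS, W_SS)

q_hat = equilibrium_frequency(W_AA, W_AS, W_SS)
print(f"simulated q = {q:.4f}, analytic equilibrium = {q_hat:.4f}")
```

Changing `W_SS` or the starting frequency shows how sensitive both the equilibrium and the time to reach it are to these assumed inputs.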
The genetic basis of earlobe attachment has been a matter of debate since the early 20th century, such that geneticists argue both for and against polygenic inheritance. Recent genetic studies have identified a few loci associated with the trait, but large-scale analyses are still lacking. Here, we performed a genome-wide association study of lobe attachment in a multiethnic sample of 74,660 individuals from four cohorts (three with the trait scored by an expert rater and one with the trait self-reported). Meta-analysis of the three expert-rater-scored cohorts revealed six associated loci harboring numerous candidate genes, including EDAR, SP5, MRPS22, ADGRG6 (GPR126), KIAA1217, and PAX9. The large self-reported 23andMe cohort recapitulated each of these six loci. Moreover, meta-analysis across all four cohorts revealed a total of 49 significant (p < 5 × 10⁻⁸) loci. Annotation and enrichment analyses of these 49 loci showed strong evidence of genes involved in ear development and syndromes with auricular phenotypes. RNA sequencing data from both human fetal ear and mouse second branchial arch tissue confirmed that genes located among associated loci showed evidence of expression. These results provide strong evidence for the polygenic nature of earlobe attachment and offer insights into the biological basis of normal and abnormal ear development.
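How per-cohort association results can be combined and compared against the genome-wide threshold is sketched below. The abstract does not say which meta-analysis method was used, so this example assumes a fixed-effect, inverse-variance-weighted scheme; the effect sizes and standard errors are hypothetical values for a single variant, not results from the study.

```python
import math

def fixed_effect_meta(betas, standard_errors):
    """Inverse-variance-weighted fixed-effect meta-analysis for one variant."""
    weights = [1.0 / se ** 2 for se in standard_errors]
    total_weight = sum(weights)
    beta = sum(w * b for w, b in zip(weights, betas)) / total_weight
    se = math.sqrt(1.0 / total_weight)
    z = beta / se
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal p value
    return beta, se, p

GENOME_WIDE_THRESHOLD = 5e-8  # threshold quoted in the abstract

# Hypothetical per-cohort estimates for one variant (illustrative only).
cohort_betas = [0.10, 0.12, 0.08]
cohort_ses = [0.02, 0.03, 0.025]

beta, se, p = fixed_effect_meta(cohort_betas, cohort_ses)
is_significant = p < GENOME_WIDE_THRESHOLD
print(f"pooled beta = {beta:.4f}, SE = {se:.4f}, p = {p:.2e}, significant = {is_significant}")
```

Cohorts with smaller standard errors receive more weight, which is why a large cohort can dominate the pooled estimate.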
Assessing the genetic contribution of Neanderthals to non-disease phenotypes in modern humans has been difficult because of the absence of large cohorts for which common phenotype information is available. Using baseline phenotypes collected for 112,000 individuals by the UK Biobank, we can now elaborate on previous findings that identified associations between signatures of positive selection on Neanderthal DNA and various modern human traits but not any specific phenotypic consequences. Here, we show that Neanderthal DNA affects skin tone and hair color, height, sleeping patterns, mood, and smoking status in present-day Europeans. Interestingly, multiple Neanderthal alleles at different loci contribute to skin and hair color in present-day Europeans, and these Neanderthal alleles contribute to both lighter and darker skin tones and hair color, suggesting that Neanderthals themselves were most likely variable in these traits.
We report the discovery of an African American Y chromosome that carries the ancestral state of all SNPs that defined the basal portion of the Y chromosome phylogenetic tree. We sequenced ∼240 kb of this chromosome to identify private, derived mutations on this lineage, which we named A00. We then estimated the time to the most recent common ancestor (TMRCA) for the Y tree as 338 thousand years ago (kya) (95% confidence interval = 237-581 kya). Remarkably, this exceeds current estimates of the mtDNA TMRCA, as well as those of the age of the oldest anatomically modern human fossils. The extremely ancient age combined with the rarity of the A00 lineage, which we also find at very low frequency in central Africa, points to the importance of considering more complex models for the origin of Y chromosome diversity. These models include ancient population structure and the possibility of archaic introgression of Y chromosomes into anatomically modern humans. The A00 lineage was discovered in a large database of consumer samples of African Americans and has not been identified in traditional hunter-gatherer populations from sub-Saharan Africa. This underscores how the stochastic nature of the genealogical process can affect inference from a single locus and warrants caution during the interpretation of the geographic location of divergent branches of the Y chromosome phylogenetic tree for the elucidation of human origins.
The human genetics community needs robust protocols that enable secure sharing of genomic data from participants in genetic research. Beacons are web servers that answer allele-presence queries, such as “Do you have a genome that has a specific nucleotide (e.g., A) at a specific genomic position (e.g., position 11,272 on chromosome 1)?”, with either “yes” or “no.” Here, we show that individuals in a beacon are susceptible to re-identification even if the only data shared include presence or absence information about alleles in a beacon. Specifically, we propose a likelihood-ratio test of whether a given individual is present in a given genetic beacon. Our test is not dependent on allele frequencies and is the most powerful test for a specified false-positive rate. Through simulations, we showed that in a beacon with 1,000 individuals, re-identification is possible with just 5,000 queries. Relatives can also be identified in the beacon. Re-identification is possible even in the presence of sequencing errors and variant-calling differences. In a beacon constructed with 65 European individuals from the 1000 Genomes Project, we demonstrated that it is possible to detect membership in the beacon with just 250 SNPs. With just 1,000 SNP queries, we were able to detect the presence of an individual genome from the Personal Genome Project in an existing beacon. Our results show that beacons can disclose membership and implied phenotypic information about participants and do not protect privacy a priori. We discuss risk mitigation through policies and standards such as not allowing anonymous pings of genetic beacons and requiring minimum beacon sizes.
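The idea of a beacon and of a likelihood-ratio membership test can be sketched in a few lines. This is a simplified illustration, not the authors' exact statistic: the `Beacon` class, the toy genomes, and the two per-query "yes" probabilities (`P_YES_NULL` for a non-member, `P_YES_MEMBER` for a member with a small allowance for sequencing mismatches) are all hypothetical. In a real analysis those probabilities would have to be derived rather than chosen by hand.

```python
import math

class Beacon:
    """Toy beacon: answers whether any stored genome carries a given allele."""

    def __init__(self, genomes):
        # Each genome is a set of (chromosome, position, allele) tuples.
        self._alleles = set()
        for genome in genomes:
            self._alleles.update(genome)

    def query(self, chromosome, position, allele):
        return (chromosome, position, allele) in self._alleles

def log_likelihood_ratio(responses, p_yes_null, p_yes_member):
    """Sum of per-query log(P(response | member) / P(response | non-member))."""
    llr = 0.0
    for yes in responses:
        if yes:
            llr += math.log(p_yes_member / p_yes_null)
        else:
            llr += math.log((1.0 - p_yes_member) / (1.0 - p_yes_null))
    return llr

P_YES_NULL = 0.70    # hypothetical chance of "yes" if the target is not in the beacon
P_YES_MEMBER = 0.99  # hypothetical chance of "yes" if the target is in the beacon

member = {("1", 11272, "A"), ("2", 500, "T"), ("3", 42, "G")}
other = {("1", 11272, "A"), ("2", 777, "C")}
outsider = {("1", 11272, "A"), ("4", 10, "C"), ("5", 99, "T")}

beacon = Beacon([member, other])

def membership_score(target):
    # Query the beacon at every allele the target carries.
    responses = [beacon.query(*site) for site in sorted(target)]
    return log_likelihood_ratio(responses, P_YES_NULL, P_YES_MEMBER)

member_llr = membership_score(member)
outsider_llr = membership_score(outsider)
print(f"member LLR = {member_llr:.2f}, outsider LLR = {outsider_llr:.2f}")
```

A positive score favors membership and a negative score favors non-membership. Each "no" answer counts heavily against membership, which is why a modest number of queries can be decisive.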