Concept: Minor allele frequency
Autoimmune thyroid disease (AITD), including Graves' disease (GD) and Hashimoto’s thyroiditis (HT), is one of the most common of the immune-mediated diseases. To further investigate the genetic determinants of AITD, we conducted an association study using a custom-made single-nucleotide polymorphism (SNP) array, the ImmunoChip. The SNP array contains all known and genotype-able SNPs across 186 distinct susceptibility loci associated with one or more immune-mediated diseases. After stringent quality control, we analysed 103 875 common SNPs (minor allele frequency >0.05) in 2285 GD and 462 HT patients and 9364 controls. We found evidence for seven new AITD risk loci (P < 1.12 × 10⁻⁶, a significance threshold derived by permutation testing): five at locations previously associated with other immune-mediated diseases and two at locations whose association with such diseases awaits confirmation.
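A minimal sketch of the kind of filter described above (keeping SNPs with minor allele frequency >0.05) may help make the quantity concrete. The genotype coding (0, 1 or 2 copies of the alternate allele, `None` for a missing call), the function names and the toy data are all assumptions for illustration and are not taken from the study.

```python
def minor_allele_frequency(genotypes):
    """Return the minor allele frequency (MAF) for one biallelic SNP.

    `genotypes` has one entry per diploid individual: 0, 1 or 2 copies of
    the alternate allele, or None for a missing call (assumed coding).
    """
    called = [g for g in genotypes if g is not None]
    if not called:
        raise ValueError("no called genotypes for this SNP")
    # Each called individual contributes two alleles.
    alt_freq = sum(called) / (2 * len(called))
    # The minor allele is whichever of the two alleles is rarer.
    return min(alt_freq, 1 - alt_freq)


def common_snps(genotypes_by_snp, threshold=0.05):
    """Return the SNP ids whose MAF is strictly greater than `threshold`."""
    return [snp for snp, genotypes in genotypes_by_snp.items()
            if minor_allele_frequency(genotypes) > threshold]


# Hypothetical toy data, not real genotypes.
toy = {
    "snp_a": [0, 1, 2, 1, None, 0],            # alt frequency 4/10
    "snp_b": [0, 0, 0, 0, 0, 0, 0, 0, 0, 1],   # alt frequency 1/20, not > 0.05
    "snp_c": [2, 2, 2, 1, 2],                  # alt frequency 9/10, MAF about 0.1
}
print(common_snps(toy))  # prints ['snp_a', 'snp_c']
```

Taking the smaller of the two allele frequencies means the result does not depend on which allele was labelled as the alternate.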
A number of previous studies suggested the presence of deleterious amino acid altering nonsynonymous single-nucleotide polymorphisms (nSNPs) in human populations. However, the proportions of deleterious nSNPs among rare and common variants are not known. To estimate these, >77 000 SNPs from human protein-coding genes were analyzed. Based on two independent methods, this study reveals that up to 53% of rare nSNPs (minor allele frequency (MAF)<0.002) could be deleterious in nature. The fraction of deleterious nSNPs declines with the increase in their allele frequencies and only 12% of the common nSNPs (MAF>0.4) were found to be harmful. This shows that even at high frequencies significant fractions of deleterious polymorphisms are present in human populations. These results could be useful for genome-wide association studies in understanding the relative contributions of rare and common variants in causing human genetic diseases.
INTRODUCTION: The largest genetic risk to develop rheumatoid arthritis (RA) arises from a group of alleles of the HLA DRB1 locus (“shared epitope”, SE). Over 30 non-HLA single nucleotide polymorphisms (SNPs) predisposing to disease have been identified in Caucasians, but they have never been investigated in West/Central Africa. We previously reported a lower prevalence of the SE in RA patients in Cameroon compared to European patients and aimed in the present study to investigate the contribution of Caucasian non-HLA RA SNPs to disease susceptibility in Black Africans. METHODS: RA cases and controls from Cameroon were genotyped for Caucasian RA susceptibility SNPs using Sequenom MassArray technology. Genotype data was also available for 5024 UK cases and 4281 UK controls and for 119 Yoruba individuals in Ibadan, Nigeria (YRI, HapMap). A Caucasian aggregate genetic-risk score (GRS) was calculated as the sum of the weighted risk-allele counts. RESULTS: After genotyping quality control, data on 28 Caucasian non-HLA susceptibility SNPs was available in 43 Cameroonian RA cases and 44 controls. The minor allele frequencies (MAF) were tightly correlated between Cameroonian controls and YRI individuals (correlation coefficient 93.8%, p=1.7E-13), and they were pooled together. There was no correlation between MAF of UK and African controls; 13 markers differed by more than 20%. The MAF for markers at PTPN22, IL2RA, FCGR2A and IL2/IL21 was below 2% in Africans. The GRS showed a strong association with RA in the UK. However, the GRS did not predict RA in Africans (OR=0.71, 95% CI 0.29 - 1.74, p=0.456). Random sampling from the UK cohort showed that this difference in association is unlikely to be explained by small sample size or chance, but is statistically significant with p<0.001. CONCLUSIONS: The MAF of non-HLA Caucasian RA susceptibility SNPs are different between Caucasians and Africans and several polymorphisms are barely detectable in West/Central Africa. 
The genetic risk of developing RA conferred by a set of 28 Caucasian susceptibility SNPs is significantly different between the UK and Africa with p<0.001. Taken together, these observations strengthen the hypothesis that the genetic architecture of RA susceptibility is different in different ethnic backgrounds.
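The aggregate genetic-risk score mentioned in the abstract above, a sum of weighted risk-allele counts, can be sketched as follows. The abstract does not say how the weights were chosen; using the natural logarithm of each SNP's reported odds ratio is a common convention and is an assumption here, as are the function name and the toy values.

```python
import math


def genetic_risk_score(risk_allele_counts, odds_ratios):
    """Weighted sum of risk-allele counts for one individual.

    risk_allele_counts: 0, 1 or 2 copies of the risk allele at each SNP.
    odds_ratios: per-allele odds ratio for each SNP, in the same order.
    Weights are ln(odds ratio), which is an assumed convention.
    """
    if len(risk_allele_counts) != len(odds_ratios):
        raise ValueError("counts and odds ratios must have the same length")
    return sum(math.log(odds_ratio) * count
               for count, odds_ratio in zip(risk_allele_counts, odds_ratios))


# Hypothetical individual typed at three SNPs.
score = genetic_risk_score([0, 1, 2], [1.5, 2.0, 1.0])
print(score)
```

A score built this way carries over to another population only to the extent that the risk alleles are present there and have similar effects, which is the question the study examines.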
We develop a Bayesian mixed linear model that simultaneously estimates single-nucleotide polymorphism (SNP)-based heritability, polygenicity (proportion of SNPs with nonzero effects), and the relationship between SNP effect size and minor allele frequency for complex traits in conventionally unrelated individuals using genome-wide SNP data. We apply the method to 28 complex traits in the UK Biobank data (N = 126,752) and show that on average, 6% of SNPs have nonzero effects, which in total explain 22% of phenotypic variance. We detect significant (P < 0.05/28) signatures of natural selection in the genetic architecture of 23 traits, including reproductive, cardiovascular, and anthropometric traits, as well as educational attainment. The significant estimates of the relationship between effect size and minor allele frequency in complex traits are consistent with a model of negative (or purifying) selection, as confirmed by forward simulation. We conclude that negative selection acts pervasively on the genetic variants associated with human complex traits.
- The journals of gerontology. Series A, Biological sciences and medical sciences
- Published over 3 years ago
We used 197 Drosophila melanogaster Genetic Reference Panel (DGRP) lines to perform a genome-wide association analysis for virgin female lifespan, using ~2M common single nucleotide polymorphisms (SNPs). We found considerable genetic variation in lifespan in the DGRP, with a broad-sense heritability of 0.413. There was little power to detect signals at a genome-wide level in single-SNP and gene-based analyses. Polygenic score analysis revealed that a small proportion of the variation in lifespan (~4.7%) was explicable in terms of additive effects of common SNPs (≥2% minor allele frequency). However, several of the top associated genes are involved in the processes previously shown to impact ageing (e.g., carbohydrate-related metabolism, regulation of cell death, proteolysis). Other top-ranked genes are of unknown function and provide promising candidates for experimental examination. Genes in the target of rapamycin pathway (TOR; Chrb, slif, mipp2, dredd, RpS9, dm) contributed to the significant enrichment of this pathway among the top-ranked 100 genes (p = 4.79 × 10⁻⁶). Gene Ontology analysis suggested that genes involved in carbohydrate metabolism are important for lifespan, including the InterPro term DUF227, which has been previously associated with lifespan determination. This analysis suggests that our understanding of the genetic basis of natural variation in lifespan from induced mutations is incomplete.
Multiple methods have been developed to estimate narrow-sense heritability, h², using single nucleotide polymorphisms (SNPs) in unrelated individuals. However, a comprehensive evaluation of these methods has not yet been performed, leading to confusion and discrepancy in the literature. We present the most thorough and realistic comparison of these methods to date. We used thousands of real whole-genome sequences to simulate phenotypes under varying genetic architectures and confounding variables, and we used array, imputed, or whole genome sequence SNPs to obtain ‘SNP-heritability’ estimates. We show that SNP-heritability can be highly sensitive to assumptions about the frequencies, effect sizes, and levels of linkage disequilibrium of underlying causal variants, but that methods that bin SNPs according to minor allele frequency and linkage disequilibrium are less sensitive to these assumptions across a wide range of genetic architectures and possible confounding factors. These findings provide guidance for best practices and proper interpretation of published estimates.
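As a rough illustration of the binning step mentioned above, the sketch below assigns SNPs to minor-allele-frequency bins so that each bin could later be analysed separately. The bin edges, the function names and the toy frequencies are hypothetical; the abstract does not specify them, and binning by linkage disequilibrium is not shown.

```python
from bisect import bisect_left
from collections import defaultdict

# Hypothetical upper edges of the MAF bins; bin i covers (EDGES[i-1], EDGES[i]],
# and bin 0 covers [0, EDGES[0]].
EDGES = (0.01, 0.05, 0.1, 0.2, 0.3, 0.4, 0.5)


def maf_bin(maf, edges=EDGES):
    """Return the index of the bin that contains `maf`."""
    if not 0.0 <= maf <= 0.5:
        raise ValueError("a minor allele frequency must lie in [0, 0.5]")
    # bisect_left places a value equal to an edge in the bin that edge closes.
    return bisect_left(edges, maf)


def group_by_maf_bin(maf_by_snp, edges=EDGES):
    """Map each bin index to the list of SNP ids that fall in it."""
    groups = defaultdict(list)
    for snp, maf in maf_by_snp.items():
        groups[maf_bin(maf, edges)].append(snp)
    return dict(groups)


# Hypothetical frequencies, not real data.
print(group_by_maf_bin({"snp_a": 0.005, "snp_b": 0.07, "snp_c": 0.08, "snp_d": 0.5}))
```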
Genomic evaluation is used to predict direct genomic values (DGV) for selection candidates in breeding programs, but also to estimate allele substitution effects (ASE) of single nucleotide polymorphisms (SNPs). Scaling of allele counts influences the estimated ASE, because scaling of allele counts results in less shrinkage towards the mean for low minor allele frequency (MAF) variants. Scaling may become relevant for estimating ASE as more low MAF variants will be used in genomic evaluations. We show the impact of scaling on estimates of ASE using real data and a theoretical framework, and in terms of power, model fit and predictive performance.
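One way to see why scaling reduces shrinkage for low-MAF variants is a single-SNP ridge regression, sketched below. This is an illustrative simplification and not the model used in the study: the penalty `lam`, the function name and the toy data are assumptions. Dividing centred allele counts by sqrt(2p(1-p)) turns the per-allele estimate into x'y / (x'x + lam·2p(1-p)), so the effective penalty shrinks as the allele frequency p moves away from 0.5.

```python
import math


def ridge_per_allele_effect(allele_counts, phenotypes, lam, scale):
    """Single-SNP ridge estimate, reported per copy of the counted allele.

    allele_counts: 0/1/2 counts for each individual (assumed coding).
    lam: ridge penalty (hypothetical value chosen by the caller).
    scale: if True, divide centred counts by sqrt(2p(1-p)) before fitting.
    """
    n = len(allele_counts)
    p = sum(allele_counts) / (2 * n)           # frequency of the counted allele
    if scale and p in (0.0, 1.0):
        raise ValueError("cannot scale a monomorphic SNP")
    sd = math.sqrt(2 * p * (1 - p)) if scale else 1.0
    z = [(x - 2 * p) / sd for x in allele_counts]
    y_mean = sum(phenotypes) / n
    zty = sum(zi * (yi - y_mean) for zi, yi in zip(z, phenotypes))
    ztz = sum(zi * zi for zi in z)
    effect_on_fitted_scale = zty / (ztz + lam)
    # Convert back to the effect of one allele copy so both fits are comparable.
    return effect_on_fitted_scale / sd


# Hypothetical low-MAF SNP: 2 heterozygotes among 20 individuals (p = 0.05).
x = [0] * 18 + [1, 1]
y = [0.0] * 18 + [1.0, 1.0]
print(ridge_per_allele_effect(x, y, lam=1.0, scale=False))
print(ridge_per_allele_effect(x, y, lam=1.0, scale=True))
```

With the same penalty, the scaled fit stays closer to the unpenalised estimate (1.0 for this toy data) than the unscaled fit does, which is the reduced shrinkage the abstract refers to.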
Variation in the oxytocin receptor (OXTR) gene may partly explain individual differences in oxytocin-related social behavior. Two single nucleotide polymorphisms (SNPs) have been suggested as promising candidates: rs53576 and rs2254298, although the results of studies were not consistent. We carried out meta-analyses for these two SNPs, covering five domains of outcomes: (a) biology, (b) personality, (c) social behavior, (d) psychopathology, and (e) autism, on the basis of 82 pertinent effect sizes, 48 for OXTR rs53576 (N=17 559) and 34 for OXTR rs2254298 (N=13 547). Combined effect sizes did not differ from zero in any of the domains, nor for all domains combined. Clinical status, age, and sex did not moderate the effect sizes. Minor allele frequency was related to ethnicity, with significantly lower minor allele frequencies in samples with predominantly Caucasian participants. The domain of biological functioning seemed most promising, but comprised few studies. We conclude that so far two of the most intensively studied OXTR SNPs (rs53576 and rs2254298) failed to explain a significant part of human social behavior.
Polymorphic loci exist throughout the genomes of a population and provide the raw genetic material needed for a species to adapt to changes in the environment. The minor allele frequencies of rare Single Nucleotide Polymorphisms (SNPs) within a population have been difficult to track with Next-Generation Sequencing (NGS), due to the high error rate of standard methods such as Illumina sequencing.
BACKGROUND AND OBJECTIVE: Survival of patients with pancreatic adenocarcinoma is limited and few prognostic factors are known. We conducted a two-stage genome-wide association study (GWAS) to identify germline variants associated with survival in patients with pancreatic adenocarcinoma. METHODS: We analysed overall survival in relation to single nucleotide polymorphisms (SNPs) among 1005 patients from two large GWAS datasets, PanScan I and ChinaPC. Cox proportional hazards regression was used in an additive genetic model with adjustment for age, sex, clinical stage and the top four principal components of population stratification. The first stage included 642 cases of European ancestry (PanScan), from which the top SNPs (p≤10⁻⁵) were advanced to a joint analysis with 363 additional patients from China (ChinaPC). RESULTS: In the first stage of cases of European descent, the top-ranked loci were at chromosomes 11p15.4, 18p11.21 and 1p36.13, tagged by rs12362504 (p=1.63×10⁻⁷), rs981621 (p=1.65×10⁻⁷) and rs16861827 (p=3.75×10⁻⁷), respectively. 131 SNPs with p≤10⁻⁵ were advanced to a joint analysis with cases from the ChinaPC study. In the joint analysis, the top-ranked SNP was rs10500715 (minor allele frequency, 0.37; p=1.72×10⁻⁷) on chromosome 11p15.4, which is intronic to the SET binding factor 2 (SBF2) gene. The HR (95% CI) for death was 0.74 (0.66 to 0.84) in PanScan I, 0.79 (0.65 to 0.97) in ChinaPC and 0.76 (0.68 to 0.84) in the joint analysis. CONCLUSIONS: Germline genetic variation in the SBF2 locus was associated with overall survival in patients with pancreatic adenocarcinoma of European and Asian ancestry. This association should be investigated in additional large patient cohorts.