
Concept: Diversity index


Two recent studies have reanalyzed previously published data and found that when data sets were analyzed independently, there was limited support for the widely accepted hypothesis that changes in the microbiome are associated with obesity. This hypothesis was reconsidered by increasing the number of data sets and pooling the results across the individual data sets. The preferred reporting items for systematic reviews and meta-analyses guidelines were used to identify 10 studies for an updated and more synthetic analysis. Alpha diversity metrics and the relative risk of obesity based on those metrics were used to identify a limited number of significant associations with obesity; however, when the results of the studies were pooled by using a random-effect model, significant associations were observed among Shannon diversity, the number of observed operational taxonomic units, Shannon evenness, and obesity status. They were not observed for the ratio of Bacteroidetes and Firmicutes or their individual relative abundances. Although these tests yielded small P values, the difference between the Shannon diversity indices of nonobese and obese individuals was 2.07%. A power analysis demonstrated that only one of the studies had sufficient power to detect a 5% difference in diversity. When random forest machine learning models were trained on one data set and then tested by using the other nine data sets, the median accuracy varied between 33.01 and 64.77% (median, 56.68%). Although there was support for a relationship between the microbial communities found in human feces and obesity status, this association was relatively weak and its detection is confounded by large interpersonal variation and insufficient sample sizes.
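The three alpha diversity metrics named in this abstract (Shannon diversity, number of observed OTUs, Shannon evenness) can be illustrated with a minimal sketch. The function names and the OTU counts below are hypothetical, chosen only for illustration; the natural logarithm is assumed, and evenness is taken as Shannon diversity divided by the log of observed richness (Pielou's form), which may differ from the exact definitions used in the pooled studies.

```python
import math


def shannon_diversity(counts):
    """Shannon diversity H' = -sum(p_i * ln p_i) over taxa with nonzero counts."""
    total = sum(counts)
    if total <= 0:
        raise ValueError("counts must contain at least one positive value")
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


def observed_richness(counts):
    """Number of taxa (e.g. OTUs) observed at least once."""
    return sum(1 for c in counts if c > 0)


def shannon_evenness(counts):
    """Evenness J' = H' / ln(S); defined here as 0.0 when fewer than two taxa are present."""
    richness = observed_richness(counts)
    if richness < 2:
        return 0.0
    return shannon_diversity(counts) / math.log(richness)


# Hypothetical OTU count vectors (not data from any of the studies).
even_sample = [25, 25, 25, 25]    # four equally abundant OTUs
skewed_sample = [97, 1, 1, 1]     # one dominant OTU

for name, sample in [("even", even_sample), ("skewed", skewed_sample)]:
    print(name, observed_richness(sample),
          round(shannon_diversity(sample), 3),
          round(shannon_evenness(sample), 3))
```

Both samples have the same observed richness, yet the skewed one has a much lower Shannon diversity and evenness, which is why the abstract reports the three metrics separately.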

Concepts: Bacteria, Gut flora, Taxonomy, Ratio, Diversity index, Shannon index, Measurement of biodiversity, Index numbers


The peanut (Arachis hypogaea) is an important oil crop. Breeding for high oil content is becoming increasingly important. Wild Arachis species have been reported to harbor genes for many valuable traits that may enable the improvement of cultivated Arachis hypogaea, such as resistance to pests and disease. However, only limited information is available on variation in oil content. In the present study, a collection of 72 wild Arachis accessions representing 19 species and 3 cultivated peanut accessions were genotyped using 136 genome-wide SSR markers and phenotyped for oil content over three growing seasons. The wild Arachis accessions showed abundant diversity across the 19 species. A. duranensis exhibited the highest diversity, with a Shannon-Weaver diversity index of 0.35. A total of 129 unique alleles were detected in the species studied. A. rigonii exhibited the largest number of unique alleles (75), indicating that this species is highly differentiated. AMOVA and genetic distance analyses confirmed the genetic differentiation between the wild Arachis species. The majority of SSR alleles were detected exclusively in the wild species and not in A. hypogaea, indicating that directional selection or the hitchhiking effect has played an important role in the domestication of the cultivated peanut. The 75 accessions were grouped into three clusters based on population structure and phylogenetic analysis, consistent with their taxonomic sections, species and genome types. A. villosa and A. batizocoi were grouped with A. hypogaea, suggesting a close relationship between these two diploid wild species and the cultivated peanut. Considerable phenotypic variation in oil content was observed among different sections and species. Nine alleles were identified as associated with oil content based on association analysis; of these, three alleles were associated with higher oil content but were absent in the cultivated peanut.
The results demonstrated that there is great potential to increase the oil content in A. hypogaea by using the wild Arachis germplasm.

Concepts: Gene, Evolution, Fabaceae, Peanut, Arachis, Diversity index, Shannon index, Measurement of biodiversity


How reliable are results on the spatial distribution of biodiversity based on databases? Many studies have documented the uncertainty that sampling effort bias introduces into this kind of analysis, and the need to quantify it. Although a number of methods are available for this purpose, little is known about their statistical limitations and discrimination capability, which could seriously constrain their use. We assess for the first time the discrimination capacity of two widely used methods and a proposed new one (FIDEGAM), all based on species accumulation curves, under different scenarios of sampling exhaustiveness using Receiver Operating Characteristic (ROC) analyses. Additionally, we examine to what extent the output of each method represents the sampling completeness in a simulated scenario where the true species richness is known. Finally, we apply FIDEGAM to a real situation and explore the spatial patterns of plant diversity in a National Park. FIDEGAM showed an excellent capability to distinguish between well and poorly sampled areas regardless of sampling exhaustiveness, whereas the other methods failed. Accordingly, FIDEGAM values were strongly correlated with the true percentage of species detected in a simulated scenario, whereas sampling completeness estimated with other methods showed no relationship due to null discrimination capability. Quantifying sampling effort is necessary to account for the uncertainty in biodiversity analyses; however, not all proposed methods are equally reliable. Our comparative analysis demonstrated that FIDEGAM was the most accurate discriminator method in all scenarios of sampling exhaustiveness, and therefore, it can be efficiently applied to most databases in order to enhance the reliability of biodiversity analyses.
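All three methods compared in this abstract build on species accumulation curves. As a rough illustration of that shared ingredient only, the sketch below computes a generic randomized accumulation curve: the mean number of distinct species seen as sampling units are added in random order. It is not FIDEGAM, whose details are not given here; the function name, parameters and toy records are hypothetical.

```python
import random


def species_accumulation(samples, n_permutations=100, seed=0):
    """Mean cumulative species count as samples are added in random order.

    samples: list of sets, each holding the species recorded in one sampling unit.
    Returns a list whose k-th entry is the mean richness after k + 1 units.
    """
    rng = random.Random(seed)
    n = len(samples)
    totals = [0.0] * n
    for _ in range(n_permutations):
        order = list(range(n))
        rng.shuffle(order)          # random order of sampling units
        seen = set()
        for step, idx in enumerate(order):
            seen.update(samples[idx])
            totals[step] += len(seen)
    return [t / n_permutations for t in totals]


# Hypothetical records for one grid cell: species seen in each sampling unit.
records = [{"a", "b"}, {"b", "c"}, {"c"}, {"a", "d"}]
curve = species_accumulation(records)
print(curve)
```

A curve that is still rising steeply at its last point suggests that further sampling would keep adding species, which is the kind of signal that completeness estimators derive from such curves.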

Concepts: Scientific method, Critical thinking, Plant, Physics, Mathematical analysis, Logic, Receiver operating characteristic, Diversity index


Omega-3 fatty acids may influence human physiological parameters in part by affecting the gut microbiome. The aim of this study was to investigate the links between omega-3 fatty acids, gut microbiome diversity and composition and faecal metabolomic profiles in middle aged and elderly women. We analysed data from 876 twins with 16S microbiome data and DHA, total omega-3, and other circulating fatty acids. Estimated food intake of omega-3 fatty acids was obtained from food frequency questionnaires. Both total omega-3 and DHA serum levels were significantly correlated with microbiome alpha diversity (Shannon index) after adjusting for confounders (DHA: Beta(SE) = 0.13(0.04), P = 0.0006; total omega-3: Beta(SE) = 0.13(0.04), P = 0.001). These associations remained significant after adjusting for dietary fibre intake. We found even stronger associations between DHA and 38 operational taxonomic units (OTUs), the strongest ones being with OTUs from the Lachnospiraceae family (Beta(SE) = 0.13(0.03), P = 8 × 10^-7). Some of the associations with gut bacterial OTUs appear to be mediated by the abundance of the faecal metabolite N-carbamylglutamate. Our data indicate a link between omega-3 circulating levels/intake and microbiome composition independent of dietary fibre intake, particularly with bacteria of the Lachnospiraceae family. These data suggest the potential use of omega-3 supplementation to improve the microbiome composition.

Concepts: Nutrition, Fatty acid, Triglyceride, Essential fatty acid, Omega-3 fatty acid, Eicosapentaenoic acid, Diversity index, Omega-6 fatty acid


Anecdotal accounts regarding reduced US cropping system diversity have raised concerns about negative impacts of increasingly homogeneous cropping systems. However, formal analyses to document such changes are lacking. Using US Agriculture Census data, which are collected every five years, we quantified crop species diversity from 1978 to 2012, for the contiguous US on a county level basis. We used Shannon diversity indices expressed as effective number of crop species (ENCS) to quantify crop diversity. We then evaluated changes in county-level crop diversity both nationally and for each of the eight Farm Resource Regions developed by the National Agriculture Statistics Service. During the 34 years we considered in our analyses, both national and regional ENCS changed. Nationally, crop diversity was lower in 2012 than in 1978. However, our analyses also revealed interesting trends between and within different Resource Regions. Overall, the Heartland Resource Region had the lowest crop diversity whereas the Fruitful Rim and Northern Crescent had the highest. In contrast to the other Resource Regions, the Mississippi Portal had significantly higher crop diversity in 2012 than in 1978. Also, within regions there were differences between counties in crop diversity. Spatial autocorrelation revealed clustering of low and high ENCS and this trend became stronger over time. These results show that, nationally, counties have been clustering into areas of either low diversity or high diversity. Moreover, a significant trend of more counties shifting to lower rather than to higher crop diversity was detected. The clustering and shifting demonstrate a trend toward crop diversity loss and attendant homogenization of agricultural production systems, which could have far-reaching consequences for provision of ecosystem services associated with agricultural systems as well as food system sustainability.
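For readers unfamiliar with expressing a Shannon index as an effective number of species, the usual conversion is the exponential of the index. The notation below is an assumption for illustration: p_i is taken to be the share of a county's cropland in crop i and S the number of crops, with the natural logarithm; the abstract itself does not state how the shares were weighted.

```latex
H' = -\sum_{i=1}^{S} p_i \ln p_i ,
\qquad
\mathrm{ENCS} = \exp(H')
```

As a worked case under these assumptions, a county whose cropland is split evenly among four crops has p_i = 1/4, so H' = ln 4 and ENCS = 4, while a county growing a single crop has H' = 0 and ENCS = 1. Uneven shares among four crops give an ENCS between 1 and 4.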

Concepts: Biodiversity, Conservation biology, Agriculture, Sustainability, Food security, Diversity index, Shannon index, Monoculture


The aim of this study was to develop novel anaerobic media using gellan gum for the isolation of previously uncultured rumen bacteria. Four anaerobic media, a basal liquid medium (BM) with agar (A-BM), a modified BM (MBM) with agar (A-MBM), an MBM with phytagel (P-MBM) and an MBM with gelrite (G-MBM) were used for the isolation of rumen bacteria and evaluated for the growth of previously uncultured rumen bacteria. Of the 214 isolates composed of 144 OTUs, 103 isolates (83 OTUs) were previously uncultured rumen bacteria. Most of the previously uncultured strains were obtained from A-MBM, G-MBM and P-MBM, but the predominant cultural members, isolated from each medium, differed. A-MBM and G-MBM showed significantly higher numbers of different OTUs derived from isolates than A-BM (P < 0·05). The Shannon index indicated that the isolates of A-MBM showed the highest diversity (H' = 3·89) compared with those of G-MBM, P-MBM and A-BM (H' = 3·59, 3·23 and 3·39, respectively). Although previously uncultured rumen bacteria were isolated from all media used, the ratio of previously uncultured bacteria to total isolates was increased in A-MBM, P-MBM and G-MBM.

Concepts: Ratio, Agar, E number, Diversity index, Shannon index, Gellan gum, Sphingomonas


Diversity indices might be used to assess the impact of treatments on the relative abundance patterns in species communities. When several treatments are to be compared, simultaneous confidence intervals for the differences of diversity indices between treatments may be used. The simultaneous confidence interval methods described so far are either constructed or validated under the assumption of the multinomial distribution for the abundance counts. Motivated by four example data sets with background in agricultural and marine ecology, we focus on the situation when available replications show that the count data exhibit extra-multinomial variability. Based on simulated overdispersed count data, we compare previously proposed methods assuming multinomial distribution, a method assuming normal distribution for the replicated observations of the diversity indices and three different bootstrap methods to construct simultaneous confidence intervals for multiple differences of Simpson and Shannon diversity indices. The focus of the simulation study is on comparisons to a control group. The severe failure of asymptotic multinomial methods in overdispersed settings is illustrated. Among the bootstrap methods, the widely known Westfall-Young method performs best for the Simpson index, while for the Shannon index, two methods based on stratified bootstrap and summed count data are preferable. The application of the methods is illustrated with an example.
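To make the idea of a bootstrap interval for a difference in diversity indices concrete, here is a deliberately simplified sketch. It resamples replicates within each group (a stratified resampling) and reports a plain percentile interval for one treatment-versus-control difference in mean Simpson index. It is not the Westfall-Young procedure and it does not adjust for multiple comparisons, so it is not one of the simultaneous methods the abstract evaluates. Function names and count data are hypothetical, and the Simpson index is assumed to be 1 minus the sum of squared proportions.

```python
import random


def simpson_index(counts):
    """Simpson diversity taken as 1 - sum(p_i^2)."""
    total = sum(counts)
    return 1.0 - sum((c / total) ** 2 for c in counts)


def mean_index(replicates):
    """Mean Simpson index over the replicates of one group."""
    return sum(simpson_index(r) for r in replicates) / len(replicates)


def bootstrap_diff_ci(treatment, control, n_boot=2000, alpha=0.05, seed=1):
    """Percentile bootstrap interval for mean_index(treatment) - mean_index(control).

    Replicates are resampled with replacement separately within each group.
    """
    rng = random.Random(seed)
    diffs = []
    for _ in range(n_boot):
        t_star = [rng.choice(treatment) for _ in treatment]
        c_star = [rng.choice(control) for _ in control]
        diffs.append(mean_index(t_star) - mean_index(c_star))
    diffs.sort()
    lower = diffs[int((alpha / 2) * n_boot)]
    upper = diffs[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


# Hypothetical replicated species counts (three replicates per group).
control = [[30, 30, 30], [28, 32, 30], [31, 29, 30]]
treatment = [[80, 5, 5], [70, 10, 10], [85, 3, 2]]

lower, upper = bootstrap_diff_ci(treatment, control)
print(round(lower, 3), round(upper, 3))
```

Resampling whole replicates rather than individual counts lets the interval reflect replicate-to-replicate variability, which is the extra-multinomial variation the abstract is concerned with. With only three replicates per group the interval is crude and serves only to show the mechanics.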

Concepts: Statistics, Confidence interval, Normal distribution, Student's t-distribution, Diversity index, Shannon index, Measurement of biodiversity, Species richness


Estimations of microbial community diversity based on metagenomic data sets are affected, often to an unknown degree, by biases derived from insufficient coverage and reference database-dependent estimations of diversity. For instance, the completeness of reference databases cannot be generally estimated since it depends on the extant diversity sampled to date, which, with the exception of a few habitats such as the human gut, remains severely undersampled. Further, estimation of the degree of coverage of a microbial community by a metagenomic data set is prohibitively time-consuming for large data sets, and coverage values may not be directly comparable between data sets obtained with different sequencing technologies. Here, we extend Nonpareil, a database-independent tool for the estimation of coverage in metagenomic data sets, to a high-performance computing implementation that scales up to hundreds of cores and includes, in addition, a k-mer-based estimation as sensitive as the original alignment-based version but about three hundred times as fast. Further, we propose a metric of sequence diversity (N_d) derived directly from Nonpareil curves that correlates well with alpha diversity assessed by traditional metrics. We use this metric in different experiments demonstrating the correlation with the Shannon index estimated on 16S rRNA gene profiles and show that N_d additionally reveals seasonal patterns in marine samples that are not captured by the Shannon index and more precise rankings of the magnitude of diversity of microbial communities in different habitats. Therefore, the new version of Nonpareil, called Nonpareil 3, advances the toolbox for metagenomic analyses of microbiomes.
IMPORTANCE Estimation of the coverage provided by a metagenomic data set, i.e., what fraction of the microbial community was sampled by DNA sequencing, represents an essential first step of every culture-independent genomic study that aims to robustly assess the sequence diversity present in a sample. However, estimation of coverage remains elusive because of several technical limitations associated with high computational requirements and limiting statistical approaches to quantify diversity. Here we describe Nonpareil 3, a new bioinformatics algorithm that circumvents several of these limitations and thus can facilitate culture-independent studies in clinical or environmental settings, independent of the sequencing platform employed. In addition, we present a new metric of sequence diversity based on rarefied coverage and demonstrate its use in communities from diverse ecosystems.

Concepts: Statistics, Microbiology, Ribosomal RNA, 16S ribosomal RNA, Estimation, Diversity index, Shannon index, Measurement of biodiversity


GenGIS is free and open source software designed to integrate biodiversity data with a digital map and information about geography and habitat. While originally developed with microbial community analyses and phylogeography in mind, GenGIS has been applied to a wide range of datasets. A key feature of GenGIS is the ability to test geographic axes that can correspond to routes of migration or gradients that influence community similarity. Here we introduce GenGIS version 2, which extends the linear gradient tests introduced in the first version to allow comprehensive testing of all possible linear geographic axes. GenGIS v2 also includes a new plugin framework that supports the development and use of graphically driven analysis packages: initial plugins include implementations of linear regression and the Mantel test, calculations of alpha-diversity (e.g., Shannon Index) for all samples, and geographic visualizations of dissimilarity matrices. We have also implemented a recently published method for biomonitoring reference condition analysis (RCA), which compares observed species richness and diversity to predicted values to determine whether a given site has been impacted. The newest version of GenGIS supports vector data in addition to raster files. We demonstrate the new features of GenGIS by performing a full gradient analysis of an Australian kangaroo apple data set, by using plugins and embedded statistical commands to analyze human microbiome sample data, and by applying RCA to a set of samples from Atlantic Canada. GenGIS release versions, tutorials and documentation are freely available at, and source code is available at

Concepts: Scientific method, Statistics, Gradient, Source code, Diversity index, Shannon index, Open source, Free software


How floral traits and community composition influence plant specialization is poorly understood and the existing evidence is restricted to regions where plant diversity is low. Here, we assessed whether plant specialization varied among four species-rich subalpine/alpine communities on the Yulong Mountain, SW China (elevation from 2725 to 3910 m). We analyzed two factors (floral traits and pollen vector community composition: richness and density) to determine the degree of plant specialization across 101 plant species in all four communities. Floral visitors were collected and pollen load analyses were conducted to identify and define pollen vectors. Plant specialization of each species was described by using both pollen vector diversity (Shannon’s diversity index) and plant selectiveness (d' index), which reflected how selective a given species was relative to available pollen vectors.

Concepts: Biodiversity, Life, Species, Plant, Flowering plant, Diversity index, Shannon index, Index numbers