Concept: Shannon index


Two recent studies have reanalyzed previously published data and found that when data sets were analyzed independently, there was limited support for the widely accepted hypothesis that changes in the microbiome are associated with obesity. This hypothesis was reconsidered by increasing the number of data sets and pooling the results across the individual data sets. The preferred reporting items for systematic reviews and meta-analyses guidelines were used to identify 10 studies for an updated and more synthetic analysis. Alpha diversity metrics and the relative risk of obesity based on those metrics were used to identify a limited number of significant associations with obesity; however, when the results of the studies were pooled by using a random-effect model, significant associations were observed among Shannon diversity, the number of observed operational taxonomic units, Shannon evenness, and obesity status. They were not observed for the ratio of Bacteroidetes and Firmicutes or their individual relative abundances. Although these tests yielded small P values, the difference between the Shannon diversity indices of nonobese and obese individuals was 2.07%. A power analysis demonstrated that only one of the studies had sufficient power to detect a 5% difference in diversity. When random forest machine learning models were trained on one data set and then tested by using the other nine data sets, the median accuracy varied between 33.01 and 64.77% (median, 56.68%). Although there was support for a relationship between the microbial communities found in human feces and obesity status, this association was relatively weak and its detection is confounded by large interpersonal variation and insufficient sample sizes.
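The three alpha diversity metrics named in this abstract can be illustrated with a minimal sketch. The function names and the example counts are hypothetical and not taken from the study. Natural logarithms are assumed, and evenness is assumed to be Pielou's J = H'/ln(S), which the abstract does not specify.

```python
import math

def shannon_index(counts):
    """Shannon diversity H' = -sum(p_i * ln(p_i)) over taxa with nonzero counts."""
    total = sum(counts)
    if total <= 0:
        raise ValueError("counts must contain at least one observation")
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)

def observed_richness(counts):
    """Number of taxa (e.g. OTUs) observed at least once."""
    return sum(1 for c in counts if c > 0)

def shannon_evenness(counts):
    """Pielou's evenness J = H' / ln(S); returns 0.0 when fewer than two taxa
    are observed (a convention assumed here, since ln(1) = 0)."""
    s = observed_richness(counts)
    if s < 2:
        return 0.0
    return shannon_index(counts) / math.log(s)

# Hypothetical OTU counts for one sample (not data from the study).
otu_counts = [40, 30, 20, 10, 0]
print(shannon_index(otu_counts), observed_richness(otu_counts), shannon_evenness(otu_counts))
```

A perfectly even community gives J = 1, and dominance by a few taxa lowers both H' and J.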

Concepts: Bacteria, Gut flora, Taxonomy, Ratio, Diversity index, Shannon index, Measurement of biodiversity, Index numbers


The peanut (Arachis hypogaea) is an important oil crop. Breeding for high oil content is becoming increasingly important. Wild Arachis species have been reported to harbor genes for many valuable traits that may enable the improvement of cultivated Arachis hypogaea, such as resistance to pests and disease. However, only limited information is available on variation in oil content. In the present study, a collection of 72 wild Arachis accessions representing 19 species and 3 cultivated peanut accessions were genotyped using 136 genome-wide SSR markers and phenotyped for oil content over three growing seasons. The wild Arachis accessions showed abundant diversity across the 19 species. A. duranensis exhibited the highest diversity, with a Shannon-Weaver diversity index of 0.35. A total of 129 unique alleles were detected in the species studied. A. rigonii exhibited the largest number of unique alleles (75), indicating that this species is highly differentiated. AMOVA and genetic distance analyses confirmed the genetic differentiation between the wild Arachis species. The majority of SSR alleles were detected exclusively in the wild species and not in A. hypogaea, indicating that directional selection or the hitchhiking effect has played an important role in the domestication of the cultivated peanut. The 75 accessions were grouped into three clusters based on population structure and phylogenetic analysis, consistent with their taxonomic sections, species and genome types. A. villosa and A. batizocoi were grouped with A. hypogaea, suggesting a close relationship between these two diploid wild species and the cultivated peanut. Considerable phenotypic variation in oil content was observed among different sections and species. Nine alleles were identified as associated with oil content based on association analysis; of these, three alleles were associated with higher oil content but were absent in the cultivated peanut.
The results demonstrated that there is great potential to increase the oil content in A. hypogaea by using the wild Arachis germplasm.

Concepts: Gene, Evolution, Fabaceae, Peanut, Arachis, Diversity index, Shannon index, Measurement of biodiversity


The disease severity of Entamoeba histolytica infection ranges from asymptomatic to life-threatening. Recent human and animal data implicate the gut microbiome as a modifier of E. histolytica virulence. Here we have explored the association of the microbiome with susceptibility to amebiasis in infants and in the mouse model of amebic colitis. Dysbiosis occurred with symptomatic E. histolytica infection in children, as evidenced by a lower Shannon diversity index of the gut microbiota. To test if dysbiosis was a cause of susceptibility, wild type C57BL/6 mice (which are innately resistant to E. histolytica infection) were treated with antibiotics prior to cecal challenge with E. histolytica. Compared with untreated mice, antibiotic pre-treated mice had more severe colitis and delayed clearance of E. histolytica. Gut IL-25 and mucus protein Muc2, both shown to provide innate immunity in the mouse model of amebic colitis, were lower in antibiotic pre-treated mice. Moreover, dysbiotic mice had fewer cecal neutrophils and lower myeloperoxidase activity. Paradoxically, the neutrophil chemoattractant chemokines CXCL1 and CXCL2, as well as IL-1β, were higher in the colon of mice with antibiotic-induced dysbiosis. Neutrophils from antibiotic pre-treated mice had diminished surface expression of the chemokine receptor CXCR2, potentially explaining their inability to migrate to the site of infection. Blockade of CXCR2 increased susceptibility of control non-antibiotic treated mice to amebiasis. In conclusion, dysbiosis increased the severity of amebic colitis due to decreased neutrophil recruitment to the gut, which was due in part to decreased surface expression on neutrophils of CXCR2.

Concepts: Immune system, Gene, Bacteria, Gut flora, Innate immune system, Chemokine, Shannon index, Entamoeba histolytica


Anecdotal accounts regarding reduced US cropping system diversity have raised concerns about negative impacts of increasingly homogeneous cropping systems. However, formal analyses to document such changes are lacking. Using US Agriculture Census data, which are collected every five years, we quantified crop species diversity from 1978 to 2012, for the contiguous US on a county-level basis. We used Shannon diversity indices expressed as effective number of crop species (ENCS) to quantify crop diversity. We then evaluated changes in county-level crop diversity both nationally and for each of the eight Farm Resource Regions developed by the National Agriculture Statistics Service. During the 34 years we considered in our analyses, both national and regional ENCS changed. Nationally, crop diversity was lower in 2012 than in 1978. However, our analyses also revealed interesting trends between and within different Resource Regions. Overall, the Heartland Resource Region had the lowest crop diversity whereas the Fruitful Rim and Northern Crescent had the highest. In contrast to the other Resource Regions, the Mississippi Portal had significantly higher crop diversity in 2012 than in 1978. Also, within regions there were differences between counties in crop diversity. Spatial autocorrelation revealed clustering of low and high ENCS and this trend became stronger over time. These results show that, nationally, counties have been clustering into areas of either low diversity or high diversity. Moreover, a significant trend of more counties shifting to lower rather than to higher crop diversity was detected. The clustering and shifting demonstrate a trend toward crop diversity loss and attendant homogenization of agricultural production systems, which could have far-reaching consequences for provision of ecosystem services associated with agricultural systems as well as food system sustainability.
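A Shannon index is commonly expressed as an effective number of species by taking exp(H'), the number of equally abundant species that would give the same H'. The sketch below assumes that standard conversion and assumes crop proportions are based on area. The abstract states neither detail, and the function name and county figures are hypothetical.

```python
import math

def effective_number_of_species(abundances):
    """Return exp(H'), where H' = -sum(p_i * ln(p_i)) over nonzero abundances."""
    total = sum(abundances)
    if total <= 0:
        raise ValueError("abundances must sum to a positive value")
    h = -sum((a / total) * math.log(a / total) for a in abundances if a > 0)
    return math.exp(h)

# Hypothetical crop areas (arbitrary units) for two counties, not census data.
county_even = {"corn": 500.0, "soybean": 500.0}
county_dominated = {"corn": 900.0, "soybean": 50.0, "wheat": 50.0}

print(effective_number_of_species(county_even.values()))
print(effective_number_of_species(county_dominated.values()))
```

The second county grows three crops, yet its effective number is below two because one crop dominates. This is why an effective number is easier to interpret than raw H' when comparing counties.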

Concepts: Biodiversity, Conservation biology, Agriculture, Sustainability, Food security, Diversity index, Shannon index, Monoculture


The aim of this study was to develop novel anaerobic media using gellan gum for the isolation of previously uncultured rumen bacteria. Four anaerobic media, a basal liquid medium (BM) with agar (A-BM), a modified BM (MBM) with agar (A-MBM), an MBM with phytagel (P-MBM) and an MBM with gelrite (G-MBM) were used for the isolation of rumen bacteria and evaluated for the growth of previously uncultured rumen bacteria. Of the 214 isolates composed of 144 OTUs, 103 isolates (83 OTUs) were previously uncultured rumen bacteria. Most of the previously uncultured strains were obtained from A-MBM, G-MBM and P-MBM, but the predominant cultural members, isolated from each medium, differed. A-MBM and G-MBM showed significantly higher numbers of different OTUs derived from isolates than A-BM (P < 0.05). The Shannon index indicated that the isolates of A-MBM showed the highest diversity (H' = 3.89) compared with those of G-MBM, P-MBM and A-BM (H' = 3.59, 3.23 and 3.39, respectively). Although previously uncultured rumen bacteria were isolated from all media used, the ratio of previously uncultured bacteria to total isolates was increased in A-MBM, P-MBM and G-MBM.

Concepts: Ratio, Agar, E number, Diversity index, Shannon index, Gellan gum, Sphingomonas


Diversity indices might be used to assess the impact of treatments on the relative abundance patterns in species communities. When several treatments are to be compared, simultaneous confidence intervals for the differences of diversity indices between treatments may be used. The simultaneous confidence interval methods described until now are either constructed or validated under the assumption of the multinomial distribution for the abundance counts. Motivated by four example data sets with background in agricultural and marine ecology, we focus on the situation when available replications show that the count data exhibit extra-multinomial variability. Based on simulated overdispersed count data, we compare previously proposed methods assuming multinomial distribution, a method assuming normal distribution for the replicated observations of the diversity indices and three different bootstrap methods to construct simultaneous confidence intervals for multiple differences of Simpson and Shannon diversity indices. The focus of the simulation study is on comparisons to a control group. The severe failure of asymptotic multinomial methods in overdispersed settings is illustrated. Among the bootstrap methods, the widely known Westfall-Young method performs best for the Simpson index, while for the Shannon index, two methods based on stratified bootstrap and summed count data are preferable. The application of the methods is illustrated with an example.
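The general idea of resampling replicates within each group and computing the index on summed counts can be sketched as follows. This is a simplified percentile bootstrap for a single treatment-versus-control difference. It is not one of the specific methods compared in the study, and it makes no adjustment for simultaneous inference across several treatments. All names and counts are hypothetical.

```python
import math
import random

def shannon(counts):
    """Shannon index H' with natural logarithms."""
    total = sum(counts)
    if total <= 0:
        raise ValueError("counts must contain at least one observation")
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)

def pooled_shannon(replicates):
    """Sum the species counts over replicates, then compute H' on the sums."""
    summed = [sum(column) for column in zip(*replicates)]
    return shannon(summed)

def bootstrap_diff_ci(control, treatment, n_boot=2000, alpha=0.05, seed=0):
    """Percentile interval for H'(treatment) - H'(control).

    Replicates are resampled with replacement separately within each
    group (stratified resampling), so group sizes stay fixed.
    """
    rng = random.Random(seed)
    diffs = []
    for _ in range(n_boot):
        c = [rng.choice(control) for _ in control]
        t = [rng.choice(treatment) for _ in treatment]
        diffs.append(pooled_shannon(t) - pooled_shannon(c))
    diffs.sort()
    lo_idx = max(0, int((alpha / 2) * n_boot))
    hi_idx = min(n_boot - 1, int((1 - alpha / 2) * n_boot) - 1)
    return diffs[lo_idx], diffs[hi_idx]

# Hypothetical replicated species counts (rows = replicates, columns = species).
control = [[12, 9, 11, 8], [10, 10, 9, 11], [15, 7, 10, 8]]
treatment = [[30, 3, 4, 3], [25, 6, 5, 4], [28, 2, 6, 4]]
print(bootstrap_diff_ci(control, treatment))
```

Resampling whole replicates, rather than individual organisms, lets the interval reflect variation between replicates instead of assuming multinomial sampling within one pooled sample.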

Concepts: Statistics, Confidence interval, Normal distribution, Student's t-distribution, Diversity index, Shannon index, Measurement of biodiversity, Species richness


There are many species of jasmines in different regions of Iran in natural or cultivated form, and there is no information about their genetic status. Therefore, inter-simple sequence repeat (ISSR) analysis was used to evaluate genetic variations of the 53 accessions representing eight species of Jasminum collected from different regions of Iran. A total of 21 ISSR primers were used which generated 981 bands of different sizes. Mean percentage of polymorphic bands was 90.64 %. Maximum resolving power, polymorphic information content average, and marker index values were 21.55, 0.35, and 14.42 for primers of 3, 4, and 3 respectively. The unweighted pair group method with arithmetic mean dendrogram based on Jaccard’s coefficients indicated that 53 accessions were divided into two major clusters. The first major cluster was divided into two subclusters; the subcluster A included Jasminum grandiflorum L., J. officinale L., and J. azoricum L. and the subcluster B consisted of three forms of J. sambac L. (single, semi-double, and double flowers). The second major cluster was divided into two subclusters; the first subcluster (C) included J. humile L., J. primulinum Hemsl., J. nudiflorum Lindl. and the second subcluster (D) consisted of J. fruticans L. At the species level, the highest percentage of polymorphism (34.05 %), number of effective alleles (1.16), Shannon index (0.151), and Nei’s genetic diversity (0.098) were observed in J. officinale. The lowest values of percentage polymorphism (0.011), number of effective alleles (1.009), Shannon index (0.007), and Nei’s genetic diversity (0.005) were obtained for J. nudiflorum. Based on pairwise population matrix of Nei’s unbiased genetic identity, the highest identity (0.85) was found between J. officinale and J. azoricum and the lowest identity (0.69) was between J. grandiflorum and J. primulinum. Based on analysis of molecular variance, the amount of genetic variations among the eight populations was 83 %.
This study demonstrated that ISSR is a useful tool for studying genomic diversity in jasmine and for detecting relationships among its species.
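The Jaccard coefficient underlying a band-based dendrogram can be illustrated with a minimal sketch. Each accession is assumed to be scored as a list of 0/1 values for band absence or presence. The function name and the band profiles are hypothetical and not data from the study.

```python
def jaccard_similarity(bands_a, bands_b):
    """Jaccard coefficient: shared bands / bands present in at least one profile.

    Positions where both profiles lack a band are ignored. Returns 0.0 when
    neither profile has any band (a convention assumed for this sketch).
    """
    if len(bands_a) != len(bands_b):
        raise ValueError("profiles must score the same set of band positions")
    shared = sum(1 for a, b in zip(bands_a, bands_b) if a and b)
    either = sum(1 for a, b in zip(bands_a, bands_b) if a or b)
    return shared / either if either else 0.0

# Hypothetical presence/absence profiles for two accessions.
accession_1 = [1, 1, 0, 0, 1]
accession_2 = [1, 0, 1, 0, 1]
print(jaccard_similarity(accession_1, accession_2))
```

Ignoring shared absences suits dominant markers such as ISSR, where a missing band in two accessions is weak evidence that they are similar.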

Concepts: Genetics, Arithmetic mean, Standard deviation, Shannon index, Jasmine, Jasminum sambac, Human genetic variation, Jasminum


The significance of viral lysis in shaping bacterial communities in temperate freshwater systems is poorly documented. Here we used Illumina sequencing of 16S rRNA genes to examine bacterial community structure and diversity in relation to variable viral lysis in the euphotic zone of 25 temperate freshwater lakes (French Massif Central). We captured a rich bacterial community that was dominated by a few bacterial classes and operational taxonomic units (OTUs) frequently detected in other freshwater ecosystems. In the investigated lakes with contrasting physico-chemical characteristics, the dominant bacterioplankton community was represented by major taxonomical orders, namely Actinomycetales, Burkholderiales, Sphingobacteriales, Acidimicrobiales, Flavobacteriales and Cytophagales, covering about 70% of all sequences. Viral lysis was significantly correlated with the bacterial diversity indices (Chao, Shannon, OTUs), which explained about 33% and 45% of the variation in species diversity and observed richness, respectively. ANOSIM and UniFrac analyses indicated a clear distinction of bacterial community structure among the lakes that exhibited high and low lytic viral infection (FIC) rates. Based on our findings, high FIC (>10%) supported higher species richness, whereas low FIC (<10%) resulted in a less diverse community. Our study strongly suggests that lytic activity prevailed over the type of lake ecosystems in shaping bacterioplankton diversity.

Concepts: Biodiversity, Virus, Ribosomal RNA, Lake, Taxonomy, 16S ribosomal RNA, Shannon index, Aquatic ecology


Estimations of microbial community diversity based on metagenomic data sets are affected, often to an unknown degree, by biases derived from insufficient coverage and reference database-dependent estimations of diversity. For instance, the completeness of reference databases cannot be generally estimated since it depends on the extant diversity sampled to date, which, with the exception of a few habitats such as the human gut, remains severely undersampled. Further, estimation of the degree of coverage of a microbial community by a metagenomic data set is prohibitively time-consuming for large data sets, and coverage values may not be directly comparable between data sets obtained with different sequencing technologies. Here, we extend Nonpareil, a database-independent tool for the estimation of coverage in metagenomic data sets, to a high-performance computing implementation that scales up to hundreds of cores and includes, in addition, a k-mer-based estimation as sensitive as the original alignment-based version but about three hundred times as fast. Further, we propose a metric of sequence diversity (N_d) derived directly from Nonpareil curves that correlates well with alpha diversity assessed by traditional metrics. We use this metric in different experiments demonstrating the correlation with the Shannon index estimated on 16S rRNA gene profiles and show that N_d additionally reveals seasonal patterns in marine samples that are not captured by the Shannon index and more precise rankings of the magnitude of diversity of microbial communities in different habitats. Therefore, the new version of Nonpareil, called Nonpareil 3, advances the toolbox for metagenomic analyses of microbiomes.
IMPORTANCE Estimation of the coverage provided by a metagenomic data set, i.e., what fraction of the microbial community was sampled by DNA sequencing, represents an essential first step of every culture-independent genomic study that aims to robustly assess the sequence diversity present in a sample. However, estimation of coverage remains elusive because of several technical limitations associated with high computational requirements and limiting statistical approaches to quantify diversity. Here we describe Nonpareil 3, a new bioinformatics algorithm that circumvents several of these limitations and thus can facilitate culture-independent studies in clinical or environmental settings, independent of the sequencing platform employed. In addition, we present a new metric of sequence diversity based on rarefied coverage and demonstrate its use in communities from diverse ecosystems.

Concepts: Statistics, Microbiology, Ribosomal RNA, 16S ribosomal RNA, Estimation, Diversity index, Shannon index, Measurement of biodiversity


Seagrass meadows globally are disappearing at a rapid rate, with physical disturbances being one of the major drivers of this habitat loss. Disturbance of seagrass can lead to fragmentation, a reduction in shoot density, canopy height and coverage, and potentially permanent loss of habitat. Despite being such a widespread issue, knowledge of how such small-scale change affects the spatial distribution and abundances of motile fauna remains limited. The present study investigated fish and macrofaunal community response patterns to a range of habitat variables (shoot length, cover and density), including individual species habitat preferences within a disturbed and patchy intertidal seagrass meadow. Multivariate analysis showed a measurable effect of variable seagrass cover on the abundance and distribution of the fauna, with species-specific preferences for both high- and low-cover seagrass. The faunal community composition varied significantly with increasing/decreasing cover. The faunal species composition of low-cover seagrass was more similar to sandy control plots than to higher-cover seagrass. Shannon-Wiener diversity (H') and species richness were significantly higher in high-cover seagrass than in low-cover seagrass, indicating increasing habitat value as density increases. The results of this study underline how small-scale disturbances that reduce shoot density and cover, from factors such as anchor damage, boat moorings and intertidal vehicle use, can impact the fauna associated with seagrass meadows. These impacts have negative consequences for the delivery of ecosystem services such as the provision of nursery habitat.

Concepts: Habitat, Shannon index, Order theory, Abundance, Species richness, Habitat fragmentation, Habitat destruction, Seagrass