Concept: Parsimony


A new species of Neopestalotiopsis is described on the basis of both morphological and molecular characteristics. Neopestalotiopsis iranensis sp. nov. was isolated from rotted strawberry (Fragaria ananassa) fruits as well as from stolon and leaf lesions in Kurdistan province, Iran. Initially, light tan, sunken spots developed on the fruits and resulted in a soft decay of the fruit flesh. The new species is morphologically distinguished from similar species by its different conidium size and by possessing longer apical appendages, as well as some knobbed basal appendages. Phylogenetic analyses (Bayesian inference, maximum likelihood and maximum parsimony) based on combined internal transcribed spacer, β-tubulin, and partial translation elongation factor 1-alpha gene sequences also indicated that this species is phylogenetically distinct from others. Moreover, strawberry is reported here as a new host for N. mesopotamica.

Concepts: Biology, Phylogenetics, Fruit, Bayesian inference, Likelihood function, Garden strawberry, Accessory fruit, Parsimony


Work disability affects quality of life, earnings, and opportunities to contribute to society. Work characteristics, lifestyle and sociodemographic factors have been associated with the risk of work disability, but few multifactorial algorithms exist to identify individuals at risk of future work disability. We developed and validated a parsimonious multifactorial score for the prediction of work disability using individual-level data from 65,775 public-sector employees (development cohort) and 13,527 employed adults from a general population sample (validation cohort), both linked to records of work disability. Candidate predictors for work disability included sociodemographic (3 items), health status and lifestyle (38 items), and work-related (43 items) variables. A parsimonious model, explaining > 99% of the variance of the full model, comprised 8 predictors: age, self-rated health, number of sickness absences in the previous year, socioeconomic position, chronic illnesses, sleep problems, body mass index, and smoking. The discriminative ability of a score including these predictors was high: C-index 0.84 in the development cohort and 0.83 in the validation cohort. The corresponding C-indices for a score constructed from work-related predictors (age, sex, socioeconomic position, job strain) were 0.79 and 0.78, respectively. It is possible to reliably identify individuals at high risk of work disability by using a rapidly administered prediction score.
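
The C-index quoted above measures how often a score ranks a person who had the outcome above a person who did not. The following is a minimal sketch for a plain binary outcome, ignoring censoring and follow-up time (which the study itself may have handled differently). The function name `c_index` and the toy data are hypothetical and are not taken from the paper.

```python
from itertools import product


def c_index(scores, outcomes):
    """Concordance index for a binary outcome (1 = event, 0 = no event).

    Counts the share of (case, non-case) pairs in which the case has the
    higher score; tied scores count as half a concordant pair.
    """
    cases = [s for s, y in zip(scores, outcomes) if y == 1]
    controls = [s for s, y in zip(scores, outcomes) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one non-case")

    concordant = 0.0
    for case, control in product(cases, controls):
        if case > control:
            concordant += 1.0
        elif case == control:
            concordant += 0.5
    return concordant / (len(cases) * len(controls))


# Hypothetical toy data: five people, two of whom had the outcome.
toy_scores = [0.9, 0.8, 0.4, 0.35, 0.2]
toy_outcomes = [1, 0, 1, 0, 0]
toy_c = c_index(toy_scores, toy_outcomes)  # 5 of 6 pairs are concordant
```

A value of 0.5 corresponds to ranking no better than chance and 1.0 to perfect separation, which is the scale on which the reported 0.84 and 0.83 should be read.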

Concepts: Population, Prediction, Validation, Quality of life, Body mass index, Body weight, Socioeconomics, Parsimony


We present a novel, quantitative view on the human athletic performance of individual runners. We obtain a predictor for running performance, a parsimonious model and a training state summary consisting of three numbers by applying modern validation techniques and recent advances in machine learning to the thepowerof10 database of British runners' performances (164,746 individuals, 1,417,432 performances). Our predictor achieves an average out-of-sample prediction error of, for example, 3.6 min on elite Marathon performances and 0.3 seconds on 100 metres performances, and a lower error than the state-of-the-art in performance prediction (30% improvement, RMSE) over a range of distances. We are also the first to report on a systematic comparison of predictors for running performance. Our model has three parameters per runner, and three components which are the same for all runners. The first component of the model corresponds to a power law with a runner-dependent exponent, which achieves a better goodness-of-fit than known power laws in the study of running. Many documented phenomena in quantitative sports science, such as the form of scoring tables, the success of existing prediction methods including Riegel’s formula, the Purdy points scheme, the power law for world record performances and the broken power law for world record speeds may be explained on the basis of our findings in a unified way. We provide strong evidence that the three parameters per runner are related to physiological and behavioural parameters, such as training state, event specialization and age, which allows us to derive novel physiological hypotheses relating to athletic performance. We conjecture on this basis that our findings will be vital in exercise physiology, race planning, the study of aging and training regime design.
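
For orientation, Riegel's formula mentioned above is itself a one-parameter power law, commonly stated as T2 = T1 * (D2 / D1)^1.06. The sketch below implements only that baseline, not the paper's three-parameter model. The function name, the fixed exponent default and the example runner are assumptions made for illustration.

```python
def riegel_predict(t1_seconds, d1_metres, d2_metres, exponent=1.06):
    """Predict the time over d2 from a known time t1 over d1.

    Uses the power law T2 = T1 * (D2 / D1) ** exponent. The default
    exponent of 1.06 is the value commonly quoted for Riegel's formula;
    the abstract's model instead lets the exponent vary per runner.
    """
    if t1_seconds <= 0 or d1_metres <= 0 or d2_metres <= 0:
        raise ValueError("times and distances must be positive")
    return t1_seconds * (d2_metres / d1_metres) ** exponent


def format_hms(seconds):
    """Format a duration in seconds as H:MM:SS, rounded to whole seconds."""
    total = int(round(seconds))
    hours, remainder = divmod(total, 3600)
    minutes, secs = divmod(remainder, 60)
    return f"{hours}:{minutes:02d}:{secs:02d}"


# Hypothetical runner: 10 km in 40:00, extrapolated to the marathon distance.
marathon_seconds = riegel_predict(40 * 60, 10_000, 42_195)
marathon_text = format_hms(marathon_seconds)
```

Because the exponent is fixed, every runner with the same 10 km time receives the same marathon prediction. Making the exponent runner-dependent, as the abstract describes, removes that restriction.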

Concepts: Scientific method, Regression analysis, Physiology, Prediction, Hypothesis, Machine learning, Predictor, Parsimony


We recently analyzed the robustness of competing evolution models developed to identify the root of the Tree of Life: 1) An empirical Sankoff parsimony (ESP) model (Harish and Kurland, 2017), which is a nonstationary and directional evolution model; and 2) An a priori ancestor (APA) model (Kim and Caetano-Anollés, 2011), which is a stationary and reversible evolution model. Both Bayesian model selection tests and maximum parsimony analyses demonstrate that the ESP model is, overwhelmingly, the better model. Moreover, we showed not only that the APA model is sensitive to artifacts, but also that its underlying assumptions are neither empirically grounded nor biologically realistic.

Concepts: Scientific method, Gene, Biology, Life, Bayesian statistics, Parsimony


We introduce a Schelling model in which people are modelled as agents following simple behavioural rules which dictate their tolerance to others, their corresponding preference for particular locations, and in turn their movement through a geographic or social space. Our innovation over previous work is to allow agents to adapt their tolerance to others in response to their local environment, in line with contemporary theories from social psychology. We show that adaptive tolerance leads to a polarization in tolerance levels, with distinct modes at either extreme of the distribution. Moreover, agents self-organize into communities of like tolerance, just as they congregate with those of the same colour. Our results are robust not only to variations in free parameters, but also to experimental treatments in which migrants are dynamically introduced into the native population. We argue that this model provides one possible parsimonious explanation of the political landscape circa 2016.

Concepts: Scientific method, Psychology, Sociology, Philosophy of science, Theory, Motivation, Introduction, Parsimony


This paper discusses whether a matrix containing all the character state combinations that have a fixed number of steps (or extra steps) on a given tree T yields the same tree T when analyzed with maximum parsimony or maximum likelihood. Exhaustive enumeration of cases up to 20 taxa for binary characters, and up to 12 taxa for 4-state characters, shows that the same tree is recovered (as the unique most likely or most parsimonious tree) as long as the number of extra steps is within ¼ of the number of taxa. This dependence, ¼ of the number of taxa, is discussed with a general argument in terms of the spread of the character changes on the tree used to select character state distributions. The present finding allows the creation of matrices that have as much homoplasy as possible while the most parsimonious or likely tree remains predictable, and examination of these matrices with hill-climbing search algorithms provides additional evidence on the (lack of a) necessary relationship between homoplasy and the ability of search methods to find optimal trees.
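
To make "steps" and "extra steps" concrete, the sketch below counts the minimum number of state changes a single character needs on a fixed rooted binary tree, using Fitch's algorithm for unordered characters. The nested-tuple tree encoding, the function names and the four-taxon example are assumptions for illustration and are not taken from the paper.

```python
def fitch_steps(tree, states):
    """Return (state_set, steps) for one character on a rooted binary tree.

    `tree` is either a leaf name (str) or a 2-tuple of subtrees;
    `states` maps each leaf name to its observed character state.
    """
    if isinstance(tree, str):
        return {states[tree]}, 0

    left, right = tree
    left_set, left_steps = fitch_steps(left, states)
    right_set, right_steps = fitch_steps(right, states)

    common = left_set & right_set
    if common:
        # The children can share a state, so no change is needed here.
        return common, left_steps + right_steps
    # The children disagree: take the union and count one change.
    return left_set | right_set, left_steps + right_steps + 1


def extra_steps(tree, states):
    """Steps beyond the minimum possible (number of observed states - 1)."""
    _, steps = fitch_steps(tree, states)
    return steps - (len(set(states.values())) - 1)


# Hypothetical four-taxon tree ((A,B),(C,D)) with two binary characters.
toy_tree = (("A", "B"), ("C", "D"))
clean_char = {"A": 0, "B": 0, "C": 1, "D": 1}  # fits the tree: 1 step
noisy_char = {"A": 0, "B": 1, "C": 0, "D": 1}  # homoplastic: 2 steps

clean_steps = fitch_steps(toy_tree, clean_char)[1]
noisy_steps = fitch_steps(toy_tree, noisy_char)[1]
noisy_extra = extra_steps(toy_tree, noisy_char)
```

A matrix of the kind discussed above would contain every state assignment whose step count (or extra-step count) on T equals a chosen value.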

Concepts: Present, Maximum likelihood, Tree, Phylogenetic tree, Cladistics, Computational phylogenetics, Matrix, Parsimony


In recent years, a growing body of research has used metaheuristics to infer phylogenies, which is an NP-hard problem. Maximum Parsimony and Maximum Likelihood are two effective ways to conduct inference. Based on these methods, which can also be considered optimality criteria for phylogenies, various kinds of multi-objective metaheuristics have been used to reconstruct phylogenies. However, combining these two time-consuming methods makes multi-objective metaheuristics slower than single-objective ones. Therefore, we propose a novel multi-objective optimization algorithm, MOEA-RC, to accelerate phylogeny reconstruction using structural information of elites in current populations. We compare MOEA-RC with two representative multi-objective algorithms, MOEA/D and NSGA-II, and a non-consensus version of MOEA-RC on three real-world datasets. The results show that, within a given number of iterations, MOEA-RC achieves better solutions than the other algorithms.
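
When parsimony and likelihood are optimised together, candidate trees are compared by Pareto dominance rather than by a single score. The sketch below shows only that comparison, not MOEA-RC or any of the other algorithms named above. It assumes each candidate tree has already been scored as a pair (parsimony steps, negative log-likelihood), both to be minimised, and the scores themselves are made up.

```python
def dominates(a, b):
    """True if score tuple `a` is no worse than `b` on every objective
    and strictly better on at least one (all objectives are minimised)."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def pareto_front(candidates):
    """Keep the candidates that no other candidate dominates."""
    return [c for c in candidates
            if not any(dominates(other, c) for other in candidates)]


# Hypothetical scores: (parsimony steps, negative log-likelihood).
toy_scores = [
    (120, 1502.3),
    (118, 1510.0),
    (125, 1500.1),
    (121, 1505.0),  # dominated by (120, 1502.3)
    (118, 1512.4),  # dominated by (118, 1510.0)
]
toy_front = pareto_front(toy_scores)
```

The surviving trees represent different trade-offs between the two criteria. A multi-objective search returns such a set instead of one best tree.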

Concepts: Algorithm, Operations research, Optimization, Computational complexity theory, Inference, Tabu search, Parsimony, NP-hard


The protracted speciation model presents a realistic and parsimonious explanation for the observed slowdown in lineage accumulation through time, by accounting for the fact that speciation takes time. A method to compute the likelihood for this model given a phylogeny is available and allows estimation of its parameters (rate of initiation of speciation, rate of completion of speciation, and extinction rate) and statistical comparison of this model to other proposed models of diversification. However this likelihood computation method makes an approximation of the protracted speciation model to be mathematically tractable: it sometimes counts fewer species than one would do from a biological perspective. This approximation may have large consequences for likelihood-based inferences: it may render any conclusions based on this method completely irrelevant. Here we study to what extent this approximation affects parameter estimations. We simulated phylogenies from which we reconstructed the tree of extant species according to the original, biologically meaningful protracted speciation model and according to the approximation. We then compared the resulting parameter estimates. We found that the differences were larger for high values of extinction rates and small values of speciation-completion rates. Indeed, a long speciation-completion time and a high extinction rate promote the appearance of cases to which the approximation applies. However, surprisingly, the deviation introduced is largely negligible over the parameter space explored, suggesting that this approximate likelihood can be applied reliably in practice to estimate biologically relevant parameters under the original protracted speciation model.

Concepts: Evolution, Estimator, Biology, Species, Approximation, Estimation, Extinction, Parsimony


Occam’s razor, the idea that, all else being equal, we should pick the simpler hypothesis, plays a prominent role in ordinary and scientific inference. But why are simpler hypotheses better? One attractive hypothesis known as Bayesian Occam’s razor (BOR) is that more complex hypotheses tend to be more flexible (they can accommodate a wider range of possible data) and that flexibility is automatically penalized by Bayesian inference. In two experiments, we provide evidence that people’s intuitive probabilistic and explanatory judgments follow the prescriptions of BOR. In particular, people’s judgments are consistent with the two most distinctive characteristics of BOR: They penalize hypotheses as a function not only of their numbers of free parameters but also of the size of the parameter space, and they penalize those hypotheses even when their parameters can be “tuned” to fit the data better than comparatively simpler hypotheses.
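
The penalty described above can be seen in a standard textbook case that is not taken from the experiments: a coin that is either fair (no free parameter) or has an unknown bias with a uniform prior on [0, 1] (one free parameter). The sketch below compares the marginal likelihood of each hypothesis for k heads in n flips. The function names are hypothetical.

```python
from math import comb


def evidence_fair(k, n):
    """Marginal likelihood of k heads in n flips under a fair coin."""
    return comb(n, k) * 0.5 ** n


def evidence_flexible(k, n):
    """Marginal likelihood when the bias p has a uniform prior on [0, 1].

    Integrating the binomial likelihood over p gives 1 / (n + 1) for
    every k: the flexible model spreads its probability over all outcomes.
    """
    return 1.0 / (n + 1)


def best_fit_likelihood(k, n):
    """Likelihood of the flexible model at its best-fitting bias p = k / n."""
    p = k / n
    return comb(n, k) * p ** k * (1 - p) ** (n - k)


def bayes_factor(k, n):
    """Evidence for the fair coin relative to the flexible one."""
    return evidence_fair(k, n) / evidence_flexible(k, n)


# With 6 heads in 10 flips, the tuned flexible model fits better than the
# fair coin, yet its marginal likelihood is lower.
tuned_fit = best_fit_likelihood(6, 10)
fair_evidence = evidence_fair(6, 10)
flexible_evidence = evidence_flexible(6, 10)
```

Only when the data are far from what the simple hypothesis predicts, for example 9 heads in 10 flips, does the Bayes factor drop below 1 and favour the flexible hypothesis.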

Concepts: Scientific method, Experiment, Theory, C, Falsifiability, Abstraction, Parsimony, Occam's razor


A new parametric approach is proposed for nonlinear and nonstationary system identification based on a time-varying nonlinear autoregressive with exogenous input (TV-NARX) model. The TV coefficients of the TV-NARX model are expanded using multiwavelet basis functions, and the model is thus transformed into a time-invariant regression problem. An ultra-orthogonal forward regression (UOFR) algorithm aided by mutual information (MI) is designed to identify a parsimonious model structure and estimate the associated model parameters. The UOFR-MI algorithm, which uses not only the observed data themselves but also weak derivatives of the signals, is more powerful in model structure detection. The proposed approach, which combines the advantages of the basis function expansion method and the UOFR-MI algorithm, is shown to be capable of tracking changes in the TV parameters effectively in both numerical simulations and real EEG data.

Concepts: Mathematics, Philosophy of science, Set theory, Derivative, Parameter, Subroutine, Binary relation, Parsimony