### Concept: Parsimony

#### 24

##### Strawberry Fruit Rot Caused by Neopestalotiopsis iranensis sp. nov., and N. mesopotamica

- Current microbiology
- Published over 4 years ago

A new species of Neopestalotiopsis is described on the basis of both morphological and molecular characteristics. Neopestalotiopsis iranensis sp. nov. was isolated from rotted strawberry (Fragaria ananassa) fruits as well as from stolon and leaf lesions in Kurdistan province, Iran. Initially, light tan, sunken spots developed on fruits and resulted in a soft decay of the fruit flesh. The new species is morphologically distinguished from similar species by its conidium size, its longer apical appendages, and the presence of some knobbed basal appendages. Phylogenetic analyses (Bayesian inference, maximum likelihood, and maximum parsimony) of combined internal transcribed spacer, β-tubulin, and partial translation elongation factor 1-alpha gene sequences also indicated that this species is phylogenetically distinct from others. Moreover, strawberry is reported here as a new host for N. mesopotamica.

#### 5

Predicting the lineage dynamics of influenza B viruses for the next season is one of the largest obstacles to constructing an appropriate trivalent influenza vaccine. Seasonal fluctuation of transmissibility and epidemiological interference between the two major influenza B lineages make the lineage dynamics complicated. Here we construct a parsimonious model describing the lineage dynamics while taking into account seasonal fluctuation of transmissibility and epidemiological interference. Using this model, we estimated the epidemiological and evolutionary parameters from time-series data of the lineage-specific isolates in Japan from the 2010-2011 season to the 2014-2015 season. The basic reproduction number is similar between Victoria and Yamagata, with a minimum value during one year of 0.82 (95% highest posterior density (HPD): 0.77-0.87) for Yamagata and 0.83 (95% HPD: 0.74-0.92) for Victoria; the amplitude of seasonal variation of the basic reproduction number is 0.77 (95% HPD: 0.66-0.87) for Yamagata and 1.05 (95% HPD: 0.89-1.02) for Victoria. The duration for which the acquired immunity is effective against infection by the Yamagata lineage is shorter than the acquired immunity for Victoria, 424.1 days (95% HPD: 317.4-561.5 days). The reduction rate of susceptibility due to immune cross-reaction is 0.51 (95% HPD: 0.084-0.92) for the immunity obtained from infection with Yamagata against infection with Victoria and 0.62 (95% HPD: 0.42-0.80) for the immunity obtained from infection with Victoria against infection with Yamagata. Using the estimated parameters, we predicted the dominant lineage in the 2015-2016 season. The accuracy of this prediction is 68.8% if the emergence timings of the two lineages are known and 61.4% if the emergence timings are unknown. The estimated seasonal variation of the lineage-specific reproduction number can narrow down the range of emergence timing, with an accuracy of 64.6% if the emergence times are assumed to be the time at which the estimated reproduction number exceeds one.

#### 1

##### Development and validation of a risk prediction model for work disability: multicohort study

- OPEN
- Scientific reports
- Published over 2 years ago

Work disability affects quality of life, earnings, and opportunities to contribute to society. Work characteristics, lifestyle and sociodemographic factors have been associated with the risk of work disability, but few multifactorial algorithms exist to identify individuals at risk of future work disability. We developed and validated a parsimonious multifactorial score for the prediction of work disability using individual-level data from 65,775 public-sector employees (development cohort) and 13,527 employed adults from a general population sample (validation cohort), both linked to records of work disability. Candidate predictors for work disability included sociodemographic (3 items), health status and lifestyle (38 items), and work-related (43 items) variables. A parsimonious model, explaining > 99% of the variance of the full model, comprised 8 predictors: age, self-rated health, number of sickness absences in previous year, socioeconomic position, chronic illnesses, sleep problems, body mass index, and smoking. Discriminative ability of a score including these predictors was high: C-index 0.84 in the development and 0.83 in the validation cohort. The corresponding C-indices for a score constructed from work-related predictors (age, sex, socioeconomic position, job strain) were 0.79 and 0.78, respectively. It is possible to identify reliably individuals at high risk of work disability by using a rapidly-administered prediction score.
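The C-index quoted above measures how often a risk score ranks a person who develops the outcome above a person who does not. The following is a minimal sketch for a binary outcome only; the study itself may have used a time-to-event version, and the function name and toy data here are hypothetical illustrations, not taken from the paper.

```python
from typing import Sequence


def c_index(scores: Sequence[float], outcomes: Sequence[int]) -> float:
    """Concordance for a binary outcome (1 = event, 0 = no event).

    Returns the fraction of (event, non-event) pairs in which the event
    case has the higher score; tied scores count as half concordant.
    """
    cases = [s for s, y in zip(scores, outcomes) if y == 1]
    controls = [s for s, y in zip(scores, outcomes) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one event and one non-event")

    concordant = 0.0
    for case_score in cases:
        for control_score in controls:
            if case_score > control_score:
                concordant += 1.0
            elif case_score == control_score:
                concordant += 0.5
    return concordant / (len(cases) * len(controls))


# Toy data (hypothetical): two people with the outcome, two without.
toy_scores = [0.9, 0.8, 0.3, 0.2]
toy_outcomes = [1, 0, 1, 0]
toy_c = c_index(toy_scores, toy_outcomes)
```

A value of 0.5 corresponds to ranking no better than chance and 1.0 to perfect ranking, which is the scale on which the reported 0.84 and 0.83 should be read.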

#### 1

We present a novel, quantitative view on the human athletic performance of individual runners. We obtain a predictor for running performance, a parsimonious model and a training state summary consisting of three numbers by application of modern validation techniques and recent advances in machine learning to the thepowerof10 database of British runners' performances (164,746 individuals, 1,417,432 performances). Our predictor achieves an average prediction error (out-of-sample) of e.g. 3.6 min on elite Marathon performances and 0.3 seconds on 100 metres performances, and a lower error than the state-of-the-art in performance prediction (30% improvement, RMSE) over a range of distances. We are also the first to report on a systematic comparison of predictors for running performance. Our model has three parameters per runner, and three components which are the same for all runners. The first component of the model corresponds to a power law with exponent dependent on the runner which achieves a better goodness-of-fit than known power laws in the study of running. Many documented phenomena in quantitative sports science, such as the form of scoring tables, the success of existing prediction methods including Riegel’s formula, the Purdy points scheme, the power law for world records performances and the broken power law for world record speeds may be explained on the basis of our findings in a unified way. We provide strong evidence that the three parameters per runner are related to physiological and behavioural parameters, such as training state, event specialization and age, which allows us to derive novel physiological hypotheses relating to athletic performance. We conjecture on this basis that our findings will be vital in exercise physiology, race planning, the study of aging and training regime design.
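Riegel's formula, mentioned above as one of the existing prediction methods, is a power law that scales a known race time to another distance. The sketch below assumes the commonly quoted exponent of 1.06 and exposes it as a parameter, in the spirit of the abstract's runner-dependent exponent; it is not the three-parameter model of the paper, and the function name is a hypothetical choice.

```python
def riegel_predict(known_time_s: float,
                   known_distance_m: float,
                   target_distance_m: float,
                   exponent: float = 1.06) -> float:
    """Predict a race time with the power law T2 = T1 * (D2 / D1) ** exponent.

    exponent = 1.06 is the value usually quoted for Riegel's formula;
    passing another value mimics a runner-specific exponent.
    """
    if known_distance_m <= 0 or target_distance_m <= 0:
        raise ValueError("distances must be positive")
    return known_time_s * (target_distance_m / known_distance_m) ** exponent


# Hypothetical runner: 10 km in 40 minutes, extrapolated to the marathon.
marathon_s = riegel_predict(40 * 60, 10_000, 42_195)
```

An exponent above 1 encodes that average pace slows as distance grows; a larger exponent means a steeper slowdown for that runner.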

#### 0

##### Reply to Caetano-Anollés et al. comment on “Empirical genome evolution models root the tree of life”

We recently analyzed the robustness of competing evolution models developed to identify the root of the Tree of Life: 1) An empirical Sankoff parsimony (ESP) model (Harish and Kurland, 2017), which is a nonstationary and directional evolution model; and 2) An a priori ancestor (APA) model (Kim and Caetano-Anollés, 2011) that is a stationary and reversible evolution model. Both Bayesian model selection tests as well as maximum parsimony analyses demonstrate that the ESP model is, overwhelmingly, the better model. Moreover, we showed that the APA model is not only sensitive to artifacts, but also that the underlying assumptions are neither empirically grounded nor biologically realistic.

#### 0

We introduce a Schelling model in which people are modelled as agents following simple behavioural rules which dictate their tolerance to others, their corresponding preference for particular locations, and in turn their movement through a geographic or social space. Our innovation over previous work is to allow agents to adapt their tolerance to others in response to their local environment, in line with contemporary theories from social psychology. We show that adaptive tolerance leads to a polarization in tolerance levels, with distinct modes at either extreme of the distribution. Moreover, agents self-organize into communities of like-tolerance, just as they congregate with those of same colour. Our results are robust not only to variations in free parameters, but also experimental treatments in which migrants are dynamically introduced into the native population. We argue that this model provides one possible parsimonious explanation of the political landscape circa 2016.

#### 0

##### On Defining a Unique Phylogenetic Tree with Homoplastic Characters

- Molecular phylogenetics and evolution
- Published over 2 years ago

This paper discusses the problem of whether creating a matrix with all the character state combinations that have a fixed number of steps (or extra steps) on a given tree T, produces the same tree T when analyzed with maximum parsimony or maximum likelihood. Exhaustive enumeration of cases up to 20 taxa for binary characters, and up to 12 taxa for 4-state characters, shows that the same tree is recovered (as unique most likely or most parsimonious tree) as long as the number of extra steps is within ¼ of the number of taxa. This dependence, ¼ of the number of taxa, is discussed with a general argumentation, in terms of the spread of the character changes on the tree used to select character state distributions. The present finding allows creating matrices which have as much homoplasy as possible for the most parsimonious or likely tree to be predictable, and examination of these matrices with hill-climbing search algorithms provides additional evidence on the (lack of a) necessary relationship between homoplasy and the ability of search methods to find optimal trees.
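The "number of steps" of a character on a tree can be counted with the Fitch small-parsimony algorithm for unordered characters. The sketch below assumes a rooted binary tree written as nested tuples and uses a hypothetical four-taxon example; it illustrates step counting and extra steps (homoplasy) only, not the matrix construction studied in the paper.

```python
def fitch(tree, states):
    """Return (possible_states, steps) for one character on a binary tree.

    tree   : a leaf name (str) or a pair (left_subtree, right_subtree)
    states : dict mapping leaf name -> observed character state
    """
    if isinstance(tree, str):
        return {states[tree]}, 0
    left, right = tree
    left_set, left_steps = fitch(left, states)
    right_set, right_steps = fitch(right, states)
    shared = left_set & right_set
    if shared:
        # The children agree on at least one state: no change needed here.
        return shared, left_steps + right_steps
    # The children disagree: one change is required at this node.
    return left_set | right_set, left_steps + right_steps + 1


def extra_steps(tree, states):
    """Steps beyond the minimum possible (number of distinct states - 1)."""
    _, steps = fitch(tree, states)
    return steps - (len(set(states.values())) - 1)


# Hypothetical example tree ((A,B),(C,D)).
example_tree = (("A", "B"), ("C", "D"))
congruent = {"A": 0, "B": 0, "C": 1, "D": 1}    # fits the tree
homoplastic = {"A": 0, "B": 1, "C": 0, "D": 1}  # conflicts with the tree
```

Here `congruent` needs a single step, while `homoplastic` needs two, so it carries one extra step on this tree.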

#### 0

##### Using MOEA with Redistribution and Consensus Branches to Infer Phylogenies

- OPEN
- International journal of molecular sciences
- Published over 2 years ago

Inferring phylogenies is an NP-hard problem, and in recent years more and more research has focused on using metaheuristics to solve it. Maximum Parsimony and Maximum Likelihood are two effective ways to conduct inference. Based on these methods, which can also be considered as optimality criteria for phylogenies, various kinds of multi-objective metaheuristics have been used to reconstruct phylogenies. However, combining these two time-consuming methods makes multi-objective metaheuristics slower than single-objective ones. Therefore, we propose a novel multi-objective optimization algorithm, MOEA-RC, to accelerate the process of rebuilding phylogenies using structural information of elites in current populations. We compare MOEA-RC with two representative multi-objective algorithms, MOEA/D and NSGA-II, and a non-consensus version of MOEA-RC on three real-world datasets. The results show that, within a given number of iterations, MOEA-RC achieves better solutions than the other algorithms.

#### 0

##### Robustness of the Approximate Likelihood of the Protracted Speciation Model

- Journal of evolutionary biology
- Published over 2 years ago

The protracted speciation model presents a realistic and parsimonious explanation for the observed slowdown in lineage accumulation through time, by accounting for the fact that speciation takes time. A method to compute the likelihood for this model given a phylogeny is available and allows estimation of its parameters (rate of initiation of speciation, rate of completion of speciation, and extinction rate) and statistical comparison of this model to other proposed models of diversification. However, this likelihood computation method makes an approximation of the protracted speciation model to be mathematically tractable: it sometimes counts fewer species than one would do from a biological perspective. This approximation may have large consequences for likelihood-based inferences: it may render any conclusions based on this method completely irrelevant. Here we study to what extent this approximation affects parameter estimations. We simulated phylogenies from which we reconstructed the tree of extant species according to the original, biologically meaningful protracted speciation model and according to the approximation. We then compared the resulting parameter estimates. We found that the differences were larger for high values of extinction rates and small values of speciation-completion rates. Indeed, a long speciation-completion time and a high extinction rate promote the appearance of cases to which the approximation applies. However, surprisingly, the deviation introduced is largely negligible over the parameter space explored, suggesting that this approximate likelihood can be applied reliably in practice to estimate biologically relevant parameters under the original protracted speciation model.

#### 0

##### Bayesian Occam’s Razor Is a Razor of the People

- Cognitive science
- Published over 2 years ago

Occam’s razor (the idea that, all else being equal, we should pick the simpler hypothesis) plays a prominent role in ordinary and scientific inference. But why are simpler hypotheses better? One attractive hypothesis, known as Bayesian Occam’s razor (BOR), is that more complex hypotheses tend to be more flexible (they can accommodate a wider range of possible data) and that flexibility is automatically penalized by Bayesian inference. In two experiments, we provide evidence that people’s intuitive probabilistic and explanatory judgments follow the prescriptions of BOR. In particular, people’s judgments are consistent with the two most distinctive characteristics of BOR: They penalize hypotheses as a function not only of their numbers of free parameters but also as a function of the size of the parameter space, and they penalize those hypotheses even when their parameters can be “tuned” to fit the data better than comparatively simpler hypotheses.
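The automatic penalty on flexibility can be seen in a small worked example. The sketch below is an illustrative assumption, not the experimental design of the paper: a simple hypothesis fixes a coin's heads probability at 0.5, while a flexible hypothesis lets it be any value in [0, 1] under a uniform prior. Both are scored by the marginal likelihood of one specific sequence with k heads in n tosses.

```python
from math import factorial


def marginal_simple(n: int) -> float:
    """P(sequence | fair coin): every sequence of n tosses has probability 0.5 ** n."""
    return 0.5 ** n


def marginal_flexible(n: int, k: int) -> float:
    """P(sequence | p ~ Uniform(0, 1)).

    The integral of p**k * (1 - p)**(n - k) over [0, 1] equals
    k! * (n - k)! / (n + 1)!.
    """
    return factorial(k) * factorial(n - k) / factorial(n + 1)


def bayes_factor_simple_vs_flexible(n: int, k: int) -> float:
    """Values above 1 favour the simple (fair coin) hypothesis."""
    return marginal_simple(n) / marginal_flexible(n, k)


balanced = bayes_factor_simple_vs_flexible(10, 5)  # 5 heads in 10 tosses
lopsided = bayes_factor_simple_vs_flexible(10, 9)  # 9 heads in 10 tosses
```

With 5 heads in 10 tosses the flexible hypothesis can be tuned to p = 0.5 and fit exactly as well, yet the Bayes factor is about 2.7 in favour of the simple hypothesis, because the flexible one spreads its prior over many values of p that fit poorly. With 9 heads in 10 the factor drops to about 0.11, so the data now favour the flexible hypothesis.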