Concept: Bayesian network
BACKGROUND: Dynamic Bayesian network (DBN) is among the mainstream approaches for modeling various biological networks, including the gene regulatory network (GRN). Most current methods for learning DBN employ either local search such as hill-climbing, or a meta stochastic global optimization framework such as genetic algorithm or simulated annealing, which are only able to locate sub-optimal solutions. Further, current DBN applications have essentially been limited to small sized networks. RESULTS: To overcome the above difficulties, we introduce here a deterministic global optimization based DBN approach for reverse engineering genetic networks from time course gene expression data. For such DBN models that consist only of inter time slice arcs, we show that there exists a polynomial time algorithm for learning the globally optimal network structure. The proposed approach, named GlobalMIT+, employs the recently proposed information theoretic scoring metric named mutual information test (MIT). GlobalMIT+ is able to learn high-order time delayed genetic interactions, which are common to most biological systems. Evaluation of the approach using both synthetic and real data sets, including a 733 cyanobacterial gene expression data set, shows significantly improved performance over other techniques. CONCLUSIONS: Our studies demonstrate that deterministic global optimization approaches can infer large scale genetic networks.
Hierarchical generative models, such as Bayesian networks, and belief propagation have been shown to provide a theoretical framework that can account for perceptual processes, including feedforward recognition and feedback modulation. The framework explains both psychophysical and physiological experimental data and maps well onto the hierarchical distributed cortical anatomy. However, the complexity required to model cortical processes makes inference, even using approximate methods, very computationally expensive. Thus, existing object perception models based on this approach are typically limited to tree-structured networks with no loops, use small toy examples or fail to account for certain perceptual aspects such as invariance to transformations or feedback reconstruction. In this study we develop a Bayesian network with an architecture similar to that of HMAX, a biologically-inspired hierarchical model of object recognition, and use loopy belief propagation to approximate the model operations (selectivity and invariance). Crucially, the resulting Bayesian network extends the functionality of HMAX by including top-down recursive feedback. Thus, the proposed model not only achieves successful feedforward recognition invariant to noise, occlusions, and changes in position and size, but is also able to reproduce modulatory effects such as illusory contour completion and attention. Our novel and rigorous methodology covers key aspects such as learning using a layerwise greedy algorithm, combining feedback information from multiple parents and reducing the number of operations required. Overall, this work extends an established model of object recognition to include high-level feedback modulation, based on state-of-the-art probabilistic approaches. The methodology employed, consistent with evidence from the visual cortex, can be potentially generalized to build models of hierarchical perceptual organization that include top-down and bottom-up interactions, for example, in other sensory modalities.
Previous studies indicated that the quality of tropical composts is poorer than that of composts produced in temperate regions. The aim of this study was to test the type of manure, the use of co-composting with green waste, and the stabilization method for their ability to improve compost quality in the tropics. We produced 68 composts and vermicomposts that were analysed for their C, lignin and NPK contents throughout the composting process. Bayesian networks were used to assess the mechanisms controlling compost quality. The concentration effect, for C and lignin, and the initial blend quality, for NPK content, were the main factors affecting compost quality. Cattle manure composts presented the highest C and lignin contents, and poultry litter composts exhibited the highest NPK content. Co-composting improved quality by enhancing the concentration effect, which reduced the impact of C and nutrient losses. Vermicomposting did not improve compost quality; co-composting without earthworms thus appears to be a suitable stabilization method under the conditions of this study because it produced high quality composts and is easier to implement.
Bayesian networks are probabilistic graphical models suitable for modeling several kinds of biological systems. In many cases, the structure of a Bayesian network represents causal molecular mechanisms or statistical associations of the underlying system. Bayesian networks have been applied, for example, for inferring the structure of many biological networks from experimental data. We present some recent progress in learning the structure of static and dynamic Bayesian networks from data.
Demand for clinical decision support systems in medicine and self-diagnostic symptom checkers has substantially increased in recent years. Existing platforms rely on knowledge bases manually compiled through a labor-intensive process or automatically derived using simple pairwise statistics. This study explored an automated process to learn high quality knowledge bases linking diseases and symptoms directly from electronic medical records. Medical concepts were extracted from 273,174 de-identified patient records and maximum likelihood estimation of three probabilistic models was used to automatically construct knowledge graphs: logistic regression, naive Bayes classifier and a Bayesian network using noisy OR gates. A graph of disease-symptom relationships was elicited from the learned parameters and the constructed knowledge graphs were evaluated and validated, with permission, against Google’s manually-constructed knowledge graph and against expert physician opinions. Our study shows that direct and automated construction of high quality health knowledge graphs from medical records using rudimentary concept extraction is feasible. The noisy OR model produces a high quality knowledge graph reaching precision of 0.85 for a recall of 0.6 in the clinical evaluation. Noisy OR significantly outperforms all tested models across evaluation frameworks (p < 0.01).
In this article, we offer an artificial intelligence method to estimate the carotid-femoral Pulse Wave Velocity (PWV) non-invasively from one uncalibrated carotid waveform measured by tonometry and few routine clinical variables. Since the signal processing inputs to this machine learning algorithm are sensor agnostic, the presented method can accompany any medical instrument that provides a calibrated or uncalibrated carotid pressure waveform. Our results show that, for an unseen hold back test set population in the age range of 20 to 69, our model can estimate PWV with a Root-Mean-Square Error (RMSE) of 1.12 m/sec compared to the reference method. The results convey the fact that this model is a reliable surrogate of PWV. Our study also showed that estimated PWV was significantly associated with an increased risk of CVDs.
The purpose of this study was to identify personality factor-associated predictors of smartphone addiction predisposition (SAP). Participants were 2,573 men and 2,281 women (n = 4,854) aged 20-49 years (Mean ± SD: 33.47 ± 7.52); participants completed the following questionnaires: the Korean Smartphone Addiction Proneness Scale (K-SAPS) for adults, the Behavioral Inhibition System/Behavioral Activation System questionnaire (BIS/BAS), the Dickman Dysfunctional Impulsivity Instrument (DDII), and the Brief Self-Control Scale (BSCS). In addition, participants reported their demographic information and smartphone usage pattern (weekday or weekend average usage hours and main use). We analyzed the data in three steps: (1) identifying predictors with logistic regression, (2) deriving causal relationships between SAP and its predictors using a Bayesian belief network (BN), and (3) computing optimal cut-off points for the identified predictors using the Youden index. Identified predictors of SAP were as follows: gender (female), weekend average usage hours, and scores on BAS-Drive, BAS-Reward Responsiveness, DDII, and BSCS. Female gender and scores on BAS-Drive and BSCS directly increased SAP. BAS-Reward Responsiveness and DDII indirectly increased SAP. We found that SAP was defined with maximal sensitivity as follows: weekend average usage hours > 4.45, BAS-Drive > 10.0, BAS-Reward Responsiveness > 13.8, DDII > 4.5, and BSCS > 37.4. This study raises the possibility that personality factors contribute to SAP. And, we calculated cut-off points for key predictors. These findings may assist clinicians screening for SAP using cut-off points, and further the understanding of SA risk factors.
Monte Carlo (MC)-based refinement software to analyze the atomic arrangements of perovskite oxide ultrathin films from the crystal truncation rod intensity is developed on the basis of Bayesian inference. The advantages of the MC approach are (i) it is applicable to multi-domain structures, (ii) it provides the posterior probability of structures through Bayes' theorem, which allows one to evaluate the uncertainty of estimated structural parameters, and (iii) one can involve any information provided by other experiments and theories. The simulated annealing procedure efficiently searches for the optimum model owing to its stochastic updates, regardless of the initial values, without being trapped by local optima. The performance of the software is examined with a five-unit-cell-thick LaAlO3 film fabricated on top of SrTiO3. The software successfully found the global optima from an initial model prepared by a small grid search calculation. The standard deviations of the atomic positions derived from a dataset taken at a second-generation synchrotron are ±0.02 Å for metal sites and ±0.03 Å for oxygen sites.
Programs for Bayesian inference of phylogeny currently implement a unique and ﬁxed suite of models. Consequently, users of these software packages are simultaneously forced to use a number of programs for a given study, while also lacking the freedom to explore models that have not been implemented by the developers of those programs. We developed a new open-source software package, RevBayes, to address these problems. RevBayes is entirely based on probabilistic graphical models, a powerful generic framework for specifying and analyzing statistical models. Phylogenetic-graphical models can be speciﬁed interactively in RevBayes, piece by piece, using a new succinct and intuitive language called Rev. Rev is similar to the R language and the BUGS model-speciﬁcation language, and should be easy to learn for most users. The strength of RevBayes is the simplicity with which one can design, specify, and implement new and complex models. Fortunately, this tremendous ﬂexibility does not come at the cost of slower computation; as we demonstrate, RevBayes outperforms competing software for several standard analyses. Compared with other programs, RevBayes has fewer black-box elements. Users need to explicitly specify each part of the model and analysis. Although this explicitness may initially be unfamiliar, we are convinced that this transparency will improve understanding of phylogenetic models in our ﬁeld. Moreover, it will motivate the search for improvements to existing methods by brazenly exposing the model choices that we make to critical scrutiny. RevBayes is freely available at http://www.RevBayes.com [Bayesian inference; Graphical models; MCMC; statistical phylogenetics.].
Neuroimaging increasingly exploits machine learning techniques in an attempt to achieve clinically relevant single-subject predictions. An alternative to machine learning, which tries to establish predictive links between features of the observed data and clinical variables, is the deployment of computational models for inferring on the (patho)physiological and cognitive mechanisms that generate behavioural and neuroimaging responses. This paper discusses the rationale behind a computational approach to neuroimaging-based single-subject inference; focusing on its potential for characterising disease mechanisms in individual subjects and mapping these characterisations to clinical predictions. Following an overview of two main approaches - Bayesian model selection and generative embedding - which can link computational models to individual predictions, we review how these methods accommodate heterogeneity in psychiatric and neurological spectrum disorders, help avoid erroneous interpretations of neuroimaging data, and establish a link between a mechanistic, model-based approach and the statistical perspectives afforded by machine learning.