SciCombinator

Discover the most talked about and latest scientific content & concepts.

Concept: Markup language

26

Does PubMed Central, a government-run digital archive of biomedical articles, compete with scientific society journals? A longitudinal, retrospective cohort analysis of 13,223 articles (5,999 treatment, 7,224 control) published in 14 society-run biomedical research journals in nutrition, experimental biology, physiology, and radiology between February 2008 and January 2011 reveals a 21.4% reduction in full-text hypertext markup language (HTML) article downloads and a 13.8% reduction in portable document format (PDF) article downloads from the journals' websites when U.S. National Institutes of Health-sponsored articles (treatment) become freely available from the PubMed Central repository. In addition, the effect of PubMed Central on reducing PDF article downloads is increasing over time, growing at a rate of 1.6% per year. There was no longitudinal effect for full-text HTML downloads. While PubMed Central may be providing complementary access to readers traditionally underserved by scientific journals, the loss of article readership from the journal website may weaken the ability of the journal to build communities of interest around research papers, impede the communication of news and events to scientific society members and journal readers, and reduce the perceived value of the journal to institutional subscribers. Davis, P. M. Public accessibility of biomedical articles from PubMed Central reduces journal readership: retrospective cohort analysis.

Concepts: Cohort study, Open access, XML, Markup language, Wiki, HTML, XHTML, DocBook

10

The number of image analysis tools supporting the extraction of architectural features of root systems has increased in recent years. These tools offer a handy set of complementary facilities, yet it is widely accepted that none of them can efficiently extract the growing array of static and dynamic features for different types of images and species. We describe the Root System Markup Language (RSML), which has been designed to overcome two major challenges: (i) to enable portability of root architecture data between different software tools in an easy and interoperable manner, allowing seamless collaborative work, and (ii) to provide a standard format upon which to base the central repositories that will soon arise following the expanding worldwide root phenotyping effort. RSML follows the XML standard to store 2D or 3D image metadata, plant and root properties and geometries, continuous functions along individual root paths, and a suite of annotations at the image, plant or root scales, at one or several time points. Plant ontologies are used to describe botanical entities that are relevant at the scale of root system architecture. An XML schema describes the features and constraints of RSML, and open-source packages have been developed in several languages (R, Excel, Java, Python, C#) to enable researchers to integrate RSML files into popular research workflows.
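As a rough illustration of the kind of XML document this abstract describes, the Python sketch below parses a small RSML-like fragment with the standard library and computes the length of each root's path. The element and attribute names used here (`scene`, `plant`, `root`, `geometry`, `polyline`, `point`) and the sample data are assumptions made for illustration; the published schema should be treated as the authority.

```python
import math
import xml.etree.ElementTree as ET

# Hypothetical RSML-like fragment; names and values are illustrative assumptions.
SAMPLE = """
<rsml>
  <metadata><unit>cm</unit></metadata>
  <scene>
    <plant id="plant1">
      <root id="primary">
        <geometry>
          <polyline>
            <point x="0" y="0"/>
            <point x="0" y="3"/>
            <point x="4" y="3"/>
          </polyline>
        </geometry>
      </root>
    </plant>
  </scene>
</rsml>
"""


def root_lengths(xml_text):
    """Return {root id: path length} for every <root> element in the document."""
    tree = ET.fromstring(xml_text.strip())
    lengths = {}
    for root in tree.iter("root"):
        # Only the points belonging directly to this root, not to nested laterals.
        points = [(float(p.get("x")), float(p.get("y")))
                  for p in root.findall("./geometry/polyline/point")]
        # Sum the Euclidean distance between consecutive points.
        lengths[root.get("id")] = sum(
            math.dist(a, b) for a, b in zip(points, points[1:])
        )
    return lengths


if __name__ == "__main__":
    print(root_lengths(SAMPLE))
```

Keeping the geometry as an ordered list of points means any tool that can read the file can recompute derived traits such as length, which is the portability argument the abstract makes.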

Concepts: Mathematics, Computer program, Annotation, Root, Software architecture, XML, Markup language, Weyl group

3

The goal of this work is to offer a computational framework for exploring data from the Recon2 human metabolic reconstruction model. Advanced user access features have been developed using the Neo4j graph database technology and this paper describes key features such as efficient management of the network data, examples of the network querying for addressing particular tasks, and how query results are converted back to the Systems Biology Markup Language (SBML) standard format. The Neo4j-based metabolic framework facilitates exploration of highly-connected and comprehensive human metabolic data and identification of metabolic subnetworks of interest. A Java-based parser component has been developed to convert query results (available in the JSON format) into SBML and SIF formats in order to facilitate further results exploration, enhancement or network sharing.
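The conversion step mentioned above can be sketched as follows. The paper's parser is Java-based; this Python version only illustrates the idea of turning JSON query rows into SIF lines, where each line holds a source node, an interaction type, and a target node. The JSON layout, the key names (`source`, `relation`, `target`), and the sample rows are hypothetical, not the framework's actual output.

```python
import json

# Hypothetical query result: one JSON object per edge (an assumption for illustration).
QUERY_RESULT = json.dumps([
    {"source": "glucose", "relation": "substrate_of", "target": "hexokinase_reaction"},
    {"source": "hexokinase_reaction", "relation": "produces", "target": "glucose_6_phosphate"},
])


def json_to_sif(json_text):
    """Convert JSON edge rows to SIF text: 'source<TAB>relation<TAB>target' per line."""
    rows = json.loads(json_text)
    return "\n".join(
        f"{row['source']}\t{row['relation']}\t{row['target']}" for row in rows
    )


if __name__ == "__main__":
    print(json_to_sif(QUERY_RESULT))
```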

Concepts: The Network, Graph theory, Markup languages, Exploration, Markup language, Format, YAML

3

Personal Health Intervention Toolkit (PHIT) is an advanced cross-platform software framework targeted at personal self-help research on mobile devices. Following the subjective and objective measurement, assessment, and plan methodology for health assessment and intervention recommendations, the PHIT platform lets researchers quickly build mobile health research Android and iOS apps. They can (1) create complex data-collection instruments using a simple extensible markup language (XML) schema; (2) use Bluetooth wireless sensors; (3) create targeted self-help interventions based on collected data via XML-coded logic; (4) facilitate cross-study reuse from the library of existing instruments and interventions such as stress, anxiety, sleep quality, and substance abuse; and (5) monitor longitudinal intervention studies via daily upload to a Web-based dashboard portal. For physiological data, Bluetooth sensors collect real-time data with on-device processing. For example, using the BinarHeartSensor, the PHIT platform processes the heart rate data into heart rate variability measures, and plots these data as time-series waveforms. Subjective data instruments are user data-entry screens, comprising a series of forms with validation and processing logic. The PHIT instrument library consists of over 70 reusable instruments for various domains including cognitive, environmental, psychiatric, psychosocial, and substance abuse. Many are standardized instruments, such as the Alcohol Use Disorder Identification Test, Patient Health Questionnaire-8, and Post-Traumatic Stress Disorder Checklist. Autonomous instruments such as battery and global positioning system location support continuous background data collection. All data are acquired using a schedule appropriate to the app’s deployment. The PHIT intelligent virtual advisor (iVA) is an expert system logic layer, which analyzes the data in real time on the device. 
This data analysis results in a tailored app of interventions and other data-collection instruments. For example, if a user's anxiety score exceeds a threshold, the iVA might add a meditation intervention to the task list in order to teach the user how to relax, and schedule a reassessment using the anxiety instrument 2 weeks later to re-evaluate. If the anxiety score exceeds a higher threshold, then an advisory to seek professional help would be displayed. Using the easy-to-use PHIT scripting language, the researcher can program new instruments, the iVA, and interventions to their domain-specific needs. The iVA, instruments, and interventions are defined via XML files, which facilitates rapid app development and deployment. The PHIT Web-based dashboard portal provides the researcher access to all the uploaded data. After a secure login, the data can be filtered by criteria such as study, protocol, domain, and user. Data can also be exported into a comma-delimited file for further processing. The PHIT framework has proven to be an extensible, reconfigurable technology that facilitates mobile data collection and health intervention research. Additional plans include instrument development in other domains, additional health sensors, and a text messaging notification system.
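The threshold example in this abstract can be sketched as a small rule function. PHIT defines such logic in XML files and its own scripting language; the Python below is only an illustration of the decision flow. The numeric thresholds, the function name `advise`, and the action tuples are hypothetical, and the abstract does not say whether the advisory replaces or supplements the meditation task (this sketch adds it on top).

```python
# Hypothetical thresholds; the source gives no numeric values.
LOWER_THRESHOLD = 10
HIGHER_THRESHOLD = 15
REASSESS_AFTER_DAYS = 14  # "2 weeks later" in the abstract


def advise(anxiety_score):
    """Return a list of (action, detail, delay_in_days) tuples for one score."""
    actions = []
    if anxiety_score > LOWER_THRESHOLD:
        # Teach the user how to relax, then re-evaluate later.
        actions.append(("intervention", "meditation", 0))
        actions.append(("instrument", "anxiety", REASSESS_AFTER_DAYS))
    if anxiety_score > HIGHER_THRESHOLD:
        # Assumption: the advisory is shown in addition to the tasks above.
        actions.append(("advisory", "seek professional help", 0))
    return actions


if __name__ == "__main__":
    for score in (5, 12, 18):
        print(score, advise(score))
```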

Concepts: Posttraumatic stress disorder, Anxiety disorder, PHP, Mobile phone, Global Positioning System, XML, Markup language, HTML

3

An ontology organizes and formally conceptualizes information in a knowledge domain with a controlled vocabulary having defined terms and relationships between them. Several ontologies have been used to annotate numerous databases in biology and medicine. Due to their unambiguous nature, ontological annotations facilitate systematic description and data organization, data integration and mining, pattern recognition and statistics, as well as development of analysis and prediction tools. The Variation Ontology (VariO) was developed to allow the annotation of effects, consequences and mechanisms of DNA, RNA and protein variations. Variation types are systematically organized and a detailed description of effects and mechanisms is possible. VariO is for annotating the variant, not the normal state features or properties, and requires a reference (e.g., reference sequence, reference state property, activity, etc.) compared to which the changes are indicated. VariO is versatile and can be used for variations ranging from genomic multiplications to single nucleotide or amino acid changes, whether of genetic or non-genetic origin. VariO annotations are position specific and can be used for variations in any organism.

Concepts: DNA, Gene, Genetics, Ontology, Annotation, Annotated bibliography, Footnote, Markup language

3

Updates to maintain a state-of-the-art reconstruction of the yeast metabolic network are essential to reflect our understanding of yeast metabolism and functional organization, to eliminate any inaccuracies identified in earlier iterations, to improve predictive accuracy and to continue to expand into novel subsystems to extend the comprehensiveness of the model. Here, we present version 6 of the consensus yeast metabolic network (Yeast 6) as an update to the community effort to computationally reconstruct the genome-scale metabolic network of Saccharomyces cerevisiae S288c. Yeast 6 comprises 1458 metabolites participating in 1888 reactions, which are annotated with 900 yeast genes encoding the catalyzing enzymes. Compared with Yeast 5, Yeast 6 demonstrates improved sensitivity, specificity and positive and negative predictive values for predicting gene essentiality in glucose-limited aerobic conditions when analyzed with flux balance analysis. Additionally, Yeast 6 improves the accuracy of predicting the likelihood that a mutation will cause auxotrophy. The network reconstruction is available as a Systems Biology Markup Language (SBML) file enriched with Minimum Information Requested in the Annotation of Biochemical Models (MIRIAM)-compliant annotations. Small- and macromolecules in the network are referenced to authoritative databases such as UniProt or ChEBI. Molecules and reactions are also annotated with appropriate publications that contain supporting evidence. Yeast 6 is freely available at http://yeast.sf.net/ as three separate SBML files: a model using the SBML level 3 Flux Balance Constraint package, a model compatible with the MATLAB® COBRA Toolbox for backward compatibility and a reconstruction containing only reactions for which there is experimental evidence (without the non-biological reactions necessary for simulating growth). Database URL: http://yeast.sf.net/
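For readers unfamiliar with the four measures named above, the following Python sketch shows how sensitivity, specificity, and positive and negative predictive values are computed from a confusion matrix of essentiality predictions. The function name and the counts are made up for illustration and are not results from Yeast 5 or Yeast 6.

```python
def confusion_metrics(tp, fp, tn, fn):
    """Compute the four standard measures from confusion-matrix counts.

    tp: genes predicted essential that are essential
    fp: genes predicted essential that are not essential
    tn: genes predicted non-essential that are non-essential
    fn: genes predicted non-essential that are essential
    """
    return {
        "sensitivity": tp / (tp + fn),  # fraction of essential genes recovered
        "specificity": tn / (tn + fp),  # fraction of non-essential genes recovered
        "ppv": tp / (tp + fp),          # how often an "essential" call is right
        "npv": tn / (tn + fn),          # how often a "non-essential" call is right
    }


if __name__ == "__main__":
    # Illustrative counts only; not taken from the paper.
    for name, value in confusion_metrics(tp=8, fp=2, tn=6, fn=4).items():
        print(f"{name}: {value:.3f}")
```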

Concepts: Protein, Metabolism, Fungus, Yeast, Model organism, Saccharomyces cerevisiae, Annotation, Markup language

2

MitoFish is a database of fish mitochondrial genomes (mitogenomes) that includes powerful and precise de novo annotations for mitogenome sequences. Fish occupy an important position in the evolution of vertebrates and the ecology of the hydrosphere, and mitogenomic sequence data have served as a rich source of information for resolving fish phylogenies and identifying new fish species. The importance of a mitogenomic database continues to grow at a rapid pace as massive amounts of mitogenomic data are generated with the advent of new sequencing technologies. A severe bottleneck seems likely to occur with regard to mitogenome annotation because of the overwhelming pace of data accumulation and the intrinsic difficulties in annotating sequences with degenerating transfer RNA structures, divergent start/stop codons of the coding elements, and the overlapping of adjacent elements. To ease this data backlog, we developed an annotation pipeline named MitoAnnotator. MitoAnnotator automatically annotates a fish mitogenome with a high degree of accuracy in approximately five minutes; thus, it is readily applicable to datasets of dozens of sequences. MitoFish also contains re-annotations of previously sequenced fish mitogenomes, enabling researchers to refer to them when they find annotations that are likely to be erroneous or while conducting comparative mitogenomic analyses. For users who need more information on the taxonomy, habitats, phenotypes, or life cycles of fish, MitoFish provides links to related databases. MitoFish and MitoAnnotator are freely available at http://mitofish.aori.u-tokyo.ac.jp/; all of the data can be batch downloaded, and the annotation pipeline can be used via a web interface.

Concepts: DNA, Gene, Sequence, Annotation, Reference, Marginalia, Footnote, Markup language

1

Noncoding DNA regions have central roles in human biology, evolution, and disease. ChromHMM helps to annotate the noncoding genome using epigenomic information across one or multiple cell types. It combines multiple genome-wide epigenomic maps, and uses combinatorial and spatial mark patterns to infer a complete annotation for each cell type. ChromHMM learns chromatin-state signatures using a multivariate hidden Markov model (HMM) that explicitly models the combinatorial presence or absence of each mark. ChromHMM uses these signatures to generate a genome-wide annotation for each cell type by calculating the most probable state for each genomic segment. ChromHMM provides an automated enrichment analysis of the resulting annotations to facilitate the functional interpretations of each chromatin state. ChromHMM is distinguished by its modeling emphasis on combinations of marks, its tight integration with downstream functional enrichment analyses, its speed, and its ease of use. Chromatin states are learned, annotations are produced, and enrichments are computed within 1 d.
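The idea of assigning each genomic segment its most probable state under a multivariate HMM can be sketched in a few lines of Python. This is not ChromHMM's code: it skips parameter learning entirely, uses hand-picked parameters for two hypothetical states and two marks, and uses posterior decoding (forward-backward) as one common way to pick the most probable state per segment. Whether that matches the tool's exact procedure should be checked against its documentation.

```python
def emission_prob(mark_probs, marks):
    """P(observed marks | state), treating each mark as an independent Bernoulli."""
    p = 1.0
    for prob, present in zip(mark_probs, marks):
        p *= prob if present else (1.0 - prob)
    return p


def most_probable_states(obs, init, trans, emit):
    """Return, for each segment, the state with the highest posterior probability."""
    n, k = len(obs), len(init)
    e = [[emission_prob(emit[s], o) for s in range(k)] for o in obs]

    # Forward pass, normalized at each step to avoid underflow.
    fwd = []
    for t in range(n):
        if t == 0:
            row = [init[s] * e[0][s] for s in range(k)]
        else:
            row = [e[t][s] * sum(fwd[t - 1][r] * trans[r][s] for r in range(k))
                   for s in range(k)]
        total = sum(row)
        fwd.append([v / total for v in row])

    # Backward pass, also normalized (scaling does not change the argmax).
    bwd = [[1.0] * k for _ in range(n)]
    for t in range(n - 2, -1, -1):
        row = [sum(trans[s][r] * e[t + 1][r] * bwd[t + 1][r] for r in range(k))
               for s in range(k)]
        total = sum(row)
        bwd[t] = [v / total for v in row]

    # Posterior is proportional to forward * backward at each position.
    return [max(range(k), key=lambda s: fwd[t][s] * bwd[t][s]) for t in range(n)]


# Hypothetical parameters: state 0 tends to carry both marks, state 1 neither.
INIT = [0.5, 0.5]
TRANS = [[0.9, 0.1], [0.1, 0.9]]
EMIT = [[0.9, 0.8], [0.1, 0.1]]       # EMIT[state][mark] = P(mark present)
OBS = [(1, 1), (1, 1), (0, 0), (0, 0)]  # presence/absence of two marks per segment

if __name__ == "__main__":
    print(most_probable_states(OBS, INIT, TRANS, EMIT))
```

Modeling each state as a vector of per-mark presence probabilities is what lets a single state capture a combination of marks rather than one mark at a time.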

Concepts: DNA, Gene, Genetics, Gene expression, Cell, Biology, Model organism, Markup language

1

In this article, we present a protocol for generating a complete (genome-scale) metabolic resource allocation model, as well as a proposal for how to represent such models in the systems biology markup language (SBML). Such models are used to investigate enzyme levels and achievable growth rates in large-scale metabolic networks. Although the idea of metabolic resource allocation studies has been present in the field of systems biology for some years, no guidelines for generating such a model have been published up to now. This paper presents step-by-step instructions for building a (dynamic) resource allocation model, starting with prerequisites such as a genome-scale metabolic reconstruction, through building protein and noncatalytic biomass synthesis reactions and assigning turnover rates for each reaction. In addition, we explain how one can use SBML level 3 in combination with the flux balance constraints and our resource allocation modeling annotation to represent such models.

Concepts: Protein, Biology, Chemical reaction, Model, Annotation, Resource allocation, Markup languages, Markup language

1

A vast amount of scientific information is encoded in natural language text, and the quantity of such text has become so great that it is no longer economically feasible to have a human as the first step in the search process. Natural language processing and text mining tools have become essential to facilitate the search for and extraction of information from text. This has led to vigorous research efforts to create useful tools and to create humanly labeled text corpora, which can be used to improve such tools. To encourage combining these efforts into larger, more powerful and more capable systems, a common interchange format to represent, store and exchange the data in a simple manner between different language processing systems and text mining tools is highly desirable. Here we propose a simple extensible mark-up language format to share text documents and annotations. The proposed annotation approach allows a large number of different annotations to be represented including sentences, tokens, parts of speech, named entities such as genes or diseases and relationships between named entities. In addition, we provide simple code to hold this data, read it from and write it back to extensible mark-up language files and perform some sample processing. We also describe completed as well as ongoing work to apply the approach in several directions. Code and data are available at http://bioc.sourceforge.net/. Database URL: http://bioc.sourceforge.net/
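A minimal sketch of such an interchange format, assuming a simplified layout, is shown below: text and stand-off annotations are written to XML and read back with the Python standard library. The element names (`collection`, `document`, `passage`, `annotation`, `infon`, `location`) and the sample sentence are assumptions for illustration; the code distributed at the URL above is the reference for the real format.

```python
import xml.etree.ElementTree as ET


def build_collection(doc_id, text, annotations):
    """Serialize one document and its (label, start, end) annotations to XML text."""
    collection = ET.Element("collection")
    document = ET.SubElement(collection, "document")
    ET.SubElement(document, "id").text = doc_id
    passage = ET.SubElement(document, "passage")
    ET.SubElement(passage, "text").text = text
    for i, (label, start, end) in enumerate(annotations):
        ann = ET.SubElement(passage, "annotation", id=str(i))
        ET.SubElement(ann, "infon", key="type").text = label
        # Stand-off location: character offset and length into the passage text.
        ET.SubElement(ann, "location", offset=str(start), length=str(end - start))
        ET.SubElement(ann, "text").text = text[start:end]
    return ET.tostring(collection, encoding="unicode")


def read_annotations(xml_text):
    """Read back (label, start, end, covered text) tuples from the XML text."""
    found = []
    for ann in ET.fromstring(xml_text).iter("annotation"):
        loc = ann.find("location")
        start = int(loc.get("offset"))
        end = start + int(loc.get("length"))
        found.append((ann.findtext("infon"), start, end, ann.findtext("text")))
    return found


if __name__ == "__main__":
    sentence = "BRCA1 mutations are linked to breast cancer."
    disease_start = sentence.index("breast cancer")
    xml_text = build_collection(
        "doc1",
        sentence,
        [("gene", 0, 5), ("disease", disease_start, disease_start + 13)],
    )
    print(xml_text)
    print(read_annotations(xml_text))
```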

Concepts: Linguistics, Language, Writing, Data mining, Annotation, Natural language processing, Natural language, Markup language