Journal: Journal of Biomedical Informatics
Risk-sharing arrangements between hospitals and payers, together with penalties imposed by the Centers for Medicare and Medicaid Services (CMS), are driving interest in decreasing early readmissions. A number of published risk models predict 30-day readmissions for particular patient populations; however, they often exhibit poor predictive performance and would be unsuitable for use in a clinical setting. In this work we describe and compare several predictive models, some of which have never been applied to this task and which outperform the regression methods typically applied in the healthcare literature. In addition, we apply methods from deep learning to the five conditions CMS uses to penalize hospitals, and offer a simple framework for determining which conditions are most cost-effective to target.
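The cost-effectiveness framework mentioned above can be sketched as a back-of-the-envelope ranking: for each penalized condition, compare the penalty dollars a hospital could avoid against what a readmission-reduction intervention would cost. All figures and field names below are hypothetical, and actual CMS penalties are computed from excess readmission ratios, so this is an illustrative sketch rather than the paper's method.

```python
# Illustrative cost-effectiveness ranking of readmission-penalty conditions.
# Every number here is invented; the structure, not the data, is the point.

def cost_effectiveness(conditions):
    """Rank conditions by expected penalty avoided per intervention dollar."""
    ranked = []
    for name, c in conditions.items():
        expected_savings = (c["annual_readmissions"] * c["penalty_per_case"]
                            * c["preventable_fraction"])
        cost = c["annual_readmissions"] * c["intervention_cost_per_case"]
        ranked.append((name, expected_savings / cost))
    return sorted(ranked, key=lambda t: t[1], reverse=True)

conditions = {
    "heart_failure": {"annual_readmissions": 120, "penalty_per_case": 9000,
                      "preventable_fraction": 0.20, "intervention_cost_per_case": 300},
    "pneumonia":     {"annual_readmissions": 80, "penalty_per_case": 7000,
                      "preventable_fraction": 0.15, "intervention_cost_per_case": 250},
    "copd":          {"annual_readmissions": 60, "penalty_per_case": 8000,
                      "preventable_fraction": 0.10, "intervention_cost_per_case": 400},
}

for name, ratio in cost_effectiveness(conditions):
    print(f"{name}: {ratio:.2f} penalty dollars avoided per dollar spent")
```

With these invented inputs the ranking favors heart failure, simply because it combines high volume, a high per-case penalty, and a large preventable fraction; a real analysis would substitute hospital-specific estimates for each field.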
Automatic monitoring of Adverse Drug Reactions (ADRs), defined as adverse patient outcomes caused by medications, is a challenging research problem that is currently receiving significant attention from the medical informatics community. In recent years, user-posted data on social media, primarily due to its sheer volume, has become a useful resource for ADR monitoring. Research using social media data has progressed using various data sources and techniques, making it difficult to compare distinct systems and their performances. In this paper, we perform a methodical review to characterize the different approaches to ADR detection/extraction from social media, and their applicability to pharmacovigilance. In addition, we present a potential systematic pathway to ADR monitoring from social media.
The Systematized Nomenclature of Medicine Clinical Terms (SNOMED CT) has been designated the recommended clinical reference terminology for clinical information systems around the world and is reported to be in use in over 50 countries. However, few implementation details have been published. This study examined the implementation of SNOMED CT, in terms of the design, use, and maintenance issues involved, in 13 healthcare organisations across eight countries through a series of interviews with 14 individuals. While a great deal of effort has gone into developing and refining SNOMED CT, much work remains to bring it into routine clinical use.
Cancer is a malignant disease that has caused millions of human deaths. Its study has a long history of well over a hundred years, producing an enormous number of publications. This vast but unstructured body of biomedical text is of great value for cancer diagnostics, treatment, and prevention. Its immense size and rapid growth have led to a large number of text mining techniques aimed at extracting novel knowledge from scientific text. Biomedical text mining on cancer research is computationally automatic and high-throughput in nature, but it is error-prone due to the complexity of natural language processing. In this review, we introduce the basic concepts underlying text mining, examine some frequently used algorithms, tools, and data sets, and assess how widely these algorithms have been adopted. We then discuss the current state-of-the-art text mining applications in cancer research and provide resources for cancer text mining. With the development of systems biology, researchers increasingly seek to understand complex biomedical systems from a systems biology viewpoint. Thus, fully utilizing text mining to facilitate cancer systems biology research is fast becoming a priority. To address this, we describe the general workflow of text mining in cancer systems biology and each phase of that workflow. We hope that this review can (i) provide a useful overview of the current work in this field; (ii) help researchers choose text mining tools and datasets; and (iii) highlight how text mining can assist cancer systems biology research.
Due to the enormous number of scientific publications that cannot be handled manually, there is rising interest in text-mining techniques for automated information extraction, especially in the biomedical field. Such techniques provide effective means of information search, knowledge discovery, and hypothesis generation. Most previous studies have focused primarily on the design and performance improvement of either named entity recognition or relation extraction. In this paper, we present PKDE4J, a comprehensive text-mining system that integrates dictionary-based entity extraction and rule-based relation extraction in a highly flexible and extensible framework. Starting with Stanford CoreNLP, we developed the system to cope with multiple types of entities and relations. The system also performs well in terms of accuracy and allows its text-processing components to be configured. We demonstrate its competitive performance by evaluating it on several corpora, finding that it surpasses existing systems with average F-measures of 85% for entity extraction and 81% for relation extraction.
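The dictionary-based entity extraction that PKDE4J integrates can be illustrated with a minimal sketch: scan the text for dictionary terms, preferring longer matches at each position. PKDE4J itself is a Java system built on Stanford CoreNLP, so the Python below, and its toy dictionary, are invented for illustration only.

```python
# Minimal dictionary-based entity extraction sketch (not the PKDE4J API).
# Longer dictionary terms win over shorter overlapping ones.

def extract_entities(text, dictionary):
    """Return (surface form, entity type, start offset) for each dictionary hit."""
    lowered = text.lower()
    hits, covered = [], set()
    # Sort terms longest-first so overlapping shorter terms are skipped.
    for term in sorted(dictionary, key=len, reverse=True):
        start = 0
        while (idx := lowered.find(term.lower(), start)) != -1:
            span = set(range(idx, idx + len(term)))
            if not span & covered:
                covered |= span
                hits.append((text[idx:idx + len(term)], dictionary[term], idx))
            start = idx + len(term)
    return sorted(hits, key=lambda h: h[2])

dictionary = {"EGFR": "GENE", "gefitinib": "DRUG", "lung cancer": "DISEASE"}
text = "Gefitinib targets EGFR mutations in lung cancer patients."
print(extract_entities(text, dictionary))
```

A production system like PKDE4J layers tokenization, abbreviation resolution, and normalization on top of this core lookup; the sketch shows only the longest-match-first idea.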
Once medical data have been successfully recorded or exchanged between systems, a need arises to present the data consistently so that they are clearly understood and interpreted. A standards-based user interface can provide interoperability at the visual level.
Patient classification systems (PCSs) are commonly used in nursing units to assess how many nursing care hours are needed to care for patients. These systems then provide staffing and nurse-patient assignment recommendations for a given patient census based on these acuity scores. Our hypothesis is that such systems do not accurately capture workload, and we conducted an experiment to test this hypothesis. Specifically, we conducted a survey study to capture nurses' perception of workload in an inpatient unit. Forty-five nurses from an oncology and surgery unit completed the survey and rated the impact of patient acuity indicators on their perceived workload using a six-point Likert scale. From these ratings we can calculate a workload score for an individual nurse given a set of patient acuity indicators. Our approach employs optimization models (prescriptive analytics) that use patient acuity indicators from a commercial PCS as well as a survey-based nurse workload score. The models assign patients to nurses by distributing acuity scores from the PCS and survey-based perceived workload in a balanced way. Numerical results suggest that the proposed nurse-patient assignment models achieve a balanced assignment and lower overall survey-based perceived workload compared to assignment based solely on acuity scores from the PCS. This yields an improvement in perceived workload of upwards of five percent.
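The balanced-assignment idea above can be sketched with a classic load-balancing heuristic: assign each patient, heaviest workload score first, to the currently least-loaded nurse. The paper formulates this as optimization models over PCS acuity and survey-based workload scores; the greedy stand-in and the scores below are illustrative assumptions, not the paper's formulation.

```python
import heapq

# Greedy longest-processing-time heuristic for balanced nurse-patient
# assignment. Workload scores are invented for illustration; the paper
# uses optimization models over PCS acuity and survey-derived scores.

def assign_patients(patient_scores, n_nurses):
    """Return {nurse: (total load, patient list)} roughly balancing workload."""
    heap = [(0.0, i, []) for i in range(n_nurses)]  # (load, nurse id, patients)
    heapq.heapify(heap)
    # Place heaviest patients first, each onto the least-loaded nurse.
    for pid, score in sorted(patient_scores.items(), key=lambda kv: -kv[1]):
        load, nurse, patients = heapq.heappop(heap)
        patients.append(pid)
        heapq.heappush(heap, (load + score, nurse, patients))
    return {nurse: (load, patients) for load, nurse, patients in heap}

scores = {"P1": 8, "P2": 7, "P3": 5, "P4": 4, "P5": 3, "P6": 3}
for nurse, (load, patients) in sorted(assign_patients(scores, 2).items()):
    print(f"nurse {nurse}: load={load} patients={patients}")
```

An exact optimization model (e.g., integer programming) can additionally encode continuity-of-care and zone constraints that a greedy pass cannot, which is presumably why the paper takes the prescriptive-analytics route.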
Though the genetic etiology of autism is complex, our understanding can be improved by identifying genes and gene-gene interactions that contribute to the development of specific autism subtypes. Identifying such gene groupings will allow individuals to be diagnosed and treated according to their precise characteristics. To this end, we developed a method to associate gene combinations with groups sharing autism traits, targeting genetic elements that distinguish patient populations with opposing phenotypes. Our computational method prioritizes genetic variants for genome-wide association, then utilizes Frequent Pattern Mining to highlight potential interactions between variants. We introduce a novel genotype assessment metric, the Unique Inherited Combination support, which accounts for inheritance patterns observed in the nuclear family while estimating the impact of genetic variation on phenotype manifestation at the individual level. High-contrast variant combinations are tested for significant subgroup associations. We apply this method by contrasting autism subgroups defined by severe or mild manifestations of a phenotype. Significant associations connected 286 genes to the subgroups, including 193 novel autism candidates. Seventy-one pairs of genes have joint associations with subgroups, presenting opportunities to investigate interacting functions. This study analyzed 12 autism subgroups, but our informatics method can explore other meaningful divisions of autism patients, and can further be applied to reveal precise genetic associations within other phenotypically heterogeneous disorders, such as Alzheimer's disease.
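The Frequent Pattern Mining step above can be illustrated with a minimal Apriori-style miner over per-individual variant sets. The paper weights support with its Unique Inherited Combination metric; the sketch below uses plain count-based support, and the variant labels are invented for illustration.

```python
from itertools import combinations

# Minimal Apriori-style frequent pattern mining over variant sets.
# Plain count-based support stands in for the paper's Unique Inherited
# Combination support; variant labels are illustrative.

def frequent_patterns(transactions, min_support):
    """Return {itemset: support} for all itemsets meeting min_support."""
    items = {i for t in transactions for i in t}
    frequent, level = {}, [frozenset([i]) for i in sorted(items)]
    while level:
        counts = {c: sum(c <= t for t in transactions) for c in level}
        kept = {c: n for c, n in counts.items() if n >= min_support}
        frequent.update(kept)
        # Candidate (k+1)-itemsets: unions of frequent k-itemsets.
        level = list({a | b for a, b in combinations(kept, 2)
                      if len(a | b) == len(a) + 1})
    return frequent

variants = [frozenset(t) for t in
            [{"v1", "v2", "v3"}, {"v1", "v2"}, {"v2", "v3"}, {"v1", "v2", "v3"}]]
patterns = frequent_patterns(variants, min_support=3)
print({tuple(sorted(k)): v for k, v in patterns.items()})
```

The Apriori pruning relies on support being anti-monotone (a superset can never be more frequent than its subsets), which is why candidate combinations are only grown from itemsets that already met the threshold.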
An estimated 25% of type 2 diabetes mellitus (DM2) patients in the United States are undiagnosed owing to inadequate screening, since administering laboratory tests to everyone is cost-prohibitive. We assess whether electronic health record (EHR) phenotyping could improve DM2 screening compared to conventional models, even when records are incomplete and not recorded systematically across patients and practice locations, as is typically seen in practice.
Increased adoption of electronic health records has resulted in increased availability of free-text clinical data for secondary use. A variety of approaches exist to obtain actionable information from unstructured free-text data, but they are resource intensive, inherently complex, and rely on structured clinical data and curated dictionaries. We sought to evaluate the potential to obtain actionable information from free-text pathology reports using routinely available tools and approaches that do not depend on dictionaries.