The last few decades have utterly transformed genetics and genomics, but what might the next ten years bring? PLOS Biology asked eight leaders spanning a range of related areas to give us their predictions. Without exception, the predictions are for more data on a massive scale and of more diverse types. All are optimistic and predict enormous positive impact on scientific understanding, while a recurring theme is the benefit of such data for the transformation and personalization of medicine. Several also point out that the biggest changes will very likely be those that we don’t foresee, even now.
How does network structure affect diffusion? Recent studies suggest that the answer depends on the type of contagion. Complex contagions, unlike infectious diseases (simple contagions), are affected by social reinforcement and homophily. Hence, the spread within highly clustered communities is enhanced, while diffusion across communities is hampered. A common hypothesis is that memes and behaviors are complex contagions. We show that, while most memes indeed spread like complex contagions, a few viral memes spread across many communities, like diseases. We demonstrate that the future popularity of a meme can be predicted by quantifying its early spreading pattern in terms of community concentration. The more communities a meme permeates, the more viral it is. We present a practical method to translate data about community structure into predictive knowledge about what information will spread widely. This connection contributes to our understanding in computational social science, social media analytics, and marketing applications.
Standard theories of decision-making involving delayed outcomes predict that people should defer a punishment, whilst advancing a reward. In some cases, such as pain, people seem to prefer to expedite punishment, implying that its anticipation carries a cost, often conceptualized as ‘dread’. Despite empirical support for the existence of dread, whether and how it depends on prospective delay is unknown. Furthermore, it is unclear whether dread represents a stable component of value, or is modulated by biases such as framing effects. Here, we examine choices made between different numbers of painful shocks to be delivered faithfully at different time points up to 15 minutes in the future, as well as choices between hypothetical painful dental appointments at time points of up to approximately eight months in the future, to test alternative models for how future pain is disvalued. We show that future pain initially becomes increasingly aversive with increasing delay, but does so at a decreasing rate. This is consistent with a value model in which moment-by-moment dread increases up to the time of expected pain, such that dread becomes equivalent to the discounted expectation of pain. For a minority of individuals pain has maximum negative value at intermediate delay, suggesting that the dread function may itself be prospectively discounted in time. Framing an outcome as relief reduces the overall preference to expedite pain, which can be parameterized by reducing the rate of the dread-discounting function. Our data support an account of disvaluation for primary punishments such as pain, which differs fundamentally from existing models applied to financial punishments, in which dread exerts a powerful but time-dependent influence over choice.
Secondary use of electronic health records (EHRs) promises to advance clinical research and better inform clinical decision making. Challenges in summarizing and representing patient data prevent widespread practice of predictive modeling using EHRs. Here we present a novel unsupervised deep feature learning method to derive a general-purpose patient representation from EHR data that facilitates clinical predictive modeling. In particular, a three-layer stack of denoising autoencoders was used to capture hierarchical regularities and dependencies in the aggregated EHRs of about 700,000 patients from the Mount Sinai data warehouse. The result is a representation we name “deep patient”. We evaluated this representation as broadly predictive of health states by assessing the probability of patients to develop various diseases. We performed evaluation using 76,214 test patients comprising 78 diseases from diverse clinical domains and temporal windows. Our results significantly outperformed those achieved using representations based on raw EHR data and alternative feature learning strategies. Prediction performance for severe diabetes, schizophrenia, and various cancers were among the top performing. These findings indicate that deep learning applied to EHRs can derive patient representations that offer improved clinical predictions, and could provide a machine learning framework for augmenting clinical decision systems.
Use of socially generated “big data” to access information about collective states of the minds in human societies has become a new paradigm in the emerging field of computational social science. A natural application of this would be the prediction of the society’s reaction to a new product in the sense of popularity and adoption rate. However, bridging the gap between “real time monitoring” and “early predicting” remains a big challenge. Here we report on an endeavor to build a minimalistic predictive model for the financial success of movies based on collective activity data of online users. We show that the popularity of a movie can be predicted much before its release by measuring and analyzing the activity level of editors and viewers of the corresponding entry to the movie in Wikipedia, the well-known online encyclopedia.
We are learning how to watch the news through tears. All those children, and the adults protecting them. With an assault rifle. Up close. The survivors, eyes averted, are led to safety in daisy chains. Ambulances rush to the scene, but nearly all return empty. Loved ones go home empty, too. We pore over the details, searching for the clues that will bring order to chaos and help us predict and prevent the next one. But these catastrophes are all different. We have found to our dismay that prediction is somewhere between difficult and impossible. Tailored interventions, designed for specific . . .
Correctly assessing a scientist’s past research impact and potential for future impact is key in recruitment decisions and other evaluation processes. While a candidate’s future impact is the main concern for these decisions, most measures only quantify the impact of previous work. Recently, it has been argued that linear regression models are capable of predicting a scientist’s future impact. By applying that future impact model to 762 careers drawn from three disciplines: physics, biology, and mathematics, we identify a number of subtle, but critical, flaws in current models. Specifically, cumulative non-decreasing measures like the h-index contain intrinsic autocorrelation, resulting in significant overestimation of their “predictive power”. Moreover, the predictive power of these models depend heavily upon scientists' career age, producing least accurate estimates for young researchers. Our results place in doubt the suitability of such models, and indicate further investigation is required before they can be used in recruiting decisions.
- Proceedings of the National Academy of Sciences of the United States of America
- Published over 1 year ago
Individuals with music training in early childhood show enhanced processing of musical sounds, an effect that generalizes to speech processing. However, the conclusions drawn from previous studies are limited due to the possible confounds of predisposition and other factors affecting musicians and nonmusicians. We used a randomized design to test the effects of a laboratory-controlled music intervention on young infants' neural processing of music and speech. Nine-month-old infants were randomly assigned to music (intervention) or play (control) activities for 12 sessions. The intervention targeted temporal structure learning using triple meter in music (e.g., waltz), which is difficult for infants, and it incorporated key characteristics of typical infant music classes to maximize learning (e.g., multimodal, social, and repetitive experiences). Controls had similar multimodal, social, repetitive play, but without music. Upon completion, infants' neural processing of temporal structure was tested in both music (tones in triple meter) and speech (foreign syllable structure). Infants' neural processing was quantified by the mismatch response (MMR) measured with a traditional oddball paradigm using magnetoencephalography (MEG). The intervention group exhibited significantly larger MMRs in response to music temporal structure violations in both auditory and prefrontal cortical regions. Identical results were obtained for temporal structure changes in speech. The intervention thus enhanced temporal structure processing not only in music, but also in speech, at 9 mo of age. We argue that the intervention enhanced infants' ability to extract temporal structure information and to predict future events in time, a skill affecting both music and speech processing.
The technology for evaluating patient-provider interactions in psychotherapy-observational coding-has not changed in 70 years. It is labor-intensive, error prone, and expensive, limiting its use in evaluating psychotherapy in the real world. Engineering solutions from speech and language processing provide new methods for the automatic evaluation of provider ratings from session recordings. The primary data are 200 Motivational Interviewing (MI) sessions from a study on MI training methods with observer ratings of counselor empathy. Automatic Speech Recognition (ASR) was used to transcribe sessions, and the resulting words were used in a text-based predictive model of empathy. Two supporting datasets trained the speech processing tasks including ASR (1200 transcripts from heterogeneous psychotherapy sessions and 153 transcripts and session recordings from 5 MI clinical trials). The accuracy of computationally-derived empathy ratings were evaluated against human ratings for each provider. Computationally-derived empathy scores and classifications (high vs. low) were highly accurate against human-based codes and classifications, with a correlation of 0.65 and F-score (a weighted average of sensitivity and specificity) of 0.86, respectively. Empathy prediction using human transcription as input (as opposed to ASR) resulted in a slight increase in prediction accuracies, suggesting that the fully automatic system with ASR is relatively robust. Using speech and language processing methods, it is possible to generate accurate predictions of provider performance in psychotherapy from audio recordings alone. This technology can support large-scale evaluation of psychotherapy for dissemination and process studies.
Will they participate? predicting patients' response to clinical trial invitations in a pediatric emergency department
- Journal of the American Medical Informatics Association : JAMIA
- Published over 1 year ago
(1) To develop an automated algorithm to predict a patient’s response (ie, if the patient agrees or declines) before he/she is approached for a clinical trial invitation; (2) to assess the algorithm performance and the predictors on real-world patient recruitment data for a diverse set of clinical trials in a pediatric emergency department; and (3) to identify directions for future studies in predicting patients' participation response.