Concept: Positive predictive value


Background: Scales are widely used in psychiatric assessments following self-harm. Robust evidence for their diagnostic use is lacking. Aims: To evaluate the performance of risk scales (Manchester Self-Harm Rule, ReACT Self-Harm Rule, SAD PERSONS scale, Modified SAD PERSONS scale, Barratt Impulsiveness Scale); and patient and clinician estimates of risk in identifying patients who repeat self-harm within 6 months. Method: A multisite prospective cohort study was conducted of adults aged 18 years and over referred to liaison psychiatry services following self-harm. Scale a priori cut-offs were evaluated using diagnostic accuracy statistics. The area under the curve (AUC) was used to determine optimal cut-offs and compare global accuracy. Results: In total, 483 episodes of self-harm were included in the study. The episode-based 6-month repetition rate was 30% (n = 145). Sensitivity ranged from 1% (95% CI 0-5) for the SAD PERSONS scale, to 97% (95% CI 93-99) for the Manchester Self-Harm Rule. Positive predictive values ranged from 13% (95% CI 2-47) for the Modified SAD PERSONS Scale to 47% (95% CI 41-53) for the clinician assessment of risk. The AUC ranged from 0.55 (95% CI 0.50-0.61) for the SAD PERSONS scale to 0.74 (95% CI 0.69-0.79) for the clinician global scale. The remaining scales performed significantly worse than clinician and patient estimates of risk (P<0.001). Conclusions: Risk scales following self-harm have limited clinical utility and may waste valuable resources. Most scales performed no better than clinician or patient ratings of risk. Some performed considerably worse. Positive predictive values were modest. In line with national guidelines, risk scales should not be used to determine patient management or predict self-harm.

Concepts: Clinical trial, Positive predictive value, Hospital, Psychiatry


Background Bronchoscopy is frequently nondiagnostic in patients with pulmonary lesions suspected to be lung cancer. This often results in additional invasive testing, although many lesions are benign. We sought to validate a bronchial-airway gene-expression classifier that could improve the diagnostic performance of bronchoscopy. Methods Current or former smokers undergoing bronchoscopy for suspected lung cancer were enrolled at 28 centers in two multicenter prospective studies (AEGIS-1 and AEGIS-2). A gene-expression classifier was measured in epithelial cells collected from the normal-appearing mainstem bronchus to assess the probability of lung cancer. Results A total of 639 patients in AEGIS-1 (298 patients) and AEGIS-2 (341 patients) met the criteria for inclusion. A total of 43% of bronchoscopic examinations were nondiagnostic for lung cancer, and invasive procedures were performed after bronchoscopy in 35% of patients with benign lesions. In AEGIS-1, the classifier had an area under the receiver-operating-characteristic curve (AUC) of 0.78 (95% confidence interval [CI], 0.73 to 0.83), a sensitivity of 88% (95% CI, 83 to 92), and a specificity of 47% (95% CI, 37 to 58). In AEGIS-2, the classifier had an AUC of 0.74 (95% CI, 0.68 to 0.80), a sensitivity of 89% (95% CI, 84 to 92), and a specificity of 47% (95% CI, 36 to 59). The combination of the classifier plus bronchoscopy had a sensitivity of 96% (95% CI, 93 to 98) in AEGIS-1 and 98% (95% CI, 96 to 99) in AEGIS-2, independent of lesion size and location. In 101 patients with an intermediate pretest probability of cancer, the negative predictive value of the classifier was 91% (95% CI, 75 to 98) among patients with a nondiagnostic bronchoscopic examination. Conclusions The gene-expression classifier improved the diagnostic performance of bronchoscopy for the detection of lung cancer. 
In intermediate-risk patients with a nondiagnostic bronchoscopic examination, a negative classifier score provides support for a more conservative diagnostic approach. (Funded by Allegro Diagnostics and others; AEGIS-1 and AEGIS-2 numbers, NCT01309087 and NCT00746759.)
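
The dependence of predictive values on pretest probability can be illustrated with a minimal Python sketch based on Bayes' theorem. It uses the AEGIS-1 sensitivity (88%) and specificity (47%) reported above; the pretest probabilities in the loop are hypothetical values chosen for illustration, and the function name `predictive_values` is an assumption of this sketch. The results are not expected to reproduce the 91% negative predictive value in the abstract, which refers to a specific subgroup with a nondiagnostic bronchoscopic examination.

```python
def predictive_values(sensitivity, specificity, pretest_probability):
    """Return (PPV, NPV) for a binary test via Bayes' theorem."""
    # Expected fractions of the tested population in each cell of the 2x2 table.
    true_pos = sensitivity * pretest_probability
    false_neg = (1 - sensitivity) * pretest_probability
    true_neg = specificity * (1 - pretest_probability)
    false_pos = (1 - specificity) * (1 - pretest_probability)
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Sensitivity and specificity as reported for the classifier in AEGIS-1.
SENSITIVITY, SPECIFICITY = 0.88, 0.47

# Hypothetical pretest probabilities, for illustration only.
for p in (0.10, 0.30, 0.60):
    ppv, npv = predictive_values(SENSITIVITY, SPECIFICITY, p)
    print(f"pretest {p:.0%}: PPV {ppv:.1%}, NPV {npv:.1%}")
```

With sensitivity and specificity held fixed, the negative predictive value falls and the positive predictive value rises as the pretest probability increases, which is why a negative result is most reassuring in lower-risk patients.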

Concepts: Pulmonology, Lung, Lung cancer, Positive predictive value, Negative predictive value, Sensitivity and specificity, Carcinoma, Lesion


The technology for evaluating patient-provider interactions in psychotherapy, observational coding, has not changed in 70 years. It is labor-intensive, error-prone, and expensive, limiting its use in evaluating psychotherapy in the real world. Engineering solutions from speech and language processing provide new methods for the automatic evaluation of provider ratings from session recordings. The primary data are 200 Motivational Interviewing (MI) sessions from a study on MI training methods with observer ratings of counselor empathy. Automatic Speech Recognition (ASR) was used to transcribe sessions, and the resulting words were used in a text-based predictive model of empathy. Two supporting datasets trained the speech processing tasks including ASR (1200 transcripts from heterogeneous psychotherapy sessions and 153 transcripts and session recordings from 5 MI clinical trials). The accuracy of computationally-derived empathy ratings was evaluated against human ratings for each provider. Computationally-derived empathy scores and classifications (high vs. low) were highly accurate against human-based codes and classifications, with a correlation of 0.65 and F-score (a weighted average of sensitivity and specificity) of 0.86, respectively. Empathy prediction using human transcription as input (as opposed to ASR) resulted in a slight increase in prediction accuracies, suggesting that the fully automatic system with ASR is relatively robust. Using speech and language processing methods, it is possible to generate accurate predictions of provider performance in psychotherapy from audio recordings alone. This technology can support large-scale evaluation of psychotherapy for dissemination and process studies.

Concepts: Transcription, Positive predictive value, Type I and type II errors, Sensitivity and specificity, Prediction, Speech recognition, Speech processing, Speech synthesis


BACKGROUND: The objective of this study was to compare celiac disease (CD)-specific antibody tests to determine if they could replace jejunal biopsy in patients with a high pretest probability of CD. METHODS: This retrospective study included sera from 149 CD patients and 119 controls, all with intestinal biopsy. All samples were analyzed for IgA and IgG antibodies against native gliadin (ngli) and deamidated gliadin peptides (dpgli), as well as for IgA antibodies against tissue transglutaminase and endomysium. RESULTS: dpgli were superior to ngli for IgG antibody determination: 68% vs. 92% specificity and 79% vs. 85% sensitivity for ngli and dpgli, respectively. Positive (76% vs. 93%) and negative (72% vs. 83%) predictive values were also higher for dpgli than for ngli. Regarding IgA gliadin antibody determination, sensitivity improved from 61% to 78% with dpgli, while specificity and positive predictive value remained at 97% (P < 0.00001). A combination of four tests (IgA anti-dpgli, IgG anti-dpgli, IgA anti-tissue transglutaminase, and IgA anti-endomysium) yielded positive and negative predictive values of 99% and 100%, respectively, and a likelihood ratio positive of 86 with a likelihood ratio negative of 0.00. Omitting the endomysium antibody determination still yielded positive and negative predictive values of 99% and 98%, respectively, and a likelihood ratio positive of 87 with a likelihood ratio negative of 0.01. CONCLUSION: dpgli yielded superior results compared with ngli. A combination of three or four antibody tests including IgA anti-tissue transglutaminase and/or IgA anti-endomysium permitted diagnosis or exclusion of CD without intestinal biopsy in a high proportion of patients (78%). Jejunal biopsy would be necessary in patients with discordant antibody results (22%). With this two-step procedure, only patients with no CD-specific antibodies would be missed.
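
As a rough consistency check, the IgG predictive values above can be recomputed from the reported sensitivity, specificity, and cohort sizes (149 CD patients, 119 controls). The Python sketch below does this and also derives likelihood ratios for the single IgG tests; those single-test likelihood ratios are not reported in the abstract and are shown only to illustrate the formulas. Because the inputs are rounded percentages, the intermediate cell counts are fractional, and the function name `summary` is an assumption of this sketch.

```python
N_CD, N_CONTROLS = 149, 119  # cohort sizes reported in the abstract


def summary(sensitivity, specificity, n_pos=N_CD, n_neg=N_CONTROLS):
    """Derive PPV, NPV, LR+ and LR- from sensitivity, specificity and group sizes."""
    # Expected 2x2 cell counts (fractional, since inputs are rounded percentages).
    true_pos = sensitivity * n_pos
    false_neg = n_pos - true_pos
    true_neg = specificity * n_neg
    false_pos = n_neg - true_neg
    return {
        "ppv": true_pos / (true_pos + false_pos),
        "npv": true_neg / (true_neg + false_neg),
        "lr_pos": sensitivity / (1 - specificity),
        "lr_neg": (1 - sensitivity) / specificity,
    }


# (sensitivity, specificity) for IgG antibodies, as reported in the abstract.
igg_tests = {"ngli": (0.79, 0.68), "dpgli": (0.85, 0.92)}

for name, (sens, spec) in igg_tests.items():
    s = summary(sens, spec)
    print(
        f"IgG {name}: PPV {s['ppv']:.0%}, NPV {s['npv']:.0%}, "
        f"LR+ {s['lr_pos']:.1f}, LR- {s['lr_neg']:.2f}"
    )
```

The recomputed predictive values apply only at this study's case mix (149 of 268 subjects with CD); in a population with a different prevalence of CD they would change, whereas the likelihood ratios would not.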

Concepts: Immune system, Antibody, Fc receptor, Glycoproteins, Positive predictive value, Negative predictive value, Antibodies, Coeliac disease


BACKGROUND: Healthcare claims databases have been used in several studies to characterize the risk and burden of chemotherapy-induced febrile neutropenia (FN) and effectiveness of colony-stimulating factors against FN. The accuracy of methods previously used to identify FN in such databases has not been formally evaluated. METHODS: Data comprised linked electronic medical records from Geisinger Health System and healthcare claims data from Geisinger Health Plan. Subjects were classified into subgroups based on whether or not they were hospitalized for FN per the presumptive “gold standard” (ANC <1.0 × 10^9/L, and body temperature ≥38.3 °C or receipt of antibiotics) and claims-based definition (diagnosis codes for neutropenia, fever, and/or infection). Accuracy was evaluated principally based on positive predictive value (PPV) and sensitivity. RESULTS: Among 357 study subjects, 82 (23%) met the gold standard for hospitalized FN. For the claims-based definition including diagnosis codes for neutropenia plus fever in any position (n=28), PPV was 100% and sensitivity was 34% (95% CI: 24–45). For the definition including neutropenia in the primary position (n=54), PPV was 87% (78–95) and sensitivity was 57% (46–68). For the definition including neutropenia in any position (n=71), PPV was 77% (68–87) and sensitivity was 67% (56–77). CONCLUSIONS: Patients hospitalized for chemotherapy-induced FN can be identified in healthcare claims databases, with an acceptable level of misclassification, using diagnosis codes for neutropenia, or neutropenia plus fever.
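
The relationship between the reported PPV and sensitivity figures can be traced with a short Python sketch. The abstract gives the number of subjects flagged by each claims-based definition and the number meeting the gold standard (82), but not the true-positive counts; the sketch back-calculates them from the rounded PPVs, so those counts are an assumption rather than reported data. The labels in `definitions` and the function name `back_calculate` are likewise introduced here for illustration.

```python
GOLD_STANDARD_CASES = 82  # subjects meeting the gold standard for hospitalized FN

# Definition label -> (subjects flagged by claims, reported PPV), from the abstract.
definitions = {
    "neutropenia + fever, any position": (28, 1.00),
    "neutropenia, primary position": (54, 0.87),
    "neutropenia, any position": (71, 0.77),
}


def back_calculate(flagged, reported_ppv, cases=GOLD_STANDARD_CASES):
    """Infer true positives from a rounded PPV, then recompute PPV and sensitivity."""
    # ASSUMPTION: true positives are the nearest whole number to flagged * PPV.
    true_pos = round(flagged * reported_ppv)
    ppv = true_pos / flagged       # PPV = TP / all flagged by the claims definition
    sensitivity = true_pos / cases  # sensitivity = TP / all gold-standard cases
    return true_pos, ppv, sensitivity


for label, (flagged, reported_ppv) in definitions.items():
    tp, ppv, sens = back_calculate(flagged, reported_ppv)
    print(f"{label}: TP={tp}, PPV={ppv:.0%}, sensitivity={sens:.0%}")
```

The trade-off in the results follows directly from these two ratios: broadening the definition flags more subjects, which raises the numerator shared by both measures but raises the PPV denominator faster, so sensitivity improves while PPV declines.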

Concepts: Medicine, Health insurance, Positive predictive value, Negative predictive value, Thermoregulation, Hyperthermia, Fever, Febrile neutropenia


BACKGROUND: Laboratory tests to assess novel oral anticoagulants (NOACs) are under evaluation. Routine monitoring is unnecessary, but under special circumstances bioactivity assessment becomes crucial. We analyzed the effects of NOACs on coagulation tests and the availability of specific assays at different laboratories. METHODS: Plasma samples spiked with dabigatran (Dabi; 120 and 300 μg/L) or rivaroxaban (Riva; 60, 146, and 305 μg/L) were sent to 115 and 38 European laboratories, respectively. International normalized ratio (INR) and activated partial thromboplastin time (APTT) were analyzed for all samples; thrombin time (TT) was analyzed specifically for Dabi and calibrated anti-activated factor X (anti-Xa) activity for Riva. We compared the results with patient samples. RESULTS: Results of Dabi samples were reported by 73 laboratories (13 INR and 9 APTT reagents) and Riva samples by 22 laboratories (5 INR and 4 APTT reagents). Both NOACs increased INR values; the increase was modest, albeit larger, for Dabi, with higher CV, especially with Quick (vs Owren) methods. Both NOACs dose-dependently prolonged the APTT. Again, the prolongation and CVs were larger for Dabi. The INR and APTT results varied reagent-dependently (P < 0.005), with less prolongation in patient samples. TT results (Dabi) and calibrated anti-Xa results (Riva) were reported by only 11 and 8 laboratories, respectively. CONCLUSIONS: The screening tests INR and APTT are suboptimal in assessing NOACs, having high reagent dependence and low sensitivity and specificity. They may provide information, if laboratories recognize their limitations. The variation will likely increase and the sensitivity differ in clinical samples. Specific assays measure NOACs accurately; however, few laboratories applied them.

Concepts: Positive predictive value, Type I and type II errors, Sensitivity and specificity, Coagulation, Warfarin, Prothrombin time, Partial thromboplastin time


Bronchoscopy is frequently used for the evaluation of suspicious pulmonary lesions found on computed tomography, but its sensitivity for detecting lung cancer is limited. Recently, a bronchial genomic classifier was validated to improve the sensitivity of bronchoscopy for lung cancer detection, demonstrating a high sensitivity and negative predictive value among patients at intermediate risk (10-60%) for lung cancer with an inconclusive bronchoscopy. Our objective for this study was to determine if a negative genomic classifier result that down-classifies a patient from intermediate risk to low risk (<10%) for lung cancer would reduce the rate at which physicians recommend more invasive testing among patients with an inconclusive bronchoscopy.

Concepts: Cancer, Pulmonology, Lung cancer, Positive predictive value, Negative predictive value, Sensitivity and specificity, Decision theory, Binary classification


Precision medicine in oncology requires an accurate characterization of a tumor molecular profile for patient stratification. Though targeted deep sequencing is an effective tool to detect the presence of somatic sequence variants, a significant number of patient specimens do not meet the requirements needed for routine clinical application. Analysis is hindered by contamination with normal cells and inherent tumor heterogeneity, compounded by the challenges of dealing with minute amounts of tissue and the DNA damage common in formalin-fixed paraffin-embedded (FFPE) specimens. Here we present an innovative workflow using the DEPArray™ system, a microchip-based digital sorter, to achieve 100%-pure, homogeneous subpopulations of cells from FFPE samples. Cells are distinguished by fluorescently labeled antibodies and DNA content. The ability to address tumor heterogeneity enables unambiguous determination of true-positive sequence variants, loss of heterozygosity, and copy number variants. The proposed strategy overcomes the inherent trade-offs made between sensitivity and specificity in detecting genetic variants from a mixed population, thus rescuing for analysis even the smaller clinical samples with low tumor cellularity.

Concepts: Gene, Genetics, Molecular biology, Positive predictive value, Type I and type II errors, Sensitivity and specificity


OBJECTIVES: To systematically review evidence on depression screening in coronary heart disease (CHD) by assessing the (1) accuracy of screening tools; (2) effectiveness of treatment; and (3) effect of screening on depression outcomes. BACKGROUND: A 2008 American Heart Association (AHA) Science Advisory recommended routine depression screening in CHD. METHODS: CINAHL, Cochrane, EMBASE, ISI, MEDLINE, PsycINFO and SCOPUS databases were searched through December 2, 2011, supplemented by manual journal searches, reference lists, citation tracking, and trial registries. Included articles (1) compared a depression screening instrument to a depression diagnosis; (2) compared depression treatment to placebo or usual care in a randomized controlled trial (RCT); or (3) assessed the effect of screening on depression outcomes in an RCT. RESULTS: There were few examples of screening tools with good sensitivity and specificity using a priori-defined cutoffs in more than one patient sample among 15 screening accuracy studies. Depression treatment with antidepressants or psychotherapy generated modest symptom reductions among post-myocardial infarction (post-MI) and stable CHD patients (N = 6; effect size = 0.20-0.38), but antidepressants did not improve symptoms more than placebo in 2 heart failure (HF) trials. Depression treatment did not improve cardiac outcomes. No RCTs investigated the effects of screening on depression outcomes. CONCLUSIONS: There is evidence that treatment of depression results in modest improvement in depressive symptoms in post-MI and stable CHD patients, although not in HF patients. There is still no evidence that routine screening for depression improves depression or cardiac outcomes. The AHA Science Advisory on depression screening should be revised to reflect this lack of evidence.

Concepts: Positive predictive value, Heart, Randomized controlled trial, Type I and type II errors, Sensitivity and specificity, Heart disease, Clinical research, Meta-analysis


Alzheimer’s disease causes a progressive dementia that currently affects over 35 million individuals worldwide and is expected to affect 115 million by 2050 (ref. 1). There are no cures or disease-modifying therapies, and this may be due to our inability to detect the disease before it has progressed to produce evident memory loss and functional decline. Biomarkers of preclinical disease will be critical to the development of disease-modifying or even preventative therapies. Unfortunately, current biomarkers for early disease, including cerebrospinal fluid tau and amyloid-β levels, structural and functional magnetic resonance imaging and the recent use of brain amyloid imaging or inflammaging, are limited because they are either invasive, time-consuming or expensive. Blood-based biomarkers may be a more attractive option, but none can currently detect preclinical Alzheimer’s disease with the required sensitivity and specificity. Herein, we describe our lipidomic approach to detecting preclinical Alzheimer’s disease in a group of cognitively normal older adults. We discovered and validated a set of ten lipids from peripheral blood that predicted phenoconversion to either amnestic mild cognitive impairment or Alzheimer’s disease within a 2-3 year timeframe with over 90% accuracy. This biomarker panel, reflecting cell membrane integrity, may be sensitive to early neurodegeneration of preclinical Alzheimer’s disease.

Concepts: Alzheimer's disease, Brain, Positive predictive value, Type I and type II errors, Sensitivity and specificity, Magnetic resonance imaging, Dementia, Mild cognitive impairment