Concept: Positive predictive value
- The British Journal of Psychiatry: The Journal of Mental Science
- Published almost 2 years ago
Background: Scales are widely used in psychiatric assessments following self-harm. Robust evidence for their diagnostic use is lacking. Aims: To evaluate the performance of risk scales (Manchester Self-Harm Rule, ReACT Self-Harm Rule, SAD PERSONS scale, Modified SAD PERSONS scale, Barratt Impulsiveness Scale); and patient and clinician estimates of risk in identifying patients who repeat self-harm within 6 months. Method: A multisite prospective cohort study was conducted of adults aged 18 years and over referred to liaison psychiatry services following self-harm. Scale a priori cut-offs were evaluated using diagnostic accuracy statistics. The area under the curve (AUC) was used to determine optimal cut-offs and compare global accuracy. Results: In total, 483 episodes of self-harm were included in the study. The episode-based 6-month repetition rate was 30% (n = 145). Sensitivity ranged from 1% (95% CI 0-5) for the SAD PERSONS scale, to 97% (95% CI 93-99) for the Manchester Self-Harm Rule. Positive predictive values ranged from 13% (95% CI 2-47) for the Modified SAD PERSONS Scale to 47% (95% CI 41-53) for the clinician assessment of risk. The AUC ranged from 0.55 (95% CI 0.50-0.61) for the SAD PERSONS scale to 0.74 (95% CI 0.69-0.79) for the clinician global scale. The remaining scales performed significantly worse than clinician and patient estimates of risk (P<0.001). Conclusions: Risk scales following self-harm have limited clinical utility and may waste valuable resources. Most scales performed no better than clinician or patient ratings of risk. Some performed considerably worse. Positive predictive values were modest. In line with national guidelines, risk scales should not be used to determine patient management or predict self-harm.
Background Bronchoscopy is frequently nondiagnostic in patients with pulmonary lesions suspected to be lung cancer. This often results in additional invasive testing, although many lesions are benign. We sought to validate a bronchial-airway gene-expression classifier that could improve the diagnostic performance of bronchoscopy. Methods Current or former smokers undergoing bronchoscopy for suspected lung cancer were enrolled at 28 centers in two multicenter prospective studies (AEGIS-1 and AEGIS-2). A gene-expression classifier was measured in epithelial cells collected from the normal-appearing mainstem bronchus to assess the probability of lung cancer. Results A total of 639 patients in AEGIS-1 (298 patients) and AEGIS-2 (341 patients) met the criteria for inclusion. A total of 43% of bronchoscopic examinations were nondiagnostic for lung cancer, and invasive procedures were performed after bronchoscopy in 35% of patients with benign lesions. In AEGIS-1, the classifier had an area under the receiver-operating-characteristic curve (AUC) of 0.78 (95% confidence interval [CI], 0.73 to 0.83), a sensitivity of 88% (95% CI, 83 to 92), and a specificity of 47% (95% CI, 37 to 58). In AEGIS-2, the classifier had an AUC of 0.74 (95% CI, 0.68 to 0.80), a sensitivity of 89% (95% CI, 84 to 92), and a specificity of 47% (95% CI, 36 to 59). The combination of the classifier plus bronchoscopy had a sensitivity of 96% (95% CI, 93 to 98) in AEGIS-1 and 98% (95% CI, 96 to 99) in AEGIS-2, independent of lesion size and location. In 101 patients with an intermediate pretest probability of cancer, the negative predictive value of the classifier was 91% (95% CI, 75 to 98) among patients with a nondiagnostic bronchoscopic examination. Conclusions The gene-expression classifier improved the diagnostic performance of bronchoscopy for the detection of lung cancer. 
In intermediate-risk patients with a nondiagnostic bronchoscopic examination, a negative classifier score provides support for a more conservative diagnostic approach. (Funded by Allegro Diagnostics and others; AEGIS-1 and AEGIS-2 ClinicalTrials.gov numbers, NCT01309087 and NCT00746759.)
The technology for evaluating patient-provider interactions in psychotherapy, observational coding, has not changed in 70 years. It is labor-intensive, error prone, and expensive, limiting its use in evaluating psychotherapy in the real world. Engineering solutions from speech and language processing provide new methods for the automatic evaluation of provider ratings from session recordings. The primary data are 200 Motivational Interviewing (MI) sessions from a study on MI training methods with observer ratings of counselor empathy. Automatic Speech Recognition (ASR) was used to transcribe sessions, and the resulting words were used in a text-based predictive model of empathy. Two supporting datasets trained the speech processing tasks including ASR (1200 transcripts from heterogeneous psychotherapy sessions and 153 transcripts and session recordings from 5 MI clinical trials). The accuracy of computationally derived empathy ratings was evaluated against human ratings for each provider. Computationally derived empathy scores and classifications (high vs. low) were highly accurate against human-based codes and classifications, with a correlation of 0.65 and F-score (a weighted average of sensitivity and specificity) of 0.86, respectively. Empathy prediction using human transcription as input (as opposed to ASR) resulted in a slight increase in prediction accuracies, suggesting that the fully automatic system with ASR is relatively robust. Using speech and language processing methods, it is possible to generate accurate predictions of provider performance in psychotherapy from audio recordings alone. This technology can support large-scale evaluation of psychotherapy for dissemination and process studies.
BACKGROUND: Laboratory tests to assess novel oral anticoagulants (NOACs) are under evaluation. Routine monitoring is unnecessary, but under special circumstances bioactivity assessment becomes crucial. We analyzed the effects of NOACs on coagulation tests and the availability of specific assays at different laboratories. METHODS: Plasma samples spiked with dabigatran (Dabi; 120 and 300 μg/L) or rivaroxaban (Riva; 60, 146, and 305 μg/L) were sent to 115 and 38 European laboratories, respectively. International normalized ratio (INR) and activated partial thromboplastin time (APTT) were analyzed for all samples; thrombin time (TT) was analyzed specifically for Dabi and calibrated anti-activated factor X (anti-Xa) activity for Riva. We compared the results with patient samples. RESULTS: Results of Dabi samples were reported by 73 laboratories (13 INR and 9 APTT reagents) and Riva samples by 22 laboratories (5 INR and 4 APTT reagents). Both NOACs increased INR values; the increase was modest, albeit larger, for Dabi, with higher CV, especially with Quick (vs Owren) methods. Both NOACs dose-dependently prolonged the APTT. Again, the prolongation and CVs were larger for Dabi. The INR and APTT results varied reagent-dependently (P < 0.005), with less prolongation in patient samples. TT results (Dabi) and calibrated anti-Xa results (Riva) were reported by only 11 and 8 laboratories, respectively. CONCLUSIONS: The screening tests INR and APTT are suboptimal in assessing NOACs, having high reagent dependence and low sensitivity and specificity. They may provide information, if laboratories recognize their limitations. The variation will likely increase and the sensitivity differ in clinical samples. Specific assays measure NOACs accurately; however, few laboratories applied them.
BACKGROUND: The objective of this study was to compare celiac disease (CD)-specific antibody tests to determine if they could replace jejunal biopsy in patients with a high pretest probability of CD. METHODS: This retrospective study included sera from 149 CD patients and 119 controls, all with intestinal biopsy. All samples were analyzed for IgA and IgG antibodies against native gliadin (ngli) and deamidated gliadin peptides (dpgli), as well as for IgA antibodies against tissue transglutaminase and endomysium. RESULTS: dpgli were superior to ngli for IgG antibody determination: 68% vs. 92% specificity and 79% vs. 85% sensitivity for ngli and dpgli, respectively. Positive (76% vs. 93%) and negative (72% vs. 83%) predictive values were also higher for dpgli than for ngli. Regarding IgA gliadin antibody determination, sensitivity improved from 61% to 78% with dpgli, while specificity and positive predictive value remained at 97% (P < 0.00001). A combination of four tests (IgA anti-dpgli, IgG anti-dpgli, IgA anti-tissue transglutaminase, and IgA anti-endomysium) yielded positive and negative predictive values of 99% and 100%, respectively, and a likelihood ratio positive of 86 with a likelihood ratio negative of 0.00. Omitting the endomysium antibody determination still yielded positive and negative predictive values of 99% and 98%, respectively, and a likelihood ratio positive of 87 with a likelihood ratio negative of 0.01. CONCLUSION: dpgli yielded superior results compared with ngli. A combination of three or four antibody tests including IgA anti-tissue transglutaminase and/or IgA anti-endomysium permitted diagnosis or exclusion of CD without intestinal biopsy in a high proportion of patients (78%). Jejunal biopsy would be necessary in patients with discordant antibody results (22%). With this two-step procedure, only patients with no CD-specific antibodies would be missed.
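The predictive values reported for the IgG gliadin tests above can be approximately recovered from the stated sensitivity, specificity, and the study's case mix using Bayes' rule. The following is a minimal sketch, not the authors' method: it assumes the prevalence is simply 149 CD patients out of 268 biopsied subjects, and the function names `predictive_values` and `likelihood_ratios` are illustrative. The likelihood ratios computed here are for the single IgG anti-dpgli test, not for the test combinations reported in the abstract.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a binary test via Bayes' rule."""
    true_pos = sensitivity * prevalence
    false_neg = (1 - sensitivity) * prevalence
    true_neg = specificity * (1 - prevalence)
    false_pos = (1 - specificity) * (1 - prevalence)
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


def likelihood_ratios(sensitivity, specificity):
    """Return (LR+, LR-) for a binary test."""
    lr_pos = sensitivity / (1 - specificity)
    lr_neg = (1 - sensitivity) / specificity
    return lr_pos, lr_neg


# Assumption: prevalence equals the study's case mix (149 cases, 119 controls).
prevalence = 149 / (149 + 119)

# IgG tests: sensitivity and specificity as quoted in the abstract.
ppv_ngli, npv_ngli = predictive_values(0.79, 0.68, prevalence)
ppv_dpgli, npv_dpgli = predictive_values(0.85, 0.92, prevalence)
lr_pos_dpgli, lr_neg_dpgli = likelihood_ratios(0.85, 0.92)

print(f"ngli  IgG: PPV {ppv_ngli:.0%}, NPV {npv_ngli:.0%}")
print(f"dpgli IgG: PPV {ppv_dpgli:.0%}, NPV {npv_dpgli:.0%}")
print(f"dpgli IgG: LR+ {lr_pos_dpgli:.1f}, LR- {lr_neg_dpgli:.2f}")
```

Because predictive values depend on prevalence, the same tests applied to a population with fewer CD cases would give a lower positive predictive value than the figures obtained in this case-control sample.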
BACKGROUND: Healthcare claims databases have been used in several studies to characterize the risk and burden of chemotherapy-induced febrile neutropenia (FN) and effectiveness of colony-stimulating factors against FN. The accuracy of methods previously used to identify FN in such databases has not been formally evaluated. METHODS: Data comprised linked electronic medical records from Geisinger Health System and healthcare claims data from Geisinger Health Plan. Subjects were classified into subgroups based on whether or not they were hospitalized for FN per the presumptive “gold standard” (ANC <1.0 × 10^9/L, and body temperature ≥38.3°C or receipt of antibiotics) and claims-based definition (diagnosis codes for neutropenia, fever, and/or infection). Accuracy was evaluated principally based on positive predictive value (PPV) and sensitivity. RESULTS: Among 357 study subjects, 82 (23%) met the gold standard for hospitalized FN. For the claims-based definition including diagnosis codes for neutropenia plus fever in any position (n=28), PPV was 100% and sensitivity was 34% (95% CI: 24–45). For the definition including neutropenia in the primary position (n=54), PPV was 87% (78–95) and sensitivity was 57% (46–68). For the definition including neutropenia in any position (n=71), PPV was 77% (68–87) and sensitivity was 67% (56–77). CONCLUSIONS: Patients hospitalized for chemotherapy-induced FN can be identified in healthcare claims databases, with an acceptable level of misclassification, using diagnosis codes for neutropenia, or neutropenia plus fever.
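The trade-off between PPV and sensitivity across the three claims-based definitions can be illustrated with simple counts. This is a hedged sketch: the abstract gives the number of subjects flagged by each definition and the 82 gold-standard cases, but not the true-positive counts. The values 28, 47, and 55 below are inferred from the rounded percentages and should be treated as assumptions, as should the name `ppv_and_sensitivity`.

```python
GOLD_STANDARD_CASES = 82  # subjects meeting the gold standard (from the abstract)

# (definition, subjects flagged by claims, true positives)
# True-positive counts are NOT reported in the abstract; they are inferred
# here from the rounded PPV and sensitivity figures.
definitions = [
    ("neutropenia + fever, any position", 28, 28),
    ("neutropenia, primary position", 54, 47),
    ("neutropenia, any position", 71, 55),
]


def ppv_and_sensitivity(flagged, true_pos, gold_cases):
    """PPV = TP / all flagged; sensitivity = TP / all gold-standard cases."""
    return true_pos / flagged, true_pos / gold_cases


results = {
    label: ppv_and_sensitivity(flagged, true_pos, GOLD_STANDARD_CASES)
    for label, flagged, true_pos in definitions
}

for label, (ppv, sensitivity) in results.items():
    print(f"{label}: PPV {ppv:.0%}, sensitivity {sensitivity:.0%}")
```

Broadening the definition flags more subjects, which raises sensitivity (more true cases captured) while lowering PPV (more flagged subjects who do not meet the gold standard).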
Bronchoscopy is frequently used for the evaluation of suspicious pulmonary lesions found on computed tomography, but its sensitivity for detecting lung cancer is limited. Recently, a bronchial genomic classifier was validated to improve the sensitivity of bronchoscopy for lung cancer detection, demonstrating a high sensitivity and negative predictive value among patients at intermediate risk (10-60%) for lung cancer with an inconclusive bronchoscopy. Our objective for this study was to determine if a negative genomic classifier result that down-classifies a patient from intermediate risk to low risk (<10%) for lung cancer would reduce the rate at which physicians recommend more invasive testing among patients with an inconclusive bronchoscopy.
The accurate diagnosis of asbestos-related diseases is important because of past and current asbestos exposures. This study evaluated the reliability of clinical diagnoses of asbestos-related diseases in former mineworkers using autopsies as the reference standard. Sensitivity, specificity, positive predictive value and negative predictive value were calculated. The 149 cases identified had clinical examinations 0.3-7.4 years before death. More asbestos-related diseases were diagnosed at autopsy rather than clinically: 77 versus 52 for asbestosis, 27 versus 14 for mesothelioma and 22 versus 3 for lung cancer. Sensitivity and specificity values for clinical diagnoses were 50.6% and 81.9% for asbestosis, 40.7% and 97.5% for mesothelioma, and 13.6% and 100.0% for lung cancer. False-negative diagnoses of asbestosis were more likely using radiographs of acceptable (versus good) quality and in cases with pulmonary tuberculosis at autopsy. The low sensitivity values are indicative of the high proportion of false-negative diagnoses. It is unlikely that these were the result of disease manifestation between the last clinical assessment and autopsy. Where clinical features suggest asbestos-related diseases but the chest radiograph is negative, more sophisticated imaging techniques or immunohistochemistry for asbestos-related cancers should be used. Autopsies are useful for the detection of previously undiagnosed and misdiagnosed asbestos-related diseases, and for monitoring clinical practice and delivery of compensation.
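The abstract reports sensitivity and specificity but not the predictive values it says were calculated. As a rough illustration, a 2x2 table for each disease can be reconstructed from the 149 cases, the autopsy and clinical diagnosis counts, and a true-positive count chosen to match the reported sensitivity. The true-positive counts (39, 11, 3) and the resulting predictive values are inferences from the published figures, not values stated by the authors, and `diagnostic_metrics` is an illustrative name.

```python
TOTAL_CASES = 149  # from the abstract

# disease: (autopsy-positive, clinically diagnosed, true positives)
# The first two counts come from the abstract; the true-positive counts are
# assumptions chosen so that sensitivity matches the reported percentages.
cases = {
    "asbestosis": (77, 52, 39),
    "mesothelioma": (27, 14, 11),
    "lung cancer": (22, 3, 3),
}


def diagnostic_metrics(autopsy_pos, clinical_pos, true_pos, total):
    """Build the 2x2 table (autopsy as reference) and derive the four metrics."""
    false_pos = clinical_pos - true_pos
    false_neg = autopsy_pos - true_pos
    true_neg = total - true_pos - false_pos - false_neg
    return {
        "sensitivity": true_pos / (true_pos + false_neg),
        "specificity": true_neg / (true_neg + false_pos),
        "ppv": true_pos / (true_pos + false_pos),
        "npv": true_neg / (true_neg + false_neg),
    }


metrics = {
    disease: diagnostic_metrics(*counts, TOTAL_CASES)
    for disease, counts in cases.items()
}

for disease, m in metrics.items():
    print(disease, {name: f"{value:.1%}" for name, value in m.items()})
```

Under these reconstructed counts, the sensitivity and specificity match the reported values to one decimal place, which suggests the tables are at least consistent with the abstract. The derived predictive values remain illustrative only.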
Earlier detection of colorectal cancer greatly improves prognosis, largely through surgical excision of neoplastic polyps. These include benign adenomas which can transform over time to malignant adenocarcinomas. This progression may be associated with changes in full blood count indices. An existing risk algorithm derived in Israel stratifies individuals according to colorectal cancer risk using full blood count data, but has not been validated in the UK. We undertook a retrospective analysis using the Clinical Practice Research Datalink. Patients aged over 40 with full blood count data were risk-stratified and followed up for a diagnosis of colorectal cancer over a range of time intervals. The primary outcome was the area under the receiver operating characteristic curve for the 18-24-month interval. We also undertook a case-control analysis (matching for age, sex, and year of risk score), and a cohort study of patients undergoing full blood count testing during 2012, to estimate predictive values. We included 2,550,119 patients. The area under the curve for the 18-24-month interval was 0.776 [95% confidence interval (CI): 0.771, 0.781]. Performance improves as the time interval reduces. The area under the curve for the age-matched case-control analysis was 0.583 [0.574, 0.591]. For the population risk-scored in 2012, the positive predictive value at 99.5% specificity was 8.8% with negative predictive value 99.6%. The algorithm offers an additional means of identifying risk of colorectal cancer, and could support other approaches to early detection, including screening and active case finding.
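A positive predictive value of 8.8% alongside 99.5% specificity can look surprising, but it follows from how rare the outcome is in a general population. The sketch below shows how PPV changes with prevalence at a fixed specificity of 99.5%. The sensitivity of 10% and the prevalence values are hypothetical, since neither is reported in the abstract; they are chosen only to show the shape of the relationship.

```python
def ppv(sensitivity, specificity, prevalence):
    """Positive predictive value from Bayes' rule."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    return true_pos / (true_pos + false_pos)


SPECIFICITY = 0.995  # from the abstract
SENSITIVITY = 0.10   # hypothetical; not reported in the abstract

# Hypothetical prevalences, from a general population to a high-risk group.
prevalences = [0.001, 0.005, 0.02, 0.10]

ppv_by_prevalence = {p: ppv(SENSITIVITY, SPECIFICITY, p) for p in prevalences}

for prevalence, value in ppv_by_prevalence.items():
    print(f"prevalence {prevalence:.1%}: PPV {value:.1%}")
```

Even with very few false positives per person tested, most positive results are false when the condition affects well under 1% of those tested, which is why predictive values from one population do not transfer directly to another.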
Genomic analysis of tumor tissue is the standard technique for identifying DNA alterations in malignancies. Genomic analysis of circulating tumor cell-free DNA (cfDNA) represents a relatively non-invasive method of assessing genomic alterations using peripheral blood. We compared the concordance of genomic alterations between cfDNA and tissue biopsies in this retrospective study. Twenty-eight patients with advanced solid tumors with paired next-generation sequencing tissue and cfDNA biopsies were identified. Sixty-five genes were common to both assays. Concordance was defined as the presence or absence of the identical genomic alteration(s) in a single gene on both molecular platforms. Including all aberrations, the average number of alterations per patient for tissue and cfDNA analysis was 4.82 and 2.96, respectively. When eliminating alterations not detectable in the cfDNA assay, mean number of alterations for tissue and cfDNA was 3.21 and 2.96, respectively. Overall, concordance was 91.9-93.9%. However, the concordance rate decreased to 11.8-17.1% when considering only genes with reported genomic alterations in either assay. Over 50% of mutations detected in either technique were not detected using the other biopsy technique, indicating a potential complementary role of each assay. Across 5 genes (TP53, EGFR, KRAS, APC, CDKN2A), sensitivity and specificity were 59.1% and 94.8%, respectively. Potential explanations for the lack of concordance include differences in assay platform, spatial and temporal factors, tumor heterogeneity, interval treatment, subclones, and potential germline DNA contamination. These results highlight the importance of prospective studies to evaluate concordance of genomic findings between distinct platforms that ultimately may inform treatment decisions.