SciCombinator

Discover the most talked about and latest scientific content & concepts.

Concept: Multivariate statistics

171

BACKGROUND: Non-invasive phenotyping of chronic respiratory diseases would be highly beneficial in the personalised medicine of the future. Volatile organic compounds can be measured in the exhaled breath and may be produced or altered by disease processes. We investigated whether distinct patterns of these compounds were present in chronic obstructive pulmonary disease (COPD) and clinically relevant disease phenotypes. METHODS: Breath samples from 39 COPD subjects and 32 healthy controls were collected and analysed using gas chromatography time-of-flight mass spectrometry. Subjects with COPD also underwent sputum induction. Discriminatory compounds were identified by univariate logistic regression followed by multivariate analysis: 1. principal component analysis; 2. multivariate logistic regression; 3. receiver operating characteristic (ROC) analysis. RESULTS: Comparing COPD versus healthy controls, principal component analysis clustered the 20 best-discriminating compounds into four components explaining 71% of the variance. Multivariate logistic regression constructed an optimised model using two components with an accuracy of 69%. The model had 85% sensitivity, 50% specificity and ROC area under the curve of 0.74. Analysis of COPD subgroups showed the method could classify COPD subjects with far greater accuracy. Models were constructed which classified subjects with ≥2% sputum eosinophilia with ROC area under the curve of 0.94 and those having frequent exacerbations 0.95. Potential biomarkers correlated to clinical variables were identified in each subgroup. CONCLUSION: The exhaled breath volatile organic compound profile discriminated between COPD and healthy controls and identified clinically relevant COPD subgroups. If these findings are validated in prospective cohorts, they may have diagnostic and management value in this disease.
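
As a rough consistency check on the figures above, overall accuracy can be recomputed from sensitivity, specificity and group sizes. The sketch below assumes the reported 85% sensitivity and 50% specificity apply to all 39 COPD subjects and 32 controls; the function name is illustrative and does not come from the study.

```python
def accuracy_from_rates(sensitivity, specificity, n_cases, n_controls):
    """Overall accuracy implied by sensitivity, specificity and group sizes."""
    true_pos = sensitivity * n_cases       # cases classified correctly
    true_neg = specificity * n_controls    # controls classified correctly
    return (true_pos + true_neg) / (n_cases + n_controls)


# Figures taken from the abstract: 39 COPD subjects, 32 healthy controls.
accuracy = accuracy_from_rates(0.85, 0.50, 39, 32)
print(f"implied accuracy: {accuracy:.1%}")
```

Under these assumptions the implied accuracy comes to roughly 69%, in line with the reported value.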

Concepts: Medicine, Asthma, Pneumonia, Multivariate statistics, Chronic obstructive pulmonary disease, Volatile organic compound, Organic compounds, Linear discriminant analysis

168

BACKGROUND: The mortality rate of patients complicated with sepsis-associated organ failure remains high in spite of intensive care treatment. The purpose of this study was to define the duration of systemic inflammatory response syndrome (SIRS) before organ failure (DSOF) and determine the value of DSOF as a prognostic factor in septic patients. METHODS: This retrospective cohort study was conducted in an 11-bed medical and surgical intensive care unit (ICU) in a university hospital. The primary endpoint was in-hospital mortality of the septic patients. RESULTS: One hundred ten septic patients with organ failure and/or shock were enrolled in this study. The in-hospital mortality rate was 36.9%. The median DSOF was 28.5 h. As a metric variable, DSOF was a statistically significant prognostic factor according to univariate analysis (survivor: 74.7 ± 9.6 h, non-survivor: 58.8 ± 16.5 h, p = 0.015). On the basis of the ROC curve, we defined an optimal cutoff of 24 h, with which we divided the patients as follows: group 1 (n = 50) comprised patients with a DSOF ≤24 h, and group 2 (n = 60) contained patients with a DSOF >24 h. There were statistically significant differences in the in-hospital mortality rate between the two groups (52.0% vs. 25.0%, p = 0.004). Furthermore, by multivariate analysis, DSOF ≤24 h (odds ratio: 5.89, 95% confidence interval: 1.46-23.8, p = 0.013) was a significant independent prognostic factor. CONCLUSION: DSOF may be a useful prognostic factor for severe sepsis.
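
To illustrate how the two-group mortality comparison relates to an odds ratio, the sketch below computes a crude (unadjusted) odds ratio with a Woolf log-based 95% confidence interval. The death counts are an assumption derived from the reported percentages (52.0% of 50 and 25.0% of 60), and the function name is illustrative. This is not the authors' multivariate model, so the result is not expected to match the adjusted odds ratio of 5.89.

```python
import math


def odds_ratio_with_ci(a, b, c, d, z=1.96):
    """Crude odds ratio and Woolf (log-based) confidence interval.

    a, b: deaths and survivors in the group of interest
    c, d: deaths and survivors in the reference group
    """
    odds_ratio = (a * d) / (b * c)
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log)
    upper = math.exp(math.log(odds_ratio) + z * se_log)
    return odds_ratio, lower, upper


# Assumed counts: 26 of 50 died with DSOF <= 24 h, 15 of 60 with DSOF > 24 h.
crude_or, ci_low, ci_high = odds_ratio_with_ci(26, 24, 15, 45)
print(f"crude OR = {crude_or:.2f} (95% CI {ci_low:.2f}-{ci_high:.2f})")
```

The gap between a crude estimate like this and the reported adjusted value reflects the covariates included in the multivariate analysis, which the abstract does not list.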

Concepts: Inflammation, Cohort study, Statistics, Systemic inflammatory response syndrome, Intensive care medicine, Multivariate statistics, Septic shock, Sepsis

168

BACKGROUND: Trachoma is the leading cause of preventable blindness worldwide. It is common in areas where the people are socio-economically deprived. The aim of this study was to assess active trachoma and associated risk factors among children 1–9 years in East Gojjam. METHODS: A community-based cross-sectional study was conducted in Baso Liben District from February to April 2012. A two-stage random cluster-sampling technique was employed and all children 1–9 years old from each household were clinically assessed for trachoma based on simplified WHO 1983 classification. Data were collected by using semi-structured interview, pre-tested questionnaire and observation. The data were entered and analyzed using SPSS version 16 statistical package. RESULTS: From a total of 792 children screened for trachoma (of which 50.6% were girls), the overall prevalence of active trachoma was 24.1%, consisting of 17.2% [95% CI: 14.8, 20.1] TF and 6.8% TI. There were variations among children living in low land (29.3%) and in medium land (21.4%). In multivariate analysis, low monthly income (adjusted odds ratio [AOR] = 2.98; 95% confidence interval [CI]: 1.85-7.85), illiterate family (AOR = 5.18; 95% CI: 2.92-9.17), unclean face (AOR = 18.68; 95% CI: 1.98-175.55), access to water source (AOR = 2.01; 95% CI: 1.27-3.15), less than 20 liters of water use (AOR = 4.88; 95% CI: 1.51-15.78), not using soap for face washing (AOR = 5.84; 95% CI: 1.98-17.19), not using latrine frequently (AOR = 1.75; 95% CI: 0.01-0.42), density of flies (AOR = 3.77; 95% CI: 2.26-6.29), less knowledgeable family (AOR = 3.91; 95% CI: 2.40-6.38) and average monthly income (AOR = 2.98; 95% CI: 1.85-7.85) were found independently associated with trachoma. CONCLUSION: Active trachoma is a major public health problem among children aged 1–9 years and is significantly associated with a number of risk factors. Improvement in awareness of facial hygiene, environmental conditions, mass antibiotic distribution and health education on trachoma transmission and prevention should be strengthened in the District.

Concepts: Epidemiology, Medical statistics, Cross-sectional study, Risk, Multivariate statistics, Odds ratio, SPSS

167

BACKGROUND: Elevated Glasgow Prognostic Score (GPS) has been related to poor prognosis in patients with hepatocellular carcinoma (HCC) undergoing surgical resection or receiving sorafenib. The aim of this study was to investigate the prognostic value of GPS in patients with various stages of the disease and with different liver functional status. METHODS: One hundred and fifty patients with newly diagnosed HCC were prospectively evaluated. Patients were divided according to their GPS scores. Univariate and multivariate analyses were performed to identify clinicopathological variables associated with overall survival; the identified variables were then compared with those of other validated staging systems. RESULTS: Elevated GPS was associated with increased aspartate aminotransferase (P<0.0001), total bilirubin (P<0.0001), decreased albumin (P<0.0001), alpha-fetoprotein (P=0.008), larger tumor diameter (P=0.003), tumor number (P=0.041), vascular invasion (P=0.0002), extrahepatic metastasis (P=0.02), higher Child-Pugh scores (P<0.0001), and higher Cancer Liver Italian Program scores (P<0.0001). On multivariate analysis, the elevated GPS was independently associated with worse overall survival. CONCLUSIONS: Our results demonstrate that the GPS can serve as an independent marker of poor prognosis in patients with HCC in various stages of disease and different liver functional status.

Concepts: Cancer, Lung cancer, Cirrhosis, Liver, Multivariate statistics, Bilirubin, Prognosis, Jaundice

166

BACKGROUND: Static posture, repetitive movements and lack of physical variation are known risk factors for work-related musculoskeletal disorders, and thus need to be properly assessed in occupational studies. The aims of this study were (i) to investigate the effectiveness of a conventional exposure variation analysis (EVA) in discriminating exposure time lines and (ii) to compare it with a new cluster-based method for analysis of exposure variation. METHODS: For this purpose, we simulated a repeated cyclic exposure varying within each cycle between “low” and “high” exposure levels in a “near” or “far” range, and with “low” or “high” velocities (exposure change rates). The duration of each cycle was also manipulated by selecting a “small” or “large” standard deviation of the cycle time. These parameters reflected three dimensions of exposure variation, i.e. range, frequency and temporal similarity. Each simulation trace included two realizations of 100 concatenated cycles with either low (rho = 0.1), medium (rho = 0.5) or high (rho = 0.9) correlation between the realizations. These traces were analyzed by conventional EVA, and a novel cluster-based EVA (C-EVA). Principal component analysis (PCA) was applied on the marginal distributions of 1) the EVA of each of the realizations (univariate approach), 2) a combination of the EVA of both realizations (multivariate approach) and 3) C-EVA. The least number of principal components describing more than 90% of variability in each case was selected and the projection of marginal distributions along the selected principal component was calculated. A linear classifier was then applied to these projections to discriminate between the simulated exposure patterns, and the accuracy of classified realizations was determined. RESULTS: C-EVA classified exposures more correctly than univariate and multivariate EVA approaches; classification accuracy was 49%, 47% and 52% for EVA (univariate and multivariate), and C-EVA, respectively (p < 0.001). All three methods performed poorly in discriminating exposure patterns differing with respect to the variability in cycle time duration. CONCLUSION: While C-EVA had a higher accuracy than conventional EVA, both failed to detect differences in temporal similarity. The data-driven optimality of data reduction and the capability of handling multiple exposure time lines in a single analysis are the advantages of the C-EVA.
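
The component-selection rule in the methods (the least number of principal components describing more than 90% of variability) can be sketched as a cumulative-variance check. The eigenvalues below are hypothetical and the function name is illustrative; neither comes from the study.

```python
def n_components_for_variance(eigenvalues, threshold=0.90):
    """Smallest number of leading components whose cumulative share of
    total variance is strictly greater than `threshold`."""
    ordered = sorted(eigenvalues, reverse=True)
    total = sum(ordered)
    cumulative = 0.0
    for k, value in enumerate(ordered, start=1):
        cumulative += value
        if cumulative / total > threshold:
            return k
    return len(ordered)


# Hypothetical PCA eigenvalues (variance captured by each component).
example_eigenvalues = [5.0, 3.0, 1.5, 0.5]
print(n_components_for_variance(example_eigenvalues))
```

With these made-up values the first two components cover 80% of the variance and the first three cover 95%, so three components are retained.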

Concepts: Multivariate statistics, Factor analysis, Principal component analysis, Exposure, Singular value decomposition, Photography, Linear discriminant analysis, The Unscrambler

151

OBJECTIVE

To analyze the prevalence of bullying and its associated factors in Brazilian adolescents.

METHODS

Data were used from a population-based household survey conducted by the Urban Health Observatory (OSUBH) utilizing probability sampling in three stages: census tracts, residences, and individuals. The survey included 598 adolescents (14-17 years old) who responded to questions on bullying, sociodemographic characteristics, health-risk behaviors, educational well-being, family structure, physical activity, markers of nutritional habits, and subjective well-being (body image, personal satisfaction, and satisfaction with their present and future life). Univariate and multivariate analyses were performed using robust Poisson regression.

RESULTS

The prevalence of bullying was 26.2% (28.0% among males, 24.0% among females). Most bullying cases occurred at or en route to school (70.5%), followed by on the streets (28.5%), at home (9.8%), while practicing sports (7.3%), at parties (4.6%), at work (1.7%), and at other locations (1.6%). Reports of bullying were associated with life dissatisfaction, difficulty relating to parents, involvement in fights with peers and insecurity in the neighborhood.

CONCLUSIONS

A high prevalence of bullying among participating adolescents was found, and the school serves as the main bullying location, although other sites such as home, parties and workplace were also reported. Characteristics regarding self-perception and adolescent perceptions of their environment were also associated with bullying, thus advancing the knowledge of this type of violence, especially in urban centers of developing countries.


Concepts: Family, Urban area, Multivariate statistics, Home, The Streets

138

The effective production and usage of ginsenosides, given their distinct pharmacological effects, are receiving increasing amounts of attention. As the ginsenoside content differs in different parts of Panax ginseng, we wanted to assess and compare the ginsenoside content in the ginseng roots, leaves, stems, and berries. To extract the ginsenosides, 70% (v/v) methanol was used. The optimal ultra-performance liquid chromatography-quadrupole time of flight mass spectrometry (UPLC-QTOF/MS) method was used to profile various ginsenosides from the different parts of P. ginseng. The datasets were then subjected to multivariate analysis including principal component analysis (PCA) and hierarchical clustering analysis (HCA). A UPLC-QTOF/MS method with an in-house library was constructed to profile 58 ginsenosides. With this method, a total of 39 ginsenosides were successfully identified and quantified in the ginseng roots, leaves, stems, and berries. PCA and HCA characterized the different ginsenoside compositions from the different parts. The quantitative ginsenoside contents were also characterized from each plant part. The results of this study indicate that the UPLC-QTOF/MS method can be an effective tool to characterize various ginsenosides from the different parts of P. ginseng.

Concepts: Mass spectrometry, Multivariate statistics, Principal component analysis, Root, Ginseng, Ginsenoside, Panax, Panax ginseng

138

Evaluation of Xagrid® Efficacy and Long-term Safety, a Phase IV, prospective, non-interventional study performed in 13 European countries, enrolled high-risk essential thrombocythemia patients treated with cytoreductive therapy. Primary objectives were safety and pregnancy outcomes. Of 3721 registered patients, 3649 received cytoreductive therapy. At registration, 3611 were receiving: anagrelide (Xagrid®) (n=804), other cytoreductive therapy (n=2666), anagrelide + other cytoreductive therapy (n=141). Median age was 56 vs 70 years for anagrelide vs other cytoreductive therapy. Event rates (patients with events/100 patient-years) were, for total thrombosis 1.62 vs 2.06, venous thrombosis 0.15 vs 0.53. Anagrelide was more commonly associated with hemorrhage (0.89 vs 0.43), especially with anti-aggregatory therapy (1.35 vs 0.33) and myelofibrosis (1.04 vs 0.30). Other cytoreductive therapies were more associated with acute leukemia (AL) (0.28 vs 0.07) and other malignancies (1.29 vs 0.44). Post-hoc multivariate analyses identified increased risk for thrombosis with prior thrombohemorrhagic events, age ≥65, cardiovascular risk factors, or hypertension. Risk factors for transformation were prior thrombohemorrhagic events, age ≥65, time since diagnosis, and platelet count increase. Safety analysis reflected published data and no new safety concerns for anagrelide were found. Live births occurred in 41/54 pregnancies (76%). (ClinicalTrials.gov #NCT00567502).
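
The event-rate unit used above (patients with events per 100 patient-years of follow-up) can be sketched as follows. The patient counts and follow-up time in the first call are hypothetical, since the abstract gives only the resulting rates; the ratio uses the two reported hemorrhage rates and is a simple unadjusted comparison, not a result from the study.

```python
def rate_per_100_patient_years(patients_with_event, patient_years):
    """Event rate expressed per 100 patient-years of follow-up."""
    return 100.0 * patients_with_event / patient_years


# Hypothetical example: 12 patients with an event over 800 patient-years.
example_rate = rate_per_100_patient_years(12, 800)
print(f"{example_rate:.2f} per 100 patient-years")

# Unadjusted ratio of the reported hemorrhage rates (0.89 vs 0.43).
hemorrhage_ratio = 0.89 / 0.43
print(f"rate ratio: {hemorrhage_ratio:.2f}")
```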

Concepts: Scientific method, Blood, Observational study, Platelet, Multivariate statistics, Essential thrombocytosis, Phase IV, Anagrelide

85

The number of diagnosed cases of Autism Spectrum Disorders (ASD) has increased dramatically over the last four decades; however, there is still considerable debate regarding the underlying pathophysiology of ASD. This lack of biological knowledge restricts diagnoses to behavioral observations and psychometric tools. However, physiological measurements should support these behavioral diagnoses in the future in order to enable earlier and more accurate diagnoses. Stepping towards this goal of incorporating biochemical data into ASD diagnosis, this paper analyzes measurements of metabolite concentrations of the folate-dependent one-carbon metabolism and transsulfuration pathways taken from blood samples of 83 participants with ASD and 76 age-matched neurotypical peers. Fisher Discriminant Analysis enables multivariate classification of the participants as on the spectrum or neurotypical, which results in 96.1% of all neurotypical participants being correctly identified as such while still correctly identifying 97.6% of the ASD cohort. Furthermore, kernel partial least squares is used to predict adaptive behavior, as measured by the Vineland Adaptive Behavior Composite score, where measurement of five metabolites of the pathways was sufficient to predict the Vineland score with an R² of 0.45 after cross-validation. This level of accuracy for classification as well as severity prediction far exceeds any other approach in this field and is a strong indicator that the metabolites under consideration are strongly correlated with an ASD diagnosis but also that the statistical analysis used here offers tremendous potential for extracting important information from complex biochemical data sets.
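
For readers unfamiliar with Fisher Discriminant Analysis, the sketch below shows the two-class idea on two features: project each sample onto w = Sw⁻¹(m_b − m_a), where Sw is the pooled within-class scatter, and split at the midpoint of the projected class means. The data points are synthetic, the two-feature restriction is a simplification, and all names are illustrative; the study's metabolite data and exact procedure are not reproduced here.

```python
def mean(points):
    """Component-wise mean of a list of 2-D points."""
    n = len(points)
    return [sum(p[i] for p in points) / n for i in range(2)]


def scatter(points, centre):
    """Within-class scatter matrix entries (sxx, sxy, syy) about `centre`."""
    sxx = sum((p[0] - centre[0]) ** 2 for p in points)
    sxy = sum((p[0] - centre[0]) * (p[1] - centre[1]) for p in points)
    syy = sum((p[1] - centre[1]) ** 2 for p in points)
    return sxx, sxy, syy


def project(w, p):
    """Scalar projection of point p onto direction w."""
    return w[0] * p[0] + w[1] * p[1]


def fisher_direction(class_a, class_b):
    """Return (w, threshold) for a two-class, two-feature Fisher discriminant."""
    ma, mb = mean(class_a), mean(class_b)
    sa, sb = scatter(class_a, ma), scatter(class_b, mb)
    # Pooled within-class scatter Sw = Sa + Sb.
    sxx, sxy, syy = sa[0] + sb[0], sa[1] + sb[1], sa[2] + sb[2]
    det = sxx * syy - sxy * sxy
    dx, dy = mb[0] - ma[0], mb[1] - ma[1]
    # w = Sw^-1 (mb - ma), using the closed-form inverse of a 2x2 matrix.
    w = [(syy * dx - sxy * dy) / det, (-sxy * dx + sxx * dy) / det]
    threshold = (project(w, ma) + project(w, mb)) / 2
    return w, threshold


# Synthetic two-feature samples for two groups (not study data).
group_a = [(1, 2), (2, 3), (3, 3)]
group_b = [(6, 5), (7, 8), (8, 7)]
w, threshold = fisher_direction(group_a, group_b)
predicted_b = [project(w, p) > threshold for p in group_a + group_b]
print(predicted_b)
```

A sample is assigned to the second group when its projection exceeds the threshold; on this toy data the two groups separate cleanly, which says nothing about performance on real measurements.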

Concepts: DNA, Scientific method, Regression analysis, Multivariate statistics, Psychometrics, Autism, Asperger syndrome, Autism spectrum

80

We estimate models of consumer food waste awareness and attitudes using responses from a national survey of U.S. residents. Our models are interpreted through the lens of several theories that describe how pro-social behaviors relate to awareness, attitudes and opinions. Our analysis of patterns among respondents' food waste attitudes yields a model with three principal components: one that represents perceived practical benefits households may lose if food waste were reduced, one that represents the guilt associated with food waste, and one that represents whether households feel they could be doing more to reduce food waste. We find our respondents express significant agreement that some perceived practical benefits are ascribed to throwing away uneaten food, e.g., nearly 70% of respondents agree that throwing away food after the package date has passed reduces the odds of foodborne illness, while nearly 60% agree that some food waste is necessary to ensure meals taste fresh. We identify that these attitudinal responses significantly load onto a single principal component that may represent a key attitudinal construct useful for policy guidance. Further, multivariate regression analysis reveals a significant positive association between the strength of this component and household income, suggesting that higher income households most strongly agree with statements that link throwing away uneaten food to perceived private benefits.

Concepts: Regression analysis, Statistics, Multivariate statistics, Household, Linear discriminant analysis, Household income in the United States, Food safety, Income quintiles