Concept: Covariate


It is common to present multiple adjusted effect estimates from a single model in a single table. For example, a table might show odds ratios for one or more exposures and also for several confounders from a single logistic regression. This can lead to mistaken interpretations of these estimates. We use causal diagrams to display the sources of the problems. Presentation of exposure and confounder effect estimates from a single model may lead to several interpretative difficulties, inviting confusion of direct-effect estimates with total-effect estimates for covariates in the model. These effect estimates may also be confounded even though the effect estimate for the main exposure is not confounded. Interpretation of these effect estimates is further complicated by heterogeneity (variation, modification) of the exposure effect measure across covariate levels. We offer suggestions to limit potential misunderstandings when multiple effect estimates are presented, including precise distinction between total and direct effect measures from a single model, and use of multiple models tailored to yield total-effect estimates for covariates.

Concepts: Logit, Odds ratio, Confounding, Real number, Covariate, Statistical terminology, Interpretation, Language interpretation


Many case-control tests of rare variation are implemented in statistical frameworks that make correction for confounders like population stratification difficult. Simple permutation of disease status is unacceptable for resolving this issue because the replicate data sets do not have the same confounding as the original data set. These limitations make it difficult to apply rare-variant tests to samples in which confounding most likely exists, e.g., samples collected from admixed populations. To enable the use of such rare-variant methods in structured samples, as well as to facilitate permutation tests for any situation in which case-control tests require adjustment for confounding covariates, we propose to establish the significance of a rare-variant test via a modified permutation procedure. Our procedure uses Fisher’s noncentral hypergeometric distribution to generate permuted data sets with the same structure present in the actual data set such that inference is valid in the presence of confounding factors. We use simulated sequence data based on coalescent models to show that our permutation strategy corrects for confounding due to population stratification that, if ignored, would otherwise inflate the size of a rare-variant test. We further illustrate the approach by using sequence data from the Dallas Heart Study of energy metabolism traits. Researchers can implement our permutation approach by using the R package BiasedUrn.
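
As a rough illustration of the distribution named above, the sketch below computes the probability mass function of Fisher's noncentral hypergeometric distribution for the simplest case of two strata and draws one permuted case count from it. The function names, the two-stratum setup, and the assumption that the odds ratio `omega` is already known are simplifications introduced here; this is neither the authors' full procedure nor the BiasedUrn API.

```python
import math
import random


def fisher_nch_pmf(m1, m2, n, omega):
    """PMF of Fisher's noncentral hypergeometric distribution.

    m1, m2 : number of subjects in stratum 1 and stratum 2 (hypothetical strata)
    n      : total number of cases to assign across both strata
    omega  : odds of being a case in stratum 1 relative to stratum 2
             (assumed known here; in practice it would be estimated)
    Returns a dict mapping x (cases falling in stratum 1) to its probability.
    """
    lo = max(0, n - m2)
    hi = min(n, m1)
    weights = {
        x: math.comb(m1, x) * math.comb(m2, n - x) * omega ** x
        for x in range(lo, hi + 1)
    }
    total = sum(weights.values())
    return {x: w / total for x, w in weights.items()}


def draw_cases_in_stratum1(m1, m2, n, omega, rng):
    """Draw the number of cases assigned to stratum 1 in one permuted data set."""
    pmf = fisher_nch_pmf(m1, m2, n, omega)
    xs = list(pmf)
    return rng.choices(xs, weights=[pmf[x] for x in xs], k=1)[0]


if __name__ == "__main__":
    rng = random.Random(0)
    # With omega = 1 this is the ordinary hypergeometric distribution,
    # i.e. simple permutation of disease status that ignores the strata.
    print(fisher_nch_pmf(3, 3, 2, 1.0))
    # With omega > 1, permuted data sets keep more cases in stratum 1.
    print(draw_cases_in_stratum1(50, 50, 30, 2.0, rng))
```

Setting `omega` to 1 recovers simple permutation, which is the case the abstract describes as unacceptable when confounding is present; values other than 1 keep the stratum-level imbalance in each replicate.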

Concepts: Experimental design, Statistics, Data set, Confounding, Case-control study, Covariate, Permutation, Hypergeometric distribution


Problems common to many longitudinal HIV/AIDS, cancer, vaccine, and environmental exposure studies are a skewed outcome subject to a lower limit of quantification and time-varying covariates measured with error. Relatively little published work deals with these features of longitudinal data simultaneously. In particular, the proportion of left-censored data falling below a limit of detection may sometimes be larger than expected under the usually assumed log-normal distribution. In such cases, alternative models that can account for a high proportion of censored data should be considered. In this article, we present an extension of the Tobit model that incorporates a mixture of true undetectable observations and values from a skew-normal distribution for an outcome with possible left censoring and skewness, and covariates with substantial measurement error. To quantify the covariate process, we offer a flexible nonparametric mixed-effects model within the Tobit framework. A Bayesian modeling approach is used to assess the simultaneous impact of left censoring, skewness, and measurement error in covariates on inference. The proposed methods are illustrated using real data from an AIDS clinical study. Copyright © 2013 John Wiley & Sons, Ltd.
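
One way to write the likelihood contribution that such a mixture implies is sketched below. The notation is assumed here for illustration and is not taken from the article: $y_i$ is the outcome, $c$ the limit of detection, $d_i = 1$ if observation $i$ is censored, $p$ the proportion of true undetectable observations, and $f_{SN}$, $F_{SN}$ the density and distribution function of a skew-normal distribution with location $\mu_i$, scale $\sigma$, and skewness $\delta$.

```latex
L_i(\theta) =
  \bigl[\, p + (1 - p)\, F_{SN}(c \mid \mu_i, \sigma, \delta) \,\bigr]^{d_i}
  \bigl[\, (1 - p)\, f_{SN}(y_i \mid \mu_i, \sigma, \delta) \,\bigr]^{1 - d_i},
\qquad
d_i = \mathbf{1}\{ y_i \le c \}.
```

Under this notation, setting $p = 0$ and $\delta = 0$ gives the standard Tobit likelihood with normal errors, so the extra parameters capture the excess censoring and the skewness respectively.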

Concepts: Regression analysis, Measurement, Error, Model, Normal distribution, Covariate, Observational error, Tobit model


BACKGROUND: Whether long-term, low-level hydrogen sulfide (H2S) gas is a cause of health effects, including asthma, is uncertain. Rotorua city, New Zealand, has the largest population exposed, from geothermal sources, to relatively high ambient levels of H2S. In a cross-sectional study, the authors investigated associations with asthma in this population. METHODS: A total of 1637 adults, aged 18-65 years, were enrolled during 2008-2010. Residences and workplaces were geocoded. H2S exposures at homes and workplaces were estimated using city-wide networks of passive H2S samplers and kriging to create exposure surfaces. Exposure metrics were based on (1) time-weighted exposures at home and work; and (2) the maximum exposure (home or work). Exposure estimates were entered as quartiles into regression models, with covariate data. RESULTS: Neither exposure metric showed evidence of increased asthma risk from H2S. However, some suggestion of exposure-related reduced risks for diagnosed asthma and asthma symptoms, particularly wheezing during the last 12 months, emerged. With the maximum exposure metric, the prevalence ratio for wheeze in the highest exposure quartile was 0.80 (0.65, 0.99) and, for current asthma treatment, 0.75 (0.52, 1.08). There was no evidence that this was caused by a “survivor effect”. CONCLUSIONS: The study provided no evidence that asthma risk increases with H2S exposure. Suggestions of a reduced risk in the higher exposure areas are consistent with recent evidence that H2S has signaling functions in the body, including induction of smooth muscle relaxation and reduction of inflammation. Study limitations, including possible confounding, preclude definitive conclusions.

Concepts: Regression analysis, Asthma, Risk, Smooth muscle, Hydrogen sulfide, Covariate, Natural gas, Wheeze


When conducting recurrent event data analysis, it is common to assume that the covariate processes are observed throughout the follow-up period. In most applications, however, the values of time-varying covariates are only observed periodically rather than continuously. A popular ad-hoc approach is to carry forward the last observed covariate value until it is measured again. This simple approach, however, usually leads to biased estimation. To tackle this problem, we propose to model the covariate effect on the risk of the recurrent events through jointly modeling the recurrent event process and the longitudinal measures. Despite its popularity, estimation of the joint model with binary longitudinal measurements remains a challenge, because the standard linear mixed effects model approach is not appropriate for binary measures. In this paper, we postulate a Markov model for the binary covariate process and a random-effect proportional intensity model for the recurrent event process. We use a Markov chain Monte Carlo algorithm to estimate all the unknown parameters. The performance of the proposed estimator is evaluated via simulations. The methodology is applied to an observational study designed to evaluate the effect of Group A streptococcus on pharyngitis among school children in India.
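
The carry-forward approach mentioned above can be sketched in a few lines. This is a minimal illustration of the ad hoc method that the abstract says usually leads to biased estimation, not of the proposed joint model; the function name and the toy visit times are hypothetical.

```python
from bisect import bisect_right


def carry_forward(obs_times, obs_values, query_times):
    """Last observation carried forward (LOCF) for a periodically observed covariate.

    obs_times   : sorted times at which the covariate was actually measured
    obs_values  : covariate values measured at those times
    query_times : times (e.g. event times) at which a covariate value is needed
    Returns the most recent measured value at or before each query time,
    or None when no measurement has been taken yet.
    """
    filled = []
    for t in query_times:
        idx = bisect_right(obs_times, t) - 1  # index of last measurement <= t
        filled.append(obs_values[idx] if idx >= 0 else None)
    return filled


if __name__ == "__main__":
    # Hypothetical binary covariate measured at three visits.
    visits = [0, 3, 7]
    status = [0, 1, 0]
    print(carry_forward(visits, status, [0, 1, 3, 5, 7, 9]))
```

Between visits the filled-in value is simply assumed to be unchanged, which is the source of the bias when the true covariate process switches state between measurements.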

Concepts: Scientific method, Mathematics, Philosophy of science, Monte Carlo, Markov chain, Covariate, Time-varying covariate, Markov chain Monte Carlo


Sensitivity analysis is useful in assessing how robust an association is to potential unmeasured or uncontrolled confounding. This article introduces a new measure called the “E-value,” which is related to the evidence for causality in observational studies that are potentially subject to confounding. The E-value is defined as the minimum strength of association, on the risk ratio scale, that an unmeasured confounder would need to have with both the treatment and the outcome to fully explain away a specific treatment-outcome association, conditional on the measured covariates. A large E-value implies that considerable unmeasured confounding would be needed to explain away an effect estimate. A small E-value implies little unmeasured confounding would be needed to explain away an effect estimate. The authors propose that in all observational studies intended to produce evidence for causality, the E-value be reported or some other sensitivity analysis be used. They suggest calculating the E-value for both the observed association estimate (after adjustments for measured confounders) and the limit of the confidence interval closest to the null. If this were to become standard practice, the ability of the scientific community to assess evidence from observational studies would improve considerably, and ultimately, science would be strengthened.
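
The abstract does not state the formula, but the closed-form expression commonly given for the E-value of a risk ratio RR greater than 1 is RR + sqrt(RR × (RR − 1)), with a risk ratio below 1 first inverted. The sketch below applies that expression; the function names and the handling of the confidence interval are assumptions made for illustration.

```python
import math


def e_value(rr):
    """E-value for a risk ratio, using E = RR + sqrt(RR * (RR - 1)).

    Risk ratios below 1 are inverted first so that protective and
    harmful associations of the same strength give the same E-value.
    """
    if rr <= 0:
        raise ValueError("risk ratio must be positive")
    if rr < 1:
        rr = 1.0 / rr
    return rr + math.sqrt(rr * (rr - 1.0))


def e_value_for_ci(rr, lower, upper):
    """E-value for the confidence limit closest to the null value of 1.

    If the interval contains 1, no unmeasured confounding is needed to
    move the limit to the null, so the E-value is 1.
    """
    if lower <= 1.0 <= upper:
        return 1.0
    limit = lower if rr > 1.0 else upper
    return e_value(limit)


if __name__ == "__main__":
    # Illustrative numbers only, not taken from any study listed here.
    print(round(e_value(2.0), 2))
    print(round(e_value_for_ci(2.0, 1.5, 2.7), 2))
```

Reporting both numbers follows the suggestion in the abstract: one E-value for the point estimate and one for the confidence limit closest to the null.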

Concepts: Scientific method, Experimental design, Estimator, Measurement, Integral, Confounding, Philosophy of science, Covariate


Although some experimental biological studies have indicated that citrus may have preventive effects against cognitive impairment, no cohort study has yet examined the relationship between citrus consumption and incident dementia. In a baseline survey, we collected data on citrus intake (categorised as ≤2 times/week, 3-4 times/week or almost every day) and consumption of other foods using a food frequency questionnaire (FFQ), and used a self-reported questionnaire to collect data on other covariates. Data on incident dementia were retrieved from the Japanese Long-term Care Insurance database. A multivariate-adjusted Cox model was used to estimate the hazard ratios (HR) and 95 % CI for incident dementia according to citrus consumption. Among 13 373 participants, the 5·7-year incidence of dementia was 8·6 %. In comparison with participants who consumed citrus ≤2 times/week, the multivariate-adjusted HR for incident dementia among those who did so 3-4 times/week and almost every day were 0·92 (95 % CI 0·80, 1·07) and 0·86 (95 % CI 0·73, 1·01), respectively (P trend=0·065). The inverse association persisted after excluding participants whose dementia events had occurred in the first 2 years of follow-up: the multivariate HR was 1·00 (reference) for ≤2 times/week, 0·82 (95 % CI 0·69, 0·98) for 3-4 times/week and 0·77 (95 % CI 0·64, 0·93) for almost every day (P trend=0·006). The present findings suggest that frequent citrus consumption was associated with a lower risk of incident dementia, even after adjustment for possible confounding factors.

Concepts: Cohort study, Experimental design, Cohort, Epidemiology, Hazard ratio, Covariate, Long-term care, Long term care insurance


STUDY DESIGN: Randomized clinical trial. BACKGROUND: The comparative effectiveness of non-thrust manipulation (NTM) and thrust manipulation (TM) for mechanical neck pain has been investigated with inconsistent results. OBJECTIVE: To compare the clinical effectiveness of concordant cervical and thoracic NTM and TM for patients with mechanical neck pain. METHODS: The Neck Disability Index (NDI) was the primary outcome. Secondary outcomes included the Patient Specific Functional Scale (PSFS), Numerical Pain Rating Scale (NPRS), deep cervical flexion endurance (DCF), Global Rating of Change (GROC), number of visits, and duration of care. The covariate was clinical equipoise for intervention. Outcomes were collected at baseline, visit 2, and discharge. Patients were randomly assigned to receive either NTM or TM directed at the cervical and thoracic spines. Techniques and dosages were selected pragmatically and applied to the most symptomatic level. Two-way repeated-measures analyses of covariance (ANCOVAs) were used to analyze clinical outcomes at three time points. ANCOVAs were also used to analyze between-group differences for GROC, number of visits, and duration of care at discharge. RESULTS: One hundred and three patients were included in the analyses (N=55 NTM and N=48 TM). The between-group analyses revealed no differences on the NDI (P=.67), PSFS (P=.26), NPRS (P=.25), and DCF (P=.98) or for the GROC (P=.77), number of visits (P=.21), and duration of care (P=.61) for patients with mechanical neck pain who received either NTM or TM. CONCLUSION: NTM and TM produce equivalent outcomes for patients with mechanical neck pain. LEVEL OF EVIDENCE: Level 1b. J Orthop Sports Phys Ther, Epub 6 Feb 2018. doi:10.2519/jospt.2018.7738.

Concepts: Experimental design, Clinical trial, Crossover study, Effectiveness, Pharmaceutical industry, Clinical research, Covariate, Joint manipulation


Recent advances in sequencing technology have made it possible to obtain high-throughput data on the composition of microbial communities and to study the effects of dysbiosis on the human host. Analysis of pairwise intersample distances quantifies the association between the microbiome diversity and covariates of interest (e.g., environmental factors, clinical outcomes, treatment groups). In the design of these analyses, multiple choices for distance metrics are available. Most distance-based methods, however, use a single distance and are underpowered if the distance is poorly chosen. In addition, distance-based tests cannot flexibly handle confounding variables, which can result in excessive false-positive findings.

Concepts: Experimental design, Microbiology, Biotechnology, Confounding, Case-control study, Analysis of variance, Covariate, Distance


BACKGROUND AND PURPOSE: Depression is known to increase stroke risk. Although limited, there is some evidence for age differences, with a suggestion for a stronger association in younger groups. We investigated the effect of depression on stroke incidence in a large cohort of midaged women. METHODS: We included 10 547 women without a history of stroke aged 47 to 52 years from the Australian Longitudinal Study on Women’s Health, surveyed every 3 years from 1998 to 2010. Depression was defined at each survey using the Center for Epidemiological Studies Depression Scale (shortened version) and antidepressant use in the past month. Stroke was ascertained through self-report and mortality data. We determined the association between depression and stroke at the subsequent survey, using generalized estimating equation analysis, adjusting for time-varying covariates. RESULTS: At each survey, ≈24% were defined as having depression. During follow-up, 177 strokes occurred. Depression was associated with a >2-fold increased odds of stroke (odds ratio, 2.41; 95% confidence interval, 1.78-3.27), which attenuated after adjusting for age, socioeconomic status, lifestyle, and physiological factors (odds ratio, 1.94; 95% confidence interval, 1.37-2.74). Findings were robust to sensitivity analyses addressing methodological issues, including definition of depression, antidepressant use, and missing covariate data. CONCLUSIONS: Depression is a strong risk factor for stroke in midaged women, with the association partially explained by lifestyle and physiological factors. Further studies of midaged and older women from the same population are needed to confirm whether depression is particularly important in younger women and to inform targeted intervention approaches.

Concepts: Cohort study, Longitudinal study, Epidemiology, Statistics, Medical statistics, Sociology, Covariate, Statistical terminology