SciCombinator

Discover the latest and most talked-about scientific content & concepts.

Concept: Statistics

367

How were cities distributed globally in the past? How many people lived in these cities? How did cities influence their local and regional environments? In order to understand the current era of urbanization, we must understand long-term historical urbanization trends and patterns. However, to date there is no comprehensive record of spatially explicit, historic, city-level population data at the global scale. Here, we developed the first spatially explicit dataset of urban settlements from 3700 BC to AD 2000, by digitizing, transcribing, and geocoding historical, archaeological, and census-based urban population data previously published in tabular form by Chandler and Modelski. The dataset creation process also required data cleaning and harmonization procedures to make the data internally consistent. Additionally, we created a reliability ranking for each geocoded location to assess the geographic uncertainty of each data point. The dataset provides the first spatially explicit archive of the location and size of urban populations over the last 6,000 years and can contribute to an improved understanding of contemporary and historical urbanization trends.
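
The workflow this abstract describes (transcribing tabular records, geocoding place names, and attaching a reliability ranking to each location) can be illustrated with a minimal sketch. The records, the gazetteer, and the ranking rule below are hypothetical stand-ins, not the published dataset's actual schema:

```python
import pandas as pd

# Hypothetical transcription of a few tabular records (Chandler/Modelski style).
records = pd.DataFrame({
    "city":       ["Uruk", "Memphis", "Rome"],
    "year":       [-3700, -3100, 100],          # negative = BC
    "population": [14_000, 30_000, 1_000_000],  # illustrative values only
})

# Toy gazetteer: place name -> (lat, lon, match quality). A real workflow
# would query a historical gazetteer or geocoding service instead.
GAZETTEER = {
    "Uruk":    (31.32, 45.64, "exact"),
    "Memphis": (29.85, 31.25, "exact"),
    "Rome":    (41.89, 12.49, "exact"),
}

def geocode(city: str):
    """Look up coordinates and assign a coarse reliability rank."""
    if city in GAZETTEER:
        lat, lon, match = GAZETTEER[city]
        rank = 1 if match == "exact" else 2   # 1 = high confidence
        return lat, lon, rank
    return None, None, 3                      # 3 = location unresolved

records[["lat", "lon", "reliability"]] = records["city"].apply(
    lambda c: pd.Series(geocode(c))
)
print(records)
```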

Concepts: Statistics, Chronology, Demography, Geographic information system, City, Urban area, Urbanization, Anno Domini

353

Engineering estimates of methane emissions from natural gas production have led to varied projections of national emissions. This work reports direct measurements of methane emissions at 190 onshore natural gas sites in the United States (150 production sites, 27 well completion flowbacks, 9 well unloadings, and 4 workovers). For well completion flowbacks, which clear fractured wells of liquid to allow gas production, methane emissions ranged from 0.01 Mg to 17 Mg (mean = 1.7 Mg; 95% confidence bounds of 0.67-3.3 Mg), compared with an average of 81 Mg per event in the 2011 EPA national emission inventory (released in April 2013). Emission factors for pneumatic pumps and controllers, as well as for equipment leaks, were in some cases comparable to and in others higher than estimates in the national inventory. Overall, if emission factors from this work for completion flowbacks, equipment leaks, and pneumatic pumps and controllers are assumed to be representative of national populations and are used to estimate national emissions, total annual emissions from these source categories are calculated to be 957 Gg of methane (with sampling and measurement uncertainties estimated at ±200 Gg). The estimate for comparable source categories in the EPA national inventory is ∼1,200 Gg. Additional measurements of unloadings and workovers are needed to produce national emission estimates for these source categories. The 957 Gg in emissions for completion flowbacks, pneumatics, and equipment leaks, coupled with EPA national inventory estimates for other categories, leads to an estimated 2,300 Gg of methane emissions from natural gas production (0.42% of gross gas production).
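
The abstract's headline figures compose by simple arithmetic, and the 0.42% share pins down the gross-production denominator. A minimal sketch reproducing that arithmetic; only the first four values come from the abstract, the rest are back-calculated here:

```python
# National-scale estimates from the abstract (units: Gg of methane per year).
measured_sources = 957        # completion flowbacks + pneumatics + equipment leaks
uncertainty      = 200        # sampling/measurement uncertainty (+/-)
epa_comparable   = 1_200      # EPA inventory estimate for the same categories
total_production = 2_300      # all natural gas production sources combined

# Back-calculated, not stated directly in the abstract:
epa_other = total_production - measured_sources
print(f"Implied EPA estimate for the other categories: ~{epa_other} Gg")

# 2,300 Gg is stated to be 0.42% of gross gas production (as methane mass):
gross = total_production / 0.0042
print(f"Implied gross production: ~{gross / 1000:.0f} Tg of methane")
```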

Concepts: Carbon dioxide, Statistics, Mathematics, United States, Natural gas, Methane, Air pollution, Greenhouse gas

347

Recent advances in Bayesian hypothesis testing have led to the development of uniformly most powerful Bayesian tests, which represent an objective, default class of Bayesian hypothesis tests that have the same rejection regions as classical significance tests. Based on the correspondence between these two classes of tests, it is possible to equate the size of classical hypothesis tests with evidence thresholds in Bayesian tests, and to equate P values with Bayes factors. An examination of these connections suggests that recent concerns over the lack of reproducibility of scientific studies can be attributed largely to the conduct of significance tests at unjustifiably high levels of significance. To correct this problem, evidence thresholds required for the declaration of a significant finding should be increased to 25-50:1, and to 100-200:1 for the declaration of a highly significant finding. In terms of classical hypothesis tests, these evidence standards mandate the conduct of tests at the 0.005 or 0.001 level of significance.
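
For a one-sided z-test, the correspondence described here has a closed form in Johnson's UMPBT framework: the Bayesian test rejects when the Bayes factor exceeds a threshold γ, and its rejection region matches a classical test of size α exactly when γ = exp(z_α²/2). A short sketch under that z-test assumption, recovering the thresholds quoted above:

```python
from math import exp
from scipy.stats import norm

def evidence_threshold(alpha: float) -> float:
    """Bayes factor threshold matching a one-sided z-test of size alpha.

    For the UMPBT of a normal mean, BF > gamma iff z > sqrt(2*ln(gamma)),
    so gamma = exp(z_alpha**2 / 2).
    """
    z_alpha = norm.isf(alpha)       # upper-tail critical value
    return exp(z_alpha**2 / 2)

for alpha in (0.05, 0.005, 0.001):
    print(f"alpha = {alpha:>6}: gamma ~ {evidence_threshold(alpha):6.1f}:1")
# alpha = 0.05 gives ~3.9:1; 0.005 gives ~28:1 (in the 25-50:1 range);
# 0.001 gives ~118:1 (in the 100-200:1 range).
```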

Concepts: Scientific method, Statistics, Statistical significance, Statistical hypothesis testing, Falsifiability, Bayesian inference, Statistical power, United States Declaration of Independence

341

A paper from the Open Science Collaboration (Research Articles, 28 August 2015, aac4716) attempting to replicate 100 published studies suggests that the reproducibility of psychological science is surprisingly low. We show that this article contains three statistical errors and provides no support for such a conclusion. Indeed, the data are consistent with the opposite conclusion, namely, that the reproducibility of psychological science is quite high.

Concepts: Scientific method, Psychology, Statistics, Mathematics, Research, Experiment

333

Background. Attribution to the original contributor upon reuse of published data is important both as a reward for data creators and to document the provenance of research findings. Previous studies have found that papers with publicly available datasets receive a higher number of citations than similar studies without available data. However, few previous analyses have had the statistical power to control for the many variables known to predict citation rate, which has led to uncertain estimates of the “citation benefit”. Furthermore, little is known about patterns in data reuse over time and across datasets.

Methods and Results. Here, we look at citation rates while controlling for many known citation predictors and investigate the variability of data reuse. In a multivariate regression on 10,555 studies that created gene expression microarray data, we found that studies that made data available in a public repository received 9% (95% confidence interval: 5% to 13%) more citations than similar studies for which the data were not made available. Date of publication, journal impact factor, open access status, number of authors, first and last author publication history, corresponding author country, institution citation history, and study topic were included as covariates. The citation benefit varied with date of dataset deposition: a citation benefit was most clear for papers published in 2004 and 2005, at about 30%. Authors published most papers using their own datasets within two years of their first publication on the dataset, whereas data reuse papers published by third-party investigators continued to accumulate for at least six years. To study patterns of data reuse directly, we compiled 9,724 instances of third-party data reuse via mention of GEO or ArrayExpress accession numbers in the full text of papers. The level of third-party data use was high: for 100 datasets deposited in year 0, we estimated that 40 papers in PubMed reused a dataset by year 2, 100 by year 4, and more than 150 data reuse papers had been published by year 5. Data reuse was distributed across a broad base of datasets: a very conservative estimate found that 20% of the datasets deposited between 2003 and 2007 had been reused at least once by third parties.

Conclusion. After accounting for other factors affecting citation rate, we find a robust citation benefit from open data, although a smaller one than previously reported. We conclude there is a direct effect of third-party data reuse that persists for years beyond the time when researchers have published most of the papers reusing their own data. Other factors that may also contribute to the citation benefit are considered. We further conclude that, at least for gene expression microarray data, a substantial fraction of archived datasets are reused, and that the intensity of dataset reuse has been steadily increasing since 2003.
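
A regression of the kind described, citation counts on data availability plus covariates, can be sketched as follows. The simulated data, the column names, and the choice of a log-linear (Poisson) model are illustrative assumptions; the paper's exact specification may differ:

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
n = 500

# Simulated stand-in for the bibliometric dataset; the column names are
# hypothetical, not the paper's actual variables.
df = pd.DataFrame({
    "data_available": rng.integers(0, 2, n),
    "impact_factor":  rng.gamma(2.0, 2.0, n) + 0.5,
    "n_authors":      rng.integers(1, 12, n),
})
# Simulate citation counts with a built-in ~9% boost when data are shared.
lam = np.exp(1.0 + np.log(1.09) * df["data_available"]
             + 0.3 * np.log(df["impact_factor"])
             + 0.02 * df["n_authors"])
df["citations"] = rng.poisson(lam)

# Log-linear (Poisson) regression: exp(coefficient) - 1 estimates the
# proportional citation benefit, holding the covariates fixed.
fit = smf.glm("citations ~ data_available + np.log(impact_factor) + n_authors",
              data=df, family=sm.families.Poisson()).fit()
benefit = np.exp(fit.params["data_available"]) - 1
lo, hi = np.exp(fit.conf_int().loc["data_available"]) - 1
print(f"citation benefit: {benefit:.1%} (95% CI {lo:.1%} to {hi:.1%})")
```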

Concepts: Statistics, Academic publishing, Data, Data set, DNA microarray, Reuse, Recycling, Remanufacturing

292

Over the past ten years, unconventional gas and oil drilling (UGOD) has markedly expanded in the United States. Despite substantial increases in well drilling, the health consequences of UGOD toxicant exposure remain unclear. This study examines the association between wells and healthcare use by zip code from 2007 to 2011 in Pennsylvania. Inpatient discharge databases from the Pennsylvania Healthcare Cost Containment Council were correlated with active wells by zip code in three counties in Pennsylvania. For overall inpatient prevalence rates and 25 specific medical categories, the associations of inpatient prevalence rates with the number of wells per zip code and, separately, with wells per km² (separated into quantiles and defined as well density) were estimated using fixed-effects Poisson models. To account for multiple comparisons, a Bonferroni correction was applied, and associations with p<0.00096 were considered statistically significant. Cardiology inpatient prevalence rates were significantly associated with the number of wells per zip code (p<0.00096) and with wells per km² (p<0.00096), while neurology inpatient prevalence rates were significantly associated with wells per km² (p<0.00096). Furthermore, the evidence also supported an association between well density and inpatient prevalence rates for the medical categories of dermatology, neurology, oncology, and urology. These data suggest that UGOD wells, which dramatically increased in number over the past decade, were associated with increased inpatient prevalence rates within specific medical categories in Pennsylvania. Further studies are necessary to address the healthcare costs of UGOD and to determine whether specific toxicants or combinations are associated with organ-specific responses.
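
The quoted cutoff is a standard Bonferroni adjustment, α/m; p < 0.00096 is consistent with α = 0.05 divided over 52 tests (our assumption: 26 outcome categories times 2 exposure measures). The sketch below shows that arithmetic and one common way to shape a fixed-effects Poisson model in statsmodels, with zip-code and year indicators standing in for the fixed effects; all variable names and data are hypothetical:

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm
import statsmodels.formula.api as smf

# Bonferroni threshold: alpha divided by the number of tests m.
alpha, m = 0.05, 52
print(f"Bonferroni cutoff: {alpha / m:.5f}")   # 0.00096

# Hypothetical panel: 12 zip codes observed over 2007-2011.
rng = np.random.default_rng(1)
df = pd.DataFrame({
    "admissions":      rng.poisson(20, 60),       # inpatient counts
    "wells_per_km2_q": rng.integers(0, 4, 60),    # well-density quantile
    "zip_code":        np.repeat([f"zip{i}" for i in range(12)], 5),
    "year":            np.tile(range(2007, 2012), 12),
})
# Poisson regression with zip-code and year indicator variables as a
# simple approximation of the fixed-effects specification.
fit = smf.glm("admissions ~ wells_per_km2_q + C(zip_code) + C(year)",
              data=df, family=sm.families.Poisson()).fit()
print(f"p-value for well density: {fit.pvalues['wells_per_km2_q']:.4f}")
```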

Concepts: Medicine, Statistics, Petroleum, Statistical significance, The Association, Multiple comparisons, Natural gas, Bonferroni correction

289

Background. Acetaminophen is a common therapy for fever in patients in the intensive care unit (ICU) who have probable infection, but its effects are unknown.

Methods. We randomly assigned 700 ICU patients with fever (body temperature, ≥38°C) and known or suspected infection to receive either 1 g of intravenous acetaminophen or placebo every 6 hours until ICU discharge, resolution of fever, cessation of antimicrobial therapy, or death. The primary outcome was ICU-free days (days alive and free from the need for intensive care) from randomization to day 28.

Results. The number of ICU-free days to day 28 did not differ significantly between the acetaminophen group and the placebo group: 23 days (interquartile range, 13 to 25) among patients assigned to acetaminophen and 22 days (interquartile range, 12 to 25) among patients assigned to placebo (Hodges-Lehmann estimate of absolute difference, 0 days; 96.2% confidence interval [CI], 0 to 1; P=0.07). A total of 55 of 345 patients in the acetaminophen group (15.9%) and 57 of 344 patients in the placebo group (16.6%) had died by day 90 (relative risk, 0.96; 95% CI, 0.66 to 1.39; P=0.84).

Conclusions. Early administration of acetaminophen to treat fever due to probable infection did not affect the number of ICU-free days. (Funded by the Health Research Council of New Zealand and others; HEAT Australian New Zealand Clinical Trials Registry number, ACTRN12612000513819.)
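
The Hodges-Lehmann estimate reported for the difference in ICU-free days is the median of all pairwise differences between the two groups, and the day-90 relative risk follows directly from the reported counts. A minimal sketch, with made-up outcome data in place of the trial's:

```python
import numpy as np

def hodges_lehmann(x, y) -> float:
    """Two-sample Hodges-Lehmann estimator: the median of all
    pairwise differences x_i - y_j."""
    return float(np.median(np.subtract.outer(np.asarray(x), np.asarray(y))))

rng = np.random.default_rng(42)
# Hypothetical ICU-free-day counts (0 to 28), NOT the trial's data.
acetaminophen = rng.integers(0, 29, size=345)
placebo       = rng.integers(0, 29, size=344)
print("HL estimate of difference:", hodges_lehmann(acetaminophen, placebo))

# Day-90 mortality relative risk from the counts reported in the abstract:
rr = (55 / 345) / (57 / 344)
print(f"relative risk: {rr:.2f}")   # 0.96, matching the report
```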

Concepts: Clinical trial, Statistics, Mathematics, Estimator, Intensive care medicine, Interquartile range, Placebo, Fever

287

To estimate how far changes in the prevalence of electronic cigarette (e-cigarette) use in England have been associated with changes in quit success, quit attempts, and use of licensed medication and behavioural support in quit attempts.

Concepts: Statistics, Smoking, Tobacco, Cigarette, Nicotine, Smoking cessation, Electronic cigarette, Time series

278

Endurance exercise training studies frequently show modest changes in VO2max with training and very limited responses in some subjects. By contrast, studies using interval training (IT) or combined IT and continuous training (CT) have reported mean increases in VO2max of up to ∼1.0 L·min⁻¹. This raises questions about the role of exercise intensity and the trainability of VO2max. To address this topic we analyzed IT and IT/CT studies published in English from 1965 to 2012. Inclusion criteria were: 1) ≥3 healthy sedentary/recreationally active humans <45 yrs old, 2) training duration of 6-13 weeks, 3) ≥3 days/week, 4) ≥10 minutes of high-intensity work, 5) ≥1:1 work/rest ratio, and 6) results reported as mean ± SD or SE, ranges of change, or individual data. Due to heterogeneity (I² value of 70), statistical synthesis of the data used a random effects model. The summary statistic of interest was the change in VO2max. A total of 334 subjects (120 women) from 37 studies were identified. Participants were grouped into 40 distinct training groups, so the unit of analysis was 40 rather than 37. An increase in VO2max of 0.51 L·min⁻¹ (95% CI: 0.43 to 0.60 L·min⁻¹) was observed. A subset of 9 studies, with 72 subjects, that featured longer intervals showed even larger (∼0.8-0.9 L·min⁻¹) changes in VO2max with evidence of a marked response in all subjects. These results suggest that ideas about trainability and VO2max should be further evaluated with standardized IT or IT/CT training programs.
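
With heterogeneity this high (I² = 70), a random effects synthesis is the natural choice, and DerSimonian-Laird is the textbook estimator for it; whether the review used this exact variant is not stated, so take this as a generic sketch. The per-group effects and standard errors below are invented placeholders, not the 40 training groups analyzed above:

```python
import numpy as np

def dersimonian_laird(effects, ses):
    """Random-effects pooled estimate using the DerSimonian-Laird tau^2."""
    effects, ses = np.asarray(effects, float), np.asarray(ses, float)
    w = 1 / ses**2                                  # fixed-effect weights
    fixed = np.sum(w * effects) / np.sum(w)
    q = np.sum(w * (effects - fixed) ** 2)          # Cochran's Q
    df = len(effects) - 1
    c = np.sum(w) - np.sum(w**2) / np.sum(w)
    tau2 = max(0.0, (q - df) / c)                   # between-study variance
    w_star = 1 / (ses**2 + tau2)                    # random-effects weights
    pooled = np.sum(w_star * effects) / np.sum(w_star)
    se = np.sqrt(1 / np.sum(w_star))
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    return pooled, (pooled - 1.96 * se, pooled + 1.96 * se), i2

# Placeholder per-group changes in VO2max (L/min) and their SEs.
deltas = [0.35, 0.60, 0.45, 0.80, 0.30, 0.55]
ses    = [0.08, 0.10, 0.06, 0.12, 0.09, 0.07]
pooled, ci, i2 = dersimonian_laird(deltas, ses)
print(f"pooled change: {pooled:.2f} L/min "
      f"(95% CI {ci[0]:.2f} to {ci[1]:.2f}), I^2 = {i2:.0f}%")
```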

Concepts: Statistics, Exercise, Change, Analysis of variance, High-intensity interval training, Random effects model, Endurance, Interval training

260

This study documents reporting errors in a sample of over 250,000 p-values reported in eight major psychology journals from 1985 until 2013, using the new R package “statcheck.” statcheck retrieved null-hypothesis significance testing (NHST) results from over half of the articles from this period. In line with earlier research, we found that half of all published psychology papers that use NHST contained at least one p-value that was inconsistent with its test statistic and degrees of freedom. One in eight papers contained a grossly inconsistent p-value that may have affected the statistical conclusion. In contrast to earlier findings, we found that the average prevalence of inconsistent p-values has been stable over the years or has declined. The prevalence of gross inconsistencies was higher in p-values reported as significant than in p-values reported as nonsignificant. This could indicate a systematic bias in favor of significant results. Possible solutions for the high prevalence of reporting inconsistencies could be to encourage sharing data, to let co-authors check results in a so-called “co-pilot model,” and to use statcheck to flag possible inconsistencies in one’s own manuscript or during the review process.
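
statcheck's core check is easy to reproduce: recompute the p-value from the reported test statistic and degrees of freedom, then flag reports that do not match. statcheck itself is an R package with rounding-aware comparison rules; the Python sketch below simplifies those rules to rounding at the reported precision:

```python
from scipy import stats

def check_t_report(t: float, df: int, reported_p: float,
                   decimals: int = 2) -> str:
    """Recompute a two-tailed p from t(df) and compare with the report.

    A report counts as consistent here if the recomputed p rounds to
    the reported value (a simplification of statcheck's rules).
    """
    recomputed = 2 * stats.t.sf(abs(t), df)
    if round(recomputed, decimals) == round(reported_p, decimals):
        return f"consistent (recomputed p = {recomputed:.4f})"
    # A 'gross' inconsistency flips the significance decision at .05.
    if (recomputed < 0.05) != (reported_p < 0.05):
        return f"GROSS inconsistency (recomputed p = {recomputed:.4f})"
    return f"inconsistent (recomputed p = {recomputed:.4f})"

# e.g. reports like "t(28) = 2.20, p = .04" and "t(28) = 1.20, p = .04":
print(check_t_report(2.20, 28, 0.04))   # consistent: p = .0364 rounds to .04
print(check_t_report(1.20, 28, 0.04))   # gross: actual p = .24, reported .04
```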

Concepts: Statistics, Statistical significance, Ronald Fisher, Statistical hypothesis testing, P-value, Statistical power, Hypothesis testing, Counternull