Previous estimates of drug development success rates rely on relatively small samples from databases curated by the pharmaceutical industry and are subject to potential selection biases. Using a sample of 406 038 entries of clinical trial data for over 21 143 compounds from January 1, 2000 to October 31, 2015, we estimate aggregate clinical trial success rates and durations. We also compute disaggregated estimates across several trial features including disease type, clinical phase, industry or academic sponsor, biomarker presence, lead indication status, and time. In several cases, our results differ significantly in detail from widely cited statistics. For example, oncology has a 3.4% success rate in our sample vs. 5.1% in prior studies. However, after declining to 1.7% in 2012, this rate has improved to 2.5% and 8.3% in 2014 and 2015, respectively. In addition, trials that use biomarkers in patient selection have higher overall success probabilities than trials without biomarkers.
Background Acetaminophen is a common therapy for fever in patients in the intensive care unit (ICU) who have probable infection, but its effects are unknown. Methods We randomly assigned 700 ICU patients with fever (body temperature, ≥38°C) and known or suspected infection to receive either 1 g of intravenous acetaminophen or placebo every 6 hours until ICU discharge, resolution of fever, cessation of antimicrobial therapy, or death. The primary outcome was ICU-free days (days alive and free from the need for intensive care) from randomization to day 28. Results The number of ICU-free days to day 28 did not differ significantly between the acetaminophen group and the placebo group: 23 days (interquartile range, 13 to 25) among patients assigned to acetaminophen and 22 days (interquartile range, 12 to 25) among patients assigned to placebo (Hodges-Lehmann estimate of absolute difference, 0 days; 96.2% confidence interval [CI], 0 to 1; P=0.07). A total of 55 of 345 patients in the acetaminophen group (15.9%) and 57 of 344 patients in the placebo group (16.6%) had died by day 90 (relative risk, 0.96; 95% CI, 0.66 to 1.39; P=0.84). Conclusions Early administration of acetaminophen to treat fever due to probable infection did not affect the number of ICU-free days. (Funded by the Health Research Council of New Zealand and others; HEAT Australian New Zealand Clinical Trials Registry number, ACTRN12612000513819.)
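The day-90 mortality comparison above can be checked with a short calculation. The sketch below computes the relative risk from the reported counts (55 of 345 vs. 57 of 344) and attaches an unadjusted Wald interval on the log scale. The choice of interval is an assumption made for illustration; the abstract does not state how the published interval was derived, so the bounds printed here need not match it exactly. The function name `relative_risk` is hypothetical.

```python
import math


def relative_risk(events_a, n_a, events_b, n_b, z=1.96):
    """Return (RR, lower, upper) for group A relative to group B.

    The interval is an unadjusted Wald interval on the log scale
    (an illustrative assumption, not the trial's stated method).
    """
    risk_a = events_a / n_a
    risk_b = events_b / n_b
    rr = risk_a / risk_b
    # Standard error of log(RR) for two independent binomial samples.
    se_log = math.sqrt(1 / events_a - 1 / n_a + 1 / events_b - 1 / n_b)
    lower = math.exp(math.log(rr) - z * se_log)
    upper = math.exp(math.log(rr) + z * se_log)
    return rr, lower, upper


# Counts taken from the abstract: deaths by day 90 in each group.
rr, lower, upper = relative_risk(55, 345, 57, 344)
print(f"RR = {rr:.2f}, approximate 95% CI {lower:.2f} to {upper:.2f}")
```

The point estimate rounds to the reported 0.96, and the interval spans 1, consistent with the reported lack of a significant difference in mortality.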
Scoring goals in a soccer match can be interpreted as a stochastic process. In the simplest description of a soccer match one assumes that scoring goals follows from independent rate processes of both teams. This would imply simple Poissonian and Markovian behavior. Deviations from this behavior would imply that the previous course of the match has an impact on the present match behavior. Here a general framework for the identification of deviations from this behavior is presented. For this endeavor it is essential to formulate an a priori estimate of the expected number of goals per team in a specific match. This can be done based on our previous work on the estimation of team strengths. Furthermore, the well-known general increase in the number of goals in the course of a soccer match has to be removed by appropriate normalization. In general, three different types of deviations from a simple rate process can exist. First, the goal rate may depend on the exact time of the previous goals. Second, it may be influenced by the time passed since the previous goal and, third, it may reflect the present score. We show that the Poissonian scenario is fulfilled quite well for the German Bundesliga. However, a detailed analysis reveals significant deviations for the second and third aspects. Dramatic effects are observed if the away team leads by one or two goals in the final part of the match. This analysis allows one to identify generic features of soccer matches and to learn about the hidden complexities behind scoring goals. Among other things, the reason why the number of draws is larger than statistically expected can be identified.
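As a minimal sketch of the baseline described above, the following code computes match outcome probabilities when both teams score according to independent Poisson processes. This is the reference scenario against which an excess of draws would be measured; it is not the authors' analysis framework. The expected goal numbers (1.6 and 1.2) and the function names are hypothetical values chosen only for illustration.

```python
import math


def poisson_pmf(k, lam):
    """Probability of exactly k goals when the expected number is lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)


def outcome_probabilities(lam_home, lam_away, max_goals=25):
    """Return (home win, draw, away win) for independent Poisson scoring.

    Scores above max_goals per team are ignored; their probability is
    negligible for realistic goal expectations.
    """
    home = draw = away = 0.0
    for h in range(max_goals + 1):
        for a in range(max_goals + 1):
            p = poisson_pmf(h, lam_home) * poisson_pmf(a, lam_away)
            if h > a:
                home += p
            elif h == a:
                draw += p
            else:
                away += p
    return home, draw, away


# Hypothetical a priori goal expectations for one match.
p_home, p_draw, p_away = outcome_probabilities(1.6, 1.2)
print(f"home {p_home:.3f}, draw {p_draw:.3f}, away {p_away:.3f}")
```

Comparing the draw probability from such a baseline with the observed frequency of draws is one way to quantify the deviation mentioned in the last sentence of the abstract.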
We provide a novel method, DRISEE (duplicate read inferred sequencing error estimation), to assess sequencing quality (alternatively referred to as “noise” or “error”) within and/or between sequencing samples. DRISEE provides positional error estimates that can be used to inform read trimming within a sample. It also provides global (whole sample) error estimates that can be used to identify samples with high or varying levels of sequencing error that may confound downstream analyses, particularly in the case of studies that utilize data from multiple sequencing samples. For shotgun metagenomic data, we believe that DRISEE provides estimates of sequencing error that are more accurate and less constrained by technical limitations than existing methods that rely on reference genomes or the use of scores (e.g., Phred). Here, DRISEE is applied to (non-amplicon) data sets from both the 454 and Illumina platforms. The DRISEE error estimate is obtained by analyzing sets of artifactual duplicate reads (ADRs), a known by-product of both sequencing platforms. We present DRISEE as an open-source, platform-independent method to assess sequencing error in shotgun metagenomic data, and utilize it to discover previously uncharacterized error in de novo sequence data from the 454 and Illumina sequencing platforms.
BACKGROUND: Lidar height data collected by the Geosciences Laser Altimeter System (GLAS) from 2002 to 2008 has the potential to form the basis of a globally consistent sample-based inventory of forest biomass. GLAS lidar return data were collected globally in spatially discrete full waveform “shots,” which have been shown to be strongly correlated with aboveground forest biomass. Relationships observed at spatially coincident field plots may be used to model biomass at all GLAS shots, and well-established methods of model-based inference may then be used to estimate biomass and variance for specific spatial domains. However, the spatial pattern of GLAS acquisition is neither random across the surface of the earth nor is it identifiable with any particular systematic design. Undefined sample properties therefore hinder the use of GLAS in global forest sampling. RESULTS: We propose a method of identifying a subset of the GLAS data which can justifiably be treated as a simple random sample in model-based biomass estimation. The relatively uniform spatial distribution and locally arbitrary positioning of the resulting sample are similar to the design used by the US national forest inventory (NFI). We demonstrated model-based estimation using a sample of GLAS data in the US state of California, where our estimate of biomass (211 Mg/hectare) was within the 1.4% standard error of the design-based estimate supplied by the US NFI. The standard error of the GLAS-based estimate was significantly higher than that of the NFI estimate, although the cost of the GLAS estimate (excluding costs for the satellite itself) was almost nothing, compared to at least US$10.5 million for the NFI estimate. CONCLUSIONS: Global application of model-based estimation using GLAS, while demanding significant consolidation of training data, would improve inter-comparability of international biomass estimates by imposing consistent methods and a globally coherent sample frame. The methods presented here constitute a globally extensible approach for generating a simple random sample from the global GLAS dataset, enabling its use in forest inventory activities.
The problem of determining the optimal geometric configuration of a sensor network that will maximize the range-related information available for multiple target positioning is of key importance in a multitude of application scenarios. In this paper, a set of sensors that measures the distances between the targets and each of the receivers is considered, assuming that the range measurements are corrupted by white Gaussian noise, in order to search for the formation that maximizes the accuracy of the target estimates. Using tools from estimation theory and convex optimization, the problem is converted into that of maximizing, by proper choice of the sensor positions, a convex combination of the logarithms of the determinants of the Fisher Information Matrices corresponding to each of the targets in order to determine the sensor configuration that yields the minimum possible covariance of any unbiased target estimator. Analytical and numerical solutions are well defined and it is shown that the optimal configuration of the sensors depends explicitly on the constraints imposed on the sensor configuration, the target positions, and the probabilistic distributions that define the prior uncertainty in each of the target positions. Simulation examples illustrate the key results derived.
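The objective described above can be illustrated in two dimensions. For range measurements with independent Gaussian noise of standard deviation sigma, the Fisher Information Matrix for one target is the sum of the outer products of the unit vectors pointing from each sensor to the target, divided by sigma squared. The sketch below evaluates a weighted sum of log-determinants for a given sensor layout. It only evaluates the criterion and does not perform the optimization or include the prior uncertainty treated in the paper; the function names, the weights, and the example layouts are assumptions made for illustration.

```python
import math


def range_fim(sensors, target, sigma=1.0):
    """2x2 Fisher Information Matrix for range-only measurements of one target."""
    a = b = c = 0.0
    for sx, sy in sensors:
        dx, dy = target[0] - sx, target[1] - sy
        r = math.hypot(dx, dy)
        ux, uy = dx / r, dy / r  # unit vector from sensor to target
        a += ux * ux
        b += ux * uy
        c += uy * uy
    s = 1.0 / sigma ** 2
    return ((a * s, b * s), (b * s, c * s))


def det_2x2(m):
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


def objective(sensors, targets, weights, sigma=1.0):
    """Convex combination of log det FIM over all targets (larger is better)."""
    return sum(w * math.log(det_2x2(range_fim(sensors, t, sigma)))
               for t, w in zip(targets, weights))


def ring(n, radius=1.0):
    """n sensors evenly spaced on a circle around the origin."""
    return [(radius * math.cos(2 * math.pi * k / n),
             radius * math.sin(2 * math.pi * k / n)) for k in range(n)]


target = (0.0, 0.0)
spread = ring(3)                                    # sensors surround the target
clustered = [(1.0, 0.0), (1.0, 0.1), (1.0, -0.1)]   # sensors bunched on one side

det_spread = det_2x2(range_fim(spread, target))
score_spread = objective(spread, [target], [1.0])
score_clustered = objective(clustered, [target], [1.0])
print(score_spread, score_clustered)
```

In this example the evenly spread layout scores higher than the clustered one, which matches the intuition that sensors viewing the target from diverse directions carry more range-related information.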
For many patients, clinical prescription of walking will be beneficial to health, and accelerometers can be used to monitor their walking intensity, frequency and duration over many days. Assessment of walking intensity should include establishing individual-specific relationships between accelerometer count, walking speed and energy expenditure (VO2), which can be achieved using a walking protocol on a treadmill or overground. However, differences in gait mechanics during treadmill compared to overground walking may result in inaccurate estimations of free-living walking speed and VO2. The aims of this study were to compare the validity of track- and treadmill-based calibration methods for estimating free-living level walking speed and VO2 and to explain between-method differences in accuracy of estimation.
The tadpole shrimp, Triops cancriformis, is a freshwater crustacean living in ephemeral pools that is listed as endangered in the UK and Europe. Populations are threatened by habitat destruction due to land development for agriculture and increased urbanisation. Despite this, there is a lack of efficient methods for discovering and monitoring populations. Established macroinvertebrate monitoring methods, such as net sampling, are unsuitable given the organism’s life history, which includes long-lived diapausing eggs, benthic habits and ephemerally active populations. Conventional hatching methods, such as sediment incubation, are both time-consuming and potentially confounded by bet-hedging hatching strategies of diapausing eggs. Here we develop a new molecular diagnostic method to detect viable egg banks of T. cancriformis, and compare its performance to two conventional monitoring methods involving diapausing egg hatching. We apply this method to a collection of pond sediments from the Wildfowl & Wetlands Trust Caerlaverock National Nature Reserve, which holds one of the two remaining British populations of T. cancriformis. DNA barcoding of isolated eggs, using newly designed species-specific primers for a large region of mtDNA, was used to estimate egg viability. These estimates were compared to those obtained by the conventional methods of sediment and isolation hatching. Our method outperformed the conventional methods, revealing six ponds holding viable T. cancriformis diapausing egg banks in Caerlaverock. Additionally, newly designed species-specific primers for a short region of mtDNA identified degraded, inviable eggs and were used to ascertain the levels of recent mortality within an egg bank. Together with efficient sugar flotation techniques to extract eggs from sediment samples, our molecular method proved to be a faster and more powerful alternative for assessing the viability and condition of T. cancriformis diapausing egg banks.
Estimates of global species diversity have varied widely, primarily based on variation in the numbers derived from different inventory methods of arthropods and other small invertebrates. Within vertebrates, current diversity metrics for fishes, amphibians, and reptiles are known to be poor estimators, whereas those for birds and mammals are often assumed to be relatively well established. We show that avian evolutionary diversity is significantly underestimated due to a taxonomic tradition not found in most other taxonomic groups. Using a sample of 200 species taken from a list of 9159 biological species determined primarily by morphological criteria, we applied a diagnostic, evolutionary species concept to a morphological and distributional data set that resulted in an estimate of 18,043 species of birds worldwide, with a 95% confidence interval of 15,845 to 20,470. In a second, independent analysis, we examined intraspecific genetic data from 437 traditional avian species, finding an average of 2.4 evolutionary units per species, which can be considered proxies for phylogenetic species. Comparing recent lists of species to that used in this study (based primarily on morphology) revealed that taxonomic changes in the past 25 years have led to an increase of only 9%, well below what our results predict. Therefore, our molecular and morphological results suggest that the current taxonomy of birds underestimates avian species diversity by at least a factor of two. We suggest that a revised taxonomy that better captures avian species diversity will enhance the quantification and analysis of global patterns of diversity and distribution, as well as provide a more appropriate framework for understanding the evolutionary history of birds.
There is an unmet need for greater investment in preparedness against major epidemics and pandemics. The arguments in favour of such investment have been largely based on estimates of the losses in national incomes that might occur as the result of a major epidemic or pandemic. Recently, we extended the estimate to include the valuation of the lives lost as a result of pandemic-related increases in mortality. This produced markedly higher estimates of the full value of loss that might occur as the result of a future pandemic. We parametrized an exceedance probability function for a global influenza pandemic and estimated that the expected number of influenza-pandemic-related deaths is about 720 000 per year. We calculated the expected annual losses from pandemic risk to be about 500 billion United States dollars, or 0.6% of global income, per year. This estimate falls within, but towards the lower end of, the Intergovernmental Panel on Climate Change’s estimates of the value of the losses from global warming, which range from 0.2% to 2% of global income. The estimated percentage of annual national income represented by the expected value of losses varied by country income grouping: from a little over 0.3% in high-income countries to 1.6% in lower-middle-income countries. Most of the losses from influenza pandemics come from rare, severe events.
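The link between an exceedance probability function and an expected annual loss can be sketched numerically. For a non-negative annual severity, the expected value equals the integral of the exceedance probability over all severity levels. The example below uses a purely hypothetical exponential exceedance curve in arbitrary severity units; the parameters `p_any` and `scale` and the function names are illustrative assumptions and are not the parametrization used in the study.

```python
import math


def exceedance(severity, p_any=0.03, scale=1.0):
    """Hypothetical annual probability that severity exceeds the given level.

    p_any is the assumed annual probability of any pandemic at all and
    scale sets how quickly larger events become rarer (both illustrative).
    """
    return p_any * math.exp(-severity / scale)


def expected_annual_severity(exceed, upper=50.0, steps=100_000):
    """Integrate the exceedance curve from 0 to upper with the trapezoid rule."""
    h = upper / steps
    total = 0.5 * (exceed(0.0) + exceed(upper))
    for i in range(1, steps):
        total += exceed(i * h)
    return total * h


numeric = expected_annual_severity(exceedance)
analytic = 0.03 * 1.0  # closed form for the exponential curve: p_any * scale
print(numeric, analytic)
```

Multiplying such an expected severity by a value per unit of severity (for example, a valuation per death) gives an expected annual loss. With a heavier-tailed exceedance curve, most of the integral comes from the rare, severe end of the curve, which is the pattern noted in the final sentence of the abstract.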