Concept: Impact factor
Most researchers acknowledge an intrinsic hierarchy in the scholarly journals (“journal rank”) that they submit their work to, and adjust not only their submission but also their reading strategies accordingly. On the other hand, much has been written about the negative effects of institutionalizing journal rank as an impact measure. So far, contributions to the debate concerning the limitations of journal rank as a scientific impact assessment tool have either lacked data, or relied on only a few studies. In this review, we present the most recent and pertinent data on the consequences of our current scholarly communication system with respect to various measures of scientific quality (such as utility/citations, methodological soundness, expert ratings or retractions). These data corroborate previous hypotheses: using journal rank as an assessment tool is bad scientific practice. Moreover, the data lead us to argue that any journal rank (not only the currently-favored Impact Factor) would have this negative impact. Therefore, we suggest that abandoning journals altogether, in favor of a library-based scholarly communication system, will ultimately be necessary. This new system will use modern information technology to vastly improve the filter, sort and discovery functions of the current journal system.
The relationship between traditional metrics of research impact (e.g., number of citations) and alternative metrics (altmetrics) such as Twitter activity are of great interest, but remain imprecisely quantified. We used generalized linear mixed modeling to estimate the relative effects of Twitter activity, journal impact factor, and time since publication on Web of Science citation rates of 1,599 primary research articles from 20 ecology journals published from 2012-2014. We found a strong positive relationship between Twitter activity (i.e., the number of unique tweets about an article) and number of citations. Twitter activity was a more important predictor of citation rates than 5-year journal impact factor. Moreover, Twitter activity was not driven by journal impact factor; the ‘highest-impact’ journals were not necessarily the most discussed online. The effect of Twitter activity was only about a fifth as strong as time since publication; accounting for this confounding factor was critical for estimating the true effects of Twitter use. Articles in impactful journals can become heavily cited, but articles in journals with lower impact factors can generate considerable Twitter activity and also become heavily cited. Authors may benefit from establishing a strong social media presence, but should not expect research to become highly cited solely through social media promotion. Our research demonstrates that altmetrics and traditional metrics can be closely related, but not identical. We suggest that both altmetrics and traditional citation rates can be useful metrics of research impact.
Clarity and accuracy of reporting are fundamental to the scientific process. Readability formulas can estimate how difficult a text is to read. Here, in a corpus consisting of 709,577 abstracts published between 1881 and 2015 from 123 scientific journals, we show that the readability of science is steadily decreasing. Our analyses show that this trend is indicative of a growing use of general scientific jargon. These results are concerning for scientists and for the wider public, as they impact both the reproducibility and accessibility of research findings.
The assessment of scientific publications is an integral part of the scientific process. Here we investigate three methods of assessing the merit of a scientific paper: subjective post-publication peer review, the number of citations gained by a paper, and the impact factor of the journal in which the article was published. We investigate these methods using two datasets in which subjective post-publication assessments of scientific publications have been made by experts. We find that there are moderate, but statistically significant, correlations between assessor scores, when two assessors have rated the same paper, and between assessor score and the number of citations a paper accrues. However, we show that assessor score depends strongly on the journal in which the paper is published, and that assessors tend to over-rate papers published in journals with high impact factors. If we control for this bias, we find that the correlation between assessor scores and between assessor score and the number of citations is weak, suggesting that scientists have little ability to judge either the intrinsic merit of a paper or its likely impact. We also show that the number of citations a paper receives is an extremely error-prone measure of scientific merit. Finally, we argue that the impact factor is likely to be a poor measure of merit, since it depends on subjective assessment. We conclude that the three measures of scientific merit considered here are poor; in particular subjective assessments are an error-prone, biased, and expensive method by which to assess merit. We argue that the impact factor may be the most satisfactory of the methods we have considered, since it is a form of pre-publication review. However, we emphasise that it is likely to be a very error-prone measure of merit that is qualitative, not quantitative.
Despite their recognized limitations, bibliometric assessments of scientific productivity have been widely adopted. We describe here an improved method to quantify the influence of a research article by making novel use of its co-citation network to field-normalize the number of citations it has received. Article citation rates are divided by an expected citation rate that is derived from performance of articles in the same field and benchmarked to a peer comparison group. The resulting Relative Citation Ratio is article level and field independent and provides an alternative to the invalid practice of using journal impact factors to identify influential papers. To illustrate one application of our method, we analyzed 88,835 articles published between 2003 and 2010 and found that the National Institutes of Health awardees who authored those papers occupy relatively stable positions of influence across all disciplines. We demonstrate that the values generated by this method strongly correlate with the opinions of subject matter experts in biomedical research and suggest that the same approach should be generally applicable to articles published in all areas of science. A beta version of iCite, our web tool for calculating Relative Citation Ratios of articles listed in PubMed, is available at https://icite.od.nih.gov.
Journal impact factors have become an important criterion to judge the quality of scientific publications over the years, influencing the evaluation of institutions and individual researchers worldwide. However, they are also subject to a number of criticisms. Here we point out that the calculation of a journal’s impact factor is mainly based on the date of publication of its articles in print form, despite the fact that most journals now make their articles available online before that date. We analyze 61 neuroscience journals and show that delays between online and print publication of articles increased steadily over the last decade. Importantly, such a practice varies widely among journals, as some of them have no delays, while for others this period is longer than a year. Using a modified impact factor based on online rather than print publication dates, we demonstrate that online-to-print delays can artificially raise a journal’s impact factor, and that this inflation is greater for longer publication lags. We also show that correcting the effect of publication delay on impact factors changes journal rankings based on this metric. We thus suggest that indexing of articles in citation databases and calculation of citation metrics should be based on the date of an article’s online appearance, rather than on that of its publication in print.
Understanding the relationship between scientific productivity and research group size is important for deciding how science should be funded. We have investigated the relationship between these variables in the life sciences in the United Kingdom using data from 398 principle investigators (PIs). We show that three measures of productivity, the number of publications, the impact factor of the journals in which papers are published and the number of citations, are all positively correlated to group size, although they all show a pattern of diminishing returns-doubling group size leads to less than a doubling in productivity. The relationships for the impact factor and the number of citations are extremely weak. Our analyses suggest that an increase in productivity will be achieved by funding more PIs with small research groups, unless the cost of employing post-docs and PhD students is less than 20% the cost of a PI. We also provide evidence that post-docs are more productive than PhD students both in terms of the number of papers they produce and where those papers are published.
Scientific reproducibility has been at the forefront of many news stories and there exist numerous initiatives to help address this problem. We posit that a contributor is simply a lack of specificity that is required to enable adequate research reproducibility. In particular, the inability to uniquely identify research resources, such as antibodies and model organisms, makes it difficult or impossible to reproduce experiments even where the science is otherwise sound. In order to better understand the magnitude of this problem, we designed an experiment to ascertain the “identifiability” of research resources in the biomedical literature. We evaluated recent journal articles in the fields of Neuroscience, Developmental Biology, Immunology, Cell and Molecular Biology and General Biology, selected randomly based on a diversity of impact factors for the journals, publishers, and experimental method reporting guidelines. We attempted to uniquely identify model organisms (mouse, rat, zebrafish, worm, fly and yeast), antibodies, knockdown reagents (morpholinos or RNAi), constructs, and cell lines. Specific criteria were developed to determine if a resource was uniquely identifiable, and included examining relevant repositories (such as model organism databases, and the Antibody Registry), as well as vendor sites. The results of this experiment show that 54% of resources are not uniquely identifiable in publications, regardless of domain, journal impact factor, or reporting requirements. For example, in many cases the organism strain in which the experiment was performed or antibody that was used could not be identified. Our results show that identifiability is a serious problem for reproducibility. Based on these results, we provide recommendations to authors, reviewers, journal editors, vendors, and publishers. Scientific efficiency and reproducibility depend upon a research-wide improvement of this substantial problem in science today.
What are the statistical practices of articles published in journals with a high impact factor? Are there differences compared with articles published in journals with a somewhat lower impact factor that have adopted editorial policies to reduce the impact of limitations of Null Hypothesis Significance Testing? To investigate these questions, the current study analyzed all articles related to psychological, neuropsychological and medical issues, published in 2011 in four journals with high impact factors: Science, Nature, The New England Journal of Medicine and The Lancet, and three journals with relatively lower impact factors: Neuropsychology, Journal of Experimental Psychology-Applied and the American Journal of Public Health. Results show that Null Hypothesis Significance Testing without any use of confidence intervals, effect size, prospective power and model estimation, is the prevalent statistical practice used in articles published in Nature, 89%, followed by articles published in Science, 42%. By contrast, in all other journals, both with high and lower impact factors, most articles report confidence intervals and/or effect size measures. We interpreted these differences as consequences of the editorial policies adopted by the journal editors, which are probably the most effective means to improve the statistical practices in journals with high or low impact factors.
Objective To examine how poor reporting and inadequate methods for key methodological features in randomised controlled trials (RCTs) have changed over the past three decades.Design Mapping of trials included in Cochrane reviews.Data sources Data from RCTs included in all Cochrane reviews published between March 2011 and September 2014 reporting an evaluation of the Cochrane risk of bias items: sequence generation, allocation concealment, blinding, and incomplete outcome data.Data extraction For each RCT, we extracted consensus on risk of bias made by the review authors and identified the primary reference to extract publication year and journal. We matched journal names with Journal Citation Reports to get 2014 impact factors.Main outcomes measures We considered the proportions of trials rated by review authors at unclear and high risk of bias as surrogates for poor reporting and inadequate methods, respectively.Results We analysed 20 920 RCTs (from 2001 reviews) published in 3136 journals. The proportion of trials with unclear risk of bias was 48.7% for sequence generation and 57.5% for allocation concealment; the proportion of those with high risk of bias was 4.0% and 7.2%, respectively. For blinding and incomplete outcome data, 30.6% and 24.7% of trials were at unclear risk and 33.1% and 17.1% were at high risk, respectively. Higher journal impact factor was associated with a lower proportion of trials at unclear or high risk of bias. The proportion of trials at unclear risk of bias decreased over time, especially for sequence generation, which fell from 69.1% in 1986-1990 to 31.2% in 2011-14 and for allocation concealment (70.1% to 44.6%). After excluding trials at unclear risk of bias, use of inadequate methods also decreased over time: from 14.8% to 4.6% for sequence generation and from 32.7% to 11.6% for allocation concealment.Conclusions Poor reporting and inadequate methods have decreased over time, especially for sequence generation and allocation concealment. But more could be done, especially in lower impact factor journals.