
Concept: Assessment


The assessment of scientific publications is an integral part of the scientific process. Here we investigate three methods of assessing the merit of a scientific paper: subjective post-publication peer review, the number of citations gained by a paper, and the impact factor of the journal in which the article was published. We investigate these methods using two datasets in which subjective post-publication assessments of scientific publications have been made by experts. We find that there are moderate, but statistically significant, correlations between assessor scores, when two assessors have rated the same paper, and between assessor score and the number of citations a paper accrues. However, we show that assessor score depends strongly on the journal in which the paper is published, and that assessors tend to over-rate papers published in journals with high impact factors. If we control for this bias, we find that the correlation between assessor scores and between assessor score and the number of citations is weak, suggesting that scientists have little ability to judge either the intrinsic merit of a paper or its likely impact. We also show that the number of citations a paper receives is an extremely error-prone measure of scientific merit. Finally, we argue that the impact factor is likely to be a poor measure of merit, since it depends on subjective assessment. We conclude that the three measures of scientific merit considered here are poor; in particular subjective assessments are an error-prone, biased, and expensive method by which to assess merit. We argue that the impact factor may be the most satisfactory of the methods we have considered, since it is a form of pre-publication review. However, we emphasise that it is likely to be a very error-prone measure of merit that is qualitative, not quantitative.

Concepts: Scientific method, Academic publishing, Assessment, Psychometrics, Peer review, Impact factor, Scientific journal, Open access


The increasing prevalence of social media means that we often encounter written language characterized by both stylistic variation and outright errors. How does the personality of the reader modulate reactions to non-standard text? Experimental participants read ‘email responses’ to an ad for a housemate that either contained no errors or had been altered to include either typos (e.g., teh) or homophonous grammar errors (grammos, e.g., to/too, it’s/its). Participants completed a 10-item evaluation scale for each message, which measured their impressions of the writer. In addition participants completed a Big Five personality assessment and answered demographic and language attitude questions. Both typos and grammos had a negative impact on the evaluation scale. This negative impact was not modulated by age, education, electronic communication frequency, or pleasure reading time. In contrast, personality traits did modulate assessments, and did so in distinct ways for grammos and typos.

Concepts: Evaluation, Personality psychology, Assessment, Educational psychology, Writing, Modulation, Communication, Big Five personality traits


(1) is children’s irony appreciation and processing related to their empathy skills? and (2) is children’s processing of a speaker’s ironic meaning best explained by a modular or interactive theory? Participants were thirty-one 8- and 9-year-old children. We used a variant of the visual world paradigm to assess children’s processing of ironic and literal evaluative remarks; in this paradigm children’s cognition is revealed through their actions and eye gaze. Results in this paradigm showed that children’s irony appreciation and processing were correlated with their empathy development, suggesting that empathy or emotional perspective taking may be important for development of irony comprehension. Further, children’s processing of irony was consistent with an interactive framework, in which children consider ironic meanings in the earliest moments, as speech unfolds. These results provide important new insights about development of this complex aspect of emotion recognition.

Concepts: Psychology, Assessment, Linguistics, Empathy, Emotion, Perspective, Meaning, Irony


The aim was to determine if bracket prescription has any effect on the subjective outcome of pre-adjusted edgewise treatment as judged by professionals. This retrospective observational assessment study was undertaken in the Orthodontic Department of the Charles Clifford Dental Hospital, Sheffield, UK. Forty sets of post-treatment study models from patients treated using a pre-adjusted edgewise appliance (20 Roth and 20 MBT) were selected. The models were masked and shown in a random order to nine experienced orthodontic clinicians, who were asked to assess the quality of the outcome, using a pre-piloted questionnaire. The principal outcome measure was the Incisor and Canine Aesthetic Torque and Tip (ICATT) score given by each of the nine judges to each of the 40 post-treatment models. A two-way analysis of variance was undertaken with total ICATT score as the dependent variable and bracket prescription (Roth or MBT) and assessor as the independent variables. There were statistically significant differences between the subjective assessments of the nine judges (P < 0.001), but there was no statistically significant difference between the two bracket prescriptions (P = 0.900). The best agreement between a clinician's judgment of prescription used and the actual prescription was fair (kappa statistic 0.25; CI -0.05 to 0.55). The ability to determine which bracket prescription was used was no better than chance for the majority of clinicians. Bracket prescription had no effect on the subjective aesthetic judgments of post-treatment study models made by nine experienced orthodontists.
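As an illustration of the kappa statistic quoted in this abstract, the following is a minimal sketch of Cohen's kappa for one clinician's guesses against the actual prescription. The abstract does not say which kappa variant was used, so Cohen's two-rater form is an assumption here. The function name `cohen_kappa` and the counts in the example are hypothetical, invented so that the result equals 0.25; they are not the study's data.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa: agreement between two label sequences beyond chance."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label sequences must be non-empty and of equal length")
    n = len(labels_a)
    # Observed agreement: fraction of items with identical labels.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement: product of the two marginal proportions, summed over categories.
    count_a = Counter(labels_a)
    count_b = Counter(labels_b)
    categories = count_a.keys() | count_b.keys()
    expected = sum(count_a[c] * count_b[c] for c in categories) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (observed - expected) / (1.0 - expected)


# Hypothetical example (not the study's data): 40 models, 20 Roth and 20 MBT.
actual = ["Roth"] * 20 + ["MBT"] * 20
# Invented guesses: 12 of the Roth models and 13 of the MBT models identified correctly.
guessed = ["Roth"] * 12 + ["MBT"] * 8 + ["MBT"] * 13 + ["Roth"] * 7
kappa = cohen_kappa(actual, guessed)
print(round(kappa, 2))
```

With these invented counts the observed agreement is 25/40 = 0.625 and the chance agreement is 0.5, so kappa is 0.25: the clinician recovers only a quarter of the possible agreement beyond chance.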

Concepts: Statistics, Hospital, Evaluation, Statistical significance, Assessment, Analysis of variance, Dentistry, Judgment


The nanoparticle industry is expected to become a trillion-dollar business in the near future. Therefore, the unintentional introduction of nanoparticles into the environment is increasingly likely. However, currently applied risk-assessment practices require further adaptation to accommodate the intrinsic nature of engineered nanoparticles. Combining a chronic flow-through exposure system with subsequent acute toxicity tests for the standard test organism Daphnia magna, we found that juvenile offspring of adults that were previously exposed to titanium dioxide nanoparticles exhibit a significantly increased sensitivity to titanium dioxide nanoparticles compared with the offspring of unexposed adults, as displayed by lower 96 h-EC50 values. This observation is particularly remarkable because adults exhibited no differences among treatments in terms of typically assessed endpoints, such as sensitivity, number of offspring, or energy reserves. Hence, the present study suggests that ecotoxicological research requires further development to include the assessment of the environmental risks of nanoparticles for the next and hence not directly exposed generation, which is currently not included in standard test protocols.

Concepts: Environment, Natural environment, Data, Assessment, Titanium dioxide, Star Trek: The Next Generation, Daphnia, Jonathan Frakes


BACKGROUND: Systematic reviews have been challenged to consider effects on disadvantaged groups. A priori specification of subgroup analyses is recommended to increase the credibility of these analyses. This study aimed to develop and assess inter-rater agreement for an algorithm for systematic review authors to predict whether differences in effect measures are likely for disadvantaged populations relative to advantaged populations (only relative effect measures were addressed). METHODS: A health equity plausibility algorithm was developed using clinimetric methods with three items based on literature review, key informant interviews and methodology studies. The three items dealt with the plausibility of differences in relative effects across sex or socioeconomic status (SES) due to: 1) patient characteristics; 2) intervention delivery (i.e., implementation); and 3) comparators. Thirty-five respondents (consisting of clinicians, methodologists and research users) assessed the likelihood of differences across sex and SES for ten systematic reviews with these questions. We assessed inter-rater reliability using Fleiss' multi-rater kappa. RESULTS: The proportion agreement was 66% for patient characteristics (95% confidence interval: 61% to 71%), 67% for intervention delivery (95% confidence interval: 62% to 72%) and 55% for the comparator (95% confidence interval: 50% to 60%). Inter-rater kappa, assessed with Fleiss' kappa, ranged from 0 to 0.199, representing very low agreement beyond chance. CONCLUSIONS: Users of systematic reviews rated that important differences in relative effects across sex and socioeconomic status were plausible for a range of individual and population-level interventions. However, there was very low inter-rater agreement for these assessments. There is an unmet need for discussion of plausibility of differential effects in systematic reviews.
Increased consideration of external validity and applicability to different populations and settings is warranted in systematic reviews to meet this need.
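To make the Fleiss' kappa figures in this abstract concrete, the following is a minimal sketch of the standard Fleiss' kappa calculation for several raters. The function name `fleiss_kappa` and the small rating table are hypothetical; the table is invented for illustration and is not the study's data.

```python
def fleiss_kappa(counts):
    """Fleiss' kappa for a table where counts[i][j] is the number of raters
    who assigned subject i to category j. Every subject must have the same
    number of raters (at least two)."""
    n_subjects = len(counts)
    if n_subjects == 0:
        raise ValueError("at least one subject is required")
    n_raters = sum(counts[0])
    if n_raters < 2 or any(sum(row) != n_raters for row in counts):
        raise ValueError("each subject needs the same number (>= 2) of ratings")
    n_categories = len(counts[0])
    # Share of all ratings that fell into each category.
    p_j = [
        sum(row[j] for row in counts) / (n_subjects * n_raters)
        for j in range(n_categories)
    ]
    # Per-subject agreement: fraction of rater pairs that agree.
    p_i = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]
    mean_agreement = sum(p_i) / n_subjects
    chance_agreement = sum(p * p for p in p_j)
    if chance_agreement == 1.0:
        raise ValueError("kappa is undefined when all ratings share one category")
    return (mean_agreement - chance_agreement) / (1.0 - chance_agreement)


# Hypothetical table: 4 reviews, 5 raters each, categories
# ("difference plausible", "difference not plausible").
ratings = [
    [5, 0],
    [3, 2],
    [2, 3],
    [4, 1],
]
kappa = fleiss_kappa(ratings)
print(round(kappa, 3))
```

For this invented table the mean pairwise agreement is 0.60 and the chance agreement is 0.58, giving a kappa of about 0.048. This shows how a proportion agreement that looks moderate can still correspond to very low agreement beyond chance.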

Concepts: Evidence-based medicine, Assessment, Interval finite element, Meta-analysis, Cohen's kappa, Inter-rater reliability, Contract, Fleiss' kappa


In 2007, the World Health Organization (WHO) received criticism for a lack of transparency and systematic methods in the development of guidelines, which were at that time perceived as substantially driven by expert opinion. In this paper we assessed the quality of maternal and perinatal health guidelines developed since then. We used the Appraisal of Guidelines for Research and Evaluation (AGREE) II tool to evaluate the quality of methodological rigour and transparency of four different WHO guidelines published between 2007 and 2011. Our findings showed high scores among the most recent guidelines on maternal and perinatal health, suggesting higher quality. However, there is still potential for improvement, especially in including different stakeholder views, transparency of guidelines regarding the role of the funding body and presentation of the guideline document.

Concepts: Infant, Evaluation, Assessment, World Health Organization


This study, conducted in a group of nine chronic patients with right-side hemiparesis after stroke, investigated the effects of robotic-assisted rehabilitation training with an upper limb robotic exoskeleton for the restoration of motor function in spatial reaching movements. The robotic-assisted rehabilitation training was administered for a period of 6 weeks and included reaching and spatial antigravity movements. To assess the carry-over of the observed improvements in movement during training into improved function, a kinesiologic assessment of the effects of the training was performed by means of motion and dynamic electromyographic analysis of reaching movements performed before and after training. The same kinesiologic measurements were performed in a healthy control group of seven volunteers, to determine a benchmark for the experimental observations in the patients' group. Moreover, the degree of functional impairment at enrolment and discharge was measured by clinical evaluation with the upper limb Fugl-Meyer Assessment scale (FMA, 0-66 points), the Modified Ashworth scale (MA, 0-60 pts) and active ranges of motion. The robot-aided training induced, independently of the time of stroke, statistically significant improvements in kinesiologic (movement time, smoothness of motion) and clinical (4.6 ± 4.2 increase in FMA, 3.2 ± 2.1 decrease in MA) parameters, as a result of the increased active ranges of motion and improved co-contraction index for shoulder extension/flexion. Kinesiologic parameters correlated significantly with clinical assessment values, and their changes after the training were affected by the direction of motion (inward vs. outward movement) and position of target to be reached (ipsilateral, central and contralateral peripersonal space).
These changes can be explained as a result of the motor recovery induced by the robotic training, in terms of regained ability to execute single joint movements and of improved interjoint coordination of elbow and shoulder joints.

Concepts: Time, Scientific method, Evaluation, Measurement, Assessment, Joint, Upper limb, Clavicle


When established communication systems cannot be used, people rapidly create novel systems to modify the mental state of another agent according to their intentions. However, there are dramatic inter-individual differences in the implementation of this human competence for communicative innovation. Here we characterize psychological sources of inter-individual variability in the ability to build a shared communication system from scratch. We consider two potential sources of variability in communicative skills. Cognitive traits of two individuals could independently influence their joint ability to establish a communication system. Another possibility is that the overlap between those individual traits influences the communicative performance of a dyad. We assess these possibilities by quantifying the relationship between cognitive traits and behavior of communicating dyads. Cognitive traits were assessed with psychometric scores quantifying cooperative attitudes and fluid intelligence. Competence for implementing successful communicative innovations was assessed by using a non-verbal communicative task. Individual capacities influence communicative success when communicative innovations are generated. Dyadic similarities and individual traits modulate the type of communicative strategy chosen. The ability to establish novel communicative actions was influenced by a combination of the communicator’s ability to understand intentions and the addressee’s ability to recognize patterns. Communicative pairs with comparable systemizing abilities or behavioral inhibition were more likely to explore the search space of possible communicative strategies by systematically adding new communicative behaviors to those already available. No individual psychometric measure seemed predominantly responsible for communicative success. These findings support the notion that the human ability for fast communicative innovations represents a special type of complex collaborative activity.

Concepts: Psychology, Assessment, Educational psychology, Behavior, Human behavior, Communication, Behaviorism, Dyadic


To test for cross-sectional (at age 11) and longitudinal associations between objectively measured free-living physical activity (PA) and academic attainment in adolescents. Method: Data from 4755 participants (45% male) with valid measurement of PA (total volume and intensity) by accelerometry at age 11 from the Avon Longitudinal Study of Parents and Children (ALSPAC) were examined. Data linkage was performed with nationally administered school assessments in English, Maths and Science at ages 11, 13 and 16.

Concepts: Cohort study, Longitudinal study, Cross-sectional study, Measurement, Sociology, Assessment, Psychometrics, Test method