Concept: Criterion validity


Working dog organisations, such as Guide Dogs, need to regularly assess the behaviour of the dogs they train. In this study we developed a questionnaire-style behaviour assessment completed by training supervisors of juvenile guide dogs aged 5, 8 and 12 months old (n = 1,401), and evaluated aspects of its reliability and validity. Specifically, internal reliability, temporal consistency, construct validity, predictive criterion validity (comparing against later training outcome) and concurrent criterion validity (comparing against a standardised behaviour test) were evaluated. Thirty-nine questions were sourced either from previously published literature or created to meet requirements identified via Guide Dogs staff surveys and staff feedback. Internal reliability analyses revealed seven reliable and interpretable trait scales named according to the questions within them as: Adaptability; Body Sensitivity; Distractibility; Excitability; General Anxiety; Trainability and Stair Anxiety. Intra-individual temporal consistency of the scale scores between 5-8, 8-12 and 5-12 months was high. All scales except Body Sensitivity showed some degree of concurrent criterion validity. Predictive criterion validity was supported for all seven scales, since associations were found with training outcome at at least one age. Thresholds of z-scores on the scales were identified that were able to distinguish later training outcome by identifying 8.4% of all dogs withdrawn for behaviour and 8.5% of all qualified dogs, with 84% and 85% specificity. The questionnaire assessment was reliable and could detect traits that are consistent within individuals over time, despite juvenile dogs undergoing development during the study period. By applying thresholds to scores produced from the questionnaire this assessment could prove to be a highly valuable decision-making tool for Guide Dogs. This is the first questionnaire-style assessment of juvenile dogs that has shown value in predicting the training outcome of individual working dogs.

Concepts: Scientific method, Psychometrics, Validity, Test validity, Criterion validity, Construct validity, Dog, Test
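
The abstract above describes applying thresholds to z-scores and reporting how many dogs with a given outcome were identified, and with what specificity. The sketch below shows one way such figures could be computed. It is an illustration only: the function names, the use of the population standard deviation, the "flag when z is at or above the threshold" rule and the example data are all assumptions, not the study's method.

```python
from statistics import mean, pstdev


def z_scores(values):
    """Standardise raw scale scores (assumption: population SD is used)."""
    m = mean(values)
    s = pstdev(values)
    return [(v - m) / s for v in values]


def sensitivity_specificity(z, withdrawn, threshold):
    """Flag a dog when its z-score is at or above the threshold (assumed rule).

    z         -- list of z-scores, one per dog
    withdrawn -- list of bools, True if the dog was later withdrawn
    Returns (sensitivity, specificity).
    """
    tp = sum(1 for s, w in zip(z, withdrawn) if w and s >= threshold)
    fn = sum(1 for s, w in zip(z, withdrawn) if w and s < threshold)
    tn = sum(1 for s, w in zip(z, withdrawn) if not w and s < threshold)
    fp = sum(1 for s, w in zip(z, withdrawn) if not w and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical z-scores and outcomes, invented for illustration.
example_z = [-1.2, -0.4, 0.1, 1.6, 0.3, 1.9, 2.2, -0.8]
example_withdrawn = [False, False, False, False, True, True, True, False]
sens, spec = sensitivity_specificity(example_z, example_withdrawn, 1.5)
```

Raising the threshold generally trades sensitivity for specificity, which is why a threshold is chosen against an explicit target for one of the two.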


Study design: Cross-sectional validation study. Objectives: To develop and validate a self-report version of the Spinal Cord Independence Measure (SCIM III). Setting: Two SCI rehabilitation facilities in Switzerland. Methods: SCIM III comprises 19 questions on daily tasks with a total score between 0 and 100 and subscales for ‘self-care’, ‘respiration & sphincter management’ and ‘mobility’. A self-report version (SCIM-SR) was developed by expert discussions and pretests in individuals with spinal cord injury (SCI) using a German translation. A convenience sample of 99 inpatients with SCI was recruited. SCIM-SR data were analyzed together with SCIM III data obtained from attending health professionals. Results: High correlations between SCIM III and SCIM-SR were observed. Pearson’s r for the total score was 0.87 (95% confidence interval (CI) 0.82-0.91), for the subscales self-care 0.87 (0.81-0.91); respiration & sphincter management 0.81 (0.73-0.87); and mobility 0.87 (0.82-0.91). Intraclass correlations were: total score 0.90 (95% CI 0.85-0.93); self-care 0.86 (0.79-0.90); respiration & sphincter management 0.80 (0.71-0.86); and mobility 0.83 (0.76-0.89). Bland-Altman plots showed that patients rated their functioning higher than professionals, in particular for mobility. The mean difference between SCIM-SR and SCIM III for the total score was 5.14 (point estimate 95% CI 2.95-7.34), self-care 0.89 (0.19-1.59), respiration & sphincter management 1.05 (0.18-2.28) and mobility 3.49 (2.44-4.54). Particularly patients readmitted because of pressure sores rated their independence higher than attending professionals. Conclusion: Our results support the criterion validity of SCIM-SR. The self-report version may facilitate long-term evaluations of independence in persons with SCI in their home situation.

Concepts: Statistics, Validation, Criterion validity, Pearson product-moment correlation coefficient, Spinal cord injury, Statistical inference, The Criterion, Interval estimation
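
The SCIM-SR abstract above rests on three quantities: a Pearson correlation, a confidence interval around it, and a Bland-Altman mean difference. The sketch below shows how each could be computed with the Python standard library. The abstract does not say how its intervals were obtained; the Fisher z-transformation used here is one common choice and is an assumption, as are the function names and the 1.96 multiplier for the limits of agreement. With the rounded total-score r of 0.87 and n = 99, this approach gives an interval of roughly 0.81 to 0.91, close to but not identical with the reported 0.82-0.91.

```python
import math
from statistics import NormalDist, mean, stdev


def pearson_r(x, y):
    """Pearson product-moment correlation of two equal-length samples."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def fisher_ci(r, n, level=0.95):
    """Approximate CI for a correlation via Fisher's z (assumed method)."""
    z = math.atanh(r)
    se = 1.0 / math.sqrt(n - 3)
    crit = NormalDist().inv_cdf(0.5 + level / 2)
    return math.tanh(z - crit * se), math.tanh(z + crit * se)


def bland_altman(self_report, clinician):
    """Mean difference (bias) and approximate 95% limits of agreement."""
    diffs = [s - c for s, c in zip(self_report, clinician)]
    bias = mean(diffs)
    sd = stdev(diffs)
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


# r and n taken from the abstract; the interval method is an assumption.
lower, upper = fisher_ci(0.87, 99)
```

A positive bias from `bland_altman(self_report, clinician)` corresponds to the pattern reported above, where patients rated their functioning higher than professionals did.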


OBJECTIVE: The purpose of this study was to develop a short and reliable measure of hypersexuality that could be used in everyday practice in patients with Parkinson’s disease (PD). DESIGN: The original 25-item questionnaire, the Sexual Addiction Screening Test (SAST), was shortened and tested in a PD population. METHODS: Successive reductions were performed until a final set of items satisfied the model fit requirements. The testing phase consisted of administering the SAST questionnaire to 159 PD patients. It included i) acceptability, ii) dimensionality construct validity, and iii) a complete general correlation structure of data. Finally, criterion validity of the final version of the instrument was assessed. RESULTS: The initial questionnaire was reduced to five items (PD-SAST) with a cut-off score of 2. Psychometric analysis revealed three factors corresponding to “Preoccupation”, “Cannot stop” and “Relationship disturbance”. The discriminant validity of the PD-SAST was high (ROC area under the curve: 0.96). CONCLUSIONS: The PD-SAST performs well as a screening instrument. It has been found to be acceptable to patients and is ready for use. Moreover, it tests multidimensional aspects of hypersexuality.

Concepts: Assessment, Psychometrics, Addiction, Validity, Test validity, Criterion validity, Construct validity, Reliability
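
The PD-SAST abstract above summarises discrimination with the area under the ROC curve. As a minimal sketch of what that number means, the AUC can be computed as the proportion of case-control pairs in which the case scores higher, with ties counted as half. The function name and the example scores are hypothetical and are not data from the study.

```python
def roc_auc(case_scores, control_scores):
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a randomly chosen case scores higher than a
    randomly chosen control, counting ties as one half.
    """
    wins = 0.0
    for c in case_scores:
        for k in control_scores:
            if c > k:
                wins += 1.0
            elif c == k:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))


# Hypothetical screening scores, invented for illustration.
perfect = roc_auc([3, 4, 5], [0, 1, 2])   # complete separation
chance = roc_auc([1, 2], [1, 2])          # identical distributions
```

An AUC of 1.0 means every case outscores every control, and 0.5 means the score carries no discriminating information.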


This study examines preliminary evidence for the Lichtenberg Financial Decision Rating Scale (LFDRS), a new person-centered approach to assessing capacity to make financial decisions, and its relationship to self-reported cases of financial exploitation in 69 older African Americans. More than one third of individuals reporting financial exploitation also had questionable decisional abilities. Overall, decisional ability score and current decision total were significantly associated with cognitive screening test and financial ability scores, demonstrating good criterion validity. Financially exploited and non-exploited individuals showed mean group differences on the Mini Mental State Exam, the Financial Situational Awareness, Psychological Vulnerability, Current Decisional Ability, and Susceptibility to Undue Influence subscales, and the total LFDRS score. Study findings suggest that impaired decisional abilities may render older adults more vulnerable to financial exploitation, and that the LFDRS is a valid tool for measuring both decisional abilities and financial exploitation.

Concepts: Decision making, Risk, Cognition, Psychometrics, Criterion validity, Construct validity, Mini-mental state examination, American Civil War


The purposes of this systematic psychometric review are to provide social work researchers, educators, and administrators with a summary of descriptive psychometric information pertaining to scales that measure social workers' beliefs about research and social work practice; to evaluate chronological changes in psychometric/statistical methodology; and to summarize the role that current and future scale development efforts have in improving the use of evidence-based social work practice. Using predetermined inclusion and exclusion criteria, electronic databases and reference lists of included studies were reviewed and coded for methodological and psychometric properties. Seventeen studies satisfied inclusion and exclusion criteria. Eleven unique scales measuring social worker beliefs regarding research and social work practice were identified. The majority of scales and subscales had Cronbach’s alphas that exceeded .70. Most of the scales had evidence of content, factorial, construct, and/or criterion validity. Strategies for improving psychometric research and implications for evidence-based social work practice are discussed.

Concepts: Sociology, Educational psychology, Psychometrics, Validity, Criterion validity, Construct validity, Social work, International Federation of Social Workers


Positive and negative effects of training appear to induce oscillations of performance, suggesting that the delayed cumulative effects of training on daily performance capacity (DPC) are best fitted by sine waves damped over time. The aim of this study is to compare the criterion validity of the Impulse Response (IR) model of Banister and the damped harmonic oscillation (DHO) model for quantifying the training load (TL)-DPC relationship.

Concepts: Oscillation, Wave, Validity, Criterion validity, Periodic function, Harmonic oscillator, Sine wave, Simple harmonic motion


The Profile of Music Perception Skills (PROMS) is a recently developed measure of perceptual music skills which has been shown to have promising psychometric properties. In this paper we extend the evaluation of its brief version to three kinds of validity using an individual difference approach. The brief PROMS displays good discriminant validity with working memory, given that it does not correlate with backward digit span (r = .04). Moreover, it shows promising criterion validity (association with musical training (r = .45), musicianship status (r = .48), and self-rated musical talent (r = .51)). Finally, its convergent validity, i.e. relation to an unrelated measure of music perception skills, was assessed by correlating the brief PROMS to harmonic closure judgment accuracy. Two independent samples point to good convergent validity of the brief PROMS (r = .36; r = .40). The same association is still significant in one of the samples when including self-reported music skill in a partial correlation (rpartial = .30; rpartial = .17). Overall, the results show that the brief version of the PROMS displays a very good pattern of construct validity. Especially its tuning subtest stands out as a valuable part for music skill evaluations in Western samples. We conclude by briefly discussing the choice faced by music cognition researchers between different musical aptitude measures of which the brief PROMS is a well evaluated example.

Concepts: Cognition, Assessment, Psychometrics, Validity, Test validity, Criterion validity, Construct validity, Music
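
The brief PROMS abstract above reports partial correlations that control for self-reported music skill. For a single control variable, a first-order partial correlation can be obtained from the three pairwise correlations, as sketched below. The function name and the example inputs are hypothetical; the abstract reports only the resulting partial correlations, not the pairwise values needed to reproduce them.

```python
import math


def partial_corr(r_xy, r_xz, r_yz):
    """First-order partial correlation of x and y, controlling for z.

    r_xy, r_xz, r_yz are the pairwise Pearson correlations.
    """
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Hypothetical pairwise correlations, invented for illustration.
example = partial_corr(0.40, 0.50, 0.20)
```

When the control variable is uncorrelated with both measures, the partial correlation equals the ordinary correlation, which is a quick sanity check on the formula.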


Although it is natural for parents to value their children, some parents “overvalue” them, believing that their own children are more special and more entitled than other children are. This research introduces this concept of parental overvaluation. We developed a concise self-report scale to measure individual differences in parental overvaluation, the Parental Overvaluation Scale (POS; Study 1). The POS has high test-retest stability over 6, 12, and 18 months (Study 2). As demonstrated in a representative sample of Dutch parents (Study 3) and a diverse sample of American parents (Study 4), the POS has an internally consistent single-factor structure; strong measurement invariance across sexes; as well as good convergent, discriminant, and criterion validity. Overvaluation is especially high in narcissistic parents (Studies 3, 4, 6). When parents overvalue their child, they overclaim their child’s knowledge (Study 4), perceive their child as more gifted than actual IQ scores justify (Study 5), want their child to stand out from others, and frequently praise their child in real-life settings (Study 6). By contrast, overvaluation is not consistently related to parents' basic parenting dimensions (i.e., warmth and control) or Big Five personality traits (Studies 3, 4, 6). Importantly, overvalued children are not more intelligent or better performing than other children (Studies 5-6). These findings support the validity of the POS and show that parental overvaluation has important and unique implications for parents' beliefs and practices. Research on overvaluation might shed light on the determinants of parenting practices and the socialization of children’s self-views, including narcissism. (PsycINFO Database Record © 2014 APA, all rights reserved).

Concepts: Parent, Psychometrics, Criterion validity, Intelligence, Parenting, Narcissistic personality disorder, Narcissism, Narcissistic parents


BACKGROUND: Two of the current methodological barriers to implementation science efforts are the lack of agreement regarding constructs hypothesized to affect implementation success and identifiable measures of these constructs. In order to address these gaps, the main goals of this paper were to identify a multi-level framework that captures the predominant factors that impact implementation outcomes, conduct a systematic review of available measures assessing constructs subsumed within these primary factors, and determine the criterion validity of these measures in the search articles. METHOD: We conducted a systematic literature review to identify articles reporting the use or development of measures designed to assess constructs that predict the implementation of evidence-based health innovations. Articles published through 12 August 2012 were identified through MEDLINE, CINAHL, PsycINFO and the journal Implementation Science. We then utilized a modified five-factor framework in order to code whether each measure contained items that assess constructs representing structural, organizational, provider, patient, and innovation level factors. Further, we coded the criterion validity of each measure within the search articles obtained. RESULTS: Our review identified 62 measures. Results indicate that organization, provider, and innovation-level constructs have the greatest number of measures available for use, whereas structural and patient-level constructs have the fewest. Additionally, relatively few measures demonstrated criterion validity, or reliable association with an implementation outcome (e.g., fidelity). DISCUSSION: In light of these findings, our discussion centers on strategies that researchers can utilize in order to identify, adapt, and improve extant measures for use in their own implementation research. In total, our literature review and resulting measures compendium increase the capacity of researchers to conceptualize and measure implementation-related constructs in their ongoing and future research.

Concepts: Scientific method, Systematic review, Research, Psychometrics, Validity, Test validity, Criterion validity, Innovation


Evidence on the detrimental health effects of prolonged sedentary behavior is accumulating. Interventions need to have a specific focus on sedentary behavior in order to generate clinically meaningful decreases in sedentary time. When evaluating such interventions, the question of whether a participant's behavior improved or deteriorated is fundamental, and instruments that are able to detect those changes are essential. Therefore, the aim of this study was to determine the criterion validity against activPAL and responsiveness to change of two activity monitors (ActiGraph and activPAL) and two questionnaires for the assessment of occupational sitting and standing time.

Concepts: Psychometrics, Validity, Test validity, Criterion validity, The Criterion