Validity Evidence

Selected Abstracts


Consequences of Test Score Use as Validity Evidence: Roles and Responsibilities

EDUCATIONAL MEASUREMENT: ISSUES AND PRACTICE, Issue 1 2009
Paul D. Nichols
This article has three goals. The first is to clarify the role that the consequences of test score use play in validity judgments, by reviewing the role that modern writers on validity have ascribed to consequences in supporting those judgments. The second is to summarize current views on who is responsible for collecting evidence about the consequences of test score use, separating the responsibilities of the test developer and the test user. The last is to offer a framework that attempts to prescribe the conditions under which the responsibility for collecting evidence of consequences falls to the test developer or to the test user. [source]


Validity Evidence of an Electronic Portfolio for Preservice Teachers

EDUCATIONAL MEASUREMENT: ISSUES AND PRACTICE, Issue 1 2008
Yuankun Yao
This study applied Messick's unified, multifaceted concept of construct validity to an electronic portfolio system used in a teacher education program. The subjects included 128 preservice teachers who had recently completed their final portfolio reviews and student teaching experiences. Four of Messick's six facets of validity were investigated for the portfolio in this study; the remaining facets, examined in two previous studies, are also discussed. The evidence provided support for the substantive and generalizability aspects of validity, and limited support for the content, structural, external, and consequential aspects. It was suggested that the electronic portfolio may be used as one requirement for certification purposes, but may not be valid for the purpose of assessing teacher competencies. [source]


Validation of a method to measure resident doctors' reflections on quality improvement

MEDICAL EDUCATION, Issue 3 2010
Christopher M Wittich
Medical Education 2010: 44: 248–255. Objectives: Resident reflection on the clinical learning environment is prerequisite to identifying quality improvement (QI) opportunities and demonstrating competence in practice-based learning. However, residents' abilities to reflect on QI opportunities are unknown. Therefore, we developed and determined the validity of the Mayo Evaluation of Reflection on Improvement Tool (MERIT) for assessing resident reflection on QI opportunities. Methods: The content of MERIT, which consists of 18 items structured on 4-point scales, was based on existing literature and input from national experts. Using MERIT, six faculty members rated 50 resident reflections. Factor analysis was used to examine the dimensionality of MERIT instrument scores. Inter-rater and internal consistency reliabilities were calculated. Results: Factor analysis revealed three factors (eigenvalue; number of items): Reflection on Personal Characteristics of QI (8.5; 7), Reflection on System Characteristics of QI (1.9; 6), and Problem of Merit (1.5; 5). Inter-rater reliability was very good (intraclass correlation coefficient range: 0.73–0.89). Internal consistency reliability was excellent (Cronbach's α = 0.93 overall and 0.83–0.91 for factors). Item mean scores were highest for Problem of Merit (3.29) and lowest for Reflection on System Characteristics of QI (1.99). Conclusions: Validity evidence supports MERIT as a meaningful measure of resident reflection on QI opportunities. Our findings suggest that dimensions of resident reflection on QI opportunities may include personal, system, and Problem of Merit factors. Additionally, residents may be more effective at reflecting on 'problems of merit' than on personal and system factors. [source]
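
The inter-rater statistic reported above can be made concrete with a short sketch. The Python fragment below computes a single-rater consistency ICC under a two-way (subjects x raters) model from a subjects-by-raters score matrix; the simulated data, sizes, and function name are illustrative assumptions, not the MERIT dataset or the authors' analysis code.

    import numpy as np

    def icc_consistency(ratings):
        # ICC(C,1): consistency of single-rater scores under a two-way
        # (subjects x raters) ANOVA decomposition (McGraw & Wong, 1996).
        n, k = ratings.shape
        grand = ratings.mean()
        ms_rows = k * ((ratings.mean(axis=1) - grand) ** 2).sum() / (n - 1)
        ss_cols = n * ((ratings.mean(axis=0) - grand) ** 2).sum()
        ss_err = ((ratings - grand) ** 2).sum() - (n - 1) * ms_rows - ss_cols
        ms_err = ss_err / ((n - 1) * (k - 1))
        return (ms_rows - ms_err) / (ms_rows + (k - 1) * ms_err)

    # Toy data shaped like the study: 50 reflections rated by 6 faculty.
    rng = np.random.default_rng(0)
    quality = rng.normal(size=(50, 1))                       # true reflection quality
    ratings = quality + rng.normal(scale=0.5, size=(50, 6))  # plus rater noise
    print(round(icc_consistency(ratings), 2))                # high, e.g. ~0.8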


Development of the Ways of Helping Questionnaire: A measure of preferred coping strategies for older African American cancer survivors

RESEARCH IN NURSING & HEALTH, Issue 3 2009
Jill B. Hamilton
Although researchers have identified beneficial coping strategies for cancer patients, existing coping measures do not capture the preferred coping strategies of older African American cancer survivors. A new measure, the Ways of Helping Questionnaire (WHQ), was evaluated with 385 African American cancer survivors. Validity evidence from factor analysis resulted in 10 WHQ subscales (Others There for Me, Physical and Treatment Care Needs, Help from God, Church Family Support, Helping Others, Being Strong for Others, Encouraging My Healthy Behaviors, Others Distract Me, Learning about Cancer, and Distracting Myself). Reliability evidence was generally strong. Evidence regarding hypothesized relationships with measures of well-being and another coping measure was mixed. The WHQ's content coverage makes it especially relevant for older African American cancer survivors. © 2009 Wiley Periodicals, Inc. Res Nurs Health 32: 243–259, 2009 [source]


Use of Knowledge, Skill, and Ability Statements in Developing Licensure and Certification Examinations

EDUCATIONAL MEASUREMENT: ISSUES AND PRACTICE, Issue 1 2005
Ning Wang
The task inventory approach is commonly used in job analysis for establishing content validity evidence supporting the use and interpretation of licensure and certification examinations. Although the results of a task inventory survey provide job task-related information that can be used as a reliable and valid source for test development, it is often the knowledge, skills, and abilities (KSAs) required to perform the tasks, rather than the job tasks themselves, that are tested by licensure and certification exams. This article presents a framework that addresses the important role of KSAs in developing and validating licensure and certification examinations. This includes the use of KSAs in linking job task survey results to the test content outline, transferring job task weights to test specifications, and eventually applying the results to the development of the test items. The impact of using KSAs in the development of test specifications is illustrated with job analyses for two diverse professions. One method for transferring job task weights from the job analysis to test specifications through KSAs is also presented, along with examples. The two examples demonstrated in this article are taken from nursing certification and real estate licensure programs. However, the methodology for using KSAs to link job tasks and test content is also applicable in the development of teacher credentialing examinations. [source]
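
The weight-transfer step described above can be pictured with a small numeric sketch. In this hypothetical Python fragment (all numbers, matrix shapes, and names are invented for illustration; the article's actual procedure may differ), each task's survey-derived weight is distributed across the KSAs linked to it, and the column sums become the KSA weights for the test specifications.

    import numpy as np

    # Hypothetical survey-derived importance weights for four job tasks.
    task_weights = np.array([0.40, 0.25, 0.20, 0.15])

    # Hypothetical expert linkage ratings (0-3): how strongly each KSA
    # (columns) is required to perform each job task (rows).
    linkage = np.array([[3, 1, 0],
                        [2, 2, 1],
                        [0, 3, 2],
                        [1, 0, 3]], dtype=float)

    # Split each task's weight across its KSAs in proportion to linkage
    # strength, then sum down the columns to get blueprint weights.
    share = linkage / linkage.sum(axis=1, keepdims=True)
    ksa_weights = task_weights @ share
    print(ksa_weights.round(3), ksa_weights.sum())  # weights still sum to 1.0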


Development and initial validation of an instrument measuring managerial coaching skill

HUMAN RESOURCE DEVELOPMENT QUARTERLY, Issue 2 2005
Gary N. McLean
This article reports on two studies that used three different samples (N = 644) to construct and validate a multidimensional measure of managerial coaching skill. The four dimensions of coaching skill measured were Open Communication, Team Approach, Value People, and Accept Ambiguity. The two studies assessed the content adequacy, dimensionality, reliability, factor structure, and construct validity of the scale. Preliminary reliability and validity evidence for the scale was established. Consequently, the coaching scale provides future researchers with a valuable tool to measure coaching skill in organizational studies, and it offers human resource development professionals a valid instrument to develop effective managers. [source]


On the Construct Validity of Integrity Tests: Individual and Situational Factors as Predictors of Test Performance

INTERNATIONAL JOURNAL OF SELECTION AND ASSESSMENT, Issue 3 2001
Michael D. Mumford
Although integrity tests are widely applied in screening job applicants, there is a need for research examining the construct validity of these tests. In the present study, a theoretical model examining the causes of destructive behavior in organizational settings was used to develop background data measures of individual and situational variables that might be related to integrity test scores. Subsequently, 692 undergraduates were asked to complete these background data scales along with (a) two overt integrity tests – the Reid Report and the Personnel Selection Inventory – and (b) two personality-based measures – the delinquency and socialization scales of the California Psychological Inventory. When scores on these measures were correlated with and regressed on the background data scales, it was found that relevant individual variables, such as narcissism and power motives, and relevant situational variables, such as alienation and exposure to negative peer groups, were related to scores on both types of integrity tests. However, a stronger pattern of validity evidence was obtained for the personality-based measures and, in all cases, situational variables were found to be better predictors than individual variables. The implications of these findings for the validity of inferences drawn from overt and personality-based integrity tests are discussed. [source]
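
The correlate-and-regress step can be sketched in a few lines. Below, a hypothetical Python fragment (simulated data; variable names and effect sizes invented, not the study's) regresses an integrity-type score on one "individual" and one "situational" background variable and compares standardized weights, mirroring the kind of comparison the study reports.

    import numpy as np

    rng = np.random.default_rng(3)
    n = 692                                  # sample size as in the study
    narcissism = rng.normal(size=n)          # individual-differences variable
    neg_peers = rng.normal(size=n)           # situational variable
    score = 0.2 * narcissism + 0.4 * neg_peers + rng.normal(size=n)

    # Standardize, then fit OLS: betas are comparable across predictors.
    X = np.column_stack([narcissism, neg_peers])
    X = (X - X.mean(axis=0)) / X.std(axis=0)
    y = (score - score.mean()) / score.std()
    beta, *_ = np.linalg.lstsq(np.column_stack([np.ones(n), X]), y, rcond=None)
    print(beta[1:].round(2))  # situational beta expected larger, ~[0.18, 0.37]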


Integrity Tests and Other Criterion-Focused Occupational Personality Scales (COPS) Used in Personnel Selection

INTERNATIONAL JOURNAL OF SELECTION AND ASSESSMENT, Issue 1-2 2001
Deniz S. Ones
This article focuses on personality measures constructed to predict individual differences in particular work behaviors of interest (e.g., violence at work, employee theft, customer service). These scales can generically be referred to as criterion-focused occupational personality scales (COPS). Examples include integrity tests (which aim to predict dishonest behaviors at work), violence scales (which aim to predict violent behaviors at work), drug and alcohol avoidance scales (which aim to predict substance abuse at work), stress tolerance scales (which aim to predict handling work pressures well) and customer service scales (which aim to predict serving customers well). We first review the criterion-related validity, construct validity and incremental validity evidence for integrity tests, violence scales, stress tolerance scales, and customer service scales. Specifically, validities for counterproductive work behaviors and overall job performance are summarized, as are relations with the Big Five personality dimensions (conscientiousness, emotional stability, openness to experience, agreeableness and extraversion). Second, we compare the usefulness of COPS with traditional, general-purpose adult personality scales. We also highlight the theoretical and practical implications of these comparisons and suggest a research agenda in this area. [source]


Reliability: on the reproducibility of assessment data

MEDICAL EDUCATION, Issue 9 2004
Steven M Downing
Context: All assessment data, like other scientific experimental data, must be reproducible in order to be meaningfully interpreted. Purpose: The purpose of this paper is to discuss applications of reliability to the most common assessment methods in medical education. Typical methods of estimating reliability are discussed intuitively and non-mathematically. Summary: Reliability refers to the consistency of assessment outcomes. The exact type of consistency of greatest interest depends on the type of assessment, its purpose and the consequential use of the data. Written tests of cognitive achievement look to internal test consistency, using estimation methods derived from the test-retest design. Rater-based assessment data, such as ratings of clinical performance on the wards, require interrater consistency or agreement. Objective structured clinical examinations, simulated patient examinations and other performance-type assessments generally require generalisability theory analysis to account for various sources of measurement error in complex designs and to estimate the consistency of the generalisations to a universe or domain of skills. Conclusions: Reliability is a major source of validity evidence for assessments. Low reliability indicates that large variations in scores can be expected upon retesting. Inconsistent assessment scores are difficult or impossible to interpret meaningfully and thus reduce validity evidence. Reliability coefficients allow the quantification and estimation of the random errors of measurement in assessments, such that overall assessment can be improved. [source]
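
As a worked illustration of the internal-consistency case, the sketch below computes Cronbach's alpha for a simulated written test and converts it into a standard error of measurement, the "expected random variation upon retesting" the conclusions refer to. All data, sizes, and names are hypothetical, not drawn from the paper.

    import numpy as np

    def cronbach_alpha(items):
        # Internal consistency from an examinees-by-items score matrix:
        # alpha = m/(m-1) * (1 - sum of item variances / total-score variance).
        m = items.shape[1]
        return m / (m - 1) * (1 - items.var(axis=0, ddof=1).sum()
                              / items.sum(axis=1).var(ddof=1))

    # Simulated 0/1 responses: 200 examinees, 30 items, one ability factor.
    rng = np.random.default_rng(1)
    ability = rng.normal(size=(200, 1))
    items = (ability + rng.normal(size=(200, 30)) > 0).astype(float)

    alpha = cronbach_alpha(items)
    sem = items.sum(axis=1).std(ddof=1) * np.sqrt(1 - alpha)
    print(f"alpha = {alpha:.2f}, SEM = {sem:.1f} score points")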


Disentangling the Meaning of Multisource Performance Rating Source and Dimension Factors

PERSONNEL PSYCHOLOGY, Issue 4 2009
Brian J. Hoffman
We extend multisource performance rating (MSPR) construct validity research by examining the pattern of relationships between factor analytically derived MSPR rating source and performance dimension factors and externally measured constructs (e.g., assessment center dimensions, personality constructs, and intelligence). The pattern of relationships among MSPR dimensions and external constructs provides modest construct validity evidence for the MSPR dimensions. In addition, MSPR source factors were differentially correlated with externally measured constructs, suggesting that MSPR source effects represent substantively meaningful source specific variance, as opposed to bias. These findings are discussed in the context of managerial skill diagnosis and the efficacy of collecting performance data from multiple sources. [source]


Self-report on the Social Skills Rating System: Analysis of reliability and validity for an elementary sample

PSYCHOLOGY IN THE SCHOOLS, Issue 4 2005
James Clyde DiPerna
The Social Skills Rating System (SSRS; F.M. Gresham & S.N. Elliott, 1990) is a norm-referenced measure of students' social and problem behaviors. Since its release, much of the published reliability and validity evidence for the SSRS has focused primarily on the Teacher Report Form. The purpose of this study was to explore reliability and validity evidence of scores on the SSRS-Student Elementary Form (SSRS-SEF) for children in Grades 3 to 5. Findings provided support for the use of the Total scale as a measure of student social behavior for initial screening purposes; however, evidence for the subscales was not as strong as predicted. Directions for future research regarding reliability and validity of scores from the SSRS-SEF are discussed. © 2005 Wiley Periodicals, Inc. Psychol Schs 42: 345–354, 2005. [source]


Validity of the Home and Community Social Behavior Scales: Comparisons with five behavior-rating scales

PSYCHOLOGY IN THE SCHOOLS, Issue 4 2001
Kenneth W. Merrell
Three separate studies focusing on convergent and discriminant validity evidence for the Home and Community Social Behavior Scales (HCSBS) are presented. The HCSBS is a 65-item social behavior-rating scale for use by parents and caretakers of children and youth ages 5–18. It is a parent-rating version of the School Social Behavior Scales. Within these studies, relationships with five behavior-rating scales were examined: the Social Skills Rating System, the Conners Parent Rating Scale–Revised: Short Form, the Child Behavior Checklist, and the child and adolescent versions of the Behavior Assessment System for Children. HCSBS Scale A, Social Competence, evidenced strong positive correlations with measures of social skills and adaptability, strong negative correlations with measures of externalizing behavior problems, and modest negative correlations with measures of internalizing and atypical behavior problems. HCSBS Scale B, Antisocial Behavior, evidenced strong positive correlations with measures of externalizing behavior problems, modest positive correlations with measures of internalizing and atypical behavior problems, and strong negative correlations with measures of social skills and adaptability. These results support the HCSBS as a measure of social competence and antisocial behavior of children and youth. © 2001 John Wiley & Sons, Inc. [source]
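
The convergent/discriminant pattern reported here amounts to a sign-and-magnitude check on a correlation matrix. A toy Python sketch (all scores simulated; names are stand-ins, not the studies' data) shows the expected shape of the evidence for Scale A:

    import numpy as np

    rng = np.random.default_rng(2)
    social_comp = rng.normal(size=100)  # HCSBS Social Competence stand-in
    social_skills = 0.8 * social_comp + rng.normal(scale=0.6, size=100)
    externalizing = -0.7 * social_comp + rng.normal(scale=0.7, size=100)
    internalizing = -0.2 * social_comp + rng.normal(scale=1.0, size=100)

    for name, x in [("social skills", social_skills),
                    ("externalizing", externalizing),
                    ("internalizing", internalizing)]:
        r = np.corrcoef(social_comp, x)[0, 1]
        print(f"r(Social Competence, {name}) = {r:+.2f}")
    # Expected: strong +, strong -, modest -, matching the pattern above.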