Educational Measurement

Selected Abstracts


Instructional Tools in Educational Measurement and Statistics (ITEMS) for School Personnel: Evaluation of Three Web-Based Training Modules

EDUCATIONAL MEASUREMENT: ISSUES AND PRACTICE, Issue 2 2008
Rebecca Zwick
In the current No Child Left Behind era, K-12 teachers and principals are expected to have a sophisticated understanding of standardized test results, use them to improve instruction, and communicate them to others. The goal of our project, funded by the National Science Foundation, was to develop and evaluate three Web-based instructional modules in educational measurement and statistics to help school personnel acquire the "assessment literacy" required for these roles. Our first module, "What's the Score?" was administered in 2005 to 113 educators who also completed an assessment literacy quiz. Viewing the module had a small but statistically significant positive effect on quiz scores. Our second module, "What Test Scores Do and Don't Tell Us," administered in 2006 to 104 educators, was even more effective, primarily among teacher education students. In evaluating our third module, "What's the Difference?" we were able to recruit only 33 participants. Although those who saw the module before taking the quiz outperformed those who did not, results were not statistically significant. Now that the research phase is complete, all ITEMS instructional materials are freely available on our Website. [source]


Defining and Evaluating Models of Cognition Used in Educational Measurement to Make Inferences About Examinees' Thinking Processes

EDUCATIONAL MEASUREMENT: ISSUES AND PRACTICE, Issue 2 2007
Jacqueline P. Leighton
The purpose of this paper is to define and evaluate the categories of cognitive models underlying at least three types of educational tests. We argue that while all educational tests may be based, explicitly or implicitly, on a cognitive model, the categories of cognitive models underlying tests often vary in their degree of development and in the psychological evidence gathered to support their value. For researchers and practitioners, awareness of different cognitive models may facilitate the evaluation of educational measures for the purpose of generating diagnostic inferences, especially about examinees' thinking processes, including misconceptions, strengths, and abilities. We think a discussion of the types of cognitive models underlying educational measures is useful not only for taxonomic ends, but also for becoming increasingly aware of evidentiary claims in educational assessment and for promoting the explicit identification of cognitive models in test development. We begin our discussion by defining the term cognitive model in educational measurement. Next, we review and evaluate three categories of cognitive models that have been identified for educational testing purposes, using examples from the literature. Finally, we highlight the practical implications of "blending" models for the purpose of improving educational measures. [source]


Standard-Setting Methods as Measurement Processes

EDUCATIONAL MEASUREMENT: ISSUES AND PRACTICE, Issue 1 2010
Paul Nichols
Some writers in the measurement literature have been skeptical of the meaningfulness of achievement standards and described the standard-setting process as blatantly arbitrary. We argue that standard setting is more appropriately conceived of as a measurement process similar to student assessment. The construct being measured is the panelists' representation of student performance at the threshold of an achievement level. In the first section of this paper, we argue that standard setting is an example of stimulus-centered measurement. In the second section, we elaborate on this idea by comparing some popular standard-setting methods to the stimulus-centered scaling methods known as psychophysical scaling. In the third section, we use the lens of standard setting as a measurement process to take a fresh look at the two criticisms of standard setting: the role of judgment and the variability of results. In the fourth section, we offer a vision of standard-setting research and practice as grounded in the theory and practice of educational measurement. [source]


Avoiding Misconception, Misuse, and Missed Opportunities: The Collection of Verbal Reports in Educational Achievement Testing

EDUCATIONAL MEASUREMENT: ISSUES AND PRACTICE, Issue 4 2004
Jacqueline P. Leighton
The collection of verbal reports is one way in which cognitive and developmental psychologists gather data to formulate and corroborate models of problem solving. The current use of verbal reports to design and validate educational assessments reflects the growing trend to fuse cognitive psychological research and educational measurement. However, doubts about the trustworthiness or accuracy of verbal reports may suggest a potential reversal of this trend. Misconceptions about the trustworthiness of verbal reports could signal misuse of verbal reports and, consequently, waning interest and missed opportunities in the description of cognitive models of test performance. In this article, misconceptions of verbal reports are addressed by (a) discussing the value of cognitive models for educational achievement testing; (b) addressing pertinent issues in the collection of verbal reports from students; and (c) concluding with avenues for a more productive union between cognitive psychological research and educational measurement. [source]


Modeling Passing Rates on a Computer-Based Medical Licensing Examination: An Application of Survival Data Analysis

EDUCATIONAL MEASUREMENT: ISSUES AND PRACTICE, Issue 3 2004
André F. de Champlain
The purpose of this article was to model United States Medical Licensing Examination (USMLE) Step 2 passing rates using the Cox Proportional Hazards Model, best known for its application in analyzing clinical trial data. The number of months it took to pass the computer-based Step 2 examination was treated as the dependent variable in the model. Covariates in the model were: (a) medical school location (U.S. and Canadian or other), (b) primary language (English or other), and (c) gender. Preliminary findings indicate that examinees were nearly 2.7 times more likely to experience the event (pass Step 2) if they were U.S. or Canadian trained. Examinees with English as their primary language were 2.1 times more likely to pass Step 2, but gender had little impact. These findings are discussed more fully in light of past research and broader potential applications of survival analysis in educational measurement. [source]
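The effects reported above ("2.7 times more likely," "2.1 times more likely") are hazard ratios, which a Cox model expresses as exp(beta) for each covariate coefficient. The sketch below shows that relationship using hypothetical coefficients back-calculated to match the abstract's reported ratios; these are not the article's fitted estimates.

```python
import math

# In a Cox proportional hazards model, a covariate's effect is reported as a
# hazard ratio: exp(beta), the multiplicative change in the passing hazard.
# The coefficients below are hypothetical values chosen to reproduce the
# hazard ratios reported in the abstract; they are not the fitted estimates.
coefficients = {
    "us_canadian_school": math.log(2.7),  # HR ~ 2.7 for U.S./Canadian training
    "english_primary":    math.log(2.1),  # HR ~ 2.1 for English as primary language
    "gender":             math.log(1.0),  # HR ~ 1.0: little impact
}

for name, beta in coefficients.items():
    hr = math.exp(beta)
    print(f"{name}: beta={beta:+.3f}, hazard ratio={hr:.2f}")

# Under the proportional hazards assumption, ratios multiply: an examinee
# with both U.S./Canadian training and English as a primary language has a
# combined hazard ratio of 2.7 * 2.1.
combined = math.exp(coefficients["us_canadian_school"] + coefficients["english_primary"])
print(f"combined hazard ratio: {combined:.2f}")
```

The multiplicative structure is what makes survival analysis attractive for time-to-pass data: covariate effects scale a common baseline hazard rather than shifting a mean score.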


Statistical Process Control Charts for Measuring and Monitoring Temporal Consistency of Ratings

JOURNAL OF EDUCATIONAL MEASUREMENT, Issue 1 2010
M. Hafidz Omar
Methods of statistical process control were briefly investigated in the field of educational measurement as early as 1999. However, only the use of a cumulative sum chart was explored. In this article other methods of statistical quality control are introduced and explored. In particular, methods in the form of Shewhart mean and standard deviation charts are introduced as techniques for ensuring quality in a measurement process for rating performance items in operational assessments. Several strengths and weaknesses of the procedures are explored with illustrative real and simulated rating data. Further research directions are also suggested. [source]
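A Shewhart mean chart of the kind the abstract describes can be sketched in a few lines: estimate the center and spread of session-level mean ratings from an in-control baseline period, then flag sessions whose means fall outside 3-sigma control limits. All numbers below are hypothetical, chosen to mimic rater drift; they are not the article's data.

```python
import statistics

# Illustrative mean ratings for 16 scoring sessions on a 0-6 rubric
# (hypothetical numbers, not from the article). The last two sessions
# drift upward, mimicking increasing rater leniency.
session_means = [3.1, 2.9, 3.0, 3.2, 2.8, 3.0, 3.1, 2.9,
                 3.0, 3.2, 2.9, 3.1, 3.0, 2.8, 4.1, 4.3]

# Shewhart mean chart: estimate the process center and spread from the
# first 12 sessions (assumed in control), then set 3-sigma control limits
# from the standard deviation of those baseline session means.
baseline = session_means[:12]
center = statistics.mean(baseline)
sigma = statistics.stdev(baseline)
ucl = center + 3 * sigma  # upper control limit
lcl = center - 3 * sigma  # lower control limit

out_of_control = [t for t, m in enumerate(session_means)
                  if not (lcl <= m <= ucl)]
print(f"center={center:.2f}, limits=({lcl:.2f}, {ucl:.2f})")
print(f"out-of-control sessions: {out_of_control}")
```

Here the two drifted sessions land above the upper control limit, which is exactly the kind of temporal inconsistency in ratings that the charted monitoring process is meant to surface; a companion standard deviation chart would track within-session rating spread the same way.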