Test Development


Selected Abstracts


Testing Students with Special Needs: A Model for Understanding the Interaction Between Assessment and Student Characteristics in a Universally Designed Environment

EDUCATIONAL MEASUREMENT: ISSUES AND PRACTICE, Issue 3 2008
Leanne R. Ketterlin-Geller
This article presents a model of assessment development integrating student characteristics with the conceptualization, design, and implementation of standardized achievement tests. The model extends the assessment triangle proposed by the National Research Council (Pellegrino, Chudowsky, & Glaser, 2001) to consider the needs of students with disabilities and English learners on two dimensions: cognitive interaction and observation interaction. Specific steps in the test development cycle for including students with special needs are proposed following the guidelines provided by Downing (2006). Because this model of test development considers the range of student needs before test development commences, student characteristics are supported by applying the principles of universal design and appropriately aligning accommodations to address student needs. Specific guidelines for test development are presented.


Defining and Evaluating Models of Cognition Used in Educational Measurement to Make Inferences About Examinees' Thinking Processes

EDUCATIONAL MEASUREMENT: ISSUES AND PRACTICE, Issue 2 2007
Jacqueline P. Leighton
The purpose of this paper is to define and evaluate the categories of cognitive models underlying at least three types of educational tests. We argue that while all educational tests may be based, explicitly or implicitly, on a cognitive model, the categories of cognitive models underlying tests often vary in their degree of development and in the psychological evidence gathered to support their value. For researchers and practitioners, awareness of different cognitive models may facilitate the evaluation of educational measures for the purpose of generating diagnostic inferences, especially about examinees' thinking processes, including misconceptions, strengths, and/or abilities. We think a discussion of the types of cognitive models underlying educational measures is useful not only for taxonomic ends, but also for becoming increasingly aware of evidentiary claims in educational assessment and for promoting the explicit identification of cognitive models in test development. We begin our discussion by defining the term cognitive model in educational measurement. Next, we review and evaluate three categories of cognitive models that have been identified for educational testing purposes, using examples from the literature. Finally, we highlight the practical implications of "blending" models for the purpose of improving educational measures.


Use of Knowledge, Skill, and Ability Statements in Developing Licensure and Certification Examinations

EDUCATIONAL MEASUREMENT: ISSUES AND PRACTICE, Issue 1 2005
Ning Wang
The task inventory approach is commonly used in job analysis for establishing content validity evidence supporting the use and interpretation of licensure and certification examinations. Although the results of a task inventory survey provide job task-related information that can be used as a reliable and valid source for test development, it is often the knowledge, skills, and abilities (KSAs) required for performing the tasks, rather than the job tasks themselves, which are tested by licensure and certification exams. This article presents a framework that addresses the important role of KSAs in developing and validating licensure and certification examinations. This includes the use of KSAs in linking job task survey results to the test content outline, transferring job task weights to test specifications, and eventually applying the results to the development of the test items. The impact of using KSAs in the development of test specifications is illustrated from job analyses for two diverse professions. One method for transferring job task weights from the job analysis to test specifications through KSAs is also presented, along with examples. The two examples demonstrated in this article are taken from nursing certification and real estate licensure programs. However, the methodology for using KSAs to link job tasks and test content is also applicable in the development of teacher credentialing examinations.
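The weight-transfer step the abstract describes can be illustrated with a small numerical sketch. This is not the article's actual method: the even-split linkage scheme, the example weights, and the function name ksa_weights are all hypothetical, chosen only to show how job-task survey weights might flow through a task-to-KSA linkage matrix into test specification weights.

```python
import numpy as np

def ksa_weights(task_weights, linkage):
    """Distribute job-task survey weights across linked KSAs.

    task_weights: importance weights per job task (summing to 1).
    linkage: binary tasks-x-KSAs matrix; 1 means the task requires the KSA.
    Each task's weight is split evenly among the KSAs it links to,
    then the resulting KSA weights are normalized to sum to 1.
    """
    task_weights = np.asarray(task_weights, dtype=float)
    linkage = np.asarray(linkage, dtype=float)
    per_ksa = linkage / linkage.sum(axis=1, keepdims=True)  # split each task's weight
    raw = task_weights @ per_ksa                            # accumulate onto KSAs
    return raw / raw.sum()

# Two tasks, three KSAs: task 1 (weight 0.6) needs KSAs 1-2,
# task 2 (weight 0.4) needs KSAs 2-3.
weights = ksa_weights([0.6, 0.4], [[1, 1, 0], [0, 1, 1]])
print(weights)  # → [0.3 0.5 0.2]
```

The resulting vector would become the percentage of test items allocated to each KSA in the test specifications.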


Rethinking the OSCE as a Tool for National Competency Evaluation

EUROPEAN JOURNAL OF DENTAL EDUCATION, Issue 2 2004
M. A. Boyd
The relatively recent curriculum change to Problem-Based Learning/Case-Based Education has stimulated the development of new evaluation tools for student assessment. The Objective Structured Clinical Examination (OSCE) has become a popular method for such assessment. The National Dental Examining Board of Canada (NDEB) began using an OSCE format as part of the national certification testing process for licensure of beginning dentists in Canada in 1996. The OSCE has been well received by provincial licensing authorities, dental schools and students. 'Hands on' clinical competency is entrusted to the dental programs and verified through NDEB participation in the Accreditation process. The desire to refine the OSCE has resulted in the development of a new format. Previously, OSCE stations consisted of case-based materials and related multiple-choice questions. The new format has case-based material with an extended-match presentation. Candidates 'select one or more correct answers' from a group of up to 15 options. The blueprint is referenced to the national competencies for beginning practitioners in Canada. This new format will be available to students on the NDEB website for information and study purposes. Question stems and options will remain constant. Case histories and case materials will change each year. This new OSCE will be easier to administer and less expensive in terms of test development. Reliability and validity are enhanced by involving content experts from all faculties in test development, by having the OSCE verified by general practitioners and by making the format available to candidates. The new OSCE will be pilot tested in September 2004. Examples will be provided for information and discussion.


Family health effects: complements or substitutes

HEALTH ECONOMICS, Issue 8 2001
Michael Lee Ganz
Genetic endowments play a fundamental role in the production of health. At birth individuals have different capacities to be healthy, largely due to genetic dispositions. Whether or not individuals realize this health depends on their choice of health behaviours. Previous research has linked negative factors beyond the individual's control, which include genetic endowments, to both poor health and poor health behaviours. The health economics literature proposes that behaviours and genetic (or family health) endowments can be either substitutes or complements in the production of health. The goal of this paper is to investigate the behavioural consequences of changes in knowledge about one's genetic endowment. Using two waves of the National Health and Nutrition Examination Survey I Epidemiologic Followup Study, I find that for smokers, smoking intensity substitutes for newly diagnosed smoking-related family cancers, while smoking intensity is complementary to newly diagnosed non-smoking-related family cancers. I find no evidence for the hypothesized relationships with respect to alcohol consumption among drinkers. These results have implications for the growing field of genetic testing and test development. These results also reinforce current practices of ascertaining family health histories in the context of medical history taking. Copyright © 2001 John Wiley & Sons, Ltd.


Content Validation Is Useful for Many Things, but Validity Isn't One of Them

INDUSTRIAL AND ORGANIZATIONAL PSYCHOLOGY, Issue 4 2009
KEVIN R. MURPHY
Content-oriented validation strategies establish the validity of selection tests as predictors of performance by comparing the content of the tests with the content of the job. These comparisons turn out to have little if any bearing on the predictive validity of selection tests. There is little empirical support for the hypothesis that the match between job content and test content influences validity, and there are often structural factors in selection (e.g., positive correlations among selection tests) that strongly limit the possible influence of test content on validity. Comparisons between test content and job content have important implications for the acceptability of testing, the defensibility of tests in legal proceedings, and the transparency of test development and validation, but these comparisons have little if any bearing on validity.


Development of a test to evaluate residents' knowledge of medical procedures

JOURNAL OF HOSPITAL MEDICINE, Issue 7 2009
Shilpa Grover MD
BACKGROUND AND AIM: Knowledge of core medical procedures is required by the American Board of Internal Medicine (ABIM) for certification. Efforts to improve the training of residents in these procedures have been limited by the absence of a validated tool for the assessment of knowledge. In this study we aimed to develop a standardized test of procedural knowledge in 3 medical procedures associated with potentially serious complications. METHODS: Placement of an arterial line, central venous catheter, and thoracentesis were selected for test development. Learning objectives and multiple-choice questions were constructed for each topic. Content evidence was evaluated by critical care subspecialists. Item test characteristics were evaluated by administering the test to students, residents and specialty clinicians. Reliability of the 32-item instrument was established through its administration to 192 medical residents in 4 hospitals. RESULTS: Reliability of the instrument as measured by Cronbach's α was 0.79 and its test-retest reliability was 0.82. Median score was 53% on a test comprising elements deemed important by critical care subspecialists. Increasing number of procedures attempted, higher self-reported confidence, and increasing seniority were predictors of overall test scores. Procedural confidence correlated significantly with increasing seniority and experience. Residents performed few procedures. CONCLUSIONS: We have successfully developed a standardized instrument to assess residents' cognitive competency for 3 common procedures. Residents' overall knowledge about procedures is poor. Experiential learning is the dominant source for knowledge improvement, but these experiences are increasingly rare. Journal of Hospital Medicine 2009;4:430-432. © 2009 Society of Hospital Medicine.
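Cronbach's α, the internal-consistency index reported in this abstract, has a standard closed form: α = k/(k−1) · (1 − Σσ²ᵢ/σ²ₜ), where k is the number of items, σ²ᵢ the variance of each item, and σ²ₜ the variance of total scores. As an illustration (the score matrix below is invented, not the study's data), it can be computed directly from an examinees-by-items matrix:

```python
import numpy as np

def cronbach_alpha(scores):
    """Cronbach's alpha for an (examinees x items) score matrix."""
    scores = np.asarray(scores, dtype=float)
    k = scores.shape[1]                         # number of items
    item_vars = scores.var(axis=0, ddof=1)      # per-item variances
    total_var = scores.sum(axis=1).var(ddof=1)  # variance of total scores
    return (k / (k - 1)) * (1 - item_vars.sum() / total_var)

# Perfectly parallel items (identical columns) give alpha = 1.0.
print(cronbach_alpha([[1, 1], [2, 2], [3, 3], [4, 4]]))  # → 1.0
```

A value of 0.79, as in the study, would indicate acceptable internal consistency for a knowledge test of this kind.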


A primer on classical test theory and item response theory for assessments in medical education

MEDICAL EDUCATION, Issue 1 2010
André F De Champlain
Context: A test score is a number which purportedly reflects a candidate's proficiency in some clearly defined knowledge or skill domain. A test theory model is necessary to help us better understand the relationship that exists between the observed (or actual) score on an examination and the underlying proficiency in the domain, which is generally unobserved. Common test theory models include classical test theory (CTT) and item response theory (IRT). The widespread use of IRT models over the past several decades attests to their importance in the development and analysis of assessments in medical education. Item response theory models are used for a host of purposes, including item analysis, test form assembly and equating. Although helpful in many circumstances, IRT models make fairly strong assumptions and are mathematically much more complex than CTT models. Consequently, there are instances in which it might be more appropriate to use CTT, especially when common assumptions of IRT cannot be readily met, or in more local settings, such as those that may characterise many medical school examinations. Objectives: The objective of this paper is to provide an overview of both CTT and IRT for the practitioner involved in the development and scoring of medical education assessments. Methods: The tenets of CTT and IRT are initially described. Then, the main uses of both models in test development and psychometric activities are illustrated via several practical examples. Finally, general recommendations pertaining to the use of each model in practice are outlined. Discussion: Classical test theory and IRT are widely used to address measurement-related issues that arise from commonly used assessments in medical education, including multiple-choice examinations, objective structured clinical examinations, ward ratings and workplace evaluations. The present paper provides an introduction to these models and how they can be applied to answer common assessment questions.
Medical Education 2010: 44: 109-117
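For readers unfamiliar with the two frameworks the primer compares, a minimal sketch may help: CTT summarises an item by the proportion of examinees answering it correctly (its difficulty, or p-value), while a two-parameter logistic (2PL) IRT model expresses the probability of a correct response as a logistic function of examinee ability θ, item discrimination a, and item difficulty b. The response matrix and parameter values below are invented for illustration only.

```python
import numpy as np

def ctt_difficulty(responses):
    """CTT item difficulty: proportion of examinees answering each item correctly."""
    return np.asarray(responses, dtype=float).mean(axis=0)

def irt_2pl(theta, a, b):
    """2PL item response function: P(correct | theta) for
    discrimination a and difficulty b."""
    return 1.0 / (1.0 + np.exp(-a * (theta - b)))

# 4 examinees x 3 items of scored (0/1) responses.
responses = np.array([[1, 1, 0],
                      [1, 0, 0],
                      [1, 1, 1],
                      [0, 1, 1]])
print(ctt_difficulty(responses))  # → [0.75 0.75 0.5]

# Under the 2PL model, an examinee whose ability equals the item's
# difficulty (theta = b) has exactly a 50% chance of answering correctly.
print(irt_2pl(theta=0.5, a=1.2, b=0.5))  # → 0.5
```

This contrast captures the paper's central trade-off: the CTT statistic is simple and sample-dependent, while the IRT curve models the response process itself at the cost of stronger assumptions.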


Use of Educational and Psychological Tests Internationally

APPLIED PSYCHOLOGY, Issue 2 2004
Thomas Oakland
International aspects of test development and use are described, in part, to provide a context for other articles in this special issue. The history of test development and use, external and internal conditions that impact test development and use, and test use internationally, together with standards and guidelines for test development and use, are summarised. Regional and international organisations providing leadership in test development and use, as well as leadership efforts by the International Test Commission and others, are discussed.