Test Reliability (test + reliability)


Selected Abstracts


Wada Test Reliability (Response to Haber et al.)

EPILEPSIA, Issue 9 2007
Tobias Loddenkemper MD
No abstract is available for this article.


Competence in the musculoskeletal system: assessing the progression of knowledge through an undergraduate medical course

MEDICAL EDUCATION, Issue 12 2004
Subhashis Basu
Background: Professional bodies have expressed concerns that medical students lack appropriate knowledge in musculoskeletal medicine despite its high prevalence of use within the community. Changes in curriculum and teaching strategies may be contributing factors, but there is little evidence to evaluate the degree to which these concerns are justified. Objectives: To design and evaluate an assessment procedure that tests the progress of medical students in achieving a core level of knowledge in musculoskeletal medicine during the course. Participants and Setting: A stratified sample of 136 volunteer students from all 5 years of the medical course at Sheffield University. Methods: The progress test concept was adapted to provide a cross-sectional view of student knowledge gain during each year of the course. A test was devised to assess competence at the standard required of a newly qualified doctor in understanding the basic and clinical sciences relevant to musculoskeletal medicine. The test was blueprinted against internal and external guidelines and comprised 40 multiple-choice and extended matching questions administered by computer. Six musculoskeletal practitioners set the standard using a modified Angoff procedure. Results: Test reliability was 0.6 (Cronbach's α). Mean scores increased from 41% in Year 1 to 84% by the final year. The data suggest that, from a baseline score in Year 1, there is a disparate experience of learning in Year 2 that evens out in Year 3, with knowledge progression becoming more consistent thereafter. All final-year participants scored above the standard set by the Angoff procedure. Conclusions: This short computer-based test was a feasible method of estimating student knowledge acquisition in musculoskeletal medicine across the undergraduate curriculum. Tested students appear to have acquired a satisfactory knowledge base by the end of the course. Knowledge gain seemed relatively independent of specialty-specific clinical training, so proposals from specialty bodies to include long periods of disciplinary teaching may be unnecessary.
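The reliability coefficient reported here is Cronbach's α. As a rough illustration of how such a coefficient is computed (this is not the authors' analysis; the 136 × 40 response matrix below is simulated), α can be obtained from an examinees-by-items score matrix:

```python
import numpy as np

def cronbach_alpha(scores: np.ndarray) -> float:
    """Cronbach's alpha for an (n_examinees, n_items) matrix of item scores."""
    n_items = scores.shape[1]
    item_variances = scores.var(axis=0, ddof=1)       # variance of each item
    total_variance = scores.sum(axis=1).var(ddof=1)   # variance of total scores
    return (n_items / (n_items - 1)) * (1 - item_variances.sum() / total_variance)

# Hypothetical data: 136 examinees answering 40 dichotomously scored items.
rng = np.random.default_rng(0)
ability = rng.normal(size=(136, 1))
responses = (rng.normal(size=(136, 40)) + ability > 0).astype(float)
print(round(cronbach_alpha(responses), 2))
```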


Measurement error: implications for diagnosis and discrepancy models of developmental dyslexia

DYSLEXIA, Issue 3 2005
Sue M. Cotton
Abstract: The diagnosis of developmental dyslexia (DD) relies on a discrepancy between intellectual functioning and reading achievement. Discrepancy-based formulae have frequently been employed to establish the significance of the difference between 'intelligence' and 'actual' reading achievement. These formulae, however, often fail to take into consideration test reliability and the error associated with a single test score. This paper illustrates the potential effects that test reliability and measurement error can have on the diagnosis of dyslexia, with particular reference to discrepancy models. The roles of reliability and the standard error of measurement (SEM) in classical test theory are briefly reviewed, followed by illustrations of how SEM and test reliability can aid the interpretation of a simple discrepancy-based formula for DD. It is proposed that a lack of consideration of test theory in the use of discrepancy-based models of DD can lead to misdiagnosis (both false positives and false negatives). Further, misdiagnosis in research samples affects the reproducibility and generalizability of findings, which may in turn explain current inconsistencies in research on the perceptual, sensory, and motor correlates of dyslexia. Copyright © 2005 John Wiley & Sons, Ltd.
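The SEM at the heart of this argument is simply SD × √(1 − reliability), and its effect on a discrepancy score can be shown numerically. The sketch below uses invented reliabilities and a hypothetical IQ–reading discrepancy on a standard-score scale (mean 100, SD 15); it illustrates the general point, not the paper's worked example.

```python
import math

def sem(sd: float, reliability: float) -> float:
    """Standard error of measurement: SD * sqrt(1 - reliability)."""
    return sd * math.sqrt(1.0 - reliability)

sd = 15.0                                   # standard-score metric
sem_iq, sem_read = sem(sd, 0.90), sem(sd, 0.80)   # illustrative reliabilities

# SEM of a difference score, assuming independent errors on the two tests.
sem_diff = math.sqrt(sem_iq**2 + sem_read**2)

observed_discrepancy = 18.0                 # hypothetical IQ minus reading score
ci_95 = 1.96 * sem_diff
print(f"discrepancy = {observed_discrepancy} ± {ci_95:.1f} (95% band)")
# A discrepancy that exceeds a fixed cut-off may still lie within its own error
# band, which is one route to the false positives and false negatives discussed.
```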


A Bayesian predictive analysis of test scores

JAPANESE PSYCHOLOGICAL RESEARCH, Issue 1 2001
Hidetoki Ishii
In classical test theory, a highly reliable test always yields a precise measurement. When it comes to predicting test scores, however, this is not necessarily so. Using a Bayesian statistical approach, we predicted the distributions of test scores for a new subject, a new test, and a new subject taking a new test. Under some reasonable conditions, the means, variances, and covariances of the predicted scores were obtained and investigated. We found that high test reliability did not necessarily lead to small variances or covariances. For a new subject, higher test reliability led to larger predicted variances and covariances, because high test reliability enables a more accurate prediction of test score variances. In this study, for a new subject taking a new test, higher test reliability led to a large variance when the sample size was smaller than half the number of tests. Classical test theory is reanalysed from the viewpoint of prediction, and some suggestions are made.
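One facet of this result, that predictive covariances for a new subject grow with reliability, can be conveyed with a toy Monte Carlo sketch. The normal hierarchical model and the variance split below are assumptions made purely for illustration; they do not reproduce the paper's Bayesian derivation or its sample-size conditions.

```python
import numpy as np

rng = np.random.default_rng(1)

def predictive_draws(reliability: float, n_draws: int = 20000) -> np.ndarray:
    """Monte Carlo predictive draws for a new subject taking two parallel tests.

    Toy model: observed score = true score + error, with total variance fixed
    at 1, so reliability is the share of true-score variance.
    """
    sigma_true = np.sqrt(reliability)        # between-subject (true-score) SD
    sigma_err = np.sqrt(1.0 - reliability)   # measurement-error SD
    theta = rng.normal(0.0, sigma_true, n_draws)            # new subject's true score
    score_a = theta + rng.normal(0.0, sigma_err, n_draws)   # predicted score, test A
    score_b = theta + rng.normal(0.0, sigma_err, n_draws)   # predicted score, test B
    return np.cov(score_a, score_b)

for rel in (0.5, 0.9):
    cov = predictive_draws(rel)
    print(f"reliability={rel}: predictive cov(A, B) ≈ {cov[0, 1]:.2f}")
# The predictive covariance between the two tests tracks the true-score variance,
# so it grows with reliability even though each single-score variance stays near 1.
```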


Modeling Randomness in Judging Rating Scales with a Random-Effects Rating Scale Model

JOURNAL OF EDUCATIONAL MEASUREMENT, Issue 4 2006
Wen-Chung Wang
This study presents the random-effects rating scale model (RE-RSM), which accounts for randomness in the thresholds over persons by treating them as random effects and adding a random variable for each threshold in the rating scale model (RSM; Andrich, 1978). The RE-RSM turns out to be a special case of the multidimensional random coefficients multinomial logit model (MRCMLM; Adams, Wilson, & Wang, 1997), so the estimation procedures for the MRCMLM can be applied directly. Simulation results indicated that when the data were generated from the RSM, fitting either the RSM or the RE-RSM made little difference: both resulted in accurate parameter recovery. When the data were generated from the RE-RSM, fitting the RE-RSM yielded unbiased estimates, whereas fitting the RSM yielded biased estimates, large fit statistics for the thresholds, and inflated test reliability. An empirical example of 10 items with four-point rating scales is presented in which four models are compared: the RSM, the RE-RSM, the partial credit model (Masters, 1982), and the constrained random-effects partial credit model. In this real data set, the need for a random-effects formulation becomes clear.
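For readers unfamiliar with the RSM, the sketch below shows its category probabilities for a single item, together with a purely illustrative person-level perturbation of the thresholds to convey the random-effects idea. The parameter values are invented, and this is not the MRCMLM estimation machinery used in the study.

```python
import numpy as np

def rsm_probs(theta: float, delta: float, taus: np.ndarray) -> np.ndarray:
    """Category probabilities for one item under the rating scale model.

    theta: person ability; delta: item location; taus: m threshold parameters
    for an (m + 1)-category scale. Returns probabilities for categories 0..m.
    """
    steps = theta - delta - taus                      # step terms (theta - delta - tau_j)
    log_num = np.concatenate(([0.0], np.cumsum(steps)))
    num = np.exp(log_num - log_num.max())             # shift for numerical stability
    return num / num.sum()

rng = np.random.default_rng(2)
taus = np.array([-1.0, 0.0, 1.0])                     # three thresholds -> four-point scale

# Ordinary RSM: the same thresholds apply to every person.
print(rsm_probs(theta=0.5, delta=0.0, taus=taus))

# RE-RSM-style idea (illustrative only): each person gets randomly perturbed thresholds.
person_taus = taus + rng.normal(0.0, 0.3, size=taus.shape)
print(rsm_probs(theta=0.5, delta=0.0, taus=person_taus))
```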


Cervicocephalic kinaesthesia: reliability of a new test approach

PHYSIOTHERAPY RESEARCH INTERNATIONAL, Issue 4 2001
Eythor Kristjansson, Faculty of Medicine
Abstract: Background and Purpose: Relocating the natural head posture (NHP) or predetermined points in range are clinical tests of impaired neck proprioception, but memory might influence these tests. Three new tests, reasoned to be more challenging for the proprioceptive system, were developed. The objectives were to assess the reliability of all tests and whether the three new tests were more challenging for the proprioceptive system. Method: A test–retest design was used to assess the reproducibility and errors of all five tests. Twenty asymptomatic volunteers were assessed a week apart using an electromagnetic movement sensor system, the 3-Space Fastrak. A measure of error magnitude was used to detect kinaesthetic sensibility. Comparisons of the means and their corresponding dispersion were analysed descriptively. Between-day intraclass correlation coefficients (ICCs) were calculated, and plots of the mean differences between days 1 and 2 were produced to estimate test reliability. Multivariate analysis of variance (MANOVA) and least significant difference (LSD) pairwise comparisons were performed to compare test accuracy between different target positions. Results: ICCs were between 0.35 and 0.9, but plotting the data modified the interpretation for some tests. Relocating the NHP was easier when the trunk was in a neutral position than when pre-rotated (error 2.46° (±0.2°) versus 5.95° (±0.7°)). Relocating a 30° rotation position (error 5.8° (±0.6°)) and repeatedly moving through a target (error 4.82° (±0.7°)) were also difficult. Conclusions: The new tests were more challenging than relocating the NHP, but the reliability of the tests relocating uncommon positions was questionable. Copyright © 2001 Whurr Publishers Ltd.
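A between-day ICC of the kind reported here can be computed with a two-way random-effects, absolute-agreement formula, ICC(2,1); this is one reasonable choice, as the abstract does not state which ICC form was used. The relocation-error data below are invented, not the study's measurements.

```python
import numpy as np

def icc_2_1(data: np.ndarray) -> float:
    """Two-way random-effects, absolute-agreement ICC(2,1).

    data: (n_subjects, k_sessions) matrix of measurements, e.g. day 1 vs day 2.
    """
    n, k = data.shape
    grand = data.mean()
    ms_rows = k * ((data.mean(axis=1) - grand) ** 2).sum() / (n - 1)   # subjects
    ms_cols = n * ((data.mean(axis=0) - grand) ** 2).sum() / (k - 1)   # sessions
    resid = data - data.mean(axis=1, keepdims=True) - data.mean(axis=0, keepdims=True) + grand
    ms_err = (resid ** 2).sum() / ((n - 1) * (k - 1))                  # residual mean square
    return (ms_rows - ms_err) / (ms_rows + (k - 1) * ms_err + k * (ms_cols - ms_err) / n)

# Hypothetical relocation errors (degrees) for 20 subjects measured on two days.
rng = np.random.default_rng(3)
true_error = rng.normal(5.0, 1.5, size=(20, 1))
days = true_error + rng.normal(0.0, 1.0, size=(20, 2))
print(round(icc_2_1(days), 2))
```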