Interrater Reliability (interrater + reliability)

Distribution by Scientific Domains

Medical Sciences	91%
Psychology	4%
3 Other Domains	5%

Distribution within Medical Sciences

Neurology	23%
Nursing General	15%
Anesthesia & Pain Management	10%
Occupational Therapy	8%
General & Internal Medicine	4%
Psychiatry	2%
10 Other Subdomains	38%

Selected Abstracts

An acute care skills evaluation for graduating medical students: a pilot study using clinical simulation

MEDICAL EDUCATION, Issue 9 2002
David Murray
Purpose: This investigation aimed to explore the measurement properties of scores from a patient simulator exercise. Methods: Analytic and holistic scores were obtained for groups of medical students and residents. Item analysis techniques were used to explore the nature of specific examinee actions. Interrater reliability was calculated. Scores were contrasted for third year medical students, fourth year medical students and emergency department residents. Results: Interrater reliabilities for analytic and holistic scores were 0.92 and 0.81, respectively. Based on item analysis, proper timing and sequencing of actions discriminated between low- and high-ability examinees. In general, examinees with more advanced training obtained higher scores on the simulation exercise. Conclusion: Reliable and valid measures of clinical performance can be obtained from a trauma simulation provided that care is taken in the development and scoring of the scenario. [source]

Assessing chimpanzee personality and subjective well-being in Japan

AMERICAN JOURNAL OF PRIMATOLOGY, Issue 4 2009
Alexander Weiss
Abstract We tested whether the cultural background of raters influenced ratings of chimpanzee personality. Our study involved comparing personality and subjective well-being ratings of 146 chimpanzees in Japan that were housed in zoos, research institutes, and a retirement sanctuary to ratings of chimpanzees in US and Australian zoos. Personality ratings were made on a translated and expanded version of a questionnaire used to rate chimpanzees in the US and Australia. Subjective well-being ratings were made on a translated version of a questionnaire used to rate chimpanzees in the US and Australia. The mean interrater reliabilities of the 43 original adjectives did not markedly differ between the present sample and the original sample of 100 zoo chimpanzees in the US. Interrater reliabilities of these samples were highly correlated, suggesting that their rank order was preserved. Comparison of the factor structures for the Japanese sample and for the original sample of chimpanzees in US zoos indicated that the overall structure was replicated and that the Dominance, Extraversion, Conscientiousness, and Agreeableness domains clearly generalized. Consistent with earlier studies, older chimpanzees had higher Dominance and lower Extraversion and Openness scores. Correlations between the six domain scores and subjective well-being were comparable to those for chimpanzees housed in the US and Australia. These findings suggest that chimpanzee personality ratings are not affected by the culture of the raters. Am. J. Primatol. 71:283–292, 2009. © 2009 Wiley-Liss, Inc. [source]

Effects of back care education in elementary schoolchildren

ACTA PAEDIATRICA, Issue 8 2000
G Cardon
The purpose of this study was to investigate the effects of a back care education programme, consisting of six sessions of 1 h each, in fourth- and fifth-grade elementary schoolchildren. Testing consisted of a practical performance and a back care knowledge test. Forty-two subjects and 36 controls performed a pre-test and were tested within 1 wk after the programme. To monitor effects and follow-up effects on a larger sample, 82 different pupils were tested within 1 wk after the programme and 116 other children 3 mo after. Both larger samples were compared with one group of 129 controls. Interrater reliability for the test items of the practical assessment was high; intraclass correlation coefficients varied from 0.785 to 0.980. In the pre/post design study, interaction between time and condition was significant for the sum score of the practical assessment and for the knowledge test (p < 0.001), with higher scores for the intervention group (15% improvement for the knowledge test score, 31.6% for the practical sum score). Significantly higher sum scores for the knowledge test and for all practical assessment items were found in the intervention groups, tested within 1 wk and 3 mo after the programme, in comparison with the control group (p <0.001). Conclusion: The effectiveness of a primary educational prevention programme on back care principles was demonstrated in this study. Effectiveness, long-term outcomes and behavioural changes need further evaluation to optimize back care prevention programmes for elementary schoolchildren. [source]

The Gross Motor Function Classification System for Cerebral Palsy: a study of reliability and stability over time

DEVELOPMENTAL MEDICINE & CHILD NEUROLOGY, Issue 5 2000
Ellen Wood MD MSc FRCP(C) Assistant Professor
Children with cerebral palsy (CP) experience a change in motor function with age and development. It is important to consider this expected change in offering a prognosis, or in assessing differences in motor function after an intervention. The Gross Motor Function Classification System for CP (GMFCS) has been developed for these purposes. This study was based on a retrospective chart review of 85 children with CP followed from ,2 to ,12 years of age. The GMFCS was applied to clinical notes by two blinded raters four times throughout the study. Interrater reliability was high (G=0.93). Test-retest reliability was high (G=0.79). The positive predictive value of the GMFCS at 1 to 2 years of age to predict walking by age 12 years was 0.74. The negative predictive value was 0.90. The GMFCS can validly predict motor function for children with CP. The results are discussed in terms of their implications for clinical practice and future research. [source]
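The positive and negative predictive values quoted above are simple ratios from a 2x2 table of predicted versus observed outcomes. A minimal sketch in Python follows. The function name `predictive_values` and the cell counts are hypothetical, chosen only so the output resembles the reported 0.74 and 0.90. They are not the study's data.

```python
def predictive_values(tp, fp, fn, tn):
    """Return (PPV, NPV) from the four cells of a 2x2 table.

    tp: predicted to walk and did walk      fp: predicted to walk, did not
    fn: predicted not to walk, but did      tn: predicted not to walk, did not
    """
    ppv = tp / (tp + fp)  # share of positive predictions that were correct
    npv = tn / (tn + fn)  # share of negative predictions that were correct
    return ppv, npv


# Hypothetical counts for illustration only (not taken from the abstract).
tp, fp, fn, tn = 37, 13, 4, 36
ppv, npv = predictive_values(tp, fp, fn, tn)
print(f"PPV = {ppv:.2f}, NPV = {npv:.2f}")  # PPV = 0.74, NPV = 0.90
```

Both values depend on how common the outcome is in the sample, so they would not necessarily carry over to a population with a different proportion of walkers.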

Specific Epileptic Syndromes Are Rare Even in Tertiary Epilepsy Centers: A Patient-oriented Approach to Epilepsy Classification

EPILEPSIA, Issue 3 2004
Christoph Kellinghaus
Summary: Purpose: To assess the practicability and reliability of a five-dimensional patient-oriented epilepsy classification and to compare it with the International League Against Epilepsy (ILAE) classification of epilepsy and epileptic syndromes. The dimensions consist of the epileptogenic zone, semiologic seizure type(s), etiology, related medical conditions, and seizure frequency. Methods: The 185 epilepsy patients (94 adults, 91 children, aged 18 years or younger) were randomly selected from the database of a tertiary epilepsy center and the general neurological department of a metropolitan hospital (28 adults). The charts were reviewed independently by two investigators and classified according to both the ILAE and the patient-oriented classification. Interrater reliability was assessed, and a final consensus among all investigators was established. Results: Only four (4%) adults and 19 (21%) children were diagnosed with a specific epilepsy syndrome of the ILAE classification. All other patients were in unspecific categories. The patient-oriented classification revealed that 64 adults and 56 children had focal epilepsy. In an additional 34 adults and 45 children, the epileptogenic zone could be localized to a certain brain region, and in 14 adults and five children, the epileptogenic zone could be lateralized. Fourteen adults and 21 children had generalized epilepsy. In 16 adults and 14 children, it remained unclear whether the epilepsy was focal or generalized. Generalized simple motor seizures were found in 66 adults and 52 children, representing the most frequent seizure type. Etiology could be determined in 40 adults and 45 children. Hippocampal sclerosis was the most frequent etiology in adults (10%), and cortical dysplasia (9%), in children. Seven adults and 31 children had at least daily seizures. Seventeen adults and 26 children had rare or no seizures at their last documented contact. 
The most frequent related medical conditions were psychiatric disorders and mental retardation. Interrater agreement was high (kappa values of 0.8 to 0.9) for both the patient-oriented and the ILAE classification. Conclusions: Specific epilepsy syndromes included in the current ILAE classification are rare even in a tertiary epilepsy center. Most patients are included in unspecific categories that provide only incomplete information. In contrast, all of the patients could be classified by the five-dimensional patient-oriented classification, providing all essential information for the management of the patients with a high degree of interrater reliability. [source]

A neurological examination score for the assessment of spinocerebellar ataxia 3 (SCA3)

EUROPEAN JOURNAL OF NEUROLOGY, Issue 4 2008
C. Kieling
Spinocerebellar ataxias (SCAs) are characterized by a heterogeneous set of clinical manifestations. Our aims were to assess the neurological features of SCA3, and to describe and test the feasibility, reliability, and validity of a comprehensive Neurological Examination Score for Spinocerebellar Ataxia (NESSCA). The NESSCA was administered to molecularly diagnosed SCA3 patients at an outpatient neurogenetics clinic. The scale, based on the standardized neurological examination, consisted of 18 items that yielded a total score ranging from 0 to 40. The score's interrater reliability and internal consistency were investigated, and a principal components analysis and a correlation with external measures were performed. Ninety-nine individuals were evaluated. Interrater reliability ranged from 0.8 to 1 across individual items (P < 0.001); internal consistency, indicated by Cronbach's alpha, was 0.77. NESSCA scores were significantly correlated with measures of disease severity: disease stage (rho = 0.76, P < 0.001), duration (rho = 0.56, P < 0.001), and length of CAG repeat (rho = 0.30, P < 0.05). NESSCA was a reliable measure for the assessment of distinct neurological deficits in SCA3 patients. Global scores correlated with all external variables tested, showing NESSCA to be a comprehensive measure of disease severity that is both clinically useful and scientifically valid. [source]
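Internal consistency as measured by Cronbach's alpha compares the sum of the item variances with the variance of the total score. A minimal sketch in Python, assuming complete data and sample variances. The function name `cronbach_alpha` and the three-item scores are hypothetical and unrelated to the NESSCA data.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a list of rows (one row of k item scores per subject)."""
    k = len(scores[0])
    items = list(zip(*scores))                     # transpose: one tuple per item
    item_var_sum = sum(variance(col) for col in items)
    total_var = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - item_var_sum / total_var)


# Hypothetical scores: 5 subjects rated on 3 items.
demo_scores = [
    [2, 3, 3],
    [1, 1, 2],
    [3, 3, 3],
    [0, 1, 1],
    [2, 2, 3],
]
alpha = cronbach_alpha(demo_scores)
print(f"alpha = {alpha:.2f}")
```

Alpha rises with the number of items as well as with their intercorrelation, which is worth keeping in mind when comparing scales of different lengths.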

Evaluation of a German version of the Rivermead Mobility Index (RMI) in acute and chronic stroke patients

EUROPEAN JOURNAL OF NEUROLOGY, Issue 5 2000
M. R. Schindl
The English Rivermead Mobility Index (RMI) has been proposed as a simple, valid and reliable measure in stroke rehabilitation. A German version was established and validated in two centres. In centre A 46 acute (median: 3.0 days after onset) and in centre B 151 chronic (median: 88.0 days after onset) stroke patients participated. Interrater reliability of the German RMI was tested in 12 subjects in the acute stage of stroke and was found to be statistically significant (r = 0.98, P < 0.0001). In centre A, a statistically significant correlation was found between the German RMI and the 10-m walk time at baseline (r = 0.73, P < 0.0001) and after three weeks (r = 0.92, P < 0.0001). In centre B, the German RMI correlated significantly with the motor part of the Functional Independence Measure (motor-FIM) on admission (r = 0.78, P < 0.0001) and after three weeks (r = 0.79, P < 0.0001), respectively. The change of the RMI correlated significantly with the change in 10-m walk time in acute patients (r = 0.87, P < 0.0001) and with the change in motor-FIM in chronic patients (r = 0.54, P < 0.0001). A moderate ceiling-effect was detected in the chronic study population. The German RMI appears to be a reliable, valid and responsive measure for mobility disability in acute and chronic stroke patients. [source]

A Comparison of Data Sources for Motor Vehicle Crash Characteristic Accuracy

ACADEMIC EMERGENCY MEDICINE, Issue 8 2000
Robert J. Grant MD
Abstract. Objective: To determine the accuracy of police reports (PRs), ambulance reports (ARs), and emergency department records (EDRs) in describing motor vehicle crash (MVC) characteristics when compared with an investigation performed by an experienced crash investigator trained in impact biomechanics. Methods: This was a cross-sectional, observational study. Ninety-one patients transported by ambulance to a university emergency department (ED) directly from the scene of an MVC from August 1997 to April 1998 were enrolled. Potential patients were identified from the ED log and consent was obtained to investigate the crash vehicle. Data describing MVC characteristics were abstracted from the PR, AR, and medical record. Variables of interest included restraint use (RU), air bag deployment (AD), and type of impact (TI). Agreements between the variables and the independent crash investigation were compared using kappa. Interrater reliability was determined using kappa by comparing a random sample of 20 abstracted reports for each data source with the originally abstracted data. Results: Agreement using kappa between the crash investigation and each data source was 0.588 (95% CI = 0.508 to 0.667) for the PR, 0.330 (95% CI = 0.252 to 0.407) for the AR, and 0.492 (95% CI = 0.413 to 0.572) for the EDR. Variable agreement was 0.239 (95% CI = 0.164 to 0.314) for RU, 0.350 (95% CI = 0.268 to 0.432) for AD, and 0.631 (95% CI = 0.563 to 0.698) for TI. Interrater reliability was excellent (kappa > 0.8) for all data sources. Conclusions: The strength of the agreement between the independent crash investigation and the data sources that were measured by kappa was fair to moderate, indicating inaccuracies. This presents ramifications for researchers and necessitates consideration of the validity and accuracy of crash characteristics contained in these data sources. [source]
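The kappa statistic used above corrects raw agreement for the agreement expected by chance. A minimal sketch in Python, assuming two raters and unweighted Cohen's kappa (the abstract does not state which variant was computed). The function name `cohen_kappa` and the restraint-use codes are hypothetical and are not data from the study.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa for two equal-length lists of category labels."""
    n = len(rater_a)
    # Observed proportion of items on which the two raters agree.
    p_observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of the raters' marginal proportions per category.
    count_a, count_b = Counter(rater_a), Counter(rater_b)
    categories = count_a.keys() | count_b.keys()
    p_chance = sum(count_a[c] * count_b[c] for c in categories) / n**2
    return (p_observed - p_chance) / (1 - p_chance)


# Hypothetical restraint-use codes from two abstractors for 10 reports.
abstractor_1 = ["yes", "yes", "no", "yes", "no", "no", "yes", "yes", "no", "yes"]
abstractor_2 = ["yes", "no", "no", "yes", "no", "yes", "yes", "yes", "no", "yes"]
print(f"kappa = {cohen_kappa(abstractor_1, abstractor_2):.2f}")  # kappa = 0.58
```

In this made-up example the raters agree on 8 of 10 reports, yet kappa is only about 0.58 because half of that agreement would be expected by chance. The sketch does not handle the degenerate case where chance agreement equals 1.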

Interrater reliability of the Psychiatric Research Interview for Substance and Mental Disorders in an HIV-infected cohort: experience of the National NeuroAIDS Tissue Consortium

INTERNATIONAL JOURNAL OF METHODS IN PSYCHIATRIC RESEARCH, Issue 3 2006
S. Morgello
Abstract The interrater reliability of the Psychiatric Research Interview for Substance and Mental Disorders (PRISM) was assessed in a multicentre study. Four sites of the National NeuroAIDS Tissue Consortium performed blinded reratings of audiotaped PRISM interviews of 63 HIV-infected patients. Diagnostic modules for substance-use disorders and major depression were evaluated. Seventy-six per cent of the patient sample displayed one or more substance-use disorder diagnoses and 54% had major depression. Kappa coefficients for lifetime histories of substance abuse or dependence (cocaine, opiates, alcohol, cannabis, sedative, stimulant, hallucinogen) and major depression ranged from 0.66 to 1.00. Overall the PRISM was reliable in assessing both past and current disorders except for current cannabis disorders when patients had concomitant cannabinoid prescriptions for medical therapy. The reliability of substance-induced depression was poor to fair although there was a low prevalence of this diagnosis in our group. We conclude that the PRISM yields reliable diagnoses in a multicentre study of substance-experienced, HIV-infected individuals. Copyright © 2006 John Wiley & Sons, Ltd. [source]

Evaluation of NOC Measures in Home Care Nursing Practice

INTERNATIONAL JOURNAL OF NURSING TERMINOLOGIES AND CLASSIFICATION, Issue 2003
Gail M. Keenan
PURPOSE To evaluate the reliability, validity, usefulness, and sensitivity of 89 NOC outcomes in two Visiting Nurse Associations in Michigan. METHODS Of a total of 190 NOC outcomes, 89 were assigned for testing. Interrater reliability and criterion validity were assessed a total of 50 times per outcome (on 50 different patients) across the study units. The total number of times the reliability and validity were assessed for each of the 89 measures studied ranged from 5 to 45. Three RN research assistants (RNRAs) oversaw and participated in data collection with the help of 15 clinicians. Convenience sampling was used to identify subjects. A roster of outcomes to be studied was maintained and matched with patient conditions whenever possible until the quota of outcomes assigned had been evaluated. Clinicians and RNRAs independently rated the outcomes and indicators applicable to the patient. NANDA diagnoses, NIC interventions, and medical diagnoses were recorded. FINDINGS A total of 258 patients (mean age 62) enrolled; 60% were women, 23% were from minority groups, and 78% had no college degree. Thirty-six of the 89 NOC measures were designated "clinically useful." The 10 outcomes with the highest interrater reliability were Caregiver Home Care Readiness; Caregiver Stressors; Caregiving Endurance Potential; Infection Status; Mobility Level; Safety Status: Physical Injury; Self-Care: Activities of Daily Living; Self-Care: Bathing; Self-Care: Hygiene; and Wound Healing: Secondary Intention. Criterion measurement and repeated ratings provided evidence to support the validity and sensitivity of the NOC outcomes. Evidence also suggested that NOC label level ratings could be a feasible, reliable, and valid method of evaluating nursing outcomes under actual use. For some measures, adjustments in the scales and anchors are needed to enhance reliability. 
For others, it may be unrealistic to score reliably in one encounter, so scoring should be deferred until the clinician has adequate knowledge of the patient. CONCLUSIONS Continued study and refinement that are coordinated and integrated systematically are strongly recommended. Comprehensive study in an automated system with a controlled format will increase the efficiency of future studies. [source]

Nursing Outcomes for Evaluations of Caregiver Outcomes in a Rural Alzheimer Demonstration Project

INTERNATIONAL JOURNAL OF NURSING TERMINOLOGIES AND CLASSIFICATION, Issue 2003
Janet Specht
PURPOSE To evaluate the effectiveness of the interventions of nurse care managers in the care of family members of people with dementia. METHODS Data were collected as part of a 3-year Administration on Aging-funded Alzheimer Demonstration Project to provide expanded in-home services to rural Iowans affected by Alzheimer disease and related disorders in 8 rural Iowa counties, randomly selected to have a nurse care manager and 4 designated control counties that had traditional case management service. Nurse care managers were trained in the care of people with dementia and their caregivers, the use of role transition theory, and the Progressively Lowered Stress Threshold model of care to provide and coordinate services for enrollees. All referred people with cognitive impairment and their families in the 8 study counties were eligible for inclusion. Three selected NOC outcomes were tested in clinical settings. Interrater reliability for the outcomes was good (87%–95%). The construct validity of Caregiver Stressors Outcome was .74 when correlated with the Caregiver Stress Index. FINDINGS Of the 142 subjects with cognitive impairment enrolled within the first year of the grant, 113 had a caregiver. The outcomes were used to evaluate differences in caregiver outcomes at baseline and at 6-month intervals. The majority of caregivers at follow-up were female and had been providing care for ,5 years. For each of the outcomes the majority of caregivers had improved scores, with only 2–4 caregivers getting scores indicating worsening conditions or remaining the same. CONCLUSIONS Preliminary analysis shows a trend of improved outcomes with the use of a nurse care manager. The NOC caregiver outcomes showed good variability among caregivers at baseline, with caregiver responses distributed throughout the scales. The NOC outcomes also provide guidance for interventions of the nurse care managers. 
Further evaluation of the outcomes is needed, including examining the relationships of placement, health status, and service use of each outcome. The caregiver outcomes offer an effective and efficient means to evaluate services delivered to caregivers of people with dementia. [source]

The Role of Benzodiazepines in the Treatment of Insomnia

JOURNAL OF AMERICAN GERIATRICS SOCIETY, Issue 6 2001
Meta-Analysis of Benzodiazepine Use in the Treatment of Insomnia
PURPOSE: To obtain a precise estimate of the efficacy and common adverse effects of benzodiazepines for the treatment of insomnia compared with those of placebo and other treatments. BACKGROUND: Insomnia, also referred to as disorder of initiating or maintaining sleep, is a common problem and its prevalence among older people is estimated to be 23% to 34% [1]. The total direct cost in the United States for insomnia in 1995 was estimated to be $13.9 billion [2]. The complaint of insomnia in older people is associated with chronic medical conditions; psychiatric problems, mainly depression, chronic pain, and poor perceived general condition [1,3,4]; and use of sleep medications [5]. Thus in most cases, insomnia is due to some other underlying problem and is not just a consequence of aging [6]. Accordingly, the management of insomnia should focus on addressing the primary problem and not just short-term treatment of the insomnia. Benzodiazepines belong to the drug class of choice for the symptomatic treatment of primary insomnia [7]. This abstract will appraise a meta-analysis that compared the effect of benzodiazepines for short-term treatment of primary insomnia with placebo or other treatment. DATA SOURCES: Data sources included articles listed in Medline from 1966 to December 1998 and the Cochrane Controlled Trials Registry. The medical subject heading (MeSH) search terms used were "benzodiazepine" (exploded) or "benzodiazepine tranquillizers" (exploded) or "clonazepam," "drug therapy," "randomized controlled trial" or "random allocation" or "all random," "human," and "English language." In addition, bibliographies of retrieved articles were scanned for additional articles and manufacturers of brand-name benzodiazepines were asked for reports of early trials not published in the literature. 
STUDY SELECTION CRITERIA: Reports of randomized controlled trials of benzodiazepine therapy for primary insomnia were considered for the meta-analysis if they compared a benzodiazepine with a placebo or an alternative active drug. DATA EXTRACTION: Data were abstracted from 45 randomized controlled trials representing 2,672 patients, 47% of whom were women. Fifteen studies included patients age 65 and older and four studies involved exclusively older patients. Twenty-five studies were based in the community and nine involved inpatients. The duration of the studies ranged from 1 day to 6 weeks, with a mean of 12.2 days and median of 7.5 days. The primary outcome measures analyzed were sleep latency and total sleep duration after a sleep study, subjects' estimates of sleep latency and sleep duration, and subjects' report of adverse effects. Interrater reliability was checked through duplicate, independent abstraction of the first 21 articles. Overall agreement was between 95% and 98% (kappa value of 0.90 and 0.95 accordingly) for classification of the studies and validity of therapy, and 76% (kappa value of 0.51) for study of harmful effects. A scale of 0 to 5 was used to rate the individual reports, taking into account the quality of randomization, blinding, follow-up, and control for baseline differences between groups. Tests for homogeneity were applied across the individual studies and, when studies were found to be heterogeneous, subgroup analysis according to a predefined group was performed. MAIN RESULTS: The drugs used in the meta-analysis included triazolam in 16 studies; flurazepam in 14 studies; temazepam in 13 studies; midazolam in five studies; nitrazepam in four studies; and estazolam, lorazepam, and diazepam in two studies each. Alternative drug therapies included zopiclone in 13 studies and diphenhydramine, glutethimide, and promethazine in one study each. Only one article reported on a nonpharmacological treatment (behavioral therapy). 
The mean age of patients was reported in 33 of the 45 studies and ranged between 29 and 82. SLEEP LATENCY: In four studies involving 159 subjects, there was sleep-record latency (time to fall asleep) data for analysis. The pooled difference indicated that the latency to sleep for patients receiving a benzodiazepine was 4.2 minutes (95% CI = −0.7 to 9.2) shorter than for those receiving placebo. Patients' estimates of sleep latency examined in eight studies showed a difference of 14.3 minutes (95% CI = 10.6–18.0) in favor of benzodiazepines over placebo. TOTAL SLEEP DURATION: Analysis of two studies involving 35 patients in which total sleep duration using sleep-record results was compared indicated that patients in the benzodiazepine groups slept for an average of 61.8 minutes (95% CI = 37.4–86.2) longer than those in the placebo groups. Patients' estimates of sleep duration from eight studies (566 points) showed total sleep duration to be 48.4 minutes (95% CI = 39.6–57.1) longer for patients taking benzodiazepines than for those on placebo. ADVERSE EFFECTS: Analysis of eight studies (889 subjects) showed that those in the benzodiazepine groups were more likely than those in the placebo groups to complain of daytime drowsiness (odds ratio (OR) 2.4, 95% confidence interval (CI) = 1.8–3.4). Analysis of four studies (326 subjects) also showed that subjects in the benzodiazepine groups were more likely to complain of dizziness or lightheadedness than those in the placebo groups (OR 2.6, 95% CI = 0.7–10.3). Despite the increased reported side effects in the benzodiazepine groups, drop-out rates were similar in the benzodiazepine and placebo groups. For patient-reported outcomes, no strong correlation was found between benzodiazepine dose and outcome for sleep latency (r = 0.4, 95% CI = −0.3 to 0.9) or for sleep duration (r = 0.2, 95% CI = −0.8 to 0.4). 
COMPARISON WITH OTHER DRUGS AND TREATMENTS: In three trials with 96 subjects, meta-analysis of the results comparing benzodiazepines with zopiclone did not show a significant difference in sleep latency in the benzodiazepine and placebo groups, but the benzodiazepine groups had increased total sleep duration (23.1 min, 95% CI = 5.6–40.6). In four trials with 252 subjects, the side effect profile did not show a statistically significant difference (OR 1.5, CI 0.8–2.9). There was only one study comparing the effect of behavioral therapy with triazolam. The result showed that triazolam was more effective than behavioral therapy in decreasing sleep latency, but its efficacy declined by the second week of treatment. Behavioral therapy remained effective throughout the 9-week follow-up period. There were four small trials that involved older patients exclusively, with three of the studies having less than 2 weeks of follow-up. The results were mixed regarding benefits, and adverse effects were poorly reported. CONCLUSION: The result of the meta-analysis shows that the use of benzodiazepines results in a decrease in sleep latency and a significant increase in total sleep time as compared with placebo. There was also a report of significantly increased side effects, but this did not result in increased discontinuation rate. There was no dose-response relationship for beneficial effect seen with the use of benzodiazepines, although the data are scant. Zopiclone was the only alternative pharmacological therapy that could be studied with any precision. There was no significant difference in the outcome when benzodiazepines were compared with zopiclone. There was only one study that compared the effect of benzodiazepines with nonpharmacological therapy; thus available data are insufficient to comment. [source]
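The data extraction section of this abstract reports percent agreement alongside kappa for the duplicate abstraction. The two are linked by kappa = (p_o − p_e) / (1 − p_e), so the chance agreement p_e implied by each reported pair can be recovered. A minimal sketch in Python, assuming the reported values are Cohen's kappa, that the paired figures correspond in the order given, and that rounding in the abstract is ignored. The function name `implied_chance_agreement` is hypothetical.

```python
def implied_chance_agreement(p_observed, kappa):
    """Solve kappa = (p_o - p_e) / (1 - p_e) for the chance agreement p_e."""
    return (p_observed - kappa) / (1 - kappa)


# (percent agreement, kappa) pairs as quoted in the abstract.
reported = {
    "classification of studies": (0.95, 0.90),
    "validity of therapy": (0.98, 0.95),
    "harmful effects": (0.76, 0.51),
}
for label, (p_o, k) in reported.items():
    p_e = implied_chance_agreement(p_o, k)
    print(f"{label}: implied chance agreement = {p_e:.2f}")
```

Under these assumptions the implied chance agreement is about 0.50, 0.60, and 0.51 respectively, which illustrates why 76% raw agreement corresponds to only moderate kappa.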

The development, validity and reliability of a multimodality objective structured clinical examination in psychiatry

MEDICAL EDUCATION, Issue 3 2005
K Walters
Objectives: To evaluate the development, validity and reliability of a multimodality objective structured clinical examination (OSCE) in undergraduate psychiatry, integrating interactive face-to-face and telephone history taking and communication skills stations, videotape mental state examinations and problem-oriented written stations. Methods: The development of the OSCE on a restricted budget is described. This study evaluates the validity and reliability of four 15- to 18-station OSCEs for 128 students over 1 year. Face and content validity were assessed by a panel of clinicians and from feedback from OSCE participants. Correlations with consultant clinical 'firm grades' were performed. Interrater reliability and internal consistency (interstation reliability) were assessed using generalisability theory. Results: The OSCE was feasible to conduct and had a high level of perceived face and content validity. Consultant firm grades correlated moderately with scores on interactive stations and poorly with written and video stations. Overall reliability was moderate to good, with G-coefficients in the range 0.55–0.68 for the 4 OSCEs. Conclusions: Integrating a range of modalities into an OSCE in psychiatry appears to represent a feasible, generally valid and reliable method of examination on a restricted budget. Different types of stations appear to have different advantages and disadvantages, supporting the integration of both interactive and written components into the OSCE format. [source]

Interrater reliability of diagnosing complex regional pain syndrome type I

ACTA ANAESTHESIOLOGICA SCANDINAVICA, Issue 4 2002
R. S. G. M. Perez
Background: Diagnosis of complex regional pain syndrome type I (CRPS I) is based on clinical observation of symptoms. As little information is available on the reliability of CRPS I diagnosis, we evaluated the agreement between therapists with regard to the presence and severity of CRPS I and its symptoms. Methods: The interrater reliability was evaluated in 37 presumed CRPS I patients by three observers: one consultant anesthesiologist and two resident anesthesiologists. Patients were assessed on the basis of Veldman's CRPS criteria. Results: The interrater reliability for diagnosing CRPS I was good for the majority of observer combinations. The percentage of agreement for the absence or presence of CRPS I was good (88%–100%). Cohen's kappa values ranged from 0.60 to 0.86. The agreement for the mean symptom score ranged from 70.2% to 88.6%; kappa values were lower and showed more variation. Interrater reliability for assessment of the severity of CRPS I and its symptoms was poor. Factors influencing the interrater reliability were symptom type, individual observers and sample population. Conclusion: Diagnosing CRPS I can be performed on the basis of clinical observation. Further assessment of severity of CRPS I and its symptoms should be performed with reliable and valid measurement instruments. [source]

Reliability and validity of a Japanese quality of life scale for the elderly with dementia

NURSING & HEALTH SCIENCES, Issue 2 2000
Noriko Yamamoto-Mitani RN
Abstract This paper examines reliability and validity of an instrument measuring the quality of life of elderly Japanese people with dementia. The instrument is a translation of an American instrument. The instrument has 48 items with binary answer format over five domains: 'social interaction', 'awareness of self', 'enjoyment of activities', 'feelings and mood', and 'response to surroundings'. Altogether, 321 elderly people in various facilities/services in Japan were evaluated by their formal caregivers. Factor analysis supported the domain of 'enjoyment of activities', but the domains of 'awareness of self' and 'response to surroundings' overlapped statistically. The domains of 'social interaction' and 'feelings and mood' were not supported. Test–retest reliability was generally satisfactory except for the domain of 'response to surroundings'. Interrater reliability was relatively low for domain scores but the total score was acceptable. Thus, the instrument showed moderate reliability and validity and further improvement is needed. [source]

The FLACC behavioral scale for procedural pain assessment in children aged 5–16 years

PEDIATRIC ANESTHESIA, Issue 8 2008
STEFAN NILSSON MSN RN
Summary Objectives: To evaluate the concurrent and construct validity and the interrater reliability of the Face, Legs, Activity, Cry and Consolability (FLACC) scale during procedural pain in children aged 5–16 years. Background: Self-reporting of pain is considered to be the primary source of information on pain intensity for older children, but a validated observational tool can augment the information from self-reports during painful procedures. Methods: Eighty children scheduled for peripheral venous cannulation or percutaneous puncture of a venous port were included. In 40 cases two nurses simultaneously and independently assessed pain by using the FLACC scale, and in 40 cases one of these nurses assessed the child. All children scored the intensity of pain by using the Coloured Analogue Scale (CAS) and distress by the Facial Affective Scale (FAS). Results: Concurrent validity was supported by the correlation between FLACC scores and the children's self-reported CAS scores during the procedure (r = 0.59, P < 0.05). A weaker correlation was found between the FLACC scores and children's self-reported FAS (r = 0.35, P < 0.05). Construct validity was demonstrated by the increase in median FLACC score to 1 during the procedure compared with 0 before and after the procedure (P < 0.001). Interrater reliability during the procedure was supported by adequate kappa statistics for all items and for the total FLACC scores (κ = 0.85, P < 0.001). Conclusions: The findings of this study support the use of FLACC as a valid and reliable tool for assessing procedural pain in children aged 5–16 years. [source]

Reliability and Validity of the Emergency Severity Index for Pediatric Triage

ACADEMIC EMERGENCY MEDICINE, Issue 9 2009
Debbie A. Travers PhD
Abstract Objectives: The Emergency Severity Index (ESI) triage algorithm is a five-level triage acuity tool used by emergency department (ED) triage nurses to rate patients from Level 1 (most acute) to Level 5 (least acute). ESI has established reliability and validity in an all-age population, but has not been well studied for pediatric triage. This study assessed the reliability and validity of the ESI for pediatric triage at five sites. Methods: Interrater reliability was measured with weighted kappa for 40 written pediatric case scenarios and 100 actual patient triages at each of five research sites (independently rated by both a triage nurse and a research nurse). Validity was evaluated with a sample of 200 patients per site. The ESI ratings were compared with outcomes, including hospital admission, resource consumption, and ED length of stay. Results: Interrater reliability was 0.77 (95% confidence interval [CI] = 0.76 to 0.78) for the scenarios (n = 155 nurses) and 0.57 (95% CI = 0.52 to 0.62) for actual patients (n = 498 patients). Inconsistencies in triage were noted for the most acute and least acute patients, as well as those less than 1 year of age and those with medical (rather than trauma) chief complaints. For the validity cohort (n = 1,173 patients), outcomes differed by ESI level, including hospital admission, which went from 83% for Level 1 patients to 0% for Level 5 (chi-square, p < 0.0001). Nurses from dedicated pediatric EDs were 31% less likely to undertriage patients than nurses in general EDs (odds ratio [OR] = 0.31, 95% CI = 0.14 to 0.67). Conclusions: Reliability of the ESI for pediatric triage is moderate. The ESI provides a valid stratification of pediatric patients into five distinct groups. We found several areas in which nurses have difficulty triaging pediatric patients consistently. The study results are being used to develop pediatric-specific ESI educational materials to strengthen reliability and validity for pediatric triage. [source]

Lack of agreement between rheumatologists in defining digital ulceration in systemic sclerosis

ARTHRITIS & RHEUMATISM, Issue 3 2009
Ariane L. Herrick
Objective To test the intra- and interobserver variability, among clinicians with an interest in systemic sclerosis (SSc), in defining digital ulcers. Methods Thirty-five images of finger lesions, incorporating a wide range of abnormalities at different sites, were duplicated, yielding a data set of 70 images. Physicians with an interest in SSc were invited to take part in the Web-based study, which involved looking through the images in a random sequence. The sequence differed for individual participants and prevented cross-checking with previous images. Participants were asked to grade each image as depicting "ulcer" or "no ulcer," and if "ulcer," then either "inactive" or "active." Images of a range of exemplar lesions were available for reference purposes while participants viewed the test images. Intrarater reliability was assessed using a weighted kappa coefficient with quadratic weights. Interrater reliability was estimated using a multirater weighted kappa coefficient. Results Fifty individuals (most of them rheumatologists) from 15 countries participated in the study. There was a high level of intrarater reliability, with a mean weighted kappa value of 0.81 (95% confidence interval [95% CI] 0.77, 0.84). Interrater reliability was poorer (weighted κ = 0.46 [95% CI 0.35, 0.57]). Conclusion The poor interrater reliability suggests that if digital ulceration is to be used as an end point in multicenter clinical trials of SSc, then strict definitions must be developed. The present investigation also demonstrates the feasibility of Web-based studies, for which large numbers of participants can be recruited over a short time frame. [source]

Bedside Ultrasound Diagnosis of Clavicle Fractures in the Pediatric Emergency Department

ACADEMIC EMERGENCY MEDICINE, Issue 7 2010
Keith P. Cross MD
ACADEMIC EMERGENCY MEDICINE 2010; 17:687–693 © 2010 by the Society for Academic Emergency Medicine Abstract Objectives: Clavicle fractures are among the most common orthopedic injuries in children. Diagnosis typically involves radiographs, which expose children to radiation and may consume significant time and resources. Our objective was to determine if bedside emergency department (ED) ultrasound (US) is an accurate alternative to radiography. Methods: This was a prospective study of bedside US for diagnosing clavicle fractures. A convenience sample of children ages 1–18 years with shoulder injuries requiring radiographs was enrolled. Bedside US imaging and an unblinded interpretation were completed by a pediatric emergency physician (EP) prior to radiographs. A second interpreter, a pediatric EP attending physician with extensive US experience, determined a final interpretation of the US images at a later date. This final interpretation was blinded to both clinical and radiography outcomes. The reference standard was an attending radiologist's interpretation of radiographs. The primary outcome was the accuracy of the blinded US interpretation for detecting clavicle fractures compared to the reference standard. Secondary outcome measures included the interrater reliability of the unblinded bedside and the blinded physicians' interpretations and the FACES pain scores (range, 0–5) for US and radiograph imaging. Results: One hundred patients were included in the study, of whom 43 had clavicle fractures by radiography. The final US interpretation had 95% sensitivity (95% confidence interval [CI] = 83% to 99%) and 96% specificity (95% CI = 87% to 99%), and overall accuracy was 96%, with 96 congruent readings. Positive and negative predictive values (PPVs and NPVs, respectively) were 95% (95% CI = 83% to 99%) and 96% (95% CI = 87% to 99%), respectively. Interrater reliability (kappa) was 0.74 (95% CI = 0.60 to 0.88). FACES pain scores were available for the 86 subjects who were at least 5 years old. Pain scores were similar during US and radiography. Conclusions: Compared to radiographs, bedside US can accurately diagnose pediatric clavicle fractures. US causes no more discomfort than radiography when detecting clavicle fractures. Given US's advantage of no radiation, pediatric EPs should consider this application. [source]

Interrater reliability of the Personal Care Participation Assessment and Resource Tool (PC-PART) in a rehabilitation setting

AUSTRALIAN OCCUPATIONAL THERAPY JOURNAL, Issue 2 2009
Christopher Turner
Background: The Personal Care Participation Assessment and Resource Tool (PC-PART), formerly the Handicap Assessment and Resource Tool (HART), assesses the domains of clothing, hygiene, nutrition, mobility, safety, residence and supports. Aim: To examine the interrater reliability of the PC-PART in a rehabilitation setting. Methods: Assessments made by the researcher were compared with those made by the interdisciplinary rehabilitation team. The research and standard assessments occurred within three working days. Raters were blind to each other's scores. Sample participants were a consecutive case-series of rehabilitation clients with varied diagnoses, activity limitations and participation restrictions. Of 66 consecutive patients seen during the a priori determined enrolment period, 25 were included in the study (nine males and 16 females, aged 44–85 years). The remaining 41 patients did not meet the inclusion criteria. Conclusion: The PC-PART has good interrater reliability. Clinicians, administrators and researchers can be reassured about this aspect of the validity of the tool. [source]

A Rapid Qualitative Test for Suspected Ethylene Glycol Poisoning

ACADEMIC EMERGENCY MEDICINE, Issue 7 2008
Heather Long MD
Abstract Objectives: Many hospitals must send out ethylene glycol (EG) samples to a reference laboratory, and delays in diagnosis and treatment may occur. A qualitative colorimetric test (ethylene glycol test [EGT] kit), already in use by veterinarians, gives results in 30 minutes with little expertise or cost. The EGT reliably detects the presence of EG in spiked human serum samples. The objective of this study was to prospectively assess the sensitivity and specificity of the EGT kit in actual clinical samples submitted for EG testing by the criterion standard gas chromatography (GC). Methods: Blood samples from patients with suspected toxic alcohol poisoning submitted to a reference laboratory were tested by GC. An investigator blinded to the GC results tested the same sample with the EGT kit following the manufacturer's instructions and using the internal control. Three physicians also blinded to the GC results categorized the sample as positive for EG, negative, or inconclusive. Interrater reliability was assessed with a kappa statistic (κ). Results of the EGT kit testing were then compared to those from GC testing. Results: Data are reported on 24 samples submitted. By GC, 15 samples were confirmed for EG (range 27–281 mg/dL), 5 were confirmed for methanol (ME; range 64–101 mg/dL), and 4 were negative for both alcohols. The EGT was unanimously positive in all confirmed EG samples and negative in all ME samples. In one of the negative samples, an ambiguous result occurred and was counted as a false-positive. Interobserver agreement with the EGT was high (κ = 0.909; 95% confidence interval [CI] = 0.735 to 1.0). Sensitivity and specificity were 100% (95% CI = 70% to 100%) and 88.8% (95% CI = 52% to 100%), respectively. Conclusions: The EGT appears to be a reliable qualitative test in cases of suspected human EG poisoning. [source]
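
Illustrative note (not part of the abstract): the reported sensitivity and specificity can be reproduced from the counts given in the Results. The sketch below assumes the 15 GC-confirmed EG samples are the true positives and that the 5 methanol plus 4 negative samples form the 9 non-EG samples, one of which was counted as a false positive. The function name is hypothetical.

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Return (sensitivity, specificity) from 2x2 diagnostic counts."""
    sensitivity = tp / (tp + fn)  # share of true EG samples the test flags
    specificity = tn / (tn + fp)  # share of non-EG samples the test clears
    return sensitivity, specificity


# Counts inferred from the abstract (an assumption, see the note above):
# 15 EG samples all test positive; 9 non-EG samples, 1 counted false positive.
sens, spec = sensitivity_specificity(tp=15, fn=0, tn=8, fp=1)
print(f"sensitivity = {sens:.1%}, specificity = {spec:.1%}")
```

Under these assumed counts the specificity is 8/9 = 0.888..., which the abstract reports as 88.8%.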

Selective Control Assessment of the Lower Extremity (SCALE): development, validation, and interrater reliability of a clinical tool for patients with cerebral palsy

DEVELOPMENTAL MEDICINE & CHILD NEUROLOGY, Issue 8 2009
EILEEN G FOWLER PhD PT
Normal selective voluntary motor control (SVMC) can be defined as the ability to perform isolated joint movement without using mass flexor/extensor patterns or undesired movement at other joints, such as mirroring. SVMC is an important determinant of function, yet a valid, reliable assessment tool is lacking. The Selective Control Assessment of the Lower Extremity (SCALE) is a clinical tool developed to quantify SVMC in patients with cerebral palsy (CP). This paper describes the development, utility, validation, and interrater reliability of SCALE. Content validity was based on review by 14 experienced clinicians. Mean agreement was 91.9% (range 71.4–100%) for statements about content, administration, and grading. SCALE scores were compared with Gross Motor Function Classification System Expanded and Revised (GMFCS-ER) levels for 51 participants with spastic diplegic, hemiplegic, and quadriplegic CP (GMFCS levels I–IV; 21 males, 30 females; mean age 11y 11mo [SD 4y 9mo]; range 5–23y). Construct validity was supported by significant inverse correlation (Spearman's r=-0.83, p<0.001) between SCALE scores and GMFCS levels. Six clinicians rated 20 participants with spastic CP (seven males, 13 females, mean age 12y 3mo [SD 5y 5mo], range 7–23y) using SCALE. A high level of interrater reliability was demonstrated by intraclass correlation coefficients ranging from 0.88 to 0.91 (p<0.001). [source]

Comparison of the Melbourne Assessment of Unilateral Upper Limb Function and the Quality of Upper Extremity Skills Test in hemiplegic CP

DEVELOPMENTAL MEDICINE & CHILD NEUROLOGY, Issue 12 2008
K Klingels MSc
This study investigated interrater reliability and measurement error of the Melbourne Assessment of Unilateral Upper Limb Function (Melbourne Assessment) and the Quality of Upper Extremity Skills Test (QUEST), and assessed the relationship between both scales in 21 children (15 females, six males; mean age 6y 4mo [SD 1y 3mo], range 5–8y) with hemiplegic CP. Two raters scored the videotapes of the assessments independently in a randomized order. According to the House Classification, three participants were classified as level 1, one participant as level 3, eight as level 4, three as level 5, one participant as level 6, and five as level 7. The Melbourne Assessment and the QUEST showed high interrater reliability (intraclass correlation 0.97 for Melbourne Assessment; 0.96 for QUEST total score; 0.96 for QUEST hemiplegic side). The standard error of measurement and the smallest detectable difference were 3.2% and 8.9% for the Melbourne Assessment and 5.0% and 13.8% for the QUEST score on the hemiplegic side. Correlation analysis indicated that different dimensions of upper limb function are addressed in both scales. [source]
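
Illustrative note (not part of the abstract): the abstract does not say how the smallest detectable difference (SDD) was derived from the standard error of measurement (SEM). A common convention, assumed here, is SDD = 1.96 × √2 × SEM at the 95% level. The sketch below applies that assumed formula to the reported SEM values; the function name is hypothetical.

```python
import math


def smallest_detectable_difference(sem, z=1.96):
    """SDD under the assumed convention SDD = z * sqrt(2) * SEM."""
    return z * math.sqrt(2) * sem


# SEM values as reported in the abstract (in percent).
for label, sem in [("Melbourne Assessment", 3.2), ("QUEST hemiplegic side", 5.0)]:
    print(f"{label}: SEM = {sem}%, SDD = {smallest_detectable_difference(sem):.1f}%")
```

With the rounded SEM values this gives figures close to the reported 8.9% and 13.8%; small differences would be expected if the authors worked from unrounded SEMs.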

Reliability of Computerized Emergency Triage

ACADEMIC EMERGENCY MEDICINE, Issue 3 2006
Sandy L. Dong MD
Objectives: Emergency department (ED) triage prioritizes patients based on urgency of care. This study compared agreement between two blinded, independent users of a Web-based triage tool (eTRIAGE) and examined the effects of ED crowding on triage reliability. Methods: Consecutive patients presenting to a large, urban, tertiary care ED were assessed by the duty triage nurse and an independent study nurse, both using eTRIAGE. Triage score distribution and agreement are reported. The study nurse collected data on ED activity, and agreement during different levels of ED crowding is reported. Two methods of interrater agreement were used: the linear-weighted κ and quadratic-weighted κ. Results: A total of 575 patients were assessed over nine weeks, and complete data were available for 569 patients (99.0%). Agreement between the two nurses was moderate if using linear κ (weighted κ = 0.52; 95% confidence interval = 0.46 to 0.57) and good if using quadratic κ (weighted κ = 0.66; 95% confidence interval = 0.60 to 0.71). ED overcrowding data were available for 353 patients (62.0%). Agreement did not significantly differ with respect to periods of ambulance diversion, number of admitted inpatients occupying stretchers, number of patients in the waiting room, number of patients registered in two hours, or nurse perception of busyness. Conclusions: This study demonstrated different agreement depending on the method used to calculate interrater reliability. Using the standard methods, it found good agreement between two independent users of a computerized triage tool. The level of agreement was not affected by various measures of ED crowding. [source]
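
Illustrative note (not part of the abstract): the gap between the linear and quadratic results follows from how each scheme penalizes disagreement. The sketch below is a minimal two-rater weighted kappa with disagreement weights (|i - j| / (k - 1)) raised to the power 1 (linear) or 2 (quadratic). The ratings are made-up values on a five-level scale, not the study data, and the function name is hypothetical.

```python
def weighted_kappa(rater_a, rater_b, n_categories, power):
    """Two-rater weighted kappa; ratings are integers 1..n_categories.

    power=1 gives linear weights, power=2 gives quadratic weights.
    """
    n, k = len(rater_a), n_categories
    observed = [[0.0] * k for _ in range(k)]
    for a, b in zip(rater_a, rater_b):
        observed[a - 1][b - 1] += 1.0 / n  # joint proportions
    row = [sum(observed[i]) for i in range(k)]  # rater A marginals
    col = [sum(observed[i][j] for i in range(k)) for j in range(k)]  # rater B
    obs_disagreement = exp_disagreement = 0.0
    for i in range(k):
        for j in range(k):
            weight = (abs(i - j) / (k - 1)) ** power
            obs_disagreement += weight * observed[i][j]
            exp_disagreement += weight * row[i] * col[j]
    return 1.0 - obs_disagreement / exp_disagreement


# Hypothetical triage levels from two nurses for ten patients.
nurse_1 = [1, 2, 2, 3, 3, 3, 4, 4, 5, 5]
nurse_2 = [1, 2, 3, 3, 3, 4, 4, 5, 5, 4]
linear = weighted_kappa(nurse_1, nurse_2, n_categories=5, power=1)
quadratic = weighted_kappa(nurse_1, nurse_2, n_categories=5, power=2)
print(f"linear kappa = {linear:.2f}, quadratic kappa = {quadratic:.2f}")
```

In this made-up example every disagreement is off by a single level, which quadratic weights penalize much less than linear weights do, so the quadratic value comes out higher. The same data can therefore land in different descriptive bands depending on the weighting chosen.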

Telemetry Monitoring during Transport of Low-risk Chest Pain Patients from the Emergency Department: Is It Necessary?

ACADEMIC EMERGENCY MEDICINE, Issue 10 2005
Adam J. Singer MD
Abstract Background: Low-risk emergency department (ED) patients with chest pain (CP) are often transported by nurses to monitored beds on telemetry monitoring, diverting valuable resources from the ED and delaying transport. Objectives: To test the hypothesis that transporting low-risk CP patients off telemetry monitoring is safe. Methods: This was a secondary analysis of a prospective, observational cohort of ED patients with low-risk chest pain (no active chest pain, normal or nondiagnostic electrocardiogram, normal initial troponin I) admitted to a non-intensive care unit monitored bed who were transported off telemetry monitor by nonclinical personnel. A protocol allowing transportation of low-risk CP patients off telemetry monitoring to a monitored bed was developed, and an ongoing daily log of patients transported off telemetry was maintained for the occurrence of any adverse events en route to the floor. Adverse events requiring treatment included dysrhythmias, hypotension, syncope, and cardiac arrest. The study population included patients who presented during September–October 2004, whose data were abstracted from the medical records using standardized methodology. A subset of 10% of the medical records were reviewed by a second investigator for interrater reliability. Death, syncope, resuscitation, and dysrhythmias during transport or immediately on arrival to the floor were the outcomes measured. Descriptive statistics and confidence intervals (CIs) were used in data analysis. Results: During the study period, 425 patients had CP of potentially ischemic origin, of whom 322 (75.8%) were low risk and met the inclusion criteria and were transported off monitors. Their mean (±standard deviation) age was 58.3 (±16.0) years; 48.1% were female. During transport from the ED, there was no patient with any adverse events requiring treatment and there was no death (95% CI = 0% to 0.93%). Conclusions: Transportation of low-risk ED chest pain patients off telemetry monitoring by nonclinical personnel to the floor appears safe. This may reduce diversion of ED nurses from the ED, helping to alleviate nursing shortages. [source]

Optimizing triage consistency in Australian emergency departments: The Emergency Triage Education Kit

EMERGENCY MEDICINE AUSTRALASIA, Issue 3 2008
Marie Frances Gerdtz
Abstract Objective: The Emergency Triage Education Kit was designed to optimize consistency of triage using the Australasian Triage Scale. The present study was conducted to determine the interrater reliability of a set of scenarios for inclusion in the programme. Methods: A postal survey of 237 paper-based triage scenarios was utilized. A quota sample of triage nurses (n = 42) rated each scenario using the Australasian Triage Scale. The scenarios were analysed for concordance and agreement. The criterion for inclusion of the scenarios in the programme was κ ≥ 0.6. Results: Data were collected from 2 April to 14 May 2007. Agreement for the set was κ = 0.412 (95% CI 0.410–0.415). Of the initial set: 92/237 (38.8%, 95% CI 32.6–45.3) showed concordance ≥70% to the modal triage category (κ = 0.632, 95% CI 0.629–0.636) and 155/237 (65.4%, 95% CI 59.3–71.5) showed concordance ≥60% to the modal triage category (κ = 0.507, 95% CI 0.504–0.510). Scenarios involving mental health and pregnancy presentations showed lower levels of agreement (κ = 0.243, 95% CI 0.237–0.249; κ = 0.319, 95% CI 0.310–0.328). Conclusion: All scenarios that showed good levels of agreement have been included in the Emergency Triage Education Kit and are recommended for testing purposes; those that showed moderate agreement have been incorporated for teaching purposes. Both scenario sets are accompanied by explanatory notes that link the decision outcome to the Australasian College for Emergency Medicine Guidelines on the Implementation of the Australasian Triage Scale. Future analysis of the scenarios is required to identify how task-related factors influence consistency of triage. [source]

Variability in agreement between physicians and nurses when measuring the Glasgow Coma Scale in the emergency department limits its clinical usefulness

EMERGENCY MEDICINE AUSTRALASIA, Issue 4 2006
Anna Holdgate
Abstract Objective: To assess the interrater reliability of the Glasgow Coma Scale (GCS) between nurses and senior doctors in the ED. Methods: This was a prospective observational study with a convenience sample of patients aged 18 or above who presented with a decreased level of consciousness to a tertiary hospital ED. A senior ED doctor (emergency physician or trainee) and a registered nurse each independently scored the patient's GCS in blinded fashion within 15 min of each other. The data were then analysed to determine interrater reliability using the weighted kappa statistic, and the size and direction of differences between paired scores were examined. Results: A total of 108 eligible patients were enrolled, with GCS scores ranging from 3 to 14. Interrater agreement was excellent (weighted kappa > 0.75) for verbal scores and total GCS scores, and intermediate (weighted kappa 0.4–0.75) for motor and eye scores. Total GCS scores differed by more than two points in 10 of the 108 patients. Interrater agreement did not vary substantially across the range of actual numeric GCS scores. Conclusions: Although the level of agreement for GCS scores was generally high, a significant proportion of patients had GCS scores which differed by two or more points. This degree of disagreement indicates that clinical decisions should not be based solely on single GCS scores. [source]

Specific Epileptic Syndromes Are Rare Even in Tertiary Epilepsy Centers: A Patient-oriented Approach to Epilepsy Classification

The Colorado Haemophilia Paediatric Joint Physical Examination Scale: normal values and interrater reliability

HAEMOPHILIA, Issue 1 2007
M. R. HACKER
Summary. Persons with haemophilia often experience their first joint haemorrhage in early childhood. Recurrent bleeding into a joint may lead to significant morbidity, specifically haemophilic arthropathy. Early identification of the onset and progression of joint damage is critical to preserving joint structure and function. Physical examination is the most feasible approach to monitor joint health. Our group developed the Colorado Haemophilia Paediatric Joint Physical Examination Scale to identify earlier signs of joint degeneration and incorporate developmentally appropriate tasks for assessing joint function in young children. This study's objectives were to establish normal ranges for this scale and assess interrater reliability. The ankles, knees and elbows of 72 healthy boys aged 1 through 7 years were evaluated by a physical therapist to establish normal ranges. Exactly 10 boys in each age category from 2 to 7 years were evaluated by a second physical therapist to determine interrater reliability. The original scale was modified to account for the finding that mild angulation in the weight-bearing joints is developmentally normal. The interrater reliability of the scale ranged from fair to good, underscoring the need for physical therapists to have specific training in the orthopaedic assessment of very young children and the measurement error inherent in the goniometer. Modifications to axial alignment scoring will allow the scale to distinguish healthy joints from those suffering frequent haemarthroses. [source]