Reliability

Distribution by Scientific Domains

Medical Sciences	53%
Engineering	9%
Life Sciences	6%
Psychology	5%
Chemistry	5%
Humanities and Social Sciences	4%
Business, Economics, Finance and Accounting	3%
Mathematics and Statistics	2%
Earth and Environmental Science	2%
Polymers and Materials Science	2%
3 Other Domains	9%

Distribution within Medical Sciences

Psychiatry	11%
Nursing General	9%
Neurology	7%
93 Other Subdomains	73%

Kinds of Reliability

acceptable reliability

adequate reliability

alpha reliability

consistency reliability

device reliability

excellent reliability

good inter-rater reliability

good internal reliability

good reliability

good test-retest reliability

high internal reliability

high reliability

initial reliability

inter-observer reliability

inter-rater reliability

internal consistency reliability

internal reliability

interobserver reliability

interrater reliability

intra-observer reliability

intra-rater reliability

intraobserver reliability

intrarater reliability

long-term reliability

sufficient reliability

supply reliability

system reliability

test reliability

test-retest reliability

Terms modified by Reliability

reliability analysis

reliability aspect

reliability assessment

reliability coefficient

reliability data

reliability estimate

reliability evaluation

reliability prediction

Selected Abstracts

VALIDITY AND RELIABILITY OF QUANTITATIVE GAIT ANALYSIS IN GERIATRIC PATIENTS WITH AND WITHOUT DEMENTIA

JOURNAL OF AMERICAN GERIATRICS SOCIETY, Issue 4 2007
Marianne B. Van Iersel MD
No abstract is available for this article. [source]

THE RELIABILITY OF NAÏVE ASSESSORS IN SENSORY EVALUATION VISUALIZED BY PRAGMATICAL MULTIVARIATE ANALYSIS

JOURNAL OF FOOD QUALITY, Issue 5 2002
M.G. O'SULLIVAN
The first part of this paper demonstrates a simple graphical way to visualize estimated variances, in terms of a plot of the total initial variance ("SIGNAL") versus residual variance ("NOISE"), as a pragmatic alternative to tables of F-tests. The recently developed Procrustes rotation in the bilinear "jack-knifing" form is then presented as a method for simplifying the comparison of PLS Regression models from different data sets. These methods are applied to sensory data in order to study if naïve (untrained) sensory panelists can produce reliable descriptions of systematic differences between various test meals. The results confirm that three panels of 15 naïve assessors each could give repeatable intersubjective description of the most dominant sensory variation dimensions. [source]

RELIABILITY OF SENSORY ASSESSORS: ISSUES OF COMPLEXITY

JOURNAL OF SENSORY STUDIES, Issue 1 2009
JANNA BITNES
ABSTRACT The objective of this study was to investigate whether the sensory performance of assessors in a sensory panel may be explained by the complexity of the evaluated product. We aimed to investigate whether we could observe a decline in sensory performance when increasing the complexity of the product. The products increased in number of constituents from mixtures of sucrose, sodium chloride, citric acid and caffeine in water, to the foods ice tea and tomato soup constituting different levels of the same substances. Candidates who succeeded evaluating one product were not always successful evaluating others. Few subjects were successful in everything. The conclusion was that there is only minor systematic decline with increasing complexity of products. The authors emphasize that definition of complexity involves more than just counting number of constituents and taste sensations, and suggest that minor differences in the task given to the assessor might explain different performances. PRACTICAL APPLICATIONS Practical use of the research presented in the present paper is in a sensory evaluation context. It is important for the users of sensory data to find out how the profiling should be organized to achieve optimum output, and specifically, the need for extensive training when dealing with a more complex product. The present study hypothesized that sensory assessors would have more difficulties evaluating a more complex product. However, the results showed that panel leaders should be more concerned with the task variables in the sensory evaluation. Even a minor shift in task variables had a stronger impact on the performance and reliability of the assessors than increasing number of constituents and/or stimuli sensations of the product. This study did not demonstrate a need for extensive training when dealing with a more complex product as hypothesized. [source]

INTERRATER CORRELATIONS DO NOT ESTIMATE THE RELIABILITY OF JOB PERFORMANCE RATINGS

PERSONNEL PSYCHOLOGY, Issue 4 2000
KEVIN R. MURPHY
Interrater correlations are widely interpreted as estimates of the reliability of supervisory performance ratings, and are frequently used to correct the correlations between ratings and other measures (e.g., test scores) for attenuation. These interrater correlations do provide some useful information, but they are not reliability coefficients. There is clear evidence of systematic rater effects in performance appraisal, and variance associated with raters is not a source of random measurement error. We use generalizability theory to show why rater variance is not properly interpreted as measurement error, and show how such systematic rater effects can influence both reliability estimates and validity coefficients. We show conditions under which interrater correlations can either overestimate or underestimate reliability coefficients, and discuss reasons other than random measurement error for low interrater correlations. [source]

CHARACTER, RELIABILITY AND VIRTUE EPISTEMOLOGY

THE PHILOSOPHICAL QUARTERLY, Issue 223 2006
Jason Baehr
Standard characterizations of virtue epistemology divide the field into two camps: virtue reliabilism and virtue responsibilism. Virtue reliabilists think of intellectual virtues as reliable cognitive faculties or abilities, while virtue responsibilists conceive of them as good intellectual character traits. I argue that responsibilist character virtues sometimes satisfy the conditions of a reliabilist conception of intellectual virtue, and that consequently virtue reliabilists, and reliabilists in general, must pay closer attention to matters of intellectual character. This leads to several new questions and challenges for any reliabilist epistemology. [source]

Reliability in grid computing systems

CONCURRENCY AND COMPUTATION: PRACTICE & EXPERIENCE, Issue 8 2009
Christopher Dabrowski
Abstract In recent years, grid technology has emerged as an important tool for solving compute-intensive problems within the scientific community and in industry. To further the development and adoption of this technology, researchers and practitioners from different disciplines have collaborated to produce standard specifications for implementing large-scale, interoperable grid systems. The focus of this activity has been the Open Grid Forum, but other standards development organizations have also produced specifications that are used in grid systems. To date, these specifications have provided the basis for a growing number of operational grid systems used in scientific and industrial applications. However, if the growth of grid technology is to continue, it will be important that grid systems also provide high reliability. In particular, it will be critical to ensure that grid systems are reliable as they continue to grow in scale, exhibit greater dynamism, and become more heterogeneous in composition. Ensuring grid system reliability in turn requires that the specifications used to build these systems fully support reliable grid services. This study surveys work on grid reliability that has been done in recent years and reviews progress made toward achieving these goals. The survey identifies important issues and problems that researchers are working to overcome in order to develop reliability methods for large-scale, heterogeneous, dynamic environments. The survey also illuminates reliability issues relating to standard specifications used in grid systems, identifying existing specifications that may need to be evolved and areas where new specifications are needed to better support the reliability. Published in 2009 by John Wiley & Sons, Ltd. [source]

The Assessment of Emergency Physicians by a Regulatory Authority

ACADEMIC EMERGENCY MEDICINE, Issue 12 2006
Jocelyn M. Lockyer PhD
Abstract Objectives To determine whether it is possible to develop a feasible, valid, and reliable multisource feedback program (360° evaluation) for emergency physicians. Methods Surveys with 16, 20, 30, and 31 items were developed to assess emergency physicians by 25 patients, eight coworkers, eight medical colleagues, and self, respectively, using five-point scales along with an "unable to assess" category. Items addressed key competencies related to communication skills, professionalism, collegiality, and self-management. Results Data from 187 physicians who identified themselves as emergency physicians were available. The mean number of respondents per physician was 21.6 (SD ± 3.87) (93%) for patients, 7.6 (SD ± 0.89) (96%) for coworkers, and 7.7 (SD ± 0.61) (95%) for medical colleagues, suggesting it was a feasible tool. Only the patient survey had four items with "unable to assess" percentages ≥15%. The factor analysis indicated there were two factors on the patient questionnaire (communication/professionalism and patient education), two on the coworker survey (communication/collegiality and professionalism), and four on the medical colleague questionnaire (clinical performance, professionalism, self-management, and record management) that accounted for 80.0%, 62.5%, and 71.9% of the variance on the surveys, respectively. The factors were consistent with the intent of the instruments, providing empirical evidence of validity for the instruments. Reliability was established for the instruments (Cronbach's α > 0.94) and for each physician (generalizability coefficients were 0.68 for patients, 0.85 for coworkers, and 0.84 for medical colleagues). Conclusions The psychometric examination of the data suggests that the instruments developed to assess emergency physicians were feasible and provide evidence for validity and reliability. [source]
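
Illustrative sketch (not part of the abstract): the internal-consistency statistic reported above, Cronbach's alpha, can be computed from a respondents-by-items score matrix roughly as follows. The function name and the toy data are hypothetical; the study's own data and software are not described in the source.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a list of respondents, each a list of k item scores.

    alpha = k / (k - 1) * (1 - sum(item variances) / variance(total scores))
    """
    k = len(scores[0])
    if k < 2 or len(scores) < 2:
        raise ValueError("need at least two items and two respondents")
    # Transpose so that each tuple holds all respondents' scores on one item.
    items = list(zip(*scores))
    item_variance_sum = sum(variance(item) for item in items)
    total_variance = variance([sum(row) for row in scores])
    if total_variance == 0:
        raise ValueError("total scores have zero variance")
    return (k / (k - 1)) * (1 - item_variance_sum / total_variance)


# Hypothetical data: three respondents answering three perfectly consistent items.
consistent = [[1, 1, 1], [2, 2, 2], [3, 3, 3]]
print(cronbach_alpha(consistent))
```

Values close to 1 indicate that the items vary together across respondents; the example uses sample variances throughout, and the ratio is unchanged if population variances are used consistently instead.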

Accounting Conservatism and the Temporal Trends in Current Earnings' Ability to Predict Future Cash Flows versus Future Earnings: Evidence on the Trade-off between Relevance and Reliability

CONTEMPORARY ACCOUNTING RESEARCH, Issue 2 2010
SATI P. BANDYOPADHYAY
M41; C23; D21; G38 This research reports that an increasing level of accounting conservatism over the 1973–2005 period is associated with: (1) an increase in the ability of current earnings to predict future cash flows and (2) a decrease in the ability of current earnings to predict future earnings. We also find that usefulness of earnings for explaining stock prices over book values is positively related to reliability but not to relevance. Our results hold for the constant and full samples in both in-sample and out-of-sample analyses and are robust to the use of alternative measures for relevance, reliability, earnings usefulness, and conservatism. Our findings about the relations among conservatism, relevance, reliability, and usefulness suggest a trade-off between relevance and reliability and seem to indicate that the adoption of an increasing number of conservative accounting standards has a possible adverse impact on earnings usefulness through a negative effect on reliability. [source]

Panic Disorder Severity Scale: Reliability and validity of the Turkish version

DEPRESSION AND ANXIETY, Issue 1 2004
E. Serap Monkul M.D.
Abstract We assessed the reliability and validity of the Turkish version of the seven-item Panic Disorder Severity Scale (PDSS). We recruited 174 subjects, including 104 with current DSM-IV panic disorder with (n = 76) or without (n = 28) agoraphobia, 14 with a major depressive episode, 24 with a non-panic anxiety disorder, and 32 healthy controls. Assessment instruments were Panic Disorder Severity Scale, Panic and Agoraphobia Scale, both the observer-rated (P&Ao) and self-rating (P&Asr); Clinical Global Impression Scale (CGI); Hamilton Anxiety Scale, and Beck Depression Inventory. We repeated the measures for a group of panic disorder patients (n = 51) after 4 weeks to assess test–retest reliability. The internal consistency (Cronbach's α) of the PDSS was .92–.94. The inter-rater correlation coefficient was .79. The test–retest correlation coefficient after 4 weeks was .63. In discriminant validity analyses, the highest correlation for PDSS was with P&Ao, P&Asr (r=.87 and .87, respectively) and CGI (r=.76) and the lowest with Beck Depression Inventory (r=.29). The cut-off point was six/seven, associated with high sensitivity (99%) and specificity (98%). This study confirmed the objectivity, reliability and validity of the Turkish version of the PDSS. Depression and Anxiety 00:000–000, 2004. © 2004 Wiley-Liss, Inc. [source]

Reliability and validity of a structured interview guide for the Hamilton Anxiety Rating Scale (SIGH-A)

DEPRESSION AND ANXIETY, Issue 4 2001
M. Katherine Shear M.D.
Abstract The Hamilton Anxiety Rating Scale, a widely used clinical interview assessment tool, lacks instructions for administration and clear anchor points for the assignment of severity ratings. We developed a Structured Interview Guide for the Hamilton Anxiety Scale (SIGH-A) and report on a study comparing this version to the traditional form of this scale. Experienced interviewers from three Anxiety Disorders research sites conducted videotaped interviews using both traditional and structured instruments in 89 participants. A subset of the tapes was co-rated by all raters. Participants completed self-report symptom questionnaires. We observed high inter-rater and test-retest reliability using both formats. The structured format produced similar but consistently higher (+4.2) scores. Correlation with a self-report measure of overall anxiety was also high and virtually identical for the two versions. We conclude that in settings where extensive training is not practical, the structured scale is an acceptable alternative to the traditional Hamilton Anxiety instrument. Depression and Anxiety 13:166–178, 2001. © 2001 Wiley-Liss, Inc. [source]

Detecting language problems: accuracy of five language screening instruments in preschool children

DEVELOPMENTAL MEDICINE & CHILD NEUROLOGY, Issue 2 2007
H M E Van Agt MA
To identify a simple and effective screening instrument for language delays in 3-year-old children the reliability, validity, and accuracy of five screening instruments were examined. A postal questionnaire sent to parents of 11423 children included the Dutch version of the General Language Screen (GLS), the Van Wiechen (VW) items, the Language Screening Instrument for 3- to 4-year-olds, consisting of a parent form (LSI-PF) and a child test (LSI-CT), and parents' own judgement of their child's language development on a visual analogue scale (VAS). The response rate was 78% or 8877 children. Reliability (internal consistency) was found to be acceptable (α=0.67–0.72) for all instruments. Significant correlations between the screening instruments (r=0.29–0.55, p<0.01) indicated good concurrent validity. Accuracy was estimated by the sensitivity, specificity, and receiver operating characteristic (ROC) curves against two reference tests based on parent report and specialists' judgement. If the test would classify approximately 5% of the population as screen-positive, the mean sensitivity was 50%; assigning between 20% and 30% of the population as screen-positive, the mean sensitivity was 77%. The sensitivity was lowest for the LSI-CT (range 43–62%), whereas short instruments like the LSI-PF, VW, and the one-item VAS exhibited high levels of sensitivity (range 50–86%). The area under the ROC curves ranged from 0.75 to 0.87. Apparently, short and simple parent report instruments like the LSI-PF and the one-item VAS perform remarkably well in detecting language delays in preschool children. [source]
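
Illustrative sketch (not part of the abstract): sensitivity and specificity, the accuracy measures named above, are simple ratios over the counts of a screening result against a reference test. The function name and the counts below are hypothetical and are not taken from the study.

```python
def sensitivity_specificity(true_pos, false_neg, true_neg, false_pos):
    """Return (sensitivity, specificity) for a screen compared with a reference test.

    sensitivity = TP / (TP + FN): share of reference-positive children flagged.
    specificity = TN / (TN + FP): share of reference-negative children passed.
    """
    if true_pos + false_neg == 0 or true_neg + false_pos == 0:
        raise ValueError("need at least one reference-positive and one reference-negative case")
    sensitivity = true_pos / (true_pos + false_neg)
    specificity = true_neg / (true_neg + false_pos)
    return sensitivity, specificity


# Hypothetical counts, chosen only to show the arithmetic.
sens, spec = sensitivity_specificity(true_pos=77, false_neg=23, true_neg=160, false_pos=40)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}")
```

Lowering the screening threshold flags more children as screen-positive, which tends to raise sensitivity at the cost of specificity; the ROC curve mentioned in the abstract traces that trade-off across thresholds.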

Reliability of the V-scope system in the measurement of arm movement in children with obstetric brachial plexus palsy

DEVELOPMENTAL MEDICINE & CHILD NEUROLOGY, Issue 11 2006
Andrea E Bialocerkowski PhD BApp Sc (Physio) MApp Sc (Physio)
This study reports on a novel methodology using the V-scope to quantify elbow and shoulder movement in young children with obstetric brachial plexus palsy (OBPP), and the intra- and inter-rater reliability of this method. The V-scope, a portable, inexpensive movement analysis system, was configured in an L-shape, with two transmitting towers placed on the floor and one 1.35m off the ground. These towers received ultrasonic pulses from buttons that were placed over standardized landmarks of the child's trunk, chest, and upper limb. Two physiotherapists (a paediatric and a generalist) facilitated the maximum range of active elbow flexion/extension and shoulder abduction/flexion in 30 children with OBPP (18 females, 12 males; age range 6mo–4y 7mo; mean age 2y 6mo [SD 1y 2mo]). Assessments were conducted on two occasions, one week apart. The V-scope was found to be feasible to use by a specialist and a generalist physiotherapist, demonstrating moderate to high reliability coefficients, small measurement errors, and lack of missing data. The paediatric physiotherapist was more reliable in measuring elbow and shoulder movement compared with the generalist physiotherapist, which suggests that the same experienced, paediatric physiotherapist should assess elbow and shoulder movement across all occasions of testing. [source]

Reliability of personality disorder diagnosis during depression: the contribution of collateral informant reports

ACTA PSYCHIATRICA SCANDINAVICA, Issue 6 2007
B. G. Case
Objective: Research has found low concordance of personality disorder diagnoses made during depression versus after remission and made using patient versus collateral informants, but little is known about the reliability of personality disorder (PD) diagnoses made during depression using patient and collateral reports. Method: A total of 168 patients were evaluated for PDs during depression and following response using patient and close informant reports. κ coefficients of inter-informant and test–retest reliability were calculated. Results: After depression response, the proportion diagnosed with cluster A and C PDs fell by both patient and close informant report, and overall inter-informant reliability declined. Overall test–retest reliability did not differ between patients and informants. Conclusion: Collateral informants do not improve the reliability of PD diagnoses made during depressive episodes. [source]

Reliability and validity of the Observational Gait Scale in children with spastic diplegia

DEVELOPMENTAL MEDICINE & CHILD NEUROLOGY, Issue 1 2003
Anna H Mackey MS PT
The aim of this study was to establish the reliability and validity of visual gait assessment in children with spastic diplegia, who were community or household ambulators, using a modified version of the Physicians Rating Scale, known as the Observational Gait Scale (OGS). Two clinicians viewed edited split-screen video recordings of 20 children/adolescents (11 males, 9 females; mean age 12 years, range 6 to 21 years) made at the time of three-dimensional gait analysis (3-DGA). Walking ability in each child was scored at initial assessment and reassessed from the same videos three months later using the first seven sections of the OGS. Validity of the OGS score was determined by comparison with 3-DGA. The OGS was found to have acceptable interrater and intrarater reliability for knee and foot position in mid-stance, initial foot contact, and heel rise with weighted kappas (wk) ranging from 0.53 to 0.91 (intrarater) and 0.43 to 0.86 (interrater). Comparison with 3-DGA suggests that these sections might also have high validity (wk range 0.38–0.94). Base of support and hind foot position had lower interrater and intrarater reliabilities (wk 0.29 to 0.71 and wk 0.30 to 0.78 respectively) and were not easily validated by 3-DGA. [source]

Reliability of Computerized Emergency Triage

ACADEMIC EMERGENCY MEDICINE, Issue 3 2006
Sandy L. Dong MD
Objectives: Emergency department (ED) triage prioritizes patients based on urgency of care. This study compared agreement between two blinded, independent users of a Web-based triage tool (eTRIAGE) and examined the effects of ED crowding on triage reliability. Methods: Consecutive patients presenting to a large, urban, tertiary care ED were assessed by the duty triage nurse and an independent study nurse, both using eTRIAGE. Triage score distribution and agreement are reported. The study nurse collected data on ED activity, and agreement during different levels of ED crowding is reported. Two methods of interrater agreement were used: the linear-weighted κ and quadratic-weighted κ. Results: A total of 575 patients were assessed over nine weeks, and complete data were available for 569 patients (99.0%). Agreement between the two nurses was moderate if using linear κ (weighted κ = 0.52; 95% confidence interval = 0.46 to 0.57) and good if using quadratic κ (weighted κ = 0.66; 95% confidence interval = 0.60 to 0.71). ED overcrowding data were available for 353 patients (62.0%). Agreement did not significantly differ with respect to periods of ambulance diversion, number of admitted inpatients occupying stretchers, number of patients in the waiting room, number of patients registered in two hours, or nurse perception of busyness. Conclusions: This study demonstrated different agreement depending on the method used to calculate interrater reliability. Using the standard methods, it found good agreement between two independent users of a computerized triage tool. The level of agreement was not affected by various measures of ED crowding. [source]
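
Illustrative sketch (not part of the abstract): the two agreement statistics compared above differ only in how heavily they penalize disagreements that are further apart on the ordinal scale. The function below is a minimal weighted kappa under the usual disagreement-weight formulation; its name, the 0-based category coding, and the toy ratings are assumptions, not details from the study.

```python
def weighted_kappa(rater_a, rater_b, n_categories, weighting="linear"):
    """Cohen's weighted kappa for two raters on an ordinal scale coded 0..n_categories-1.

    Disagreement weight for cells (i, j) is (|i - j| / (k - 1)) ** p,
    with p = 1 for linear and p = 2 for quadratic weighting.
    """
    n = len(rater_a)
    if n == 0 or n != len(rater_b) or n_categories < 2:
        raise ValueError("need two equally long, non-empty rating lists and >= 2 categories")
    power = {"linear": 1, "quadratic": 2}[weighting]
    k = n_categories

    # Observed joint proportions and their marginals.
    observed = [[0.0] * k for _ in range(k)]
    for a, b in zip(rater_a, rater_b):
        observed[a][b] += 1.0 / n
    row = [sum(observed[i]) for i in range(k)]
    col = [sum(observed[i][j] for i in range(k)) for j in range(k)]

    observed_disagreement = 0.0
    expected_disagreement = 0.0
    for i in range(k):
        for j in range(k):
            weight = (abs(i - j) / (k - 1)) ** power
            observed_disagreement += weight * observed[i][j]
            expected_disagreement += weight * row[i] * col[j]
    if expected_disagreement == 0:
        raise ValueError("kappa is undefined when chance disagreement is zero")
    return 1.0 - observed_disagreement / expected_disagreement


# Hypothetical ratings on a 3-level scale with one adjacent-category disagreement.
nurse_1 = [0, 1, 2, 0]
nurse_2 = [0, 1, 2, 1]
print(weighted_kappa(nurse_1, nurse_2, 3, "linear"))
print(weighted_kappa(nurse_1, nurse_2, 3, "quadratic"))
```

Because quadratic weights shrink the penalty for near misses relative to distant ones, data dominated by adjacent-category disagreements will typically yield a higher quadratic than linear value, which is the pattern the abstract reports.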

Development of the International Classification of Mental Health Care (ICMHC)

ACTA PSYCHIATRICA SCANDINAVICA, Issue 2000
A. De Jong
Objective: Evaluations of the process of providing mental health care have been hampered because a tool to systematically describe the interventions actually provided by the services was lacking. In this paper the development of such a tool (the International Classification of Mental Health Care; ICMHC) is described. Method: Subsequent versions of the ICMHC were developed, using comments from experts in 24 WHO field centres and results from a number of field trials. In the final version 10 Modalities of Care can be used to describe Modules of Care, using the Level of Specialization scale. The inter-rater reliability of this version was evaluated by the Italian research team, using data from 43 services. Results: Reliability ranged from excellent for nine modalities to reasonably good for the remaining modality. Conclusion: In the context of evaluation studies, the ICMHC can be used to describe systematically mental health care interventions. [source]

Reliability of Intraoperative Transesophageal Echocardiography During Tetralogy of Fallot Repair

ECHOCARDIOGRAPHY, Issue 4 2000
JAMES J. JOYCE M.D.
There is limited information available concerning the accuracy of intraoperative transesophageal echocardiography (TEE) in predicting the extent of residual abnormalities after recovery from surgical repair of tetralogy of Fallot. Therefore, we investigated differences between the results of final postbypass TEE and those of postrecovery (mean, 6 days after surgery) transthoracic echocardiography in a total of 28 consecutive pediatric patients who underwent repair of tetralogy of Fallot with biplane or multiplane TEE. Both postbypass and postrecovery echocardiographic examinations included measurements of the right ventricle (RV)-main pulmonary artery (PA) and the main PA-branch PA peak instantaneous gradients, the degree of pulmonary valvar insufficiency, and color Doppler interrogation of the ventricular septum for residual defects. The RV-main PA gradient did not change significantly: 15 ± 13 vs 18 ± 14 mmHg (postbypass versus postrecovery, mean ± SD). None of the patients had a decrease of , 10 mmHg; and only one patient had an increase of ,: 15 mmHg. There also was no change in the degree of pulmonary insufficiency (3.0 ±1.2 versus 3.1 ± 1.1, using a scale of 0 to 4). Only one of the seven very small (, 2 mm) residual ventricular septal defects was not discovered during postbypass TEE. However, postrecovery transthoracic echocardiography detected significant branch PA stenosis (peak gradient, , 15 mmHg) in five patients (18%) that was not detected during postbypass TEE (P < 0.03). Of the branch PA stenoses that were not detected during TEE, four were left and one was right. Conclusions: Postbypass TEE after tetralogy of Fallot repair reliably predicts residual postrecovery hemodynamic abnormalities, except for branch PA stenosis. [source]

An Examination of the Reliability of Prestigious Scholarly Journals: Evidence and Implications for Decision-Makers

ECONOMICA, Issue 293 2007
ANDREW J. OSWALD
Scientific-funding bodies are increasingly under pressure to use journal rankings to measure research quality. Hiring and promotion committees routinely hear an equivalent argument: 'this is important work because it is to be published in prestigious journal X'. But how persuasive is such an argument? This paper examines data on citations to articles published 25 years ago. It finds that it is better to write the best article published in an issue of a medium-quality journal such as the OBES than all four of the worst four articles published in an issue of an elite journal like the AER. Decision-makers need to understand this. [source]

Reliability of No Child Left Behind Accountability Designs

EDUCATIONAL MEASUREMENT: ISSUES AND PRACTICE, Issue 3 2003
Richard K. Hill
The No Child Left Behind Act of 2001 requires states to establish accountability systems that are both valid and reliable. If one follows the language of the law literally, there is no design that will meet both requirements. If one interprets the law more flexibly, it is possible to create such a design. States will need to approach the problem carefully if they are going to appropriately balance the various probabilities of making incorrect decisions about schools. [source]

The Dependence of the Sensitivity and Reliability of Contactless Conductivity Detection on the Wall Thickness of Electrophoretic Fused-Silica Capillaries

ELECTROANALYSIS, Issue 3-5 2009
Petr T
Abstract A contactless conductivity detector (C4D) performance has been tested on a simple capillary electrophoretic separation in a standard fused-silica capillary with an external diameter of 360 µm and in a thin-walled capillary (an external diameter of 150 µm); the internal diameters of the two capillaries were identical, equal to 75 µm. Potassium and sodium ions have been separated in a morpholinoethanesulfonic acid/histidine background electrolyte (MES/His), over a wide range of its concentrations (0–100 mM). At low MES/His concentrations, the C4D response, obtained from the height of the potassium peak, is by 100 to 200 per cent higher for the thin-walled capillary and the calibration dependences are linear, in contrast to the thick-walled capillary. These differences between the two capillaries decrease with increasing MES/His concentration, the C4D response in the thin-walled capillary is then higher by a mere 20 per cent and the calibration dependences are linear in both the capillaries. The highest sensitivities have been obtained at a MES/His concentration of 50 mM, with LOD values for potassium ion of 2.0 and 2.6 µM, in the thin- and thick-walled capillaries, respectively. The signal-to-noise ratios and the plate counts are generally similar for the two capillaries. It follows from the results that special thin-walled capillaries can be advantageous when background electrolytes with very low conductivities must be employed. [source]

Test–re-test reliability of DSM-IV adopted criteria for 3,4-methylenedioxymethamphetamine (MDMA) abuse and dependence: a cross-national study

ADDICTION, Issue 10 2009
Linda B. Cottler
ABSTRACT Aims This study evaluated the prevalence and reliability of DSM-IV adopted criteria for 3,4-methylenedioxymethamphetamine (MDMA) abuse and dependence with a purpose to determine whether it is best conceptualized within the category of hallucinogens, amphetamines or its own category. Design Test–re-test study. Participants MDMA users (life-time use >5 times) were recruited in St Louis, Miami and Sydney (n = 593). The median life-time MDMA consumption was 50 pills at the baseline. Measurements The computerized Substance Abuse Module for Club Drug (CD-SAM) was used to assess MDMA abuse and dependence. The Discrepancy Interview Protocol (DIP) was used to determine the reasons for the discrepant responses between the two interviews. Reliability of diagnoses, individual diagnostic criteria and withdrawal symptoms was examined using the kappa coefficient (κ). Findings For baseline data, 15% and 59% met MDMA abuse and dependence, respectively. Substantial test–re-test reliability of the diagnoses was observed consistently across cities (κ = 0.69). 'Continued use despite knowledge of physical/psychological problems' (87%) and 'withdrawal' (68%) were the two most prevalent dependence criteria. 'Physically hazardous use' was the most prevalent abuse criterion. Six dependence criteria and all abuse criteria were reported reliably across cities (κ: 0.53–0.77). Seventeen of 19 withdrawal symptoms showed consistency in the reliability across cities. The most commonly reported reason for discrepant responses was 'interpretation of question changed'. Only a small proportion of the total discrepancies were attributed to lying or social desirability. Conclusion The adopted DSM-IV diagnostic classification for MDMA abuse and dependence was moderately reliable across cities. Findings on MDMA withdrawal support the argument that MDMA should be separated from other hallucinogens in DSM. [source]

Reliability and replicability of genetic association studies

ADDICTION, Issue 9 2009
MARCUS R. MUNAFÒ
No abstract is available for this article. [source]

Reliability of patterns of hippocampal sclerosis as predictors of postsurgical outcome

EPILEPSIA, Issue 9 2010
Maria Thom
Summary Purpose: Around one-third of patients undergoing temporal lobe surgery for the treatment of intractable temporal lobe epilepsy with hippocampal sclerosis (HS) fail to become seizure-free. Identifying reliable predictors of poor surgical outcome would be helpful in management. Atypical patterns of HS may be associated with poorer outcomes. Our aim was to identify atypical HS cases from a large surgical series and to correlate pathology with clinical and outcome data. Methods: Quantitative neuropathologic evaluation on 165 hippocampal surgical specimens and 21 control hippocampi was carried out on NeuN-stained sections. Neuronal densities (NDs) were measured in CA4, CA3, CA2, and CA1 subfields. The severity of granule cell dispersion (GCD) was assessed. Results: Comparison with control ND values identified the following patterns based on the severity and distribution of neuronal loss: classical HS (CHS; n = 60) and total HS (THS; n = 39). Atypical patterns were present in 30% of cases, including end-folium sclerosis (EFS; n = 5), CA1 predominant pattern (CA1p; n = 9), and indeterminate HS (IHS, n = 35). No HS was noted in 17 cases. Poorest outcomes were noted for no-HS, and CA1p groups with 33–44% International League Against Epilepsy (ILAE) class I at up to 2 years follow-up compared to 69% for CHS (p < 0.05). GCD associated with HS type (p < 0.01), but not with outcome. Conclusions: These findings support the identification and delineation of atypical patterns of HS using quantitative methods. Atypical patterns may represent distinct clinicopathologic subtypes and may have predictive value following epilepsy surgery. [source]

Wada Test Reliability (Response to Haber et al.)

EPILEPSIA, Issue 9 2007
Tobias Loddenkemper MD
No abstract is available for this article. [source]

Interobserver Reliability of Video Recording in the Diagnosis of Nocturnal Frontal Lobe Seizures

EPILEPSIA, Issue 8 2007
Luca Vignatelli
Summary: Background: Nocturnal frontal lobe seizures (NFLS) show one or all of the following semeiological patterns: (1) paroxysmal arousals (PA: brief and sudden recurrent motor paroxysmal behavior); (2) hyperkinetic seizures (HS: motor attacks with complex dyskinetic features); (3) asymmetric bilateral tonic seizures (ATS: motor attacks with dystonic features); (4) epileptic nocturnal wanderings (ENW: stereotyped, prolonged ambulatory behavior). Objective: To estimate the interobserver reliability (IR) of video-recording diagnosis in patients with suspected NFLS among sleep medicine experts, epileptologists, and trainees in sleep medicine. Methods: Sixty-six patients with suspected NFLS were included. All underwent nocturnal video-polysomnographic recording. Six doctors (three experts and three trainees) independently classified each case as "NFLS ascertained" (according to the above specified subtypes: PA, HS, ATS, ENW) or "NFLS excluded". IR was calculated by means of Kappa statistics, and interpreted according to the standard classification (0.0–0.20 = slight agreement; 0.21–0.40 = fair; 0.41–0.60 = moderate; 0.61–0.80 = substantial; 0.81–1.00 = almost perfect). Results: The observed raw agreement ranged from 63% to 79% between each pair of raters; the IR ranged from "moderate" (kappa = 0.50) to "substantial" (kappa = 0.72). A major source of variance was the disagreement in distinguishing between PA and nonepileptic arousals, without differences in the level of agreement between experts and trainees. Conclusions: Among sleep experts and trainees, IR of diagnosis of NFLS, based on videotaped observation of sleep phenomena, is not satisfactory. Explicit video-polysomnographic criteria for the classification of paroxysmal sleep motor phenomena are needed. [source]
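The agreement bands quoted in this abstract can be written as a small lookup. This is a minimal sketch: the function name `interpret_kappa` is hypothetical, and two details are assumptions not stated in the abstract, namely that each upper bound is inclusive and that values outside 0 to 1 are rejected.

```python
# Upper bound of each band and its label, as listed in the abstract.
AGREEMENT_BANDS = [
    (0.20, "slight"),
    (0.40, "fair"),
    (0.60, "moderate"),
    (0.80, "substantial"),
    (1.00, "almost perfect"),
]


def interpret_kappa(kappa):
    """Return the agreement label for a kappa value between 0 and 1."""
    if not 0.0 <= kappa <= 1.0:
        # Assumption: the quoted scale starts at 0.0, so other values are rejected.
        raise ValueError("kappa outside the 0 to 1 range covered by the scale")
    for upper_bound, label in AGREEMENT_BANDS:
        if kappa <= upper_bound:  # assumption: upper bounds are inclusive
            return label
    raise AssertionError("unreachable for values in the 0 to 1 range")


# The two extremes reported in the results.
for value in (0.50, 0.72):
    print(value, interpret_kappa(value))
```

Applied to the reported range, 0.50 falls in the "moderate" band and 0.72 in the "substantial" band, matching the labels used in the results.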

Reliability and repeatability of thermographic examination and the normal thermographic image of the thoracolumbar region in the horse

EQUINE VETERINARY JOURNAL, Issue 4 2004
B. V. TUNLEY
Summary Reasons for performing study: Thermographic imaging is an increasingly used diagnostic tool. When performing thermography, guidelines suggest that horses should be left for 10–20 mins to 'acclimatise' to the thermographic imaging environment, with no experimental data to substantiate this recommendation. In addition, little objective work has been published on the repeatability and reliability of the data obtained. Thermography has been widely used to identify areas of abnormal body surface temperature in horses with back pathology; however, no normal data is available on the thermographic 'map' of the thoracolumbar region with which to compare horses with suspected pathology. Objectives: To i) investigate whether equilibration of the thermographic subject was required and, if so, how long it should take, ii) investigate what factors affect time to equilibration, iii) investigate the repeatability and reliability of the technique and iv) generate a topographic thermographic 'map' of the thoracolumbar region. Methods: A total of 52 horses were used. The following investigations were undertaken: thermal imaging validation, i.e. detection of movement around the baseline of an object of constant temperature; factors affecting equilibration; pattern reproducibility during equilibration and over time (n = 25); and imaging of the thoracolumbar region (n = 27). Results: A 1°C change was detected in an object of stable temperature using this detection system, i.e. the 'noise' in the system. The average time taken to equilibrate, i.e. reach a plateau temperature, was 39 mins (40.2 in the gluteal region, 36.2 in lateral thoracic region and 40.4 in metacarpophalangeal region). Only 19% of horses reached plateau within 10–20 mins. Of the factors analysed, hair length and the difference between the external environment and the internal environment where the measurements were being taken both significantly affected time to plateau (P<0.05). However, the thermographic patterns obtained did not change during equilibration, nor when assessed over a 7-day period. A 'normal' map of the surface temperature of the thoracolumbar region has been produced, demonstrating that the midline is the hottest, with a fall off of 3°C either side of the midline. Conclusions: This study demonstrates that horses may not need time to equilibrate prior to taking thermographic images and that thermographic patterns are reproducible over periods up to 7 days. A topographical thermographic 'map' of the thoracolumbar region has been obtained. Potential relevance: Clinicians can obtain relevant thermographic images without the need for prior equilibration and can compare cases with thoracolumbar pathology to a normal topographic thermographic map. [source]

On the reliability of a dental OSCE, using SEM: effect of different days

EUROPEAN JOURNAL OF DENTAL EDUCATION, Issue 3 2008
M. Schoonheim-Klein
Abstract Aim: The first aim was to study the reliability of a dental objective structured clinical examination (OSCE) administered over multiple days, and the second was to assess the number of test stations required for a sufficiently reliable decision in three score interpretation perspectives of a dental OSCE administered over multiple days. Materials and methods: In four OSCE administrations, 463 students of the years 2005 and 2006 took the summative OSCE after a dental course in comprehensive dentistry. The OSCE had 16–18 5-min stations (scores 1–10), and was administered per OSCE on four different days of 1 week. ANOVA was used to test for examinee performance variation across days. Generalizability theory was used for reliability analyses. Reliability was studied from three interpretation perspectives: for relative (norm) decisions, for absolute (domain) decisions and for pass–fail (mastery) decisions. As an indicator of reproducibility of test scores in this dental OSCE, the standard error of measurement (SEM) was used. The benchmark of SEM was set at <0.51. This corresponds to a 95% confidence interval (CI) of <1 on the original scoring scale that ranged from 1 to 10. Results: The mean weighted total OSCE score was 7.14 on a 10-point scale. With the pass–fail score set at 6.2 for the four OSCEs, 90% of the 463 students passed. There was no significant increase in scores over the different days the OSCE was administered. 'Wished' variance owing to students was 6.3%. Variance owing to interaction between student and stations and residual error was 66.3%, more than two times larger than variance owing to stations' difficulty (27.4%). The SEM norm was 0.42 with a CI of ±0.83 and the SEM domain was 0.50, with a CI of ±0.98. In order to make reliable relative decisions (SEM <0.51), a minimum of 12 stations is necessary, and for reliable absolute and pass–fail decisions, a minimum of 17 stations is necessary in this dental OSCE. Conclusions: It appeared reliable, when testing large numbers of students, to administer the OSCE on different days. In order to make reliable decisions for this dental OSCE, a minimum of 17 stations is needed. Clearly, wide sampling of stations is at the heart of obtaining reliable scores in OSCE, also in dental education. [source]
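The link between the SEM benchmark and the 95% confidence interval in this abstract can be checked with a few lines of arithmetic. The sketch assumes the usual normal-approximation multiplier of 1.96, which the abstract does not state explicitly; the function name `ci_half_width` is illustrative.

```python
Z_95 = 1.96  # assumed normal-approximation multiplier for a 95% interval


def ci_half_width(sem, z=Z_95):
    """Half-width of the confidence interval around an observed score."""
    return z * sem


# SEM values quoted in the abstract: benchmark, norm and domain perspectives.
for label, sem in (("benchmark", 0.51), ("norm", 0.42), ("domain", 0.50)):
    print(f"{label}: SEM {sem:.2f} -> CI of about +/-{ci_half_width(sem):.2f}")

# The three variance components quoted in the results, in percent.
variance_components = {
    "students": 6.3,
    "student x station interaction and residual": 66.3,
    "station difficulty": 27.4,
}
total_variance = sum(variance_components.values())
```

Under this assumption, an SEM of 0.51 gives a half-width just under 1 and an SEM of 0.50 gives 0.98, as reported. An SEM of 0.42 gives about 0.82 rather than the reported ±0.83, a gap that is probably due to rounding of the published SEM. The three variance components add up to 100%.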

Rethinking the OSCE as a Tool for National Competency Evaluation

EUROPEAN JOURNAL OF DENTAL EDUCATION, Issue 2 2004
M. A. Boyd
The relatively recent curriculum change to Problem-Based Learning/Case-Based Education has stimulated the development of new evaluation tools for student assessment. The Objective Structured Clinical Examination (OSCE) has become a popular method for such assessment. The National Dental Examining Board of Canada (NDEB) began using an OSCE format as part of the national certification testing process for licensure of beginning dentists in Canada in 1996. The OSCE has been well received by provincial licensing authorities, dental schools and students. 'Hands-on' clinical competency is trusted to the dental programs and verified through NDEB participation in the Accreditation process. The desire to refine the OSCE has resulted in the development of a new format. Previously, OSCE stations consisted of case-based materials and related multiple-choice questions. The new format has case-based material with an extended match presentation. Candidates 'select one or more correct answers' from a group of up to 15 options. The blueprint is referenced to the national competencies for beginning practitioners in Canada. This new format will be available to students on the NDEB website for information and study purposes. Question stems and options will remain constant. Case histories and case materials will change each year. This new OSCE will be easier to administer and be less expensive in terms of test development. Reliability and validity are enhanced by involving content experts from all faculties in test development, by having the OSCE verified by general practitioners and by making the format available to candidates. The new OSCE will be pilot tested in September 2004. Examples will be provided for information and discussion. [source]

Reliability of orthostatic responses in healthy men aged between 65 and 75 years

EXPERIMENTAL PHYSIOLOGY, Issue 4 2005
Tim J. Gabbett
The purpose of this study was to investigate the short-, medium- and long-term reproducibility of cardiovascular responses during 90° head-up tilt (HUT) in healthy older men. Twenty-eight healthy male subjects aged 69 (95% confidence intervals, 68–70) years participated in the study. Eight subjects underwent duplicate 90° HUT tests on consecutive days, while 20 subjects underwent four 90° HUT tests performed at baseline, and after 1 week, 1 month and 1 year. Following a 20-min supine resting period, each subject was rapidly tilted to the upright vertical position (90° HUT) and remained in that position for 15 min. Beat-by-beat recordings of mean (MAP), systolic (SBP) and diastolic (DBP) pressures were made via Finapres, while heart rate (HR) was monitored continuously from an electrocardiogram. No significant test–retest differences (P > 0.05) were observed for the changes in HR, MAP, SBP or DBP during 90° HUT. These measurements demonstrated high reproducibility (intraclass correlation coefficient, r = 0.91–0.99, P < 0.05). The supine resting and tilted HR, MAP, SBP and DBP over the 1-week, 1-month and 1-year period were not significantly different (P > 0.05) from baseline, and demonstrated high reproducibility (intraclass correlation coefficient, r = 0.82–0.98, P < 0.05). The results of this study demonstrate that in healthy older men, cardiovascular responses during orthostasis are highly reproducible, and this reproducibility is maintained over a 12-month period. These findings demonstrate that the 90° HUT test offers a reproducible method of monitoring longitudinal orthostatic responses in healthy older men. [source]

The Family Daily Hassles Inventory: A Preliminary Investigation of Reliability and Validity

FAMILY & CONSUMER SCIENCES RESEARCH JOURNAL, Issue 2 2002
Suzanne Z. Rollins
This study investigated the preliminary reliability and validity of a new family daily hassles assessment using two well-established instruments that measure daily hassles and health status. Participants (140 mothers) completed a self-administered survey. The results indicated adequate reliability of the new assessment and supported the concurrent and construct validity of the scores obtained with the new assessment as a measure of daily hassles. [source]