Kappa Coefficient

Kinds of Kappa Coefficient

  • weighted kappa coefficient


  • Selected Abstracts


    Measuring Agreement of Multivariate Discrete Survival Times Using a Modified Weighted Kappa Coefficient

    BIOMETRICS, Issue 1 2009
    Ying Guo
    Summary. Assessing agreement is often of interest in clinical studies to evaluate the similarity of measurements produced by different raters or methods on the same subjects. We present a modified weighted kappa coefficient to measure agreement between bivariate discrete survival times. The proposed kappa coefficient accommodates censoring by redistributing the mass of censored observations within the grid where the unobserved events may potentially happen. A generalized modified weighted kappa is proposed for multivariate discrete survival times. We estimate the modified kappa coefficients nonparametrically through a multivariate survival function estimator. The asymptotic properties of the kappa estimators are established and the performance of the estimators is examined through simulation studies of bivariate and trivariate survival times. We illustrate the application of the modified kappa coefficient in the presence of censored observations with data from a prostate cancer study.


    Interobserver agreement between primary graders and an expert grader in the Bristol and Weston diabetic retinopathy screening programme: a quality assurance audit

    DIABETIC MEDICINE, Issue 8 2009
    S. Patra
    Abstract. Aims: To assess the quality and accuracy of primary grading in the Bristol and Weston diabetic retinopathy screening programme and to set standards for future interobserver agreement reports. Methods: A prospective audit of 213 image sets from six fully trained primary graders in the Bristol and Weston diabetic retinopathy screening programme was carried out over a 4-week period. All the images graded by the primary graders were regraded by an expert grader blinded to the primary grading results and the identity of the primary grader. The interobserver agreement between primary graders and the blinded expert grader and the corresponding Kappa coefficient were determined for overall grading, referable, non-referable and ungradable disease. The audit standard was set at 80% for interobserver agreement with a Kappa coefficient of 0.7. Results: The interobserver agreement bettered the audit standard of 80% in all the categories. The Kappa coefficient was substantial (0.7) for the overall grading results and ranged from moderate to substantial (0.59–0.65) for referable, non-referable and ungradable disease categories. The main recommendation of the audit was to provide refresher training for the primary graders with focus on ungradable disease. Conclusion: The audit demonstrated an acceptable level of quality and accuracy of primary grading in the Bristol and Weston diabetic retinopathy screening programme and provided a standard against which future interobserver agreement can be measured for quality assurance within a screening programme.


    Poorly performing physicians: Does the script concordance test detect bad clinical reasoning?

    THE JOURNAL OF CONTINUING EDUCATION IN THE HEALTH PROFESSIONS, Issue 3 2010
    François Goulet MD
    Abstract. Introduction: Evaluation of poorly performing physicians is a worldwide concern for licensing bodies. The Collège des Médecins du Québec currently assesses the clinical competence of physicians previously identified with potential clinical competence difficulties through a day-long procedure called the Structured Oral Interview (SOI). Two peer physicians produce a qualitative report. In view of remediation activities and the potential for legal consequences, more information on the clinical reasoning process (CRP) and quantitative data on the quality of that process are needed. This study examines the Script Concordance Test (SCT), a tool that provides a standardized and objective measure of a specific dimension of CRP, clinical data interpretation (CDI), to determine whether it could be useful in that endeavor. Methods: Over a 2-year period, 20 family physicians took, in addition to the SOI, a 1-hour paper-and-pencil SCT. Three evaluators, blind as to the purpose of the experiment, retrospectively reviewed SOI reports and were asked to estimate clinical reasoning quality. Subjects were classified into 2 groups (below and above median of the score distribution) for the 2 assessment methods. Agreement between classifications was estimated with the Kappa coefficient. Results: Intraclass correlation for SOI was 0.89. Cronbach alpha coefficient for the SCT was 0.90. Agreement between methods was found for 13 participants (Kappa: 0.30, P = 0.18), but 7 out of 20 participants were classified differently by the two methods. All participants but 1 had SCT scores below 2 SD of panel mean, thus indicating serious deficiencies in CDI. Discussion: The finding that the majority of the referred group did so poorly on CDI tasks has great interest for assessment as well as for remediation. In remediation of prescribing skills, adding SCT to SOI is useful for assessment of cognitive reasoning in poorly performing physicians. The structured oral interview should be improved with more precise reporting by those who assess the clinical reasoning process of examinees, and caution is recommended in interpreting SCT scores; they reflect only a part of the reasoning process.


    Validation of a pre-anaesthetic screening questionnaire

    ANAESTHESIA, Issue 9 2003
    W. G. Hilditch
    Summary. We developed a screening questionnaire to be used by nurses to decide which patients should see an anaesthetist for further evaluation before the day of surgery. Our objective was to measure the accuracy of responses to the questionnaire. Agreement between questionnaire responses and the anaesthetist's assessment was assessed. For questions with a prevalence of 5 to 95%, the Kappa coefficient was used; percentage agreement was used for all other questions. Criterion validity was excellent/good for all questions with a prevalence between 5 and 95%, except for the question 'Do you have kidney disease?' For questions with prevalence < 5%, all demonstrated adequate criterion validity except the questions 'Has anyone in your family had a problem following an anaesthetic?' and 'If you have been put to sleep for an operation were there any anaesthetic problems?' Therefore, it is reasonable for nurses to use this questionnaire to determine which patients an anaesthetist should see before the day of surgery.


    Age- versus time-comparative self-rated health in Hong Kong Chinese older adults

    INTERNATIONAL JOURNAL OF GERIATRIC PSYCHIATRY, Issue 8 2006
    Zhi Bin Li
    Abstract. Objectives: The main objectives were to examine the relation between age-comparative (self vs others of same age) self-rated health (SRH) and time-comparative (self this year vs last year) SRH, and to evaluate which was more strongly associated with specific physical health problems. Methods: Cross-sectional data on two SRH measures and various physical health problems from 18749 male and 37413 female clients aged 65 or over from 18 Elderly Health Centres in Hong Kong were analysed using logistic regression with adjustment for potential confounders. Results: Men were more likely to report 'better' and less likely to report 'worse' SRH than women. 'Normal' was the most common option but the proportions choosing this decreased with age on both SRH measures. There was a fairly weak but statistically significant correlation between these two measures, with Kappa coefficients of 0.125 and 0.167 for men and women, respectively. For both men and women, there were significantly positive linear trends between age-comparative SRH options from 'better' to 'worse' and physical health problems, such as respiratory diseases, musculoskeletal diseases, any active chronic diseases, functional disability, depressive symptoms, taking medication regularly, and admission to hospital last year. However, for time-comparative SRH, those who rated 'normal' had smaller odds ratios for all of the physical health problems above than those who rated 'better' or 'worse'. Conclusions: The two SRH measures correlated with each other weakly but significantly. Age-comparative SRH was linearly, and time-comparative SRH was curvilinearly, associated with physical health problems. Copyright © 2006 John Wiley & Sons, Ltd.


    Interrater reliability of the Psychiatric Research Interview for Substance and Mental Disorders in an HIV-infected cohort: experience of the National NeuroAIDS Tissue Consortium

    INTERNATIONAL JOURNAL OF METHODS IN PSYCHIATRIC RESEARCH, Issue 3 2006
    S. Morgello
    Abstract. The interrater reliability of the Psychiatric Research Interview for Substance and Mental Disorders (PRISM) was assessed in a multicentre study. Four sites of the National NeuroAIDS Tissue Consortium performed blinded reratings of audiotaped PRISM interviews of 63 HIV-infected patients. Diagnostic modules for substance-use disorders and major depression were evaluated. Seventy-six per cent of the patient sample displayed one or more substance-use disorder diagnoses and 54% had major depression. Kappa coefficients for lifetime histories of substance abuse or dependence (cocaine, opiates, alcohol, cannabis, sedative, stimulant, hallucinogen) and major depression ranged from 0.66 to 1.00. Overall the PRISM was reliable in assessing both past and current disorders except for current cannabis disorders when patients had concomitant cannabinoid prescriptions for medical therapy. The reliability of substance-induced depression was poor to fair although there was a low prevalence of this diagnosis in our group. We conclude that the PRISM yields reliable diagnoses in a multicentre study of substance-experienced, HIV-infected individuals. Copyright © 2006 John Wiley & Sons, Ltd.


    Guidelines for the descriptive presentation and statistical analysis of contact allergy data

    CONTACT DERMATITIS, Issue 2 2004
    Wolfgang Uter
    The present guidelines aim to support clinical researchers in adequately presenting data on contact allergy, and in using statistical tests appropriate for their data. A description of the mode of selection of patients, and of their relevant demographic details, is an essential prerequisite for the correct interpretation of study results. Proportions and rates, if regarded as estimates of these parameters in a target population, should normally be supplemented with confidence intervals to address precision. Concordance, i.e., agreement between two ratings in a dependent sample, must be quantified with a chance-corrected measure such as Cohen's kappa coefficient. If the diagnostic quality of an outcome is being assessed, standard measures like sensitivity and specificity, as well as the prevalence-dependent positive and negative predictive values, should be calculated. Often, contact allergy to a certain substance depends on several factors. In this situation, depending on the research question, techniques like stratification, standardization or multifactorial analysis should be employed. With increasing complexity of statistical description and analysis, consulting with a biostatistician is often mandatory.
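
    Illustration (not part of the abstract): a minimal Python sketch of the chance-corrected agreement measure named above, Cohen's kappa, computed as (p_o - p_e) / (1 - p_e), where p_o is the observed agreement and p_e the agreement expected by chance from the two raters' marginal frequencies. The function name cohen_kappa and the ratings are hypothetical and invented for this example; they do not come from the guidelines.

```python
from collections import Counter


def cohen_kappa(ratings_a, ratings_b):
    """Cohen's kappa for two equally long sequences of categorical ratings."""
    if len(ratings_a) != len(ratings_b) or len(ratings_a) == 0:
        raise ValueError("need two non-empty rating sequences of equal length")
    n = len(ratings_a)

    # Observed proportion of subjects on which the two ratings agree.
    p_o = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n

    # Chance agreement: product of the marginal proportions, summed over categories.
    count_a = Counter(ratings_a)
    count_b = Counter(ratings_b)
    p_e = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)

    if p_e == 1.0:
        raise ValueError("kappa is undefined when both raters use a single category")
    return (p_o - p_e) / (1.0 - p_e)


# Invented patch-test readings for ten patients, read twice.
first = ["pos", "pos", "neg", "neg", "pos", "neg", "neg", "neg", "pos", "neg"]
second = ["pos", "neg", "neg", "neg", "pos", "neg", "pos", "neg", "pos", "neg"]
print(round(cohen_kappa(first, second), 3))  # 0.583 (p_o = 0.80, p_e = 0.52)
```

    Raw agreement here is 80%, but more than half of that is expected by chance alone, which is why the chance-corrected value is noticeably lower.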


    The metabolic syndrome and changing relationship between blood pressure and insulin with age, as observed in Aboriginal and Torres Strait Islander peoples

    DIABETIC MEDICINE, Issue 11 2005
    A. E. Schutte
    Abstract. Aims: To determine the prevalence of the metabolic syndrome (MS) among Aboriginal and Torres Strait Islander peoples. A further objective was to investigate the relationships between fasting insulin and blood pressure (BP) within these groups with increasing age. Methods: A cross-sectional population-based study included 369 Torres Strait Islanders (residing in Torres Strait and Far North Queensland), and 675 Aborigines from central Australia. Data necessary for classification of MS were collected, including fasting and 2-h glucose and insulin, urinary albumin and creatinine, anthropometric measurements, BP, and serum lipids. Results: The ATPIII criteria classified 43% of Torres Strait Islanders and 44% of Aborigines with MS, whereas 32 and 28%, respectively, had the MS according to WHO criteria. Agreement between the two criteria was only modest (kappa coefficient from 0.28 to 0.57). Factor analyses indicated no cluster including both insulin and BP in either population. Significant correlations (P < 0.05) [adjusted for gender, body mass index (BMI) and waist circumference] were observed between BP and fasting insulin: a positive correlation for Torres Strait Islanders aged 15–29 years, and an inverse correlation for Aborigines aged 40 years and older. Conclusion: Torres Strait Islanders and Aborigines had very high prevalences of the MS. Specific population characteristics (high prevalences of central obesity, dyslipidaemia, renal disease) may make the WHO definition preferable to the ATPIII definition in these population groups. The poor agreement between criteria suggests a more precise definition of the metabolic syndrome that is applicable across populations is required. This study showed an inverse relationship with age for the correlation of BP and fasting insulin.


    Evaluation of agreement between conventional and liquid-based cytology in cervical cancer early detection based on analysis of 2,091 smears: Experience at the Brazilian National Cancer Institute

    DIAGNOSTIC CYTOPATHOLOGY, Issue 9 2007
    Vania Reis Girianelli M.Sc.
    Abstract. The aim of this work was to evaluate the agreement between conventional cytology (CC) and liquid-based cytology (LBC) in cervical cancer early detection. The results of CC and LBC (DNACitoliq®) in 2,059 women aged 25–59 years were compared. The percent agreement, kappa coefficient, prevalence-adjusted bias-adjusted kappa coefficient (PABAK), and Chamberlain's percent positive agreement (PPA) were calculated. The percent agreement between the two methods was very good (80% and 79%, respectively, for normal versus ASCUS+; and normal versus ASCUS, AGUS and LSIL+ vs. HSIL+). The kappa coefficient indicates slight agreement (0.26 and 0.23, respectively), but when PABAK was used the agreement was good (0.61 and 0.68, respectively). PPA was high for normal results (79.2%) and low for the remaining categories. To conclude, in this study, agreement between LBC and CC was only good for normal results, which involves the majority of cases and positively influences the overall agreement rate. Diagn. Cytopathol. 2007;35:545-549. © 2007 Wiley-Liss, Inc.
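
    Illustration (not part of the abstract): a gap between kappa and PABAK like the one reported above is typical when one category dominates. The Python sketch below assumes a two-category comparison, for which PABAK reduces to 2 * p_o - 1, and uses invented counts rather than the study's data. The function name kappa_and_pabak and its argument names are hypothetical.

```python
def kappa_and_pabak(both_neg, a_neg_b_pos, a_pos_b_neg, both_pos):
    """Return (p_o, kappa, PABAK) for a 2x2 agreement table of methods A and B."""
    n = both_neg + a_neg_b_pos + a_pos_b_neg + both_pos
    if n == 0:
        raise ValueError("empty table")

    # Observed agreement: both methods negative or both positive.
    p_o = (both_neg + both_pos) / n

    # Marginal proportion of negative results for each method.
    a_neg = (both_neg + a_neg_b_pos) / n
    b_neg = (both_neg + a_pos_b_neg) / n

    # Agreement expected by chance from the marginals.
    p_e = a_neg * b_neg + (1 - a_neg) * (1 - b_neg)
    if p_e == 1.0:
        raise ValueError("kappa is undefined when all results fall in one cell")

    kappa = (p_o - p_e) / (1 - p_e)
    pabak = 2 * p_o - 1  # two-category form: chance agreement fixed at 0.5
    return p_o, kappa, pabak


# Invented counts: 100 smears, most of them normal by both methods.
p_o, kappa, pabak = kappa_and_pabak(both_neg=78, a_neg_b_pos=10,
                                    a_pos_b_neg=10, both_pos=2)
print(f"agreement={p_o:.2f} kappa={kappa:.3f} PABAK={pabak:.2f}")
# agreement=0.80 kappa=0.053 PABAK=0.60
```

    With 80% raw agreement, kappa is close to zero because chance agreement is already about 79% under such skewed marginals, whereas PABAK replaces the marginals with a uniform split and therefore tracks raw agreement.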


    Test–re-test reliability of DSM-IV adopted criteria for 3,4-methylenedioxymethamphetamine (MDMA) abuse and dependence: a cross-national study

    ADDICTION, Issue 10 2009
    Linda B. Cottler
    ABSTRACT. Aims: This study evaluated the prevalence and reliability of DSM-IV adopted criteria for 3,4-methylenedioxymethamphetamine (MDMA) abuse and dependence with a purpose to determine whether it is best conceptualized within the category of hallucinogens, amphetamines or its own category. Design: Test–re-test study. Participants: MDMA users (life-time use >5 times) were recruited in St Louis, Miami and Sydney (n = 593). The median life-time MDMA consumption was 50 pills at the baseline. Measurements: The computerized Substance Abuse Module for Club Drug (CD-SAM) was used to assess MDMA abuse and dependence. The Discrepancy Interview Protocol (DIP) was used to determine the reasons for the discrepant responses between the two interviews. Reliability of diagnoses, individual diagnostic criteria and withdrawal symptoms was examined using the kappa coefficient (κ). Findings: For baseline data, 15% and 59% met MDMA abuse and dependence, respectively. Substantial test–re-test reliability of the diagnoses was observed consistently across cities (κ = 0.69). 'Continued use despite knowledge of physical/psychological problems' (87%) and 'withdrawal' (68%) were the two most prevalent dependence criteria. 'Physically hazardous use' was the most prevalent abuse criterion. Six dependence criteria and all abuse criteria were reported reliably across cities (κ: 0.53–0.77). Seventeen of 19 withdrawal symptoms showed consistency in the reliability across cities. The most commonly reported reason for discrepant responses was 'interpretation of question changed'. Only a small proportion of the total discrepancies were attributed to lying or social desirability. Conclusion: The adopted DSM-IV diagnostic classification for MDMA abuse and dependence was moderately reliable across cities. Findings on MDMA withdrawal support the argument that MDMA should be separated from other hallucinogens in DSM.


    The Danish Prostatic Symptom Score (DAN-PSS-1) questionnaire is reliable in stroke patients

    NEUROUROLOGY AND URODYNAMICS, Issue 4 2006
    Sigrid Tibaek
    Abstract. Aims: To investigate the test–retest reliability of the Danish Prostatic Symptom Score (DAN-PSS-1) questionnaire in a sample of stroke patients. Methods: A prospective study design was used in which the stroke patients were invited to complete a postal self-administered DAN-PSS-1 questionnaire twice. The questionnaire consists of 12 questions related to lower urinary tract symptoms (LUTS). The participants were asked to state the frequency and severity of their symptoms (symptom score) and its impact on their daily life (bother score). Seventy-one stroke patients were included and 59 (83%) answered the questionnaire twice. Reliability was tested in two respects: (a) to detect the frequency of each symptom and its bother factor, the scores were reduced to a two-category scale (= 0, > 0) and simple kappa statistics were used; (b) to detect the severity of each symptom and its bother factor, the total scale (0–3) and weighted kappa statistics were used. Results: The proportion of agreement for the frequency symptom scores ranged from 76% to 97% and the simple kappa coefficient ranged from poor (κ = 0.00) to excellent (κ = 0.91). The proportion of agreement for the corresponding bother scores ranged from 76% to 95% and the simple kappa coefficient ranged from good (κ = 0.61) to excellent (κ = 0.84). The weighted kappa coefficient for the severity symptom scores ranged from moderate (κw = 0.43) to good (κw = 0.75) and the corresponding bother scores ranged from moderate (κw = 0.48) to good (κw = 0.68). Conclusions: The DAN-PSS-1 questionnaire had acceptable test–retest reliability and may be suitable for measuring the frequency and severity of LUTS and its bother factor in stroke patients. Neurourol. Urodynam. © 2006 Wiley-Liss, Inc.


    Lack of agreement between rheumatologists in defining digital ulceration in systemic sclerosis

    ARTHRITIS & RHEUMATISM, Issue 3 2009
    Ariane L. Herrick
    Objective: To test the intra- and interobserver variability, among clinicians with an interest in systemic sclerosis (SSc), in defining digital ulcers. Methods: Thirty-five images of finger lesions, incorporating a wide range of abnormalities at different sites, were duplicated, yielding a data set of 70 images. Physicians with an interest in SSc were invited to take part in the Web-based study, which involved looking through the images in a random sequence. The sequence differed for individual participants and prevented cross-checking with previous images. Participants were asked to grade each image as depicting "ulcer" or "no ulcer," and if "ulcer," then either "inactive" or "active." Images of a range of exemplar lesions were available for reference purposes while participants viewed the test images. Intrarater reliability was assessed using a weighted kappa coefficient with quadratic weights. Interrater reliability was estimated using a multirater weighted kappa coefficient. Results: Fifty individuals (most of them rheumatologists) from 15 countries participated in the study. There was a high level of intrarater reliability, with a mean weighted kappa value of 0.81 (95% confidence interval [95% CI] 0.77, 0.84). Interrater reliability was poorer (weighted κ = 0.46 [95% CI 0.35, 0.57]). Conclusion: The poor interrater reliability suggests that if digital ulceration is to be used as an end point in multicenter clinical trials of SSc, then strict definitions must be developed. The present investigation also demonstrates the feasibility of Web-based studies, for which large numbers of participants can be recruited over a short time frame.
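
    Illustration (not part of the abstract): a minimal Python sketch of a two-rating weighted kappa with quadratic weights, the kind of statistic used above for intrarater reliability on an ordered scale. It assumes a square table whose rows and columns follow the category order (for instance no ulcer, inactive ulcer, active ulcer); the function name weighted_kappa and the counts are hypothetical, and the multirater version mentioned in the abstract is not covered.

```python
def weighted_kappa(table, weights="quadratic"):
    """Weighted kappa from a k x k table of counts for ordered categories.

    table[i][j] is the number of items graded i on the first reading and
    j on the second. Disagreement weights are ((i - j) / (k - 1)) ** 2 for
    "quadratic" and abs(i - j) / (k - 1) for "linear".
    """
    k = len(table)
    if k < 2 or any(len(row) != k for row in table):
        raise ValueError("table must be square with at least two categories")
    if weights not in ("quadratic", "linear"):
        raise ValueError("weights must be 'quadratic' or 'linear'")

    n = sum(sum(row) for row in table)
    row_tot = [sum(row) for row in table]
    col_tot = [sum(table[i][j] for i in range(k)) for j in range(k)]

    observed = 0.0  # weighted observed disagreement
    expected = 0.0  # weighted disagreement expected by chance
    for i in range(k):
        for j in range(k):
            d = abs(i - j) / (k - 1)
            w = d * d if weights == "quadratic" else d
            observed += w * table[i][j] / n
            expected += w * row_tot[i] * col_tot[j] / (n * n)

    if expected == 0.0:
        raise ValueError("kappa is undefined when only one category is used")
    return 1.0 - observed / expected


# Invented counts: 70 images graded twice by the same reader.
readings = [
    [30, 4, 1],   # first reading: no ulcer
    [3, 12, 3],   # first reading: inactive ulcer
    [0, 2, 15],   # first reading: active ulcer
]
print(round(weighted_kappa(readings, "quadratic"), 2))
print(round(weighted_kappa(readings, "linear"), 2))
```

    Quadratic weights penalise a two-step disagreement (no ulcer versus active ulcer) four times as heavily as a one-step disagreement, so near-misses cost relatively little; for a two-category table the weighted and unweighted coefficients coincide.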


    Radiologic features in juvenile idiopathic arthritis: A first step in the development of a standardized assessment method

    ARTHRITIS & RHEUMATISM, Issue 2 2003
    Marion A. J. Van Rossum
    Objective: To describe radiologic features of patients with juvenile idiopathic arthritis (JIA) in a standardized manner, to test the reliability and feasibility of this description, and to correlate these features with clinical signs as a first step in the development of a standardized assessment method. Methods: The placebo-controlled study of sulfasalazine in patients with oligoarticular, extended oligoarticular, and polyarticular JIA performed by the Dutch Juvenile Idiopathic Arthritis Study Group yielded the data for this study. All trial entry radiographs (clinically involved joints and contralateral joints) were scored (in consensus by a skeletal radiologist and pediatric rheumatologist) for the presence of swelling, osteopenia, joint space narrowing, growth abnormalities, subchondral bone cysts, erosions, and malalignment. Results: Data on 67 of 69 patients were analyzed. The mean age was 9.1 years (range 2.5–17.6 years), and the median disease duration was 24 months (range 5–176 months). Thirteen percent of the patients were IgM rheumatoid factor (IgM-RF) positive, and 16% were HLA–B27 positive. All 68 clinically evaluated joints were included in the maximum of 19 radiographed joints (or joint groups) per patient. The mean number of radiographed joints per patient was 7 (range 2–15); knees, hands, ankles, and feet were most frequently affected. Fifty-eight patients (87%) had radiologic abnormalities in at least one joint (soft-tissue swelling in 63% of patients, growth disturbances in 48%, joint space narrowing in 28%, and erosions in 15%). In total, half of the radiographs of the clinically involved joints showed radiologic abnormalities, including two-thirds of the radiographs of the clinically affected hands and knees. Univariate analysis revealed a good correlation between the overall articular (clinical) severity and the presence of radiologic abnormalities (odds ratio [OR] 1.38, P < 0.0001). Multivariate analysis showed increased ORs for the presence of radiologic abnormalities and IgM-RF positivity (OR 4.6, P = 0.005) or HLA–B27 positivity (OR 3.0, P = 0.004). In general, reproducibility of the radiologic scoring method was good (mean kappa coefficient of 0.74 [range 0.40–0.86]), although there were scoring discrepancies for swelling, osteopenia, and growth disturbances. The scoring took 10–20 minutes per patient. Conclusion: Our model of describing and scoring radiologic abnormalities of radiographed joints in JIA was feasible, mostly reproducible, correlated well with the overall articular severity score, and added substantial new information not available on clinical examination.


    Evaluation of the new Ocuton S tonometer

    ACTA OPHTHALMOLOGICA, Issue 2 2002
    Giorgio Marchini
    ABSTRACT. Purpose: To evaluate the intra- and interobserver variability of the Ocuton S tonometer, its correlation with Goldmann tonometry, the reliability of self-tonometry and the safety of the instrument. Methods: Thirty-five healthy subjects and 45 patients with primary open-angle glaucoma (POAG), aged from 38 to 80 years (mean age: 64.6 ± 12.2 years), underwent tonometry with the Ocuton S tonometer in one eye chosen at random. The intra- and interobserver variability between two operators (kappa coefficient), the Ocuton S/Goldmann correlation and the reliability of self-tonometry were evaluated by performing two tonometries on each patient in subgroups. Each tonometry was considered as the mean of three consecutive measurements. Central ultrasonic pachymetry, keratometry and corneal biomicroscopy were also evaluated. Results: The intra- and interobserver variability ranged from 0.38 to 0.66. The difference between the means of intraocular pressure (IOP) with the Ocuton S (24.4 ± 4.7 mmHg) and the Goldmann tonometer (18.1 ± 4.7 mmHg) was statistically significant (p < 0.0001). Linear regression analysis revealed a good Ocuton S/Goldmann correspondence (r = 0.88, p = 0.0001). However, IOP values detected with the Ocuton were consistently overestimated, compared to those detected with the Goldmann tonometer. The correlation between corneal thickness and IOP was statistically significant both for the Goldmann (r = 0.510, p = 0.021) and for the Ocuton S tonometer (r = 0.520, p = 0.019). No correlation was found between keratometry and IOP. The mean measurement obtained by self-tonometry (21.9 ± 3.6 mmHg) showed no statistically significant difference when compared to the mean measurement obtained by an expert operator (21.3 ± 3.4 mmHg). Conclusion: The Ocuton S tonometer is a safe instrument that can be used easily by the patient. However, in comparison to the Goldmann tonometer, it overestimates IOP and requires further technical and methodological refinements in order to ensure greater reliability.


    Reproducibility and accuracy of the ICDAS-II for occlusal caries detection

    COMMUNITY DENTISTRY AND ORAL EPIDEMIOLOGY, Issue 5 2009
    Michele Baffi Diniz
    Abstract. Objectives: The aim of this in vitro study was to assess the inter- and intra-examiner reproducibility and the accuracy of the International Caries Detection and Assessment System-II (ICDAS-II) in detecting occlusal caries. Methods: One hundred and sixty-three molars were independently assessed twice by two experienced dentists using the 0- to 6-graded ICDAS-II. The teeth were histologically prepared and classified using two different histological systems [Ekstrand et al. (1997) Caries Research vol. 31, pp. 224–231; Lussi et al. (1999) Caries Research vol. 33, pp. 261–266] and assessed for caries extension. Sensitivity, specificity, accuracy and area under the ROC curve (Az) were obtained at D2 and D3 thresholds. Unweighted kappa coefficient was used to assess inter- and intra-examiner reproducibility. Results: For the Ekstrand et al. histological classification the sensitivity was 0.99 and 1.00, specificity 1.00 and 0.69 and accuracy 0.99 and 0.76 at D2 and D3, respectively. For the Lussi et al. histological classification the sensitivity was 0.91 and 0.75, specificity 0.47 and 0.62 and accuracy 0.86 and 0.68 at D2 and D3, respectively. The Az varied from 0.54 to 0.73. The inter- and intra-examiner kappa values were 0.51 and 0.58, respectively. Conclusions: ICDAS-II presented good reproducibility and accuracy in detecting occlusal caries, especially caries lesions in the outer half of the enamel.


    ENDOSCOPIC DEFINITION OF ESOPHAGOGASTRIC JUNCTION FOR DIAGNOSIS OF BARRETT'S ESOPHAGUS: IMPORTANCE OF SYSTEMATIC EDUCATION AND TRAINING

    DIGESTIVE ENDOSCOPY, Issue 4 2009
    Norihisa Ishimura
    The diagnosis of Barrett's esophagus (BE) requires an accurate recognition of the columnar-lined esophagus at endoscopy. However, a universally accepted standardized endoscopic grading system of BE was lacking prior to the development of the Prague 'circumferential and maximal' criteria. In this system, the landmark for the esophagogastric junction (EGJ) is the proximal end of the gastric folds, not the distal end of the palisade vessels, which are used to endoscopically identify the EGJ in Japan. Although the circumferential and maximal criteria are clinically relevant, an important shortcoming of this system may be failure to identify short-segment BE, a lesion that is found frequently in the Japanese. To compare the diagnostic yield for BE when using the palisade vessels versus gastric folds as a landmark for the EGJ, we evaluated interobserver diagnostic concordance. The endoscopic identification of the EGJ using both landmarks resulted in unacceptably low kappa coefficients of reliability. However, there was a statistically significant improvement after the participants were thoroughly trained in identification of the EGJ during the endoscopic study. Although it remains controversial which landmark is better for the endoscopic diagnosis of BE, it is important to systematically educate and train endoscopists in order to improve diagnostic consistency in patients with BE.


    The psychometric properties of the Miller Behavioural Style Scale with adult daughters of women with early breast cancer: a literature review and empirical study

    JOURNAL OF ADVANCED NURSING, Issue 2 2000
    Charlotte E. Rees BSc PhD
    Several researchers have suggested that the information-seeking behaviours of patients need to be taken into consideration when assessing their information needs. This study reviews published evidence of the psychometric properties of the Miller Behavioural Style Scale (MBSS), a tool commonly used to identify the information-seeking behaviours of individuals under threat, and examines its reliability and validity with adult daughters of women with early breast cancer. Ninety-seven adult daughters completed the MBSS and a 30-item, self-administered questionnaire, a tool designed to explore the information needs of adult daughters of women with breast cancer. The internal consistency of the monitoring and blunting sub-scales of the MBSS was α = 0·65 and 0·41 respectively. The blunting sub-scale fell substantially below acceptable limits and was discarded from subsequent analyses. The monitoring sub-scale possessed good test–retest reliability (n = 17) with a 5-week time interval (r = 0·71, P < 0·005), as measured using a Pearson's correlation coefficient. Furthermore, the majority (73·4%) of monitoring items possessed moderate or substantial test–retest reliability, as indicated by kappa coefficients. Finally, the monitoring sub-scale possessed good construct validity, both discriminant and convergent validity, as measured by the univariate associations between monitoring behaviour and selected items from the information questionnaire and a demographic questionnaire. In conclusion, adequate support exists for the psychometric properties of the monitoring sub-scale of the MBSS and its use with adult daughters of women with early breast cancer in future research. These findings have a number of implications for nursing research and these are discussed in this paper.


    Optometric glaucoma referrals – measures of effectiveness and implications for screening strategy

    OPHTHALMIC AND PHYSIOLOGICAL OPTICS, Issue 6 2000
    Jim Gilchrist
    Summary The effectiveness of disease screening is conventionally evaluated using the epidemiological indices of sensitivity and specificity, which measure the association between screening test results and final diagnoses of all the patients screened. The effectiveness of optometric glaucoma referrals cannot be measured using such indices because diagnoses are obtained only on patients who are referred, while the true disease status of those not referred remains unknown. Instead, glaucoma referral effectiveness has been evaluated using measures of 'detection rate', the proportion of those screened who are correctly referred, and 'referral accuracy', the proportion of those referred who are correctly referred. Examination of these operational measures shows that their obtainable values and, hence, their interpretation are influenced by the total proportions of diseased and referred patients, one or both of which will generally be unavailable in evaluating samples of referrals. On the other hand, if valid estimates of these proportions can be obtained from other sources, it is possible to rescale detection rate and referral accuracy to take account of them. This rescaling produces a pair of weighted kappa coefficients, chance-corrected measures of association between referral and diagnosis, which provide a better indication of true referral effectiveness than other measures. An important consequence of this approach is that it provides a clear quantitative illustration of the need for a dual strategy to improve the overall quality of optometric glaucoma screening: widespread adoption of more comprehensive modes of screening to improve accuracy, together with a significant increase in the total numbers of patients screened to improve detection. In order for detection rates to reach desirable levels, the total number of referrals in any sub-population of patients must match or exceed the number of patients with disease. This analysis confirms quantitatively that which is intuitively obvious: not only that glaucoma awareness and uptake of screening opportunities must be encouraged in all patients over 40 years of age, but also that the older and/or more at risk patients are, the greater is their need to take advantage of glaucoma screening. [source]


    The Measurement of the QT and QTc on the Neonatal and Infant Electrocardiogram: A Comprehensive Reliability Assessment

    ANNALS OF NONINVASIVE ELECTROCARDIOLOGY, Issue 2 2009
    Robert M. Gow M.B., B.S.
    Background: An electrocardiogram has been proposed to screen for prolonged QT interval that may predispose infants to sudden death in the first year of life. Understanding the reliability of QT interval measurement will inform the design of a screening program. Methods: Three pediatric cardiologists measured the QT/RR intervals on 60 infant electrocardiograms (median age 46 days), from leads II, V5 and V6 on three separate occasions, 7 days apart, according to a standard protocol. The QTc was corrected by Bazett's (QTcB), Fridericia's (QTcFrid), and Hodges' (QTcH) formulae. Intraobserver and interobserver reliability were assessed by intraclass correlation coefficients (ICC), limits of agreement and repeatability coefficients for single, average of two and average of three measures. Agreement for QTc prolongation (> 440 msec) was assessed by kappa coefficients. Results: QT interval intraobserver ICC was 0.86 and repeatability coefficient was 25.9 msec; interobserver ICC increased from 0.88 for single observations to 0.94 for the average of 3 measurements and repeatability coefficients decreased from 22.5 to 16.7 msec. For QTcB, intraobserver ICC was 0.67, and repeatability was 39.6 msec. Best interobserver reliability for QTcB was for the average of three measurements (ICC 0.83, reproducibility coefficient 25.8 msec), with further improvement for QTcH (ICC 0.92, reproducibility coefficient 16.69 msec). Maximum interobserver kappa for prolonged QTc was 0.77. Misclassification around specific cut points occurs because of the repeatability coefficients. Conclusions: Uncorrected QT measures are more reliable than QTcB and QTcFrid. An average of three independent measures provides the most reliable QT and QTc measurements, with QTcH better than QTcB. [source]
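
The three corrections named in the abstract can be sketched as follows, using the commonly cited textbook forms of the Bazett, Fridericia and Hodges formulae. This is an illustration under stated assumptions, not the study's measurement protocol: the function names, the example QT and RR values, and the choice of units (QT in milliseconds, RR in seconds) are hypothetical; only the 440 msec cut point comes from the abstract.

```python
import math

PROLONGED_QTC_MS = 440.0  # cut point quoted in the abstract

def qtc_bazett(qt_ms, rr_s):
    """Bazett: QT divided by the square root of the RR interval (seconds)."""
    return qt_ms / math.sqrt(rr_s)

def qtc_fridericia(qt_ms, rr_s):
    """Fridericia: QT divided by the cube root of the RR interval (seconds)."""
    return qt_ms / rr_s ** (1.0 / 3.0)

def qtc_hodges(qt_ms, rr_s):
    """Hodges: linear correction of 1.75 ms per beat/min away from 60 beats/min."""
    heart_rate = 60.0 / rr_s
    return qt_ms + 1.75 * (heart_rate - 60.0)

# Hypothetical infant tracing: QT 280 ms at RR 0.4 s (150 beats/min).
qt_ms, rr_s = 280.0, 0.4
results = {
    "QTcB": qtc_bazett(qt_ms, rr_s),
    "QTcFrid": qtc_fridericia(qt_ms, rr_s),
    "QTcH": qtc_hodges(qt_ms, rr_s),
}
for name, value in results.items():
    flag = "prolonged" if value > PROLONGED_QTC_MS else "not prolonged"
    print(f"{name}: {value:.1f} ms ({flag})")
```

With these made-up values only the Bazett correction crosses the 440 msec cut point, which illustrates how the choice of formula can move a single tracing across a classification threshold.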


    Audit of antibiotic prescribing in two governmental teaching hospitals in Indonesia

    CLINICAL MICROBIOLOGY AND INFECTION, Issue 7 2008
    U. Hadi
    Abstract This article estimates the magnitude and quality of antibiotic prescribing in Indonesian hospitals and aims to identify demographic, socio-economic, disease-related and healthcare-related determinants of use. An audit on antibiotic use of patients hospitalized for 5 days or more was conducted in two teaching hospitals (A and B) in Java. Data were collected by review of records on the day of discharge. The method was validated through concurrent data collection in Hospital A. Multivariate logistic regression analysis was performed to determine variables to explain antibiotic prescribing. Prescriptions were assessed by three reviewers using standardized criteria. A high proportion (84%) of 999 patients (499 in Hospital A and 500 in Hospital B) received an antibiotic. Prescriptions could be categorized as therapeutic (53%) or prophylactic (15%), but for 32% the indication was unclear. Aminopenicillins accounted for 54%, and cephalosporins (mostly third generation) for 17%. The average level of antibiotic use amounted to 39 DDD/100 patient-days. Validation revealed that 30% of the volume could be underestimated due to incompleteness of the records. Predictors of antibiotic use were diagnosis of infection, stay in surgical or paediatric departments, low-cost nursing care, and urban residence. Only 21% of prescriptions were considered to be definitely appropriate; 15% were inappropriate regarding choice, dosage or duration, and 42% of prescriptions, many for surgical prophylaxis and fever without diagnosis of infection, were deemed to be unnecessary. Agreement among assessors was low (kappa coefficients 0.13–0.14). Despite methodological limitations, recommendations could be made to address the need for improving diagnosis, treatment and drug delivery processes in this setting. [source]


    The International Caries Detection and Assessment System (ICDAS): an integrated system for measuring dental caries

    COMMUNITY DENTISTRY AND ORAL EPIDEMIOLOGY, Issue 3 2007
    A. I. Ismail
    Abstract This paper describes early findings of evaluations of the International Caries Detection and Assessment System (ICDAS) conducted by the Detroit Center for Research on Oral Health Disparities (DCR-OHD). The lack of consistency among the contemporary criteria systems limits the comparability of outcomes measured in epidemiological and clinical studies. The ICDAS criteria were developed by an international team of caries researchers to integrate several new criteria systems into one standard system for caries detection and assessment. Using ICDAS in the DCR-OHD cohort study, dental examiners first determined whether a clean and dry tooth surface is sound, sealed, restored, crowned, or missing. Afterwards, the examiners classified the carious status of each tooth surface using a seven-point ordinal scale ranging from sound to extensive cavitation. Histological examination of extracted teeth found increased likelihood of carious demineralization in dentin as the ICDAS codes increased in severity. The criteria were also found to have discriminatory validity in analyses of social, behavioral and dietary factors associated with dental caries. The reliability of six examiners in classifying tooth surfaces by their ICDAS carious status ranged from good to excellent (kappa coefficients ranged between 0.59 and 0.82). While further work is still needed to define caries activity, validate the criteria and their reliability in assessing dental caries on smooth surfaces, and develop a classification system for assessing preventive and restorative treatment needs, this early evaluation of the ICDAS platform has found that the system is practical; has content validity, correlational validity with histological examination of pits and fissures in extracted teeth; and discriminatory validity. [source]
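
Because the carious status is scored on a seven-point ordinal scale, a weighted kappa is one natural way to give partial credit when two examiners differ by a single code. The abstract does not say whether the reported coefficients were weighted, so the following is only a sketch under that assumption: the function name `linear_weighted_kappa`, the linear weighting scheme, and the two lists of codes are all hypothetical.

```python
def linear_weighted_kappa(first, second, n_categories):
    """Weighted kappa with linear disagreement weights |i - j| / (k - 1).

    Ratings must be integer codes in the range 0 .. n_categories - 1.
    """
    if len(first) != len(second) or not first:
        raise ValueError("ratings must be paired and non-empty")
    if n_categories < 2:
        raise ValueError("need at least two categories")
    n, k = len(first), n_categories
    # Cross-tabulate the two examiners' codes.
    counts = [[0] * k for _ in range(k)]
    for a, b in zip(first, second):
        counts[a][b] += 1
    row = [sum(counts[i]) for i in range(k)]
    col = [sum(counts[i][j] for i in range(k)) for j in range(k)]
    observed = 0.0  # weighted disagreement actually observed
    expected = 0.0  # weighted disagreement expected from the marginals alone
    for i in range(k):
        for j in range(k):
            weight = abs(i - j) / (k - 1)
            observed += weight * counts[i][j] / n
            expected += weight * row[i] * col[j] / (n * n)
    if expected == 0.0:
        raise ValueError("kappa is undefined when expected disagreement is 0")
    return 1.0 - observed / expected

# Hypothetical codes (0 = sound ... 6 = extensive cavitation) from two examiners.
examiner_a = [0, 0, 1, 2, 2, 3, 4, 5, 6, 6]
examiner_b = [0, 1, 1, 2, 3, 3, 4, 4, 6, 5]
kappa_w = linear_weighted_kappa(examiner_a, examiner_b, n_categories=7)
print(f"linear weighted kappa = {kappa_w:.2f}")
```

In this made-up example the examiners agree exactly on six of ten surfaces and differ by one code on the rest, so the weighted coefficient stays high even though exact agreement is only 60%.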