Interobserver Variation (interobserver + variation)
Selected Abstracts

Rapid review of liquid-based smears as a quality control measure
DIAGNOSTIC CYTOPATHOLOGY, Issue 3 2004
Sheryl Henderson M.Med.Sc.(Cytol.)
Abstract The objective of this study was to investigate the effectiveness of a standardized method of rapid review (RR) of monolayer preparations for the identification of abnormalities, the presence of an endocervical component and infectious agents. A total of 200 ThinPrep (Cytyc, Boxborough, MA) slides representing the spectrum of abnormalities commonly encountered in cervical/vaginal cytologic specimens was retrieved from archive. The study set comprised 129 cases within normal limits (WNL); 36 low-grade epithelial abnormalities (LGEA); 28 high-grade epithelial abnormalities (HGEA), including 2 endocervical adenocarcinomas in situ (AIS) and 7 carcinomas. Eighteen false negative (FN) cases were also included for study. Originally missed on initial review, these cases were found to be abnormal on quality control review (17 LGEA; 1 AIS). Commonly encountered infectious agents were represented and included Candida albicans, Trichomonas vaginalis, herpes simplex virus, and Actinomyces. The slides were reviewed using a standardized method of RR (turret technique, for 60 sec) by three experienced screeners masked to the original reference diagnosis. Median sensitivity for LGEA was 70% (range, 67–72%); HGEA, 69% (range, 54–80%); and FN, 65% (range, 56–78%). Specificity remained high, median specificity for LGEA was 95%; HGEA, 97%; and FN, 100%. There was no significant overcalling of any diagnostic category. The chi-square test at P < 0.05 showed no significant difference between RR and full manual rescreen of the ThinPrep smears in this study. While no statistical difference was proven, the sensitivity measurements for all categories of abnormality were moderate due to the high proportion of atypical cases included into the study set.
Abnormalities on the monolayer preparations frequently displayed fewer, smaller groups of disaggregated cells with rounded cytoplasmic outlines that were difficult to discern on RR. Interobserver variation was noted. Monolayers with a paucity of diagnostic cells and those displaying subtle nuclear atypia were often overlooked. Diagn. Cytopathol. 2004;31:141–146. © 2004 Wiley-Liss, Inc.

Smears diagnosed as ASCUS: Interobserver variation and follow-up
DIAGNOSTIC CYTOPATHOLOGY, Issue 2 2001
C.F.I.A.C., Rose Marie Gatscha S.C.T. (A.S.C.P.)
Abstract The purpose of this study was to apply atypical squamous cells of undetermined significance (ASCUS) criteria from the Bethesda System for Reporting Cervical/Vaginal Cytologic Diagnoses (TBS) to the rescreen of cases previously diagnosed as ASCUS, to compare initial and rescreen diagnoses, and to analyze agreement with follow-up (cytology or histology). Two cytotechnologists (S.B. and M.J.M.) and one cytopathology fellow (M.A.) rescreened 632 cervicovaginal specimens diagnosed as ASCUS between June 1, 1992–December 31, 1995. Age and LMP were provided. Rescreen diagnoses were categorized as within normal limits (WNL), ASCUS, low-grade squamous intraepithelial lesions (LSIL), high-grade squamous intraepithelial lesions (HSIL), or carcinoma (CA). Complete agreement was found in 200 specimens (32%): 31 (15%) WNL; 91 (45%) ASCUS; 77 (38.5%) SIL; and one (0.50%) CA. Follow-up revealed no abnormality in 67% of the cases reclassified as WNL, 49% of the cases reclassified as ASCUS, and 48% of the cases reclassified as squamous intraepithelial lesions (SIL). SIL was found in 29% of cases reclassified as WNL, 29% of specimens rediagnosed as ASCUS, and 34% of cases reclassified as SIL. Partial agreement was found in 391 specimens (62%). In 41 specimens (6%), rescreeners were in complete disagreement, and follow-up revealed 9/41 (22%) SIL or worse; 21/41 (51%) WNL; and 4/41 (10%) inconclusive.
Applying established criteria, 14% (91/632) of cases diagnosed as ASCUS resulted in complete agreement, and 30% (190/632) resulted in partial agreement. Follow-up of cases initially diagnosed as ASCUS revealed SIL or CA in 30% of cases. ASCUS is a significant diagnosis warranting careful patient follow-up. Diagn. Cytopathol. 2001;25:138–140. © 2001 Wiley-Liss, Inc.

Observer variation in immunohistochemical analysis of protein expression, time for a change?
HISTOPATHOLOGY, Issue 7 2006
T Kirkegaard
Aim: Immunohistochemical analysis of protein expression is central to most clinical translational studies and defines patient treatment or selection criteria for novel drugs. Interobserver variation is rarely analysed despite recognition that this is a key area of potential inaccuracy. Therefore our aim was to examine observer variation and suggest the revision of current standards. Methods and results: We analysed inter- and intra-observer variation, by interclass correlation coefficient (ICCC) and κ statistics, in 8661 samples. Intra-observer assessment of nuclear, cytoplasmic and membrane staining for seven proteins in 1323 samples resulted in an ICCC of 0.94 and a κ-value of 0.787. Interobserver reproducibility, assessed on 28 proteins by seven observer pairs in 8661 carcinomas, gave an ICCC of 0.90 and a κ-value of 0.70. No significant effect of either antibody or cellular compartmentalization was observed. Conclusion: We have demonstrated that ICCC is a consistent method to assess observer variation when a continuous scoring system is used, compared with κ statistics, which depends on a categorical system. Given the importance of accurate assessment of protein expression in diagnostic and experimental medicine, we suggest raising thresholds for observer variation: ICCC of 0.7 should be regarded as the minimum acceptable standard, ICCC of 0.8 as good and ICCC of ≥0.9 as excellent.
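The κ statistic mentioned in the abstract above can be illustrated with a minimal sketch. It assumes unweighted Cohen's kappa for a single pair of observers, which the abstract does not specify, and the scores are hypothetical. The helper `grade_icc` applies the thresholds the abstract proposes; treating 0.7 and 0.8 as the lower bounds of their bands is also an assumption made for this example.

```python
from collections import Counter


def cohen_kappa(ratings_a, ratings_b):
    """Unweighted Cohen's kappa for two observers rating the same items."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("need two equal-length, non-empty rating lists")
    n = len(ratings_a)
    # Observed agreement: fraction of items given the same category.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Chance agreement: product of the two observers' marginal proportions.
    counts_a = Counter(ratings_a)
    counts_b = Counter(ratings_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (observed - expected) / (1.0 - expected)


def grade_icc(icc):
    """Label an ICC value with the thresholds proposed in the abstract
    (band boundaries are an assumption of this sketch)."""
    if icc >= 0.9:
        return "excellent"
    if icc >= 0.8:
        return "good"
    if icc >= 0.7:
        return "minimum acceptable"
    return "below the proposed minimum"


# Hypothetical categorical scores from two observers (not study data).
observer_1 = ["neg", "neg", "pos", "pos", "pos", "neg", "pos", "neg"]
observer_2 = ["neg", "pos", "pos", "pos", "neg", "neg", "pos", "neg"]
kappa = cohen_kappa(observer_1, observer_2)
print(f"kappa = {kappa:.2f}")
print(grade_icc(0.90))
```

Because kappa corrects raw agreement for chance, the two observers here agree on 6 of 8 items yet reach only a moderate kappa, which is one reason the abstract treats categorical and continuous agreement measures separately.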

Measurement of cardiac output in normal pregnancy by a non-invasive two-dimensional independent Doppler device
AUSTRALIAN AND NEW ZEALAND JOURNAL OF OBSTETRICS AND GYNAECOLOGY, Issue 2 2009
Catharina C. M. KAGER
Aims: To compose a normogram regarding cardiac output during pregnancy measured with ultrasonic cardiac output monitor (USCOM), a non-expensive simple continuous wave Doppler device and to investigate if this machine could be useful for haemodynamic monitoring during pregnancy. Methods: Cardiac output was measured in 172 pregnant women with a gestational age < 21 weeks (n = 59), 21–32 weeks (n = 48), and > 32 weeks' gestation (n = 48). Interobserver differences were determined by measuring 24 patients and comparing results between three different observers. Results: A good signal could be obtained in 155 (90.2%) pregnant women. Haemodynamic profiles were in line with data published in the literature. In 9.8% of cases it was difficult to get a good result. Interobserver variations between the research officer (CK) and two clinicians were good (r = 0.9359 and r = 0.9609). Conclusion: USCOM appears to be a reliable and fast method to measure cardiac output compared with existing highly complex ultrasound machines used in cardiology. It is easy to learn, cheap and quite reproducible between different observers. Further research is required to define its place in the management of hypertensive complications during pregnancy.

Grading of dysplasia in Barrett's oesophagus: substantial interobserver variation between general and gastrointestinal pathologists
HISTOPATHOLOGY, Issue 7 2007
M Kerkhof
Aims: To determine interobserver variation in grading of dysplasia in Barrett's oesophagus (BO) between non-expert general pathologists and expert gastrointestinal pathologists on the one hand and between expert pathologists on the other hand.
Methods and results: In this prospective multicentre study, non-expert and expert pathologists graded biopsy specimens of 920 patients with endoscopic BO, which were blindly reviewed by one member of a panel of expert pathologists (panel experts) and by a second panel expert in case of disagreement on dysplasia grade. Agreement between two of three pathologists was established as the final diagnosis. Analysis was performed by κ statistics. Due to absence of intestinal metaplasia, 127/920 (14%) patients were excluded. The interobserver agreement for dysplasia [no dysplasia (ND) versus indefinite for dysplasia/low-grade dysplasia (IND/LGD) versus high-grade dysplasia (HGD)/adenocarcinoma (AC)] between non-experts and first panel experts and between initial experts and first panel experts was fair (κ = 0.24 and κ = 0.27, respectively), and substantial for differentiation of HGD/AC from ND/IND/LGD (κ = 0.62 and κ = 0.58, respectively). Conclusions: There was considerable interobserver variability in the interpretation of ND or IND/LGD in BO between non-experts and experts, but also between expert pathologists. This suggests that less subjective markers are needed to determine the risk of developing AC in BO.

Measurement of Midfemoral Shaft Geometry: Repeatability and Accuracy Using Magnetic Resonance Imaging and Dual-Energy X-ray Absorptiometry
JOURNAL OF BONE AND MINERAL RESEARCH, Issue 12 2001
Helen J. Woodhead
Abstract Although macroscopic geometric architecture is an important determinant of bone strength, there is limited published information relating to the validation of the techniques used in its measurement. This study describes new techniques for assessing geometry at the midfemur using magnetic resonance imaging (MRI) and dual-energy X-ray absorptiometry (DXA) and examines both the repeatability and the accuracy of these and previously described DXA methods.
Contiguous transverse MRI (Philips 1.5T) scans of the middle one-third femur were made in 13 subjects, 3 subjects with osteoporosis. Midpoint values for total width (TW), cortical width (CW), total cross-sectional area (TCSA), cortical cross-sectional area (CCSA), and volumes from reconstructed three-dimensional (3D) images (total volume [TV] and cortical volume [CVol]) were derived. Midpoint TW and CW also were determined using DXA (Lunar V3.6, lumbar software) by visual and automated edge detection analysis. Repeatability was assessed on scans made on two occasions and then analyzed twice by two independent observers (blinded), with intra- and interobserver repeatability expressed as the CV (CV ± SD). Accuracy was examined by comparing MRI and DXA measurements of venison bone (and Perspex phantom for MRI), against "gold standard" measures made by vernier caliper (width), photographic image digitization (area) and water displacement (volume). Agreement between methods was analyzed using mean differences (MD ± SD%). MRI CVs ranged from 0.5 ± 0.5% (TV) to 3.1 ± 3.1% (CW) for intraobserver and 0.55 ± 0.5% (TV) to 3.6 ± 3.6% (CW) for interobserver repeatability. DXA results ranged from 1.6 ± 1.5% (TW) to 4.4 ± 4.5% (CW) for intraobserver and 3.8 ± 3.8% (TW) to 8.3 ± 8.1% (CW) for interobserver variation. MRI accuracy was excellent for TV (3.3 ± 6.4%), CVol (3.5 ± 4.0%), TCSA (1.8 ± 2.6%), and CCSA (1.6 ± 4.2%) but not TW (4.1 ± 1.4%) or CW (16.4 ± 14.9%). DXA results were TW (6.8 ± 2.7%) and CW (16.4 ± 17.0%). MRI measures of geometric parameters of the midfemur are highly accurate and repeatable, even in osteoporosis. Both MRI and DXA techniques have limited value in determining cortical width. MRI may prove valuable in the assessment of surface-specific bone accrual and resorption responses to disease, therapy, and variations in mechanical loading. 
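The repeatability figures above are reported as a coefficient of variation (CV ± SD). The abstract does not give the formula, so the following is only a sketch under a common assumption: for each subject, CV is the sample standard deviation of the repeated measurements divided by their mean, as a percentage, and the per-subject CVs are then summarised as mean ± SD. All numbers are hypothetical.

```python
from statistics import mean, stdev


def percent_cv(measurements):
    """Coefficient of variation (%) of repeated measurements on one subject."""
    m = mean(measurements)
    if m == 0:
        raise ValueError("CV is undefined when the mean is zero")
    return 100.0 * stdev(measurements) / m


def summarize_cv(per_subject_measurements):
    """Mean and SD of the per-subject CVs (assumed meaning of 'CV ± SD')."""
    cvs = [percent_cv(values) for values in per_subject_measurements]
    return mean(cvs), stdev(cvs)


# Hypothetical cortical-width readings (mm): two readings per subject.
readings = [
    [24.0, 26.0],
    [30.0, 30.0],
    [19.0, 21.0],
]
cv_mean, cv_sd = summarize_cv(readings)
print(f"CV = {cv_mean:.1f} ± {cv_sd:.1f}%")
```

For interobserver repeatability the two readings per subject would come from different observers; for intraobserver repeatability they would come from the same observer on repeat analysis.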

Use of a curved-array transducer to reduce interobserver variation in sonographic measurement of thyroid volume in healthy adults
JOURNAL OF CLINICAL ULTRASOUND, Issue 4 2003
Els Y. Peeters MD
Abstract Purpose Sonographic calculation of thyroid volume is used in the diagnosis and follow-up of thyroid diseases. Since the calculated volume of thyroid lobes is highly influenced by the longest (ie, craniocaudal) diameter, we examined whether using a curved-array transducer as opposed to a linear-array transducer to measure the craniocaudal diameter would reduce interobserver variation. Methods Three sonographers with different levels of expertise each used a 5–12-MHz linear-array transducer and a 2–5-MHz curved-array transducer to measure the craniocaudal diameter of both thyroid lobes of 25 healthy volunteers. On the basis of these measurements, thyroid lobe volumes were calculated. Single-factor analysis of variance was used to evaluate the interobserver variations between the measurements made by all 3 observers as well as between measurements taken by pairs of observers. A p value of less than 0.05 was considered significant. Results Using the linear-array transducer to measure the craniocaudal diameter resulted in significant interobserver variation in thyroid volume calculation (p = 0.02), whereas using the convex-array transducer did not. Using either transducer resulted in a highly significant interobserver variation in measurements of the craniocaudal diameter, although the variation was far more pronounced for measurements made with the linear-array transducer (p = 0.0005) than for those made with the curved-array transducer (p = 0.04). For both transducers, the interobserver variations were most pronounced between the most and the least experienced sonographers.
Conclusions To avoid significant interobserver variation in calculating thyroid lobe volume, we recommend using a curved-array transducer to measure the craniocaudal diameter of the thyroid lobes. © 2003 Wiley Periodicals, Inc. J Clin Ultrasound 31:189–193, 2003

Measurement error in computed tomography pelvimetry
JOURNAL OF MEDICAL IMAGING AND RADIATION ONCOLOGY, Issue 2 2005
N Anderson
SUMMARY Computed tomography pelvimetry is still used in clinical practice. We wished to quantify observer error in order to assess the level of confidence with which pelvic measurements can be described as adequate or inadequate. Anteroposterior inlet, anteroposterior outlet, transverse inlet and interspinous distances were measured from 11 CT pelvimetry examinations by five observers at one institution. Three CT pelvimetries were measured by five observers at a second institution. Intraobserver and interobserver variation was assessed using analysis of variance. Reliability of measurements was assessed using intraclass correlation coefficient. Combined error was calculated to determine 95% confidence limits for published minimum recommended pelvic measurements. The standard error of measurement, combining all sources, for measurement of the bony dimensions of the pelvis were: for anteroposterior inlet, 2.0 mm; anteroposterior outlet, 6.9 mm; transverse inlet, 1.3 mm; and interspinous distance, 2.1 mm. The 95% confidence interval around the recommended anteroposterior outlet of 100 mm was 88.5–111.3 mm. Observer variation in measurement of anteroposterior outlet is so large as to make the measurement of doubtful clinical utility.

Variation in identifying neonatal percutaneous central venous line position
JOURNAL OF PAEDIATRICS AND CHILD HEALTH, Issue 9-10 2004
DE Odd
Objective: The study objective was to obtain data on interpretation, including intra and interobserver variation and action taken for a given line tip location, for a series of radiographs demonstrating neonatal long lines. Methods: Nineteen radiographs taken to identify line tip position were digitized and published on an internet site. One film was included twice in order to assess intraobserver variation giving a total of 20 images. Fourteen used radio-opaque contrast and five no contrast. Australian and New Zealand Neonatal Network members and National Women's Hospital NICU staff were invited to participate in the study. For each radiograph, participants were asked to identify if long line tip could be identified, the likely anatomical position and desired action. Interobserver agreement was assessed by the maximum proportion of agreement per radiograph and by the number of different options selected. Intraobserver agreement was assessed by comparing the two reports from the duplicate radiograph. Results: Twenty-seven responses were received. Overall, 50% of the reports stated that the long line tips could be identified. The most commonly reported position was in the right atrium (31%) and most commonly reported action was to pull the line back (53%). The median agreement of whether the line was seen was 68%, agreement on position 62% and agreement on action 86%. On analysis of intraobserver variability, from the identical radiographs, 27% of respondents differed on whether the line tip could be visualized. Conclusion: Interobserver and intraobserver reliability was poor when using radiographs to assess long line tips. The major determinant of line repositioning was the perceived location.
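The agreement measure described in the Methods above (maximum proportion of agreement per radiograph, plus the number of different options selected) could be computed along these lines. This is a sketch only: the function name, film labels and responses are hypothetical, and taking the median across radiographs mirrors how the Results are reported rather than a documented procedure.

```python
from collections import Counter
from statistics import median


def agreement_for_radiograph(responses):
    """Return (max proportion choosing the same option, number of options used)."""
    if not responses:
        raise ValueError("at least one response is required")
    counts = Counter(responses)
    _, top_count = counts.most_common(1)[0]
    return top_count / len(responses), len(counts)


# Hypothetical reports of line tip position for three films (not study data).
position_reports = {
    "film_01": ["right atrium", "right atrium", "SVC", "right atrium"],
    "film_02": ["SVC", "IVC", "right atrium", "SVC", "not seen"],
    "film_03": ["IVC", "IVC", "IVC", "IVC"],
}

per_film = {
    film: agreement_for_radiograph(reports)
    for film, reports in position_reports.items()
}
median_agreement = median(proportion for proportion, _ in per_film.values())

for film, (proportion, n_options) in per_film.items():
    print(f"{film}: max agreement {proportion:.0%}, {n_options} option(s) chosen")
print(f"median agreement across films: {median_agreement:.0%}")
```

Unlike kappa, this measure is not corrected for chance, so it is best read alongside the number of options respondents actually used.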

Diagnostic evaluation of planar and tomographic ventilation/perfusion lung images in patients with suspected pulmonary emboli
CLINICAL PHYSIOLOGY AND FUNCTIONAL IMAGING, Issue 5 2004
Marika Bajc
Summary Planar lung ventilation/perfusion scintigraphy (V/P_PLANAR) is a standard method for diagnosis of pulmonary embolism (PE). The goals of this study were to test whether the diagnostic information of ventilation/perfusion tomography (V/P_SPET) applied in clinical routine might enhance information compared with V/P_PLANAR and to streamline data processing for the demands of clinical routine. This prospective study includes 53 patients suspected for PE referred for lung scintigraphy. After inhalation of 99mTc-DTPA planar ventilation imaging was followed by tomography, using a dual-head gamma camera. 99mTc-MAA was injected i.v. for perfusion tomography followed by planar imaging. Patients were examined in supine position, unchanged during V/P tomography. Two reviewers evaluated V/P_PLANAR and V/P_SPET images separately and randomly. Mismatch points were calculated on the basis of extension of perfusion defects with preserved ventilation. Patients were followed up clinically for at least 6 months. With V/P_SPET the number of patients with PE was higher and 53% more mismatch points were found. In V/P_SPET interobserver variation was less compared with V/P_PLANAR. Ancillary findings were observed by both techniques in half of the patients but more precisely interpreted with V/P_SPET. V/P_SPET shows more and better delineated mismatch defects, improved quantification and less interobserver variation compared with V/P_PLANAR. V/P_SPET is amenable to implementation for clinical routine and suitable even when there is demand for a high patient throughput.

Global angiographic scoring system for inflammatory diseases
ACTA OPHTHALMOLOGICA, Issue 2009
M KHAIRALLAH
Purpose Fundus fluorescein and indocyanine green angiography are essential imaging techniques in the appraisal of posterior segment inflammation. A combined fluorescein and indocyanine green angiographic scoring system has been developed in order to provide semi-quantitative data for follow-up of disease progression, monitoring response to treatment, and comparison between clinical studies. We tested interobserver variations in the semi-quantitative scoring of dual fluorescein/indocyanine green angiograms. Methods Four observers scored 32 dual fluorescein and indocyanine green angiograms. Spearman rank correlation was used to analyze correlation between scores assigned to each angiographic sign. We used the Kappa statistics to test agreement between pairs of observers in comparative total fluorescein and indocyanine green angiographic scores. Results We found a significant correlation between pairs of observers in scores assigned to each fluorescein angiographic sign and the total score of fluorescein angiograms. A significant correlation was found only between 2 separate pairs of observers in scores assigned to early stromal vessel hyperfluorescence on indocyanine green angiography. However, a significant correlation was found in other indocyanine green angiographic signs and the total score of indocyanine green angiograms. There was a good agreement between observers in comparative fluorescein–indocyanine green angiographic total scores. Conclusion Further experience with the scoring system, especially with the indocyanine green angiographic scoring, may improve its reproducibility.
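Spearman rank correlation between two observers' scores, as used in the Methods above, can be sketched in plain Python as the Pearson correlation of average ranks. The scores below are hypothetical, and the sketch does not include the significance testing that the abstract reports.

```python
from math import sqrt


def average_ranks(values):
    """1-based ranks, with tied values sharing the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length sequences of at least 2 values")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical total angiographic scores from two observers (not study data).
observer_a = [3, 5, 2, 8, 6]
observer_b = [4, 3, 1, 9, 7]
rho = spearman(observer_a, observer_b)
print(f"Spearman rho = {rho:.2f}")
```

A high rank correlation shows that two observers order the angiograms similarly, but it does not show that they assign the same scores, which is why the abstract pairs it with a kappa-based agreement test.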