Acceptable Reliability (acceptable + reliability)

Selected Abstracts


A preliminary investigation of the reliability and validity of the Brief Assessment Schedule Depression Cards and the Beck Depression Inventory-Fast Screen to screen for depression in older stroke survivors

INTERNATIONAL JOURNAL OF GERIATRIC PSYCHIATRY, Issue 5 2008
A. K. Healey
Objective: To conduct an initial assessment of the reliability and validity of the Brief Assessment Schedule Depression Cards (BASDEC) and the Beck Depression Inventory-Fast Screen (BDI-FS) to screen for depression in older stroke survivors. Methods: Participants from four inpatient rehabilitation units completed the BASDEC and the BDI-FS together with the Hospital Anxiety and Depression Scale (HADS) for comparison. The Structured Clinical Interview for DSM-IV Axis 1 Disorders (SCID) was then completed with all participants to ascertain a criterion depression diagnosis. The BASDEC and BDI-FS were subsequently completed for a second time. Results: Forty-nine stroke survivors (M = 78.80, SD = 6.79 years) were included. The BASDEC and BDI-FS demonstrated acceptable internal consistency and test-retest reliability. The BASDEC (cut-off ≥7) resulted in a sensitivity of 1.0 and specificity of 0.95 for detecting major depression, whereas the BDI-FS (cut-off ≥4) had a sensitivity of 0.71 and specificity of 0.74. When participants with minor depression were included in analyses, sensitivity lowered to 0.69 (specificity = 0.97) for the BASDEC and 0.62 (specificity = 0.78) for the BDI-FS. Conclusions: The BASDEC and BDI-FS were found to have acceptable reliability. The BASDEC demonstrated some advantage in criterion validity over the BDI-FS at the examined cut-offs. Copyright © 2007 John Wiley & Sons, Ltd.
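
As an illustration of how sensitivity and specificity at a screening cut-off are obtained, the following is a minimal sketch. It assumes that scores at or above the cut-off count as screen-positive and that a criterion diagnosis is available for each participant. The function name and the toy data are hypothetical and are not taken from the study.

```python
def sensitivity_specificity(scores, has_condition, cutoff):
    """Return (sensitivity, specificity) for a screening cut-off.

    Assumption: a score >= cutoff is treated as screen-positive.
    Both diagnosed and non-diagnosed cases must be present.
    """
    tp = fn = tn = fp = 0
    for score, diagnosed in zip(scores, has_condition):
        positive = score >= cutoff
        if diagnosed and positive:
            tp += 1          # correctly flagged case
        elif diagnosed:
            fn += 1          # missed case
        elif positive:
            fp += 1          # false alarm
        else:
            tn += 1          # correctly cleared non-case
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return sensitivity, specificity


# Invented toy data: ten screening scores and criterion diagnoses.
scores = [9, 8, 7, 3, 2, 6, 1, 8, 4, 0]
diagnosed = [True, True, True, False, False, False, False, False, False, False]

sens, spec = sensitivity_specificity(scores, diagnosed, cutoff=7)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}")
```

Raising the cut-off generally trades sensitivity for specificity, which is why the abstract reports both values per instrument at a stated cut-off.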


Translation and restandardization of an instrument: the Early Infant Temperament Questionnaire

JOURNAL OF ADVANCED NURSING, Issue 2 2003
Elisabeth O.C. Hall PhD RN
Aims of the study. To test the psychometric properties of a Danish translation of the Early Infant Temperament Questionnaire (EITQ) and to establish standards for scoring the questionnaire. Rationale. The general aim was to create a translation that remained close to the original version, was meaningful for the Danish participants, and had acceptable psychometric properties. Background. Patterns of temperament can be discerned early in life and tend to persist over time and across situations. For the past 50 years, temperament has been studied by theorists, clinicians and nurse clinicians to predict behaviour, discover interventions that prevent serious behaviour disturbances, and help parents understand the implications of their child's temperament. Thomas and Chess's conceptualization of temperament in nine categories was the framework for the development of the English-language EITQ. Research methods. The translation followed a stepwise process of translation, back translation and consensus. A convenience sample of 204 Danish mothers with 1–4-month-old infants completed the translated questionnaire and a demographic questionnaire in 1999. Results. Alpha coefficients for the nine subscales ranged from 0·59 to 0·82. All alpha coefficients were comparable to or higher than those reported in the original United States standardization study. There were statistically significant differences between reported United States mean scores and those in the Danish sample. Discussion. The psychometric properties of the Danish translation are equal to or better than those reported for the United States study. Differences in mean scores on most subscales point to the need to create Danish profiles for scoring. Conclusions. The Danish version of the EITQ has acceptable reliability and is ready for use in Denmark.
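
For readers unfamiliar with the alpha coefficient, here is a minimal sketch of the standard Cronbach's alpha formula, alpha = k/(k-1) × (1 − sum of item variances / variance of total scores). The function name and the response data are invented for illustration and have no connection to the EITQ data.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of respondents, each a list of k item scores.

    Uses sample variances (n - 1 denominator) throughout.
    """
    k = len(responses[0])
    items = list(zip(*responses))                      # one tuple per item
    item_variance_sum = sum(variance(item) for item in items)
    total_variance = variance([sum(r) for r in responses])
    return k / (k - 1) * (1 - item_variance_sum / total_variance)


# Invented toy data: five respondents answering a three-item subscale.
toy_responses = [
    [4, 5, 4],
    [2, 3, 3],
    [5, 5, 4],
    [1, 2, 2],
    [3, 3, 4],
]
alpha = cronbach_alpha(toy_responses)
print(f"alpha = {alpha:.2f}")
```

Alpha rises when items covary strongly relative to their individual variances, so short subscales with heterogeneous items tend to sit at the lower end of a reported range.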


Multidimensional Attitudes of Medical Residents and Geriatrics Fellows Toward Older People

JOURNAL OF THE AMERICAN GERIATRICS SOCIETY, Issue 3 2005
Ming Lee PhD
Objectives: To examine dimensions of a validated instrument measuring geriatric attitudes of primary care residents and performances on these dimensions between residents and fellows. Design: Cross-sectional and longitudinal studies. Setting: An academic medical center. Participants: Two hundred thirty-eight primary care residents (n=177) and geriatrics fellows (n=61) participated in the study from 1995 to 2000. Measurements: A 14-item, 5-point Likert scale previously validated for measuring primary care residents' attitudes toward older people and geriatric patient care was used. Results: Factor analysis showed four dimensions of the scale, labeled Social Value, Medical Care (MC), Compassion (CP), and Resource Distribution, which demonstrated acceptable reliability. Both groups of subjects showed significantly (P<.001) positive (mean>3) attitudes across the dimensions and times, except for residents, who had near-neutral (mean=3) attitudes on MC. Residents' mean attitude scores on the overall scale and the MC and CP subscales were significantly (P<.001) lower than those of fellows over time. Residents and fellows showed different change patterns in attitudes over time. Residents' attitudes generally improved during the first 2 years of training, whereas fellows' attitudes declined slightly. Personal experience was a strong predictor of residents' attitudes toward older patients. Ethnicity, academic specialty, professional experience, and career interest in geriatrics were also associated with residents' attitude scores. Conclusion: The multidimensional analysis of the scale contributes to better understanding of medical trainees' attitudes and sheds light on educational interventions.


Job Satisfaction and Subjective Well-Being in a Sample of Nurses

JOURNAL OF APPLIED SOCIAL PSYCHOLOGY, Issue 5 2005
Stephen A. Sparks
It is surprising that there are no published studies exploring job satisfaction and subjective well-being (SWB) in nurses given the current shortage (Clark & Clark, 2002). For the present study, 152 nurses completed measures of job satisfaction, SWB, and social desirability. The Dimensions of Satisfaction scale was designed for this study and demonstrated acceptable reliability and validity. Results indicated that the most important aspect of nurses' job satisfaction is pay, followed by staffing and benefits. When entering the field, nurses most valued pay, followed by personal fulfillment and respect. A majority of the sample (59%) indicated satisfaction with their job, but this is well below the national average for American workers (85%; National Opinion Research Center, 2000). Nurses indicated higher SWB than the general population (Myers & Diener, 1996). However, the correlation between job satisfaction and SWB was lower than that of the general population (Tait, Padgett, & Baldwin, 1989).


Connected Learning and the Foundations of Psychometrics: A Rejoinder

JOURNAL OF PHILOSOPHY OF EDUCATION, Issue 1 2006
RANDALL CURREN
This paper continues an exchange between its author and Andrew Davis. Part I addresses the attribution and ontological status of mental constructs and argues that philosophical work on these topics does not undermine high stakes testing. Part II examines the significance for testing of the connectedness of meaningful learning. Part III addresses the high stakes in high stakes testing in connection with the risk entailed by limited scoring reliability. It concludes that there is no straightforward relationship between the magnitude of what is at stake for students and teachers and the threshold of acceptable reliability in scoring.


Initial evaluation of the first year of the Foundation Assessment Programme

MEDICAL EDUCATION, Issue 1 2009
Helena Davies
Objectives: This study represents an initial evaluation of the first year (F1) of the Foundation Assessment Programme (FAP), in line with Postgraduate Medical Education and Training Board (PMETB) assessment principles. Methods: Descriptive analyses were undertaken for total number of encounters, assessors and trainees, mean number of assessments per trainee, mean number of assessments per assessor, time taken for the assessments, mean score and standard deviation for each method. Reliability was estimated using generalisability coefficients. Pearson correlations were used to explore relationships between instruments. The study sample included 3640 F1 trainees from 10 English deaneries. Results: A total of 2929 trainees submitted at least one of all four methods. A mean of 16.6 case-focused assessments were submitted per F1 trainee. Based on a return per trainee of six of each of the case-focused assessments, and eight assessors for multi-source feedback, 95% confidence intervals (CIs) ranged between 0.4 and 0.48. The estimated time required for this is 9 hours per trainee per year. Scores increased over time for all instruments and correlations between methods were in keeping with their intended focus of assessment, providing evidence of validity. Conclusions: The FAP is feasible and achieves acceptable reliability. There is some evidence to support its validity. Collated assessment data should form part of the evidence considered for selection and career progression decisions although work is needed to further develop the FAP. It is in any case of critical importance for the profession's accountability to the public.


Assessment in the context of uncertainty: how many members are needed on the panel of reference of a script concordance test?

MEDICAL EDUCATION, Issue 3 2005
R Gagnon
Purpose: The script concordance test (SCT) assesses clinical reasoning in the context of uncertainty. Because there is no single correct answer, scoring is based on a comparison of answers provided by examinees with those provided by members of a panel of reference made up of experienced practitioners. This study aims to determine how many members are needed on the panel to obtain reliable scores to compare against the scores of examinees. Methods: A group of 80 residents were tested on 73 items (Cronbach's α: 0.76). A total of 38 family doctors made up the pool of experienced practitioners, from which 1000 random panels of reference of increasing sizes (5, 10, 15, 20, 25 and 30) were generated with a resampling procedure. Residents' scores were computed for each panel sample. Units of analysis were means of residents' scores, test reliability coefficient and correlation coefficient between scores obtained with a given panel of reference versus the scores obtained with the full panel of 38. Statistics were averaged across the 1000 samples for each panel size for the mean and test reliability computations, and across 100 samples for the correlation computation. Results: For sample variability, there was a 3-fold increase in standard deviation of means between a sample panel size of 5 (SD = 1.57) and a panel size of 30 (SD = 0.50). For reliability, there was a large difference in precision between a panel size of 5 (0.62) and a panel size of 10 (0.70). When the panel size was over 20, the gain became negligible (0.74 for 20 and 0.76 for 38). For correlation, the mean correlation coefficient values were 0.90 with 5 panel members, 0.95 with 10 members and 0.98 with 20 members. Conclusion: Any number over 10 is associated with acceptable reliability and good correlation between the samples versus the full panel of 38. For high-stakes examinations, using a panel of 20 members is recommended. Recruiting more than 20 panel members shows only a marginal benefit in terms of psychometric properties.
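
The resampling idea described in this abstract can be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' procedure: the aggregate scoring rule (credit equals the number of panelists choosing an answer divided by the modal count) is a commonly described SCT scoring convention assumed here, the data are randomly simulated, the test is shortened to 20 items, and only 50 draws per panel size are used to keep the run short. All function names are hypothetical.

```python
import random


def sct_scores(examinee_answers, panel_answers):
    """Score examinees against a panel (assumed aggregate scoring rule)."""
    n_items = len(panel_answers[0])
    scores = []
    for answers in examinee_answers:
        total = 0.0
        for i in range(n_items):
            picks = [panelist[i] for panelist in panel_answers]
            modal_count = max(picks.count(v) for v in set(picks))
            # Credit is proportional to how many panelists gave this answer.
            total += picks.count(answers[i]) / modal_count
        scores.append(total)
    return scores


def pearson(x, y):
    """Plain Pearson correlation coefficient."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def mean_correlation_with_full_panel(examinees, panel, size, n_draws, rng):
    """Average correlation between sub-panel scores and full-panel scores."""
    full_scores = sct_scores(examinees, panel)
    correlations = []
    for _ in range(n_draws):
        subpanel = rng.sample(panel, size)   # draw without replacement
        correlations.append(pearson(sct_scores(examinees, subpanel), full_scores))
    return sum(correlations) / len(correlations)


# Simulated data only: 38 panelists, 30 examinees, 20 five-point items.
rng = random.Random(0)
likert = [-2, -1, 0, 1, 2]
panel = [[rng.choice(likert) for _ in range(20)] for _ in range(38)]
examinees = [[rng.choice(likert) for _ in range(20)] for _ in range(30)]

results = {
    size: mean_correlation_with_full_panel(examinees, panel, size, 50, rng)
    for size in (5, 10, 20)
}
for size, r in results.items():
    print(f"panel size {size}: mean r = {r:.2f}")
```

Because the simulated answers are random, the printed values will not reproduce the coefficients reported in the abstract; the sketch only shows the mechanics of drawing sub-panels and comparing their scores with those from the full pool.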


Achieving acceptable reliability in oral examinations: an analysis of the Royal College of General Practitioners membership examination's oral component

MEDICAL EDUCATION, Issue 2 2003
Val Wass
Background: The membership examination of the Royal College of General Practitioners (RCGP) uses structured oral examinations to assess candidates' decision-making skills and professional values. Aim: To estimate three indices of reliability for these oral examinations. Methods: In summer 1998, a revised system was introduced for the oral examinations. Candidates took two 20-minute (five topic) oral examinations with two examiner pairs. Areas for oral topics had been identified. Examiners set their own topics in three competency areas (communication, professional values and personal development) and four contexts (patient, teamwork, personal, society). They worked in two pairs (a quartet) to preplan questions on 10 topics. The results were analysed in detail. Generalisability theory was used to estimate three indices of reliability: (A) intercase (B) pass/fail decision and (C) standard error of measurement (SEM). For each index, a benchmark requirement was preset at (A) 0·8 (B) 0·9 and (C) 0·5. Results: There were 896 candidates in total. Of these, 87 candidates (9·7%) failed. Total score variance was attributed to: 41% candidates, 32% oral content, 27% examiners and general error. Reliability coefficients were: (A) intercase 0·65; (B) pass/fail 0·85. The SEM was 0·52 (i.e. precise enough to distinguish within one unit on the rating scale). Extending testing time to four 20-minute oral examinations, each with two examiners, or five orals, each with one examiner, would improve intercase and pass/fail reliabilities to 0·78 and 0·94, respectively. Conclusion: Structured oral examinations can achieve reliabilities appropriate to high stakes examinations if sufficient resources are available.
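
The effect of lengthening a test on reliability can be roughly illustrated with the classical Spearman-Brown prophecy formula. This is a swapped-in simplification: the study itself used generalisability theory, which also models examiner and content facets, so the sketch below is only an approximation under the assumption that added orals behave like parallel test halves. The function name is hypothetical.

```python
def spearman_brown(reliability, length_factor):
    """Predicted reliability when a test is lengthened by `length_factor`.

    Classical Spearman-Brown formula; assumes the added parts are
    parallel to the existing ones.
    """
    return length_factor * reliability / (1 + (length_factor - 1) * reliability)


# Doubling testing time from two orals to four, starting from the
# reported intercase reliability of 0.65.
predicted = spearman_brown(0.65, 2)
print(f"predicted intercase reliability: {predicted:.2f}")
```

Doubling from 0·65 gives roughly 0·79 under this formula, close to the 0·78 reported for intercase reliability. The pass/fail index is a decision-consistency measure and is not expected to follow the same formula.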


The quality of life for cancer children (QOLCC) in Taiwan (part I): Reliability and construct validity by confirmatory factor analysis

PSYCHO-ONCOLOGY, Issue 3 2004
Chao-Hsing Yeh
Part I, the current paper, describes the development and testing of a quality-of-life (QOL) assessment specifically designed for Taiwanese pediatric cancer patients (7–18 years) and their parents/caregivers. The assessment instrument was established based on a qualitative study, then refined using recognized item-analysis methods and pilot tested on a group of 25 patients. The final assessment instrument included three versions of the same instrument: two patient self-reports (QOLCC-7-12 for children aged 7–12 years; QOLCC-ADO for adolescents aged 13–18 years) and a parent proxy-report (QOLCC-PAR). The final seven-subscale tool has a total of 34 items and was tested among 106 young cancer patients and 106 of their parents. Psychometric properties of the measure were tested using item analysis, Cronbach's alpha, and a confirmatory factor analysis. Results suggest acceptable reliability and goodness of fit of this seven-scale measure. In order to test the factor validity of the QOLCC, an independent group of 42 children with cancer participated. The results of the confirmatory factor analysis show goodness of fit for the QOLCC. Copyright © 2003 John Wiley & Sons, Ltd.


A Relationship Power Scale for Female Adolescents: Preliminary Development and Psychometric Testing

PUBLIC HEALTH NURSING, Issue 1 2007
Ruey-Hsia Wang
Objectives: To develop and test psychometric characteristics of the Relationship Power Scale (RPS), which can be used to explore the relationship power of female adolescents in heterosexual relationships. Methods: Cross-sectional design. Female adolescents in Kaohsiung City, Taiwan, who had a steady relationship with a boyfriend at the time of the study were recruited as study subjects (n=414) to test validity and reliability of the RPS. Results: Confirmatory factor analysis revealed that a one-factor model with correlated uniqueness among the positively worded items best fits the data. For all subjects, RPS scores differed significantly among the 3 response groups (participants, both themselves and their boyfriends equally, or their boyfriends) on 2 items regarding who had more power in the relationship and who was more emotionally involved in the relationship. For subjects having sex with their steady boyfriends, RPS scores differed significantly among the 3 response groups on 2 items regarding who had more say about having sex and who had more say about using condoms. Cronbach's α for the RPS was .69. The test-retest reliability coefficient for the RPS was .83. Conclusions: The RPS exhibited acceptable reliability and validity. Further research is recommended to use the RPS in sex-related behavior research among heterosexual female adolescents.


Measuring blood pressure knowledge and self-care behaviors of African Americans

RESEARCH IN NURSING & HEALTH, Issue 6 2008
Rosalind M. Peters
The purpose of this study was to develop and conduct preliminary psychometric assessment of instruments measuring knowledge and self-care practices regarding behaviors needed for blood pressure (BP) control among African Americans. Items were empirically derived and scored on a 7-point, bipolar scale. The instruments were evaluated in a sample of 306 community-dwelling African Americans. Results revealed acceptable reliability and validity of the BP Knowledge Scale. Results for the BP Self-Care Scale were mixed. A structural equation model of these scales, recorded BP, and covariates fit well. There was an unexpected positive correlation between self-care and BP suggesting a potential bi-directional relationship. The scales demonstrated acceptable psychometric properties and, with minor revisions, may have clinical utility as measures of BP knowledge and self-care. © 2008 Wiley Periodicals, Inc. Res Nurs Health 31:543–552, 2008


Development and validation of the psychosomatic scale for atopic dermatitis in adults

THE JOURNAL OF DERMATOLOGY, Issue 7 2006
Tetsuya ANDO
Psychosocial factors play an important role in the course of adult atopic dermatitis (AD). Nevertheless, AD patients are rarely treated for their psychosomatic concerns. The purpose of the present study was to develop and validate a brief self-rating scale for adult AD in order to aid dermatologists in evaluating psychosocial factors during the course of AD. A preliminary scale assessing stress-induced exacerbation, the secondary psychosocial burden, and attitude toward treatment was developed and administered to 187 AD patients (82 male, 105 female, aged 28.4 ± 7.8 years, range 13–61). Severity of skin lesions and improvement with standard dermatological treatment were assessed by both the dermatologist and the participant. Measures of anxiety and depression were also determined. In addition, psychosomatic evaluations were made according to the Psychosomatic Diagnostic Criteria for AD. Factor analysis resulted in the development of a 12-item scale (The Psychosomatic Scale for Atopic Dermatitis; PSS-AD) consisting of three factors: (i) exacerbation triggered by stress; (ii) disturbances due to AD; and (iii) ineffective control. Internal consistency indicated by Cronbach's alpha coefficient was 0.86 for the entire measure, 0.82 for (i), 0.81 for (ii), and 0.77 for (iii), verifying the acceptable reliability of PSS-AD. Patients with psychosomatic problems had higher PSS-AD scores than those without. PSS-AD scores were positively associated with the severity of the skin lesions, anxiety and depression. The scores were negatively associated with improvement during dermatological treatments. In conclusion, PSS-AD is a simple and reliable measure of the psychosomatic pathology of adult AD patients. It may be useful in dermatological practice for screening patients who would benefit from psychological or psychiatric interventions.


Posttraumatic stress among children in Kurdistan

ACTA PAEDIATRICA, Issue 7 2008
A Ahmad
Aim: To identify a posttraumatic stress disorder profile for the Child Behaviour Checklist. Method: Checklist item scores for 806 school-aged children in Iraqi Kurdistan (201 randomly selected from the general population, 241 orphans, 199 primary medical care visitors and 165 hospital in-patients) were analysed against the Posttraumatic Stress Symptom Scale for Children (PTSS-C) scores, estimating not only stress diagnoses, but also nonstress-related, child-specific posttraumatic symptoms. Results: Twenty checklist items, which revealed significant correlations with the stress diagnoses, formed the checklist-stress profile with acceptable reliability and validity, and significant correlation to the PTSS-C estimates. Conclusion: A child-specific stress profile for the checklist is recommended for use as a screening instrument.