Initial Validation Evidence for Clinical Case Presentations by Student Pharmacists
Jennifer S Byrd
Union University College of Pharmacy
Michael J Peeters
University of Toledo
DOI: https://doi.org/10.24926/iip.v12i1.2136
Abstract
Objective: There is a paucity of validation evidence for assessing clinical case-presentations by Doctor of Pharmacy (PharmD) students. Within Kane’s Framework for Validation, evidence for inferences of scoring and generalization should be generated first. Thus, our objectives were to characterize and improve scoring, as well as build initial generalization evidence, in order to provide validation evidence for performance-based assessment of clinical case-presentations.
Design: Third-year PharmD students worked up patient-cases from a local hospital. Students orally presented and defended their therapeutic care-plan to pharmacist preceptors (evaluators) and fellow students. Evaluators scored each presentation using an 11-item instrument with a 6-point rating-scale. In addition, evaluators scored a global-item with a 4-point rating-scale. Rasch Measurement was used for scoring analysis, while Generalizability Theory was used for generalization analysis.
Findings: Thirty students each presented five cases, which were evaluated by 15 preceptors. Using Rasch Measurement, the 11-item instrument’s 6-point rating-scale did not function as intended; it functioned only once collapsed to a 4-point rating-scale. This revised 11-item instrument also showed redundancy among its items. By contrast, the global-item performed reasonably on its own. Using multivariate Generalizability Theory, the g-coefficient (reliability) for the series of five case-presentations was 0.76 with the 11-item instrument and 0.78 with the global-item. Reliability depended largely on the number of case-presentations and, to a lesser extent, on the number of evaluators per case-presentation.
Conclusions: Our pilot results confirm that scoring should be simple (in both scale and instrument). More specifically, the longer 11-item instrument provided measurement but showed redundancy, whereas the single global-item provided measurement over multiple case-presentations. Further, acceptable reliability can be achieved by trading off the number of case-presentations against the number of evaluators per case-presentation.
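The trade-off between the number of case-presentations and the number of evaluators can be illustrated with a simplified decision-study calculation. The following is a minimal sketch, not the multivariate analysis used in this study: it assumes a univariate design with evaluators nested within cases, and the function names and variance components are hypothetical values chosen only for illustration, not estimates from our data.

```python
def g_coefficient(var_person, var_person_case, var_residual, n_cases, n_raters):
    """Relative g-coefficient for an assumed persons x (raters:cases) design.

    var_person      : variance among students (the object of measurement)
    var_person_case : student-by-case interaction variance
    var_residual    : student-by-rater-within-case variance, plus residual error
    """
    if n_cases < 1 or n_raters < 1:
        raise ValueError("n_cases and n_raters must be at least 1")
    # Relative error shrinks as scores are averaged over more cases and raters.
    relative_error = (var_person_case / n_cases
                      + var_residual / (n_cases * n_raters))
    return var_person / (var_person + relative_error)


# Hypothetical variance components (assumptions, NOT results from the study).
HYPOTHETICAL = {"var_person": 0.30, "var_person_case": 0.40, "var_residual": 0.30}


def d_study_table(case_options, rater_options, components=HYPOTHETICAL):
    """Return {(n_cases, n_raters): g-coefficient rounded to 2 decimals}."""
    return {
        (c, r): round(g_coefficient(n_cases=c, n_raters=r, **components), 2)
        for c in case_options
        for r in rater_options
    }


if __name__ == "__main__":
    # Compare designs with more/fewer cases and more/fewer raters per case.
    for (c, r), g in d_study_table((3, 5, 10), (1, 2, 3)).items():
        print(f"cases={c:2d}  raters per case={r}  g={g:.2f}")
```

In this sketch, the student-by-case variance is divided only by the number of cases, so adding evaluators cannot reduce it. With components like these, adding case-presentations therefore raises the coefficient more than adding evaluators per case-presentation does.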