Evidence suggests that retake opportunities do not result in large positive effects on scores for borderline students. Our own data suggest that even large score improvements did not translate into improved exam outcomes.
Applicability of Assessment Scores
During the development phase of a competency-based exam, select elements of a professional competency document will influence exam design. Regulators and exam design experts collaborate to determine the ideal sample of professional competencies that can be assessed and in which test format. This sampling process must be aligned with the intended purpose of the exam. For example, an exam to evaluate entry-to-practice-level competency may sample more competencies than an exam to evaluate readiness for admission to a professional training program. Therefore, the results of any competency-based exam should be interpreted in light of changes to a profession’s competency standard, current societal expectations and the kinds of competencies being evaluated. For example, competencies that are highly context-specific, such as procedural and technical clinical skills, may degrade more rapidly over time without formal practice than competencies like communication.
Retakes of High Stakes Assessments
Most of the evidence and discussion related to the value of retakes comes from the education context, where the focus of the discussion is how to ensure the success of the student. In this context, a number of findings are important to understanding the success rates of repeated assessments, particularly the necessity for remediation prior to retakes, and the student’s level of performance on the original attempt. We have synthesized and translated these findings and highlight three general themes of evidence to consider when making decisions on exam retakes: understanding self-assessment and standardized assessment; the likelihood of improving test scores after retakes; and the implications of retake success rates on patient safety.
Understanding self-assessment and standardized assessment
Many unsuccessful candidates intuitively believe that opportunities for retaking an exam will result in significantly improved outcomes. In other words, their self-assessment suggests they should have performed better and passed. Unfortunately, these beliefs are not necessarily accurate, but it is important to understand their source and veracity. First, candidates tend to be highly anxious in a high stakes exam, and as a consequence feel that if only they were more relaxed they could have focused more and would have performed better. The evidence on this issue is mixed, but it seems that high levels of anxiety tend to negatively impact lower performing students more than higher performing students (1,2). In other words, those with the ability to handle a lot of information overload, because they are more experienced or skilled, can handle the increase in anxiety and will be fine. Those who cannot handle a great deal of information will do worse when anxious compared to when they are not anxious. Second, candidates’ self-assessment tends to be seriously misaligned with reality (3). This is especially true for lower performing health professionals, where confidence is a poor measure of competence. These candidates have often self-identified which skills and knowledge were relevant to study prior to an exam, and they may be poorly situated to make those decisions. Third, candidates’ expectations for the structure of the assessment may impact their ability to be objective. At Touchstone Institute, high stakes assessments typically incorporate the objective structured clinical examination (OSCE). While the OSCE is a prevalent exam format in North America, it can be quite confusing and overwhelming to many who are internationally trained and not acquainted with the OSCE format.
The likelihood of improving test scores after retake
For a retake opportunity to be of most value to candidates, they need to be aware of the specific nature and degree of improvements they need to make in order to bridge the gap between their initial results and the level needed for success (4). As well, regulators need to be aware of the value of remediation to candidates scoring at different levels of performance. We focus on the performance levels indicative of the clear pass, the clear fail and the borderline candidate. The clear fail and borderline groups of candidates would be the only groups seeking a retake opportunity.
Published analyses regarding retakes following remediation suggest that the effects of remediation vary according to the student’s level of preparation. In essence, remediation may help or hinder students differently depending on their levels of academic preparedness. Self-directed remediation for students at the marginal or borderline level has mostly negative effects (5). This may reflect the challenges of identifying the highly context-specific key gaps to address and of developing appropriate remediation, but this nevertheless remains a barrier for borderline candidates. According to published work, the impact of remediation for students at the clear fail level is generally positive, but does not result in a clear pass (5,6).
The likelihood of passing on a second attempt may depend on the quality of the exam (6). In other words, some borderline candidates’ performance may improve by chance if the exam is not designed to a sufficiently high standard. Exams designed at Touchstone Institute follow strict validation processes and consistently exceed the recommended standard of a Cronbach’s alpha of 0.7. This indicates a high level of discrimination power to detect differences between the clear pass and the clear fail groups. Studies conducted at Touchstone Institute of a high stakes exam for internationally graduated health professionals indicated a very low rate of success at improving either the performance- or knowledge-based score, thus indicating the accuracy of the initial performance assessment. If the exam were a less accurate measure of competence, we should see a different pattern of performance on retakes. While this may also reflect a lack of understanding regarding what skills and knowledge to remediate, evidence suggests that candidates have the best chance to improve on the first retake and that success rapidly drops after that (6). With increasing time since graduation and with decreasing access to training opportunities, the likelihood of improving scores begins to decline. Retake policies and examinee decisions must take into account this evidence suggesting low improvement rates, particularly regarding exam outcomes.
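For readers less familiar with the reliability statistic cited above, the following is a minimal sketch of how Cronbach’s alpha can be computed from a table of candidate scores. It is illustrative only and does not represent Touchstone Institute’s actual scoring procedure: the function name `cronbach_alpha`, the layout of the input (one row per candidate, one column per exam item or station) and the sample scores are all assumptions made for this example.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Return Cronbach's alpha for a table of scores.

    `scores` is a list of rows, one per candidate; each row holds that
    candidate's score on every item (for example, every OSCE station).
    This layout is an assumption made for illustration.
    """
    n_candidates = len(scores)
    n_items = len(scores[0])
    if n_candidates < 2 or n_items < 2:
        raise ValueError("need at least two candidates and two items")

    # Sample variance of each item across candidates.
    item_variances = [
        variance([row[i] for row in scores]) for i in range(n_items)
    ]
    # Sample variance of each candidate's total score.
    total_variance = variance([sum(row) for row in scores])
    if total_variance == 0:
        raise ValueError("total scores do not vary; alpha is undefined")

    # alpha = k / (k - 1) * (1 - sum of item variances / total variance)
    return (n_items / (n_items - 1)) * (
        1 - sum(item_variances) / total_variance
    )


# Hypothetical scores for three candidates on two items.
example_scores = [[2, 3], [4, 5], [6, 5]]
alpha = cronbach_alpha(example_scores)
print(f"Cronbach's alpha: {alpha:.2f}")
```

A value is then compared against the chosen threshold, such as the 0.7 standard mentioned above. Values closer to 1 indicate that the items rank candidates more consistently.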
Implications of retake success rates on patient safety
We could not find evidence that evaluated the long-term quality of health professionals who achieved licensure after a second or multiple attempts. Generally, studies tend to find a relationship between lower exam scores and the frequency of subsequent complaints from patients (7). In an education context, some would argue that students should be allowed multiple resits or retakes until their score improves (5,6). There is often little to no financial cost to the student in this context so this approach raises little concern. In a high stakes context, where the goal is to advance only the most competent individuals in a fair and unbiased manner, the answer is not so simple. When answering to the very high standard of patient safety, there is less enthusiasm for allowing multiple retakes, but there is no clear evidence for this caution.
While it may be considered ethical and fair to permit candidates to retake an examination, it is imperative to provide them with the necessary information to help them weigh the potential benefits of a retake against financial and other factors. A retake may be most effective when there has been sufficient time to address appropriate deficiencies, particularly for skills-oriented clinical performance examinations, and when there is a realistic possibility of score improvement. As a general practice, we would support allowing one opportunity to retake a high stakes exam, given sufficient time to improve preparedness.
1. Owens M, Stevenson J, Hadwin JA, Norgate R. When does anxiety help or hinder cognitive test performance? The role of working memory capacity. British Journal of Psychology. 2014 Feb 1;105(1):92-101.
2. Johnson DR, Gronlund SD. Individuals lower in working memory capacity are particularly vulnerable to anxiety's disruptive effect on performance. Anxiety, Stress, & Coping. 2009 Mar 1;22(2):201-13.
3. Davis DA, Mazmanian PE, Fordis M, Van Harrison RT, Thorpe KE, Perrier L. Accuracy of physician self-assessment compared with observed measures of competence: a systematic review. JAMA. 2006 Sep 6;296(9):1094-102.
4. Wormeli R. Redos and retakes done right. Educational Leadership. 2011 Nov 1;69(3):22-6.
5. Arnold I. Resitting or compensating a failed examination: does it affect subsequent results? Assessment & Evaluation in Higher Education. 2017 Oct 3;42(7):1103-17.
6. Pell G, Fuller R, Homer M, Roberts T. Is short-term remediation after OSCE failure sustained? A retrospective analysis of the longitudinal attainment of underperforming students in OSCE assessments. Medical Teacher. 2012 Feb 1;34(2):146-50.
7. Tamblyn R, Abrahamowicz M, Dauphinee D, Wenghofer E, Jacques A, Klass D, Smee S, Blackmore D, Winslade N, Girard N, Du Berger R. Physician scores on a national clinical skills examination as predictors of complaints to medical regulatory authorities. JAMA. 2007 Sep 5;298(9):993-1001.