Abstract
A critical component of the standard setting process is collecting evidence to evaluate the recommended cut scores and their use for making decisions and classifying students based on test performance. Kane (1994, 2001) proposed a framework by which practitioners can identify and evaluate evidence of the results of the standard setting from (1) the procedural elements of the study, (2) the internal consistency of the recommendations, and (3) the external consistency of the impact or results of other measures of examinee performance. For many programs, the availability of external validity evidence is limited due to the nature of the testing program. This is particularly the case for national testing programs in developing nations or international programs that span diverse populations across the world. In this article, we review two plausible approaches for identifying and evaluating external validity evidence in settings where other national or international benchmarks may not be available to guide policymakers. Each approach is presented along with a demonstration of how it could be applied in a case study from a national testing program.