INTERVENTION, EVALUATION, AND POLICY STUDIES

Incorporating End-of-Course Exam Timing Into Educational Performance Evaluations

Pages 130-147 | Published online: 12 Jan 2015

Abstract

There is increased policy interest in extending test-based evaluations in K–12 education to include student achievement in high school. High school achievement is typically measured by performance on end-of-course exams (EOCs), which test course-specific standards in a variety of subjects. However, unlike standardized tests in the early grades, students take EOCs at different points in their schooling careers. The timing of the test is a choice variable presumably determined by input from administrators, students, and parents. Recent research indicates that school and district policies that determine when students take particular courses can have important consequences for achievement and subsequent outcomes such as advanced course taking. We develop an approach for modeling EOC test performance that disentangles the influence of school and district course-timing policies from other factors. Once the timing issue is separated out, better measures of the quality of instruction provided by districts, schools, and teachers can be obtained. Our approach also offers diagnostic value because it isolates the influence of course-timing policies from the other factors that determine student achievement.

Notes

In practice, districts need not bundle test taking with course taking. For example, students could take algebra-I in grade 9 and then take the algebra-I EOC in grade 11. Our analysis assumes course taking and test taking occur concurrently, which is what we expect to be the most common circumstance. Of course, policies could be enacted to force the bundling of course and test taking for EOCs.

Here, “instructional effectiveness” is a catchall phrase meant to cover a wide variety of factors that may affect student learning. Obviously, teacher effectiveness is one part of this measure, but it may also include other non–teacher-related factors such as curriculum choice.

It is implicit in our analysis that course-timing policies are largely at the discretion of districts. This view is consistent with the variation in course-timing policies that we observe across Missouri districts (see ) and supported by two studies by Clotfelter et al. (2012a, 2012b) using data from North Carolina.

The exact specification for the student-achievement model is not critical to the overall approach; for example, district fixed effects could be included directly in equation (1) if desired. Changes to the structure of the initial student-achievement model would require minor operational adjustments to subsequent steps in the process. One advantage of the two-step model as described in equations (1) and (2) is that it produces “proportional” district rankings (see Ehlert et al., in press).
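
To make the two-step structure concrete, here is a minimal sketch in Python, assuming a pandas DataFrame `df`; the column names (`map_math_lag`, `frl`, `district`, etc.) are hypothetical stand-ins for the lagged MAP scores and student covariates, not the authors' exact specification.

```python
# A minimal sketch of the two-step structure in equations (1) and (2),
# under assumed, illustrative column names.
import pandas as pd
import statsmodels.api as sm

# Step 1 (equation 1): regress the standardized EOC score on lagged MAP
# scores and student characteristics, with no district indicators.
X1 = sm.add_constant(df[["map_math_lag", "map_read_lag", "frl", "iep"]])
step1 = sm.OLS(df["eoc_score"], X1).fit()
df["resid"] = step1.resid

# Step 2 (equation 2): regress the step-1 residuals on a full set of
# district dummies with no intercept, so every district gets an estimate.
D = pd.get_dummies(df["district"], dtype=float)
step2 = sm.OLS(df["resid"], D).fit()
district_effects = step2.params  # average residual by district
```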

All MAP exam scores are standardized by year–grade–subject cell. The outcome variable (the EOC score) is also standardized by year to have a mean of zero and a standard deviation of one, although its standardization is not performed separately by grade level in order to preserve cross-grade-level performance gaps in the outcome measure. For a discussion of the vector of missing lagged score dummy variables (M_{ig(t-k)}), see the appendix. Exam retakers are included in the analytic sample that we use to estimate equations (1) and (2), and there is an indicator for retaking status included in X_{idgt}. Note that the inclusion of these students in equations (1) and (2) does not change our findings with regard to the course-timing effects, which are estimated separately using a procedure described in the next section (that excludes retakers).
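
A minimal sketch of these two standardizations, assuming a DataFrame `df` with hypothetical columns `year`, `grade`, `subject`, `map_score`, and `eoc_score`:

```python
# Standardize MAP scores within each year-grade-subject cell.
df["map_std"] = (
    df.groupby(["year", "grade", "subject"])["map_score"]
      .transform(lambda s: (s - s.mean()) / s.std())
)

# Standardize the EOC outcome by year only, so that cross-grade
# performance gaps are preserved in the outcome measure.
df["eoc_std"] = (
    df.groupby("year")["eoc_score"]
      .transform(lambda s: (s - s.mean()) / s.std())
)
```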

The by-grade-level estimation is useful because it allows for heterogeneity in the predictive power of available covariates for students who take EOCs in different grades. As a specific example, if the model uses standardized math scores in grades 6, 7, and 8 to predict the EOC score in algebra-I, the predictive power of these prior scores is allowed to vary depending on whether students take the EOC in grade 9 or grade 10. The differing gaps between the lagged exam scores and the outcome variable may affect the precision of the estimates in the higher grades, but the by-grade-level estimation should limit concerns about bias, particularly at the district level.
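
Operationally, this amounts to fitting equation (1) separately within each test-taking grade; a sketch under the same hypothetical column names as above:

```python
import statsmodels.api as sm

# Fit equation (1) separately for each grade in which students take the
# EOC, so the coefficients on the prior scores can differ by grade.
models = {}
for grade, sub in df.groupby("eoc_grade"):
    Xg = sm.add_constant(sub[["map_g6", "map_g7", "map_g8"]])
    models[grade] = sm.OLS(sub["eoc_score"], Xg).fit()
```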

Equation (2) is estimated without an intercept so that effect estimates and standard errors are calculated for every district. The effect estimates are simply the average of the residuals assigned to the given district, and the standard errors are robust to heteroskedasticity and clustered at the student level to account for retakers. Shrinkage is applied via the method used in Koedel, Leatherman, and Parsons (2012).
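
For intuition, here is a generic empirical-Bayes shrinkage sketch of the kind common in this literature; it is not necessarily the exact Koedel, Leatherman, and Parsons procedure, and `theta_hat` and `se` are assumed NumPy arrays of district effect estimates and standard errors from equation (2).

```python
import numpy as np

total_var = np.var(theta_hat, ddof=1)              # variance of the raw estimates
signal_var = max(total_var - np.mean(se**2), 0.0)  # net out average sampling error
reliability = signal_var / (signal_var + se**2)    # district-specific weight in [0, 1]
theta_shrunk = reliability * theta_hat             # noisier estimates shrink toward zero
```

Districts with imprecise estimates (large `se`) receive low reliability weights and are pulled toward the mean, which guards against extreme rankings driven by sampling error.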

Students who took the algebra-I EOC before grade 7 were excluded from the model. These students represent a very small fraction of the overall sample (≈0.1%; see ).

Limiting comparisons to be between students taking the course in the same grade is also important for models at the school and teacher levels (we elaborate on this point in a later section, Extensions to School- and Teacher-Level Models).

A separate issue is that the EOC is administered up to three times during the academic year in Missouri (fall, spring, and summer). We do not take up the issue of “within-academic-year” test timing in this study because supplementary analysis suggests it is a second-order issue. One reason is that the vast majority of students take their EOCs in the spring (in 2011–2012 and 2012–2013, 93.6% of Missouri students who took the algebra-I EOC took it in the spring, 5.4% took it in the fall, and 1.0% took it in the summer). In results omitted for brevity, we also directly estimated the effect of within-academic-year timing on achievement using an approach analogous to the one outlined below for our main analysis of grade-level timing (focusing on the fall and spring test dates) and found that within-academic-year timing is not an important determinant of achievement. More information is available from the authors upon request.

Again, recall that students who take the EOC prior to grade 7 are excluded from our analysis (≈0.1% of the students in Missouri; see ).

Given that students who have previously taken the EOC are not included in the estimation of equations (3) and (4), X_{idgt} excludes the indicator for retaking the exam.

All standard errors in are clustered at the district level and calculated to be robust in the presence of heteroskedasticity.

Note that the course-timing adjustment parameters are treated as deterministic in equation (5). The fact that the adjustment parameters are estimated with error can be accounted for directly if desired.

There are alternative ways to illustrate this information. For example, in unreported results we consider a scenario where the state would like to identify the top and bottom 10% of districts in terms of EOC performance. Moving from the case where we do not account for course timing to the case where we do account for course timing (from the left to right panel in ) results in 5 of the 51 districts in the original top 10% and 7 of the 50 districts in the original bottom 10% being replaced.

An alternative concern is that the fixed course-timing adjustments could become biased over time because they would not account for changes in the testing instrument, demographics, instructional quality, etc., at different grade levels. If this is a concern, these parameters could be periodically updated, perhaps with some smoothing, with the trade-off that the updated parameters would potentially be influenced by districts’ behavioral responses to the evaluation system.

Districts with fewer than 20 students are excluded from .

For the declining districts, the opposite holds true. These districts have the vast majority of their students taking the course in the optimal grades and, as such, do not receive much in the way of penalty forgiveness.

Even this story does not seem particularly likely. Our use of district-level course placement percentages rather than school-level percentages means that the teacher quality differentials would have to vary substantially between districts to invalidate the instruments, yet most of the variance in teacher quality occurs within schools (Hanushek & Rivkin, 2012). Furthermore, the fact that our models condition on district characteristics means that the cross-district variance in teacher quality must not be highly correlated with observable district characteristics in order to confound our instrumental-variables estimates. A related issue is that teacher quality might be systematically higher in some grades relative to others in Missouri, for example, in grades 9 and 10. If this were the case, then differences in teacher quality across grades would be a mechanism for the course-timing effects we estimate. However, the likelihood that our findings are strongly driven by cross-grade differences in teacher quality seems low given our OLS estimates and the corroborative findings from Clotfelter et al. (2012a, 2012b), with their 2012b study being particularly compelling because it relies on an abrupt policy change for identification (in the case of an abrupt policy change it is unlikely that there will be a wholesale change in personnel, but rather a change in which teachers teach in which grades).

Our work could be extended to formally apply the techniques laid out in Conley et al. (2012), who provide a rigorous framework for examining the sensitivity of IV estimates to deviations from the exact exclusion restriction.

An added advantage of the method presented in this article from the standpoint of designing an evaluation system is that no student records are systematically excluded from the model (although retakers are excluded from the estimation of equations [3] and [4]). This is in contrast to the method used in Clotfelter et al. (2012a), in which district-by-prior-achievement cells are removed from the analysis if they do not have enough variance over time to rule out random enrollment fluctuations, a procedure implemented to help limit endogeneity concerns and improve the case for the instruments being valid. Educational administrators and policymakers often place considerable weight on “inclusion” considerations for political reasons. Such considerations are typically of less importance to researchers.

That said, substantial challenges remain in developing teacher-level performance measures based on student EOC exam scores beyond simply accounting for course-timing effects. A central concern is how to deal with more complex student tracking (particularly within-grade), an issue discussed in recent studies by Anderson and Harris (2013) and Jackson (2014).

Although policymakers await stronger evidence on this issue, they may still choose to develop incentives for school districts to discourage late algebra-I course taking. Kane (2013) provides a rationale for why this might occur. In short, the issue is that the standard hypothesis testing framework is not well-suited for some policy decisions.

A similar strategy is applied by Clotfelter et al. (2012a, 2012b) in assigning exam scores for students who never take the EOC.
