
Resitting or compensating a failed examination: does it affect subsequent results?

Abstract

Institutions of higher education commonly employ a conjunctive standard setting strategy, which requires students to resit failed examinations until they pass all tests. An alternative strategy allows students to compensate a failing grade with other test results. This paper uses regression discontinuity design to compare the effect of first-year resits and compensations on second-year study results. We select students with a similar level of knowledge in a first-year introductory course and estimate the treatment effect of a resit on the result for a second-year intermediate course in the same subject. We find that the treatment effect is positive, but insignificantly different from zero. Additional results show that students’ overall second-year performance is insignificantly related to the number of compensated failing grades in their first year. The number of attempts that students need to complete their first year does not have a significant positive effect on second-year performance. We conclude that the evidence for a positive effect of resits on learning is weak at best.

Introduction

Pass or fail decisions are ubiquitous in higher education. Institutions of higher education need to decide whether students qualify for an academic degree or for advancement to a higher level of study. These decisions should be based on valid and reliable assessments of academic performance (van de Watering and van der Rijt 2006). The large literature on standard setting in assessment testifies to the importance of high-quality testing. However, a consensus on the best standard setting method is still lacking (Friedman Ben-David 2000; Downing, Tekian, and Yudkowsky 2006). In practice, various standard setting methods are used to decide on a passing score for an individual test, ranging from relative or norm-referenced methods to absolute or criterion-referenced methods (Norcini 2003). On top of this, a standard setting strategy is needed to determine how multiple test scores are combined to arrive at a pass or fail decision related to degree-granting, certification or admission to a next study phase. Compared to standard setting methods, standard setting strategies have received much less attention in the literature (Haladyna and Hess 1999).

The choice of standard setting strategy will determine whether low performance on a single test can be compensated with other test results, as is done under a compensatory strategy, or whether a student will have to pass every single test, as is required under a conjunctive strategy. In the latter case, students are typically offered the opportunity to do a resit at a later point in time. In this way, a disjunctive feature is built into the conjunctive strategy. From a psychometric perspective, the optimal choice of standard setting strategy depends on a host of factors, including the reliability of the test, the correlation among tests and the number of resits allowed (Yocarini 2015). In general, a compensatory strategy yields more false positive errors (an observed pass while the student should have failed), whereas conjunctive strategies result in more false negative errors (an observed fail while the student should have passed) (Haladyna and Hess 1999). Educators using a compensatory strategy are willing to tolerate poor performance in a single test if a student's overall performance is satisfactory. In comparison, the conjunctive strategy requires students to do resits until they pass all tests. This strategy is perceived as more rigorous (and therefore more appealing) by many instructors, legislators and policy-makers. As students advance in their study and specialise in a field in which they want to practise, educators' willingness to tolerate poor performance will be low and the compensatory strategy will be inappropriate. Earlier on in their study path, when students still need to make study choices based on their talent and skills, a compensatory strategy typically will be more applicable.

The academic literature offers little evidence for this preference of resits over compensation. Ricketts (2010) observes that the literature on resits is sparse and that there is no ‘theory of resits’ (351). He argues that resit scores should be combined with the results from the main examination, as the limited research findings inhibit the proper interpretation of resit scores. Pell, Boursicot, and Roberts (2009) are also critical of the practice of resitting. Comparing the results on main assessments and resits in two medical schools, they find that resit students are able to improve their scores substantially. They also note that the resit pass rate is comparable to the pass rate for the main assessment, which fits ill with their observation that the resit students have ‘demonstrated academic ability in the lower tail of the distribution’ (249). Pell, Boursicot, and Roberts (2009) therefore raise the question of whether resits are too easy. A recent study by Proud (2015) uses regression discontinuity design (RDD) to estimate the effect of resits on future academic results. He finds that resits have no significant effect on future grades, except for students who do relatively well in the resit examination.

Ostermaier, Beltz, and Link (2013) add to the literature by discussing the effect of resits on incentives. While students’ additional study effort for resits may improve their level of learning and thereby increase the probability of academic success, ample resit opportunities also allow students to postpone their study effort. Procrastination is widely prevalent among students in higher education. It goes beyond resitting and may also show up in non-test student behaviour, such as attendance and participation. A large majority of students report postponing academic tasks and engaging in distractions (Schouwenburg 1995; Grunschel, Patrzek, and Fries 2013). Survey evidence also indicates that academic procrastination leads to stress and lower grades (Tice and Baumeister 1997). According to Ostermaier, Beltz, and Link (2013), students who are offered the opportunity to do resits ‘may also lose motivation and perform even worse as they retake the exam’ (24). Their evidence shows that students who need more test attempts score lower and are more likely to fail. While this result can be driven both by differences in ability and by the negative effect of procrastination, it does raise doubts about whether allowing multiple resits will always help students to succeed. Their study also illustrates that the choice of standard setting strategy is not just an exercise in psychometrics, but may also affect the behaviour of students.

The present paper contributes to this literature by empirically investigating the effect of requiring students to resit an examination, instead of allowing them to compensate a failing score, on future study results. If the increased learning from studying for the resit outweighs any negative effects arising from procrastination, one would expect a positive effect on future grades. Our study is closely related to Proud (2015), both methodologically and in terms of problem statement. Our research design differs, however, in one important way. Proud (2015) focuses on the effect of the resit on future results for a sample of students who eventually all achieve grades above the pass mark (with or without resits). In contrast, our design compares students who use a resit to achieve a passing grade with students who compensate their failing grade. As such, the current study is better placed to compare a conjunctive with a compensatory strategy. We have no prior hypothesis as to which group of students will show better second-year study results. Compared to Proud (2015), this study has a much larger sample size and is also able to exploit the tight substantive connection between a number of introductory first-year courses and intermediate second-year courses.

We test the effect of resitting on future results using a data-set of seven cohorts over the period 2007–2013. As the prevailing examination rules apply to all students indiscriminately, a purely experimental design is not possible. We therefore employ a quasi-experimental RDD (Jacob and Lefgren 2004; Feng and Graetz 2015) by estimating a local linear regression model around the threshold above which students can compensate a failing grade. We find that the effect of a resit is positive, but insignificantly different from zero.

As a caveat, we note that in this study the standard setting strategy is based on tests and examinations. This implies that our findings cannot be extrapolated to educational settings where pass or fail decisions are based on non-examination behaviours.

Setting and data

Setting

This study is conducted at Erasmus School of Economics, which is part of Erasmus University Rotterdam, one of the research universities in the Netherlands. We use data from the bachelor programme in Economics & Business, which is the largest programme in the school. The nominal duration of this programme is three years. The first two years consist of obligatory core and support courses. In their third year, students specialise by choosing a minor and a major. Most courses are delivered using a combination of large-scale lectures and small-scale tutorials. During the period of investigation, no major changes in the educational system have taken place.

The curriculum is organised into five eight-week modules. Each module consists of an 8-credit core course and a 4-credit support course. In the first year the 8-credit courses focus on core economics content, i.e. accounting, microeconomics, macroeconomics, marketing and organisation. The 4-credit support courses are mathematics (I & II), ICT skills, statistics I and financial information systems. In the second year, the 8-credit courses are international economics, finance, applied microeconomics, intermediate accounting and methodology. The 4-credit courses include behavioural economics, statistics II, history of economic thought, economics & taxation and a research project. Three second-year courses (applied microeconomics, intermediate accounting and statistics II) have an especially strong substantive connection to the corresponding first-year courses (microeconomics, accounting and statistics I). We will use this feature in the research design.

For most courses, grades are determined on the basis of interim tests (during the module) and a final examination. In the Dutch system, grades are between 0 and 10, where a grade of 8 or higher corresponds to an ‘A’, a grade of 7 corresponds to a ‘B’ and 5.5 is the cut-off point for a passing grade. The examination rules allow for compensation between first-year courses, so that a maximum of three ‘failed’ grades can be compensated by higher grades for related courses. To this end, the curriculum has been divided into clusters of related courses (economics courses, business courses and support courses). Each cluster allows one grade between 4.5 and 5.5 to be compensated, provided the cluster grade point average (GPA) remains at least 5.5. Failing grades below 4.5 cannot be compensated. The result is a hybrid standard setting strategy, in which the examination rules contain both a compensatory and a conjunctive feature. Resits are limited to a maximum of three out of 10 courses and take place during the summer period to discourage their use. In this way, the resit is positioned as an opportunity of last resort instead of a regular examination opportunity. The 4.5 threshold for compensation is another feature that will be used in the research design.
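As an illustration, the sketch below encodes the cluster rule as we read it from the text. It ignores credit weights and the curriculum-wide cap of three compensations, and the function name and list representation are our own, not the school's systems.

```python
def can_compensate(cluster_grades):
    """Check whether one cluster of related first-year courses passes
    under the compensation rule described above (unweighted sketch)."""
    compensable = [g for g in cluster_grades if g < 5.5]
    if any(g < 4.5 for g in compensable):
        return False  # grades below 4.5 can never be compensated
    if len(compensable) > 1:
        return False  # at most one grade per cluster may be compensated
    # The cluster GPA must remain at least a passing 5.5.
    return sum(cluster_grades) / len(cluster_grades) >= 5.5

print(can_compensate([4.7, 6.5, 7.0]))  # True: one compensable grade, GPA ~6.1
print(can_compensate([4.3, 6.5, 7.0]))  # False: 4.3 is below the 4.5 floor
```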

An academic dismissal policy has applied to the first bachelor year throughout the sample period. Dutch law allows institutions of higher education to give first-year students a so-called negative ‘binding study advice’, which implies that underperforming students are dismissed from the programme and are not allowed to reregister for it in the next three years. The purpose of the binding study advice is to prevent students from spending too much time pursuing a programme for which they do not have the skills, talent or motivation. The minimum number of credits that students needed to avoid a negative binding study advice was 40 out of 60 until 2012; since 2012, students need the full 60 credits. These minimum standards include the credits for courses which students can compensate. The analysis below focuses on the group of students who have completed their first bachelor year and investigates the effect of resits and compensations on subsequent performance in their second year. We refrain from including third-year academic results in the analysis, as students’ choices with regard to their minor and major courses lead to a large heterogeneity in courses taken. As a consequence, students’ third-year performance suffers from a lack of comparability.

Data

The student-level data are collected from the school’s information system. These include the course grade (Grade) and the number of attempts that a student needs to pass or compensate a course (#Attempts). The variable Comp records whether a result has been compensated. It takes on the value one if a course is compensated, and zero otherwise. For each student, we calculate a credit weighted grade point average for the first (GPA(B1)) and for the second year (GPA(B2)).
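As a concrete illustration, a credit-weighted GPA of this kind can be computed from course-level records as in the sketch below; the DataFrame layout and column names are hypothetical stand-ins, not the school's information system.

```python
import pandas as pd

# Hypothetical course-level records; column names are our own stand-ins.
results = pd.DataFrame({
    "student": [1, 1, 1, 2, 2, 2],
    "year":    ["B1", "B1", "B2", "B1", "B1", "B2"],
    "credits": [8, 4, 8, 8, 4, 8],
    "grade":   [6.0, 7.5, 6.5, 5.5, 8.0, 7.0],
})

# Weight each grade by its course credits, then average per student and
# year, as in GPA(B1) and GPA(B2).
weighted = results.assign(wg=results["grade"] * results["credits"])
sums = weighted.groupby(["student", "year"])[["wg", "credits"]].sum()
gpa = (sums["wg"] / sums["credits"]).unstack()
print(gpa)  # one row per student, one column per year
```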

The literature on study success in economics identifies a number of intervening variables that may influence students’ academic performance. Ballard and Johnson (2004), Lagerlöf and Seltzer (2009) and Arnold and Straten (2012) find that mathematics skills acquired at secondary school are an important determinant of success in economics study. GPA scores in preparatory education are also known to explain study success (Clauretie and Johnson 1975; Park and Kerr 1990; Ballard and Johnson 2004; Johnson and Kuennen 2006). We have experimented with including measures of secondary school performance in our model for second-year study results. However, these measures turned out to be insignificant in specifications that also include first-year results. In other words, first-year GPA subsumes the information contained in students’ secondary school performance. We therefore dropped the school variables from our model.

Gender is often thought to have an impact on student performance in economics as well, but the evidence is mixed. Johnson and Kuennen (2006) find that female students earn significantly higher grades than male students in a statistics class. In contrast, Ballard and Johnson (2004) show that male students outperform female students in introductory economics. In a follow-up study, Ballard and Johnson (2005) relate this gender gap to women’s low expectations about their ability to succeed in economics. Swope and Schmitt (2006) find no evidence of a gender effect in economics. A recent meta-analysis by Johnson, Robson, and Taengnoi (2011) concludes that the gender gap in economics has decreased over time. We include the binary variable Gender, which takes on the value one for male and zero for female students. We also include students’ age (Age) as an intervening variable, following for example Harmon and Lambrinos (2008). The bachelor programme in Economics & Business is offered in two languages: Dutch for the national market and English for international students. As these groups may have different student characteristics, we include the variable IntProg, which takes on the value one for students in the international group and zero for students in the Dutch group.

The sample includes 4805 students from seven cohorts (2007–2013). Of these, 3042 students have completed their first bachelor year and 1696 students have completed both their first and second bachelor year. Table 1 provides descriptive statistics for the population of students who have completed the first year. The GPA averages between 6.5 and 7.0 in both years. On average, students need 0.88 compensations and 14.22 attempts to complete their first year. As the first-year curriculum consists of 10 courses, the latter number comes down to an average of 4.22 resits. A large majority of students (71%) are male. A quarter of the students are enrolled in the international programme. The age at the start of the programme averages 19 years. Finally, Table 1 reports the means and standard deviations for the courses that we will use in the RDD.

Table 1. Descriptive statistics.

Figures 1 and 2 provide a first glance at the relationship between resits, compensations and second-year performance. For the population of students who completed their first and second year, Figure 1 plots GPA(B2) by the number of attempts and compensations needed in the first year. Students who finish their first year using 10–12 attempts and no compensations have an average GPA(B2) higher than 7.2. This number declines rapidly once students use more compensations and attempts, suggesting negative relationships with GPA(B2). An important driver behind this pattern is student quality, which can be measured by first-year performance. Good students tend to do well both in their first and second year, and good students also tend not to need any compensations or resits. That, however, is not the effect we are interested in here. This paper is about the differential effect of resits and compensations on future results, given the quality of the student. Figure 2 accounts for the effect of student quality in a crude way, by focusing on the population of below-average students, with GPA(B1) ≤ 6.5. Compared to Figure 1, the bar chart is much flatter and the negative relationships between resits, compensations and GPA(B2) are more difficult to discern by eye. The large difference between Figures 1 and 2 highlights the importance of controlling for student quality and possible other intervening variables.

Figure 1. GPA(B2) by number of attempts and compensations in B1, all students.


Figure 2. GPA(B2) by number of attempts and compensations in B1, students with GPA(B1) ≤ 6.5.


Methodology

Preliminary regressions

As a first pass at estimating the effect of resits versus compensations on second-year study results, we start with Equation (1):

$$GPA(B2)_i = b_0 + b_1\,\#Attempts(B1)_i + b_2\,\#Comps(B1)_i + b_3\,GPA(B1)_i + b_4\,IntProg_i + b_5\,Gender_i + b_6\,Age_i + \varepsilon_i \qquad (1)$$

where subscript i denotes the student and #Attempts(B1)_i and #Comps(B1)_i are, respectively, the number of examination attempts and compensations that student i needs to complete the first year. All other variable names are as previously defined. GPA(B1) summarises a number of factors which determine study success, such as human capital, students’ motivation to study economics or their self-efficacy. As such, it is our, admittedly imperfect, measure of student quality. We also estimate a specification excluding GPA(B1) to show the effect of controlling for student quality on the results. IntProg, Gender and Age are included as intervening variables. Equation (1) is estimated with ordinary least squares (OLS) for the group of students who have successfully completed both their first and second bachelor year.
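As a minimal sketch of this estimation step, the snippet below fits Equation (1) with and without GPA(B1) on simulated data; all column names and the data-generating process are our own illustrations, not the paper's data or code.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
n = 500
# Synthetic stand-in for the student-level sample; names are our own.
df = pd.DataFrame({
    "gpa_b1": rng.normal(6.7, 0.7, n),
    "n_attempts_b1": 10 + rng.poisson(4, n),  # attempts across 10 courses
    "n_comps_b1": rng.integers(0, 4, n),
    "int_prog": rng.integers(0, 2, n),
    "gender": rng.integers(0, 2, n),
    "age": rng.normal(19, 1, n),
})
df["gpa_b2"] = 0.8 * df["gpa_b1"] + rng.normal(0, 0.5, n)

# Equation (1), estimated without and with the student-quality control GPA(B1).
m1 = smf.ols("gpa_b2 ~ n_attempts_b1 + n_comps_b1 + int_prog + gender + age", df).fit()
m2 = smf.ols("gpa_b2 ~ n_attempts_b1 + n_comps_b1 + gpa_b1 + int_prog + gender + age", df).fit()
print(m2.params)
```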

Our next step is to exploit the close connection between the introductory (B1) and intermediate (B2) courses in Microeconomics, Accounting and Statistics using the following panel regression model:

$$Grade(B2)_{i,j} = b_{0,j} + b_{1,i} + b_2\,\#Attempts_{i,j} + b_3\,Comp_{i,j} + b_4\,(\#Attempts_{i,j} \cdot Comp_{i,j}) + b_5\,Grade(B1)_{i,j} + b_6\,GPA(B1)_i + b_7\,IntProg_i + b_8\,Gender_i + b_9\,Age_i + \varepsilon_{i,j} \qquad (2)$$

where subscript i denotes the student and subscript j denotes the subject area (i.e. Microeconomics, Accounting or Statistics). Course fixed effects are denoted by b_{0,j}; student fixed effects are denoted by b_{1,i}. Student fixed effects are included to control for the underlying ability of the individual student. The variable #Attempts_{i,j} measures the number of attempts that student i needs to complete the introductory course in subject j. Comp_{i,j} is a dummy variable that equals one if student i has compensated a failing grade for the introductory course in subject j and zero otherwise. The interaction term #Attempts_{i,j} ⋅ Comp_{i,j} is included to measure a possible interaction effect between compensation and the number of attempts: for future results, it may matter if a student needs multiple attempts to achieve a compensated grade. Grade(B2)_{i,j} denotes the grade of student i for the intermediate course in subject j; similarly, Grade(B1)_{i,j} denotes the grade of student i for the introductory course in subject j. To account for possible variation in test difficulty across cohorts, Grade(B2)_{i,j} and Grade(B1)_{i,j} have been demeaned by subtracting the average grade of students who have taken the same examination. GPA(B1) is again included to account for differences in student quality. As before, IntProg, Gender and Age are included as intervening variables.

We estimate four specifications based on Equation (2), by either including or excluding the student fixed effects and the interaction term. For computational reasons, student fixed effects are estimated by demeaning the variables using the within transformation. All specifications are estimated by OLS. Standard errors are calculated using the White cross-section method, which assumes that the errors are contemporaneously cross-sectionally correlated.
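The within transformation can be sketched as follows. The synthetic panel and column names are our own, and clustering by subject is only a rough stand-in for the White cross-section errors used in the paper.

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm

rng = np.random.default_rng(1)
students, subjects = 300, ["micro", "accounting", "statistics"]
# Synthetic student-subject panel standing in for the data described above.
panel = pd.DataFrame([(s, j) for s in range(students) for j in subjects],
                     columns=["student", "subject"])
ability = rng.normal(0.0, 0.6, students)
panel["grade_b1"] = ability[panel["student"]] + rng.normal(0, 0.8, len(panel))
panel["n_attempts"] = 1 + (panel["grade_b1"] < 0).astype(int)
panel["comp"] = rng.integers(0, 2, len(panel))
panel["grade_b2"] = ability[panel["student"]] + rng.normal(0, 0.8, len(panel))

def within(df, cols, by):
    # Within transformation: demean each variable by student, which
    # absorbs the student fixed effects b_{1,i} described in the text.
    out = df.copy()
    out[cols] = df[cols] - df.groupby(by)[cols].transform("mean")
    return out

fe = within(panel, ["grade_b2", "n_attempts", "comp", "grade_b1"], by="student")
X = sm.add_constant(fe[["n_attempts", "comp", "grade_b1"]])
# Cluster-robust errors by subject; an approximation, not the exact estimator.
res = sm.OLS(fe["grade_b2"], X).fit(
    cov_type="cluster",
    cov_kwds={"groups": fe["subject"].astype("category").cat.codes})
print(res.params)
```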

Regression discontinuity design

The main issue with the non-experimental methods outlined above is that they do not completely solve the problem identified in Figures 1 and 2. Since below-average first-year students are more likely to need resits or compensations and to continue performing below average in their second year, the coefficients of #Attempts_{i,j} and Comp_{i,j} may be biased. Even if resits or compensations do not have a negative effect on future performance, students not needing them will tend to perform better, simply because resits and compensations are predominantly used by below-average students. The inclusion of GPA(B1) in Equations (1) and (2) is an imperfect way to address this endogeneity.

RDD is a quasi-experimental design that tries to uncover the causal effect of interventions by making use of arbitrary thresholds. By comparing observations lying closely on either side of a treatment threshold, one can estimate a treatment effect. Hahn, Todd, and van der Klaauw (2001) show that local linear regressions provide a convenient non-parametric way of estimating the treatment effect in an RDD. To this end, the following regression can be estimated:

$$Y_i = \beta_0 + \beta_1\,(X_i - c) + \beta_2\,D_i\,(X_i - c) + \beta_3\,D_i + \varepsilon_i, \qquad c - h \le X_i \le c + h \qquad (3)$$

where

$$D_i = \begin{cases} 1 & \text{if } X_i \ge c \\ 0 & \text{otherwise.} \end{cases}$$

In Equation (3), c is the treatment threshold and D is the binary variable that equals one if X ≥ c. The data points lying close to the threshold are defined by the bandwidth parameter h. Equation (3) allows for different slopes and intercepts on either side of the threshold.
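As a generic illustration of this estimator, the snippet below fits Equation (3) on simulated data with a built-in jump of 0.3 at the cut-off; everything here is illustrative, not the paper's data.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(2)
c, h = 4.5, 0.15                        # threshold and a narrow bandwidth
x = rng.uniform(4.0, 5.0, 5000)         # running variable: first-exam grade
d = (x >= c).astype(int)                # treatment indicator D
y = 0.4 * (x - c) + 0.3 * d + rng.normal(0, 1, x.size)  # 0.3 jump at c

df = pd.DataFrame({"y": y, "x_c": x - c, "d": d})
local = df[df["x_c"].abs() <= h]        # keep observations within the bandwidth
# Equation (3): separate slopes and intercepts on either side of the
# threshold; the coefficient on d estimates the treatment effect at the cut-off.
rdd = smf.ols("y ~ x_c + d + d:x_c", data=local).fit()
print(rdd.params["d"])
```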

Our research design is visualised in Figure 3. In the context of this paper, the threshold is the minimum grade of 4.5 needed to compensate a failing grade, and the treatment is defined as a resit leading to a passing grade. We select two groups of students who have in common that their performance on the first regular examination of subject j is close to the compensation threshold of 4.5. In Figure 3, we take a window between 4.3 and 4.6, centred on the midpoint of 4.45. Students scoring a 4.3 or 4.4 will need to do a resit during the summer, whereas students scoring a 4.5 or 4.6 can compensate this failing grade with higher scores on other courses. We first select a group of students who have compensated their failing grade and have not taken a resit. For this group, Resit_{i,j} = 0. Our second group consists of students who have received the treatment of a resit and scored a passing grade (≥ 5.5). For this group, Resit_{i,j} = 1. By comparing these two groups, we may find out whether requiring students to do a resit until they receive a passing grade has a positive effect on future study outcomes. In that case, allowing students to compensate a failing grade might not be in their best interest.

Figure 3. Regression discontinuity design.


Based on this design, we estimate the following local linear regression:

$$Grade(B2)_{i,j} = \beta_0 + \beta_1\,Resit_{i,j} + \beta_2\,Grade(B1)_{i,j} + \beta_3\,GPA(otherB1)_{i,j} + \beta_4\,IntProg_i + \beta_5\,Gender_i + \beta_6\,Age_i + \varepsilon_{i,j} \qquad (4)$$

where Resit_{i,j} is the binary treatment variable as defined above. Grade(B2)_{i,j} is again demeaned by subtracting the class average. In (4), Grade(B1)_{i,j} is the grade for subject j on the first regular examination, taken in deviation from the midpoint of 4.45 (as in Equation (3)). To avoid multicollinearity among regressors, GPA(otherB1)_{i,j} is the first-year GPA of student i excluding subject j. Equation (4) also includes the covariates used in the preliminary regressions and is estimated by OLS. We have experimented with specifications allowing for different slopes on either side of the threshold, but the interaction coefficient (β2 in Equation (3)) turned out to be insignificant. Equation (4) is estimated for specifications including and excluding covariates. For the bandwidth h, we choose either 0.15 or 0.25 on either side of the midpoint. We initially estimate Equation (4) separately for each of the three subject areas. Due to the limited number of observations around the threshold, we also pool the data from the three subject areas to estimate a panel version of (4).
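Putting the pieces together, the sketch below constructs a narrow-window sample and fits Equation (4) on simulated data. Column names, the data-generating process and the simplification that everyone below the threshold is a resitter who eventually passes are all our own assumptions.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(3)
n = 1500
# Synthetic course-level records around the compensation threshold.
courses = pd.DataFrame({
    "first_grade": np.round(rng.uniform(4.0, 5.0, n), 1),
    "gpa_other_b1": rng.normal(6.5, 0.5, n),
    "int_prog": rng.integers(0, 2, n),
    "gender": rng.integers(0, 2, n),
    "age": rng.normal(19, 1, n),
})
# Simplification: below 4.5 a resit is required; at 4.5 or above the
# failing grade can be compensated.
courses["resit"] = (courses["first_grade"] < 4.5).astype(int)
courses["grade_b2"] = 0.6 * (courses["gpa_other_b1"] - 6.5) + rng.normal(0, 0.8, n)

lo, hi, mid = 4.3, 4.6, 4.45                       # narrow window, as in the text
sample = courses[courses["first_grade"].between(lo, hi)].copy()
sample["grade_c"] = sample["first_grade"] - mid    # grade in deviation from 4.45

eq4 = smf.ols("grade_b2 ~ resit + grade_c + gpa_other_b1 + int_prog + gender + age",
              data=sample).fit()
print(eq4.params["resit"])                         # estimated resit treatment effect
```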

Results

Preliminary regressions

Table 2 reports the estimation results for the regression model in Equation (1). The first specification excludes GPA(B1) and shows significant negative relationships of both #Attempts(B1) and #Comps(B1) with GPA(B2), as in Figure 1. An additional attempt needed to complete the first year reduces GPA(B2) by 0.058, while an additional compensation reduces GPA(B2) by 0.304. Of the intervening variables, Gender and Age have significant coefficients, while IntProg is insignificant. The negative coefficient for Gender, implying that male students achieve a lower GPA(B2), runs counter to the gender gap reported in parts of the literature. Age is significantly positively related to GPA(B2). Once we include GPA(B1) in the regression model, the explanatory power jumps from 0.311 to 0.631. The coefficient of GPA(B1) is 0.799 and highly significant. As in Figure 2, the inclusion of GPA(B1) has a strong effect on the other relationships: IntProg, Gender and Age are all insignificant. Most notably, the coefficient of #Comps(B1) turns from significantly negative into insignificantly positive. The effect on #Attempts(B1) is less pronounced; the coefficient becomes smaller but remains significantly negative. These results suggest that once we control for student quality, as measured by GPA(B1), the number of compensations has no independent effect on future results and the number of attempts has only a small negative effect. In other words, among students with a similar GPA in their first year, the presence of compensations does not negatively affect their overall performance in the second year.

Table 2. Regression model for GPA(B2).

Table 3 reports the results of four panel estimations based on Equation (2). As a first observation, the inclusion of student fixed effects removes most of the explanatory power of the panel regression model: the adjusted R2 drops from 0.316 in the models excluding student fixed effects to close to zero in the models including them. Most relevant to this paper’s problem statement, the coefficient on Comp_{i,j} is insignificantly different from zero across all specifications. This implies that, controlling for student quality and other intervening variables, the fact that a student has compensated a failing grade for the introductory course in subject j does not have a negative impact on the result for the intermediate course in the same subject. In contrast to Table 2, the coefficient on #Attempts_{i,j} is insignificantly different from zero at a 5% significance level. This implies that the finding in Ostermaier, Beltz, and Link (2013) that students needing more test attempts tend to perform worse cannot be corroborated. Regarding the interaction term #Attempts_{i,j} ⋅ Comp_{i,j}, Table 3 shows that it has the expected negative sign, implying that the more attempts it takes to compensate a grade, the worse future results are. This effect is statistically significant for the specification including student fixed effects. Finally, from Tables 2 and 3 (excluding student fixed effects) one can conclude that student quality, as measured by GPA(B1), is the strongest predictor of study success in the second year.

Table 3. Panel regressions for second-year courses.

Regression discontinuity design

Tables 4–6 report the estimation results for the RDD model. Table 4 shows the results for the narrow [4.3–4.6] window around the compensation threshold for the subject areas Microeconomics, Accounting and Statistics. In each case, three specifications are estimated. The first specification includes the treatment variable Resit_{i,j} and Grade(B1)_{i,j}, the second specification adds GPA(otherB1) and the third specification includes all intervening variables. In all cases, the explanatory power of the first specification is close to zero. The adjusted R2 increases substantially once the specification includes GPA(otherB1). The coefficient for Resit_{i,j} in most cases has a positive value and lies between −0.064 and 0.686. However, the resit treatment effect is insignificantly different from zero in all specifications. These conclusions hold both for the narrow and for the wide [4.2–4.7] window around the compensation threshold, as a comparison of Tables 4 and 5 shows. The sole exception is the first specification for Statistics in Table 5, which yields a significantly positive treatment effect; this disappears once we include intervening variables in the second and third specifications. Finally, Table 6 reports the results from pooling the observations from the three subject areas in a panel regression. In theory, the larger number of observations could increase the significance of the estimates. The results are, however, broadly in line with the estimates by subject area: the positive coefficient for the resit treatment is insignificantly different from zero, and the explanatory power results from the inclusion of GPA(otherB1).

Table 4. Local linear regressions for second-year courses (narrow window of 4.3–4.6 around threshold).

Table 5. Local linear regressions for second-year courses (wide window of 4.2–4.7 around threshold).

Table 6. Local linear regressions for a panel of courses.

Conclusions

A common view among educators is that requiring students to do a resit will increase their learning. If this is true, resits should contribute to better academic results in subsequent courses. This view underlies the widespread use of the conjunctive standard setting strategy, which requires students to do resits until they pass all examinations. The recent literature on academic procrastination points to a possible darker side of resits. Allowing students to do multiple resits may unduly accommodate students’ tendency to postpone their study effort and as a result have a negative effect on study progress.

This paper has empirically investigated the effect of a resit, as opposed to the compensation of a failing grade, on future results. If the increased learning from a resit outweighs the negative effect from procrastination, one would expect a significant positive effect of taking a resit on future results. Our regression discontinuity design compares students who use a resit to achieve a passing grade with students who do not use a resit and compensate their failing grade. This allows us to compare a compensatory standard setting strategy with a conjunctive strategy. We exploit the discontinuity arising from the compensation threshold of 4.5 to select students who at the time of their first regular test have a similar level of knowledge in an introductory course. Our model then estimates whether the resit treatment improves their result for the intermediate course in the same subject.

While the effect of the resit treatment is generally positive, it is insignificantly different from zero in almost all specifications. This finding follows both from the local linear regressions by subject area and from the pooled regression. Based on this result, we cannot reject the hypothesis that doing a resit does not improve future academic results, compared to the case in which a failing grade is compensated. This paper thus fails to find a significant negative effect of a compensatory standard setting strategy on subsequent results. This finding supports the use of the compensatory strategy in educational settings where poor performance in individual tests can be tolerated. However, the evidence from the panel regression models also indicates that the number of attempts that a student needs to pass a course is unrelated to future academic results. This implies that there is no evidence in favour of a negative procrastination effect arising from increasing students’ opportunities to do resits. The finding in Ostermaier, Beltz, and Link (2013) that students needing more test attempts tend to perform worse therefore cannot be corroborated.

Disclosure statement

No potential conflict of interest was reported by the author.

Notes on contributor

Ivo Arnold is a professor at Erasmus School of Economics. His research interests include higher education, monetary economics and financial intermediation. His most recent publication is on cheating at online tests in The Internet and Higher Education.

References

  • Arnold, I. J. M., and J. T. Straten. 2012. “Motivation and Math Skills as Determinants of First-year Performance in Economics.” Journal of Economic Education 43 (1): 33–47. doi:10.1080/00220485.2012.636709.
  • Ballard, C. L., and M. F. Johnson. 2004. “Basic Math Skills and Performance in an Introductory Economics Class.” Journal of Economic Education 35 (1): 3–23. doi:10.3200/JECE.35.1.3-23.
  • Ballard, C., and M. F. Johnson. 2005. “Gender, Expectations, and Grades in Introductory Microeconomics at a US University.” Feminist Economics 11 (1): 95–122. doi:10.1080/1354570042000332560.
  • Clauretie, T. M., and E. W. Johnson. 1975. “Factors Affecting Student Performance in Principles of Economics.” Journal of Economic Education 6 (2): 132–134. doi:10.2307/1182468.
  • Downing, S. M., A. Tekian, and R. Yudkowsky. 2006. “Research Methodology: Procedures for Establishing Defensible Absolute Passing Scores on Performance Examinations in Health Professions Education.” Teaching and Learning in Medicine 18 (1): 50–57. doi:10.1207/s15328015tlm1801_11.
  • Feng, A., and G. Graetz. 2015. A Question of Degree: The Effects of Degree Class on Labor Market Outcomes. IZA Discussion Paper No. 8826. Bonn: IZA.
  • Friedman Ben-David, M. 2000. “AMEE Guide No. 18: Standard Setting in Student Assessment.” Medical Teacher 22: 120–130. doi:10.1080/01421590078526.
  • Grunschel, C., J. Patrzek, and S. Fries. 2013. “Exploring Reasons and Consequences of Academic Procrastination: An Interview Study.” European Journal of Psychology of Education 28 (3): 841–861. doi:10.1007/s10212-012-0143-4.
  • Hahn, J., P. Todd, and W. van der Klaauw. 2001. “Identification and Estimation of Treatment Effects with a Regression-discontinuity Design.” Econometrica 69 (1): 201–209. doi:10.1111/ecta.2001.69.issue-1.
  • Haladyna, T., and R. Hess. 1999. “An Evaluation of Conjunctive and Compensatory Standard-setting Strategies for Test Decisions.” Educational Assessment 6 (2): 129–153. doi:10.1207/S15326977EA0602_03.
  • Harmon, O. R., and J. Lambrinos. 2008. “Are Online Exams an Invitation to Cheat?” Journal of Economic Education 39 (2): 116–125. doi:10.3200/JECE.39.2.116-125.
  • Jacob, B. A., and L. Lefgren. 2004. “Remedial Education and Student Achievement: A Regression-discontinuity Analysis.” Review of Economics and Statistics 86 (1): 226–244. doi:10.1162/003465304323023778.
  • Johnson, M., and E. Kuennen. 2006. “Basic Math Skills and Performance in an Introductory Statistics Course.” Journal of Statistics Education 14 (2): 14.
  • Johnson, M., D. Robson, and S. Taengnoi. 2011. The Gender Gap in Economics: A Meta Analysis. Rochester: Social Science Research Network (SSRN).
  • Lagerlöf, J. N., and A. J. Seltzer. 2009. “The Effects of Remedial Mathematics on the Learning of Economics: Evidence from a Natural Experiment.” Journal of Economic Education 40 (2): 115–137. doi:10.3200/JECE.40.2.115-137.
  • Norcini, J. J. 2003. “Setting Standards on Educational Tests.” Medical Education 37: 464–469. doi:10.1046/j.1365-2923.2003.01495.x.
  • Ostermaier, A., P. Beltz, and S. Link. 2013. Do University Policies Matter? Effects of Course Policies on Performance. Düsseldorf: Beiträge zur Jahrestagung des Vereins für Socialpolitik.
  • Park, K. H., and P. M. Kerr. 1990. “Determinants of Academic Performance: A Multinomial Logit Approach.” Journal of Economic Education 21 (2): 101–111. doi:10.1080/00220485.1990.10844659.
  • Pell, G., K. Boursicot, and T. Roberts. 2009. “The Trouble with Resits….” Assessment & Evaluation in Higher Education 34 (2): 243–251. doi:10.1080/02602930801955994.
  • Proud, S. 2015. “Resits in Higher Education: Merely a Bar to Jump over, or Do They Give a Pedagogical ‘Leg Up’?” Assessment & Evaluation in Higher Education 40 (5): 681–697. doi:10.1080/02602938.2014.947241.
  • Ricketts, C. 2010. “A New Look at Resits: Are They Simply a Second Chance?” Assessment & Evaluation in Higher Education 35 (4): 351–356. doi:10.1080/02602931003763954.
  • Schouwenburg, H. C. 1995. “Academic Procrastination.” In Procrastination and Task Avoidance, The Springer Series in Social Clinical Psychology, edited by J. R. Ferrari, J. L. Johnson, and W. G. McCown, 71–96. New York: Springer. doi:10.1007/978-1-4899-0227-6.
  • Swope, K., and P. Schmitt. 2006. “The Performance of Economics Graduates over the Entire Curriculum: The Determinants of Success.” Journal of Economic Education 37 (4): 387–394. doi:10.3200/JECE.37.4.387-394.
  • Tice, D. M., and R. F. Baumeister. 1997. “Longitudinal Study of Procrastination, Performance, Stress, and Health: The Costs and Benefits of Dawdling.” Psychological Science 8 (6): 454–458. doi:10.1111/j.1467-9280.1997.tb00460.x.
  • van de Watering, G., and J. van der Rijt. 2006. “Teachers’ and Students’ Perceptions of Assessments: A Review and a Study into the Ability and Accuracy of Estimating the Difficulty Levels of Assessment Items.” Educational Research Review 1: 133–147. doi:10.1016/j.edurev.2006.05.001.
  • Yocarini, I. 2015. “Systematic Comparison of Decision Accuracy of Complex Decision Rules Combining Multiple Measures in a Higher Education Context.” Thesis, Universiteit Leiden, Leiden.