Engineering Education
a Journal of the Higher Education Academy
Volume 7, 2012 - Issue 1
Original Articles

Involving supervisors in assessing undergraduate student projects: is double marking robust?

, PhD, MSc, BSc (Hon), FHEA, CITP
Pages 40-47 | Published online: 15 Dec 2015

Abstract

The individual student project is a significant piece of work, typically carried out in the final or penultimate year of an engineering programme. Its primary aim is to provide evidence of a student’s competence in applying the knowledge and experience gained over the entire engineering programme to practical engineering problem solving. Each student project is supervised by at least one academic. In many institutions, double marking involving project supervisors and non-supervisors is used to provide a summative assessment of student projects. This has raised concerns about potential supervisor bias. The work reported in this paper is a statistical investigation into potential supervisor bias in a double marking scheme in the engineering department of a UK university. In this scheme, 90 final year undergraduate engineering student projects were assessed by 24 supervisors and 20 non-supervisors (referred to in this paper as second markers). The findings from this analysis suggest that there is significant correlation between the marks awarded by project supervisors and second markers and that there is no statistically significant evidence to suggest that project supervisors mark and grade final year student projects differently from second markers. An analysis of the project assessment environment from a community of practice perspective leads to the conclusion that the practice of double marking, as widely applied in engineering project assessment, probably has sufficient feedback and control mechanisms to ensure minimal intermarker variability between project supervisors and second markers.

Introduction

The student engineering project is highly valued both within and outside academia. Owing to its high weighting relative to the other course modules, the project may contribute significantly to the student’s overall degree classification. In addition, failing the student project may lead to the award of a non-accredited engineering degree. This may in turn lead to difficulties in securing employment, since the award of a degree accredited by a relevant engineering professional institution gives employers the assurance that the student has completed a professionally validated course (Levy, 2000).

The double marking scheme is a common assessment method used to evaluate undergraduate engineering projects (Shay, 2004). The two assessors mark the student’s work independently using common reference criteria and are then required to agree a project mark for the student. If the process of negotiating and agreeing a final project mark becomes intractable, a third assessor may be brought in. Failing this, the body of academics, sitting as an examination board, may agree a final project mark, usually with the advice and concurrence of an external examiner (usually a senior academic from another university).

In most engineering departments, one of the two assessors appointed to evaluate a student’s project is the academic who supervised it. However, owing to their regular contact and familiarity with the students under their supervision, there is a possibility that supervisors may not be entirely objective when assessing their students’ work. Consequently, the role of the supervisor in project assessment is coming under increasing scrutiny from researchers (Pathirage et al., 2007; MacDougall et al., 2008; Shay, 2005; McKinstry et al., 2004; Dennis, 2007). MacDougall et al. (2008) and McKinstry et al. (2004) have reported that supervisor assessment is affected by contact with students. On the other hand, studies by Pathirage et al. (2007), Shay (2005) and Dennis (2007) suggest that there is no statistically significant evidence to support such an assertion. The jury is still out regarding the likelihood of project supervisors exhibiting bias when assessing the work of the students under their supervision.

This paper is a contribution to the debate on the involvement of supervisors in marking the project work of students under their supervision. The method adopted in this paper uses statistical analysis to compare the marks awarded by project supervisors and second markers in the third year individual student projects of the undergraduate engineering programme at the University of Exeter in the academic year 2008/09. A statistically significant difference between the marks awarded by supervisors and those awarded by second markers would suggest that supervisors are influenced by contact with the students under their supervision.

Following the presentation of the results of the statistical analysis, an attempt will be made to explain the paper’s findings and draw meaningful conclusions from them using perspectives drawn from research in communities of practice (Wenger, 1998). According to Lave and Wenger (1998) the term community of practice implies ‘participation in an activity system about which participants share understandings concerning what they are doing and what that means in their lives and for their communities.’ A community of practice perspective has been adopted because it is increasingly accepted amongst education researchers that the knowledge needed to evaluate and assess student work is primarily tacit and is best ‘understood in terms of the social interaction between the individual academic and the rest of the academic community’ (Jawitz, 2007a). This view is also reinforced by Gonzalez-Arnal and Burwood (2003) who suggest that academics learn to assess effectively by ‘participating in relevant social practices, observing, copying, imitating, until we begin to grasp the sense of the activities and are able to integrate different elements.’

Project implementation and assessment at the University of Exeter

At the beginning of the academic year, each student is assigned a supervisor whose duty it is to guide and advise the student throughout the year through weekly meetings, which are recorded on an attendance form in the student’s project logbook. The project is an entirely student-driven process, with the student having the responsibility for planning, managing and implementing it within the time guidelines set out at the beginning of the year. At the end of the academic year, each project is assessed by the student’s supervisor and another academic from the engineering department who is familiar with the project’s subject area. Both markers separately assess and mark the project report and poster presentation using standardised mark sheets. They then (again separately) arrive at an overall project mark, taking into consideration the marks awarded by the supervisor for the log book, the student’s general project management conduct and the preliminary report which is submitted by the student earlier in the year. Following their separate assessment, the two academics agree on a final project mark which subsumes all the marks awarded to the individually assessed parts of the project work. If they fail to agree a mark, the project is referred to a third academic from within the department.

The last stage of the project assessment process is the examination board. All of the academics who teach on the engineering programmes meet to discuss all of the engineering examination marks in consultation with external examiners. With regard to project marks, the examination board confirms the marks that have been agreed upon and makes a final decision on all of the projects in which agreement was not reached between the markers.

An overview of the data and statistical analysis methods used in the study

90 undergraduate student projects in the engineering department at the University of Exeter were submitted for assessment in the academic year 2008/09. 24 academics from within the department served as supervisors for one or more of the projects. 20 of these supervisors also served as second markers for one or more student projects they had not supervised.

The project marks for each student comprised the mark and grade awarded independently by the supervisor, the mark and grade awarded independently by the second marker and the mark and grade agreed by the supervisor and second marker. Marks ranged from 0 to 100% and grading for the projects followed the standard classification used by the University of Exeter as shown in Table 1.

The specific objective of the statistical analysis was to find answers to the following questions:

  • Is there a correlation between marks awarded by supervisors and marks awarded by second markers?

  • What is the relationship, if any, between the final agreed marks and the marks awarded individually by the supervisors and second markers?

  • Do supervisors mark projects differently from second markers?

  • Do supervisors and second markers have different marking profiles?

  • In the cases where the marks awarded by supervisors and second markers fall into different classes, are supervisors more likely to award a higher class than the second markers?

  • Do differences in academic status/rank affect second marker and supervisor marking?

To obtain answers to the above questions, t-tests were used to compare means, Pearson correlation was used to establish agreement or disagreement between different marker categories and the chi-square test was used to establish whether or not supervisors and second markers exhibited different marking profiles.
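As a sketch of this toolkit, the three tests can be run with SciPy. The marks below are illustrative, not the study data:

```python
# Illustrative sketch of the three tests used in the study,
# run on made-up marks (NOT the study data).
from scipy import stats

supervisor = [68, 72, 55, 61, 70, 64, 58, 75, 66, 60]
second     = [65, 74, 57, 60, 68, 66, 55, 73, 69, 62]

# Pearson correlation between the two sets of marks
r, p_r = stats.pearsonr(supervisor, second)

# Two-sample t-test comparing the mean marks
t, p_t = stats.ttest_ind(supervisor, second)

# Chi-square test on a (hypothetical) marker-category contingency table:
# rows are marker roles, columns are marking categories
table = [[10, 8, 6],
         [9, 9, 6]]
chi2, p_c, dof, expected = stats.chi2_contingency(table)
# dof for a 2 x 3 table is (2 - 1) * (3 - 1) = 2
```

For closely tracking mark pairs such as these, `pearsonr` returns a correlation near 1, mirroring the pattern reported in the Results section below.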

Results

Is there a correlation between marks awarded by supervisors and marks awarded by second markers?

There is a high correlation between marks awarded by supervisors and marks awarded by second markers, r(90) = 0.84, p < 0.01. Both of these marks are highly correlated with the agreed marks, with r(90) = 0.97, p < 0.01 for supervisor marks and r(90) = 0.91, p < 0.01 for second marker marks.

What is the relationship, if any, between the final agreed marks and the marks awarded individually by the supervisors and second markers?

The Williams T2 formula (Cramer, 1994, p. 227) was used to establish whether or not the difference between the correlation of supervisor marks with agreed marks (0.97) and the correlation of second marker marks with agreed marks (0.91) was statistically significant. For the 90 student projects (N = 90), the Williams T2 formula has N − 3 = 87 degrees of freedom. In this case T2(87) = 5.76, p = 0.01, indicating that the correlation between supervisor marks and agreed marks is significantly higher than the correlation between second marker marks and agreed marks.
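The statistic can be approximately reproduced from the three reported correlations. The sketch below assumes Cramer's T2 is the standard Williams test for two dependent correlations sharing a variable (the agreed mark); because it uses the rounded published correlations, it lands near, but not exactly on, the reported 5.76:

```python
import math

def williams_t2(r12, r13, r23, n):
    """Williams' test for the difference between two dependent
    correlations r12 and r13 that share variable 1; df = n - 3.
    (Assumed here to be the variant Cramer (1994) presents.)"""
    det = 1 - r12**2 - r13**2 - r23**2 + 2 * r12 * r13 * r23
    rbar = (r12 + r13) / 2
    num = (n - 1) * (1 + r23)
    den = 2 * det * (n - 1) / (n - 3) + rbar**2 * (1 - r23)**3
    return (r12 - r13) * math.sqrt(num / den)

# r12: supervisor vs agreed marks, r13: second marker vs agreed marks,
# r23: supervisor vs second marker -- rounded values from the text
t2 = williams_t2(0.97, 0.91, 0.84, 90)   # df = 87; roughly 5.3 from rounded inputs
```

The gap between this value and the reported 5.76 is consistent with the paper having computed the statistic from unrounded correlations; both comfortably exceed the 1% critical value at 87 degrees of freedom.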

Do supervisors mark projects differently from second markers?

The means for the two sets of marks are very close, with a mean of 65.36% for supervisor marks and 64.98% for second marker marks. The variances are also close, at 125.05 for supervisor marks and 114.99 for second marker marks. A Student's t-test carried out at a significance level of 5% yields a t-value of 0.2347, much less than the critical t-value of 1.9734, i.e. t(178) = 0.2347, p = 0.05, where N = 90 is the number of student projects and the degrees of freedom are 2N − 2 = 178. This suggests that there is no significant difference between supervisor marking (M = 65.36, SD = 11.18) and second marker marking (M = 64.98, SD = 10.72).
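The reported summary statistics are enough to reproduce this t-value (up to rounding) without the raw marks, for example with SciPy's summary-statistics form of the equal-variance two-sample t-test:

```python
from scipy import stats

# Summary statistics as reported in the text (equal-variance t-test)
t, p = stats.ttest_ind_from_stats(
    mean1=65.36, std1=11.18, nobs1=90,   # supervisor marks
    mean2=64.98, std2=10.72, nobs2=90,   # second marker marks
)
# t comes out close to the reported 0.2347; the small discrepancy
# reflects rounding in the published means and standard deviations
```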

Do supervisors and second markers have different marking profiles?

For each student project, the supervisor and the second marker were each categorised into one of six categories depending on:

  • the relative differences between the supervisor mark and the second marker mark

  • the extent to which the marker in question is prepared to accommodate the views of the other marker as indicated by the distance of the agreed mark from the mean of the supervisor and second marker marks.

Table 1. Standard mark classification used at the University of Exeter

Pairs of marks that differed by three or more marks were categorised as having high differences, whilst pairs of marks within three marks of each other were categorised as having low differences. Markers who agreed on a final mark closer to their initial mark than to the mean of the two marks were categorised as having the ability to retain their decision. Conversely, markers who agreed a mark further from their initial mark than the mean of the two marks were categorised as accommodative. The mark difference value of three was chosen as a basis for categorisation after observing that supervisor marks were within three marks of the second marker marks in almost 60% of the projects (53 out of 90). Each individual marker’s category was then determined by computing the marker’s average mark difference. For both the supervisors and second markers, a profile showing the distribution of the markers across the six mark difference categories was determined. Table 2 compares the supervisor and second marker marking profiles.

The chi-square test was carried out to test the null hypothesis that there is no significant difference between supervisor and second marker marking profiles. For the null hypothesis to be rejected at the 5% two-tailed significance level, the chi-square value has to exceed 11.07 for five degrees of freedom. (NB. The degrees of freedom equal the number of marking profile categories less one, multiplied by the number of marker roles less one; here there are six categories and two marker roles: supervisors and second markers.) The chi-square test shows that there is no significant difference in the marking profiles of supervisors and second markers, χ²(5, N = 44) = 4.06, p = 0.05.
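This comparison can be sketched as a chi-square test on a 2 × 6 contingency table. The counts below are hypothetical (the paper's Table 2 figures are not reproduced here), chosen only to match the 24 supervisors and 20 second markers; note the degrees of freedom, (6 − 1) × (2 − 1) = 5:

```python
from scipy import stats

# Hypothetical counts of markers per marking-profile category.
# Rows: supervisors (24), second markers (20); columns: the six
# profile categories. These are NOT the paper's Table 2 figures.
observed = [
    [6, 4, 3, 5, 4, 2],   # supervisors
    [4, 4, 3, 4, 3, 2],   # second markers
]
chi2, p, dof, expected = stats.chi2_contingency(observed)
# dof == 5; the null is rejected at the 5% level only if chi2 > 11.07
```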

Table 2. Comparison of supervisor and second marker marking profiles

Table 3. Final year project differences between grade classes awarded by the supervisor and the second marker

Do supervisors classify projects differently from second markers?

As shown in Table 3, in 63.3% of the projects the supervisor mark and the second marker mark fell into the same class. In 95.6% of the projects the supervisor and the second marker were within one class of each other, with the proportion of supervisors awarding a class above the second marker almost identical to the proportion awarding a class below (14 projects compared to 15 projects). In three of the projects the supervisor mark was two classes below the class awarded by the second marker. There was only one project in which the supervisor mark was three classes above that awarded by the second marker.

A contingency table for the classes that supervisors and second markers awarded in their initial marking was prepared (Table 4). Additional classes for marks above 80% and below 30% were added. The chi-square test was carried out to test the null hypothesis that there is no significant difference in the manner in which supervisors and second markers classify student projects. For the null hypothesis to be rejected at the 5% two-tailed significance level, the chi-square value has to exceed 12.59 for six degrees of freedom (the number of contingency table categories less one). The chi-square test showed no significant difference, indicating that supervisors and second markers tended to place each project in the same class, χ²(6, N = 21) = 5.77, p = 0.05.

Where the supervisor's and second marker's individual marks fall into different classes, are supervisors more likely to award a higher or lower class than the second markers?

A two-tailed t-test carried out at a significance level of 5% yields a t-value of 0.26, numerically much less than the critical t-value of 2.0042 for 56 degrees of freedom (one less than the number of projects with a class difference of one), i.e. t(56) = 0.26, p = 0.05. This suggests that supervisors (M = −0.759, SD = 15.69) are equally as likely as second markers (M = −0.48, SD = 15.76) to award the higher or the lower class.

Do differences in academic status/ rank affect second marker and supervisor marking?

The academics within the engineering department range from lecturers who are still at the start of their academic careers (with less than five years academic experience) to full professors. Of the 24 academics involved in project supervision and assessment, nine were in the junior grade of lecturer, six were senior lecturers, five were associate professors and the remaining four were full professors. To assess the impact of seniority on project marking, the academics were divided into two groups: junior academics and senior academics. The junior academics group comprised all academics in the lecturer grade and the senior academics group comprised all other academics from the rank of senior lecturer to full professor.

Four possible combinations of supervisor and second marker were considered:

  • The supervisor and the second marker are both senior academics

  • The supervisor and the second marker are both junior academics

  • The supervisor is a senior academic and the second marker is a junior academic

  • The supervisor is a junior academic and the second marker is a senior academic.

Table 4. Contingency table for grades awarded by supervisors and second markers

In the case where supervisors are junior academics, is their marking affected by the rank of the second marker?

A two-tailed Student's t-test carried out at a significance level of 5% yields a t-value of 0.42, numerically much less than the critical t-value of 2.0294 for 36 degrees of freedom (the number of projects less two), i.e. t(36) = 0.42, p = 0.05. This implies that when junior academics are in the role of project supervisor, their approach to project marking is not affected by whether the second marker is a senior academic (M = −0.53, SD = 28.59) or a junior academic (M = 0.5, SD = 11.75).

In the case where second markers are junior academics, is their marking affected by the rank of the supervisor?

A two-tailed Student's t-test carried out at a significance level of 5% yields a t-value of 0.51, numerically much less than the critical t-value of 2.0336 for 36 degrees of freedom (the number of projects less two), i.e. t(36) = 0.51, p = 0.05. This implies that the difference between the two means is not statistically significant at the 5% significance level: when junior academics are in the role of second marker, their approach to project marking is not affected by whether the project supervisor is a senior academic (M = −0.87, SD = 33.62) or a junior academic (M = 0.5, SD = 11.75).

In the case where supervisors and second markers are of equal academic rank, is the difference between supervisor and second marker mark affected by markers’ rank?

A two-tailed t-test carried out at a significance level of 5% yields a t-value of 0.7712, numerically much less than the critical t-value of 2.069 for 23 degrees of freedom (the number of projects less two), i.e. t(23) = 0.77, p = 0.05. This suggests that when supervisors and second markers have the same rank, their marking is not affected by whether they are junior academics (M = −0.5, SD = 11.75) or senior academics (M = 1.73, SD = 12.82).

Do the supervisor-to-second marker rank pairings have the same marking profile?

The supervisor-to-second marker rank pairings were categorised into three categories of marking profile as shown in Table 5. As before, pairs of marks that differed by three or more marks were categorised as having high differences, whilst pairs of marks within three marks of each other were categorised as having low differences.

The chi-square test for the null hypothesis that there is no difference in intermarker behaviour when the pairings between supervisors and second markers are categorised by rank yields a chi-square value of 1.85 at six degrees of freedom. (NB. The degrees of freedom equal the number of marking profile categories less one, multiplied by the number of supervisor-second marker pairings less one.) Since χ²(6, N = 12) = 1.85, p = 0.05, this suggests that differences in academic rank between a supervisor and a second marker do not influence intermarker behaviour.

Table 5. Contingency table for grades awarded by supervisors and second markers

Discussion

The main finding of this study is that academics tend to mark projects under their supervision in much the same way as departmental colleagues who second mark those projects: the marks awarded by project supervisors are highly correlated with the marks awarded by the second markers, and there is no statistically significant evidence to suggest that supervisors mark any differently from second markers. This is true even for the marks and classes awarded individually by the supervisors and second markers prior to reaching agreement. In addition, when the academics are ranked according to the extent to which they are hard or soft markers, there is no statistically significant difference between the profiles of supervisors and second markers. Finally, this study also finds that both project supervisors and second markers are largely unaffected by any differences in their relative academic status.

These findings agree closely with findings from previous research on assessment. For example, in a qualitative study of double marking of a business school dissertation, which utilised staff questionnaires and a structured group discussion, Hand and Clewes (2000) found that markers were generally in close agreement, with only a few instances of major discrepancy. In another study, Pathirage et al. (2007) carried out a statistical analysis to test the theory that the supervisor’s mark is biased in favour of the student when compared to the second marker. Assessing project marks in the School of the Built Environment at the University of Salford over the three academic years 2002/03, 2003/04 and 2004/05, they discovered that the difference in the average marks awarded by the first marker and the second marker was not statistically significant for two of the academic years and only weakly significant in the third. Similar observations were also noted by Shay (2005) and Dennis (2007). These findings, together with the findings presented in this paper, suggest that supervisors and second markers tend to mark in the same way within a double marking scheme.

Why do supervisors and second markers tend to mark closely to each other? Shay (2004) suggests that the academics constitute a community of practice which, to some degree, shares a common understanding of what to expect in a quality project. From a community of practice perspective, the fact that supervisors and second markers generally award project marks that are close together would indicate that the academics have developed a shared concept of marking and classifying projects with which everyone in the department generally complies. As Johnston (2004) personally experienced when learning to mark essays at an American university in the Middle East for the first time, these concepts may not be explicitly documented and they may not conform to the official mark sheet. Consequently, a newcomer to an academic department has to learn through practice how to interpret and use the mark sheet in the same manner as the more established members of the department. Indeed, as Shay (2004) notes, one reason for the discrepancies between the marks allocated in a double marking scheme is that one (or both) of the markers may be relatively new to the department.

Another way of looking at this is that, with double marking in place, project assessment becomes a situated activity. Alexander and Wiley (1981) define a situated activity as ‘conduct in a symbolically defined space and time within which an actor presumes that events are being or might be monitored by another.’ In other words, a situated activity is an activity which an individual carries out in anticipation that others will assess it and comment on it. With regard to project assessment under the double marking scheme, academics know that others may judge them on the basis of their assessment. Consequently, as pointed out by Jawitz (2007b), student project assessment is a high stakes game not only in terms of the importance of the mark towards the student’s grade but also the standing of the academic amongst colleagues. For example, Johnston (2004) talks of the pressure on examiners to conform to some common way of marking so as to maintain credibility amongst colleagues. It is therefore plausible that double marking has sufficient inbuilt (albeit implicit) control and feedback mechanisms to ensure minimal intermarker variability between supervisors and second markers.

Reflection

Whilst the findings may be in general agreement with the current socio-cultural view of higher education assessment, it may be difficult to generalise them to other institutions because of two key shortcomings. Firstly, this work focuses only on a single cohort in a single university. A longitudinal study involving multiple cohorts and multiple engineering departments in different universities would help to shed light on the organisational and socio-cultural features that facilitate effective double marking in supervisor-assessed student projects. Secondly, this work has been primarily quantitative. Given that the assessment of student projects is a socially situated activity, the use of qualitative research methods (such as direct observations of the discussions between supervisors and second markers, in-depth interviews with selected markers and focus groups comprising both supervisors and second markers) may help to shed light on the double marking scheme. In this way, an evidence-based theoretical framework for implementing effective double marking in supervisor-assessed student projects can be developed and made available to any engineering department that needs to improve its double marking procedures.

References

  • Alexander N.C. and Wiley M.G. (1981) Situated activity and identity formation. In: Rosenberg M. and Turner R.H. (eds.) Social psychology: sociological perspectives. New York: Basic Books, 269-289.
  • Dennis I. (2007) Halo effects in grading student projects. Journal of Applied Psychology, 92 (4), 1169-1176.
  • Gonzalez-Arnal S. and Burwood S. (2003) Tacit knowledge and public accounts. Journal of Philosophy of Education, 37 (3), 377-391.
  • Hand L. and Clewes D. (2000) Marking the difference: an investigation of the criteria used for assessing undergraduate dissertations in a business school. Assessment and Evaluation in Higher Education, 25 (1), 5-21.
  • Jawitz J. (2007a) Becoming an academic: a study of learning to judge student performance in three disciplines at a South African university. PhD thesis, University of Cape Town.
  • Jawitz J. (2007b) New academics negotiating communities of practice: learning to swim with the big fish. Teaching in Higher Education, 12 (2), 185-197.
  • Johnston B. (2004) Summative assessment of portfolios: an examination of different approaches to agreement over outcomes. Studies in Higher Education, 29 (3), 395-412.
  • Lave J. and Wenger E. (1998) Situated learning: legitimate peripheral participation. Cambridge: Cambridge University Press.
  • Levy J. (2000) Engineering education in the United Kingdom: standards, quality assurance and accreditation. International Journal of Engineering Education, 16 (2), 136-145.
  • MacDougall M., Riley S., Cameron H. and McKinstry B. (2008) Halos and horns in the assessment of undergraduate medical students: a constituency-based approach. Journal of Applied Quantitative Methods, 3 (2), 116-128.
  • McKinstry B., Cameron H., Elton R. and Riley S. (2004) Leniency and halo effects in marking undergraduate short research projects. BMC Medical Education, 4 (28). Available from http://www.biomedcentral.com/1472-6920/4/28 [accessed 22 August, 2011].
  • Partington J. (1994) Double-marking students' work. Assessment & Evaluation in Higher Education, 19 (1), 57-60.
  • Pathirage C., Haigh R., Amaratunga D. and Baldry D. (2007) Enhancing the quality and consistency of undergraduate dissertation assessment: a case study. Quality Assurance in Education: an International Perspective, 15 (3), 271-286.
  • Shay S. (2004) The assessment of complex performance: a socially situated interpretive act. Harvard Educational Review, 74 (3), 307-329.
  • Shay S. (2005) The assessment of complex tasks: a double reading. Studies in Higher Education, 30 (6), 663-679.
  • Wenger E. (1998) Communities of practice: learning, meaning, and identity. Cambridge: Cambridge University Press.
