
A comparison of Olympic and Paralympic performances

Pages 446-458 | Received 10 Feb 2016, Accepted 20 Feb 2018, Published online: 21 Mar 2018

Abstract

In 2012, Oscar Pistorius created history as the first amputee sprinter to compete in the Olympics. Other athletes achieved amazing feats long before the Paralympics were introduced, including gymnast George Eyser who won six medals at the 1904 Olympics with a wooden leg, and others who competed in both Games. An exciting challenge of considerable interest is to compare performances of Olympic and Paralympic athletes, so contributing to improving integration of the two competitions. We generalise the recent dynamic shrinkage method for class handicapping and apply it to competition results from equestrian individual dressage at the London 2012 Summer Games and cross country skiing at the Sochi 2014 Winter Games. Our analysis generates promising results and surprising revelations. It also offers a fair method for comparing performances by athletes from other diverse groups, with potential benefits of extra incentive and reward systems for motivating unified sporting participation in general settings.

1. Introduction

The history of multi-sport competitions is rich and can be traced back thousands of years. Major international multi-sport competitions began with the modern Olympic Games, with summer events starting in 1896 and winter events starting in 1924. Olympic Games are open to all competitors, though there are now many other international multi-sport competitions aimed at specific communities. Perhaps surprisingly, the Deaflympic Games for deaf athletes began in the summer of 1924 and the winter of 1949, whereas the hugely popular Paralympic Games for athletes with physical disabilities began in the summer of 1960 and the winter of 1976. Another important competition is the Special Olympics World Games for athletes with intellectual disabilities, which began in the summer of 1968 and the winter of 1977. Other major international multi-sport competitions include the Commonwealth, Asian, Pan-American, All-Africa, Pacific, European, Youth Olympic, and Invictus Games.

There are some remarkable stories of disabled athletes who participated in open competition at Olympic Games. Among the most notable of these are gymnast George Eyser (USA) who won three gold medals in 1904 despite a wooden leg, deaf-mute boxer Carlo Orlandi (Italy) who won a gold in 1928, lower-leg amputee Oliver Halassy (Hungary) who won golds for water polo in 1932 and 1936, Károly Takács (Hungary) who won golds for shooting in 1948 and 1952 with a badly injured hand, hammer thrower Harold Connolly (USA) who had Erb’s palsy yet won a gold in 1956, deaf fencer Ildikó Újlaky-Rejtő (Hungary) who won two golds in 1964, deaf swimmer Jeffrey Float (USA) who won a gold in 1984 and blind archer Im Dong-Hyun (South Korea) who won golds in 2004 and 2008.

Speaking before the London 2012 Olympic and Paralympic Games, the Chair of the Organising Committee, Lord Coe, said: “We want to change public attitudes towards disability, celebrate the excellence of Paralympic sport and to enshrine from the very outset that the two Games are an integrated whole”. This objective was achieved with overwhelming success in London and subsequently at the Rio de Janeiro Games in 2016. Meanwhile, the operational research community has shown considerable interest in performance measurement for multi-sport competitions.

The predominant technique that has been used for analysing the success of nations at the Olympic Games is data envelopment analysis (Churilov & Flitman, 2006; Gomes & Lins, 2008; Li, Lei, Dai, & Liang, 2015; Li, Liang, Chen, & Morita, 2008; Lozano, Villa, Guerrero, & Cortés, 2002; *YLL2008, 2015; Zhang, Li, Meng, & Liu, 2009). This procedure is inherently nonparametric, which makes it robust against model misspecification at the cost of reduced power and clarity. Another recent and relevant method, which is more parametric in nature, is multiple attribute decision making (Ballı & Korukoğlu, 2014; Li, Kou, Lin, Xu, & Liao, 2015; Rezaei, 2015). Both approaches offer ways of combining observations of several different attributes to compare individual units. The main difference between those formulations and the developments in this paper is that we focus on comparing a single measure across units that belong to natural or contrived clusters. In this respect, we add to the literature by tackling the specific problem of comparing disparate groups of competitors. The resulting methodology offers potential benefits for applications in many diverse situations.

In this paper, we consider how, and to what extent, we might compare the performances of Olympic and Paralympic athletes in order to offer enhanced integration of the two competitions. The next section reviews a suitable methodology for making such comparisons and then we generalise this theory to enable its application to a much wider variety of sports. After this abstract development, we present some empirical analyses based on data collected from sports competitions at the London 2012 Summer Games and the Sochi 2014 Winter Games. Then we extend these ideas to consider applications of the generalised shrinkage method to compare diverse categories. Finally, we conclude the paper by discussing these results, suggesting how Olympic and Paralympic Games might develop in future, and considering whether other population subgroups could or should be integrated in a similar manner.

2. Shrinkage method for class handicapping

The International Paralympic Committee invests considerable effort in allocating athletes to disability classes, in order to ensure that all athletes in any particular class have similar physical and mental capacities, so resulting in fair competition. However, athletes from different classes in individual Paralympic sports often compete for the same medals to ensure fair rewards for effort when there are many classes or few competitors.

For some events, simple handicaps can be applied to render all competitors equally able. Examples include the use of eyeshades in goalball and upper bounds for each team’s aggregate physical ability in wheelchair basketball. Most events involve scaling the performance measure, racing time perhaps, by a factor that relates to the athlete’s disability class. For example, if the average racing time in a specific class is 50% greater than the average racing time in the best performing class, then we scale the times of individual athletes in the specific class by a factor of 2/3 to enable fair comparisons.

Current practice is that each of these sports has a committee to assign base factors to classes and to adjust these factors following major sporting competitions. However, this approach has some disadvantages: the system lacks transparency and is complicated; base factors require frequent committee adjustments; historical factors ignore the effects of current racing conditions; changes in classifications and technology are difficult to incorporate; the system does not fully allow for different class sizes.

In order to combat these problems, Percy (2011) proposed a shrinkage method for class handicapping, which uses data from only the current competition that is in progress. This method is based on ideas developed by Efron and Morris (1975, 1977) that relate to James-Stein estimators. Copas (1983) provides a comprehensive description of these estimation procedures and Wang, Huwang, and Yu (2015) present a recent application of them in the context of quality control charts. A subsequent paper by Percy (2013) presents a mathematical justification for this shrinkage method based on an objective Bayesian analysis of a suitable probability model.

The shrinkage method for class handicapping can be applied interactively, either immediately after events where athletes compete together and results are observed simultaneously, or dynamically during events where athletes compete sequentially and results are not observed simultaneously. There are two purposes of the present paper: firstly to generalise this approach for application to a considerably broader range of sports; secondly to investigate how we might use it to compare the performances of Olympic and Paralympic athletes. To facilitate this analysis, we start with a brief review of the shrinkage method.

Suppose that we observe racing times $r_{ij}$ for $i=1,\dots,m$ and $j=1,\dots,n_i$ corresponding to athlete $j$ in class $i$. Define $\tilde{r}_{i\cdot}$ to be the sample median racing time of class $i$ and $\tilde{r}_{\cdot\cdot}$ to be the overall sample median racing time. Medians are used rather than means for robustness against outliers; see Huber and Ronchetti (2009). Next define a weighted geometric mean of these measures

$$\hat{r}_{i\cdot}=\left(\tilde{r}_{i\cdot}^{\,n_i}\,\tilde{r}_{\cdot\cdot}\right)^{1/(n_i+1)} \tag{1}$$

to represent a shrinkage of $\tilde{r}_{i\cdot}$ towards $\tilde{r}_{\cdot\cdot}$ by an amount that decreases as the number of athletes in class $i$ increases. Geometric means are used rather than arithmetic means here because racing times are positive, so multiplicative models are more appropriate than additive models.

This shrinkage towards the overall average is necessary as the precision of sample class medians as estimators of corresponding population medians increases with sample size. Estimators based on small samples are unreliable and empirical evidence alone cannot distinguish between the abilities of individuals and their classes. By regressing class medians towards the global average in relation to their sample sizes, shrinkage reduces the mean squared errors when estimating the average racing times. Hence, it allows for varying class sizes and enables fair comparisons among groups with different numbers of competitors; see Everson (2007).

The scaling factor for class $i$ is then defined by the ratio

$$x_i=\frac{\min_{k}\hat{r}_{k\cdot}}{\hat{r}_{i\cdot}} \tag{2}$$

where $\hat{r}_{i\cdot}$ denotes the shrunken class median of Equation (1), and finally the scoring time corresponding to athlete $j$ in class $i$ is given by

$$s_{ij}=x_i r_{ij} \tag{3}$$

for $i=1,\dots,m$ and $j=1,\dots,n_i$. These scoring times are then compared across athletes and classes to determine the winners and rank order of competitors.
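The steps in Equations (1) to (3) can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `shrinkage_scores`, the dictionary layout and the example times are assumptions made here, not part of the original method description or its data. The shrunken medians are computed on the log scale, which is algebraically the same as Equation (1) but avoids very large intermediate powers.

```python
from math import exp, log
from statistics import median


def shrinkage_scores(times_by_class):
    """Return scoring times for positive racing times grouped by class."""
    all_times = [r for times in times_by_class.values() for r in times]
    overall_median = median(all_times)

    # Equation (1): weighted geometric mean of class and overall medians,
    # evaluated via logarithms for numerical safety.
    shrunk = {}
    for cls, times in times_by_class.items():
        n = len(times)
        shrunk[cls] = exp((n * log(median(times)) + log(overall_median)) / (n + 1))

    best = min(shrunk.values())  # smallest shrunken median = best class

    # Equation (2) gives the factor best / shrunk[cls];
    # Equation (3) multiplies each observed time by that factor.
    return {
        cls: [best / shrunk[cls] * r for r in times]
        for cls, times in times_by_class.items()
    }


# Hypothetical racing times in seconds (not data from the paper).
example = {"A": [50.0, 52.0, 54.0], "B": [75.0, 80.0]}
scores = shrinkage_scores(example)
```

In this hypothetical example, class A has the smallest shrunken median, so its scoring times equal its observed times, while the times in class B are scaled down by a common factor.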

3. Generalised shrinkage method for class handicapping

The shrinkage method described above is useful for performance measures that are constrained to be positive and for which the aim is to achieve small values. Such measures are common and typically represent times that arise in sports such as running, swimming, cycling and skiing. It adapts easily to deal with the situation where there is a non-zero lower bound, by subtracting the bound from each athlete’s result prior to calculations. However, we now generalise the shrinkage method in two ways.

Firstly, we consider performance measures that are again constrained to be positive but for which the aim is to achieve large values. Such measures typically represent laps, distances, heights or weights that arise in sports such as sailing, discus, high jump and weightlifting. The extension to allow for maximisation simply involves redefining the best class. This requires us to replace the scaling factor in Equation (2) with

$$x_i=\frac{\max_{k}\hat{r}_{k\cdot}}{\hat{r}_{i\cdot}} \tag{4}$$

where $\hat{r}_{i\cdot}$ is the shrunken class median of Equation (1), and then calculate the scoring results from Equation (3) as before.
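As a brief illustration of the maximisation case, the sketch below computes the scaling factors of Equation (4) for positive results where large values are best. The function name `scaling_factors_max` and the throwing distances are hypothetical and chosen only for this example.

```python
from math import exp, log
from statistics import median


def scaling_factors_max(results_by_class):
    """Scaling factors of Equation (4) when large results are best."""
    overall = median([r for rs in results_by_class.values() for r in rs])

    # Shrunken class medians, as in Equation (1), computed on the log scale.
    shrunk = {
        cls: exp((len(rs) * log(median(rs)) + log(overall)) / (len(rs) + 1))
        for cls, rs in results_by_class.items()
    }

    best = max(shrunk.values())  # largest shrunken median = best class
    return {cls: best / value for cls, value in shrunk.items()}


# Hypothetical throwing distances in metres (not data from the paper).
throws = {"open": [60.0, 62.0, 65.0], "seated": [30.0, 33.0, 35.0]}
factors = scaling_factors_max(throws)
```

Here the best class keeps a factor of 1 and every other class receives a factor greater than 1, so its results are scaled upwards before comparison.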

Secondly, we consider performance measures that more generally take values from any real interval, which is an appropriate assumption for all sports and includes values taken from a countable subset of a real interval. This extension includes unbounded measures, such as displacement in sports like tug-of-war, and bounded measures, such as points and ratings in sports like archery, gymnastics, diving and figure skating. Rather than develop separate algorithms for each of these situations, we propose a unified algorithm for class handicapping. We refer to this as the generalised shrinkage method, as it models all these cases and includes the existing methods as special cases.

Our proposal involves transforming the performance measure onto the set of real numbers and then applying an additive version of the theory presented above. This procedure is reminiscent of generalised linear models (Nelder & Wedderburn, 1972) and we shall see that it corresponds to the existing analysis for performance measures that are constrained to be positive. We need to consider just four possibilities: proper and bounded $[a,b]$; left-bounded and right-unbounded $[a,\infty)$; left-unbounded and right-bounded $(-\infty,b]$; unbounded at both ends $(-\infty,\infty)=\mathbb{R}$. We define a transformation $t_{ij}=f(r_{ij})$ according to the support of the performance measure, which takes one of the forms

$$t=f(r)=\begin{cases}\ln\dfrac{r-a}{b-r} & \text{if } r\in[a,b]\\[1ex] \ln(r-a) & \text{if } r\in[a,\infty)\\ -\ln(b-r) & \text{if } r\in(-\infty,b]\\ r & \text{if } r\in(-\infty,\infty)\end{cases} \tag{5}$$

for simplicity and consistency. These forms ensure that $t_{ij}\in\mathbb{R}$ and retain the direction of achievement, whereby small or large values represent the best results. Although finite bounds map onto infinity for these transformations, this does not present a problem in practice. Appendix 1 presents a theoretical justification for the choice of transformations in Equation (5).
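The four cases of the transformation in Equation (5) can be written as one small Python function. This is a minimal sketch: the name `transform` and the convention of encoding a missing bound as an infinite value are assumptions made for this illustration.

```python
from math import inf, log


def transform(r, a=-inf, b=inf):
    """Map a result r with support bounds a and b onto the real line."""
    if a > -inf and b < inf:
        return log((r - a) / (b - r))  # bounded interval [a, b]
    if a > -inf:
        return log(r - a)              # left-bounded [a, infinity)
    if b < inf:
        return -log(b - r)             # right-bounded (-infinity, b]
    return r                           # unbounded: identity


# A percentage score of 50 lies at the centre of [0, 100] and maps to zero.
centre = transform(50.0, a=0.0, b=100.0)
```

Each branch is strictly increasing in `r`, so the direction of achievement is retained. A result lying exactly on a finite bound corresponds to an infinite transformed value and raises an exception in this sketch rather than being handled explicitly.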

As we use medians rather than means and the above transformations are strictly increasing functions, it follows that the medians of the transformed performance measures equal the transformed medians of the performance measures. This implies that we can save on computational effort by calculating class and overall medians before transformation, rather than transforming all the data before calculating the medians. Hence, $\tilde{t}_{i\cdot}=f(\tilde{r}_{i\cdot})$ for $i=1,\dots,m$ and $\tilde{t}_{\cdot\cdot}=f(\tilde{r}_{\cdot\cdot})$ from Equation (5).

Having transformed the class and overall medians onto the set of real numbers, we now use additive models, rather than multiplicative models, when evaluating the scoring results. In this case, we replace Equation (1) with the weighted arithmetic mean

$$\hat{t}_{i\cdot}=\frac{n_i\tilde{t}_{i\cdot}+\tilde{t}_{\cdot\cdot}}{n_i+1} \tag{6}$$

and then determine the arithmetic adjustment or difference for class $i$. This replaces Equation (2) and is defined by

$$y_i=\min_{k}\hat{t}_{k\cdot}-\hat{t}_{i\cdot} \tag{7}$$

if scoring results close to the lower bound of $r$ are desirable, or

$$y_i=\max_{k}\hat{t}_{k\cdot}-\hat{t}_{i\cdot} \tag{8}$$

if scoring results close to the upper bound of $r$ are desirable. In both cases, the transformed scoring result for athlete $j$ in class $i$ is given by

$$u_{ij}=y_i+t_{ij} \tag{9}$$

and this enables us to determine the corresponding untransformed scoring result for this athlete. This is given by

$$s_{ij}=f^{-1}(u_{ij}) \tag{10}$$

where

$$s=f^{-1}(u)=\begin{cases}\dfrac{a+be^{u}}{1+e^{u}} & \text{for } s\in[a,b]\\[1ex] a+e^{u} & \text{for } s\in[a,\infty)\\ b-e^{-u} & \text{for } s\in(-\infty,b]\\ u & \text{for } s\in(-\infty,\infty)\end{cases} \tag{11}$$

from Equation (5), which replaces Equation (3). It is easy to prove that this generalised shrinkage method reduces to the existing shrinkage method for minimising racing times that satisfy $r_{ij}\in[0,\infty)$.
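The whole generalised procedure, Equations (5) to (11), can be sketched as follows. The function names `f`, `f_inv` and `generalised_shrinkage`, the use of infinite values to signal a missing bound, and the percentage scores are all assumptions made for this illustration; they are not taken from the paper or its appendices.

```python
from math import exp, inf, log
from statistics import median


def f(r, a, b):
    """Transformation of Equation (5)."""
    if a > -inf and b < inf:
        return log((r - a) / (b - r))
    if a > -inf:
        return log(r - a)
    if b < inf:
        return -log(b - r)
    return r


def f_inv(u, a, b):
    """Inverse transformation of Equation (11)."""
    if a > -inf and b < inf:
        return (a + b * exp(u)) / (1 + exp(u))
    if a > -inf:
        return a + exp(u)
    if b < inf:
        return b - exp(-u)
    return u


def generalised_shrinkage(results_by_class, a=-inf, b=inf, maximise=False):
    """Return untransformed scoring results for each class."""
    overall = median([r for rs in results_by_class.values() for r in rs])
    t_overall = f(overall, a, b)

    # Equation (6): shrink transformed class medians towards the overall one.
    shrunk = {}
    for cls, rs in results_by_class.items():
        n = len(rs)
        shrunk[cls] = (n * f(median(rs), a, b) + t_overall) / (n + 1)

    # Equation (7) or (8): reference value for the best class.
    best = max(shrunk.values()) if maximise else min(shrunk.values())

    # Equations (9) and (10): adjust on the real line, then back-transform.
    return {
        cls: [f_inv(best - shrunk[cls] + f(r, a, b), a, b) for r in rs]
        for cls, rs in results_by_class.items()
    }


# Hypothetical percentage scores on [0, 100], where large values are best.
percentages = {"O": [70.0, 75.0, 82.0], "P": [62.0, 66.0, 74.0]}
scores = generalised_shrinkage(percentages, a=0.0, b=100.0, maximise=True)
```

With `a=0.0`, no upper bound and `maximise=False`, the adjustment on the log scale is equivalent to multiplying each time by the scaling factor of Equation (2), which is the reduction to the original shrinkage method noted above.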

Before proceeding, we note that exceptional athletes can achieve perfect scores in some sports, particularly those with bounded ranges such as diving or figure skating. However, this does not pose a problem as the arithmetic adjustments within generalised shrinkage do not require transformations of individual scoring times, merely of class and overall medians. Although transformations of individual scoring times are included within Equations (9) and (10), the net effect of transforming, adjusting and back-transforming a perfect score is the same perfect score, so the analysis will rightly recommend the gold medal for such an athlete.

We also note that small class sizes can be included in the analysis, though this leads to the possibility of collusion. However, the use of medians, rather than means, provides security against this and pooling very small classes would also help to avoid collusion. In theory, it would even be possible to introduce a continuum disability scale, which effectively assigns a different class to each athlete. However, this scheme would be very difficult to implement in practice. The main problem is that the classification of each athlete would be very subjective and susceptible to bribery and corruption, so rendering this analysis unworkable.

4. Comparing Olympic and Paralympic athletes

Percy (2011) applied the shrinkage method to compare the performances of athletes in different Paralympic classes. We now investigate whether the method is able to compare such performances with those of Olympic athletes, effectively regarding the latter as belonging to a different disability class. The purposes are not to remove open competition, but rather to enable fair comparisons of athletes with different abilities.

We define the adjective “fair” in this context as ranking competitors from the classes under consideration in order of their departures from corresponding class averages, such that exceptional relative performances are rated best. In particular, though somewhat tautologically, the rankings are determined by the values of the transformed scoring results $u_{ij}$ of Equation (9).

The transformation and shrinkage that are applied to the observed results are designed to map the data onto the set of real numbers, such that the differences between classes merely represent additive disparities and the variances of different classes are roughly equal. This is a common assumption in generalised linear modelling, specifically the analysis of variance, which is much related to the methodology considered here. The transformed scoring results are then compared directly to identify athletes that performed exceptionally well relative to other competitors in their particular classes.

Although Olympic and Paralympic Games are held at the same venues at about the same times, there is still little true integration as a result of which Paralympic sports remain marginalised. The shared competitions proposed in this paper are certainly not replacements for the separate games. Rather, they are intended to generate supplementary rewards that encourage further cooperation. These might take the form of an extra layer of medal awards that appeals to sponsors and competitors, so offering incentives for tournament organisers to integrate the sports. Sponsors, including broadcast media and sports manufacturers, benefit through extra publicity and marketing opportunities. Competitors, both Olympic and Paralympic, benefit by higher profiles and improved unification of the two games.

Some sports are offered in only one of these two types of games, in which case comparisons of this nature are inappropriate. Even so, it is still possible to unite Olympics and Paralympics further by merging events and ceremonies and by introducing more unified sports. Extracting results data from the websites www.olympic.org and www.paralympic.org, we now consider two sports where such comparisons are possible.

4.1. London 2012 Summer Games

The summer sport that we consider here is mixed-sex equestrian individual dressage, where the best performance corresponds to the largest percentage score achieved of the form $r_{ij}\in[0,100]$. Corresponding to this event, medals were awarded for open competition in the Olympic Games and for each of five functional classes in the Paralympic Games:

(I) riders with impaired limb function, or poor balance and good upper limb function (subdivided into classes Ia and Ib);

(II) riders with locomotion impairment;

(III) blind riders with moderate locomotion impairment;

(IV) riders with some visual impairment or impaired function in one or two limbs.

Figure 1 contains a plot of grouped individual values that displays the observed and scoring percentages, the latter transformed using generalised shrinkage, for each Paralympic functional class and for the Olympic open class. As expected, the last of these classes performed best and so its observed and scoring results are equal. The rankings of individual athletes within each class are unchanged by transformation.

Figure 1. Observed and scoring percentages for London 2012.

The tables in Appendix 2 present the numerical values of the observed and scoring percentages for each functional class, along with overall rankings based on the transformed results using generalised shrinkage. Our method would allocate the gold and bronze medals to Olympic athletes from Class O, Charlotte Dujardin (GBR) and Adelinde Cornelissen (NED) respectively, and the silver medal to a Paralympic athlete from Class Ia, Sophie Christiansen (GBR). Interestingly, Sophie Christiansen graduated with a master’s degree in mathematics from Royal Holloway, University of London, which hosted the Operational Research Society’s 2014 conference where the author first presented this research. The reasons for these combined medal allocations are clear from Figure 1 and Appendix 2. These three winning athletes achieved considerably larger observed percentages than those of other competitors in their respective classes.

4.2. Sochi 2014 Winter Games

The winter sport that we consider here is men’s cross country skiing, where the best performance minimises a time of the form $r_{ij}\in[0,\infty)$. Corresponding to this event, medals were awarded for open competition over 15 km in the Olympic Games and for each of three functional classes in the Paralympic Games:

(I) 20 km classical technique visually impaired;

(II) 15 km sitting;

(III) 20 km classical technique standing.

The race distances vary among these classes for historical reasons, though they are reasonably similar. Consequently, the loglinear transformation of Equation (5), which generalised shrinkage applies in this setting, ensures that the classes are comparable. Figure 2 contains a plot of grouped individual values that displays the observed and scoring times (hours), the latter transformed using generalised shrinkage, for each Paralympic functional class and for the Olympic open class. As expected, the last of these classes performed best in terms of the observed values and so its observed and scoring results are equal. The rankings of individual athletes within each class are unchanged by transformation.

Figure 2. Observed and scoring times (hours) for Sochi 2014.

The tables in Appendix 3 present the numerical values of the observed and scoring times (hours, minutes and seconds) for each functional class, along with overall rankings based on the transformed results using generalised shrinkage. Our method would allocate all three medals to Paralympic athletes from Class III, Rushan Minnegulov (RUS), Ilkka Tuomisto (FIN) and Vladislav Lekomtcev (RUS) respectively.

The reasons for these combined medal allocations are clear from Figure 2 and Appendix 3. These three winning athletes achieved considerably smaller observed racing times than those of other competitors in their class, whereas the best results in other classes were less distinct from those of other competitors in their respective classes. Note that although there are considerably more athletes in the open Olympic class than in any of the Paralympic classes for this event, the generalised shrinkage method allows for differing class sizes and treats competitors in large classes fairly.

4.3. General observations

Before proceeding, it is important to reflect on the feasibility of the assumptions behind the shrinkage method. In particular, the theoretical justifications given by Percy (2013) relate to comparisons between Paralympic classes within a single sport. By extending the analysis to include Olympic classes, our investigations seem to present no problems and the analysis works well as intended. Here, Paralympic and Olympic classes merely represent different levels of physical or mental ability.

However, our analysis of the Sochi Winter Games reveals an interesting variation, as the actual race distances differ among the Olympic and Paralympic classes. Following transformations as described earlier in this paper, the generalised shrinkage method implicitly assumes similar locations, dispersions and skewnesses for all classes, in order that all competitors have equal opportunities to excel from their class packs. Hence, we must question whether this assumption is still valid for this scenario.

Although the race distances vary for cross-country skiing, a glance at Figure 2 is sufficient to show that the locations and dispersions of observed racing times are comparable across all Olympic and Paralympic classes. Moreover, the skewnesses of observed racing times in the different classes are all positive. Consequently, the assumptions underlying this method are reasonable for this analysis and suggest that comparisons of Olympic and Paralympic performances might generally be fair within any particular sport.

5. Comparing diverse categories

As the generalised shrinkage method appears to compare results fairly among differing classes of ability, it is natural to ask what other comparisons are feasible. Perhaps the most exciting prospect in this regard would be to compare the performances of male and female athletes, so that occasional competitions might enable them to compete fairly for an overall prize. However, other opportunities with significant benefits would arise by facilitating comparisons among youth, adult and senior competitors. And why not even compare different events after standardising the corresponding results?

In order to compare different events, we first need to standardise the results to allow for different distributions. As this algorithm already transforms the observed results onto the set of real numbers for intermediate calculations, we use the transformed results $t_{ij}$ from Equation (5) as the basis of this standardisation. Specifically, we propose that the $t_{ij}$ should be mapped linearly for each event, via translation and scaling, so that the mean and standard deviation of all transformed results for each event are 0 and 1, respectively, and so that better performances correspond to positive scores rather than negative scores.

This use of mean and standard deviation is appropriate as the transformed results are real numbers, and the values 0 and 1 arbitrarily but conveniently correspond to the additive and multiplicative identities for the complete ordered field of real numbers. Denoting the mean and standard deviation of the transformed results by $\mu$ and $\sigma$ respectively, the appropriate linear transformation is

$$z_{ij}=\pm\frac{t_{ij}-\mu}{\sigma} \tag{12}$$

where the plus sign is used if the aim is to maximise the original performance measure and the minus sign is used if the aim is to minimise the original performance measure. Pleasingly, this is also the linear equation that is used to standardise the normal probability distribution.
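The standardisation of Equation (12) amounts to a signed z-score, as the following sketch shows. The function name `standardise` and the example values are hypothetical, and the use of the population standard deviation (`pstdev`) rather than the sample version is an assumption made here, since the text does not specify which is intended.

```python
from statistics import mean, pstdev


def standardise(t_values, maximise):
    """Signed z-scores so that better performances get larger values."""
    mu = mean(t_values)
    sigma = pstdev(t_values)  # assumption: population standard deviation
    sign = 1.0 if maximise else -1.0
    return [sign * (t - mu) / sigma for t in t_values]


# Hypothetical transformed racing times, where small values are best.
t_times = [1.0, 2.0, 3.0, 6.0]
z_times = standardise(t_times, maximise=False)
```

Because the aim in this example is to minimise the original measure, the smallest transformed time receives the largest standardised score, so results from a minimising event and a maximising event can be placed on one common scale.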

Figure 3 displays the standardised scores resulting from both events considered earlier, mixed-sex equestrian individual dressage at London 2012 and men’s cross country skiing at Sochi 2014. The best rated performances are those with the largest values. As the standardisation formula in Equation (12) involves a linear mapping, the performances across all classes within each event are ranked exactly as for the separate analyses presented in the previous section. However, it is very clear that performances across all classes are now comparable between these otherwise unrelated events. This offers a remarkable prospect for enhancing the appeal of competitive sports, across different categories and activities.

Figure 3. Standardised scoring results for London 2012 and Sochi 2014.

Actual comparisons between these two events lead to the conclusion that the gold, silver and bronze medallists would be the three respective equestrian winners identified previously, when generalised shrinkage was used to compare classes. This is slightly disappointing, as it leads us to question whether skiers are treated unfairly by this standardisation approach, particularly as there were more skiers than equestrians. Glancing back at Figures 1 and 2, it is tempting to suggest that the skewness of results explains why equestrians perform better than skiers in this combined analysis. Specifically, the equestrian results in this study tend to cluster around an average with a few exceptionally good performances, whereas the skiing results in this study tend to cluster around an average with a few relatively bad performances.

However, these differences in skewness are less important than they seem, as our standardisation algorithm applies to the transformed results $t_{ij}$ of Equation (5) rather than the observed results $r_{ij}$. Moreover, six skiers finished in the top twenty places of the combined analysis. These include the three winning skiers in Class III, who finished in fourth, seventh and eighth positions. Regardless of skewness directed towards good or bad performances, it is entirely reasonable that the declared winners are those athletes who registered performances that were considerably better than those of the other elite athletes in their sport, which is what occurred here. Nevertheless, these observations suggest that caution should be exercised when comparing performances from different sports if the class skewness varies substantially.

6. Discussion

This article generalises the shrinkage method for class handicapping that was proposed by Percy (2011, 2013), by extending it to allow for maximisation of performance measures and to allow for bounded and unbounded performance measures. It applies this new generalised shrinkage method in a novel way, to compare the performances of Paralympic athletes with those of Olympic athletes. It uses sets of data from two representative sports, mixed-sex equestrian individual dressage (London 2012) and men’s cross country skiing (Sochi 2014), to illustrate these comparisons. Our results demonstrate that such comparisons among athletes of mixed abilities are fair, according to the definition presented in this article, and enable new layers of incentives and rewards further to integrate Olympic and Paralympic Games.

We do not advocate the removal of any existing medal structure, but merely hope to enhance diversity and inclusivity in sport. This could supplement other methods that are currently used to achieve improved cooperation between Olympic and Paralympic competitions, including the merging of events and ceremonies, and the introduction of more unified sports. In a distinct section, we even consider the application of generalised shrinkage to compare the results arising from the two distinct events considered previously, with interesting and pleasing conclusions. However, the possibilities seem to be considerably more varied than this.

First, this method is generic and might be used to compare results arising from different competitions, perhaps to determine the best performing golfer or tennis player in any given year (Bozóki, Csató, & Temesi, 2016), the greatest hockey or football team of all time (Baker & McHale, 2015), or the best all-time athlete in any particular sport relative to peer competitors. For example, it might provide an alternative method for comparing the boxing achievements of former and current heavyweight champions Muhammad Ali and Manuel Charr, or determining who is the greatest football player of all time from among the likes of Pelé, Maradona, Ronaldo and Messi. An early application of shrinkage in a similar sporting context (American football) was presented by Everson (2007).

Second, and perhaps most remarkably, generalised shrinkage could be applied to compare results arising from different sports, as demonstrated above. This could potentially offer a further level of reward for achievement in multi-sport competitions, such as determining an overall best achieving athlete or gymnast, primarily for entertainment purposes. At least, such inter-sport rewards would encourage cooperation and collaboration that might lead to shared knowledge and resources. This methodology could potentially also lead to fairer scoring systems for multi-event competitions such as triathlon, heptathlon, decathlon and modern pentathlon. In the heptathlon and decathlon, for example, the final event is a middle distance race that is widely considered to be the hardest in which to achieve a good points score. Perhaps generalised shrinkage could help address such issues. The use of operational research in contexts like this offers an opportunity to illustrate the potential of our field to a wide audience in an interesting and easily digestible way.

Third, this method has tremendous potential for fair comparison of different categories of competitor in any particular sport. Examples for which this could offer substantial benefits include comparisons of professionals and amateurs, classes that employ different forms of technology, male and female athletes, and various age groups such as youth, adult and senior. Such combined results would enhance existing competitions by encouraging between-category activity. Indeed, many amateur sports involve distinct open and handicapped tournaments, and these comparisons could offer considerable benefits for the latter by reducing the amount of subjective handicapping required, so avoiding accidental and intentional bias.

Fourth, it is possible to broaden the scope of this algorithm substantially by applying it to non-sporting competitions. For example, consider students’ marks for several modules on a degree programme (Dalziel, 1998). The assessments might be of different standards and need not involve exactly the same students. Generalised shrinkage could be used, as an alternative to existing approaches, to standardise the marks for accurate ranking. Also in the context of higher education, this approach could be used by governing authorities to determine whether universities discriminate in their admission of undergraduate students or whether employers discriminate in their recruitment of graduates. Such possibilities clearly offer many opportunities for future applications of operational research.

Acknowledgements

The author is grateful to Bruce Warner and Hugh Daniel for helpful discussions about the role of class handicapping in Paralympic sports.

Notes

No potential conflict of interest was reported by the authors.

References

  • Baker, R. D., & McHale, I. G. (2015). Time varying ratings in association football: The all-time greatest team is .... Journal of the Royal Statistical Society Series A, 178, 481–492.
  • Ballı, S., & Korukoğlu, S. (2014). Development of a fuzzy decision support framework for complex multi-attribute decision problems: a case study for the selection of skilful basketball players. Expert Systems, 31, 56–69.
  • Bozóki, S., Csató, L., & Temesi, J. (2016). An application of incomplete pairwise comparison matrices for ranking top tennis players. European Journal of Operational Research, 248, 211–218.
  • Churilov, L., & Flitman, A. (2006). Towards fair ranking of Olympics achievements: the case of Sydney 2000. Computers and Operations Research, 33, 2057–2082.
  • Copas, J. B. (1983). Regression, prediction and shrinkage (with discussion). Journal of the Royal Statistical Society B, 45, 311–354.
  • Dalziel, J. (1998). Using marks to assess student performance, some problems and alternatives. Assessment & Evaluation in Higher Education, 23, 351–366.
  • Efron, B., & Morris, C. (1975). Data analysis using Stein’s estimator and its generalizations. Journal of the American Statistical Association, 70, 311–319.
  • Efron, B., & Morris, C. (1977). Stein’s paradox in statistics. Scientific American, 236, 119–127.
  • Everson, P. (2007). A statistician reads the sports pages. Chance, 20, 49–56.
  • Gomes, E. G., & Lins, M. E. (2008). Modelling undesirable outputs with zero sum gains data envelopment analysis models. Journal of the Operational Research Society, 59, 616–623.
  • Huber, P. J., & Ronchetti, E. M. (2009). Robust Statistics (2nd ed.). New York, NY: Wiley.
  • Li, G., Kou, G., Lin, C., Xu, L., & Liao, Y. (2015). Multi-attribute decision making with generalized fuzzy numbers. Journal of the Operational Research Society, 66, 1793–1803.
  • Li, Y., Lei, X., Dai, Q., & Liang, L. (2015). Performance evaluation of participating nations at the 2012 London Summer Olympics by a two-stage data envelopment analysis. European Journal of Operational Research, 243, 964–973.
  • Li, Y., Liang, L., Chen, Y., & Morita, H. (2008). Models for measuring and benchmarking Olympics achievements. Omega, 36, 933–940.
  • Lozano, S., Villa, G., Guerrero, F., & Cortés, P. (2002). Measuring the performance of nations at the Summer Olympics using data envelopment analysis. Journal of the Operational Research Society, 53, 501–511.
  • Nelder, J. A., & Wedderburn, R. W. M. (1972). Generalized linear models. Journal of the Royal Statistical Society Series A, 135, 370–384.
  • Percy, D. F. (2011). Interactive shrinkage methods for class handicapping. IMA Journal of Management Mathematics, 22, 139–156.
  • Percy, D. F. (2013). Generic handicapping for paralympic sports. IMA Journal of Management Mathematics, 24, 349–361.
  • Rezaei, J. (2015). Best-worst multi-criteria decision-making method. Omega, 53, 49–57.
  • Wang, H., Huwang, L., & Yu, J. H. (2015). Multivariate control charts based on the James-Stein estimator. European Journal of Operational Research, 246, 119–127.
  • Yang, M., Li, Y. J., & Liang, L. (2015). A generalized equilibrium efficient frontier data envelopment analysis approach for evaluating DMUs with fixed-sum outputs. European Journal of Operational Research, 246, 209–217.
  • Zhang, D., Li, X., Meng, W., & Liu, W. (2009). Measuring the performance of nations at the Olympic Games using DEA models with different preferences. Journal of the Operational Research Society, 60, 983–990.

Appendix 1

Justification of transformations in Equation (5)

Our challenge is to determine a convenient, coherent mapping of all possible results ranges onto the set of real numbers, in order to apply an additive version of the shrinkage method for class handicapping.

Theorem:

The transformations in Equation (5) uniquely comprise the simplest bijective function $r \mapsto f(r)$ that maps $[a,b] \to \mathbb{R}$ and is consistent as $a \to -\infty$, $b \to \infty$ or both.

Proof. Nelder and Wedderburn (1972) show that the canonical link function for mapping $[0,1] \to \mathbb{R}$ in generalised linear modelling is the logit link defined by $r \mapsto \ln\{r/(1-r)\}$. Accordingly, this is the natural bijective function to use as a default basis for constructing appropriate maps that apply to other ranges.

For $r \in [a,b]$, a linear transformation of $r$ extends this logit link to give

$$r \mapsto \ln\frac{r-a}{b-r}. \tag{A1}$$

However, $\ln\{(r-a)/(b-r)\} \to \pm\infty$ as $a \to -\infty$ or $b \to \infty$, so we scale the argument in Relation (A1) by a constant ratio of linear terms involving $a$ and $b$ to ensure that the function does not diverge as either limit becomes infinite. This gives

$$r \mapsto \ln\frac{(b-c)(r-a)}{(c-a)(b-r)} \tag{A2}$$

in terms of an unspecified constant $c \in (a,b)$. As these limits separately become infinite, the function of $r$ defined by Relation (A2) satisfies

$$\lim_{a \to -\infty} \ln\frac{(b-c)(r-a)}{(c-a)(b-r)} = \ln\frac{b-c}{b-r} \tag{A3}$$

and

$$\lim_{b \to \infty} \ln\frac{(b-c)(r-a)}{(c-a)(b-r)} = \ln\frac{r-a}{c-a} \tag{A4}$$

both of which remain finite.

Unfortunately, these expressions show that $r \mapsto 0$ for all $r$ as both $a \to -\infty$ and $b \to \infty$, which is inconsistent with the required generality to encompass the unbounded results range $r \in \mathbb{R}$. To avoid this problem, consider the Mercator series expansion $\ln(1+x) = x - x^2/2 + x^3/3 - \cdots$ for $-1 < x \le 1$. Fixing $r$ and allowing $b$ to vary in the mapping of Equation (A3) gives

$$\ln\frac{b-c}{b-r} = -\ln\left(1 - \frac{r-c}{b-c}\right) = \frac{r-c}{b-c} + O\left(b^{-2}\right) \tag{A5}$$

for $c < (b+r)/2$ so that $|(r-c)/(b-c)| < 1$. Fixing $r$ and allowing $a$ to vary in the mapping of Equation (A4) gives

$$\ln\frac{r-a}{c-a} = \ln\left(1 - \frac{c-r}{c-a}\right) = \frac{r-c}{c-a} + O\left(a^{-2}\right) \tag{A6}$$

for $c > (a+r)/2$ so that $|(c-r)/(c-a)| < 1$. Hence, we can avoid this asymptotic mapping to zero for all $r$ as both $a \to -\infty$ and $b \to \infty$ by scaling the function in Equation (A2) by a factor $(b-c)(c-a)$, which is the product of the principal denominators in Equations (A5) and (A6). In order to ensure that our mapping remains strictly increasing and finite as either limit becomes infinite, we divide this product by the simplest factor that guarantees this property, which is $(b-a)$.
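The first-order behaviour claimed in these two expansions can be checked numerically. The following is a minimal sketch only: the function names `upper_gap` and `lower_gap` and the values of $r$ and $c$ are illustrative assumptions, not part of the method. Each function returns the difference between a logarithm and its leading term, which should shrink roughly a hundredfold for every tenfold increase in the size of the bound if the remainder is of second order.

```python
import math


def upper_gap(r, b, c):
    """ln((b-c)/(b-r)) minus its leading term (r-c)/(b-c)."""
    return math.log((b - c) / (b - r)) - (r - c) / (b - c)


def lower_gap(r, a, c):
    """ln((r-a)/(c-a)) minus its leading term (r-c)/(c-a)."""
    return math.log((r - a) / (c - a)) - (r - c) / (c - a)


# Illustrative values only (hypothetical, not taken from any competition).
r, c = 70.0, 50.0
for bound in (1e3, 1e4, 1e5):
    # Upper bound b = bound; lower bound a = -bound.
    print(bound, upper_gap(r, bound, c), lower_gap(r, -bound, c))
```

The printed gaps are small relative to the leading terms, which is why multiplying by the denominators $(b-c)$ and $(c-a)$ recovers a non-degenerate limit of $r-c$.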

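Collating all these results leads to the general functional forms

$$f(r) = \begin{cases} \dfrac{(b-c)(c-a)}{(b-a)} \ln\dfrac{(b-c)(r-a)}{(c-a)(b-r)} & \text{if } r \in [a,b] \\[2ex] (c-a) \ln\dfrac{r-a}{c-a} & \text{if } r \in [a,\infty) \\[2ex] (b-c) \ln\dfrac{b-c}{b-r} & \text{if } r \in (-\infty,b] \\[2ex] r-c & \text{if } r \in (-\infty,\infty) \end{cases} \tag{A7}$$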

where in each case the constant $c$ lies inside the corresponding range of $r$ values. Although we imposed further constraints on $c$ in order to analyse variable bounds, these can now be relaxed as the results ranges are fixed for all sporting competitions. These functions are consistent with one another and offer a simple bijective mapping from any real interval onto the set of real numbers $\mathbb{R}$. Moreover, any strictly increasing linear transformations of these functions yield identical results in practice, as users see only the back-transformed scoring results $s$ as determined by Equation (11).
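To illustrate the consistency claimed here, the four cases of (A7) can be coded as a single function and compared as the bounds grow. This is a sketch under stated assumptions: the name `transform`, the use of infinite floating-point values to signal a missing bound, and the sample numbers are all hypothetical choices made for illustration. Results exactly on a finite bound map to $\pm\infty$ and are not handled.

```python
import math


def transform(r, a=-math.inf, b=math.inf, c=0.0):
    """Map a result r with a < r < b onto the real line, following (A7).

    An infinite a or b signals that the range is unbounded on that side.
    The constant c must lie strictly inside the results range.
    """
    has_lower = math.isfinite(a)
    has_upper = math.isfinite(b)
    if has_lower and has_upper:
        scale = (b - c) * (c - a) / (b - a)
        return scale * math.log((b - c) * (r - a) / ((c - a) * (b - r)))
    if has_lower:
        return (c - a) * math.log((r - a) / (c - a))
    if has_upper:
        return (b - c) * math.log((b - c) / (b - r))
    return r - c


# Hypothetical values: a result of 70 with reference constant c = 50.
r, c = 70.0, 50.0
print(transform(r, 0.0, 100.0, c))      # bounded range [0, 100]
print(transform(r, 0.0, 1e9, c))        # upper bound very large
print(transform(r, 0.0, math.inf, c))   # half-bounded range [0, inf)
print(transform(r, -1e9, 1e9, c))       # both bounds very large
print(transform(r, c=c))                # unbounded range, equals r - c
```

With a very large finite bound, the bounded form should agree closely with the corresponding half-bounded or unbounded form, and every case maps $r = c$ to zero.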

Finally, we simplify the formulae in Equation (A7) for practical applications by specifying suitable values for the constant $c$ in these four situations. These values may differ as every implementation of this method involves only one results range with fixed bounds. The simplest expressions arise by setting $c$ to be $(a+b)/2$, $a+1$, $b-1$ and $0$, respectively. We then scale the first term by a constant factor of $4/(b-a)$ for convenience as justified in the previous paragraph, which then uniquely generates the functional forms in Equation (5).
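As a check on this final step, the substitutions can be written out explicitly. The following sketch is derived only from (A7) and the stated values of $c$; it shows the forms that we would expect to coincide with Equation (5). For the bounded case, $c = (a+b)/2$ gives $b-c = c-a = (b-a)/2$, so the leading factor in (A7) is $(b-a)/4$ and the ratio inside the logarithm reduces to $(r-a)/(b-r)$.

$$
\begin{aligned}
c = \tfrac{a+b}{2}: &\quad \frac{4}{b-a} \cdot \frac{b-a}{4} \ln\frac{r-a}{b-r} = \ln\frac{r-a}{b-r}, && r \in [a,b], \\
c = a+1: &\quad (1)\ln\frac{r-a}{1} = \ln(r-a), && r \in [a,\infty), \\
c = b-1: &\quad (1)\ln\frac{1}{b-r} = -\ln(b-r), && r \in (-\infty,b], \\
c = 0: &\quad r - 0 = r, && r \in (-\infty,\infty).
\end{aligned}
$$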

Appendix 2

Mixed-sex equestrian individual dressage (London, 2012)

Appendix 3

Men’s cross country skiing (Sochi, 2014)