Winning one-day international cricket matches: a cross-team perspective

Pages 39-58 | Received 29 Sep 2021, Accepted 05 Feb 2022, Published online: 20 Feb 2022

ABSTRACT

The study analyses the predictors of a win for four international cricket teams in the one-day international cricket format. A binary logistic regression is used to determine the relationship between the independent variables, i.e., fours and sixes scored, bowling economy, extras conceded, fielding dismissals, the number of debutants from each side, umpire’s nationality, pitch condition, and season of play vis-à-vis odds of a win. The study found that the number of fielding dismissals and bowler economy significantly influence the odds of winning for all four teams. Further, the nationality of the umpire did not affect any team, while other variables influenced the fortunes of different teams differently. Proposed models in the paper can be used by team management and coaches in devising match strategy and player selection for higher win outcomes based on a combination of historical trend data for specific variables and actual data for the others.

1. Introduction

Cricket is a popular team sport played today in more than a hundred countries. It has its origins in sixteenth-century England (Kaluarachchi & Aparna, 2010). At present, there are three playing formats accepted officially by the International Cricket Council (ICC), namely, five-day Test matches, 50-over One Day Internationals (ODI), and the most recently adopted Twenty-Twenty cricket (T20). In sports analytics in general, and in an evolving game like cricket in particular, researchers and practitioners have recently shown considerable interest in predictive models built on data mining techniques and machine learning (Jain et al., 2021; Passi & Pandey, 2018; Pathak & Wadhwa, 2016). While research using data analytics has progressed significantly in other sports such as soccer (Duan & Chakravarty, 2021), basketball (Maymin, 2021), and tennis (Angelini et al., 2022), research on cricket remains scarce (Jain et al., 2021). In this study, we examine the different dimensions of ODI matches and their interplay in influencing the win-loss outcome, modelled at a country level (not assuming one size fits all). The scope of this paper is a binary logistic regression model that predicts match outcomes from a large set of independent variables; predicting match outcomes with machine learning models is left as an area of future research.

The contribution of the paper is threefold. First, it analyses the team-level data for the four national sides with the longest history of playing One Day International (ODI) cricket to model the winning strategy. We have focussed on a specific team-based strategy rather than a generic one-size-fits-all approach, as in most studies documented in extant literature (Ahmed et al., 2013; Cannonier et al., 2015; Jain et al., 2021; Premkumar et al., 2020). Second, the predictive models have been tested and validated to devise match strategies for the respective teams, which can be used not just by coaches and team administration, but also by sponsors of teams and media companies broadcasting or streaming the coverage, who can anticipate viewership based on predicted match results. Third, to the best of our knowledge, several dimensions are considered together for the first time: number of debutants representing each side, season of play, number of bowlers with a specific economy rate, extra runs conceded by each side, and pitch condition. In addition, other dimensions documented in the extant literature, such as the nationality of umpires, fielding dismissals, and number of boundaries, to name a few, have also been analysed to arrive at the strategy recommendations. While some of the variables may seem to be intermediate outcomes rather than predictors of match results, historical data on these variables may be used as an input. For example, for independent variables like the number of fielding dismissals, the number of bowlers with a particular economy rate, or the number of sixes and fours scored, the historical means against the specific opponents may be used. This adds to the robustness of the paper’s proposed predictive model and addresses a significant gap in extant literature. It is also interesting to note that, although scoring more sixes and fours might be expected to lead directly to victory, the results do not reflect the same and vary for different teams (elaborated in the Discussion section). We discuss the interaction effects between the various dimensions to propose a team-specific win strategy.

The paper unfolds with a brief review of relevant literature, followed by the empirical framework and the data set used for analysis. The data section explains the different variables and their operationalisation. The results section follows, and then a discussion of strategic implications for the four teams studied in the paper and the model validation. We conclude the paper with managerial implications, limitations, and future research directions.

2. Review of literature

In the chapter titled “Research Directions in Cricket” published in the book “Handbook of Statistical Methods and Analyses in Sports”, Swartz (2016) has identified significant research trends in cricket analytics. Broadly there are six leading research themes – home team advantage, the influence of umpiring decisions, the effect of batting first, partnerships among batters, change of rules by cricketing bodies, and determining the composition of a team. Other major studies conducted in cricket analytics include player performance, team strength, optimal line-ups, and tactics. Norman and Clarke (2010), focusing on batting, have optimised batting order across all game formats using dynamic programming. Analytical studies on cricket have also been conducted from the perspective of sponsorship (Currie, 2000; Donlan, 2014; Goldman & Johns, 2009), motivation to follow matches (Bennett et al., 2007; Kashif et al., 2019), and public relations (Hopwood, 2005a, 2005b). For more details on the different dimensions of cricket and strategy covered in the extant literature, Appendix A may be referred to.

Our review of the literature indicated that some critical determinants of match outcomes have remained unexplored. More importantly, while generic strategic recommendations were covered (Cannonier et al., 2015), different teams have different playing styles, characteristics, and beliefs. Hence, it is fair to hypothesise that explanatory variables will influence match outcomes differently for different teams. For example, the role of debutants (where the urge to perform is paramount to ensure a place in the side) could lead to favourable results for one team, while this may not be true for others. While home side advantage has been studied in the past, the season of play (i.e., winter, summer, autumn, or spring) has been included for analysis in this study. For a team with a strong battery of seam bowlers (e.g., Pakistan, Australia), the season may have a crucial role to play. Other critical determinants considered for the first time are the count of bowlers with a specific economy rate, extra runs conceded by each side, and pitch condition measured as a trinomial categorical variable. We find that studies have dealt with extra runs as a part of runs conceded per over, which is an aggregated measure and does not break the problem down to the tactical level of managing extras, which has both bowling and fielding implications. Finally, this paper considers three kinds of pitch conditions – home pitches, pitches similar to home, and pitches not similar to home. Overall, the choice of variables for this research has been based on past inclusions, exclusions, and granularity (e.g., extra runs). The paper addresses this gap in extant literature by adopting a team-specific approach to predict win/loss outcomes rather than a generic approach.

3. Empirical framework

Batting, bowling, and fielding have been treated as the inputs influencing match outcome within the production function approach (Cannonier et al., 2015). Prior studies have adopted multiple regression methods to model the relationship between match outcome and independent variables (Akhtar & Scarf, 2012; Asif & McHale, 2016; Bailey & Clarke, 2006; Brooks et al., 2002; Khan et al., 2019). This study adds an input categorised as “others”, representing dimensions such as the nationality of umpires, number of debutants, pitch condition, and season of play. Hence, the match result is expressed as a function in Equation 1.

$$\text{Match Outcome} = f(\text{Batting}, \text{Bowling}, \text{Fielding}, \text{Others}) \tag{1}$$

The match outcome is modelled as a logit model. The dependent variable, y*, is the match outcome (i.e., win/loss) expressed as a binary ordinal variable and measured as a latent variable (Brooks et al., 2002; Dawson et al., 2009). Matches with outcomes other than a win or a loss (for example, a tie or an abandoned match) are not included in the analysis, as the focus of the study is on analysing a strategy to win. The structure of the latent variable model is expressed in Equation 2.

$$y_j^* = x_j\beta + e_j \tag{2}$$

Here, $x_j$ denotes the explanatory variables representing batting, bowling, fielding, and other dimensions, $\beta$ is the vector of parameters to be estimated, and $e_j$ is the random error term. $y_j^*$ is the unobserved latent variable; the measured dependent variable $y_j$ records whether the team won (=1) or lost (=0), where:

$$y_j = 1 \text{ if } y_j^* > 0, \quad \text{and} \quad y_j = 0 \text{ if } y_j^* \le 0$$

Hence the logit model can be denoted as

$$P(y_j = 1 \mid x_j) = P(y_j^* > 0 \mid x_j) = \frac{e^{x_j\beta}}{1 + e^{x_j\beta}} = \Lambda(x_j\beta) \tag{3}$$

Here, Λ(·) is the cumulative distribution function of the standard logistic distribution. The model is estimated separately for each of four national teams (Australia, England, India, and Pakistan) to allow a cross-country comparison of win/loss data. Explanatory variables were recorded for every opposition that played against these countries during the time period under consideration.
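As a minimal sketch of Equation 3, the snippet below maps a linear predictor x·β to a win probability through the logistic CDF. The function names and the example values are illustrative assumptions, not taken from the paper; the 0.5 cut-off for a predicted win is a common convention rather than something the text specifies.

```python
import math

def win_probability(x, beta):
    """Equation 3: P(y = 1 | x) = exp(x.beta) / (1 + exp(x.beta))."""
    z = sum(xi * bi for xi, bi in zip(x, beta))
    # Evaluate the logistic CDF in a numerically stable way for large |z|.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def predict_win(x, beta, cutoff=0.5):
    """Classify a match as a win (1) or loss (0); the cut-off is an assumption."""
    return 1 if win_probability(x, beta) > cutoff else 0

# Hypothetical predictor values and coefficients (first entry is the intercept).
x_example = [1.0, 2.0]
beta_example = [0.5, -0.25]
p_example = win_probability(x_example, beta_example)  # z = 0, so p = 0.5
```

A linear predictor of zero corresponds to even odds; positive values push the probability above one half and negative values below it.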

3.1. Data

Data was collected for ODI matches played by the men’s national teams of Australia, England, India, and Pakistan from the website https://www.espncricinfo.com/ (see Figure 1). The data ranges from when these countries played their first official ODI matches to the ICC World Cup that concluded in July 2019. A total of 852 matches were considered for Australia (from January 5th, 1971 till July 11th, 2019), 654 matches for England (from January 5th, 1971 till July 11th, 2019), 926 matches for India (from July 13th, 1974 to July 9th, 2019), and 849 matches for Pakistan (from February 11th, 1973 till July 5th, 2019). Refer to Figure 2 for the breakdown of matches played against different opponents by these nations. It may be noted that all games without a win/loss result have been excluded from the analysis, as the focus is to analyse the game-winning strategy. Illustratively, the 2019 ICC World Cup final between England and New Zealand, which ended in a tie and was decided on the number of boundaries scored, has also been excluded from the analysis.

Figure 1. ODI matches played year wise.

Figure 2. ODI matches played by country.

From the first ODI played on January 5th, 1971, till July 11th, 2019, a total of 8384 official ODI games were held globally, and this analysis includes 40.84% (3424 matches) of them. The reasons for selecting Australia, England, India, and Pakistan for the research were the duration over which they have played ODI matches, the share of ODIs played by them, their contribution to the sport commercially and in viewership, and the geographical representation of the three major cricket-playing continents in the world. This selection also captures different playing styles and hence the tactics and strategies adopted for winning.

3.2. Variables

In all, there are two batting variables representing batting prowess through the number of fours (FOUR) and sixes (SIX) hit by the respective teams. While the number of boundaries scored by the teams studied in the paper (Australia, England, India, and Pakistan, henceforth known as focal teams) was expected to influence the outcome of wins positively, the relationship was inverse when boundaries scored by opponents were higher. For bowling, two variables were considered. First, the frequency count of the number of bowlers (BOWECON) figuring below a specific economy rate (measured in terms of runs per over, RPO); and second, extra runs (EXTRA) conceded. As the game rules, playing conditions, and strategies evolved significantly between 1971 and 2019, BOWECON was operationalised considering three different time eras. The average runs per over have been steadily increasing since the early days of cricket. The average RPO pre-1986 for all official ODI matches played by all the nations was 4. The same statistic was 4.5 for 1986–1999 and 5 RPO for games post-1999. As our estimation involved only four teams, based on expert inputs, the norms were fixed as 3.5, 4.5, and 5.5 RPO for the pre-1986, 1986–1999, and post-1999 periods, respectively.
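The era-dependent BOWECON count described above can be sketched as follows. The thresholds (3.5, 4.5, and 5.5 RPO) and the era boundaries come from the text; the function names, the input layout (runs conceded and legal balls bowled per bowler), and the strict "below the norm" comparison are assumptions made for illustration. The text does not say whether a minimum number of overs was required, so none is applied here.

```python
def economy_norm(year):
    """Economy-rate norm (runs per over) for the era in which the match was played."""
    if year < 1986:
        return 3.5   # pre-1986
    if year <= 1999:
        return 4.5   # 1986-1999
    return 5.5       # post-1999

def bowecon(year, bowlers):
    """Count the bowlers whose economy rate is below the era norm.

    bowlers: list of (runs_conceded, balls_bowled) tuples; this layout is hypothetical.
    """
    norm = economy_norm(year)
    count = 0
    for runs, balls in bowlers:
        if balls == 0:
            continue  # a bowler who did not bowl has no economy rate
        runs_per_over = runs * 6 / balls
        if runs_per_over < norm:
            count += 1
    return count

# Hypothetical bowling figures: economy rates of 3.0, 4.8, and 5.4 RPO.
figures = [(30, 60), (48, 60), (54, 60)]
counts_by_era = {year: bowecon(year, figures) for year in (1985, 1990, 2005)}
```

The same bowling figures yield different counts depending on the era, which is the point of using era-specific norms.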

For every match involving focal teams, the frequency count of the number of bowlers meeting the criteria was recorded for both the focal team and its opponent. For extra runs conceded, match-by-match data was scanned, and extras were summed up (i.e., byes, leg byes, wides, no balls, and penalty runs). A frequency count of dismissals (FIELDING) involving fielding skills (i.e., run out, stumping, catch) was made for every match for both the focal team and its opponent to arrive at the single variable in the fielding category. The largest number of variables falls under “others”, including season of play (SEASON), pitch condition (PITCH), nationality of umpires (UMPIRE), and the number of debutants (DEBUTANT). See Table 1 for a description of the variables and their expected relationship with the dependent variable – match outcome of a win (WIN). It may be noted that the expected signs signify the direction of influence of the explanatory variable on WIN for the focal team under consideration. Further, each metric variable (FOUR, SIX, BOWECON, EXTRA, FIELDING, and DEBUTANT) is recorded twice, once for the focal team and once for its opponent.

Table 1. Definition of variables.

4. Results

From a batting perspective, we observed that the mean value of the number of fours and sixes scored by the Indian side was the highest (see Table 2). Opponents average the fewest fours against the Pakistani bowling attack, followed by the Australian attack, making these two sides the hardest to score boundaries against. Matches involving India also have the highest number of sixes scored per match (5.5 sixes per match), making the games more exciting from the viewers’ perspective.

Table 2. Descriptive statistics.

For the variable BOWECON, the mean value is highest for the Australian and Pakistan teams, making them the most difficult to score runs off. Regarding conceding extras, India and Pakistan, the sub-continent teams, lack discipline, with England being the most restrictive and the only team conceding fewer extras than its opponents. Australia, England, and India performed better in fielding dismissals with a higher average per match, whereas Pakistan trails its opponents. For Pakistan, the gap is about 0.5 dismissals on average, indicating that fielding is its weakness, while Australia has an advantage at an average of 0.8. Hence, this has implications for team selection, especially if FIELDING is one of the critical variables influencing the outcome. Further, it gains significance when playing season and pitch conditions favour seam bowlers, leading to more opportunities for dismissals.

In “others”, DEBUTANT is the only metric variable, and the rest (i.e., SEASON, PITCH, and UMPIRE) are categorical. It was seen that England and Australia play more than 70% of their matches on home pitches or pitches similar to home. In comparison, Pakistan has the maximum share of games (40.8%) played on pitches similar to home (i.e., in Sharjah, Sri Lanka, Bangladesh, India, and Zimbabwe). Pitches similar to home for Australia and England include grounds in South Africa, New Zealand, and the West Indies (in addition to England or Australia, as the case may be). Home pitches for England also include Wales, Scotland, and Ireland. From the umpiring perspective, subcontinent teams have more than 40% of their matches officiated by neutral umpires, with the share being lowest for Australia at 18.9%. As far as the playing season is concerned, winter is the least popular among all four countries, owing to seam bowling conditions, fading light in the evenings, and hence shorter playing hours for day matches, to name a few reasons. India (38%) and Pakistan (35.2%) have their largest share of games played during Autumn, whereas Summer is the preferred season for Australia (53.7%) and England (40.8%). This has a bearing on game strategy, as weather conditions are less likely to assist bowlers during summer and autumn. Hence, batting conditions are likely to be better, and matches could result in high average scores, more boundaries, and more entertainment for the crowd.

Before proceeding with the estimation of the logit model, a test of multicollinearity was carried out by checking the Variance Inflation Factor (VIF). VIF values greater than ten reveal high multicollinearity among predictor variables (Diamantopoulos & Winklhofer, 2001). There was no such concern in our case, as VIF values ranged between 1.05 and 8.03. We also present the correlation analysis among the independent variables to check for perfect correlation. As seen in Tables 3–6, the maximum correlation between the independent variables is 0.503 (for Australia), 0.482 (for England), 0.554 (for India), and 0.459 (for Pakistan), indicating the absence of perfect correlation among the independent variables.
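For readers unfamiliar with the check, a VIF can be computed by regressing each predictor on all the others and taking 1 / (1 − R²). The sketch below uses NumPy least squares on small made-up columns; the function name and the data are illustrative assumptions and do not reproduce the paper's values.

```python
import numpy as np

def vif(X):
    """Variance Inflation Factor for each column of X (rows are observations)."""
    X = np.asarray(X, dtype=float)
    n, k = X.shape
    factors = []
    for j in range(k):
        target = X[:, j]
        others = np.delete(X, j, axis=1)
        # Regress column j on an intercept plus the remaining columns.
        A = np.column_stack([np.ones(n), others])
        coef, *_ = np.linalg.lstsq(A, target, rcond=None)
        resid = target - A @ coef
        ss_res = float(resid @ resid)
        ss_tot = float(((target - target.mean()) ** 2).sum())
        r2 = 1.0 - ss_res / ss_tot
        factors.append(float("inf") if r2 >= 1.0 else 1.0 / (1.0 - r2))
    return factors

# Made-up data: x2 nearly duplicates x1, x3 is unrelated to both.
x1 = np.arange(10.0)
x2 = x1 + np.tile([0.1, -0.1], 5)
x3 = np.array([3.0, 1.0, 4.0, 1.0, 5.0, 9.0, 2.0, 6.0, 5.0, 3.0])
example_vifs = vif(np.column_stack([x1, x2, x3]))
```

In this toy data the two near-duplicate columns produce VIFs far above the threshold of ten quoted above, while uncorrelated predictors give values close to one.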

Table 3. Correlation matrix for team Australia.

Table 4. Correlation matrix for team England.

Table 5. Correlation matrix for team India.

Table 6. Correlation matrix for team Pakistan.

The logit model was estimated using hierarchical regression, and the results for the different teams are captured in Table 7. Model 1 has only the explanatory variables for the individual teams, while Models 2, 3, and 4 add interaction effects: Model 2 adds BOWECON*FIELDING; Model 3 adds BOWECON*FIELDING and EXTRA*FIELDING; Model 4 adds BOWECON*FIELDING, EXTRA*FIELDING, and BOWECON*EXTRA. The model factoring an interaction of all three variables and others did not have a single interaction significant at 5% and hence was not included in the results for discussion. The Hosmer-Lemeshow test results for all four models for each side indicate satisfactory goodness-of-fit, with a significance value greater than 0.05. The Nagelkerke R² values for all models are over 0.7, indicating strong explanatory power. These results are discussed subsequently.
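The nested structure of Models 1 to 4 can be illustrated with a small sketch that adds product terms to a design matrix, fits each logit by Newton-Raphson, and reports Nagelkerke R². Everything here is an assumption for illustration: the data are synthetic standardised stand-ins, only three of the paper's variables appear, and the fitting routine is a generic maximum-likelihood solver rather than the software the authors used.

```python
import numpy as np

def fit_logit(X, y, iters=50):
    """Maximum-likelihood logit fit by Newton-Raphson; X must include the intercept."""
    beta = np.zeros(X.shape[1])
    for _ in range(iters):
        p = 1.0 / (1.0 + np.exp(-(X @ beta)))
        grad = X.T @ (y - p)
        # Tiny ridge on the Hessian only, to keep the linear solve stable.
        hess = (X * (p * (1 - p))[:, None]).T @ X + 1e-8 * np.eye(X.shape[1])
        step = np.linalg.solve(hess, grad)
        beta += step
        if np.max(np.abs(step)) < 1e-8:
            break
    return beta

def nagelkerke_r2(X, y, beta):
    """Cox-Snell R2 rescaled by its maximum attainable value."""
    n = len(y)
    z = X @ beta
    ll_model = float(np.sum(y * z - np.logaddexp(0.0, z)))
    p0 = y.mean()
    ll_null = n * (p0 * np.log(p0) + (1 - p0) * np.log(1 - p0))
    cox_snell = 1.0 - np.exp(2.0 * (ll_null - ll_model) / n)
    return float(cox_snell / (1.0 - np.exp(2.0 * ll_null / n)))

def design(cols, interactions):
    """Intercept, main effects, then one product column per interaction pair."""
    n = len(next(iter(cols.values())))
    parts = [np.ones(n)] + list(cols.values())
    parts += [cols[a] * cols[b] for a, b in interactions]
    return np.column_stack(parts)

# Synthetic, standardised stand-ins for three of the paper's variables.
rng = np.random.default_rng(0)
n = 500
cols = {name: rng.normal(size=n) for name in ("BOWECON", "FIELDING", "EXTRA")}
latent = (1.0 * cols["BOWECON"] + 0.8 * cols["FIELDING"]
          - 0.5 * cols["EXTRA"] + rng.logistic(size=n))
y = (latent > 0).astype(float)

bf, ef, be = ("BOWECON", "FIELDING"), ("EXTRA", "FIELDING"), ("BOWECON", "EXTRA")
models = {"Model 1": [], "Model 2": [bf], "Model 3": [bf, ef], "Model 4": [bf, ef, be]}
r2 = {}
for name, interactions in models.items():
    X = design(cols, interactions)
    r2[name] = nagelkerke_r2(X, y, fit_logit(X, y))
```

Because each model nests the previous one, the fitted log-likelihood, and with it Nagelkerke R², cannot decrease as interactions are added; whether an added interaction is statistically significant is a separate question.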

Table 7. Logit results for Australia, England, India, and Pakistan.

It can be observed that in Model 1 (not including any interaction effects), the variables BOWECON (Focal Team and Opponents), FIELDING (Focal Team and Opponents), and FOUR (Focal Team and Opponents) are significant at 1% for all four teams. The variable EXTRA (Focal Team) is significant for Australia at 1%, and EXTRA (Opponents) is significant for Pakistan at a 5% significance level. The variable SIX (Focal Team) is significant for England, India, and Pakistan at 5%, and SIX (Opponents) is significant for Australia, India, and Pakistan at 1%. The variable DEBUTANT (Focal Team) was found to be significant for Pakistan at 1%, while DEBUTANT (Opponents) was found significant for England at the same level. The variable UMPIRE was found to be insignificant for all the teams. The variable PITCH was significant only for England at 5%, and SEASON only for Pakistan at the same level.

The results from model 2 (interaction of BOWECON and FIELDING) show that the variables BOWECON (Opponents), FIELDING (Focal Team and Opponents), FOUR (Focal Team and Opponents) are significant at 5% for all the four teams. The variable BOWECON (Focal Team) was significant for England, India, and Pakistan at 1%. EXTRA (Focal Team) was significant for Australia and England at 1%, while the EXTRA (Opponents) variable was also significant for Australia and England at 5%. The variable SIX (Focal Team) was significant for England, India, and Pakistan at 5%, while the variable SIX (Opponents) was significant for Australia, India, and Pakistan at 1%. The variable DEBUTANT (Focal Team) was found to be significant for Pakistan at 1%, while the variable DEBUTANT (Opponents) was found significant for only England at 1%. The variable UMPIRE was found to be insignificant for all the teams (similar to model 1). The variable PITCH was found to be significant for England at 5%. SEASON was significant for Pakistan at 5%, and the interaction effect of BOWECON*FIELDING (Focal Team) was significant only for Australia and Pakistan at 5%.

The results from model 3 (interaction of BOWECON and FIELDING, EXTRA and FIELDING) show that the variables FOUR (Focal Team and Opponents) are significant at 1% for all four countries. BOWECON (Focal Team and Opponents) was significant at 5% for England, India, and Pakistan. EXTRA (Focal Team) was insignificant for all the teams, while the variable EXTRA (Opponents) is significant only for Pakistan at 5%. The variable FIELDING (Focal Team) was significant for all the teams at 5%, while FIELDING (Opponents) was significant for England, India, and Pakistan at 1%. SIX (Focal Team) was significant for England, India, and Pakistan at 5%, while SIX (Opponents) was significant for Australia, India, and Pakistan at 1%. DEBUTANT (Focal Team) was significant for Pakistan at 1% while DEBUTANT (Opponents) was significant for England at 5% level. UMPIRE as a variable was significant only for India at 5%. PITCH was significant only for England at 5% and SEASON only for Pakistan at the same level. The interaction effect, BOWECON*FIELDING (Focal Team), was significant for Pakistan at 1%, while the interaction effect, BOWECON*FIELDING (Opponents), was significant for Pakistan at 5%. The interaction effect, EXTRA*FIELDING (Focal Team), was significant for Australia at 5%, and for “Opponents” was significant for only England at 5%.

The results from model 4 (interaction of BOWECON and FIELDING, EXTRA and FIELDING, BOWECON and EXTRA) show that the variables FIELDING (Focal Team), FOUR (Focal Team), and FOUR (Opponents) were significant for all the teams at 5%. The variables BOWECON (Focal Team), BOWECON (Opponents), and FIELDING (Opponents) were significant for England, India, and Pakistan at 5%. EXTRA (Focal Team) was insignificant for all the teams, while EXTRA (Opponents) was significant for Australia and England at 5%. SIX (Focal Team) was significant for England, India, and Pakistan at 5%, while SIX (Opponents) was significant for Australia, India, and Pakistan at 1%. DEBUTANT (Focal Team) was significant for only Pakistan at 1% while DEBUTANT (Opponents) was significant only for England at the same level. UMPIRE was significant for India at 5%, while PITCH was significant only for England at 5%. SEASON and interaction effect of BOWECON*FIELDING (Focal Team and Opponents) were significant only for Pakistan at 5%. The interaction effect of EXTRA*FIELDING (Focal Team) was significant for only Australia at 5%. In contrast, EXTRA*FIELDING (Opponents) was significant only for England at the same level. The interaction effects, BOWECON*EXTRAS (Focal Team) and BOWECON*EXTRAS (Opponents), were insignificant for all four teams.

To summarise the analysis results, we observe that different teams have different variables (including the interaction effects) significantly influencing the win/loss outcome. Therefore, it is critical to analyse and model the dependent variable for each playing nation separately to predict match results. Coaching staff, cricket administrators, and all other stakeholders who depend on win/loss prediction of match outcomes involving different teams need to strategise separately for each team instead of considering common factors (as discussed in this study). Extant literature on cricket research (refer to Appendix A) includes discussions on specific variables in a generic fashion for different formats of cricket such as ODI, T20, Test matches, or the Indian Premier League (IPL). They include Cannonier et al. (2015) on T20 vs. IPL vs. ODI, Ahmed et al. (2013) on player selection attributes in the IPL, Premkumar et al. (2020) on player evaluation in ODIs, and Bhaskar (2009) and Dawson et al. (2009) on the influence of the toss on the outcome of the match. Our study addresses this gap by focusing on specific predictor variables influencing outcomes for four leading cricketing countries.

5. Discussion

This study aimed to determine a team-wise strategy in terms of different explanatory variables and their interactions influencing the log odds of winning. We discuss the winning strategy and implications for the coaching staff and cricket administrators below. For some of the variables considered in the models, such as FIELDING, BOWECON, EXTRA, FOUR, and SIX, historical mean scores may be used to predict match outcomes. The other variables, including SEASON, PITCH, and UMPIRE, will be known in advance. The variable DEBUTANT, depending on the nature of its impact for the specific team, may be considered when selecting probables for the playing eleven. Thus, the independent variables are pre-determined with input values to predict match outcomes. We can see that different variables impact the match outcome for the different teams under consideration. Hence, the winning strategy is not a one-size-fits-all approach as reported in past studies.

5.1. Team strategy implications

5.1.1. Australia

The core dimensions of batting, bowling and fielding are significant in influencing the winning outcome for the Australian side. Pitch condition, the season of play, nationality of umpires, and the number of debutants play an insignificant role. In Model 2, it is seen that a higher count of bowlers bowling with a specific economy rate and good fielding display increases the odds of winning by 8%. In comparison, it declines by 7% when the Australian team concedes extra runs, everything else remaining the same. The probability of winning for the side is maximum (at 60%) when fielding dismissals are high, and batters score more fours (at 55%). In Models 3 and 4, the interaction of extra runs conceded by Australian bowlers and fielding dismissals reveals a decline of 2% in the log odds of winning (other things being the same). This indicates the importance of bowling discipline in influencing match outcomes relative to fielding dismissals.

For the Australian team, selectors and coaching staff should pay greater attention to the core dimensions of the game – bowling discipline and fielding prowess. Focus on players with reasonable fitness levels and fielding skills, in addition to bowlers with a disciplined attack and conceding minimum extra runs, should be the key to increasing win probability in the ODI format.

5.1.2. England

More debutants playing on the opposition side increase the log odds of winning for England by more than 50%, as observed in all the models, with other things being unchanged. From a batting perspective, fours scored by the opposition have a higher bearing on match outcome (probability of a win for England at 44% – a gap of 6%) as against the fours and sixes scored by England batsmen (win probability of about 54% – a gap of 4%), with other things remaining unchanged. Once again, it is seen in both Models 3 and 4 that BOWECON and FIELDING have the single most significant impact on the log odds of winning (at 71% and 70%, respectively). In addition to DEBUTANT, it is seen that when England plays on pitches similar to English conditions (i.e., Australia, New Zealand, South Africa, and West Indies), win probability is only 31% compared to home pitches, where home conditions favour the team.

To summarise, economical bowling and fielding are vital for England. Batting of the opponents (in terms of fours scored) has a more significant negative influence on the outcome, vis-à-vis the England team scoring boundaries. As to extraneous variables, debutants playing for the opposition and pitches also influence the outcome (DEBUTANT positively impacting a win). Thus, the focus remains on bowling and fielding for the coaching staff and selection committee. Batting plays second fiddle to bowling and fielding.

5.1.3. India

Results for India indicate the power of batting, bowling, and fielding to influence the log odds of a win. At the same time, none of the interactions are significant in any of the models. In Models 3 and 4, it is observed that the log odds of winning for India increase by 21% (55% win probability) and 25% (56% win probability), respectively, when the team scores more fours and sixes, other variables remaining unchanged. It has been a clear strategy for the Indian batsmen to go the aerial route while batting, with the highest mean value of sixes amongst all four teams. On the bowling front, Model 4 results indicate that India’s probability of winning is 75% when more bowlers bowl economically and 60% when fielding performance through dismissals is high, everything else remaining unchanged. However, winning probability drops to 33% and 36%, respectively, when opposition bowlers are economical and create more fielding dismissals. It is also estimated that the win probability for India is only 32% when the country plays on pitches outside the subcontinent (namely, Australia, England, New Zealand, South Africa, West Indies, etc.) as against in India.
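The paired figures quoted for India (a 21% increase in log odds alongside a 55% win probability, and 25% alongside 56%) are consistent with passing the coefficient on its own through the logistic function Λ. The sketch below checks that arithmetic; reading the percentages as coefficients of 0.21 and 0.25 is an inference from the reported numbers, not something the text states, and the function names are illustrative.

```python
import math

def probability_from_log_odds(log_odds):
    """Logistic transform: p = 1 / (1 + exp(-log_odds))."""
    return 1.0 / (1.0 + math.exp(-log_odds))

def odds_ratio(log_odds):
    """Multiplicative change in the odds implied by a log-odds change."""
    return math.exp(log_odds)

# Assumed coefficients, inferred from the percentages in the text.
p_model3 = round(probability_from_log_odds(0.21), 2)  # 0.55
p_model4 = round(probability_from_log_odds(0.25), 2)  # 0.56
```

On this reading, each quoted probability is the win probability when the variable's contribution to the log odds is the only non-zero term, which starts from even odds (50%).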

Hence it can be summarised that economical bowling and on-field dismissals through catches, stumping, and run-outs are vital for a high win probability. This can be seen in the strategy adopted by Indian team management post the 2007 ICC World Cup on player fitness, including minimum sprinting speed criteria for selection, in addition to fielding and bowling coaches.

5.1.4. Pakistan

For Pakistan, in addition to batting prowess through boundaries, winning probability is highest when bowlers are economical (at 73%) and fielding dismissals are high (at 69%), everything else remaining the same (see Model 4). However, it is interesting to note that the negative influence of opposition performance is higher for BOWECON, wherein the loss probability for Pakistan is 76%. Scoring boundaries favours a win for Pakistan more than the opposition scoring runs through fours and sixes hurts it, and it remains a conscious strategy to have more hard hitters in the team. Introducing more debutants turns out to be a dampener, with the win probability at 36%. From a seasonal perspective, Spring and Autumn are more favourable for wins (at 68%) compared to Summer. There are two significant interactions at 5% in Model 4, revealing that the influence is counter-intuitive, though marginal. When more Pakistani bowlers are economical, and fielding dismissals are also high, the win probability is 48%. It increases to 52% when the same interaction effect is considered for the opposition, with everything else remaining the same.

To summarise, Pakistan tends to lose more matches when they introduce debutants, while batters hitting boundaries favour the odds of winning. Economical bowling and fielding dismissals are important, though opposition teams with high BOWECON have a more significant negative influence on match outcome. Spring and Autumn seasons favour the team compared to Summer. The overall summary of team strategy implications is presented in Table 8 below.

Table 8. Summary of team strategy implications.

6. Model validation

The models were validated independently with an in-sample and an out-of-sample estimation exercise (refer to Table 9). In the in-sample validation test, the prediction accuracy of match outcome was more than 75% in most models (except in two models for Australia). A split-case in-sample test was developed with 20% randomly selected matches played by the four sides to arrive at the logit regression coefficients of the estimation variables for each model. Once the equations were formulated with the newly derived logit regression coefficients, the values of the independent variables for the remaining sample were used to predict the match outcome.
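The split-case procedure described above can be sketched as follows. This is an illustrative outline only: the data are synthetic, the two predictors borrow the names BOWECON and FIELDING but are coded as hypothetical standardised scores, and plain gradient descent stands in for whatever estimation software was actually used.

```python
import math
import random


def sigmoid(z):
    # Numerically safe logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def fit_logit(X, y, lr=0.1, epochs=500):
    """Estimate intercept and coefficients by batch gradient descent on log-loss."""
    w = [0.0] * (len(X[0]) + 1)  # w[0] is the intercept
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for xi, yi in zip(X, y):
            err = sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], xi))) - yi
            grad[0] += err
            for j, xj in enumerate(xi):
                grad[j + 1] += err * xj
        w = [wj - lr * gj / len(X) for wj, gj in zip(w, grad)]
    return w


def predict(w, xi):
    """Predict a win (1) when the fitted probability is at least 0.5."""
    p = sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], xi)))
    return 1 if p >= 0.5 else 0


def split_sample_accuracy(X, y, estimation_share=0.2, seed=0):
    """Fit on a random share of matches, then score predictions on the rest."""
    rng = random.Random(seed)
    idx = list(range(len(X)))
    rng.shuffle(idx)
    cut = int(round(estimation_share * len(idx)))
    est, hold = idx[:cut], idx[cut:]
    w = fit_logit([X[i] for i in est], [y[i] for i in est])
    hits = sum(predict(w, X[i]) == y[i] for i in hold)
    return hits / len(hold), len(est), len(hold)


# Synthetic matches (hypothetical): outcomes drawn from an assumed logit model.
data_rng = random.Random(42)
X, y = [], []
for _ in range(200):
    bowecon, fielding = data_rng.gauss(0, 1), data_rng.gauss(0, 1)
    p_win = sigmoid(1.1 * bowecon + 0.4 * fielding)
    X.append([bowecon, fielding])
    y.append(1 if data_rng.random() < p_win else 0)

accuracy, n_est, n_hold = split_sample_accuracy(X, y)
print(f"estimated on {n_est} matches, predicted {n_hold}, accuracy {accuracy:.1%}")
```

Fixing the random seed makes the split reproducible; repeating the exercise over several seeds would show how sensitive the reported accuracy is to which 20% of matches is drawn.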

Table 9. Model validation results.

For out-of-sample validation, the models were tested against all matches played by the sides after the conclusion of the ICC World Cup 2019. The sample included all the games played by the four countries between 14 July 2019 and 2 December 2020. During this period, 39 matches were played by the four teams, with Australia playing 13, England 8, India 14, and Pakistan 4 matches. The prediction accuracy was 75% and above for most models, except for 12.5% of the cases at about 70% and a further 12.5% at around 60%. Model validity was ascertained based on the results of both tests, and the models may be applied to devise the game strategy for different teams.
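As a rough arithmetic aid for reading the out-of-sample figures, the sketch below checks that the per-team match counts add up to the stated total and shows how much a single match moves a team's accuracy. The helper names and the example outcome lists are hypothetical and are not results from the study.

```python
# Out-of-sample match counts as stated in the text.
matches_played = {"Australia": 13, "England": 8, "India": 14, "Pakistan": 4}


def accuracy_step(n_matches: int) -> float:
    """Percentage points by which one match changes accuracy over n_matches."""
    if n_matches <= 0:
        raise ValueError("n_matches must be positive")
    return 100.0 / n_matches


def hit_rate(predicted, actual):
    """Share of matches whose predicted outcome equals the actual outcome."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("need two non-empty sequences of equal length")
    return sum(p == a for p, a in zip(predicted, actual)) / len(actual)


total_matches = sum(matches_played.values())
print(f"total out-of-sample matches: {total_matches}")
for team, n in matches_played.items():
    print(f"{team}: {n} matches, one match = {accuracy_step(n):.1f} points")

# Hypothetical example: 1 = win, 0 = loss, for a four-match sample.
example_rate = hit_rate([1, 0, 1, 1], [1, 0, 0, 1])
print(f"example hit rate: {example_rate:.0%}")
```

With samples this small, accuracy can only take a few discrete values per team, which is worth keeping in mind when comparing models.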

7. Conclusion

Since the first ODI was played between Australia and England on 5 January 1971, 8510 official ODI matches have been played across the globe (as of 22 May 2020). The game has evolved significantly over the last five decades with many changes in laws, playing styles, equipment and accessories used, and crowd preferences, in addition to the match strategy deployed by the teams. The role of aggressive batting and bowling in winning ODI and T20 matches (Cannonier et al., 2015), home pitch advantage (Davis et al., 2015; De Silva & Swartz, 1997), toss and batting order (Dawson et al., 2009), attendance in test cricket (Sacheti et al., 2016), the role of umpires (Sacheti et al., 2015), and opening partnerships (Valero & Swartz, 2012) were the subjects of some of the noteworthy studies covering different aspects of the game that influence outcomes across formats. However, it is critical to acknowledge each national team’s strengths and weaknesses and their approach to a match, which has been a research gap. This study evaluates the same for the four major international teams and recommends strategies to increase the odds of winning. While journalists and cricket experts discuss and strategize a lot on batting prowess and the role of pinch hitters and hard hitters in ODI cricket, our findings reveal that it is not a common determinant across teams. Scoring runs through boundaries has a more significant role to play for the subcontinent teams of India and Pakistan, but not as much for England and Australia. The two most vital variables influencing win outcome positively for all four teams are fielding dismissals and economical bowling, with a higher number of bowlers meeting the economy criteria. It has been a pattern for most teams to appoint specialist fielding and bowling coaches in recent years, and several teams have witnessed the impact of the same as well. Hence, these four teams must continue to invest in the bowling and fielding aspects and align their team selection strategy along those lines.

In contrast, other teams may consider applying the validated models to devise their respective strategies. Contrary to past findings, this study does not find that the umpire’s nationality influences outcomes. Similarly, the season of play and home advantage have a differential influence on the match outcome across teams. Debutants are expected either to be match-winners, making a solid claim to a place in the side, or to fail due to pressure. Our findings show differing results for different teams. Hence, a generalised view on using debutants in pressure matches does not hold, as their number does not influence the outcome for Australia, England, or India.

8. Managerial implications

This study establishes that different aspects of the game have a differing influence on the winning outcome for different national sides in ODI cricket. The proposed and validated models serve as a solid indicator for devising a match strategy, which could be deployed by the four teams modelled in the paper. Essentially, it entails applying historical data to the proposed models for the different countries to predict the match results and devise the strategy around batting, bowling, fielding, and other criteria. Historical mean scores may be used for variables such as FIELDING, BOWECON, EXTRA, FOUR, and SIX. In contrast, variables like SEASON, PITCH, and UMPIRE will already be known from the schedule shared by the respective boards and the ICC. For those models wherein DEBUTANT is a significant variable, the decision on fielding debutants can be based on its estimated impact on the win/loss outcome. Other cricketing nations may develop their own predictive models, after testing and validating them as proposed in the study, to devise their match strategy.
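The workflow described above can be illustrated with a short sketch that combines historical means with known schedule values in a fitted logit equation. All coefficients, the intercept, the variable codings, and the input values below are hypothetical placeholders chosen for illustration; they are not the estimates reported in this study.

```python
import math

# Hypothetical coefficients for illustration only; NOT the paper's estimates.
HYPOTHETICAL_COEFS = {
    "FOUR": 0.02, "SIX": 0.05, "BOWECON": 0.90, "FIELDING": 0.30,
    "EXTRA": -0.03, "DEBUTANT": -0.40, "PITCH": -0.50, "SEASON": 0.20,
    "UMPIRE": 0.00,
}
HYPOTHETICAL_INTERCEPT = -1.5


def win_probability(features, coefs, intercept):
    """Logistic win probability for one match.

    `features` maps variable names to values; every name must appear in
    `coefs`, otherwise a KeyError is raised.
    """
    logit = intercept + sum(coefs[name] * value for name, value in features.items())
    return 1.0 / (1.0 + math.exp(-logit))


# Historical means (assumed values) for the performance variables...
historical_means = {"FOUR": 22.0, "SIX": 4.0, "BOWECON": 2.0,
                    "FIELDING": 5.0, "EXTRA": 12.0}
# ...and schedule-dependent variables known in advance (assumed 0/1 coding).
known_from_schedule = {"SEASON": 1, "PITCH": 0, "UMPIRE": 0}

baseline = {**historical_means, **known_from_schedule, "DEBUTANT": 0}
with_debutant = {**baseline, "DEBUTANT": 1}

p_baseline = win_probability(baseline, HYPOTHETICAL_COEFS, HYPOTHETICAL_INTERCEPT)
p_debutant = win_probability(with_debutant, HYPOTHETICAL_COEFS, HYPOTHETICAL_INTERCEPT)
print(f"no debutant: {p_baseline:.2f}, one debutant: {p_debutant:.2f}")
```

Comparing the two scenarios in this way is one possible reading of how a selection decision such as fielding a debutant could be weighed against its estimated effect on the predicted outcome.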

9. Limitations and future research directions

The study has several limitations. First, the scope of the study includes only four major cricketing nations. Hence, there is scope to extend the work to other cricketing countries such as West Indies, Sri Lanka, New Zealand, and Zimbabwe, and to upcoming performers, namely, Afghanistan, Ireland, Netherlands, etc. Further, the study focuses on data for men’s cricket representing national sides, and hence a similar comparative study for women’s national sides and for club and university level cricket teams remains an interesting area for future work. Second, the models developed in the study exclude variables such as the number of specialist batters, the number of specialist bowlers, the number of all-rounders, and the number of spin versus fast bowlers in the team composition. This could be an exciting area of research for the future. Third, the study has been based on historical data of ODI matches. It may also be worthwhile to test and validate the models for T20 cricket or, alternatively, to build predictive models for the shorter format. These future research directions could significantly enrich the body of literature.

Disclosure statement

No potential conflict of interest was reported by the author(s).

Additional information

Funding

The author(s) reported there is no funding associated with the work featured in this article.

Notes

1. Australia – 899 matches from 5 January 1971 to 11 July 2019, England – 706 matches from 5 January 1971 to 11 July 2019, India – 926 matches from 13 July 1974 to 9 July 2019, and Pakistan – 893 matches from 11 February 1973 to 5 July 2019.

References

  • Ahmed, F., Deb, K., & Jindal, A. (2013). Multi-objective optimization and decision making approaches to cricket team selection. Applied Soft Computing, 13(1), 402–414. https://doi.org/10.1016/j.asoc.2012.07.031
  • Akhtar, S., & Scarf, P. (2012). Forecasting test cricket match outcomes in play. International Journal of Forecasting, 28(3), 632–643. https://doi.org/10.1016/j.ijforecast.2011.08.005
  • Allsopp, P. E., & Clarke, S. R. (2004). Rating teams and analysing outcomes in one‐day and test cricket. Journal of the Royal Statistical Society: Series A (Statistics in Society), 167(4), 657–667. https://doi.org/10.1111/j.1467-985X.2004.00505.x
  • Angelini, G., Candila, V., & De Angelis, L. (2022). Weighted Elo rating for tennis match predictions. European Journal of Operational Research, 297(1), 120–132. https://doi.org/10.1016/j.ejor.2021.04.011
  • Asif, M., & McHale, I. G. (2016). In-play forecasting of win probability in One-Day International cricket: A dynamic logistic regression model. International Journal of Forecasting, 32(1), 34–43. https://doi.org/10.1016/j.ijforecast.2015.02.005
  • Bailey, M., & Clarke, S. R. (2006). Predicting the match outcome in one-day international cricket matches, while the game is in progress. Journal of Sports Science & Medicine, 5(4), 480–487. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3861745/.
  • Bennett, R., Ali‐Choudhury, R., & Mousley, W. (2007). Television viewers’ motivations to follow the 2005 Ashes Test series: Implications for the rebranding of English cricket. Journal of Product & Brand Management, 16(1), 23–37. https://doi.org/10.1108/10610420710731133
  • Bhaskar, V. (2009). Rational adversaries? Evidence from randomised trials in one day cricket. The Economic Journal, 119(534), 1–23. https://doi.org/10.1111/j.1468-0297.2008.02203.x
  • Borooah, V. K. (2016). Upstairs and downstairs: The imperfections of cricket’s decision review system. Journal of Sports Economics, 17(1), 64–85. https://doi.org/10.1177/1527002513511973
  • Brooks, R. D., Faff, R. W., & Sokulsky, D. (2002). An ordered response model of test cricket performance. Applied Economics, 34(18), 2353–2365. https://doi.org/10.1080/00036840210148085
  • Cannonier, C., Panda, B., & Sarangi, S. (2015). 20-over versus 50-over cricket: Is there a difference? Journal of Sports Economics, 16(7), 760–783. https://doi.org/10.1177/1527002513505284
  • Clarke, S. R. (1998). Test statistics. In J. Bennett (Ed.), Statistics in sport (pp. 83–103). Arnold Applications of Statistics Series. London: Arnold.
  • Croucher, J. S., (2000). “Player ratings in one-day cricket.” In Proceedings of the fifth Australian conference on mathematics and computers in sport, Sydney, NSW: Sydney University of Technology: 95–106.
  • Currie, N. (2000). Maximising sport sponsorship investments: A perspective on new and existing opportunities. International Journal of Sports Marketing and Sponsorship, 2(2), 70–77. https://doi.org/10.1108/IJSMS-02-02-2000-B007
  • Damodaran, U. (2006). Stochastic dominance and analysis of ODI batting performance: The Indian cricket team, 1989-2005. Journal of Sports Science & Medicine, 5(4), 503. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3861748/.
  • Davis, J., Perera, H., Silva, R. M., & Swartz, T. B. (2016). Tactics for twenty20 cricket. South African Statistical Journal, 50(2), 261–271. https://journals.co.za/doi/abs/10.10520/EJC194761.
  • Davis, J., Perera, H., & Swartz, T. B. (2015). A simulator for twenty20 cricket. Australian & New Zealand Journal of Statistics, 57(1), 55–71. https://doi.org/10.1111/anzs.12109
  • Dawson, P., Morley, B., Paton, D., & Thomas, D. (2009). To bat or not to bat: An examination of match outcomes in day-night limited overs cricket. Journal of the Operational Research Society, 60(12), 1786–1793. https://doi.org/10.1057/jors.2008.135
  • de Silva, B. M., & Swartz, T. B. (1997). Winning the coin toss and the home team advantage in one-day international cricket matches. New Zealand Statistician, 32(2), 16–22.
  • Diamantopoulos, A., & Winklhofer, H. M. (2001). Index construction with formative indicators: An alternative to scale development. Journal of Marketing Research, 38(2), 269–277. https://doi.org/10.1509/jmkr.38.2.269.18845
  • Donlan, L. (2014). An empirical assessment of factors affecting the brand-building effectiveness of sponsorship. Sport, Business and Management: An International Journal, 4(1), 6–25. https://doi.org/10.1108/SBM-09-2011-0075
  • Duan, C. J., & Chakravarty, A. (2021). Team contingent or sport native? A bayesian analysis of home field advantage in professional soccer. Journal of Business Analytics, 4(1), 67–75. https://doi.org/10.1080/2573234X.2020.1854625
  • Feros, S. A., Young, W. B., & O’Brien, B. J. (2018). Quantifying cricket fast-bowling skill. International Journal of Sports Physiology and Performance, 13(7), 830–838. https://doi.org/10.1123/ijspp.2017-0169
  • Goldman, M., & Johns, K. (2009). Sportainment: Changing the pace of limited‐overs cricket in South Africa. Management Decision, 47(1), 124–136. https://doi.org/10.1108/00251740910929740
  • Hopwood, M. K. (2005a). Applying the public relations function to the business of sport. International Journal of Sports Marketing and Sponsorship, 6(3), 30–44. https://doi.org/10.1108/IJSMS-06-03-2005-B006
  • Hopwood, M. K. (2005b). Public relations practice in English county cricket. Corporate Communications: An International Journal, 10(3), 201–212. https://doi.org/10.1108/13563280510614465
  • Jain, P. K., Quamer, W., & Pamula, R. (2021). Sports result prediction using data mining techniques in comparison with base line model. Opsearch, 58(1), 54–70. https://doi.org/10.1007/s12597-020-00470-9
  • Kaluarachchi, A., & Aparna, S. V., (2010). “Cric AI: A classification based tool to predict the outcome in ODI cricket.” In 2010 Fifth International Conference on Information and Automation for Sustainability 17-19 Dec. 2010. (IEEE): Colombo, Sri Lanka, 250–255. doi:10.1109/ICIAFS.2010.5715668.
  • Karnik, A. (2010). Valuing cricketers using hedonic price models. Journal of Sports Economics, 11(4), 456–469. https://doi.org/10.1177/1527002509350442
  • Kashif, M., Fernando, P. M. P., & Wijenayake, S. I. (2019). Blinded by the sand of its burrowing? Examining fans’ intentions to follow one-day cricket on TV with a moderating effect of social influence. International Journal of Sports Marketing and Sponsorship, 20(1), 81–108. https://doi.org/10.1108/IJSMS-08-2017-0094
  • Khan, J. R., Biswas, R. K., & Kabir, E. (2019). A quantitative approach to influential factors in One Day International cricket: Analysis based on Bangladesh. Journal of Sports Analytics, 5(1), 57–63. https://doi.org/10.3233/JSA-170260
  • Koulis, T., Muthukumarana, S., & Briercliffe, C. D. (2014). A Bayesian stochastic model for batting performance evaluation in one-day cricket. Journal of Quantitative Analysis in Sports, 10(1), 1–13. https://doi.org/10.1515/jqas-2013-0057
  • Lewis, A. J. (2005). Towards fairer measures of player performance in one-day cricket. Journal of the Operational Research Society, 56(7), 804–815. https://doi.org/10.1057/palgrave.jors.2601876
  • Lewis, A. J. (2008). Extending the range of player-performance measures in one-day cricket. Journal of the Operational Research Society, 59(6), 729–742. https://doi.org/10.1057/palgrave.jors.2602379
  • MacDonald, D., Cronin, J., & Macadam, P. (2018). Key movements and skills of wicket-keepers in one-day international cricket. International Journal of Sports Science & Coaching, 13(6), 1156–1162. https://doi.org/10.1177/1747954118786849
  • MacDonald, D., Cronin, J., Mills, J., McGuigan, M., & Stretch, R. (2013). A review of cricket fielding requirements. South African Journal of Sports Medicine, 25(3), 87–92. https://doi.org/10.7196/SAJSM.473
  • Manage, A. B., Mallawaarachchi, K., & Wijekularathna, K. (2010). Receiver operating characteristic (ROC) curves for measuring the quality of decisions in cricket. Journal of Quantitative Analysis in Sports, 6(2), 8. https://doi.org/10.2202/1559-0410.1246
  • Maymin, P. (2021). Using scouting reports text to predict NCAA→ NBA performance. Journal of Business Analytics, 4(1), 40–54. https://doi.org/10.1080/2573234X.2021.1873077
  • McGinn, E. (2013). The effect of batting during the evening in cricket. Journal of Quantitative Analysis in Sports, 9(2), 141–150. https://doi.org/10.1515/jqas-2012-0048
  • Norman, J. M., & Clarke, S. R. (2010). Optimal batting orders in cricket. Journal of the Operational Research Society, 61(6), 980–986. https://doi.org/10.1057/jors.2009.54
  • Passi, K., & Pandey, N. (2018). Increased prediction accuracy in the game of cricket using machine learning. arXiv preprint arXiv:1804.04226 (International Journal of Data Mining & Knowledge Management Process), 8(2). https://arxiv.org/abs/1804.04226.
  • Pathak, N., & Wadhwa, H. (2016). Applications of modern classification techniques to predict the outcome of ODI cricket. Procedia Computer Science, 87, 55–60. https://www.sciencedirect.com/science/article/pii/S1877050916304653
  • Perera, H., Gill, P. S., & Swartz, T. B. (2014). Declaration guidelines in test cricket. Journal of Quantitative Analysis in Sports, 10(1), 15–26. https://doi.org/10.1515/jqas-2013-0118
  • Prakash, C. D., Patvardhan, C., & Singh, S. (2016). A new machine learning based deep performance index for ranking IPL T20 Cricketers. International Journal of Computer Applications, 137(10), 42–49. https://doi.org/10.5120/ijca2016908903
  • Premkumar, P., Chakrabarty, J. B., & Chowdhury, S. (2020). Key performance indicators for factor score based ranking in One Day International cricket. IIMB Management Review, 32(1), 85–95. https://doi.org/10.1016/j.iimb.2019.07.008
  • Preston, I., & Thomas, J. (2000). Batting strategy in limited overs cricket. Journal of the Royal Statistical Society: Series D (The Statistician), 49(1), 95–106. https://doi.org/10.1111/1467-9884.00223
  • Sacheti, A., Gregory-Smith, I., & Paton, D. (2016). Managerial decision making under uncertainty: The case of twenty20 cricket. Journal of Sports Economics, 17(1), 44–63. https://doi.org/10.1177/1527002513520011
  • Sacheti, A., Gregory‐Smith, I., & Paton, D. (2015). Home bias in officiating: Evidence from international cricket. Journal of the Royal Statistical Society: Series A (Statistics in Society), 178(3), 741–755. https://doi.org/10.1111/rssa.12086
  • Saikia, H., Bhattacharjee, D., & Bhattacharjee, A. (2013). Performance based market valuation of cricketers in IPL. Sport, Business and Management: An International Journal, 3(2), 127–146. https://doi.org/10.1108/20426781311325069
  • Saikia, H., Bhattacharjee, D., & Lemmer, H. H. (2012). A double weighted tool to measure the fielding performance in cricket. International Journal of Sports Science & Coaching, 7(4), 699–713. https://doi.org/10.1260/1747-9541.7.4.699
  • Santos-Fernandez, E., Wu, P., & Mengersen, K. L. (2019). Bayesian statistics meets sports: A comprehensive review. Journal of Quantitative Analysis in Sports, 15(4), 289–312. https://doi.org/10.1515/jqas-2018-0106
  • Scarf, P., Shi, X., & Akhtar, S. (2011). On the distribution of runs scored and batting strategy in test cricket. Journal of the Royal Statistical Society: Series A (Statistics in Society), 174(2), 471–497. https://doi.org/10.1111/j.1467-985X.2010.00672.x
  • Shivakumar, R. (2018). What technology says about decision-making: Evidence from cricket’s Decision Review System (DRS). Journal of Sports Economics, 19(3), 315–331. https://doi.org/10.1177/1527002516657218
  • Singh, G., Bhatia, N., & Singh, S. (2011). Fuzzy logic based cricket player performance evaluator. IJCA Special Issue on Artificial Intelligence Techniques-Novel Approaches and Practical Applications, 1(3), 11–16.
  • Swartz, T. B. (2016). Research directions in cricket. In J. Albert, M. E. Glickman, T. B. Swartz, & R. H. Koning (Eds.), Handbook of Statistical Methods and Analyses in Sports. Chapman and Hall/CRC Handbooks of Modern Statistical Methods: Boca Raton, FL, 272. https://www.routledge.com/Handbook-of-Statistical-Methods-and-Analyses-in-Sports/Albert-Glickman-Swartz-Koning/p/book/9780367331016.
  • Valero, J., & Swartz, T. B. (2012). An investigation of synergy between batsmen in opening partnerships. Sri Lankan Journal of Applied Statistics, 13, 87–98. https://doi.org/10.4038/sljastats.v13i0.5125

Appendix A.

Review of literature summary
