Sports Performance

The tactics of successful attacks in professional association football: large-scale spatiotemporal analysis of dynamic subgroups using position tracking data

Pages 523-532 | Accepted 06 Oct 2020, Published online: 27 Oct 2020

ABSTRACT

Association football teams can be considered complex dynamical systems of individuals grouped in subgroups (defenders, midfielders and attackers), coordinating their behaviour to achieve a shared goal. As research often focusses on collective behaviour, or on static subgroups, the current study aims to analyse spatiotemporal behaviour of dynamic subgroups in relation to successful attacks. We collected position tracking data of 118 Dutch Eredivisie matches, containing 12424 attacks. Attacks were classified as successful (N = 1237) or non-successful (N = 11187) based on the potential of creating a scoring opportunity. Using unsupervised machine learning, we automatically identified dynamic formations based on position tracking data, and identified dynamic subgroups for every timeframe in a match. We then compared the subgroup centroids to assess the intra- and inter-team spatiotemporal synchronisation during successful and non-successful attacks, using circular statistics. Our results indicated subgroup-level variables provided more information, and were more sensitive to disruption, in comparison to team-level variables. When comparing successful and non-successful attacks, we found decreases (p < .01) in longitudinal inter- and intra-team synchrony of interactions involving the defenders of the attacking team during successful attacks. This study provides the first large-scale dynamic subgroup analysis and reveals additional insights to team-level analyses.

Introduction

An association football team is considered a complex (i.e. many degrees of freedom) dynamical system, of which all moving parts (players) coordinate their spatial positioning over time in order to achieve a common goal (Balague et al., Citation2013; Davids et al., Citation2005) (i.e. winning the game). In doing so, the team interacts with the opponent and constantly adapts to the changing constraints of the game (i.e., score-line, attacking vs. defending) (Balague et al., Citation2013; Davids et al., Citation2005; J. F. Gréhaigne et al., Citation1997). A team itself consists of 11 individuals, grouped together in subgroups (i.e. attackers, midfielders and defenders) who together constitute the system as a whole (Balague et al., Citation2013; J. F. Gréhaigne et al., Citation1997). The observed spatiotemporal behaviour resulting from the inter-team interaction and subgroups within both teams is considered tactical behaviour (J.-F. Gréhaigne et al., Citation1999; Rein & Memmert, Citation2016).

Association football is a low scoring sport in which scoring opportunities are sparse. When in possession of the ball, teams aim to create scoring opportunities by moving the ball into a scoring position (Goes et al., Citation2019). To achieve this, teams try to space the field by increasing their covered area, create depth through movement without the ball in the longitudinal direction, try to achieve numerical superiority in key areas of the field like the final 3rd, to ultimately disrupt the opposing team’s organization by creating space (Fernandez & Bornn, Citation2018) between the lines and moving opposing players out of position (Clemente & Fernando Manuel Lourenço, Citation2014; Costa et al., Citation2009, F. R. Goes et al., Citation2019). Previous work using tracking data from a full season of >300 Dutch Eredivisie matches has shown that, although only a minority of the attacks result in scoring opportunities, winning teams create significantly more opportunities compared to losing teams (Goes et al., Citation2019; F. Goes et al., Citation2020). Therefore, understanding the tactical behaviour that characterizes attacks resulting in scoring opportunities is highly relevant, and can be considered key for improving performance.

Previous research has shown that association football is an in-phase sport, in which teams tend to move up and down the field as well as side to side in synchrony and in the same direction (Bartlett et al., Citation2012; Coutinho et al., Citation2017; Duarte et al., Citation2012; Frencken et al., Citation2012, Citation2011; Gonçalves et al., Citation2014; Memmert et al., Citation2017; Siegle & Lames, Citation2013). Analysis of variables like the mean team position (team centroid) has demonstrated that inter-team interaction is strongly synchronized, especially in the longitudinal direction (Bartlett et al., Citation2012; Coutinho et al., Citation2017; Frencken et al., Citation2012; Citation2011; Memmert et al., Citation2017; Rein et al., Citation2017). It is assumed that in order to create scoring opportunities, teams try to disrupt this inter-team synchrony (Bartlett et al., Citation2012; Frencken et al., Citation2012; Memmert et al., Citation2017; Rein et al., Citation2017), and winning teams have proven to cause more defensive disruption through passing than losing teams over a full season of Dutch Eredivisie matches (F. Goes et al., Citation2020; Kempe & Goes, Citation2019). However, research with team-level variables like the team centroid failed to find evidence of the inter-team synchrony disruption in the moments before key events like goals or shots (Bartlett et al., Citation2012; Frencken et al., Citation2012; Memmert et al., Citation2017). One might argue that team-level variables are not specific enough to study the tactical behaviour that characterizes successful and non-successful attacks, and the use of subgroup-level variables is recommended (Bartlett et al., Citation2012; Memmert et al., Citation2017).

In a systematic review on the topic of tactical behaviour analysis in football using position tracking data, Low et al. (Citation2019) reviewed 77 studies on tactical behaviour, and specifically discussed the non-linear dynamics of behaviour that are also the topic of the current work. They found that in general, tactical behaviour of a team is considered to be characterized by the concept of self-similarity: behaviour of parts of the team (subgroups) follows behaviour of the global system (the team), as all players on the team share a common goal (Low et al., Citation2019). This synchronisation between team-members or subgroups, but also between teams, seems to be stronger in elite teams compared to lower tier teams, as well as in defenders compared to attacking players (Low et al., Citation2019). In line with this argumentation, previous research has shown that in 4 English Premier League matches, winning teams display higher levels of intra-team movement synchronization in comparison to losing teams, as all possible pairings of outfield teammates (dyads) showed more frequent (near) in-phase behaviour (Memmert et al., Citation2017). Therefore, as attacking teams try to disrupt the subgroup synchrony within the opposing team, they face the following task: disrupting the subgroup synchrony of the opponent while maintaining synchronous subgroup behaviour themselves. Only a few studies have actually looked into the spatiotemporal behaviour of subgroups before (Gonçalves et al., Citation2014; Memmert et al., Citation2017; Siegle & Lames, Citation2013). The results show that generally, subgroups on one team (intra-team) tend to move in synchrony with each other. Furthermore, on the inter-team level it was reported that directly opposing subgroups (i.e. defenders of one team in relation to attackers of the other team) most frequently move in in-phase synchrony too (Gonçalves et al., Citation2014; Memmert et al., Citation2017; Siegle & Lames, Citation2013), in line with the teams as a whole (Frencken et al., Citation2012). However, it was also concluded that in comparison to the whole team, subgroups show more variability in their spatiotemporal behaviour, and are more sensitive to disruptions (Memmert et al., Citation2017; Siegle & Lames, Citation2013).

All current studies on subgroup behaviour are characterized by two important limitations. First of all, these studies typically use one static traditional formation descriptor (e.g. 4-3-3), and assign player-roles based on this formation that remain fixed for the entire match (Gonçalves et al., Citation2014; Memmert et al., Citation2017; Siegle & Lames, Citation2013). However, as possession continuously changes from one team to another, and tactical objectives depend on possession status (Clemente & Fernando Manuel Lourenço, Citation2014; Shaw & Glickman, Citation2019), one could argue that at least two formation descriptors are required to capture the behaviour during the different phases of the game (attacking vs defending). Furthermore, as association football is a highly dynamic game, players frequently swap positions and roles as a result of interaction (Memmert et al., Citation2017). Accordingly, player roles should be deemed dynamic rather than static as well. For example, in possession of the ball, many wingbacks push up beyond the midfielders and, in some phases of the game, function as wingers. Therefore, contrary to existing methods, one could argue that capturing the complex dynamics of association football requires a formation descriptor that is specific to a certain subphase of the game, and allows for dynamic assignment of player roles.

The second prominent limitation of the current studies on subgroup behaviour is their dependence on notational analysis. All studies construct their formation descriptor based on manually labelled formations and player roles. There are two major downsides to notational analysis: first, it is (very) time-consuming, and second, it is a possible source of observer bias, as reported, for example, by Chawla et al. (Citation2017). The authors showed that there is relatively poor agreement between two expert observers when it comes to, for example, rating passes on a 3-point scale. As the intention of sports science work is often to translate research findings to the field, one has to take into account that the time-consuming nature of notational analysis, while not necessarily a problem for research, can be detrimental in a practical situation where resources are limited and a game is played every 3–7 days. Therefore, utilizing an automated subgroup identification method would be both more robust and much more scalable in practice.

To date, only a limited number of studies exist that automatically detect formations and therefore enable subgroup identification. Bialkowski et al. (Citation2014; Citation2014) proposed a method in which a formation descriptor was identified automatically from the heat-map of the team over the entire match, resulting in 11 unique roles that were distributed dynamically over all players during the entire match. In this approach, players were allowed to swap roles, as roles were dynamically distributed over all players utilizing a Hungarian algorithm that uses the log probability of a position belonging to a specific role on a given timeframe (Bialkowski et al., Citation2014). As the emphasis in the works by Bialkowski et al. (Citation2014) was on differentiating between various styles of play, they only constructed one formation descriptor per game, and used that to categorize styles of play and compare home and away performance. In another study, Shaw and Glickman (Citation2019) proposed a method that classifies the formation per game state, to be able to detect tactical changes during the game. They utilized an agglomerative hierarchical clustering technique to identify 20 different formation types, and used these formations to study offensive and defensive strategies and tactical changes over a match. Both studies are great examples of interesting new methods that quantify formations automatically using position tracking data. In contrast with the current work, however, they were focussed more on team style and strategy, and did not apply their methods to study the interactions between players or subgroups on a micro-level.

With the current study, we aim to analyse the spatiotemporal behaviour of dynamic subgroups, to determine the tactical behaviour that characterizes successful attacks. To achieve this, we aim to answer two research questions: First, can we automatically and adequately identify subgroups based solely on player position tracking data? Second, does the dynamic coupling of subgroups hold more information than dynamic coupling of teams? In order to investigate our second question, we will analyse the spatiotemporal behaviour on a team and subgroup level for successful and non-successful attacking sequences of both teams. Furthermore, we will analyse the coupling of subgroups on an inter- and intra-team level to investigate their importance for success. This approach should allow further insight in what might be the spatiotemporal characteristics of successful attacking and defending.

Our hypothesis in relation to the first question is that we can adequately cluster subgroups based only on position tracking data and that subgroups hold more information and are more sensitive to change as a result of inter-team interaction than team-level variables. Our hypothesis for the second question is that as teams try to create scoring opportunities by creating space, we expect successful attacks are characterized by maintenance of intra-team subgroup synchrony in the attacking team. Furthermore, we expect a decreased intra-team subgroup synchrony in the defending team, and a decreased inter-team synchrony between opposing subgroups. As our study will be one of the first studies to investigate successful attacks vs. non-successful attacks in a dataset of professional competitive data, we expect that answering our research questions could result in an increased understanding of the tactical characteristics of successful attacks. As competitive circumstances in an 11v11 game are hard to replicate in an experimental setup, especially on a large scale, real-world observational studies are especially valuable in the context of professional sports. Despite the limited ability of drawing causal inferences from such studies, we expect that our work can be a valuable addition to the body of experimental research on this topic.

Methods

Data

We utilized an observational design in which we collected a convenience sample of position tracking data of 118 Dutch Eredivisie matches between 26 teams during 4 seasons through our research partners in the field of professional sports. Data had been generated through a semi-automatic optical tracking system (SportsVU; STATS LLC, Chicago) that captures the X and Y coordinates of all players and the ball in metres at 10 Hz. Before analysis, the raw position tracking files were first pre-processed with ImoClient software (Inmotiotec Object Tracking B.V., The Netherlands). Pre-processing consisted of filtering with a weighted Gaussian algorithm (85% sensitivity), which is the recommended filter provided by the software manufacturer for our specific data source, and automatic detection of ball possession and ball events based on the position tracking data. All data was mapped to the same standard field size (105 m x 68 m) where the X-axis runs longitudinally from goal to goal (−52.5 m to +52.5 m), and the Y-axis runs laterally along the midline (−34 m to +34 m), excluding out-of-bounds regions. All further processing and analysis were conducted using custom routines programmed in Python 3.6.

Next, all attacks in the dataset were identified using the automatically generated event data. An attack started at the moment a team first gained control over the ball, and ended whenever the opponent gained control over the ball or when there was a stoppage of play. We then selected all attacks with a minimum duration of 5 seconds that started in the first or second 3rd of the field. We set these criteria because we were interested in deliberate attacks only. Exploratory analysis and visual observation of 18 matches in our dataset revealed that possessions starting inside the final 3rd are very often the result of set pieces (free-kicks, corners, etc.), and can therefore not be considered elaborate attacks. Furthermore, we also found that possessions lasting shorter than 5 seconds typically contain only one failed pass or dribble, and cannot be considered elaborate attacks either. This cut-off value is largely in line with previous research that states that “sustained threats” (elaborate attacks) last at least 6 seconds (Fernandez-Navarro et al., Citation2018). We further validated the 5-second cut-off value by assessing the impact of different cut-off values on the size and properties of our dataset. Furthermore, we conducted a sensitivity analysis by assessing the impact of different cut-off values on the results regarding inter-team synchronisation.
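
The two selection criteria can be illustrated with a minimal sketch. The `Attack` record, its field names and the `attacks_positive_x` flag are hypothetical and only serve the illustration; the 5-second cut-off and the 105 m field length are taken from the text, and the assumption that thirds are measured from the attacking team's own goal line is ours.

```python
from typing import List, NamedTuple

FIELD_LENGTH = 105.0    # standard field length in metres (from the text)
MIN_DURATION_S = 5.0    # duration cut-off used in the study


class Attack(NamedTuple):       # hypothetical record, not the authors' data model
    start_x: float              # ball X at first control, metres (-52.5 to +52.5)
    duration_s: float           # seconds between gaining and losing control
    attacks_positive_x: bool    # True if the team attacks towards +X


def starts_outside_final_third(attack: Attack) -> bool:
    """True when the possession starts in the first or second third."""
    # Re-orient X so that it always increases towards the opponent's goal.
    x = attack.start_x if attack.attacks_positive_x else -attack.start_x
    distance_from_own_goal = x + FIELD_LENGTH / 2
    return distance_from_own_goal < 2 * FIELD_LENGTH / 3


def select_attacks(attacks: List[Attack]) -> List[Attack]:
    """Keep attacks that last long enough and do not start in the final third."""
    return [a for a in attacks
            if a.duration_s >= MIN_DURATION_S and starts_outside_final_third(a)]
```

Changing `MIN_DURATION_S` and re-running the selection is one straightforward way to reproduce the kind of cut-off sensitivity check described above.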

Dynamic subgroup identification

To identify subgroups, we first defined two formation descriptors for every team, for being either in possession of the ball or not: one attacking formation descriptor (FA) based on timeframes in possession, and one defending formation descriptor (FD) based on timeframes not in possession. To construct a formation descriptor that follows the structure of traditional formation descriptors commonly employed by association football coaches (e.g. 4-4-2, 4-3-3), we computed the mean X positions of all outfield players (excluding the goalkeeper) during time in possession [X1A, X2A, … X10A], and time not in possession [X1D, X2D, … X10D], using position tracking data of the first half. As tactical changes can occur during the game (especially in the 2nd half) as a result of game events, substitutions and strategic decisions by the coach, we deliberately chose to use data for the first half only. As we were aiming to replicate the structure of a traditional pre-game formation descriptor, to be able to identify subgroups based on data instead of the starting formation on paper, we chose to omit the 2nd half data as the high variance in the 2nd half would decrease the performance of our clustering method. Mean player positions represented the average (attacking or defending) longitudinal organisation of all players on a team, similar to a traditional formation descriptor. We subsequently used a KMeans unsupervised clustering algorithm, as the KMeans algorithm has proven to be a robust, easy to interpret, and scalable model, that works well with a small number of clusters (David & Vassilvitskii, Citation2007). To follow the structure of traditional formation descriptors we instructed the algorithm to identify 3 clusters, and subsequently detected the number of players in every subgroup, and used this to construct FA and FD. Using the relevant formation descriptor in combination with possession status, we then dynamically identified defenders, midfielders and attackers on both teams for every timeframe, resulting in a dynamic role distribution and a constant (possession dependent) formation (Table 1).
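
As an illustration of this step, the sketch below builds a three-line formation descriptor from ten mean X positions. It is not the authors' implementation: instead of an iterative KMeans routine it searches all contiguous three-way splits of the sorted positions, which yields the exact minimum of the k-means objective for one-dimensional data. The per-frame rule in `assign_roles` (rank players by longitudinal position and fill roles by count) is an assumed reading of the text, and all function names are hypothetical.

```python
from itertools import combinations


def sse(values):
    """Sum of squared deviations from the mean (k-means within-cluster cost)."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def formation_descriptor(mean_x, attacks_positive_x=True):
    """Return (labels, descriptor) for the outfield players' mean X positions.

    labels[i] is 0 (defender), 1 (midfielder) or 2 (attacker);
    descriptor is a string such as "4-3-3".
    """
    # Orient positions so larger values lie further from the team's own goal.
    depth = [x if attacks_positive_x else -x for x in mean_x]
    order = sorted(range(len(depth)), key=lambda i: depth[i])
    ranked = [depth[i] for i in order]
    # In 1-D, optimal k-means clusters are contiguous in sorted order,
    # so trying every pair of cut points finds the best 3-cluster split.
    cut = min(
        combinations(range(1, len(ranked)), 2),
        key=lambda c: sse(ranked[:c[0]]) + sse(ranked[c[0]:c[1]]) + sse(ranked[c[1]:]),
    )
    labels = [0] * len(depth)
    for rank, player in enumerate(order):
        labels[player] = int(rank >= cut[0]) + int(rank >= cut[1])
    descriptor = "-".join(str(labels.count(k)) for k in range(3))
    return labels, descriptor


def assign_roles(frame_x, counts, attacks_positive_x=True):
    """Assumed per-frame rule: rank players by depth, fill roles by count."""
    depth = [x if attacks_positive_x else -x for x in frame_x]
    order = sorted(range(len(depth)), key=lambda i: depth[i])
    names = ["DEF"] * counts[0] + ["MID"] * counts[1] + ["ATT"] * counts[2]
    roles = [None] * len(depth)
    for rank, player in enumerate(order):
        roles[player] = names[rank]
    return roles
```

Under this reading, a wingback who moves ahead of a midfielder on a given timeframe is labelled a midfielder for that frame, while the number of players per line stays fixed by FA or FD.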

Table 1. Illustrative example of dynamic role distribution of players based on clustering

Feature construction

To assess the spatiotemporal behaviour of subgroups, we identified subgroups for every timeframe in a match, and first computed the centroid (average position) for the given subgroup on timeframe t based on the X and Y positions of all subgroup members of a given subgroup on t. Furthermore, to validate the assumption that subgroup variables provide additional insight to team-level variables, we computed the team centroids and inter-team distance in longitudinal (CX) and lateral (CY) direction.
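
A minimal sketch of the centroid features for a single timeframe is given below. The data layout (a list of (x, y) tuples plus one role label per player) and the function names are hypothetical, and expressing the inter-team distance as a signed difference of team centroids is an assumption.

```python
def centroid(positions):
    """Mean (x, y) of a list of (x, y) positions in metres."""
    n = len(positions)
    return (sum(p[0] for p in positions) / n,
            sum(p[1] for p in positions) / n)


def subgroup_centroids(frame, roles):
    """Centroid per subgroup for one timeframe.

    frame: (x, y) per outfield player; roles: role label per player.
    """
    groups = {}
    for position, role in zip(frame, roles):
        groups.setdefault(role, []).append(position)
    return {role: centroid(members) for role, members in groups.items()}


def inter_team_distance(frame_a, frame_b):
    """Signed longitudinal and lateral distance between two team centroids."""
    (ax, ay), (bx, by) = centroid(frame_a), centroid(frame_b)
    return ax - bx, ay - by
```

Applying `subgroup_centroids` to every timeframe yields one X and one Y time series per subgroup, which are the inputs for the synchrony analysis.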

Success of attacks

Every attack was classified as successful or non-successful to allow comparison. A successful attack was defined as an attack that resulted in control over the ball in an area that would allow the potential creation of a scoring opportunity. To operationalize this, we computed a zone (Z) value for every ball reception during an attack. Every reception is awarded a score between 0 and 1 based on the location of the reception in relation to the goal, using a grid similar to that in Link’s work on dangerousity (Link et al., Citation2016), from which we deduct points for the defensive pressure on the ball receiver. Any score above 0 indicates the ball possessing player is within a 30 m range of the goal, without (sufficient) defensive pressure, and thereby has the potential to create an opportunity. This concept has been validated in previous work (Goes et al., Citation2019; F. Goes et al., Citation2020) using large sets of Eredivisie data, and has been found to have a strong relationship with success. If the peak zone value during an attack was >0, the attack was classified as successful, as it allows the direct creation of a scoring opportunity; otherwise it was classified as non-successful.
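
The classification rule itself is simple and can be sketched as follows. The helper `zone_value` is hypothetical: the grid scores and the pressure penalty are not specified in the text, and clamping the result at 0 is an assumption made so that a reception under sufficient pressure carries no value.

```python
def zone_value(location_score, pressure_penalty):
    """Hypothetical Z value: grid score (0 to 1) minus a pressure penalty.

    Clamping at 0 is an assumption; the text only states that points
    are deducted for defensive pressure on the receiver.
    """
    return max(0.0, location_score - pressure_penalty)


def classify_attack(zone_values):
    """Label an attack from the Z values of its ball receptions."""
    peak = max(zone_values, default=0.0)   # an attack without receptions scores 0
    return "successful" if peak > 0 else "non-successful"
```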

Statistical analysis

To formulate an answer on our first research question, we evaluated the accuracy of our clustering approach by computing the silhouette score of the attacking and defending formation descriptor of every match. The silhouette score ranges from −1 to 1, in which negative scores indicate poor clustering, scores around 0 indicate a large overlap in clusters, and positive scores indicate good clustering.
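
For reference, the silhouette score of a one-dimensional clustering such as the formation descriptor can be computed as in the sketch below. This is a plain-Python illustration of the standard definition rather than the authors' code; the function name and the handling of singleton clusters (score 0) are conventions chosen here.

```python
def silhouette_score_1d(values, labels):
    """Mean silhouette coefficient for 1-D points with cluster labels."""
    clusters = {}
    for value, label in zip(values, labels):
        clusters.setdefault(label, []).append(value)

    scores = []
    for value, label in zip(values, labels):
        own = clusters[label]
        if len(own) == 1:
            scores.append(0.0)   # convention for singleton clusters
            continue
        # a: mean distance to the other members of the point's own cluster
        a = sum(abs(value - o) for o in own) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster
        b = min(
            sum(abs(value - o) for o in members) / len(members)
            for other, members in clusters.items() if other != label
        )
        denom = max(a, b)
        scores.append((b - a) / denom if denom else 0.0)
    return sum(scores) / len(scores)
```

Well-separated lines of players give values close to 1, whereas a labelling that mixes players from different lines gives values near or below 0.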

To answer our second research question, we compared the spatiotemporal characteristics of successful and non-successful attacks. To assess the coupling of subgroup behaviour, we computed the longitudinal and lateral synchrony between subgroups on intra- and inter-team levels using the relative phase (°) of two subgroup centroid time-series, computed using a Hilbert transform (Palut & Zanone, Citation2005). On an inter-team level we took the subgroup centroids of opposing subgroups (i.e., attackers team A – defenders team B), while on an intra-team level, we did this based on the subgroup centroids of neighbouring subgroups (i.e. defenders – midfielders). To validate our assumption about subgroup versus team-level variables, we studied the inter-team synchrony during successful and non-successful attacks using team-centroids as well.
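
The relative phase computation can be sketched as follows. To keep the example self-contained it builds the analytic signal with a naive O(N²) discrete Fourier transform; in practice a library routine such as `scipy.signal.hilbert` would normally be used. Mean-centring each series before the transform is an assumption, and the function names are hypothetical.

```python
import cmath
import math


def analytic_signal(x):
    """Analytic signal of a real sequence via a naive O(N^2) DFT."""
    n = len(x)
    spectrum = [
        sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]
    # Keep DC (and Nyquist for even n), double positive frequencies,
    # and zero the negative frequencies.
    half = n // 2
    for k in range(1, n):
        if k < half or (n % 2 == 1 and k == half):
            spectrum[k] *= 2
        elif not (n % 2 == 0 and k == half):
            spectrum[k] = 0
    return [
        sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n
        for t in range(n)
    ]


def relative_phase(a, b):
    """Relative phase of a with respect to b, in degrees, per timeframe."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    za = analytic_signal([v - mean_a for v in a])
    zb = analytic_signal([v - mean_b for v in b])
    # The phase of za * conj(zb) equals phase(a) - phase(b),
    # already wrapped to the interval [-180, 180].
    return [math.degrees(cmath.phase(p * q.conjugate())) for p, q in zip(za, zb)]
```

Two centroid series that move together give values near 0° (in-phase), and a series paired with its mirror image gives values near ±180° (anti-phase).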

Given the circular nature and wrapping property (i.e., 370° = 10°) of the relative phase data, we utilized circular statistics to compare subgroup synchrony between successful and non-successful attacks. For every inter- and intra-team relative phase variable, we first computed the mean direction θ, the mean resultant vector length R, the circular variance Vm and the circular standard deviation SD. We then statistically compared successful and non-successful attacks for every variable using a Watson-Williams test (Watson & Williams, Citation1956), with the significance level ⍺ adjusted following a Bonferroni-correction with m = 18. Subsequent effect sizes were computed using Cohen’s d. To further assess the differences between subgroup synchrony in successful and non-successful attacks, we visually assessed the distribution of relative phase values using rose plots (Cremers & Klugkist, Citation2018) and statistically compared distributions between successful and non-successful attacks using Kuiper’s test adapted to circular data (Paltani, Citation2004). This test assesses if there are any differences in mean direction θ or mean resultant length R, and computes a False Positive Probability (FPP) of falsely rejecting the null hypothesis that both datasets could have been drawn from the same distribution. A relative phase close to 0° represents in-phase synchronous behaviour, while a relative phase close to 180° represents anti-phase synchronous behaviour, and other values represent a-synchronous behaviour (Gonçalves et al., Citation2014).
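
The descriptive circular statistics named above can be computed as in the following sketch. It uses one common set of conventions (circular variance 1 − R and circular standard deviation √(−2 ln R)); whether the authors used exactly these definitions is an assumption, and the function name is hypothetical.

```python
import math


def circular_summary(angles_deg):
    """Return (mean direction, R, circular variance, circular SD).

    Angles are in degrees; mean direction and SD are returned in degrees.
    """
    n = len(angles_deg)
    c = sum(math.cos(math.radians(a)) for a in angles_deg) / n
    s = sum(math.sin(math.radians(a)) for a in angles_deg) / n
    r = min(math.hypot(c, s), 1.0)             # mean resultant vector length R
    theta = math.degrees(math.atan2(s, c))     # mean direction in [-180, 180]
    variance = 1 - r                           # circular variance (1 - R)
    # Circular SD = sqrt(-2 ln R) in radians; undefined (infinite) when R = 0.
    sd = math.degrees(math.sqrt(-2 * math.log(r))) if r > 0 else math.inf
    return theta, r, variance, sd
```

Because the mean is taken over unit vectors rather than raw angles, relative phases of 350° and 10° average to 0° instead of the misleading arithmetic mean of 180°.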

Results

We included 12,424 attacks in our analysis, of which 1,237 were classified as successful, and 11,187 as non-successful. This means that less than 10% of all observed attacks were considered successful.

Sensitivity analysis cut-off values

We found that changing the cut-off value within the range of 4–6 seconds would only result in marginal changes in the size of the dataset, with a 9.0% increase in dataset size when changing the cut-off to 4 seconds and a 7.9% decrease when changing it to 6 seconds (Figure 1). We also found that this would lead to negligible changes in the properties of the dataset, as the average number of passes would hardly change (± 0.14 passes) when changing the cut-off value between 4 and 6 seconds, as would the average starting location of the attacks (± 1.0 m) and the average total duration (± 1.5 seconds). Our analysis also further confirmed our hypotheses regarding elaborate attacks, as 75% of possessions with a duration < 3.5 seconds contained fewer than 1 successful pass. In addition, our sensitivity analysis revealed that changing our cut-off value in the range of 4–6 seconds only changes the mean θ of the inter-team synchrony on the x- and y-axis by <0.1 degrees, and the standard deviation by <0.3 degrees. Even at a cut-off value as low as 2 seconds the mean θ of the inter-team synchrony between team-centroids on the longitudinal (x) axis would be 7.00° ± 34.10° in successful attacks and 4.16° ± 40.91° in non-successful attacks, which is comparable to the 7.06° ± 34.13° in successful attacks and 4.28° ± 40.66° in non-successful attacks found with the current cut-off value of 5 seconds.

Figure 1. Assessment of how changing our possession duration cut-off value would impact the size and properties of our dataset

Table 2. Inter-team synchrony between directly opposing subgroups of both teams

Dynamic subgroup identification

To answer our first research question, we first looked at the accuracy of our clustering approach. Our dynamic subgroup clustering resulted in an average silhouette score of 0.63 ± 0.07 for the attacking formation descriptor, and an average silhouette score of 0.63 ± 0.07 for the defending formation descriptor.

Subgroup synchronisation

To answer our second research question, we first looked at inter-team synchrony on a team-level and between opposing subgroups. Then, we present the intra-team subgroup synchrony in both teams. For each step, synchrony of the teams is related to success of the attacking sequence.

Inter-team subgroup synchronisation

On an inter-team level, all variables except for the lateral synchrony between attackers of the attacking team and the defenders of the defending team, and the longitudinal synchrony between midfielders of both teams, had significantly different mean directions θ (Table 2). Most differences were deemed small to trivial, but we found a medium effect size (d = −0.41, 95% CI: [−0.42, −0.41]) on the difference in longitudinal synchronization between defenders on the attacking team and attackers on the defending team.

The distribution of (a-)synchronous inter-team behaviour (Figure 2) shows significant differences in mean direction θ or mean resultant length R on all variables. According to Kuiper’s test, there is a 0.0% chance that samples were drawn from the same distribution for all inter-team variables. Based on visual inspection, the most pronounced effects were found in the longitudinal synchrony between defenders on the attacking team, and attackers on the defending team (Figure 2: Long. DEF-ATT). Both successful and non-successful attacks were characterized by comparably more frequent occurrences of anti-phase behaviour compared to the interactions between other subgroups, and successful attacks seemed to be characterized by more frequent a-synchronous behaviour. What also stands out is the longitudinal synchrony between attackers on the attacking team and defenders on the defending team (Figure 2: Long. ATT-DEF). Successful attacks seemed to be characterized by an increased occurrence of in-phase synchronous behaviour. Finally, we also found that team-level variables (Figure 2: Long. Cx-Cx & Lat. Cy-Cy) are characterized by a more directed distribution and a lower spread in comparison to subgroup-level variables.

Figure 2. Rose plots of relative phase distributions for inter-team variables. Data is grouped in 22.5° bins, in which the radius of the bin represents the relative occurrence. Grey bins with no edges represent non-successful attacks, while white bins with black edges represent successful attacks. Black dotted lines with circular markers represent the mean direction θ and mean resultant length R of the non-successful distributions, while black solid lines with diamond markers represent those of the successful distributions

Intra-team subgroup synchronisation

On the intra-team level, we found significant differences in subgroup synchrony for all variables on the attacking team (Table 3). All effect sizes were deemed trivial to small, except for the longitudinal synchrony between the defenders and midfielders, which had a medium effect size (d = 0.25, 95% CI: [0.25, 0.25]). For the defending team, only the lateral synchrony between defenders and midfielders, and the longitudinal synchrony between midfielders and attackers, were significantly different between successful and non-successful attacks, but both effect sizes were considered to be small.

Table 3. Intra-team subgroup synchrony

The distribution of (a-)synchronous intra-team behaviour (Figure 3) shows significant differences in mean direction θ or mean resultant length R on all variables. According to Kuiper’s test, there is a 0.0% chance that samples were drawn from the same distribution for all intra-team variables. Based on visual inspection, the most pronounced effects were found in the longitudinal direction, especially for the defenders and midfielders on the attacking team, who displayed more frequent a-synchronous behaviour during successful attacks (Figure 3: Long. DEF-MID – Offence), as well as the midfielders and attackers on the attacking team, who displayed more frequent synchronous in-phase behaviour during successful attacks (Figure 3: Long. MID-ATT – Offence). On the defending team, we found similar effects for both the defenders and midfielders (Figure 3: Long. DEF-MID – Defence), and midfielders and attackers (Figure 3: Long. MID-ATT – Defence).

Figure 3. Rose plots of relative phase distributions for intra-team variables. Data is grouped in 22.5° bins, in which the radius of the bin represents the relative occurrence. Grey bins with no edges represent non-successful attacks, while white bins with black edges represent successful attacks. Black dotted lines with circular markers represent the mean direction θ and mean resultant length R of the non-successful distributions, while black solid lines with diamond markers represent those of the successful distributions

Discussion

The current study aimed to analyse the spatiotemporal behaviour of dynamic subgroups in relation to successful attacks. To achieve this, we aimed to build an algorithm to automatically and adequately identify subgroups in association football teams based on position tracking data, and to use these subgroups to explore whether their spatiotemporal behaviour provides more information than team-level analysis in relation to successful attacks. Our results indicate that we could adequately identify subgroups, and that analysis of these subgroups provides more information in comparison to team-level variables. The main findings with regard to these subgroups were that the defenders on the attacking team and attackers on the defending team showed a decreased inter-team synchronisation. Furthermore, the defenders and midfielders on the attacking team also showed a decreased intra-team synchronisation.

Using a new approach for automated dynamic subgroup identification, we have shown that we can adequately cluster dynamic subgroups using only position tracking data. By means of the presented silhouette scores, we showed that our clustering approach provides a valid subgroup identifier, with little overlap between clusters. This is a very promising result, as this is the first study that uses a dynamic, context-dependent method for automated subgroup identification in association football. Previous contributions to the field of subgroup behaviour in association football either used an approach in which subgroups were identified based on manual labelling (Gonçalves et al., Citation2014; Memmert et al., Citation2017; Siegle & Lames, Citation2013), or an approach that dynamically distributed roles but used a static formation irrespective of context (Bialkowski et al., Citation2014; Citation2016). Our approach provides three advantages over existing methods. First, as our approach automatically identifies the formation and the subgroups, it is both more scalable and more reliable than approaches that require manual labelling. Given the time-consuming nature of notational analysis, its limited inter-rater reliability (Chawla et al., Citation2017), and the increased use of data in professional association football (Rein & Memmert, Citation2016), scalability should be a key consideration in every future analysis, especially when that analysis is intended to serve practical purposes. Second, as association football is a highly dynamic game, and players frequently swap positions as a result of interaction (Bialkowski et al., Citation2014), our dynamic approach provides a more realistic perspective on tactical behaviour than approaches that employ static player roles or formations. 
Finally, as tactical behaviour is context-dependent (Balague et al., Citation2013), and team strategy changes with possession status (Clemente & Fernando Manuel Lourenço, Citation2014), using a formation descriptor that changes with possession status can be seen as a more valid approach compared to static formation descriptors. As a result of these advantages, our dynamic subgroup analysis seems capable of advancing our understanding of team tactics, specifically inter- and intra-team subgroup interactions.

While our dynamic subgroup identification has proven to be adequate and able to provide interesting insights into tactical behaviour, several concerns should also be mentioned. In the current work, we aimed to construct a formation descriptor that follows the structure of a traditional 3-line formation descriptor (e.g., 4-3-3) as is typically used in practice. Furthermore, we deliberately chose to only differentiate between attacking and defending phases of play, as our main goal was to do an innovative subgroup analysis focussing on successful attacking. However, one has to acknowledge that formations can change throughout the game, especially as coaches react to match events like goals, and as substitutions occur. For that reason, we chose to only use data from the first half for our formation descriptor. This stopping criterion is somewhat arbitrary, and different choices could have been made here, like using data up to the first substitution. However, other stopping criteria would be more random, as events like substitutions can happen at any time during the game, for a number of reasons. Regardless, identifying the most valid stopping criterion, or even re-constructing the formation descriptor multiple times a game, would be an interesting topic for future research. One interesting work to look at in that regard is the study by Shaw and Glickman (Citation2019), who constructed a formation descriptor similar to ours, but analysed formations over multiple time-windows during the match. Another interesting point of discussion is the fact that players can be grouped in many different ways other than their longitudinal positional role (i.e., defender, midfielder or attacker). One could argue, for example, that subgroups can also be formed based on dynamic interactions with and without the ball, by clustering running trajectories and passing interactions of different players throughout various phases of the game. 
Finally, we deliberately chose to utilize a KMeans algorithm, because of its simplicity, scalability, and goodness of fit towards our problem. However, as the work of both Shaw and Glickman (Citation2019) and Bialkowski et al. (Citation2014) illustrates, there are different solutions to the same problem. Comparing the different methods could help to advance the current state of the art in this aspect. In conclusion, we propose that optimizing the dynamic subgroup identification approach can be performed in different ways, but would require a separate study that is solely focussed on this aspect.
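To make the idea of a k-means based formation descriptor validated by silhouette scores more concrete, the following is a minimal pure-Python sketch. It is an illustration under stated assumptions, not the implementation used in the study: it assumes that the input is each outfield player's mean longitudinal position (in metres from the own goal line), that three lines are requested, and that distances are one-dimensional. The functions `kmeans_1d` and `mean_silhouette` and the sample positions are hypothetical.

```python
def kmeans_1d(values, k=3, n_iter=100):
    """Lloyd's algorithm on scalar values; returns (labels, centroids)."""
    ordered = sorted(values)
    # Deterministic start: pick k values spread evenly over the sorted data.
    centroids = [ordered[(2 * i + 1) * len(ordered) // (2 * k)] for i in range(k)]
    labels = None
    for _ in range(n_iter):
        # Assignment step: each value goes to its nearest centroid.
        new_labels = [min(range(k), key=lambda j: abs(v - centroids[j])) for v in values]
        # Update step: each centroid becomes the mean of its members.
        for j in range(k):
            members = [v for v, lab in zip(values, new_labels) if lab == j]
            if members:
                centroids[j] = sum(members) / len(members)
        if new_labels == labels:
            break
        labels = new_labels
    return labels, centroids


def mean_silhouette(values, labels):
    """Average silhouette coefficient; values near 1 indicate well-separated clusters."""
    scores = []
    for i, (v, lab) in enumerate(zip(values, labels)):
        same = [abs(v - w) for j, (w, l) in enumerate(zip(values, labels)) if l == lab and j != i]
        if not same:
            scores.append(0.0)  # common convention for single-member clusters
            continue
        a = sum(same) / len(same)  # mean distance within the own cluster
        b = min(  # mean distance to the nearest other cluster
            sum(abs(v - w) for w, l in zip(values, labels) if l == other) / labels.count(other)
            for other in set(labels) if other != lab
        )
        scores.append((b - a) / max(a, b) if max(a, b) > 0 else 0.0)
    return sum(scores) / len(scores)


# Hypothetical mean longitudinal positions of ten outfield players.
positions = [18.0, 20.5, 19.0, 21.0, 38.0, 41.5, 40.0, 58.0, 62.0, 60.5]
labels, centroids = kmeans_1d(positions, k=3)
# Order the lines from the own goal outwards and count the players per line.
line_order = sorted(range(3), key=lambda j: centroids[j])
formation = "-".join(str(labels.count(j)) for j in line_order)
print(formation, round(mean_silhouette(positions, labels), 2))
```

In practice, a library implementation such as scikit-learn's `KMeans` together with its `silhouette_score` would be the more usual choice; the hand-written version is used here only to keep the sketch free of dependencies.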

To display the relevance of the subgroup identifier we studied the hypothesis that the analysis of subgroup-level variables would be more informative in comparison to team-level variables. As we assumed based on previous work, behaviour on the team as well as subgroup level was generally characterized by in-phase behaviour (Frencken et al., Citation2012; Citation2011). However, in line with our hypothesis, subgroup-level variables did provide more in-depth information on this aspect in comparison to team-level variables. On the inter-team level, we found a pronounced anti-phase synchronous behaviour pattern between defenders on the attacking team and attackers on the defending team. This would not have been uncovered based on team-level analysis. Furthermore, inspection of the rose plots as well as comparison of the circular variance (Vm) revealed that although both subgroup and team level variables typically follow the characteristics of unimodal directed distributions, subgroup variables have a much larger spread. Therefore, we confirm the assumption that subgroups are more sensitive to disruptions (Memmert et al., Citation2017; Siegle & Lames, Citation2013), and we argue that subgroup-level variables can provide more detailed information compared to team-level variables.

Our results revealed that in general, inter-team subgroup interactions were characterized by in-phase synchronous behaviour. These findings are in line with previous work on smaller sample sizes, in which analyses of subgroups in one World Cup match (Siegle & Lames, Citation2013), one Champions League match (Memmert et al., Citation2017), and one exhibition youth game (Gonçalves et al., Citation2014) have demonstrated that inter-team subgroup synchrony is characterized by synchronous in-phase behaviour. Additionally, we hypothesized that successful attacks would be characterized by a decreased inter-team synchrony, as the attacking team tries to create space for an attack. Our results could only partially confirm this hypothesis. We found that during successful attacks on a team-level, a-synchronous behaviour, especially in the longitudinal direction, was marginally increased. On a subgroup-level, our main finding was that longitudinal and lateral synchrony between defenders on the attacking team and attackers on the defending team was decreased in successful attacks. In addition, we found a (marginally) increased synchrony between attackers on the attacking team and defenders on the defending team. Perturbations of the equilibrium in inter-team synchrony in association football are known to occur infrequently and only last for a short time (Frencken et al., Citation2012), and top tier teams are able to adjust their behaviour quickly and with only a small delay, while already showing more regular and synchronized behaviour patterns to begin with (Low et al., Citation2019). Therefore, the significant inter-team findings, especially the decreased inter-team synchrony between the defenders on the attacking team and attackers on the defending team, can be considered to be essential findings. 
As our results also yielded a decreased intra-team synchrony between defenders and midfielders on the attacking team during successful attacks, one could argue that the attackers on the defending team have to choose between falling back to aid the defence, or keeping pressure on the build-up, which seems to disrupt the inter-team synchrony.

We assumed that on an intra-team level, teams behave like complex systems, and subgroup behaviour is characterized by self-similarity (Davids et al., Citation2005). Our results confirmed the self-similarity assumption, as subgroups on both the attacking team and the defending team spent the majority of the time moving in longitudinal and – to a lesser extent – lateral in-phase synchrony. These results are in line with previous work that found a strong synchrony between subgroup members and their subgroup centroid (Gonçalves et al., Citation2014), and a strong synchrony between combinations of team-mates (dyads) (Folgado et al., Citation2018). Based on the assumption that self-similarity is related to performance, we hypothesized that successful attacks would be characterized by an increased intra-team synchrony on the attacking team, and a decreased intra-team synchrony on the defending team. Our results indicated that this hypothesis should be rejected. Our main finding on the intra-team level conflicted with our hypothesis, but matches our inter-team findings discussed above, as we found a decreased longitudinal synchrony between the defenders and midfielders on the attacking team during successful attacks. The changes in longitudinal synchrony could be explained by the strong coupling of inter-team interactions (Frencken et al., Citation2012; Citation2011), and the strategy typical for the build-up of an attack in association football. It seems as if the midfielders and attackers of the attacking team form an offensive unit that moves in longitudinal synchrony, while the defenders move asynchronously to space the field. As behaviour of teams is directly coupled (Bartlett et al., Citation2012; Frencken et al., Citation2011), the changes in intra-team synchrony in the attacking team are mirrored by the defending team. 
In conclusion, it seems as if self-similarity of intra-team behaviour is not directly linked to success, and that this could be explained by the coupling of inter-team behaviour.

Although most effect sizes were relatively small, the practical implications of those effects could be considerable. Within association football, the focal point of tactical analysis of attacking play is often the attackers, and we tend to give credit for a successful attack to those players directly involved in key passes and assists (Kempe et al., Citation2020), as well as assign blame to the defenders for not preventing those attacks from happening. However, in addition to the key role that is obviously played by attackers during any attack, our results underline the importance of the defenders on the attacking team, as they seem to be the ones responsible for creating space. On the other hand, our results also indicate that suffering successful attacks should – at least partially – be attributed to the attackers on the defending team, as this is typically where the organization seems to break down. Therefore, our results could trigger practitioners to redirect their attention to different parts of the attack, and could potentially impact tactical exercises.

In conclusion, this has been the first study to automatically identify dynamic subgroups and assess interactions on both inter- and intra-team levels. By utilizing this approach in the analysis of a large-scale dataset of professional matches, we provided new insights into the dynamical coordination and interaction of subgroups in relation to successful attacking. However, several questions that could be addressed by future research remain. First of all, tactical behaviour seems to be impacted by playing style (Kempe et al., Citation2014), and therefore determinants of success could be team or style specific. Although we already studied a heterogeneous sample of teams with various playing styles, our study is limited by the fact that all teams played in the Dutch competition. Therefore, it might be interesting to study a dataset that contains matches played in different countries and competitions to see how a nation’s association football philosophy or competition level affects behaviour and determinants of success. In addition, as we focussed more on subgroup interactions and less on style, we assumed formations remain constant over the course of a match. However, as illustrated by Shaw and Glickman (Citation2019), tactical changes result in changing formations over the course of a match. Accounting for these changing formations would add an extra dimension to our analysis, which would be very interesting for future work. Furthermore, in the current study, we aggregated temporal findings into means per attack and looked at relative occurrences of (a-)synchronous behaviour. As this limits the practical interpretability, it could be interesting for future work to conduct time-series analysis of an attack, thereby enabling the exact identification and study of the moments during an attack and its determinants. Finally, in the current study, we limited our analysis to attacks longer than 5 seconds starting outside of the final third. 
Albeit somewhat arbitrary, similar thresholds have been used in comparable work (Shaw & Glickman, Citation2019). Furthermore, we assessed the validity of this decision by conducting a sensitivity analysis and found that marginal changes to the cut-off value would not significantly impact our dataset or our results. However, future work could further improve these criteria by systematically validating our approach with expert analysts, and setting additional criteria to further improve our inclusion of elaborate attacks only.

Conclusion

We have shown that we can automatically identify dynamic subgroups based only on position data, and that these subgroups hold more information, and are more sensitive to change, in comparison with team-level variables. During successful attacks, we found decreases in the synchrony of intra-team and inter-team subgroup interactions, which seem related to the creation of space during a successful attack. Our findings imply that successful attacks are strongly dependent on the defenders creating space for the attackers by moving in an a-synchronous anti-phase fashion, thereby challenging the attackers on the defending team.

Acknowledgments

This work was supported by a grant of the Netherlands Organization for Scientific Research (project title: “The Secret of Playing Football: Brazil vs. The Netherlands”).

Disclosure statement

No potential conflict of interest was reported by the authors.

Additional information

Funding

This work was supported by grants of the Netherlands Organization for Scientific Research (629.004.012-SIA).

References