Forced Turnover: Evaluating Pressing Effectiveness in Soccer

Authors

David Almona, Centre College

Natalie Rayce, Carnegie Mellon University

Published

July 25, 2025


Introduction

Soccer is a highly tactical sport where defensive tactics not only prevent the opposing team from scoring but can also create immediate attacking opportunities. Pressing, a defensive tactic in which defending players apply coordinated pressure on the ball carrier to force turnovers, has been at the core of elite teams like Liverpool under Jürgen Klopp and Manchester City under Pep Guardiola, where it is commonly known as “gegenpressing”. These teams have shown that effective pressing can turn defense into goal-scoring opportunities within seconds.

This study centers around the question: Can the effectiveness of a press in soccer be predicted using factors such as spatial context, pressing dynamics, game context, and situational factors? We have defined our measure of effectiveness as forcing a turnover within 5 seconds of pressing initiation.

Data

Match information, dynamic events, and XY tracking data are provided by SkillCorner, which uses artificial intelligence and deep learning to detect moving objects in broadcast video and extract data. Match information includes the date and time of the games, home and away team names, pitch dimensions, referee information, and other game-level details. Dynamic events include on-ball and off-ball activities such as passes, shots, tackles, recoveries, off-ball runs, goals, and many others. Each event also has a timestamp, location coordinates, and player identification. XY tracking data includes the identities, locations, and movements of all 22 players and the ball throughout the full 90 minutes at a rate of 10 frames per second (10 Hz). Since this data is extracted from broadcast video, SkillCorner uses its technology to extrapolate the coordinates of players outside of the camera’s field of view.

The dataset contains 520 matches played in the 2023 season of Major League Soccer (MLS), the professional soccer league in North America. The dynamic events for 18 of the 520 matches were not provided by SkillCorner because they did not pass their quality check and were, therefore, unusable for this analysis.

Methods

Data processing

Using the XY tracking data, we calculated the frame-by-frame distance, velocity, acceleration, and direction of both the players and the ball. The velocities were smoothed with a rolling function using a window of 3 frames, and the accelerations with a window of 5 frames. Afterwards, we removed physically impossible values, such as velocities larger than 11.9 m/s and accelerations larger than 10 m/s². We also standardized the data so that the home team always attacks from left to right (Anzer et al., 2025).
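The processing steps above can be sketched in plain Python for a single player's track. This is a minimal illustration, not our exact pipeline: the function names, the centered rolling window, and the choice to mask implausible values with `None` (rather than drop rows) are assumptions made for the example.

```python
from math import hypot

FPS = 10          # tracking rate stated in the Data section (10 Hz)
MAX_SPEED = 11.9  # m/s, plausibility cap from the text
MAX_ACCEL = 10.0  # m/s^2, plausibility cap from the text


def rolling_mean(values, window):
    """Centered rolling mean that skips None and shortens the window at the edges."""
    half = window // 2
    out = []
    for i in range(len(values)):
        chunk = [v for v in values[max(0, i - half): i + half + 1] if v is not None]
        out.append(sum(chunk) / len(chunk) if chunk else None)
    return out


def kinematics(xs, ys, fps=FPS):
    """Return (speed, accel) lists for one track; implausible values become None."""
    dt = 1.0 / fps
    # Raw speed: frame-to-frame displacement divided by the frame interval.
    raw_speed = [None] + [
        hypot(xs[i] - xs[i - 1], ys[i] - ys[i - 1]) / dt for i in range(1, len(xs))
    ]
    speed = rolling_mean(raw_speed, 3)   # 3-frame smoothing
    # Raw acceleration: change in smoothed speed per second.
    raw_accel = [None] + [(speed[i] - speed[i - 1]) / dt for i in range(1, len(speed))]
    accel = rolling_mean(raw_accel, 5)   # 5-frame smoothing
    # Mask physically impossible values after smoothing.
    speed = [s if s is not None and s <= MAX_SPEED else None for s in speed]
    accel = [a if a is not None and abs(a) <= MAX_ACCEL else None for a in accel]
    return speed, accel
```

A player covering 0.5 m per frame would come out at a steady 5 m/s with zero acceleration, while a one-frame 50 m jump in the coordinates would be masked.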

Detecting pressing sequences with XY tracking data

While StatsBomb-360 data, and possibly data from other soccer tracking providers, include tagged pressure or pressing events, SkillCorner data does not. To identify pressing events, we initially defined the “pressure zone” as any area within 6 meters of the ball carrier. However, according to Andrienko et al. (2017), this approach is too simplistic because it does not account for the directions in which players are facing or moving. We therefore adopted the approach they proposed, in which the “pressure zone” is elliptical (or oval) rather than circular. The distance limits are determined by the following formula:

L = D_{back} + (D_{front} - D_{back})(z^3 + 0.3z) / 1.3

where:

L = the maximum distance limit for effective pressure at angle \theta (the radius of the oval-shaped pressure zone at any given angle)

D_{back} = the maximum distance limit when the presser is positioned behind the ball carrier

D_{front} = the maximum distance limit when the presser is positioned in front of the ball carrier

z = (1 - \cos\theta) / 2

\theta = the angle between the threat direction (the vector from the ball carrier to the center of the attacking goal) and the vector from the presser to the ball carrier, so that a presser standing directly between the ball carrier and the goal gives \theta = 180°, z = 1, and L = D_{front}

Andrienko et al. (2017) set the distance thresholds D_{back} and D_{front} to 3 m and 9 m, respectively, based on consultation with football (soccer) experts. They later performed an experiment to verify these parameters.

Figure 1: The pressure zone. Left panel shows our initial circular pressure zone approach. Right panel shows the oval pressure zone method from Andrienko et al. (2017), which accounts for directional threat. The shaded blue area is where defending players would be considered as applying pressure to the ball carrier.
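As a rough illustration of the oval zone, the sketch below evaluates the distance limit L and tests whether a defender falls inside it. The function names and the (x, y) tuple inputs are hypothetical. The sketch also assumes that \theta is measured between the threat direction and the presser-to-carrier vector, so that a defender standing goal-side of the ball carrier gets the full D_{front} reach and a defender directly behind gets only D_{back}.

```python
from math import hypot

D_BACK, D_FRONT = 3.0, 9.0  # metres, thresholds from Andrienko et al. (2017)


def pressure_limit(cos_theta, d_back=D_BACK, d_front=D_FRONT):
    """Maximum pressing distance L for a given cos(theta)."""
    z = (1 - cos_theta) / 2
    return d_back + (d_front - d_back) * (z ** 3 + 0.3 * z) / 1.3


def in_pressure_zone(carrier, presser, goal, d_back=D_BACK, d_front=D_FRONT):
    """True if the presser is inside the oval zone around the ball carrier."""
    tx, ty = goal[0] - carrier[0], goal[1] - carrier[1]        # threat direction
    px, py = carrier[0] - presser[0], carrier[1] - presser[1]  # presser -> carrier
    dist, t_norm = hypot(px, py), hypot(tx, ty)
    if dist == 0:
        return True                 # presser on top of the carrier
    if t_norm == 0:
        return dist <= d_back       # threat direction undefined: use the short limit
    cos_theta = (tx * px + ty * py) / (t_norm * dist)
    return dist <= pressure_limit(cos_theta, d_back, d_front)
```

With the carrier at the center spot and the goal at (52.5, 0), a defender 8 m goal-side is inside the zone, while a defender 8 m behind the carrier is not.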

With the pressure zone defined, the only other criterion we specified was that the approach velocity of the defender towards the ball carrier must be greater than 1 m/s, as proposed by Merckx et al. This threshold filters out “static” defending: the defender must actively move towards the ball carrier, not merely stand within the pressure zone. In summary, a defending player was classified as “pressing” if they were simultaneously within the oval pressure zone and approaching the ball carrier above the velocity threshold.
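One simple way to express this rule in code is shown here. The text does not spell out how approach velocity is computed, so the sketch assumes it is the rate at which the defender-to-carrier distance shrinks between consecutive frames; the function names are illustrative only.

```python
from math import hypot

FPS = 10                  # frames per second of the tracking data
MIN_APPROACH_SPEED = 1.0  # m/s threshold from the text


def approach_velocity(def_prev, def_now, car_prev, car_now, fps=FPS):
    """Closing speed in m/s: positive when the defender is getting nearer."""
    d_prev = hypot(def_prev[0] - car_prev[0], def_prev[1] - car_prev[1])
    d_now = hypot(def_now[0] - car_now[0], def_now[1] - car_now[1])
    return (d_prev - d_now) * fps


def is_pressing(in_zone, approach_speed, threshold=MIN_APPROACH_SPEED):
    """A defender presses only if inside the zone AND closing fast enough."""
    return in_zone and approach_speed > threshold
```

For example, a defender who closes 0.2 m on a stationary carrier in one frame is approaching at about 2 m/s and would count as pressing if inside the zone.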

Grouping pressing sequences

Individual pressing actions were grouped into pressing sequences based on how close they happened in time. We defined a pressing sequence as a continuous period where at least one defender from the same team maintained pressing behavior, allowing for brief interruptions of up to 1.5 seconds (15 frames). In other words, if a pressing defender leaves the pressure zone or is no longer actively pressing, but another defender exhibits pressing behavior within 1.5 seconds, the sequence remains active. However, if the next press begins more than 1.5 seconds after the previous press, a new pressing sequence begins.
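The grouping rule can be sketched as a single pass over the frames in which at least one defender from a team is pressing. The function name and the list-of-frame-indices input are assumptions made for this example.

```python
MAX_GAP_FRAMES = 15  # 1.5 seconds at 10 Hz


def group_pressing_frames(frames, max_gap=MAX_GAP_FRAMES):
    """Merge pressing frames into (start_frame, end_frame) sequences.

    `frames` holds the frame indices in which at least one defender of the
    same team is pressing. A gap of more than `max_gap` frames starts a new
    sequence; shorter interruptions are bridged.
    """
    sequences = []
    for f in sorted(frames):
        if sequences and f - sequences[-1][1] <= max_gap:
            sequences[-1][1] = f          # extend the current sequence
        else:
            sequences.append([f, f])      # open a new sequence
    return [tuple(s) for s in sequences]
```

Given pressing frames 100, 101, 102, 110, 130, and 131, the 8-frame pause is bridged and the 20-frame pause is not, giving the sequences (100, 110) and (130, 131).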

For each identified sequence, we extracted the sequence duration (in frames and seconds), the number of defending players involved, and the average approach velocity of pressing defenders at the sequence start. 252,646 pressing sequences were identified.

Figure 2: Heatmap showing the pressing patterns across the MLS 2023 season. The highest concentration of pressing occurs in the middle third of the field on both sides, where teams look to win the ball back in midfield areas to create quick attacking opportunities. Note: the home team always attacks left to right and the away team right to left.

Measuring an effective press

Since the goal of pressing is to regain ball possession or force a turnover by the attacking team, Lee et al. (2025) highlighted that the impact of pressing should extend beyond immediate ball possession. This is mainly because pressing can force the attacking team into tight positions, which may increase the likelihood of an eventual turnover in the next few seconds or actions. They tested different success criteria for pressing, such as regaining possession after 7 seconds or after 4 actions, among others, but their analysis focused on regaining possession within 5 seconds of the pressing initiation. As a result, we decided to use the same method and assess the effectiveness of a pressing sequence based on whether the pressing team forced a turnover within 5 seconds of the start of the pressing sequence.
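The labeling rule can be written compactly as follows. The sketch assumes we already have a sorted list of frames at which the pressing team regained possession, and it treats the 5-second boundary as inclusive; both are assumptions of the example, as are the names.

```python
from bisect import bisect_left

FPS = 10            # frames per second
WINDOW_SECONDS = 5  # success window after the press starts


def forced_turnover(start_frame, turnover_frames, fps=FPS, window_s=WINDOW_SECONDS):
    """True if a turnover occurs within `window_s` seconds of `start_frame`.

    `turnover_frames` must be sorted in ascending order.
    """
    end_frame = start_frame + window_s * fps
    i = bisect_left(turnover_frames, start_frame)  # first turnover at/after the start
    return i < len(turnover_frames) and turnover_frames[i] <= end_frame
```

A press starting at frame 100 is labeled effective if a turnover happens at frame 140, but not if the first turnover comes at frame 151.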

Feature engineering

After data cleaning and processing, we had 31 features that were used for training our model. These included:

  • Spatial Context: Ball carrier position, distance to boundaries, field third, etc.
  • Pressing Dynamics: Number of defenders, approach velocity, passing options, etc.
  • Game Context: Score, game state (winning/losing/drawing), time remaining, etc.
  • Situational Factors: How the ball carrier gained possession (pass reception, interception, etc.), incoming pass characteristics (distance, height, range), etc.

These features were extracted from already tagged events provided by SkillCorner and from processing the XY tracking data. Further details on the features are provided in the appendix.
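As an illustration of the spatial-context features, the sketch below derives the distance to the nearest boundaries and the field third from a ball carrier position. The feature names are hypothetical, and the sketch assumes coordinates in meters with the origin at the center of a 105 m by 68 m pitch and the team in possession attacking left to right.

```python
def spatial_features(x, y, pitch_length=105.0, pitch_width=68.0):
    """Return simple spatial-context features for a ball carrier at (x, y)."""
    half_l, half_w = pitch_length / 2, pitch_width / 2
    third_len = pitch_length / 3
    # Thirds are defined relative to the direction of attack (left to right).
    if x < -half_l + third_len:
        third = "defensive"
    elif x < half_l - third_len:
        third = "middle"
    else:
        third = "attacking"
    return {
        "dist_to_touchline": half_w - abs(y),   # nearest sideline
        "dist_to_goal_line": half_l - abs(x),   # nearest end line
        "field_third": third,
    }
```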

Analysis

To reiterate, 252,464 pressing sequences were identified across the 502 MLS matches. After explicitly handling missing values within some of the features, we built and compared the performances of two models to predict forced turnovers within 5 seconds of pressing initiation: a logistic regression as a baseline and an XGBoost model. We used 10-fold cross-validation with match-based splits to prevent data leakage, making sure that no observations from the same match appeared in both training and test sets. To reduce computational load, hyperparameter tuning was done on a 10% stratified sample of the data to find the best XGBoost parameters.
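The match-based split can be illustrated without any modeling library. The sketch below assigns whole matches to folds so that every pressing sequence from a match lands on the same side of each split; the function name and the round-robin assignment are assumptions of the example, and scikit-learn's `GroupKFold` offers the same guarantee.

```python
import random


def match_based_folds(match_ids, n_folds=10, seed=0):
    """Return a list of (train_indices, test_indices), split by match.

    `match_ids[i]` is the match that row i (one pressing sequence) belongs to.
    """
    unique_matches = sorted(set(match_ids))
    random.Random(seed).shuffle(unique_matches)
    # Round-robin assignment keeps the number of matches per fold balanced.
    fold_of = {m: i % n_folds for i, m in enumerate(unique_matches)}
    folds = []
    for k in range(n_folds):
        test = [i for i, m in enumerate(match_ids) if fold_of[m] == k]
        train = [i for i, m in enumerate(match_ids) if fold_of[m] != k]
        folds.append((train, test))
    return folds
```

Because a match never appears in both the training and test indices of a fold, correlated sequences from the same game cannot leak across the split.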

Model Performance

The XGBoost model marginally outperformed the logistic regression model across all evaluation metrics, as seen in Table 1 below.

Table 1

Model Calibration

Figure 3 below shows the calibration plot of our XGBoost model, plotting predicted turnover probabilities against actual turnover rates. For a perfectly calibrated model, the blue line (our model) would align exactly with the red dotted diagonal line (perfect calibration).

Our model shows reasonably good calibration overall, with some deviation from the diagonal. The pattern of this deviation indicates that our XGBoost model is more reliable when it predicts that a turnover is unlikely than when it predicts that one is likely.

Figure 3: Calibration plot
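The points on a calibration plot of this kind can be computed by binning the predicted probabilities and comparing each bin's mean prediction with its observed turnover rate. The sketch below uses ten equal-width bins, which is an assumption; the number of bins behind Figure 3 is not stated here.

```python
def calibration_curve(y_true, y_prob, n_bins=10):
    """Return (mean_predicted, observed_rate, count) for each non-empty bin."""
    bins = [[0, 0.0, 0.0] for _ in range(n_bins)]  # count, sum of probs, sum of outcomes
    for y, p in zip(y_true, y_prob):
        b = min(int(p * n_bins), n_bins - 1)       # p == 1.0 falls in the last bin
        bins[b][0] += 1
        bins[b][1] += p
        bins[b][2] += y
    return [(sp / n, sy / n, n) for n, sp, sy in bins if n]
```

A well-calibrated model produces points close to the diagonal, meaning that among sequences given a predicted probability near 0.3, roughly 30% end in a turnover.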

Results

Team Pressing Performance

Significant variation existed in pressing effectiveness across MLS teams during the 2023 season (Figure 4). Based on our model, the New York Red Bulls were the most effective pressing team, forcing approximately 7 more turnovers per game than our model predicted.

Figure 4: Positive values (blue) indicate teams forcing more turnovers than predicted, while negative values (red) show underperformance. New York Red Bulls led MLS in pressing effectiveness, while Nashville SC struggled most relative to expectations.

Pressing Volume vs. Pressing Effectiveness

Figure 5 below shows pressing volume against pressing effectiveness by teams with four quadrants: high-volume/high-effectiveness (top right), low-volume/high-effectiveness (top left), high-volume/low-effectiveness (bottom right), and low-volume/low-effectiveness (bottom left). The New York Red Bulls represented the ideal combination, attempting the most presses per game while maintaining the highest effectiveness.

Figure 5: Teams in the upper-right quadrant combine high pressing frequency with high effectiveness

When pressing vs. when being pressed

Figure 6 below shows the relationship between pressing effectiveness and press resistance across all MLS teams during the 2023 season, measured as the difference between actual and expected turnovers per game (xP_diff). When teams are pressing, positive xP_diff values indicate forcing more turnovers than expected. When teams are being pressed, negative xP_diff values indicate better press resistance (fewer turnovers than expected). For visualization purposes, the y-axis values were inverted, transforming negative xP_diff values into positive “turnovers avoided,” so that both axes follow the natural idea that higher values represent better performance.
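A per-team xP_diff of this kind can be sketched as follows. The example assumes that expected turnovers are the sum of the model's predicted probabilities over a team's pressing sequences, which is a common convention but is not spelled out above; the tuple layout and function name are also illustrative.

```python
from collections import defaultdict


def xp_diff_per_game(sequences):
    """Actual minus expected turnovers per game, keyed by team.

    `sequences` is an iterable of (team, match_id, predicted_prob, outcome),
    where outcome is 1 if a turnover was forced and 0 otherwise.
    """
    actual = defaultdict(float)
    expected = defaultdict(float)
    games = defaultdict(set)
    for team, match_id, prob, outcome in sequences:
        actual[team] += outcome
        expected[team] += prob      # assumed definition of expected turnovers
        games[team].add(match_id)
    return {t: (actual[t] - expected[t]) / len(games[t]) for t in actual}
```

Running the same function with sequences keyed by the team in possession and negating the result gives the “turnovers avoided” value used for the inverted axis.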

Teams in the upper-right quadrant are good at both: they force more turnovers than expected when pressing while avoiding more turnovers than expected when in possession. Based on our model, Seattle Sounders, Los Angeles FC, and Sporting Kansas City are the select few that do well at both. St. Louis City Soccer Club and New York Red Bulls show strong pressing abilities but are vulnerable when being pressed (lower-right). Portland Timbers seem to struggle the most at both.

Figure 6: Pressing effectiveness versus press resistance for MLS teams during the 2023 season.

Discussion

Feature Importance

Our analysis shows that pressing effectiveness in MLS is predictable to a meaningful degree, with our XGBoost model achieving an AUC of 0.771. Variable importance analysis showed that start_type contributed approximately 70% of total model importance. This variable describes how the ball carrier gained possession of the ball, for example through an interception, a reception, or a recovery. When we looked at the observed turnover rates, pressing a ball carrier who had just won the ball through an interception forced a turnover approximately 74% of the time (Table 2).

This aligns with tactical principles employed at the highest levels of professional football. As Domenec Torrent, Pep Guardiola’s former assistant coach at Manchester City, explained: “When we lose the ball it’s very important for Pep to press high in five seconds. If you don’t win it back within five seconds then make a foul and go back” (Torrent, cited in Manchester Evening News). This immediate pressing approach, which is particularly effective when opponents have just gained possession through interceptions, reflects the vulnerability window that our data quantifies and helps explain why start_type is such an important predictor of pressing outcomes.

Table 2: Observed turnover rates by start_type

Limitations

  • There was a 23% class imbalance in the response variable, forced turnover. This can bias models toward the majority class, so that they perform poorly on the minority class (in our case, turnover = “yes”), because they are fit to maximize overall accuracy.

  • Our analysis uses only MLS data, limiting generalizability to other leagues with different tactical styles, player quality, or physical demands.

  • Tracking data inaccuracies may affect player position and movement precision.

  • Our model does not account for individual player skill levels, such as pace and pressing ability.

  • No pitch control modeling limits understanding of spatial dominance during pressing.

Future Work

  • Apply class weights to handle class imbalance.

  • Extend analysis to multiple leagues.

  • Add pressing intensity calculation.

  • Add pitch control metrics to account for spatial dominance.
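As a pointer for the class-weight item, XGBoost exposes a `scale_pos_weight` parameter, and a common starting value is the ratio of negative to positive examples. The helper below is a hypothetical sketch of that calculation, not something used in the present analysis.

```python
def scale_pos_weight(labels):
    """Ratio of negative to positive labels (1 = forced turnover, 0 = none)."""
    n_pos = sum(1 for y in labels if y == 1)
    n_neg = len(labels) - n_pos
    if n_pos == 0:
        raise ValueError("no positive examples; the ratio is undefined")
    return n_neg / n_pos
```

If roughly 23% of sequences were turnovers (our reading of the imbalance figure, stated here as an assumption), the ratio would be about 77 / 23, or roughly 3.3.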

Acknowledgement

Special thanks to Daniel Wicker (Charlotte FC), Dr. Ron Yurko, Quang Nguyen, the CMSACamp TAs, and Carnegie Mellon University.

Citations

Andrienko, G., Andrienko, N., Budziak, G., Dykes, J., Fuchs, G., von Landesberger, T., & Weber, H. (2017). Visual analysis of pressure in football. Data Mining and Knowledge Discovery, 31(6), 1793–1839. https://doi.org/10.1007/s10618-017-0513-2

Anzer, G., Arnsmeyer, K., Bauer, P., Bekkers, J., Brefeld, U., Davis, J., Evans, N., Kempe, M., Robertson, S. J., Smith, J. W., & Van Haaren, J. (2025). Common Data Format (CDF): A Standardized Format for Match-Data in Football (Soccer). http://arxiv.org/abs/2505.15820

Bauer, P., & Anzer, G. (2021). Data-driven detection of counterpressing in professional football: A supervised machine learning task based on synchronized positional and event data with expert-based feature extraction. Data Mining and Knowledge Discovery, 35. https://doi.org/10.1007/s10618-021-00763-7

Lee, M., Jo, G., Hong, M., Bauer, P., & Ko, S.-K. (2025). exPress: Contextual Valuation of Individual Players Within Pressing Situations in Soccer.

Merckx, S., Robberechts, P., Euvrard, Y., & Davis, J. (n.d.). Measuring the Effectiveness of Pressing in Soccer.

Robberechts, P. (n.d.). Valuing the Art of Pressing.

Contact Information

David Almona, Centre College, almonadavid@gmail.com

Natalie Rayce, Carnegie Mellon University, nrayce@andrew.cmu.edu

Code Availability

Code available on GitHub