Forced Turnover:
Evaluating Pressing Effectiveness in Soccer

Natalie Rayce

Carnegie Mellon University

David Almona

Centre College

Daniel Wicker

Charlotte FC, External Advisor

Key Terms

Pressing: a defensive tactic where players apply coordinated pressure on the opponent with the ball to force mistakes, win back possession, and quickly transition to attack

Forced Turnover: a loss of possession caused by opponent pressure that results in the opposing team gaining control. This includes misplaced passes, interceptions, successful tackles, and losing control of the ball under pressure, all of which are direct results of effective defensive pressure.

Turning Defense to Attack

Through pressing, Manchester City wins back possession seconds after losing it.

Data

  • Source: SkillCorner

  • Dataset:

    • 520 matches in the MLS 2023 season
    • 18 matches were unusable (no event data), leaving 502 for analysis
  • Three data types:

    • Match information: game details (teams, pitch, referee)
    • Event data: player actions (passes, shots, tackles) with timestamps and coordinates
    • Tracking data: real-time positions of all players and ball at 10 Hz

Steps

1. Identify when a press occurs using player tracking data

2. Define the criteria for an effective press

3. Modeling & Results

Initial Pressure Zone

  • Within 6 meters of the ball carrier

Problem: A fixed 6-meter radius ignores where the presser stands relative to the goal, so it oversimplifies pressing

The Pressure Zone

Adopted from Andrienko et al. (2017).

Oval Pressure Zone Formula

\[ L = D_{back} + (D_{front} - D_{back})(z^3 + 0.3z) / 1.3 \] where:

  • \(L\) = the maximum distance limit for effective pressure at angle \(\theta\)

  • \(D_{back}\) = the max. distance limit when the presser is positioned behind the ball carrier

  • \(D_{front}\) = the max. distance limit when the presser is positioned in front of the ball carrier

  • \(z\) = \((1 - \cos\theta) / 2\)

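  • \(\theta\) = the angle between the vector from the ball carrier to the center of the attacking goal (our threat direction) and the vector from the presser to the ball carrier, so that \(\theta = 180^\circ\) (\(z = 1\), \(L = D_{front}\)) when the presser stands directly between the ball carrier and the goal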

Andrienko et al. (2017) set the distance thresholds \(D_{back}\) and \(D_{front}\) to 3 m and 9 m, respectively, based on consultation with football (soccer) experts. They later performed an experiment to verify these parameters.
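
As a minimal sketch of the formula above, the following Python function evaluates \(L\) for a given \(\theta\) using the 3 m and 9 m thresholds. The function and argument names are illustrative and are not taken from the original analysis.

```python
import math

def pressure_limit(theta, d_back=3.0, d_front=9.0):
    """Maximum pressing distance L (metres) at angle theta (radians)."""
    z = (1.0 - math.cos(theta)) / 2.0  # z runs from 0 to 1
    return d_back + (d_front - d_back) * (z**3 + 0.3 * z) / 1.3

for deg in (0, 90, 180):
    print(deg, round(pressure_limit(math.radians(deg)), 2))
# 0 3.0
# 90 4.27
# 180 9.0
```

At \(z = 0\) the limit equals \(D_{back}\) and at \(z = 1\) it equals \(D_{front}\); the cubic term keeps the zone narrow to the sides, which gives the oval shape.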

The pressure zone angle

Pressing Criteria

A defending player was classified as “pressing” if they were simultaneously

  • within the oval pressure zone, AND
  • approaching the ball carrier above a velocity threshold of 1 m/s.
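
The two criteria can be sketched in Python as follows. The slides do not state how approach velocity was computed, so this sketch assumes it is the rate at which the defender-to-carrier distance shrinks between consecutive 10 Hz frames; all function names are hypothetical.

```python
import math

FRAME_DT = 0.1  # seconds between tracking frames at 10 Hz

def approach_speed(defender_prev, defender_now, carrier_prev, carrier_now, dt=FRAME_DT):
    """Rate (m/s) at which the defender-carrier distance is shrinking (assumed definition)."""
    d_prev = math.dist(defender_prev, carrier_prev)
    d_now = math.dist(defender_now, carrier_now)
    return (d_prev - d_now) / dt

def is_pressing(distance, zone_limit, speed, min_speed=1.0):
    """Both criteria at once: inside the oval zone AND closing faster than min_speed."""
    return distance <= zone_limit and speed > min_speed

# Hypothetical frame pair: the defender closes 0.2 m in one frame on a stationary carrier.
speed = approach_speed((10.0, 0.0), (9.8, 0.0), (5.0, 0.0), (5.0, 0.0))
print(round(speed, 2), is_pressing(4.8, 9.0, speed))
# 2.0 True
```

Here `zone_limit` is the value of \(L\) at the defender's angle, so the same defender at the same distance can be pressing from the front but not from behind.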

Consecutive pressing actions were grouped into a single sequence when at least one defender was pressing again within 1.5 seconds of the previous action.
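
One way to read this grouping rule, sketched in Python: walk through the timestamps of frames in which any defender is pressing and start a new sequence whenever the gap exceeds 1.5 seconds. The function name and the example timestamps are hypothetical.

```python
def group_into_sequences(press_times, max_gap=1.5):
    """Split pressing timestamps (seconds) into sequences.

    A new sequence starts whenever the gap since the previous
    pressing frame exceeds max_gap.
    """
    sequences = []
    for t in sorted(press_times):
        if sequences and t - sequences[-1][-1] <= max_gap:
            sequences[-1].append(t)
        else:
            sequences.append([t])
    return sequences

times = [10.0, 10.1, 10.2, 11.5, 14.0, 14.1]
print(group_into_sequences(times))
# [[10.0, 10.1, 10.2, 11.5], [14.0, 14.1]]
```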

252,464 pressing sequences were identified across the 502 MLS matches.

But What Makes An Effective Press?

  • The goal of pressing is to regain possession from the attacking team, but its impact extends beyond winning the ball back immediately.
  • Pressing can force the attacking team into tight positions, which may increase the likelihood of an eventual turnover in the next few seconds or actions.

A press is effective when there’s a forced turnover within 5s of pressing initiation.
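
A minimal sketch of this labeling rule in Python, assuming the press start times and forced-turnover times come from the same match and that the turnovers have already been filtered to those suffered by the pressed team. The function name and example values are hypothetical.

```python
from bisect import bisect_left

def label_effective(press_starts, turnover_times, window=5.0):
    """Return 1 for each press followed by a forced turnover within `window` seconds."""
    turnovers = sorted(turnover_times)
    labels = []
    for start in press_starts:
        i = bisect_left(turnovers, start)  # first turnover at or after the press start
        hit = i < len(turnovers) and turnovers[i] - start <= window
        labels.append(int(hit))
    return labels

print(label_effective([10.0, 30.0, 52.0], [13.2, 58.5]))
# [1, 0, 0]
```

Keeping `window` as a parameter makes it straightforward to rerun the labeling with other thresholds, for example `label_effective(..., window=3.0)`.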


Features: 31 features were extracted and used for training our model:

  • Spatial Context: Ball carrier position, distance to boundaries, field third, etc.

  • Pressing Dynamics: Number of defenders, approach velocity, passing options, etc.

  • Game Context: Score, game state (winning/losing/drawing), time remaining, etc.

  • Situational Factors: How the ball carrier gained possession (pass reception, interception, etc.), incoming pass characteristics (distance, height, range), etc.

Modeling

We compared five models, all evaluated using 10-fold cross-validation with match-based splits to prevent data leakage.
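
To illustrate what a match-based split means, here is a pure-Python sketch that assigns whole matches to folds, so that pressing sequences from one match never appear in both the training and the test set. This is not the authors' code; the function names are hypothetical, and libraries offer equivalents (for example, scikit-learn's `GroupKFold`).

```python
import random

def match_based_folds(match_ids, n_folds=10, seed=0):
    """Assign every row to a fold so that all rows from a match share a fold."""
    unique_matches = sorted(set(match_ids))
    random.Random(seed).shuffle(unique_matches)
    fold_of_match = {m: i % n_folds for i, m in enumerate(unique_matches)}
    return [fold_of_match[m] for m in match_ids]

def train_test_indices(folds, test_fold):
    """Row indices for training and testing when `test_fold` is held out."""
    train = [i for i, f in enumerate(folds) if f != test_fold]
    test = [i for i, f in enumerate(folds) if f == test_fold]
    return train, test

# Hypothetical data: 25 matches with 4 pressing sequences each.
match_ids = [m for m in range(25) for _ in range(4)]
folds = match_based_folds(match_ids)
train, test = train_test_indices(folds, test_fold=0)
print(len(train), len(test))
```

Splitting by row instead would let near-identical situations from the same match leak across the train/test boundary and inflate the cross-validated scores.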

XGBoost hyperparameters were tuned on a 50% stratified sample of the data.

*Calibration plots are available in Appendix A.2.

Is XGBoost really better than Logistic?

The performance difference between XGBoost and logistic regression appears negligible.

\[ |z| = \frac{|\bar{x}_{xgboost} - \bar{x}_{logit}|}{\sqrt{SE^2_{xgboost} + SE^2_{logit}}} \]

\[ = \frac{|0.434 - 0.445|}{\sqrt{0.00263^2 + 0.00255^2}} \]

\[ = \frac{0.011}{0.00366} \]

\[ \approx 3.0 \text{ standard errors apart} \]

New York Red Bulls pressed best

Comparing our model to PPDA

What is Passes Per Defensive Action (PPDA)? The number of opposition passes allowed outside of the pressing team’s own defensive third, divided by the number of defensive actions by the pressing team outside of their own defensive third. (Source: Opta Analyst)

A lower figure indicates a higher level of pressing, while a higher figure indicates a lower level of pressing.
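
The definition reduces to a single ratio, sketched below in Python with made-up counts. The function name and the numbers are illustrative only.

```python
def ppda(opponent_passes, defensive_actions):
    """Passes Per Defensive Action.

    Both counts exclude events in the pressing team's own defensive third.
    """
    if defensive_actions == 0:
        raise ValueError("PPDA is undefined with zero defensive actions")
    return opponent_passes / defensive_actions

print(ppda(300, 30))  # 10.0, more aggressive pressing
print(ppda(300, 20))  # 15.0, less aggressive pressing
```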

In short:

  • PPDA measures pressing aggressiveness

  • Our model measures pressing effectiveness

So why even compare these?

Lower PPDA Correlates with More Turnovers Forced

PPDA values from Opta Analyst

Variable Importance

  • The feature start_type contributed approximately 70% of total model importance.

  • This feature describes how the ball carrier gained possession of the ball.

  • In the SkillCorner data, start_type has 14 categorical levels, which we grouped into 5:

Reception ← pass_reception, goal_kick_reception, throw_in_reception, corner_reception, free_kick_reception

Interception ← pass_interception, goal_kick_interception, throw_in_interception, corner_interception, free_kick_interception

Recovery ← recovery

Keep Possession ← keep_possession

Unknown ← unknown, missing values
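
The grouping above can be written as a lookup table. The sketch below is in Python with hypothetical names; sending any unlisted level to "Unknown" is an assumption of this sketch, not something stated in the slides.

```python
START_TYPE_GROUPS = {
    "Reception": ["pass_reception", "goal_kick_reception", "throw_in_reception",
                  "corner_reception", "free_kick_reception"],
    "Interception": ["pass_interception", "goal_kick_interception",
                     "throw_in_interception", "corner_interception",
                     "free_kick_interception"],
    "Recovery": ["recovery"],
    "Keep Possession": ["keep_possession"],
    "Unknown": ["unknown"],
}

# Invert the table: raw level -> group name.
LEVEL_TO_GROUP = {
    level: group
    for group, levels in START_TYPE_GROUPS.items()
    for level in levels
}

def group_start_type(raw):
    """Map a raw start_type level to its group; missing values become 'Unknown'."""
    if raw is None:
        return "Unknown"
    return LEVEL_TO_GROUP.get(raw, "Unknown")  # unlisted levels: assumed 'Unknown'

print(group_start_type("corner_interception"))  # Interception
print(group_start_type(None))                   # Unknown
```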

74% of Presses After Interceptions Led to Turnovers

In the observed data, pressing a ball carrier who had just won the ball through an interception led to a turnover approximately 74% of the time.

Limitations

  • Class imbalance in the dataset (23% turnovers vs. 77% no turnovers) led to models with high accuracy but lower recall for the minority class.

  • MLS-only data limits generalizability to leagues with different physical demands and player quality.

  • Grouping pressing actions into sequences means this approach cannot evaluate individual player pressing effectiveness.

  • Missing values were flagged or labeled as ‘unknown’ rather than estimated, which may limit the model’s ability to capture underlying patterns.
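
To see why the class imbalance makes accuracy misleading, consider a hypothetical baseline that always predicts "no turnover". With the 23% / 77% split above it reaches 77% accuracy while catching no turnovers at all. The Python sketch below uses synthetic labels, not the project data.

```python
def accuracy_and_recall(y_true, y_pred):
    """Accuracy over all rows and recall for the positive (turnover) class."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    true_pos = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    positives = sum(y_true)
    recall = true_pos / positives if positives else 0.0
    return correct / len(y_true), recall

y_true = [1] * 23 + [0] * 77   # 23% turnovers, 77% no turnovers
y_pred = [0] * 100             # baseline: never predict a turnover
print(accuracy_and_recall(y_true, y_pred))
# (0.77, 0.0)
```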

Future Work

  • Add pressing intensity calculations and pitch control models.

  • Run a sensitivity analysis on the 5-second window for pressing effectiveness by testing alternative time thresholds (e.g., 3, 4, and 6 seconds).

  • More research on the shape and boundaries of the pressure zone.

Acknowledgement: SkillCorner, Daniel Wicker (Charlotte FC), Dr. Ron Yurko, Quang Nguyen, the CMSACamp TAs, and Carnegie Mellon University - Statistics & Data Science

Contact Information:

Thank You

Appendix

A.1: Feature Descriptions

Some features used in the model

| Feature | Description | Type |
| --- | --- | --- |
| ball_carrier_x | x-coordinate of ball carrier at press start | Numeric |
| ball_carrier_y | y-coordinate of ball carrier at press start | Numeric |
| n_pressing_defenders | Number of unique defenders who were actively pressing | Numeric |
| max_passing_options | Number of available passing options for ball carrier | Numeric |
| avg_approach_velocity | Average speed of pressing defenders (m/s) | Numeric |
| poss_third_start | Pitch third where press begins | Categorical |
| game_state | Current match status (winning/drawing/losing) | Categorical |
| start_type | How player gained possession | Categorical |
| incoming_high_pass | Pass received above 1.8 m height | Boolean |
| incoming_pass_distance_received | Distance of received pass (m) | Numeric |
| incoming_pass_range_received | Range category of received pass | Categorical |
| organised_defense | Defense organized at pass moment | Boolean |
| dist_to_nearest_sideline | Distance to nearest sideline (m) | Numeric |
| dist_to_nearest_endline | Distance to nearest endline (m) | Numeric |
| dist_to_attacking_endline | Distance to attacking endline (m) | Numeric |
| dist_to_defensive_endline | Distance to defensive endline (m) | Numeric |
| dist_to_attacking_goal | Distance to attacking goal center (m) | Numeric |
| minutes_remaining_half | Minutes left in current half | Numeric |
| minutes_remaining_game | Minutes left in match | Numeric |
| ball_carrier_direction | Ball carrier direction (degrees) | Numeric |
| ball_carrier_speed | Ball carrier speed (m/s) | Numeric |
| penalty_area | Press starts in penalty area | Boolean |
| n_defenders_within_10m | Defenders within 10 m radius | Numeric |
| n_defenders_within_15m | Defenders within 15 m radius | Numeric |
| n_defenders_within_20m | Defenders within 20 m radius | Numeric |
| n_defenders_within_25m | Defenders within 25 m radius | Numeric |

A.2: Calibration Plots

A.3: Pressing vs. Being Pressed