Performance of supervised learning world cup match prediction model

Review of Match Day 1 World Cup Machine Learning Predictions

In my previous post, I shared an overview of my Match Day 1 World Cup Machine Learning model, its match predictions and the betting strategies based on these outputs. In this post, we review how the predictions and the recommended betting strategies performed.

How did the Match Day 1 World Cup Machine Learning Model perform?

To recap, we used international football match results since 1995, together with Elo ratings and the Elo points gained or lost by each team, to develop the following data features (the inputs the model algorithm processes to predict outcomes):

  • Wins of both teams in previous 7 non-friendly matches
  • Losses of both teams in previous 7 non-friendly matches
  • Head-to-head wins in the last 3 games against the opposing team
  • Head-to-head losses in the last 3 games against the opposing team
  • In the last 7 non-friendly matches, number of matches won by the team against top-20 Elo-ranked teams
  • In the last 7 non-friendly matches, number of matches won by the opposing team against top-20 Elo-ranked teams
  • In the last 7 non-friendly matches, Elo points gained by the team and by the opposing team

From the above statistics/features, we calculate the totals (sums), medians and variances.
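
As a rough illustration of the feature step above, here is a minimal sketch that summarises a team's last 7 non-friendly matches. The record layout (`result`, `friendly`, `elo_change`) and the function names are hypothetical, and the use of population variance is an assumption; the original pipeline may differ on all of these.

```python
from statistics import median, pvariance


def summarise(values):
    """Return the total, median and variance of a list of per-match numbers."""
    return {
        "total": sum(values),
        "median": median(values),
        # Assumption: population variance; the post does not say which kind was used.
        "variance": pvariance(values),
    }


def recent_form_features(matches, window=7):
    """Summarise a team's most recent non-friendly matches.

    `matches` is a hypothetical list of dicts ordered oldest to newest, with
    keys 'result' ('W', 'D' or 'L'), 'friendly' (bool) and 'elo_change' (float).
    """
    competitive = [m for m in matches if not m["friendly"]]
    recent = competitive[-window:]  # keep only the last `window` matches
    wins = [1 if m["result"] == "W" else 0 for m in recent]
    losses = [1 if m["result"] == "L" else 0 for m in recent]
    elo = [m["elo_change"] for m in recent]
    return {
        "wins": summarise(wins),
        "losses": summarise(losses),
        "elo_gained": summarise(elo),
    }
```

The same function would be run once for each team in a fixture, so that every match row carries both the team's and the opposing team's summaries.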

These features are then passed through various classification models that can predict multiple classes (events). In this instance, the events are Win, Draw and Loss.
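
The post does not say which classifiers were used, so the following is only a sketch of what "predicting multiple classes" means in practice: a multi-class model produces one score per class, and the scores are turned into probabilities. Softmax (as used in multinomial logistic regression) is assumed here as one common way to do that; the class order and function names are illustrative.

```python
import math

CLASSES = ("Win", "Draw", "Loss")


def softmax(scores):
    """Convert raw class scores into probabilities that sum to 1."""
    shift = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def predict(scores):
    """Return (predicted class, probability per class) for one match."""
    probs = softmax(scores)
    by_class = dict(zip(CLASSES, probs))
    # The predicted event is the class with the highest probability.
    return max(by_class, key=by_class.get), by_class
```

The per-class probabilities matter as much as the predicted label, because they are what an expected-gain calculation needs.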

Summary of Model Performance

Match Day 1 World Cup Machine Learning Model Performance

  • 50% prediction accuracy
    The model predicted 8 out of 16 games correctly (50%)
  • We tested 3 betting strategies (using odds we captured from Ladbrokes):
    • Bet $1 on every predicted outcome
      Total staked of $16 resulted in a profit of $4.07, or $0.25 for every $1 staked
    • Bet $1 on games where the derived expected gain is greater than $1
      Total staked of $6 resulted in a profit of $4.70, or $0.78 for every $1 staked
    • Bet $1 on games where the derived expected gain is greater than $0
      Total staked of $10 resulted in a profit of $4.31, or $0.43 for every $1 staked
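
The per-dollar figures above are simply profit divided by total staked, which the sketch below reproduces from the reported numbers. It also shows one plausible way to filter bets by expected gain. The post does not give its expected-gain formula, so `expected_gain` is an assumed definition (expected net profit per bet, using decimal odds), and `select_bets` and its input layout are hypothetical.

```python
def return_per_dollar(profit, total_staked):
    """Profit earned for every $1 staked."""
    return profit / total_staked


# Figures reported above for the three strategies: (profit, total staked).
strategies = {
    "every predicted outcome": (4.07, 16),
    "expected gain > $1": (4.70, 6),
    "expected gain > $0": (4.31, 10),
}
per_dollar = {
    name: round(return_per_dollar(profit, staked), 2)
    for name, (profit, staked) in strategies.items()
}
# per_dollar values: 0.25, 0.78 and 0.43, matching the summary above.


def expected_gain(prob_win, decimal_odds, stake=1.0):
    """ASSUMED definition: expected net profit of a bet at decimal odds.

    A winning bet returns stake * (decimal_odds - 1); a losing bet costs the stake.
    """
    return prob_win * (decimal_odds - 1.0) * stake - (1.0 - prob_win) * stake


def select_bets(candidates, threshold):
    """Keep only the bets whose expected gain exceeds `threshold`."""
    return [
        c for c in candidates
        if expected_gain(c["prob"], c["odds"]) > threshold
    ]
```

Raising the threshold trades the number of bets placed for a higher required edge, which is the difference between the second and third strategies.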

Observations from machine learning World Cup prediction model performance results

  • Although the sample size is small, the model can still help support betting decisions
  • Using a model to derive outcome probabilities and expected gains can inform decisions about gains and risks. Scenario B, which places bets only where the derived expected gain is greater than $1, achieved the highest gain per $1 staked
  • Mexico's win helped tip all scenarios into a positive return. Had Mexico lost, all scenarios would have been loss-making, although Scenario B would still have resulted in lower losses. Using the model outputs can therefore help manage the size of losses. Because the sample size is small, profits can vary greatly. Furthermore, the predictions have to be reviewed to understand true performance.
  • Our review only looks at bets of equal size. It does not take into consideration a limited budget or the re-investment of gains. One can further optimise risk and return by allocating bets according to budget, the probability of an event occurring and expected gains, and by re-investing returns.
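
One well-known way to size bets by budget, win probability and odds is the Kelly criterion. It is not something the review above used; the sketch below is offered only as an illustration of the last observation. It assumes decimal odds, treats each bet independently (a simplification when several bets run at once), and uses a hypothetical `scale` parameter to bet a fraction of the full Kelly stake.

```python
def kelly_fraction(prob_win, decimal_odds):
    """Fraction of the bankroll to stake under the Kelly criterion.

    f = (b * p - q) / b, where b is the net odds (decimal odds minus 1),
    p the win probability and q = 1 - p. Negative values mean "do not bet".
    """
    b = decimal_odds - 1.0
    if b <= 0:
        return 0.0
    f = (b * prob_win - (1.0 - prob_win)) / b
    return max(f, 0.0)


def allocate(bankroll, bets, scale=0.5):
    """Stake a scaled Kelly fraction on each bet, never exceeding the bankroll.

    `bets` maps a bet name to (win probability, decimal odds).
    """
    stakes = {
        name: bankroll * scale * kelly_fraction(prob, odds)
        for name, (prob, odds) in bets.items()
    }
    total = sum(stakes.values())
    if total > bankroll:
        # Shrink every stake proportionally so the budget is respected.
        stakes = {name: s * bankroll / total for name, s in stakes.items()}
    return stakes
```

Re-investment of returns would correspond to calling `allocate` again with the updated bankroll after each round of matches.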

Applying observations to other events

  • Machine learning models can assist businesses in optimising their investment decisions to maximise gains or minimise risk

    Machine learning can be used to optimise decisions and manage risk. The key challenge is identifying the events you are trying to predict, quantifying the gains/losses of these events and ensuring that the interactions used to predict events are captured and accurately reflected in your data.

  • Running a model over data and validating it is the simple part. One also needs to:
    • Identify and isolate the key events that one is trying to predict (What are the Win/Lose/Draw events of interest to you? Is it customer churn? A customer upgrade?)
    • Ensure that these key events can be captured and identified via the data (How do you identify your Win/Lose/Draw events, and is the data captured? For example, an account closure date to identify churned customers, or an account upgrade date)
    • Identify the potential gains when an event occurs (What is the return per $ invested if a customer upgrades, or the loss if a customer churns?)
    • Hypothesise interactions that may predict an event, and ensure these interactions are captured and accurately reflected in the data
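
To make the checklist above concrete, here is a minimal sketch for the churn/upgrade example: one function labels an account's event from its dates, and another computes the expected return per $ invested from event probabilities and payoffs. The field names (`closure_date`, `upgrade_date`), the event labels and both function names are hypothetical.

```python
from datetime import date


def label_event(account, as_of):
    """Label an account 'churn', 'upgrade' or 'no_change' as of a given date.

    `account` is a hypothetical dict with optional 'closure_date' and
    'upgrade_date' values (datetime.date or None).
    """
    closed = account.get("closure_date")
    upgraded = account.get("upgrade_date")
    if closed is not None and closed <= as_of:
        return "churn"
    if upgraded is not None and upgraded <= as_of:
        return "upgrade"
    return "no_change"


def expected_return_per_dollar(event_probs, payoffs, cost):
    """Expected payoff across all events, divided by the amount invested.

    `event_probs` and `payoffs` are dicts keyed by event label; losses are
    entered as negative payoffs.
    """
    expected_payoff = sum(event_probs[e] * payoffs[e] for e in event_probs)
    return expected_payoff / cost
```

The event probabilities would come from the model, while the payoffs and cost have to be supplied by the business, which is why quantifying them is listed as a separate step.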
