Discovering the Drivers of Football Match Outcomes with Data Mining

CARPITA, Maurizio;SANDRI, Marco;SIMONETTO, Anna;ZUCCOLOTTO, Paola
2015-01-01

Abstract

This paper investigates, across time, the relationship between the outcome of a football match (win, lose or draw) and a set of variables describing the game actions, by analyzing data from four consecutive yearly championships. The aim of the study is to discover the factors that lead to winning a match. More precisely, the goal is threefold: to select, from hundreds of covariates, those that most strongly affect the probability of winning a match; to recognize regularities across time by identifying the variables whose importance is confirmed in different analyses; and to construct a small number of composite indicators that can be interpreted as drivers of match outcome. These tasks are carried out using the Random Forest machine learning algorithm, to select the most important variables, and Principal Component Analysis, to summarize them into a small number of drivers. Variable selection is performed using the novel approach developed by Sandri and Zuccolotto.
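The two-stage workflow described in the abstract (rank covariates with a Random Forest, then summarize the retained ones with Principal Component Analysis) can be sketched roughly as follows. This is only an illustration under several assumptions: the data are synthetic, the outcome is simplified to win versus not win, the function name `select_and_summarize` and its parameters are hypothetical, and the ranking uses scikit-learn's standard impurity-based importance as a stand-in, not the Sandri and Zuccolotto variable selection approach used in the paper.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestClassifier
from sklearn.preprocessing import StandardScaler


def select_and_summarize(X, y, n_keep=6, n_drivers=2, seed=0):
    """Rank covariates with a Random Forest, then compress the top ones with PCA.

    Hypothetical helper for illustration; not the paper's procedure.
    """
    forest = RandomForestClassifier(n_estimators=300, random_state=seed)
    forest.fit(X, y)
    # Standard impurity-based importance (a stand-in, NOT the
    # Sandri-Zuccolotto measure referred to in the abstract).
    ranking = np.argsort(forest.feature_importances_)[::-1]
    selected = np.sort(ranking[:n_keep])
    # Standardize the retained covariates so PCA is not dominated by scale.
    Z = StandardScaler().fit_transform(X[:, selected])
    pca = PCA(n_components=n_drivers)
    drivers = pca.fit_transform(Z)  # composite indicators, one row per match
    return selected, drivers, pca.explained_variance_ratio_


# Synthetic stand-in data: 400 "matches", 40 "covariates" (assumed sizes).
rng = np.random.default_rng(0)
n_matches, n_covariates = 400, 40
X = rng.normal(size=(n_matches, n_covariates))
# Assumed signal: only columns 0, 1 and 2 influence the outcome.
score = X[:, 0] + X[:, 1] - X[:, 2] + rng.normal(scale=0.5, size=n_matches)
y = (score > 0).astype(int)  # 1 = win, 0 = not win (simplified outcome)

selected, drivers, explained = select_and_summarize(X, y)
print("selected covariate indices:", selected)
print("explained variance ratio:", explained)
```

To look for regularities across time in the spirit of the abstract, one could run the selection step separately on each championship and keep only the covariates that are selected in every season.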

Use this identifier to cite or link to this document: https://hdl.handle.net/11379/335109