Predicting the scores of Campeonato Brasileiro de Futebol players with machine learning and choosing the best squad with linear programming
This project tries to build the best team in Cartola FC in three steps. First, I ran a regression to predict each player's score. Second, I predicted the winning and losing teams of each round with the classifier from my repository Machine-learning-soccer-classification, so that the linear model does not pick players from teams that are playing against each other, which is a bad choice most of the time. Finally, I used linear programming to choose the best squad by maximizing the predicted scores multiplied by weights that reflect whether the team is playing at home or away, whether the match is a classic (derby), and the team's position in the Brasileirão table.
I used the data from the repository caRtola: https://github.com/henriquepgomide/caRtola
Python in Jupyter Notebook
I tried several machine learning approaches to compare their results: XGBoost, Extra Trees, Random Forest and MLP.
The meaning of each feature and its weight can be found on the site: https://www.cartolafcbrasil.com.br/scouts
train = 70% of data and test = 30%
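As a rough illustration of the 70/30 split, here is a minimal sketch using only the standard library. The function name `train_test_split_rows`, the seed, and the placeholder `rows` are assumptions made for this example, not code from the notebook; in practice scikit-learn's `train_test_split(X, y, test_size=0.3)` is the usual way to do this.

```python
import random


def train_test_split_rows(rows, test_fraction=0.3, seed=42):
    """Shuffle the rows and return (train, test) lists.

    Hypothetical helper: the name, default seed and signature are
    illustrative assumptions, not taken from the project.
    """
    shuffled = list(rows)
    # A seeded generator keeps the split reproducible between runs.
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


# Placeholder data standing in for the player-by-round records.
rows = list(range(100))
train, test = train_test_split_rows(rows)
print(len(train), len(test))
```

Shuffling before splitting avoids putting only the latest rounds in the test set; if the goal is to evaluate on future rounds, splitting by round number instead may be the better choice.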
The graph shows that playing at home is very important in the Campeonato Brasileiro: teams win more often at home than away, which is why I added a home weight in the linear programming step.
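The weighting idea can be sketched as a small function that scales a predicted score before it enters the optimization. Every number here (`home_weight`, `classic_weight`, `table_spread`) and the linear table adjustment are hypothetical choices for illustration; the actual weights used in the project are not stated in this README.

```python
def weighted_score(predicted, is_home, is_classic, table_position,
                   home_weight=1.10, classic_weight=0.95,
                   n_teams=20, table_spread=0.10):
    """Scale a predicted score by match-context weights.

    All default values are illustrative assumptions.
    """
    # Home advantage: boost players whose team plays at home.
    weight = home_weight if is_home else 1.0
    # Classics tend to be tighter matches, so damp the prediction slightly.
    if is_classic:
        weight *= classic_weight
    # Table position: 1.0 for the leader, 0.0 for the last-placed team,
    # mapped linearly to a factor in [1 - spread/2, 1 + spread/2].
    rank_fraction = (n_teams - table_position) / (n_teams - 1)
    weight *= 1.0 + table_spread * (rank_fraction - 0.5)
    return predicted * weight


home_leader = weighted_score(10.0, is_home=True, is_classic=False, table_position=1)
away_last = weighted_score(10.0, is_home=False, is_classic=False, table_position=20)
print(home_leader > away_last)
```

Multiplying the factors keeps each effect independent, so any one of them can be switched off by setting its weight to 1.0 (or `table_spread` to 0).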
Using data from round 14 of 2021, the classification predictions are:
Using 108 cartoletas, the best squad chosen was:
Total price = 107.84
Predicted score = 104.13
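The selection problem behind this result (maximize the total weighted score, stay within the budget, and avoid players from teams facing each other) can be sketched on a toy pool. This example uses exhaustive search instead of a linear programming solver, which is only feasible because the pool is tiny; at real scale an integer LP solver handles the same objective and constraints. The players, teams, prices, scores and the function `best_squad` are all made up for illustration, and formation constraints (number of players per position) are left out.

```python
from itertools import combinations

# Hypothetical pool: (name, team, price, weighted predicted score).
PLAYERS = [
    ("P1", "T1", 10.0, 8.0),
    ("P2", "T2", 9.0, 7.5),
    ("P3", "T1", 6.0, 5.0),
    ("P4", "T3", 7.0, 6.0),
    ("P5", "T4", 5.0, 4.5),
    ("P6", "T3", 4.0, 3.0),
]
# Hypothetical fixtures for the round: T1 plays T2, T3 plays T4.
MATCHES = [("T1", "T2"), ("T3", "T4")]


def best_squad(players, matches, budget, squad_size):
    """Return (squad, total_score) for the best feasible squad.

    Brute-force stand-in for the integer LP: it checks every
    combination, so it only works for very small pools.
    """
    opponent = {}
    for home, away in matches:
        opponent[home] = away
        opponent[away] = home

    best, best_score = None, float("-inf")
    for squad in combinations(players, squad_size):
        # Budget constraint: total price must not exceed the cartoletas.
        if sum(p[2] for p in squad) > budget:
            continue
        # Opponent constraint: no two players from teams facing each other.
        teams = {p[1] for p in squad}
        if any(opponent.get(team) in teams for team in teams):
            continue
        score = sum(p[3] for p in squad)
        if score > best_score:
            best, best_score = squad, score
    return best, best_score


squad, total = best_squad(PLAYERS, MATCHES, budget=24.0, squad_size=3)
print([p[0] for p in squad], total)
```

In this toy pool, P1, P2 and P5 would fit the budget with a higher total, but P1 and P2 belong to teams that face each other, so the search settles on a squad without that clash.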