Football match prediction using machine learning algorithms in Jupyter Notebook

Jupyter Notebook 100.00%

python jupyter-notebook machine-learning svm-classifier logistic-regression naive-bayes exploratory-data-analysis

predicting-football-match-outcome-using-machine-learning's Introduction

Predicting Football Match Outcome using Machine Learning

I used datasets from two sites for this project:
1. https://www.kaggle.com/hugomathien/soccer
2. http://football-data.co.uk/data.php

The dataset from the Kaggle website was in SQLite format, but I was not able to upload the SQLite file, so I have uploaded CSV files for all the tables.
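For readers who want to reproduce the SQLite-to-CSV step, the following is a minimal sketch using only the Python standard library. It is not the author's actual export script; the function name `export_tables_to_csv` and the file names in the usage note are assumptions made for illustration.

```python
import csv
import sqlite3
from pathlib import Path


def export_tables_to_csv(db_path, out_dir):
    """Write every user table in the SQLite file to <out_dir>/<table>.csv."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    conn = sqlite3.connect(db_path)
    try:
        # sqlite_master lists every table; names starting with "sqlite" followed
        # by one more character (SQLite's internal tables) are skipped.
        names = [row[0] for row in conn.execute(
            "SELECT name FROM sqlite_master "
            "WHERE type = 'table' AND name NOT LIKE 'sqlite_%' ORDER BY name"
        )]
        for name in names:
            # Quote the table name so unusual characters do not break the query.
            quoted = '"' + name.replace('"', '""') + '"'
            cursor = conn.execute(f"SELECT * FROM {quoted}")
            header = [col[0] for col in cursor.description]
            target = out_dir / f"{name}.csv"
            with open(target, "w", newline="", encoding="utf-8") as handle:
                writer = csv.writer(handle)
                writer.writerow(header)
                writer.writerows(cursor)
            written.append(target)
    finally:
        conn.close()
    return written
```

A call such as `export_tables_to_csv("database.sqlite", "csv")` would then produce one CSV file per table; both paths here are placeholders.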

This dataset has the tables Country, League, Match, Player, Player Attributes, Team, Team Attributes and sequences. It covers more than 25,000 matches, 10,000 players, and 11 European countries with their lead championship from 2008 to 2016. It also includes player and team attributes sourced from EA Sports' FIFA video game series and betting odds from up to 10 providers.

I performed exploratory data analysis on this dataset.

Later I downloaded data from the football-data.co.uk website, which had even more relevant information, and used it to perform the prediction.

I applied Logistic Regression, Naive Bayes and Support Vector Machine algorithms to the dataset, with SVM giving the highest accuracy of 61.29%.
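As a rough illustration of comparing the three model families named above, here is a minimal sketch with scikit-learn. It runs on synthetic data, not on the football-data.co.uk files, so the printed accuracies say nothing about the 61.29% figure. The feature layout, the class labels and the choice of `MultinomialNB` with an RBF-kernel `SVC` are assumptions.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import MultinomialNB
from sklearn.svm import SVC

# Synthetic stand-in data (assumption): four non-negative count features,
# with the outcome loosely tied to the difference of the first two.
rng = np.random.default_rng(42)
X = rng.integers(0, 15, size=(400, 4))
margin = X[:, 0] - X[:, 1] + rng.normal(0, 3, size=400)
y = np.where(margin > 2, "H", np.where(margin < -2, "A", "D"))

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

models = {
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "Multinomial Naive Bayes": MultinomialNB(),
    "SVM": SVC(kernel="rbf"),
}

# Fit each model on the same split and record its held-out accuracy.
accuracy = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    accuracy[name] = model.score(X_test, y_test)
    print(f"{name}: {accuracy[name]:.3f}")
```

Using one shared train/test split keeps the comparison between models like for like.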

predicting-football-match-outcome-using-machine-learning's Issues

MultinomialNB Cannot accept negative Alpha

Hi,

I was trying to run your code on my machine with Python 3.7, and at the Multinomial Naive Bayes step I get the following error:

ValueError Traceback (most recent call last)
in
4 for i in range(-1000,1000,50):
5 clf1 = MultinomialNB(alpha=i)
----> 6 clf1.fit(X_train,y_train)
7 clf1.fit(X_train_2,y_train)
8 scores = cross_val_score(clf1, X_train, y_train, cv=10)

c:\users\johnm\appdata\local\programs\python\python37\lib\site-packages\sklearn\naive_bayes.py in fit(self, X, y, sample_weight)
609 dtype=np.float64)
610 self._count(X, Y)
--> 611 alpha = self._check_alpha()
612 self._update_feature_log_prob(alpha)
613 self._update_class_log_prior(class_prior=class_prior)

c:\users\johnm\appdata\local\programs\python\python37\lib\site-packages\sklearn\naive_bayes.py in _check_alpha(self)
471 if np.min(self.alpha) < 0:
472 raise ValueError('Smoothing parameter alpha = %.1e. '
--> 473 'alpha should be > 0.' % np.min(self.alpha))
474 if isinstance(self.alpha, np.ndarray):
475 if not self.alpha.shape[0] == self.feature_count_.shape[1]:

ValueError: Smoothing parameter alpha = -1.0e+03. alpha should be > 0.

It would appear that MultinomialNB cannot accept a negative alpha value. How did you manage to run the code with a negative alpha?
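The traceback shows scikit-learn rejecting the first value produced by `range(-1000, 1000, 50)`, since alpha is a smoothing parameter and the installed version checks that it is not negative. One possible workaround, sketched below, is to search only over positive values. The data here is a synthetic stand-in for the notebook's `X_train` and `y_train`, and the alpha grid is an arbitrary assumption. Whether the original notebook ran on a release without this check is a guess that the issue does not confirm.

```python
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import MultinomialNB

# Synthetic stand-in for the notebook's X_train / y_train (assumption):
# non-negative count features and three outcome classes.
rng = np.random.default_rng(0)
X_train = rng.integers(0, 20, size=(300, 6))
y_train = rng.integers(0, 3, size=300)

# Strictly positive smoothing values only; a negative alpha raises ValueError.
alphas = [0.01, 0.1, 1.0, 10.0, 50.0, 100.0, 500.0, 1000.0]

# Mean 10-fold cross-validation accuracy for each candidate alpha.
mean_scores = {}
for alpha in alphas:
    clf = MultinomialNB(alpha=alpha)
    scores = cross_val_score(clf, X_train, y_train, cv=10)
    mean_scores[alpha] = scores.mean()

best_alpha = max(mean_scores, key=mean_scores.get)
print(f"best alpha: {best_alpha}, mean CV accuracy: {mean_scores[best_alpha]:.3f}")
```

Because the labels above are random, the scores are only meaningful as a check that the loop runs; on real data the same loop would pick the alpha with the best cross-validated accuracy.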

Error in Code

Hi, I cannot get the proper table.Team column. The column shows all True values instead of each team name. What should I do? Please help me. I attached a screenshot below.

Error

In [38]:
#Extract necessary features from the data file
feature_table = df.iloc[:,:23]
print(table)

#Full Time Result(FTR), Home Shots on Target(HST), Away Shots on Target(AST), Home Corners(HC), Away Corners(AC)
feature_table = feature_table[['HomeTeam','AwayTeam','FTR','HST','AST','HC','AC']]
print(feature_table)
#Home Attacking Strength(HAS), Home Defensive Strength(HDS), Away Attacking Strength(AAS), Away Defensive Strength(ADS)
f_HAS = []
f_HDS = []
f_AAS = []
f_ADS = []
for index,row in feature_table.iterrows():
    f_HAS.append(table[table['Team'] == row['HomeTeam']]['HAS'].values[0])
    f_HDS.append(table[table['Team'] == row['HomeTeam']]['HDS'].values[0])
    f_AAS.append(table[table['Team'] == row['AwayTeam']]['AAS'].values[0])
    f_ADS.append(table[table['Team'] == row['AwayTeam']]['ADS'].values[0])

feature_table['HAS'] = f_HAS
feature_table['HDS'] = f_HDS
feature_table['AAS'] = f_AAS
feature_table['ADS'] = f_ADS
feature_table

This error appears and I cannot proceed:

IndexError Traceback (most recent call last)
in
10 f_ADS = []
11 for index,row in feature_table.iterrows():
---> 12 f_HAS.append(table_16[table_16['Team'] == row['HomeTeam']]['HAS'].values[0])
13 f_HDS.append(table_16[table_16['Team'] == row['HomeTeam']]['HDS'].values[0])
14 f_AAS.append(table_16[table_16['Team'] == row['AwayTeam']]['AAS'].values[0])
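An `IndexError` on `.values[0]` means the filtered selection was empty: some `HomeTeam` or `AwayTeam` value has no matching row in the lookup table (`table_16` in the traceback, `table` in the posted snippet), or the `Team` column does not hold team names at all. The sketch below, on toy data with hypothetical strength values, shows one way to list the unmatched teams and to attach the strengths with a merge instead of a row-by-row lookup. It assumes each team appears once in the lookup table.

```python
import pandas as pd

# Toy stand-ins (assumptions): a strength table and a match list.
table = pd.DataFrame({
    "Team": ["Arsenal", "Chelsea"],
    "HAS": [1.2, 1.1], "HDS": [0.8, 0.9],
    "AAS": [1.0, 1.3], "ADS": [0.7, 1.0],
})
feature_table = pd.DataFrame({
    "HomeTeam": ["Arsenal", "Chelsea", "Leeds"],
    "AwayTeam": ["Chelsea", "Arsenal", "Arsenal"],
})

# Teams that appear in the matches but have no row in the strength table.
teams_in_matches = set(feature_table["HomeTeam"]) | set(feature_table["AwayTeam"])
missing = sorted(teams_in_matches - set(table["Team"]))
print("teams without a row in table:", missing)

# Left merges keep every match; unmatched teams get NaN instead of raising.
home = table[["Team", "HAS", "HDS"]].rename(columns={"Team": "HomeTeam"})
away = table[["Team", "AAS", "ADS"]].rename(columns={"Team": "AwayTeam"})
merged = feature_table.merge(home, on="HomeTeam", how="left")
merged = merged.merge(away, on="AwayTeam", how="left")
print(merged)
```

If `missing` is not empty, the likely causes are a lookup table built from a different season than the match file, or differently spelled team names.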

Random Forest Classifier gives 100% accuracy

I applied the random forest algorithm to merged_dataset.csv. Of the 6080 rows, I used 80% for training and the remaining 20% for testing. The trained model predicted the target with 100% accuracy. I used the FTR attribute as the target.

CODE :

`from sklearn.preprocessing import LabelEncoder
import pandas as pd
from sklearn.model_selection import train_test_split
import numpy as np
from sklearn.ensemble import RandomForestClassifier

dataframe = pd.read_csv('./dataset/Merged_dataset.csv')
print(dataframe.head())

df = dataframe.apply(LabelEncoder().fit_transform)
print(df.head())

target = np.array(df['FTR'])
features= df.drop(['id','FTR','FTAG','FTHG'], axis = 1)
features = np.array(features)

#Split the data into training and testing sets
train_features, test_features, train_labels, test_labels = train_test_split(features, target, test_size = 0.20, random_state = 42)

model = RandomForestClassifier()
model.fit(train_features, train_labels)

predicted_labels = model.predict(test_features)

print("actual Test labels")
print(test_labels)
print("")
print("predicted test labels")
print(predicted_labels)

#calculate accuracy
count = 0
totalCount = len(predicted_labels)
for i in range(len(test_labels)):
    if(predicted_labels[i] == test_labels[i]):
        count = count+1

print("Accuracy : "+str((count/totalCount)*100)+" %")
`

OUTPUT :

Isn't 100% accuracy too good to be true? If I apply Logistic Regression, the model's accuracy is 68%.

Could you point out what is wrong in my code, or which concept I am missing while training my model?
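Perfect test accuracy on match results often points to target leakage: a feature that still encodes the final score even after `FTHG` and `FTAG` are dropped. The columns of `Merged_dataset.csv` are not shown in the issue, so this is only a hypothesis. The sketch below shows one way to look for such a column, by measuring how well each feature alone determines `FTR`. The helper `single_feature_purity` and the `GoalDiff` column are hypothetical and exist only for this example.

```python
import pandas as pd


def single_feature_purity(df, target):
    """Score each column by how well it alone predicts `target`, using the
    most common target value within each distinct value of that column."""
    rows = []
    for col in df.columns:
        if col == target:
            continue
        # Size of the largest target class inside each group of equal values.
        majority = df.groupby(col)[target].agg(lambda s: s.value_counts().iloc[0])
        rows.append({
            "feature": col,
            "purity": majority.sum() / len(df),
            "n_unique": df[col].nunique(),
        })
    report = pd.DataFrame(rows).sort_values("purity", ascending=False)
    return report.reset_index(drop=True)


# Toy data (assumption): GoalDiff is a hypothetical column derived from the score.
toy = pd.DataFrame({
    "FTR":      ["H", "A", "D", "H", "A", "D", "H", "A"],
    "GoalDiff": [2, -1, 0, 1, -3, 0, 1, -1],
    "HST":      [5, 5, 4, 4, 5, 5, 4, 4],
})
report = single_feature_purity(toy, "FTR")
print(report)
```

A purity close to 1.0 with few unique values marks a column worth inspecting. Columns with almost as many unique values as rows (an id, for example) score high trivially and should be read with that in mind.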

prathameshtari / predicting-football-match-outcome-using-machine-learning

predicting-football-match-outcome-using-machine-learning's Issues

Who can explain prediction codes and normalization?
