Predicting whether a candidate wins the election from his/her features, using logistic regression classification.
With over 600 million voters voting for 8500+ candidates across 543 constituencies, the general elections in the world's largest democracy are a potential goldmine of data.
While separate analyses exist of the votes each candidate received and of each candidate's personal information, there was no comprehensive analysis that combined both.
A quick overview of the contents of this exploratory data analysis project:
- Dataset Source
- Resources used
- Quick descriptions of the questions explored
- [NB Viewer Links](#NB_Viewer)
- The dataset used in this analysis project can be downloaded from here
- The data was collected by Kaggle user Prakrut Chahaun.
- The code is written in Python in a Jupyter notebook.
- The pandas library is used for data manipulation and cleaning.
- The matplotlib and seaborn libraries are used for data visualization.
- The scikit-learn library is used for the prediction model.
- Which constituencies had the highest number of candidates?
- Which parties fielded the most candidates?
- Which candidates had the highest percentage of votes?
- How was the age of candidates distributed?
- Which candidates had the highest number of criminal records?
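
The questions above, and the logistic regression step, can be sketched roughly as follows. This is a minimal illustration on a tiny made-up table, not the notebook's actual code: the column names (`constituency`, `party`, `name`, `age`, `criminal_cases`, `vote_share_pct`, `winner`) and all values are assumptions and may not match the real dataset's schema.

```python
import pandas as pd
from sklearn.linear_model import LogisticRegression

# Hypothetical toy data: column names and values are assumptions,
# chosen only to illustrate the kind of queries the project explores.
df = pd.DataFrame({
    "constituency": ["A", "A", "A", "B", "B", "C"],
    "party": ["P1", "P2", "IND", "P1", "P2", "P1"],
    "name": ["c1", "c2", "c3", "c4", "c5", "c6"],
    "age": [45, 52, 38, 61, 29, 47],
    "criminal_cases": [0, 3, 1, 0, 0, 5],
    "vote_share_pct": [48.0, 40.5, 11.5, 55.0, 45.0, 100.0],
    "winner": [1, 0, 0, 1, 0, 1],
})

# Which constituencies had the highest number of candidates?
candidates_per_constituency = df["constituency"].value_counts()

# Which parties fielded the most candidates?
candidates_per_party = df["party"].value_counts()

# Which candidates had the highest percentage of votes?
top_vote_share = df.nlargest(3, "vote_share_pct")[["name", "vote_share_pct"]]

# How was the age of candidates distributed? (summary statistics only)
age_summary = df["age"].describe()

# Which candidates had the highest number of criminal records?
most_cases = df.nlargest(3, "criminal_cases")[["name", "criminal_cases"]]

# Logistic regression on candidate features. Vote share is deliberately
# left out of the features because it directly reveals the outcome.
X = df[["age", "criminal_cases"]]
y = df["winner"]
model = LogisticRegression(max_iter=1000).fit(X, y)
win_probability = model.predict_proba(X)[:, 1]  # P(winner == 1) per row
```

On the real data, the feature list would be whatever candidate attributes the dataset provides, and the model would be evaluated on a held-out split (for example with scikit-learn's `train_test_split`) rather than on the rows it was trained on.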