Data Modelling on 2018 US midterm Election Data and US Demographic Data. Creating regression, classification and clustering models.

election-data classification-models clustering-models election-modelling data-science python pandas numpy scikit-learn sklearn statistics svm naive-bayes-classification linear-regression lasso-regression knearest-neighbor-classification kmeans-clustering dbscan-clustering choropleth-map matplotlib

Data Modelling on US Election Data

This project was done as a course assignment for CS418: Introduction to Data Science at the University of Illinois at Chicago during the Fall 2019 term, along with teammates Yushenli1996 and nathanhe789.


The dataset was partly provided to us by the professor. There were 2 CSV files: one was a merged file of demographic and election data for counties of certain US states from the 2016 US Senate elections (generated in this project), and the other contained only the demographic data of some US counties.

The merged data file was meant to be used for training machine learning classification/predictive models to predict the winning political party for a particular county, while the demographic data file was to be used as the testing set for the models.

The merged data file was partitioned into training and validation sets using the holdout method: 75% of the data was allocated for training the models and the remaining 25% for validating them.

Additionally, the numeric attributes in the training and validation sets were standardized to have a mean of 0 and variance of 1.
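The split and standardization described above can be sketched as follows. This is a minimal illustration rather than the notebook's actual code: the DataFrame, its column names (`median_income`, `percent_urban`, `party`) and the random data are hypothetical stand-ins for the merged CSV, and fitting the scaler on the training set only is an assumption about how the standardization was applied.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Hypothetical stand-in for the merged demographic-election data.
rng = np.random.default_rng(0)
merged = pd.DataFrame({
    "median_income": rng.normal(50_000, 12_000, size=200),
    "percent_urban": rng.uniform(0, 100, size=200),
    "party": rng.integers(0, 2, size=200),  # assumed coding: 1 = Democratic, 0 = Republican
})

numeric_cols = ["median_income", "percent_urban"]
X = merged[numeric_cols]
y = merged["party"]

# Holdout method: 75% of the rows for training, 25% for validation.
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Standardize numeric attributes to mean 0 and variance 1.
# Assumption: the scaler is fitted on the training set and then reused
# on the validation set, so no validation statistics leak into training.
scaler = StandardScaler()
X_train_std = scaler.fit_transform(X_train)
X_val_std = scaler.transform(X_val)
```

With this approach the training columns have exactly mean 0 and variance 1, while the validation columns are only approximately standardized because they reuse the training statistics.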


The main purpose of the assignment was to perform Data Modelling on the merged demographic-election data. The data modelling tasks performed on the dataset are:

  • Build Linear Regression Model

    • Using all attributes
    • By selecting different attributes to find the best set of attributes
    • Using LASSO regression
  • Build Classification Models and select the 2 best-performing models

    • Using all attributes
    • By selecting different attributes to find the best set of attributes
  • Build Clustering Models and select the 2 best-performing models

    • Using all attributes
    • By selecting different attributes to find the best set of attributes
  • Predict the Democratic and Republican party votes of each county by applying the best-performing regression model to the testing set of demographic data

  • Predict the winning political party in each county by applying the best-performing classification model to the testing set of demographic data

  • Create a choropleth map to visualize the majority political party of each county as predicted by the best-performing classification model


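    Using the political party attribute in the dataset

    [Image: choropleth map of the majority party per county, from the dataset]

    Using the political party predicted by SVM

    [Image: choropleth map of the majority party per county, as predicted by SVM]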
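As a rough illustration of the three modelling tasks listed above, the sketch below fits a plain linear regression and a LASSO regression, an SVM classifier, and a k-means clustering model with scikit-learn. It runs on synthetic data from `make_regression` and `make_classification`, not the project's CSV files, and the hyperparameters (`alpha=1.0`, the RBF kernel, `n_clusters=2`) are assumptions chosen for the example rather than the values used in the notebook.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_classification, make_regression
from sklearn.linear_model import Lasso, LinearRegression
from sklearn.metrics import accuracy_score, r2_score
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# --- Regression: all attributes vs. LASSO (synthetic stand-in data) ---
X_reg, y_reg = make_regression(
    n_samples=300, n_features=8, n_informative=3, noise=10.0, random_state=0
)
Xr_train, Xr_val, yr_train, yr_val = train_test_split(
    X_reg, y_reg, test_size=0.25, random_state=0
)

linear = LinearRegression().fit(Xr_train, yr_train)
lasso = Lasso(alpha=1.0).fit(Xr_train, yr_train)  # alpha is an assumed value

r2_linear = r2_score(yr_val, linear.predict(Xr_val))
r2_lasso = r2_score(yr_val, lasso.predict(Xr_val))
# LASSO can shrink some coefficients to exactly zero, which acts as
# a form of attribute selection.
n_kept = int(np.sum(lasso.coef_ != 0))

# --- Classification: SVM (synthetic stand-in data) ---
X_clf, y_clf = make_classification(
    n_samples=300, n_features=8, n_informative=4, random_state=0
)
Xc_train, Xc_val, yc_train, yc_val = train_test_split(
    X_clf, y_clf, test_size=0.25, random_state=0
)

svm = SVC(kernel="rbf").fit(Xc_train, yc_train)  # kernel choice is assumed
val_pred = svm.predict(Xc_val)
val_accuracy = accuracy_score(yc_val, val_pred)

# --- Clustering: k-means with two clusters (one per party, assumed) ---
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X_clf)

print(f"R^2 linear: {r2_linear:.3f}, R^2 LASSO: {r2_lasso:.3f}, "
      f"non-zero LASSO coefficients: {n_kept}")
print(f"SVM validation accuracy: {val_accuracy:.3f}")
print(f"k-means cluster sizes: {np.bincount(kmeans.labels_)}")
```

Comparing validation scores across such candidates, as done here with R² and accuracy, is one common way to pick the best-performing model before applying it to the testing set.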


Check out the Jupyter Notebook or the project report to see the full data science workflow as implemented.
