In this project, I explore a dataset of employee records to predict employee attrition and identify the main factors behind it. The provided Google Colab link takes you directly to the notebook where all my models were trained. The original data comes from https://www.kaggle.com/patelprashant/employee-attrition. The data folder contains three files: attrition.csv, the renamed file downloaded from Kaggle, plus attrition_train.csv and attrition_test.csv. The latter two were created with split_data.py so that the notebook reuses one fixed train/test split instead of generating a different one each time I open it.
attrition_model.ipynb: Notebook containing model training and feature engineering
data: Folder for data
split_data.py: Helper script to construct training and test datasets
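
The contents of split_data.py are not reproduced here. As a rough illustration of how a one-time, reproducible split like this could be written, here is a minimal sketch using only the Python standard library. The 80/20 ratio, the seed value of 42, and the helper names split_rows, write_csv, and split_file are all assumptions made for the example, not details taken from the actual script.

```python
import csv
import random


def split_rows(rows, test_fraction=0.2, seed=42):
    """Shuffle rows with a fixed seed and return (train, test) lists.

    test_fraction and seed are illustrative defaults (assumptions).
    """
    shuffled = list(rows)  # copy so the caller's list is left untouched
    random.Random(seed).shuffle(shuffled)  # same seed gives the same order
    n_test = int(round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def write_csv(path, header, rows):
    """Write a header row followed by the data rows to a CSV file."""
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(header)
        writer.writerows(rows)


def split_file(source, train_path, test_path, test_fraction=0.2, seed=42):
    """Read source, split its data rows once, and write both output files."""
    with open(source, newline="") as f:
        reader = csv.reader(f)
        header = next(reader)  # keep the column names for both outputs
        rows = list(reader)
    train, test = split_rows(rows, test_fraction, seed)
    write_csv(train_path, header, train)
    write_csv(test_path, header, test)
    return len(train), len(test)


# Hypothetical usage, run once from the project root:
# split_file("data/attrition.csv",
#            "data/attrition_train.csv",
#            "data/attrition_test.csv")
```

Because the shuffle is seeded, rerunning the script reproduces the same two files, which is the point of saving the split to disk rather than splitting inside the notebook.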
You can watch me explain my findings in this YouTube video: https://youtu.be/hA3KqkXcQhg
Thank you!