Analytic Report project
Given the dataset (https://www.kaggle.com/pavansubhasht/ibm-hr-analytics-attrition-dataset), create an analytic report that answers the three questions below. The report should include an introduction; a description of the data; visualizations (scatter plot, bar graph, histogram, boxplot, subplots, pie chart, heatmap, etc.), each followed by a written analysis; filtering of the data into different categories; analysis (including a t-test or Mann-Whitney test for hypothesis testing); and conclusions.
- What are key factors that are playing into current attrition rates?
- What are key factors that are playing into current satisfaction rates?
- When are employees leaving?
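For the hypothesis-testing requirement, a comparison between employees who left and those who stayed could be sketched as follows. This is a minimal illustration, not a prescribed method: the data are synthetic stand-ins generated on the spot, and the column names `Attrition` and `MonthlyIncome` are assumed to match the Kaggle file.

```python
import numpy as np
import pandas as pd
from scipy import stats

# Synthetic stand-in for the Kaggle data (values are made up for illustration).
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "Attrition": ["Yes"] * 40 + ["No"] * 160,
    "MonthlyIncome": np.concatenate([
        rng.normal(4800, 1500, 40),
        rng.normal(6800, 2000, 160),
    ]),
})

# Filter the data into the two categories being compared.
left = df.loc[df["Attrition"] == "Yes", "MonthlyIncome"]
stayed = df.loc[df["Attrition"] == "No", "MonthlyIncome"]

# Welch's t-test: compares the two means without assuming equal variances.
t_stat, t_p = stats.ttest_ind(left, stayed, equal_var=False)

# Mann-Whitney U test: rank-based alternative when normality is doubtful.
u_stat, u_p = stats.mannwhitneyu(left, stayed, alternative="two-sided")

print(f"Welch t = {t_stat:.2f}, p = {t_p:.4f}")
print(f"Mann-Whitney U = {u_stat:.1f}, p = {u_p:.4f}")
```

The t-test suits roughly normal variables, while Mann-Whitney is the safer choice for skewed or ordinal ones such as satisfaction ratings; reporting which one was used, and why, belongs in the analysis section.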
Steps for producing a good report
- Check size
- Check for missing values
- Treat missing values
- Descriptive statistics
- Exploratory data analysis (EDA)
- Hypothesis testing
- Feature selection
- Data cleaning
- Train/Test split
- Build model
- Test model
- Check model stability
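Several of the steps above can be sketched end to end with pandas and scikit-learn. The sketch below is only one possible arrangement, run on synthetic data: the column names, the median imputation, the logistic-regression model, and the use of cross-validation as the stability check are all assumptions made for illustration, not requirements of the assignment.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data; in the real report this would come from the CSV.
rng = np.random.default_rng(42)
n = 300
df = pd.DataFrame({
    "Age": rng.integers(18, 60, n).astype(float),
    "MonthlyIncome": rng.normal(6500, 2000, n),
    "YearsAtCompany": rng.integers(0, 30, n).astype(float),
})
# Hypothetical label: lower income makes attrition more likely.
prob = 1 / (1 + np.exp((df["MonthlyIncome"] - 5500) / 1500))
df["Attrition"] = (rng.random(n) < prob).astype(int)
df.loc[[3, 17, 42], "MonthlyIncome"] = np.nan  # inject a few missing values

print(df.shape)         # check size
print(df.isna().sum())  # check for missing values

# Missing-value treatment: median imputation (one simple option among many).
df["MonthlyIncome"] = df["MonthlyIncome"].fillna(df["MonthlyIncome"].median())
print(df.describe())    # descriptive statistics

# Train/test split, stratified so both sets keep the same attrition rate.
X = df.drop(columns="Attrition")
y = df["Attrition"]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Build and test the model.
model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
model.fit(X_train, y_train)
test_acc = model.score(X_test, y_test)

# Stability check: spread of accuracy across 5 cross-validation folds.
cv_model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
cv_scores = cross_val_score(cv_model, X, y, cv=5)
print(f"test accuracy = {test_acc:.3f}")
print(f"CV accuracy = {cv_scores.mean():.3f} +/- {cv_scores.std():.3f}")
```

A small standard deviation across folds suggests the model's performance does not depend heavily on which rows land in the training set.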
Next, carry out PCA by hand: find the eigenvalue-eigenvector pairs and the principal components Y1, Y2, Y3.
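A hand calculation can be checked numerically with NumPy. The sketch below assumes a 3x3 covariance matrix that is purely hypothetical (the assignment does not specify one); each principal component is the linear combination Y_i = e_i' X, where e_i is the eigenvector paired with the i-th largest eigenvalue.

```python
import numpy as np

# Hypothetical covariance matrix chosen for illustration; not from the assignment.
sigma = np.array([
    [1.0, -2.0, 0.0],
    [-2.0, 5.0, 0.0],
    [0.0, 0.0, 2.0],
])

# eigh is meant for symmetric matrices and returns eigenvalues in ascending order.
eigvals, eigvecs = np.linalg.eigh(sigma)

# Reorder so that lambda1 >= lambda2 >= lambda3.
order = np.argsort(eigvals)[::-1]
eigvals = eigvals[order]
eigvecs = eigvecs[:, order]

# Column i of eigvecs is e_i, the coefficient vector of Y_i = e_i' X.
for i in range(3):
    print(f"lambda{i + 1} = {eigvals[i]:.3f}, e{i + 1} = {np.round(eigvecs[:, i], 3)}")

# Proportion of total variance explained by each component.
explained = eigvals / eigvals.sum()
print(np.round(explained, 3))

# Component scores for one observation x (hypothetical values): Y = E' x.
x = np.array([1.0, 2.0, 3.0])
y = eigvecs.T @ x
print(np.round(y, 3))
```

Eigenvectors are only determined up to sign, so a hand-derived e_i may differ from the printed one by a factor of -1 and still be correct.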