An insurance company wants to improve its cash flow forecasting by predicting annual life insurance premiums more accurately, using demographics and basic customer health risk metrics collected at the time of application.
Build a machine learning model that predicts the life insurance premium from a customer's basic information.
Variable | Description |
---|---|
Age | Age of applicant (primary beneficiary) |
Sex | Gender of applicant (female, male) |
BMI | An applicant’s weight in kilograms divided by the square of height in meters |
Children | Number of children or dependents |
Smoker | Smoking habits |
Region | Northeast, Southeast, Southwest, Northwest |
Charges | Annual life insurance premium charged to the applicant (target variable) |
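The variables above mix numeric and categorical fields, so they need to be turned into numbers before a model can use them. The following is a minimal sketch of one possible encoding, not the encoding used in this project: the function names, the boolean treatment of `Smoker`, and the one-hot layout for `Region` are all assumptions made for illustration.

```python
# Hypothetical helpers: the names and the encoding scheme are illustrative
# assumptions, not taken from the project's code.
REGIONS = ["northeast", "southeast", "southwest", "northwest"]


def bmi(weight_kg: float, height_m: float) -> float:
    """Body mass index: weight in kilograms divided by height in meters squared."""
    if height_m <= 0:
        raise ValueError("height_m must be positive")
    return weight_kg / height_m ** 2


def encode_applicant(age, sex, bmi_value, children, smoker, region):
    """Turn one applicant record into a flat list of numeric features."""
    region = region.lower()
    if region not in REGIONS:
        raise ValueError(f"unknown region: {region}")
    features = [
        float(age),
        1.0 if sex.lower() == "male" else 0.0,  # assumed coding: male=1, female=0
        float(bmi_value),
        float(children),
        1.0 if smoker else 0.0,  # assumed coding: smoker=1, non-smoker=0
    ]
    # One-hot encode the region in the order given by REGIONS.
    features += [1.0 if region == r else 0.0 for r in REGIONS]
    return features


row = encode_applicant(35, "female", bmi(70.0, 1.75), 2, False, "Southwest")
```

Here `row` holds nine numbers: five for age, sex, BMI, children, and smoker, followed by four region indicators.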
- Performed exploratory data analysis
- Applied feature selection
- Built a Random Forest Regression model
- Designed the web frontend with Streamlit
- Deployed the model on Heroku
Random Forest Regression model with a reported accuracy of 85.4% (metric not specified; for a regression model this is usually the R² score).
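As a rough illustration of how such a model can be trained and scored, here is a minimal sketch using scikit-learn. The data is synthetic and generated on the spot, the feature set is reduced to four columns, and the hyperparameters are arbitrary, so the printed score has no relation to the 85.4% figure. Scoring with R² is also an assumption, since the metric behind that figure is not named.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in data (an assumption for illustration; not the real dataset).
rng = np.random.default_rng(0)
n = 500
age = rng.integers(18, 65, size=n)
bmi = rng.normal(30, 6, size=n)
children = rng.integers(0, 5, size=n)
smoker = rng.integers(0, 2, size=n)

X = np.column_stack([age, bmi, children, smoker])
# Made-up relationship between features and charges, plus noise.
y = (250 * age + 300 * bmi + 500 * children + 20000 * smoker
     + rng.normal(0, 2000, size=n))

# Hold out 20% of the rows to estimate performance on unseen applicants.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

model = RandomForestRegressor(n_estimators=100, random_state=0)
model.fit(X_train, y_train)

predictions = model.predict(X_test)
r2 = r2_score(y_test, predictions)
print(f"R^2 on held-out data: {r2:.3f}")
```

Evaluating on a held-out split rather than the training rows gives a more honest estimate, since random forests fit their training data very closely.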
File | Description |
---|---|
Procfile | Declares the commands to run on startup |
requirements.txt | Python package dependencies needed on Heroku |
scaler_value.pkl | Saved data preparation objects |
setup.sh | Set up the platform project directory |
Life_Insurance_Model.pkl | Serialized model |
streamlitpredictinsurancecharges.py | Frontend code |
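The two `.pkl` files let the frontend make predictions without retraining. The sketch below shows the general save-and-load round trip under several assumptions: that `scaler_value.pkl` holds a scikit-learn `StandardScaler`, that both files were written with the standard `pickle` module, and that the model expects scaled inputs. The training data is synthetic, and the files are written to a temporary directory rather than the repository.

```python
import pickle
import tempfile
from pathlib import Path

import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.preprocessing import StandardScaler

# Synthetic data, used only to have something to fit (an assumption).
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 4))
y = X @ np.array([1.0, 2.0, 0.5, 3.0])

# Fit the preparation object first, then the model on the scaled features.
scaler = StandardScaler().fit(X)
model = RandomForestRegressor(n_estimators=10, random_state=0)
model.fit(scaler.transform(X), y)

with tempfile.TemporaryDirectory() as tmp:
    scaler_path = Path(tmp) / "scaler_value.pkl"
    model_path = Path(tmp) / "Life_Insurance_Model.pkl"

    # Serialize both objects, as a training script might do.
    scaler_path.write_bytes(pickle.dumps(scaler))
    model_path.write_bytes(pickle.dumps(model))

    # Load them back, as a frontend might do at startup.
    loaded_scaler = pickle.loads(scaler_path.read_bytes())
    loaded_model = pickle.loads(model_path.read_bytes())

# New inputs must go through the same scaler before prediction.
new_rows = X[:3]
original = model.predict(scaler.transform(new_rows))
restored = loaded_model.predict(loaded_scaler.transform(new_rows))
```

Saving the scaler alongside the model matters because the model only sees scaled values during training, so raw inputs would produce misleading predictions. Pickle files should only be loaded from trusted sources, and the scikit-learn version used to load them should match the one used to save them.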
Even a simple Random Forest Regression model can predict life insurance premiums and be deployed as a web app for business use.