Welcome to the Insurance Data Analysis repository! This repository contains a Python notebook (Q3.ipynb) that analyzes insurance data using logistic regression. The dataset consists of the following columns:
- age: age of the primary beneficiary.
- sex: insurance contractor gender, female or male.
- BMI: body mass index, a measure of body weight relative to height.
- children: Number of children covered by health insurance or number of dependents.
- smoker: Smoking status.
- region: the beneficiary's residential area in the US (northeast, southeast, southwest, northwest).
- charges: Individual medical costs billed by health insurance.
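Logistic regression needs numeric inputs, so the text columns above (sex, smoker, region) have to be encoded before modeling. The following is a minimal sketch of one common way to do that with pandas; it is not necessarily what the notebook does. The two rows are made up for illustration, and the `"yes"`/`"no"` values for smoker are an assumption.

```python
import pandas as pd

# Two made-up rows for illustration only; not taken from the real dataset.
df = pd.DataFrame({
    "age": [25, 52],
    "sex": ["female", "male"],
    "BMI": [22.5, 31.0],
    "children": [0, 2],
    "smoker": ["no", "yes"],  # assumed value labels
    "region": ["northeast", "southwest"],
    "charges": [3200.0, 41000.0],
})

# Two-valued columns become 0/1 flags.
df["sex"] = df["sex"].map({"female": 0, "male": 1})
df["smoker"] = df["smoker"].map({"no": 0, "yes": 1})

# Multi-valued region becomes one indicator column per value present.
encoded = pd.get_dummies(df, columns=["region"], dtype=int)

print(encoded.columns.tolist())
```

One-hot encoding is used for region because its values have no natural order.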
The notebook follows a step-by-step process to analyze the insurance data and build a logistic regression model. Here's an overview of what you'll find inside:
- Data Loading: The insurance data is uploaded and loaded into the notebook.
- Data Scaling: The data is scaled to normalize the values between 0 and 1.
- Data Visualization: Interactive plots show the distribution of each feature, grouped by the outcome (diabetic or non-diabetic).
- Train-Test Split: The dataset is divided into training and testing sets with an 80:20 ratio.
- Logistic Regression Modeling: A logistic regression model is fit to the training data.
- Model Evaluation: The model's accuracy and root mean squared error (RMSE) are calculated using the testing data.
- Confusion Matrix: A detailed confusion matrix is generated to evaluate the performance of the model.
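The steps above can be sketched with scikit-learn as follows. This is an illustrative outline under stated assumptions, not the notebook's actual code: the data is synthetic, the binary label `y` is a hypothetical stand-in for the outcome, and the random seeds are arbitrary.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, confusion_matrix, mean_squared_error
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import MinMaxScaler

# Synthetic stand-in data (assumption): three numeric features.
rng = np.random.default_rng(0)
n = 500
X = np.column_stack([
    rng.integers(18, 65, n),  # age
    rng.normal(30, 6, n),     # BMI
    rng.integers(0, 6, n),    # children
]).astype(float)
# Hypothetical binary outcome, loosely tied to BMI plus noise.
y = (X[:, 1] + rng.normal(0, 4, n) > 30).astype(int)

# Data scaling: map each feature to the [0, 1] range.
X_scaled = MinMaxScaler().fit_transform(X)

# Train-test split with an 80:20 ratio.
X_train, X_test, y_train, y_test = train_test_split(
    X_scaled, y, test_size=0.2, random_state=42
)

# Fit the logistic regression model on the training data.
model = LogisticRegression()
model.fit(X_train, y_train)

# Evaluate on the held-out test data.
y_pred = model.predict(X_test)
accuracy = accuracy_score(y_test, y_pred)
rmse = np.sqrt(mean_squared_error(y_test, y_pred))
cm = confusion_matrix(y_test, y_pred, labels=[0, 1])

print(f"accuracy: {accuracy:.3f}")
print(f"RMSE: {rmse:.3f}")
print("confusion matrix (rows = actual, columns = predicted):")
print(cm)
```

For 0/1 labels, each wrong prediction contributes a squared error of 1, so the RMSE here equals the square root of the misclassification rate (1 minus accuracy). The sketch scales before splitting to match the order listed above; fitting the scaler on the training set only is a common alternative that avoids leaking test-set statistics.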
Feel free to explore the notebook and reuse its code to study the relationship between the input features and the outcome variable (diabetic or non-diabetic). You can also use the fitted logistic regression model to make predictions and evaluate its performance.
Feel free to customize the notebook to suit your needs. If you have any questions or feedback, please reach out. Happy analyzing!