pclaridy / lifestyle-health-analysis

This project investigates the relationship between various lifestyle factors and self-assessed general health status using the comprehensive 2022 Behavioral Risk Factor Surveillance System (BRFSS) dataset. Key aspects of the project include comprehensive dataset analysis, data preprocessing, and the application of predictive models.

License: MIT License

Lifestyle Factors and General Health: A Predictive Analysis

Description

This project investigates the relationship between various lifestyle factors and self-assessed general health status using the comprehensive 2022 Behavioral Risk Factor Surveillance System (BRFSS) dataset. Key aspects of the project include comprehensive dataset analysis, rigorous data preprocessing, the application of predictive models, elaborate visualizations, and strategic health insights.

Data Source

The analysis is based on the 2022 Behavioral Risk Factor Surveillance System (BRFSS) dataset, provided by the Centers for Disease Control and Prevention (CDC). This dataset includes responses from a wide cross-section of the population across the United States and its territories.

Installation

To set up the project environment and run the analysis:

git clone https://github.com/YOUR_GITHUB_USERNAME/general-health-lifestyle-factors
cd general-health-lifestyle-factors
pip install -r requirements.txt

Repository Structure

  • README.md: Provides an overview and instructions for the project.
  • Data Cleaning.py: Contains the data cleaning and preprocessing steps.
  • Baseline Models.py: Script for developing baseline predictive models.
  • RF and Boosting.py: Implements the initial Random Forest and Gradient Boosting models.
  • RF and Boosting2.py: Second iteration of tuning for Random Forest and Gradient Boosting models.
  • RF and Boosting3.py: Final iteration for advanced tuning and evaluation of the models.
  • Paper.pdf: The comprehensive research paper detailing background, methodology, analysis, and conclusions.

Running the Code

  1. Clone this repository to your local machine.
  2. Install the required Python packages using: pip install -r requirements.txt.
  3. Execute the Python scripts in the following sequence to replicate the analysis:
    • python "Data Cleaning.py": Cleans and prepares the dataset for analysis.
    • python "Baseline Models.py": Develops and evaluates baseline predictive models.
    • python "RF and Boosting.py": Applies and evaluates the initial Random Forest and Gradient Boosting models.
    • For further tuning and evaluation of the models, run python "RF and Boosting2.py" and python "RF and Boosting3.py" as needed.
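The sequence above can also be driven from a single Python entry point. The following is a minimal sketch and is not part of the repository: the names PIPELINE, TUNING, build_commands, and run_pipeline are hypothetical, and it assumes the scripts sit in the current working directory.

```python
import subprocess
import sys

# Script names and their order are taken from the steps above.
PIPELINE = [
    "Data Cleaning.py",
    "Baseline Models.py",
    "RF and Boosting.py",
]

# Optional tuning iterations, run only when requested.
TUNING = [
    "RF and Boosting2.py",
    "RF and Boosting3.py",
]


def build_commands(include_tuning=False):
    """Return one argument list per script, in execution order."""
    scripts = PIPELINE + (TUNING if include_tuning else [])
    # Passing arguments as a list avoids shell quoting problems
    # caused by the spaces in the file names.
    return [[sys.executable, script] for script in scripts]


def run_pipeline(include_tuning=False):
    """Run each script in turn and stop at the first failure."""
    for command in build_commands(include_tuning):
        print("Running:", command[1])
        # check=True raises CalledProcessError on a non-zero exit code.
        subprocess.run(command, check=True)
```

Calling run_pipeline() from the repository root runs the three main scripts; run_pipeline(include_tuning=True) also runs the two tuning iterations. Stopping at the first failure keeps a later script from running on output that an earlier step did not finish producing.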

Methodology

The project applies statistical analysis and data mining techniques, including regression analysis, decision trees, Random Forest, and Gradient Boosting, to predict general health outcomes from lifestyle factors. The data cleaning, preprocessing, and model development steps are documented in detail in the paper and the corresponding Python scripts.
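As a rough illustration of how the four model families named above can be compared, here is a minimal sketch using scikit-learn. It is not the code from the project's scripts. It assumes a binary target, uses logistic regression as a stand-in for "regression analysis", and runs on synthetic data from make_classification rather than BRFSS; the hyperparameters are arbitrary defaults, not the tuned values from the paper.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the cleaned features and target (assumption:
# the real columns and encoding are defined in "Data Cleaning.py").
X, y = make_classification(
    n_samples=1000, n_features=10, n_informative=5, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# One representative of each model family, with illustrative settings.
models = {
    "logistic_regression": LogisticRegression(max_iter=1000),
    "decision_tree": DecisionTreeClassifier(max_depth=5, random_state=0),
    "random_forest": RandomForestClassifier(n_estimators=100, random_state=0),
    "gradient_boosting": GradientBoostingClassifier(random_state=0),
}

# Fit each model on the training split and score it on the held-out split.
scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores[name] = accuracy_score(y_test, model.predict(X_test))

for name, score in scores.items():
    print(f"{name}: {score:.3f}")
```

Scoring every model on the same held-out split keeps the comparison fair, and the stratified split preserves the class balance in both halves.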

Findings

The analysis finds significant correlations between lifestyle factors and self-assessed general health, and points to the potential of Random Forest and Gradient Boosting models for informing public health strategies. The tuned Gradient Boosting model predicts general health status with an accuracy of 85.79%, underscoring how strongly lifestyle factors are associated with self-assessed health.
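For readers less familiar with the metric, accuracy is the share of predictions that match the true label, and it is easiest to interpret next to the accuracy of always predicting the most common class. The sketch below shows both calculations on made-up labels. The binary encoding (1 = good or better health, 0 = fair or poor) is an assumption for illustration, and the helper names accuracy and majority_baseline are hypothetical.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def majority_baseline(y_true):
    """Accuracy obtained by always predicting the most common label."""
    most_common_count = Counter(y_true).most_common(1)[0][1]
    return most_common_count / len(y_true)


# Made-up labels for illustration only; these are not the project's
# data or results.
y_true = [1, 1, 1, 1, 0, 1, 0, 1, 1, 0]
y_pred = [1, 1, 1, 0, 0, 1, 1, 1, 1, 0]

print(f"model accuracy:    {accuracy(y_true, y_pred):.2f}")  # 0.80
print(f"majority baseline: {majority_baseline(y_true):.2f}")  # 0.70
```

When one class dominates the responses, the gap between the two numbers says more about a model than the accuracy figure alone.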

Conclusion

The findings highlight the critical role of lifestyle choices in general health and showcase the capability of machine learning models to forecast health outcomes. Insights from this study provide a solid foundation for designing health interventions aimed at improving general well-being.

Contributing

Contributions, suggestions, and feedback are welcome. Please feel free to fork the repository, make changes, and submit pull requests.

License

This project is released under the MIT License. Please refer to the LICENSE file for more details.

Acknowledgments

  • Georgia Institute of Technology for providing the educational platform.
  • Centers for Disease Control and Prevention (CDC) for making the BRFSS dataset publicly available.
