Investigating the association between Fitbit wearable data and self-reported measures of life satisfaction

This repository contains the code base for our MED-264 group's Final Project.

Data

The data comes from the All of Us Registered Tier Dataset (version 7). The notebooks were developed with Python 3.7 in the All of Us Jupyter Notebook environment.

Files

data_collection.ipynb - This notebook extracts the data from the All of Us dataset using Google BigQuery queries and saves it to the persistent disk of the workspace.
data_preprocessing.ipynb - In this notebook, the saved dataframes are read, missingness is examined, and the feature list is filtered accordingly.
data_cleaning.ipynb - In this notebook, the missing data for each feature is imputed with the patient-level mean.
data_splitting.ipynb - In this notebook, the dataset is split into train and test sets after feature engineering. The split ensures that no patient's data appears in both the train and test sets.
model_building.ipynb - Traditional machine learning models (Logistic Regression, Decision Tree Classifier, Random Forest Classifier, and XGBoost Classifier) are trained on both the multi-class and the binary classification tasks. The results are available in the notebook.
data_correlation_and_statistics.ipynb - General statistics about the population and correlations among features are captured in this notebook.
python_ordinal_regression.ipynb - Ordinal regression is carried out to obtain the odds ratios and their 95% confidence intervals. The statistical significance (p-values) is also reported in this notebook.
assets/ - Contains all the illustrations derived from our study.
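
The patient-level mean imputation and the leakage-free split described above can be illustrated with a minimal sketch. This is not the code from the notebooks, which may implement these steps differently (for example with pandas or scikit-learn). It uses only the Python standard library, and the `steps` feature, the toy rows, and the helper names `impute_patient_mean` and `split_by_patient` are hypothetical. The `person_id` key is assumed to be the patient identifier.

```python
import random
from collections import defaultdict
from statistics import mean


def impute_patient_mean(rows, feature):
    """Fill missing (None) values of `feature` with that patient's own mean."""
    observed = defaultdict(list)
    for row in rows:
        if row[feature] is not None:
            observed[row["person_id"]].append(row[feature])

    imputed = []
    for row in rows:
        value = row[feature]
        # A patient with no observed values at all is left as None.
        if value is None and observed[row["person_id"]]:
            value = mean(observed[row["person_id"]])
        imputed.append({**row, feature: value})
    return imputed


def split_by_patient(rows, test_fraction=0.2, seed=0):
    """Assign whole patients to train or test so no patient is in both."""
    patient_ids = sorted({row["person_id"] for row in rows})
    rng = random.Random(seed)
    rng.shuffle(patient_ids)
    n_test = max(1, round(len(patient_ids) * test_fraction))
    test_ids = set(patient_ids[:n_test])
    train = [row for row in rows if row["person_id"] not in test_ids]
    test = [row for row in rows if row["person_id"] in test_ids]
    return train, test


# Hypothetical toy data: several daily records per patient.
rows = [
    {"person_id": 1, "steps": 4000.0},
    {"person_id": 1, "steps": None},
    {"person_id": 1, "steps": 6000.0},
    {"person_id": 2, "steps": 9000.0},
    {"person_id": 2, "steps": None},
    {"person_id": 3, "steps": 3000.0},
    {"person_id": 4, "steps": 7000.0},
    {"person_id": 5, "steps": 5000.0},
]

clean = impute_patient_mean(rows, "steps")
train, test = split_by_patient(clean, test_fraction=0.2, seed=0)
```

Splitting on patient identifiers rather than on individual rows is what prevents leakage: every record of a given patient ends up on the same side of the split.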

Explainability of Random Forest Classifier

Random Forest Feature Importance (Binary Classification)

Random Forest Feature Importance (Multi-class Classification)

Acknowledgements

We would like to thank Dr. Tsung-Ting Kuo (instructor) for arranging lectures by various other lecturers for our sessions. We would also like to thank the TAs of this course, Grace Yufei Yu and Aaron Boussina.
