Problem Statement: This capstone/thesis project examines the demographic characteristics and risk factors associated with COVID-19-related deaths in Mexico. The goal is to identify who is most vulnerable to the virus in terms of gender, age, location, and pre-existing health conditions. The project will also apply supervised machine learning to predict death cases from these variables. The intended audience includes healthcare professionals, policymakers, epidemiologists, and data scientists; the findings are intended to deepen understanding of COVID-19 mortality and to inform public health measures.
An extensive literature review will be conducted to situate this project in the context of existing research on COVID-19's impact. The pandemic, declared in March 2020, has triggered a global health crisis with profound social, economic, and political repercussions. By examining prior work, this project will build upon and complement the collective knowledge about the virus's effects. Researchers have identified comorbidities, disparities in healthcare systems, and social determinants as key factors influencing COVID-19 outcomes. This project will contribute by focusing on the Mexican context and providing specific insights into the demographic aspects of COVID-19-related deaths.
Resources Available:
- Access to the Mexican Open Government data for data collection and analysis.
- Literature resources and databases for research.
- Machine learning libraries (e.g., scikit-learn) and data analysis tools.
- Access to computing resources and software for data processing and model development.
Resources Needed:
- Additional computational resources, beyond those already available, for large-scale data processing and machine learning model training.
- Access to cloud computing platforms for scalability.
- Potentially, support for software development for data analysis and machine learning model implementation.
A structured work plan for completing the project is as follows:
Data Collection and Preprocessing (Responsible: Researcher, Deadline: Months 1-2)
- Collect the COVID-19 dataset published by Mexico's Ministry of Health via the Mexican Open Government data portal.
- Preprocess and clean the data, addressing any missing or inconsistent values.
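The cleaning step could look roughly like the sketch below. It is only an illustration: the column names (`sex`, `age`, `diabetes`, `date_of_death`), the coding scheme (1 = yes, 2 = no, 99 = unknown), and the toy rows are assumptions made for this example, not the schema of the actual dataset.

```python
import pandas as pd

# Hypothetical toy records. Column names and codes are assumptions for
# illustration only; the real dataset's schema must be checked first.
raw = pd.DataFrame({
    "sex": ["F", "M", "M", None, "F"],
    "age": [34, 71, -1, 58, 45],
    "diabetes": [1, 2, 99, 1, 2],  # assumed coding: 1 = yes, 2 = no, 99 = unknown
    "date_of_death": ["", "2020-06-01", "", "2020-07-15", ""],
})


def clean(df: pd.DataFrame) -> pd.DataFrame:
    """Return a cleaned copy with a binary `died` label."""
    out = df.copy()
    # Treat impossible ages as missing values.
    out["age"] = out["age"].where(out["age"].between(0, 120))
    # Map the assumed yes/no codes to booleans; any other code becomes NaN.
    out["diabetes"] = out["diabetes"].map({1: True, 2: False})
    # A non-empty death date marks a death case (assumed convention).
    out["died"] = out["date_of_death"].fillna("").ne("")
    # Drop rows that lack the core demographic fields.
    out = out.dropna(subset=["sex", "age"]).drop(columns="date_of_death")
    return out.reset_index(drop=True)


cleaned = clean(raw)
print(cleaned)
```

Rows with an unknown comorbidity code are kept here with a missing value instead of being dropped, so the decision on how to impute or exclude them can be made later and documented.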
Exploratory Data Analysis (Responsible: Researcher, Deadline: Months 3-4)
- Perform exploratory analysis to understand the data's distribution and characteristics.
- Identify trends and patterns in demographic and health-related variables.
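One simple way to look for demographic patterns is to compare death rates across age bands. The sketch below assumes a cleaned table with hypothetical `age`, `sex`, and `died` columns; the rows are made-up illustrative values, not results from the dataset.

```python
import pandas as pd

# Made-up illustrative rows; not real data.
df = pd.DataFrame({
    "age": [25, 37, 52, 64, 68, 81, 45, 79],
    "sex": ["F", "M", "F", "M", "F", "M", "M", "F"],
    "died": [0, 0, 0, 1, 0, 1, 0, 1],
})

# Left-closed age bands: [0, 40), [40, 60), [60, 80), [80, 120).
bins = [0, 40, 60, 80, 120]
labels = ["0-39", "40-59", "60-79", "80+"]
df["age_band"] = pd.cut(df["age"], bins=bins, labels=labels, right=False)

# Case count, death count, and death rate per age band.
rate_by_band = df.groupby("age_band", observed=True)["died"].agg(
    cases="count", deaths="sum", death_rate="mean"
)
print(rate_by_band)
```

The same grouping can be repeated with `sex`, location, or a comorbidity flag as the grouping key to compare subpopulations.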
Machine Learning Model Development (Responsible: Researcher, Deadline: Months 5-8)
- Select relevant features and labels for classification.
- Split the dataset into training and testing sets.
- Apply supervised machine learning algorithms for predictive analysis.
- Evaluate model performance using metrics such as accuracy, precision, recall, and F1-score.
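The model development steps above can be sketched with scikit-learn as follows. This is a minimal illustration under stated assumptions: the data are synthetic (generated with `make_classification`, with roughly 10% positive cases standing in for deaths), and logistic regression is just one possible supervised classifier, not necessarily the one the project will adopt.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, f1_score, precision_score, recall_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the feature matrix and death label (assumption).
X, y = make_classification(
    n_samples=2000, n_features=8, n_informative=5,
    weights=[0.9, 0.1], random_state=0,
)

# Stratify so both splits keep the same share of positive cases.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

model = LogisticRegression(max_iter=1000)
model.fit(X_train, y_train)
y_pred = model.predict(X_test)

metrics = {
    "accuracy": accuracy_score(y_test, y_pred),
    "precision": precision_score(y_test, y_pred, zero_division=0),
    "recall": recall_score(y_test, y_pred, zero_division=0),
    "f1": f1_score(y_test, y_pred, zero_division=0),
}
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

When deaths are a small share of all cases, accuracy alone can look high even if the model misses most deaths, which is why precision, recall, and F1-score are reported alongside it.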
Interpretation and Insights (Responsible: Researcher, Deadline: Months 9-10)
- Analyze results to gain insights into the relationships between variables.
- Interpret findings and provide context to understand the demographic vulnerability to COVID-19.
Improvement Measures (Responsible: Researcher, Deadline: Months 11-12)
- Address class imbalance through techniques like oversampling or undersampling.
- Explore feature engineering or selection to enhance model performance.
- Test new models and fine-tune hyperparameters.
- Adjust the classification probability threshold to balance precision and recall, since accuracy alone can be misleading when death cases are a minority class.
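Two of the measures above, oversampling and threshold adjustment, can be illustrated together. The sketch again uses synthetic data and logistic regression as assumptions; the thresholds 0.5 and 0.3 are arbitrary example values, and random oversampling via `sklearn.utils.resample` is only one of several possible rebalancing techniques.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import recall_score
from sklearn.model_selection import train_test_split
from sklearn.utils import resample

# Synthetic imbalanced data standing in for the real records (assumption).
X, y = make_classification(
    n_samples=2000, n_features=8, n_informative=5,
    weights=[0.9, 0.1], random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Randomly oversample the minority class in the training split only,
# so the test split keeps its original class distribution.
minority = y_train == 1
X_min_up, y_min_up = resample(
    X_train[minority], y_train[minority],
    replace=True, n_samples=int((~minority).sum()), random_state=0,
)
X_bal = np.vstack([X_train[~minority], X_min_up])
y_bal = np.concatenate([y_train[~minority], y_min_up])

model = LogisticRegression(max_iter=1000).fit(X_bal, y_bal)
proba = model.predict_proba(X_test)[:, 1]


def predict_at(threshold: float) -> np.ndarray:
    """Label a case positive when its predicted probability meets the threshold."""
    return (proba >= threshold).astype(int)


recall_default = recall_score(y_test, predict_at(0.5))
recall_low = recall_score(y_test, predict_at(0.3))
print(f"recall at 0.5: {recall_default:.3f}, recall at 0.3: {recall_low:.3f}")
```

Lowering the threshold can only keep or increase recall, usually at the cost of precision, so the chosen value should reflect how costly a missed death case is relative to a false alarm.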
This project is committed to ethical data usage and privacy protection. The dataset used for analysis will be anonymized to ensure the confidentiality of individuals' identities. Only aggregated results will be shared, and personally identifiable information will not be disclosed. The research focuses on public health analysis: models will be trained on anonymized case records, and no identifiable individual-level data will be published.
The data management plan involves:
- Storing project data securely and maintaining backups.
- Documenting data sources and cleaning procedures.
- Ensuring code reproducibility for transparency.
- After project completion, submitting research findings and data to the institution's library for preservation and future access by researchers.
This capstone/thesis project aims to provide critical insights into COVID-19 mortality in Mexico and will adhere to ethical standards and best practices for data management. It will contribute to the broader understanding of the pandemic's impact on vulnerable populations and guide public health efforts.