Use this README.md file to describe your final project (as detailed on Canvas).
We are interested in the domain of public/global health. We chose this domain because one of our members is a public health major, and because we feel the data can be displayed in an impactful and meaningful way. The domain also offers many different directions we could take, and it is extremely relevant to our lives after the COVID-19 pandemic, which has had such a huge impact on the whole world.
The first data-driven project we found was a dataset tracking community transmission of COVID-19 cases in the United States at the county level.
The second data-driven project we found covers general world health statistics. It offers a visual summary of a variety of global health topics, including disease, intervention, and life expectancy.
Link: https://www.who.int/data/gho/whs-2020-visual-summary
The third data-driven project we found was a dataset of U.S. cancer statistics. The statistics in this project include death counts, rates, and survival and prevalence estimates.
Link: https://www.cdc.gov/cancer/uscs/dataviz/index.htm
- What regions of the world have been most affected by COVID-19?
  - We could answer this question by grouping countries by region and comparing the number of reported cases in each region.
- What is the quality and coverage of treatment for people living with HIV/AIDS in countries around the world?
  - For each country, we could compare the number of people receiving treatment with the total number of people living with HIV/AIDS.
- Which countries have the highest child mortality rate?
  - We can find the maximum child mortality value in the dataset and identify the country it belongs to.
We found data about global COVID-19 cases from this website: https://covid19.who.int/WHO-COVID-19-global-data.csv
WHO compiled this dataset from the confirmed COVID-19 case and death counts in official statistics published on countries' health websites. The data covers new case and death counts, and it is updated incrementally as more data comes in from different countries.
There are 116,604 observations (rows) and 8 features (columns) in the COVID-19 dataset.
Using this dataset, we can answer the question of which region of the world a person would have been more likely to catch COVID-19 in.
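As a rough illustration of the regional comparison, here is a minimal sketch in Python using only the standard library. It assumes the CSV has `WHO_region` and `New_cases` columns; the sample rows are made up for illustration and are not real WHO figures, and the function names are our own.

```python
import csv
import io
from collections import defaultdict

# Made-up sample rows for illustration only; these are NOT real WHO figures.
# The column names are assumed to match the header of the WHO CSV.
SAMPLE_CSV = """Date_reported,Country_code,Country,WHO_region,New_cases,Cumulative_cases,New_deaths,Cumulative_deaths
2021-01-01,AA,Country A,EURO,100,100,1,1
2021-01-02,AA,Country A,EURO,50,150,0,1
2021-01-01,BB,Country B,AFRO,30,30,2,2
2021-01-01,CC,Country C,EURO,20,20,0,0
"""


def total_cases_by_region(csv_file):
    """Sum the New_cases column for each WHO_region value."""
    totals = defaultdict(int)
    for row in csv.DictReader(csv_file):
        # Treat blank cells as zero so a missing report does not crash the sum.
        totals[row["WHO_region"]] += int(row["New_cases"] or 0)
    return dict(totals)


def rank_regions(totals):
    """Return (region, total) pairs sorted from most to fewest cases."""
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


totals = total_cases_by_region(io.StringIO(SAMPLE_CSV))
ranking = rank_regions(totals)
print(ranking)
```

To run this on the real file, the `io.StringIO(SAMPLE_CSV)` argument could be swapped for an opened copy of the downloaded CSV.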
We found data about child mortality rates, broken down by country and by age group, here: https://data.unicef.org/resources/dataset/child-adolescent-and-youth-mortality-rates/
We chose to look at the mortality rate for ages 5-14 and to use the data that is split up by country rather than by region.
The data was collected and compiled by UNICEF, and it was generated from various international surveys. The data is about the rate of child mortality among 5- to 14-year-olds, split up by country.
There are 586 observations (rows) and 33 features (columns) in the child mortality dataset.
We can answer the question about which countries have the highest child mortality rates using this dataset.
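One possible way to find the countries with the highest rates is sketched below. The column names `Country` and `Mortality_rate` are hypothetical (the real UNICEF file may be laid out differently), and the sample values are made up for illustration.

```python
import csv
import io

# Made-up rows for illustration only; these are NOT real UNICEF figures.
# "Country" and "Mortality_rate" are hypothetical column names.
SAMPLE_CSV = """Country,Mortality_rate
Country A,4.2
Country B,11.5
Country C,7.8
"""


def highest_mortality(csv_file, top_n=1):
    """Return the top_n (country, rate) pairs with the highest rate."""
    rows = [
        (row["Country"], float(row["Mortality_rate"]))
        for row in csv.DictReader(csv_file)
    ]
    # Sort from highest to lowest rate, then keep the first top_n entries.
    rows.sort(key=lambda pair: pair[1], reverse=True)
    return rows[:top_n]


top = highest_mortality(io.StringIO(SAMPLE_CSV), top_n=2)
print(top)
```

Returning the top few countries rather than a single maximum makes it easier to see whether the highest value is an outlier or part of a cluster.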
We found a dataset about the number of people living with HIV/AIDS globally from this website: https://www.kaggle.com/imdevskp/hiv-aids-dataset
The data was collected and generated by WHO and UNESCO from their public records on the number of people living with HIV/AIDS around the world. Devakumar cleaned the WHO and UNESCO data and put it into a CSV file that focuses on treatment statistics, by country, for people living with HIV/AIDS.
There are 170 observations (rows) and 10 features (columns) in the HIV/AIDS dataset.
We can answer the question about how effective the treatment coverage is for HIV/AIDS using this dataset, since it has data on antiretroviral therapy coverage.
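The coverage comparison could be computed roughly as follows. This is only a sketch: the column names `People_living_with_HIV` and `People_receiving_ART` are hypothetical stand-ins for whatever the Kaggle file actually uses, and the sample numbers are made up.

```python
import csv
import io

# Made-up rows for illustration only; these are NOT real HIV/AIDS figures.
# All three column names are hypothetical.
SAMPLE_CSV = """Country,People_living_with_HIV,People_receiving_ART
Country A,1000,750
Country B,200,50
Country C,0,0
"""


def art_coverage(csv_file):
    """Map each country to the share of people with HIV who receive treatment."""
    coverage = {}
    for row in csv.DictReader(csv_file):
        living = float(row["People_living_with_HIV"])
        treated = float(row["People_receiving_ART"])
        # Avoid dividing by zero when a country reports no cases.
        coverage[row["Country"]] = treated / living if living > 0 else None
    return coverage


coverage = art_coverage(io.StringIO(SAMPLE_CSV))
print(coverage)
```

A value of `None` marks a country where the proportion is undefined, so it can be filtered out before ranking countries by coverage.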