All the visualizations are in the Visualizations folder. A video presentation is linked here, and a PowerPoint presentation can be accessed online here. If you want to read more about the process behind the project, I have detailed it on my portfolio website here.
This summer, I am interning at Harassment and Reporting Platform, a non-profit organization whose goal is to increase awareness of assault and harassment. We aim to gather crowd-sourced contextual data, analyze it, and create a cohesive narrative that bridges the gap between technical research and public understanding. While on the data team, I can explore and propose a data science project on a topic of personal interest related to harassment and assault.
The main aim of the project is to gain insight into the social media representation of Asian American hate crime incidents.
For our purposes, we chose the New York Times because it is reputable and has clear API documentation.
- Request article data from the New York Times using the NYT API
- Parse the data into the desired format and save it to a csv file.
  - The Asians American NYT Dataset.csv file contains all the headlines for the tag, while the updated csv file covers only the pandemic timeframe.
- Download the US statistics on Covid cases from the New York Times repository
- Explore the NYT dataset
- Merge the two datasets
- Create wordclouds for all the headlines and for the headlines during the pandemic.
- Create heatmap visualizations
- Create visualizations of case percentage versus each subject count
- Create a subject counts visualization
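
The data-gathering steps above (requesting articles, parsing them, saving a csv file, and merging with the Covid case counts) could look roughly like the sketch below. It is not the project's actual code. It assumes the NYT Article Search endpoint, the `subject` keyword type as the source of subject tags, and the `date` and `cases` columns of the `us.csv` file in the NYT Covid repository. The function names, output columns, and sample values are hypothetical, and only the Python standard library is used.

```python
import csv
import io
import json
import urllib.parse
import urllib.request
from collections import Counter

# Assumption: the Article Search endpoint; the project may have used another NYT API.
SEARCH_URL = "https://api.nytimes.com/svc/search/v2/articlesearch.json"
FIELDS = ["date", "headline", "subjects", "url"]  # hypothetical output columns


def fetch_page(api_key, query, page=0):
    """Request one page of search results and return the decoded JSON."""
    params = urllib.parse.urlencode({"q": query, "page": page, "api-key": api_key})
    with urllib.request.urlopen(f"{SEARCH_URL}?{params}") as resp:
        return json.load(resp)


def parse_articles(payload):
    """Flatten a search response into one row per article."""
    rows = []
    for doc in payload.get("response", {}).get("docs", []):
        # Keep only keywords of type "subject" (assumed to be the subject tags).
        subjects = [k["value"] for k in doc.get("keywords", [])
                    if k.get("name") == "subject"]
        rows.append({
            "date": (doc.get("pub_date") or "")[:10],  # keep YYYY-MM-DD only
            "headline": (doc.get("headline") or {}).get("main", ""),
            "subjects": "; ".join(subjects),
            "url": doc.get("web_url", ""),
        })
    return rows


def write_csv(rows, fileobj):
    """Write the parsed rows to an open text file with a header line."""
    writer = csv.DictWriter(fileobj, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(rows)


def merge_by_date(article_rows, covid_rows):
    """Attach the number of headlines published on each date to the Covid rows."""
    counts = Counter(row["date"] for row in article_rows)
    return [
        {"date": c["date"], "cases": int(c["cases"]),
         "headlines": counts.get(c["date"], 0)}
        for c in covid_rows
    ]


# Made-up payload and case counts, shaped like the real data, so the sketch
# runs without an API key or network access.
sample_payload = {"response": {"docs": [{
    "pub_date": "2021-03-17T10:00:00+0000",
    "headline": {"main": "Example headline"},
    "keywords": [{"name": "subject", "value": "Example subject"},
                 {"name": "glocations", "value": "Example place"}],
    "web_url": "https://www.nytimes.com/example",
}]}}
sample_covid = [{"date": "2021-03-17", "cases": "100", "deaths": "1"},
                {"date": "2021-03-18", "cases": "120", "deaths": "1"}]

rows = parse_articles(sample_payload)
buffer = io.StringIO()  # stands in for open("articles.csv", "w", newline="")
write_csv(rows, buffer)
merged = merge_by_date(rows, sample_covid)
print(buffer.getvalue())
print(merged)
```

Calling `fetch_page` requires an API key from the NYT developer portal. The sample payload lets the parsing and merging logic run without one.
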
If you have further questions, please feel free to contact me through GitHub or visit my personal website for more social media accounts. Thank you very much!