This project initially set out to predict the status of homicide cases after a given number of days. After investigating the data, however, it turned into a series of projects that examine the various data sources used and explore how far each can be taken.
This series is based on data collected by The Washington Post on over 50,000 homicides from 2007 to 2017, as well as data from the Murder Accountability Project, which claims to be the most complete repository of United States homicide data, downloaded September 19, 2023.
In Chapter 1, we gather, clean, and preprocess the data for further investigation. Specifically, we compare how the information represented in each dataset differs and evaluate how difficult it is to combine them.
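
As a rough illustration of what comparing the two datasets can involve, the sketch below lists which columns two sources share and which are unique to each. The sample rows and column names (`uid`, `disposition`, `Solved`, and so on) are hypothetical placeholders, not the actual schemas of either dataset, and the helper names are made up for this example.

```python
import csv
import io

# Hypothetical one-row samples standing in for the two sources.
# The column names are illustrative assumptions, not the real schemas.
WAPO_SAMPLE = """uid,reported_date,city,disposition
a1,2010-05-01,Baltimore,Closed by arrest
"""
MAP_SAMPLE = """ID,Year,city,Solved
x1,2010,Baltimore,Yes
"""


def read_header(text):
    """Return the set of column names found in CSV text."""
    reader = csv.DictReader(io.StringIO(text))
    return set(reader.fieldnames)


def compare_columns(left_cols, right_cols):
    """Group column names into shared, left-only, and right-only."""
    return {
        "shared": sorted(left_cols & right_cols),
        "left_only": sorted(left_cols - right_cols),
        "right_only": sorted(right_cols - left_cols),
    }


report = compare_columns(read_header(WAPO_SAMPLE), read_header(MAP_SAMPLE))
for group, names in report.items():
    print(group, names)
```

A name-level comparison like this is only a first pass: columns with the same name may still encode different information, and columns with different names may overlap, which is part of what makes combining sources difficult.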