
twitter_coronavirus's Introduction

Map Reduce Class Project

This was a project for my Data Structures class at Claremont McKenna College. It analyzed millions of tweets using MapReduce techniques. Using both shell scripts and Python code, I parsed every geotagged tweet from 2020 and created figures showing the country and language of tweets containing certain hashtags related to the coronavirus outbreak of 2020.

Procedure

Four Python programs were used in this analysis. map.py reads every file passed to it (for this project, all geotagged tweets from 2020), searches for hashtags, and then tallies the country of origin and language of the tweets containing them. reduce.py takes multiple files, namely the output files of map.py, and combines them into one file. visualize.py then charts the number of tweets with a given hashtag by country or language, using the file produced by reduce.py. These three programs generated the following four figures:

Figure 1: # of Tweets in 2020 with the hashtag #코로나바이러스 by country of origin

Figure 2: # of Tweets in 2020 with the hashtag #코로나바이러스 by language of origin

Figure 3: # of Tweets in 2020 with the hashtag #coronavirus by country of origin

Figure 4: # of Tweets in 2020 with the hashtag #coronavirus by language of origin
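The map, reduce, and chart-preparation steps described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code. It assumes each input line is one tweet as a JSON object with `text`, `lang`, and `place.country_code` fields. The function names `map_tweets`, `reduce_counts`, and `top_items` and the `HASHTAGS` list are hypothetical.

```python
import json
from collections import Counter, defaultdict

# Hypothetical hashtag list; the project's full list is not given in this README.
HASHTAGS = ["#coronavirus", "#코로나바이러스"]


def map_tweets(lines, hashtags=HASHTAGS):
    """Map step: count tweets per hashtag, keyed by language and by country.

    Assumes one JSON tweet per line with "text", "lang", and "place" fields.
    """
    by_lang = defaultdict(Counter)
    by_country = defaultdict(Counter)
    for line in lines:
        line = line.strip()
        if not line:
            continue
        tweet = json.loads(line)
        text = tweet.get("text", "").lower()
        lang = tweet.get("lang") or "unknown"
        place = tweet.get("place") or {}  # "place" may be null
        country = place.get("country_code") or "unknown"
        for tag in hashtags:
            if tag in text:
                by_lang[tag][lang] += 1
                by_country[tag][country] += 1
    return by_lang, by_country


def reduce_counts(partials):
    """Reduce step: merge several {hashtag: Counter} mappings into one."""
    total = defaultdict(Counter)
    for partial in partials:
        for tag, counts in partial.items():
            total[tag].update(counts)
    return total


def top_items(counts, k=10):
    """Return the k largest (key, count) pairs, i.e. the data a bar chart would show."""
    return sorted(counts.items(), key=lambda kv: kv[1], reverse=True)[:k]
```

In this sketch each input file would be mapped independently, which is what lets the map step run in parallel. Only the small per-file counts need to be combined afterwards.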

The fourth program, alternative_reduce.py, takes all the files produced by map.py and creates the following line chart, using code similar to reduce.py and visualize.py.
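The README does not say what the line chart plots. As a rough sketch only, assume there is one map.py output per day and that the chart shows the total number of tweets per day for each hashtag. The aggregation could then look like the code below. The function name `daily_totals`, the input shape, and the `#covid19` hashtag in the usage note are hypothetical.

```python
def daily_totals(outputs_by_day, hashtags):
    """Build one time series per hashtag from per-day map outputs.

    outputs_by_day maps a day label (e.g. "2020-01-01") to that day's
    {hashtag: {key: count}} mapping. Both shapes are assumptions.
    """
    days = sorted(outputs_by_day)  # ISO-style date labels sort chronologically
    series = {tag: [] for tag in hashtags}
    for day in days:
        counts = outputs_by_day[day]
        for tag in hashtags:
            # Sum over all languages (or countries); 0 if the tag is absent that day.
            series[tag].append(sum(counts.get(tag, {}).values()))
    return days, series
```

Calling `daily_totals(outputs, ["#coronavirus", "#covid19"])` returns the sorted day labels and one list of counts per hashtag. Each list could then be drawn as a line with a plotting library.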

twitter_coronavirus's People

Contributors

irajmoradi, joeybodoia, mikeizbicki
