CoronaVis: A Real-time COVID-19 Tweets Analyzer

The goal of CoronaVis is to use tweets, as information shared by the public, to visualize topic models, study subjectivity, and model human emotions during the COVID-19 pandemic. The main objective is to explore the psychology and behavior of society at large, which can assist in managing the economic and social crisis during the ongoing pandemic as well as its after-effects. In this paper, we describe the CoronaVis Twitter dataset (focused on the United States), which we have been collecting since early March 2020. We share this data in the hope that it will enable the community to find more useful insights and to create applications and models to fight the COVID-19 pandemic and future pandemics as well. The paper is available at arXiv.

Data Description

We have been collecting data continuously since March 5, 2020, and will keep fetching tweets using the Twitter Streaming API. As of April 24, 2020, we had collected around 700 GB of raw data, saved as JSON files. We also process this data dynamically in real time for the CoronaVis application. We extract several features from the tweets, such as Tweet ID, Tweet Text, User Location (if available), and User Type. We will keep collecting data and update this repository once a week.

This repository contains two folders. The "data" folder contains the Twitter data, and the "src" folder contains basic code showing how to read the data and run some basic data analytics. The "data" folder holds one CSV file per day, from March 5, 2020 to date, each named after its date in the format YEAR-MONTH-DATE. If you have any suggestions or concerns, please send an email to [email protected] or [email protected].
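The daily file layout described above can be sketched in Python as follows. This is a minimal illustration, not the code from the "src" folder: the helper names (`daily_file_path`, `iter_daily_paths`, `read_tweets`) are hypothetical, and it assumes that YEAR-MONTH-DATE means zero-padded names such as `2020-03-05`, that the files carry a `.csv` extension, and that each file starts with a header row.

```python
import csv
from datetime import date, timedelta
from pathlib import Path


def daily_file_path(data_dir, day):
    """Build the path of the CSV file for one day.

    Assumption: files are named YYYY-MM-DD.csv inside the data folder.
    """
    return Path(data_dir) / f"{day.isoformat()}.csv"


def iter_daily_paths(data_dir, start, end):
    """Yield one file path per day from start to end, inclusive."""
    day = start
    while day <= end:
        yield daily_file_path(data_dir, day)
        day += timedelta(days=1)


def read_tweets(path):
    """Read one daily CSV into a list of dicts keyed by column name.

    Assumption: the first row of the file is a header row.
    """
    with open(path, newline="", encoding="utf-8") as f:
        return list(csv.DictReader(f))
```

For example, `iter_daily_paths("data", date(2020, 3, 5), date(2020, 3, 7))` yields three paths, one for each day in that range. Days with no file in the repository would need to be skipped by the caller, for instance by checking `path.exists()` before reading.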

Data attributes

tweet_id

Unique ID of a tweet.

created_at

Creation time of a tweet.

loc

State level user location.

text

Processed tweet text. All text is lowercased; non-English characters and a few stop words are removed.

user_id

Pseudo user ID. The actual user name is replaced with an anonymous ID to preserve the user's privacy.

verified

Denotes whether the tweet post is verified: 1 = verified, 0 = not verified.
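As a rough sketch of how the attributes above might be handled in code, the following Python snippet maps one CSV row onto a typed record and counts tweets per state. The `Tweet` class and both helper functions are hypothetical names introduced for illustration. Because the README does not state the timestamp format of `created_at` or how states are encoded in `loc`, both are kept as plain strings.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class Tweet:
    tweet_id: str
    created_at: str  # raw string; the timestamp format is not documented
    loc: str         # state-level location; the encoding is not documented
    text: str
    user_id: str     # anonymized pseudo ID, not the real user name
    verified: bool


def parse_row(row):
    """Convert one CSV row (a dict of strings) into a Tweet record."""
    return Tweet(
        tweet_id=row["tweet_id"],
        created_at=row["created_at"],
        loc=row["loc"],
        text=row["text"],
        user_id=row["user_id"],
        # The dataset stores 1 for verified and 0 for not verified.
        verified=str(row["verified"]).strip() == "1",
    )


def tweets_per_state(tweets):
    """Count tweets per state, skipping records with an empty location."""
    return Counter(t.loc for t in tweets if t.loc)
```

Each row can come from any reader that yields dicts keyed by the column names, such as `csv.DictReader` from the standard library.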

Use Policy

This dataset is released in compliance with Twitter’s Developer Terms & Conditions. The data repository will be updated every week. The data repository, including the code, and CoronaVis (2020 W2C Lab, Missouri University of Science and Technology, all rights reserved) may be used for educational, academic, and government research purposes with proper citation (please cite this paper). Any commercial use of any materials is strictly prohibited. Taking and sharing a screenshot is allowed with appropriate citation.

@misc{kabir2020coronavis,
  title={CoronaVis: A Real-time COVID-19 Tweets Analyzer},
  author={Md. Yasin Kabir and Sanjay Madria},
  year={2020},
  eprint={2004.13932},
  archivePrefix={arXiv},
  primaryClass={cs.SI}
}
