cs108finalproject

Usage

  1. Get Twitter API keys.
    • Create a file called APIKeys.json and store your API keys in it, using APIkeyexample.txt as a reference.
    • APIKeys.json will not be pushed to git unless you change the .gitignore.
  2. Generate tweets for a user or set of users
    • Navigate to the src directory
    • Run python main.py --names <NAME1> <NAME2> ..., replacing each <NAMEi> with a Twitter handle.
    • The code pulls each user's tweets and saves them to the data directory.
    • It also prints the generated tweets to the console.
  3. Determine sentence similarity
    • Navigate to the src directory
    • Run python model_test.py <tweet_file> <K>, where <tweet_file> is the relative path to a file in the data folder (for example, ../data/Harvard.csv) and <K> is the size of each K-mer. K must be at least 2.
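
To illustrate what K controls, here is a minimal sketch of K-mer extraction. It assumes a K-mer is a run of K consecutive whitespace-separated words; the function name extract_kmers and the sample tweet are hypothetical and are not taken from model_generator.py. K must be at least 2 because the model predicts the next word from the current K-1 words, so K = 1 would leave no context.

```python
def extract_kmers(words, k):
    """Return every run of k consecutive words as a tuple.

    Hypothetical helper for illustration; the project's own
    extraction code may tokenize or clean tweets differently.
    """
    if k < 2:
        # With k = 1 there would be no preceding words to condition on.
        raise ValueError("K must be at least 2")
    return [tuple(words[i:i + k]) for i in range(len(words) - k + 1)]


# Made-up sample text, split on whitespace.
tweet = "go crimson beat yale today"
for kmer in extract_kmers(tweet.split(), 3):
    print(kmer)
```

A tweet with fewer than K words yields no K-mers, so very short tweets contribute nothing to the model under this sketch.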

Important files

  • main.py: Contains code to generate sentences given a list of Twitter handles at the command line.
  • model_generator.py: Contains functions to generate the Markov model for a user. This includes getting tweets from a file, extracting K-mers, forming the model, and determining next words given the current K-1 words.
  • model_test.py: Contains functions to generate sentences from a model and test their similarity to the original tweets. When run as the driver program, this file defaults to determining sentence accuracy.
  • twitter_extractor.py: Contains functions to connect to the Twitter API and extract tweets for one or more users.
  • comparison.py: Contains functions to compare words/sentences for quantitative analysis.
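
The model-building and next-word steps described for model_generator.py can be sketched roughly as follows. This is an illustration under stated assumptions, not the project's actual code: the names build_model, next_word, and generate are hypothetical, tweets are assumed to be plain strings split on whitespace, and the next word is assumed to be sampled in proportion to how often it followed the current K-1 words.

```python
import random
from collections import Counter, defaultdict


def build_model(tweets, k):
    """Map each (k-1)-word context to a Counter of the words that follow it."""
    model = defaultdict(Counter)
    for tweet in tweets:
        words = tweet.split()
        for i in range(len(words) - k + 1):
            context = tuple(words[i:i + k - 1])
            model[context][words[i + k - 1]] += 1
    return model


def next_word(model, context, rng=random):
    """Sample a follower of `context`, weighted by observed counts.

    Returns None when the context never appeared in the tweets.
    """
    counts = model.get(tuple(context))
    if not counts:
        return None
    choices, weights = zip(*counts.items())
    return rng.choices(choices, weights=weights, k=1)[0]


def generate(model, seed, max_words=20, rng=random):
    """Extend `seed` (a tuple of k-1 words) until a dead end or max_words."""
    words = list(seed)
    n = len(seed)
    while len(words) < max_words:
        word = next_word(model, words[-n:], rng)
        if word is None:
            break
        words.append(word)
    return " ".join(words)


# Made-up sample data; real input would come from a file in the data directory.
sample_tweets = ["the cat sat on the mat", "the cat ran to the door"]
sample_model = build_model(sample_tweets, 3)
print(generate(sample_model, ("the", "cat")))
```

Storing counts rather than probabilities keeps the model easy to update as more tweets are added; the weights are normalized only at sampling time.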

Dependencies

numpy, scipy, Tweepy, NLTK
