Bittooth

A senior project about Bitcoin value prediction.

Description

This project tries to build a model that predicts the value of Bitcoin over a given period of time from both statistical analysis and Twitter sentiment. Several different models are designed and tested to see which gives the best results.

The diagram below shows our project structure. Each node corresponds to one notebook.

[Figure: Bittooth data-flow diagram]

Installation

There are two installation methods: with or without a virtual environment. You only need to install with a virtual environment if you are a developer of the project.

Our project uses DVC to version-control our data, with Google Drive as the remote. See the "Pulling data with DVC" section for more information on pulling data.

Install without virtual environment

  1. Install Twint.

    pip install --upgrade git+https://github.com/kevctae/twint.git
  2. Install dependencies from requirements.txt.

    pip install -r ./setup/requirements.txt
  3. Run Jupyter Notebook.

    jupyter notebook

Install with virtual environment (Development)

  1. Install pyenv.
  2. Install pyenv-virtualenv.
  3. Run the setup script for your OS:
    • (macOS) Make setup-mac.sh executable and run the script.

      cd ./setup
      chmod +x ./setup-mac.sh
      ./setup-mac.sh

Pulling data with DVC

To pull data from Google Drive, you first need permission from kevctae to access the Drive. Once you have permission, and with DVC installed, you can pull the data with:

dvc pull

DVC will ask for a verification code on the first pull. Follow the provided URL and log in to your Google account.

Acknowledgement

Authors

Contributors

kevctae, julian-kota-kikuchi, chayisara, nuttasetsk

bittooth's Issues

Feedback

Main feedback: great work!! I especially liked slide 10 on the pre- versus post-preprocessing contrast, the illustrations of the sentiment score as well as of updating the sentiment lexicon, and the over-time plots. This is definitely in good shape already, so no pressure to do a ton more, but here are a few comments on small things:

  • Money features: as discussed, I would try to check robustness (if time permits) by extracting monetary amounts with either a regex on dollar symbols or a currency-extraction tool, since I wonder if the high correlation on slide 16 is an artifact of the filtering to the range specified on slide 15.
  • Slide 18: based on this graph, it seems there's a high frequency of tweets that (1) get zero engagement and (2) get weighted to zero, so they are effectively dropped from the prediction model. I think this makes sense, but to be clear, the analytic sample is then tweets with a non-zero amount of viewer engagement.
  • Slide 19: an equation for the bitcoin differential value would be helpful (as someone unfamiliar with this topic: is it the difference between the open and close price on day t, the close price on day t versus the close price on day t+1, etc.?).
  • Slide 27: maybe illustrate which months are in training, validation, and test (or days/weeks, depending on the unit of time used to split).
  • I know we rushed through the last set of slides, but related to the above point about the dependent variable, it seems like you could: (1) use the return as the DV/Y even in the ML model, since it seems better to predict day-to-day changes than the price on a fixed day, given the high correlation; (2) use the sentiment on day t (so not the change in sentiment, but whether sentiment on March 8, 2021 predicts the change between the closing price on March 8, 2021 and the closing price on March 9, 2021).
  • Tweet timestamps, time zones, and market open/close: writing out the above comment made me realize that if, say, the markets close at 4 pm EST, you'd want the day $t$ tweets to be posted before 4 pm EST, with everything after market close going into the next day. So when merging the tweets with the price data, make sure the tweets are attached to pre-market-close hours.
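
The last two comments can be sketched in code. The following is a minimal illustration, not the project's actual pipeline: it assumes a 4 pm cutoff in a fixed UTC-5 "EST" offset (ignoring daylight saving), and it assumes one possible definition of the return, r_t = (close_{t+1} - close_t) / close_t. The names `assign_day` and `next_day_return` are hypothetical and do not come from the notebooks.

```python
from datetime import date, datetime, time, timedelta, timezone
from typing import Dict, Optional

# Assumption: a fixed UTC-5 offset stands in for "EST"; daylight saving is ignored.
EST = timezone(timedelta(hours=-5))
# Assumption: the 4 pm cutoff is the example given in the feedback above.
MARKET_CLOSE = time(16, 0)


def assign_day(posted_at: datetime,
               tz: timezone = EST,
               close: time = MARKET_CLOSE) -> date:
    """Return the day t a tweet is attached to.

    Tweets posted before the cutoff count toward the same local day;
    tweets posted at or after the cutoff count toward the next day.
    """
    if posted_at.tzinfo is None:
        raise ValueError("posted_at must be timezone-aware")
    local = posted_at.astimezone(tz)
    if local.time() < close:
        return local.date()
    return local.date() + timedelta(days=1)


def next_day_return(close_by_day: Dict[date, float],
                    day: date) -> Optional[float]:
    """Relative change from the close on day t to the close on day t+1.

    Returns None when either closing price is missing.
    """
    following = day + timedelta(days=1)
    if day not in close_by_day or following not in close_by_day:
        return None
    return (close_by_day[following] - close_by_day[day]) / close_by_day[day]
```

With these two helpers, the sentiment aggregated over all tweets whose `assign_day` is day t would be paired with `next_day_return(close_by_day, t)` as the target. Since the cutoff and time zone are parameters, a different convention (for example a UTC midnight boundary) can be swapped in without changing the merge logic.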
