
Kryptoflow


Algorithmic crypto trading framework with Kafka and TensorFlow (Keras + TensorFlow Serving)

Documentation live at: https://carlomazzaferro.github.io/kryptoflow/index.html

Description

Implemented so far:

  1. Data ingestion pipeline: websocket streams from Twitter, Reddit, and GDAX, each saved to its own Kafka topic.
  2. Data transforms that model GDAX data (BTC-USD only, for now) as time series with a custom timespan, to which RNNs/LSTMs can be applied for time-series prediction.
  3. TensorFlow training and model persistence.
  4. TensorFlow Serving APIs exposed through a Flask endpoint. A few details are worth mentioning here: TensorFlow Serving actually runs inside a Docker container, and the Flask endpoint queries that container over gRPC.
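The transform in step 2 can be sketched roughly as follows. This is an illustrative example only; the function and field names are hypothetical and not kryptoflow's actual API. It shows how raw trade ticks might be bucketed into fixed-width time-series intervals:

```python
from typing import Dict, List


def to_time_series(ticks: List[Dict], span_seconds: int) -> List[Dict]:
    """Bucket raw (timestamp, price, volume) ticks into fixed-width
    intervals, averaging price and summing volume per bucket.

    Hypothetical sketch -- not kryptoflow's actual transform."""
    buckets: Dict[int, List[Dict]] = {}
    for tick in ticks:
        key = int(tick["timestamp"]) // span_seconds
        buckets.setdefault(key, []).append(tick)
    series = []
    for key in sorted(buckets):
        group = buckets[key]
        series.append({
            "start": key * span_seconds,
            "price": sum(t["price"] for t in group) / len(group),
            "volume": sum(t["volume"] for t in group),
        })
    return series
```

Fixed-width buckets like these are what make the data suitable as RNN/LSTM input, since the models expect evenly spaced steps.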

Installation and Infrastructure

Infrastructure: Kafka, TensorFlow, et al.

Spin up Kafka and related services (Zookeeper, kafka-ui, etc.):

docker-compose up

Data collection, without Docker (macOS, Ubuntu)

First, create a new virtual environment. This is highly recommended for managing dependencies.

Then, modify the resources/resources.json file and add your Twitter and Reddit API keys. This will enable you to programmatically access the Twitter and Reddit APIs.
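Reading those keys back in Python amounts to a plain JSON load. The nested schema shown in the test below (`twitter`/`reddit` objects) is an assumption for illustration; match it to whatever structure the actual resources.json uses:

```python
import json


def load_api_keys(path: str = "resources/resources.json") -> dict:
    """Load API credentials from the resources file.

    The file layout is assumed, not documented here -- adapt the
    lookup to the real resources.json structure."""
    with open(path) as f:
        return json.load(f)
```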

Dependencies

  1. Python 3.6 (recommended: install with Homebrew)
  2. librdkafka: brew install librdkafka
  3. Python requirements: pip install -r requirements.txt
  4. NLTK requirements: python3.6 -c "import nltk; nltk.download('vader_lexicon'); nltk.download('punkt')"
  5. Local install: pip install -e .
  6. Kafka IP: export KAFKA_SERVER_IP='kafka1'
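Code that connects to Kafka can pick up the `KAFKA_SERVER_IP` variable set in step 6, falling back to the `kafka1` hostname added to /etc/hosts below. A small helper along these lines (hypothetical, and assuming Kafka's conventional port 9092) keeps the address in one place:

```python
import os


def kafka_bootstrap(default_host: str = "kafka1", port: int = 9092) -> str:
    """Build a Kafka bootstrap address from the KAFKA_SERVER_IP
    environment variable. Port 9092 is Kafka's conventional default;
    adjust it if docker-compose maps the broker differently."""
    host = os.environ.get("KAFKA_SERVER_IP", default_host)
    return f"{host}:{port}"
```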

Finally, you may need to edit your /etc/hosts file to be able to connect to Kafka. Do so by running:

sudo nano /etc/hosts

and adding the following line to it:

127.0.0.1 localhost kafka1

Run scrapers

Then, run mkdir -p ~/tmp/logs/ && supervisord -c resources/supervisord.conf

This will run the three scripts below and take care of restarting them if/when they fail. Failures should be expected eventually, due to API downtime, malformed data that is not handled, and so on. Usually, however, the scripts are quite resilient and have run for weeks straight.
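If you run the scripts by hand instead of under supervisord, a simple retry wrapper approximates its autorestart behavior. This helper is a sketch and not part of kryptoflow:

```python
import time


def run_with_restarts(target, max_restarts: int = 5, backoff: float = 1.0):
    """Re-run `target` when it raises, with linear backoff between
    attempts -- a crude stand-in for supervisord's autorestart."""
    for attempt in range(1, max_restarts + 1):
        try:
            return target()
        except Exception as exc:  # API downtime, malformed payloads, etc.
            print(f"attempt {attempt} failed: {exc}")
            if attempt == max_restarts:
                raise
            time.sleep(backoff * attempt)
```

supervisord remains the better option for long-running collection, since it also handles logging and daemonization.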

Alternatively, run:

python3.6 kryptoflow/scrapers/reddit.py

python3.6 kryptoflow/scrapers/gdax_ws.py

python3.6 kryptoflow/scrapers/twitter.py

To verify that your data is being ingested appropriately, head to http://localhost:8000 for a visual UI of the Kafka topics being ingested.
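You can also check ingestion programmatically with the confluent-kafka client (installed via librdkafka above). The topic names in this sketch ("gdax", "reddit", "twitter") are assumptions; use whatever names the scrapers actually produce to:

```python
import os


def consumer_config(group_id: str = "kryptoflow-check") -> dict:
    """confluent-kafka consumer config pointed at the broker from
    KAFKA_SERVER_IP (port 9092 assumed)."""
    host = os.environ.get("KAFKA_SERVER_IP", "kafka1")
    return {
        "bootstrap.servers": f"{host}:9092",
        "group.id": group_id,
        "auto.offset.reset": "earliest",
    }


if __name__ == "__main__":
    # Requires a running broker; topic names here are assumptions.
    from confluent_kafka import Consumer
    consumer = Consumer(consumer_config())
    consumer.subscribe(["gdax", "reddit", "twitter"])
    for _ in range(10):
        msg = consumer.poll(timeout=5.0)
        if msg is not None and msg.error() is None:
            print(msg.topic(), msg.value()[:80])
    consumer.close()
```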

Webapp

The scope of the project is to create a framework. As such, I found it tremendously useful (and a great learning experience) to add a visualization layer that makes it easy to inspect the collected data as well as the performance of the models. A bare-bones ember.js app (note: still in development) can be accessed by running:

python -m kryptoflow.serving.app -vv

Then, head to http://localhost:5000 and click on 'Charts'. To retrieve historical data, click on 'From Server'.


API

The backend is written in Flask. It provides RESTful access to historical data as well as to the prediction service. The latter works only if the steps outlined in the Analysis section have been followed. The API was built with Flask-RESTPlus, which offers automated documentation with Swagger.

Head to http://0.0.0.0:5000/tf_api/#!/historical_data to check it out.


Analysis

Check out the Keras training notebook for training and model-storing instructions.

Deploy with Tensorflow Serving

Build the Docker container with Bazel, TensorFlow Serving, and all the dependencies required to serve the model. This may take some time (~40 minutes):

bash kryptoflow/serving/build_server.sh

Then, copy the stored models to the container:

bash kryptoflow/serving/serve.sh 1

The 1 indicates the model version number. Check the saved_models/ directory for the available stored models. These are saved automatically when the class kryptoflow.ml.export.ModelExporter is instantiated. See the notebook for more info on how to store the models themselves.
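Since exports land in numbered subdirectories of saved_models/, a small helper (hypothetical, not part of kryptoflow) can pick the latest version to pass to serve.sh:

```python
from pathlib import Path


def latest_model_version(saved_models: str = "saved_models") -> int:
    """Return the highest integer-named subdirectory under
    saved_models/, i.e. the most recently exported model version.
    Assumes one integer-named folder per export, as described above."""
    versions = [int(p.name) for p in Path(saved_models).iterdir()
                if p.is_dir() and p.name.isdigit()]
    if not versions:
        raise FileNotFoundError(f"no model versions in {saved_models}/")
    return max(versions)
```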

License

GNU General Public License v3.0

