Giter Club home page Giter Club logo

beatcoin's Introduction

B[ea]tcoin

A real-time trading tool for crypto-traders

Description

In the last 5 years, cryptocurrencies have become a new form of financial assets designed to work as a medium of exchange. Because of the popularity of these new currencies, as well as the features and advantages they provide, many different cryptocurrency exchanges or markets have established with traders being widespread all around the world. The objective of B[ea]coin is to be a centralized hub of real-time data from different cryptocurrency exchanges, such that traders are provided with information for best pricing and trades in order to beat the markets.

Markets and data sources

B[ea]tcoin currently works with four of the biggest cryptocurrency exchanges in the world:

Market
Coinbase (or GDAX) coinbase
Bitfinex bitfinex
CEX-IO cexio
OkCoin okcoin

Each of these markets provide a real-time API for users to have access to up-to-date order book and latest trading information through a websocket connection. B[ea]tcoin connects to all the APIs and collects the data from each market. Similarly, B[ea]tcoin only provides information for only Bitcoin and Ethereum to US Dollars trades (BTC-USD and ETH-USD).

Data Pipeline

The data pipeline implemented on the back-end of B[ea]tcoin is shown in the following diagram. Different APIs from the markets are ingested by several Kafka producers and categorized into different queues according to its nature ("buy order" or "sell order" depending on "BTC-USD" or "ETH-USD"). Next, Spark Streaming consumes all queues from Kafka and processes each message in a distributed manner to speed up execution time, and updates a cache-based Redis database that keeps the open book for each market as well as an aggregated open book. The front-end is served by Flask where queries from users (i.e., traders) about specific markets or a combination of them may be selected. pipeline

Set up instructions

This code runs on Amazon AWS EC2 servers.

Installation

The code is executed in 4 different EC2 clusters:

  1. ingestion-cluster that connects to all markets' APIs and sends mesages to the Kafka producers. Needs: Zookeeper, Kafka, python3
  2. processing-cluster that collects all messages with Spark Streaming and processes them to maintain an up-to-date order book from each market. Needs: Hadoop, Spark, Spark Streaming, python3
  3. db-stream-cluster keeps the open books Redis database which is constantly changing depending on the amount of activity at each market. Needs: Redis, python3
  4. web-server works as the server for the front-end website. Needs: python3 (with Flask, Tornado)

Deployment

To start B[ea]tcoin, there are several steps that need to be executed, however they have been easily packed into three shell scripts for the user to deploy it.

  1. On the root ~ directory of processing-cluster, execute the following shell script: $ ./src/startJob.sh. This will start submit a Spark job into the scheduler to start consuming data from the Kafka broker. Note: IP directions for the different clusters should be added into files db-stream-ip.txt, ingestion-ip.txt and processing-ip.txt located in ~/src/ directory.
  2. On the root ~ directory of ingestion-cluster, execute the following shell script: $ ./src/startMarkets.sh. This will start the API for every market and for every product (BTC and ETH) and send every message to the specific Kafka topic.
  3. On the root ~ directory of web-server, execute the following line to start the website: $ sudo -E python3 tornadoapp.py. The website will be available at www.beatcoin.services. Note: IP direction for the Redis database should be added into file db-stream-ip.txt located in ~/src/app/ directory.

Testing

The complete back-end was tested to analyze how many messages it could process, and on average, 5,000 messages (total and balanced for every topic) are successfully processed every second with no scheduling delays. This benchmark was run with a cluster consisting of 1 master and 3 worker nodes on m4.large EC2 instances.

Demo

The presentation is available here. The video for demo is available here.


Gustavo Tapia. Insight Data Engineering - NYC Fall 2017

beatcoin's People

Contributors

gustapia06 avatar

Watchers

James Cloos avatar  avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.