Giter Club home page Giter Club logo

talaria's Introduction

Talaria

TalariaDB is a distributed, highly available, and low latency time-series database that stores real-time data. It's built on top of Badger DB.
Blog: https://engineering.grab.com/big-data-real-time-presto-talariadb

About TalariaDB

In Grab, millions and millions of transactions and connections take place every day on our platform, which requires data-driven decision making. And these decisions need to be made based on real-time data. For example, an experiment might inadvertently cause a significant increase of waiting time for riders.

To overcome the challenge of retrieving information from large amounts of data, we designed and built TalariaDB. It addresses our need to query at least 2-3 terabytes of data per hour with predictable low query latency and low cost. Most importantly, it plays very nicely with the different tools’ ecosystems and lets us query data using SQL.

Architecture

alt text

The diagram above shows how TalariaDB ingests and serves data.

  • The upstream ETL pipeline prepares the ORC files and store them in the S3 bucket as input.
  • AWS SQS is created as the event notification service of the S3 bucket and notifies TalariaDB of any new object uploaded on S3.
  • Presto is connected to TalariaDB through Thrift since it implements PrestoThriftService https://prestodb.github.io/docs/current/connector/thrift.html.
  • AWS Route53 is used as DNS web service of TalariaDB, whose domain is registered on Presto cluster. The IP addresses of the DNS record is registered by TalariaDB using Gossip protocol in bootstrap phase.

Currently this project is currently highly coupled with AWS services like SQS, S3 and Route53. We will make these components (storage, DNS) pluggable and make TalariaDB useful for more generic case.

Quick Start

Preconditions

  • setup AWS profile and make sure your machine is accessible to AWS and has enough permission to read from S3, SQS and manipulate Route53 records.

Steps

  1. Set env vars
export X_TALARIA_CONF=(path-to-this-repo)/config-ci.json
  1. Edit config-ci.json with your own configurations (including AWS Route53 and SQS configs)
  2. Start application
go run (path-to-this-repo)/main.go

About Us

TalariaDB is maintained by:

License

TalariaDB is licensed under the --- (LICENSE.md)

talaria's People

Contributors

a9kitkumarsinha avatar cr-phang avatar wangbeyond avatar

Watchers

 avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.