rs-ingest

Ingestion library written in Rust for Futurenet

This package provides primitives for building custom ingestion engines on Futurenet. It is inspired by Stellar's go/ingest package.

Often, developers either need ingestion features that are outside of Horizon's scope, or need higher availability for the data. For example, a protocol's frontend might need to ingest events into its own database with very specific filtering or in large volumes, or it might need to replay history to check whether it missed some events.

This crate is being designed with these needs in mind, and it works on Futurenet!

Note: You can also use this crate for pubnet; see the example.

Note: This crate is still a work in progress. The current capabilities of the crate are limited.

Note: Currently only POSIX systems are supported.

Features

Currently, you can both replay history and run online. Running online does not currently support starting by replaying history from a given ledger.

Note that the current implementation is experimental and does not cover everything an ingestion crate should provide, including but not limited to failsafe mechanisms, archiver interaction, custom toml configuration, readers, and an overall more optimized codebase.

Running offline

Running offline means being able to replay history through a catchup. Replaying history will enable you to process everything that has happened on the network within the specified bounded range.

rs-ingest allows you to run offline in two modes:

  • single-thread
  • multi-thread

Single-thread mode

Running single-thread mode is the most straightforward way of using rs-ingest. This mode waits for the core subprocess to finish catching up and then lets you retrieve the ledger(s) metadata.

Running single-thread mode will store every ledger meta.

Multi-thread mode

Running multi-thread mode is also pretty simple, but it returns a Receiver<MetaResult> object that receives each new ledger meta (already decoded) as soon as it is emitted.

When you run multi-thread mode, you are in charge of how to store the metadata or any object derived from it.

When running multi-thread mode you also need to call the closing mechanism manually.
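
The consumer side of this pattern can be sketched with the standard library alone. This is a minimal illustration, not rs-ingest's API: the MetaResult struct below is a hypothetical stand-in with a single made-up field, and start_capture and collect_sequences are invented names that only mimic a producer thread feeding a Receiver.

```rust
use std::sync::mpsc::{channel, Receiver};
use std::thread;

/// Hypothetical stand-in for rs-ingest's `MetaResult`.
/// The real type carries decoded ledger close meta; this one only has a sequence number.
#[derive(Debug)]
struct MetaResult {
    ledger_seq: u32,
}

/// Hypothetical producer: spawns a thread that emits one item per ledger in `from..=to`.
fn start_capture(from: u32, to: u32) -> Receiver<MetaResult> {
    let (tx, rx) = channel();
    thread::spawn(move || {
        for seq in from..=to {
            // Stop early if the consumer dropped its receiver.
            if tx.send(MetaResult { ledger_seq: seq }).is_err() {
                break;
            }
        }
        // `tx` is dropped here, which ends the consumer's iteration.
    });
    rx
}

/// Consumer: the implementor decides what to keep. Here, only the sequence numbers.
fn collect_sequences(rx: Receiver<MetaResult>) -> Vec<u32> {
    rx.iter().map(|meta| meta.ledger_seq).collect()
}

fn main() {
    let rx = start_capture(100, 104);
    let seqs = collect_sequences(rx);
    println!("processed ledgers: {:?}", seqs);
    // With rs-ingest, this is the point where you would call the closing mechanism.
}
```

Because the receiver yields items as they arrive, the storage decision (database write, filter, aggregate) lives entirely in the consumer function.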

Running online

Running online means being able to sync with Futurenet and close ledgers, and thus receive ledger close meta. This mode is better suited to building on real-time data (for example, event streaming).

Multi-thread mode

Running online can only be done in multi-thread mode. You'll receive a Receiver<MetaResult> object which receives ledger close meta as stellar-core closes ledgers.

When running multi-thread mode you also need to call the closing mechanism manually.

Closing mechanism

rs-ingest has a closing mechanism that deletes temporary buckets created during execution and clears the objects. Closing is important before re-initiating an action on rs-ingest. When running single-thread mode, the closing mechanism is triggered within rs-ingest modules. When running multi-thread mode, however, the implementor must decide when to trigger it.

Try it out

The crate is a WIP, but you can already start playing around with the features it currently offers. For example, check out the examples.

The crate is available on crates.io:

ingest = "0.0.3"

stellar-core setup

Before using the crate, you need the stellar-core executable. To install the currently futurenet-compatible core:

git clone https://github.com/stellar/stellar-core

cd stellar-core

git checkout v20.0.0rc1

git submodule init

git submodule update

./autogen.sh

CXX=clang++-12 ./configure --enable-next-protocol-version-unsafe-for-production

make

make install [this one might need root access on some machines]

Note: Depending on the machine, you might need a different compiler setting than CXX=clang++-12 when running ./configure.

Learn

Check out LEARN.md

rs-ingest's Issues

Option to have bounded channels for multi-thread mode

Currently, the runner creates unbounded channels with std::sync::mpsc::channel(). Because the channel has an infinite buffer, every send op from the bufreader goes through immediately, which might cause overloads if the data streamed from core overloads the decoder. While this might be what most users look for, adding the option for bounded channels might help in some situations to prevent excessive memory usage.
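
For reference, the standard library already offers a bounded counterpart, std::sync::mpsc::sync_channel, whose send blocks once the buffer is full. The sketch below only illustrates that behaviour with plain u32 values. The function names are made up for this example, and nothing here reflects how rs-ingest would actually expose the option.

```rust
use std::sync::mpsc::{sync_channel, TrySendError};
use std::thread;

/// Counts how many non-blocking sends a bounded channel accepts while nobody consumes.
fn sends_before_full(capacity: usize) -> usize {
    // `_rx` is kept alive so the channel stays connected.
    let (tx, _rx) = sync_channel::<u32>(capacity);
    let mut accepted = 0;
    loop {
        match tx.try_send(accepted as u32) {
            Ok(()) => accepted += 1,
            // The buffer is full: an unbounded channel would have kept growing instead.
            Err(TrySendError::Full(_)) => return accepted,
            Err(TrySendError::Disconnected(_)) => return accepted,
        }
    }
}

/// Producer and consumer on a bounded channel: `send` blocks whenever the buffer
/// is full, so the producer can never run more than `capacity` items ahead.
fn stream_with_backpressure(capacity: usize, items: u32) -> Vec<u32> {
    let (tx, rx) = sync_channel::<u32>(capacity);
    let producer = thread::spawn(move || {
        for i in 0..items {
            tx.send(i).expect("receiver is alive");
        }
        // `tx` is dropped here, which ends the consumer's iteration.
    });
    let received: Vec<u32> = rx.iter().collect();
    producer.join().expect("producer panicked");
    received
}

fn main() {
    println!("accepted before full: {}", sends_before_full(4));
    println!("received: {:?}", stream_with_backpressure(2, 10));
}
```

The trade-off is that a slow consumer now stalls the producer instead of letting memory grow, which is why this makes more sense as an opt-in than as the default.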

Workaround or document catchup result streamed meta

When running catchup in multi-thread mode, users might get an error inside the MetaResult wrapper, because after catching up it tries to decode the catchup result. The most straightforward way to approach this is to document the behaviour and warn users not to unwrap the options in MetaResult without checking them, since doing so will always result in a panic when running catchup in multi-thread mode.
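
As an illustration of checked access, here is a minimal sketch. The field names ledger_seq and err are assumptions made for this example, and the struct is a hypothetical stand-in, not the crate's actual MetaResult layout. The point is only that matching on the options avoids the panic that an unchecked unwrap() would cause on the final, undecodable item.

```rust
/// Hypothetical stand-in for rs-ingest's `MetaResult`; the field names are assumptions.
#[derive(Debug)]
struct MetaResult {
    /// Stand-in for the decoded ledger close meta.
    ledger_seq: Option<u32>,
    /// Stand-in for a decoding error.
    err: Option<String>,
}

/// Checked access: never unwraps, so an item that failed to decode
/// becomes an `Err` value instead of a panic.
fn ledger_seq_or_error(result: &MetaResult) -> Result<u32, String> {
    match (&result.ledger_seq, &result.err) {
        (Some(seq), _) => Ok(*seq),
        (None, Some(e)) => Err(e.clone()),
        (None, None) => Err("empty result".to_string()),
    }
}

fn main() {
    let decoded = MetaResult { ledger_seq: Some(42), err: None };
    let trailing = MetaResult { ledger_seq: None, err: Some("could not decode".to_string()) };

    for item in [&decoded, &trailing] {
        match ledger_seq_or_error(item) {
            Ok(seq) => println!("ledger {}", seq),
            Err(e) => println!("skipping item: {}", e),
        }
    }
}
```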

Track and document pubnet compatibility

rs-ingest currently seems to work well on pubnet too, and a new flag in IngestionConfig has been added to specify whether to run on mainnet or futurenet.

That said, I haven't had the chance yet to see how it goes when running online or going through big chunks of history. We should also document pubnet compatibility and add examples, possibly similar to the already-existing ones on stellar/go/ingest.

Add closing mechanism for both single-thread and multi-thread.

When running single-threaded the closing mechanism (which is not yet implemented) could be within the runner core module. However, when running multi-threaded the closing mechanism can't be triggered by rs-ingest modules since it would halt execution.

The best way forward seems to be to provide a handle users should call to initiate closing.
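
One possible shape for such a handle, sketched with the standard library only: CloseHandle, start, and the stop flag are all hypothetical names invented for this example, and the worker thread merely stands in for the core subprocess. The real cleanup (deleting temporary buckets) is only marked by a comment.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::mpsc::{channel, Receiver};
use std::sync::Arc;
use std::thread::{self, JoinHandle};
use std::time::Duration;

/// Hypothetical handle given to the user next to the receiver.
struct CloseHandle {
    stop: Arc<AtomicBool>,
    worker: Option<JoinHandle<()>>,
}

impl CloseHandle {
    /// Signals the worker to stop and waits for it to finish.
    /// Returns `true` if this call performed the shutdown, `false` if already closed.
    fn close(&mut self) -> bool {
        self.stop.store(true, Ordering::SeqCst);
        match self.worker.take() {
            Some(worker) => {
                worker.join().expect("worker panicked");
                true
            }
            None => false,
        }
    }
}

/// Hypothetical runner: streams increasing numbers until asked to stop.
fn start() -> (Receiver<u32>, CloseHandle) {
    let stop = Arc::new(AtomicBool::new(false));
    let stop_flag = Arc::clone(&stop);
    let (tx, rx) = channel();
    let worker = thread::spawn(move || {
        let mut seq = 0u32;
        while !stop_flag.load(Ordering::SeqCst) {
            if tx.send(seq).is_err() {
                break; // receiver dropped
            }
            seq += 1;
            thread::sleep(Duration::from_millis(1));
        }
        // Cleanup of temporary state would go here.
    });
    (rx, CloseHandle { stop, worker: Some(worker) })
}

fn main() {
    let (rx, mut handle) = start();
    let first = rx.recv().expect("worker sends at least one item");
    println!("first item: {}", first);
    handle.close();
    // The sender is gone after close, so draining the buffer terminates.
    let leftover = rx.iter().count();
    println!("drained {} buffered items after close", leftover);
}
```

Keeping the shutdown in the user's hands means the library never halts execution on its own, which matches the concern raised above.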
