
scrolls's Introduction

Scrolls Logo

Read-optimized cache of Cardano on-chain entities


Intro

Scrolls is a tool for building and maintaining read-optimized collections of Cardano's on-chain entities. It crawls the history of the chain and aggregates all data to reflect the current state of affairs. Once the whole history has been processed, Scrolls watches the tip of the chain to keep the collections up-to-date.

Examples of collections are: "utxo by address", "chain parameters by epoch", "pool metadata by pool id", "tx cbor by hash", etc.

In other words, Scrolls is just a map-reduce algorithm that aggregates the history of the chain into use-case-specific, key-value dictionaries.
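To make the idea concrete, here is a toy sketch (hypothetical data shapes, not Scrolls' actual code) of reducing a stream of transactions into a "tx count by address" dictionary:

```python
# Toy map-reduce: map each tx to the addresses it touches,
# reduce into a per-address count dictionary.
from collections import defaultdict

def reduce_tx_count(txs):
    counts = defaultdict(int)
    for tx in txs:
        # Count each tx at most once per address it touches.
        for addr in set(tx["inputs"]) | set(tx["outputs"]):
            counts[addr] += 1
    return dict(counts)

txs = [
    {"inputs": ["addr_a"], "outputs": ["addr_b"]},
    {"inputs": ["addr_b"], "outputs": ["addr_a", "addr_c"]},
]
print(reduce_tx_count(txs))  # {'addr_a': 2, 'addr_b': 2, 'addr_c': 1}
```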

⚠️ This tool is under heavy development. The library API, configuration schema and storage structure may vary drastically. Several important features are still missing. Use at your own peril.

Storage

Storage backends are "pluggable"; any key-value storage mechanism is a potential candidate. Our backend of preference (and, to be honest, the only one implemented so far) is Redis. It provides very high read throughput, can be shared across the network by multiple clients, and can be used in cluster mode for horizontal scaling.

We also understand that an in-memory db like Redis may be prohibitive for some use-cases where storage optimization is more important than read latency. The goal is to provide other backend options within the realm of NoSQL databases better suited for the latter scenarios.

About CRDTs

The persistence data model makes heavy use of CRDTs (conflict-free replicated data types) and idempotent calls, which provide benefits for write concurrency and rollback procedures.

For example, CRDTs allow us to rebuild the indexes by spawning several history readers that crawl on-chain data concurrently from different start positions. This provides a significant reduction in collection-building time. We call this approach "swarm mode".
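A toy illustration of why this works (not Scrolls' implementation): grow-only set merges are commutative, associative and idempotent, so readers crawling overlapping chain fragments converge to the same state regardless of the order in which their writes land, or whether a write lands twice.

```python
# Toy model of a grow-only set CRDT merge. The data values are made up
# for illustration; Scrolls' actual keys/values differ.
def merge(*replicas):
    """Merge any number of replica states via set union."""
    out = set()
    for replica in replicas:
        out |= replica
    return out

reader_1 = {"utxo_a", "utxo_b"}  # crawled one fragment of history
reader_2 = {"utxo_b", "utxo_c"}  # crawled another fragment (overlap is fine)

# Commutative: merge order doesn't matter.
assert merge(reader_1, reader_2) == merge(reader_2, reader_1)
# Idempotent: re-applying a replica's writes changes nothing.
assert merge(reader_1, reader_1, reader_2) == merge(reader_1, reader_2)
print(sorted(merge(reader_1, reader_2)))  # ['utxo_a', 'utxo_b', 'utxo_c']
```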

TODO: explain future plan to leverage CRDTs for rollback checkpoints.

Accessing the Data

Scrolls doesn't provide any custom client for accessing the data; it relies on the fact that the canonical clients of the selected backends are ubiquitous, battle-tested and relatively easy to use. By knowing the structure of the stored keys/values, a developer should be able to query the data directly from Redis.

TODO: reference specs for each key/value

Filtering

Not all data is important in every scenario. Every collection in Scrolls is disabled by default and needs to be opted into via configuration for it to be processed and stored. Some collections are more storage-hungry than others (e.g., "Block CBOR by Hash"); plan ahead to avoid resource exhaustion.

Within the scope of a particular collection, further filtering can be specified depending on the nature of the data being aggregated. For example, the "UTXOs by Address" collection can be filtered to only process UTXOs from a set of predetermined addresses.

TODO: Document filtering options per collection

How it Works

Scrolls is a pipeline that takes block data as input and outputs DB update commands. The stages involved in the pipeline are the following:

  • Source Stages: are in charge of pulling block data, either directly from a Cardano node (local or remote) or from some other source. The requirement is to have the raw CBOR as part of the payload.
  • Reducer Stages: are in charge of applying the map-reduce algorithm. They turn block data into CRDT commands that can later be merged with existing data. The map-reduce logic depends on the type of collection being built. Each reducer stage handles a single collection. Reducers can be enabled / disabled via configuration.
  • Storage Stages: receive the generic CRDT commands and turn them into DB-specific instructions that are then executed by the corresponding engine.
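The three stages above can be sketched as a simple generator pipeline. This is an illustrative toy, not Scrolls' actual Rust API; the stage shapes, the `SetAdd` command name and the data fields are all assumptions made for the example.

```python
# Toy three-stage pipeline: source -> reducer -> storage.
def source():
    """Source stage: yield raw block payloads (fields are made up)."""
    yield {"hash": "abc", "txs": [{"out_addr": "addr_x", "utxo": "abc:0"}]}

def reducer(blocks, enabled=True):
    """Reducer stage: map blocks into generic CRDT commands."""
    for block in blocks:
        if not enabled:  # reducers can be toggled via configuration
            continue
        for tx in block["txs"]:
            yield ("SetAdd", f"c1.{tx['out_addr']}", tx["utxo"])

def storage(commands):
    """Storage stage: translate CRDT commands into DB-specific writes."""
    db = {}
    for op, key, member in commands:
        if op == "SetAdd":  # would map to a Redis SADD in the real backend
            db.setdefault(key, set()).add(member)
    return db

print(storage(reducer(source())))  # {'c1.addr_x': {'abc:0'}}
```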

diagram

Feature Status

  • Collections
    • UTXOs by Address
    • Address by Tx Output
    • Tx CBOR by Hash
    • Tx Count by Address
    • Chain Point by Tx Hash
    • Balance by Address
    • Pool Id by Stake Address
    • Pool Metadata by Pool Id
    • Chain Parameters by Epoch
    • UTXOs by Asset
    • Block Hash by Tx Hash
    • Block Hashes by Epoch
    • Block Header by Block Hash
    • Tx Hashes by Block Hash
    • Ada Handle by Address
    • Address by Ada Handle
    • Block CBOR by Hash
    • Metadata by Tx Hash
    • Feature requests open
  • Data Sources
    • Node-to-Node ChainSync + Blockfetch
    • Node-to-Client ChainSync
    • Oura Kafka Topic
    • Raw-CBOR Block files
  • Storage Backend
    • Redis
    • MongoDB
    • Cassandra
    • AWS DynamoDB
    • GCP BigQuery
    • Firestore
    • Azure CosmoDB
    • Feature requests open
  • Filtering Options
    • By Input / Output Address
    • By Withdrawal Address
    • By Collateral Address
    • By Block Slot Bounds
    • By Metadata Label
    • By Mint Policy / Asset
    • By Pool

Testdrive

In the testdrive folder you'll find a minimal example that uses docker-compose to spin up a local Redis instance and a Scrolls daemon. You'll need Docker and docker-compose installed on your local machine. Run the following commands to start it:

cd testdrive
docker-compose up

You should see the logs of both Redis and Scrolls crawling the chain from a remote relay node. If you're familiar with Redis CLI, you can run the following commands to see the data being cached:

redis:6379> KEYS *
1) "c1.addr1qx0w02a2ez32tzh2wveu80nyml9hd50yp0udly07u5crl6x57nfgdzya4axrl8mfx450sxpyzskkl95sx5l7hcfw59psvu6ysx"
2) "c1.addr1qx68j552ywp6engr2s9xt7aawgpmr526krzt4mmzc8qe7p8qwjaawywglaawe74mwu726w49e8e0l9mexcwjk4kqm2tq5lmpd8"
3) "c1.addr1q90z7ujdyyct0jhcncrpv5ypzwytd3p7t0wv93anthmzvadjcq6ss65vaupzmy59dxj43lchlra0l482rh0qrw474snsgnq3df"
4) "c1.addr1w8vg4e5xdpad2jt0z775rt0alwmku3my2dmw8ezd884zvtssrq6fg"
5) "c1.addr1q9tj3tdhaxqyph568h7efh6h0f078m2pxyd0xgzq47htwe3vep55nfane06hggrc2gvnpdj4gcf26kzhkd3fs874hzhszja3lh"
6) "c1.addr1w8tqqyccvj7402zns2tea78d42etw520fzvf22zmyasjdtsv3e5rz"
redis:6379> SMEMBERS c1.addr1w8tqqyccvj7402zns2tea78d42etw520fzvf22zmyasjdtsv3e5rz
1) "2548228522837ea580bc55a3e6a09479deca499b5e7f3c08602a1f3191a178e7:20"
2) "04086c503512833c7a0c11fc85f7d0f0422db9d14b31275b3d4327c40c6fd73b:25"
redis:6379>

Once you're done with the testdrive, you can clean your environment by running:

docker-compose down

Installing

We currently provide the following ways to install Scrolls:

  • Using one of the pre-compiled binaries shared via GitHub Releases
  • Using the Docker image shared via GitHub Packages
  • By compiling from source code using the instructions provided in this README.

Configuration

This is an example configuration file:

# get data from a relay node
[source]
type = "N2N"
address = "relays-new.cardano-mainnet.iohk.io:3001"

# You can optionally enable enrichment (local db with transactions), this is needed for some reducers
[enrich]
type = "Sled"
db_path = "/opt/scrolls/sled_db"

# enable the "UTXO by Address" collection
[[reducers]]
type = "UtxoByAddress"
# you can optionally prefix the keys in the collection
key_prefix = "c1"
# you can optionally only process UTXO from a set of predetermined addresses
filter = ["addr1qy8jecz3nal788f8t2zy6vj2l9ply3trpnkn2xuvv5rgu4m7y853av2nt8wc33agu3kuakvg0kaee0tfqhgelh2eeyyqgxmxw3"]

# enable the "Point by Tx" collection
[[reducers]]
type = "PointByTx"
key_prefix = "c2"

# store the collections in a local Redis
[storage]
type = "Redis"
connection_params = "redis://127.0.0.1:6379"

# start reading from an arbitrary point in the chain
[intersect]
type = "Point"
value = [57867490, "c491c5006192de2c55a95fb3544f60b96bd1665accaf2dfa2ab12fc7191f016b"]

# let Scrolls know that we're working with mainnet
[chain]
type = "Mainnet"

Compiling from Source

To compile from source, you'll need to have the Rust toolchain available on your development machine. Execute the following commands to clone and build the project:

git clone https://github.com/txpipe/scrolls.git
cd scrolls
cargo build

FAQ

Don't we have tools for this already?

Yes, we do. We have excellent tools such as Kupo, dcSpark's Carp or Db-Sync. Even the Cardano node itself might work as a source for some of the collections. Every tool is architected around a set of well-understood trade-offs. We believe Scrolls makes sense as an addition to the list because it assumes a particular set of trade-offs:

  • network storage over local storage: Scrolls makes sense if you have multiple distributed clients working in a private network that want to connect to the same data instance.
  • read latency over data normalization: Scrolls works well when you need to answer simple questions, like a lookup table. It won't work if you need to create joins or complex relational queries.
  • data cache over data source: Scrolls aims at being a "cache" of data, not the actual source of data. It has an emphasis on easy and fast reconstruction of the collections. It promotes workflows where the data is wiped and rebuilt from scratch whenever the use-case requires (such as adding / removing filters).
  • Rust over Haskell: this is not a statement about the languages; both are great, each with its own set of trade-offs. Since most of the Cardano ecosystem is written in Haskell, we opt for Rust as a way to broaden the reach to a different community of developers (such as the authors of this tool). Scrolls is extensible; it can be used as a library in Rust projects to create custom cache collections.
  • bring your own db: storage mechanisms in Scrolls are pluggable; our goal is to provide a tool that plays nice with existing infrastructure. The trade-off is that you end up with more moving parts.

How does this tool compare to Oura?

There's some overlap between Oura and Scrolls. Both tools read on-chain data and output some data results. The main difference is that Oura is meant to react to events, to watch the chain and actuate upon certain patterns. In contrast, Scrolls is meant to provide a snapshot of the current state of the chain by aggregating the whole history.

They were built to work well together. For example, say you're building an app that uses Oura to process transaction data; you could then integrate Scrolls as a way to look up the source address of a transaction's input.

How do I read the data using Python?

Assuming you're using Redis as the storage backend (the only one available ATM), we recommend using the redis-py package to talk directly to the Redis instance. This is a very simple code snippet to query the "UTXOs by Address" collection.

>>> import redis
>>> r = redis.Redis(host='localhost', port=6379, db=0)
>>> r.smembers("c1.addr1w8tqqyccvj7402zns2tea78d42etw520fzvf22zmyasjdtsv3e5rz")
{b'2548228522837ea580bc55a3e6a09479deca499b5e7f3c08602a1f3191a178e7:20', b'04086c503512833c7a0c11fc85f7d0f0422db9d14b31275b3d4327c40c6fd73b:25'}

The Redis operation being used is smembers, which returns the members of a set stored under a particular key. In this case, we query the key c1.addr1w8tqqyccvj7402zns2tea78d42etw520fzvf22zmyasjdtsv3e5rz, where c1 is the key prefix specified in the config for our particular collection and addr1w8tqqyccvj7402zns2tea78d42etw520fzvf22zmyasjdtsv3e5rz is the address we're interested in querying. The response from Redis is the list of UTXOs (in the format {tx-hash}:{output-index}) associated with that particular address.
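Since each member follows the {tx-hash}:{output-index} format, the raw bytes can be parsed into usable pairs. The helper below is our own illustration, not part of Scrolls or redis-py:

```python
# Parse the byte-string members returned by SMEMBERS into (tx_hash, index)
# tuples. Sample data taken from the testdrive example above.
members = {
    b"2548228522837ea580bc55a3e6a09479deca499b5e7f3c08602a1f3191a178e7:20",
    b"04086c503512833c7a0c11fc85f7d0f0422db9d14b31275b3d4327c40c6fd73b:25",
}

def parse_utxo_ref(member: bytes):
    """Split '{tx-hash}:{output-index}' on the last colon."""
    tx_hash, _, index = member.decode().rpartition(":")
    return tx_hash, int(index)

for tx_hash, index in sorted(parse_utxo_ref(m) for m in members):
    print(tx_hash, index)
```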

How do I read the data using NodeJS?

    import redis from "redis";
    let r = redis.createClient("redis://127.0.0.1:6379"); // Initialize a redis client
    r.on("ready", () => { //When redis client is ready, run stuff
        r.sMembers("c1.addr1qy8jecz3nal788f8t2zy6vj2l9ply3trpnkn2xuvv5rgu4m7y853av2nt8wc33agu3kuakvg0kaee0tfqhgelh2eeyyqgxmxw3")
        .then(console.log);
    })

The Redis operation being used is sMembers, which returns the members of a set stored under a particular key. In this case, we query the key c1.addr1qy8jecz3nal788f8t2zy6vj2l9ply3trpnkn2xuvv5rgu4m7y853av2nt8wc33agu3kuakvg0kaee0tfqhgelh2eeyyqgxmxw3, where c1 is the key prefix specified in the config for our particular collection and addr1qy8jecz3nal788f8t2zy6vj2l9ply3trpnkn2xuvv5rgu4m7y853av2nt8wc33agu3kuakvg0kaee0tfqhgelh2eeyyqgxmxw3 is the address we're interested in querying. The response from Redis is the list of UTXOs (in the format {tx-hash}:{output-index}) associated with that particular address.

What is "swarm mode"?

Swarm mode is a way to speed up rebuilding collections from scratch: the history of the chain is partitioned into smaller fragments, and each fragment is processed by a concurrent instance of the Scrolls daemon.
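As a rough sketch of the partitioning step (the actual fragment boundaries used by Scrolls may differ; this function is purely illustrative), splitting the history into contiguous slot ranges could look like this:

```python
# Hypothetical helper: split chain history [0, tip_slot) into `workers`
# contiguous slot ranges, one per concurrent Scrolls instance.
def partition_history(tip_slot: int, workers: int):
    step = tip_slot // workers
    bounds = [i * step for i in range(workers)] + [tip_slot]
    return list(zip(bounds[:-1], bounds[1:]))

print(partition_history(100, 4))  # [(0, 25), (25, 50), (50, 75), (75, 100)]
```

Because the persisted CRDT merges are idempotent and commutative, small overlaps between fragments would be harmless; the replicas still converge.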

swarm mode diagram

scrolls's People

Contributors

adrabenche, bejf, ggaabe, jamesbsmyth, jmhrpr, joacohoyos, martinschere, matiwinnetou, micahkendall, mkeen, omahs, pacman99, paulobressan, rvcas, scarmuega, sebastiengllmt


scrolls's Issues

Scrolls server crash on preview network

Hi
I have a problem with Scrolls (version 0.4.3) on the preview network. Scrolls is running from the Docker image and, after processing block 1702a60228ef0fd782d8428e8fe44297960f3355eea0b1eb1ce6e5818b2ea877, it stops. Logs below:

[2023-01-25T13:26:15Z INFO  scrolls::sources::n2n::transport] handshake output: Accepted(10, VersionData { network_magic: 2, initiator_and_responder_diffusion_mode: false })
[2023-01-25T13:26:15Z INFO  scrolls::sources::utils] found existing cursor in storage plugin: Specific(7995187, "1702a60228ef0fd782d8428e8fe44297960f3355eea0b1eb1ce6e5818b2ea877")
[2023-01-25T13:26:15Z INFO  scrolls::sources::n2n::chainsync] chain-sync intersection is (7995187, 1702a60228ef0fd782d8428e8fe44297960f3355eea0b1eb1ce6e5818b2ea877)
[2023-01-25T13:26:15Z INFO  scrolls::sources::n2n::chainsync] requesting next block
[2023-01-25T13:26:15Z INFO  scrolls::sources::n2n::chainsync] requesting next block
[2023-01-25T13:26:15Z WARN  scrolls::reducers::worker] rollback requested for (7995187, 1702a60228ef0fd782d8428e8fe44297960f3355eea0b1eb1ce6e5818b2ea877)
[2023-01-25T13:26:15Z INFO  scrolls::sources::n2n::chainsync] requesting next block
[2023-01-25T13:26:15Z ERROR gasket::runtime] STAGE: reducers, WORK, PANIC: cbor error: Invalid CBOR structure: unexpected type indefinite bytes at position 1831: expected bytes (definite length)
[2023-01-25T13:26:15Z INFO  scrolls::sources::n2n::chainsync] requesting next block
[2023-01-25T13:26:15Z INFO  scrolls::sources::n2n::chainsync] requesting next block
...
[2023-01-25T13:26:15Z INFO  scrolls::sources::n2n::chainsync] chain-sync reached the tip of the chain
[2023-01-25T13:26:15Z INFO  scrolls::sources::n2n::chainsync] awaiting next block (blocking)
[2023-01-25T13:26:16Z INFO  scrolls::daemon] Scrolls is stopping...
[2023-01-25T13:26:16Z WARN  scrolls::daemon] dismissing stage: n2n with state Alive(Working)
[2023-01-25T13:26:16Z WARN  scrolls::daemon] dismissing stage: enrich-skip with state Alive(Working)
[2023-01-25T13:26:16Z WARN  scrolls::daemon] dismissing stage: reducers with state Alive(StandBy)
[2023-01-25T13:26:16Z WARN  scrolls::daemon] dismissing stage: redis with state Alive(Working)

UTXOs by Asset / Policy ID

Thanks again for the amazing tool and documentation!

I'm looking to build a dashboard to track wallet distribution of specific Cardano native assets (ex: LQ) across multiple epochs.

Is this use-case feasible through Scrolls? Glancing at the documentation, "UTXOs by Asset" seems a close match (perhaps still under implementation?). Please advise when you get a chance!

bug: potential for inaccurate data due to missing some blocks when using N2N

There is currently an issue which means some blocks (random) are not processed when using N2N, meaning the accuracy of data cannot be guaranteed. When attempting to roll forward to a new block header while following the tip, the server may sometimes respond to the BlockFetch message with a MsgNoBlocks instead of the requested block data, but Scrolls does not handle this case and behaves as if it processed the block. This means that any actions that would have occurred due to the missed blocks will not be reflected in the data in Scrolls.

Setup mdBook documentation placeholder

Oura project uses mdBook to hold its documentation. We need to implement the same thing for Scrolls.

The goal of this issue is to setup mdBook in this repository as a placeholder for documentation that still needs to be written:

  • copy the github action workflow from Oura
  • initialize the required mdBook folder / files
  • copy the basic info from the README into the introduction of the mdBook

Writing the actual documentation is out of scope

bug: `TxCountByAddress` reducer incrementing multiple times per tx

TxCountByAddress seems to be a naive copy of BalanceByAddress: it increments the tx counter once per txin and txout matching the address, when it should probably increment the counter just once per tx where at least one txin or txout matches the address. Also, the default key prefix reads balance_by_address.

I'll fix.
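The difference between the two counting strategies can be illustrated with a toy model (hypothetical data shapes, not Scrolls' actual types):

```python
# Contrast the buggy per-input/output increment with the intended
# once-per-matching-tx count.
def count_buggy(txs, addr):
    """Increments once per txin/txout that matches the address."""
    return sum((tx["inputs"] + tx["outputs"]).count(addr) for tx in txs)

def count_fixed(txs, addr):
    """Increments once per tx where the address appears at all."""
    return sum(1 for tx in txs if addr in tx["inputs"] + tx["outputs"])

# One tx touching addr_a twice (one input, one output):
txs = [{"inputs": ["addr_a"], "outputs": ["addr_a", "addr_b"]}]
print(count_buggy(txs, "addr_a"), count_fixed(txs, "addr_a"))  # 2 1
```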

Help with example of working config for "Address by Ada Handle" reducer

Could anyone help me testing this config in preprod environment? Or maybe provide an example of a working one?

[source]
type = "N2N"
address = "preprod-node.world.dev.cardano.org:30000"

[[reducers]]
type = "AddressByAdaHandle"
key_prefix = "AddressByAdaHandle"
policy_id_hex = "f0ff48bbb7bbe9d59a40f1ce90e9e9d0ff5002ec48f232b49ca0fb9a"

[storage]
type = "Redis"
connection_params = "redis://127.0.0.1:6379/2"

[chain]
type = "PreProd"

[intersect]
type = "Point"
value = [8261225,"103ff3cfec6e388db803fc10dbdecb3026b4a51382008d01ff3f9121be6fa6e4"]

After minting this handle $myhandle:
image

I was expecting to see it under the key AddressByAdaHandle.$myhandle, in my redis logical db 2, but nothing shows up:

[2] > keys *
1) "_cursor"

I am using the binary from asset_holders_by_asset_id branch which already has the reducer implemented. FYI, I've built it with cargo build --release --locked --all-features && strip ./target/release/scrolls || true and I am calling like this: /usr/local/bin/scrolls daemon --config /path/to/config.toml. I also do del _cursor on redis before each test.


oura command to help debugging:

oura watch --since 8261225,103ff3cfec6e388db803fc10dbdecb3026b4a51382008d01ff3f9121be6fa6e4 preprod-node.world.dev.cardano.org:30000 --bearer tcp --magic preprod`

Thanks for looking!

[FR] reducers::transaction_count_by_contract_address

Here is an example for a contract jpg.store:

https://cardanoscan.io/address/addr1w999n67e86jn6xal07pzxtrmqynspgx0fwmcmpua4wc6yzsxpljz3

It would be nice to have a reducer where we can get the Transaction Count across all epochs for a given contract address, e.g.

addr1w999n67e86jn6xal07pzxtrmqynspgx0fwmcmpua4wc6yzsxpljz3 -> 3185960

Minor: when cardano-node not available reconnection worker takes 100% CPU

So I noticed that when there is an outage of cardano-node, Scrolls tries to reconnect (this is great), but the small issue is that this reconnection watchdog/worker in the Scrolls process consumes far more CPU than it should.

image

mati@kajka:~$ sudo journalctl  -f -u scrolls
Jun 25 13:36:24 kajka scrolls[234390]: [2023-06-25T11:36:24Z WARN  gasket::retries] retryable operation error: "network error: Connection refused (os error 111)"
Jun 25 13:36:28 kajka scrolls[234390]: [2023-06-25T11:36:28Z WARN  gasket::retries] retryable operation error: "network error: Connection refused (os error 111)"
Jun 25 13:36:36 kajka scrolls[234390]: [2023-06-25T11:36:36Z WARN  gasket::retries] retryable operation error: "network error: Connection refused (os error 111)"
Jun 25 13:36:52 kajka scrolls[234390]: [2023-06-25T11:36:52Z WARN  gasket::retries] retryable operation error: "network error: Connection refused (os error 111)"
Jun 25 13:37:24 kajka scrolls[234390]: [2023-06-25T11:37:24Z WARN  gasket::retries] retryable operation error: "network error: Connection refused (os error 111)"
Jun 25 13:38:24 kajka scrolls[234390]: [2023-06-25T11:38:24Z WARN  gasket::retries] retryable operation error: "network error: Connection refused (os error 111)"
Jun 25 13:39:24 kajka scrolls[234390]: [2023-06-25T11:39:24Z WARN  gasket::retries] retryable operation error: "network error: Connection refused (os error 111)"
Jun 25 13:40:24 kajka scrolls[234390]: [2023-06-25T11:40:24Z WARN  gasket::retries] retryable operation error: "network error: Connection refused (os error 111)"
Jun 25 13:41:24 kajka scrolls[234390]: [2023-06-25T11:41:24Z WARN  gasket::retries] retryable operation error: "network error: Connection refused (os error 111)"
Jun 25 13:42:24 kajka scrolls[234390]: [2023-06-25T11:42:24Z WARN  gasket::retries] retryable operation error: "network error: Connection refused (os error 111)"

ERROR gasket::runtime] STAGE: reducers, WORK, PANIC: variable-length uint error: variable-length uint overflow

STAGE: reducers, WORK, PANIC: variable-length uint error: variable-length uint overflow
Scrolls is stopping...

This happens while running in docker
any idea what is happening?

[2022-09-02T05:01:16Z INFO scrolls::storage::redis] new cursor saved to redis 65011840,d4a00c3ba23a44705fb69a22d0feeec7b47481cb9f92510bda6909faaf37772e

[2022-09-02T05:01:17Z INFO scrolls::storage::redis] new cursor saved to redis 65011944,f999af602ab467b8d8bd71662138aeec1578659f37c51dd555728b04d4093d07

[2022-09-02T05:01:17Z ERROR gasket::runtime] STAGE: reducers, WORK, PANIC: variable-length uint error: variable-length uint overflow

[2022-09-02T05:01:17Z INFO scrolls::daemon] Scrolls is stopping...

[2022-09-02T05:01:17Z WARN scrolls::daemon] dismissing stage: n2n-headers with state Alive(Working)

[2022-09-02T05:01:17Z WARN scrolls::daemon] dismissing stage: n2n-blocks with state Alive(Working)

[2022-09-02T05:01:17Z WARN scrolls::daemon] dismissing stage: enrich-skip with state Alive(Working)

[2022-09-02T05:01:17Z WARN scrolls::daemon] dismissing stage: reducers with state Alive(StandBy)

[2022-09-02T05:01:17Z WARN scrolls::daemon] dismissing stage: redis with state Alive(Working)

[2022-09-02T05:01:17Z ERROR gasket::runtime] STAGE: redis, WORK, RECV ERR: error receiving work unit through input port

[2022-09-02T05:01:17Z ERROR gasket::runtime] STAGE: enrich-skip, WORK, SEND ERR: error sending work unit through output port

[2022-09-02T05:01:17Z WARN gasket::runtime] STAGE: n2n-blocks, WORK, RESTART: downstream error while processing business logic error sending work unit through output port

[2022-09-02T05:01:17Z WARN gasket::runtime] STAGE: n2n-headers, WORK, RESTART: downstream error while processing business logic error sending work unit through output port

[2022-09-07T17:46:42Z INFO scrolls::daemon] scrolls is running...

[2022-09-07T17:46:42Z INFO scrolls::sources::n2n::transport] handshake output: Accepted(7, VersionData { network_magic: 764824073, initiator_and_responder_diffusion_mode: false })

[2022-09-07T17:46:42Z INFO scrolls::sources::n2n::transport] handshake output: Accepted(7, VersionData { network_magic: 764824073, initiator_and_responder_diffusion_mode: false })

[2022-09-07T17:46:42Z INFO scrolls::sources::utils] found existing cursor in storage plugin: Specific(65011944, "f999af602ab467b8d8bd71662138aeec1578659f37c51dd555728b04d4093d07")

[2022-09-07T17:46:42Z WARN scrolls::reducers::worker] rollback requested for (65011944, f999af602ab467b8d8bd71662138aeec1578659f37c51dd555728b04d4093d07)

[2022-09-07T17:46:42Z ERROR gasket::runtime] STAGE: reducers, WORK, PANIC: variable-length uint error: variable-length uint overflow

[2022-09-07T17:46:43Z INFO scrolls::daemon] Scrolls is stopping...

[2022-09-07T17:46:43Z WARN scrolls::daemon] dismissing stage: n2n-headers with state Alive(Working)

[2022-09-07T17:46:43Z WARN scrolls::daemon] dismissing stage: n2n-blocks with state Alive(Working)

[2022-09-07T17:46:43Z WARN scrolls::daemon] dismissing stage: enrich-skip with state Alive(Working)

[2022-09-07T17:46:43Z WARN scrolls::daemon] dismissing stage: reducers with state Alive(StandBy)

[2022-09-07T17:46:43Z WARN scrolls::daemon] dismissing stage: redis with state Alive(Working)

[2022-09-07T17:46:43Z ERROR gasket::runtime] STAGE: redis, WORK, RECV ERR: error receiving work unit through input port

[2022-09-07T17:46:43Z ERROR gasket::runtime] STAGE: enrich-skip, WORK, SEND ERR: error sending work unit through output port

[2022-09-07T17:46:43Z WARN gasket::runtime] STAGE: n2n-blocks, WORK, RESTART: downstream error while processing business logic error sending work unit through output port

[2022-09-07T17:46:43Z WARN gasket::runtime] STAGE: n2n-headers, WORK, RESTART: downstream error while processing business logic error sending work unit through output port

Question about utxo_by_address filtering

Hello,

As per the documentation, there is a todo "Document filtering options per collection".

I had a look at the code and from what I understood, it doesn't seem like address filtering is already implemented for that Reducer.

The relevant steps I identified where it could have been implemented are when:

  1. The block is reduced into the different TXs (reduce_byron/alonzo_tx).
  2. Each TX is transformed into CRDT (send_set_add) and sent for the storage step.

I saw nothing like an address filtering in that part of the code. The questions are then:

=> Is it implemented and I'm not looking at it the right way OR it is not implemented yet and you already have an idea of how it could be?

feat: allow splitting scrolls into multiple instances

In order to run scrolls in production, it is necessary to be able to run multiple sled instances on the same sled database.

Dependencies:

  1. Allow to configure redis _cursor name via config

  2. Separate the sled write path from Scrolls. We should have a command-line tool that keeps the Scrolls ingestion process separate from the Scrolls instance. The Scrolls instance should use the sled db only when reducers that need it are configured

[FR] Reducers::tvl_by_contract_address_and_epoch

Here is an example of TVL for a contract jpg.store:

https://cardanoscan.io/address/addr1w999n67e86jn6xal07pzxtrmqynspgx0fwmcmpua4wc6yzsxpljz3

It would be nice to have a reducer where we can get the data about TVL (total value locked) across a specific epoch, e.g.

addr1w999n67e86jn6xal07pzxtrmqynspgx0fwmcmpua4wc6yzsxpljz3.339 -> 445562 * 1000000 (lovelaces value)

scrolls CLI ignores --config flag

I am trying to run Scrolls without docker-compose, but the CLI ignores the config file passed through the --config flag and throws an error.

Steps to reproduce:

  • Create a daemon.toml file with required entries
  • Run the following command
./scrolls daemon --config daemon.toml

Error:

ERROR: ConfigError(
"missing field source",
)

[FR] Reducers::tvl_by_contract_address

Here is an example of TVL for a contract jpg.store:

https://cardanoscan.io/address/addr1w999n67e86jn6xal07pzxtrmqynspgx0fwmcmpua4wc6yzsxpljz3

It would be nice to have a reducer where we can get the data about TVL (total value locked) across all epochs.

addr1w999n67e86jn6xal07pzxtrmqynspgx0fwmcmpua4wc6yzsxpljz3 -> 445562 * 1000000 (lovelaces value)

Error when connecting to preview cardano node

Hi.
When I'm trying to sync with a local Cardano node for the preview network, I get the following error:

scrolls_1                 | [2022-12-01T14:20:27Z INFO  scrolls::sources::n2n::transport] handshake output: Accepted(10, VersionData { network_magic: 2, initiator_and_responder_diffusion_mode: false })
scrolls_1                 | [2022-12-01T14:20:27Z INFO  scrolls::sources::n2n::transport] handshake output: Accepted(10, VersionData { network_magic: 2, initiator_and_responder_diffusion_mode: false })
scrolls_1                 | [2022-12-01T14:20:27Z INFO  scrolls::sources::utils] found existing cursor in storage plugin: Specific(1363359, "33f1f63f10e82cf72e137b92e6d307be85e8a3ddd5b9131220734b82cd2974ae")
scrolls_1                 | [2022-12-01T14:20:27Z WARN  scrolls::reducers::worker] rollback requested for (1363359, 33f1f63f10e82cf72e137b92e6d307be85e8a3ddd5b9131220734b82cd2974ae)
scrolls_1                 | [2022-12-01T14:20:27Z ERROR gasket::runtime] STAGE: reducers, WORK, PANIC: cbor error: Invalid CBOR structure: decode error: bad cbor data type for plutus data
scrolls_1                 | [2022-12-01T14:20:28Z INFO  scrolls::daemon] Scrolls is stopping...

It seems to be related to https://discordapp.com/channels/826816523368005654/892793958097375282/1041704392778317875 as we had problems with other components as well.

It appears that hashing a PlutusScript does not lead to the same script hash.

  • there is a high chance that the PlutusScript hashing function has an issue:
https://github.com/txpipe/pallas/blob/main/pallas-primitives/src/alonzo/crypto.rs

or

  • the exact PlutusScript bytes differ from the Haskell implementation:

https://cardanoscan.io/address/addr1w999n67e86jn6xal07pzxtrmqynspgx0fwmcmpua4wc6yzsxpljz3

Contract Address: addr1w999n67e86jn6xal07pzxtrmqynspgx0fwmcmpua4wc6yzsxpljz3
Script Hash: 714a59ebd93ea53d1bbf7f82232c7b012700a0cf4bb78d879dabb1a20a

Error "missing utxo" using example .toml

Hi,

running Scrolls with the example .toml file (cut down slightly) on mainnet with reducer type = "UtxoByAddress" gives an error:

Note: this is from compiling from scratch, with Redis on the standard localhost and port (6379). The Docker image works as expected.

ERROR gasket::runtime] STAGE: reducers, WORK, PANIC: missing utxo: ba1950d0cf9e6a551215701f85a209a9a47fb64032efa14e21fcfbf37c125a9a#0
[2022-10-16T20:45:11Z INFO  scrolls::daemon] scrolls is running...
[2022-10-16T20:45:11Z INFO  scrolls::sources::n2n::transport] handshake output: Accepted(9, VersionData { network_magic: 764824073, initiator_and_responder_diffusion_mode: true })
[2022-10-16T20:45:11Z INFO  scrolls::sources::utils] no cursor found in storage plugin
[2022-10-16T20:45:11Z INFO  scrolls::sources::n2n::transport] handshake output: Accepted(9, VersionData { network_magic: 764824073, initiator_and_responder_diffusion_mode: true })
[2022-10-16T20:45:12Z WARN  scrolls::reducers::worker] rollback requested for (57867490, c491c5006192de2c55a95fb3544f60b96bd1665accaf2dfa2ab12fc7191f016b)
[2022-10-16T20:45:13Z ERROR gasket::runtime] STAGE: reducers, WORK, PANIC: missing utxo: ba1950d0cf9e6a551215701f85a209a9a47fb64032efa14e21fcfbf37c125a9a#0
[2022-10-16T20:45:14Z INFO  scrolls::daemon] Scrolls is stopping...
[2022-10-16T20:45:14Z WARN  scrolls::daemon] dismissing stage: n2n-headers with state Alive(Working)
[2022-10-16T20:45:14Z WARN  scrolls::daemon] dismissing stage: n2n-blocks with state Alive(Working)
[2022-10-16T20:45:14Z WARN  scrolls::daemon] dismissing stage: enrich-sled with state Alive(Working)
[2022-10-16T20:45:14Z WARN  scrolls::daemon] dismissing stage: reducers with state Alive(StandBy)
[2022-10-16T20:45:14Z WARN  scrolls::daemon] dismissing stage: redis with state Alive(Working)

.toml file:

[source]
type = "N2N"
address = "relays-new.cardano-mainnet.iohk.io:3001"

[enrich]
type = "Sled"
db_path = "/blah02/sled_db"

# enable the "UTXO by Address" collection
[[reducers]]
type = "UtxoByAddress"
# you can optionally prefix the keys in the collection
key_prefix = "c1"
# you can optionally only process UTXO from a set of predetermined addresses
filter = ["addr1qy8jecz3nal788f8t2zy6vj2l9ply3trpnkn2xuvv5rgu4m7y853av2nt8wc33agu3kuakvg0kaee0tfqhgelh2eeyyqg"]

[storage]
type = "Redis"
connection_params = "redis://127.0.0.1:6379"

[intersect]
type = "Point"
value = [57867490, "c491c5006192de2c55a95fb3544f60b96bd1665accaf2dfa2ab12fc7191f016b"]

[chain]
type = "Mainnet"

I have also tried a similar .toml file on the preprod testnet and get the same error. Am I doing something wrong, or missing something in the setup? Could anyone advise?

Scrolls does not resume when relay goes offline and back online

What I expect is that when the relay goes offline and then comes back online, scrolls continues operation. Instead I see error messages and processing stops:

Aug 06 11:25:58 kajka systemd-journald[259]: Suppressed 748117 messages from scrolls.service
Aug 06 11:26:28 kajka systemd-journald[259]: Suppressed 753118 messages from scrolls.service
Aug 06 11:26:58 kajka systemd-journald[259]: Suppressed 742835 messages from scrolls.service
Aug 06 11:27:28 kajka systemd-journald[259]: Suppressed 746538 messages from scrolls.service
Aug 06 11:27:58 kajka systemd-journald[259]: Suppressed 747329 messages from scrolls.service
Aug 06 11:28:28 kajka systemd-journald[259]: Suppressed 749589 messages from scrolls.service
Aug 06 11:28:58 kajka systemd-journald[259]: Suppressed 749292 messages from scrolls.service
Aug 06 11:29:28 kajka systemd-journald[259]: Suppressed 736017 messages from scrolls.service

Aug 06 11:53:59 kajka scrolls[38140]: [2022-08-06T09:53:59Z ERROR gasket::runtime] error that requires stage to restart: channel error communicating with multiplexer: channel is not connected, failed to send payload
Aug 06 11:53:59 kajka scrolls[38140]: [2022-08-06T09:53:59Z ERROR gasket::runtime] error that requires stage to restart: channel error communicating with multiplexer: channel is not connected, failed to send payload
Aug 06 11:53:59 kajka scrolls[38140]: [2022-08-06T09:53:59Z ERROR gasket::runtime] error that requires stage to restart: channel error communicating with multiplexer: channel is not connected, failed to send payload
Aug 06 11:53:59 kajka scrolls[38140]: [2022-08-06T09:53:59Z ERROR gasket::runtime] error that requires stage to restart: channel error communicating with multiplexer: channel is not connected, failed to send payload
Aug 06 11:53:59 kajka scrolls[38140]: [2022-08-06T09:53:59Z ERROR gasket::runtime] error that requires stage to restart: channel error communicating with multiplexer: channel is not connected, failed to send payload
Aug 06 11:53:59 kajka scrolls[38140]: [2022-08-06T09:53:59Z ERROR gasket::runtime] error that requires stage to restart: channel error communicating with multiplexer: channel is not connected, failed to send payload
Aug 06 11:53:59 kajka scrolls[38140]: [2022-08-06T09:53:59Z ERROR gasket::runtime] error that requires stage to restart: channel error communicating with multiplexer: channel is not connected, failed to send payload
Aug 06 11:53:59 kajka scrolls[38140]: [2022-08-06T09:53:59Z ERROR gasket::runtime] error that requires stage to restart: channel error communicating with multiplexer: channel is not connected, failed to send payload
Aug 06 11:53:59 kajka scrolls[38140]: [2022-08-06T09:53:59Z ERROR gasket::runtime] error that requires stage to restart: channel error communicating with multiplexer: channel is not connected, failed to send payload

scrolls is now stuck on cursor:

127.0.0.1:16379> get _cursor
"52640848,aa9c017c2e5830a710f907aa4ee4bb071e19208cec5941074192b8993202339c"
127.0.0.1:16379> get _cursor
"52640848,aa9c017c2e5830a710f907aa4ee4bb071e19208cec5941074192b8993202339c"
127.0.0.1:16379> get _cursor
"52640848,aa9c017c2e5830a710f907aa4ee4bb071e19208cec5941074192b8993202339c"

and does not proceed

Panic on reducer: Is AlonzoCompatible a MultiAsset era?

The following reducer doesn't work.

use serde::Deserialize;

use pallas::ledger::traverse::{Feature, MultiEraBlock};

use crate::model;

#[derive(Deserialize)]
pub struct Config {
    pub key_prefix: Option<String>,
}

pub struct Reducer {
    config: Config,
}

impl Reducer {
    pub fn reduce_block(
        &mut self,
        block: &MultiEraBlock,
        output: &mut super::OutputPort,
    ) -> Result<(), gasket::error::Error> {
        if block.era().has_feature(Feature::MultiAssets) {
            for tx in block.txs() {
                let mint = tx.mint();

                let mints = mint.as_alonzo().unwrap();

                for (policy, assets) in mints.iter() {
                    let policy_id = hex::encode(policy.as_slice());

                    let number_of_minted_or_destroyed = assets.len();

                    let key = match &self.config.key_prefix {
                        Some(prefix) => format!("{}.{}", prefix, policy_id),
                        None => format!("transaction_count_by_native_token_policy.{}", policy_id),
                    };

                    let crdt = model::CRDTCommand::PNCounter(key, number_of_minted_or_destroyed as i64);
                    output.send(gasket::messaging::Message::from(crdt))?;
                }
            }
        }

        Ok(())
    }
}

impl Config {
    pub fn plugin(self) -> super::Reducer {
        let reducer = Reducer { config: self };

        super::Reducer::TransactionsCountByNativeTokenPolicyId(reducer)
    }
}

it dies at block:

127.0.0.1:6379> get _cursor
"23068793,69c44ac1dda2ec74646e4223bc804d9126f719b1c245dadc2ad65e8de1b276d7"

Expected behavior: it does not crash and proceeds further.
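The panic presumably comes from the `unwrap()` on `mint.as_alonzo()` returning `None` when the block's mint field is not in the Alonzo representation. A minimal, std-only sketch of the non-panicking pattern; the `Mint` type and `count_assets` function below are hypothetical stand-ins for the pallas types, not the real API:

```rust
// Hypothetical stand-in for a value that may or may not expose an
// Alonzo-style representation (mirrors `mint.as_alonzo()` returning Option).
struct Mint(Option<Vec<(String, usize)>>);

impl Mint {
    fn as_alonzo(&self) -> Option<&Vec<(String, usize)>> {
        self.0.as_ref()
    }
}

fn count_assets(mint: &Mint) -> usize {
    // `if let` skips the work when the representation is absent,
    // instead of panicking the whole stage like `.unwrap()` does.
    if let Some(policies) = mint.as_alonzo() {
        policies.iter().map(|(_, n)| n).sum()
    } else {
        0
    }
}

fn main() {
    let present = Mint(Some(vec![("policy_a".into(), 2), ("policy_b".into(), 3)]));
    let absent = Mint(None);
    println!("{}", count_assets(&present)); // 5
    println!("{}", count_assets(&absent));  // 0, no panic
}
```

The same shape applies inside `reduce_block`: matching on the `Option` and skipping (or logging) non-Alonzo mints would let the reducer keep crawling instead of dying at that block.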

Feat: Address by Ada Handle

I saw Oura had a feature being worked on in April to show Ada handle address mappings. Is this still on the priority list to potentially be added to scrolls soon?

Sled integration is not working correctly (fails silently)

I tried to ingest data using the enricher from Origin a few times, and each time after ca. 4.7-5 GB sled breaks: ingestion continues, but sled is simply dead and no new transaction data is being written.

Expectations:

  • Increase the log level; it seems sled exceptions may be swallowed
  • If sled ingestion breaks for whatever reason, processing should be halted; otherwise it gives the false impression that everything is fine and the data is being stored, while in fact it isn't
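The second expectation is what plain `Result` propagation gives you. A std-only sketch of halting on the first failed write; the `persist` function is a hypothetical stand-in for a sled insert, not scrolls' actual code:

```rust
use std::io::{Error, ErrorKind};

// Hypothetical stand-in for a storage write that can fail (e.g. a sled insert).
fn persist(record: &str, broken: bool) -> Result<(), Error> {
    if broken {
        return Err(Error::new(ErrorKind::Other, "storage backend is dead"));
    }
    println!("stored {record}");
    Ok(())
}

// Propagating the error with `?` stops ingestion at the first failed write,
// instead of swallowing it and pretending the data was stored.
fn ingest(records: &[&str], broken: bool) -> Result<usize, Error> {
    let mut written = 0;
    for r in records {
        persist(r, broken)?;
        written += 1;
    }
    Ok(written)
}

fn main() {
    assert_eq!(ingest(&["a", "b"], false).unwrap(), 2);
    assert!(ingest(&["a", "b"], true).is_err()); // halts, no silent loss
}
```

Whether the fix belongs in the enrich stage or in gasket's stage supervision is an open question; the point is only that a dead backend should surface as an error, not as silence.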
