
audit-protocol's Introduction


Audit Protocol Overview

Audit Protocol is a component of a fully functional, distributed system that works alongside Pooler and the Epoch Generator; together they are responsible for:

  • Generating aggregated snapshots of UniswapV2-specific data from event logs and smart contract state transitions
  • Communicating with other Snapshotter peers to reach consensus over the smart contract running on the PowerLoom Protocol Testnet
  • Ultimately providing access to rich aggregates that can power a Uniswap V2 dashboard with the following data points:
    • Total Value Locked (TVL)
    • Trade Volume, Liquidity reserves, Fees earned
    • Transactions containing Swap, Mint, and Burn events

Role of Audit Protocol

Audit Protocol provides the following important functionalities:

  • Submits snapshots generated by Pooler to the smart contract, where snapshots submitted by snapshotter peers reach consensus.
  • Keeps a cache of finalized snapshot data on local disk and IPFS (local or third-party service) in an optimal way to reduce storage load.
    • Supports cronjob-based pruning of snapshot data older than a configured maximum duration from local disk and IPFS.
    • Supports archival of snapshots to Web3 Storage, ultimately backed by Filecoin (WIP).
  • Generates snapshotter reports for snapshot submissions and provides a dashboard to view them (WIP).

Project Architecture

Audit Protocol uses an event-driven architecture built on RabbitMQ; events here are the messages sent to the message queue. Audit Protocol listens for the two types of messages described below (sketched as structs after the list):

  1. Commit Snapshot Message: This message is sent by Pooler to Audit Protocol when a snapshot is generated for a project at an epochID over smart contract state transitions.
  2. Snapshot Finalized Message: This message is sent by Pooler to Audit Protocol when a snapshot is finalized by the smart contract after reaching consensus over the snapshots submitted by snapshotter peers for a project.
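
As a rough illustration, the two message types might be modeled as Go structs like the ones below. This is a minimal sketch: the field names and types are assumptions for illustration, not the services' actual datamodels.

package messages

// CommitSnapshotMessage is a hypothetical sketch of the payload Pooler
// sends when a snapshot has been generated for a project at an epochID.
// Field names here are illustrative assumptions, not the actual schema.
type CommitSnapshotMessage struct {
    ProjectID   string `json:"projectID"`
    EpochID     int64  `json:"epochID"`
    SnapshotCID string `json:"snapshotCID"`
}

// SnapshotFinalizedMessage is a hypothetical sketch of the payload sent
// once the smart contract finalizes a snapshot after consensus.
type SnapshotFinalizedMessage struct {
    ProjectID    string `json:"projectID"`
    EpochID      int64  `json:"epochID"`
    FinalizedCID string `json:"finalizedCID"`
}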

Payload commit service higher-level flow

The flow for each RabbitMQ message type, as described in the Project Architecture section (a consumer sketch follows the list):

  1. Commit Snapshot Message:
    • Store snapshots on decentralized storage protocols like IPFS and/or Web3 Storage.
    • Submit snapshots to the smart contract for consensus.
  2. Snapshot Finalized Message:
    • Receive the finalized snapshot.
    • Compare the committed snapshot with the finalized snapshot.
    • Generate the snapshotter report for snapshot submissions accordingly.
    • Keep a local cache of finalized snapshots.
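
A minimal sketch of how such a consumer might dispatch on the two message types, assuming the rabbitmq/amqp091-go client; the connection URL, queue name, and message-type header are placeholders, not the service's actual wiring.

package main

import (
    "log"

    amqp "github.com/rabbitmq/amqp091-go"
)

func main() {
    // Placeholder connection URL and queue name for illustration only.
    conn, err := amqp.Dial("amqp://guest:guest@localhost:5672/")
    if err != nil {
        log.Fatal(err)
    }
    defer conn.Close()

    ch, err := conn.Channel()
    if err != nil {
        log.Fatal(err)
    }

    msgs, err := ch.Consume("audit-protocol", "", false, false, false, false, nil)
    if err != nil {
        log.Fatal(err)
    }

    for d := range msgs {
        // Dispatch on a hypothetical message-type header; the real service
        // may encode the type differently.
        switch d.Headers["messageType"] {
        case "commitSnapshot":
            // store on IPFS / Web3 Storage, then submit to the smart contract
        case "snapshotFinalized":
            // compare with the committed snapshot, update the report, cache locally
        }
        d.Ack(false)
    }
}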

Pruning service higher-level flow

  • Creates a cronjob entry that runs periodically on the configured schedule (default 0 0 * * *, i.e. once a day), as sketched below.
  • Prunes snapshots older than the configured maximum duration (default 7 days) from local disk and IPFS.
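
A minimal sketch of such a scheduled pruning job, assuming the robfig/cron library (an assumption; the actual service may schedule differently) and a hypothetical pruneOlderThan helper:

package main

import (
    "log"
    "time"

    "github.com/robfig/cron/v3"
)

// pruneOlderThan is a hypothetical helper that would delete snapshots
// older than maxAge from local disk and unpin them from IPFS.
func pruneOlderThan(maxAge time.Duration) {
    log.Printf("pruning snapshots older than %s", maxAge)
    // ... walk the local cache, unpin from IPFS, delete files ...
}

func main() {
    c := cron.New()
    // "0 0 * * *" matches the documented default: run once a day at midnight.
    _, err := c.AddFunc("0 0 * * *", func() {
        pruneOlderThan(7 * 24 * time.Hour) // default max duration: 7 days
    })
    if err != nil {
        log.Fatal(err)
    }
    c.Start()
    select {} // keep the process alive
}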

For a deeper dive into Audit Protocol services, refer to the Introduction doc.

Requirements

Setup Instructions

There are two ways to run Audit Protocol: using Docker, or running the processes directly in your local environment. For simplicity, running with Docker is recommended.

Configure settings

  • Copy settings.example.json from the root directory of the project to settings.json, and populate it with the required values.
cp settings.example.json settings.json
  • Configure the snapshotter account address (EVM-compatible, 0x...) in settings.instance_id.
    • instance_id: the unique public key for your node to participate in consensus. It is currently registered on approval of an application (refer to the deploy repo for more details on applying).
  • Configure the pooler-namespace.
    • pooler-namespace: the unique key used to identify your project namespace, around which all consensus activity takes place.
  • Configure the anchor chain RPC URL in settings.anchor_chain_rpc_url.
  • If you are using a third-party IPFS provider, add a valid URL in settings.ipfs.url and settings.ipfs.reader_url.
    • Fill in the authentication keys for the provider in settings.ipfs.writer_auth_config and settings.ipfs.reader_auth_config.
  • Configure the Powerloom smart contract address in settings.signer.domain.verifyingContract.
  • Configure the snapshotter account address and private key in settings.signer.accountAddress and settings.signer.privateKey respectively.
  • Optional steps
    • Configure settings.reporting.slack_webhook_url if you want to receive alerts on a Slack channel. To set up a Slack workflow, refer to Slack workflow setup.
    • Configure the access token for Web3 Storage in settings.web3_storage.api_token if you want to use Web3 Storage to store snapshots.
  • Note that the settings.healthcheck service uses port 9000 by default; if another service is already running on that port, change it to any free port on your system. A trimmed example settings.json follows below.
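
Putting these together, a trimmed settings.json might look like the sketch below. This is illustrative only: the nesting is inferred from the key paths above, all values are placeholders, and any key not shown here should be taken from settings.example.json.

{
  "instance_id": "0xYourSnapshotterAddress",
  "pooler-namespace": "UNISWAPV2",
  "anchor_chain_rpc_url": "https://rpc.example.com",
  "local_cache_path": "/home/user/.cache/audit-protocol",
  "ipfs": {
    "url": "https://ipfs-writer.example.com",
    "reader_url": "https://ipfs-reader.example.com",
    "writer_auth_config": { "api_key": "<writer-key>" },
    "reader_auth_config": { "api_key": "<reader-key>" }
  },
  "signer": {
    "accountAddress": "0xYourSnapshotterAddress",
    "privateKey": "<private-key>",
    "domain": {
      "verifyingContract": "0xPowerloomContractAddress"
    }
  },
  "reporting": {
    "slack_webhook_url": "https://hooks.slack.com/..."
  },
  "web3_storage": {
    "api_token": "<web3-storage-token>"
  },
  "healthcheck": {
    "port": 9000
  }
}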

Steps for Docker setup

Steps to run directly

Note that these steps will only run Audit Protocol. If you want to run the complete system, follow the steps mentioned in the Docker Setup section.

  • Install the required dependencies as mentioned in the Requirements section.
  • Generate the service binaries by running the following command:
./build.sh
  • Start all processes by running the following command:
pm2 start pm2.config.js

Monitoring and Debugging

  • To monitor the status of running processes, run pm2 status.
  • To see all logs, run pm2 logs.
  • To see logs for a specific process, run pm2 logs <Process Identifier>.
  • To see only error logs, run pm2 logs --err.

Note: if you are running the Docker setup, you can log in to the container and run the above commands.

Alerting

Audit Protocol uses Slack for alerting. You can configure Slack alerts by referring to this doc.

Snapshotter reports

Audit Protocol stores snapshotter reports in Redis, and snapshots in the local cache directory specified in settings.local_cache_path.

  • To get the snapshotter report for a particular project, run the following commands:
# open redis-cli

# replace `<project_id>` with the actual project ID
HGETALL projectID:<project_id>:snapshotterStatusReport

# get the count of successful snapshot submissions for a project
GET projectID:<project_id>:totalSuccessfulSnapshotCount

# get the count of missed snapshot submissions for a project
GET projectID:<project_id>:totalMissedSnapshotCount

# get the count of incorrect snapshot submissions for a project
GET projectID:<project_id>:totalIncorrectSnapshotCount
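
If you prefer to query these reports programmatically, here is a minimal sketch using the go-redis client (an assumption; any Redis client works), with Redis assumed to be reachable at localhost:6379 and the key layout mirroring the commands above:

package main

import (
    "context"
    "fmt"
    "log"

    "github.com/redis/go-redis/v9"
)

func main() {
    ctx := context.Background()
    rdb := redis.NewClient(&redis.Options{Addr: "localhost:6379"})

    projectID := "<project_id>" // replace with an actual project ID

    // The full status report is stored as a Redis hash.
    report, err := rdb.HGetAll(ctx, fmt.Sprintf("projectID:%s:snapshotterStatusReport", projectID)).Result()
    if err != nil {
        log.Fatal(err)
    }
    fmt.Println("status report:", report)

    // The counters are plain string keys.
    ok, _ := rdb.Get(ctx, fmt.Sprintf("projectID:%s:totalSuccessfulSnapshotCount", projectID)).Result()
    fmt.Println("successful submissions:", ok)
}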

We are working on a dashboard to view Snapshotter reports. It will be available soon.

Protocol Overview

Architecture

For the details of each component, refer to component details.

Architecture Details

Details about the workings of the various components of Audit Protocol can be found in the Introduction doc if you're interested in learning more.

audit-protocol's People

Contributors

chaitanyaprem, anomit, raghavendragaleppa, atiqgauri, swagftw, xadahiya, swarooph, omahs, irfan-ansari-au28


audit-protocol's Issues

Update pair addresses for all Uniswap variations

The current list of pair addresses used for UniswapV2, Sushiswap, and Quickswap is outdated and includes inactive or low-activity pairs.
To enhance user experience and data accuracy, we need to curate a new list of top pairs for all three platforms.

Related to PowerLoom/pooler#8

Resilient indexing and aggregation services

Is your feature request related to a problem?
The present architecture of the indexer and aggregator services that generate important data points for the Uniswap v2 use case does not deal well with broken chains and skipped snapshots, which can arise because of

  • network errors
  • cache errors
  • RPC timeouts or rate limits
    ...among many other factors that can go wrong.

To even begin building aggregates, the service attempts to find a common end epoch across 100+ projects, with a lot of convoluted logic layered on top to keep higher-order aggregates in sync with it. Any of the above issues creates gaps or duplicate entries in snapshot chains and brings the entire process to a halt. The end result is that even though snapshots continue to proceed fine, no usable data points are generated or updated.

Describe the solution you'd like

Move indexing and aggregation ahead as long as the majority of the snapshotted projects' snapshot chains progress fine. A couple of projects with broken chains should not bring data point generation and updates to a complete halt.

Describe alternatives you've considered

Not applicable

Additional context

This scope is only for the phase where these nodes participate in an offchain consensus system and maintain a local state of the DAG chain and other indexes and aggregates on top of it. In the next phase of release, the protocol state of DAG chains, indexes, aggregates will be available on its own protocol chain and breakages in snapshot chains will not be a matter of concern.

Apart from the logical issue itself of halted indexing and aggregation, there are violations of accepted Pythonic design patterns, coding styles, and standards that need to be addressed.

State Builder (Audit protocol dismantled) design and discussion

Top level issue to track State Builder Development


State Builder agent has the following responsibilities

  • Build Snapshot DAG chains
  • Build Aggregate DAG chains
  • Segment and Store data on IPFS

Current State and Todo

Initially this will be run by PowerLoom, but later it will be completely decentralized.

  • Snapshot DAG Chain finalizer (sequencer component) (todo)
  • Aggregate DAG Chain finalizer (sequencer component) (todo)
  • Design Architecture (in progress)
  • The DAG chain won't move forward if data is missing; instead, it will interact with the contract to release a FixEvent, wait for the missing data to be populated, and then catch up with everything else
  • Pruning and Segmentation (can be Go-based; @swagftw can update Audit Protocol services)
  • DAG Chain Verifier (update and redesign to fit the new decentralized architecture) (todo)

Web3 storage token is not getting autopopulated using build.sh

Describe the bug

The Web3 Storage token is not getting populated properly in the generated settings.json when running build.sh from deploy, causing dag-status-reporter to crash.

To Reproduce

Affected versions: current dockerify build

Steps to reproduce the behavior:

  1. Run ./build.sh with the dockerify tag in deploy
  2. Check generated settings.json file

Expected behavior
The Web3 Storage token should be populated properly from the .env file in deploy.

Token Aggregator go service unexpected behaviour

Describe the bug

  • The service seems to get stuck for an unknown reason
  • Code cleanup is required

Expected behavior

  • Continuous token aggregation

Proposed Solution

  • A code review is necessary, along with code cleanup

Caveats
Not sure yet

Additional context
NA

Record status of snapshot submission to relayer

Is your feature request related to a problem?
To complete the feature request for snapshotter implementations of Powerloom Protocol to expose an internal API for snapshot processing status per epoch (PowerLoom/pooler#40), the payload commit service needs to capture this information in a transient cache entry that can be parsed to generate a snapshot's status report across its state transitions.

Describe the solution you'd like
When the payload commit service of audit protocol submits a snapshot against a project ID to the relayer service, capture its success or failure from the response returned.

The proposed datamodel is as follows:

type SnapshotterStateUpdate struct {
	Status    string                 `json:"status"`
	Error     string                 `json:"error"`
	Extra     map[string]interface{} `json:"extra"`
	Timestamp int64                  `json:"timestamp"`
}

This is to be recorded against the state transition of RELAYER_SEND, as detailed in PowerLoom/pooler#40.
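
As a rough illustration of the proposal, the sketch below records such a state update after a relayer submission, assuming the go-redis client and a hypothetical key layout (the actual key scheme per PowerLoom/pooler#40 may differ):

package payloadcommit

import (
    "context"
    "encoding/json"
    "fmt"
    "time"

    "github.com/redis/go-redis/v9"
)

type SnapshotterStateUpdate struct {
    Status    string                 `json:"status"`
    Error     string                 `json:"error"`
    Extra     map[string]interface{} `json:"extra"`
    Timestamp int64                  `json:"timestamp"`
}

// recordRelayerSend persists the outcome of a relayer submission against
// the RELAYER_SEND state transition. The key format is a hypothetical
// placeholder for illustration.
func recordRelayerSend(ctx context.Context, rdb *redis.Client, projectID string, epochID int64, submitErr error) error {
    update := SnapshotterStateUpdate{Status: "success", Timestamp: time.Now().Unix()}
    if submitErr != nil {
        update.Status = "failed"
        update.Error = submitErr.Error()
    }
    payload, err := json.Marshal(update)
    if err != nil {
        return err
    }
    key := fmt.Sprintf("projectID:%s:epoch:%d:stateTransitions", projectID, epochID)
    return rdb.HSet(ctx, key, "RELAYER_SEND", payload).Err()
}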

Describe alternatives you've considered
NA

Additional context
NA

Indexing and aggregation failures on missing project metadata and null payloads in DAG blocks

Describe the bug

  1. If a project ID is registered for building indexes on, but no snapshots have been received for it and no DAG chains built, index generation (e.g. 24-hour and 7-day time series) can completely fail for all projects registered for indexing.

The individual tasks launched as shown below may return None because of the decorator @acquire_bounded_semaphore applied to the function get_epoch_end_per_project():

for project_id in registered_project_ids:
    tasks.append(
        get_epoch_end_per_project(
            project_id=project_id,
            semaphore=semaphore,
            writer_redis_conn=writer_redis_conn,
            ipfs_read_client=ipfs_read_client,
        )
    )

which will cause the following to fail (if every entry is filtered out, max() is called on an empty sequence and raises a ValueError):

max_epoch_end = max(
    filter(lambda x: not (isinstance(x, Exception) or x[1] == 0), epoch_ends_map),
    key=lambda x: x[1],
)

  2. Aggregation of the Uniswap-specific trade volume data point across a segment of the DAG chain of snapshots can fail because a few DAG blocks do not have a snapshot payload encapsulated in them.
6|ap-proto-indexer  | ERROR    pair_data_aggregation_service 2023-03-23 14:41:03,624 624 pair_data_aggregation_service-process_pairs_trade_volume_and_reserves: Error in process_pairs_trade_volume_and_reserves: 'NoneType' object is not subscriptable
6|ap-proto-indexer  | Traceback (most recent call last):
6|ap-proto-indexer  |   File "/src/pair_data_aggregation_service.py", line 498, in process_pairs_trade_volume_and_reserves
6|ap-proto-indexer  |     pair_trade_volume_7d = calculate_pair_trade_volume(dag_chain_7d)
6|ap-proto-indexer  |   File "/src/pair_data_aggregation_service.py", line 288, in calculate_pair_trade_volume
6|ap-proto-indexer  |     pair_trade_volume.total_volume = sum(map(lambda x: x['data']['payload']['totalTrade'], dag_chain))
6|ap-proto-indexer  |   File "/src/pair_data_aggregation_service.py", line 288, in <lambda>
6|ap-proto-indexer  |     pair_trade_volume.total_volume = sum(map(lambda x: x['data']['payload']['totalTrade'], dag_chain))
6|ap-proto-indexer  | TypeError: 'NoneType' object is not subscriptable

To Reproduce

As described in the previous section, the bug arises when the appropriate conditions are (un)satisfied.

Expected behavior

Errors concerning the state of individual project IDs should not halt aggregation generation on a collection of such project IDs.

Proposed Solution

  1. Handle specific exceptions from the wrapped function in @acquire_bounded_semaphore to ensure a blanket None is not returned for every kind of exception.
  2. Ignore DAG blocks with null payloads while calculating aggregated trade volume.

Caveats
No caveats. These changes allow unstable or testing systems to continue generating aggregate data points without needing to reset the local state of DAG chains and the indexes built on them.

Additional context
Independent of OS and Python version, this will show up whenever the conditions described above are encountered.

Remove IPFS and web3 storage upload functionalities

Is your feature request related to a problem?
Presently, the payload commit service consumes large payloads over RabbitMQ from snapshotters in order to upload their contents to IPFS and/or Web3 Storage. This is a huge overhead on system resources, and parallel work is underway in the snapshotter repos to integrate these functionalities natively there.

Describe the solution you'd like
Do not consume entire snapshot contents as the payload over RabbitMQ; instead, act only as a dispatcher for the snapshot submission transaction to the relayer, as sketched below.
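
As a rough sketch of the proposed slimmed-down flow, the service would consume only a lightweight reference and forward it to the relayer. The message shape, relayer endpoint, and request format below are illustrative assumptions, not the actual relayer API:

package dispatcher

import (
    "bytes"
    "context"
    "encoding/json"
    "fmt"
    "net/http"
)

// SnapshotSubmission is a hypothetical slim message: just a reference to
// the snapshot (its CID) rather than the full snapshot contents.
type SnapshotSubmission struct {
    ProjectID   string `json:"projectID"`
    EpochID     int64  `json:"epochID"`
    SnapshotCID string `json:"snapshotCID"`
}

// dispatchToRelayer forwards the submission to a relayer endpoint.
// The URL path and payload format are placeholders for illustration.
func dispatchToRelayer(ctx context.Context, relayerURL string, sub SnapshotSubmission) error {
    body, err := json.Marshal(sub)
    if err != nil {
        return err
    }
    req, err := http.NewRequestWithContext(ctx, http.MethodPost, relayerURL+"/submitSnapshot", bytes.NewReader(body))
    if err != nil {
        return err
    }
    req.Header.Set("Content-Type", "application/json")
    resp, err := http.DefaultClient.Do(req)
    if err != nil {
        return err
    }
    defer resp.Body.Close()
    if resp.StatusCode != http.StatusOK {
        return fmt.Errorf("relayer returned status %d", resp.StatusCode)
    }
    return nil
}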

Describe alternatives you've considered
NA

Additional context
NA

Load consensus state from genesis for new snapshotters joining an offchain consensus network

Is your feature request related to a problem?

The present, off-chain version of Audit Protocol maintains localized DAG chains of finalized snapshots.

The payload commit service specifically sets a state marker, project_first_epoch_end_height, which affects the epochs queried by the DAG finalizer service when it performs self-healing of its DAG chain of snapshots. This gives rise to indeterminate behavior in peers that have not joined a consensus network "from the beginning", because of the following code in the DAG finalizer.

epochs_to_fetch = {
    k: (k - 1) * project_epoch_size + project_first_epoch_end_height
    for k in range(
        finalized_block_height_project + 1,
        earliest_pending_dag_height_next_to_finalized,
    )
}

When a peer is 'late' to the network, its project_first_epoch_end_height will depend on the first snapshot submission received by its local payload commit service. This will obviously vary across peers joining at different times, causing a completely different epoch to be queried on the consensus service for a missing block-height entry in the DAG chain.

Describe the solution you'd like

  • Move the project_first_epoch_end_height state marker to the off-chain consensus service,
    AND/OR
  • Before a late peer begins submitting snapshots, preload a snapshot state as described below (which includes project_first_epoch_end_height) and let it catch up on the entirety of the chain.

[Diagram: Offchain Node State Load]

Describe alternatives you've considered

Not applicable

Additional context

Not applicable

Pruning-Archival service refactoring and unexpected behaviour fix

This issue is for improving and eventually fixing the Pruning-Archival service written in Go.

This thread can be used for ongoing discussion of, and progress on, the changes required for this service to work properly.
A flow diagram for the existing service can be found here.

  • Understand the flaws and gaps in the current flow, compare against the expected behavior, and improve.
  • Update the service with the updated behavior.

Current Behaviour:

  • The service polls the Redis cache after a configured interval and runs pruning and archival on every project.
  • Segments can be created at any time and after an uncertain duration, so polling is a poor fit.
  • The code flow is unclear and hard to read.
  • It lacks a proper retry mechanism.
  • It does not seem to prune and archive DAG project chains consistently.

Expected Behaviour:

  • Should prune and archive DAG chains while maintaining a valid state.
  • Remove the need for polling.
  • Should be treated as a worker consuming tasks appended to a message queue (see the sketch below).
  • Provide a way to test locally.
  • Readable and modular code.
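
As a rough sketch of the queue-driven design, the service would consume explicit pruning tasks instead of polling Redis. The task shape below is a hypothetical illustration; the consumer wiring would mirror the RabbitMQ example in the payload commit section earlier.

package pruner

import (
    "encoding/json"
    "fmt"
)

// PruneTask is a hypothetical task enqueued whenever a segment becomes
// eligible for pruning, removing the need for periodic polling.
type PruneTask struct {
    ProjectID     string `json:"projectID"`
    SegmentHeight int64  `json:"segmentHeight"`
    Archive       bool   `json:"archive"` // archive to Web3 Storage before pruning
}

// handlePruneTask processes one task; on failure the message can be
// nack'ed and redelivered, giving a natural retry mechanism.
func handlePruneTask(body []byte) error {
    var task PruneTask
    if err := json.Unmarshal(body, &task); err != nil {
        return fmt.Errorf("decode prune task: %w", err)
    }
    // ... archive the segment if task.Archive, then prune from disk and IPFS ...
    return nil
}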

Do you want to work on this issue?

Yes
