
cosmos-upgrades

cosmos-upgrades is a powerful tool developed by Defiant Labs to search for scheduled Cosmos upgrades. This tool aims to streamline the process of tracking and managing upgrades in the Cosmos ecosystem.

🌌 Introduction

The Cosmos ecosystem is vast and ever-evolving. With frequent upgrades and enhancements, it becomes crucial for stakeholders to keep track of scheduled upgrades. cosmos-upgrades bridges this gap by providing a centralized solution to fetch and monitor these upgrades.

🛠 Problem Statement

Keeping track of scheduled upgrades in a decentralized ecosystem can be challenging. Missing an upgrade can lead to potential downtimes, security vulnerabilities, and missed opportunities. cosmos-upgrades addresses this challenge by offering a reliable and up-to-date source of information for all scheduled Cosmos upgrades.

📚 Chain-Registry Deep Dive

The chain-registry is more than just a repository of chain details; it's the backbone that powers the cosmos-upgrades tool. Each chain specified in the request is mapped to its corresponding JSON file within the chain-registry. This mapping allows the tool to look up vital information, such as endpoints, for each chain.

For instance, when you specify "akash" in your request, the tool refers to the akash/chain.json file in the chain-registry to fetch the necessary details.
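This lookup can be sketched in a few lines of Python. The raw-file URL layout and the `apis` field shape follow the chain-registry's chain.json schema; the helper names here are illustrative, not the tool's actual code:

```python
import json
import urllib.request

# Assumed raw-file layout of the chain-registry repo (illustrative).
CHAIN_REGISTRY_RAW = "https://raw.githubusercontent.com/cosmos/chain-registry/master"


def extract_endpoints(chain):
    """Pull the RPC and REST endpoint addresses out of a parsed chain.json."""
    apis = chain.get("apis", {})
    return {
        "rpc": [e["address"] for e in apis.get("rpc", [])],
        "rest": [e["address"] for e in apis.get("rest", [])],
    }


def fetch_chain_endpoints(chain_name):
    """Fetch <chain>/chain.json from the chain-registry and return its endpoints."""
    url = f"{CHAIN_REGISTRY_RAW}/{chain_name}/chain.json"
    with urllib.request.urlopen(url, timeout=10) as resp:
        return extract_endpoints(json.load(resp))
```

For example, `fetch_chain_endpoints("akash")` would resolve to `akash/chain.json` and return that chain's registered RPC and REST addresses.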

Why is the Chain-Registry Essential?

  1. Accuracy & Reliability: By centralizing the details of all chains in the chain-registry, we ensure that the data fetched by cosmos-upgrades is always accurate and up-to-date.
  2. Extensibility: The design of the chain-registry allows for easy additions of new chains or updates to existing ones.
  3. Community Collaboration: The chain-registry is open-source, fostering a collaborative environment. If a user notices a missing chain or outdated information, they can contribute by submitting a PR with the correct details.

What if a Network is Missing?

If a particular network or chain is not present in the chain-registry, the cosmos-upgrades tool won't be able to provide information about it. In such cases, we strongly encourage users to:

  • Reach out to the protocol leads to inform them about the omission.
  • Take a proactive approach by submitting a PR to the chain-registry with the correct information.

By doing so, not only do you enhance the tool's capabilities, but you also contribute to the broader Cosmos community.

🚀 Making Requests

To fetch the scheduled upgrades, you can use the following curl commands for mainnets and testnets:

Mainnets

curl -s -X GET \
  -H "Content-Type: application/json" \
  https://cosmos-upgrades.apis.defiantlabs.net/mainnets

Testnets

curl -s -X GET \
  -H "Content-Type: application/json" \
  https://cosmos-upgrades.apis.defiantlabs.net/testnets

Note: The response will contain details of the scheduled upgrades for the specified networks.
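The same request is easy to make programmatically. A minimal Python sketch, assuming the response is a JSON array of per-network objects with an `upgrade_found` flag (as shown in the issue examples below); the helper names are illustrative:

```python
import json
import urllib.request

API = "https://cosmos-upgrades.apis.defiantlabs.net"


def upgrades_scheduled(entries):
    """Keep only the networks that actually have an upgrade scheduled."""
    return [e for e in entries if e.get("upgrade_found")]


def fetch_scheduled(kind="mainnets"):
    """Fetch /mainnets or /testnets and filter to scheduled upgrades."""
    with urllib.request.urlopen(f"{API}/{kind}", timeout=10) as resp:
        return upgrades_scheduled(json.load(resp))
```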

🧪 Automated Script (upgrades.sh)

upgrades.sh is a convenient script provided to fetch scheduled upgrades for both mainnets and testnets. It offers customization options and simplifies the process of tracking upgrades.

Usage

  1. Make sure you have jq installed on your system. You can install it using your system's package manager.

  2. Open a terminal and navigate to the directory containing upgrades.sh.

  3. Run the script to fetch upgrades for both mainnets and testnets:

./upgrades.sh

The script will provide you with a list of scheduled upgrades for the specified networks.

Customizing Networks

You can customize the list of networks by modifying the networks associative array in the script. The networks array is divided into mainnets and testnets, and you can add or remove network names as needed.

declare -A networks=(
  [mainnets]="secretnetwork osmosis neutron nolus crescent akash cosmoshub sentinel stargaze omniflixhub terra kujira stride injective juno agoric evmos noble omny quasar dvpn onomy"
  [testnets]="agorictestnet quasartestnet stridetestnet onomytestnet axelartestnet nibirutestnet nobletestnet dydxtestnet osmosistestnet cosmoshubtestnet"
)

CHAIN_WATCH Environment Variable

The CHAIN_WATCH environment variable allows you to restrict polling to one or more specific chains instead of all of them. If set, the app will only poll the chain-registry for the specified chain(s); otherwise, it will poll all chains in the registry. You can still filter the output with other tooling such as upgrades.sh.

For example, to only poll "cosmoshub" rpc/rest endpoints, you can set CHAIN_WATCH as follows:

export CHAIN_WATCH="cosmoshub"
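One plausible way the variable could be consumed inside the app (a sketch, assuming a space-separated list for multiple chains; the app's actual parsing may differ):

```python
import os


def watched_chains(all_chains):
    """Return the chains to poll: the CHAIN_WATCH subset if set, else all of them.

    Assumes CHAIN_WATCH is a space-separated list of chain names (illustrative).
    """
    raw = os.environ.get("CHAIN_WATCH", "").strip()
    if not raw:
        return list(all_chains)
    watch = set(raw.split())
    return [c for c in all_chains if c in watch]
```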

cosmos-upgrades's People

Contributors

danbryan, pharr117, clemensgg


cosmos-upgrades's Issues

Implement production ready setup with reader/writer worker split

The application is currently reliant on the Flask development mode setup since data is gathered and stored in single-process available Python global maps.

Flask is meant to be run behind a production server (typically a WSGI server such as gunicorn). These servers spin up multiple instances of the Python program to help with load balancing and app runs.

The application data system is not prepared for this. Each gunicorn process would have its own global data store, meaning the app would be bashing RPC servers on a per-worker basis.

The application needs to be moved to a more production-ready state. One way would be to implement application modes for reader/writer process splitting such as:

  1. Reader mode -> a Flask application that reads from an external data store and responds to requests
  2. Writer mode -> a Python application that writes to an external data store
  3. Data store layer -> A data storage location that both reader and writer interface with

A production deployment would then look like:

  1. A gunicorn server that spins up multiple Reader mode versions of the application
  2. A writer process that writes to the data store
  3. A shared data storage layer
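The reader/writer split above can be sketched with the standard library. Here sqlite3 stands in for whatever shared data store is ultimately chosen (Redis, Postgres, etc.); the table and function names are illustrative only:

```python
import json
import sqlite3


def init_store(path):
    """Open (or create) the shared data store that both modes interface with."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS upgrades (network TEXT PRIMARY KEY, data TEXT)"
    )
    conn.commit()
    return conn


def writer_put(conn, network, payload):
    """Writer mode: persist fetched upgrade data for one network."""
    conn.execute(
        "INSERT OR REPLACE INTO upgrades (network, data) VALUES (?, ?)",
        (network, json.dumps(payload)),
    )
    conn.commit()


def reader_get(conn, network):
    """Reader mode (e.g. a Flask view): serve stored data without hitting RPC servers."""
    row = conn.execute(
        "SELECT data FROM upgrades WHERE network = ?", (network,)
    ).fetchone()
    return json.loads(row[0]) if row else None
```

With this shape, each gunicorn reader worker only queries the store, and a single writer process is the only thing polling RPC/REST servers.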

REST Health checks are failing in multiple ways, either fix or eliminate health checks

We fail early in the update info workflow by hitting RPC and REST servers to check for health.

We use the following function:

cosmos-upgrades/app.py

Lines 133 to 147 in 619fed1

def is_endpoint_healthy(endpoint):
    try:
        response = requests.get(f"{endpoint}/health", timeout=1, verify=False)
        # some chains dont implement the /health endpoint. Should we just skip /health and go directly to the below?
        if response.status_code == 501:
            response = requests.get(
                f"{endpoint}/cosmos/gov/v1beta1/proposals?proposal_status=2",
                timeout=1,
                verify=False,
            )
        return response.status_code == 200
    except:
        return False

The problem is:

  1. Not all REST servers provide a /health endpoint
  2. Not all Chains use the gov module, making the /cosmos/gov/v1beta1/proposals?proposal_status=2 endpoint a bad check (see Noble and NobleTestnet for examples)

We either need to:

  1. Fix the health checks to use a more reliable endpoint
  2. Eliminate health checks entirely and just go straight into the workflow

Some considerations:

  1. Picking a reliable endpoint will require an endpoint available on ALL chain servers. Module specific endpoints may not be useful here
  2. The health checks are useful in getting a list of healthy servers from the Chain Registry. Sometimes the servers in the Chain Registry API map are unavailable, and pre-checking for health lets us pick out working endpoints to use
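One candidate for a module-neutral probe is `/cosmos/base/tendermint/v1beta1/blocks/latest`, which is served by the SDK's base tendermint query service regardless of which app modules (gov, etc.) a chain enables. A sketch with the HTTP getter injected so the check is testable offline:

```python
def is_rest_endpoint_healthy(endpoint, get):
    """Probe a REST server via a module-neutral endpoint.

    `get` is any requests.get-compatible callable, injected for testability.
    Sketch only; the real check would also want a timeout/TLS policy.
    """
    try:
        resp = get(
            f"{endpoint}/cosmos/base/tendermint/v1beta1/blocks/latest",
            timeout=2,
        )
        return getattr(resp, "status_code", None) == 200
    except Exception:
        return False
```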

Handle multiple upgrade proposals

Currently, the code finds the first upgrade proposal and uses that when reaching out to the /cosmos/gov/v1beta1/proposals?proposal_status=2 endpoint:

cosmos-upgrades/app.py

Lines 272 to 278 in 1819705

for proposal in data.get("proposals", []):
    content = proposal.get("content", {})
    if (
        content.get("@type")
        == "/cosmos.upgrade.v1beta1.SoftwareUpgradeProposal"
    ):
        # Extract version from the plan name

We should handle multiple upgrade proposals in case a chain has multiple active proposals being voted on.

We should consider whether the object data type is the best response value and maybe instead return an array of upgrade proposals in this case.
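Collecting every matching proposal instead of breaking on the first is a small change; a sketch (the gov v1 `MsgSoftwareUpgrade` type is included as an assumption about what newer chains may return):

```python
UPGRADE_TYPES = (
    "/cosmos.upgrade.v1beta1.SoftwareUpgradeProposal",
    # gov v1 chains wrap upgrades in messages; included as an assumption.
    "/cosmos.upgrade.v1beta1.MsgSoftwareUpgrade",
)


def collect_upgrade_proposals(data):
    """Return ALL active software-upgrade proposals instead of just the first."""
    found = []
    for proposal in data.get("proposals", []):
        content = proposal.get("content", {})
        if content.get("@type") in UPGRADE_TYPES:
            found.append(proposal)
    return found
```

The response shape would then naturally become an array of upgrade proposals rather than a single object.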

rework semantic version match

Look in the plan info first, then look at the description field for the semantic version. We should be able to report the version as 12.0.0 as seen in the plan info picture below.

{
  "type": "mainnet",
  "network": "cosmoshub",
  "rpc_server": "https://cosmos-rpc.onivalidator.com",
  "latest_block_height": 16848362,
  "upgrade_found": true,
  "upgrade_name": "v12",
  "source": "active_upgrade_proposals",
  "upgrade_block_height": 16985500,
  "estimated_upgrade_time": "2023-09-13T11:39:27.176309",
  "version": "12"
}

image
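The plan-info-first lookup described above can be sketched as a simple ordered search for a full x.y.z version (function name and regex are illustrative):

```python
import re

# Full semantic version, optionally prefixed with "v", e.g. "v12.0.0".
SEMVER_RE = re.compile(r"\bv?(\d+\.\d+\.\d+)\b")


def best_version(plan_text, description):
    """Prefer a full x.y.z version from the plan info; fall back to the
    proposal description; return None if neither contains one."""
    for text in (plan_text or "", description or ""):
        match = SEMVER_RE.search(text)
        if match:
            return match.group(1)
    return None
```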

Semver matching function is matching on pre-release tags when it should skip those

The find_best_semver_for_versions function seems to be matching on pre-release tags.

User reports say that the persistence network was reporting the following info before their upgrade today:

{
  "type": "mainnet",
  "network": "persistence",
  "rpc_server": "https://rpc-persistence.architectnodes.com/",
  "rest_server": "https://api-persistence.cosmos-spaces.cloud/",
  "latest_block_height": 13870121,
  "upgrade_found": true,
  "upgrade_name": "v10",
  "source": "current_upgrade_plan",
  "upgrade_block_height": 13870350,
  "estimated_upgrade_time": "2023-11-02T14:34:22.177132",
  "upgrade_plan": {
    "height": 13870350,
    "binaries": [],
    "name": "v10",
    "upgraded_client_state": null
  },
  "version": "v10.1.0",
  "error": null
}

Notice v10.1.0. This was found in a pre-release tag:

image

We need to fix the find_best_semver_for_versions so it skips prerelease tags.
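A sketch of the fix: treat anything with a pre-release suffix (e.g. `v10.1.0-rc1`) as non-matching and pick the highest remaining release tag. This is not the actual `find_best_semver_for_versions` implementation, just an illustration of the skipping logic:

```python
import re

# Semver tag with an optional pre-release suffix after a hyphen.
SEMVER_TAG = re.compile(r"^v?(\d+)\.(\d+)\.(\d+)(-[0-9A-Za-z.-]+)?$")


def best_release_tag(tags):
    """Pick the highest matching tag, skipping anything with a pre-release suffix."""
    best = None
    for tag in tags:
        m = SEMVER_TAG.match(tag)
        if not m or m.group(4):  # no match, or has a pre-release suffix: skip
            continue
        key = tuple(int(g) for g in m.groups()[:3])
        if best is None or key > best[0]:
            best = (key, tag)
    return best[1] if best else None
```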

Handle active upgrade proposals while a current plan is already in place

Cosmos handles upgrades by scheduling a plan. There can only ever be 1 plan in place, see here for why:

https://github.com/cosmos/cosmos-sdk/blob/5eaa7b8d3c50eefc5dd95210c38c84c1e829509f/x/upgrade/keeper/keeper.go#L185-L189

However, there may be an active proposal being voted on while a current plan is in place. The code currently prefers the active proposal over the current plan, returning data only on that:

cosmos-upgrades/app.py

Lines 563 to 579 in 1819705

if (
    active_upgrade_version
    and (active_upgrade_height is not None)
    and active_upgrade_height > latest_block_height
):
    upgrade_block_height = active_upgrade_height
    upgrade_version = active_upgrade_version
    upgrade_name = active_upgrade_name
    source = "active_upgrade_proposals"
    rest_server_used = current_endpoint
    break
if (
    current_upgrade_version
    and (current_upgrade_height is not None)
    and (current_plan_dump is not None)
    and current_upgrade_height > latest_block_height

We should handle both these cases and return data on both instead of just one.
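A sketch of reporting both sources side by side, instead of letting the active proposal shadow the scheduled plan (the dict shapes here are illustrative, not the app's actual response schema):

```python
def merge_upgrade_sources(active, current_plan, latest_block_height):
    """Report BOTH an in-voting proposal and a scheduled plan when each has
    an upgrade height still in the future."""
    result = {}
    if active and active.get("height", 0) > latest_block_height:
        result["active_upgrade_proposal"] = active
    if current_plan and current_plan.get("height", 0) > latest_block_height:
        result["current_upgrade_plan"] = current_plan
    return result
```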

Add gh_commit to output

See this for the hash logic
https://script.google.com/home/projects/1IU3doZX1iY20JJ_DbwRNOwuufWIrLCqgFauwTsfp3d9jyxPYTViTXBte/edit

Add a new field, gh_commit, to the output, with the hash as its value:

{
  "type": "mainnet",
  "network": "akash",
  "rpc_server": "https://akash-rpc.lavenderfive.com:443",
  "rest_server": "https://api-akash-ia.cosmosia.notional.ventures",
  "latest_block_height": 12981806,
  "upgrade_found": true,
  "upgrade_name": "v0.26.0",
  "source": "current_upgrade_plan",
  "upgrade_block_height": 12992204,
  "estimated_upgrade_time": "2023-09-27T16:02:01.530395",
  "upgrade_plan": {
    "height": 12992204,
    "binaries": [],
    "name": "v0.26.0",
    "upgraded_client_state": null
  },
  "version": "v0.26.0",
  "gh_commit" : "aabbccdd"
  "error": null
}

Suggestion: More fields

Since you are already filtering the data, I think you should expose more of it in the output; Polkachu did tremendous work with those fields.

What is missing that I think is a must have:

  • gov link to the upgrade
  • block link to the explorer block (like mintscan etc)

Nice to have:

  • Crawl the body of the gov proposal and extract the repo link, if any, or attempt to find the tag on GitHub

  • Cosmovisor folder

Datetime format string error during update loop

One of the update loop iterations is failing on a bad date string:

cosmos-upgrades Completed fetch data for network rsprovidertestnet
cosmos-upgrades Completed fetch data for network persistencetestnet2
cosmos-upgrades Completed fetch data for network sixtestnet
cosmos-upgrades Found 1 rest endpoints and 1 rpc endpoints for permtestnet
cosmos-upgrades 35.191.10.132 - - [19/Oct/2023 17:44:11] "GET /healthz HTTP/1.1" 200 -
cosmos-upgrades Completed fetch data for network permtestnet
cosmos-upgrades Traceback (most recent call last):
cosmos-upgrades   File "/app/app.py", line 703, in update_data
cosmos-upgrades     testnet_data = list(
cosmos-upgrades   File "/usr/local/lib/python3.9/concurrent/futures/_base.py", line 609, in result_iterator
cosmos-upgrades     yield fs.pop().result()
cosmos-upgrades   File "/usr/local/lib/python3.9/concurrent/futures/_base.py", line 439, in result
cosmos-upgrades     return self.__get_result()
cosmos-upgrades   File "/usr/local/lib/python3.9/concurrent/futures/_base.py", line 391, in __get_result
cosmos-upgrades     raise self._exception
cosmos-upgrades   File "/usr/local/lib/python3.9/concurrent/futures/thread.py", line 58, in run
cosmos-upgrades     result = self.fn(*self.args, **self.kwargs)
cosmos-upgrades   File "/app/app.py", line 707, in <lambda>
cosmos-upgrades     lambda network, path: fetch_data_for_network(
cosmos-upgrades   File "/app/app.py", line 623, in fetch_data_for_network
cosmos-upgrades     current_block_datetime = parse_isoformat_string(current_block_time)
cosmos-upgrades   File "/app/app.py", line 207, in parse_isoformat_string
cosmos-upgrades     return datetime.fromisoformat(date_string)
cosmos-upgrades Error in update_data loop after 95.593695 seconds: Invalid isoformat string: '2023-10-19T17:43:49.9951+00:00'
cosmos-upgrades Error encountered. Sleeping for 1 minute before retrying...
cosmos-upgrades ValueError: Invalid isoformat string: '2023-10-19T17:43:49.9951+00:00'
cosmos-upgrades 35.191.10.130 - - [19/Oct/2023 17:44:12] "GET /healthz HTTP/1.1" 200 -

We need to catch the error and/or parse the string differently depending on the issue.
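The root cause is that `datetime.fromisoformat` on Python < 3.11 (the traceback shows 3.9) only accepts fractional seconds with exactly 3 or 6 digits, and `.9951` has 4. One fix is to normalize the fraction to 6 digits before parsing; a sketch:

```python
import re
from datetime import datetime

# Matches the first ".<digits>" run, i.e. the fractional-seconds part.
FRACTION_RE = re.compile(r"\.(\d+)")


def parse_isoformat_string(date_string):
    """Parse an ISO-8601 timestamp, tolerating odd fractional-second lengths
    (e.g. '.9951') that Python < 3.11 rejects, by padding to 6 digits."""
    def pad(match):
        return "." + match.group(1)[:6].ljust(6, "0")
    return datetime.fromisoformat(FRACTION_RE.sub(pad, date_string, count=1))
```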

Add support for block-height to UTC time

Based on the speed of the network and the unbonding time we can determine how many blocks per second happen.

Here is another script I wrote in the past which calculates this. This functionality should be integrated to create a new field in the JSON dump with the UTC time of the upgrade, based on the prediction function.

"""cosmos script to calculate chain info"""

import requests
import typer
from dateutil import parser

# How many blocks to go back to calculate seconds per block.
# The higher the number, the more accurate the calculation
DELTA_BLOCKS = 35000
PRUNING_PADDING = 1000


def get_latest_height_time(peer):
    """get_latest_height_time returns the latest block height and time"""
    try:
        res = requests.get(f"{peer}/status", timeout=10)
        res.raise_for_status()
        data = res.json()
    except requests.exceptions.RequestException as e:
        raise SystemExit(f"Error getting latest block height and time: {e}\n{e.response.content}")

    try:
        latest_height = int(data["result"]["sync_info"]["latest_block_height"])
        latest_time_seconds = int(parser.isoparse(data["result"]["sync_info"]["latest_block_time"]).timestamp())
    except (ValueError, KeyError) as e:
        raise SystemExit(f"Error getting latest block height and time: {e}")

    return latest_height, latest_time_seconds


def get_delta_time(peer, delta_blocks, latest_height):
    """get_delta_time returns the time of the block at the given delta blocks"""
    delta_height = latest_height - delta_blocks
    try:
        res = requests.get(f"{peer}/block?height={delta_height}", timeout=10)
        res.raise_for_status()
        data = res.json()
    except requests.exceptions.RequestException as e:
        raise SystemExit(f"Error getting delta time: {e}\n{e.response.content}")

    try:
        delta_time_seconds = int(parser.isoparse(data["result"]["block"]["header"]["time"]).timestamp())
    except (ValueError, KeyError) as e:
        raise SystemExit(f"Error calculating delta_time_seconds: {e}")

    return delta_time_seconds


def main(rpc_peer: str, unbonding_days: int):
    latest_height, latest_time_seconds = get_latest_height_time(rpc_peer)
    delta_time_seconds = get_delta_time(rpc_peer, DELTA_BLOCKS, latest_height)

    # Calculate seconds per block
    seconds_per_block = round((latest_time_seconds - delta_time_seconds) / DELTA_BLOCKS, 2)
    print(f"Seconds per block: {seconds_per_block}")

    # Calculate unbonding period in blocks
    unbonding_period_blocks = round((unbonding_days * 86400) / seconds_per_block)
    print(f"Unbonding period blocks: {unbonding_period_blocks}")

    # Calculate minimum blocks to retain
    blocks_to_keep = int(round(((unbonding_period_blocks + PRUNING_PADDING) / DELTA_BLOCKS) + 0.5) * DELTA_BLOCKS)
    print(f"Set min-retain-blocks = {blocks_to_keep} in app.toml")


if __name__ == "__main__":
    typer.run(main)
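Given the seconds-per-block figure computed above, estimating the UTC time of an upgrade height is a small extrapolation from the latest block. A sketch of the field the script could feed into the JSON dump (function name is illustrative):

```python
from datetime import datetime, timezone


def estimate_upgrade_time(latest_height, latest_time_seconds,
                          upgrade_height, seconds_per_block):
    """Extrapolate the UTC time at which upgrade_height should be reached."""
    remaining_blocks = upgrade_height - latest_height
    eta_seconds = latest_time_seconds + remaining_blocks * seconds_per_block
    return datetime.fromtimestamp(eta_seconds, tz=timezone.utc)
```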

Quasar reporting

Check on Quasar upgrades; we are getting reports that the cosmos-sdk version is being returned instead of the chain binary version.

bug: some chains don't check for upgrades.

https://twitter.com/rhinostake/status/1698394405760377072

For some reason, umee did not find a rest_server to use to check for upgrades despite the logs showing

Found 16 rest endpoints and 16 rpc endpoints for umee
Completed fetch data for network umee
{
  "type": "mainnet",
  "network": "umee",
  "rest_server": "",
  "rpc_server": "https://rpc-umee.cosmos-spaces.cloud",
  "latest_block_height": 8179463,
  "upgrade_found": false,
  "upgrade_name": "",
  "source": "",
  "upgrade_block_height": null,
  "estimated_upgrade_time": null,
  "version": ""
}

I think the problem is bigger than umee. We need to have checks to ensure each chain has at least one functional RPC/REST server from the CR.

Add example of how to filter the requests from mainnets

# Define an array of networks you care about
networks=("network1" "network2" "network3" ... "network15")

# Construct the jq filter dynamically
jq_filter='.[] | select(.network | IN('
for network in "${networks[@]}"; do
    jq_filter+='"'$network'",'
done
jq_filter=${jq_filter%,}'))'  # Remove the trailing comma and close the parenthesis

# Use the constructed filter with curl and jq
curl -s -X GET \
  -H "Content-Type: application/json" \
  http://localhost:5000/mainnets | jq "$jq_filter"

persist chain registry data

Rework Defiant infra to provide a persistent disk that can hold chain-registry data.
Update the Python code logic so that if the data is less than X hours old, it does not need to be re-fetched.
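The freshness check can be done with a file mtime test; a sketch with the fetcher injected (the 6-hour default and function names are illustrative stand-ins for the unspecified X):

```python
import json
import os
import time


def load_registry_cached(path, fetch, max_age_hours=6):
    """Return cached chain-registry data from `path` if it is fresher than
    max_age_hours; otherwise call `fetch()` and persist the result."""
    if os.path.exists(path):
        age_seconds = time.time() - os.path.getmtime(path)
        if age_seconds < max_age_hours * 3600:
            with open(path) as f:
                return json.load(f)
    data = fetch()
    with open(path, "w") as f:
        json.dump(data, f)
    return data
```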

Use plan name and search on repo git tags to find potential versions

We currently do a naive regex search on the string dump of the entire plan when trying to find version numbers. We should attempt to do the following instead:

  1. Get the plan name, these are usually of the form "v"
  2. Search the git tags in the associated chain repo
  3. Find all matching tags that have the plan name version number in them
  4. Return the "best" match (or the closest match that can be made)
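The tag-matching step of the workflow above could look like this (a sketch; the matching heuristic and function name are illustrative):

```python
import re


def candidate_tags(plan_name, tags):
    """Given a plan name like 'v12', return repo tags sharing its major
    version, sorted best (highest version) first."""
    m = re.match(r"v?(\d+)", plan_name or "")
    if not m:
        return []
    major = m.group(1)
    # Match v12, v12.0, v12.0.1, ... but not pre-release tags like v12.0.0-rc1.
    pattern = re.compile(rf"^v?{re.escape(major)}(\.\d+)*$")

    def sort_key(tag):
        return [int(part) for part in tag.lstrip("v").split(".")]

    return sorted((t for t in tags if pattern.match(t)), key=sort_key, reverse=True)
```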
