
Comments (3)

Longarithm commented on July 17, 2024

Current progress:

  • After a couple of days of learning, I was able to launch mirroring for the only existing setup, at height 74020501. My branch with background fetching disabled works (Grafana).
  • With background fetching enabled, nodes crash with the error below. I will investigate it.
thread 'actix-rt|system:0|arbiter:2' panicked at 'Cannot update flat head to GTrgwbrks8pjEwd6vdWvKUBkZZPVEFDyRM9cJuCGt4Qj: StorageInternalError("delta does not exist for block GTrgwbrks8pjEwd6vdWvKUBkZZPVEFDyRM9cJuCGt4Qj")', chain/chain/src/chain.rs:2294:25
stack backtrace:
   0: rust_begin_unwind
             at /rustc/8ede3aae28fe6e4d52b38157d7bfe0d3bceef225/library/std/src/panicking.rs:593:5
   1: core::panicking::panic_fmt
             at /rustc/8ede3aae28fe6e4d52b38157d7bfe0d3bceef225/library/core/src/panicking.rs:67:14
   2: near_chain::chain::Chain::update_flat_storage_for_block
note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.
  • Asked @marcelo-gonzalez to create a setup for height 94194482. We are discussing the possibility of merging the most recent mirror code into master.
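
To make the panic message easier to read, here is a toy model of the general idea behind a flat-storage head update: state is kept at a "head" block, each later block contributes a delta, and moving the head forward means applying every delta on the path. This is only an illustrative sketch, not nearcore's implementation. `ToyFlatStorage`, `MissingDelta`, and the block names are hypothetical, and the thread does not say why the delta was missing in the actual crash.

```python
class MissingDelta(Exception):
    """Raised when a block on the path to the new head has no stored delta."""


class ToyFlatStorage:
    """Hypothetical toy model: state at `head` plus per-block deltas."""

    def __init__(self, head, state):
        self.head = head
        self.state = dict(state)
        # block hash -> (previous block hash, {key: new value, or None for a deletion})
        self.deltas = {}

    def add_delta(self, block, prev, changes):
        self.deltas[block] = (prev, dict(changes))

    def update_head(self, new_head):
        # Walk back from the new head to the current head, collecting deltas.
        path = []
        block = new_head
        while block != self.head:
            if block not in self.deltas:
                raise MissingDelta(f"delta does not exist for block {block}")
            prev, changes = self.deltas[block]
            path.append((block, changes))
            block = prev
        # Apply the deltas oldest first, then drop them.
        for applied_block, changes in reversed(path):
            for key, value in changes.items():
                if value is None:
                    self.state.pop(key, None)
                else:
                    self.state[key] = value
            del self.deltas[applied_block]
        self.head = new_head
```

In this model, any code path that drops or never writes a delta for a block between the old and new head makes the head update fail, which is the shape of the error in the log above.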

from nearcore.

jakmeier commented on July 17, 2024

As for the best way to produce load that maximizes storage pressure, I have been able to consistently observe slight undercharging in the following way:

  • Use large Sweat batches (should be in master soon: #9385)
  • Reduce the shard cache size in config.json to 1MB (to simulate when the state doesn't fit in the cache)
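
A minimal sketch of the second step, for anyone who wants to script it. It rewrites one nested value in config.json. The key path `store.trie_cache.default_max_bytes` and the choice of 1,000,000 bytes for "1MB" are assumptions, not taken from this thread, so check them against the config.json generated by your nearcore build. The helper names are hypothetical.

```python
import json
from pathlib import Path

# ASSUMPTION: key path of the shard (trie) cache size; verify for your nearcore version.
ASSUMED_CACHE_KEY = ("store", "trie_cache", "default_max_bytes")
# ASSUMPTION: "1MB" interpreted as 10^6 bytes.
ONE_MB = 1_000_000


def set_nested(config: dict, path: tuple, value) -> dict:
    """Set config[path[0]][path[1]]... = value, creating missing levels."""
    node = config
    for key in path[:-1]:
        node = node.setdefault(key, {})
    node[path[-1]] = value
    return config


def shrink_shard_cache(config_path: Path, max_bytes: int = ONE_MB) -> None:
    """Load config.json, lower the assumed cache-size key, and write it back."""
    config = json.loads(config_path.read_text())
    set_nested(config, ASSUMED_CACHE_KEY, max_bytes)
    config_path.write_text(json.dumps(config, indent=2) + "\n")
```

Usage would be something like `shrink_shard_cache(Path("config.json"))`, run against the node's home directory before restarting the node.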

With that, using gcloud persistent SSDs, I saw slight undercharging, but not more than compute costs account for.
You can try to make the test more extreme by adding more accounts to the Sweat contract, but that will take more time.
You can also try running one Sweat contract on every shard. Let me know if you need help arranging that.

Grafana board that shows constant undercharging in shard 3 for almost a week: https://nearinc.grafana.net/d/1dZGhpJ4k/blockchain-utilization?orgId=1&refresh=30m&var-chain_id=testnet-experimental&var-role=All&var-node_type=All&var-node_id=jakmeier-benchmarking-tps-validator0&var-shard_id=3&from=now-7d&to=now

Longarithm commented on July 17, 2024

I have been posting updates in the Zulip thread over the last week. Current summary:

Too small State latency

Finally, an improvement

However, using the same debug metrics, we can clearly see that the idea improves the time spent on main-thread State reads (compare without fetching and with fetching). There is still no confirmation that it resolves the incident.

Ideas for next steps

I'll cautiously say that background fetching works. Still, we need to show that it resolves the bottleneck we had during high load. What we can do:

  • I would like to understand where the actual bottleneck in the mirrored traffic is, since it is still unclear. But that's not the priority.
  • Try to increase State latency for mirroring. One discrepancy with the real setup is that we start from DBCol::State storing only the latest State. In reality, State stores data for 5 epochs, which should take ~2x more space.
  • Focus on measuring traffic from height 94194482, which corresponds to the latest incident.
  • Increase State latency for the locust setup. @jakmeier's setup shows that this is quite feasible (grafana).
