agave's Introduction

Solana


Building

1. Install rustc, cargo and rustfmt.

$ curl https://sh.rustup.rs -sSf | sh
$ source $HOME/.cargo/env
$ rustup component add rustfmt

When building the master branch, please make sure you are using the latest stable rust version by running:

$ rustup update

When building a specific release branch, you should check the rust version in ci/rust-version.sh and if necessary, install that version by running:

$ rustup install VERSION

Note that if this is not the latest rust version on your machine, cargo commands may require an override in order to use the correct version.

On Linux systems you may need to install libssl-dev, pkg-config, zlib1g-dev, protobuf etc.

On Ubuntu:

$ sudo apt-get update
$ sudo apt-get install libssl-dev libudev-dev pkg-config zlib1g-dev llvm clang cmake make libprotobuf-dev protobuf-compiler

On Fedora:

$ sudo dnf install openssl-devel systemd-devel pkg-config zlib-devel llvm clang cmake make protobuf-devel protobuf-compiler perl-core

2. Download the source code.

$ git clone https://github.com/anza-xyz/agave.git
$ cd agave

3. Build.

$ ./cargo build

Testing

Run the test suite:

$ ./cargo test

Starting a local testnet

To start your own testnet locally, see the instructions in the online docs.

Accessing the remote development cluster

  • devnet - stable public cluster for development accessible via devnet.solana.com. Runs 24/7. Learn more about the public clusters

Benchmarking

First, install the nightly build of rustc. cargo bench requires the use of the unstable features only available in the nightly build.

$ rustup install nightly

Run the benchmarks:

$ cargo +nightly bench

Release Process

The release process for this project is described here.

Code coverage

To generate code coverage statistics:

$ scripts/coverage.sh
$ open target/cov/lcov-local/index.html

Why coverage? While most see coverage as a code quality metric, we see it primarily as a developer productivity metric. When a developer makes a change to the codebase, presumably it's a solution to some problem. Our unit-test suite is how we encode the set of problems the codebase solves. Running the test suite should indicate that your change didn't infringe on anyone else's solutions. Adding a test protects your solution from future changes. Say you don't understand why a line of code exists, try deleting it and running the unit-tests. The nearest test failure should tell you what problem was solved by that code. If no test fails, go ahead and submit a Pull Request that asks, "what problem is solved by this code?" On the other hand, if a test does fail and you can think of a better way to solve the same problem, a Pull Request with your solution would most certainly be welcome! Likewise, if rewriting a test can better communicate what code it's protecting, please send us that patch!

agave's Issues

`TransactionCost::SimpleVote` doesn't report correct number of sigs for two-sig votes

Problem

The TransactionCost enum has two variants: SimpleVote and Transaction. A SimpleVote is defined by these things:

/// SimpleVote has a simpler and pre-determined format: it has 1 or 2 signatures,
/// 2 write locks, a vote instruction and less than 32k (page size) accounts to load.
/// It's cost therefore can be static #33269.

Note that a SimpleVote can contain 1 or 2 signatures, but TransactionCost::num_transaction_signatures() reports a fixed value of 1 for simple votes:

pub fn num_transaction_signatures(&self) -> u64 {
    match self {
        Self::SimpleVote { .. } => 1,
        Self::Transaction(usage_cost) => usage_cost.num_transaction_signatures,
    }
}

This doesn't affect actual fees (the fee-payer will still be charged for 2 sigs) or CUs as things are currently written, but for the sake of accurate counting, it is incorrect.

Proposed Solution

Update SimpleVote to track the number of signatures:

SimpleVote { writable_accounts: Vec<Pubkey> },

and populate that field here:

TransactionCost::SimpleVote {
    writable_accounts: Self::get_writable_accounts(transaction),
}
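A minimal sketch of how the variant could carry the signature count so the accessor no longer hard-codes 1; the field name and the omission of the Transaction arm are illustrative, not the final design from the issue:

use solana_sdk::pubkey::Pubkey;

enum TransactionCost {
    // Hypothetical extension: store the count alongside the writable accounts.
    SimpleVote {
        writable_accounts: Vec<Pubkey>,
        num_transaction_signatures: u64,
    },
    // The Transaction(UsageCostDetails) variant is elided in this sketch.
}

impl TransactionCost {
    pub fn num_transaction_signatures(&self) -> u64 {
        match self {
            Self::SimpleVote {
                num_transaction_signatures,
                ..
            } => *num_transaction_signatures,
        }
    }
}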

Propagate ELF and verification errors to SVM

Problem

In the SVM project, one can bypass the usual program deployment steps that developers carry out to deploy a program on chain. As a consequence, if someone tries to execute a bad program in the SVM without prior verification, the error message is simply "failed verification" without further description.

Proposed Solution

Propagate to the SVM the pertinent errors.

Slot can be used for the AppendVec ID

Problem

When creating a new append vec, it needs an ID. That ID comes from an atomic u32 that is incremented for each append vec. Before the accounts write cache, it was possible to have multiple append vecs for a single slot, so the ID was used as a unique identifier across all append vecs. Now there can only be one append vec per slot. This means the slot itself is a unique identifier for all append vecs.

The atomic variable used to get the new append vec IDs can be removed. New append vecs can use the slot as their ID.

Proposed Solution

  • Remove the atomic variable for the next append vec ID.
  • When creating new append vecs, use the slot as the ID.
  • When extracting snapshot archives, the remapper needs to handle old snapshot archives that used a different append vec ID. In this case, the remapper should rename the storage files and set their append vec IDs to match their slots.

Solana Frontend connection issue for making the transactions

We are working on a DEX-based solution on the Solana network; it works properly on localnet, including all the transactions. But when we switch to devnet, keeping everything else the same, we encounter the following issues. Please let me know whether devnet has any issues or whether any other solution is needed to resolve this case.


cli: Relax stake split destination check

This is a copy of solana-labs#32735

Problem

In the CLI, process_stake_split checks that the destination stake account doesn't exist, and errors if it does. In most cases, this saves from getting an on-chain error from trying to split into some other account.

https://github.com/solana-labs/solana/blob/849525735f784f8e2e0fca19ba5699aea6b1724e/cli/src/stake.rs#L1884-L1893

However, it's possible to pre-seed some lamports into a system account before splitting into it, to keep the full amount delegated, so this check is needlessly strict.

Proposed Solution

Relax the check by allowing the destination split account to be owned by the system program, with some lamports and no data. It can still error in every other case.
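A minimal sketch of the relaxed check, assuming the CLI has already fetched the destination account; the helper name is illustrative, not the actual cli code:

use solana_sdk::{account::Account, system_program};

// Allow the split destination to either not exist, or to be a system-owned
// account with no data (it may hold pre-seeded lamports). Every other case
// should still be rejected, as before.
fn is_acceptable_split_destination(destination: Option<&Account>) -> bool {
    match destination {
        None => true,
        Some(account) => account.owner == system_program::id() && account.data.is_empty(),
    }
}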

Anchor Deploy is very slow on mainnet

solana --version
solana-cli 1.18.3 (src:6f13e1c2; feat:3352961542, client:SolanaLabs)

anchor --version
anchor-cli 0.29.0

rustc --version
rustc 1.76.0 (07dca489a 2024-02-04)

I tried to deploy the anchor program on mainnet
anchor deploy

But the speed is very slow (on devnet, the speed was fast).
The deploy fee was very high, and the deployment had only progressed about 1% after 6 hours.

Could you please let me know whether the Solana network is very busy and the speed is very low right now?
If so, how can I speed up the deployment on mainnet?

quic batched send metrics are unreliable on error

Problem

The RPC send-transaction-service only increments its send-failure count once on error, regardless of batch size, in the case of batched sends:

let result = if wire_transactions.len() == 1 {
    Self::send_transaction(tpu_address, wire_transactions[0], connection_cache)
} else {
    Self::send_transactions_with_metrics(tpu_address, wire_transactions, connection_cache)
};
if let Err(err) = result {
    warn!(
        "Failed to send transaction transaction to {}: {:?}",
        tpu_address, err
    );
    stats.send_failure_count.fetch_add(1, Ordering::Relaxed);
}

It can't really do any better, because the quic TPU client's batched-send implementation bails on the first error, discarding the statuses of any prior and pending sends:
for f in futures {
    f.await
        .into_iter()
        .try_for_each(|res| res)
        .map_err(Into::<ClientErrorKind>::into)?;
}
Ok(())

Proposed Solution

Investigate whether waiting to collect all results of a batched send carries a significant performance penalty.
If so, split the RPC send-transaction-service metrics to report failed single and batched sends separately.
If not, rework the quic client's batched send so that it can return the error count.
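A minimal sketch of the "collect everything" variant: await every per-connection send future and report the number of failures instead of short-circuiting on the first error. The types are simplified stand-ins, not the actual quic client code:

// Await all batched-send futures and count failures rather than bailing early.
async fn await_batched_sends<F, E>(futures: Vec<F>) -> Result<(), usize>
where
    F: std::future::Future<Output = Vec<Result<(), E>>>,
{
    let mut error_count = 0usize;
    for f in futures {
        // Drain every result instead of propagating the first error with `?`.
        error_count += f.await.into_iter().filter(|res| res.is_err()).count();
    }
    if error_count == 0 {
        Ok(())
    } else {
        Err(error_count) // surface the count so callers can report metrics accurately
    }
}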

Investigation items for SVM

Items to possibly investigate in the SVM context

  • Presently, the SVM processes a transaction and returns its results without updating the database with the account modifications. It may be a little counterintuitive to not have this update in the SVM, as external users would need to update their database themselves. Investigate if it is possible to move account commits to the SVM.

  • External users may want to use programs without a loader or with a custom loader. Investigate whether it is possible to relax the existing constraints to expose newer loaders without compromising security.

  • Maintenance of the Solana account model.

Scheduler: Bank Detection Metrics

Problem

  • Useful to have metrics for bank creation, detection, and time remaining

Proposed Solution

  • Add metrics similar to those leader_slot_loop_timings from legacy banking stage

Sysvar cache manipulation functions can be moved to SVM

Problem

The set of functions depicted below manipulate the sysvar cache, which is a member of the transaction processor (i.e. the SVM), but they are implemented in bank.

impl Bank {
    pub(crate) fn fill_missing_sysvar_cache_entries(&self) {
        let mut sysvar_cache = self.transaction_processor.sysvar_cache.write().unwrap();
        sysvar_cache.fill_missing_entries(|pubkey, callback| {
            if let Some(account) = self.get_account_with_fixed_root(pubkey) {
                callback(account.data());
            }
        });
    }

    pub(crate) fn reset_sysvar_cache(&self) {
        let mut sysvar_cache = self.transaction_processor.sysvar_cache.write().unwrap();
        sysvar_cache.reset();
    }

    pub fn get_sysvar_cache_for_tests(&self) -> SysvarCache {
        self.transaction_processor
            .sysvar_cache
            .read()
            .unwrap()
            .clone()
    }
}

In addition, external users of the SVM need to set up the sysvar cache to use sysvars.

Proposed Solution

Move the functions to SVM.

SVM - requirement of load_and_execute_sanitized_transaction

Problem

  • load_and_execute_sanitized_transactions has an undocumented requirement that sanitized_txs contains no conflicting transactions.
  • If conflicting transactions are passed in, it will not crash, but may give incorrect results
    • The validation and leader code currently check before calling, so this is not happening in our production environment
    • However, this function requirement is neither checked nor documented in this new publicly exposed interface
    • If other groups begin using this interface, it is a non-obvious requirement

Proposed Solution
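The issue leaves the solution open. One minimal sketch is to make the precondition explicit with a check (or debug assertion) at the top of the function; this only covers write-write conflicts and is an illustration, not code from the issue:

use std::collections::HashSet;
use solana_sdk::{pubkey::Pubkey, transaction::SanitizedTransaction};

// Returns false if any two transactions in the batch take a write lock on the
// same account. Read-write conflicts would need the same treatment.
fn batch_has_no_write_conflicts(sanitized_txs: &[SanitizedTransaction]) -> bool {
    let mut write_locked: HashSet<Pubkey> = HashSet::new();
    for tx in sanitized_txs {
        let message = tx.message();
        for (index, key) in message.account_keys().iter().enumerate() {
            if message.is_writable(index) && !write_locked.insert(*key) {
                return false;
            }
        }
    }
    true
}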

cli: Verify ELF with actual feature set on network before program deployment

This is a copy of solana-labs#32462

Problem

As noticed in solana-labs#32448, there's currently an issue deploying programs to the network that use new syscalls. The quick solution there was to change the check to FeatureSet::all_enabled() to unblock users. This is OK because the loader will do a final check before deployment.

However, it's still possible to write an ELF buffer that contains an unactivated syscall, only for it to fail at the final deployment stage. That's not great UX, since the CLI can tell the user whether their program contains unresolved symbols before ever attempting a deployment.

Proposed Solution

Rather than using FeatureSet::all_enabled(), fetch the actual feature set from the network and validate the ELF against that. We can also add a flag to skip these checks. This can be useful for a DAO or multisig queuing up a program upgrade once a syscall is activated.
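A minimal sketch of the idea, assuming the CLI has an RpcClient for the target cluster; the chunk size, error handling, and exact crate paths are simplifications rather than the final implementation:

use solana_client::rpc_client::RpcClient;
use solana_sdk::feature;
use solana_sdk::feature_set::{FeatureSet, FEATURE_NAMES};

// Build a FeatureSet that mirrors what is actually activated on the cluster,
// instead of assuming FeatureSet::all_enabled().
fn feature_set_from_cluster(rpc_client: &RpcClient) -> FeatureSet {
    let mut feature_set = FeatureSet::default(); // starts with every feature inactive
    let feature_ids: Vec<_> = FEATURE_NAMES.keys().cloned().collect();
    for chunk in feature_ids.chunks(100) {
        if let Ok(accounts) = rpc_client.get_multiple_accounts(chunk) {
            for (feature_id, account) in chunk.iter().zip(accounts) {
                let activated_at = account
                    .as_ref()
                    .and_then(|acct| feature::from_account(acct))
                    .and_then(|f| f.activated_at);
                if let Some(slot) = activated_at {
                    feature_set.activate(feature_id, slot);
                }
            }
        }
    }
    feature_set
}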

primordial_accounts_file with more than 200M SOL results in bad genesis

Problem

If the cumulative balance in a primordial_accounts_file is more than about 165,000,000 SOL, then the balance of the one thanks account computed here is 0. The computation of stakes_lamports in create_and_add_stakes then underflows, resulting in an account having a balance of nearly 2^64 lamports. This causes sporadic panics when running the chain because capitalization overflows.

Proposed Solution

I don't understand the purpose of the one thanks "community" account. Is it still needed? Is there a reason its balance needs to be computed that way?

quic streamer metrics are easy to misinterpret

Problem

quic streamer metrics have several footguns that make them easy to misinterpret without heavy memorization of quirks or referencing the code

Proposed Solution

Holistic recapitulation of the quic streamer metrics. This should happen under a new datapoint name to avoid further confusion due to aliasing during upgrades.

Update the epoch stake in use in wen_restart

In the current wen_restart implementation, we use epoch_stakes from the root bank. This is wrong if some validators have their root bank in epoch X while others have their root bank in epoch X+1; remember to fix it later.

Error: failed to get balance of account TypeError: fetch failed

Problem

Hello, I'm trying to follow the tutorial at https://www.soldev.app/course/intro-to-reading-data.
But the following code doesn't work:

import { Connection, PublicKey, clusterApiUrl } from "@solana/web3.js";

const connection = new Connection(clusterApiUrl("devnet"));
const address = new PublicKey('CenYq6bDRB7p73EjsPEpiYN7uveyPUTdXkDkgUduboaN');
const balance = await connection.getBalance(address);

console.log(`The balance of the account at ${address} is ${balance} lamports`); 
console.log(`✅ Finished!`)

It results in:

yarco@heaven scripts$ HTTPS_PROXY=http://127.0.0.1:7890 npx esrun key_001.ts 
bigint: Failed to load bindings, pure JS will be used (try npm run rebuild?)
(node:55856) [DEP0040] DeprecationWarning: The `punycode` module is deprecated. Please use a userland alternative instead.
(Use `node --trace-deprecation ...` to show where the warning was created)
/Users/yarco/Documents/Workspace/Blockchain/solana/scripts/node_modules/.pnpm/@[email protected]/node_modules/@solana/web3.js/src/connection.ts:3245
        throw new Error(
              ^


Error: failed to get balance of account CenYq6bDRB7p73EjsPEpiYN7uveyPUTdXkDkgUduboaN: TypeError: fetch failed
    at e (/Users/yarco/Documents/Workspace/Blockchain/solana/scripts/node_modules/.pnpm/@[email protected]/node_modules/@solana/web3.js/src/connection.ts:3245:15)
    at process.processTicksAndRejections (node:internal/process/task_queues:95:5)
    at Connection.getBalance (/Users/yarco/Documents/Workspace/Blockchain/solana/scripts/node_modules/.pnpm/@[email protected]/node_modules/@solana/web3.js/src/connection.ts:3242:12)
    at async file:///Users/yarco/Documents/Workspace/Blockchain/solana/scripts/[eval1]:13:15

Node.js v21.7.1

There's also a similar question on solana.stackexchange.com, but it got no answer.

I'm pretty sure it is not a networking problem, because the axios-based answer to another question on stackoverflow.com does work.


So, maybe something is wrong in web3.js? Perhaps the "bigint: Failed to load bindings, pure JS will be used (try npm run rebuild?)" warning is related?
(I've also tried downgrading Node to v18; still the same error.)

Proposed Solution

For now, only the axios version works for me: https://stackoverflow.com/a/77715282/1960612

Mac m1 pro
Node: v21.7.1 (also tried v18)
pnpm: 8.15.5

@solana-developers/helpers 2.1.0
@solana/web3.js 1.91.1
axios 1.6.8
bigint-buffer 1.1.5
dotenv 16.4.5
esrun 3.2.26
typescript 5.4.2

Consider upgrading outdated crate curve25519-dalek 3.2.1 -> 4.x

Problem

Since curve25519-dalek 3.2.1 constrains its zeroize dependency to version >=1, < 1.4, this conflicts with using the crates tiny-bip39 1.0.0 and solana-sdk together.

Proposed Solution

I've tried changing the curve25519-dalek version to the latest one, 4.1.2, and importing it with a path specification; it works properly for address generation. However, I saw that some other crates are outdated, too. Are there known problems that prevent upgrading the dependencies? If not, please consider upgrading the dependency versions to the latest ones, especially curve25519-dalek.

SVM entry point requires builtins registered in two different places

Problem

In order to use the SVM, we must register the builtins as it is now done in the add_builtin function in bank.rs (link), and then pass them to load_and_execute_sanitized_transactions as a vector of public keys.

Proposed Solution

Ideally, we would only register the builtins on the SVM once, obviating the vector of public keys passed to load_and_execute_sanitized_transactions.

Allow configuration of all thread-pool sizes via CLI

Problem

As mentioned in #35, agave-validator (previously solana-validator) spawns lots of thread pools / tokio runtimes. For the majority of these, the size of the pool/runtime is specified by a constant in code, or configured to scale with the number of cores on the machine.

Some of the sizes were carefully chosen and may still be appropriate. Other values might have been unintentionally set to scale with the number of cores (this is the default behavior when the number of threads is not explicitly set). Or, some values might have become "incorrect" as the workload of the cluster changed or refactors shifted where the heavier operations occur.

In any case, being able to control these sizes could be a powerful knob to turn.

Proposed Solution

  • Add CLI flags + plumbing for thread pool size parameters (see the sketch after this list)
  • The addition of the CLI flags should NOT alter any thread pool sizes
    • This can be tackled in subsequent work tracked by #35
  • The flags should probably be hidden, maybe forever. Or, if we make them public, they'll need a warning along the lines of "you should only be using this if you know what you're doing"
  • Finally, having the ability to configure all sizes should make the rayon-threadlimit crate obsolete, although we may need to follow a proper deprecation sequence for that crate
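A minimal sketch of what one such hidden flag could look like in the clap 2.x style used by the validator CLI; the flag name and help text are illustrative, not a proposal for a specific pool:

use clap::Arg;

// A hidden, opt-in knob for a single thread pool size. Keeping it hidden
// leaves it out of --help output while the sizes are being audited.
fn thread_pool_args<'a, 'b>() -> Vec<Arg<'a, 'b>> {
    vec![Arg::with_name("replay_forks_threads")
        .long("replay-forks-threads")
        .value_name("NUM_THREADS")
        .takes_value(true)
        .hidden(true)
        .help("Number of threads to use for replaying forks")]
}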

Bug when changing from platform tools v1.37 to v1.39

Problem

When building Openbook-V2 with the tools version v1.37, the following steps work correctly, but fail in platform tools version v1.39.

Steps to reproduce:

  1. git clone https://github.com/openbook-dex/openbook-v2
  2. cargo build-sbf --features enable-gpl
  3. solana program deploy ./target/deploy/openbook_v2.so
  4. yarn run ts-node test.ts
import { Connection, Keypair, LAMPORTS_PER_SOL, PublicKey, Transaction } from '@solana/web3.js';
import * as openbook from '@openbook-dex/openbook-v2';
import { Program, BN, AnchorProvider, Wallet } from '@coral-xyz/anchor';
import * as splToken from "@solana/spl-token";

import * as fs from 'fs';

async function createMint(connection : Connection,  authority: Keypair, nb_decimals = 6) : Promise<PublicKey> {
    const kp = Keypair.generate();
    return await splToken.createMint(connection, 
        authority, 
        authority.publicKey, 
        authority.publicKey, 
        nb_decimals,
        kp)
}

export async function main() {

    const secretKey = JSON.parse(fs.readFileSync("/home/galactus/.config/solana/id.json", "utf-8"));
    const keypair = Keypair.fromSecretKey(new Uint8Array(secretKey));
    const authority = keypair;
    const payer = authority;
    const connection = new Connection("https://api.testnet.rpcpool.com/dfeb84a5-7fe8-4783-baf9-60cca0babbc7", "processed");


    let airdrop_sig = await connection.requestAirdrop(authority.publicKey, 2 * LAMPORTS_PER_SOL);
    await connection.confirmTransaction(airdrop_sig);

    let baseMint = await createMint(connection, authority, 6);
    let quoteMint = await createMint(connection, authority, 6);

    const quoteLotSize = new BN(1000000);
    const baseLotSize = new BN(1000000000);
    const makerFee = new BN(0);
    const takerFee = new BN(0);
    const timeExpiry = new BN(0);

    const wallet = new Wallet(authority);

    const programId = new PublicKey("AiqQtnazKRRUkn9enZ9SLUy35FS5aT38QrBET3qCiPqF");
    const provider = new AnchorProvider(connection, wallet, {});
    let client = new openbook.OpenBookV2Client( provider, programId);

    // Add your test here.
    const [[bidIx, askIx, eventHeapIx, ix], [market, bidsKeypair, askKeypair, eventHeapKeypair]] = await client.createMarketIx(
      authority.publicKey,
      "Market Name",
      quoteMint,
      baseMint,
      quoteLotSize,
      baseLotSize,
      makerFee,
      takerFee,
      timeExpiry,
      null, // oracleA
      null, // oracleB
      null, // openOrdersAdmin
      null, // consumeEventsAdmin
      null, // closeMarketAdmin
    );
    console.log("sending tx");

    let tx = new Transaction();
    tx.add(bidIx);
    tx.add(askIx);
    tx.add(eventHeapIx);
    tx.add(ix);
    tx.recentBlockhash = (await connection.getLatestBlockhash()).blockhash;
    // Send transaction
    let sig = await connection.sendTransaction(tx, [authority, market, bidsKeypair, askKeypair, eventHeapKeypair], {
        skipPreflight: false
    });
    console.log('Your transaction signature', sig);
    await connection.confirmTransaction(sig);
    
    console.log("Market initialized successfully");
    console.log("Market account:", market.publicKey.toBase58());
    console.log("Bids account:", bidsKeypair.publicKey.toBase58());
    console.log("Asks account:", askKeypair.publicKey.toBase58());
    console.log("Event heap account:", eventHeapKeypair.publicKey.toBase58());
    // console.log("Market authority:", market.authority.toBase58());
    console.log("Quote mint:", quoteMint.toBase58());
    console.log("Base mint:", baseMint.toBase58());
    console.log("Quote lot size:", quoteLotSize.toString());
    console.log("Base lot size:", baseLotSize.toString());
}

main().then(x => {
    console.log('finished sucessfully')
}).catch(e => {
    console.log('caught an error : ' + e)
})

Error in v1.39:

Program 9QJrVWzEaZBjao31iqBNaGqmXUNim7tmdb9kgczqGQXD failed: Access violation in unknown section at address 0x0 of size 8'

Proposed Solution

Investigating the problem

Increase the rebroadcast queue size

Problem

const MAX_TRANSACTION_RETRY_POOL_SIZE: usize = 10_000; // This seems like a lot but maybe it needs to be bigger one day

We're seeing a lot of dropped transactions on RPC nodes, with the documented suggested approach to solving this (shown below) working as a workaround, but it only amplifies the problem:

while (blockheight < lastValidBlockHeight) {
  connection.sendRawTransaction(rawTransaction, {
    skipPreflight: true,
  });
  await sleep(500);
  blockheight = await connection.getBlockHeight();
}

So, if my reasoning is correct, we get dropped transactions unless we keep repeatedly spamming the RPC nodes with transactions, so that we make it into the 10k rebroadcast queue at some point. This approach is a likely cause of that limit not being sufficient.

Proposed Solution

Something that might alleviate this pressure is checking whether these transactions are identical and only counting unique transactions against this queue limit (see the sketch below).

const MAX_TRANSACTION_RETRY_POOL_SIZE: usize = 10_000; // This seems like a lot but maybe it needs to be bigger one day
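A minimal sketch of the dedup idea, assuming the retry pool is keyed by transaction signature; the types and helper are illustrative, not the actual send-transaction-service code:

use std::collections::HashMap;
use solana_sdk::signature::Signature;

const MAX_TRANSACTION_RETRY_POOL_SIZE: usize = 10_000;

// Resubmissions of an identical transaction refresh the existing entry instead
// of counting against the pool limit; only unique signatures consume slots.
fn add_to_retry_pool(
    pool: &mut HashMap<Signature, Vec<u8>>,
    signature: Signature,
    wire_transaction: Vec<u8>,
) -> bool {
    if pool.contains_key(&signature) || pool.len() < MAX_TRANSACTION_RETRY_POOL_SIZE {
        pool.insert(signature, wire_transaction);
        true
    } else {
        false // pool is full of unique transactions; drop this one
    }
}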

spl-token transfer failure due to lack of priority fee

Problem

Due to how busy the Solana network is, the included spl-token program fails to transfer tokens with the error "unable to confirm transaction. This can happen in situations such as transaction expiration and insufficient fee-payer funds" most of the time. (The fee-payer account has 25+ SOL in it, so it's not that.)

As I understand it this is due to the spl-token program not adding a priority fee.

Proposed Solution

Update the spl-token program to include a priority fee.

cli: solana validator-info get does not display information for all validators

I checked on several validators and this command did not output anything.

For example:

solana validators -ut | grep 7arMwy1BnRTC995TVdgZYU8n9V2gwxTpWmVNDdeJihdA
  7arMwy1BnRTC995TVdgZYU8n9V2gwxTpWmVNDdeJihdA  66597J13dfdXhKu76VCEiKt6aDE4YP1Seccj2JGkXHYo   10%  258716111 (  0)  258716074 ( -2)   0.00%   269417   1.18.6     39025.815318832 SOL (0.02%)

solana validator-info get -ut | grep 7arMwy1BnRTC995TVdgZYU8n9V2gwxTpWmVNDdeJihdA

On the site I see information on these validators. For example:
https://www.validators.app/validators/7arMwy1BnRTC995TVdgZYU8n9V2gwxTpWmVNDdeJihdA?locale=en&network=testnet

Name: 	Moovi
Details: 	Moovi | Solana Validator
Identity: 	7arMwy1BnRTC995TVdgZYU8n9V2gwxTpWmVNDdeJihdA
Software: 	1.18.6
Keybase: 	moovi
Website: 	https://moovi.work
Vote Account: 	66597J13dfdXhKu76VCEiKt6aDE4YP1Seccj2JGkXHYo 

I can assume that information is not displayed for those validators who published information a long time ago.

Checked on versions:
solana-cli 1.17.26 (src:4c418143; feat:3580551090, client:SolanaLabs)
solana-cli 1.18.5 (src:928d8ac2; feat:3352961542, client:SolanaLabs)

Feature Gate: simplify_alt_bn128_syscall_error_codes

SIMD

129

Description

See #294.

Feature ID

JDn5q3GBeqzvUa7z67BbmVHVdE3EbUAjvFep3weR3jxX

Activation Method

Single Core Contributor

Deployment Considerations

No response

Minimum Beta Version

No response

Minimum Stable Version

No response

Testnet Activation Epoch

657

Devnet Activation Epoch

714

Mainnet-Beta Activation Epoch

635

`SanitizedTransaction` may be too burdensome for SVM

Problem

When submitting a transaction to SVM, one must create a SanitizedTransaction, but all its fields are private. Although it has a try_new function, it depends on SanitizedVersionedTransaction, whose try_new construction, in turn, depends on VersionedTransaction.

Such a process is so onerous that we use SanitizedTransaction::from_transaction_for_tests in our tests.

All these levels of indirection ensure the safety for the Solana use-case, but they create a burden for external users who need to navigate all these intricacies just to create a SanitizedTransaction.

Solution

Under discussion.

Solana v1.17 uses the same Solana platform-tools v1.37 and Rust v1.68.0 as Solana v1.16

  • Solana platform-tools v1.37 got released on April 1, 2023. It uses Rust v1.68.0, released on March 9, 2023.
  • Solana v1.16.0 got released on May 31, 2023 and uses Solana platform-tools v1.37 with Rust v1.68.0.
  • Solana platform-tools v1.38 got released on Aug 20, 2023 and it still uses Rust v1.68.0.
  • Solana v1.17.0 got released on Oct 2, 2023 and still uses Solana platform-tools v1.37 with Rust v1.68.0.
  • Solana platform-tools v1.39 got released on Oct 21, 2023 and uses Rust v1.72.0, released on Aug 24, 2023.
  • Solana v1.18.0 got released on Jan 27, 2024 and uses Solana platform-tools v1.39 with Rust v1.72.0.

I summarized the information above in the following table:

Solana version | Solana release date | Solana program Rust version | Rust release date | Difference
v1.16.0        | May 31, 2023        | v1.68.0                     | March 9, 2023     | 2 months
v1.17.0        | Oct 2, 2023         | v1.68.0                     | March 9, 2023     | 7 months
v1.18.0        | Jan 27, 2024        | v1.72.0                     | Aug 24, 2023      | 5 months

Is the 7-month delay between the initial release of Rust v1.68.0 and Solana v1.17.0 intentional?

I am asking because it has been almost a year since the release of Rust v1.68.0, and the current Solana v1.17.24 release still uses the 1-year-old Rust v1.68.0 for building Solana programs.

I believe this hinders the Solana program developer experience, since some packages require Rust versions newer than v1.68.0.

I got sent to this repository from solana-labs#35428.

ledger-tool verify is hung up for a while waiting for AccountBackgroundService

Problem

Running agave-ledger-tool verify with a recent commit near the tip of master sees a non-trivial slowdown after tx-processing has finished. Namely, it appears to be waiting on ABS:

[2024-03-20T20:04:09.010886812Z INFO  solana_ledger::blockstore_processor] ledger processed in 2 ms, ...
...
// Exit flag set which should propagate to AHV and ABS
[2024-03-20T20:04:09.407809319Z INFO  solana_core::accounts_hash_verifier] AccountsHashVerifier has stopped
...
[2024-03-20T20:11:42.695623647Z INFO  solana_runtime::accounts_background_service] AccountsBackgroundService has stopped

Note the timestamp difference; AHV returns almost immediately, whereas ABS stops more than 7 minutes later. Here are all the logs in between:

[2024-03-20T20:04:08.865586216Z INFO  solana_metrics::metrics] datapoint: accounts_db_active hash_scan=1i
[2024-03-20T20:04:09.407809319Z INFO  solana_core::accounts_hash_verifier] AccountsHashVerifier has stopped
[2024-03-20T20:05:03.132225735Z INFO  solana_metrics::metrics] datapoint: accounts_db_active hash_scan=0i
[2024-03-20T20:05:03.156931563Z INFO  solana_metrics::metrics] datapoint: accounts_db_active hash_dedup=1i
[2024-03-20T20:05:38.739498389Z INFO  solana_metrics::metrics] datapoint: accounts_db_active hash_dedup=0i
[2024-03-20T20:05:38.752142142Z INFO  solana_metrics::metrics] datapoint: accounts_db_active hash_merkle_tree=1i
[2024-03-20T20:05:40.891854054Z INFO  solana_metrics::metrics] datapoint: accounts_db_active hash_merkle_tree=0i
[2024-03-20T20:05:44.359742882Z INFO  solana_accounts_db::accounts_db] calculate_accounts_hash_from_storages: slot: 254911099, Full(AccountsHash(EiKKH9o7NGgi7jw9MSAxidLtidhUx61ZLMWk88KeoMUT)), capitalization: 572149832212208582
[2024-03-20T20:05:45.781712115Z INFO  solana_metrics::metrics] datapoint: cache_hash_data_stats cache_file_size=68760899328i cache_file_count=175i total_entries=955012452i loaded_from_cache=174i saved_to_cache=174i entries_loaded_from_cache=477506226i save_us=46980731i write_to_mmap_us=46620189i create_save_us=312311i load_us=0i read_us=3231i unused_cache_files=175i hits=0i misses=174i
[2024-03-20T20:05:45.782159006Z INFO  solana_metrics::metrics] datapoint: calculate_accounts_hash_from_storages total_us=96919324i mark_time_us=0i cache_hash_data_us=1510i accounts_scan_us=54266994i eliminate_zeros_us=35582547i hash_us=2139694i sort_us=19543420i hash_total=443043076i storage_sort_us=29499i collect_snapshots_us=73647i num_snapshot_storage=407166i scan_chunks=174i num_slots=407166i num_dirty_slots=0i storage_size_min=311296i storage_size_quartile_1=417792i storage_size_quartile_2=462848i storage_size_quartile_3=569344i storage_size_max=343113728i storage_size_avg=573478i roots_older_than_epoch=124i oldest_root=254478968i longest_ancient_scan_us=0i sum_ancient_scans_us=0i count_ancient_scans=0i append_vec_sizes_older_than_epoch=63826288i accounts_in_roots_older_than_epoch=142832i pubkey_bin_search_us=56003607i
[2024-03-20T20:05:45.782165879Z INFO  solana_metrics::metrics] datapoint: accounts_db_active hash=0i
[2024-03-20T20:05:45.786250783Z INFO  solana_runtime::bank] Initial background accounts hash verification has stopped
[2024-03-20T20:11:41.313515479Z INFO  solana_accounts_db::accounts_db] remove_dead_slots_metadata: 122 dead slots
[2024-03-20T20:11:41.315763717Z INFO  solana_metrics::metrics] datapoint: clean_purge_slots_stats safety_checks_elapsed=1i remove_cache_elapsed=0i remove_storage_entries_elapsed=431i drop_storage_entries_elapsed=1i num_cached_slots_removed=0i num_stored_slots_removed=122i total_removed_storage_entries=122i total_removed_cached_bytes=0i total_removed_stored_bytes=62839152i scan_storages_elapsed=0i purge_accounts_index_elapsed=0i handle_reclaims_elapsed=0i
[2024-03-20T20:11:41.315805737Z INFO  solana_metrics::metrics] datapoint: accounts_index_roots_len roots_len=407044i uncleaned_roots_len=1i roots_range_width=432018i unrooted_cleaned_count=0i rooted_cleaned_count=122i clean_unref_from_storage_us=80350i clean_dead_slot_us=56i append_vecs_open=407166i
[2024-03-20T20:11:41.315811128Z INFO  solana_metrics::metrics] datapoint: clean_accounts total_us=452307833i collect_delta_keys_us=43913i oldest_dirty_slot=254911099i pubkeys_removed_from_accounts_index=2331i dirty_ancient_stores=0i dirty_store_processing_us=49750888i accounts_scan=22879414i clean_old_rooted=342002459i store_counts=10647970i purge_filter=2555782i calc_deps=7211465i reclaims=182801i delta_insert_us=4282495i delta_key_count=17541843i dirty_pubkeys_count=0i sort_us=9363089i useful_keys=17541843i total_keys_count=17541843i scan_found_not_zero=9058004i scan_not_found_on_fork=0i scan_missing=0i uncleaned_roots_len=407073i clean_old_root_us=1943283i clean_old_root_reclaim_us=339994628i reset_uncleaned_roots_us=25242i remove_dead_accounts_remove_us=339772510i remove_dead_accounts_shrink_us=223505i clean_stored_dead_slots_us=101696i roots_added=407166i roots_removed=122i active_scans=0i max_distance_to_min_scan_slot=0i ancient_account_cleans=0i next_store_id=407166i
[2024-03-20T20:11:41.379356012Z INFO  solana_metrics::metrics] datapoint: accounts_db_active clean=0i
[2024-03-20T20:11:42.595493256Z INFO  solana_metrics::metrics] datapoint: accounts_db_active shrink=1i
[2024-03-20T20:11:42.595545706Z INFO  solana_metrics::metrics] datapoint: accounts_db_active shrink=0i
[2024-03-20T20:11:42.595553822Z INFO  solana_metrics::metrics] datapoint: accounts_background_service duration_since_previous_submit_us=453587771i num_iterations=1i cumulative_runtime_us=453587771i mean_runtime_us=453587771i min_runtime_us=453587771i max_runtime_us=453587771i
[2024-03-20T20:11:42.695623647Z INFO  solana_runtime::accounts_background_service] AccountsBackgroundService has stopped

A few items jump out:

  • accounts_db_active hash_scan=0i comes in after about 60 seconds, so this step seemingly doesn't bail out on the exit flag
  • Initial background accounts hash verification has stopped comes in after about 90 seconds
  • There is a large gap, about 6 minutes, after the above item before the next log line, remove_dead_slots_metadata ...

Proposed Solution

Thought 1:
My possibly naive understanding is that ledger-tool should be able to opt out of these items; maybe we want some of these steps to occur for the case of create-snapshot. Or, maybe make them opt-in if there is debug value.

Thought 2:
A step further: I think it would be preferable to plumb the exit flag down into these routines so that if someone does set the exit flag, they are aborted in a timely manner. Doing this may remove the need for any special handling to skip work in the ledger-tool case. If we go this route, we probably need a sanity check on whether there are any instances in which we should let these operations complete for ledger-tool; again, the create-snapshot command comes to mind as a candidate.
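A minimal sketch of the exit-flag plumbing from Thought 2, with illustrative work items rather than the actual AccountsBackgroundService code:

use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;

// Check the exit flag between units of work so that a caller (such as
// ledger-tool) that sets the flag is not stuck waiting for a full pass.
fn run_background_pass(exit: &Arc<AtomicBool>, slots: &[u64]) {
    for slot in slots {
        if exit.load(Ordering::Relaxed) {
            break;
        }
        process_slot(*slot);
    }
}

fn process_slot(_slot: u64) {
    // Placeholder for the expensive per-slot work (clean, shrink, hashing, ...).
}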

Frozen ABI Macro requires `log` in dependency list

Problem

When annotating a struct with the #[derive(AbiExample)] macro and compiling a crate with cargo build-sbf, the compiler will complain that log was not found in the dependency list.

#[derive(Debug, PartialEq, Eq, Clone, AbiExample)]
|                                      ^^^^^^^^^^ could not find `log` in the list of imported crate

Proposed Solution

Modify the macro to point to the log dependency used by solana_frozen_abi.
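One common pattern for this class of proc-macro error is to re-export log from solana_frozen_abi and have the derive expansion name it through that path; whether the actual fix takes this exact shape is an assumption:

// In solana_frozen_abi's lib.rs (sketch): re-export the crate's own `log`
// dependency so generated code can refer to it without the downstream crate
// adding `log` to its Cargo.toml.
pub extern crate log;

// The derive macro would then emit fully qualified paths in its expansion,
// e.g. `::solana_frozen_abi::log::info!(...)` instead of `log::info!(...)`.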

Scheduler - Remaining Transaction Metrics

Problem

  • We do not currently collect and report information about the transactions that the leader did not include in the block.

Proposed Solution

  • Scan through and collect metrics after leader finishes 4-slot allocation and will not be leader again very soon:
  • Collect metrics to report:
    • number of (valid) transactions in buffer
    • number of (valid) non-conflicting transactions in buffer
    • total CUs (requested) left in buffer
    • histogram stats for CUs and priority
    • most contentious account by count and by CUs (requested)

Tasks

  • Legacy BankingStage
  • CentralScheduler BankingStage

solana-test-validator in windows error

Hi team,

I am new to Solana. I ran into the following trouble after finishing the setup:

thread 'main' panicked at validator\src\bin\solana-test-validator.rs:118:86:
called Result::unwrap() on an Err value: Os { code: 1314, kind: Uncategorized, message: "A required privilege is not held by the client." }
note: run with RUST_BACKTRACE=1 environment variable to display a backtrace

Thanks for support.

Separate solana-program release pipeline

Problem

solana-program development moves much faster than validator development. solana-program should not have to wait on a validator release to get a new version, and the validator release should not be incremented just because solana-program needs a release.

Proposed Solution

Separate the release pipeline between solana-program and the rest of the monorepo.

Capitalization overflow on test cluster (v1.18.2)

Problem

In setting up a new test cluster, @joeaba encountered a capitalization overflow while taking an accounts hash prior to creating a snapshot. The full backtrace / some relevant discussion here on Discord:

...
thread="solAccountsLo10" one=1i message="panicked at accounts-db/src/accounts_hash.rs:849:26:
    summing capitalization cannot overflow" location="accounts-db/src/accounts_hash.rs:849:26" version="1.18.2 (src:13656e30; feat:3352961542, client:SolanaLabs)"

The cluster was created with the following solana-genesis command:

solana-genesis --hashes-per-tick 12500 --bootstrap-validator genesis-keypair.json vote-pubkey.json stake-pubkey.json \
--ledger /home/sol/ledger --cluster-type development --faucet-pubkey faucet.json --faucet-lamports 100000000000000000 \
--bootstrap-validator-lamports 10000000000000 --bootstrap-validator-stake-lamports 250000000000000 \
--primordial-accounts-file accounts.json

The creation of the above genesis file yielded the following capitalization:

Capitalization: 600000000 SOL in 604 accounts

However, here are the lamports that we explicitly added:

18 446 744 073 709 551 615 <== U64::MAX
   100 000 000 000 000 000 <== --faucet-lamports
       250 000 000 000 000 <== --bootstrap-validator-stake-lamports
        10 000 000 000 000 <== --bootstrap-validator-lamports

Note that 500M of the 600M in capitalization stems from an item discovered in solana-labs#35266. Technically, we don't need/want those accounts in a development cluster, but even with 600M capitalization, I don't think we should have overflowed.

Proposed Solution

Recreate the problem and figure out why the overflow happened. Technically, minting too many lamports could cause an overflow. But, even with those extra accounts that got us up to 600M capitalization, we should still have been operating at about 1/30th of u64::MAX:

600M SOL * (1 billion lamports / 1 SOL) / u64::MAX ~= 1 / 30

Joe mentioned that the bootstrap validator was reproducing this. Steps to reproduce:

  • Use solana-keygen to create a bunch of keys to pass to above solana-genesis invocation
  • Create the following accounts.json file and pass it via --primordial-accounts-file:
{
    "tvc2ZFRKhdDgGg3aDsmKGtUWoQdn8gDnhH48KmpEdoj": {
        "balance": 25000000000000000,
        "owner": "11111111111111111111111111111111",
        "data": "",
        "executable": false
    },
    "tvc3NTVL8WdHuUcXEoNcCheQh8S52wjcFEcCXN6AdBj": {
        "balance": 25000000000000000,
        "owner": "11111111111111111111111111111111",
        "data": "",
        "executable": false
    },
    "tvc4VQteaHCQA8hPEyZjfz31JNPR8ny8baNyQYSkt4C": {
        "balance": 25000000000000000,
        "owner": "11111111111111111111111111111111",
        "data": "",
        "executable": false
    }
}
  • This issue showed on v1.18.2, but if trying to reproduce with master, back out the commit that was introduced with solana-labs#35266
  • Here were the bootstrap validator args; nothing special here but pasted for completeness:
Starting validator with: ArgsOs {
        inner: [
            "solana-validator",
            "--dynamic-port-range",
            "8000-8020",
            "--gossip-port",
            "8001",
            "--identity",
            "/home/sol/identity/genesis-keypair.json",
            "--ledger",
            "/home/sol/ledger",
            "--limit-ledger-size",
            "--log",
            "/home/sol/logs/solana-validator.log",
            "--rpc-port",
            "8899",
            "--expected-genesis-hash",
            "BEQ6Kc7BLfSUTh9nfMBW7fn5L8jfA22nCXuBna4NY3ue",
            "--wal-recovery-mode",
            "skip_any_corrupted_record",
            "--full-rpc-api",
            "--no-wait-for-vote-to-start-leader",
            "--enable-rpc-transaction-history",
            "--snapshot-interval-slots",
            "5000",
            "--vote-account",
            "/home/sol/identity/vote-pubkey.json",
            "--rpc-faucet-address",
            "127.0.0.1:9900",
            "--trusted-validator",
            "tvc2ZFRKhdDgGg3aDsmKGtUWoQdn8gDnhH48KmpEdoj",
            "--halt-on-trusted-validators-accounts-hash-mismatch",
            "--no-untrusted-rpc",
            "--expected-shred-version",
            "3541",
            "--entrypoint",
            "104.154.224.20:8001",
            "--no-genesis-fetch",
            "--no-snapshot-fetch",
            "--accounts",
            "/mnt/accounts",
            "--enable-cpi-and-log-storage",
        ],
    }

Creating account data to match the loader header is a manual process

Problem

The BPF loader V3 and V4 require a specific header on the account data field in order for the program to be loaded and executed correctly. Creating the header manually is an error-prone process.

Proposed Solution

Create functions that construct the account data field with the correct header information.
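A minimal sketch of such a helper for the upgradeable loader's programdata account, assuming the bincode layout exposed by UpgradeableLoaderState; error handling is simplified and this is not the final API:

use solana_sdk::bpf_loader_upgradeable::UpgradeableLoaderState;
use solana_sdk::pubkey::Pubkey;

// Build the data field of a programdata account: the serialized metadata
// header, padded to its reserved size, followed by the raw ELF bytes.
fn programdata_account_data(
    slot: u64,
    upgrade_authority_address: Option<Pubkey>,
    elf: &[u8],
) -> Vec<u8> {
    let metadata_len = UpgradeableLoaderState::size_of_programdata_metadata();
    let mut data = vec![0u8; metadata_len + elf.len()];
    bincode::serialize_into(
        &mut data[..metadata_len],
        &UpgradeableLoaderState::ProgramData {
            slot,
            upgrade_authority_address,
        },
    )
    .expect("metadata fits in the reserved header space");
    data[metadata_len..].copy_from_slice(elf);
    data
}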

`solana-test-validator` does not work properly on Windows

Problem

Currently the Solana CLI installs properly on Windows, but solana-test-validator runs into a few errors. Generally the recommendation has been for users to install WSL to run solana-test-validator; however, this adds friction to the developer experience and leads to a lot of developers dropping off.

Current error flow on Windows:
Running solana-test-validator runs into the following:

Error: Failed to start validator. Failed to create ledger at test-ledger: blockstore error

Inspecting the logs:

tar stderr: tar: Can't launch external program: bzip2

After figuring out bzip2 installation on Windows, solana-test-validator hangs during the initialization process.

Logs:

[2023-04-03T16:28:38.837257000Z INFO  solana_test_validator] Starting validator with: ArgsOs {
        inner: [
            "solana-test-validator",
        ],
    }
[2023-04-03T16:28:38.837284100Z WARN  solana_perf] CUDA is disabled
[2023-04-03T16:28:38.837302300Z INFO  solana_perf] AVX detected
[2023-04-03T16:28:38.837312500Z INFO  solana_perf] AVX2 detected
[2023-04-03T16:28:38.840948300Z INFO  solana_faucet::faucet] Faucet started. Listening on: 0.0.0.0:9900
[2023-04-03T16:28:38.841038000Z INFO  solana_faucet::faucet] Faucet account address: 45U6mutB7zMjjX8NAXfrUYwNpTdYJm51aWnmjnPsBRC3
[2023-04-03T16:28:38.859996300Z INFO  solana_ledger::blockstore] Opening database at "test-ledger\\rocksdb"
[2023-04-03T16:28:39.066944300Z INFO  solana_ledger::blockstore] "test-ledger\\rocksdb" open took 206ms
[2023-04-03T16:28:39.069010700Z INFO  solana_metrics::metrics] metrics disabled: SOLANA_METRICS_CONFIG: environment variable not found
[2023-04-03T16:28:39.069693100Z INFO  solana_metrics::metrics] datapoint: shred_insert_is_full total_time_ms=0i slot=0i last_index=3i num_repaired=0i num_recovered=0i

Another issue related to running solana-test-validator on Windows: solana-labs#34793

Proposed Solution

Make solana-test-validator work on Windows without additional installation of a bash shell or WSL.

Cost Tracker - stats reporting bug

Problem

  • cost-tracker stats are reported when the bank is frozen
  • there may still be in-flight transactions whose costs have not yet been removed/updated from the tracker

Proposed Solution

  • Find a way to delay reporting until in-flight costs have been updated or removed

Solana wallet connection issue when initialize the transactions via Frontend

Hi Team,
I am sharing the set of code and the console error log; please take a look, it would be very helpful. I have worked through all of the scenarios and modules, and everything works perfectly on Solana localnet with the frontend. When we change the network to devnet, the following errors occur when we try to connect the Phantom wallet with these functions.

My question: is there any issue on devnet with the Phantom wallet provider? Solana localnet is working properly.
Please let us know how to proceed further.

(Screenshots: devnet attempt, localnet attempt, error console.)

Feature Request - Add Verifiable Builds to the Solana CLI

Problem

Being able to create verifiable builds and prove that the smart contract code you deployed is the same code displayed on GitHub or an explorer is incredibly important. Today, developers deploying programs have to install a separate CLI to create a verifiable build and upload it.

Proposed Solution

Integrate Verifiable Builds into the Solana CLI as part of solana program deploy. Verifiable build should add information to security.txt so that people can manually verify and not have to rely on uploading to an indexer.

Audit each individual thread pool size

Problem

The solana-validator process creates a lot of threads, including a number of threadpools. A staked node running v1.17 was observed to have more than 800 threads. The exact number may vary a little depending on some configuration as well as the number of physical cores the machine has (several pools scale with the number of cores). This is a lot of threads, and all of these threads add non-trivial overhead to the system.

Proposed Solution

Examine each thread pool more closely, and reduce thread pool sizes where appropriate. Even if a particular workflow actually utilizes a very high level of parallelism, it may be the case that we should still put a hard upper bound in place instead of letting the thread pool size scale with the number of cores.

Qos Metrics - reporting interval

Problem

  • qos metrics are reported when the slot changes, but only if transactions are executed
  • this delays the reporting of the last slot in the 4-slot allocation until the next time the node becomes leader

Proposed Solution

  • move reporting to main banking loop(s):
    • Fix qos_metrics reporting: thread-local-multi-iterator
    • Fix qos_metrics reporting: central-scheduler

Banking Stage Worker - Metrics

Problem

  • Useful to know how much time is spent waiting/getting banks

Proposed Solution

  • Add metric to track time spent waiting for new banks

Note to self for the implementation: we probably need to be careful on the last bank of the 4-slot allocation, since we will wait for a new bank (for 50 ms, iirc) that never comes; we probably want separate timings for successful and failed bank gets.

Add priority fees and retry count to `solana program deploy`

Problem

With higher usage on the cluster, transactions need a priority fee to be set in order to give them a greater chance of landing. This makes deploying programs, which requires a larger number of transactions than normal usage, often fail before the program is finished deploying.

Proposed Solution

Add two new features:

  • Ability to set priority fee when using solana program deploy
  • Ability to set custom retry count

Docs: Many crates are poorly documented and very limited examples

Problem

A lot of current Solana crates are missing documentation, and have very limited examples.

This provides a barrier to becoming acquainted with the codebase and can lead to uncertainty when importing items from these crates as dependencies.

Since it appears that Agave is replacing the Solana repo, raising this here so that it can be tracked.

Note - not related to the Docusaurus docs in the docs folder, but rather the actual rust docs.

Proposed Solution

Ideally all crates should have doc comments for all public structs, methods, enums etc - leading to 100% doc coverage.

Given the large codebase, this can be broken down into several PRs, each targeting a specific crate.

Once a crate's public items have been fully documented, the rule #![deny(missing_docs)] could be applied to ensure that future additions will always include documentation.
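For example, a crate that has reached full coverage could enforce it at the crate root so an undocumented public item fails the build; the item below is purely illustrative:

#![deny(missing_docs)]
//! Example crate-level documentation; the lint requires this too.

/// Adds two numbers; removing this doc comment would now be a compile error.
pub fn add(a: u64, b: u64) -> u64 {
    a + b
}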

Current doc coverage

Here's a few examples from the most downloaded solana crates per crates.io.

solana_sdk_macro - 18% documented
solana_frozen_abi - 0% documented
solana_program - 52% documented
solana_logger - 20% documented
solana_sdk - 64% documented
solana_metrics - 26% documented
solana_config_program - 39% documented
solana_vote_program - 27% documented
solana-transaction-status - 5% documented
solana_account_decoder - 2% documented
solana_clap_utils - 12% documented
solana_remote_wallet - 23% documented
solana_client - 30% documented

A few crates do already have 100% doc coverage, and we should strive to make sure this is the case for all Agave crates.

Solana RPC node disk read very high

solana-validator version: 1.17.24
server monitoring data

                   R           W
nvme3n1            1.50G   391M

nvme3n1             2.60G      101M

It's interesting that disk reads are getting higher and higher!

2024-03-12 11:49:10 (ongoing) - CPU_IOWAIT (Min:1.5 Mean:4.0 Max:7.4): solana-validator, kswapd3, kswapd0
                                       2024-03-12 11:48:31 (0:00:36) - CRITICAL on CPU_IOWAIT (Min:1.7 Mean:1.8 Max:2.3): solana-validator, kswapd0, kswapd3
                                       2024-03-12 11:42:13 (0:06:15) - CRITICAL on CPU_IOWAIT (Min:1.5 Mean:1.9 Max:2.7): solana-validator, kswapd0, kswapd3
                                       2024-03-12 11:39:45 (0:02:25) - CRITICAL on CPU_IOWAIT (Min:1.6 Mean:1.9 Max:2.8): solana-validator, kswapd1, kswapd3
                                       2024-03-12 11:39:15 (0:00:27) - CRITICAL on CPU_IOWAIT (1.5): solana-validator, kswapd3, kswapd2
                                       2024-03-12 11:38:33 (0:00:39) - CRITICAL on CPU_IOWAIT (Min:1.5 Mean:1.8 Max:2.2): solana-validator, kswapd3, jbd2/nvme3n1-8
                                       2024-03-12 11:37:14 (0:01:16) - CRITICAL on CPU_IOWAIT (Min:1.5 Mean:1.7 Max:2.0): solana-validator, kswapd3, glances
                                       2024-03-12 11:36:50 (0:00:09) - CRITICAL on CPU_IOWAIT (1.7): solana-validator, jbd2/nvme3n1-8, kswapd3
                                       2024-03-12 11:36:05 (0:00:42) - CRITICAL on CPU_IOWAIT (Min:1.5 Mean:1.8 Max:3.4): solana-validator, kswapd2, kswapd0
2024-03-12 13:02:48 UTC                2024-03-12 11:35:22 (0:00:09) - CRITICAL on CPU_IOWAIT (1.6): solana-validator, jbd2/nvme3n1-8, kswapd2

ExecStart:

ExecStart=/home/ubuntu/solana-release/bin/solana-validator
--identity /home/ubuntu/key/validator-keypair.json
--known-validator 7Np41oeYqPefeNQEHSv1UDhYrehxin3NStELsSKCT4K2
--known-validator GdnSyH3YtwcxFvQrVVJMm1JhTS4QVX7MFsX56uJLUfiZ
--known-validator DE1bawNcRJB9rVm3buyMVfr8mBEoyyu73NBovf2oXJsJ
--known-validator CakcnaRDHka2gXyfbEd2d3xsvkJkqsLw2akB3zsN1D2S
--ledger /opt/sol/validator-ledger
--accounts /mnt/solana-accounts
--rpc-bind-address 0.0.0.0
--no-voting
--rpc-port 8899
--log /opt/sol/logs/sol.log
--entrypoint entrypoint.mainnet-beta.solana.com:8001
--entrypoint entrypoint2.mainnet-beta.solana.com:8001
--entrypoint entrypoint3.mainnet-beta.solana.com:8001
--entrypoint entrypoint4.mainnet-beta.solana.com:8001
--entrypoint entrypoint5.mainnet-beta.solana.com:8001
--dynamic-port-range 8000-9000
--enable-rpc-transaction-history
--expected-genesis-hash 5eykt4UsFv8P8NJdTREpY1vzqKqZKvdpKuc147dw2N9d
--no-port-check
--full-rpc-api
--wal-recovery-mode skip_any_corrupted_record
--account-index spl-token-owner
--account-index program-id
--rpc-pubsub-enable-block-subscription
--geyser-plugin-config /home/ubuntu/solana-release/yellowstone-grpc-geyser/config.json
--limit-ledger-size

This is the monitoring data without RPC requests. Is there any way to reduce the disk reads?

bump cc version manually

Problem

Since cc 1.0.85, it uses xcrun --show-sdk-platform-version (https://github.com/rust-lang/cc-rs/blob/df6262531e9e00a1e3cb6e4f69769eaf8dc06e58/src/lib.rs#L3617-L3623) to check things, but this command doesn't work in our environment at the moment (we don't have Xcode on our build machine).

ci error: https://buildkite.com/anza/agave-secondary/builds/24#018e2c79-4a8f-4f69-97fc-a38a478c427b
related: https://www.github.com/rust-lang/cc-rs/issues/1001

We will track this one and upgrade cc manually later.

alt_bn128 use stack instead of heap

Problem

I'm currently trying to use alt_bn128 to verify a Groth16 ZKP. I'm using this repository, which in turn uses the alt_bn128 module from the solana-program crate.

The problem I face when I run my program is:

Error: memory allocation failed, out of memory

Let me share some details about my circuit. It has 246 outputs, i.e. public inputs. I've checked the code and noticed the usage of the following functions:

These functions return a vector of data on the heap, which means they allocate data on the heap, not the stack. Looking into the non-Solana version of the functions, I can see the following lines:

let mut input = input.to_vec();
input.resize(ALT_BN128_MULTIPLICATION_INPUT_LEN, 0);

//  pub const ALT_BN128_MULTIPLICATION_INPUT_LEN: usize = 128;

For circuits like mine, which has 246 inputs, this results in allocating a 128-byte vector 246 times, which quickly hits the Solana heap size limit.

Proposed Solution

I wonder if we could use the stack instead of creating vectors on the heap. The issue with the heap is that there is no way to free previously allocated memory. With the stack (correct me if I'm wrong) we still have a limit, but the stack can at least grow and shrink, so as long as a function call stays within the stack limit we should, in theory, be OK.

Most practical circuits will have public inputs of this size, if not bigger. As it stands, it's hard to use alt_bn128 for any practical circuit.
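A minimal sketch of the stack-based alternative for the multiplication input, using the constant quoted above; the helper name is illustrative, not the syscall wrapper's actual code:

const ALT_BN128_MULTIPLICATION_INPUT_LEN: usize = 128;

// Copy the caller's input into a fixed-size stack buffer instead of resizing a
// heap-allocated Vec, avoiding one heap allocation per public input.
fn prepare_multiplication_input(input: &[u8]) -> [u8; ALT_BN128_MULTIPLICATION_INPUT_LEN] {
    let mut buf = [0u8; ALT_BN128_MULTIPLICATION_INPUT_LEN];
    let len = input.len().min(ALT_BN128_MULTIPLICATION_INPUT_LEN);
    buf[..len].copy_from_slice(&input[..len]);
    buf
}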

Ensure all threads have unique names

Problem

The solana-validator process currently has several threads that share a name. Most of these cases stem from code re-use; for example, the receiver() function in streamer is used by multiple network services (Gossip, Repair, TPU Votes, ...). Use of the same name makes using debug tools that allow per-thread examination (gdb, perf, ...) more difficult.

Solution

Ensure each individual thread has a unique name. For thread pools, it is not essential that threads within a pool have unique names, as all of the threads within the pool will be performing the same kind of task, and the performance of a thread in the pool is coupled to the pool.
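A minimal sketch of naming individual threads and thread-pool workers; the service labels are illustrative, not the repository's actual naming scheme:

use std::thread;

// Named single thread: per-service labels keep perf/gdb output readable.
fn spawn_named_receiver(label: &'static str) -> thread::JoinHandle<()> {
    thread::Builder::new()
        .name(format!("solRcvr{label}"))
        .spawn(move || {
            // receiver loop would run here
        })
        .expect("failed to spawn thread")
}

// Named pool: workers share a prefix since they all do the same kind of work.
fn build_named_pool(label: String, num_threads: usize) -> rayon::ThreadPool {
    rayon::ThreadPoolBuilder::new()
        .num_threads(num_threads)
        .thread_name(move |i| format!("sol{label}{i:02}"))
        .build()
        .expect("failed to build thread pool")
}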

Pull Requests

Here is a list of pull requests that accomplish the stated goal of this issue:

solana-keygen grind

Is there a way to grind a custom-prefix address on a server using some web3 client?
Use case: users can create a custom-prefix address by hitting an endpoint on the server and providing some prefix text.
