
Comments (11)

jonathanknowles avatar jonathanknowles commented on July 30, 2024 7

Dear @nicolasayotte

Many thanks for creating this issue!

As the author of CIP-2, and as a co-author of the multi-asset UTxO selection algorithm currently used in cardano-wallet, I thought it would be helpful to add some context to this discussion.

Regarding the creation of new CIPs

As you may be aware, CIP-2 was originally intended to be an informational CIP, whose purpose was to describe the algorithm that cardano-wallet used in the pre-multi-asset era, so that others who wished to implement similar algorithms would have a reference to work from.

Though as you rightly point out, CIP-2 is indeed obsolete in the context of a multi-asset UTxO blockchain.

Therefore, I strongly agree that we need CIPs that describe effective selection algorithms for multi-asset UTxO blockchains.

However, my personal inclination would be to create a completely new CIP specifically for this purpose. I believe that CIP-2 has some value as a historical record of how we performed selection in the single-asset era. It might even be useful as a reference for those wishing to implement UTxO selection on UTxO-based blockchains other than Cardano.

I think it's also true that there are many potential solutions to the problem of UTxO selection for multi-asset blockchains. If we were to modify CIP-2 instead of creating a new CIP, this might create the impression that CIP-2 is the "preferred" way to perform multi-asset selection for Cardano, when in reality there may be multiple competing solutions with different advantages and tradeoffs.

So my personal preference would be to leave CIP-2 as a historical CIP, and create new CIPs as necessary. As the author of CIP-2, I would be very happy for people to reuse any part of the text of CIP-2 in a new CIP. (The original CIP was licensed under https://creativecommons.org/licenses/by/4.0/legalcode).

With that said, I'd also like to address some of the issues mentioned:

Regarding the problems of over-selection and clustering

The random selection algorithm simply will pick and cluster all your NFTs together over time resulting in a "change" UTxO that becomes extremely large until simply moving it around breaks the transaction size limit. This will keep happening over and over until your wallet is simply stuck.

I propose the following changes:
Pick the UTxO with the native tokens that you NEED in your output, if any, first.

The multi-asset UTxO selection algorithm used in cardano-wallet actually already does select UTxOs according to a priority ordering, though I think it's almost certainly the case that we've not communicated this very well.

In fact, since the beginning of the multi-asset era, we've gone to great lengths to minimize the problem of over-selection (selecting a greater quantity of a given asset than is "optimal").

We currently use two strategies in an attempt to minimize this problem:

  • Priority ordering
  • Round-robin processing

Priority ordering

When selecting for a particular asset a, while we do select randomly, we select from the following UTxO subsets in priority order:

  • Set 1 ("singletons"):
    The subset of UTxOs that contain asset a and no other asset.
  • Set 2 ("pairs"):
    The subset of UTxOs that contain asset a and one other asset.
  • Set 3 ("everything"):
    The subset of UTxOs that contain asset a and any number of additional assets.

In other words, the algorithm:

  • only ever attempts to select from set 2 if set 1 is empty or exhausted.
  • only ever attempts to select from set 3 if set 2 is empty or exhausted.

So for any given asset a, the UTxO selection algorithm should only select from the final set (everything) if both the singletons and pairs sets are empty or exhausted.

For reference, see: https://github.com/input-output-hk/cardano-wallet/blob/a854531d4e86d5053c3442721ecbecf22d57b023/lib/core/src/Cardano/Wallet/CoinSelection/Internal/Balance.hs#L1228.
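The priority rule above can be sketched as follows (an illustrative Python model, not cardano-wallet's actual API: a UTxO is modelled as a dict of asset names to quantities, and ada is ignored when counting "other" assets since the ledger requires every UTxO to carry some ada):

```python
import random

def select_one(utxos, asset, rng=random):
    """Randomly select one UTxO containing `asset`, trying singletons first,
    then pairs, then everything (only when earlier sets are empty or exhausted)."""
    # Assets in a UTxO other than `asset` itself; ada is not counted, since
    # every UTxO must contain some ada under the ledger rules.
    def others(u):
        return [k for k in u if k not in ('ada', asset)]

    subsets = (
        [u for u in utxos if asset in u and len(others(u)) == 0],  # singletons
        [u for u in utxos if asset in u and len(others(u)) == 1],  # pairs
        [u for u in utxos if asset in u],                          # everything
    )
    for subset in subsets:
        if subset:
            choice = rng.choice(subset)
            utxos.remove(choice)  # a UTxO can be selected only once
            return choice
    return None
```

For example, given a singleton for asset a, a multi-asset UTxO, and an ada-only UTxO, the first selection for a always returns the singleton.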

Round-robin processing

Another strategy we use to minimize the problem of over-selection is to use round-robin processing.

With round-robin processing:

  • We perform selection in multiple rounds.
  • Before the first round, we establish A: the set of all assets present in user-specified outputs.
  • In each round:
    • We consider each asset in set A only once.
    • For each asset a in set A:
      • we select just a single UTxO containing asset a (according to the above priority rule), and then move on to the next asset.
      • we eliminate a from A if the total selected amount of a is already optimal, or cannot be improved further.
    • We always consider the ada asset last because under the present ledger rules, all UTxOs must contain some ada, so selecting any UTxO will increase the total selected amount of ada.
  • At the end of each round:
    • If set A is empty, we terminate.
    • If set A is non-empty, then we start another round.

See here for reference: https://github.com/input-output-hk/cardano-wallet/blob/a854531d4e86d5053c3442721ecbecf22d57b023/lib/core/src/Cardano/Wallet/CoinSelection/Internal/Balance.hs#L1168
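Round-robin processing can be sketched like this (a hedged sketch: the helper names are mine, the per-pick priority ordering is elided for brevity, and the real termination test uses the strategy's target quantity rather than a bare minimum):

```python
import random

def total_of(selected, asset):
    return sum(u.get(asset, 0) for u in selected)

def select_any(utxos, asset, rng=random):
    """Pick one random UTxO containing `asset` (priority ordering elided)."""
    candidates = [u for u in utxos if asset in u]
    if not candidates:
        return None
    choice = rng.choice(candidates)
    utxos.remove(choice)
    return choice

def round_robin_select(utxos, required):
    """`required` maps each asset in the outputs to the quantity needed.
    Each round considers every still-unsatisfied asset exactly once."""
    selected = []
    # Consider ada last: every UTxO carries ada, so any selection adds ada.
    assets = sorted(required, key=lambda a: a == 'ada')
    remaining = set(assets)
    while remaining:
        for asset in assets:
            if asset not in remaining:
                continue
            utxo = select_any(utxos, asset)
            if utxo is not None:
                selected.append(utxo)
            # Eliminate the asset once satisfied, or once it cannot improve.
            if utxo is None or total_of(selected, asset) >= required[asset]:
                remaining.discard(asset)
    return selected
```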

Regarding the use of fallback strategies

Use the current random UTxO selection amongst UTxOs with only Ada and try to meet the output requirements for Ada.
If you run out of simple only Ada UTxOs, use Largest first amongst the UTxOs that contain NFTs or fungible tokens to complete the tx.

When selecting for ada, the multi-asset UTxO selection algorithm used in cardano-wallet does attempt to first select from UTxOs containing only ada, according to the priority order listed above (this applies to all assets including ada).

Regarding Largest-First, I think we have to be slightly careful, as:

  1. Random-Improve has been demonstrated to have several beneficial properties relating to the distribution of fungible assets in a UTxO set (as highlighted in Edsko's research article here).
  2. Largest-First has been demonstrated to have various negative effects on the distribution of fungible assets in a UTxO set, especially when applied repeatedly over time. (For example, the tendency for the number of very small quantities to increase over time.)
  3. I think it's desirable for any selection algorithm to preserve these beneficial properties in a general way, for both ada and other fungible assets.

I do agree with you though, that Largest-First could be considered as a fall-back strategy in situations where Random-Improve has failed.

However, for the algorithm used in cardano-wallet, we've recently implemented a fall-back "minimal" strategy in addition to the original "optimal" strategy.

These two strategies differ by how much of each asset the algorithm attempts to select:

  • Optimal
    Under the optimal strategy, the UTxO selection algorithm attempts to select around twice the minimum possible amount of each asset from the available UTxO set, making it possible (in principle) to generate change outputs that are roughly the same sizes and shapes as the user-specified outputs.

  • Minimal
    Under the minimal strategy, the UTxO selection algorithm attempts to select just enough of each asset from the available UTxO set to meet the minimum amount. When the minimum amount of each asset is satisfied, UTxO selection terminates.

The minimal strategy will tend to generate fewer and smaller change outputs than the optimal strategy, and therefore smaller transactions. It's therefore more likely to succeed in creating a transaction that is within the transaction size limit. However, we suspect the minimal strategy, if used repeatedly over time, is likely to cause an increase in the number of very small quantities within the UTxO set. Therefore, we currently use this strategy only as a fallback.

See here for reference: https://github.com/input-output-hk/cardano-wallet/blob/a854531d4e86d5053c3442721ecbecf22d57b023/lib/core/src/Cardano/Wallet/CoinSelection/Internal/Balance.hs#L290
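The difference between the two strategies boils down to the quantity targeted per asset; a minimal sketch (assuming the "roughly twice the minimum" reading above — the real target and termination logic are richer):

```python
def target_quantity(minimum, strategy):
    """Quantity of an asset the selection tries to accumulate."""
    if strategy == 'optimal':
        # Aim for about twice the minimum, so change outputs can mirror
        # the sizes and shapes of the user-specified outputs.
        return 2 * minimum
    if strategy == 'minimal':
        # Stop as soon as the minimum is met.
        return minimum
    raise ValueError(f'unknown strategy: {strategy}')

def satisfied(selected_total, minimum, strategy):
    return selected_total >= target_quantity(minimum, strategy)
```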

Limiting the sizes of created change outputs

Also, it is recommended to build output transactions with the two following parameters (user settings in a wallet or algorithm):

  • Token Fragmentation (N) -> Split any transaction output that has more than N NFTs into multiple outputs of at most N NFTs.
  • Make Me Some Change (A) -> Split any transaction output that has more than A Ada into multiple outputs of at most A Ada.

The multi-asset UTxO selection algorithm used in cardano-wallet actually already has support for this, though it is not currently configurable in the API.

For example, we already have support for limiting the sizes of outputs (when serialized) and partitioning them into smaller outputs if necessary. This support was introduced specifically to avoid creating change outputs that exceed the maximum size limit imposed by the protocol.

However, this support is not currently configurable in any way for users of the Cardano Wallet API. We could (in theory) make this configurable. For example, by allowing callers of the algorithm to limit the number of different assets in change outputs to a particular value, or to limit the quantity of a particular asset in any single output.

See here for reference: https://github.com/input-output-hk/cardano-wallet/blob/a854531d4e86d5053c3442721ecbecf22d57b023/lib/core/src/Cardano/Wallet/CoinSelection/Internal/Balance.hs#L2112
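The partitioning idea can be illustrated with a toy function (hypothetical: it caps the number of non-ada assets per output and splits the ada between the parts, whereas the real implementation limits the serialized size of outputs):

```python
def partition_output(output, max_assets):
    """Split a token bundle into bundles of at most `max_assets` non-ada
    assets each, dividing the ada evenly between the parts."""
    output = dict(output)          # don't mutate the caller's bundle
    ada = output.pop('ada', 0)
    assets = sorted(output.items())
    chunks = [dict(assets[i:i + max_assets])
              for i in range(0, len(assets), max_assets)] or [{}]
    ada_share, extra = divmod(ada, len(chunks))
    for i, chunk in enumerate(chunks):
        chunk['ada'] = ada_share + (extra if i == 0 else 0)
    return chunks
```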

Libraries

I would like to share one last thing:

Within the Adrestia team, we currently have plans to take the multi-asset UTxO selection algorithm currently used in cardano-wallet and make it available as part of a separate library for building and balancing transactions, such that the resultant transactions, when signed, are guaranteed to be accepted by the ledger.

We intend to release this library in a Haskell version at first, and then as a JavaScript library later on. The JavaScript version would be derived from the Haskell codebase, and share the same test suite.

I personally also hope that after the release of this library, we could make a new informational CIP that documents our current UTxO selection algorithm in full.

If you've reached the end of this long comment, many thanks for reading through! I would be very happy to clarify any of the points raised within.

from cips.

siegfried avatar siegfried commented on July 30, 2024 3

I've published a package for UTxO selection. https://www.npmjs.com/package/cardano-utxo-wasm


nicolasayotte avatar nicolasayotte commented on July 30, 2024 1

@rphair Absolutely! Will do that this weekend without fault 👍


nicolasayotte avatar nicolasayotte commented on July 30, 2024 1

Well now, that will take me some time to address that reply. 🤣 But thanks a lot @jonathanknowles.


nicolasayotte avatar nicolasayotte commented on July 30, 2024 1

Regarding the creation of new CIPs

People in the ecosystem have been referring to this CIP as gospel truth and implementing its algorithm as the only way to do coin selection in popular libraries. So if I were to improve CIP-2, I would definitely leave all the current content in, as it is still a good coin selection algorithm amongst UTxOs containing only ada. Then I would add a section saying: if you are selecting UTxOs with multiple assets, here is a good strategy.

Regarding the problems of over-selection and clustering

That priority ordering is not correct and will result in clustering. I will propose my current flow and then back it up with an example. When selecting for a particular asset a, while we do select randomly, we select from the following UTxO subsets in priority order:

Set 1 ("singletons"): The subset of UTxOs that contain asset a and no other asset.

Set 2 ("pairs"): The subset of UTxOs that contain asset a and one other asset.

Set 3 ("multiples"): The subset of UTxOs that contain asset a and any number of additional assets.

Set 4 ("ada only"): The subset of UTxOs that contain only ada.

Set 5 ("everything"): All UTxOs not picked yet.

You start at set 1, then 2, then 3, but you need to jump out of sets 1, 2, and 3 the moment you have enough of asset a, and go to Set 4 to complete the amount of ada necessary for the transaction (if you can), to avoid involuntary clustering of assets. Only if Set 4 is unable to cover the ada requirement do you start drawing from Set 5.

Example: I have:

  • "singletons": 1 UTxO with 2 ada and 100 asset a
  • "pairs": 19 UTxOs, each with 2 ada, 100 asset a, and 2000 asset b
  • "other": 1 UTxO with 2000 ada

I want to send 40 ada and 100 asset a. The selection algorithm as you currently describe it will give me 20 UTxOs totalling:
40 ada + 2000 asset a + 38000 asset b

The coin selection algorithm I described would instead select 2 UTxOs totalling:
2002 ada + 100 asset a

Now imagine you have very many NFTs (like me, say 1000+): your algorithm, by NOT skipping to pure-ada UTxOs once it is done accumulating the asset a total, just grabs my whole wallet at random trying to put together a little ada, while pure-ada UTxOs that would cause no additional clustering are just waiting to be used.
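The five-set flow above can be sketched end to end (illustrative Python under the same toy model of UTxOs as dicts of asset quantities; the helper names are mine):

```python
import random

def select(utxos, want_ada, want_asset, asset, rng=random):
    utxos, picked = list(utxos), []

    def total(name):
        return sum(u.get(name, 0) for u in picked)

    def take_from(pool, name, goal):
        # Keep picking at random from `pool` until `goal` of `name` is met.
        while total(name) < goal and pool:
            u = rng.choice(pool)
            pool.remove(u)
            utxos.remove(u)
            picked.append(u)

    def others(u):  # non-ada assets besides `asset` itself
        return [k for k in u if k not in ('ada', asset)]

    # Sets 1-3: singletons, then pairs, then multiples -- and jump out the
    # moment enough of `asset` has been accumulated.
    for cap in (0, 1, None):
        pool = [u for u in utxos if asset in u
                and (cap is None or len(others(u)) <= cap)]
        take_from(pool, asset, want_asset)

    # Set 4: ada-only UTxOs, to cover the ada requirement without clustering.
    take_from([u for u in utxos if set(u) == {'ada'}], 'ada', want_ada)

    # Set 5: everything else, only if ada is still short.
    take_from(list(utxos), 'ada', want_ada)
    return picked
```

On the wallet from the example (1 singleton, 19 pairs, 1 ada-only UTxO of 2000 ada), this selects exactly 2 UTxOs: the singleton and the ada-only one.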

Regarding the use of fallback strategies

The Random-Improve article only applies to pure-ada transactions. The article's assumptions do not hold at all with native assets that require a carry amount of ada. That is not to say the benefits of the Random-Improve strategy are not interesting; I read the article and agree with its conclusions. It is simply not directly applicable in a wallet with multiple assets, and it requires the use of proper subsets to maintain its quality. If you were to run the article's simulation with input transactions sometimes carrying native assets and an ada carry, you would see catastrophic aggregation crop up quickly. We have seen it happen to every user of Daedalus, Yoroi, or any wallet other than Eternl (which implemented Token Fragmentation): wallets become unusable because their UTxOs take more than 16KB just to write the change output (the --tx-out command line is close to 16KB).

Limiting the sizes of created change outputs

Well, if that is already built in, it would be fantastic to make it configurable in the API, so that transactions created through cardano-wallet have a predisposition towards orderly UTxO outputs and maximize the potential for independent transactions (transactions using completely disjoint UTxOs).

Libraries

That is very cool and interesting. Although the current cardano-wallet algorithms might address my immediate issue, having everyone use a high-quality Haskell library with a very accessible JavaScript counterpart is probably the best way to standardize transaction creation on Cardano.

As a final point

All of us vending machine providers have been building our machines using cardano-cli directly: reading the UTxOs through cardano-db-sync, picking them with our own algorithms to sidestep these issues, and building and balancing transactions through our own code. When we say this is an issue (a lot of the people liking this are top-tier vending machine devs), it is because we have been working with transaction building for a year, have seen wallets fail catastrophically for 11 months, and have been working to improve UTxO selection and output balancing throughout.

Adam Dean and I came up with the Token Fragmentation feature that was implemented in the ccvault.io / ccwallet.io / eternl.io wallet, which fixed the NFT enthusiasts' wallets that had been clustered by every other wallet software out there, including Daedalus.

Of course I read all your comments, sir. And thank you for putting in the time,

Nick AKA fencemaker


mathieu2em avatar mathieu2em commented on July 30, 2024

I agree with those changes. This is a problem that needs to be taken care of as soon as possible.


rphair avatar rphair commented on July 30, 2024

At a recent CIP editors' meeting we agreed that issues on the CIP repository should be reviewed sequentially, just as we review PRs. I think this issue is actually the one I used as an example in that discussion. It hasn't been forgotten about, so please stay tuned as we work out a process for reviewing these issues.


rphair avatar rphair commented on July 30, 2024

p.s. I've just reviewed the details of our discussion about "issues" and I think this is one of those that's best turned into a PR with the changes above, especially with so many people showing approval of the OP with no reservations expressed.

@nicolasayotte would you be willing to submit a PR for this, modifying CIP-0002 itself, and adding yourself as an author + this issue as a Comments-URI? In any case I have this flagged for discussion at the next editors' meeting to see if anyone has a counter opinion that it should go into a new CIP (but I doubt it, since the original CIP's not much use without consideration of native tokens).


nicolasayotte avatar nicolasayotte commented on July 30, 2024

And just to clarify, the point of the output formatting is not really to avoid hitting the 16KB limit, but to create utxos that are more easily consumable later.

For example: if you need to send me 25000 ada and 50 NFTs, to increase the chances of me using those UTxOs efficiently you could send me 9 outputs:
10000 ada, 10000 ada, 4500 ada, 400 ada,
20 ada with 10 NFTs,
20 ada with 10 NFTs,
20 ada with 10 NFTs,
20 ada with 10 NFTs,
20 ada with 10 NFTs.

Doing that requires a slightly larger transaction to write out, but creates 9 UTxOs which have a very high probability of being spendable independently of each other.
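One simple way to produce such a split (a hedged sketch: the denominations in the example above were hand-picked, while this scheme just caps each plain-ada output at `max_ada`, so it yields 8 outputs rather than 9):

```python
def fragment(total_ada, nft_count, max_nfts=10, max_ada=10000, carry_ada=20):
    """Split a payment into NFT outputs of at most `max_nfts` NFTs, each
    carrying `carry_ada` ada, plus plain-ada outputs of at most `max_ada`."""
    outputs = []
    remaining = nft_count
    while remaining > 0:
        n = min(max_nfts, remaining)
        outputs.append({'ada': carry_ada, 'nfts': n})
        remaining -= n
    ada_left = total_ada - sum(o['ada'] for o in outputs)
    while ada_left > 0:
        amount = min(max_ada, ada_left)
        outputs.append({'ada': amount, 'nfts': 0})
        ada_left -= amount
    return outputs
```

For 25000 ada and 50 NFTs with the defaults, this yields five 10-NFT outputs plus three plain-ada outputs, each likely to be spendable independently.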

Let me show you a live example of breaking up a clustered UTxO by sending it to myself in the eternl.io wallet:

[screenshot: a clustered UTxO being split via Token Fragmentation]

Here you see the utxo becomes 12 outputs that are all very likely to be usable independently.

The "Maximum Ada before splitting" value should be customizable and the "Maximum amount of different tokens in a UTxO before splitting" should also be customizable. Both could have defaults.

Right now, when I split my UTxOs that way using the eternl.io Token Fragmentation, the current coin selection algorithm used in most dApps will cluster everything back together with every transaction I make...


nicolasayotte avatar nicolasayotte commented on July 30, 2024

@jonathanknowles finally got around to discussing each point. As a scientist, I am really challenging all of this to help us get to a stronger solution.


nicolasayotte avatar nicolasayotte commented on July 30, 2024

@jonathanknowles I am glad you took time to give me a thorough answer. I am kind of waiting on your feedback on my response before going forward though.

