Comments (32)
I have doubts if bitwise operations on byte strings is an optimal choice for cryptographic applications.
I'm not sure that the point of CIP-0058 was to enable cryptographic applications, but it's perfectly valid to bring the point up. Yo make a good point that the separation between integers and bytestrings makes it difficult to mix arithmetic and bitwise operations. I'm somewhat doubtful that many cryptographic operations could be efficiently implemented in Plutus Core even if we were to provide new builtins that were suitable for both arithmetic and bitwise operations. Plutus Core is interpreted and it's functional, and both of those aspects will slow things down: you can't do in-place modifications of bytestrings, for example: every time you modify a single bit you have to copy the entire bytestring, so if you're doing hundreds or thousands of such operations it'll be slow. I guess we'd have to try it and see.
Anyway, now is a good time to bring this up, before we do any more work on CIP-0058. Part of the idea of CIPs is that they're open for members of the community to discuss and contribute to, so your input is very welcome.
from cips.
I'm not sure that the point of CIP-0058 was to enable cryptographic applications,
This was definitely one of the major use cases.
FWIW, the inefficiency part applies to almost any other smart contract platform,
unless you go full WASM, which has its own issues.
The presence of proper bitwise operations would make infeasible things feasible, even if not entirely efficient.
You don't need much to implement e.g. FRI.
from cips.
Copying my old comments from IntersectMBO/plutus#4733 (comment)
I just realised that
integerToByteString
is problematic because you can't control the length of the output.
This is important because all operations only work on arguments of equal length.
Padding out a bytestring to the correct length is not zero-cost since we need to create a helper bytestring which we append to the output, resulting in 3 allocated bytestrings rather than 1.
When doing cryptography using these functions, we'd always be mod-ing all arithmetic operations, hence conversion to and from must be fast. In addition, we would never be using negative numbers, instead representing negative numbers as the upper half of the range we choose, hence sign-extension isn't a worry.
If the integer is too big for the chosen length, we cut off the top.
We'd still want to support converting an arbitrary-length integer to bytestring, as there isn't a way of calculating the necessary length effectively. This could be done by regarding a chosen length of 0 as "unlimited". I'm not sure if this is a good idea.
In any case, I assume this should be added to the CIP?
If we switch
BuiltinByteString
toShortByteString
, conversion from/to integers would be O(1), which would be quite a big efficiency-boost I'd imagine. The only downside is that slicing becomes O(n), but I'm not sure this is an issue, as slicing isn't commonly used anyways. @michaelpj ?Edit: Not entirely straightfoward since
BigNat#
has the invariants that: 1) it's word-sized and 2) top-most word is non-zero.
If we assume word sizes are 64 bits, we can still take advantage of this though! On 64-bit platforms,
integerToByteString
can be truly O(1).byteStringToInteger
can be too if we err if it's not word-sized.
On non-64-bit platforms it gets a bit awkward, but it might still be worth it.
This would obviously require changing the specification of the conversion functions to match the invariants ofBigNat#
on 64-bit platforms.Edit 2: We don't need to change the specification.
BuiltinByteString
would be composed of aByteArray#
and an explicit additional length, such that the underlying array always fulfills the invariants ofBigNat#
. If the explicit length is shorter than the underlying array, then the extra bytes must be zero. If the explicit length is longer than the underlying array, the missing bytes can be assumed to be zero. The real length must always be a multiple of the word size, with a top-most non-zero-word. Hence, the explicit length can not be a full word less than the real length.
With this,integerToByteString
is truly O(1). The explicit length should be such that the top-most byte is non-zero. You can then slice to N bytes (equivalent to modulus of that power of two), which will create a newBuiltinByteString
with the same invariants, an explicit length of N, and a real length of ceil(N / wordsize)*wordsize. Converting this back to anInteger
is also O(1), as we can use the underlying array as-is, discarding the explicit length. The downside of slicing being slower still holds, albeit
you could work around this by keeping around an explicit offset too, but this would simply delay the copy to the conversion back to an integer, which might still be worth it, but doesn't seem important enough to me.
A consequence of the previous scheme is that we can implement an O(1) zero-extension function
zeroExtendByteString :: Integer -> BuiltinByteString -> BuiltinByteString
that increments the explicit length without touching the underlying array. This might be the only thing we need to add to/change in the specification. Then, if you havex
andy
, 64-bit signed integers represented as the[0..2^64[
range inInteger
, to do an addition, we sum the integers, convert to bytestring (almost no-op), zero-extend to 8 bytes (almost no-op, no-op if already above 8 bytes), AND with 0xFFFFFFFFFFFFFFFF, convert back (almost no-op). This seems reasonable.An alternative design would be to change slicing such that an out-of-bounds range, rather than being invalid, is filled with zeros.
That would also solve the issue with roughly the same performance (you wouldn't need the AND anymore).Perhaps including both changes would be wise.
The question is, is the overhead of calling a function low or high relative to the cost of
modInteger
? If high, it might in the end not be faster than just doingmodInteger
! Of course, future improvements to the Plutus interpreter might change this.
from cips.
The CIP explains why we do not implement these operations on integers, see the section "No metaphor-mixing between numbers and bits".
from cips.
In addition, we would never be using negative numbers
There exist modular arithmetic algorithms that utilize negative numbers. For example Algorithm 2.22 from book ''Guide by elliptic curve cryptography" by Hankerson et al.
I think that's a little bit of a red herring. When you're doing modular arithmetic you're working in the ring ℤ_n for some n and it's natural to represent an element of that ring by some integer, which is usually normalised to lie between 0 and n-1, so in particular it's positive. However, when you're looking at a power a^k with a in ℤ_n, k really is an integer, not an element of the ring, and negative powers are perfectly reasonable for invertible elements of the ring.
from cips.
The point is more so you'd have to handle negative numbers yourself.
integerToByteString
should indeed have a width argument.
Another option is to add integer arithmetic operations to bytestrings.
from cips.
this would likely require the introduction of fixed-width integers, as currently we only have integers of arbitrary length, plenty of cryptographic algorithms use and abuse rollover too.
Metaphor-mixing is bad, but that doesn't preclude making conversion as cheap as possible.
since we have endianness requirements, 'zero cost' conversions aren't generally possible in the conversion.
from cips.
libgmp representation is little-endian
like sequence of limbs. Limbs size and endianess is platform specific. All tier1 ghc platforms are little endian. Thus, in theory, the conversion to little endian plutus bytestring could be (near) zero cost. Conversion to big endian plutus bytestring requires reversing the bytestring and it cannot be zero cost. All of that is based on the assumption that ghc and node will never run on big endian platforms. While unlikely they will we cannot assume that. Anyway, such 'casting' smells bad. If if was up to me I would not approve such a change.
from cips.
CIP-58 is due to be replaced in any case at this point. The conversion primitives that have been merged recently have already deviated from CIP-58 to the point of mutual incompatibility.
from cips.
Note also that we don't necessarily want to fix ourselves to the libgmp
bignum backend. That would for example forever give up any possibility of running a node (or even just checking a Plutus script) in Javascript, for example, because there the bigint operations are implemented with FFI calls to JS-native bigint operations.
from cips.
@Ryun1 @Crypto2099 - the submitter @marcinbugaj seems to raise a valid question and has invited @michaelpj @kozross @perturbing @kwxm to continue the discussion from IntersectMBO/plutus#4733 (comment) which makes sense because there are over 200 comments in that pull request already.
Marcin: as mentioned in that comment I wouldn't think of this issue as a "ticket" since it's beyond the CIP repository admins' capabilities to resolve. Instead we can watch the discussion that develops among the subject matter experts here and hopefully identify if & when a pull request might be appropriate for CIP-0058.
from cips.
You could just implement every single cryptographic operation in the interpreter directly,
but many times you want to change them slightly, and many of them can be of considerable
complexity, increasing the complexity of Cardano.
from cips.
Metaphor-mixing is bad, but that doesn't preclude making conversion as cheap as possible.
from cips.
I don't mind metaphor-mixing
especially when it's good for efficiency. The purpose of bitwise primops is to make scripts more efficient, isn't it? The purpose is not necessarily to fix the ubiquitous design of integers being capable of arithmetic and bit operations. I'd prefer Plutus bitwise primops to be done directly on integers. That has a few advantages:
- there exists no more efficient nor simpler implementation. The plutus bitwise prioms would just delegate to Haskell integer bit operations.
- Plutus bitwise primops being compatible with Haskell bit operations makes Plutus easier to understand. Every difference in behavior of bit operations between Plutus and Haskell will be perceived as a quirk by developers
from cips.
In addition, we would never be using negative numbers
There exist modular arithmetic algorithms that utilize negative numbers. For example Algorithm 2.22 from book ''Guide by elliptic curve cryptography" by Hankerson et al.
from cips.
In addition, we would never be using negative numbers
There exist modular arithmetic algorithms that utilize negative numbers. For example Algorithm 2.22 from book ''Guide by elliptic curve cryptography" by Hankerson et al.
The point is you would represent them two's complement as is done in other languages.
So you'd represent -1
as 2^n - 1
for some n
explicitly.
You probably want the wrapping on addition/subtraction too.
from cips.
To avoid bias in the the discussion (and also to attract more interested parties) I'm changing the title a bit... please feel free to correct further if the new title doesn't wholly reflect what is currently being debated.
from cips.
While bitwise operations on integers are useful in some contexts, bytestrings may be more suitable in others. For example, bytestrings have a natural notion of length, but the size of an integer in bytes is a bit more lax. Suppose you have two integers of different lengths and you want to perform a bitwise xor
. How do you reconcile the different lengths? Do you truncate the longer integer or extend the shorter one? Do you truncate/extend on the left or right? There was a lengthy discussion about this in the earlier PR and we eventually decided to only allow bitwise and/or/xor on bytestrings of identical length: you have to make sure in advance that your inputs are all of the right length. For integers you'd either have to fix a strategy and bake it into the operations or equip each of the operations with a couple of boolean arguments saying whether to truncate or extend and whether to do it at the left or right.
There are similar issues for shifts and rotations: in particular, what happens if you rotate the bits in an integer (which have no fixed length)? After rotating by a certain number of bits you might end up with one or more zero bits at the start, and presumably they'd just be dropped since there's no way of representing an integer with leading zeros. This would mean that in general if you take an integer n and rotate it by i bits and then by j bits, you'd probably end up with something very different from what you'd get by rotating n by i+j bits, which seems undesirable. Note that in Data.Bits
, for example, rotation for the Integer type is defined to be the same as shifting.
There was also a mention of zero-cost conversions between bytestrings and integers, but is this really possible? Again I'm thinking about leading zeros in bytestrings: if you convert a bytestring to an integer and then back again you might end up with something shorter than the thing you started with, so maybe we'd need a width argument as well, as we currently have for integerToByteString
, and that requires extra work to construct an output of the correct size.
Also, if we're dealing with signed integers, what happens to the signs during shifts/rotations/xor/etc? There are standard ways to do that for fixed-width integers, but it may be more tricky for the unbounded integers that we have in Plutus Core.
I'm not saying that we shouldn't think about bitwise operations for integers, just that things may not be as easy as they appear at first sight. One of the use cases in CIP-0058 was concise data structures based on bytestrings, and those might not be so easy to implement if we had to represent them using integers instead of bytestrings. Maybe there's no one-size-fits-all solution and we'd need separate sets of operations for bytestrings and integers to accommodate the needs of different application areas. That would need a whole new CIP though, and it probably wouldn't get implemented any time soon.
from cips.
There exist modular arithmetic algorithms that utilize negative numbers. For example Algorithm 2.22 from book ''Guide by elliptic curve cryptography" by Hankerson et al.
I agree with what @kwxm said about this, these negative integers are semantics, it is taking the inverse and the exponentiation with the positive exponent (also see this CIP).
from cips.
since we have endianness requirements, 'zero cost' conversions aren't generally possible in the conversion.
Even if we didn't, 'zero cost' conversions would be impossible due to incompatible representations between Integer
and ByteString
. You can't avoid a copy when converting between those two no matter how you do it.
from cips.
since we have endianness requirements, 'zero cost' conversions aren't generally possible in the conversion.
Even if we didn't, 'zero cost' conversions would be impossible due to incompatible representations between
Integer
andByteString
. You can't avoid a copy when converting between those two no matter how you do it.
Not necessarily true. I looked into this at the time and you can use the same representation as integers for bytestrings.
from cips.
since we have endianness requirements, 'zero cost' conversions aren't generally possible in the conversion.
We assume little endian.
from cips.
Not necessarily true. I looked into this at the time and you can use the same representation as integers for bytestrings.
At least as far as I could tell in GHC 9.0 and onwards, Integer
s are represented as (essentially) unpinned ByteArray#
, while ByteString
s are a Ptr Word8
(which happens to point to pinned memory). You cannot inter-convert between these in general: taking an Addr#
from unpinned memory is not safe (since it might move and its address is not stable), so converting it to a Ptr
isn't something you can do easily. At least in this direction, I don't see how this can be done without copying.
from cips.
Not necessarily true. I looked into this at the time and you can use the same representation as integers for bytestrings.
At least as far as I could tell in GHC 9.0 and onwards,
Integer
s are represented as (essentially) unpinnedByteArray#
, whileByteString
s are aPtr Word8
(which happens to point to pinned memory). You cannot inter-convert between these in general: taking anAddr#
from unpinned memory is not safe (since it might move and its address is not stable), so converting it to aPtr
isn't something you can do easily. At least in this direction, I don't see how this can be done without copying.
We would use ByteArray#
for Plutus ByteString
s.
from cips.
That's a change that is quite unlikely to get approved.
from cips.
That's a change that is quite unlikely to get approved.
I don't see why the SPOs would disapprove of this.
from cips.
libgmp representation is
little-endian
like sequence of limbs. Limbs size and endianess is platform specific. All tier1 ghc platforms are little endian. Thus, in theory, the conversion to little endian plutus bytestring could be (near) zero cost. Conversion to big endian plutus bytestring requires reversing the bytestring and it cannot be zero cost. All of that is based on the assumption that ghc and node will never run on big endian platforms. While unlikely they will we cannot assume that. Anyway, such 'casting' smells bad. If if was up to me I would not approve such a change.
It's worse than this - you also have a pinned-versus-unpinned conversion, which cannot be done safely in general (as I have specified). Furthermore, Integer
s are padded to limb size, which means that you (in general) can't convert an arbitrary ByteString
without copying for padding (even if the pinned-versus-unpinned wasn't an issue). Still worse is that for very short Integer
s (aka, those that fit within a machine integer), the representation in GHC 9.0 and later uses an Int#
inside, to which we, once again, cannot convert without copying (or the other way around for that matter).
The idea of a zero-cost conversion between (Plutus representations of) Integer
and ByteString
is a complete non-starter and I ruled it out a while ago.
from cips.
At this point enough argument has been provided for the claim that zero-cost conversion is not doable :)
Furthermore I wish we could agree on the fact that CIP58 is not useful for cryptographic application. Those has been mentioned in the first paragraph of CIP58 as an area of application for bitwise ops. Yet, the goal is not fulfilled. Arithmetic ops can be performed on integers. Bitwise ops on bytestrings. Conversion is not cheap and it's possible that using bitwise ops to divide by power of 2 is more expensive than using normal division.
from cips.
As long as you cannot do bitwise and arithmetic operations on the same type (or without non-zero cost conversion) the 'problem' is still there.
from cips.
Pinnedness is not relevant, the point is you wouldn’t use ByteString at all.
from cips.
I think sacrificing proper support for big endian platforms is worth it.
The best solution would be to add support for something like WASM scripts but this seems unlikely due to bureaucracy.
from cips.
Also, one of the use cases that we have to test the viability of these operations is an implementation of Ed25519 using them. If we can't make that efficient enough to fit within a reasonable amount of budget then that's a problem for sure, and if we can then I think we should be able to do most other things too.
from cips.
Related Issues (20)
- CIP-9999 | Adjust preamble and structure w.r.t CIP-0001 HOT 1
- CIP-1852 | Adjust preamble and structure w.r.t CIP-0001
- CIP-1853 | Adjust preamble and structure w.r.t CIP-0001
- CIP-1854 | Adjust preamble and structure w.r.t CIP-0001
- CIP-1855 | Adjust preamble and structure w.r.t CIP-0001
- Add References section to CIPs/CPSs HOT 10
- CIP-100: Make witnessing optional + add CIP-08 message signing HOT 5
- CIP-9999 | How to add Copyright to CPSs HOT 4
- CIP-0069: alternative approach HOT 10
- CIP-0072: Suggestions for improving dApp registration identity verification HOT 4
- CIP-100 | Canonical example is wrong HOT 3
- CIP 100 | Provide directions on how to create signatures for the body without circular dependencies. HOT 19
- Wiki: Initial topics + points to cover HOT 5
- Rendering errors on cips.cardano.org. HOT 2
- CIP-0013: Current state of integration and further advancements HOT 13
- CIP-0010 | Add schema links HOT 2
- Voter metadata sign the initial document HOT 1
- Tracking protocol parameters definitions HOT 3
- Indexing CIPs Repo for Core `Waiting for Implementation` proposals HOT 1
Recommend Projects
-
React
A declarative, efficient, and flexible JavaScript library for building user interfaces.
-
Vue.js
🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
-
Typescript
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
-
TensorFlow
An Open Source Machine Learning Framework for Everyone
-
Django
The Web framework for perfectionists with deadlines.
-
Laravel
A PHP framework for web artisans
-
D3
Bring data to life with SVG, Canvas and HTML. 📊📈🎉
-
Recommend Topics
-
javascript
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
-
web
Some thing interesting about web. New door for the world.
-
server
A server is a program made to process requests and deliver data to clients.
-
Machine learning
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
-
Visualization
Some thing interesting about visualization, use data art
-
Game
Some thing interesting about game, make everyone happy.
Recommend Org
-
Facebook
We are working to build community through open source technology. NB: members must have two-factor auth.
-
Microsoft
Open source projects and samples from Microsoft.
-
Google
Google ❤️ Open Source for everyone.
-
Alibaba
Alibaba Open Source for everyone
-
D3
Data-Driven Documents codes.
-
Tencent
China tencent open source team.
from cips.