fxamacker / ccf_draft

Cadence Compact Format specification (RC2)
License: Apache License 2.0
The following improvements were identified by reading Cadence Core Contracts with CCF specs and codec in mind.
More types need to be added in CDDL:
Reassign tag numbers to reserve some tag numbers in each (sub)group.
Add interface types as options to ccf-composite-type-message. This is more extensible than using simple type, at the cost of encoding a little more data.
Add reference and restricted types as options to inline-type.
Add support for function value.
Refactor CDDL to separate type objects from type value objects for readability and cleaner implementation.
In CDDL, uint-value, uint128-value, and uint256-value should have .ge 0.
I have some pending updates to CCF text and CDDL that should be merged before the CCF Spec is hosted at onflow/ccf repo.
More info:
Thanks @turbolent for suggesting we add CDDL validation to CI.
CCF uses CBOR Preferred Serialization to encode values to their smallest form.
Describe canonical encoding at a higher level than CBOR (CCF level).
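To illustrate what Preferred Serialization means at the byte level, here is a minimal Python sketch (the function name and structure are illustrative, not taken from any CCF codec) that emits the shortest CBOR head capable of representing an unsigned integer, per RFC 8949 §4.2.1:

```python
import struct

def encode_uint_head(major: int, n: int) -> bytes:
    """Encode a CBOR head (major type + argument) using Preferred
    Serialization: the shortest form that can hold n (RFC 8949 s4.2.1)."""
    mt = major << 5
    if n < 24:
        return bytes([mt | n])            # argument packed into the head byte
    if n < 0x100:
        return bytes([mt | 24, n])        # 1-byte argument
    if n < 0x10000:
        return bytes([mt | 25]) + struct.pack(">H", n)  # 2-byte argument
    if n < 0x100000000:
        return bytes([mt | 26]) + struct.pack(">I", n)  # 4-byte argument
    return bytes([mt | 27]) + struct.pack(">Q", n)      # 8-byte argument

# Unsigned integers are major type 0.
assert encode_uint_head(0, 10) == b"\x0a"
assert encode_uint_head(0, 500) == b"\x19\x01\xf4"
```

A deterministic encoder must always pick the shortest of these forms; encoding 10 as `0x18 0x0a` would be well-formed CBOR but invalid under Preferred Serialization.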
[cadence_value] should be [*cadence_value], representing an array of zero or more encoded Cadence simple values.
CBOR is a modern alternative to earlier data formats such as JSON, GOB, MessagePack, etc. Some comparisons between CBOR and other binary formats were published by IETF:
Currently, we don't have global CCF type id. So CCF type id is scoped to the value it describes.
We can define a top level CCF message that combines CCF type definitions with the value it describes. This solution can be used before we support global CCF type id.
Make the CBOR tag numbers in the CDDL match the CCF codec.
The CDDL section of the CCF specs was updated, and some of the examples no longer match the updated CDDL.
Update CCF examples in the specs to match the updated definitions in CDDL notation.
Also, add more examples and confirm they are all in sync with CCF Codec (PR 2364) merged to onflow/cadence on April 13, 2023.
In JSON-CDC, sometimes cadence-type-id was encoded when it was just a "stringification" of other encoded data and not necessary to encode.
In CCF, we don't need to keep this inefficiency for the sake of compatibility.
Thanks @turbolent for spotting this!
Remove the inefficiency by removing unnecessary cadence-type-id from
The capability-type in CDDL shouldn't be an array.
The specification should be explicit about invalid UTF-8 strings.
My preference is to reject invalid UTF-8 text strings. However, there may be edge cases I'm not aware of, so confirm with Cadence team first before adding text.
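If strict rejection is adopted, the decoder-side check is straightforward. A hypothetical Python sketch (the function name is illustrative, not from the codec):

```python
def validate_text(data: bytes) -> str:
    """Reject invalid UTF-8 in decoded CBOR text strings outright,
    rather than silently substituting replacement characters."""
    try:
        return data.decode("utf-8")
    except UnicodeDecodeError as e:
        raise ValueError("malformed CCF message: invalid UTF-8 text string") from e

assert validate_text(b"FeesDeducted") == "FeesDeducted"
```

Strict rejection keeps round-tripping lossless: every accepted message re-encodes to the same bytes, which matters for deterministic encoding.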
We now explicitly specify nullable types starting with PR #75.
Remove null as an option for type-value.
Specify null as an option for capability-type-value and restricted-type-value.
Add a validity requirement for unique names in function-type.type-parameters.
Thanks @turbolent for great suggestions during our discussions on Slack and for opening PR 75!
Mention that decoding limits can be stricter for untrusted inputs and less strict for trusted inputs. For example, CBOR limits such as MaxArrayElements, MaxMapPairs, and MaxNestedLevels can be set differently for decoders processing trusted and untrusted inputs. CCF-based protocols can also specify different limits to balance tradeoffs.
The main tradeoff for decoder limits:
A gRPC limit of 20 MB can support at most a 20,000,000-element array (for an unrealistic message with zero overhead and 1-byte elements).
In practice, it would take many thousands of non-malicious CCF messages (like average-sized events) to reach a 20 MB gRPC limit, so it doesn't make sense to allow more than 20,000,000 elements for each array within a single CCF message.
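The shape of such a guard can be sketched in Python (a hypothetical standalone function; real codecs such as fxamacker/cbor expose this through decoder options like MaxArrayElements rather than a separate check). It reads the element count from a definite-length CBOR array header and rejects oversized arrays before decoding a single element:

```python
def check_array_header(msg: bytes, max_elements: int) -> int:
    """Reject an oversized definite-length CBOR array up front,
    before any elements are decoded or allocated."""
    if not msg:
        raise ValueError("empty message")
    mt, ai = msg[0] >> 5, msg[0] & 0x1F
    if mt != 4:  # major type 4 = array
        raise ValueError("not a CBOR array")
    if ai < 24:
        count = ai                      # count packed into the head byte
    elif ai == 24:
        count = msg[1]                  # 1-byte count argument
    elif ai in (25, 26, 27):
        width = {25: 2, 26: 4, 27: 8}[ai]
        count = int.from_bytes(msg[1:1 + width], "big")
    else:
        raise ValueError("indefinite-length or reserved array header")
    if count > max_elements:
        raise ValueError(f"{count} elements exceeds limit {max_elements}")
    return count

# A decoder for untrusted input can pass a tighter limit than one for
# trusted input, matching the element bound discussed above.
assert check_array_header(b"\x83\x01\x02\x03", max_elements=20_000_000) == 3
```

Because the count is declared in the header, a malicious message claiming billions of elements can be rejected in constant time, without allocating anything proportional to the claimed size.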
This update to CCF specs can be done after opening PR to add CCF Codec to onflow/cadence and before CCF Specs RC2.
Benchmarks are from initial proof-of-concept CCF codec. We should update benchmarks and comparisons with a reminder that we are not comparing apples to apples.
Prior formats (CBF and JSON-Cadence Data Interchange) didn't specify requirements for validity, sorting, etc.
Sorting data, deterministic encoding, etc. are not free.
Cadence recently added type-parameters to function-value. Update CCF to support it.
This requires Cadence to add a static type for the Cadence external PathLink value.
Relies on onflow/cadence#2167
Besides the CBOR Core Deterministic Encoding Requirements, there are additional CCF deterministic encoding requirements defined in this specification.
Currently, the bullet point only mentions the deterministic requirements from CBOR's RFC 8949.
Thanks @turbolent for suggesting we use https://github.com/anweiss/cddl to check CDDL in PR #10.
Add simple-type-id in CDDL (around 105 lines).
CCF specs should be updated to:
Some changes to specs are required:
Update composite-type-value.initializers from "one or many" to "zero or one" since only one initializer is supported and sorting is hard for multiple initializers.
Remove the deterministic sorting requirement for composite-type-value.initializers since only one initializer is supported and initializer parameters have a natural ordering that shouldn't be changed.
Thanks @turbolent for great discussion and suggesting this today!
The findings are primarily from onflow/flow-go#3448, and some of its text can be reused here. Some of the content under Interoperability and Reuse can also be moved into Why CBOR.
Thanks @turbolent for the suggestion to add this section.
Update Why CBOR section's mention of fxamacker/cbor to:
For example, make it clear that CCF-based protocols and data formats must use a deterministic sequence when encoding more than one Cadence composite type.
CI takes too long because compiling Rust programs takes too long.
Optimize by downloading a precompiled cddl program, verifying its checksum, and running it instead of building it.
The current status and timeline is outdated. Also, the introduction can be improved by copying this Introduction section from the CCF specification:
Cadence external values (e.g. transaction arguments, events, etc.) have been encoded using JSON-Cadence Data Interchange format, which is human-readable, verbose, and doesn't define deterministic encoding.
CCF is a binary data format that allows more compact, efficient, and deterministic encoding of Cadence external values. Consequently, the CCF codec in Cadence is faster, uses less memory, encodes deterministically, and produces smaller messages than the JSON-CDC codec.
A real FeesDeducted event can encode to:
Unlike prior formats, CCF defines all requirements for deterministic encoding (sort orders, smallest encoded forms, and Cadence-specific requirements) to allow CCF codecs implemented in different programming languages to deterministically produce identical messages.
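One of the sort orders CCF inherits is CBOR's core deterministic map ordering: entries are sorted by the bytewise lexicographic order of their encoded keys. A minimal Python sketch (illustrative only; the helper name is not from the codec):

```python
def deterministic_map_entries(entries):
    """Sort (encoded_key, encoded_value) pairs by bytewise lexicographic
    order of the encoded key, per CBOR Core Deterministic Encoding
    (RFC 8949 s4.2.1). CCF layers Cadence-specific ordering rules on top."""
    return sorted(entries, key=lambda kv: kv[0])

# 0x61 "a" encodes text string "a"; 0x62 "ab" encodes text string "ab".
enc = deterministic_map_entries([(b"\x62ab", b"\x01"), (b"\x61a", b"\x02")])
assert enc[0][0] == b"\x61a"  # 0x61... sorts before 0x62... bytewise
```

Because the comparison is over encoded bytes rather than decoded values, two independent implementations that follow the same encoding rules will always agree on the ordering, which is exactly what byte-identical output across codecs requires.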
For security, CCF was designed to allow efficient detection and rejection of malformed messages without creating Cadence objects. This allows more costly checks (e.g. validity) to be performed only on well-formed messages.
CCF leverages vendor-neutral Internet Standards such as CBOR (RFC 8949), which is designed to be relevant for decades.
Add a section to specify restrictions on CCF messages.
The value definition is incomplete in this block because it is missing composite value.
CCF_TypeAndValue_Message = #6.CBORTagTypeAndValue([
type: CCF_InlineTypeInfo,
value: cadence_value / [cadence_value]
])