cddl's People

Contributors

anweiss, carl-wallace, dependabot[bot], ericseppanen, hansl, itamarst, jrandolf, sebastiengllmt

cddl's Issues

Evaluate nom

A v2 of this library would likely warrant a re-write using a more formal parser library, like nom. It would be interesting to compare the performance of nom vs. the handwritten implementation that exists today.

Array validation doesn't work for records

Array validation doesn't seem to work as expected when the array contains a non-homogeneous "record" (a fixed-length list using different types at different indices).

If I create the following CDDL:

human = [
  age: int,
  name: tstr,
]

I expect that this will only validate an input list with two elements [integer, string] in order. However, the current code will successfully validate the following (incorrect) JSON inputs:

["Bob", 43]

or

[44, 45, "Carol", "Chuck"]

It appears that the code is iterating through the group checking that somewhere in the input list there is a value of the correct type; it fails to enforce that there are the right number of elements and that elements come in the expected order.

This problem exists in CBOR validation as well.
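
For illustration, the expected behavior can be sketched as a positional check: same length, and each element matched against the type at its own index. The `Value` and `Kind` enums and `validate_record` are hypothetical stand-ins, not types from this crate.

```rust
/// Hypothetical, simplified stand-in for a decoded JSON/CBOR value.
#[derive(Debug, Clone, PartialEq)]
enum Value {
    Int(i64),
    Text(String),
}

/// Hypothetical stand-in for the CDDL types used in the `human` record.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Kind {
    Int,
    Tstr,
}

fn matches_kind(value: &Value, kind: Kind) -> bool {
    matches!(
        (value, kind),
        (Value::Int(_), Kind::Int) | (Value::Text(_), Kind::Tstr)
    )
}

/// Validates `input` against a fixed-length record: the lengths must be equal
/// and each element must match the type expected at its own index.
fn validate_record(input: &[Value], record: &[Kind]) -> bool {
    input.len() == record.len()
        && input.iter().zip(record).all(|(v, k)| matches_kind(v, *k))
}
```

With `record = [Kind::Int, Kind::Tstr]`, this accepts [43, "Bob"] and rejects both inputs shown above.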

CBOR validation for type choice in array

Using the latest version at this point in time: 0.9.0-beta.1.

I am having a problem with validating CBOR [18] (hex: 9F12FF - regardless of definite/indefinite array) against CDDL tester = [18/12] which should be valid.

But I am getting:

failed: error validating type choice at cbor location : expected value 12, got Array([Integer(Integer(18))])

and validating CBOR [12] (hex: 9F0CFF) against the same CDDL also yields:

failed: error validating type choice at cbor location /0: expected value 18, got Integer(12)
error validating type choice at cbor location : expected value 12, got Array([Integer(Integer(12))])

but validating CBOR [12] and [18] against CDDL tester = [18//12] works.

According to the specification, type choices should work here. Am I mistaken, or is this currently not supported in this package?
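
A minimal sketch of the behavior I would expect for tester = [18/12]: the alternatives are tried against the single array element, not against the array as a whole. The function name and the use of plain integers are assumptions for illustration only.

```rust
/// Returns true when `input` is a one-element array whose element equals any
/// alternative of the type choice (for `[18/12]` the alternatives are 18 and 12).
fn validate_single_entry_choice(input: &[i64], alternatives: &[i64]) -> bool {
    match input {
        // Compare each alternative with the element itself.
        [element] => alternatives.contains(element),
        // Any other length cannot match a single-entry array.
        _ => false,
    }
}
```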

Thank you for your great work!

Incorrect group choice parsing

Take the following code as an example:

foo = int

; no errors
bar = { int, int // int, tstr }

; error
; baz = { int, foo // int, tstr }

It seems to happen whenever a typename is used as the last field within a group choice.
It generates an Error return. However, even in the first example, bar, I think it is still incorrectly parsing, as it seems to consider it more equivalent to what bar = {int, int int, tstr } might be, although that in and of itself seems weird.

As per the RFC8610 specs:

Analogous to types, CDDL also allows choices between groups,
delimited by a "//" (double slash). Note that the "//" operator
binds much more weakly than the other CDDL operators, so each line
within "delivery" in the following example is its own alternative in
the group choice:

               address = { delivery }

               delivery = (
               street: tstr, ? number: uint, city //
               po-box: uint, city //
               per-pickup: true )

               city = (
               name: tstr, zip-code: uint
               )

bar should be parsed into 2 group choices with 2 members each, not 1 choice with 4 members.

Here is the debug-printed value of bar:

CDDL { rules: [Type(TypeRule {
    name: Identifier(("bar", None)),
    generic_param: None,
    is_type_choice_alternate: false,
    value: Type([
        Type1 {
            type2: Map(Group([
                GroupChoice([
                    (ValueMemberKey(ValueMemberKeyEntry { occur: None, member_key: None, entry_type: Type([Type1 { type2: Typename((Identifier(("int", None)), None)), operator: None }]) }), true),
                    (ValueMemberKey(ValueMemberKeyEntry { occur: None, member_key: None, entry_type: Type([Type1 { type2: Typename((Identifier(("int", None)), None)), operator: None }]) }), false),
                    (ValueMemberKey(ValueMemberKeyEntry { occur: None, member_key: None, entry_type: Type([Type1 { type2: Typename((Identifier(("int", None)), None)), operator: None }]) }), true),
                    (ValueMemberKey(ValueMemberKeyEntry { occur: None, member_key: None, entry_type: Type([Type1 { type2: Typename((Identifier(("tstr", None)), None)), operator: None }]) }), false)
                ])
            ])),
            operator: None
        }
    ]),
    range: (0, 7)
})] }

And here is how bar = { int, int, int, tstr } is parsed:

CDDL { rules: [Type(TypeRule {
    name: Identifier(("bar", None)),
    generic_param: None,
    is_type_choice_alternate: false,
    value: Type([
        Type1 {
            type2: Map(Group([
                GroupChoice([
                    (ValueMemberKey(ValueMemberKeyEntry { occur: None, member_key: None, entry_type: Type([Type1 { type2: Typename((Identifier(("int", None)), None)), operator: None }]) }), true),
                    (ValueMemberKey(ValueMemberKeyEntry { occur: None, member_key: None, entry_type: Type([Type1 { type2: Typename((Identifier(("int", None)), None)), operator: None }]) }), true),
                    (ValueMemberKey(ValueMemberKeyEntry { occur: None, member_key: None, entry_type: Type([Type1 { type2: Typename((Identifier(("tstr", None)), None)), operator: None }]) }), false)
                ])
            ])),
            operator: None
        }]),
        range: (0, 7)
})] }

Which is missing one of the fields, possibly since they are unnamed?

Which made me try bar = ( a: int, b: int // c: int, d: tstr ) but it still has the same problem as before with them all being in one group choice.
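
To make the expected grouping concrete, here is a deliberately naive sketch that splits a group body on "//" first and on commas second, reflecting that "//" binds more weakly than the comma. `split_group_choices` is a hypothetical helper; it ignores nesting, strings and comments, so it is not a real parser.

```rust
/// Splits the inside of a group into choices on "//", then splits each choice
/// into comma-separated entries. Naive: nesting, strings and comments are ignored.
fn split_group_choices(group_body: &str) -> Vec<Vec<String>> {
    group_body
        .split("//")
        .map(|choice| {
            choice
                .split(',')
                .map(|entry| entry.trim().to_string())
                // Drop empty entries produced by trailing commas.
                .filter(|entry| !entry.is_empty())
                .collect()
        })
        .collect()
}
```

For the body of bar ("int, int // int, tstr") this yields two choices with two entries each.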

`.and` operator not fully supported?

I have this CDDL snippet:

m = non-empty<{
  ? 0 => "zero"
}>

non-empty<M> = (M) .and ({ + any => any })

which produces this error:

> docker run -i --rm -v $PWD:/data -w /data ghcr.io/anweiss/cddl-cli:latest compile-cddl --cddl non-empty.cddl
error: parser errors
  ┌─ input:5:20
  │
5 │ non-empty<M> = (M) .and ({ + any => any })
  │                    ^^^^ expected rule identifier followed by an assignment token '=', '/=' or '//='

Error: "Parser error"

It looks like .and is not fully supported?

VSCode LSP extension

Create a language server extension for VSCode built on the wasm package. Prototype being developed in the lsp branch.

Improved validation error handling

All errors are currently Box'ed with little actionable information. The library should provide for better error handling mechanisms for lexing, parsing and validation.

Mandatory map fields not enforced?

I have this CBOR (diag notation):

{1: 65535, 2: h'1122334455', 3: 6, }

It validates successfully with this CDDL:

var_header = {
        K_KEY_PROVIDER: uint,
        K_KEY_ID: bstr,
        ? K_KEY_VERSION: uint,
        ? K_AUX_DATA: bstr,
        ? K_NONCE : bstr,
        ? K_AUTH_TAG : bstr,
        ? K_AAD : bstr,
        *uint => any ; extensions
}

K_RESERVED = 0
K_KEY_PROVIDER = 1
K_KEY_ID = 2
K_KEY_VERSION = 3
K_AUX_DATA = 4
K_NONCE = 5
K_AUTH_TAG = 6
K_AAD = 7
        ; extend here

According to @cabo my CDDL is incorrect because it translates to textual, rather than integer, map keys, e.g. "K_NONCE" and not 5. My file is accepted because of the "extensions" line. However it should have failed validation anyway, because the first two fields (K_KEY_PROVIDER and K_KEY_ID) are mandatory in the schema, but missing from the CBOR file.
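
A rough sketch of the check I would expect, assuming (per the comment above) that the bare member names become text keys. The `Key` enum and `missing_required` are hypothetical and not part of this crate.

```rust
use std::collections::BTreeSet;

/// Hypothetical map key: either an unsigned integer or a text string.
#[derive(Debug, Clone, PartialEq, Eq, PartialOrd, Ord)]
enum Key {
    Uint(u64),
    Text(String),
}

/// Returns the required keys that are absent from the map's key set.
/// An empty result means every mandatory member is present.
fn missing_required(present: &BTreeSet<Key>, required: &[Key]) -> Vec<Key> {
    required
        .iter()
        .filter(|key| !present.contains(*key))
        .cloned()
        .collect()
}
```

With the keys 1, 2 and 3 present and the text keys "K_KEY_PROVIDER" and "K_KEY_ID" required, both required keys are reported as missing, so validation should fail even though the extensions line accepts the integer keys.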

Cleanup validation error handling

Errors are only being handled one at a time and the line and column numbers are not properly tracked during parsing and error handling. Errors should be tolerated and the line and column numbers should correctly identify where the errors are being detected.
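
One possible approach, sketched under the assumption that the lexer already knows the byte offset of each error: derive the line and column from the offset on demand. `line_col` is a hypothetical helper, not an existing function in this crate.

```rust
/// Converts a byte offset into a 1-based (line, column) pair.
/// Columns are counted in characters, not bytes.
fn line_col(source: &str, offset: usize) -> (usize, usize) {
    let mut line = 1;
    let mut col = 1;
    for (idx, ch) in source.char_indices() {
        if idx >= offset {
            break;
        }
        if ch == '\n' {
            // A newline starts the next line at column 1.
            line += 1;
            col = 1;
        } else {
            col += 1;
        }
    }
    (line, col)
}
```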

Group choices parsing error

tester = $$vals
$$vals //= 18
$$vals //= 12

gives an error parsing CDDL: incremental parsing error, also

tester = $$vals
$$vals //= ( 18 )
$$vals //= ( 12 )

gives an error parsing CDDL: incremental parsing error

tester = $$vals
$$vals //= ( 18 , )
$$vals //= ( 12 , )

works (although unable to validate due to #116 (comment)).

If I read the specification correctly, a type should be able to be coerced into a group (with one grpent) during semantic analysis and/or during validation (if needed). Or the other way around: everything is a group until it is not: trying to coerce it to a type and checking if it can be.

CDDL parsing fails with carefully placed newline.

In the following block of CDDL:

top-level = top-group

;; This fails to parse with the newline after the "=>"
top-group //= (identifier-a =>
    int)

;; It parses without the newline, or if the newline is before the
;; arrow.
top-group //= (identifier-b => int)
top-group //= (identifier-c
    => int)

identifier-a = 1
identifier-b = 2
identifier-c = 3

The first top-group declaration causes a parse error. It seems to require both using an identifier as the key and having the newline after the arrow. It is easy to work around by either moving the arrow to the new line or just joining the lines.

error: parser errors
  ┌─ input:4:31
  │
4 │   top-group //= (identifier-a =>
  │ ╭──────────────────────────────^
5 │ │     int)
  │ ╰^ invalid group entry syntax

Error: "Parser error"

unwrapping doesn't work as expected

I'm getting conflicting results for 'valid' CDDL from the gem implementation of cddl and this one. The following example fails to validate using your Rust CDDL tool but validates using the gem tool. The following CDDL is used to validate a JSON-LD document where the value of @context is either a string or an array whose first item is as stated below, followed by one or more URIs. My understanding was that ~ is used to unwrap the type and remove the CBOR tag (32) that would otherwise be required.

document = {
  @context : "https://www.example.com/ns/v1" / [ "https://www.example.com/ns/v1", 1* ~uri ]
}

Thoughts? Thanks!

Is the warning in README still valid?

Hi,

Thanks for publishing this crate!

I'm investigating CDDL validation for Python. Lacking an existing library in Python, wrapping a safe implementation in Rust seems like a much better approach than C/C++, and your library came up. The README has a caveat ("personal learning exercise" etc.), but:

  1. You still seem to be actively working on this, and sounds like trying to get to 1.0?
  2. The other Rust alternative is cddl-cat (https://github.com/ericseppanen/cddl-cat) which seems a lot less featureful.
  3. There doesn't seem to be unsafe so presumably there's a limit to how much can go wrong (a panic, worst case? and I believe the Python wrapper will just turn that into a Python exception).

So perhaps that warning is no longer valid? In which case, perhaps it should be removed.

Validation always fails for CBOR with non-standard simple values

Seems like cddl is unable to validate any CBOR binary that uses non-standard simple values, instead producing

Validation of "filename.cbor" failed

error parsing cbor: unassigned type at offset X

As far as I understand, this is because serde_cbor intentionally produces a parser error when it encounters any simple value it doesn't understand.

Is there a workaround for this, or would the fix be to replace serde_cbor with another library?

New beta1 panics when previous version didn't

One of my tests for the validator involves passing in an empty slice (originating in a Python byte string).

In 0.9.0beta0, this gave a validation error.

In beta1, it panics:

thread '<unnamed>' panicked at 'called `Result::unwrap()` on an `Err` value: Io(Error { kind: UnexpectedEof, message: "failed to fill whole buffer" })', /home/itamarst/.cargo/registry/src/github.com-1ecc6299db9ec823/cddl-0.9.0-beta.1/src/validator/mod.rs:169:76
stack backtrace:
   0: rust_begin_unwind
             at /rustc/db9d1b20bba1968c1ec1fc49616d4742c1725b4b/library/std/src/panicking.rs:498:5
   1: core::panicking::panic_fmt
             at /rustc/db9d1b20bba1968c1ec1fc49616d4742c1725b4b/library/core/src/panicking.rs:107:14
   2: core::result::unwrap_failed
             at /rustc/db9d1b20bba1968c1ec1fc49616d4742c1725b4b/library/core/src/result.rs:1613:5
   3: core::result::Result<T,E>::unwrap
             at /rustc/db9d1b20bba1968c1ec1fc49616d4742c1725b4b/library/core/src/result.rs:1295:23
   4: cddl::validator::validate_cbor_from_slice
             at /home/itamarst/.cargo/registry/src/github.com-1ecc6299db9ec823/cddl-0.9.0-beta.1/src/validator/mod.rs:169:38
...

In general I would suggest never using unwrap() or expect() in library APIs unless it's utterly impossible to avoid, since panics are very problematic for library users.
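
As a small illustration of the alternative, using only the standard library: propagate the I/O error with `?` so that an empty slice yields an `Err` instead of a panic. `read_initial_byte` is a hypothetical function, not the crate's actual code path.

```rust
use std::io::{self, Read};

/// Reads the first byte of a document, propagating I/O errors (such as
/// `UnexpectedEof` on empty input) to the caller instead of panicking.
fn read_initial_byte(mut input: &[u8]) -> Result<u8, io::Error> {
    let mut buf = [0u8; 1];
    // `read_exact` fails with `UnexpectedEof` when the slice is too short.
    input.read_exact(&mut buf)?;
    Ok(buf[0])
}
```

The caller can then map the error into a validation error, which is the behavior described for the earlier beta.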

Nicer API for repeat validation

A common pattern is to use the same schema to validate multiple documents. In the current API, this requires a bunch of work involving the innards of the implementation:

  1. Create a lexer.
  2. Create a parser.
  3. For each document to be parsed, create a validator, since validators are single-shot.

A nicer API would be something like:

let cddl_schema = CDDLSchema::from_slice(my_schema_bytes);
for document in documents {
    cddl_schema.validate(document)?;
}

.feature not active in the online validator

The CDDL of the core-href draft currently does not validate in the online service unless the .feature is removed. Same goes for the JC<> example.

Docs say that .feature is supported and on by default; it appears that it is not enabled on the web service because it gives errors like:

expected rule identifier followed by an assignment token '=', '/=' or '//='

Could that be enabled for the web service?

(By the way, the README still calls the spec for .feature draft-ietf-cbor-cddl-control; it has been promoted to RFC 9165 since then).

Error tolerance

Extend parser to be more error tolerant based on incomplete CDDL. This is required for implementing any sort of language server functions for IDE support.

Problems with `serde_json` dependency

Hi,

I'm wrapping your library for Python, and initial setup failed to build:

  error: expected item, found `"serde_json requires that either `std` (default) or `alloc` feature is enabled"`
   --> /home/itamarst/.cargo/registry/src/github.com-1ecc6299db9ec823/serde_json-1.0.69/src/features_check/error.rs:1:1
    |
  1 | "serde_json requires that either `std` (default) or `alloc` feature is enabled"
    | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ expected item

I imagine this is a missing feature flag in the serde_json dependency in Cargo.toml. I can work around it locally by explicitly adding serde_json as a dependency, but I assume this is something that will impact other people.

Refactor validation logic

The validation logic is incredibly confusing to new contributors. The first attempt was really just to get things working, but the resulting code is less than ideal. This issue aims to track all activities related to refactoring this logic to make it much more readable.

Bug (probably): unspecified types aren't errors

Consider the following schema:

reputation-object = {
        application: text
        reputons: [* reputon]
}

The reputon type is never defined, and yet the validation code seems perfectly happy with the following document:

{"application": "blah", "reputons": [{"reputon": "xx"}]}

If this is invalid behavior, great, that can just be fixed.

But perhaps this is valid according to the RFC, and if so that's a very bad feature in a schema language, because a typo in a type name means validation suddenly doesn't happen. In which case perhaps a strict mode would be useful where all types must be explicitly specified? (But I really hope this isn't valid according to the RFC...).
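
A sketch of what such a strict check might look like, assuming the parser can list the rule names that are defined and the names that are referenced. `undefined_references` and its three slices are hypothetical, and the prelude list passed in is only whatever the caller supplies.

```rust
use std::collections::BTreeSet;

/// Returns every referenced rule name that is neither defined in the schema
/// nor listed as a built-in (prelude) type, in sorted order.
fn undefined_references<'a>(
    defined: &[&'a str],
    referenced: &[&'a str],
    prelude: &[&'a str],
) -> Vec<&'a str> {
    let known: BTreeSet<&str> = defined.iter().chain(prelude).copied().collect();
    let missing: BTreeSet<&str> = referenced
        .iter()
        .copied()
        .filter(|name| !known.contains(name))
        .collect();
    missing.into_iter().collect()
}
```

For the schema above, the only undefined reference is reputon, which is exactly the typo-style mistake a strict mode would surface.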

.size control produces unexpected errors

The CDDL spec says:

When applied to an unsigned integer, the ".size" control restricts
the range of that integer by giving a maximum number of bytes that
should be needed in a computer representation of that unsigned
integer. In other words, "uint .size N" is equivalent to
"0...BYTES_N", where BYTES_N == 256**N.

audio_sample = uint .size 3 ; 24-bit, equivalent to 0...16777216

Using version 0.9.0-beta.1:

The contents of size.cddl

start = Record
Record = {
  id: Id
}
Id = uint .size 8

$ echo '{ "id": 5 }' | cddl validate --cddl size.cddl --stdin
[ERROR] Validation from stdin failed: error validating at JSON location /id: expected value .size 8, got 5

I get similar obscure errors when I apply the .size control to the bytes type.
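
For reference, my reading of the quoted rule for unsigned integers can be written as a one-line check (a sketch only; `fits_uint_size` is a hypothetical name):

```rust
/// Checks `uint .size n` as quoted from the spec: the value must fit in
/// `n` bytes, i.e. value < 256**n.
fn fits_uint_size(value: u64, n: u32) -> bool {
    // Every u64 fits in 8 or more bytes; this also avoids an overflowing shift.
    n >= 8 || value < (1u64 << (8 * n))
}
```

Under this reading, 5 satisfies uint .size 8, so the document above should validate.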

Lifetime on CDDLValidator/CDDL structs is annoying

I am wrapping the library with Python, using PyO3. PyO3's macros for wrapping classes aren't happy with having a lifetime parameter, so having a CDDL attribute is difficult (I don't want to parse the schema from scratch each time I validate a document).

More broadly, having a lifetime identifier on the CDDL struct just makes using it annoying since it percolates to everything.

I am very much not an expert, but I suspect this could be fixed simply by having the parser own the lexer and the original schema string?
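
Roughly what I have in mind, as a sketch: a handle that owns the schema text and only hands out borrows tied to `&self`, so the type itself has no lifetime parameter. `OwnedSchema` and `rule_names` are hypothetical, and the line-based scan stands in for real parsing.

```rust
/// Hypothetical owned schema handle: it stores the schema text itself, so the
/// type carries no lifetime parameter and can live inside wrapper classes.
struct OwnedSchema {
    source: String,
}

impl OwnedSchema {
    fn from_slice(bytes: &[u8]) -> Result<Self, std::str::Utf8Error> {
        let source = std::str::from_utf8(bytes)?.to_owned();
        Ok(OwnedSchema { source })
    }

    /// Borrows rule names from the owned text on demand. Naive heuristic:
    /// a rule starts on an unindented, non-comment line containing '='.
    fn rule_names(&self) -> Vec<&str> {
        let mut names = Vec::new();
        for line in self.source.lines() {
            if line.starts_with(';') || line.starts_with(char::is_whitespace) {
                continue;
            }
            if let Some((name, _)) = line.split_once('=') {
                // Strip the slashes of "/=" and "//=" before trimming spaces.
                names.push(name.trim_end_matches('/').trim());
            }
        }
        names
    }
}
```

The borrowed names only live as long as the `&self` borrow, which is usually fine for validation calls and keeps the lifetime out of the struct definition.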

Move tag lexing into the parser

Tags are lexed, but should be parsed to address tagged data items with containing data types ... ABNF: "#" "6" ["." uint] "(" S type S ")". At the moment, tags of this format aren't properly parsed.
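
A minimal sketch of parsing that ABNF production at the string level, to show the pieces the parser needs to recover (the optional tag number and the contained type). `parse_tag` is a hypothetical helper; it does not parse the inner type, it only returns its text.

```rust
/// Parses `#6.N(type)` or `#6(type)` into (optional tag number, inner type text).
/// Returns None when the input does not have that shape.
fn parse_tag(input: &str) -> Option<(Option<u64>, &str)> {
    let rest = input.strip_prefix("#6")?;
    let (number, rest) = match rest.strip_prefix('.') {
        Some(after_dot) => {
            // Take the run of ASCII digits that follows the dot.
            let digits_end = after_dot
                .find(|c: char| !c.is_ascii_digit())
                .unwrap_or(after_dot.len());
            let number = after_dot[..digits_end].parse::<u64>().ok()?;
            (Some(number), &after_dot[digits_end..])
        }
        None => (None, rest),
    };
    // The contained type sits between the outermost parentheses.
    let inner = rest.strip_prefix('(')?.strip_suffix(')')?;
    Some((number, inner.trim()))
}
```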

Example of invalid spec passing

sargun:oci2 sargun$ ./cddl-darwin-amd64 --version
cddl 0.8.5
sargun:oci2 sargun$ ./cddl-darwin-amd64 compile-cddl --cddl hello.cddl 
hello.cddl is conformant

The contents:

; 1868785970 is "oci2" as an integer
oci2 = #6.1868785970({
    ; A given version of the spec may use a specific set
    ; of hash schemes, file layouts, etc.. Therefore in order
    ; to allow for multiple versions of the schema to exist
    ; simultaneously, a user can quickly read this as the
    ; basis of comparison.
    version: uint,
    files: {
        *filename => file,
    }
})

; Consider restricting this
filename = tstr

file = {
    mode: mode,
    (
        ? uid: unsigned,
        ? gid: unsigned //
        ? username: tstr,
        ? groupname: tstr
    )dwadawda
    ; Access time
    ? atime: tdate,
    ; Modification time
    ? mtime: tdate,
    content: content,
}

content = {
    $$content,
}

$$content //= (
    type: "regularfile",
    regularfile: [
        ; A 0 lengthed file may omit the hash.
        size: uint,
        ; Blake 3 256-bit hash
        ? b3-256: bstr .size 256,
        ; Because this is a vector, it would require
        ; revving the specification.
        ;
        ; TODO: Consider adding new hash types.
        ; TODO: Consider adding holes.
    ],
)

$$content //= (
    type: "directory",
    directory: []
)

$$content //= (
    type: "link",
    link: [
        target: tstr,
    ]
)

$$content //= (
    type: "symlink",
    symlink: [
        target: tstr,
    ]
)

$$content //= (
    type: "character",
    character: [
        major: uint .le 18446744073709551615,
        minor: uint .le 18446744073709551615,  
    ]
)

$$content //= (
    type: "block",
    block: [
        major: uint .le 18446744073709551615,
        minor: uint .le 18446744073709551615, 
    ]
)

$$content //= (
    type: "fifo",
    block: []
)

rwx = [
    read: bool,
    write: bool,
    execute: bool,
]

mode = [
    user: rwx,
    group: rwx,
    other: rwx,
    setuid: bool,
    setgid: bool,
    sticky: bool,
]

The part that's "dwadawda" is invalid.

WebAssembly bindings for validator

What is the reason behind the validator module not being exported for wasm targets? What would be required to enable that functionality or is it not possible at all?

Thank you!

Using .default with another control operator or a range does not work

When I try to define a port number field like so:

example = {
    ? port: (uint .lt 65536) .default 5683
}

or like so

example = {
    ? port: 0..65535 .default 5683
}

I get an error when I try to validate the following JSON object:

{
    "port": 5682
}

The error I get looks something like this:

error validating at cddl location "" and JSON location : CDDL member key must be string data type. got 5683

Using the example for .default and another control operator from the specification actually also fails.

Reduce crate size

Much of the size of this crate can be attributed to the regex crate dependency. Only the needed features of regex should be enabled per the instructions here

JSON data faker

Implement a JSON data faker from CDDL per one of the original project goals outlined in the README

Comment formatting bugs

There's some odd formatting behavior with comments after trailing commas in member key group entries.

Choice operator not parsed correctly in 0.5.2

The parsing of CDDL input foo = (int / float) changed in release 0.5.2. The choice operator no longer seems to work as intended.

Reading through the rfc, the only place the / operator is allowed is here:

type = type1 *(S "/" S type1)

So I would expect the resulting ast to contain a Type with two Type1 elements in type_choices. This is what I see in 0.5.1.

In 0.5.2 I am seeing something different: a Group with one GroupChoice containing two GroupEntry elements. I don't think that's right; it's what one would expect if I had specified foo = (int, float).

In fact, that's probably a better way to demonstrate the problem: the ast for (int, float) is the same as for (int / float).

abbreviated ast from 0.5.1

Type {
    type_choices: [
        Type1 {
            type2: Typename {
                ident: Identifier {
                    ident: "int",
                    socket: None,
                },
                generic_arg: None,
            },
            operator: None,
        },
        Type1 {
            type2: Typename {
                ident: Identifier {
                    ident: "float",
                    socket: None,
                },
                generic_arg: None,
            },
            operator: None,
        },
    ],
},

abbreviated ast from 0.5.2

Group {
    group_choices: [
        GroupChoice {
            group_entries: [
                (
                    TypeGroupname {
                        ge: TypeGroupnameEntry {
                            occur: None,
                            name: Identifier {
                                ident: "int",
                                socket: None,
                            },
                            generic_arg: None,
                        },
                    },
                    false,
                ),
                (
                    TypeGroupname {
                        ge: TypeGroupnameEntry {
                            occur: None,
                            name: Identifier {
                                ident: "float",
                                socket: None,
                            },
                            generic_arg: None,
                        },
                    },
                    false,
                ),
            ],
        },
    ],
}

Validating JSON document when schema has bytes

Hi,

Continuing investigation of cddl—support for same schema with both JSON and CBOR is great, but there's the problem of bytes. The CDDL RFC says "don't support bytes in schema language", which OK, that's an approach. But another alternative is to say "if schema says bytes, expectation is that in JSON document this will be base64-encoded bytes in a string." And then you could validate JSON documents even with a schema that had bytes, by converting to bytes as part of validation.

I imagine this would have to be a two-step process:

  1. Notice it's supposed to be bytes, check if it's string and convert to bytes.
  2. Then after deserialization to bytes apply any additional controls/constraints, e.g. .size.
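
The two steps could look roughly like this. It is a sketch with a hand-rolled decoder for standard padded base64; `decode_base64` and `validate_bstr_from_json` are hypothetical names, and the length constraint is passed as an explicit range so that no particular interpretation of .size is assumed.

```rust
use std::ops::RangeInclusive;

/// Step 1: decodes standard base64 text; returns None on any invalid character.
/// Trailing '=' padding is ignored rather than strictly checked.
fn decode_base64(text: &str) -> Option<Vec<u8>> {
    let mut out = Vec::new();
    let mut buffer: u32 = 0;
    let mut bits = 0;
    for byte in text.trim_end_matches('=').bytes() {
        let sextet = match byte {
            b'A'..=b'Z' => byte - b'A',
            b'a'..=b'z' => byte - b'a' + 26,
            b'0'..=b'9' => byte - b'0' + 52,
            b'+' => 62,
            b'/' => 63,
            _ => return None,
        };
        buffer = (buffer << 6) | u32::from(sextet);
        bits += 6;
        if bits >= 8 {
            bits -= 8;
            // Emit the oldest complete byte held in the buffer.
            out.push((buffer >> bits) as u8);
        }
    }
    Some(out)
}

/// Step 2: applies a length constraint to the decoded bytes.
fn validate_bstr_from_json(text: &str, allowed_len: RangeInclusive<usize>) -> bool {
    match decode_base64(text) {
        Some(bytes) => allowed_len.contains(&bytes.len()),
        None => false,
    }
}
```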

Since this is not quite compatible with the RFC (arguably it is compatible, in that the RFC says "don't use bstr in schema", so this is a superset), such a mode might need to be hidden behind an option. Some questions:

  1. Would you be amenable to such a feature, if I provided it? Or some other alternative?
  2. Any sense of how difficult it would be to implement?

(Still investigating if this is an actual requirement for the project, or just a nice to have; if this takes more than 60 seconds to answer "not sure" is a fine answer for both questions.)

`tstr` type allows maps?

Thank you for this fantastic implementation of CDDL! ✨

I've stumbled upon the following issue and wonder if it might be a bug?
The following CDDL schema (wrongly?) allows maps as values in the fields map even though only tstr types are permitted:

message = {
    fields: {
        + tstr => tstr
    }
}

This CBOR here (with diagnostic JSON) gets accepted by this schema while I would expect an error:

A1666669656C6473A16474657374A164546578746F48656C6C6F2C204D65737361676521
{
  "fields": {
    "test": {
      "Text": "Hello, Message!"
    }
    }
  }
}

Regex escaping and syntax issues

This crate relies on the regex crate for parsing regex in CDDL. However, it only supports PCRE-like regex and doesn't require escaping of certain special characters. The parsed regex strings from CDDL should therefore be updated for proper parsing by the regex crate.

Named capture groups should also be prepended with a P as follows: (?P<name>)
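
A sketch of the named-group rewrite using plain string handling. `to_p_named_groups` is a hypothetical helper; it leaves lookbehind syntax alone and does not account for escaped parentheses.

```rust
/// Rewrites `(?<name>` to `(?P<name>`, leaving lookbehind syntax
/// (`(?<=` and `(?<!`) untouched. Escaped parentheses are not handled.
fn to_p_named_groups(pattern: &str) -> String {
    let mut out = String::with_capacity(pattern.len() + 8);
    let mut rest = pattern;
    while let Some(pos) = rest.find("(?<") {
        let after = &rest[pos + 3..];
        out.push_str(&rest[..pos]);
        match after.chars().next() {
            // A group name starts with a letter or underscore.
            Some(c) if c.is_alphabetic() || c == '_' => out.push_str("(?P<"),
            // Anything else (e.g. '=' or '!') is copied unchanged.
            _ => out.push_str("(?<"),
        }
        rest = after;
    }
    out.push_str(rest);
    out
}
```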
