
check-jsonschema's People

Contributors

6543, borda, dependabot[bot], djgoku, dkolepp, dsch, edgarrmondragon, electriquo, gionn, github-actions[bot], hugovk, innovate-invent, jean-michelbenoit, jrdnbradford, kianmeng, mondeja, nikolaik, noorul, pre-commit-ci[bot], s-weigand, sirosen, skwde, tgillbe, tpansino, willdasilva, yyuu

check-jsonschema's Issues

Add support for native JSON5 loading of schemas?

This would be really useful, because complex schemas often can only be documented properly in multiline comments. A $comment keyword was added in draft 7 of JSON Schema, but I find it useless for documentation, mainly because of the lack of support for structured content such as lists or code blocks.

The only problem I see in a possible implementation is how to handle the NaN and Infinity values of JSON5, which plain JSON cannot represent (null, by contrast, is already valid JSON). The safest option, I think, would be to raise an error if they are found.
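
One way to implement the "raise an error" option is to walk the parsed document and reject non-finite numbers. This is only a minimal sketch: it assumes that whichever JSON5 parser is chosen returns NaN and Infinity as Python floats, and the function name reject_non_finite is hypothetical.

```python
import math


def reject_non_finite(value, path="$"):
    """Raise ValueError if a parsed document contains NaN or +/-Infinity."""
    if isinstance(value, float) and not math.isfinite(value):
        raise ValueError(f"{path}: non-finite number {value!r} is not valid JSON")
    if isinstance(value, dict):
        for key, item in value.items():
            reject_non_finite(item, f"{path}.{key}")
    elif isinstance(value, list):
        for index, item in enumerate(value):
            reject_non_finite(item, f"{path}[{index}]")
    # Returning the value lets the check be chained after the parser call.
    return value
```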

Anyway, I'm not sure whether this is beyond the scope of this project.

Personal current workaround

I'm currently documenting schemas in separate files, but I'm planning to download the JSON5 files into a local folder, caching them and converting them to JSON, and to add that folder to a .gitignore file.

TypeError: expected string or bytes-like object

If the instance to be validated contains something like the following, check-jsonschema throws an exception.

$ cat negative_test/playbooks/vars/numeric-var-name.yml 
---
12: ... # invalid var name

Exception

$ python -m check_jsonschema --schemafile f/ansible-vars.json negative_test/playbooks/vars/numeric-var-name.yml
Traceback (most recent call last):
  File "/Users/ssbarnea/.pyenv/versions/3.10.3/lib/python3.10/runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/Users/ssbarnea/.pyenv/versions/3.10.3/lib/python3.10/runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "/Users/ssbarnea/.pyenv/versions/3.10.3/lib/python3.10/site-packages/check_jsonschema/__main__.py", line 3, in <module>
    main()
  File "/Users/ssbarnea/.pyenv/versions/3.10.3/lib/python3.10/site-packages/check_jsonschema/__init__.py", line 26, in main
    ret = checker.run()
  File "/Users/ssbarnea/.pyenv/versions/3.10.3/lib/python3.10/site-packages/check_jsonschema/checker.py", line 88, in run
    self._run()
  File "/Users/ssbarnea/.pyenv/versions/3.10.3/lib/python3.10/site-packages/check_jsonschema/checker.py", line 71, in _run
    errors = self._build_error_map()
  File "/Users/ssbarnea/.pyenv/versions/3.10.3/lib/python3.10/site-packages/check_jsonschema/checker.py", line 63, in _build_error_map
    for err in validator.iter_errors(doc):
  File "/Users/ssbarnea/.pyenv/versions/3.10.3/lib/python3.10/site-packages/jsonschema/validators.py", line 229, in iter_errors
    for error in errors:
  File "/Users/ssbarnea/.pyenv/versions/3.10.3/lib/python3.10/site-packages/jsonschema/_validators.py", line 368, in anyOf
    errs = list(validator.descend(instance, subschema, schema_path=index))
  File "/Users/ssbarnea/.pyenv/versions/3.10.3/lib/python3.10/site-packages/jsonschema/validators.py", line 245, in descend
    for error in self.evolve(schema=schema).iter_errors(instance):
  File "/Users/ssbarnea/.pyenv/versions/3.10.3/lib/python3.10/site-packages/jsonschema/validators.py", line 229, in iter_errors
    for error in errors:
  File "/Users/ssbarnea/.pyenv/versions/3.10.3/lib/python3.10/site-packages/jsonschema/_validators.py", line 42, in additionalProperties
    extras = set(find_additional_properties(instance, schema))
  File "/Users/ssbarnea/.pyenv/versions/3.10.3/lib/python3.10/site-packages/jsonschema/_utils.py", line 101, in find_additional_properties
    if patterns and re.search(patterns, property):
  File "/Users/ssbarnea/.pyenv/versions/3.10.3/lib/python3.10/re.py", line 200, in search
    return _compile(pattern, flags).search(string)
TypeError: expected string or bytes-like object
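
The traceback ends in re.search receiving the integer key 12 instead of a string. As a stopgap, a pre-check over the loaded document could report such keys before validation starts. This is a sketch only, and find_non_string_keys is a hypothetical helper, not part of check-jsonschema or jsonschema.

```python
def find_non_string_keys(doc, path="$"):
    """Yield (path, key) for every mapping key that is not a string."""
    if isinstance(doc, dict):
        for key, value in doc.items():
            if not isinstance(key, str):
                yield path, key
            yield from find_non_string_keys(value, f"{path}.{key}")
    elif isinstance(doc, list):
        for index, item in enumerate(doc):
            yield from find_non_string_keys(item, f"{path}[{index}]")


# A YAML loader turns the key in `12: ...` into the integer 12.
problems = list(find_non_string_keys({12: "value", "ok": {"nested": [{3.5: None}]}}))
```

Each reported pair names the containing mapping and the offending key, which could be turned into an ordinary validation error rather than a crash.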

Support running the main module using `python -m check_jsonschema`

It would be really nice if I was able to run the CLI using python3 -m check_jsonschema.

Currently (version 0.14.0), Python gives this error:

$ python3 -m check_jsonschema
/Users/opalsymes/.pyenv/versions/3.10.0/bin/python3: No module named check_jsonschema.__main__; 'check_jsonschema' is a package and cannot be directly executed

I tend to invoke tools this way so that they run with the correct version of Python (I have been burnt too many times when they did not).
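
For context, python -m <package> works when the package contains a __main__.py module. The sketch below demonstrates the mechanism with a throwaway package; the name demo_pkg and its main() function are hypothetical stand-ins, not the actual check_jsonschema layout.

```python
import os
import subprocess
import sys
import tempfile
from pathlib import Path


def run_demo_package():
    """Create a tiny package with a __main__.py and run it via `python -m`."""
    with tempfile.TemporaryDirectory() as tmp:
        pkg = Path(tmp) / "demo_pkg"
        pkg.mkdir()
        # The package exposes a main() entry point, as a CLI package might.
        (pkg / "__init__.py").write_text("def main():\n    print('hello from main')\n")
        # __main__.py is the module that `python -m demo_pkg` executes.
        (pkg / "__main__.py").write_text("from demo_pkg import main\n\nmain()\n")
        result = subprocess.run(
            [sys.executable, "-m", "demo_pkg"],
            env={**os.environ, "PYTHONPATH": tmp},
            capture_output=True,
            text=True,
            check=True,
        )
        return result.stdout.strip()
```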

Support templated expressions in Azure pipelines files

Some lines in a yml/json file may not adhere to a schema. (Azure DevOps preprocesses its yml pipelines before running its schema against the .yml files.)

Is it possible to just add a comment syntax to a line to avoid flagging the line as an error?

E.g.

 ---
  valid_yml
  valid_yml 
  invalid_yml # !!! Skip this during validation
  valid_yml

Mishandling of local references to other schemas

I ran the following command:

[prompt ]$ check-jsonschema --schemafile web/schemas/zdoc-heat-schema.json \
  tests/schemas/osp_heat_templates/expected_pass/test1.yaml

The web/schemas/zdoc-heat-schema.json schema file contained a reference to another schema:

      template:
        $ref: 'heat-schema.json#'

My understanding of JSON pointers (http://niem.github.io/json/reference/json-schema/references/) is as follows.
A JSON pointer takes the form A#B, in which:

  • A is the relative path from the current schema to a target schema. If A is empty, the reference is to a type or property in the same schema, an in-schema reference. Otherwise, the reference is to a different schema, a cross-schema reference.
  • B is the complete path from the root of the schema to a type or property in the schema. If # is not included or B is empty, the reference is to an entire schema.

So heat-schema.json should be resolved relative to web/schemas/zdoc-heat-schema.json.

The error I got was:

Traceback (most recent call last):
  File "/home/XXX/git/YYY/dle-hooks/venv/lib64/python3.8/site-packages/jsonschema/validators.py", line 774, in resolve_from_url
    document = self.store[url]
  File "/home/XXX/git/YYY/dle-hooks/venv/lib64/python3.8/site-packages/jsonschema/_utils.py", line 22, in __getitem__
    return self.store[self.normalize(uri)]
KeyError: 'file://./heat-schema.json'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/lib64/python3.8/urllib/request.py", line 1507, in open_local_file
    stats = os.stat(localfile)
FileNotFoundError: [Errno 2] No such file or directory: '/heat-schema.json'

The error seems to indicate that local schema references are not resolved relative to the location of the local target file.
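
The expected behaviour can be sketched with the standard library alone: a relative $ref is joined against the absolute URI of the schema that contains it. The base URI below is a hypothetical absolute location, and resolve_ref is an illustrative helper, not the resolver that jsonschema actually uses.

```python
from urllib.parse import urldefrag, urljoin


def resolve_ref(base_uri, ref):
    """Resolve a $ref against the URI of the schema that contains it."""
    target, fragment = urldefrag(ref)
    # An empty target means an in-schema reference; keep the base document.
    document = urljoin(base_uri, target) if target else base_uri
    return document, fragment


# Hypothetical location; in practice Path(schemafile).resolve().as_uri().
base = "file:///repo/web/schemas/zdoc-heat-schema.json"
print(resolve_ref(base, "heat-schema.json#"))
```

With an absolute base URI, the reference lands next to the referring schema instead of at the filesystem root.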

cli tool does not have a --version

check-jsonschema should accept --version and report its version, especially as this is a nice, clean way to assert that the tool is installed.

Running the tool with --help produces too much output for CI/CD usage but --version would be just right.
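
Since the CLI was built on argparse at the time, the built-in "version" action would be one way to do this. A minimal sketch follows; the VERSION value is a placeholder, and build_parser is a hypothetical name rather than the project's real code.

```python
import argparse

# Placeholder version string; a real package would read its own metadata.
VERSION = "0.0.0"


def build_parser():
    parser = argparse.ArgumentParser(prog="check-jsonschema")
    # The built-in "version" action prints the string and exits with status 0.
    parser.add_argument("--version", action="version", version=f"%(prog)s {VERSION}")
    return parser
```

Calling build_parser().parse_args(["--version"]) prints a single line and exits, which suits a CI "is it installed" check.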

Add or link to docs on how to write schema

Cool project! I'm quite new to schemas though. Perhaps the readme could be expanded with some instructions on how to write your own schema. If that's too involved, maybe link to another source?

Consider using `click` to handle CLI parsing

I've been building on argparse thus far and it's... fine. argparse is good when the application stays simple enough, but as it grows it gets a bit unwieldy. click might suit this better.

click provides tab-completion support + a better system for declaring new argument types.
It doesn't handle mutex options out of the box, but I've built a mutex validator on top of it before and it wasn't too hard.

pre-commit and cookiecutter Poetry auto-generated GHA

Using Cookiecutter Poetry, I get some GitHub Actions (GHA) workflows that include a uses: ./.github/workflows/setup-poetry-env step. This is fine on GHA, but pre-commit complains that:

Schema validation errors were encountered.
  .github/workflows/run-checks/action.yml::$: Additional properties are not allowed ('description', 'runs' were unexpected)
  .github/workflows/run-checks/action.yml::$: 'on' is a required property
  .github/workflows/run-checks/action.yml::$: 'jobs' is a required property
Schema validation errors were encountered.
  .github/workflows/setup-poetry-env/action.yml::$: Additional properties are not allowed ('description', 'inputs', 'runs' were unexpected)
  .github/workflows/setup-poetry-env/action.yml::$: 'on' is a required property
  .github/workflows/setup-poetry-env/action.yml::$: 'jobs' is a required property

Is there any way to exclude those files from pre-commit?

My pre-commit configuration is:

- repo: https://github.com/python-jsonschema/check-jsonschema
  rev: 0.16.1
  hooks:
  - id: check-jsonschema
    name: "Check GitHub Workflows"
    files: ^\.github/workflows/
    types: [yaml]
    args: ["--verbose", "--schemafile", "https://json.schemastore.org/github-workflow"]
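
pre-commit selects files with the files regex and drops any that match an exclude regex, so one option is to add an exclude pattern for the composite-action files. The sketch below only imitates that filtering to show which paths would be checked; the EXCLUDE pattern and the main.yml path are assumptions for illustration.

```python
import re

FILES = re.compile(r"^\.github/workflows/")
# Hypothetical `exclude:` value that skips composite-action definitions.
EXCLUDE = re.compile(r"^\.github/workflows/.*/action\.yml$")


def selected(path):
    """Mimic the files/exclude filtering: match `files`, then drop `exclude`."""
    return bool(FILES.search(path)) and not EXCLUDE.search(path)


paths = [
    ".github/workflows/main.yml",
    ".github/workflows/setup-poetry-env/action.yml",
    ".github/workflows/run-checks/action.yml",
]
print([p for p in paths if selected(p)])
```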

[BUG] UnicodeDecodeError: 'charmap' codec can't decode byte 0x90 in position 1571: character maps to <undefined>

System:
OS: Windows 10 x64-bit
python-jsonschema: 0.14.2
hooks: check-github-workflows

Issue:
Running hook id check-github-workflows through pre-commit gives the following error:

Traceback (most recent call last):
  File "C:\Program Files\Python38\lib\runpy.py", line 194, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "C:\Program Files\Python38\lib\runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "C:\Users\hendra11\.cache\pre-commit\repobbwlicnh\py_env-python3.8\Scripts\check-jsonschema.EXE\__main__.py", line 7, in <module>     
  File "C:\Users\hendra11\.cache\pre-commit\repobbwlicnh\py_env-python3.8\lib\site-packages\check_jsonschema\__init__.py", line 26, in main  
    ret = checker.run()
  File "C:\Users\hendra11\.cache\pre-commit\repobbwlicnh\py_env-python3.8\lib\site-packages\check_jsonschema\checker.py", line 88, in run    
    self._run()
  File "C:\Users\hendra11\.cache\pre-commit\repobbwlicnh\py_env-python3.8\lib\site-packages\check_jsonschema\checker.py", line 71, in _run   
    errors = self._build_error_map()
  File "C:\Users\hendra11\.cache\pre-commit\repobbwlicnh\py_env-python3.8\lib\site-packages\check_jsonschema\checker.py", line 61, in _build_error_map
    for filename, doc in self._instance_loader.iter_files():
  File "C:\Users\hendra11\.cache\pre-commit\repobbwlicnh\py_env-python3.8\lib\site-packages\check_jsonschema\loaders\instance\__init__.py", line 62, in iter_files
    data = loadfunc(fp)
  File "C:\Users\hendra11\.cache\pre-commit\repobbwlicnh\py_env-python3.8\lib\site-packages\ruamel\yaml\main.py", line 343, in load
    return constructor.get_single_data()
  File "C:\Users\hendra11\.cache\pre-commit\repobbwlicnh\py_env-python3.8\lib\site-packages\ruamel\yaml\constructor.py", line 111, in get_single_data
    node = self.composer.get_single_node()
  File "_ruamel_yaml.pyx", line 701, in _ruamel_yaml.CParser.get_single_node
  File "_ruamel_yaml.pyx", line 902, in _ruamel_yaml.CParser._parse_next_event
  File "_ruamel_yaml.pyx", line 911, in _ruamel_yaml.input_handler
  File "C:\Program Files\Python38\lib\encodings\cp1252.py", line 23, in decode
    return codecs.charmap_decode(input,self.errors,decoding_table)[0]
UnicodeDecodeError: 'charmap' codec can't decode byte 0x90 in position 1571: character maps to <undefined>
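
The last frames show the file being decoded with cp1252, the Windows locale default, rather than UTF-8. The sketch below reproduces the same failure on any platform and shows that an explicit encoding avoids it; decode_or_error is a hypothetical helper for the demonstration.

```python
# UTF-8 bytes for an emoji; the third byte is 0x90, which cp1252 leaves undefined.
data = "\U0001F40D".encode("utf-8")


def decode_or_error(raw, encoding):
    """Decode bytes, returning a short error string instead of raising."""
    try:
        return raw.decode(encoding)
    except UnicodeDecodeError as exc:
        return f"error: {exc.reason}"


print(decode_or_error(data, "cp1252"))  # fails on byte 0x90
print(decode_or_error(data, "utf-8") == "\U0001F40D")
```

When reading files, the equivalent would be open(path, encoding="utf-8") instead of relying on the platform default.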

Validation of Azure pipeline gives strange error

I am running the same check as in #411 and getting the following error:

 'PublishCodeCoverageResults@2' is not one of ['PowerShell@2']

docs: https://docs.microsoft.com/en-us/azure/devops/pipelines/tasks/test/publish-code-coverage-results
The command used was:

check-jsonschema .azure-pipelines/*.yml --schemafile "https://raw.githubusercontent.com/microsoft/azure-pipelines-vscode/v1.188.1/service-schema.json"

Transferring from microsoft/azure-pipelines-vscode#412

Schema errors are not rendered correctly after switch to iter_errors

Given the following configuration:

values.json:

{
  "$schema": "http://json-schema.org/draft-07/schema",
  "$defs": {
    "test": {
      "type": "string"
    }
  }
}

test.json:

{
  "$schema": "http://json-schema.org/draft-07/schema",
  "type": "object",
  "required": ["test"],
  "properties": {
    "test": {
      "$ref": "./values.json#/$defs/test"
    }
  }
}

test.yaml:

test:
  foo: bar

Getting the following error:

Schema validation errors were encountered.
Traceback (most recent call last):
  File "/Users/xxxx/.cache/pre-commit/repo7ycxoqhn/py_env-python3.9/bin/check-jsonschema", line 8, in <module>
    sys.exit(main())
  File "/Users/xxxx/.cache/pre-commit/repo7ycxoqhn/py_env-python3.9/lib/python3.9/site-packages/check_jsonschema/__init__.py", line 16, in main
    checker.run()
  File "/Users/xxxx/.cache/pre-commit/repo7ycxoqhn/py_env-python3.9/lib/python3.9/site-packages/check_jsonschema/checker.py", line 106, in run
    path = [str(x) for x in err.path] or ["<root>"]
AttributeError: 'list' object has no attribute 'path'

Add support for native TOML checking of instances

Hi, I was reading the thread python-jsonschema/jsonschema#582 and I'm very interested in native support for reading TOML instances. Do you still plan to add this, or would it only be added if other users ask for it? From what I see in the source code, it seems relatively easy to implement, so let me know if you want some help with it.

Thanks for the great work, cheers!

Syntax error halts validation of other files and prints backtrace

I am trying to implement check-jsonschema in our workflow, where we intend to use it as a git hook that will automatically validate all relevant files. Unfortunately, once one of the files has a syntax error (that will cause a JSONDecodeError), the entire checking process halts and no other files are validated. In addition, an entire stacktrace is printed.

This feels a bit user-unfriendly and makes the output of this tool less useful. I am not sure whether this is intended behaviour; but I would like to see the invalid file being skipped (with an error printed), without causing the entire program to crash. This does in fact seem to be the behaviour of the old CLI jsonschema.

Would it be possible to look into this?
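
The requested behaviour amounts to catching the parse error per file and carrying on. A minimal sketch follows, with a hypothetical load_instances helper that takes (name, text) pairs instead of reading files.

```python
import json


def load_instances(texts):
    """Parse each (name, text) pair; collect parse failures instead of aborting."""
    documents, failures = {}, {}
    for name, text in texts:
        try:
            documents[name] = json.loads(text)
        except json.JSONDecodeError as exc:
            # Record a one-line message and move on to the next file.
            failures[name] = f"{exc.msg} (line {exc.lineno}, column {exc.colno})"
    return documents, failures
```

The failures mapping can then be printed as short error lines, and the exit status set to non-zero, without a stack trace.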

vendor-schemas script needs to check validity better

As raised in #66, the GitLab CI schema was incorrect. It looks like the content was a 404 response, but it could also have been a 200 with "not found" as the body. I'll have to investigate a little.

Clearly, the vendor-schemas script needs better checking.
At a minimum:

  • make sure it rejects any non-200 response (can expand to 2xx or other codes in the future if necessary)
  • check that the downloaded content is valid JSON
  • (optional) check that the downloaded schema validates under its relevant metaschema
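
The first two checks could look roughly like the sketch below. check_vendored_schema is a hypothetical function that takes the status code and body of an already completed download; the optional metaschema check is left out.

```python
import json


def check_vendored_schema(status_code, body):
    """Return the parsed schema, or raise ValueError if the download looks wrong."""
    if status_code != 200:
        raise ValueError(f"unexpected HTTP status {status_code}")
    try:
        schema = json.loads(body)
    except json.JSONDecodeError as exc:
        raise ValueError(f"response body is not valid JSON: {exc.msg}") from exc
    if not isinstance(schema, (dict, bool)):
        # A JSON Schema document is an object (or a bare boolean in newer drafts).
        raise ValueError("response body is JSON but not a schema object")
    return schema
```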

Add generic hook for validating all known schemas

Looking at https://github.com/sirosen/check-jsonschema/blob/main/.pre-commit-hooks.yaml, I realise that the currently provided set of hooks does not scale well to real-life repositories, where you might easily have more than 10 files using schemas.

I would like to suggest a generic option that would detect all files that match known schemas and validate them, with an option to fall back to offline mode, so it would also work in locked-down environments like https://pre-commit.ci

As long as users follow the official file naming patterns, this should not be a problem, and we might avoid having to add and maintain tons of hook definitions.

The current set of builtin schemas is already good, but we can easily add more over time.

It may be worth remarking that the first hook definition from https://github.com/sirosen/check-jsonschema/blob/main/.pre-commit-hooks.yaml is broken, as it does not work when used bare and throws an error for each encountered file:

check-jsonschema: error: Either --schemafile or --builtin-schema must be provided

IMHO, what I describe here would allow this first hook to be used without having to specify any file pattern or schemas to be used.

While testing --builtin-schema I encountered two unexpected errors:

Error: builtin schema could not be loaded, no such schema
NoSuchSchemaError: no builtin schema named playbook.yml was found

My expectation was that a file that does not match a schema association pattern should simply not be validated, i.e. skipped.

Improper use of stdout and stderr

It seems that the check-jsonschema implementation does not follow the basic CLI guidelines (see https://clig.dev/#the-basics, the section on stdout and stderr).

  • validation errors that are found are expected program output, so they should go to stdout
  • no validation errors is a success, so no stdout output is expected
  • "Schema validation errors were encountered." and "ok -- validation done" count as messaging/logging, so they should go to stderr instead of stdout; tools usually even control such messages with verbosity/quiet flags.

Following these recommendations is essential, as other tools rely on processing the output and either ignore or silence stderr, which is never meant to be processed.

Based on my tests, it seems that the current behavior has the stdout and stderr usage swapped.

When I tried 2>/dev/null, I realised that there was nothing useful on stdout.

That is a relatively common mistake when implementing a linter, as the name "stderr" gives the impression that it is for errors. But when the goal of the tool is to identify problems/errors in the data it receives, those are not really the "stderr" kind of error.

You can think of it from another angle: finding errors in schema instances is the desired/normal/expected outcome, so there is no need to send them to stderr.

I hope that explains it. I can easily make a PR to fix this if you want.
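
A sketch of the split proposed in this issue: findings go to stdout, status messages go to stderr. The report function and its verbose parameter are hypothetical; the two message strings are the ones quoted above.

```python
import sys


def report(errors, verbose=False):
    """Print findings to stdout and status messages to stderr; return exit code."""
    if errors:
        print("Schema validation errors were encountered.", file=sys.stderr)
        for line in errors:
            print(line)  # findings are the program's output
        return 1
    if verbose:
        print("ok -- validation done", file=sys.stderr)
    return 0
```

With this layout, 2>/dev/null leaves only the findings on stdout, ready for another tool to process.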

Schema validation errors -> TypeError: argument of type 'int' is not iterable

Hook detail:

- repo: https://github.com/sirosen/check-jsonschema
  rev: 0.3.0
  hooks:
    - id: check-jsonschema
      language: python
      files: \.(json)$
      args: ["--schemafile", "./schemas/myschemafile.json"]

In check_jsonschema.py, if a schema validation error is encountered, the join() function throws an unexpected error while building the JSON path, around line 125.

Schema validation errors were encountered.
Traceback (most recent call last):
  File "/Users/myuser/.cache/pre-commit/repo_qktbzsh/py_env-python3.9/bin/check-jsonschema", line 8, in <module>
    sys.exit(main())
  File "/Users/myuser/.cache/pre-commit/repo_qktbzsh/py_env-python3.9/lib/python3.9/site-packages/check_jsonschema.py", line 127, in main
    ".".join(x if "." not in x else f'"{x}"' for x in err.path)
  File "/Users/myuser/.cache/pre-commit/repo_qktbzsh/py_env-python3.9/lib/python3.9/site-packages/check_jsonschema.py", line 127, in <genexpr>
    ".".join(x if "." not in x else f'"{x}"' for x in err.path)
TypeError: argument of type 'int' is not iterable

When printing err.path, it shows as a deque: the first element is the JSON element key name, and the second entry seems to be its depth in the file. The second entry is an int, which seems to be the root cause of the unexpected exception.

Potential fix:
By stringifying 'x' in the join() function, the error disappears and the log output shows what seems to be a more intelligible error message for end users:

Schema validation errors were encountered.
  myfile.json::jsonelement-name.9.myattribute: 'valueXYZ' is not one of ['A', 'B', 'C', 'D']
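
The potential fix can be sketched as follows. format_path is a hypothetical wrapper around the join() expression from the traceback, with each element converted to str first.

```python
from collections import deque


def format_path(path):
    """Join path elements, quoting any that contain a dot; ints are stringified."""
    parts = (str(x) for x in path)
    return ".".join(p if "." not in p else f'"{p}"' for p in parts)


print(format_path(deque(["jsonelement-name", 9, "myattribute"])))
```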

strict checking: fail when there are keys that are not in the schema

If I have a schema that forces my JSON files to contain keys a and b, it will fail if b is missing. However, if a and b are present alongside an additional key c, the program will not fail. Is it possible to add a (suggestion) --strict flag that does not allow additional keys?
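
For reference, JSON Schema itself can express this with the additionalProperties keyword set to false. The sketch below shows such a schema as a Python dict, plus a deliberately simplified stand-in for the check a validator performs; unexpected_keys is hypothetical and ignores patternProperties.

```python
schema = {
    "type": "object",
    "required": ["a", "b"],
    "properties": {"a": {}, "b": {}},
    # Standard JSON Schema keyword: reject keys not listed under "properties".
    "additionalProperties": False,
}


def unexpected_keys(instance, schema):
    """Simplified stand-in for a validator's additionalProperties check."""
    if schema.get("additionalProperties", True) is not False:
        return set()
    return set(instance) - set(schema.get("properties", {}))
```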

Errors shown on RefResolutionError are unclear

When referencing a specific remote schema within a local schema file the validator fails.

This only seems to happen on the https://json.schemastore.org/prometheus.rules.json schema.

To recreate:

{
  "$schema": "http://json-schema.org/draft-07/schema#",
  "type": "object",
  "properties": {
    "prometheusRules": {
      "$ref": "https://json.schemastore.org/prometheus.rules.json"
    }
  }
}

values.yaml:

prometheusRules:
  groups:
  - name: test
    rules:
    - alert: test
      expr: |
        absent(up{release="test"} == 1)
      for: 5m
      labels:
        severity: critical
      annotations:
        summary: test

$ pre-commit run check-jsonschema -a
Validate values.....................................................Failed
- hook id: check-jsonschema
- exit code: 1

Traceback (most recent call last):
  File "/Users/xxxx/.cache/pre-commit/repoj9m3e3g9/py_env-python3.9/lib/python3.9/site-packages/jsonschema/validators.py", line 774, in resolve_from_url
    document = self.store[url]
  File "/Users/xxxx/.cache/pre-commit/repoj9m3e3g9/py_env-python3.9/lib/python3.9/site-packages/jsonschema/_utils.py", line 22, in __getitem__
    return self.store[self.normalize(uri)]
KeyError: 'https://json.schemastore.org/prometheus.rules'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Users/xxxx/.cache/pre-commit/repoj9m3e3g9/py_env-python3.9/lib/python3.9/site-packages/jsonschema/validators.py", line 777, in resolve_from_url
    document = self.resolve_remote(url)
  File "/Users/xxxx/.cache/pre-commit/repoj9m3e3g9/py_env-python3.9/lib/python3.9/site-packages/jsonschema/validators.py", line 863, in resolve_remote
    with urlopen(uri) as url:
  File "/usr/local/Cellar/python@3.9/3.9.6/Frameworks/Python.framework/Versions/3.9/lib/python3.9/urllib/request.py", line 214, in urlopen
    return opener.open(url, data, timeout)
  File "/usr/local/Cellar/python@3.9/3.9.6/Frameworks/Python.framework/Versions/3.9/lib/python3.9/urllib/request.py", line 523, in open
    response = meth(req, response)
  File "/usr/local/Cellar/python@3.9/3.9.6/Frameworks/Python.framework/Versions/3.9/lib/python3.9/urllib/request.py", line 632, in http_response
    response = self.parent.error(
  File "/usr/local/Cellar/python@3.9/3.9.6/Frameworks/Python.framework/Versions/3.9/lib/python3.9/urllib/request.py", line 561, in error
    return self._call_chain(*args)
  File "/usr/local/Cellar/python@3.9/3.9.6/Frameworks/Python.framework/Versions/3.9/lib/python3.9/urllib/request.py", line 494, in _call_chain
    result = func(*args)
  File "/usr/local/Cellar/python@3.9/3.9.6/Frameworks/Python.framework/Versions/3.9/lib/python3.9/urllib/request.py", line 641, in http_error_default
    raise HTTPError(req.full_url, code, msg, hdrs, fp)
    raise HTTPError(req.full_url, code, msg, hdrs, fp)
urllib.error.HTTPError: HTTP Error 404: Not Found

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Users/xxxx/.cache/pre-commit/repoj9m3e3g9/py_env-python3.9/bin/check-jsonschema", line 8, in <module>
    sys.exit(main())
  File "/Users/xxxx/.cache/pre-commit/repoj9m3e3g9/py_env-python3.9/lib/python3.9/site-packages/check_jsonschema/__init__.py", line 14, in main
    checker.run()
  File "/Users/xxxx/.cache/pre-commit/repoj9m3e3g9/py_env-python3.9/lib/python3.9/site-packages/check_jsonschema/checker.py", line 51, in run
    validator.validate(instance=doc)
  File "/Users/xxxx/.cache/pre-commit/repoj9m3e3g9/py_env-python3.9/lib/python3.9/site-packages/jsonschema/validators.py", line 352, in validate
    for error in self.iter_errors(*args, **kwargs):
  File "/Users/xxxx/.cache/pre-commit/repoj9m3e3g9/py_env-python3.9/lib/python3.9/site-packages/jsonschema/validators.py", line 328, in iter_errors
    for error in errors:
  File "/Users/xxxx/.cache/pre-commit/repoj9m3e3g9/py_env-python3.9/lib/python3.9/site-packages/jsonschema/_validators.py", line 282, in properties
    for error in validator.descend(
  File "/Users/xxxx/.cache/pre-commit/repoj9m3e3g9/py_env-python3.9/lib/python3.9/site-packages/jsonschema/validators.py", line 344, in descend
    for error in self.iter_errors(instance, schema):
  File "/Users/xxxx/.cache/pre-commit/repoj9m3e3g9/py_env-python3.9/lib/python3.9/site-packages/jsonschema/validators.py", line 328, in iter_errors
    for error in errors:
  File "/Users/xxxx/.cache/pre-commit/repoj9m3e3g9/py_env-python3.9/lib/python3.9/site-packages/jsonschema/_validators.py", line 263, in ref
    for error in validator.descend(instance, resolved):
  File "/Users/xxxx/.cache/pre-commit/repoj9m3e3g9/py_env-python3.9/lib/python3.9/site-packages/jsonschema/validators.py", line 344, in descend
    for error in self.iter_errors(instance, schema):
  File "/Users/xxxx/.cache/pre-commit/repoj9m3e3g9/py_env-python3.9/lib/python3.9/site-packages/jsonschema/validators.py", line 328, in iter_errors
    for error in errors:
  File "/Users/xxxx/.cache/pre-commit/repoj9m3e3g9/py_env-python3.9/lib/python3.9/site-packages/jsonschema/_validators.py", line 282, in properties
    for error in validator.descend(
  File "/Users/xxxx/.cache/pre-commit/repoj9m3e3g9/py_env-python3.9/lib/python3.9/site-packages/jsonschema/validators.py", line 344, in descend
    for error in self.iter_errors(instance, schema):
  File "/Users/xxxx/.cache/pre-commit/repoj9m3e3g9/py_env-python3.9/lib/python3.9/site-packages/jsonschema/validators.py", line 328, in iter_errors
    for error in errors:
  File "/Users/xxxx/.cache/pre-commit/repoj9m3e3g9/py_env-python3.9/lib/python3.9/site-packages/jsonschema/_validators.py", line 81, in items
    for error in validator.descend(item, items, path=index):
  File "/Users/xxxx/.cache/pre-commit/repoj9m3e3g9/py_env-python3.9/lib/python3.9/site-packages/jsonschema/validators.py", line 344, in descend
    for error in self.iter_errors(instance, schema):
  File "/Users/xxxx/.cache/pre-commit/repoj9m3e3g9/py_env-python3.9/lib/python3.9/site-packages/jsonschema/validators.py", line 328, in iter_errors
    for error in errors:
  File "/Users/xxxx/.cache/pre-commit/repoj9m3e3g9/py_env-python3.9/lib/python3.9/site-packages/jsonschema/_validators.py", line 282, in properties
    for error in validator.descend(
  File "/Users/xxxx/.cache/pre-commit/repoj9m3e3g9/py_env-python3.9/lib/python3.9/site-packages/jsonschema/validators.py", line 344, in descend
    for error in self.iter_errors(instance, schema):
  File "/Users/xxxx/.cache/pre-commit/repoj9m3e3g9/py_env-python3.9/lib/python3.9/site-packages/jsonschema/validators.py", line 328, in iter_errors
    for error in errors:
  File "/Users/xxxx/.cache/pre-commit/repoj9m3e3g9/py_env-python3.9/lib/python3.9/site-packages/jsonschema/_validators.py", line 81, in items
    for error in validator.descend(item, items, path=index):
  File "/Users/xxxx/.cache/pre-commit/repoj9m3e3g9/py_env-python3.9/lib/python3.9/site-packages/jsonschema/validators.py", line 344, in descend
    for error in self.iter_errors(instance, schema):
  File "/Users/xxxx/.cache/pre-commit/repoj9m3e3g9/py_env-python3.9/lib/python3.9/site-packages/jsonschema/validators.py", line 328, in iter_errors
    for error in errors:
  File "/Users/xxxx/.cache/pre-commit/repoj9m3e3g9/py_env-python3.9/lib/python3.9/site-packages/jsonschema/_validators.py", line 337, in oneOf
    errs = list(validator.descend(instance, subschema, schema_path=index))
  File "/Users/xxxx/.cache/pre-commit/repoj9m3e3g9/py_env-python3.9/lib/python3.9/site-packages/jsonschema/validators.py", line 344, in descend
    for error in self.iter_errors(instance, schema):
  File "/Users/xxxx/.cache/pre-commit/repoj9m3e3g9/py_env-python3.9/lib/python3.9/site-packages/jsonschema/validators.py", line 328, in iter_errors
    for error in errors:
  File "/Users/xxxx/.cache/pre-commit/repoj9m3e3g9/py_env-python3.9/lib/python3.9/site-packages/jsonschema/_validators.py", line 259, in ref
    scope, resolved = validator.resolver.resolve(ref)
  File "/Users/xxxx/.cache/pre-commit/repoj9m3e3g9/py_env-python3.9/lib/python3.9/site-packages/jsonschema/validators.py", line 766, in resolve
    return url, self._remote_cache(url)
  File "/Users/xxxx/.cache/pre-commit/repoj9m3e3g9/py_env-python3.9/lib/python3.9/site-packages/jsonschema/validators.py", line 779, in resolve_from_url
    raise exceptions.RefResolutionError(exc)
jsonschema.exceptions.RefResolutionError: HTTP Error 404: Not Found
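
A clearer report could condense the chained exceptions into one line each instead of printing every frame. The sketch below walks the exception chain using only the standard library; summarize is a hypothetical helper, not the error printer that check-jsonschema ships.

```python
def summarize(exc):
    """Return one line per exception in the chain, outermost first."""
    lines = []
    seen = set()
    while exc is not None and id(exc) not in seen:
        seen.add(id(exc))
        lines.append(f"{type(exc).__name__}: {exc}")
        # Prefer an explicit `raise ... from`, else the implicit context.
        exc = exc.__cause__ or exc.__context__
    return lines
```

Applied to the failure above, this would give three short lines (RefResolutionError, HTTPError, KeyError) that name the URL that could not be fetched.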

execution fails

reproduce:

$ python3 -m venv venv
$ source venv/bin/activate
$ python --version
Python 3.9.5

$ pip install check-jsonschema -q
$ check-jsonschema --help
Traceback (most recent call last):
  File "/home/foo/tmp/venv/bin/check-jsonschema", line 5, in <module>
    from check_jsonschema import main
ImportError: cannot import name 'main' from 'check_jsonschema' (/home/foo/tmp/venv/lib/python3.9/site-packages/check_jsonschema/__init__.py)

Is it a bug?

It seems it can be fixed by patching venv/bin/check-jsonschema:

$ diff venv/bin/check-jsonschema venv/bin/check-jsonschema.orig
5c5
< from check_jsonschema.cli import main
---
> from check_jsonschema import main

$ check-jsonschema --help
usage: check-jsonschema [-h] --schemafile SCHEMAFILE [--cache-filename CACHE_FILENAME] instancefiles [instancefiles ...]

positional arguments:
  instancefiles         JSON or YAML files to check.

optional arguments:
  -h, --help            show this help message and exit
  --schemafile SCHEMAFILE
                        REQUIRED. The path to a file containing the jsonschema to use or an HTTP(S) URI for the schema. If a remote
                        file is used, it will be downloaded and cached locally based on mtime.
  --cache-filename CACHE_FILENAME
                        The name to use for caching a remote schema. Defaults to the last slash-delimited part of the URI.

Add homebrew formula tap

I wanted to use check-jsonschema as a command line tool, but there was no brew formula package.

Contributing to Homebrew/homebrew-core upstream failed because the GitHub repository wasn't notable enough (Homebrew/homebrew-core#105012).

So I made a Homebrew tap repository (sudosubin/homebrew-cask-jsonschema), but if it lived in the upstream organization (python-jsonschema), it might be better managed.

# before (sudosubin/homebrew-check-jsonschema)
brew tap sudosubin/check-jsonschema

# after (python-jsonschema/homebrew-check-jsonschema)
brew tap python-jsonschema/check-jsonschema

What do others think?

Referencing a YAML schema from a YAML schema fails

Hey there,

I have been trying to write a YAML schema referencing another YAML schema [1].

This results in:

Failure resolving $ref within schema

RefResolutionError: Expecting value: line 1 column 1 (char 0)
  in "/usr/local/lib/python3.8/dist-packages/check_jsonschema/checker.py", line 73
  >>> errors = self._build_error_map()

  caused by

  JSONDecodeError: Expecting value: line 1 column 1 (char 0)
    in "/usr/local/lib/python3.8/dist-packages/jsonschema/validators.py", line 816
    >>> document = self.resolve_remote(url)

    caused by

    StopIteration: 0
      in "/usr/lib/python3.8/json/decoder.py", line 353
      >>> obj, end = self.scan_once(s, idx)

      caused by

      KeyError: 'file:///path/to/B.schema.yml'
        in "/usr/local/lib/python3.8/dist-packages/jsonschema/validators.py", line 813
        >>> document = self.store[url]

It does make sense, as resolving the reference schema is handled by jsonschema, which has no support for YAML - took me a while to get there though :)
Hence I'm not sure whether much can be done in check-jsonschema (perhaps some documentation clarification and/or an error message hint)

[1] Something along the lines of the following - exact syntax might still be off as I'm still figuring things out ^_^

$schema: "https://json-schema.org/draft/2020-12/schema"
description: A schema
type: object

$ref: "B.schema.yml#properties"

required:
  - foo

$schema: "https://json-schema.org/draft/2020-12/schema"
description: B
type: object
properties:
  foo:
    type: string
  bar:
    type: integer

Switch from urllib to requests, enable retries

In the first version of the code, I thought avoiding requests would keep installs faster (fewer dependencies).

Now I want to add a very simple retry loop for the schema fetching, and the easiest way to do that is to use the requests retry mechanism.

The cost of another package is negligible vs the improvement to maintainability here.
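
For comparison, here is roughly what a hand-rolled retry loop looks like with the standard library alone. This is a sketch and deliberately not the requests retry mechanism the issue proposes; `fetch_with_retries` and its parameters are hypothetical names, and the `opener` argument exists only so the loop can be exercised without a network.

```python
import time
import urllib.error
import urllib.request


def fetch_with_retries(url, attempts=3, delay=0.5, opener=urllib.request.urlopen):
    """Fetch `url`, retrying on URLError with exponential backoff."""
    last_error = None
    for attempt in range(attempts):
        try:
            with opener(url) as conn:
                return conn.read()
        except urllib.error.URLError as err:
            last_error = err
            # Sleep before every retry, but not after the final attempt.
            if attempt < attempts - 1:
                time.sleep(delay * (2 ** attempt))
    raise last_error
```

Even this small loop has to track attempts, backoff, and which exceptions count as retryable, which is the maintainability cost the issue wants to hand off to a library.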

Check schemas embedded in files

There are multiple times when we (the JSON Schema team) would like to be able to validate JSON embedded in other files, such as markdown. (Specifically, JSON Schema examples in our docs and website.)

I imagine it would work by allowing the following:

  • Specify what files should be inspected
  • Identify which fragments of the files are JSON (This may need different approaches per format, potentially with a way to self declare the wish to be validated)
  • Process such fragments which self-identify which Schema they should be validated with (such as a standard meta-schema)
  • Optionally take an argument/s which specifies a default JSON Schema to be used for all fragments within a given file/glob of files
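
The second step (identifying JSON fragments) could look something like the sketch below for Markdown. It assumes fragments are marked as fenced code blocks tagged `json`; the function name and that marking convention are assumptions, not an existing check-jsonschema feature.

```python
import json
import re

FENCE = "`" * 3  # a Markdown code fence, built here to avoid nesting fences

# Match a fence tagged "json" and capture everything up to the closing fence.
JSON_BLOCK = re.compile(
    re.escape(FENCE) + r"json[ \t]*\n(.*?)" + re.escape(FENCE), re.DOTALL
)


def iter_json_fragments(markdown_text):
    """Yield each fenced JSON block in a Markdown string as parsed data."""
    for match in JSON_BLOCK.finditer(markdown_text):
        yield json.loads(match.group(1))
```

Each yielded document could then be validated against a default schema, or against whatever schema the fragment declares via `$schema`.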

Please do reach out to me on Slack or @Julian. I likely won't see follow-up comments here directly as I've declared GH notification bankruptcy. Thanks.

invalid validation error

I am unsure whether this is an issue with check-jsonschema or with jsonschema itself, since the schema file seems OK to me. If you could point out the source of the issue, I can move the issue there.

Given a Renovate configuration file renovate.json

$ cat renovate.json
{
  "regexManagers": [
    {
      "fileMatch": ["^Dockerfile$"],
      "matchStrings": ["ENV YARN_VERSION=(?<currentValue>.*?)\n"],
      "depNameTemplate": "yarn",
      "datasourceTemplate": "npm"
    }
  ]
}

When running check-jsonschema it reports an error

$ check-jsonschema --schemafile https://docs.renovatebot.com/renovate-schema.json renovate.json
Schema validation errors were encountered.
  ./renovate.json::regexManagers.0.matchStrings.0: 'ENV YARN_VERSION=(?<currentValue>.*?)\n' is not a 'regex'

But the error does not happen when running jsonschema

import json
import jsonschema

with open("./renovate-schema.json") as f:
  schema = json.load(f)
with open("./renovate.json") as f:
  doc = json.load(f)  
try:
  jsonschema.validate(instance=doc, schema=schema)
except Exception as err:
  print(err)

These are the details of my environment

$ python --version
Python 3.9.9

$ pip list | grep jsonschema
check-jsonschema 0.7.1
jsonschema       3.2.0

Azure Pipelines boolean regex error

The Azure Pipelines schema lists a definition for boolean objects, which is essentially a regex for a series of "boolean-like" inputs.
In my pipeline, I'm using a YAML boolean value:

steps: 
  - script: echo "Hello world"
    continueOnError: true

This raises the following error:

$ check-jsonschema --builtin-schema vendor.azure-pipelines --data-transform azure-pipelines azure-pipelines.yml
  azure-pipelines.yml::$: {'steps': [{'script': 'echo "Hello world"', 'continueOnError': True}]} is not valid under any of the given schemas
  Underlying errors caused this.
  Best Match:
    $: {'steps': [{'script': 'echo "Hello world"\n', 'continueOnError': True}]} is not of type 'string'

If I change the continueOnError value to explicitly be a string, this passes:

steps: 
  - script: echo "Hello world"
    continueOnError: "true"

It seems like Azure and the VSCode pipeline validation tool are casting true as "true" which passes the regex, but check-jsonschema converts this to the Python bool True which fails to match the regex. Given the Azure documentation contains a number of examples where the boolean true is used, this causes valid pipelines to fail with check-jsonschema.

I'm not sure about the other vendors, but the "boolean-as-a-string" definition is potentially an Azure-only issue.
But I'm not sure why the error would occur at all, because a json.dumps on a {"continueOnError": True} dict results in the lowercase {"continueOnError": true} JSON object, so perhaps this is an issue with the jsonschema package.
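
One likely explanation: validation runs on the parsed Python data, not on serialized JSON text, so `True` is checked as a JSON boolean, and a boolean can never satisfy a schema that requires a string matching a pattern. The sketch below shows that distinction and one possible pre-validation transform. `stringify_booleans` is a hypothetical helper, not the actual `azure-pipelines` data transform.

```python
import json


def stringify_booleans(data):
    """Recursively replace booleans with the strings "true" / "false"."""
    if isinstance(data, bool):
        return "true" if data else "false"
    if isinstance(data, dict):
        return {key: stringify_booleans(value) for key, value in data.items()}
    if isinstance(data, list):
        return [stringify_booleans(item) for item in data]
    return data


doc = {"steps": [{"script": 'echo "Hello world"', "continueOnError": True}]}
print(json.dumps({"continueOnError": True}))  # serialized as a JSON boolean, not a string
print(stringify_booleans(doc))
```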

Support JSON5 instance files

JSON5 is a format used by Renovate (see #31). It's a refinement of JSON similar to many less mainstream JSON+comments formats.

There are two usable python JSON5 parsers:

  • pyjson5 is a parser written in cython
  • json5 is a pure python parser

Both have a reasonably sized, but small, userbase.
json5 will be much slower, but I'm somewhat concerned about adding a requirement for a cython tool for a few reasons (most notably: the possibility of a failed install from the sdist).


One approach would be to support both libraries -- since we only need the load function from each -- and not make a decision at all.

I could add json5 as a requirement, and if you pull in additional_dependencies: [pyjson5], prefer that implementation for its speed.
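
A sketch of the "support both, prefer the fast one" idea follows. It assumes both packages expose a `load` function, as the issue states; `pick_json5_load` is a hypothetical name.

```python
import importlib


def pick_json5_load(candidates=("pyjson5", "json5")):
    """Return `load` from the first importable module, or None if none import."""
    for name in candidates:
        try:
            module = importlib.import_module(name)
        except ImportError:
            continue  # not installed; try the next candidate
        return module.load
    return None


load = pick_json5_load()
if load is None:
    print("no JSON5 parser installed")
```

Ordering the candidates with `pyjson5` first means it is used whenever a user has installed it, with `json5` as the fallback.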

pre-commit fails when cache is missing

With 10 files checked by the pre-commit hook, a fresh run (cache directory does not exist) fails, although the cache does get populated with the schemas. Running pre-commit again, everything passes.

while trying to debug, i noticed that pre-commit calls the 10 files in 2 batches and thus invokes check-jsonschema twice.
i can reproduce it when using pre-commit, but it does not happen when using check-jsonschema directly.

how to reproduce:

  1. checkout the branch
  2. delete jsonschema_validate cache directory
  3. run pre-commit run --all-files --verbose

regex format validation fails on non-python regexes

Format validation will fail on "regex" fields which use syntaxes which are not valid in python, but are valid in other regex engines.

In the originating use-case for this issue, (?<foo>) was used as a name-capturing group. The trouble here is that (?< has a special meaning in python regular expressions which is not common to all regex engines. In general, (? ... ) expressions are a space of great variation between engines.

I'm not sure how best to handle this case. regex validation in jsonschema is done with python's regex engine with re.compile, but that is not strictly compatible with ECMA 262 regex syntax (which is the point of reference for JSON Schema).

JSON Schema itself notes that strict adherence to ECMA 262 is not necessarily feasible for all implementations, and recommends that schema authors use a safe subset of regex syntaxes.

A couple of years ago, jsonschema worked on supporting JS syntax, but it had to be backed out due to issues. The resulting js-regex package appears to be abandoned.

Roughly, I see a few options:

  • try to support this syntax with customized regex validation (HARD)
  • wrap the jsonschema regex format validator to look for (? followed by any character other than ! or =, and disable the check in those cases only
  • remove "regex" from the values of format which are supported/checked by check-jsonschema
  • add a flag to disable certain formats, e.g. --disable-formats "regex,date"
  • expect users encountering this issue to disable format checks altogether
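
The second option might look roughly like the sketch below. It is a narrower variant than the one described: it skips the check only for `(?<` followed by something other than `=` or `!`, which is how a JS-style named group differs from a Python lookbehind. `check_regex_format` is a hypothetical name, not the jsonschema format checker itself.

```python
import re

# "(?<name>" is a named group in ECMA 262; in Python, "(?<" must be
# followed by "=" or "!" (lookbehind), so anything else fails to compile.
_JS_NAMED_GROUP = re.compile(r"\(\?<(?![=!])")


def check_regex_format(value: str) -> bool:
    """Return True if `value` looks like a usable regex."""
    if _JS_NAMED_GROUP.search(value):
        return True  # cannot judge with Python's engine; skip the check
    try:
        re.compile(value)
    except re.error:
        return False
    return True
```

The heuristic also fires on an escaped literal such as `\(?<`, so it trades some strictness for fewer false failures.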

Originating Comment

I've tested the new version and it broke our pre-commit schema validation:

- repo: https://github.com/sirosen/check-jsonschema
  rev: 0.6.0
  hooks:
  - id: check-jsonschema
    name: Validate Renovate
    files: ^\.github/renovate\.json
    types:
    - json
    args:
    - --schemafile
    - https://docs.renovatebot.com/renovate-schema.json

renovate.json:

{
  "extends": [
    "config:base"
  ],
  "regexManagers": [
    {
      "fileMatch": [
        "(^|/)\\.pre-commit-config\\.yaml$"
      ],
      "matchStrings": [
        "\\nminimum_pre_commit_version: (?<currentValue>.*?)\\n"
      ],
      "depNameTemplate": "pre-commit",
      "datasourceTemplate": "pypi"
    },
    {
      "fileMatch": [
        "(^|/)\\.pre-commit-config\\.yaml$"
      ],
      "matchStrings": [
        "\\n\\s*entry: (?<depName>[^:]+):(?<currentValue>\\S+)"
      ],
      "datasourceTemplate": "docker"
    }
  ]
}

Error:

Schema validation errors were encountered.
  .github/renovate.json::regexManagers.0.matchStrings.0: '\\nminimum_pre_commit_version: (?<currentValue>.*?)\\n' is not a 'regex'

Originally posted by @dudicoco in #19 (comment)

Implement format checker

Would be great to validate formats. Here's an example of how to use the jsonschema python library to validate a date:

from jsonschema import FormatChecker, validate

schema = {
    "title": "Athlete",
    "type": "object",
    "properties": {"birthday": {"type": "string", "format": "date"}},
}

validate({"birthday": "xxx"}, schema, format_checker=FormatChecker())  # raises jsonschema.exceptions.ValidationError
validate({"birthday": "2020-10-20"}, schema, format_checker=FormatChecker())  # works fine

Setup a readthedocs site for documentation

Initially, I could fit all of the doc in the README. I wanted to keep it that way, but it's already too long.
If I add a config file to allow the pre-commit hook to track schemas and instances for changes, that behavior would need extensive doc.

A doc site built with sphinx will do nicely.

run against JSON files when only schema has changed

This may well not be possible, but it doesn't hurt to ask.

I have a repo containing JSON files as well as a schemas for those JSON files.
When I change a JSON file, check-jsonschema diligently checks those JSON files.
However, when I change a schema file, check-jsonschema isn't checking whether my JSON files still match the (now updated) schema.

When I add the always_run: yes toggle, it complains about there not being any instancefiles. I guess this means that git does not provide access to files that are not part of the commit. But perhaps there's a way?

!reference tag of .gitlab-ci.yml not supported?

I can't get custom YAML tags to work in check-jsonschema. In particular, the GitLab CI schema defines a custom !reference tag that raises an uncaught exception when attempting to validate it with check-jsonschema. Here's an example .gitlab-ci.yml (taken from the GitLab docs):

include:
  - local: setup.yml

.teardown:
  after_script:
    - echo deleting environment

test:
  script:
    - !reference [.setup, script]
    - echo running my own command
  after_script:
    - !reference [.teardown, after_script]

Validating it throws an exception:

$ check-jsonschema --builtin-schema vendor.gitlab-ci .gitlab-ci.yml
Traceback (most recent call last):
  File "/usr/local/bin/check-jsonschema", line 8, in <module>
    sys.exit(main())
  File "/usr/local/lib/python3.10/site-packages/click/core.py", line 1130, in __call__
    return self.main(*args, **kwargs)
  File "/usr/local/lib/python3.10/site-packages/click/core.py", line 1055, in main
    rv = self.invoke(ctx)
  File "/usr/local/lib/python3.10/site-packages/click/core.py", line 1404, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/usr/local/lib/python3.10/site-packages/click/core.py", line 760, in invoke
    return __callback(*args, **kwargs)
  File "/usr/local/lib/python3.10/site-packages/click/decorators.py", line 26, in new_func
    return f(get_current_context(), *args, **kwargs)
  File "/usr/local/lib/python3.10/site-packages/check_jsonschema/cli.py", line 269, in main
    execute(args)
  File "/usr/local/lib/python3.10/site-packages/check_jsonschema/cli.py", line 316, in execute
    ret = checker.run()
  File "/usr/local/lib/python3.10/site-packages/check_jsonschema/checker.py", line 88, in run
    self._run()
  File "/usr/local/lib/python3.10/site-packages/check_jsonschema/checker.py", line 74, in _run
    errors = self._build_error_map()
  File "/usr/local/lib/python3.10/site-packages/check_jsonschema/checker.py", line 64, in _build_error_map
    for filename, doc in self._instance_loader.iter_files():
  File "/usr/local/lib/python3.10/site-packages/check_jsonschema/loaders/instance/__init__.py", line 61, in iter_files
    data = loadfunc(fp)
  File "/usr/local/lib/python3.10/site-packages/check_jsonschema/loaders/instance/yaml.py", line 26, in load
    data = _yaml.load(stream)
  File "/usr/local/lib/python3.10/site-packages/ruamel/yaml/main.py", line 434, in load
    return constructor.get_single_data()
  File "/usr/local/lib/python3.10/site-packages/ruamel/yaml/constructor.py", line 121, in get_single_data
    return self.construct_document(node)
  File "/usr/local/lib/python3.10/site-packages/ruamel/yaml/constructor.py", line 131, in construct_document
    for _dummy in generator:
  File "/usr/local/lib/python3.10/site-packages/ruamel/yaml/constructor.py", line 668, in construct_yaml_seq
    data.extend(self.construct_sequence(node))
  File "/usr/local/lib/python3.10/site-packages/ruamel/yaml/constructor.py", line 225, in construct_sequence
    return [self.construct_object(child, deep=deep) for child in node.value]
  File "/usr/local/lib/python3.10/site-packages/ruamel/yaml/constructor.py", line 225, in <listcomp>
    return [self.construct_object(child, deep=deep) for child in node.value]
  File "/usr/local/lib/python3.10/site-packages/ruamel/yaml/constructor.py", line 154, in construct_object
    data = self.construct_non_recursive_object(node)
  File "/usr/local/lib/python3.10/site-packages/ruamel/yaml/constructor.py", line 189, in construct_non_recursive_object
    data = constructor(self, node)
  File "/usr/local/lib/python3.10/site-packages/ruamel/yaml/constructor.py", line 690, in construct_undefined
    raise ConstructorError(
ruamel.yaml.constructor.ConstructorError: could not determine a constructor for the tag '!reference'
  in ".gitlab-ci.yml", line 10, column 7

This is against check-jsonschema 0.16.1 and jsonschema 4.6.0.

The GitLab documentation says to add a custom tag to the validator, but I'm not sure how to do that with check-jsonschema / jsonschema. Since the web editor of GitLab has supported it since GitLab 15.1 (issue / MR), I was hoping check-jsonschema would be able to support it as well.

Unable to download new redirected schema from SchemaStore.

There is a problem with SchemaStore PR SchemaStore/schemastore#2040. I don't know what the real cause of the problem is.
The JSON schema is removed and the catalog list is updated with the new reference. And from what I understand, this plugin would not use the new redirected JSON schema. Can you look at this?
I assume that when the catalog is updated with a newer data timestamp than the JSON schema cache time, it will download a new JSON schema from the new updated catalog URL link.

Unable to control terminal colors

I looked for the various common methods of controlling the use of ANSI colors in output but found none that worked or was documented.

As I am trying to use this tool from code, I have no need for ANSI escapes.

None helped: PY_COLORS=0, NO_COLOR=1, TERM=dumb.

check-github-workflows fails when run in pre-commit.ci

Hello,

Thanks for this very nice hook.

I'm using your hook in this .pre-commit-config.yaml file:

# See https://pre-commit.com for more information
# See https://pre-commit.com/hooks.html for more hooks
fail_fast: false

repos:
- repo: https://github.com/pre-commit/pre-commit-hooks
  rev: v4.0.1
  hooks:
    - id: check-added-large-files
      args: ['--maxkb=500']
    - id: check-case-conflict
    - id: check-json
    - id: check-merge-conflict
    - id: check-shebang-scripts-are-executable
    - id: check-symlinks
    - id: check-toml
    - id: check-vcs-permalinks
    - id: check-xml
    - id: check-yaml
    - id: destroyed-symlinks
    - id: detect-private-key
    - id: end-of-file-fixer
    - id: fix-byte-order-marker
    - id: forbid-new-submodules
    - id: mixed-line-ending
      args: ['--fix=lf']
    - id: trailing-whitespace
      args: [--markdown-linebreak-ext=md]

- repo: https://github.com/codespell-project/codespell
  rev: v2.1.0
  hooks:
    - id: codespell # Spellchecker

- repo: https://github.com/Kr4is/cmake-format-precommit.git
  rev: v0.6.14
  hooks:
    - id: cmake-format
      args: ['--config=.cmake-format.yaml']
    - id: cmake-lint
      args: ['--config=.cmake-format.yaml']

- repo: https://github.com/pocc/pre-commit-hooks
  rev: v1.3.4
  hooks:
    - id: clang-format
    - id: clang-tidy
    - id: oclint
    - id: uncrustify
    - id: cppcheck
    - id: cpplint
    - id: include-what-you-use

- repo: https://github.com/sirosen/check-jsonschema
  rev: 0.6.0
  hooks:
    - id: check-github-workflows

- repo: https://github.com/jackdewinter/pymarkdown
  rev: 0.9.2
  hooks:
    - id: pymarkdown
      args: ['--config=.pymarkdown.json','scan']

And the pre-commit action crashes with these errors:

Validate GitHub Workflows................................................Failed
- hook id: check-github-workflows
- exit code: 1

Traceback (most recent call last):
  File "/usr/lib/python3.8/urllib/request.py", line 1354, in do_open
    h.request(req.get_method(), req.selector, req.data, headers,
  File "/usr/lib/python3.8/http/client.py", line 1252, in request
    self._send_request(method, url, body, headers, encode_chunked)
  File "/usr/lib/python3.8/http/client.py", line 1298, in _send_request
    self.endheaders(body, encode_chunked=encode_chunked)
  File "/usr/lib/python3.8/http/client.py", line 1247, in endheaders
    self._send_output(message_body, encode_chunked=encode_chunked)
  File "/usr/lib/python3.8/http/client.py", line 1007, in _send_output
    self.send(msg)
  File "/usr/lib/python3.8/http/client.py", line 947, in send
    self.connect()
  File "/usr/lib/python3.8/http/client.py", line 1414, in connect
    super().connect()
  File "/usr/lib/python3.8/http/client.py", line 918, in connect
    self.sock = self._create_connection(
  File "/usr/lib/python3.8/socket.py", line 787, in create_connection
    for res in getaddrinfo(host, port, 0, SOCK_STREAM):
  File "/usr/lib/python3.8/socket.py", line 918, in getaddrinfo
    for res in _socket.getaddrinfo(host, port, family, type, proto, flags):
socket.gaierror: [Errno -3] Temporary failure in name resolution

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/pc/clone/Z3_cC7MdQPqRmRExP-ygmA/py_env-python3/bin/check-jsonschema", line 8, in <module>
    sys.exit(main())
  File "/pc/clone/Z3_cC7MdQPqRmRExP-ygmA/py_env-python3/lib/python3.8/site-packages/check_jsonschema/__init__.py", line 15, in main
    checker.run()
  File "/pc/clone/Z3_cC7MdQPqRmRExP-ygmA/py_env-python3/lib/python3.8/site-packages/check_jsonschema/checker.py", line 47, in run
    validator = self.get_validator()
  File "/pc/clone/Z3_cC7MdQPqRmRExP-ygmA/py_env-python3/lib/python3.8/site-packages/check_jsonschema/checker.py", line 40, in get_validator
    return schema_loader.get_validator()
  File "/pc/clone/Z3_cC7MdQPqRmRExP-ygmA/py_env-python3/lib/python3.8/site-packages/check_jsonschema/loaders/schema.py", line 96, in get_validator
    schema = self.reader.read_schema()
  File "/pc/clone/Z3_cC7MdQPqRmRExP-ygmA/py_env-python3/lib/python3.8/site-packages/check_jsonschema/loaders/schema.py", line 53, in read_schema
    with self.downloader.open() as fp:
  File "/usr/lib/python3.8/contextlib.py", line 113, in __enter__
    return next(self.gen)
  File "/pc/clone/Z3_cC7MdQPqRmRExP-ygmA/py_env-python3/lib/python3.8/site-packages/check_jsonschema/cachedownloader.py", line 106, in open
    cached_file = self._download()
  File "/pc/clone/Z3_cC7MdQPqRmRExP-ygmA/py_env-python3/lib/python3.8/site-packages/check_jsonschema/cachedownloader.py", line 92, in _download
    with self._urlopen() as conn:
  File "/usr/lib/python3.8/contextlib.py", line 113, in __enter__
    return next(self.gen)
  File "/pc/clone/Z3_cC7MdQPqRmRExP-ygmA/py_env-python3/lib/python3.8/site-packages/check_jsonschema/cachedownloader.py", line 75, in _urlopen
    with urllib.request.urlopen(self._file_url) as conn:
  File "/usr/lib/python3.8/urllib/request.py", line 222, in urlopen
    return opener.open(url, data, timeout)
  File "/usr/lib/python3.8/urllib/request.py", line 525, in open
    response = self._open(req, data)
  File "/usr/lib/python3.8/urllib/request.py", line 542, in _open
    result = self._call_chain(self.handle_open, protocol, protocol +
  File "/usr/lib/python3.8/urllib/request.py", line 502, in _call_chain
    result = func(*args)
  File "/usr/lib/python3.8/urllib/request.py", line 1397, in https_open
    return self.do_open(http.client.HTTPSConnection, req,
  File "/usr/lib/python3.8/urllib/request.py", line 1357, in do_open
    raise URLError(err)
urllib.error.URLError: <urlopen error [Errno -3] Temporary failure in name resolution>

Am I doing something wrong (forgetting some parameters or so)?

Thanks

github actions: partially interpolated values throws off checking

an example which fails:

on:
  push:
jobs:
  macos:
    strategy:
      matrix:
        include:
          - macos-version: "10.15"
          - macos-version: "11"
    runs-on: macos-${{ matrix.macos-version }}
    steps:
      - uses: actions/checkout@v2

this fails with:

$ check-jsonschema --builtin-schema vendor.github-workflows build_library.yml 
Schema validation errors were encountered.
  build_library.yml::$.jobs.macos: {'strategy': {'matrix': {'include': [{'macos-version': '10.15'}, {'macos-version': '11'}]}}, 'runs-on': 'macos-${{ matrix.macos-version }}', 'steps': [{'uses': 'actions/checkout@v2'}]} is not valid under any of the given schemas
  Underlying errors caused this.
  Best Match:
    $.jobs.macos: 'uses' is a required property

adjusting this slightly causes it to pass:

$ git diff --no-index build_library{,2}.yml
diff --git a/build_library.yml b/build_library2.yml
index 431f047..b0542f2 100644
--- a/build_library.yml
+++ b/build_library2.yml
@@ -5,8 +5,8 @@ jobs:
     strategy:
       matrix:
         include:
-          - macos-version: "10.15"
-          - macos-version: "11"
-    runs-on: macos-${{ matrix.macos-version }}
+          - macos-version: "macos-10.15"
+          - macos-version: "macos-11"
+    runs-on: ${{ matrix.macos-version }}
     steps:
       - uses: actions/checkout@v2

the error message is also a bit unfortunate -- but caused by the "reusable workflow" union -- removing the union gives a nicer message but presumably breaks reusable workflow checking:

$ check-jsonschema --builtin-schema vendor.github-workflows build_library.yml 
Schema validation errors were encountered.
  build_library.yml::$.jobs.macos: {'strategy': {'matrix': {'include': [{'macos-version': '10.15'}, {'macos-version': '11'}]}}, 'runs-on': 'macos-${{ matrix.macos-version }}', 'steps': [{'uses': 'actions/checkout@v2'}]} is not valid under any of the given schemas
  Underlying errors caused this.
  Best Match:
    $.jobs.macos.runs-on: 'macos-${{ matrix.macos-version }}' is not one of ['macos-10.15', 'macos-11', 'macos-12', 'macos-latest', 'self-hosted', 'ubuntu-18.04', 'ubuntu-20.04', 'ubuntu-22.04', 'ubuntu-latest', 'windows-2016', 'windows-2019', 'windows-2022', 'windows-latest']

Always use vendored schemas for hooks

It's possible today for a hook to pass when the network is available, but fail when it is not. The result in pre-commit.ci could be especially confusing.

The failover behavior can be removed if the hooks only use the vendored schemata.

This change will increase the pressure to do regular releases, to keep up with schemastore. I'll also look at setting up automated PRs via GH Actions.

Validation of dates inside yaml files fails

Properties with format "date-time" and "date" do not work with unquoted dates which adhere to the tag:yaml.org,2002:timestamp format.

Schema extract:

        "offset_datetime_1": { "type": "string", "format": "date-time" },
        "offset_datetime_2": { "type": "string", "format": "date-time" },
        "offset_datetime_3": { "type": "string", "format": "date-time" },
        "offset_datetime_4": { "type": "string", "format": "date-time" },
        "offset_datetime_5": { "type": "string", "format": "date-time" },
        "offset_datetime_6": { "type": "string", "format": "date-time" },
        "local_datetime_1": { "type": "string", "format": "date-time" },
        "local_datetime_2": { "type": "string", "format": "date-time" },
        "local_datetime_3": { "type": "string", "format": "date-time" },
        "local_datetime_4": { "type": "string", "format": "date-time" },
        "local_date_1": { "type": "string", "format": "date" },
        "local_date_2": { "type": "string", "format": "date" },
        "local_time_1": { "type": "string", "format": "time" },
        "local_time_2": { "type": "string", "format": "time" },
        "local_time_3": { "type": "string", "format": "time" },
        "local_time_4": { "type": "string", "format": "time" },

Test file extract:

  offset_datetime_1: 1979-05-27T07:32:00Z
  offset_datetime_2: 1979-05-27T00:32:00-07:00
  offset_datetime_3: 1979-05-27T00:32:00.999999-07:00
  offset_datetime_4: '1979-05-27T07:32:00Z'
  offset_datetime_5: '1979-05-27T00:32:00-07:00'
  offset_datetime_6: '1979-05-27T00:32:00.999999-07:00'

  naive_datetime_1: 1979-05-27T07:32:00
  local_datetime_2: 1979-05-27T00:32:00.999999

  local_date_1: 1979-05-27
  local_date_2: '1979-05-27'

  local_time_1: 07:32:00
  local_time_2: 00:32:00.999999

Schema validation errors were encountered.
  instance.yaml::$.object.offset_datetime_1: datetime.datetime(1979, 5, 27, 7, 32) is not of type 'string'
  instance.yaml::$.object.offset_datetime_2: datetime.datetime(1979, 5, 27, 7, 32) is not of type 'string'
  instance.yaml::$.object.offset_datetime_3: datetime.datetime(1979, 5, 27, 7, 32, 0, 999999) is not of type 'string'
  instance.yaml::$.object.local_datetime_2: datetime.datetime(1979, 5, 27, 0, 32, 0, 999999) is not of type 'string'
  instance.yaml::$.object.local_date_1: datetime.date(1979, 5, 27) is not of type 'string'
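
The errors above arise because the YAML parser turns unquoted timestamps into `datetime` objects before validation, so `"type": "string"` fails. One possible workaround is to convert them back to ISO 8601 strings first. This is a sketch only; `stringify_dates` is a hypothetical helper, not something check-jsonschema provides.

```python
import datetime


def stringify_dates(data):
    """Recursively replace date, datetime and time objects with ISO strings."""
    # datetime.datetime subclasses datetime.date, so this covers both.
    if isinstance(data, (datetime.date, datetime.time)):
        return data.isoformat()
    if isinstance(data, dict):
        return {key: stringify_dates(value) for key, value in data.items()}
    if isinstance(data, list):
        return [stringify_dates(item) for item in data]
    return data


print(stringify_dates({"local_date_1": datetime.date(1979, 5, 27)}))
```

Note that the parsed values shown in the errors are naive datetimes, so the original UTC offset is already gone by this point and the resulting string may still fail a strict "date-time" check.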

pre-commit hook argument for schemafile doesn't support ~/

If we have a schemafile in the user's home directory, I can pass ~/schemafile.json when invoking check-jsonschema from the command line without error, but if I pass args: ["--schemafile", "~/schemafile.json"] to the pre-commit hook, I get a FileNotFoundError, because it does not expand the ~/ prefix. Looks like the call to open could use os.path.expanduser on the path being opened? This would allow multiple developers with different usernames to have the schemafile in their home directory, without having to have the username hard-coded in the pre-commit yaml file.
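
The difference is probably that an interactive shell expands `~` before the program starts, whereas pre-commit passes the argument through literally. A minimal sketch of the suggested fix, assuming a hypothetical `resolve_schema_path` helper that leaves HTTP(S) URIs alone:

```python
import os


def resolve_schema_path(schemafile: str) -> str:
    """Expand a leading ~ in local paths; return remote URIs unchanged."""
    if schemafile.startswith(("http://", "https://")):
        return schemafile
    return os.path.expanduser(schemafile)


print(resolve_schema_path("~/schemafile.json"))
```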

Refactor for better re-use?

Refactor/Feature Request

I'd like to see the run method return a returncode, and let main exit appropriately based on the return code provided by the run method.
https://github.com/sirosen/check-jsonschema/blob/42ad7416d2bbe133fa87db0e8a0559df534f35ea/src/check_jsonschema/__init__.py#L16

Use case

I have a strange use case where I need to validate a YAML file, but that YAML is a build artifact that is generated from a Jinja2 template (that is kept in source control). I need to do some pre-processing (with a wrapper script) - and then I want to use the check-jsonschema library (SchemaChecker) with arguments constructed from the arguments passed to the wrapper script (which is also a pre-commit compatible script).

Because the SchemaChecker issues the sys.exit command - I can't examine the return code and take action based on that. Embedding the SchemaChecker causes the wrapper script to stop.

If the sys.exit call was moved from SchemaChecker - to main - then the SchemaChecker could be used independently.
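
The requested shape can be sketched as follows. `Checker` here is a hypothetical stand-in, not the real SchemaChecker; the point is only that `run` returns a code and `main` is the single place that exits.

```python
import sys


class Checker:
    """Hypothetical stand-in for SchemaChecker (illustration only)."""

    def __init__(self, errors):
        self._errors = list(errors)

    def run(self) -> int:
        # Report the outcome to the caller instead of exiting the process.
        return 1 if self._errors else 0


def main(errors=()) -> None:
    # Only the CLI entry point turns the return code into a process exit.
    sys.exit(Checker(errors).run())
```

A wrapper script could then call `Checker(...).run()` directly and branch on the result without being terminated.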

Note/Workaround

I can get around this - by using the subprocess module and calling check-jsonschema as a CLI subprocess, and then examine the return code.
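
That workaround might look like the sketch below. `run_cli` is a hypothetical helper, and the file names in the usage comment are placeholders.

```python
import subprocess


def run_cli(command):
    """Run a command and return (returncode, combined output)."""
    completed = subprocess.run(command, capture_output=True, text=True)
    return completed.returncode, completed.stdout + completed.stderr


# Example call (placeholder paths):
# code, output = run_cli(
#     ["check-jsonschema", "--schemafile", "schema.json", "rendered.yaml"]
# )
```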
