ghandic / jsf
Creates fake JSON files from a JSON schema
Home Page: https://ghandic.github.io/jsf
License: Other
Hi @ghandic
The generate_and_validate() method does not return any JSON object. If that is how it is meant to be, maybe I can rename it to "validate()" and open a pull request.
Thanks for the tool!
Is there a way to have it use default values from the schema where present, rather than faking values?
Allow auto data generation to produce data from fake files with certain encoding
https://json-schema.org/understanding-json-schema/reference/non_json_data.html#id2
Since we are using eval, anything in the provider string will be evaluated, which is insecure. This should be safe by default and optionally allow the use of lambdas etc.
Self explanatory.
How to reproduce:
from json_faker import JSF
faker = JSF(
    {
        "type": "object",
        "properties": {
            "name": {"type": "string", "minLength": 3, "maxLength": 3},
            "email": {"type": "string", "$provider": "faker.email"},
        },
        "required": ["name", "email"],
    }
)
fake_json = faker.generate()
fake_json
The property name will return the empty string ''.
Attempt fix:
As a temporary fix, I changed the random_fixed_length_sentence function in the jsf/schema_types/string_utils/content_type/text__plain.py file to the version below. Even though the _min variable has a default value of 0, this version will never return an empty string (although an empty string wouldn't be the most valuable output anyway).
def random_fixed_length_sentence(_min: int = 0, _max: int = 50) -> str:
    _min = int(_min)
    _max = int(_max)
    if _min > _max:
        raise ValueError("minLength must be less than maxLength")
    # Needs better implementation to return empty string
    sentence = ""
    while len(sentence) <= _min:
        sentence = random.choice(LOREM).capitalize()
        while len(sentence) < _max and random.random() > 0.2:
            sentence += " " + random.choice(LOREM)
        # sentence += random.choice(['.', '!', '?'])
    return sentence[:_max].strip()
Support not schema: https://json-schema.org/understanding-json-schema/reference/combining#not
I have a use case where I need to generate data for parquet datatypes. I am currently using a custom version of JSF. Would you like to have this feature here?
JSON looks like the following:
"UInt32": {
"type": "uint32"
},
"UInt64": {
"type": "uint64"
},
"Float16": {
"type": "float16"
}
[number.py:jsf.src.schema_types.number:line 304 - generate()] - INFO: Generating random uint32
[number.py:jsf.src.schema_types.number:line 52 - generate()] - DEBUG: is_float: False
[number.py:jsf.src.schema_types.number:line 72 - generate()] - INFO: Generated number: 35227457
[number.py:jsf.src.schema_types.number:line 333 - generate()] - INFO: Generating random uint64
[number.py:jsf.src.schema_types.number:line 52 - generate()] - DEBUG: is_float: False
[number.py:jsf.src.schema_types.number:line 72 - generate()] - INFO: Generated number: 4669327448559716910
[number.py:jsf.src.schema_types.number:line 362 - generate()] - INFO: Generating random float16
[number.py:jsf.src.schema_types.number:line 57 - generate()] - DEBUG: is_float: True
[number.py:jsf.src.schema_types.number:line 72 - generate()] - INFO: Generated number: 1.920763087895552e+17
Currently working to draft 7, but should add support for multiple draft versions
Use test data from https://github.com/Julian/jsonschema/tree/main/json
In cases such as:
I also would like to be able to apply defaults to the default generation.
Example from JSON Schema Faker, show the inputs and outputs:
You can see in the above example, that the generated sample is easy to read and understand. Whereas a randomly generated set of inputs would lose much valuable context.
The use case is that we define inputs for our application as JSON schema specifications, and we ask users to provide their input matching that specification. It is much preferable to have those generated values default to the actual default values: (1) because it is easier to understand from a user perspective if they see 'items per batch' defaulting to something like 1000 and not 1235134523451345, and (2) because in those cases, deleting the defaults has the expected effect of basically not overriding them.
Another case would be an example URL value or an example region name. Seeing the example or default value gives a much stronger hint of what kind of inputs are actually required. (If you see us-east-1 as the default, you are going to feel comfortable providing us-west-2, and you probably won't mistakenly type US West (Oregon).)
As (apparently?) implemented in JSON Schema Faker, I could imagine adding use_defaults and use_examples parameters to the generate() method.
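A minimal sketch of how such flags might behave for a single schema node. This is not jsf's API: pick_value and its fake callback are hypothetical names used only for illustration, and the flag names simply mirror the proposal above.

```python
from typing import Any, Callable, Dict


def pick_value(
    schema: Dict[str, Any],
    fake: Callable[[Dict[str, Any]], Any],
    use_defaults: bool = False,
    use_examples: bool = False,
) -> Any:
    """Return the schema's default or first example when asked to, else a faked value.

    Hypothetical helper: `fake` stands in for whatever random generator is in use.
    """
    if use_defaults and "default" in schema:
        return schema["default"]
    if use_examples and schema.get("examples"):
        return schema["examples"][0]
    return fake(schema)


# Usage: the default wins only when the caller opts in.
region = {"type": "string", "default": "us-east-1", "examples": ["us-west-2"]}
print(pick_value(region, lambda s: "random-text", use_defaults=True))
```

Checking for the "default" key rather than its truthiness keeps falsy defaults such as 0 or an empty string usable.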
Happy to contribute if this is something I could help with.
{
"$schema": "https://json-schema.org/draft/2020-12/schema",
"type": "object",
"properties": {
"accountUid": {
"type": "string"
}
}
}
If I do the following:
#!/usr/bin/env python
# -*- coding: utf-8 -*-
from jsf import JSF
faker = JSF.from_json("simple.json")
fake_json = faker.generate()
print(fake_json)
I get the following error:
root@d42f6d379f02:/app# python basic.py
/usr/local/lib/python3.10/site-packages/pydantic/_internal/_config.py:261: UserWarning: Valid config keys have changed in V2:
* 'smart_union' has been removed
warnings.warn(message, UserWarning)
Traceback (most recent call last):
File "/app/basic.py", line 6, in <module>
faker = JSF.from_json("simple.json")
File "/usr/local/lib/python3.10/site-packages/jsf/parser.py", line 208, in from_json
return JSF(json.load(f))
File "/usr/local/lib/python3.10/site-packages/jsf/parser.py", line 54, in __init__
self._parse(schema)
File "/usr/local/lib/python3.10/site-packages/jsf/parser.py", line 183, in _parse
self.root = self.__parse_definition(name="root", path="#", schema=schema)
File "/usr/local/lib/python3.10/site-packages/jsf/parser.py", line 141, in __parse_definition
return self.__parse_object(name, path, schema)
File "/usr/local/lib/python3.10/site-packages/jsf/parser.py", line 66, in __parse_object
props.append(self.__parse_definition(_name, path=f"{path}/{_name}", schema=definition))
File "/usr/local/lib/python3.10/site-packages/jsf/parser.py", line 156, in __parse_definition
return self.__parse_primitive(name, path, schema)
File "/usr/local/lib/python3.10/site-packages/jsf/parser.py", line 59, in __parse_primitive
return cls.from_dict({"name": name, "path": path, "is_nullable": is_nullable, **schema})
File "/usr/local/lib/python3.10/site-packages/jsf/schema_types/string.py", line 156, in from_dict
return String(**d)
File "/usr/local/lib/python3.10/site-packages/pydantic/main.py", line 150, in __init__
__pydantic_self__.__pydantic_validator__.validate_python(data, self_instance=__pydantic_self__)
pydantic_core._pydantic_core.ValidationError: 1 validation error for String
contentEncoding
Field required [type=missing, input_value={'name': 'accountUid', 'p...False, 'type': 'string'}, input_type=dict]
For further information visit https://errors.pydantic.dev/2.0.3/v/missing
I can see in #58 that you added support for ContentEncoding, and in #4 you even link to the JSON Schema spec where it explains what the ContentEncodings are.
What I don't see is any way for the default value to be used, i.e. I don't want to specify the ContentEncoding on each and every single field, as I'm going to have tens of schemas with hundreds of fields each that are not generated by me. Yet I need to figure out a way to generate fake data for these schemas so we can automate some testing of ETLs.
That being said, I also noticed that when I put "ContentEncoding": "8bit" in my schema, it still failed. Apparently you're checking that the value is 8-bit, which according to the JSON Schema docs is wrong.
The acceptable values are 7bit, 8bit, binary, quoted-printable, base16, base32, and base64. If not specified, the encoding is the same as the containing JSON document.
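A small sketch of what an optional, spec-aligned check could look like. The function name is hypothetical and this is not how jsf validates the field. It only shows the value list quoted above, with a missing value treated as "not specified".

```python
from typing import Optional

# Values quoted from the JSON Schema documentation referenced in this issue.
ALLOWED_CONTENT_ENCODINGS = frozenset(
    {"7bit", "8bit", "binary", "quoted-printable", "base16", "base32", "base64"}
)


def check_content_encoding(value: Optional[str] = None) -> Optional[str]:
    """Accept a missing encoding (None) or one of the documented values."""
    if value is None:
        # Not specified: same encoding as the containing JSON document.
        return None
    if value not in ALLOWED_CONTENT_ENCODINGS:
        raise ValueError(f"unsupported contentEncoding: {value!r}")
    return value
```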
Hi @ghandic
I see a recursion.json in src/tests/data, but I do not see a test for it. Is it supported by jsf? Also, is $ref:"#" supported?
Thanks.
Thanks for this package!
It would be lovely for downstream packagers if the tests:
- were included in the sdist on PyPI
- used import jsf rather than import ..jsf so that they could test the as-installed package
I'd be happy to work up a PR that did these things, if that was desirable.
Motivation: I'm looking to package this for conda-forge:
The lack of tests isn't a hold-up, but tests do help us catch metadata creep, which is only semi-automated.
Thanks again!
Is there a reason that we do not use logging in JSF?
Hi,
I am having issues with JSF when patternProperties is defined. See below:
JSON Schema:
{
"title": "XXXXX",
"description": "XXXXX",
"type": "object",
"definitions": {
"InstructionItem": {
"type": "object",
"properties": {
"Command": {
"description": "XXXX",
"type": "string"
},
"ExecutionTimeout": {
"description": "XXXX",
"type": "integer"
},
"ExecutionType": {
"description": "XXXXX",
"type": "string"
},
"InvokeSequence": {
"description": "XXXXXX",
"type": "integer"
},
"MachineLabel": {
"description": "XXXX",
"type": "string"
},
"NodeReference": {
"description": "XXXXX",
"type": "string"
}
},
"required": [
"Command",
"ExecutionTimeout",
"ExecutionType",
"InvokeSequence",
"MachineLabel",
"NodeReference"
]
},
"InstructionStep": {
"type": "object",
"properties": {
"CertificateURL": {
"description": "XXXX",
"type": "string"
},
"Description": {
"description": "XXXX",
"type": "string"
},
"ManualStep": {
"description": "XXXX",
"type": "boolean"
},
"RunAsUser": {
"description": "XXXX",
"type": "string"
},
"StepCommand": {
"description": "XXXX",
"type": "string"
},
"StepFunction": {
"description": "XXXX",
"type": "string"
},
"UseFunction": {
"description": "XXXX",
"type": "boolean"
},
"StepRun": {
"description": "XXXX",
"type": "string"
},
"cwd": {
"description": "XXXX",
"type": "string"
}
},
"required": [
"CertificateURL",
"Description",
"ManualStep",
"RunAsUser",
"StepCommand",
"StepFunction",
"UseFunction",
"StepRun",
"cwd"
]
}
},
"properties": {
"AreNotificationsEnabled": {
"description": "XXXX",
"type": "boolean"
},
"Description": {
"description": "XXXXX",
"type": "string"
},
"Instructions": {
"description": "XXXXX",
"type": "array",
"items": {
"$ref": "#/definitions/InstructionItem"
}
},
"IsActive": {
"description": "XXXXX",
"type": "boolean"
},
"IsCustomerFacing": {
"description": "XXXXXX",
"type": "boolean"
},
"IsAdminFacing": {
"description": "XXXXX",
"type": "boolean"
},
"IsSystem": {
"description": "XXXXX",
"type": "boolean"
},
"Name": {
"description": "XXXXXX",
"type": "string"
},
"Nodes":{
"description": "XXXXX",
"type": "object",
"patternProperties": {
"[A-Z_]+": {
"description": "XXXXX",
"type": "object",
"properties": {
"AdminTask": {
"description": "XXXXX",
"type": "object",
"properties": {
"AdminFun": {
"description": "XXXXX",
"type": "object",
"patternProperties": {
"[A-Z_]+": {
"description": "XXXX",
"type": "object",
"properties": {
"Instructions": {
"description": "XX",
"type": "object",
"patternProperties": {
"[A-Z_-]+": {
"$ref": "#/definitions/InstructionStep"
}
}
}
}
}
}
}
}
}
}
}
}
}
},
"required": [
"AreNotificationsEnabled",
"Description",
"Instructions",
"IsActive",
"IsCustomerFacing",
"IsAdminFacing",
"IsSystem",
"Name",
"Nodes"
]
}
Error message:
> jsf --schema .\af.schema --instance .\t.json
Traceback (most recent call last):
File "%HOME%\jsonvalidator\env\lib\site-packages\jsf\schema_types\object.py", line 40, in generate
return super().generate(context)
File "%HOME%\jsonvalidator\env\lib\site-packages\jsf\schema_types\base.py", line 49, in generate
raise ProviderNotSetException()
jsf.schema_types.base.ProviderNotSetException
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "%HOME%\jsonvalidator\env\lib\site-packages\jsf\schema_types\object.py", line 40, in generate
return super().generate(context)
File "%HOME%\jsonvalidator\env\lib\site-packages\jsf\schema_types\base.py", line 49, in generate
raise ProviderNotSetException()
jsf.schema_types.base.ProviderNotSetException
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "C:\python3\lib\runpy.py", line 193, in _run_module_as_main
"__main__", mod_spec)
File "C:\python3\lib\runpy.py", line 85, in _run_code
exec(code, run_globals)
File "%HOME%\jsonvalidator\env\Scripts\jsf.exe\__main__.py", line 7, in <module>
File "%HOME%\jsonvalidator\env\lib\site-packages\typer\main.py", line 214, in __call__
return get_command(self)(*args, **kwargs)
File "%HOME%\jsonvalidator\env\lib\site-packages\click\core.py", line 1128, in __call__
return self.main(*args, **kwargs)
File "%HOME%\jsonvalidator\env\lib\site-packages\click\core.py", line 1053, in main
rv = self.invoke(ctx)
File "%HOME%\jsonvalidator\env\lib\site-packages\click\core.py", line 1395, in invoke
return ctx.invoke(self.callback, **ctx.params)
File "%HOME%\jsonvalidator\env\lib\site-packages\click\core.py", line 754, in invoke
return __callback(*args, **kwargs)
File "%HOME%\jsonvalidator\env\lib\site-packages\typer\main.py", line 500, in wrapper
return callback(**use_params) # type: ignore
File "%HOME%\jsonvalidator\env\lib\site-packages\jsf\cli.py", line 19, in main
JSF.from_json(schema).to_json(instance)
File "%HOME%\jsonvalidator\env\lib\site-packages\jsf\parser.py", line 143, in to_json
json.dump(self.generate(), f, indent=2)
File "%HOME%\jsonvalidator\env\lib\site-packages\jsf\parser.py", line 131, in generate
File "%HOME%\jsonvalidator\env\lib\site-packages\jsf\schema_types\object.py", line 42, in generate
return {o.name: o.generate(context) for o in self.properties if self.should_keep(o.name)}
File "%HOME%\jsonvalidator\env\lib\site-packages\jsf\schema_types\object.py", line 42, in <dictcomp>
return {o.name: o.generate(context) for o in self.properties if self.should_keep(o.name)}
File "%HOME%\jsonvalidator\env\lib\site-packages\jsf\schema_types\object.py", line 42, in generate
return {o.name: o.generate(context) for o in self.properties if self.should_keep(o.name)}
TypeError: 'NoneType' object is not iterable
I've tried replacing patternProperties with properties and it worked.
Thanks,
Right now __parse_definition() might rely on self.definitions for the parsing of references:
Line 159 in 162d6cd
If $refs haven't been defined in perfect order, parsing might fail.
I'm currently converting pydantic models to JSON schemas and end up with a valid and compliant JSON schema.
However, the ordering of the refs is out of my control.
When loading the schema to JSF it fails with:
line 164, in __parse_definition
cls.name = name
^^^^^^^^
AttributeError: 'NoneType' object has no attribute 'name'
To reproduce the issue:
from jsf import JSF
faker = JSF(
    {
        "$defs": {
            "Foo": {
                "properties": {"bar": {"$ref": "#/$defs/SomeEnum"}},
                "required": ["bar"],
                "title": "Foo",
                "type": "object",
            },
            "SomeEnum": {"enum": ["A", "B"], "title": "SomeEnum", "type": "string"},
        },
        "properties": {"foobar": {"anyOf": [{"$ref": "#/$defs/Foo"}]}},
        "required": ["foobar"],
        "title": "FooBarObject",
        "type": "object",
    }
)
However, if you switch Foo with SomeEnum it works as expected:
faker = JSF(
    {
        "$defs": {
            "SomeEnum": {"enum": ["A", "B"], "title": "SomeEnum", "type": "string"},
            "Foo": {
                "properties": {"bar": {"$ref": "#/$defs/SomeEnum"}},
                "required": ["bar"],
                "title": "Foo",
                "type": "object",
            }
        },
        "properties": {"foobar": {"anyOf": [{"$ref": "#/$defs/Foo"}]}},
        "required": ["foobar"],
        "title": "FooBarObject",
        "type": "object",
    }
)
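Since reordering the definitions by hand works, one possible workaround until the parser resolves references lazily is to sort the definitions before handing the schema to JSF. The sketch below is an assumption-laden illustration, not part of jsf: reorder_definitions is a hypothetical helper, it only looks at local references of the form #/$defs/Name (or #/definitions/Name), and it relies on graphlib from the standard library (Python 3.9+).

```python
from graphlib import TopologicalSorter
from typing import Any, Dict, Set


def _local_refs(node: Any, prefix: str) -> Set[str]:
    """Collect the names referenced as '<prefix><name>' anywhere inside node."""
    found: Set[str] = set()
    if isinstance(node, dict):
        ref = node.get("$ref")
        if isinstance(ref, str) and ref.startswith(prefix):
            found.add(ref[len(prefix):])
        for value in node.values():
            found |= _local_refs(value, prefix)
    elif isinstance(node, list):
        for item in node:
            found |= _local_refs(item, prefix)
    return found


def reorder_definitions(schema: Dict[str, Any], def_key: str = "$defs") -> Dict[str, Any]:
    """Return a copy of schema whose definitions list referenced entries first."""
    defs = schema.get(def_key)
    if not defs:
        return schema
    prefix = f"#/{def_key}/"
    graph = {
        name: {dep for dep in _local_refs(body, prefix) if dep in defs and dep != name}
        for name, body in defs.items()
    }
    # static_order() yields dependencies before the definitions that use them;
    # it raises graphlib.CycleError for mutually recursive definitions.
    ordered = TopologicalSorter(graph).static_order()
    return {**schema, def_key: {name: defs[name] for name in ordered}}
```

Self-references are dropped from the graph on purpose, so a definition that refers to itself does not count as a cycle. Mutually recursive definitions still raise, because no ordering can satisfy them.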
If multipleOf is not set in the schema, then generating a number always attempts to use a step of 1, and throws an exception when no such valid number exists.
>>> from jsf import JSF
>>> JSF({"type": "number", "minimum": 0.1, "maximum": 0.9}).generate()
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File ".../lib/python3.8/site-packages/jsf/parser.py", line 251, in generate
return self.root.generate(context=self.context)
File ".../lib/python3.8/site-packages/jsf/schema_types/number.py", line 37, in generate
step * random.randint(math.ceil(float(_min) / step), math.floor(float(_max) / step))
File "/usr/lib/python3.8/random.py", line 248, in randint
return self.randrange(a, b+1)
File "/usr/lib/python3.8/random.py", line 226, in randrange
raise ValueError("empty range for randrange() (%d, %d, %d)" % (istart, istop, width))
ValueError: empty range for randrange() (1, 1, 0)
I suggest a check that max - min is greater than step, and if not try a smaller step.
It is even worse when using exclusive maximums and minimums: it is then unable to find any value even in a range from 0.1 to 2.9.
>>> JSF({
"type": "number",
"minimum": 0.1,
"maximum": 2.9,
"exclusiveMinimum": True,
"exclusiveMaximum": True
}).generate()
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File ".../lib/python3.8/site-packages/jsf/parser.py", line 251, in generate
return self.root.generate(context=self.context)
File ".../lib/python3.8/site-packages/jsf/schema_types/number.py", line 37, in generate
step * random.randint(math.ceil(float(_min) / step), math.floor(float(_max) / step))
File "/usr/lib/python3.8/random.py", line 248, in randint
return self.randrange(a, b+1)
File "/usr/lib/python3.8/random.py", line 226, in randrange
raise ValueError("empty range for randrange() (%d, %d, %d)" % (istart, istop, width))
ValueError: empty range for randrange() (2, 2, 0)
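The suggestion above could be sketched roughly as follows. This is an illustrative stand-alone function, not the code in jsf's number.py: random_number and its parameter names are hypothetical, and the exclusive flags follow the boolean style used in the examples above. When no multipleOf is given it samples a float directly instead of assuming a step of 1. When a step is given it checks that the range contains at least one multiple before calling randint.

```python
import math
import random
from typing import Optional


def random_number(
    minimum: float,
    maximum: float,
    multiple_of: Optional[float] = None,
    exclusive_minimum: bool = False,
    exclusive_maximum: bool = False,
) -> float:
    if minimum > maximum:
        raise ValueError("minimum must not exceed maximum")
    if multiple_of is None:
        # No step requested: draw a float, retrying if an excluded bound is hit.
        for _ in range(100):
            value = random.uniform(minimum, maximum)
            if exclusive_minimum and value == minimum:
                continue
            if exclusive_maximum and value == maximum:
                continue
            return value
        raise ValueError("no value found strictly inside the range")
    low = math.ceil(minimum / multiple_of)
    high = math.floor(maximum / multiple_of)
    # Exact float comparison is a simplification that is fine for a sketch.
    if exclusive_minimum and low * multiple_of == minimum:
        low += 1
    if exclusive_maximum and high * multiple_of == maximum:
        high -= 1
    if low > high:
        raise ValueError(f"no multiple of {multiple_of} between {minimum} and {maximum}")
    return multiple_of * random.randint(low, high)
```

With this shape, a call such as random_number(0.1, 0.9) returns a float in the range, while random_number(0.1, 0.9, multiple_of=1) fails with an explicit message rather than the randrange error shown above.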
Testing the error handling, giving users an easy-to-debug output rather than the default error logs
Currently, you force your own instance of faker with faker = Faker(). Would it be possible to pass faker as a parameter for better customization?
Regards
The current enum implementation results in generated int and float data being coerced to str due to how Pydantic handles Union (see these docs). Pydantic will coerce the input to the first type it can match in the Union, which in the current implementation of JSFEnum is always a string for integers and floats.
class JSFEnum(BaseSchema):
    enum: Optional[List[Union[str, int, float, None]]] = []
Pydantic offers the following recommendation to solve this issue:
As such, it is recommended that, when defining Union annotations, the most specific type is included first and followed by less specific types.
However, it also issues a warning concerning Unions inside of List or Dict types:
typing.Union also ignores order when defined, so Union[int, float] == Union[float, int] which can lead to unexpected behaviour when combined with matching based on the Union type order inside other type definitions, such as List and Dict types (because python treats these definitions as singletons). For example, Dict[str, Union[int, float]] == Dict[str, Union[float, int]] with the order based on the first time it was defined. Please note that this can also be affected by third party libraries and their internal type definitions and the import orders.
Because of this I think the best solution is to use Pydantic's Smart Union, which will check the entire Union for the best type match before attempting to coerce.
Hi.
I have a simple schema:
{
"$schema": "http://json-schema.org/draft-06/schema#",
"type": "object",
"additionalProperties": false,
"properties": {
"a_arr": {
"type": "array",
"items": {
"$ref": "#/definitions/A"
}
}
},
"definitions": {
"A": {
"type": "object",
"additionalProperties": false,
"properties": {
"bar": {
"$ref": "#/definitions/B"
}
}
},
"B": {
"type": "object",
"additionalProperties": false,
"properties": {
"foo": {
"type": "string"
}
}
}
}
}
When I am trying to generate data, I hit an error.
Code sample:
import json
from jsf import JSF
s = json.load(open("schema.json"))
f = JSF(s)
fake_json = f.generate()
print(fake_json)
I got this traceback:
Traceback (most recent call last):
File "/Users/vlad/Library/Application Support/JetBrains/PyCharm2023.2/scratches/scratch_105.py", line 6, in <module>
mismo_faker = JSF(mismo)
File "/Users/vlad/Projects/test/venv/lib/python3.9/site-packages/jsf/parser.py", line 53, in __init__
self._parse(schema)
File "/Users/vlad/Projects/test/venv/lib/python3.9/site-packages/jsf/parser.py", line 179, in _parse
item = self.__parse_definition(name, path=f"#/{def_tag}", schema=definition)
File "/Users/vlad/Projects/test/venv/lib/python3.9/site-packages/jsf/parser.py", line 140, in __parse_definition
return self.__parse_object(name, path, schema)
File "/Users/vlad/Projects/test/venv/lib/python3.9/site-packages/jsf/parser.py", line 65, in __parse_object
props.append(self.__parse_definition(_name, path=f"{path}/{_name}", schema=definition))
File "/Users/vlad/Projects/test/venv/lib/python3.9/site-packages/jsf/parser.py", line 164, in __parse_definition
cls.name = name
AttributeError: 'NoneType' object has no attribute 'name'
Process finished with exit code 1
My pip freeze:
annotated-types==0.5.0
attrs==23.1.0
certifi==2023.7.22
charset-normalizer==3.2.0
Faker==19.3.1
idna==3.4
jsf==0.8.0
jsonschema==4.19.0
jsonschema-specifications==2023.7.1
pydantic==2.3.0
pydantic_core==2.6.3
python-dateutil==2.8.2
referencing==0.30.2
requests==2.31.0
rpds-py==0.10.0
rstr==3.2.1
six==1.16.0
smart-open==6.3.0
typing_extensions==4.7.1
urllib3==2.0.4
However, when I reorder the definitions, it works as expected.
{
"$schema": "http://json-schema.org/draft-06/schema#",
"type": "object",
"additionalProperties": false,
"properties": {
"a_arr": {
"type": "array",
"items": {
"$ref": "#/definitions/A"
}
}
},
"definitions": {
"B": {
"type": "object",
"additionalProperties": false,
"properties": {
"foo": {
"type": "string"
}
}
},
"A": {
"type": "object",
"additionalProperties": false,
"properties": {
"bar": {
"$ref": "#/definitions/B"
}
}
}
}
}
Pydantic 2 was released a few days ago... some of our tests are now failing with
.env/lib/python3.11/site-packages/pydantic/_internal/_config.py:206: in prepare_config
warnings.warn(DEPRECATION_MESSAGE, DeprecationWarning)
E pydantic.warnings.PydanticDeprecatedSince20: Support for class-based `config` is deprecated, use ConfigDict instead. Deprecated in Pydantic V2.0 to be removed in V3.0. See Pydantic V2 Migration Guide at https://errors.pydantic.dev/2.0.1/migration/
config = <class 'jsf.schema_types.enum.JSFEnum.Config'>
(pytest is set up to treat warnings as errors)
Until deprecated usages are addressed, would it be possible to specify the requirement on pydantic as >=1.10.4,<2 (instead of just >=1.10.4)?
The implementation requires a single, global reference for all definitions. Consider the "Complex Structure" example provided by JSONSchema.org.
The schema leverages the idea that a definition using an $id signals a new scope for that definition, and references are only made to other definitions within that scope. Specifically, the root's $defs block only contains an address definition, while that address block itself has a definitions block:
{
  "$defs": {
    "address": {
      "$id": "/schemas/address",
      "definitions": {
        "state": {}
      }
    }
  }
}
In this example, the root object references the address definition by its $id value, "/schemas/address". The address block also contains a definition (state), which is referenced from its properties as "#/definitions/state".
In this case, the # symbol refers to the scope of the address block.
{
"$id": "https://example.com/schemas/customer",
"$schema": "https://json-schema.org/draft/2020-12/schema",
"type": "object",
"properties": {
"first_name": { "type": "string" },
"last_name": { "type": "string" },
"shipping_address": { "$ref": "/schemas/address" },
"billing_address": { "$ref": "/schemas/address" }
},
"required": ["first_name", "last_name", "shipping_address", "billing_address"],
"$defs": {
"address": {
"$id": "/schemas/address",
"$schema": "http://json-schema.org/draft-07/schema#",
"type": "object",
"properties": {
"street_address": { "type": "string" },
"city": { "type": "string" },
"state": { "$ref": "#/definitions/state" }
},
"required": ["street_address", "city", "state"],
"definitions": {
"state": { "enum": ["CA", "NY", "... etc ..."] }
}
}
}
}
NOTE: I have verified that this schema works with the jsonschema Python library, against a JSON object generated using this project.
Should be able to run a jsf.JSF(schema) with this JSON Schema:
import json
import jsf
schema = json.load(open("complex.schema.json" , "r"))
gen = jsf.JSF(schema)
new_json = gen.generate()
print(json.dumps(new_json, indent=2))
Running the above code raises AttributeError: 'NoneType' object has no attribute 'name' on parser.py#L181.
Flattening the dependency tree in the schema definition, and removing the references that point at internal dependencies by their $id, makes it work.
{
"$id": "https://example.com/schemas/customer",
"$schema": "https://json-schema.org/draft/2020-12/schema",
"type": "object",
"properties": {
"first_name": { "type": "string" },
"last_name": { "type": "string" },
"shipping_address": { "$ref": "#/$defs/address" },
"billing_address": { "$ref": "#/$defs/address" }
},
"required": ["first_name", "last_name", "shipping_address", "billing_address"],
"$defs": {
"address": {
"$id": "/schemas/address",
"$schema": "http://json-schema.org/draft-07/schema#",
"type": "object",
"properties": {
"street_address": { "type": "string" },
"city": { "type": "string" },
"state": { "$ref": "#/$defs/state" }
},
"required": ["street_address", "city", "state"]
},
"state": { "enum": ["CA", "NY", "... etc ..."] }
}
}
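For illustration, scoped resolution of the original (unflattened) schema could look roughly like the sketch below. It is not jsf code, and all three function names are hypothetical. It also makes simplifying assumptions: $id values are matched as literal strings rather than resolved as URIs against a base, and fragment pointers are followed through object keys only.

```python
from typing import Any, Dict


def index_ids(node: Any, ids: Dict[str, dict]) -> None:
    """Record every subschema that declares an "$id", keyed by that string."""
    if isinstance(node, dict):
        if isinstance(node.get("$id"), str):
            ids[node["$id"]] = node
        for value in node.values():
            index_ids(value, ids)
    elif isinstance(node, list):
        for item in node:
            index_ids(item, ids)


def resolve_pointer(scope: dict, ref: str) -> Any:
    """Follow a '#/a/b' fragment, starting from the schema that owns the scope."""
    target: Any = scope
    for part in ref.lstrip("#").split("/"):
        if part:
            target = target[part.replace("~1", "/").replace("~0", "~")]
    return target


def resolve_ref(ref: str, scope: dict, ids: Dict[str, dict]) -> Any:
    """Fragments resolve inside the current scope, anything else via the $id index."""
    if ref.startswith("#"):
        return resolve_pointer(scope, ref)
    return ids[ref]


# Trimmed version of the customer schema above.
customer = {
    "$id": "https://example.com/schemas/customer",
    "properties": {"shipping_address": {"$ref": "/schemas/address"}},
    "$defs": {
        "address": {
            "$id": "/schemas/address",
            "properties": {"state": {"$ref": "#/definitions/state"}},
            "definitions": {"state": {"enum": ["CA", "NY"]}},
        }
    },
}
ids: Dict[str, dict] = {}
index_ids(customer, ids)
address = resolve_ref(customer["properties"]["shipping_address"]["$ref"], customer, ids)
# The '#' in the nested reference is resolved against the address scope, not the root.
state = resolve_ref(address["properties"]["state"]["$ref"], address, ids)
```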
In the following line:
Line 78 in 3db7519
A TypeError is raised when the item type represents more than one type (excluding null). I'm not sure why.
Can this method just return a random type from the list (including null)?
So:
import random
...
def __is_field_nullable(self, schema: Dict[str, Any]) -> Tuple[str, bool]:
    item_type = schema.get("type")
    if isinstance(item_type, list):
        if "null" in item_type:
            return random.choice(item_type), True
    return item_type, False
The latest JSON Schema draft versions recommend using $defs instead of definitions, with the note that the actual reference pointer should be extracted from the $ref fragment. I think it will require a change in the _parse method of https://github.com/ghandic/jsf/blob/main/src/jsf/parser.py
The LOREM list in the text__plain.py file does not contain any word of length one or two.
Therefore, if a schema specifies [minLength, maxLength] = [1,2] (or [1,1], [2,2]) on a string property, it will return an empty string.
For example, for a country code string (not necessarily the best example, because a country code should be an enum rather than a string, but let's say for this exercise that the code value does not really matter):
{
  "properties": {
    "code": {
      "maxLength": 2,
      "minLength": 2,
      "title": "Code",
      "type": "string"
    }
  }
}
Possible solution:
Just change the LOREM string to include a few small words.
Edit:
The following line is also not correct if we want to be able to have a word of exact size:
valid_words = list(filter(lambda s: len(s) < remaining, LOREM))
It should be replaced by:
valid_words = list(filter(lambda s: len(s) <= remaining, LOREM))
The rest should still work thanks to the .strip() at the end, which removes the extra space.
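A self-contained sketch of a length-exact variant is shown below. It is not the function shipped in jsf: WORDS is a short stand-in for LOREM, and padding with random letters when no word fits is an assumption made here so that very short lengths still produce output.

```python
import random
import string

# Stand-in for LOREM, including a few short words as suggested above.
WORDS = ["lorem", "ipsum", "dolor", "sit", "amet", "ut", "a"]


def random_fixed_length_sentence(_min: int = 0, _max: int = 50) -> str:
    _min, _max = int(_min), int(_max)
    if _min > _max:
        raise ValueError("minLength must not exceed maxLength")
    target = random.randint(_min, _max)
    sentence = ""
    while len(sentence) < target:
        separator = " " if sentence else ""
        remaining = target - len(sentence) - len(separator)
        # "<=" lets a word fill the remaining space exactly.
        valid_words = [word for word in WORDS if len(word) <= remaining]
        if not valid_words:
            # Nothing fits (for example one character left): pad with letters.
            padding = random.choices(string.ascii_lowercase, k=target - len(sentence))
            sentence += "".join(padding)
            break
        sentence += separator + random.choice(valid_words)
    return sentence
```

Because every step adds at most the remaining number of characters, the result is always exactly target characters long, so no trailing strip or slice is needed.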
typer brings along rather a lot of dependencies. Might it be possible to make that dependency optional for using this as a library? One way would be a [cli] extra, or a whole separate package for jsf-cli.
Seems to fail on all use of references including self references
Take a look at the 2 schemas attached
{
"title": "AlertSync",
"description": "\u5ba1\u8ba1\u544a\u8b66model",
"type": "object",
"properties": {
"audit_label": {
"title": "Audit Label",
"type": "string",
"format": "ipv4"
},
"category": {
"title": "Category",
"minimum": 1,
"maximum": 15,
"type": "integer"
},
"level": {
"title": "Level",
"minimum": 0,
"maximum": 3,
"type": "integer"
},
"src_mac": {
"title": "Src Mac",
"default": "00:00:00:00:00:00",
"pattern": "^([0-9A-F]{2})(\\:[0-9A-F]{2}){5}$",
"type": "string"
},
"src_ip": {
"title": "Src Ip",
"type": "string",
"format": "ipv4"
},
"src_port": {
"title": "Src Port",
"minimum": 1,
"maximum": 65535,
"type": "integer"
},
"dst_mac": {
"title": "Dst Mac",
"default": "FF:FF:FF:FF:FF:FF",
"pattern": "^([0-9A-F]{2})(\\:[0-9A-F]{2}){5}$",
"type": "string"
},
"dst_ip": {
"title": "Dst Ip",
"type": "string",
"format": "ipv4"
},
"dst_port": {
"title": "Dst Port",
"minimum": 1,
"maximum": 65535,
"type": "integer"
},
"l4_protocol": {
"$ref": "#/definitions/L4ProtocolEnum"
},
"protocol": {
"$ref": "#/definitions/ProtocolEnum"
},
"illegal_ip": {
"title": "Illegal Ip",
"default": [],
"type": "array",
"items": {
"type": "string",
"format": "ipv4"
}
},
"last_at": {
"title": "Last At",
"default": "2022-12-30T14:08:30.753677",
"type": "string",
"format": "date-time"
},
"count": {
"title": "Count",
"default": 1,
"minimum": 1,
"maximum": 100000,
"type": "integer"
},
"other_info": {
"title": "Other Info",
"type": "object"
},
"payload": {
"title": "Payload",
"pattern": "^([0-9A-F]{2})+$",
"type": "string"
}
},
"required": [
"audit_label",
"category",
"level",
"l4_protocol",
"protocol"
],
"definitions": {
"L4ProtocolEnum": {
"title": "L4ProtocolEnum",
"description": "An enumeration.",
"enum": [
"TCP",
"UDP"
],
"type": "string"
},
"ProtocolEnum": {
"title": "ProtocolEnum",
"description": "An enumeration.",
"enum": [
"S7COMM",
"MODBUS"
],
"type": "string"
}
}
}
Traceback (most recent call last):
File "/root/repos/sa-data-perf/venv/lib/python3.10/site-packages/jsf/schema_types/object.py", line 40, in generate
return super().generate(context)
File "/root/repos/sa-data-perf/venv/lib/python3.10/site-packages/jsf/schema_types/base.py", line 49, in generate
raise ProviderNotSetException()
jsf.schema_types.base.ProviderNotSetException
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/root/repos/sa-data-perf/venv/lib/python3.10/site-packages/jsf/schema_types/object.py", line 40, in generate
return super().generate(context)
File "/root/repos/sa-data-perf/venv/lib/python3.10/site-packages/jsf/schema_types/base.py", line 49, in generate
raise ProviderNotSetException()
jsf.schema_types.base.ProviderNotSetException
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/root/repos/sa-data-perf/debug.py", line 14, in <module>
print(jsf.generate())
File "/root/repos/sa-data-perf/venv/lib/python3.10/site-packages/jsf/parser.py", line 137, in generate
return self.root.generate(context=self.context)
File "/root/repos/sa-data-perf/venv/lib/python3.10/site-packages/jsf/schema_types/object.py", line 42, in generate
return {o.name: o.generate(context) for o in self.properties if self.should_keep(o.name)}
File "/root/repos/sa-data-perf/venv/lib/python3.10/site-packages/jsf/schema_types/object.py", line 42, in <dictcomp>
return {o.name: o.generate(context) for o in self.properties if self.should_keep(o.name)}
File "/root/repos/sa-data-perf/venv/lib/python3.10/site-packages/jsf/schema_types/object.py", line 42, in generate
return {o.name: o.generate(context) for o in self.properties if self.should_keep(o.name)}
TypeError: 'NoneType' object is not iterable
Can jsf work with a schema like the one given? This schema was generated by pydantic. I'm not sure which part causes this error. I hope the logging can be made more specific, to tell me which property causes it.
Since the current implementation makes use of sets in Python, and dicts are not hashable, a change would be needed to rectify this.
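One way around the hashing problem, assuming the sets are used to deduplicate generated items, is to compare items by a canonical JSON string instead of hashing the items themselves. The helper below is a hypothetical sketch, not jsf code.

```python
import json
from typing import Any, List


def unique_items(items: List[Any]) -> List[Any]:
    """Drop duplicates from a list that may contain dicts or lists, keeping order."""
    seen = set()
    result = []
    for item in items:
        # sort_keys makes {"a": 1, "b": 2} and {"b": 2, "a": 1} compare equal.
        key = json.dumps(item, sort_keys=True)
        if key not in seen:
            seen.add(key)
            result.append(item)
    return result
```

One caveat of this approach: 1 and 1.0 serialize differently, so they are kept as distinct items here even though JSON Schema treats them as equal numbers.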
Example
"errors": {
"type": "object",
"properties": {
"validationErrors": {
"type": "array",
"minItems": 0,
"maxItems": 2,
"uniqueItems": false,
"items": [
{
"type": "object",
"$state": {
"error": "lambda: random.choice([{'code':'3013','message':'Mandatory field is either Null or blank','field':'IDNumber'}, {'code':'2013','message':'Mandatory field is either Null or blank','field':'IDNumber'}])"
},
"properties": {
"code": {
"type": "string",
"description": "Error code from Digital gateway validation checks",
"$provider": "lambda: state['validationErrors[0]']['error']['code']"
},
"message": {
"type": "string",
"description": "",
"$provider": "lambda: state['validationErrors[0]']['error']['message']"
},
"field": {
"type": "string",
"description": "",
"$provider": "lambda: state['validationErrors[0]']['error']['field']"
}
},
"required": ["code", "message", "field"]
}
]
}
},
"required": ["validationErrors"]
}
If we are using a schema like this:
{
"$schema": "http://json-schema.org/draft-07/schema#",
"$id": "urn://Media.schema.json",
"title": "Media",
"version": "0.0.1",
"description": "This event represents a Media",
"type": "object",
"properties": {
"envID": {
"type": "string",
"minLength": 10
},
"envTimestamp": {
"type": "integer",
"exclusiveMinimum": 0
},
"javaType": {
"type": "string"
},
"mediaKey": {
"type": "string",
"minLength": 5,
"maxLength": 64
},
"mediaType": {
"type": "string",
"enum": [
"COVER"
]
}
},
"required": [
"envID",
"envTimestamp",
"mediaType"
],
"additionalProperties": false,
"oneOf": [
{
"properties": {
"mediaType": {
"const": "COVER"
}
},
"required": [
"javaType"
]
}
]
}
When "mediaType" is "COVER", the property "javaType" must always be included. This is not happening now.
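To make the expected behaviour concrete, here is a rough sketch of how the required set could be widened from a matching oneOf branch. It is a simplification written for this issue, not jsf's implementation: effective_required is a hypothetical name, only const constraints are inspected, and a branch without any const is treated as matching.

```python
from typing import Any, Dict, Set


def effective_required(schema: Dict[str, Any], instance: Dict[str, Any]) -> Set[str]:
    """Top-level required names plus those of oneOf branches whose consts match."""
    required = set(schema.get("required", []))
    for branch in schema.get("oneOf", []):
        consts = {
            name: sub["const"]
            for name, sub in branch.get("properties", {}).items()
            if isinstance(sub, dict) and "const" in sub
        }
        if all(name in instance and instance[name] == value for name, value in consts.items()):
            required |= set(branch.get("required", []))
    return required
```

For the schema above, an instance with "mediaType": "COVER" would yield envID, envTimestamp, mediaType and javaType, which is the set a generator would need to always populate.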
This should include using shared local states, custom execution contexts, custom initial states, validation etc