mikeywaites / kim


Kim: A JSON Serialization and Marshaling framework

Home Page: http://kim.readthedocs.org/en/latest/

License: Other

Python 100.00%

python json rest-api serialization marshalling

kim's Introduction

Kim: A JSON Serialization and Marshaling framework


Introducing Kim:

.. code-block:: python

    >>> mapper = UserMapper(data=response.json())
    >>> mapper.marshal()
    User(id='one', name='Bruce Wayne', title='CEO/Super Hero')
    >>> user_two = User.query.get('two')
    >>> mapper = UserMapper(obj=user_two)
    >>> mapper.serialize()
    {u'id': 'two', u'name': 'Martha Wayne', 'title': 'Mother of Batman'}

Kim Features

Kim is a feature-packed framework for handling even the most complex marshaling and serialization requirements.

Web framework agnostic - Flask, Django, Framework-XXX supported!
Highly customisable field processing system
Security focused
Control included fields with powerful roles system
Handle mixed data types with polymorphic mappers
Marshal and Serialize nested objects

Kim officially supports Python 2.7 & 3.3–3.5

Installation

Install Kim using pip:

.. code-block:: bash

    $ pip install py-kim

Documentation

Learn all of Kim's features with these simple step-by-step instructions, or check out the quickstart guide for a rapid overview.

http://kim.readthedocs.io/en/latest/

kim's People

Contributors

Stargazers

Watchers

Forkers

krak3n radeklos charleypeng1 hbcbh1999 emulbreh alirizakeles ipv1337 margaferrez elspawaczo larsks kimanhthi12 dxe4 1eye4all

kim's Issues

Attach roles to mappings as opposed to serializers.

Roles are currently specified in the Meta class of kim.serializers.Serializer. They provide a way to affect the fields used at run time in different scenarios. Equally they provide the user with a way to affect which fields are relevant for marshaling/serializing when operating on related objects and data structures.

This has ultimately made the Serializer class more powerful than the underlying mapping that it constructs which was not the original intention. Serializer should remain as syntactic sugar for easily creating Mapping objects in a re-usable way.

This enhancement will allow Mapping objects to accept and make use of roles as a way to manage which Types are mapped. Any Roles defined on the Serializer would therefore simply be a way of defining roles on the underlying Mapping.

https://github.com/mikeywaites/kim/blob/master/kim/mapping.py#L15 would accept a new roles kwarg which would contain a collection of Role objects as seen previously in the Serializer class.

Mapping would then define a method, get_iterable, called with an optional role identifier as a kwarg, that returns a mapping constructed from that role or, by default, from all fields.

Serializer would simply pass any provided role straight onto the underlying mapping here

https://github.com/mikeywaites/kim/blob/master/kim/serializers.py#L178

Group pipelines into processing steps

#46 To aid subclassing support and generally improve reusability of pipelines, we should split the pipeline into 4 groups: input, validation, process and output.

This will allow end users to easily add to any part of the pipeline without needing to re-implement the whole thing.
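The grouping could be sketched roughly as follows. This is not Kim's implementation: the `Pipeline` class, the `*_pipes` attribute names and the dict used as a session are all assumptions made purely for illustration.

```python
# Hypothetical sketch: a pipeline split into four named groups of pipes.
GROUPS = ('input', 'validation', 'process', 'output')


class Pipeline(object):
    # Each group holds an ordered list of callables that take a session dict.
    input_pipes = []
    validation_pipes = []
    process_pipes = []
    output_pipes = []

    def run(self, session):
        # Groups always run in the same order; pipes run in list order.
        for group in GROUPS:
            for pipe in getattr(self, group + '_pipes'):
                pipe(session)
        return session


def strip_whitespace(session):
    session['data'] = session['data'].strip()


def require_non_empty(session):
    if not session['data']:
        raise ValueError('This field cannot be blank')


def store_output(session):
    session['output'] = session['data']


class StringPipeline(Pipeline):
    input_pipes = [strip_whitespace]
    validation_pipes = [require_non_empty]
    output_pipes = [store_output]


def upper_case(session):
    session['data'] = session['data'].upper()


class UpperStringPipeline(StringPipeline):
    # Only the process group is extended; the other groups are inherited.
    process_pipes = StringPipeline.process_pipes + [upper_case]


result = UpperStringPipeline().run({'data': '  bruce  '})
```

The subclass adds one processing step without restating the input, validation or output steps, which is the reuse this issue is after.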

Add `Field` docs and update flask-kim-example app

#46

Passing a non-int to IntegerForeignKey results in a crash, not a ValidationError

field.Boolean doesn't convert 'truthy' strings to boolean

field.Boolean should handle any 'truthy' value and, when permitted, should process the value into its boolean value, i.e. when a string is set, field.Boolean should return True.

class MyMapper(Mapper):
    activated = field.Boolean()

# scalar_sub_select will return an id which is a string rather than a Boolean scalar type.
>>> obj = query(Model, scalar_sub_select.label('activated')).first()

>>> MyMapper(obj=obj).serialize()
{'activated': 'asdasdsd'}
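A minimal sketch of the requested behaviour, assuming plain Python truthiness is what is wanted. `coerce_boolean` is a hypothetical helper, not part of Kim.

```python
# Hypothetical helper: coerce any value to a bool using Python truthiness.
def coerce_boolean(value):
    return bool(value)


# The string id from the example above would serialize as True.
serialized = {'activated': coerce_boolean('asdasdsd')}
```

Note that under plain truthiness the string 'false' is also True; whether such strings deserve special handling is a design decision this issue leaves open.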

Implement .partial() api for marshaling data with an existing object.

We need a better solution than just throwing the existing object at marshal

Split FieldError into Config/Runtime errors and Processing error types

#46 FieldError is currently used for all types of exceptions inside of Field. Field.invalid should make use of a new exception FieldInvalid when raising errors that occur during pipeline processing

Marshalling PUT request with missing required fields does not error.

class User(db.Model):

    __tablename__ = 'users'
    id = Column(String, primary_key=True)
    name = Column(String)
    password = Column(String)


class UserMapper(Mapper):
    __type__ = User

    id = field.String(read_only=True)
    name = field.String()
    password = field.String()

    __roles__ = {
        'public': ['name', 'id']
    }


def test_put_object_api_with_errors(flask_app, api, db_session, client):

    u1, = create_users(db_session, 1)
    data = {}
    resp = client.put(
        url_for('api.users.object', obj_id=u1.id),
        data=json.dumps(data), content_type='application/json')

    assert resp.status_code == 400
    assert resp.content_type == 'application/json'
    data = json.loads(resp.data.decode('utf-8'))
    assert data == {'error': True}

Nested mappers cannot access session.parent

session.parent can only be accessed in a collection: field.Collection(field.Nested(SomeMapper))

it will be null in the case of a simple field.Nested(SomeMapper)

Allow arbitrary string errors to be raised from invalid

Rather than having to add to extra_error_msgs

Raise error when trying to serialize None

m = Mapper(data=bla)
m.serialize()

obj should have been passed here instead of data, but this isn't made clear until it causes a weird error

Throw an error if you try to pass role (etc) to a Field rather than a Type

t.Collection requires Instantiated type

The following raises an exception:

links = Field(t.Collection(t.String), read_only=True)

TypeError: Collection() requires a valid Type as its first argument

However this works:

links = Field(t.Collection(t.String()), read_only=True)

Thanks,

Chris

Support for running test suite in python 2.x and python 3.x

Create nicer error message if getter not set on ForeignKey

Allow nested serializers to be specified as strings to avoid circular import issues

@krak3n reported this issue

Take a serializer definition such as

class SchedulableEntitySerializer(SQASerializer):
    id = Field(t.String, read_only=True)

    name = Field(t.String)
    description = Field(t.String, required=False)

    company_relationship = Field(
        NestedForeignKeyStr(CompanyRelationshipSerializer)
    )

Allowing Nested* to accept a Python import path as a string would help to solve some circular dependency issues.

class SchedulableEntitySerializer(SQASerializer):
    id = Field(t.String, read_only=True)

    name = Field(t.String)
    description = Field(t.String, required=False)

    company_relationship = Field(
        NestedForeignKeyStr('path.to.CompanyRelationshipSerializer',)
    )
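One way such a string could be resolved lazily is with the standard library's `importlib`. The sketch below assumes a hypothetical `resolve_class` helper that a Nested type would call the first time it needs the serializer; a stdlib class stands in for the serializer.

```python
import importlib
from collections import OrderedDict


def resolve_class(target):
    """Return ``target`` unchanged if it is a class, else import it by dotted path."""
    if not isinstance(target, str):
        return target
    module_path, _, class_name = target.rpartition('.')
    module = importlib.import_module(module_path)
    return getattr(module, class_name)


# 'collections.OrderedDict' stands in for 'path.to.CompanyRelationshipSerializer'.
resolved = resolve_class('collections.OrderedDict')
```

Because the import only happens when `resolve_class` is called, two modules can refer to each other's serializers without importing each other at module load time.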

Support dicts as well as objects

Rather than blindly using getattr/setattr, so people can work with dict-like objects if they want
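A rough sketch of what that could look like, assuming two hypothetical helpers (`get_value` and `set_value`) that dispatch on whether the target is a mapping:

```python
from collections.abc import Mapping, MutableMapping


def get_value(obj, name, default=None):
    # Dict-like objects are read by key, everything else by attribute.
    if isinstance(obj, Mapping):
        return obj.get(name, default)
    return getattr(obj, name, default)


def set_value(obj, name, value):
    # Dict-like objects are written by key, everything else by attribute.
    if isinstance(obj, MutableMapping):
        obj[name] = value
    else:
        setattr(obj, name, value)


class User(object):
    pass


user = User()
set_value(user, 'name', 'Bruce Wayne')

record = {}
set_value(record, 'name', 'Martha Wayne')
```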

Allow everything to be imported from the top level kim module

And update docs. This way, when we inevitably move things around, people won't be as upset.

custom validators should be passed the marshaled data not the raw data

If I have a DateTime field defined and a custom validator for it, the custom validator will be passed '2014-05-12T00:00:00Z', when it should probably be passed datetime(2014, 5, 12).
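The ordering being asked for could look like the sketch below: convert the raw value first, then hand the converted value to the validators. `marshal_datetime`, `process` and `record_type` are hypothetical names, not Kim APIs.

```python
from datetime import datetime


def marshal_datetime(raw):
    # Convert the raw ISO 8601 string into a datetime before validating.
    return datetime.strptime(raw, '%Y-%m-%dT%H:%M:%SZ')


def process(raw, validators):
    value = marshal_datetime(raw)
    for validator in validators:
        validator(value)  # validators see the marshaled value
    return value


seen = []


def record_type(value):
    # Stand-in for a custom validator; records what it was given.
    seen.append(type(value))


result = process('2014-05-12T00:00:00Z', [record_type])
```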

FieldOpts may raise FieldOptError - Field.init should handle and re-raise

#46 It would also be nice if there was something for exceptions raised by Opts classes.

At the moment they don’t know what field they came from, so the error won’t be very helpful.
They should raise errors which are caught in Field.init and re-raised with helpful errors.
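A sketch of the catch-and-re-raise idea. The class names follow the issue text, but the bodies are stand-ins written for illustration; in particular, passing `name` to `Field.__init__` is an assumption.

```python
class FieldOptError(Exception):
    """Raised by an Opts class that does not know which field owns it."""


class FieldError(Exception):
    """Raised by Field with enough context to locate the problem."""


class FieldOpts(object):
    def __init__(self, **opts):
        self.required = opts.pop('required', True)
        if opts:
            raise FieldOptError(
                'unexpected options: %s' % ', '.join(sorted(opts)))


class Field(object):
    opts_class = FieldOpts

    def __init__(self, name, **opts):
        self.name = name
        try:
            self.opts = self.opts_class(**opts)
        except FieldOptError as e:
            # Re-raise with the field's type and name attached.
            raise FieldError('%s "%s": %s' % (type(self).__name__, name, e))


try:
    Field('email', requried=False)  # deliberate typo in the option name
except FieldError as e:
    message = str(e)
```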

Custom validators are called with the wrong instance (value of self)

The value of self when a custom validator (validate_FOO method) is called will always be the first instance of that Serializer that ever existed, and not the current instance.

Support multiple/reusable top level validate methods

perhaps with hooks/decorator interface?

Passing nested serializers as strings doesn't work on SQASerializers

Nested roles

Description

Roles provide users of Kim with a powerful way to control which fields should be included when marshaling or serializing a Mapper. The ability to specify a role on a nested field has been available since v1, but the existing functionality only offers so much.

This proposal outlines new functionality that would allow users to specify the name of roles attached to a Nested field from inside Role definitions on other Mappers. We will also offer greater control over how Nested fields are processed by allowing users to set specific serialize and marshal roles.

Targeting Nested field roles

class UserMapper(Mapper):
    name = field.String()
    address = field.String()

    __roles__ = {'basic': blacklist('address')}

class EventMapper(Mapper):
    name = field.String()
    location = field.String()
    user = field.Nested(UserMapper)

    __roles__ = {
        'simple': whitelist('name', 'user@basic')
    }

mapper = EventMapper(data=json.loads(json_data)).marshal(role='simple')

So the user@basic syntax results in the nested mapper using the 'basic' role. The role selected this way may in turn specify roles for its own nested fields, and Nested fields would be processed the same way all the way down the chain.

Different Serialize and Marshal roles for Nested fields.

It's quite common that you want to allow different fields when serializing data to those permitted when marshaling. This feature will add two new properties to the NestedFieldOpts:

serialize_role
marshal_role

class EventMapper(Mapper):
    name = field.String()
    location = field.String()
    user = field.Nested(UserMapper, serialize_role='__default__', marshal_role='basic')

The new Nested roles syntax will allow users to specify different roles for serializing and marshaling:

class EventMapper(Mapper):
    name = field.String()
    location = field.String()
    user = field.Nested(UserMapper)

    __roles__ = {'simple': whitelist('name', 'user@serialize:__default__', 'user@marshal:basic')}
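The strings used in these role definitions could be parsed along the following lines. `parse_nested_role` is a hypothetical helper and the returned tuple shape is an assumption, since the proposal does not specify how parsing would be implemented.

```python
def parse_nested_role(entry):
    """Split 'field', 'field@role' or 'field@direction:role' into its parts.

    Returns (field_name, direction, role_name). ``direction`` is None for the
    generic 'field@role' form and ``role_name`` is None for a plain field name.
    """
    field_name, sep, role = entry.partition('@')
    if not sep:
        return field_name, None, None
    direction, sep, role_name = role.partition(':')
    if not sep:
        # No ':' present, so the whole suffix is the role name.
        return field_name, None, role
    return field_name, direction, role_name
```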

Deciding which role to use when processing a Nested field will flow something like the following:

* if marshaling and there is a nested marshal role, use that
* elif marshaling and there is a nested generic role for the field, use that
* elif marshaling and there is a `marshal_role` option set in NestedFieldOpts, use that
* elif marshaling and there is a `role` option set in NestedFieldOpts, use that
* else just use the `__default__` Role.

* if serializing and there is a nested serialize role, use that
* elif serializing and there is a nested generic role for the field, use that
* elif serializing and there is a `serialize_role` option set in NestedFieldOpts, use that
* elif serializing and there is a `role` option set in NestedFieldOpts, use that
* else just use the `__default__` Role.
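The precedence listed above can be expressed as a single lookup over an ordered list of candidates. This is only a sketch: `resolve_nested_role` and the two dicts it takes are hypothetical stand-ins for whatever Kim would use internally.

```python
DEFAULT_ROLE = '__default__'


def resolve_nested_role(direction, nested_roles, field_opts):
    """Pick the role for a Nested field following the proposed precedence.

    direction: 'marshal' or 'serialize'.
    nested_roles: roles targeted from the parent role definition, e.g.
        {'marshal': 'basic'} or {'generic': 'basic'}.
    field_opts: options set on the Nested field itself, e.g.
        {'marshal_role': 'basic', 'role': 'public'}.
    """
    candidates = (
        nested_roles.get(direction),          # user@marshal:x / user@serialize:x
        nested_roles.get('generic'),          # user@x
        field_opts.get(direction + '_role'),  # marshal_role / serialize_role
        field_opts.get('role'),               # role
    )
    for role in candidates:
        if role is not None:
            return role
    return DEFAULT_ROLE
```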

Conclusion

We feel this feature is going to add a huge amount of value to Kim. Nested is already one of the best features Kim offers. Providing more options for configuring how they work will hopefully lead to some great use cases.

We would love to hear any feedback anyone has on this feature. We want to make sure we get it right for everyone. If you have any suggestions or even just want to let us know you like the proposed approach then please don't hesitate.

Questions

When specifying marshal and serialize roles using the new nested role syntax, should they be provided separately (as seen in the example) or should we consider another option?

Another option that we considered was specifying the role in the following form:

user@{role_name} OR serialize:{role_name},marshal:{role_name}

We might also consider using an object over a string in the Role definition.

whitelist('name', nested('field_name', role='X', serialize='X', marshal='Y'), 'foo', ...)

Top level validate raising MappingErrors erases field validation errors

Raising MappingErrors in a top level validate method on a serializer causes any previous errors on other fields lower down the chain to be lost.

Line 279 of mapping.py is the culprit. Merging the existing errors with the errors raised from the MappingError could potentially be quite straightforward.

Happy to contribute if you approve.

Chris

Only one invalid field at a time

Currently only one invalid field can be reported at a time, because an exception is raised immediately after calling .invalid()

It would be helpful if more than one could be reported at a time, with the raise happening later on.

https://github.com/mikeywaites/kim/blob/release/1.0.0-beta/kim/field.py#L222
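The deferred raise could be sketched like this: each field's error is recorded as it occurs and a single exception carrying all of them is raised at the end. The exception classes and the `marshal` function here are stand-ins written for illustration, not Kim's own code.

```python
class FieldInvalid(Exception):
    pass


class MappingInvalid(Exception):
    def __init__(self, errors):
        super(MappingInvalid, self).__init__('invalid data')
        self.errors = errors


def required_string(value):
    if not isinstance(value, str) or not value:
        raise FieldInvalid('This is a required field')
    return value


def marshal(data, validators):
    output, errors = {}, {}
    for name, validate in validators.items():
        try:
            output[name] = validate(data.get(name))
        except FieldInvalid as e:
            # Record the error and keep going instead of stopping here.
            errors[name] = str(e)
    if errors:
        raise MappingInvalid(errors)
    return output


try:
    marshal({}, {'name': required_string, 'password': required_string})
except MappingInvalid as e:
    reported = e.errors
```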

Improve tests for parent_session by using a Mock pipeline.

Support validating the length of a field

Allow certain types to validate that their length is not over a certain value.
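A minimal sketch of such a check, assuming a hypothetical `validate_max_length` helper and using `ValueError` in place of whatever exception Kim would raise:

```python
def validate_max_length(value, max_length):
    # Works for any sized value: strings, lists, tuples and so on.
    if len(value) > max_length:
        raise ValueError(
            'Must be no longer than %d characters' % max_length)
    return value
```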

Make required=True the default

For consistency with kim1

Cannot pass arguments to nested serializers

Mapper system

#38

Mappers are the building blocks of Kim - they define how JSON output should look and how input JSON should be expected to look.

Mappers consist of Fields. Fields define the shape and nature of the data both when being outputted (serialised) and inputted (marshaled).

Mappers must define a model. This is the type that will be instantiated if a new object is marshaled through the mapper. If you only want a simple object, you can set this to dict or object.

class AuthorMapper(Mapper):
    __model__ = Author

    name = String()
    date_of_birth = Date()

Choices should probably be a Field param not a Type param

Partial and field source don't mix very well

name = field.String(source='full', required=True)

@marshaling.processes('name')
def print_name(session):
     print(session.data)

mapper.marshal({}, partial=True, obj=existing_obj)

expected result: session.data == existing_obj.full
actual result: session.data == None

Memoization

Kim acts as a single point of entry for data into the system via APIs. This makes it a great candidate for answering questions such as "Did my data change?" and "What did it change from?".

Field API changes.

We would store a private property on the Field instance called _changes which would store a ref to the changes processed by that field. Each field storing its changes will provide Mapper with a simplified API for retrieving all the changes for all its fields.

Field.__init__
+ self._changes = {}

Storing changes

We only care about changes that occur during marshaling. The most effective place for us to detect any change is in the update_output_to_source.

@pipe(run_if_none=True)
def update_output_to_source(session):
    """Store ``data`` at field.opts.source for a ``field`` inside
    of ``output``

    :param session: Kim pipeline session instance

    :raises: FieldError
    :returns: None
    """

    # memoize = session.field.opts.memoize
    source = session.field.opts.source
    try:
        if source == '__self__':
            attr_or_key_update(session.output, session.data)
        else:
            old_value = attr_or_key(session.output, source)
            new_value = set_attr_or_key(
                session.output, session.field.opts.source, session.data)
            if session.field.opts.get('memoize', False):
                session.field.set_changes(old_value, new_value)
    except (TypeError, AttributeError):
        raise FieldError('output does not support attribute or '
                         'key based set operations')

This would also allow users to easily disable the memoization for certain fields on a field by field basis by letting the user pass memoize=False to FieldOpts

Mapper API Changes

Mapper would also store a changes object which would contain the data collected from each field as each field is iterated over.

Mapper.__init__
+ self._changes

+ Mapper.get_changes()

For each successfully marshalled field, get_changes_from_field() would be called to pull the value changes and store them in Mapper._changes

        for field in fields:
            try:
                field.marshal(self.get_mapper_session(data, output))
                self.get_changes_from_field(field)
            except FieldInvalid as e:
                self.errors[field.name] = e.message
            except MappingInvalid as e:
                # handle errors from nested mappers.
                self.errors[field.name] = e.errors

The Mapper would also expose a method get_changes that would return a serialized version of the mappers changes dict.
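Put together, the field and mapper halves of the proposal might behave like the simplified sketch below. These `Field` and `Mapper` classes are stand-ins, and the shape of the changes dict (`old_value` / `new_value`) is an assumption; the proposal does not define it.

```python
class Field(object):
    def __init__(self, name, memoize=True):
        self.name = name
        self.memoize = memoize
        self._changes = {}

    def set_changes(self, old_value, new_value):
        # Only record something when the value actually changed.
        if old_value != new_value:
            self._changes = {'old_value': old_value, 'new_value': new_value}

    def marshal(self, data, output):
        old_value = getattr(output, self.name, None)
        new_value = data[self.name]
        setattr(output, self.name, new_value)
        if self.memoize:
            self.set_changes(old_value, new_value)


class Mapper(object):
    def __init__(self, fields):
        self.fields = fields
        self._changes = {}

    def marshal(self, data, output):
        for field in self.fields:
            field.marshal(data, output)
            if field._changes:
                self._changes[field.name] = field._changes
        return output

    def get_changes(self):
        return dict(self._changes)


class User(object):
    name = 'Bruce Wayne'
    title = 'CEO'


mapper = Mapper([Field('name'), Field('title')])
mapper.marshal({'name': 'Bruce Wayne', 'title': 'Super Hero'}, User())
changes = mapper.get_changes()
```

Only `title` appears in `changes` because `name` was marshaled to the value it already had.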

Nested

Nested change tracking should be as simple as calling get_changes() on the nested_mapper here. https://github.com/mikeywaites/kim/blob/release/1.0.0-beta/kim/pipelines/nested.py#L75

session.field.set_field_changes(nested_mapper.get_changes())

The set_field_changes method on Nested will be overridden to support a non-scalar data type.

Collection

Collection change tracking is also supported in a similar manner to Nested. We will simply call collection.set_field_changes(field.get_changes()) for each field that's marshalled in the collection.

https://github.com/mikeywaites/kim/blob/release/1.0.0-beta/kim/pipelines/collection.py#L45

Mapper fields should be reinstantiated when mapper is instantiated

Currently self.fields is in fact cls.fields, and changes made to fields persist across all Mapper instances.
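The problem and one possible fix can be reproduced with plain classes. This sketch uses stand-in `Field` and mapper classes and assumes `copy.deepcopy` is an acceptable way to give each instance its own fields.

```python
import copy


class Field(object):
    def __init__(self, required=True):
        self.required = required


class SharedMapper(object):
    # Every instance sees the same dict and the same Field objects.
    fields = {'name': Field()}


class CopyingMapper(object):
    fields = {'name': Field()}

    def __init__(self):
        # Give each instance its own copy of the class-level fields.
        self.fields = copy.deepcopy(type(self).fields)


a, b = SharedMapper(), SharedMapper()
a.fields['name'].required = False
shared_leaks = b.fields['name'].required is False

c, d = CopyingMapper(), CopyingMapper()
c.fields['name'].required = False
copy_isolated = d.fields['name'].required is True
```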

Replace @hooks api with new FieldOpts options.

Whilst it was a nice idea, in practice it's pretty awful to use and generally leads to lots of code duplication. That is coupled with the horror of self not being set when the method is called.

I think it would generally be a lot nicer to provide Field definitions with extra_inputs, extra_validators, extra_processors and extra_outputs.

An example definition would be something like:

@pipe
def my_validator(session):
    # do some stuff ...
    pass


class FooMapper(Mapper):
    id = field.Integer(extra_validators=[my_validator])

I think this will lead to a much cleaner, more predictable API all round.

Set Field.source to Field.name

#46 Field.opts.source should default to Field.opts.name:

self.source = opts.pop('source', self.name)

https://github.com/mikeywaites/kim/pull/46/files#diff-e5139502fee14d471fc89aa9b48b303dR28

Sending {'id': 'name'} to field.String produces a python object

In an attempt to test an invalid PATCH request I tried sending:

data = {
    'name': {'id': 'name'},
}

I expected this to actually end up as a str '{"id": "name"}' but SQA threw an error after trying to save it as an object.

Field.default is ignored when the field is None

    tags = field.Collection(
        field.Nested('TagMapper', getter=tag_getter),
        default=[], required=False)

With a field definition like the above, if the tags attr on the Mapper.type is None, default will not be returned. We should always return the default.
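The expected behaviour amounts to treating None the same as a missing value. A sketch, with `value_or_default` and `Post` as hypothetical names:

```python
def value_or_default(obj, source, default=None):
    # Fall back to the default when the attribute is missing *or* None.
    value = getattr(obj, source, None)
    return default if value is None else value


class Post(object):
    tags = None


class TaggedPost(object):
    tags = ['kim']


serialized_tags = value_or_default(Post(), 'tags', default=[])
```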

Documentation.

Provide full API documentation plus detailed usage examples of all of kim's features. The docs should also provide a detailed look at the design of kim and what responsibility each part plays.

I think that a scenario would be extremely useful for the usage docs. Something that takes a full working flask application and builds on a simple initial set of examples in a REST api and works up to the more complex areas where the SQASerializers begin to shine.

@krak3n @jackqu7 You guys have any ideas on a scenario the examples can play through?

We could also host the full working application on github so people can check it out and have a play.

Introduction to KIM ( what is kim, what are its goals and how does it help me)
Kim Walkthrough
Quickstart flask rest api with basic serializer
Part one - Defining Serializers / marshaling/serializing
Part two - Adding SQA model serialization
Part Three - Handling marshaling/serializing related objects
Part Four - Validation
Part Five - Defining Custom types / Extending KIM
Part Six - ....
Part Seven - ...
Full detailed apis docs (ensuring that all methods/class/funcs are properly documented)
Contributors page / Development how to
Release changelog/planned release schedule

Proposal: Role Specific Top Level Validate Method

Consider this use case:

On a POST request a top level validate needs to run a query to check the uniqueness of an object being created in the DB.

On a PUT request the top level validator does not need to do this check as the object already exists.

We could put a condition in the top level validate method to check the request method, however seeing as kim already has the idea of roles, this feels like something which could be attached to this:

def validator1(serializer, data):
    pass

def validator2(serializer, data):
    pass

def validator3(serializer, data):
    pass

class Fooizer(SQASerializer):

    foo_id = Field(t.String)
    bar_id = Field(t.String)

    class Meta(object):
        roles = {
            'update': {
                # Called before final top level validate
                'validators': [validator1, validator2, validator3],
                'fields': blacklist('foobar')
            }
        }

    def validate(self, data):
        # This is always called last
        pass

This would allow us to have custom top level validation on a role by role basis that can be reused across our serializers.

What do you think?

Chris

getter invalid error returns nested object.

With a getter function on a nested field in a field.Collection, the error format is not consistent with field.invalid.

def pillar_getter(session):

    if session.data and 'id' in session.data:
        pillar = Pillar.get_by_id(session.data['id']).one_or_none()
        if pillar.company.organisation_id != current_user.company.organisation_id:
            return None
        else:
            return pillar

class WeightedPillarMapper(WeightedComponentMapper):

    __type__ = PerformanceTemplatePillar

    pillar = field.Nested('PillarMapper', role='pillar_weighting', getter=pillar_getter)
    kpis = field.Collection(
        field.Nested('WeightedKpiMapper',
                     getter=kpi_getter,
                     allow_updates=True,
                     allow_create=True),
        required=False,
        default=[],
        extra_marshal_pipes={
            'process': [set_order],
            'validation': [check_for_duplicate_kpis, validate_weightings]
        },
        error_msgs = {
            'duplicate_error': 'You can\'t specify a kpi more than once in a pillar.',
            'invalid_weighting': 'Please ensure all the kpi weightings inside '
                                 'each pillar add up to 100%.'
        }
    )

If the getter fails the error message format is an object in the form of {'pillar': 'pillar not found'}. Taking an error response format from the Vizibl API we end up with

            exp = {
                'status': 400,
                'errors': [
                    {
                        'field': 'pillars',
                        'error': {"pillar": "pillar not found"}
                    }
                ]
            }

We'd expect to see consistent error format in the form of

            exp = {
                'status': 400,
                'errors': [
                    {
                        'field': 'pillars',
                        'error': "pillar not found"
                    }
                ]
            }

Data should be retrieved from field.source when serializing, not field.name

Allow roles to be passed as strings to nested

Implement output pipe

#46 Fields will be responsible for pushing themselves into the output data when serialising and marshaling.

This will allow fields to do things like taking a name input and populating first name and last name on a model
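As an illustration of that name example, an output step might split one input value across two keys. `name_output` and the dict-based session are assumptions for the sketch, not Kim's API.

```python
def name_output(session):
    # Split on the first space; anything after it is the last name.
    first, _, last = session['data'].partition(' ')
    session['output']['first_name'] = first
    session['output']['last_name'] = last


session = {'data': 'Bruce Wayne', 'output': {}}
name_output(session)
```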

versions of six lower than 1.9.0 not working

Some weird issue with the metaclass stuff in six was detected on version 1.6.1

/home/vagrant/.virtualenvs/vizibl-api/local/lib/python2.7/site-packages/kim/__init__.py in <module>()
      7
      8
----> 9 from .mapper import Mapper
     10 from .field import Field

/home/vagrant/.virtualenvs/vizibl-api/local/lib/python2.7/site-packages/kim/mapper.py in <module>()
    183
    184
--> 185 class Mapper(six.with_metaclass(MapperMeta, object)):
    186     """Mappers are the building blocks of Kim - they define how JSON output
    187     should look and how input JSON should be expected to look.

/home/vagrant/.virtualenvs/vizibl-api/local/lib/python2.7/site-packages/six.pyc in with_metaclass(meta, *bases)
    629 def with_metaclass(meta, *bases):
    630     """Create a base class with a metaclass."""
--> 631     return meta("NewBase", bases, {})
    632
    633 def add_metaclass(metaclass):

/home/vagrant/.virtualenvs/vizibl-api/local/lib/python2.7/site-packages/kim/mapper.py in __init__(cls, classname, bases, dict_)
    179
    180     def __init__(cls, classname, bases, dict_):
--> 181         _MapperConfig.setup_mapping(cls, classname, dict_)
    182         type.__init__(cls, classname, bases, dict_)
    183

/home/vagrant/.virtualenvs/vizibl-api/local/lib/python2.7/site-packages/kim/mapper.py in setup_mapping(cls, cls_, classname, dict_)
     85     def setup_mapping(cls, cls_, classname, dict_):
     86         cfg_cls = _MapperConfig
---> 87         cfg_cls(cls_, classname, dict_)
     88
     89     def __init__(self, cls_, classname, dict_):

/home/vagrant/.virtualenvs/vizibl-api/local/lib/python2.7/site-packages/kim/mapper.py in __init__(self, cls_, classname, dict_)
     99         # the user is looking to override the default role and dont create one
    100         # here.
--> 101         if '__default__' not in self.cls.__roles__:
    102             self.cls.roles['__default__'] = \
    103                 whitelist(*self.cls.fields.keys())

AttributeError: type object 'NewBase' has no attribute '__roles__'

Implement Nested field type

Implement the Nested field type and associated pipelines.

Nested should support all the features still relevant from Kim 1. We should also consider the agreed upgrades required for the 1.0.0 release like named roles etc.

Marshaling collection with default=None and no data causes FieldError

class WeightedPillarMapper(Mapper):
    id = field.String()
    pillar = field.Nested('PillarMapper')

class TemplateMapper(Mapper):

    pillars = field.Collection(
        field.Nested('WeightedPillarMapper', required=False, allow_create=True), required=False)

With the above Mapper configuration, attempting to marshal TemplateMapper with no pillars key present in the JSON causes a FieldError to be raised. Setting the field.Collection(default=[]) kwarg resolves the issue.

Possibly something related to run_if_none or something like that.

Don't expect a specific type for types.Collection

types.Collection currently implements the TypedType interface and therefore enforces a check that the type is a list.

This should really deal with iterables and not enforce any such check.
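A looser check could accept any iterable while still rejecting strings, which are iterable but rarely meant as collections. `is_collection` is a hypothetical helper; whether dicts should also be rejected is left open here.

```python
from collections.abc import Iterable


def is_collection(value):
    # Strings and bytes are iterable but should not count as collections.
    if isinstance(value, (str, bytes)):
        return False
    return isinstance(value, Iterable)
```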

Mapper class names in registry should not be required to be globally unique

For example, if your application includes third party libraries that define their own Kim Mappers, you're likely to have conflicts with common names such as User.

A simple workaround could be to namespace them by module, using sensible defaults to mean in most cases it's not required to use the full path.

i.e. if the Nested is in foo.serializers, look there first rather than in bar.serializers.
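A sketch of a module-namespaced registry with that lookup order. The `Registry` class and its method signatures are assumptions, and strings stand in for Mapper classes.

```python
class Registry(object):
    def __init__(self):
        self._mappers = {}

    def register(self, module, name, mapper):
        self._mappers[(module, name)] = mapper

    def get(self, name, current_module):
        # A dotted path is treated as fully qualified.
        if '.' in name:
            module, _, short_name = name.rpartition('.')
            return self._mappers[(module, short_name)]
        # Prefer a mapper defined in the module doing the lookup.
        if (current_module, name) in self._mappers:
            return self._mappers[(current_module, name)]
        # Otherwise accept the short name only if it is unambiguous.
        matches = [m for (_, n), m in self._mappers.items() if n == name]
        if len(matches) == 1:
            return matches[0]
        raise LookupError('Mapper name %r is ambiguous or unknown' % name)


registry = Registry()
registry.register('foo.serializers', 'User', 'foo-user')
registry.register('bar.serializers', 'User', 'bar-user')
registry.register('bar.serializers', 'Company', 'bar-company')
```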
