
kim's Introduction

Kim: A JSON Serialization and Marshaling framework


Introducing Kim:

.. code-block:: python
>>> mapper = UserMapper(data=response.json())
>>> mapper.marshal()
User(id='one', name='Bruce Wayne', title='CEO/Super Hero')
>>> user_two = User.query.get('two')
>>> mapper = UserMapper(obj=user_two)
>>> mapper.serialize()
{u'id': 'two', u'name': 'Martha Wayne', 'title': 'Mother of Batman'}

Kim Features

Kim is a feature-packed framework for handling even the most complex marshaling and serialization requirements.

  • Web framework agnostic - Flask, Django, Framework-XXX supported!
  • Highly customisable field processing system
  • Security focused
  • Control included fields with powerful roles system
  • Handle mixed data types with polymorphic mappers
  • Marshal and Serialize nested objects

Kim officially supports Python 2.7 & 3.3–3.5

Installation

Install Kim using pip:

.. code-block:: bash
$ pip install py-kim

Documentation

Learn all of Kim's features with the step-by-step instructions, or check out the quickstart guide for a rapid overview.

http://kim.readthedocs.io/en/latest/

kim's People

Contributors

elspawaczo, emulbreh, jackqu7, krak3n, larsks, margaferrez, mikeywaites, radeklos


kim's Issues

Attach roles to mappings as opposed to serializers.

Roles are currently specified in the Meta class of kim.serializers.Serializer. They provide a way to affect the fields used at run time in different scenarios. Equally they provide the user with a way to affect which fields are relevant for marshaling/serializing when operating on related objects and data structures.

This has ultimately made the Serializer class more powerful than the underlying mapping that it constructs, which was not the original intention. Serializer should remain syntactic sugar for easily creating Mapping objects in a re-usable way.

This enhancement will allow Mapping objects to accept and make use of roles as a way to manage which Types are mapped. Any Roles defined on the Serializer are therefore just a way to define a role on the underlying mapping.

https://github.com/mikeywaites/kim/blob/master/kim/mapping.py#L15 would accept a new roles kwarg which would contain a collection of Role objects as seen previously in the Serializer class.

Mapping would then define a method get_iterable, called with an optional role identifier as a kwarg, that would return a mapping constructed from the role or from the default (all fields).

Serializer would simply pass any provided role straight onto the underlying mapping here

https://github.com/mikeywaites/kim/blob/master/kim/serializers.py#L178
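
A minimal sketch of the shape this proposal describes, not Kim's actual implementation. The `Role` and `Mapping` classes below are simplified stand-ins, fields are represented by plain strings, and the field names are hypothetical.

```python
class Role(object):
    """Hypothetical stand-in: a named whitelist of field names."""

    def __init__(self, name, *field_names):
        self.name = name
        self.field_names = set(field_names)


class Mapping(object):
    """Simplified stand-in for kim.mapping.Mapping (fields are strings here)."""

    def __init__(self, *fields, **kwargs):
        self.fields = list(fields)
        # Proposed new kwarg: a collection of Role objects.
        self.roles = {role.name: role for role in kwargs.get('roles', [])}

    def get_iterable(self, role=None):
        """Return the fields for ``role``, or all fields when no role is given."""
        if role is None:
            return list(self.fields)
        allowed = self.roles[role].field_names
        return [f for f in self.fields if f in allowed]


mapping = Mapping('id', 'name', 'password',
                  roles=[Role('public', 'id', 'name')])
print(mapping.get_iterable())               # all fields
print(mapping.get_iterable(role='public'))  # only the fields in the role
```

With this shape, a Serializer would only need to forward its role identifier to `get_iterable`.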

Group pipelines into processing steps

#46 To aid subclassing support and generally improve reusability of pipelines, we should split the pipeline into four groups: input, validation, process and output.

This will allow end users to easily add to any part of the pipeline without needing to re-implement the whole thing.
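
A rough sketch of how the four groups could be modelled, assuming pipes are plain callables that take and return the data. The `Pipeline` class and the example pipes are hypothetical and are not part of Kim.

```python
from collections import OrderedDict

GROUPS = ('input', 'validation', 'process', 'output')


class Pipeline(object):
    """Hypothetical pipeline whose pipes are grouped into four ordered steps."""

    def __init__(self):
        self.steps = OrderedDict((group, []) for group in GROUPS)

    def add(self, group, pipe):
        # Append a pipe to one group without touching the others.
        self.steps[group].append(pipe)
        return self

    def run(self, data):
        # Groups always run in the fixed order defined by GROUPS.
        for group in GROUPS:
            for pipe in self.steps[group]:
                data = pipe(data)
        return data


def read_input(data):
    return data.strip()


def is_not_empty(data):
    if not data:
        raise ValueError('This is a required field')
    return data


def to_upper(data):
    return data.upper()


base = Pipeline().add('input', read_input).add('validation', is_not_empty)
# A subclass or end user only adds to the group they care about.
result = base.add('process', to_upper).run('  bruce  ')
print(result)
```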

field.Boolean doesn't convert 'truthy' strings to boolean

field.Boolean should handle any 'truthy' value and, when permitted, process the value into its boolean equivalent, i.e. when a string is set, field.Boolean should return True.

class MyMapper(Mapper):
     activated = field.Boolean()

# scalar_sub_select will return an id which is a string rather than a Boolean scalar type.
>>> obj = query(Model, scalar_sub_select.label('activated')).first()

>>> MyMapper(obj=obj).serialize()
{'activated': 'asdasdsd'}
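
One possible coercion rule, sketched as a standalone function rather than as Kim's pipe. The `to_boolean` name and the set of strings treated as false are assumptions made for illustration; the issue only asks that a set string such as 'asdasdsd' becomes True.

```python
def to_boolean(value, false_strings=('', '0', 'false')):
    """Coerce ``value`` to a bool.

    ``false_strings`` is an assumed list of strings treated as False;
    any other string is considered truthy.
    """
    if isinstance(value, str):
        return value.strip().lower() not in false_strings
    return bool(value)


print(to_boolean('asdasdsd'))
print(to_boolean(None))
```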

Marshalling PUT request with missing required fields does not error.

class User(db.Model):

    __tablename__ = 'users'
    id = Column(String, primary_key=True)
    name = Column(String)
    password = Column(String)


class UserMapper(Mapper):
    __type__ = User

    id = field.String(read_only=True)
    name = field.String()
    password = field.String()

    __roles__ = {
        'public': ['name', 'id']
    }


def test_put_object_api_with_errors(flask_app, api, db_session, client):

    u1, = create_users(db_session, 1)
    data = {}
    resp = client.put(
        url_for('api.users.object', obj_id=u1.id),
        data=json.dumps(data), content_type='application/json')

    assert resp.status_code == 400
    assert resp.content_type == 'application/json'
    data = json.loads(resp.data.decode('utf-8'))
    assert data == {'error': True}

t.Collection requires Instantiated type

The following raises an exception:

links = Field(t.Collection(t.String), read_only=True)
TypeError: Collection() requires a valid Type as its first argument

However this works:

links = Field(t.Collection(t.String()), read_only=True)

Thanks,

Chris

Allow nested serializers to be specified as strings to avoid circular import issues

@krak3n reported this issue

Take a serializer definition such as

class SchedulableEntitySerializer(SQASerializer):
    id = Field(t.String, read_only=True)

    name = Field(t.String)
    description = Field(t.String, required=False)

    company_relationship = Field(
        NestedForeignKeyStr(CompanyRelationshipSerializer)
    )

Allowing Nested* to accept a Python import path as a string would help to solve some circular dependency issues.

class SchedulableEntitySerializer(SQASerializer):
    id = Field(t.String, read_only=True)

    name = Field(t.String)
    description = Field(t.String, required=False)

    company_relationship = Field(
        NestedForeignKeyStr('path.to.CompanyRelationshipSerializer')
    )
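
A sketch of how a dotted path could be resolved lazily using the standard library. The `resolve` helper is hypothetical; the important design point is that it would be called when the nested serializer is first used, not when the class body is evaluated, so both modules have finished importing by then.

```python
import importlib


def resolve(target):
    """Return ``target`` unchanged, or import it when given as a dotted path."""
    if not isinstance(target, str):
        return target
    module_path, _, attr = target.rpartition('.')
    module = importlib.import_module(module_path)
    return getattr(module, attr)


# Demonstrated with a standard library class in place of a serializer.
print(resolve('collections.OrderedDict'))
```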

Nested roles

Description

Roles provide users of Kim with a powerful way to control which fields should be included when marshaling or serializing a Mapper. The ability to specify a role on a nested field has been available since v1, but the existing functionality only offers so much.

This proposal outlines new functionality that would allow users to specify the name of roles attached to a Nested field from inside Role definitions on other Mappers. We will also offer greater control over how Nested fields are processed by allowing users to set specific serialize and marshal roles.

Targeting Nested field roles

class UserMapper(Mapper):
    name = field.String()
    address = field.String()

    __roles__ = {'basic': blacklist('address')}

class EventMapper(Mapper):
    name = field.String()
    location = field.String()
    user = field.Nested(UserMapper)

    __roles__ = {
        'simple': whitelist('name', 'user@basic')
    }

mapper = EventMapper(data=json.loads(json_data)).marshal(role='simple')

The user@basic syntax results in the nested mapper using the 'basic' role. The 'basic' role may itself specify a role for one of its own nested fields, and nested fields are processed the same way all the way down the chain.


Different Serialize and Marshal roles for Nested fields.

It's quite common to want to allow different fields when serializing data from those permitted when marshaling. This feature will add two new properties to NestedFieldOpts:

  1. serialize_role
  2. marshal_role

class EventMapper(Mapper):
    name = field.String()
    location = field.String()
    user = field.Nested(UserMapper, serialize_role='__default__', marshal_role='basic')

The new Nested roles syntax will allow users to specify different roles for serializing and marshaling:

class EventMapper(Mapper):
    name = field.String()
    location = field.String()
    user = field.Nested(UserMapper)

    __roles__ = {'simple': whitelist('name', 'user@serialize:__default__', 'user@marshal:basic')}
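
To make the proposed string syntax concrete, here is a small parser sketch. The `parse_role_entry` function and the 'generic' label for an undirected nested role are assumptions for illustration and are not part of Kim.

```python
def parse_role_entry(entry):
    """Split a role entry into (field_name, direction, role).

    'name'                -> ('name', None, None)
    'user@basic'          -> ('user', 'generic', 'basic')
    'user@marshal:basic'  -> ('user', 'marshal', 'basic')
    """
    field_name, _, spec = entry.partition('@')
    if not spec:
        return field_name, None, None
    direction, sep, role = spec.partition(':')
    if not sep:
        # No direction given, so the role applies in both directions.
        return field_name, 'generic', direction
    return field_name, direction, role


for entry in ('name', 'user@basic', 'user@serialize:__default__'):
    print(parse_role_entry(entry))
```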

Deciding which role to use when processing a Nested field will flow something like the following:

* if marshaling and there is a nested marshal role, use that
* elif marshaling and there is a nested generic role for the field, use that
* elif marshaling and there is a `marshal_role` option set in NestedFieldOpts, use that
* elif marshaling and there is a `role` option set in NestedFieldOpts, use that
* else just use the `__default__` Role.
* if serializing and there is a nested serialize role, use that
* elif serializing and there is a nested generic role for the field, use that
* elif serializing and there is a `serialize_role` option set in NestedFieldOpts, use that
* elif serializing and there is a `role` option set in NestedFieldOpts, use that
* else just use the `__default__` Role.
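
The precedence above can be sketched as a single lookup. The `resolve_nested_role` function and the dict shapes for `nested_roles` and `opts` are assumptions for illustration; in Kim the options would live on NestedFieldOpts.

```python
def resolve_nested_role(direction, nested_roles, opts):
    """Pick the role for a Nested field.

    direction:    'marshal' or 'serialize'
    nested_roles: roles given for this field by the parent Role, keyed by
                  'marshal', 'serialize' or 'generic' (assumed shape)
    opts:         dict standing in for NestedFieldOpts
    """
    candidates = (
        nested_roles.get(direction),      # nested marshal/serialize role
        nested_roles.get('generic'),      # nested generic role
        opts.get(direction + '_role'),    # marshal_role / serialize_role
        opts.get('role'),                 # generic role option
    )
    for role in candidates:
        if role is not None:
            return role
    return '__default__'


print(resolve_nested_role('marshal', {'marshal': 'basic'}, {'role': 'public'}))
print(resolve_nested_role('serialize', {}, {}))
```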

Conclusion

We feel this feature is going to add a huge amount of value to Kim. Nested is already one of the best features Kim offers. Providing more options for configuring how they work will hopefully lead to some great use cases.

We would love to hear any feedback anyone has on this feature. We want to make sure we get it right for everyone. If you have any suggestions, or even just want to let us know you like the proposed approach, then please don't hesitate.

Questions

When specifying marshal and serialize roles using the new nested role syntax, should they be provided separately (as seen in the example) or should we consider another option?

Another option that we considered was specifying the role in the following form:

user@{role_name} OR serialize:{role_name},marshal:{role_name}

We might also consider using an object over a string in the Role definition.

whitelist('name', nested('field_name', role='X', serialize='X', marshal='Y'), 'foo', ...)

Raising MappingErrors in top level validate erases field validation errors

Raising MappingErrors in a top level validate method on a serializer causes any previous errors on other fields lower down the chain to be lost.

Line 279 of mapping.py is the culprit. Merging the existing errors with the errors raised from the MappingError could potentially be quite straightforward.
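
A sketch of what such a merge could look like, assuming both sets of errors are plain dicts keyed by field name. The `merge_errors` helper and the choice to keep both messages in a list on conflict are assumptions, not Kim's behaviour.

```python
def merge_errors(field_errors, mapping_errors):
    """Combine field-level errors with those raised by a top level validate."""
    merged = dict(field_errors)
    for name, message in mapping_errors.items():
        if name in merged:
            existing = merged[name]
            if not isinstance(existing, list):
                existing = [existing]
            # Keep the earlier field error alongside the new one.
            merged[name] = existing + [message]
        else:
            merged[name] = message
    return merged


errors = merge_errors(
    {'name': 'This is a required field'},
    {'name': 'Name already taken', 'email': 'Invalid email'})
print(errors)
```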

Happy to contribute if you approve.

Chris

Mapper system

#38

Mappers are the building blocks of Kim - they define how JSON output should look and how input JSON should be expected to look.

Mappers consist of Fields. Fields define the shape and nature of the data both when being outputted (serialised) and inputted (marshaled).

Mappers must define a __model__. This is the type that will be instantiated if a new object is marshaled through the mapper. If you only want a simple object, you can set this to dict or object.

class AuthorMapper(Mapper):
    __model__ = Author

    name = String()
    date_of_birth = Date()

Partial and field source don't mix very well

name = field.String(source='full', required=True)

@marshaling.processes('name')
def print_name(session):
     print(session.data)

mapper.marshal({}, partial=True, obj=existing_obj)

expected result: session.data == existing_obj.full
actual result: session.data == None

Memoization

Kim acts as a single point of entry for data into the system via APIs. This makes it a great candidate for answering questions such as "Did my data change?" and "What did it change from?".

Field API changes.

We would store a private property on the Field instance called _changes which would store a ref to the changes processed by that field. Each field storing its changes will provide Mapper with a simplified API for retrieving all the changes for all its fields.

Field.__init__
+ self._changes = {}

Storing changes

We only care about changes that occur during marshaling. The most effective place for us to detect any change is in the update_output_to_source pipe.

@pipe(run_if_none=True)
def update_output_to_source(session):
    """Store ``data`` at field.opts.source for a ``field`` inside
    of ``output``

    :param session: Kim pipeline session instance

    :raises: FieldError
    :returns: None
    """

    # memoize = session.field.opts.memoize
    source = session.field.opts.source
    try:
        if source == '__self__':
            attr_or_key_update(session.output, session.data)
        else:
            old_value = attr_or_key(session.output, source)
            new_value = set_attr_or_key(
                session.output, session.field.opts.source, session.data)
            if session.field.opts.get('memoize', False):
                session.field.set_changes(old_value, new_value)
    except (TypeError, AttributeError):
        raise FieldError('output does not support attribute or '
                         'key based set operations')

This would also allow users to disable memoization on a field by field basis by passing memoize=False to FieldOpts.

Mapper API Changes

Mapper would also store a changes object which would contain the data collected from each field as the fields are iterated over.

Mapper.__init__
+ self._changes

+ Mapper.get_changes()

For each successfully marshalled field, get_changes_from_field() would be called to pull the value changes and store them in Mapper._changes.

        for field in fields:
            try:
                field.marshal(self.get_mapper_session(data, output))
                self.get_changes_from_field(field)
            except FieldInvalid as e:
                self.errors[field.name] = e.message
            except MappingInvalid as e:
                # handle errors from nested mappers.
                self.errors[field.name] = e.errors

The Mapper would also expose a method get_changes that would return a serialized version of the mappers changes dict.
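
The following is a simplified sketch of the proposed change tracking, using plain dicts in place of Kim's sessions and pipelines. The classes are stand-ins, and the shape of the stored change (`old_value` / `new_value` keys) is an assumption.

```python
class Field(object):
    """Stand-in field that records the change it made while marshaling."""

    def __init__(self, name, memoize=True):
        self.name = name
        self.memoize = memoize
        self._changes = {}

    def set_changes(self, old_value, new_value):
        if self.memoize and old_value != new_value:
            self._changes = {'old_value': old_value, 'new_value': new_value}

    def marshal(self, data, output):
        old_value = output.get(self.name)
        output[self.name] = data[self.name]
        self.set_changes(old_value, output[self.name])


class Mapper(object):
    """Stand-in mapper that collects changes from each of its fields."""

    def __init__(self, fields):
        self.fields = fields
        self._changes = {}

    def get_changes_from_field(self, field):
        if field._changes:
            self._changes[field.name] = field._changes

    def marshal(self, data, output):
        for field in self.fields:
            field.marshal(data, output)
            self.get_changes_from_field(field)
        return output

    def get_changes(self):
        return dict(self._changes)


mapper = Mapper([Field('name'), Field('title', memoize=False)])
existing = {'name': 'Bruce', 'title': 'CEO'}
mapper.marshal({'name': 'Batman', 'title': 'Hero'}, existing)
changes = mapper.get_changes()
print(changes)
```

Only `name` appears in the result because `title` was declared with `memoize=False`.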

Nested

Nested change tracking should be as simple as calling get_changes() on the nested_mapper here. https://github.com/mikeywaites/kim/blob/release/1.0.0-beta/kim/pipelines/nested.py#L75

session.field.set_field_changes(nested_mapper.get_changes())

The set_field_changes method on Nested will be overridden to support a non-scalar data type.

Collection

Collection change tracking is also supported in a similar manner to Nested. We will simply call collection.set_field_changes(field.get_changes()) for each field that's marshalled in the collection.

https://github.com/mikeywaites/kim/blob/release/1.0.0-beta/kim/pipelines/collection.py#L45

replace @hooks api with new FieldOpts options.

Whilst it was a nice idea, in practice it's pretty awful to use and generally leads to lots of code duplication. That is coupled with the horror of self not being set when the method is called.

I think it would generally be a lot nicer to provide Field definitions with extra_inputs, extra_validators, extra_processors and extra_outputs.

An example definition would be something like.

@pipe
def my_validator(session):
    # do some stuff ...
    pass


class FooMapper(Mapper):
    id = field.Integer(extra_validators=[my_validator])

I think this will lead to a much cleaner, more predictable API all round.
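
As a rough illustration of the idea, the sketch below appends user-supplied validators to a field's built-in ones. The `Integer` class here is a stand-in that works on raw values rather than Kim sessions, and `is_positive` is a hypothetical validator.

```python
class Integer(object):
    """Stand-in field: built-in validation followed by extra validators."""

    def __init__(self, extra_validators=None):
        self.validators = [self.is_integer] + list(extra_validators or [])

    @staticmethod
    def is_integer(value):
        # bool is a subclass of int, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, int):
            raise ValueError('This field was invalid type')
        return value

    def marshal(self, value):
        for validator in self.validators:
            value = validator(value)
        return value


def is_positive(value):
    if value <= 0:
        raise ValueError('Value must be positive')
    return value


field = Integer(extra_validators=[is_positive])
print(field.marshal(3))
```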

Field.default is ignored when the field is None

    tags = field.Collection(
        field.Nested('TagMapper', getter=tag_getter),
        default=[], required=False)

With a field definition like the above, if the tags attr on the Mapper.__type__ is None, default will not be returned. We should always return the default.
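
The expected behaviour can be sketched as follows. `get_with_default` and `Post` are hypothetical names; accepting a callable default is an assumption added here so that a mutable default such as a list is not shared between objects.

```python
def get_with_default(obj, name, default=None):
    """Return ``obj.name``, falling back to ``default`` when it is None."""
    value = getattr(obj, name, None)
    if value is None:
        # A callable default (e.g. ``list``) yields a fresh value each time.
        return default() if callable(default) else default
    return value


class Post(object):
    tags = None


print(get_with_default(Post(), 'tags', default=list))
```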

Documentation.

Provide full API documentation plus detailed usage examples of all of kim's features. The docs should also provide a detailed look at the design of kim and what responsibility each part plays.

I think that a scenario would be extremely useful for the usage docs. Something that takes a full working flask application and builds on a simple initial set of examples in a REST api and works up to the more complex areas where the SQASerializers begin to shine.

@krak3n @jackqu7 You guys have any ideas on a scenario the examples can play through?

We could also host the full working application on github so people can check it out and have a play.

  • Introduction to KIM (what is kim, what are its goals and how does it help me)
  • Kim Walkthrough
  • Quickstart flask rest api with basic serializer
  • Part one - Defining Serializers / marshaling/serializing
  • Part two - Adding SQA model serialization
  • Part Three - Handling marshaling/serializing related objects
  • Part Four - Validation
  • Part Five - Defining Custom types / Extending KIM
  • Part Six - ....
  • Part Seven - ...
  • Full detailed apis docs (ensuring that all methods/class/funcs are properly documented)
  • Contributors page / Development how to
  • Release changelog/planned release schedule

Proposal: Role Specific Top Level Validate Method

Consider this use case:

On a POST request a top level validate needs to run a query to check the uniqueness of an object being created in the DB.

On a PUT request the top level validator does not need to do this check as the object already exists.

We could put a condition in the top level validate method to check the request method, however seeing as kim already has the idea of roles, this feels like something which could be attached to this:

def validator1(serializer, data):
    pass

def validator2(serializer, data):
    pass

def validator3(serializer, data):
    pass

class Fooizer(SQASerializer):

    foo_id = Field(t.String)
    bar_id = Field(t.String)

    class Meta(object):
        roles = {
            'update': {
                # Called before final top level validate
                'validators': [validator1, validator2, validator3],
                'fields': blacklist('foobar')
            }
        }

    def validate(self, data):
        # This is always called last
        pass

This would allow us to have custom top level validation on a role by role basis that can be reused across our serializers.
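
A sketch of the call order being proposed, using a plain class in place of SQASerializer. The `run_validation` method, the `check_unique` validator and the role names are hypothetical.

```python
calls = []


def check_unique(serializer, data):
    # Stand-in for a uniqueness query that is only needed on create.
    calls.append('check_unique')


class Fooizer(object):
    roles = {
        'create': {'validators': [check_unique]},
        'update': {'validators': []},
    }

    def validate(self, data):
        # The top level validate is always called last.
        calls.append('validate')

    def run_validation(self, data, role=None):
        for validator in self.roles.get(role, {}).get('validators', []):
            validator(self, data)
        self.validate(data)


Fooizer().run_validation({'foo_id': '1'}, role='create')
Fooizer().run_validation({'foo_id': '1'}, role='update')
print(calls)
```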

What do you think?

Chris

getter invalid error returns nested object.

With a getter function on a nested field in a field.Collection, the error format is not consistent with field.invalid.

def pillar_getter(session):

    if session.data and 'id' in session.data:
        pillar = Pillar.get_by_id(session.data['id']).one_or_none()
        if pillar.company.organisation_id != current_user.company.organisation_id:
            return None
        else:
            return pillar

class WeightedPillarMapper(WeightedComponentMapper):

    __type__ = PerformanceTemplatePillar

    pillar = field.Nested('PillarMapper', role='pillar_weighting', getter=pillar_getter)
    kpis = field.Collection(
        field.Nested('WeightedKpiMapper',
                     getter=kpi_getter,
                     allow_updates=True,
                     allow_create=True),
        required=False,
        default=[],
        extra_marshal_pipes={
            'process': [set_order],
            'validation': [check_for_duplicate_kpis, validate_weightings]
        },
        error_msgs = {
            'duplicate_error': 'You can\'t specify a kpi more than once in a pillar.',
            'invalid_weighting': 'Please ensure all the kpi weightings inside '
                                 'each pillar add up to 100%.'
        }
    )

If the getter fails the error message format is an object in the form of {'pillar': 'pillar not found'}. Taking an error response format from the Vizibl API we end up with

            exp = {
                'status': 400,
                'errors': [
                    {
                        'field': 'pillars',
                        'error': {"pillar": "pillar not found"}
                    }
                ]
            }

We'd expect to see consistent error format in the form of

            exp = {
                'status': 400,
                'errors': [
                    {
                        'field': 'pillars',
                        'error': "pillar not found"
                    }
                ]
            }

Implement output pipe

#46 Fields will be responsible for pushing themselves into the output data when serializing and marshaling.

This will allow fields to do things like taking a name input and populating first name and last name on a model
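
For instance, an output step for a name field might look like the sketch below. The `name_output` function, the use of a dict as the output, and the rule of splitting on the first space are all assumptions for illustration.

```python
def name_output(data, output):
    """Write a full name into ``first_name`` and ``last_name`` on the output."""
    first, _, last = data.partition(' ')
    output['first_name'] = first
    output['last_name'] = last
    return output


print(name_output('Bruce Wayne', {}))
```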

versions of six lower than 1.9.0 not working

Some weird issue with the metaclass stuff in six was detected on version 1.6.1

/home/vagrant/.virtualenvs/vizibl-api/local/lib/python2.7/site-packages/kim/__init__.py in <module>()
      7
      8
----> 9 from .mapper import Mapper
     10 from .field import Field

/home/vagrant/.virtualenvs/vizibl-api/local/lib/python2.7/site-packages/kim/mapper.py in <module>()
    183
    184
--> 185 class Mapper(six.with_metaclass(MapperMeta, object)):
    186     """Mappers are the building blocks of Kim - they define how JSON output
    187     should look and how input JSON should be expected to look.

/home/vagrant/.virtualenvs/vizibl-api/local/lib/python2.7/site-packages/six.pyc in with_metaclass(meta, *bases)
    629 def with_metaclass(meta, *bases):
    630     """Create a base class with a metaclass."""
--> 631     return meta("NewBase", bases, {})
    632
    633 def add_metaclass(metaclass):

/home/vagrant/.virtualenvs/vizibl-api/local/lib/python2.7/site-packages/kim/mapper.py in __init__(cls, classname, bases, dict_)
    179
    180     def __init__(cls, classname, bases, dict_):
--> 181         _MapperConfig.setup_mapping(cls, classname, dict_)
    182         type.__init__(cls, classname, bases, dict_)
    183

/home/vagrant/.virtualenvs/vizibl-api/local/lib/python2.7/site-packages/kim/mapper.py in setup_mapping(cls, cls_, classname, dict_)
     85     def setup_mapping(cls, cls_, classname, dict_):
     86         cfg_cls = _MapperConfig
---> 87         cfg_cls(cls_, classname, dict_)
     88
     89     def __init__(self, cls_, classname, dict_):

/home/vagrant/.virtualenvs/vizibl-api/local/lib/python2.7/site-packages/kim/mapper.py in __init__(self, cls_, classname, dict_)
     99         # the user is looking to override the default role and dont create one
    100         # here.
--> 101         if '__default__' not in self.cls.__roles__:
    102             self.cls.roles['__default__'] = \
    103                 whitelist(*self.cls.fields.keys())

AttributeError: type object 'NewBase' has no attribute '__roles__'

Implement Nested field type

Implement the Nested field type and associated pipelines.

Nested should support all the features still relevant from Kim 1. We should also consider the agreed upgrades required for the 1.0.0 release like named roles etc.

Marshaling collection with default=None and no data causes FieldError

class WeightedPillarMapper(Mapper):
    id = field.String()
    pillar = field.Nested('PillarMapper')

class TemplateMapper(Mapper):

    pillars = field.Collection(
        field.Nested('WeightedPillarMapper', required=False, allow_create=True), required=False)

With the above Mapper configuration, attempting to marshal TemplateMapper with no pillars key present in the JSON causes a FieldError to be raised. Passing default=[] to field.Collection resolves the issue.

Possibly something related to run_if_none or something like that.

Mapper class names in registry should not be required to be globally unique

For example, if your application includes third party libraries that define their own Kim Mappers, you're likely to have conflicts with common names such as User.

A simple workaround could be to namespace them by module, using sensible defaults to mean in most cases it's not required to use the full path.

i.e. if the Nested is in foo.serializers, look there first rather than in bar.serializers.
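
One way the lookup could work is sketched below. The `Registry` class, its method names and the fallback rules are assumptions for illustration and do not describe Kim's registry.

```python
class Registry(object):
    """Hypothetical registry keyed by (module, class name)."""

    def __init__(self):
        self._mappers = {}

    def register(self, module, name, mapper):
        self._mappers[(module, name)] = mapper

    def get(self, name, current_module):
        # A full dotted path is always unambiguous.
        if '.' in name:
            module, _, short = name.rpartition('.')
            return self._mappers[(module, short)]
        # Otherwise prefer a mapper defined in the module doing the lookup.
        key = (current_module, name)
        if key in self._mappers:
            return self._mappers[key]
        # Fall back to a unique match anywhere in the registry.
        matches = [m for (_, n), m in self._mappers.items() if n == name]
        if len(matches) == 1:
            return matches[0]
        raise LookupError('%s is missing or ambiguous, use the full path' % name)


class FooUser(object):
    pass


class BarUser(object):
    pass


registry = Registry()
registry.register('foo.serializers', 'User', FooUser)
registry.register('bar.serializers', 'User', BarUser)
print(registry.get('User', current_module='foo.serializers'))
print(registry.get('bar.serializers.User', current_module='foo.serializers'))
```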
