
pyrsistent's Issues

Replace factory functions with classmethods

Having e.g. both pmap and PMap means I need to do two imports, or that API docstrings can't find references if I just import the latter.

Instead, how about factory classmethods?

from pyrsistent import PMap
m = PMap.create({'x': 1})
isinstance(m, PMap) # True

Nested pmap merging

I would like:

pmap({
    'a': {
        'b': {
            'c': 1
        }
    },
    'z': {
        'y': {
            'x': 10
        }
    }
}).merge(pmap({
    'a': {
        'b': {
            'd': 2
        }
    },
    'z': {
        'y': {
            'w': 9
        }
    }
}))

to result in:

pmap({
    'a': {
        'b': {
            'c': 1,
            'd': 2
        }
    },
    'z': {
        'y': {
            'x': 10,
            'w': 9
        }
    }
})

but it results in:

pmap({
    'a': {
        'b': {
            'd': 2
        }
    },
    'z': {
        'y': {
            'w': 9
        }
    }
})

I'm using pmaps to merge JSON objects. Would it be possible to add a parameter that could specify this behaviour, please?
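
For reference, the behaviour I want can be approximated with a small recursive helper along these lines (just a sketch, not a proposed API; it treats anything dict-like as a nested mapping):

# Recursively merge two mappings, recursing wherever both sides hold a mapping.
from collections import Mapping  # collections.abc.Mapping on Python 3
from pyrsistent import pmap

def deep_merge(left, right):
    result = dict(left)
    for key, value in dict(right).items():
        if key in result and isinstance(result[key], Mapping) and isinstance(value, Mapping):
            result[key] = deep_merge(result[key], value)
        else:
            result[key] = value
    return pmap(result)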

Thanks,

Michael

Public introspection API for fields and checked types

We're about to add some code that introspects pyrsistent classes - in particular PRecord/PClass and the checked data structures - in order to detect changes to our pyrsistent-based configuration model. You can see a sketch of this code here: https://github.com/ClusterHQ/flocker/pull/1836/files#diff-c03885f8c4e64651ea9a499e99090a83R28

Unfortunately this currently requires using private pyrsistent APIs. It would be good to have a public API for finding this information. As a first pass it seems that adding this as extra methods to the classes themselves is problematic, insofar as it means subclasses get extra methods they weren't expecting. So maybe there should be a set of public external functions that extract information from the private implementation details.
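
A rough sketch of the sort of helper we have in mind, written today against what appear to be private attributes (the name fields_of is ours, not pyrsistent's):

from pyrsistent import PRecord, PClass

def fields_of(record_class):
    # Return a {field name: field definition} mapping for a PRecord or PClass.
    # _precord_fields / _pclass_fields are private implementation details.
    if issubclass(record_class, PRecord):
        return dict(record_class._precord_fields)
    if issubclass(record_class, PClass):
        return dict(record_class._pclass_fields)
    raise TypeError("Not a pyrsistent record class: %r" % (record_class,))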

pformat on pyrsistent objects

It'd be great if pprint and pformat worked "properly" on pyrsistent objects. That is, if the output of pformat(some_pmap) were comparable to pformat(some_dict).

For example, the following Python code...

from pyrsistent import pmap
from pprint import pprint

some_dict = {'foo': 'reverse the earth', 'bar': 'on a train to bristol', 'baz': 'some rather long line'}

print '## normal dict'
pprint(some_dict)
print

print '## pmap'
pprint(pmap(some_dict))

... produces this output ...

## normal dict
{'bar': 'on a train to bristol',
 'baz': 'some rather long line',
 'foo': 'reverse the earth'}

## pmap
pmap({'baz': 'some rather long line', 'foo': 'reverse the earth', 'bar': 'on a train to bristol'})

Whereas it should produce:

## normal dict
{'bar': 'on a train to bristol',
 'baz': 'some rather long line',
 'foo': 'reverse the earth'}

## pmap
pmap({
  'baz': 'some rather long line', 
  'foo': 'reverse the earth', 
  'bar': 'on a train to bristol'})

... or something similar

(Related to #69, but this is a request for enhancement, rather than a defect report. Fixing this would probably also fix #69)

Implement .copy() to get closer to standard interface

I normally use the persistent data structures to have certainty that the code is not modifying something accidentally.

Nevertheless, I don't want all the code to know that it is dealing with something other than the standard types. If pyrsistent types implemented the copy() method as simply return self, then I'd be able to use them in cases where a defensive copy is performed.
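
What I have in mind is essentially this (a sketch of the requested behaviour, not current pyrsistent code):

# Since pyrsistent structures are immutable, copy() can simply return self,
# so call sites that make defensive copies keep working unchanged.
class ReturnSelfCopyMixin(object):
    def copy(self):
        return self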

Sometimes a pyrsistent-using program crashes with SIGSEGV

We started noticing segfaults after introducing some hypothesis-based testing of some of our pyrsistent-using code. Here's a specific failure we ran into, https://clusterhq.atlassian.net/browse/FLOC-2913

I haven't constructed a minimal reproducing example yet (as I understand it, hypothesis is generating some random data and initializing a bunch of pyrsistent structures with it and some random data is causing the problem ... but I can't see which random data).

Maybe some hypothesis-based tests in pyrsistent would be a good way to narrow down the problem.
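
Something along these lines, perhaps (a sketch assuming the hypothesis library; it just round-trips random dicts through pmap):

from hypothesis import given
from hypothesis import strategies as st
from pyrsistent import pmap

# Build pmaps from arbitrary generated dicts and check they convert back cleanly.
@given(st.dictionaries(st.text(), st.integers()))
def test_pmap_round_trip(d):
    assert dict(pmap(d)) == d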

Ordering for PClass instances

It'd be good if PClass instances were either automatically ordered or could have an ordering imposed.

Problem

In particular, given:

class Foo(PClass):
  x = field()

This hypothesis test reliably fails:

    @given(integers(), integers())
    def test_pyrsistent_class_ordering(self, a, b):
        [a, b] = sorted([a, b])
        foo_a = Foo(x=a)
        foo_b = Foo(x=b)
        self.assertTrue(foo_a <= foo_b, '%s <= %s' % (foo_a, foo_b))

I can reproduce this in an interpreter:

In [1]: from pyrsistent import PClass, field

In [2]: class Foo(PClass):
   ...:     x = field()
   ...:

In [3]: a = Foo(x=0)

In [4]: b = Foo(x=0)

In [5]: a <= b
Out[5]: True

In [6]: a <= b
Out[6]: True

In [7]: b <= a
Out[7]: False

Proposed solutions

Natural ordering

For PClass objects with multiple fields, there is a "natural" ordering, where objects in the PClass are ordered according to their components, with the first-defined field being the most significant component for ordering. I think this could make a sensible default ordering for PClasses. This is similar to what attrs does. e.g. objects of

class Qux(PClass):
  x = field()
  y = field()

would be ordered as if they were (x, y) tuples.

Explicit syntax

Alternatively, there could be an explicit syntax for specifying how a PClass was ordered. e.g.

@ordered_by('z', 'x')
class Bar(PClass):
  x = field()
  y = field()
  z = field()

So that ordering on Bar was equivalent to ordering on (Bar.z, Bar.x).

It has the advantage (& disadvantage) of being explicit, and cleanly separated from the rest of pyrsistent. I'm tinkering away at such a decorator now, but think that some facility really ought to be provided by the pyrsistent library.
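
The decorator I have in mind looks roughly like this (illustrative only; ordered_by is not part of pyrsistent):

# Compare instances by the named fields, in the order given.
def ordered_by(*field_names):
    def key(instance):
        return tuple(getattr(instance, name) for name in field_names)

    def decorator(cls):
        cls.__lt__ = lambda self, other: key(self) < key(other)
        cls.__le__ = lambda self, other: key(self) <= key(other)
        cls.__gt__ = lambda self, other: key(self) > key(other)
        cls.__ge__ = lambda self, other: key(self) >= key(other)
        return cls

    return decorator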

equality with standard data structures

It would be really awesome to do something like decorate functions so that all arguments to, and the return value from, the function get frozen. Then inside the decorated function you could use literal data, and even mutation, and lock it down on the way in/out.
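
Something like this, sketched with pyrsistent's freeze() (the decorator name is made up):

from functools import wraps
from pyrsistent import freeze

# Freeze every argument on the way in and the return value on the way out.
def frozen_boundary(fn):
    @wraps(fn)
    def wrapper(*args, **kwargs):
        frozen_args = [freeze(a) for a in args]
        frozen_kwargs = {k: freeze(val) for k, val in kwargs.items()}
        return freeze(fn(*frozen_args, **frozen_kwargs))
    return wrapper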

Beyond that use case, it seems to me that equality with standard data structures could be desirable in many circumstances.

Are there any technical blockers to equality with standard data structures of the corresponding type?

from pyrsistent import v, thaw

assert thaw(v(1, 2, 3)) == [1, 2, 3], 'this works'
assert v(1, 2, 3) == [1, 2, 3], 'this does not work'

Perhaps PRecord shouldn't be based on PMap

I don't think PMap is the most efficient way to represent most records -- and the Clojure developers would seem to agree, given that their record system is not based on HAMTs, but rather just plain java objects, with updates causing full object copies. (well, I believe they do have a HAMT for dynamic fields added after a record is instantiated, but the declared attributes are fully copied on modification).

This is because records have small, fixed numbers of fields, whereas pmaps excel at large data structures where modifications will not require reallocating most of the structure.

I have a preliminary benchmark showing that copying the object + mutating the result is faster than PRecord for small numbers of fields. On CPython, with the pyrsistent C extension, object copy+mutate is faster up to and beyond 100 fields (I didn't want to wait to find out when/if they converge), but the gap clearly closes with a larger number of fields. On PyPy, the gap closes much more quickly, with copy+mutate winning out significantly at 5 fields and converging at around 10 fields. At 20 fields on PyPy, PRecord wins out significantly.

There's also the big caveat that my benchmark doesn't do all the same bookkeeping that PRecord does, like type checking and whatnot.

So there's some weighing to do here, and perhaps a case for dynamically switching between a PMap-based implementation and a copy+mutate implementation depending on the number of fields and the Python runtime. I suspect the number of fields per object in typical Python code is significantly less than 10, though I've certainly seen (and written) my share of objects with 15 fields.
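
For reference, the copy+mutate alternative I benchmarked looks roughly like this (an illustrative class, not pyrsistent code, and without PRecord's type checking):

import copy

class PlainRecord(object):
    __slots__ = ('x', 'y')

    def __init__(self, x, y):
        self.x = x
        self.y = y

    def set(self, **kwargs):
        # Copy the whole object, then overwrite the requested fields.
        new = copy.copy(self)
        for name, value in kwargs.items():
            setattr(new, name, value)
        return new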

Instances of different PRecord subclasses compare equal to each other

>>> from pyrsistent import PRecord, field
>>> class A(PRecord):
...     x = field()
... 
>>> class B(PRecord):
...     x = field()
... 
>>> A(x=1) == B(x=1)
True

This is undesirable because it is idiomatic for different Python classes to represent different types. If objects of different type accidentally share the same field names and values, they should still be considered different.

By way of analogy,

>>> (1,) == [1]
False
>>> (1,) == {1}
False
>>> (1,) == "\x01"
False
>>> 

Cannot pformat sets of PMaps

from pyrsistent import pmap
from pprint import pprint

pprint({pmap({'foo': 2}), pmap({'bar': 3})})

Gives this traceback:

set([Traceback (most recent call last):
  File "demo-pyrsistent-bug.py", line 4, in <module>
    pprint({pmap({'foo': 2}), pmap({'bar': 3})})
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pprint.py", line 59, in pprint
    printer.pprint(object)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pprint.py", line 117, in pprint
    self._format(object, self._stream, 0, 0, {}, 0)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pprint.py", line 199, in _format
    object = _sorted(object)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pprint.py", line 82, in _sorted
    return sorted(iterable)
  File "/Users/jml/.virtualenvs/flocker/lib/python2.7/site-packages/pyrsistent/_pmap.py", line 132, in __lt__
    raise TypeError('PMaps are not orderable')
TypeError: PMaps are not orderable

This is particularly a problem because many test suites call pformat on the inputs to assertions, which means that genuine test failures are more difficult to debug.

Add succinct idiom for checked types

I'm writing code that, pre-checked types, looks like this:

class MyRec(PRecord):
    paths = field(type=PMap, initial=pmap(), factory=pmap, mandatory=True)

Switching to CheckedPMap, it's now something like this:

class MyRec(PRecord):
    class _PathMap(CheckedPMap):
        __key_type__ = unicode
        __value_type__ = FilePath
    paths = field(type=_PathMap, initial=_PathMap(), factory=_PathMap, mandatory=True)

This is rather verbose, but likely to be a common use case. I'm thinking that perhaps some utility functions would be helpful. For example:

class MyRec(PRecord):
    # If mandatory=True then automatically set initial too? Probably.
    paths = pmap_field(unicode, FilePath, mandatory=True)  # optional invariant keyword argument

If I end up writing this for work I'll submit a PR.

Multiple invariants for a single field

I would like to be able to specify more than one predicate as an invariant for a field, and have different messages for each predicate.

Say I want a field for an integer that is both positive and even. I would currently have to write an invariant like this:

class Foo(PRecord):
  even_natural = field(type=int, invariant=lambda x: (x > 0 and x % 2 == 0, 'x negative or odd'))

I would rather write it like this:

class Foo(PRecord):
  even_natural = field(type=int, invariants=lambda x: [(x > 0, 'x negative'), (x % 2 == 0, 'x odd')])

I'm not 100% convinced that this is the right syntax—perhaps a full suite of boolean combinators is necessary—but I'd like something like this.
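
In the meantime, the same effect can be had with a small helper on top of the existing single-invariant form (a sketch; all_invariants is not pyrsistent API):

from pyrsistent import PRecord, field

def all_invariants(*checks):
    # Combine several (predicate, message) pairs into one field invariant.
    def invariant(value):
        failures = [message for predicate, message in checks if not predicate(value)]
        return (not failures, ', '.join(failures))
    return invariant

class Foo(PRecord):
    even_natural = field(type=int, invariant=all_invariants(
        (lambda x: x > 0, 'x negative'),
        (lambda x: x % 2 == 0, 'x odd')))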

Define recursive data types

This is a bit of a wishlist feature.

I'd like to be able to define a data structure that looks like this:

class Leaf(PClass):
  foo = field(type=unicode)

class Tree(PClass):
  children = pvector_field((Tree, Leaf))

However, Python will object and say that Tree is not defined. Is there a way to specify the field's type after the class construction?

TypeError raised by checked types could be more informative

When a type invariant of a CheckedPMap (and other checked types) is violated, the resulting TypeError doesn't say much about what happened:

>>> class X(CheckedPMap):                                                                                                                                      
...     __key_type__ = int                                                                                                                                     
...     __value_type__ = str                                                                                                             
... 
>>> X().set("3", "4")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/jean-paul/Environments/flocker/local/lib/python2.7/site-packages/pyrsistent.py", line 829, in set
    return self.evolver().set(key, val).persistent()
  File "/home/jean-paul/Environments/flocker/local/lib/python2.7/site-packages/pyrsistent.py", line 2971, in set
    _check_types([key], self._destination_class._checked_key_types)
  File "/home/jean-paul/Environments/flocker/local/lib/python2.7/site-packages/pyrsistent.py", line 2721, in _check_types
    raise TypeError
TypeError
>>> 

It would be very nice to have an exception that included the type on which the invariant was violated, the expected type, and the received type. For example, a string message like TypeError("Type X can only be used with int keys, not str") would be great. Presenting the information in a structured way would be pretty cool too - for example, using a subclass of TypeError with some attributes defined, e.g. CheckedKeyTypeError(X, int, str) producing an object with checked_type, required_type, and received_type attributes.

delete method for vectors

I just realized that there's no way to delete items by index (or slice) in PVector. I realize that this wouldn't be super efficient, but it seems like a surprising hole in the API. Is it intentionally left out of the API to discourage it for performance reasons? If so it might be worth mentioning that in the docs somewhere.

Otherwise I'd be happy to implement a delete method for the Python implementation. A C one might be a bit much for me though :) I suppose evolvers should also support __delitem__.

(I just realized using remove as the name of this method would be confusing because list.remove removes by value, not index).
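
For the record, the behaviour I'm after can be expressed today with slicing plus concatenation (a sketch; a real delete would presumably live on PVector and its evolver):

from pyrsistent import pvector

def delete(vector, index):
    # Everything before the index, followed by everything after it.
    return vector[:index] + vector[index + 1:]

assert delete(pvector([1, 2, 3]), 1) == pvector([1, 3])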

PMap and _PBag have ID-based comparison semantics

PMap does not implement the comparison methods, so ordering them gives non-deterministic results. Other objects in Pyrsistent like PSet and PVector do support comparison, but PMap and _PBag don't.

Allow multiple types for checked classes

Updated: Checked again and noticed fields already have tests for this, but checked types do not.

Sometimes it's useful to support multiple types.

  • Classic example is int and long: the difference between them is an implementation detail in Python 2, so you pretty much always want to support both interchangeably.
  • We also have fields that are either None or int.

Fields already do this (see the sketch below), but CheckedPVector etc. don't seem to have tests for this case.
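
Sketch of the two cases side by side, using "int or None" as the example (the checked-collection form is exactly what's untested):

from pyrsistent import PRecord, field, CheckedPVector

class Rec(PRecord):
    size = field(type=(int, type(None)))  # a tuple of types already works for fields

class Sizes(CheckedPVector):
    __type__ = int                        # the request: allow (int, type(None)) here too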

Unify thaw()/freeze() with PRecord serialization somehow

I would like to round-trip a tree of nested PSets, PRecords, etc. into simple Python objects and back. I've written a simple sketch of this, which I will add below. One issue I encountered was that thaw() doesn't know about PRecord serialization, and vice versa:

from operator import and_
from pyrsistent import PRecord, pset, PSet, field, thaw


def set_field(klass):
    """
    Create a field which is a PSet of the given class.
    """
    if issubclass(klass, PRecord):
        klass_factory = klass.create
    else:
        klass_factory = klass

    def invariant(obj):
        return (
            reduce(and_, [isinstance(i, klass) for i in obj]),
            "All instances must be of type {}".format(klass))

    def serializer(format, obj):
        result = thaw(obj)
        if issubclass(klass, PRecord):
            result = [i.serialize() for i in result]
        return result

    def factory(items):
        return pset([klass_factory(i) for i in items])

    return field(type=PSet, factory=factory, invariant=invariant,
                 serializer=serializer)


class Application(PRecord):
    name = field(type=unicode)
    image = field(type=unicode)


class Node(PRecord):
    applications = set_field(Application)


def example():
    node = Node(applications=[Application(name=u'myapp', image=u'myimage'),
                              Application(name=u'b', image=u'c')])
    print node
    serialized = node.serialize()
    print serialized
    restored = Node.create(serialized)
    print restored == node


if __name__ == '__main__':
    example()

Nested record deserialization in collection fields ?

from pyrsistent import PRecord, field, pvector_field, v

class Foo(PRecord):
    foo = field(type=str)

class Bar(PRecord):
    bar = pvector_field(Foo)

Bar.create(Bar(bar=v(Foo(foo="foo"))).serialize())

It seems like pvector_field can't figure out that it should try to deserialize Foo from a dict, so I guess I have to implement a custom factory? Is this just a missing implementation, or am I missing something obvious?

Py3 installation error

I get the below error when installing pyrsistent with pip3. The installation apparently works anyway; it looks like the error is in some performance test.

Also, is there supposed to be a "tests" package installed outside of pyrsistent? The name seems likely to collide with something else.

Downloading/unpacking pyrsistent
  Downloading pyrsistent-0.7.0.tar.gz (47kB): 47kB downloaded
  Running setup.py (path:/tmp/pip_build_jofo/pyrsistent/setup.py) egg_info for package pyrsistent

Requirement already satisfied (use --upgrade to upgrade): six in /home/jofo/.local/lib/python3.4/site-packages (from pyrsistent)
Installing collected packages: pyrsistent
  Running setup.py install for pyrsistent
    building 'pvectorc' extension
    x86_64-linux-gnu-gcc -pthread -DNDEBUG -g -fwrapv -O2 -Wall -Wstrict-prototypes -g -fstack-protector-strong -Wformat -Werror=format-security -D_FORTIFY_SOURCE=2 -fPIC -I/usr/include/python3.4m -c pvectorcmodule.c -o build/temp.linux-x86_64-3.4/pvectorcmodule.o
    x86_64-linux-gnu-gcc -pthread -shared -Wl,-O1 -Wl,-Bsymbolic-functions -Wl,-Bsymbolic-functions -Wl,-z,relro -Wl,-Bsymbolic-functions -Wl,-z,relro -g -fstack-protector-strong -Wformat -Werror=format-security -D_FORTIFY_SOURCE=2 build/temp.linux-x86_64-3.4/pvectorcmodule.o -o build/lib.linux-x86_64-3.4/pvectorc.cpython-34m.so
      File "/home/jofo/.local/lib/python3.4/site-packages/tests/performance_run.py", line 247
        print "Number of accesses: %s" % len(indices)
                                     ^
    SyntaxError: invalid syntax

Constructor aliases

At least for PClass, which is likely to have methods to interact with it rather than external users accessing fields directly, it would be nice to be able to provide an alias for use when constructing an object. In particular, I'd like to be able to pass fields that will be private members without the leading _ in the constructor.

For example:

class Thing(PClass):
   _context = field(constructor_alias='context')

thing = Thing(context="the-context")

I'm not sure if that is a good API. The only use case I have is for dropping the leading _.

equality is inconsistent about type comparisons

>>> v() == []
False
>>> s() == set()
True
>>> m() == {}
True

It seems like the data structures should either only compare against themselves or allow comparison with any similar iterables.

Any ideas for something like "lens" for Python?

I'm creating this ticket mostly as a point of discussion.

Have you heard of the "lens" idea in Haskell? There are a few libraries that implement it, most popularly lens-family and lens. It's a pretty rich set of combinators for dealing with data structures. The most obvious thing it makes convenient is updating deeply nested immutable structures in a way that "looks like" imperative code. Basically, a generalized version of the (now gone) set_in method that used to be in pyrsistent, but one that can also represent getting.

For example, in Haskell, we have lenses called _1 and _2 which represents the respective items in a tuple.

ghci> set (_2._1) 42 ("hello",("world","!!!"))
("hello",(42,"!!!"))

The cool thing about lenses is that the . in between the _2 and _1 is just the normal function composition operator -- lenses are actually just functions. set takes a lens, a value, and the structure to update, and returns a new structure.

I think it would be pretty handy to have something like this for Python. I think even if you don't represent them as functions-that-compose it can be pretty handy to have something like this:

_1 = index(1)
_2 = index(2)
set([_2, _1], 42, ("hello", ("world", "!!!"))) == ("hello", (42, "!!!"))

And of course we can imagine having more lens-constructors like attr and key, and also a function get instead of set.
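
To make that concrete, here's a very rough zero-based sketch of index() and a set() helper as plain functions (illustrative names, tuples only, not an existing library):

def index(i):
    def getter(target):
        return target[i]
    def setter(value, target):
        return target[:i] + (value,) + target[i + 1:]
    return getter, setter

def lens_set(path, value, target):
    # Walk down the path of (getter, setter) pairs, rebuilding on the way out.
    getter, setter = path[0]
    if len(path) == 1:
        return setter(value, target)
    return setter(lens_set(path[1:], value, getter(target)), target)

assert lens_set([index(1), index(0)], 42, ("hello", ("world", "!!!"))) == ("hello", (42, "!!!"))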

I think this idea is probably useful enough to be something separate from pyrsistent (for example I often want it even when updating immutable objects that aren't created from pyrsistent) -- I just figured this would be a pretty good forum to bring it up in since it would be particularly useful for pyrsistent.

One design question would be whether there could be some way these "lens constructors" could be polymorphic to different data types. like, would we need separate lens-constructors for tuple-indexes and pvector-indexes? what about pmaps vs dicts? They would all need different implementations for setting (though they would be the same for getting). Maybe it'd be better to just keep these different, so the built-in lens library could have "index" for tuples and lists and "key" for dictionaries (which is implemented by copying the data structure), pyrsistent could provide "pindex" for pvectors and "pkey" for pmaps, etc.

As a final note, it's probably a bad idea to call these things "lenses" if they don't actually conform to the design of lenses in Haskell -- there's a lot of very precisely defined terminology in the computer-science community and I'd hate to make something that's blatantly different with the same name. I'm just trying to find a nicer way to update deeply nested structures in Python, not reproduce lens faithfully.

PRecord support for iteration is surprising and mostly undesirable

PRecord instances are iterable:

>>> from pyrsistent import PRecord, field
>>> class X(PRecord):
...     y = field()
... 
>>> list(X(y=3))
['y']

I see that this behavior is inherited from PMap, of which PRecord is a subclass. I think that iteration over a notional record type is undesirable. It's possible to iterate over the fields by using another of the inherited PMap features (keys(), iterkeys()) - though I'm not sure I like those features very much either (and I lose them if I happen to want a field with a colliding name). Support for the implicit iterator protocol seems most likely to be a source of bugs.

PMap __getattr__ is problematic sometimes

Frankly I'm not a fan of PMap.__getattr__ in the first place, but in particular the fact that it raises KeyError and not AttributeError causes some things to misbehave.

For example, the trial unit test runner searches for test classes to run by going through all the objects in a module and doing an issubclass(o, TestCase) check.

issubclass works by checking an object's __bases__ attribute, and returning True if the second argument can be found in the bases (recursively).

Usually issubclass raises a TypeError when an object isn't a class at all -- and trial specifically catches TypeError. The problem is that issubclass only catches AttributeError when trying to find a __bases__ attribute, but it doesn't handle other errors like KeyError - so in this case it just raises the KeyError up to the caller.

This leads to trial completely bailing out if you happen to have a PMap instance as a global variable in a module 😿.

So I think it would make more sense for PMap.__getattr__ to translate KeyError to AttributeError, but honestly I would rather just see it removed -- mashing namespaces together (methods of PMap and items found within a PMap) can often lead to confusion and errors.
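
The translation I'm suggesting is essentially this (sketched on a stand-in class, not pyrsistent's actual code):

class AttrMap(dict):  # stand-in for PMap, for illustration only
    def __getattr__(self, key):
        try:
            return self[key]
        except KeyError:
            # Missing keys surface as AttributeError, so protocols like
            # issubclass()'s __bases__ lookup degrade gracefully.
            raise AttributeError(
                "'%s' object has no attribute '%s'" % (type(self).__name__, key))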

Failing to pip install

I ran into this trying to install pyrsistent, eventually got it to go by modifying the setup.py not to load the README, which isn't a real fix. Maybe loading it in utf-8 mode would be a better thing. Partly wondered if anyone else had seen this. I can implement the loading thing if that seems reasonable.

$ pip install pyrsistent
Collecting pyrsistent
Downloading pyrsistent-0.11.7.tar.gz (62kB)
100% |################################| 65kB 2.4MB/s
Complete output from command python setup.py egg_info:
Traceback (most recent call last):
  File "<string>", line 20, in <module>
  File "/tmp/pip-build-7etocp9s/pyrsistent/setup.py", line 13, in <module>
    readme = f.read()
  File "/home/me/venv/lib64/python3.4/encodings/ascii.py", line 26, in decode
    return codecs.ascii_decode(input, self.errors)[0]
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 21946: ordinal not in range(128)
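
The loading change I have in mind for setup.py is roughly this (a sketch; the README filename in the repo may differ):

import codecs

# Read the README with an explicit utf-8 encoding instead of the default ascii.
with codecs.open('README.rst', encoding='utf-8') as f:
    readme = f.read()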

+ operator for PMap

It'd be nice to be able to do

>>> pmap({1:2}) + pmap({3:4})
pmap({1:2, 3:4})
>>> pmap({1:2}) + pmap({1:4})
pmap({1:4})

Expand invariants to cover PSet etc.

As shown in #17, it's a little awkward to integrate the idea of "a PSet (or PVector) of X" into PRecord. For situations outside of PRecord it's not possible to enforce at all.

It would be nice if there was a way to add invariants (type-based, at a minimum) to PSet and friends, much like PRecord allows adding invariants to PMap.

alternative names for single-letter constructors

The single-letter constructors can be pretty annoying

  • they're hard to search for in a module
  • they very easily conflict with local variable names

It'd be nice if we had alternative, longer names for v, s, m, and b.

recursive freeze/thaw functions?

Before I submit a PR, I would like to see if it'd be acceptable for pyrsistent to have recursive freeze(mutable) -> persistent and thaw(persistent) -> mutable functions.

I found the following functions useful, but they make some tradeoffs.

def freeze(o):
    """
    Recursively convert a simple Python data structure (lists, tuples,
    dictionaries) into pyrsistent versions of those data structures.
    """
    typ = type(o)
    if typ is dict:
        return pmap({k: freeze(v) for k, v in o.iteritems()})
    elif typ is list:
        return pvector(map(freeze, o))
    elif typ is tuple:
        return tuple(map(freeze, o))
    else:
        return o


def thaw(o):
    """
    Recursively convert pyrsistent data structures into basic Python types.
    """
    typ = type(o)
    if typ is type(pvector()):
        return map(thaw, o)
    if typ is type(pmap()):
        return {k: thaw(v) for k, v in o.iteritems()}
    if typ is tuple:
        return tuple(map(thaw, o))
    else:
        return o

The basic tradeoff this makes is extensibility/lossiness vs inflexibility/lossless conversions. My implementation chooses inflexibility and lossless conversion. Only lists, dicts, and tuples are supported. Tuples are maintained as tuples but have their elements recursively converted. If we supported other types of containers, like arbitrary iterables, we could convert those to PVectors but then the type information would be lost if we converted back.

The most common use case for these is probably for dealing with json, when you're using the stdlib "json" module but then want to deal with the resulting data persistently, and also to convert persistent structures to data structures that the json module will be able to serialize.

It occurs to me that I left set<->PSet conversion out of this implementation due to an oversight. To be added :)

Bad argument handling in PVector evolver

Off by 1 bug?

>>> p = pvector([1, 2, 3])
>>> del p.evolver()[3]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
SystemError: ../Objects/listobject.c:290: bad argument to internal function
>>> del p.evolver()[4]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
IndexError: Index out of range: 4

PSet is not pickleable

>>> pickle.dumps(pyrsistent.pset())
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib64/python2.7/pickle.py", line 1374, in dumps
    Pickler(file, protocol).dump(obj)
  File "/usr/lib64/python2.7/pickle.py", line 224, in dump
    self.save(obj)
  File "/usr/lib64/python2.7/pickle.py", line 306, in save
    rv = reduce(self.proto)
  File "/usr/lib64/python2.7/copy_reg.py", line 77, in _reduce_ex
    raise TypeError("a class that defines __slots__ without "
TypeError: a class that defines __slots__ without defining __getstate__ cannot be pickled

Checkout from today:

$ git log -1
commit ab29465bd31dfee63976f8ecf6be8de6fb106087
Author: Tobias Gustafsson <[email protected]>
Date:   Sat Feb 21 15:10:59 2015 +0100

    Type and invariant checking for checked pmap

Not a priority for me, we're going to rip out all the pickling code soon, but something I encountered.

Release 0.9

I keep hitting bugs that are fixed in master... any chance for a release?

PClass.set doesn't work with string-based usage

$ cat reproduce.py
from uuid import uuid4, UUID
from pyrsistent import PClass, field, optional

class DatasetState(PClass):
    dataset_id = field(type=UUID, mandatory=True)
    primary = field(type=optional(UUID), mandatory=True)
    maximum_size = field(type=optional(int), mandatory=True)
    path = field(type=optional(str), mandatory=True)

d = DatasetState(dataset_id=uuid4(), primary=None, maximum_size=3,
                 path="xxxx")
print d
d_with_kwargs = d.set(path=None)
print d_with_kwargs

# This blows up though:
d_with_string = d.set("path", None)

And when I run it:

$ python reproduce.py 
DatasetState(path='xxxx', maximum_size=3, primary=None, dataset_id=UUID('beffffcc-ebc2-4047-8957-97db23c21c6b'))
DatasetState(path=None, maximum_size=3, primary=None, dataset_id=UUID('beffffcc-ebc2-4047-8957-97db23c21c6b'))
Traceback (most recent call last):
  File "reproduce.py", line 19, in <module>
    d_with_string = d.set("path", None)
  File "/home/itamarst/ClusterHQ/flocker/.tox/py27/local/lib/python2.7/site-packages/pyrsistent/_pclass.py", line 79, in set
    return self.__class__(**{args[0]: args[1]})
  File "/home/itamarst/ClusterHQ/flocker/.tox/py27/local/lib/python2.7/site-packages/pyrsistent/_pclass.py", line 48, in __new__
    raise InvariantException(tuple(invariant_errors), tuple(missing_fields), 'Field invariant failed')
pyrsistent._checked_types.InvariantException: Field invariant failed

Sometimes psets don't compare correctly with sets or frozensets

My branch adding a bunch of pyrsistent 0.9 usage is failing consistently on our build machines - but not on my local desktop. Sample failure:

twisted.trial.unittest.FailTest: not equal:
a = [frozenset([<Application(name=u'mysql-clusterhq', image=<object object at 0x7ffacf447ba0>, ports=frozenset([]), volume=None, links=frozenset([]), environment=None, memory_limit=None, cpu_shares=None, restart_policy=<RestartNever()>)>,
            <Application(name=u'site-clusterhq.com', image=<object object at 0x7ffacf447bb0>, ports=frozenset([]), volume=None, links=frozenset([]), environment=None, memory_limit=None, cpu_shares=None, restart_policy=<RestartNever()>)>]),
 u'example.com']
b = [pset([<Application(name=u'site-clusterhq.com', image=<object object at 0x7ffacf447bb0>, ports=frozenset([]), volume=None, links=frozenset([]), environment=None, memory_limit=None, cpu_shares=None, restart_policy=<RestartNever()>)>, <Application(name=u'mysql-clusterhq', image=<object object at 0x7ffacf447ba0>, ports=frozenset([]), volume=None, links=frozenset([]), environment=None, memory_limit=None, cpu_shares=None, restart_policy=<RestartNever()>)>]),
 u'example.com']

Those two seem like they ought to be equal, and on my computer they are. And yet. (Application is a class using the characteristic library; eventually we'll probably switch everything to pyrsistent).

Full set of failures:
http://build.clusterhq.com/builders/flocker-ubuntu-14.04/builds/1272/steps/trial/logs/problems

p*_field prevents pickle from working on a PClass

I would never use pickle in production, but as I was trying to write some strawman example storage code for an example app, I discovered that while I could pickle basic PClasses, I can't pickle any that use pmap_field or pvector_field (and probably others that have similar implementations)

e.g.:

>>> class Foo(PClass):
...  v = pvector_field(int)
...
>>> Foo()
Foo(v=IntPVector([]))
>>> dumps(Foo())
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
cPickle.PicklingError: Can't pickle <class 'pyrsistent._field_common.IntPVector'>: attribute lookup pyrsistent._field_common.IntPVector failed

The same happens for pmap_field.

I guess this is because of the way that those functions generate classes at runtime?

@tobgu if you can let me know what needs done to fix this I can try to submit a PR.

Split out pyrsistent C extension?

I don't really trust the C extension -- I'm seeing segfaults that go away when I disable it. I'm not 100% certain that it's from Pyrsistent's C extension (it's a complex app and there are some other extensions loaded, like Cryptography's), but in general I want to avoid C extensions wherever possible. And if I care about speed, I can always just use PyPy, which optimizes pyrsistent quite a bit.

I'd really like to be able to install pyrsistent without the C extension. Could the C extension perhaps be split out into a separate library on pypi?

InvariantException should have a `__str__`

If some code unexpectedly raises an InvariantException and the exception is logged, or propagates to the top-level, only the bare description is printed, which makes it hard to see what the actual error is.
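
Something along these lines would already help (a sketch; attribute names are assumed from how the exception is raised, not taken from the real implementation):

class InvariantException(Exception):
    def __init__(self, error_codes=(), missing_fields=(), *args, **kwargs):
        self.invariant_errors = error_codes
        self.missing_fields = missing_fields
        super(InvariantException, self).__init__(*args, **kwargs)

    def __str__(self):
        # Include the structured details, not just the bare description.
        return '{0}, invariant_errors={1}, missing_fields={2}'.format(
            super(InvariantException, self).__str__(),
            self.invariant_errors,
            self.missing_fields)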

Transform at arbitrary depth for map

It seems it's not possible to transform a whole map at every depth with one transform call.

For example the following call will transform resource_info at depth 3 (with key data being at depth one) with the function _resolve_user:

resource_info.transform(['data', ny, ny], _resolve_user)

But it does not seem possible to transform all the values at any depth at once?
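
A recursive helper covers my case for now (a sketch, not pyrsistent API; it assumes nesting is expressed with PMaps):

from pyrsistent import PMap, pmap

def transform_all(mapping, fn):
    # Apply fn to every leaf value, recursing into nested PMaps.
    evolver = mapping.evolver()
    for key, value in mapping.iteritems():
        if isinstance(value, PMap):
            evolver[key] = transform_all(value, fn)
        else:
            evolver[key] = fn(value)
    return evolver.persistent()

# e.g. transform_all(pmap({'data': pmap({'user': 'id-1'})}), _resolve_user)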
