Comments (11)

mmckerns commented on June 30, 2024

I'd disagree… dill can serialize this type of object. What pickle does is to serialize the class "by reference", which means using the module name (__main__ in this case) and the class name. If you pass the byref flag to dill, it can use the same mechanism.

>>> pickle.dumps(obj)
"ccopy_reg\n_reconstructor\np0\n(c__main__\nInheritsList\np1\ncdill.dill\n_load_type\np2\n(S'ListType'\np3\ntp4\nRp5\n(lp6\nS'string'\np7\natp8\nRp9\n."
>>> 
>>> dill.dumps(obj, byref=True)
'\x80\x02c__main__\nInheritsList\nq\x00)\x81q\x01U\x06stringq\x02a}q\x03b.'
>>> 
>>> dill.loads(dill.dumps(obj, byref=True))
['string']

This is fine for many cases; however, it is not good if you have an interactively defined class that you want to serialize and you want the class to persist across interpreter sessions. In those cases pickle fails, while dill serializes the entire code for the class, so it should be possible to serialize the object to a file, start a new interpreter, and unpickle the object there. The setting byref=False is the default for dill.dumps.
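To make "by reference" concrete, here is a minimal sketch using only the standard library's pickle. It pickles a stdlib class (OrderedDict, chosen here only so the example does not depend on how it is run) rather than the `__main__` class from this issue. The payload holds just the module name and the class name, and loading looks the class up again by import.

```python
import pickle
from collections import OrderedDict

# Pickling a class stores only a reference: its module and its name.
payload = pickle.dumps(OrderedDict)
print(payload)

# No code for the class travels with the pickle. Loading imports the module
# and looks the name up, so the very same class object comes back.
restored = pickle.loads(payload)
print(restored is OrderedDict)
```

This is also why a by-reference pickle of an interactively defined class cannot be loaded in a fresh session: the name is no longer there to look up.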

It does appear that while the object serializes, the use of super causes problems when deserializing… and thus is worth some investigation.

>>> dill.dumps(obj, byref=False)
'\x80\x02cdill.dill\n_create_type\nq\x00(cdill.dill\n_load_type\nq\x01U\x08TypeTypeq\x02\x85q\x03Rq\x04U\x0cInheritsListq\x05h\x01U\x08ListTypeq\x06\x85q\x07Rq\x08\x85q\t}q\n(U\r__slotnames__q\x0b]q\x0cU\n__module__q\rU\x08__main__q\x0eU\x08__init__q\x0fh\x01U\x0cFunctionTypeq\x10\x85q\x11Rq\x12(cdill.dill\n_unmarshal\nq\x13U\x94c\x01\x00\x00\x00\x01\x00\x00\x00\x03\x00\x00\x00C\x00\x00\x00s\x17\x00\x00\x00t\x00\x00t\x01\x00|\x00\x00\x83\x02\x00j\x02\x00\x83\x00\x00\x01d\x00\x00S(\x01\x00\x00\x00N(\x03\x00\x00\x00t\x05\x00\x00\x00supert\x0c\x00\x00\x00InheritsListt\x08\x00\x00\x00__init__(\x01\x00\x00\x00t\x04\x00\x00\x00self(\x00\x00\x00\x00(\x00\x00\x00\x00s\x07\x00\x00\x00<stdin>R\x02\x00\x00\x00\x03\x00\x00\x00s\x02\x00\x00\x00\x00\x01q\x14\x85q\x15Rq\x16c__builtin__\n__main__\nh\x0fNNtq\x17Rq\x18U\x07__doc__q\x19U\x04blahq\x1aU\x06appendq\x1bh\x12(h\x13U\x9dc\x02\x00\x00\x00\x02\x00\x00\x00\x03\x00\x00\x00C\x00\x00\x00s\x1a\x00\x00\x00t\x00\x00t\x01\x00|\x00\x00\x83\x02\x00j\x02\x00|\x01\x00\x83\x01\x00\x01d\x00\x00S(\x01\x00\x00\x00N(\x03\x00\x00\x00t\x05\x00\x00\x00supert\x0c\x00\x00\x00InheritsListt\x06\x00\x00\x00append(\x02\x00\x00\x00t\x04\x00\x00\x00selft\x03\x00\x00\x00obj(\x00\x00\x00\x00(\x00\x00\x00\x00s\x07\x00\x00\x00<stdin>R\x02\x00\x00\x00\x05\x00\x00\x00s\x02\x00\x00\x00\x00\x01q\x1c\x85q\x1dRq\x1ec__builtin__\n__main__\nh\x1bNNtq\x1fRq utq!Rq")\x81q#U\x06stringq$a}q%b.'
>>> dill.loads(dill.dumps(obj, byref=False))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/mmckerns/lib/python2.7/site-packages/dill-0.2b2.dev-py2.7.egg/dill/dill.py", line 138, in loads
    return load(file)
  File "/Users/mmckerns/lib/python2.7/site-packages/dill-0.2b2.dev-py2.7.egg/dill/dill.py", line 131, in load
    obj = pik.load()
  File "/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pickle.py", line 858, in load
    dispatch[key](self)
  File "/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pickle.py", line 1182, in load_append
    list.append(value)
  File "<stdin>", line 6, in append
TypeError: super(type, obj): obj must be an instance or subtype of type
>>> 

from dill.

phantomas1234 commented on June 30, 2024

Thanks for making the title more specific. The deserialization is indeed the issue and my code snippet is basically a boiled down version of https://github.com/opencobra/cobrapy/blob/master/cobra/core/DictList.py#L4 which I don't control. Anyways, I am just trying to get some code parallelized using IPython.parallel and would prefer using dill for the serialization using .use_dill() because dill is just awesome. However, that also means that I don't have low level control over how IPython handles dill.

I just discovered that dill==0.2a1 (installed via pip) does not throw the exception that you've posted above while 0.2b1 does. Hope this helps.

mmckerns commented on June 30, 2024

I'm glad you like dill. Yes, in dill=0.2b1 I set byref=False as the default. If there's no way to pass that flag through IPython (apparently not), then you'll have to wait until the issue outlined above gets resolved. Maybe there's a better approach; having to use the byref=True flag sometimes is annoying.

mmckerns commented on June 30, 2024

Some notes on follow-up…

>>> class Bar(list):
...   def __init__(self):
...     super(Bar, self).__init__()
... 
>>> b = Bar() 
>>> 
>>> dill.loads(dill.dumps(Bar))
<class '__main__.Bar'>
>>> 
>>> Bar.mro()
[<class '__main__.Bar'>, <type 'list'>, <type 'object'>]
>>> dill.loads(dill.dumps(Bar)).mro()
[<class '__main__.Bar'>, <type 'list'>, <type 'object'>]
>>>
>>> dill.loads(dill.dumps(b))
[]
>>> 
>>> b.append('foo')
>>> b
['foo']
>>> dill.loads(dill.dumps(b))
['foo']

So far so good.

>>> class Foo(list):
...   def __init__(self):
...     super(Foo, self).__init__()
...   def count(self, obj):
...     return super(Foo, self).count(obj)
... 
>>> f = Foo()
>>> dill.loads(dill.dumps(Foo))
<class '__main__.Foo'>
>>> 
>>> f.extend([1,2,3])
>>> f
[1, 2, 3]
>>> dill.loads(dill.dumps(f))
[1, 2, 3]
>>> f.count(2)
1
>>> dill.loads(dill.dumps(f)).count(2)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 5, in count
TypeError: super(type, obj): obj must be an instance or subtype of type

Ah, that's the issue.

>>> class Test(list):
...   def append(self, obj):
...     super(Test, self).append(obj)
... 
>>> t = Test()
>>> t.append('foo')
>>> 
>>> class Test(list):
...   def append(me, obj):
...     super(Test, me).append(obj)
... 
>>> t.append('bar')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 3, in append
TypeError: super(type, obj): obj must be an instance or subtype of type

So apparently, when dill serializes the class code, the class gets rebuilt on load, and the class that super looks up by name no longer matches the class of the instance; one of the two is "stale". So that's what needs a fix...
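The mismatch can be reproduced without dill. The sketch below is an assumption-laden stand-in, not dill's actual code: it rebuilds a second class object with `type()` from the name, bases, and namespace of the first, which roughly mimics what loading a class "by value" produces. `Rebuilt` is a hypothetical name.

```python
class Foo(list):
    def count(self, obj):
        # The name Foo is looked up again every time count() runs.
        return super(Foo, self).count(obj)

# Stand-in for by-value deserialization (an assumption, not dill's code):
# build a second class object from the pieces of the first.
namespace = {k: v for k, v in vars(Foo).items()
             if k not in ("__dict__", "__weakref__")}
Rebuilt = type(Foo.__name__, Foo.__bases__, namespace)

g = Rebuilt([1, 2, 3])
print(Rebuilt.__name__, isinstance(g, Foo))  # same name, different class

try:
    g.count(2)
    failed = False
except TypeError as exc:
    failed = True
    print("TypeError:", exc)
```

The method still resolves `Foo` to the first class, while `g` belongs to the second, so `super(Foo, g)` is rejected.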

mmckerns commented on June 30, 2024

Looks like pickle does this:

def find_class(self, module, name):
    # Subclasses may override this
    __import__(module)
    mod = sys.modules[module]
    klass = getattr(mod, name)
    return klass

so, to make this work for classes in cases like the one in this issue, I'd either have to update the reference in the instance to point to the new class definition... or build a new instance from the new class definition, and then transfer the state from the existing (serialized) instance. Both are tricky, but feasible, I think.
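A rough sketch of the second option (build a new instance from the current class definition, then transfer the state). `rebuild_instance` and `Stale` are hypothetical names, and the sketch assumes a list subclass whose extra state lives in the instance `__dict__`; it is not what dill does internally.

```python
class Foo(list):
    def __init__(self):
        super(Foo, self).__init__()

    def count(self, obj):
        return super(Foo, self).count(obj)

def rebuild_instance(stale, cls):
    """Hypothetical helper: make a fresh cls instance carrying stale's state."""
    fresh = cls.__new__(cls)            # skip __init__, as unpickling does
    list.extend(fresh, stale)           # copy the list contents
    fresh.__dict__.update(vars(stale))  # copy instance attributes
    return fresh

# Stand-in for an instance whose class object no longer matches Foo.
Stale = type("Foo", (list,), {})
old = Stale([1, 2, 3])
old.tag = "kept"

new = rebuild_instance(old, Foo)
print(isinstance(new, Foo), list(new), new.tag, new.count(2))
```

The cost of this route is that the transfer has to know where each kind of object keeps its state, which is part of why it is tricky in general.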

mmckerns commented on June 30, 2024

Changing this to an "enhancement" instead of a "bug". What might be considered a bug is to have byref=False be the default value for dumps.

mmckerns commented on June 30, 2024

a simple fix for Foo is:

>>> isinstance(f, Foo)
False
>>> f.__class__ = Foo
>>> isinstance(f, Foo)
True
>>> f.count(2)
1

or possibly better, since dill already has a handle to __main__:

>>> import __main__ as _main_module
>>> isinstance(f, Foo)
False
>>> f.__class__ = _main_module.Foo
>>> isinstance(f, Foo)
True
>>> f.count(2)
1

mmckerns commented on June 30, 2024

then in load, we'd have something like:

obj.__class__ = getattr(_main_module, type(obj).__name__)

after the object is loaded, but before it's returned.

Better than that might be to push the re-referencing back down to the save_reduce level.
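As a sketch of the post-load step described above: `rebind_class`, `fake_main`, and `Rebuilt` are hypothetical names, a plain module object stands in for `__main__`, and a look-alike class stands in for the one rebuilt on load. This is not the code from the actual commit.

```python
import types

def rebind_class(obj, module):
    """Hypothetical post-load step: point obj at module's class of the same name."""
    current = getattr(module, type(obj).__name__, None)
    if isinstance(current, type) and current is not type(obj):
        obj.__class__ = current  # may raise TypeError if the layouts differ
    return obj

class Foo(list):
    def count(self, obj):
        return super(Foo, self).count(obj)

# Stand-ins (assumptions): a module object playing the role of __main__,
# and a look-alike class playing the role of the class rebuilt on load.
fake_main = types.ModuleType("fake_main")
fake_main.Foo = Foo
Rebuilt = type("Foo", (list,), {"count": Foo.count})

f = rebind_class(Rebuilt([1, 2, 3]), fake_main)
print(isinstance(f, Foo), f.count(2))
```

Objects whose type name is not a class in the module pass through unchanged, so the step is safe to apply to whatever load returns.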

mmckerns commented on June 30, 2024

should be 'fixed' in 07be939

panamantis commented on June 30, 2024

Strange that with Python 2.7 and dill 2.7.1 it doesn't like loading classes that use super. Seems to be related to the above?

import dill

class List_Wrap(list):
    def __init__(self):
        super(List_Wrap, self).__init__()

def test_dill():
    a_dill = dill.dumps(List_Wrap)
    AB = dill.loads(a_dill)
    AB()

Still getting the error:

   super(List_Wrap,self).__init__()
TypeError: super(type, obj): obj must be an instance or subtype of type

Update: the fix would be to run:

globals()[AB.__name__] = AB

before AB().

Sorry, I'm not sure why that fixed it.
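One plausible reading of why that helps, sketched without dill: `super(List_Wrap, self)` looks up the name `List_Wrap` at call time, so if the loaded class `AB` is a different object from the one that name points to, the check fails until the name is rebound. The `type()` call below is only a stand-in for `dill.loads(dill.dumps(List_Wrap))`, which is an assumption about what the round trip returns.

```python
class List_Wrap(list):
    def __init__(self):
        super(List_Wrap, self).__init__()

original = List_Wrap

# Stand-in for the dill round trip (an assumption): a second class object
# sharing the name and methods of the first.
namespace = {k: v for k, v in vars(List_Wrap).items()
             if k not in ("__dict__", "__weakref__")}
AB = type(List_Wrap.__name__, List_Wrap.__bases__, namespace)

try:
    AB()
    failed_before = False
except TypeError:
    failed_before = True

# Rebind the name that super(List_Wrap, self) looks up. At module level this
# has the same effect as globals()[AB.__name__] = AB.
List_Wrap = AB
instance = AB()
print(failed_before, isinstance(instance, AB))
```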

mmckerns commented on June 30, 2024

@panamantis: Thanks for the comment. I believe it's a duplicate of this or #75 or #56. Cases where serialization fails in the presence of super have been documented in other issues.
