uqfoundation / dill
serialize all of Python
Home Page: http://dill.rtfd.io
License: Other
Running a frozen Python application (frozen using cx_Freeze) on Windows throws the error "NameError: name 'exit' is not defined" at dill.py line 133.
ExitType = type(exit)
This line is reached (as I understand it) whenever dill is imported outside an IPython shell. Why it only errors when frozen is unclear to me. However, changing this line to:
ExitType = type(sys.exit)
...appears to solve the problem.
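A more defensive variant (a sketch, not dill's actual code) would guard against the missing builtin instead of replacing it outright:

```python
import sys

try:
    ExitType = type(exit)  # interactive sessions define the 'exit' helper
except NameError:
    # frozen apps (e.g. cx_Freeze) skip site.py, so 'exit' is undefined;
    # fall back to sys.exit, which always exists
    ExitType = type(sys.exit)

print(ExitType)
```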
As mentioned on Stack Overflow, picloud has an impressive pickler that handles numpy objects, closures, and a lot of other clever tricks that seem to be on the dill TODO list, and it is available under reasonably permissive licensing. Moreover, since picloud is shutting down, it is a propitious time to negotiate alternative licensing. I can't see any mention of it in the commit logs for dill, so it seems worth mentioning here or on the pathos tracker -- I don't have an account on the pathos tracker, so I can't raise a ticket there.
Depending, of course, on coding conventions and methods, it might save some time for the dill project.
The code appears only to be available as a tarball -- creating a free picloud.com account may be required? -- but the LGPL conditions are clearly stated therein, so re-use is permissible, although it may not fit entirely with the modified BSD license of dill.
dill.source.getsource is pretty good about getting the source for imported and interactively defined functions, lambdas, and class methods. It'd be nice to get the source code for classes and class instances and such. Currently, it assumes the inspected object has a func_code attribute.
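For context, the attribute being inspected is the function's code object: Python 2 exposes it as func_code, which Python 3 renames to __code__ (a stdlib illustration, not dill code):

```python
# introspection tools typically start from the function's code object;
# in Python 3 it lives at __code__ (func_code in Python 2)
def double(x):
    return x * 2

code = double.__code__
print(code.co_name)      # 'double'
print(code.co_varnames)  # ('x',)
```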
If I use this script to serialize something:
import pandas as pd
import dill

def func(x):
    return pd.DataFrame({'a' : x})

def func2(x):
    return func(x) + func(x)

with open("out.dill", "w+") as f:
    dill.dump(func2, f)
And load it with:
import dill

with open("out.dill") as f:
    func2 = dill.load(f)

print func2([1,2,3,4,5])
I get
Traceback (most recent call last):
File "read_test.py", line 6, in <module>
print func2([1,2,3,4,5])
File "write_test.py", line 8, in func2
return func(x) + func(x)
NameError: global name 'func' is not defined
What is the intention for how the user should handle this?
Thanks
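A stdlib sketch of why the NameError occurs: the unpickled function looks up func in the globals of the loading process, so defining (or importing) func there before calling is the usual workaround. Here types.FunctionType stands in for what an unpickler effectively does; this is not dill's code.

```python
import types

def func(x):
    return x + 1

def func2(x):
    return func(x) + func(x)

# rebuild func2 from its code object with a fresh globals dict, mimicking
# what an unpickler does in a new process where 'func' was never defined
rebuilt = types.FunctionType(func2.__code__, {})
try:
    rebuilt(3)
except NameError as e:
    print(e)  # name 'func' is not defined

# defining 'func' in the loading namespace first resolves the lookup,
# which is the usual user-side workaround
rebuilt_ok = types.FunctionType(func2.__code__, {"func": func})
print(rebuilt_ok(3))  # 8
```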
Follow up for #41:
Python 3.4.1 (default, May 19 2014, 17:23:49)
[GCC 4.9.0 20140507 (prerelease)] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import abc, dill
>>> abc.ABCMeta.zzz=1
>>> dill.dump_session()
>>>
================Restart================
Python 3.4.1 (default, May 19 2014, 17:23:49)
[GCC 4.9.0 20140507 (prerelease)] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import dill
>>> dill.load_session()
>>> abc.ABCMeta
<class 'abc.ABCMeta'>
>>> abc.ABCMeta.zzz
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
AttributeError: type object 'ABCMeta' has no attribute 'zzz'
This is due to https://github.com/uqfoundation/dill/blob/master/dill/dill.py#L808-L813 not adding the __dict__ attribute:
>>> dill.dill._trace(True)
>>> dill.dumps(abc.ABCMeta)
T4: <class 'abc.ABCMeta'>
b'\x80\x03cabc\nABCMeta\nq\x00.'
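A stdlib sketch of why zzz disappears: classes reachable by module path are pickled by reference, so the payload only names the class and carries none of its attributes:

```python
import abc
import pickle

abc.ABCMeta.zzz = 1
# the payload just records 'abc.ABCMeta' -- no attribute dict is stored
payload = pickle.dumps(abc.ABCMeta)

restored = pickle.loads(payload)
print(restored is abc.ABCMeta)  # True: same process, same object
# after an interpreter restart, unpickling re-imports abc.ABCMeta fresh,
# so an attribute added only at runtime (like zzz) is simply absent
```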
A pickle from a class with byref=False tends to be large -- the serializing function is _dict_from_dictproxy. This often causes the pickle to contain things like copyright -- at first glance, the contents look as if they may be similar to __builtins__.__dict__.
Would it be possible to reduce the size of the pickle for such a class, by removing any unnecessary items that are currently pickled?
I found another example where Dill fails to pickle objects when running under doctest. This is a one-line addition to my test-case in #18.
I'm using Python 2.7.6 and Dill eb122e6.
/cc @distobj
import dill as pickle
import doctest

pickle.dill._trace(1)

class SomeUnreferencedUnpicklableClass(object):
    def __reduce__(self):
        raise Exception

unpicklable = SomeUnreferencedUnpicklableClass()

# This works fine outside of Doctest:
serialized = pickle.dumps(lambda x: x)

# This fails because it tries to pickle the unpicklable object:
def tests():
    """
    >>> unpicklable = SomeUnreferencedUnpicklableClass() # <-- Added since #18
    >>> serialized = pickle.dumps(lambda x: x)
    """
    return

print "\n\nRunning Doctest:"
doctest.testmod()
Output:
F1: <function <lambda> at 0x101f8f848>
F2: <function _create_function at 0x101c96d70>
Co: <code object <lambda> at 0x100474ab0, file "dillbugtwo.py", line 13>
F2: <function _unmarshal at 0x101c96c08>
D1: <dict object at 0x10031ba20>
D2: <dict object at 0x101da9de0>
Running Doctest:
F1: <function <lambda> at 0x101f8f938>
F2: <function _create_function at 0x101c96d70>
Co: <code object <lambda> at 0x101f369b0, file "<doctest __main__.tests[1]>", line 1>
F2: <function _unmarshal at 0x101c96c08>
D2: <dict object at 0x102206120>
F1: <function tests at 0x101f8f848>
Co: <code object tests at 0x100474e30, file "dillbugtwo.py", line 16>
D1: <dict object at 0x10031ba20>
D2: <dict object at 0x10222e940>
**********************************************************************
File "dillbugtwo.py", line 19, in __main__.tests
Failed example:
serialized = pickle.dumps(lambda x: x)
Exception raised:
Traceback (most recent call last):
File "/Users/joshrosen/anaconda/lib/python2.7/doctest.py", line 1289, in __run
compileflags, 1) in test.globs
File "<doctest __main__.tests[1]>", line 1, in <module>
serialized = pickle.dumps(lambda x: x)
File "/Users/joshrosen/anaconda/lib/python2.7/site-packages/dill-0.2.2.dev-py2.7.egg/dill/dill.py", line 165, in dumps
dump(obj, file, protocol, byref)
File "/Users/joshrosen/anaconda/lib/python2.7/site-packages/dill-0.2.2.dev-py2.7.egg/dill/dill.py", line 158, in dump
pik.dump(obj)
File "/Users/joshrosen/anaconda/lib/python2.7/pickle.py", line 224, in dump
self.save(obj)
File "/Users/joshrosen/anaconda/lib/python2.7/pickle.py", line 286, in save
f(self, obj) # Call unbound method with explicit self
File "/Users/joshrosen/anaconda/lib/python2.7/site-packages/dill-0.2.2.dev-py2.7.egg/dill/dill.py", line 506, in save_function
obj.__dict__), obj=obj)
File "/Users/joshrosen/anaconda/lib/python2.7/pickle.py", line 401, in save_reduce
save(args)
File "/Users/joshrosen/anaconda/lib/python2.7/pickle.py", line 286, in save
f(self, obj) # Call unbound method with explicit self
File "/Users/joshrosen/anaconda/lib/python2.7/pickle.py", line 562, in save_tuple
save(element)
File "/Users/joshrosen/anaconda/lib/python2.7/pickle.py", line 286, in save
f(self, obj) # Call unbound method with explicit self
File "/Users/joshrosen/anaconda/lib/python2.7/site-packages/dill-0.2.2.dev-py2.7.egg/dill/dill.py", line 538, in save_module_dict
StockPickler.save_dict(pickler, obj)
File "/Users/joshrosen/anaconda/lib/python2.7/pickle.py", line 649, in save_dict
self._batch_setitems(obj.iteritems())
File "/Users/joshrosen/anaconda/lib/python2.7/pickle.py", line 681, in _batch_setitems
save(v)
File "/Users/joshrosen/anaconda/lib/python2.7/pickle.py", line 306, in save
rv = reduce(self.proto)
File "dillbugtwo.py", line 8, in __reduce__
raise Exception
Exception
**********************************************************************
1 items had failures:
1 of 2 in __main__.tests
***Test Failed*** 1 failures.
Add a tutorial covering the major features; it should be built white-paper style, demonstrating how to solve a problem or set of problems.
Import time can be significantly reduced by not loading dill.objects. I believe the easiest way is to convert the local imports of dill.objects to strings that will be exec'd upon a call of load_types.
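A minimal sketch of the proposed deferral, with hypothetical names (collections stands in for dill.objects): keep the heavy import as a string and exec it only when load_types() is first called.

```python
# the import statement is stored as text, so nothing is imported at
# module load time; 'collections' is a stand-in for dill.objects
_deferred = "import collections as objects"

_namespace = {}

def load_types():
    if "objects" not in _namespace:
        exec(_deferred, _namespace)   # pay the import cost on first call only
    return _namespace["objects"]

mod = load_types()
print(mod.__name__)  # 'collections'
```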
Importing dill registers pickling handlers. Is it bad behavior that dill cannot be imported without side effects on another module? This could lead to subtle bugs when one library does
import dill as pickle # only needs dill
while another library uses standard pickle and perhaps relies in some way on the fact that some objects cannot be pickled. What about this:
import dill
dill.register_with_pickle()
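One way to sketch the proposed opt-in registration is with the stdlib copyreg module (register_with_pickle is the name proposed above; fractions.Fraction is just a stand-in type, and this is not dill's mechanism):

```python
import copyreg
import fractions
import pickle

def _reduce_fraction(fr):
    # custom reducer: rebuild from numerator and denominator
    return (fractions.Fraction, (fr.numerator, fr.denominator))

def register_with_pickle():
    # explicit, user-triggered registration instead of an import side effect
    copyreg.pickle(fractions.Fraction, _reduce_fraction)

register_with_pickle()
fr2 = pickle.loads(pickle.dumps(fractions.Fraction(3, 4)))
print(fr2)  # 3/4
```

The design point is that nothing global changes until the user explicitly calls the registration function.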
Use getsource(func) and getsource(freevars(func)) to get closure code. Looks like three possible combinations are all that is needed. Should test whether freevars(func) will ever find more than one function.
I've not been able to reproduce this, but I did observe an interpreter session where there were two lambdas built… and then dill.dump_session / dill.load_session was used… and dill.source.getsource picked up the wrong lambda. Attempts to reproduce this erroneous behavior in a less ad hoc way have been unsuccessful.
dill.source methods should only spawn new objects in the calling namespace for those that are specifically requested. For example, the following code introduces f, when all that was desired was bar. The code should be "enclosed" in a dummy closure, if nothing else.
>>> def foo(f):
... def bar(x):
... return f(x)+x
... return bar
...
>>> import math
>>> zap = foo(math.sin)
>>> import dill
>>> print dill.source.importable(zap)
from math import sin as f
def bar(x):
return f(x)+x
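A sketch of the "dummy closure" idea: wrap the emitted source in a factory function so the helper alias f never lands in the caller's namespace (the wrapping layout here is hypothetical, not what dill emits):

```python
# the importable source is wrapped in a factory; only 'bar' is re-exported
# (the factory name itself still appears, but the import alias does not)
wrapped = """\
def _make():
    from math import sin as f
    def bar(x):
        return f(x)+x
    return bar
bar = _make()
"""

ns = {}
exec(wrapped, ns)
print(ns["bar"](0.0))  # sin(0) + 0 = 0.0
print("f" in ns)       # False: the helper name did not leak
```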
I'm encountering problems running doctests via python -m doctest and nosetests --with-doctest. This seems related to #18 (and also in PySpark, FWIW), but differs in that doctest.testmod() executed inside the module doesn't trigger the problem. Sample code:
import dill
import doctest

def test_dill():
    """
    >>> out = dill.dumps(lambda x: x)
    """
    out = dill.dumps(lambda x: x)

doctest.testmod()
When executed via python -m doctest under Python 2.7.3, I get a long recursive stack trace of save_module_dict and save_module calls, concluding with:
...
File "/home/mbaker/venvs/bibframe/local/lib/python2.7/site-packages/dill/dill.py", line 773, in save_module
state=_main_dict)
File "/usr/lib/python2.7/pickle.py", line 419, in save_reduce
save(state)
File "/usr/lib/python2.7/pickle.py", line 286, in save
f(self, obj) # Call unbound method with explicit self
File "/home/mbaker/venvs/bibframe/local/lib/python2.7/site-packages/dill/dill.py", line 504, in save_module_dict
StockPickler.save_dict(pickler, obj)
File "/usr/lib/python2.7/pickle.py", line 649, in save_dict
self._batch_setitems(obj.iteritems())
File "/usr/lib/python2.7/pickle.py", line 681, in _batch_setitems
save(v)
File "/usr/lib/python2.7/pickle.py", line 286, in save
f(self, obj) # Call unbound method with explicit self
File "/home/mbaker/venvs/bibframe/local/lib/python2.7/site-packages/dill/dill.py", line 816, in save_type
StockPickler.save_global(pickler, obj)
File "/usr/lib/python2.7/pickle.py", line 748, in save_global
(obj, module, name))
PicklingError: Can't pickle <class 'unittest.util.Mismatch'>: it's not found as unittest.util.Mismatch
dump_session() seems to fail when used with an IPython notebook set up for inline plots. When the %matplotlib inline magic is used, you get a traceback resulting in
PicklingError: Can't pickle 'RendererAgg' object: <RendererAgg object at 0x42a2bb8>
and when using the --pylab=inline option at startup, you get a traceback resulting in
PicklingError: Can't pickle <class 'matplotlib.axes.AxesSubplot'>: it's not found as matplotlib.axes.AxesSubplot
Using dill 0.2a.dev, IPython 1.1.0.
Using the master branch of dill with the latest commit eb96239, a Python script cannot read its command-line arguments, for example test_args.py:
import sys
import dill

if __name__ == "__main__":
    print sys.argv[1]
Then run test_args.py in the terminal:
$ python test_args.py d
usage: PROG [-h]
PROG: error: unrecognized arguments: d
When I remove import dill, it works.
$ python test_args.py d
d
I am using Python 2.7.3 on Ubuntu 12.04 LTS.
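A likely diagnosis, sketched with the stdlib (this is a guess, not confirmed from dill's source): some code run at import time builds an ArgumentParser (the PROG in the error message) and calls parse_args() with no argument, which implicitly consumes the importing script's sys.argv:

```python
import argparse

parser = argparse.ArgumentParser(prog="PROG")

# parse_args() with no argument implicitly reads sys.argv[1:], so any
# argument meant for the importing script triggers "unrecognized arguments"
try:
    parser.parse_args(["d"])   # simulate the stray positional 'd'
except SystemExit as e:
    print("argparse exited with status", e.code)  # status 2

# the fix on the library side: never parse sys.argv at import time;
# parse an explicit (possibly empty) list instead
print(parser.parse_args([]))  # Namespace()
```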
It will be easier to keep track of whether or not pull requests break anything if you test things with Travis. What test framework do you prefer to use to run the tests?
Hello, I have been trying to install or upgrade pip with easy_install, but I keep getting this error:
easy_install pip
Traceback (most recent call last):
File "/Users/djibrilkeita/bin/easy_install", line 5, in
from pkg_resources import load_entry_point
File "/System/Library/Frameworks/Python.framework/Versions/2.7/Extras/lib/python/pkg_resources.py", line 2603, in
File "/System/Library/Frameworks/Python.framework/Versions/2.7/Extras/lib/python/pkg_resources.py", line 666, in require
File "/System/Library/Frameworks/Python.framework/Versions/2.7/Extras/lib/python/pkg_resources.py", line 565, in resolve
pkg_resources.DistributionNotFound: distribute==0.6.49
I have deleted all the easy_install versions in /usr/bin, /usr/local/bin, and /Users/djibrilkeita/bin and reinstalled setuptools, but it still doesn't work. Any suggestions?
Generally it's more robust to pickle new-style classes by reference (as pickle does), except in the cases when the class definition is changing (or being deleted)… dill serializes the class definition instead of using a reference. However, for some cases, it may be better to serialize by reference -- the pickle is also much smaller. It would be good to be able to select how the new-style class is serialized.
Load of a pickled file handle can create a file. Is that what is desired? If so, what should the file-pointer position in the file be? Right now, the state of the file is preserved, as well as the mode and the position. Here are the consequences...
dude@hilbert>$ python
Python 2.7.8 (default, Jul 3 2014, 05:59:29)
[GCC 4.2.1 Compatible Apple Clang 4.1 ((tags/Apple/clang-421.11.66))] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import dill
>>> f = open('test.txt', 'w+')
>>> f.write('hello')
>>> dill.dump_session('test.pkl')
>>>
Deleting the file and then loading the session simulates going to another computer where test.txt
does not exist.
dude@hilbert>$ rm test.txt
remove test.txt? y
dude@hilbert>$ python
Python 2.7.8 (default, Jul 3 2014, 05:59:29)
[GCC 4.2.1 Compatible Apple Clang 4.1 ((tags/Apple/clang-421.11.66))] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import dill
>>> dill.load_session('test.pkl')
>>> f
<open file 'test.txt', mode 'w+' at 0x1096fac90>
>>> f.write('world')
>>> f.seek(0)
>>> f.read()
'\x00\x00\x00\x00\x00world'
>>>
Similarly, if 'test.txt' existed, but had the contents "goodbye" instead of "hello"… it gets even nastier.
dude@hilbert>$ python
Python 2.7.8 (default, Jul 3 2014, 05:59:29)
[GCC 4.2.1 Compatible Apple Clang 4.1 ((tags/Apple/clang-421.11.66))] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import dill
>>> g = open('test.txt', 'r')
>>> g.read()
'goodbye\n'
>>> dill.load_session('test.pkl')
>>> f
<open file 'test.txt', mode 'w+' at 0x100e69d20>
>>> f.write('world')
>>> f.seek(0)
>>> f.read()
'\x00\x00\x00\x00\x00world'
>>> g.seek(0)
>>> g.read()
'\x00\x00\x00\x00\x00world'
>>>
This indicates that dill is actually serializing more than the file handle; it's in essence serializing the file, but with "filler" if it's creating a new file and the position is greater than zero. Is that the desired behavior? Or would it be better, if a new file is created, to reset the position with seek(0)? Or better that, if the file doesn't exist, the file handle be closed?
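The '\x00' filler can be reproduced with the stdlib alone: seeking past the end of a brand-new file and then writing leaves NUL bytes in the skipped range:

```python
import os
import tempfile

# open a brand-new file, restore a remembered position past EOF, then
# write -- the skipped range reads back as NUL-byte "filler"
path = os.path.join(tempfile.mkdtemp(), "test.txt")
with open(path, "wb+") as f:
    f.seek(5)              # position saved from the old 'hello' file
    f.write(b"world")
    f.seek(0)
    data = f.read()

print(data)  # b'\x00\x00\x00\x00\x00world'
```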
mrocklin@notebook$ cat testdill.py
import dill
def test_dill():
    assert dill.dumps({'foo': 'bar'})
mrocklin@notebook$ nosetests testdill.py
D2: <dict object at 0x17ed1d0>
.
----------------------------------------------------------------------
Ran 1 test in 0.001s
OK
Note the line output during nosetests.
D2: <dict object at 0x17ed1d0>
For larger examples this swamps my screen. It doesn't seem to happen during normal execution. Perhaps this is going out to stderr?
forked from question on issue #13.
Looks like when you import a subclassed numpy.ndarray, it only routes through the StockPickler.
>>> from numpy_new import *
{'color': 'green'}
B2: <built-in function _reconstruct>
T4: <class 'numpy_new.TestArray'>
T4: <type 'numpy.dtype'>
{}
B2: <built-in function _reconstruct>
T4: <class 'numpy_new.TestArray'>
T4: <type 'numpy.dtype'>
{}
{'color': 'green'}
B2: <built-in function _reconstruct>
T4: <class 'numpy_new.TestArray'>
T4: <type 'numpy.dtype'>
{}
B2: <built-in function _reconstruct>
T4: <class 'numpy_new.TestArray'>
T4: <type 'numpy.dtype'>
{}
lawyers...
Pickles of functions (and pickles of things that contain functions, like classes) are not quite deterministic -- they depend on the iteration order of the _reverse_typemap inside dill.py. Depending on the order, either the symbol "LambdaType" or "FunctionType" will be used to represent functions. Either will work as far as unpickling goes, but having different representations of the same value can cause trouble with e.g. caching.
While most invocations of Python 2.x yield the same iteration order for _reverse_typemap, use of the -R flag (recommended for user-facing services; cf. http://www.ocert.org/advisories/ocert-2011-003.html) randomizes this order. Note that the functionality of -R is on by default for versions >= 3.3:
http://docs.python.org/3/whatsnew/3.3.html
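One possible fix, sketched here (not dill's actual code): since LambdaType and FunctionType are the same object in CPython, building the reverse map in sorted-name order pins which symbol is emitted regardless of hash randomization:

```python
import types

# LambdaType is FunctionType in CPython, so a reverse lookup built by
# plain dict iteration can land on either name
typemap = {"FunctionType": types.FunctionType,
           "LambdaType": types.LambdaType}

reverse = {}
for name in sorted(typemap):                 # deterministic order
    reverse.setdefault(typemap[name], name)  # first (sorted) name wins

print(reverse[types.FunctionType])  # 'FunctionType'
```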
If you have a function like this:
@require("module_name")
def require_test(x):
    return True
And you try to use IPython parallel's parallel map, you get this error:
File "/n/home05/kirchner/anaconda/envs/gemini/lib/python2.7/site-packages/IPython/kernel/zmq/serialize.py", line 102, in serialize_object
buffers.insert(0, pickle.dumps(cobj,-1))
File "/n/home05/kirchner/anaconda/envs/ipc/lib/python2.7/pickle.py", line 1374, in dumps
Pickler(file, protocol).dump(obj)
File "/n/home05/kirchner/anaconda/envs/ipc/lib/python2.7/pickle.py", line 224, in dump
self.save(obj)
File "/n/home05/kirchner/anaconda/envs/ipc/lib/python2.7/pickle.py", line 331, in save
self.save_reduce(obj=obj, *rv)
File "/n/home05/kirchner/anaconda/envs/ipc/lib/python2.7/pickle.py", line 419, in save_reduce
save(state)
File "/n/home05/kirchner/anaconda/envs/ipc/lib/python2.7/pickle.py", line 286, in save
f(self, obj) # Call unbound method with explicit self
File "/n/home05/kirchner/anaconda/envs/ipc/lib/python2.7/site-packages/dill-0.2a2.dev-py2.7.egg/dill/dill.py", line 443, in save_module_dict
StockPickler.save_dict(pickler, obj)
File "/n/home05/kirchner/anaconda/envs/ipc/lib/python2.7/pickle.py", line 649, in save_dict
self._batch_setitems(obj.iteritems())
File "/n/home05/kirchner/anaconda/envs/ipc/lib/python2.7/pickle.py", line 681, in _batch_setitems
save(v)
File "/n/home05/kirchner/anaconda/envs/ipc/lib/python2.7/pickle.py", line 331, in save
self.save_reduce(obj=obj, *rv)
File "/n/home05/kirchner/anaconda/envs/ipc/lib/python2.7/pickle.py", line 419, in save_reduce
save(state)
File "/n/home05/kirchner/anaconda/envs/ipc/lib/python2.7/pickle.py", line 286, in save
f(self, obj) # Call unbound method with explicit self
File "/n/home05/kirchner/anaconda/envs/ipc/lib/python2.7/site-packages/dill-0.2a2.dev-py2.7.egg/dill/dill.py", line 443, in save_module_dict
StockPickler.save_dict(pickler, obj)
File "/n/home05/kirchner/anaconda/envs/ipc/lib/python2.7/pickle.py", line 649, in save_dict
self._batch_setitems(obj.iteritems())
File "/n/home05/kirchner/anaconda/envs/ipc/lib/python2.7/pickle.py", line 681, in _batch_setitems
save(v)
File "/n/home05/kirchner/anaconda/envs/ipc/lib/python2.7/pickle.py", line 286, in save
f(self, obj) # Call unbound method with explicit self
File "/n/home05/kirchner/anaconda/envs/ipc/lib/python2.7/site-packages/dill-0.2a2.dev-py2.7.egg/dill/dill.py", line 421, in save_function
obj.func_closure), obj=obj)
File "/n/home05/kirchner/anaconda/envs/ipc/lib/python2.7/pickle.py", line 405, in save_reduce
self.memoize(obj)
File "/n/home05/kirchner/anaconda/envs/ipc/lib/python2.7/pickle.py", line 244, in memoize
assert id(obj) not in self.memo
But if you just use the regular pickle, it works fine. I have a minimal example here:
https://github.com/roryk/ipython-cluster-helper/blob/master/example/example.py
Do you have any thoughts about why? I dug around a bunch but quickly got out of my depth. :)
This works when doit and squared are defined in a file. It misidentifies squared when they are built in the interpreter.
>>> doit = lambda f: lambda x: f(x)**2
>>> @doit
... def squared(x):
... return x
...
Functions include __dict__ in their pickles, so dynamically added attributes are also serialized. Should this feature be added to other types of objects?
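For reference, the attributes in question live in the function's writable __dict__ (stdlib behavior, not dill-specific):

```python
# functions have a writable __dict__, so attributes attached at runtime
# ride along with any serializer that records it
def f(x):
    return x

f.metadata = {"units": "m/s"}
print(f.__dict__)  # {'metadata': {'units': 'm/s'}}
```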
This is off of a clean environment with Python 2.7.6 and pip.
(test-pathos)mrocklin@linux2:~$ pip install dill
Downloading/unpacking dill
Could not find a version that satisfies the requirement dill (from versions: 0.2a1, 0.2a1, 0.1a1)
Cleaning up...
No distributions matching the version for dill
Maybe I'm doing something overly naive. This is what I as a novice user would expect to work though.
I noticed that the version installed via pip install dill (I think 0.1) is broken. The GitHub version seems to work. Any plans on making a new PyPI release soon?
To reproduce:
Place this code in a file:
def f(func):
    def w(*args):
        return func(*args)
    return w

@f
def f2(): pass

import dill
print(dill.dumps(f2))
In an interactive session or another file:
import (your file name)
I have a crash in both versions of Python with the latest source, which looks like:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "./debug.py", line 11, in <module>
print(dill.dumps(f2))
File "./dill/dill.py", line 130, in dumps
dump(obj, file, protocol, byref)
File "./dill/dill.py", line 123, in dump
pik.dump(obj)
File "/usr/lib/python3.3/pickle.py", line 235, in dump
self.save(obj)
File "/usr/lib/python3.3/pickle.py", line 297, in save
f(self, obj) # Call unbound method with explicit self
File "./dill/dill.py", line 438, in save_function
obj.__closure__), obj=obj)
File "/usr/lib/python3.3/pickle.py", line 416, in save_reduce
self.memoize(obj)
File "/usr/lib/python3.3/pickle.py", line 255, in memoize
assert id(obj) not in self.memo
AssertionError
It looks like something to do with pickling the __code__ attribute, then repickling it when doing the __globals__ attribute: if you comment out the __globals__ attribute in dill.py, there is no crash (just an incomplete pickle, presumably).
Follow up for #41, number 2:
>>> import dill
>>> import numpy as np
>>> np.min = np.max
>>> dill.dump_session()
>>> ^D
############ restart ############
>>> import dill
>>> dill.load_session()
>>> np.min([1,2,3,4,5])
1
This could be fixed by deciding whether to pickle a module's __dict__ (https://github.com/uqfoundation/dill/blob/master/dill/dill.py#L763-L764) using a function like:
def _can_pickle_module(mod, _cache={}, _recursion_protection=[None]):
    if _recursion_protection[0] is not None:
        return mod is _recursion_protection[0]
    if mod not in _cache:
        _recursion_protection[0] = mod
        try:
            dumps(mod)
        except:
            _cache[mod] = False
        else:
            _cache[mod] = True
        finally:
            _recursion_protection[0] = None
    return _cache[mod]
and the condition in save_module being:
if _can_pickle_module(obj) or is_dill(pickler) and obj is pickler._main_module:
However, this approach is less efficient. This needs discussion.
Could do it if the whole file is read in...
This is a corner-case, I think... but important for sympy. See comments after the closure of issue #5.
When using full class pickling (e.g. cls.__module__ = '__main__'), pickling dynamically generated class methods fails with "maximum recursion depth exceeded". Looks like a separate issue from #56. Maybe it's due to the handling of the circular references?
Class.__dict__ --> method ; method.im_class --> Class
test case:
import dill
import types

def proto_method(self):
    pass

def make_class(name):
    cls = type(name, (object,), dict())
    setattr(cls, 'methodname', types.MethodType(proto_method, None, cls))
    globals()[name] = cls
    return cls

if __name__ == '__main__':
    NewCls = make_class('NewCls')
    print(dill.pickles(NewCls))
dill.detect.trace:
INFO:dill:T2: <class '__main__.NewCls'>
INFO:dill:F2: <function _create_type at 0x7fe706a281b8>
INFO:dill:T1: <type 'type'>
INFO:dill:F2: <function _load_type at 0x7fe706a28140>
INFO:dill:T1: <type 'object'>
INFO:dill:D2: <dict object at 0x7fe706a31e88>
INFO:dill:Me: <unbound method NewCls.proto_method>
INFO:dill:T1: <type 'instancemethod'>
INFO:dill:F1: <function proto_method at 0x7fe70f2a1a28>
INFO:dill:F2: <function _create_function at 0x7fe706a28230>
INFO:dill:Co: <code object proto_method at 0x7fe710832530, file "/path/test.py", line 4>
INFO:dill:F2: <function _unmarshal at 0x7fe706a280c8>
INFO:dill:D4: <dict object at 0x7fe706a32280>
INFO:dill:D2: <dict object at 0x7fe706a1ab40>
INFO:dill:T2: <class '__main__.NewCls'>
INFO:dill:D2: <dict object at 0x7fe706a22c58>
INFO:dill:Me: <unbound method NewCls.proto_method>
INFO:dill:T2: <class '__main__.NewCls'>
INFO:dill:D2: <dict object at 0x7fe706a20b40>
INFO:dill:Me: <unbound method NewCls.proto_method>
INFO:dill:T2: <class '__main__.NewCls'>
INFO:dill:D2: <dict object at 0x7fe706a22d70>
INFO:dill:Me: <unbound method NewCls.proto_method>
...etc...
Python 3 support would be great. I recommend using a single codebase. You just have to add a few compatibility definitions (or you can depend on six, but IMHO it's overkill).
From #57: when a file is dumped, then the file is deleted, then the pickled file is loaded… and the file_mode is not such that the entire file was pickled (e.g. just the file handle was pickled), a new name is needed. Should it be os.devnull? Look at what Python does for similar cases, if possible.
can't pickle new-style classes across interpreter sessions
There's a bug (I'm sure more than one) such that when you make a closure with a lambda as the inner function (as in foo below), dill.source.importable(bar) will puke out the entire history file. So that's not good. The following code reproduces the error.
>>> def foo(f):
... squared = lambda x: f(x)**2
... return squared
...
>>> @foo
... def bar(x):
... return 2*x
>>>
>>> print dill.source.importable(bar)
I'm not sure what kind of impact that might have if one were to ignore an object... then expect to start up a session again and have everything work. Maybe it's not up to dill to care… and it's the user's problem if it blows things up in the dump/load of the session.
I've been using dill via the direct dill.Pickler() and dill.Unpickler() interface. This was crashing because the normal constructors don't set dill.Pickler._main_module and dill.Unpickler._main_module. This seems unusual, and I'm wondering what the rationale is for it.
I added the following __init__() methods to dill.Pickler and dill.Unpickler, and they seemed to solve my problem. Am I missing something here? Should this change be incorporated into the GitHub repo?
### Extend the Picklers
class Pickler(StockPickler):
    """python's Pickler extended to interpreter sessions"""
    dispatch = StockPickler.dispatch.copy()
    _main_module = None
    _session = False

    def __init__(self, *args, **kwargs):
        StockPickler.__init__(self, *args, **kwargs)
        self._main_module = _main_module

class Unpickler(StockUnpickler):
    """python's Unpickler extended to interpreter sessions and more types"""
    _main_module = None
    _session = False

    def find_class(self, module, name):
        if (module, name) == ('__builtin__', '__main__'):
            return self._main_module.__dict__ #XXX: above set w/save_module_dict
        return StockUnpickler.find_class(self, module, name)

    def __init__(self, *args, **kwargs):
        StockUnpickler.__init__(self, *args, **kwargs)
        self._main_module = _main_module
dill.source.getsource(some_function, enclosing=True) appears to return incorrect output when called from within an IPython terminal or Notebook session. The results differ from those of an ordinary Python terminal.
I'm running Python 2.7.7 |Anaconda 2.0.1 (64-bit)| (default, Jun 11 2014, 10:40:02) [MSC v.1500 64 bit (AMD64)] on Windows 7 Professional.
Code like this:
import pickle
import dill
...
pickle.dumps(f)
Yields an error like this:
File "/home/mrocklin/Software/anaconda/lib/python2.7/pickle.py", line 1374, in dumps
Pickler(file, protocol).dump(obj)
File "/home/mrocklin/Software/anaconda/lib/python2.7/pickle.py", line 224, in dump
self.save(obj)
File "/home/mrocklin/Software/anaconda/lib/python2.7/pickle.py", line 331, in save
self.save_reduce(obj=obj, *rv)
File "/home/mrocklin/Software/anaconda/lib/python2.7/pickle.py", line 419, in save_reduce
save(state)
File "/home/mrocklin/Software/anaconda/lib/python2.7/pickle.py", line 286, in save
f(self, obj) # Call unbound method with explicit self
File "/home/mrocklin/Software/anaconda/lib/python2.7/site-packages/dill/dill.py", line 501, in save_module_dict
if pickler._session: # we only care about session the first pass thru
AttributeError: Pickler instance has no attribute '_session'
In [1]: import dill
In [2]: dill.__version__
Out[2]: '0.2'
Need to look at how decorators on classes pickle... see if there are any unexpected results. It 'works', but in some cases the objects don't seem to COPY correctly.
I stumbled over a weird case where dill fails to serialize an object that pickle handles fine. I guess the issue has something to do with inheriting from a builtin type and the usage of super in overriding append.
import pickle
import dill

class InheritsList(list):
    """docstring for InheritsList"""
    def __init__(self):
        super(InheritsList, self).__init__()
    def append(self, obj):
        super(InheritsList, self).append(obj)

obj = InheritsList()
obj.append('string')

# Works
repickled = pickle.loads(pickle.dumps(obj))
print repickled

# Does not work
redilled = dill.loads(dill.dumps(obj))
print repickled
It's related to an issue I posted here.
opencobra/cobrapy#72
The code to do this is obvious, but do it with ctypes, so that dill still builds as pure Python.
I believe inspect.getsource retrieves code blocks with decorators on decorated functions; however, dill.source.getblocks doesn't. It should, or at least have the option to do so.
Additionally, there should be an option to return 'enclosing' code (i.e. just the inner function, or the enclosing block as well).
When pickling functions defined inside doctests, dill seems to include additional objects in the function closure even if the function doesn't reference them. This can cause the pickling to fail if some of these unreferenced objects are unpicklable; it also adds bloat to the serialized function.
I'm one of the authors of PySpark, a Python API for the Spark cluster computing framework, and I'm trying to use dill to replace our current function serializer. We're currently using PiCloud's cloudpickle library, which seems to handle these doctest cases properly; I'd like to switch to dill because it seems to be more actively developed and handles some cases that cloudpickle doesn't handle properly.
Dill works perfectly from the Python shell, but its different behavior in doctests is causing our test suite to break (unpicklable Py4J-wrapped Java objects are included in closures, among other issues).
Here's a small standalone testcase that reproduces the issue on Python 2.7.5:
import dill as pickle
import doctest
import logging

logging.basicConfig(level=logging.DEBUG)

class SomeUnreferencedUnpicklableClass(object):
    def __reduce__(self):
        raise Exception

unpicklable = SomeUnreferencedUnpicklableClass()

# This works fine outside of Doctest:
serialized = pickle.dumps(lambda x: x)

# This fails because it tries to pickle the unpicklable object:
def tests():
    """
    >>> serialized = pickle.dumps(lambda x: x)
    """
    return

print "\n\nRunning Doctest:"
doctest.testmod()
Here's the output, which shows that the unpicklable object is being included in the closure when running under doctest:
F1: <function <lambda> at 0x110b65de8>
INFO:dill:F1: <function <lambda> at 0x110b65de8>
T1: <type 'function'>
INFO:dill:T1: <type 'function'>
F2: <function _load_type at 0x110b626e0>
INFO:dill:F2: <function _load_type at 0x110b626e0>
Co: <code object <lambda> at 0x10ff4d5b0, file "unpickleable.py", line 14>
INFO:dill:Co: <code object <lambda> at 0x10ff4d5b0, file "unpickleable.py", line 14>
F2: <function _unmarshal at 0x110b62668>
INFO:dill:F2: <function _unmarshal at 0x110b62668>
D1: <dict object at 0x7f9401c1aee0>
INFO:dill:D1: <dict object at 0x7f9401c1aee0>
Running Doctest:
F1: <function <lambda> at 0x110b6e398>
INFO:dill:F1: <function <lambda> at 0x110b6e398>
T1: <type 'function'>
INFO:dill:T1: <type 'function'>
F2: <function _load_type at 0x110b626e0>
INFO:dill:F2: <function _load_type at 0x110b626e0>
Co: <code object <lambda> at 0x110b664b0, file "<doctest __main__.tests[0]>", line 1>
INFO:dill:Co: <code object <lambda> at 0x110b664b0, file "<doctest __main__.tests[0]>", line 1>
F2: <function _unmarshal at 0x110b62668>
INFO:dill:F2: <function _unmarshal at 0x110b62668>
D2: <dict object at 0x7f9401e930c0>
INFO:dill:D2: <dict object at 0x7f9401e930c0>
F1: <function tests at 0x110b65de8>
INFO:dill:F1: <function tests at 0x110b65de8>
Co: <code object tests at 0x10ff4d630, file "unpickleable.py", line 17>
INFO:dill:Co: <code object tests at 0x10ff4d630, file "unpickleable.py", line 17>
D1: <dict object at 0x7f9401c1aee0>
INFO:dill:D1: <dict object at 0x7f9401c1aee0>
M2: <module 'logging' from '/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/logging/__init__.py'>
INFO:dill:M2: <module 'logging' from '/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/logging/__init__.py'>
F2: <function _import_module at 0x110b62f50>
INFO:dill:F2: <function _import_module at 0x110b62f50>
M2: <module '__builtin__' (built-in)>
INFO:dill:M2: <module '__builtin__' (built-in)>
M2: <module 'doctest' from '/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/doctest.py'>
INFO:dill:M2: <module 'doctest' from '/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/doctest.py'>
T2: <class '__main__.SomeUnreferencedUnpicklableClass'>
INFO:dill:T2: <class '__main__.SomeUnreferencedUnpicklableClass'>
F2: <function _create_type at 0x110b62758>
INFO:dill:F2: <function _create_type at 0x110b62758>
T1: <type 'type'>
INFO:dill:T1: <type 'type'>
T1: <type 'object'>
INFO:dill:T1: <type 'object'>
D2: <dict object at 0x7f9402853920>
INFO:dill:D2: <dict object at 0x7f9402853920>
F1: <function __reduce__ at 0x110b6e1b8>
INFO:dill:F1: <function __reduce__ at 0x110b6e1b8>
Co: <code object __reduce__ at 0x10ff4fa30, file "unpickleable.py", line 8>
INFO:dill:Co: <code object __reduce__ at 0x10ff4fa30, file "unpickleable.py", line 8>
D1: <dict object at 0x7f9401c1aee0>
INFO:dill:D1: <dict object at 0x7f9401c1aee0>
M2: <module 'dill' from '/Users/joshrosen/env/lib/python2.7/site-packages/dill/__init__.pyc'>
INFO:dill:M2: <module 'dill' from '/Users/joshrosen/env/lib/python2.7/site-packages/dill/__init__.pyc'>
**********************************************************************
File "unpickleable.py", line 19, in __main__.tests
Failed example:
serialized = pickle.dumps(lambda x: x)
Exception raised:
Traceback (most recent call last):
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/doctest.py", line 1254, in __run
compileflags, 1) in test.globs
File "<doctest __main__.tests[0]>", line 1, in <module>
serialized = pickle.dumps(lambda x: x)
File "/Users/joshrosen/env/lib/python2.7/site-packages/dill/dill.py", line 121, in dumps
dump(obj, file, protocol)
File "/Users/joshrosen/env/lib/python2.7/site-packages/dill/dill.py", line 115, in dump
pik.dump(obj)
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pickle.py", line 224, in dump
self.save(obj)
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pickle.py", line 286, in save
f(self, obj) # Call unbound method with explicit self
File "/Users/joshrosen/env/lib/python2.7/site-packages/dill/dill.py", line 418, in save_function
obj.func_closure), obj=obj)
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pickle.py", line 401, in save_reduce
save(args)
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pickle.py", line 286, in save
f(self, obj) # Call unbound method with explicit self
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pickle.py", line 562, in save_tuple
save(element)
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pickle.py", line 286, in save
f(self, obj) # Call unbound method with explicit self
File "/Users/joshrosen/env/lib/python2.7/site-packages/dill/dill.py", line 440, in save_module_dict
StockPickler.save_dict(pickler, obj)
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pickle.py", line 649, in save_dict
self._batch_setitems(obj.iteritems())
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pickle.py", line 681, in _batch_setitems
save(v)
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pickle.py", line 306, in save
rv = reduce(self.proto)
File "unpickleable.py", line 9, in __reduce__
raise Exception
Exception
**********************************************************************
1 items had failures:
1 of 1 in __main__.tests
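My reading of the root cause (an assumption on my part, not confirmed by the dill authors): doctest executes each example in a *copy* of the module's globals (`test.globs`), so a lambda defined inside an example carries that whole copy as its `__globals__`. A serializer that pickles an unrecognized globals dict by value then drags in every module-level object, referenced or not. The module name `fakemod` and the `module_globals` dict below are made up for illustration; this sketch uses only the stdlib:

```python
# Simulate doctest's execution model: examples run against a *copy* of the
# module's globals, so a lambda defined there gets the copy as __globals__.
module_globals = {'__name__': 'fakemod', 'unpicklable': object()}

globs = dict(module_globals)       # what doctest passes around as test.globs
exec("f = lambda x: x", globs)     # the doctest example
f = globs['f']

assert f.__globals__ is globs                 # the copy, not the real module dict
assert f.__globals__ is not module_globals
assert 'unpicklable' in f.__globals__         # the unreferenced object rides along
```

Outside doctest, the lambda's `__globals__` is the actual `__main__` module dict, which a serializer can recognize and handle specially; inside doctest it is an anonymous copy, which presumably falls through to pickling by value.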
I would like to generate new classes programmatically at runtime using type(), then serialize these (for use on other computers in a cluster). Is there a way to get dill to pickle these?
Strangely, the classes seem to pickle OK if this is done in the __main__ module, but not in any other module. Minimal test:
classmaker.py:
import dill

def f():
    cls = type('NewCls', (object,), dict())
    print(dill.pickles(cls))

if __name__ == "__main__":
    f()
consumer.py:
import classmaker
classmaker.f()
Running these:
$ python classmaker.py
True
$ python consumer.py
False
In the second case the pickling exception is: Can't pickle <class 'classmaker.NewCls'>: it's not found as classmaker.NewCls
This is an interesting use case, detailed on stackoverflow.
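A sketch of why the lookup fails, using stdlib pickle (the assumption being that dill falls back to the same by-reference path for classes defined outside __main__): a class is serialized as a module-plus-name reference, and pickling fails if that name does not actually resolve back to the class. The `types.ModuleType` workaround shown at the end is my own suggestion, not something from the dill docs:

```python
import pickle
import sys
import types

# A dynamically created class, pretending it was made inside classmaker.py:
cls = type('NewCls', (object,), {})
cls.__module__ = 'classmaker'

# Pickle-by-reference checks that classmaker.NewCls really resolves to cls;
# since no such attribute exists, this raises PicklingError.
try:
    pickle.dumps(cls)
except pickle.PicklingError as e:
    print(e)  # e.g. "Can't pickle <class 'classmaker.NewCls'>: it's not found as classmaker.NewCls"

# One workaround: publish the class in its module's namespace so the
# by-reference lookup succeeds.
mod = types.ModuleType('classmaker')
sys.modules['classmaker'] = mod
mod.NewCls = cls
assert pickle.loads(pickle.dumps(cls)) is cls
```

This matches the reported error exactly: `f()` creates NewCls but never binds it as an attribute of the classmaker module, so nothing at `classmaker.NewCls` exists for the pickler to find.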