I'm not sure what kind of impact that might have if one would ignore an object... then

should dump_session accept a list/dict of objects to ignore? (dill issue, open, 9 comments)

uqfoundation commented on July 18, 2024 6

should dump_session accept a list/dict of objects to ignore?

from dill.

Comments (9)

RuneScape314159265 commented on July 18, 2024 3

Firstly, what an epic tool! Super useful when working with Jupyter notebooks that take a long time to complete, where recomputing everything is either a) impossible or b) a massive pain. Thank you for making it!

I think it would be great if @mmckerns's hack (the map/globals().pop one-liner posted in this thread) were incorporated into the package itself, i.e.

dill.dump_session(fileName, ignore=True)

would produce a prompt such as:

Variables:
- var1
- var3
- var5 
could not be pickled and will not be restored. Do you wish to continue (everything else that can be will be stored)? [y/n]:

The user types y to continue, and dill saves everything about the current environment that it can, skipping the variables listed.

In a large file with a lot of globals, I imagine that even running the check might take a while (it does in some of my files). So there should probably be an additional flag that controls whether dill saves automatically even when it can't pickle everything. The default would be yes; if the user sets it to no and dill can't pickle everything, as above, they are prompted to confirm before continuing.

It would be a big quality-of-life improvement for me, and since I'm hardly unique, I'm guessing for many others too.

Hunting online for a workaround isn't easy: this post, which presents the best solution, is quite hidden away, and I'm guessing many people miss it or don't read through it.
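The check described in this comment (find the variables that cannot be pickled, then report them before saving) could be sketched roughly as below. This is only an illustration, not dill's API: `can_pickle`, `unpicklable_names`, and `format_report` are hypothetical helper names, and the fallback to the standard `pickle` module is an assumption made purely so the sketch runs where dill is not installed.

```python
try:
    import dill as serializer  # preferred, as in this thread
except ImportError:
    import pickle as serializer  # assumption: stdlib fallback so the sketch runs anywhere


def can_pickle(obj):
    """Return True if the serializer can dump obj, False on any error."""
    try:
        serializer.dumps(obj)
        return True
    except Exception:
        return False


def unpicklable_names(namespace, skip=("__builtins__",)):
    """List the names in namespace whose values cannot be serialized."""
    return sorted(
        name
        for name, value in list(namespace.items())
        if name not in skip and not can_pickle(value)
    )


def format_report(names):
    """Build the kind of message proposed above; '' if nothing is skipped."""
    if not names:
        return ""
    lines = ["Variables:"] + [f"- {name}" for name in names]
    lines.append("could not be pickled and will not be restored.")
    return "\n".join(lines)
```

Calling `unpicklable_names(globals())` in a session would give the list to show the user. Whether to prompt or continue silently is then a separate decision, which is what the proposed flag would control.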

######################################################

P.S. Until then, @mmckerns's fix needs a tiny bit of updating. dict.iteritems was removed in Python 3 (use items instead). Also, map is lazy in Python 3, so on its own it doesn't actually do anything. An easy way to force execution is to wrap it in list. In all:

list(map(globals().pop, tuple(i for (i,j) in globals().items() if not dill.pickles(j) and i not in ('__builtins__',))))
dill.dump_session("testing.db")

matsjoyce commented on July 18, 2024 1

We could replace the objects with a dill.Ignored singleton. It'd be a relatively simple change, but I'm having difficulty visualising a circumstance where it's OK to have a load_session where half the things don't work... But, as you say, it's the user's problem. It would just have to come with a big docstring saying "Use as last resort! The object will NOT work on the other end!".

mmckerns commented on July 18, 2024 1

It might make sense when there's an isolated object, such as a generator that was created but not used… but in the case that it's a matplotlib plot in an IPython session, and it doesn't serialize, but the user is primarily wanting to capture it… I think it's not so good.

Maybe a better alternative would be dump_session(ignore=True), where dill just "skips" anything that it can't serialize (i.e. catch all serialization errors and move on). Then only some corner cases blow up on load, and the same could be done there. Is a session with missing bits then worthwhile for the user? That's for the user to decide.

matsjoyce commented on July 18, 2024

Yup, I suppose so. What would be the best way to implement that? Overload Pickler.save and Unpickler.load?

mmckerns commented on July 18, 2024

I think either overload the dump method of the dill.Pickler, or wrap the behavior into the dump_session function call -- probably the former. I could see a pure try-except approach, or an approach driven by the methods in dill.detect. Similarly for load.
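The "pure try-except approach" mentioned here could look roughly like the sketch below. It wraps the behavior in a function instead of overloading dill.Pickler, and it works on a plain dict, not a whole session. The names `dump_skipping_errors` and `load_saved` are hypothetical, and the fallback to the standard `pickle` module is an assumption so the sketch runs without dill installed.

```python
import io

try:
    import dill as serializer
except ImportError:
    import pickle as serializer  # assumption: stdlib fallback for illustration


def dump_skipping_errors(namespace, file, skip=("__builtins__",)):
    """Write the picklable part of namespace to file; return the skipped names."""
    saved, skipped = {}, []
    for name, value in namespace.items():
        if name in skip:
            continue
        try:
            serializer.dumps(value)  # trial run: does this value serialize?
        except Exception:
            skipped.append(name)  # catch the error and move on
        else:
            saved[name] = value
    serializer.dump(saved, file)
    return skipped


def load_saved(file):
    """Read back the dict written by dump_skipping_errors."""
    return serializer.load(file)


# Usage with an in-memory buffer standing in for a session file.
buffer = io.BytesIO()
skipped = dump_skipping_errors({"x": [1, 2], "gen": (i for i in range(2))}, buffer)
buffer.seek(0)
restored = load_saved(buffer)
```

Each value is serialized twice here (once as a trial, once in the final dump), which is simple but not the most efficient. An approach driven by dill.detect, or a Pickler subclass, could avoid the extra pass.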

mittenchops commented on July 18, 2024

I just asked a StackOverflow question that might be a use case for this:
http://stackoverflow.com/questions/27351980/how-to-add-a-custom-type-to-dills-pickleable-types

matsjoyce commented on July 18, 2024

OK, @mittenchops, question answered. It could be a use case, but remember that if anything on the other end needs your collection, it wouldn't be there.

mmckerns commented on July 18, 2024

Currently when I do a dump_session, I do something like this:

map(globals().pop, tuple(i for (i,j) in globals().iteritems() if not dill.pickles(j) and i not in ('__builtins__',)))

to remove all objects that will not pickle. There's probably a better way to do it, but I tend to at least do some variant of the above on-the-fly. This will not be the most efficient, but will work as long as dill.pickles is correct (which is overwhelmingly most of the time).
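One possible variant of this trick removes the unpicklable names only for the duration of the dump and puts them back afterwards, so the live session is not damaged. The sketch below is an illustration under assumptions: `without_unpicklable` and `can_pickle` are hypothetical helpers (dill.pickles could serve as the check when dill is available), and the fallback to the standard `pickle` module exists only so the sketch runs without dill installed.

```python
import contextlib

try:
    import dill as serializer
except ImportError:
    import pickle as serializer  # assumption: stdlib fallback for illustration


def can_pickle(obj):
    """Return True if the serializer can dump obj, False on any error."""
    try:
        serializer.dumps(obj)
        return True
    except Exception:
        return False


@contextlib.contextmanager
def without_unpicklable(namespace, keep=("__builtins__",)):
    """Temporarily remove unpicklable entries from namespace, then restore them."""
    removed = {
        name: value
        for name, value in list(namespace.items())
        if name not in keep and not can_pickle(value)
    }
    for name in removed:
        del namespace[name]
    try:
        yield removed  # the caller can inspect what was taken out
    finally:
        namespace.update(removed)  # put everything back, even on error


# Intended use in an interactive session (not executed here):
#     with without_unpicklable(globals()):
#         dill.dump_session("testing.db")
```

Like the one-liner, this is only as reliable as the picklability check itself.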

leogama commented on July 18, 2024

Hello, I'm working on a new feature like this for dump_session(). It is not always possible or convenient to delete unpicklable objects, or large objects that are cheap to regenerate, from the namespace before saving the session, since they may still be needed afterwards. So I consider this a relevant feature.

Before I submit a draft PR, what do you think the API should look like? And how should load_session() behave in this case? Should it simply ignore the variables that were not saved, restore them as a dill singleton as suggested, or do that only for variables not defined in the namespace?

I already have a working prototype that can deal with IPython's command history variables. 👌🏼
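The three load-time options raised in this comment can be compared side by side with a small sketch. Nothing here is dill's API or the prototype mentioned above: `restore`, `PLACEHOLDER`, and the policy names are hypothetical, chosen only to illustrate the choices.

```python
PLACEHOLDER = object()  # hypothetical stand-in for the suggested "dill singleton"

POLICIES = ("ignore", "placeholder", "placeholder_if_undefined")


def restore(namespace, saved, skipped, policy="ignore"):
    """Merge saved into namespace and handle the skipped names per policy.

    ignore:                   leave skipped names exactly as they are
    placeholder:              bind every skipped name to PLACEHOLDER
    placeholder_if_undefined: bind only skipped names that are not defined
    """
    if policy not in POLICIES:
        raise ValueError(f"unknown policy: {policy!r}")
    namespace.update(saved)
    if policy == "ignore":
        return namespace
    for name in skipped:
        if policy == "placeholder" or name not in namespace:
            namespace[name] = PLACEHOLDER
    return namespace
```

With "placeholder_if_undefined", an object that still exists in the running session survives a reload, while a fresh interpreter gets an explicit marker in place of a missing name.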
