Comments (7)

matsjoyce avatar matsjoyce commented on August 17, 2024

Yup, I can reproduce this. It's failing because doctest makes a copy of the globals. #18 was fixed because the unpicklable object in the copy was the same object as the global one, so the two dicts compared equal. The trace from the #18 test:

F1: <function <lambda> at 0x209c7d0>
F2: <function _create_function at 0x1f9ccf8>
Co: <code object <lambda> at 0x7f14c60b1f30, file "a.py", line 13>
F2: <function _unmarshal at 0x1f9cb90>
unpicklable => 34159056
__builtins__ => 139727198997256
__file__ => 25442704
__package__ => 139727197077152
doctest => 33743880
__name__ => 139727198740816
SomeUnreferencedUnpicklableClass => 33904736
pickle => 25437528
__doc__ => 139727197077152
D1: <dict object at 0x1776450>
D2: <dict object at 0x20535b0>


Running Doctest:
F1: <function <lambda> at 0x209c9b0>
F2: <function _create_function at 0x1f9ccf8>
Co: <code object <lambda> at 0x2014b30, file "<doctest __main__.tests[0]>", line 1>
F2: <function _unmarshal at 0x1f9cb90>
tests => 34195408
unpicklable => 34159056 <=== Equal to above
__builtins__ => 139727198997256
__file__ => 25442704
__package__ => 139727197077152
serialized => 34050816
doctest => 33743880
__name__ => 139727198740816
SomeUnreferencedUnpicklableClass => 33904736
pickle => 25437528
__doc__ => 139727197077152
D1: <dict object at 0x20c1a80>
D2: <dict object at 0x21493c0>

This means that the crucial conditions in save_moduledict pass:

if is_dill(pickler) and obj == pickler._main_module.__dict__ and not pickler._session:
    code continues
# and
elif not is_dill(pickler) and obj == _main_module.__dict__:
    code continues

Defining another object called unpicklable inside the doctest rebinds that name in doctest's copy of the globals. As SomeUnreferencedUnpicklableClass does not define __eq__, the new instance does not compare equal to the global one, so the conditions above fail, the module's __dict__ is pickled, and the test fails.

F1: <function <lambda> at 0x2b977d0>
F2: <function _create_function at 0x2a99cf8>
Co: <code object <lambda> at 0x7f648a1b9f30, file "a.py", line 13>
F2: <function _unmarshal at 0x2a99b90>
unpicklable => 45665168
__builtins__ => 140069790829320
__file__ => 36972944
__package__ => 140069788909216
doctest => 45285456
__name__ => 140069790572880
SomeUnreferencedUnpicklableClass => 44567424
pickle => 36967768
__doc__ => 140069788909216
D1: <dict object at 0x2275450>
D2: <dict object at 0x2b4fa80>


Running Doctest:
F1: <function <lambda> at 0x2b97938>
F2: <function _create_function at 0x2a99cf8>
Co: <code object <lambda> at 0x2b11b30, file "<doctest __main__.tests[1]>", line 1>
F2: <function _unmarshal at 0x2a99b90>
tests => 45709264
unpicklable => 45770192
__builtins__ => 140069790829320
__file__ => 36972944
__package__ => 140069788909216
serialized => 45560576
doctest => 45285456
__name__ => 140069790572880
SomeUnreferencedUnpicklableClass => 44567424
pickle => 36967768
__doc__ => 140069788909216
D2: <dict object at 0x2bbe0c0>
F1: <function tests at 0x2b977d0>
Co: <code object tests at 0x232ba30, file "a.py", line 16>
tests => 45709264
unpicklable => 45665168 <=== Not equal to above, so if statements fail
__builtins__ => 140069790829320
__file__ => 36972944
__package__ => 140069788909216
serialized => 45560576
doctest => 45285456
__name__ => 140069790572880
SomeUnreferencedUnpicklableClass => 44567424
pickle => 36967768
__doc__ => 140069788909216
D1: <dict object at 0x2275450>
D2: <dict object at 0x2c46500>

Unfortunately, I can't see an obvious solution, except to make the SomeUnreferencedUnpicklableClass compare equal.
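
The comparison behaviour behind this can be reproduced without dill or doctest. Below is a minimal sketch (Python 3), assuming two hypothetical stand-in classes, `Opaque` and `AlwaysEqual`, in place of SomeUnreferencedUnpicklableClass. Dict equality compares values pairwise and treats identical objects as equal, so a copy that shares the same object compares equal, while a copy holding a fresh instance does not unless `__eq__` says so.

```python
class Opaque(object):
    """Hypothetical stand-in: default equality, i.e. identity only."""


class AlwaysEqual(object):
    """Hypothetical stand-in: every instance compares equal to anything."""
    def __eq__(self, other):
        return True
    __hash__ = object.__hash__  # keep instances hashable under Python 3


module_globals = {"unpicklable": Opaque()}

# doctest runs its examples against a copy of the module globals.
doctest_globals = module_globals.copy()
same_object_equal = (doctest_globals == module_globals)

# Rebinding the name inside the doctest puts a different instance in the copy.
doctest_globals["unpicklable"] = Opaque()
rebound_equal = (doctest_globals == module_globals)

# With an __eq__ that always returns True, the dicts compare equal again.
eq_module = {"unpicklable": AlwaysEqual()}
eq_doctest = dict(eq_module, unpicklable=AlwaysEqual())
eq_trick_equal = (eq_doctest == eq_module)

print(same_object_equal, rebound_equal, eq_trick_equal)
```

The three flags correspond to the #18 case, the failing case here, and the `__eq__` workaround, in that order.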

from dill.

JoshRosen avatar JoshRosen commented on August 17, 2024

@matsjoyce The issue seems to arise even if the doctest's unpicklable doesn't override a global variable, because this code also reproduces the bug:

import dill as pickle
import doctest

pickle.dill._trace(1)

class SomeUnreferencedUnpicklableClass(object):
    def __reduce__(self):
        raise Exception

# This fails because it tries to pickle the unpicklable object:
def tests():
    """
    >>> unpicklable = SomeUnreferencedUnpicklableClass()
    >>> serialized = pickle.dumps(lambda x: x)
    """
    return

print "\n\nRunning Doctest:"
doctest.testmod()

I only included the other code to show that the example works correctly outside of doctest.

matsjoyce avatar matsjoyce commented on August 17, 2024

Well, yeah, anything that makes globals() == doctest's globals() return False will make it fail.
Doing

import dill as pickle
import doctest

pickle.dill._trace(1)

class SomeUnreferencedUnpicklableClass(object):
    def __reduce__(self):
        raise Exception
    def __eq__(self, other):
        return True

unpicklable = SomeUnreferencedUnpicklableClass()

# This works fine outside of Doctest:
serialized = pickle.dumps(lambda x: x)

# This fails because it tries to pickle the unpicklable object:
def tests():
    """
    >>> unpicklable = SomeUnreferencedUnpicklableClass()  # <-- Added since #18
    >>> serialized = pickle.dumps(lambda x: x)
    """
    return

print "\n\nRunning Doctest:"
doctest.testmod()

as predicted, does pass.

JoshRosen avatar JoshRosen commented on August 17, 2024

The __eq__ trick will work sometimes, but I'd like users to be able to define new variables in their doctests without having to add a corresponding dummy global variable.

Is there any way to determine which members of doctest's globals aren't referenced by the function? If so, maybe we could work with a copy of the doctest globals() dict where we remove unreferenced variables that aren't equal to the corresponding module globals() entries. It seems like this could be impossible in general, since I could write a function that accepts a string and uses it to index into doctest's globals, but maybe it's doable for typical functions?
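
One way to approximate "which globals does this function reference" is to read the names recorded in its code object. The sketch below (Python 3) is only an illustration of that idea, not dill's implementation; `referenced_names`, `trimmed_globals`, `SCALE`, `UNUSED`, `scaled` and `dynamic` are hypothetical names. Note that `co_names` also lists attribute names, so the result can include more than is strictly needed, and a run-time lookup through `globals()` is invisible to it.

```python
import types


def referenced_names(code):
    """Collect names a code object (and any nested code objects) may look up."""
    names = set(code.co_names)
    for const in code.co_consts:
        if isinstance(const, types.CodeType):
            names |= referenced_names(const)
    return names


def trimmed_globals(func):
    """Return only the entries of func's globals that its bytecode names."""
    names = referenced_names(func.__code__)
    return {k: v for k, v in func.__globals__.items() if k in names}


SCALE = 3
UNUSED = object()


def scaled(x):
    return SCALE * x


def dynamic(name):
    # The looked-up name is only known at run time, so it cannot be found
    # by inspecting the code object.
    return globals()[name]


print(sorted(trimmed_globals(scaled)))
print(sorted(trimmed_globals(dynamic)))
```

For `scaled`, only `SCALE` survives the trimming; for `dynamic`, `UNUSED` is dropped even though a caller could still request it by name, which is the general-case limitation described above.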

matsjoyce avatar matsjoyce commented on August 17, 2024

Adding a dummy global will not work either. The variable in the doctest has to either compare equal or have the exact same id. You could hack something together using dill.source, inspect and tokenize, but it wouldn't be very general, so it probably shouldn't be included in mainline dill. I'll give it a go.

matsjoyce avatar matsjoyce commented on August 17, 2024

This works:

@register(FunctionType)
def save_function(pickler, obj):
    if not _locate_function(obj): #, pickler._session):
        log.info("F1: %s" % obj)
        if PY3:
            code, globs, name, defaults, closure = obj.__code__, obj.__globals__, obj.__name__, obj.__defaults__, obj.__closure__
        else:
            code, globs, name, defaults, closure = obj.func_code, obj.func_globals, obj.func_name, obj.func_defaults, obj.func_closure
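        # Keep only the globals whose names appear as NAME tokens in the source.
        import inspect
        import tokenize, token
        from io import StringIO
        source = inspect.getsource(obj)
        if not PY3:
            source = unicode(source)
        # generate_tokens expects a readline that returns text, not bytes
        toks = tokenize.generate_tokens(StringIO(source).readline)
        names = {tok[1] for tok in toks if tok[0] == token.NAME}
        globs_copy = {k: v for k, v in globs.items() if k in names}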
        pickler.save_reduce(_create_function, (code, globs_copy, name, defaults, closure,
                                obj.__dict__), obj=obj)
    else:
        log.info("F2: %s" % obj)
        StockPickler.save_global(pickler, obj) #NOTE: also takes name=...
    return

But, as you say, it won't work for lambda x: globals()[x].
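
The token-based filtering and its blind spot can be seen in isolation. This is a minimal standalone sketch (Python 3) that tokenizes source strings directly rather than calling inspect.getsource; `names_in_source`, `helper` and `OFFSET` are hypothetical names used only for illustration.

```python
import io
import token
import tokenize


def names_in_source(source):
    """Return the set of NAME tokens (identifiers and keywords) in source text."""
    readline = io.StringIO(source).readline
    return {tok.string for tok in tokenize.generate_tokens(readline)
            if tok.type == token.NAME}


static_src = "f = lambda x: helper(x) + OFFSET\n"
dynamic_src = "g = lambda x: globals()[x]\n"

static_names = names_in_source(static_src)
dynamic_names = names_in_source(dynamic_src)

print(sorted(static_names))
print(sorted(dynamic_names))
```

In the first case `helper` and `OFFSET` show up as tokens, so a filter like the one above would keep them. In the second, the only global-looking token is `globals` itself, so whatever name is passed in at call time would already have been dropped from the trimmed copy.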

mmckerns avatar mmckerns commented on August 17, 2024

As of fadbffc, this works with recurse=True.
