Comments (14)

matsjoyce commented on July 18, 2024

Replacing the last line of the file with dill.dumps([f2.__globals__]) does not result in a crash, but

from io import BytesIO
file = BytesIO()
pik = dill.Pickler(file)
pik.save_reduce(dill.dill.FunctionType, (f2.__globals__,), obj=f2)

does, and it crashes for all functions.

from dill.

mmckerns commented on July 18, 2024

@matsjoyce: thanks for the report and the test code. I think this issue is exactly the same as the one identified in issue #18. I think the solution involves passing in a "globals dict" where anything that the function doesn't require has been removed.

Interestingly, if you just run your code with python (your file name).py, then it pickles.

matsjoyce commented on July 18, 2024

Looking at the dill logging, it pickles when run as __main__ because of this branch:

#from dill.py
def save_module_dict(pickler, obj):
    if is_dill(pickler) and obj is pickler._main_module.__dict__:

When run as __main__, __globals__ is stored here

        log.info("D1: <dict%s" % str(obj.__repr__).split('dict')[-1]) # obj
        if PYTHON3:
            pickler.write(bytes('c__builtin__\n__main__\n', 'UTF-8'))
        else:
            pickler.write('c__builtin__\n__main__\n')
    elif not is_dill(pickler) and obj is _main_module.__dict__:
        log.info("D3: <dict%s" % str(obj.__repr__).split('dict')[-1]) # obj
        if PYTHON3:
            pickler.write(bytes('c__main__\n__dict__\n', 'UTF-8'))
        else:
            pickler.write('c__main__\n__dict__\n')   #XXX: works in general?
    else: 

When run as module, __globals__ is stored here

        log.info("D2: <dict%s" % str(obj.__repr__).split('dict')[-1]) # obj
        StockPickler.save_dict(pickler, obj)
    return

Which means that when the file is run as __main__, the __globals__ attribute is not actually stored, so there is no crash.

f2.__globals__ is debug.__dict__
>>> True

Maybe inject 'c%s\n__dict__\n' % obj.__module__ in somehow?
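
For reference, here is a minimal sketch of what a 'c<module>\n__dict__\n' entry does at load time. It uses only the standard pickle module and a hand-written protocol-0 payload; the choice of the math module is purely illustrative and is not part of the original report.

```python
import math
import pickle

# Hand-written protocol-0 pickle: the GLOBAL opcode ('c'), a module name,
# an attribute name, and finally STOP ('.'). This mirrors the
# 'c%s\n__dict__\n' string proposed above, with 'math' as a stand-in module.
payload = b"cmath\n__dict__\n."

restored = pickle.loads(payload)

# The unpickler imports the module and looks the attribute up, so the result
# is the live module namespace itself rather than a copy of its contents.
same_object = restored is math.__dict__
print(same_object)
```

Because the namespace is stored by reference, none of its contents have to be pickled, which would explain why the crash disappears on the __main__ path.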

mmckerns commented on July 18, 2024

Absolutely.

I think the fix, however, is to pass in a copy of __globals__, where everything that's not required to build the function has been popped out.

I think I have bits of the code needed to do that in dill.pointers and dill.detect.
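
A rough sketch of that idea, for illustration only: it does not use dill.pointers or dill.detect, and the helper names trimmed_globals and rebuild_with_trimmed_globals are hypothetical. It assumes that the names listed in the function's co_names are a good enough approximation of what the function needs; names used only by nested functions or reached indirectly would be missed.

```python
import types

def trimmed_globals(func):
    """Copy func.__globals__, keeping only names that func's code refers to."""
    referenced = set(func.__code__.co_names)
    namespace = {name: value
                 for name, value in func.__globals__.items()
                 if name in referenced}
    # Carry __builtins__ over (when present) so builtins still resolve.
    if "__builtins__" in func.__globals__:
        namespace["__builtins__"] = func.__globals__["__builtins__"]
    return namespace

def rebuild_with_trimmed_globals(func):
    """Build a new function object on top of the reduced namespace."""
    return types.FunctionType(func.__code__, trimmed_globals(func),
                              func.__name__, func.__defaults__,
                              func.__closure__)

SCALE = 3
UNUSED = object()  # never referenced by triple, so it should be dropped

def triple(x):
    return SCALE * x

kept = trimmed_globals(triple)
clone = rebuild_with_trimmed_globals(triple)
```

One trade-off worth noting: the rebuilt function no longer shares its namespace with the module, so later changes to the module's globals are not visible to it.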

matsjoyce commented on July 18, 2024

I've got some code that seems to work:

@register(dict)
def save_module_dict(pickler, obj):
    if is_dill(pickler) and obj is pickler._main_module.__dict__:
        log.info("D1: <dict%s" % str(obj.__repr__).split('dict')[-1]) # obj
        if PYTHON3:
            pickler.write(bytes('c__builtin__\n__main__\n', 'UTF-8'))
        else:
            pickler.write('c__builtin__\n__main__\n')
    elif not is_dill(pickler) and obj is _main_module.__dict__:
        log.info("D3: <dict%s" % str(obj.__repr__).split('dict')[-1]) # obj
        if PYTHON3:
            pickler.write(bytes('c__main__\n__dict__\n', 'UTF-8'))
        else:
            pickler.write('c__main__\n__dict__\n')   #XXX: works in general?
    elif "__name__" in obj:
        try:
            module = _import_module(obj["__name__"])
        except:
            pass
        if module.__dict__ is obj:
            if PYTHON3:
                pickler.write(bytes('c%s\n__dict__\n' % obj['__name__'], 'UTF-8'))
            else:
                pickler.write('c%s\n__dict__\n' % obj['__name__'])
    else:
        log.info("D2: <dict%s" % str(obj.__repr__).split('dict')[-1]) # obj
        StockPickler.save_dict(pickler, obj)
    return

The new bit (in the middle) tries to find a module that has a __dict__ that is the obj, and if that can be found, deal with it the same way as for __main__. Haven't fully tested it yet though.
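
The lookup on its own can be sketched as follows. This is a simplified stand-in rather than the code above: it consults sys.modules instead of calling dill's _import_module, and owning_module_name is a hypothetical helper name.

```python
import json
import sys

def owning_module_name(namespace):
    """Return the name of the loaded module whose __dict__ is `namespace`.

    Returns None when the dict has no usable __name__, when no such module
    is loaded, or when the module's __dict__ is a different object.
    """
    name = namespace.get("__name__")
    if not isinstance(name, str):
        return None
    module = sys.modules.get(name)
    if module is not None and getattr(module, "__dict__", None) is namespace:
        return name
    return None

found = owning_module_name(json.__dict__)         # the real module namespace
copied = owning_module_name(dict(json.__dict__))  # equal contents, new object
```

The identity check (is) matters here: a copy of a module's namespace carries the same __name__ but must still be pickled as an ordinary dict.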

matsjoyce commented on July 18, 2024

That fix works for individual objects, but load_session fails with:

Traceback (most recent call last):
  File "debug2.py", line 1, in <module>
    from debug import *
  File "/home/matthew/Programming/C++/Python/eggs/debug.py", line 17, in <module>
    b=dill.load_session()
  File "/home/matthew/Programming/C++/Python/eggs/dill/dill.py", line 183, in load_session
    module = unpickler.load()
  File "/home/matthew/Programming/C++/Python/eggs/pickle.py", line 911, in load
    dispatch[key[0]](self)
  File "/home/matthew/Programming/C++/Python/eggs/pickle.py", line 1314, in load_build
    inst = stack[-1]
IndexError: list index out of range

Might just be a silly mistake on my side, though.

mmckerns commented on July 18, 2024

Ah… that's a good idea.

matsjoyce commented on July 18, 2024

The above exception can be fixed by checking if it is the main module.

@register(dict)
def save_module_dict(pickler, obj):
    if is_dill(pickler) and obj is pickler._main_module.__dict__:
        log.info("D1: <dict%s" % str(obj.__repr__).split('dict')[-1]) # obj
        if PYTHON3:
            pickler.write(bytes('c__builtin__\n__main__\n', 'UTF-8'))
        else:
            pickler.write('c__builtin__\n__main__\n')
    elif not is_dill(pickler) and obj is _main_module.__dict__:
        log.info("D3: <dict%s" % str(obj.__repr__).split('dict')[-1]) # obj
        if PYTHON3:
            pickler.write(bytes('c__main__\n__dict__\n', 'UTF-8'))
        else:
            pickler.write('c__main__\n__dict__\n')   #XXX: works in general?
    elif '__name__' in obj and obj != _main_module.__dict__:
        try:
            module = _import_module(obj['__name__'])
        except:
            pass
        if module.__dict__ is obj:
            if PYTHON3:
                pickler.write(bytes('c%s\n__dict__\n' % obj['__name__'], 'UTF-8'))
            else:
                pickler.write('c%s\n__dict__\n' % obj['__name__'])
    else:
        log.info("D2: <dict%s" % str(obj.__repr__).split('dict')[-1]) # obj
        StockPickler.save_dict(pickler, obj)
    return

Shall I make a PR?

mmckerns commented on July 18, 2024

There are a few small things above that would need to be addressed first:

  1. if your try throws an exception, you pass, and then module is undefined.
  2. if module.__dict__ is not obj, then nothing gets serialized.

Both (1) and (2) above should be rolled up into the elif, so that where those conditions aren't met, something else happens (such as 'D2': save_dict).

mmckerns commented on July 18, 2024

Just a suggestion:

    elif '__name__' in obj and obj is not _main_module.__dict__ \
        and obj is getattr(_import_module(obj['__name__']),'__dict__',None):
        log.info("D4: <dict%s" % str(obj.__repr__).split('dict')[-1]) # obj
        if PYTHON3:
            pickler.write(bytes('c%s\n__dict__\n' % obj['__name__'], 'UTF-8'))
        else:
            pickler.write('c%s\n__dict__\n' % obj['__name__'])

Then, yes, do a pull request.

matsjoyce commented on July 18, 2024

The problem with that is if _import_module throws. I'll cook something up.

mmckerns commented on July 18, 2024

Yep. So a possible fix would be to catch errors in _import_module and return None in that case. We could add a safe flag to _import_module if we didn't want to deal with changes elsewhere, but I think it'd be consistent in the other places it's used. (safe=True would catch errors and return None; safe=False would throw errors.)

mmckerns commented on July 18, 2024

Something like this is python2/3 agnostic:

def _import_module(import_name, safe=False):
    try:
        if '.' in import_name:
            items = import_name.split('.')
            module = '.'.join(items[:-1])
            obj = items[-1]
        else:
            return __import__(import_name)
        return getattr(__import__(module, None, None, [obj]), obj)
    except:
        if safe: return None
        raise sys.exc_info()[1]

But handle it as you like.

mmckerns commented on July 18, 2024

closed by #28
