
Comments (14)

dionhaefner commented on July 18, 2024

Thanks, I think I understand the problem now. recurse=True doesn't work, but I guess that's due to modifications pytest makes to the callable.

from dill.

mmckerns commented on July 18, 2024

Yes, there is a mostly finished PR that handles a bunch of module serialization variants. Work on it seems to have stalled a bit, though.

dionhaefner commented on July 18, 2024

Funnily enough, it works when I do this before pickling:

foo.__globals__.pop(foo.__name__)

mmckerns commented on July 18, 2024

I want to make sure I understand this correctly: running your script normally works, but running it under the control of pytest (and subprocess) throws the error above. Is that correct? If so, I'd be interested to see a run with dill.detect.trace(True).

dionhaefner commented on July 18, 2024

That's what I thought, but I've now realized this is actually a path issue.

$ python tests/dill_test.py
ok

$ cd tests
$ pytest dill_test.py
ok

$ pytest tests/dill_test.py
NOT OK

So in the last case, dill.load tries to import the dill_test module but fails because its directory is not on sys.path. Changing the load script to this fixes it:

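import os
from textwrap import dedent

# picklefile is the path of the pickle file, defined elsewhere in the test.
test_script = dedent(f"""
        import dill
        import sys
        # Make the directory containing dill_test.py importable before unpickling.
        sys.path.append({os.path.dirname(__file__)!r})
        with open({str(picklefile)!r}, "rb") as f:
            func = dill.load(f)
        func()
""")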

Is there a way to pickle a function so it can be executed even if the original module isn't available when unpickling?

mmckerns commented on July 18, 2024

Generally, dill assumes that module dependencies are installed. It does provide different approaches for tracing dependencies in the global scope, but what you might be able to do in any case is dump the module along with the function. Then you'd load the module first and the function second. Something like this is only needed for "uninstalled" modules. This is OK for saving state, but not really that good for parallel computing.

dionhaefner commented on July 18, 2024

"Generally, dill assumes that module dependencies are installed."

But why is this module a dependency in the first place? The function doesn't access any globals.

mmckerns commented on July 18, 2024

The global dict is required to create a function object.

Python 3.8.18 (default, Aug 25 2023, 04:23:37) 
[Clang 13.1.6 (clang-1316.0.21.2.5)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import types
>>> print(types.FunctionType.__doc__)
Create a function object.

  code
    a code object
  globals
    the globals dictionary
  name
    a string that overrides the name from the code object
  argdefs
    a tuple that specifies the default argument values
  closure
    a tuple that supplies the bindings for free variables
>>> 

However, dill has different settings that modify how the global dict is handled. So, you can try dill.settings['recurse'] = True, which will only pickle items in the global dict that are pointed to by the function, and otherwise stores a dummy global dict.
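To make the point about the globals dict concrete, here is a minimal sketch using only the standard library. It is not dill's implementation: the helper referenced_globals and the names SCALE, UNUSED, and triple are hypothetical, and filtering by co_names is only a rough approximation of the idea of keeping just the globals a function points to.

```python
import types

SCALE = 3
UNUSED = "never referenced by triple"


def triple(x):
    return SCALE * x


def referenced_globals(func):
    """Hypothetical helper: keep only globals whose names appear in the bytecode."""
    names = set(func.__code__.co_names)
    return {k: v for k, v in func.__globals__.items() if k in names}


# A slimmed-down globals dict that contains SCALE but not UNUSED.
slim = referenced_globals(triple)
has_scale = "SCALE" in slim
has_unused = "UNUSED" in slim

# types.FunctionType needs a code object and a globals dict; the reduced
# dict is enough here because triple only looks up SCALE.
clone = types.FunctionType(triple.__code__, slim, triple.__name__)
value = clone(2)
print(has_scale, has_unused, value)
```

The rebuilt function runs against the reduced dict, so it no longer depends on the rest of the original module namespace.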

mmckerns commented on July 18, 2024

You can often see what's going on with dill.detect.trace(True).

dionhaefner commented on July 18, 2024

Okay here goes nothing.

This is the case that works:

$ python tests/dill_test.py
┬ F1: <function foo at 0x102580040>
├┬ F2: <function _create_function at 0x102fb32e0>
│└ # F2 [34 B]
├┬ Co: <code object foo at 0x102755b00, file "/private/tmp/tests/dill_test.py", line 6>
│├┬ F2: <function _create_code at 0x102fb3370>
││└ # F2 [19 B]
│└ # Co [102 B]
├┬ D2: <dict object at 0x0102fc49c0>
│└ # D2 [25 B]
├┬ D2: <dict object at 0x0102956a00>
│└ # D2 [2 B]
├┬ D2: <dict object at 0x0102fc4b80>
│├┬ D2: <dict object at 0x0102938ac0>
││└ # D2 [2 B]
│└ # D2 [23 B]
└ # F1 [198 B]

This is the one that doesn't:

$ pytest tests/dill_test.py
┬ F2: <function foo at 0x104473be0>
└ # F2 [20 B]

So if pytest is involved, dill doesn't even try to pickle any of the function's attributes...?

mmckerns commented on July 18, 2024

Essentially, yes. "F2" is passing the function off to pickle. The key is that there's an internal function called _locate_function, and if that returns False... probably in this case because _import_module does not find the module... then it punts to pickle which gives up.

dionhaefner commented on July 18, 2024

Isn't it the other way around? According to https://github.com/uqfoundation/dill/blob/master/dill/_dill.py#L1881C12-L1881C12, dill uses the stock pickler when _locate_function returns True. But this is not what I want, since I want to dump the function object itself, not a reference to it.
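For context on what "a reference to it" means, here is a small sketch with the stock pickle module only (no dill involved). It uses json.dumps purely as a convenient importable function. The payload holds the module and function names rather than the code, so loading it re-imports the module and looks the function up by name.

```python
import json
import pickle

# The stock pickler stores an importable function by reference:
# its module name and qualified name, not its bytecode.
payload = pickle.dumps(json.dumps)
mentions_module = b"json" in payload
mentions_name = b"dumps" in payload

# Unpickling imports the module and returns the very same function object.
restored = pickle.loads(payload)
same_object = restored is json.dumps
print(mentions_module, mentions_name, same_object)
```

This is why a by-reference pickle can only be loaded where the defining module is importable.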

mmckerns commented on July 18, 2024

Yes, you are correct. I missed the not in the if statement.

dionhaefner commented on July 18, 2024

Could you imagine having a flag similar to byref for modules that forces dill to pickle the function object instead of a reference to it? I think this would get us a lot closer to what we want to achieve.
