
Comments (9)

ejguan commented on July 18, 2024

@erip

Thanks for taking interest in this issue.

> I understand the intent of this issue to effectively be a way to wrap up the try-except import and cache the module (or parent)

This would cover only part of the requirement.

I think the ideal situation would be to import the module/submodule into the global namespace lazily, so that the actual module is imported only when methods from that module or submodule are invoked. That way, we don't need to assign the module to self on each DataPipe instance just to expose it to the __iter__ function.

Current:

class XXXDataPipe:
    def __init__(self, ...):
        try:
            import abc
            self._abc = abc
        except ImportError as e:
            raise ModuleNotFoundError("abc is not installed") from e

    def __iter__(self):
        self._abc

Ideally:

class XXXDataPipe:
    def __init__(self, ...):
        abc = lazy_import("abc")  # Inject into global namespace

    def __iter__(self):
        abc.xxx  # I am not sure if mypy is going to allow such behavior

from data.

ejguan commented on July 18, 2024

> In other words, attribute access on not-installed modules should never happen?

Yeah, you are right. The error should be raised up front. Anything else would be a bug.


erip commented on July 18, 2024

So you want to stick with the current, but wrap up the try/except logic?

class XXXDataPipe:
    def __init__(self, ...):
        self._abc = lazy_import("abc")

    def __iter__(self):
        self._abc


erip commented on July 18, 2024

I'm happy to take this up.

It seems like if you return the lazily loaded module from some function, you don't actually need to support the "as ..." form. For example, you could write np = lazy_import("numpy") wherever you'd normally do this check.

One question about the lazy loading is how lazy we want to get. I understand the intent of this issue to effectively be a way to wrap up the try-except import and cache the module (or parent). If so, I guess the logic would basically be:

# Pseudocode: module_installed, module_imported, do_import, and load_module are placeholder helpers.
def lazy_import(module, error_msg, submodule=None):
    if not module_installed(module, submodule):
        raise ModuleNotFoundError(error_msg)
    # check if module (and submodule if provided) are in the "cache"
    if not module_imported(module, submodule):
        do_import(module, submodule)
    # return the module (or submodule) to be used in code
    return load_module(module, submodule)

Is this what you had in mind, @ejguan? If so, there are probably some other logistics to address like what to do with parent modules which might be polluting the namespace, etc. If we're only worrying about modules and submodules, I think we could do this all with importlib, though submodules can be a bit hacky at times...
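The pseudocode above maps fairly directly onto importlib. Here is a minimal sketch, assuming importlib.util.find_spec stands in for the "installed" check and sys.modules for the "cache"; neither choice is confirmed by the thread, and the explicit cache check is kept only to mirror the pseudocode, since importlib.import_module already consults sys.modules.

```python
import importlib
import importlib.util
import sys


def lazy_import(module, error_msg, submodule=None):
    # Fail early if the top-level package cannot be found at all.
    if importlib.util.find_spec(module) is None:
        raise ModuleNotFoundError(error_msg)
    name = module if submodule is None else f"{module}.{submodule}"
    # sys.modules is Python's import cache; import only on a cache miss.
    if name not in sys.modules:
        importlib.import_module(name)
    # Return the module (or submodule) to be used in code.
    return sys.modules[name]


decoder = lazy_import("json", "json is required", submodule="decoder")
print(decoder.JSONDecoder)
```

Note that a missing submodule is not covered by error_msg in this sketch: importlib.import_module would raise its own ModuleNotFoundError for it.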


erip commented on July 18, 2024

I see. The failure to import would result in an immediate error, but the actual import is the lazy part. Is that right? In other words, attribute access on not-installed modules should never happen?
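One way to get "immediate error, deferred import" is a small proxy object. The sketch below is only an illustration: LazyModule is a hypothetical name, and using importlib.util.find_spec for the up-front availability check is an assumption, not something the thread settled on.

```python
import importlib
import importlib.util


class LazyModule:
    """Hypothetical proxy: checks availability now, imports on first use."""

    def __init__(self, name, error_msg=None):
        # Fail fast so a missing package is reported at construction time.
        if importlib.util.find_spec(name) is None:
            raise ModuleNotFoundError(error_msg or f"No module named {name!r}")
        self._name = name
        self._module = None

    def __getattr__(self, attr):
        # __getattr__ runs only when normal lookup fails, so it is reached
        # for attributes of the wrapped module, not for _name or _module.
        if attr in ("_name", "_module"):
            raise AttributeError(attr)
        if self._module is None:
            self._module = importlib.import_module(self._name)
        return getattr(self._module, attr)


json_mod = LazyModule("json")    # availability checked, import deferred
print(json_mod.dumps({"a": 1}))  # the real import happens on this access
```

One caveat: for dotted names, find_spec imports the parent package in order to locate the submodule, so the check is only fully lazy for top-level modules.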


erip commented on July 18, 2024

Excellent, thanks for the clarification. One idea that's kind of messy is using globals() to register the module under as_ or its fully qualified name. The issue is that it will then be a global everywhere and not just within the file. For instance, in this minimal example:

def lazy_import(name, as_=None):
    # Doesn't handle error_msg well yet
    import importlib
    mod = importlib.import_module(name)
    if as_ is not None:
        name = as_
    globals()[name] = mod

class Foo:
    def __init__(self):
        lazy_import("numpy", as_="np")

    def foo(self):
        return np.array([1,2,])

f = Foo()
print(f.foo()) # prints [1 2] as expected

Using this, I think it might introduce some collisions or a heavily polluted global namespace.
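One detail worth checking before relying on this: inside a function, globals() returns the namespace of the module where the function is defined, not the caller's. The example above works because everything lives in one file. The sketch below simulates a separate helper file with types.ModuleType (lazy_utils is a hypothetical module name) to show where the injected name ends up.

```python
import types

# Stand-in for a separate helper file (hypothetical name: lazy_utils).
helper_src = '''
import importlib

def lazy_import(name, as_=None):
    mod = importlib.import_module(name)
    globals()[as_ or name] = mod
'''
lazy_utils = types.ModuleType("lazy_utils")
exec(helper_src, lazy_utils.__dict__)

# Called from "another file": the alias lands in lazy_utils, not here.
lazy_utils.lazy_import("json", as_="js")

print("js" in vars(lazy_utils))  # True
print("js" in globals())         # False
```

So a shared helper would either pollute its own module or need the caller's namespace passed in explicitly, which supports keeping globals out of the first version.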


ejguan commented on July 18, 2024

You are right. Then let's keep it as minimal as possible and not use globals for now. If more users request it, we can easily extend it.


erip commented on July 18, 2024

I asked here and it seems like pandas has something similar to what we want. They still need to import where used so it's not terribly different, but it prevents us from needing to stuff modules into self or have globals floating around.
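A rough sketch of the "import where used" pattern described here. It is loosely modeled on the kind of helper mentioned above; the name import_optional_dependency, its signature, and the message format are assumptions for illustration rather than pandas' actual API, and CsvRowsDataPipe is a hypothetical class.

```python
import importlib


def import_optional_dependency(name, extra=""):
    """Import an optional module, raising a descriptive error if missing."""
    try:
        return importlib.import_module(name)
    except ImportError as err:
        msg = f"Missing optional dependency '{name}'. {extra}".strip()
        raise ImportError(msg) from err


class CsvRowsDataPipe:
    """Hypothetical DataPipe that needs the csv module only when iterated."""

    def __init__(self, lines):
        self.lines = lines

    def __iter__(self):
        # Imported where used: a local name, not self._csv or a global.
        csv = import_optional_dependency("csv", extra="Needed to parse rows.")
        yield from csv.reader(self.lines)


print(list(CsvRowsDataPipe(["a,b", "1,2"])))
```

One trade-off: with the import inside __iter__, a missing dependency surfaces at first iteration. Calling the helper in __init__ as well would restore the fail-fast behavior discussed earlier in the thread.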


ejguan commented on July 18, 2024

> I asked here and it seems like pandas has something similar to what we want. They still need to import where used so it's not terribly different, but it prevents us from needing to stuff modules into self or have globals floating around.

Sounds great.

