
Comments (2)

VivekPanyam commented on May 28, 2024

Including a requirements.lock file could resolve the issue, but this would trigger an installation step when loading the model.

As you noted, passing in requirements to create_python_neuropod or create_pytorch_neuropod lets you specify python packages required by the model. If the dependency is not already available, it will be downloaded and unpacked (not quite installed; see below).

You also mentioned that the code notes this is problematic when running multiple python models in a single process, and that it's only intended to work when using OPE.

If you're running multiple python models without OPE, things can get really tricky (even if you're not using requirements). This is because, without OPE (out-of-process execution), all models will run in the same process using the same python interpreter.

It's possible that this can cause conflicts on its own (e.g. one model changing global state that another depends on), but it gets much more complex when requirements are introduced. If one model depends on torch==1.8.0 and one depends on torch==1.9.0, both can't be loaded into the same python interpreter (in a robust way).
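
For intuition, here's a rough sketch (not Neuropod code, and the paths are made up) of why a second torch version can't cleanly replace one that's already imported in the same interpreter:

import importlib
import sys

import torch  # suppose one model already pulled in torch 1.8.0

# A second model asks for torch 1.9.0 and its unpacked copy lands on sys.path...
sys.path.insert(0, "/path/to/unpacked/torch-1.9.0")  # made-up location

# ...but importing "torch" again just returns the module cached in sys.modules,
# so the second model silently runs against 1.8.0. Purging the cache isn't a
# robust fix either, because packages with native extensions can't be safely
# reloaded in the same interpreter.
assert importlib.import_module("torch") is sys.modules["torch"]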

Additionally, conflicts between transitive native dependencies of models (e.g. different versions of Eigen used by torch and tensorflow) can cause really hard-to-debug problems.

Running with OPE solves these issues because every model runs in its own isolated environment independent of your application and independent of every other model.

So, I am wondering: is it possible to pre-install the necessary packages (like torch) into the isolated python environment before loading the model, while keeping the size of the python backend small at the same time?

Dependencies are only downloaded and unpacked the first time they're used, and they're then immediately available on subsequent model requests for the same dependency. One thing to note is that the packages aren't actually installed; the unpacked packages are added to sys.path when requested by a model. This lets us support multiple versions of packages (e.g. torch 1.8 and 1.9) in a clean way. See here for more details.
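
To make the "unpacked, not installed" distinction concrete, here's a conceptual sketch (not Neuropod's actual implementation) of what putting per-version unpacked packages on sys.path looks like:

import os
import sys

# Conceptual sketch only: each requested package version is unpacked into its own
# directory in a cache, and the directories a model asked for are prepended to
# sys.path before that model's code runs. Nothing is pip-installed into
# site-packages, so multiple versions can sit side by side in the cache.
CACHE_DIR = os.path.expanduser("~/.neuropod/pythonpackages/py38/")  # example for Python 3.8
requested = ["torch-1.8.0", "numpy-1.19.5"]  # hypothetical per-version directories

for pkg in requested:
    sys.path.insert(0, os.path.join(CACHE_DIR, pkg))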

If you control the runtime environment, you can preload dependencies during your environment build process by running a placeholder model that depends on everything you want to preload:

create_python_neuropod(
    ...
    requirements="""
    torch==1.8.0
    """,
    ...
)

And as long as you persist the cache folder (~/.neuropod/pythonpackages/py{major}{minor}/), subsequent uses of torch==1.8.0 in that environment will not go through a download/unpacking process.
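
For example, if you preload during a Docker image build, you can sanity-check that the cache was populated before the image is finalized (a small sketch using the path format above):

import os
import sys

# Sketch: confirm the per-interpreter package cache exists after preloading, so it
# can be persisted (e.g. baked into a container image or mounted as a shared volume).
cache_dir = os.path.expanduser(
    "~/.neuropod/pythonpackages/py{}{}/".format(sys.version_info.major, sys.version_info.minor)
)
if os.path.isdir(cache_dir):
    print("Preloaded packages in {}: {}".format(cache_dir, os.listdir(cache_dir)))
else:
    print("No package cache at {} (preload step did not run?)".format(cache_dir))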

Final notes

If you can ensure that all the models you run won't have conflicting dependencies (or transitive dependencies), you might be able to get away with not using OPE, but that doesn't seem particularly robust.

Also, if you want specific packages to be available to all of your models (without them specifying requirements), you can add them here:

# A lockfile of runtime requirements to bootstrap with
reqs = """
future==0.18.2
numpy=={}
six==1.15.0
testpath==0.4.4
""".format(
    # (snippet truncated here in the original comment; the argument supplies the
    # pinned numpy version for the {} placeholder above)
)

This triggers a download/unpack at runtime on the first run of any python model, but the difference here is that those packages are available to all models regardless of whether they're specified in a model's requirements.

Let me know if you have any other questions!


qiyanz commented on May 28, 2024

Hi Vivek, thanks for your response. This answered all my questions. But when I checked the requirements.lock file in my model, I could still see --index-url and --trusted-host in it, which would cause a ValueError when loading deps. I remember that I fixed them in the commit here. How can I get the neuropod package that has my fix?

