Giter Club home page Giter Club logo

Comments (18)

lethosor avatar lethosor commented on June 9, 2024

I'm already bundling eta with the fiftyone package, and that seems to be working (I haven't run a full end-to-end workflow to check for anything missing, but any issues there should be resolvable). There are a couple tweaks that I could make ETA-side to make it cleaner, though.

As for system dependencies, there isn't really a way to install arbitrary system packages when pip installing a Python package (at least as a wheel). We could look into bundling any that we really need, but I think requiring the user to install e.g. ffmpeg if they need it isn't much more of a lift than tensorflow.

from fiftyone.

brimoor avatar brimoor commented on June 9, 2024

Great, that's the goal. Overhaul ETA install to follow best practices. Nothing is off limits

from fiftyone.

brimoor avatar brimoor commented on June 9, 2024

Note that the ETA README currently says it only supports Python 2.7.X and Python 3.6.X. The reason, now a year or more old, that it didn't claim support for Python 3.7 was Tensorflow install issues. See voxel51/eta#167.

As mentioned above, if we shift to a strategy that requires the user to install TF themselves, then this becomes a non-issue. And, in any case, TF isn't required for the "lite" ETA install, so this Python version requirement should not apply to FiftyOne

from fiftyone.

brimoor avatar brimoor commented on June 9, 2024

Successfully running the examples at https://github.com/voxel51/eta/tree/develop/examples would constitute evidence that a "full" ETA install is working.

If the code in this doc https://github.com/voxel51/eta/blob/develop/docs/video_labels_guide.md runs, then a "lite" ETA install is working

from fiftyone.

brimoor avatar brimoor commented on June 9, 2024

As for how to gracefully handle providing a library that contains code that doesn't run unless a full install is performed (or some packages like TF are installed, for example) I used the pattern demonstrated below in FiftyOne:

(FiftyOne is designed to work without TF or PyTorch installed, but, if they are installed, certain additional features will be available. If the user tries to access a feature that requires an uninstalled package, they get an informative error):

fou.ensure_tf()

def ensure_tf():

from fiftyone.

lethosor avatar lethosor commented on June 9, 2024

I did some initial work in voxel51/eta@fc1f7dd (reasoning). I was unable to track down why ETA's requirements/common.txt lists the following packages, since nothing in ETA imports them directly or indirectly as far as I can tell - any insights?

  • blockdiag
  • contextlib2
  • Cython
  • funcparserlib
  • Jinja2
  • lxml
  • MarkupSafe
  • oauth2client
  • requests-toolbelt
  • scikit-learn
  • scipy
  • simplejson
  • webcolors

from fiftyone.

lethosor avatar lethosor commented on June 9, 2024

Another thing to consider here is changing the package name from ETA to avoid conflicts with the existing ETA package on PyPI. Any preference between voxel51-eta and voxel51-ETA (or something else)? This would not affect imports, but would affect (to my knowledge):

  • The package name passed to pip install
  • The package name that importlib.metadata requires
  • At a lower level, the name of the .dist-info folder where the package metadata lives on disk

Even if we only distribute ETA in our own registry, we should still rename it, because pip will search the public registry by default and find the existing ETA package. (Even if that behavior is possible to change, there is still a chance that someone will already have that ETA package installed, which could cause pip to not install ours.)

from fiftyone.

brimoor avatar brimoor commented on June 9, 2024

Looks like we've been a bit sloppy with the requirements file in ETA. Most of those are probably indirect dependencies that were erroneously added via a pip freeze or their corresponding code was deleted and now they're unneeded.

I see only two that need to stay:

  • blockdiag should stay- it installs a blockdiag executable that is used by eta/core/diagram.py
  • scipy is used in eta/core/gps.py

from fiftyone.

brimoor avatar brimoor commented on June 9, 2024

The scipy example is a case of a heavy import that's only used in one place. We might want to consider using a strategy like the one I adopted in FiftyOne to gently tell the user they need to install scipy only if they use the offending function.

This also applies to all the storage clients in eta.core.storage:

import boto3
import google.api_core.exceptions as gae
import google.cloud.storage as gcs
import google.oauth2.service_account as gos
import googleapiclient.discovery as gad
import googleapiclient.http as gah
import pysftp

They are each basically used in one place to serve their corresponding *StorageClient class. We could consider not installing these by default and instead telling the user to install them if they try to use that particular storage client

from fiftyone.

brimoor avatar brimoor commented on June 9, 2024

voxel51-eta is fwm for the ETA package name

from fiftyone.

lethosor avatar lethosor commented on June 9, 2024

scipy is used in eta/core/gps.py

Ah, I was grepping for ^(import|from), so it didn't catch that one because it wasn't at the beginning of a line. I will check again to see if there are others I missed.

I don't know that I would classify scipy as particularly heavy-weight, relative to the rest of the ETA dependencies, and it only directly depends on numpy:

$ pip install scipy
Collecting scipy
  Using cached scipy-1.4.1-cp36-cp36m-manylinux1_x86_64.whl (26.1 MB)
Collecting numpy>=1.13.3
  Using cached numpy-1.18.4-cp36-cp36m-manylinux1_x86_64.whl (20.2 MB)
Installing collected packages: numpy, scipy
Successfully installed numpy-1.18.4 scipy-1.4.1

(fiftyone-brain also requires scipy, anyway, so omitting scipy from ETA wouldn't save users any space.)

I was wondering about how vital the storage-related packages were, since they have quite a few dependencies of their own that add up. If we want to make those optional, I think extras are the best way to go. We would still need to add some sort of message if a user tries to use these features when the packages are missing, but it would be easier to resolve - pip install voxel51-eta[storage], or pip install voxel51-eta[s3] (should all of them be bundled as one storage feature or separately?).

from fiftyone.

brimoor avatar brimoor commented on June 9, 2024

FWM to leave scipy as a dependency.

I like the idea of using extras to manage dependencies:

pip install voxel51-eta           # this is "lite mode"
pip install voxel51-eta[storage]  # all storage clients
pip install voxel51-eta[ml]       # install ML dependencies, like `tensorflow/models` repo
pip install voxel51-eta[dev]      # dev dependencies

I think (https://www.python.org/dev/peps/pep-0426/#id41) [all] may work out of the box to install all extras?

I did a quick search and found this as a potential solution if we're worried about extra packages getting complicated: https://hanxiao.io/2019/11/07/A-Better-Practice-for-Managing-extras-require-Dependencies-in-Python/

An observation about the [ml] extra: that will require doing things like adding submodules and so forth, which I think would have to be done via some extra code in setup.py

from fiftyone.

lethosor avatar lethosor commented on June 9, 2024

https://www.python.org/dev/peps/pep-0426/#extras-optional-dependencies

If the standard all extra has no explicitly declared entries, then integration tools SHOULD implicitly define it as a dependency on all of the extras explicitly declared by the project.

I suspect pip respects this, at least.

As for tensorflow models, I think it would be worth seeing if https://pypi.org/project/tf-models-official/ will work. I'm not yet sure why it's so much smaller than the submodule, though. Ok, the research directory is the largest one, but it's still larger in our fork than in https://github.com/tensorflow/models/tree/master/research

from fiftyone.

lethosor avatar lethosor commented on June 9, 2024

The ETA CLI requires eta.core.storage (as does a single fallback in eta.core.models). Two options are to make the storage-related CLI commands fail at runtime or to hide them entirely if the storage module isn't available. Any preference?

from fiftyone.

brimoor avatar brimoor commented on June 9, 2024

No need to hide the methods from the CLI. In general, if a user tries to use a feature that requires a package extra, I think they should just get a runtime error

from fiftyone.

lethosor avatar lethosor commented on June 9, 2024

Ok, I will need to make it fail at runtime then (as written, the entire CLI will fail on startup if it can't import eta.core.storage)

from fiftyone.

lethosor avatar lethosor commented on June 9, 2024

If the code in this doc https://github.com/voxel51/eta/blob/develop/docs/video_labels_guide.md runs, then a "lite" ETA install is working

Looks like that code is outdated:

>>> frame_labels.add_frame_attribute(fattr1) 
AttributeError: 'VideoFrameLabels' object has no attribute 'add_frame_attribute'

Was it replaced with add_attribute? I'm not entirely sure if that's the correct fix.
(Other than that, my lite install seems to be working)

from fiftyone.

brimoor avatar brimoor commented on June 9, 2024

yes add_attribute

from fiftyone.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.