Comments (18)
I'm already bundling eta with the fiftyone package, and that seems to be working (I haven't run a full end-to-end workflow to check for anything missing, but any issues there should be resolvable). There are a couple tweaks that I could make ETA-side to make it cleaner, though.
As for system dependencies, there isn't really a way to install arbitrary system packages when pip install
ing a Python package (at least as a wheel). We could look into bundling any that we really need, but I think requiring the user to install e.g. ffmpeg if they need it isn't much more of a lift than tensorflow.
from fiftyone.
Great, that's the goal. Overhaul ETA install to follow best practices. Nothing is off limits
from fiftyone.
Note that the ETA README currently says it only supports Python 2.7.X and Python 3.6.X. The reason, now a year or more old, that it didn't claim support for Python 3.7 was Tensorflow install issues. See voxel51/eta#167.
As mentioned above, if we shift to a strategy that requires the user to install TF themselves, then this becomes a non-issue. And, in any case, TF isn't required for the "lite" ETA install, so this Python version requirement should not apply to FiftyOne
from fiftyone.
Successfully running the examples at https://github.com/voxel51/eta/tree/develop/examples would constitute evidence that a "full" ETA install is working.
If the code in this doc https://github.com/voxel51/eta/blob/develop/docs/video_labels_guide.md runs, then a "lite" ETA install is working
from fiftyone.
As for how to gracefully handle providing a library that contains code that doesn't run unless a full install is performed (or some packages like TF are installed, for example) I used the pattern demonstrated below in FiftyOne:
(FiftyOne is designed to work without TF or PyTorch installed, but, if they are installed, certain additional features will be available. If the user tries to access a feature that requires an uninstalled package, they get an informative error):
Line 30 in fd589ba
fiftyone/fiftyone/core/utils.py
Line 33 in fd589ba
from fiftyone.
I did some initial work in voxel51/eta@fc1f7dd (reasoning). I was unable to track down why ETA's requirements/common.txt lists the following packages, since nothing in ETA imports them directly or indirectly as far as I can tell - any insights?
- blockdiag
- contextlib2
- Cython
- funcparserlib
- Jinja2
- lxml
- MarkupSafe
- oauth2client
- requests-toolbelt
- scikit-learn
- scipy
- simplejson
- webcolors
from fiftyone.
Another thing to consider here is changing the package name from ETA
to avoid conflicts with the existing ETA package on PyPI. Any preference between voxel51-eta
and voxel51-ETA
(or something else)? This would not affect imports, but would affect (to my knowledge):
- The package name passed to
pip install
- The package name that
importlib.metadata
requires - At a lower level, the name of the
.dist-info
folder where the package metadata lives on disk
Even if we only distribute ETA in our own registry, we should still rename it, because pip will search the public registry by default and find the existing ETA package. (Even if that behavior is possible to change, there is still a chance that someone will already have that ETA package installed, which could cause pip to not install ours.)
from fiftyone.
Looks like we've been a bit sloppy with the requirements file in ETA. Most of those are probably indirect dependencies that were erroneously added via a pip freeze
or their corresponding code was deleted and now they're unneeded.
I see only two that need to stay:
blockdiag
should stay- it installs ablockdiag
executable that is used byeta/core/diagram.py
scipy
is used ineta/core/gps.py
from fiftyone.
The scipy
example is a case of a heavy import that's only used in one place. We might want to consider using a strategy like the one I adopted in FiftyOne to gently tell the user they need to install scipy
only if they use the offending function.
This also applies to all the storage clients in eta.core.storage
:
import boto3
import google.api_core.exceptions as gae
import google.cloud.storage as gcs
import google.oauth2.service_account as gos
import googleapiclient.discovery as gad
import googleapiclient.http as gah
import pysftp
They are each basically used in one place to serve their corresponding *StorageClient
class. We could consider not installing these by default and instead telling the user to install them if they try to use that particular storage client
from fiftyone.
voxel51-eta
is fwm for the ETA package name
from fiftyone.
scipy is used in eta/core/gps.py
Ah, I was grepping for ^(import|from)
, so it didn't catch that one because it wasn't at the beginning of a line. I will check again to see if there are others I missed.
I don't know that I would classify scipy as particularly heavy-weight, relative to the rest of the ETA dependencies, and it only directly depends on numpy:
$ pip install scipy
Collecting scipy
Using cached scipy-1.4.1-cp36-cp36m-manylinux1_x86_64.whl (26.1 MB)
Collecting numpy>=1.13.3
Using cached numpy-1.18.4-cp36-cp36m-manylinux1_x86_64.whl (20.2 MB)
Installing collected packages: numpy, scipy
Successfully installed numpy-1.18.4 scipy-1.4.1
(fiftyone-brain also requires scipy, anyway, so omitting scipy from ETA wouldn't save users any space.)
I was wondering about how vital the storage-related packages were, since they have quite a few dependencies of their own that add up. If we want to make those optional, I think extras are the best way to go. We would still need to add some sort of message if a user tries to use these features when the packages are missing, but it would be easier to resolve - pip install voxel51-eta[storage]
, or pip install voxel51-eta[s3]
(should all of them be bundled as one storage
feature or separately?).
from fiftyone.
FWM to leave scipy
as a dependency.
I like the idea of using extras
to manage dependencies:
pip install voxel51-eta # this is "lite mode"
pip install voxel51-eta[storage] # all storage clients
pip install voxel51-eta[ml] # install ML dependencies, like `tensorflow/models` repo
pip install voxel51-eta[dev] # dev dependencies
I think (https://www.python.org/dev/peps/pep-0426/#id41) [all]
may work out of the box to install all extras?
I did a quick search and found this as a potential solution if we're worried about extra packages getting complicated: https://hanxiao.io/2019/11/07/A-Better-Practice-for-Managing-extras-require-Dependencies-in-Python/
An observation about the [ml]
extra: that will require doing things like adding submodules and so forth, which I think would have to be done via some extra code in setup.py
from fiftyone.
https://www.python.org/dev/peps/pep-0426/#extras-optional-dependencies
If the standard
all
extra has no explicitly declared entries, then integration tools SHOULD implicitly define it as a dependency on all of the extras explicitly declared by the project.
I suspect pip respects this, at least.
As for tensorflow models, I think it would be worth seeing if https://pypi.org/project/tf-models-official/ will work. I'm not yet sure why it's so much smaller than the submodule, though. Ok, the research
directory is the largest one, but it's still larger in our fork than in https://github.com/tensorflow/models/tree/master/research
from fiftyone.
The ETA CLI requires eta.core.storage (as does a single fallback in eta.core.models). Two options are to make the storage-related CLI commands fail at runtime or to hide them entirely if the storage module isn't available. Any preference?
from fiftyone.
No need to hide the methods from the CLI. In general, if a user tries to use a feature that requires a package extra, I think they should just get a runtime error
from fiftyone.
Ok, I will need to make it fail at runtime then (as written, the entire CLI will fail on startup if it can't import eta.core.storage)
from fiftyone.
If the code in this doc https://github.com/voxel51/eta/blob/develop/docs/video_labels_guide.md runs, then a "lite" ETA install is working
Looks like that code is outdated:
>>> frame_labels.add_frame_attribute(fattr1)
AttributeError: 'VideoFrameLabels' object has no attribute 'add_frame_attribute'
Was it replaced with add_attribute
? I'm not entirely sure if that's the correct fix.
(Other than that, my lite install seems to be working)
from fiftyone.
yes add_attribute
from fiftyone.
Related Issues (20)
- [BUG] ConnectionResetError while processing big dataset HOT 1
- [FR] Support setting tags in custom importer class HOT 1
- [BUG] Fiftyone fails to create the metadata for large images HOT 1
- [BUG] upload dataset with 'keypoints' field to labelstudio correctly
- [BUG] CVAT fouc.import_annotations not importing video annotations
- No label colors not showing in Embeddings panel[BUG] HOT 1
- [BUG] Mongoengine: using pop() to update dict fields is dangerous and should be avoided HOT 1
- [BUG] GraphQL API Error name '_HAS_DEFAULT_FACTORY' is not defined HOT 2
- Wrong DateField value loaded from MongoDB HOT 3
- [BUG] operator set_progress error HOT 1
- fo.core.video.make_frames_dataset is sneakily considered a frame view HOT 9
- [FR] support ultralytics yolov8 models pretrained on open images v7 dataset in the model zoo
- [BUG] lancedb backend doesnt work in exfat disk HOT 1
- [BUG] Memory leak when using `add_coco_labels` for instance segmentation with coco_id_field set HOT 3
- [BUG]The bbox pixel coordinate is less than 0, causing the mask to not be displayed.
- [BUG] Errors while uploading large amounts of files to a label-studio
- How to rename the dataset to groundtruth using the frontend [BUG] HOT 1
- [INSTALL] fiftyone.core.config.FiftyOneConfigError: MongoDB could not be installed on your system. Please define a `database_uri` in your `fiftyone.core.config.FiftyOneConfig` to connect to yourown MongoDB instance or cluster HOT 3
- [BUG] SuperGradients YOLO-NAS inference not working HOT 3
- [FR] Using fiftyone for multitask classifier
Recommend Projects
-
React
A declarative, efficient, and flexible JavaScript library for building user interfaces.
-
Vue.js
🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
-
Typescript
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
-
TensorFlow
An Open Source Machine Learning Framework for Everyone
-
Django
The Web framework for perfectionists with deadlines.
-
Laravel
A PHP framework for web artisans
-
D3
Bring data to life with SVG, Canvas and HTML. 📊📈🎉
-
Recommend Topics
-
javascript
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
-
web
Some thing interesting about web. New door for the world.
-
server
A server is a program made to process requests and deliver data to clients.
-
Machine learning
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
-
Visualization
Some thing interesting about visualization, use data art
-
Game
Some thing interesting about game, make everyone happy.
Recommend Org
-
Facebook
We are working to build community through open source technology. NB: members must have two-factor auth.
-
Microsoft
Open source projects and samples from Microsoft.
-
Google
Google ❤️ Open Source for everyone.
-
Alibaba
Alibaba Open Source for everyone
-
D3
Data-Driven Documents codes.
-
Tencent
China tencent open source team.
from fiftyone.