eta's Issues

Developer samples and "User" samples

We now have a samples directory containing example code that seems to be intended for developers.

We also need to create samples for every module. Right now, I am putting such examples in the same place, but it is not clear to me that this is the right thing to do. Having examples running pipelines will make using and extending eta much easier.

Also: I do not like the word samples here. These are examples.

new requirement dill

eta.core.serial now imports dill. It should be added to requirements.txt, e.g.:

dill==0.2.7.1

Add an `eta.core.objects.BaseFrame` class to encapsulate Frame implementation

There are many types of objects that we will want to store in Frame-like classes. We should have a BaseFrame class that defines all the common functionality and then subclasses like DetectedFrame, EmbeddedFrame, TrackedFrame, etc. that are thin-wrappers over BaseFrame that specify what type of objects are in the list.
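A minimal sketch of the proposed hierarchy, with illustrative class and attribute names (not the actual eta API):

```python
class BaseFrame:
    """Container for a list of objects associated with a video frame."""

    def __init__(self, frame_number, objects=None):
        self.frame_number = frame_number
        self.objects = list(objects or [])

    def add(self, obj):
        """Add an object to this frame."""
        self.objects.append(obj)

    def count(self):
        """Return the number of objects in this frame."""
        return len(self.objects)


class DetectedFrame(BaseFrame):
    """Thin wrapper whose objects are detections."""


class TrackedFrame(BaseFrame):
    """Thin wrapper whose objects are tracked objects."""
```

The subclasses carry no logic of their own; they exist only to declare what kind of objects live in the list.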

Support `eta run --dry-run` flag

Request to add pipeline support for a dry-run mode that removes all configs and output files, for the case where the user only wants to see the stdout.

Make eta.core.utils.parse_dir_* methods into builder methods of DataFileSequence

Methods like eta.core.utils.parse_dir_pattern and eta.core.utils.parse_bounds_from_dir_pattern should be converted into builder methods of eta.core.data.DataFileSequence, which should be our one-stop shop for all file-sequence-related operations.

(I like eta.core.data.DataFileSequence --- this idea has been sorely missing)
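One possible shape for the builder interface, sketched with illustrative names; the real eta.core.data.DataFileSequence internals are assumed here, not copied:

```python
import re


class DataFileSequence:
    """Illustrative stand-in for eta.core.data.DataFileSequence."""

    def __init__(self, pattern, bounds):
        self.pattern = pattern  # printf-style, e.g. "frame-%05d.png"
        self.bounds = bounds    # (first, last) inclusive indices

    @classmethod
    def from_files(cls, paths):
        """Builder: infer a zero-padded numeric pattern and bounds
        from a list of filenames (the parse_dir_* role)."""
        indices = []
        template = None
        for p in paths:
            m = re.search(r"(\d+)", p)
            indices.append(int(m.group(1)))
            width = len(m.group(1))
            template = p[:m.start(1)] + "%%0%dd" % width + p[m.end(1):]
        return cls(template, (min(indices), max(indices)))

    def gen_path(self, idx):
        """Render the path for frame index `idx`."""
        return self.pattern % idx
```

With builders like this, callers never touch the parsing helpers directly; the sequence object is the one-stop shop.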

VGG16Featurizer should force user to call start() and stop()

Currently if we use eta.core.vgg16.VGG16Featurizer without explicitly calling start() and stop(), it will silently load and destroy a huge CNN every time featurize() is called. This is never what the user really wants.

I can see why Featurizer allows this to silently happen (setup/tear-down could be cheap), but VGG16Featurizer should raise an error here.

The other option is to set keep_alive=True, but then the naive user would be carrying around a CNN in memory, which also deserves an error.
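The proposed guard can be sketched as follows, assuming a Featurizer-style class with start()/stop() hooks; the names are illustrative, not eta's actual API:

```python
class FeaturizerNotStartedError(Exception):
    """Raised when featurize() is called before start()."""


class ExpensiveFeaturizer:
    """Featurizer whose setup is too costly to do implicitly per call."""

    def __init__(self):
        self._model = None

    def start(self):
        self._model = object()  # stands in for loading a large CNN

    def stop(self):
        self._model = None

    def featurize(self, data):
        if self._model is None:
            raise FeaturizerNotStartedError(
                "Call start() before featurize(); implicitly loading and "
                "destroying the model on every call is never intended")
        return data  # a real implementation would run the model here
```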

Make eta.core.video.FramesRanges more general

The need to pass around sets of numbers like [1, 5, 6, 7, 10] or "1,5-7,10" is pretty general. We should upgrade the eta.core.video.FramesRanges class to provide this general functionality.

It should accept strings (including "*") and lists (including [])

eta.core.config.Config should also understand how to accept fields of this new type.
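A hedged sketch of the general parser, assuming "*", "", and [] all mean "all frames"; the function name is illustrative:

```python
def parse_ranges(value):
    """Parse "1,5-7,10", a list of ints, "*", "", or [] into a sorted
    frame list. Returns None for the "all frames" case."""
    if value == "*" or value == "" or value == []:
        return None  # interpret as "all frames"
    if isinstance(value, list):
        return sorted(set(int(v) for v in value))
    frames = set()
    for part in value.split(","):
        if "-" in part:
            lo, hi = part.split("-")
            frames.update(range(int(lo), int(hi) + 1))
        else:
            frames.add(int(part))
    return sorted(frames)
```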

`Serializable` needs to be reflective

We need everything in eta that is written via json to be reflective. This would enhance and simplify overall functionality.

I also think we should deprecate from_json and write_json to just read and write.

pipelines: need a way to have global config settings inherited by individual modules

Suppose I have a pipeline with a dozen modules that all require a "frames" setting because they are all working with the same video. It would be far easier to set this once in the top-level config and have it inherited. And there is less room for error.

This would be harder if there are multiple videos, but it would leave even less room for error.

(This is a thought I had while working with the pipeline bits. Up for discussion, of course, but wanted to get it down.)
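The inheritance could work like this, sketched with an assumed config layout (the "global_settings"/"modules"/"parameters" keys are illustrative, not eta's actual schema): top-level settings are merged into each module's config, with module-level values winning.

```python
def apply_global_settings(pipeline_config):
    """Merge top-level settings into each module config, letting any
    module-level value override the inherited one."""
    globals_ = pipeline_config.get("global_settings", {})
    for module in pipeline_config.get("modules", []):
        params = module.setdefault("parameters", {})
        for key, value in globals_.items():
            params.setdefault(key, value)
    return pipeline_config
```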

Need ability to assign names to modules in pipelines

This will allow us to, for example, write a pipeline with multiple instances of the same module in it in different places.

These "custom" names would be used when setting parameters and defining the module connections in the pipeline metadata file.
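For illustration, a hypothetical metadata fragment with two instances of the same module under custom names; the schema shown here is assumed, not eta's actual pipeline metadata format:

```python
# Two instances of a resize_videos module, distinguished by custom names
# that the connections refer to
pipeline = {
    "modules": {
        "resize_small": {
            "module": "resize_videos",
            "parameters": {"size": [320, 240]},
        },
        "resize_large": {
            "module": "resize_videos",
            "parameters": {"size": [1280, 720]},
        },
    },
    "connections": [
        {"source": "INPUT", "sink": "resize_small"},
        {"source": "resize_small", "sink": "resize_large"},
    ],
}
```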

Fresh install does not install tensorflow --- NO WAIT, sudo bash or bash...

Now that vgg is in the repo, we should have the install scripts install tensorflow.
On my mac, I got this after running the install script and then running embed_image.

jcorso@newbury-2 /voxel51/w/eta
$ cd examples/embed_vgg16
/voxel51/w/eta/examples/embed_vgg16
jcorso@newbury-2 /voxel51/w/eta/examples/embed_vgg16
$ python embed_image.py 
Traceback (most recent call last):
  File "embed_image.py", line 20, in <module>
    import tensorflow as tf
ImportError: No module named tensorflow

Ah, after digging a bit deeper, this is actually a problem with the install script. It got up to the Python install bits, but then quit (without a message) because they failed. My suspicion is that those bits were not executed as sudo, and my Python requires sudo for installing for some reason that escapes me. (This is on a Mac.)

So, something needs to be changed/improved, even if it is the doc on how to run the install_externals as sudo.

Thoughts?

Add an `eta.core.config.Config.parse_enum` method to parse config fields that are enumerations

It would be useful to have an eta.core.config.Config.parse_enum() method that works like this:

class MyConfig(Config):

    def __init__(self, d):
        self.value = self.parse_enum(d, "value", Choices)

where the "enum" can be defined either as a class:

class Choices(Enum):
    A = valA
    B = valB

or a dict:

Choices = {
    "A": valA,
    "B": valB
}

A common pattern will be to use this mechanism when the user needs to choose among several classes or functions to use.

Serializable needs a write_json method

Need to add a Serializable.write_json method. We really shouldn't be calling serial.write_json directly; data I/O to disk should almost always go through a "data class" that implements Serializable.

Sample Data Should Be A Separate Download

We have been putting the sample data into the repository, but this will quickly bloat the repository if we add any sizable amount, making it hard to work with. We need to establish a separate data dump that can be fetched when the user wants to run the examples, etc.

Installs: virtualenv and cross-platform

Probably not best practice to rely on system-wide installs.

Also: the mac parts rely on brew. Some of us use port (macports) instead of brew. How to reconcile? (Virtualenv?)

Should modules be included as a package in eta?

Currently modules is just a set of executable Python code that uses the eta codebase. It is not a package (it has no "__init__.py" file), but it lives inside eta within the repo. I'd suggest either moving it outside of the eta directory or turning it into a package.

Is there a fundamental reason why we would not want to allow modules to import other modules? It would not be possible just to "import modulename", because the actual code may be executing somewhere else.

Functionality to query/list available modules and pipelines on the path

A new-to-ETA developer will want to get acquainted with the available functionality out of the box. A seasoned-ETA developer will want to learn what new modules or pipelines may have been added recently. A pipeline developer will need to list available modules.

ETA needs an apt-cache-like functionality to navigate the module and pipeline space.

Formalize the notion of conditional execution of modules

For example:

  • only resize a video if it is above a certain allowed resolution. This is currently achieved via a max_size argument of the resize_videos module, but perhaps this is a general enough need that we should provide formal support for it.
  • only resize a video if a size argument is provided; if no argument is provided, the module should be "skipped" altogether. This is currently achieved on a per-module basis in the resize_videos module by symlinking the outputs to the inputs, but perhaps this is a general enough need that we should provide formal support for it.
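One possible formalization of the two cases above, sketched with assumed names: a module declares a predicate that decides whether it runs or simply passes its inputs through.

```python
def run_module(process, should_run, inputs):
    """Run `process` on `inputs` if `should_run(inputs)` is true;
    otherwise "skip" the module by passing the inputs through
    (the formal analogue of symlinking outputs to inputs)."""
    if should_run(inputs):
        return process(inputs)
    return inputs
```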

Is a custom OpenCV build necessary or worth it?

We currently build OpenCV from source during our external installs, but it is causing us pain every time we re-install ETA on a new machine (new developers, production deployments, etc). Moreover, the only customization we currently do is setting the WITH_CUDA flag.

Should we continue building OpenCV from source, or would pip install opencv-python suffice for us?

Need ability to include/run one pipeline within another

Options:
(A) support this only at the pipeline metadata level by adding a "pipelines" field that allows access to the I/O of other pipelines; when a pipeline is built, a single pipeline config would be populated based on this information.
(B) support this at the pipeline config level by allowing pipeline configs to point to other pipeline configs.

I'm leaning towards (A).
