
icevision's Introduction


An Agnostic Computer Vision Framework




IceVision is the first agnostic computer vision framework to offer a curated collection of hundreds of high-quality pre-trained models from Torchvision, Open MMLab's MMDetection, Ultralytics' YOLOv5, Ross Wightman's EfficientDet and, soon, PyTorch Image Models. It orchestrates the end-to-end deep learning workflow, allowing you to train networks with easy-to-use, robust, high-performance libraries such as PyTorch-Lightning and Fastai.

IceVision Unique Features:

  • Data curation/cleaning with auto-fix

  • Access to an exploratory data analysis dashboard

  • Pluggable transforms for better model generalization

  • Access to hundreds of neural net models

  • Access to multiple training loop libraries

  • Multi-task training to efficiently combine object detection, segmentation, and classification models

Installation

pip install icevision[all]

For more installation options, check our docs.

Important: We currently only support Linux/MacOS.

Quick Example: How to train the Fridge Objects Dataset

[Open In Colab notebook and result screenshots]
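Since the Colab badge and screenshots do not render here, below is a condensed sketch of the kind of workflow the notebook walks through, loosely based on the IceVision getting-started tutorial. The icedata fridge helpers, the chosen model/backbone names and the hyperparameters are assumptions and may not match the current API exactly.

# Sketch only: names and signatures may differ between icevision/icedata versions.
from icevision.all import *
import icedata

# Download and parse the Fridge Objects dataset (helpers assumed from icedata)
data_dir = icedata.fridge.load_data()
parser = icedata.fridge.parser(data_dir)
train_records, valid_records = parser.parse()

# Transforms: augment for training, resize/pad for validation
train_tfms = tfms.A.Adapter([*tfms.A.aug_tfms(size=384, presize=512), tfms.A.Normalize()])
valid_tfms = tfms.A.Adapter([*tfms.A.resize_and_pad(384), tfms.A.Normalize()])
train_ds = Dataset(train_records, train_tfms)
valid_ds = Dataset(valid_records, valid_tfms)

# Pick a model type and backbone (torchvision Faster R-CNN used here as an example)
model_type = models.torchvision.faster_rcnn
backbone = model_type.backbones.resnet50_fpn(pretrained=True)
model = model_type.model(backbone=backbone, num_classes=len(parser.class_map))

# Model-specific dataloaders
train_dl = model_type.train_dl(train_ds, batch_size=8, num_workers=4, shuffle=True)
valid_dl = model_type.valid_dl(valid_ds, batch_size=8, num_workers=4, shuffle=False)

# Train with the fastai loop and track COCO mAP
metrics = [COCOMetric(metric_type=COCOMetricType.bbox)]
learn = model_type.fastai.learner(dls=[train_dl, valid_dl], model=model, metrics=metrics)
learn.fine_tune(10, 1e-4)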

Happy Learning!

If you need any assistance, feel free to:

Join our Forum

icevision's People

Contributors

2649, adamfarquhar, addono, ai-fast-track, aisensiy, alexandrebrown, boscacci, burntcarrot, dnth, drscotthawley, famosi, fcakyon, frapochetti, fstroth, hectorlop, jerbly, joowon-dm-snu, lee00286, lgvaz, matt-deboer, miwojc, nicjac, oke-aditya, paras-jain, partham16, potipot, ribenamaplesyrup, rsomani95, singhalpranav22, strickvl


icevision's Issues

Deprecate CategoryParser

This parser does not follow the structure of the other parsers and it's not very useful anyway.

Restructuring Model folder

Following the discussion on Slack and on issue #60, we arrived at a structure like the following:

-> backbones (use torchvision + custom)
-> layers (layers that are used in the backbones)
-> models
-----> model_name_folder (e.g. fasterrcnn)
----------> model.py (uses the backbones)
----------> dataloader.py (with minor edits for every model)

model.py includes train_step, validation_step, test_step.

train.py (code for training and inference with the model) would live in the examples folder.

Inheriting from rcnn to faster rcnn adds extra inter-code coupling which we might want to avoid. Let's have separate structures for rcnn, fast rcnn and faster rcnn; it would make debugging easier as well.

Also, I will raise a PR for contributing.MD and FAQs.MD (will check how to make .rst).

COCOMetric bug with transforms

Transforms that resize the image change the positions of bboxes and segmentation masks.

Currently COCOMetric will use the positions of the original records to calculate its metrics. There are three possible solutions:

  • Never scale validation images
  • Don't use pycocotools and write the metrics from scratch (good for the long term; pycocotools is causing a lot of minor issues)
  • Apply the transforms to the records passed to COCOMetric (see the sketch below)
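For the third option, the core step is rescaling the ground-truth boxes by the same ratios applied to the image. A minimal, framework-agnostic sketch of that idea (the list-of-boxes layout here is illustrative, not the record format):

def resize_bboxes(bboxes, orig_size, new_size):
    """Rescale [xmin, ymin, xmax, ymax] boxes to match a resized image."""
    orig_h, orig_w = orig_size
    new_h, new_w = new_size
    scale_x, scale_y = new_w / orig_w, new_h / orig_h
    return [
        [xmin * scale_x, ymin * scale_y, xmax * scale_x, ymax * scale_y]
        for xmin, ymin, xmax, ymax in bboxes
    ]

# Resizing 512x512 -> 256x256 halves every coordinate
print(resize_bboxes([[100, 100, 200, 300]], (512, 512), (256, 256)))
# -> [[50.0, 50.0, 100.0, 150.0]]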

transforms and iscrowd

If a transform removes an item from the image, the corresponding iscrowd entry also has to be removed.

Learner

Is a high level Learner class a good idea?

This class would behave similarly to fastai's Learner, but would differ from the Lightning workflow.

Maybe we can think of it as the high-level API for training models, while Lightning would be the mid-level one.

Trainer.fit more than once

We need the workflow to be able to do something like this:

model.freeze_to(-1)
trainer.fit(...)
model.freeze_to(0)
trainer.fit(...)

Trainer already correctly resumes training, but we need to reset the lr_scheduler.

Models and their configurations

🚀 Feature

Use PyTorch Lightning for defining and configuring the models.

  • It would also give us a standard API: define the model and use Lightning as the trainer, so the user doesn't need to edit and inherit a lot of code for backbone changes, architecture changes, num_classes, etc. They can simply edit the model (see the sketch below).
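A minimal sketch of that idea with plain PyTorch Lightning and a torchvision detection model; the class and its fields below are illustrative, not the library's API:

import torch
import pytorch_lightning as pl
from torchvision.models.detection import fasterrcnn_resnet50_fpn

class DetectionModule(pl.LightningModule):
    """Wraps a detection model so Lightning handles the training loop."""

    def __init__(self, num_classes, lr=1e-4):
        super().__init__()
        self.model = fasterrcnn_resnet50_fpn(num_classes=num_classes)
        self.lr = lr

    def training_step(self, batch, batch_idx):
        images, targets = batch
        # torchvision detection models return a dict of losses in train mode
        loss_dict = self.model(images, targets)
        return sum(loss_dict.values())

    def configure_optimizers(self):
        return torch.optim.Adam(self.model.parameters(), lr=self.lr)

Swapping the backbone or num_classes then only means editing the module; the Trainer call stays the same.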

Why torchvision uses FrozenBatchNorm?

What is FrozenBatchNorm?

Why is it used on models like FasterRCNN and MaskRCNN?

How does it impact fine-tuning? Because it's often a good idea to never freeze any batch norm layer while training (even if the other layers are frozen).
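For reference, torchvision ships FrozenBatchNorm2d, where the affine parameters and running statistics are plain buffers, so nothing is learned or updated during training. A small sketch of the difference (the comparison code is just illustrative):

import torch
from torchvision.ops import FrozenBatchNorm2d

bn = torch.nn.BatchNorm2d(16)   # affine params are trainable, running stats update in train mode
frozen = FrozenBatchNorm2d(16)  # weight/bias/running stats are buffers, never updated

print(any(p.requires_grad for p in bn.parameters()))  # True
print(list(frozen.parameters()))                      # [] -> nothing to train or update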

Integrate pytorch hub

Is your feature request related to a problem? Please describe.
Use models from pytorch hub

Describe the solution you'd like
Easily use models from hub, with minimal setup.
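A minimal sketch of what this builds on, calling torch.hub directly (the pytorch/vision entrypoint shown is standard; the pretrained kwarg name varies across torchvision versions):

import torch

# Downloads the weights on the first call, then loads from the local hub cache
model = torch.hub.load("pytorch/vision", "resnet50", pretrained=True)
model.eval()

with torch.no_grad():
    out = model(torch.randn(1, 3, 224, 224))
print(out.shape)  # torch.Size([1, 1000])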

Refactors show_pred

Currently show_pred is specific to RCNN models.

I think it should move out of visualize and into RCNNModel; I'm open to new ideas.

Lr schedule

Add an lr schedule to at least one example, person.ipynb.
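As an illustration of what the example could add, a one-cycle schedule in plain PyTorch; the model, optimizer and step counts below are placeholders:

import torch

model = torch.nn.Linear(10, 2)  # placeholder model
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3)

epochs, steps_per_epoch = 5, 100
scheduler = torch.optim.lr_scheduler.OneCycleLR(
    optimizer, max_lr=1e-2, epochs=epochs, steps_per_epoch=steps_per_epoch
)

for epoch in range(epochs):
    for step in range(steps_per_epoch):
        optimizer.step()   # normally called after loss.backward()
        scheduler.step()   # advance the schedule once per batch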

Improve the example on the wheat dataset

📓 New example

What is the task?
Object detection

Is this example for a specific model?
FasterRCNN

Is this example for a specific dataset?
wheat


Don't remove
Main issue for examples: #39

Pickle records

Option to pickle records so we don't have to parse all the data every time.

This option should be transparent to the user; we can expose it through an optional argument passed to DataParser.

A good question is always where to store this data. Do we store it relative to the current file? In /tmp? Or in a .mantisshrimp folder in the home directory?

Storing relative to the current file is always annoying when using version control; we have to explicitly keep it out of the repo.

Example

COCOParser(data, source, use_cached=True)
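A minimal sketch of the caching idea (the parse_with_cache helper and the path choice are hypothetical, not the actual API):

import pickle
from pathlib import Path

def parse_with_cache(parser, cache_file):
    """Load parsed records from a pickle if present, otherwise parse and save them."""
    cache_file = Path(cache_file)
    if cache_file.exists():
        with cache_file.open("rb") as f:
            return pickle.load(f)
    records = parser.parse()
    cache_file.parent.mkdir(parents=True, exist_ok=True)
    with cache_file.open("wb") as f:
        pickle.dump(records, f)
    return records

# e.g. cache under the home directory rather than next to the code:
# records = parse_with_cache(COCOParser(data, source), Path.home() / ".mantisshrimp" / "records.pkl")

Caching under the home directory also sidesteps the version-control problem mentioned above.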

Learn.fit multiple times

Calling fit multiple times should start where the last call ended.

Need to take care of steps in the trainer

Validation loss feeds network twice

The modifications previously made in layers.ipynb were affecting the model performance during evaluation. We need to better understand what is happening in roi_heads.forward before modifying the method.

For now it's okay to feed the model twice to get the loss and then the predictions.

Getting the validation loss by using model.train also disregards other important effects like Dropout and BatchNorm.

dataloader method on models

It's good that each model knows how to create its own dataloader, but I don't like the fact that we need to instantiate the model to have access to the dataloader.

Previously we were using staticmethod; that got removed because we could not call super.

I think it's a good idea to bring staticmethod back, and instead of calling super we can just call the appropriate function.

Rework of Item

Currently we have an Item class that handles all use cases. This introduces a lot of complexity because we have to keep checking for Nones.

Because each model is specific to a single task, we could instead use specific items for each task. Something like:

class MaskBBoxItem(Item):
    ...

Or even specific to each model, like:

class FasterRCNNItem(Item):
    ...

If we go specific to each model, we can insert item2training_sample in the class (see the sketch below).
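A rough, self-contained sketch of the model-specific variant (the fields and method are hypothetical):

from dataclasses import dataclass, field
from typing import List

@dataclass
class FasterRCNNItem:
    """Holds only what Faster R-CNN needs, so no None-checking is required."""
    image_path: str
    labels: List[int] = field(default_factory=list)
    bboxes: List[List[float]] = field(default_factory=list)  # [xmin, ymin, xmax, ymax]

    def item2training_sample(self):
        # load the image and build the (image_tensor, target_dict) pair the model expects
        ...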

Tutorials and Examples

📒 Tutorials

Tutorials are in .ipynb format, explaining each step of the process; really detailed, not production-like.

Core

Object detection

Segmentation

Keypoints

📓 Examples

Examples are in .py format, more production oriented: ready to be run with arguments from the command line and easy to integrate with wandb sweeps and the like.

Object detection

Segmentation

Keypoints


Is there a new tutorial or example you would like to add? Comment below and we'll talk about it!

Once we agree, create a Tutorial or Example request issue (use the template) and I'll edit this post with your new cool example!

Remove fastcore dependencies

The main question is: do we want to keep utils functions like L, lmap, ifnotnone and the like?

While these functions are really helpful to those who are already used to them, they raise the barrier for new contributors.

Implement layer groups

For fine-tuning, differential learning rates.

It would be good to have something like fastai's freeze_to (see the sketch below).
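A minimal sketch of both pieces in plain PyTorch, layer groups with differential learning rates plus a fastai-style freeze_to; the grouping and the helper below are hypothetical:

import torch
from torchvision.models import resnet18

model = resnet18()

# Layer groups: early backbone, late backbone, head (the stem is omitted for brevity)
groups = [
    list(model.layer1.parameters()) + list(model.layer2.parameters()),
    list(model.layer3.parameters()) + list(model.layer4.parameters()),
    list(model.fc.parameters()),
]

# Differential learning rates: lower lr for earlier groups
optimizer = torch.optim.SGD(
    [{"params": params, "lr": lr} for params, lr in zip(groups, [1e-4, 1e-3, 1e-2])]
)

def freeze_to(group_idx):
    """Freeze every group before group_idx (negative indices count from the end)."""
    cut = group_idx % len(groups)
    for i, params in enumerate(groups):
        for p in params:
            p.requires_grad = i >= cut

freeze_to(-1)  # train only the head
freeze_to(0)   # unfreeze everything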

Integrate models from pytorch hub

🚀 Feature

Is your feature request related to a problem? Please describe.
Use models from pytorch hub

Describe the solution you'd like
Use models available on hub with minimal setup

Rework of dataloader

It might be a good idea to embed the dataloader inside the model, because each dataloader is specific to a model anyway; this would also more closely follow the Lightning guidelines.

class MantisRCNN(MantisModule):
    @staticmethod
    def dataloader(dataset, **dataloader_kwargs):
        # Build the model-specific DataLoader (collate_fn, etc.) from `dataset`
        # using the regular PyTorch DataLoader kwargs, then return it
        dataloader = ...
        return dataloader

It would also be logical to bring item2training_sample inside the model

class MantisRCNN(MantisModule):
    @staticmethod
    def item2training_sample(item):
        # convert item to training sample
        ...

Images with no annotations

The torchvision model will throw an error if images with no annotations are passed to it. I think we need to preemptively remove these images (see the sketch below).

Note that a transform (like a random crop or zoom) can remove all the items from an image, potentially leaving it with no objects.

Related to this and this
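A minimal sketch of the preemptive filter (the dict-style record layout is hypothetical):

def drop_empty_records(records):
    """Keep only records that still contain at least one bounding box."""
    kept = [r for r in records if len(r.get("bboxes", [])) > 0]
    dropped = len(records) - len(kept)
    if dropped:
        print(f"Dropped {dropped} images with no annotations")
    return kept

# records = drop_empty_records(records)

The same check could run again after transforms, since cropping can empty an image that originally had boxes.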

Integrate detr

🚀 Feature

DETR is an amazing new approach to object detection just launched by Facebook. Pretrained weights are available on hub (see the sketch at the end of this issue), so naturally this issue is a bit related to #38.

Describe the solution you'd like
Let's divide this task into three separate parts:

  • Model inference
  • Train from scratch: Should be easier to implement and is supported in the original code
  • Fine tuning: Not officially tested in the original code, will be a bit harder to implement.

Additional context
No other library supports this yet, let's goooo!! 🚀 🚀 🚀
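For the inference part, a sketch that leans on torch.hub and the entrypoints published in facebookresearch/detr (assumed to be available as shown):

import torch

# Load the pretrained DETR ResNet-50 model straight from the official hub config
model = torch.hub.load("facebookresearch/detr", "detr_resnet50", pretrained=True)
model.eval()

with torch.no_grad():
    outputs = model(torch.randn(1, 3, 800, 800))
print(outputs["pred_logits"].shape, outputs["pred_boxes"].shape)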
