Rotated-Object-Detection

A novel ResNet-inspired Tiny-FPN network (<2M parameters) for rotated object detection, trained with a 5-parameter modulated rotation loss

Crux

  • Architecture: FPN with classification and regression heads (~1.9M parameters)
  • Loss Function: 5 Parameter Modulated Rotation Loss
  • Activation: Mish
  • Model Summary - reports/FPN_torchsummary.txt (reports/ also contains an alternative summary with named layers in a table)
  • Training Script - src/train.py
  • Final Model Weights - src/checkpoints/model_93_ap.pt
  • Python dependencies and versions - requirements.txt
  • Evaluation - src/main.py
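
As a rough illustration of the 5-parameter modulated rotation loss named above, here is a minimal plain-Python sketch for a single predicted/target box pair. It assumes boxes are given as (x, y, w, h, theta) with theta in degrees and a 90-degree angular period, and it uses plain L1 terms. The function name and these conventions are assumptions for illustration; the actual training loss in src/train.py may normalize coordinates, use a smooth L1 variant, or operate on batched tensors.

```python
def modulated_rotation_loss(pred, target, period=90.0):
    """Sketch of a 5-parameter modulated rotation loss for one box pair.

    pred and target are (x, y, w, h, theta) tuples, theta in degrees.
    The loss is the smaller of two L1 distances:
      * the direct distance between the two parameter vectors, and
      * the distance after swapping the target's width and height and
        shifting the angle difference by one period (90 degrees).
    The second term keeps the loss small when two parameterizations
    describe nearly the same rectangle.
    """
    x1, y1, w1, h1, t1 = pred
    x2, y2, w2, h2, t2 = target

    center = abs(x1 - x2) + abs(y1 - y2)
    dtheta = abs(t1 - t2)

    direct = center + abs(w1 - w2) + abs(h1 - h2) + dtheta
    swapped = center + abs(w1 - h2) + abs(h1 - w2) + abs(period - dtheta)
    return min(direct, swapped)


# A 4x2 box at -89 degrees is almost the same rectangle as a 2x4 box
# at -1 degree, so the modulated loss stays small.
near_boundary = modulated_rotation_loss((10, 10, 4, 2, -89.0), (10, 10, 2, 4, -1.0))
```

A plain L1 loss on the same pair would be dominated by the 88-degree angle gap, which is the boundary discontinuity the modulated form is meant to avoid.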

Method

  • The reported results use ResNet-inspired building blocks and an FPN.

  • Separate classification and regression subnets (single FC) are used.

  • The feature map from the top of the pyramid, which has the best semantic representation, is used for classification.

  • The finer feature map at the bottom of the pyramid, which has the best global representation, is used for regressing the rotated bounding box. Finer details can be found as comments in the code: src/models/detector_fpn.py

  • The whole implementation is written from scratch in PyTorch. Only the method for calculating AP from PR curves is borrowed and referenced (compute_ap in src/metrics.py).
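
For readers unfamiliar with computing AP from a precision-recall curve, the sketch below shows one common approach: all-point interpolation, as used in PASCAL VOC-style evaluation. It is not a copy of compute_ap from src/metrics.py, which may differ. The function name average_precision and the input convention (recall sorted in ascending order, precision aligned to it) are assumptions for illustration.

```python
def average_precision(recall, precision):
    """Area under a PR curve using all-point interpolation.

    recall    -- recall values sorted in ascending order
    precision -- precision values aligned with recall
    """
    # Pad the curve so it spans recall 0..1.
    mrec = [0.0] + list(recall) + [1.0]
    mpre = [0.0] + list(precision) + [0.0]

    # Build the precision envelope: walking right to left, each point
    # takes the maximum precision seen at any higher recall.
    for i in range(len(mpre) - 2, -1, -1):
        mpre[i] = max(mpre[i], mpre[i + 1])

    # Sum the rectangle areas between consecutive recall values.
    ap = 0.0
    for i in range(1, len(mrec)):
        ap += (mrec[i] - mrec[i - 1]) * mpre[i]
    return ap


example_ap = average_precision([0.5, 1.0], [1.0, 0.5])
```

In this example the first half of the recall range is covered at precision 1.0 and the second half at 0.5, so the area is 0.5 * 1.0 + 0.5 * 0.5.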

Approach

  1. Random data generator that creates images with high noise and rotated objects (shapes) in random scales and orientations. (Private)
  2. Compare reusing the same generated samples every epoch vs. generating and loading samples online
  3. Implement the modulated rotation loss and other metrics
  4. Experiment with loss functions and activations
  5. Try replacing standard convolutional layers with ORN (Oriented Response Network) layers, which use rotated filters to learn orientation (could not be integrated due to technical challenges)
  6. Improve basic model to use different heads for classification and regression
  7. Try variations by removing 512-dimensional filters as they take up the most parameters (~1M)
  8. Add feature pyramid and experiment with different building blocks and convolutional parameters (kernel size, stride in the first layer plays a big role)
  9. Streamline parameters in the building blocks and the prediction heads to be lower than 2M
  • Please find the rest of the report, with details on experiments and analysis, in reports/experiments.pdf
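
To see why 512-channel filters dominate a sub-2M parameter budget (step 7 above), it helps to count the weights of a single convolution. The sketch below does the arithmetic in plain Python. The channel widths and the 3x3 kernel are illustrative assumptions, not the repository's actual layer shapes, which are listed in reports/FPN_torchsummary.txt.

```python
def conv2d_params(in_channels, out_channels, kernel_size, bias=True):
    """Number of learnable parameters in a standard 2D convolution
    with a square kernel and no channel grouping."""
    weights = in_channels * out_channels * kernel_size * kernel_size
    return weights + (out_channels if bias else 0)


# Hypothetical layer widths, chosen only to illustrate the scaling.
narrow = conv2d_params(128, 256, 3, bias=False)  # 294,912
wide = conv2d_params(256, 512, 3, bias=False)    # 1,179,648
```

Under these assumed shapes, a single 256-to-512 convolution already costs about 1.18M weights, more than half of a 2M budget, while the 128-to-256 layer costs a quarter of that.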

Opportunities to improve

  1. Use the rest of the pyramid layers for prediction (this takes more parameters) and add better logic for selecting the best detection
  2. Integrate ORN layers to FPN
  3. Use DenseNets with compact convolution layer configurations

rotated-object-detection's Issues

ModuleNotFoundError: No module named 'src.CUDA'

Traceback (most recent call last):
  File "train.py", line 16, in <module>
    from src.CUDA.ORN.orn.functions import oraligned1d
ModuleNotFoundError: No module named 'src.CUDA'

The src.CUDA package has not been uploaded.

Could you please add it?
I am looking forward to running the ORN model script; otherwise, please let me know the source repo.
