
iudlm's Introduction

Image Understanding with Deep Learning and Mathematics (IUDLM)

This project aims to understand images using deep learning. As the name IUDLM suggests, it combines Image Understanding (object detection, localization, recognition, segmentation, understanding), Deep Learning (CNN, RNN, RL), and Mathematics (Optimization, Statistics).

Our goal is to combine deep learning and object detection. For an overview of the framework, we refer to Object Detection with Deep Learning: A Review -- Zhong-Qiu Zhao. We will reuse our own machine learning algorithms and the Cameo architecture.

We extract the core ideas from the review Recent Advances in Deep Learning for Object Detection -- Xiongwei Wu. We then summarize the design guidelines as a manual, based on that reference and our own experience. The computational model can then be built by composing blocks.

Deep Learning for Computer Vision

Layered Software Design

Object Detection Framework

The whole system is in the folder DL4CV. It consists of:

  1. Deep Learning (keras/tensorflow)
     1.1 iudlm -> dataloader
     1.2 iudlm -> IO
     1.3 iudlm -> model
     1.4 iudlm -> preprocessor
     1.5 iudlm -> utils
  2. Computer Vision (OpenCV)
     2.1 videoanalysis -> CaptureManager
     2.2 videoanalysis -> WindowManager
     2.3 ...
  3. Real-Time Application
     3.1 Cameo
     3.2 ...

Deep Learning

  1. Neural Network
  2. Probabilistic Graphical Model
  3. Solver

Image Understanding

  1. Object Detection
  2. Object Recognition
  3. Segmentation
  4. Localization

Mathematics

  1. Optimization
  2. Statistics

The pipeline of theory -- Object Detection

  1. Region proposal based (R-CNN)
  2. Regression/Classification based (YOLO)

The pipeline of implementation (Object-Oriented)

  1. Prototype
  2. Optimization
  3. Established
  4. Standard

Object-Oriented

  1. Class
  2. Abstraction

Programming Language and Libraries

  1. Python/PyCharm
  2. Tensorflow
  3. OpenCV
  4. Numpy
  5. Pandas
  6. Matplotlib
  7. Sklearn
  8. Scipy
  9. MATLAB
  10. C++

class ClassName(object):

    def __init__(self):
        # variables
        pass

    def method(self):
        # operations
        pass
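As an illustration of the two object-oriented ideas listed above (class and abstraction), here is a minimal sketch. The names `Preprocessor`, `ScalePreprocessor` and `run_pipeline` are hypothetical and are not taken from the iudlm package; images are plain nested lists to keep the example dependency-free.

```python
from abc import ABC, abstractmethod


class Preprocessor(ABC):
    """Abstract interface (hypothetical): every preprocessor transforms one image."""

    @abstractmethod
    def preprocess(self, image):
        """Return a transformed copy of `image` (a list of pixel rows)."""


class ScalePreprocessor(Preprocessor):
    """Concrete class: scales pixel values from [0, max_value] to [0, 1]."""

    def __init__(self, max_value=255.0):
        self.max_value = max_value

    def preprocess(self, image):
        return [[pixel / self.max_value for pixel in row] for row in image]


def run_pipeline(image, preprocessors):
    """Apply each preprocessor in order; callers rely only on the interface."""
    for preprocessor in preprocessors:
        image = preprocessor.preprocess(image)
    return image


scaled = run_pipeline([[0, 255], [51, 102]], [ScalePreprocessor()])
```

Code that consumes preprocessors depends only on the abstract `preprocess` method, so new steps can be added without changing `run_pipeline`.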

Day 1

Reference: Selective Search for Object Recognition -- J.R.R. Uijlings

Problem: Generating possible object locations for use in object recognition

Solution: Selective Search
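The hierarchical grouping step at the heart of Selective Search can be sketched as follows. This is a simplified sketch, not the paper's full method: it assumes the initial regions and their adjacency are already given, and it scores pairs with only the size and fill similarities (the colour and texture terms are omitted). The data layout (`box`, `size`) and all function names are assumptions made for illustration.

```python
def merge_boxes(a, b):
    """Tight bounding box (x0, y0, x1, y1) around two boxes, x1/y1 exclusive."""
    return (min(a[0], b[0]), min(a[1], b[1]), max(a[2], b[2]), max(a[3], b[3]))


def similarity(r1, r2, image_size):
    """Size + fill similarity only; colour and texture terms are left out."""
    box = merge_boxes(r1["box"], r2["box"])
    box_area = (box[2] - box[0]) * (box[3] - box[1])
    s_size = 1.0 - (r1["size"] + r2["size"]) / image_size
    s_fill = 1.0 - (box_area - r1["size"] - r2["size"]) / image_size
    return s_size + s_fill


def hierarchical_grouping(regions, neighbours, image_size):
    """Greedily merge the most similar neighbouring regions.

    regions:    dict id -> {"box": (x0, y0, x1, y1), "size": pixel count}
    neighbours: iterable of (id, id) pairs of adjacent regions
    Returns the boxes of all initial and merged regions as proposals.
    """
    regions = dict(regions)
    sims = {
        frozenset(pair): similarity(regions[pair[0]], regions[pair[1]], image_size)
        for pair in neighbours
    }
    proposals = [region["box"] for region in regions.values()]
    next_id = max(regions) + 1
    while sims:
        pair = max(sims, key=sims.get)          # most similar neighbouring pair
        i, j = tuple(pair)
        merged = {
            "box": merge_boxes(regions[i]["box"], regions[j]["box"]),
            "size": regions[i]["size"] + regions[j]["size"],
        }
        # Regions that touched i or j become neighbours of the merged region.
        touching = {k for p in sims if p & pair for k in p} - pair
        sims = {p: s for p, s in sims.items() if not (p & pair)}
        del regions[i], regions[j]
        regions[next_id] = merged
        for k in touching:
            sims[frozenset((next_id, k))] = similarity(merged, regions[k], image_size)
        proposals.append(merged["box"])
        next_id += 1
    return proposals


# Toy 4x4 image split into three regions (values chosen by hand).
toy_regions = {
    0: {"box": (0, 0, 2, 4), "size": 8},
    1: {"box": (2, 0, 4, 2), "size": 4},
    2: {"box": (2, 2, 4, 4), "size": 4},
}
toy_proposals = hierarchical_grouping(toy_regions, [(0, 1), (0, 2), (1, 2)], 16)
```

In the toy example the two small regions merge first because their joint bounding box has no gaps, and the final proposal covers the whole image.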

Day 2

Reference: Efficient Graph-Based Image Segmentation -- Pedro F. Felzenszwalb

Problem: Segmenting an image into regions

Solution: Graph-Based Image Segmentation
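A minimal pure-Python sketch of the graph-based segmentation idea follows. It assumes a small greyscale image given as nested lists, a 4-connected grid graph, and absolute intensity difference as the edge weight; the smoothing and minimum-component-size post-processing of the reference implementation are omitted. The function name `segment_graph` and the toy image are assumptions for illustration.

```python
def segment_graph(image, k):
    """Segment a greyscale image (list of rows) with threshold constant k.

    Two components are merged when the edge joining them is no heavier than
    the smaller of (internal difference + k / size) over the two components.
    Returns a grid of component labels with the same shape as the image.
    """
    height, width = len(image), len(image[0])
    parent = list(range(height * width))
    size = [1] * (height * width)
    internal = [0.0] * (height * width)   # largest edge weight inside a component

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    # Build edges between 4-connected neighbours.
    edges = []
    for y in range(height):
        for x in range(width):
            node = y * width + x
            if x + 1 < width:
                edges.append((abs(image[y][x] - image[y][x + 1]), node, node + 1))
            if y + 1 < height:
                edges.append((abs(image[y][x] - image[y + 1][x]), node, node + width))

    # Process edges in order of increasing weight.
    edges.sort()
    for weight, a, b in edges:
        root_a, root_b = find(a), find(b)
        if root_a == root_b:
            continue
        threshold = min(internal[root_a] + k / size[root_a],
                        internal[root_b] + k / size[root_b])
        if weight <= threshold:
            parent[root_b] = root_a
            size[root_a] += size[root_b]
            # Edges arrive sorted, so this weight is the largest one merged so far.
            internal[root_a] = weight

    return [[find(y * width + x) for x in range(width)] for y in range(height)]


toy_image = [
    [10, 10, 200, 200],
    [10, 12, 200, 198],
    [11, 10, 199, 200],
]
labels = segment_graph(toy_image, k=20)
```

With this toy image the dark left half and the bright right half end up as two separate components, because every edge crossing the boundary is far heavier than the merge threshold.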

Reference: Rich feature hierarchies for accurate object detection and semantic segmentation -- Ross Girshick

Framework: R-CNN: Regions with CNN features

Modules

  1. Region proposals
  2. Feature extraction
  3. Classification

Here we focus on region proposals. The other modules are already built. Once the region proposals module is finished, we can build R-CNN and its variants.
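The three modules above can be wired together roughly as in the sketch below. It is only a structural outline under stated assumptions: the class name `RCNNPipeline` and the three toy callables are hypothetical stand-ins (a real R-CNN warps each crop to a fixed size, extracts CNN features and scores them with per-class classifiers).

```python
class RCNNPipeline:
    """Chains region proposals, feature extraction and classification."""

    def __init__(self, propose_regions, extract_features, classify):
        self.propose_regions = propose_regions    # image -> list of boxes
        self.extract_features = extract_features  # crop -> feature vector
        self.classify = classify                  # features -> (label, score)

    def detect(self, image):
        detections = []
        for box in self.propose_regions(image):
            x0, y0, x1, y1 = box                  # x1/y1 are exclusive
            crop = [row[x0:x1] for row in image[y0:y1]]
            features = self.extract_features(crop)
            label, score = self.classify(features)
            detections.append((box, label, score))
        return detections


# Toy stand-ins so the sketch runs end to end (not real models).
def toy_proposals(image):
    half = len(image[0]) // 2
    return [(0, 0, half, len(image)), (half, 0, len(image[0]), len(image))]


def toy_features(crop):
    pixels = [p for row in crop for p in row]
    return [sum(pixels) / len(pixels)]            # mean intensity as the only feature


def toy_classifier(features):
    mean = features[0]
    return ("object", mean / 9) if mean > 4 else ("background", 1 - mean / 9)


pipeline = RCNNPipeline(toy_proposals, toy_features, toy_classifier)
detections = pipeline.detect([[0, 0, 9, 9], [0, 0, 9, 9]])
```

Because each stage is passed in as a callable, the region proposal module can be swapped without touching the rest of the pipeline.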

Day 3

We refer to the source code mentioned in Efficient Graph-Based Image Segmentation. We will write the prototype in Python.

Day 4

We build a rough prototype in Python.

Day 5

We build the prototype using object-oriented programming.

Reference: Lifelong Machine Learning Systems: Beyond Learning Algorithms -- Daniel L. Silver

The goal is to sequentially retain learned knowledge and to selectively transfer that knowledge when learning a new task so as to develop more accurate hypotheses or policies.

Day 6

We went over the deep learning notes from CMU -- Deep Learning. They introduce deep learning with neural networks.

The paper Rich feature hierarchies for accurate object detection and semantic segmentation -- Ross Girshick proposes R-CNN (Regions with CNN features).

Day 7

Reference: Matching Networks for One Shot Learning -- Oriol Vinyals. One-shot learning consists of learning a class from a single labelled example.
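The prediction rule of a matching network can be sketched as attention over a small labelled support set: the query's label distribution is a softmax-weighted vote of the support labels, weighted by cosine similarity. The sketch below assumes embeddings are already computed and non-zero (the learned embedding networks are omitted); `matching_predict` and the toy vectors are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity of two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def matching_predict(query, support):
    """Return label -> probability for `query`.

    support: list of (embedding, label) pairs, one or more per class.
    Attention weights are a softmax over cosine similarities.
    """
    sims = [cosine(query, embedding) for embedding, _ in support]
    peak = max(sims)                              # subtract max for numerical stability
    exps = [math.exp(s - peak) for s in sims]
    total = sum(exps)
    probs = {}
    for weight, (_, label) in zip(exps, support):
        probs[label] = probs.get(label, 0.0) + weight / total
    return probs


# One labelled example per class (toy embeddings, chosen by hand).
support_set = [([1.0, 0.0], "cat"), ([0.0, 1.0], "dog")]
probs = matching_predict([0.9, 0.1], support_set)
predicted = max(probs, key=probs.get)
```

Adding a new class only requires adding one labelled example to the support set; no retraining step appears in this prediction rule.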

Day 8

We review Faster R-CNN and write a summary. We focus on two modules: the Region Proposal Network (RPN) and Region of Interest (ROI) pooling. We use keras to build a MiniVGGNet as the base network.
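Two building blocks behind the modules named above can be sketched in plain Python: generating the anchor boxes that an RPN scores, and max-pooling a region of interest to a fixed output size. This is an illustrative sketch under assumptions, not the summary's code: the function names, the convention that `ratio` means height/width, and integer ROI coordinates on the feature map are all choices made here.

```python
import math


def generate_anchors(feat_h, feat_w, stride, scales, ratios):
    """Anchors (x0, y0, x1, y1) in image pixels, centred on each feature-map cell.

    Each anchor has area scale**2 and aspect ratio `ratio` = height / width.
    """
    anchors = []
    for gy in range(feat_h):
        for gx in range(feat_w):
            cx, cy = (gx + 0.5) * stride, (gy + 0.5) * stride
            for scale in scales:
                for ratio in ratios:
                    w = scale / math.sqrt(ratio)
                    h = scale * math.sqrt(ratio)
                    anchors.append((cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2))
    return anchors


def roi_max_pool(feature_map, roi, out_h, out_w):
    """Max-pool the ROI (x0, y0, x1, y1; x1/y1 exclusive) to out_h x out_w."""
    x0, y0, x1, y1 = roi
    roi_h, roi_w = y1 - y0, x1 - x0
    pooled = []
    for i in range(out_h):
        ys = y0 + (i * roi_h) // out_h                  # floor of bin start
        ye = y0 - ((-(i + 1) * roi_h) // out_h)         # ceiling of bin end
        row = []
        for j in range(out_w):
            xs = x0 + (j * roi_w) // out_w
            xe = x0 - ((-(j + 1) * roi_w) // out_w)
            row.append(max(feature_map[y][x]
                           for y in range(ys, ye)
                           for x in range(xs, xe)))
        pooled.append(row)
    return pooled


anchors = generate_anchors(feat_h=2, feat_w=2, stride=16, scales=[32], ratios=[0.5, 1, 2])
feature_map = [
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12],
    [13, 14, 15, 16],
]
pooled = roi_max_pool(feature_map, (0, 0, 4, 4), out_h=2, out_w=2)
# pooled == [[6, 8], [14, 16]]
```

ROI pooling is what lets proposals of different sizes feed a classifier head that expects a fixed-size input.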

Reference: SCARLET-NAS: Bridging the gap Between Scalability and Fairness in Neural Architecture Search -- Xiangxiang Chu. The paper proposes an architecture search approach that bridges the gap between scalability and fairness with a linear transformation. The search can be cast as a multi-objective optimization problem. Mathematically, the optimal architecture is found by solving that optimization problem.
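To make the multi-objective view concrete, the sketch below selects the Pareto-optimal candidates when one objective is maximised and another is minimised. It is a generic illustration, not the SCARLET-NAS search procedure; the objective names (`accuracy`, `cost`) and the candidate values are invented toy numbers, not results from the paper.

```python
def dominates(a, b):
    """True if `a` is at least as good as `b` on both objectives and better on one.

    Accuracy is maximised; cost is minimised.
    """
    no_worse = a["accuracy"] >= b["accuracy"] and a["cost"] <= b["cost"]
    better = a["accuracy"] > b["accuracy"] or a["cost"] < b["cost"]
    return no_worse and better


def pareto_front(candidates):
    """Keep the candidates that no other candidate dominates."""
    return [c for c in candidates
            if not any(dominates(other, c) for other in candidates)]


# Hypothetical architectures with made-up numbers, for illustration only.
candidates = [
    {"name": "A", "accuracy": 0.70, "cost": 100},
    {"name": "B", "accuracy": 0.75, "cost": 300},
    {"name": "C", "accuracy": 0.72, "cost": 350},   # dominated by B
    {"name": "D", "accuracy": 0.68, "cost": 120},   # dominated by A
]
front = pareto_front(candidates)
front_names = [c["name"] for c in front]
```

A multi-objective search returns such a set of trade-offs rather than a single winner; picking one architecture from the front requires an extra preference, for example a cost budget.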

Day 9

We extract the core ideas from the review Recent Advances in Deep Learning for Object Detection -- Xiongwei Wu. We then summarize the design guidelines as a manual, based on that reference and our own experience. The computational model can then be built by composing blocks. See Deep Learning for Computer Vision, Layered Software Design, and Object Detection Framework above.
