Car-and-Truck-classification

Contents

  1. Introduction
  2. SIFT
  3. Faster RCNN
  4. YOLOv4
  5. RetinaNet
  6. Results

Introduction

This repository demonstrates several techniques for identifying cars and trucks: SIFT, Faster RCNN (FRCNN), YOLOv4 and RetinaNet. SIFT is the only classical method and the only one used here for image classification; the other three are object detection algorithms.

SIFT

SIFT extracts features from an image in four steps: scale-space construction, key point localization, orientation assignment, and formation of the key point descriptor. Once features have been extracted from every image in the training set, all descriptors are pooled and clustered with the K-means algorithm to obtain a set of centres. Next, each descriptor in an image is assigned to its nearest centre by Euclidean distance, and the assignments are counted to form a bag-of-features vector for that image, i.e. a histogram of how often each group of features occurs in the image. This is done for all images, and the resulting bag-of-features vectors are used to train an SVM classifier.
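The clustering and histogram steps described above can be sketched as follows. This is a minimal illustration and not the repository's code: the function names `kmeans` and `bag_of_features` are hypothetical, random 128-dimensional vectors stand in for real SIFT descriptors, and the final SVM step is left out.

```python
import numpy as np

def kmeans(descriptors, k, iters=20, seed=0):
    """Plain Lloyd's k-means; returns a (k, d) array of cluster centres."""
    rng = np.random.default_rng(seed)
    # Start from k distinct descriptors chosen at random.
    start = rng.choice(len(descriptors), size=k, replace=False)
    centres = descriptors[start].copy()
    for _ in range(iters):
        # Euclidean distance from every descriptor to every centre: shape (n, k).
        dists = np.linalg.norm(descriptors[:, None, :] - centres[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            members = descriptors[labels == j]
            if len(members):  # keep the old centre if a cluster becomes empty
                centres[j] = members.mean(axis=0)
    return centres

def bag_of_features(image_descriptors, centres):
    """Count how many of one image's descriptors fall nearest to each centre."""
    dists = np.linalg.norm(image_descriptors[:, None, :] - centres[None, :, :], axis=2)
    nearest = dists.argmin(axis=1)
    return np.bincount(nearest, minlength=len(centres))

# Stand-in data (assumption): three "images" with 30, 45 and 25 key points each.
rng = np.random.default_rng(42)
per_image = [rng.normal(size=(n, 128)) for n in (30, 45, 25)]

centres = kmeans(np.vstack(per_image), k=5)
histograms = np.array([bag_of_features(d, centres) for d in per_image])
```

Each row of `histograms` has one entry per centre and sums to the number of key points in that image. These rows are the fixed-length vectors that would be passed to the SVM, regardless of how many key points each image produced.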

Faster RCNN

Faster RCNN works in four stages: a) a region proposal stage (in Faster RCNN, a Region Proposal Network) generates candidate bounding boxes, i.e. locations of possible objects in the image; b) a feature generation stage, usually a CNN, computes features for these candidates; c) a classification layer predicts which class each candidate belongs to; and d) a regression layer refines the coordinates of each bounding box.
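As an illustration of stage d), the sketch below applies predicted offsets to a proposal box using the centre/size parameterization commonly used in the R-CNN family. The function name `apply_deltas` and the example numbers are assumptions for illustration and are not taken from this repository.

```python
import math

def apply_deltas(box, deltas):
    """Refine a box (x1, y1, x2, y2) with regression outputs (tx, ty, tw, th).

    tx and ty shift the centre by a fraction of the box width and height;
    tw and th rescale the width and height on a log scale.
    """
    x1, y1, x2, y2 = box
    w, h = x2 - x1, y2 - y1
    cx, cy = x1 + 0.5 * w, y1 + 0.5 * h

    tx, ty, tw, th = deltas
    new_cx = cx + tx * w
    new_cy = cy + ty * h
    new_w = w * math.exp(tw)
    new_h = h * math.exp(th)

    return (new_cx - 0.5 * new_w, new_cy - 0.5 * new_h,
            new_cx + 0.5 * new_w, new_cy + 0.5 * new_h)

# Hypothetical proposal and offsets: zero offsets leave the box unchanged,
# and tx = 0.1 moves a 10-pixel-wide box one pixel to the right.
proposal = (0.0, 0.0, 10.0, 20.0)
unchanged = apply_deltas(proposal, (0.0, 0.0, 0.0, 0.0))
shifted = apply_deltas(proposal, (0.1, 0.0, 0.0, 0.0))
```

Predicting relative offsets rather than absolute coordinates keeps the regression targets on a similar scale for small and large boxes.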

YOLOv4

YOLO stands for "You Only Look Once". It is an algorithm that detects and recognizes objects in a picture. YOLO treats object detection as a regression problem and outputs bounding boxes together with class probabilities for the detected objects. It uses a convolutional neural network (CNN) to detect objects in real time; the CSPDarknet53 backbone used by YOLOv4 has 53 convolutional layers. As the name suggests, the algorithm needs only a single forward pass through the network, so predictions for the entire image are made in one run, with class probabilities and bounding boxes predicted simultaneously.
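To make the "boxes and class probabilities in one pass" idea concrete, here is a minimal sketch that decodes a single network output into detections. The output layout is an assumption: a simplified `(S, S, 5 + C)` grid in the spirit of the original YOLO formulation, not the actual YOLOv4 head, which predicts on anchor boxes at several scales. The function name `decode_grid` is hypothetical.

```python
import numpy as np

def decode_grid(pred, score_threshold=0.5):
    """Turn one (S, S, 5 + C) output array into a list of detections.

    Assumed layout per cell: x, y (centre offsets inside the cell),
    w, h (relative to the image), objectness, then C class probabilities.
    """
    S = pred.shape[0]
    detections = []
    for row in range(S):
        for col in range(S):
            x, y, w, h, obj = pred[row, col, :5]
            class_probs = pred[row, col, 5:]
            cls = int(class_probs.argmax())
            score = float(obj * class_probs[cls])
            if score >= score_threshold:
                # Convert the in-cell offset to an image-relative centre.
                cx = float((col + x) / S)
                cy = float((row + y) / S)
                detections.append((cls, score, (cx, cy, float(w), float(h))))
    return detections

# Hypothetical 2x2 grid with two classes (0 = car, 1 = truck);
# only the cell at row 0, column 1 contains a confident prediction.
pred = np.zeros((2, 2, 7))
pred[0, 1] = [0.5, 0.5, 0.4, 0.3, 0.9, 0.1, 0.9]
detections = decode_grid(pred)
```

All cells are read from the same output array, so no second pass over the image is needed. A real pipeline would typically follow this with non-maximum suppression to remove overlapping boxes.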

RetinaNet

A RetinaNet model has four major components:

  a) Bottom-up pathway: the backbone network (e.g. ResNet), which computes feature maps at different scales, irrespective of the input image size or the backbone.
  b) Top-down pathway and lateral connections: the top-down pathway upsamples the spatially coarser feature maps from higher pyramid levels, and the lateral connections merge the top-down layers with the bottom-up layers of the same spatial size.
  c) Classification sub-network: predicts the probability of an object being present at each spatial location, for each anchor box and object class.
  d) Regression sub-network: regresses the offset of the bounding boxes from the anchor boxes for each ground-truth object.
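The top-down pathway and lateral connections in b) can be sketched as below. This is a simplified illustration under stated assumptions: every bottom-up map already has the same number of channels (a real feature pyramid first applies a 1x1 convolution to each), each level is exactly half the size of the previous one, and the names `upsample2x` and `top_down_merge` are hypothetical.

```python
import numpy as np

def upsample2x(fmap):
    """Nearest-neighbour 2x upsampling of a (C, H, W) feature map."""
    return fmap.repeat(2, axis=1).repeat(2, axis=2)

def top_down_merge(bottom_up_maps):
    """Merge bottom-up maps, ordered fine to coarse, into pyramid levels.

    Starting from the coarsest map, each step upsamples the current
    top-down map and adds the bottom-up map of the same spatial size.
    """
    merged = [bottom_up_maps[-1]]
    for lateral in reversed(bottom_up_maps[:-1]):
        merged.append(lateral + upsample2x(merged[-1]))
    return merged[::-1]  # back to fine-to-coarse order

# Stand-in bottom-up maps (assumption): constant values make the sums easy to follow.
c3 = np.full((4, 8, 8), 1.0)
c4 = np.full((4, 4, 4), 2.0)
c5 = np.full((4, 2, 2), 3.0)

p3, p4, p5 = top_down_merge([c3, c4, c5])
```

Here `p5` equals `c5`, `p4` holds 2 + 3 = 5 everywhere, and `p3` holds 1 + 5 = 6, which shows how information from the coarsest level reaches the finest one. The classification and regression sub-networks would then be applied to each merged level.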

Results

[Figure: results]
