Minimal Bag of Visual Words Image Classifier

Implementation of a content-based image classifier using the bag of visual words model in Python.

As the name suggests, this is only a minimal example that illustrates how such a system works in general. The code is not optimized for speed, memory consumption, or recognition performance. For a more advanced system (as of 2008), see: https://github.com/shackenberg/phow_caltech101.py

If you need state-of-the-art results for image classification, check out Keras.

The approach consists of two major steps called learning and classifying, represented in the files learn.py and classify.py.

The script learn.py generates a visual vocabulary and trains a classifier using a user-provided set of already classified images. After the learning phase, classify.py uses the generated vocabulary and the trained classifier to predict the class of any image the user passes to the script.

The learning consists of:

  1. Extracting local features of all the dataset images
  2. Generating a codebook of visual words with clustering of the features
  3. Aggregating the histograms of the visual words for each of the training images
  4. Feeding the histograms to the classifier to train a model
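
Steps 2 and 3 can be sketched as follows. This is a minimal illustration, not the code in learn.py: it assumes the local features are already available as NumPy arrays (random data stands in for SIFT descriptors here), and the function names `kmeans` and `histogram` are hypothetical.

```python
import numpy as np

def kmeans(features, k, iterations=20, seed=0):
    """Plain Lloyd's k-means; returns k cluster centers (the visual words)."""
    rng = np.random.default_rng(seed)
    # Start from k distinct feature vectors picked at random.
    start = rng.choice(len(features), size=k, replace=False)
    centers = features[start].astype(float)
    for _ in range(iterations):
        # Squared distance from every feature to every center: shape (n, k).
        dists = ((features[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            members = features[labels == j]
            if len(members):  # keep the old center if a cluster is empty
                centers[j] = members.mean(axis=0)
    return centers

def histogram(descriptors, codebook):
    """Count how often each visual word is the nearest word to a descriptor."""
    dists = ((descriptors[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
    words = dists.argmin(axis=1)
    return np.bincount(words, minlength=len(codebook))

# Synthetic stand-in for step 1: three "images" with 128-dimensional descriptors.
rng = np.random.default_rng(42)
per_image = [rng.random((n, 128)) for n in (30, 40, 50)]

codebook = kmeans(np.vstack(per_image), k=8)                         # step 2
histograms = np.array([histogram(d, codebook) for d in per_image])   # step 3
# Step 4 would pass `histograms` and the class labels to the SVM trainer.
```

Each row of `histograms` has one bin per visual word, so every image ends up with a fixed-length vector regardless of how many local features it produced.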

The classification consists of:

  1. Extracting local features of the image to be classified
  2. Aggregating the histogram of the visual words for the image using the previously generated codebook
  3. Feeding the histogram to the classifier to predict a class for the image
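
The classification steps can be sketched in the same spirit. The snippet below is an illustration under stated assumptions, not the code in classify.py: the descriptors are tiny 2-D toy vectors, the histogram is normalized to sum to 1 (an assumption; the script may use raw counts), and a nearest-neighbour lookup stands in for the SVM so that the example has no external dependencies. The names `bow_histogram` and `predict_nearest` are hypothetical.

```python
import numpy as np

def bow_histogram(descriptors, codebook):
    """Map each descriptor to its nearest visual word and count the words."""
    dists = ((descriptors[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
    words = dists.argmin(axis=1)
    counts = np.bincount(words, minlength=len(codebook)).astype(float)
    total = counts.sum()
    # Normalizing is an assumption made for this sketch.
    return counts / total if total > 0 else counts

def predict_nearest(hist, train_hists, train_labels):
    """Stand-in for the SVM: return the label of the closest training histogram."""
    dists = np.linalg.norm(train_hists - hist, axis=1)
    return train_labels[int(dists.argmin())]

# Toy codebook with two visual words and one training histogram per class.
codebook = np.array([[0.0, 0.0], [10.0, 10.0]])
train_hists = np.array([[1.0, 0.0], [0.0, 1.0]])
train_labels = ["class1", "class2"]

# Step 1 is assumed done: three descriptors of the image to be classified.
descriptors = np.array([[0.0, 1.0], [1.0, 0.0], [9.0, 9.0]])

hist = bow_histogram(descriptors, codebook)                      # step 2
predicted = predict_nearest(hist, train_hists, train_labels)     # step 3
```

Two of the three descriptors fall closest to the first visual word, so the histogram lies nearer to the `class1` training histogram.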

This code relies on:

  • SIFT features for local features
  • k-means for generation of the words via clustering
  • SVM as classifier using the LIBSVM library
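
LIBSVM's command-line tools read training data in a sparse text format, one sample per line: a label followed by `index:value` pairs with 1-based, ascending indices. As a hedged sketch of how a visual-word histogram could be written in that format (the helper name `to_libsvm_line` is hypothetical and not taken from this repository):

```python
def to_libsvm_line(label, histogram):
    """Format one sample as '<label> <index>:<value> ...' with 1-based indices."""
    parts = [str(label)]
    for index, value in enumerate(histogram, start=1):
        if value != 0:  # the sparse format allows zero entries to be omitted
            parts.append(f"{index}:{value:g}")
    return " ".join(parts)

# Two toy histograms over a three-word codebook, with class labels 1 and 2.
lines = [to_libsvm_line(1, [3, 0, 2]), to_libsvm_line(2, [0, 5, 0])]
```

Here `lines[0]` is `1 1:3 3:2`: label 1, three occurrences of word 1, none of word 2, two of word 3.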

Example use:

You train the classifier for a specific dataset with:

python learn.py -d path_to_folders_with_images

To classify images use:

python classify.py -c path_to_folders_with_images/codebook.file -m path_to_folders_with_images/trainingdata.svm.model images_you_want_to_classify

The dataset should have the following structure, where all images belonging to one class are in the same folder:

.
|-- path_to_folders_with_images
|    |-- class1
|    |-- class2
|    |-- class3
...
|    └-- classN

The folder can have any name. One example dataset would be the Caltech 101 dataset.
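
A layout like this can be turned into labelled training samples by treating each subfolder name as a class. The following is a minimal sketch of that idea, not the code used by learn.py; the function name `list_dataset` is hypothetical, and it assumes every regular file inside a class folder is an image.

```python
import os

def list_dataset(root):
    """Return (class_names, samples); samples are (image_path, label_index) pairs."""
    # Every subfolder of the dataset root is treated as one class.
    class_names = sorted(
        name for name in os.listdir(root)
        if os.path.isdir(os.path.join(root, name))
    )
    samples = []
    for label, name in enumerate(class_names):
        folder = os.path.join(root, name)
        for filename in sorted(os.listdir(folder)):
            path = os.path.join(folder, filename)
            if os.path.isfile(path):  # assumption: all files here are images
                samples.append((path, label))
    return class_names, samples
```

Calling `list_dataset("path_to_folders_with_images")` would return the sorted class folder names together with one `(path, label)` pair per file, where the label is the index of the class folder in that sorted list.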

Prerequisites:

To install the necessary libraries, run the following commands from the working directory:

# installing libsvm
wget -O libsvm.tar.gz "http://www.csie.ntu.edu.tw/~cjlin/cgi-bin/libsvm.cgi?+http://www.csie.ntu.edu.tw/~cjlin/libsvm+tar.gz"
tar -xzf libsvm.tar.gz
mkdir libsvm
cp -r libsvm-*/* libsvm/
rm -r libsvm-*/
cd libsvm
make
cp tools/grid.py ../grid.py
cd ..

# installing sift
wget http://www.cs.ubc.ca/~lowe/keypoints/siftDemoV4.zip
unzip siftDemoV4.zip
cp sift*/sift sift

Notes

If you get an "IOError: SIFT executable not found" error, try sudo apt-get install libc6-i386. sift is a 32-bit executable, and you need to install additional libraries to run it on 64-bit systems. More information and background on the misleading error message can be found on unix.stackexchange.

References:

LIBSVM:

Chih-Chung Chang and Chih-Jen Lin, "LIBSVM: a library for support vector machines," ACM Transactions on Intelligent Systems and Technology, 2:27:1--27:27, 2011. Software available at http://www.csie.ntu.edu.tw/~cjlin/libsvm

SIFT:

David G. Lowe, "Distinctive image features from scale-invariant keypoints," International Journal of Computer Vision, 60, 2 (2004), pp. 91-110.

sift.py:

Taken from http://www.janeriksolem.net/2009/02/sift-python-implementation.html

libsvm.py:

Adapted from easy.py contained in the LIBSVM package by Chih-Chung Chang and Chih-Jen Lin.
