cnn_action_prediction-master's Introduction

Action and Action Class Prediction: Project Overview

  • The aim of the project is to develop a Convolutional Neural Network for multi-output prediction.
  • The dataset consists of 3030 training and 2100 testing images containing 21 different actions under 5 action classes.
  • Because the training set is small (approximately 1940 images once the test and validation splits are held out), data augmentation is used: images are randomly transformed to artificially expand the training set.
  • The model has two parts: a feature extractor built from MobileNetV2 blocks using transfer learning, and a classifier made up of fully connected layers and an output layer with Softmax as the activation function.
  • The model achieved 81% on action prediction and 92% on action class prediction on an independent and identically distributed (IID) dataset.

Reference

Python Version: 3.6 (Google Colab)
GPU: NVIDIA Tesla K80
Packages: numpy, pandas, seaborn, matplotlib, tensorflow, keras, Image
CNN Article: A Simple CNN: Multi Image Classifier
Metric Article: F-beta Score in Keras
Data Generators: Keras data generators and how to use them

Introduction

  • The aim of this project is to develop a deep convolutional neural network (CNN) for a multi-output classifier that can identify the actions of a person from still images.
  • Before carrying out any investigation, it is clear that a custom data/image generator is required, as there is insufficient memory to load the whole dataset at once.
  • There are 3030 training and 2100 testing images containing 21 different actions under 5 action classes.
  • The analysis starts with Exploratory Data Analysis, in which images produced by the data generator are visualized.
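
The idea of a generator that loads images lazily, one batch at a time, can be sketched in plain Python as below. This is a minimal illustration and not the project's actual generator: the function name `batch_generator`, the `(path, action, action_class)` tuple layout, and the `load_image` callable are all assumptions made for the example. In a Keras workflow the same logic would typically live in a `Sequence` subclass and return stacked arrays.

```python
import random


def batch_generator(samples, batch_size, load_image, shuffle=True, seed=None):
    """Yield (images, targets) batches forever, loading images lazily.

    samples    : list of (path, action_label, action_class_label) tuples
                 (hypothetical layout, chosen for this sketch).
    load_image : hypothetical callable that reads one path and returns an image.
    """
    rng = random.Random(seed)
    order = list(range(len(samples)))
    while True:
        if shuffle:
            rng.shuffle(order)  # new sample order for every pass over the data
        for start in range(0, len(order), batch_size):
            batch = [samples[i] for i in order[start:start + batch_size]]
            # Only the images of the current batch are held in memory.
            images = [load_image(path) for path, _, _ in batch]
            # Two targets per image, one for each model output.
            targets = {
                "action": [action for _, action, _ in batch],
                "action_class": [cls for _, _, cls in batch],
            }
            yield images, targets
```

Because the targets are returned as a dictionary with one entry per output, the same batch can feed both the action and the action class predictions.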

Data Preprocessing

  • The categorical variables (actions and their action classes) are encoded using Keras.
  • The original training dataset is then split into an 80% train set and a 20% test set.
  • The train set is further split into 80% training and 20% validation data, which is used for model evaluation and improvement.
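
The encoding and the two-stage split can be sketched as follows. This is an illustration in plain Python, not the project's code: `one_hot` stands in for the Keras encoding step, and the helper names, the shuffling seed, and the rounding rule (integer division) are assumptions. With 3030 images, the two 80/20 splits leave about 1940 images for training.

```python
import random


def one_hot(labels):
    """Map string labels to one-hot vectors; returns (vectors, class_names)."""
    classes = sorted(set(labels))
    index = {name: i for i, name in enumerate(classes)}
    vectors = [
        [1 if index[label] == i else 0 for i in range(len(classes))]
        for label in labels
    ]
    return vectors, classes


def split(items, holdout_pct, seed=0):
    """Shuffle items and hold out holdout_pct percent; returns (kept, held_out)."""
    items = list(items)
    random.Random(seed).shuffle(items)
    n_holdout = len(items) * holdout_pct // 100
    return items[n_holdout:], items[:n_holdout]


# Indices stand in for the 3030 original training images.
rest, test = split(range(3030), 20)   # 80% train / 20% test
train, val = split(rest, 20)          # 80% training / 20% validation
print(len(train), len(val), len(test))  # 1940 484 606
```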

Exploratory Data Analysis

Several observations can be made from the initial exploration of these images.

  • The images are similar to those in common natural image datasets such as ImageNet, so transfer learning is an option.
  • The images have different shapes and must be transformed to a common shape.
  • Some example images are ambiguous, which may affect the final performance of the model.
  • In some images, the important information and features are toward a corner of the image, so data augmentation needs to be done carefully.

Given these observations, the images can be augmented within the data generator.
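
One cautious way to augment, given that important features may sit near a corner, is to prefer transformations that never crop the image. The sketch below uses only a horizontal flip and a small brightness shift. The function name `augment`, its parameters, and the choice of transformations are assumptions for illustration; the project's actual augmentation settings are not stated here.

```python
import numpy as np


def augment(image, rng, flip_prob=0.5, max_brightness_delta=0.1):
    """Randomly flip and brighten/darken an image without cropping it.

    image : float array of shape (height, width, channels) with values in [0, 1].
    rng   : a numpy random Generator, e.g. np.random.default_rng(seed).
    """
    out = image
    if rng.random() < flip_prob:
        out = out[:, ::-1, :]  # mirror left-right; corner content is kept
    delta = rng.uniform(-max_brightness_delta, max_brightness_delta)
    # Shift brightness and keep pixel values inside the valid range.
    return np.clip(out + delta, 0.0, 1.0)
```

Calling `augment(image, np.random.default_rng(0))` inside the generator, just after an image is loaded, would apply a fresh random transformation each time the image is drawn.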

Model Building

  • Before any more complex network is developed, a baseline model is built with the Keras functional API: a VGG-type network with two convolutional layers with 3x3 filters followed by a max pooling layer.
  • Because each image contains complex content and many features, transfer learning with a pre-trained model is used for model building.
  • MobileNetV2 is selected for feature extraction because its complexity is appropriate for the task at hand.
  • Fully connected layers are then added on top to classify the action and the action class.
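
A two-output model of this kind can be sketched with the Keras functional API as below. Only the MobileNetV2 feature extractor, the fully connected classifier, the Softmax outputs, and the 21 and 5 output sizes come from the description above. The input size of 224x224, the 256-unit hidden layer, the dropout rate, the frozen backbone, the optimizer, and the loss are assumptions chosen for the example, and inputs are assumed to be already scaled the way MobileNetV2 expects.

```python
import tensorflow as tf

NUM_ACTIONS = 21        # from the dataset description
NUM_ACTION_CLASSES = 5  # from the dataset description


def build_model(input_shape=(224, 224, 3), weights="imagenet"):
    """MobileNetV2 feature extractor with two softmax heads (illustrative sketch)."""
    base = tf.keras.applications.MobileNetV2(
        input_shape=input_shape, include_top=False, weights=weights
    )
    base.trainable = False  # assumption: keep the pre-trained weights frozen

    inputs = tf.keras.Input(shape=input_shape)
    x = base(inputs, training=False)
    x = tf.keras.layers.GlobalAveragePooling2D()(x)
    x = tf.keras.layers.Dense(256, activation="relu")(x)  # assumed layer size
    x = tf.keras.layers.Dropout(0.5)(x)                   # assumed dropout rate

    # One softmax output per prediction target.
    action = tf.keras.layers.Dense(
        NUM_ACTIONS, activation="softmax", name="action")(x)
    action_class = tf.keras.layers.Dense(
        NUM_ACTION_CLASSES, activation="softmax", name="action_class")(x)

    model = tf.keras.Model(inputs, [action, action_class])
    model.compile(
        optimizer="adam",
        loss={
            "action": "categorical_crossentropy",
            "action_class": "categorical_crossentropy",
        },
        metrics={"action": ["accuracy"], "action_class": ["accuracy"]},
    )
    return model
```

Naming the output layers `action` and `action_class` lets the losses, the metrics, and the generator's target dictionary refer to each output by name.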

Experiments and Tuning

The goal of this project is to achieve 70% accuracy in predicting both the action and the action class.

  • The baseline model was first fitted on augmented and non-augmented images for 100 epochs.
  • The results revealed an unrepresentative validation dataset: the validation set did not provide sufficient information to evaluate the model's ability to generalize.

Error Analysis
