Control Mountain-Car-v0 Gym Using Webcam

Table of Contents
  1. About The Project
  2. Preparing Dataset
  3. Model for Training
  4. Evaluation
  5. Results

About The Project

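This project controls the MountainCar-v0 environment provided by OpenAI Gym with hand gestures captured by a webcam. The gestures are classified by models pretrained on ImageNet (VGG16 and ResNet50) and trained further on a small custom dataset.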

Built With

The project is built with the following frameworks and libraries:

  • TensorFlow
  • OpenCV
  • Gym

Preparing Dataset

The mountain car environment accepts three input actions, so the dataset is divided into three classes: 0, 1 and 2. Class 0 contains the data for action 0, and likewise for classes 1 and 2.

This project uses hand gestures to send actions to the mountain car environment. Class 0 contains images of a person raising only the index finger, which triggers action 0. Class 2 contains images of a person raising both the index and middle fingers, which triggers action 2. Class 1 contains images of a person sitting still and making no gesture, which triggers action 1. In short, there are two gestures: one raised finger for action 0 and two raised fingers for action 2. The gestures are simple and accessible, and because action 1 applies no push to the car, it requires no gesture at all.
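
The mapping from gesture class to action can be sketched as follows. This is a minimal illustration that assumes the predicted class index is used directly as the environment action; the names `GESTURE_BY_CLASS` and `action_from_probabilities` are hypothetical and do not come from the project.

```python
# Hypothetical lookup describing what each class represents.
# MountainCar-v0 actions: 0 = push left, 1 = no push, 2 = push right.
GESTURE_BY_CLASS = {
    0: "index finger raised",
    1: "no gesture",
    2: "index and middle fingers raised",
}


def action_from_probabilities(probabilities):
    """Return the index of the most likely class, used here as the action."""
    if len(probabilities) != len(GESTURE_BY_CLASS):
        raise ValueError("expected one probability per gesture class")
    # argmax without external dependencies
    return max(range(len(probabilities)), key=lambda i: probabilities[i])
```

For example, a softmax output of `[0.1, 0.2, 0.7]` would select action 2 under this assumption.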

A webcam is used to take the photos for the dataset. Photos of three different people were taken in different situations, under different lighting and from different camera angles. The photos of the first two people are used for training and validation, whereas the third person's photos are used for testing.

The data is divided into training, validation and test sets. The training set contains 70% of the data, the validation set 20% and the test set the remaining 10%.
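
One way to reconcile the per-person split with the 70/20/10 proportions is sketched below. How the files were actually assigned is not stated, so this is only an assumption: one person is held out for testing and the remaining images are shuffled and divided between training and validation in a 70:20 ratio. The function name, the `(path, person)` sample layout and the seed are hypothetical.

```python
import random


def split_dataset(samples, test_person, train_share=0.7, val_share=0.2, seed=0):
    """Split (path, person) pairs into train, validation and test lists.

    All samples of `test_person` go to the test set. The rest are shuffled
    and divided between training and validation in the given ratio.
    """
    test = [s for s in samples if s[1] == test_person]
    rest = [s for s in samples if s[1] != test_person]
    random.Random(seed).shuffle(rest)  # reproducible shuffle
    n_train = round(len(rest) * train_share / (train_share + val_share))
    return rest[:n_train], rest[n_train:], test
```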

Model for Training

VGG16 and ResNet50 models pretrained on ImageNet are used for training. The base models are used as they are, with changes made only to the final layer, because this is a three-class classification problem while these models are built to handle up to 1000 classes. The output of the last layer of each model is passed through:

  • A global average pooling layer
  • A flatten layer to flatten the output to one dimension
  • A final fully connected softmax layer

Stochastic gradient descent (SGD) is used as the optimizer.
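
The head described above could be assembled in Keras roughly as follows. This is a sketch under stated assumptions, not the project's notebook code: the 224x224 input size, the learning rate of 0.01, the categorical cross-entropy loss and the function name `build_classifier` are all assumptions, and whether the base layers were frozen is not stated, so they are left trainable here.

```python
import tensorflow as tf


def build_classifier(backbone="vgg", num_classes=3,
                     input_shape=(224, 224, 3), weights="imagenet"):
    """Pretrained base without its top, followed by a small softmax head."""
    if backbone == "vgg":
        base_cls = tf.keras.applications.VGG16
    elif backbone == "resnet":
        base_cls = tf.keras.applications.ResNet50
    else:
        raise ValueError("backbone must be 'vgg' or 'resnet'")
    base = base_cls(include_top=False, weights=weights, input_shape=input_shape)

    x = tf.keras.layers.GlobalAveragePooling2D()(base.output)
    x = tf.keras.layers.Flatten()(x)  # already 1-D after pooling; kept to mirror the list
    outputs = tf.keras.layers.Dense(num_classes, activation="softmax")(x)

    model = tf.keras.Model(inputs=base.input, outputs=outputs)
    model.compile(
        optimizer=tf.keras.optimizers.SGD(learning_rate=0.01),  # assumed value
        loss="categorical_crossentropy",  # assumes one-hot labels
        metrics=["accuracy", tf.keras.metrics.Precision(), tf.keras.metrics.Recall()],
    )
    return model
```

Passing `weights=None` builds the same architecture without downloading the ImageNet weights, which is handy for a quick shape check.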

Evaluation

The following metrics were observed when evaluating the models:

  • Accuracy
  • Loss
  • Precision
  • Recall
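
For reference, accuracy, precision and recall for a three-class problem can be computed from true and predicted labels as in the sketch below. The source does not say how precision and recall were averaged across classes, so the macro average used here is an assumption, and the function name is hypothetical.

```python
def classification_metrics(y_true, y_pred, num_classes=3):
    """Return (accuracy, macro precision, macro recall) for integer labels."""
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must be non-empty and equally long")
    pairs = list(zip(y_true, y_pred))
    accuracy = sum(t == p for t, p in pairs) / len(pairs)

    precisions, recalls = [], []
    for c in range(num_classes):
        true_pos = sum(t == c and p == c for t, p in pairs)
        predicted = sum(p == c for _, p in pairs)  # samples predicted as class c
        actual = sum(t == c for t, _ in pairs)     # samples that truly are class c
        precisions.append(true_pos / predicted if predicted else 0.0)
        recalls.append(true_pos / actual if actual else 0.0)

    return accuracy, sum(precisions) / num_classes, sum(recalls) / num_classes
```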

Both ResNet50 and VGG16 were used for the task and compared on the time it takes to pass each frame through the model and produce a prediction. The task was tested by four people, whereas the dataset was collected from only three of them.
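
A per-frame timing comparison of this kind can be sketched as below. The helper name `mean_seconds_per_frame` is hypothetical, and `predict` stands in for whatever callable runs a single frame through a model; the project's own measurement code is not shown in the source.

```python
import time


def mean_seconds_per_frame(predict, frames):
    """Average wall-clock time of predict(frame) over the given frames."""
    frames = list(frames)
    if not frames:
        raise ValueError("at least one frame is required")
    start = time.perf_counter()
    for frame in frames:
        predict(frame)
    return (time.perf_counter() - start) / len(frames)
```

Dividing the VGG16 result by the ResNet50 result on the same frames gives the speed ratio between the two models.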

Results

A Jupyter notebook was created for training. Initially, the validation metrics of the trained models fluctuated and the loss kept increasing, which indicates overfitting. The following are some results of the initial training runs for 200 epochs:

[Figures: ResNet50 accuracy, ResNet50 loss, VGG16 accuracy, VGG16 loss]

To address this, a dropout layer with a dropout rate of 0.2 was added before the fully connected layer, together with a time-based decay learning rate scheduler. The issue remained, which suggests that the dataset is too small for the models to learn useful features. The following are the final results of ResNet50 training for 100 epochs:

[Figures: ResNet50 accuracy, ResNet50 loss]
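
The time-based decay schedule mentioned above commonly takes the form lr = lr0 / (1 + k * epoch). The sketch below assumes that form; the initial learning rate and decay constant used in the project are not given, so any values passed in are purely illustrative.

```python
def time_based_decay(initial_lr, decay_rate, epoch):
    """Learning rate after `epoch` epochs under time-based decay."""
    return initial_lr / (1.0 + decay_rate * epoch)
```

With an initial rate of 0.01 and a decay constant of 0.01, for instance, the rate halves after 100 epochs.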

Results for evaluation on test data:

Metric      ResNet50    VGG16
Accuracy    0.7600      0.6400
Loss        0.8652      1.4939

To test the models on the mountain car environment, a script named "reinforcement.py" was created. It takes an input argument that selects the model to use.

Usage

python reinforcement.py --model Vgg

or

python reinforcement.py --model Resnet
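
The `--model` argument shown above could be parsed along these lines. The actual contents of reinforcement.py are not shown, so this is only a sketch: the function name `parse_args`, the help text and the choice to make the argument required are assumptions.

```python
import argparse


def parse_args(argv=None):
    """Parse the command line; argv=None falls back to sys.argv."""
    parser = argparse.ArgumentParser(
        description="Control MountainCar-v0 with webcam gestures."
    )
    parser.add_argument(
        "--model",
        choices=["Vgg", "Resnet"],  # the two values used in the usage lines
        required=True,              # assumption: no default model
        help="which trained model to load",
    )
    return parser.parse_args(argv)
```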

Comparing the prediction time per frame, ResNet50 is approximately 1.3 times faster than VGG16.

Four different people tested the task, and a video recording was made of one person's session. The predictions made by the models are sometimes incorrect when the lighting or background conditions change, mainly because of the small dataset used for training. A larger dataset should help resolve this issue.
