computer-pointer-controller-using-intel-openvino

Introduction

Control your computer's mouse pointer with your gaze using the Intel OpenVINO toolkit. The project runs completely offline.

Gaze estimation depends on the output of three other models: face detection, facial landmark detection, and head pose estimation.

The gaze estimation model requires the following inputs:

  • The head pose
  • The left eye image
  • The right eye image

You have to coordinate the flow of data from the input, through the different models, and finally to the mouse controller.
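The data flow described above can be sketched as one function that is called per frame. This is only an illustration under stated assumptions, not the project's actual code: the callables detect_face, find_eyes, estimate_head_pose, estimate_gaze and move_pointer are hypothetical stand-ins for the four model wrappers and the mouse controller.

```python
from typing import Callable, Optional, Tuple


def process_frame(
    frame,
    detect_face: Callable,         # frame -> cropped face, or None if no face found
    find_eyes: Callable,           # face -> (left_eye_image, right_eye_image)
    estimate_head_pose: Callable,  # face -> (yaw, pitch, roll)
    estimate_gaze: Callable,       # (left_eye, right_eye, head_pose) -> (x, y, z)
    move_pointer: Callable,        # (x, y) -> None
) -> Optional[Tuple[float, float, float]]:
    """Run one frame through the pipeline; return the gaze vector or None."""
    face = detect_face(frame)
    if face is None:
        # Nothing to do when no face is detected in this frame.
        return None

    # The landmark and head pose models both work on the cropped face.
    left_eye, right_eye = find_eyes(face)
    head_pose = estimate_head_pose(face)

    # The gaze model needs all three inputs listed above.
    gaze = estimate_gaze(left_eye, right_eye, head_pose)

    # Only the x and y components are handed to the (hypothetical) controller.
    move_pointer(gaze[0], gaze[1])
    return gaze
```

Passing the stages in as callables keeps the sketch independent of any particular inference API, so each stage can be swapped or tested on its own.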

Project Set Up and Installation:

  • Download the OpenVINO toolkit for your system, along with all its prerequisites.

  • Clone the repository using git clone https://github.com/Dhananjayyy/computer-pointer-controller-using-intel-openvino.git

  • Create a virtual environment by running virtualenv venv in the command prompt. Make sure virtualenv is installed first (pip install virtualenv).

  • Install all the requirements from the "requirements.txt" file using pip install -r requirements.txt.

  • Download the required models from the Open Model Zoo using the commands below. Each command downloads the model in all available precisions.

python3 /opt/intel/openvino/deployment_tools/open_model_zoo/tools/downloader/downloader.py --name gaze-estimation-adas-0002

python3 /opt/intel/openvino/deployment_tools/open_model_zoo/tools/downloader/downloader.py --name face-detection-adas-binary-0001

python3 /opt/intel/openvino/deployment_tools/open_model_zoo/tools/downloader/downloader.py --name head-pose-estimation-adas-0001

python3 /opt/intel/openvino/deployment_tools/open_model_zoo/tools/downloader/downloader.py --name landmarks-regression-retail-0009
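The four commands above differ only in the model name, so they could also be generated in a loop. The sketch below only builds and prints the commands (a dry run); it assumes the downloader lives at the path used above, and the helper name download_commands is made up for this example.

```python
import shlex

# Downloader path as used in the commands above; adjust it to your install.
DOWNLOADER = (
    "/opt/intel/openvino/deployment_tools/open_model_zoo/"
    "tools/downloader/downloader.py"
)

MODELS = [
    "gaze-estimation-adas-0002",
    "face-detection-adas-binary-0001",
    "head-pose-estimation-adas-0001",
    "landmarks-regression-retail-0009",
]


def download_commands(downloader=DOWNLOADER, models=MODELS):
    """Return one argument list per model, ready for subprocess.run."""
    return [["python3", downloader, "--name", name] for name in models]


if __name__ == "__main__":
    # Dry run: print the commands instead of executing them.
    for cmd in download_commands():
        print(" ".join(shlex.quote(part) for part in cmd))
```

To actually download, each list can be passed to subprocess.run(cmd, check=True).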

How to run (Demo):

Pipeline

Method 1: (For Windows users only)

  • Double-click the script.bat file, or open cmd in the project root folder and run script.bat.

  • Prompt: Initializing OpenVINO environment

  • If OpenVINO and the project requirements are installed correctly on your system, the environment will be initialised.

  • Prompt: This project requires virtual environment, proceed to create? (Y/[N])

  • Pressing 'y' creates the virtual environment in your current directory. Press 'n' if it already exists.

  • Prompt: Download required dependancies in your virtual environment? (Y/[N])

  • This project needs certain packages installed in the virtual environment in order to run.

  • They are listed in the requirements.txt file.

  • Pressing 'y' downloads them. Pressing 'n' skips this step.

  • Prompt: Proceed to download the required models? (Y/[N])

  • This project requires four models from the model downloader: Gaze Detection Model, Face Detection Model, Head Pose Estimation Model, Facial Landmarks Detection Model

  • Press 'y' to proceed or 'n' to skip.

  • Prompt: Here's your script to run the project:

  • The generated script for running this project is displayed.

  • Prompt: Proceed to execute the generated script? (Y/[N])

  • Press 'y' to execute the script shown above and run the project.

  • If the steps above complete successfully, the project will start.

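Method 2: (Manual steps from the command line)

The commands below are written for the Windows command prompt. On Linux, the equivalents of Step1 and Step2 are source venv/bin/activate and source /opt/intel/openvino/bin/setupvars.sh.

Step1. Open a command prompt, go to the virtual environment location, and run the commands below.

cd venv/Scripts/
activate

Step2. Initialise the OpenVINO environment. (This is very important.)

cd C:\Program Files (x86)\IntelSWTools\openvino\bin\
setupvars.bat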

Step3. Go to the project directory

cd path_to_project_directory

Step4. Run the command below to execute the project on demo.mp4

python {path to main.py file} -fdm {path to the xml file of face detection model} -ldm {path to the xml file of landmark detection model} -hpm {path to the xml file of head pose estimation model} -gem {path to the xml file of gaze estimation model} -ip {path to the video file demo.mp4} -flags fdm ldm hpm gem

Documentation:

Command Line Argument Information:

  • fdm : Path to the xml file of the face detection model
  • ldm : Path to the xml file of the landmark regression model
  • hpm : Path to the xml file of the head pose estimation model
  • gem : Path to the xml file of the gaze estimation model
  • ip : Path to the input video file, or cam to use the webcam
  • flags (Optional): To preview the video in a separate window, specify one or more flags from fdm, ldm, hpm, gem
  • prob (Optional): Confidence threshold for face detection
  • d (Optional): Device to run inference on. The device can be CPU, GPU, VPU, FPGA, MYRIAD
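The arguments listed above could be declared with argparse roughly as follows. This is a sketch, not the parser in main.py: the option names come from the list above, while the defaults (0.6 for prob, CPU for d) and the choice of which options are required are assumptions made for this example.

```python
import argparse


def build_parser():
    """Build a parser mirroring the documented command line arguments."""
    parser = argparse.ArgumentParser(description="Computer pointer controller")
    parser.add_argument("-fdm", required=True,
                        help="path to the xml file of the face detection model")
    parser.add_argument("-ldm", required=True,
                        help="path to the xml file of the landmark regression model")
    parser.add_argument("-hpm", required=True,
                        help="path to the xml file of the head pose estimation model")
    parser.add_argument("-gem", required=True,
                        help="path to the xml file of the gaze estimation model")
    parser.add_argument("-ip", required=True,
                        help="path to the input video file, or 'cam' for the webcam")
    parser.add_argument("-flags", nargs="*", default=[],
                        choices=["fdm", "ldm", "hpm", "gem"],
                        help="models whose output should be previewed")
    # Assumed default; the README does not state the real one.
    parser.add_argument("-prob", type=float, default=0.6,
                        help="confidence threshold for face detection")
    # Assumed default; the README does not state the real one.
    parser.add_argument("-d", default="CPU",
                        help="inference device, e.g. CPU, GPU, FPGA, MYRIAD")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(args)
```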

Project Structure:

C:.
│   README.md
│   requirements.txt
│   script.bat
│
├───files
│       demo.mp4
│       fp16fps.png
│       fp16inf.png
│       fp16load.png
│       fp32fps.png
│       fp32inf.png
│       fp32load.png
│       int8fps.png
│       int8inf.png
│       int8load.png
│
├───models
│   └───intel
│       ├───face-detection-adas-binary-0001
│       │   └───FP32-INT1
│       │           face-detection-adas-binary-0001.bin
│       │           face-detection-adas-binary-0001.xml
│       │
│       ├───gaze-estimation-adas-0002
│       │   ├───FP16
│       │   │       gaze-estimation-adas-0002.bin
│       │   │       gaze-estimation-adas-0002.xml
│       │   │
│       │   ├───FP32
│       │   │       gaze-estimation-adas-0002.bin
│       │   │       gaze-estimation-adas-0002.xml
│       │   │
│       │   └───FP32-INT8
│       │           gaze-estimation-adas-0002.bin
│       │           gaze-estimation-adas-0002.xml
│       │
│       ├───head-pose-estimation-adas-0001
│       │   ├───FP16
│       │   │       head-pose-estimation-adas-0001.bin
│       │   │       head-pose-estimation-adas-0001.xml
│       │   │
│       │   ├───FP32
│       │   │       head-pose-estimation-adas-0001.bin
│       │   │       head-pose-estimation-adas-0001.xml
│       │   │
│       │   └───FP32-INT8
│       │           head-pose-estimation-adas-0001.bin
│       │           head-pose-estimation-adas-0001.xml
│       │
│       └───landmarks-regression-retail-0009
│           ├───FP16
│           │       landmarks-regression-retail-0009.bin
│           │       landmarks-regression-retail-0009.xml
│           │
│           ├───FP32
│           │       landmarks-regression-retail-0009.bin
│           │       landmarks-regression-retail-0009.xml
│           │
│           └───FP32-INT8
│                   landmarks-regression-retail-0009.bin
│                   landmarks-regression-retail-0009.xml
│
├───openvino_env
└───src
        face_detection.py
        facial_landmark_detection.py
        gaze_estimation.py
        head_pose_estimation.py
        input_feeder.py
        main.py
        mouse_controller.py

The root folder contains the README file, the batch file script.bat, and requirements.txt.

The files folder contains the demo video and the benchmark images.

The models folder contains the four downloaded models.

The openvino_env folder is your virtual environment.

The src folder has four modularised model class files and the other required files:

  • face_detection.py

  • gaze_estimation.py

  • facial_landmark_detection.py

  • head_pose_estimation.py

  • main.py runs the complete pipeline of the project.

  • mouse_controller.py contains code to move the mouse pointer based on mouse coordinates.

  • input_feeder.py contains code to load a local video or webcam feed.
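Given the folder layout shown above, the path to a model's xml file follows the pattern models/intel/<model name>/<precision>/<model name>.xml. The helper below is a hypothetical convenience for building such paths (it is not part of the project); the table of available precisions is copied from the directory tree.

```python
from pathlib import PurePosixPath

# Precisions present in the models folder, as shown in the tree above.
AVAILABLE = {
    "face-detection-adas-binary-0001": ("FP32-INT1",),
    "gaze-estimation-adas-0002": ("FP16", "FP32", "FP32-INT8"),
    "head-pose-estimation-adas-0001": ("FP16", "FP32", "FP32-INT8"),
    "landmarks-regression-retail-0009": ("FP16", "FP32", "FP32-INT8"),
}


def model_xml_path(name, precision, root="models/intel"):
    """Return the relative path to a model's xml file for one precision."""
    if precision not in AVAILABLE.get(name, ()):
        raise ValueError(f"{name} is not available in precision {precision}")
    return PurePosixPath(root) / name / precision / f"{name}.xml"


if __name__ == "__main__":
    print(model_xml_path("gaze-estimation-adas-0002", "FP16"))
```

Note that the face detection model is only present as FP32-INT1, so that precision has to be used for it regardless of the precision chosen for the other three models.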

Benchmarks

I measured inference time, model loading time, and frames per second (FPS) on five different hardware configurations:

  • IEI Mustang F100-A10 FPGA
  • Intel Xeon E3-1268L v5 CPU
  • Intel Atom x7-E3950 UP2 GPU
  • Intel Core i5-6500TE CPU
  • Intel Core i5-6500TE GPU
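The three metrics can be collected with simple wall-clock timing. The sketch below shows one way to do it and is not the benchmarking code used for the results here: load_model and infer are hypothetical callables standing in for loading a network on a device and running it on one frame.

```python
import time


def benchmark(load_model, infer, frames):
    """Time model loading and inference, and derive frames per second."""
    start = time.perf_counter()
    model = load_model()
    load_time = time.perf_counter() - start

    start = time.perf_counter()
    count = 0
    for frame in frames:
        infer(model, frame)
        count += 1
    inference_time = time.perf_counter() - start

    # Guard against a zero duration on very fast dummy workloads.
    fps = count / inference_time if inference_time > 0 else float("inf")
    return {
        "load_time_s": load_time,
        "inference_time_s": inference_time,
        "fps": fps,
        "frames": count,
    }


if __name__ == "__main__":
    # Dummy workload, just to show the call shape.
    print(benchmark(lambda: "model", lambda m, f: sum(range(1000)), range(50)))
```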

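INT8

  • Inference Time: files/int8inf.png
  • Loading Time: files/int8load.png
  • FPS: files/int8fps.png

FP16

  • Inference Time: files/fp16inf.png
  • Loading Time: files/fp16load.png
  • FPS: files/fp16fps.png

FP32

  • Inference Time: files/fp32inf.png
  • Loading Time: files/fp32load.png
  • FPS: files/fp32fps.png
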

Results:

  • The IEI Mustang F100-A10 FPGA took more time for inference than the other devices, because FPGAs are designed for specific tasks.
  • The Intel Atom x7-E3950 UP2 GPU had the highest loading time at every precision; the other devices had lower loading times.
  • The GPUs and CPUs were close in FPS.
  • Overall, the GPU had the highest FPS of all the hardware tested.
  • Although a GPU costs a little more than a CPU, these results make it a good choice.
  • The difference between the models at different precisions is quite clear.
  • INT8 is the smallest in size but has lower accuracy than the others. Useful for low-memory IoT devices where lower accuracy is acceptable.
  • FP16 is a good middle ground in the size and accuracy tradeoff. Useful for most use cases.
  • FP32 has the highest accuracy and the largest size. Useful for devices with plenty of memory that need high accuracy.
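The size difference between precisions follows from the number of bytes used per weight: 4 for FP32, 2 for FP16, and 1 for INT8. The short calculation below illustrates this with a made-up count of one million weights; the actual .bin sizes may differ, since a quantized model can keep some layers in a higher precision.

```python
# Bytes needed to store one weight at each precision.
BYTES_PER_WEIGHT = {"FP32": 4, "FP16": 2, "INT8": 1}


def weights_size_mb(num_weights, precision):
    """Approximate size of the weights in megabytes (1 MB = 10**6 bytes)."""
    return num_weights * BYTES_PER_WEIGHT[precision] / 1e6


if __name__ == "__main__":
    num_weights = 1_000_000  # hypothetical count, for illustration only
    for precision in ("FP32", "FP16", "INT8"):
        print(precision, weights_size_mb(num_weights, precision), "MB")
```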

Edge Cases

  • Many People Scenario: If multiple people are present in the video, results are given for one face only.
  • Head Detection: When no one is in the frame, the frame is skipped and the user is informed.
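Both edge cases can be handled in the step that selects a face from the detector output. The sketch below assumes each detection is a row of the form [image_id, label, confidence, x_min, y_min, x_max, y_max], and it picks the most confident face; which face the project actually picks is not stated here, so that choice and the 0.6 threshold are assumptions.

```python
def pick_face(detections, threshold=0.6):
    """Return the box of the most confident face above threshold, or None."""
    best = None
    for image_id, label, conf, x_min, y_min, x_max, y_max in detections:
        if conf >= threshold and (best is None or conf > best[0]):
            best = (conf, (x_min, y_min, x_max, y_max))
    return None if best is None else best[1]


if __name__ == "__main__":
    two_faces = [
        [0, 1, 0.70, 0.1, 0.1, 0.3, 0.4],
        [0, 1, 0.95, 0.5, 0.2, 0.8, 0.6],
    ]
    box = pick_face(two_faces)
    if box is None:
        print("No face detected, skipping frame")
    else:
        print("Using face at", box)
```

A caller would skip the frame and inform the user whenever pick_face returns None.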

Stand Out Suggestions:

  • Proper eye, head, and gaze movement is advised.
  • Lighting: The pre-processing steps could be improved to reduce errors caused by bad lighting conditions.

Contributors

  • dhananjayyy
