
Face and hand tracking with DepthAI

Running Google Mediapipe Face Mesh and Hand Tracking models on Luxonis DepthAI hardware (OAK-D, OAK-D Lite, OAK-1, ...). The hand tracking is optional and can be disabled by setting the argument nb_hands to 0.
WIP

Demo

The models used in this repository are:

  • Mediapipe Blazeface, the short range version, for face detection. The face must be closer than 2 m to the camera.
  • Mediapipe Face Mesh for face landmark detection (468 landmarks). I call this model the basic model in this document.
  • Mediapipe Face Mesh with attention, an alternative to the previous model. In addition to the 468 landmarks, it detects 10 more landmarks corresponding to the irises. Its predictions are more accurate around the lips and eyes, at the expense of more compute (~10 FPS on OAK-D). I call this model the attention model.
  • The Mediapipe Palm Detection model (version 0.8.0) and the Mediapipe Hand Landmarks model (version lite), already used in depthai_hand_tracker.

Note that, whenever possible, the post-processing of the model outputs has been integrated into the models themselves, thanks to PINTO's simple-onnx-processing-tools. Thus, Non-Maximum Suppression for the face detection and palm detection models, as well as some calculations on the 468 or 478 face landmarks, are done inside the models. The alternative would have been to do these calculations on the host or in a script node on the device (slower).
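For reference, the host-side alternative would look something like the sketch below, using OpenCV's cv2.dnn.NMSBoxes. The detection arrays are made up for illustration; in this repository the equivalent step happens inside the models themselves.

    import cv2
    import numpy as np

    # Hypothetical decoded detections: boxes as [x, y, w, h] in pixels,
    # with one confidence score per box (values made up for illustration).
    boxes = [[120, 80, 64, 64], [124, 82, 60, 60], [300, 150, 70, 70]]
    scores = [0.91, 0.88, 0.75]

    # Keep boxes scoring >= 0.5 and suppress overlaps above IoU 0.3.
    kept = cv2.dnn.NMSBoxes(boxes, scores, 0.5, 0.3)
    for i in np.array(kept).flatten():
        print("kept:", boxes[i], "score:", scores[i])

Here the second box overlaps the first almost entirely and gets suppressed; doing this per frame on the host is exactly the round-trip the in-model post-processing avoids.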

Install

Install the Python packages (depthai, opencv) with the following command:

python3 -m pip install -r requirements.txt
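If the demo cannot find your camera, a quick sanity check with the standard depthai API (nothing specific to this repository) is to list the devices it can see:

    import depthai as dai

    # List the OAK devices visible over USB/PoE. An empty list usually
    # points to a cabling, power, or udev-rules issue, not to this repo.
    for dev in dai.Device.getAllAvailableDevices():
        print(f"Found device {dev.getMxId()} ({dev.state})")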

Run

Usage:

> ./demo.py -h
usage: demo.py [-h] [-i INPUT] [-a] [-2] [-n {0,1,2}] [-xyz] [-f INTERNAL_FPS]
               [--internal_frame_height INTERNAL_FRAME_HEIGHT] [-t [TRACE]]
               [-o OUTPUT]

optional arguments:
  -h, --help            show this help message and exit

Tracker arguments:
  -i INPUT, --input INPUT
                        Path to video or image file to use as input (if not
                        specified, use OAK color camera)
  -a, --with_attention  Use face landmark with attention model
  -2, --double_face     EXPERIMENTAL. Run a 2nd occurrence of the face landmark
                        Neural Network to improve FPS. Hand tracking is disabled.
  -n {0,1,2}, --nb_hands {0,1,2}
                        Number of hands tracked (default=2)
  -xyz, --xyz           Enable spatial location measure of palm centers
  -f INTERNAL_FPS, --internal_fps INTERNAL_FPS
                        FPS of the internal color camera. Too high a value
                        lowers the NN FPS (default depends on the model)
  --internal_frame_height INTERNAL_FRAME_HEIGHT
                        Internal color camera frame height in pixels
  -t [TRACE], --trace [TRACE]
                        Print some debug info. The type of info depends on
                        the optional argument.

Renderer arguments:
  -o OUTPUT, --output OUTPUT
                        Path to output video file

Some examples:

  • To run the basic face model while tracking up to 2 hands:

    ./demo.py

  • Same as above but with the attention face model:

    ./demo.py -a

  • To run only the Face Mesh model (no hand tracking):

    ./demo.py [-a] -n 0

  • If you want to track only one hand (instead of 2), you will get better FPS by running:

    ./demo.py [-a] -n 1

  • Instead of the OAK* color camera, you can use another source (video or image):

    ./demo.py [-a] -i filename

  • To measure face and hand spatial location in the camera coordinate system:

    ./demo.py [-a] -xyz

    The measurement is made on the wrist keypoints and on a point of the forehead between the eyes.
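demo.py is a thin wrapper around the tracker and renderer classes, so the same runs can be scripted. Below is a minimal sketch of that loop, using the HandFaceTracker/HandFaceRenderer classes of this repository with constructor arguments mirroring the CLI options above; treat the exact signatures as assumptions rather than the repository's verbatim code.

    from HandFaceTracker import HandFaceTracker
    from HandFaceRenderer import HandFaceRenderer

    # Rough equivalent of './demo.py -a -n 1 -xyz'; the argument names
    # mirror the CLI options above, but the exact signatures are assumptions.
    tracker = HandFaceTracker(with_attention=True, nb_hands=1, xyz=True)
    renderer = HandFaceRenderer(tracker=tracker)

    while True:
        frame, faces, hands = tracker.next_frame()
        if frame is None:
            break
        # Draw the face/hand landmarks; the keypresses listed below are handled here.
        frame = renderer.draw(frame, faces, hands)
        if renderer.waitKey(delay=1) == 27:  # Esc exits
            break
    renderer.exit()
    tracker.exit()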

Keypress   Function
Esc        Exit
space      Pause
1          Show/hide the rotated bounding box around the hand
2          Show/hide the hand landmarks
3          Show/hide the rotated bounding box around the face
4          Show/hide the face landmarks
5          Show/hide hand spatial location (-xyz)
6          Show/hide the zone used to measure the spatial location (small purple square) (-xyz)
f          Switch between several face landmark renderings
f          Switch between several hand landmark renderings
b          Draw the landmarks on a black background
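A renderer typically implements such toggles by polling OpenCV for keypresses once per displayed frame; a minimal sketch of the pattern (flag names hypothetical, not the repository's actual code):

    import cv2

    show_hand_box = True        # toggled by '1' (flag name hypothetical)
    show_face_landmarks = True  # toggled by '4'

    key = cv2.waitKey(1) & 0xFF   # poll once per displayed frame
    if key == 27:                 # Esc -> exit
        raise SystemExit
    elif key == ord(' '):         # space -> pause until the next keypress
        cv2.waitKey(0)
    elif key == ord('1'):
        show_hand_box = not show_hand_box
    elif key == ord('4'):
        show_face_landmarks = not show_face_landmarks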

Credits

  • Google Mediapipe for the Blazeface, Face Mesh, Palm Detection and Hand Landmarks models.
  • PINTO's simple-onnx-processing-tools, used to integrate the post-processing into the models.
