YOLO-Hand-Detection

Scene hand detection for real-world images.

Hand Detection Example

Idea

To detect hand gestures, we first have to detect the hand's position in space. This pre-trained network extracts hands from a 2D RGB image using the YOLOv3 neural network.

Models for this task already exist, mainly based on MobileNetSSD networks. The goal of this model is to support a wider range of images and to provide a more stable detector (hopefully 🙈).

Dataset

The first version of this network was trained on the CMU Hand DB dataset, which is free to access and download. Because the results were OK but not satisfying, I used that model to pre-annotate more images and then manually corrected the pre-annotations.

Because Handtracking by Victor Dibia uses the Egohands dataset, I included it in the training set as well.

In the end, the training set consists of the CMU Hand DB, the Egohands dataset, and my own annotated images (mainly of marathon runners), together called cross-hands.

Training

The training took about 10 hours on a single NVIDIA GTX 1080 Ti and used the default YOLOv3 architecture. I also trained its slim variant, YOLOv3-Tiny.

YOLOv3

Training Graph

Precision: 0.89 Recall: 0.85 F1-Score: 0.87 IoU: 69.8

YOLOv3-Tiny

Training Graph

Precision: 0.76 Recall: 0.69 F1-Score: 0.72 IoU: 53.67

YOLOv3-Tiny-PRN

The tiny version of YOLO has been improved by the Partial Residual Networks paper, so I trained YOLOv3-Tiny-PRN as well and share the results here too. It is interesting to see that the YOLOv3-Tiny-PRN performance comes close to the original YOLOv3!

Training Graph

Precision: 0.89 Recall: 0.79 F1-Score: 0.83 IoU: 68.47

YOLOv4-Tiny

With the recent release of YOLOv4, it was interesting to see how well it performs against its predecessor: same precision, but better recall and IoU.

Training Graph

Precision: 0.89 Recall: 0.89 F1-Score: 0.89 IoU: 91.48

Testing

I could not evaluate the model on a standard benchmark such as the Egohands test split, because I mixed the training and testing samples together and created my own test dataset out of them.

As soon as I have time, I will publish a comparison of my trained models against, for example, Handtracking.

Inferencing

The models have been trained on an image size of 416x416. It is also possible to run inference at a smaller input size to increase speed. A good performance/accuracy trade-off on CPUs has been found at an input size of 256x256.

The model itself is fully compatible with the OpenCV DNN module and ready to use.
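As a minimal sketch (not part of the repo's own code), the model can be loaded and run with OpenCV's dnn module roughly as follows. The file names follow the download script, and the parsing assumes the standard Darknet/YOLO output layout of [cx, cy, w, h, objectness, class scores...] with coordinates normalized to the frame size:

```python
import numpy as np


def parse_detections(outs, frame_w, frame_h, conf_threshold=0.5):
    """Convert raw YOLO output rows into (x, y, w, h, confidence) boxes.

    Each row is [cx, cy, w, h, objectness, class scores...]; box
    coordinates are normalized to [0, 1] relative to the frame size.
    """
    boxes = []
    for out in outs:
        for row in out:
            objectness = float(row[4])
            if objectness < conf_threshold:
                continue
            cx, cy = row[0] * frame_w, row[1] * frame_h
            w, h = row[2] * frame_w, row[3] * frame_h
            # Convert center-based coordinates to a top-left box.
            boxes.append((int(cx - w / 2), int(cy - h / 2),
                          int(w), int(h), objectness))
    return boxes


def detect_hands(image_path, size=256):
    """Run the cross-hands model on one image with OpenCV's dnn module."""
    # Lazy import so parse_detections stays usable without OpenCV installed.
    import cv2
    net = cv2.dnn.readNetFromDarknet("models/cross-hands.cfg",
                                     "models/cross-hands.weights")
    frame = cv2.imread(image_path)
    h, w = frame.shape[:2]
    # 256x256 trades a little accuracy for CPU speed, as described above.
    blob = cv2.dnn.blobFromImage(frame, 1 / 255.0, (size, size),
                                 swapRB=True, crop=False)
    net.setInput(blob)
    outs = net.forward(net.getUnconnectedOutLayersNames())
    return parse_detections(outs, w, h)
```

Since cross-hands has only one class (hand), the objectness score is used directly as the confidence here; a multi-class model would also multiply in the class score.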

Demo

To run the demo, please first install all the dependencies (requirements.txt) into a virtual environment and download the models and weights into the models folder (or run the shell script).

# mac / linux
cd models && sh ./download-models.sh

# windows
cd models && powershell .\download-models.ps1

Then run the following command to start a webcam detector with YOLOv3:

# with python 3
python demo_webcam.py

Or this one to run a webcam detector with YOLOv3-Tiny:

# with python 3
python demo_webcam.py -n tiny

For YOLOv3-Tiny-PRN use the following command:

# with python 3
python demo_webcam.py -n prn

For YOLOv4-Tiny use the following command:

# with python 3
python demo_webcam.py -n v4-tiny

Download

If you are interested in the CMU Hand DB results, please check the release section.

About

Trained by cansik; the datasets are described in the readme and fall under the terms and conditions of their owners.

All the demo images have been downloaded from unsplash.com:

Tim Marshall, Zachary Nelson, John Torcasio, Andy Falconer, Sherise, Alexis Brown

yolo-hand-detection's People

Contributors

cansik, djthegr8, roachsinai


yolo-hand-detection's Issues

How to retrain the model?

Congrats on your work; I've tested it and it performed well on my dataset. However, to get better results I need to fine-tune this pre-trained model with my own dataset, and I noticed that there isn't a training script included. So my question is: is there a way to retrain this model for fine-tuning?

How to use the CMU Hand DB?

Hi, thanks for sharing; it's a pretty great project!

But I have a question about how to use the CMU Hand dataset: I couldn't find any bounding-box labels for hands, and it only seems to contain the key points of the hand.

Thanks.

Conversion To TensorFlow or Keras

Hey,
So I wanted to do hand detection in the browser using TensorFlow.js, but converting to a TF.js model requires a TensorFlow or Keras model. Could you guide me in converting these models to TensorFlow?

No detection with webcam mode

I tried the normal and v4-tiny models with demo_webcam.py.
Everything is at its default, but no detection can be seen, not a single one, and of course I tried different angles and such, to no avail.
I am also a follower of YOLOv4, by the way.

Datasets: How to reconstruct

Hi!

You wrote

In the end, the training set consists of the CMU Hand DB, the Egohands dataset and my own trained images (mainly from marathon runners), called cross-hands.

How can we find the dataset to reconstruct the training?

Datasets

Any interest in uploading your datasets? I could contribute by training larger models (and YOLOv4) and adding them to this repo.

The demo with the v4-tiny model does not work

I used the command python demo_webcam.py -n v4-tiny, but the detector cannot find any hands. python demo_webcam.py -n v4-tiny -c 0 also does not work. How can I fix this?

How can I get every detected hand's coordinates

Hello, how can I get the x and y coordinates of every detected hand? In your code, only one x and y is given.
I want to get hand1: x1, y1; hand2: x2, y2. How can I get this?

Missing source code

Unfortunately there is only a README file but no source code. Is the source code somehow encoded in the README?
Excuse my stupid question, I am new to YOLO and hand tracking.

Failed to parse NetParameter file

When running the code in PyCharm using the command python demo_webcam.py, I receive this error:

cv2.error: OpenCV(4.4.0) C:\Users\appveyor\AppData\Local\Temp\1\pip-req-build-q0nmoxxv\opencv\modules\dnn\src\darknet\darknet_importer.cpp:207: error: (-212:Parsing error) Failed to parse NetParameter file: models/cross-hands.cfg in function 'cv::dnn::dnn4_v20200609::readNetFromDarknet'

Any way to resolve it?

ZeroDivisionError: integer division or modulo by zero in demo.py

I get this error when I use demo.py with an image that I took on my phone. Does the image need to be a specific size?

python demo.py -i test.jpg

loading yolo...
extracting tags for each image...
Traceback (most recent call last):
File "demo.py", line 78, in
print("AVG Confidence: %s Count: %s" % (round(conf_sum / detection_count, 2), detection_count))
ZeroDivisionError: integer division or modulo by zero
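For context, the traceback suggests the crash happens when no hands are detected, so the detection count used as divisor is zero and the averaging needs a guard. A minimal sketch of such a guard (hypothetical helper, not the repo's actual code):

```python
def average_confidence(conf_sum, detection_count):
    """Return the mean detection confidence rounded to two decimals,
    or None when nothing was detected, avoiding the ZeroDivisionError
    shown in the traceback above."""
    if detection_count == 0:
        return None
    return round(conf_sum / detection_count, 2)
```

The caller can then print "No hands detected" instead of the average when the helper returns None.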

Bounding box is sometimes only partially formed

Photos 2_8_2022 5_46_30 PM
I have used your model for detection on my own photos, but sometimes the box is incomplete, as in the attached photo.
What can I do?
Also, a few times hands are not recognized at all; what preprocessing can I do to handle this case?
Thanks a lot for this effort.
