
[Japanese/English]

Note
A repository that collects models for keypoint classification is now available:
Kazuhito00/hand-keypoint-classification-model-zoo

hand-gesture-recognition-using-mediapipe

This is a sample program that estimates hand pose with MediaPipe (Python version) and
recognizes hand signs and finger gestures with a simple MLP using the detected key points.

This repository contains the following:

  • Sample program
  • Hand sign recognition model (TFLite)
  • Finger gesture recognition model (TFLite)
  • Training data and training notebook for hand sign recognition
  • Training data and training notebook for finger gesture recognition

Requirements

  • mediapipe 0.8.4
  • OpenCV 4.6.0.66 or Later
  • Tensorflow 2.9.0 or Later
  • protobuf <3.20,>=3.9.2
  • scikit-learn 1.0.2 or Later (only if you want to display the confusion matrix during training)
  • matplotlib 3.5.1 or Later (only if you want to display the confusion matrix during training)
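
The repository does not ship a requirements file (see the Issues section below); a requirements.txt along the following lines, matching the versions above, could be used with pip install -r requirements.txt (adjust or pin versions as needed):

mediapipe==0.8.4
opencv-python>=4.6.0.66
tensorflow>=2.9.0
protobuf>=3.9.2,<3.20
scikit-learn>=1.0.2
matplotlib>=3.5.1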

Demo

Run the demo using a webcam as follows.

python app.py

Run the demo using Docker and a webcam as follows.

docker build -t hand_gesture .

xhost +local: && \
docker run --rm -it \
--device /dev/video0:/dev/video0 \
-v `pwd`:/home/user/workdir \
-v /tmp/.X11-unix/:/tmp/.X11-unix:rw \
-e DISPLAY=$DISPLAY \
hand_gesture:latest

python app.py

The following options can be specified when running the demo:

  • --device
    Camera device number (default: 0)
  • --width
    Width of the camera capture (default: 960)
  • --height
    Height of the camera capture (default: 540)
  • --use_static_image_mode
    Whether to use static_image_mode for MediaPipe inference (default: unspecified)
  • --min_detection_confidence
    Detection confidence threshold (default: 0.5)
  • --min_tracking_confidence
    Tracking confidence threshold (default: 0.5)
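
For example, combining the options above, the following captures from camera device 1 at 1280x720 with a stricter detection threshold:

python app.py --device 1 --width 1280 --height 720 --min_detection_confidence 0.7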

Directory

│  app.py
│  keypoint_classification.ipynb
│  point_history_classification.ipynb
│
├─model
│  ├─keypoint_classifier
│  │  │  keypoint.csv
│  │  │  keypoint_classifier.hdf5
│  │  │  keypoint_classifier.py
│  │  │  keypoint_classifier.tflite
│  │  └─ keypoint_classifier_label.csv
│  │
│  └─point_history_classifier
│      │  point_history.csv
│      │  point_history_classifier.hdf5
│      │  point_history_classifier.py
│      │  point_history_classifier.tflite
│      └─ point_history_classifier_label.csv
│
└─utils
    └─cvfpscalc.py

app.py

This is the sample program for inference.
It can also collect training data (key points) for hand sign recognition and
training data (index finger coordinate history) for finger gesture recognition.

keypoint_classification.ipynb

This is the training script for the hand sign recognition model.

point_history_classification.ipynb

This is the training script for the finger gesture recognition model.

model/keypoint_classifier

This directory stores the files related to hand sign recognition.
The following files are stored:

  • Training data (keypoint.csv)
  • Trained model (keypoint_classifier.tflite)
  • Label data (keypoint_classifier_label.csv)
  • Inference class (keypoint_classifier.py); see the sketch below
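
The inference class itself is not reproduced here. As a rough sketch (not the repository's own wrapper class), the bundled keypoint_classifier.tflite can be run directly with the TensorFlow Lite interpreter; the 42-element input (21 landmarks x 2 preprocessed coordinates) is an assumption based on the data format described in the Training section.

import numpy as np
import tensorflow as tf

# Sketch: run the bundled hand-sign model with the plain TF Lite interpreter.
interpreter = tf.lite.Interpreter(
    model_path='model/keypoint_classifier/keypoint_classifier.tflite')
interpreter.allocate_tensors()
input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()

# Replace the zeros with a real preprocessed keypoint vector (assumed shape: 1 x 42).
landmark_vector = np.zeros((1, 42), dtype=np.float32)
interpreter.set_tensor(input_details[0]['index'], landmark_vector)
interpreter.invoke()
probabilities = interpreter.get_tensor(output_details[0]['index'])
class_id = int(np.argmax(probabilities))  # index into keypoint_classifier_label.csv
print(class_id)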

model/point_history_classifier

This directory stores the files related to finger gesture recognition.
The following files are stored:

  • Training data (point_history.csv)
  • Trained model (point_history_classifier.tflite)
  • Label data (point_history_classifier_label.csv)
  • Inference class (point_history_classifier.py)

utils/cvfpscalc.py

This is a module for FPS measurement.
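
The module is not reproduced here; as a minimal sketch of the same idea (illustrative, not the exact cvfpscalc.py API), an FPS counter can be built on OpenCV's tick counter:

from collections import deque

import cv2

class SimpleFpsCalc:
    """Moving-average FPS counter based on cv2.getTickCount()."""

    def __init__(self, buffer_len=10):
        self._freq = cv2.getTickFrequency()
        self._prev_tick = cv2.getTickCount()
        self._diff_times = deque(maxlen=buffer_len)

    def get(self):
        tick = cv2.getTickCount()
        self._diff_times.append((tick - self._prev_tick) / self._freq)
        self._prev_tick = tick
        mean_time = sum(self._diff_times) / len(self._diff_times)
        return round(1.0 / mean_time, 2) if mean_time > 0 else 0.0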

Training

For both hand sign recognition and finger gesture recognition,
you can add or change the training data and retrain the models.

Hand sign recognition training

1. Collect training data

Press "k" to enter the mode for saving key points ("MODE:Logging Key Point" is displayed).


Press "0" to "9" to append the key points to "model/keypoint_classifier/keypoint.csv" as shown below.
Column 1: pressed number (used as the class ID); columns 2 and onward: keypoint coordinates
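
Given that row layout, the CSV can be split into labels and feature vectors along the following lines (a sketch; the 21 x 2 = 42 coordinate columns are an assumption based on MediaPipe's 21 hand landmarks, and the training notebook may load the file differently):

import numpy as np

# Column 0: class ID, columns 1..42: preprocessed (x, y) pairs for 21 landmarks.
dataset = 'model/keypoint_classifier/keypoint.csv'
y = np.loadtxt(dataset, delimiter=',', dtype='int32', usecols=(0,))
X = np.loadtxt(dataset, delimiter=',', dtype='float32',
               usecols=range(1, 21 * 2 + 1))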


The keypoint coordinates that are saved have had the following preprocessing applied up to step ④.
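
The step-by-step figure is not included here; the sketch below is an illustrative reading of that preprocessing (convert to coordinates relative to the wrist landmark, flatten, then normalize by the maximum absolute value), not a verbatim copy of app.py:

import itertools

def preprocess_landmarks(landmark_list):
    """landmark_list: 21 (x, y) pixel coordinates from MediaPipe."""
    base_x, base_y = landmark_list[0]                      # wrist as the origin
    relative = [(x - base_x, y - base_y) for x, y in landmark_list]
    flat = list(itertools.chain.from_iterable(relative))   # 42 values
    max_value = max(map(abs, flat)) or 1.0                 # avoid division by zero
    return [value / max_value for value in flat]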


In the initial state, three types of training data are included: open hand (class ID: 0), closed fist (class ID: 1), and pointing (class ID: 2).
If necessary, add data for class ID 3 and later, or delete the existing data in the CSV, to prepare your own training data.
  

2. Train the model

Open "keypoint_classification.ipynb" in Jupyter Notebook and run it from top to bottom.
To change the number of classes in the training data, change the value of "NUM_CLASSES = 3"
and modify the labels in "model/keypoint_classifier/keypoint_classifier_label.csv" accordingly.

X. Model structure

An image of the model prepared in "keypoint_classification.ipynb" is shown below.

Finger gesture recognition training

1. Collect training data

Press "h" to enter the mode for saving the fingertip coordinate history ("MODE:Logging Point History" is displayed).


Press "0" to "9" to append the key points to "model/point_history_classifier/point_history.csv" as shown below.
Column 1: pressed number (used as the class ID); columns 2 and onward: coordinate history


The coordinates that are saved have had the following preprocessing applied up to step ④.
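
As above, the figure is not included here; the sketch below is an illustrative reading of that preprocessing (coordinates relative to the oldest point in the history, scaled by the image size), not a verbatim copy of app.py:

def preprocess_point_history(point_history, image_width, image_height):
    """point_history: chronological (x, y) pixel coordinates of the index fingertip."""
    base_x, base_y = point_history[0]
    processed = []
    for x, y in point_history:
        processed.append((x - base_x) / image_width)
        processed.append((y - base_y) / image_height)
    return processed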


In the initial state, four types of training data are included: stationary (class ID: 0), clockwise (class ID: 1),
counterclockwise (class ID: 2), and moving (class ID: 4).
If necessary, add data for class ID 5 and later, or delete the existing data in the CSV, to prepare your own training data.
   

2. Train the model

Open "point_history_classification.ipynb" in Jupyter Notebook and run it from top to bottom.
To change the number of classes in the training data, change the value of "NUM_CLASSES = 4"
and modify the labels in "model/point_history_classifier/point_history_classifier_label.csv" accordingly.

X. Model structure

An image of the model prepared in "point_history_classification.ipynb" is shown below.
A model using "LSTM" is also available.
To use it, change "use_lstm = False" to "True" (tf-nightly was required as of 2020/12/16).
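
As an illustrative Keras sketch of what such an LSTM variant over the index-finger coordinate history could look like (the 16-step history length and the layer sizes are assumptions, not necessarily the notebook's exact architecture):

import tensorflow as tf

TIME_STEPS = 16   # assumed history length
DIMENSION = 2     # (x, y) per step
NUM_CLASSES = 4

model = tf.keras.models.Sequential([
    tf.keras.layers.Input(shape=(TIME_STEPS * DIMENSION,)),
    tf.keras.layers.Reshape((TIME_STEPS, DIMENSION)),
    tf.keras.layers.LSTM(16),
    tf.keras.layers.Dropout(0.2),
    tf.keras.layers.Dense(10, activation='relu'),
    tf.keras.layers.Dense(NUM_CLASSES, activation='softmax'),
])
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])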

Application example

Application examples are introduced below.

Reference

Author

Kazuhito Takahashi (高橋かずひと) (https://twitter.com/KzhtTkhs)

License

hand-gesture-recognition-using-mediapipe is released under the Apache 2.0 license.

hand-gesture-recognition-using-mediapipe's People

Contributors

arky, kazuhito00, pinto0309, taffarel55


hand-gesture-recognition-using-mediapipe's Issues

Requirements File

Add a requirements file to quickly install the right versions of the Python packages.

train dataset of PointHistoryClassifier?

Hi Kazuhito00, thank you very much for your cool work. I have a question about the training dataset of the point_history_classifier. I want to extend your demo with my own gesture, but when I collect my own training dataset and train the new model, I find the new
model can't recognize the new gesture correctly. I collect my data like this: after I press 'h' and enter the mode to save the history of fingertip coordinates, I press '4' while performing the gesture and save the finger coordinates to the CSV for training the new model. The total number of added samples is about 500. Can you help me point out the problem? Thanks!

more than 9 gestures.

Hello friend, I have been experimenting with your project, porting it to a Raspberry Pi and trying it for translation, but I want to add more than 9 gestures. I don't know if you can support me, perhaps in your next project.

C++ version of MediaPipe

It would be great if you could also create something that does the same gesture recognition with the C++ version of MediaPipe.

add CITATION.cff file

I would like to use the models for scientific work.

Please add a CITATION.cff file to indicate how to properly cite this repository.

Error while converting keras model to tflite.

After following keypoint_classification.ipynb, I got an error while converting the .hdf5 model to .tflite.

...
Classification Report
              precision    recall  f1-score   support

           0       1.00      1.00      1.00       317
           1       1.00      1.00      1.00       329

    accuracy                           1.00       646
   macro avg       1.00      1.00      1.00       646
weighted avg       1.00      1.00      1.00       646

2022-07-13 00:07:33.159140: W tensorflow/compiler/mlir/lite/python/tf_tfl_flatbuffer_helpers.cc:362] Ignored output_format.
2022-07-13 00:07:33.159164: W tensorflow/compiler/mlir/lite/python/tf_tfl_flatbuffer_helpers.cc:365] Ignored drop_control_dependency.
2022-07-13 00:07:33.159600: I tensorflow/cc/saved_model/reader.cc:45] Reading SavedModel from: /tmp/tmpe_iawth6
2022-07-13 00:07:33.160549: I tensorflow/cc/saved_model/reader.cc:89] Reading meta graph with tags { serve }
2022-07-13 00:07:33.160564: I tensorflow/cc/saved_model/reader.cc:130] Reading SavedModel debug info (if present) from: /tmp/tmpe_iawth6
2022-07-13 00:07:33.163551: I tensorflow/compiler/mlir/mlir_graph_optimization_pass.cc:354] MLIR V1 optimization pass is not enabled
2022-07-13 00:07:33.164236: I tensorflow/cc/saved_model/loader.cc:229] Restoring SavedModel bundle.
2022-07-13 00:07:33.190545: I tensorflow/cc/saved_model/loader.cc:213] Running initialization op on SavedModel bundle at path: /tmp/tmpe_iawth6
2022-07-13 00:07:33.196456: I tensorflow/cc/saved_model/loader.cc:305] SavedModel load for tags { serve }; Status: success: OK. Took 36862 microseconds.
2022-07-13 00:07:33.208658: I tensorflow/compiler/mlir/tensorflow/utils/dump_mlir_util.cc:268] disabling MLIR crash reproducer, set env var `MLIR_CRASH_REPRODUCER_DIRECTORY` to enable.
WARNING:absl:Buffer deduplication procedure will be skipped when flatbuffer library is not properly loaded
Traceback (most recent call last):
  File "handsign_train.py", line 94, in <module>
    tflite_quantized_model = converter.convert()
  File "/home/jason/hand-gesture-recognition-using-mediapipe/venv/lib/python3.8/site-packages/tensorflow/lite/python/lite.py", line 930, in wrapper
    return self._convert_and_export_metrics(convert_func, *args, **kwargs)
  File "/home/jason/hand-gesture-recognition-using-mediapipe/venv/lib/python3.8/site-packages/tensorflow/lite/python/lite.py", line 922, in _convert_and_export_metrics
    return flatbuffer_utils.convert_object_to_bytearray(model_object)
  File "/home/jason/hand-gesture-recognition-using-mediapipe/venv/lib/python3.8/site-packages/tensorflow/lite/tools/flatbuffer_utils.py", line 84, in convert_object_to_bytearray
    model_offset = model_object.Pack(builder)
  File "/home/jason/hand-gesture-recognition-using-mediapipe/venv/lib/python3.8/site-packages/tensorflow/lite/python/schema_py_generated.py", line 11662, in Pack
    operatorCodes = builder.EndVector()
TypeError: EndVector() missing 1 required positional argument: 'vectorNumElems'

handsign_train.py has the same code as keypoint_classification.ipynb

package:
tensorflow 2.9.1
tf-nightly 2.10.0.dev20220710
flatbuffers 1.12

"MODE:Logging Point History" not always working

Dear @Kazuhito00,

First of all thank you so much for sharing this amazing project. Your code is so well written and your tutorials are perfect.

I'm experiencing some issues with the "MODE:Logging Point History" mode when pressing the "h" key while running the app.

What I'm trying to do, obviously, is to record some new points in order to train the model to recognize a new index-finger gesture. What I find weird is that sometimes it works and I'm able to add new data points to the "point_history.csv" file, and sometimes, for no apparent reason, the recording doesn't work.

As I'm aware the problem could come from many aspects of my environment and the way I'm following each of your tutorial steps, I just wanted to ask a more general question: have you experienced problems with the "MODE:Logging Point History" mode, and do you have any idea where this could come from?

If you find that the issue lacks details about the context of the error, feel free to close and delete it, I'm just asking in case you would have experienced anything similar.

Thank you again, I hope you're fine 🙂

Slow FPS 4 to 5 on Ubuntu 18.04 with NVIDIA GeForce RTX 2080 Ti

Hi @Kazuhito00 ,

Thank you for your work on this.

I was trying out your repo and observed that on my Ubuntu 18.04 machine with an NVIDIA GeForce RTX 2080 Ti I am seeing only 4 to 5 FPS, whereas your demo shows 25 to 30 FPS.

I have tried the following:

  1. The TFLite model that was already in the repo, but still 4 to 5 FPS.
  2. Building and compiling a TFLite model from the HDF5 model in the repo (converted the HDF5 model to TFLite on my machine), but still 4 to 5 FPS.
  3. Directly using the HDF5 models for the key point classifier and the point history classifier, running prediction/inference on the GPU, but still no luck: 4 to 5 FPS.

It would be helpful if you could provide some guidance on why I am only getting 4 to 5 FPS.

How was the data generated?

I want to train my own new gesture, but I see your dataset has a whopping 5k rows, with approximately 1k per gesture.

Was your data genuinely collected from 5k key presses in the app, or was it generated by some trick?

How can I give the model a good dataset?

Many thanks

I cannot get correct results

  1. I have finished training based on point_history.csv, but I cannot get correct results compared with the weights you provided, and I don't know why.
  2. I collected training data for Stop, Clockwise, Anti-Clockwise, Moving, Click, and Grab, but I also can't get correct results; only 'Stop' is recognized.

New dataset, accuracy is very low 0.2- 0.3

Hi,
Following in the footsteps of this work, I created my own gesture dataset with MediaPipe, did the same preprocessing as described, and trained a 3-dense-layer classifier with Adam and categorical cross-entropy, but my accuracy doesn't increase beyond 0.3. I have more than 1k samples for each class. I am baffled as to why it is not working; any input from your side would be useful.

Preprocessing step:

# shape (N, 21, 2)
def preprocess(data):
    for i in range(data.shape[0]):
        data[i, :, 0] = data[i, :, 0] - data[i, 0, 0]
        data[i, :, 1] = data[i, :, 1] - data[i, 0, 1]
        data[i] = data[i] / np.amax(np.abs(data[i]))
    data = data.reshape(data.shape[0], 42)
    return data

model

def get_model():
    reg = regularizers.l1(0)
    inp = Input(shape=(42,))
    x = Dense(32, activation='relu', activity_regularizer=reg, kernel_initializer='he_uniform')(inp)
    x = Dropout(0)(x)
    x = Dense(16, activation='relu', activity_regularizer=reg, kernel_initializer='he_uniform')(x)
    x = Dropout(0)(x)
    x = Dense(8, activation='relu', activity_regularizer=reg, kernel_initializer='he_uniform')(x)
    out = Dense(5, activation='softmax')(x)
    model = Model(inp, out)
    opt = tf.keras.optimizers.Adam(learning_rate=0.1)
    model.compile(optimizer=opt, loss='categorical_crossentropy', metrics=['accuracy'])
    model.summary()
    return model

model.fit

In the Jupyter notebook I get an error in:

model.fit(
    X_train,
    y_train,
    epochs=1000,
    batch_size=128,
    validation_data=(X_test, y_test),
    callbacks=[cp_callback, es_callback],
)

It says:

InvalidArgumentError    Traceback (most recent call last)
Cell In[32], line 1

I don't understand why or how to fix this problem; can you help me with this error?
Thank you very much!
