
Hamoco

hamoco (handy mouse controller) is a Python application that lets you control your mouse from your webcam using various hand gestures. You have a laptop equipped with a webcam? Well, good news: that's all you need to feel like Tom Cruise in Minority Report! Kind of.

Demonstration

In the example below, the hand is used to move the pointer, open a file by double-clicking on it, scroll through it, select a paragraph and cut it. The file is then dragged and dropped into a folder.

How does it work?

By using the power of PyAutoGUI to control the mouse, OpenCV to process the video feed, and MediaPipe to track hands, hamoco predicts the nature of a hand pose in real-time thanks to a neural network built with Keras and uses it to perform various kinds of mouse pointer actions.
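As an illustration of the tracking-to-classification hand-off, MediaPipe reports 21 landmarks per detected hand, and a pose classifier can consume them as a flat feature vector. The wrist-relative encoding below is a hypothetical sketch, not necessarily hamoco's actual preprocessing:

```python
# Hypothetical sketch: flatten MediaPipe's 21 hand landmarks into a feature
# vector for a pose classifier. hamoco's exact encoding may differ.

def landmarks_to_features(landmarks):
    """Translate landmarks so the wrist (landmark 0) is the origin,
    then flatten the (x, y, z) triplets into a single vector."""
    wx, wy, wz = landmarks[0]
    features = []
    for (x, y, z) in landmarks:
        features.extend([x - wx, y - wy, z - wz])
    return features

# MediaPipe reports 21 landmarks per hand -> 63 features.
dummy_hand = [(0.5, 0.5, 0.0)] * 21
assert len(landmarks_to_features(dummy_hand)) == 63
```

A wrist-relative encoding like this makes the features invariant to where the hand sits in the frame, which is a common choice for pose classification.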

Installation

1. From PyPI:

pip install hamoco

2. From the code repository:

git clone https://github.com/jorisparet/hamoco
cd hamoco
pip install -r requirements.txt
pip install .

The installation copies three scripts into pip's default script folder:

  1. hamoco-run
  2. hamoco-data
  3. hamoco-train

Linux

The default folder should be under /home/<user>/.local/bin/. Make sure this location (or the correct one, if different) is included in your $PATH environment variable so you can run the scripts from the console. If it is not, type export PATH=$PATH:/path/to/hamoco/scripts/ in the console, or add that line to your .bashrc file.

Windows

The default folder should be under C:\Users\<user>\AppData\Local\Programs\Python\<python_version>\Scripts\. Make sure this location (or the correct one, if different) is included in your PATH environment variable so you can run the scripts from the console. If it is not, type set PATH=%PATH%;C:\path\to\hamoco\scripts\ in the console, or select Edit the system environment variables in the search bar, click Environment Variables…, select PATH, click Edit..., and add the correct path to the scripts.

Requirements: the main dependencies are PyAutoGUI, OpenCV, MediaPipe, NumPy, and TensorFlow/Keras; see requirements.txt in the code repository for the full list.

Quick start

Running the scripts

hamoco is composed of three executable scripts, hamoco-run, hamoco-data, and hamoco-train, which are described below. Run them directly from the console, e.g. hamoco-run --sensitivity 0.5 --show.

hamoco-run

hamoco-run is the main application. It activates the webcam and lets you use hand gestures to take control of the mouse pointer. Several basic actions can then be performed, such as left click, right click, drag and drop, and scrolling. Note that it takes a bit of practice before you get comfortable with the controls. Various settings can be adjusted to customize the hand controller to your liking, such as the global sensitivity, parameters for motion smoothing, and much more. Type hamoco-run --help for more information on the available options.

Examples:

  • hamoco-run --sensitivity 0.4 --scrolling_threshold 0.2: adapts the sensitivity and sets a custom threshold value to trigger scrolling motions.
  • hamoco-run --min_cutoff_filter 0.05 --show: sets a custom value for the cutoff frequency used for motion smoothing and opens a window that shows the processed video feed in real time.
  • hamoco-run --scrolling_speed 20: sets a custom value for the scrolling speed. Note that, for a given value, results may differ significantly depending on the operating system.
  • hamoco-run --margin 0.2 --stop_sequence THUMB_SIDE CLOSE INDEX_MIDDLE_UP: adapts the size of the detection margin (indicated by the dark frame in the preview window when using --show) and changes the sequence of consecutive poses that stops the application.

Configuration files with default values for the control parameters can be found in the installation folder, under hamoco/config/. Simply edit the file that corresponds to your operating system (posix.json for Linux and nt.json for Windows) to save your settings permanently, and hence avoid specifying the parameters by hand in the console.
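For instance, a setting could be updated programmatically as sketched below; the "sensitivity" key is an assumption, so check the shipped .json files for the actual key names:

```python
# Hypothetical sketch: update a setting in the OS-specific configuration file
# (posix.json on Linux, nt.json on Windows), located in the installation
# folder under hamoco/config/. The key name "sensitivity" is an assumption.
import json
import os

config_file = "posix.json" if os.name == "posix" else "nt.json"

def set_option(path, key, value):
    """Load a JSON config, change one option, and write it back."""
    with open(path) as f:
        config = json.load(f)
    config[key] = value
    with open(path, "w") as f:
        json.dump(config, f, indent=4)
```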

Hand poses & Mouse actions:

  • OPEN: the pointer is free and follows the center of the palm (indicated by the white square);
  • CLOSE: the pointer stops all actions. The hand can be moved anywhere in the frame without moving the pointer; this is used to reset the origin of motion (see the nota bene below);
  • INDEX_UP: performs a left click at the current pointer location. Execute twice rapidly for a double click;
  • PINKY_UP: performs a right click at the current pointer location;
  • INDEX_MIDDLE_UP: holds the left mouse button down and moves the pointer by following the center of the palm. This is used for selection and drag & drop;
  • THUMB_SIDE: enables vertical scrolling, using the first triggering location as the origin. Scroll up or down by moving the hand up or down relative to the origin while keeping the same hand pose.

N.B.: much like with a real mouse, the recorded motion of the pointer is relative to its previous position. When your mouse reaches the edge of your mouse pad, you simply lift it and land it back somewhere on the pad to start moving again. Similarly, if your hand reaches the edge of the frame, the pointer stops moving: simply close your fist and move it back into the frame to reset the origin of motion (exactly like lifting and moving a real mouse).
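The relative-motion behavior described above can be sketched as follows; this is an illustrative model, not hamoco's actual code:

```python
# Illustrative sketch of relative pointer motion: the pointer moves by the
# *change* in palm position, and a CLOSE pose resets the origin, like
# lifting a real mouse off its pad.

class RelativePointer:
    def __init__(self, sensitivity=0.5, screen=(1920, 1080)):
        self.sensitivity = sensitivity
        self.screen = screen
        self.origin = None                       # last palm position, in [0, 1]
        self.x, self.y = screen[0] / 2, screen[1] / 2

    def update(self, pose, palm):
        if pose == "CLOSE" or self.origin is None:
            self.origin = palm                   # reset: reposition hand freely
            return
        dx = (palm[0] - self.origin[0]) * self.sensitivity * self.screen[0]
        dy = (palm[1] - self.origin[1]) * self.sensitivity * self.screen[1]
        # clamp the pointer to the screen
        self.x = min(max(self.x + dx, 0), self.screen[0])
        self.y = min(max(self.y + dy, 0), self.screen[1])
        self.origin = palm
```

With this model, moving a closed fist changes only the origin, which is exactly the "lift the mouse" behavior described above.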

The various hand poses are illustrated below:

Exiting the application:

There are two ways to exit the application:

  1. In the preview mode (--show option enabled), simply click on the preview window and press ESC;
  2. Execute a predetermined sequence of consecutive hand poses. The default sequence can be found in the help message (hamoco-run --help). A new sequence can be specified with the --stop_sequence option followed by the consecutive hand poses, or it can simply be changed in the .json configuration file.
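Detecting such a sequence of consecutive poses can be sketched like this; it is a hypothetical implementation, and hamoco's actual logic may differ:

```python
# Hypothetical sketch of stop-sequence detection: keep a rolling window of
# the most recent *distinct* poses and exit when it matches the configured
# sequence, so holding a pose across many frames only counts once.
from collections import deque

def make_stop_detector(stop_sequence):
    recent = deque(maxlen=len(stop_sequence))

    def saw_stop(pose):
        if not recent or recent[-1] != pose:
            recent.append(pose)
        return list(recent) == list(stop_sequence)

    return saw_stop
```

Called once per frame with the predicted pose, the detector returns True only when the last distinct poses match the stop sequence.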

hamoco-data

hamoco-data activates the webcam and lets you record your own labeled hand-pose data in order to train a custom neural-network-based classification model for the main application. This model can then be used in place of the default one and should perform better, as it is trained on your own natural hand poses (see hamoco-train). Type hamoco-data --help for more information on the available options.

This application requires two arguments:

  • pose: a string that indicates the type of hand pose you intend to record. It should be one of: OPEN, CLOSE, INDEX_UP, PINKY_UP, THUMB_SIDE, INDEX_MIDDLE_UP.
  • path_to_data: path to the folder inside of which you want the recorded data to be saved.

Examples:

  • hamoco-data OPEN data/ --delay 1.0: starts the recording for the OPEN hand pose, stores the resulting data in the data folder (provided it exists!), and takes a new snapshot every second.
  • hamoco-data INDEX_UP data/ --delay 0.25 --images: starts the recording for the INDEX_UP hand pose, stores the resulting data in the data folder, takes a new snapshot every 0.25 s, and also saves the images (in addition to the numeric data files used for training the model). Saving images is useful for manually checking whether your hand was in the correct position when its numerical data was recorded, so you can keep or remove specific data files accordingly.
  • hamoco-data CLOSE data/ --reset --stop_after 200: starts the recording for the CLOSE hand pose, stores the resulting data in the data folder, deletes every previously recorded file for this hand pose, and automatically stops the recording after taking 200 snapshots.
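The interplay of --delay and --stop_after can be sketched as a simple time gate; this is an assumption about the implementation, shown here on precomputed frame timestamps rather than a live video loop:

```python
# Illustrative sketch (not hamoco's actual code): take a snapshot only when
# `delay` seconds have elapsed since the last one, and stop after
# `stop_after` snapshots have been taken.

def snapshot_times(frame_times, delay, stop_after):
    """Return the frame timestamps at which a snapshot would be taken."""
    taken = []
    last = None
    for t in frame_times:
        if last is None or t - last >= delay:
            taken.append(t)
            last = t
            if len(taken) == stop_after:
                break
    return taken
```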

hamoco-train

Provided a path to a directory with compatible data, hamoco-train trains a customizable NN-based classification model to predict a hand pose. This classification model can then be used in the main application in place of the one provided by default. Type hamoco-train --help for more information on the available options.

This application requires two arguments:

  • path_to_model: path where the newly trained model will be saved.
  • path_to_data: path to the data folder used to train the model (see hamoco-data).

Examples:

  • hamoco-train my_custom_model.h5 data/ --hidden_layers 50 25 --epochs 20: trains and saves a model named my_custom_model.h5 with two hidden layers (of dimensions 50 and 25, respectively) over 20 epochs, using the compatible data in the data folder.
  • hamoco-train my_custom_model.h5 data/ --epochs 10 --learning_rate 0.1: trains and saves a model named my_custom_model.h5 with default dimensions over 10 epochs and with a learning rate of 0.1, using the compatible data in the data folder.
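The shape of such a classifier can be sketched without TensorFlow. The input encoding (63 values from 21 MediaPipe landmarks) and the exact architecture are assumptions here, not hamoco's actual model:

```python
# Shape sketch of a pose classifier in pure Python: 21 landmarks x 3
# coordinates = 63 inputs (an assumption), ReLU hidden layers of 50 and 25
# as in the example above, and one softmax output per hand pose.
import math
import random

# The six hand poses hamoco distinguishes:
POSES = ["OPEN", "CLOSE", "INDEX_UP", "PINKY_UP", "THUMB_SIDE", "INDEX_MIDDLE_UP"]

def dense(x, n_out, rng):
    """One fully-connected layer with dummy random weights."""
    return [sum(rng.uniform(-0.1, 0.1) * xi for xi in x) for _ in range(n_out)]

def forward(x, hidden_sizes, n_classes, seed=0):
    """Forward pass: ReLU hidden layers, then a softmax over the poses."""
    rng = random.Random(seed)
    for n in hidden_sizes:
        x = [max(0.0, v) for v in dense(x, n, rng)]
    logits = dense(x, n_classes, rng)
    exps = [math.exp(v) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

probs = forward([0.1] * 63, [50, 25], len(POSES))
assert abs(sum(probs) - 1.0) < 1e-9  # softmax probabilities sum to 1
```

The real training, of course, happens in Keras; this only illustrates how the layer dimensions passed on the command line shape the network.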

Your model can then be used in the main application with the --model flag of hamoco-run, e.g. hamoco-run --model <path_to_your_model>, or you can change the .json configuration file to point to it.

Author

Joris Paret

hamoco's Issues

Move hand horizontally in relaxed handshake position to control mouse pointer

Moving the (open) hand in front of the camera up and down is naturally quite tiring for longer use. Thus, would it be possible to create a mode where one can move the hand in a relaxed handshake position horizontally over the desk to control the mouse pointer? I mean just the same way as you would with an ergonomic vertical mouse, but without holding a mouse! That would be the most ergonomic "mouse" ever made! The camera would of course have to be pointed downwards at the desk.

As a first step, it would already be absolutely awesome to only be able to move the mouse pointer that way, without other gestures (I would like to use it in any case in combination with a voice control software that I am developing, where I can control mouse buttons by voice; see https://github.com/omlins/JustSayIt.jl).

Select webcam

First, congratulations on the great project! (I did quite a bit of Google searching before deciding to go with your project...)

Is there a way to select the webcam? In my case, on Ubuntu 22.04, it automatically selects the notebook integrated webcam. However, I would need to use a webcam connected with USB.

Windows 10/11: Code crashes when using thumb_side and moving hand up and down

Code crashes when you utilize the thumb_side gesture and then move your hand up or down.
Utilizing Python 3.8.10 (but also tested on Python 3.10.4)
Running the code as: hamoco-run -S 0.1 --min_cutoff_filter 0.5 --minimum_prediction_confidence 0.9 --beta_filter 15 --show
Warnings upon starting the code:

ZZZ: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'cudart64_110.dll'; dlerror: cudart64_110.dll not found
ZZZ: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.
ZZZ: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'cudart64_110.dll'; dlerror: cudart64_110.dll not found
ZZZ: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'cublas64_11.dll'; dlerror: cublas64_11.dll not found
ZZZ: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'cublasLt64_11.dll'; dlerror: cublasLt64_11.dll not found
ZZZ: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'cufft64_10.dll'; dlerror: cufft64_10.dll not found
ZZZ: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'curand64_10.dll'; dlerror: curand64_10.dll not found
ZZZ: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'cusolver64_11.dll'; dlerror: cusolver64_11.dll not found
ZZZ: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'cusparse64_11.dll'; dlerror: cusparse64_11.dll not found
ZZZ: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'cudnn64_8.dll'; dlerror: cudnn64_8.dll not found
ZZZ: W tensorflow/core/common_runtime/gpu/gpu_device.cc:1850] Cannot dlopen some GPU libraries. Please make sure the missing libraries mentioned above are installed properly if you would like to use GPU. Follow the guide at https://www.tensorflow.org/install/gpu for how to download and setup the required libraries for your platform.
Skipping registering GPU devices...
ZZZ: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX AVX2
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
INFO: Created TensorFlow Lite XNNPACK delegate for CPU.

Traceback after making the code crash:

Traceback (most recent call last):
  File "C:/Users/zzz/AppData/Local/Programs/Python/Python38/Scripts/hamoco-run", line 123, in <module>
    hand_controller.operate_mouse(hand,
  File "C:\Users\zzz\AppData\Local\Programs\Python\Python38\lib\site-packages\hamoco\controller.py", line 162, in operate_mouse
    pyautogui.scroll(numpy.sign(diff_to_origin_y), _pause=False)
  File "C:\Users\zzz\AppData\Local\Programs\Python\Python38\lib\site-packages\pyautogui\__init__.py", line 598, in wrapper
    returnVal = wrappedFunction(*args, **kwargs)
  File "C:\Users\zzz\AppData\Local\Programs\Python\Python38\lib\site-packages\pyautogui\__init__.py", line 1196, in scroll
    platformModule._scroll(clicks, x, y)
  File "C:\Users\zzz\AppData\Local\Programs\Python\Python38\lib\site-packages\pyautogui\_pyautogui_win.py", line 534, in _scroll
    _sendMouseEvent(MOUSEEVENTF_WHEEL, x, y, dwData=clicks)
  File "C:\Users\zzz\AppData\Local\Programs\Python\Python38\lib\site-packages\pyautogui\_pyautogui_win.py", line 495, in _sendMouseEvent
    ctypes.windll.user32.mouse_event(ev, ctypes.c_long(convertedX), ctypes.c_long(convertedY), dwData, 0)
ctypes.ArgumentError: argument 4: <class 'TypeError'>: Don't know how to convert parameter 4
[ WARN:[email protected]] global D:\a\opencv-python\opencv-python\opencv\modules\videoio\src\cap_msmf.cpp (539) `anonymous-namespace'::SourceReaderCB::~SourceReaderCB terminating async callback
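The ArgumentError on parameter 4 (dwData) suggests that pyautogui.scroll received a NumPy scalar: numpy.sign applied to a float returns a NumPy float, which ctypes cannot convert to a C long on Windows. A plausible workaround, not a confirmed fix, is casting the scroll amount to a built-in int before it reaches PyAutoGUI:

```python
# Plausible workaround (an assumption, not a confirmed fix): compute the
# scroll direction as a built-in int, equivalent to
# int(numpy.sign(diff_to_origin_y)) but without NumPy, so that PyAutoGUI's
# Windows backend can pass it to ctypes.
import ctypes

def scroll_clicks(diff_to_origin_y):
    """Return a ctypes-safe built-in int scroll direction (-1, 0, or +1)."""
    return (diff_to_origin_y > 0) - (diff_to_origin_y < 0)

# A built-in int converts cleanly where a C long is expected:
assert ctypes.c_long(scroll_clicks(-0.4)).value == -1
```

In hamoco/controller.py (line 162 in the traceback), this would presumably amount to calling pyautogui.scroll(int(numpy.sign(diff_to_origin_y)), _pause=False) instead.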
