Sign Language Recognition - using MediaPipe and DTW

License: MIT

This repository provides an implementation of a Sign Recognition Model that uses the MediaPipe library for landmark extraction and Dynamic Time Warping (DTW) as a similarity metric between signs.


Set up

1. Open a terminal and go to the project directory

2. Install the necessary libraries

  • pip install -r requirements.txt

3. Import videos of the signs that will be used as references

The videos/ folder must be structured as follows:

|data/
    |-videos/
          |-Hello/
            |-<video_of_hello_1>.mp4
            |-<video_of_hello_2>.mp4
            ...
          |-Thanks/
            |-<video_of_thanks_1>.mp4
            |-<video_of_thanks_2>.mp4
            ...
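As a rough illustration of how this layout can be read, the sketch below maps each sign name (the sub-folder name) to its video files. The function name `list_reference_videos` is hypothetical and is not taken from the repository; only the folder layout above is assumed.

```python
from pathlib import Path
from typing import Dict, List


def list_reference_videos(videos_dir: str) -> Dict[str, List[Path]]:
    """Map each sign name (sub-folder name) to its sorted .mp4 files.

    Hypothetical helper: it only assumes the data/videos/<Sign>/<file>.mp4
    layout described above.
    """
    root = Path(videos_dir)
    dataset: Dict[str, List[Path]] = {}
    # Each direct sub-folder of videos/ is treated as one sign label.
    for sign_dir in sorted(p for p in root.iterdir() if p.is_dir()):
        videos = sorted(sign_dir.glob("*.mp4"))
        if videos:  # skip sign folders that contain no video
            dataset[sign_dir.name] = videos
    return dataset
```

Calling `list_reference_videos("data/videos")` on the tree above would give one entry for Hello and one for Thanks, each listing its .mp4 files.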

To automatically create a small dataset of French signs:

  • Install ffmpeg (on macOS: brew install ffmpeg)
  • Run: python yt_download.py
  • Add more YouTube links in yt_links.csv if needed

N.B. The current dataset is too small to give good results. Feel free to add more links or import your own videos.

4. Load the dataset and turn on the Webcam

  • python main.py

5. Press the "r" key to record the sign.


Code Description

Landmark extraction (MediaPipe)

  • MediaPipe's Holistic model extracts the keypoints of the Hands, Pose and Face models. For now, the implementation uses only the Hand model to predict the sign.

Hand Model

  • In this project, a HandModel describes the hand gesture at each frame. If a hand is not present, all of its positions are set to zero.

  • To be invariant to orientation and scale, the feature vector of the HandModel is a list of the angles between all the connections of the hand.
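The idea of an angle-based feature vector can be sketched as follows. This is a minimal illustration and not the repository's HandModel: the names `angle_between` and `hand_feature_vector` are hypothetical, landmarks are assumed to be (x, y, z) tuples, and connections are assumed to be pairs of landmark indices.

```python
import math
from itertools import product
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float]


def angle_between(u: Sequence[float], v: Sequence[float]) -> float:
    """Angle in radians between two vectors; 0.0 if either has zero length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    if norm == 0:
        # Happens when the hand is absent and all positions are zero.
        return 0.0
    # Clamp to [-1, 1] to guard against floating-point rounding.
    return math.acos(max(-1.0, min(1.0, dot / norm)))


def hand_feature_vector(
    landmarks: Sequence[Point], connections: Sequence[Tuple[int, int]]
) -> List[float]:
    """Angles between every ordered pair of hand connections (assumed layout)."""
    # One direction vector per connection (start landmark -> end landmark).
    vectors = [
        tuple(landmarks[j][k] - landmarks[i][k] for k in range(3))
        for i, j in connections
    ]
    return [angle_between(u, v) for u, v in product(vectors, repeat=2)]
```

Because only angles are kept, scaling or rotating all landmarks together leaves the feature vector unchanged, and an absent hand (all zeros) yields a vector of zeros.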

Sign Model

  • The SignModel is created from a list of landmarks extracted from a video.

  • For each frame, we store the feature vectors of each hand.

Sign Recorder

  • The SignRecorder class stores the HandModels of the left and right hands for each frame while recording.
  • Once the recording is finished, it computes the DTW distance between the recorded sign and every reference sign in the dataset.
  • Finally, a voting logic outputs a result only if the prediction confidence is higher than a threshold.
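One possible form of such a voting step is sketched below. It is an assumption-laden illustration, not the repository's code: the function name `predict_sign`, the `batch_size` and `threshold` parameters and their default values are all hypothetical. Confidence is taken here to be the share of the closest reference signs that agree on a label.

```python
from collections import Counter
from typing import List, Tuple


def predict_sign(
    distances: List[Tuple[str, float]],
    batch_size: int = 5,      # assumed: number of closest references that vote
    threshold: float = 0.5,   # assumed: minimum share of votes required
) -> str:
    """Return the majority label among the closest references, or 'Unknown sign'."""
    # Keep the batch_size references with the smallest DTW distance.
    closest = sorted(distances, key=lambda item: item[1])[:batch_size]
    if not closest:
        return "Unknown sign"
    label, count = Counter(name for name, _ in closest).most_common(1)[0]
    # Output a label only when its share of the votes exceeds the threshold.
    return label if count / len(closest) > threshold else "Unknown sign"
```

With too few reference videos per sign, the closest matches tend to disagree, so a sketch like this would often fall back to "Unknown sign".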

Dynamic Time Warping

  • DTW is widely used to measure the similarity between time series.

  • In this project, we compute the DTW of the variation of hand connection angles over time.
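For readers unfamiliar with DTW, here is a minimal pure-Python sketch of the classic dynamic-programming formulation, applied to two sequences of feature vectors (one vector per frame). It is illustrative only: the name `dtw_distance` is hypothetical, the Euclidean frame-to-frame cost is an assumption, and the repository may rely on a dedicated library instead.

```python
import math
from typing import List, Sequence


def euclidean(u: Sequence[float], v: Sequence[float]) -> float:
    """Euclidean distance between two feature vectors of equal length."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def dtw_distance(a: Sequence[Sequence[float]], b: Sequence[Sequence[float]]) -> float:
    """Classic O(len(a) * len(b)) DTW cost between two sequences of vectors."""
    n, m = len(a), len(b)
    # cost[i][j] = best alignment cost of the first i frames of a
    # with the first j frames of b.
    cost: List[List[float]] = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = euclidean(a[i - 1], b[j - 1])
            # Extend the cheapest of: skip a frame in a, skip one in b, or match.
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]
```

Because frames can be matched one-to-many, the same sign performed at a different speed still aligns with a low cost, which is why DTW suits recordings of varying length.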


Contributors

gabguerin, gabrielsicara, nomppy, uriegas

Issues

Question about sign recording

What are the ideal conditions for recording videos of a given sign (lighting, background, camera resolution)? I have recorded around 30 videos for a single sign, but every time I press r to record, the result is always Unknown Sign.

Note also that the camera I'm using records at 1920x1080 resolution and 30 fps.

Unknown sign ("Signe inconnu")

sign_recorder.py

Tuple expression not allowed in type annotation
  Use Tuple[T1, ..., Tn] to indicate a tuple type or Union[T1, T2] to indicate a union type
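This message typically comes from a static type checker when a return annotation is written as a tuple expression such as `-> (str, bool)`. The sketch below shows the usual fix on a hypothetical function; `process_results` and its return values are placeholders and are not copied from sign_recorder.py.

```python
from typing import Tuple

# Flagged by type checkers: a tuple expression is not a valid type.
#   def process_results(frame) -> (str, bool): ...


def process_results(frame: object) -> Tuple[str, bool]:
    """Hypothetical stand-in returning (predicted sign, is_recording)."""
    # Placeholder values; a real implementation would inspect the frame.
    return "Unknown sign", False
```

On Python 3.9 and later, the built-in form `tuple[str, bool]` can be used in the same way without importing from `typing`.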

Issue: Pulling Video from Youtube

Hi,

When trying to pull videos from youtube, I receive the following error:

  File "C:\Python\Sign-Language-Recognition--MediaPipe-DTW-master\yt_download.py", line 53, in <module>
    download_video(*row)
  File "C:\Python\Sign-Language-Recognition--MediaPipe-DTW-master\yt_download.py", line 20, in download_video
    YouTube(f"https://www.youtube.com/watch?v={video_id}")
  File "C:\Python\Python38\lib\site-packages\pytube\__main__.py", line 292, in streams
    return StreamQuery(self.fmt_streams)
  File "C:\Python\Python38\lib\site-packages\pytube\__main__.py", line 177, in fmt_streams
    extract.apply_signature(stream_manifest, self.vid_info, self.js)
  File "C:\Python\Python38\lib\site-packages\pytube\extract.py", line 409, in apply_signature
    cipher = Cipher(js=js)
  File "C:\Python\Python38\lib\site-packages\pytube\cipher.py", line 43, in __init__
    self.throttling_plan = get_throttling_plan(js)
  File "C:\Python\Python38\lib\site-packages\pytube\cipher.py", line 387, in get_throttling_plan
    raw_code = get_throttling_function_code(js)
  File "C:\Python\Python38\lib\site-packages\pytube\cipher.py", line 301, in get_throttling_function_code
    code_lines_list = find_object_from_startpoint(js, match.span()[1]).split('\n')
AttributeError: 'NoneType' object has no attribute 'span'

Any suggestions?

Ian
