seniorcapstone's Introduction

Welcome to Wit

Wit is eye-tracking software that uses Pupil Labs Core eye-tracking glasses to read signs and any other text the user looks at. For example, it can read a book or look up restaurants when you look at an advertisement. We test the software on three kinds of input: images, video, and real-time data.

Pupil Labs

We used many of the resources that Pupil Labs has published, along with a few community projects, in building and connecting this software.

File Structure

We have split the program into a collection of files that connect to one another and are easy to modify. Each file has its own methods, which are either run from within the file or called from a Control file. The program can be split into two sections. The first is data collection (recording), in which the user wears the glasses and records what they are looking at. The second is the analysis of the recorded data.

Programs that Analyse

  • Wit.py: Images
  • WitVideo.py: Video
  • Real-time.py: Real-time

Text Extraction

Contains all the files needed to collect and manipulate the data that is used during analysis.

Frame Selection

Frame selection takes the gaze data from a given recording and uses it to find the video frame in which the user is staring at a single object. It does this by reading three columns from gaze_position.csv: norm_pos_x, norm_pos_y, and world_index (the closest world video frame).

  • Params: the gaze_position.csv file
  • image_data: an int that contains the selected frame number.

    [Images: Before and After]
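The README does not spell out the exact selection rule, so the following is only a minimal sketch under an assumed one: average the gaze samples per world_index, find the longest run of consecutive frames whose mean gaze stays close to where the run started, and return the middle frame of that run. The names load_gaze_by_frame, select_frame, and max_drift are hypothetical and not taken from the project.

```python
import csv
import io
from collections import defaultdict


def load_gaze_by_frame(csv_file):
    """Return {world_index: (mean_x, mean_y)} from an open gaze CSV file."""
    sums = defaultdict(lambda: [0.0, 0.0, 0])
    for row in csv.DictReader(csv_file):
        acc = sums[int(row["world_index"])]
        acc[0] += float(row["norm_pos_x"])
        acc[1] += float(row["norm_pos_y"])
        acc[2] += 1
    return {idx: (sx / n, sy / n) for idx, (sx, sy, n) in sums.items()}


def select_frame(gaze_by_frame, max_drift=0.05):
    """Pick the middle frame of the longest run of steady gaze.

    Assumption: "staring" means the per-frame mean gaze stays within
    max_drift (normalized units) of the first frame of the run.
    """
    frames = sorted(gaze_by_frame)
    if not frames:
        raise ValueError("no gaze data")
    run = [frames[0]]
    best_run = list(run)
    for idx in frames[1:]:
        anchor_x, anchor_y = gaze_by_frame[run[0]]
        x, y = gaze_by_frame[idx]
        steady = abs(x - anchor_x) <= max_drift and abs(y - anchor_y) <= max_drift
        if steady and idx == run[-1] + 1:
            run.append(idx)      # gaze is still on the same spot
        else:
            run = [idx]          # gaze moved: start a new run here
        if len(run) > len(best_run):
            best_run = list(run)
    return best_run[len(best_run) // 2]


# Made-up sample data in place of a real export.
SAMPLE = """norm_pos_x,norm_pos_y,world_index
0.10,0.10,0
0.50,0.50,1
0.51,0.50,1
0.52,0.49,2
0.50,0.51,3
0.90,0.20,4
"""

image_data = select_frame(load_gaze_by_frame(io.StringIO(SAMPLE)))
print(image_data)
```

In the sample, the gaze rests near (0.5, 0.5) during frames 1 to 3, so the middle frame of that run is selected.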

VideoFrameCapture

Splits the video into frames, saves them, retrieves them, and checks whether they have already been created. Contains three methods: videoToFrames(), grabbingFrame(), and frame_check().

  • videoToFrames

    • Uses file_dir to locate the world.mp4 video file, then splits it into frames and saves them to the Frames directory.
  • grabbingFrame

    • Finds the frame that matches the given frame number and returns its name as a string.
  • frame_check

    • Checks the /Frames directory for files. Designed to avoid re-creating the same frame files over and over while testing.
    • If there are none, it calls videoToFrames to create them.
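The caching behaviour described above can be sketched as follows. This is an illustration rather than the project's code: the frame file naming scheme, the snake_case function names, and the make_frames callback (standing in for videoToFrames) are assumptions. A real splitter would decode world.mp4, for example with OpenCV's cv2.VideoCapture; a stand-in that writes empty files is used here so the sketch runs with the standard library alone.

```python
import os
import tempfile


def frame_name(frame_number):
    """Hypothetical naming scheme: frame12.jpg for frame 12."""
    return f"frame{frame_number}.jpg"


def frame_check(frames_dir, make_frames):
    """Create frames only if frames_dir is missing or empty.

    make_frames stands in for videoToFrames(). Returns True if it ran.
    """
    os.makedirs(frames_dir, exist_ok=True)
    if os.listdir(frames_dir):
        return False             # frames already exist, nothing to do
    make_frames(frames_dir)
    return True


def grabbing_frame(frames_dir, frame_number):
    """Return the file name of the requested frame as a string."""
    name = frame_name(frame_number)
    if not os.path.isfile(os.path.join(frames_dir, name)):
        raise FileNotFoundError(name)
    return name


def fake_video_to_frames(frames_dir, count=3):
    """Stand-in splitter: writes empty files instead of decoding video."""
    for i in range(count):
        open(os.path.join(frames_dir, frame_name(i)), "wb").close()


with tempfile.TemporaryDirectory() as tmp:
    frames_dir = os.path.join(tmp, "Frames")
    first_run = frame_check(frames_dir, fake_video_to_frames)   # creates frames
    second_run = frame_check(frames_dir, fake_video_to_frames)  # reuses them
    picked = grabbing_frame(frames_dir, 2)

print(first_run, second_run, picked)
```

Passing the splitter in as a callback keeps the check itself independent of any video library, which makes it easy to exercise without a recording.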

Real_Time_Recording

Built on the Pupil Labs Network API (https://docs.pupil-labs.com/developer/core/network-api/). It is used to start a recording and collect real-time data. Contains three methods: start_stop_recording(), export_recording(), and data_collection().

  • start_stop_recording

    • Starts a recording and stops it after a given number of seconds. It is a slightly modified version of the Pupil Remote example found at the link above.
    • Params: Seconds (the desired length of the recording)
  • export_recording

    • Uses PowerShell commands to press the 'e' key once Pupil Player has started. This starts Pupil Player's export process, which creates a collection of files based on the data in the initial recording.
  • data_collection

    • Acting as a main method, data_collection takes the number of seconds that the user would like the recording to last. It then starts a recording and connects to the glasses locally to receive a live feed of the gaze coordinates. It stores that data in a dictionary that can be accessed later to compare against the post-processed data from Pupil Player.
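As a rough illustration of the start/stop step, the sketch below follows the Pupil Remote convention of sending "R" to start a recording and "r" to stop it over a ZeroMQ request socket. It is not the project's implementation: connect_pupil_remote, the sleep parameter, and the FakeRemote stand-in are hypothetical, and the default address assumes Pupil Capture is running locally on Pupil Remote's default port 50020.

```python
import time


def connect_pupil_remote(host="127.0.0.1", port=50020):
    """Open a REQ socket to Pupil Remote (needs pyzmq and Pupil Capture)."""
    import zmq  # imported lazily so the rest of the sketch runs without pyzmq

    socket = zmq.Context.instance().socket(zmq.REQ)
    socket.connect(f"tcp://{host}:{port}")
    return socket


def start_stop_recording(remote, seconds, sleep=time.sleep):
    """Record for `seconds` and return Pupil Remote's two replies."""
    remote.send_string("R")      # 'R' starts a recording
    started = remote.recv_string()
    sleep(seconds)
    remote.send_string("r")      # 'r' stops the recording
    stopped = remote.recv_string()
    return started, stopped


class FakeRemote:
    """Stand-in socket for trying the flow without the glasses attached."""

    def __init__(self):
        self.sent = []

    def send_string(self, message):
        self.sent.append(message)

    def recv_string(self):
        return "OK"              # placeholder reply, not Pupil Remote's real text


fake = FakeRemote()
replies = start_stop_recording(fake, seconds=0)
print(fake.sent, replies)
```

With the hardware connected, the same function would be called as start_stop_recording(connect_pupil_remote(), seconds=10).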

Tesseract

Contains the files needed to extract text from images and videos.

Wit (images)

Combines EAST and Tesseract to build bounding boxes around the text found within the given frame. It then takes the cleaned data frame from frame_selection and remaps its gaze coordinates to match Tesseract's bounding box grid. If any of the coordinates fall inside a bounding box that contains text, the word is added to a list that is returned once the program has finished.
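The remapping and overlap test can be sketched in a few lines. This is an assumed, simplified version: it relies on Pupil's normalized gaze coordinates having their origin at the bottom left of the frame while image pixels start at the top left, and it represents each box as a hypothetical (word, left, top, width, height) tuple in pixels, mirroring the fields Tesseract reports. The function names and sample values are made up for illustration.

```python
def gaze_to_pixel(norm_x, norm_y, width, height):
    """Convert normalized gaze (origin bottom-left) to pixels (origin top-left)."""
    return norm_x * width, (1.0 - norm_y) * height


def words_at_gaze(boxes, gaze_points, width, height):
    """Return the words whose box contains at least one gaze point.

    boxes: iterable of (word, left, top, box_width, box_height) in pixels.
    gaze_points: iterable of (norm_x, norm_y) pairs.
    """
    pixels = [gaze_to_pixel(x, y, width, height) for x, y in gaze_points]
    hits = []
    for word, left, top, box_w, box_h in boxes:
        inside = any(
            left <= px <= left + box_w and top <= py <= top + box_h
            for px, py in pixels
        )
        if inside:
            hits.append(word)
    return hits


# Made-up boxes and gaze samples for a 1280x720 frame.
boxes = [("EXIT", 100, 50, 200, 80), ("PUSH", 900, 600, 150, 60)]
gaze = [(0.15, 0.90), (0.16, 0.88)]
found = words_at_gaze(boxes, gaze, width=1280, height=720)
print(found)
```

Both gaze samples land near the top left of the frame, inside the first box only, so only that word is reported.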

Wit Video

Used to process video. By combining EAST and Tesseract, we take the world.mp4 file from the Pupil recording and create bounding boxes around the text that shows up in the video. We then take the raw coordinate data from frame_selection and convert it to the Tesseract grid. If any of the coordinates fall inside a bounding box that contains text, the word is printed out and saved to a list.

Contributors

wcosta-01
