
deepEye - The third eye for Visually Impaired People

OpenCV announced its first Spatial AI Competition, sponsored by Intel. OpenCV is a well-known open-source computer vision library. They called for participants to solve real-world problems using the OAK-D (OpenCV AI Kit with Depth) module. The OAK-D has built-in stereo cameras along with an RGB camera, and a powerful vision processing unit (Intel Myriad X) that enables on-board deep neural network inference.

We submitted a project proposal for this competition back in July, and our proposal was selected as one of 32 winners out of 235 submissions.

We proposed to build an advanced assistance system that helps visually impaired people perceive their environment better and provides them seamless, reliable navigation at a low cost, so that anyone can leverage the benefits of computer vision.

Demo Videos

👉 deepEye Demo


🎬 Software High-Level Design

HLD

🗃 Project structure

.
├── android
│   ├── apk                                 # Android APK file
│   │   └── app-debug.apk
│   └── startup_linux
│       ├── deepeye.sh                      # deepeye startup script to enable RFCOMM
│       └── rfcomm.service                  # systemd service for RFCOMM
│
├── custom_model
│   └── OI_Dataset                          # MobileNet SSD v2 custom training on Open Images Dataset V4
│       ├── README.md
│       ├── requirements.txt
│       ├── scripts
│       │   ├── csv2tfrecord.py             # TensorFlow: CSV to TFRecord converter
│       │   ├── txt2xml.py                  # TensorFlow: TXT to XML converter
│       │   └── xml2csv.py                  # TensorFlow: XML to CSV converter
│       └── tf_test.py                      # Test script for trained-model inference
│
├── deepeye_app                             # deepEye core application
│   ├── app.py                              # Object detection and post-processing
│   ├── calibration                         # Camera calibration
│   │   └── config
│   │       └── BW1098FFC.json
│   ├── collision_avoidance.py              # Collision calculation
│   ├── config.py
│   ├── models                              # MobileNet SSD v2 trained model
│   │   ├── mobilenet-ssd.blob
│   │   └── mobilenet-ssd_depth.json
│   ├── tracker.py                          # Object tracker
│   └── txt2speech                          # Text-to-speech component
│       ├── README.md
│       ├── txt2speech.py
│       └── txt-simulator.py
├── images
├── openvino_analysis                       # Accuracy/FPS analysis of Intel and open-source CNN models
│   ├── intel
│   │   ├── object-detection
│   │   └── semantic-segmentation
│   ├── public
│   │   ├── ssd_mobilenet_v2_coco
│   │   └── yolo-v3
│   └── README.md
├── README.md                               # deepEye README
├── requirements.txt
└── scripts                                 # OpenVINO toolkit scripts
    ├── inference_engine_native_myriad.sh
    ├── model_intel.sh
    └── rpi_openvino_install-2020_1.sh
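To give a feel for the kind of check `collision_avoidance.py` performs, here is a hypothetical, heavily simplified sketch. The function name, thresholds, and corridor bounds are illustrative assumptions; the real module works on the OAK-D's stereo depth output.

```python
# Hypothetical, simplified collision check: warn when a detected object's
# stereo-depth distance falls below a threshold inside the walking corridor.
from typing import Optional


def collision_warning(label: str, distance_m: float, x_center: float,
                      warn_distance_m: float = 1.5,
                      corridor: tuple = (0.3, 0.7)) -> Optional[str]:
    """Return a warning string if the object is close and roughly ahead.

    x_center is the object's normalized horizontal position in the frame
    (0.0 = far left, 1.0 = far right).
    """
    in_corridor = corridor[0] <= x_center <= corridor[1]
    if in_corridor and distance_m < warn_distance_m:
        return f"{label} ahead, {distance_m:.1f} meters"
    return None


print(collision_warning("person", 1.2, 0.5))  # person ahead, 1.2 meters
print(collision_warning("car", 4.0, 0.5))     # None: too far away
print(collision_warning("car", 1.0, 0.9))     # None: off to the side
```

A warning string like this is what would eventually be handed to the text-to-speech component.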

💻 Hardware prerequisites

📦 Software prerequisites

For Jetson: Flash the Jetson board with JetPack 4.4 ⚡️

microSD card preparation:

  1. Download the Jetson Nano Developer Kit SD card image (JetPack 4.4).
  2. Use Etcher to flash the image.

CUDA environment PATH:

if ! grep 'cuda/bin' ${HOME}/.bashrc > /dev/null ; then
  echo "** Adding CUDA paths to ~/.bashrc"
  echo >> ${HOME}/.bashrc
  echo "export PATH=/usr/local/cuda/bin:\${PATH}" >> ${HOME}/.bashrc
  echo "export LD_LIBRARY_PATH=/usr/local/cuda/lib64:\${LD_LIBRARY_PATH}" >> ${HOME}/.bashrc
fi
source ${HOME}/.bashrc

System dependencies:

sudo apt-get update
sudo apt-get install -y build-essential make cmake cmake-curses-gui
sudo apt-get install -y git g++ pkg-config curl libfreetype6-dev
sudo apt-get install -y libcanberra-gtk-module libcanberra-gtk3-module
sudo apt-get install -y python3-dev python3-testresources python3-pip
sudo pip3 install -U pip

Performance Improvements:

To set Jetson Nano to 10W performance mode (reference), execute the following from a terminal:

sudo nvpmodel -m 0
sudo jetson_clocks

Enable swap:

sudo fallocate -l 8G /mnt/8GB.swap
sudo mkswap /mnt/8GB.swap
sudo swapon /mnt/8GB.swap
if ! grep swap /etc/fstab > /dev/null; then \
    echo "/mnt/8GB.swap  none  swap  sw  0  0" | sudo tee -a /etc/fstab; \
fi

Jetson performance analysis:

pip3 install jetson-stats

Recompile the Jetson Linux kernel - RFCOMM TTY support:

We use the RFCOMM serial protocol for Jetson-Android communication, and the default kernel doesn't support RFCOMM TTY. So we have to recompile the kernel with an updated config and install it.

# Basic Update
sudo apt-get update
sudo apt-get install -y libncurses5-dev

# Download the Linux L4T (BSP) source code from the NVIDIA download center
wget https://developer.nvidia.com/embedded/L4T/r32_Release_v4.3/Sources/T210/public_sources.tbz2

tar -xvf public_sources.tbz2

cp Linux_for_Tegra/source/public/kernel_src.tbz2 ~/

pushd ~/

tar -xvf kernel_src.tbz2

pushd ~/kernel/kernel-4.9

zcat /proc/config.gz > .config

# Enable RFCOMM TTY
make menuconfig # Networking Support --> Bluetooth subsystem support ---> Select RFCOMM TTY Support ---> Save ---> Exit
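After saving, you can confirm the change directly in `.config`; these are the standard kernel config symbols for RFCOMM and its TTY emulation:

```
CONFIG_BT_RFCOMM=y
CONFIG_BT_RFCOMM_TTY=y
```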

make prepare

make modules_prepare

# Compile kernel as an image file
make -j5 Image

# Compile all kernel modules
make -j5 modules

# Install modules and kernel image
cd ~/kernel/kernel-4.9
sudo make modules_install
sudo cp arch/arm64/boot/Image /boot/Image

# Reboot 
sudo reboot

DepthAI Python API installation

# Install dep
curl -fL http://docs.luxonis.com/install_dependencies.sh | bash
sudo apt install libusb-1.0-0-dev

# USB Udev 
echo 'SUBSYSTEM=="usb", ATTRS{idVendor}=="03e7", MODE="0666"' | sudo tee /etc/udev/rules.d/80-movidius.rules
sudo udevadm control --reload-rules && sudo udevadm trigger

git clone https://github.com/luxonis/depthai-python.git
cd depthai-python
git submodule update --init --recursive
mkdir -p ~/depthai_v1
python3 -m venv ~/depthai_v1
source ~/depthai_v1/bin/activate
python3 -m pip install -U pip
python3 setup.py develop

# Check the Installation
python3 -c "import depthai"

# Install opencv
cd scripts
bash opencv.sh
cd ..

Camera Calibration

git clone https://github.com/luxonis/depthai.git ~/depthai
cp deepeye_app/calibration/config/BW1098FFC.json ~/depthai/resources/boards/
pushd ~/depthai/
python3 calibrate.py -s 2 -brd BW1098FFC -ih
popd

Robot Operating System (ROS)

We use the ROS framework for multi-process communication.

sudo sh -c 'echo "deb http://packages.ros.org/ros/ubuntu $(lsb_release -sc) main" > /etc/apt/sources.list.d/ros-latest.list'
sudo apt-key adv --keyserver 'hkp://keyserver.ubuntu.com:80' --recv-key C1CF6E31E6BADE8868B172B4F42ED6FBAB17C654

sudo apt update

sudo apt install -y ros-melodic-ros-base

# Env setup
echo "source /opt/ros/melodic/setup.bash" >> ~/.bashrc
source ~/.bashrc

# Dependencies to build ROS packages
sudo apt install python-rosdep python-rosinstall python-rosinstall-generator python-wstool build-essential

# Initialize rosdep
sudo rosdep init

rosdep update

Android RFCOMM Setup

We need to configure the rfcomm service in order to use the Android application's text-to-speech feature.

sudo cp android/startup_linux/deepeye.sh /usr/bin/
sudo chmod a+x /usr/bin/deepeye.sh

sudo cp android/startup_linux/rfcomm.service /etc/systemd/system/
sudo systemctl enable rfcomm
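For reference, a minimal `rfcomm.service` could look like the sketch below. This is an illustrative fragment, not necessarily the unit shipped in `android/startup_linux/`; the dependency on `bluetooth.service` and the restart policy are assumptions.

```
[Unit]
Description=RFCOMM serial bridge for deepEye
After=bluetooth.service
Requires=bluetooth.service

[Service]
ExecStart=/usr/bin/deepeye.sh
Restart=on-failure

[Install]
WantedBy=multi-user.target
```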

Other dependencies

python3 -m pip install -r requirements.txt

# SoX for text-to-speech
sudo apt-get install -y sox libsox-fmt-mp3

🖖 Quick Start

# Terminal one
# ROS Master
roscore &

# Terminal Two
# Deepeye core app
pushd deepeye_app
python3 app.py
popd

# Terminal three
# Txt2speech middleware component
pushd deepeye_app
python3 txt2speech/txt2speech.py
popd

🎛 Advanced uses

Custom Object Detector

We retrained a MobileNet SSD v2 model on the Open Images dataset, picking the object classes that best help visually impaired people navigate outdoor environments.

We have added a README covering the end-to-end training and the OpenVINO conversion required before loading the model onto the DepthAI.
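One small piece of that pipeline: the TensorFlow Object Detection API expects a label map (`.pbtxt`) mapping class names to integer ids starting at 1. A minimal generator is sketched below; the class names are illustrative examples, not the project's exact list.

```python
# Generate a TensorFlow Object Detection API label map (.pbtxt) from a class list.
# The class names below are illustrative; the project's actual classes differ.
def make_label_map(classes):
    items = []
    for idx, name in enumerate(classes, start=1):  # ids must start at 1
        items.append("item {\n  id: %d\n  name: '%s'\n}" % (idx, name))
    return "\n".join(items)


classes = ["person", "car", "bicycle", "traffic light"]
print(make_label_map(classes))
```

The resulting text is written alongside the TFRecords produced by `scripts/csv2tfrecord.py` and referenced from the training config.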

🛠 Hardware Details

We use the DepthAI USB3 modular camera board [BW1098FFC] for the POC, attached to a Raspberry Pi or Jetson host. The AI/vision processing is done on the DepthAI module, which is based on the Myriad X architecture.

depthAI

Key Features of the device:

  • 2 BG0250TG mono camera module interfaces
  • 1 BG0249 RGB camera module interface
  • 5V power input via barrel jack
  • USB 3.1 Gen 1 Type-C
  • Pads for DepthAI SoM 1.8V SPI
  • Pads for DepthAI SoM 3.3V SDIO
  • Pads for DepthAI SoM 1.8V Aux Signals (I2C, UART, GPIO)
  • 5V Fan/Aux header
  • Pads for DepthAI SoM aux signals
  • Design files produced with Altium Designer 20

💌 Acknowledgments

DepthAI Home Page
DepthAI core development
OpenVINO toolkit development
BW1098FFC_DepthAI_USB3 HW
OIDv4 ToolKit

Contributors

nullbyte91, dependabot[bot]
