YOLO-NAS-OpenVino.cpp

YOLO-NAS is a state-of-the-art object detector by Deci AI. This project implements the YOLO-NAS object detector in C++ with an OpenVINO backend to speed up inference.

Features

Supports both image and video inference.
Faster inference than native PyTorch and the ONNX C++ (OpenCV DNN) implementation; see Benchmarks.

Getting Started

The following instructions demonstrate how to build this project on Windows and on Linux systems supported by OpenVINO.

Prerequisites

CMake v3.8+ - found at https://cmake.org/
MSVC 2017+ (Windows Build) - MinGW will not work for the Windows build because the OpenVINO libraries are not compatible with MinGW.
GCC (Linux Build) - tested on v11.4.0.
OpenVINO Toolkit - tested on 2022.1. Download here.
OpenCV v4.0+ - tested on v4.7. Download here.

Building the project

1. Set the OpenCV_DIR environment variable to point to your ../../opencv/build directory.
2. Set the OpenVINO_DIR environment variable to point to your ../../openvino/runtime/cmake directory.
3. Run the following build commands:

a. [Windows] VS Developer Command Prompt:

cd /d <yolo-nas-openvino-cpp-directory>
cmake -S. -Bbuild -DCMAKE_BUILD_TYPE=Release
cd build

MSBuild yolo-nas-openvino-cpp.sln /property:Configuration=Release

b. [Linux] Bash:

cd <yolo-nas-openvino-cpp-directory>
cmake -S. -Bbuild -DCMAKE_BUILD_TYPE=Release
cd build

make

On Windows, the compiled .exe is placed in the Release folder. On Linux, the executable is placed in the root folder.

Inference

Export the ONNX file:

from super_gradients.training import models

# Load the small YOLO-NAS variant with COCO-pretrained weights.
model = models.get("yolo_nas_s", pretrained_weights="coco")
model.eval()
# Prepare and export the model with a fixed 1x3x640x640 input.
model.prep_model_for_conversion(input_size=(1, 3, 640, 640))
models.convert_to_onnx(model=model, prep_model_for_conversion_kwargs={"input_size": (1, 3, 640, 640)}, out_path="yolo_nas_s.onnx")

Convert the ONNX model to OpenVINO IR:

mo --input_model yolo_nas_s.onnx -s 255 --reverse_input_channels
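
The two flags on the mo command bake preprocessing into the converted model: -s 255 (short for --scale) divides the input values by 255, and --reverse_input_channels swaps the channel order between BGR and RGB. The sketch below illustrates the equivalent per-pixel arithmetic, assuming the application feeds OpenCV-style BGR pixels in the 0-255 range. The function name is hypothetical and is not part of this project or of OpenVINO.

```python
def embedded_preprocess(pixel_bgr, scale=255.0):
    """Mimic, for a single pixel, what the converted model does to its input.

    Hypothetical helper for illustration only.
    """
    b, g, r = pixel_bgr
    # --reverse_input_channels: reorder BGR -> RGB
    # -s 255: divide every channel by the scale value
    return (r / scale, g / scale, b / scale)


# A BGR pixel with full blue, no green, and a little red.
print(embedded_preprocess((255, 0, 51)))
```

Because these steps live inside the IR, the calling code presumably does not need to normalize or reorder channels itself before inference.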

To run the inference, execute the following command:

yolo-nas-openvino-cpp --model <OPENVINO_IR_XML_PATH> [-i <IMAGE_PATH> | -v <VIDEO_PATH>] [--imgsz IMAGE_SIZE] [--gpu] [--iou-thresh IOU_THRESHOLD] [--score-thresh CONFIDENCE_THRESHOLD]
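
The --score-thresh and --iou-thresh options presumably map to the two thresholds commonly used in detector post-processing: a confidence filter followed by non-maximum suppression (NMS). The sketch below shows that general technique in Python, assuming boxes in (x1, y1, x2, y2) format and class-agnostic suppression. It is not this project's C++ implementation, and the function names and default values are hypothetical.

```python
def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def filter_detections(boxes, scores, score_thresh=0.25, iou_thresh=0.45):
    """Return indices of boxes kept after score filtering and greedy NMS."""
    # Drop low-confidence detections, then visit the rest from best to worst.
    order = sorted(
        (i for i, s in enumerate(scores) if s >= score_thresh),
        key=lambda i: scores[i],
        reverse=True,
    )
    kept = []
    for i in order:
        # Keep a box only if it does not overlap a better box too much.
        if all(iou(boxes[i], boxes[j]) <= iou_thresh for j in kept):
            kept.append(i)
    return kept


boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30), (0, 0, 5, 5)]
scores = [0.9, 0.8, 0.7, 0.1]
print(filter_detections(boxes, scores))
```

Raising the score threshold removes more weak detections, while raising the IoU threshold lets more overlapping boxes survive.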

Benchmarks

The following benchmarks were run on Google Colab on an Intel® Xeon® Processor E5-2699 v4 @ 2.20GHz with 2 vCPUs.

Backend	Latency	FPS	Implementation
PyTorch	867.02ms	1.15	Native (`model.predict()` in `super_gradients`)
ONNX C++ (via OpenCV DNN)	962.27ms	1.04	Hyuto
ONNX Python	626.37ms	1.59	Hyuto
OpenVINO C++	628.04ms	1.59	Y-T-G
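
To make the comparison concrete, the latencies in the table above can be turned into speedups relative to the PyTorch baseline. The sketch below uses only the latency figures from the table; the variable names are illustrative.

```python
# Latencies in milliseconds, copied from the benchmark table above.
latency_ms = {
    "PyTorch": 867.02,
    "ONNX C++ (via OpenCV DNN)": 962.27,
    "ONNX Python": 626.37,
    "OpenVINO C++": 628.04,
}

# Speedup > 1 means faster than the PyTorch baseline.
baseline = latency_ms["PyTorch"]
speedup = {name: baseline / ms for name, ms in latency_ms.items()}

for name, factor in speedup.items():
    print(f"{name}: {factor:.2f}x")
```

By this measure the OpenVINO C++ backend is roughly 1.38x faster than native PyTorch and performs about on par with ONNX Python, while the ONNX C++ (OpenCV DNN) path is slower than the baseline at about 0.90x.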

Authors

Mohammed Yasin - @Y-T-G

Acknowledgements

Thanks to @Hyuto for his ONNX implementation of YOLO-NAS in C++, which was used in this project.

License

This project is licensed under the MIT License - see the LICENSE file for details.
