ksachdeva / opencv-mtcnn

An implementation of MTCNN Face detector using OpenCV's DNN module

License: Apache License 2.0

CMake 6.15% C++ 93.85%

opencv mtcnn opencv-dnn inference opencl dnn face-detection

opencv-mtcnn

This is an inference implementation of MTCNN (Multi-task Cascaded Convolutional Network) to perform Face Detection and Alignment using OpenCV's DNN module.

MTCNN

[ZHANG2016] Zhang, K., Zhang, Z., Li, Z., and Qiao, Y. (2016). Joint face detection and alignment using multitask cascaded convolutional networks. IEEE Signal Processing Letters, 23(10):1499–1503.

https://kpzhang93.github.io/MTCNN_face_detection_alignment/paper/spl.pdf
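As a rough illustration of the first stage of the cascade: MTCNN runs its proposal network (P-Net) over an image pyramid. The sketch below computes the pyramid scales following the convention of the reference implementation (a 12x12 P-Net window and a shrink factor of 0.709 per level). The function name `computePyramidScales` and its parameters are hypothetical and are not taken from this repository.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical helper: returns the factors by which the input image is resized
// before each pass through the proposal network (P-Net).
// Assumptions: P-Net looks at 12x12 windows, and every pyramid level shrinks
// the image by `scaleFactor` (0.709 in the reference implementation).
std::vector<double> computePyramidScales(int width, int height,
                                         double minFaceSize = 20.0,
                                         double scaleFactor = 0.709) {
    assert(width > 0 && height > 0);
    assert(minFaceSize > 0.0 && scaleFactor > 0.0 && scaleFactor < 1.0);

    const double pnetWindow = 12.0;
    // First scale: a face of minFaceSize pixels maps onto one 12x12 window.
    double scale = pnetWindow / minFaceSize;
    double minSide = std::min(width, height) * scale;

    std::vector<double> scales;
    // Keep shrinking until the shorter side no longer fits a 12x12 window.
    while (minSide >= pnetWindow) {
        scales.push_back(scale);
        scale *= scaleFactor;
        minSide *= scaleFactor;
    }
    return scales;
}
```

A larger `minFaceSize` gives fewer pyramid levels and therefore fewer P-Net passes, at the cost of missing smaller faces.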

OpenCV's DNN module

Since OpenCV 3.1 there is a module called DNN that provides inference support. The module can load models and weights from various popular frameworks such as Caffe, TensorFlow, and Darknet.

You can read more about it here - https://github.com/opencv/opencv/wiki/Deep-Learning-in-OpenCV

Note that at present OpenCV's DNN module does not support training and, if I understood correctly, there is no intention to add it.

Compile / Run

Requirements

OpenCV 3.4+
Boost FileSystem (1.58+) [only required for the sample application]
CMake 3.2+

I am using CMake as the build tool. Here are the steps to try the implementation:

# compiling the library and the sample application
git clone https://github.com/ksachdeva/opencv-mtcnn
cd opencv-mtcnn
mkdir build
cd build
cmake ..
cmake --build .

# running the sample application from the build directory (skip the cd if you are already there)
cd build
./sample/app <path_to_models_dir> <path_to_test_image>

# example command lines that use the models and images in this repository's data directory

# An image with no human faces (a picture of 4 dogs)
./sample/app ../data/models ../data/dogs.jpg

# An image with 1 face
./sample/app ../data/models ../data/Aaron_Peirsol_0003.jpg

# An image with 7 faces
./sample/app ../data/models ../data/2007_007763.jpg

Result

Here is an example of what running the sample application looks like:

Acknowledgments

Most implementations of MTCNN are based on either Caffe or TensorFlow. I wanted to play with OpenCV's DNN implementation and understand the paper a bit better. While implementing it, I looked at various other C++ implementations (again, all of them use Caffe) and, more specifically, borrowed utilities from https://github.com/golunovas/mtcnn-cpp. In my opinion, that Caffe-based C++ implementation is the cleanest among them.

The model files are taken from https://github.com/kpzhang93/MTCNN_face_detection_alignment/tree/master/code

The image file "Aaron_Peirsol_0003.jpg" is from the LFW database (http://vis-www.cs.umass.edu/lfw/)

The image files "dogs.jpg" and "2007_007763.jpg" are from dlib's GitHub repository (https://github.com/davisking/dlib/blob/master/examples/faces)

opencv-mtcnn's Issues

Using CUDA is slower

I tried to change your code to support CUDA, but with CUDA enabled the program is slower. Without CUDA it takes about 0.2 seconds to run inference on a picture; with CUDA it takes 1.6 seconds. My code is below. Why is this happening?

ProposalNetwork::ProposalNetwork(const ProposalNetwork::Config &config)
{
    _net = cv::dnn::readNetFromCaffe(config.protoText, config.caffeModel);

    if (_net.empty())
    {
        throw std::invalid_argument("invalid protoText or caffeModel");
    }
    else
    {
        if (config.useGPU)
        {
            _net.setPreferableBackend(cv::dnn::DNN_BACKEND_CUDA);
            _net.setPreferableTarget(cv::dnn::DNN_TARGET_CUDA);
            std::cout << "using CUDA" << std::endl;
        }
    }
    _threshold = config.threshold;
}

@ksachdeva
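One possible explanation, not confirmed in this thread, is that the measurement includes one-time start-up cost: the first forward passes on a GPU backend usually pay for context creation and memory allocation, and MTCNN feeds its networks inputs of many different sizes. A minimal timing sketch that separates warm-up runs from measured runs is shown below. `averageMillis` is a hypothetical helper, and the callable stands in for whatever runs one detection.

```cpp
#include <cassert>
#include <chrono>
#include <functional>

// Hypothetical helper: calls `work` a few times without timing it (warm-up),
// then returns the average wall-clock time in milliseconds over `timedRuns`.
double averageMillis(const std::function<void()> &work,
                     int warmupRuns, int timedRuns) {
    assert(warmupRuns >= 0 && timedRuns > 0);

    for (int i = 0; i < warmupRuns; ++i) {
        work();  // excluded from the measurement
    }

    const auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < timedRuns; ++i) {
        work();
    }
    const auto stop = std::chrono::steady_clock::now();

    const std::chrono::duration<double, std::milli> elapsed = stop - start;
    return elapsed.count() / timedRuns;
}
```

Wrapping a single call to the detector in the lambda and comparing the CPU and CUDA averages after a few warm-up runs would show whether the 1.6 seconds is a steady-state figure or mostly initialization.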

Why does rgbImg need a .t()?

Hello, I have one question about the input data processing in MTCNNDetector::detect().
rgbImg.convertTo(rgbImg, CV_32FC3); rgbImg = rgbImg.t();
Why does rgbImg need a .t()?
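A likely reason, which the author has not confirmed in this thread, is that the original MTCNN Caffe models were trained through MATLAB, which stores images column-major, so the networks expect the height and width axes swapped relative to OpenCV's row-major layout. The sketch below uses a plain array rather than `cv::Mat` to show what a transpose does to the memory layout. `transposeRowMajor` is a hypothetical helper.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: transposes a rows x cols matrix stored row-major.
// The result is a cols x rows matrix, also stored row-major.
std::vector<float> transposeRowMajor(const std::vector<float> &src,
                                     std::size_t rows, std::size_t cols) {
    assert(src.size() == rows * cols);

    std::vector<float> dst(src.size());
    for (std::size_t r = 0; r < rows; ++r) {
        for (std::size_t c = 0; c < cols; ++c) {
            // Element (r, c) of the source becomes element (c, r) of the result.
            dst[c * rows + r] = src[r * cols + c];
        }
    }
    return dst;
}
```

Under that assumption, any coordinates the networks return are in the transposed frame, so x and y have to be swapped back when mapping boxes and landmarks onto the original image.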

False detections

Hi Kapil,

Thanks for the wonderful project. Compiling the code with OpenCV was straightforward.

I modified the sample application to take input from my webcam. Occasionally, a false detection pops up alongside the real face. I'm trying to debug this. Any pointers would be great.

Thanks,
San
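One common way to suppress occasional false positives, offered here as a general suggestion rather than this project's documented advice, is to raise the confidence threshold applied to the final detections. The sketch below uses a hypothetical `Detection` struct and `keepConfident` helper; the field names are assumptions and do not mirror the `Face` type in this repository.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <vector>

// Hypothetical detection record: a bounding box plus a confidence score.
struct Detection {
    float x1, y1, x2, y2;
    float score;  // assumed to lie in [0, 1]
};

// Returns only the detections whose score is at least `minScore`.
std::vector<Detection> keepConfident(const std::vector<Detection> &dets,
                                     float minScore) {
    assert(minScore >= 0.f && minScore <= 1.f);

    std::vector<Detection> kept;
    std::copy_if(dets.begin(), dets.end(), std::back_inserter(kept),
                 [minScore](const Detection &d) { return d.score >= minScore; });
    return kept;
}
```

Logging the scores of the spurious boxes first would show whether a stricter threshold can separate them from the real face at all.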

increased floating point prediction precision

Hello,

I see that the predictions returned by MTCNN's detect_faces are integers.
Is there a way to get higher precision on its predictions?

[{'box': [33, 137, 443, 580], 'confidence': 0.9859113693237305, 'keypoints': {'left_eye': (169, 330), 'right_eye': (380, 335), 'nose': (276, 440), 'mouth_left': (162, 551), 'mouth_right': (369, 554)}}]

I would like to see 'left_eye': (169.xx, 330.yy) instead of just 'left_eye': (169, 330)

get cv::Exception from RefineNetwork::run

When I try to detect a face in an image, I get:
terminate called after throwing an instance of 'cv::Exception'
what(): OpenCV(4.0.0-pre) /opt/opencv/modules/dnn/src/dnn.cpp:142: error: (-215:Assertion failed) !images.empty() in function 'blobFromImages'

Thread 1 "faceengine" received signal SIGABRT, Aborted.
0x00007fffd8f50428 in __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:54
54 ../sysdeps/unix/sysv/linux/raise.c: No such file or directory.
(gdb) bt
#0 0x00007fffd8f50428 in __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:54
#1 0x00007fffd8f5202a in __GI_abort () at abort.c:89
#2 0x00007fffd98928f7 in ?? () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#3 0x00007fffd9898a46 in ?? () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#4 0x00007fffd9898a81 in std::terminate() () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#5 0x00007fffd9898cb4 in __cxa_throw () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#6 0x00007fffd9dcb52a in cv::error(cv::Exception const&) () from /usr/local/lib/libopencv_core.so.4.0
#7 0x00007fffd9dcc36f in cv::error(int, cv::String const&, char const*, char const*, int) () from /usr/local/lib/libopencv_core.so.4.0
#8 0x00007fffdacdb932 in cv::dnn::experimental_dnn_34_v7::blobFromImages(cv::InputArray const&, cv::OutputArray const&, double, cv::Size, cv::Scalar const&, bool, bool, int) ()
from /usr/local/lib/libopencv_dnn.so.4.0
#9 0x00007fffdacdbc51 in cv::dnn::experimental_dnn_34_v7::blobFromImages(cv::InputArray const&, double, cv::Size, cv::Scalar const&, bool, bool, int) ()
from /usr/local/lib/libopencv_dnn.so.4.0
#10 0x0000000000564ab3 in RefineNetwork::run(cv::Mat const&, std::vector<Face, std::allocator<Face> > const&) ()
#11 0x000000000055d496 in MTCNNDetector::detect(cv::Mat const&, float, float) ()
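
The assertion `!images.empty()` indicates that `blobFromImages` was handed an empty list of images. A plausible cause, not verified against this repository's code, is that the previous stage produced no candidate faces and the refine stage was run anyway. The sketch below shows the kind of early-return guard that avoids this. `Candidate`, `BatchFn`, and `refineStage` are hypothetical and only mimic the shape of `RefineNetwork::run`.

```cpp
#include <cassert>
#include <functional>
#include <vector>

// Hypothetical candidate face: a bounding box plus a score.
struct Candidate {
    float x1, y1, x2, y2, score;
};

// Stand-in for the code that crops the candidates, builds a blob, and runs
// the network. It must never be called with an empty batch.
using BatchFn =
    std::function<std::vector<Candidate>(const std::vector<Candidate> &)>;

std::vector<Candidate> refineStage(const std::vector<Candidate> &candidates,
                                   const BatchFn &runNetwork) {
    // Guard: with nothing to refine, skip the network instead of building an
    // empty batch.
    if (candidates.empty()) {
        return {};
    }
    std::vector<Candidate> refined = runNetwork(candidates);
    // A refine stage only filters or adjusts candidates; it never adds any.
    assert(refined.size() <= candidates.size());
    return refined;
}
```

Catching `cv::Exception` around the call to `detect` would also keep the process alive, but the guard addresses the empty batch directly.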