
opencv-mtcnn

This is an inference implementation of MTCNN (Multi-task Cascaded Convolutional Network) to perform Face Detection and Alignment using OpenCV's DNN module.

MTCNN

[ZHANG2016] Zhang, K., Zhang, Z., Li, Z., and Qiao, Y. (2016). Joint face detection and alignment using multitask cascaded convolutional networks. IEEE Signal Processing Letters, 23(10):1499–1503.

https://kpzhang93.github.io/MTCNN_face_detection_alignment/paper/spl.pdf
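
The first stage of the cascade is run on an image pyramid. As a rough sketch of how the pyramid scales are commonly derived, the snippet below assumes a 12x12 first-stage input window (as in the paper) and takes a minimum face size and a scale factor as parameters. The name `computePyramidScales` and its parameters are illustrative and are not taken from this repository.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Returns the factors by which the input image is resized before each pass of
// the first-stage network. Assumes a 12x12 first-stage input window.
std::vector<double> computePyramidScales(int width, int height,
                                         double minFaceSize,
                                         double scaleFactor) {
    assert(minFaceSize > 0.0 && scaleFactor > 0.0 && scaleFactor < 1.0);
    const double kWindow = 12.0;
    std::vector<double> scales;
    // At the first scale, a face of minFaceSize pixels maps onto the window.
    double scale = kWindow / minFaceSize;
    double minSide = std::min(width, height) * scale;
    // Keep shrinking until the resized image is smaller than the window.
    while (minSide >= kWindow) {
        scales.push_back(scale);
        scale *= scaleFactor;
        minSide *= scaleFactor;
    }
    return scales;
}
```

A smaller minimum face size produces more pyramid levels and therefore more forward passes, which is usually the main cost driver of the first stage.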

OpenCV's DNN module

Since OpenCV 3.1 there is a module called DNN that provides inference support. The module can load models and weights from various popular frameworks such as Caffe, TensorFlow, and Darknet.

You can read more about it here: https://github.com/opencv/opencv/wiki/Deep-Learning-in-OpenCV

Note that OpenCV's DNN module currently has no support for training and, if I understood correctly, there is no intention to add it.

Compile / Run

Requirements

  • OpenCV 3.4+
  • Boost FileSystem (1.58+) [only required for the sample application]
  • CMake 3.2+

I am using CMake as the build tool. Here are the steps to try the implementation:

# compiling the library and the sample application
git clone https://github.com/ksachdeva/opencv-mtcnn
cd opencv-mtcnn
mkdir build
cd build
cmake ..
cmake --build .

# running the sample application (from the repository root)
cd build
./sample/app <path_to_models_dir> <path_to_test_image>

# here are some example command lines to run with the models and images in the test repository

# An image with 0 human faces (a picture of 4 dogs)
./sample/app ../data/models ../data/dogs.jpg

# An image with 1 face
./sample/app ../data/models ../data/Aaron_Peirsol_0003.jpg

# An image with 7 faces
./sample/app ../data/models ../data/2007_007763.jpg

Result

Here is an example of what running the sample application looks like:

[Image: Result]

Acknowledgments

Most implementations of MTCNN are based on either Caffe or TensorFlow. I wanted to play with OpenCV's DNN implementation and understand the paper a bit better. While implementing it, I looked at various other C++ implementations (again, all of them use Caffe) and, more specifically, borrowed utilities from https://github.com/golunovas/mtcnn-cpp. IMHO, his Caffe-based C++ implementation is the cleanest among them.

The model files are taken from https://github.com/kpzhang93/MTCNN_face_detection_alignment/tree/master/code

The image file "Aaron_Peirsol_0003.jpg" is from the LFW database (http://vis-www.cs.umass.edu/lfw/)

The image files "dogs.jpg" and "2007_007763.jpg" are from dlib's GitHub repository (https://github.com/davisking/dlib/blob/master/examples/faces)

opencv-mtcnn's Issues

Using CUDA is slower

I tried changing your code to support CUDA, but with CUDA the program is slower. Without CUDA it takes about 0.2 seconds to run inference on a picture; with CUDA it takes 1.6 seconds. My code is below. Why is this happening?


ProposalNetwork::ProposalNetwork(const ProposalNetwork::Config &config)
{
    _net = cv::dnn::readNetFromCaffe(config.protoText, config.caffeModel);

    if (_net.empty())
    {
        throw std::invalid_argument("invalid protoText or caffeModel");
    }
    else
    {
        if (config.useGPU)
        {
            _net.setPreferableBackend(cv::dnn::DNN_BACKEND_CUDA);
            _net.setPreferableTarget(cv::dnn::DNN_TARGET_CUDA);
            std::cout << "using CUDA" << std::endl;
        }
    }
    _threshold = config.threshold;
}

@ksachdeva
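
One thing that may be worth ruling out is how the time is measured. The first forward pass on a GPU backend often includes one-time initialization, and MTCNN runs many small forward passes per image, so a single timed call can be misleading. Below is a minimal sketch of a timing helper that discards warm-up runs. `averageMillis` is a hypothetical name, and the callable stands in for whatever invokes the detector; this does not by itself explain the slowdown reported above.

```cpp
#include <cassert>
#include <chrono>
#include <functional>

// Runs `work` a few times untimed (warm-up), then returns the mean wall-clock
// time in milliseconds over `timedRuns` further calls.
double averageMillis(const std::function<void()>& work, int warmupRuns,
                     int timedRuns) {
    assert(warmupRuns >= 0 && timedRuns > 0);
    for (int i = 0; i < warmupRuns; ++i) {
        work();  // not measured: absorbs one-time setup costs
    }
    const auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < timedRuns; ++i) {
        work();
    }
    const auto stop = std::chrono::steady_clock::now();
    const std::chrono::duration<double, std::milli> elapsed = stop - start;
    return elapsed.count() / timedRuns;
}
```

Comparing the CPU and CUDA paths with the same helper, on the same image, gives numbers that are easier to reason about than a single first call.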

Why does rgbImg need a .t()?

Hello, I have one question about the input data processing in MTCNNDetector::detect().
rgbImg.convertTo(rgbImg, CV_32FC3); rgbImg = rgbImg.t();
Why does rgbImg need a .t()?
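
A likely explanation, offered as an assumption rather than the author's confirmed reasoning: the original MTCNN models were trained through a MATLAB interface, where image data is stored column-major, so the networks expect the two spatial axes swapped relative to OpenCV's row-major layout. The sketch below uses plain vectors (no OpenCV) to show the effect of such a transpose; `transposeRowMajor` is a hypothetical helper.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Transposes a rows x cols matrix stored row-major. The result is a
// cols x rows matrix, also stored row-major.
std::vector<float> transposeRowMajor(const std::vector<float>& src,
                                     std::size_t rows, std::size_t cols) {
    assert(src.size() == rows * cols);
    std::vector<float> dst(src.size());
    for (std::size_t r = 0; r < rows; ++r) {
        for (std::size_t c = 0; c < cols; ++c) {
            // Element (r, c) of the source becomes element (c, r).
            dst[c * rows + r] = src[r * cols + c];
        }
    }
    return dst;
}
```

The transposed buffer holds the values in the same order a column-major store of the original matrix would, which is why a transpose is a common way to bridge the two conventions.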

False detections

Hi Kapil,

Thanks for the wonderful project. Compiling the code with OpenCV was straightforward.

I modified the sample application to take input from my webcam. Occasionally, a false detection pops up alongside the real face. I'm trying to debug this, and any pointers would be great.

Thanks,
San

increased floating point prediction precision

Hello,

I see that the MTCNN detect_faces predictions are integers.
Is there a way to get higher precision on its predictions?

[{'box': [33, 137, 443, 580], 'confidence': 0.9859113693237305, 'keypoints': {'left_eye': (169, 330), 'right_eye': (380, 335), 'nose': (276, 440), 'mouth_left': (162, 551), 'mouth_right': (369, 554)}}]

I would like to see 'left_eye': (169.xx, 330.yy) instead of just 'left_eye': (169, 330)

S

get cv::Exception from RefineNetwork::run

When I try to detect a face in an image, I get:
terminate called after throwing an instance of 'cv::Exception'
what(): OpenCV(4.0.0-pre) /opt/opencv/modules/dnn/src/dnn.cpp:142: error: (-215:Assertion failed) !images.empty() in function 'blobFromImages'

img_to_detect_2

Thread 1 "faceengine" received signal SIGABRT, Aborted.
0x00007fffd8f50428 in __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:54
54 ../sysdeps/unix/sysv/linux/raise.c: No such file or directory.
(gdb) bt
#0 0x00007fffd8f50428 in __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:54
#1 0x00007fffd8f5202a in __GI_abort () at abort.c:89
#2 0x00007fffd98928f7 in ?? () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#3 0x00007fffd9898a46 in ?? () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#4 0x00007fffd9898a81 in std::terminate() () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#5 0x00007fffd9898cb4 in _cxa_throw () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#6 0x00007fffd9dcb52a in cv::error(cv::Exception const&) () from /usr/local/lib/libopencv_core.so.4.0
#7 0x00007fffd9dcc36f in cv::error(int, cv::String const&, char const*, char const*, int) () from /usr/local/lib/libopencv_core.so.4.0
#8 0x00007fffdacdb932 in cv::dnn::experimental_dnn_34_v7::blobFromImages(cv::InputArray const&, cv::OutputArray const&, double, cv::Size, cv::Scalar const&, bool, bool, int) () from /usr/local/lib/libopencv_dnn.so.4.0
#9 0x00007fffdacdbc51 in cv::dnn::experimental_dnn_34_v7::blobFromImages(cv::InputArray const&, double, cv::Size, cv::Scalar const&, bool, bool, int) () from /usr/local/lib/libopencv_dnn.so.4.0
#10 0x0000000000564ab3 in RefineNetwork::run(cv::Mat const&, std::vector<Face, std::allocator > const&) ()
#11 0x000000000055d496 in MTCNNDetector::detect(cv::Mat const&, float, float) ()
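
The assertion `!images.empty()` in `blobFromImages` fires when it is handed an empty list of images. One plausible cause, stated as an assumption because the source of `RefineNetwork::run` is not shown here, is that the previous stage produced no candidate faces for this image. Below is a minimal sketch of an early-return guard using stand-in types; `Candidate`, `runStage`, and the `batchForward` callable are hypothetical and not this library's API.

```cpp
#include <cassert>
#include <functional>
#include <vector>

struct Candidate {
    float x1, y1, x2, y2;  // bounding box corners
    float score;           // confidence from the previous stage
};

using StageFn =
    std::function<std::vector<Candidate>(const std::vector<Candidate>&)>;

// Runs one cascade stage. `batchForward` stands in for the code that crops
// the candidates, builds a blob, and calls the network; it is never invoked
// with an empty batch.
std::vector<Candidate> runStage(const std::vector<Candidate>& candidates,
                                const StageFn& batchForward) {
    if (candidates.empty()) {
        return {};  // nothing to refine, so skip the network entirely
    }
    std::vector<Candidate> kept = batchForward(candidates);
    // Assumption: a stage only filters or adjusts candidates, never adds any.
    assert(kept.size() <= candidates.size());
    return kept;
}
```

Alternatively, wrapping the `detect` call in a `try`/`catch` for `cv::Exception` avoids the `std::terminate` shown in the backtrace, though it hides the underlying cause.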

python code

If you have Python code for this project, please publish it.

Is the detection thread-safe?

I have an issue when using the detector from multiple threads: my process dumps core. If I remove the call to the detector, there is no core dump.
How can I check or fix this?
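
Whether this detector is safe to call concurrently is not stated in the README. As a hedged sketch, assuming a detector instance holds mutable network state that should not be shared between threads without synchronization, one option is to serialize calls behind a mutex (another is to create one detector instance per thread). `SerializedDetector` is a hypothetical wrapper, and `CountingDetector` is a stand-in used only to demonstrate it.

```cpp
#include <cassert>
#include <mutex>
#include <utility>

// Wraps a detector so that only one thread at a time can call into it.
template <typename Detector>
class SerializedDetector {
public:
    explicit SerializedDetector(Detector detector)
        : detector_(std::move(detector)) {}

    // Forwards to Detector::detect while holding the lock.
    template <typename... Args>
    auto detect(Args&&... args)
        -> decltype(std::declval<Detector&>().detect(
            std::forward<Args>(args)...)) {
        std::lock_guard<std::mutex> lock(mutex_);
        return detector_.detect(std::forward<Args>(args)...);
    }

private:
    std::mutex mutex_;
    Detector detector_;
};

// Stand-in detector for demonstration only; not part of this library.
struct CountingDetector {
    int calls = 0;
    int detect(int value) {
        ++calls;
        return value + 1;
    }
};
```

If the crash disappears with calls serialized this way, shared state inside the detector is the likely culprit, and one instance per thread would restore parallelism at the cost of loading the models several times.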
