facealignment's Introduction

### Face Alignment

This program is a C++ reimplementation of the algorithms in the paper "Face Alignment by Explicit Shape Regression" by Cao et al. It can be used to train models for detecting facial keypoints, and it is extremely fast in both training and testing.

Please go to the FaceAlignment folder to see the source files.

### Update

  • Nov 13, 2014: Improved the speed of model training. It now takes about 40 minutes to train a model on 1345 images with 20 initial shapes each on a Core i7 3.40 GHz CPU. Considering that no parallel programming is used, this performance is acceptable.

### Usage

To compile the program (OpenCV required):

cd FaceAlignment   # go to the FaceAlignment folder
cmake .
make TrainDemo.out
make TestDemo.out

To train a new model:

ShapeRegressor regressor;
regressor.Train(images,ground_truth_shapes,bounding_box,first_level_num,second_level_num,
                    candidate_pixel_num,fern_pixel_num,initial_number);
regressor.Save("./data/model.txt");
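
For reference, the parameter values reported in the ESR paper are sketched below. These are assumptions on my part, not necessarily what this repository uses; TrainDemo.cpp is the authoritative source.

// Parameter values as reported in the ESR paper (assumed defaults;
// check TrainDemo.cpp for the values this repository actually uses).
int first_level_num     = 10;   // outer-level cascade stages
int second_level_num    = 500;  // ferns per stage (inner level)
int candidate_pixel_num = 400;  // candidate shape-indexed pixels per stage
int fern_pixel_num      = 5;    // pixel-difference features per fern
int initial_number      = 20;   // initial shapes per image (augmentation)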

To run prediction on a new input:

ShapeRegressor regressor;
regressor.Load("./data/model_cofw_2.txt");
regressor.Predict(test_images[index],bounding_box[index],initial_number);

For details, please see TrainDemo.cpp and TestDemo.cpp.

### Dataset

A public dataset is provided here. It contains 1345 training images and 507 testing images, and each image is annotated with 29 landmarks. You can change the paths in TrainDemo.cpp and TestDemo.cpp to train new models.

### Model

I have prepared a model trained on the COFW dataset; you can access it here.

### FAQ

  • How do I get the bounding box of an input face image? You can get it with a face detector, such as the one implemented in OpenCV (see the sketch after this list). However, remember that if you use the model I provide, you must supply a bounding box measured the same way as in the training data; otherwise the result will be poor. If the bounding boxes in the training data are very small but you provide a very large bounding box at test time, you will certainly get a poor result. "The same measure" does not mean the boxes have to be the same size, but they must be produced by the same standard; for example, the ratio between the bounding-box width and the inter-eye distance should be the same.

  • There still seem to be some errors when I test; is something wrong? Generally speaking, the attached dataset is very challenging because it includes heavy occlusions and large shape variations. You could try other standard datasets such as Helen and LFPW, which should lead to better performance.

  • What is the format of keypoints.txt and boundingbox.txt? In boundingbox.txt, each row specifies the bounding box of the face in the corresponding image, in the following format:

x       // x coordinate of the top-left corner
y       // y coordinate of the top-left corner
width
height

In keypoints.txt, each row specifies the ground-truth keypoint locations, in the following format:

x_1 x_2 ... x_N y_1 y_2 ... y_N
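
As mentioned in the first question above, a bounding box can be obtained with a face detector. Below is a minimal sketch using OpenCV's Haar cascade classifier; the cascade file path and image path are assumptions (adjust them to your setup), and the printed row simply follows the boundingbox.txt format described here.

#include <opencv2/objdetect/objdetect.hpp>
#include <opencv2/highgui/highgui.hpp>
#include <iostream>
#include <vector>

int main() {
    cv::CascadeClassifier detector;
    // Assumed path; this file ships with standard OpenCV installs.
    if (!detector.load("haarcascade_frontalface_alt.xml")) return 1;

    cv::Mat image = cv::imread("test.jpg", 0);  // read as grayscale
    std::vector<cv::Rect> faces;
    detector.detectMultiScale(image, faces, 1.1, 3, 0, cv::Size(30, 30));

    for (size_t i = 0; i < faces.size(); i++) {
        // One row in boundingbox.txt format: x y width height
        std::cout << faces[i].x << " " << faces[i].y << " "
                  << faces[i].width << " " << faces[i].height << std::endl;
    }
    return 0;
}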

### Contact

If you have any questions about the code, please create an issue on GitHub rather than emailing me directly, so that others can also refer to it when they run into the same problems. I will respond as soon as possible.

### Sample Results

[sample result images]

### Reference

Cao, X., Wei, Y., Wen, F., Sun, J. "Face Alignment by Explicit Shape Regression". CVPR 2012.

facealignment's Issues

How to accelerate the algorithm, and is OpenMP OK?

Hi, I used the test demo to test 157 images, and it took 5129 ms in total. I just run a loop:

for (int index = 0; index < 157; index++)
    current_shape = regressor.Predict(test_images[index], test_bounding_box[index], initial_number);

That averages about 30 ms per image, with the initial number set to 2. Can you tell me why the speed is so much slower than in the paper? Also, 3000fps uses OpenMP to accelerate; can this project also use OpenMP? (See the sketch below.)
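
A minimal sketch of parallelizing that prediction loop with OpenMP, in answer to the question above. It assumes ShapeRegressor::Predict only reads the trained model state, so concurrent calls are safe; that assumption must be verified against this implementation before relying on it. Build with -fopenmp.

#include <vector>
#include <opencv2/core/core.hpp>
#include "FaceAlignment.h"  // assumed repo header declaring ShapeRegressor, BoundingBox

std::vector<cv::Mat_<double> > PredictAll(ShapeRegressor& regressor,
        const std::vector<cv::Mat_<uchar> >& test_images,
        const std::vector<BoundingBox>& test_bounding_box,
        int initial_number) {
    std::vector<cv::Mat_<double> > shapes(test_images.size());
    #pragma omp parallel for
    for (int i = 0; i < (int)test_images.size(); i++) {
        // Each image is independent; Predict is assumed read-only here.
        shapes[i] = regressor.Predict(test_images[i],
                                      test_bounding_box[i],
                                      initial_number);
    }
    return shapes;
}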

Is there a model for more landmarks?

Thanks a lot for your kind sharing, but the model you shared is for 29 landmarks. Could you share a model with more landmarks, say 64 or 83?

Couldn't get output model.txt after training

Thanks for your code! I compiled it with CMake and ran TrainDemo.out. It took 36 minutes to train on the data, but afterwards I couldn't find ./data/model.txt, which was supposed to appear in my current directory. How should I fix this problem? Thank you!

What can we do to improve this method?

Thank you for sharing your code; I think it is excellent. I have trained a 68-landmark model with it. What can we do to improve this method (I think it is already very good)? Do you have any good ideas?
Thank you!

face detector

If I use the model that you have trained and provided, which face detector would be preferred? The one implemented in OpenCV?

Are the keypoints.txt landmarks wrong?

[screenshot: mark_face.jpg]
Are the landmarks read from keypoints.txt wrong? Here is my loading code:
// Excerpt from my program; shapes is a landmark matrix, testImg1 a test image.
ifstream fin2;
locale::global(locale(""));
fin2.open("oursTrain/keypoints.txt");
locale::global(locale("C"));
string line;
int sample_freq = 0;

while (getline(fin2, line)) {
    sample_freq++;
    printf("loading test:%d\n", sample_freq);
    // NOTE: getline() above already consumed one full line, so the token
    // reads below start on the *next* line; mixing getline with operator>>
    // like this misaligns every sample after the first.
    for (int j = 0; j < 29; j++) {
        fin2 >> shapes(j, 0);  // x coordinates
    }
    for (int j = 0; j < 29; j++) {
        fin2 >> shapes(j, 1);  // y coordinates
    }
}

for (int m = 0; m < shapes.rows; ++m) {
    circle(testImg1, Point(shapes(m, 0), shapes(m, 1)), 3, Scalar(255));
}
imwrite("mark_face.jpg", testImg1);

I find that the landmarks drawn on testImg1 are wrong. Thanks!

two level cascaded regression

Could you give me more detailed information about the two-level cascaded regression, please? Do you have any slides with more information about ESR than this one? Please provide more information about the regression and the regressors used in ESR. (A structural sketch follows below.)
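
For readers with the same question, an illustrative outline of ESR's two-level cascade, as I read the paper; the types and helper names here are hypothetical, not this repository's actual code. The outer level runs first_level_num sequential stages; each stage fixes its shape-indexed features once and then applies second_level_num ferns (the inner level), each contributing a small additive shape increment.

// Hypothetical outline of the two-level cascade (not the repo's code).
for (int stage = 0; stage < first_level_num; stage++) {
    // Outer level: features are indexed relative to the current shape
    // estimate and stay fixed for all ferns inside this stage.
    Features f = SampleShapeIndexedFeatures(image, current_shape, stage);
    for (int k = 0; k < second_level_num; k++) {
        // Inner level: each fern maps a few pixel-difference features
        // to an additive shape increment.
        current_shape += ferns[stage][k].Predict(f);
    }
}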

Bounding Box issue

We trained a model on the LFPW dataset with 68 landmark points, using 1300 images for training, and provided bounding boxes from the Viola-Jones face detector as implemented in OpenCV. It took about 10 hours to create the model on an Intel(R) Xeon(R) processor, but we are not getting the expected results during testing.
Is there any relation between the number of training images and the number of landmarks? Any other suggestions?

Reg. Training for face alignment

When I started running TrainDemo.cpp, it took more than a day. What is the minimum number of images required for training, and what are the minimum values of first_level_num and second_level_num? In your code you set first_level_num = 10.

Segmentation Fault in test code

First, I tried to test your code with the trained model already provided here: https://drive.google.com/file/d/0B0tUTCaZBkccOGZTcjJNcDMwa28/edit?usp=sharing

but I ran into a couple of issues:

  1. The output window only shows the face image for the given index, without the alignment data drawn on it.
  2. I tried to print the values of current_shape inside the last for loop in TestDemo.cpp as
    cout<<current_shape(i,0)<<"\t"<<current_shape(i,1)<<"\n";
    but all this prints is nan.

So I thought something was wrong with the given model and decided to train my own. I trained on the training set provided in the COFW dataset, which took about 30 minutes. Now, when I test this model with the same test code (with updated paths to the new model), I keep getting a segmentation fault whenever I access current_shape(i,0) and current_shape(i,1).

Let me know what further can be done to fix this issue.

training data

Hi, I don't understand the meaning of the parameter initial_number.

I am trying to train with 5 landmarks on the MTFL database.

Runtime error on Ubuntu 14.04

I have compiled the code successfully on my Ubuntu 14.04 machine, and I've changed the data paths in both TrainDemo.cpp and TestDemo.cpp. The training data I used is the COFW dataset, and when I ran TrainDemo.out it crashed at line 104 of FernCascade.cpp while training fern cascade 1 of 10. Even when I skipped the training step, used the model.txt you provided, and ran TestDemo.out, it still crashed at line 137 of ShapeRegressor.cpp while loading the model.txt file. I checked the location of the model file and I think it is correct. Since I'm not familiar with the details of your implementation, I find it hard to figure out a solution. Could you tell me how to solve these problems? Thank you.

about boundingbox

I found numbers over 10,000 in boundingbox.txt. I think they are the width or height of face boxes, but can they really be that large?

SimilarityTransform, regress target

  1. What is the theory behind SimilarityTransform? I can't understand the SimilarityTransform function code. What can I refer to?
  2. Although I understand that pixels are indexed relative to the mean shape, I want to know why the gradient (i.e., the regression target) must be transformed into the mean-shape coordinate frame like this: regression_targets[i] = scale * regression_targets[i] * rotation; (see the sketch below).
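
A sketch of the underlying math as I read it from the ESR paper; the exact normalization in this repository may differ. With shapes stored as N x 2 matrices, the similarity transform aligning the current shape S_i to the mean shape \bar{S} is the scale s_i and 2 x 2 rotation R_i minimizing the alignment error, and the regression target is expressed in the mean-shape frame (matching the scale * target * rotation line above) so that increments learned across differently posed faces are comparable:

(s_i, R_i) = \arg\min_{s,\,R} \bigl\| \bar{S} - s\, S_i^{cur} R \bigr\|_2^2

\delta S_i = s_i \bigl( S_i^{gt} - S_i^{cur} \bigr) R_i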

Missing term in correlation calculation

There's a missing term in your implementation of the equation that calculates correlation: the sample variance of Y_prob under the square root in the denominator of that equation (equation 11). Please correct me if I am mistaken about this. (The equation is reconstructed below for reference.)
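
For reference, a reconstruction of equation (11) from the ESR paper as I understand it (notation mine). With Y the projected regression targets, \rho_m and \rho_n two candidate pixel intensities, and \sigma(\cdot) the sample variance the issue refers to, the correlation of Y with the pixel-difference feature can be computed from precomputed pixel-pixel covariances:

\operatorname{corr}(Y, \rho_m - \rho_n) = \frac{\operatorname{cov}(Y, \rho_m) - \operatorname{cov}(Y, \rho_n)}{\sqrt{\sigma(Y)\,\sigma(\rho_m - \rho_n)}}

\sigma(\rho_m - \rho_n) = \sigma(\rho_m) + \sigma(\rho_n) - 2\,\operatorname{cov}(\rho_m, \rho_n)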

Random Projection

I read the paper "Random projection in dimensionality reduction: Applications to image and text data", which is quite different from your code.

        // RNG random_generator(i);
        Mat_<double> random_direction(landmark_num_ , 2);
        random_generator.fill(random_direction, RNG::UNIFORM, -1.1, 1.1);

        normalize(random_direction,random_direction);
        vector<double> projection_result(regression_targets.size(), 0);  //size = (1, image_num)
        // project regression targets along the random direction 
        for(int j = 0; j < regression_targets.size(); j++){ //for each sample
            double temp = 0;
            temp = sum(regression_targets[j].mul(random_direction))[0]; 
            projection_result[j] = temp;
        } 

However, in the paper:

[equation image from the paper]

The left-hand side of the equation is the projected result, k is the lower dimension, d is the current dimension, and N is the number of data points. Your projection matrix (random_direction), however, is not consistent with the paper, and neither is the computation.

Second, you fill the matrix with random_generator.fill(random_direction, RNG::UNIFORM, -1.1, 1.1);. In the paper:

[distribution image from the paper]

The entries are subject to a specific distribution. So what was your rationale for generating a fully random matrix?

Compilation issues

Trying to compile this code under Linux with g++ 4.8.2 fails because the code uses C++11 features. Adding these two lines to CMakeLists.txt fixes the problem:
set_property(TARGET TrainDemo.out PROPERTY CXX_STANDARD 11)
set_property(TARGET TestDemo.out PROPERTY CXX_STANDARD 11)

is there a compiled EXE?

Thanks for your code, but I am wondering whether there are EXE files we can use directly. Many thanks!

Performance issue

Hello,

How long does the prediction process take in your test? What was the parameter configuration you used in your experiment? (candidate_pixel_num, fern_pixel_num, first_level_num, second_level_num, landmark_num)

The prediction process takes around 500 ms on my machine; is that reasonable?

Thank you!

The purpose of the bounding box

Hello, I have trained on the MUCT face database; 1000 pictures are used as the testing set and another 500 pictures as the training set. The alignment results are good, but when I use JAFFE as the testing set the results are poor. I think the reason is the bounding box, as in the following picture:

[bounding-box example image]

But I don't know where I went wrong.

Bug in calculate_covariance?

Hi,

First of all, thanks for putting this source code online; it's very useful to have a reference implementation of that great paper.
I was curious to understand the rationale behind the 0.2 multiplier in your fern's threshold calculation, so I set out to inspect the max_diff values and noticed that some of them were greater than 255, which didn't seem to make sense. Ultimately, I tracked the problem down to what I believe is a bug in the implementation of calculate_covariance. In this function you pass the image intensities as references to const vector and build OpenCV Mats from them, sharing the data (i.e., copyData is left false in the Mat constructor). However, at some point you modify the values inside these Mats, e.g.:
v1 = v1 - mean_1;
This modifies the v_1 vector for me (despite the const qualifier on the argument), and as a result, after returning from this function, the values in the densities array are changed.
Here is what I think the code should be:

double calculate_covariance(const vector<double>& v_1, 
                            const vector<double>& v_2) {
    Mat_<double> v1(v_1);
    Mat_<double> v2(v_2);
    double mean_1 = mean(v1)[0];
    double mean_2 = mean(v2)[0];
    // FIXME: bug - this modifies v_1 and v_2 (supposed to be const)
    //v1 = v1 - mean_1;
    //v2 = v2 - mean_2;
    //return mean((v1).mul((v2)))[0];
    return mean((v1 - mean_1).mul((v2 - mean_2)))[0];
}

Could you please confirm that you are not intentionally modifying the values passed in that code?
Once the density values are back in [0,255], do we still need the 0.2 multiplier in the threshold computation?

Thanks!

Training Own model

Hi, I used my own face-detection algorithm to train a Helen model, providing the images, bounding boxes, and the standard ground-truth shapes, but the test results are terrible. Do you know the reason?
My email is [email protected].
Thanks!
