Why is the performance of the AVS(SAN) model not good? (stylealign issue, closed)

thesouthfrog commented on June 19, 2024

Why is the performance of the AVS(SAN) model not good?

from stylealign.

Comments (13)

TheSouthFrog commented on June 19, 2024

You need to run a face detector first to obtain the face bounding box, rather than directly resizing the original image. The standard training and testing images are cropped to their bounding boxes before being fed to the model, so the face occupies a much larger part of the input.

ilovecv commented on June 19, 2024

Hi @TheSouthFrog ,

Thank you very much for your quick reply. I preprocessed the image as shown in crop_pic.py, i.e., cropped it with an expand ratio of 0.2 and resized it to 256x256. The result is still not very good. Should I expand the face bounding box further? Here is the result:

TheSouthFrog commented on June 19, 2024

The bounding box you provided is too small. Try a larger expand ratio; I expect the result to improve a lot.

More concretely, you can also refer to this issue. The general rule is that the bounding boxes you provide should match the style of the boxes used in the training data, e.g. MTCNN boxes for 300W. Since the model we are using is trained on WFLW, the boxes should resemble the WIDER FACE annotations. You can either increase the expand ratio or use a face detector pre-trained on WIDER FACE.
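
A minimal sketch of what "expanding the ratio" could look like, assuming the box is `(x1, y1, x2, y2)` in pixels and the ratio is applied to each side relative to the box width and height. The function name `expand_box` is hypothetical, and crop_pic.py may define the ratio differently:

```python
def expand_box(box, ratio, img_w, img_h):
    """Grow a (x1, y1, x2, y2) box by `ratio` of its width/height on every side.

    Assumption: this is one plausible reading of "expand ratio"; it is not
    taken from the stylealign code. The result is clamped to the image.
    """
    x1, y1, x2, y2 = box
    dx = (x2 - x1) * ratio  # horizontal padding added on each side
    dy = (y2 - y1) * ratio  # vertical padding added on each side
    return (
        max(0, int(round(x1 - dx))),
        max(0, int(round(y1 - dy))),
        min(img_w, int(round(x2 + dx))),
        min(img_h, int(round(y2 + dy))),
    )


# A tight detector box grown by 20% per side inside a 640x480 image.
print(expand_box((100, 100, 200, 200), 0.2, 640, 480))
```

The expanded box would then be used to crop the image before resizing it to the model's input size.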

ilovecv commented on June 19, 2024

Hi @TheSouthFrog,

As you suggested, I tried another face detector trained on the WIDER FACE dataset: https://github.com/TencentYoutuResearch/FaceDetection-DSFD
But I still had no luck. Would it be possible for you to run your own face detector and show the face alignment results? You can find the original image below. Thanks!

Here is the original image:

TheSouthFrog commented on June 19, 2024

As I mentioned, the box you provided is too small. Have you tried expanding the box ratio?

Here is an example of cropped training images, which shows the proper size of the input bounding box.

ilovecv commented on June 19, 2024

Hi @TheSouthFrog,

I tried expanding the box ratio, and it did improve the result. However, what should I do for faces with large poses, like the image below? Thanks!

ilovecv commented on June 19, 2024

Hi @TheSouthFrog,

Now I get good results on landmark detection, and I have trained the variational U-Net for the style transfer. However, there is a problem with the result; please see the image below. It looks like the landmark positions are not correct:

TheSouthFrog commented on June 19, 2024

Hi, that's actually a very interesting problem in facial landmark detection. One general approach for handling large poses is to first run a 5-point landmark detector to obtain the basic angles and coordinates, then align and crop the original image according to those initial points, and finally run a more fine-grained detector, e.g. 98 points in our case.
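
As a rough illustration of the alignment step, the sketch below estimates the in-plane rotation (roll) from the two eye points of a coarse detector and rotates points so that the eye line becomes horizontal. It assumes the eye coordinates are already available as `(x, y)` pixel tuples; the helper names are hypothetical, and this only corrects roll, not yaw or pitch:

```python
import math


def roll_angle(left_eye, right_eye):
    """Angle (radians) of the line from the left eye to the right eye."""
    dx = right_eye[0] - left_eye[0]
    dy = right_eye[1] - left_eye[1]
    return math.atan2(dy, dx)


def rotate_points(points, angle, center):
    """Rotate points by -angle around center, undoing a roll of `angle`."""
    c, s = math.cos(-angle), math.sin(-angle)
    cx, cy = center
    rotated = []
    for x, y in points:
        tx, ty = x - cx, y - cy  # move the rotation center to the origin
        rotated.append((cx + c * tx - s * ty, cy + s * tx + c * ty))
    return rotated


# Eyes on a 45-degree tilted face; rotate around the midpoint between them.
left, right = (100.0, 100.0), (200.0, 200.0)
mid = ((left[0] + right[0]) / 2, (left[1] + right[1]) / 2)
aligned = rotate_points([left, right], roll_angle(left, right), mid)
```

In a full pipeline the same rotation would be applied to the image itself (for example with an affine warp) before cropping and running the 98-point detector.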

TheSouthFrog commented on June 19, 2024

The cause of your second problem is that you didn't align the input images first.

If you want to train the generator on your own images, you have to make sure that the landmarks are aligned and centered, and that the corresponding regions of the input images are cropped. You can refer to the training set I provided, for which we pre-processed and cropped the raw input data.
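
One way to centre landmarks and derive the matching crop region is sketched below. It assumes a square crop around the landmark bounding box with a 20% margin per side, mapped to a 256-pixel output; the margin, the function name `landmark_crop`, and the square-crop choice are assumptions, not the authors' pipeline:

```python
def landmark_crop(landmarks, margin=0.2, out_size=256):
    """Return a square crop region centred on the landmarks, plus the
    landmarks expressed in the coordinates of the resized crop.

    Assumption: `margin` is a fraction of the larger landmark extent added
    on each side; the original pre-processing script is not public.
    """
    xs = [p[0] for p in landmarks]
    ys = [p[1] for p in landmarks]
    cx = (min(xs) + max(xs)) / 2  # centre of the landmark bounding box
    cy = (min(ys) + max(ys)) / 2
    side = max(max(xs) - min(xs), max(ys) - min(ys)) * (1 + 2 * margin)
    x0, y0 = cx - side / 2, cy - side / 2  # top-left corner of the crop
    scale = out_size / side
    mapped = [((x - x0) * scale, (y - y0) * scale) for x, y in landmarks]
    return (x0, y0, side), mapped


region, pts = landmark_crop([(100, 100), (200, 100), (150, 200)])
```

Cropping the image with `region` and resizing it to `out_size` keeps the image and the mapped landmarks consistent with each other.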

ilovecv commented on June 19, 2024

Hi @TheSouthFrog,

Thank you very much for your quick response. I learned a lot from it. I am wondering if you can provide the pre-processing script, so I can test on my own images?

TheSouthFrog commented on June 19, 2024

I am sorry that I can't provide my pre-processing script, since we have shipped the detection, alignment and cropping pipeline in an SDK for convenience, and we might not be able to release it.

However, I would recommend that you re-train the model on your own dataset. Note that in this work we are trying to augment the styles of the originally available landmarks, so we didn't put much emphasis on generalization to unseen images. If the test image is significantly different from the training ones, the result may not be that great. So I recommend that you first pre-process your data by aligning and cropping your own training set using an open-source script (there are quite a few), and then train a model using the given hyper-parameters. I expect you will see reasonable results that way.

ilovecv commented on June 19, 2024

Hi @TheSouthFrog,

Thank you very much for your suggestions. Could you point me to some open-source scripts for the pre-processing? Thank you very much!

TheSouthFrog commented on June 19, 2024

Sorry for the late response; I missed the notifications these days and forgot to respond.

I believe the simplest tool you can use is dlib or a similar widely-used package.
