
High-resolution networks (HRNets) for facial landmark detection

Introduction

This is the official code of High-Resolution Representations for Facial Landmark Detection. We extend the high-resolution network (HRNet) [1] by aggregating the (upsampled) representations from all of its parallel convolution streams, which yields stronger high-resolution representations. The output representation is fed into a classifier. We evaluate our method on four datasets: COFW, AFLW, WFLW, and 300W.
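The aggregation step can be pictured with a minimal sketch (illustrative only; the function name and channel widths below are assumptions, not the repository's actual modules):

```python
import torch
import torch.nn.functional as F

def aggregate_branches(branch_outputs):
    """Upsample every branch output to the highest resolution and concatenate.

    branch_outputs: list of [N, C_i, H_i, W_i] tensors, ordered from the
    highest-resolution stream to the lowest.
    """
    target_size = branch_outputs[0].shape[2:]  # (H, W) of the high-res stream
    upsampled = [branch_outputs[0]] + [
        F.interpolate(x, size=target_size, mode='bilinear', align_corners=False)
        for x in branch_outputs[1:]
    ]
    return torch.cat(upsampled, dim=1)  # concatenated, stronger representation

# Example with HRNetV2-W18-like widths (18, 36, 72, 144):
feats = [torch.randn(1, c, 64 // 2 ** i, 64 // 2 ** i)
         for i, c in enumerate([18, 36, 72, 144])]
fused = aggregate_branches(feats)  # shape: [1, 270, 64, 64]
```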

Performance

ImageNet pretrained models

HRNetV2 ImageNet pretrained models are now available! Code and pretrained models are in HRNets for Image Classification.

We adopt HRNetV2-W18 (9.3M parameters, 4.3 GFLOPs) for facial landmark detection on COFW, AFLW, WFLW, and 300W.
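As a reminder of the metrics in the tables below: NME is the mean point-to-point error over the L landmarks of an image, normalized by a reference distance d (typically inter-ocular distance for 300W/WFLW, inter-pupil distance for COFW, and face-box size for AFLW) and averaged over the test set, while FR0.1 is the failure rate, i.e. the fraction of images whose NME exceeds 0.1:

```latex
\mathrm{NME} = \frac{1}{L}\sum_{k=1}^{L}\frac{\lVert p_k - \hat{p}_k \rVert_2}{d},
\qquad
\mathrm{FR}_{0.1} = \frac{\#\{\text{images with } \mathrm{NME} > 0.1\}}{\#\{\text{test images}\}}
```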

COFW

The model is trained on COFW train and evaluated on COFW test.

| Model | NME | FR0.1 | pretrained model | model |
| :--- | :--- | :--- | :--- | :--- |
| HRNetV2-W18 | 3.45 | 0.20 | HRNetV2-W18 | HR18-COFW.pth |

AFLW

The model is trained on AFLW train and evaluated on AFLW full and frontal.

| Model | NME (full) | NME (frontal) | pretrained model | model |
| :--- | :--- | :--- | :--- | :--- |
| HRNetV2-W18 | 1.57 | 1.46 | HRNetV2-W18 | HR18-AFLW.pth |

WFLW

| Model | NME test | pose | illumination | occlusion | blur | makeup | expression | pretrained model | model |
| :--- | :--- | :--- | :--- | :--- | :--- | :--- | :--- | :--- | :--- |
| HRNetV2-W18 | 4.60 | 7.86 | 4.57 | 5.42 | 5.36 | 4.26 | 4.78 | HRNetV2-W18 | HR18-WFLW.pth |

300W

| Model | NME common | challenge | full | test | pretrained model | model |
| :--- | :--- | :--- | :--- | :--- | :--- | :--- |
| HRNetV2-W18 | 2.91 | 5.11 | 3.34 | 3.85 | HRNetV2-W18 | HR18-300W.pth |

Quick start

Environment

This code was developed using Python 3.6 and PyTorch 1.0.0 on Ubuntu 16.04 with NVIDIA GPUs. Training and testing were performed on a single NVIDIA P40 GPU with CUDA 9.0 and cuDNN 7.0. Other platforms or GPUs are not fully tested.

Install

  1. Install PyTorch 1.0 following the official instructions
  2. Install dependencies:
pip install -r requirements.txt
  3. Clone the project:
git clone https://github.com/HRNet/HRNet-Facial-Landmark-Detection.git

HRNetV2 pretrained models

cd HRNet-Facial-Landmark-Detection
# Download pretrained models into this folder
mkdir hrnetv2_pretrained

Data

  1. Download the preprocessed annotation files from OneDrive, Cloudstor, or BaiduYun (Access Code: ypxg).

  2. Download the images (300W, AFLW, WFLW) from the official websites and put them into the images folder of each dataset.

Your data directory should look like this:

HRNet-Facial-Landmark-Detection
-- lib
-- experiments
-- tools
-- data
   |-- 300w
   |   |-- face_landmarks_300w_test.csv
   |   |-- face_landmarks_300w_train.csv
   |   |-- face_landmarks_300w_valid.csv
   |   |-- face_landmarks_300w_valid_challenge.csv
   |   |-- face_landmarks_300w_valid_common.csv
   |   |-- images
   |-- aflw
   |   |-- face_landmarks_aflw_test.csv
   |   |-- face_landmarks_aflw_test_frontal.csv
   |   |-- face_landmarks_aflw_train.csv
   |   |-- images
   |-- cofw
   |   |-- COFW_test_color.mat
   |   |-- COFW_train_color.mat  
   |-- wflw
   |   |-- face_landmarks_wflw_test.csv
   |   |-- face_landmarks_wflw_test_blur.csv
   |   |-- face_landmarks_wflw_test_expression.csv
   |   |-- face_landmarks_wflw_test_illumination.csv
   |   |-- face_landmarks_wflw_test_largepose.csv
   |   |-- face_landmarks_wflw_test_makeup.csv
   |   |-- face_landmarks_wflw_test_occlusion.csv
   |   |-- face_landmarks_wflw_train.csv
   |   |-- images
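A quick way to sanity-check the downloaded annotations (hedged sketch: the column layout — image path, then scale, center_w, center_h, then landmark coordinates — is an assumption here, so inspect the printed header before relying on it):

```python
import pandas as pd

# Hypothetical path; point this at your own data directory.
csv_path = 'data/wflw/face_landmarks_wflw_train.csv'

df = pd.read_csv(csv_path)
print(len(df), 'annotated faces')
print(df.columns.tolist()[:8])  # confirm the real header before use

# Assumed layout (verify!): image path, scale, center_w, center_h,
# then x/y pairs for each landmark.
row = df.iloc[0]
print(row.iloc[:4])
```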

Train

Please specify the configuration file in experiments (the learning rate should be adjusted when the number of GPUs changes).

python tools/train.py --cfg <CONFIG-FILE>
# example:
python tools/train.py --cfg experiments/wflw/face_alignment_wflw_hrnet_w18.yaml

Test

python tools/test.py --cfg <CONFIG-FILE> --model-file <MODEL WEIGHT> 
# example:
python tools/test.py --cfg experiments/wflw/face_alignment_wflw_hrnet_w18.yaml --model-file HR18-WFLW.pth
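Several issues below ask how to run the model on a single image. A hedged sketch under these assumptions: the model is built and loaded as in tools/test.py (models.get_face_alignment_net(config), plus the state_dict fixes discussed in the issues), the face is already cropped and resized to the 256x256 training input, and the dataloaders normalize with ImageNet statistics. This skips the repo's decode_preds sub-pixel refinement and its center/scale inverse transform:

```python
import numpy as np
import torch
from PIL import Image

IMAGENET_MEAN = np.array([0.485, 0.456, 0.406], dtype=np.float32)
IMAGENET_STD = np.array([0.229, 0.224, 0.225], dtype=np.float32)

def landmarks_from_crop(model, crop_path, input_size=256, heatmap_size=64):
    """Predict landmark coordinates (in crop pixels) for a pre-cropped face."""
    img = np.array(Image.open(crop_path).convert('RGB'), dtype=np.float32)
    img = (img / 255.0 - IMAGENET_MEAN) / IMAGENET_STD  # assumed normalization
    inp = torch.from_numpy(img.transpose(2, 0, 1)).unsqueeze(0)  # NCHW
    with torch.no_grad():
        score_map = model(inp)[0]  # [n_landmarks, 64, 64] heatmaps
    n, h, w = score_map.shape
    flat_idx = score_map.reshape(n, -1).argmax(dim=1)  # peak of each heatmap
    xs = (flat_idx % w).float()
    ys = (flat_idx // w).float()
    stride = input_size / heatmap_size  # 4 for the provided configs
    return torch.stack([xs, ys], dim=1) * stride

# Usage, with `model` built and weights loaded as in tools/test.py:
# coords = landmarks_from_crop(model, 'face_crop_256.jpg')
```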

Other applications of HRNets (code and models)

Citation

If you find this work or code is helpful in your research, please cite:

@inproceedings{SunXLW19,
  title={Deep High-Resolution Representation Learning for Human Pose Estimation},
  author={Ke Sun and Bin Xiao and Dong Liu and Jingdong Wang},
  booktitle={CVPR},
  year={2019}
}

@article{WangSCJDZLMTWLX19,
  title={Deep High-Resolution Representation Learning for Visual Recognition},
  author={Jingdong Wang and Ke Sun and Tianheng Cheng and
          Borui Jiang and Chaorui Deng and Yang Zhao and Dong Liu and Yadong Mu and
          Mingkui Tan and Xinggang Wang and Wenyu Liu and Bin Xiao},
  journal={TPAMI},
  year={2019}
}

Reference

[1] Deep High-Resolution Representation Learning for Visual Recognition. Jingdong Wang, Ke Sun, Tianheng Cheng, Borui Jiang, Chaorui Deng, Yang Zhao, Dong Liu, Yadong Mu, Mingkui Tan, Xinggang Wang, Wenyu Liu, Bin Xiao. Accepted by TPAMI.


Issues

model download

Hi, thank you for sharing your models. However, I can't download them from the links you have given. Could you provide alternative links, such as BaiduYun? Thank you!

About AFLW dataset

I downloaded the AFLW dataset from someone's BaiduYun link, and some of the images may be broken.
When I train the model, I get warnings like:

TiffImagePlugin.py:754: UserWarning: Possibly corrupt EXIF data. Expecting to read 19660800 bytes but only got 0. Skipping tag 0
" Skipping tag %s" % (size, len(data), tag))
PIL/TiffImagePlugin.py:771: UserWarning: Corrupt EXIF data. Expecting to read 12 bytes but only got 6.

I don't know whether the images in the original dataset are OK. Have you come across the same situation? I want to make sure the BaiduYun dataset is the same as the original one.

Possible bug in the last epoch

In tools/train.py, line 69:

last_epoch = config.TRAIN.BEGIN_EPOCH

I think this should be config.TRAIN.END_EPOCH?

inference problem

The results look good on the datasets, but inference depends on scale, center_w, and center_h. If I want to run inference on an image without this information, how do I obtain scale, center_w, and center_h accurately? If these variables are inaccurate, the landmarks are poor, so I need a precise way to get them. Any suggestions?
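For reference, a hedged sketch of deriving these values from a face bounding box (e.g. from a face detector). The 200-pixel reference size follows the human-pose convention these CSVs appear to use; the padding factor is likewise an assumption, so verify a few values against the provided annotation files before trusting it:

```python
def box_to_center_scale(x_min, y_min, x_max, y_max,
                        reference=200.0, padding=1.0):
    """Convert a face bounding box into the (scale, center_w, center_h)
    triple the dataloaders expect. `reference` and `padding` are assumptions;
    calibrate them against a few rows of the provided CSVs."""
    center_w = (x_min + x_max) / 2.0
    center_h = (y_min + y_max) / 2.0
    scale = max(x_max - x_min, y_max - y_min) / reference * padding
    return scale, center_w, center_h

# Example with a hypothetical detector box:
scale, center_w, center_h = box_to_center_scale(120, 80, 320, 300)
```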

Error while trying to run inference.

While trying to run inference from a webcam, I get this error:

Traceback (most recent call last):
File "camera.py", line 10, in
from model.pfld import PFLDInference, AuxiliaryNet
ModuleNotFoundError: No module named 'model.pfld'; 'model' is not a package

Please help.

train error

ERROR: ValueError: only one element tensors can be converted to Python scalars

Please help. Thank you!

Some issues about the WFLW dataloader

In wflw.py, the WFLW data loader, line 63 reads:

img = np.array(Image.open(image_path).convert('RGB'), dtype=np.float32)

after which the image is normalized:

img = (img/255.0 - self.mean) / self.std

Why not just use PIL and torchvision transforms to normalize during training and testing?
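For comparison, the torchvision equivalent of that manual normalization would look like the sketch below (assuming the repo's mean/std are the usual ImageNet statistics). One plausible reason the repo normalizes by hand is that the same random affine has to be applied jointly to the image and the landmark coordinates before normalization, which plain torchvision transforms don't handle:

```python
from PIL import Image
from torchvision import transforms

# ToTensor maps HWC uint8 to CHW float in [0, 1]; Normalize then applies
# (x - mean) / std per channel, matching img = (img/255.0 - mean) / std.
normalize = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])

img = normalize(Image.open('face.jpg').convert('RGB'))
```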

Missing keys while loading the model

Hi,

I was trying out test.py, but I get a "Missing key(s) in state_dict" error followed by a list of keys.
Am I missing something?

RuntimeError: Error(s) in loading state_dict for HighResolutionNet:
Missing key(s) in state_dict: "conv1.weight", "bn1.weight", "bn1.bias", "bn1.running_mean",
.... goes on

Some questions about inference function

After running the WFLW dataset and models, I got confused.

File "tools/test.py", line 78, in main
nme, predictions = function.inference(config, test_loader, model)

File "lib/core/function.py", line 194, in inference
preds = decode_preds(score_map, meta['center'], meta['scale'], [64, 64])

File "lib/core/evaluation.py", line 64, in decode_preds
coords = get_preds(output)  # shape (8, 98, 2)

Actually, those are the indices of the greatest value in each 64x64 output map, so I'd expect them to be integers. I don't understand why you then apply

coords[n][p] += diff.sign() * .25

and

coords += 0.5

Problems using the code

Hi, I've tried to use your code for some demo tests on my own images, but it seems it can't be used for single-image forward inference. What should I do to use it in practice?
Thanks a lot!

No convergence on the WFLW dataset

Hi, thanks for your excellent work. When I tried to train from the ImageNet-pretrained model on WFLW, it is weird that the loss converges to about 0.0011 no matter how I change the learning rate or the optimizer. BTW, the NME on the test set is about 0.20.
My configuration is exactly the config file you provided. Do you have any idea about this?

The files on OneDrive are not reachable

Hi, I want to download the trained weights and preprocessed files, but the OneDrive link is not reachable. Could anyone share the trained weights and preprocessed annotation files from another location?

[demo] for own dataset

Thanks for your great work!
Could you tell me how to configure test.py if I only want to extract facial landmarks on my own dataset?
Thanks in advance!

Some problems about generate_target

Hello, I'm glad to read your paper and code, but I have some problems.

[screenshot: +1 in transform_pixel]
[screenshot: -1 in __getitem__]

I'm a little confused, so I debugged the generate_target code:

def generate_target(img, pt, sigma, label_type='Gaussian'):
    """
    :param img: heatmap of a landmark    64 64
    :param pt: a landmark (1,2)
    :param sigma:
    :param label_type:
    :return:
    """
    # Check that any part of the gaussian is in-bounds
    tmp_size = sigma * 3                                        # radius of influence: 4.5
    ul = [int(pt[0] - tmp_size), int(pt[1] - tmp_size)]
    br = [int(pt[0] + tmp_size + 1), int(pt[1] + tmp_size + 1)] # the +1 makes both sides symmetric
    if (ul[0] >= img.shape[1] or ul[1] >= img.shape[0] or
            br[0] < 0 or br[1] < 0):                            # gaussian falls entirely outside the map
        # If not, just return the image as is
        return img                                              # return the blank heatmap

    # Generate gaussian
    size = 2 * tmp_size + 1                                     # 10
    x = np.arange(0, size, 1, np.float32)
    y = x[:, np.newaxis]
    x0 = y0 = size // 2                                         # 5
    # The gaussian is not normalized, we want the center value to equal 1
    if label_type == 'Gaussian':                                # Gaussian kernel
        g = np.exp(- ((x - x0) ** 2 + (y - y0) ** 2) / (2 * sigma ** 2))
    else:
        g = sigma / (((x - x0) ** 2 + (y - y0) ** 2 + sigma ** 2) ** 1.5)

    # Usable gaussian range
    g_x = max(0, -ul[0]), min(br[0], img.shape[1]) - ul[0]
    g_y = max(0, -ul[1]), min(br[1], img.shape[0]) - ul[1]


    # Image range
    img_x = max(0, ul[0]), min(br[0], img.shape[1])
    img_y = max(0, ul[1]), min(br[1], img.shape[0])

    img[img_y[0]:img_y[1], img_x[0]:img_x[1]] = g[g_y[0]:g_y[1], g_x[0]:g_x[1]]
    return img

if __name__ == "__main__":
    heatmap = np.zeros((64,64))
    pt = (0,1)
    sigma = 1.5
    heatmap = generate_target(heatmap, pt, sigma)

The resulting heatmap's max value is at index (1, 1) [screenshot], but when I change pt = (10, 10) I get a correct heatmap [screenshot].

I'm not sure whether this is a bug or not.

Thanks for your reply.

About Inference Time

When I run tools/test.py on the WFLW dataset, it takes about 115 s on a single 2080 Ti and 85 s on four 2080 Tis.
CPU usage is high (around 300%), but the GPU always sits at 0% and GPU memory usage is very low, even with the batch size set to 256.

Is this working as intended? If not, can you tell me what to do about it?
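For reference: 0% GPU with ~300% CPU usually points to a data-loading bottleneck or a model left on the CPU. A generic checklist sketch (standard PyTorch, not this repo's config schema):

```python
import torch
from torch.utils.data import DataLoader

# 1. Confirm CUDA is visible and the model is actually on the GPU:
print(torch.cuda.is_available())           # should print True
# print(next(model.parameters()).device)   # should be cuda:0, not cpu

# 2. Give the loader enough parallel workers so CPU-side decoding and
#    augmentation keep up with the GPU:
# loader = DataLoader(dataset, batch_size=256, shuffle=False,
#                     num_workers=8, pin_memory=True)
```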

300W Dataset Numbers

Hi authors,
Thank you for releasing your code. I noticed different numbers in the README and in the arXiv paper for the 300W dataset (although the test number is the same). It would be great if you could explain the differences between the training of these two models.

| | Common | Challenge | Full | Test |
| :--- | :--- | :--- | :--- | :--- |
| README | 2.91 | 5.11 | 3.34 | 3.85 |
| arXiv paper | 2.87 | 5.15 | 3.32 | 3.85 |

Bug in load best model?

When I load the best saved model as below:

python tools/test.py --cfg experiments/300w/face_alignment_300w_hrnet_w18.yaml --model-file output/300W/face_alignment_300w_hrnet_w18/model_best.pth

Step 1: in tools/test.py:

```python
# args.model_file = output/300W/face_alignment_300w_hrnet_w18/model_best.pth
state_dict = torch.load(args.model_file)  # state_dict is not a dict but a module
if 'state_dict' in state_dict.keys():     # this line raises the error
    state_dict = state_dict['state_dict']
    model.load_state_dict(state_dict)
else:
    model.module.load_state_dict(state_dict)
```

Step 2: following the best-model saving code in lib/utils/utils.py save_checkpoint():

```python
if is_best and 'state_dict' in states.keys():
    # Here is the bug: this saves the module itself instead of its
    # state_dict, so the error is raised when loading the best model.
    torch.save(states['state_dict'].module,
               os.path.join(output_dir, 'model_best.pth'))
```
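A sketch of the fix this report implies: save the parameter dict instead of the wrapped nn.Module, so that torch.load in tools/test.py returns a plain state_dict (hedged; the .module unwrap assumes the model is wrapped in DataParallel, as the saving code above suggests):

```python
import os
import torch

def save_best(states, output_dir):
    """Store the weights dict, not the nn.Module object.
    `.module` unwraps DataParallel; fall back if the model is not wrapped."""
    model = states['state_dict']
    weights = (model.module.state_dict() if hasattr(model, 'module')
               else model.state_dict())
    torch.save(weights, os.path.join(output_dir, 'model_best.pth'))
```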

post-processing

Dear author, how should I understand the post-processing part of the decode_preds function?

```python
# post-processing
for n in range(coords.size(0)):
    for p in range(coords.size(1)):
        hm = output[n][p]
        px = int(math.floor(coords[n][p][0]))
        py = int(math.floor(coords[n][p][1]))
        if (px > 1) and (px < res[0]) and (py > 1) and (py < res[1]):
            # finite-difference gradient of the heatmap around the peak
            diff = torch.Tensor([hm[py - 1][px] - hm[py - 1][px - 2],
                                 hm[py][px - 1] - hm[py - 2][px - 1]])
            # shift a quarter pixel toward the neighbor with the higher score
            # (sub-pixel refinement, common in heatmap regression)
            coords[n][p] += diff.sign() * .25
coords += 0.5
preds = coords.clone()
```

training on AFLW

Hello,

I have a question regarding training on AFLW. The dataset provides annotations for landmark visibility. Why have you not taken this into account when training on AFLW? I guess the NME would have been lower that way.

Thank you,
Andrei

Inference on our own images

Hi! Thank you for your work. Like some others, I've modified the code to take my own dataset and get predictions on it using dummy values for the ground truths, but the points look nothing like expected. Any help would be appreciated.

Hey, I rewrote a script to do inference on single image. However, I observed that the predicted points are not correct after transformation. Any suggestions relating to it?

Originally posted by @testingshanu in #7 (comment)

Thank you! Ana

Some errors during training

face_alignment_300w_hrnet_w18_2019-07-18-21-22_train.log
Hi, I am very interested in your code and wanted to retrain the network with your configuration file, but there seems to be a big difference between your results and mine. The log file above is mine; the NME is about 0.048, which is much higher than your 0.038. Can you help me reach the same result as yours? By the way, in your training log the PyTorch version does not seem to be 1.0; is that right? Could you tell me which version you used? Thank you!

The effect of scale, center_w, center_h on performance?

Thanks for your code! The annoying thing is that scale, center_w, and center_h are needed at inference time. Are there substitute transformations that avoid scale, center_w, and center_h? And how much impact does preprocessing with them have? (See also the bounding-box sketch under the "inference problem" issue above.) Looking forward to a reply, thanks!

inconsistent implementation for saved .pth model

Hi,

Thanks for sharing such an awesome work.

Since the current implementation no longer uses model.module, model.load_state_dict(state_dict) (instead of model.module.load_state_dict(state_dict)) works for all provided trained weights except HR18-300W.pth, which was still saved under model.module.

For now I just remap the keys as below, and it works correctly.

```python
from collections import OrderedDict

# strip the DataParallel 'module.' prefix from every key
new_state_dict = OrderedDict()
for key in state_dict:
    new_state_dict[key.replace('module.', '')] = state_dict[key]
model.load_state_dict(new_state_dict)
```

Results without pre-training

Hi, thanks for this awesome repo.
I noticed that all models use ImageNet-pretrained initialization. Would you mind providing results without ImageNet pretraining?

rotations are not considered when training NME is computed.

Hello, it seems that rotations are not considered when computing the training NME.

I'll take the 300W experiments as an example.

In lib.datasets.face300w, the original ground-truth landmarks are passed on (in some cases flipped), and a random rotation is applied to the training images (note that the ground-truth landmarks are not rotated).

And in lib.core.function.train, the NME is computed with

preds = decode_preds(score_map, meta['center'], meta['scale'], [64, 64])
nme_batch = compute_nme(preds, meta)

where the parameters don't contain any rotation information.

And in lib.utils.transforms.transform_preds

coords[p, 0:2] = torch.tensor(transform_pixel(coords[p, 0:2], center, scale, output_size, 1, 0))

the rotation factor is set to the constant 0.

So a non-rotated ground truth and a prediction from a randomly rotated image (never rotated back) are used to compute the training NME. Is there anything wrong?
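If one wanted the training NME to account for rotation, a hypothetical sketch would be to thread the sampled angle through the meta dict and into the inverse transform (the meta['rotation'] key and treating transform_pixel's final argument as the rotation angle are assumptions based on the call above, not confirmed repo behavior):

```python
# Hypothetical: record the sampled angle in the dataset's __getitem__ ...
#   meta = {..., 'rotation': r}
# ... and pass it through in lib.utils.transforms.transform_preds instead of 0:
rot = meta['rotation']
coords[p, 0:2] = torch.tensor(
    transform_pixel(coords[p, 0:2], center, scale, output_size, 1, rot))
```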

Explanation of Annotation .csv file

In your provided annotation file face_landmarks_wflw_test.csv, how are the "scale", "center_w", and "center_h" annotations computed from the original face bounding-box annotations <x_min, y_min, x_max, y_max> provided by the authors of the WFLW dataset? Could you explain?

Default model doesn't match the weights from the link?

Another question, please: following your tools/test.py, I use model = models.get_face_alignment_net(config) to build the default model and then load HR18-300W.pth. However, an error occurs showing that the feature dimensions of the default model are inconsistent with the downloaded weights. Awaiting your response!
