
hrnet-facial-landmark-detection's Issues

train error

ERROR: ValueError: only one element tensors can be converted to Python scalars

Please help, thank you.
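Not from the thread (no traceback was posted), but this particular ValueError is what PyTorch raises when float() or .item() is called on a tensor that holds more than one element. A minimal reproduction and the usual fix:

    import torch

    t = torch.tensor([1.0, 2.0])
    # float(t) or t.item() raises:
    #   ValueError: only one element tensors can be converted to Python scalars
    # If a scalar is wanted (e.g. a loss value for logging), reduce the tensor first:
    value = t.mean().item()   # or t.sum().item()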

training on AFLW

Hello,

I have a question regarding training on AFLW. The dataset provides annotations for landmark visibility. I am curious why you did not take this into account when training on AFLW; I would guess the NME would have been lower that way.

Thank you,
Andrei

Bug in loading the best model?

When I load the best saved model as below:

    python tools/test.py --cfg experiments/300w/face_alignment_300w_hrnet_w18.yaml --model-file output/300W/face_alignment_300w_hrnet_w18/model_best.pth

Step 1: in tools/test.py:

    # args.model_file = output/300W/face_alignment_300w_hrnet_w18/model_best.pth
    state_dict = torch.load(args.model_file)  # state_dict is not a dict but a module
    if 'state_dict' in state_dict.keys():     # this line raises the error
        state_dict = state_dict['state_dict']
        model.load_state_dict(state_dict)
    else:
        model.module.load_state_dict(state_dict)

Step 2: so I followed the best-model save code in lib/utils/utils.py, save_checkpoint():

    if is_best and 'state_dict' in states.keys():
        torch.save(states['state_dict'].module,
                   os.path.join(output_dir, 'model_best.pth'))
        # Here is the bug: this saves the module instead of the state_dict,
        # so the error is raised when we load the best model.
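Not an official fix, but given the save bug described above, a workaround sketch for loading model_best.pth that handles both a saved module and a plain state_dict (this assumes the reporter's analysis is correct):

    import torch

    checkpoint = torch.load(args.model_file, map_location='cpu')
    if isinstance(checkpoint, torch.nn.Module):
        # model_best.pth holds the whole module (the bug above); take its weights.
        state_dict = checkpoint.state_dict()
    elif 'state_dict' in checkpoint:
        state_dict = checkpoint['state_dict']
    else:
        state_dict = checkpoint
    model.load_state_dict(state_dict)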

Error while trying to run inference.

While trying to run inference on a webcam, I get this error:

    Traceback (most recent call last):
      File "camera.py", line 10, in <module>
        from model.pfld import PFLDInference, AuxiliaryNet
    ModuleNotFoundError: No module named 'model.pfld'; 'model' is not a package

Please help.

The effect of scale, center_w, center_h on performance?

Thanks for your code! The inconvenient part is that scale, center_w, and center_h are needed at inference time. Is there a substitute transformation that does not require them? And how much impact does preprocessing with scale, center_w, and center_h have on performance? Looking forward to a reply, thanks.

code usage problems

Hi, I've tried to use your code for some demo tests on my own images, but it seems it can't be used for a single forward inference. What should I do to use it in practice?
thanks a lot!

inference problem

The results look good on the dataset, but they depend on scale, center_w, and center_h. If I want to run inference on an image without scale, center_w, and center_h, how do I obtain them accurately? If these values are not accurate, the landmarks are not good, so I need accurate scale, center_w, and center_h. Any suggestions?
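Not an answer from the authors, but in similar heatmap-based alignment codebases the center is simply the middle of a face detector's bounding box, and the scale is the box size divided by a 200-pixel reference size, sometimes with extra padding. A minimal sketch; treat the 200-pixel reference and the padding factor as assumptions rather than values confirmed for this repo:

    # Hypothetical helper: derive center_w, center_h, scale from a face box.
    # The 200-pixel reference and the 1.25 padding factor are assumptions.
    def bbox_to_center_scale(x_min, y_min, x_max, y_max, padding=1.25):
        center_w = (x_min + x_max) / 2.0
        center_h = (y_min + y_max) / 2.0
        box_size = max(x_max - x_min, y_max - y_min)
        scale = box_size / 200.0 * padding
        return center_w, center_h, scale

If the detector box is looser or tighter than whatever produced the training annotations, the predicted landmarks will shift accordingly, which matches the behavior described in this issue.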

300W Dataset Numbers

Hi authors,
Thank you for releasing your code. I happened to notice different numbers in the README and in the arXiv paper for the 300W dataset (although the number for the test set is the same). It would be really great if you could explain the differences between the training of these two models.

                Common   Challenge   Full   Test
    README        2.91        5.11   3.34   3.85
    arXiv paper   2.87        5.15   3.32   3.85

Some questions about inference function

After running the WFLW dataset and models, I got confused.

    File "tools/test.py", line 78, in main
        nme, predictions = function.inference(config, test_loader, model)

    File "lib/core/function.py", line 194, in inference
        preds = decode_preds(score_map, meta['center'], meta['scale'], [64, 64])

    File "lib/core/evaluation.py", line 64, in decode_preds
        coords = get_preds(output)  # shape (8, 98, 2)

Those are the indices of the greatest value in each 64x64 output map, and I think those indices should be integers. I don't understand why you then do

    coords[n][p] += diff.sign() * .25

and

    coords += 0.5

Inference on our own images

Hi! Thank you for your work. Like some others, I've modified the code to take my own dataset and get predictions on it, using dummy values for the ground truths, but the points look nothing like what I expected. Any help would be appreciated.

Hey, I rewrote a script to do inference on a single image. However, I observed that the predicted points are not correct after the transformation. Any suggestions?

Originally posted by @testingshanu in #7 (comment)

Thank you! Ana
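Not from the thread, but a minimal single-image inference sketch under several assumptions: a 256x256 input crop, a 64x64 heatmap output, ImageNet mean/std standing in for the dataset's own normalization, a crude square crop instead of the repo's affine crop helper, and decode_preds imported from lib/core/evaluation.py as quoted in the other issues. It is a rough stand-in for quick experiments, not the authors' pipeline:

    import numpy as np
    import torch
    from PIL import Image

    # decode_preds is the repo helper quoted elsewhere in these issues;
    # the import path is assumed to be lib/core/evaluation.py.
    from lib.core.evaluation import decode_preds

    # Placeholder mean/std; the repo's dataset class keeps its own values
    # in self.mean and self.std.
    MEAN = np.array([0.485, 0.456, 0.406], dtype=np.float32)
    STD = np.array([0.229, 0.224, 0.225], dtype=np.float32)

    def predict_landmarks(model, image_path, center, scale, input_size=256):
        """Crop around (center_w, center_h), run the model, and map the 64x64
        heatmap peaks back to original-image coordinates with decode_preds."""
        img = Image.open(image_path).convert('RGB')
        # Crude square crop assuming a 200*scale reference box; the repo uses
        # its own affine crop, so the points may be slightly off.
        half = int(scale * 200 / 2)
        cx, cy = int(center[0]), int(center[1])
        crop = img.crop((cx - half, cy - half, cx + half, cy + half))
        crop = crop.resize((input_size, input_size))

        x = (np.asarray(crop, dtype=np.float32) / 255.0 - MEAN) / STD
        x = torch.from_numpy(x.transpose(2, 0, 1)).unsqueeze(0)

        model.eval()
        with torch.no_grad():
            score_map = model(x).cpu()

        preds = decode_preds(score_map,
                             torch.tensor([center], dtype=torch.float32),
                             torch.tensor([scale], dtype=torch.float32),
                             [64, 64])
        return preds[0]   # (num_landmarks, 2) in original-image coordinates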

The files on OneDrive are not reachable

Hi, I want to download the trained weights and preprocessed files, but the OneDrive link is not reachable. Could anyone share the trained weights and preprocessed annotation files from another location?

Some issues with the WFLW dataloader

wflw.py is the data loader for WFLW.

In line 63:
img = np.array(Image.open(image_path).convert('RGB'), dtype=np.float32)

After that, the image is normalized:
img = (img/255.0 - self.mean) / self.std

Why don't you just use PIL and torchvision.transforms to normalize during training and testing?
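For reference (not an answer from the authors), the same normalization expressed with torchvision.transforms would look roughly like this; the mean/std values below are placeholders standing in for whatever the dataset class stores in self.mean and self.std:

    from PIL import Image
    from torchvision import transforms

    # Placeholder mean/std; the repo's WFLW dataset keeps its own values.
    to_normalized_tensor = transforms.Compose([
        transforms.ToTensor(),                      # HWC uint8 [0, 255] -> CHW float [0, 1]
        transforms.Normalize(mean=[0.485, 0.456, 0.406],
                             std=[0.229, 0.224, 0.225]),
    ])

    img = to_normalized_tensor(Image.open(image_path).convert('RGB'))

One likely reason the repo sticks with NumPy is that it applies its own cropping, rotation, and flipping to the float array (and to the landmark coordinates) before normalizing, which is awkward to express as a torchvision pipeline; that is a guess, though, not the authors' statement.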

Missing keys while loading the model

Hi,

I was trying out test.py, but I get a "Missing key(s) in state_dict" error followed by a list of keys.
Am I missing something?

    RuntimeError: Error(s) in loading state_dict for HighResolutionNet:
        Missing key(s) in state_dict: "conv1.weight", "bn1.weight", "bn1.bias", "bn1.running_mean", ...

and it goes on.

Explanation of Annotation .csv file

In your provided annotation file face_landmarks_wflw_test.csv, how are the annotations "scale", "center_w", and "center_h" computed from the original face bounding-box annotations <x_min, y_min, x_max, y_max> provided by the authors of the WFLW dataset? Could you explain?

About AFLW dataset

I downloaded the AFLW dataset from someone's Baidu Yun link; maybe some of the images are broken.
When I train the model, I get warning messages:

TiffImagePlugin.py:754: UserWarning: Possibly corrupt EXIF data. Expecting to read 19660800 bytes but only got 0. Skipping tag 0
" Skipping tag %s" % (size, len(data), tag))
PIL/TiffImagePlugin.py:771: UserWarning: Corrupt EXIF data. Expecting to read 12 bytes but only got 6.

I don't know whether the images in the original dataset are OK.
Have you ever come across the same situation? I want to make sure the Baidu Yun dataset is the same as the original one.

About Inference Time

When I run tools/test.py on the WFLW dataset, it takes about 115 s on a single 2080 Ti and 85 s on four 2080 Tis.
The CPU usage is high (around 300%), but GPU utilization is always 0% and GPU memory usage is very low, even with the batch size set to 256.

Is this working as intended? If not, can you tell me how to fix it?
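Not from the thread, but 0% GPU utilization with ~300% CPU usage usually means the model or the input batches never reach the GPU, or the run is bottlenecked by data loading. A quick generic PyTorch check (not specific to this repo):

    import torch

    print(torch.cuda.is_available())            # should print True on a working CUDA setup
    print(next(model.parameters()).device)      # should be a cuda device, not cpu

    # If the model is still on the CPU, move it and each input batch explicitly:
    model = model.cuda()
    inp = inp.cuda(non_blocking=True)

If both already report CUDA devices, the bottleneck is more likely the DataLoader (for example, too few workers), since an HRNet-W18 forward pass per image is cheap compared to JPEG decoding and preprocessing.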

No convergence on the WFLW dataset

Hi, thanks for your excellent work. When I try to train from the ImageNet-pretrained model on WFLW, it is weird that the loss converges at about 0.0011 no matter how I change the learning rate or the optimizer. BTW, the NME on the test set is about 0.20.
My configuration is exactly the config file you provided. Do you have any idea what could cause this?

[demo] for own dataset

Thanks for your great work!
Could you tell me how to configure test.py if I only want to extract facial landmarks on my own dataset?
Thanks in advance!

Default model doesn't match weights from the link?

Another question, please: following your tools/test.py, I use model = models.get_face_alignment_net(config) to build the default model and then load HR18-300W.pth. However, errors occur showing that the feature dimensions of the default model are inconsistent with the downloaded weights. Waiting for your response.

Some problems about generate_target

Hello, I was glad to read your paper and code, but I have some questions.

[screenshot] +1 in transform_pixel
[screenshot] -1 in __getitem__

I'm a little confused.

I debugged the code in generate_target:

import numpy as np


def generate_target(img, pt, sigma, label_type='Gaussian'):
    """
    :param img: heatmap of a landmark, 64 x 64
    :param pt: a landmark (1, 2)
    :param sigma:
    :param label_type:
    :return:
    """
    # Check that any part of the gaussian is in-bounds
    tmp_size = sigma * 3                                        # radius of influence: 4.5 for sigma = 1.5
    ul = [int(pt[0] - tmp_size), int(pt[1] - tmp_size)]
    br = [int(pt[0] + tmp_size + 1), int(pt[1] + tmp_size + 1)] # the +1 keeps the window symmetric around the point
    if (ul[0] >= img.shape[1] or ul[1] >= img.shape[0] or
            br[0] < 0 or br[1] < 0):                            # the gaussian falls completely outside the heatmap
        # If not, just return the image as is
        return img                                              # return the blank heatmap unchanged

    # Generate gaussian
    size = 2 * tmp_size + 1                                     # 10 for sigma = 1.5
    x = np.arange(0, size, 1, np.float32)
    y = x[:, np.newaxis]
    x0 = y0 = size // 2                                         # 5
    # The gaussian is not normalized, we want the center value to equal 1
    if label_type == 'Gaussian':                                # gaussian kernel
        g = np.exp(- ((x - x0) ** 2 + (y - y0) ** 2) / (2 * sigma ** 2))
    else:
        g = sigma / (((x - x0) ** 2 + (y - y0) ** 2 + sigma ** 2) ** 1.5)

    # Usable gaussian range
    g_x = max(0, -ul[0]), min(br[0], img.shape[1]) - ul[0]
    g_y = max(0, -ul[1]), min(br[1], img.shape[0]) - ul[1]

    # Image range
    img_x = max(0, ul[0]), min(br[0], img.shape[1])
    img_y = max(0, ul[1]), min(br[1], img.shape[0])

    img[img_y[0]:img_y[1], img_x[0]:img_x[1]] = g[g_y[0]:g_y[1], g_x[0]:g_x[1]]
    return img


if __name__ == "__main__":
    heatmap = np.zeros((64, 64))
    pt = (0, 1)
    sigma = 1.5
    heatmap = generate_target(heatmap, pt, sigma)

The heatmap looks like this (screenshot): the index of the max value is (1, 1).

When I change pt to (10, 10), I get a correct heatmap (screenshot).

I'm not sure whether this is a bug or something else.

Thanks for your reply.

Hi! Some errors during training!

face_alignment_300w_hrnet_w18_2019-07-18-21-22_train.log
Hi! I am very interested in your code, and I want to retrain the network with your config file, but there seems to be a big difference between your results and mine. The log file above is mine; the NME is about 0.048, which is much bigger than your 0.038. Can you help me reproduce your result? By the way, in your training log the PyTorch version does not seem to be 1.0; is that right, and can you tell me which version you used? Thank you!

Results without pre-training

Hi, thanks for this awesome repo.
I noticed that all models use ImageNet-pretrained initialization. Would you mind providing results without ImageNet pre-training?

inconsistent implementation for saved .pth model

Hi,

Thanks for sharing such awesome work.

Since the current implementation does not use model.module anymore, using model.load_state_dict(state_dict) instead of model.module.load_state_dict(state_dict) works for all of the provided trained weights except HR18-300W.pth, which is still saved under model.module.

For now I just modified the keys as below and it works correctly.

from collections import OrderedDict

# Strip the 'module.' prefix left by DataParallel so the keys match the model.
new_state_dict = OrderedDict()
for key in state_dict:
    new_state_dict[key.replace('module.', '')] = state_dict[key]
model.load_state_dict(new_state_dict)

Rotations are not considered when the training NME is computed

Hello, it seems that rotations are not considered when computing the training NME.

I'll take the 300W experiments as an example.

In lib.datasets.face300w, the original ground-truth landmarks are passed through (in some cases they are flipped), and a random rotation is applied to the training images (note that the ground-truth landmarks are not rotated).

And in lib.core.function.train, the NME is computed with

preds = decode_preds(score_map, meta['center'], meta['scale'], [64, 64])
nme_batch = compute_nme(preds, meta)

where the parameters do not contain any rotation information.

And in lib.utils.transforms.transform_preds

coords[p, 0:2] = torch.tensor(transform_pixel(coords[p, 0:2], center, scale, output_size, 1, 0))

the rotation factor is set to constant 0.

So the training NME is computed from non-rotated ground truth and predictions made on a randomly rotated image (without rotating back). Is there anything wrong here?

Possibly a bug with last_epoch

In tools/train.py, line 69:

last_epoch = config.TRAIN.BEGIN_EPOCH

I think it should perhaps be config.TRAIN.END_EPOCH?

post-processing

Dear author, how can I understand the post-processing part in the decode_preds function?

# pose-processing
for n in range(coords.size(0)):
    for p in range(coords.size(1)):
        hm = output[n][p]
        px = int(math.floor(coords[n][p][0]))
        py = int(math.floor(coords[n][p][1]))
        if (px > 1) and (px < res[0]) and (py > 1) and (py < res[1]):
            diff = torch.Tensor([hm[py - 1][px] - hm[py - 1][px - 2], hm[py][px - 1]-hm[py - 2][px - 1]])
            coords[n][p] += diff.sign() * .25
coords += 0.5
preds = coords.clone()
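Not an authoritative answer, but reading the quoted code: get_preds only returns integer peak locations, so for each landmark the loop nudges the coordinate by a quarter pixel toward whichever neighbor of the peak has the larger heatmap value (diff.sign() is +1 or -1 per axis), and the final +0.5 presumably shifts from pixel indices to pixel centers. A tiny numeric illustration of the quarter-pixel nudge (values invented for the example):

    import torch

    # Invented heatmap values around a peak: the pixel to the right of the
    # peak (0.8) is larger than the one to the left (0.6), and the pixel
    # below (0.5) is smaller than the one above (0.7).
    right, left = 0.8, 0.6
    below, above = 0.5, 0.7

    diff = torch.tensor([right - left, below - above])   # tensor([ 0.2000, -0.2000])
    offset = diff.sign() * 0.25                          # tensor([ 0.2500, -0.2500])
    # The integer peak coordinate is shifted a quarter pixel toward the larger
    # neighbor on each axis, before the final +0.5 is added.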

model download

Hi, thank you for sharing your models. I can't download the models from the links you gave. Could you provide other links, e.g. Baidu Yun? Thank you!
