
udp-pose's People

Contributors

canwang-sjtu, dreampoet, hellock, hobeom, huangjunjie2017, innerlee, jin-s13, joannalxy, liuxin9608, motokimura, smhendryx, vsatyakumar, wusize, xiaohangcd, yaochaorui, zengwang430521, zhouhang95

udp-pose's Issues

AID

Hi, thanks for your excellent work. First, I'd like to ask whether you put the AID code (such as Cutout and HideAndSeek) in the UDP-Pose repository and added it to the data-processing pipeline of the training procedure, because I didn't see any of it in UDP-Pose/deep-high-resolution-net.pytorch/lib/utils/transforms.py, UDP-Pose/deep-high-resolution-net.pytorch/tools/train.py, or your config file. Second, I'd like to ask whether the AP results obtained with mmpose (your mmpose repository) are the same as the AP results reported in "AID: Pushing the Performance Boundary of Human Pose Estimation with Information Dropping Augmentation".
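For readers unfamiliar with the augmentations named above, a minimal Cutout-style sketch is shown below. It only illustrates the idea of information dropping; it is not the AID implementation, and the function name and patch size are made up.

import numpy as np

def cutout(image, patch_size=40, rng=None):
    # Zero out one random square patch of the input image.
    rng = rng or np.random.default_rng()
    h, w = image.shape[:2]
    y = int(rng.integers(0, max(1, h - patch_size)))
    x = int(rng.integers(0, max(1, w - patch_size)))
    out = image.copy()
    out[y:y + patch_size, x:x + patch_size] = 0
    return out

HideAndSeek follows the same spirit but divides the image into a grid and hides each cell with some probability.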

How can I use my own custom dataset and obtain metrics?

I have about 500 images and an annotation JSON for those 500 files. I want to evaluate the model on this new test dataset and obtain AP and AP@0.5 from it. How can I do that? I can change the data dir to my new data directory, but what should I do about the person detection results of COCO val2017, since I do not have such a file for my dataset for now?
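One possible workaround, sketched below under the assumption that the evaluation only needs a COCO-style detection-result file (the paths and file names here are hypothetical): either set TEST.USE_GT_BBOX True in the config, or convert your ground-truth boxes into the same list-of-detections format and point the config at that file.

import json

GT_FILE = 'data/mydataset/annotations/person_keypoints_test.json'          # hypothetical path
DET_FILE = 'data/mydataset/person_detection_results/test_detections.json'  # hypothetical path

with open(GT_FILE) as f:
    gt = json.load(f)

detections = []
for ann in gt['annotations']:
    # Reuse ground-truth person boxes as "detections" with score 1.0,
    # mirroring the layout of the COCO person-detection JSON.
    detections.append({
        'image_id': ann['image_id'],
        'category_id': 1,        # person
        'bbox': ann['bbox'],     # [x, y, w, h]
        'score': 1.0,
    })

with open(DET_FILE, 'w') as f:
    json.dump(detections, f)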

Where is the bottom-up version?

Hi, can you point me to the HigherHRNet + UDP code? I can't find it (ノへ ̄)
Also, how did you make the trade-off in the bottom-up methods? Will there be a paper or blog post to explain it?

Why do your JSON's bboxes have so many non-human joints?

Dear JunJie,

I am wondering why multi_scales_val_msra.json has 87481 boxes, and many of the joints are not human joints, such as a cat's or an elephant's. But if I use the GT boxes there are only around 6000+ boxes, and those joints are all human joints. I also found that "joints_vis" is all set to 1. I am very confused; can you tell me the reason?

Thank you

About KPD

In the offset model, KPD should be the hyperparameter that controls the radius of the region of interest around the ground-truth location. But in the implementation it also controls the slope (and intercept) of the two offset heatmaps (see the code below):

elif self.target_type == 'offset':
    # self.heatmap_size: [48, 64] ([w, h])
    # Per joint: one classification map plus x/y offset maps, flattened to h*w.
    target = np.zeros((self.num_joints,
                       3,
                       self.heatmap_size[1] *
                       self.heatmap_size[0]),
                      dtype=np.float32)
    feat_width = self.heatmap_size[0]
    feat_height = self.heatmap_size[1]
    feat_x_int = np.arange(0, feat_width)
    feat_y_int = np.arange(0, feat_height)
    feat_x_int, feat_y_int = np.meshgrid(feat_x_int, feat_y_int)
    feat_x_int = feat_x_int.reshape((-1,))
    feat_y_int = feat_y_int.reshape((-1,))
    kps_pos_distance_x = self.kpd
    kps_pos_distance_y = self.kpd
    feat_stride = (self.image_size - 1.0) / (self.heatmap_size - 1.0)
    for joint_id in range(self.num_joints):
        mu_x = joints[joint_id][0] / feat_stride[0]
        mu_y = joints[joint_id][1] / feat_stride[1]
        # Check that any part of the gaussian is in-bounds

        # Offsets are divided by kpd, so kpd sets both the radius of the region
        # kept below (dis <= 1) and the scale (slope) of the offset targets.
        x_offset = (mu_x - feat_x_int) / kps_pos_distance_x
        y_offset = (mu_y - feat_y_int) / kps_pos_distance_y

        dis = x_offset ** 2 + y_offset ** 2
        keep_pos = np.where((dis <= 1) & (dis >= 0))[0]
        v = target_weight[joint_id]
        if v > 0.5:
            target[joint_id, 0, keep_pos] = 1
            target[joint_id, 1, keep_pos] = x_offset[keep_pos]
            target[joint_id, 2, keep_pos] = y_offset[keep_pos]
    target = target.reshape((self.num_joints * 3,
                             self.heatmap_size[1],
                             self.heatmap_size[0]))

Is there any reason why you do that? It seems to be another "devil in the detail" worth studying...:)
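Inverting the encoding above gives the decode step this parameterization implies. The following is a minimal sketch for illustration only; the function and variable names are made up and this is not the repo's decode code.

import numpy as np

def decode_offset_maps(target, heatmap_size, kpd):
    # target: (num_joints * 3, h, w) as built above; returns (num_joints, 2) coords.
    w, h = heatmap_size
    num_joints = target.shape[0] // 3
    maps = target.reshape(num_joints, 3, h * w)
    coords = np.zeros((num_joints, 2), dtype=np.float32)
    for j in range(num_joints):
        # Highest response on the classification channel.
        idx = int(np.argmax(maps[j, 0]))
        x_int, y_int = idx % w, idx // w
        # Undo the division by kpd used when building x_offset / y_offset.
        coords[j, 0] = x_int + maps[j, 1, idx] * kpd
        coords[j, 1] = y_int + maps[j, 2, idx] * kpd
    return coords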

About the multi-scale testing

I was trying to implement multi-scale testing for my project based on HRNet's official source code. I downloaded their pre-trained model and ran it on the MPII test set, but I only got 91.6% instead of the 92.3% reported in the original paper. I know I should probably post this issue on the original HRNet GitHub page (I did, and I also emailed the author, but got no response).

So I'm posting it here, since this is a newer paper based on HRNet's source code and there are no open issues here. Below are my implementation of multi-scale testing and the MATLAB evaluation code, which evaluates PCKh directly from the .mat file (with 7247 predictions) generated by the official code; please have a look and see if there is a problem with my code:

import multiprocessing
import os
import time

import cv2
import numpy as np
import torch

# Helpers from the HRNet codebase (AverageMeter, accuracy, flip_back,
# get_affine_transform, get_final_preds, save_debug_images, _print_name_value,
# logger) are assumed to be imported as in the original function/validate.


def read_scaled_image(image_file, s, center, scale, image_size, COLOR_RGB, DATA_FORMAT, image_transform):
    # Load one image and warp it to the model input size at test scale s.
    if DATA_FORMAT == 'zip':
        from utils import zipreader
        data_numpy = zipreader.imread(image_file, cv2.IMREAD_COLOR | cv2.IMREAD_IGNORE_ORIENTATION)
    else:
        data_numpy = cv2.imread(image_file, cv2.IMREAD_COLOR | cv2.IMREAD_IGNORE_ORIENTATION)
    if COLOR_RGB:
        data_numpy = cv2.cvtColor(data_numpy, cv2.COLOR_BGR2RGB)
    trans = get_affine_transform(center, s * scale, 0, image_size)
    images_warp = cv2.warpAffine(data_numpy, trans, tuple(image_size), flags=cv2.INTER_LINEAR)
    return image_transform(images_warp)


def scale_back_output(output_hm, s, output_size):
    hm_size = [output_hm.size(3), output_hm.size(2)]
    # original_max_val1, _ = torch.max(output_hm, dim=2, keepdim=True)
    # original_max_val2, _ = torch.max(original_max_val1, dim=3, keepdim=True)
    if s != 1.0:
        hm_w_margin = int(abs(1.0 - s) * hm_size[0] / 2.0)
        hm_h_margin = int(abs(1.0 - s) * hm_size[1] / 2.0)
        if s < 1.0:
            hm_padding = torch.nn.ZeroPad2d((hm_w_margin, hm_w_margin, hm_h_margin, hm_h_margin))
            resized_hm = hm_padding(output_hm)
        else:
            resized_hm = output_hm[:, :, hm_h_margin:hm_size[1] - hm_h_margin, hm_w_margin:hm_size[0] - hm_w_margin]
        resized_hm = torch.nn.functional.interpolate(
            resized_hm,
            size=(output_size[0], output_size[1]),
            mode='bilinear',  # bilinear bicubic area
            align_corners=False
        )
    else:
        resized_hm = output_hm
        if hm_size[0] != output_size[0] or hm_size[1] != output_size[1]:
            resized_hm = torch.nn.functional.interpolate(
                resized_hm,
                size=(output_size[0], output_size[1]),
                mode='bilinear',  # bilinear bicubic area
                align_corners=False
            )

    # max_val1, _ = torch.max(resized_hm, dim=2, keepdim=True)
    # max_val2, _ = torch.max(max_val1, dim=3, keepdim=True)
    # resized_hm = resized_hm/max_val2*original_max_val2

    # resized_hm = resized_hm / torch.amax(resized_hm, dim=[2, 3], keepdim=True)
    # resized_hm = torch.nn.functional.normalize(resized_hm, dim=[2, 3], p=1)
    # resized_hm = resized_hm/(torch.sum(resized_hm, dim=[2, 3], keepdim=True) + 1e-9)
    return resized_hm


def validate(config, val_loader, val_dataset, model, criterion, output_dir, tb_log_dir, writer_dict=None, test_scale=None, image_transform=None):
    batch_time = AverageMeter()
    losses = AverageMeter()
    acc = AverageMeter()

    # switch to evaluate mode
    model.eval()

    num_samples = len(val_dataset)
    all_preds = np.zeros((num_samples, config.MODEL.NUM_JOINTS, 3), dtype=np.float32)
    all_boxes = np.zeros((num_samples, 6))
    image_path = []
    filenames = []
    imgnums = []
    idx = 0

    # PRINT_FREQ = min(config.PRINT_FREQ//10, 5)
    PRINT_FREQ = config.PRINT_FREQ
    thread_pool = multiprocessing.Pool(multiprocessing.cpu_count())

    image_size = np.array([config.MODEL.IMAGE_SIZE[1], config.MODEL.IMAGE_SIZE[0]])
    final_test_scale = test_scale if test_scale is not None else config.TEST.SCALE_FACTOR
    with torch.no_grad():
        end = time.time()

        start_time = time.time()
        for i, (input, target, target_weight, meta) in enumerate(val_loader):
            # compute output
            # print("Batch", i, "Batch Size", input.size(0))

            target = target.cuda(non_blocking=True)
            target_weight = target_weight.cuda(non_blocking=True)

            outputs = []
            hm_size = None
            for sidx, s in enumerate(sorted(final_test_scale, reverse=True)):
                print("Test Scale", s)
                if s != 1.0:
                    image_files = meta["image"]
                    centers = meta["center"].numpy()
                    scales = meta["scale"].numpy()

                    images_resized = thread_pool.starmap(read_scaled_image, [(image_file,
                                                                              s,
                                                                              center,
                                                                              scale,
                                                                              image_size,
                                                                              config.DATASET.COLOR_RGB,
                                                                              config.DATASET.DATA_FORMAT,
                                                                              image_transform) for (image_file, center, scale) in zip(image_files, centers, scales)])
                    images_resized = torch.stack(images_resized, dim=0)
                else:
                    images_resized = input

                model_outputs = model(images_resized)
                if isinstance(model_outputs, list):
                    model_outputs = model_outputs[-1]

                if config.TEST.FLIP_TEST:
                    print("Test Flip")
                    input_flipped = images_resized.flip(3)
                    output_flipped = model(input_flipped)
                    if isinstance(output_flipped, list):
                        output_flipped = output_flipped[-1]

                    output_flipped = flip_back(output_flipped.cpu().numpy(), val_dataset.flip_pairs)
                    output_flipped = torch.from_numpy(output_flipped.copy()).cuda()

                    # feature is not aligned, shift flipped heatmap for higher accuracy
                    if config.TEST.SHIFT_HEATMAP:
                        output_flipped[:, :, :, 1:] = output_flipped.clone()[:, :, :, 0:-1]

                    model_outputs = 0.5 * (model_outputs + output_flipped)

                hm_size = [model_outputs.size(3), model_outputs.size(2)]
                # hm_size = image_size
                # hm_size = [128, 128]
                output_flipped_resized = scale_back_output(model_outputs, s, hm_size)
                outputs.append(output_flipped_resized)

            for indv_output in outputs:
                _, avg_acc, _, _ = accuracy(indv_output.cpu().numpy(), target.cpu().numpy())
                print("Indv Accuracy", avg_acc)

            output = torch.stack(outputs, dim=0).mean(dim=0)

            target = scale_back_output(target, 1.0, hm_size)
            loss = criterion(output, target, target_weight)

            num_images = input.size(0)
            # measure accuracy and record loss
            losses.update(loss.item(), num_images)
            _, avg_acc, cnt, pred = accuracy(output.cpu().numpy(), target.cpu().numpy())
            print("Avg Accuracy", avg_acc)
            acc.update(avg_acc, cnt)

            # measure elapsed time
            batch_time.update(time.time() - end)
            end = time.time()

            c = meta['center'].numpy()
            s = meta['scale'].numpy()
            score = meta['score'].numpy()

            preds, maxvals = get_final_preds(config, output.clone().cpu().numpy(), c, s)

            all_preds[idx:idx + num_images, :, 0:2] = preds[:, :, 0:2]
            all_preds[idx:idx + num_images, :, 2:3] = maxvals
            # double check this all_boxes parts
            all_boxes[idx:idx + num_images, 0:2] = c[:, 0:2]
            all_boxes[idx:idx + num_images, 2:4] = s[:, 0:2]
            all_boxes[idx:idx + num_images, 4] = np.prod(s*200, 1)
            all_boxes[idx:idx + num_images, 5] = score
            image_path.extend(meta['image'])

            idx += num_images

            if i % PRINT_FREQ == 0:
                msg = 'Test: [{0}/{1}]\t' \
                      'Time {batch_time.val:.3f} ({batch_time.avg:.3f})\t' \
                      'Loss {loss.val:.4f} ({loss.avg:.4f})\t' \
                      'Accuracy {acc.val:.3f} ({acc.avg:.3f})'.format(i, len(val_loader), batch_time=batch_time, loss=losses, acc=acc)
                logger.info(msg)

                prefix = '{}_{}'.format(os.path.join(output_dir, 'val'), i)
                save_debug_images(config, input, meta, target, pred*4, output, prefix)

        total_duration = time.time() - start_time
        logger.info("Total test time: {:.1f}".format(total_duration))
        name_values, perf_indicator = val_dataset.evaluate(config, all_preds, output_dir, all_boxes, image_path, filenames, imgnums)

        model_name = config.MODEL.NAME
        if isinstance(name_values, list):
            for name_value in name_values:
                _print_name_value(name_value, model_name)
        else:
            _print_name_value(name_values, model_name)

        if writer_dict:
            writer = writer_dict['writer']
            global_steps = writer_dict['valid_global_steps']
            writer.add_scalar('valid_loss', losses.avg, global_steps)
            writer.add_scalar('valid_acc', acc.avg, global_steps)
            if isinstance(name_values, list):
                for name_value in name_values:
                    writer.add_scalars('valid', dict(name_value), global_steps)
            else:
                writer.add_scalars('valid', dict(name_values), global_steps)
            writer_dict['valid_global_steps'] = global_steps + 1

    thread_pool.close()
    thread_pool.join()
    return perf_indicator

Below is the MATLAB MPII test set evaluation code (evalMPIITest.m); you need to download the newly released test set annotations from http://human-pose.mpi-inf.mpg.de/#download:

% Evaluate performance by comparing predictions to ground truth annotations.

%%% OPTIONS %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

% IDs of prediction sets to include in results
PRED_IDS = [1, 2, 3];
% Subset of the data that the predictions correspond to ('val' or 'train')
plotcurve = false;

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

addpath ('eval')

fprintf('# MPII single-person pose evaluation script\n')

range = 0:0.01:0.5;

tableDir = './latex'; if (~exist(tableDir,'dir')), mkdir(tableDir); end
plotsDir = './plots'; if (~exist(plotsDir,'dir')), mkdir(plotsDir); end
tableTex = cell(length(PRED_IDS)+1,1);

% load ground truth
p = getExpParams(-1)
load([p.gtDir '/annolist_dataset_v12'], 'annolist');
load([p.gtDir '/mpii_human_pose_v1_u12'], 'RELEASE');
annolist_test = annolist(RELEASE.img_train == 0);
% evaluate on the "single person" subset only
single_person_test = RELEASE.single_person(RELEASE.img_train == 0);
% convert to annotation list with a single pose per entry
[annolist_test_flat, single_person_test_flat] = flatten_annolist(annolist_test,single_person_test);
% represent ground truth as a matrix 2x14xN_images
gt = annolist2matrix(annolist_test_flat(single_person_test_flat == 1));
% compute head size
headSize = getHeadSizeAll(annolist_test_flat(single_person_test_flat == 1));

pckAll = zeros(length(range),16,length(PRED_IDS));

for i = 1:length(PRED_IDS);
  % load predictions
  p = getExpParams(PRED_IDS(i));
  try
    load(p.predFilename, 'preds');
  catch
    preds = h5read(p.predFilename, '/preds');
  end
  
  if size(preds, 1) == 2
    preds = permute(preds, [3, 2, 1]);
  end
  
  % Check that there are the same number of predictions and ground truth
  % annotations. If this assertion fails, a likely cause is a mismatch in
  % subsets (eg predictions are for the training set but ground truth
  % annotations are for the validation set).
  fprintf('%d\n', length(preds))
  fprintf('%d\n', length(gt))
  assert(length(preds) == length(gt));

  pred_flat = annolist_test_flat(single_person_test_flat == 1);
  for idx = 1:length(preds);
    for pidx = 1:length(pred_flat(idx).annorect.annopoints.point);
      joint = pred_flat(idx).annorect.annopoints.point(pidx).id + 1;
      xy = preds(idx, joint, :);
      pred_flat(idx).annorect.annopoints.point(pidx).x = xy(1);
      pred_flat(idx).annorect.annopoints.point(pidx).y = xy(2);
    end
  end

  % pred = annolist2matrix(pred_flat(single_person_flat == 1));
  pred = annolist2matrix(pred_flat);
  
  % only gt is allowed to have NaN
  pred(isnan(pred)) = inf;

  % compute distance to ground truth joints
  dist = getDistPCKh(pred,gt,headSize);

  % compute PCKh
  pck = computePCK(dist,range);

  % plot results
  [row, header] = genTablePCK(pck(end,:),p.name);
  tableTex{1} = header;
  tableTex{i+1} = row;

  pckAll(:,:,i) = pck;

  auc = area_under_curve(scale01(range),pck(:,end));
  fprintf('%s, AUC: %1.1f\n',p.name,auc);
end

% Save results
fid = fopen([tableDir '/pckh.tex'],'wt');assert(fid ~= -1);
for i=1:length(tableTex),fprintf(fid,'%s\n',tableTex{i}); end; fclose(fid);

% plot curves
bSave = true;
if (plotcurve)
    plotCurveNew(squeeze(pckAll(:,end,:)),range,PRED_IDS,'PCKh total, MPII',[plotsDir '/pckh-total-mpii'],bSave,range(1:5:end));
    plotCurveNew(squeeze(mean(pckAll(:,[1 6],:),2)),range,PRED_IDS,'PCKh ankle, MPII',[plotsDir '/pckh-ankle-mpii'],bSave,range(1:5:end));
    plotCurveNew(squeeze(mean(pckAll(:,[2 5],:),2)),range,PRED_IDS,'PCKh knee, MPII',[plotsDir '/pckh-knee-mpii'],bSave,range(1:5:end));
    plotCurveNew(squeeze(mean(pckAll(:,[3 4],:),2)),range,PRED_IDS,'PCKh hip, MPII',[plotsDir '/pckh-hip-mpii'],bSave,range(1:5:end));
    plotCurveNew(squeeze(mean(pckAll(:,[7 12],:),2)),range,PRED_IDS,'PCKh wrist, MPII',[plotsDir '/pckh-wrist-mpii'],bSave,range(1:5:end));
    plotCurveNew(squeeze(mean(pckAll(:,[8 11],:),2)),range,PRED_IDS,'PCKh elbow, MPII',[plotsDir '/pckh-elbow-mpii'],bSave,range(1:5:end));
    plotCurveNew(squeeze(mean(pckAll(:,[9 10],:),2)),range,PRED_IDS,'PCKh shoulder, MPII',[plotsDir '/pckh-shoulder-mpii'],bSave,range(1:5:end));
    plotCurveNew(squeeze(mean(pckAll(:,[13 14],:),2)),range,PRED_IDS,'PCKh head, MPII',[plotsDir '/pckh-head-mpii'],bSave,range(1:5:end));
end

display('Done.')

Why is norm = np.ones((pred.shape[0], 2)) * np.array([h, w]) / 10 needed in accuracy?

Dear JunJie,

In

norm = np.ones((pred.shape[0], 2)) * np.array([h, w]) / 10

I don't understand why norm is used and why it is divided by 10. And in

def calc_dists(preds, target, normalize):

why do we need

normed_preds = preds[n, c, :] / normalize[n]
normed_targets = target[n, c, :] / normalize[n]

When I print them, preds[n, c, :] and target[n, c, :] are very different, such as [27, 30] and [4, 8].

Thank you
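For context, here is a minimal sketch of the HRNet-style calc_dists this question refers to (the repo's exact code may differ slightly). Dividing by np.array([h, w]) / 10 expresses each prediction-to-target distance as a fraction of one tenth of the heatmap size, so the accuracy function can compare it against a fixed threshold in those units independent of resolution.

import numpy as np

def calc_dists(preds, target, normalize):
    # preds, target: (batch, num_joints, 2) heatmap coordinates
    # normalize: (batch, 2) normalization factors, e.g. heatmap [h, w] / 10
    dists = np.zeros((preds.shape[1], preds.shape[0]))
    for n in range(preds.shape[0]):
        for c in range(preds.shape[1]):
            if target[n, c, 0] > 1 and target[n, c, 1] > 1:
                normed_preds = preds[n, c, :] / normalize[n]
                normed_targets = target[n, c, :] / normalize[n]
                dists[c, n] = np.linalg.norm(normed_preds - normed_targets)
            else:
                # Joints with no valid target location are skipped.
                dists[c, n] = -1
    return dists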

Can't make

At the first step (make), it raises: EnvironmentError: The nvcc binary could not be located in your $PATH. Either add it to your path...
Could you please tell me the solution?

Missing keypoint-aware occlusion augmentation

In v1 of the paper, a keypoint-aware occlusion augmentation (OA for short) is proposed, while in v2 it no longer appears. But when I compare Table 1 in v1 with Table 2 in v2, the results are consistent.
After reading the source code of AID in the mmpose pull request, I failed to find anything related to OA.
Why did you drop OA? Does that indicate that using Cutout and HaS gets the same result as OA does?

Question about the pre-process

In _xywh2cs(self, x, y, w, h), to get the scale, the bounding box is rescaled to match the aspect ratio of the model input.

From the source code, I think the actual workflow is to rescale the bounding box and then crop the image according to the rescaled bounding box, which is inconsistent with the workflow mentioned in the paper (which is to rescale the cropped image). I also found this behavior in mmpose, so it seems to be common practice.

Is there any special reason to do so? From my point of view, this behavior may include much more context than expected. Or is the input aspect ratio chosen for COCO, based on statistics of the aspect ratios of person bounding boxes?
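For reference, a minimal sketch of the usual _xywh2cs logic in HRNet-style codebases is given below, assuming the common pixel_std = 200 and 1.25 padding conventions; the repo's exact version may differ.

import numpy as np

def xywh2cs(x, y, w, h, aspect_ratio, pixel_std=200.0, padding=1.25):
    # Convert an (x, y, w, h) box to the center/scale pair used by top-down models.
    center = np.array([x + w * 0.5, y + h * 0.5], dtype=np.float32)
    # Pad the box to the model-input aspect ratio (w / h) before cropping,
    # which is the behavior the question is about.
    if w > aspect_ratio * h:
        h = w / aspect_ratio
    else:
        w = h * aspect_ratio
    scale = np.array([w / pixel_std, h / pixel_std], dtype=np.float32) * padding
    return center, scale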

Confusion about merging the original HRNet code into mmpose

Sorry to bother you. I use UDP for clothing landmark detection, and I implemented it in your original repo. But when I try to merge it into mmpose, the mAP is not normal compared with the original implementation. Do you have any advice? I'm new to mmpose. (I implemented my dataset by inheriting from 'Topdowndataset' instead of the original Jointdataset.) Two more questions:
1. I notice that the implementation of the hyperparameter KPD differs between the original repo and mmpose. In mmpose it must be calculated as valid_radius = factor * heatmap_size[1], so does that mean I must change the factor if I use a different heatmap_size? (See the quick arithmetic after this question.)
2. Do I need to change KPD for my own dataset, or is KPD = 3.5 only suitable for human pose?
Thanks a lot!!! :)
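A quick arithmetic check of the relation between the two parameterizations; the 0.0546875 factor is assumed here as the mmpose-style default, so treat it as illustrative.

heatmap_size = [48, 64]            # [w, h]
factor = 0.0546875                 # assumed radius factor
valid_radius = factor * heatmap_size[1]
print(valid_radius)                # 3.5, i.e. the original KPD value

So if the factor is kept fixed, the radius already rescales with the heatmap height; to keep the same absolute KPD with a different heatmap size, set factor = kpd / heatmap_size[1].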

Question about the code

Hello, in deep-high-resolution-net.pytorch/lib/utils/transforms.py, line 41, output_flipped = output_flipped.reshape(shape_ori[0],-1,3,shape_ori[2],shape_ori[3]) raises an error, because for the COCO dataset the shape should be [64, 17, 64, 48], and the 17 cannot be divided by 3 in the reshape. How can this be solved?
Thanks for your reply!
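A small numpy check of why the reshape fails for a 17-channel heatmap output but works for offset-style targets with num_joints * 3 = 51 channels (the shapes below simply mirror the ones quoted in the question):

import numpy as np

x = np.zeros((64, 17, 64, 48))
# x.reshape(64, -1, 3, 64, 48) would raise ValueError: 17 channels cannot be split into groups of 3.

y = np.zeros((64, 51, 64, 48))              # offset targets: 17 joints * 3 channels
print(y.reshape(64, -1, 3, 64, 48).shape)   # (64, 17, 3, 64, 48)

So this reshape presumably applies only to the offset target type; with plain Gaussian heatmaps the flip-back path that does not group channels by 3 would be needed.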

ERROR: Could not find a version that satisfies the requirement opencv-python==3.4.1.15

Following your Quick Start:

pip install -r requirements.txt
Collecting EasyDict==1.7
Using cached easydict-1.7.tar.gz (6.2 kB)
ERROR: Could not find a version that satisfies the requirement opencv-python==3.4.1.15 (from -r requirements.txt (line 2)) (from versions: 3.4.2.17, 3.4.3.18, 3.4.4.19, 3.4.5.20, 3.4.6.27, 3.4.7.28, 3.4.8.29, 3.4.9.31, 3.4.9.33, 3.4.10.35, 3.4.10.37, 3.4.11.39, 3.4.11.41, 3.4.11.43, 3.4.11.45, 4.0.0.21, 4.0.1.23, 4.0.1.24, 4.1.0.25, 4.1.1.26, 4.1.2.30, 4.2.0.32, 4.2.0.34, 4.3.0.36, 4.3.0.38, 4.4.0.40, 4.4.0.42, 4.4.0.44, 4.4.0.46)
ERROR: No matching distribution found for opencv-python==3.4.1.15 (from -r requirements.txt (line 2))

Excuse me, which Python version and Ubuntu version do you use? I am using Ubuntu 18.04 and tried Python 3.7 and 3.8 without success.

The result of person detector

Hi, Huang, can you offer the JSON files of the person detection results on the val and test datasets?
Where can I download them? Thank you.

I need your help

Hello Junjie Huang, I came across you in someone else's project. I have a question about integral regression that I have not been able to resolve, and I think you could help me. It probably won't take much of your time, but it matters a lot to me. Could you please help me? Thank you!

There is no w32_256x256_adam_lr1e-3.yaml

I run

python tools/test.py \
    --cfg experiments/coco/hrnet/w32_256x192_adam_lr1e-3.yaml \
    TEST.MODEL_FILE models/pytorch/pose_coco/pose_hrnet_w32_256x192.pth \
    TEST.USE_GT_BBOX False

from https://github.com/HuangJunJie2017/UDP-Pose/tree/master/deep-high-resolution-net.pytorch

but experiments/coco/hrnet/ only contains w32_256x192_adam_lr1e-3_offset_ofm.yaml, which does not match pose_hrnet_w32_256x192.pth:
https://github.com/HuangJunJie2017/UDP-Pose/tree/master/deep-high-resolution-net.pytorch/experiments/coco/hrnet

Also, how do I run your UDP HRNet-W32 256x192 model?

Configure on 384x288

Hi Huang,
I'm very interested in your work. Could you send me the config file for the 384x288 input size?

Thank you very much

Download the pre-trained model

Thank you for the code. I have a question concerning the pre-trained model: is it possible to download it from a cloud storage service other than BaiduDisk (e.g. Google Drive or OneDrive)? I am not able to download the file without a Baidu account (nor by means of a download manager). Thanks in advance!

Keypoints shifted?

I have tested HRNet-W32 256x192 with and without UDP on the COCO val set. Your implementation achieves a better score, 78% AP, compared to 76.5% AP for the original. However, when I check the visualization of the output, I find that the keypoints are not really accurate in position. As shown in the screenshot, the left image is the output of your implementation and the right one is from the original HRNet. Do you know the reason?

[Screenshot from 2020-09-25 11-58-38]

cvpr2020

Bro, would you mind sharing your initial CVPR review scores? I am studying your paper.

Reproduce

Hi @HuangJunJie2017,

thanks for releasing the code. I'm trying to use it to reproduce the results (UDP-HRNet-W32) reported in the paper. I used the trained model stored on BaiduDisk and tested it on the COCO val2017 set. Here are the results I obtained:

offset | 256x192 | w32 | gt bbox | 74.6
offset | 256x192 | w32 | det bbox | 73.3

I wonder whether there is a bug somewhere?

Good job, but a little flaw?

In "Results on COCO val2017 with detector having human AP of 65.1 on COCO val2017 dataset", you give the HRNet results from the original paper rather than results tested with the same detector as UDP (AP 65.1). Hmm... I would be happy to know whether HRNet would then be better than UDP. Or, what causes the unsatisfactory result of UDP when tested on COCO test-dev?
Your results:
[image]

HRNet results with the AP 60.9 person detector:
[image]

Can you give some explanation of the affine transform, i.e. the use of get_warpmatrix and warpAffine?

For these two lines:

trans = get_warpmatrix(r, c*2.0, self.image_size-1.0, s)
input = cv2.warpAffine(data_numpy, trans, (int(self.image_size[0]), int(self.image_size[1])), flags=cv2.WARP_INVERSE_MAP | cv2.INTER_LINEAR)

Can you give some explanation of them?

  • Why pass c*2 (why multiply by 2)?
  • How does it work with the given parameters?

Thank you very much.

Results on COCO val2017

Hello, thanks for your awesome code. I see that pose estimation uses boxes from "Results on COCO val2017 with detector having human AP of 56.4 on COCO val2017 dataset", but why don't we use person_keypoints_val2017.json directly?

Pay Attention to the Constraint Cue

Hello,

Where can I find the implementation code for the paper "How to Train Your Robust Human Pose Estimator: Pay Attention to the Constraint Cue"?

In the paper, how is the offset δ of the occlusion center randomized? Is there a range limit?
Is one keypoint chosen for augmentation every time, or is there a random probability?

Test accuracy too low when using offset

Thanks for your great work. I have a question: AP and AR are higher than without UDP, but the average accuracy is too low. With UDP the test accuracy is around 0.3, but without UDP it is 0.9 ~ 1.0. Can you help me? Thanks.

udp code!

So for HRNet with UDP, did you only modify some details in the dataset, e.g. setting feature_stride = (image_size-1)/(heatmap_size-1)? Where did you modify the encode and decode process?
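A small numeric illustration of the difference between the two stride conventions mentioned above (the sizes are the usual 256x192 input with a 64x48 heatmap; this is just arithmetic, not code from the repo):

import numpy as np

image_size = np.array([192.0, 256.0])      # [w, h] network input
heatmap_size = np.array([48.0, 64.0])

stride_biased = image_size / heatmap_size                    # classic: [4.0, 4.0]
stride_unbiased = (image_size - 1.0) / (heatmap_size - 1.0)  # UDP-style: ~[4.064, 4.048]

hm_corner = np.array([47.0, 63.0])         # last pixel center of the heatmap
print(hm_corner * stride_biased)           # [188. 252.] -> misses the input's last pixel
print(hm_corner * stride_unbiased)         # [191. 255.] -> lands exactly on it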

About evaluation index

In evaluate.py, the evaluation index looks like PCK rather than OKS to me. Am I wrong?

May I ask questions related to the paper?

In Section 3.1.1 of the paper, the way to transform the output back to the source is described as:
[image of the equation]
I have difficulty understanding this explanation. I don't see a padding-by-one operation to recover source coordinates in other works (which are supposed to be biased).

Section 3.1.2: "Specifically, we adopt unit length as the image size measurement criterion, ..."
The unit length is "the distance between two adjacent pixels". Pixels are supposed to be square, and the distance between pixels is supposed to equal the pixel size. What is the difference between measuring by "distance" and measuring by "pixel"? I have difficulty understanding this concept.

If paper-related questions are not appropriate for GitHub issues here, may I send you a separate email about them?
Thanks.
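One way to read this, consistent with the feat_stride = (image_size - 1) / (heatmap_size - 1) used in the code quoted in the "About KPD" issue above (a sketch of the interpretation, not a quotation of the paper's equation):

\[
\hat{k}_{\mathrm{src}} \;=\; \hat{k}_{\mathrm{hm}} \cdot \frac{w_{\mathrm{src}} - 1}{w_{\mathrm{hm}} - 1}
\]

Measuring an image of $w$ pixels as $w - 1$ unit lengths (pixel-center-to-pixel-center distances) makes the first and last pixel centers of the heatmap map exactly onto those of the source image, rather than onto slightly shifted positions.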

Why do preds need to be multiplied by scale?

def transform_preds(coords, center, scale, output_size):
    # scale is stored in units of 200 pixels (the pixel_std convention),
    # so multiply back to get the cropped-box size in pixels.
    scale = scale * 200.0
    # Map heatmap coordinates (0 .. output_size - 1) onto the box size, then
    # shift from the box-centered frame into original-image coordinates.
    scale_x = scale[0] / (output_size[0] - 1.0)
    scale_y = scale[1] / (output_size[1] - 1.0)
    target_coords = np.zeros(coords.shape)
    target_coords[:, 0] = coords[:, 0] * scale_x + center[0] - scale[0] * 0.5
    target_coords[:, 1] = coords[:, 1] * scale_y + center[1] - scale[1] * 0.5
    return target_coords
