
udp-pose's People

Contributors

canwang-sjtu, dreampoet, hellock, hobeom, huangjunjie2017, innerlee, jin-s13, joannalxy, liuxin9608, motokimura, smhendryx, vsatyakumar, wusize, xiaohangcd, yaochaorui, zengwang430521, zhouhang95

udp-pose's Issues

AID

Hi, thanks for your excellent work. First, I'd like to ask whether you put the AID code (such as Cutout and HideAndSeek) in the UDP-Pose repository and added it to the data-processing pipeline of the training procedure, because I didn't see any of it in UDP-Pose/deep-high-resolution-net.pytorch/lib/utils/transforms.py, UDP-Pose/deep-high-resolution-net.pytorch/tools/train.py, or your config file. Second, I'd like to ask whether the AP results obtained with mmpose (your mmpose repository) are the same as the AP results reported in "AID: Pushing the Performance Boundary of Human Pose Estimation with Information Dropping Augmentation".
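For readers unfamiliar with the augmentations named above, a minimal Cutout-style sketch is shown below. It only illustrates the idea of information dropping; it is not the AID implementation, and the function name and patch size are made up.

import numpy as np

def cutout(image, patch_size=40, rng=None):
    # Zero out one random square patch of the input image.
    rng = rng or np.random.default_rng()
    h, w = image.shape[:2]
    y = int(rng.integers(0, max(1, h - patch_size)))
    x = int(rng.integers(0, max(1, w - patch_size)))
    out = image.copy()
    out[y:y + patch_size, x:x + patch_size] = 0
    return out

HideAndSeek follows the same spirit but divides the image into a grid and hides each cell with some probability.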

How can I use my own custom dataset and obtain metrics?

I have about 500 images and an annotation JSON for those 500 files. I want to evaluate the model on this new test dataset and obtain AP and AP@0.5 from it. How can I do that? I can change the data dir to my new data directory, but what should I do about the person detection results of COCO val2017, since I do not have such a file for my dataset for now?
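One possible workaround, sketched below under the assumption that the evaluation only needs a COCO-style detection-result file (the paths and file names here are hypothetical): either set TEST.USE_GT_BBOX True in the config, or convert your ground-truth boxes into the same list-of-detections format and point the config at that file.

import json

GT_FILE = 'data/mydataset/annotations/person_keypoints_test.json'          # hypothetical path
DET_FILE = 'data/mydataset/person_detection_results/test_detections.json'  # hypothetical path

with open(GT_FILE) as f:
    gt = json.load(f)

detections = []
for ann in gt['annotations']:
    # Reuse ground-truth person boxes as "detections" with score 1.0,
    # mirroring the layout of the COCO person-detection JSON.
    detections.append({
        'image_id': ann['image_id'],
        'category_id': 1,        # person
        'bbox': ann['bbox'],     # [x, y, w, h]
        'score': 1.0,
    })

with open(DET_FILE, 'w') as f:
    json.dump(detections, f)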

Where is the bottom-up version?

Hi, can you point me to the HigherHRNet + UDP code? I can't find it (ノへ ̄)
Also, how did you make the trade-off in the bottom-up methods? Will there be a paper or blog post to explain it?

Why do your JSON's bboxes have so many non-human joints?

Dear JunJie,

I am wondering why multi_scales_val_msra.json has 87481 boxes, and many of the joints are not human joints, such as a cat's or an elephant's. But if I use the GT boxes there are only around 6000+ boxes, and those joints are all human joints. I also found that "joints_vis" is all set to 1. I am very confused; can you tell me the reason?

Thank you

About KPD

In the offset model, KPD should be the hyperparameter that controls the radius of the region of interest around the ground-truth location. But in the implementation it also controls the slope (and intercept) of the two offset heatmaps (see the code below):

elif self.target_type == 'offset':
    # self.heatmap_size: [48, 64] ([w, h])
    # Per joint: one classification map plus x/y offset maps, flattened to h*w.
    target = np.zeros((self.num_joints,
                       3,
                       self.heatmap_size[1] *
                       self.heatmap_size[0]),
                      dtype=np.float32)
    feat_width = self.heatmap_size[0]
    feat_height = self.heatmap_size[1]
    feat_x_int = np.arange(0, feat_width)
    feat_y_int = np.arange(0, feat_height)
    feat_x_int, feat_y_int = np.meshgrid(feat_x_int, feat_y_int)
    feat_x_int = feat_x_int.reshape((-1,))
    feat_y_int = feat_y_int.reshape((-1,))
    kps_pos_distance_x = self.kpd
    kps_pos_distance_y = self.kpd
    feat_stride = (self.image_size - 1.0) / (self.heatmap_size - 1.0)
    for joint_id in range(self.num_joints):
        mu_x = joints[joint_id][0] / feat_stride[0]
        mu_y = joints[joint_id][1] / feat_stride[1]
        # Check that any part of the gaussian is in-bounds

        # Offsets are divided by kpd, so kpd sets both the radius of the region
        # kept below (dis <= 1) and the scale (slope) of the offset targets.
        x_offset = (mu_x - feat_x_int) / kps_pos_distance_x
        y_offset = (mu_y - feat_y_int) / kps_pos_distance_y

        dis = x_offset ** 2 + y_offset ** 2
        keep_pos = np.where((dis <= 1) & (dis >= 0))[0]
        v = target_weight[joint_id]
        if v > 0.5:
            target[joint_id, 0, keep_pos] = 1
            target[joint_id, 1, keep_pos] = x_offset[keep_pos]
            target[joint_id, 2, keep_pos] = y_offset[keep_pos]
    target = target.reshape((self.num_joints * 3,
                             self.heatmap_size[1],
                             self.heatmap_size[0]))

Is there any reason why you do that? It seems to be another "devil in the detail" worth studying...:)
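Inverting the encoding above gives the decode step this parameterization implies. The following is a minimal sketch for illustration only; the function and variable names are made up and this is not the repo's decode code.

import numpy as np

def decode_offset_maps(target, heatmap_size, kpd):
    # target: (num_joints * 3, h, w) as built above; returns (num_joints, 2) coords.
    w, h = heatmap_size
    num_joints = target.shape[0] // 3
    maps = target.reshape(num_joints, 3, h * w)
    coords = np.zeros((num_joints, 2), dtype=np.float32)
    for j in range(num_joints):
        # Highest response on the classification channel.
        idx = int(np.argmax(maps[j, 0]))
        x_int, y_int = idx % w, idx // w
        # Undo the division by kpd used when building x_offset / y_offset.
        coords[j, 0] = x_int + maps[j, 1, idx] * kpd
        coords[j, 1] = y_int + maps[j, 2, idx] * kpd
    return coords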

About the multi-scale testing

I was trying to implement multi-scale testing for my project based on HRNet's official source code. I downloaded their pre-trained model and ran it on the MPII test set, but I only got 91.6% instead of the 92.3% reported in the original paper. I know I should probably post this issue on the original HRNet GitHub page (I did, and I also emailed the author, but got no response).

So I'm posting it here, since this is a newer paper based on HRNet's source code and there are no open issues here. Below are my implementation of multi-scale testing and the MATLAB evaluation code, which evaluates PCKh directly from the .mat file (with 7247 predictions) generated by the official code; please have a look and see if there is a problem with my code:

import multiprocessing
import os
import time

import cv2
import numpy as np
import torch

# Helpers from the HRNet codebase (AverageMeter, accuracy, flip_back,
# get_affine_transform, get_final_preds, save_debug_images, _print_name_value,
# logger) are assumed to be imported as in the original function/validate.


def read_scaled_image(image_file, s, center, scale, image_size, COLOR_RGB, DATA_FORMAT, image_transform):
    # Load one image and warp it to the model input size at test scale s.
    if DATA_FORMAT == 'zip':
        from utils import zipreader
        data_numpy = zipreader.imread(image_file, cv2.IMREAD_COLOR | cv2.IMREAD_IGNORE_ORIENTATION)
    else:
        data_numpy = cv2.imread(image_file, cv2.IMREAD_COLOR | cv2.IMREAD_IGNORE_ORIENTATION)
    if COLOR_RGB:
        data_numpy = cv2.cvtColor(data_numpy, cv2.COLOR_BGR2RGB)
    trans = get_affine_transform(center, s * scale, 0, image_size)
    images_warp = cv2.warpAffine(data_numpy, trans, tuple(image_size), flags=cv2.INTER_LINEAR)
    return image_transform(images_warp)


def scale_back_output(output_hm, s, output_size):
    hm_size = [output_hm.size(3), output_hm.size(2)]
    # original_max_val1, _ = torch.max(output_hm, dim=2, keepdim=True)
    # original_max_val2, _ = torch.max(original_max_val1, dim=3, keepdim=True)
    if s != 1.0:
        hm_w_margin = int(abs(1.0 - s) * hm_size[0] / 2.0)
        hm_h_margin = int(abs(1.0 - s) * hm_size[1] / 2.0)
        if s < 1.0:
            hm_padding = torch.nn.ZeroPad2d((hm_w_margin, hm_w_margin, hm_h_margin, hm_h_margin))
            resized_hm = hm_padding(output_hm)
        else:
            resized_hm = output_hm[:, :, hm_h_margin:hm_size[1] - hm_h_margin, hm_w_margin:hm_size[0] - hm_w_margin]
        resized_hm = torch.nn.functional.interpolate(
            resized_hm,
            size=(output_size[0], output_size[1]),
            mode='bilinear',  # bilinear bicubic area
            align_corners=False
        )
    else:
        resized_hm = output_hm
        if hm_size[0] != output_size[0] or hm_size[1] != output_size[1]:
            resized_hm = torch.nn.functional.interpolate(
                resized_hm,
                size=(output_size[0], output_size[1]),
                mode='bilinear',  # bilinear bicubic area
                align_corners=False
            )

    # max_val1, _ = torch.max(resized_hm, dim=2, keepdim=True)
    # max_val2, _ = torch.max(max_val1, dim=3, keepdim=True)
    # resized_hm = resized_hm/max_val2*original_max_val2

    # resized_hm = resized_hm / torch.amax(resized_hm, dim=[2, 3], keepdim=True)
    # resized_hm = torch.nn.functional.normalize(resized_hm, dim=[2, 3], p=1)
    # resized_hm = resized_hm/(torch.sum(resized_hm, dim=[2, 3], keepdim=True) + 1e-9)
    return resized_hm


def validate(config, val_loader, val_dataset, model, criterion, output_dir, tb_log_dir, writer_dict=None, test_scale=None, image_transform=None):
    batch_time = AverageMeter()
    losses = AverageMeter()
    acc = AverageMeter()

    # switch to evaluate mode
    model.eval()

    num_samples = len(val_dataset)
    all_preds = np.zeros((num_samples, config.MODEL.NUM_JOINTS, 3), dtype=np.float32)
    all_boxes = np.zeros((num_samples, 6))
    image_path = []
    filenames = []
    imgnums = []
    idx = 0

    # PRINT_FREQ = min(config.PRINT_FREQ//10, 5)
    PRINT_FREQ = config.PRINT_FREQ
    thread_pool = multiprocessing.Pool(multiprocessing.cpu_count())

    image_size = np.array([config.MODEL.IMAGE_SIZE[1], config.MODEL.IMAGE_SIZE[0]])
    final_test_scale = test_scale if test_scale is not None else config.TEST.SCALE_FACTOR
    with torch.no_grad():
        end = time.time()

        start_time = time.time()
        for i, (input, target, target_weight, meta) in enumerate(val_loader):
            # compute output
            # print("Batch", i, "Batch Size", input.size(0))

            target = target.cuda(non_blocking=True)
            target_weight = target_weight.cuda(non_blocking=True)

            outputs = []
            hm_size = None
            for sidx, s in enumerate(sorted(final_test_scale, reverse=True)):
                print("Test Scale", s)
                if s != 1.0:
                    image_files = meta["image"]
                    centers = meta["center"].numpy()
                    scales = meta["scale"].numpy()

                    images_resized = thread_pool.starmap(read_scaled_image, [(image_file,
                                                                              s,
                                                                              center,
                                                                              scale,
                                                                              image_size,
                                                                              config.DATASET.COLOR_RGB,
                                                                              config.DATASET.DATA_FORMAT,
                                                                              image_transform) for (image_file, center, scale) in zip(image_files, centers, scales)])
                    images_resized = torch.stack(images_resized, dim=0)
                else:
                    images_resized = input

                model_outputs = model(images_resized)
                if isinstance(model_outputs, list):
                    model_outputs = model_outputs[-1]

                if config.TEST.FLIP_TEST:
                    print("Test Flip")
                    input_flipped = images_resized.flip(3)
                    output_flipped = model(input_flipped)
                    if isinstance(output_flipped, list):
                        output_flipped = output_flipped[-1]

                    output_flipped = flip_back(output_flipped.cpu().numpy(), val_dataset.flip_pairs)
                    output_flipped = torch.from_numpy(output_flipped.copy()).cuda()

                    # feature is not aligned, shift flipped heatmap for higher accuracy
                    if config.TEST.SHIFT_HEATMAP:
                        output_flipped[:, :, :, 1:] = output_flipped.clone()[:, :, :, 0:-1]

                    model_outputs = 0.5 * (model_outputs + output_flipped)

                hm_size = [model_outputs.size(3), model_outputs.size(2)]
                # hm_size = image_size
                # hm_size = [128, 128]
                output_flipped_resized = scale_back_output(model_outputs, s, hm_size)
                outputs.append(output_flipped_resized)

            for indv_output in outputs:
                _, avg_acc, _, _ = accuracy(indv_output.cpu().numpy(), target.cpu().numpy())
                print("Indv Accuracy", avg_acc)

            output = torch.stack(outputs, dim=0).mean(dim=0)

            target = scale_back_output(target, 1.0, hm_size)
            loss = criterion(output, target, target_weight)

            num_images = input.size(0)
            # measure accuracy and record loss
            losses.update(loss.item(), num_images)
            _, avg_acc, cnt, pred = accuracy(output.cpu().numpy(), target.cpu().numpy())
            print("Avg Accuracy", avg_acc)
            acc.update(avg_acc, cnt)

            # measure elapsed time
            batch_time.update(time.time() - end)
            end = time.time()

            c = meta['center'].numpy()
            s = meta['scale'].numpy()
            score = meta['score'].numpy()

            preds, maxvals = get_final_preds(config, output.clone().cpu().numpy(), c, s)

            all_preds[idx:idx + num_images, :, 0:2] = preds[:, :, 0:2]
            all_preds[idx:idx + num_images, :, 2:3] = maxvals
            # double check this all_boxes parts
            all_boxes[idx:idx + num_images, 0:2] = c[:, 0:2]
            all_boxes[idx:idx + num_images, 2:4] = s[:, 0:2]
            all_boxes[idx:idx + num_images, 4] = np.prod(s*200, 1)
            all_boxes[idx:idx + num_images, 5] = score
            image_path.extend(meta['image'])

            idx += num_images

            if i % PRINT_FREQ == 0:
                msg = 'Test: [{0}/{1}]\t' \
                      'Time {batch_time.val:.3f} ({batch_time.avg:.3f})\t' \
                      'Loss {loss.val:.4f} ({loss.avg:.4f})\t' \
                      'Accuracy {acc.val:.3f} ({acc.avg:.3f})'.format(i, len(val_loader), batch_time=batch_time, loss=losses, acc=acc)
                logger.info(msg)

                prefix = '{}_{}'.format(os.path.join(output_dir, 'val'), i)
                save_debug_images(config, input, meta, target, pred*4, output, prefix)

        total_duration = time.time() - start_time
        logger.info("Total test time: {:.1f}".format(total_duration))
        name_values, perf_indicator = val_dataset.evaluate(config, all_preds, output_dir, all_boxes, image_path, filenames, imgnums)

        model_name = config.MODEL.NAME
        if isinstance(name_values, list):
            for name_value in name_values:
                _print_name_value(name_value, model_name)
        else:
            _print_name_value(name_values, model_name)

        if writer_dict:
            writer = writer_dict['writer']
            global_steps = writer_dict['valid_global_steps']
            writer.add_scalar('valid_loss', losses.avg, global_steps)
            writer.add_scalar('valid_acc', acc.avg, global_steps)
            if isinstance(name_values, list):
                for name_value in name_values:
                    writer.add_scalars('valid', dict(name_value), global_steps)
            else:
                writer.add_scalars('valid', dict(name_values), global_steps)
            writer_dict['valid_global_steps'] = global_steps + 1

    thread_pool.close()
    thread_pool.join()
    return perf_indicator

Below is the MATLAB MPII test set evaluation code (evalMPIITest.m); you need to download the newly released test set annotations from http://human-pose.mpi-inf.mpg.de/#download:

% Evaluate performance by comparing predictions to ground truth annotations.

%%% OPTIONS %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

% IDs of prediction sets to include in results
PRED_IDS = [1, 2, 3];
% Subset of the data that the predictions correspond to ('val' or 'train')
plotcurve = false;

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

addpath ('eval')

fprintf('# MPII single-person pose evaluation script\n')

range = 0:0.01:0.5;

tableDir = './latex'; if (~exist(tableDir,'dir')), mkdir(tableDir); end
plotsDir = './plots'; if (~exist(plotsDir,'dir')), mkdir(plotsDir); end
tableTex = cell(length(PRED_IDS)+1,1);

% load ground truth
p = getExpParams(-1)
load([p.gtDir '/annolist_dataset_v12'], 'annolist');
load([p.gtDir '/mpii_human_pose_v1_u12'], 'RELEASE');
annolist_test = annolist(RELEASE.img_train == 0);
% evaluate on the "single person" subset only
single_person_test = RELEASE.single_person(RELEASE.img_train == 0);
% convert to annotation list with a single pose per entry
[annolist_test_flat, single_person_test_flat] = flatten_annolist(annolist_test,single_person_test);
% represent ground truth as a matrix 2x14xN_images
gt = annolist2matrix(annolist_test_flat(single_person_test_flat == 1));
% compute head size
headSize = getHeadSizeAll(annolist_test_flat(single_person_test_flat == 1));

pckAll = zeros(length(range),16,length(PRED_IDS));

for i = 1:length(PRED_IDS);
  % load predictions
  p = getExpParams(PRED_IDS(i));
  try
    load(p.predFilename, 'preds');
  catch
    preds = h5read(p.predFilename, '/preds');
  end
  
  if size(preds, 1) == 2
    preds = permute(preds, [3, 2, 1]);
  end
  
  % Check that there are the same number of predictions and ground truth
  % annotations. If this assertion fails, a likely cause is a mismatch in
  % subsets (eg predictions are for the training set but ground truth
  % annotations are for the validation set).
  fprintf('%d\n', length(preds))
  fprintf('%d\n', length(gt))
  assert(length(preds) == length(gt));

  pred_flat = annolist_test_flat(single_person_test_flat == 1);
  for idx = 1:length(preds);
    for pidx = 1:length(pred_flat(idx).annorect.annopoints.point);
      joint = pred_flat(idx).annorect.annopoints.point(pidx).id + 1;
      xy = preds(idx, joint, :);
      pred_flat(idx).annorect.annopoints.point(pidx).x = xy(1);
      pred_flat(idx).annorect.annopoints.point(pidx).y = xy(2);
    end
  end

  % pred = annolist2matrix(pred_flat(single_person_flat == 1));
  pred = annolist2matrix(pred_flat);
  
  % only gt is allowed to have NaN
  pred(isnan(pred)) = inf;

  % compute distance to ground truth joints
  dist = getDistPCKh(pred,gt,headSize);

  % compute PCKh
  pck = computePCK(dist,range);

  % plot results
  [row, header] = genTablePCK(pck(end,:),p.name);
  tableTex{1} = header;
  tableTex{i+1} = row;

  pckAll(:,:,i) = pck;

  auc = area_under_curve(scale01(range),pck(:,end));
  fprintf('%s, AUC: %1.1f\n',p.name,auc);
end

% Save results
fid = fopen([tableDir '/pckh.tex'],'wt');assert(fid ~= -1);
for i=1:length(tableTex),fprintf(fid,'%s\n',tableTex{i}); end; fclose(fid);

% plot curves
bSave = true;
if (plotcurve)
    plotCurveNew(squeeze(pckAll(:,end,:)),range,PRED_IDS,'PCKh total, MPII',[plotsDir '/pckh-total-mpii'],bSave,range(1:5:end));
    plotCurveNew(squeeze(mean(pckAll(:,[1 6],:),2)),range,PRED_IDS,'PCKh ankle, MPII',[plotsDir '/pckh-ankle-mpii'],bSave,range(1:5:end));
    plotCurveNew(squeeze(mean(pckAll(:,[2 5],:),2)),range,PRED_IDS,'PCKh knee, MPII',[plotsDir '/pckh-knee-mpii'],bSave,range(1:5:end));
    plotCurveNew(squeeze(mean(pckAll(:,[3 4],:),2)),range,PRED_IDS,'PCKh hip, MPII',[plotsDir '/pckh-hip-mpii'],bSave,range(1:5:end));
    plotCurveNew(squeeze(mean(pckAll(:,[7 12],:),2)),range,PRED_IDS,'PCKh wrist, MPII',[plotsDir '/pckh-wrist-mpii'],bSave,range(1:5:end));
    plotCurveNew(squeeze(mean(pckAll(:,[8 11],:),2)),range,PRED_IDS,'PCKh elbow, MPII',[plotsDir '/pckh-elbow-mpii'],bSave,range(1:5:end));
    plotCurveNew(squeeze(mean(pckAll(:,[9 10],:),2)),range,PRED_IDS,'PCKh shoulder, MPII',[plotsDir '/pckh-shoulder-mpii'],bSave,range(1:5:end));
    plotCurveNew(squeeze(mean(pckAll(:,[13 14],:),2)),range,PRED_IDS,'PCKh head, MPII',[plotsDir '/pckh-head-mpii'],bSave,range(1:5:end));
end

display('Done.')

Why is norm = np.ones((pred.shape[0], 2)) * np.array([h, w]) / 10 needed in accuracy?

Dear JunJie,

In

norm = np.ones((pred.shape[0], 2)) * np.array([h, w]) / 10

I don't understand why norm is used and why it is divided by 10. And in

def calc_dists(preds, target, normalize):

why do we need

normed_preds = preds[n, c, :] / normalize[n]
normed_targets = target[n, c, :] / normalize[n]

When I print them, preds[n, c, :] and target[n, c, :] are very different, such as [27, 30] and [4, 8].

Thank you
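For context, here is a minimal sketch of the HRNet-style calc_dists this question refers to (the repo's exact code may differ slightly). Dividing by np.array([h, w]) / 10 expresses each prediction-to-target distance as a fraction of one tenth of the heatmap size, so the accuracy function can compare it against a fixed threshold in those units independent of resolution.

import numpy as np

def calc_dists(preds, target, normalize):
    # preds, target: (batch, num_joints, 2) heatmap coordinates
    # normalize: (batch, 2) normalization factors, e.g. heatmap [h, w] / 10
    dists = np.zeros((preds.shape[1], preds.shape[0]))
    for n in range(preds.shape[0]):
        for c in range(preds.shape[1]):
            if target[n, c, 0] > 1 and target[n, c, 1] > 1:
                normed_preds = preds[n, c, :] / normalize[n]
                normed_targets = target[n, c, :] / normalize[n]
                dists[c, n] = np.linalg.norm(normed_preds - normed_targets)
            else:
                # Joints with no valid target location are skipped.
                dists[c, n] = -1
    return dists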

Can't make

At the first step (make), it raises: EnvironmentError: The nvcc binary could not be located in your $PATH. Either add it to your path...
Could you please tell me the solution?

Missing keypoint-aware occlusion augmentation

In v1 of the paper, a keypoint-aware occlusion augmentation (OA for short) is proposed, while in v2 it no longer appears. But when I compare Table 1 in v1 with Table 2 in v2, the results are consistent.
After reading the source code of AID in the mmpose pull request, I failed to find anything related to OA.
Why did you drop OA? Does that indicate that using Cutout and HaS gets the same result as OA does?

Question about the pre-process

In _xywh2cs(self, x, y, w, h), to get the scale, the bounding box is rescaled to match the aspect ratio of the model input.

From the source code, I think the actual workflow is to rescale the bounding box and then crop the image according to the rescaled bounding box, which is inconsistent with the workflow mentioned in the paper (which is to rescale the cropped image). I also found this behavior in mmpose, so it seems to be common practice.

Is there any special reason to do so? From my point of view, this behavior may include much more context than expected. Or is the input aspect ratio chosen for COCO, based on statistics of the aspect ratios of person bounding boxes?
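For reference, a minimal sketch of the usual _xywh2cs logic in HRNet-style codebases is given below, assuming the common pixel_std = 200 and 1.25 padding conventions; the repo's exact version may differ.

import numpy as np

def xywh2cs(x, y, w, h, aspect_ratio, pixel_std=200.0, padding=1.25):
    # Convert an (x, y, w, h) box to the center/scale pair used by top-down models.
    center = np.array([x + w * 0.5, y + h * 0.5], dtype=np.float32)
    # Pad the box to the model-input aspect ratio (w / h) before cropping,
    # which is the behavior the question is about.
    if w > aspect_ratio * h:
        h = w / aspect_ratio
    else:
        w = h * aspect_ratio
    scale = np.array([w / pixel_std, h / pixel_std], dtype=np.float32) * padding
    return center, scale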

Confusion about merging the original HRNet code into mmpose

Sorry to bother you. I use UDP for clothing landmark detection, and I implemented it in your original repo. But when I try to merge it into mmpose, the mAP is not normal compared with the original implementation. Do you have any advice? I'm new to mmpose. (I implemented my dataset by inheriting from 'Topdowndataset' instead of the original Jointdataset.) Two more questions:
1. I notice that the implementation of the hyperparameter KPD differs between the original repo and mmpose. In mmpose it must be calculated as valid_radius = factor * heatmap_size[1], so does that mean I must change the factor if I use a different heatmap_size? (See the quick arithmetic after this question.)
2. Do I need to change KPD for my own dataset, or is KPD = 3.5 only suitable for human pose?
Thanks a lot!!! :)
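A quick arithmetic check of the relation between the two parameterizations; the 0.0546875 factor is assumed here as the mmpose-style default, so treat it as illustrative.

heatmap_size = [48, 64]            # [w, h]
factor = 0.0546875                 # assumed radius factor
valid_radius = factor * heatmap_size[1]
print(valid_radius)                # 3.5, i.e. the original KPD value

So if the factor is kept fixed, the radius already rescales with the heatmap height; to keep the same absolute KPD with a different heatmap size, set factor = kpd / heatmap_size[1].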

Question about the code

Hello, in deep-high-resolution-net.pytorch/lib/utils/transforms.py, line 41, output_flipped = output_flipped.reshape(shape_ori[0],-1,3,shape_ori[2],shape_ori[3]) raises an error, because for the COCO dataset the shape should be [64, 17, 64, 48], and the 17 cannot be divided by 3 in the reshape. How can this be solved?
Thanks for your reply!
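A small numpy check of why the reshape fails for a 17-channel heatmap output but works for offset-style targets with num_joints * 3 = 51 channels (the shapes below simply mirror the ones quoted in the question):

import numpy as np

x = np.zeros((64, 17, 64, 48))
# x.reshape(64, -1, 3, 64, 48) would raise ValueError: 17 channels cannot be split into groups of 3.

y = np.zeros((64, 51, 64, 48))              # offset targets: 17 joints * 3 channels
print(y.reshape(64, -1, 3, 64, 48).shape)   # (64, 17, 3, 64, 48)

So this reshape presumably applies only to the offset target type; with plain Gaussian heatmaps the flip-back path that does not group channels by 3 would be needed.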

ERROR: Could not find a version that satisfies the requirement opencv-python==3.4.1.15

Following your Quick Start:

pip install -r requirements.txt
Collecting EasyDict==1.7
Using cached easydict-1.7.tar.gz (6.2 kB)
ERROR: Could not find a version that satisfies the requirement opencv-python==3.4.1.15 (from -r requirements.txt (line 2)) (from versions: 3.4.2.17, 3.4.3.18, 3.4.4.19, 3.4.5.20, 3.4.6.27, 3.4.7.28, 3.4.8.29, 3.4.9.31, 3.4.9.33, 3.4.10.35, 3.4.10.37, 3.4.11.39, 3.4.11.41, 3.4.11.43, 3.4.11.45, 4.0.0.21, 4.0.1.23, 4.0.1.24, 4.1.0.25, 4.1.1.26, 4.1.2.30, 4.2.0.32, 4.2.0.34, 4.3.0.36, 4.3.0.38, 4.4.0.40, 4.4.0.42, 4.4.0.44, 4.4.0.46)
ERROR: No matching distribution found for opencv-python==3.4.1.15 (from -r requirements.txt (line 2))

Excuse me, which Python version and Ubuntu version do you use? I am using Ubuntu 18.04 and tried Python 3.7 and 3.8 without success.

The result of person detector

Hi, Huang, can you offer the JSON files of the person detection results on the val and test datasets?
Where can I download them? Thank you.

I need your help

Hello Junjie Huang, I came across you in someone else's project. I have a question about integral regression that I have not been able to resolve, and I think you could help me. It probably won't take much of your time, but it matters a lot to me. Could you please help me? Thank you!

There is no w32_256x256_adam_lr1e-3.yaml

I run

python tools/test.py \
    --cfg experiments/coco/hrnet/w32_256x192_adam_lr1e-3.yaml \
    TEST.MODEL_FILE models/pytorch/pose_coco/pose_hrnet_w32_256x192.pth \
    TEST.USE_GT_BBOX False

from https://github.com/HuangJunJie2017/UDP-Pose/tree/master/deep-high-resolution-net.pytorch

but experiments/coco/hrnet/ only contains w32_256x192_adam_lr1e-3_offset_ofm.yaml, which does not match pose_hrnet_w32_256x192.pth:
https://github.com/HuangJunJie2017/UDP-Pose/tree/master/deep-high-resolution-net.pytorch/experiments/coco/hrnet

Also, how do I run your UDP HRNet-W32 256x192 model?

Configure on 384x288

Hi Huang,
I'm very interested in your work. Could you send me the config file for the 384x288 input size?

Thank you very much

Download the pre-trained model

Thank you for the code. I have a question concerning the pre-trained model: is it possible to download it from a cloud storage service other than BaiduDisk (e.g. Google Drive or OneDrive)? I am not able to download the file without a Baidu account (nor by means of a download manager). Thanks in advance!

Keypoints shifted?

I have tested HRNet-W32 256x192 with and without UDP on the COCO val set. Your implementation achieves a better score, 78% AP, compared to 76.5% AP for the original. However, when I check the visualization of the output, I find that the keypoints are not really accurate in position. As shown in the screenshot, the left image is the output of your implementation and the right one is from the original HRNet. Do you know the reason?

[Screenshot from 2020-09-25 11-58-38]

cvpr2020

Bro, would you mind sharing your initial CVPR review scores? I am studying your paper.

Reproduce

Hi @HuangJunJie2017,

thanks for releasing the code. I'm trying to use it to reproduce the results (UDP-HRNet-W32) reported in the paper. I used the trained model stored on BaiduDisk and tested it on the COCO val2017 set. Here are the results I obtained:

offset | 256x192 | w32 | gt bbox | 74.6
offset | 256x192 | w32 | det bbox | 73.3

I wonder whether there is a bug somewhere?

Good job, but a little flaw?

In "Results on COCO val2017 with detector having human AP of 65.1 on COCO val2017 dataset", you give the HRNet results from the original paper rather than results tested with the same detector as UDP (AP 65.1). Hmm... I would be happy to know whether HRNet would then be better than UDP. Or, what causes the unsatisfactory result of UDP when tested on COCO test-dev?
Your results:
[image]

HRNet results with the AP 60.9 person detector:
[image]

Can you give some explanation of the affine transform, i.e. the use of get_warpmatrix and warpAffine?

For these two lines:

trans = get_warpmatrix(r, c*2.0, self.image_size-1.0, s)
input = cv2.warpAffine(data_numpy, trans, (int(self.image_size[0]), int(self.image_size[1])), flags=cv2.WARP_INVERSE_MAP | cv2.INTER_LINEAR)

Can you give some explanation of them?

  • Why pass c*2 (why multiply by 2)?
  • How does it work with the given parameters?

Thank you very much.

Results on COCO val2017

Hello, thanks for your awesome code. I see that pose estimation uses boxes from "Results on COCO val2017 with detector having human AP of 56.4 on COCO val2017 dataset", but why don't we use person_keypoints_val2017.json directly?

Pay Attention to the Constraint Cue

Hello,

Where can I find the implementation code for the paper "How to Train Your Robust Human Pose Estimator: Pay Attention to the Constraint Cue"?

In the paper, how is the offset δ of the occlusion center randomized? Is there a range limit?
Is one keypoint chosen for augmentation every time, or is there a random probability?

Test accuracy too low when using offset

Thanks for your great work. I have a question: AP and AR are higher than without UDP, but the average accuracy is too low. With UDP the test accuracy is around 0.3, but without UDP it is 0.9 ~ 1.0. Can you help me? Thanks.

udp code!

So for HRNet with UDP, did you only modify some details in the dataset, e.g. setting feature_stride = (image_size-1)/(heatmap_size-1)? Where did you modify the encode and decode process?
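A small numeric illustration of the difference between the two stride conventions mentioned above (the sizes are the usual 256x192 input with a 64x48 heatmap; this is just arithmetic, not code from the repo):

import numpy as np

image_size = np.array([192.0, 256.0])      # [w, h] network input
heatmap_size = np.array([48.0, 64.0])

stride_biased = image_size / heatmap_size                    # classic: [4.0, 4.0]
stride_unbiased = (image_size - 1.0) / (heatmap_size - 1.0)  # UDP-style: ~[4.064, 4.048]

hm_corner = np.array([47.0, 63.0])         # last pixel center of the heatmap
print(hm_corner * stride_biased)           # [188. 252.] -> misses the input's last pixel
print(hm_corner * stride_unbiased)         # [191. 255.] -> lands exactly on it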

About evaluation index

In evaluate.py, the evaluation index looks like PCK rather than OKS to me. Am I wrong?

May I ask questions related to the paper?

In Section 3.1.1 of the paper, the way to transform the output back to the source is described as:
[image of the equation]
I have difficulty understanding this explanation. I don't see a padding-by-one operation to recover source coordinates in other works (which are supposed to be biased).

Section 3.1.2: "Specifically, we adopt unit length as the image size measurement criterion, ..."
The unit length is "the distance between two adjacent pixels". Pixels are supposed to be square, and the distance between pixels is supposed to equal the pixel size. What is the difference between measuring by "distance" and measuring by "pixel"? I have difficulty understanding this concept.

If paper-related questions are not appropriate for GitHub issues here, may I send you a separate email about them?
Thanks.
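One way to read this, consistent with the feat_stride = (image_size - 1) / (heatmap_size - 1) used in the code quoted in the "About KPD" issue above (a sketch of the interpretation, not a quotation of the paper's equation):

\[
\hat{k}_{\mathrm{src}} \;=\; \hat{k}_{\mathrm{hm}} \cdot \frac{w_{\mathrm{src}} - 1}{w_{\mathrm{hm}} - 1}
\]

Measuring an image of $w$ pixels as $w - 1$ unit lengths (pixel-center-to-pixel-center distances) makes the first and last pixel centers of the heatmap map exactly onto those of the source image, rather than onto slightly shifted positions.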

Why do preds need to be multiplied by scale?

def transform_preds(coords, center, scale, output_size):
    # scale is stored in units of 200 pixels (the pixel_std convention),
    # so multiply back to get the cropped-box size in pixels.
    scale = scale * 200.0
    # Map heatmap coordinates (0 .. output_size - 1) onto the box size, then
    # shift from the box-centered frame into original-image coordinates.
    scale_x = scale[0] / (output_size[0] - 1.0)
    scale_y = scale[1] / (output_size[1] - 1.0)
    target_coords = np.zeros(coords.shape)
    target_coords[:, 0] = coords[:, 0] * scale_x + center[0] - scale[0] * 0.5
    target_coords[:, 1] = coords[:, 1] * scale_y + center[1] - scale[1] * 0.5
    return target_coords
