Comments (6)

haojc avatar haojc commented on June 12, 2024

I have the same issue. The reproduced performance on the test set is very poor.

[Epoch] 200
[Loss]
> loss_label 0.0943
> class_error 0.0000
> loss_span 0.0331
> loss_giou 0.5693
> loss_label_0 0.0979
> class_error_0 0.0000
> loss_span_0 0.0354
> loss_giou_0 0.5991
> loss_label_1 0.0954
> class_error_1 0.0000
> loss_span_1 0.0334
> loss_giou_1 0.5739
> loss_label_2 0.0950
> class_error_2 0.0000
> loss_span_2 0.0335
> loss_giou_2 0.5748
> loss_overall 2.8351
[Metrics_No_NMS]
OrderedDict([ ('[email protected]', 50.64),
('[email protected]', 35.1),
('[email protected]', 19.71),
('[email protected]', 6.27),
('[email protected]', 90.42),
('[email protected]', 81.11),
('[email protected]', 66.65),
('[email protected]', 31.87),
('VG-full-mAP', 32.6),
('VG-full-mIoU@R1', 0.2262),
('VG-full-mIoU@R5', 0.5488)])
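
For anyone comparing logs across runs, one quick sanity check is whether `loss_overall` is just the sum of the logged components. The sketch below parses the loss lines from the log above and adds up every `loss_*` term except `loss_overall` (the `class_error*` entries are left out). The `parse_losses` helper is hypothetical and not part of the repository, and the thread does not say whether the logged terms already include any weighting.

```python
import re

# Loss lines copied from the epoch-200 log above.
LOG = """\
> loss_label 0.0943
> class_error 0.0000
> loss_span 0.0331
> loss_giou 0.5693
> loss_label_0 0.0979
> class_error_0 0.0000
> loss_span_0 0.0354
> loss_giou_0 0.5991
> loss_label_1 0.0954
> class_error_1 0.0000
> loss_span_1 0.0334
> loss_giou_1 0.5739
> loss_label_2 0.0950
> class_error_2 0.0000
> loss_span_2 0.0335
> loss_giou_2 0.5748
> loss_overall 2.8351
"""


def parse_losses(log):
    """Hypothetical helper: map each '> name value' line to a float."""
    losses = {}
    for line in log.splitlines():
        match = re.match(r"\s*>\s*(\S+)\s+([-+]?\d*\.?\d+)", line)
        if match:
            losses[match.group(1)] = float(match.group(2))
    return losses


losses = parse_losses(LOG)

# Sum the main and auxiliary loss terms, skipping the total itself.
component_sum = sum(
    value
    for name, value in losses.items()
    if name.startswith("loss_") and name != "loss_overall"
)

print(f"sum of components: {component_sum:.4f}")
print(f"logged loss_overall: {losses['loss_overall']:.4f}")
```

For this log the two numbers agree to the printed precision, so the total here is dominated by the four `loss_giou*` terms.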

from explore-and-match.

sangminwoo avatar sangminwoo commented on June 12, 2024

Please refer to the following configurations.

aux_loss=True
backbone=clip
bs=16
data_type=features
dec_layers=4
dim_feedforward=1024
dropout=0.1
early_stop_patience=10
enc_layers=4
eos_coef=0.1
eval_bs=16
eval_untrained=False
hidden_dim=256
input_dropout=0.5
lr=0.0001
lr_drop_step=20
method=joint
n_input_proj=2
nheads=8
norm_tfeat=True
norm_vfeat=True
num_input_frames=64
num_input_sentences=4
num_queries=40
optimizer=adamw
pre_norm=False
pred_label=cos
scheduler=steplr
seed=1
set_cost_class=1
set_cost_giou=2
set_cost_span=1
span_type=cw
txt_drop_ratio=0
txt_feat_dim=512
txt_position_embedding=sine
use_txt_pos=True
vid_feat_dim=512
vid_position_embedding=sine
wd=0.0001

haojc avatar haojc commented on June 12, 2024

@sangminwoo Thanks for your help! We tried your new configurations but still cannot reproduce the performance shown in the paper.

[Epoch] 200
[Loss]
	> loss_label 0.0842
	> class_error 0.0000
	> loss_span 0.0328
	> loss_giou 0.3810
	> loss_label_0 0.1045
	> class_error_0 0.0250
	> loss_span_0 0.0373
	> loss_giou_0 0.4137
	> loss_label_1 0.0890
	> class_error_1 0.0000
	> loss_span_1 0.0339
	> loss_giou_1 0.3922
	> loss_label_2 0.0835
	> class_error_2 0.0000
	> loss_span_2 0.0328
	> loss_giou_2 0.3828
	> loss_overall 2.0678
[Metrics_No_NMS]
OrderedDict([   ('[email protected]', 46.63),
                ('[email protected]', 31.36),
                ('[email protected]', 19.32),
                ('[email protected]', 6.19),
                ('[email protected]', 82.59),
                ('[email protected]', 74.99),
                ('[email protected]', 62.8),
                ('[email protected]', 36.0),
                ('VG-full-mAP', 33.36),
                ('VG-full-mIoU@R1', 0.2087),
                ('VG-full-mIoU@R5', 0.5248)])

We also noticed that your configuration uses a parameter named 'set_cost_class', which does not exist in the public code (there is a parameter named 'set_cost_query' instead).
Here are the configurations we used in training:

| results_dir            | results     |
| device                 | 0           |
| seed                   | 1           |
| log_interval           | 1           |
| val_interval           | 5           |
| save_interval          | 50          |
| use_gpu                | True        |
| debug                  | False       |
| eval_untrained         | False       |
| log_dir                | logs        |
| resume                 |             |
| resume_all             | False       |
| att_visualize          | False       |
| corr_visualize         | False       |
| dist_visualize         | False       |
| start_epoch            |             |
| end_epoch              | 200         |
| early_stop_patience    | -1          |
| lr                     | 0.0001      |
| lr_drop_step           | 20          |
| wd                     | 0.0001      |
| optimizer              | adamw       |
| scheduler              | steplr      |
| dataset                | charades    |
| data_type              | features    |
| num_input_frames       | 64          |
| num_input_sentences    | 4           |
| bs                     | 16          |
| eval_bs                | 1           |
| num_workers            | 16          |
| pin_memory             | True        |
| checkpoint             | ./save      |
| norm_vfeat             | True        |
| norm_tfeat             | True        |
| txt_drop_ratio         | 0           |
| backbone               | clip        |
| method                 | joint       |
| hidden_dim             | 256         |
| nheads                 | 8           |
| enc_layers             | 4           |
| dec_layers             | 4           |
| vid_feat_dim           | 512         |
| txt_feat_dim           | 512         |
| num_proposals          | 40          |
| input_dropout          | 0.5         |
| use_vid_pos            | True        |
| use_txt_pos            | True        |
| n_input_proj           | 2           |
| dropout                | 0.1         |
| dim_feedforward        | 1024        |
| pre_norm               | False       |
| vid_position_embedding | sine        |
| txt_position_embedding | sine        |
| set_cost_span          | 1           |
| set_cost_giou          | 2           |
| set_cost_query         | 1           |
| aux_loss               | True        |
| eos_coef               | 0.1         |
| pred_label             | cos         |
| span_type              | cw          |
| no_sort_results        | False       |
| max_before_nms         | 10          |
| max_after_nms          | 10          |
| conf_thd               | 0.0         |
| nms_thd                | -1          |
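
To make it easier to spot where the two configurations in this thread differ, here is a minimal sketch that parses a `key=value` listing and a `| key | value |` table and reports the differences. The helper names (`parse_kv`, `parse_table`, `diff_configs`) are hypothetical and not part of the repository, and the sample strings contain only a few entries copied from the two listings. It shows which settings differ; it does not show that any of them explains the gap in results.

```python
def parse_kv(text):
    """Parse 'key=value' lines into a dict of strings."""
    cfg = {}
    for line in text.strip().splitlines():
        key, sep, value = line.strip().partition("=")
        if sep:
            cfg[key.strip()] = value.strip()
    return cfg


def parse_table(text):
    """Parse '| key | value |' rows into a dict of strings."""
    cfg = {}
    for line in text.strip().splitlines():
        cells = [c.strip() for c in line.strip().strip("|").split("|")]
        if len(cells) == 2 and cells[0]:
            cfg[cells[0]] = cells[1]
    return cfg


def diff_configs(a, b):
    """Return keys only in a, keys only in b, and keys whose values differ."""
    only_a = sorted(set(a) - set(b))
    only_b = sorted(set(b) - set(a))
    changed = {k: (a[k], b[k]) for k in sorted(set(a) & set(b)) if a[k] != b[k]}
    return only_a, only_b, changed


# A few entries from the suggested configuration (key=value form).
suggested = parse_kv("""
bs=16
eval_bs=16
early_stop_patience=10
num_queries=40
set_cost_class=1
set_cost_giou=2
""")

# The matching entries from the configuration used in training (table form).
used = parse_table("""
| bs                     | 16          |
| eval_bs                | 1           |
| early_stop_patience    | -1          |
| num_proposals          | 40          |
| set_cost_query         | 1           |
| set_cost_giou          | 2           |
""")

only_suggested, only_used, changed = diff_configs(suggested, used)
print("only in suggested:", only_suggested)
print("only in used:", only_used)
print("different values:", changed)
```

On these entries the sketch reports `num_queries` and `set_cost_class` only in the suggested listing, `num_proposals` and `set_cost_query` only in the training table, and different values for `eval_bs` and `early_stop_patience`.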

Xiyu-AI avatar Xiyu-AI commented on June 12, 2024

Hello, how long did your training take, and how many GPUs did you use?

TensorsSun avatar TensorsSun commented on June 12, 2024

@sangminwoo I have the same issue. Can you give me some advice to solve this problem? Thank you very much.

[Epoch] 200

[Loss]

  > loss_label 0.6117
  > class_error 6.6606
  > loss_span 0.0517
  > loss_giou 0.7141
  > loss_label_0 0.6115
  > class_error_0 4.8437
  > loss_span_0 0.0525
  > loss_giou_0 0.7238
  > loss_label_1 0.6097
  > class_error_1 5.1319
  > loss_span_1 0.0523
  > loss_giou_1 0.7173
  > loss_label_2 0.6095
  > class_error_2 7.1385
  > loss_span_2 0.0521
  > loss_giou_2 0.7161
  > loss_overall 5.5224

[Metrics_No_NMS]

OrderedDict([ ('[email protected]', 72.64),
('[email protected]', 39.87),
('[email protected]', 15.49),
('[email protected]', 6.77),
('[email protected]', 89.32),
('[email protected]', 80.04),
('[email protected]', 59.88),
('[email protected]', 34.18),
('VG-full-mIoU@R1', 0.2702),
('VG-full-mIoU@R5', 0.5454),
('[email protected]', 0.16),
('[email protected]', 31.79),
('[email protected]', 11.88),
('[email protected]', 73.29),
('[email protected]', 21.51),
('[email protected]', 57.77)])

ericashimomoto avatar ericashimomoto commented on June 12, 2024

Hi! I am also facing the same issue. Can anyone tell me what your hardware specs are?

I trained the model on one V100 GPU (took me about 1h to train for the 200 epochs) and got the following best performance (epoch 65):
"[email protected]": 49.59,
"[email protected]": 32.92,
"[email protected]": 18.63,
"[email protected]": 6.91,
"[email protected]": 88.91,
"[email protected]": 79.27,
"[email protected]": 62.75,
"[email protected]": 33.54,
"VG-full-mAP": 32.82,
"VG-full-mIoU@R1": 0.2193,
"VG-full-mIoU@R5": 0.54
