Comments (6)
I have the same issue. The reproduced performance on the test set is very poor.
[Epoch] 200
[Loss]
> loss_label 0.0943
> class_error 0.0000
> loss_span 0.0331
> loss_giou 0.5693
> loss_label_0 0.0979
> class_error_0 0.0000
> loss_span_0 0.0354
> loss_giou_0 0.5991
> loss_label_1 0.0954
> class_error_1 0.0000
> loss_span_1 0.0334
> loss_giou_1 0.5739
> loss_label_2 0.0950
> class_error_2 0.0000
> loss_span_2 0.0335
> loss_giou_2 0.5748
> loss_overall 2.8351
[Metrics_No_NMS]
OrderedDict([ ('[email protected]', 50.64),
('[email protected]', 35.1),
('[email protected]', 19.71),
('[email protected]', 6.27),
('[email protected]', 90.42),
('[email protected]', 81.11),
('[email protected]', 66.65),
('[email protected]', 31.87),
('VG-full-mAP', 32.6),
('VG-full-mIoU@R1', 0.2262),
('VG-full-mIoU@R5', 0.5488)])
from explore-and-match.
Please refer to the following configurations.
aux_loss=True
backbone=clip
bs=16
data_type=features
dec_layers=4
dim_feedforward=1024
dropout=0.1
early_stop_patience=10
enc_layers=4
eos_coef=0.1
eval_bs=16
eval_untrained=False
hidden_dim=256
input_dropout=0.5
lr=0.0001
lr_drop_step=20
method=joint
n_input_proj=2
nheads=8
norm_tfeat=True
norm_vfeat=True
num_input_frames=64
num_input_sentences=4
num_queries=40
optimizer=adamw
pre_norm=False
pred_label=cos
scheduler=steplr
seed=1
set_cost_class=1
set_cost_giou=2
set_cost_span=1
span_type=cw
txt_drop_ratio=0
txt_feat_dim=512
txt_position_embedding=sine
use_txt_pos=True
vid_feat_dim=512
vid_position_embedding=sine
wd=0.0001
@sangminwoo Thanks for your help! We tried your new configurations but still cannot reproduce the performance shown in the paper.
[Epoch] 200
[Loss]
> loss_label 0.0842
> class_error 0.0000
> loss_span 0.0328
> loss_giou 0.3810
> loss_label_0 0.1045
> class_error_0 0.0250
> loss_span_0 0.0373
> loss_giou_0 0.4137
> loss_label_1 0.0890
> class_error_1 0.0000
> loss_span_1 0.0339
> loss_giou_1 0.3922
> loss_label_2 0.0835
> class_error_2 0.0000
> loss_span_2 0.0328
> loss_giou_2 0.3828
> loss_overall 2.0678
[Metrics_No_NMS]
OrderedDict([ ('[email protected]', 46.63),
('[email protected]', 31.36),
('[email protected]', 19.32),
('[email protected]', 6.19),
('[email protected]', 82.59),
('[email protected]', 74.99),
('[email protected]', 62.8),
('[email protected]', 36.0),
('VG-full-mAP', 33.36),
('VG-full-mIoU@R1', 0.2087),
('VG-full-mIoU@R5', 0.5248)])
We also noticed that your configuration uses a parameter named 'set_cost_class', which does not exist in the public code (it has a parameter named 'set_cost_query' instead).
Here are the configurations we used in training:
| results_dir | results |
| device | 0 |
| seed | 1 |
| log_interval | 1 |
| val_interval | 5 |
| save_interval | 50 |
| use_gpu | True |
| debug | False |
| eval_untrained | False |
| log_dir | logs |
| resume | |
| resume_all | False |
| att_visualize | False |
| corr_visualize | False |
| dist_visualize | False |
| start_epoch | |
| end_epoch | 200 |
| early_stop_patience | -1 |
| lr | 0.0001 |
| lr_drop_step | 20 |
| wd | 0.0001 |
| optimizer | adamw |
| scheduler | steplr |
| dataset | charades |
| data_type | features |
| num_input_frames | 64 |
| num_input_sentences | 4 |
| bs | 16 |
| eval_bs | 1 |
| num_workers | 16 |
| pin_memory | True |
| checkpoint | ./save |
| norm_vfeat | True |
| norm_tfeat | True |
| txt_drop_ratio | 0 |
| backbone | clip |
| method | joint |
| hidden_dim | 256 |
| nheads | 8 |
| enc_layers | 4 |
| dec_layers | 4 |
| vid_feat_dim | 512 |
| txt_feat_dim | 512 |
| num_proposals | 40 |
| input_dropout | 0.5 |
| use_vid_pos | True |
| use_txt_pos | True |
| n_input_proj | 2 |
| dropout | 0.1 |
| dim_feedforward | 1024 |
| pre_norm | False |
| vid_position_embedding | sine |
| txt_position_embedding | sine |
| set_cost_span | 1 |
| set_cost_giou | 2 |
| set_cost_query | 1 |
| aux_loss | True |
| eos_coef | 0.1 |
| pred_label | cos |
| span_type | cw |
| no_sort_results | False |
| max_before_nms | 10 |
| max_after_nms | 10 |
| conf_thd | 0.0 |
| nms_thd | -1 |
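To make the differences between the two configurations in this thread easier to spot, here is a minimal Python sketch that diffs a `key=value` listing against a pipe-delimited table. The helper names (`parse_kv`, `parse_table`, `diff_configs`) are hypothetical and not part of the explore-and-match code; the sample values are a short excerpt copied from the two configurations above, assuming `early_stop_patienc` in the first listing is a typo for `early_stop_patience`.

```python
def parse_kv(text):
    """Parse 'key=value' lines into a dict of strings."""
    cfg = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition("=")
        cfg[key.strip()] = value.strip()
    return cfg


def parse_table(text):
    """Parse '| key | value |' rows into a dict of strings."""
    cfg = {}
    for line in text.strip().splitlines():
        cells = [c.strip() for c in line.strip().strip("|").split("|")]
        if len(cells) == 2:
            cfg[cells[0]] = cells[1]
    return cfg


def diff_configs(a, b):
    """Return keys only in a, keys only in b, and shared keys whose values differ."""
    only_a = sorted(set(a) - set(b))
    only_b = sorted(set(b) - set(a))
    changed = {k: (a[k], b[k]) for k in sorted(set(a) & set(b)) if a[k] != b[k]}
    return only_a, only_b, changed


# Excerpt of the configuration posted by the maintainer (key=value form).
maintainer = parse_kv("""
early_stop_patience=10
eval_bs=16
num_queries=40
set_cost_class=1
set_cost_giou=2
lr=0.0001
""")

# Excerpt of the configuration used for the reproduction (table form).
ours = parse_table("""
| early_stop_patience | -1 |
| eval_bs | 1 |
| num_proposals | 40 |
| set_cost_query | 1 |
| set_cost_giou | 2 |
| lr | 0.0001 |
""")

only_maintainer, only_ours, changed = diff_configs(maintainer, ours)
print("only in maintainer config:", only_maintainer)
print("only in our config:", only_ours)
print("different values:", changed)
```

On this excerpt the diff reports two renamed-looking pairs (`num_queries` / `num_proposals` and `set_cost_class` / `set_cost_query`) and two value differences (`early_stop_patience` and `eval_bs`). Whether any of these explains the gap to the paper's numbers is not established in this thread.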
Hello, how long did your training take? How many GPUs did you use?
@sangminwoo I have the same issue. Could you give me some advice on how to solve this problem? Thank you very much.
[Epoch] 200
[Loss]
> loss_label 0.6117
> class_error 6.6606
> loss_span 0.0517
> loss_giou 0.7141
> loss_label_0 0.6115
> class_error_0 4.8437
> loss_span_0 0.0525
> loss_giou_0 0.7238
> loss_label_1 0.6097
> class_error_1 5.1319
> loss_span_1 0.0523
> loss_giou_1 0.7173
> loss_label_2 0.6095
> class_error_2 7.1385
> loss_span_2 0.0521
> loss_giou_2 0.7161
> loss_overall 5.5224
[Metrics_No_NMS]
OrderedDict([ ('[email protected]', 72.64),
('[email protected]', 39.87),
('[email protected]', 15.49),
('[email protected]', 6.77),
('[email protected]', 89.32),
('[email protected]', 80.04),
('[email protected]', 59.88),
('[email protected]', 34.18),
('VG-full-mIoU@R1', 0.2702),
('VG-full-mIoU@R5', 0.5454),
('[email protected]', 0.16),
('[email protected]', 31.79),
('[email protected]', 11.88),
('[email protected]', 73.29),
('[email protected]', 21.51),
('[email protected]', 57.77)])
Hi! I am also facing the same issue. Can anyone share their hardware specs?
I trained the model on one V100 GPU (about 1 hour for all 200 epochs) and got the following best performance (at epoch 65):
"[email protected]": 49.59,
"[email protected]": 32.92,
"[email protected]": 18.63,
"[email protected]": 6.91,
"[email protected]": 88.91,
"[email protected]": 79.27,
"[email protected]": 62.75,
"[email protected]": 33.54,
"VG-full-mAP": 32.82,
"VG-full-mIoU@R1": 0.2193,
"VG-full-mIoU@R5": 0.54