
gsm's People

Contributors

swathikirans


gsm's Issues

about the gate shift module

Hi!
How does GSM selectively integrate spatial and temporal information through gating?
When the gating values differ, in which .py file of the project can I find the differencing and averaging operations on the temporal features?
Hoping for your reply, thanks!
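A minimal sketch of the gate-shift idea is given below; it is an illustration under assumptions, not the repository's exact gsm.py. The class name, the depthwise 3×3×3 gating convolution with tanh, and the half-forward/half-backward temporal shift are all simplifications: a learned gate decides how much of each feature is routed through the temporal shift, while the ungated residual keeps the spatial information in place.

```python
# Minimal sketch of a gate-shift block (illustrative only, NOT the repository's gsm.py).
# Assumptions: depthwise 3x3x3 gating conv squashed with tanh; half of the gated
# channels shifted forward in time, half backward; ungated residual kept in place.
import torch
import torch.nn as nn

class GateShiftSketch(nn.Module):
    def __init__(self, channels, num_segments):
        super().__init__()
        self.num_segments = num_segments
        self.gate = nn.Conv3d(channels, channels, kernel_size=(3, 3, 3),
                              padding=(1, 1, 1), groups=channels)

    def forward(self, x):
        # x: (batch * num_segments, C, H, W), as produced by a 2D backbone
        nt, c, h, w = x.shape
        n = nt // self.num_segments
        x = x.view(n, self.num_segments, c, h, w).permute(0, 2, 1, 3, 4)  # (N, C, T, H, W)
        g = torch.tanh(self.gate(x))    # gating values in [-1, 1]
        y = g * x                       # gated part, routed through the temporal shift
        r = x - y                       # residual part, kept in place (spatial)
        fwd = torch.zeros_like(y)
        bwd = torch.zeros_like(y)
        fwd[:, :c // 2, 1:] = y[:, :c // 2, :-1]   # first half shifted forward in time
        bwd[:, c // 2:, :-1] = y[:, c // 2:, 1:]   # second half shifted backward in time
        out = fwd + bwd + r
        return out.permute(0, 2, 1, 3, 4).reshape(nt, c, h, w)

# Example: m = GateShiftSketch(64, 8); m(torch.randn(16, 64, 28, 28)).shape == (16, 64, 28, 28)
```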

Different dimensions in Figure 2 and Figure 3 of your paper

In your paper, you show your GSM design in Figure 2 with a "1×3×3 convolutions" component. However, in Figure 3, you show your GSM implementation with a "3×3×3 convolutions" component. Why is that? Does it mean I need to use 3D convolutions in GSM?
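Assuming the two labels refer to 1×3×3 (spatial-only) and 3×3×3 (spatio-temporal) kernels, the snippet below contrasts the corresponding PyTorch layers and their parameter counts; the channel count of 64 is arbitrary.

```python
# Hypothetical comparison of the two kernel shapes discussed above (64 channels chosen arbitrarily).
import torch.nn as nn

spatial_only = nn.Conv3d(64, 64, kernel_size=(1, 3, 3), padding=(0, 1, 1))     # 1x3x3: no temporal extent
spatio_temporal = nn.Conv3d(64, 64, kernel_size=(3, 3, 3), padding=(1, 1, 1))  # 3x3x3: spans 3 frames

print(sum(p.numel() for p in spatial_only.parameters()))      # 64*64*1*3*3 + 64 = 36928
print(sum(p.numel() for p in spatio_temporal.parameters()))   # 64*64*3*3*3 + 64 = 110656
```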

Nan or Inf found in input tensor.

Hello, I just want to train your method on my machine with the Something-V1 dataset.
My environment:
(1) PyTorch 1.2
(2) Python 3.7
(3) TensorboardX (a compatible version)
(4) 4 GPUs

However, when I run the training script, the loss is always 'nan' and it prints the warning:
Nan or Inf found in input tensor.
I have tried to handle this by:
(1) lowering the learning rate, but this did not work;
(2) checking 'loss.backward()': I print the loss before and after backpropagation and find that before backpropagation the loss is normal, while after this operation loss = nan.

What's more,
(3) I also checked 'datasets_video.py'. It appears that your function 'return_something()' does not need the file 'filename_categories.txt', unlike other methods; I am also confused about this.

Could you please shed some light on these two questions?
Thanks very much!
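A generic debugging sketch for NaN losses in a PyTorch training loop, offered as an assumption-laden illustration rather than the repository's main.py: it enables autograd anomaly detection, checks the loss before backpropagation, and clips gradient norms (presumably what the --gd 20 flag in the training commands quoted in other issues requests).

```python
# Generic NaN-debugging sketch (not the repository's training loop).
import torch

def train_step(model, criterion, optimizer, inputs, targets, max_norm=20.0):
    optimizer.zero_grad()
    with torch.autograd.detect_anomaly():            # pinpoints the op that produced NaN/Inf
        loss = criterion(model(inputs), targets)
        if torch.isnan(loss) or torch.isinf(loss):   # catch a bad loss before backprop
            raise RuntimeError('non-finite loss: %f' % loss.item())
        loss.backward()
    torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm)  # gradient clipping
    optimizer.step()
    return loss.item()
```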

RuntimeError: invalid argument 2 while testing

While testing the model after training, I'm getting the following error:

RuntimeError: invalid argument 2: size '[0 x 16 x 64 x 27 x 27]' is invalid for input with 186624 elements at /opt/conda/conda-bld/pytorch_1535491974311/work/aten/src/TH/THStorage.cpp:84

Do you have any idea how to fix this?

Training issue on num_segment=12

Hi, I successfully ran your network on Something-V1 with num_segments=8.

However, when I use num_segments=12, after the 1st epoch I receive the following error:
RuntimeError: shape '[-1, 8, 4, 27, 27]' is invalid for input of size 34992

Any ideas?
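A quick arithmetic check of the error above; the interpretation is a guess, not a confirmed diagnosis. The failing view assumes 8 segments, but the reported tensor size matches 12 segments exactly, which suggests the segment count is still set to 8 somewhere in the evaluation path.

```python
# Arithmetic check of the reported error; the interpretation is a guess.
size = 34992
per_sample_8 = 8 * 4 * 27 * 27     # elements the view '[-1, 8, 4, 27, 27]' needs per sample
per_sample_12 = 12 * 4 * 27 * 27   # elements per sample if the data holds 12 segments
print(size % per_sample_8 == 0)    # False: the view with 8 segments cannot work
print(size == per_sample_12)       # True: the tensor really holds 12 segments
```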

How many GPUs were used for training?

Hi, I would like to know how many GPUs you used when training models on the Something-Something datasets, and the batch size on each GPU.

How to get the results in Table 3

In your paper, you show the results on the Something-V1 dataset in Table 3. How do I reproduce them? Do I need to train 4 models with different segment parameters (8, 12, 16, 24)?

How does the batch_size parameter influence the accuracy?

Hi, I am trying to train the model on my computer on the Diving48 dataset. If I set batch_size to 16, the training program runs out of GPU memory, so I train with batch_size 14 and get 27.02 class accuracy, 34.49 Prec@1 and 62.69 Prec@5, which are lower than the results in your paper. Is there something wrong with my training settings, or is the batch_size parameter really that important?

Training setting:
python main.py diving48 RGB --arch InceptionV3 --num_segments 16 --consensus_type avg --batch-size 14 --iter_size 2 --dropout 0.5 --lr 0.01 --warmup 10 --epochs 60 --eval-freq 5 --gd 20 --run_iter 1 -j 16 --npb --gsm

Testing setting:
python test_models.py diving48 RGB model/diving48_InceptionV3_avg_segment16_batch14_epochs60_best.pth.tar --arch InceptionV3 --crop_fusion_type avg --test_segments 16  --test_crops 1 --num_clips 2 --gsm --save_scores

Too long training time on Something-V1

With my 2 GPUs, training on the Something-V1 dataset takes ~3.5 days for 60 epochs.

Do we need to train for 60 epochs to get the desired results, or can they be obtained with fewer epochs?

Could you please tell me whether we can reach your results with fewer epochs (e.g., 30-40)? Did you try that? The training time is very long.

Could you share your .log file, if possible?

P.S.: I am training the num_segments=8 case.

Can you share the Diving48 dataset?

I think you did great work, but something seems to be wrong with the Diving48 dataset's official website and I can't download the dataset from it. Would you mind sharing the dataset some other way?

Difference in Result

Thanks for the nice work. I have trained a model on my own dataset, which has three classes with 20 videos each. I formatted the dataset in the Something-Something-V1 format and started training. During training I got the following testing accuracy:

[screenshot: testing accuracy during training]

After completing the training, I tested the model on the same data, and it gives me the following:

[screenshot: test output]

Class Accuracy 36.67%
Overall Prec@1 36.67% Prec@5 100.00%

Can you explain the result? Is it OK or is something wrong?

Performance difference in Diving48 dataset

Hi, Swathikirans

Thank you for sharing your nice work.

I trained your algorithm on the Diving48 dataset; however, my result is lower than 40.27%.

Below is my configuration:
python3 main.py diving48 RGB --split 1 --arch InceptionV3 --num_segments 16 --consensus_type avg --batch-size 8 --iter_size 1 --dropout 0.7 --lr 0.01 --warmup 10 --epochs 20 --eval-freq 5 --gd 20 --run_iter 1 -j 16 --npb --gsm

I cannot understand why I am getting such poor performance.

Any ideas? Please help me if you can.

Thank you!

Exact Configuration for reproducing results on Diving48

Hi,

I would like to achieve the performance you mentioned in the paper (~40%). I am training the model with the following configuration, which after 15 epochs gave me 18.65% accuracy:

python3 main.py diving48 RGB --split 1 --arch InceptionV3 --num_segments 16 --consensus_type avg \
--batch-size 8 --iter_size 2 --dropout 0.5 --lr 0.01 --warmup 10 --epochs 20 \
--eval-freq 5 --gd 20 --run_iter 1 -j 16 --npb --gsm

Can you please provide the exact configuration used?

Run model on Input Video

Thanks for the nice work. Can you provide an inference script for running the model on an input video? Thanks.

test_rgb.sh does not work and gives an error: RuntimeError: shape '[0, 8, 64, 28, 28]' is invalid for input of size 50176

Thanks for your beautiful work.
I trained with the provided .sh file, which runs: python main.py something-v1 RGB --arch BNInception --num_segments 8 --consensus_type avg --batch-size 16 --iter_size 2 --dropout 0.5 --lr 0.01 --warmup 10 --epochs 60 --eval-freq 5 --gd 20 --run_iter 1 -j 16 --npb --gsm
This gave me the model "something-v1_BNInception_avg_segment8_checkpoint.pth.tar".
My test_rgb.sh runs: python test_models.py somethong-v1 RGB models/something-v1_BNInception_avg_segment8_checkpoint.pth.tar --arch BNInception --crop_fusion_type avg --test_segments 8 --test_crops 1 --num_clips 1 --gsm

When I run test_rgb.sh, the following error arises:

File "/data/users/xuyang/xuyang/Downloads/GSM-master/gsm.py", line 31, in forward
x = x.view(batchSize, self.num_segments, shape).permute(0, 2, 1, 3, 4).contiguous()
RuntimeError: shape '[0, 8, 64, 28, 28]' is invalid for input of size 50176

I have tried several things, but the error still arises.
Please guide me. Looking forward to your reply!
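A hedged illustration of how the leading dimension can collapse to zero, assuming that batchSize in gsm.py is derived by integer division of the incoming batch dimension by num_segments (the names and the exact computation are guesses, not taken from the repository):

```python
# Illustration only; how batchSize is computed in gsm.py is assumed, not confirmed.
import torch

num_segments = 8
x = torch.randn(1, 64, 28, 28)            # only one frame reaches the module: 1*64*28*28 = 50176 elements
batch_size = x.size(0) // num_segments    # 1 // 8 == 0
print(batch_size)                         # 0
# x.view(batch_size, num_segments, 64, 28, 28) would then raise
# "shape '[0, 8, 64, 28, 28]' is invalid for input of size 50176",
# so the tensor entering the module should hold batch * num_segments frames
# stacked along its first dimension.
```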

batch_size?

Hello! I find batch_size=32 in the paper while batch_size=16 on GitHub. Why are they not equal? Thanks!
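One possible reconciliation, offered as an assumption rather than a confirmed detail: the GitHub commands pass --batch-size 16 together with --iter_size 2, and if --iter_size accumulates gradients over two mini-batches before each optimizer step, the effective batch is 16 × 2 = 32. A generic sketch of that accumulation pattern:

```python
# Generic gradient-accumulation sketch; the meaning of --iter_size is assumed.
import torch

def accumulated_step(model, criterion, optimizer, batches, iter_size=2):
    optimizer.zero_grad()
    for inputs, targets in batches[:iter_size]:
        loss = criterion(model(inputs), targets) / iter_size  # average across accumulated mini-batches
        loss.backward()                                       # gradients add up between steps
    optimizer.step()                                          # one update for iter_size mini-batches
```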

Implementing GSM by using custom dataset

I want to use this GSM model on a custom dataset. Could you please let me know which files I have to change in order to adapt the model to a custom dataset? As far as I know, I have to modify "main.py", "datasets_video.py" and "opts.py". Do I need to change any other files? Could you please help me with this?
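A hypothetical sketch of what a new entry in datasets_video.py could look like, modeled loosely on the return_something() function mentioned in an earlier issue; every file name, field, and the return order here is an assumption and must be checked against the repository's existing return_* functions.

```python
# Hypothetical example only; field names, paths and return order are assumptions.
import os

ROOT_DATASET = '/path/to/datasets'   # placeholder root folder

def return_mydataset(modality):
    filename_categories = 'mydataset/category.txt'              # one class name per line
    filename_imglist_train = 'mydataset/train_videofolder.txt'
    filename_imglist_val = 'mydataset/val_videofolder.txt'
    root_data = os.path.join(ROOT_DATASET, 'mydataset/frames')
    prefix = '{:05d}.jpg'                                       # frame naming pattern
    return filename_categories, filename_imglist_train, filename_imglist_val, root_data, prefix
```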

About how to set the number of frames for training

As a novice, I saw that the author used different numbers of frames to train the model in the paper. Could you tell me how to set the parameter that changes the number of frames? Thank you very much for your help.

about the input

Thanks for your work. I would like to know whether there is a theoretical basis for how the double input improves the accuracy.

t-SNE

In the paper, which layer's features are used for the t-SNE visualization in Figure 5?
