Comments (5)

swathikirans commented on August 23, 2024

Hi,
We use slow updating of parameters (gradient accumulation): the parameters are updated only once every 'iter_size' iterations. We set the batch size to 16 and iter_size to 2, so the effective batch size is 32. You can set batch_size to 32 and iter_size to 1 to get the same effect.
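The slow updating described above can be illustrated with a minimal sketch. It uses a toy one-parameter least-squares model in plain Python rather than the GSM training code, and it assumes that each micro-batch loss is scaled by 1/iter_size before accumulation; the thread does not state how the repository scales the loss. The helper names `grad_mse` and `accumulated_grad` are hypothetical.

```python
import random


def grad_mse(w, xs, ys):
    """Gradient of 0.5 * mean((w*x - y)^2) with respect to the scalar weight w."""
    n = len(xs)
    return sum((w * x - y) * x for x, y in zip(xs, ys)) / n


def accumulated_grad(w, xs, ys, batch_size, iter_size):
    """Sum gradients over iter_size micro-batches before a single update.

    Assumption: each micro-batch gradient is scaled by 1/iter_size, so the
    accumulated value matches the gradient of one large batch.
    """
    total = 0.0
    for i in range(iter_size):
        lo, hi = i * batch_size, (i + 1) * batch_size
        total += grad_mse(w, xs[lo:hi], ys[lo:hi]) / iter_size
    return total


random.seed(0)
xs = [random.gauss(0, 1) for _ in range(32)]
ys = [3.0 * x + random.gauss(0, 0.1) for x in xs]
w = 0.5

# One batch of 32 versus two accumulated micro-batches of 16.
g_full = accumulated_grad(w, xs, ys, batch_size=32, iter_size=1)
g_accum = accumulated_grad(w, xs, ys, batch_size=16, iter_size=2)
print(abs(g_full - g_accum))
```

For a model without batch-dependent layers, the two gradients agree up to floating-point rounding, which is why the two settings give the same effective batch size of 32.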


FloydEdwin commented on August 23, 2024

Hello! Thanks for your reply.
I tested your something-v1_RGB_InceptionV3_avg_segment8_checkpoint.pth.tar and got the same result as in your paper (49.01% top-1). However, when I use the following shell script to train the model myself:

#!/usr/bin/env bash
python main.py something-v1 RGB --arch InceptionV3 \
               --num_segments 8 --consensus_type avg \
               --batch-size 32 --iter_size 1 --dropout 0.5 \
               --lr 0.01 --warmup 10 --epochs 60 --eval-freq 5 \
               --gd 20 --run_iter 1 -j 16 --npb --gsm

I got 48% accuracy. With --batch_size 16 --iter_size=2 I also get about 48%, not the 49% reported in your paper. For BNInception I got 46% (it is 47% in the paper). Maybe there are some details I have overlooked, such as BN? I am using 2 2080Ti GPUs.
1. Is there anything I should be aware of when using your GSM code?
2. How many GPUs did you use during training?
3. Is the 49.01% accuracy a typical value or the maximum over many experiments?
Thanks! Looking forward to your reply!


swathikirans commented on August 23, 2024

Due to the stochastic nature of training, there will be some minor differences in the results. All the models were trained using 2 1080 Ti GPUs. We report the accuracy obtained from a single run.


FloydEdwin commented on August 23, 2024

Hello! Thanks for your help! I have obtained the 49% result on sthv1 after many experiments. I think 49% is not easy to reach. Anyway, it is good work. Thanks for sharing!
Have you ever trained GSM on sthv2? Why not include it in the paper?


swathikirans commented on August 23, 2024

Thank you for the nice words.

I recently noticed that the BN statistics are updated in a different way when slow updating of parameters is used (batch_size=16, iter_size=2). However, I am not sure if this has a significant impact on the final result.
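A rough sketch of why the BN statistics can differ between the two settings is given below. It is plain Python, not the GSM code: `bn_normalize` and `bn_running_mean` are hypothetical helpers that mimic the usual batch-norm behaviour of normalizing with per-forward-pass statistics and keeping an exponential moving average of the mean. The momentum of 0.1 and the toy activations are assumptions.

```python
import statistics


def bn_normalize(batch, eps=1e-5):
    """Normalize activations with the mean and (biased) variance of this batch only."""
    mean = statistics.fmean(batch)
    var = statistics.pvariance(batch, mu=mean)
    return [(x - mean) / (var + eps) ** 0.5 for x in batch]


def bn_running_mean(batches, momentum=0.1, running=0.0):
    """Update the running mean once per forward pass (assumed momentum of 0.1)."""
    for batch in batches:
        running = (1 - momentum) * running + momentum * statistics.fmean(batch)
    return running


# Toy activations for 32 samples (illustrative values, not real features).
acts = [float(i) for i in range(32)]

# batch_size=32, iter_size=1: one forward pass over all 32 samples.
norm_full = bn_normalize(acts)
run_full = bn_running_mean([acts])

# batch_size=16, iter_size=2: two forward passes over 16 samples each.
norm_half = bn_normalize(acts[:16])
run_accum = bn_running_mean([acts[:16], acts[16:]])

print(norm_full[0], norm_half[0])
print(run_full, run_accum)
```

In this sketch the same sample is normalized differently in the two settings, and the running mean receives two updates instead of one. Accumulating gradients therefore does not reproduce the BN behaviour of a true batch of 32, although the sketch says nothing about how much this matters for the final accuracy.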

Regarding sthv2, we did not train the model on this dataset since it is a superset of sthv1 with less label noise and more samples. We will do an evaluation on sthv2.
