
Comments (6)

woshidandan commented on July 19, 2024

Hi, thanks for following our work! When using NNI, we did not use any special search strategy; you can see our hyperparameter settings in "search_space.json". That said, everyone's training process may differ, so try more combinations of parameters. In our experience, a batch size in [24, 32, 40, 48, 56, 64, 72] (or other multiples of 8) and an init_lr in [0.0001, 0.00001, 0.000001] tend to work well, and we adjusted these ranges according to how training went. Whenever the model converges, you can save it, adjust the search space, load the saved weights, and continue training; there is no special time-saving way.
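For reference, a search space covering the ranges described above could be written in NNI's standard `_type`/`_value` format like this (a sketch of what a file such as search_space.json might contain, not necessarily the repo's exact contents):

```json
{
    "batch_size": {"_type": "choice", "_value": [24, 32, 40, 48, 56, 64, 72]},
    "init_lr": {"_type": "choice", "_value": [0.0001, 0.00001, 0.000001]}
}
```

Each trial then receives one sampled combination from this space.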

from tanet.

ZachL1 commented on July 19, 2024

Can I reproduce the results of the paper by using the default options?
https://github.com/woshidandan/TANet/blob/d911967eede3fc8a526d5a43bf5359aece187a1c/code/AVA/option.py#L3-L28

I apologize for not knowing much about NNI. My previous understanding of it was: "NNI finds a better set of parameters, such as batch size, lr, and number of epochs, across multiple experiments, and then others can train with that set to get the same good results."
But from your answer (and from skimming the NNI documentation), it seems NNI actually works like this: "the hyperparameters are continuously adjusted during training (if they are specified in the search space), and the model converges from its initial state to the best (assumed best) state in a single experiment. That means if I want to reproduce your results, I should use the same batch size and lr at each epoch as you did. But if I use the same NNI configuration as you, I should in theory also get the same hyperparameters at each epoch and therefore the same results. And it might not be very time-consuming, because the whole process is a single experiment rather than taking the best of many."
Is that so? If anything I said is wrong, please let me know.

One more question: train_nni.py does not read search_space.json anywhere, so how does NNI know which hyperparameters need to be adjusted?


woshidandan commented on July 19, 2024

Q: Can I reproduce the results of the paper by using the default options?
A: No, the configurations in "option.py" will be overwritten by NNI.

Q: But if I use the same NNI configuration as you, will I theoretically get the same hyperparameters at each epoch and the same results?
A: No, the training process is somewhat random; for example, the initialization parameters of the network.

Q: How does NNI know which hyperparameters need to be adjusted?
A: Read this and you will understand everything: https://github.com/microsoft/nni
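To make the answer above concrete: the trial script itself never opens search_space.json. The NNI experiment configuration (typically a config.yml with a `searchSpacePath` entry) points the tuner at the search space, and the trial script simply asks the tuner for the next sampled combination via `nni.get_next_parameter()`. A minimal, hypothetical sketch of such a trial script (the names and default values here are illustrative, not the repo's actual code):

```python
def merged_params(defaults, tuned):
    """NNI-tuned values override the command-line defaults from option.py."""
    params = dict(defaults)
    params.update(tuned)
    return params


# Placeholder defaults standing in for the values parsed in option.py.
defaults = {"batch_size": 64, "init_lr": 1e-4}

try:
    import nni
    # Inside an NNI experiment, this returns one hyperparameter combination
    # sampled by the tuner from the space defined in search_space.json.
    tuned = nni.get_next_parameter()
except ImportError:
    tuned = {}  # running standalone without NNI: keep the defaults

params = merged_params(defaults, tuned)
# ... build the model and train with params["batch_size"], params["init_lr"],
# then report the final validation score so the tuner can compare trials:
# nni.report_final_result(best_score)
```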


ZachL1 commented on July 19, 2024

I see, thank you!


ZachL1 commented on July 19, 2024

Hi, I think I really understand NNI now: it is just normal hyperparameter search. NNI runs experiments with different hyperparameters and monitors the process and results in order to pick better parameters for the next experiment.

If that's the case, what hyperparameters did you end up using to get the results in the paper? Searching for hyperparameters from scratch seems pointless and may lead to different results than yours.
Specifically, what are the number of epochs, batch size, and learning rate?
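That understanding can be sketched as a plain search loop. The toy example below uses random sampling and a dummy `evaluate` function standing in for one full training run; a real tuner (e.g. NNI's TPE) would instead bias its sampling toward combinations that scored well in earlier trials:

```python
import random

# Candidate values mirroring the ranges mentioned earlier in the thread.
search_space = {
    "batch_size": [24, 32, 40, 48, 56, 64, 72],
    "init_lr": [1e-4, 1e-5, 1e-6],
}


def evaluate(params):
    # Stand-in for one complete training run returning a validation score.
    return 1.0 / (params["batch_size"] * params["init_lr"] ** 0.5)


def random_search(space, n_trials, seed=0):
    rng = random.Random(seed)
    best_params, best_score = None, float("-inf")
    for _ in range(n_trials):
        # Each trial is an independent experiment with a fresh combination.
        params = {k: rng.choice(v) for k, v in space.items()}
        score = evaluate(params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


best, score = random_search(search_space, n_trials=20)
```

The combination that scored best across all trials is what one would report and reuse, which is exactly why asking for the final chosen epoch/batch size/lr makes sense.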


JennyVanessa commented on July 19, 2024

Hi, I think I really understand NNI now: it is just normal hyperparameter search. NNI runs experiments with different hyperparameters and monitors the process and results in order to pick better parameters for the next experiment.

If that's the case, what hyperparameters did you end up using to get the results in the paper? Searching for hyperparameters from scratch seems pointless and may lead to different results than yours. Specifically, what are the number of epochs, batch size, and learning rate?

I have the same question. I think it's best to provide the hyperparameter settings directly. ;)

