grypesc / seed

ICLR2024 paper on Continual Learning

Home Page: https://arxiv.org/abs/2401.10191

License: MIT License

Shell 0.80% Python 99.20%

class-incremental class-incremental-learning computer-vision continual-learning facil machine-learning

seed's Issues

About the parameters in Table 3

In Table 3, it is shown that SEED utilizes 3.2 million parameters, whereas ResNet18 has 11.7 million parameters. Could you clarify which part of the parameters is referred to by "#Params." in the table?

Cannot import a module

This is a very naive issue.

I cloned the repo, installed the dependencies, and ran the script:

sh cifar10x10.sh

I am getting the following error

This is a very basic issue. I have verified that everything is there, but I have no clue what is wrong.
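Since the traceback is not included in the report, the following is only a generic diagnostic sketch, not a fix for this specific error. It assumes the failure is a missing-module error and checks whether a given module name is visible to the current interpreter. The helper name `diagnose_import` is hypothetical and is not part of the repository.

```python
import importlib.util
import sys
from pathlib import Path


def diagnose_import(module_name: str) -> dict:
    """Report whether `module_name` can be located by the current interpreter.

    Hypothetical helper for debugging; not part of the seed repository.
    """
    try:
        spec = importlib.util.find_spec(module_name)
    except (ImportError, ValueError):
        # Raised when a parent package is missing or the module has no spec.
        spec = None
    return {
        "module": module_name,
        "found": spec is not None,
        "origin": getattr(spec, "origin", None),
        "cwd": str(Path.cwd()),
        # The first entries of sys.path are searched first.
        "sys_path_head": sys.path[:5],
    }


# Replace "json" with the module named in your own traceback.
for key, value in diagnose_import("json").items():
    print(f"{key}: {value}")
```

If `found` is false, typical causes are running the script from a different working directory than the one it expects, or installing the dependencies into a different Python environment than the one `sh` ends up invoking.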

About the avg accuracy on CIFAR-100 in Table 2

Hello, thank you for your amazing work and code.
I used the SEED settings for the 10-step and 5-step scenarios, but I got significantly lower results than those reported in the paper.

I modified --num-tasks and --nc-first-task to run the experiments for T=6 (|C1|=50) and T=11 (|C1|=50) separately:

python src/main_incremental.py --approach seed --gmms 1 --max-experts 5 --use-multivariate --nepochs 200 --tau 3 --batch-size 128 --num-workers 4 --datasets cifar100_icarl --num-tasks 6 --nc-first-task 50 --lr 0.05 --weight-decay 5e-4 --clipping 1 --alpha 0.99 --use-test-as-val --network resnet32 --extra-aug fetril --momentum 0.9 --exp-name exp_50+5x10 --seed 0

python src/main_incremental.py --approach seed --gmms 1 --max-experts 5 --use-multivariate --nepochs 200 --tau 3 --batch-size 128 --num-workers 4 --datasets cifar100_icarl --num-tasks 11 --nc-first-task 50 --lr 0.05 --weight-decay 5e-4 --clipping 1 --alpha 0.99 --use-test-as-val --network resnet32 --extra-aug fetril --momentum 0.9 --exp-name exp_50+10x5 --seed 0

I get avg_acc = 67.2 for T=6 (|C1|=50) and avg_acc = 66.6 for T=11 (|C1|=50), which is much lower than the results in the paper.
Could you please give more details on the method you used in the paper, and on how I can reproduce the results?

Thanks.
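As a sanity check on the two configurations above, here is a minimal sketch of how the 100 CIFAR-100 classes would be divided across tasks for a given --num-tasks and --nc-first-task. It assumes that the classes left after the first task are split as evenly as possible among the remaining tasks. The function `classes_per_task` is hypothetical and is not taken from the repository.

```python
def classes_per_task(total_classes: int, num_tasks: int, nc_first_task: int) -> list[int]:
    """Split classes across tasks, with a larger first task.

    Assumption: classes remaining after the first task are divided as evenly
    as possible among the other tasks. Hypothetical helper, not repo code.
    """
    if num_tasks < 2:
        raise ValueError("need at least two tasks when the first task is sized separately")
    remaining = total_classes - nc_first_task
    later_tasks = num_tasks - 1
    if remaining < later_tasks:
        raise ValueError("not enough classes left for the remaining tasks")
    base, extra = divmod(remaining, later_tasks)
    # The first `extra` later tasks get one additional class each.
    return [nc_first_task] + [base + (1 if i < extra else 0) for i in range(later_tasks)]


# The two settings from the commands above.
print(classes_per_task(100, 6, 50))
print(classes_per_task(100, 11, 50))
```

Under this assumption, --num-tasks 6 gives 50 classes followed by five tasks of 10 classes, and --num-tasks 11 gives 50 classes followed by ten tasks of 5 classes, which matches the experiment names exp_50+5x10 and exp_50+10x5.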

Distribution overlap calculation

Dear authors,

Congratulations on your paper being accepted at ICLR 2024, and thank you for your effort in releasing the code to the community. After reading your paper, I think it is a good starting point for investigating the use of expert ensembles in CL. However, I have a question about how you select the expert to finetune on a new task.

As depicted in Fig. 3, the distributions of task 3's classes are compared with those of the old tasks t1 and t2 using the KL divergence. As a result, we would have to store the distributions of the old tasks. However, as indicated in the text below, the distribution set Q_k only contains the distributions of the current task's classes, from 1 to C_t, and ignores all classes from previous tasks. Then, in Eq. (2), the KL divergence is computed within the set Q_k, since both q_ik and q_jk are in Q_k. Therefore, we would not have to take the class distributions from previous tasks into account. However, looking at the code from line 188, it seems that you still consider the class distributions of prior tasks.

I am confused about how to interpret this correctly and look forward to your clarification. Feel free to correct me if I am wrong.

Best,
Cuong
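To make the reading of Eq. (2) described above concrete, here is a minimal sketch that computes a mean pairwise symmetric KL divergence within a single set of class distributions. It is not the repository's implementation and makes several simplifying assumptions: each class distribution is a Gaussian with diagonal covariance (the command lines in the earlier issue pass --use-multivariate, so the actual code may use full covariances), the symmetric KL is taken as the average of the two directions, and all function names and toy values are hypothetical.

```python
import math
from itertools import combinations


def kl_diag_gauss(mean_p, var_p, mean_q, var_q):
    """KL(p || q) for Gaussians with diagonal covariance (closed form)."""
    return 0.5 * sum(
        math.log(vq / vp) + (vp + (mp - mq) ** 2) / vq - 1.0
        for mp, vp, mq, vq in zip(mean_p, var_p, mean_q, var_q)
    )


def symmetric_kl(p, q):
    """Average of KL(p || q) and KL(q || p); p and q are (mean, var) pairs."""
    (mean_p, var_p), (mean_q, var_q) = p, q
    forward = kl_diag_gauss(mean_p, var_p, mean_q, var_q)
    backward = kl_diag_gauss(mean_q, var_q, mean_p, var_p)
    return 0.5 * (forward + backward)


def mean_pairwise_symmetric_kl(distributions):
    """Mean symmetric KL over all unordered pairs inside one set Q_k."""
    pairs = list(combinations(distributions, 2))
    if not pairs:
        raise ValueError("need at least two distributions")
    return sum(symmetric_kl(p, q) for p, q in pairs) / len(pairs)


# Toy, made-up 2-D class distributions for one expert: (mean, variance).
q_k = [
    ([0.0, 0.0], [1.0, 1.0]),
    ([1.0, 0.0], [1.0, 1.0]),
    ([0.0, 2.0], [1.0, 1.0]),
]
print(mean_pairwise_symmetric_kl(q_k))
```

In this reading, only the distributions passed in as `q_k` enter the computation. Whether that set should hold just the new task's classes or also those of earlier tasks is exactly the question raised in this issue, and the sketch does not settle it.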
