seed's People

Contributors

grypesc

seed's Issues

About the parameters in Table 3

In Table 3, it is shown that SEED utilizes 3.2 million parameters, whereas ResNet18 has 11.7 million parameters. Could you clarify which part of the parameters is referred to by "#Params." in the table?

Cannot import a module

This is a very naive issue.

I cloned the repo, installed the dependencies, and tried the script

sh cifar10x10.sh

I am getting the following error:

[image: screenshot of the error message, not preserved]

This seems like a very basic issue. I have verified that everything is there, but I have no clue what is going wrong.

About the avg accuracy on CIFAR-100 in Table 2

Hello, thank you for your amazing work and code.
I used the SEED settings for 10 and 5 steps, but I got significantly lower results than those reported in the paper.

I modified --num-tasks and --nc-first-task to run the experiments for T=6 (|C1|=50) and T=11 (|C1|=50) separately:

python src/main_incremental.py --approach seed --gmms 1 --max-experts 5 --use-multivariate --nepochs 200 --tau 3 --batch-size 128 --num-workers 4 --datasets cifar100_icarl --num-tasks 6 --nc-first-task 50 --lr 0.05 --weight-decay 5e-4 --clipping 1 --alpha 0.99 --use-test-as-val --network resnet32 --extra-aug fetril --momentum 0.9 --exp-name exp_50+5x10 --seed 0

python src/main_incremental.py --approach seed --gmms 1 --max-experts 5 --use-multivariate --nepochs 200 --tau 3 --batch-size 128 --num-workers 4 --datasets cifar100_icarl --num-tasks 11 --nc-first-task 50 --lr 0.05 --weight-decay 5e-4 --clipping 1 --alpha 0.99 --use-test-as-val --network resnet32 --extra-aug fetril --momentum 0.9 --exp-name exp_50+10x5 --seed 0
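For reference, the class split implied by the two commands can be checked with a small sketch. It assumes that the classes left after the first task are divided evenly among the remaining tasks, which is what the experiment names `exp_50+5x10` and `exp_50+10x5` suggest. The helper `classes_per_task` is hypothetical and is not part of the repository.

```python
def classes_per_task(total_classes, num_tasks, nc_first_task):
    """Return the number of classes in each task.

    Assumption: classes remaining after the first task are split evenly
    among the later tasks (hypothetical helper, not from the repo).
    """
    remaining = total_classes - nc_first_task
    later_tasks = num_tasks - 1
    if later_tasks <= 0 or remaining % later_tasks != 0:
        raise ValueError("remaining classes must divide evenly among later tasks")
    return [nc_first_task] + [remaining // later_tasks] * later_tasks


# CIFAR-100 with --nc-first-task 50
print(classes_per_task(100, 6, 50))   # first task of 50 classes, then 5 tasks of 10
print(classes_per_task(100, 11, 50))  # first task of 50 classes, then 10 tasks of 5
```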

I get avg_acc = 67.2 on T=6 (|C1|=50) and avg_acc = 66.6 on T=11 (|C1|=50), both of which are much lower than the results in the paper.
Could you please give more details on the setup you used in the paper, and on how I can reproduce the results?

Thanks.

Distribution overlap calculation

Dear authors,

Congratulations on your paper being accepted at ICLR 2024, and thank you for releasing the code to the community. After reading your paper, I think it is a good starting point for investigating ensembles of experts for CL. However, I have a question about how you select the expert to finetune on a new task.

As depicted in Fig. 3, the distributions of task 3's new classes are compared with those of the old tasks t1 and t2 using the KL divergence. This would mean that we have to store the distributions of the old tasks. However, as indicated in the text below the figure, the distribution set Q_k contains only the distributions of the current task's classes, from 1 to C_t, and ignores all classes from previous tasks. Then, in Eq. (2), the KL divergence is computed within the set Q_k, since both q_{i,k} and q_{j,k} are in Q_k. Therefore, we should not need to take the class distributions from previous tasks into account. Yet when I look at the code from line 188, it seems that you still consider the class distributions of prior tasks.
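To make the question concrete, here is a minimal sketch of a pairwise KL computation over a set of class distributions. It is not the authors' implementation: it assumes diagonal-covariance Gaussians (the repository's `--use-multivariate` flag suggests full covariances), and it assumes the expert with the largest total divergence, i.e. the least overlap, is the one selected. All function names are hypothetical. Which classes go into the set passed to `total_pairwise_divergence` is exactly the point I am unsure about.

```python
import math
from itertools import combinations


def kl_diag_gauss(mu_p, var_p, mu_q, var_q):
    """KL(p || q) for Gaussians with diagonal covariance (simplifying assumption)."""
    return 0.5 * sum(
        vp / vq + (mq - mp) ** 2 / vq - 1.0 + math.log(vq / vp)
        for mp, vp, mq, vq in zip(mu_p, var_p, mu_q, var_q)
    )


def total_pairwise_divergence(class_gaussians):
    """Sum KL in both directions over all unordered pairs of (mean, variance) tuples."""
    return sum(
        kl_diag_gauss(*p, *q) + kl_diag_gauss(*q, *p)
        for p, q in combinations(class_gaussians, 2)
    )


def pick_expert(per_expert_gaussians):
    """Return (index, scores); assumes the largest total divergence is preferred."""
    scores = [total_pairwise_divergence(g) for g in per_expert_gaussians]
    best = max(range(len(scores)), key=scores.__getitem__)
    return best, scores


# Two hypothetical experts, each embedding the same two classes in 2-D.
expert_0 = [([0.0, 0.0], [1.0, 1.0]), ([1.0, 0.0], [1.0, 1.0])]
expert_1 = [([0.0, 0.0], [1.0, 1.0]), ([3.0, 0.0], [1.0, 1.0])]
best, scores = pick_expert([expert_0, expert_1])
print(best, scores)
```

In this sketch the set contains only the classes passed in, so whether prior-task classes take part depends solely on what the caller includes.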

I am confused about how to interpret this correctly and look forward to your clarification. Feel free to correct me if I am wrong.

Best,
Cuong
