knowledgefactor's Issues

The formulations in the paper

Excuse me, I have a question: the derivation of the variational bound for mutual information is not in the supplementary materials.
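
For reference, a standard derivation of this kind of bound can be sketched as follows. This is the textbook variational upper bound and not necessarily the exact form used in the paper; the symbols q(t_j) (a variational approximation to the marginal p(t_j)) and p(t_j | x) (the encoder distribution) are assumptions introduced here.

```latex
\begin{align}
I(X; T_j)
  &= \mathbb{E}_{p(x)}\, D_{KL}\big(p(t_j \mid x)\,\|\,p(t_j)\big) \\
  &= \mathbb{E}_{p(x)}\, D_{KL}\big(p(t_j \mid x)\,\|\,q(t_j)\big)
     - D_{KL}\big(p(t_j)\,\|\,q(t_j)\big) \\
  &\le \mathbb{E}_{p(x)}\, D_{KL}\big(p(t_j \mid x)\,\|\,q(t_j)\big)
\end{align}
```

The inequality holds because the dropped term D_{KL}(p(t_j) || q(t_j)) is non-negative.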

A question about the teacher checkpoints

Excuse me, were these checkpoints pretrained on ImageNet? I trained the WideResNet teacher model on CIFAR-10 alone, starting from a pretrained checkpoint, and found that it started at an accuracy of 50%, so I guess the checkpoints were not pretrained on CIFAR-10/100. Is that right?

Some equation errors in the paper

Nice work, and thanks for the code. I found some equation errors in the main text.

  1. The upper bound of I(X, T_j) should be E_{p(x)} D_{KL}, not E_{p(t_j)} D_{KL}, in Eq. (7) and in Eq. (19) of the supplementary materials.
  2. The D_{KL} in Eq. (11) is missing a minus sign before the 1/2. Your released code is correct, though.
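
The sign in item 2 can be checked numerically. The sketch below assumes Eq. (11) is the usual closed-form KL divergence between a Gaussian N(mu, sigma^2) and a standard normal prior; that assumption and the function names are not taken from the paper or the released code.

```python
import math


def kl_to_standard_normal(mu, sigma):
    """Closed-form KL(N(mu, sigma^2) || N(0, 1)) for a scalar Gaussian.

    Note the leading minus sign before the 1/2.
    """
    return -0.5 * (1.0 + math.log(sigma ** 2) - mu ** 2 - sigma ** 2)


def kl_numeric(mu, sigma, lo=-12.0, hi=12.0, steps=20000):
    """Midpoint-rule estimate of the same KL divergence, as a cross-check."""
    h = (hi - lo) / steps
    total = 0.0
    for i in range(steps):
        t = lo + (i + 0.5) * h
        log_p = -0.5 * math.log(2 * math.pi * sigma ** 2) - (t - mu) ** 2 / (2 * sigma ** 2)
        log_q = -0.5 * math.log(2 * math.pi) - t ** 2 / 2
        total += math.exp(log_p) * (log_p - log_q) * h
    return total


closed = kl_to_standard_normal(0.7, 1.3)
numeric = kl_numeric(0.7, 1.3)
print(f"closed form: {closed:.6f}, numeric: {numeric:.6f}")
```

Without the minus sign the closed form would be negative for this input, which cannot be a KL divergence.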

A problem occurs when running KD experiments

Excuse me, I wanted to follow this wonderful work. I have successfully run the KF experiments on CIFAR-10 and ImageNet; however, a problem occurs when running the KD experiments.
On CIFAR-10, I run:
python tools/dist_train.py configs/cifar10-kd/wideresnet28-10_resnet18_b128x1_cifar10.py 3
and I get an error:
[image]
So I changed line 159 of /KnowledgeFactor-main/cls/mmcls/models/classifiers/kd.py to:
loss_cls = self.criterionCls(student_logit, gt_label.squeeze(dim=1))
However, training with the modified code reached more than 96% accuracy, which is higher than the accuracy reported in the paper:
[image]
So I suspect my modification is wrong. Could you please tell me the right way to fix it? Thank you very much!
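
The error screenshot is not available, so the following is only a guess at what the squeeze addresses: a cross-entropy loss over class indices expects labels of shape (N,), while the labels here may arrive with shape (N, 1). The pure-Python sketch below mimics that situation; the helper names are hypothetical and the code does not use the repository's classes.

```python
import math


def squeeze_labels(labels):
    """Flatten labels of shape (N, 1) to (N,), similar in spirit to squeeze(dim=1).

    Assumes each nested row holds exactly one class index.
    """
    return [row[0] if isinstance(row, (list, tuple)) else row for row in labels]


def cross_entropy(logits, labels):
    """Mean cross-entropy for class-index labels of shape (N,)."""
    if any(isinstance(y, (list, tuple)) for y in labels):
        raise ValueError("labels must have shape (N,), got nested labels")
    total = 0.0
    for row, y in zip(logits, labels):
        m = max(row)  # subtract the max for numerical stability
        log_sum_exp = m + math.log(sum(math.exp(v - m) for v in row))
        total += log_sum_exp - row[y]
    return total / len(logits)


logits = [[2.0, 0.5, -1.0], [0.1, 0.2, 3.0]]
labels_2d = [[0], [2]]  # shape (N, 1): rejected by cross_entropy as-is
loss = cross_entropy(logits, squeeze_labels(labels_2d))
print(f"loss after squeezing labels: {loss:.4f}")
```

If that is the cause, the squeeze only changes the label shape and not the label values, so by itself it should not change what the loss computes.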

KD experiment problem and teacher checkpoint directory

Hi,
Great work! When I run tools/train.py with the config file, I get the following error.
Also, where should I put the checkpoints after downloading them?
Thanks

Traceback (most recent call last):
  File "tools/train.py", line 15, in <module>
    from mmcls.apis import set_random_seed, train_model
  File "/home/amir-e/.conda/envs/kf/lib/python3.8/site-packages/mmcls/apis/__init__.py", line 2, in <module>
    from .inference import inference_model, init_model, show_result_pyplot
  File "/home/amir-e/.conda/envs/kf/lib/python3.8/site-packages/mmcls/apis/inference.py", line 10, in <module>
    from mmcls.datasets.pipelines import Compose
  File "/home/amir-e/.conda/envs/kf/lib/python3.8/site-packages/mmcls/datasets/__init__.py", line 2, in <module>
    from .base_dataset import BaseDataset
  File "/home/amir-e/.conda/envs/kf/lib/python3.8/site-packages/mmcls/datasets/base_dataset.py", line 13, in <module>
    from mmcls.models.losses import accuracy
  File "/home/amir-e/.conda/envs/kf/lib/python3.8/site-packages/mmcls/models/__init__.py", line 2, in <module>
    from .backbones import * # noqa: F401,F403
  File "/home/amir-e/.conda/envs/kf/lib/python3.8/site-packages/mmcls/models/backbones/__init__.py", line 3, in <module>
    from .conformer import Conformer
  File "/home/amir-e/.conda/envs/kf/lib/python3.8/site-packages/mmcls/models/backbones/conformer.py", line 9, in <module>
    from mmcv.cnn.bricks.transformer import AdaptivePadding
  File "/home/amir-e/.conda/envs/kf/lib/python3.8/site-packages/mmcv/cnn/bricks/transformer.py", line 22, in <module>
    from mmcv.ops.multi_scale_deform_attn import MultiScaleDeformableAttention # noqa F401
  File "/home/amir-e/.conda/envs/kf/lib/python3.8/site-packages/mmcv/ops/__init__.py", line 10, in <module>
    from .corner_pool import CornerPool
  File "/home/amir-e/.conda/envs/kf/lib/python3.8/site-packages/mmcv/ops/corner_pool.py", line 8, in <module>
    ext_module = ext_loader.load_ext('_ext', [
  File "/home/amir-e/.conda/envs/kf/lib/python3.8/site-packages/mmcv/utils/ext_loader.py", line 15, in load_ext
    assert hasattr(ext, fun), f'{fun} miss in module {name}'
AssertionError: top_pool_forward miss in module _ext
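
An assertion like this usually means the compiled mmcv extension does not match the Python side of mmcv, for example when both mmcv and mmcv-full are installed or when the build targets a different PyTorch version. That diagnosis is an assumption here, not something confirmed in this thread. A minimal sketch for collecting the relevant versions with the standard library only (the package list and helper names are illustrative):

```python
from importlib import metadata


def installed_versions(packages=("torch", "torchvision", "mmcv", "mmcv-full", "mmcls")):
    """Return {package: version or None} without importing the packages."""
    report = {}
    for name in packages:
        try:
            report[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = None  # not installed in this environment
    return report


def has_conflicting_mmcv(report):
    """True if both the lite and the full mmcv distributions are installed."""
    return report.get("mmcv") is not None and report.get("mmcv-full") is not None


report = installed_versions()
for name, version in report.items():
    print(f"{name}: {version or 'not installed'}")
if has_conflicting_mmcv(report):
    print("Both mmcv and mmcv-full are installed; this may cause extension mismatches.")
```

Posting this output along with the traceback makes it easier to tell whether the environment matches what the repository expects.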

A problem occurs when loading the pretrained WideResNet as teacher

Nice work! But a problem occurs during my test run:
The pretrained ResNet18 loads successfully with the configs in the folder configs/imagenet-kf, but "Teacher model not loaded" is shown when I run a test with configs/imagenet-kf/wideresnet28-2*.py. I then caught the exception; part of the error follows.

RuntimeError: Error(s) in loading state_dict for ModuleDict:
    size mismatch for backbone.layer1.0.conv1.weight: copying a param with shape torch.Size([160, 16, 3, 3]) from checkpoint, the shape in current model is torch.Size([32, 16, 3, 3]).
    size mismatch for backbone.layer1.0.conv2.weight: copying a param with shape torch.Size([160, 160, 3, 3]) from checkpoint, the shape in current model is torch.Size([32, 32, 3, 3]).
    size mismatch for backbone.layer1.0.bn2.weight: copying a param with shape torch.Size([160]) from checkpoint, the shape in current model is torch.Size([32]).
    size mismatch for backbone.layer1.0.bn2.bias: copying a param with shape torch.Size([160]) from checkpoint, the shape in current model is torch.Size([32]).
    size mismatch for backbone.layer1.0.bn2.running_mean: copying a param with shape torch.Size([160]) from checkpoint, the shape in current model is torch.Size([32]).
    size mismatch for backbone.layer1.0.bn2.running_var: copying a param with shape torch.Size([160]) from checkpoint, the shape in current model is torch.Size([32]).
    size mismatch for backbone.layer1.0.downsample.0.weight: copying a param with shape torch.Size([160, 16, 1, 1]) from checkpoint, the shape in current model is torch.Size([32, 16, 1, 1]).
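
The shapes in this error hint at a width mismatch rather than a corrupted file. A minimal sketch, assuming the standard WideResNet layout in which the first stage has 16 x widen_factor output channels; the helper names and the base width of 16 are assumptions, not taken from the repository:

```python
def find_shape_mismatches(checkpoint_shapes, model_shapes):
    """Return {key: (checkpoint_shape, model_shape)} for keys whose shapes differ."""
    return {
        key: (checkpoint_shapes[key], model_shapes[key])
        for key in checkpoint_shapes
        if key in model_shapes and checkpoint_shapes[key] != model_shapes[key]
    }


def infer_widen_factor(conv_weight_shape, base_channels=16):
    """Guess the widen factor from a first-stage conv weight shape (out, in, kH, kW)."""
    out_channels = conv_weight_shape[0]
    if out_channels % base_channels != 0:
        raise ValueError(f"{out_channels} is not a multiple of {base_channels}")
    return out_channels // base_channels


# Shapes copied from the error message above.
checkpoint_shapes = {"backbone.layer1.0.conv1.weight": (160, 16, 3, 3)}
model_shapes = {"backbone.layer1.0.conv1.weight": (32, 16, 3, 3)}

for key, (ckpt, model) in find_shape_mismatches(checkpoint_shapes, model_shapes).items():
    print(f"{key}: checkpoint widen factor {infer_widen_factor(ckpt)}, "
          f"model widen factor {infer_widen_factor(model)}")
```

Under that assumption, the checkpoint looks like a widen-factor-10 model (for example a WideResNet28-10), while the config builds a widen-factor-2 model, so the two cannot share a state_dict.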
