
mobilenetv3's People

Contributors

xiaolai-sqlai


mobilenetv3's Issues

!!!

Is there a pooling layer missing in the mobilenetv3 large model?

learning rate setting and data augmentation

Hi Xiaolai,

I found it hard to reproduce the stated results. Could you please upload the training script as well? The tricks in the learning rate schedule and data augmentation may be the key to reproducing them.

Thanks, and it would be a wonderful piece of work if the results can be reproduced.
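For reference, a common ImageNet baseline augmentation looks like the following. This is only a guess at a starting point, not the author's actual recipe:

    import torchvision.transforms as transforms

    # common ImageNet baseline augmentation (an assumption, not the author's recipe)
    transform_train = transforms.Compose([
        transforms.RandomResizedCrop(224),
        transforms.RandomHorizontalFlip(),
        transforms.ToTensor(),
        transforms.Normalize(mean=[0.485, 0.456, 0.406],
                             std=[0.229, 0.224, 0.225]),
    ])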

Doubt the accuracy

I get 65.07% using the new small model, which is much lower than your claim.
I used this preprocessing:
    transform_test = transforms.Compose([
        transforms.Resize(256),
        transforms.CenterCrop(224),
        transforms.ToTensor(),
        transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
    ])

SE's position differs from paper.


SE should be placed after the depthwise conv and before the pointwise conv. However, you apply it after the pointwise conv. Is there a reason, or is it just a mistake?
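For illustration, the ordering the paper describes would look roughly like this inside the block's forward (a sketch reusing this repo's variable names, not the author's code):

    # paper ordering inside a bneck block (sketch)
    out = self.nolinear1(self.bn1(self.conv1(x)))    # 1x1 expansion
    out = self.nolinear2(self.bn2(self.conv2(out)))  # 3x3/5x5 depthwise
    out = self.se(out)                               # SE on the expanded features
    out = self.bn3(self.conv3(out))                  # 1x1 pointwise-linear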

SeModule

I'd like to know: what is AdaptiveAvgPool2d(1) used for in the SE module?
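For context, AdaptiveAvgPool2d(1) performs the "squeeze" step: it averages each channel's HxW feature map down to a single value, so the following layers can learn a per-channel weighting. A minimal illustration:

    import torch
    import torch.nn as nn

    pool = nn.AdaptiveAvgPool2d(1)      # forces the output spatial size to 1x1
    x = torch.randn(2, 16, 56, 56)      # N x C x H x W
    y = pool(x)                         # shape (2, 16, 1, 1): one scalar per channel
    assert torch.allclose(y, x.mean(dim=(2, 3), keepdim=True))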

Validation accuracy for mobilenetv3_small

I downloaded the small model and tested it, but only got 64.926% accuracy. Not sure if it is because of a preprocessing difference? Here is what I used:

normalize = transforms.Normalize(mean=[0.485, 0.456, 0.406],
                                     std=[0.229, 0.224, 0.225])

val_loader = torch.utils.data.DataLoader(
        datasets.ImageFolder(valdir, transforms.Compose([
            transforms.Resize(256),
            transforms.CenterCrop(224),
            transforms.ToTensor(),
            normalize,
        ])),
        batch_size=128, shuffle=False,
        num_workers=1, pin_memory=True)

Stride error

I wonder if the stride is right: the stride of the 13th bneck is 2 in the paper, but it is 1 in your code, and your 14th bneck has stride 2 instead.

SE module position does not match the paper?

Is there a problem with the source code? Looking at the paper and other implementations, I found that the SE module is applied after the depthwise conv, with the expanded channel count (expand_size) as its input channels, and is followed by the pointwise conv; this code, however, applies the SE mechanism after the pointwise conv.

(Screenshot of another implementation's code.)

batchnorm handling issue during inference

When running inference on a single image (batch size = 1) I got this error:

mobilenetv3.py", line 199, in forward
    out = self.hs3(self.bn3(self.linear3(out)))
  File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 489, in __call__
    result = self.forward(*input, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/batchnorm.py", line 76, in forward
    exponential_average_factor, self.eps)
  File "/usr/local/lib/python3.6/dist-packages/torch/nn/functional.py", line 1619, in batch_norm
    raise ValueError('Expected more than 1 value per channel when training, got input size {}'.format(size))
ValueError: Expected more than 1 value per channel when training, got input size torch.Size([1, 1280])
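For reference, this error usually means the model is still in training mode, where BatchNorm needs more than one value per channel to compute batch statistics. Switching to eval mode uses the stored running statistics instead; a minimal sketch, assuming the repo's MobileNetV3_Small class:

    import torch
    from mobilenetv3 import MobileNetV3_Small  # assumes the repo's file layout

    model = MobileNetV3_Small()
    model.eval()                                # use running stats; fixes the batch-size-1 error
    with torch.no_grad():
        out = model(torch.randn(1, 3, 224, 224))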

Difference on NBN from paper

Hi, beautiful code.
In Searching for MobileNetV3, the operators of the last two layers of MobileNetV3-Large are conv2d 1x1 and NBN (no batch normalization), but self.bn3 appears in your code at mobilenetv3/mobilenetv3.py lines 130 and 187.
Please check your code. Thank you!

SeModule Error

Hello, is the BatchNorm2d module inserted in SeModule a mistake? After AdaptiveAvgPool2d the feature map becomes 1x1, and applying BN to such a feature map raises an error. My OS is Ubuntu 18.04, with torch==1.11.0+cu113, torchvision==0.12.0+cu113, CUDA 11.3.

Validation accuracy for large model low, mistakes in model

As with #5, the validation accuracy for the large model is also well below the stated number. I was curious because the stated result, beating the official model with 1.4M fewer parameters, would be impressive.

I only get: Prec@1 70.788 (29.212) Prec@5 89.410 (10.590)

Several things to fix in the model (a corrected SE sketch follows this list):

  • squeeze-excite layers should reduce the spatial dims with either a mean across the spatial dims or an avgpool. You have the avgpool in there but aren't using it.
  • there should be no BN in the SE module
  • the SE module should be applied between the 3x3 DW conv and the 1x1 PWL, not after the PWL
  • as per the paper, the reduction for the SE layer in MobileNetV3 should be applied to the expanded width
  • there were mistakes in the last block of 5x5 convs in the paper; those have been fixed in a newer revision: the location of the last stride-2 layer changed, and one of the 672 expansions should be 960
  • there should be no batch norm after the linear layer before the classifier
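A corrected SE module along the lines above might look like this (a sketch under the stated assumptions, not the author's code; nn.Hardsigmoid requires torch >= 1.6):

    import torch
    import torch.nn as nn

    class SeModuleFixed(nn.Module):
        """Sketch of an SE module matching the points above: the global average
        pool is actually used, there is no BatchNorm, and the reduction is
        applied to the expanded width."""
        def __init__(self, expand_size, reduction=4):
            super().__init__()
            self.pool = nn.AdaptiveAvgPool2d(1)   # squeeze: N x C x H x W -> N x C x 1 x 1
            self.fc = nn.Sequential(
                nn.Conv2d(expand_size, expand_size // reduction, 1, bias=False),
                nn.ReLU(inplace=True),
                nn.Conv2d(expand_size // reduction, expand_size, 1, bias=False),
                nn.Hardsigmoid(),                 # h-sigmoid gate, per MobileNetV3
            )

        def forward(self, x):
            return x * self.fc(self.pool(x))      # excite: rescale each channel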

loss nan

hi, my training loss is normal, but the validation loss is NaN. Has anyone else run into this?

A question

In your inverted residual Block, shouldn't the SE layer come first, followed by the pointwise conv?

Difference on SE from paper

  1. I am just curious: why do you add batch normalization inside SeModule? Is there a reference for doing so?
  2. Please correct me if I'm mistaken: I think SeModule should be added between the dw and pw-linear convs, but your code seems to add it after pw-linear, right before the residual connection.
  3. Do you think it's necessary to consider expand_ratio = 1? When expand_channel == output_channel, I feel the pw conv might be redundant, since the shape won't change after it.

Thank you!

The last two FC layers don't need batch norm.

NBN denotes no batch normalization.
In the original paper. In table 1. The author use conv2d 1x1, NBN in the 1x1x960 to 1x1x1280, which means they don't use batch norm in fc layers, but your code use bn. This is not the same as the original version of the paper. Although this little change may not influence the final results.

    out = self.hs3(self.bn3(self.linear3(out)))
    out = self.linear4(out)
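A sketch of the NBN variant, dropping self.bn3 per Table 1 of the paper:

    # forward() without batch norm on the last 1x1 layer (sketch)
    out = self.hs3(self.linear3(out))   # NBN: no BatchNorm here
    out = self.linear4(out)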

fail to load pre-model

    model = torch.load('mbv3_small.pth.tar', map_location='cpu')
    print('Loading base network...')
    weights = model["state_dict"]
    base_net = torch.nn.DataParallel(base_net)  # base_net is built elsewhere
    base_net.load_state_dict(weights)

The program just hangs, with no response.
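For reference, a loading sketch that works for checkpoints saved as {'state_dict': ...}, stripping any 'module.' prefix left by DataParallel; the key names are assumptions, so adjust them to the actual file:

    import torch
    from mobilenetv3 import MobileNetV3_Small  # assumes the repo's file layout

    checkpoint = torch.load('mbv3_small.pth.tar', map_location='cpu')
    state_dict = checkpoint.get('state_dict', checkpoint)  # some files are bare state dicts
    # strip a leading 'module.' prefix left over from DataParallel training, if present
    state_dict = {k.replace('module.', '', 1): v for k, v in state_dict.items()}
    net = MobileNetV3_Small()
    net.load_state_dict(state_dict)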

Error extracting the model file

    tar -xvf mbv3_large.pth.tar
    tar: This does not look like a tar archive
    tar: Skipping to next header
    tar: Exiting with failure status due to previous errors
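Despite the .pth.tar suffix, the file is most likely a PyTorch checkpoint saved with torch.save, not a tar archive, so tar cannot extract it; load it from Python instead:

    import torch

    # not a tar archive: load it as a PyTorch checkpoint
    checkpoint = torch.load('mbv3_large.pth.tar', map_location='cpu')
    print(type(checkpoint))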

About SE Reduction ratio?

In the original Squeeze-and-Excitation Networks paper, the authors found experimentally that reduction = 16 gives a good balance between precision and complexity. In your module, the reduction is set based on the current number of bneck output channels; what is the reason for this setting? Did it come from an experiment?

Image preprocessing method

Hi, thanks for your project.
I have a question: what is your image preprocessing method before inference? Do you subtract the mean and divide by the standard deviation, or use some other method? I want to use your trained model to fine-tune another classification task.

About the Residual Block

In code

    def __init__(self, kernel_size, in_size, expand_size, out_size, nolinear, semodule, stride):
        super(Block, self).__init__()
        self.stride = stride
        self.se = semodule

        # 1x1 pointwise expansion
        self.conv1 = nn.Conv2d(in_size, expand_size, kernel_size=1, stride=1, padding=0, bias=False)
        self.bn1 = nn.BatchNorm2d(expand_size)
        self.nolinear1 = nolinear
        # depthwise conv (groups == channels)
        self.conv2 = nn.Conv2d(expand_size, expand_size, kernel_size=kernel_size, stride=stride, padding=kernel_size//2, groups=expand_size, bias=False)
        self.bn2 = nn.BatchNorm2d(expand_size)
        self.nolinear2 = nolinear
        # 1x1 pointwise-linear projection
        self.conv3 = nn.Conv2d(expand_size, out_size, kernel_size=1, stride=1, padding=0, bias=False)
        self.bn3 = nn.BatchNorm2d(out_size)

        # 1x1 conv shortcut when the channel counts differ
        self.shortcut = nn.Sequential()
        if stride == 1 and in_size != out_size:
            self.shortcut = nn.Sequential(
                nn.Conv2d(in_size, out_size, kernel_size=1, stride=1, padding=0, bias=False),
                nn.BatchNorm2d(out_size),
            )

    def forward(self, x):
        out = self.nolinear1(self.bn1(self.conv1(x)))
        out = self.nolinear2(self.bn2(self.conv2(out)))
        out = self.bn3(self.conv3(out))
        if self.se != None:
            out = self.se(out)          # SE applied after the pointwise projection
        out = out + self.shortcut(x) if self.stride==1 else out
        return out

Due to the elementwise addition of the residual connection, I think "if stride == 1 and in_size != out_size:" should be changed to "if stride == 1 and in_size == out_size:".
Besides, in "out = out + self.shortcut(x) if self.stride==1 else out" I also think the residual connection may be used incorrectly.
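For comparison, the conventional MobileNetV2/V3-style block uses an identity skip only when stride == 1 and in_size == out_size, and applies SE between the depthwise and pointwise convs. A self-contained sketch of that variant, under those assumptions rather than as the author's intended design:

    import torch
    import torch.nn as nn

    class BlockV2Style(nn.Module):
        """Hypothetical rewrite with the conventional inverted-residual skip."""
        def __init__(self, kernel_size, in_size, expand_size, out_size, nolinear, semodule, stride):
            super().__init__()
            self.use_res_connect = stride == 1 and in_size == out_size
            self.conv1 = nn.Conv2d(in_size, expand_size, 1, 1, 0, bias=False)   # 1x1 expand
            self.bn1 = nn.BatchNorm2d(expand_size)
            self.nolinear1 = nolinear
            self.conv2 = nn.Conv2d(expand_size, expand_size, kernel_size, stride,
                                   kernel_size // 2, groups=expand_size, bias=False)  # depthwise
            self.bn2 = nn.BatchNorm2d(expand_size)
            self.nolinear2 = nolinear
            self.se = semodule                    # SE on the expanded features
            self.conv3 = nn.Conv2d(expand_size, out_size, 1, 1, 0, bias=False)  # 1x1 project
            self.bn3 = nn.BatchNorm2d(out_size)

        def forward(self, x):
            out = self.nolinear1(self.bn1(self.conv1(x)))
            out = self.nolinear2(self.bn2(self.conv2(out)))
            if self.se is not None:
                out = self.se(out)                # between DW and PW-linear, per the paper
            out = self.bn3(self.conv3(out))
            return x + out if self.use_res_connect else out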
