
octaveconv_pytorch's Introduction

Beyond Convolution

OctaveConv_pytorch

PyTorch implementation of recent operators

This is a third-party (unofficial) implementation of the following papers.

  1. Drop an Octave: Reducing Spatial Redundancy in Convolutional Neural Networks with Octave Convolution (ICCV 2019). paper
  2. Adaptively Connected Neural Networks (CVPR 2019). paper
  3. Res2Net: A New Multi-scale Backbone Architecture (PAMI 2019). paper
  4. ScaleNet: Data-Driven Neuron Allocation for Scale Aggregation Networks (CVPR 2019). paper
  5. SRM: A Style-based Recalibration Module for Convolutional Neural Networks. paper
  6. SENet: Squeeze-and-Excitation Networks (CVPR 2018). paper
  7. GENet: Exploiting Feature Context in Convolutional Neural Networks (NIPS 2018). paper
  8. ECA-Net: Efficient Channel Attention for Deep Convolutional Neural Networks. paper
  9. SK-Net: Selective Kernel Networks (CVPR 2019). paper
  10. More networks will be added.

Plan

  1. Add Res2Net block with SE layer (done)
  2. Add Adaptive Convolution: both pixel-aware and dataset-aware (done)
  3. Training code on ImageNet (done)
  4. Add SE-like models (done)
  5. Keep tracking newly proposed operators (ongoing)

Usage

Check the model files under the lib/nn folder.

from lib.nn.OCtaveResnet import resnet50
from lib.nn.res2net import se_resnet50
from lib.nn.AdaptiveConvResnet import PixelAwareResnet50, DataSetAwareResnet50

model = resnet50().cuda()
model = se_resnet50().cuda()
model = PixelAwareResnet50().cuda()
model = DataSetAwareResnet50().cuda()
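
A quick smoke test (a minimal sketch; the 1000-class output shape below is an assumption based on the default ResNet classification head):

import torch
from lib.nn.OCtaveResnet import resnet50

model = resnet50().cuda().eval()
x = torch.randn(1, 3, 224, 224).cuda()  # dummy ImageNet-sized input
with torch.no_grad():
    logits = model(x)
print(logits.shape)  # expected torch.Size([1, 1000]) with the default head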

Training

See the exp folder for detailed information.

CheckPoint

Reference and Citation

  1. OctaveConv: MXNet implementation here
  2. AdaptiveConv: Official TensorFlow implementation here
  3. ScaleNet: here
  4. SGENet: here

Please consider citing the authors' papers when using this code for your research.

License

MIT License

octaveconv_pytorch's Issues

Strided convolutions handled incorrectly

Edit

Now I am not so sure: a careful reading of the paper suggests the authors believe that strided convolutions cause misaligned feature maps, which I think is an erroneous conclusion. I go into more detail in the linked issue below.


Original issue text

X_h, X_l = self.h2g_pool(X_h), self.h2g_pool(X_l)

I don't think strided convolutions receive any special treatment in the paper. I think there was some confusion about the discussion of how downsampling is implemented: that refers to the H->L downsampling operation, not to the way a strided convolution layer is ported to OctConv.

Since the paper is not explicit about strided convolutions, I assume they are ported to OctConv just like any other convolutional layer. So I would just use the given stride argument in each of the convolutions: L->L, L->H, H->L, H->H.

Unless somebody finds information to the contrary, this appears to be the only sensible reading of the paper.
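
To make the proposed reading concrete, here is a self-contained sketch (my own illustration, not the repository's code; the per-path conv modules and the 32-channel shapes are hypothetical):

import torch
import torch.nn as nn
import torch.nn.functional as F

s = 2  # the layer's stride
# Hypothetical per-path convolutions, each carrying the stride itself.
conv_h2h = nn.Conv2d(32, 32, 3, stride=s, padding=1)
conv_h2l = nn.Conv2d(32, 32, 3, stride=s, padding=1)
conv_l2l = nn.Conv2d(32, 32, 3, stride=s, padding=1)
conv_l2h = nn.Conv2d(32, 32, 3, stride=s, padding=1)

X_h, X_l = torch.randn(1, 32, 32, 32), torch.randn(1, 32, 16, 16)
X_h2h = conv_h2h(X_h)                                 # 32x32 -> 16x16
X_l2l = conv_l2l(X_l)                                 # 16x16 -> 8x8
X_h2l = conv_h2l(F.avg_pool2d(X_h, 2))                # octave downsample, then stride-s conv
X_l2h = F.interpolate(conv_l2h(X_l), scale_factor=2)  # stride-s conv, then octave upsample
X_h_out = X_h2h + X_l2h                               # both 16x16
X_l_out = X_l2l + X_h2l                               # both 8x8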

there is a problem in "upsample"

  X_h = X_l2h + X_h2h
RuntimeError: The size of tensor a (6) must match the size of tensor b (7) at non-singleton dimension 3

because X_l comes from X_h through AvgPooling: when H×W = 7×7, the pooled output is 3×3, and after upsampling the result is 6×6, not 7×7.
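
A common workaround (an assumption on my part, not a fix from this repository) is to upsample to the exact spatial size of the high-frequency path instead of by a fixed scale factor:

import torch
import torch.nn.functional as F

X_h2h = torch.randn(1, 64, 7, 7)  # high-frequency path output
X_l2h = torch.randn(1, 64, 3, 3)  # low-frequency output before upsampling (7x7 pooled to 3x3)
# size= matches the target exactly, so odd sizes like 7x7 no longer break
X_l2h = F.interpolate(X_l2h, size=X_h2h.shape[-2:], mode='nearest')
X_h = X_l2h + X_h2h               # now both 7x7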

1d version

Thank you very much. I would welcome it if you could share a 1D version of OctConv as well.
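
For reference, a minimal sketch of what a 1D port might look like (my own illustration under the usual OctConv assumptions, not code from this repository):

import torch
import torch.nn as nn
import torch.nn.functional as F

class OctConv1d(nn.Module):
    # Minimal illustrative 1D octave convolution: channels are split into
    # high- and low-frequency groups by ratio alpha; the low-frequency path
    # runs at half temporal resolution.
    def __init__(self, in_ch, out_ch, kernel_size, alpha=0.5, padding=0):
        super().__init__()
        in_l, out_l = int(alpha * in_ch), int(alpha * out_ch)
        in_h, out_h = in_ch - in_l, out_ch - out_l
        self.h2h = nn.Conv1d(in_h, out_h, kernel_size, padding=padding)
        self.h2l = nn.Conv1d(in_h, out_l, kernel_size, padding=padding)
        self.l2l = nn.Conv1d(in_l, out_l, kernel_size, padding=padding)
        self.l2h = nn.Conv1d(in_l, out_h, kernel_size, padding=padding)
        self.pool = nn.AvgPool1d(2)

    def forward(self, x):
        X_h, X_l = x  # high- and low-frequency inputs
        X_h2h = self.h2h(X_h)
        X_h2l = self.h2l(self.pool(X_h))
        X_l2l = self.l2l(X_l)
        # upsample to the exact high-frequency length to avoid odd-size issues
        X_l2h = F.interpolate(self.l2h(X_l), size=X_h2h.shape[-1])
        return X_h2h + X_l2h, X_l2l + X_h2l

x_h, x_l = torch.randn(1, 32, 64), torch.randn(1, 32, 32)
oct1d = OctConv1d(64, 64, kernel_size=3, padding=1)
y_h, y_l = oct1d((x_h, x_l))  # shapes: [1, 32, 64] and [1, 32, 32]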

Handling different input and output number of channels in AdaptiveConv

In this line, shouldn't we pass x to gap? Otherwise, if the input and output channel counts differ, this throws an error, because x1 has a different number of channels than x, while the fc1 layer maps from the number of input channels to the number of output channels.

gap = self.gap(x1)

Please let me know if I am missing something.
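
To make the channel mismatch concrete, a standalone sketch (hypothetical shapes; gap and fc1 stand in for the layer's modules as described in the issue):

import torch
import torch.nn as nn

in_ch, out_ch = 64, 128
gap = nn.AdaptiveAvgPool2d(1)
fc1 = nn.Conv2d(in_ch, out_ch, kernel_size=1)  # maps in_channels -> out_channels

x = torch.randn(1, in_ch, 56, 56)    # layer input
x1 = torch.randn(1, out_ch, 56, 56)  # convolved features, already out_channels

y = fc1(gap(x))  # works: gap(x) has in_ch channels, matching fc1
# fc1(gap(x1))   # would raise: gap(x1) has out_ch channels, but fc1 expects in_ch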

training is slow

I tried to train oct_res50 on 4 TITAN X (Pascal) GPUs. It took around 0.9 s/iter, while the original res50 took around 0.4–0.5 s/iter. Is that reasonable?

the test speed

The test speed of oct101 is 17 fps, which is much slower than res101 (55 fps). Is that right?

Why bias sharing?

Hi, lxtGH.

Thanks for the implementation; however, there are some details that confuse me.

As defined under this line

X_h2h = F.conv2d(X_h, self.weights[0:end_h_y, 0:end_h_x, :,:], self.bias[0:end_h_y], 1,
                        self.padding, self.dilation, self.groups)

X_l2l = F.conv2d(X_l, self.weights[end_h_y:, end_h_x:, :,:], self.bias[end_h_y:], 1,
                        self.padding, self.dilation, self.groups)

X_h2l = F.conv2d(X_h2l, self.weights[end_h_y:, 0: end_h_x, :,:], self.bias[end_h_y:], 1,
                        self.padding, self.dilation, self.groups)

X_l2h = F.conv2d(X_l, self.weights[0:end_h_y, end_h_x:, :,:], self.bias[0:end_h_y], 1,
                        self.padding, self.dilation, self.groups)

Why do the calculations of X_h2h and X_l2h share the convolution bias? The same applies to X_l2l and X_h2l.

I didn't find any details about bias sharing in the paper. Is this sharing reasonable?

Thanks in advance.
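
One observation from the quoted code (not from the paper): since the high-frequency output is X_h = X_h2h + X_l2h, a bias slice shared by both paths is effectively added twice per output channel. An illustrative alternative that adds each bias exactly once is to attach the bias to only one path per output; in the sketch below, W_h2h, W_l2h, and all shapes are hypothetical, and the octave up/downsampling is omitted for brevity:

import torch
import torch.nn.functional as F

X_h, X_l = torch.randn(1, 32, 16, 16), torch.randn(1, 32, 16, 16)
W_h2h = torch.randn(64, 32, 3, 3)
W_l2h = torch.randn(64, 32, 3, 3)
bias_h = torch.randn(64)

X_h2h = F.conv2d(X_h, W_h2h, bias=bias_h, padding=1)  # bias added once, here
X_l2h = F.conv2d(X_l, W_l2h, bias=None, padding=1)    # no bias on the second path
X_h_out = X_h2h + X_l2h                               # carries bias_h exactly once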

Training results

Hello, I applied this in one of my own algorithms as a drop-in replacement for resnet50, keeping everything else unchanged, but I found that training became noticeably slower and the model actually converged worse. I don't know what the cause is.

Training Strategy

Recently, I have been running experiments with octave convolution. Should I expect the same performance as a vanilla convolutional neural network while keeping the training strategy unchanged, or might I have to change my training strategy in order to match the performance of the vanilla network?

softmax in AdaptiveConv class

First of all, thank you very much for the code; it has been very instructive. Here is a possible problem I found: in the line "weight = self.softmax(self.w)" in class AdaptiveConv of resnet_adaptiveconv.py, the softmax appears to have no effect on self.w, whose shape is torch.Size([3, 1, 64, 64]); after the operation, weight is still an all-ones matrix.
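
This can be checked in isolation (a standalone sketch; my assumption is that the softmax was meant to normalize across the 3 candidate branches in dim 0):

import torch
import torch.nn.functional as F

w = torch.ones(3, 1, 64, 64)
print(F.softmax(w, dim=1)[0, 0, 0, 0].item())  # 1.0: softmax over a size-1 dim is a no-op
print(F.softmax(w, dim=0)[0, 0, 0, 0].item())  # ~0.333: normalizes across the 3 branches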

Bug and error in Octave conv

When I used the model in libs/nn/OCtaveResnet.py, I found two things.

One is that the stride of the last Bottleneck is 2, which causes a bug when using this model: the input sizes of the last Bottleneck are 14×14 and 7×7, and after the conv block with stride=2 there are two tensors, 6×6 and 7×7, which cannot be added together, so the bug appears. Setting the stride of the last Bottleneck to 1 solves this, with a small change to the ResNet.

The other concerns the stride handling in Octaveconv2.py. I found that when stride=2, you still use a conv layer with stride=1 and apply pooling with stride=2 before the conv layer instead, which is quite different from a conv layer with stride=2. I think the latter is what the authors do in the original paper; the shape check below illustrates the difference.

Still, thanks for sharing your work!
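
The difference between the two stride handlings is easy to verify in isolation (a standalone sketch, not the repository's code):

import torch
import torch.nn as nn

x = torch.randn(1, 64, 7, 7)
conv_s2 = nn.Conv2d(64, 64, 3, stride=2, padding=1)          # strided conv directly
pool_then_conv = nn.Sequential(nn.AvgPool2d(2),
                               nn.Conv2d(64, 64, 3, stride=1, padding=1))
print(conv_s2(x).shape)         # torch.Size([1, 64, 4, 4])
print(pool_then_conv(x).shape)  # torch.Size([1, 64, 3, 3]) -- different size and values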

Environment

Hi @lxtGH ,

Thank you for your contribution. It would be nice if you could share the required library versions (e.g. the PyTorch version). Just a tiny piece of advice.

some bugs

exp/train_val_step_se_resnet50.sh: the last parameter name should be --warmup
libs/nn/res2net.py, line 57: missing parenthesis (generates a syntax error)

Non-linearity in Res2Net

@lxtGH, thanks for the great repo.

I just have a question regarding your implementation of Res2Net: each sub-convolution is followed by BN & ReLU.

https://github.com/lxtGH/OctaveConv_pytorch/blob/master/libs/nn/res2net.py#L57

However, I couldn't find any mention of a non-linearity in the paper, so it is a bit unclear to me. Have you tested the performance with and without the non-linearity? I think adding BN & ReLU to every sub-conv might slow down training.
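
For context, the pattern in question looks roughly like this (an illustrative sketch of Res2Net's hierarchical sub-convolutions with scale=4, not this repo's exact code; the marked line is the extra BN+ReLU being discussed):

import torch
import torch.nn as nn

scale, w = 4, 16
convs = nn.ModuleList(nn.Conv2d(w, w, 3, padding=1) for _ in range(scale - 1))
bns = nn.ModuleList(nn.BatchNorm2d(w) for _ in range(scale - 1))
relu = nn.ReLU(inplace=True)

x = torch.randn(2, scale * w, 56, 56)
xs = torch.chunk(x, scale, dim=1)  # split channels into `scale` groups
ys = [xs[0]]                       # the first group passes through unchanged
for i in range(1, scale):
    inp = xs[i] if i == 1 else xs[i] + ys[-1]  # hierarchical residual connection
    y = convs[i - 1](inp)
    y = relu(bns[i - 1](y))        # <- the per-sub-conv BN+ReLU in question
    ys.append(y)
out = torch.cat(ys, dim=1)         # shape: [2, 64, 56, 56]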

tensor dimension mismatching

Dear Sir, when I use nn.OCtaveResnet for image classification (input image size [224, 224, 3], batch_size=72), the shape of X_l2h doesn't match X_h2h (from OCtaveConv2 / OCtaveConv).


It seems something is wrong with the upsampling, and I don't know how to fix it. (This looks like the same odd-size rounding issue described under "there is a problem in 'upsample'" above, where an explicit-size interpolation workaround is sketched.)

Help with environment configuration

Hello, could you please share the environment your octconv runs in, i.e. the PyTorch version, the torchvision version, and the CUDA version? As soon as I try to run it, it reports that the transform module cannot be found. I hope to get your help.
