Comments (11)
The authors released their reference implementation: https://github.com/facebookresearch/OctConv
from octaveconv.
Hello, how do you handle a feature map of size 7x7?
When the feature map is 7x7, using Pooling(data, (2, 2), 'avg', stride=(2, 2))
reduces it to 3x3, and using UpSampling(avg_pool, scale=2, sample_type='nearest', num_args=1)
brings it back to 6x6, not 7x7. How do you handle this?
I read your code. It seems that you don't use OctConv in the last residual block?
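The size mismatch described above can be checked with a few lines of arithmetic. This is a minimal sketch, assuming unpadded 2x2 pooling with floor rounding and scale-2 nearest-neighbor upsampling; the helper names `avg_pool_size`, `nearest_upsample_size`, and `round_trip` are hypothetical and not part of any library.

```python
def avg_pool_size(n, kernel=2, stride=2):
    """Output length of unpadded pooling, assuming floor rounding."""
    return (n - kernel) // stride + 1


def nearest_upsample_size(n, scale=2):
    """Output length of nearest-neighbor upsampling by an integer scale."""
    return n * scale


def round_trip(n):
    """Return (low-resolution size, size after upsampling back)."""
    low = avg_pool_size(n)
    return low, nearest_upsample_size(low)


# Odd sizes lose one row/column on the round trip; even sizes do not.
for n in (7, 8, 14):
    print(n, round_trip(n))
```

Under these assumptions, any odd side length comes back one pixel short, so the upsampled low-frequency branch cannot be added to the high-frequency branch without extra padding or cropping.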
from octaveconv.
@PistonY Looks like the authors just set alpha=0 in the last ResNet stages, so that the 7x7 feature map is not downsampled to a lower resolution:
# ratio is forced to be 0. for the last stage
# (because do 3x3 conv on 3.5x3.5 resolution map does not make sense)
from octaveconv.
I searched for OctConv on GitHub and found that the other implementations are the same as this one. It seems that strided convolution and upsampling/downsampling need to be handled carefully when the size is odd. @andravin
from octaveconv.
I agree that we have to be careful when the strided convolution (or max-pool) filter size is odd, because the OctConv downsampling/upsampling filter size is even.
Just passing the stride to the HH, HL, LH, and LL convolutions inside OctConv would be the wrong thing to do (and at least one GitHub implementation does this), because it does not correct the half-pixel misalignment that is caused by downsampling with an even-sized filter.
Average pooling is the 2-tap filter [1 1]/2, and nearest neighbor upsampling is the filter [1 1]. Average pooling shifts the feature map by 1/2 pixel, and, with proper padding, nearest neighbor upsampling unshifts the feature map by the same amount.
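A small 1-D experiment can illustrate the shift argument. This is only a sketch: it uses a ramp signal whose values equal their positions, so the mean of `z[i] - i` after a downsample/upsample round trip measures the net shift in pixels. All function names are hypothetical, and plain subsampling stands in for a strided operation.

```python
def avg_pool_1d(x):
    """2-tap average pooling with stride 2, i.e. the filter [1 1]/2."""
    return [(x[i] + x[i + 1]) / 2 for i in range(0, len(x) - 1, 2)]


def subsample_1d(x):
    """Keep every second sample (stand-in for a stride-2 operation)."""
    return x[::2]


def nearest_upsample_1d(y):
    """Nearest-neighbor upsampling by 2, i.e. the filter [1 1]."""
    return [v for v in y for _ in range(2)]


def mean_offset(z):
    """Mean of z[i] - i; for a ramp input this is the net shift in pixels."""
    return sum(zi - i for i, zi in enumerate(z)) / len(z)


ramp = [float(i) for i in range(8)]

pooled_offset = mean_offset(nearest_upsample_1d(avg_pool_1d(ramp)))
strided_offset = mean_offset(nearest_upsample_1d(subsample_1d(ramp)))

print(pooled_offset)   # round trip through average pooling
print(strided_offset)  # round trip through plain subsampling
```

On the ramp, the average-pooling round trip has zero net offset, while the subsampling round trip is off by half a pixel, which is the misalignment discussed above.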
Of course, the authors were not required to choose even-size downsampling/upsampling filters. Laplacian pyramids almost always use odd-size filters so that there are no half-pixel phase shifts between layers.
There are still different choices you could make when implementing strided OctConv, and we could discuss each of them. But we still would not know exactly what the authors did and would not be confident that we were reproducing the paper accurately.
from octaveconv.
I believe it's stated in this part of the paper,
However, since the index of X_H can only be an integer, we could either round the index to (2∗p+i, 2∗q+j) or approximate the value at (2∗p+0.5+i, 2∗q+0.5+j) by averaging all 4 adjacent locations. The first one is also known as strided convolution and the second one as average pooling. As we discuss in Section 3.3 and Fig. 3, strided convolution leads to misalignment; we therefore use average pooling to approximate this value for the rest of the paper.
that average pooling is chosen over strided convolution for the rest of the paper. Please also see ThoroughImages/OctConv#1 (comment)
from octaveconv.
That does not actually say how they port Conv2d(stride=2) to OctConv(stride=2), which is the subject of this issue.
It is true, however, that mixing Conv2d(kernel_size=3, stride=2) with AvgPool(kernel_size=2, stride=2) is not good, because this creates 2 different half-scale grids that are shifted relative to each other.
So the authors probably did replace Conv2d(stride=2) with something like AvgPool(stride=2)->Conv2d(stride=1), as this repo does, but the exact formulation is not documented in the paper.
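The two half-scale grids can be made concrete by computing where each output sample is centered in input coordinates. This is a sketch under assumptions not stated in the thread: the 3-tap strided convolution uses padding 1, the 2-tap average pooling uses no padding, and the helper names are hypothetical.

```python
def strided_conv_centers(n, kernel=3, stride=2, pad=1):
    """Input-space center of each output sample of a strided convolution."""
    count = (n + 2 * pad - kernel) // stride + 1
    return [p * stride - pad + (kernel - 1) / 2 for p in range(count)]


def avg_pool_centers(n, kernel=2, stride=2):
    """Input-space center of each output sample of unpadded average pooling."""
    count = (n - kernel) // stride + 1
    return [p * stride + (kernel - 1) / 2 for p in range(count)]


n = 8
conv_grid = strided_conv_centers(n)  # centered on even input pixels
pool_grid = avg_pool_centers(n)      # centered between pixel pairs

print(conv_grid)
print(pool_grid)
print([b - a for a, b in zip(conv_grid, pool_grid)])
```

A stride-1, odd-sized convolution applied after the pooling keeps the pooled grid's centers, which is why AvgPool(stride=2)->Conv2d(stride=1) stays aligned with the low-frequency branch while a directly strided 3x3 convolution does not.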
from octaveconv.
@andravin Yeah, thanks. So it's not a totally plug-and-play "tool", and it's not suitable for feature maps with odd side lengths.
from octaveconv.
@PistonY I guess .. in my experiments, I avoided this issue by training with 256x256 input image resolution, so all layer sizes are a power of 2. If the filter was adapted to take the image boundary into account, one could downsample from 7x7 to 4x4. Or maybe it would make more sense to downsample the last 14x14 layer to 8x8.
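One way to read "adapted to take the image boundary into account" is sketched below in 1-D: the last pooling window is allowed to be shorter and is averaged over only its valid samples, and the upsampled result is cropped back to the original length. This is an illustrative assumption, not something taken from the paper or the reference implementation, and the function names are hypothetical.

```python
def avg_pool_ceil_1d(x):
    """Stride-2 average pooling that keeps a shorter final window.

    Each window is averaged over the samples it actually contains,
    so an odd-length input of 7 yields 4 outputs instead of 3.
    """
    windows = [x[i:i + 2] for i in range(0, len(x), 2)]
    return [sum(w) / len(w) for w in windows]


def nearest_upsample_crop_1d(y, target_len):
    """Nearest-neighbor upsampling by 2, cropped to the target length."""
    up = [v for v in y for _ in range(2)]
    return up[:target_len]


x = [float(i) for i in range(1, 8)]          # length 7
low = avg_pool_ceil_1d(x)                    # length 4
restored = nearest_upsample_crop_1d(low, len(x))

print(low)
print(restored)
```

With this variant the round trip returns a length-7 signal, so both branches can be summed, at the cost of treating the boundary sample differently from the interior.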
from octaveconv.
Closing this issue, because we can now look at the reference implementation to see how OctConv implements strided convolutions: https://github.com/facebookresearch/OctConv
from octaveconv.
Hello, how do you handle a feature map of size 7x7?
When the feature map is 7x7, using Pooling(data, (2, 2), 'avg', stride=(2, 2))
reduces it to 3x3, and using UpSampling(avg_pool, scale=2, sample_type='nearest', num_args=1)
brings it back to 6x6, not 7x7. How do you handle this?
I read your code. It seems that you don't use OctConv in the last residual block?
@PistonY Have you solved this problem? I ran into the same problem when dealing with the last octave conv layer, and I don't know how to handle it.
from octaveconv.