Comments (4)
FLOPs comparison does not make any sense without specifying input sizes. We recompute all FLOPs with the same input size for a fair comparison. The FLOPs reported in the Swin Transformer paper use a different input size.
from maskformer.
In the appendix of Swin-Transformer: "Swin-T, Swin-S are trained on the standard-setting as the previous approaches with an input of 512×512. Swin-B and Swin-L with ‡ indicate that these two models are pre-trained on ImageNet-22K, and trained with the input of 640×640".
In your paper, the crop size is also 512×512. What is the difference in input size between Swin and MaskFormer?
from maskformer.
Yes, we train our model using the same crop size as Swin Transformer. But the FLOPs calculation is separate from training and does not need to use the training crop size. We confirmed with the authors of Swin Transformer that they used a different input size to calculate FLOPs. Thus, we re-calculated FLOPs for all models at the same input size for a fair comparison.
from maskformer.
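To illustrate why the input size matters here, below is a minimal sketch that counts multiply-accumulates (MACs) for a single convolution layer at two resolutions. It is not the counting script used for either paper: the helper `conv2d_macs`, the 64-channel 3×3 layer, and the "same" padding are assumptions chosen only for illustration.

```python
def conv2d_macs(height, width, c_in, c_out, kernel=3, stride=1):
    """Count multiply-accumulates for one conv layer.

    Hypothetical helper for illustration; assumes 'same' padding,
    so the output size is ceil(input / stride).
    """
    out_h = -(-height // stride)  # ceiling division
    out_w = -(-width // stride)
    # Each output element needs c_in * kernel * kernel MACs per output channel.
    return out_h * out_w * c_out * c_in * kernel * kernel


# Same layer, two input sizes (the two sizes quoted from the Swin appendix).
macs_512 = conv2d_macs(512, 512, c_in=64, c_out=64)
macs_640 = conv2d_macs(640, 640, c_in=64, c_out=64)
ratio = macs_640 / macs_512

print(f"512x512: {macs_512 / 1e9:.2f} GMACs")
print(f"640x640: {macs_640 / 1e9:.2f} GMACs")
print(f"ratio:   {ratio:.4f}")
```

For a convolution the cost grows with the number of output pixels, so the same layer costs (640/512)² ≈ 1.56 times more at 640×640 than at 512×512. Numbers measured at different input sizes are therefore not directly comparable, whatever crop size was used for training.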