Comments (4)
Thanks for your reply and suggestions; they help me a lot. I will run more experiments and try other hyperparameters.
from asl.
This is not the original code (I cannot release it for commercial reasons), but a reproduction from the community.
We didn't do any major hyperparameter optimization for the reproduction code, and you are welcome to contribute. Try playing with the EMA decay factor, the learning rate schedule's pct_start, and the number of epochs; I predict you could easily surpass 86.0.
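To make the "EMA decay factor" knob concrete, here is a minimal, framework-free sketch of an exponential moving average over model weights. It is not the repository's EMA implementation; the function names, the toy single-weight "model", and the decay values 0.9 and 0.999 are illustrative assumptions.

```python
def ema_update(ema_weights, model_weights, decay):
    """One EMA step: new_ema = decay * ema + (1 - decay) * current weights."""
    return [decay * e + (1.0 - decay) * w
            for e, w in zip(ema_weights, model_weights)]


def effective_window(decay):
    """Rough number of recent steps the average spans: 1 / (1 - decay)."""
    return 1.0 / (1.0 - decay)


# Toy "training run" (hypothetical): one weight drifting from 0.0 to 1.0.
ema_fast = [0.0]   # decay=0.9, averages over roughly the last 10 steps
ema_slow = [0.0]   # decay=0.999, averages over roughly the last 1000 steps
for step in range(1, 101):
    weights = [step / 100.0]
    ema_fast = ema_update(ema_fast, weights, decay=0.9)
    ema_slow = ema_update(ema_slow, weights, decay=0.999)

# With only 100 steps, the slow average has barely moved from its start,
# while the fast one follows the current weights closely.
print(ema_fast[0], ema_slow[0])
```

The trade-off when tuning is visible here: a decay close to 1 gives a smoother average but needs many more training steps before the averaged weights catch up with the model, so the best value depends on how many steps the run has.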
I am aware of two code-level differences: the original code had better augmentations than RandAugment and a better EMA implementation. There could be other differences.
Anyway, 85.5 is not far from the article's score. This result is significantly higher than the previous SOTA, and it was achieved with a faster model and significantly less training time.
Try training with ASL and compare to CE, and you will see the advantage of our proposed loss over the losses commonly used in multi-label classification.
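For readers who want to see what the ASL-versus-CE comparison means per label, below is a small scalar sketch. It follows the asymmetric loss formulation as described in the ASL paper (separate focusing exponents for positives and negatives, plus a probability margin on negatives). It is not the repository's code, and the parameter values are illustrative assumptions.

```python
import math


def bce_loss(p, y, eps=1e-8):
    """Binary cross-entropy for one label with predicted probability p."""
    if y == 1:
        return -math.log(max(p, eps))
    return -math.log(max(1.0 - p, eps))


def asl_loss(p, y, gamma_pos=0.0, gamma_neg=4.0, margin=0.05, eps=1e-8):
    """Asymmetric loss for one label (illustrative parameter values).

    Positives: (1 - p)^gamma_pos * -log(p)
    Negatives: p_m^gamma_neg * -log(1 - p_m), with p_m = max(p - margin, 0)
    """
    if y == 1:
        return -((1.0 - p) ** gamma_pos) * math.log(max(p, eps))
    p_m = max(p - margin, 0.0)  # shift: very easy negatives are dropped
    return -(p_m ** gamma_neg) * math.log(max(1.0 - p_m, eps))


# Easy negative (p below the margin): ASL ignores it, BCE still counts it.
print(asl_loss(0.03, 0), bce_loss(0.03, 0))
# Moderately easy negative: ASL down-weights it strongly.
print(asl_loss(0.2, 0), bce_loss(0.2, 0))
# Positive with gamma_pos=0: ASL coincides with BCE.
print(asl_loss(0.9, 1), bce_loss(0.9, 1))
```

In multi-label data most labels are negative for any given image, so shrinking the contribution of easy negatives is where the two losses differ most; an actual comparison still requires training with each loss.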
@SlongLiu
(1) I updated the hyperparameters (just lr and epoch adjustments). The run now fully reproduces the article's results, 86.52; see the attached log file:
ltresnet_86.52.zip
(2) I fixed the paths following the suggestions, although I am not sure it will work on all machines.
Thanks very much for sharing and for your help!