Comments (4)
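The seg directory is very old code and has not been maintained for a year and a half. If you want to run something now, I recommend seg_light_ham, which is based on MMSeg and can use VAN as the backbone. Its performance is strong; see the 2022.03.26 update.
With seg_light_ham, you can choose VAN-Tiny or VAN-Small as the backbone and Ham or Light-Ham as the head, and train on ADE20K. I think there is a good chance it fits on one GPU, especially with Light-Ham, which is very lightweight. Note that, as far as I remember, the DDP setup in dist_train.sh in that directory allows single-GPU runs: just set the number of GPUs to 1. You probably do not need to change norm_cfg either, since SyncBN should work on a single GPU. If one GPU does not have enough memory, try changing the overall output stride from 8 to 16 in the config, which further reduces the training memory cost.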
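To give a rough sense of why moving the output stride from 8 to 16 helps, here is a small back-of-the-envelope sketch. It only counts spatial positions in the final feature map; the 512x512 crop size and the helper name `feature_map_positions` are assumptions for illustration and are not taken from the repository config.

```python
def feature_map_positions(crop_size, output_stride):
    """Number of spatial positions (H * W) in the final feature map."""
    height, width = crop_size
    # Ceiling division, assuming padding that keeps partial windows.
    feat_h = -(-height // output_stride)
    feat_w = -(-width // output_stride)
    return feat_h * feat_w


crop = (512, 512)  # assumed crop size, not read from any config file
pos_os8 = feature_map_positions(crop, 8)
pos_os16 = feature_map_positions(crop, 16)

# Ratio of spatial positions the head has to process at stride 8 vs. 16.
print(pos_os8, pos_os16, pos_os8 / pos_os16)
```

Under these assumptions the head sees four times fewer positions at stride 16, so activations that scale with H * W should shrink accordingly. Actual memory savings depend on the rest of the network.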
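If you use the old seg code, I remember a PASCAL VOC setting that is reproducible on a single GPU and fairly fast. You can use ResNet-50, stride 16, batch size 12, and 30K iterations for ablations. This runs with the seg directory on a single 11 GB GPU. I suggest running 3 to 4 trials in this setting and using average + best to judge whether a further modification on top of Ham is effective.
These should be set in settings.py: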
RUN_FOR_TEST = False
TRAIN_BATCH_SIZE = 12
ITER_MAX = 30000
N_LAYERS = 50
STRIDE = 16
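The "average + best" comparison over several runs can be sketched as follows. This is only an illustration: the helper names `summarize_runs` and `is_improvement` are hypothetical, and the rule that a change counts as effective only when both the average and the best mIoU improve is an assumption, not something the seg code enforces.

```python
from statistics import mean


def summarize_runs(miou_per_run):
    """Average and best mIoU over repeated runs of one setting."""
    if not miou_per_run:
        raise ValueError("need at least one run")
    return {
        "average": mean(miou_per_run),
        "best": max(miou_per_run),
        "runs": len(miou_per_run),
    }


def is_improvement(baseline_runs, modified_runs):
    """Assumed criterion: both average and best must go up."""
    base = summarize_runs(baseline_runs)
    mod = summarize_runs(modified_runs)
    return mod["average"] > base["average"] and mod["best"] > base["best"]


# Placeholder numbers for illustration only, not measured results.
baseline = [70.0, 70.5, 69.5]
modified = [70.5, 71.0, 70.0]
print(summarize_runs(baseline))
print(is_improvement(baseline, modified))
```

With only 3 to 4 runs per setting the average is still noisy, so small differences are best treated with caution.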
You can add me on WeChat (see assets). Feel free to keep asking if you have questions!
from enjoy-hamburger.
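I am using seg_light_ham, with van_base.
What I mean is that it runs fine with dist_train.sh and gpus=1, and on my own dataset it reached 82.5 mIoU on the first try, which is a very good result.
So I would like to study this model in more depth by running train.py directly instead of the .sh script (because train.py can be debugged step by step), but it keeps failing with: Default process group has not been initialized, please make sure to call init_process_group.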
from enjoy-hamburger.
OK, I understand your problem now. I found the following links that are related to it:
facebookresearch/detectron2#3972
pytorch/pytorch#63662
open-mmlab/mmsegmentation#772
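It looks like PyTorch does not yet support DDP + SyncBN training on a single GPU, and MMSeg chooses to switch SyncBN back to BN in that case (see code). So my understanding is that you should manually set distributed to False in train.py, or otherwise make that variable evaluate to False (see the code here). It is worth a try.
I am using seg_light_ham/train.py, not the more deeply nested seg_light_ham/tools/train.py.
Thanks for your interest! I hope this helps!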
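As a complement to setting distributed to False, the SyncBN entries can also be swapped for plain BN at the config level before the model is built. The sketch below is a plain-Python illustration on a nested dict, not the code MMSeg itself uses; the helper name `syncbn_to_bn` and the example config keys and type names are hypothetical. It relies only on the MMSeg config convention of `norm_cfg = dict(type='SyncBN', ...)`.

```python
import copy


def syncbn_to_bn(cfg):
    """Return a deep copy of cfg with every type 'SyncBN' replaced by 'BN'."""
    cfg = copy.deepcopy(cfg)

    def _walk(node):
        if isinstance(node, dict):
            if node.get("type") == "SyncBN":
                node["type"] = "BN"
            for value in node.values():
                _walk(value)
        elif isinstance(node, (list, tuple)):
            for item in node:
                _walk(item)

    _walk(cfg)
    return cfg


# Hypothetical config fragment; the real keys and type names may differ.
model_cfg = dict(
    backbone=dict(type="VAN", norm_cfg=dict(type="SyncBN", requires_grad=True)),
    decode_head=dict(
        type="LightHamHead", norm_cfg=dict(type="SyncBN", requires_grad=True)
    ),
)

debug_cfg = syncbn_to_bn(model_cfg)
print(debug_cfg["decode_head"]["norm_cfg"])
```

The original config is left untouched, so the same file can still be used with dist_train.sh for real training while the converted copy is used for step-by-step debugging.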
from enjoy-hamburger.
Thank you for the careful and detailed answer!!! I will keep trying.
from enjoy-hamburger.