Comments (4)

Gsunshine commented on August 16, 2024

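The seg directory is very old code and has not been maintained for about a year and a half. If you want to run something now, I recommend the MMSeg-based seg_light_ham, which can be combined with VAN as the backbone. Its performance is strong; see the 2022.03.26 update.

With seg_light_ham you can choose VAN-Tiny or VAN-Small as the backbone and Ham or Light-Ham as the head, and train on ADE20K. I think there is a good chance this fits on a single GPU, especially with Light-Ham, which is very lightweight. Note that the DDP launched by dist_train.sh in this directory can run on a single GPU (as far as I remember): just set the number of GPUs to 1. You should not need to modify norm_cfg either, since SyncBN can work on a single GPU. If single-GPU memory is tight, try changing the overall output stride from 8 to 16 in the config, which further reduces the memory cost of training.

If you use the old seg code, I remember a PASCAL VOC setting that is reproducible on a single GPU and fairly fast. You can use ResNet-50, stride 16, batch size 12, and 30K iterations for ablations. This runs on a single 11 GB GPU with the seg directory. I suggest running 3 to 4 trials under this setting and using average + best to judge whether a further modification on top of ham is actually effective.

These should be set in settings.py: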
RUN_FOR_TEST = False
TRAIN_BATCH_SIZE = 12
ITER_MAX = 30000
N_LAYERS = 50
STRIDE = 16
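
The "average + best" comparison over repeated runs can be sketched as below. This is only an illustration: the function names are hypothetical, the numbers are made-up placeholders rather than measured results, and the rule "both average and best must improve" is one possible reading of the suggestion, not something fixed by the repo.

```python
from statistics import mean


def summarize_runs(miou_per_run):
    """Return (average, best) mIoU over repeated runs of one setting."""
    if not miou_per_run:
        raise ValueError("need at least one run")
    return mean(miou_per_run), max(miou_per_run)


def is_improvement(baseline_runs, modified_runs):
    """Assumed rule: count a change as helpful only if both the
    average and the best mIoU are higher than the baseline's."""
    base_avg, base_best = summarize_runs(baseline_runs)
    mod_avg, mod_best = summarize_runs(modified_runs)
    return mod_avg > base_avg and mod_best > base_best


if __name__ == "__main__":
    # Placeholder values only, NOT real results from the paper or repo.
    baseline = [70.0, 72.0, 74.0]
    modified = [71.0, 73.0, 75.0]
    print(summarize_runs(baseline))
    print(is_improvement(baseline, modified))
```

Requiring both statistics to improve is deliberately conservative: with only 3 to 4 runs, a single lucky run can raise the best score without the change being reliably better.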

You can add the WeChat contact in assets. Feel free to keep asking if you have more questions!

from enjoy-hamburger.

m828 commented on August 16, 2024

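I am using seg_light_ham, with van_base as the backbone.
What I mean is that it already runs with dist_train.sh and gpus=1, and on my own dataset it reached 82.5 mIoU on the first try, which is a very good result.
So I would like to study this model in more depth, and I want to launch it through train.py rather than the .sh script (because train.py can be debugged step by step). However, it keeps failing with the error "Default process group has not been initialized, please make sure to call init_process_group."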

Gsunshine commented on August 16, 2024

OK, I understand your problem now. I found the following links that are related to it:
facebookresearch/detectron2#3972
pytorch/pytorch#63662
open-mmlab/mmsegmentation#772

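It looks like PyTorch does not yet support DDP + SyncBN training on a single GPU, and MMSeg chooses to convert SyncBN back to BN in that case (see the code). So my understanding is that you should either manually set distributed to False in train.py, or make sure that variable ends up as False (see the code here). Give that a try.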
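
To make the two ideas above concrete, here is a minimal pure-Python sketch. It is not MMSeg's implementation: resolve_distributed and revert_sync_bn are hypothetical helper names, and the config fragment is illustrative only. It assumes the common convention that a launcher value of "none" means a plain non-distributed run, and it applies the SyncBN-to-BN swap to a nested config dict rather than to a built model.

```python
import copy


def resolve_distributed(launcher):
    """Assumed convention: launcher 'none' means a plain,
    non-distributed run, so no process group is required."""
    return launcher != "none"


def revert_sync_bn(cfg):
    """Return a deep copy of a nested dict/list config in which every
    entry with type 'SyncBN' is replaced by type 'BN'."""
    cfg = copy.deepcopy(cfg)

    def _walk(node):
        if isinstance(node, dict):
            if node.get("type") == "SyncBN":
                node["type"] = "BN"
            for value in node.values():
                _walk(value)
        elif isinstance(node, (list, tuple)):
            for value in node:
                _walk(value)

    _walk(cfg)
    return cfg


if __name__ == "__main__":
    # Hypothetical config fragment; the keys and type names are illustrative.
    model_cfg = {
        "backbone": {"type": "VAN", "norm_cfg": {"type": "SyncBN", "requires_grad": True}},
        "decode_head": {"type": "LightHamHead", "norm_cfg": {"type": "SyncBN", "requires_grad": True}},
    }
    if not resolve_distributed("none"):
        model_cfg = revert_sync_bn(model_cfg)
    print(model_cfg["backbone"]["norm_cfg"]["type"])
```

Working on a copy keeps the original config untouched, so the same file can still be used unchanged with dist_train.sh for real multi-GPU training.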

I use seg_light_ham/train.py, not the one further down under tools, seg_light_ham/tools/train.py.

Thanks for your interest! I hope this helps!

m828 commented on August 16, 2024

Thank you for the detailed answer! I will keep trying.
