Comments (4)

limbo0000 commented on August 11, 2024

@chenyihang1993
Thanks for using our codebase. Could you please share the config file you are using so that I can check it for you?

from tpn.

chenyihang1993 commented on August 11, 2024

The config file is:

model = dict(
    type='TSN2D',
    backbone=dict(
        type='ResNet',
        pretrained='modelzoo://resnet50',
        depth=50,
        nsegments=8,
        out_indices=(2, 3),
        tsm=True,
        bn_eval=False,
        partial_bn=False),
    necks=dict(
        type='TPN',
        in_channels=[1024, 2048],
        out_channels=1024,
        spatial_modulation_config=dict(
            inplanes=[1024, 2048],
            planes=2048,
        ),
        temporal_modulation_config=dict(
            scales=(8, 8),
            param=dict(
                inplanes=-1,
                planes=-1,
                downsample_scale=-1,
            )),
        upsampling_config=dict(
            scale=(1, 1, 1),
        ),
        downsampling_config=dict(
            scales=(1, 1, 1),
            param=dict(
                inplanes=-1,
                planes=-1,
                downsample_scale=-1,
            )),
        level_fusion_config=dict(
            in_channels=[1024, 1024],
            mid_channels=[1024, 1024],
            out_channels=2048,
            ds_scales=[(1, 1, 1), (1, 1, 1)],
        ),
        aux_head_config=dict(
            inplanes=-1,
            planes=174,
            loss_weight=0.5
        ),
    ),
    spatial_temporal_module=dict(
        type='SimpleSpatialModule',
        spatial_type='avg',
        spatial_size=7),
    segmental_consensus=dict(
        type='SimpleConsensus',
        consensus_type='avg'),
    cls_head=dict(
        type='ClsHead',
        with_avg_pool=False,
        temporal_feature_size=1,
        spatial_feature_size=1,
        dropout_ratio=0.5,
        in_channels=2048,
        num_classes=174))
train_cfg = None
test_cfg = None
# dataset settings
dataset_type = 'RawFramesDataset'
data_root = ''
data_root_val = ''

img_norm_cfg = dict(
    mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True)

data = dict(
    videos_per_gpu=8,
    workers_per_gpu=8,
    train=dict(
        type=dataset_type,
        ann_file='/DATA/disk1/cyh/data/action_recognization/NTU/labels/train_subjects_videofolder.txt',
        img_prefix=data_root,
        img_norm_cfg=img_norm_cfg,
        num_segments=8,
        new_length=1,
        new_step=1,
        random_shift=True,
        modality='RGB',
        image_tmpl='{:05d}.jpg',
        img_scale=256,
        input_size=224,
        flip_ratio=0.5,
        resize_keep_ratio=True,
        resize_crop=True,
        color_jitter=True,
        color_space_aug=True,
        oversample=None,
        max_distort=1,
        test_mode=False),
    val=dict(
        type=dataset_type,
        ann_file='/DATA/disk1/cyh/data/action_recognization/NTU/labels/val_subjects_videofolder.txt',
        img_prefix=data_root_val,
        img_norm_cfg=img_norm_cfg,
        num_segments=8,
        new_length=1,
        new_step=1,
        random_shift=False,
        modality='RGB',
        image_tmpl='{:05d}.jpg',
        img_scale=256,
        input_size=224,
        flip_ratio=0,
        resize_keep_ratio=True,
        oversample=None,
        test_mode=False),
    test=dict(
        type=dataset_type,
        ann_file='/DATA/disk1/cyh/data/action_recognization/NTU/labels/train_subjects_videofolder.txt',
        img_prefix=data_root_val,
        img_norm_cfg=img_norm_cfg,
        num_segments=16,
        new_length=1,
        new_step=1,
        random_shift=False,
        modality='RGB',
        image_tmpl='{:05d}.jpg',
        img_scale=256,
        input_size=256,
        flip_ratio=0,
        resize_keep_ratio=True,
        oversample="three_crop",
        test_mode=True))
# optimizer
optimizer = dict(type='SGD', lr=0.001, momentum=0.9, weight_decay=0.0005, nesterov=True)
optimizer_config = dict(grad_clip=dict(max_norm=20, norm_type=2))
# learning policy
lr_config = dict(
    policy='step',
    step=[10,20,30,40,50])
checkpoint_config = dict(interval=1)
workflow = [('train', 1)]
# yapf:disable
log_config = dict(
    interval=20,
    hooks=[
        dict(type='TextLoggerHook'),
        # dict(type='TensorboardLoggerHook')
    ])
# yapf:enable
# runtime settings
total_epochs = 50
dist_params = dict(backend='nccl')
log_level = 'INFO'
load_from = 'ckpt/sthv1_tpn.pth'
resume_from = None

That is very kind of you, thanks!


limbo0000 commented on August 11, 2024

Hi @chenyihang1993
BTW, were you able to run the quick demo successfully?


limbo0000 commented on August 11, 2024

Hi @chenyihang1993
I have re-created a conda env and re-installed our codebase following INSTALL. The codebase works and the quick demo runs well. Note that I pinned mmcv to the required version. FYI, my pytorch and torchvision versions are 1.1.0 and 0.3.0 respectively. I recommend using anaconda to manage your env, re-installing our codebase following the doc, and trying the demo first.
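To compare an environment against the versions mentioned above, something like the following might help. It is only a rough sketch and not part of the codebase: the helper names `installed_versions` and `find_mismatches` are made up for illustration, mmcv is left out because its required version is not stated in this thread, and `importlib.metadata` assumes Python 3.8 or newer.

```python
from importlib import metadata  # assumption: Python 3.8+

# Versions reported in this thread. mmcv is omitted on purpose because
# its required version is not given here (see INSTALL for that).
EXPECTED = {"torch": "1.1.0", "torchvision": "0.3.0"}


def installed_versions(packages):
    """Return {package: version string, or None if it is not installed}."""
    found = {}
    for name in packages:
        try:
            found[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            found[name] = None
    return found


def find_mismatches(expected, found):
    """Return {package: (expected, found)} for packages whose version differs."""
    mismatches = {}
    for name, want in expected.items():
        have = found.get(name)
        # Ignore local build suffixes such as "+cu100" when comparing.
        base = have.split("+")[0] if have else None
        if base != want:
            mismatches[name] = (want, have)
    return mismatches


if __name__ == "__main__":
    report = find_mismatches(EXPECTED, installed_versions(EXPECTED))
    for name, (want, have) in report.items():
        print(f"{name}: expected {want}, found {have}")
```

If nothing is printed, the installed pytorch and torchvision match the versions I tested with; a mismatch does not necessarily explain the problem, but it is a cheap thing to rule out before digging into the config.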

I will close this issue since it works in my testing. Feel free to reopen it and let me know if you still hit the problem.
