Thanks for contributing this amazing repo! I found that all your configs use ImageNet

Can I use CIFAR10 dataset to substitute ImageNet dataset? (mmselfsup issue, closed)

etbox commented on August 16, 2024

Can I use CIFAR10 dataset to substitute ImageNet dataset?

from mmselfsup.

Comments (6)

XiaohangZhan commented on August 16, 2024

You only need to change data_source_cfg in the config; leave everything else unchanged. You may use SGD if the batch size is small, and you may also adjust hyperparameters such as lr.

etbox commented on August 16, 2024

Thanks for your reply. Following your instructions, I reset my config and changed only data_source_cfg, but it still does not work.
The log is shown below:

(open-mmlab) lhy@mustdl2:/disk1/lhy/Documents/github/OpenSelfSup$ bash tools/dist_train.sh configs/selfsup/byol/r50_cifar.py 2
*****************************************
Setting OMP_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed. 
*****************************************
2020-09-01 15:35:03,052 - openselfsup - INFO - Environment info:
------------------------------------------------------------
sys.platform: linux
Python: 3.7.6 (default, Jan  8 2020, 19:59:22) [GCC 7.3.0]
CUDA available: True
CUDA_HOME: /usr/local/cuda-8.0
NVCC: Cuda compilation tools, release 8.0, V8.0.61
GPU 0,1: GeForce GTX 1080 Ti
GCC: gcc (Ubuntu 5.4.0-6ubuntu1~16.04.10) 5.4.0 20160609
PyTorch: 1.5.1
PyTorch compiling details: PyTorch built with:
  - GCC 7.3
  - C++ Version: 201402
  - Intel(R) Math Kernel Library Version 2020.0.0 Product Build 20191122 for Intel(R) 64 architecture applications
  - Intel(R) MKL-DNN v0.21.1 (Git Hash 7d2fd500bc78936d1d648ca713b901012f470dbc)
  - OpenMP 201511 (a.k.a. OpenMP 4.5)
  - NNPACK is enabled
  - CPU capability usage: AVX2
  - CUDA Runtime 10.1
  - NVCC architecture flags: -gencode;arch=compute_37,code=sm_37;-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_61,code=sm_61;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=sm_75;-gencode;arch=compute_37,code=compute_37
  - CuDNN 7.6.3
  - Magma 2.5.2
  - Build settings: BLAS=MKL, BUILD_TYPE=Release, CXX_FLAGS= -Wno-deprecated -fvisibility-inlines-hidden -fopenmp -DNDEBUG -DUSE_FBGEMM -DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK -DUSE_XNNPACK -DUSE_INTERNAL_THREADPOOL_IMPL -O2 -fPIC -Wno-narrowing -Wall -Wextra -Werror=return-type -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-sign-compare -Wno-unused-parameter -Wno-unused-variable -Wno-unused-function -Wno-unused-result -Wno-strict-overflow -Wno-strict-aliasing -Wno-error=deprecated-declarations -Wno-stringop-overflow -Wno-error=pedantic -Wno-error=redundant-decls -Wno-error=old-style-cast -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Wno-stringop-overflow, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, USE_CUDA=ON, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=ON, USE_MKLDNN=ON, USE_MPI=OFF, USE_NCCL=ON, USE_NNPACK=ON, USE_OPENMP=ON, USE_STATIC_DISPATCH=OFF, 

TorchVision: 0.6.0a0+35d732a
OpenCV: 4.3.0
MMCV: 1.0.3
OpenSelfSup: 0.2.0+dbfc6b1
------------------------------------------------------------

2020-09-01 15:35:03,053 - openselfsup - INFO - Distributed training: True
2020-09-01 15:35:03,053 - openselfsup - INFO - Config:
/disk1/lhy/Documents/github/OpenSelfSup/configs/base.py
train_cfg = {}
test_cfg = {}
optimizer_config = dict()  # grad_clip, coalesce, bucket_size_mb
# yapf:disable
log_config = dict(
    interval=50,
    hooks=[
        dict(type='TextLoggerHook'),
        dict(type='TensorboardLoggerHook')
    ])
# yapf:enable
# runtime settings
dist_params = dict(backend='nccl')
cudnn_benchmark = True
log_level = 'INFO'
load_from = None
resume_from = None
workflow = [('train', 1)]

/disk1/lhy/Documents/github/OpenSelfSup/configs/selfsup/byol/r50_cifar.py
import copy
_base_ = '../../base.py'
# Model settings
model = dict(
    type='BYOL',
    pretrained=None,
    base_momentum=0.996,
    backbone=dict(
        type='ResNet',
        depth=50,
        in_channels=3,
        out_indices=[4],  # 0: conv-1, x: stage-x
        norm_cfg=dict(type='BN')),
    neck=dict(
        type='NonLinearNeckV2',
        in_channels=2048,
        hid_channels=4096,
        out_channels=256,
        with_avg_pool=True),
    head=dict(type='LatentPredictHead',
              size_average=True,
              predictor=dict(type='NonLinearNeckV2',
                             in_channels=256, hid_channels=4096,
                             out_channels=256, with_avg_pool=False)))
# Dataset settings
data_source_cfg = dict(type='Cifar10', root='data')
# data_source_cfg = dict(
#     type='ImageNet',
#     memcached=True,
#     mclient_path='/mnt/lustre/share/memcached_client')
# data_train_list = 'data/imagenet/meta/train.txt'
# data_train_root = 'data/imagenet/train'
dataset_type = 'BYOLDataset'
img_norm_cfg = dict(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
train_pipeline = [
    dict(type='RandomResizedCrop', size=224, interpolation=3), # bicubic
    dict(type='RandomHorizontalFlip'),
    dict(
        type='RandomAppliedTrans',
        transforms=[
            dict(
                type='ColorJitter',
                brightness=0.4,
                contrast=0.4,
                saturation=0.2,
                hue=0.1)
        ],
        p=0.8),
    dict(type='RandomGrayscale', p=0.2),
    dict(
        type='RandomAppliedTrans',
        transforms=[
            dict(
                type='GaussianBlur',
                sigma_min=0.1,
                sigma_max=2.0,
                kernel_size=23)
        ],
        p=1.),
    dict(type='RandomAppliedTrans',
         transforms=[dict(type='Solarization')], p=0.),
    dict(type='ToTensor'),
    dict(type='Normalize', **img_norm_cfg),
]
train_pipeline1 = copy.deepcopy(train_pipeline)
train_pipeline2 = copy.deepcopy(train_pipeline)
train_pipeline2[4]['p'] = 0.1 # gaussian blur
train_pipeline2[5]['p'] = 0.2 # solarization

data = dict(
    imgs_per_gpu=32,  # total 32*8=256
    workers_per_gpu=4,
    train=dict(
        type=dataset_type,
        data_source=dict(
            # list_file=data_train_list, root=data_train_root,
            **data_source_cfg),
        pipeline1=train_pipeline1,
        pipeline2=train_pipeline2))
# Additional hooks
custom_hooks = [
    dict(type='BYOLHook', end_momentum=1.)
]
# Optimizer
optimizer = dict(type='SGD', lr=0.1, momentum=0.9, weight_decay=0.0005)
# optimizer = dict(type='LARS', lr=0.2, weight_decay=0.0000015, momentum=0.9,
#                  paramwise_options={
#                     '(bn|gn)(\d+)?.(weight|bias)': dict(weight_decay=0., lars_exclude=True),
#                     'bias': dict(weight_decay=0., lars_exclude=True)})
# Learning policy
lr_config = dict(
    policy='CosineAnnealing',
    min_lr=0.,
    warmup='linear',
    warmup_iters=2,
    warmup_ratio=0.0001, # cannot be 0
    warmup_by_epoch=True)
checkpoint_config = dict(interval=10)
# Runtime settings
total_epochs = 200

2020-09-01 15:35:03,053 - openselfsup - INFO - Set random seed to 0, deterministic: False
Traceback (most recent call last):
  File "tools/train.py", line 142, in <module>
    main()
  File "tools/train.py", line 124, in main
    datasets = [build_dataset(cfg.data.train)]
  File "/disk1/lhy/Documents/github/OpenSelfSup/openselfsup/datasets/builder.py", line 37, in build_dataset
    dataset = build_from_cfg(cfg, DATASETS, default_args)
  File "/disk1/lhy/Documents/github/OpenSelfSup/openselfsup/utils/registry.py", line 79, in build_from_cfg
    return obj_cls(**args)
  File "/disk1/lhy/Documents/github/OpenSelfSup/openselfsup/datasets/byol.py", line 18, in __init__
    self.data_source = build_datasource(data_source)
  File "/disk1/lhy/Documents/github/OpenSelfSup/openselfsup/datasets/builder.py", line 43, in build_datasource
    return build_from_cfg(cfg, DATASOURCES)
  File "/disk1/lhy/Documents/github/OpenSelfSup/openselfsup/utils/registry.py", line 79, in build_from_cfg
    return obj_cls(**args)
TypeError: __init__() missing 1 required positional argument: 'split'
Traceback (most recent call last):
  File "tools/train.py", line 142, in <module>
    main()
  File "tools/train.py", line 124, in main
    datasets = [build_dataset(cfg.data.train)]
  File "/disk1/lhy/Documents/github/OpenSelfSup/openselfsup/datasets/builder.py", line 37, in build_dataset
    dataset = build_from_cfg(cfg, DATASETS, default_args)
  File "/disk1/lhy/Documents/github/OpenSelfSup/openselfsup/utils/registry.py", line 79, in build_from_cfg
    return obj_cls(**args)
  File "/disk1/lhy/Documents/github/OpenSelfSup/openselfsup/datasets/byol.py", line 18, in __init__
    self.data_source = build_datasource(data_source)
  File "/disk1/lhy/Documents/github/OpenSelfSup/openselfsup/datasets/builder.py", line 43, in build_datasource
    return build_from_cfg(cfg, DATASOURCES)
  File "/disk1/lhy/Documents/github/OpenSelfSup/openselfsup/utils/registry.py", line 79, in build_from_cfg
    return obj_cls(**args)
TypeError: __init__() missing 1 required positional argument: 'split'
Traceback (most recent call last):
  File "/disk1/lhy/Applications/anaconda3/envs/open-mmlab/lib/python3.7/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/disk1/lhy/Applications/anaconda3/envs/open-mmlab/lib/python3.7/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/disk1/lhy/Applications/anaconda3/envs/open-mmlab/lib/python3.7/site-packages/torch/distributed/launch.py", line 263, in <module>
    main()
  File "/disk1/lhy/Applications/anaconda3/envs/open-mmlab/lib/python3.7/site-packages/torch/distributed/launch.py", line 259, in main
    cmd=cmd)
subprocess.CalledProcessError: Command '['/disk1/lhy/Applications/anaconda3/envs/open-mmlab/bin/python', '-u', 'tools/train.py', '--local_rank=1', 'configs/selfsup/byol/r50_cifar.py', '--work_dir', 'work_dirs/selfsup/byol/r50_cifar/', '--seed', '0', '--launcher', 'pytorch']' returned non-zero exit status 1.

Should I change your code in /openselfsup?

XiaohangZhan commented on August 16, 2024

The bug is obvious from the log: __init__() is missing the required argument 'split'. The "data_source" dict under data.train in the config must pass a "split" argument. You may refer to configs/classification/cifar/r50.py to confirm it. I'm willing to help, but I suggest reading the log carefully and trying to find the bug yourself before raising an issue, so that we save time for both of us :)
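
To make the failure mode concrete, here is a minimal sketch of why the missing key raises the TypeError. Cifar10Stub, REGISTRY, and this simplified build_from_cfg are hypothetical stand-ins written for illustration; they only mimic the constructor signature implied by the traceback and are not the project's actual classes. The value 'train' for split is likewise an assumption.

```python
# Hypothetical stand-in for a data source whose constructor requires `split`,
# mirroring the signature implied by the traceback (not the real class).
class Cifar10Stub:
    def __init__(self, root, split):
        self.root = root
        self.split = split


def build_from_cfg(cfg, registry):
    """Simplified registry lookup: pop 'type', pass the rest as kwargs."""
    args = dict(cfg)
    obj_cls = registry[args.pop('type')]
    return obj_cls(**args)


REGISTRY = {'Cifar10': Cifar10Stub}
data_source_cfg = dict(type='Cifar10', root='data')

# Without `split`, construction fails in the same way as in the log.
try:
    build_from_cfg(dict(**data_source_cfg), REGISTRY)
    missing_split_error = None
except TypeError as exc:
    missing_split_error = str(exc)

# Passing `split` alongside the shared settings resolves the error.
# 'train' is an assumed value here.
source = build_from_cfg(dict(split='train', **data_source_cfg), REGISTRY)
```

In the config itself, the corresponding change would presumably be to write data_source=dict(split='train', **data_source_cfg) under data.train; check the referenced classification config for the exact form.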

etbox commented on August 16, 2024

Please forgive my carelessness. You are right! After fixing this bug, I ran into another one:

Original Traceback (most recent call last):
  File "/disk1/lhy/Applications/anaconda3/envs/open-mmlab/lib/python3.7/site-packages/torch/utils/data/_utils/worker.py", line 178, in _worker_loop
    data = fetcher.fetch(index)
  File "/disk1/lhy/Applications/anaconda3/envs/open-mmlab/lib/python3.7/site-packages/torch/utils/data/_utils/fetch.py", line 44, in fetch
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/disk1/lhy/Applications/anaconda3/envs/open-mmlab/lib/python3.7/site-packages/torch/utils/data/_utils/fetch.py", line 44, in <listcomp>
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/disk1/lhy/Documents/github/OpenSelfSup/openselfsup/datasets/byol.py", line 29, in __getitem__
    img1 = self.pipeline1(img)
  File "/disk1/lhy/Applications/anaconda3/envs/open-mmlab/lib/python3.7/site-packages/torchvision/transforms/transforms.py", line 61, in __call__
    img = t(img)
  File "/disk1/lhy/Applications/anaconda3/envs/open-mmlab/lib/python3.7/site-packages/torchvision/transforms/transforms.py", line 680, in __call__
    i, j, h, w = self.get_params(img, self.scale, self.ratio)
  File "/disk1/lhy/Applications/anaconda3/envs/open-mmlab/lib/python3.7/site-packages/torchvision/transforms/transforms.py", line 641, in get_params
    width, height = _get_image_size(img)
  File "/disk1/lhy/Applications/anaconda3/envs/open-mmlab/lib/python3.7/site-packages/torchvision/transforms/transforms.py", line 40, in _get_image_size
    raise TypeError("Unexpected type {}".format(type(img)))
TypeError: Unexpected type <class 'tuple'>
raise self.exc_type(msg)

Then I found that the img variable holds a tuple, (<PIL.Image.Image image mode=RGB size=32x32 at 0x7F1197BDA690>, 1), that wraps the original image data rather than the image alone. So I changed your code to extract the image from the tuple, and it works now.
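
The exact change is not shown in this thread, so the following is only a sketch of the idea: unpack the sample before handing it to the transform pipelines. extract_image and TwoViewDataset are hypothetical names, not the project's BYOLDataset, and treating the second tuple element as a label is an assumption. Strings stand in for images so that the sketch runs without PIL or torchvision.

```python
def extract_image(sample):
    """Return the image from a data-source sample.

    Assumption: the source may return either a bare image or an
    (image, label) tuple, as observed in the failing run.
    """
    if isinstance(sample, tuple):
        return sample[0]
    return sample


class TwoViewDataset:
    """Hypothetical two-view dataset; not the project's BYOLDataset."""

    def __init__(self, samples, pipeline1, pipeline2):
        self.samples = samples
        self.pipeline1 = pipeline1
        self.pipeline2 = pipeline2

    def __len__(self):
        return len(self.samples)

    def __getitem__(self, idx):
        # Drop the label (if present) so the transforms receive an image only.
        img = extract_image(self.samples[idx])
        return self.pipeline1(img), self.pipeline2(img)


# Stand-in "images" are plain strings; the lambdas stand in for pipelines.
dataset = TwoViewDataset(
    samples=[('img0', 1), 'img1'],
    pipeline1=lambda im: im + '_view1',
    pipeline2=lambda im: im + '_view2',
)
views = dataset[0]
```

Handling both the tuple and the bare-image case keeps the dataset usable with data sources that return either form.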

Thank you for your help, and your instruction did inspire me a lot!

XiaohangZhan commented on August 16, 2024

I notice that your code is still an old version. Please update to the latest code; otherwise there may be bugs and the results may not be reproducible.

etbox commented on August 16, 2024

Roger that!
