I ran the following command inside a Docker container and got the CUDA error shown in the log below. How can I resolve it?
docker@a68944098dc2:/Study-MaskFormer$ python train_net.py \
> --config-file configs/ade20k-150/maskformer_R50_bs16_160k.yaml \
> --num-gpus 1 SOLVER.IMS_PER_BATCH 2 SOLVER.BASE_LR 0.0001
Matplotlib created a temporary config/cache directory at /tmp/matplotlib-187ux1uq because the default path (/home/docker/.config/matplotlib) is not a writable directory; it is highly recommended to set the MPLCONFIGDIR environment variable to a writable directory, in particular to speed up the import of Matplotlib and to better support multiprocessing.
Matplotlib created a temporary config/cache directory at /tmp/matplotlib-x6qwnu56 because the default path (/tmp/matplotlib-187ux1uq) is not a writable directory; it is highly recommended to set the MPLCONFIGDIR environment variable to a writable directory, in particular to speed up the import of Matplotlib and to better support multiprocessing.
Command Line Args: Namespace(config_file='configs/ade20k-150/maskformer_R50_bs16_160k.yaml', dist_url='tcp://127.0.0.1:50153', eval_only=False, machine_rank=0, num_gpus=1, num_machines=1, opts=['SOLVER.IMS_PER_BATCH', '2', 'SOLVER.BASE_LR', '0.0001'], resume=False)
Loading config configs/ade20k-150/Base-ADE20K-150.yaml with yaml.unsafe_load. Your machine may be at risk if the file contains malicious content.
[08/09 16:12:59 detectron2]: Rank of current process: 0. World size: 1
/.pyenv/versions/3.8.6/lib/python3.8/site-packages/setuptools/distutils_patch.py:25: UserWarning: Distutils was imported before Setuptools. This usage is discouraged and may exhibit undesirable behaviors or errors. Please use Setuptools' objects directly or at least import Setuptools first.
warnings.warn(
[08/09 16:13:00 detectron2]: Environment info:
---------------------- ---------------------------------------------------------------------------
sys.platform linux
Python 3.8.6 (default, Aug 9 2021, 07:43:54) [GCC 7.5.0]
numpy 1.21.1
detectron2 0.4 @/detectron2_repo/detectron2
Compiler GCC 7.5
CUDA compiler CUDA 10.1
detectron2 arch flags 3.5, 3.7, 5.0, 5.2, 5.3, 6.0, 6.1, 7.0, 7.5
DETECTRON2_ENV_MODULE <not set>
PyTorch 1.8.0+cu101 @/.pyenv/versions/3.8.6/lib/python3.8/site-packages/torch
PyTorch debug build False
GPU available True
GPU 0,1 GeForce RTX 2080 Ti (arch=7.5)
CUDA_HOME /usr/local/cuda
TORCH_CUDA_ARCH_LIST Kepler;Kepler+Tesla;Maxwell;Maxwell+Tegra;Pascal;Volta;Turing
Pillow 8.3.1
torchvision 0.9.0+cu101 @/.pyenv/versions/3.8.6/lib/python3.8/site-packages/torchvision
torchvision arch flags 3.5, 5.0, 6.0, 7.0, 7.5
fvcore 0.1.3.post20210317
cv2 4.5.3
---------------------- ---------------------------------------------------------------------------
PyTorch built with:
- GCC 7.3
- C++ Version: 201402
- Intel(R) Math Kernel Library Version 2020.0.0 Product Build 20191122 for Intel(R) 64 architecture applications
- Intel(R) MKL-DNN v1.7.0 (Git Hash 7aed236906b1f7a05c0917e5257a1af05e9ff683)
- OpenMP 201511 (a.k.a. OpenMP 4.5)
- NNPACK is enabled
- CPU capability usage: AVX2
- CUDA Runtime 10.1
- NVCC architecture flags: -gencode;arch=compute_37,code=sm_37;-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_70,code=sm_70
- CuDNN 7.6.3
- Magma 2.5.2
- Build settings: BLAS_INFO=mkl, BUILD_TYPE=Release, CUDA_VERSION=10.1, CUDNN_VERSION=7.6.3, CXX_COMPILER=/opt/rh/devtoolset-7/root/usr/bin/c++, CXX_FLAGS= -Wno-deprecated -fvisibility-inlines-hidden -DUSE_PTHREADPOOL -fopenmp -DNDEBUG -DUSE_KINETO -DUSE_FBGEMM -DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK -DUSE_XNNPACK -O2 -fPIC -Wno-narrowing -Wall -Wextra -Werror=return-type -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-sign-compare -Wno-unused-parameter -Wno-unused-variable -Wno-unused-function -Wno-unused-result -Wno-unused-local-typedefs -Wno-strict-overflow -Wno-strict-aliasing -Wno-error=deprecated-declarations -Wno-stringop-overflow -Wno-psabi -Wno-error=pedantic -Wno-error=redundant-decls -Wno-error=old-style-cast -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Wno-stringop-overflow, LAPACK_INFO=mkl, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, TORCH_VERSION=1.8.0, USE_CUDA=ON, USE_CUDNN=ON, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=ON, USE_MKLDNN=ON, USE_MPI=OFF, USE_NCCL=ON, USE_NNPACK=ON, USE_OPENMP=ON,
[08/09 16:13:00 detectron2]: Command line arguments: Namespace(config_file='configs/ade20k-150/maskformer_R50_bs16_160k.yaml', dist_url='tcp://127.0.0.1:50153', eval_only=False, machine_rank=0, num_gpus=1, num_machines=1, opts=['SOLVER.IMS_PER_BATCH', '2', 'SOLVER.BASE_LR', '0.0001'], resume=False)
[08/09 16:13:00 detectron2]: Contents of args.config_file=configs/ade20k-150/maskformer_R50_bs16_160k.yaml:
_BASE_: Base-ADE20K-150.yaml
MODEL:
META_ARCHITECTURE: "MaskFormer"
SEM_SEG_HEAD:
NAME: "MaskFormerHead"
IN_FEATURES: ["res2", "res3", "res4", "res5"]
IGNORE_VALUE: 255
NUM_CLASSES: 150
COMMON_STRIDE: 4 # not used, hard-coded
LOSS_WEIGHT: 1.0
CONVS_DIM: 256
MASK_DIM: 256
NORM: "GN"
MASK_FORMER:
TRANSFORMER_IN_FEATURE: "res5"
DEEP_SUPERVISION: True
NO_OBJECT_WEIGHT: 0.1
DICE_WEIGHT: 1.0
MASK_WEIGHT: 20.0
HIDDEN_DIM: 256
NUM_OBJECT_QUERIES: 100
NHEADS: 8
DROPOUT: 0.1
DIM_FEEDFORWARD: 2048
ENC_LAYERS: 0
DEC_LAYERS: 6
PRE_NORM: False
[08/09 16:13:00 detectron2]: Running with full config:
CUDNN_BENCHMARK: False
DATALOADER:
ASPECT_RATIO_GROUPING: True
FILTER_EMPTY_ANNOTATIONS: True
NUM_WORKERS: 4
REPEAT_THRESHOLD: 0.0
SAMPLER_TRAIN: TrainingSampler
DATASETS:
PRECOMPUTED_PROPOSAL_TOPK_TEST: 1000
PRECOMPUTED_PROPOSAL_TOPK_TRAIN: 2000
PROPOSAL_FILES_TEST: ()
PROPOSAL_FILES_TRAIN: ()
TEST: ('ade20k_sem_seg_val',)
TRAIN: ('ade20k_sem_seg_train',)
GLOBAL:
HACK: 1.0
INPUT:
COLOR_AUG_SSD: True
CROP:
ENABLED: True
SINGLE_CATEGORY_MAX_AREA: 1.0
SIZE: [512, 512]
TYPE: absolute
DATASET_MAPPER_NAME: mask_former_semantic
FORMAT: RGB
MASK_FORMAT: polygon
MAX_SIZE_TEST: 2048
MAX_SIZE_TRAIN: 2048
MIN_SIZE_TEST: 512
MIN_SIZE_TRAIN: (256, 307, 358, 409, 460, 512, 563, 614, 665, 716, 768, 819, 870, 921, 972, 1024)
MIN_SIZE_TRAIN_SAMPLING: choice
RANDOM_FLIP: horizontal
SIZE_DIVISIBILITY: 512
MODEL:
ANCHOR_GENERATOR:
ANGLES: [[-90, 0, 90]]
ASPECT_RATIOS: [[0.5, 1.0, 2.0]]
NAME: DefaultAnchorGenerator
OFFSET: 0.0
SIZES: [[32, 64, 128, 256, 512]]
BACKBONE:
FREEZE_AT: 0
NAME: build_resnet_backbone
DEVICE: cuda
FPN:
FUSE_TYPE: sum
IN_FEATURES: []
NORM:
OUT_CHANNELS: 256
KEYPOINT_ON: False
LOAD_PROPOSALS: False
MASK_FORMER:
DEC_LAYERS: 6
DEEP_SUPERVISION: True
DICE_WEIGHT: 1.0
DIM_FEEDFORWARD: 2048
DROPOUT: 0.1
ENC_LAYERS: 0
ENFORCE_INPUT_PROJ: False
HIDDEN_DIM: 256
MASK_WEIGHT: 20.0
NHEADS: 8
NO_OBJECT_WEIGHT: 0.1
NUM_OBJECT_QUERIES: 100
PRE_NORM: False
SIZE_DIVISIBILITY: 32
TEST:
OBJECT_MASK_THRESHOLD: 0.0
OVERLAP_THRESHOLD: 0.0
PANOPTIC_ON: False
SEM_SEG_POSTPROCESSING_BEFORE_INFERENCE: False
TRANSFORMER_IN_FEATURE: res5
MASK_ON: False
META_ARCHITECTURE: MaskFormer
PANOPTIC_FPN:
COMBINE:
ENABLED: True
INSTANCES_CONFIDENCE_THRESH: 0.5
OVERLAP_THRESH: 0.5
STUFF_AREA_LIMIT: 4096
INSTANCE_LOSS_WEIGHT: 1.0
PIXEL_MEAN: [123.675, 116.28, 103.53]
PIXEL_STD: [58.395, 57.12, 57.375]
PROPOSAL_GENERATOR:
MIN_SIZE: 0
NAME: RPN
RESNETS:
DEFORM_MODULATED: False
DEFORM_NUM_GROUPS: 1
DEFORM_ON_PER_STAGE: [False, False, False, False]
DEPTH: 50
NORM: FrozenBN
NUM_GROUPS: 1
OUT_FEATURES: ['res2', 'res3', 'res4', 'res5']
RES2_OUT_CHANNELS: 256
RES4_DILATION: 1
RES5_DILATION: 1
RES5_MULTI_GRID: [1, 1, 1]
STEM_OUT_CHANNELS: 64
STEM_TYPE: basic
STRIDE_IN_1X1: False
WIDTH_PER_GROUP: 64
RETINANET:
BBOX_REG_LOSS_TYPE: smooth_l1
BBOX_REG_WEIGHTS: (1.0, 1.0, 1.0, 1.0)
FOCAL_LOSS_ALPHA: 0.25
FOCAL_LOSS_GAMMA: 2.0
IN_FEATURES: ['p3', 'p4', 'p5', 'p6', 'p7']
IOU_LABELS: [0, -1, 1]
IOU_THRESHOLDS: [0.4, 0.5]
NMS_THRESH_TEST: 0.5
NORM:
NUM_CLASSES: 80
NUM_CONVS: 4
PRIOR_PROB: 0.01
SCORE_THRESH_TEST: 0.05
SMOOTH_L1_LOSS_BETA: 0.1
TOPK_CANDIDATES_TEST: 1000
ROI_BOX_CASCADE_HEAD:
BBOX_REG_WEIGHTS: ((10.0, 10.0, 5.0, 5.0), (20.0, 20.0, 10.0, 10.0), (30.0, 30.0, 15.0, 15.0))
IOUS: (0.5, 0.6, 0.7)
ROI_BOX_HEAD:
BBOX_REG_LOSS_TYPE: smooth_l1
BBOX_REG_LOSS_WEIGHT: 1.0
BBOX_REG_WEIGHTS: (10.0, 10.0, 5.0, 5.0)
CLS_AGNOSTIC_BBOX_REG: False
CONV_DIM: 256
FC_DIM: 1024
NAME:
NORM:
NUM_CONV: 0
NUM_FC: 0
POOLER_RESOLUTION: 14
POOLER_SAMPLING_RATIO: 0
POOLER_TYPE: ROIAlignV2
SMOOTH_L1_BETA: 0.0
TRAIN_ON_PRED_BOXES: False
ROI_HEADS:
BATCH_SIZE_PER_IMAGE: 512
IN_FEATURES: ['res4']
IOU_LABELS: [0, 1]
IOU_THRESHOLDS: [0.5]
NAME: Res5ROIHeads
NMS_THRESH_TEST: 0.5
NUM_CLASSES: 80
POSITIVE_FRACTION: 0.25
PROPOSAL_APPEND_GT: True
SCORE_THRESH_TEST: 0.05
ROI_KEYPOINT_HEAD:
CONV_DIMS: (512, 512, 512, 512, 512, 512, 512, 512)
LOSS_WEIGHT: 1.0
MIN_KEYPOINTS_PER_IMAGE: 1
NAME: KRCNNConvDeconvUpsampleHead
NORMALIZE_LOSS_BY_VISIBLE_KEYPOINTS: True
NUM_KEYPOINTS: 17
POOLER_RESOLUTION: 14
POOLER_SAMPLING_RATIO: 0
POOLER_TYPE: ROIAlignV2
ROI_MASK_HEAD:
CLS_AGNOSTIC_MASK: False
CONV_DIM: 256
NAME: MaskRCNNConvUpsampleHead
NORM:
NUM_CONV: 0
POOLER_RESOLUTION: 14
POOLER_SAMPLING_RATIO: 0
POOLER_TYPE: ROIAlignV2
RPN:
BATCH_SIZE_PER_IMAGE: 256
BBOX_REG_LOSS_TYPE: smooth_l1
BBOX_REG_LOSS_WEIGHT: 1.0
BBOX_REG_WEIGHTS: (1.0, 1.0, 1.0, 1.0)
BOUNDARY_THRESH: -1
HEAD_NAME: StandardRPNHead
IN_FEATURES: ['res4']
IOU_LABELS: [0, -1, 1]
IOU_THRESHOLDS: [0.3, 0.7]
LOSS_WEIGHT: 1.0
NMS_THRESH: 0.7
POSITIVE_FRACTION: 0.5
POST_NMS_TOPK_TEST: 1000
POST_NMS_TOPK_TRAIN: 2000
PRE_NMS_TOPK_TEST: 6000
PRE_NMS_TOPK_TRAIN: 12000
SMOOTH_L1_BETA: 0.0
SEM_SEG_HEAD:
ASPP_CHANNELS: 256
ASPP_DILATIONS: [6, 12, 18]
ASPP_DROPOUT: 0.1
COMMON_STRIDE: 4
CONVS_DIM: 256
IGNORE_VALUE: 255
IN_FEATURES: ['res2', 'res3', 'res4', 'res5']
LOSS_TYPE: hard_pixel_mining
LOSS_WEIGHT: 1.0
MASK_DIM: 256
NAME: MaskFormerHead
NORM: GN
NUM_CLASSES: 150
PIXEL_DECODER_NAME: BasePixelDecoder
PROJECT_CHANNELS: [48]
PROJECT_FEATURES: ['res2']
TRANSFORMER_ENC_LAYERS: 0
USE_DEPTHWISE_SEPARABLE_CONV: False
SWIN:
APE: False
ATTN_DROP_RATE: 0.0
DEPTHS: [2, 2, 6, 2]
DROP_PATH_RATE: 0.3
DROP_RATE: 0.0
EMBED_DIM: 96
MLP_RATIO: 4.0
NUM_HEADS: [3, 6, 12, 24]
OUT_FEATURES: ['res2', 'res3', 'res4', 'res5']
PATCH_NORM: True
PATCH_SIZE: 4
PRETRAIN_IMG_SIZE: 224
QKV_BIAS: True
QK_SCALE: None
WINDOW_SIZE: 7
WEIGHTS: detectron2://ImageNetPretrained/torchvision/R-50.pkl
OUTPUT_DIR: ./output
SEED: -1
SOLVER:
AMP:
ENABLED: False
BACKBONE_MULTIPLIER: 0.1
BASE_LR: 0.0001
BIAS_LR_FACTOR: 1.0
CHECKPOINT_PERIOD: 5000
CLIP_GRADIENTS:
CLIP_TYPE: full_model
CLIP_VALUE: 0.01
ENABLED: True
NORM_TYPE: 2.0
GAMMA: 0.1
IMS_PER_BATCH: 2
LR_SCHEDULER_NAME: WarmupPolyLR
MAX_ITER: 160000
MOMENTUM: 0.9
NESTEROV: False
OPTIMIZER: ADAMW
POLY_LR_CONSTANT_ENDING: 0.0
POLY_LR_POWER: 0.9
REFERENCE_WORLD_SIZE: 0
STEPS: (30000,)
WARMUP_FACTOR: 1.0
WARMUP_ITERS: 0
WARMUP_METHOD: linear
WEIGHT_DECAY: 0.0001
WEIGHT_DECAY_BIAS: 0.0001
WEIGHT_DECAY_EMBED: 0.0
WEIGHT_DECAY_NORM: 0.0
TEST:
AUG:
ENABLED: False
FLIP: True
MAX_SIZE: 3584
MIN_SIZES: (256, 384, 512, 640, 768, 896)
DETECTIONS_PER_IMAGE: 100
EVAL_PERIOD: 5000
EXPECTED_RESULTS: []
KEYPOINT_OKS_SIGMAS: []
PRECISE_BN:
ENABLED: False
NUM_ITER: 200
VERSION: 2
VIS_PERIOD: 0
[08/09 16:13:00 detectron2]: Full config saved to ./output/config.yaml
[08/09 16:13:00 d2.utils.env]: Using a generated random seed 881166
[08/09 16:13:04 d2.engine.defaults]: Model:
MaskFormer(
(backbone): ResNet(
(stem): BasicStem(
(conv1): Conv2d(
3, 64, kernel_size=(7, 7), stride=(2, 2), padding=(3, 3), bias=False
(norm): FrozenBatchNorm2d(num_features=64, eps=1e-05)
)
)
(res2): Sequential(
(0): BottleneckBlock(
(shortcut): Conv2d(
64, 256, kernel_size=(1, 1), stride=(1, 1), bias=False
(norm): FrozenBatchNorm2d(num_features=256, eps=1e-05)
)
(conv1): Conv2d(
64, 64, kernel_size=(1, 1), stride=(1, 1), bias=False
(norm): FrozenBatchNorm2d(num_features=64, eps=1e-05)
)
(conv2): Conv2d(
64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False
(norm): FrozenBatchNorm2d(num_features=64, eps=1e-05)
)
(conv3): Conv2d(
64, 256, kernel_size=(1, 1), stride=(1, 1), bias=False
(norm): FrozenBatchNorm2d(num_features=256, eps=1e-05)
)
)
(1): BottleneckBlock(
(conv1): Conv2d(
256, 64, kernel_size=(1, 1), stride=(1, 1), bias=False
(norm): FrozenBatchNorm2d(num_features=64, eps=1e-05)
)
(conv2): Conv2d(
64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False
(norm): FrozenBatchNorm2d(num_features=64, eps=1e-05)
)
(conv3): Conv2d(
64, 256, kernel_size=(1, 1), stride=(1, 1), bias=False
(norm): FrozenBatchNorm2d(num_features=256, eps=1e-05)
)
)
(2): BottleneckBlock(
(conv1): Conv2d(
256, 64, kernel_size=(1, 1), stride=(1, 1), bias=False
(norm): FrozenBatchNorm2d(num_features=64, eps=1e-05)
)
(conv2): Conv2d(
64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False
(norm): FrozenBatchNorm2d(num_features=64, eps=1e-05)
)
(conv3): Conv2d(
64, 256, kernel_size=(1, 1), stride=(1, 1), bias=False
(norm): FrozenBatchNorm2d(num_features=256, eps=1e-05)
)
)
)
(res3): Sequential(
(0): BottleneckBlock(
(shortcut): Conv2d(
256, 512, kernel_size=(1, 1), stride=(2, 2), bias=False
(norm): FrozenBatchNorm2d(num_features=512, eps=1e-05)
)
(conv1): Conv2d(
256, 128, kernel_size=(1, 1), stride=(1, 1), bias=False
(norm): FrozenBatchNorm2d(num_features=128, eps=1e-05)
)
(conv2): Conv2d(
128, 128, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False
(norm): FrozenBatchNorm2d(num_features=128, eps=1e-05)
)
(conv3): Conv2d(
128, 512, kernel_size=(1, 1), stride=(1, 1), bias=False
(norm): FrozenBatchNorm2d(num_features=512, eps=1e-05)
)
)
(1): BottleneckBlock(
(conv1): Conv2d(
512, 128, kernel_size=(1, 1), stride=(1, 1), bias=False
(norm): FrozenBatchNorm2d(num_features=128, eps=1e-05)
)
(conv2): Conv2d(
128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False
(norm): FrozenBatchNorm2d(num_features=128, eps=1e-05)
)
(conv3): Conv2d(
128, 512, kernel_size=(1, 1), stride=(1, 1), bias=False
(norm): FrozenBatchNorm2d(num_features=512, eps=1e-05)
)
)
(2): BottleneckBlock(
(conv1): Conv2d(
512, 128, kernel_size=(1, 1), stride=(1, 1), bias=False
(norm): FrozenBatchNorm2d(num_features=128, eps=1e-05)
)
(conv2): Conv2d(
128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False
(norm): FrozenBatchNorm2d(num_features=128, eps=1e-05)
)
(conv3): Conv2d(
128, 512, kernel_size=(1, 1), stride=(1, 1), bias=False
(norm): FrozenBatchNorm2d(num_features=512, eps=1e-05)
)
)
(3): BottleneckBlock(
(conv1): Conv2d(
512, 128, kernel_size=(1, 1), stride=(1, 1), bias=False
(norm): FrozenBatchNorm2d(num_features=128, eps=1e-05)
)
(conv2): Conv2d(
128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False
(norm): FrozenBatchNorm2d(num_features=128, eps=1e-05)
)
(conv3): Conv2d(
128, 512, kernel_size=(1, 1), stride=(1, 1), bias=False
(norm): FrozenBatchNorm2d(num_features=512, eps=1e-05)
)
)
)
(res4): Sequential(
(0): BottleneckBlock(
(shortcut): Conv2d(
512, 1024, kernel_size=(1, 1), stride=(2, 2), bias=False
(norm): FrozenBatchNorm2d(num_features=1024, eps=1e-05)
)
(conv1): Conv2d(
512, 256, kernel_size=(1, 1), stride=(1, 1), bias=False
(norm): FrozenBatchNorm2d(num_features=256, eps=1e-05)
)
(conv2): Conv2d(
256, 256, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False
(norm): FrozenBatchNorm2d(num_features=256, eps=1e-05)
)
(conv3): Conv2d(
256, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False
(norm): FrozenBatchNorm2d(num_features=1024, eps=1e-05)
)
)
(1): BottleneckBlock(
(conv1): Conv2d(
1024, 256, kernel_size=(1, 1), stride=(1, 1), bias=False
(norm): FrozenBatchNorm2d(num_features=256, eps=1e-05)
)
(conv2): Conv2d(
256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False
(norm): FrozenBatchNorm2d(num_features=256, eps=1e-05)
)
(conv3): Conv2d(
256, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False
(norm): FrozenBatchNorm2d(num_features=1024, eps=1e-05)
)
)
(2): BottleneckBlock(
(conv1): Conv2d(
1024, 256, kernel_size=(1, 1), stride=(1, 1), bias=False
(norm): FrozenBatchNorm2d(num_features=256, eps=1e-05)
)
(conv2): Conv2d(
256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False
(norm): FrozenBatchNorm2d(num_features=256, eps=1e-05)
)
(conv3): Conv2d(
256, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False
(norm): FrozenBatchNorm2d(num_features=1024, eps=1e-05)
)
)
(3): BottleneckBlock(
(conv1): Conv2d(
1024, 256, kernel_size=(1, 1), stride=(1, 1), bias=False
(norm): FrozenBatchNorm2d(num_features=256, eps=1e-05)
)
(conv2): Conv2d(
256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False
(norm): FrozenBatchNorm2d(num_features=256, eps=1e-05)
)
(conv3): Conv2d(
256, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False
(norm): FrozenBatchNorm2d(num_features=1024, eps=1e-05)
)
)
(4): BottleneckBlock(
(conv1): Conv2d(
1024, 256, kernel_size=(1, 1), stride=(1, 1), bias=False
(norm): FrozenBatchNorm2d(num_features=256, eps=1e-05)
)
(conv2): Conv2d(
256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False
(norm): FrozenBatchNorm2d(num_features=256, eps=1e-05)
)
(conv3): Conv2d(
256, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False
(norm): FrozenBatchNorm2d(num_features=1024, eps=1e-05)
)
)
(5): BottleneckBlock(
(conv1): Conv2d(
1024, 256, kernel_size=(1, 1), stride=(1, 1), bias=False
(norm): FrozenBatchNorm2d(num_features=256, eps=1e-05)
)
(conv2): Conv2d(
256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False
(norm): FrozenBatchNorm2d(num_features=256, eps=1e-05)
)
(conv3): Conv2d(
256, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False
(norm): FrozenBatchNorm2d(num_features=1024, eps=1e-05)
)
)
)
(res5): Sequential(
(0): BottleneckBlock(
(shortcut): Conv2d(
1024, 2048, kernel_size=(1, 1), stride=(2, 2), bias=False
(norm): FrozenBatchNorm2d(num_features=2048, eps=1e-05)
)
(conv1): Conv2d(
1024, 512, kernel_size=(1, 1), stride=(1, 1), bias=False
(norm): FrozenBatchNorm2d(num_features=512, eps=1e-05)
)
(conv2): Conv2d(
512, 512, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False
(norm): FrozenBatchNorm2d(num_features=512, eps=1e-05)
)
(conv3): Conv2d(
512, 2048, kernel_size=(1, 1), stride=(1, 1), bias=False
(norm): FrozenBatchNorm2d(num_features=2048, eps=1e-05)
)
)
(1): BottleneckBlock(
(conv1): Conv2d(
2048, 512, kernel_size=(1, 1), stride=(1, 1), bias=False
(norm): FrozenBatchNorm2d(num_features=512, eps=1e-05)
)
(conv2): Conv2d(
512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False
(norm): FrozenBatchNorm2d(num_features=512, eps=1e-05)
)
(conv3): Conv2d(
512, 2048, kernel_size=(1, 1), stride=(1, 1), bias=False
(norm): FrozenBatchNorm2d(num_features=2048, eps=1e-05)
)
)
(2): BottleneckBlock(
(conv1): Conv2d(
2048, 512, kernel_size=(1, 1), stride=(1, 1), bias=False
(norm): FrozenBatchNorm2d(num_features=512, eps=1e-05)
)
(conv2): Conv2d(
512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False
(norm): FrozenBatchNorm2d(num_features=512, eps=1e-05)
)
(conv3): Conv2d(
512, 2048, kernel_size=(1, 1), stride=(1, 1), bias=False
(norm): FrozenBatchNorm2d(num_features=2048, eps=1e-05)
)
)
)
)
(sem_seg_head): MaskFormerHead(
(pixel_decoder): BasePixelDecoder(
(adapter_1): Conv2d(
256, 256, kernel_size=(1, 1), stride=(1, 1), bias=False
(norm): GroupNorm(32, 256, eps=1e-05, affine=True)
)
(layer_1): Conv2d(
256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False
(norm): GroupNorm(32, 256, eps=1e-05, affine=True)
)
(adapter_2): Conv2d(
512, 256, kernel_size=(1, 1), stride=(1, 1), bias=False
(norm): GroupNorm(32, 256, eps=1e-05, affine=True)
)
(layer_2): Conv2d(
256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False
(norm): GroupNorm(32, 256, eps=1e-05, affine=True)
)
(adapter_3): Conv2d(
1024, 256, kernel_size=(1, 1), stride=(1, 1), bias=False
(norm): GroupNorm(32, 256, eps=1e-05, affine=True)
)
(layer_3): Conv2d(
256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False
(norm): GroupNorm(32, 256, eps=1e-05, affine=True)
)
(layer_4): Conv2d(
2048, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False
(norm): GroupNorm(32, 256, eps=1e-05, affine=True)
)
(mask_features): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
)
(predictor): TransformerPredictor(
(pe_layer): PositionEmbeddingSine()
(transformer): Transformer(
(encoder): TransformerEncoder(
(layers): ModuleList()
)
(decoder): TransformerDecoder(
(layers): ModuleList(
(0): TransformerDecoderLayer(
(self_attn): MultiheadAttention(
(out_proj): _LinearWithBias(in_features=256, out_features=256, bias=True)
)
(multihead_attn): MultiheadAttention(
(out_proj): _LinearWithBias(in_features=256, out_features=256, bias=True)
)
(linear1): Linear(in_features=256, out_features=2048, bias=True)
(dropout): Dropout(p=0.1, inplace=False)
(linear2): Linear(in_features=2048, out_features=256, bias=True)
(norm1): LayerNorm((256,), eps=1e-05, elementwise_affine=True)
(norm2): LayerNorm((256,), eps=1e-05, elementwise_affine=True)
(norm3): LayerNorm((256,), eps=1e-05, elementwise_affine=True)
(dropout1): Dropout(p=0.1, inplace=False)
(dropout2): Dropout(p=0.1, inplace=False)
(dropout3): Dropout(p=0.1, inplace=False)
)
(1): TransformerDecoderLayer(
(self_attn): MultiheadAttention(
(out_proj): _LinearWithBias(in_features=256, out_features=256, bias=True)
)
(multihead_attn): MultiheadAttention(
(out_proj): _LinearWithBias(in_features=256, out_features=256, bias=True)
)
(linear1): Linear(in_features=256, out_features=2048, bias=True)
(dropout): Dropout(p=0.1, inplace=False)
(linear2): Linear(in_features=2048, out_features=256, bias=True)
(norm1): LayerNorm((256,), eps=1e-05, elementwise_affine=True)
(norm2): LayerNorm((256,), eps=1e-05, elementwise_affine=True)
(norm3): LayerNorm((256,), eps=1e-05, elementwise_affine=True)
(dropout1): Dropout(p=0.1, inplace=False)
(dropout2): Dropout(p=0.1, inplace=False)
(dropout3): Dropout(p=0.1, inplace=False)
)
(2): TransformerDecoderLayer(
(self_attn): MultiheadAttention(
(out_proj): _LinearWithBias(in_features=256, out_features=256, bias=True)
)
(multihead_attn): MultiheadAttention(
(out_proj): _LinearWithBias(in_features=256, out_features=256, bias=True)
)
(linear1): Linear(in_features=256, out_features=2048, bias=True)
(dropout): Dropout(p=0.1, inplace=False)
(linear2): Linear(in_features=2048, out_features=256, bias=True)
(norm1): LayerNorm((256,), eps=1e-05, elementwise_affine=True)
(norm2): LayerNorm((256,), eps=1e-05, elementwise_affine=True)
(norm3): LayerNorm((256,), eps=1e-05, elementwise_affine=True)
(dropout1): Dropout(p=0.1, inplace=False)
(dropout2): Dropout(p=0.1, inplace=False)
(dropout3): Dropout(p=0.1, inplace=False)
)
(3): TransformerDecoderLayer(
(self_attn): MultiheadAttention(
(out_proj): _LinearWithBias(in_features=256, out_features=256, bias=True)
)
(multihead_attn): MultiheadAttention(
(out_proj): _LinearWithBias(in_features=256, out_features=256, bias=True)
)
(linear1): Linear(in_features=256, out_features=2048, bias=True)
(dropout): Dropout(p=0.1, inplace=False)
(linear2): Linear(in_features=2048, out_features=256, bias=True)
(norm1): LayerNorm((256,), eps=1e-05, elementwise_affine=True)
(norm2): LayerNorm((256,), eps=1e-05, elementwise_affine=True)
(norm3): LayerNorm((256,), eps=1e-05, elementwise_affine=True)
(dropout1): Dropout(p=0.1, inplace=False)
(dropout2): Dropout(p=0.1, inplace=False)
(dropout3): Dropout(p=0.1, inplace=False)
)
(4): TransformerDecoderLayer(
(self_attn): MultiheadAttention(
(out_proj): _LinearWithBias(in_features=256, out_features=256, bias=True)
)
(multihead_attn): MultiheadAttention(
(out_proj): _LinearWithBias(in_features=256, out_features=256, bias=True)
)
(linear1): Linear(in_features=256, out_features=2048, bias=True)
(dropout): Dropout(p=0.1, inplace=False)
(linear2): Linear(in_features=2048, out_features=256, bias=True)
(norm1): LayerNorm((256,), eps=1e-05, elementwise_affine=True)
(norm2): LayerNorm((256,), eps=1e-05, elementwise_affine=True)
(norm3): LayerNorm((256,), eps=1e-05, elementwise_affine=True)
(dropout1): Dropout(p=0.1, inplace=False)
(dropout2): Dropout(p=0.1, inplace=False)
(dropout3): Dropout(p=0.1, inplace=False)
)
(5): TransformerDecoderLayer(
(self_attn): MultiheadAttention(
(out_proj): _LinearWithBias(in_features=256, out_features=256, bias=True)
)
(multihead_attn): MultiheadAttention(
(out_proj): _LinearWithBias(in_features=256, out_features=256, bias=True)
)
(linear1): Linear(in_features=256, out_features=2048, bias=True)
(dropout): Dropout(p=0.1, inplace=False)
(linear2): Linear(in_features=2048, out_features=256, bias=True)
(norm1): LayerNorm((256,), eps=1e-05, elementwise_affine=True)
(norm2): LayerNorm((256,), eps=1e-05, elementwise_affine=True)
(norm3): LayerNorm((256,), eps=1e-05, elementwise_affine=True)
(dropout1): Dropout(p=0.1, inplace=False)
(dropout2): Dropout(p=0.1, inplace=False)
(dropout3): Dropout(p=0.1, inplace=False)
)
)
(norm): LayerNorm((256,), eps=1e-05, elementwise_affine=True)
)
)
(query_embed): Embedding(100, 256)
(input_proj): Conv2d(2048, 256, kernel_size=(1, 1), stride=(1, 1))
(class_embed): Linear(in_features=256, out_features=151, bias=True)
(mask_embed): MLP(
(layers): ModuleList(
(0): Linear(in_features=256, out_features=256, bias=True)
(1): Linear(in_features=256, out_features=256, bias=True)
(2): Linear(in_features=256, out_features=256, bias=True)
)
)
)
)
(criterion): SetCriterion(
(matcher): Matcher HungarianMatcher
cost_class: 1
cost_mask: 20.0
cost_dice: 1.0
)
)
[08/09 16:13:04 mask_former.data.dataset_mappers.mask_former_semantic_dataset_mapper]: [MaskFormerSemanticDatasetMapper] Augmentations used in training: [ResizeShortestEdge(short_edge_length=..., max_size=2048, sample_style='choice'), RandomCrop_CategoryAreaConstraint(crop_type='absolute', crop_size=[512, 512], single_category_max_area=1.0, ignored_category=255), <detectron2.projects.point_rend.color_augmentation.ColorAugSSDTransform object at 0x7fb8efc1b040>, RandomFlip()]
[08/09 16:13:05 d2.data.datasets.coco]: Loaded 20210 images with semantic segmentation from datasets/ADEChallengeData2016/images/training
[08/09 16:13:05 d2.data.build]: Using training sampler TrainingSampler
[08/09 16:13:05 d2.data.common]: Serializing 20210 elements to byte tensors and concatenating them all ...
[08/09 16:13:05 d2.data.common]: Serialized dataset takes 3.97 MiB
[08/09 16:13:05 fvcore.common.checkpoint]: Loading checkpoint from detectron2://ImageNetPretrained/torchvision/R-50.pkl
/home/docker/.torch/iopath_cache is not accessible! Using /tmp/iopath_cache instead!
R-50.pkl: 102MB [00:09, 11.2MB/s]
[08/09 16:13:14 fvcore.common.checkpoint]: Reading a file from 'torchvision'
[08/09 16:13:14 d2.checkpoint.c2_model_loading]: Following weights matched with submodule backbone:
| Names in Model | Names in Checkpoint | Shapes |
|:------------------|:----------------------------------------------------------------------------------|:------------------------------------------------|
| res2.0.conv1.* | res2.0.conv1.{norm.bias,norm.running_mean,norm.running_var,norm.weight,weight} | (64,) (64,) (64,) (64,) (64,64,1,1) |
| res2.0.conv2.* | res2.0.conv2.{norm.bias,norm.running_mean,norm.running_var,norm.weight,weight} | (64,) (64,) (64,) (64,) (64,64,3,3) |
| res2.0.conv3.* | res2.0.conv3.{norm.bias,norm.running_mean,norm.running_var,norm.weight,weight} | (256,) (256,) (256,) (256,) (256,64,1,1) |
| res2.0.shortcut.* | res2.0.shortcut.{norm.bias,norm.running_mean,norm.running_var,norm.weight,weight} | (256,) (256,) (256,) (256,) (256,64,1,1) |
| res2.1.conv1.* | res2.1.conv1.{norm.bias,norm.running_mean,norm.running_var,norm.weight,weight} | (64,) (64,) (64,) (64,) (64,256,1,1) |
| res2.1.conv2.* | res2.1.conv2.{norm.bias,norm.running_mean,norm.running_var,norm.weight,weight} | (64,) (64,) (64,) (64,) (64,64,3,3) |
| res2.1.conv3.* | res2.1.conv3.{norm.bias,norm.running_mean,norm.running_var,norm.weight,weight} | (256,) (256,) (256,) (256,) (256,64,1,1) |
| res2.2.conv1.* | res2.2.conv1.{norm.bias,norm.running_mean,norm.running_var,norm.weight,weight} | (64,) (64,) (64,) (64,) (64,256,1,1) |
| res2.2.conv2.* | res2.2.conv2.{norm.bias,norm.running_mean,norm.running_var,norm.weight,weight} | (64,) (64,) (64,) (64,) (64,64,3,3) |
| res2.2.conv3.* | res2.2.conv3.{norm.bias,norm.running_mean,norm.running_var,norm.weight,weight} | (256,) (256,) (256,) (256,) (256,64,1,1) |
| res3.0.conv1.* | res3.0.conv1.{norm.bias,norm.running_mean,norm.running_var,norm.weight,weight} | (128,) (128,) (128,) (128,) (128,256,1,1) |
| res3.0.conv2.* | res3.0.conv2.{norm.bias,norm.running_mean,norm.running_var,norm.weight,weight} | (128,) (128,) (128,) (128,) (128,128,3,3) |
| res3.0.conv3.* | res3.0.conv3.{norm.bias,norm.running_mean,norm.running_var,norm.weight,weight} | (512,) (512,) (512,) (512,) (512,128,1,1) |
| res3.0.shortcut.* | res3.0.shortcut.{norm.bias,norm.running_mean,norm.running_var,norm.weight,weight} | (512,) (512,) (512,) (512,) (512,256,1,1) |
| res3.1.conv1.* | res3.1.conv1.{norm.bias,norm.running_mean,norm.running_var,norm.weight,weight} | (128,) (128,) (128,) (128,) (128,512,1,1) |
| res3.1.conv2.* | res3.1.conv2.{norm.bias,norm.running_mean,norm.running_var,norm.weight,weight} | (128,) (128,) (128,) (128,) (128,128,3,3) |
| res3.1.conv3.* | res3.1.conv3.{norm.bias,norm.running_mean,norm.running_var,norm.weight,weight} | (512,) (512,) (512,) (512,) (512,128,1,1) |
| res3.2.conv1.* | res3.2.conv1.{norm.bias,norm.running_mean,norm.running_var,norm.weight,weight} | (128,) (128,) (128,) (128,) (128,512,1,1) |
| res3.2.conv2.* | res3.2.conv2.{norm.bias,norm.running_mean,norm.running_var,norm.weight,weight} | (128,) (128,) (128,) (128,) (128,128,3,3) |
| res3.2.conv3.* | res3.2.conv3.{norm.bias,norm.running_mean,norm.running_var,norm.weight,weight} | (512,) (512,) (512,) (512,) (512,128,1,1) |
| res3.3.conv1.* | res3.3.conv1.{norm.bias,norm.running_mean,norm.running_var,norm.weight,weight} | (128,) (128,) (128,) (128,) (128,512,1,1) |
| res3.3.conv2.* | res3.3.conv2.{norm.bias,norm.running_mean,norm.running_var,norm.weight,weight} | (128,) (128,) (128,) (128,) (128,128,3,3) |
| res3.3.conv3.* | res3.3.conv3.{norm.bias,norm.running_mean,norm.running_var,norm.weight,weight} | (512,) (512,) (512,) (512,) (512,128,1,1) |
| res4.0.conv1.* | res4.0.conv1.{norm.bias,norm.running_mean,norm.running_var,norm.weight,weight} | (256,) (256,) (256,) (256,) (256,512,1,1) |
| res4.0.conv2.* | res4.0.conv2.{norm.bias,norm.running_mean,norm.running_var,norm.weight,weight} | (256,) (256,) (256,) (256,) (256,256,3,3) |
| res4.0.conv3.* | res4.0.conv3.{norm.bias,norm.running_mean,norm.running_var,norm.weight,weight} | (1024,) (1024,) (1024,) (1024,) (1024,256,1,1) |
| res4.0.shortcut.* | res4.0.shortcut.{norm.bias,norm.running_mean,norm.running_var,norm.weight,weight} | (1024,) (1024,) (1024,) (1024,) (1024,512,1,1) |
| res4.1.conv1.* | res4.1.conv1.{norm.bias,norm.running_mean,norm.running_var,norm.weight,weight} | (256,) (256,) (256,) (256,) (256,1024,1,1) |
| res4.1.conv2.* | res4.1.conv2.{norm.bias,norm.running_mean,norm.running_var,norm.weight,weight} | (256,) (256,) (256,) (256,) (256,256,3,3) |
| res4.1.conv3.* | res4.1.conv3.{norm.bias,norm.running_mean,norm.running_var,norm.weight,weight} | (1024,) (1024,) (1024,) (1024,) (1024,256,1,1) |
| res4.2.conv1.* | res4.2.conv1.{norm.bias,norm.running_mean,norm.running_var,norm.weight,weight} | (256,) (256,) (256,) (256,) (256,1024,1,1) |
| res4.2.conv2.* | res4.2.conv2.{norm.bias,norm.running_mean,norm.running_var,norm.weight,weight} | (256,) (256,) (256,) (256,) (256,256,3,3) |
| res4.2.conv3.* | res4.2.conv3.{norm.bias,norm.running_mean,norm.running_var,norm.weight,weight} | (1024,) (1024,) (1024,) (1024,) (1024,256,1,1) |
| res4.3.conv1.* | res4.3.conv1.{norm.bias,norm.running_mean,norm.running_var,norm.weight,weight} | (256,) (256,) (256,) (256,) (256,1024,1,1) |
| res4.3.conv2.* | res4.3.conv2.{norm.bias,norm.running_mean,norm.running_var,norm.weight,weight} | (256,) (256,) (256,) (256,) (256,256,3,3) |
| res4.3.conv3.* | res4.3.conv3.{norm.bias,norm.running_mean,norm.running_var,norm.weight,weight} | (1024,) (1024,) (1024,) (1024,) (1024,256,1,1) |
| res4.4.conv1.* | res4.4.conv1.{norm.bias,norm.running_mean,norm.running_var,norm.weight,weight} | (256,) (256,) (256,) (256,) (256,1024,1,1) |
| res4.4.conv2.* | res4.4.conv2.{norm.bias,norm.running_mean,norm.running_var,norm.weight,weight} | (256,) (256,) (256,) (256,) (256,256,3,3) |
| res4.4.conv3.* | res4.4.conv3.{norm.bias,norm.running_mean,norm.running_var,norm.weight,weight} | (1024,) (1024,) (1024,) (1024,) (1024,256,1,1) |
| res4.5.conv1.* | res4.5.conv1.{norm.bias,norm.running_mean,norm.running_var,norm.weight,weight} | (256,) (256,) (256,) (256,) (256,1024,1,1) |
| res4.5.conv2.* | res4.5.conv2.{norm.bias,norm.running_mean,norm.running_var,norm.weight,weight} | (256,) (256,) (256,) (256,) (256,256,3,3) |
| res4.5.conv3.* | res4.5.conv3.{norm.bias,norm.running_mean,norm.running_var,norm.weight,weight} | (1024,) (1024,) (1024,) (1024,) (1024,256,1,1) |
| res5.0.conv1.* | res5.0.conv1.{norm.bias,norm.running_mean,norm.running_var,norm.weight,weight} | (512,) (512,) (512,) (512,) (512,1024,1,1) |
| res5.0.conv2.* | res5.0.conv2.{norm.bias,norm.running_mean,norm.running_var,norm.weight,weight} | (512,) (512,) (512,) (512,) (512,512,3,3) |
| res5.0.conv3.* | res5.0.conv3.{norm.bias,norm.running_mean,norm.running_var,norm.weight,weight} | (2048,) (2048,) (2048,) (2048,) (2048,512,1,1) |
| res5.0.shortcut.* | res5.0.shortcut.{norm.bias,norm.running_mean,norm.running_var,norm.weight,weight} | (2048,) (2048,) (2048,) (2048,) (2048,1024,1,1) |
| res5.1.conv1.* | res5.1.conv1.{norm.bias,norm.running_mean,norm.running_var,norm.weight,weight} | (512,) (512,) (512,) (512,) (512,2048,1,1) |
| res5.1.conv2.* | res5.1.conv2.{norm.bias,norm.running_mean,norm.running_var,norm.weight,weight} | (512,) (512,) (512,) (512,) (512,512,3,3) |
| res5.1.conv3.* | res5.1.conv3.{norm.bias,norm.running_mean,norm.running_var,norm.weight,weight} | (2048,) (2048,) (2048,) (2048,) (2048,512,1,1) |
| res5.2.conv1.* | res5.2.conv1.{norm.bias,norm.running_mean,norm.running_var,norm.weight,weight} | (512,) (512,) (512,) (512,) (512,2048,1,1) |
| res5.2.conv2.* | res5.2.conv2.{norm.bias,norm.running_mean,norm.running_var,norm.weight,weight} | (512,) (512,) (512,) (512,) (512,512,3,3) |
| res5.2.conv3.* | res5.2.conv3.{norm.bias,norm.running_mean,norm.running_var,norm.weight,weight} | (2048,) (2048,) (2048,) (2048,) (2048,512,1,1) |
| stem.conv1.* | stem.conv1.{norm.bias,norm.running_mean,norm.running_var,norm.weight,weight} | (64,) (64,) (64,) (64,) (64,3,7,7) |
[08/09 16:13:14 fvcore.common.checkpoint]: Some model parameters or buffers are not found in the checkpoint:
criterion.empty_weight
sem_seg_head.pixel_decoder.adapter_1.norm.{bias, weight}
sem_seg_head.pixel_decoder.adapter_1.weight
sem_seg_head.pixel_decoder.adapter_2.norm.{bias, weight}
sem_seg_head.pixel_decoder.adapter_2.weight
sem_seg_head.pixel_decoder.adapter_3.norm.{bias, weight}
sem_seg_head.pixel_decoder.adapter_3.weight
sem_seg_head.pixel_decoder.layer_1.norm.{bias, weight}
sem_seg_head.pixel_decoder.layer_1.weight
sem_seg_head.pixel_decoder.layer_2.norm.{bias, weight}
sem_seg_head.pixel_decoder.layer_2.weight
sem_seg_head.pixel_decoder.layer_3.norm.{bias, weight}
sem_seg_head.pixel_decoder.layer_3.weight
sem_seg_head.pixel_decoder.layer_4.norm.{bias, weight}
sem_seg_head.pixel_decoder.layer_4.weight
sem_seg_head.pixel_decoder.mask_features.{bias, weight}
sem_seg_head.predictor.class_embed.{bias, weight}
sem_seg_head.predictor.input_proj.{bias, weight}
sem_seg_head.predictor.mask_embed.layers.0.{bias, weight}
sem_seg_head.predictor.mask_embed.layers.1.{bias, weight}
sem_seg_head.predictor.mask_embed.layers.2.{bias, weight}
sem_seg_head.predictor.query_embed.weight
sem_seg_head.predictor.transformer.decoder.layers.0.linear1.{bias, weight}
sem_seg_head.predictor.transformer.decoder.layers.0.linear2.{bias, weight}
sem_seg_head.predictor.transformer.decoder.layers.0.multihead_attn.out_proj.{bias, weight}
sem_seg_head.predictor.transformer.decoder.layers.0.multihead_attn.{in_proj_bias, in_proj_weight}
sem_seg_head.predictor.transformer.decoder.layers.0.norm1.{bias, weight}
sem_seg_head.predictor.transformer.decoder.layers.0.norm2.{bias, weight}
sem_seg_head.predictor.transformer.decoder.layers.0.norm3.{bias, weight}
sem_seg_head.predictor.transformer.decoder.layers.0.self_attn.out_proj.{bias, weight}
sem_seg_head.predictor.transformer.decoder.layers.0.self_attn.{in_proj_bias, in_proj_weight}
sem_seg_head.predictor.transformer.decoder.layers.1.linear1.{bias, weight}
sem_seg_head.predictor.transformer.decoder.layers.1.linear2.{bias, weight}
sem_seg_head.predictor.transformer.decoder.layers.1.multihead_attn.out_proj.{bias, weight}
sem_seg_head.predictor.transformer.decoder.layers.1.multihead_attn.{in_proj_bias, in_proj_weight}
sem_seg_head.predictor.transformer.decoder.layers.1.norm1.{bias, weight}
sem_seg_head.predictor.transformer.decoder.layers.1.norm2.{bias, weight}
sem_seg_head.predictor.transformer.decoder.layers.1.norm3.{bias, weight}
sem_seg_head.predictor.transformer.decoder.layers.1.self_attn.out_proj.{bias, weight}
sem_seg_head.predictor.transformer.decoder.layers.1.self_attn.{in_proj_bias, in_proj_weight}
sem_seg_head.predictor.transformer.decoder.layers.2.linear1.{bias, weight}
sem_seg_head.predictor.transformer.decoder.layers.2.linear2.{bias, weight}
sem_seg_head.predictor.transformer.decoder.layers.2.multihead_attn.out_proj.{bias, weight}
sem_seg_head.predictor.transformer.decoder.layers.2.multihead_attn.{in_proj_bias, in_proj_weight}
sem_seg_head.predictor.transformer.decoder.layers.2.norm1.{bias, weight}
sem_seg_head.predictor.transformer.decoder.layers.2.norm2.{bias, weight}
sem_seg_head.predictor.transformer.decoder.layers.2.norm3.{bias, weight}
sem_seg_head.predictor.transformer.decoder.layers.2.self_attn.out_proj.{bias, weight}
sem_seg_head.predictor.transformer.decoder.layers.2.self_attn.{in_proj_bias, in_proj_weight}
sem_seg_head.predictor.transformer.decoder.layers.3.linear1.{bias, weight}
sem_seg_head.predictor.transformer.decoder.layers.3.linear2.{bias, weight}
sem_seg_head.predictor.transformer.decoder.layers.3.multihead_attn.out_proj.{bias, weight}
sem_seg_head.predictor.transformer.decoder.layers.3.multihead_attn.{in_proj_bias, in_proj_weight}
sem_seg_head.predictor.transformer.decoder.layers.3.norm1.{bias, weight}
sem_seg_head.predictor.transformer.decoder.layers.3.norm2.{bias, weight}
sem_seg_head.predictor.transformer.decoder.layers.3.norm3.{bias, weight}
sem_seg_head.predictor.transformer.decoder.layers.3.self_attn.out_proj.{bias, weight}
sem_seg_head.predictor.transformer.decoder.layers.3.self_attn.{in_proj_bias, in_proj_weight}
sem_seg_head.predictor.transformer.decoder.layers.4.linear1.{bias, weight}
sem_seg_head.predictor.transformer.decoder.layers.4.linear2.{bias, weight}
sem_seg_head.predictor.transformer.decoder.layers.4.multihead_attn.out_proj.{bias, weight}
sem_seg_head.predictor.transformer.decoder.layers.4.multihead_attn.{in_proj_bias, in_proj_weight}
sem_seg_head.predictor.transformer.decoder.layers.4.norm1.{bias, weight}
sem_seg_head.predictor.transformer.decoder.layers.4.norm2.{bias, weight}
sem_seg_head.predictor.transformer.decoder.layers.4.norm3.{bias, weight}
sem_seg_head.predictor.transformer.decoder.layers.4.self_attn.out_proj.{bias, weight}
sem_seg_head.predictor.transformer.decoder.layers.4.self_attn.{in_proj_bias, in_proj_weight}
sem_seg_head.predictor.transformer.decoder.layers.5.linear1.{bias, weight}
sem_seg_head.predictor.transformer.decoder.layers.5.linear2.{bias, weight}
sem_seg_head.predictor.transformer.decoder.layers.5.multihead_attn.out_proj.{bias, weight}
sem_seg_head.predictor.transformer.decoder.layers.5.multihead_attn.{in_proj_bias, in_proj_weight}
sem_seg_head.predictor.transformer.decoder.layers.5.norm1.{bias, weight}
sem_seg_head.predictor.transformer.decoder.layers.5.norm2.{bias, weight}
sem_seg_head.predictor.transformer.decoder.layers.5.norm3.{bias, weight}
sem_seg_head.predictor.transformer.decoder.layers.5.self_attn.out_proj.{bias, weight}
sem_seg_head.predictor.transformer.decoder.layers.5.self_attn.{in_proj_bias, in_proj_weight}
sem_seg_head.predictor.transformer.decoder.norm.{bias, weight}
[08/09 16:13:14 fvcore.common.checkpoint]: The checkpoint state_dict contains keys that are not used by the model:
stem.fc.{bias, weight}
[08/09 16:13:14 d2.engine.train_loop]: Starting training from iteration 0
/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [0,0,0], thread: [67,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [0,0,0], thread: [71,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [0,0,0], thread: [75,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [0,0,0], thread: [79,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [0,0,0], thread: [83,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [0,0,0], thread: [87,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [0,0,0], thread: [91,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [0,0,0], thread: [95,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [0,0,0], thread: [99,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [0,0,0], thread: [103,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [0,0,0], thread: [107,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [0,0,0], thread: [111,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [0,0,0], thread: [115,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [0,0,0], thread: [119,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [0,0,0], thread: [123,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [0,0,0], thread: [127,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [0,0,0], thread: [3,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [0,0,0], thread: [7,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [0,0,0], thread: [11,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [0,0,0], thread: [15,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [0,0,0], thread: [19,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [0,0,0], thread: [23,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [0,0,0], thread: [27,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [0,0,0], thread: [31,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [0,0,0], thread: [35,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [0,0,0], thread: [39,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [0,0,0], thread: [43,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [0,0,0], thread: [47,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [0,0,0], thread: [51,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [0,0,0], thread: [55,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [0,0,0], thread: [59,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [0,0,0], thread: [63,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
ERROR [08/09 16:13:15 d2.engine.train_loop]: Exception during training:
Traceback (most recent call last):
File "/detectron2_repo/detectron2/engine/train_loop.py", line 138, in train
self.run_step()
File "/detectron2_repo/detectron2/engine/defaults.py", line 441, in run_step
self._trainer.run_step()
File "/detectron2_repo/detectron2/engine/train_loop.py", line 232, in run_step
loss_dict = self.model(data)
File "/.pyenv/versions/3.8.6/lib/python3.8/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
result = self.forward(*input, **kwargs)
File "/Study-MaskFormer/mask_former/mask_former_model.py", line 180, in forward
losses = self.criterion(outputs, targets)
File "/.pyenv/versions/3.8.6/lib/python3.8/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
result = self.forward(*input, **kwargs)
File "/Study-MaskFormer/mask_former/modeling/criterion.py", line 162, in forward
indices = self.matcher(outputs_without_aux, targets)
File "/.pyenv/versions/3.8.6/lib/python3.8/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
result = self.forward(*input, **kwargs)
File "/.pyenv/versions/3.8.6/lib/python3.8/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
return func(*args, **kwargs)
File "/Study-MaskFormer/mask_former/modeling/matcher.py", line 163, in forward
return self.memory_efficient_forward(outputs, targets)
File "/.pyenv/versions/3.8.6/lib/python3.8/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
return func(*args, **kwargs)
File "/Study-MaskFormer/mask_former/modeling/matcher.py", line 123, in memory_efficient_forward
cost_mask = batch_sigmoid_focal_loss(out_mask, tgt_mask)
File "/Study-MaskFormer/mask_former/modeling/matcher.py", line 49, in batch_sigmoid_focal_loss
focal_pos = ((1 - prob) ** gamma) * F.binary_cross_entropy_with_logits(
File "/.pyenv/versions/3.8.6/lib/python3.8/site-packages/torch/tensor.py", line 528, in __rsub__
return _C._VariableFunctions.rsub(self, other)
RuntimeError: CUDA error: device-side assert triggered
[08/09 16:13:15 d2.engine.hooks]: Total training time: 0:00:00 (0:00:00 on hooks)
[08/09 16:13:15 d2.utils.events]: iter: 0 lr: N/A max_mem: 1009M
Traceback (most recent call last):
File "train_net.py", line 264, in <module>
launch(
File "/detectron2_repo/detectron2/engine/launch.py", line 62, in launch
main_func(*args)
File "train_net.py", line 258, in main
return trainer.train()
File "/detectron2_repo/detectron2/engine/defaults.py", line 431, in train
super().train(self.start_iter, self.max_iter)
File "/detectron2_repo/detectron2/engine/train_loop.py", line 138, in train
self.run_step()
File "/detectron2_repo/detectron2/engine/defaults.py", line 441, in run_step
self._trainer.run_step()
File "/detectron2_repo/detectron2/engine/train_loop.py", line 232, in run_step
loss_dict = self.model(data)
File "/.pyenv/versions/3.8.6/lib/python3.8/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
result = self.forward(*input, **kwargs)
File "/Study-MaskFormer/mask_former/mask_former_model.py", line 180, in forward
losses = self.criterion(outputs, targets)
File "/.pyenv/versions/3.8.6/lib/python3.8/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
result = self.forward(*input, **kwargs)
File "/Study-MaskFormer/mask_former/modeling/criterion.py", line 162, in forward
indices = self.matcher(outputs_without_aux, targets)
File "/.pyenv/versions/3.8.6/lib/python3.8/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
result = self.forward(*input, **kwargs)
File "/.pyenv/versions/3.8.6/lib/python3.8/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
return func(*args, **kwargs)
File "/Study-MaskFormer/mask_former/modeling/matcher.py", line 163, in forward
return self.memory_efficient_forward(outputs, targets)
File "/.pyenv/versions/3.8.6/lib/python3.8/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
return func(*args, **kwargs)
File "/Study-MaskFormer/mask_former/modeling/matcher.py", line 123, in memory_efficient_forward
cost_mask = batch_sigmoid_focal_loss(out_mask, tgt_mask)
File "/Study-MaskFormer/mask_former/modeling/matcher.py", line 49, in batch_sigmoid_focal_loss
focal_pos = ((1 - prob) ** gamma) * F.binary_cross_entropy_with_logits(
File "/.pyenv/versions/3.8.6/lib/python3.8/site-packages/torch/tensor.py", line 528, in __rsub__
return _C._VariableFunctions.rsub(self, other)
RuntimeError: CUDA error: device-side assert triggered
docker@a68944098dc2:/Study-MaskFormer$
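
In case it helps narrow this down: since the failing kernel is an indexing op ("index out of bounds"), my guess (not verified) is that some target label falls outside the expected class range, which is a common way to trigger this device-side assert. Two checks I could run: rerun with CUDA_LAUNCH_BLOCKING=1 so the stack trace points at the op that actually fails, and scan the ADE20K label maps for class indices outside [0, 149] other than the ignore value 255. A minimal sketch of the label check follows; the annotation directory is only an assumption based on the usual MaskFormer dataset layout, so it may need adjusting.

CUDA_LAUNCH_BLOCKING=1 python train_net.py --config-file configs/ade20k-150/maskformer_R50_bs16_160k.yaml --num-gpus 1 SOLVER.IMS_PER_BATCH 2 SOLVER.BASE_LR 0.0001

# check_labels.py -- hedged sketch, assumes labels are stored as PNGs under
# datasets/ADEChallengeData2016/annotations_detectron2/training (unverified path)
import glob
import numpy as np
from PIL import Image

label_dir = "datasets/ADEChallengeData2016/annotations_detectron2/training"
bad_files = 0
for path in glob.glob(f"{label_dir}/*.png"):
    labels = np.asarray(Image.open(path))
    values = np.unique(labels)
    # Values above 149 are out of range for 150 classes; 255 is the ignore label.
    outside = values[(values > 149) & (values != 255)]
    if outside.size:
        bad_files += 1
        print(path, outside)
print(f"{bad_files} files contain labels outside [0, 149] (ignoring 255)")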