
xview2's Introduction

Problem description

This repository lets you train models for the xView2 challenge, whose goal is an accurate and efficient model for building localization and damage assessment from satellite imagery. For building localization, each predicted pixel value must be either 0 (no building) or 1 (building). For building damage classification, the values are 1 (undamaged building), 2 (building with minor damage), 3 (building with major damage), or 4 (destroyed building).
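As a rough illustration of the label encoding above, the following Python sketch collapses a damage mask into a localization mask. It assumes that background pixels in a damage mask are 0, which the text above does not state, and the names `DAMAGE_CLASSES` and `damage_to_localization` are hypothetical rather than taken from this repository.

```python
# Damage class values as listed in the problem description.
DAMAGE_CLASSES = {
    1: "undamaged building",
    2: "minor damage",
    3: "major damage",
    4: "destroyed building",
}


def damage_to_localization(damage_mask):
    """Map a damage mask to a building mask: 1 where any building is present, else 0.

    Assumption: a value of 0 in the damage mask means "no building".
    """
    return [[1 if pixel in DAMAGE_CLASSES else 0 for pixel in row] for row in damage_mask]


if __name__ == "__main__":
    example = [[0, 1, 1], [0, 4, 2]]
    print(damage_to_localization(example))
```

Plain nested lists are used here only to keep the sketch dependency-free; real masks would normally be arrays.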

[Figure: image, label, and prediction]

Methods

The following options can be used to train U-Net models:

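  • U-Net encoders: resnet50, resnet101, resnet152, resnest50, resnest101, resnest200, resnest269
  • Loss functions: dice, focal, ce, ohem, mse, coral (can be summed, e.g. focal+dice)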
  • U-Net variants for damage assessment:
    • Siamese - share weights for pre- and post-disaster images; two variants - with shared encoder only or with shared encoder and decoder
    • Fused - use two U-Nets with fused blocks to aggregate context from pre- and post-disaster images; two variants - with fused encoder only or with fused encoder and decoder
    • Parallel - use two U-Nets for pre- and post-disaster images separately; two variants - with parallel encoder only or with parallel encoder and decoder
    • Concatenated - use a 6-channel input, i.e. the concatenation of pre- and post-disaster images
  • Deep Supervision
  • Attention
  • Pyramid Parsing Module
  • Atrous Spatial Pyramid Pooling Module
  • Test time augmentation
  • Supported optimizers: SGD, Adam, AdamW, RAdam, AdaBelief, AdaBound, AdamP, NovoGrad
  • Supported learning rate scheduler: Noam
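To make the scheduler option more concrete, here is a minimal sketch of one common warmup schedule that goes by the Noam name: a linear warmup from an initial rate to a target rate, followed by exponential decay to a final rate. This is an assumption about the shape of the schedule, not a copy of the repository's implementation, and the function `noam_lr` and its step-based parameters are hypothetical (the script's own options are expressed in epochs).

```python
def noam_lr(step, warmup_steps, total_steps, init_lr, max_lr, final_lr):
    """Learning rate at a given 0-indexed optimizer step.

    Assumed shape: linear warmup (init_lr -> max_lr), then exponential
    decay (max_lr -> final_lr) over the remaining steps.
    """
    if warmup_steps > 0 and step < warmup_steps:
        # Linear interpolation between init_lr and max_lr.
        return init_lr + (max_lr - init_lr) * step / warmup_steps
    # Per-step decay factor chosen so that the rate reaches final_lr at total_steps.
    decay_steps = max(total_steps - warmup_steps, 1)
    gamma = (final_lr / max_lr) ** (1.0 / decay_steps)
    return max_lr * gamma ** (step - warmup_steps)


if __name__ == "__main__":
    for step in (0, 5, 10, 55, 100):
        print(step, noam_lr(step, 10, 100, 1e-4, 1e-3, 1e-5))
```

In this sketch `max_lr` plays the role of the target learning rate, and the warmup length in steps would be the number of warmup epochs multiplied by the number of batches per epoch.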

The usage section contains the full list of available options, and the examples section shows a few commands for launching training and evaluation.

Dataset

The dataset used in the contest is called xBD and contains 22,068 RGB images, each 1024x1024 pixels (see the xBD paper for more details). The data can be downloaded from the xView2 challenge website (registration required).

This repository assumes the following data layout:

/data
 ├── train
 │      ├── images
 │      │      └── <image_id>.png
 │      │      └── ...
 │      └── targets
 │             └── <image_id>.png
 │             └── ...
 ├── test
 │      ├── images
 │      │      └── <image_id>.png
 │      │      └── ...
 │      └── targets
 │             └── <image_id>.png
 │             └── ...
 └── holdout
        ├── images
        │      └── <image_id>.png
        │      └── ...
        └── targets
               └── <image_id>.png
               └── ...

Targets are expected as PNG files, so JSON labels have to be converted first. For example, to convert the JSON files within the DATA_PATH/train directory, use:

python utils/convert2png.py --data DATA_PATH/train
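After conversion, it can help to confirm that every split has the same, non-zero number of images and targets; a traceback reported in the issues suggests the data loader asserts exactly this. The following is a minimal sketch under the layout shown above. The helper `count_split` is hypothetical and not part of this repository.

```python
from pathlib import Path


def count_split(data_root, split):
    """Return (number of image PNGs, number of target PNGs) for one split."""
    split_dir = Path(data_root) / split
    num_images = len(list((split_dir / "images").glob("*.png")))
    num_targets = len(list((split_dir / "targets").glob("*.png")))
    return num_images, num_targets


if __name__ == "__main__":
    for split in ("train", "test", "holdout"):
        num_images, num_targets = count_split("/data", split)
        ok = num_images == num_targets and num_images > 0
        print(f"{split}: {num_images} images, {num_targets} targets, ok={ok}")
```

A missing directory simply counts as zero files here, so a mistyped data path shows up as `ok=False` rather than as an exception.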

Installation

The repository contains a Dockerfile that handles all required dependencies. To prepare the environment:

  1. Clone repository: git clone https://github.com/michal2409/xview2 && cd xview2

  2. Build docker image: docker build -t xview2 .

  3. Run docker container:

docker run -it --rm --gpus all --shm-size=8g --ulimit memlock=-1 --ulimit stack=67108864 -v RESULTS_PATH:/results -v DATA_PATH:/data xview2 bash

where

  • DATA_PATH is the path to the xBD directory, laid out as described in the dataset section
  • RESULTS_PATH is the path to the directory for artifacts such as checkpoints, log output, and predictions

Usage

Here are the options for the main.py script:

usage: python main.py [--optional arguments] 

optional arguments:
  -h, --help            show this help message and exit
  --exec_mode {train,eval}
                        Execution mode of main script
  --data                Path to the data directory
  --results             Path to the results directory
  --gpus                Number of gpus to use
  --num_workers         Number of subprocesses to use for data loading
  --batch_size          Training batch size
  --val_batch_size      Evaluation batch size
  --precision {16,32}   Numerical precision
  --epochs              Max number of epochs      
  --patience            Early stopping patience
  --ckpt                Path to pretrained checkpoint
  --logname             Name of logging file
  --ckpt_pre            Path to pretrained checkpoint of localization model used to initialize network for damage assessment
  --type {pre,post}     Type of task to run; pre - localization, post - damage assessment
  --seed                Random seed
  --optimizer {sgd,adam,adamw,radam,adabelief,adabound,adamp,novograd}
  --dmg_model {siamese,siameseEnc,fused,fusedEnc,parallel,parallelEnc,diff,cat}
                        U-Net variant for damage assessment task
  --encoder {resnest50,resnest101,resnest200,resnest269,resnet50,resnet101,resnet152}
                        U-Net encoder
  --loss_str            String used to create the loss function, e.g. focal+dice creates a loss that is the sum of focal and dice.
                        Available functions: dice, focal, ce, ohem, mse, coral
  --use_scheduler       Enable Noam learning rate scheduler
  --warmup              Warmup epochs for Noam learning rate scheduler
  --init_lr             Initial learning rate for Noam scheduler
  --final_lr            Final learning rate for Noam scheduler
  --lr                  Learning rate, or a target learning rate for Noam scheduler
  --weight_decay        Weight decay (L2 penalty)
  --momentum            Momentum for SGD optimizer
  --dilation {1,2,4}    Dilation rate for an encoder, e.g. dilation=2 uses dilation instead of stride in the last encoder block
  --tta                 Enable test time augmentation
  --ppm                 Use pyramid pooling module
  --aspp                Use atrous spatial pyramid pooling
  --no_skip             Disable skip connections in UNet
  --deep_supervision    Enable deep supervision
  --attention           Enable attention module at the decoder
  --autoaugment         Use ImageNet autoaugment pipeline
  --interpolate         Interpolate feature map from encoder without a decoder
  --dec_interp          Use interpolation instead of transposed convolution in a decoder
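The --loss_str option describes a loss built as a sum of named terms. A minimal sketch of how such a string could be parsed and turned into a combined loss is shown below. The list of names comes from the help text above, while `parse_loss_str`, `build_loss`, and the `registry` argument are hypothetical and do not reflect the repository's actual code.

```python
# Loss names listed in the --loss_str help text.
AVAILABLE_LOSSES = ("dice", "focal", "ce", "ohem", "mse", "coral")


def parse_loss_str(loss_str):
    """Split a string such as 'focal+dice' into validated loss names."""
    names = [name.strip() for name in loss_str.split("+")]
    unknown = [name for name in names if name not in AVAILABLE_LOSSES]
    if unknown:
        raise ValueError(f"Unknown loss function(s): {unknown}")
    return names


def build_loss(loss_str, registry):
    """Return a callable that sums the losses named in loss_str.

    registry maps a loss name to a callable taking (prediction, target).
    """
    loss_fns = [registry[name] for name in parse_loss_str(loss_str)]

    def total_loss(prediction, target):
        return sum(fn(prediction, target) for fn in loss_fns)

    return total_loss


if __name__ == "__main__":
    # Toy stand-ins for real loss functions, for illustration only.
    toy_registry = {"focal": lambda p, t: 1.0, "dice": lambda p, t: 2.0}
    print(build_loss("focal+dice", toy_registry)(None, None))
```

Validating the names up front means a typo in the option fails immediately instead of partway through training.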

Examples

To train the building localization task with the resnest200 encoder, cross entropy + dice loss, deep supervision, attention and test time augmentation on 1 GPU with batch size 16 for training and 8 for evaluation, launch:

python main.py --type pre --encoder resnest200 --loss_str ce+dice --deep_supervision --attention --tta --gpus 1 --batch_size 16 --val_batch_size 8

To train the building damage assessment task with the siamese version of U-Net, resnest200 encoder, focal + dice loss, deep supervision, attention and test time augmentation on 8 GPUs with batch size 16 for training and 8 for evaluation, launch:

python main.py --type post --dmg_model siamese --encoder resnest200 --loss_str focal+dice --attention --deep_supervision --tta --gpus 8 --batch_size 16 --val_batch_size 8 

To run inference with batch size 8 on the test set, launch:

python main.py --exec_mode eval --type {pre,post} --ckpt <path/to/checkpoint> --gpus 1 --val_batch_size 8

To post-process the predictions saved during inference, launch:

python utils/post_process.py

To get the final score, launch:

python utils/xview2_metrics.py /results/predictions /results/targets /results/score.json && cat /results/score.json
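For orientation, the publicly described xView2 challenge score combines a localization F1 and a damage F1, where the damage F1 is the harmonic mean of the four per-class F1 scores. The sketch below reproduces that formula. Whether utils/xview2_metrics.py computes it in exactly this way is an assumption, and the function names are hypothetical.

```python
def harmonic_mean(values, eps=1e-6):
    """Harmonic mean with a small epsilon to avoid division by zero."""
    return len(values) / sum(1.0 / (value + eps) for value in values)


def xview2_score(localization_f1, damage_f1_per_class):
    """Weighted score: 30% localization F1 plus 70% damage F1."""
    damage_f1 = harmonic_mean(damage_f1_per_class)
    return 0.3 * localization_f1 + 0.7 * damage_f1


if __name__ == "__main__":
    # Illustrative inputs only, not results from this repository.
    print(xview2_score(0.8, [0.9, 0.5, 0.7, 0.8]))
```

Because of the harmonic mean, a single damage class with an F1 near zero pulls the damage term close to zero regardless of the other classes.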


xview2's Issues

assert issue

I have downloaded the dataset from the xView2 website:
/data
 ├── train
 │      ├── images (contains 5598 images)
 │      │      └── <image_id>.png
 │      │      └── ...
 │      └── targets (contains 5598 images)
 │             └── <image_id>.png
 │             └── ...
 ├── test
 │      ├── images (contains 1866 images)
 │      │      └── <image_id>.png
 │      │      └── ...
 │      └── targets (contains 1866 images)
 │             └── <image_id>.png
 │             └── ...
 └── holdout
        ├── images (contains 1866 images)
        │      └── <image_id>.png
        │      └── ...
        └── targets (contains 1866 images)
               └── <image_id>.png
               └── ...

File "/workspace/xview2/data_loading/pytorch_loader.py", line 55, in __init__
    self.imgs_pre, self.lbls_pre = load_data(path, "pre")
  File "/workspace/xview2/data_loading/pytorch_loader.py", line 35, in load_data
    assert len(imgs) == len(lbls) and len(imgs) > 0

I am getting this error. Why were these assert statements added?

pynvml.nvml.NVMLError_NotSupported: Not Supported

When I run main.py, I get this traceback:
C:\ProgramData\Anaconda3\python.exe D:/ChromeDownloads/xview2-master/xview2-master/main.py
Traceback (most recent call last):
  File "D:/ChromeDownloads/xview2-master/xview2-master/main.py", line 65, in <module>
    affinity = set_affinity(os.getenv("LOCAL_RANK", "0"), "socket_unique_interleaved")
  File "D:\ChromeDownloads\xview2-master\xview2-master\utils\gpu_affinity.py", line 139, in set_affinity
    set_socket_unique_affinity(gpu_id, world_size, "interleaved")
  File "D:\ChromeDownloads\xview2-master\xview2-master\utils\gpu_affinity.py", line 81, in set_socket_unique_affinity
    socket_affinities = [dev.getCpuAffinity() for dev in device_ids]
  File "D:\ChromeDownloads\xview2-master\xview2-master\utils\gpu_affinity.py", line 81, in <listcomp>
    socket_affinities = [dev.getCpuAffinity() for dev in device_ids]
  File "D:\ChromeDownloads\xview2-master\xview2-master\utils\gpu_affinity.py", line 34, in getCpuAffinity
    for j in pynvml.nvmlDeviceGetCpuAffinity(self.handle, device._nvml_affinity_elements):
  File "C:\ProgramData\Anaconda3\lib\site-packages\pynvml\nvml.py", line 989, in nvmlDeviceGetCpuAffinity
    check_return(ret)
  File "C:\ProgramData\Anaconda3\lib\site-packages\pynvml\nvml.py", line 366, in check_return
    raise NVMLError(ret)
pynvml.nvml.NVMLError_NotSupported: Not Supported

Process finished with exit code 1
Thanks very much!
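The traceback above fails inside the CPU affinity query, which NVML documents as a Linux-only feature, so the call is expected to raise on Windows. One possible workaround, shown here only as a sketch and not as a fix confirmed by the maintainers, is to treat affinity pinning as optional and continue without it when the call fails. The wrapper `try_set_affinity` is hypothetical. It takes the affinity function as an argument so that the sketch runs without pynvml installed.

```python
def try_set_affinity(set_affinity_fn, *args):
    """Call set_affinity_fn(*args); return its result, or None if it raises.

    In a real script the except clause could be narrowed to pynvml.NVMLError.
    """
    try:
        return set_affinity_fn(*args)
    except Exception as exc:  # broad on purpose: this is an optional optimization
        print(f"CPU affinity not set ({exc!r}); continuing without it.")
        return None


if __name__ == "__main__":
    def unsupported(*_args):
        # Stand-in for a platform where the affinity query is not supported.
        raise RuntimeError("Not Supported")

    print(try_set_affinity(unsupported, "0", "socket_unique_interleaved"))
```

Skipping affinity pinning should only affect data loading performance, not correctness, although that too is an assumption about how the script uses the returned value.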

idx issue

After starting the Docker container, running a command to train a model gives this error:

Traceback (most recent call last):
  File "main.py", line 10, in <module>
    from data_loading.data_module import DataModule
  File "/workspace/xview2/data_loading/data_module.py", line 5, in <module>
    from data_loading.pytorch_loader import fetch_pytorch_loader
  File "/workspace/xview2/data_loading/pytorch_loader.py", line 5, in <module>
    import albumentations as A
  File "/opt/conda/lib/python3.8/site-packages/albumentations/__init__.py", line 5, in <module>
    from .core.composition import *
  File "/opt/conda/lib/python3.8/site-packages/albumentations/core/composition.py", line 8, in <module>
    from albumentations.augmentations.keypoints_utils import KeypointsProcessor
  File "/opt/conda/lib/python3.8/site-packages/albumentations/augmentations/__init__.py", line 4, in <module>
    from .functional import *
  File "/opt/conda/lib/python3.8/site-packages/albumentations/augmentations/functional.py", line 7, in <module>
    import cv2
  File "/opt/conda/lib/python3.8/site-packages/cv2/__init__.py", line 5, in <module>
    from .cv2 import *

I resolved this issue with the command:

apt-get install libgl1

But then I get an index error. After printing idx in the TrainPreDataset.__getitem__ function, this is the output:

Epoch 0:   0%|                                                                                 | 0/652 [00:00<?, ?it/s]
------>idx:  7880
------>idx:  3611
------>idx:  2916
------>idx:  6289
------>idx:  8301
------>idx:  4774
------>idx:  2625
------>idx:  7334
------>idx:  7379
------>idx:  8538
------>idx:  5376
------>idx:  1684
------>idx:  4964
------>idx:  1204
------>idx:  4311
------>idx:  452
Traceback (most recent call last):
  File "main.py", line 114, in <module>
------>idx:  557
    trainer.fit(model, data_module)

These idx values are read from utils/index.csv, which contains 8566 entries, but my data contains only 8130 images. I don't know how this CSV file is used. Is there any solution?
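The mismatch described above (indices up to 8566 against 8130 images) can be checked directly. The sketch below separates indices that fit the dataset from those that do not. It is a diagnostic aid under the assumption that the indices are plain integer positions into the image list. It is not a confirmed fix, and `split_indices` is a hypothetical helper.

```python
def split_indices(indices, dataset_size):
    """Partition indices into (valid, out_of_range) for a dataset of the given size."""
    valid = [i for i in indices if 0 <= i < dataset_size]
    out_of_range = [i for i in indices if not 0 <= i < dataset_size]
    return valid, out_of_range


if __name__ == "__main__":
    # A few of the idx values printed in the log above.
    sample = [7880, 3611, 8301, 8538, 452]
    valid, out_of_range = split_indices(sample, dataset_size=8130)
    print("valid:", valid)
    print("out of range:", out_of_range)
```

Any index at or above the dataset size points past the end of the image list, which is consistent with the reported index error.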
