rmi's People

Contributors

mzhaoshuai

rmi's Issues

Effect of resizing image & mask without keeping width-height ratio in the data augmentation

Hello,
How are you?
Thanks for your contribution to this project.
I'm not sure whether the RMI loss works well when the image and mask are resized to the input size (NxN pixels) without keeping the width-height ratio in the data augmentation step.
I am working on an image segmentation project.
My dataset contains many images and masks of different sizes.
The dataloader resizes them to the input size (e.g. 256x256) and feeds them into the model.
So the original width-height ratio of the image and mask is not kept.
Does the RMI loss still work well in this case?
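
For reference, the preprocessing described above can be sketched as follows. This is a minimal NumPy sketch, not the repository's data loader: `resize_nearest` is a hypothetical helper, and nearest-neighbor (floor) sampling is assumed for the mask so that every output value is still a valid class id. An image would normally be resized with bilinear interpolation instead.

```python
import numpy as np


def resize_nearest(array, out_h, out_w):
    """Resize a 2D (H, W) or 3D (H, W, C) array to (out_h, out_w) with
    nearest-neighbor (floor) sampling, ignoring the original aspect ratio."""
    in_h, in_w = array.shape[:2]
    # Map each output row/column back to a source row/column.
    rows = np.minimum((np.arange(out_h) * in_h) // out_h, in_h - 1)
    cols = np.minimum((np.arange(out_w) * in_w) // out_w, in_w - 1)
    return array[rows[:, None], cols[None, :]]


# A 300x500 mask with three classes, squeezed to 256x256.
rng = np.random.default_rng(0)
mask = rng.integers(0, 3, size=(300, 500))
resized = resize_nearest(mask, 256, 256)

print(resized.shape)
print(np.unique(resized))
```

Because no interpolation between labels takes place, the resized mask contains only class ids that were present in the original, even though objects are stretched.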

Negative RMI loss

Out of the box, I'm seeing a negative RMI loss. Is that expected? I'm using the provided Docker image.

save model into /home/dcg-adlr-atao-source.cosmos318/sources/RMI/rmi_model/rmi_re_pascal_r3_pw1_st4_si4_bp513-8_net0-0-0.5_n
Namespace(accumulation_steps=1, backbone='resnet50', base_size=513, batch_size=8, bn_mom=0.0003, checkname='deeplab-resnet', crf_iter_steps=1, crop_size=513, cuda=True, data_dir='/home/dcg-adlr-atao-data.cosmos277/data/PASCAL/2012/VOCdevkit/VOC2012', dataset='pascal', dist_backend='nccl', distributed=False, epochs=23, eval_interval=2, freeze_bn=False, ft=False, gpu_ids=[0], init_global_step=0, init_lr=0.007, local_rank=0, loss_type=2, loss_weight_lambda=0.5, lr_multiplier=10.0, lr_scheduler='poly', main_gpu=0, max_ckpt_nums=15, model_dir='/home/dcg-adlr-atao-source.cosmos318/sources/RMI/rmi_model/rmi_re_pascal_r3_pw1_st4_si4_bp513-8_net0-0-0.5_n', momentum=0.9, multiprocessing_distributed=False, nesterov=False, no_cuda=False, no_val=False, out_stride=16, output_dir='/home/zhaoshuai/models/deeplabv3_cbl_2/', proc_name='rmi_model/rmi_re_pascal_r3_pw1_st4_si4_bp513-8_net0-0-0.5_n', resume='None', rmi_pool_size=4, rmi_pool_stride=4, rmi_pool_way=1, rmi_radius=3, save_ckpt_steps=500, seed=1, seg_model='deeplabv3', slow_start_lr=0.0001, slow_start_steps=1500, start_epoch=0, sync_bn=False, test_batch_size=8, train_split='trainaug', use_balanced_weights=False, use_sbd=False, weight_decay=4e-05, workers=8, world_size=1)
INFO:PyTorch: Using PASCAL VOC dataset, the training batch size 8 and crop size is 513.
Number of image_lists in trainaug: 10582
Number of image_lists in val: 1449
Restore parameters from the /home/atao/.encoding/models/resnet101-2a57e44d.pth
INFO:PyTorch: Using Region Mutual Information Loss.
INFO:PyTorch: Using poly learning rate scheduler!
INFO:PyTorch: Starting Epoch: 0
INFO:PyTorch: Total Epoches: 23
INFO:PyTorch: epoch=1/23, steps=20, loss=-30.94986, learning_rate=0.00019, train_miou=0.02007, px_accuracy=0.22422 (20.791 sec)
INFO:PyTorch: epoch=1/23, steps=40, loss=-31.49414, learning_rate=0.00028, train_miou=0.02949, px_accuracy=0.44150 (15.590 sec)
INFO:PyTorch: epoch=1/23, steps=60, loss=-32.38523, learning_rate=0.00038, train_miou=0.03192, px_accuracy=0.51059 (15.593 sec)
INFO:PyTorch: epoch=1/23, steps=80, loss=-32.04794, learning_rate=0.00047, train_miou=0.03653, px_accuracy=0.55031 (15.519 sec)
INFO:PyTorch: epoch=1/23, steps=100, loss=-30.86725, learning_rate=0.00056, train_miou=0.04386, px_accuracy=0.57324 (15.507 sec)
INFO:PyTorch: epoch=1/23, steps=120, loss=-31.81583, learning_rate=0.00065, train_miou=0.05846, px_accuracy=0.59686 (15.437 sec)
INFO:PyTorch: epoch=1/23, steps=140, loss=-32.34855, learning_rate=0.00074, train_miou=0.07140, px_accuracy=0.62079 (15.426 sec)
...

Not available to download ResNet101 pretrained model.

Hello there,
I tried to download the ResNet101 pretrained model using the model_store.py script,

but the download was not possible.

Could you provide the pretrained model some other way?
Otherwise it is not possible to reproduce your work.
I trained from scratch on the CamVid dataset and got around 60% mIoU with the RMI loss.

Thanks.

The RMI loss provides negative loss

Hello,
First of all, thanks for this work.

I'm currently using the RMI loss in my own segmentation toolbox,
but I found that it produces negative loss values.

I copied all the code from rmi.py and rmi_utils.py and used the RMI loss in place of the cross entropy loss.

Is it normal for the RMI loss to be negative at the beginning of training?

Thank you.
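
A possible explanation, offered as a sketch rather than the authors' answer: the traceback quoted in a later issue shows a term computed as `0.5 * log_det_by_cholesky(...)`, and a log-determinant is negative whenever the determinant is below 1, which happens for covariance matrices with small variances. The matrix below is made up for illustration.

```python
import numpy as np

# Hypothetical 2x2 covariance matrix with small variances (values < 1).
cov = np.array([[0.04, 0.01],
                [0.01, 0.09]])

sign, log_det = np.linalg.slogdet(cov)
half_log_det = 0.5 * log_det

# det(cov) = 0.04 * 0.09 - 0.01 * 0.01 = 0.0035, so its log is negative.
print(sign, half_log_det)
```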

Use CRF as post-process in image segmentation

Hello, thank you for your code.
I have a question: my network output is y_pred with shape (10242, 36), where 10242 is the number of pixels and 36 is the number of categories, so y_pred gives the probability that each pixel belongs to each class. y_true has shape (10242, 36) and is one-hot. How do you use CRF for post-processing?

cholesky_cuda: For batch 0: U(1,1) is zero, singular U.

Great work!

We hit an issue caused by the computation of chol = torch.cholesky(matrix). The error output is pasted below:

RuntimeError:     cholesky_cuda: For batch 0: U(1,1) is zero, singular U.
2020-08-11T12:37:40.000Z /container_e2240_1583898264103_325873_01_000010: rmi_now = 0.5 * log_det_by_cholesky(appro_var + diag_matrix.type_as(appro_var) * _POS_ALPHA)
2020-08-11T12:37:40.000Z /container_e2240_1583898264103_325873_01_000010:       File "/teamdrive/yuyua/code/segmentation/mmsegmentation/mmseg/models/losses/rmi_loss.py", line 118, in log_det_by_cholesky
2020-08-11T12:37:40.000Z /container_e2240_1583898264103_325873_01_000010: chol = torch.cholesky(matrix)
2020-08-11T12:37:40.000Z /container_e2240_1583898264103_325873_01_000010:         chol = torch.cholesky(matrix)
2020-08-11T12:37:40.000Z /container_e2240_1583898264103_325873_01_000010: chol = torch.cholesky(matrix)
2020-08-11T12:37:40.000Z /container_e2240_1583898264103_325873_01_000010: RuntimeError    chol = torch.cholesky(matrix)
2020-08-11T12:37:40.000Z /container_e2240_1583898264103_325873_01_000010: RuntimeError: cholesky_cuda: For batch 0: U(1,1) is zero, singular U.
2020-08-11T12:37:40.000Z /container_e2240_1583898264103_325873_01_000010: : cholesky_cuda: For batch 0: U(1,1) is zero, singular U.
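
The failing call can be reproduced in miniature. The sketch below uses NumPy rather than the `torch.cholesky` call in the traceback, and the `jitter` parameter is a hypothetical stand-in for the `_POS_ALPHA` diagonal term. It shows that the log-determinant follows from the diagonal of the Cholesky factor, and that an exactly singular matrix can be factorized once a small multiple of the identity is added. Whether a larger diagonal term is the right fix for this particular run is an assumption, not something the issue confirms.

```python
import numpy as np


def log_det_by_cholesky(matrix, jitter=0.0):
    """Return log(det(matrix + jitter * I)) for a symmetric positive
    definite matrix, using log det = 2 * sum(log(diag(L)))."""
    eye = np.eye(matrix.shape[-1], dtype=matrix.dtype)
    chol = np.linalg.cholesky(matrix + jitter * eye)
    return 2.0 * np.sum(np.log(np.diagonal(chol, axis1=-2, axis2=-1)), axis=-1)


# Rank-1 matrix: symmetric but singular, so plain Cholesky is expected to fail.
singular = np.array([[1.0, 1.0],
                     [1.0, 1.0]])

try:
    log_det_by_cholesky(singular)
    failed = False
except np.linalg.LinAlgError:
    failed = True

# With a small diagonal term the factorization goes through.
stabilized = log_det_by_cholesky(singular, jitter=1e-3)
print(failed, stabilized)
```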

RMI

import java.io.*;
import java.net.*;

// Server class
class Server {
    public static void main(String[] args)
    {
        ServerSocket server = null;
        try {
            // server is listening on port 1234
            server = new ServerSocket(1234);
            server.setReuseAddress(true);
            // run an infinite loop to accept client requests
            while (true) {
                // socket object to receive incoming client requests
                Socket client = server.accept();
                // report that a new client is connected to the server
                System.out.println("New client connected "
                    + client.getInetAddress().getHostAddress());
                // create a new handler object for this client
                ClientHandler clientSock = new ClientHandler(client);
                // this thread will handle the client separately
                new Thread(clientSock).start();
            }
        }
        catch (IOException e) {
            e.printStackTrace();
        }
        finally {
            if (server != null) {
                try {
                    server.close();
                }
                catch (IOException e) {
                    e.printStackTrace();
                }
            }
        }
    }

    // ClientHandler class
    private static class ClientHandler implements Runnable {
        private final Socket clientSocket;

        // Constructor
        public ClientHandler(Socket socket)
        {
            this.clientSocket = socket;
        }

        public void run()
        {
            PrintWriter out = null;
            BufferedReader in = null;
            try {
                // get the output stream of the client
                out = new PrintWriter(clientSocket.getOutputStream(), true);
                // get the input stream of the client
                in = new BufferedReader(
                    new InputStreamReader(clientSocket.getInputStream()));
                String line;
                while ((line = in.readLine()) != null) {
                    // NOTE: the posted code is cut off at this loop. Echoing
                    // each line back to the client is an assumed completion.
                    out.println(line);
                }
            }
            catch (IOException e) {
                e.printStackTrace();
            }
            finally {
                // close the client socket (assumed cleanup, not in the post)
                try {
                    clientSocket.close();
                }
                catch (IOException e) {
                    e.printStackTrace();
                }
            }
        }
    }
}

Training a DeepLabv3 model with output_stride=16, crop_size=513, and batch_size=8 on two 2080Ti GPUs

Hi, I only have two 2080Ti GPUs with 11 GB of memory each. I'd like to train the baseline DeepLabv3 with ResNet-101 as the backbone and batch_size=8 per GPU (for 2 GPUs, a global batch_size of 16):

input the gpu (seperate by comma (,) ): 0,1
using gpus 0,1

0  --  deeplabv3
1  --  deeplabv3+
2  --  pspnet
choose the base network: 0

0  --  resnet_v1_50
1  --  resnet_v1_101
2  --  resnet_v1_152
choose the base network: 1
The backbone is resnet101
The base model is deeplabv3

0  --  softmax cross entropy loss.
1  --  sigmoid binary cross entropy loss.
2  --  bce and RMI loss.
3  --  Affinity field loss.
5  --  Pyramid loss.
input the loss type of the first stage: 2

0 -- PASCAL VOC2012 dataset
1 -- Cityscapes
2 -- CamVid
input the dataset: 0

input the batch_size (4, 8, 12 or 16): 8
The data dir is /workspace/data/PASCAL_VOC2012/VOCdevkit/VOC2012, the batch size is 8.
make the directory /workspace/pyroom/RMISegLoss/rmi_model/rmi_re_pascal_r3_pw1_st4_si4_bp513-8_net0-1-0.5_n
Namespace(accumulation_steps=1, backbone='resnet101', base_size=513, batch_size=8, bn_mom=0.05, checkname='deeplab-resnet', crf_iter_steps=1, crop_size=513, cuda=True, data_dir='/workspace/data/PASCAL_VOC2012/VOCdevkit/VOC2012', dataset='pascal', dist_backend='nccl', distributed=True, epochs=23, eval_interval=2, freeze_bn=False, ft=False, gpu_ids=[0, 1], init_global_step=0, init_lr=0.007, local_rank=0, loss_type=2, loss_weight_lambda=0.5, lr_multiplier=10.0, lr_scheduler='poly', main_gpu=0, max_ckpt_nums=15, model_dir='/workspace/pyroom/RMISegLoss/rmi_model/rmi_re_pascal_r3_pw1_st4_si4_bp513-8_net0-1-0.5_n', momentum=0.9, multiprocessing_distributed=False, nesterov=False, no_cuda=False, no_val=False, out_stride=16, output_dir='/home/zhaoshuai/models/deeplabv3_cbl_2/', proc_name='rmi_model/rmi_re_pascal_r3_pw1_st4_si4_bp513-8_net0-1-0.5_n', resume='None', rmi_pool_size=4, rmi_pool_stride=4, rmi_pool_way=1, rmi_radius=3, save_ckpt_steps=500, seed=1, seg_model='deeplabv3', slow_start_lr=0.0001, slow_start_steps=1500, start_epoch=0, sync_bn=True, test_batch_size=8, train_split='trainaug', use_balanced_weights=False, use_sbd=False, weight_decay=0.0001, workers=8, world_size=2)
INFO:PyTorch: Using PASCAL VOC dataset, the training batch size 8 and crop size is 513.
Number of image_lists in trainaug: 10582
Number of image_lists in val: 1449
Restore parameters from the /root/.encoding/models/resnet101-2a57e44d.pth
INFO:PyTorch: Using Region Mutual Information Loss.
INFO:PyTorch: The batch norm layer is Hang Zhang's <class 'model.sync_bn.syncbn.BatchNorm2d'>
INFO:PyTorch: Using poly learning rate scheduler!
INFO:PyTorch: Starting Epoch: 0
INFO:PyTorch: Total Epoches: 23

I wonder whether this is equivalent to training a DeepLabv3 model with output_stride=16, crop_size=513, and batch_size=16 on a single TITAN RTX GPU. Will it reach similar convergence in 23 epochs?

Does the batch_size matter? If so, how should I adjust the other hyperparameters, such as epochs, lr, and the lr_scheduler, for batch_size=8?
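
As a rough starting point rather than a tested recipe, two common conventions can be written down. The linear scaling rule (learning rate proportional to the global batch size) is a widely used heuristic, not something this repository documents, and the `power=0.9` exponent of the poly schedule is an assumed default. `scaled_lr` and `poly_lr` are hypothetical helper names; `init_lr=0.007` and the 10582 training images are taken from the log above, while the reference batch size of 16 is an assumption.

```python
def scaled_lr(base_lr, base_batch_size, new_batch_size):
    """Linear scaling heuristic: learning rate proportional to batch size."""
    return base_lr * new_batch_size / base_batch_size


def poly_lr(init_lr, step, max_steps, power=0.9):
    """Poly decay: init_lr * (1 - step / max_steps) ** power."""
    step = min(step, max_steps)
    return init_lr * (1.0 - step / max_steps) ** power


# Assumption: 0.007 was tuned for a global batch size of 16.
lr_for_batch_8 = scaled_lr(0.007, base_batch_size=16, new_batch_size=8)

# 23 epochs over 10582 training images at a global batch size of 8.
max_steps = 23 * (10582 // 8)
for step in (0, max_steps // 2, max_steps):
    print(step, poly_lr(lr_for_batch_8, step, max_steps))
```

Halving the batch size doubles the number of steps per epoch, so keeping the epoch count fixed already stretches the poly schedule over twice as many updates.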

Sample weights for RMI loss

Some augmentations (e.g. rotation by a random angle) leave parts of the image and mask without meaningful content.
To deal with such cases I usually use per-pixel weights (0. for holes, 1. for valid parts) and multiply the per-pixel loss by those weights.

But the RMI loss uses "high dimension points", and the final loss has a shape incompatible with the original labels.

Could you please suggest the best way to exclude such "holes" from the loss (as multiplying by a pixel weight would)?
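
For comparison, the per-pixel weighting described above for an ordinary pixel-wise loss can be sketched as follows. This is only the standard pattern with a hypothetical `weighted_pixel_bce` helper; it does not address how to mask the region-level RMI term, which is the open question here.

```python
import numpy as np


def weighted_pixel_bce(probs, targets, weights, eps=1e-7):
    """Binary cross entropy per pixel, multiplied by per-pixel weights and
    normalized by the total weight."""
    probs = np.clip(probs, eps, 1.0 - eps)
    per_pixel = -(targets * np.log(probs)
                  + (1.0 - targets) * np.log(1.0 - probs))
    return np.sum(per_pixel * weights) / np.maximum(np.sum(weights), eps)


probs = np.array([[0.9, 0.2],
                  [0.6, 0.5]])
targets = np.array([[1.0, 0.0],
                    [1.0, 1.0]])
# 0. marks a "hole" introduced by the augmentation, 1. a valid pixel.
weights = np.array([[1.0, 1.0],
                    [1.0, 0.0]])

loss = weighted_pixel_bce(probs, targets, weights)
print(loss)
```

Changing the prediction at the hole pixel leaves this loss unchanged, which is the behavior the question asks to carry over to the RMI term.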

How to weight some examples if loss may be negative

In most cases we have loss >= 0 with shape [batch_size], and to increase the importance of some examples we multiply the loss by a weight. E.g. loss = [0.1, 0.3], weights = [2., 1.], weighted_loss = [0.2, 0.3]

But how should we do that for the RMI loss, which may be negative?
E.g. loss = [-0.1, -0.3], weights = [2., 1.], weighted_loss = [-0.2, -0.3]. In this example the weighted loss becomes smaller instead of the expected "larger".

Should we multiply the loss by the weights or divide?
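
One way to reason about this, as a sketch rather than an answer from the authors: gradient-based training only sees the gradient of the weighted loss, and multiplying a loss by a positive weight multiplies its gradient by the same weight whatever the sign of the loss value. The toy loss below is hypothetical and chosen so that its values are negative.

```python
def toy_loss(theta):
    """Hypothetical loss that is negative near its minimum at theta = 1."""
    return (theta - 1.0) ** 2 - 5.0


def numeric_grad(fn, theta, h=1e-6):
    """Central finite-difference estimate of d fn / d theta."""
    return (fn(theta + h) - fn(theta - h)) / (2.0 * h)


theta = 0.5
weight = 2.0

plain_grad = numeric_grad(toy_loss, theta)
weighted_grad = numeric_grad(lambda t: weight * toy_loss(t), theta)

# The loss value is negative, yet the weighted gradient is simply
# `weight` times the unweighted one, so the example still counts double.
print(toy_loss(theta), plain_grad, weighted_grad)
```

On this view multiplying by the weight keeps its usual meaning, while dividing would shrink the gradient of the examples meant to be emphasized.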

Benchmarking on Cityscapes

Hello,

Do you have any results on the Cityscapes dataset?

I just wonder whether the RMI loss brings better performance on Cityscapes.

Thank you.

About setting of baseline

Hi, your work is amazing and has helped me a lot!
I noticed that the paper compares DeepLab with your results on different datasets. I ran your code with batch size=16, the DeepLab v3+ model, and a ResNet101 backbone on the VOC2012 dataset, but only got 0.772 mIoU after about 30k iterations (the default setting). Can you tell me how to set the hyper-parameters to get the desired results (78.8 with cross entropy loss and a higher mIoU with your proposed loss)?
This is my result on the val dataset:
[image]
Thank you very much for your excellent work!

The RMI loss does not change too much

I tested the RMI loss with random inputs and found that it does not change much. Is that normal? My test code is as follows.

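import numpy as np
import torch
# RMILoss is defined in this repository's rmi.py; the import path is an assumption.
from rmi import RMILoss

# random logits of shape (batch, classes, height, width) and random labels
logits = np.random.randn(5, 3, 32, 32)
labels = np.random.randint(0, 3, size=(5, 32, 32))

logits = torch.from_numpy(logits.astype(np.float32))
labels = torch.from_numpy(labels.astype(np.int32))

rmiloss = RMILoss(num_classes=3)(logits, labels)
print(rmiloss)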

The intuition behind RMI loss

Since the intuition behind the RMI loss is to model the dependencies among pixels, the improvement in boundary segmentation should not be obvious. Is that right?
