zjulearning / rmi
This is the code for the NeurIPS 2019 paper Region Mutual Information Loss for Semantic Segmentation.
License: MIT License
Hello
How are you?
Thanks for contributing to this project.
I'm not sure whether RMI loss works well when the image and mask are resized to the input size (N x N pixels) without keeping the width-height ratio during data augmentation.
I am working on an image segmentation project.
My dataset contains many images and masks of different sizes.
The dataloader resizes them to the input size (e.g. 256x256) before they are fed into the model,
so the original width-height ratio of the image and mask is not kept.
Does RMI loss still work well in that case?
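One way to sidestep the question is to keep the aspect ratio and pad instead of stretching. The following is a minimal sketch, not code from this repository: `letterbox_geometry` is a hypothetical helper that only computes the resized dimensions and the padding needed to fit an image into a square input. It assumes the mask would be resized with nearest-neighbour interpolation and that the padded area would be filled with an ignore label, so it does not contribute to the loss.

```python
def letterbox_geometry(width, height, target=256):
    """Return (new_w, new_h, pad_left, pad_top, pad_right, pad_bottom)
    for fitting a width x height image into a target x target square
    without changing its aspect ratio (hypothetical helper)."""
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    # Scale so that the longer side exactly matches the target size.
    scale = target / max(width, height)
    new_w = max(1, round(width * scale))
    new_h = max(1, round(height * scale))
    # Split the remaining space as evenly as possible between both sides.
    pad_w = target - new_w
    pad_h = target - new_h
    pad_left, pad_top = pad_w // 2, pad_h // 2
    return new_w, new_h, pad_left, pad_top, pad_w - pad_left, pad_h - pad_top


print(letterbox_geometry(640, 480))
```

The same geometry would be applied to the image and its mask so that they stay aligned.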
Out of the box, I'm seeing a negative RMI loss. Is that expected? I'm using the provided Docker image.
save model into /home/dcg-adlr-atao-source.cosmos318/sources/RMI/rmi_model/rmi_re_pascal_r3_pw1_st4_si4_bp513-8_net0-0-0.5_n
Namespace(accumulation_steps=1, backbone='resnet50', base_size=513, batch_size=8, bn_mom=0.0003, checkname='deeplab-resnet', crf_iter_steps=1, crop_size=513, cuda=True, data_dir='/home/dcg-adlr-atao-data.cosmos277/data/PASCAL/2012/VOCdevkit/VOC2012', dataset='pascal', dist_backend='nccl', distributed=False, epochs=23, eval_interval=2, freeze_bn=False, ft=False, gpu_ids=[0], init_global_step=0, init_lr=0.007, local_rank=0, loss_type=2, loss_weight_lambda=0.5, lr_multiplier=10.0, lr_scheduler='poly', main_gpu=0, max_ckpt_nums=15, model_dir='/home/dcg-adlr-atao-source.cosmos318/sources/RMI/rmi_model/rmi_re_pascal_r3_pw1_st4_si4_bp513-8_net0-0-0.5_n', momentum=0.9, multiprocessing_distributed=False, nesterov=False, no_cuda=False, no_val=False, out_stride=16, output_dir='/home/zhaoshuai/models/deeplabv3_cbl_2/', proc_name='rmi_model/rmi_re_pascal_r3_pw1_st4_si4_bp513-8_net0-0-0.5_n', resume='None', rmi_pool_size=4, rmi_pool_stride=4, rmi_pool_way=1, rmi_radius=3, save_ckpt_steps=500, seed=1, seg_model='deeplabv3', slow_start_lr=0.0001, slow_start_steps=1500, start_epoch=0, sync_bn=False, test_batch_size=8, train_split='trainaug', use_balanced_weights=False, use_sbd=False, weight_decay=4e-05, workers=8, world_size=1)
INFO:PyTorch: Using PASCAL VOC dataset, the training batch size 8 and crop size is 513.
Number of image_lists in trainaug: 10582
Number of image_lists in val: 1449
Restore parameters from the /home/atao/.encoding/models/resnet101-2a57e44d.pth
INFO:PyTorch: Using Region Mutual Information Loss.
INFO:PyTorch: Using poly learning rate scheduler!
INFO:PyTorch: Starting Epoch: 0
INFO:PyTorch: Total Epoches: 23
INFO:PyTorch: epoch=1/23, steps=20, loss=-30.94986, learning_rate=0.00019, train_miou=0.02007, px_accuracy=0.22422 (20.791 sec)
INFO:PyTorch: epoch=1/23, steps=40, loss=-31.49414, learning_rate=0.00028, train_miou=0.02949, px_accuracy=0.44150 (15.590 sec)
INFO:PyTorch: epoch=1/23, steps=60, loss=-32.38523, learning_rate=0.00038, train_miou=0.03192, px_accuracy=0.51059 (15.593 sec)
INFO:PyTorch: epoch=1/23, steps=80, loss=-32.04794, learning_rate=0.00047, train_miou=0.03653, px_accuracy=0.55031 (15.519 sec)
INFO:PyTorch: epoch=1/23, steps=100, loss=-30.86725, learning_rate=0.00056, train_miou=0.04386, px_accuracy=0.57324 (15.507 sec)
INFO:PyTorch: epoch=1/23, steps=120, loss=-31.81583, learning_rate=0.00065, train_miou=0.05846, px_accuracy=0.59686 (15.437 sec)
INFO:PyTorch: epoch=1/23, steps=140, loss=-32.34855, learning_rate=0.00074, train_miou=0.07140, px_accuracy=0.62079 (15.426 sec)
...
Hello there,
I tried to download the ResNet101 pretrained model using the model_store.py script,
but the download did not work.
Could you provide the pretrained model some other way?
Otherwise it is not possible to reproduce your work.
I trained from scratch on the CamVid dataset and got around 60% mIoU using RMI loss.
Thanks.
Hello
How are you?
Thanks for contributing to this project.
Have you looked at this paper?
https://www.mdpi.com/2072-4292/13/3/454
The authors of this paper say that the Potential Energy (PE) Loss on a Gibbs distribution outperforms your RMI loss.
Is that really true?
I think you could easily implement the PE loss and compare it with your RMI loss.
If you implement the PE loss, could you share the code?
Thanks.
Hello,
First of all, thanks for this work.
I'm currently using this RMI loss in my own segmentation toolbox,
but I found that the RMI loss produces negative values.
I just copied all the code from rmi.py and rmi_utils.py, then used the RMI loss instead of cross-entropy loss.
Is it normal for the RMI loss to be negative at the beginning of training?
Thank you.
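A negative value is at least mathematically possible here. Going by the traceback line quoted in another report on this page (`rmi_now = 0.5 * log_det_by_cholesky(...)`), the loss contains half the log-determinant of a covariance-like matrix. A log-determinant is negative whenever the determinant is below 1, which is likely when the variances are small, as they would be for values confined to [0, 1]. The sketch below illustrates this with a hand-written 2x2 example. It is not the repository's code, and `log_det_2x2` is a hypothetical helper.

```python
import math


def log_det_2x2(a, b, c):
    """Log-determinant of the symmetric positive definite matrix
    [[a, b], [b, c]] via its Cholesky factor L:
    log det = 2 * sum(log(diag(L)))."""
    l11 = math.sqrt(a)
    l21 = b / l11
    l22 = math.sqrt(c - l21 * l21)
    return 2.0 * (math.log(l11) + math.log(l22))


# Small variances: the determinant is far below 1, so the value is negative.
small = 0.5 * log_det_2x2(0.05, 0.01, 0.04)
# Large variances: the determinant is above 1, so the value is positive.
large = 0.5 * log_det_2x2(5.0, 1.0, 4.0)
print(small < 0, large > 0)
```

Under this reading, what matters during training is whether the value decreases, not its sign.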
Hello, thank you for your code.
I have a question: my network output is y_pred with shape (10242, 36), where 10242 is the number of pixels and 36 is the number of classes, so y_pred gives the probability that each pixel belongs to each class. y_true has the same shape (10242, 36) and is one-hot encoded. How do I use CRF for post-processing?
Great work!
We ran into an issue caused by the computation of chol = torch.cholesky(matrix). The error output is pasted below:
RuntimeError: cholesky_cuda: For batch 0: U(1,1) is zero, singular U.
2020-08-11T12:37:40.000Z /container_e2240_1583898264103_325873_01_000010: rmi_now = 0.5 * log_det_by_cholesky(appro_var + diag_matrix.type_as(appro_var) * _POS_ALPHA)
2020-08-11T12:37:40.000Z /container_e2240_1583898264103_325873_01_000010: File "/teamdrive/yuyua/code/segmentation/mmsegmentation/mmseg/models/losses/rmi_loss.py", line 118, in log_det_by_cholesky
2020-08-11T12:37:40.000Z /container_e2240_1583898264103_325873_01_000010: chol = torch.cholesky(matrix)
2020-08-11T12:37:40.000Z /container_e2240_1583898264103_325873_01_000010: RuntimeError: cholesky_cuda: For batch 0: U(1,1) is zero, singular U.
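The error indicates that the matrix passed to `torch.cholesky` was not numerically positive definite. A common workaround is to add a small value to the diagonal and retry with a larger one if the factorization still fails. The traceback shows that the loss already adds `_POS_ALPHA` to the diagonal, so a larger value there, or computing in double precision, might help. The pure-Python sketch below shows the retry idea on a tiny matrix. `cholesky` and `cholesky_with_jitter` are hypothetical helpers written for illustration, not functions from PyTorch or this repository.

```python
import math


def cholesky(m):
    """Lower-triangular Cholesky factor of a symmetric matrix (list of lists).
    Raises ValueError if the matrix is not positive definite."""
    n = len(m)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = m[i][j] - sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                if s <= 0.0:
                    raise ValueError("matrix is not positive definite")
                low[i][j] = math.sqrt(s)
            else:
                low[i][j] = s / low[j][j]
    return low


def cholesky_with_jitter(m, jitter=1e-6, max_tries=5):
    """Try the factorization as-is, then add jitter * I to the matrix,
    growing the jitter 10x after every failure.
    Returns (factor, amount_added_to_diagonal)."""
    n = len(m)
    added = 0.0
    for _ in range(max_tries):
        try:
            shifted = [[m[i][j] + (added if i == j else 0.0) for j in range(n)]
                       for i in range(n)]
            return cholesky(shifted), added
        except ValueError:
            added = jitter if added == 0.0 else added * 10.0
    raise ValueError("still not positive definite after adding jitter")


# A singular matrix: the plain factorization fails, the jittered one succeeds.
singular = [[1.0, 1.0], [1.0, 1.0]]
factor, used = cholesky_with_jitter(singular)
print(used)
```

The added value biases the log-determinant slightly, so it is worth keeping as small as the data allows.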
import java.io.*;
import java.net.*;
// Server class
class Server {
public static void main(String[] args)
{
ServerSocket server = null;
try {
// server is listening on port 1234
server = new ServerSocket(1234);
server.setReuseAddress(true);
// running infinite loop for getting
// client request
while (true) {
// socket object to receive incoming client
// requests
Socket client = server.accept();
// Displaying that new client is connected
// to server
System.out.println("New client connected"
Hi, I only have two 2080Ti GPUs with 11 GB of memory each. I'd like to train the baseline deeplabv3 with resnet-101 as the backbone and batch_size=8 per GPU (for 2 GPUs, global batch_size=16):
input the gpu (seperate by comma (,) ): 0,1
using gpus 0,1
0 -- deeplabv3
1 -- deeplabv3+
2 -- pspnet
choose the base network: 0
0 -- resnet_v1_50
1 -- resnet_v1_101
2 -- resnet_v1_152
choose the base network: 1
The backbone is resnet101
The base model is deeplabv3
0 -- softmax cross entropy loss.
1 -- sigmoid binary cross entropy loss.
2 -- bce and RMI loss.
3 -- Affinity field loss.
5 -- Pyramid loss.
input the loss type of the first stage: 2
0 -- PASCAL VOC2012 dataset
1 -- Cityscapes
2 -- CamVid
input the dataset: 0
input the batch_size (4, 8, 12 or 16): 8
The data dir is /workspace/data/PASCAL_VOC2012/VOCdevkit/VOC2012, the batch size is 8.
make the directory /workspace/pyroom/RMISegLoss/rmi_model/rmi_re_pascal_r3_pw1_st4_si4_bp513-8_net0-1-0.5_n
Namespace(accumulation_steps=1, backbone='resnet101', base_size=513, batch_size=8, bn_mom=0.05, checkname='deeplab-resnet', crf_iter_steps=1, crop_size=513, cuda=True, data_dir='/workspace/data/PASCAL_VOC2012/VOCdevkit/VOC2012', dataset='pascal', dist_backend='nccl', distributed=True, epochs=23, eval_interval=2, freeze_bn=False, ft=False, gpu_ids=[0, 1], init_global_step=0, init_lr=0.007, local_rank=0, loss_type=2, loss_weight_lambda=0.5, lr_multiplier=10.0, lr_scheduler='poly', main_gpu=0, max_ckpt_nums=15, model_dir='/workspace/pyroom/RMISegLoss/rmi_model/rmi_re_pascal_r3_pw1_st4_si4_bp513-8_net0-1-0.5_n', momentum=0.9, multiprocessing_distributed=False, nesterov=False, no_cuda=False, no_val=False, out_stride=16, output_dir='/home/zhaoshuai/models/deeplabv3_cbl_2/', proc_name='rmi_model/rmi_re_pascal_r3_pw1_st4_si4_bp513-8_net0-1-0.5_n', resume='None', rmi_pool_size=4, rmi_pool_stride=4, rmi_pool_way=1, rmi_radius=3, save_ckpt_steps=500, seed=1, seg_model='deeplabv3', slow_start_lr=0.0001, slow_start_steps=1500, start_epoch=0, sync_bn=True, test_batch_size=8, train_split='trainaug', use_balanced_weights=False, use_sbd=False, weight_decay=0.0001, workers=8, world_size=2)
INFO:PyTorch: Using PASCAL VOC dataset, the training batch size 8 and crop size is 513.
Number of image_lists in trainaug: 10582
Number of image_lists in val: 1449
Restore parameters from the /root/.encoding/models/resnet101-2a57e44d.pth
INFO:PyTorch: Using Region Mutual Information Loss.
INFO:PyTorch: The batch norm layer is Hang Zhang's <class 'model.sync_bn.syncbn.BatchNorm2d'>
INFO:PyTorch: Using poly learning rate scheduler!
INFO:PyTorch: Starting Epoch: 0
INFO:PyTorch: Total Epoches: 23
I wonder whether this is equivalent to training a DeepLabv3 model with output_stride=16, crop_size=513, and batch_size=16 on a single TITAN RTX GPU. Will it achieve similar convergence in 23 epochs?
Does the batch_size matter? If so, how should I adjust the other hyperparameters for batch_size=8, such as the number of epochs, the learning rate, and the lr_scheduler?
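Batch size usually does matter. A commonly used heuristic, which this repository does not document, is to scale the learning rate linearly with the global batch size. The sketch below combines that heuristic with a poly schedule and a linear warm-up, using the values printed in the `Namespace(...)` line above (`init_lr=0.007`, `slow_start_lr=0.0001`, `slow_start_steps=1500`). The exponent `power=0.9`, the exact warm-up formula, the total of about 30k steps, and the function names are all assumptions, although the resulting values are close to the learning rates in the training log quoted earlier on this page.

```python
def scale_lr(base_lr, base_batch_size, batch_size):
    """Linear scaling heuristic: learning rate proportional to batch size."""
    return base_lr * batch_size / base_batch_size


def poly_lr(step, max_steps, init_lr=0.007, power=0.9,
            slow_start_lr=0.0001, slow_start_steps=1500):
    """Poly decay with a linear warm-up from slow_start_lr (assumed form)."""
    # Clamp at zero so steps past max_steps do not produce invalid values.
    remaining = max(0.0, 1.0 - step / max_steps)
    decayed = init_lr * remaining ** power
    if step < slow_start_steps:
        # Move linearly from slow_start_lr toward the decayed rate.
        return slow_start_lr + (decayed - slow_start_lr) * step / slow_start_steps
    return decayed


# Under the heuristic, halving the global batch size halves the base rate.
print(scale_lr(0.007, 16, 8))
# Learning rate at a few steps, assuming about 30k total steps.
for step in (20, 140, 1500, 30000):
    print(step, round(poly_lr(step, 30000), 5))
```

Whether the number of epochs should change as well depends on the setup; the heuristic only addresses the learning rate.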
Some augmentations (e.g. rotation by a random angle) leave parts of the image and mask without valid content.
To deal with such cases I usually use per-pixel weights (0. for holes, 1. for valid parts) and multiply the per-pixel loss by those weights.
But RMI loss uses "high dimension points", and the final loss has a shape that is incompatible with the original labels.
Could you please suggest the best way to exclude such "holes" from the loss (i.e. to multiply by a pixel weight)?
In most cases, when we have a loss >= 0 with shape [batch_size] and want to increase the importance of some examples, we multiply the loss by a weight. E.g. loss = [0.1, 0.3], weights = [2., 1.], weighted_loss = [0.2, 0.3].
But how should we do that for RMI loss, which may be negative?
E.g. loss = [-0.1, -0.3], weights = [2., 1.], weighted_loss = [-0.2, -0.3]. In this example the weighted loss becomes smaller instead of the expected "larger".
Should we multiply the loss by the weights or divide by them?
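For gradient-based training, the sign of the loss value does not change what a positive weight does: the gradient of `w * loss` is `w` times the gradient of `loss`, so multiplying still increases that example's influence even when the weighted value looks "smaller". The sketch below checks this numerically with a toy loss that is negative at the point of interest. `toy_loss` and `numeric_grad` are hypothetical stand-ins, not part of the RMI code.

```python
def toy_loss(x):
    """Stand-in for a loss that is negative near its minimum at x = 3."""
    return (x - 3.0) ** 2 - 10.0


def numeric_grad(fn, x, eps=1e-6):
    """Central finite-difference estimate of d fn / d x."""
    return (fn(x + eps) - fn(x - eps)) / (2.0 * eps)


x = 1.0
weight = 2.0
plain = numeric_grad(toy_loss, x)
weighted = numeric_grad(lambda v: weight * toy_loss(v), x)

print(toy_loss(x))      # the loss value itself is negative here
print(plain, weighted)  # the weighted gradient is about twice the plain one
```

Dividing by the weight would instead shrink the gradient, which is the opposite of the intended emphasis.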
Hello,
Do you have any results on the Cityscapes dataset?
I wonder whether RMI loss brings better performance on Cityscapes.
Thank you.
Hi, your work is amazing and has helped me a lot!
I noticed that in the results reported in the paper you compare DeepLab with your method on different datasets. However, when I ran your code with batch size 16, the DeepLabv3+ model, and a ResNet-101 backbone on the VOC2012 dataset, I only got 0.772 mIoU after about 30k iterations (just the default settings). Could you tell me how to set the hyperparameters to get the desired results (78.8 with cross-entropy loss, and a higher mIoU with your proposed loss)?
This is my result on the val dataset:
Thank you very much for your excellent work!
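I tested the RMI loss with random inputs and found that its value does not change much. Is that normal? My test code is as follows.
import numpy as np
import torch
# RMILoss is the loss class from this repository's rmi.py; import it according to your own setup.
# Random logits with shape (batch, classes, height, width) and integer class labels.
logits = np.random.randn(5, 3, 32, 32)
labels = np.random.randint(0, 3, size=(5, 32, 32))
logits = torch.from_numpy(logits.astype(np.float32))
labels = torch.from_numpy(labels.astype(np.int32))
rmiloss = RMILoss(num_classes=3)(logits, labels)
print(rmiloss)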
Since the intuition behind RMI loss is to model the dependencies among pixels, the improvement in boundary segmentation should not be obvious. Is that right?
Hello, thank you for your excellent work. I have a question: when my model has a single output channel for a two-class (binary) task, is RMI loss still meaningful?