xinjcheng / cspn

489.0 20.0 94.0 295 KB

Convolutional Spatial Propagation Network

Python 98.82% Shell 1.18%

paddlepaddle cspn pytorch depth-estimation

cspn's Issues

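The CSPN summation convolution kernel should be created in forward

self.weight = torch.ones(1, 8, 1, 1, 1).cuda()
If the summation convolution in CSPN is built in __init__, PyTorch reports during batched runs that the tensors are not on the same GPU.
This weight should therefore be created in forward.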
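A minimal sketch of one way to avoid the device mismatch, offered as an alternative to creating the tensor in forward: register the all-ones kernel as a buffer so that PyTorch moves it together with the module. The class name NeighborSum and the buffer name sum_weight are hypothetical and do not come from the repository; only the kernel shape (1, 8, 1, 1, 1) is taken from the issue.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class NeighborSum(nn.Module):
    """Sums 8 neighbour maps with a fixed all-ones 3D convolution kernel."""

    def __init__(self):
        super().__init__()
        # A registered buffer follows the module through .to()/.cuda() and is
        # copied to each replica by nn.DataParallel, unlike a plain tensor
        # attribute that was pinned to one GPU with .cuda() in __init__.
        self.register_buffer("sum_weight", torch.ones(1, 8, 1, 1, 1))

    def forward(self, stacked):
        # stacked has shape (N, 8, 1, H, W); the kernel spans the 8 channels,
        # so the output has shape (N, 1, 1, H, W).
        return F.conv3d(stacked, self.sum_weight)


module = NeighborSum()
stacked = torch.arange(8.0).view(1, 8, 1, 1, 1).expand(1, 8, 1, 2, 2).contiguous()
summed = module(stacked)
```

Creating the kernel inside forward with device=stacked.device, as the issue proposes, would also work; the buffer variant simply avoids allocating the tensor on every call.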

I'm trying to run the eval.py script on a GTX 1070 card with 8 GB of VRAM. I've set batch_size to 1 and tried various n_samples values, but the error persists.
The error I'm getting is:
====TOTAL MEMORY====
8589934592
GeForce GTX 1070
Memory Usage:
Allocated: 0.0 GB
Cached: 0.0 GB
Traceback (most recent call last):
  File "eval.py", line 122, in <module>
    net.cuda()
  File "C:\Users\Lab_admin\anaconda3\envs\cspn\lib\site-packages\torch\nn\modules\module.py", line 258, in cuda
    return self._apply(lambda t: t.cuda(device))
  File "C:\Users\Lab_admin\anaconda3\envs\cspn\lib\site-packages\torch\nn\modules\module.py", line 185, in _apply
    module._apply(fn)
  File "C:\Users\Lab_admin\anaconda3\envs\cspn\lib\site-packages\torch\nn\modules\module.py", line 185, in _apply
    module._apply(fn)
  File "C:\Users\Lab_admin\anaconda3\envs\cspn\lib\site-packages\torch\nn\modules\module.py", line 191, in _apply
    param.data = fn(param.data)
  File "C:\Users\Lab_admin\anaconda3\envs\cspn\lib\site-packages\torch\nn\modules\module.py", line 258, in <lambda>
    return self._apply(lambda t: t.cuda(device))
RuntimeError: CUDA error: out of memory
==> evaluating model with cspn and unet on nyudepth
==> Preparing data..
==> Building model..
{'norm_type': '8sum', 'step': 24, 'kernel': 3}
==> Resuming from best model..
==> model dict with addtional module, remove it...

In eval.py I've just added this at the beginning:
os.environ["CUDA_VISIBLE_DEVICES"] = '0'
print("====TOTAL MEMORY====")
print(torch.cuda.get_device_properties(0).total_memory)
print(torch.cuda.get_device_name(0))
print('Memory Usage:')
print('Allocated:', round(torch.cuda.memory_allocated(0)/1024**3,1), 'GB')
print('Cached: ', round(torch.cuda.memory_cached(0)/1024**3,1), 'GB')

Does anyone know how to fix this problem?

Is stereo depth estimation code available now?

cspn iteration Segmentation fault (core dumped)

If I use cspn.py directly in my code, Python crashes with this error:
Segmentation fault (core dumped)
If I change range(16) in the following code to range(1), it runs without error.

        for i in range(2):
            # one propagation
            spn_kernel = 3
            elewise_max_gate1 = self.eight_way_propagation(gate1_w1_cmb, result_depth, spn_kernel)
            elewise_max_gate2 = self.eight_way_propagation(gate2_w1_cmb, result_depth, spn_kernel)
            elewise_max_gate3 = self.eight_way_propagation(gate3_w1_cmb, result_depth, spn_kernel)
            elewise_max_gate4 = self.eight_way_propagation(gate4_w1_cmb, result_depth, spn_kernel)
            elewise_max_gate5 = self.eight_way_propagation(gate5_w1_cmb, result_depth, spn_kernel)
            elewise_max_gate6 = self.eight_way_propagation(gate6_w1_cmb, result_depth, spn_kernel)
            elewise_max_gate7 = self.eight_way_propagation(gate7_w1_cmb, result_depth, spn_kernel)
            elewise_max_gate8 = self.eight_way_propagation(gate8_w1_cmb, result_depth, spn_kernel)

What is wrong with this code?
I look forward to your reply, thank you.

Kitti Validation Results overfitting?

Hi, thanks for your contributions. We made our own branch to train on KITTI (pull request #25), and after one week of training we got the following results:

val_result.txt

We also ran inference on more than 2000 validation samples and found that most of them look like this:

eva_nyudepth script can not find cuda device

Hello,

I am not sure if this is an error with my setup or the code, but when I execute "bash eval_nyudepth_cspn.sh" for testing, I receive the following error:

I already checked that Google Colab has CUDA installed and that PyTorch can detect it. Executing that line outside the script, in the preceding cell, even works without any errors, but running the script fails.

I appreciate any help or suggestion to fix this

How to implement WASPP?

How is WASPP implemented? Does it use ASPP to get multi-scale feature pooling without first reducing the feature size, and then apply weighted pooling (2D CSPN) as in WSPP? Or does it just use dilated convolution in the 2D CSPN's conv?
By the way, is max_of_4_tensor used in stereo, or only in depth completion?

Thanks a lot.

about 3D CSPN module

Can you open-source the 3D CSPN code? Thanks a lot. After reading the paper, I didn't see the operations or details of how to train 3D CSPN or apply it to PSMNet.

I test your code in PyTorch0.4.1 and find some bugs

    x = self.gud_up_proj_layer4(x, skip4)       # 64 channels features
    x= self.gud_up_proj_layer5(x)                  # blur depth
    guidance = self.gud_up_proj_layer6(x)    # affinity matrix
    x = self.post_process_layer(guidance, x, sparse_depth)

I think the first x holds the 64-channel features, which are used to generate both the blur depth and the affinity matrix, while the second x is the blur depth. The input of self.gud_up_proj_layer6 should therefore be the first x. I modified the code as follows:

    x = self.gud_up_proj_layer4(x, skip4)
    blur_depth = self.gud_up_proj_layer5(x)
    guidance = self.gud_up_proj_layer6(x)
    x = self.post_process_layer(guidance, blur_depth, sparse_depth)

I also found that cspn.py raises the following error:

IndexError: only integers, slices (:), ellipsis (...), None and long or byte Variables are valid indices (got float)

So I suggest replacing '/' (true division) with '//' (floor division) wherever you need an int rather than a float.
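The difference can be reproduced without PyTorch. The sketch below uses a plain Python list as a stand-in for a tensor, so the exception type is TypeError rather than the IndexError quoted above; the variable names are illustrative and not taken from cspn.py.

```python
kernel_size = 3
values = list(range(kernel_size * kernel_size))

# In Python 3, "/" is true division and always returns a float.
center_float = (kernel_size * kernel_size - 1) / 2
# "//" is floor division and keeps integer operands as int.
center_int = (kernel_size * kernel_size - 1) // 2

try:
    values[center_float]
    float_index_ok = True
except TypeError:
    # A float is rejected as an index even when it holds a whole number.
    float_index_ok = False

center_value = values[center_int]
```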

up_pooling is not fast

I think your un_pooling operation is not fast. I recommend using the code below.

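# Unpool: 2*2 unpooling with zero padding

import torch
import torch.nn as nn
import torch.nn.functional as F

class Unpool(nn.Module):
  def __init__(self, num_channels, stride=2):
    super(Unpool, self).__init__()
    self.num_channels = num_channels
    self.stride = stride

  def forward(self, x):
    # One kernel per channel with a single 1 at the top-left, so each input
    # value lands at the top-left of its stride x stride output cell.
    # The kernel is created on the input's device instead of calling .cuda().
    weights = torch.zeros(self.num_channels, 1, self.stride, self.stride,
                          dtype=x.dtype, device=x.device)
    weights[:, :, 0, 0] = 1
    return F.conv_transpose2d(x, weights, stride=self.stride, groups=self.num_channels)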

The code does not match the description of the paper

Thanks for the awesome contribution!

When I read the code, I found that the code did not match the paper.

CSPN/cspn_pytorch/models/torch_resnet_cspn_nyu.py

Line 269 in 24eff12

out = self.relu(self.bn1(self.conv1(x)))

According to the description in your paper, I think we should change

out = self.relu(self.bn1(self.conv1(x)))
out = torch.cat((out, side_input), 1)

to

x = self.relu(self.bn1(self.conv1(x)))
out = torch.cat((x, side_input), 1)

Looking forward to your reply!

image input size

Is there a way to pass a larger image to the net? I can successfully load a PNG image in your code, with the depth information in an EXR file, but I want a larger image as output. I tried doubling the parameters of the center crop and resize transforms, and I get the following error:

File "./models/torch_resnet_cspn_nyu.py", line 270, in forward
    out = torch.cat((out, side_input), 1)
RuntimeError: invalid argument 0: Sizes of tensors must match except in dimension 1. Got 57 and 29 in dimension 2 at /pytorch/aten/src/THC/generic/THCTensorMath.cu:71
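One possible reading of the two numbers in this error, offered as an assumption rather than a confirmed diagnosis: if the default crop height is 228 and the encoder halves the height three times with ceiling rounding (which is what a stride-2 convolution with "same"-style padding does), then the default input yields 29 and a doubled input of 456 yields 57. A mismatch of 57 against 29 would then suggest that one branch still assumes the default size. The helper name and the number of halvings are hypothetical.

```python
import math


def feature_height(input_height, num_halvings):
    """Height after num_halvings stride-2 stages with ceiling rounding."""
    height = input_height
    for _ in range(num_halvings):
        height = math.ceil(height / 2)
    return height


default_height = feature_height(228, 3)   # 228 -> 114 -> 57 -> 29
doubled_height = feature_height(456, 3)   # 456 -> 228 -> 114 -> 57
```

If this reading holds, any output sizes fixed at model construction time would need to change along with the crop size, not only the data transforms.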

Training with dense depth

Is it possible to train the model with dense depth annotations? In the form of aligned greyscale png data, which can be converted if needed. (I am experimenting with using synthetic data for training.)
Kind regards.

Code for CSPN++?

Thanks for the awesome work!
I notice that the performance of the recent CSPN++ paper is much better than that of the original CSPN. Do you plan to release the code for CSPN++ soon?
Best regards!

Training parameters

Hi, could you share the training parameters used in the experiments, such as the learning rate, optimizer, and number of training epochs?

where is the post_process?

For KITTI model

Tutorial

Is it possible to get a short tutorial that covers how to use the code and maybe how to train the network?

CUDA OOM error

Does anyone know how to fix this problem?

Whether this is mistake

Hi Xinjin,
We can see from the following code that guidance has 12 channels:
self.gud_up_proj_layer6 = self._make_gud_up_conv_layer(Simple_Gudi_UpConv_Block_Last_Layer, 64, 12, 228, 304)

But the following code then uses only the first 8 channels of this output:
gate1_wb_cmb = torch.abs(guidance.narrow(1, 0 , self.out_feature))
gate2_wb_cmb = torch.abs(guidance.narrow(1, 1 * self.out_feature, self.out_feature))
gate3_wb_cmb = torch.abs(guidance.narrow(1, 2 * self.out_feature, self.out_feature))
gate4_wb_cmb = torch.abs(guidance.narrow(1, 3 * self.out_feature, self.out_feature))
gate5_wb_cmb = torch.abs(guidance.narrow(1, 4 * self.out_feature, self.out_feature))
gate6_wb_cmb = torch.abs(guidance.narrow(1, 5 * self.out_feature, self.out_feature))
gate7_wb_cmb = torch.abs(guidance.narrow(1, 6 * self.out_feature, self.out_feature))
gate8_wb_cmb = torch.abs(guidance.narrow(1, 7 * self.out_feature, self.out_feature))

Is this a mistake? Should the code instead be:
self.gud_up_proj_layer6 = self._make_gud_up_conv_layer(Simple_Gudi_UpConv_Block_Last_Layer, 64, 8, 228, 304)
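The channel count in the question can be checked with plain index arithmetic. The sketch below assumes self.out_feature is 1, which the statement that only 8 channels are used implies but the quoted snippet does not show; the helper name narrowed_ranges is hypothetical. Each narrow(1, start, length) call selects channels start to start + length - 1.

```python
def narrowed_ranges(num_gates, out_feature):
    """Half-open channel ranges selected by narrow(1, k * out_feature, out_feature)."""
    return [(k * out_feature, (k + 1) * out_feature) for k in range(num_gates)]


total_channels = 12   # output channels of gud_up_proj_layer6 in the issue
ranges = narrowed_ranges(num_gates=8, out_feature=1)

used = {channel for start, stop in ranges for channel in range(start, stop)}
unused = sorted(set(range(total_channels)) - used)
```

Under that assumption, channels 8 to 11 are never read by the eight gates shown; whether other code reads them cannot be determined from the snippet.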

Curious about CSPN++ baseline

Thanks for the awesome paper CSPN++ !
I'm curious about the results of the baseline "Ma, Cavalheiro, and Karaman 2019".
On the KITTI depth completion validation set, your implementation gets an RMSE of 799.08, compared to 856.75 in the original paper.
Is there any difference between your implementation and the original paper?

Thanks !

demo errors due to insufficient gpu ram or missing module paddle.fluid.core_avx?

Hello and first of all thank you for your contribution!

I wanted to execute the provided paddle demo but am just receiving errors.
I tested it twice with the same notebook on Google Colab and Kaggle but both times it fails.

This is my notebook: https://colab.research.google.com/drive/1CkgfxGwsbEvfWxkE6iKwShKIK5lFlARu

WARNING: Do not have avx core. You may not build with AVX, but AVX is supported on local machine.
 You could build paddle WITH_AVX=ON to get better performance.
The original error is: No module named 'paddle.fluid.core_avx'
W1209 03:51:00.008404   422 init.cc:162] AVX is available, Please re-compile on local machine
name: "cspn_affinity_propagate.tmp_23"
type {
  type: LOD_TENSOR
  lod_tensor {
    tensor {
      data_type: FP32
      dims: -1
      dims: 1
      dims: 48
      dims: 64
      dims: 128
    }
    lod_level: 0
  }
}
persistable: false

W1209 03:51:00.632901   422 device_context.cc:259] Please NOTE: device: 0, CUDA Capability: 37, Driver API Version: 10.1, Runtime API Version: 9.0
W1209 03:51:00.642139   422 device_context.cc:267] device: 0, cuDNN Version: 7.6.
An exception was thrown!
 Invoke operator fill_constant error.
Python Callstacks: 
  File "/usr/local/lib/python3.6/dist-packages/paddle/fluid/framework.py", line 1844, in _prepend_op
    attrs=kwargs.get("attrs", None))
  File "/usr/local/lib/python3.6/dist-packages/paddle/fluid/initializer.py", line 189, in __call__
    stop_gradient=True)
  File "/usr/local/lib/python3.6/dist-packages/paddle/fluid/framework.py", line 1627, in create_var
    kwargs['initializer'](var, self)
  File "/usr/local/lib/python3.6/dist-packages/paddle/fluid/layer_helper_base.py", line 383, in set_variable_initializer
    initializer=initializer)
  File "/usr/local/lib/python3.6/dist-packages/paddle/fluid/layers/tensor.py", line 142, in create_global_var
    value=float(value), force_cpu=force_cpu))
  File "/usr/local/lib/python3.6/dist-packages/paddle/fluid/optimizer.py", line 226, in _create_global_learning_rate
    persistable=True)
  File "/usr/local/lib/python3.6/dist-packages/paddle/fluid/optimizer.py", line 365, in _create_optimization_pass
    self._create_global_learning_rate()
  File "/usr/local/lib/python3.6/dist-packages/paddle/fluid/optimizer.py", line 532, in apply_gradients
    optimize_ops = self._create_optimization_pass(params_grads)
  File "/usr/local/lib/python3.6/dist-packages/paddle/fluid/optimizer.py", line 562, in apply_optimize
    optimize_ops = self.apply_gradients(params_grads)
  File "/usr/local/lib/python3.6/dist-packages/paddle/fluid/optimizer.py", line 601, in minimize
    loss, startup_program=startup_program, params_grads=params_grads)
  File "/usr/local/lib/python3.6/dist-packages/paddle/fluid/dygraph/base.py", line 86, in __impl__
    return func(*args, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/paddle/fluid/wrapped_decorator.py", line 25, in __impl__
    return wrapped_func(*args, **kwargs)
  File "</usr/local/lib/python3.6/dist-packages/decorator.py:decorator-gen-20>", line 2, in minimize
  File "demo.py", line 75, in demo
    optim.minimize(output)
  File "demo.py", line 97, in <module>
    MODULE.demo()
C++ Callstacks: 
Enforce failed. Expected allocating <= available, but received allocating:10485338519 > available:1249705728.
Insufficient GPU memory to allocation. at [/paddle/paddle/fluid/platform/gpu_info.cc:293]
PaddlePaddle Call Stacks: 
0       0x7f41c0889955p void paddle::platform::EnforceNotMet::Init<std::string>(std::string, char const*, int) + 357
1       0x7f41c0889cb2p paddle::platform::EnforceNotMet::EnforceNotMet(std::string const&, char const*, int) + 82
2       0x7f41c260abe9p paddle::platform::GpuMaxChunkSize() + 617
3       0x7f41c2534064p
4       0x7f41fd309827p
5       0x7f41c253450dp paddle::memory::legacy::GetGPUBuddyAllocator(int) + 109
6       0x7f41c2534721p void* paddle::memory::legacy::Alloc<paddle::platform::CUDAPlace>(paddle::platform::CUDAPlace const&, unsigned long) + 33
7       0x7f41c2534df5p paddle::memory::allocation::NaiveBestFitAllocator::AllocateImpl(unsigned long) + 405
8       0x7f41c252f113p paddle::memory::allocation::AllocatorFacade::Alloc(boost::variant<paddle::platform::CUDAPlace, paddle::platform::CPUPlace, paddle::platform::CUDAPinnedPlace, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_> const&, unsigned long) + 227
9       0x7f41c252f3bbp paddle::memory::allocation::AllocatorFacade::AllocShared(boost::variant<paddle::platform::CUDAPlace, paddle::platform::CPUPlace, paddle::platform::CUDAPinnedPlace, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_> const&, unsigned long) + 27
10      0x7f41c2199d6cp paddle::memory::AllocShared(boost::variant<paddle::platform::CUDAPlace, paddle::platform::CPUPlace, paddle::platform::CUDAPinnedPlace, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_> const&, unsigned long) + 44
11      0x7f41c2507458p paddle::framework::Tensor::mutable_data(boost::variant<paddle::platform::CUDAPlace, paddle::platform::CPUPlace, paddle::platform::CUDAPinnedPlace, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_>, paddle::framework::proto::VarType_Type, unsigned long) + 136
12      0x7f41c0c29994p paddle::operators::FillConstantKernel<float>::Compute(paddle::framework::ExecutionContext const&) const + 500
13      0x7f41c0c2c8b0p std::_Function_handler<void (paddle::framework::ExecutionContext const&), paddle::framework::OpKernelRegistrarFunctor<paddle::platform::CUDAPlace, false, 0ul, paddle::operators::FillConstantKernel<float>, paddle::operators::FillConstantKernel<double>, paddle::operators::FillConstantKernel<long>, paddle::operators::FillConstantKernel<int>, paddle::operators::FillConstantKernel<paddle::platform::float16> >::operator()(char const*, char const*, int) const::{lambda(paddle::framework::ExecutionContext const&)#1}>::_M_invoke(std::_Any_data const&, paddle::framework::ExecutionContext const&) + 32
14      0x7f41c24b576dp paddle::framework::OperatorWithKernel::RunImpl(paddle::framework::Scope const&, boost::variant<paddle::platform::CUDAPlace, paddle::platform::CPUPlace, paddle::platform::CUDAPinnedPlace, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_> const&, paddle::framework::RuntimeContext*) const + 381
15      0x7f41c24b5dabp paddle::framework::OperatorWithKernel::RunImpl(paddle::framework::Scope const&, boost::variant<paddle::platform::CUDAPlace, paddle::platform::CPUPlace, paddle::platform::CUDAPinnedPlace, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_> const&) const + 555
16      0x7f41c24b321cp paddle::framework::OperatorBase::Run(paddle::framework::Scope const&, boost::variant<paddle::platform::CUDAPlace, paddle::platform::CPUPlace, paddle::platform::CUDAPinnedPlace, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_> const&) + 300
17      0x7f41c09f8216p paddle::framework::Executor::RunPreparedContext(paddle::framework::ExecutorPrepareContext*, paddle::framework::Scope*, bool, bool, bool) + 438
18      0x7f41c09fadc4p paddle::framework::Executor::Run(paddle::framework::ProgramDesc const&, paddle::framework::Scope*, int, bool, bool, std::vector<std::string, std::allocator<std::string> > const&, bool) + 132
19      0x7f41c087bae3p
20      0x7f41c08b9b54p
21            0x5674fcp _PyCFunction_FastCallDict + 860
22            0x50abb3p
23            0x50c5b9p _PyEval_EvalFrameDefault + 1097
24            0x508245p
25            0x50a080p
26            0x50aa7dp
27            0x50d390p _PyEval_EvalFrameDefault + 4640
28            0x508245p
29            0x50a080p
30            0x50aa7dp
31            0x50d390p _PyEval_EvalFrameDefault + 4640
32            0x508245p
33            0x50a080p
34            0x50aa7dp
35            0x50c5b9p _PyEval_EvalFrameDefault + 1097
36            0x508245p
37            0x50a080p
38            0x50aa7dp
39            0x50c5b9p _PyEval_EvalFrameDefault + 1097
40            0x508245p
41            0x50b403p PyEval_EvalCode + 35
42            0x635222p
43            0x6352d7p PyRun_FileExFlags + 151
44            0x638a8fp PyRun_SimpleFileExFlags + 383
45            0x639631p Py_Main + 1425
46            0x4b0f40p main + 224
47      0x7f41fd53ab97p __libc_start_main + 231
48            0x5b2fdap _start + 42

Traceback (most recent call last):
  File "demo.py", line 97, in <module>
    MODULE.demo()
  File "demo.py", line 79, in demo
    exe.run(fluid.default_startup_program())
  File "/usr/local/lib/python3.6/dist-packages/paddle/fluid/executor.py", line 644, in run
    raise e
  File "/usr/local/lib/python3.6/dist-packages/paddle/fluid/executor.py", line 640, in run
    use_program_cache=use_program_cache)
  File "/usr/local/lib/python3.6/dist-packages/paddle/fluid/executor.py", line 669, in _run_impl
    use_program_cache=use_program_cache)
  File "/usr/local/lib/python3.6/dist-packages/paddle/fluid/executor.py", line 766, in _run_program
    exe.run(program.desc, scope, 0, True, True, fetch_var_name)
paddle.fluid.core_noavx.EnforceNotMet: Invoke operator fill_constant error.

As I understand this error, not enough GPU memory could be allocated. But Google Colab provides around 13 GB of RAM with the GPU. How much is required for this demo?
Or is there another error that leads to this?

I also noticed that at the beginning it states "The original error is: No module named 'paddle.fluid.core_avx'", but from my understanding this is a warning and not what leads to this error, right?

Thanks for your help!

other code proposed in your paper

Thanks for sharing the code. You really did an amazing job. I wonder if you plan to publish the code just for the depth estimation proposed in your paper. Thank you.

Confused about how to obtain the affinity matrix

Hello.
I have read your paper and run some experiments with the 3D CSPN, but I didn't get good results.
I'm confused about how the affinity matrix is obtained in the 3D module (page 7).
Is it produced by a single 3D Conv and ReLU, or by a multi-layer Conv model?
Thanks

Hello author, I could not find the stereo matching code. Is it freely available?

Will it be provided later?

Loss NaN when training on NYUv2

I tried to reproduce the model trained on NYUv2 but ran into a problem: the loss and all evaluation metrics became NaN after 26 epochs of training.
My setup is one 3090 GPU and pytorch == 1.9.1, with default settings for all parameters. Do you have any idea how to solve this problem?
Thanks for your help.

Getting NaN values when using the affinity model due to the torch.div operation

Could you upload the post_process file?

Strange behavior in dataloader

First of all, thank you for sharing code for your great work!

While checking the data loader myself,
I found strange behavior in the following part:

CSPN/nyu_dataset_loader.py

Lines 84 to 102 in 15fa8a0

tRgb = data_transform.Compose([transforms.Resize(s),
                               data_transform.Rotation(degree),
                               transforms.ColorJitter(brightness = 0.4, contrast = 0.4, saturation = 0.4),
                               # data_transform.Lighting(0.1, imagenet_eigval, imagenet_eigvec)])
                               transforms.CenterCrop((228, 304)),
                               transforms.ToTensor(),
                               transforms.Normalize((0.485, 0.456, 0.406), (0.229, 0.224, 0.225)),
                               transforms.ToPILImage()])

tDepth = data_transform.Compose([transforms.Resize(s),
                                 data_transform.Rotation(degree),
                                 transforms.CenterCrop((228, 304))])
rgb_image = tRgb(rgb_image)
depth_image = tDepth(depth_image)
if np.random.uniform()<0.5:
    rgb_image = rgb_image.transpose(Image.FLIP_LEFT_RIGHT)
    depth_image = depth_image.transpose(Image.FLIP_LEFT_RIGHT)

rgb_image = transforms.ToTensor()(rgb_image)

This process includes ToTensor() -> Normalize((0.485, 0.456, 0.406), (0.229, 0.224, 0.225)) -> ToPILImage(),
and the input value range changes as follows:
[0, 1] -> around [-2.5, 2.5] -> [0, 255]

Because of how ToPILImage() is implemented, values outside [0, 1] are not handled correctly, so the current implementation produces unexpected values after data augmentation.
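
To make the range problem concrete, here is a minimal pure-Python sketch of the arithmetic. It assumes that converting a float tensor to a PIL image amounts to multiplying by 255 and casting to an unsigned 8-bit value with wraparound. The helper names `normalize` and `to_uint8_wrapped` are hypothetical and only emulate that behavior; they are not torchvision functions.

```python
# ImageNet statistics used by the Normalize call in the data loader.
MEAN = (0.485, 0.456, 0.406)
STD = (0.229, 0.224, 0.225)


def normalize(value, channel):
    """Per-channel normalization of a single value in [0, 1]."""
    return (value - MEAN[channel]) / STD[channel]


def to_uint8_wrapped(value):
    """Assumed emulation of `mul(255)` followed by a wrapping uint8 cast."""
    return int(value * 255) % 256


# Range of the normalized values over all three channels.
lo = min(normalize(0.0, c) for c in range(3))
hi = max(normalize(1.0, c) for c in range(3))
print(f"normalized range: [{lo:.2f}, {hi:.2f}]")

# Values outside [0, 1] wrap around instead of being clipped, so a darker
# red-channel pixel can end up with a larger 8-bit value than a brighter one.
dark = to_uint8_wrapped(normalize(0.1, 0))
mid = to_uint8_wrapped(normalize(0.5, 0))
print(f"input 0.1 -> {dark}, input 0.5 -> {mid}")
```

Under this assumption the ordering of pixel intensities is not preserved, which would be consistent with the distorted un-normalized image described in this issue.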

Here I post some examples.

Original Image

Un-normalized without this process (No second PIL conversion)

Un-normalized with this process

Did you use this process for the ECCV and AAAI submissions?

Interestingly, the model still works if I train it with your code as it is
(but slightly worse than the reported value: 0.117 vs. 0.131 for NYU).

Thank you very much!

Confusion about the self.max_of_8_tensor() operation

Hi,
I noticed that you use a function self.max_of_8_tensor(...) to obtain the final results. Why? I find it hard to relate this code to the operation proposed in the paper.

stereo matching

Can the provided code be used to perform stereo matching given two images?

Where is 3D convolutional spatial propagation network?

I didn't find the code about it.

Did you up-sample the depth during evaluation?

Hi
Thank you very much for your contribution. I have a question from reading your paper: did you up-sample the depth to the original size for evaluation?

Resnet18 or Resnet50?

Hi thanks for the code.

Just wondering if the listed results on NYU are achieved via resnet18 or resnet50 encoder?

nvm it is resnet50

How to guarantee that refined depths have the exact same value at those valid pixels in sparse depth map?

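raw_depth_input = blur_depth
...
if sparse_depth is not None:
    result_depth = (1 - sparse_mask) * result_depth + sparse_mask * raw_depth_input

Hello, I have a question. To preserve the valid values of the sparse depth, why is blur_depth used for the replacement rather than the values from the sparse depth? blur_depth is predicted by the network, so it is not necessarily accurate. Could you please explain? Thank you!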

No module named 'base_model'

It seems that there is no module named 'base_model'. Please help! Thanks!

How to implement ACSPP?

Thanks a lot for sharing your code! I'm trying to understand your variants of spatial pyramid pooling layers, especially the atrous convolutional SPP. Since there is no code for those modules, I hope the author can confirm my understanding below.

I suppose that ACSPP is basically based on CSPP in Fig 3 (b) with some modifications to make it "atrous". I suppose this module (between input and output feature in the figure) should replace the following three lines in your PyTorch code.

CSPN/cspn_pytorch/models/torch_resnet_cspn_nyu.py

Lines 365 to 367 in 24eff12

x = self.layer4(x)
x = self.bn2(self.conv2(x))
x = self.gud_up_proj_layer1(x)

To my understanding, the "atrous" version of Fig 3b would look as follows. Note that it is written in PyTorch-like pseudo code in which padding and stride options are omitted. The code should replace the above three lines, receiving input x and outputting x.

# Input x has c=1024 channels
b, c, h, w = x.shape
kh, kw = 3, 3

# Output a single channel weight map for subsequent four parallel CSPN layers.  
# (Although Fig 3b says it also uses BN and ReLU, I suppose it is only Conv2d).
W = conv2d(x, kernel_size=3, output_channel=1)

# From W, we compose four of 3x3 spatially-dependent kernel weight maps
# W1, W2, W3, W4 with dilation rates={6,12,18,24} and reshaping.
W1 = unfold(W, kernel_size=3, dilation= 6).reshape(b, 1, kh*kw, h, w)
W2 = unfold(W, kernel_size=3, dilation=12).reshape(b, 1, kh*kw, h, w)
W3 = unfold(W, kernel_size=3, dilation=18).reshape(b, 1, kh*kw, h, w)
W4 = unfold(W, kernel_size=3, dilation=24).reshape(b, 1, kh*kw, h, w)

# Normalize convolution weight maps along kernel axis
W1 = abs(W1)/abs(W1).sum(dim=2, keepdim=True)
W2 = abs(W2)/abs(W2).sum(dim=2, keepdim=True)
W3 = abs(W3)/abs(W3).sum(dim=2, keepdim=True)
W4 = abs(W4)/abs(W4).sum(dim=2, keepdim=True)

# Convolve x with the four weight maps, using corresponding dilation rates. 
# Here, the resulting y's have the same channel and resolution with x as (b, c, h, w)
y1 = unfold(x, kernel_size=3, dilation= 6).reshape(b, c, kh*kw, h, w)
y2 = unfold(x, kernel_size=3, dilation=12).reshape(b, c, kh*kw, h, w)
y3 = unfold(x, kernel_size=3, dilation=18).reshape(b, c, kh*kw, h, w)
y4 = unfold(x, kernel_size=3, dilation=24).reshape(b, c, kh*kw, h, w)
y1 = (y1*W1).sum(dim=2)
y2 = (y2*W2).sum(dim=2)
y3 = (y3*W3).sum(dim=2)
y4 = (y4*W4).sum(dim=2)

# Apply Conv2d-BN-ReLU to each y1, y2, y3, y4 to get 256-channel feature maps
z1 = relu(bn(conv2d(y1, output_channel=256, kernel_size=3, dilation=6)))
z2 = relu(bn(conv2d(y2, output_channel=256, kernel_size=3, dilation=12)))
z3 = relu(bn(conv2d(y3, output_channel=256, kernel_size=3, dilation=18)))
z4 = relu(bn(conv2d(y4, output_channel=256, kernel_size=3, dilation=24)))

# Concat them to produce the output of the module
x = concat([z1, z2, z3, z4], dim=1)

Can you verify my code and tell me if I have misunderstood anything? In particular, please check the following points.

Do we compose W1, W2, W3, W4 from the same W?
Do we also use dilated convs to compute z1, z2, z3, z4 after CSPN layers?
What is the output channel number of z1, z2, z3, z4? (I guessed it is 256)
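
The gather, normalize, and weighted-sum steps of the pseudo code above can be tried on a single-channel toy input. The following is a minimal pure-Python sketch under assumptions not stated in the issue: zero padding, stride 1, and a small `eps` in the denominator to avoid division by zero. The helper names `unfold3x3` and `propagate` are hypothetical. It does not answer the three questions and is not the authors' ACSPP implementation.

```python
def unfold3x3(grid, dilation):
    """Gather the dilated 3x3 neighbourhood of every pixel (zero padding).

    Returns patches where patches[i][j] is a list of 9 values.
    """
    h, w = len(grid), len(grid[0])
    offsets = [(di * dilation, dj * dilation)
               for di in (-1, 0, 1) for dj in (-1, 0, 1)]
    patches = []
    for i in range(h):
        row = []
        for j in range(w):
            vals = []
            for di, dj in offsets:
                y, x = i + di, j + dj
                inside = 0 <= y < h and 0 <= x < w
                vals.append(grid[y][x] if inside else 0.0)
            row.append(vals)
        patches.append(row)
    return patches


def propagate(x, weight_map, dilation, eps=1e-8):
    """Weighted sum of dilated neighbours with abs-normalized weights."""
    h, w = len(x), len(x[0])
    x_patches = unfold3x3(x, dilation)
    w_patches = unfold3x3(weight_map, dilation)
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            # Normalize along the kernel axis so the weights sum to (almost) 1.
            norm = sum(abs(k) for k in w_patches[i][j]) + eps
            out[i][j] = sum(abs(k) / norm * v
                            for k, v in zip(w_patches[i][j], x_patches[i][j]))
    return out


# A constant input stays (almost exactly) constant under normalized weights.
x = [[2.0] * 5 for _ in range(5)]
weight_map = [[1.0] * 5 for _ in range(5)]
y = propagate(x, weight_map, dilation=2)
print(y[0][0], y[2][2])
```

Because the weights are normalized per pixel, the operation is a local weighted average, and the `eps` term keeps an all-zero weight patch from producing NaN.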

Pytorch model 404？

There seems to be a problem with the link of the pre-trained model ( NYU Depth V2 (Fast Unpool, non-pos) )

How do I test on an image without GT data?

Is it possible to test on an image (or image set) that does not have GT depth data? I am referring to Cityscapes, which does not have LiDAR data. I only care about the predicted depth using 0 samples.

UpProj Module and Dead Codes in torch_resnet_cspn_nyu.py

Hi,

I checked "torch_resnet_cspn_nyu.py" to understand how you implemented the depth network before the CSPN module, and I found a lot of dead code in that file. For example, could you please tell me where self.up_proj_layer(1/2/3/4) are used in the forward function? In addition, the implementation of UpProj does not seem to match Fig. 5 of the conference paper. After up-sampling by self._up_pool (an Unpool object), the output is used by the shortcut layer. However, the figure shows conv, BN, and ReLU between the up-sampling output and the shortcut. Could you please clarify these points for me?

Thanks for providing the code. I appreciate it. However, I think the code could have been written far more clearly, to help people in academia understand your idea better.

Thank you,

the Pytorch model link of NYU Depth V2 (Fast Unpool, non-pos) is null?

paddle.fluid.layers has no attribute 'affinity_propagate'

Hello Cheng, thanks for your work, I like your TPAMI paper very much. I tried to use your 3DCSPN by following the instructions, but I found no implementation of affinity_propagate in PaddlePaddle. Could you tell me which PaddlePaddle version you used and where I can find the function in PaddlePaddle? Thank you very much!

Is cspn.py the final version described in the paper?

Is one propagation step the same as described in the paper 'Learning Depth with Convolutional Spatial Propagation Network'?
What is the effect of max_of_4_tensor and eight_way_propagation?
Thanks a lot!

Bug in CSPN update step?

As I understand it, Eq. 6 in the arXiv paper should correspond to this line:

CSPN/cspn_pytorch/models/cspn.py

Line 81 in b3e487b

result_depth = (1 - sparse_mask) * result_depth + sparse_mask * raw_depth_input

However, raw_depth_input = blur_depth, which is the depth predicted by the network. I am wondering if

            if sparse_depth is not None:
                result_depth = (
                    1 - sparse_mask
                ) * result_depth + sparse_mask * raw_depth_input

should be

            if sparse_depth is not None:
                result_depth = (
                    1 - sparse_mask
                ) * result_depth + sparse_mask * sparse_depth

such that you keep the depth at the points where you have depth?
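
The difference between the two variants can be checked on a toy example. The sketch below is plain Python on flat lists, not the repository's tensor code. `blend` is a hypothetical helper that applies the same mask formula element-wise, and all depth values are made up for illustration.

```python
def blend(result_depth, sparse_mask, anchor_depth):
    """Element-wise (1 - mask) * result + mask * anchor."""
    return [(1 - m) * r + m * a
            for r, m, a in zip(result_depth, sparse_mask, anchor_depth)]


sparse_depth = [0.0, 2.0, 0.0, 3.5]   # measured depth, 0 where missing
sparse_mask = [1.0 if d > 0 else 0.0 for d in sparse_depth]
blur_depth = [1.1, 1.8, 2.9, 3.9]     # network prediction (made-up values)
result_depth = [1.0, 1.9, 3.0, 3.7]   # after one propagation step (made-up)

# Variant in the repository line: masked pixels take the predicted depth.
with_blur = blend(result_depth, sparse_mask, blur_depth)
# Variant proposed in this issue: masked pixels take the measured depth.
with_sparse = blend(result_depth, sparse_mask, sparse_depth)

print(with_blur)
print(with_sparse)
```

In this toy setting only the second variant reproduces the measured values at the masked positions. Whether the repository already writes the sparse values into blur_depth elsewhere is not something this sketch settles.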

Grad explode

As the number of iteration steps gets larger (in my code it is 24), the gradient explodes more easily. How can I deal with that?

Why does the result of my eval sample look very bad when loading best_model.pth?

When I run this code on the CPU and load the best model you provide, the result on my eval sample looks very bad. The data I use is nyu_depth_v2_labeled.mat; I converted it to image and depth npy files and loaded those. The error output is:
Epoch: 0, step: 0, loss=1.3417
MSE=2.7589(2.7589) RMSE=1.6610(1.6610) MAE=1.3417(1.3417) ABS_REL=0.4409(0.4409)
DELTA1.02=0.0673(0.0673) DELTA1.05=0.1037(0.1037) DELTA1.10=0.1616(0.1616)
DELTA1.25=0.3161(0.3161) DELTA1.25^2=0.5606(0.5606) DELTA1.25^3=0.7404(0.7404)
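
For readers comparing against this log, here is a minimal sketch of how such depth metrics are commonly defined. It is pure Python on flat lists and is an assumption about the usual definitions, not a copy of the repository's evaluation code; `depth_metrics` is a hypothetical helper name.

```python
import math


def depth_metrics(pred, gt,
                  thresholds=(1.02, 1.05, 1.10, 1.25, 1.25 ** 2, 1.25 ** 3)):
    """Common depth-estimation metrics over pixels with valid ground truth."""
    valid = [(p, g) for p, g in zip(pred, gt) if g > 0]
    n = len(valid)
    mse = sum((p - g) ** 2 for p, g in valid) / n
    mae = sum(abs(p - g) for p, g in valid) / n
    abs_rel = sum(abs(p - g) / g for p, g in valid) / n
    # delta_t: fraction of pixels whose ratio max(pred/gt, gt/pred) is below t.
    ratios = [max(p / g, g / p) if p > 0 else math.inf for p, g in valid]
    deltas = {t: sum(r < t for r in ratios) / n for t in thresholds}
    return {"MSE": mse, "RMSE": math.sqrt(mse), "MAE": mae,
            "ABS_REL": abs_rel, "DELTA": deltas}


# Toy example with made-up values; the last ground-truth pixel is invalid.
m = depth_metrics(pred=[1.0, 2.0, 4.0, 3.0], gt=[1.0, 2.0, 2.0, 0.0])
print(m["RMSE"], m["MAE"], m["ABS_REL"], m["DELTA"][1.25])
```

Under these definitions the log is internally consistent: RMSE 1.6610 is the square root of MSE 2.7589, and the reported loss equals the MAE.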

Has anyone tried this method for semantic segmentation? Does this work well?

I think this method can be applied to semantic segmentation too, so I ran a test. It works badly, even worse than the coarse mask.

Are the max_of_8_tensor() operation and gate1_w1_cmb to gate8_w1_cmb used in the stereo version?

Hi,
I am confused about max_of_8_tensor() and gate1_w1_cmb to gate8_w1_cmb. Are they used in the stereo version? I reimplemented the stereo matching task without max_of_8_tensor() and gate1_w1_cmb to gate8_w1_cmb, and I can't get good results.
Thanks for your help.
