xinjcheng / cspn
Convolutional Spatial Propagation Network
self.weight = torch.ones(1, 8, 1, 1, 1).cuda()
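If the summation convolution in cspn is built during init, PyTorch complains during batched runs that the tensors are not on the same GPU.
So this weight should be created in forward instead.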
I'm trying to run the eval.py script on a GTX 1070 card with 8 GB of VRAM. I've set batch_size to 1 and tried various n_samples values, but the error persists.
The error I'm getting is:
====TOTAL MEMORY====
8589934592
GeForce GTX 1070
Memory Usage:
Allocated: 0.0 GB
Cached: 0.0 GB
Traceback (most recent call last):
  File "eval.py", line 122, in <module>
    net.cuda()
  File "C:\Users\Lab_admin\anaconda3\envs\cspn\lib\site-packages\torch\nn\modules\module.py", line 258, in cuda
    return self._apply(lambda t: t.cuda(device))
  File "C:\Users\Lab_admin\anaconda3\envs\cspn\lib\site-packages\torch\nn\modules\module.py", line 185, in _apply
    module._apply(fn)
  File "C:\Users\Lab_admin\anaconda3\envs\cspn\lib\site-packages\torch\nn\modules\module.py", line 185, in _apply
    module._apply(fn)
  File "C:\Users\Lab_admin\anaconda3\envs\cspn\lib\site-packages\torch\nn\modules\module.py", line 191, in _apply
    param.data = fn(param.data)
  File "C:\Users\Lab_admin\anaconda3\envs\cspn\lib\site-packages\torch\nn\modules\module.py", line 258, in <lambda>
    return self._apply(lambda t: t.cuda(device))
RuntimeError: CUDA error: out of memory
==> evaluating model with cspn and unet on nyudepth
==> Preparing data..
==> Building model..
{'norm_type': '8sum', 'step': 24, 'kernel': 3}
==> Resuming from best model..
==> model dict with addtional module, remove it...
In eval.py I've just added this at the beginning:
os.environ["CUDA_VISIBLE_DEVICES"] = '0'
print("====TOTAL MEMORY====")
print(torch.cuda.get_device_properties(0).total_memory)
print(torch.cuda.get_device_name(0))
print('Memory Usage:')
print('Allocated:', round(torch.cuda.memory_allocated(0)/1024**3,1), 'GB')
print('Cached: ', round(torch.cuda.memory_cached(0)/1024**3,1), 'GB')
Does anyone know how to fix this problem?
If I use cspn.py directly in my code, my Python program crashes with:
Segmentation fault (core dumped)
If I change range(16) to range(1) in the following code, it runs without error:
for i in range(2):
    # one propagation
    spn_kernel = 3
    elewise_max_gate1 = self.eight_way_propagation(gate1_w1_cmb, result_depth, spn_kernel)
    elewise_max_gate2 = self.eight_way_propagation(gate2_w1_cmb, result_depth, spn_kernel)
    elewise_max_gate3 = self.eight_way_propagation(gate3_w1_cmb, result_depth, spn_kernel)
    elewise_max_gate4 = self.eight_way_propagation(gate4_w1_cmb, result_depth, spn_kernel)
    elewise_max_gate5 = self.eight_way_propagation(gate5_w1_cmb, result_depth, spn_kernel)
    elewise_max_gate6 = self.eight_way_propagation(gate6_w1_cmb, result_depth, spn_kernel)
    elewise_max_gate7 = self.eight_way_propagation(gate7_w1_cmb, result_depth, spn_kernel)
    elewise_max_gate8 = self.eight_way_propagation(gate8_w1_cmb, result_depth, spn_kernel)
What is wrong with this code?
I hope to hear from you, thank you.
Hi, thanks for your contributions. We made our own branch to train it on KITTI (pull request #25), and after one week of training we got the following results:
and the following results on validation samples. We ran more than 2000 inferences on the validation set and found that most of the samples look like this:
Hello,
I am not sure whether this is an error in my setup or in the code, but when I execute "bash eval_nyudepth_cspn.sh" for testing, I receive the following error:
I already checked that Google Colab has CUDA installed and that PyTorch can detect it. Executing that line outside the script, in the cell before, even works without any errors, but when I run the script, it fails.
I appreciate any help or suggestions to fix this.
How should WASPP be implemented? Does it first use ASPP to get multi-scale feature pooling without reducing the feature size, and then apply weighted pooling (2D CSPN) as in WSPP? Or does it just use dilated convolutions inside the 2D CSPN convolution?
By the way, is max_of_4_tensor used in stereo, or only in depth completion?
Thanks a lot.
Could you open-source the 3D CSPN code? Thanks a lot. After reading the paper, I could not find the operations or details on how to train 3D CSPN or apply it to PSMNet.
`
x = self.gud_up_proj_layer4(x, skip4) # 64 channels features
x = self.gud_up_proj_layer5(x) # blur depth
guidance = self.gud_up_proj_layer6(x) # affinity matrix
x = self.post_process_layer(guidance, x, sparse_depth)
`
I think the first x is the 64-channel feature map, which is used to generate both the blur depth and the affinity matrix, while the second x is the blur depth. Therefore, the input of self.gud_up_proj_layer6 should be the first x. I modified the code as follows:
`
x = self.gud_up_proj_layer4(x, skip4)
blur_depth = self.gud_up_proj_layer5(x)
guidance = self.gud_up_proj_layer6(x)
x = self.post_process_layer(guidance, blur_depth, sparse_depth)
`
I also found that cspn.py raises the following error:
IndexError: only integers, slices (`:`), ellipsis (`...`), None and long or byte Variables are valid indices (got float)
So I suggest replacing '/' (true division) with '//' (floor division) wherever an int rather than a float is needed.
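To illustrate the point about division: in Python 3 the '/' operator always returns a float, even for two ints, so a value computed that way cannot be used as an index. The sketch below uses a hypothetical kernel_size variable and a plain list, not the actual tensors in cspn.py.

```python
# Hypothetical values for illustration; not taken from cspn.py.
kernel_size = 3
row = [10, 20, 30, 40, 50]

pad_float = (kernel_size - 1) / 2   # true division always yields a float
pad_int = (kernel_size - 1) // 2    # floor division of two ints yields an int

try:
    row[pad_float]                  # a float index is rejected
    float_index_ok = True
except TypeError:
    float_index_ok = False

center = row[pad_int]               # an int index works
print(type(pad_float).__name__, type(pad_int).__name__, float_index_ok, center)
```

The same reasoning applies to any size or offset that is later used for slicing or indexing.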
I think your un_pooling operation is not fast. I recommend using the code below.
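import torch
import torch.nn as nn
import torch.nn.functional as F

class Unpool(nn.Module):
    def __init__(self, num_channels, stride=2):
        super(Unpool, self).__init__()
        self.num_channels = num_channels
        self.stride = stride

    def forward(self, x):
        # One (stride x stride) kernel per channel, with a single 1 in the top-left corner
        weights = torch.zeros(self.num_channels, 1, self.stride, self.stride)
        if torch.cuda.is_available():
            weights = weights.cuda()
        weights[:, :, 0, 0] = 1
        # Grouped transposed convolution places each input value at the top-left of its cell
        return F.conv_transpose2d(x, weights, stride=self.stride, groups=self.num_channels)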
`
Thanks for the awesome contribution!
When I read the code, I found that the code did not match the paper.
According to the description of your paper, I think we should change
out = self.relu(self.bn1(self.conv1(x)))
out = torch.cat((out, side_input), 1)
to
x = self.relu(self.bn1(self.conv1(x)))
out = torch.cat((x, side_input), 1)
Looking forward to your reply!
Is there a way to pass a larger image to the network? I can successfully load a PNG image with your code, and the depth information from an EXR file, but I want a larger image as output. I tried doubling the parameters of the center crop and resize transforms, and I get the following error:
  File "./models/torch_resnet_cspn_nyu.py", line 270, in forward
    out = torch.cat((out, side_input), 1)
RuntimeError: invalid argument 0: Sizes of tensors must match except in dimension 1. Got 57 and 29 in dimension 2 at /pytorch/aten/src/THC/generic/THCTensorMath.cu:71
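One possible reading of the 57 versus 29 mismatch, offered as an assumption rather than a confirmed diagnosis: the model line quoted elsewhere on this page passes a fixed size (228, 304) to its up-projection layers, so the decoder may keep producing feature maps sized for a 228-pixel-high input while the encoder's skip features grow with the doubled input. The sketch below applies the standard convolution output-size formula to a ResNet-style chain of stride-2 stages. The kernel and padding values are the usual ResNet ones and are assumed here, not read from the repository.

```python
def conv_out(size, kernel, stride, padding):
    """Standard output-size formula for a convolution or pooling layer."""
    return (size + 2 * padding - kernel) // stride + 1

# Assumed ResNet-style downsampling stages: (kernel, stride, padding).
STAGES = [(7, 2, 3),  # stem convolution
          (3, 2, 1),  # max pooling
          (3, 2, 1),  # first strided residual stage
          (3, 2, 1),  # second strided residual stage
          (3, 2, 1)]  # third strided residual stage

def feature_heights(height):
    """Heights of the feature maps after each downsampling stage."""
    heights = []
    for kernel, stride, padding in STAGES:
        height = conv_out(height, kernel, stride, padding)
        heights.append(height)
    return heights

original = feature_heights(228)   # assumed original input height
doubled = feature_heights(456)    # input height after doubling the crop
print(original)
print(doubled)
```

Under these assumptions the third stage yields a height of 29 for the original input and 57 for the doubled one, which matches the two numbers in the error message.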
Is it possible to train the model with dense depth annotations? They are in the form of aligned greyscale PNG data, which can be converted if needed. (I am experimenting with using synthetic data for training.)
Kind regards.
Thanks for the awesome work!
I notice that the recent CSPN++ paper performs much better than the original CSPN. Do you plan to release the code for CSPN++ soon?
Best regards!
Hi, could you share the training parameters used in the experiments, such as the learning rate, optimizer, and number of training epochs?
Is it possible to get a short tutorial that covers how to use the code and perhaps how to train the network?
Hi Xinjin,
We can see from the following code that guidance has 12 channels:
self.gud_up_proj_layer6 = self._make_gud_up_conv_layer(Simple_Gudi_UpConv_Block_Last_Layer, 64, 12, 228, 304)
But then only the first 8 channels of this output are used, in this code:
gate1_wb_cmb = torch.abs(guidance.narrow(1, 0, self.out_feature))
gate2_wb_cmb = torch.abs(guidance.narrow(1, 1 * self.out_feature, self.out_feature))
gate3_wb_cmb = torch.abs(guidance.narrow(1, 2 * self.out_feature, self.out_feature))
gate4_wb_cmb = torch.abs(guidance.narrow(1, 3 * self.out_feature, self.out_feature))
gate5_wb_cmb = torch.abs(guidance.narrow(1, 4 * self.out_feature, self.out_feature))
gate6_wb_cmb = torch.abs(guidance.narrow(1, 5 * self.out_feature, self.out_feature))
gate7_wb_cmb = torch.abs(guidance.narrow(1, 6 * self.out_feature, self.out_feature))
gate8_wb_cmb = torch.abs(guidance.narrow(1, 7 * self.out_feature, self.out_feature))
Is this a mistake? Should the code instead be:
self.gud_up_proj_layer6 = self._make_gud_up_conv_layer(Simple_Gudi_UpConv_Block_Last_Layer, 64, 8, 228, 304)
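The channel accounting behind this question can be checked with plain arithmetic. The sketch below mirrors the start and length arguments of the eight narrow calls quoted above. It assumes out_feature is 1, which is what the statement that only the first 8 channels are used implies, and the helper name is hypothetical.

```python
def used_channels(num_gates, out_feature):
    """Channel indices read by narrow(1, i * out_feature, out_feature) for each gate i."""
    used = []
    for gate in range(num_gates):
        start = gate * out_feature
        used.extend(range(start, start + out_feature))
    return used

total_channels = 12   # output channels of gud_up_proj_layer6 in the quoted line
out_feature = 1       # assumption: implied by only the first 8 channels being used

used = used_channels(num_gates=8, out_feature=out_feature)
unused = [c for c in range(total_channels) if c not in used]
print("used:", used)
print("unused:", unused)
```

With these values, channels 8 to 11 are never read by any of the eight gates.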
Thanks for the awesome paper CSPN++ !
I'm curious about the results of the baseline "Ma, Cavalheiro, and Karaman 2019".
On the KITTI depth completion validation set, your implementation gets an RMSE of 799.08, compared to an RMSE of 856.75 in the original paper.
Is there any difference between your implementation and the original paper?
Thanks !
Hello and first of all thank you for your contribution!
I wanted to execute the provided paddle demo but am just receiving errors.
I tested it twice with the same notebook on Google Colab and Kaggle but both times it fails.
This is my notebook: https://colab.research.google.com/drive/1CkgfxGwsbEvfWxkE6iKwShKIK5lFlARu
WARNING: Do not have avx core. You may not build with AVX, but AVX is supported on local machine.
You could build paddle WITH_AVX=ON to get better performance.
The original error is: No module named 'paddle.fluid.core_avx'
W1209 03:51:00.008404 422 init.cc:162] AVX is available, Please re-compile on local machine
name: "cspn_affinity_propagate.tmp_23"
type {
  type: LOD_TENSOR
  lod_tensor {
    tensor {
      data_type: FP32
      dims: -1
      dims: 1
      dims: 48
      dims: 64
      dims: 128
    }
    lod_level: 0
  }
}
persistable: false
W1209 03:51:00.632901 422 device_context.cc:259] Please NOTE: device: 0, CUDA Capability: 37, Driver API Version: 10.1, Runtime API Version: 9.0
W1209 03:51:00.642139 422 device_context.cc:267] device: 0, cuDNN Version: 7.6.
An exception was thrown!
Invoke operator fill_constant error.
Python Callstacks:
File "/usr/local/lib/python3.6/dist-packages/paddle/fluid/framework.py", line 1844, in _prepend_op
attrs=kwargs.get("attrs", None))
File "/usr/local/lib/python3.6/dist-packages/paddle/fluid/initializer.py", line 189, in __call__
stop_gradient=True)
File "/usr/local/lib/python3.6/dist-packages/paddle/fluid/framework.py", line 1627, in create_var
kwargs['initializer'](var, self)
File "/usr/local/lib/python3.6/dist-packages/paddle/fluid/layer_helper_base.py", line 383, in set_variable_initializer
initializer=initializer)
File "/usr/local/lib/python3.6/dist-packages/paddle/fluid/layers/tensor.py", line 142, in create_global_var
value=float(value), force_cpu=force_cpu))
File "/usr/local/lib/python3.6/dist-packages/paddle/fluid/optimizer.py", line 226, in _create_global_learning_rate
persistable=True)
File "/usr/local/lib/python3.6/dist-packages/paddle/fluid/optimizer.py", line 365, in _create_optimization_pass
self._create_global_learning_rate()
File "/usr/local/lib/python3.6/dist-packages/paddle/fluid/optimizer.py", line 532, in apply_gradients
optimize_ops = self._create_optimization_pass(params_grads)
File "/usr/local/lib/python3.6/dist-packages/paddle/fluid/optimizer.py", line 562, in apply_optimize
optimize_ops = self.apply_gradients(params_grads)
File "/usr/local/lib/python3.6/dist-packages/paddle/fluid/optimizer.py", line 601, in minimize
loss, startup_program=startup_program, params_grads=params_grads)
File "/usr/local/lib/python3.6/dist-packages/paddle/fluid/dygraph/base.py", line 86, in __impl__
return func(*args, **kwargs)
File "/usr/local/lib/python3.6/dist-packages/paddle/fluid/wrapped_decorator.py", line 25, in __impl__
return wrapped_func(*args, **kwargs)
File "</usr/local/lib/python3.6/dist-packages/decorator.py:decorator-gen-20>", line 2, in minimize
File "demo.py", line 75, in demo
optim.minimize(output)
File "demo.py", line 97, in <module>
MODULE.demo()
C++ Callstacks:
Enforce failed. Expected allocating <= available, but received allocating:10485338519 > available:1249705728.
Insufficient GPU memory to allocation. at [/paddle/paddle/fluid/platform/gpu_info.cc:293]
PaddlePaddle Call Stacks:
0 0x7f41c0889955p void paddle::platform::EnforceNotMet::Init<std::string>(std::string, char const*, int) + 357
1 0x7f41c0889cb2p paddle::platform::EnforceNotMet::EnforceNotMet(std::string const&, char const*, int) + 82
2 0x7f41c260abe9p paddle::platform::GpuMaxChunkSize() + 617
3 0x7f41c2534064p
4 0x7f41fd309827p
5 0x7f41c253450dp paddle::memory::legacy::GetGPUBuddyAllocator(int) + 109
6 0x7f41c2534721p void* paddle::memory::legacy::Alloc<paddle::platform::CUDAPlace>(paddle::platform::CUDAPlace const&, unsigned long) + 33
7 0x7f41c2534df5p paddle::memory::allocation::NaiveBestFitAllocator::AllocateImpl(unsigned long) + 405
8 0x7f41c252f113p paddle::memory::allocation::AllocatorFacade::Alloc(boost::variant<paddle::platform::CUDAPlace, paddle::platform::CPUPlace, paddle::platform::CUDAPinnedPlace, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_> const&, unsigned long) + 227
9 0x7f41c252f3bbp paddle::memory::allocation::AllocatorFacade::AllocShared(boost::variant<paddle::platform::CUDAPlace, paddle::platform::CPUPlace, paddle::platform::CUDAPinnedPlace, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_> const&, unsigned long) + 27
10 0x7f41c2199d6cp paddle::memory::AllocShared(boost::variant<paddle::platform::CUDAPlace, paddle::platform::CPUPlace, paddle::platform::CUDAPinnedPlace, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_> const&, unsigned long) + 44
11 0x7f41c2507458p paddle::framework::Tensor::mutable_data(boost::variant<paddle::platform::CUDAPlace, paddle::platform::CPUPlace, paddle::platform::CUDAPinnedPlace, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_>, paddle::framework::proto::VarType_Type, unsigned long) + 136
12 0x7f41c0c29994p paddle::operators::FillConstantKernel<float>::Compute(paddle::framework::ExecutionContext const&) const + 500
13 0x7f41c0c2c8b0p std::_Function_handler<void (paddle::framework::ExecutionContext const&), paddle::framework::OpKernelRegistrarFunctor<paddle::platform::CUDAPlace, false, 0ul, paddle::operators::FillConstantKernel<float>, paddle::operators::FillConstantKernel<double>, paddle::operators::FillConstantKernel<long>, paddle::operators::FillConstantKernel<int>, paddle::operators::FillConstantKernel<paddle::platform::float16> >::operator()(char const*, char const*, int) const::{lambda(paddle::framework::ExecutionContext const&)#1}>::_M_invoke(std::_Any_data const&, paddle::framework::ExecutionContext const&) + 32
14 0x7f41c24b576dp paddle::framework::OperatorWithKernel::RunImpl(paddle::framework::Scope const&, boost::variant<paddle::platform::CUDAPlace, paddle::platform::CPUPlace, paddle::platform::CUDAPinnedPlace, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_> const&, paddle::framework::RuntimeContext*) const + 381
15 0x7f41c24b5dabp paddle::framework::OperatorWithKernel::RunImpl(paddle::framework::Scope const&, boost::variant<paddle::platform::CUDAPlace, paddle::platform::CPUPlace, paddle::platform::CUDAPinnedPlace, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_> const&) const + 555
16 0x7f41c24b321cp paddle::framework::OperatorBase::Run(paddle::framework::Scope const&, boost::variant<paddle::platform::CUDAPlace, paddle::platform::CPUPlace, paddle::platform::CUDAPinnedPlace, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_> const&) + 300
17 0x7f41c09f8216p paddle::framework::Executor::RunPreparedContext(paddle::framework::ExecutorPrepareContext*, paddle::framework::Scope*, bool, bool, bool) + 438
18 0x7f41c09fadc4p paddle::framework::Executor::Run(paddle::framework::ProgramDesc const&, paddle::framework::Scope*, int, bool, bool, std::vector<std::string, std::allocator<std::string> > const&, bool) + 132
19 0x7f41c087bae3p
20 0x7f41c08b9b54p
21 0x5674fcp _PyCFunction_FastCallDict + 860
22 0x50abb3p
23 0x50c5b9p _PyEval_EvalFrameDefault + 1097
24 0x508245p
25 0x50a080p
26 0x50aa7dp
27 0x50d390p _PyEval_EvalFrameDefault + 4640
28 0x508245p
29 0x50a080p
30 0x50aa7dp
31 0x50d390p _PyEval_EvalFrameDefault + 4640
32 0x508245p
33 0x50a080p
34 0x50aa7dp
35 0x50c5b9p _PyEval_EvalFrameDefault + 1097
36 0x508245p
37 0x50a080p
38 0x50aa7dp
39 0x50c5b9p _PyEval_EvalFrameDefault + 1097
40 0x508245p
41 0x50b403p PyEval_EvalCode + 35
42 0x635222p
43 0x6352d7p PyRun_FileExFlags + 151
44 0x638a8fp PyRun_SimpleFileExFlags + 383
45 0x639631p Py_Main + 1425
46 0x4b0f40p main + 224
47 0x7f41fd53ab97p __libc_start_main + 231
48 0x5b2fdap _start + 42
Traceback (most recent call last):
File "demo.py", line 97, in <module>
MODULE.demo()
File "demo.py", line 79, in demo
exe.run(fluid.default_startup_program())
File "/usr/local/lib/python3.6/dist-packages/paddle/fluid/executor.py", line 644, in run
raise e
File "/usr/local/lib/python3.6/dist-packages/paddle/fluid/executor.py", line 640, in run
use_program_cache=use_program_cache)
File "/usr/local/lib/python3.6/dist-packages/paddle/fluid/executor.py", line 669, in _run_impl
use_program_cache=use_program_cache)
File "/usr/local/lib/python3.6/dist-packages/paddle/fluid/executor.py", line 766, in _run_program
exe.run(program.desc, scope, 0, True, True, fetch_var_name)
paddle.fluid.core_noavx.EnforceNotMet: Invoke operator fill_constant error.
Python Callstacks:
File "/usr/local/lib/python3.6/dist-packages/paddle/fluid/framework.py", line 1844, in _prepend_op
attrs=kwargs.get("attrs", None))
File "/usr/local/lib/python3.6/dist-packages/paddle/fluid/initializer.py", line 189, in __call__
stop_gradient=True)
File "/usr/local/lib/python3.6/dist-packages/paddle/fluid/framework.py", line 1627, in create_var
kwargs['initializer'](var, self)
File "/usr/local/lib/python3.6/dist-packages/paddle/fluid/layer_helper_base.py", line 383, in set_variable_initializer
initializer=initializer)
File "/usr/local/lib/python3.6/dist-packages/paddle/fluid/layers/tensor.py", line 142, in create_global_var
value=float(value), force_cpu=force_cpu))
File "/usr/local/lib/python3.6/dist-packages/paddle/fluid/optimizer.py", line 226, in _create_global_learning_rate
persistable=True)
File "/usr/local/lib/python3.6/dist-packages/paddle/fluid/optimizer.py", line 365, in _create_optimization_pass
self._create_global_learning_rate()
File "/usr/local/lib/python3.6/dist-packages/paddle/fluid/optimizer.py", line 532, in apply_gradients
optimize_ops = self._create_optimization_pass(params_grads)
File "/usr/local/lib/python3.6/dist-packages/paddle/fluid/optimizer.py", line 562, in apply_optimize
optimize_ops = self.apply_gradients(params_grads)
File "/usr/local/lib/python3.6/dist-packages/paddle/fluid/optimizer.py", line 601, in minimize
loss, startup_program=startup_program, params_grads=params_grads)
File "/usr/local/lib/python3.6/dist-packages/paddle/fluid/dygraph/base.py", line 86, in __impl__
return func(*args, **kwargs)
File "/usr/local/lib/python3.6/dist-packages/paddle/fluid/wrapped_decorator.py", line 25, in __impl__
return wrapped_func(*args, **kwargs)
File "</usr/local/lib/python3.6/dist-packages/decorator.py:decorator-gen-20>", line 2, in minimize
File "demo.py", line 75, in demo
optim.minimize(output)
File "demo.py", line 97, in <module>
MODULE.demo()
C++ Callstacks:
Enforce failed. Expected allocating <= available, but received allocating:10485338519 > available:1249705728.
Insufficient GPU memory to allocation. at [/paddle/paddle/fluid/platform/gpu_info.cc:293]
PaddlePaddle Call Stacks:
0 0x7f41c0889955p void paddle::platform::EnforceNotMet::Init<std::string>(std::string, char const*, int) + 357
1 0x7f41c0889cb2p paddle::platform::EnforceNotMet::EnforceNotMet(std::string const&, char const*, int) + 82
2 0x7f41c260abe9p paddle::platform::GpuMaxChunkSize() + 617
3 0x7f41c2534064p
4 0x7f41fd309827p
5 0x7f41c253450dp paddle::memory::legacy::GetGPUBuddyAllocator(int) + 109
6 0x7f41c2534721p void* paddle::memory::legacy::Alloc<paddle::platform::CUDAPlace>(paddle::platform::CUDAPlace const&, unsigned long) + 33
7 0x7f41c2534df5p paddle::memory::allocation::NaiveBestFitAllocator::AllocateImpl(unsigned long) + 405
8 0x7f41c252f113p paddle::memory::allocation::AllocatorFacade::Alloc(boost::variant<paddle::platform::CUDAPlace, paddle::platform::CPUPlace, paddle::platform::CUDAPinnedPlace, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_> const&, unsigned long) + 227
9 0x7f41c252f3bbp paddle::memory::allocation::AllocatorFacade::AllocShared(boost::variant<paddle::platform::CUDAPlace, paddle::platform::CPUPlace, paddle::platform::CUDAPinnedPlace, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_> const&, unsigned long) + 27
10 0x7f41c2199d6cp paddle::memory::AllocShared(boost::variant<paddle::platform::CUDAPlace, paddle::platform::CPUPlace, paddle::platform::CUDAPinnedPlace, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_> const&, unsigned long) + 44
11 0x7f41c2507458p paddle::framework::Tensor::mutable_data(boost::variant<paddle::platform::CUDAPlace, paddle::platform::CPUPlace, paddle::platform::CUDAPinnedPlace, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_>, paddle::framework::proto::VarType_Type, unsigned long) + 136
12 0x7f41c0c29994p paddle::operators::FillConstantKernel<float>::Compute(paddle::framework::ExecutionContext const&) const + 500
13 0x7f41c0c2c8b0p std::_Function_handler<void (paddle::framework::ExecutionContext const&), paddle::framework::OpKernelRegistrarFunctor<paddle::platform::CUDAPlace, false, 0ul, paddle::operators::FillConstantKernel<float>, paddle::operators::FillConstantKernel<double>, paddle::operators::FillConstantKernel<long>, paddle::operators::FillConstantKernel<int>, paddle::operators::FillConstantKernel<paddle::platform::float16> >::operator()(char const*, char const*, int) const::{lambda(paddle::framework::ExecutionContext const&)#1}>::_M_invoke(std::_Any_data const&, paddle::framework::ExecutionContext const&) + 32
14 0x7f41c24b576dp paddle::framework::OperatorWithKernel::RunImpl(paddle::framework::Scope const&, boost::variant<paddle::platform::CUDAPlace, paddle::platform::CPUPlace, paddle::platform::CUDAPinnedPlace, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_> const&, paddle::framework::RuntimeContext*) const + 381
15 0x7f41c24b5dabp paddle::framework::OperatorWithKernel::RunImpl(paddle::framework::Scope const&, boost::variant<paddle::platform::CUDAPlace, paddle::platform::CPUPlace, paddle::platform::CUDAPinnedPlace, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_> const&) const + 555
16 0x7f41c24b321cp paddle::framework::OperatorBase::Run(paddle::framework::Scope const&, boost::variant<paddle::platform::CUDAPlace, paddle::platform::CPUPlace, paddle::platform::CUDAPinnedPlace, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_> const&) + 300
17 0x7f41c09f8216p paddle::framework::Executor::RunPreparedContext(paddle::framework::ExecutorPrepareContext*, paddle::framework::Scope*, bool, bool, bool) + 438
18 0x7f41c09fadc4p paddle::framework::Executor::Run(paddle::framework::ProgramDesc const&, paddle::framework::Scope*, int, bool, bool, std::vector<std::string, std::allocator<std::string> > const&, bool) + 132
19 0x7f41c087bae3p
20 0x7f41c08b9b54p
21 0x5674fcp _PyCFunction_FastCallDict + 860
22 0x50abb3p
23 0x50c5b9p _PyEval_EvalFrameDefault + 1097
24 0x508245p
25 0x50a080p
26 0x50aa7dp
27 0x50d390p _PyEval_EvalFrameDefault + 4640
28 0x508245p
29 0x50a080p
30 0x50aa7dp
31 0x50d390p _PyEval_EvalFrameDefault + 4640
32 0x508245p
33 0x50a080p
34 0x50aa7dp
35 0x50c5b9p _PyEval_EvalFrameDefault + 1097
36 0x508245p
37 0x50a080p
38 0x50aa7dp
39 0x50c5b9p _PyEval_EvalFrameDefault + 1097
40 0x508245p
41 0x50b403p PyEval_EvalCode + 35
42 0x635222p
43 0x6352d7p PyRun_FileExFlags + 151
44 0x638a8fp PyRun_SimpleFileExFlags + 383
45 0x639631p Py_Main + 1425
46 0x4b0f40p main + 224
47 0x7f41fd53ab97p __libc_start_main + 231
48 0x5b2fdap _start + 42
So what I understand from this error is that not enough GPU memory could be allocated. But there is around 13 GB of RAM available on Google Colab with the GPU. How much memory is required for this demo?
Or is there another error that leads to this?
I also noticed that at the beginning it states "The original error is: No module named 'paddle.fluid.core_avx'", but as I understand it, this is a warning and not the cause of this error, right?
Thanks for your help!
Thanks for sharing your code. You really did an amazing job. Do you plan to publish the code for just the depth estimation proposed in your paper? Thank you.
Hello.
I have read your paper and run some experiments with 3DCSPN, but I didn't get good results.
I'm confused about how the affinity matrix is obtained in the 3D module (page 7).
Is it produced by a single 3D Conv and ReLU, or by a multi-layer Conv model?
Thx
Will it be provided later?
I have reproduced the model trained on NYUv2 but ran into a problem: the loss and all evaluation metrics became NaN after 26 epochs of training.
My setup is one 3090 GPU with pytorch == 1.9.1 and the default settings for all parameters. Do you have any idea how to solve this problem?
Thanks for your help.
First of all, thank you for sharing code for your great work!
While checking the data loader myself,
I found strange behavior in the following part:
Lines 84 to 102 in 15fa8a0
This process applies ToTensor() -> Normalize((0.485, 0.456, 0.406), (0.229, 0.224, 0.225)) -> ToPILImage(),
but input value range changes as follows:
[0, 1] -> around [-2.5, 2.5] -> [0, 255]
In this process, because of how ToPILImage() is implemented, values outside [0, 1] are not handled correctly, so the current implementation gives unexpected values after data augmentation.
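The range shift can be reproduced with plain arithmetic. The sketch below is pure Python and is not the repository's data loader. `normalize` and `denormalize` are hypothetical helpers introduced only for illustration. They use the mean and std values quoted above to show how far the normalized values leave [0, 1], and that inverting the normalization brings them back.

```python
# Per-channel statistics quoted in the post above.
MEAN = (0.485, 0.456, 0.406)
STD = (0.229, 0.224, 0.225)


def normalize(pixel):
    """Apply (v - mean) / std per channel to an RGB pixel in [0, 1]."""
    return tuple((v - m) / s for v, m, s in zip(pixel, MEAN, STD))


def denormalize(pixel):
    """Invert normalize(), mapping the values back to [0, 1]."""
    return tuple(v * s + m for v, m, s in zip(pixel, MEAN, STD))


# Extreme inputs give the extreme normalized values.
black = normalize((0.0, 0.0, 0.0))
white = normalize((1.0, 1.0, 1.0))
lowest, highest = min(black), max(white)
print(f"normalized range: [{lowest:.3f}, {highest:.3f}]")

# Undoing the normalization restores the original [0, 1] values.
restored = denormalize(white)
print("restored white pixel:", restored)
```

One possible way to avoid the problem, under these assumptions, is to undo the normalization before any conversion that expects [0, 1], or to normalize only after the PIL-based augmentation is done.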
Here I post some examples.
Original Image
Un-normalized without this process (No second PIL conversion)
Un-normalized with this process
Did you use this process for the ECCV and AAAI submissions?
Interestingly, it still works as it is when I train with your code myself
(but slightly worse than the reported value: 0.117 vs. 0.131 for NYU).
Thank you very much!
Hi,
I noticed that you employ a function self.max_of_8_tensor(...) to obtain the final results. Why? It seems hard for me to associate the code with the operation proposed in the paper.
Can the provided code be used to perform stereo matching given two images?
I couldn't find the code for it.
Hi
Thank you very much for your contribution. I have a question from reading your paper: did you upsample the depth to the original size for evaluation?
Hi thanks for the code.
Just wondering if the listed results on NYU are achieved via resnet18 or resnet50 encoder?
nvm it is resnet50
raw_depth_input = blur_depth
...
if sparse_depth is not None:
    result_depth = (1 - sparse_mask) * result_depth + sparse_mask * raw_depth_input
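Hello, I have a question. To preserve the valid values in the sparse depth, why not replace them with the values from the sparse input instead of using blur_depth? blur_depth is predicted by the network, so it is not necessarily accurate. Could you please explain this? Thank you!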
Thanks a lot for sharing your code! I'm trying to understand your variants of spatial pyramid pooling layers, especially atrous convolutional SPP. Since there is no code for those modules, I hope the author can confirm my understanding below.
I suppose that ACSPP is basically CSPP in Fig 3 (b) with some modifications to make it "atrous". I suppose this module (between the input and output features in the figure) should replace the following three lines in your PyTorch code.
CSPN/cspn_pytorch/models/torch_resnet_cspn_nyu.py
Lines 365 to 367 in 24eff12
To my understanding, the "atrous" version of Fig 3b would look as follows. Note that it is written in PyTorch-like pseudocode where padding and stride options are omitted. The code should replace the above three lines, taking input x and outputting x.
# Input x has c=1024 channels
b, c, h, w = x.shape
kh, kw = 3, 3
# Output a single channel weight map for subsequent four parallel CSPN layers.
# (Although Fig 3b says it also uses BN and ReLU, I suppose it is only Conv2d).
W = conv2d(x, kernel_size=3, output_channel=1)
# From W, we compose four 3x3 spatially-dependent kernel weight maps
# W1, W2, W3, W4 with dilation rates={6,12,18,24} and reshaping.
W1 = unfold(W, kernel_size=3, dilation= 6).reshape(b, 1, kh*kw, h, w)
W2 = unfold(W, kernel_size=3, dilation=12).reshape(b, 1, kh*kw, h, w)
W3 = unfold(W, kernel_size=3, dilation=18).reshape(b, 1, kh*kw, h, w)
W4 = unfold(W, kernel_size=3, dilation=24).reshape(b, 1, kh*kw, h, w)
# Normalize convolution weight maps along kernel axis
W1 = abs(W1)/abs(W1).sum(dim=2, keepdim=True)
W2 = abs(W2)/abs(W2).sum(dim=2, keepdim=True)
W3 = abs(W3)/abs(W3).sum(dim=2, keepdim=True)
W4 = abs(W4)/abs(W4).sum(dim=2, keepdim=True)
# Convolve x with the four weight maps, using the corresponding dilation rates.
# Here, the resulting y's have the same channels and resolution as x: (b, c, h, w)
y1 = unfold(x, kernel_size=3, dilation= 6).reshape(b, c, kh*kw, h, w)
y2 = unfold(x, kernel_size=3, dilation=12).reshape(b, c, kh*kw, h, w)
y3 = unfold(x, kernel_size=3, dilation=18).reshape(b, c, kh*kw, h, w)
y4 = unfold(x, kernel_size=3, dilation=24).reshape(b, c, kh*kw, h, w)
y1 = (y1*W1).sum(dim=2)
y2 = (y2*W2).sum(dim=2)
y3 = (y3*W3).sum(dim=2)
y4 = (y4*W4).sum(dim=2)
# Apply Conv2d-BN-ReLU to each y1, y2, y3, y4 to get 256-channel feature maps
z1 = relu(bn(conv2d(y1, output_channel=256, kernel_size=3, dilation=6)))
z2 = relu(bn(conv2d(y2, output_channel=256, kernel_size=3, dilation=12)))
z3 = relu(bn(conv2d(y3, output_channel=256, kernel_size=3, dilation=18)))
z4 = relu(bn(conv2d(y4, output_channel=256, kernel_size=3, dilation=24)))
# Concat them to produce the output of the module
x = concat([z1, z2, z3, z4], dim=1)
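For reference, the normalization and dilated weighted-sum steps in the pseudocode above can be sanity-checked on a tiny single-channel grid. This is a minimal pure-Python sketch of that interpretation only, not the author's ACSPP implementation. `unfold_3x3` and `propagate` are hypothetical helpers, zero padding is assumed, and the small `eps` is an added guard against division by zero.

```python
def unfold_3x3(grid, dilation):
    """For every pixel, collect its 9 dilated 3x3 neighbours (zero-padded)."""
    h, w = len(grid), len(grid[0])
    offsets = [(dy * dilation, dx * dilation)
               for dy in (-1, 0, 1) for dx in (-1, 0, 1)]
    patches = []
    for i in range(h):
        row = []
        for j in range(w):
            patch = []
            for dy, dx in offsets:
                y, x = i + dy, j + dx
                inside = 0 <= y < h and 0 <= x < w
                patch.append(grid[y][x] if inside else 0.0)
            row.append(patch)
        patches.append(row)
    return patches


def propagate(x, weight_map, dilation, eps=1e-12):
    """Weighted sum of dilated neighbours with abs-sum normalized weights."""
    xs = unfold_3x3(x, dilation)
    ws = unfold_3x3(weight_map, dilation)
    h, w = len(x), len(x[0])
    y = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            abs_w = [abs(v) for v in ws[i][j]]
            total = sum(abs_w) + eps  # normalize along the kernel axis
            y[i][j] = sum(xv * wv / total for xv, wv in zip(xs[i][j], abs_w))
    return y


# A constant input must stay constant, because the normalized weights sum to 1.
x = [[5.0] * 6 for _ in range(6)]
weight_map = [[1.0] * 6 for _ in range(6)]
y = propagate(x, weight_map, dilation=2)
print(y[0][0], y[3][3])
```

The check relies on the normalized weights at each pixel summing to one, which is what the `abs(W)/abs(W).sum(dim=2, keepdim=True)` lines in the pseudocode are meant to ensure.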
Can you verify my code and tell me if there is any misunderstanding? In particular, please check the following points.
There seems to be a problem with the link to the pre-trained model (NYU Depth V2 (Fast Unpool, non-pos)).
Is it possible to test on an image (or image set) that does not have GT depth data? I am referring to Cityscapes, which does not have LiDAR data. I only care about the predicted depth using 0 samples.
Hi,
I checked "torch_resnet_cspn_nyu.py" to understand how you implemented the depth network before the CSPN module, and I found a lot of dead code in that file. For example, could you please tell me where self.up_proj_layer(1/2/3/4) are used in the forward function? In addition, the implementation of UpProj does not match what is depicted in Fig. 5 of the conference paper. After upsampling by self._up_pool (an Unpool object), the output is used by the shortcut layer. However, the figure shows conv, BN and ReLU before the upsampled output is fed to the shortcut. Could you please clarify these points for me?
Thanks for providing the code. I appreciate it. However, I think the code could have been written far better, to help people working in academia understand your idea.
Thank you,
Hello Cheng, thanks for your work; I like your TPAMI paper very much. I tried to use your 3DCSPN by following the instructions, but I found no implementation of affinity_propagate in PaddlePaddle. Could you tell me which PaddlePaddle version you used and where I can find the function in PaddlePaddle? Thank you very much!
Is one propagation the same as the description in the paper 'Learning Depth with Convolutional Spatial Propagation Network'?
What is the effect of max_of_4_tensor and eight_way_propagation?
Thanks a lot!
Hi
As I understand it, Eq. 6 in the arXiv paper should correspond to this line:
CSPN/cspn_pytorch/models/cspn.py
Line 81 in b3e487b
if sparse_depth is not None:
    result_depth = (
        1 - sparse_mask
    ) * result_depth + sparse_mask * raw_depth_input
should be
if sparse_depth is not None:
    result_depth = (
        1 - sparse_mask
    ) * result_depth + sparse_mask * sparse_depth
such that you keep the depth at the points where you have depth?
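The replacement proposed above can be illustrated on a flattened toy example. This is a minimal sketch in plain Python, not the repository's tensor code. `keep_sparse_points` is a hypothetical helper, and it assumes that a value of 0 in the sparse input marks a pixel without a measurement.

```python
def keep_sparse_points(result_depth, sparse_depth):
    """Keep the measured depth where one exists, else the propagated value."""
    blended = []
    for res, sparse in zip(result_depth, sparse_depth):
        # Assumption: 0 in the sparse input means "no measurement here".
        mask = 1.0 if sparse > 0 else 0.0
        blended.append((1.0 - mask) * res + mask * sparse)
    return blended


result_depth = [1.9, 2.4, 3.1, 0.8]  # propagated prediction
sparse_depth = [0.0, 2.5, 0.0, 1.0]  # measurements at pixels 1 and 3
blended = keep_sparse_points(result_depth, sparse_depth)
print(blended)
```

With this blend, pixels that carry a measurement take the measured value, and all other pixels keep the propagated prediction.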
As the number of iteration steps gets larger (in my code it is 24), the gradients explode more easily. How can I deal with that?
When I run this code on CPU and load the best model you provide, the results on my evaluation samples look very bad. The data I use is nyu_depth_v2_labeled.mat, which I convert to image and depth npy files and then load. The resulting errors:
Epoch: 0, step: 0, loss=1.3417
MSE=2.7589(2.7589) RMSE=1.6610(1.6610) MAE=1.3417(1.3417) ABS_REL=0.4409(0.4409)
DELTA1.02=0.0673(0.0673) DELTA1.05=0.1037(0.1037) DELTA1.10=0.1616(0.1616)
DELTA1.25=0.3161(0.3161) DELTA1.25^2=0.5606(0.5606) DELTA1.25^3=0.7404(0.7404)
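For readers comparing against these numbers, the metrics in the log can be sketched as follows. This is a pure-Python illustration of the commonly used definitions, not the repository's evaluation code. `depth_metrics` is a hypothetical helper, and it assumes every ground-truth and predicted depth is positive. The log is consistent with RMSE being the square root of MSE, since 1.6610 squared is about 2.7589.

```python
import math


def depth_metrics(pred, gt):
    """Compute common depth-estimation metrics over valid (positive) pixels."""
    n = len(gt)
    errors = [p - g for p, g in zip(pred, gt)]
    mse = sum(e * e for e in errors) / n
    mae = sum(abs(e) for e in errors) / n
    abs_rel = sum(abs(e) / g for e, g in zip(errors, gt)) / n
    # Delta accuracy: share of pixels whose ratio max(p/g, g/p) is below a threshold.
    ratios = [max(p / g, g / p) for p, g in zip(pred, gt)]
    metrics = {"MSE": mse, "RMSE": math.sqrt(mse), "MAE": mae, "ABS_REL": abs_rel}
    for name, threshold in [("DELTA1.25", 1.25),
                            ("DELTA1.25^2", 1.25 ** 2),
                            ("DELTA1.25^3", 1.25 ** 3)]:
        metrics[name] = sum(r < threshold for r in ratios) / n
    return metrics


pred = [1.0, 2.0, 4.0]
gt = [1.0, 2.2, 2.0]
metrics = depth_metrics(pred, gt)
for name, value in metrics.items():
    print(f"{name}={value:.4f}")
```

A DELTA1.25 of about 0.32, as in the log above, would mean that only about a third of the pixels are within a factor of 1.25 of the ground truth.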
I think this method can be applied to semantic segmentation too, so I ran a test. It works badly, even worse than the coarse mask.
Hi:
I have some confusion about max_of_8_tensor() and gate1_w1_cmb to gate8_w1_cmb. Are they used in the stereo version? In my code, I reimplemented the stereo matching task without max_of_8_tensor() and gate1_w1_cmb to gate8_w1_cmb, and I can't get good results.
Thanks for your help.
Hello author, how should I use my own data to build a training set and a test set?