
3d-ken-burns's People

Contributors

jonathanfly, luxedo, nicolasap-dm, sniklaus


3d-ken-burns's Issues

Image resolution seems to take a hit

Loving the paper and its implementation. The results really do look stunning! I'm just having an issue with the output resolution being severely degraded in comparison to the input image. Everything seems so blurry. Is this just an artifact of the animation process itself?

CPU inference?

Congratulations on your paper, and thanks for open-sourcing this very effective work. I have one question about your Python implementation; I hope you can give some advice.

  • Is it possible to produce the 3D Ken Burns effect without using CUDA (i.e., with CPU-only inference)?

Thanks,

Axel
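For context: the point-cloud renderer compiles CuPy CUDA kernels (see the render_pointcloud tracebacks elsewhere on this page), so only the depth-estimation stage is a realistic candidate for CPU-only execution. A minimal sketch of that stage on the CPU, assuming the Disparity class defined in models/disparity-estimation.py is in scope and the weights file has been downloaded:

# Minimal sketch of CPU-only depth estimation; the point-cloud rendering
# itself still requires CUDA.
import cv2
import numpy
import torch

moduleDisparity = Disparity().cpu().eval()
moduleDisparity.load_state_dict(torch.load('./models/disparity-estimation.pytorch', map_location='cpu'))

numpyImage = cv2.imread('./images/doublestrike.jpg')
tensorImage = torch.FloatTensor(numpyImage[:, :, ::-1].transpose(2, 0, 1).copy()) * (1.0 / 255.0)

with torch.no_grad():
    tensorDisparity = moduleDisparity(tensorImage.unsqueeze(0))  # runs entirely on the CPU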

Segmentation fault

running on KDE Neon (Ubuntu 18.04)
NVIDIA Corporation GM200 [GeForce GTX TITAN X] (rev a1)
Cuda compilation tools, release 10.2, V10.2.89
Cudnn 7.6.5
Python 3.6.9
torch 1.5.0a0+90a259e (compiled from source)
torchvision 0.6.0a0+28b7f8a (compiled from source)

Command: python3 autozoom.py --in ./images/doublestrike.jpg --out ./autozoom.mp4
I get a segmentation fault when execution reaches this line in process_load() in common.py:
tensorDisparity = disparity_estimation(tensorImage)

Thanks for your help.
Luc

Another issue after cloning the repository

Traceback (most recent call last):
File "/data/home/v_lifengmei/anaconda3/envs/ken/lib/python3.7/site-packages/cupy/cuda/compiler.py", line 438, in compile
nvrtc.compileProgram(self.ptr, options)
File "cupy/cuda/nvrtc.pyx", line 101, in cupy.cuda.nvrtc.compileProgram
File "cupy/cuda/nvrtc.pyx", line 111, in cupy.cuda.nvrtc.compileProgram
File "cupy/cuda/nvrtc.pyx", line 56, in cupy.cuda.nvrtc.check_status
cupy.cuda.nvrtc.NVRTCError: NVRTC_ERROR_COMPILATION (6)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "autozoom.py", line 88, in <module>
'objFrom': objFrom
File "<string>", line 116, in process_autozoom
File "<string>", line 433, in render_pointcloud
File "cupy/util.pyx", line 81, in cupy.util.memoize.decorator.ret
File "<string>", line 296, in launch_kernel
File "/data/home/v_lifengmei/anaconda3/envs/ken/lib/python3.7/site-packages/cupy/cuda/compiler.py", line 287, in compile_with_cache
extra_source, backend)
File "/data/home/v_lifengmei/anaconda3/envs/ken/lib/python3.7/site-packages/cupy/cuda/compiler.py", line 339, in _compile_with_cache_cuda
ptx = compile_using_nvrtc(source, options, arch, name + '.cu')
File "/data/home/v_lifengmei/anaconda3/envs/ken/lib/python3.7/site-packages/cupy/cuda/compiler.py", line 147, in compile_using_nvrtc
ptx = prog.compile(options)
File "/data/home/v_lifengmei/anaconda3/envs/ken/lib/python3.7/site-packages/cupy/cuda/compiler.py", line 442, in compile
raise CompileException(log, self.src, self.name, options, 'nvrtc')
cupy.cuda.compiler.CompileException: /tmp/tmpmblxzzzw/a2d1608f61e34b593ec1dcda48d72df5_2.cubin.cu(56): catastrophic error: cannot open source file "cuda_runtime.h"

1 catastrophic error detected in the compilation of "/tmp/tmpmblxzzzw/a2d1608f61e34b593ec1dcda48d72df5_2.cubin.cu".
Compilation terminated.

I am sure that cuda_runtime.h is under my CUDA root.
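NVRTC failing to open cuda_runtime.h usually means CuPy cannot locate the toolkit's include directory, regardless of where the header itself lives. CuPy resolves that directory through the CUDA_PATH environment variable, so a hedged first thing to try (the path below is an assumption; point it at your actual CUDA root):

# Tell CuPy where the CUDA toolkit lives before it compiles any kernels;
# /usr/local/cuda is an assumed install location.
import os
os.environ['CUDA_PATH'] = '/usr/local/cuda'  # must contain include/cuda_runtime.h

import cupy  # import after setting the variable so the compiler picks it up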

Meaning of parameters in meta data json files

For the synthetic depth/normal dataset, there is a meta data json file included with the RGB images. This contains two parameters: intSample and fltFov. Could you explain what these mean? Ideally, I would like to be able to compute camera intrinsics in the form of focal length/principal point or a K matrix. Any guidance on doing this from the json files would be appreciated.
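For reference, under the usual pinhole model the focal length in pixels follows directly from the field of view, so a K matrix can be assembled from fltFov once the image size is known. A hedged sketch, assuming fltFov is the field of view in degrees across the image width, square pixels, and a principal point at the image center:

# Hedged sketch of recovering a K matrix from fltFov.
import math
import numpy

def intrinsics_from_fov(fltFov, intWidth, intHeight):
    fltFocal = 0.5 * intWidth / math.tan(0.5 * math.radians(fltFov))
    return numpy.array([
        [fltFocal, 0.0, 0.5 * intWidth],   # fx, skew, cx
        [0.0, fltFocal, 0.5 * intHeight],  # fy (square pixels assumed), cy
        [0.0, 0.0, 1.0],
    ])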

'NoneType' object has no attribute 'shape' happens for almost every image, why?

Here's an example of the errors I get when running in Colab. By the way, when it does work, it's very cool! Congrats!

Traceback (most recent call last):
File "autozoom.py", line 65, in <module>
intWidth = numpyImage.shape[1]
AttributeError: 'NoneType' object has no attribute 'shape'

(this same traceback repeats for every failing image)
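For what it's worth, cv2.imread returns None rather than raising when a file is missing or unreadable, which then surfaces as exactly this AttributeError at the first .shape access. A hedged guard (arguments_strIn stands in for whatever variable holds the input path):

# Fail early with a readable message instead of the NoneType error;
# cv2.imread returns None on any read failure.
import cv2

numpyImage = cv2.imread(filename=arguments_strIn, flags=cv2.IMREAD_COLOR)
assert numpyImage is not None, 'could not read image: ' + arguments_strIn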

Getting IndexError: list index out of range while running default command

I am using the python autozoom.py --in ./images/doublestrike.jpg --out ./autozoom.mp4 command to run the code, but I am getting IndexError: list index out of range.

This is the package list I have installed:

blas=1.0=mkl
ca-certificates=2021.7.5=haa95532_1
cached-property=1.5.2=py_0
certifi=2021.5.30=py37haa95532_0
cffi=1.14.6=py37h2bbff1b_0
charset-normalizer=2.0.4=pypi_0
click=8.0.1=pyhd3eb1b0_0
colorama=0.4.4=pypi_0
cudatoolkit=10.1.243=h74a9793_0
cudnn=7.6.5=cuda10.1_0
cupy=8.3.0=py37hd4ca531_0
decorator=4.4.2=pypi_0
fastrlock=0.6=py37hd77b12b_0
flask=1.1.2=pyhd3eb1b0_0
freetype=2.10.4=hd328e21_0
gevent=21.8.0=py37h2bbff1b_1
greenlet=1.1.1=py37hd77b12b_0
h5py=3.2.1=py37h3de5c98_0
hdf5=1.10.6=h7ebc959_0
icc_rt=2019.0.0=h0cc432a_1
idna=3.2=pypi_0
imageio=2.9.0=pypi_0
imageio-ffmpeg=0.4.5=pypi_0
importlib-metadata=3.10.0=py37haa95532_0
intel-openmp=2021.3.0=haa95532_3372
itsdangerous=2.0.1=pyhd3eb1b0_0
jinja2=3.0.1=pyhd3eb1b0_0
jpeg=9b=hb83a4c4_2
libpng=1.6.37=h2a8f88b_0
libtiff=4.2.0=hd0e1b90_0
lz4-c=1.9.3=h2bbff1b_1
markupsafe=2.0.1=py37h2bbff1b_0
mkl=2021.3.0=haa95532_524
mkl-service=2.4.0=py37h2bbff1b_0
mkl_fft=1.3.0=py37h277e83a_2
mkl_random=1.2.2=py37hf11a4ad_0
moviepy=1.0.3=pypi_0
ninja=1.10.2=h6d14046_1
numpy=1.20.3=py37ha4e8547_0
numpy-base=1.20.3=py37hc2deb75_0
olefile=0.46=py37_0
opencv-contrib-python=4.5.3.56=pypi_0
openssl=1.1.1k=h2bbff1b_0
pillow=8.3.1=py37h4fa10fc_0
pip=21.2.4=pypi_0
proglog=0.1.9=pypi_0
pycparser=2.20=py_2
pyreadline=2.1=py37_1
python=3.7.11=h6244533_0
pytorch=1.6.0=py3.7_cuda101_cudnn7_0
requests=2.26.0=pypi_0
scipy=1.6.2=py37h66253e8_1
setuptools=52.0.0=py37haa95532_0
six=1.16.0=pyhd3eb1b0_0
sqlite=3.36.0=h2bbff1b_0
tk=8.6.10=he774522_0
torchvision=0.7.0=py37_cu101
tqdm=4.62.2=pypi_0
typing_extensions=3.10.0.0=pyh06a4308_0
urllib3=1.26.6=pypi_0
vc=14.2=h21ff451_1
vs2015_runtime=14.27.29016=h5e58377_2
werkzeug=1.0.1=pyhd3eb1b0_0
wheel=0.37.0=pyhd3eb1b0_0
wincertstore=0.2=py37_0
xz=5.2.5=h62dcd97_0
zipp=3.5.0=pyhd3eb1b0_0
zlib=1.2.11=h62dcd97_4
zope=1.0=py37_1
zope.event=4.5.0=py37_0
zope.interface=5.4.0=py37h2bbff1b_0
zstd=1.4.9=h19a0ad4_0

IndexError: index 0 is out of bounds for dimension 0 with size 0

Hi, thanks for your incredible work.

autozoom.py works well for me on most images. However, I ran into the error below when inputting this image:

Traceback (most recent call last):
  File "autozoom.py", line 76, in <module>
    process_load(npyImage, {})
  File "<string>", line 10, in process_load
  File "<string>", line 64, in disparity_adjustment
IndexError: index 0 is out of bounds for dimension 0 with size 0

Python version

Python 3.8.10

The package list I have installed


Package                Version
---------------------- ---------
certifi                2021.10.8
charset-normalizer     2.0.7
click                  8.0.3
cupy                   9.5.0
decorator              4.4.2
fastrlock              0.8
Flask                  2.0.2
gevent                 21.8.0
greenlet               1.1.2
h5py                   3.5.0
idna                   3.3
imageio                2.9.0
imageio-ffmpeg         0.4.5
itsdangerous           2.0.1
Jinja2                 3.0.2
MarkupSafe             2.0.1
moviepy                1.0.3
numpy                  1.21.3
opencv-python-headless 4.5.4.58
Pillow                 8.4.0
pip                    21.2.4
proglog                0.1.9
requests               2.26.0
scipy                  1.7.1
setuptools             58.1.0
torch                  1.10.0
torchvision            0.11.1
tqdm                   4.62.3
typing-extensions      3.10.0.2
urllib3                1.26.7
Werkzeug               2.0.2
wheel                  0.37.0
zope.event             4.5.0
zope.interface         5.4.0

RuntimeError: view size is not compatible

When I try to run the Colab version of this, I get an error on the final step:

Traceback (most recent call last):
File "autozoom.py", line 76, in <module>
process_load(npyImage, {})
File "<string>", line 10, in process_load
File "<string>", line 128, in disparity_refinement
File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 550, in __call__
result = self.forward(*input, **kwargs)
File "<string>", line 94, in forward
RuntimeError: view size is not compatible with input tensor's size and stride (at least one dimension spans across two contiguous subspaces). Use .reshape(...) instead.

Do you know what this might be?

Thank you
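The error message itself names the usual fix: the .view(...) call at line 94 of the refinement network's forward receives a non-contiguous tensor, which newer PyTorch versions reject. A hedged sketch of the two equivalent workarounds (tensorInput is a stand-in name for whatever tensor that line reshapes):

tensorOutput = tensorInput.reshape(tensorInput.shape[0], -1)            # works on non-contiguous tensors
tensorOutput = tensorInput.contiguous().view(tensorInput.shape[0], -1)  # equivalent alternative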

training code

Can you post the training code, please? Thank you!

Input dimensions during inference

Hello Simon!

I have a question about image resizing during inference. You write:

Different from existing work, we do not resize the input image to a fixed resolution when providing it to the network and instead resize it such that its largest dimension is 512 pixels while preserving its aspect ratio.

Why is it the larger dimension that is set to 512 and not the smaller one? For example, if I crop the center during training, I would be looking at the shorter side, so it would seem consistent to do the same during inference.
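For reference, the resizing rule quoted above reduces to a single scale factor. A hedged sketch (variable names are illustrative):

# Scale so the larger side becomes 512 while preserving aspect ratio.
import torch

fltScale = 512.0 / max(intWidth, intHeight)
tensorResized = torch.nn.functional.interpolate(
    input=tensorImage,  # 1 x 3 x intHeight x intWidth
    size=(int(round(intHeight * fltScale)), int(round(intWidth * fltScale))),
    mode='bilinear',
    align_corners=False,
)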

Curious About Training Dataset

Thank you for your impressive work. I am really curious about how you create the training pairs for color- and depth-image inpainting.
I wonder if you would be willing to share a link to the training dataset in the future?

Depth adjustment

Hello Simon!
Do you plan on adding depth adjustment to the depth estimation script? I am interested.

"Please note that this script does not perform the depth adjustment, I will add it to the script at a later time should people end up being interested in it."

Attempting to read a .pytorch file instead of a .py file

Hi! Can I get a solution to this error? I can't seem to find anything online for this either.

python autozoom.py --in ./images/doublestrike.jpg --out ./autozoom.mp4
pygame 1.9.6
Hello from the pygame community. https://www.pygame.org/contribute.html
Traceback (most recent call last):
File "autozoom.py", line 45, in
exec(open('./models/disparity-estimation.py', 'r').read())
File "", line 195, in
File "C:\Users\msaif\AppData\Local\Programs\Python\Python37\lib\site-packages\torch\serialization.py", line 525, in load
with _open_file_like(f, 'rb') as opened_file:
File "C:\Users\msaif\AppData\Local\Programs\Python\Python37\lib\site-packages\torch\serialization.py", line 212, in _open_file_like
return _open_file(name_or_buffer, mode)
File "C:\Users\msaif\AppData\Local\Programs\Python\Python37\lib\site-packages\torch\serialization.py", line 193, in init
super(_open_file, self).init(open(name, mode))
FileNotFoundError: [Errno 2] No such file or directory: './models/disparity-estimation.pytorch'

I cannot get the autozoom.py instructions to work...

After much installing and tweaking, I am still unable to get
python autozoom.py --in MYINPUTFILE.jpg --out MYOUTPUTFILE.mp4
to work.

The script complains that './models/disparity-estimation.pytorch' does not exist, which it doesn't. Could we get some instructions on how to obtain this model? The download script in the setup does not include this file.

Thanks!

Training scripts

Do you plan to release the training scripts any time soon?

It would be great to be able to compare against your results in academic papers.

Distorting the depth when training the depth refinement network

Hi, thank you for the wonderful work.

I have a question about the distortion mentioned in Section 3.1.3 Depth Refinement.
It states that when training the depth refinement network,

"we downsample and distort the ground truth depth to simulate the coarse prediction depth maps ...".

Could you explain in a bit more detail how the distortion is applied to the ground-truth depth?
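As an illustration only, not the authors' confirmed procedure: one plausible way to simulate coarse predictions is to downsample the ground truth aggressively, perturb it, and upsample it back to the training resolution.

# Hedged illustration -- NOT the paper's confirmed procedure.
import torch

def simulate_coarse(tensorDepth, intCoarse=64):
    intHeight, intWidth = tensorDepth.shape[2], tensorDepth.shape[3]
    tensorCoarse = torch.nn.functional.interpolate(tensorDepth, size=(intCoarse, intCoarse), mode='bilinear', align_corners=False)
    tensorCoarse = tensorCoarse * (1.0 + 0.05 * torch.randn_like(tensorCoarse))  # mild multiplicative distortion
    return torch.nn.functional.interpolate(tensorCoarse, size=(intHeight, intWidth), mode='bilinear', align_corners=False)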

Strange error - probably not you

It seems to depend on the CUDA samples' helper_math.h but doesn't know where to find it.

(base) PS C:\Users\Owner\Desktop\Code\ML\3d-ken-burns> python autozoom.py --in ./images/doublestrike.jpg --out ./autozoom.mp4
Traceback (most recent call last):
  File "C:\Users\Owner\Anaconda3\lib\site-packages\cupy\cuda\compiler.py", line 242, in compile
    nvrtc.compileProgram(self.ptr, options)
  File "cupy\cuda\nvrtc.pyx", line 98, in cupy.cuda.nvrtc.compileProgram
  File "cupy\cuda\nvrtc.pyx", line 108, in cupy.cuda.nvrtc.compileProgram
  File "cupy\cuda\nvrtc.pyx", line 53, in cupy.cuda.nvrtc.check_status
cupy.cuda.nvrtc.NVRTCError: NVRTC_ERROR_COMPILATION (6)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "autozoom.py", line 87, in <module>
    'objectFrom': objectFrom
  File "<string>", line 116, in process_autozoom
  File "<string>", line 443, in render_pointcloud
  File "cupy\util.pyx", line 55, in cupy.util.memoize.decorator.ret
  File "<string>", line 306, in launch_kernel
  File "C:\Users\Owner\Anaconda3\lib\site-packages\cupy\cuda\compiler.py", line 165, in compile_with_cache
    ptx = compile_using_nvrtc(source, options, arch, name + '.cu')
  File "C:\Users\Owner\Anaconda3\lib\site-packages\cupy\cuda\compiler.py", line 81, in compile_using_nvrtc
    ptx = prog.compile(options)
  File "C:\Users\Owner\Anaconda3\lib\site-packages\cupy\cuda\compiler.py", line 246, in compile
    raise CompileException(log, self.src, self.name, options)
cupy.cuda.compiler.CompileException: C:\Users\Owner\AppData\Local\Temp\tmpu807esdp\6e7e4513b4a6c2bb9216b74b496c77b7_2.cubin.cu(2): catastrophic error: cannot open source file "samples/common/inc/helper_math.h"

1 catastrophic error detected in the compilation of "C:\Users\Owner\AppData\Local\Temp\tmpu807esdp\6e7e4513b4a6c2bb9216b74b496c77b7_2.cubin.cu".
Compilation terminated.

I verified the samples exist here: C:\ProgramData\NVIDIA Corporation\CUDA Samples\v10.1
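A hedged workaround, since the kernels reference the header through the relative path samples/common/inc/helper_math.h: mirror that relative path under a directory that is already on the compiler's include path, such as the toolkit's own include directory (all paths below are assumptions; adjust to your install). Editing the #include line in common.py to the absolute path verified above is the blunt alternative.

# Copy helper_math.h so the relative include resolves from the toolkit's
# include directory; source and destination paths are assumptions.
import os
import shutil

strSrc = r'C:\ProgramData\NVIDIA Corporation\CUDA Samples\v10.1\common\inc\helper_math.h'
strDst = os.path.join(os.environ['CUDA_PATH'], 'include', 'samples', 'common', 'inc', 'helper_math.h')
os.makedirs(os.path.dirname(strDst), exist_ok=True)
shutil.copy(strSrc, strDst)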

Benchmark results

You write that "the depth boundary error is currently different from the paper". I see that the huge dbe_com can be ignored; however, there are noticeable differences in some other metrics as well. Do you know why?

abs_rel =  0.09667361210538511
sq_rel  =  0.08968517555964581
rms     =  0.468931547303287
log10   =  0.04028934128953222
thr1    =  0.9042070992913529
thr2    =  0.9735211480731727
thr3    =  0.9914000487947728
dde_0   =  0.9348635887380082
dde_m   =  0.02829501122683011
dde_p   =  0.0368414000351617
dbe_acc =  2.027432278515633
dbe_com =  29.32951945280528
pe_fla  =  2.1928497290019897
pe_ori  =  10.243341646515104

AttributeError: 'Sequential' object has no attribute '17'

After watching a video about this, I immediately looked up this repository. I am very excited to get it working, but I am encountering this issue and can't seem to resolve it.

Currently have:
pytorch 1.4.0
python 3.7.3 64-bit
cupy 7.1.0
moviepy 1.0.1

3d-ken-burns-master\3d-ken-burns-master>python autozoom.py --in ./images/doublestrike.jpg --out ./autozoom.mp4
pygame 1.9.6
Hello from the pygame community. https://www.pygame.org/contribute.html
Traceback (most recent call last):
File "autozoom.py", line 45, in <module>
exec(open('./models/disparity-estimation.py', 'r').read())
File "<string>", line 194, in <module>
File "C:\Users\msaif\AppData\Local\Programs\Python\Python37\lib\site-packages\torch\jit\__init__.py", line 1456, in init_then_script
self.__dict__["_actual_script_module"] = torch.jit._recursive.create_script_module(self, stubs)
File "C:\Users\msaif\AppData\Local\Programs\Python\Python37\lib\site-packages\torch\jit\_recursive.py", line 296, in create_script_module
return create_script_module_impl(nn_module, concrete_type, cpp_module, stubs)
File "C:\Users\msaif\AppData\Local\Programs\Python\Python37\lib\site-packages\torch\jit\_recursive.py", line 336, in create_script_module_impl
script_module = torch.jit.RecursiveScriptModule._construct(cpp_module, init_fn)
File "C:\Users\msaif\AppData\Local\Programs\Python\Python37\lib\site-packages\torch\jit\__init__.py", line 1593, in _construct
init_fn(script_module)
File "C:\Users\msaif\AppData\Local\Programs\Python\Python37\lib\site-packages\torch\jit\_recursive.py", line 328, in init_fn
scripted = recursive_script(orig_value)
File "C:\Users\msaif\AppData\Local\Programs\Python\Python37\lib\site-packages\torch\jit\_recursive.py", line 534, in recursive_script
return create_script_module(nn_module, infer_methods_to_compile(nn_module))
File "C:\Users\msaif\AppData\Local\Programs\Python\Python37\lib\site-packages\torch\jit\_recursive.py", line 296, in create_script_module
return create_script_module_impl(nn_module, concrete_type, cpp_module, stubs)
File "C:\Users\msaif\AppData\Local\Programs\Python\Python37\lib\site-packages\torch\jit\_recursive.py", line 336, in create_script_module_impl
script_module = torch.jit.RecursiveScriptModule._construct(cpp_module, init_fn)
File "C:\Users\msaif\AppData\Local\Programs\Python\Python37\lib\site-packages\torch\jit\__init__.py", line 1593, in _construct
init_fn(script_module)
File "C:\Users\msaif\AppData\Local\Programs\Python\Python37\lib\site-packages\torch\jit\_recursive.py", line 328, in init_fn
scripted = recursive_script(orig_value)
File "C:\Users\msaif\AppData\Local\Programs\Python\Python37\lib\site-packages\torch\jit\_recursive.py", line 534, in recursive_script
return create_script_module(nn_module, infer_methods_to_compile(nn_module))
File "C:\Users\msaif\AppData\Local\Programs\Python\Python37\lib\site-packages\torch\jit\_recursive.py", line 296, in create_script_module
return create_script_module_impl(nn_module, concrete_type, cpp_module, stubs)
File "C:\Users\msaif\AppData\Local\Programs\Python\Python37\lib\site-packages\torch\jit\_recursive.py", line 336, in create_script_module_impl
script_module = torch.jit.RecursiveScriptModule._construct(cpp_module, init_fn)
File "C:\Users\msaif\AppData\Local\Programs\Python\Python37\lib\site-packages\torch\jit\__init__.py", line 1593, in _construct
init_fn(script_module)
File "C:\Users\msaif\AppData\Local\Programs\Python\Python37\lib\site-packages\torch\jit\_recursive.py", line 321, in init_fn
orig_value = getattr(nn_module, name)
File "C:\Users\msaif\AppData\Local\Programs\Python\Python37\lib\site-packages\torch\nn\modules\module.py", line 576, in __getattr__
type(self).__name__, name))
AttributeError: 'Sequential' object has no attribute '17'

list index out of range

python autozoom.py --in ./images/doublestrike.jpg --out ./autozoom.mp4
Traceback (most recent call last):
File "autozoom.py", line 89, in
'objFrom': objFrom
File "", line 119, in process_autozoom
File "", line 436, in render_pointcloud
File "cupy/_util.pyx", line 59, in cupy._util.memoize.decorator.ret
File "", line 296, in launch_kernel
IndexError: list index out of range

Coordinate system of normal maps

In your synthetic dataset, could you confirm that the normal maps are in camera coordinates as opposed to world? If I compute a normal map from the depth maps using finite difference and the intrinsic camera parameters, I can't get something that looks close to the ground truth.

Also, I wonder if there is quite heavy quantisation in the normal/depth maps? I think they are stored with 16 bit depth - is that right?
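For reference, a hedged sketch of the finite-difference computation described above, assuming a pinhole camera with focal length fltFocal in pixels and a centered principal point; sign and axis conventions are the usual source of mismatches against ground truth, so flipping them is worth trying before suspecting the data:

# Camera-space normals from a depth map via finite differences.
import numpy

def normals_from_depth(npyDepth, fltFocal):
    intHeight, intWidth = npyDepth.shape
    npyX, npyY = numpy.meshgrid(numpy.arange(intWidth), numpy.arange(intHeight))
    # back-project each pixel to camera coordinates
    npyPoints = numpy.stack([
        (npyX - 0.5 * intWidth) * npyDepth / fltFocal,
        (npyY - 0.5 * intHeight) * npyDepth / fltFocal,
        npyDepth,
    ], axis=-1)
    npyDu = numpy.gradient(npyPoints, axis=1)  # derivative along image x
    npyDv = numpy.gradient(npyPoints, axis=0)  # derivative along image y
    npyNormal = numpy.cross(npyDu, npyDv)
    return npyNormal / (numpy.linalg.norm(npyNormal, axis=-1, keepdims=True) + 1e-10)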

Error when trying to generate from the image

Hello! I'm interested in your work and tried to use your code to generate a 3D Ken Burns video, but there is a problem when I try to generate it. My CUDA version is 10.2 and all the other packages have been installed. The environment is Windows 10.
The error is below:

Traceback (most recent call last):
File "autozoom.py", line 88, in
'objFrom': objFrom
File "", line 116, in process_autozoom
File "", line 433, in render_pointcloud
File "cupy\util.pyx", line 81, in cupy.util.memoize.decorator.ret
File "", line 293, in launch_kernel
IndexError: list index out of range

I wonder why this happens and hope for your reply. Thanks a lot!
BTW, I'm curious about the training method; is there any chance that you will release the training code? Just asking.

AttributeError: 'NoneType' object has no attribute 'get_function'

When running "python autozoom.py --in ./images/doublestrike.jpg --out ./autozoom.mp4", I get the following error:

Traceback (most recent call last):
File "autozoom.py", line 89, in
'objFrom': objFrom
File "", line 116, in process_autozoom
File "", line 433, in render_pointcloud
File "cupy/_util.pyx", line 67, in cupy._util.memoize.decorator.ret
File "", line 296, in launch_kernel
AttributeError: 'NoneType' object has no attribute 'get_function'

Any thoughts for how to fix this?

Two places to calculate disparity map in the project

Hi,
Love your project.
After reading your code, I am a little confused about how the disparity map is generated.
From the code, there seem to be two different ways to calculate the disparity map:

  1. common.py
    tensorDisparity = tensorDisparity / tensorDisparity.max() * objectCommon['dblBaseline']
  2. depthestim.py
    tensorDisparity = torch.nn.functional.interpolate(input=tensorDisparity, size=(tensorImage.size(2), tensorImage.size(3)), mode='bilinear', align_corners=False) * (max(tensorImage.size(2), tensorImage.size(3)) / 256.0)

Can you kindly explain the difference?
Thanks.

Baseline between the 4 views in the dataset?

Hi,

First of all, thank you for providing the wonderful dataset.
However, in the provided dataset, there is a json file which describes only the FOV for each of the 4 views. Is it possible to provide the baseline between the views?

Thank you!

Evaluate on the NYUv2 test set?

Hi! Could you please provide the script you used to evaluate on the NYUv2 test set? I am trying to reproduce the results mentioned in the paper but can't.

Ground truth resolution

Hello Simon!
How do you reach the resolution of 1024 for the largest dimension if the resolution of the synthetic dataset is only 512x512?

Depth Extraction

Hello

I'm sorry if this is the wrong place to ask this, but I was wondering if there is a way to extract the depth map this tool creates?

Thank You
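For what it's worth, other threads on this page reference a depthestim.py script in the repository that runs only the depth-estimation part and writes the result to disk. A hedged guess at its usage, assuming it takes the same --in/--out flags as autozoom.py:

python depthestim.py --in ./images/doublestrike.jpg --out ./depthestim.npy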

Depth as png

Hello Simon!
Following #11 👍

  • Disparity is resolution-dependent
  • Baseline is fixed (resolution independent)

cv2.imwrite(filename=arguments_strOut.replace('.npy', '.png'), img=(numpyDisparity / dblBaseline * 255.0).clip(0.0, 255.0).astype(numpy.uint8))

Then their ratio is also resolution-dependent and we may theoretically have problems with clipping, right?
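If clipping and quantisation are a concern, a hedged alternative to the 8-bit write above is a 16-bit PNG, which OpenCV writes directly from a uint16 array; normalizing by the observed maximum instead of the fixed baseline avoids clipping, at the cost of having to remember the scale:

# Hedged sketch: store disparity as 16-bit PNG instead of 8-bit; reuses the
# numpyDisparity and arguments_strOut variables from the line above.
import cv2
import numpy

numpyScaled = numpyDisparity / numpyDisparity.max() * 65535.0
cv2.imwrite(filename=arguments_strOut.replace('.npy', '.png'), img=numpyScaled.clip(0.0, 65535.0).astype(numpy.uint16))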

No Image when using interface.py on Google Colab

Using https://github.com/wpmed92/3d-ken-burns-colab as the base, I added the following to the end of the notebook to attempt to use interface.py so I can do my own camera paths:

#Get an internet accessible address to the local server from interface.py. Changed server port in interface.py to 8050
from google.colab.output import eval_js
print(eval_js("google.colab.kernel.proxyPort(8050)"))
# Will be something like: https://z4spb7cvssd-496ff2e9c6d22116-8050-colab.googleusercontent.com/

#Run the interface
!python interface.py

Everything seems to be set up right and allows me to load an image from my PC, modify the zooms, etc. However, the finished 3D image never appears. Here is the log from interface.py:

127.0.0.1 - - [2020-05-24 18:49:04] "GET / HTTP/1.1" 200 9154 0.001346
127.0.0.1 - - [2020-05-24 18:49:04] "GET /favicon.ico HTTP/1.1" 404 356 0.001295
127.0.0.1 - - [2020-05-24 18:49:15] "POST /update_mode HTTP/1.1" 400 318 0.000623
127.0.0.1 - - [2020-05-24 18:49:15] "POST /load_image HTTP/1.1" 400 318 0.001246
127.0.0.1 - - [2020-05-24 18:49:15] "POST /update_mode HTTP/1.1" 400 318 0.000806
127.0.0.1 - - [2020-05-24 18:49:15] "POST /update_mode HTTP/1.1" 400 318 0.000689
127.0.0.1 - - [2020-05-24 18:49:15] "POST /update_mode HTTP/1.1" 400 318 0.000638
127.0.0.1 - - [2020-05-24 18:49:15] "POST /update_from HTTP/1.1" 400 318 0.000629
127.0.0.1 - - [2020-05-24 18:49:15] "POST /update_mode HTTP/1.1" 400 318 0.000582
127.0.0.1 - - [2020-05-24 18:49:16] "POST /update_mode HTTP/1.1" 400 318 0.000783
127.0.0.1 - - [2020-05-24 18:49:16] "POST /update_from HTTP/1.1" 400 318 0.000632
127.0.0.1 - - [2020-05-24 18:49:16] "POST /update_mode HTTP/1.1" 400 318 0.000649
127.0.0.1 - - [2020-05-24 18:49:16] "POST /update_from HTTP/1.1" 400 318 0.000658
127.0.0.1 - - [2020-05-24 18:49:16] "POST /update_from HTTP/1.1" 400 318 0.000727
127.0.0.1 - - [2020-05-24 18:49:16] "POST /update_to HTTP/1.1" 400 318 0.001202
127.0.0.1 - - [2020-05-24 18:49:17] "POST /update_mode HTTP/1.1" 400 318 0.000553
127.0.0.1 - - [2020-05-24 18:49:17] "POST /update_to HTTP/1.1" 400 318 0.000643
127.0.0.1 - - [2020-05-24 18:49:17] "POST /update_to HTTP/1.1" 400 318 0.000613
127.0.0.1 - - [2020-05-24 18:49:17] "POST /update_to HTTP/1.1" 400 318 0.000714
127.0.0.1 - - [2020-05-24 18:49:17] "POST /update_mode HTTP/1.1" 400 318 0.000655
127.0.0.1 - - [2020-05-24 18:50:59] "GET /get_live HTTP/1.1" 200 36695563 115.058576

And here is what the browser tab showing the interface displays:
[screenshot]

Any help to get this to work is much appreciated.

Doug

unable to run the interface.py

While running the build with:

  1. cuda 10.1
  2. python36
  3. OS: Windows 10

I get the following error.

Traceback (most recent call last):
File ".\interface.py", line 45, in <module>
exec(open('./models/disparity-estimation.py', 'r').read())
File "<string>", line 195, in <module>
File "<my python virtual env for python 36>", line 419, in load
f = open(f, 'rb')
FileNotFoundError: [Errno 2] No such file or directory: './models/disparity-estimation.pytorch'

I am not sure why it's looking for a .pytorch file when the input is a .py file. Maybe I am missing something; please let me know.

pytorch version and cupy version?

Hello!
Thanks for your repo! It is a very beautiful project!
Recently I wanted to try your code, so I ran it on a GeForce RTX 2080 with pytorch 1.3.1 and CUDA 10.0, and on a different machine on a K40 with pytorch 1.3.1 and CUDA 9.2/10.0, but I get approximately the same error in both cases:
"RuntimeError: CUDA error: no kernel image is available for execution on the device (nms_cuda at /tmp/pip-req-build-c2_g4c3l/torchvision/csrc/cuda/nms_cuda.cu:127)"
So, can you give me some advice for running this project?
Thank you so much!

Question about the units of measurement

Hello Simon!

From what I've seen (e.g. NYUv2, MegaDepth), depth is usually measured in meters. Your model seems to output depth in millimeters. The sample of the synthetic dataset that you published also seems to be in millimeters. Could you clarify what units you are using and why?

I think this is important (and maybe something that should be mentioned) when you combine the two losses because the ordinal loss depends on the units whereas the gradient loss does not, so 1e-4*ordinal loss + gradient loss in millimeters would become 1e-7*ordinal loss + gradient loss when using meters.

Dolly Zoom effect

Has anybody tried to implement the dolly zoom effect based on this code?
We can already zoom in the video; now we only need to add the dolly component, which moves the camera toward or away from the scene.
I think this could be implemented by changing the process_shift function to shift along the depth axis instead of x. Any ideas?
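For reference, the geometry of a dolly zoom is a single constraint: to keep a subject at depth d the same apparent size while the camera advances by t, the focal length must be scaled by (d - t) / d. A small sketch, independent of this repo's internals:

# Dolly-zoom constraint: on-screen size of a subject at depth fltSubject is
# proportional to fltFocal / depth, so moving forward by fltShift requires
# shrinking the focal length (i.e. widening the FOV) by (d - t) / d.
def dolly_focal(fltFocal, fltSubject, fltShift):
    return fltFocal * (fltSubject - fltShift) / fltSubject

print(dolly_focal(1024.0, 10.0, 5.0))  # halfway to the subject -> focal length halves: 512.0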

Convert disparity model to onnx

I am trying to convert the disparity estimation network to ONNX, but it fails.
Here is the code for the conversion:

moduleDisparity = Disparity().cpu().eval()
moduleDisparity.load_state_dict(torch.load('./models/disparity-estimation.pytorch',map_location=torch.device('cpu')))
.
.
.
tenImage = torch.nn.functional.interpolate(input=tenImage, size=(intHeight, intWidth), mode='bilinear', align_corners=False)
torch.onnx.export(moduleDisparity,           # model being run
                  tenImage,                  # model input (or a tuple for multiple inputs)
                  "disparity.onnx",          # where to save the model (can be a file or file-like object)
                  export_params=True,        # store the trained parameter weights inside the model file
                  opset_version=11,          # the ONNX version to export the model to
                  do_constant_folding=True,  # whether to execute constant folding for optimization
                )

Any ideas what might be wrong?

Below is the full log:

depthestim.py:106: TracerWarning: There are 2 live references to the data region being modified when tracing in-place operator copy_ (possibly due to an assignment). This might cause the trace to be incorrect, because all other views that also reference this data will not reflect this change in the trace! On the other hand, if all other views use the same memory chunk, but are disjoint (e.g. are outputs of torch.split), this might still be safe.
depthestim.py:107: TracerWarning: There are 2 live references to the data region being modified when tracing in-place operator copy_ (possibly due to an assignment). This might cause the trace to be incorrect, because all other views that also reference this data will not reflect this change in the trace! On the other hand, if all other views use the same memory chunk, but are disjoint (e.g. are outputs of torch.split), this might still be safe.
depthestim.py:108: TracerWarning: There are 2 live references to the data region being modified when tracing in-place operator copy_ (possibly due to an assignment). This might cause the trace to be incorrect, because all other views that also reference this data will not reflect this change in the trace! On the other hand, if all other views use the same memory chunk, but are disjoint (e.g. are outputs of torch.split), this might still be safe.
Traceback (most recent call last):
  File "depthestim.py", line 70, in <module>
    tenDisparity = disparity_estimation(tenImage)
  File "<string>", line 217, in disparity_estimation
  File "/usr/local/lib/python3.7/site-packages/torch/onnx/__init__.py", line 143, in export
    strip_doc_string, dynamic_axes, keep_initializers_as_inputs)
  File "/usr/local/lib/python3.7/site-packages/torch/onnx/utils.py", line 66, in export
    dynamic_axes=dynamic_axes, keep_initializers_as_inputs=keep_initializers_as_inputs)
  File "/usr/local/lib/python3.7/site-packages/torch/onnx/utils.py", line 382, in _export
    fixed_batch_size=fixed_batch_size)
  File "/usr/local/lib/python3.7/site-packages/torch/onnx/utils.py", line 249, in _model_to_graph
    graph, torch_out = _trace_and_get_graph_from_model(model, args, training)
  File "/usr/local/lib/python3.7/site-packages/torch/onnx/utils.py", line 206, in _trace_and_get_graph_from_model
    trace, torch_out, inputs_states = torch.jit.get_trace_graph(model, args, _force_outplace=True, _return_inputs_states=True)
  File "/usr/local/lib/python3.7/site-packages/torch/jit/__init__.py", line 275, in get_trace_graph
    return LegacyTracedModule(f, _force_outplace, return_inputs, _return_inputs_states)(*args, **kwargs)
  File "/usr/local/lib/python3.7/site-packages/torch/nn/modules/module.py", line 541, in __call__
    result = self.forward(*input, **kwargs)
  File "/usr/local/lib/python3.7/site-packages/torch/jit/__init__.py", line 352, in forward
    out = self.inner(*trace_inputs)
  File "/usr/local/lib/python3.7/site-packages/torch/nn/modules/module.py", line 539, in __call__
    result = self._slow_forward(*input, **kwargs)
  File "/usr/local/lib/python3.7/site-packages/torch/nn/modules/module.py", line 525, in _slow_forward
    result = self.forward(*input, **kwargs)
  File "<string>", line 153, in forward
  File "/usr/local/lib/python3.7/site-packages/torch/nn/modules/module.py", line 539, in __call__
    result = self._slow_forward(*input, **kwargs)
  File "/usr/local/lib/python3.7/site-packages/torch/nn/modules/module.py", line 525, in _slow_forward
    result = self.forward(*input, **kwargs)
  File "<string>", line 110, in forward
  File "/usr/local/lib/python3.7/site-packages/torch/nn/modules/module.py", line 539, in __call__
    result = self._slow_forward(*input, **kwargs)
  File "/usr/local/lib/python3.7/site-packages/torch/nn/modules/module.py", line 525, in _slow_forward
    result = self.forward(*input, **kwargs)
  File "/usr/local/lib/python3.7/site-packages/torch/nn/modules/container.py", line 92, in forward
    input = module(input)
  File "/usr/local/lib/python3.7/site-packages/torch/nn/modules/module.py", line 539, in __call__
    result = self._slow_forward(*input, **kwargs)
  File "/usr/local/lib/python3.7/site-packages/torch/nn/modules/module.py", line 525, in _slow_forward
    result = self.forward(*input, **kwargs)
  File "/usr/local/lib/python3.7/site-packages/torch/nn/modules/container.py", line 92, in forward
    input = module(input)
  File "/usr/local/lib/python3.7/site-packages/torch/nn/modules/module.py", line 539, in __call__
    result = self._slow_forward(*input, **kwargs)
  File "/usr/local/lib/python3.7/site-packages/torch/nn/modules/module.py", line 525, in _slow_forward
    result = self.forward(*input, **kwargs)
  File "/usr/local/lib/python3.7/site-packages/torch/nn/modules/conv.py", line 345, in forward
    return self.conv2d_forward(input, self.weight)
  File "/usr/local/lib/python3.7/site-packages/torch/nn/modules/conv.py", line 342, in conv2d_forward
    self.padding, self.dilation, self.groups)
RuntimeError: Cannot insert a Tensor that requires grad as a constant. Consider making it a parameter or input, or detaching the gradient
Tensor:
[values of a 64x3x3x3 convolution-weight tensor elided]
[ torch.FloatTensor{64,3,3,3} ]
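The trace ends in "Cannot insert a Tensor that requires grad as a constant", and the message itself suggests the workaround: detach or freeze whatever weights the tracer tries to bake in as constants. A hedged first thing to try before calling the export:

# Freeze all weights so the tracer may treat them as constants, as the
# error message recommends; this is a guess at the fix, not a confirmed one.
for tensorParam in moduleDisparity.parameters():
    tensorParam.requires_grad = False

torch.onnx.export(moduleDisparity, tenImage, 'disparity.onnx', export_params=True, opset_version=11)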

How to Use the Refinement Network

Hello Simon! If I want to run your refinement network on a disparity map that was not produced by your first network, what are the requirements on the input? Suppose the disparity map is normalized to 0..1, where 1 corresponds to the nearest object and 0 to the farthest one (alternatively 0..255); should I scale it in some way before feeding it to the refinement network?

Single pixel artefact in resulting video

Hi Team,

Awesome work!

There is a single-pixel artefact that moves in a way that looks tied to the acceleration of the viewport. It looks like the pixel at the origin of the viewport might be set to 0 or a very high value.

You can see it in most of the videos created with the tool, but I am sure it is there in all of them.

In this Waxy article:
https://waxy.org/2019/11/turning-photos-into-2-5d-parallax-animations-with-machine-learning/

You can spot it in the bottom-left corner of the dress in the kissing-in-Times-Square one, at the bottom of Nixon's jacket in Elvis + Nixon, and on Gordon Sondland's left hand.
