Nvdiffrast - Modular Primitives for High-Performance Differentiable Rendering

License: Other

nvdiffrast's Introduction

Modular Primitives for High-Performance Differentiable Rendering
Samuli Laine, Janne Hellsten, Tero Karras, Yeongho Seol, Jaakko Lehtinen, Timo Aila
http://arxiv.org/abs/2011.03277

Nvdiffrast is a PyTorch/TensorFlow library that provides high-performance primitive operations for rasterization-based differentiable rendering. Please refer to ☞☞ nvdiffrast documentation ☜☜ for more information.

Licenses

This work is made available under the Nvidia Source Code License.

For business inquiries, please visit our website and submit the form: NVIDIA Research Licensing

We do not currently accept outside code contributions in the form of pull requests.

Environment map stored as part of samples/data/envphong.npz is derived from a Wave Engine sample material originally shared under MIT License. Mesh and texture stored as part of samples/data/earth.npz are derived from 3D Earth Photorealistic 2K model originally made available under TurboSquid 3D Model License.

Citation

@article{Laine2020diffrast,
  title   = {Modular Primitives for High-Performance Differentiable Rendering},
  author  = {Samuli Laine and Janne Hellsten and Tero Karras and Yeongho Seol and Jaakko Lehtinen and Timo Aila},
  journal = {ACM Transactions on Graphics},
  year    = {2020},
  volume  = {39},
  number  = {6}
}

nvdiffrast's Issues

nvdiffrast + Jax?

Hi!
I have a theoretical question. Nvdiffrast currently supports PyTorch and TensorFlow, and as far as I can see a lot of code is reused for both versions. How much effort would be needed to add support for JAX? I understand that JAX support is outside the nvdiffrast dev team's plans; I'm just wondering.
Please let me know if GitHub issues are not an appropriate place for such questions.

DeprecationWarning: 'saved_variables' is deprecated; use 'saved_tensors'

Hi!
I was experimenting with optimization and ran into the following warnings:

/home/daiver/coding/nvdiffrast/nvdiffrast/torch/ops.py:324: DeprecationWarning: 'saved_variables' is deprecated; use 'saved_tensors'
  attr, rast, tri, rast_db = ctx.saved_variables
/home/daiver/coding/nvdiffrast/nvdiffrast/torch/ops.py:182: DeprecationWarning: 'saved_variables' is deprecated; use 'saved_tensors'
  pos, tri, out = ctx.saved_variables
/home/daiver/coding/nvdiffrast/nvdiffrast/torch/ops.py:583: DeprecationWarning: 'saved_variables' is deprecated; use 'saved_tensors'
  color, rast, pos, tri = ctx.saved_variables
/home/daiver/coding/nvdiffrast/nvdiffrast/torch/ops.py:418: DeprecationWarning: 'saved_variables' is deprecated; use 'saved_tensors'
  tex, uv, uv_da, mip_level_bias, *mip_stack = ctx.saved_variables
/home/daiver/coding/nvdiffrast/nvdiffrast/torch/ops.py:324: DeprecationWarning: 'saved_variables' is deprecated; use 'saved_tensors'
  attr, rast, tri, rast_db = ctx.saved_variables
/home/daiver/coding/nvdiffrast/nvdiffrast/torch/ops.py:182: DeprecationWarning: 'saved_variables' is deprecated; use 'saved_tensors'
  pos, tri, out = ctx.saved_variables

// The same lines repeat because I run nvdiffrast in an optimization loop.

My torch version: 1.9.0+cu102, my nvdiffrast version: 0.2.5.

I cannot create a minimal example right now, but it looks like this issue can be fixed by a simple rename. Please let me know if you still need a minimal example or any additional information.
Thank you for the great library!
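
Until the attribute is renamed in ops.py, the warning can be filtered on the caller's side. The following is a minimal sketch using only the standard `warnings` module. The helper name `silence_saved_variables_warning` is made up for this example, and the message pattern is copied from the log above; it is a workaround, not a fix for the underlying deprecation.

```python
import warnings


def silence_saved_variables_warning():
    """Ignore only the 'saved_variables' DeprecationWarning quoted above."""
    # `message` is a regular expression matched against the start of the
    # warning text, so other DeprecationWarnings are left untouched.
    warnings.filterwarnings(
        "ignore",
        message=r"'saved_variables' is deprecated",
        category=DeprecationWarning,
    )


# Call once, before entering the optimization loop.
silence_saved_variables_warning()
```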

glewInit() failed

I'm on Ubuntu 18.04 and I've installed all dependencies as in the Dockerfile, including:

recompiling GLEW 2.1.0 from source as in the Dockerfile
setting the environment variables LD_LIBRARY_PATH=/usr/lib64:$LD_LIBRARY_PATH, PYOPENGL_PLATFORM=egl, and CC=gcc-8.

I'm using torch 1.7.1, cuda 10.2

I keep getting this error on any sample in the torch folder:

[F glutil.inl:188] glewInit() failed, return value = 4

Any idea why?

[E glutil.cpp:248] eglMakeCurrent() failed when setting GL context

Hi, I followed the documentation and am using nvdiffrast on Ubuntu 18.04 LTS with CUDA 11.4.
I executed the following command, which throws an exception.

command:

python3 cube.py --resolution 16 --display-interval 10

exception

No output directory specified, not saving log or images
Mesh has 12 triangles and 8 vertices.
iter=0,err=0.489876
[E glutil.cpp:248] eglMakeCurrent() failed when setting GL context
Traceback (most recent call last):
  File "cube.py", line 200, in <module>
    main()
  File "cube.py", line 191, in main
    mp4save_fn='progress.mp4'
  File "cube.py", line 122, in fit_cube
    color     = render(glctx, r_mvp, vtx_pos, pos_idx, vtx_col, col_idx, resolution)
  File "cube.py", line 30, in render
    rast_out, _ = dr.rasterize(glctx, pos_clip, pos_idx, resolution=[resolution, resolution])
  File "/home/zeming/.local/lib/python3.6/site-packages/nvdiffrast/torch/ops.py", line 241, in rasterize
    return _rasterize_func.apply(glctx, pos, tri, resolution, ranges, grad_db, -1)
  File "/home/zeming/.local/lib/python3.6/site-packages/nvdiffrast/torch/ops.py", line 175, in forward
    out, out_db = _get_plugin().rasterize_fwd(glctx.cpp_wrapper, pos, tri, resolution, ranges, peeling_idx)
RuntimeError: Cuda error: 219[cudaGraphicsMapResources(2, &s.cudaPosBuffer, stream);]
[E glutil.cpp:248] eglMakeCurrent() failed when setting GL context
terminate called after throwing an instance of 'c10::Error'
  what():  Cuda error: 219[cudaGraphicsUnregisterResource(s.cudaPosBuffer);]
Exception raised from rasterizeReleaseBuffers at /home/zeming/.local/lib/python3.6/site-packages/nvdiffrast/common/rasterize.cpp:573 (most recent call first):
frame #0: c10::Error::Error(c10::SourceLocation, std::string) + 0x42 (0x7efc192eda22 in /home/zeming/.local/lib/python3.6/site-packages/torch/lib/libc10.so)
frame #1: c10::detail::torchCheckFail(char const*, char const*, unsigned int, std::string const&) + 0x5b (0x7efc192ea3db in /home/zeming/.local/lib/python3.6/site-packages/torch/lib/libc10.so)
frame #2: rasterizeReleaseBuffers(int, RasterizeGLState&) + 0xdb (0x7efaa982e63f in /home/zeming/.cache/torch_extensions/nvdiffrast_plugin/nvdiffrast_plugin.so)
frame #3: RasterizeGLStateWrapper::~RasterizeGLStateWrapper() + 0x33 (0x7efaa9885397 in /home/zeming/.cache/torch_extensions/nvdiffrast_plugin/nvdiffrast_plugin.so)
frame #4: std::default_delete<RasterizeGLStateWrapper>::operator()(RasterizeGLStateWrapper*) const + 0x22 (0x7efaa986c9f2 in /home/zeming/.cache/torch_extensions/nvdiffrast_plugin/nvdiffrast_plugin.so)
frame #5: std::unique_ptr<RasterizeGLStateWrapper, std::default_delete<RasterizeGLStateWrapper> >::~unique_ptr() + 0x49 (0x7efaa98618c9 in /home/zeming/.cache/torch_extensions/nvdiffrast_plugin/nvdiffrast_plugin.so)
frame #6: <unknown function> + 0xab003 (0x7efaa985b003 in /home/zeming/.cache/torch_extensions/nvdiffrast_plugin/nvdiffrast_plugin.so)
frame #7: <unknown function> + 0x4ff688 (0x7efc222e2688 in /home/zeming/.local/lib/python3.6/site-packages/torch/lib/libtorch_python.so)
frame #8: <unknown function> + 0x50098e (0x7efc222e398e in /home/zeming/.local/lib/python3.6/site-packages/torch/lib/libtorch_python.so)
frame #9: python3() [0x5732de]
frame #10: python3() [0x54edd2]
frame #11: python3() [0x588fd8]
frame #12: python3() [0x5add78]
frame #13: python3() [0x5add8e]
frame #14: python3() [0x5add8e]
frame #15: python3() [0x56b606]
<omitting python frames>
frame #21: __libc_start_main + 0xe7 (0x7efc28269bf7 in /lib/x86_64-linux-gnu/libc.so.6)

Any advice? Thanks!

Error when trying to use other gpus rather than gpu 0

Hello, I tried to run nvdiffrast on a GPU other than GPU 0 by running:
CUDA_VISIBLE_DEVICES=1 python samples/torch/triangle.py
However, I got this error:

Traceback (most recent call last):
  File "samples/torch/triangle.py", line 26, in <module>
    rast, _ = dr.rasterize(glctx, pos, tri, resolution=[256, 256])
  File "nvdiffrast/nvdiffrast/torch/ops.py", line 227, in rasterize
    return _rasterize_func.apply(glctx, pos, tri, resolution, ranges, grad_db)
  File "nvdiffrast/nvdiffrast/torch/ops.py", line 169, in forward
    out, out_db = _get_plugin().rasterize_fwd(glctx.cpp_wrapper, pos, tri, resolution, ranges)
RuntimeError: CUDA error: invalid device ordinal

How can I solve this problem?
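
For context on the device numbering: `CUDA_VISIBLE_DEVICES` renumbers the GPUs that a process can see, so with `CUDA_VISIBLE_DEVICES=1` the only visible GPU has ordinal 0 inside the process. The sketch below illustrates that mapping in plain Python. `visible_to_physical` is a hypothetical helper written for this note, and it assumes the comma-separated integer form of the variable (not UUIDs).

```python
def visible_to_physical(cuda_visible_devices, logical_index):
    """Map an in-process CUDA ordinal to the physical GPU index.

    Assumes `cuda_visible_devices` is a comma-separated list of integers,
    e.g. "1" or "2,0".
    """
    physical = [int(tok) for tok in cuda_visible_devices.split(",") if tok.strip()]
    if not 0 <= logical_index < len(physical):
        raise IndexError(
            f"invalid device ordinal {logical_index}: "
            f"only {len(physical)} device(s) visible"
        )
    return physical[logical_index]


print(visible_to_physical("1", 0))  # logical device 0 is physical GPU 1
```

Any code path that still asks for ordinal 1 inside such a process would fail in this way, which would be consistent with the traceback above, although the actual cause in this report is not confirmed here.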

running docker example failed "__match_any_sync is undefined"

Hi, I ran into a "__match_any_sync is undefined" problem. Although I have made the replacement in common.h as mentioned here, the build still fails and texture.o is not produced.

The common.h looks like:
`template<class T> static __device__ __forceinline__ void swap(T& a, T& b) { T temp = a; a = b; b = temp; }

//------------------------------------------------------------------------
// Coalesced atomics. These are all done via macros.

#define CA_TEMP _ca_temp
#define CA_TEMP_PARAM float CA_TEMP
#define CA_DECLARE_TEMP(threads_per_block) CA_TEMP_PARAM
#define CA_SET_GROUP_MASK(group, thread_mask)
#define CA_SET_GROUP(group)
#define caAtomicAdd(ptr, value) atomicAdd((ptr), (value))
#define caAtomicAdd3_xyw(ptr, x, y, w) \
    do { \
        atomicAdd((ptr), (x)); \
        atomicAdd((ptr)+1, (y)); \
        atomicAdd((ptr)+3, (w)); \
    } while(0)

#define caAtomicAddTexture(ptr, level, idx, value) atomicAdd((ptr)+(idx), (value))

//------------------------------------------------------------------------
#endif // __CUDACC__`

The error is the following:

`Using container image: gltorch:latest
Running command: ./samples/torch/cube.py --resolution 32
No output directory specified, not saving log or images
Mesh has 12 triangles and 8 vertices.
Traceback (most recent call last):
  File "/opt/conda/lib/python3.7/site-packages/torch/utils/cpp_extension.py", line 1515, in _run_ninja_build
    env=env)
  File "/opt/conda/lib/python3.7/subprocess.py", line 512, in run
    output=stdout, stderr=stderr)
subprocess.CalledProcessError: Command '['ninja', '-v']' returned non-zero exit status 1.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "./samples/torch/cube.py", line 200, in <module>
    main()
  File "./samples/torch/cube.py", line 191, in main
    mp4save_fn='progress.mp4'
  File "./samples/torch/cube.py", line 76, in fit_cube
    glctx = dr.RasterizeGLContext()
  File "/opt/conda/lib/python3.7/site-packages/nvdiffrast/torch/ops.py", line 142, in __init__
    self.cpp_wrapper = _get_plugin().RasterizeGLStateWrapper(output_db, mode == 'automatic')
  File "/opt/conda/lib/python3.7/site-packages/nvdiffrast/torch/ops.py", line 83, in _get_plugin
    torch.utils.cpp_extension.load(name=plugin_name, sources=source_paths, extra_cflags=opts, extra_cuda_cflags=opts, extra_ldflags=ldflags, with_cuda=True, verbose=False)
  File "/opt/conda/lib/python3.7/site-packages/torch/utils/cpp_extension.py", line 974, in load
    keep_intermediates=keep_intermediates)
  File "/opt/conda/lib/python3.7/site-packages/torch/utils/cpp_extension.py", line 1179, in _jit_compile
    with_cuda=with_cuda)
  File "/opt/conda/lib/python3.7/site-packages/torch/utils/cpp_extension.py", line 1279, in _write_ninja_file_and_build_library
    error_prefix="Error building extension '{}'".format(name))
  File "/opt/conda/lib/python3.7/site-packages/torch/utils/cpp_extension.py", line 1529, in _run_ninja_build
    raise RuntimeError(message)
RuntimeError: Error building extension 'nvdiffrast_plugin': [1/13] c++ -MMD -MF common.o.d -DTORCH_EXTENSION_NAME=nvdiffrast_plugin -DTORCH_API_INCLUDE_EXTENSION_H -isystem /opt/conda/lib/python3.7/site-packages/torch/include -isystem /opt/conda/lib/python3.7/site-packages/torch/include/torch/csrc/api/include -isystem /opt/conda/lib/python3.7/site-packages/torch/include/TH -isystem /opt/conda/lib/python3.7/site-packages/torch/include/THC -isystem /usr/local/cuda/include -isystem /opt/conda/include/python3.7m -D_GLIBCXX_USE_CXX11_ABI=0 -fPIC -std=c++14 -DNVDR_TORCH -c /opt/conda/lib/python3.7/site-packages/nvdiffrast/common/common.cpp -o common.o
[2/13] /usr/local/cuda/bin/nvcc -DTORCH_EXTENSION_NAME=nvdiffrast_plugin -DTORCH_API_INCLUDE_EXTENSION_H -isystem /opt/conda/lib/python3.7/site-packages/torch/include -isystem /opt/conda/lib/python3.7/site-packages/torch/include/torch/csrc/api/include -isystem /opt/conda/lib/python3.7/site-packages/torch/include/TH -isystem /opt/conda/lib/python3.7/site-packages/torch/include/THC -isystem /usr/local/cuda/include -isystem /opt/conda/include/python3.7m -D_GLIBCXX_USE_CXX11_ABI=0 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -gencode=arch=compute_61,code=sm_61 --compiler-options '-fPIC' -DNVDR_TORCH -std=c++14 -c /opt/conda/lib/python3.7/site-packages/nvdiffrast/common/interpolate.cu -o interpolate.cuda.o
FAILED: interpolate.cuda.o
/usr/local/cuda/bin/nvcc -DTORCH_EXTENSION_NAME=nvdiffrast_plugin -DTORCH_API_INCLUDE_EXTENSION_H -isystem /opt/conda/lib/python3.7/site-packages/torch/include -isystem /opt/conda/lib/python3.7/site-packages/torch/include/torch/csrc/api/include -isystem /opt/conda/lib/python3.7/site-packages/torch/include/TH -isystem /opt/conda/lib/python3.7/site-packages/torch/include/THC -isystem /usr/local/cuda/include -isystem /opt/conda/include/python3.7m -D_GLIBCXX_USE_CXX11_ABI=0 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -gencode=arch=compute_61,code=sm_61 --compiler-options '-fPIC' -DNVDR_TORCH -std=c++14 -c /opt/conda/lib/python3.7/site-packages/nvdiffrast/common/interpolate.cu -o interpolate.cuda.o
/opt/conda/lib/python3.7/site-packages/nvdiffrast/common/interpolate.cu(178): error: identifier "__match_any_sync" is undefined

1 error detected in the compilation of "/tmp/tmpxft_0000001b_00000000-6_interpolate.cpp1.ii".
[3/13] /usr/local/cuda/bin/nvcc -DTORCH_EXTENSION_NAME=nvdiffrast_plugin -DTORCH_API_INCLUDE_EXTENSION_H -isystem /opt/conda/lib/python3.7/site-packages/torch/include -isystem /opt/conda/lib/python3.7/site-packages/torch/include/torch/csrc/api/include -isystem /opt/conda/lib/python3.7/site-packages/torch/include/TH -isystem /opt/conda/lib/python3.7/site-packages/torch/include/THC -isystem /usr/local/cuda/include -isystem /opt/conda/include/python3.7m -D_GLIBCXX_USE_CXX11_ABI=0 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -gencode=arch=compute_61,code=sm_61 --compiler-options '-fPIC' -DNVDR_TORCH -std=c++14 -c /opt/conda/lib/python3.7/site-packages/nvdiffrast/common/antialias.cu -o antialias.cuda.o
FAILED: antialias.cuda.o
/usr/local/cuda/bin/nvcc -DTORCH_EXTENSION_NAME=nvdiffrast_plugin -DTORCH_API_INCLUDE_EXTENSION_H -isystem /opt/conda/lib/python3.7/site-packages/torch/include -isystem /opt/conda/lib/python3.7/site-packages/torch/include/torch/csrc/api/include -isystem /opt/conda/lib/python3.7/site-packages/torch/include/TH -isystem /opt/conda/lib/python3.7/site-packages/torch/include/THC -isystem /usr/local/cuda/include -isystem /opt/conda/include/python3.7m -D_GLIBCXX_USE_CXX11_ABI=0 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -gencode=arch=compute_61,code=sm_61 --compiler-options '-fPIC' -DNVDR_TORCH -std=c++14 -c /opt/conda/lib/python3.7/site-packages/nvdiffrast/common/antialias.cu -o antialias.cuda.o
/opt/conda/lib/python3.7/site-packages/nvdiffrast/common/antialias.cu(550): error: identifier "__match_any_sync" is undefined

1 error detected in the compilation of "/tmp/tmpxft_00000023_00000000-6_antialias.cpp1.ii".
[4/13] /usr/local/cuda/bin/nvcc -DTORCH_EXTENSION_NAME=nvdiffrast_plugin -DTORCH_API_INCLUDE_EXTENSION_H -isystem /opt/conda/lib/python3.7/site-packages/torch/include -isystem /opt/conda/lib/python3.7/site-packages/torch/include/torch/csrc/api/include -isystem /opt/conda/lib/python3.7/site-packages/torch/include/TH -isystem /opt/conda/lib/python3.7/site-packages/torch/include/THC -isystem /usr/local/cuda/include -isystem /opt/conda/include/python3.7m -D_GLIBCXX_USE_CXX11_ABI=0 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -gencode=arch=compute_61,code=sm_61 --compiler-options '-fPIC' -DNVDR_TORCH -std=c++14 -c /opt/conda/lib/python3.7/site-packages/nvdiffrast/common/rasterize.cu -o rasterize.cuda.o
FAILED: rasterize.cuda.o
/usr/local/cuda/bin/nvcc -DTORCH_EXTENSION_NAME=nvdiffrast_plugin -DTORCH_API_INCLUDE_EXTENSION_H -isystem /opt/conda/lib/python3.7/site-packages/torch/include -isystem /opt/conda/lib/python3.7/site-packages/torch/include/torch/csrc/api/include -isystem /opt/conda/lib/python3.7/site-packages/torch/include/TH -isystem /opt/conda/lib/python3.7/site-packages/torch/include/THC -isystem /usr/local/cuda/include -isystem /opt/conda/include/python3.7m -D_GLIBCXX_USE_CXX11_ABI=0 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -gencode=arch=compute_61,code=sm_61 --compiler-options '-fPIC' -DNVDR_TORCH -std=c++14 -c /opt/conda/lib/python3.7/site-packages/nvdiffrast/common/rasterize.cu -o rasterize.cuda.o
/opt/conda/lib/python3.7/site-packages/nvdiffrast/common/rasterize.cu(66): error: identifier "__match_any_sync" is undefined

1 error detected in the compilation of "/tmp/tmpxft_00000017_00000000-6_rasterize.cpp1.ii".
[5/13] /usr/local/cuda/bin/nvcc -DTORCH_EXTENSION_NAME=nvdiffrast_plugin -DTORCH_API_INCLUDE_EXTENSION_H -isystem /opt/conda/lib/python3.7/site-packages/torch/include -isystem /opt/conda/lib/python3.7/site-packages/torch/include/torch/csrc/api/include -isystem /opt/conda/lib/python3.7/site-packages/torch/include/TH -isystem /opt/conda/lib/python3.7/site-packages/torch/include/THC -isystem /usr/local/cuda/include -isystem /opt/conda/include/python3.7m -D_GLIBCXX_USE_CXX11_ABI=0 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -gencode=arch=compute_61,code=sm_61 --compiler-options '-fPIC' -DNVDR_TORCH -std=c++14 -c /opt/conda/lib/python3.7/site-packages/nvdiffrast/common/texture.cu -o texture.cuda.o
FAILED: texture.cuda.o
/usr/local/cuda/bin/nvcc -DTORCH_EXTENSION_NAME=nvdiffrast_plugin -DTORCH_API_INCLUDE_EXTENSION_H -isystem /opt/conda/lib/python3.7/site-packages/torch/include -isystem /opt/conda/lib/python3.7/site-packages/torch/include/torch/csrc/api/include -isystem /opt/conda/lib/python3.7/site-packages/torch/include/TH -isystem /opt/conda/lib/python3.7/site-packages/torch/include/THC -isystem /usr/local/cuda/include -isystem /opt/conda/include/python3.7m -D_GLIBCXX_USE_CXX11_ABI=0 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -gencode=arch=compute_61,code=sm_61 --compiler-options '-fPIC' -DNVDR_TORCH -std=c++14 -c /opt/conda/lib/python3.7/site-packages/nvdiffrast/common/texture.cu -o texture.cuda.o
/opt/conda/lib/python3.7/site-packages/nvdiffrast/common/texture.cu(593): error: identifier "__match_any_sync" is undefined

/opt/conda/lib/python3.7/site-packages/nvdiffrast/common/texture.cu(594): error: identifier "__match_any_sync" is undefined

/opt/conda/lib/python3.7/site-packages/nvdiffrast/common/texture.cu(595): error: identifier "__match_any_sync" is undefined

/opt/conda/lib/python3.7/site-packages/nvdiffrast/common/texture.cu(596): error: identifier "__match_any_sync" is undefined

/opt/conda/lib/python3.7/site-packages/nvdiffrast/common/texture.cu(600): error: identifier "__match_any_sync" is undefined

/opt/conda/lib/python3.7/site-packages/nvdiffrast/common/texture.cu(601): error: identifier "__match_any_sync" is undefined

/opt/conda/lib/python3.7/site-packages/nvdiffrast/common/texture.cu(602): error: identifier "__match_any_sync" is undefined

/opt/conda/lib/python3.7/site-packages/nvdiffrast/common/texture.cu(603): error: identifier "__match_any_sync" is undefined

/opt/conda/lib/python3.7/site-packages/nvdiffrast/common/texture.cu(936): error: identifier "__match_any_sync" is undefined

9 errors detected in the compilation of "/tmp/tmpxft_0000001d_00000000-6_texture.cpp1.ii".
[6/13] c++ -MMD -MF texture.o.d -DTORCH_EXTENSION_NAME=nvdiffrast_plugin -DTORCH_API_INCLUDE_EXTENSION_H -isystem /opt/conda/lib/python3.7/site-packages/torch/include -isystem /opt/conda/lib/python3.7/site-packages/torch/include/torch/csrc/api/include -isystem /opt/conda/lib/python3.7/site-packages/torch/include/TH -isystem /opt/conda/lib/python3.7/site-packages/torch/include/THC -isystem /usr/local/cuda/include -isystem /opt/conda/include/python3.7m -D_GLIBCXX_USE_CXX11_ABI=0 -fPIC -std=c++14 -DNVDR_TORCH -c /opt/conda/lib/python3.7/site-packages/nvdiffrast/common/texture.cpp -o texture.o
[7/13] c++ -MMD -MF torch_antialias.o.d -DTORCH_EXTENSION_NAME=nvdiffrast_plugin -DTORCH_API_INCLUDE_EXTENSION_H -isystem /opt/conda/lib/python3.7/site-packages/torch/include -isystem /opt/conda/lib/python3.7/site-packages/torch/include/torch/csrc/api/include -isystem /opt/conda/lib/python3.7/site-packages/torch/include/TH -isystem /opt/conda/lib/python3.7/site-packages/torch/include/THC -isystem /usr/local/cuda/include -isystem /opt/conda/include/python3.7m -D_GLIBCXX_USE_CXX11_ABI=0 -fPIC -std=c++14 -DNVDR_TORCH -c /opt/conda/lib/python3.7/site-packages/nvdiffrast/torch/torch_antialias.cpp -o torch_antialias.o
[8/13] c++ -MMD -MF torch_interpolate.o.d -DTORCH_EXTENSION_NAME=nvdiffrast_plugin -DTORCH_API_INCLUDE_EXTENSION_H -isystem /opt/conda/lib/python3.7/site-packages/torch/include -isystem /opt/conda/lib/python3.7/site-packages/torch/include/torch/csrc/api/include -isystem /opt/conda/lib/python3.7/site-packages/torch/include/TH -isystem /opt/conda/lib/python3.7/site-packages/torch/include/THC -isystem /usr/local/cuda/include -isystem /opt/conda/include/python3.7m -D_GLIBCXX_USE_CXX11_ABI=0 -fPIC -std=c++14 -DNVDR_TORCH -c /opt/conda/lib/python3.7/site-packages/nvdiffrast/torch/torch_interpolate.cpp -o torch_interpolate.o
[9/13] c++ -MMD -MF rasterize.o.d -DTORCH_EXTENSION_NAME=nvdiffrast_plugin -DTORCH_API_INCLUDE_EXTENSION_H -isystem /opt/conda/lib/python3.7/site-packages/torch/include -isystem /opt/conda/lib/python3.7/site-packages/torch/include/torch/csrc/api/include -isystem /opt/conda/lib/python3.7/site-packages/torch/include/TH -isystem /opt/conda/lib/python3.7/site-packages/torch/include/THC -isystem /usr/local/cuda/include -isystem /opt/conda/include/python3.7m -D_GLIBCXX_USE_CXX11_ABI=0 -fPIC -std=c++14 -DNVDR_TORCH -c /opt/conda/lib/python3.7/site-packages/nvdiffrast/common/rasterize.cpp -o rasterize.o
[10/13] c++ -MMD -MF torch_rasterize.o.d -DTORCH_EXTENSION_NAME=nvdiffrast_plugin -DTORCH_API_INCLUDE_EXTENSION_H -isystem /opt/conda/lib/python3.7/site-packages/torch/include -isystem /opt/conda/lib/python3.7/site-packages/torch/include/torch/csrc/api/include -isystem /opt/conda/lib/python3.7/site-packages/torch/include/TH -isystem /opt/conda/lib/python3.7/site-packages/torch/include/THC -isystem /usr/local/cuda/include -isystem /opt/conda/include/python3.7m -D_GLIBCXX_USE_CXX11_ABI=0 -fPIC -std=c++14 -DNVDR_TORCH -c /opt/conda/lib/python3.7/site-packages/nvdiffrast/torch/torch_rasterize.cpp -o torch_rasterize.o
[11/13] c++ -MMD -MF torch_texture.o.d -DTORCH_EXTENSION_NAME=nvdiffrast_plugin -DTORCH_API_INCLUDE_EXTENSION_H -isystem /opt/conda/lib/python3.7/site-packages/torch/include -isystem /opt/conda/lib/python3.7/site-packages/torch/include/torch/csrc/api/include -isystem /opt/conda/lib/python3.7/site-packages/torch/include/TH -isystem /opt/conda/lib/python3.7/site-packages/torch/include/THC -isystem /usr/local/cuda/include -isystem /opt/conda/include/python3.7m -D_GLIBCXX_USE_CXX11_ABI=0 -fPIC -std=c++14 -DNVDR_TORCH -c /opt/conda/lib/python3.7/site-packages/nvdiffrast/torch/torch_texture.cpp -o torch_texture.o
[12/13] c++ -MMD -MF torch_bindings.o.d -DTORCH_EXTENSION_NAME=nvdiffrast_plugin -DTORCH_API_INCLUDE_EXTENSION_H -isystem /opt/conda/lib/python3.7/site-packages/torch/include -isystem /opt/conda/lib/python3.7/site-packages/torch/include/torch/csrc/api/include -isystem /opt/conda/lib/python3.7/site-packages/torch/include/TH -isystem /opt/conda/lib/python3.7/site-packages/torch/include/THC -isystem /usr/local/cuda/include -isystem /opt/conda/include/python3.7m -D_GLIBCXX_USE_CXX11_ABI=0 -fPIC -std=c++14 -DNVDR_TORCH -c /opt/conda/lib/python3.7/site-packages/nvdiffrast/torch/torch_bindings.cpp -o torch_bindings.o
ninja: build stopped: subcommand failed.
`
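
One detail worth checking in the build log above: the nvcc commands target `compute_61`, while the CUDA programming guide documents `__match_any_sync` for compute capability 7.0 and newer. The sketch below pulls the target architecture out of an nvcc command line and applies that threshold. The function names are hypothetical helpers for this note, and the sample command is a shortened stand-in for the ones in the log.

```python
import re


def gencode_archs(nvcc_command):
    """Return the compute capabilities (as ints, e.g. 61) named by -gencode flags."""
    return [int(m) for m in re.findall(r"-gencode[= ]arch=compute_(\d+)", nvcc_command)]


def supports_match_any_sync(arch):
    """__match_any_sync is documented for compute capability 7.0 and newer."""
    return arch >= 70


# Shortened version of the failing command from the log.
cmd = "nvcc --expt-relaxed-constexpr -gencode=arch=compute_61,code=sm_61 -c texture.cu"
for arch in gencode_archs(cmd):
    print(arch, supports_match_any_sync(arch))  # prints: 61 False
```

If the check reports `False`, the compiled sources still reference the intrinsic, which suggests the edited common.h is either not the copy being compiled or does not cover every call site; this is a guess from the log, not a confirmed diagnosis.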

Incorporating projective texture mapping?

Thank you for releasing this powerful tool!

In our research work, we are hoping to leverage nvdiffrast for joint texture and geometry optimization. I would love to hear your comments on whether incorporating projective texture mapping into nvdiffrast is feasible.

Given posed RGB images, depth maps, and a texture-less mesh, we hope to use projective texturing to initialize the mesh texture. The key idea depends on projection of posed camera images onto the mesh and then measuring the amount of visual discrepancy between the projected textures.

Thank you!
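
To make the projection step concrete, here is a minimal sketch of projecting a mesh vertex into a posed camera image with a pinhole model. Everything in it is an assumption for illustration: the function name, the intrinsics `fx, fy, cx, cy`, the world-to-camera pose `rotation`/`translation`, and the convention that camera-space +z points forward. It is not part of nvdiffrast.

```python
def project_point(point_world, rotation, translation, fx, fy, cx, cy):
    """Project a 3D world point to pixel coordinates with a pinhole camera.

    Returns (u, v, depth), or None if the point is behind the camera.
    Assumes camera-space +z points forward.
    """
    # World -> camera: p_cam = R @ p_world + t
    p_cam = [
        sum(rotation[r][c] * point_world[c] for c in range(3)) + translation[r]
        for r in range(3)
    ]
    x, y, z = p_cam
    if z <= 0.0:
        return None
    return (fx * x / z + cx, fy * y / z + cy, z)


identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
print(project_point([0.5, -0.25, 2.0], identity, [0.0, 0.0, 0.0], 100.0, 100.0, 64.0, 64.0))
# prints: (89.0, 51.5, 2.0)
```

Comparing the returned depth against the depth map at `(u, v)` gives a simple visibility test, and the pixel coordinates (normalized by the image size) could then serve as per-vertex texture coordinates into the camera image.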

Does texture(tex, uv) propagate any gradients to uv?

Hi,

I have a neural net that outputs a uv map, which will be used as input to the texture sampling op. I wonder if the texture op passes any gradients to the uv input during training? Thanks.
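
As background for the question: a linearly filtered lookup is a piecewise-smooth function of the coordinates, so gradients with respect to uv are well defined, whereas nearest-neighbor sampling is piecewise constant and gives zero gradient. The sketch below shows this for plain bilinear interpolation in pure Python. It illustrates the math only; it is not nvdiffrast's implementation, and whether a given nvdiffrast version and filter mode propagates these gradients should be checked against its documentation.

```python
def bilinear(tex, u, v):
    """Sample a 2D grid `tex[row][col]` at continuous texel coords (u, v).

    Returns (value, d_value/du, d_value/dv). Coordinates are assumed to lie
    strictly inside the grid, so no boundary handling is done.
    """
    x0, y0 = int(u), int(v)
    fx, fy = u - x0, v - y0
    t00, t10 = tex[y0][x0], tex[y0][x0 + 1]
    t01, t11 = tex[y0 + 1][x0], tex[y0 + 1][x0 + 1]
    top = t00 + fx * (t10 - t00)        # interpolate along u on row y0
    bottom = t01 + fx * (t11 - t01)     # interpolate along u on row y0 + 1
    value = top + fy * (bottom - top)   # interpolate along v
    d_du = (1.0 - fy) * (t10 - t00) + fy * (t11 - t01)
    d_dv = bottom - top
    return value, d_du, d_dv


tex = [[0.0, 1.0], [2.0, 5.0]]
print(bilinear(tex, 0.25, 0.5))  # prints: (1.5, 2.0, 2.5)
```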

wglGetProcAddress() failed for 'glTexImage3D'

Hi, I'm seeing this error using both tensorflow and pytorch versions of nvdiffrast:

(test) PS F:\projects\lib\nvdiffrast\samples\torch> python .\triangle.py
[F C:\Miniconda\envs\test\lib\site-packages\nvdiffrast\common\glutil.cpp:66] wglGetProcAddress() failed for 'glTexImage3D'

My environment is :

Windows Server 2019
VS 2017
CUDA 10.2
Nvidia Tesla V100
python 3.6, torch 1.6, as suggested

Since I'm using an Azure server through remote desktop, I'm wondering if this is because no physical monitor is connected to the GPU. If so, what can I do to make it run headlessly?

Required OpenGL 4.4+

nvdiffrast relies on OpenGL 4.4+ (e.g., via EGL), but many Linux distributions cannot install OpenGL 4.4+. Will the authors support OpenGL 3+, so that users are not forced to set up a heavyweight Docker configuration?

backface culling

First of all, thank you for this library, it's really fast, scales very well and brings differentiable rendering to a usable level for me.

Nvdiffrast seems to render double sided. Is there any way to disable this behaviour?
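
One possible workaround, independent of any built-in option, is to drop back-facing triangles from the index list before rasterizing. The sketch below does this in pure Python from clip-space positions, using the sign of the screen-space triangle area. It assumes counter-clockwise winding marks a front face (the OpenGL default) and that every vertex has w > 0; triangles that cross the camera plane would need separate handling. The function names are made up for this example.

```python
def signed_area_2d(a, b, c):
    """Twice the signed area of triangle abc; positive when counter-clockwise."""
    return (b[0] - a[0]) * (c[1] - a[1]) - (c[0] - a[0]) * (b[1] - a[1])


def cull_backfaces(pos_clip, triangles):
    """Keep only triangles that are counter-clockwise after the perspective divide.

    pos_clip: list of (x, y, z, w) clip-space vertices, all with w > 0.
    triangles: list of (i0, i1, i2) vertex index triples.
    """
    ndc = [(x / w, y / w) for x, y, _z, w in pos_clip]
    return [
        tri for tri in triangles
        if signed_area_2d(ndc[tri[0]], ndc[tri[1]], ndc[tri[2]]) > 0.0
    ]


verts = [(-1.0, -1.0, 0.0, 1.0), (1.0, -1.0, 0.0, 1.0), (0.0, 1.0, 0.0, 1.0)]
print(cull_backfaces(verts, [(0, 1, 2), (0, 2, 1)]))  # prints: [(0, 1, 2)]
```

The same test can be expressed with tensor operations to build a boolean mask over the triangle tensor; since the filtering only selects indices, it does not interfere with gradients flowing to the vertex positions of the triangles that remain.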

Runtime Error: glLinkProgram() failed

Hi, I tried to use nvdiffrast on Windows 10, following the documentation.
When I executed the following command, a runtime error occurred:

D:\development\anaconda3\envs\dmodel\lib\site-packages\torch\utils\cpp_extension.py:304: UserWarning: Error checking compiler version for cl: 'utf-8' codec can't decode byte 0xd3 in position 0: invalid continuation byte
  warnings.warn(f'Error checking compiler version for {compiler}: {error}')
Traceback (most recent call last):
  File ".\samples\torch\cube.py", line 200, in <module>
    main()
  File ".\samples\torch\cube.py", line 191, in main
    mp4save_fn='progress.mp4'
  File ".\samples\torch\cube.py", line 76, in fit_cube
    glctx = dr.RasterizeGLContext()
  File "D:\development\anaconda3\envs\dmodel\lib\site-packages\nvdiffrast\torch\ops.py", line 151, in __init__
    self.cpp_wrapper = _get_plugin().RasterizeGLStateWrapper(output_db, mode == 'automatic', cuda_device_idx)
RuntimeError: glLinkProgram() failed:
Fragment info
-------------
0(2) : error C7528: OpenGL reserves names starting with 'gl_'
(0) : error C2003: incompatible options for link

My PyOpenGL version is 3.1.5 and my glfw version is 2.3.0.

Centos support

Hi all, really nice work, and it is very convenient on Ubuntu.
However, our online servers always run CentOS. Will you add support for CentOS?
Thanks.

Error: could not select device driver "" with capabilities: [[gpu]]

Hi, I am trying to set up nvdiffrast using Docker. The installation finished without any errors. When I run ./run_sample.sh ./samples/torch/cube.py --resolution 32, I get the error docker: Error response from daemon: could not select device driver "" with capabilities: [[gpu]]. Any help on this? Is CUDA 11 not supported yet?

P.S. I am using Ubuntu 20 + cuda11 + pytorch 1.8.1

How to sample textureAtlas for texturing function and join meshes into single scene?

Hi,

Looking through the documentation, it is confusing how to use texture atlases where each face has a small patch that needs to be interpolated via barycentric coordinates. The documentation seems to suggest concatenating multiple texture maps horizontally, i.e. (32x32) * R (R==5) -> (160x32). However, if we do this, bilinear sampling would give wrong results, since we actually want to interpolate for each barycentric coordinate within the R samples.

I'm used to the SoftRas-style implementation, which specifies textures as (N,F,R,R,D) (N -> batch, F -> number of faces, R -> size of the patch for each face, D -> RGB). In PyTorch3D, they do bilinear interpolation (currently nearest sampling) for each face within the RxR patch using the barycentric coordinates of the pixel. Is there a way to achieve automatic bilinear sampling of texture atlases in nvdiffrast?

Thanks so much!

corresponding to the envphong example

This demo shows how to optimize a cube map toward a target cube map, using a shading model of mirror reflection plus a Phong BRDF.
···
    def render_refl(ldir, cpos, mvp):
        # Transform and rasterize.
        viewvec = pos[..., :3] - cpos[np.newaxis, np.newaxis, :] # View vectors at vertices.
        reflvec = viewvec - 2.0 * normals[np.newaxis, ...] * torch.sum(normals[np.newaxis, ...] * viewvec, -1, keepdim=True) # Reflection vectors at vertices.
        reflvec = reflvec / torch.sum(reflvec**2, -1, keepdim=True)**0.5 # Normalize.

        pos_clip = torch.matmul(pos, mvp.t())[np.newaxis, ...]
        rast_out, rast_out_db = dr.rasterize(glctx, pos_clip, pos_idx, [res, res])
        refl, refld = dr.interpolate(reflvec, rast_out, pos_idx, rast_db=rast_out_db, diff_attrs='all') # Interpolated reflection vectors.

        # Phong light.
        refl = refl / (torch.sum(refl**2, -1, keepdim=True) + 1e-8)**0.5  # Normalize.
        ldotr = torch.sum(-ldir * refl, -1, keepdim=True) # L dot R.

        # Return
        return refl, refld, ldotr, (rast_out[..., -1:] == 0)

    # Render the reflections.
    refl, refld, ldotr, mask = render_refl(lightdir, r_campos, r_mvp)

    # Reference color. No need for AA because we are not learning geometry.
    color = dr.texture(env[np.newaxis, ...], refl, uv_da=refld, filter_mode='linear-mipmap-linear', boundary_mode='cube')
    color = color + phong_rgb_t * torch.max(zero_tensor, ldotr) ** phong_exp # Phong.

···

How can I implement a real reflection model, and how can I replace the cube map with a 2D texture image?
Can you give some advice about this usage? Thanks!
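
On the second question, one common option is to store the environment as a latitude-longitude (equirectangular) image and convert each reflection vector into 2D texture coordinates, which can then go through an ordinary 2D texture lookup. The sketch below shows the reflection formula used in the sample, r = v - 2 n (n . v), together with such a mapping in pure Python. The axis convention (+y up, u wrapping around the y axis, v = 0 at the top) is an assumption and has to match how the image was authored; the function names are made up for this example.

```python
import math


def reflect(view, normal):
    """Mirror `view` about unit `normal`: r = v - 2 n (n . v)."""
    d = sum(n * v for n, v in zip(normal, view))
    return [v - 2.0 * d * n for v, n in zip(view, normal)]


def direction_to_latlong_uv(direction):
    """Map a direction to (u, v) in [0, 1]^2 of an equirectangular image.

    Assumed convention: +y is up, u wraps around the y axis, v = 0 at the top.
    """
    x, y, z = direction
    length = math.sqrt(x * x + y * y + z * z)
    u = 0.5 + math.atan2(x, -z) / (2.0 * math.pi)
    v = math.acos(max(-1.0, min(1.0, y / length))) / math.pi
    return u, v


# A ray heading down and forward (-z) bounces off a floor with normal +y.
r = reflect([0.0, -1.0, -1.0], [0.0, 1.0, 0.0])
print(r, direction_to_latlong_uv(r))
```

Note that this mapping has a seam at u = 0/1 and singularities at the poles, so wrap-around boundary handling and mipmapped filtering need some care near those regions.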

Read-only file system

I installed nvdiffrast using a docker image from dockerhub (https://hub.docker.com/r/rgabdullin/nvdiffrast_docker) and am trying to run the included sample (cube.py). When running the sample I get the error OSError: [Errno 30] Read-only file system despite specifying an output directory with the --outdir flag. Below is the error stack:

  File "cube.py", line 200, in <module>
    main()
  File "cube.py", line 182, in main
    fit_cube(
  File "cube.py", line 76, in fit_cube
    glctx = dr.RasterizeGLContext()
  File "/opt/conda/lib/python3.8/site-packages/nvdiffrast/torch/ops.py", line 151, in __init__
    self.cpp_wrapper = _get_plugin().RasterizeGLStateWrapper(output_db, mode == 'automatic', cuda_device_idx)
  File "/opt/conda/lib/python3.8/site-packages/nvdiffrast/torch/ops.py", line 84, in _get_plugin
    torch.utils.cpp_extension.load(name=plugin_name, sources=source_paths, extra_cflags=opts, extra_cuda_cflags=opts, extra_ldflags=ldflags, with_cuda=True, verbose=False)
  File "/opt/conda/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 993, in load
    build_directory or _get_build_directory(name, verbose),
  File "/opt/conda/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 1478, in _get_build_directory
    os.makedirs(build_directory, exist_ok=True)
  File "/opt/conda/lib/python3.8/os.py", line 213, in makedirs
    makedirs(head, exist_ok=exist_ok)
  File "/opt/conda/lib/python3.8/os.py", line 213, in makedirs
    makedirs(head, exist_ok=exist_ok)
  File "/opt/conda/lib/python3.8/os.py", line 213, in makedirs
    makedirs(head, exist_ok=exist_ok)
  [Previous line repeated 2 more times]
  File "/opt/conda/lib/python3.8/os.py", line 223, in makedirs
    mkdir(name, mode)
OSError: [Errno 30] Read-only file system: '/om2'

Any suggestions for what might have gone wrong?
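
The stack trace shows PyTorch's extension loader failing while it creates its build directory. One thing to check, offered as a hedged sketch rather than a confirmed fix for this image: PyTorch consults the `TORCH_EXTENSIONS_DIR` environment variable when choosing that directory, so pointing it at a writable location before the first `dr.RasterizeGLContext()` call may avoid the read-only path. The directory prefix below is an arbitrary example.

```python
import os
import tempfile

# Pick a location that is writable inside the container (here: a fresh temp dir).
build_root = tempfile.mkdtemp(prefix="torch_ext_")

# PyTorch's JIT extension loader reads this variable to decide where to build.
# It has to be set before the plugin is compiled for the first time.
os.environ["TORCH_EXTENSIONS_DIR"] = build_root

# Sanity check: the directory exists and the current user can write to it.
build_dir_ok = os.path.isdir(build_root) and os.access(build_root, os.W_OK)
```

The same variable can also be exported in the shell or passed to the container at startup; whether that resolves this particular setup depends on how the image mounts its home directory.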

NDC Convention

In https://nvlabs.github.io/nvdiffrast/#coordinate-systems, you say that you follow OpenGL conventions.

In OpenGL, NDC-z points into the screen, but you say "z increases towards the viewer".
In OpenGL, the [1,1]-element of the projection matrix (y-scale) is positive (n/x). But it is negative (n/-x) in https://github.com/NVlabs/nvdiffrast/blob/v0.2.6/samples/torch/util.py#L16, which flips the y-axis from view space to clip space.

I am confused.
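
To make the y-axis point concrete, here is a small plain-Python sketch. It assumes a standard OpenGL-style symmetric perspective matrix with aspect ratio 1; the `flip_y` parameter is hypothetical and only switches the [1,1] element between n/x and n/-x, the two variants contrasted above.

```python
def projection(x=0.1, n=1.0, f=50.0, flip_y=False):
    """OpenGL-style perspective matrix; flip_y negates the y-scale entry."""
    y_scale = n / -x if flip_y else n / x
    return [
        [n / x, 0.0, 0.0, 0.0],
        [0.0, y_scale, 0.0, 0.0],
        [0.0, 0.0, -(f + n) / (f - n), -(2.0 * f * n) / (f - n)],
        [0.0, 0.0, -1.0, 0.0],
    ]

def transform(mtx, vec):
    """Matrix-vector product for a 4x4 matrix stored as nested lists."""
    return [sum(m * v for m, v in zip(row, vec)) for row in mtx]

def to_ndc(clip):
    """Perspective divide: clip space to normalized device coordinates."""
    return [c / clip[3] for c in clip[:3]]

# A view-space point above the optical axis and in front of the camera
# (OpenGL view space looks down -z).
point = [0.0, 0.05, -2.0, 1.0]

ndc_plain = to_ndc(transform(projection(flip_y=False), point))
ndc_flipped = to_ndc(transform(projection(flip_y=True), point))
```

With the positive y-scale the point lands at positive NDC y; with the negated entry it lands at the mirrored negative y, while NDC x and z are unchanged.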

Non deterministic results of optimization

Hi! First of all - thank you for such a great library, I really enjoyed the performance and API design.

However, I ran into non-deterministic behavior in nvdiffrast: optimization results differ a lot between runs with the same inputs. For my experiments I use LBFGS, but I also observed this problem with Adam in the standard cube.py example.

I added the following lines at the beginning of the main function to make the random functions deterministic between runs.

torch.backends.cudnn.benchmark = False
torch.backends.cudnn.deterministic = True
random.seed(1)
np.random.seed(1)
torch.manual_seed(1)
torch.cuda.manual_seed(1)

Even so, the loss values differ between runs; sometimes the final loss on the last iteration differs by up to a factor of 2.

So I have the following questions:

Is such a big difference between runs expected? I understand that nvdiffrast uses atomic operations, which may lead to non-deterministic results, but I'm surprised that the difference accumulates so fast.
Is it possible to reduce the accumulation of differences?
Do you have any plans to add a deterministic version of your routines? As far as I know, such non-determinism is ok for DL related stuff. But it makes debugging of other tasks like mesh fitting much harder.

Please let me know if I can provide any additional information. And thank you in advance!
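
As background on why seeding alone may not be enough, here is a minimal sketch of the effect mentioned above, assuming that values are accumulated with floating-point additions whose order varies from run to run (as atomic additions on a GPU can). It uses plain Python floats and made-up data, not nvdiffrast itself.

```python
import random

def sum_in_order(values, order):
    """Accumulate values in the given index order, as one thread ordering would."""
    total = 0.0
    for i in order:
        total += values[i]
    return total

# Tiny case: the same three numbers, two accumulation orders.
a = (0.1 + 0.2) + 0.3
b = 0.1 + (0.2 + 0.3)
orders_differ = a != b  # floating-point addition is not associative

# Larger case: many shuffled accumulation orders over identical inputs.
rng = random.Random(1)
values = [rng.uniform(-1.0, 1.0) for _ in range(10000)]
results = set()
for _ in range(20):
    order = list(range(len(values)))
    rng.shuffle(order)
    results.add(sum_in_order(values, order))
spread = max(results) - min(results)  # size of the order-dependent rounding noise
```

Each individual discrepancy is tiny, but an iterative optimizer feeds every result back into the next step, so the trajectories of two runs can drift apart over many iterations.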

Got cudaErrorInvalidDevice error when not using gpu=0

Hi,

I was able to run the cube.py example when using gpu=0. But when I switched to other GPUs by setting CUDA_VISIBLE_DEVICES in the Docker container, I got the error below. I'm pretty sure all GPUs are exposed to the container because (1) nvidia-smi in the container reports all GPUs correctly and (2) a simple TensorFlow test example also worked with your Docker image. The error seems to happen only with the rasterizer op, so I wonder whether the rasterizer op has a bug that restricts it to GPU 0 on a machine.

Mesh has 12 triangles and 8 vertices.
Setting up TensorFlow plugin "tf_all.cu": Preprocessing... Compiling... Loading... Done.
Traceback (most recent call last):
File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/client/session.py", line 1365, in _do_call
return fn(*args)
File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/client/session.py", line 1350, in _run_fn
target_list, run_metadata)
File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/client/session.py", line 1443, in _call_tf_sessionrun
run_metadata)
tensorflow.python.framework.errors_impl.InternalError: 2 root error(s) found.
(0) Internal: Cuda error: cudaErrorInvalidDevice[cudaGraphicsGLRegisterBuffer(&s.cudaPosBuffer, s.glPosBuffer, cudaGraphicsRegisterFlagsWriteDiscard);]
[[{{node RasterizeFwd_1}}]]
[[Mean_1/_37]]
(1) Internal: Cuda error: cudaErrorInvalidDevice[cudaGraphicsGLRegisterBuffer(&s.cudaPosBuffer, s.glPosBuffer, cudaGraphicsRegisterFlagsWriteDiscard);]
[[{{node RasterizeFwd_1}}]]
0 successful operations.
0 derived errors ignored.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "samples/tensorflow/cube.py", line 201, in
main()
File "samples/tensorflow/cube.py", line 192, in main
fit_cube(max_iter=5000, resolution=resolution, discontinuous=discontinuous, log_interval=10, display_interval=display_interval, out_dir=out_dir, log_fn='log.txt', imgsave_interval=1000, imgsave_fn='img_%06d.png')
File "samples/tensorflow/cube.py", line 124, in fit_cube
gl_val, _ = util.run([geom_loss, train_op], {mtx_in: r_mvp, lr_in: lr})
File "/app/samples/tensorflow/util.py", line 257, in run
return tf.get_default_session().run(*args, **kwargs)
File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/client/session.py", line 956, in run
run_metadata_ptr)
File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/client/session.py", line 1180, in _run
feed_dict_tensor, options, run_metadata)
File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/client/session.py", line 1359, in _do_run
run_metadata)
File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/client/session.py", line 1384, in _do_call
raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.InternalError: 2 root error(s) found.
(0) Internal: Cuda error: cudaErrorInvalidDevice[cudaGraphicsGLRegisterBuffer(&s.cudaPosBuffer, s.glPosBuffer, cudaGraphicsRegisterFlagsWriteDiscard);]
[[node RasterizeFwd_1 (defined at /usr/local/lib/python3.6/dist-packages/tensorflow_core/python/framework/ops.py:1748) ]]
[[Mean_1/_37]]
(1) Internal: Cuda error: cudaErrorInvalidDevice[cudaGraphicsGLRegisterBuffer(&s.cudaPosBuffer, s.glPosBuffer, cudaGraphicsRegisterFlagsWriteDiscard);]
[[node RasterizeFwd_1 (defined at /usr/local/lib/python3.6/dist-packages/tensorflow_core/python/framework/ops.py:1748) ]]
0 successful operations.
0 derived errors ignored.

Original stack trace for 'RasterizeFwd_1':
File "samples/tensorflow/cube.py", line 201, in
main()
File "samples/tensorflow/cube.py", line 192, in main
fit_cube(max_iter=5000, resolution=resolution, discontinuous=discontinuous, log_interval=10, display_interval=display_interval, out_dir=out_dir, log_fn='log.txt', imgsave_interval=1000, imgsave_fn='img_%06d.png')
File "samples/tensorflow/cube.py", line 69, in fit_cube
rast_out_opt, _ = dr.rasterize(pos_clip_opt, pos_idx, resolution=[resolution, resolution], output_db=False)
File "/app/samples/tensorflow/../../nvdiffrast/tensorflow/ops.py", line 108, in rasterize
return func(pos)
File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/ops/custom_gradient.py", line 168, in decorated
return _graph_mode_decorator(f, *args, **kwargs)
File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/ops/custom_gradient.py", line 230, in _graph_mode_decorator
result, grad_fn = f(*args)
File "/app/samples/tensorflow/../../nvdiffrast/tensorflow/ops.py", line 97, in func
out, out_db = _get_plugin().rasterize_fwd(pos, tri, resolution, ranges, 0, tri_const)
File "", line 92, in rasterize_fwd
File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/framework/op_def_library.py", line 794, in _apply_op_helper
op_def=op_def)
File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/util/deprecation.py", line 507, in new_func
return func(*args, **kwargs)
File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/framework/ops.py", line 3357, in create_op
attrs, op_def, compute_device)
File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/framework/ops.py", line 3426, in _create_op_internal
op_def=op_def)
File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/framework/ops.py", line 1748, in init
self._traceback = tf_stack.extract_stack()

Possible memory leak when using nn.DataParallel

Hi, when I use your code for multi-GPU training with the provided rasterization, GPU memory usage keeps increasing.

I first create a list of RasterizeGLContext instances, one for each GPU, in the init function of a PyTorch nn.Module class.
During the forward pass, I choose the RasterizeGLContext instance according to the current device ID. GPU memory usage keeps increasing when I use two or more GPUs.

I don't know whether I am using the code incorrectly or whether there is a bug in your implementation. If possible, could you provide some sample code for multi-GPU training? Thanks!

Unsupported gpu architecture 'compute_86'

I am trying to run the Docker container on a GeForce RTX 3090.
The command bash ./run_sample.sh ./samples/torch/cube.py --resolution 32 gives me the following error:

Using container image: gltorch:latest
Running command: ./samples/torch/cube.py --resolution 32
No output directory specified, not saving log or images
Mesh has 12 triangles and 8 vertices.
Traceback (most recent call last):
File "/opt/conda/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 1533, in _run_ninja_build
subprocess.run(
File "/opt/conda/lib/python3.8/subprocess.py", line 512, in run
raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['ninja', '-v']' returned non-zero exit status 1.

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
File "./samples/torch/cube.py", line 200, in <module>
main()
File "./samples/torch/cube.py", line 182, in main
fit_cube(
File "./samples/torch/cube.py", line 76, in fit_cube
glctx = dr.RasterizeGLContext()
File "/opt/conda/lib/python3.8/site-packages/nvdiffrast/torch/ops.py", line 151, in __init__
self.cpp_wrapper = _get_plugin().RasterizeGLStateWrapper(output_db, mode == 'automatic', cuda_device_idx)
File "/opt/conda/lib/python3.8/site-packages/nvdiffrast/torch/ops.py", line 84, in _get_plugin
torch.utils.cpp_extension.load(name=plugin_name, sources=source_paths, extra_cflags=opts, extra_cuda_cflags=opts, extra_ldflags=ldflags, with_cuda=True, verbose=False)
File "/opt/conda/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 986, in load
return _jit_compile(
File "/opt/conda/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 1193, in _jit_compile
_write_ninja_file_and_build_library(
File "/opt/conda/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 1297, in _write_ninja_file_and_build_library
_run_ninja_build(
File "/opt/conda/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 1555, in _run_ninja_build
raise RuntimeError(message) from e
RuntimeError: Error building extension 'nvdiffrast_plugin': [1/14] /usr/local/cuda/bin/nvcc -DTORCH_EXTENSION_NAME=nvdiffrast_plugin -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE="gcc" -DPYBIND11_STDLIB="libstdcpp" -DPYBIND11_BUILD_ABI="cxxabi1011" -isystem /opt/conda/lib/python3.8/site-packages/torch/include -isystem /opt/conda/lib/python3.8/site-packages/torch/include/torch/csrc/api/include -isystem /opt/conda/lib/python3.8/site-packages/torch/include/TH -isystem /opt/conda/lib/python3.8/site-packages/torch/include/THC -isystem /usr/local/cuda/include -isystem /opt/conda/include/python3.8 -D_GLIBCXX_USE_CXX11_ABI=0 -D__CUDA_NO_HALF_OPERATORS -D__CUDA_NO_HALF_CONVERSIONS -D__CUDA_NO_HALF2_OPERATORS --expt-relaxed-constexpr -gencode=arch=compute_86,code=sm_86 --compiler-options '-fPIC' -DNVDR_TORCH -std=c++14 -c /opt/conda/lib/python3.8/site-packages/nvdiffrast/common/rasterize.cu -o rasterize.cuda.o
FAILED: rasterize.cuda.o
/usr/local/cuda/bin/nvcc -DTORCH_EXTENSION_NAME=nvdiffrast_plugin -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE="gcc" -DPYBIND11_STDLIB="libstdcpp" -DPYBIND11_BUILD_ABI="cxxabi1011" -isystem /opt/conda/lib/python3.8/site-packages/torch/include -isystem /opt/conda/lib/python3.8/site-packages/torch/include/torch/csrc/api/include -isystem /opt/conda/lib/python3.8/site-packages/torch/include/TH -isystem /opt/conda/lib/python3.8/site-packages/torch/include/THC -isystem /usr/local/cuda/include -isystem /opt/conda/include/python3.8 -D_GLIBCXX_USE_CXX11_ABI=0 -D__CUDA_NO_HALF_OPERATORS -D__CUDA_NO_HALF_CONVERSIONS -D__CUDA_NO_HALF2_OPERATORS --expt-relaxed-constexpr -gencode=arch=compute_86,code=sm_86 --compiler-options '-fPIC' -DNVDR_TORCH -std=c++14 -c /opt/conda/lib/python3.8/site-packages/nvdiffrast/common/rasterize.cu -o rasterize.cuda.o
nvcc fatal : Unsupported gpu architecture 'compute_86'
[2/14] /usr/local/cuda/bin/nvcc -DTORCH_EXTENSION_NAME=nvdiffrast_plugin -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE="gcc" -DPYBIND11_STDLIB="libstdcpp" -DPYBIND11_BUILD_ABI="cxxabi1011" -isystem /opt/conda/lib/python3.8/site-packages/torch/include -isystem /opt/conda/lib/python3.8/site-packages/torch/include/torch/csrc/api/include -isystem /opt/conda/lib/python3.8/site-packages/torch/include/TH -isystem /opt/conda/lib/python3.8/site-packages/torch/include/THC -isystem /usr/local/cuda/include -isystem /opt/conda/include/python3.8 -D_GLIBCXX_USE_CXX11_ABI=0 -D__CUDA_NO_HALF_OPERATORS -D__CUDA_NO_HALF_CONVERSIONS -D__CUDA_NO_HALF2_OPERATORS_ --expt-relaxed-constexpr -gencode=arch=compute_86,code=sm_86 --compiler-options '-fPIC' -DNVDR_TORCH -std=c++14 -c /opt/conda/lib/python3.8/site-packages/nvdiffrast/common/interpolate.cu -o interpolate.cuda.o
FAILED: interpolate.cuda.o
/usr/local/cuda/bin/nvcc -DTORCH_EXTENSION_NAME=nvdiffrast_plugin -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE="gcc" -DPYBIND11_STDLIB="libstdcpp" -DPYBIND11_BUILD_ABI="cxxabi1011" -isystem /opt/conda/lib/python3.8/site-packages/torch/include -isystem /opt/conda/lib/python3.8/site-packages/torch/include/torch/csrc/api/include -isystem /opt/conda/lib/python3.8/site-packages/torch/include/TH -isystem /opt/conda/lib/python3.8/site-packages/torch/include/THC -isystem /usr/local/cuda/include -isystem /opt/conda/include/python3.8 -D_GLIBCXX_USE_CXX11_ABI=0 -D__CUDA_NO_HALF_OPERATORS -D__CUDA_NO_HALF_CONVERSIONS_ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -gencode=arch=compute_86,code=sm_86 --compiler-options '-fPIC' -DNVDR_TORCH -std=c++14 -c /opt/conda/lib/python3.8/site-packages/nvdiffrast/common/interpolate.cu -o interpolate.cuda.o
nvcc fatal : Unsupported gpu architecture 'compute_86'
[3/14] /usr/local/cuda/bin/nvcc -DTORCH_EXTENSION_NAME=nvdiffrast_plugin -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE="gcc" -DPYBIND11_STDLIB="libstdcpp" -DPYBIND11_BUILD_ABI="cxxabi1011" -isystem /opt/conda/lib/python3.8/site-packages/torch/include -isystem /opt/conda/lib/python3.8/site-packages/torch/include/torch/csrc/api/include -isystem /opt/conda/lib/python3.8/site-packages/torch/include/TH -isystem /opt/conda/lib/python3.8/site-packages/torch/include/THC -isystem /usr/local/cuda/include -isystem /opt/conda/include/python3.8 -D_GLIBCXX_USE_CXX11_ABI=0 -D__CUDA_NO_HALF_OPERATORS -D__CUDA_NO_HALF_CONVERSIONS_ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -gencode=arch=compute_86,code=sm_86 --compiler-options '-fPIC' -DNVDR_TORCH -std=c++14 -c /opt/conda/lib/python3.8/site-packages/nvdiffrast/common/texture.cu -o texture.cuda.o
FAILED: texture.cuda.o
/usr/local/cuda/bin/nvcc -DTORCH_EXTENSION_NAME=nvdiffrast_plugin -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE="gcc" -DPYBIND11_STDLIB="libstdcpp" -DPYBIND11_BUILD_ABI="cxxabi1011" -isystem /opt/conda/lib/python3.8/site-packages/torch/include -isystem /opt/conda/lib/python3.8/site-packages/torch/include/torch/csrc/api/include -isystem /opt/conda/lib/python3.8/site-packages/torch/include/TH -isystem /opt/conda/lib/python3.8/site-packages/torch/include/THC -isystem /usr/local/cuda/include -isystem /opt/conda/include/python3.8 -D_GLIBCXX_USE_CXX11_ABI=0 -D__CUDA_NO_HALF_OPERATORS -D__CUDA_NO_HALF_CONVERSIONS_ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -gencode=arch=compute_86,code=sm_86 --compiler-options '-fPIC' -DNVDR_TORCH -std=c++14 -c /opt/conda/lib/python3.8/site-packages/nvdiffrast/common/texture.cu -o texture.cuda.o
nvcc fatal : Unsupported gpu architecture 'compute_86'
[4/14] /usr/local/cuda/bin/nvcc -DTORCH_EXTENSION_NAME=nvdiffrast_plugin -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE="gcc" -DPYBIND11_STDLIB="libstdcpp" -DPYBIND11_BUILD_ABI="cxxabi1011" -isystem /opt/conda/lib/python3.8/site-packages/torch/include -isystem /opt/conda/lib/python3.8/site-packages/torch/include/torch/csrc/api/include -isystem /opt/conda/lib/python3.8/site-packages/torch/include/TH -isystem /opt/conda/lib/python3.8/site-packages/torch/include/THC -isystem /usr/local/cuda/include -isystem /opt/conda/include/python3.8 -D_GLIBCXX_USE_CXX11_ABI=0 -D__CUDA_NO_HALF_OPERATORS -D__CUDA_NO_HALF_CONVERSIONS_ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -gencode=arch=compute_86,code=sm_86 --compiler-options '-fPIC' -DNVDR_TORCH -std=c++14 -c /opt/conda/lib/python3.8/site-packages/nvdiffrast/common/antialias.cu -o antialias.cuda.o
FAILED: antialias.cuda.o
/usr/local/cuda/bin/nvcc -DTORCH_EXTENSION_NAME=nvdiffrast_plugin -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE="gcc" -DPYBIND11_STDLIB="libstdcpp" -DPYBIND11_BUILD_ABI="cxxabi1011" -isystem /opt/conda/lib/python3.8/site-packages/torch/include -isystem /opt/conda/lib/python3.8/site-packages/torch/include/torch/csrc/api/include -isystem /opt/conda/lib/python3.8/site-packages/torch/include/TH -isystem /opt/conda/lib/python3.8/site-packages/torch/include/THC -isystem /usr/local/cuda/include -isystem /opt/conda/include/python3.8 -D_GLIBCXX_USE_CXX11_ABI=0 -D__CUDA_NO_HALF_OPERATORS -D__CUDA_NO_HALF_CONVERSIONS_ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -gencode=arch=compute_86,code=sm_86 --compiler-options '-fPIC' -DNVDR_TORCH -std=c++14 -c /opt/conda/lib/python3.8/site-packages/nvdiffrast/common/antialias.cu -o antialias.cuda.o
nvcc fatal : Unsupported gpu architecture 'compute_86'
[5/14] c++ -MMD -MF common.o.d -DTORCH_EXTENSION_NAME=nvdiffrast_plugin -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE="_gcc" -DPYBIND11_STDLIB="_libstdcpp" -DPYBIND11_BUILD_ABI="_cxxabi1011" -isystem /opt/conda/lib/python3.8/site-packages/torch/include -isystem /opt/conda/lib/python3.8/site-packages/torch/include/torch/csrc/api/include -isystem /opt/conda/lib/python3.8/site-packages/torch/include/TH -isystem /opt/conda/lib/python3.8/site-packages/torch/include/THC -isystem /usr/local/cuda/include -isystem /opt/conda/include/python3.8 -D_GLIBCXX_USE_CXX11_ABI=0 -fPIC -std=c++14 -DNVDR_TORCH -c /opt/conda/lib/python3.8/site-packages/nvdiffrast/common/common.cpp -o common.o
[6/14] c++ -MMD -MF texture.o.d -DTORCH_EXTENSION_NAME=nvdiffrast_plugin -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE="_gcc" -DPYBIND11_STDLIB="_libstdcpp" -DPYBIND11_BUILD_ABI="_cxxabi1011" -isystem /opt/conda/lib/python3.8/site-packages/torch/include -isystem /opt/conda/lib/python3.8/site-packages/torch/include/torch/csrc/api/include -isystem /opt/conda/lib/python3.8/site-packages/torch/include/TH -isystem /opt/conda/lib/python3.8/site-packages/torch/include/THC -isystem /usr/local/cuda/include -isystem /opt/conda/include/python3.8 -D_GLIBCXX_USE_CXX11_ABI=0 -fPIC -std=c++14 -DNVDR_TORCH -c /opt/conda/lib/python3.8/site-packages/nvdiffrast/common/texture.cpp -o texture.o
[7/14] c++ -MMD -MF glutil.o.d -DTORCH_EXTENSION_NAME=nvdiffrast_plugin -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE="_gcc" -DPYBIND11_STDLIB="_libstdcpp" -DPYBIND11_BUILD_ABI="_cxxabi1011" -isystem /opt/conda/lib/python3.8/site-packages/torch/include -isystem /opt/conda/lib/python3.8/site-packages/torch/include/torch/csrc/api/include -isystem /opt/conda/lib/python3.8/site-packages/torch/include/TH -isystem /opt/conda/lib/python3.8/site-packages/torch/include/THC -isystem /usr/local/cuda/include -isystem /opt/conda/include/python3.8 -D_GLIBCXX_USE_CXX11_ABI=0 -fPIC -std=c++14 -DNVDR_TORCH -c /opt/conda/lib/python3.8/site-packages/nvdiffrast/common/glutil.cpp -o glutil.o
[8/14] c++ -MMD -MF torch_rasterize.o.d -DTORCH_EXTENSION_NAME=nvdiffrast_plugin -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE="_gcc" -DPYBIND11_STDLIB="_libstdcpp" -DPYBIND11_BUILD_ABI="_cxxabi1011" -isystem /opt/conda/lib/python3.8/site-packages/torch/include -isystem /opt/conda/lib/python3.8/site-packages/torch/include/torch/csrc/api/include -isystem /opt/conda/lib/python3.8/site-packages/torch/include/TH -isystem /opt/conda/lib/python3.8/site-packages/torch/include/THC -isystem /usr/local/cuda/include -isystem /opt/conda/include/python3.8 -D_GLIBCXX_USE_CXX11_ABI=0 -fPIC -std=c++14 -DNVDR_TORCH -c /opt/conda/lib/python3.8/site-packages/nvdiffrast/torch/torch_rasterize.cpp -o torch_rasterize.o
[9/14] c++ -MMD -MF torch_texture.o.d -DTORCH_EXTENSION_NAME=nvdiffrast_plugin -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE="_gcc" -DPYBIND11_STDLIB="_libstdcpp" -DPYBIND11_BUILD_ABI="_cxxabi1011" -isystem /opt/conda/lib/python3.8/site-packages/torch/include -isystem /opt/conda/lib/python3.8/site-packages/torch/include/torch/csrc/api/include -isystem /opt/conda/lib/python3.8/site-packages/torch/include/TH -isystem /opt/conda/lib/python3.8/site-packages/torch/include/THC -isystem /usr/local/cuda/include -isystem /opt/conda/include/python3.8 -D_GLIBCXX_USE_CXX11_ABI=0 -fPIC -std=c++14 -DNVDR_TORCH -c /opt/conda/lib/python3.8/site-packages/nvdiffrast/torch/torch_texture.cpp -o torch_texture.o
[10/14] c++ -MMD -MF torch_interpolate.o.d -DTORCH_EXTENSION_NAME=nvdiffrast_plugin -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE="_gcc" -DPYBIND11_STDLIB="_libstdcpp" -DPYBIND11_BUILD_ABI="_cxxabi1011" -isystem /opt/conda/lib/python3.8/site-packages/torch/include -isystem /opt/conda/lib/python3.8/site-packages/torch/include/torch/csrc/api/include -isystem /opt/conda/lib/python3.8/site-packages/torch/include/TH -isystem /opt/conda/lib/python3.8/site-packages/torch/include/THC -isystem /usr/local/cuda/include -isystem /opt/conda/include/python3.8 -D_GLIBCXX_USE_CXX11_ABI=0 -fPIC -std=c++14 -DNVDR_TORCH -c /opt/conda/lib/python3.8/site-packages/nvdiffrast/torch/torch_interpolate.cpp -o torch_interpolate.o
[11/14] c++ -MMD -MF rasterize.o.d -DTORCH_EXTENSION_NAME=nvdiffrast_plugin -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE="_gcc" -DPYBIND11_STDLIB="_libstdcpp" -DPYBIND11_BUILD_ABI="_cxxabi1011" -isystem /opt/conda/lib/python3.8/site-packages/torch/include -isystem /opt/conda/lib/python3.8/site-packages/torch/include/torch/csrc/api/include -isystem /opt/conda/lib/python3.8/site-packages/torch/include/TH -isystem /opt/conda/lib/python3.8/site-packages/torch/include/THC -isystem /usr/local/cuda/include -isystem /opt/conda/include/python3.8 -D_GLIBCXX_USE_CXX11_ABI=0 -fPIC -std=c++14 -DNVDR_TORCH -c /opt/conda/lib/python3.8/site-packages/nvdiffrast/common/rasterize.cpp -o rasterize.o
[12/14] c++ -MMD -MF torch_antialias.o.d -DTORCH_EXTENSION_NAME=nvdiffrast_plugin -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE="_gcc" -DPYBIND11_STDLIB="_libstdcpp" -DPYBIND11_BUILD_ABI="_cxxabi1011" -isystem /opt/conda/lib/python3.8/site-packages/torch/include -isystem /opt/conda/lib/python3.8/site-packages/torch/include/torch/csrc/api/include -isystem /opt/conda/lib/python3.8/site-packages/torch/include/TH -isystem /opt/conda/lib/python3.8/site-packages/torch/include/THC -isystem /usr/local/cuda/include -isystem /opt/conda/include/python3.8 -D_GLIBCXX_USE_CXX11_ABI=0 -fPIC -std=c++14 -DNVDR_TORCH -c /opt/conda/lib/python3.8/site-packages/nvdiffrast/torch/torch_antialias.cpp -o torch_antialias.o
[13/14] c++ -MMD -MF torch_bindings.o.d -DTORCH_EXTENSION_NAME=nvdiffrast_plugin -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE="_gcc" -DPYBIND11_STDLIB="_libstdcpp" -DPYBIND11_BUILD_ABI="_cxxabi1011" -isystem /opt/conda/lib/python3.8/site-packages/torch/include -isystem /opt/conda/lib/python3.8/site-packages/torch/include/torch/csrc/api/include -isystem /opt/conda/lib/python3.8/site-packages/torch/include/TH -isystem /opt/conda/lib/python3.8/site-packages/torch/include/THC -isystem /usr/local/cuda/include -isystem /opt/conda/include/python3.8 -D_GLIBCXX_USE_CXX11_ABI=0 -fPIC -std=c++14 -DNVDR_TORCH -c /opt/conda/lib/python3.8/site-packages/nvdiffrast/torch/torch_bindings.cpp -o torch_bindings.o
ninja: build stopped: subcommand failed.

Here is my nvidia-smi output -

+-----------------------------------------------------------------------------+

Why is abs() not applied to area calculation?

I have a question about the backward pass of rasterization.

The code computes triangle areas to obtain barycentric coordinates.
However, it does not apply abs() to them.

float a0 = p1x*p2y - p1y*p2x;
float a1 = p2x*p0y - p2y*p0x;
float a2 = p0x*p1y - p0y*p1x;

https://github.com/NVlabs/nvdiffrast/blob/main/nvdiffrast/common/rasterize.cu#L82

Is it guaranteed that a0, a1, and a2 are never negative?
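
For reference, here is a small plain-Python sketch of how signed areas behave. It assumes that p0, p1, p2 in the quoted lines are vertex positions relative to the pixel and that the three values are then divided by their sum; the post does not quote that surrounding code, so treat both as assumptions. The function name `barycentrics` is made up for this example.

```python
def barycentrics(px, py, tri):
    """Barycentric coordinates of pixel (px, py) from signed edge areas."""
    (x0, y0), (x1, y1), (x2, y2) = tri
    # Vertex positions relative to the pixel, as assumed for the quoted snippet.
    p0x, p0y = x0 - px, y0 - py
    p1x, p1y = x1 - px, y1 - py
    p2x, p2y = x2 - px, y2 - py
    a0 = p1x * p2y - p1y * p2x
    a1 = p2x * p0y - p2y * p0x
    a2 = p0x * p1y - p0y * p1x
    total = a0 + a1 + a2  # signed (doubled) area of the whole triangle
    return (a0, a1, a2), (a0 / total, a1 / total, a2 / total)

ccw = [(0.0, 0.0), (4.0, 0.0), (0.0, 4.0)]
cw = [ccw[0], ccw[2], ccw[1]]  # same triangle, opposite winding order

areas_ccw, bary_ccw = barycentrics(1.0, 1.0, ccw)
areas_cw, bary_cw = barycentrics(1.0, 1.0, cw)
```

Reversing the winding negates all three areas and their sum together, so the ratios stay the same. Under these assumptions negative values are harmless, and taking abs() would only matter if the areas were used without normalization.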

add examples

Any plans to add some face examples or some projects like this?

How can I install nvdiffrast in a CentOS environment?

Thanks for sharing your amazing work.

Due to some restrictions, I can only build a Docker image based on CentOS. However, the Dockerfile uses some libraries that are only available on Ubuntu, and I am not sure how to modify it to be compatible with CentOS.

Can you help me with this?

No module named 'OpenGL'

I've been trying the examples. Everything works great except displaying the images. I am able to generate the videos. I get an error on line 69 of samples/torch/util.py saying:

~/PycharmProjects/nvdiffrast$ sudo bash run_sample.sh ./samples/torch/pose.py --outdir samples_output --display-interval 1
Using container image: gltorch:latest
Running command: ./samples/torch/pose.py --outdir samples_output --display-interval 1
Saving results under samples_output/pose
Mesh has 12 triangles and 24 vertices.
rep=0,iter=0,err=50.853275,err_best=50.853275,loss=0.153916,loss_best=0.153916,lr=0.010000,nr=1.000000
Traceback (most recent call last):
  File "./samples/torch/pose.py", line 290, in <module>
    main()
  File "./samples/torch/pose.py", line 281, in main
    mp4save_fn='progress.mp4'
  File "./samples/torch/pose.py", line 243, in fit_pose
    util.display_image(result_image, size=display_res, title='(%d) %d / %d' % (rep, it, max_iter))
  File "/app/samples/torch/util.py", line 69, in display_image
    import OpenGL.GL as gl
ModuleNotFoundError: No module named 'OpenGL'

I tried adding pip install PyOpenGL to the Dockerfile, but it caused another error. Then I tried replacing util.display_image with OpenCV's imshow, using a few different versions, but this caused errors and does not seem like a good direction.

silhouette loss possibility

Hello and thank you for releasing this amazing library!

I'm experimenting with implementations of different approaches to joint geometry optimization and inverse rendering, and I've tried implementing a silhouette (mask) loss. Geometry optimization with a texture works very well, but when I feed a tensor of ones into the texturing step and compare the result against the alpha mask of the input (synthetic data, so I have a ground-truth mask), no gradient seems to flow into the vertices (or anywhere). I also tried a differentiable Gaussian blur, and it didn't seem to help. Is this possible in principle, or am I completely off road trying this?

Thank you!

Camera pose optimization supported?

Hello all! Thank you again for open-sourcing this powerful tool.

I am wondering whether camera pose optimization is trivially supported by nvdiffrast. The rendering pipeline would look like this:

I went ahead and implemented a training loop, but the training outcome wasn't always successful. Camera either sometimes drifts away or simply doesn't converge. Other times the camera can find the optimal solution. Generally, the camera has a higher chance of drifting away when the initial error is large.

My guess is that anti-aliasing gradients (as opposed to soft rasterization), can only reason at a very local, pixel-level scale and cannot reason at a far distance away from an object. Hence the need for gradient-free greedy optimization in samples/pose.py?

Thank you!

RasterizeGLContext return killed

Hi, my environment is Ubuntu 18.04, CUDA 10.1, NVIDIA driver 430.46, and PyTorch 1.6. When I create a RasterizeGLContext, the process is terminated with the message "Killed". I have installed all dependencies, such as libglvnd0. Is this because the NVIDIA driver version is too old?

Support to rasterize a batch of triangles

Thanks for the code!

I want to use this code to rasterize a batch of meshes that have different triangles. However, by your definition, the implementation seems to assume the same triangle tensor for the whole batch. Could you provide an implementation that supports rasterizing a batch of meshes with different triangles?

nvdiffrast/nvdiffrast/torch/ops.py

Line 202 in a4e7a4d

tri: Triangle tensor with shape [num_triangles, 3] and dtype `torch.int32`.
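
One way to think about this, sketched in plain Python under stated assumptions: the nvdiffrast documentation describes a range mode for `dr.rasterize`, in which all meshes share one vertex buffer and one triangle buffer and a `ranges` argument gives a (start, count) pair of triangle indices per batch item. The helper `pack_meshes` below is hypothetical and only prepares such buffers as nested lists; please check the documentation for the exact tensor shapes, dtypes, and device requirements.

```python
def pack_meshes(meshes):
    """Concatenate meshes into shared buffers plus per-mesh triangle ranges.

    meshes: list of (vertices, triangles) pairs, where each triangle indexes
    into that mesh's own vertex list.
    """
    all_pos, all_tri, ranges = [], [], []
    for vertices, triangles in meshes:
        vertex_offset = len(all_pos)
        first_triangle = len(all_tri)
        all_pos.extend(vertices)
        # Re-index triangles so they point into the concatenated vertex buffer.
        all_tri.extend([[i + vertex_offset for i in t] for t in triangles])
        ranges.append([first_triangle, len(triangles)])
    return all_pos, all_tri, ranges

# Two toy meshes with different triangle counts (clip-space x, y, z, w).
quad = ([[-1, -1, 0, 1], [1, -1, 0, 1], [-1, 1, 0, 1], [1, 1, 0, 1]],
        [[0, 1, 2], [1, 3, 2]])
single = ([[0, 0, 0, 1], [1, 0, 0, 1], [0, 1, 0, 1]],
          [[0, 1, 2]])

pos, tri, ranges = pack_meshes([quad, single])
```

The resulting lists would then be converted to tensors (positions as float32, triangles and ranges as int32) before being handed to the rasterizer.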

VS Enterprise edition not detected

The linker search in torch/ops.py (here) doesn't have 'Enterprise' in the list of potential version names.

identifier "__match_any_sync" is undefined

I get the above error when running your code in a docker container. I suppose it's due to the fact that warp match functions are only supported by devices of compute capability 7.x or higher and my GPU has the compute capability of 6.1. Could you provide a workaround for such cases? Thank you!

nvidia driver version

Hi, I have run the demo successfully with NVIDIA driver 450.102.04. However, it failed in another environment with NVIDIA driver 450.80.02. Could you tell me which NVIDIA driver version you use?

Bogus error in even-channeled texture mapping

Using pytorch, calling dr.texture for a texture with an even number of channels gives the error "tex or mip input tensor not aligned to float4" (or "float2" for 2 channels). To reproduce, replace line 34 in samples/torch/earth.py with the following (making the texture 4-channeled and discarding the last channel after texturing; should be perfectly OK):

        tex = torch.cat((tex, tex[...,:1]),-1)
        color = dr.texture(tex[None, ...], texc, texd, filter_mode='linear-mipmap-linear', max_mip_level=max_mip_level)[..., :3]

This seems to be due to this and this line in torch_texture.cpp, where the condition should probably read i <= p.mipLevelMax instead of 0 <= p.mipLevelMax.

what's the meaning of "Calculate footprint axis lengths."

Hi, I'm looking into the details of the differentiable texture sampling. The code calculates the mip level from the sample footprint in the forward pass, but I don't understand what quantity is being calculated. This is the code in question, from texture.cu, lines 742-749:
float A = dsdx * dsdx + dtdx * dtdx;
float B = dsdy * dsdy + dtdy * dtdy;
float C = dsdx * dsdy + dtdx * dtdy;
float l2b = 0.5 * (A + B);
float l2n = 0.25 * (A - B) * (A - B) + C * C;
float l2a = sqrt(l2n);
float lenMinorSqr = fmaxf(0.0, l2b - l2a);
float lenMajorSqr = l2b + l2a;
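One way to read the snippet above: A, B and C are the entries of the symmetric matrix JᵀJ, where J = [[dsdx, dsdy], [dtdx, dtdy]] is the Jacobian of the texture coordinates (s, t) with respect to the screen coordinates (x, y). The values lenMinorSqr and lenMajorSqr are then the two eigenvalues of that matrix, i.e. the squared semi-axis lengths of the ellipse that a unit circle around the pixel maps to in texture space. The Python sketch below repeats the same arithmetic; the function name footprint_axis_lengths_sqr is made up for illustration.

```python
import math


def footprint_axis_lengths_sqr(dsdx, dsdy, dtdx, dtdy):
    """Squared minor/major axis lengths of the texture-space pixel footprint."""
    # Entries of the symmetric matrix J^T J, with J = [[dsdx, dsdy], [dtdx, dtdy]].
    A = dsdx * dsdx + dtdx * dtdx
    B = dsdy * dsdy + dtdy * dtdy
    C = dsdx * dsdy + dtdx * dtdy
    # Closed-form eigenvalues of [[A, C], [C, B]]: mean +/- radius.
    l2b = 0.5 * (A + B)
    l2a = math.sqrt(0.25 * (A - B) * (A - B) + C * C)
    len_minor_sqr = max(0.0, l2b - l2a)
    len_major_sqr = l2b + l2a
    return len_minor_sqr, len_major_sqr


# Axis-aligned example: 3 texels per pixel along x, 2 texels per pixel along y.
minor, major = footprint_axis_lengths_sqr(3.0, 0.0, 0.0, 2.0)
print(math.sqrt(minor), math.sqrt(major))
```

For the axis-aligned example the axis lengths are simply the two scale factors (2 and 3). For a general Jacobian, the two values sum to A + B and multiply to det(J)², which is a quick sanity check on the formula.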

run sample failed

python cube.py --resolution 16 --display-interval 10

got error

[F glutil.cpp:338] eglInitialize() failed
Aborted (core dumped)

I have installed libegl1-mesa-dev (apt-get install libegl1-mesa-dev) and am running the demo on a Linux server with an NVIDIA GPU.

Unexpected small gradients at texture seams

The following script optimizes the texture of two triangles towards zero.

import torch
import nvdiffrast.torch as dr
import matplotlib.pyplot as plt

def tensor(*args, **kwargs):
    return torch.tensor(*args, device='cuda', **kwargs)

img_size = 64
tex_size = 64
pos = tensor([[[-0.8, -0.8, 0, 1], [0.8, -0.8, 0, 1],
             [-0.8, 0.8, 0, 1], [0.8, 0.8, 0, 1]]], dtype=torch.float32)
tri = tensor([[0, 1, 2],[1, 3, 2]], dtype=torch.int32)
vert_uv = tensor([[[0.1, 0.1], [0.7, 0.1], [0.1, 0.7],
    [0.9, 0.9], [0.3, 0.9], [0.9, 0.3]]], dtype=torch.float32)
tri_uv = tensor([[0, 1, 2],[3, 4, 5]], dtype=torch.int32)
tex = torch.full((1, tex_size, tex_size, 1), dtype=torch.float32, fill_value=1, device='cuda', requires_grad=True)

rows = []
losses = []
glctx = dr.RasterizeGLContext()
optim = torch.optim.SGD([tex],lr=1e2)
for i in range(int(1e4)):
    optim.zero_grad()
    rast, rast_db = dr.rasterize(glctx, pos, tri, resolution=[img_size, img_size])
    uv, uv_da = dr.interpolate(vert_uv, rast, tri_uv, rast_db, diff_attrs='all')
    img = dr.texture(tex, uv, filter_mode='linear')#, uv_da)
    img = img * torch.clamp(rast[..., -1:], 0, 1) # Mask out background.
    loss = (img**2).mean()
    loss.backward()
    optim.step()
    rows.append(img[0,img_size//2,:,0].detach().cpu().numpy())
    losses.append(loss.item())

plt.subplot(2,2,1)
plt.imshow(tex[0].detach().cpu())
v = -.5 + tex_size * vert_uv[0,tri_uv[:,[0,1,2,0]].type(torch.long)].cpu().numpy()
plt.plot(v[0,:,0],v[0,:,1],'k')
plt.plot(v[1,:,0],v[1,:,1],'k')
plt.colorbar()
plt.title('tex')

plt.subplot(2,2,3)
plt.imshow(img[0].detach().cpu())
v = -.5 + img_size * (pos[0,tri[:,[0,1,2,0]].type(torch.long)]/2+.5).cpu().numpy()
plt.plot(v[0,:,0],v[0,:,1],'k')
plt.plot(v[1,:,0],v[1,:,1],'k')
plt.colorbar()
plt.title('image')

plt.subplot(2,2,2)
plt.plot(rows)
plt.title(f'image row {img_size//2}')
plt.xlabel('iteration')
plt.xscale('log')
plt.yscale('log')

plt.subplot(2,2,4)
plt.plot(losses)
plt.title('loss')
plt.xlabel('iteration')
plt.xscale('log')
plt.yscale('log')

plt.show()

I get the expected result when I run it with filter_mode='nearest':

But when I switch to filter_mode='linear', the training of the pixels at the edges slows down after ~100 iterations and pixel values are almost stuck at small constant values.

The plot looks similar with mipmapping enabled.

the approximation formula for screen-space derivatives duv/dxy

Hello, I'm looking into the geometry shader code that outputs the screen-space barycentric derivatives, and am a little confused by the formulation. I'm wondering if anyone would be able to clarify this mathematical approximation a little bit? Some questions I had are

Why do we multiply by w when computing the area?
Why do we need to define u in the (u/w) / (1/w) way to derive it? In this case, why is d(u/w)/dX constant -- shouldn't u change with X?
I am mainly puzzled by the following formulation, and how it eventually leads to the formulation of duv/dxy for each vertex.
float duwdx = w2 * dudx;
float dvwdx = w2 * dvdx;
float duvdx = w0 * dudx + w1 * dvdx;
float duwdy = w2 * dudy;
float dvwdy = w2 * dvdy;
float duvdy = w0 * dudy + w1 * dvdy;
// Outputs:
// var_uvzw: (u,v,z,w)
// var_db: (du/dx,du/dy,dv/dx,dv/dy)

// Set up geometry shader. Calculation of per-pixel bary differentials is based on:
// u = (u/w) / (1/w)
// --> du/dX = d((u/w) / (1/w))/dX
// --> du/dX = [d(u/w)/dX - u*d(1/w)/dX] * w
// and we know both d(u/w)/dX and d(1/w)/dX are constant over triangle.
compileGLShader(NVDR_CTX_PARAMS, &s.glGeometryShader, GL_GEOMETRY_SHADER,

 "#version 430\n"
        STRINGIFY_SHADER_SOURCE(
            layout(triangles) in;
            layout(triangle_strip, max_vertices=3) out;
            layout(location = 0) uniform vec2 vp_scale;
            in int v_layer[];
            in int v_offset[];
            out vec4 var_uvzw;
            out vec4 var_db;
            void main()
            {
                // Plane equations for bary differentials.
                float w0 = gl_in[0].gl_Position.w;
                float w1 = gl_in[1].gl_Position.w;
                float w2 = gl_in[2].gl_Position.w;
                vec2 p0 = gl_in[0].gl_Position.xy;
                vec2 p1 = gl_in[1].gl_Position.xy;
                vec2 p2 = gl_in[2].gl_Position.xy;
                vec2 e0 = p0*w2 - p2*w0;
                vec2 e1 = p1*w2 - p2*w1;
                float a = e0.x*e1.y - e0.y*e1.x; 

                // Clamp area to an epsilon to avoid arbitrarily high bary differentials.
                float eps = 1e-6f; // ~1 pixel in 1k x 1k image.
                float ca = (abs(a) >= eps) ? a : (a < 0.f) ? -eps : eps; // Clamp with sign.
                float ia = 1.f / ca; // Inverse area.

                vec2 ascl = ia * vp_scale;

                float dudx =  e1.y * ascl.x; 
                float dudy = -e1.x * ascl.y;
                float dvdx = -e0.y * ascl.x; 
                float dvdy =  e0.x * ascl.y;

                float duwdx = w2 * dudx; 
                float dvwdx = w2 * dvdx;
                float duvdx = w0 * dudx + w1 * dvdx;
                float duwdy = w2 * dudy;
                float dvwdy = w2 * dvdy;
                float duvdy = w0 * dudy + w1 * dvdy;

                vec4 db0 = vec4(duvdx - dvwdx, duvdy - dvwdy, dvwdx, dvwdy);
                vec4 db1 = vec4(duwdx, duwdy, duvdx - duwdx, duvdy - duwdy);
                vec4 db2 = vec4(duwdx, duwdy, dvwdx, dvwdy);

                int layer_id = v_layer[0];
                int prim_id = gl_PrimitiveIDIn + v_offset[0];

                gl_Layer = layer_id; gl_PrimitiveID = prim_id; gl_Position = vec4(gl_in[0].gl_Position.x, gl_in[0].gl_Position.y, gl_in[0].gl_Position.z, gl_in[0].gl_Position.w); var_uvzw = vec4(1.f, 0.f, gl_in[0].gl_Position.z, gl_in[0].gl_Position.w); var_db = db0; EmitVertex();
                gl_Layer = layer_id; gl_PrimitiveID = prim_id; gl_Position = vec4(gl_in[1].gl_Position.x, gl_in[1].gl_Position.y, gl_in[1].gl_Position.z, gl_in[1].gl_Position.w); var_uvzw = vec4(0.f, 1.f, gl_in[1].gl_Position.z, gl_in[1].gl_Position.w); var_db = db1; EmitVertex();
                gl_Layer = layer_id; gl_PrimitiveID = prim_id; gl_Position = vec4(gl_in[2].gl_Position.x, gl_in[2].gl_Position.y, gl_in[2].gl_Position.z, gl_in[2].gl_Position.w); var_uvzw = vec4(0.f, 0.f, gl_in[2].gl_Position.z, gl_in[2].gl_Position.w); var_db = db2; EmitVertex();
            }
        )
    );
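The identity in the shader comment is the quotient rule. Writing a = u/w and b = 1/w, u = a/b gives du/dX = (a' - u*b') / b = (a' - u*b') * w. Under perspective-correct interpolation, u/w and 1/w vary linearly in screen space, so their derivatives a' and b' are constant over the triangle even though u itself is not linear in X. The Python sketch below checks the identity numerically in one dimension; the coefficients are arbitrary illustrative values, not taken from nvdiffrast.

```python
def a(X):
    # u/w: linear in screen-space X (arbitrary example coefficients).
    return 0.2 + 0.5 * X


def b(X):
    # 1/w: also linear in screen-space X.
    return 1.0 + 0.3 * X


def u(X):
    # Perspective-correct barycentric: u = (u/w) / (1/w).
    return a(X) / b(X)


def du_dX_analytic(X):
    # du/dX = [d(u/w)/dX - u * d(1/w)/dX] * w, with constant slopes 0.5 and 0.3.
    w = 1.0 / b(X)
    return (0.5 - u(X) * 0.3) * w


def du_dX_numeric(X, h=1e-6):
    # Central finite difference for comparison.
    return (u(X + h) - u(X - h)) / (2.0 * h)


for X in (0.0, 0.5, 1.0):
    print(X, du_dX_analytic(X), du_dX_numeric(X))
```

The two columns agree closely at every sample point, while the derivative itself changes with X. That is why the shader only needs the constant slopes per triangle, and the per-pixel value is recovered from u and w.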

nvdiffrast's custom torch extension build fails on Google Colab

Running nvdiffrast on Colab is probably possible but it doesn't currently work.

Something like this should work (this uses Google Drive mounts to make the nvdiffrast checkout editable -- if there's a better method for updating code on Colab, please leave a comment!):

!pip install ninja
from google.colab import drive
drive.mount('/content/drive')
%cd "/content/drive/My Drive/colab"
!git clone https://github.com/NVlabs/nvdiffrast.git
%cd "nvdiffrast"
!pip install .

%cd ".."
import nvdiffrast
import nvdiffrast.torch as dr
gl = dr.RasterizeGLContext()

The above fails with the following exception:

RuntimeError                              Traceback (most recent call last)
/usr/local/lib/python3.6/dist-packages/torch/utils/cpp_extension.py in _run_ninja_build(build_directory, verbose, error_prefix)
   1536         if hasattr(error, 'output') and error.output:  # type: ignore
   1537             message += ": {}".format(error.output.decode())  # type: ignore
-> 1538         raise RuntimeError(message) from e
   1539 
   1540 
RuntimeError: Error building extension 'nvdiffrast_plugin'

The most likely reason is that the Torch extension build fails on some compiler or linker error. Unfortunately, the ninja build output logs get somehow eaten by Colab or our code, and it's hard to figure out what's wrong.

About the first running time

Thanks for your code. I have a question: why does the first run take so long? I run the code in the Docker container that you provide. I implemented a function similar to warpaffine.
The running times are as follows:

image shape : 2448 3264 warp_img time is 0.318
image shape : 1024 1024 warp_img time is 0.019
image shape : 720 1407 warp_img time is 0.020
image shape : 2448 3264 warp_img time is 0.085
image shape : 1920 2560 warp_img time is 0.055
image shape : 634 951 warp_img time is 0.010
image shape : 1944 2592 warp_img time is 0.056
image shape : 900 1600 warp_img time is 0.022
image shape : 720 1424 warp_img time is 0.018

Can I avoid the cold start?

Looking forward to your reply.
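The thread does not state what causes the slow first call. If it is one-time setup work (an assumption), a common way to keep it out of measurements is to make a warm-up call before timing. The sketch below uses a dummy stand-in function; time_calls and dummy_warp are hypothetical names.

```python
import time


def time_calls(fn, n_calls=5, warmup=1):
    """Return per-call durations in seconds, excluding `warmup` initial calls."""
    for _ in range(warmup):
        fn()  # Absorb one-time setup cost, if any.
    durations = []
    for _ in range(n_calls):
        start = time.perf_counter()
        fn()
        durations.append(time.perf_counter() - start)
    return durations


def dummy_warp():
    # Stand-in for the real warp function; replace with the actual call.
    # For GPU code, call torch.cuda.synchronize() before reading the clock,
    # since CUDA kernels are launched asynchronously.
    return sum(i * i for i in range(10000))


print(time_calls(dummy_warp, n_calls=3))
```

A warm-up call does not remove the cost; it only moves it to a point where it does not affect the timed or latency-sensitive calls.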

Unit of image-space coordinates

Hello! Thank you for open-sourcing this powerful tool.

When calculating the image-space derivative du/dX, etc. , using the image-space coordinate (X,Y), what is the unit of X and Y? Is the unit "pixel" (i.e., 0<=X<width, 0<=Y<height), or normalized coordinate (0<=X<=1, 0<=Y<=1)?

ImportError: No module named 'nvdiffrast_plugin'

When I run the code in ./samples/torch, there is always an error: No module named 'nvdiffrast_plugin'

Traceback (most recent call last):
File "triangle.py", line 21, in
glctx = dr.RasterizeGLContext()
File "/opt/conda/envs/fomm/lib/python3.7/site-packages/nvdiffrast/torch/ops.py", line 142, in init
self.cpp_wrapper = _get_plugin().RasterizeGLStateWrapper(output_db, mode == 'automatic')
File "/opt/conda/envs/fomm/lib/python3.7/site-packages/nvdiffrast/torch/ops.py", line 83, in _get_plugin
torch.utils.cpp_extension.load(name=plugin_name, sources=source_paths, extra_cflags=opts, extra_cuda_cflags=opts, extra_ldflags=ldflags, with_cuda=True, verbose=False)
File "/opt/conda/envs/fomm/lib/python3.7/site-packages/torch/utils/cpp_extension.py", line 1091, in load
keep_intermediates=keep_intermediates)
File "/opt/conda/envs/fomm/lib/python3.7/site-packages/torch/utils/cpp_extension.py", line 1317, in _jit_compile
return _import_module_from_library(name, build_directory, is_python_module)
File "/opt/conda/envs/fomm/lib/python3.7/site-packages/torch/utils/cpp_extension.py", line 1706, in _import_module_from_library
file, path, description = imp.find_module(module_name, [path])
File "/opt/conda/envs/fomm/lib/python3.7/imp.py", line 299, in find_module
raise ImportError(_ERR_MSG.format(name), name=name)
ImportError: No module named 'nvdiffrast_plugin'

It seems that some package is missing.
I installed nvdiffrast as instructed in the documentation: cd ./nvdiffrast and pip install .
I have uninstalled and reinstalled many times, but the error persists. I tried CUDA 10.0 with torch 1.6, CUDA 11.1 with torch 1.8.1, and CUDA 9.0 with torch 1.6, and all of these configurations produce this error. I use an Nvidia 3090 GPU.
Can anyone help solve this problem? Thanks.

[F glutil.cpp:366] eglCreateContext() failed ... RuntimeError: OpenGL 4.4 or later is required

Hi,

I tried to run it on local Linux machines following the installation steps in the Dockerfile, and I also tried installing in a Docker image running Ubuntu 20.04 + CUDA 11. Both fail with the following message. I'm running on an AWS Linux server, so I'm not sure whether this is related to the headless display.

Creating GL context for Cuda device 0
Failed, falling back to default display
eglInitialize() failed
eglChooseConfig() failed
eglCreateContext() failed
EGL 1471947312.32765 OpenGL context created (disp: 0x0000000082415470, ctx: 0x0000000000000000)
setGLContext() called with null gltcx
Traceback (most recent call last):
  File "nvdiffrast/samples/torch/envphong.py", line 226, in <module>
    main()
  File "nvdiffrast/samples/torch/envphong.py", line 211, in main
    fit_env_phong(
  File "nvdiffrast/samples/torch/envphong.py", line 77, in fit_env_phong
    glctx = dr.RasterizeGLContext()
  File "/usr/local/lib/python3.8/dist-packages/nvdiffrast/torch/ops.py", line 151, in __init__
    self.cpp_wrapper = _get_plugin().RasterizeGLStateWrapper(output_db, mode == 'automatic', cuda_device_idx)
RuntimeError: OpenGL 4.4 or later is required

Is there a setting that I need to set up for it to work? Thanks so much!!

error run inside WSL2 with cuda and pytorch

python3 pose.py

No output directory specified, not saving log or images
Mesh has 12 triangles and 24 vertices.
[F glutil.cpp:338] eglInitialize() failed
Aborted

install nvdiffrast got error

My environment is Ubuntu 18.04.1, CUDA 10.0, gcc 7.4.0, and PyTorch 1.6.0.
I followed the instructions and installed nvdiffrast into the local Python site-packages by running
pip install . at the root of the repository.
Then I ran the script as follows:
python samples/torch/triangle.py
The following error occurs:

  File "/root/miniconda3/lib/python3.7/site-packages/torch/utils/cpp_extension.py", line 974, in load
    keep_intermediates=keep_intermediates)
  File "/root/miniconda3/lib/python3.7/site-packages/torch/utils/cpp_extension.py", line 1179, in _jit_compile
    with_cuda=with_cuda)
  File "/root/miniconda3/lib/python3.7/site-packages/torch/utils/cpp_extension.py", line 1246, in _write_ninja_file_and_build_library
    verify_ninja_availability()
  File "/root/miniconda3/lib/python3.7/site-packages/torch/utils/cpp_extension.py", line 1302, in verify_ninja_availability
    raise RuntimeError("Ninja is required to load C++ extensions")
RuntimeError: Ninja is required to load C++ extensions

Then I installed ninja as follows:

git clone https://github.com/ninja-build/ninja.git
cd ninja
./configure.py --bootstrap
cp ./ninja  /usr/bin

I reran the script:

python samples/torch/triangle.py

and got the error:

Traceback (most recent call last):
  File "/root/miniconda3/lib/python3.7/site-packages/torch/utils/cpp_extension.py", line 1515, in _run_ninja_build
    env=env)
  File "/root/miniconda3/lib/python3.7/subprocess.py", line 512, in run
    output=stdout, stderr=stderr)
subprocess.CalledProcessError: Command '['ninja', '-v']' returned non-zero exit status 1.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "samples/torch/triangle.py", line 21, in <module>
    glctx = dr.RasterizeGLContext()
  File "/root/miniconda3/lib/python3.7/site-packages/nvdiffrast/torch/ops.py", line 151, in __init__
    self.cpp_wrapper = _get_plugin().RasterizeGLStateWrapper(output_db, mode == 'automatic', cuda_device_idx)
  File "/root/miniconda3/lib/python3.7/site-packages/nvdiffrast/torch/ops.py", line 84, in _get_plugin
    torch.utils.cpp_extension.load(name=plugin_name, sources=source_paths, extra_cflags=opts, extra_cuda_cflags=opts, extra_ldflags=ldflags, with_cuda=True, verbose=False)
  File "/root/miniconda3/lib/python3.7/site-packages/torch/utils/cpp_extension.py", line 974, in load
    keep_intermediates=keep_intermediates)
  File "/root/miniconda3/lib/python3.7/site-packages/torch/utils/cpp_extension.py", line 1179, in _jit_compile
    with_cuda=with_cuda)
  File "/root/miniconda3/lib/python3.7/site-packages/torch/utils/cpp_extension.py", line 1279, in _write_ninja_file_and_build_library
    error_prefix="Error building extension '{}'".format(name))
  File "/root/miniconda3/lib/python3.7/site-packages/torch/utils/cpp_extension.py", line 1529, in _run_ninja_build
    raise RuntimeError(message)
RuntimeError: Error building extension 'nvdiffrast_plugin': [1/14] c++ -MMD -MF common.o.d -DTORCH_EXTENSION_NAME=nvdiffrast_plugin -DTORCH_API_INCLUDE_EXTENSION_H -isystem /root/miniconda3/lib/python3.7/site-packages/torch/include -isystem /root/miniconda3/lib/python3.7/site-packages/torch/include/torch/csrc/api/include -isystem /root/miniconda3/lib/python3.7/site-packages/torch/include/TH -isystem /root/miniconda3/lib/python3.7/site-packages/torch/include/THC -isystem /usr/local/cuda/include -isystem /root/miniconda3/include/python3.7m -D_GLIBCXX_USE_CXX11_ABI=0 -fPIC -std=c++14 -DNVDR_TORCH -c /root/miniconda3/lib/python3.7/site-packages/nvdiffrast/common/common.cpp -o common.o 
[2/14] c++ -MMD -MF torch_rasterize.o.d -DTORCH_EXTENSION_NAME=nvdiffrast_plugin -DTORCH_API_INCLUDE_EXTENSION_H -isystem /root/miniconda3/lib/python3.7/site-packages/torch/include -isystem /root/miniconda3/lib/python3.7/site-packages/torch/include/torch/csrc/api/include -isystem /root/miniconda3/lib/python3.7/site-packages/torch/include/TH -isystem /root/miniconda3/lib/python3.7/site-packages/torch/include/THC -isystem /usr/local/cuda/include -isystem /root/miniconda3/include/python3.7m -D_GLIBCXX_USE_CXX11_ABI=0 -fPIC -std=c++14 -DNVDR_TORCH -c /root/miniconda3/lib/python3.7/site-packages/nvdiffrast/torch/torch_rasterize.cpp -o torch_rasterize.o 
FAILED: torch_rasterize.o 
[12/14] c++ -MMD -MF torch_antialias.o.d -DTORCH_EXTENSION_NAME=nvdiffrast_plugin -DTORCH_API_INCLUDE_EXTENSION_H -isystem /root/miniconda3/lib/python3.7/site-packages/torch/include -isystem /root/miniconda3/lib/python3.7/site-packages/torch/include/torch/csrc/api/include -isystem /root/miniconda3/lib/python3.7/site-packages/torch/include/TH -isystem /root/miniconda3/lib/python3.7/site-packages/torch/include/THC -isystem /usr/local/cuda/include -isystem /root/miniconda3/include/python3.7m -D_GLIBCXX_USE_CXX11_ABI=0 -fPIC -std=c++14 -DNVDR_TORCH -c /root/miniconda3/lib/python3.7/site-packages/nvdiffrast/torch/torch_antialias.cpp -o torch_antialias.o 
[13/14] c++ -MMD -MF torch_bindings.o.d -DTORCH_EXTENSION_NAME=nvdiffrast_plugin -DTORCH_API_INCLUDE_EXTENSION_H -isystem /root/miniconda3/lib/python3.7/site-packages/torch/include -isystem /root/miniconda3/lib/python3.7/site-packages/torch/include/torch/csrc/api/include -isystem /root/miniconda3/lib/python3.7/site-packages/torch/include/TH -isystem /root/miniconda3/lib/python3.7/site-packages/torch/include/THC -isystem /usr/local/cuda/include -isystem /root/miniconda3/include/python3.7m -D_GLIBCXX_USE_CXX11_ABI=0 -fPIC -std=c++14 -DNVDR_TORCH -c /root/miniconda3/lib/python3.7/site-packages/nvdiffrast/torch/torch_bindings.cpp -o torch_bindings.o 
ninja: build stopped: subcommand failed.

So I went to /root/miniconda3/lib/python3.7/site-packages/torch/utils/cpp_extension.py and changed

'ninja', '-v'

to

'ninja', '--v'

Then I reran the script, but got the following error:

Traceback (most recent call last):
  File "samples/torch/triangle.py", line 21, in <module>
    glctx = dr.RasterizeGLContext()
  File "/root/miniconda3/lib/python3.7/site-packages/nvdiffrast/torch/ops.py", line 151, in __init__
    self.cpp_wrapper = _get_plugin().RasterizeGLStateWrapper(output_db, mode == 'automatic', cuda_device_idx)
  File "/root/miniconda3/lib/python3.7/site-packages/nvdiffrast/torch/ops.py", line 84, in _get_plugin
    torch.utils.cpp_extension.load(name=plugin_name, sources=source_paths, extra_cflags=opts, extra_cuda_cflags=opts, extra_ldflags=ldflags, with_cuda=True, verbose=False)
  File "/root/miniconda3/lib/python3.7/site-packages/torch/utils/cpp_extension.py", line 974, in load
    keep_intermediates=keep_intermediates)
  File "/root/miniconda3/lib/python3.7/site-packages/torch/utils/cpp_extension.py", line 1190, in _jit_compile
    return _import_module_from_library(name, build_directory, is_python_module)
  File "/root/miniconda3/lib/python3.7/site-packages/torch/utils/cpp_extension.py", line 1534, in _import_module_from_library
    file, path, description = imp.find_module(module_name, [path])
  File "/root/miniconda3/lib/python3.7/imp.py", line 296, in find_module
    raise ImportError(_ERR_MSG.format(name), name=name)
ImportError: No module named 'nvdiffrast_plugin'

Can anyone help me? Thanks!
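The first error in this report ("Ninja is required to load C++ extensions") only means that no ninja executable was found. A minimal sketch of a pre-flight check is shown below; the helper name ninja_on_path is hypothetical. Installing with pip install ninja, as in the Colab snippet elsewhere on this page, is one alternative to building ninja from source.

```python
import shutil


def ninja_on_path() -> bool:
    """Return True if a `ninja` executable can be found on PATH."""
    return shutil.which("ninja") is not None


if not ninja_on_path():
    print("ninja not found on PATH; `pip install ninja` is one way to get it")
else:
    print("ninja found at", shutil.which("ninja"))
```

This check says nothing about the later compiler failure (FAILED: torch_rasterize.o); the actual compiler message for that step is not included in the log above.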

What is pos?

Thanks for releasing this code--if I can get it to work, then it should be quite helpful :)

The function nvdiffrast.torch.rasterize(glctx, pos, tri, resolution, ranges=None, grad_db=True) takes an argument, pos, which I assume is supposed to be the positions of the vertices in a mesh. The documentation seems to indicate that each position should have three coordinates: "x/w, y/w, z/w".
However, the shape needs to be [num_verts, 4], which doesn't correspond to a 3D coordinate encoding.

I tried playing around with values in the triangle.py example, but couldn't figure it out.

Could you clarify what this argument is supposed to be?
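A plausible reading, consistent with both the [num_verts, 4] shape and the "x/w, y/w, z/w" wording, is that pos holds clip-space homogeneous coordinates (x, y, z, w), and that x/w, y/w, z/w are the normalized coordinates obtained after the perspective divide. This is an assumption to check against the nvdiffrast documentation. The sketch below uses plain Python lists; to_clip_space and to_ndc are hypothetical helper names.

```python
def to_clip_space(points_xyz, mvp):
    """Map 3D points to 4-component clip-space positions (x, y, z, w).

    points_xyz: list of [x, y, z]; mvp: 4x4 row-major matrix (list of 4 rows).
    """
    clip = []
    for x, y, z in points_xyz:
        p = (x, y, z, 1.0)  # Homogeneous coordinate with w = 1.
        clip.append([sum(row[k] * p[k] for k in range(4)) for row in mvp])
    return clip


def to_ndc(clip):
    """Perspective divide: (x, y, z, w) -> (x/w, y/w, z/w)."""
    return [[x / w, y / w, z / w] for x, y, z, w in clip]


# With an identity matrix the clip-space position is just (x, y, z, 1),
# like the hard-coded positions used in the scripts on this page.
identity = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
pos = to_clip_space([[-0.8, -0.8, 0.0], [0.8, -0.8, 0.0], [-0.8, 0.8, 0.0]], identity)
print(pos)
print(to_ndc(pos))
```

In a real pipeline the matrix would be a model-view-projection matrix, and the resulting list would be converted to a float32 CUDA tensor before being passed on.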

Potential memory leak with rasterize()

Hello,
When running several optimizations in a script I noticed that my GPU eventually runs out of memory, thus causing the script to fail.
From looking at nvidia-smi after each run of an optimization, it seems that some memory is never freed (except when the process is killed of course).

Here is a minimal reproducer:

import nvdiffrast.torch as dr
import torch

def render_dummy():
    glctx = dr.RasterizeGLContext()
    # Create the NDCs of one dummy triangle seen from 16 dummy viewpoints
    v = torch.ones((16,3,4), device='cuda')
    f = torch.tensor([[0,1,2]], device='cuda', dtype=torch.int32)
    dr.rasterize(glctx, v, f, (1080, 1920))

Then, running

render_dummy()
torch.cuda.empty_cache()

several times in a jupyter notebook, and checking nvidia-smi in between calls shows the growing memory used by the process.

Alternatively, running

for i in range(20):
    render_dummy()
    torch.cuda.empty_cache()

should be enough to make the GPU run out of memory (I have a Titan RTX on my end).

The size of the leak seems to be proportional to the number of viewpoints or the resolution, which makes me suspect that the framebuffer is not properly freed (provided I'm not to blame here 😅). For example, with the resolution and viewpoints in the example above, the leak on my end is 1080MiB large, which is pretty close to the size of the result of rasterize.

Also, here's the log output from running the dummy rendering function once with dr.set_log_level(0):

[I glutil.cpp:322] Creating GL context for Cuda device 0
[I glutil.cpp:370] EGL 5.1 OpenGL context created (disp: 0x0000555af6bcdd70, ctx: 0x0000555af6cf7141)
[I rasterize.cpp:91] OpenGL version reported as 4.6
[I rasterize.cpp:332] Increasing position buffer size to 192 float32
[I rasterize.cpp:343] Increasing triangle buffer size to 64 int32
[I rasterize.cpp:368] Increasing frame buffer size to (width, height, depth) = (1920, 1088, 16)
[I rasterize.cpp:394] Increasing range array size to 64 elements
[I glutil.cpp:391] EGL OpenGL context destroyed (disp: 0x0000555af6bcdd70, ctx: 0x0000555af6cf7141)

I initially noticed this behavior using nvdiffrast v0.2.0; I have since updated to 0.2.5, which didn't change anything.
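As a rough cross-check of the reported leak size, the arithmetic can be written out. The sketch assumes float32 data with 4 channels per pixel; the helper name tensor_mib is made up for illustration.

```python
def tensor_mib(shape, bytes_per_element=4):
    """Size in MiB of a dense tensor with the given shape."""
    n = 1
    for dim in shape:
        n *= dim
    return n * bytes_per_element / 2**20


# One float32 output of shape [minibatch, height, width, 4].
print(tensor_mib((16, 1080, 1920, 4)))  # 506.25
# Same, using the padded frame buffer height of 1088 reported in the log.
print(tensor_mib((16, 1088, 1920, 4)))  # 510.0
```

If two such buffers are allocated per call (for example the main output and the derivative output, which is an assumption here), the totals are about 1012.5 MiB and 1020 MiB. Both are in the same range as the reported 1080 MiB.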

_texture_funcBackward returns nan values in cube mode

The following example adds a cube texture to the triangle torch sample:

import torch
import nvdiffrast.torch as dr
import matplotlib.pyplot as plt

def tensor(*args, **kwargs):
    return torch.tensor(*args, device='cuda', **kwargs)

pos = tensor([[[-0.8, -0.8, .2, 1], [0.8, -0.8, .2, 1],
             [-0.8, 0.8, .2, 1]]], dtype=torch.float32)
col = tensor([[[1, 0, 0], [0, 1, 0], [0, 0, 1]]], dtype=torch.float32)
tri = tensor([[0, 1, 2]], dtype=torch.int32)
tex = torch.rand((1, 6, 128, 128, 3), device='cuda', requires_grad=True)
vert_uv = pos[..., :3].clone()

glctx = dr.RasterizeGLContext()
rast, _ = dr.rasterize(glctx, pos, tri, resolution=[512, 512])
uv, _ = dr.interpolate(vert_uv, rast, tri)
out = dr.texture(tex, uv, boundary_mode='cube')
out.mean().backward()

plt.imshow(out[0].detach().cpu())
plt.show()

When I add with torch.autograd.detect_anomaly():, it fails with

...\test_tex.py:15: UserWarning: Anomaly Detection has been enabled. This mode will increase the runtime and should only be enabled for debugging.
  with torch.autograd.detect_anomaly():
[W ..\torch\csrc\autograd\python_anomaly_mode.cpp:104] Warning: Error detected in _texture_funcBackward. Traceback of forward call that caused the error:
  File "...\test_tex.py", line 19, in <module>
    out = dr.texture(tex, uv, boundary_mode='cube')
  File "...\nvdiffrast\torch\ops.py", line 541, in texture
    return _texture_func.apply(filter_mode, tex, uv, filter_mode_enum, boundary_mode_enum)
 (function _print_stack)
Traceback (most recent call last):
  File "...t\test_tex.py", line 20, in <module>
    out.mean().backward()
  File "...\torch\_tensor.py", line 255, in backward
    torch.autograd.backward(self, gradient, retain_graph, create_graph, inputs=inputs)
  File "...\torch\autograd\__init__.py", line 147, in backward
    Variable._execution_engine.run_backward(
RuntimeError: Function '_texture_funcBackward' returned nan values in its 1th output.

nvdiffrast 0.2.5
Windows 10, VS2019
pytorch 1.9.0

How to incorporate Cook-Torrance BRDF?

Hi awesome people,

I'm curious whether there's a way to use the ideas from https://www.microsoft.com/en-us/research/wp-content/uploads/2009/12/sg.pdf to incorporate a Cook-Torrance Gaussian mixture model as the BRDF. It looks like I need to supply my own prefiltered mipmap. However, if I do that, I still need to multiply each pixel by the correct Fresnel and shadowing terms. How do I do that? Is it possible to request an example?

In addition, would gradients be propagated to the input mipmap if I did that? How do I make sure gradients are propagated back as well?

Error

ImportError: /tmp/torch_extensions/nvdiffrast_plugin/nvdiffrast_plugin.so: undefined symbol: eglCreateContext
How can I solve this?
