Comments (11)
I think you should replace x.type() with x.options():

auto ROI_pos = at::zeros({x.size(0), x.size(1)}, x.options());

The key here is that the second argument is not the type but a set of options, within which can be found the type, but also e.g. the device.
More info: https://github.com/pytorch/pytorch/blob/master/aten/src/ATen/TensorOptions.h
This is done e.g. here: https://github.com/pytorch/pytorch/blob/master/aten/src/ATen/native/SummaryOps.cpp#L36
from extension-cpp.
Thanks for contributing your ideas. I like how it makes the CUDA kernel shorter and more readable (assuming one knows what the macro does). It's important to note, however, that any use of TH things is not officially supported in C++ extensions. TH is a very low-level backend to PyTorch and an active construction site. We remove or change things in it almost every day, and there is no guarantee of any kind that THCDeviceTensor will still exist tomorrow. ATen is the only supported interface to PyTorch. It's fine to use TH things for your project as long as it works, but we won't advertise it for all users. It may be worth adding convenient functionality such as this to ATen directly, to make writing CUDA kernels easier.
at::Tensor max_ROI_cuda(
    at::Tensor x,
    at::Tensor ROI_size
) {
  const auto batch_size = x.size(0);
  const auto channel_num = x.size(1);
  const auto feat_height = x.size(2);
  const auto feat_width = x.size(3);

  auto ROI_pos = at::zeros({x.size(0), x.size(1)}, x.type());

  const dim3 blocksPerGrid(1);                          // 1 block per grid (1D) (x, )
  const dim3 threadsPerBlock(batch_size, channel_num);  // batch_size * channel_num threads per block (2D) (x, y)

  AT_DISPATCH_FLOATING_TYPES(x.type(), "max_ROI_cuda", ([&] {
    max_ROI_cuda_kernel<scalar_t><<<blocksPerGrid, threadsPerBlock>>>(
        feat_height,
        feat_width,
        x.data<scalar_t>(),
        ROI_size.data<scalar_t>(),
        ROI_pos.data<scalar_t>());
  }));

  return ROI_pos;
}
When using at::zeros({x.size(0), x.size(1)}, x.type()), I got two build errors:
(1) error: no instance of constructor "at::Type::Type" matches the argument list. argument types are: (int64_t, int64_t)
(2) error: no suitable user-defined conversion from "at::Type" to "at::IntList" exists
Can anybody help me fix this problem? Thanks.
Thanks. Your advice helps me a lot.
Hmm, no, this should not have been a problem: TensorOptions has an implicit constructor from Type, otherwise all such code in the wild would break. It's true that x.options() is, since a week ago, the correct way of doing this since it preserves the device, but x.type() should still work fine. @braveapple your code compiles perfectly fine for me, I just tried it. Could you maybe paste the full error you got at the time?
Hello @goldsborough. When I used x.type(), I got the following build error:
$ python step.py install
running install
running bdist_egg
running egg_info
writing space_dropout_cuda.egg-info/PKG-INFO
writing top-level names to space_dropout_cuda.egg-info/top_level.txt
writing dependency_links to space_dropout_cuda.egg-info/dependency_links.txt
reading manifest file 'space_dropout_cuda.egg-info/SOURCES.txt'
writing manifest file 'space_dropout_cuda.egg-info/SOURCES.txt'
installing library code to build/bdist.linux-x86_64/egg
running install_lib
running build_ext
building 'space_dropout_cuda' extension
gcc -pthread -B /home/dmt/anaconda2/compiler_compat -Wl,--sysroot=/ -fno-strict-aliasing -g -O2 -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -I/home/dmt/anaconda2/lib/python2.7/site-packages/torch/lib/include -I/home/dmt/anaconda2/lib/python2.7/site-packages/torch/lib/include/TH -I/home/dmt/anaconda2/lib/python2.7/site-packages/torch/lib/include/THC -I/usr/local/cuda/include -I/home/dmt/anaconda2/include/python2.7 -c space_dropout_cuda.cpp -o build/temp.linux-x86_64-2.7/space_dropout_cuda.o -DTORCH_EXTENSION_NAME=space_dropout_cuda -std=c++11
cc1plus: warning: command line option ‘-Wstrict-prototypes’ is valid for C/ObjC but not for C++
/usr/local/cuda/bin/nvcc -I/home/dmt/anaconda2/lib/python2.7/site-packages/torch/lib/include -I/home/dmt/anaconda2/lib/python2.7/site-packages/torch/lib/include/TH -I/home/dmt/anaconda2/lib/python2.7/site-packages/torch/lib/include/THC -I/usr/local/cuda/include -I/home/dmt/anaconda2/include/python2.7 -c space_dropout_cuda_kernel.cu -o build/temp.linux-x86_64-2.7/space_dropout_cuda_kernel.o -DTORCH_EXTENSION_NAME=space_dropout_cuda --compiler-options '-fPIC' -std=c++11
space_dropout_cuda_kernel.cu(182): error: no instance of constructor "at::Type::Type" matches the argument list argument types are: (int64_t, int64_t)
space_dropout_cuda_kernel.cu(182): error: no suitable user-defined conversion from "at::Type" to "at::IntList" exists
2 errors detected in the compilation of "/tmp/tmpxft_00007903_00000000-6_space_dropout_cuda_kernel.cpp1.ii".
error: command '/usr/local/cuda/bin/nvcc' failed with exit status 1
Hello @goldsborough. When I used x.options(), I also got a similar build error:
$ python step.py install
running install
running bdist_egg
running egg_info
writing space_dropout_cuda.egg-info/PKG-INFO
writing top-level names to space_dropout_cuda.egg-info/top_level.txt
writing dependency_links to space_dropout_cuda.egg-info/dependency_links.txt
reading manifest file 'space_dropout_cuda.egg-info/SOURCES.txt'
writing manifest file 'space_dropout_cuda.egg-info/SOURCES.txt'
installing library code to build/bdist.linux-x86_64/egg
running install_lib
running build_ext
building 'space_dropout_cuda' extension
gcc -pthread -B /home/dmt/anaconda2/compiler_compat -Wl,--sysroot=/ -fno-strict-aliasing -g -O2 -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -I/home/dmt/anaconda2/lib/python2.7/site-packages/torch/lib/include -I/home/dmt/anaconda2/lib/python2.7/site-packages/torch/lib/include/TH -I/home/dmt/anaconda2/lib/python2.7/site-packages/torch/lib/include/THC -I/usr/local/cuda/include -I/home/dmt/anaconda2/include/python2.7 -c space_dropout_cuda.cpp -o build/temp.linux-x86_64-2.7/space_dropout_cuda.o -DTORCH_EXTENSION_NAME=space_dropout_cuda -std=c++11
cc1plus: warning: command line option ‘-Wstrict-prototypes’ is valid for C/ObjC but not for C++
/usr/local/cuda/bin/nvcc -I/home/dmt/anaconda2/lib/python2.7/site-packages/torch/lib/include -I/home/dmt/anaconda2/lib/python2.7/site-packages/torch/lib/include/TH -I/home/dmt/anaconda2/lib/python2.7/site-packages/torch/lib/include/THC -I/usr/local/cuda/include -I/home/dmt/anaconda2/include/python2.7 -c space_dropout_cuda_kernel.cu -o build/temp.linux-x86_64-2.7/space_dropout_cuda_kernel.o -DTORCH_EXTENSION_NAME=space_dropout_cuda --compiler-options '-fPIC' -std=c++11
space_dropout_cuda_kernel.cu(182): error: class "at::Tensor" has no member "options"
space_dropout_cuda_kernel.cu(182): error: no instance of constructor "at::Type::Type" matches the argument list argument types are: (int64_t, int64_t)
2 errors detected in the compilation of "/tmp/tmpxft_00000639_00000000-6_space_dropout_cuda_kernel.cpp1.ii".
error: command '/usr/local/cuda/bin/nvcc' failed with exit status 1
What's your PyTorch version?

import torch
torch.__version__

The error: class "at::Tensor" has no member "options" makes me think that your version is not very up to date.
@ClementPinard Thanks for your reply! My PyTorch version is 0.4.0 (the newest version).
https://github.com/pytorch/pytorch/blob/v0.4.0/aten/src/ATen/test/basic.cpp
Looking at the 0.4.0 version of this code, I think you can try inverting the type and the sizes:
auto ROI_pos = at::zeros(x.type(), {x.size(0), x.size(1)});
Packed tensor accessors are now a thing, thanks @t-vi! Would it be a good idea to implement them here? I just implemented them for my own extension, and they work like a charm (and are more official than THCDeviceTensor 😆). It would be nice to spread awareness of this awesome feature, which sadly has no documentation for the moment (apart from tests, e.g. here).