Comments (10)
I'm having trouble with this part of the example:
// get pointers to enable read/write of data
status = rocfft_buffer_get_ptr(buffer_a, &raw_ptr_a);
status = rocfft_buffer_get_ptr(buffer_b, &raw_ptr_b);
// initialize input
...
The get_ptr return is something like a blocking clEnqueueMapBuffer call. How do you tell the API that you're done with the pointer and to migrate the data back to the device? Or if you prefer AMP concepts, the get_ptr return is something like an array_view, and there needs to be a call to synchronize() to copy it to the device.
from rocfft.
Shouldn't there be a size_in_bytes or some other argument to buffer_create_with_ptr to enable some basic range checking?
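A minimal sketch of the kind of range checking a size argument would allow. Everything here is hypothetical: the `sketch_buffer` type, the `sketch_*` functions, and the status values are illustrations only and are not part of the rocFFT header.

```c
#include <stddef.h>

/* Hypothetical handle, NOT the rocFFT type: the pointer is stored together
   with the size the caller declared at creation time. */
typedef struct {
    void  *ptr;            /* user-supplied memory */
    size_t size_in_bytes;  /* extent recorded at creation */
} sketch_buffer;

typedef enum { SKETCH_OK = 0, SKETCH_INVALID = 1 } sketch_status;

/* Record the pointer and its size so later calls can be validated. */
sketch_status sketch_buffer_create_with_ptr(sketch_buffer *buf, void *ptr,
                                            size_t size_in_bytes)
{
    if (buf == NULL || ptr == NULL || size_in_bytes == 0)
        return SKETCH_INVALID;
    buf->ptr = ptr;
    buf->size_in_bytes = size_in_bytes;
    return SKETCH_OK;
}

/* Check that [offset, offset + length) lies inside the buffer.
   The subtraction form avoids overflow in offset + length. */
sketch_status sketch_buffer_check_range(const sketch_buffer *buf,
                                        size_t offset, size_t length)
{
    if (buf == NULL || offset > buf->size_in_bytes)
        return SKETCH_INVALID;
    if (length > buf->size_in_bytes - offset)
        return SKETCH_INVALID;
    return SKETCH_OK;
}
```

With the size recorded, a transform that needs more bytes than the buffer holds could be rejected up front instead of reading or writing past the end of the allocation.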
from rocfft.
After thinking about this a bit more, and after some more discussion and input from Kent and Timmy, here are my thoughts:
First, here are some of the assumptions.
- The decision has been made to keep the rocFFT C header independent of the HIP or HC C++ headers
- Hide any HIP/HC C++ related types behind void pointers
- The C interface should be consumable by vanilla C programmers and from other languages such as Fortran, Python, etc.
Given this, I propose 2 approaches to managing device memory.
Approach 1:
- rocFFT C interface should be used ONLY by programmers who are already developing with HIP or HC C++
- Have NO memory related APIs in the rocFFT header
- Whenever we need to get device memory from the user, we obtain it through void pointer (a value returned by call to hipMalloc in the application) in the library API.
In addition to the rocFFT interface, we provide an exact replica of the FFTW interface (which hides all GPU programming) to support other languages. Here, we manage all device memory within the library; the user only sees host memory regions, allocated with calls such as fftw_malloc. Within the execute API function, we copy the data from host to device, compute immediately, and write the result back. This has a performance cost, but it keeps the semblance of an ordinary C library for the user.
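The copy-in, compute, copy-out flow inside such an execute call can be sketched roughly as follows. This is only an illustration under loud assumptions: the `mock_*` helpers stand in for device allocation and copies (a real implementation would presumably use hipMalloc, hipMemcpy, and hipFree), the "compute" step is a placeholder that doubles each value rather than an FFT, and `sketch_execute_host` is a hypothetical name.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Stand-ins for device allocation and copies (assumption: a real library
   would call the HIP runtime here). Plain host memory is used so the
   sketch runs anywhere. */
static void *mock_device_alloc(size_t bytes) { return malloc(bytes); }
static void  mock_device_free(void *p)       { free(p); }
static void  mock_copy(void *dst, const void *src, size_t bytes)
{
    memcpy(dst, src, bytes);
}

/* Placeholder for the transform kernel: doubles each value. NOT an FFT. */
static void mock_device_compute(float *dev, size_t n)
{
    for (size_t i = 0; i < n; ++i)
        dev[i] *= 2.0f;
}

/* Hypothetical FFTW-style execute: the caller only ever sees host
   pointers. Returns 0 on success, -1 on bad arguments or failed
   allocation. */
int sketch_execute_host(const float *host_in, float *host_out, size_t n)
{
    if (host_in == NULL || host_out == NULL || n == 0)
        return -1;

    size_t bytes = n * sizeof(float);
    float *dev = mock_device_alloc(bytes);
    if (dev == NULL)
        return -1;

    mock_copy(dev, host_in, bytes);    /* host -> device */
    mock_device_compute(dev, n);       /* compute on the "device" */
    mock_copy(host_out, dev, bytes);   /* device -> host */

    mock_device_free(dev);
    return 0;
}
```

Every call pays for an allocation and two copies, which is the performance cost mentioned above; in exchange, the caller never handles a device pointer.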
Approach 2:
- rocFFT C interface can be consumed by anyone (with or without the use of HIP or HC/C++ programming in the application)
- Must have memory related APIs in the library header; they should let the user perform these basic tasks: create/free a device memory region, copy data from/to the device
- Should work harmoniously with any mixing of hipMalloc and rocFFT malloc
Providing replica FFTW interface is not needed in this case, but can be done for convenience.
My current rocFFT header more or less follows approach 2, but I realize I have not provided all of the features. I only created entry points for allocating and freeing device memory. There is no explicit API for copying data between host and device. I assumed the application developer could always use HIP functions for these, but I realize this is not an option for programmers in other languages. I feel that if we are going to have a library abstraction layer that deals with memory management, then it should be independent of rocFFT. Maybe a base/common library that other libs such as rocBLAS can use.
I am starting to favor approach 1. I am going to delete the memory related APIs from rocFFT. I can document how device memory created with HIP or HC C++ needs to be passed to the library. There would not be any type safety, though, with void pointer arguments.
@b-sumner with regard to your question above, my idea with the function 'rocfft_buffer_get_ptr' is to give the pointer created by the call to hipMalloc inside the library back to the user; it is a very simple non-blocking call. This pointer would be useless for dereferencing or pointer arithmetic on the host side.
from rocfft.
I am also an advocate of approach 1.
> rocFFT C interface should be used ONLY by programmer who is already developing with HIP or HC C++
I think this should just be framed to say that native C interfaces enable language wrappers. It can be used by anybody, and anybody can make a language wrapper if they so desire.
> Whenever we need to get device memory from the user, we obtain it through void pointer (a value returned by call to hipMalloc in the application) in the library API
A void* can come from multiple sources: hipMalloc, hc::am_alloc, and clSVMAlloc(). So, void* should enable a very flexible, non-typed interface.
> In addition to rocFFT interface, we provide an exact replica of FFTW interface (which hides all GPU programming) to support other languages.
I would say we provide FFTW interfaces to facilitate quick CPU ports. I don't think the primary purpose should be to enable other language wrappers; they should wrap the native C interface.
from rocfft.
Thanks for the feedback, Kent; I am in agreement. I think it would be good to show example wrappers for other languages such as Python and Fortran, to show what is possible and to provoke interest.
from rocfft.
Yes, I think we should provide wrappers for C++ and Python. C++ will be more under our direct control. We will need to figure out how to create queues and allocate device memory in Python, but I think there will be a way. I think Python will be the next easiest; I am not sure how to make queues or device memory in Fortran. It might mean making wrappers for HIP or HC.
from rocfft.
Is there any reason both input and output to the execute function are pointers to arrays? https://github.com/RadeonOpenCompute/rocFFT/blob/master/src/include/rocfft.h#L91
Shouldn't the input just be an array and the output a pointer to an array, ideally?
from rocfft.
Now that I think about it, neither one of them needs to be a pointer to an array since output is pre-allocated.
Is this done in case the n transforms are not part of the same buffer?
from rocfft.
@pavanky the reason is to support planar formats. Both in and out are pointers to an array of void pointers. The length of this array is either 1 or 2: it is 1 if the data is in complex interleaved format (one buffer holds both real and imaginary parts) and 2 if the data is in complex planar format (real and imaginary parts in separate buffers).
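To make the two layouts concrete, here is a small host-only sketch of how the array of void pointers would be built and read in each case. The names (`sketch_layout`, `sketch_buffer_count`, `sketch_read`, `sketch_layouts_agree`) are hypothetical and are not taken from rocfft.h; ordinary host arrays stand in for device buffers.

```c
#include <stddef.h>

/* Hypothetical layout tag, for illustration only. */
typedef enum { SKETCH_INTERLEAVED, SKETCH_PLANAR } sketch_layout;

/* Number of entries expected in the buffer array for a given layout. */
size_t sketch_buffer_count(sketch_layout layout)
{
    return layout == SKETCH_PLANAR ? 2 : 1;
}

/* Read complex element i through an array of void pointers.
   Interleaved: buffers[0] holds re0, im0, re1, im1, ...
   Planar:      buffers[0] holds the real parts, buffers[1] the imaginary. */
void sketch_read(void *const buffers[], sketch_layout layout, size_t i,
                 float *re, float *im)
{
    if (layout == SKETCH_INTERLEAVED) {
        const float *data = (const float *)buffers[0];
        *re = data[2 * i];
        *im = data[2 * i + 1];
    } else {
        const float *real = (const float *)buffers[0];
        const float *imag = (const float *)buffers[1];
        *re = real[i];
        *im = imag[i];
    }
}

/* Store the same two complex values, (1+2i) and (3+4i), in both layouts
   and check that they read back identically. Returns 1 if they agree. */
int sketch_layouts_agree(void)
{
    float interleaved[4] = {1.0f, 2.0f, 3.0f, 4.0f};
    float real[2] = {1.0f, 3.0f};
    float imag[2] = {2.0f, 4.0f};

    void *in_interleaved[1] = { interleaved };  /* array length 1 */
    void *in_planar[2]      = { real, imag };   /* array length 2 */

    for (size_t i = 0; i < 2; ++i) {
        float re_a, im_a, re_b, im_b;
        sketch_read(in_interleaved, SKETCH_INTERLEAVED, i, &re_a, &im_a);
        sketch_read(in_planar, SKETCH_PLANAR, i, &re_b, &im_b);
        if (re_a != re_b || im_a != im_b)
            return 0;
    }
    return 1;
}
```

Using one array-of-pointers parameter for both cases keeps a single function signature regardless of layout; only the array length changes.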
from rocfft.
@bragadeesh Yeah, that makes sense. Thanks for the clarification.
from rocfft.