Comments (9)
The hipfftPlanMany example works, i.e. if one computes the inverse with the same plan settings, the error is small. Changing the sample as follows gives a wrong result.
--- clients/samples/hipfft/hipfft_planmany_2d_z2z.cpp 2019-12-03 17:33:03.676271591 +0100
+++ clients/samples/hipfft/hipfft_planmany_2d_z2z.cpp 2019-12-03 17:32:05.728933650 +0100
@@ -30,21 +30,21 @@
<< "hipfft 2D double-precision complex-to-complex transform using advanced interface\n";
int rank = 2;
- int n[2] = {4, 5};
- int howmany = 3;
+ int n[2] = {4, 4};
+ int howmany = 4;
// array is contiguous in memory
- int istride = 1;
+ int istride = 4;
// in-place transforms require istride=ostride
int ostride = istride;
// we choose to have no padding around our data:
- int inembed[2] = {istride * n[0], istride * n[1]};
+ int inembed[2] = {n[0], n[1]};
// in-place transforms require inembed=oneembed:
int onembed[2] = {inembed[0], inembed[1]};
- int idist = inembed[0] * inembed[1];
- int odist = onembed[0] * onembed[1];
+ int idist = 1;
+ int odist = 1;
std::cout << "n: " << n[0] << " " << n[1] << "\n"
<< "howmany: " << howmany << "\n"
@@ -54,7 +54,8 @@
<< "idist: " << idist << "\todist: " << odist << "\n"
<< std::endl;
- std::vector<std::complex<double>> data(howmany * idist);
+ std::vector<std::complex<double>> data(4*4*4);
+ std::vector<std::complex<double>> cdata(4*4*4);
const auto total_bytes = data.size() * sizeof(decltype(data)::value_type);
std::cout << "input:\n";
@@ -96,9 +97,11 @@
hipMemcpy(d_in_out, (void*)data.data(), total_bytes, hipMemcpyHostToDevice);
result = hipfftExecZ2Z(hipPlan, d_in_out, d_in_out, HIPFFT_FORWARD);
+ result = hipfftExecZ2Z(hipPlan, d_in_out, d_in_out, HIPFFT_BACKWARD);
- hipMemcpy((void*)data.data(), d_in_out, total_bytes, hipMemcpyDeviceToHost);
+ hipMemcpy((void*)cdata.data(), d_in_out, total_bytes, hipMemcpyDeviceToHost);
+ const double s = 1.0/(n[0]*n[1]);
std::cout << "output:\n";
for(int ibatch = 0; ibatch < howmany; ++ibatch)
{
@@ -108,7 +111,7 @@
for(int j = 0; j < onembed[1]; j++)
{
const auto pos = ibatch * odist + i * onembed[1] + j;
- std::cout << data[pos] << " ";
+ std::cout << s*cdata[pos] << " ";
}
std::cout << "\n";
}
@@ -116,5 +119,16 @@
}
std::cout << std::endl;
+ double err = 0.0;
+ for (int i = 0; i < data.size(); i++)
+ {
+ double ierr = std::abs(s*cdata[i] - data[i]);
+ if (ierr > err)
+ {
+ err = ierr;
+ }
+ }
+ std::cout << "max error: " << err << std::endl;
+
hipFree(d_in_out);
}
from rocfft.
It looks like you're changing a lot of the parameters; some combinations don't make sense, and can't produce correct results. Could you write down the input and output strides and distances along with your problem size, please?
from rocfft.
Sure.
problem size: 4x4x4
strides: 4, 16
distance: 1
I am only trying complex transforms, so the input and output strides are equal.
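To make the layout above concrete, here is a minimal CPU-only sketch that checks whether strides 4, 16 with distance 1 address each of the 4x4x4 = 64 buffer slots exactly once. It assumes the usual convention that element (i, j) of batch b lives at `b * dist + i * stride0 + j * stride1`; the helper names `element_offset` and `layout_is_bijective` are made up for illustration and are not part of rocFFT or hipFFT.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Offset of element (i, j) of batch b in a strided, batched 2D layout.
// Assumed convention: stride0 belongs to index i, stride1 to index j.
std::size_t element_offset(std::size_t b, std::size_t i, std::size_t j,
                           std::size_t stride0, std::size_t stride1, std::size_t dist)
{
    return b * dist + i * stride0 + j * stride1;
}

// Returns true if every slot in [0, n0 * n1 * batch) is addressed exactly once,
// i.e. the layout neither overlaps nor leaves holes in a tightly packed buffer.
bool layout_is_bijective(std::size_t n0, std::size_t n1, std::size_t batch,
                         std::size_t stride0, std::size_t stride1, std::size_t dist)
{
    const std::size_t total = n0 * n1 * batch;
    std::vector<int> hits(total, 0);
    for(std::size_t b = 0; b < batch; ++b)
        for(std::size_t i = 0; i < n0; ++i)
            for(std::size_t j = 0; j < n1; ++j)
            {
                const std::size_t pos = element_offset(b, i, j, stride0, stride1, dist);
                if(pos >= total)
                    return false; // would read or write past the packed buffer
                ++hits[pos];
            }
    for(int h : hits)
        if(h != 1)
            return false; // a slot was skipped or addressed more than once
    return true;
}
```

Calling `layout_is_bijective(4, 4, 4, 4, 16, 1)` checks the interleaved layout from this comment, and `layout_is_bijective(4, 4, 4, 1, 4, 16)` checks the contiguous one. Both describe valid, non-overlapping layouts under the stated convention, so the parameter combination itself is consistent.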
from rocfft.
Is the transform in-place or out-of-place? I just want to make sure it's not a duplicate of #270.
from rocfft.
The transformation is in-place and the forward transform seems to be giving correct results.
from rocfft.
So, one thing that might be causing confusion is that hipFFT is row-major ("C-style") and rocFFT is column-major ("FORTRAN-style"). So a hipFFT transform has lengths {nx, ny}, and the equivalent rocFFT transform has lengths {ny, nx}. The strides also have to be reversed.
This may explain why hipFFT is working but rocFFT isn't.
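As a rough illustration of that reversal, the sketch below turns a row-major description in the hipfftPlanMany style (`n`, `inembed`, `istride`) into reversed lengths and per-dimension strides. It assumes hipFFT follows the cuFFT-style advanced layout, where the last dimension has stride `istride` and each earlier dimension multiplies in the `inembed` extents that follow it. The struct `ColumnMajorLayout` and the function `to_column_major` are hypothetical names, not library API.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical container for a column-major description.
struct ColumnMajorLayout
{
    std::vector<std::size_t> lengths; // fastest-varying dimension first
    std::vector<std::size_t> strides; // same order as lengths
};

// Convert a row-major (n, embed, stride) description into reversed lengths and strides.
// Assumption: element (x0, ..., xr) sits at
//   stride * (((x0 * embed[1] + x1) * embed[2] + x2) ... + xr),
// so embed[0] does not enter the per-dimension strides.
ColumnMajorLayout to_column_major(const std::vector<std::size_t>& n,
                                  const std::vector<std::size_t>& embed,
                                  std::size_t stride)
{
    assert(!n.empty() && n.size() == embed.size());
    const std::size_t rank = n.size();

    // Per-dimension strides in row-major order (slowest dimension first).
    std::vector<std::size_t> row_major_strides(rank);
    std::size_t s = stride;
    for(std::size_t d = rank; d-- > 0;)
    {
        row_major_strides[d] = s;
        s *= embed[d];
    }

    // Reverse both arrays to get the column-major view.
    ColumnMajorLayout out{n, row_major_strides};
    std::reverse(out.lengths.begin(), out.lengths.end());
    std::reverse(out.strides.begin(), out.strides.end());
    return out;
}
```

For the modified sample (`n = {4, 4}`, `inembed = {4, 4}`, `istride = 4`) this yields strides {4, 16}, which matches the rocFFT strides quoted earlier in the thread.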
from rocfft.
Sorry if my second comment was confusing: neither one is working.
You are right that the results differ between rocFFT and hipFFT. However, the inverse should be the input times a scaling factor.
To clarify:
rocFFT is working for:
strides: 1, 4
distance: 16
For
strides: 4, 16
distance: 1
I get the right result for the forward transform (I checked it against FFTW in Julia, which is column-major), but the inverse is not equal to the original (times a constant).
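For reference, the expected round-trip behaviour can be checked on the CPU without any FFT library. The sketch below is a direct-summation (slow) unnormalized 2D DFT over a strided, batched layout, not rocFFT or FFTW code; `Layout`, `naive_dft_2d` and `round_trip_error` are illustrative names. Under these assumptions, forward followed by backward should return the input multiplied by n0 * n1 for any valid layout, including strides 4, 16 with distance 1.

```cpp
#include <cassert>
#include <cmath>
#include <complex>
#include <cstddef>
#include <vector>

using cplx = std::complex<double>;

// Hypothetical description of a strided, batched 2D layout.
struct Layout
{
    std::size_t n0, n1, batch, stride0, stride1, dist;
};

// Unnormalized 2D DFT of every batch by direct summation.
// sign = -1 gives the forward transform, sign = +1 the backward one.
std::vector<cplx> naive_dft_2d(const std::vector<cplx>& in, const Layout& l, int sign)
{
    const double two_pi = 2.0 * std::acos(-1.0);
    std::vector<cplx> out(in.size());
    for(std::size_t b = 0; b < l.batch; ++b)
        for(std::size_t k0 = 0; k0 < l.n0; ++k0)
            for(std::size_t k1 = 0; k1 < l.n1; ++k1)
            {
                cplx sum(0.0, 0.0);
                for(std::size_t i = 0; i < l.n0; ++i)
                    for(std::size_t j = 0; j < l.n1; ++j)
                    {
                        const double angle
                            = sign * two_pi
                              * (static_cast<double>(k0 * i) / l.n0
                                 + static_cast<double>(k1 * j) / l.n1);
                        sum += in[b * l.dist + i * l.stride0 + j * l.stride1]
                               * std::polar(1.0, angle);
                    }
                out[b * l.dist + k0 * l.stride0 + k1 * l.stride1] = sum;
            }
    return out;
}

// Maximum of |s * backward(forward(x)) - x| with s = 1 / (n0 * n1).
double round_trip_error(const std::vector<cplx>& data, const Layout& l)
{
    const std::vector<cplx> fwd  = naive_dft_2d(data, l, -1);
    const std::vector<cplx> back = naive_dft_2d(fwd, l, +1);
    const double s = 1.0 / static_cast<double>(l.n0 * l.n1);
    double err = 0.0;
    for(std::size_t i = 0; i < data.size(); ++i)
        err = std::max(err, std::abs(s * back[i] - data[i]));
    return err;
}
```

With `Layout{4, 4, 4, 4, 16, 1}` and a 64-element input, `round_trip_error` should be at the level of floating-point rounding. A library result that deviates from this by much more than rounding for the same layout points to a problem in the inverse path rather than in the choice of parameters.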
from rocfft.
Thanks for the info; I am making some time to look into this issue.
This may be solved by a commit that is under review. Could you try
https://github.com/malcolmroberts/rocFFT/tree/fix_transpose_for_1D_stride
and see if this solves your problem?
from rocfft.
Closing due to inactivity.
from rocfft.