springer13 / hptt

High-Performance Tensor Transpose library

License: BSD 3-Clause "New" or "Revised" License

Makefile 1.20% C++ 84.08% Shell 4.27% Python 8.13% C 1.27% CMake 1.05%

high-performance-computing multidimensional-arrays tensor tensor-transposition tensors transposition

hptt's Issues

Support for Travis CI

Add support for Travis CI

Either CMake config file or pkgconfig .pc needed

Currently neither is generated:

a ./opt/local/lib/libhptt.a
a ./opt/local/include/compute_node.h
a ./opt/local/include/hptt.h
a ./opt/local/include/hptt_types.h
a ./opt/local/include/macros.h
a ./opt/local/include/plan.h
a ./opt/local/include/transpose.h
a ./opt/local/include/utils.h

It is probably also better to install headers to ${prefix}/include/hptt and not dump them into a common folder :)

Problem with executing benchmark and projects

Hello,

I compiled the benchmark using the Makefile, but I got an error when I tried to run the executable:
"Error while loading shared libraries: libhptt.so: cannot open shared object file: No such file or directory"
What could the problem be? I followed the installation instructions exactly.

Thanks in advance!

benchmark/reference.cpp fails to compile: error: cannot convert 'std::complex<float>' to 'float' in assignment

g++10 -O3 -std=c++11 -I../src/  -c ../benchmark/reference.cpp -o ../benchmark/reference.o
../benchmark/reference.cpp: In instantiation of 'void transpose_ref(uint32_t*, uint32_t*, int, const floatType*, floatType, floatType*, floatType, bool) [with floatType = float; uint32_t = unsigned int]':
../benchmark/reference.cpp:74:58:   required from here
../benchmark/reference.cpp:60:30: error: cannot convert 'std::complex<float>' to 'float' in assignment
   60 |                B_[i] = alpha * std::conj(A_[i * strideAinner]);
      |                        ~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
      |                              |
      |                              std::complex<float>
../benchmark/reference.cpp:66:64: error: cannot convert 'std::complex<float>' to 'float' in assignment
   66 |                B_[i] = alpha * std::conj(A_[i * strideAinner]) + beta * B_[i];
      |                        ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~
      |                                                                |
      |                                                                std::complex<float>
../benchmark/reference.cpp: In instantiation of 'void transpose_ref(uint32_t*, uint32_t*, int, const floatType*, floatType, floatType*, floatType, bool) [with floatType = double; uint32_t = unsigned int]':
../benchmark/reference.cpp:80:60:   required from here
../benchmark/reference.cpp:60:30: error: cannot convert 'std::complex<double>' to 'double' in assignment
   60 |                B_[i] = alpha * std::conj(A_[i * strideAinner]);
      |                        ~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
      |                              |
      |                              std::complex<double>
../benchmark/reference.cpp:66:64: error: cannot convert 'std::complex<double>' to 'double' in assignment
   66 |                B_[i] = alpha * std::conj(A_[i * strideAinner]) + beta * B_[i];
      |                        ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~
      |                                                                |
      |                                                                std::complex<double>
gmake[1]: *** [Makefile:32: ../benchmark/reference.o] Error 1
gmake[1]: Leaving directory '/disk-samsung/freebsd-ports/math/hptt/work/hptt-1.0.5-18-g9425386/testframework'
*** Error code 2

Version: 1.0.5-18-g9425386
gcc-10
FreeBSD 13.1

API description

Hi,
Would it be possible to give a few more details on the API?

Does the size vector correspond to the sizes of A or of B?
Is i_1 the fastest-varying index (elements i_1 and i_1+1 are contiguous in memory), or is it i_N?

Thanks in advance,

Laurent
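
For readers with the same question, one way to pin the convention down is a small reference loop. The sketch below is not HPTT's implementation, and the name transpose_colmajor is hypothetical. It assumes column-major storage (index 0 is the contiguous one), that the size vector describes A, and that dimension d of B takes its extent from size[perm[d]]. The last assumption is how the 9-dimensional example later on this page derives its output sizes, and the numThreads=1 output shown there is consistent with this reading.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical reference (not HPTT code):
//   B(i_perm[0], i_perm[1], ...) = alpha * A(i_0, i_1, ...) + beta * B(...)
// Assumes column-major storage: index 0 is the fastest-varying one.
std::vector<double> transpose_colmajor(const std::vector<double>& A,
                                       const std::vector<int>& sizeA,
                                       const std::vector<int>& perm,
                                       double alpha, double beta,
                                       std::vector<double> B) {
    const int dim = static_cast<int>(sizeA.size());
    assert(static_cast<int>(perm.size()) == dim);
    assert(B.size() == A.size());

    // Strides of B: dimension d of B has the extent of dimension perm[d] of A.
    std::vector<std::size_t> strideB(dim);
    std::size_t stride = 1;
    for (int d = 0; d < dim; ++d) {
        strideB[d] = stride;
        stride *= static_cast<std::size_t>(sizeA[perm[d]]);
    }

    std::vector<std::size_t> idx(dim);
    for (std::size_t n = 0; n < A.size(); ++n) {
        // Decode the linear offset n into the multi-index of A.
        std::size_t rest = n;
        for (int d = 0; d < dim; ++d) {
            idx[d] = rest % sizeA[d];
            rest /= sizeA[d];
        }
        // Position of the same element in B.
        std::size_t offsetB = 0;
        for (int d = 0; d < dim; ++d) offsetB += idx[perm[d]] * strideB[d];
        B[offsetB] = alpha * A[n] + beta * B[offsetB];
    }
    return B;
}
```

Under these assumptions a 2x3 matrix stored as 0,1,2,3,4,5 and transposed with perm = {1,0} comes out as 0,2,4,1,3,5.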

Compiling benchmark

When I compile the benchmark and reference files by running make in the /benchmark folder, I get the following error:
reference.cpp:60:30: error: cannot convert 'std::complex<float>' to 'float' in assignment
   60 |                B_[i] = alpha * std::conj(A_[i * strideAinner]);
      |                        ~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
      |                              |
      |                              std::complex<float>

alpha is a floatType, B_[i] is a floatType, A_[i * strideAinner] is a FloatComplex. Which of them should I change to match the types?
Thanks!
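
A possible direction, offered as a sketch rather than the project's fix: since C++11, std::conj called with a real argument returns a std::complex, so the product can no longer be assigned to a float or double. A small overloaded helper that is the identity for real types avoids changing any of the three operand types. The names conj_if_complex and scale_conj are hypothetical and do not exist in HPTT.

```cpp
#include <cassert>
#include <complex>

// Hypothetical helper: identity for real types ...
template <typename T>
T conj_if_complex(T x) { return x; }

// ... and std::conj for std::complex (this overload is the more specialized one).
template <typename T>
std::complex<T> conj_if_complex(std::complex<T> x) { return std::conj(x); }

// Mirrors the shape of the failing assignment: B[i] = alpha * conj(A[i]).
template <typename floatType>
void scale_conj(const floatType* A, floatType alpha, floatType* B, int n) {
    for (int i = 0; i < n; ++i)
        B[i] = alpha * conj_if_complex(A[i]);
}
```

With this helper the same template body compiles for float, double, std::complex<float> and std::complex<double>.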

Conjugation flag for 'A'

Would it make sense to have a conjugation flag for A? It would be needed, for example, to implement Hermitian conjugation with this library.

Inconsistent BSD vs LGPLv3 license text

Reading through the HPTT code, I noticed that some files (e.g., hptt.h and hptt.cpp) have LGPLv3 license headers, even though the top-level license text says that the license is 3-clause BSD.

Is this an oversight?

Thanks!

invalid read in create_plan for scalar build

I found a particularly 'challenging' transposition that's causing trouble. The library was built with the GNU compiler, no optimizations, via make scalar. Here is a minimal test:

#include <hptt.h>

int main(){
  int order = 2;
  int perm[] = {1,0};
  int size[] = {1,1};
  double st_buffer[4];
  double new_buffer[4];
  int numThreads = 1;

  auto plan = hptt::create_plan( perm, order,
      1.0, ((double*)st_buffer), size, NULL,
      0.0, ((double*)new_buffer), NULL,
      hptt::ESTIMATE, numThreads );


  return 0;
}

Executing this in test.cxx in the hptt main folder as

g++ -O0 -std=c++0x test.cxx -I./src/ ./lib/libhptt.a  && valgrind ./a.out

gives

==27584== Invalid read of size 8
==27584==    at 0x41D90E: hptt::Transpose<double>::createPlans(std::vector<std::shared_ptr<hptt::Plan>, std::allocator<std::shared_ptr<hptt::Plan> > >&) const (hptt.cpp:1799)
==27584==    by 0x41CF33: hptt::Transpose<double>::createPlan() (hptt.cpp:1693)
==27584==    by 0x4047AE: hptt::create_plan(int const*, int, double, double const*, int const*, int const*, double, double*, int const*, hptt::SelectionMethod, int, int const*) (hptt.cpp:1926)
==27584==    by 0x401516: main (in /home/edgar/work/hptt-v1.0/a.out)
==27584==  Address 0x5ab5ef8 is 8 bytes before a block of size 16 alloc'd
==27584==    at 0x4C2E0EF: operator new(unsigned long) (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==27584==    by 0x498796: __gnu_cxx::new_allocator<unsigned long>::allocate(unsigned long, void const*) (in /home/edgar/work/hptt-v1.0/a.out)
==27584==    by 0x49873B: std::allocator_traits<std::allocator<unsigned long> >::allocate(std::allocator<unsigned long>&, unsigned long) (in /home/edgar/work/hptt-v1.0/a.out)
==27584==    by 0x4984D2: std::_Vector_base<unsigned long, std::allocator<unsigned long> >::_M_allocate(unsigned long) (in /home/edgar/work/hptt-v1.0/a.out)
==27584==    by 0x49815D: std::vector<unsigned long, std::allocator<unsigned long> >::_M_default_append(unsigned long) (vector.tcc:557)
==27584==    by 0x4129C0: std::vector<unsigned long, std::allocator<unsigned long> >::resize(unsigned long) (stl_vector.h:676)
==27584==    by 0x41AE79: hptt::Transpose<double>::Transpose(int const*, int const*, int const*, int const*, int, double const*, double, double*, double, hptt::SelectionMethod, int, int const*) (hptt.h:145)
==27584==    by 0x4641AC: void __gnu_cxx::new_allocator<hptt::Transpose<double> >::construct<hptt::Transpose<double>, int const*&, int const*&, int const*&, int const*&, int const&, double const*&, double const&, double*&, double const&, hptt::SelectionMethod const&, int const&, int const*&>(hptt::Transpose<double>*, int const*&, int const*&, int const*&, int const*&, int const&, double const*&, double const&, double*&, double const&, hptt::SelectionMethod const&, int const&, int const*&) (in /home/edgar/work/hptt-v1.0/a.out)
==27594==    by 0x463DB3: void std::allocator_traits<std::allocator<hptt::Transpose<double> > >::construct<hptt::Transpose<double>, int const*&, int const*&, int const*&, int const*&, int const&, double const*&, double const&, double*&, double const&, hptt::SelectionMethod const&, int const&, int const*&>(std::allocator<hptt::Transpose<double> >&, hptt::Transpose<double>*, int const*&, int const*&, int const*&, int const*&, int const&, double const*&, double const&, double*&, double const&, hptt::SelectionMethod const&, int const&, int const*&) (in /home/edgar/work/hptt-v1.0/a.out)
==27594==    by 0x463997: std::_Sp_counted_ptr_inplace<hptt::Transpose<double>, std::allocator<hptt::Transpose<double> >, (__gnu_cxx::_Lock_policy)2>::_Sp_counted_ptr_inplace<int const*&, int const*&, int const*&, int const*&, int const&, double const*&, double const&, double*&, double const&, hptt::SelectionMethod const&, int const&, int const*&>(std::allocator<hptt::Transpose<double> >, int const*&, int const*&, int const*&, int const*&, int const&, double const*&, double const&, double*&, double const&, hptt::SelectionMethod const&, int const&, int const*&) (shared_ptr_base.h:522)
==27594==    by 0x4635B3: std::__shared_count<(__gnu_cxx::_Lock_policy)2>::__shared_count<hptt::Transpose<double>, std::allocator<hptt::Transpose<double> >, int const*&, int const*&, int const*&, int const*&, int const&, double const*&, double const&, double*&, double const&, hptt::SelectionMethod const&, int const&, int const*&>(std::_Sp_make_shared_tag, hptt::Transpose<double>*, std::allocator<hptt::Transpose<double> > const&, int const*&, int const*&, int const*&, int const*&, int const&, double const*&, double const&, double*&, double const&, hptt::SelectionMethod const&, int const&, int const*&) (shared_ptr_base.h:617)
==27594==    by 0x4632E5: std::__shared_ptr<hptt::Transpose<double>, (__gnu_cxx::_Lock_policy)2>::__shared_ptr<std::allocator<hptt::Transpose<double> >, int const*&, int const*&, int const*&, int const*&, int const&, double const*&, double const&, double*&, double const&, hptt::SelectionMethod const&, int const&, int const*&>(std::_Sp_make_shared_tag, std::allocator<hptt::Transpose<double> > const&, int const*&, int const*&, int const*&, int const*&, int const&, double const*&, double const&, double*&, double const&, hptt::SelectionMethod const&, int const&, int const*&) (shared_ptr_base.h:1096)
==27594==
==27594== Invalid read of size 8
==27594==    at 0x41D935: hptt::Transpose<double>::createPlans(std::vector<std::shared_ptr<hptt::Plan>, std::allocator<std::shared_ptr<hptt::Plan> > >&) const (hptt.cpp:1800)
==27594==    by 0x41CF33: hptt::Transpose<double>::createPlan() (hptt.cpp:1693)
...

don't ask me why I want to do this transposition :).

Testing Framework

Create a testing framework that tests HPTT for many (random) tensor transpositions, sizes, numbers of threads, data types, outerSizes, beta=0, and beta!=0.

One could use benchmark/reference.cpp as a reference implementation.

"ValueError: repeated axis in transpose" in hptt.ascontiguousarray, not in np.ascontiguousarray

Hi,

I tried to use the Python API of hptt in analogy to NumPy. For the following code, I get "ValueError: repeated axis in transpose" from hptt.ascontiguousarray, but not from np.ascontiguousarray. Is this expected? If so, what is the reason? Thanks

import numpy as np
import hptt
import copy

n_a = n_b = n_c = 1
n_d = n_e = n_f = 2
dim_a = (n_a, n_b, n_c, n_d, n_e, n_f)
a = np.random.random(dim_a)
b = copy.deepcopy(a)

b = np.transpose(b, (1,0,2,3,5,4))
#print(b.flags)
#b = np.ascontiguousarray(b)
b = hptt.ascontiguousarray(b)

Clang build OpenMP requirement

It would be nice to be able to build without OpenMP, for instance when using clang. My understanding is that external modules are still necessary to use clang with OpenMP. My version of clang is

clang version 3.8.0-2ubuntu4 (tags/RELEASE_380/final)
Target: x86_64-pc-linux-gnu
Thread model: posix

I get the following error when building, both with and without -fopenmp:

src/hptt.cpp:29:10: fatal error: 'omp.h' file not found
#include <omp.h>

Transposition into sub-tensor

Hello,
I tried to transpose a 3x3x3 tensor into a 3x3x3 sub-tensor of a 5x5x5 tensor, but the result is unexpected. The following code snippet is what I tried:

std::vector<double> A(125), B(27, 1);
std::iota(A.begin(), A.end(), 0);
double* aliasA = &A[0];
std::vector<int> perm = {0,1,2};
std::vector<int> size = {3,3,3};
std::vector<int> outerSize = {5,5,5};
auto plan = hptt::create_plan(&perm[0], 3,
                              1, &B[0],   &size[0], NULL,
                              10, aliasA,           &outerSize[0],
                              hptt::ESTIMATE, 1);
plan->execute();
for(int i = 0; i < 125; i++) std::cout << A[i] << std::endl;

I would expect the following tensor as the result:

  1,  11,  21,   3,   4,
 51,  61,  71,   8,   9,
101, 111, 121,  13,  14,
 15,  16,  17,  18,  19,
 20,  21,  22,  23,  24,

251, 261, 271,  28,  29,
301, 311, 321,  33,  34,
351, 361, 371,  38,  39,
 40,  41,  42,  43,  44,
 45,  46,  47,  48,  49,

501, 511, 521,  53,  54,
551, 561, 571,  58,  59,
601, 611, 621,  63,  64,
 65,  66,  67,  68,  69,
 70,  71,  72,  73,  74,

 75,  76,  77,  78,  79,
 80,  81,  82,  83,  84,
 85,  86,  87,  88,  89,
 90,  91,  92,  93,  94,
 95,  96,  97,  98,  99,

100, 101, 102, 103, 104,
105, 106, 107, 108, 109,
110, 111, 112, 113, 114,
115, 116, 117, 118, 119,
120, 121, 122, 123, 124

However, the result ends up as the tensor

  1, 111,2111, 311,  41,
 51, 611,7111, 811,  91,
101,1111,12111,1311, 141,
 15,  16,  17,  18,  19,
 20,  21,  22,  23,  24,

 25,  26,  27,  28,  29,
 30,  31,  32,  33,  34,
 35,  36,  37,  38,  39,
 40,  41,  42,  43,  44,
 45,  46,  47,  48,  49,

 50,  51,  52,  53,  54,
 55,  56,  57,  58,  59,
 60,  61,  62,  63,  64,
 65,  66,  67,  68,  69,
 70,  71,  72,  73,  74,

 75,  76,  77,  78,  79,
 80,  81,  82,  83,  84,
 85,  86,  87,  88,  89,
 90,  91,  92,  93,  94,
 95,  96,  97,  98,  99,

100, 101, 102, 103, 104,
105, 106, 107, 108, 109,
110, 111, 112, 113, 114,
115, 116, 117, 118, 119,
120, 121, 122, 123, 124
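
The expected values above can be reproduced without HPTT by a plain triple loop. This is only a reference sketch for the identity permutation, assuming column-major storage and assuming the outer size describes the extents of the enclosing 5x5x5 array; update_subtensor is a hypothetical name, not an HPTT function.

```cpp
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical reference for the identity permutation only:
//   sub(i,j,k) = alpha * small(i,j,k) + beta * sub(i,j,k)
// where sub is the leading size[0] x size[1] x size[2] corner of `big`.
void update_subtensor(const std::vector<double>& small, const int size[3],
                      std::vector<double>& big, const int outerSize[3],
                      double alpha, double beta) {
    assert(small.size() == static_cast<std::size_t>(size[0]) * size[1] * size[2]);
    assert(big.size() ==
           static_cast<std::size_t>(outerSize[0]) * outerSize[1] * outerSize[2]);
    for (int k = 0; k < size[2]; ++k)
        for (int j = 0; j < size[1]; ++j)
            for (int i = 0; i < size[0]; ++i) {
                // Column-major offsets: the source uses the inner extents,
                // the destination uses the outer extents as leading dimensions.
                const std::size_t src = i + size[0] * (j + size[1] * k);
                const std::size_t dst = i + outerSize[0] * (j + outerSize[1] * k);
                big[dst] = alpha * small[src] + beta * big[dst];
            }
}
```

Called with a 27-element input of ones, the 125-element iota array, alpha = 1 and beta = 10, the loop touches each element of the 3x3x3 corner exactly once (offsets 0, 1, 2, 5, 6, 7, ..., 62), which gives 1, 11, 21, 3, 4, 51, ... as in the expected listing.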

Missing LICENSE information

I can't see a license or copyright information for this code. Could you please add one? Thanks!

Fix for MSVC

Three places need to be fixed in order to compile with MSVC:

The complex types in MSVC's C mode are _Fcomplex and _Dcomplex, which do not conform to C99, so xxx _Complex should be replaced with the following macros:

#ifndef HPTT_C_FLT_COMPLEX
#ifdef _MSC_VER
  #define HPTT_C_FLT_COMPLEX _Fcomplex
  #define HPTT_C_DBL_COMPLEX _Dcomplex
#else
  #define HPTT_C_FLT_COMPLEX float _Complex
  #define HPTT_C_DBL_COMPLEX double _Complex
#endif
#endif

The INLINE macro should use __forceinline with MSVC:

#if defined(__ICC) || defined(__INTEL_COMPILER) || defined(_MSC_VER)
# define INLINE __forceinline
#elif .....

MSVC does not support VLAs; _alloca should be used instead.

#ifdef _MSC_VER
#define HPTT_DECL_VLA(type_, name_, len_) type_* name_ = reinterpret_cast<type_*>(_alloca(sizeof(type_) * (len_)));
#else 
#define HPTT_DECL_VLA(type_, name_, len_) type_ name_[len_];
#endif

Any further update with new instruction sets like AVX2/AVX512?

The project was mostly finished about five years ago and only implements the transpose with AVX instructions. Will there be an update that supports newer instruction sets?

throw error rather than exit directly

When something such as a dimension is incorrect, hptt calls exit(-1) directly.
This is not good for debugging;
it would be better to throw an error instead.
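
As an illustration of the request, input validation could report problems through an exception that the caller can catch. The sketch below is a standalone example, not code from HPTT; check_permutation and is_valid_permutation are hypothetical names.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical validation helper: throws instead of terminating the process.
void check_permutation(const int* perm, int dim) {
    if (perm == nullptr || dim < 1)
        throw std::invalid_argument("perm must be non-null and dim must be positive");
    std::vector<bool> seen(dim, false);
    for (int d = 0; d < dim; ++d) {
        if (perm[d] < 0 || perm[d] >= dim)
            throw std::invalid_argument("perm[" + std::to_string(d) + "] is out of range");
        if (seen[perm[d]])
            throw std::invalid_argument("index " + std::to_string(perm[d]) +
                                        " appears more than once in perm");
        seen[perm[d]] = true;
    }
}

// Usage: the caller decides how to react to invalid input.
bool is_valid_permutation(const int* perm, int dim) {
    try {
        check_permutation(perm, dim);
        return true;
    } catch (const std::invalid_argument&) {
        return false;
    }
}
```

The exception message then says which argument was wrong, and a debugger or a language binding can intercept it before the process ends.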

Python API not working for complex arrays

Test fails for the python API when complex numbers are used. The culprit seems to be a missing parameter in pythonAPI/hptt/hptt.py at line 119 (setConjA is missing, which is instead present in the C handle).

Please allow the user to choose the type of library that is built: STATIC or SHARED

here

If STATIC is removed, CMake's user-settable variable BUILD_SHARED_LIBS would determine the library type.

Strange behaviour with numThreads>1 and execute_expert<useStream, false, betaIsNull>

The following code is OK for numThreads=1 but the result is different for numThreads>1.

    const int dim = 9;
    int sizeAx[dim];
    int sizeAy[dim];
    int sizeAz[dim];
    int perm[dim];


    perm[0] = 2;
    perm[1] = 5;
    perm[2] = 7;
    perm[3] = 8;
    perm[4] = 1;
    perm[5] = 4;
    perm[6] = 3;
    perm[7] = 0;
    perm[8] = 6;

    sizeAx[8] = 3;
    sizeAx[7] = 2;
    sizeAx[6] = 2;
    sizeAx[5] = 1;
    sizeAx[4] = 1;
    sizeAx[3] = 1;
    sizeAx[2] = 2;
    sizeAx[1] = 1;
    sizeAx[0] = 2;

    for (int d = 0; d < 9; d++) sizeAz[d] = sizeAx[perm[d]];
    for (int d = 0; d < 9; d++) sizeAy[d] = sizeAz[perm[d]];

    uint flatSize = 1;
    for (int d = 0; d < 9; d++) flatSize *= sizeAx[d];

    RealType * Ax = new RealType[flatSize];
    RealType * Ay = new RealType[flatSize];
    RealType * Az = new RealType[flatSize];

    for (uint i = 0; i < flatSize; i++) {
        Ax[i] = RealType(i);
        Ay[i] = 0.0;
        Az[i] = 0.0;
    }

    RealType alpha = 1.0;
    RealType beta = 0.0;

    const int numThreads = 4;

    auto planXZ = hptt::create_plan(perm, dim,
            alpha, Ax, sizeAx, NULL,
            beta, Az, NULL,
            hptt::ESTIMATE, numThreads);

    auto planZY = hptt::create_plan(perm, dim,
            alpha, Az, sizeAz, NULL,
            beta, Ay, NULL,
            hptt::ESTIMATE, numThreads);

    auto planYX = hptt::create_plan(perm, dim,
            alpha, Ay, sizeAy, NULL,
            beta, Ax, NULL,
            hptt::ESTIMATE, numThreads);


    const bool useStream = false;
    const bool useThreads = false;
    const bool betaIsNull = true;

    planXZ->execute_expert<useStream, useThreads, betaIsNull>();
    for (uint i = 0; i < flatSize; i++) std::cout << Az[i] << " ";
    std::cout << std::endl;
    planZY->execute_expert<useStream, useThreads, betaIsNull>();
    for (uint i = 0; i < flatSize; i++) std::cout << Ay[i] << " ";
    std::cout << std::endl;
    planYX->execute_expert<useStream, useThreads, betaIsNull>();
    for (uint i = 0; i < flatSize; i++) std::cout << Ax[i] << " ";
    std::cout << std::endl;

    delete[] Ax;
    delete[] Ay;
    delete[] Az;

Result for numThreads=1:

0 2 8 10 16 18 24 26 32 34 40 42 1 3 9 11 17 19 25 27 33 35 41 43 4 6 12 14 20 22 28 30 36 38 44 46 5 7 13 15 21 23 29 31 37 39 45 47
0 8 1 9 4 12 5 13 16 24 17 25 20 28 21 29 32 40 33 41 36 44 37 45 2 10 3 11 6 14 7 15 18 26 19 27 22 30 23 31 34 42 35 43 38 46 39 47
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47

Result for numThreads=4:
0 2 8 10 16 18 0 0 0 0 0 0 1 3 9 11 17 19 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 8 0 0 0 0 0 0 16 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 2 10 0 0 0 0 0 0 18 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 2 3 4 5 6 7 8 0 10 11 12 13 14 15 16 0 18 19 20 21 22 23 0 0 26 27 28 29 30 31 0 0 34 35 36 37 38 39 0 0 42 43 44 45 46 47
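
The test relies on the third transposition restoring the original data, which is why the last line for numThreads=1 reads 0 to 47. That holds because the permutation used here consists of three cycles of length three (0, 2, 7), (1, 5, 4) and (3, 8, 6), so applying it three times gives the identity. A short standalone check, with the hypothetical name permutation_order, independent of HPTT:

```cpp
#include <cassert>
#include <vector>

// Number of times `perm` must be applied before every index is back in place.
// Assumes `perm` is a valid permutation of 0..n-1 (otherwise the loop may not end).
int permutation_order(const std::vector<int>& perm) {
    const int n = static_cast<int>(perm.size());
    std::vector<int> current(n);
    for (int d = 0; d < n; ++d) current[d] = d;  // start from the identity

    int order = 0;
    bool isIdentity = false;
    while (!isIdentity) {
        // Compose the accumulated permutation with `perm` once more.
        std::vector<int> next(n);
        for (int d = 0; d < n; ++d) next[d] = current[perm[d]];
        current = next;
        ++order;

        isIdentity = true;
        for (int d = 0; d < n; ++d)
            if (current[d] != d) isIdentity = false;
    }
    return order;
}
```

For perm = {2,5,7,8,1,4,3,0,6} the function returns 3, so any output other than 0 to 47 after the third execute call indicates elements that were never written or were written to the wrong place.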

error: implicit conversion from 'const _Complex float' to 'float' is not permitted in C++

Build fails with clang-13:

===>  Building for hptt-1.0.5.18
[ 20% 4/5] /usr/bin/c++  -I/disk-samsung/freebsd-ports/math/hptt/work/hptt-1.0.5-18-g9425386/include -O2 -pipe -fno-omit-frame-pointer -fstack-protector-strong -fno-strict-aliasing -fno-omit-frame-pointer -O2 -pipe -fno-omit-frame-pointer -fstack-protector-strong -fno-strict-aliasing -fno-omit-frame-pointer -fopenmp -march=native -std=gnu++11 -MD -MT CMakeFiles/hptt.dir/src/utils.cpp.o -MF CMakeFiles/hptt.dir/src/utils.cpp.o.d -o CMakeFiles/hptt.dir/src/utils.cpp.o -c /disk-samsung/freebsd-ports/math/hptt/work/hptt-1.0.5-18-g9425386/src/utils.cpp
[ 40% 4/5] /usr/bin/c++  -I/disk-samsung/freebsd-ports/math/hptt/work/hptt-1.0.5-18-g9425386/include -O2 -pipe -fno-omit-frame-pointer -fstack-protector-strong -fno-strict-aliasing -fno-omit-frame-pointer -O2 -pipe -fno-omit-frame-pointer -fstack-protector-strong -fno-strict-aliasing -fno-omit-frame-pointer -fopenmp -march=native -std=gnu++11 -MD -MT CMakeFiles/hptt.dir/src/hptt.cpp.o -MF CMakeFiles/hptt.dir/src/hptt.cpp.o.d -o CMakeFiles/hptt.dir/src/hptt.cpp.o -c /disk-samsung/freebsd-ports/math/hptt/work/hptt-1.0.5-18-g9425386/src/hptt.cpp
FAILED: CMakeFiles/hptt.dir/src/hptt.cpp.o 
/usr/bin/c++  -I/disk-samsung/freebsd-ports/math/hptt/work/hptt-1.0.5-18-g9425386/include -O2 -pipe -fno-omit-frame-pointer -fstack-protector-strong -fno-strict-aliasing -fno-omit-frame-pointer -O2 -pipe -fno-omit-frame-pointer -fstack-protector-strong -fno-strict-aliasing -fno-omit-frame-pointer -fopenmp -march=native -std=gnu++11 -MD -MT CMakeFiles/hptt.dir/src/hptt.cpp.o -MF CMakeFiles/hptt.dir/src/hptt.cpp.o.d -o CMakeFiles/hptt.dir/src/hptt.cpp.o -c /disk-samsung/freebsd-ports/math/hptt/work/hptt-1.0.5-18-g9425386/src/hptt.cpp
/disk-samsung/freebsd-ports/math/hptt/work/hptt-1.0.5-18-g9425386/src/hptt.cpp:179:131: error: implicit conversion from 'const _Complex float' to 'float' is not permitted in C++
                         (const hptt::FloatComplex*) A, (hptt::FloatComplex) alpha, (hptt::FloatComplex*) B, (hptt::FloatComplex) beta, hptt::ESTIMATE, numThreads, nullptr, useRowMajor));
                                                                                                             ~                    ^~~~
/disk-samsung/freebsd-ports/math/hptt/work/hptt-1.0.5-18-g9425386/src/hptt.cpp:179:78: error: implicit conversion from 'const _Complex float' to 'float' is not permitted in C++
                         (const hptt::FloatComplex*) A, (hptt::FloatComplex) alpha, (hptt::FloatComplex*) B, (hptt::FloatComplex) beta, hptt::ESTIMATE, numThreads, nullptr, useRowMajor));
                                                        ~                    ^~~~~
/disk-samsung/freebsd-ports/math/hptt/work/hptt-1.0.5-18-g9425386/src/hptt.cpp:190:135: error: implicit conversion from 'const _Complex double' to 'double' is not permitted in C++
                         (const hptt::DoubleComplex*) A, (hptt::DoubleComplex) alpha, (hptt::DoubleComplex*) B, (hptt::DoubleComplex) beta, hptt::ESTIMATE, numThreads, nullptr, useRowMajor));
                                                                                                                ~                     ^~~~
/disk-samsung/freebsd-ports/math/hptt/work/hptt-1.0.5-18-g9425386/src/hptt.cpp:190:80: error: implicit conversion from 'const _Complex double' to 'double' is not permitted in C++
                         (const hptt::DoubleComplex*) A, (hptt::DoubleComplex) alpha, (hptt::DoubleComplex*) B, (hptt::DoubleComplex) beta, hptt::ESTIMATE, numThreads, nullptr, useRowMajor));
                                                         ~                     ^~~~~
4 errors generated.

Version: 1.0.5-18

Compilation failed with g++ 6.3.0

Hi,
the hptt compilation failed on my machine (6700K, Ubuntu 17.04, g++ 6.3.0) with the following message:

/usr/lib/gcc/x86_64-linux-gnu/6/include/avxintrin.h:994:1: error: inlining failed in call to always_inline ‘void hptt::_mm256_stream_ps(float*, hptt::__m256)’: target specific option mismatch
 _mm256_stream_ps (float *__P, __m256 __A)
 ^~~~~~~~~~~~~~~~
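
GCC typically reports "target specific option mismatch" when an AVX intrinsic is compiled in a translation unit for which AVX code generation is not enabled, for example when neither -mavx nor a suitable -march is passed. Whether that is the cause here cannot be confirmed from the report alone. As a sketch of how code can protect itself, the example below takes the intrinsic path only when the compiler defines __AVX__ (GCC and Clang do so when AVX is enabled) and otherwise falls back to a scalar loop. The function copy8 is hypothetical and not part of HPTT.

```cpp
#include <cassert>
#if defined(__AVX__)
#include <immintrin.h>
#endif

// Hypothetical example: copy eight floats, using AVX only if it is enabled
// for this translation unit.
void copy8(const float* src, float* dst) {
#if defined(__AVX__)
    // Unaligned load/store, so no alignment requirement on src or dst.
    _mm256_storeu_ps(dst, _mm256_loadu_ps(src));
#else
    for (int i = 0; i < 8; ++i) dst[i] = src[i];
#endif
}
```

Built with or without -mavx, the function compiles and gives the same result; only the generated instructions differ.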
