hptt's People

Contributors

ajaypanyala, jcmgray, solomonik, springer13

hptt's Issues

Either CMake config file or pkgconfig .pc needed

Currently neither is generated:

a ./opt/local/lib/libhptt.a
a ./opt/local/include/compute_node.h
a ./opt/local/include/hptt.h
a ./opt/local/include/hptt_types.h
a ./opt/local/include/macros.h
a ./opt/local/include/plan.h
a ./opt/local/include/transpose.h
a ./opt/local/include/utils.h

It is probably also better to install headers to ${prefix}/include/hptt and not dump them into a common folder :)

Problem with executing benchmark and projects

Hello,

I compiled the benchmark using the Makefile, but I got the following error when I tried to run the executable:
"Error while loading shared libraries: libhptt.so: cannot open shared object file: No such file or directory"
What could the problem be? I followed the installation instructions exactly.

Thanks in advance!

benchmark/reference.cpp fails to compile: error: cannot convert 'std::complex<float>' to 'float' in assignment

g++10 -O3 -std=c++11 -I../src/  -c ../benchmark/reference.cpp -o ../benchmark/reference.o
../benchmark/reference.cpp: In instantiation of 'void transpose_ref(uint32_t*, uint32_t*, int, const floatType*, floatType, floatType*, floatType, bool) [with floatType = float; uint32_t = unsigned int]':
../benchmark/reference.cpp:74:58:   required from here
../benchmark/reference.cpp:60:30: error: cannot convert 'std::complex<float>' to 'float' in assignment
   60 |                B_[i] = alpha * std::conj(A_[i * strideAinner]);
      |                        ~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
      |                              |
      |                              std::complex<float>
../benchmark/reference.cpp:66:64: error: cannot convert 'std::complex<float>' to 'float' in assignment
   66 |                B_[i] = alpha * std::conj(A_[i * strideAinner]) + beta * B_[i];
      |                        ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~
      |                                                                |
      |                                                                std::complex<float>
../benchmark/reference.cpp: In instantiation of 'void transpose_ref(uint32_t*, uint32_t*, int, const floatType*, floatType, floatType*, floatType, bool) [with floatType = double; uint32_t = unsigned int]':
../benchmark/reference.cpp:80:60:   required from here
../benchmark/reference.cpp:60:30: error: cannot convert 'std::complex<double>' to 'double' in assignment
   60 |                B_[i] = alpha * std::conj(A_[i * strideAinner]);
      |                        ~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
      |                              |
      |                              std::complex<double>
../benchmark/reference.cpp:66:64: error: cannot convert 'std::complex<double>' to 'double' in assignment
   66 |                B_[i] = alpha * std::conj(A_[i * strideAinner]) + beta * B_[i];
      |                        ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~
      |                                                                |
      |                                                                std::complex<double>
gmake[1]: *** [Makefile:32: ../benchmark/reference.o] Error 1
gmake[1]: Leaving directory '/disk-samsung/freebsd-ports/math/hptt/work/hptt-1.0.5-18-g9425386/testframework'
*** Error code 2

Version: 1.0.5-18-g9425386
gcc-10
FreeBSD 13.1

API description

Hi,
would it be possible to give a few more details on the API?

  • Does the size vector correspond to the sizes of A or of B?
  • Does i_1 correspond to the major index (i_1 contiguous with i_1+1), or is it i_N?

Thanks in advance,

Laurent
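
A minimal sketch of one index convention that is consistent with the numThreads=1 output listed in the execute_expert issue further down this page. It is inferred from that output, not taken from the HPTT documentation, so treat every point as an assumption: the size vector describes A, storage is column-major (index 0 is the contiguous one), and dimension d of B is dimension perm[d] of A. The function name transpose_naive is hypothetical and not part of HPTT.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Naive out-of-place transposition B = permute(A).
// Assumptions (inferred, not confirmed): sizeA holds the extents of A,
// storage is column-major, and sizeB[d] == sizeA[perm[d]].
std::vector<double> transpose_naive(const std::vector<double>& A,
                                    const std::vector<int>& sizeA,
                                    const std::vector<int>& perm) {
    const std::size_t dim = sizeA.size();
    assert(perm.size() == dim);

    // Column-major strides of A: index 0 has stride 1.
    std::vector<std::size_t> strideA(dim, 1);
    for (std::size_t d = 1; d < dim; ++d)
        strideA[d] = strideA[d - 1] * sizeA[d - 1];

    std::size_t total = 1;
    for (std::size_t d = 0; d < dim; ++d) total *= sizeA[d];
    assert(A.size() == total);

    std::vector<double> B(total);
    std::vector<int> idxB(dim, 0);  // multi-index into B; idxB[0] runs fastest
    for (std::size_t flatB = 0; flatB < total; ++flatB) {
        // Offset in A of the element that lands at position flatB of B.
        std::size_t flatA = 0;
        for (std::size_t d = 0; d < dim; ++d)
            flatA += idxB[d] * strideA[perm[d]];
        B[flatB] = A[flatA];

        // Advance idxB like an odometer over the extents of B.
        for (std::size_t d = 0; d < dim; ++d) {
            if (++idxB[d] < sizeA[perm[d]]) break;
            idxB[d] = 0;
        }
    }
    return B;
}
```

Under these assumptions, a 2x3 input A = {0, 1, 2, 3, 4, 5} with perm = {1, 0} gives B = {0, 2, 4, 1, 3, 5}, i.e. an ordinary matrix transpose in column-major storage.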

Compiling benchmark

When I compile the benchmark and reference files by running make in the /benchmark folder, I get the following error:
reference.cpp:60:30: error: cannot convert 'std::complex<float>' to 'float' in assignment
   60 |                B_[i] = alpha * std::conj(A_[i * strideAinner]);
      |                        ~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
      |                              |
      |                              std::complex<float>

alpha is a floatType, B_[i] is a floatType, A_[i * strideAinner] is a FloatComplex. Which of them should I change to match the types?
Thanks!
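
The compiler output suggests that std::conj applied to a float yields std::complex<float>, which cannot be assigned back to a float. One possible way around this, sketched here under the assumption that real-valued elements should simply pass through unchanged, is to conjugate only when the element type is complex. The helpers conj_if_complex and scale_conj are hypothetical names and not part of HPTT.

```cpp
#include <cassert>
#include <complex>
#include <cstddef>

// Real element types: conjugation is the identity.
template <typename T>
T conj_if_complex(const T& x) { return x; }

// Complex element types: this more specialized overload is preferred.
template <typename T>
std::complex<T> conj_if_complex(const std::complex<T>& x) { return std::conj(x); }

// Simplified stand-in for the inner loop of the reference kernel:
// B[i] = alpha * conj(A[i]) for both real and complex floatType.
template <typename floatType>
void scale_conj(const floatType* A, floatType* B, std::size_t n, floatType alpha) {
    assert(A != B);  // out-of-place only
    for (std::size_t i = 0; i < n; ++i)
        B[i] = alpha * conj_if_complex(A[i]);
}
```

With this kind of overload the same template body compiles for float, double, and their std::complex counterparts, so none of alpha, B_[i], or A_[...] has to change type.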

Conjugation flag for 'A'

Would it make sense to have a conjugation flag for A? You would need this anyway if you wanted to use this library to implement Hermitian conjugation.

Inconsistent BSD vs LGPLv3 license text

Reading through the HPTT code, I noticed that some files (e.g., hptt.h and hptt.cpp) have LGPLv3 license headers, even though the top-level license text says that the license is 3-clause BSD.

Is this an oversight?

Thanks!

invalid read in create_plan for scalar build

Found a particularly 'challenging' transposition that's causing trouble. Building with GNU, no optimizations, via make scalar. Here is a minimal test:

#include <hptt.h>

int main(){
  int order = 2;
  int perm[] = {1,0};
  int size[] = {1,1};
  double st_buffer[4];
  double new_buffer[4];
  int numThreads = 1;

  auto plan = hptt::create_plan( perm, order,
      1.0, ((double*)st_buffer), size, NULL,
      0.0, ((double*)new_buffer), NULL,
      hptt::ESTIMATE, numThreads );


  return 0;
}

Executing this in test.cxx in the hptt main folder as

g++ -O0 -std=c++0x test.cxx -I./src/ ./lib/libhptt.a  && valgrind ./a.out

gives

==27584== Invalid read of size 8
==27584==    at 0x41D90E: hptt::Transpose<double>::createPlans(std::vector<std::shared_ptr<hptt::Plan>, std::allocator<std::shared_ptr<hptt::Plan> > >&) const (hptt.cpp:1799)
==27584==    by 0x41CF33: hptt::Transpose<double>::createPlan() (hptt.cpp:1693)
==27584==    by 0x4047AE: hptt::create_plan(int const*, int, double, double const*, int const*, int const*, double, double*, int const*, hptt::SelectionMethod, int, int const*) (hptt.cpp:1926)
==27584==    by 0x401516: main (in /home/edgar/work/hptt-v1.0/a.out)
==27584==  Address 0x5ab5ef8 is 8 bytes before a block of size 16 alloc'd
==27584==    at 0x4C2E0EF: operator new(unsigned long) (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==27584==    by 0x498796: __gnu_cxx::new_allocator<unsigned long>::allocate(unsigned long, void const*) (in /home/edgar/work/hptt-v1.0/a.out)
==27584==    by 0x49873B: std::allocator_traits<std::allocator<unsigned long> >::allocate(std::allocator<unsigned long>&, unsigned long) (in /home/edgar/work/hptt-v1.0/a.out)
==27584==    by 0x4984D2: std::_Vector_base<unsigned long, std::allocator<unsigned long> >::_M_allocate(unsigned long) (in /home/edgar/work/hptt-v1.0/a.out)
==27584==    by 0x49815D: std::vector<unsigned long, std::allocator<unsigned long> >::_M_default_append(unsigned long) (vector.tcc:557)
==27584==    by 0x4129C0: std::vector<unsigned long, std::allocator<unsigned long> >::resize(unsigned long) (stl_vector.h:676)
==27584==    by 0x41AE79: hptt::Transpose<double>::Transpose(int const*, int const*, int const*, int const*, int, double const*, double, double*, double, hptt::SelectionMethod, int, int const*) (hptt.h:145)
==27584==    by 0x4641AC: void __gnu_cxx::new_allocator<hptt::Transpose<double> >::construct<hptt::Transpose<double>, int const*&, int const*&, int const*&, int const*&, int const&, double const*&, double const&, double*&, double const&, hptt::SelectionMethod const&, int const&, int const*&>(hptt::Transpose<double>*, int const*&, int const*&, int const*&, int const*&, int const&, double const*&, double const&, double*&, double const&, hptt::SelectionMethod const&, int const&, int const*&) (in /home/edgar/work/hptt-v1.0/a.out)
==27594==    by 0x463DB3: void std::allocator_traits<std::allocator<hptt::Transpose<double> > >::construct<hptt::Transpose<double>, int const*&, int const*&, int const*&, int const*&, int const&, double const*&, double const&, double*&, double const&, hptt::SelectionMethod const&, int const&, int const*&>(std::allocator<hptt::Transpose<double> >&, hptt::Transpose<double>*, int const*&, int const*&, int const*&, int const*&, int const&, double const*&, double const&, double*&, double const&, hptt::SelectionMethod const&, int const&, int const*&) (in /home/edgar/work/hptt-v1.0/a.out)
==27594==    by 0x463997: std::_Sp_counted_ptr_inplace<hptt::Transpose<double>, std::allocator<hptt::Transpose<double> >, (__gnu_cxx::_Lock_policy)2>::_Sp_counted_ptr_inplace<int const*&, int const*&, int const*&, int const*&, int const&, double const*&, double const&, double*&, double const&, hptt::SelectionMethod const&, int const&, int const*&>(std::allocator<hptt::Transpose<double> >, int const*&, int const*&, int const*&, int const*&, int const&, double const*&, double const&, double*&, double const&, hptt::SelectionMethod const&, int const&, int const*&) (shared_ptr_base.h:522)
==27594==    by 0x4635B3: std::__shared_count<(__gnu_cxx::_Lock_policy)2>::__shared_count<hptt::Transpose<double>, std::allocator<hptt::Transpose<double> >, int const*&, int const*&, int const*&, int const*&, int const&, double const*&, double const&, double*&, double const&, hptt::SelectionMethod const&, int const&, int const*&>(std::_Sp_make_shared_tag, hptt::Transpose<double>*, std::allocator<hptt::Transpose<double> > const&, int const*&, int const*&, int const*&, int const*&, int const&, double const*&, double const&, double*&, double const&, hptt::SelectionMethod const&, int const&, int const*&) (shared_ptr_base.h:617)
==27594==    by 0x4632E5: std::__shared_ptr<hptt::Transpose<double>, (__gnu_cxx::_Lock_policy)2>::__shared_ptr<std::allocator<hptt::Transpose<double> >, int const*&, int const*&, int const*&, int const*&, int const&, double const*&, double const&, double*&, double const&, hptt::SelectionMethod const&, int const&, int const*&>(std::_Sp_make_shared_tag, std::allocator<hptt::Transpose<double> > const&, int const*&, int const*&, int const*&, int const*&, int const&, double const*&, double const&, double*&, double const&, hptt::SelectionMethod const&, int const&, int const*&) (shared_ptr_base.h:1096)
==27594==
==27594== Invalid read of size 8
==27594==    at 0x41D935: hptt::Transpose<double>::createPlans(std::vector<std::shared_ptr<hptt::Plan>, std::allocator<std::shared_ptr<hptt::Plan> > >&) const (hptt.cpp:1800)
==27594==    by 0x41CF33: hptt::Transpose<double>::createPlan() (hptt.cpp:1693)
...

Don't ask me why I want to do this transposition :).

Testing Framework

Create a testing framework that tests HPTT for many (random) tensor transpositions, sizes, numbers of threads, data types, outerSizes, beta=0, and beta!=0.

One could use benchmark/reference.cpp as a reference implementation.
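
A rough sketch of how the random test cases could be generated. Everything here is an assumption for illustration: the struct TransposeCase, the function make_random_case, and the default limits are hypothetical and do not exist in HPTT. Each generated case would then be run through HPTT and through the reference implementation, and the two outputs compared.

```cpp
#include <algorithm>
#include <cassert>
#include <numeric>
#include <random>
#include <vector>

// Hypothetical description of one randomized test case.
struct TransposeCase {
    std::vector<int> perm;  // random permutation of 0..dim-1
    std::vector<int> size;  // extent of each dimension of A
    int numThreads;         // number of threads to request
    double beta;            // either 0.0 or a non-zero value
};

// Draws one case from the given generator; the limits are arbitrary defaults.
TransposeCase make_random_case(std::mt19937& rng, int maxDim = 6,
                               int maxExtent = 8, int maxThreads = 4) {
    assert(maxDim >= 1 && maxExtent >= 1 && maxThreads >= 1);
    std::uniform_int_distribution<int> dimDist(1, maxDim);
    std::uniform_int_distribution<int> extentDist(1, maxExtent);
    std::uniform_int_distribution<int> threadDist(1, maxThreads);
    std::bernoulli_distribution betaIsZero(0.5);

    TransposeCase c;
    const int dim = dimDist(rng);

    // Start from the identity permutation and shuffle it.
    c.perm.resize(dim);
    std::iota(c.perm.begin(), c.perm.end(), 0);
    std::shuffle(c.perm.begin(), c.perm.end(), rng);

    // Extents of 1 are allowed on purpose; they tend to hit corner cases.
    c.size.resize(dim);
    for (int& s : c.size) s = extentDist(rng);

    c.numThreads = threadDist(rng);
    c.beta = betaIsZero(rng) ? 0.0 : 2.0;
    return c;
}
```

Seeding std::mt19937 with a fixed value keeps failing cases reproducible, which matters when a mismatch only shows up for one particular combination of permutation, sizes, and thread count.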

"ValueError: repeated axis in transpose" in hptt.ascontiguousarray, not in np.ascontiguousarray

Hi,

I tried to use the Python API of hptt in analogy to numpy. For the following code, I get "ValueError: repeated axis in transpose" from hptt.ascontiguousarray, but not from np.ascontiguousarray. Is this expected? If so, what is the reason for it? Thanks

import numpy as np
import hptt
import copy

n_a = n_b = n_c = 1
n_d = n_e = n_f = 2
dim_a = (n_a, n_b, n_c, n_d, n_e, n_f)
a = np.random.random(dim_a)
b = copy.deepcopy(a)

b = np.transpose(b, (1,0,2,3,5,4))
#print(b.flags)
#b = np.ascontiguousarray(b)
b = hptt.ascontiguousarray(b)

Clang build OpenMP requirement

It would be nice to be able to build without OpenMP, for instance when using clang. My understanding is external modules are still necessary to do clang + OpenMP. My version of clang is

clang version 3.8.0-2ubuntu4 (tags/RELEASE_380/final)
Target: x86_64-pc-linux-gnu
Thread model: posix

I get the following error when building with -fopenmp or without

src/hptt.cpp:29:10: fatal error: 'omp.h' file not found
#include <omp.h>

Transposition into sub-tensor

Hello,
I tried to do a tensor transpose from a 3x3x3 tensor into a 3x3x3 sub-tensor of a 5x5x5 tensor, but the result is unexpected. The following code snippet is what I tried to do

std::vector<double> A(125), B(27, 1);
std::iota(A.begin(), A.end(), 0);
double* aliasA = &A[0];
std::vector<int> perm = {0,1,2};
std::vector<int> size = {3,3,3};
std::vector<int> outerSize = {5,5,5};
auto plan = hptt::create_plan(&perm[0], 3,
                              1, &B[0],   &size[0], NULL,
                              10, aliasA,           &outerSize[0],
                              hptt::ESTIMATE, 1);
plan->execute();
for(int i = 0; i < 125; i++) std::cout << A[i] << std::endl;

I would expect as result the following tensor

  1,  11,  21,   3,   4,
 51,  61,  71,   8,   9,
101, 111, 121,  13,  14,
 15,  16,  17,  18,  19,
 20,  21,  22,  23,  24,

251, 261, 271,  28,  29,
301, 311, 321,  33,  34,
351, 361, 371,  38,  39,
 40,  41,  42,  43,  44,
 45,  46,  47,  48,  49,

501, 511, 521,  53,  54,
551, 561, 571,  58,  59,
601, 611, 621,  63,  64,
 65,  66,  67,  68,  69,
 70,  71,  72,  73,  74,

 75,  76,  77,  78,  79,
 80,  81,  82,  83,  84,
 85,  86,  87,  88,  89,
 90,  91,  92,  93,  94,
 95,  96,  97,  98,  99,

100, 101, 102, 103, 104,
105, 106, 107, 108, 109,
110, 111, 112, 113, 114,
115, 116, 117, 118, 119,
120, 121, 122, 123, 124

However, the result ends up as the tensor

  1, 111,2111, 311,  41,
 51, 611,7111, 811,  91,
101,1111,12111,1311, 141,
 15,  16,  17,  18,  19,
 20,  21,  22,  23,  24,

 25,  26,  27,  28,  29,
 30,  31,  32,  33,  34,
 35,  36,  37,  38,  39,
 40,  41,  42,  43,  44,
 45,  46,  47,  48,  49,

 50,  51,  52,  53,  54,
 55,  56,  57,  58,  59,
 60,  61,  62,  63,  64,
 65,  66,  67,  68,  69,
 70,  71,  72,  73,  74,

 75,  76,  77,  78,  79,
 80,  81,  82,  83,  84,
 85,  86,  87,  88,  89,
 90,  91,  92,  93,  94,
 95,  96,  97,  98,  99,

100, 101, 102, 103, 104,
105, 106, 107, 108, 109,
110, 111, 112, 113, 114,
115, 116, 117, 118, 119,
120, 121, 122, 123, 124

Fix for MSVC

Three places need to be fixed in order to compile with MSVC:

  1. Complex types in MSVC C are _Fcomplex and _Dcomplex, which do not conform to C99, so xxx _Complex should be replaced with the following macros:
#ifndef HPTT_C_FLT_COMPLEX
#ifdef _MSC_VER
  #define HPTT_C_FLT_COMPLEX _Fcomplex
  #define HPTT_C_DBL_COMPLEX _Dcomplex
#else
  #define HPTT_C_FLT_COMPLEX float _Complex
  #define HPTT_C_DBL_COMPLEX double _Complex
#endif
#endif
  2. The INLINE macro needs an MSVC-specific definition, __forceinline:
#if defined(__ICC) || defined(__INTEL_COMPILER) || defined(_MSC_VER)
# define INLINE __forceinline
#elif .....
  3. MSVC does not support VLAs, so _alloca should be used instead:
#ifdef _MSC_VER
#define HPTT_DECL_VLA(type_, name_, len_) type_* name_ = reinterpret_cast<type_*>(_alloca(sizeof(type_) * (len_)));
#else 
#define HPTT_DECL_VLA(type_, name_, len_) type_ name_[len_];
#endif

throw error rather than exit directly

When something such as a dimension is incorrect, hptt calls exit(-1) directly.
This makes debugging difficult.
It would be better to throw an error instead.
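
A minimal sketch of what throwing instead of exiting could look like for one kind of input check. The function names validate_perm and try_validate are hypothetical and not part of HPTT, and the choice of std::invalid_argument is an assumption; the library's actual checks may differ.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical input check: throws instead of calling exit(-1).
void validate_perm(const int* perm, int dim) {
    if (perm == nullptr)
        throw std::invalid_argument("perm must not be null");
    if (dim < 1)
        throw std::invalid_argument("dim must be positive, got " + std::to_string(dim));

    std::vector<bool> seen(dim, false);
    for (int i = 0; i < dim; ++i) {
        // Reject out-of-range and repeated entries.
        if (perm[i] < 0 || perm[i] >= dim || seen[perm[i]])
            throw std::invalid_argument("perm is not a permutation of 0.." +
                                        std::to_string(dim - 1));
        seen[perm[i]] = true;
    }
}

// Caller side: the error can be reported or recovered from,
// and the process is not terminated.
bool try_validate(const int* perm, int dim, std::string& message) {
    try {
        validate_perm(perm, dim);
        return true;
    } catch (const std::invalid_argument& e) {
        message = e.what();
        return false;
    }
}
```

An exception also carries a message and unwinds the stack, so a debugger or a calling framework (for example a Python binding) can surface the failure instead of losing the whole process.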

Python API not working for complex arrays

The tests fail for the Python API when complex numbers are used. The culprit seems to be a missing parameter in pythonAPI/hptt/hptt.py at line 119 (setConjA is missing, although it is present in the C handle).

Strange behaviour with numThreads>1 and execute_expert<useStream, false, betaIsNull>

The following code gives the expected result for numThreads=1, but the result is different for numThreads>1.

    const int dim = 9;
    int sizeAx[dim];
    int sizeAy[dim];
    int sizeAz[dim];
    int perm[dim];


    perm[0] = 2;
    perm[1] = 5;
    perm[2] = 7;
    perm[3] = 8;
    perm[4] = 1;
    perm[5] = 4;
    perm[6] = 3;
    perm[7] = 0;
    perm[8] = 6;

    sizeAx[8] = 3;
    sizeAx[7] = 2;
    sizeAx[6] = 2;
    sizeAx[5] = 1;
    sizeAx[4] = 1;
    sizeAx[3] = 1;
    sizeAx[2] = 2;
    sizeAx[1] = 1;
    sizeAx[0] = 2;

    for (int d = 0; d < 9; d++) sizeAz[d] = sizeAx[perm[d]];
    for (int d = 0; d < 9; d++) sizeAy[d] = sizeAz[perm[d]];

    uint flatSize = 1;
    for (int d = 0; d < 9; d++) flatSize *= sizeAx[d];

    RealType * Ax = new RealType[flatSize];
    RealType * Ay = new RealType[flatSize];
    RealType * Az = new RealType[flatSize];

    for (uint i = 0; i < flatSize; i++) {
        Ax[i] = RealType(i);
        Ay[i] = 0.0;
        Az[i] = 0.0;
    }

    RealType alpha = 1.0;
    RealType beta = 0.0;

    const int numThreads = 4;

    auto planXZ = hptt::create_plan(perm, dim,
            alpha, Ax, sizeAx, NULL,
            beta, Az, NULL,
            hptt::ESTIMATE, numThreads);

    auto planZY = hptt::create_plan(perm, dim,
            alpha, Az, sizeAz, NULL,
            beta, Ay, NULL,
            hptt::ESTIMATE, numThreads);

    auto planYX = hptt::create_plan(perm, dim,
            alpha, Ay, sizeAy, NULL,
            beta, Ax, NULL,
            hptt::ESTIMATE, numThreads);


    const bool useStream = false;
    const bool useThreads = false;
    const bool betaIsNull = true;

    planXZ->execute_expert<useStream, useThreads, betaIsNull>();
    for (uint i = 0; i < flatSize; i++) std::cout << Az[i] << " ";
    std::cout << std::endl;
    planZY->execute_expert<useStream, useThreads, betaIsNull>();
    for (uint i = 0; i < flatSize; i++) std::cout << Ay[i] << " ";
    std::cout << std::endl;
    planYX->execute_expert<useStream, useThreads, betaIsNull>();
    for (uint i = 0; i < flatSize; i++) std::cout << Ax[i] << " ";
    std::cout << std::endl;

    delete[] Ax;
    delete[] Ay;
    delete[] Az;

Result for numThreads=1:

0 2 8 10 16 18 24 26 32 34 40 42 1 3 9 11 17 19 25 27 33 35 41 43 4 6 12 14 20 22 28 30 36 38 44 46 5 7 13 15 21 23 29 31 37 39 45 47
0 8 1 9 4 12 5 13 16 24 17 25 20 28 21 29 32 40 33 41 36 44 37 45 2 10 3 11 6 14 7 15 18 26 19 27 22 30 23 31 34 42 35 43 38 46 39 47
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47

Result for numThreads=4:

0 2 8 10 16 18 0 0 0 0 0 0 1 3 9 11 17 19 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 8 0 0 0 0 0 0 16 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 2 10 0 0 0 0 0 0 18 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 2 3 4 5 6 7 8 0 10 11 12 13 14 15 16 0 18 19 20 21 22 23 0 0 26 27 28 29 30 31 0 0 34 35 36 37 38 39 0 0 42 43 44 45 46 47

error: implicit conversion from 'const _Complex float' to 'float' is not permitted in C++

Build fails with clang-13:

===>  Building for hptt-1.0.5.18
[ 20% 4/5] /usr/bin/c++  -I/disk-samsung/freebsd-ports/math/hptt/work/hptt-1.0.5-18-g9425386/include -O2 -pipe -fno-omit-frame-pointer -fstack-protector-strong -fno-strict-aliasing -fno-omit-frame-pointer -O2 -pipe -fno-omit-frame-pointer -fstack-protector-strong -fno-strict-aliasing -fno-omit-frame-pointer -fopenmp -march=native -std=gnu++11 -MD -MT CMakeFiles/hptt.dir/src/utils.cpp.o -MF CMakeFiles/hptt.dir/src/utils.cpp.o.d -o CMakeFiles/hptt.dir/src/utils.cpp.o -c /disk-samsung/freebsd-ports/math/hptt/work/hptt-1.0.5-18-g9425386/src/utils.cpp
[ 40% 4/5] /usr/bin/c++  -I/disk-samsung/freebsd-ports/math/hptt/work/hptt-1.0.5-18-g9425386/include -O2 -pipe -fno-omit-frame-pointer -fstack-protector-strong -fno-strict-aliasing -fno-omit-frame-pointer -O2 -pipe -fno-omit-frame-pointer -fstack-protector-strong -fno-strict-aliasing -fno-omit-frame-pointer -fopenmp -march=native -std=gnu++11 -MD -MT CMakeFiles/hptt.dir/src/hptt.cpp.o -MF CMakeFiles/hptt.dir/src/hptt.cpp.o.d -o CMakeFiles/hptt.dir/src/hptt.cpp.o -c /disk-samsung/freebsd-ports/math/hptt/work/hptt-1.0.5-18-g9425386/src/hptt.cpp
FAILED: CMakeFiles/hptt.dir/src/hptt.cpp.o 
/usr/bin/c++  -I/disk-samsung/freebsd-ports/math/hptt/work/hptt-1.0.5-18-g9425386/include -O2 -pipe -fno-omit-frame-pointer -fstack-protector-strong -fno-strict-aliasing -fno-omit-frame-pointer -O2 -pipe -fno-omit-frame-pointer -fstack-protector-strong -fno-strict-aliasing -fno-omit-frame-pointer -fopenmp -march=native -std=gnu++11 -MD -MT CMakeFiles/hptt.dir/src/hptt.cpp.o -MF CMakeFiles/hptt.dir/src/hptt.cpp.o.d -o CMakeFiles/hptt.dir/src/hptt.cpp.o -c /disk-samsung/freebsd-ports/math/hptt/work/hptt-1.0.5-18-g9425386/src/hptt.cpp
/disk-samsung/freebsd-ports/math/hptt/work/hptt-1.0.5-18-g9425386/src/hptt.cpp:179:131: error: implicit conversion from 'const _Complex float' to 'float' is not permitted in C++
                         (const hptt::FloatComplex*) A, (hptt::FloatComplex) alpha, (hptt::FloatComplex*) B, (hptt::FloatComplex) beta, hptt::ESTIMATE, numThreads, nullptr, useRowMajor));
                                                                                                             ~                    ^~~~
/disk-samsung/freebsd-ports/math/hptt/work/hptt-1.0.5-18-g9425386/src/hptt.cpp:179:78: error: implicit conversion from 'const _Complex float' to 'float' is not permitted in C++
                         (const hptt::FloatComplex*) A, (hptt::FloatComplex) alpha, (hptt::FloatComplex*) B, (hptt::FloatComplex) beta, hptt::ESTIMATE, numThreads, nullptr, useRowMajor));
                                                        ~                    ^~~~~
/disk-samsung/freebsd-ports/math/hptt/work/hptt-1.0.5-18-g9425386/src/hptt.cpp:190:135: error: implicit conversion from 'const _Complex double' to 'double' is not permitted in C++
                         (const hptt::DoubleComplex*) A, (hptt::DoubleComplex) alpha, (hptt::DoubleComplex*) B, (hptt::DoubleComplex) beta, hptt::ESTIMATE, numThreads, nullptr, useRowMajor));
                                                                                                                ~                     ^~~~
/disk-samsung/freebsd-ports/math/hptt/work/hptt-1.0.5-18-g9425386/src/hptt.cpp:190:80: error: implicit conversion from 'const _Complex double' to 'double' is not permitted in C++
                         (const hptt::DoubleComplex*) A, (hptt::DoubleComplex) alpha, (hptt::DoubleComplex*) B, (hptt::DoubleComplex) beta, hptt::ESTIMATE, numThreads, nullptr, useRowMajor));
                                                         ~                     ^~~~~
4 errors generated.

Version: 1.0.5-18

Compilation failed with g++ 6.3.0

Hi,
the hptt compilation failed on my machine (6700K, Ubuntu 17.04, g++ 6.3.0) with the following message:

/usr/lib/gcc/x86_64-linux-gnu/6/include/avxintrin.h:994:1: error: inlining failed in call to always_inline ‘void hptt::_mm256_stream_ps(float*, hptt::__m256)’: target specific option mismatch
 _mm256_stream_ps (float *__P, __m256 __A)
 ^~~~~~~~~~~~~~~~

Compilation succeeds with Intel icpc 2018.
