Comments (13)
@waredjeb Is the Eigen library also using CUDA internally?
Cupla internally ships a few defines that rename cuda* functions to cupla*. As a side effect, cupla must always be included after all external libraries that need direct access to CUDA. @sbastrakov that is the reason why including cupla late works.
I would like to remove this and let users opt in to the renaming defines only if they need them.
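A minimal sketch of how such a renaming define interacts with include order, for illustration only. All names below (cudaToyGetAnswer, cuplaToyGetAnswer, userCode) are hypothetical stand-ins, not real CUDA or cupla symbols, and the failure shown is just one way a rename can go wrong, not necessarily the one reported in this issue.

```cpp
#include <cassert>

// Stand-in for a third-party header that declares a CUDA-style function.
// (Hypothetical name; this is not real CUDA code.)
inline int cudaToyGetAnswer() { return 1; }

// Stand-in for the portability layer: its own implementation,
// followed by the renaming define. (Hypothetical; not real cupla code.)
inline int cuplaToyGetAnswer() { return 2; }
#define cudaToyGetAnswer cuplaToyGetAnswer

// User code written against the CUDA-style name is redirected by the
// preprocessor to the cupla-style function.
inline int userCode() { return cudaToyGetAnswer(); }

// If the #define were moved above the "third-party header" part, the
// preprocessor would also rewrite that definition into a second
// cuplaToyGetAnswer, and compilation would fail with a redefinition error.
```

With the define placed after the third-party part, userCode() calls the cupla-style function, which is the behavior the "include cupla late" order relies on.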
from cupla.
@psychocoderHPC yes, Eigen includes CUDA.
We then tried to work around this problem. First, we added the defines for the required headers:
#define __CUDA_FP16_H__
#define __CUDA_FP16_HPP__
This moved the problem to the Half.h header. Some of the errors generated:
/data/cmssw/slc7_amd64_gcc820/external/eigen/e4c107b451c52c9ab2d7b7fa4194ee35332916ec-nmpfii/include/eigen3/Eigen/src/Core/arch/GPU/Half.h(93): error: identifier "__half_raw" is undefined
/data/cmssw/slc7_amd64_gcc820/external/eigen/e4c107b451c52c9ab2d7b7fa4194ee35332916ec-nmpfii/include/eigen3/Eigen/src/Core/arch/GPU/Half.h(94): error: identifier "__half_raw" is undefined
/data/cmssw/slc7_amd64_gcc820/external/eigen/e4c107b451c52c9ab2d7b7fa4194ee35332916ec-nmpfii/include/eigen3/Eigen/src/Core/arch/GPU/Half.h(95): error: identifier "__half_raw" is undefined
/data/cmssw/slc7_amd64_gcc820/external/eigen/e4c107b451c52c9ab2d7b7fa4194ee35332916ec-nmpfii/include/eigen3/Eigen/src/Core/arch/GPU/Half.h(97): error: not a class or struct name
/data/cmssw/slc7_amd64_gcc820/external/eigen/e4c107b451c52c9ab2d7b7fa4194ee35332916ec-nmpfii/include/eigen3/Eigen/src/Core/arch/GPU/Half.h(100): error: identifier "__half_raw" is undefined
/data/cmssw/slc7_amd64_gcc820/external/eigen/e4c107b451c52c9ab2d7b7fa4194ee35332916ec-nmpfii/include/eigen3/Eigen/src/Core/arch/GPU/Half.h(111): error: identifier "__half" is undefined
/data/cmssw/slc7_amd64_gcc820/external/eigen/e4c107b451c52c9ab2d7b7fa4194ee35332916ec-nmpfii/include/eigen3/Eigen/src/Core/arch/GPU/Half.h(111): error: invalid redeclaration of member function "Eigen::half_impl::half_base::half_base(const <error-type> &)"
(100): here
We then tried adding:
typedef __half __half_raw;
and got errors for hceil and hfloor in Half.h:
...
/data/cmssw/slc7_amd64_gcc820/external/eigen/e4c107b451c52c9ab2d7b7fa4194ee35332916ec-nmpfii/include/eigen3/Eigen/src/Core/arch/GPU/Half.h(575): error: identifier "hfloor" is undefined
/data/cmssw/slc7_amd64_gcc820/external/eigen/e4c107b451c52c9ab2d7b7fa4194ee35332916ec-nmpfii/include/eigen3/Eigen/src/Core/arch/GPU/Half.h(583): error: identifier "hceil" is undefined
...
Finally, we managed to compile by modifying Eigen internally. In particular, we changed the following part of Half.h:
...
EIGEN_STRONG_INLINE EIGEN_DEVICE_FUNC half floor(const half& a) {
#if (EIGEN_CUDACC_VER >= 80000 && defined EIGEN_CUDA_ARCH && EIGEN_CUDA_ARCH >= 300) || \
    defined(EIGEN_HIP_DEVICE_COMPILE)
  return half(hfloor(a));
#else
  return half(::floorf(float(a)));
#endif
}
EIGEN_STRONG_INLINE EIGEN_DEVICE_FUNC half ceil(const half& a) {
#if (EIGEN_CUDACC_VER >= 80000 && defined EIGEN_CUDA_ARCH && EIGEN_CUDA_ARCH >= 300) || \
    defined(EIGEN_HIP_DEVICE_COMPILE)
  return half(hceil(a));
#else
  return half(::ceilf(float(a)));
#endif
}
After our change, it reads:
...
EIGEN_STRONG_INLINE EIGEN_DEVICE_FUNC half floor(const half& a) {
  return half(::floorf(float(a)));
}
EIGEN_STRONG_INLINE EIGEN_DEVICE_FUNC half ceil(const half& a) {
  return half(::ceilf(float(a)));
}
With this change, everything works fine for our case!
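The pattern behind this workaround, always rounding through float instead of calling the half intrinsics, can be sketched in isolation. ToyHalf, toyFloor and toyCeil below are hypothetical names, and ToyHalf stores a plain float for simplicity; it is a stand-in for illustration, not Eigen::half and not a real 16-bit type.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical stand-in for a half-precision type. It stores a float for
// simplicity and is NOT Eigen::half or a real 16-bit representation.
struct ToyHalf {
    float value;
    explicit ToyHalf(float v) : value(v) {}
    explicit operator float() const { return value; }
};

// Portable path only: convert to float, round, convert back.
// No hfloor/hceil intrinsics are referenced, so no half-precision
// device header is required for these two functions.
inline ToyHalf toyFloor(const ToyHalf& a) {
    return ToyHalf(std::floor(static_cast<float>(a)));
}

inline ToyHalf toyCeil(const ToyHalf& a) {
    return ToyHalf(std::ceil(static_cast<float>(a)));
}
```

The trade-off is that the fast intrinsic path is never taken, which is acceptable when the application does not depend on half-precision performance.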
from cupla.
Needs to be checked, but we probably need to add defines for the CUDA __half and __half2 types, similar to what #118 did for int3 and float3.
from cupla.
After an offline discussion with @psychocoderHPC: our documentation recommends the wrong include order. I will provide a PR that fixes it and explains the issue further in terms of other includes pulling in CUDA headers internally.
from cupla.
Thanks for the detailed description. I will try to reproduce it tomorrow. Our general idea is to include cupla before the CUDA stuff, which, however, is the order that does not work for you. The issues are probably caused by Eigen including CUDA headers directly, so that the translation unit contains both cupla and CUDA names. But then it is unclear why the other order works.
from cupla.
Yes, Eigen includes CUDA stuff. So we need to update our README, which explicitly says to include cupla before CUDA headers.
from cupla.
Glad that you found a workaround.
Does your application actually make use of half-precision types? If not, another possible solution is to not define EIGEN_HAS_CUDA_FP16, so that the corresponding CUDA headers are not included. Unfortunately, this also requires modifying Eigen.
from cupla.
@sbastrakov @psychocoderHPC thanks for the quick replies!
from cupla.
@waredjeb should this issue be closed?
Not sure if you are interested in trying out this suggestion.
from cupla.
@sbastrakov Sorry I forgot to reply to the question, I will try your suggestion!
from cupla.
@waredjeb should I close this issue? Not sure if you found time to try my later suggestion.
from cupla.
@sbastrakov sorry for the late answer. I tried your suggestion, but it produced the following errors:
.../Eigen/src/Core/arch/GPU/Half.h(583): error: identifier "hceil" is undefined
.../Eigen/src/Core/arch/GPU/Half.h(575): error: identifier "hfloor" is undefined
from cupla.
Thanks for trying, sorry it did not work out. Closing the issue.
from cupla.