Comments (7)
It could build something like lib/libcupla.so, that user code could link to.
Yes, I would also go the way of creating a shared or static library with cupla for each backend. IMO the best way is to create a lib for each available backend, e.g. libcupla_omp2b.so, libcupla_omp2t.so, libcupla_cuda.so, ...
We are currently working on the CMake changes for alpaka to make alpaka installable. We will then also update cupla with this concept. This means that in the future we could create these independent cupla backend libraries during the installation of cupla.
from cupla.
Unfortunately I don't know how to write a CMake file to do it, but I have written a simple Makefile:
I will try to post some CMake example snippets next week.
Yes I would also go the way to create a shared or static library with cupla for each backend. IMO best way is to create a lib for each available backend e.g. libcupla_omp2b.so, libcupla_omp2t.so, libcupla_cuda.so, ...
Mhm, good point, I'll try that as well.
Here is my attempt at a Makefile to build a separate library for each backend:
.PHONY: all library clean install
# installation path
INSTALL_PREFIX := /usr/local
# external tools and dependencies
# CUDA installation, leave empty to disable CUDA support
CUDA_BASE := /usr/local/cuda
# boost installation, leave empty to use the system installation
BOOST_BASE :=
# TBB installation, leave empty to use the system installation
TBB_BASE :=
# Alpaka installation, leave empty to use the version bundled with Cupla
ALPAKA_BASE :=
# host compiler
CXX := g++
CXXFLAGS := -std=c++14 -O2 -g
HOST_CXXFLAGS := -pthread -fPIC -Wall -Wextra
# OpenMP flags
OMP_FLAGS := -fopenmp -foffload=disable
# CUDA compiler
ifdef CUDA_BASE
NVCC := $(CUDA_BASE)/bin/nvcc
NVCC_FLAGS := --generate-line-info --source-in-ptx --expt-extended-lambda --expt-relaxed-constexpr --generate-code arch=compute_35,code=sm_35 --generate-code arch=compute_50,code=sm_50 --generate-code arch=compute_60,code=sm_60 --generate-code arch=compute_70,code=sm_70 --generate-code arch=compute_70,code=compute_70 --cudart shared -ccbin $(CXX) -Xcudafe --display_error_number -Xcudafe --diag_suppress=esa_on_defaulted_function_ignored
CUDA_CXXFLAGS := -I$(CUDA_BASE)/include
CUDA_LDFLAGS := -L$(CUDA_BASE)/lib64 -lcudart
endif
# boost library
ifdef BOOST_BASE
BOOST_CXXFLAGS := -I$(BOOST_BASE)/include
else
BOOST_CXXFLAGS :=
endif
# TBB library (the TBB backend needs to link against libtbb)
ifdef TBB_BASE
TBB_CXXFLAGS := -I$(TBB_BASE)/include
TBB_LDFLAGS := -L$(TBB_BASE)/lib -ltbb -lrt
else
TBB_CXXFLAGS :=
TBB_LDFLAGS := -ltbb -lrt
endif
# Alpaka library
ifdef ALPAKA_BASE
ALPAKA_CXXFLAGS := -I$(ALPAKA_BASE)/include -DALPAKA_DEBUG=0
else
ALPAKA_CXXFLAGS := -Ialpaka/include -DALPAKA_DEBUG=0
endif
# source files
SRC=$(wildcard src/*.cpp src/manager/*.cpp)
all: library

# build the CUDA library only when CUDA support is available
TARGETS := lib/libcupla-serial.so lib/libcupla-threads.so lib/libcupla-omp2-threads.so lib/libcupla-omp2-blocks.so lib/libcupla-omp4.so lib/libcupla-tbb.so
ifdef CUDA_BASE
TARGETS += lib/libcupla-cuda.so
endif

library: $(TARGETS)

clean:
	rm -rf build lib

install: library
	mkdir -p $(INSTALL_PREFIX)/cupla
	cp -ar include src lib $(INSTALL_PREFIX)/cupla
# compile the CUDA GPU backend only if CUDA support is available
ifdef CUDA_BASE
# CUDA GPU backend with synchronous queues
CUDA_SYNC_OBJ = $(SRC:src/%.cpp=build/cuda-sync/%.o)
$(CUDA_SYNC_OBJ): build/cuda-sync/%.o: src/%.cpp
	@mkdir -p $(dir $@)
	$(NVCC) -x cu $(CXXFLAGS) $(NVCC_FLAGS) -Xcompiler '$(HOST_CXXFLAGS)' $(CUDA_CXXFLAGS) $(BOOST_CXXFLAGS) $(ALPAKA_CXXFLAGS) -Iinclude -DALPAKA_ACC_GPU_CUDA_ENABLED -DCUPLA_STREAM_ASYNC_ENABLED=0 -c $< -o $@

# CUDA GPU backend with asynchronous queues
CUDA_ASYNC_OBJ = $(SRC:src/%.cpp=build/cuda-async/%.o)
$(CUDA_ASYNC_OBJ): build/cuda-async/%.o: src/%.cpp
	@mkdir -p $(dir $@)
	$(NVCC) -x cu $(CXXFLAGS) $(NVCC_FLAGS) -Xcompiler '$(HOST_CXXFLAGS)' $(CUDA_CXXFLAGS) $(BOOST_CXXFLAGS) $(ALPAKA_CXXFLAGS) -Iinclude -DALPAKA_ACC_GPU_CUDA_ENABLED -DCUPLA_STREAM_ASYNC_ENABLED=1 -c $< -o $@

# cupla shared library for the CUDA GPU backend
lib/libcupla-cuda.so: $(CUDA_SYNC_OBJ) $(CUDA_ASYNC_OBJ)
	@mkdir -p $(dir $@)
	$(CXX) $(CXXFLAGS) $(HOST_CXXFLAGS) $^ $(CUDA_LDFLAGS) -shared -o $@
endif
# serial CPU backend with synchronous queues
SEQ_SEQ_SYNC_OBJ = $(SRC:src/%.cpp=build/seq-seq-sync/%.o)
$(SEQ_SEQ_SYNC_OBJ): build/seq-seq-sync/%.o: src/%.cpp
	@mkdir -p $(dir $@)
	$(CXX) $(CXXFLAGS) $(HOST_CXXFLAGS) $(BOOST_CXXFLAGS) $(ALPAKA_CXXFLAGS) -Iinclude -DALPAKA_ACC_CPU_B_SEQ_T_SEQ_ENABLED -DCUPLA_STREAM_ASYNC_ENABLED=0 -c $< -o $@

# serial CPU backend with asynchronous queues
SEQ_SEQ_ASYNC_OBJ = $(SRC:src/%.cpp=build/seq-seq-async/%.o)
$(SEQ_SEQ_ASYNC_OBJ): build/seq-seq-async/%.o: src/%.cpp
	@mkdir -p $(dir $@)
	$(CXX) $(CXXFLAGS) $(HOST_CXXFLAGS) $(BOOST_CXXFLAGS) $(ALPAKA_CXXFLAGS) -Iinclude -DALPAKA_ACC_CPU_B_SEQ_T_SEQ_ENABLED -DCUPLA_STREAM_ASYNC_ENABLED=1 -c $< -o $@

# cupla shared library for the serial CPU backend
lib/libcupla-serial.so: $(SEQ_SEQ_SYNC_OBJ) $(SEQ_SEQ_ASYNC_OBJ)
	@mkdir -p $(dir $@)
	$(CXX) $(CXXFLAGS) $(HOST_CXXFLAGS) $^ -shared -o $@
# std::thread CPU backend with synchronous queues
SEQ_THREADS_SYNC_OBJ = $(SRC:src/%.cpp=build/seq-threads-sync/%.o)
$(SEQ_THREADS_SYNC_OBJ): build/seq-threads-sync/%.o: src/%.cpp
	@mkdir -p $(dir $@)
	$(CXX) $(CXXFLAGS) $(HOST_CXXFLAGS) $(BOOST_CXXFLAGS) $(ALPAKA_CXXFLAGS) -Iinclude -DALPAKA_ACC_CPU_B_SEQ_T_THREADS_ENABLED -DCUPLA_STREAM_ASYNC_ENABLED=0 -c $< -o $@

# std::thread CPU backend with asynchronous queues
SEQ_THREADS_ASYNC_OBJ = $(SRC:src/%.cpp=build/seq-threads-async/%.o)
$(SEQ_THREADS_ASYNC_OBJ): build/seq-threads-async/%.o: src/%.cpp
	@mkdir -p $(dir $@)
	$(CXX) $(CXXFLAGS) $(HOST_CXXFLAGS) $(BOOST_CXXFLAGS) $(ALPAKA_CXXFLAGS) -Iinclude -DALPAKA_ACC_CPU_B_SEQ_T_THREADS_ENABLED -DCUPLA_STREAM_ASYNC_ENABLED=1 -c $< -o $@

# cupla shared library for the std::thread CPU backend
lib/libcupla-threads.so: $(SEQ_THREADS_SYNC_OBJ) $(SEQ_THREADS_ASYNC_OBJ)
	@mkdir -p $(dir $@)
	$(CXX) $(CXXFLAGS) $(HOST_CXXFLAGS) $^ -shared -o $@
# OpenMP 2.0 parallel threads CPU backend with synchronous queues
SEQ_OMP2_SYNC_OBJ = $(SRC:src/%.cpp=build/seq-omp2-sync/%.o)
$(SEQ_OMP2_SYNC_OBJ): build/seq-omp2-sync/%.o: src/%.cpp
	@mkdir -p $(dir $@)
	$(CXX) $(CXXFLAGS) $(HOST_CXXFLAGS) $(OMP_FLAGS) $(BOOST_CXXFLAGS) $(ALPAKA_CXXFLAGS) -Iinclude -DALPAKA_ACC_CPU_B_SEQ_T_OMP2_ENABLED -DCUPLA_STREAM_ASYNC_ENABLED=0 -c $< -o $@

# OpenMP 2.0 parallel threads CPU backend with asynchronous queues
SEQ_OMP2_ASYNC_OBJ = $(SRC:src/%.cpp=build/seq-omp2-async/%.o)
$(SEQ_OMP2_ASYNC_OBJ): build/seq-omp2-async/%.o: src/%.cpp
	@mkdir -p $(dir $@)
	$(CXX) $(CXXFLAGS) $(HOST_CXXFLAGS) $(OMP_FLAGS) $(BOOST_CXXFLAGS) $(ALPAKA_CXXFLAGS) -Iinclude -DALPAKA_ACC_CPU_B_SEQ_T_OMP2_ENABLED -DCUPLA_STREAM_ASYNC_ENABLED=1 -c $< -o $@

# cupla shared library for the OpenMP 2.0 parallel threads CPU backend
lib/libcupla-omp2-threads.so: $(SEQ_OMP2_SYNC_OBJ) $(SEQ_OMP2_ASYNC_OBJ)
	@mkdir -p $(dir $@)
	$(CXX) $(CXXFLAGS) $(HOST_CXXFLAGS) $(OMP_FLAGS) $^ -shared -o $@
# OpenMP 2.0 parallel blocks CPU backend with synchronous queues
OMP2_SEQ_SYNC_OBJ = $(SRC:src/%.cpp=build/omp2-seq-sync/%.o)
$(OMP2_SEQ_SYNC_OBJ): build/omp2-seq-sync/%.o: src/%.cpp
	@mkdir -p $(dir $@)
	$(CXX) $(CXXFLAGS) $(HOST_CXXFLAGS) $(OMP_FLAGS) $(BOOST_CXXFLAGS) $(ALPAKA_CXXFLAGS) -Iinclude -DALPAKA_ACC_CPU_B_OMP2_T_SEQ_ENABLED -DCUPLA_STREAM_ASYNC_ENABLED=0 -c $< -o $@

# OpenMP 2.0 parallel blocks CPU backend with asynchronous queues
OMP2_SEQ_ASYNC_OBJ = $(SRC:src/%.cpp=build/omp2-seq-async/%.o)
$(OMP2_SEQ_ASYNC_OBJ): build/omp2-seq-async/%.o: src/%.cpp
	@mkdir -p $(dir $@)
	$(CXX) $(CXXFLAGS) $(HOST_CXXFLAGS) $(OMP_FLAGS) $(BOOST_CXXFLAGS) $(ALPAKA_CXXFLAGS) -Iinclude -DALPAKA_ACC_CPU_B_OMP2_T_SEQ_ENABLED -DCUPLA_STREAM_ASYNC_ENABLED=1 -c $< -o $@

# cupla shared library for the OpenMP 2.0 parallel blocks CPU backend
lib/libcupla-omp2-blocks.so: $(OMP2_SEQ_SYNC_OBJ) $(OMP2_SEQ_ASYNC_OBJ)
	@mkdir -p $(dir $@)
	$(CXX) $(CXXFLAGS) $(HOST_CXXFLAGS) $(OMP_FLAGS) $^ -shared -o $@
# OpenMP 4.0 parallel CPU backend with synchronous queues
OMP4_SYNC_OBJ = $(SRC:src/%.cpp=build/omp4-sync/%.o)
$(OMP4_SYNC_OBJ): build/omp4-sync/%.o: src/%.cpp
	@mkdir -p $(dir $@)
	$(CXX) $(CXXFLAGS) $(HOST_CXXFLAGS) $(OMP_FLAGS) $(BOOST_CXXFLAGS) $(ALPAKA_CXXFLAGS) -Iinclude -DALPAKA_ACC_CPU_BT_OMP4_ENABLED -DCUPLA_STREAM_ASYNC_ENABLED=0 -c $< -o $@

# OpenMP 4.0 parallel CPU backend with asynchronous queues
OMP4_ASYNC_OBJ = $(SRC:src/%.cpp=build/omp4-async/%.o)
$(OMP4_ASYNC_OBJ): build/omp4-async/%.o: src/%.cpp
	@mkdir -p $(dir $@)
	$(CXX) $(CXXFLAGS) $(HOST_CXXFLAGS) $(OMP_FLAGS) $(BOOST_CXXFLAGS) $(ALPAKA_CXXFLAGS) -Iinclude -DALPAKA_ACC_CPU_BT_OMP4_ENABLED -DCUPLA_STREAM_ASYNC_ENABLED=1 -c $< -o $@

# cupla shared library for the OpenMP 4.0 parallel CPU backend
lib/libcupla-omp4.so: $(OMP4_SYNC_OBJ) $(OMP4_ASYNC_OBJ)
	@mkdir -p $(dir $@)
	$(CXX) $(CXXFLAGS) $(HOST_CXXFLAGS) $(OMP_FLAGS) $^ -shared -o $@
# TBB parallel blocks CPU backend with synchronous queues
TBB_SEQ_SYNC_OBJ = $(SRC:src/%.cpp=build/tbb-seq-sync/%.o)
$(TBB_SEQ_SYNC_OBJ): build/tbb-seq-sync/%.o: src/%.cpp
	@mkdir -p $(dir $@)
	$(CXX) $(CXXFLAGS) $(HOST_CXXFLAGS) $(TBB_CXXFLAGS) $(BOOST_CXXFLAGS) $(ALPAKA_CXXFLAGS) -Iinclude -DALPAKA_ACC_CPU_B_TBB_T_SEQ_ENABLED -DCUPLA_STREAM_ASYNC_ENABLED=0 -c $< -o $@

# TBB parallel blocks CPU backend with asynchronous queues
TBB_SEQ_ASYNC_OBJ = $(SRC:src/%.cpp=build/tbb-seq-async/%.o)
$(TBB_SEQ_ASYNC_OBJ): build/tbb-seq-async/%.o: src/%.cpp
	@mkdir -p $(dir $@)
	$(CXX) $(CXXFLAGS) $(HOST_CXXFLAGS) $(TBB_CXXFLAGS) $(BOOST_CXXFLAGS) $(ALPAKA_CXXFLAGS) -Iinclude -DALPAKA_ACC_CPU_B_TBB_T_SEQ_ENABLED -DCUPLA_STREAM_ASYNC_ENABLED=1 -c $< -o $@

# cupla shared library for the TBB parallel blocks CPU backend
lib/libcupla-tbb.so: $(TBB_SEQ_SYNC_OBJ) $(TBB_SEQ_ASYNC_OBJ)
	@mkdir -p $(dir $@)
	$(CXX) $(CXXFLAGS) $(HOST_CXXFLAGS) $^ $(TBB_LDFLAGS) -shared -o $@
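For completeness, user code could then link against a single backend library. Here is a hypothetical user-side fragment (the install prefix, the source file name my_app.cpp, and the assumption that the bundled alpaka headers are available next to the installed tree are all illustrative, not part of the Makefile above):

```make
# hypothetical user-side Makefile fragment: build my_app.cpp against the
# OpenMP 2.0 blocks backend, selected via the same preprocessor defines
# that were used to compile the library
CUPLA_PREFIX := /usr/local/cupla

my_app: my_app.cpp
	g++ -std=c++14 -O2 -pthread -fopenmp \
	    -I$(CUPLA_PREFIX)/include -I$(CUPLA_PREFIX)/alpaka/include \
	    -DALPAKA_ACC_CPU_B_OMP2_T_SEQ_ENABLED -DCUPLA_STREAM_ASYNC_ENABLED=1 \
	    $< -L$(CUPLA_PREFIX)/lib -lcupla-omp2-blocks \
	    -Wl,-rpath,$(CUPLA_PREFIX)/lib -o my_app
```

The -Wl,-rpath option embeds the library path so the shared library is found at runtime without setting LD_LIBRARY_PATH.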
Yes I would also go the way to create a shared or static library with cupla for each backend. IMO best way is to create a lib for each available backend e.g. libcupla_omp2b.so, libcupla_omp2t.so, libcupla_cuda.so, ...
Excellent, yes just build a couple of CMake targets for those :) They can be INTERFACE targets in case you want to skip the creation of an actual object file and provide them as header-only libs.
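A minimal sketch of what such targets could look like (the target names and directory layout are illustrative, not an actual cupla CMakeLists.txt):

```cmake
file(GLOB_RECURSE CUPLA_SOURCES src/*.cpp)
find_package(OpenMP REQUIRED)

# one compiled library per backend, mirroring the Makefile above
add_library(cupla-omp2-blocks SHARED ${CUPLA_SOURCES})
target_include_directories(cupla-omp2-blocks PUBLIC include alpaka/include)
target_compile_definitions(cupla-omp2-blocks PUBLIC ALPAKA_ACC_CPU_B_OMP2_T_SEQ_ENABLED)
target_link_libraries(cupla-omp2-blocks PUBLIC OpenMP::OpenMP_CXX)

# header-only alternative: an INTERFACE target compiles nothing itself
# and only propagates include paths and defines to its consumers
add_library(cupla-headers INTERFACE)
target_include_directories(cupla-headers INTERFACE include alpaka/include)
```

An installable package would additionally need BUILD_INTERFACE/INSTALL_INTERFACE generator expressions on the include paths and an exported package config, which is presumably what the upcoming alpaka CMake work would provide.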
@psychocoderHPC If we install cupla, should we also install the internal alpaka version?
Solved by #203.