Welcome to the HiPerC repository.
This stub branch (nist-pages) is only used to generate and serve the HiPerC Docs: you're probably looking for main.
High Performance Computing Strategies for Boundary Value Problems
Home Page: https://pages.nist.gov/hiperc/en/latest/index.html
.cpp for .c extensions on affected sources
Resulting dataset is blank, giving extremely large wrss. Think about how data transfers are undertaken (addresses and pointers in cudaMemcpy), how the mask is constructed, and look for off-by-one errors.
Implement using
CUDA is 2× faster
CSV output files are plain text, and therefore large. Only write the last one for scrutiny -- PNG is fine for the rest.
A constant source along the left wall alone is a simple test condition with an analytical solution. However, that solution is trivial and will not effectively exercise accelerator hardware. Introduce modest complexity.
Implement using
GPU array handling is inefficient: arrays get cudaMalloc'd and cudaFree'd every time a function is called. Improve flow by moving these operations into gpu_init and gpu_finalize functions, called once at the beginning and once at the end of main().
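The allocate-once pattern described above can be sketched in plain C. This is a host-side analogue only, not the repository's code: the struct work_arrays, its field names, and the functions work_init and work_finalize are hypothetical, and calloc/free stand in for the cudaMalloc/cudaFree calls that a CUDA build would use.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical container for work arrays that persist across solver calls. */
struct work_arrays {
    double* conc_old;
    double* conc_new;
    double* conc_lap;
    size_t  count;
};

/* Allocate every work array once, at the start of main().
 * Returns 0 on success, -1 on failure.
 * Assumption: in a CUDA build each calloc would become a cudaMalloc. */
int work_init(struct work_arrays* w, int nx, int ny)
{
    assert(w != NULL && nx > 0 && ny > 0);
    w->count = (size_t)nx * (size_t)ny;
    w->conc_old = calloc(w->count, sizeof(double));
    w->conc_new = calloc(w->count, sizeof(double));
    w->conc_lap = calloc(w->count, sizeof(double));
    if (!w->conc_old || !w->conc_new || !w->conc_lap) {
        /* free(NULL) is a no-op, so partial failures are safe to unwind */
        free(w->conc_old);
        free(w->conc_new);
        free(w->conc_lap);
        w->conc_old = w->conc_new = w->conc_lap = NULL;
        return -1;
    }
    return 0;
}

/* Release every work array once, at the end of main(). */
void work_finalize(struct work_arrays* w)
{
    free(w->conc_old);
    free(w->conc_new);
    free(w->conc_lap);
    w->conc_old = w->conc_new = w->conc_lap = NULL;
    w->count = 0;
}
```

Functions called inside the time-stepping loop would then take a pointer to the struct rather than allocating their own scratch space, so allocation cost is paid once per run instead of once per call.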
Implement using
wrss: 0.2% means success, 7% means failure.
wrss, store in an array, then sum the array)
boundaries, convolution, and solve from one function in main, not each separately.
Per @wd15:
README leaves it unclear who this code benefits, and how it's intended to be used. Is it for a specific software? What's the audience? Why bother?
make[1]: Entering directory '/home/amj/projects/phasefield-accelerator-benchmarks/cpu/tbb'
g++ -O2 -Wall -pedantic -std=c++11 -I../ -c boundaries.cpp
In file included from /opt/moose/tbb/include/tbb/tbb.h:68:0,
from boundaries.cpp:10:
/opt/moose/tbb/include/tbb/pipeline.h:328:74: warning: ‘template struct std::has_trivial_copy_constructor’ is deprecated [-Wdeprecated-declarations]
template struct tbb_trivially_copyable { enum { value = std::has_trivial_copy_constructor::value }; };
^
In file included from /opt/moose/gcc-5.3.0/include/c++/5.3.0/bits/move.h:57:0,
from /opt/moose/gcc-5.3.0/include/c++/5.3.0/bits/stl_pair.h:59,
from /opt/moose/gcc-5.3.0/include/c++/5.3.0/bits/stl_algobase.h:64,
from /opt/moose/gcc-5.3.0/include/c++/5.3.0/memory:62,
from /opt/moose/tbb/include/tbb/tbb_stddef.h:421,
from /opt/moose/tbb/include/tbb/aligned_space.h:24,
from /opt/moose/tbb/include/tbb/tbb.h:35,
from boundaries.cpp:10:
/opt/moose/gcc-5.3.0/include/c++/5.3.0/type_traits:1389:12: note: declared here
struct has_trivial_copy_constructor
^
g++ -O2 -Wall -pedantic -std=c++11 -I../ -c discretization.cpp
In file included from /opt/moose/tbb/include/tbb/tbb.h:68:0,
from discretization.cpp:10:
/opt/moose/tbb/include/tbb/pipeline.h:328:74: warning: ‘template struct std::has_trivial_copy_constructor’ is deprecated [-Wdeprecated-declarations]
template struct tbb_trivially_copyable { enum { value = std::has_trivial_copy_constructor::value }; };
^
In file included from /opt/moose/gcc-5.3.0/include/c++/5.3.0/bits/move.h:57:0,
from /opt/moose/gcc-5.3.0/include/c++/5.3.0/bits/stl_pair.h:59,
from /opt/moose/gcc-5.3.0/include/c++/5.3.0/bits/stl_algobase.h:64,
from /opt/moose/gcc-5.3.0/include/c++/5.3.0/memory:62,
from /opt/moose/tbb/include/tbb/tbb_stddef.h:421,
from /opt/moose/tbb/include/tbb/aligned_space.h:24,
from /opt/moose/tbb/include/tbb/tbb.h:35,
from discretization.cpp:10:
/opt/moose/gcc-5.3.0/include/c++/5.3.0/type_traits:1389:12: note: declared here
struct has_trivial_copy_constructor
^
g++ -O2 -Wall -pedantic -std=c++11 -I../ -c ../mesh.c
g++ -O2 -Wall -pedantic -std=c++11 -I../ -c ../output.c
g++ -O2 -Wall -pedantic -std=c++11 -I../ -c ../timer.c
g++ -O2 -Wall -pedantic -std=c++11 -I../ boundaries.o discretization.o mesh.o output.o timer.o ../main.c -o diffusion -lm -lpng -ltbb
make[1]: Leaving directory '/home/amj/projects/phasefield-accelerator-benchmarks/cpu/tbb'
...in the readme, in a manual, something (please don't make me dig through the source...)
Some (many?) GPUs are, in fact, two GPUs stuck together. Figure out how to use both.
Implement using
cuda_kernel<<<num_blocks, threads_per_block, shared_array_size>>> construct
int, ceil(), and floor()
wrss==0.0029?
printf array locations and sizes; recompile with -g then use cuda-memtest and backtrace
Makefile flags and function locations: main() should be in a .c file, CUDA functions in .cu, and objects built with -dc flags
Implement using:
Convolution hard-codes [1, nx-1] rather than [nm, nx-nm] (for example).
nm
nm/2 in place of hard-coded 1 in loops
Implement using
- [ ] OpenACC
- [ ] OpenCL
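The convolution bounds raised above, with nm/2 replacing the hard-coded 1, might look like the following sketch. It assumes a square mask with an odd edge length nm, row-major arrays, and a hypothetical function name and signature (convolve); none of these are confirmed by the source.

```c
#include <assert.h>

/* Apply a square nm-by-nm mask to conc, writing the result to conv.
 * Assumed layout: row-major arrays with nx columns and ny rows.
 * Only points at least nm/2 cells from every edge are updated, so the
 * loop bounds adapt to the mask size instead of assuming a 3x3 stencil. */
void convolve(const double* conc, double* conv, const double* mask,
              int nx, int ny, int nm)
{
    assert(nm > 0 && nm % 2 == 1);  /* odd mask, so nm/2 is the half-width */
    assert(nx >= nm && ny >= nm);
    const int h = nm / 2;

    for (int j = h; j < ny - h; j++) {
        for (int i = h; i < nx - h; i++) {
            double value = 0.0;
            for (int mj = 0; mj < nm; mj++) {
                for (int mi = 0; mi < nm; mi++) {
                    value += mask[mj * nm + mi]
                           * conc[(j + mj - h) * nx + (i + mi - h)];
                }
            }
            conv[j * nx + i] = value;
        }
    }
}
```

With nm = 3 the bounds reduce to the familiar [1, nx-1) range; a 5x5 mask would skip two cells on each side without any change to the loop code.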
Per @amjokisaari:
dx=dy=h and how!
You have the "work in progress" section that tells the reader about the different sections, but no Makefile in the directory. Would be nice if there is some brief description that there are multiple versions in the different directories and they are made separately. I know that's kind of picky, but my first instinct is to look for a makefile in the top level directory...
edit: also, the "Basic Algorithm" section is confusing. A little text telling the reader that you implement the same Basic Algorithm in each of the different directories would clear that up.
The makefile has acc, but the actual directory is openacc. No compiling here :(
[amj][~/projects/phasefield-accelerator-benchmarks/gpu]> make
make -C acc
make[1]: *** acc: No such file or directory. Stop.
Makefile:7: recipe for target 'acc/diffusion' failed
make: *** [acc/diffusion] Error 2
It looks like the initial condition is two high-concentration lines on the edge of the computational domain, and concentration diffuses toward the center (see attached). This IC combined with the black-and-white color scheme made the first output image look like the code had errored. Perhaps choose a different color scheme (viridis?) and maybe consider a different initial condition. Will it serve your purposes to put a blob of mass in the center of the domain and let it diffuse outward?
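One way to set up the centered-blob initial condition suggested above is sketched here. It is an illustration under assumptions, not the project's code: the function name blob_initial_condition, the row-major layout, and the sharp circular profile (rather than, say, a smooth Gaussian) are all choices made for this example.

```c
#include <assert.h>

/* Fill a row-major nx-by-ny grid with a circular blob:
 * conc = c_in within radius r (measured in cells) of the domain center,
 * conc = c_out everywhere else. */
void blob_initial_condition(double* conc, int nx, int ny, double r,
                            double c_in, double c_out)
{
    assert(conc != 0 && nx > 0 && ny > 0 && r >= 0.0);
    const double cx = 0.5 * (nx - 1);  /* center in index coordinates */
    const double cy = 0.5 * (ny - 1);

    for (int j = 0; j < ny; j++) {
        for (int i = 0; i < nx; i++) {
            const double dx = i - cx;
            const double dy = j - cy;
            /* compare squared distances to avoid needing sqrt() */
            conc[j * nx + i] = (dx * dx + dy * dy <= r * r) ? c_in : c_out;
        }
    }
}
```

Starting from this field, mass diffuses outward from the center, so the first output image shows an obvious feature instead of thin lines at the domain edge.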
Per @amjokisaari:
A, B, C) to matsci labels (oldGrid, newGrid, convGrid)
dataX arrays
Per @amjokisaari:
c gets computed
Implement using
Implement for
- [ ] serial
Implement using
Performing 'make run' gives me some make command info, but then it seems like the code is just hanging (it's not, but my data is being written out ...somewhere... I believe to the subdirectories). It would be nice to:
get additional console output during execution so I know how far along the run is and
the readme in the subdirectory should tell the reader where their output is going.
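A minimal sketch of the progress reporting requested above follows. The helper names (should_report, format_progress), the message wording, and the output directory string are hypothetical; the real code may organize its output differently.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: returns 1 when a progress line is due,
 * i.e. every `interval` steps and at the final step. */
int should_report(int step, int steps, int interval)
{
    assert(steps > 0 && interval > 0);
    return (step % interval == 0) || (step == steps);
}

/* Format one progress line into buf, including where output files go.
 * Returns the value of snprintf (characters that would be written). */
int format_progress(char* buf, size_t len, int step, int steps,
                    const char* outdir)
{
    assert(buf != NULL && steps > 0 && outdir != NULL);
    return snprintf(buf, len, "step %d of %d (%d%%), output in %s",
                    step, steps, (100 * step) / steps, outdir);
}
```

Inside the time-stepping loop, a caller could test should_report and, when it returns 1, print the formatted line with puts so the user can see both how far the run has progressed and where to find the files.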