Run the GPU applications with both ariel and MMIO balar.
- Two SST installations
- One with MMIO balar
- One with ariel balar
- Need to download gpu-app-collection from accel-sim repo
- Compile with both CUDA 9.1 and 10.1
- Need to download the NVBit tool into the `cuda_api_tracer/` folder
- Need a Python environment with the packages in `requirements.txt` installed; could use `virtualenv` to create a sandbox
- `get_traces.py` script
- Need to `source gpu-app-collection/src/setup_environment` before calling
- Runs with CUDA 10.1 and GCC 5.5.0
- But a higher GCC version should also work
- Need to
- `run_traces.py` script
- Need to `source setup_environment` before calling
- Need CUDA 9.1 and GCC 4.9.2 for the ariel balar to work
- MMIO balar works with CUDA 10.1 and GCC 5.5.0
- Need to
# Get trace
python3 get_traces.py -B [BENCHMARKS]
# Run traces
python3 run_traces.py -B [BENCHMARKS]
# Example
python3 run_traces.py -B GPU_Microbenchmark,rodinia_2.0-ft --original_balar_sst_exe=~/SST-Integration/sstcore-11.0.0-release/bin/sst
# Collecting stats
## MMIO example
python3 convert_results.py -B GPU_Microbenchmark,rodinia_2.0-ft --statfile=mmio_stats.out --gpgpusim_statfile=gpgpu_inst_stats_mmio.log -o mmio_stats.json
## Ariel/Original balar example
python3 convert_results.py -B GPU_Microbenchmark,rodinia_2.0-ft --statfile=original_stats.out --gpgpusim_statfile=gpgpu_inst_stats_original.log -o original_stats.json
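The three steps above (get traces, run traces, collect stats) could be chained from one small driver. The following is only a sketch: `build_commands` is a hypothetical helper, it uses only the flags and file names that appear in the commands above, and it builds the command lines without running them. It also does not switch CUDA/GCC environments between steps, which the notes say is required. The path is expanded with `os.path.expanduser` because shells generally do not expand `~` when it follows `--flag=`.

```python
import os
import shlex


def build_commands(benchmarks, original_sst_exe=None):
    """Return the three pipeline steps as argv lists (nothing is executed)."""
    bench_arg = ",".join(benchmarks)

    run_cmd = ["python3", "run_traces.py", "-B", bench_arg]
    if original_sst_exe is not None:
        # Expand "~" ourselves instead of relying on the shell.
        sst = os.path.expanduser(original_sst_exe)
        run_cmd.append(f"--original_balar_sst_exe={sst}")

    return [
        ["python3", "get_traces.py", "-B", bench_arg],
        run_cmd,
        # File names copied from the MMIO example above.
        ["python3", "convert_results.py", "-B", bench_arg,
         "--statfile=mmio_stats.out",
         "--gpgpusim_statfile=gpgpu_inst_stats_mmio.log",
         "-o", "mmio_stats.json"],
    ]


if __name__ == "__main__":
    for cmd in build_commands(["GPU_Microbenchmark", "rodinia_2.0-ft"]):
        print(shlex.join(cmd))
```

Each list can be passed to `subprocess.run` once the matching environment has been sourced.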
sst testBalar-simple.py --model-options='-c ariel-gpu-v100.cfg -v -x run -t trace/cuda_calls.trace'
- Use sst 10.1 env setup script
~/SST-Integration/sstcore-11.1.0-release/bin/sst cuda-test.py --model-options='-c ariel-gpu-v100.cfg -v -x vectorAdd/vectorAdd'
- Use the other sst installation
- Need to use CUDA 9.1?
- NVBit tool should be compiled with nvcc version >= 10.2
- `tracer_tool` setup script
- Test apps are compiled with CUDA 10.1
- `test_app` setup script
- Need both 9.1 and 10.1 apps
- 9.1 for the simulation run
- 10.1 for the tracer tool run
- Use GPU apps from accel-sim
- Need an automatic script to do the work
- Python script launch cannot find cudart for the app
- Rodinia backprop benchmark cannot be traced; kernel parameter is null
- Rodinia 2.0 Backprop
- Second kernel has an issue
- Due to alignment setting
- Irregular argument sizes need to be aligned properly?
- Implement this in the testcpu
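To make the alignment question concrete, here is a minimal sketch of packing kernel arguments into a parameter buffer so that each argument starts at a multiple of its own alignment. It assumes the backprop issue comes from arguments being placed back to back without padding, which the notes only suspect. `align_up` and `pack_arguments` are hypothetical names, not part of balar or the testcpu.

```python
def align_up(offset, alignment):
    """Round offset up to the next multiple of alignment."""
    return (offset + alignment - 1) // alignment * alignment


def pack_arguments(args):
    """Compute buffer offsets for a list of (size, alignment) pairs in bytes.

    Returns (offsets, total_size).
    """
    offsets = []
    cursor = 0
    for size, alignment in args:
        cursor = align_up(cursor, alignment)  # insert padding if needed
        offsets.append(cursor)
        cursor += size
    return offsets, cursor


if __name__ == "__main__":
    # Pointer, int, pointer, then a 12-byte struct with 4-byte alignment.
    print(pack_arguments([(8, 8), (4, 4), (8, 8), (12, 4)]))
    # -> ([0, 8, 16, 24], 36)
```

Without the padding step, the second pointer would land at offset 12 instead of 16.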
- Link the python run script and cfg file for SST in run script?
- Create a clean script to clear all tmp files and option to not dump test data?
- Due to disk space concern
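A cleanup script along these lines might look like the sketch below. The function name and the file patterns are placeholders, since the notes do not say which temporary files the runs leave behind. It defaults to a dry run so nothing is deleted by accident.

```python
import fnmatch
import os


def clean_tmp_files(root, patterns, dry_run=True):
    """Find files under root whose names match any pattern.

    Deletes them unless dry_run is True. Returns the sorted matched paths.
    """
    matched = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if any(fnmatch.fnmatch(name, p) for p in patterns):
                path = os.path.join(dirpath, name)
                matched.append(path)
                if not dry_run:
                    os.remove(path)
    return sorted(matched)


if __name__ == "__main__":
    # Placeholder patterns: adjust to whatever the runs actually produce.
    for path in clean_tmp_files(".", ["*.tmp"], dry_run=True):
        print("would remove", path)
```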
- In hw traces, save traces for multiple CUDA versions?
- Parallelize the run traces
- Minimize the Python requirements file? (IPython not needed, just the YAML file?)
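For the "parallelize the run traces" item, one possible approach is to launch one `run_traces.py` process per benchmark from a thread pool. This sketch assumes the per-benchmark runs are independent and do not write to the same output files, which has not been checked. `run_one` and `run_parallel` are hypothetical helpers.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def run_one(benchmark):
    """Run run_traces.py for a single benchmark and return its exit code."""
    cmd = ["python3", "run_traces.py", "-B", benchmark]
    return subprocess.run(cmd).returncode


def run_parallel(benchmarks, max_workers=2, runner=run_one):
    """Run each benchmark in its own worker; return {benchmark: result}."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves input order, so zip pairs results correctly.
        results = list(pool.map(runner, benchmarks))
    return dict(zip(benchmarks, results))


if __name__ == "__main__":
    codes = run_parallel(["GPU_Microbenchmark", "rodinia_2.0-ft"])
    for name, code in codes.items():
        print(name, "exit code", code)
```

Threads are enough here because each worker only waits on a child process; `max_workers` should be kept small if each simulation is memory-heavy.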