arcs-skku / emdc_llvm Goto Github PK


This project forked from intel/llvm


Intel staging area for llvm.org contribution. Home for Intel LLVM-based projects.

emdc_llvm's People

Contributors

Stargazers

Forkers

jaeguklee afront yunchaechoi jeongjaek jeongjaekimarcs emdc-os

emdc_llvm's Issues

Difference between Action.zip and Action_hw.zip

There are two files: Action.zip and Action_hw.zip, but only Action.zip is used in test.sh. What is the difference between the two files?

Reason for the choice of benchmark candidates in Eager Free

Is there a reason why these particular benchmark candidates were chosen for eager free?

Question regarding Nsight profiler result

According to "benchmarks/cuda/src/cuda.nvprof", cudaFree takes up most of the execution time. Is this because the time includes the cudaMemsetAsync calls after the initial cudaMemset call?
Is there a way to estimate the execution time for the asynchronous calls and cudaFree call?

Scheduling policy

I think there is some performance degradation to each task because of the interference caused by task co-location. Is this okay?

Significance of 1GB for Allocation

(as well as benchmarks/full_interference/src/main.cu)

For these files, 1GB is used for allocation. Is there a reason why this size was used and not some other size like 2GB or 4GB (like in #14)?

Confused about polybench version

In benchmarks/pt/polybench-gpu-1.0/README.txt, there is a sentence: "The codes are based on PolyBench 2.0 (with the exception of convolution which isn't part of PolyBench 2.0)."
I'm confused by the mismatch between the directory name (polybench-gpu-1.0) and this sentence (PolyBench 2.0).
Which PolyBench version does this program use?

The codes are based on PolyBench 2.0 (with the exception of convolution which isn't part of PolyBench 2.0).

Unused file in cuda_docker

In EMDC_llvm/devops/openwhisk/cuda_docker, there are several files that seem unnecessary (e.g., hello.c, index.html, ...).
Do they serve another purpose?

Eager Free insertion

In the benchmark code under eager_sched/eager_free/benchmark, does the programmer need to insert pre_bemps_free manually?

Fixed values in 2Dconvolution

In benchmarks/pt/polybench-gpu-1.0/CUDA/2DCONV/2DConvolution.cu, there are fixed values (c11 = +0.2, c21 = +0.5, ...).
Is there a reason for choosing these particular values?

c11 = +0.2;  c21 = +0.5;  c31 = -0.8;
c12 = -0.3;  c22 = +0.6;  c32 = -0.9;
c13 = +0.4;  c23 = +0.7;  c33 = +0.10;

Purpose of using a custom low-level allocator instead of `cudaMalloc`

In MMAP, a memory allocator was made using the Low-Level GPU Virtual Memory Management API. Can I ask why a custom low-level allocator was created and used in sync_main.cu instead of using the standard cudaMalloc allocator?
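
One practical difference worth keeping in mind: with the low-level virtual memory management API, the size passed to cuMemCreate has to be a multiple of the granularity reported by cuMemGetAllocationGranularity, whereas cudaMalloc accepts arbitrary sizes. The host-only sketch below shows the rounding step; round_up and kAssumedGranularity are illustrative names, and the 2 MiB value is an assumption, not a value taken from the repository.

```cpp
#include <cstddef>

// Round `size` up to the next multiple of `granularity` (granularity > 0).
constexpr std::size_t round_up(std::size_t size, std::size_t granularity) {
    return ((size + granularity - 1) / granularity) * granularity;
}

// Assumption for illustration only: a 2 MiB granularity. The real value
// must be queried at run time with cuMemGetAllocationGranularity.
constexpr std::size_t kAssumedGranularity = 2u * 1024u * 1024u;
```

Under that assumed granularity, a 1-byte request would occupy 2 MiB of physical memory, and a request of 2 MiB plus one byte would occupy 4 MiB.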

Array size

In devops/openwhisk/cuda_template/main.cu, there is an array whose size is sizeof(float) * 1024 * 1024.
Is there a reason for choosing this size?

#include <stdio.h>
#include <stdlib.h>
#include <cuda.h>

__global__ void hello(float* input){

	input[threadIdx.x] = 0.0f;
	printf("Hello OW CUDA!");
	
}


int main() {
	float *h_arr, *d_arr;
	size_t alloc_size = sizeof(float)*1024*1024;
	
	h_arr = (float*) malloc(alloc_size);
	cudaMalloc((void**)&d_arr, alloc_size);

	cudaMemcpy(d_arr, h_arr, alloc_size, cudaMemcpyHostToDevice);

	hello<<<1,1>>>(d_arr);
	cudaDeviceSynchronize();

	cudaMemcpy(h_arr,d_arr, alloc_size, cudaMemcpyDeviceToHost);

	cudaFree(d_arr);
	free(h_arr);

	printf("OW CUDA!");
	//std::cout << "{\"msg\": \"The results are correct!\"}";

	return 0;
}
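
For reference, the arithmetic behind alloc_size can be checked on the host. The sketch below assumes a 4-byte float (true on the platforms CUDA targets); kElems and alloc_bytes are illustrative names, not identifiers from the repository.

```cpp
#include <cstddef>

// Number of float elements in the template's buffer: 1024 * 1024.
constexpr std::size_t kElems = 1024u * 1024u;

// Byte size computed the same way as alloc_size in main.cu.
constexpr std::size_t alloc_bytes() { return sizeof(float) * kElems; }

// Assumption: float is 4 bytes, so the buffer is 4 MiB.
static_assert(sizeof(float) == 4, "sketch assumes a 4-byte float");
static_assert(alloc_bytes() == 4u * 1024u * 1024u, "4 MiB expected");
```

Because the kernel is launched as hello<<<1,1>>>, threadIdx.x is always 0, so only the first element of the 4 MiB buffer is written; the two cudaMemcpy calls still transfer the whole buffer.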

Possible UM_page options

In docs/gpu/UVM/UM_page, are ZC, GP, and DP the only possible UM_page options?

Usage of cudaDeviceSetLimit(cudaLimitStackSize, 0);

In EMDC_llvm/benchmarks/pt/polybench-gpu-1.0/CUDA/2DCONV/2DConvolution.cu, line 187, why does the stack size have to be limited?

187| cudaDeviceSetLimit(cudaLimitStackSize, 0);

UM based task

Is the only task that was eager-launched written using UM?

Purpose of mismatch conditional in sycl_test

In sycl_test, there is a conditional that checks whether there is a mismatch with the buffer. In which scenarios can a mismatch occur? How likely are these scenarios?

Using 2GB instead of 4GB in sa.cu

GB_2 and GB_4 are both defined, but only GB_2 is used. Is there any significance to using 2GB rather than some other allocation size in this benchmark?

#define GB_2 2147483648
#define GB_4 4294967296
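
A side note on the two macros: both values exceed what a 32-bit int can hold, so the bare literals take a wider integer type, whereas building the same value from int factors such as 2 * 1024 * 1024 * 1024 would overflow. A minimal host-side check, with the hypothetical names kGB2, kGB4, and floats_in standing in for the macros and their use:

```cpp
#include <cstdint>
#include <limits>

// Same values as GB_2 and GB_4, written as explicit 64-bit shifts.
constexpr std::uint64_t kGB2 = std::uint64_t{2} << 30;  // 2 GiB
constexpr std::uint64_t kGB4 = std::uint64_t{4} << 30;  // 4 GiB

static_assert(kGB2 == 2147483648ULL, "matches GB_2");
static_assert(kGB4 == 4294967296ULL, "matches GB_4");

// Both are larger than a 32-bit signed int can hold...
static_assert(kGB2 > static_cast<std::uint64_t>(std::numeric_limits<std::int32_t>::max()),
              "GB_2 exceeds INT32_MAX");
// ...and GB_4 does not fit in a 32-bit unsigned value either.
static_assert(kGB4 > std::numeric_limits<std::uint32_t>::max(),
              "GB_4 exceeds UINT32_MAX");

// Number of float elements that fit in a buffer of the given byte size.
constexpr std::uint64_t floats_in(std::uint64_t bytes) { return bytes / sizeof(float); }
```

Assuming a 4-byte float, a GB_2 buffer holds 536,870,912 elements, so element indices still fit in a signed 32-bit int, while byte offsets into it do not.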

In EMDC_llvm/benchmarks/full_interference/run.sh

The purpose of 'cudaMalloc.exe' is unclear because it is a binary. If the source or a Makefile for cudaMalloc.exe exists, it would be more helpful to be able to read the code.

C++ standard version

The Makefile in benchmark/MMAP/src compiles the programs with -std=c++14.
Is there a reason for requiring C++14? Would compiling with a lower standard cause problems?

vector_example: vector_main.cpp cuvector.cpp
	$(NVCC) $^ -o $@ -lcuda -std=c++14

sync_example: sync_main.cu
	$(NVCC) $^ -o $@ -lcuda -std=c++14
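
One hedged way to explore this is to compile a small probe with different -std= flags and see which standard the compiler reports. The helpers standard_name and current_standard below are illustrative names, not part of the repository; they map the value of the standard __cplusplus macro to a label.

```cpp
#include <string>

// Map a __cplusplus value to a readable standard name.
// The thresholds are the values fixed by the ISO C++ standards.
inline std::string standard_name(long v) {
    if (v >= 202002L) return "C++20 or later";
    if (v >= 201703L) return "C++17";
    if (v >= 201402L) return "C++14";
    if (v >= 201103L) return "C++11";
    return "pre-C++11";
}

// Standard the current translation unit is being compiled as.
inline std::string current_standard() { return standard_name(__cplusplus); }
```

Whether the sources build with an older standard depends on the language features they use; for instance, generic lambdas and std::make_unique require C++14. Rebuilding with -std=c++11 and reading the resulting errors would show which features, if any, the code depends on.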
