polyarch / gem-forge-framework

License: BSD 2-Clause "Simplified" License

Shell 14.88% Makefile 79.86% Python 5.26%

gem-forge-framework's People

jlsbz footprintss zongwuwang rashidagp snu-arc liangy001 pongstorn-amd reyad010

gem-forge-framework's Issues

Are there any descriptions of the Stream ISA?

Hello,

I'm testing a simple vector addition with indirect access, and I want to understand how to use the design in detail.
Are there any documents or descriptions of the Stream ISA?
I have some questions about instructions such as stream_config and stream_input.
These instructions seem to define streams and configure the stream table, and I want to know how they configure it.

Thank you.

What are gem5 simulation options for streaming engine in SSP?

Hello,

I want to try out the streaming engine introduced in the paper "Stream-based Memory Access Specialization for General Purpose Processors". I am running experiments with what I think is the right configuration, but I am new to gem5 and am not sure whether I am doing it correctly. If possible, could someone provide the gem5 simulation options needed to reproduce the engine introduced in that paper?

Thanks a lot!
Haeun

How can I evaluate a new benchmark with SSP and Stream-float?

Hello,

I want to run a simulation (e.g. vec_add) on your implementation. However, the framework does not seem to recognize the streams in this code: numLoadElementsAllocated and numStoreElementsAllocated do not appear in the simulation results. The code is below. My question is: how can I have the arrays (A, B, and C) treated as streams?

static const uint64_t file_size = 65536; 
//static const uint64_t file_size = 33554432; 
//static const uint64_t file_size = 16777216;

__attribute__((noinline)) static void vector_addition_host(Value* A, Value* B, Value* C) {
  #pragma omp parallel for schedule(static) firstprivate(A, B, C)
    for (uint64_t i = 0; i < file_size; i += 16) {
      __m512i valA = _mm512_loadu_epi32(A + i);
      __m512i valB = _mm512_loadu_epi32(B + i);
      __m512i valC = _mm512_add_epi32(valA, valB);
      _mm512_storeu_epi32(C + i, valC);
    }
}

int main(int argc, char **argv) {

  int numThreads = 1;
  if (argc == 2) {
    numThreads = atoi(argv[1]);
  }
  printf("Number of Threads: %d.\n", numThreads);
  omp_set_dynamic(0);
  omp_set_num_threads(numThreads);
  omp_set_schedule(omp_sched_static, 0);

  // Create an input file with arbitrary data.
  Value* A = (Value*) aligned_alloc(CACHE_LINE_SIZE, file_size * sizeof(Value));
  Value* B = (Value*) aligned_alloc(CACHE_LINE_SIZE, file_size * sizeof(Value));
  Value* C = (Value*) aligned_alloc(CACHE_LINE_SIZE, file_size * sizeof(Value));

#ifdef GEM_FORGE
  gf_detail_sim_start();
#endif

#ifdef WARM_CACHE
  WARM_UP_ARRAY(A, file_size);
  WARM_UP_ARRAY(B, file_size);
  WARM_UP_ARRAY(C, file_size);
  // Initialize the threads.
#pragma omp parallel for schedule(static) firstprivate(A)
  for (int tid = 0; tid < numThreads; ++tid) {
    volatile Value x = *A;
  }
#endif

#ifdef GEM_FORGE
  gf_reset_stats();
#endif

  vector_addition_host(A, B, C);

#ifdef GEM_FORGE
  gf_detail_sim_end();
  exit(0);
#endif

  free(A);
  free(B);
  free(C);

  return 0;
}

Thank you for your attention and I'm looking forward to your reply.
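
For comparison, the same kernel can be written as a plain scalar loop, sketched below. This is only something to try, on the unverified assumption that a compiler-based stream pass handles a simple affine loop more readily than hand-written AVX-512 intrinsics; nothing in this post confirms how the framework detects streams. The `Value` typedef as `int32_t` and the function name `vector_addition_scalar` are assumptions made for illustration.

```c
#include <stdint.h>

/* Assumption: Value is a 32-bit integer, matching the epi32 intrinsics above. */
typedef int32_t Value;

/* Scalar form of the kernel: C[i] = A[i] + B[i] for i in [0, n).
 * Each array is read or written with a unit-stride (affine) pattern. */
__attribute__((noinline)) static void vector_addition_scalar(
    const Value *A, const Value *B, Value *C, uint64_t n) {
  for (uint64_t i = 0; i < n; ++i) {
    C[i] = A[i] + B[i];
  }
}
```

Calling `vector_addition_scalar(A, B, C, file_size)` in place of `vector_addition_host(A, B, C)` computes the same result on one thread, which makes it possible to check whether the missing statistics are tied to the intrinsics or to something else in the setup.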

Request for guidance on trace-based simulation in gem-forge

Hello,

I'm exploring the trace-based simulation feature in gem-forge and have encountered some difficulties. The examples provided in the driver folder predominantly focus on execution-based simulation. No trace or TDG files are being generated in these examples, and removing the "--fake-trace" option would disrupt the flow. Could you kindly provide some instructions or examples on how to try trace-based simulation using gem-forge?

Thank you for your assistance!

panic: More than 24 cache blocks for one stream element

Hello,

I'm still running simulations on your framework, and I hit the following error message:
panic More than 24 cache blocks for one stream element, address (xxxxx) size 1536.

Why does this happen (that is, why can't dim_vector be higher than 384?), and how can I fix it?
Below is the code:

const uint64_t dim_vector = 385;
for (uint64_t i = 0; i < 10; i++) {
  for (uint64_t j = 0; j < dim_vector; j++) {
    C[idx[i] + j] = A[idx[i] + j] + B[idx[i] + j];
  }
}

Thank you.
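
The arithmetic behind the reported size can be sketched as follows. The reported size of 1536 bytes equals 24 × 64, so assuming 64-byte cache lines and 4-byte array elements (neither is stated in this post), 384 elements fill exactly 24 blocks when the start address is line-aligned. The limit of 24 comes from the panic message. The helper name `blocks_spanned` and both constants are illustrative, not taken from the framework's source.

```c
#include <stdint.h>

/* Assumed cache line size in bytes; not confirmed by the post. */
#define CACHE_LINE_BYTES 64u
/* Limit quoted in the panic message. */
#define MAX_BLOCKS_PER_ELEMENT 24u

/* Number of cache blocks touched by the byte range [addr, addr + size). */
static uint64_t blocks_spanned(uint64_t addr, uint64_t size) {
  if (size == 0) {
    return 0;
  }
  uint64_t first = addr / CACHE_LINE_BYTES;
  uint64_t last = (addr + size - 1) / CACHE_LINE_BYTES;
  return last - first + 1;
}
```

Under these assumptions, `blocks_spanned(0, 384 * 4)` is 24, while `blocks_spanned(0, 385 * 4)` is 25. A 1536-byte range that does not start on a line boundary also touches 25 blocks, so alignment of `idx[i]` may matter as well as the length.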
