gem-forge-framework's People

Contributors

seanzw

gem-forge-framework's Issues

Are there any descriptions of the Stream ISA?

Hello,

I'm testing a simple vector addition with indirect access, and I would like to know how to use the design in detail.
Are there any documents or descriptions of the Stream ISA?
I have some questions about instructions such as stream_config, stream_input, etc.
These instructions seem to define streams and configure the stream table, and I want to know exactly how they configure it.

Thank you.

What are the gem5 simulation options for the streaming engine in SSP?

Hello,

I want to try out the streaming engine introduced in the paper "Stream-based Memory Access Specialization for General Purpose Processors". I am running experiments with what I think is the right configuration, but I am new to gem5, so I am not sure whether I am doing the right thing. If possible, could someone provide the gem5 simulation options so that I can reproduce the engine introduced in that paper?

Thanks a lot!
Haeun

How can I evaluate a new benchmark with SSP and Stream-float?

Hello,

I want to run a simulation (e.g. vec_add) on your implementation. However, the framework does not seem to recognize the streams in this code: neither numLoadElementsAllocated nor numStoreElementsAllocated appears in the simulation results. The code is below. My question is, how can I get the arrays (A, B, and C) recognized as streams?

static const uint64_t file_size = 65536; 
//static const uint64_t file_size = 33554432; 
//static const uint64_t file_size = 16777216;

__attribute__((noinline)) static void vector_addition_host(Value* A, Value* B, Value* C) {
#pragma omp parallel for schedule(static) firstprivate(A, B, C)
  for (uint64_t i = 0; i < file_size; i += 16) {
    __m512i valA = _mm512_loadu_epi32(A + i);
    __m512i valB = _mm512_loadu_epi32(B + i);
    __m512i valC = _mm512_add_epi32(valA, valB);
    _mm512_storeu_epi32(C + i, valC);
  }
}

int main(int argc, char **argv) {

  int numThreads = 1;
  if (argc == 2) {
    numThreads = atoi(argv[1]);
  }
  printf("Number of Threads: %d.\n", numThreads);
  omp_set_dynamic(0);
  omp_set_num_threads(numThreads);
  omp_set_schedule(omp_sched_static, 0);

  // Allocate cache-line-aligned input and output arrays.
  Value* A = (Value*) aligned_alloc(CACHE_LINE_SIZE, file_size * sizeof(Value));
  Value* B = (Value*) aligned_alloc(CACHE_LINE_SIZE, file_size * sizeof(Value));
  Value* C = (Value*) aligned_alloc(CACHE_LINE_SIZE, file_size * sizeof(Value));

#ifdef GEM_FORGE
  gf_detail_sim_start();
#endif

#ifdef WARM_CACHE
  WARM_UP_ARRAY(A, file_size);
  WARM_UP_ARRAY(B, file_size);
  WARM_UP_ARRAY(C, file_size);
  // Initialize the threads.
#pragma omp parallel for schedule(static) firstprivate(A)
  for (int tid = 0; tid < numThreads; ++tid) {
    volatile Value x = *A;
  }
#endif

#ifdef GEM_FORGE
  gf_reset_stats();
#endif

  vector_addition_host(A, B, C);

#ifdef GEM_FORGE
  gf_detail_sim_end();
  exit(0);
#endif

  free(A);
  free(B);
  free(C);

  return 0;
}

Thank you for your attention and I'm looking forward to your reply.

Request for guidance on trace-based simulation in gem-forge

Hello,

I'm exploring the trace-based simulation feature in gem-forge and have encountered some difficulties. The examples provided in the driver folder predominantly focus on execution-based simulation. No trace or TDG files are being generated in these examples, and removing the "--fake-trace" option would disrupt the flow. Could you kindly provide some instructions or examples on how to try trace-based simulation using gem-forge?

Thank you for your assistance!

panic: More than 24 cache blocks for one stream element

Hello,

I'm still running simulations on your framework, and I ran into the following error message:
panic: More than 24 cache blocks for one stream element, address (xxxxx) size 1536.

Why does this happen (that is, why can't dim_vector be higher than 384), and how can I fix it?
The code is below:

const uint64_t dim_vector = 385;
for (uint64_t i = 0; i < 10; i++) {
  for (uint64_t j = 0; j < dim_vector; j++) {
    C[idx[i] + j] = A[idx[i] + j] + B[idx[i] + j];
  }
}

Thank you.
