darchr / gem5

Fork of main gem5 repo: https://gem5.googlesource.com/public/gem5/

License: BSD 3-Clause "New" or "Revised" License

Python 16.05% Shell 0.20% C 7.97% C++ 74.55% Makefile 0.07% CMake 0.25% M4 0.08% HTML 0.33% Scala 0.01% Assembly 0.26% Awk 0.01% Perl 0.04% Emacs Lisp 0.01% Java 0.01% Roff 0.02% sed 0.01% Forth 0.01% SWIG 0.01% Dockerfile 0.09% Starlark 0.05%


gem5's Issues

Change memory requests to atomic requests to get loads working

I did this. gem5 is clearly getting the 4 bytes I want, but Verilator is not applying the signal. Either something is wrong with my Chisel or something else is going on; I need more time on this.

Add verbose test option

We need to add a way for users to see why a test is failing. For instance, we might change the following code in memory_tests/simple-run.py from:

if exit_event.getCause() != "maximum number of loads reached":
    exit(1)

to:

if exit_event.getCause() != "maximum number of loads reached":
    if testlib.verbose:
        print("Test failed for reason " + exit_event.getCause())
    exit(1)
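
A minimal sketch of the same check factored into a helper, assuming a verbose flag like the testlib.verbose proposed above. The names check_exit_cause and EXPECTED_CAUSE are hypothetical and are not part of the existing test library.

```python
import sys

# Expected exit cause, taken from the check in memory_tests/simple-run.py.
EXPECTED_CAUSE = "maximum number of loads reached"


def check_exit_cause(cause, expected=EXPECTED_CAUSE, verbose=True):
    """Return 0 if the exit cause matches, else 1 (optionally saying why)."""
    if cause == expected:
        return 0
    if verbose:
        # Report the actual cause so the user can see why the test failed.
        print("Test failed for reason: " + cause, file=sys.stderr)
    return 1
```

In the run script, the result could be passed straight to exit(), e.g. exit(check_exit_cause(exit_event.getCause(), verbose=testlib.verbose)), assuming testlib.verbose is added.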

Add New Workloads

We need to add new workloads to the testing infrastructure, provided they increase test coverage. A few examples: compression and LLVM benchmarks.

Performance-driven tests

Write tests that evaluate simulated performance of various parts, in order to spot unexpected changes to simulated behavior resulting from code changes.
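
One way such a test could flag unexpected changes is to compare a few simulated statistics against a stored baseline with a relative tolerance. The sketch below assumes the statistics are already available as plain dictionaries; the function name, the stat names, and the 5% default tolerance are all illustrative assumptions.

```python
def find_regressions(baseline, current, rel_tol=0.05):
    """Return {stat: (old, new)} for stats that moved by more than rel_tol."""
    changed = {}
    for name, old in baseline.items():
        new = current.get(name)
        if new is None:
            changed[name] = (old, None)  # stat disappeared entirely
            continue
        # Avoid dividing by zero when the baseline value is 0.
        scale = abs(old) if old != 0 else 1.0
        if abs(new - old) / scale > rel_tol:
            changed[name] = (old, new)
    return changed
```

For example, with a baseline of {"ipc": 1.5} and a current run of {"ipc": 1.2}, the 20% drop exceeds the tolerance and "ipc" is reported.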

410.bwaves spec benchmark not working with LTAGE predictor

Fix the error that occurs when running the 410.bwaves SPEC benchmark with the LTAGE predictor.

The error is an assertion failure on MaxAddr: “Assertion `corrTarget != MaxAddr' failed.”
The error occurs in file: build/X86/cpu/pred/ltage.cc
The experiment was run in SE mode.

Config script:
1. Use Spec_run.py and spec_processes.py.
2. Set branchPredictor to LTAGE in spec_run.py.

Run script:
1. Use run_gem5_experiments.sh.
2. Set the workload by setting the variable bms = 410.bwaves (the executables and input files used).
3. Set the CPU: set cpus to any or all of: DefaultO3.O3_W256, O3_W2K.
4. Set the memory: SingleCycle

fix build on clang

See https://travis-ci.com/darchr/gem5/jobs/172481481

I think this is the problem: https://stackoverflow.com/questions/45070588/c11-put-time-is-not-a-member-of-std-on-modern-g

This would be a good first commit for @rutuoza99 or @takekoputa! Though, before you get started, search gerrit to make sure that it hasn't already been fixed.

You can create a new branch (e.g., bugfix/fix-clang-build), try your fix, then push to darchr/gem5. This will kick off a travis build that will be available here: https://travis-ci.com/darchr/gem5.

Feel free to respond to this post if you have any questions!

bug in fixture.py

Set up experiments and experimental design/methodology

Model backpressure between stages.

FlexCPU currently behaves as if there were an infinite buffer (or at least one as large as the inflightInsts buffer) between all stages. This prevents us from modelling pipeline stalls caused by downstream slowdowns.
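
To illustrate what backpressure adds, here is a toy model (not FlexCPU code) of two stages joined by a bounded buffer. The function name and its parameters are hypothetical; the point is only that a finite capacity makes the upstream stage stall when the downstream stage is slow.

```python
from collections import deque


def simulate(cycles, capacity, consumer_period):
    """Count upstream stalls for two stages joined by a bounded buffer.

    The upstream stage tries to push one instruction per cycle; the
    downstream stage pops one instruction every `consumer_period` cycles.
    """
    buffer = deque()
    stalls = 0
    for cycle in range(cycles):
        # Downstream stage drains first, freeing a slot for this cycle.
        if buffer and cycle % consumer_period == 0:
            buffer.popleft()
        # Upstream stage stalls when the buffer is full (backpressure).
        if len(buffer) < capacity:
            buffer.append(cycle)
        else:
            stalls += 1
    return stalls
```

With a buffer large enough to never fill during the run, the stall count is zero, which corresponds to the current "infinite buffer" behavior described above.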

Validate multiprocessing support

The Thread/Core relationship is many-to-one by design, and should theoretically work for SMT right now, but is entirely untested.

Test multithreading support: See test-progs/threads
Consider adding new constraints that are shared across threads, if we want to model that kind of behavior. Otherwise the current implementation treats threads as largely independent thread contexts that simply share resources on the core, e.g. execution width, memory ports, etc.
Validate memory consistency.

Add descriptions of tests

When creating a test (e.g., in tests/gem5/*test.py) it would be good to include a description in the test information. Then, when the test fails, the description could help the user know what part of the code to look in.

Push simple memory changes upstream

There's a changeset on the flexcpu branch that allows SimpleMemory to be connected to more than one master. This should be tested then pushed to the mainline.

Ruby tests

Tests for the ruby cache system need to be migrated.

CPU switcheroo tests

We need to migrate the switcheroo tests to validate CPU switch-off behavior.

Linux test suite for SE mode. It would be really cool to have a table of what works and what doesn't work.

pthread tests in SE mode

RISC-V tests integrate with CPU tests.

Improve the (wall clock) performance of the FlexCPU.

Not actually slow right now, running side-by-side with O3/Minor, but improvements should be possible.

Ideas:

Avoid smart pointers if raw pointers are sufficient (namely weak_ptr if liveness of pointer is known beforehand). Keep in mind potential for better asynchronous/liveness behavior, and self-documenting nature of parameters in some cases though.
Improve procedure for rebuilding lastUses table to speed up squashing. Perhaps a linked-list type of weak reference to uses before the last might be useful for partial squashes.
Note: populateDependencies() is a major consumer of gprof profiled running time. This makes sense as it represents a major portion of the logical setup and is called for every instruction. Consider looking for ways to optimize.

Idea: Remove all defaults from SimObject description files

Related to improving python interface more generally.

One source of confusion is that default values for SimObjects live in many places. If we forced these files to truly be just description files, it might make things better.

The downside is that we would have to extend every single object in the Python library system, which would introduce more boilerplate code. So, maybe this is a bad idea.

GPU tests

Migrate existing GPU component tests to be included in a more unified test suite

Revise the Resource system

The Resource system is nonintuitive and has very strange behaviors, such as inconsistent latencies, etc. Needs a proper rewrite.

Make a DPI manager class to manage pointers between verilog and gem5.

Linker errors!

Add AMD EPYC system config

Add an AMD EPYC-like system configuration to use as a baseline for future experiments.

Relevant documents: 1, 2

Add Learning gem5 tests

Make tests for learning gem5 scripts.

configs/learning_gem5/part1/simple.py
- Should check return code
configs/learning_gem5/part1/two_level.py
- Should check return code
configs/learning_gem5/part2/hello_goodbye.py
- Should check stdout
configs/learning_gem5/part2/run_simple.py
- Should check stdout
configs/learning_gem5/part2/simple_cache.py
- Should check stdout
configs/learning_gem5/part2/simple_memobj.py
- Should check stdout

Part 3, Ruby

This is possibly part 4 or 5 in the book. I don't remember ;).

You'll also have to modify the way gem5 is built for this because the protocols are compiled into gem5. If you want to change the protocol, you have to rebuild gem5.

configs/learning_gem5/part3/ruby_test.py
- Not sure of check
configs/learning_gem5/part3/simple_ruby.py
- Not sure of check
something with MI_Example?

We want to make a directory in tests/gem5 called learning_gem5. In that directory you should have three new files called part1_test.py, part2_test.py, and part3_test.py. Then, inside each of the *_test.py files, you will call gem5_verify_config() with the proper parameters to test each of the config scripts shown above.
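
The plan above can be written down as data before any test code exists. This sketch only enumerates which test file should cover which config script and what it should check; it does not call gem5_verify_config(), whose exact arguments should be taken from the test library itself. The table name, the function name, and the generated test names are hypothetical, and "undecided" marks the Ruby scripts whose check is still open.

```python
import posixpath

# Config scripts from the list above, grouped by the test file that should
# cover them, together with the kind of check each one needs.
LEARNING_GEM5_TESTS = {
    "part1_test.py": [
        ("part1/simple.py", "return code"),
        ("part1/two_level.py", "return code"),
    ],
    "part2_test.py": [
        ("part2/hello_goodbye.py", "stdout"),
        ("part2/run_simple.py", "stdout"),
        ("part2/simple_cache.py", "stdout"),
        ("part2/simple_memobj.py", "stdout"),
    ],
    "part3_test.py": [
        ("part3/ruby_test.py", "undecided"),
        ("part3/simple_ruby.py", "undecided"),
    ],
}


def planned_tests(config_root="configs/learning_gem5"):
    """Yield (test_file, test_name, config_path, check) for every script."""
    for test_file, scripts in LEARNING_GEM5_TESTS.items():
        for rel_path, check in scripts:
            stem = posixpath.splitext(rel_path)[0]
            name = "learning_gem5_" + stem.replace("/", "_")
            yield test_file, name, posixpath.join(config_root, rel_path), check
```

Each yielded tuple would correspond to one gem5_verify_config() call in the named test file.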

Get Ruby set up with SimpleFS

Improve x86 SIMD support

There are a huge number of x86 SIMD instructions.
Most are implemented in gem5 (though they haven't been tested).
However, they are generally microcoded and only work functionally.

It would be a great project to implement these instructions using the new vector register implementation in the CPU models.
E.g., https://gem5-review.googlesource.com/c/public/gem5/+/13519

Realview test

Realview test needs to be migrated to test suite

Pause gem5 and check stats

This would be useful to know if it's making forward progress.

Methodology

Figure out how to easily set up a disk image with workloads on it.

We want this to be automated. Something like a Dockerfile or a Vagrantfile. List the things we want on the disk image and it "just happens".

Install Ubuntu from scratch (with correct partitions)
Install various ubuntu packages
Copy in files
Build workloads

Not sure if all of this is possible to automate.

Another possibility is to install docker inside the disk image. Then, use docker images as the benchmark images.

Memory consistency model tests

It would be nice if a test suite existed which could determine with high confidence that a particular CPU model respected certain memory consistency models.

Probably difficult to do in a deterministic manner without specific hooks into the CPU model, but there should be multithreaded samples which are probabilistically likely to create violations of memory consistency models that the CPU has not respected.
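
As an illustration of the kind of multithreaded sample meant here, the classic store-buffering litmus test has two threads that each store to one variable and then load the other. The sketch below is plain Python, independent of gem5, with hypothetical names; it enumerates every interleaving allowed by sequential consistency and collects the possible register values.

```python
from itertools import combinations

# Store-buffering litmus test. Each op is (kind, variable, register).
THREADS = [
    [("store", "x", None), ("load", "y", "r0")],  # thread 0: x = 1; r0 = y
    [("store", "y", None), ("load", "x", "r1")],  # thread 1: y = 1; r1 = x
]


def sc_outcomes(threads=THREADS):
    """Return every (r0, r1) reachable under sequential consistency."""
    t0, t1 = threads
    total = len(t0) + len(t1)
    outcomes = set()
    # Choose which slots of the global order belong to thread 0; program
    # order inside each thread is preserved.
    for slots in combinations(range(total), len(t0)):
        it0, it1 = iter(t0), iter(t1)
        memory = {"x": 0, "y": 0}
        regs = {}
        for i in range(total):
            kind, var, reg = next(it0) if i in slots else next(it1)
            if kind == "store":
                memory[var] = 1
            else:
                regs[reg] = memory[var]
        outcomes.add((regs["r0"], regs["r1"]))
    return outcomes
```

The outcome (0, 0) never appears in this set, so a CPU model that produces it when running the equivalent program many times is not sequentially consistent. That outcome is permitted by weaker models such as x86-TSO, so the expected set depends on the model being validated.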

Forward Check FlexCPU While running it.

The goal is to be able to view and check the progress of FlexCPU (whether it is progressing properly) while it runs benchmarks.

Configurable functional units

This is related to modeling realistic processors. What FU pool kind of thing do we need?

O3 and Minor both define their own classes for this functionality. A good small project might be to create a generalized interface for gem5 for this functionality, and implement for O3, Minor, and FlexCPU.
Question asked by @powerjg, how important is this? Can we show that there are studies that can't be done without this functionality? Can we characterize such workloads?

Improve default configurations

Python config for infinite model (Should be base, so parameters other models set have expected impact)
Python config for good finite model(s), each of which should be python subclasses of the base infinite model
Big OoO model
Small OoO model
In-order superscalar?
Speculative vs nonspeculative?

flags.isSet(STATIC_DATA|DYNAMIC_DATA) assertion

This has come up a number of times on the mailing list. It needs to be fixed.

Double check everything works for memory and CPU tests

Before pushing the CPU and memory tests, it would be good for someone else to test them (preferably not on amarillo).

To do this, go to each changeset on gerrit and cherry-pick it onto the "upstream" branch on darchr/gem5 or onto master on gem5.googlesource.com.

CPU tests

https://gem5-review.googlesource.com/c/public/gem5/+/15855
https://gem5-review.googlesource.com/c/public/gem5/+/15856
https://gem5-review.googlesource.com/c/public/gem5/+/15857

Memory tests

https://gem5-review.googlesource.com/c/public/gem5/+/15835
https://gem5-review.googlesource.com/c/public/gem5/+/15836

Once you have run the tests and seen that they pass, you can give the review request a +2!

statistics checking in tests

Add a new verifier that checks specific statistics in the stats files.

The memtest would be a good place to start on this. See Sandberg's comment on https://gem5-review.googlesource.com/c/public/gem5/+/15835

To do

Make it so the stats files can be parsed
Modify the learning gem5 tests to use SimpleMemory instead of DDR3 so fewer things are tested
Update the learning gem5 tests
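
For the first to-do item, a minimal parsing sketch is shown below. It assumes each stats line has the form "name value # description" and that banner lines start with dashes. The sample text and its values are illustrative only, and parse_stats is a hypothetical helper rather than part of the test library.

```python
def parse_stats(text):
    """Parse 'name value # description' lines into {name: float}."""
    stats = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop the trailing description
        if not line or line.startswith("-"):  # skip blanks and banner lines
            continue
        fields = line.split()
        if len(fields) < 2:
            continue
        try:
            stats[fields[0]] = float(fields[1])
        except ValueError:
            continue  # non-numeric value; ignored in this sketch
    return stats


# Illustrative sample only; the values are made up.
SAMPLE = """
---------- Begin Simulation Statistics ----------
sim_seconds                                  0.000057   # Number of seconds simulated
sim_ticks                                    57467000   # Number of ticks simulated
---------- End Simulation Statistics   ----------
"""
```

A verifier built on this could then assert on individual entries, for example that parse_stats(text)["sim_ticks"] falls within an expected range.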

Full system support for FlexCPU

Squash/Commit widths/latency

Currently squash and commit events are unconstrained and take no simulated time. Some studies may want to observe the impact of either of these things (e.g. when speculation is often wrong, or expensive when wrong).

Developing a testing methodology

Want a straightforward way to run experiments that is well documented.

Fix/improve stats

Some stats were broken last we checked. They need to be validated and issues fixed.

FlexCPU: Branch predictor depth consumed without being released

When FlexCPU requests a branch prediction when the instruction buffer is full, it seems that the branch predictor depth is taken but not released when the following instruction fails to be added to the buffer. Need to investigate later.

Create NPB disk image

And get these workloads running.

Memory dependence prediction.

FlexCPU does memory dependence tracking extremely conservatively right now. While this should theoretically guarantee correct behavior, it leaves us room for performance improvements.

Current FlexCPU implementation can be treated as "Always predict dependent"
Consider alternative "Always predict independent" approach as an aggressive speculation option.
See Store Sets paper, which O3 implements.
A good small project would be to generalize what O3 implements into an interface that any CPU can easily use (and then use that interface for FlexCPU)
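
The two extremes listed above can be expressed behind one small interface, which is roughly the shape a shared predictor interface might take. This is a Python sketch with hypothetical class and method names; it is not the O3 Store Sets implementation and not FlexCPU code.

```python
class AlwaysDependent:
    """Conservative policy: a load waits for every older in-flight store."""

    def predict(self, load_pc, older_store_pcs):
        return list(older_store_pcs)


class AlwaysIndependent:
    """Aggressive policy: a load never waits; violations must be squashed."""

    def predict(self, load_pc, older_store_pcs):
        return []


def stores_to_wait_for(predictor, load_pc, older_store_pcs):
    """Ask any predictor which older stores the load must wait for."""
    return predictor.predict(load_pc, older_store_pcs)
```

A Store Sets style predictor would implement the same predict() method but return only the stores it has learned to associate with the load.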

Reevaluating combinational logic in Verilator won't work until the clock edge changes. Workaround?

Make all Ruby protocols compile together

We want to be able to have a single executable with all of the protocols compiled into it. Then it would just take changing the python to use a different protocol.

Update the SLICC compiler to make a unique name for each machine
Create a python wrapper module to rename back to the normal machine names (e.g., from MESI_two_level import *)
- This renames things like MESI_two_level_L1Cache to just L1Cache.
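
The renaming step in the wrapper module could look roughly like the following. It assumes the SLICC-generated names are simply the protocol name, an underscore, and the machine name; strip_protocol_prefix is a hypothetical helper.

```python
def strip_protocol_prefix(namespace, protocol):
    """Map names like '<protocol>_L1Cache' back to 'L1Cache'."""
    prefix = protocol + "_"
    return {
        name[len(prefix):]: obj
        for name, obj in namespace.items()
        if name.startswith(prefix)
    }
```

A wrapper module named after the protocol could then update its own globals() with the result, so that a star import from it exposes the short names. Names belonging to other protocols are left out because they do not match the prefix.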

Major thing to overcome

There are some static enums that are created for the MachineID and MachineType. These include bitvectors that must have fewer than 64 entries.

MessageType will also have this problem, but this might be overcome with prepending names like MESI_Two_Level_MsgType.

Full System Tests

We need to add support to test Linux boot (preferably latest kernels).

Memrefs need to depend on mem barriers.

Noticed that mem barriers have dependencies on earlier barriers and refs, but mem refs don't have dependencies on barriers in populateDependencies(). This needs to be addressed, but probably should be in its own patch since that piece of code is touched by more than one commit right now.
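
The intended rule can be sketched independently of the real populateDependencies() code. In this toy version, the input is a program-order list of "ref" and "barrier" entries; the function name and the representation are hypothetical. Barriers wait for the previous barrier and all refs since it, and refs wait for the most recent barrier, which is the missing piece described above.

```python
def populate_mem_dependencies(kinds):
    """Return {index: sorted list of earlier indices it must wait for}.

    `kinds` is a program-order list whose entries are "ref" or "barrier".
    """
    deps = {}
    last_barrier = None
    refs_since_barrier = []
    for i, kind in enumerate(kinds):
        waits = []
        if last_barrier is not None:
            waits.append(last_barrier)  # refs and barriers both wait for it
        if kind == "barrier":
            # A barrier also waits for every ref issued since the last barrier.
            waits.extend(refs_since_barrier)
            last_barrier = i
            refs_since_barrier = []
        else:
            refs_since_barrier.append(i)
        deps[i] = sorted(waits)
    return deps
```

Waiting only on the most recent barrier is enough for ordering against older operations, because that barrier already waits, directly or transitively, on everything before it.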