Comments (7)
Hi,
Yes, as of now, Scarab does not have a knob for adjusting icache latency.
If you want to add additional cycles after resolving a branch misprediction, you should use EXTRA_RECOVERY_CYCLES. Or, do you need to model anything more than that? You can also use FETCH_TAKEN_BUBBLE_CYCLES to insert bubbles after any predicted taken branch (FETCH_BREAK_ON_TAKEN must be set).
EXTRA_REDIRECT_CYCLES has a somewhat confusing name. It actually controls the number of cycles needed to restart the frontend after resolving a branch BTB miss. It is probably not what you are looking for as a proxy for icache latency.
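As a rough illustration of how the knobs above might be collected for a run, here is a minimal sketch. It assumes Scarab accepts each parameter as a lowercase `--name value` pair (the same form as the `--debug_model 1` flag mentioned in this thread) and that boolean knobs take 0/1. The helper `scarab_args` is hypothetical, and the values 3 and 1 are placeholders, not recommendations.

```python
def scarab_args(params):
    """Turn {"KNOB_NAME": value} into ["--knob_name", "value", ...].

    Assumption: Scarab takes parameters as lowercase "--name value" pairs
    and boolean knobs as 0/1. Check this against your Scarab version.
    """
    args = []
    for name, value in params.items():
        args.append("--" + name.lower())
        # bool is checked first because it is a subclass of int in Python.
        args.append(str(int(value)) if isinstance(value, bool) else str(value))
    return args


# Illustrative values only.
frontend_knobs = {
    "EXTRA_RECOVERY_CYCLES": 3,      # extra cycles after a resolved misprediction
    "FETCH_BREAK_ON_TAKEN": True,    # must be set for the bubble knob to apply
    "FETCH_TAKEN_BUBBLE_CYCLES": 1,  # bubbles after a predicted-taken branch
}

print(" ".join(scarab_args(frontend_knobs)))
# --extra_recovery_cycles 3 --fetch_break_on_taken 1 --fetch_taken_bubble_cycles 1
```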
from scarab.
Hi Siavash,
Thank you. If I understand correctly, Scarab implements a decoupled frontend by inserting a buffer after decode and before scheduling.
On a correctly predicted taken branch, I would assume the BP can make another prediction in the next cycle, but the icache latency would still apply. Does FETCH_TAKEN_BUBBLE_CYCLES introduce bubbles in the BP (i.e., no branch prediction for the next few cycles), or does it only introduce bubbles after branch prediction (i.e., make decode take longer)?
Actually, Scarab does not implement a decoupled frontend. The frontend is a simple pipeline. There is an icache stage that is responsible for both branch prediction and icache access. On an icache hit, it sends a packet of instructions to the next stage in one cycle. Then there are a few cycles of decode and rename (decode_stage and map_stage in the Scarab source) before insertion into the reservation station(s). Decode and rename (map) are straightforward pipelines.
To get a sense of how uops move through the pipeline, you can run in debug mode with "--debug_model 1" for a visualization of the uops in the pipeline at every cycle. (I would only do a short run in this mode, since the logs can get huge.)
Since the icache and branch prediction are not decoupled, FETCH_TAKEN_BUBBLE_CYCLES affects both. That is, after a taken branch, there will be a few bubble cycles with neither branch prediction nor icache access.
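The effect described above can be sketched with a toy cycle count. This is not Scarab code: `fetch_cycles` is a hypothetical helper that assumes every fetch packet hits in the icache (one cycle each) and that each predicted-taken branch is followed by a fixed number of idle cycles in which the coupled stage neither predicts nor accesses the icache.

```python
def fetch_cycles(packets_end_taken, taken_bubble_cycles):
    """Count cycles for a coupled fetch stage to issue all packets.

    packets_end_taken: list of bools, True if the packet ends in a
    predicted-taken branch. Every packet is assumed to hit in the icache
    and take one cycle. After a taken branch the stage idles for
    `taken_bubble_cycles` cycles (no prediction, no icache access).
    """
    cycles = 0
    for ends_taken in packets_end_taken:
        cycles += 1                        # the packet itself
        if ends_taken:
            cycles += taken_bubble_cycles  # bubbles stall prediction and fetch together
    return cycles


trace = [False, True, False, True, False]  # two predicted-taken branches
print(fetch_cycles(trace, 0))  # 5
print(fetch_cycles(trace, 2))  # 9
```

With two taken branches and two bubble cycles each, the same five packets take four extra cycles, which is the cost a frontend that predicts every cycle would avoid.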
For more details on how to run debug mode to visualize the pipeline, see the answer to another question, "Using the debug flag in Scarab" (#43).
@siavashzk Thanks for your info. Let me know if you agree with the following.
In a modern frontend, the BP can predict every cycle, even if a branch is taken. The BHT/BTB enable predictions without waiting for the icache. As a result, it is reasonable to model the icache with latency=0 as we can hide/pipeline its latency. However, on a mispredict, we have to wait for the icache and hence we should set EXTRA_RECOVERY_CYCLES to e.g. 3.
FETCH_TAKEN_BUBBLE_CYCLES should not be required if we assume that we can predict every cycle (even after a taken branch).
Secondly, let's think about what is required to model a decoupled frontend. I think it does two things: 1) hiding icache latency by overlapping icache reads in the presence of taken branches, and 2) prefetching icache entries by running ahead even further than the 3 cycles required to hide the icache latency.
We achieve 1) by setting the icache latency to zero and by setting EXTRA_RECOVERY_CYCLES to capture the cost of a branch mispredict. Unfortunately, the value EXTRA_RECOVERY_CYCLES should take is dynamic, as it depends on whether the icache line is in L1/L2/DRAM. If you have an idea on how to set EXTRA_RECOVERY_CYCLES dynamically based on this, let me know.
We could achieve 2) by emitting a prefetch of the target line whenever a branch is predicted taken.
Let me know what you think.
I think what you're suggesting in the first paragraph is a reasonable implementation of a state-of-the-art frontend, i.e., no bubbles after taken branches, with EXTRA_RECOVERY_CYCLES as a proxy for restarting the frontend pipeline, and the icache latency otherwise hidden.
I can't think of a good way to approximate a decoupled branch predictor with only small tweaks to Scarab. It seems to me that the interaction will be very dynamic and benchmark dependent: for example, how far is the branch predictor running ahead of the rest of the frontend? If you want to model the icache prefetching effect of a decoupled branch predictor, you have to modify Scarab to separate branch prediction from icache access. It should be doable, but may require a major overhaul of icache_stage.c.
I'm not sure if your two additions are sufficient for modeling a decoupled branch predictor.
Actually, I think a dynamic EXTRA_RECOVERY_CYCLES is not necessary. You'd just need to set it to the icache latency. When you resolve a misprediction, you'd first take EXTRA_RECOVERY_CYCLES to restart the frontend (which models the icache latency). Then, if the instruction misses in the icache, Scarab will have to fetch from L1 (LLC) or DRAM, so the dynamic part of the latency is already modeled.
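The argument above amounts to a simple sum, sketched here under stated assumptions. `refetch_latency` is a hypothetical helper, not part of Scarab, and the latencies of 20 and 200 cycles are made-up placeholders for an L1 (LLC) hit and a DRAM access. The fixed term stands for EXTRA_RECOVERY_CYCLES, and the miss term stands for what the memory hierarchy would add on its own.

```python
def refetch_latency(extra_recovery_cycles, icache_hit, miss_latency=0):
    """Cycles from resolving a misprediction to the first correct-path packet.

    Toy additive model: a fixed restart cost (set to the icache latency)
    plus, on an icache miss, whatever the lower levels add.
    """
    latency = extra_recovery_cycles  # fixed frontend restart cost
    if not icache_hit:
        latency += miss_latency      # dynamic part, supplied by the memory hierarchy
    return latency


EXTRA_RECOVERY_CYCLES = 3  # illustrative: set equal to the icache latency

print(refetch_latency(EXTRA_RECOVERY_CYCLES, icache_hit=True))                     # 3
print(refetch_latency(EXTRA_RECOVERY_CYCLES, icache_hit=False, miss_latency=20))   # 23
print(refetch_latency(EXTRA_RECOVERY_CYCLES, icache_hit=False, miss_latency=200))  # 203
```

Only the first term is a knob; the second varies per access, which is why a single static setting can suffice.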
However, prefetching the target line during branch prediction will not do much now, because without any Scarab modifications, the target will get accessed in the same or next cycle anyway. Recall that the icache and the branch predictor in Scarab are coupled. If you want the prefetch to have a meaningful effect, you'd need to actually decouple the branch predictor so it can run ahead of the rest of the frontend.
Closing this issue due to inactivity. Feel free to reopen or open another issue with other questions/comments.