Comments (8)
Are you running this in Verilog simulation? Could you attach the command line you used and perhaps the diffs of what you changed to increase the core count, just so I can be sure I understand your configuration correctly? Thanks!
from nyuziprocessor.
Yes, I'm using Verilog simulation.
I just changed NUM_CORES to 8, with no other changes,
and ran ./run_vcs in the benchmarks/hash/ folder.
Got it. It's been a while since I worked on this project, so I need to refresh my memory, but off the top of my head I can't think of a reason this should happen. I'll need to look into this.
Maybe a quick experiment: what happens with 2 cores? What are the corresponding cycle counts in each case (like is it close to a clean integer multiple)?
My first question would be whether this is some artifact of the test configuration that is not synchronizing correctly (and thus misreporting the count), or if you are actually running into some kind of memory saturation issue where the performance is decreasing because of cache thrashing.
- I tried 1, 2, 4, and 8 cores, but the performance dropped once.
- Could this be related? I don't know the cause of this assertion failure during simulation: refesh_delay should < MAX_REFESH_INTERNAL
Can you clarify what you mean by "performance dropped once"? (One time? Once you were above a certain number of cores?)
The refresh_delay assertion is probably not related (but kind of interesting, as I haven't seen that one).
Sorry, by "performance dropped once" I mean that the more cores are used, the greater the performance degradation (the cycles-per-hash figure goes up).
Oops, I see the problem :)
The total number of hashes performed is hard coded here (256):
There are 16 vector lanes and four threads, and each thread does four hashes: 16 * 4 * 4 = 256. When you increase the number of cores, the total number of hashes performed increases, but the calculation still assumes it is fixed. The latency for each thread increases because there is more memory contention, but the calculation does not account for the fact that the throughput has increased.
One easy fix might be to add another global variable, gTotalThreads, and do a __sync_fetch_and_add at the top:
__sync_fetch_and_add(&gActiveThreadCount, 1);
+ __sync_fetch_and_add(&gTotalThreads, 1);
Then use that to compute the total number of operations done:
printf("%g cycles per hash\n", (float) endTime / (4 * gTotalThreads * 16));
(Looking at this now, it should probably use named constants for the number of iterations each thread performs and the number of vector lanes, rather than hard-coding the numbers, for clarity.) I hope that helps.
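As a rough sketch of the fix with named constants, assuming the figures quoted in this thread (16 vector lanes, four hashes per thread): NUM_VECTOR_LANES, HASHES_PER_THREAD, register_thread, total_hashes, and cycles_per_hash are hypothetical names chosen for illustration and are not taken from the benchmark source.

```c
#include <stdio.h>

/* Hypothetical constant names; the values are the ones quoted in the thread. */
#define NUM_VECTOR_LANES 16
#define HASHES_PER_THREAD 4

/* Incremented once by every hardware thread, on every core. */
static volatile int gTotalThreads = 0;

/* Assumed to be called at the top of each thread's entry point. */
static void register_thread(void)
{
    __sync_fetch_and_add(&gTotalThreads, 1);
}

/* Total hashes scale with the thread count instead of being fixed at 256. */
static int total_hashes(int total_threads)
{
    return HASHES_PER_THREAD * total_threads * NUM_VECTOR_LANES;
}

static float cycles_per_hash(unsigned int end_time, int total_threads)
{
    return (float) end_time / (float) total_hashes(total_threads);
}

/* Prints the result in the same format as the original printf. */
static void report(unsigned int end_time)
{
    printf("%g cycles per hash\n", cycles_per_hash(end_time, gTotalThreads));
}
```

With four threads this gives the original 256 hashes. With 8 cores at four threads each (32 threads) it gives 2048, so the reported cycles per hash reflects the higher throughput.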