Comments (4)
I did more digging, and it seems that amoadd.d.aqrl does not work correctly.
This is the execution log of the LLVM-compiled binary running in REV (source code below):
111c0 >> <main>: 13 01 01 f8 addi sp, sp, -128
111c4 >> <main>: 23 3c 11 06 sd ra, 120(sp)
111c8 >> <main>: 23 38 81 06 sd s0, 112(sp)
111cc >> <main>: 13 04 01 08 addi s0, sp, 128
111d0 >> <main>: 13 05 00 00 li a0, 0
111d4 >> <main>: 23 26 a4 fe sw a0, -20(s0)
111d8 >> <main>: 13 06 20 00 li a2, 2
111dc >> <main>: 23 30 c4 fe sd a2, -32(s0)
111e0 >> <main>: 83 36 04 fe ld a3, -32(s0)
111e4 >> <main>: b7 35 01 00 lui a1, 19
111e8 >> <main>: 93 85 85 f5 addi a1, a1, -168
111ec >> <main>: af b6 d5 00 amoadd.d a3, a3, (a1)
111f0 >> <main>: 23 3c d4 fc sd a3, -40(s0)
111f4 >> <main>: 23 38 c4 fc sd a2, -48(s0)
111f8 >> <main>: 83 36 04 fd ld a3, -48(s0)
111fc >> <main>: af b6 d5 04 amoadd.d.aq a3, a3, (a1)
11200 >> <main>: 23 34 d4 fc sd a3, -56(s0)
11204 >> <main>: 23 30 c4 fc sd a2, -64(s0)
11208 >> <main>: 83 36 04 fc ld a3, -64(s0)
1120c >> <main>: af b6 d5 04 amoadd.d.aq a3, a3, (a1)
11210 >> <main>: 23 3c d4 fa sd a3, -72(s0)
11214 >> <main>: 23 38 c4 fa sd a2, -80(s0)
11218 >> <main>: 83 36 04 fb ld a3, -80(s0)
1121c >> <main>: af b6 d5 02 amoadd.d.rl a3, a3, (a1)
11220 >> <main>: 23 34 d4 fa sd a3, -88(s0)
11224 >> <main>: 23 30 c4 fa sd a2, -96(s0)
11228 >> <main>: 83 36 04 fa ld a3, -96(s0)
1122c >> <main>: af b6 d5 06 amoadd.d.aqrl a3, a3, (a1)
1122c >> <main>: af b6 d5 06 amoadd.d.aqrl a3, a3, (a1)
1122c >> <main>: af b6 d5 06 amoadd.d.aqrl a3, a3, (a1)
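As you can see, amoadd.d.aqrl enters what appears to be an infinite loop: the same instruction at 1122c is executed over and over.
GNU GCC does not generate the amoadd.d.aqrl instruction for the last two memory orderings (__ATOMIC_ACQ_REL and __ATOMIC_SEQ_CST); it emits a fence followed by amoadd.d.aq instead.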
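The variants in the log differ only in the aq and rl bits of the instruction word. To make that visible, here is a minimal decoding sketch based on the standard RISC-V AMO (R-type) field layout. The names amo_fields, word_from_le_bytes and decode_amo are made up for this illustration and are not part of REV.

```c
#include <stdint.h>

/* Fields of a RISC-V AMO instruction word (major opcode 0x2f).
 * The struct and function names are hypothetical, for illustration only. */
typedef struct {
    unsigned opcode;  /* bits  6:0,  0x2f for AMO instructions */
    unsigned rd;      /* bits 11:7,  destination register */
    unsigned funct3;  /* bits 14:12, 2 = .w, 3 = .d */
    unsigned rs1;     /* bits 19:15, address register */
    unsigned rs2;     /* bits 24:20, source operand register */
    unsigned rl;      /* bit  25,    release ordering */
    unsigned aq;      /* bit  26,    acquire ordering */
    unsigned funct5;  /* bits 31:27, 0 = amoadd */
} amo_fields;

/* Assemble the four bytes in the order the log prints them
 * (lowest address first, i.e. little-endian). */
uint32_t word_from_le_bytes(uint8_t b0, uint8_t b1, uint8_t b2, uint8_t b3)
{
    return (uint32_t)b0 | ((uint32_t)b1 << 8) |
           ((uint32_t)b2 << 16) | ((uint32_t)b3 << 24);
}

/* Split a 32-bit instruction word into its AMO fields. */
amo_fields decode_amo(uint32_t insn)
{
    amo_fields f;
    f.opcode = insn & 0x7f;
    f.rd     = (insn >> 7) & 0x1f;
    f.funct3 = (insn >> 12) & 0x7;
    f.rs1    = (insn >> 15) & 0x1f;
    f.rs2    = (insn >> 20) & 0x1f;
    f.rl     = (insn >> 25) & 0x1;
    f.aq     = (insn >> 26) & 0x1;
    f.funct5 = (insn >> 27) & 0x1f;
    return f;
}
```

For the looping instruction, the bytes af b6 d5 06 decode to rd = 13 (a3), rs1 = 11 (a1), rs2 = 13 (a3), funct3 = 3 (.d), funct5 = 0 (amoadd), with both aq and rl set. The .aq and .rl encodings in the log (last byte 04 and 02) differ from it only in those two bits.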
Program in C:
#include <stdint.h>

uint64_t atom64 = 0;

int main() {
    /* One fetch-add per memory ordering, each adding 2 to atom64. */
    __atomic_fetch_add(&atom64, 2, __ATOMIC_RELAXED);
    __atomic_fetch_add(&atom64, 2, __ATOMIC_CONSUME);
    __atomic_fetch_add(&atom64, 2, __ATOMIC_ACQUIRE);
    __atomic_fetch_add(&atom64, 2, __ATOMIC_RELEASE);
    __atomic_fetch_add(&atom64, 2, __ATOMIC_ACQ_REL);
    __atomic_fetch_add(&atom64, 2, __ATOMIC_SEQ_CST);
    return 0;
}
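The program above only shows whether execution gets past each instruction. A possible variant that also checks the values is sketched below; it assumes a single thread, so each __atomic_fetch_add should return the previous value (0, 2, 4, 6, 8, 10) and leave 12 in atom64. The helper name check_fetch_add_sequence is made up for this sketch.

```c
#include <stdint.h>

uint64_t atom64 = 0;

/* Hypothetical helper: returns the number of mismatches observed.
 * On a single thread every fetch-add must return the value that was
 * stored before the addition, regardless of the memory ordering. */
int check_fetch_add_sequence(void)
{
    int failures = 0;

    atom64 = 0;
    failures += __atomic_fetch_add(&atom64, 2, __ATOMIC_RELAXED) != 0;
    failures += __atomic_fetch_add(&atom64, 2, __ATOMIC_CONSUME) != 2;
    failures += __atomic_fetch_add(&atom64, 2, __ATOMIC_ACQUIRE) != 4;
    failures += __atomic_fetch_add(&atom64, 2, __ATOMIC_RELEASE) != 6;
    failures += __atomic_fetch_add(&atom64, 2, __ATOMIC_ACQ_REL) != 8;
    failures += __atomic_fetch_add(&atom64, 2, __ATOMIC_SEQ_CST) != 10;

    /* Six additions of 2 starting from 0. */
    failures += atom64 != 12;
    return failures;
}
```

Calling check_fetch_add_sequence() from main and returning its result would make the exit code non-zero whenever an ordering variant misbehaves.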
objdump of the binary compiled with GNU GCC:
0000000000000628 <main>:
628: ff010113 addi sp,sp,-16
62c: 00813423 sd s0,8(sp)
630: 01010413 addi s0,sp,16
634: 00002797 auipc a5,0x2
638: a1c78793 addi a5,a5,-1508 # 2050 <atom16>
63c: 00200713 li a4,2
640: 00e7b02f amoadd.d zero,a4,(a5)
644: 00002797 auipc a5,0x2
648: a0c78793 addi a5,a5,-1524 # 2050 <atom16>
64c: 00200713 li a4,2
650: 04e7b02f amoadd.d.aq zero,a4,(a5)
654: 00002797 auipc a5,0x2
658: 9fc78793 addi a5,a5,-1540 # 2050 <atom16>
65c: 00200713 li a4,2
660: 04e7b02f amoadd.d.aq zero,a4,(a5)
664: 00002797 auipc a5,0x2
668: 9ec78793 addi a5,a5,-1556 # 2050 <atom16>
66c: 00200713 li a4,2
670: 0f50000f fence iorw,ow
674: 00e7b02f amoadd.d zero,a4,(a5)
678: 00002797 auipc a5,0x2
67c: 9d878793 addi a5,a5,-1576 # 2050 <atom16>
680: 00200713 li a4,2
684: 0f50000f fence iorw,ow
688: 04e7b02f amoadd.d.aq zero,a4,(a5)
68c: 00002797 auipc a5,0x2
690: 9c478793 addi a5,a5,-1596 # 2050 <atom16>
694: 00200713 li a4,2
698: 0f50000f fence iorw,ow
69c: 04e7b02f amoadd.d.aq zero,a4,(a5)
6a0: 00000793 li a5,0
6a4: 00078513 mv a0,a5
6a8: 00813403 ld s0,8(sp)
6ac: 01010113 addi sp,sp,16
6b0: 00008067 ret
And objdump of the binary compiled with LLVM:
00000000000111c0 <main>:
111c0: 13 01 01 f8 addi sp, sp, -128
111c4: 23 3c 11 06 sd ra, 120(sp)
111c8: 23 38 81 06 sd s0, 112(sp)
111cc: 13 04 01 08 addi s0, sp, 128
111d0: 13 05 00 00 li a0, 0
111d4: 23 26 a4 fe sw a0, -20(s0)
111d8: 13 06 20 00 li a2, 2
111dc: 23 30 c4 fe sd a2, -32(s0)
111e0: 83 36 04 fe ld a3, -32(s0)
111e4: b7 35 01 00 lui a1, 19
111e8: 93 85 85 f5 addi a1, a1, -168
111ec: af b6 d5 00 amoadd.d a3, a3, (a1)
111f0: 23 3c d4 fc sd a3, -40(s0)
111f4: 23 38 c4 fc sd a2, -48(s0)
111f8: 83 36 04 fd ld a3, -48(s0)
111fc: af b6 d5 04 amoadd.d.aq a3, a3, (a1)
11200: 23 34 d4 fc sd a3, -56(s0)
11204: 23 30 c4 fc sd a2, -64(s0)
11208: 83 36 04 fc ld a3, -64(s0)
1120c: af b6 d5 04 amoadd.d.aq a3, a3, (a1)
11210: 23 3c d4 fa sd a3, -72(s0)
11214: 23 38 c4 fa sd a2, -80(s0)
11218: 83 36 04 fb ld a3, -80(s0)
1121c: af b6 d5 02 amoadd.d.rl a3, a3, (a1)
11220: 23 34 d4 fa sd a3, -88(s0)
11224: 23 30 c4 fa sd a2, -96(s0)
11228: 83 36 04 fa ld a3, -96(s0)
1122c: af b6 d5 06 amoadd.d.aqrl a3, a3, (a1)
11230: 23 3c d4 f8 sd a3, -104(s0)
11234: 23 38 c4 f8 sd a2, -112(s0)
11238: 03 36 04 f9 ld a2, -112(s0)
1123c: af b5 c5 06 amoadd.d.aqrl a1, a2, (a1)
11240: 23 34 b4 f8 sd a1, -120(s0)
11244: 83 30 81 07 ld ra, 120(sp)
11248: 03 34 01 07 ld s0, 112(sp)
1124c: 13 01 01 08 addi sp, sp, 128
11250: 67 80 00 00 ret
Even if atomic operations cannot be "atomic" yet, we would like to see their behavior emulated, so they are functionally working on a single thread.
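As a rough illustration of what "functionally working on a single thread" could mean for amoadd.d, here is a minimal sketch. It is not REV's implementation: the toy_hart structure, the flat 64-byte memory, and the function emulate_amoadd_d are all assumptions made for this example. It ignores the aq/rl bits on the assumption that with one hart and no other observers they should have no functional effect.

```c
#include <stdint.h>
#include <string.h>

#define MEM_BYTES 64

/* Hypothetical single-hart model: 32 integer registers, a PC,
 * and a small flat memory. Not REV's data structures. */
typedef struct {
    uint64_t x[32];
    uint64_t pc;
    uint8_t  mem[MEM_BYTES];
} toy_hart;

/* Functional amoadd.d: rd = mem[x[rs1]]; mem[x[rs1]] = old + x[rs2].
 * Returns 0 on success, -1 if the address is misaligned or out of range. */
int emulate_amoadd_d(toy_hart *h, unsigned rd, unsigned rs1, unsigned rs2)
{
    /* Read both source registers first: rd may alias rs1 or rs2,
     * as in "amoadd.d.aqrl a3, a3, (a1)". */
    uint64_t addr = h->x[rs1 & 31];
    uint64_t src  = h->x[rs2 & 31];
    uint64_t old, sum;

    if (addr % 8 != 0 || addr > MEM_BYTES - 8)
        return -1;

    memcpy(&old, &h->mem[addr], sizeof old);   /* load the old value */
    sum = old + src;
    memcpy(&h->mem[addr], &sum, sizeof sum);   /* store old + rs2 */

    if ((rd & 31) != 0)                        /* x0 stays hard-wired to zero */
        h->x[rd & 31] = old;

    h->pc += 4;                                /* always advance past the AMO */
    return 0;
}
```

The last step is the one that matters for the symptom in the log: once the read-modify-write has completed, the PC must move on to the next instruction for every aq/rl combination.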
The single-threaded versions are functionally emulated using only the embedded memory infrastructure; support for these is currently being added on the amo branch.
Fixed.