Giter Club home page Giter Club logo

Comments (22)

leekillough avatar leekillough commented on August 14, 2024 1

Either way, that sb shouldn't corrupt anything right, as it's well within the stack allocation (even if the stack allocation has been compacted because we're not accessing parts of it)?

Yes. It is corrupting argv[0] in main(). I'm going through Rev trying to trace how and why it occurs.

from rev.

leekillough avatar leekillough commented on August 14, 2024

I have reduced the test case by making data as small as necessary, and eliminated the loop:

[[gnu::noipa]] void foo() {
  volatile char data[81];
  data[2] = '\0';
}

int main( int argc, char** argv ) {
  foo();
  return argv[0][0];
}

The assignment to data[2] apparently corrupts argv[0], making the return argv[0][0] dereference a nullptr.

I looked at the assembly code, and it looks mostly correct, although it only allocates 80 bytes for data instead of 81, although that shouldn't affect the data[2] assignment, and might be making sure sp is aligned to 16 bytes:

00000000000101b6 <foo()>:
   101b6:       715d                    add     sp,sp,-80
   101b8:       00010123                sb      zero,2(sp)
   101bc:       6161                    add     sp,sp,80
   101be:       8082                    ret

0000000000010106 <main>:
   10106:       1141                    add     sp,sp,-16
   10108:       e022                    sd      s0,0(sp)
   1010a:       e406                    sd      ra,8(sp)
   1010c:       842e                    mv      s0,a1
   1010e:       0a8000ef                jal     101b6 <foo()>
   10112:       601c                    ld      a5,0(s0)
   10114:       60a2                    ld      ra,8(sp)
   10116:       6402                    ld      s0,0(sp)
   10118:       0007c503                lbu     a0,0(a5)
   1011c:       0141                    add     sp,sp,16
   1011e:       8082                    ret

When I change the declaration to data[82], the compiler completely changes the stack allocation of data to the next 16 byte stack allocation, and makes the assignment of data[2] look strange:

[[gnu::noipa]] void foo() {
  volatile char data[82];
  data[2] = '\0';
}

int main( int argc, char** argv ) {
  foo();
  return argv[0][0];
}
00000000000101b6 <foo()>:
   101b6:       711d                    add     sp,sp,-96
   101b8:       00010523                sb      zero,10(sp)
   101bc:       6125                    add     sp,sp,96
   101be:       8082                    ret
0000000000010106 <main>:
   10106:       1141                    add     sp,sp,-16
   10108:       e022                    sd      s0,0(sp)
   1010a:       e406                    sd      ra,8(sp)
   1010c:       842e                    mv      s0,a1
   1010e:       0a8000ef                jal     101b6 <foo()>
   10112:       601c                    ld      a5,0(s0)
   10114:       60a2                    ld      ra,8(sp)
   10116:       6402                    ld      s0,0(sp)
   10118:       0007c503                lbu     a0,0(a5)
   1011c:       0141                    add     sp,sp,16
   1011e:       8082                    ret

I will try to find the physical address of argv[0] in the simulator, and set up watchpoints to catch its corruption.

from rev.

leekillough avatar leekillough commented on August 14, 2024

It looks like an off-by-1 allocation bug, since char data[81] is allocated only 80 bytes, while char data[82] jumps to 96 bytes. But in our test case, we don't use data[80] or data[81], so the compiler is free to generate code which produces the same results (the "as if" rule).

from rev.

ldalessa avatar ldalessa commented on August 14, 2024

It looks like an off-by-1 allocation bug, since char data[81] is allocated only 80 bytes, while char data[82] jumps to 96 bytes. But in our test case, we don't use data[80] or data[81], so the compiler is free to generate code which produces the same results (the "as if" rule).

This does seem weird. I don't know why gcc rounds to 80 instead of 16, like clang does. My original code seems to allocate the 256 fine, and the crosstool-ng musl toolchain I'm using doesn't produce the off-by-one for 81. I can't reproduce that at godbolt either (clang is smart enough to know it doesn't need the full 96 bytes).

Either way, that sb shouldn't corrupt anything right, as it's well within the stack allocation (even if the stack allocation has been compacted because we're not accessing parts of it)?

from rev.

leekillough avatar leekillough commented on August 14, 2024

Please verify this has been fixed (see #288 ) and close it.

from rev.

ldalessa avatar ldalessa commented on August 14, 2024

Looks like it's fixed my issues. Thank you.

from rev.

ldalessa avatar ldalessa commented on August 14, 2024

I'm still having some problems with stack corruption. I've tried to reduce the test case and come up with the following.

#include <cstdio> 

int main(int, char** argv)
{
    char a[512]{};
    int n = snprintf(a, sizeof(a), "Processing records from %s\n", argv[1]);
    if (argv[1] == nullptr) {
        asm volatile( ".dword 0x00000000" ::: "memory" );
    }
    return 0;
}

Compiled this time as:

$ riscv64-unknown-linux-musl-gcc -march=rv64imafdc -o test test.cpp

Oddly, if I do not store the unused n on the stack, there is no issue, if I leave the n there I fail the assert.

This one uses snprintf from musl, but I do not believe that this is causing the issue directly. If you can't reproduce this let me know and I can try and install a different toolchain.

$ riscv64-unknown-linux-musl-gcc --version
riscv64-unknown-linux-musl-gcc (crosstool-NG 1.26.0.85_06fad54) 14.1.0
Copyright (C) 2024 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

from rev.

ldalessa avatar ldalessa commented on August 14, 2024

Wait wait. I wasn't using -static. I still am having problems but this test case doesn't exhibit it with -static. I'll close for now and reopen once I reduce something.

from rev.

ldalessa avatar ldalessa commented on August 14, 2024

Ugh, no... sorry for the churn. -static does not resolve it, the test is still failing.

from rev.

leekillough avatar leekillough commented on August 14, 2024

There might be two distinct issues at work. My reduced case and another unrelated argv corruption bug were fixed by correcting the allocation of argv. That wasn't caused by corruption, but by the misallocation of argv to use the same space as other variables.

This may be running into an unrelated bug which might be corruption.

from rev.

ldalessa avatar ldalessa commented on August 14, 2024

Okay. It wasn’t manifesting when I was using a static char buffer and it was manifesting with the nulled argv[] pointer so I assumed it was related. I guess if those are allocated near the stack then any memory corruption could look like this.

from rev.

leekillough avatar leekillough commented on August 14, 2024

#290

from rev.

leekillough avatar leekillough commented on August 14, 2024

Does it still fail after #290 has been merged into devel?

from rev.

ldalessa avatar ldalessa commented on August 14, 2024

Does it still fail after #290 has been merged into devel?

Hi Lee. This failure is still present. That #290 is referencing #289 and seems to be targeting it.

from rev.

leekillough avatar leekillough commented on August 14, 2024

Does it still fail after #290 has been merged into devel?

Hi Lee. This failure is still present. That #290 is referencing #289 and seems to be targeting it.

Yes, but it fixed a lot of corruption bugs.

from rev.

leekillough avatar leekillough commented on August 14, 2024

I have reduced it some more, switching to C (smaller executable):

#include <stdio.h>
int main(int argc, char** argv)
{
  char a[365];
  int n = snprintf(a, sizeof(a), "");
  if (!argv[1])
    asm(".word 0");
}

The 365 size of a is the size which triggers it. 364 does not. a does not need to be initialized. The snprintf is apparently necessary, but it can be an empty print. I tried other functions like puts, atoi and they did not trigger it. The variable n is necessary.

(I am providing a single argument with --args="[1]".)

from rev.

leekillough avatar leekillough commented on August 14, 2024

The stack pointer was not always aligned on 16 bytes. Try it now.

from rev.

ldalessa avatar ldalessa commented on August 14, 2024

The stack pointer was not always aligned on 16 bytes. Try it now.

This seems to have resolved the test you posted, but if I bump that buffer size back up to 512 (using your C code example) it's still corrupting argv and failing the assert.

I'm using musl, I'm not sure if that's making any difference.

from rev.

leekillough avatar leekillough commented on August 14, 2024

Increasing it 16 bytes to 381 makes it fail. I am stepping through the simulator, trying to figure out where corruption is occurring.

from rev.

leekillough avatar leekillough commented on August 14, 2024

The corruption occurs:

   104ac:       f482                    sd      zero,104(sp)

when sp + 104 == 0x3ffffb33, and &argv[1] == 0x3ffffb38.

It is in _svfprintf_r, which snprintf calls.

sp has the address 0x3ffffacb, which is not aligned. When main() is entered, sp == 0x3ffffffb, which is not aligned.

from rev.

leekillough avatar leekillough commented on August 14, 2024

It should be fixed now.

from rev.

ldalessa avatar ldalessa commented on August 14, 2024

Works both for my reduced test case and RL code now. Thanks Lee.

from rev.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.