Signed integer overflow/underflow aren't well defined by the C standard and should be

<a class="user-mention notranslate" data-hovercard-type="user" data-hovercard-url="/us

Signed integer overflow in core interpreter about mupen64plus-core HOT 5 CLOSED

mupen64plus commented on June 1, 2024

Signed integer overflow in core interpreter

from mupen64plus-core.

Comments (5)

Nebuleon commented on June 1, 2024

ADDU and SUBU are misnomers on MIPS. They act like the signed two's complement versions, ADD and SUB, but they don't raise the Integer Overflow exception (which mupen64plus does not implement anyway) - they wrap the number around instead.

The true way to deal with this would be to use something like ...

#include <stdint.h>
#define SE32(expr) ((int64_t) ((int32_t) (expr))) /* sign-extend 32-bit to 64 */

DECLARE_INSTRUCTION(ADDU)
{
  /* sign-extend the result of ... */
  rrd = SE32( (
    /* extend the 32-bit operands to 64-bit so that bit #31 can be a carry instead of the sign, then add */
    SE32(rrs32) + SE32(rrt32)
    /* mask that carry away, leaving bits #0..31 of the result, and #32..63 as 0 */
  ) & INT64_C(0xFFFFFFFF) );
}

DECLARE_INSTRUCTION(SUBU)
{
  rrd = SE32( ( SE32(rrs32) - SE32(rrt32) ) & INT64_C(0xFFFFFFFF) );
}

But then you run into issues with DADDU and DSUBU because you'd need an int128_t type to represent the carry of int64_t + int64_t before using & INT128_C(0xFFFFFFFFFFFFFFFF) to mask it away, but C99 doesn't define one (and neither does C11, AFAIK). And I have no idea if gcc can optimise that code up there.

C doesn't define what happens on unsigned overflow either, so using uint32_t and uint64_t isn't an option. You'd need uint64_t and the still inexistent uint128_t to hold the carries into bits 32 and 64 before masking them away.

As long as loops in native code aren't misusing signedness, I don't know if going to such lengths is necessary to protect from undefined behavior in MIPS opcodes:

The compiler doesn't know what interpreter function a MIPS opcode is going to execute, so it emits code to indirectly call the appropriate interpreter function. Because all functions could be needed, none is removed as dead code.
The compiler doesn't know before an opcode executes what values are in the MIPS registers used as operands to the instructions, and the result is stored to a global variable. According to the C standard, that computation cannot be optimised away, so a native instruction must be emitted, unlike a loop that the compiler can reason with at compilation time and decide that it's dead.

I would just ignore the warnings from the sanitizer in MIPS opcodes, personally.

edit: whoops, I accidentally referenced two issues while discussing bit numbers

from mupen64plus-core.

Nebuleon commented on June 1, 2024

OK, so since then, I've discovered that performing operations on unsigned integers is well-defined in C [1]. The following, for example, would be valid reinterpretations of ADD, ADDU, DADD and DADDU, even if two of them imply signed arithmetic, because MIPS uses 2's complement arithmetic for all opcodes:

#include <stdint.h>
#define SE32(expr) ((int64_t) ((int32_t) (expr))) /* sign-extend 32-bit to 64 */

DECLARE_INSTRUCTION(ADDU)
{
  /* sign-extend the result of... */
  rrd = SE32(
    /* + between uint32_t is guaranteed to wrap around modulo 2^32 */
    (uint32_t) rrs32 + (uint32_t) rrt32
  );
}

/* ADD is ADDU with exceptions, which mupen64plus does not raise. Same code here. */

DECLARE_INSTRUCTION(DADDU)
{
  /* assign the result of ..., reinterpreted as int64_t for type correctness */
  rrd = (int64_t) (
    /* + between uint64_t is guaranteed to wrap around modulo 2^64 */
    (uint64_t) rrs + (uint64_t) rrt
  );
}

/* DADD is DADDU with exceptions, which mupen64plus does not raise. Same code here. */

Would you rather have this sort of casting, perhaps?

[1] http://en.wikipedia.org/w/index.php?title=Integer_overflow&oldid=650278548#Origin
"In the C programming language, signed integer overflow causes undefined behavior, while unsigned integer overflow causes the number to be reduced modulo a power of two, meaning that unsigned integers "wrap around" on overflow."

from mupen64plus-core.

commented on June 1, 2024

Would you rather have this sort of casting, perhaps?

If this is the right thing to do then yes

from mupen64plus-core.

Nebuleon commented on June 1, 2024

I actually wonder what standard the checker you use obeys, then.

Because apparently, the definition of <stdint.h> [1], which went into the C99 standard, states:

The typedef name int N _t designates a signed integer type with width N, no padding bits, and a two's-complement representation. Thus, int8_t denotes a signed integer type with a width of exactly 8 bits.

Note "and a two's complement representation", meaning that the signed additions and subtractions are guaranteed to be fully defined already, unlike for the C types signed char, signed short, signed int, signed long and signed long long.

I am still not certain that this an issue. I'd like the opinion of others on this.

[1] http://pubs.opengroup.org/onlinepubs/009695399/basedefs/stdint.h.html

from mupen64plus-core.

comex commented on June 1, 2024

@Nebuleon It is an issue, because even though the representation is defined, the C spec still forbids overflow - specifically, any case where the mathematical result of an operation does not fit in the "range of representable values". And then there's a special exception for unsigned integer types.

6.5.5. If an exceptional condition occurs during the evaluation of an expression (that is, if the result is not mathematically defined or not in the range of representable values for its type), the behavior is undefined.

6.2.5.9. The range of nonnegative values of a signed integer type is a subrange of the corresponding unsigned integer type, and the representation of the same value in each type is the same. A computation involving unsigned operands can never overflow, because a result that cannot be represented by the resulting unsigned integer type is reduced modulo the number that is one greater than the largest value that can be represented by the resulting type.

(citations are to C11)

I just posted a PR to fix some of these overflows, after one of them actually did get miscompiled on my system.

from mupen64plus-core.

Signed integer overflow in core interpreter about mupen64plus-core HOT 5 CLOSED

Comments (5)

Related Issues (20)

Recommend Projects

React

Vue.js

Typescript

TensorFlow

Django

Laravel

D3

Recommend Topics

javascript

web

server

Machine learning

Visualization

Game

Recommend Org

Facebook

Microsoft

Google

Alibaba

D3

Tencent