Giter Club home page Giter Club logo

Comments (12)

greghope667 avatar greghope667 commented on August 22, 2024 1

Hi @CookiePLMonster , thanks for the message. It's good to see people interested in fixing this game.

The issue in TOCA sounds very similar to this. Some timing loop running too fast could very well be the cause of the bug.

From my experience playing this game, the menu freezing is worse on faster PCs. Definitely feels like there's some race condition/edge case bug exposed when the game loads too fast.

My fix here is very much a hack to get the game playable - I worry there could be other side effects or breaking of the physics engine. Actually identifying the root cause of this bug and making a proper fix would certainly be better.

I'll take another look in the debugger and get some control flow graphs/pseudocode so we can see more clearly what's happening.

from master-rallye-freeze-patcher.

greghope667 avatar greghope667 commented on August 22, 2024 1

Here's the function where the main rounding issues are occuring

void __thiscall FUN_0064f810(int param_1_00,float param_1)
{
  float *pfVar1;
  float fVar2;
  undefined4 *puVar3;
  float *pfVar4;
  uint uVar5;
  float local_8 [2];
  
  local_8[0] = param_1 - *(float *)(param_1_00 + 0x14);
  puVar3 = *(undefined4 **)(param_1_00 + 4);
  if (puVar3 == (undefined4 *)0x0) {
    uVar5 = 0;
  }
  else {
    uVar5 = *(int *)(param_1_00 + 8) - (int)puVar3 >> 2;
  }
  if (uVar5 < *(uint *)(param_1_00 + 0x10)) {
    FUN_0041bab0(*(undefined4 *)(param_1_00 + 8),1,local_8);
  }
  else {
    if (puVar3 != puVar3 + (*(uint *)(param_1_00 + 0x10) - 1)) {
      do {
        *puVar3 = puVar3[1];
        puVar3 = puVar3 + 1;
      } while (puVar3 != (undefined4 *)
                         (*(int *)(param_1_00 + 4) + -4 + *(int *)(param_1_00 + 0x10) * 4));
    }
    *(float *)(*(int *)(param_1_00 + 4) + -4 + *(int *)(param_1_00 + 0x10) * 4) = local_8[0];
  }
  fVar2 = 0.0;
  pfVar1 = *(float **)(param_1_00 + 4);
  for (pfVar4 = pfVar1; pfVar4 != *(float **)(param_1_00 + 8); pfVar4 = pfVar4 + 1) {
    fVar2 = fVar2 + *pfVar4;
  }
  if (pfVar1 == (float *)0x0) {
    uVar5 = 0;
  }
  else {
    uVar5 = (int)*(float **)(param_1_00 + 8) - (int)pfVar1 >> 2;
  }
  *(float *)(param_1_00 + 0x14) = param_1;
  *(float *)(param_1_00 + 0x18) = fVar2 / (float)(ulonglong)uVar5 + *(float *)(param_1_00 + 0x18);
  return;
}

The original C++ would have been something like

struct P {
  uint32_t _unknown;
  std::vector<float> times; // { offset 4=begin, 8=end, c=capacity }
  uint32_t max_count;       // { offset 10 }
  float time;               // { offset 14 }
  float smoothed_time;      // { offset 18 }
};

void P::advance_time(float measured)
{
  float diff = measured - this->time;

  if (this->times.size() < this->max_count) {
    this->times.push_back(diff);
  }
  else {
    // Discard oldest value
    float *pf = this->times.begin();
    do {
      *pf = pf[1]; pf++;
    } while (pf != &this->times.back());
    // Store new value
    this->times.back() = diff;
  }
  float total_time = 0.0f;
  for (float *pf = this->times.begin(); pf != this->times.end(); pf++) {
    total_time += *pf;
  }
  this->time = measured;
  this->smoothed_time += total_time / this->times.size();
  return;
}

where the measured parameter is the current time, in seconds, from QueryPerformanceCounter. When the interval between calls of advance_time are too small diff becomes 0 and smoothed_time stops advancing. The game waits until smoothed_time hits a predefined threshold to process updates.

I don't know the best way of fixing this. Adding a delay to the main loop (which my always-enable-rendering fix does) works, as the subtraction doesn't underflow to zero. I tried disabling this time smoothing (by changing the code to this->smoothed_time = measured;) which also works. I think the smoothing is only relevant for when the main loop time is slower than the fixed update rate (30 Hz), which is not true for any reasonably modern PC.

It might be possible to re-order the entire time handling logic to fix this but that's a much bigger change.

from master-rallye-freeze-patcher.

CookiePLMonster avatar CookiePLMonster commented on August 22, 2024

An alternate fix seems to exist, and indeed it's changing the timer frequency - https://sharemods.com/w63ctajylh7f/Master.Rallye.MenuFix.rar.html. Parallels with TOCA's issue are probably true, then.

from master-rallye-freeze-patcher.

CookiePLMonster avatar CookiePLMonster commented on August 22, 2024

Sooner or later I'll look into this myself, but for now I suspect you should have good results undoing your change, and then breaking into the debugger during that freeze - there is a very high chance the code will break right inside the broken loop for you to inspect.

from master-rallye-freeze-patcher.

greghope667 avatar greghope667 commented on August 22, 2024

I've made some progress and I think I've found the bug that's causing the freeze. It's pretty much what you suggested - when rendering is disabled, the code is running too fast and getting stuck thinking no time has passed. There's some differences to the TOCA bug though as we're not getting stuck in any particular function. The whole main loop is still running but not making progress.

In this case, the result of QueryPerformanceCounter is getting cast to a 32-bit float and differences are taken between successive readings. If these differences are small the result underflows to zero causing the game to get stuck. As the performance counter gets larger the rounding errors get larger and the game slows more.

My fix worked because it force enabled rendering, so the elapsed time was always large enough to stay non-zero. Alt-tabbing during the freeze (which works for some people) causes events which also slows down the game loop, sometimes allowing time to progress.

I'll post some decompiled code to better show what's going on.

from master-rallye-freeze-patcher.

CookiePLMonster avatar CookiePLMonster commented on August 22, 2024

Amazing job cleaning that pseudocode! IDA's would probably be a bit cleaner than ghidra's, but either way your interpretation makes sense.

You're most likely correct in your assessment about diff, but that also means that on paper re-enabling rendering might break again in the future, as PCs get even faster?

I wonder if advance_time is called often enough for the diff to be smaller than FLT_EPSILON - I suppose it can happen as measured increases the exponent as time goes on. I would triplecheck the way measured is constructed, as some degree of unnecessary precision loss may happen here too - however, looking at this function it might indeed need a rewrite... or at the very last, diff = max(diff, FLT_EPSILON) which is a hack but might help if this tick function is indeed called tens of thousands of times per second. I'll have to look into the executable myself for that.

from master-rallye-freeze-patcher.

greghope667 avatar greghope667 commented on August 22, 2024

Thanks, was very confused with the code until I realised it's all just std::vector stuff 😊. Don't have IDA pro (but might have to give the free version a try at some point).

Getting the measured time looks fine to me:

struct clock {
  int64_t epoch;
  float rate;
};

long double get_time(clock& c) // FUN_006455c0
{
  LARGE_INTEGER timestamp;
  QueryPerformanceCounter(&timestamp);
  return (long double)(timestamp.QuadPart - c.epoch) / (long double)c.rate;
}

The rate is from QueryPerformanceFrequency - 1e+7 on my machine. Interestingly there is some code for changing the epoch in the main loop under some conditions - will need investigation.

I realise I'm not sure the precision of most of the floating point calculations here. All the storage is happening using 32-bit floats, but values in x87 registers are higher precision (I believe?). This could actually be quite important for exactly when the subtraction in advance_time underflows.

The main game loop is quite a lot of code, but the overall structure is something like

while (true) {
  // Process events from Windows
  // ...
	
  float measured = get_time(clock);
  timer.advance_time(measured);

  while (timer.smoothed_time > (tick_count / 30.0)) {
    // Advance game state (by 1/30 s)
    // ...
    tick_count++;
    // ...
    if (game_loading) {
      // ...
    }
  }
	
  if (game_loading == false) {
    render_frame();
  }
}

If there's no events and the game is loading, almost zero code gets executed per loop while the timer is below the threshold.

It appears that there's a 2ms delay loop in render_frame somewhere (which would explain the 500Hz maximum render framerate). This is enough of a delay on my machine to stop the bug, but if the perf counter gets too high then it might not be enough - not sure how long you'd need to keep the game running for that to happen though.

from master-rallye-freeze-patcher.

CookiePLMonster avatar CookiePLMonster commented on August 22, 2024

OK I checked the code - oddly enough, long double get_time(clock& c) // FUN_006455c0 didn't match my original EU executable (from http://redump.org/disc/61066/) but does match the PL executable (from http://redump.org/disc/51494/). Is this a no-CD or a patched release? I checked patches-scrolls but there seem to have been no patches released for the game.

from master-rallye-freeze-patcher.

greghope667 avatar greghope667 commented on August 22, 2024

I've been using a no-cd patch from myabandonware - I think the game had DRM which didn't work on Windows 10.

Not sure which version of the game this was based on. I assumed all the versions of the game were basically the same but apparently that's not the case. Is the code the same and there's just an offset/address difference?

from master-rallye-freeze-patcher.

CookiePLMonster avatar CookiePLMonster commented on August 22, 2024

I've been using a no-cd patch from myabandonware - I think the game had DRM which didn't work on Windows 10.

The copy I got didn't at least - looks like the no-CD patch from myabandonware may be an official DRM free executable then, neat.

Not sure which version of the game this was based on. I assumed all the versions of the game were basically the same but apparently that's not the case. Is the code the same and there's just an offset/address difference?

Yeah it's just addresses which poses no challenge at all.

I think I got a good fix in place - I ran a few tests and you're 100% right in that diff becomes zero due to how often the time gets advanced. I implemented a fix which I think is "good enough" not to be called a hack, and the one that avoids changing the logic of those timers:

void Advance_Throttled(float time)
{
	// Only call the original function if a delta between the old time and the current time
	// can be represented as a floating point value
	if (time >= std::nextafter(m_time, std::numeric_limits<float>::infinity()))
	{
		Advance(time);
	}
}

Advancing the timer so often so it doesn't actually increment is meaningless, because this function has nothing resembling a "tick count" - therefore, I tried throttling updates just enough for the diffs to become nonzero, and it seems to work just fine. std::nextafter is probably a better idea than comparing the differences against epsilon, since no matter what the exponenf of time is, it'll only be called often enough for the delta to be meaningful.

Obviously this is not a perfect solution that should be implemented if this was to be done on the source level - but I think it's as good as it gets when it has to be retrofitted on top of an existing implementation.

EDIT:
On the second glance I guess it could just be done as time > m_time, and using nextafter is not needed... I need to run another test.

EDIT2:
Yep, doing it simpler works too, since a plain > comparison against two floats will already return false if they are equal due to precision loss! Abusing inaccuracies for my own benefit here.

from master-rallye-freeze-patcher.

greghope667 avatar greghope667 commented on August 22, 2024

That's great if that works, would be a good fix. I agree that if we had the source I'd probably re-structure this (and move the loading logic to not be tick rate dependent too).

I'm not sure exactly what's going on with the multiple different exe files. There's a bunch of different executables floating around online, some with various patches applied. Would be good if we could get a working exe starting from an original source.

from master-rallye-freeze-patcher.

CookiePLMonster avatar CookiePLMonster commented on August 22, 2024

I'm not sure exactly what's going on with the multiple different exe files. There's a bunch of different executables floating around online, some with various patches applied. Would be good if we could get a working exe starting from an original source.

That's probably a pipe dream, so the next best thing is supporting as many exes as possible - which is why I typically stick to finding patterns in the code and not hardcoding memory addresses.

from master-rallye-freeze-patcher.

Related Issues (1)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.