Decompilation of Mario Kart Wii
Other projects like BFBB Decompile have a neat visual progress tracker to show the state of the decompile. This could be appealing to potential future contributors.
The closest thing we have is the build.py final output, which looks like this:
It would be nice to show similar progress statistics on this repository.
graphic.py (https://riidefi.github.io/mkw) only visualizes the DOL.
Tracking issue for DWC and the GameSpy library.
Tracking issue for the Item namespace.
Let's enforce code formatting.
pre-commit is a utility for running scripts (hooks) before creating a Git commit.
IDEs like VSCode automatically install integrations when you enable "trust this project".
pre-commit supports hooks for clang-format and black.
RVL/IOS implements RPCs for IOS.
Tracking issue for the layout component of the nw4r library.
The linker is supposed to generate the extab and extabindex sections.
Since we don't have the source code to do that, we used binary blob .s files to recreate them.
This is where the first problem occurs. The linker segfaults when feeding it object files with existing extab / extabindex sections.
As a workaround, the sections were renamed to extab_ / extabindex_.
The second problem is the symbol _eti_init_info.
This is a linker-generated symbol. Trying to reference it when using the extab_ hack instead of actual extab is also going to make the linker crash.
So as another workaround, I've defined _eti_init_info_ with a hardcoded address.
I feel like we should work towards the root problem: Fixing extab and extabindex in the object files with some magic, so that the linker has no problem generating its exception-related symbols and sections.
I'm not sure on this but I think if we don't do this, we're not going to have C++ runtime exception support.
Such as section sizes like we are doing with the dol.
Create C files for all MetroTRK code and fill with decomp / inline ASM.
And generate the relocation tables as part of the build script.
@riidefi: We should consider requiring gen_asm on build and purging the asm out of the repo. It would really cut down on the git size since they're all tracked.
The README.md mentions that source files contain Doxygen annotations.
Let's set up a GitHub Action to publish the latest docs to https://riidefi.github.io/mkw/docs/
Summary: Tracking issue for improved ASM structuring
Context
The code parts that have not been decompiled yet are kept as assembly asm/*.s files.
These files are generated from the game's binary executables.
The code generating backend for this uses a naive slicing approach:
A "slices table" (slices.csv) keeps track of which address ranges come from ASM or C/C++ sources.
This table only defines the C/C++ slices. The ASM slices are derived by looking at the "gaps".
These gaps get constructed by ASM files for each section (text, data, ...).
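The gap derivation described above can be sketched roughly as follows. This is only an illustration: the function name `derive_gaps`, the half-open `(start, stop)` tuples and the example addresses are assumptions, not the project's actual slice table code.

```python
from typing import List, Tuple

# Hypothetical representation: a half-open address range [start, stop).
Slice = Tuple[int, int]


def derive_gaps(section: Slice, c_slices: List[Slice]) -> List[Slice]:
    """Return the parts of `section` that no C/C++ slice covers.

    These uncovered ranges are the ones that would come from ASM files.
    """
    gaps = []
    cursor = section[0]
    for start, stop in sorted(c_slices):
        if start > cursor:
            # Nothing claims [cursor, start), so it is a gap.
            gaps.append((cursor, start))
        cursor = max(cursor, stop)
    if cursor < section[1]:
        gaps.append((cursor, section[1]))
    return gaps


# Example with made-up addresses for a single section.
print(derive_gaps((0x100, 0x200), [(0x120, 0x140), (0x180, 0x200)]))
```

Sorting the C/C++ slices first keeps the walk linear and makes the result independent of the order in which the table lists them.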
Problems
gen_asm is quite destructive.
Tasks
.s files. This fixes problems 1, 2, 4
Move RVL/NAND from assembly to C inline assembly.
Tracking issue to decompile RVL/NAND, which is currently inline assembly.
I want to add a badge like this to README that keeps itself updated:
We'll use GitHub Actions artifacts to store the percentage.
Write path:
build.yml workflow
python -m mkwutil.progress.percent_decompiled --short
Display path:
The GET artifact endpoint is permissioned, so will have to create a GitHub App.
EDIT: A personal access token would also work, but a GitHub App allows forks to use the integration too.
We can use Heroku, Google App Engine or Google Cloud Functions as the backend.
Cloud Functions looks the most attractive atm.
Estimate 2-4h.
Once we get it running we can start adding more dynamic content to the README without much overhead.
The build system currently runs compile jobs in parallel to decrease compile time.
This is achieved using the multiprocessing library. The multiprocessing library spawns additional instances of the Python interpreter that run independently.
This approach is quite fast, but it causes problems with stdout buffering and early termination: Ctrl+C signals and build errors are not handled properly. Instead of terminating, the build workers will continue to compile their whole job queue away. This makes it hard to see where an error occurs.
Modern Python has a nicer way to deal with this. The asyncio/subprocess library allows dispatching multiple subprocesses (compiler invocations) from the same interpreter instance. Stdout is handled asynchronously through an event loop.
So, let's either look into asyncio's subprocess functionalities, or consider using a full-fledged Python build system like SCons.
GNU Make was suggested in the past but rejected due to concerns about Windows compatibility and maintainability (Makefiles are unforgiving and opaque at times).
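As a rough idea of what the asyncio route could look like, here is a minimal sketch. The names `run_job` and `run_all` and the `max_parallel` parameter are made up for illustration, and the commands are passed in by the caller rather than taken from build.py.

```python
import asyncio
import sys
from typing import List, Sequence


async def run_job(cmd: Sequence[str], limit: asyncio.Semaphore) -> str:
    """Run one command and return its combined stdout/stderr as text."""
    async with limit:
        proc = await asyncio.create_subprocess_exec(
            *cmd,
            stdout=asyncio.subprocess.PIPE,
            stderr=asyncio.subprocess.STDOUT,
        )
        out, _ = await proc.communicate()
        text = out.decode(errors="replace")
        if proc.returncode != 0:
            raise RuntimeError(f"{' '.join(cmd)} failed:\n{text}")
        return text


async def run_all(cmds: Sequence[Sequence[str]], max_parallel: int = 4) -> List[str]:
    """Run all commands with bounded parallelism; stop at the first failure."""
    limit = asyncio.Semaphore(max_parallel)
    tasks = [asyncio.create_task(run_job(cmd, limit)) for cmd in cmds]
    try:
        return await asyncio.gather(*tasks)
    except BaseException:
        # Cancel the remaining tasks so queued jobs never start.
        # Note: this sketch does not kill processes that are already running.
        for task in tasks:
            task.cancel()
        await asyncio.gather(*tasks, return_exceptions=True)
        raise


if __name__ == "__main__":
    # Stand-in commands; a real build would pass compiler invocations.
    jobs = [[sys.executable, "-c", f"print({i})"] for i in range(3)]
    for output in asyncio.run(run_all(jobs)):
        print(output, end="")
```

Because the output of each job is collected and returned in submission order, the log stays readable, and the first failing command surfaces as a single exception with its own output attached.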
SliceTable.remove() removes slices of code and inserts gaps.
We use it to accurately calculate the amount of decompiled code by reading slices from object files first, and then removing the inline asm functions using SliceTable.remove().
Unfortunately there is a bug in this function that causes the amount of decompiled code to be overreported in some cases (atm 12% instead of 10%).
Can't reproduce the bug with unit tests yet so it's probably best to rewrite the whole slice table remove algorithm.
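For reference, a rewrite could start from plain interval subtraction. The sketch below is not the existing SliceTable code; `remove_range` and `total_size` are hypothetical helpers operating on half-open `(start, stop)` tuples.

```python
from typing import List, Tuple

Slice = Tuple[int, int]  # half-open range [start, stop), assumed layout


def remove_range(slices: List[Slice], start: int, stop: int) -> List[Slice]:
    """Return `slices` with [start, stop) cut out, splitting where needed."""
    result = []
    for s, e in slices:
        if e <= start or s >= stop:
            # No overlap: keep the slice untouched.
            result.append((s, e))
            continue
        if s < start:
            # Keep the part before the removed range.
            result.append((s, start))
        if e > stop:
            # Keep the part after the removed range.
            result.append((stop, e))
    return result


def total_size(slices: List[Slice]) -> int:
    """Number of bytes covered by the given non-overlapping slices."""
    return sum(e - s for s, e in slices)


# Removing an inline asm function from the middle of a slice leaves two pieces.
remaining = remove_range([(0x0, 0x10)], 0x4, 0x8)
print(remaining, total_size(remaining))
```

Treating every slice independently avoids any bookkeeping between neighbours, which makes the edge cases (removal at the start, at the end, or spanning several slices) easy to cover with unit tests.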
To reduce the size of repo, we want to remove ASM files not only from latest commit, but from all commits.
This was achieved using BFG here.
Although not having a symbol map does make this a bit tricky, we should be able to make some very reasonable guesses based on cross-references. Some special considerations might need to be made for jump tables.
Tracking issue for decompiling the nw4r::math library.
Contains:
RVL/IPC is low-level PPC <=> ARM communication.
Most of the unresearched code currently sits in a handful of large assembly blobs.
These blobs contain lots of unrelated pieces of code. We need to improve structuring.
A basic improvement is to recover the original translation unit slices and generate C inline ASM files for each TU.
The CodeWarrior build system leaks some information on TU structure.
Examples:
Let's investigate disabling warning 10369 or filtering it in python.
When the reconstructed main.dol/StaticR.rel doesn't match the genuine game files,
the verification script prints some generic information like segment size.
What's missing is debugging information about which parts of the binary are actually different, by showing hex or assembly diffs.
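A hex diff could be built on something like the following sketch. The function names and the `base` parameter are assumptions made for illustration; the real verification script may structure its data differently.

```python
from typing import List, Tuple


def diff_ranges(expected: bytes, actual: bytes) -> List[Tuple[int, int]]:
    """Return half-open offset ranges where the two buffers differ."""
    ranges = []
    start = None
    common = min(len(expected), len(actual))
    for i in range(common):
        if expected[i] != actual[i]:
            if start is None:
                start = i
        elif start is not None:
            ranges.append((start, i))
            start = None
    if start is not None:
        ranges.append((start, common))
    if len(expected) != len(actual):
        # Anything past the shorter buffer counts as different.
        ranges.append((common, max(len(expected), len(actual))))
    return ranges


def format_diff(expected: bytes, actual: bytes, base: int = 0) -> List[str]:
    """Render each differing range as one line of hex, offset by `base`."""
    lines = []
    for s, e in diff_ranges(expected, actual):
        lines.append(
            f"0x{base + s:08x}..0x{base + e:08x}: "
            f"expected {expected[s:e].hex()} got {actual[s:e].hex()}"
        )
    return lines


for line in format_diff(b"abcdef", b"abXXef", base=0x80004000):
    print(line)
```

The differing ranges could then be handed to a disassembler to print assembly diffs instead of raw hex.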
Tracking issue for the Kart namespace.
Ideally let's get the decomp to a point where someone without knowledge of assembly or a decompiler could meaningfully contribute. The fact that we know our compiler and its quirks opens up a few opportunities. In abstract, let's run a series of passes on each function checking matching after each. (This will be built on #42).
Before:
asm int NANDCreate(void) {
nofralloc;
stwu r1, -0x20(r1);
mflr r0;
stw r0, 0x24(r1);
stw r31, 0x1c(r1);
mr r31, r5;
stw r30, 0x18(r1);
mr r30, r4;
stw r29, 0x14(r1);
mr r29, r3;
bl nandIsInitialized;
cmpwi r3, 0;
beq lbl_8019b490;
mr r3, r29;
mr r4, r30;
mr r5, r31;
li r6, 0;
li r7, 0;
li r8, 0;
bl nandCreate;
bl nandConvertErrorCode;
b lbl_8019b494;
lbl_8019b490:
li r3, -128;
lbl_8019b494:
lwz r0, 0x24(r1);
lwz r31, 0x1c(r1);
lwz r30, 0x18(r1);
lwz r29, 0x14(r1);
mtlr r0;
addi r1, r1, 0x20;
blr;
}
After:
int NANDCreate(void) {
__asm{
mr r31, r5;
mr r30, r4;
mr r29, r3;
bl nandIsInitialized;
cmpwi r3, 0;
beq lbl_8019b490;
mr r3, r29;
mr r4, r30;
mr r5, r31;
li r6, 0;
li r7, 0;
li r8, 0;
bl nandCreate;
bl nandConvertErrorCode;
b lbl_8019b494;
lbl_8019b490:
li r3, -128;
lbl_8019b494:
}
}
int NANDCreate(void) {
__asm {
mr r31, r5;
mr r30, r4;
mr r29, r3;
}
__asm {
bl nandIsInitialized;
cmpwi r3, 0;
beq lbl_8019b490;
}
__asm {
mr r3, r29;
mr r4, r30;
mr r5, r31;
li r6, 0;
li r7, 0;
li r8, 0;
bl nandCreate;
bl nandConvertErrorCode;
b lbl_8019b494;
}
lbl_8019b490:
__asm {
li r3, -128;
}
lbl_8019b494:
}
int NANDCreate(int in_a, int in_b, int in_c) {
__asm {
bl nandIsInitialized;
cmpwi r3, 0;
beq lbl_8019b490;
}
__asm {
mr r3, in_a;
mr r4, in_b;
mr r5, in_c;
li r6, 0;
li r7, 0;
li r8, 0;
bl nandCreate;
bl nandConvertErrorCode;
b lbl_8019b494;
}
lbl_8019b490:
__asm {
li r3, -128;
}
lbl_8019b494:
}
int NANDCreate(int in_a, int in_b, int in_c) {
__asm {
bl nandIsInitialized;
cmpwi r3, 0; // We need this, might be scheduled; Comparison below will be omitted in favor of this.
}
if (GET_REG(r3) != 0) {
__asm {
mr r3, in_a;
mr r4, in_b;
mr r5, in_c;
li r6, 0;
li r7, 0;
li r8, 0;
bl nandCreate;
bl nandConvertErrorCode;
}
} else {
__asm {
li r3, -128;
}
}
}
This is all still in the domain of asm, and we can likely get a very high yield on these passes (near 100% match).
Currently, percent_decompiled considers everything "decompiled" that's in a C file.
Since we are lifting a lot of ASM out from assembly files to C files without decompiling it (for now), this would be cheating.
percent_decompiled needs to be aware of which parts of a C file are inline ASM and which are regular C code. We found two approaches for this.
1. Comments and Regex
Auto-generated functions currently look like this:
// Symbol: nandSplitPerm
// Function signature is unknown.
// PAL: 0x8019c12c..0x8019c1b8
asm UNKNOWN_FUNCTION(nandSplitPerm) {
The following regex allows extracting the address ranges of matching function signature.
// Function signature is unknown\.\n// PAL: (0x[0-9a-f]{8}\.\.0x[0-9a-f]{8})\nasm UNKNOWN_FUNCTION
However, this approach is kind of brittle. There won't be any clear signs of breakage if matching fails.
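If the regex route is taken anyway, a cheap sanity check can at least make breakage visible. The sketch below is an assumption about how that might look: `find_asm_ranges` is a hypothetical helper, and it compares the number of regex matches with the number of `asm UNKNOWN_FUNCTION(` occurrences in the file.

```python
import re
from typing import List, Tuple

# Same pattern as above, with the literal dots escaped and the symbol captured.
BLOB_RE = re.compile(
    r"// Function signature is unknown\.\n"
    r"// PAL: (0x[0-9a-f]{8})\.\.(0x[0-9a-f]{8})\n"
    r"asm UNKNOWN_FUNCTION\((\w+)\)"
)


def find_asm_ranges(source: str) -> List[Tuple[str, int, int]]:
    """Return (symbol, start, stop) for each auto-generated asm function."""
    found = [
        (m.group(3), int(m.group(1), 16), int(m.group(2), 16))
        for m in BLOB_RE.finditer(source)
    ]
    expected = source.count("asm UNKNOWN_FUNCTION(")
    if len(found) != expected:
        # Fail loudly instead of silently under-counting inline asm.
        raise ValueError(
            f"matched {len(found)} of {expected} asm UNKNOWN_FUNCTION headers"
        )
    return found


sample = (
    "// Symbol: nandSplitPerm\n"
    "// Function signature is unknown.\n"
    "// PAL: 0x8019c12c..0x8019c1b8\n"
    "asm UNKNOWN_FUNCTION(nandSplitPerm) {\n"
)
print(find_asm_ranges(sample))
```

This does not remove the brittleness, but a reformatted comment now raises an error instead of quietly inflating the reported percentage.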
2. Custom section
A more robust approach is to define a custom section that holds this data.
Maybe something like:
#pragma section "binary_blobs"
__declspec (section "binary_blobs") static const char MARK_BINARY_BLOB_0x8019c12c[] = "BINARY BLOB: <name>\t0x8019c12c\t0x8019c1b8\n" __attribute__((never_inline));
This can be wrapped in a nice macro.
graphic.py is non-deterministic. It randomizes colors on every run. This causes the generated index.html to be different on every commit, even if the decompiled slices weren't changed. As a result, this spams the gh-pages branch with useless commits.
graphic.py should be updated to use a deterministic PRNG for generating colors.
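One possible shape for this, as a sketch only: seed a private PRNG from a stable checksum of each slice's name. The helper `color_for` and the idea of keying colors by name are assumptions; graphic.py may key its colors differently.

```python
import random
import zlib


def color_for(name: str) -> str:
    """Return a '#rrggbb' color that is always the same for a given name."""
    # zlib.crc32 is stable across runs, unlike the built-in hash() for strings.
    rng = random.Random(zlib.crc32(name.encode("utf-8")))
    # Stay in a mid range so colors are neither too dark nor too bright.
    channels = [rng.randrange(64, 224) for _ in range(3)]
    return "#{:02x}{:02x}{:02x}".format(*channels)


# The same name yields the same color on every run.
print(color_for("RVL/NAND"), color_for("RVL/NAND"))
```

Seeding per name rather than once globally keeps existing colors unchanged when new slices are added, so index.html changes only where the slices do.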
The RFL (presumably "Revolution Face Library") implements Mii functionality.
Miis are playable characters in MKW.
Let's lift the RFL ASM out of binary blobs into inline-ASM C files to prepare decompilation.
The CI pipelines don't check whether changes to slices, object file lists and asm files match the output of gen_asm.
To fix this, CI should run gen_asm and then check whether this second run changed any files in the Git repo.
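A CI step for this check might look like the sketch below. The helper names are hypothetical, and the generator command is passed in by the caller because the exact gen_asm invocation is not spelled out here. It relies on `git status --porcelain`, whose v1 output puts a two-character status and a space before each path.

```python
import subprocess
from typing import List, Sequence


def parse_porcelain(output: str) -> List[str]:
    """Extract paths from `git status --porcelain` (v1) output.

    Paths with special characters are quoted by Git; this sketch keeps them as-is.
    """
    paths = []
    for line in output.splitlines():
        if len(line) > 3:
            # Skip the two status characters and the separating space.
            paths.append(line[3:])
    return paths


def changed_after(cmd: Sequence[str], repo_dir: str = ".") -> List[str]:
    """Run `cmd` (e.g. the generator) and list files Git now sees as changed."""
    subprocess.run(cmd, cwd=repo_dir, check=True)
    status = subprocess.run(
        ["git", "status", "--porcelain"],
        cwd=repo_dir,
        check=True,
        capture_output=True,
        text=True,
    )
    return parse_porcelain(status.stdout)


# Parsing example with made-up paths.
print(parse_porcelain(" M asm/text.s\n?? out.txt\n"))
```

CI would fail the job whenever `changed_after(...)` returns a non-empty list, and print the paths so the contributor knows what to regenerate.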
Installing 7-zip with choco costs 61 precious seconds on each CI run.
We need it because the tooling is distributed as a 7-zip file.
Distributing it via a .zip file would work just fine and makes CI runs faster by not having to install 7-zip.
Tracking issue to decompile the game's layered archive filesystem code.
Names not final
I have followed the tutorial and installed everything (except CodeWarrior compilers bc i cant find them). When building, i get this error.
Traceback (most recent call last):
  File "D:\Users\Korisnik\Documents\Wii\MKWii Mods\mkw-master\mkw-master\build.py", line 328, in <module>
    build()
  File "D:\Users\Korisnik\Documents\Wii\MKWii Mods\mkw-master\mkw-master\build.py", line 315, in build
    compile_sources()
  File "D:\Users\Korisnik\Documents\Wii\MKWii Mods\mkw-master\mkw-master\build.py", line 243, in compile_sources
    compile_queued_sources()
  File "D:\Users\Korisnik\Documents\Wii\MKWii Mods\mkw-master\mkw-master\build.py", line 175, in compile_queued_sources
    pool.map(lambda s: compile_source_impl(*s), gSourceQueue)
  File "C:\Program Files\WindowsApps\PythonSoftwareFoundation.Python.3.9_3.9.1776.0_x64__qbz5n2kfra8p0\lib\multiprocessing\pool.py", line 364, in map
    return self._map_async(func, iterable, mapstar, chunksize).get()
  File "C:\Program Files\WindowsApps\PythonSoftwareFoundation.Python.3.9_3.9.1776.0_x64__qbz5n2kfra8p0\lib\multiprocessing\pool.py", line 771, in get
    raise self._value
  File "C:\Program Files\WindowsApps\PythonSoftwareFoundation.Python.3.9_3.9.1776.0_x64__qbz5n2kfra8p0\lib\multiprocessing\pool.py", line 125, in worker
    result = (True, func(*args, **kwds))
  File "C:\Program Files\WindowsApps\PythonSoftwareFoundation.Python.3.9_3.9.1776.0_x64__qbz5n2kfra8p0\lib\multiprocessing\pool.py", line 48, in mapstar
    return list(map(*args))
  File "D:\Users\Korisnik\Documents\Wii\MKWii Mods\mkw-master\mkw-master\build.py", line 175, in <lambda>
    pool.map(lambda s: compile_source_impl(*s), gSourceQueue)
  File "D:\Users\Korisnik\Documents\Wii\MKWii Mods\mkw-master\mkw-master\build.py", line 149, in compile_source_impl
    process = subprocess.Popen(
  File "C:\Program Files\WindowsApps\PythonSoftwareFoundation.Python.3.9_3.9.1776.0_x64__qbz5n2kfra8p0\lib\subprocess.py", line 951, in __init__
    self._execute_child(args, executable, preexec_fn, close_fds,
  File "C:\Program Files\WindowsApps\PythonSoftwareFoundation.Python.3.9_3.9.1776.0_x64__qbz5n2kfra8p0\lib\subprocess.py", line 1420, in _execute_child
    hp, ht, pid, tid = _winapi.CreateProcess(executable, args,
FileNotFoundError: [WinError 2] The system cannot find the file specified
I think we can reorganize the python code structure to be a bit more clear.
Suggested changes:
Currently, we need to link an entire .dol/.rel executable to verify a function matches. Introducing a .o validator will enable faster iteration for both humans and computers ;)
graphic.py is a neat script that visualizes the compilation process by compiling a static HTML file.
It would be a shame to let this go unnoticed.
GitHub offers an easy way to continually deploy auto-generated websites to their GitHub Pages hosting services. https://github.com/marketplace/actions/deploy-to-github-pages
The site would be accessible under https://riidefi.github.io/mkw/
Macro to get the offset of a struct member in bytes.
Simple one, add it to stddef.h or cstddef.
..Linux is fine though.
See: https://github.com/riidefi/mkw/runs/3566024287#step:10:8
stebler:
GhostFile::writeHeader is non-matching because of some weirdness in the lap times loop. I'm at least confident that bit fields are used because they got me to immediately match parts of GhostFile::readHeader and GhostFile::writeHeader that didn't work with the inline getters over reinterpret_cast I initially tried. Using a local union can get the loop closer to the original asm, but scheduling and regalloc stay different.
GhostFileGroup::get needed some hack to prevent the compiler from shortening it with a lbzxu instruction. It is disabled when NON_MATCHING is set.
GhostFile::init, RawGhostFile::compress and RawGhostFile::decompress will be decompiled at a later time.
The current disassembler routine is a hack.
It starts by invoking Capstone, which doesn't support the complete Gekko instruction set.
Capstone is going to abort on undefined instructions.
We fall back to custom disassembler extensions for the rest.
Ideally, we'd want to use only one disassembler. Options:
To ensure this project keeps working for Linux folks, we should add a CI configuration that runs on Linux. CodeWarrior Windows-only tooling runs fine under Wine.
Follow-up of #88
Tracking issue for decompiling the RVL/MEM library.
RVL/MEM implements the memory allocator and is a dependency of the EGG allocator.
Create C files for all MSL libc code and fill with decomp / inline ASM.
@riidefi mentioned in #31 that asm void-style functions are suspected of changing compiler flags, affecting how other, unrelated code in the same TU is compiled.
We could wrap inline ASM functions in push/pop pragmas to back up and restore the compiler settings state as a workaround.
Before doing that, it would be nice to create a minimal code sample that reproduces this effect on a known affected compiler version, and check whether it affects the ones we're using as well.
Tracking issue for decompiling the RVL/MTX library.
RVL/MTX implements maths primitives. nw4r::math depends on it.
Contains PSMTX* and PSVEC* C functions.
Checklist