esaulenka / ghidra_c166
Ghidra SLEIGH module for Bosch C166 MCU
License: Apache License 2.0
There is something very wrong in the slaspec file.
The following code is decoded correctly by IDA Pro:
F7 F0 B0 FE movb S0TBUF, rL0
and
F3 F0 B2 FE movb rL0, S0RBUF
The special function register FEB0 is ASC0_TBUF or S0TBUF (depending on which Infineon manual you use)
The special function register FEB2 is ASC0_RBUF or S0RBUF (depending on which Infineon manual you use)
But what Ghidra decodes is completely wrong:
f7 f0 b0 fe movb 0x3eb0, RL0
and
f3 f0 b2 fe movb RL0, 0x3eb2
FEB0 is wrongly converted to 3EB0
FEB2 is wrongly converted to 3EB2
Also wrong:
All the following:
f7 f0 04 81 movb 0x0104,RL0
f7 f0 04 91 movb 0x4104,RL0
f7 f0 04 a1 movb 0x8104,RL0
f7 f0 04 c1 movb 0xC104,RL0
are displayed as if they were the same instruction:
f7 f0 04 x1 movb 0x104,RL0
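One possible explanation for both symptoms, offered as an assumption rather than a confirmed diagnosis of the slaspec: on the C166 the top two bits of a 16-bit mem operand select one of DPP0..DPP3, and only the low 14 bits are an offset inside a 16 KB page. If only the offset is printed, the selector bits are lost. A minimal Python sketch of that split (the helper name `split_mem16` is hypothetical, not part of the module):

```python
def split_mem16(addr16):
    """Split a 16-bit C166 'mem' operand into (DPP selector, 14-bit page offset)."""
    selector = (addr16 >> 14) & 0x3   # bits 15..14 choose DPP0..DPP3
    offset = addr16 & 0x3FFF          # bits 13..0 are the offset inside a 16 KB page
    return selector, offset


if __name__ == "__main__":
    for addr in (0xFEB0, 0xFEB2, 0x0104, 0x4104, 0x8104, 0xC104):
        sel, off = split_mem16(addr)
        print(f"0x{addr:04X} -> DPP{sel}, offset 0x{off:04X}")
```

Under this reading, FEB0 and FEB2 lose their two top bits and become 3EB0 and 3EB2, and the addresses 0x0104, 0x4104, 0x8104 and 0xC104 all share the offset 0x0104 while differing only in the selector.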
In the file c166.slaspec you define the registers.
But most are missing.
I extracted the information from 3 Infineon manuals.
Sadly there are some contradictions.
An unexpected error occurred while processing the command: Disassemble
java.lang.NullPointerException
at ghidra.program.database.symbol.SymbolManager.getPrimarySymbol(SymbolManager.java:1100)
at ghidra.program.database.function.FunctionManagerDB.getFunctionAt(FunctionManagerDB.java:631)
at ghidra.program.disassemble.Disassembler.isNoReturnCall(Disassembler.java:1332)
at ghidra.program.disassemble.Disassembler.processInstructionFlows(Disassembler.java:1195)
at ghidra.program.disassemble.Disassembler.processInstruction(Disassembler.java:1156)
at ghidra.program.disassemble.Disassembler.disassembleInstructionBlock(Disassembler.java:1028)
at ghidra.program.disassemble.Disassembler.disassembleNextInstructionSet(Disassembler.java:662)
at ghidra.program.disassemble.Disassembler.disassemble(Disassembler.java:602)
at ghidra.program.disassemble.Disassembler.disassemble(Disassembler.java:473)
at ghidra.app.cmd.disassemble.DisassembleCommand.doDisassembly(DisassembleCommand.java:223)
at ghidra.app.cmd.disassemble.DisassembleCommand.applyTo(DisassembleCommand.java:157)
at ghidra.framework.plugintool.mgr.BackgroundCommandTask.run(BackgroundCommandTask.java:101)
at ghidra.framework.plugintool.mgr.ToolTaskManager.run(ToolTaskManager.java:315)
at java.base/java.lang.Thread.run(Thread.java:830)
---------------------------------------------------
Build Date: 2019-Dec-18 1306 EST
Ghidra Version: 9.1.1
Java Home: /Library/Java/JavaVirtualMachines/jdk-13.0.2.jdk/Contents/Home
JVM Version: Oracle Corporation 13.0.2
OS: Mac OS X 10.14.6 x86_64
Workstation: 10.37.129.2
To reproduce:
0x800000 for base address
jmps instruction
Looks like it is using local variables MDL/MDH instead of global registers.
sample: FUN_cf2a40
I copied your files into a new subfolder
C:\Program Files (x86)\Ghidra\Ghidra\Processors\C166
and Ghidra complains that the SLA file is missing:
java.io.FileNotFoundException: C:\Program Files (x86)\Ghidra\Ghidra\Processors\C166\data\languages\c166.sla
All other processors have an SLA file.
What is wrong here?
Error: operand mem is undefined: for table "mem1631_w" constructor from c166.slaspec:342
Error: operand mem is undefined: for table "mem1631_b" constructor from c166.slaspec:343
WARN c166.slaspec:336: Unreferenced table: 'memDpp' (SleighCompile)
ERROR c166.slaspec:342: Problem in table: 'mem1631_w (SleighCompile)
ERROR c166.slaspec:343: Problem in table: 'mem1631_b (SleighCompile)
Now a simple push r3
is converted to this ugly expression:
*(undefined2 *)
((uint3)((ushort)(((ushort)auStack2 & 0xc000) == 0) * 2 |
(ushort)(((ushort)auStack2 & 0xc000) == 1) * 0x380 |
(ushort)(((ushort)auStack2 & 0xc000) == 2) * 0x381 |
(ushort)(((ushort)auStack2 & 0xc000) == 3) * 3) << 0xe |
(uint3)((ushort)auStack2 & 0x3fff)) = param_2;
Perhaps a separate mem table should be designed specifically for pop/push.
Here's an example:
//
// ram
// fileOffset=0, length=30
// ram: 000000-00001d
//
000000 e6 f4 00 01 mov r4,0x100
LAB_000004 XREF[1]: 00000a(j)
000004 d7 00 09 00 exts #0x9,#0x1
000008 98 84 mov r8,[r4+]
00000a 2d fc jmpr cc_EQ,LAB_000004
00000c 26 f4 02 01 sub r4,#0x102
000010 5c 44 shl r4,#0x4
000012 2b 58 prior r5,r8
000014 5c 15 shl r5,#0x1
000016 00 45 add r4,r5
000018 06 f4 00 10 add r4,#0x1000
00001c 9c 04 jmpi cc_UC,[r4]
The lines
exts #0x9,#0x1
mov r8,[r4+]
should decompile to something like:
short foo, bar; // r8, r4
foo = *(short* far)(0x90000 + bar);
bar += 2;
Instead, they decompile to:
uVar2 = uVar2 + 2;
uVar2 = *(ushort *)
((uint3)((ushort)((uVar2 & 0xc000) == 0) * unaff_DPP0 |
(ushort)((uVar2 & 0xc000) == 1) * unaff_DPP1 |
(ushort)((uVar2 & 0xc000) == 2) * unaff_DPP2 |
(ushort)((uVar2 & 0xc000) == 3) * unaff_DPP3) << 0xe | (uint3)(uVar2 & 0x3fff))
;
First of all, it's incrementing the variable representing r4 before accessing it, but the instruction set manual says [r4+] means to access the address in r4 and then increment it.
Second, it appears to be ignoring the exts line entirely, instead interpreting r4 as a 16-bit address. exts s, n means that the next n instructions should use 32-bit addressing, using s as the upper word for all addresses. Instead, it's using the DPP registers, which define the 16-bit address space.
Less importantly, even if the decompilation were correct, it seems far too verbose. Code using a lot of 16-bit addressing (which is extremely common) becomes very hard to read due to all the repetitive lines of code. Wouldn't it make more sense to use near/far pointer syntax, and just treat all the DPP stuff as implied whenever a near pointer is dereferenced? Or if Ghidra doesn't support that, you could probably define pointers as 24-bit (or 32-bit if it must be a power of two) and define an intrinsic for converting 16-bit addresses, like *__dpp(address) = value;
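To make the two addressing modes and the post-increment order concrete, here is a rough Python model. It is a sketch under stated assumptions, not the module's implementation: the function names (`dpp_address`, `exts_address`, `mov_postinc_word`) are hypothetical, memory is a plain dict keyed by physical address, DPP registers are taken as 10-bit page numbers selecting 16 KB pages, and the exts operand is taken as address bits 23..16, as in the `0x90000 + bar` example above.

```python
PAGE_BITS = 14  # one DPP page is 16 KB


def dpp_address(addr16, dpp):
    """Translate a 16-bit address through the four data page pointers."""
    selector = (addr16 >> PAGE_BITS) & 0x3      # bits 15..14 pick DPP0..DPP3
    page = dpp[selector] & 0x3FF                # 10-bit page number
    return (page << PAGE_BITS) | (addr16 & 0x3FFF)


def exts_address(addr16, segment):
    """Address while an EXTS override is active: segment supplies bits 23..16."""
    return ((segment & 0xFF) << 16) | (addr16 & 0xFFFF)


def mov_postinc_word(regs, dst, src, memory, translate):
    """mov Rdst,[Rsrc+]: read through the old pointer first, then add 2."""
    regs[dst] = memory[translate(regs[src])]
    regs[src] = (regs[src] + 2) & 0xFFFF


if __name__ == "__main__":
    regs = {4: 0x0100, 8: 0}
    memory = {0x90100: 0x1234}
    # exts #0x9,#0x1 followed by mov r8,[r4+]
    mov_postinc_word(regs, 8, 4, memory, lambda a: exts_address(a, 0x9))
    print(hex(regs[8]), hex(regs[4]))
```

Something like `dpp_address` is roughly what a `__dpp(address)` intrinsic would hide from the decompiler output: the selector is shifted down before it is compared, and the whole translation collapses into one call.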
Since the last fix (c64dd2a), Ghidra crashes with a null pointer exception.
I disassembled the same binary file with the same settings as last week.
I had never seen that crash before.
Now it always happens.
How to reproduce:
I create a new project.
Select C166
Set base address = C00000
Select an area with the mouse which I want to disassemble
right click the selected area
click "Disassemble" in the menu
The first bytes are disassembled fine until Ghidra finds invalid bytes in a section which does not contain valid code.
Before your last fix, Ghidra simply skipped those invalid bytes and continued later on where it found valid code.
Now it crashes here.
It seems to have a problem with the byte CC:
c081bc cc ?? CCh
c081bd 76 ?? 76h v
c081be c0 ?? C0h
c081bf 00 ?? 00h
c081c0 3e ?? 3Eh >
c081c1 77 ?? 77h w
c081c2 c0 ?? C0h
c081c3 00 ?? 00h
c081c4 42 ?? 42h B
c081c5 77 ?? 77h w
c081c6 c0 ?? C0h
c081c7 00 ?? 00h
c081c8 aa ?? AAh
c081c9 77 ?? 77h w
c081ca c0 ?? C0h
c081cb 00 ?? 00h
c081cc 74 ?? 74h t
c081cd 77 ?? 77h w
c081ce c0 ?? C0h
c081cf 00 ?? 00h
c081d0 70 ?? 70h p
c081d1 77 ?? 77h w
c081d2 c0 ?? C0h
c081d3 00 ?? 00h
c081d4 bc ?? BCh
c081d5 77 ?? 77h w
c081d6 c0 ?? C0h
c081d7 00 ?? 00h
c081d8 c0 ?? C0h
c081d9 77 ?? 77h w
c081da c0 ?? C0h
c081db 00 ?? 00h
c081dc c6 ?? C6h
c081dd 77 ?? 77h w
c081de c0 ?? C0h
c081df 00 ?? 00h
c081e0 8e ?? 8Eh
c081e1 77 ?? 77h w
c081e2 c0 ?? C0h
c081e3 00 ?? 00h
c081e4 8a ?? 8Ah
If you cannot reproduce the problem I can send you the binary file.
I have downloaded this repository and added it to Ghidra to analyse this ECU's firmware, but the firmware is not being analysed. Am I doing something wrong? Can you please point me in the right direction? Thank you.
@esaulenka I hadn't seen that you wrote this until I came across it on /r/reverseengineering.
I don't have any binaries to test this, but it looks like your approach might be doing weird things with memory writes. For example, in add it would maybe read the value in mem1631_w, use it in the macro, and then try to write to that value instead of writing to export *:2 mem1631. I'm curious whether this sort of change would improve things.
# Current constructor:
:add mem1631_w, r0815 is op0007=0x04 & r0815 ; mem1631_w {
    add_w (mem1631_w, r0815);
}
# Suggested change: read into a temporary, then write back explicitly.
:add mem1631_w, r0815 is op0007=0x04 & r0815 ; mem1631_w {
    local tmp = mem1631_w;
    add_w (tmp, r0815);
    mem1631_w = tmp;
}
Using this pdf as a reference.
For most PSW update functions, the data sheet states the following:
E: Set if the value of op2 represents the lowest possible negative
number. Cleared otherwise. Used to signal the end of a table.
In the sleigh code I see:
macro setE_b(x) { $(PSW_E) = (x == 0xFF); }
macro setE_w(x) { $(PSW_E) = (x == 0xFFFF); }
Shouldn't these values be 0x80 and 0x8000 respectively? The lowest negative number should have just the sign bit set to 1 and everything else set to 0. 0xFF and 0xFFFF are just -1.
(Accidentally found this discrepancy trying to write an emulator for the C167CS CPU)
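The arithmetic behind that question can be checked with a few lines of Python. This is only an illustration of two's-complement values and of the E-flag rule as quoted from the data sheet above. The helper names `to_signed` and `e_flag` are made up for this sketch and do not exist in the SLEIGH code.

```python
def to_signed(value, bits):
    """Interpret an unsigned bit pattern as a two's-complement integer."""
    sign = 1 << (bits - 1)
    return (value & (sign - 1)) - (value & sign)


def e_flag(op2, bits):
    """E flag per the quoted rule: set when op2 is the lowest negative number."""
    return op2 == 1 << (bits - 1)


if __name__ == "__main__":
    for value, bits in ((0xFF, 8), (0x80, 8), (0xFFFF, 16), (0x8000, 16)):
        print(f"0x{value:X} ({bits}-bit) = {to_signed(value, bits)}, E = {e_flag(value, bits)}")
```

Only 0x80 and 0x8000 decode to the most negative values (-128 and -32768). 0xFF and 0xFFFF both decode to -1.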
Now an unconditional branch is decompiled slightly uglily:
c03578 0d 02 jmpr cc_UC,LAB_c0357e
is transformed into if (true) ....
@esaulenka After a lot of messing around, I found a way to emulate the banked registers. It still needs a lot of work to improve the decompilation output, but it can now accurately (although messily) decompile the init function for an ECU that uses a C167 (Daimler EGS52).
macro load_wgpr() {
local addr:2 = CP; # Take value of ContextPointer, treat it as an address
*[register]:2 (addr+0x00) = r0; # Load register values into the memory addresses needed
*[register]:2 (addr+0x02) = r1;
*[register]:2 (addr+0x04) = r2;
*[register]:2 (addr+0x06) = r3;
*[register]:2 (addr+0x08) = r4;
*[register]:2 (addr+0x0A) = r5;
*[register]:2 (addr+0x0C) = r6;
*[register]:2 (addr+0x0E) = r7;
*[register]:2 (addr+0x10) = r8;
*[register]:2 (addr+0x12) = r9;
*[register]:2 (addr+0x14) = r10;
*[register]:2 (addr+0x16) = r11;
*[register]:2 (addr+0x18) = r12;
*[register]:2 (addr+0x1A) = r13;
*[register]:2 (addr+0x1C) = r14;
*[register]:2 (addr+0x1E) = r15;
}
macro save_wgpr() {
local addr:2 = CP; # Take value of ContextPointer, treat it as an address
r0 = *[register]:2 (addr+0x00); # Save value from memory into register
r1 = *[register]:2 (addr+0x02);
r2 = *[register]:2 (addr+0x04);
r3 = *[register]:2 (addr+0x06);
r4 = *[register]:2 (addr+0x08);
r5 = *[register]:2 (addr+0x0A);
r6 = *[register]:2 (addr+0x0C);
r7 = *[register]:2 (addr+0x0E);
r8 = *[register]:2 (addr+0x10);
r9 = *[register]:2 (addr+0x12);
r10 = *[register]:2 (addr+0x14);
r11 = *[register]:2 (addr+0x16);
r12 = *[register]:2 (addr+0x18);
r13 = *[register]:2 (addr+0x1A);
r14 = *[register]:2 (addr+0x1C);
r15 = *[register]:2 (addr+0x1E);
}
usage on a function that requires Rw access:
# Rw n , #data3 08 n:0###
# Rw n , [Rw i +] 08 n:11ii
# Rw n , [Rw i ] 08 n:10ii
# Rw n , Rw m 00 nm
:add Rwn1215, op2_w is op0407=0x0 & Rwn1215 & op2_w & ExtDec {
load_wgpr();
add_w (Rwn1215, op2_w);
save_wgpr();
}
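For readers unfamiliar with the banking scheme these macros emulate, here is a small Python model of it. It is a sketch of the architecture's behaviour, not of the SLEIGH code: word register Rn is assumed to live at address CP + 2*n in internal RAM, stored little-endian, and the class name `BankedRegs` and the CP values used are hypothetical.

```python
class BankedRegs:
    """Word GPRs R0..R15 modelled as a window into internal RAM starting at CP."""

    def __init__(self, ram, cp):
        self.ram = ram   # bytearray standing in for internal RAM
        self.cp = cp     # context pointer: base address of the current bank

    def _addr(self, n):
        return self.cp + 2 * n   # each word register occupies two bytes

    def read(self, n):
        a = self._addr(n)
        return self.ram[a] | (self.ram[a + 1] << 8)

    def write(self, n, value):
        a = self._addr(n)
        self.ram[a] = value & 0xFF
        self.ram[a + 1] = (value >> 8) & 0xFF


if __name__ == "__main__":
    regs = BankedRegs(bytearray(0x10000), cp=0xFC00)
    regs.write(3, 0x1234)
    regs.cp = 0xFC20            # switching CP selects a different bank
    print(hex(regs.read(3)))    # r3 of the new bank, untouched so far
    regs.cp = 0xFC00
    print(hex(regs.read(3)))    # back in the original bank
```

Computing the address on each access avoids copying all sixteen registers around every instruction, which is what the load/save macro pair does. Whether that can be expressed cleanly in SLEIGH is a separate question.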
output:
They always return a constant value.
sample: FUN_cef674, LAB_cf22fe
These two disassemblies
f2 fe ae f7 mov r14, 0x37ae
and
e6 fc 00 80 mov r12, 0x8000
seem to load a constant value into a register in the Ghidra disassembly.
But they do completely different things.
The Infineon "Instruction Set Manual for the C166 Family" says on page 30
F2 MOV reg, mem
E6 MOV reg, #data16
This must urgently be fixed into the correct display:
f2 fe ae f7 mov r14, [0x37ae]
and
e6 fc 00 80 mov r12, #0x8000
otherwise the disassembly is unreadable.
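The requested notation can be illustrated with a small formatter for just these two encodings. This is a hedged sketch: `format_mov` is a hypothetical helper, it handles only word GPR destinations (reg byte 0xF0..0xFF), and it prints the raw little-endian operand word. For the first example that word is 0xF7AE, whose low 14 bits are the 0x37ae shown in the listing above.

```python
def format_mov(code):
    """Format the two 4-byte MOV encodings discussed above (word GPR destinations only)."""
    opcode, reg, lo, hi = code
    if not 0xF0 <= reg <= 0xFF:
        raise ValueError("sketch only handles GPR destinations (reg byte 0xF0..0xFF)")
    word = lo | (hi << 8)            # operand word is stored little-endian
    name = f"r{reg & 0x0F}"
    if opcode == 0xF2:               # F2: MOV reg, mem  -> memory operand in brackets
        return f"mov {name}, [0x{word:04x}]"
    if opcode == 0xE6:               # E6: MOV reg, #data16 -> immediate with '#'
        return f"mov {name}, #0x{word:04x}"
    raise ValueError(f"unsupported opcode 0x{opcode:02x}")


if __name__ == "__main__":
    print(format_mov(bytes.fromhex("f2feaef7")))
    print(format_mov(bytes.fromhex("e6fc0080")))
```

The brackets and the `#` prefix are the only difference in the output, but they are what lets a reader tell a memory load from an immediate load at a glance.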