esaulenka / ghidra_c166

Ghidra SLEIGH module for Bosch C166 MCU

License: Apache License 2.0

ghidra_c166's People

Forkers

mumbel fanyi3315 mkchmiel wangyanjunmsn frmdstryr wrongbaud rnd-ash x-tfrk

ghidra_c166's Issues

Severe instruction decoding bug

There is something very wrong in the slaspec file.

The following code is decoded correctly by IDA Pro:

F7 F0 B0 FE    movb    S0TBUF, rL0
and
F3 F0 B2 FE    movb    rL0, S0RBUF

The special function register FEB0 is ASC0_TBUF or S0TBUF (depending on which Infineon manual you use)
The special function register FEB2 is ASC0_RBUF or S0RBUF (depending on which Infineon manual you use)

But what Ghidra decodes is completely wrong:

f7 f0 b0 fe     movb       0x3eb0, RL0
and
f3 f0 b2 fe     movb       RL0, 0x3eb2

FEB0 is wrongly converted to 3EB0
FEB2 is wrongly converted to 3EB2
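
The two wrong values are consistent with the top two bits of the 16-bit operand being stripped before display. The sketch below only illustrates that arithmetic; the helper name `split_mem16` is hypothetical, and whether the slaspec really applies this mask at that point is an assumption, not something confirmed by the source.

```python
# Assumption: bits 15..14 of a 16-bit data address select a DPP register
# and bits 13..0 are the offset inside the selected 16 KB page.
DPP_OFFSET_MASK = 0x3FFF

def split_mem16(addr):
    """Return (dpp_selector, page_offset) for a 16-bit operand."""
    return (addr >> 14) & 0x3, addr & DPP_OFFSET_MASK

for raw in (0xFEB0, 0xFEB2):
    selector, offset = split_mem16(raw)
    # Showing only the offset reproduces the values seen in the Ghidra listing.
    print(f"{raw:#06x}: selector={selector} offset={offset:#06x}")
```

If this is what happens, the listing loses the selector bits, so several different operands collapse into the same displayed address.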

Also wrong:

All the following:
f7 f0 04 81     movb       0x0104,RL0
f7 f0 04 91     movb       0x4104,RL0
f7 f0 04 a1     movb       0x8104,RL0
f7 f0 04 c1     movb       0xC104,RL0
are displayed as if they were the same instruction:
f7 f0 04 x1     movb       0x104,RL0

Register definitions are highly incomplete and some are wrong

In the file c166.slaspec you define the registers.
But most are missing.

I extracted the information from 3 Infineon manuals.
Sadly there are some contradictions.

NullPointerException while disassembling

An unexpected error occurred while processing the command: Disassemble
java.lang.NullPointerException
	at ghidra.program.database.symbol.SymbolManager.getPrimarySymbol(SymbolManager.java:1100)
	at ghidra.program.database.function.FunctionManagerDB.getFunctionAt(FunctionManagerDB.java:631)
	at ghidra.program.disassemble.Disassembler.isNoReturnCall(Disassembler.java:1332)
	at ghidra.program.disassemble.Disassembler.processInstructionFlows(Disassembler.java:1195)
	at ghidra.program.disassemble.Disassembler.processInstruction(Disassembler.java:1156)
	at ghidra.program.disassemble.Disassembler.disassembleInstructionBlock(Disassembler.java:1028)
	at ghidra.program.disassemble.Disassembler.disassembleNextInstructionSet(Disassembler.java:662)
	at ghidra.program.disassemble.Disassembler.disassemble(Disassembler.java:602)
	at ghidra.program.disassemble.Disassembler.disassemble(Disassembler.java:473)
	at ghidra.app.cmd.disassemble.DisassembleCommand.doDisassembly(DisassembleCommand.java:223)
	at ghidra.app.cmd.disassemble.DisassembleCommand.applyTo(DisassembleCommand.java:157)
	at ghidra.framework.plugintool.mgr.BackgroundCommandTask.run(BackgroundCommandTask.java:101)
	at ghidra.framework.plugintool.mgr.ToolTaskManager.run(ToolTaskManager.java:315)
	at java.base/java.lang.Thread.run(Thread.java:830)

---------------------------------------------------
Build Date: 2019-Dec-18 1306 EST
Ghidra Version: 9.1.1
Java Home: /Library/Java/JavaVirtualMachines/jdk-13.0.2.jdk/Contents/Home
JVM Version: Oracle Corporation 13.0.2
OS: Mac OS X 10.14.6 x86_64
Workstation: 10.37.129.2

To reproduce:

Download and unpack this
Open it in Ghidra, specify 0x800000 for base address
Disassemble at offset 0, it will yield a jmps instruction
Follow the jump
Try disassembling at the jump location and fail with the above exception

MUL / MULU does not work

It looks like the instructions write to local variables named MDL/MDH instead of the global registers.

sample: FUN_cf2a40
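
As a rough model of the expected behaviour: a 16x16 multiply leaves its 32-bit product in the MDH:MDL register pair, which later instructions read. The `Cpu` class and function names below are hypothetical and only sketch the semantics; this is not the SLEIGH fix itself.

```python
class Cpu:
    """Tiny model: MDH/MDL are CPU state shared by all instructions."""
    def __init__(self):
        self.mdh = 0
        self.mdl = 0

def _s16(x):
    """Interpret the low 16 bits of x as a signed value."""
    x &= 0xFFFF
    return x - 0x10000 if x & 0x8000 else x

def mul(cpu, op1, op2):
    """Signed multiply: high word of the product to MDH, low word to MDL."""
    prod = (_s16(op1) * _s16(op2)) & 0xFFFFFFFF
    cpu.mdh, cpu.mdl = prod >> 16, prod & 0xFFFF

def mulu(cpu, op1, op2):
    """Unsigned multiply with the same register layout."""
    prod = (op1 & 0xFFFF) * (op2 & 0xFFFF)
    cpu.mdh, cpu.mdl = prod >> 16, prod & 0xFFFF
```

Because the result lives in `cpu`, code that reads MDL or MDH after the multiply sees the product, which is what a local temporary cannot provide.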

SLA file is missing

I load your files into a new subfolder
C:\Program Files (x86)\Ghidra\Ghidra\Processors\C166
and Ghidra complains that the SLA file is missing.

java.io.FileNotFoundException: C:\Program Files (x86)\Ghidra\Ghidra\Processors\C166\data\languages\c166.sla

All other processors have a SLA file.
What is wrong here ?

SLASPEC cannot compile (On commit d494972)

Error: operand mem is undefined: for table "mem1631_w" constructor from c166.slaspec:342
Error: operand mem is undefined: for table "mem1631_b" constructor from c166.slaspec:343
WARN  c166.slaspec:336: Unreferenced table: 'memDpp' (SleighCompile)  
ERROR c166.slaspec:342: Problem in table: 'mem1631_w (SleighCompile)  
ERROR c166.slaspec:343: Problem in table: 'mem1631_b (SleighCompile)

Bit manipulation instructions reference DAT* rather than registers

bfldh and bclr reference DAT values directly, rather than treating them as registers:

Function decompiled in Ghidra

And the same in IDA:

Though, the address of the DAT values do match the addresses in registers as shown in IDA (C167CS)

Add restrictions to the stack pointer value

Currently a simple push r3 is converted into this ugly expression:

  *(undefined2 *)
   ((uint3)((ushort)(((ushort)auStack2 & 0xc000) == 0) * 2 |
            (ushort)(((ushort)auStack2 & 0xc000) == 1) * 0x380 |
            (ushort)(((ushort)auStack2 & 0xc000) == 2) * 0x381 |
           (ushort)(((ushort)auStack2 & 0xc000) == 3) * 3) << 0xe |
   (uint3)((ushort)auStack2 & 0x3fff)) = param_2;

Perhaps a separate mem table should be designed specifically for pop/push.

Simple instruction decompiles to incorrect and overly verbose code

Here's an example:

                             //
                             // ram 
                             // fileOffset=0, length=30
                             // ram: 000000-00001d
                             //
          000000 e6 f4 00 01     mov        r4,0x100
                             LAB_000004                                      XREF[1]:     00000a(j)  
          000004 d7 00 09 00     exts       #0x9,#0x1
          000008 98 84           mov        r8,[r4+]
          00000a 2d fc           jmpr       cc_EQ,LAB_000004
          00000c 26 f4 02 01     sub        r4,#0x102
          000010 5c 44           shl        r4,#0x4
          000012 2b 58           prior      r5,r8
          000014 5c 15           shl        r5,#0x1
          000016 00 45           add        r4,r5
          000018 06 f4 00 10     add        r4,#0x1000
          00001c 9c 04           jmpi       cc_UC,[r4]

The lines

exts #0x9,#0x1
mov r8,[r4+]

should decompile to something like:

short foo, bar;  // r8, r4
foo = *(short* far)(0x90000 + bar);
bar += 2;

Instead, they decompile to:

uVar2 = uVar2 + 2;
uVar2 = *(ushort *)
         ((uint3)((ushort)((uVar2 & 0xc000) == 0) * unaff_DPP0 |
                  (ushort)((uVar2 & 0xc000) == 1) * unaff_DPP1 |
                  (ushort)((uVar2 & 0xc000) == 2) * unaff_DPP2 |
                 (ushort)((uVar2 & 0xc000) == 3) * unaff_DPP3) << 0xe | (uint3)(uVar2 & 0x3fff))
;

First of all, it's incrementing the variable representing r4 before accessing it, but the instruction set manual says [r4+] means to access the address in r4 and then increment it.

Second, it appears to be ignoring the exts line entirely, instead interpreting r4 as a 16-bit address. exts s, n means that the next n instructions should use 32-bit addressing, using s as the upper word for all addresses. Instead, it's using the DPP registers, which define the 16-bit address space.
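
The two points above can be sketched as follows. This is a minimal model assuming little-endian word access and a dictionary standing in for memory; the function name and argument layout are hypothetical.

```python
def mov_postinc_exts(mem, regs, dst, src, seg):
    """Model 'mov Rdst,[Rsrc+]' executed under an 'exts #seg,#n' override."""
    # The segment number supplies the upper bits; DPP translation is bypassed.
    addr = (seg << 16) | regs[src]
    # Little-endian 16-bit read from the extended address.
    regs[dst] = mem.get(addr, 0) | (mem.get(addr + 1, 0) << 8)
    # Post-increment: the pointer is updated only after the access.
    regs[src] = (regs[src] + 2) & 0xFFFF

mem = {0x90100: 0x34, 0x90101: 0x12}
regs = {4: 0x0100, 8: 0}
mov_postinc_exts(mem, regs, dst=8, src=4, seg=0x9)
print(hex(regs[8]), hex(regs[4]))
```

With r4 = 0x100 as in the listing, the word is read from 0x90100 and r4 becomes 0x102 afterwards.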

Less importantly, even if the decompilation were correct, it seems far too verbose. Code using a lot of 16-bit addressing (which is extremely common) becomes very hard to read due to all the repetitive lines of code. Wouldn't it make more sense to use near/far pointer syntax, and just treat all the DPP stuff as implied whenever a near pointer is dereferenced? Or if Ghidra doesn't support that, you could probably define pointers as 24-bit (or 32-bit if it must be a power of two) and define an intrinsic for converting 16-bit addresses, like *__dpp(address) = value;
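
To make the suggested intrinsic concrete, here is a sketch of what a helper like `__dpp(address)` would compute, based on the shift and mask visible in the decompiler output above. The function name `dpp_translate` and the 10-bit DPP width are assumptions for illustration.

```python
def dpp_translate(addr16, dpp):
    """Map a 16-bit near address to a physical address via DPP0..DPP3."""
    selector = (addr16 >> 14) & 0x3          # which DPP register applies
    page = dpp[selector] & 0x3FF             # assumed 10-bit page number
    return (page << 14) | (addr16 & 0x3FFF)  # page base plus 14-bit offset

# With DPPn == n the mapping is the identity on the low 64 KB.
print(hex(dpp_translate(0xFEB0, [0, 1, 2, 3])))
# Pointing DPP0 at page 0x24 moves near address 0x0104 to 0x90104.
print(hex(dpp_translate(0x0104, [0x24, 1, 2, 3])))
```

A decompiler that folded this into a single call per access would show one short expression instead of the four-way selector chain.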

Crashes since the last fix "Ghidra loses '#' symbol in display section "

Since the last fix
c64dd2a
Ghidra crashes with a null pointer exception.

I disassembled the same binary file with the same settings as last week.
I had never seen that crash before.
Now it always happens.

How to reproduce:
I create a new project.
Select C166
Set base address = C00000
Select an area with the mouse which I want to disassemble
right click the selected area
click "Disassemble" in the menu

The first bytes are disassembled fine until Ghidra finds invalid bytes in a section which does not contain valid code.

Before your last fix Ghidra simply skipped those invalid bytes and continued later on where it found valid code.
Now it crashes here:
It seems to have a problem with the byte CC

      c081bc cc              ??         CCh
      c081bd 76              ??         76h    v
      c081be c0              ??         C0h
      c081bf 00              ??         00h
      c081c0 3e              ??         3Eh    >
      c081c1 77              ??         77h    w
      c081c2 c0              ??         C0h
      c081c3 00              ??         00h
      c081c4 42              ??         42h    B
      c081c5 77              ??         77h    w
      c081c6 c0              ??         C0h
      c081c7 00              ??         00h
      c081c8 aa              ??         AAh
      c081c9 77              ??         77h    w
      c081ca c0              ??         C0h
      c081cb 00              ??         00h
      c081cc 74              ??         74h    t
      c081cd 77              ??         77h    w
      c081ce c0              ??         C0h
      c081cf 00              ??         00h
      c081d0 70              ??         70h    p
      c081d1 77              ??         77h    w
      c081d2 c0              ??         C0h
      c081d3 00              ??         00h
      c081d4 bc              ??         BCh
      c081d5 77              ??         77h    w
      c081d6 c0              ??         C0h
      c081d7 00              ??         00h
      c081d8 c0              ??         C0h
      c081d9 77              ??         77h    w
      c081da c0              ??         C0h
      c081db 00              ??         00h
      c081dc c6              ??         C6h
      c081dd 77              ??         77h    w
      c081de c0              ??         C0h
      c081df 00              ??         00h
      c081e0 8e              ??         8Eh
      c081e1 77              ??         77h    w
      c081e2 c0              ??         C0h
      c081e3 00              ??         00h
      c081e4 8a              ??         8Ah

If you cannot reproduce the problem I can send you the binary file.

Trying to decompile Visteon DCU 101 (ST10F276 )

I have downloaded this repository and added it to Ghidra to analyse this ECU's firmware, but the firmware is not being analysed. Am I doing something wrong? Can you please point me in the right direction? Thank you.

memory writes in macros

@esaulenka Hadn't ever seen that you wrote this, but came across it on /r/reverseengineering.

I don't have any binaries to test this, but it looks like your approach might be doing weird things with memory writes. For example in add, it would maybe read the value in mem1631_w and use it in the macro and then try to write to that value instead of writing to export *:2 mem1631. Curious if this sort of change would improve things.

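# Current constructor: the macro receives the memory operand directly.
:add mem1631_w, r0815 is op0007=0x04 & r0815 ; mem1631_w {
	add_w (mem1631_w, r0815);
}

# Suggested change: read into a temporary, operate on it, then write back.
:add mem1631_w, r0815 is op0007=0x04 & r0815 ; mem1631_w {
	local tmp = mem1631_w;
	add_w (tmp, r0815);
	mem1631_w = tmp;
}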

PSW flags might be wrong

Using this pdf as a reference.

For most PSW update functions, the data sheet states the following:

E: Set if the value of op2 represents the lowest possible negative
number. Cleared otherwise. Used to signal the end of a table.

In the sleigh code I see:

macro setE_b(x)		{ $(PSW_E) = (x == 0xFF); }
macro setE_w(x)		{ $(PSW_E) = (x == 0xFFFF); }

shouldn't these values be 0x80 or 0x8000 respectively? Lowest negative number should be just the sign set to 1, and everything else set to 0. 0xFF and 0xFFFF is just -1
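
For reference, a small check of the two's complement arithmetic behind this question. The helper `e_flag` is hypothetical and only models the data sheet wording quoted above, not the actual SLEIGH macros.

```python
def e_flag(value, bits):
    """Return 1 if value is the lowest negative number for the given width."""
    mask = (1 << bits) - 1
    lowest_negative = 1 << (bits - 1)   # sign bit set, all other bits clear
    return int((value & mask) == lowest_negative)

# Byte and word cases: 0x80 / 0x8000 set E, while 0xFF / 0xFFFF (that is, -1) do not.
for value, bits in ((0x80, 8), (0xFF, 8), (0x8000, 16), (0xFFFF, 16)):
    print(f"{value:#06x} ({bits} bit): E={e_flag(value, bits)}")
```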

(Accidentally found this discrepancy trying to write an emulator for the C167CS CPU)

jmpXX with cc_UC is a special case

Currently an unconditional branch is disassembled slightly ugly:
c03578 0d 02 jmpr cc_UC,LAB_c0357e
is transformed into if (true) ....
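
A sketch of the special case being asked for. The encoding used here (condition code in the high nibble of the first byte, signed word offset in the second byte) is inferred from the two jmpr listings on this page and should be treated as an assumption; the function names are hypothetical.

```python
CC_UC = 0x0  # assumed encoding of the "unconditional" condition code

def decode_jmpr(addr, b0, b1):
    """Return (condition_code, target) for a 2-byte jmpr at addr."""
    cc = b0 >> 4
    rel = b1 - 0x100 if b1 & 0x80 else b1   # sign-extend the 8-bit offset
    target = addr + 2 + 2 * rel             # relative to the next instruction, in words
    return cc, target

def lift_jmpr(addr, b0, b1):
    """Emit a plain goto for cc_UC instead of a branch on a constant condition."""
    cc, target = decode_jmpr(addr, b0, b1)
    if cc == CC_UC:
        return f"goto {target:#x}"
    return f"if (cc{cc:#x}) goto {target:#x}"

print(lift_jmpr(0xC03578, 0x0D, 0x02))
```

Handling cc_UC in its own branch avoids generating a conditional that the decompiler then has to render as if (true).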

Found a way to emulate Banked registers and improve decomp. output - WIP

@esaulenka After a lot of messing around, I found a way to emulate the banked registers. It still needs a lot of work to improve the decompilation output, but it can now accurately (although messily) decompile the init function for an ECU that uses a C167 (Daimler EGS52).

macro load_wgpr() {
	local addr:2 = CP; # Take value of ContextPointer, treat it as an address
	*[register]:2 (addr+0x00) = r0; # Load register values into the memory addresses needed
	*[register]:2 (addr+0x02) = r1;
	*[register]:2 (addr+0x04) = r2;
	*[register]:2 (addr+0x06) = r3;
	*[register]:2 (addr+0x08) = r4;
	*[register]:2 (addr+0x0A) = r5;
	*[register]:2 (addr+0x0C) = r6;
	*[register]:2 (addr+0x0E) = r7;
	*[register]:2 (addr+0x10) = r8;
	*[register]:2 (addr+0x12) = r9;
	*[register]:2 (addr+0x14) = r10;
	*[register]:2 (addr+0x16) = r11;
	*[register]:2 (addr+0x18) = r12;
	*[register]:2 (addr+0x1A) = r13;
	*[register]:2 (addr+0x1C) = r14;
	*[register]:2 (addr+0x1E) = r15;
}

macro save_wgpr() {
	local addr:2 = CP; # Take value of ContextPointer, treat it as an address
	r0  = *[register]:2 (addr+0x00); # Save value from memory into register
	r1  = *[register]:2 (addr+0x02);
	r2  = *[register]:2 (addr+0x04);
	r3  = *[register]:2 (addr+0x06);
	r4  = *[register]:2 (addr+0x08);
	r5  = *[register]:2 (addr+0x0A);
	r6  = *[register]:2 (addr+0x0C);
	r7  = *[register]:2 (addr+0x0E);
	r8  = *[register]:2 (addr+0x10);
	r9  = *[register]:2 (addr+0x12);
	r10 = *[register]:2 (addr+0x14);
	r11 = *[register]:2 (addr+0x16);
	r12 = *[register]:2 (addr+0x18);
	r13 = *[register]:2 (addr+0x1A);
	r14 = *[register]:2 (addr+0x1C);
	r15 = *[register]:2 (addr+0x1E);
}

usage on a function that requires Rw access:

# Rw n , #data3			08 n:0###
# Rw n , [Rw i +]		08 n:11ii
# Rw n , [Rw i ]		08 n:10ii
# Rw n , Rw m			00 nm
:add Rwn1215, op2_w is op0407=0x0 & Rwn1215 & op2_w & ExtDec {
	load_wgpr();
	add_w (Rwn1215, op2_w);
	save_wgpr();
}
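
The idea behind the macros can be modelled outside SLEIGH as well: word register Rn is simply the memory word at CP + 2*n, so changing CP switches the whole bank. The class below is a hypothetical sketch of that mapping, not a description of how Ghidra stores registers.

```python
class BankedRegs:
    """Word GPRs R0..R15 viewed as a 32-byte window at the context pointer."""
    def __init__(self, cp):
        self.cp = cp
        self.ram = {}                    # sparse model of internal RAM (word-addressed keys)

    def addr(self, n):
        """Address backing Rn: same offsets as the macros (0x00 .. 0x1E)."""
        return (self.cp + 2 * n) & 0xFFFF

    def read(self, n):
        return self.ram.get(self.addr(n), 0)

    def write(self, n, value):
        self.ram[self.addr(n)] = value & 0xFFFF

bank = BankedRegs(cp=0xFC00)
bank.write(3, 0x1234)
bank.cp = 0xFC20                         # switch to another register bank
print(hex(bank.read(3)))                 # r3 of the new bank, untouched so far
bank.cp = 0xFC00                         # switch back
print(hex(bank.read(3)))
```

The CP values used here are arbitrary example addresses.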

output:

Decompiler output now:

Decompiler output before:

ZEROS and ONES - very special SFRs

They always return a constant value.
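
A minimal sketch of the behaviour being requested, assuming ZEROS reads as 0x0000 and ONES reads as 0xFFFF and that writes to them have no effect. The `SfrFile` class is hypothetical and registers are keyed by name to avoid assuming any addresses.

```python
CONST_SFRS = {"ZEROS": 0x0000, "ONES": 0xFFFF}

class SfrFile:
    """Register file in which constant SFRs ignore writes."""
    def __init__(self):
        self.values = {}

    def write(self, name, value):
        if name in CONST_SFRS:
            return                       # assumed: writes are discarded
        self.values[name] = value & 0xFFFF

    def read(self, name):
        if name in CONST_SFRS:
            return CONST_SFRS[name]      # always the same constant
        return self.values.get(name, 0)

sfr = SfrFile()
sfr.write("ONES", 0x1234)
print(hex(sfr.read("ONES")), hex(sfr.read("ZEROS")))
```

Treating the reads as constants lets the decompiler fold expressions such as a compare against ZEROS into a compare against 0.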

sample: FUN_cef674, LAB_cf22fe

Severe bug: Wrong disassembly

These two disassemblies
f2 fe ae f7 mov r14, 0x37ae
and
e6 fc 00 80 mov r12, 0x8000
seem to load a constant value into a register in the Ghidra disassembly.

But they do completely different things.

The Infineon "Instruction Set Manual for the C166 Family" says on page 30
F2 MOV reg, mem
E6 MOV reg, #data16

This must urgently be fixed into the correct display:
f2 fe ae f7 mov r14, [0x37ae]
and
e6 fc 00 80 mov r12, #0x8000

otherwise the disassembly is unreadable.
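
The requested display can be sketched as a tiny formatter. It assumes that a reg byte of the form 0xFn names GPR Rn and, for the memory form, shows the operand masked to 14 bits exactly as the listings in this issue do; the function name `render_mov` is hypothetical.

```python
def render_mov(code):
    """Format the two 4-byte MOV encodings discussed in this issue."""
    op, reg, lo, hi = code
    value = lo | (hi << 8)               # 16-bit operand, little-endian
    rn = reg & 0x0F                      # assumption: reg byte 0xFn is Rn
    if op == 0xF2:                       # MOV reg, mem -> bracketed address
        return f"mov r{rn}, [{value & 0x3FFF:#06x}]"
    if op == 0xE6:                       # MOV reg, #data16 -> '#' immediate
        return f"mov r{rn}, #{value:#06x}"
    raise ValueError(f"unsupported opcode {op:#04x}")

print(render_mov(bytes.fromhex("f2feaef7")))
print(render_mov(bytes.fromhex("e6fc0080")))
```

The brackets and the '#' prefix are the only things that distinguish a memory load from an immediate load in the text, which is why dropping them makes the listing ambiguous.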