hsdecomp

A decompiler for GHC-compiled Haskell

Dependencies

Trying it out

To decompile a file without installing anything, run the runner.py script on the binary you want to decompile:

python3 runner.py path/to/binary

Installation

hsdecomp uses setuptools for packaging and installation. To install:

python3 setup.py install

Known Limitations

Testing has been slim, so there are probably many other limitations not mentioned here.

  • No support for stripped binaries.
  • No support for direct manipulation of unboxed types. This generally shouldn't be a problem for unoptimized binaries, as all that manipulation should be hidden behind library calls.
  • No support for tail recursion (which gets compiled to a loop).
  • Limited ability to display useful patterns in case expressions. As a replacement for proper names, patterns of the form <tag n> are shown.
  • No support for FFI.
  • Limited to x86 and x86-64.
  • Limited to ELF files.
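
The last two limitations can be checked before running the decompiler. The following is a minimal sketch, not part of hsdecomp: the names describe_elf and describe_file are hypothetical, and the check relies only on the standard ELF header layout (magic bytes, EI_CLASS, EI_DATA, and e_machine). It does not detect stripped binaries.

```python
import struct

ELF_MAGIC = b"\x7fELF"
# e_machine values defined by the ELF specification.
EM_386 = 3
EM_X86_64 = 62


def describe_elf(header: bytes) -> dict:
    """Inspect the first 20 bytes of a file and report ELF class and machine.

    Hypothetical helper for illustration; hsdecomp's own metadata code
    may work differently.
    """
    if len(header) < 20 or header[:4] != ELF_MAGIC:
        return {"elf": False, "bits": None, "machine": None, "supported": False}
    bits = {1: 32, 2: 64}.get(header[4])       # EI_CLASS
    endian = {1: "<", 2: ">"}.get(header[5])   # EI_DATA
    if bits is None or endian is None:
        return {"elf": True, "bits": bits, "machine": None, "supported": False}
    # e_type and e_machine (2 bytes each) follow the 16-byte e_ident array.
    _e_type, machine = struct.unpack_from(endian + "HH", header, 16)
    supported = machine in (EM_386, EM_X86_64)
    return {"elf": True, "bits": bits, "machine": machine, "supported": supported}


def describe_file(path: str) -> dict:
    """Read just enough of the file at `path` to classify it."""
    with open(path, "rb") as f:
        return describe_elf(f.read(20))
```

Calling describe_file("path/to/binary") returns a dictionary whose "supported" entry is True only for x86 and x86-64 ELF files.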

How It Works

The decompiler is composed of several distinct stages:

  • Metadata Parsing. In this stage, we read basic metadata from the file, including the names of all symbols in the program, the version of GHC the program was compiled with, and whether the binary is 32 bit or 64 bit. Code for this process can be found in hsdecomp/metadata.py.
  • Code Parsing. In this stage, we recursively locate and parse every relevant section of code into an internal representation, called an interpretation. This is the meat of the work done by the decompiler, and can be found primarily in hsdecomp/parse/__init__.py. Note that much of the analysis is done by means of simulation, for which the code can be found at hsdecomp/machine.py.
  • Type Inference. Although much of the interpretation of the binary can be found directly, the patterns which case expressions are branching on are initially opaque to the decompiler. Type inference allows displaying more precise patterns. Note that this stage is currently extremely primitive.
  • Optimization. At this stage in the pipeline, the decompiler has a fairly clear understanding of what is going on. However, the information is laid out as it is in the binary, with many small, uninlined expressions. To increase readability, the decompiler will perform various passes over the interpretations to clean them up and make them easier for a human to understand. The code for this is at hsdecomp/optimize.py.
  • Display. Finally, the decompiled code must be displayed to the user. This currently uses a fairly hacky pretty printer implemented at hsdecomp/show.py.
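
To illustrate the kind of cleanup the Optimization stage describes, here is a minimal sketch of one possible pass: inlining bindings that are referenced exactly once. The tuple-based expression format and the names count_uses, substitute, and inline_single_use are assumptions made for this example; they are not hsdecomp's actual data structures or the passes in hsdecomp/optimize.py.

```python
from collections import Counter

# Toy expression forms (hypothetical, not hsdecomp's real classes):
#   ("var", name) | ("lit", value) | ("app", func_expr, [arg_exprs])


def count_uses(expr, counts):
    """Add the number of references to each variable in expr to counts."""
    kind = expr[0]
    if kind == "var":
        counts[expr[1]] += 1
    elif kind == "app":
        count_uses(expr[1], counts)
        for arg in expr[2]:
            count_uses(arg, counts)


def substitute(expr, name, replacement):
    """Return expr with every reference to name replaced by replacement."""
    kind = expr[0]
    if kind == "var":
        return replacement if expr[1] == name else expr
    if kind == "app":
        return ("app", substitute(expr[1], name, replacement),
                [substitute(arg, name, replacement) for arg in expr[2]])
    return expr


def inline_single_use(bindings, root):
    """Inline every non-recursive binding used exactly once, keeping root."""
    bindings = dict(bindings)
    changed = True
    while changed:
        changed = False
        counts = Counter()
        for body in bindings.values():
            count_uses(body, counts)
        for name, body in list(bindings.items()):
            if name == root or counts[name] != 1:
                continue
            own = Counter()
            count_uses(body, own)
            if own[name]:  # the single use is inside itself: recursive, skip
                continue
            del bindings[name]
            bindings = {n: substitute(b, name, body) for n, b in bindings.items()}
            changed = True
            break  # counts are stale now, so recompute them
    return bindings


example = {
    "main": ("app", ("var", ">>"), [("var", "s1"), ("var", "s2")]),
    "s1": ("app", ("var", "putStrLn"), [("var", "s3")]),
    "s2": ("var", "getLine"),
    "s3": ("lit", "hi"),
}
inlined = inline_single_use(example, "main")
```

After the pass, inlined contains a single binding for main in which s1, s2, and s3 have been folded into one nested expression, which is closer to what a person would write by hand.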

Unfortunately, I haven't written a full description of any of these stages or even adequately commented my code. However, I wrote a description of manually decompiling a file for the sCTF security competition. The output of this decompiler on that file can be found at test/lambda1/output in this repository.

hsdecomp's People

Contributors

gereeter, junsooo

hsdecomp's Issues

fails on CSAW CTF'19 Quals > wizkid

https://ctf.csaw.io/challenges

No rust or go binaries this time ;)
nc rev.chal.csaw.io 1002

https://ctf.csaw.io/files/8f347a362c16c610a9594104c8d43540/wizkid
mirror https://www.ninefile.com/mv6826fi7dvc.html

$ python3 -u runner.py wizkid
Error in processing case at c4no_info
    Error:
    Error Location: 143
    Disassembly:
        mov	rax, qword ptr [rbp + 8]
        mov	rcx, qword ptr [rbp + 0x10]
        mov	rdx, rbx
        and	edx, 7
        cmp	rdx, 1
        jne	0x40a97a
        mov	ebx, 0x6ef418
        add	rbp, 0x18
        jmp	0x4b0980

Error in processing case at c4mg_info
    Error:
    Error Location: 143
    Disassembly:
        mov	rax, rbx
        and	eax, 7
        cmp	rax, 1
        jne	0x40a2b6
        jmp	0x40a258

Error in processing case at c4hg_info
    Error:
    Error Location: 143
    Disassembly:
        mov	rax, rbx
        and	eax, 7
        cmp	rax, 1
        jne	0x409dd0
        mov	ebx, 0x6f90a9
        add	rbp, 8
        jmp	qword ptr [rbp]

Error in processing case at c3GS_info
    Error:
    Error Location: 143
    Disassembly:
        mov	rax, rbx
        and	eax, 7
        cmp	rax, 1
        jne	0x405798
        mov	ebx, 0x6f01b1
        add	rbp, 8
        jmp	qword ptr [rbp]

Error in processing case at c3FH_info
    Error:
    Error Location: 143
    Disassembly:
        mov	rax, rbx
        and	eax, 7
        cmp	rax, 1
        jne	0x4059d8
        mov	ebx, 0x6f01b1
        add	rbp, 8
        jmp	qword ptr [rbp]

Error in processing case at c4du_info
    Error:
    Error Location: 143
    Disassembly:
        mov	rax, rbx
        and	eax, 7
        cmp	rax, 3
        jb	0x409527
        add	r12, 0x20
        cmp	r12, qword ptr [r13 + 0x358]
        ja	0x4095a5
        mov	rax, qword ptr [rbx + 5]
        mov	rbx, qword ptr [rbx + 0xd]
        mov	qword ptr [r12 - 0x18], 0x4093e8
        mov	qword ptr [r12 - 8], rax
        mov	qword ptr [r12], rbx
        lea	rax, [r12 - 0x18]
        mov	r14d, 0x6f0910
        mov	qword ptr [rbp - 0x10], 0x4b46e0
        mov	qword ptr [rbp - 8], 0x6fc6f9
        mov	qword ptr [rbp], rax
        add	rbp, -0x10
        jmp	0x411930

Main_main_closure = >> $fMonadIO (putStrLn (unpackCString# "Do you know the secret code?:")) (>>= $fMonadIO getLine (\s3Fd_info_arg_0 -> !!ERROR!!))

Make it a plugin/package for radare2/Cutter

Thank you for awesome tool!

Radare2 is a highly portable, cross-platform reverse engineering framework and toolkit without dependencies. It supports analyzing binaries, disassembling code, debugging programs, and attaching to remote GDB/LLDB and WinDbg servers. It also has a rich plugin system (see r2pm) and integrates with various decompilers, for example the Ghidra decompiler plugin r2ghidra-dec. It is actively developed and can be easily integrated into various open source and commercial products. I believe it would be highly beneficial to support these and provide a package installable from r2pm; see the package repository here: https://github.com/radareorg/radare2-pm

For documentation on writing plugins for radare2, see the Scripting and Plugins chapters of the Radare2 Book.

Cutter is a cross-platform Qt/C++ GUI frontend to radare2.

For documentation on writing plugins for Cutter, see the official tutorial and the curated list of various popular plugins.
It would be awesome to have it as part of a popular RE framework 👍
