decompiler-explorer's Introduction

Decompiler Explorer

Decompiler Explorer is a web front-end to a number of decompilers. This web service lets you compare the output of different decompilers on small executables. In other words: It's basically the same thing as Matt Godbolt's awesome Compiler Explorer, but in reverse.

Prerequisites

  • python >= 3.8
  • pipenv
  • docker
  • docker-compose

Installation

pipenv install
python scripts/dce.py init

Setting up decompilers

See the instructions here

Running in docker (dev)

pipenv install
python scripts/dce.py init

# Build all decompilers with valid keys 
python scripts/dce.py build
# If you want to exclude certain decompilers
# python scripts/dce.py --without-reko build

python scripts/dce.py start
# UI now accessible on port 80/443

Running in docker (production)

python scripts/dce.py start --prod --replicas 2 --acme-email=<your email>

Running in docker (production with s3 storage)

python scripts/dce.py start --prod --acme-email=<your email> --s3-bucket=<s3 bucket name>

Starting dev server (outside Docker)

This won't start any decompilers, just the frontend

pipenv run python manage.py migrate
pipenv run python manage.py runserver 0.0.0.0:8000

Starting decompiler for dev server

export EXPLORER_URL=http://172.17.0.1:8000

docker-compose up binja --build --force-recreate --remove-orphans

decompiler-explorer's People

Contributors

couleeapps, github-actions[bot], joleeee, joshterrill, ltfish, negasora, psifertex, stephenfewer, twizmwazin, zennjamin

decompiler-explorer's Issues

Support dark theme

  • Add a switch in the corner to let the user swap to dark mode.
  • Remember this preference.

Add License

Since this is public now, we should add a license.

Need Logo/Branding

We need a little logo and some simple branding stuff (theme colors, font, etc).

Support for uploading source and compiling

The original design for Decompiler Explorer included a box in which the user could type source. The source would then be compiled, the resulting binary sent for decompilation, and the results shown. This makes it a lot faster to round-trip source -> decompiled source, which is likely what most people care about seeing anyway.

Including this would add the overhead of running compiler workers (something the original did), potentially for many versions of the compilers, à la Compiler Explorer. Users will likely want to specify compiler/target/platform/commandline options.

Add option to submit decompiler bug report

We can detect crashes and failed decomps easily enough. Correctness and formatting are more challenging. If a decompiler generates incorrect/unreadable output, give an option to the user to flag it for investigation by the vendor.

Better "updating" page

Currently it just shows "404 not found" or "Bad gateway" depending on which stage in the deploy process is running. We should have a nicer "updating, please wait" page.

Support multiple decompiler versions

Eventually we will want to compare different versions of decompilers to see how they evolve over the years. This should be an option similar to how Compiler Explorer does it, with an easy dropdown/searchable list.

Add Samples

It would be nice to have sample files included on the site to demonstrate the functionality without requiring users to upload a binary of their own. This would let phone users actually see results. It would also show desktop users some output when they first open the site, instead of just a blank screen.

Threat modeling/dealing with eventuality of being popped

Our users are likely to be VR (vulnerability research) experts/enthusiasts, so someone is going to try to pop the site eventually. The current docker-based container system is decently separated, although Docker Is Not A Security Boundary still applies. The key resources to protect are the commercial software/licenses and the auth tokens, though it would be nice to also maintain availability.

MVP tracking issue

General architecture:

  • Frontend (nginx?)
  • Middle (glennbolt?)
  • Backend (all the decompilers)

MVP:

  • binary upload
  • Binja/Ghidra/angr output
  • resource limits (filesize client/server limits and runtime server limits)
  • rate limiting of some kind
  • separate frontend/backend
  • traefik/HTTPS
  • cached/static render
  • S3 storage
  • authenticated api
  • Don't use printing but rather return value from wrapper (change structure of back-ends)
  • Syntax highlight apps that don't have it built-in
  • Make it pretty (theme HTML)
  • Comment out version info at top of file and normalize text for each decompiler
  • Add readme
  • Escape non-html output

POST MVP:

  • S3 logins for friendlies
  • some threat modeling/dealing with eventuality of being popped
  • Support for uploading source and/or selecting a function?

Some provided samples show analysis timeouts

Some of the provided samples include binaries which apparently timed out during analysis:

  • Preferably we adjust the timeouts so this does not happen on samples, especially with default-enabled tools.
  • Sample binaries are not available so I cannot confirm locally and check that the tool is not choking on the input.
  • The timeout is not listed so I don't know how long the tool took before timing out.

Allow download of input file

If someone shares a decompiler bug repro via dogbolt link, it'd be nice to allow anyone receiving that link to download the input file, so they can reproduce the issue locally.

This would allow dogbolt to be used to report issues to open-source decompiler issue trackers, without unnecessarily restricting the people who can reproduce and fix such issues.

Resource limit breaks retdec

Setting the hard limit to anything less than INFINITY causes RetDec to crash. This only happens on the deploy server; it did not happen on my local install.

>>> args.mem_limit_soft=10000000000
>>> args.mem_limit_hard=10000000000
>>> decompile_source(args, conts)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 9, in decompile_source
__main__.DecompileError: b'Traceback (most recent call last):\n  File "decompile_retdec.py", line 58, in <module>\n    main()\n  File "decompile_retdec.py", line 23, in main\n    subprocess.check_call([RETDEC_DECOMPILER, \'--output\', outfile.name, \'--cleanup\', \'--silent\', infile.name], stdout=subprocess.PIPE, stderr=subprocess.PIPE)\n  File "/usr/local/lib/python3.8/subprocess.py", line 364, in check_call\n    raise CalledProcessError(retcode, cmd)\nsubprocess.CalledProcessError: Command \'[PosixPath(\'/home/decompiler_user/retdec/bin/retdec-decompiler\'), \'--output\', \'/tmp/tmpgp113qgs/tmp18f8xilr\', \'--cleanup\', \'--silent\', \'/tmp/tmpgp113qgs/tmpqfw3g882\']\' died with <Signals.SIGABRT: 6>.\n'

>>> args.mem_limit_soft=100000000000
>>> args.mem_limit_hard=100000000000
>>> decompile_source(args, conts)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 9, in decompile_source
__main__.DecompileError: b'Traceback (most recent call last):\n  File "decompile_retdec.py", line 58, in <module>\n    main()\n  File "decompile_retdec.py", line 23, in main\n    subprocess.check_call([RETDEC_DECOMPILER, \'--output\', outfile.name, \'--cleanup\', \'--silent\', infile.name], stdout=subprocess.PIPE, stderr=subprocess.PIPE)\n  File "/usr/local/lib/python3.8/subprocess.py", line 364, in check_call\n    raise CalledProcessError(retcode, cmd)\nsubprocess.CalledProcessError: Command \'[PosixPath(\'/home/decompiler_user/retdec/bin/retdec-decompiler\'), \'--output\', \'/tmp/tmpnd1o63o3/tmp94lqqgqz\', \'--cleanup\', \'--silent\', \'/tmp/tmpnd1o63o3/tmp_fhab1h3\']\' died with <Signals.SIGABRT: 6>.\n'

>>> args.mem_limit_soft=100000000000
>>> args.mem_limit_hard=resource.RLIM_INFINITY
>>> decompile_source(args, conts)
b'<pre>//\n// This file was generated by the Retargetable Decompiler\n// Website: https://retdec.com\n//\n\n#include <stdint.h>\n#include <stdio.h>\n\n// ------------------- Function Prototypes --------------------\n\nint64_t __do_global_dtors_aux(void);\nint64_t __libc_csu_fini(void);\nint64_t __libc_csu_init(void);\nint64_t _fini(void);\nint64_t _init(void);\nint64_t _start(void);\nint64_t deregister_tm_clones(void);\nint64_t frame_dummy(void);\nint64_t function_1003(void);\nint64_t function_1040(void);\nvoid function_1043(int64_t * d);\nint64_t function_1050(void);\nint32_t function_1053(char * format, ...);\nint64_t function_1103(void);\nint64_t function_1143(void);\n...'

>>> args.mem_limit_soft=resource.RLIM_INFINITY
>>> args.mem_limit_hard=resource.RLIM_INFINITY
>>> decompile_source(args, conts)
b'<pre>//\n// This file was generated by the Retargetable Decompiler\n// Website: https://retdec.com\n//\n\n#include <stdint.h>\n#include <stdio.h>\n\n// ------------------- Function Prototypes --------------------\n\nint64_t __do_global_dtors_aux(void);\nint64_t __libc_csu_fini(void);\nint64_t __libc_csu_init(void);\nint64_t _fini(void);\nint64_t _init(void);\nint64_t _start(void);\nint64_t deregister_tm_clones(void);\nint64_t frame_dummy(void);\nint64_t function_1003(void);\nint64_t function_1040(void);\nvoid function_1043(int64_t * d);\nint64_t function_1050(void);\nint32_t function_1053(char * format, ...);\nint64_t function_1103(void);\nint64_t function_1143(void);\n...'
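
To compare the deploy server with a local install, it may help to check which limits a child process actually ends up with. The following is a minimal sketch, not the project's code: `run_limited` is a hypothetical helper, and it assumes the limits are applied to `RLIMIT_AS`, which the sessions above do not state.

```python
import resource
import subprocess
import sys


def run_limited(cmd, soft, hard):
    """Run cmd with an address-space limit applied in the child only."""

    def apply_limits():
        # Runs in the child between fork() and exec(); the parent is untouched.
        # ASSUMPTION: RLIMIT_AS is the limited resource; the issue does not say.
        resource.setrlimit(resource.RLIMIT_AS, (soft, hard))

    return subprocess.run(cmd, preexec_fn=apply_limits,
                          stdout=subprocess.PIPE, stderr=subprocess.PIPE)


if __name__ == "__main__":
    # Keep the current hard limit so the sketch also runs unprivileged.
    _, hard = resource.getrlimit(resource.RLIMIT_AS)
    if hard == resource.RLIM_INFINITY:
        soft = 10_000_000_000
    else:
        soft = min(hard, 10_000_000_000)
    # A child that reports the limits it actually sees.
    probe = [sys.executable, "-c",
             "import resource; print(resource.getrlimit(resource.RLIMIT_AS))"]
    result = run_limited(probe, soft, hard)
    print(result.stdout.decode().strip())
```

Running the probe on both machines would show whether the limits that reach the decompiler differ between them. An unprivileged process cannot raise its hard limit, so a host that already imposes a finite hard limit will reject a request for `RLIM_INFINITY`.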

Service redeployment causes availability disruption

When we merge a fix, the entire site apparently goes down. It would be nice to minimize this disruption. An easy solution is to have the public service update nightly. Ideally, we would stand up the new version before tearing the old version down so we can transition smoothly.

better mobile support

Current decompiler iframes don't flow well at mobile resolutions. AKA: make moar responsive.

Need FAQ Page

We need an FAQ page that includes:

  • What this is (for people who aren't familiar with decompilers)
  • Links and short descriptions for all of the tools involved
  • How it's being hosted (CPUs, RAM, etc)
  • Who made it
  • Where to get the code
  • How to contribute to the code

Anything else?

Add disassembly "decompiler"

Currently if you have a dogbolt link where two tools disagree, there's no way to know which one is correct. (e.g. BinaryNinja vs Hex-Rays here)

Adding a "disassembly" pane would give a source of truth, to allow users to understand what the input to the decompilers actually was, and which one is correct.

Some of the runners don't handle timeouts properly

Just checked the workers and found multiple runners from various decompilers, just spinning. We probably need to have a stricter timeout on the runner script and actually terminate the process correctly. I'm not sure why the current implementation didn't work.

Notable finds:

  • Reko
  • RecStudio
  • Boomerang
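
One way to make the timeout stricter is to start each decompiler in its own process group and kill the whole group when the deadline passes, so helper processes cannot keep spinning. This is a minimal sketch under the assumption that the runner launches the decompiler as a subprocess on a POSIX host; `run_with_timeout` is a hypothetical name, not the project's runner.

```python
import os
import signal
import subprocess
import sys


def run_with_timeout(cmd, timeout_s):
    """Run cmd; on timeout, kill its entire process group.

    Returns (returncode, stdout), with returncode None on timeout.
    """
    # start_new_session=True puts the child in its own session and process
    # group, so anything it spawns can be signalled together with it.
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE,
                            stderr=subprocess.DEVNULL,
                            start_new_session=True)
    try:
        out, _ = proc.communicate(timeout=timeout_s)
        return proc.returncode, out
    except subprocess.TimeoutExpired:
        try:
            # SIGKILL cannot be caught or ignored, unlike SIGTERM.
            os.killpg(proc.pid, signal.SIGKILL)
        except ProcessLookupError:
            pass  # the group exited on its own in the meantime
        proc.communicate()  # reap the child so no zombie is left behind
        return None, b""


if __name__ == "__main__":
    sleeper = [sys.executable, "-c", "import time; time.sleep(30)"]
    print(run_with_timeout(sleeper, 0.5))
```

Signalling the group rather than only the direct child matters when a decompiler is started through a wrapper script: killing the wrapper alone can leave the real analysis process running.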

Ghidra is real slow

With the 30-second timeout on current hardware, Ghidra doesn't seem to finish decompiling even the smallest executables. We should bump up either the timeout or the hardware before release so it actually gets results.

Frontend should use screen space more effectively

  • Currently everything is squashed into center container making >3 decompiler comparison pretty tight
  • Checkboxes/Input section take up more space than necessary
  • Don't need a footer, just put the mention about GitHub in the header

Recommend something more like mdec.

Support decompiler options

Decompilers may provide options to adjust their output. It would be nice to be able to expose some options.

S3 logins for friendlies

Currently the only way to access the uploaded binaries is via SSH to the deployment box, or via Django admins. We should look into a cleaner solution that allows us to give access to these binaries without giving complete access to the rest of the infra.

Align functions

Put all the functions in the same order in each decompilation so it's easier to compare.

Cleanup worker/server structure

Currently, workers are poll-based and if you have 2 instances of one worker, they will both try to decompile everything. We should do one of the following:

  • Make the queue push-based and server-authoritative
  • Make the clients "check out" requests and skip over others' checked out requests
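
The second option could look roughly like the sketch below, where a worker claims a request with a guarded update so that two instances of the same worker never take the same job. The table layout, column names, and the `check_out` helper are all hypothetical; this uses an in-memory SQLite database for illustration rather than the project's actual models.

```python
import sqlite3


def make_queue():
    """Create an in-memory queue table (hypothetical schema)."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE requests ("
               "id INTEGER PRIMARY KEY, decompiler TEXT, owner TEXT)")
    return db


def check_out(db, decompiler, worker_id):
    """Claim one unowned request for this decompiler, or return None."""
    with db:  # commit on success, roll back on error
        row = db.execute(
            "SELECT id FROM requests "
            "WHERE decompiler = ? AND owner IS NULL ORDER BY id LIMIT 1",
            (decompiler,)).fetchone()
        if row is None:
            return None
        # The "owner IS NULL" guard turns the claim into a no-op if another
        # worker took the request between the SELECT and the UPDATE.
        cur = db.execute(
            "UPDATE requests SET owner = ? WHERE id = ? AND owner IS NULL",
            (worker_id, row[0]))
        return row[0] if cur.rowcount == 1 else None


if __name__ == "__main__":
    db = make_queue()
    db.execute("INSERT INTO requests (decompiler) VALUES ('binja'), ('binja')")
    print(check_out(db, "binja", "worker-a"))
    print(check_out(db, "binja", "worker-b"))
    print(check_out(db, "binja", "worker-a"))
```

A real deployment would also need a way to release requests whose worker died mid-job, for example a checkout timestamp with an expiry.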

switch status checks to WS://

Currently the frontend polls requests every second for every decompiler, which ends up being 8 req/sec/user. That is a lot. We should reduce this number by using only 1 request per binary job, and remember to cancel outstanding requests if the user uploads a new binary. We may be able to use websockets instead, to really tone down the volume of requests.

Handle fat binaries / archives

Fat Mach-O files are a collection of binaries in one file, separated by architecture. Some of the tools can read the containers and analyze them, while others fall over, even if they could analyze the inner Mach-O. Something similar happens with archive libraries (.a). It would be nice to have some way to "thin" the files for the decompilers with less capable loaders.
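
As a rough illustration of thinning, the sketch below reads the 32-bit fat header (magic `0xCAFEBABE`, big-endian, followed by one `fat_arch` record per slice) and cuts out the slice for one CPU type. The function names are hypothetical, and the sketch ignores the 64-bit fat variant and `.a` archives.

```python
import struct

FAT_MAGIC = 0xCAFEBABE       # 32-bit fat header, stored big-endian
CPU_TYPE_X86_64 = 0x01000007
CPU_TYPE_ARM64 = 0x0100000C


def list_slices(data):
    """Return [(cputype, cpusubtype, offset, size)] for a fat Mach-O.

    Returns [] for anything that does not look like a 32-bit fat file.
    """
    if len(data) < 8:
        return []
    magic, nfat_arch = struct.unpack_from(">II", data, 0)
    # Java class files share this magic, so also require that the
    # arch table fits inside the file before trusting the count.
    if magic != FAT_MAGIC or len(data) < 8 + 20 * nfat_arch:
        return []
    slices = []
    for i in range(nfat_arch):
        # struct fat_arch: cputype, cpusubtype, offset, size, align
        cputype, cpusubtype, offset, size, _align = struct.unpack_from(
            ">iiIII", data, 8 + 20 * i)
        slices.append((cputype, cpusubtype, offset, size))
    return slices


def thin(data, cputype):
    """Return the bytes of the first slice matching cputype, or None."""
    for slice_type, _subtype, offset, size in list_slices(data):
        if slice_type == cputype:
            return data[offset:offset + size]
    return None
```

A worker for a decompiler with a less capable loader could call `thin` before analysis and fall back to the original bytes when it returns `None`.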
