Giter Club home page Giter Club logo

spedi's Introduction

Spedi

A speculative disassembler for the variable-size Thumb ISA. Given an ELF file as input, Spedi can:

  • Recover correct assembly instructions.
  • Recover targets of switch jumps tables.
  • Identify procedures in the binary and their call graph.

Spedi works directly on the binary without using any symbol information. We found Spedi to outperform IDA Pro in our experiments.

Main idea

Spedi recovers all possible Basic Blocks (BB) available in the binary. BBs that share the same jump instruction are grouped in a Maximal Block (MB). Then, MBs are refined using overlap and CFG conflict analysis. More details are available in our upcoming CASES'16 paper.

Usage

Build the project and try it on one of the binaries of our benchmark suite available here.

The following command will instruct spedito speculatively disassemble the .text section,

$ ./spedi -t -s -f $FILE > speculative.inst

Use the following command to instruct spedi to disassemble the .text section based on ARM code mapping symbols which provides the ground truth about correct instructions,

$ ./spedi -t -f $FILE > correct.inst

The easiest way to compare both outputs is by using,

$ diff -y correct.inst speculative.inst |less

Currently, you need to manually modify main.cpp to show results related to switch table and call-graph recovery.

Road map

Spedi is an academic proof-of-concept tool. Currently, it's not on our priority list. However, there are certain features that we have in mind for the future, namely:

  • Mixed-mode ARM/Thumb disassembly. The general idea of speculative disassembly provides the framework to do that. Basically, one needs to speculatively disassemble a code region twice. One time in Thumb mode, and another time in ARM mode. Later, mode switching instructions (mainly bx and blx) should be analyzed. This paper provides more details.
  • ELF Reader: currently our ELF reader is based on libelfin. We inherit some memory leakage issues. Additionally, the reader might crash on binaries with dwarf debug info. These issues needs to be addressed either in upstream or directly here.
  • Support for x86/x64. Thumb and similar RISC-like ISA have limited variability. Basically, instructions can either be 2 or 4 bytes width. We need to push the challenge even further by supporting x86/x64. To this end, overlap analysis might span more than two MB which complicates things. Current Maximal Block data structure is not efficient to do that.
  • Refactorings. The code is tightly coupled to our ELF reader. Also, it is specific to Thumb ISA. We need to make it more modular to suppport other ISAs and binary formats.

Dependencies

The project depends on Capstone disassembly library.

spedi's People

Contributors

abenkhadra avatar

Watchers

 avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.