Giter Club home page Giter Club logo

symex-vp's Introduction

A concolic testing engine for RISC-V embedded software with support for SystemC peripherals.

About

SymEx-VP focuses explicitly on testing software for constrained embedded devices (e.g. as used in the Internet of Things). This software often interacts very closely with low-level hardware peripherals and in order to support these interactions, during simulation-based software testing, SymEx-VP supports SystemC peripheral models. SystemC is a C++ class library for modeling hardware peripherals. SystemC is often used to create executable models of an entire hardware platform, so-called Virtual Prototypes (VPs). Using these SystemC peripherals models, SymEx-VP allows injecting test inputs into the software simulation via the MMIO peripheral interface. Thereby allowing tests of embedded software with only minimal software modifications.

Based on injected inputs, SymEx-VP allows symbolic execution (or more specifically concolic testing) of RISC-V RV32IMAC machine code. Branches based on injected symbolic values are tracked and as soon as execution terminates new assignments for symbolic variables are determined by negating encountered branch conditions (Dynamic Symbolic Execution). For each new assignment, the software simulation is restarted from the beginning, thereby (ideally) enabling exploration of all paths through the program based on the introduced symbolic variables.

SymEx-VP is implemented on top of the existing riscv-vp codebase and integrates this existing VP with the clover concolic testing library. More details on SymEx-VP are provided in the SymEx-VP paper.

Features

  • Concolic execution of RISC-V RV32IMAC machine code
  • Accurate hardware peripheral modeling via the SystemC C++ library
  • Support for injecting concolic test inputs via the MMIO peripheral interface
  • Support for generating test case files and replaying them (see SYMEX_TESTCASE below)
  • Integrated GDB stub to ease debugging of encountered errors (--debug-mode)
  • Support for many embedded operating systems (e.g. RIOT, Zephyr, …)

Cloning

This repository makes use of submodules to include vendored dependencies. In order to automatically checkout these submodules clone the repository as follows:

$ git clone --recursive https://github.com/agra-uni-bremen/symex-vp

Alternatively, if you already cloned the repository without passing the --recursive option run the following command to checkout all submodules:

$ git submodule update --init

Installation

This software can be installed either using Docker (recommended) or manually via GNU make.

Docker

To build a Docker image for SymEx-VP run the following command:

$ docker build -t riscv-symex .

Afterwards, create a new Docker container from this image using:

$ docker run --rm -it riscv-symex

The SymEx-VP source directory is then available in /home/riscv-vp/riscv-vp within the container. Provided VPs are automatically added to $PATH. For this reason, the examples provided in /home/riscv-vp/riscv-vp/examples (see below) can easily be executed within the provided container.

Manual

Manual installation requires the following software to be installed:

  • A C++ compiler toolchain with C++17 support
  • CMake
  • A recent version of Z3 (4.8.X is known to work)
  • LLVM
  • Boost

After all dependencies have been installed run:

$ make

Executable binaries are then available in ./vp/build/bin.

Usage

In regards to using SymEx-VP for software execution it behaves like a normal virtual prototype and should be able to execute any RV32IMAC binaries. In order to utilize the symbolic execution features provided by SymEx-VP, the following additional aspects have to be consider for software testing. Communication between software and the virtual prototype is achieved through memory-mapped IO with a provided SymbolicCTRL peripheral.

  1. Symbolic Inputs: In order to explore different paths through the program, inputs based on which path exploration should take place need to be marked as symbolic. Symbolic variables can be introduced through SystemC peripherals which return symbolic variables through a TLM 2.0 extension as part of memory-mapped I/O. For example, it is possible to model an Ethernet peripheral using SystemC which returns symbolic variables for network packets and thereby allows exploring paths through the network stack of the executed software. Alternatively, it is also possible to declare variables as symbolic manually through the aforementioned SymbolicCTRL peripheral.
  2. Termination Points: Since SymEx-VP restarts the entire SystemC simulation for each new assignment of symbolic input variables, termination points need to be defined for the simulated software. For example, when exploring the network stack of the executed software, it may be sufficient to terminate software simulation as soon as the symbolic network packet was handled by the network stack. Termination points must presently be declared by the executed software by writing to a control register in the memory-mapped SymbolicCTRL peripheral.
  3. Path Analyzers: In order to find errors using symbolic execution, a so-called path analyzer needs to be employed on each executed path. For each executed path, the employed path analyzer needs to decide if the path constitutes an error case. For example, errors may include violation of spatial memory safety, stack overflows, et cetera. Presently, we mostly rely on the executed software to signal an error condition to the virtual prototype using the SymbolicCTRL peripheral. For example, this allows signaling errors from software-specific panic handlers.

Usage Examples

Usage examples which demonstrate the three aspects mentioned in the previous section can be found in the ./examples subdirectory. For instance, the ./examples/zig-out-of-bounds demonstrates the discovery of out-of-bounds array accesses using SymEx-VP. More information on individual example applications is available in the README.md file in the ./examples subdirectory.

The provided Docker image already contains a correctly configured RISC-V cross toolchain. For example, to run the assertion-failure example simply execute the following command inside the container:

$ make -C examples/assertion-failure/ sim

More complex usage example, e.g. for the integration with embedded operating systems, are provided as part of the evaluation artifacts for the SymEx-VP overview paper. Furthermore, simple test harnesses for using SymEx-VP in conjunction with the RIOT operating systems are provided in the symex-vp-riot repository.

Provided VPs

The following virtual prototypes are available:

  • symex-vp: A very minimal virtual prototype, based on tiny32-vp from the original riscv-vp repository. This allows testing very basic bare-metal RV32 software.
  • hifive-vp: A virtual prototype mostly compatible with the SiFive HiFive1. This allows executing software for embedded operating systems like RIOT or Zephyr symbolically.

Other platforms from riscv-vp can be trivially ported.

Environment Variables

The following environment variables can be set:

  • SYMEX_ERREXIT: If set, terminate after encountering the first error during symbolic execution of the software.
  • SYMEX_TIMEOUT: This can be used to configure the solver timeout of the employed SMT solver, by default no timeout is used. The timeout is given in seconds.
  • SYMEX_TIMEBUDGET: Time budget (in seconds) for path exploration, by default the software is explored until all branches have been negated.
  • SYMEX_TESTCASE: This environment variable can point to a test case file for replaying inputs causing an error. This is most useful in conjunction with --debug-mode.
  • SYMEX_DUMPALL: Create test case files for all generated paths, by default test case files are only generated for paths with an error condition.

How To Cite

The concepts behind SymEx-VP are further described in the following publication:

@article{tempel2022symex,
	title   = {{SymEx-VP: An open source virtual prototype for OS-agnostic concolic testing of IoT firmware}},
	journal = {Journal of Systems Architecture},
	pages   = {12},
	year    = {2022},
	issn    = {1383-7621},
	doi     = {10.1016/j.sysarc.2022.102456},
	author  = {Sören Tempel and Vladimir Herdt and Rolf Drechsler},
}

The artifacts for this publication are also available on Code Ocean.

Acknowledgements

This work was supported in part by the German Federal Ministry of Education and Research (BMBF) within the project Scale4Edge under contract no. 16ME0127 and within the project VerSys under contract no. 01IW19001.

License

The original riscv-vp code is licensed under MIT (see LICENSE.MIT). All modifications made for the integration of symbolic execution with riscv-vp are licensed under GPLv3+ (see LICENSE.GPL). Consult the copyright headers of individual files for more information.

symex-vp's People

Contributors

cirromulus avatar grossed avatar krypt-n avatar nbruns1 avatar nmeum avatar rysiof avatar u2654 avatar vherdt avatar

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

Watchers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.