Unicorn CPU Emulator for Lua

Lua bindings for the Unicorn CPU Emulator.

I'm currently testing this on vanilla Lua 5.1 - 5.4, LuaJIT 2.0, and LuaJIT 2.1 on both Linux and macOS. This does not work on macOS 14 with Unicorn 1.x; see Known Issues below for details.

License Change

As of version 2.0 the license has changed to GPL v2. This is due to the viral nature of the GPL license family: since QEMU uses GPL, this must also be GPL even though it only dynamically links to Unicorn. I apologize for the mistake I made when I created this with the BSD-3 license.

Known Limitations

The following are some limitations that are either impossible to work around due to the nature of Lua, or I haven't gotten around to fixing yet.

32-bit Lua Behavior

32-bit Lua builds (i.e. those compiled with LUA_32BITS set to a nonzero value) won't handle 64-bit integers properly. Exactly what happens is technically undefined until C++20, but most likely you would silently lose the upper 32 bits. For this reason I strongly discourage using such builds.

64-bit Integers

64-bit integers do not fully work on Lua 5.1 or 5.2. This is because Lua only added direct support for integers in 5.3; Lua 5.1 and 5.2 use double-precision floating-point numbers, which have a 53-bit significand (at most 17 significant decimal digits). Thus, integers wider than 53 bits cannot always be represented exactly before 5.3.
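
The 53-bit limit can be seen without this library at all. The snippet below is a minimal sketch that assumes a typical build where Lua numbers are IEEE 754 doubles; the variable names are only illustrative.

```lua
-- 2^53 is the point past which consecutive integers stop being exactly
-- representable in a double. The ^ operator always produces a float, so
-- this behaves the same on Lua 5.1 through 5.4 and on LuaJIT.
local limit = 2^53

-- Adding 1 rounds straight back to 2^53: the odd value has no exact
-- floating-point representation.
local lost = (limit + 1 == limit)

-- One step below the limit, neighbouring integers are still distinct.
local kept = ((limit - 1) + 1 == limit) and (limit - 1 ~= limit)

print(lost, kept)  --> true    true
```

An address or register value that needs more than 53 bits would be rounded in the same way when it passes through a Lua 5.1 or 5.2 number.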

We can work around this limitation by:

  • Using libraries such as BigInt. This could quickly become cumbersome, and the performance impact is unknown.
  • Providing special read and write functions for 64-bit integers. This is the least disruptive but also makes the API irregular.
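
As a rough illustration of the second option, a 64-bit value could be handled as two 32-bit halves, each of which fits exactly in a double. The helpers below (add64, hex64) are hypothetical and not part of this library; they only sketch what code built on such an API might look like.

```lua
local TWO32 = 2^32

-- Add two 64-bit values, each given as (high, low) 32-bit halves.
-- The result wraps modulo 2^64, like a real 64-bit register.
local function add64(ahi, alo, bhi, blo)
  local lo = alo + blo
  local carry = 0
  if lo >= TWO32 then
    lo = lo - TWO32
    carry = 1
  end
  local hi = (ahi + bhi + carry) % TWO32
  return hi, lo
end

-- Render a (high, low) pair as 16 hexadecimal digits.
local function hex64(hi, lo)
  return string.format("%08x%08x", hi, lo)
end

-- 0x00000000ffffffff + 1 carries into the upper half.
print(hex64(add64(0, 0xffffffff, 0, 1)))  --> 0000000100000000
```

This keeps every intermediate value well below 2^53, at the cost of the irregular API mentioned above.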

I don't intend to fix this at the moment, as I want to focus on getting the API complete first.

Signedness

Because numbers in Lua are always signed, values above LUA_MAXINTEGER [1] such as addresses or register values will be returned from functions as negative numbers, e.g.

uc:reg_write(x86_const.UC_X86_REG_RAX, 0xffffffffffffffff)

-- Returns -1 not 2^64 - 1
uc:reg_read(x86_const.UC_X86_REG_RAX)

This doesn't affect how arguments are passed to the library, only values returned from the library.
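
If you need to display or compare such values as unsigned, the standard library is usually enough. The following sketch assumes Lua 5.3 or later with 64-bit integers; to_hex and addr_lt are hypothetical helper names, not functions provided by this library.

```lua
-- Requires Lua 5.3 or later (integer subtype and math.ult).

-- Format a possibly negative Lua integer as the unsigned 64-bit value it
-- encodes. "%x" prints the two's-complement bit pattern of the integer.
local function to_hex(value)
  return string.format("0x%016x", value)
end

-- Unsigned "less than" for addresses or register values that may have
-- wrapped around to negative numbers.
local function addr_lt(a, b)
  return math.ult(a, b)
end
```

With these, to_hex(-1) gives 0xffffffffffffffff, and addr_lt(1, -1) is true even though 1 < -1 is false.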

Floating-point Registers

The 80-bit ST(x) registers on x86 architectures can't be read from or written to properly; a bug in the current encoding/decoding code gives garbage values so I've disabled it for the time being. Even if it did work, because Lua's floating-point numbers are by default at most 64 bits, you're still going to lose precision when reading the registers.

Emergency Collection and Memory Leaks

If Lua doesn't have enough available memory to do a proper garbage collection cycle, the collector will run in "emergency mode." [2] In this mode, finalizers are not run, so you could end up in a situation where hooks, contexts, and other resources held by a disused engine aren't released and never can be.

This rarely happens, and most user code will probably be able to let the library do its own memory management. If you'd like to be safe, call the close() method on an engine after you're done using it to reduce the risk of an emergency collection leaking resources.
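
One way to make sure close() is always called, even when your code raises an error, is a small wrapper around pcall. This is only a sketch: with_engine is a hypothetical helper, and the only thing it assumes about the engine is the close() method described above.

```lua
-- Run body(engine) and always close the engine afterwards, even if body
-- raises an error. open_engine is any function that returns an object with
-- a close() method, for example:
--   function() return unicorn.open(arch, mode) end
-- Only the first return value of body is passed through.
local function with_engine(open_engine, body)
  local engine = open_engine()
  local ok, result = pcall(body, engine)
  engine:close()
  if not ok then
    -- Re-raise the original error after the engine has been released.
    error(result, 0)
  end
  return result
end
```

The engine is released at a predictable point instead of waiting for a finalizer that might never run.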

General Usage

unicorn tries to mirror the organization and naming conventions of the Python binding as much as possible. For example, architecture-specific constants are defined in submodules like unicorn.x86_const; a few global functions are defined in unicorn, and the rest are instance methods of the engine.

Quick Example

This is a short example to show how some of the features can be used to emulate the BIOS setting up a system when booting.

local unicorn = require 'unicorn'
local uc_const = require 'unicorn.unicorn_const'

local uc = unicorn.open(uc_const.UC_ARCH_X86, uc_const.UC_MODE_32)

-- Map in 1 MiB of RAM for the processor with full read/write/execute
-- permissions. We could pass permissions as a third argument if we want.
uc:mem_map(0, 0x100000)

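-- Revoke write access to the VGA and BIOS ROM shadow areas. The flags are
-- distinct bits, so + combines them on every supported Lua version; the |
-- operator only exists in Lua 5.3 and later.
local read_exec = uc_const.UC_PROT_READ + uc_const.UC_PROT_EXEC
uc:mem_protect(0xC0000, 32 * 1024, read_exec)
uc:mem_protect(0xF0000, 64 * 1024, read_exec)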

-- Create a hook for the VGA driver that's called whenever VGA memory is
-- written to by client code. hook_add takes a hook type (UC_HOOK_*), not a
-- memory access type.
uc:hook_add(uc_const.UC_HOOK_MEM_WRITE, vga_write_callback, 0xA0000, 0xBFFFF)

-- Install interrupt hooks so the CPU can perform I/O and other operations.
-- We'll handle all of that in Lua. Only one interrupt hook can be set at a
-- time.
uc:hook_add(uc_const.UC_HOOK_INTR, interrupt_dispatch_hook)

-- Load the boot sector of the hard drive into 0x7C00, the address where a
-- BIOS places it. Open the image in binary mode so the bytes are read
-- unmodified.
local fdesc = assert(io.open('hard-drive.img', 'rb'))
local boot_sector = fdesc:read(512)
fdesc:close()
uc:mem_write(0x7C00, boot_sector)

-- Start emulation at the boot sector we just loaded, stopping if execution
-- hits the address 0x100000. Since this is beyond the range we have mapped
-- in, the CPU will run forever until the code shuts it down, just like a
-- real system.
uc:emu_start(0x7C00, 0x100000)

Detailed Examples

More real-world examples can be found in the docs/examples directory. Before running them, run make examples to generate the required resources.

Deviations from the Python Library

Because end is a Lua keyword, mem_regions() returns tables whose record names are begins, ends, and perms rather than begin, end, perms.
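
For instance, code that prints a region has to use the renamed fields. The helper below is a hypothetical sketch that takes a single record shaped like the ones described above; only the field names begins, ends, and perms come from this library.

```lua
-- Format one region record. The field names are begins/ends/perms rather
-- than begin/end/perms because "end" is a reserved word in Lua.
local function describe_region(region)
  return string.format(
    "0x%08x-0x%08x perms=%d",
    region.begins, region.ends, region.perms
  )
end

-- A hand-built record standing in for one returned by mem_regions().
print(describe_region({ begins = 0, ends = 0xFFFFF, perms = 7 }))
--> 0x00000000-0x000fffff perms=7
```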

Requirements

This project has the following dependencies. Ensure you have them installed before using.

  • Lua 5.1 or higher, as well as the static library and headers. Lua 5.3 and above must not have been compiled with the LUA_32BITS option set.
  • A C++ compiler supporting the C++11 standard or later. Supported compilers include GCC 4.1+ and GCC-compatible compilers like Clang.
  • The Unicorn CPU Emulator library must be installed in your system's standard library location. Versions 1 and 2 are supported.
  • You must also have the Unicorn headers installed.
  • Some examples have additional dependencies; see their READMEs for details.

Known Issues

Unicorn 2.0.1 will not compile on macOS with Boost 1.73.0 or newer. If you run into an error involving the header boost/detail/endian.hpp, use Unicorn version 2.0.1.post1 or higher. (Ticket here).

On Unicorn 1.x and macOS 14+ (using Apple silicon), this sometimes crashes with no warning and no error. My suspicion is that it's some binary incompatibility caused by LuaRocks's compiler settings, specifically this issue rearing its head again. I don't know for sure.

Just Installing?

If you just want to install this library, open a terminal, navigate to the root directory of this repository, and run

luarocks build

Development

Using a virtual environment for Lua is strongly recommended. You'll want to avoid using your OS's real Lua, and using virtual environments allows you to test with multiple versions of Lua. You can use lenv for this.

If you're running macOS and encounter a linker error with LuaJIT, check out this ticket.

Building and Testing

# Build and install the library into your tree
luarocks build

# Build and run the tests
luarocks test

Examples

See the examples directory for examples of how you can use this library.

License

See NOTICE.txt and LICENSE.txt for details. I'm legally required to release this under GPL 2+ due to QEMU's license, so please don't ask me to change this to MIT or 3-clause BSD. Sorry.

Footnotes

[1] Typically 2^63 - 1 on 64-bit machines and 2^31 - 1 on 32-bit machines.
[2] Programming in Lua, 4th Edition, page 233.

Contributors

dargueta
unicorn-lua's Issues

64-bit integers broken on <5.3

Lua 5.1 and 5.2 exclusively use floating-point numbers, which means that anything over 53 bits can't be represented accurately and will get truncated.

Add support for Windows

I've been trying to get this to work off and on for two years, but just getting Unicorn and Lua to build is proving much more difficult than I thought. I would really like to get this working if possible.

So far my major problem is building and installing Lua and Unicorn properly. I'm fairly sure the code itself is cross-platform, but we'll see.

Add to Luarocks

Need to add a rockspec file and make it easily installable. The major problem we have here is that we currently rely on Python scripts to process the Unicorn header files, meaning the user has to have Python installed. We need to

  • Rewrite the header processing scripts in Lua
  • Write a rockspec
  • (Possibly) switch to a regular Makefile since LuaRocks passes us many variables we need

Big-endian host support?

Reading from/writing to registers on a big-endian host system won't work for registers that aren't the same size as a Lua integer. This is because the library currently has no concept of register sizes and thus doesn't know how to do typecasts.

Due to how byte order works this doesn't matter on a little-endian host, but on a big-endian host it'll result in things like a 16-bit register getting returned to Lua as 0x7fff000000000000 instead of 0x7fff.
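
The effect can be modelled in pure Lua with string.pack and string.unpack. This is only a model of the byte layout, not the library's actual code; it assumes Lua 5.3 or later, and read_u16_as_u64 is a hypothetical name.

```lua
-- Requires Lua 5.3 or later for string.pack and string.unpack.

-- Store a 16-bit register value at the start of a zeroed 8-byte buffer using
-- the given byte order ("<" little-endian, ">" big-endian), then reinterpret
-- the whole buffer as one 64-bit integer in that same byte order. This
-- mimics writing a small register into a buffer sized for a Lua integer.
local function read_u16_as_u64(value, order)
  local buffer = string.pack(order .. "I2", value) .. string.rep("\0", 6)
  -- The parentheses drop string.unpack's second result (the next position).
  return (string.unpack(order .. "I8", buffer))
end
```

Here read_u16_as_u64(0x7fff, "<") returns 0x7fff, while read_u16_as_u64(0x7fff, ">") returns 0x7fff000000000000, matching the behavior described above.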

Reading MSRs on x86 is broken

When reading a model-specific register on an x86 machine, uc_reg_read() expects the register ID to be present in the buffer. There is currently no way to pass the ID of the MSR you want to read, so this always returns the machine check exception address register (judging from this).

We need to modify this to allow specifying a model.

Buffer overflow when reading large registers

The current code for reading a register looks like this:

lua_Unsigned value = 0;
error = uc_reg_read(engine, register_id, &value);

On Lua 5.3, integers are 64 bits, so this will result in a buffer overflow when reading the 128-bit XMMX/YMMX etc. registers on an Intel machine. Similarly, reading a 64-bit register on any 32-bit Lua installation will also result in a buffer overflow.

Can't read/write 80-bit floating-point registers

The library is incapable of reading or writing the 80-bit floating-point ST(x) registers on x86 architectures. The encoding and decoding code is buggy so it currently throws an exception rather than give an incorrect answer.

Compilation fails on GCC 12

Running luarocks build fails to compile.

Update: It compiles fine if the -Wall flag is removed. I really don't want to do this.

The command Make runs (reformatted, with paths slightly modified to remove personal info):

gcc -std=c++11 -DIS_LUAJIT=0  -Wall -Wextra -Werror -Wpedantic -pedantic-errors \
    -I./include               \
    -I./.luaenv-5.4.6/include \
    -I./.luaenv-5.4.6/include \
    -I/usr/local/include      \
    -I/usr/local/include      \
    -O2 -fPIC -c              \
    -o src/control_functions.o src/control_functions.cpp

GCC barfs:

In file included from /usr/include/c++/12/memory:75,
                 from src/control_functions.cpp:15:
In member function ‘void std::default_delete<_Tp>::operator()(_Tp*) const [with _Tp = long unsigned int]’,
    inlined from ‘std::unique_ptr<_Tp, _Dp>::~unique_ptr() [with _Tp = long unsigned int; _Dp = std::default_delete<long unsigned int>]’ at /usr/include/c++/12/bits/unique_ptr.h:396:17,
    inlined from ‘int ul_ctl_get_exits(lua_State*)’ at src/control_functions.cpp:47:1:
/usr/include/c++/12/bits/unique_ptr.h:95:9: error: ‘void operator delete(void*)’ called on pointer returned from a mismatched allocation function [-Werror=mismatched-new-delete]
   95 |         delete __ptr;
      |         ^~~~~~~~~~~~
src/control_functions.cpp: In function ‘int ul_ctl_get_exits(lua_State*)’:
src/control_functions.cpp:34:55: note: returned from ‘void* operator new [](std::size_t)’
   34 |     std::unique_ptr<uint64_t> array(new uint64_t[count]);
      |                                                       ^
In member function ‘void std::default_delete<_Tp>::operator()(_Tp*) const [with _Tp = long unsigned int]’,
    inlined from ‘std::unique_ptr<_Tp, _Dp>::~unique_ptr() [with _Tp = long unsigned int; _Dp = std::default_delete<long unsigned int>]’ at /usr/include/c++/12/bits/unique_ptr.h:396:17,
    inlined from ‘int ul_ctl_get_exits(lua_State*)’ at src/control_functions.cpp:47:1:
/usr/include/c++/12/bits/unique_ptr.h:95:9: error: ‘void operator delete(void*)’ called on pointer returned from a mismatched allocation function [-Werror=mismatched-new-delete]
   95 |         delete __ptr;
      |         ^~~~~~~~~~~~
src/control_functions.cpp: In function ‘int ul_ctl_get_exits(lua_State*)’:
src/control_functions.cpp:34:55: note: returned from ‘void* operator new [](std::size_t)’
   34 |     std::unique_ptr<uint64_t> array(new uint64_t[count]);
      |                                                       ^
In member function ‘void std::default_delete<_Tp>::operator()(_Tp*) const [with _Tp = long unsigned int]’,
    inlined from ‘std::unique_ptr<_Tp, _Dp>::~unique_ptr() [with _Tp = long unsigned int; _Dp = std::default_delete<long unsigned int>]’ at /usr/include/c++/12/bits/unique_ptr.h:396:17,
    inlined from ‘int ul_ctl_set_exits(lua_State*)’ at src/control_functions.cpp:88:1:
/usr/include/c++/12/bits/unique_ptr.h:95:9: error: ‘void operator delete(void*)’ called on pointer returned from a mismatched allocation function [-Werror=mismatched-new-delete]
   95 |         delete __ptr;
      |         ^~~~~~~~~~~~
src/control_functions.cpp: In function ‘int ul_ctl_set_exits(lua_State*)’:
src/control_functions.cpp:74:61: note: returned from ‘void* operator new [](std::size_t)’
   74 |     std::unique_ptr<uint64_t> entries(new uint64_t[n_entries]);
      |                                                             ^
In member function ‘void std::default_delete<_Tp>::operator()(_Tp*) const [with _Tp = long unsigned int]’,
    inlined from ‘std::unique_ptr<_Tp, _Dp>::~unique_ptr() [with _Tp = long unsigned int; _Dp = std::default_delete<long unsigned int>]’ at /usr/include/c++/12/bits/unique_ptr.h:396:17,
    inlined from ‘int ul_ctl_set_exits(lua_State*)’ at src/control_functions.cpp:88:1:
/usr/include/c++/12/bits/unique_ptr.h:95:9: error: ‘void operator delete(void*)’ called on pointer returned from a mismatched allocation function [-Werror=mismatched-new-delete]
   95 |         delete __ptr;
      |         ^~~~~~~~~~~~
src/control_functions.cpp: In function ‘int ul_ctl_set_exits(lua_State*)’:
src/control_functions.cpp:74:61: note: returned from ‘void* operator new [](std::size_t)’
   74 |     std::unique_ptr<uint64_t> entries(new uint64_t[n_entries]);
      |                                                             ^
cc1plus: all warnings being treated as errors
make: *** [Makefile:194: src/control_functions.o] Error 1

Error: Build error: Failed building.

No support for ST or SSE2+ registers

Maximum supported register size is 64 bits. This means we can't read 80-bit floating-point registers, nor the XMMX (128-bit), YMMX (256-bit) or ZMMX (512-bit) registers. Reading these registers will result in a buffer overflow (see #3) and garbage return values.

Reading/writing floating-point registers gives garbage values

Everything is read as an integer, so you're going to get back whatever the binary representation of a floating-point number is on your machine, which can differ between architectures.

This is fixable, but is going to be tedious and error-prone for architectures other than x86 and MIPS because I'm not as familiar with those.
