ndrewh / pyda
Write simple dynamic binary analysis tools in Python
License: Other
We can instrument memory accesses, following the example of https://github.com/DynamoRIO/dynamorio/blob/master/api/samples/memtrace_simple.c.
An initial prototype can be slow, as long as the instrumentation is disabled by default.
I think several optimizations will eventually be necessary.
When a large number of hooks are added, our basic block instrumentation event handler becomes slow because it does a naive linear search over all registered hooks.
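For illustration, here is a minimal sketch of the obvious fix (hypothetical, not current pyda code): index hooks by address so the basic-block event does one dict lookup per instruction instead of scanning every registered hook.

```python
# addr -> list of hook callbacks (names here are illustrative only)
hooks = {}

def add_hook(addr, fn):
    hooks.setdefault(addr, []).append(fn)

def bb_event(instr_addrs):
    # Called once per basic block: return {addr: hooks} describing which
    # instructions need a clean call inserted. One dict lookup per
    # instruction, instead of a linear scan over all hooks.
    return {a: hooks[a] for a in instr_addrs if a in hooks}

add_hook(0x1000, lambda p: None)
plan = bb_event([0xff8, 0x1000, 0x1008])
```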
Dynamorio supports syscall hooks:
I suspect we should implement the pre- and post- hooks in pyda_core, and filter syscall events in the Python wrapper. If the overhead is too substantial, then we can implement a simple version of the filter that just checks the syscall number against a list of registered events.
Proposed API:
# Syscall hooks use the same signature as regular hooks
p.pre_syscall(syscall_num, hook)
p.post_syscall(syscall_num, hook)
This should support multiple hooks for a single syscall, but that support can be provided by the Python wrapper.
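A sketch of how the wrapper-side multi-hook support could look (class and method names are hypothetical, not the real pyda internals): pyda_core registers a single callback per event, and the Python wrapper fans out to every hook registered for that syscall number.

```python
from collections import defaultdict

class SyscallHookRegistry:
    """Wrapper-side fan-out: pyda_core only needs one callback per
    syscall event; the wrapper keeps the per-syscall hook lists."""
    def __init__(self):
        self._pre = defaultdict(list)
        self._post = defaultdict(list)

    def pre_syscall(self, num, hook):
        self._pre[num].append(hook)

    def post_syscall(self, num, hook):
        self._post[num].append(hook)

    def dispatch_pre(self, p, num):
        # Single entrypoint called from pyda_core's pre-syscall event.
        for hook in self._pre[num]:
            hook(p, num)

reg = SyscallHookRegistry()
calls = []
reg.pre_syscall(1, lambda p, num: calls.append(("first", num)))
reg.pre_syscall(1, lambda p, num: calls.append(("second", num)))
reg.dispatch_pre(None, 1)   # both hooks run, in registration order
```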
The way multithreading is supposed to work is that all of the threads share the same python interpreter state (and all of its globals). I've already added a thread_init hook in which users can do initial setup (e.g. updating hooks) or thread accounting.
Generally, I expect there to be very little that users want to do when a new thread spawns. All of the hooks are global to all threads by default, and you can check p.tid to see what thread you're on.
Right now multithreading is broken. There are likely multiple issues, but the most pressing one is that we don't have a way to instrument the thread entrypoint (so that the target thread can block while the thread init hook runs in python). For the main thread we were just using the module entrypoint, but that isn't right here.
The motivation for creating this tool in the first place was the observation that many dynamic binary instrumentation tasks can be accomplished with only a "clean call" interface: the ability to insert calls to instrumentation functions at specific instructions (and the target of these calls could be code in any language).
This is somewhat aspirational, but in situations where performance is more important it would be cool to be able to insert other types of instrumentation. Here's how I think that could work:
def my_instrumentation(builder):
    tmp = builder.load(builder.rsp + 0x20)          # loads
    builder.store(builder.rsp + 0x20, builder.rsi)  # stores
    builder.rdi += 1                                # register modification

p.instrument(0x100000, my_instrumentation)
The wrapper library would run the instrumentation function only once -- to generate an AST representing the desired computation (i.e. builder.rxx is a symbolic expression). This would then get lowered to dynamorio ops in pyda_core and inserted using the normal instrumentation APIs in dynamorio. Instrumentation functions would need to be written in a branchless way, but we could introduce a builder.if(...) expression that gets lowered to a branch or cmov.
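The run-once, record-an-AST idea can be sketched in a few lines (an assumed design, not the real pyda API): registers are symbolic nodes, and operator overloading records operations instead of performing them.

```python
class Expr:
    """Symbolic node: records an operation instead of performing it."""
    def __init__(self, op, *args):
        self.op, self.args = op, args
    def __add__(self, other):
        return Expr("add", self, other)

class Builder:
    """Hypothetical builder: load/store append ops to a recorded list
    that pyda_core could later lower to dynamorio instrumentation."""
    def __init__(self):
        self.ops = []
        self.rsp = Expr("reg", "rsp")
        self.rsi = Expr("reg", "rsi")
    def load(self, addr):
        e = Expr("load", addr)
        self.ops.append(e)
        return e
    def store(self, addr, value):
        self.ops.append(Expr("store", addr, value))

# Running the user's instrumentation function once produces an AST,
# with no actual memory access happening:
b = Builder()
tmp = b.load(b.rsp + 0x20)
b.store(b.rsp + 0x20, b.rsi)
```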
Something like libdebug/libdebug#53... or we could just write a libdebug DebugInterface and patch it into libdebug lmfao
drrun has an -attach <pid> option, but it almost certainly doesn't work right now, since we rely on the program executing its entrypoint.
If CPython uses XMM registers, we need to save and restore them. Otherwise, we can change the clean call arguments so that these registers are not saved.
b053746 left the syscall filter (dr_register_filter_syscall_event) unimplemented. For most use cases, we expect users to know which syscalls they want to hook, and we can direct dynamorio to intercept only those.
The basic implementation is that lib/pyda/process.py needs to report the list of hooked syscalls to pyda_core. Then, pyda_core will check this list when the dynamorio filter hook is called. Linear search is probably fine.
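The flow above can be sketched as follows (function names are hypothetical, mirroring the proposal rather than existing code): process.py reports hooked syscall numbers, and the filter callback checks the list so only those syscalls are intercepted.

```python
# Syscall numbers reported by lib/pyda/process.py when hooks register.
registered_syscalls = []

def report_syscall_hook(num):
    # Called from the Python wrapper as hooks are added.
    if num not in registered_syscalls:
        registered_syscalls.append(num)

def filter_syscall(num):
    # Called from the dr_register_filter_syscall_event callback:
    # linear search over a handful of entries is plenty fast.
    return num in registered_syscalls

report_syscall_hook(1)   # e.g. SYS_write on x86-64 Linux
```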
Register access should use DR enums rather than the mess of arch-specific strcmps currently in pyda_core.
# Run until a pc is reached
p.run_until(0x123456)
# Run until a 'breakpoint' is reached
p.cont()
How this works with hooks is TBD, but I would assume that hooks would execute at all times by default.
Potential use case (pseudocode):
while tcache_is_not_full(p, 0x100):
    p.send("a" * 0x100)
    p.run_until(0x12345678)
I think in theory you can just cast an int to a function pointer using ctypes.cast and call it, but I'm not sure. It would be cool to make these calls easier, e.g.:
e = ELF(...)
crc32_result = p.call[e.symbols["crc32"]](0, b"AAAA", 4)
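For reference, the ctypes approach mentioned above does work: an integer address can be cast back to a callable function pointer. A minimal sketch, using libc's abs() as a stand-in for an address found in a target binary (assumes Linux/macOS, where CDLL(None) exposes libc symbols):

```python
import ctypes

# Load symbols from the running process (libc included on Linux/macOS).
libc = ctypes.CDLL(None)

# Pretend this raw integer address came from the target binary;
# here we just take the address of libc's abs().
addr = ctypes.cast(libc.abs, ctypes.c_void_p).value

# Cast the int back to a callable function pointer and call it.
proto = ctypes.CFUNCTYPE(ctypes.c_int, ctypes.c_int)
fn = ctypes.cast(addr, proto)
result = fn(-5)
```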
It would also be nice to do this with some guardrails. For example:
These guardrails would be quite complicated to implement, so some analysis of use-cases is warranted...
You are running Pyda v0.1.1.
Traceback (most recent call last):
File "/examples/simple.py", line 5, in <module>
from pwn import *
File "/opt/custom-python-root/lib/python3.10/site-packages/pwn/__init__.py", line 6, in <module>
pwnlib.args.initialize()
File "/opt/custom-python-root/lib/python3.10/site-packages/pwnlib/args.py", line 209, in initialize
term.init()
File "/opt/custom-python-root/lib/python3.10/site-packages/pwnlib/term/__init__.py", line 78, in init
term.init()
File "/opt/custom-python-root/lib/python3.10/site-packages/pwnlib/term/term.py", line 110, in init
signal.signal(signal.SIGWINCH, handler_sigwinch)
File "/opt/custom-python-root/lib/python3.10/signal.py", line 56, in signal
handler = _signal.signal(_enum_to_int(signalnum), _enum_to_int(handler))
ValueError: signal only works in main thread of the main interpreter
This would require implementing a private loader in dynamorio, and fixing any lingering issues with dynamorio on macOS. I'd like to do this on ARM64, so this depends on #7
Currently, p.rip in a thread entry hook does not provide the correct pc of the thread entrypoint. This is actually not trivial to fix, since the RIP is lost deep in dynamorio.
p.r should be a pwntools tube: p.r.sendline(...), etc.
The tricky part is probably avoiding reading prints performed by the tool itself. Should we automatically redirect program stdout to a pipe, and leave tool stdout using regular stdout?
More precisely:
int orig_in  = dup(0);
int orig_out = dup(1);
int orig_err = dup(2);

int pipe1[2], pipe2[2], pipe3[2];
pipe(pipe1); pipe(pipe2); pipe(pipe3);

dup2(pipe1[0], 0);
dup2(pipe2[1], 1);
dup2(pipe3[1], 2);

stdin  = fdopen(orig_in,  "r");
stdout = fdopen(orig_out, "w");
stderr = fdopen(orig_err, "w");

// also need to modify dynamorio printing functions,
// since they use raw fds 0/1/2
Then the problem becomes: who is reading from the pipe, and when?
This is a bit messy (see #13), but my guess is that the right pattern is to make all recv calls implicit pyda_yields: .recvuntil would do a non-blocking yield before entering a blocking recv.
recv* calls would only be allowed in two contexts:
p.run()
). This paradigm is quite different from the current one: Python code and target code will now be 'running' at the same time! This can cause weird races when hooks get called, but we still have the GIL, so it should be fine.
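The yield-before-blocking-recv pattern could look something like this (a sketch with assumed names, not the real pyda API; a plain pipe stands in for the target's redirected stdout):

```python
import os

def make_recvuntil(read_fd, yield_cb):
    """Each recv does a non-blocking yield back to the target before
    blocking on the redirected-stdout pipe."""
    buf = b""
    def recvuntil(delim):
        nonlocal buf
        yield_cb()  # implicit pyda_yield: give the target a chance to run
        while delim not in buf:
            chunk = os.read(read_fd, 4096)  # blocks until the target writes
            if not chunk:
                break
            buf += chunk
        end = buf.find(delim)
        end = end + len(delim) if end >= 0 else len(buf)
        out, buf = buf[:end], buf[end:]
        return out
    return recvuntil

# Demo: pre-fill the pipe, then recv up to the newline.
r, w = os.pipe()
os.write(w, b"hello\nworld")
yields = []
recvuntil = make_recvuntil(r, lambda: yields.append("yield"))
line = recvuntil(b"\n")   # yields once, then reads b"hello\n"
```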
Currently it's not possible to update rip in hooks, so it's not possible to redirect execution.
For some reason, dynamorio doesn't respect updating the pc field in the mcontext struct. There should be a way to do this in dynamorio, and we just need to copy that.
When patching an application, it's common that you might break it and cause a segfault. It would be nice if these could be reported and handled somewhat gracefully. There is no good way to debug this sort of thing, since the code would be executing in the code cache.
If dynamorio provides any interface to inspect the origin of code in the code cache (i.e. to map an instruction back to the original binary), reporting that information would be useful. The information here suggests that generating backtraces may be complicated, but we could also attempt that.