Comments (34)

rhelmot avatar rhelmot commented on July 22, 2024

That binary in particular is a position-independent executable (PIE), which means it has no base address. CLE tells you whether the main binary contains position-independent code via loader.main_bin.pic, which is true in your case.

At some point in the dev process we made the decision to have PIEs automatically loaded with a base address of 0x400000, because it's useful when address zero doesn't map to anything.

If you like, we can add a logger message warning that the binary is being loaded at that base address for PIEs? If you want it loaded at zero, you can specify that with b = angr.Project('filename', load_options={'main_opts': {'custom_base_addr': 0}}) or... something like that? check the docstring for cle.Loader.

from angr-doc.

bannsec avatar bannsec commented on July 22, 2024

Yeah, that's what I've ended up doing. A warning would be good though.

bannsec avatar bannsec commented on July 22, 2024

On the same problem, I've found this:

UnsupportedSyscallError('no syscall 13 for arch AMD64',)

The binary is creating an alarm for itself (connection timeout i think). For now I'll just add a handler or something to jump over it, but are signals a general limitation of angr? feature in the future? etc

rhelmot avatar rhelmot commented on July 22, 2024

There are a lot, a lot, a lot of unsupported syscalls. If you want something to work, you're probably better off implementing SimProcedures for the individual library functions your binaries use, instead of implementing syscalls.

On the other hand, that alarm behavior is really unimportant for analysis purposes, and the default behavior of an unsupported syscall is to do nothing and return an unconstrained symbolic value, so that error is probably nothing to worry about.

bannsec avatar bannsec commented on July 22, 2024

sorry to keep pestering. i've noticed that the explorer seems to run off on its own sometimes. is there a way for angr to tell you what portions of the code it is spending most time on so that the end user can try to optimize around it?

zardus avatar zardus commented on July 22, 2024

There isn't at the moment, but that'd be an interesting feature. In the meantime, this code could be useful:

import angr
import itertools
import collections

p = angr.Project('/bin/bash')
pg = p.factory.path_group().step(170)
print collections.Counter(itertools.chain(*pg.mp_active.addr_backtrace.mp_items))

That'll step bash 170 times (at which point, it splits a few times), then grabs the backtrace of every path, combines them all, and passes them to collections.Counter, which gives you a count for the most common basic blocks executed.

This suffers from the problem that a basic block is counted once per path where it appears rather than every time it was executed. That is, if you execute blocks A->B and then branch and execute B->C1 and B->C2 in two different paths, A and B will end up being counted twice. Solving that requires analyzing the path history and is probably a decent-sized pain in the ass.
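
To make the over-counting concrete, here is a small sketch in plain Python 3 with no angr involved. The block names and the two backtraces are made up to mirror the A->B->C1 / A->B->C2 example above. The prefix-based deduplication is only one possible approach, and it assumes that two paths sharing a history prefix really did share those executions before branching, which may not match how angr tracks path history.

```python
import collections
import itertools

# Hypothetical backtraces for two paths that branched after block B.
backtraces = [
    ("A", "B", "C1"),
    ("A", "B", "C2"),
]

# Naive count: every block is counted once per path it appears in.
naive = collections.Counter(itertools.chain(*backtraces))

# Prefix-based count: a block execution is identified by the whole
# history leading up to it, so shared prefixes are counted only once.
seen_prefixes = set()
deduped = collections.Counter()
for trace in backtraces:
    for end in range(1, len(trace) + 1):
        prefix = trace[:end]
        if prefix not in seen_prefixes:
            seen_prefixes.add(prefix)
            deduped[prefix[-1]] += 1

print(sorted(naive.items()))
print(sorted(deduped.items()))
```

In the naive count A and B each show up twice, while the prefix-based count records them once each. A block that is genuinely executed twice on one path (a loop) still gets counted twice, because its two executions have different prefixes.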

One other thing you can do is monkeypatch the lifter. Of course, this is super ugly and you should never do it, but if you were to do it, it'd look like this:

import angr
import collections

p = angr.Project('/bin/bash')
_old_lifter = p.factory.block
counter = collections.Counter()
def count_block(*args, **kwargs):
    counter[args[0]] += 1
    return _old_lifter(*args, **kwargs)
p.factory.block = count_block

pg = p.factory.path_group().step(170)
print counter.most_common(10)

rhelmot avatar rhelmot commented on July 22, 2024

Alternately, if you want an easier approach: For a given path, you can look at all the steps it's made with path.backtrace. Look through all the paths (the ones you care about are gonna be in surveyor.active and surveyor.spilled), checking for ones where len(path.backtrace) is pretty huge, and then look where it's running a bunch of blocks you weren't expecting. In all likelihood it's just stuck inside libc, in which case the solution is to write a SimProcedure for whatever library function you were calling.

bannsec avatar bannsec commented on July 22, 2024

Does angr support global variable spaces? I'm running into a case where angr just errors out whenever I give it a constraint on a global. To test this, I created a simple xor program that I theoretically should be able to use angr to decrypt.

#include <stdio.h>
#include <string.h>

unsigned char *key = "\xa5\x57\x03\x4a\xa4\xdc\xff\x7c\xfa\xf5\x1f\x52\x79\x6a\x61\xcb";
unsigned char inbuf[256];
unsigned char outbuf[256];

void xor_encrypt()//char *key, char *string)
{
    int i, string_length = strlen(inbuf);
    for(i=0; i<string_length; i++)
    {
        outbuf[i] = inbuf[i] ^ key[i%16];
    }
}


int main() {
  int i;

  printf("Input To Hash: ");

  fgets(inbuf,256,stdin);

  size_t ln = strlen(inbuf) - 1;
  if (inbuf[ln] == '\n')
    inbuf[ln] = '\0';

  xor_encrypt();//key,buf);

  for (i=0; i < ln; i++) {
   printf("%.2x ",outbuf[i]);
  }

  printf("\n");

  return 0;
}

Then I use the following commands for angr. If I add the constraints before asking it to explore, it just fails at the exploration portion:

import angr
import simuvex
import logging

b = angr.Project("a.out",use_sim_procedures=True)

# Get the addresses
inbufAddr = b.loader.main_bin.get_symbol("inbuf").addr
outbufAddr = b.loader.main_bin.get_symbol("outbuf").addr

e = b.surveyors.Explorer(find=(0x400724))

e.run()

s = e.found[0].state

outbuf = s.memory.load(outbufAddr,36)
inbuf = s.memory.load(inbufAddr,36)

s.add_constraints(outbuf[7:0] != 0)
s.se.any_str(inbuf)

ltfish avatar ltfish commented on July 22, 2024

One thing that I suspect is happening is that e.run() returned the very first path it found, which happened to be the one with strlen(...) == 0.

Can you check more paths, other than only the first path returned?

bannsec avatar bannsec commented on July 22, 2024

It says it has 2 active paths as well. Is there an easy way to keep running those?

rhelmot avatar rhelmot commented on July 22, 2024

There's a parameter num_find on the Explorer constructor.

bannsec avatar bannsec commented on July 22, 2024

e = b.surveyors.Explorer(find=(0x400724),num_find=3)
e.run()
<Explorer with paths: 0 active, 0 spilled, 1 deadended, 2 errored, 0 unconstrained, 1 found, 0 avoided, 0 deviating, 0 looping, 0 lost>
e.errored[0]
<Errored Path with 31 runs (at 0x400674, ClaripyOperationError)>
e.errored[0].error
ClaripyOperationError("can't reverse non-byte sized bitvectors",)

bannsec avatar bannsec commented on July 22, 2024

well that didn't format right, but the other paths errored. still only one found path

rhelmot avatar rhelmot commented on July 22, 2024

Fuck. We fixed a bug just like that yesterday. Let me make sure everything is clean to push.

bannsec avatar bannsec commented on July 22, 2024

I installed from pip a few days back. maybe i need to pull from git?

ltfish avatar ltfish commented on July 22, 2024

@rhelmot You might want to wait for my VSA_DDG test to pass...

bannsec avatar bannsec commented on July 22, 2024

Here's some more of the error:

In [12]: e.run()
WARNING:simuvex.vex.irsb:<SimIRSB 0x40070d> hit an when analyzing statements
Traceback (most recent call last):
  File "/home/jess/symbolic/angr-dev/simuvex/simuvex/vex/irsb.py", line 92, in _handle_irsb
    self._handle_statements()
  File "/home/jess/symbolic/angr-dev/simuvex/simuvex/vex/irsb.py", line 207, in _handle_statements
    s_stmt = translate_stmt(self.irsb, stmt_idx, self.last_imark, self.state)
  File "/home/jess/symbolic/angr-dev/simuvex/simuvex/vex/statements/__init__.py", line 31, in translate_stmt
    s.process()
  File "/home/jess/symbolic/angr-dev/simuvex/simuvex/vex/statements/base.py", line 26, in process
    self._execute()
  File "/home/jess/symbolic/angr-dev/simuvex/simuvex/vex/statements/store.py", line 34, in _execute
    self.state.memory.store(addr.expr, data_endianness, action=a)
  File "/home/jess/symbolic/angr-dev/simuvex/simuvex/storage/memory.py", line 163, in store
    self._store(request)
  File "/home/jess/symbolic/angr-dev/simuvex/simuvex/plugins/symbolic_memory.py", line 469, in _store
    req.actual_addresses = self.concretize_write_addr(req.addr)
  File "/home/jess/symbolic/angr-dev/simuvex/simuvex/plugins/symbolic_memory.py", line 202, in concretize_write_addr
    return self._concretize_addr(addr, strategy=strategy, limit=limit)
  File "/home/jess/symbolic/angr-dev/simuvex/simuvex/plugins/symbolic_memory.py", line 168, in _concretize_addr
    raise SimMemoryAddressError("Trying to concretize with unsat constraints.")
SimMemoryAddressError: Trying to concretize with unsat constraints.

rhelmot avatar rhelmot commented on July 22, 2024

This is a non-error. It just means the path is unsatisfiable and should be thrown out. The actual analysis is still running while this prints out; it's just part of a logger message.

rhelmot avatar rhelmot commented on July 22, 2024

Oh, and: we updated the versions of everything on pip last night! Upgrade angr and see if you still get the bitvector-reversing errors.

bannsec avatar bannsec commented on July 22, 2024

Nice. The errors are gone. Though for whatever reason it still fails with:

s.se.any_str(inbuf)
Traceback (most recent call last):
File "", line 1, in
File "/usr/local/lib/python2.7/dist-packages/simuvex/plugins/solver.py", line 269, in any_str
return self.any_n_str(e, 1, extra_constraints=extra_constraints)[0]
IndexError: list index out of range

I think there should be a patch for that part to at least return a message to the user saying that there were no solutions. That said, am I asking too much of angr/symbolic execution to solve the xor as described above?

rhelmot avatar rhelmot commented on July 22, 2024

The problem with the code you provided is that you only check the first found path on the explorer. The only thing affecting the length of a path in your program is how many times it goes around the xor_encrypt loop, and going around the loop zero times corresponds to the case where the user entered an empty string. Therefore, the path to which you add the constraint has never had any data written to outbuf, so the first char in outbuf can never be anything but zero, which is why the state goes unsat after you add your constraint.
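
The zero-iteration case can be reproduced without angr. The following is a plain Python 3 model of the C program posted earlier in this thread, not anything angr runs; the function name and key are copied from that source, and the zero-filled bytearray stands in for the zero-initialized global outbuf.

```python
# Key bytes copied from the C source earlier in the thread.
KEY = bytes.fromhex("a557034aa4dcff7cfaf51f52796a61cb")


def xor_encrypt(inbuf):
    """Mirror the C loop: outbuf starts zeroed, one byte written per input byte."""
    outbuf = bytearray(256)
    for i, byte in enumerate(inbuf):
        outbuf[i] = byte ^ KEY[i % 16]
    return outbuf


empty = xor_encrypt(b"")      # loop body never runs
one_char = xor_encrypt(b"A")  # loop body runs once

print(empty[0])
print(one_char[0])
```

With an empty input nothing is ever written, so every byte of outbuf stays zero and a constraint demanding a non-zero byte cannot be satisfied on that path.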

rhelmot avatar rhelmot commented on July 22, 2024

I do agree with you though, there should be a better error message there.

bannsec avatar bannsec commented on July 22, 2024

I really must be missing something here then. I do the num_find=5 for instance, and use s = e.found[4].state. Even with that, adding that one constraint causes it to fail. I've even tried adding the following at the beginning with no luck in getting the constraints solved:

def stringlen(state):
        state.regs.rax = 16

# Hook the strlen on xor to define the input size
b.hook(0x400663,stringlen,length=5)

If I understand it correctly, that should go to the hook once it gets to the strlen check in the xor function, and instead of checking it will just return that the length is 16. Using that hook still causes it to fail to solve the constraint.

rhelmot avatar rhelmot commented on July 22, 2024

Ah, I see. The problem now is that by default, memory.load loads big-endian values, which is intentional so that you can load strings. outbuf[7:0] pulls out the eight least significant bits in outbuf, which in a big-endian load, correspond to the byte in memory with the largest address. If I allow the loop to run 36 times, I can satisfiably set the constraint.

import angr
import simuvex
import logging

b = angr.Project("a.out",use_sim_procedures=True)

# Get the addresses
inbufAddr = b.loader.main_bin.get_symbol("inbuf").addr
outbufAddr = b.loader.main_bin.get_symbol("outbuf").addr

e = b.surveyors.Explorer(find=(0x4006F5), num_find=50)
x = 0

for found in e:
    x += 1
    print 'found', x
    s = found.state

    outbuf = s.memory.load(outbufAddr,36)
    inbuf = s.memory.load(inbufAddr,36)

    s.add_constraints(outbuf[7:0] != 0)
    if s.satisfiable():
        print repr(s.se.any_str(inbuf))
        break

import IPython; IPython.embed()

Prints found 1 found 2 ... found 37 and then drops into the shell.
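
The big-endian point can be checked with ordinary integers. This is a sketch in plain Python 3 that uses int.from_bytes as a stand-in for a 36-byte memory.load; the memory contents are made up, and the bit ranges in the comments are my reading of the slicing notation used above (288 bits total for 36 bytes), not output from angr.

```python
# Stand-in for 36 bytes of memory at outbuf: byte i holds the value i + 1.
memory = bytes(range(1, 37))

# A big-endian load puts the lowest address in the most significant bits.
value = int.from_bytes(memory, "big")
total_bits = len(memory) * 8

low_byte = value & 0xFF                         # analogous to outbuf[7:0]
high_byte = (value >> (total_bits - 8)) & 0xFF  # analogous to outbuf[287:280]

print(low_byte)
print(high_byte)
```

The low eight bits come from the last byte of the buffer and the top eight bits from the first, which is why a constraint on outbuf[7:0] only becomes satisfiable once the loop has written all 36 bytes.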

bannsec avatar bannsec commented on July 22, 2024

Thanks! Got it to work. That was seriously driving me nuts. Btw, is there a more elegant way to set the constraints than the following?

i = 0
for c in "".join("c3 3b 62 2d df 8f 86 11 98 9a 73 3b 1a 35 04 b3 c0 34 76 3e cd b3 91 23 9c 9a 6d 0d 0d 02 04 94 f2 1e 4d 37".split(" ")).decode('hex')[::-1]:
        s.add_constraints(outbuf[i + 7:i] == ord(c))
        i += 8

It works, but seems a little janky.

rhelmot avatar rhelmot commented on July 22, 2024

number = int('c3 3b 62 2d df 8f 86 11 98 9a 73 3b 1a 35 04 b3 c0 34 76 3e cd b3 91 23 9c 9a 6d 0d 0d 02 04 94 f2 1e 4d 37'.replace(' ', ''), 16)
s.add_constraints(outbuf == number)
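
As a sanity check that the one-shot number carries the same information as the byte-by-byte loop, here is a plain Python 3 sketch with no angr involved. It uses shifts and masks on an ordinary integer in place of bitvector slicing; the variable names are illustrative.

```python
hex_dump = ("c3 3b 62 2d df 8f 86 11 98 9a 73 3b 1a 35 04 b3 c0 34 76 3e "
            "cd b3 91 23 9c 9a 6d 0d 0d 02 04 94 f2 1e 4d 37")

# One-shot form: the whole dump as a single big integer.
number = int(hex_dump.replace(" ", ""), 16)

# Byte-wise form: walk the bytes in reverse, 8 bits at a time from bit 0,
# the same order the [::-1] loop uses.
data = bytes.fromhex(hex_dump.replace(" ", ""))
per_byte_ok = all(
    (number >> (8 * i)) & 0xFF == byte
    for i, byte in enumerate(reversed(data))
)

print(len(data))
print(per_byte_ok)
```

Every 8-bit slice of the integer matches the corresponding byte of the reversed dump, so a single equality against the 36-byte big-endian load expresses the same 36 constraints.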

zardus avatar zardus commented on July 22, 2024

Do you mean int instead of hex?
On Sep 10, 2015 8:07 PM, "Andrew Dutcher" wrote:

number = hex('c3 3b 62 2d df 8f 86 11 98 9a 73 3b 1a 35 04 b3 c0 34 76 3e cd b3 91 23 9c 9a 6d 0d 0d 02 04 94 f2 1e 4d 37'.replace(' ', ''), 16)
s.add_constraints(outbuf == number)


rhelmot avatar rhelmot commented on July 22, 2024

I edited that post within four seconds of posting it, snap!

zardus avatar zardus commented on July 22, 2024

Ah, poop, I was on email :-)

bannsec avatar bannsec commented on July 22, 2024

Ah. I tried doing the same thing as a string itself. Converting it into just a really large number and setting it in one shot (outbuf == number) worked perfectly. Thanks!

bannsec avatar bannsec commented on July 22, 2024

I've seen you guys use both Explorer and Path Groups. Is one better than the other?

ltfish avatar ltfish commented on July 22, 2024

You should try to use PathGroup instead of Explorer. PathGroup is more flexible. Explorer belongs to the past!

bannsec avatar bannsec commented on July 22, 2024

Also, how does Angr decide when it has hit a deadend? I'm playing around with a CTF challenge that Angr returns claiming all it was able to find were a bunch of deadends and a bunch of avoided paths.

ltfish avatar ltfish commented on July 22, 2024

A path deadends when no feasible successor can be found.
