
acquire's People

Contributors

cecinestpasunepipe, devjoost, horofic, janstarke, joost-j, jscu-cni, martinvanhensbergen, miauwkeru, poeloe, pyrco, rickvandreunen, ruzzle, schamper, tobraha, wouter-jansen, zawadidone


acquire's Issues

Collect CarbonBlack logs

CarbonBlack logs can contain interesting information. They reside in the following directory on Windows:

  • c:\ProgramData\CarbonBlack\Logs

Some example log files in this directory:

  • confer.log and confer.log.*.zip

  • cblr.log

  • SensorAlarms.log

However, all logs in this directory can be of interest.

Name services configuration + binaries

  • /etc/nsswitch.conf
  • all referred libnss_* modules

Parse the contents of nsswitch.conf to records. Ideally, acquire collects the paths found in this config file automatically.
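As a rough illustration of the parsing step, the sketch below turns nsswitch.conf text into (database, service) records and derives file name patterns for the matching NSS modules. The names NsswitchRecord, parse_nsswitch and libnss_globs are made up for this example and are not part of acquire or dissect.target. The only fact relied on is the glibc convention that a service called X is implemented by a library named libnss_X.so.*; the directory those libraries live in differs per distribution, so it is left out.

```python
import re
from typing import Iterable, Iterator, NamedTuple


class NsswitchRecord(NamedTuple):
    database: str  # e.g. "passwd" or "hosts"
    service: str   # e.g. "files", "dns", "sss"


def parse_nsswitch(text: str) -> Iterator[NsswitchRecord]:
    """Yield one record per (database, service) pair found in nsswitch.conf text."""
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        if ":" not in line:
            continue
        database, _, sources = line.partition(":")
        # Action items such as [NOTFOUND=return] are not services, so remove them.
        sources = re.sub(r"\[[^\]]*\]", " ", sources)
        for service in sources.split():
            yield NsswitchRecord(database.strip(), service)


def libnss_globs(records: Iterable[NsswitchRecord]) -> list[str]:
    """Hypothetical helper: file name patterns for the referenced libnss_* modules."""
    services = sorted({record.service for record in records})
    return [f"libnss_{service}.so*" for service in services]
```

The resulting patterns could then be combined with the library directories of the target to build the list of files to collect.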

Make acquire modules OS aware

Design requirements:

  • Collection should be done for the whole set of OS levels (e.g. Linux & fortigate in case of collection for a fortigate machine). The set of levels could be determined by looking at the (super) classes of OSPlugin for the specific target at hand.
  • Determining this set of OS levels is best implemented in dissect.target, to have a single interface for this type of information. Take a sneak peek at how to replace the current system of determining OSes (linux, windows, mac, unknown) with this interface.
  • A Module should be able to service multiple OSes. This means the SPECS etc. in a Module subclass should be divided based on OS.
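The requirements above could be sketched roughly as follows. All classes here are stand-ins written for this example: they are not the real dissect.target OSPlugin hierarchy, and OS_NAMES, os_levels and the per-OS SPEC mapping are assumed names. The point is only to show how walking the class hierarchy yields the set of OS levels, and how a module could keep its specs divided per OS.

```python
class OSPlugin:
    """Stand-in base class, not the real dissect.target OSPlugin."""


class LinuxPlugin(OSPlugin):
    pass


class FortigatePlugin(LinuxPlugin):
    pass


# Assumed mapping from plugin class to OS level name.
OS_NAMES = {LinuxPlugin: "linux", FortigatePlugin: "fortigate"}


def os_levels(plugin_cls: type) -> list[str]:
    """Return the OS levels of a plugin class, most specific first."""
    return [OS_NAMES[cls] for cls in plugin_cls.__mro__ if cls in OS_NAMES]


class Module:
    # Hypothetical layout: specs divided by OS level name.
    SPEC: dict[str, list[tuple[str, str]]] = {}

    @classmethod
    def specs_for(cls, levels: list[str]) -> list[tuple[str, str]]:
        """Collect the specs of every OS level that applies to the target."""
        specs = []
        for level in levels:
            specs.extend(cls.SPEC.get(level, []))
        return specs


class Etc(Module):
    SPEC = {
        "linux": [("dir", "/etc")],
        "fortigate": [("dir", "/data/etc")],  # illustrative path only
        "windows": [("dir", "sysvol/windows/system32/drivers/etc")],
    }
```

For a fortigate target, `Etc.specs_for(os_levels(FortigatePlugin))` returns the fortigate specs followed by the Linux ones and skips the Windows entry.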

Acquire Docker container metadata and filesystems

We recently had a case where relevant logs (and other traces) were stored in Docker volumes. It would be nice to have a way (a docker plugin?) to acquire the container and image metadata from /var/lib/docker as well as the overlayfs layers of the actual containers (i.e. not those of the images, since that would be both less interesting and a lot of data).

Mac OS typo Applications Support

The acquire globs for Mac OS files related to Google Chrome and Firefox use the wrong folder name (note the extra "s").

Applications Support -> Application Support

On Mac OS Monterey version 12.6

ls /Users/*/Library/Applications\ Support/
ls: /Users/*/Library/Applications Support/: No such file or directory

ls /Users/*/Library/Application\ Support/
Google
[...]


Research alternative to get RDP cache

When running qwinsta on Windows 11, it seems that it isn't compatible with the version of Windows used, which is odd.

  • Check whether we can run the executable in compatibility mode using cmd

Pyoxidizer musl execution on VMware ESXi

Using the Pyoxidizer configuration (#109) I was able to build a static musl binary, but when executing the binary on VMware ESXi 7 the execution fails while it works on the Docker image (quay.io/pypa/manylinux2014_x86_64).

The error shown below is triggered because it cannot obtain the current path of the executable, since /proc/self/exe (https://github.com/indygreg/PyOxidizer/blob/b78b0cb75f4317c45408bbc9a569c062c482c679/pyembed/src/config.rs#L478) is not available on VMware ESXi 7.

I don't understand well enough how Pyoxidizer works to determine what causes this error and how this issue can be resolved, but I will look into this.

*VMWare ESXi 7*

vmware -v
VMware ESXi 7.0.3 build-21930508

./acquire --help
error instantiating embedded Python interpreter: could not obtain current executable

strace ./acquire
[...]
mmap(0x7221612000, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x7221612000
readlink("/proc/self/exe", 0x7221612000, 256) = -1 EINVAL (Invalid argument)
[...]

ls -al /proc/self/exe
ls: /proc/self/exe: No such file or directory

*quay.io/pypa/manylinux2014_x86_64*

acquire --help
[...]
If no options are given, the collection profile 'default' is used.

ls -al /proc/self/exe
lrwxrwxrwx 1 root root 0 Nov 20 16:23 /proc/self/exe -> /usr/bin/ls

*Alpine Linux 3.18*

acquire --help
[...]
If no options are given, the collection profile 'default' is used.

strace ./acquire --help
[...]
readlink("/proc/self/exe", "/tmp/acquire", 256) = 12
open("/tmp/acquire", O_RDONLY|O_NONBLOCK|O_CLOEXEC|O_PATH) = 3
readlink("/proc/self/fd/3", "/tmp/acquire", 4095) = 12
fstat(3, {st_mode=S_IFREG|0750, st_size=112942000, ...}) = 0
stat("/tmp/acquire", {st_mode=S_IFREG|0750, st_size=112942000, ...}) = 0
close(3)
[...]


ls -al /proc/self/exe
lrwxrwxrwx    1 root     root             0 Nov 20 16:37 /proc/self/exe -> /bin/busybox

Inconsistent output paths when acquiring Windows container from Linux

When acquiring any non-live Windows container (HDD, VM image) from Linux with a case-sensitive filesystem, the output tar/directory contains duplicate directories with mixed case:

For example, running acquire windows-vm.qcow2 on Linux with btrfs gives the following directories (truncated for readability):

$ tree
.
└── C:
    ├── $Recycle.bin
    ├── $Recycle.Bin
    ├── windows
    │   ├── appcompat
    │   ├── system32
    │   │   ├── config
    │   │   ├── drivers
    │   │   ├── sru
    │   │   ├── tasks
    │   │   ├── wbem
    │   │   └── winevt
    │   └── tasks
    └── Windows
        └── System32
            └── WDI

Notice the duplicated $Recycle.Bin, Windows and System32 directories with different case.
I managed to somewhat fix it by replacing all sysvol/windows/ and /sysvol/windows/system32 strings in acquire.py with the proper case, but this method also requires similar changes in other dissect libraries, since acquire calls them to get collection paths. Surely there is a better fix for this than specifying the correct case in collection paths, e.g. using the proper path from the filesystem for the output path.
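One way to picture the "use the proper path from the filesystem" idea is to resolve every path component case-insensitively against the actual directory listing and use the stored spelling for the output path. The sketch below is only an illustration: resolve_case is a made-up name, the listdir callable stands in for whatever directory listing the target filesystem offers, and str.lower() is a simplification of how NTFS compares names.

```python
from typing import Callable, Iterable


def resolve_case(path: str, listdir: Callable[[str], Iterable[str]]) -> str:
    """Rewrite each component of `path` to the spelling found on the filesystem.

    Components that cannot be found keep the spelling they were given.
    """
    resolved = ""
    for part in path.strip("/").split("/"):
        names = listdir(resolved or "/")
        # Take the first entry that matches when case is ignored.
        match = next((name for name in names if name.lower() == part.lower()), part)
        resolved = f"{resolved}/{match}"
    return resolved


# Tiny in-memory stand-in for a Windows volume, used only for this example.
TREE = {
    "/": ["$Recycle.Bin", "Windows"],
    "/Windows": ["System32", "appcompat"],
    "/Windows/System32": ["config", "WDI"],
}


def fake_listdir(directory: str) -> list[str]:
    return TREE.get(directory, [])
```

With this stand-in, both `/windows/system32/config` and `/Windows/System32/CONFIG` resolve to `/Windows/System32/config`, so they would end up in a single output directory.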

acquire produces corrupt tar file when trying to collect a file which cannot be read for its whole length

I am using acquire on a raw image to get out some files. Within this image there are a couple of special (sparse? corrupt?) files which originate from a Firefox profile. The files in question are all sqlite files:

/root/.mozilla/firefox/[some-id].default-esr/cookies.sqlite and some other sqlite files found in this profile folder.

This file (cookies.sqlite) is reported to be 524288 bytes in size by ls or stat. But when actually read, it stops after 98304 bytes (in my experiments cp, hexdump and md5sum all did not go beyond that). After that it is only zero bytes anyway, so I doubt this is a coincidence.

Now when using the dir output type it seems to work without errors:


  Module |    Success |    Failure |    Missing |      Empty
------------------------------------------------------------
cli-args |      11477 |            |            |
------------------------------------------------------------
   Total |      11477 |          0 |          0 |          0
------------------------------------------------------------

It should be noted though that the file cookies.sqlite for example was only 98304 bytes at destination as a result. I guess this behavior is ok depending on how you look at it. But a warning message might be appropriate.

The more serious problem happened when using the tar output type. Judging by the log this works mostly fine. In my example there are 11477 files of which only 5 failed:


Done collecting artifacts:
------------------------------------------------------------
  Module |    Success |    Failure |    Missing |      Empty
------------------------------------------------------------
cli-args |      11472 |          5 |            |
------------------------------------------------------------
   Total |      11472 |          5 |          0 |          0
------------------------------------------------------------

The 5 failed files were the sqlite files mentioned before.

But now when inspecting the tar file it seems to be corrupted:

tar -tvf out.tar | wc -l
tar: Skipping to next header
tar: Exiting with failure status due to previous errors
7152

It seems to only contain 7152 of the 11472 files. 7-Zip on Windows opens the tar even without complaints but many files are missing / not shown.

When using the tar switch -i/--ignore-zeros it finds at least all the files that were successfully collected.

tar -itvf out.tar | wc -l
tar: Skipping to next header
tar: Skipping to next header
tar: Skipping to next header
tar: Exiting with failure status due to previous errors
11469

What I learned is that if the file to be collected cannot be read for the length attributed to it, adding it to the tar file will fail, and the tar becomes somewhat corrupted.

What the log says:

[2023-05-02 17:09:16,850] [ERROR] Failed to collect file
Traceback (most recent call last):
  File "/root/py3venv_dissect/lib/python3.11/site-packages/acquire/collector.py", line 279, in collect_file
    self.output.write_entry(outpath, entry, size)
  File "/root/py3venv_dissect/lib/python3.11/site-packages/acquire/outputs/base.py", line 54, in write_entry
    self.write(output_path, fh, entry, size)
  File "/root/py3venv_dissect/lib/python3.11/site-packages/acquire/outputs/tar.py", line 103, in write
    self.tar.addfile(info, fh)
  File "/usr/lib/python3.11/tarfile.py", line 2030, in addfile
    copyfileobj(fileobj, self.fileobj, tarinfo.size, bufsize=bufsize)
  File "/usr/lib/python3.11/tarfile.py", line 250, in copyfileobj
    raise exception("unexpected end of data")
OSError: unexpected end of data
[2023-05-02 17:09:16,853] [INFO ] - Collecting file /root/.mozilla/firefox/[some-id].default-esr/cookies.sqlite: OSError('unexpected end of data')

I tried to reproduce this error by building a sparse file manually with truncate -s 1M file.img and adding it to a tar using tarfile, but to my surprise it worked. Maybe Firefox's sqlite files are created in some special way?
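A sparse file made with truncate reads back as zeros for its full reported length, so it never produces the short read seen in the traceback. The condition can be imitated with a file object that holds fewer bytes than the size placed in the tar header: tarfile writes the header first and then raises while copying the data, which leaves an entry whose header promises more bytes than follow. The sketch below shows one possible mitigation, assumed here for illustration and not taken from acquire: wrap the file object so that short reads are padded with zero bytes up to the announced size. ZeroPaddedReader and add_padded are hypothetical names.

```python
import io
import tarfile


class ZeroPaddedReader:
    """Wrap a file object so that exactly `size` bytes can always be read.

    Bytes the underlying file cannot deliver are replaced by zero bytes.
    """

    def __init__(self, fh, size: int):
        self.fh = fh
        self.remaining = size
        self.padded = 0  # number of zero bytes added so far

    def read(self, n: int = -1) -> bytes:
        if n is None or n < 0 or n > self.remaining:
            n = self.remaining
        chunks, got = [], 0
        while got < n:
            chunk = self.fh.read(n - got)
            if not chunk:  # the underlying file ended early
                break
            chunks.append(chunk)
            got += len(chunk)
        data = b"".join(chunks)
        if len(data) < n:
            self.padded += n - len(data)
            data += b"\x00" * (n - len(data))
        self.remaining -= n
        return data


def add_padded(tar: tarfile.TarFile, name: str, fh, size: int) -> int:
    """Add `fh` to `tar` under `name`; return how many bytes had to be padded."""
    info = tarfile.TarInfo(name)
    info.size = size
    reader = ZeroPaddedReader(fh, size)
    tar.addfile(info, reader)
    return reader.padded
```

A non-zero return value would be a natural place to log the kind of warning suggested above for the dir output, since the stored file is then longer than what could actually be read.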

Let me know if I could provide more info.

Extend method for NamedObject.from_directory_information

        # TODO: Let this method generate different types of NamedObjects according to its type
        # Ideally they each handle their own behaviour themselves, and clean up after themselves.
        # NOTE: this does require some additional methods to get added, such as maybe open, close, and details(?)
        # Then it would also be nice to have a contextmanager that handles the open and closing of the object.

Make acquire subprocess output capture non-blocking

There is no default way to do this on both Windows & Linux.

The best solution is to have a thread do the reading and let that block. An example can be found here:

http://eyalarubas.com/python-subproc-nonblock.html

Another way to do it is to make the way the winpmem module does it the default, with intermediate files where stdout and stderr are stored. This circumvents the need to implement non-blocking pipes.
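The thread-based approach could look roughly like the sketch below, which uses only the standard library and behaves the same on Windows and Linux. One reader thread per pipe does the blocking reads and pushes lines onto a queue, so the caller never blocks on the pipes themselves. The names _pump and run_captured are made up for this example and do not refer to existing acquire functions.

```python
import queue
import subprocess
import sys
import threading


def _pump(stream, name: str, out: queue.Queue) -> None:
    """Blocking reader: runs in its own thread and forwards lines to the queue."""
    for line in iter(stream.readline, b""):
        out.put((name, line))
    stream.close()
    out.put((name, None))  # sentinel: this stream is finished


def run_captured(args: list[str]) -> tuple[int, list[tuple[str, bytes]]]:
    """Run a command and collect its stdout and stderr lines without blocking reads."""
    proc = subprocess.Popen(args, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    out: queue.Queue = queue.Queue()
    streams = (("stdout", proc.stdout), ("stderr", proc.stderr))
    for name, stream in streams:
        threading.Thread(target=_pump, args=(stream, name, out), daemon=True).start()

    lines = []
    open_streams = len(streams)
    while open_streams:
        try:
            name, line = out.get(timeout=0.1)
        except queue.Empty:
            continue  # nothing to read yet; the caller could do other work here
        if line is None:
            open_streams -= 1
        else:
            lines.append((name, line))
    return proc.wait(), lines
```

For example, `run_captured([sys.executable, "-c", "print('hello')"])` returns the exit code together with the captured stdout line.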

Add option to acquire to collect UEFI

The UEFI partition is FAT based, and dissect.fat should just work. Might need some investigation into the differences between Windows and Linux based systems.

Add temporary suffix while Acquire is still running

During historical engagements, there have been cases where Acquire was deployed to many systems, with all resulting .tar.gz output files being written to a single fileshare. Although the .log file indicates the completion status of the Acquire run, it is hard to assess the status without opening and inspecting this file. As a result, Acquire output files are sometimes copied before the Acquire process has finished, leading to corrupt and/or incomplete output files.

Would it be an idea to add an additional suffix to the output archive/file/directory while Acquire is still running? E.g. start by writing all results to the file:

HOSTNAME_20240625102705.tar.gz.running

and once the process has finished with success, rename this file to

HOSTNAME_20240625102705.tar.gz

In this way, it's easier to spot which Acquires have not yet finished and should not be copied. Obviously the .running suffix can be anything, maybe .tmp or .active are more suitable.
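A minimal sketch of the proposed behaviour, assuming the output is a single file or directory that can be renamed within the same filesystem. The context manager running_output and its suffix parameter are names invented for this example; acquire's real output classes are not shown.

```python
import contextlib
import os
from pathlib import Path


@contextlib.contextmanager
def running_output(final_path, suffix: str = ".running"):
    """Yield a temporary path; rename it to `final_path` only on success."""
    final_path = Path(final_path)
    temp_path = final_path.with_name(final_path.name + suffix)
    yield temp_path
    # This line is only reached when the body did not raise, so a failed or
    # interrupted run leaves the suffixed file behind as a visible marker.
    os.replace(temp_path, final_path)
```

Usage would look like `with running_output("HOSTNAME_20240625102705.tar.gz") as path:` followed by writing the archive to `path`; the final name only appears once the block has completed.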

Different consideration of letter case for windows output folders by the acquire plugins

When using acquire on a Linux distribution (in my case Ubuntu 22.04.3 LTS) to collect data from a Windows image (in my case in EWF format), different plugins seem to use different letter case for the output folders. Some plugins work case-sensitively and some case-insensitively.
Unfortunately, this leads, for example, to two folders named Windows being created in the output directory on my Linux system, one capitalized (Windows) and one entirely in lower case (windows). Which folder the output is written to probably depends on the plugin.
This affects not only folders directly under fs/sysvol but also subfolders. For example, two System32 folders are also created: one System32 and one system32.
Here is an example of my output (I worked with the full profile):

/fs/sysvol $ tree -L 3 -d .
.
├── $Extend
├── $Recycle.bin
├── $Recycle.Bin
├── ProgramData
│   └── Microsoft
│       ├── Network
│       ├── Search
│       ├── Windows
│       └── Windows Defender
├── Users
├── windows
│   ├── appcompat
│   │   ├── appraiser
│   │   ├── Programs
│   │   └── UA
│   ├── inf
│   ├── prefetch
│   ├── system32
│   │   ├── config
│   │   ├── drivers
│   │   ├── sru
│   │   ├── tasks
│   │   ├── wbem
│   │   └── winevt
│   └── tasks
└── Windows
    ├── Logs
    │   ├── CBS
    │   └── WindowsUpdate
    ├── ServiceProfiles
    │   ├── LocalService
    │   └── NetworkService
    ├── system32
    │   └── config
    ├── System32
    │   ├── WDI
    │   └── winevt
    └── Temp

It would be nice if the plugins all used the same upper and lower case for the respective output folders. Preferably the Windows standard, i.e. what was found on the Windows image.

Add Atera/Splashtop to Acquire

During a CERT case it was observed that the actors were using the Atera Management Agent. This agent seems to use the Splashtop Remote Access Tool under the hood. We'll need to add these locations to acquire so we can query this data with target-query.

File locations: C:\Program Files (x86)\Splashtop\Splashtop Remote\Server\log\

  • svcinfo.txt -> Splashtop service information logging;
  • agent_log.txt -> agent output, generic information;
  • sysinfo.txt -> information about server and session startups;
  • SPLog.00x -> information about clipboard, transferred files, etc;

Add option to select specific children in acquire

For example, a specific child may error, or you may want to exclude it for some other reason.

Maybe nice to draw inspiration from (or just use) rdump selectors e.g.

"t.os == 'windows'" or "'Windows Server' in t.version"
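To make the idea concrete, here is a small sketch of evaluating such a selector expression against each child, with the child exposed as t as in the examples above. The function select_children is hypothetical, and the children are plain SimpleNamespace objects standing in for real child targets. Note that eval with emptied builtins is not a security boundary; it is used here only to keep the example short.

```python
from types import SimpleNamespace


def select_children(children, selector: str) -> list:
    """Return the children for which the selector expression is truthy."""
    code = compile(selector, "<selector>", "eval")
    selected = []
    for child in children:
        # The expression only sees the child under the name `t`.
        if eval(code, {"__builtins__": {}}, {"t": child}):
            selected.append(child)
    return selected


# Stand-in children; the real objects would be child targets.
children = [
    SimpleNamespace(name="vm1", os="windows", version="Windows Server 2019"),
    SimpleNamespace(name="vm2", os="linux", version="Ubuntu 22.04"),
    SimpleNamespace(name="vm3", os="windows", version="Windows 10"),
]
```

With these stand-ins, the selector `t.os == 'windows'` keeps vm1 and vm3, while `'Windows Server' in t.version` keeps only vm1.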

Error: Module name must be provided or Collector needs to be bound to a module

I am having trouble using Dissect's acquire. When using it with the file option everything is fine:

*** Acquiring specified paths
- Collecting file /root/.bashrc: OK

Done collecting artifacts:
------------------------------------------------------------
  Module |    Success |    Failure |    Missing |      Empty
------------------------------------------------------------
cli-args |          1 |            |            |
------------------------------------------------------------
   Total |          1 |          0 |          0 |          0
------------------------------------------------------------

But when using the glob option I can't seem to get any files out of the image. The following error suggests providing a module name which I am not sure how to do. Am I missing something? Any advice is much appreciated.

*** Acquiring specified paths
- Collecting glob /root/.*shrc
- Failed to collect glob /root/.*shrc
Traceback (most recent call last):
  File "/root/py3venv_dissect/lib/python3.10/site-packages/acquire/collector.py", line 361, in collect_glob
    self.collect_path(entry)
  File "/root/py3venv_dissect/lib/python3.10/site-packages/acquire/collector.py", line 381, in collect_path
    raise ValueError("Module name must be provided or Collector needs to be bound to a module")
ValueError: Module name must be provided or Collector needs to be bound to a module

Done collecting artifacts:
------------------------------------------------------------
  Module |    Success |    Failure |    Missing |      Empty
------------------------------------------------------------
cli-args |            |          1 |            |
------------------------------------------------------------
   Total |          0 |          1 |          0 |          0
------------------------------------------------------------ 

Additional NTFS Artefact Collection

The following files would be beneficial when collecting data with Acquire.

C:\$LogFile
C:\$Extend\$UsnJrnl:$Max
C:\$Extend\$RmMetadata\$TxfLog\$Tops:$T
C:\$Extend\$RmMetadata\$TxfLog\$T

Collect all nginx&apache logs in Acquire

For IIS we parse the config (using dissect.target’s IIS plugin) to find additional log directories.

A similar thing can be done for NginX and Apache. Their respective plugins already have such a “give-me-all-log-paths” function, so that can be used.

All this functionality (including collecting the default paths, but those are probably also emitted by the plugin function) should be put in a WebserverLog module in acquire, also moving the IIS stuff there.

The IIS module should log & print a deprecation warning and forward to the WebserverLog module.

Also add acquire-test tests, with default logs & logs configured in a mock config.

Feature Request: allow custom report names

Currently, when acquire is run, the tool writes a report to the output directory, named with the hostname of the machine, a timestamp and a suffix of .report.json. However, whenever a custom output file is specified, the report does not adhere to this same naming scheme. I looked for an option to change this, but could only find one for disabling the report altogether.

This mainly becomes tedious when every machine that is acquired has the exact same hostname, making reports hard to distinguish from each other.

Would it be possible to either have the report be written to a similarly named file, or have an extra option which lets users specify their own report path?
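Both variants of the request can be sketched in one small function: an explicit report path wins, otherwise the report name is derived from the custom output name. The function name report_path_for, its report parameter and the list of archive suffixes are assumptions made for this illustration, not existing acquire options.

```python
from pathlib import Path
from typing import Optional

# Assumed set of archive suffixes to strip before appending .report.json.
ARCHIVE_SUFFIXES = (".tar.gz", ".tar", ".zip")


def report_path_for(output: str, report: Optional[str] = None) -> Path:
    """Return the report path: the explicit one, or one named after the output."""
    if report is not None:
        return Path(report)
    output_path = Path(output)
    name = output_path.name
    for suffix in ARCHIVE_SUFFIXES:
        if name.endswith(suffix):
            name = name[: -len(suffix)]
            break
    return output_path.with_name(name + ".report.json")
```

An output of `/cases/case42-dc01.tar.gz` would then get `/cases/case42-dc01.report.json` next to it, which keeps reports apart even when every machine has the same hostname.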

Acquire output does not contain a clear ending message

Acquire can give confusing output that does not make it obvious whether it exited cleanly or not. Even after the summary it sometimes still prints confusing output.
This makes it unclear whether the acquisition process ended due to a bug or followed the happy flow.

I strongly recommend ending cleanly, for example by repeating the command line arguments as is done at the start,
and then printing a clear final message. For example:

[2023-11-02 16:46:26,193] [INFO ] Finishing acquisition
[2023-11-02 16:46:26,367] [INFO ] Arguments: --children -o /vmfs/volumes/NFS_storetemp/server
[2023-11-02 16:46:26,367] [INFO ] Default Arguments: --compress --ntds --active-directory --profile full
[2023-11-02 16:46:26,573] [INFO ] Exiting with status code 0 (SUCCESS)

Note that this can also be confusing for individual child acquisitions with --children. The individual log files can also contain unclear beginning and ending log lines.

This should be a fairly simple fix with relatively high value. It can really help troubleshooting.
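A closing block like the one proposed could be produced with a few lines around the standard logging module. This is only a sketch: the logger name, the function log_exit and the SUCCESS/FAILURE wording are assumptions modelled on the example log lines above, and the separate "Default Arguments" line is left out because its source is not described here.

```python
import logging
import shlex

log = logging.getLogger("acquire-sketch")  # hypothetical logger name


def log_exit(argv: list[str], exit_code: int) -> list[str]:
    """Log a clear closing block and return the logged messages."""
    status = "SUCCESS" if exit_code == 0 else "FAILURE"
    messages = [
        "Finishing acquisition",
        f"Arguments: {shlex.join(argv)}",
        f"Exiting with status code {exit_code} ({status})",
    ]
    for message in messages:
        log.info(message)
    return messages
```

Calling the same function at the end of every child acquisition would give each individual log file the same unambiguous last line.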
