This tool crawls a directory tree and outputs hashes to a JSON log file. The hash, absolute file path, and datetime of the recording are saved as a list of JSON objects.
Option 1: go install (recommended method)
go install github.com/DavidHoenisch/crawlhash@latest
Option 2: Pre-built binaries
Pre-built binaries are provided for easier setup. You can also build from source if you prefer.
make the directory
mkdir -p ~/.local/bin/evitools/
get the latest version
wget https://github.com/DavidHoenisch/crawlhash/releases/download/v0.0.8/crawlhash_0.0.8_linux_amd64.tar.gz
Note: in the above example you should replace the release and binary to match your respective system. Binaries are provided for all major systems and architectures.
extract the binary from the archive
tar xvf crawlhash_0.0.8_linux_amd64.tar.gz
move the binary into the bin directory
mv crawlhash ~/.local/bin/evitools/
ensure that PATH is updated
echo "export PATH=$PATH:$HOME/.local/bin/evitools/" >> .profile
Note: depending on your setup, the PATH entry may need to be added to your .bashrc or .zshrc file instead. If, after restarting your shell, you are not able to run the `crawlhash` command, rerun the above echo command but redirect it to your shell profile (.bashrc, .zshrc).
Crawlhash takes one filepath argument: the path to the root directory to scan.
A single file, `log.json`, will be written to the directory that crawlhash was run from. This file contains the output from crawlhash.
crawlhash ~/path/to/dir
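For a sense of what this does conceptually, here is a minimal Go sketch of the idea: walk the tree, hash every regular file, and write the results to `log.json` in the current working directory. The hash algorithm, struct, and JSON field names below are assumptions for illustration only, not crawlhash's actual source or schema.

```go
// Sketch only: walk a root directory, SHA-256 each regular file, and
// write a JSON array of {hash, path, timestamp} objects to log.json.
// Field names and the hash algorithm are illustrative assumptions.
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"encoding/json"
	"fmt"
	"io"
	"io/fs"
	"os"
	"path/filepath"
	"time"
)

type record struct {
	Hash      string    `json:"hash"`      // hex digest of the file contents
	Path      string    `json:"path"`      // absolute path of the hashed file
	Timestamp time.Time `json:"timestamp"` // when the hash was recorded
}

func hashFile(path string) (string, error) {
	f, err := os.Open(path)
	if err != nil {
		return "", err
	}
	defer f.Close()
	h := sha256.New()
	if _, err := io.Copy(h, f); err != nil {
		return "", err
	}
	return hex.EncodeToString(h.Sum(nil)), nil
}

func main() {
	if len(os.Args) != 2 {
		fmt.Fprintln(os.Stderr, "usage: crawlhash-sketch <root-dir>")
		os.Exit(1)
	}
	root := os.Args[1] // the single filepath argument, e.g. ~/path/to/dir

	var records []record
	err := filepath.WalkDir(root, func(path string, d fs.DirEntry, err error) error {
		if err != nil || d.IsDir() {
			return err
		}
		abs, err := filepath.Abs(path)
		if err != nil {
			return err
		}
		sum, err := hashFile(path)
		if err != nil {
			return err
		}
		records = append(records, record{Hash: sum, Path: abs, Timestamp: time.Now()})
		return nil
	})
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}

	// Write everything out as one JSON array, mirroring the log.json idea.
	out, _ := json.MarshalIndent(records, "", "  ")
	if err := os.WriteFile("log.json", out, 0o644); err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
}
```

Running this sketch against a directory produces a JSON array of objects, one per file, which is the same general shape described above.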
crawlhash is a port of a previous tool that I wrote that did the exact same thing. That tool, found here, was written in Python and is significantly slower.
As an example, I hashed my `~/Documents` folder, which has about 13k files in it.
Here are screenshots from my testing:
Quite the significant speed improvement. This is not, however, a 1-to-1 comparison. Both iterations of the tool write the results to a file, but the Go implementation records results as JSON. Writing results as JSON requires marginally more work. The Go version is not only faster; it is faster while doing more.
I have a few items on the roadmap for improvement.
- Threading -> My thought is that each directory could be handed off to a separate thread for hashing (see the sketch after this list)
- More command-line options for flexibility, implemented through use of the Cobra library
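As a rough illustration of the threading idea above, a per-directory fan-out in Go could look like the sketch below: each directory is handed to its own goroutine, with results collected over a channel. The helper names and the SHA-256 choice are my illustrative assumptions, not a committed design for crawlhash.

```go
// Sketch of the "one goroutine per directory" roadmap idea. Helper names
// and the hash algorithm are assumptions, not crawlhash's implementation.
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"io"
	"io/fs"
	"os"
	"path/filepath"
	"sync"
)

// hashDir hashes every regular file directly inside dir and sends
// "path:hash" strings on the shared results channel.
func hashDir(dir string, results chan<- string, wg *sync.WaitGroup) {
	defer wg.Done()
	entries, err := os.ReadDir(dir)
	if err != nil {
		return
	}
	for _, e := range entries {
		if e.IsDir() {
			continue
		}
		path := filepath.Join(dir, e.Name())
		f, err := os.Open(path)
		if err != nil {
			continue
		}
		h := sha256.New()
		if _, err := io.Copy(h, f); err == nil {
			results <- fmt.Sprintf("%s:%s", path, hex.EncodeToString(h.Sum(nil)))
		}
		f.Close()
	}
}

func main() {
	root := os.Args[1]
	results := make(chan string)
	var wg sync.WaitGroup

	// Walk the tree and hand each directory off to its own goroutine.
	filepath.WalkDir(root, func(path string, d fs.DirEntry, err error) error {
		if err == nil && d.IsDir() {
			wg.Add(1)
			go hashDir(path, results, &wg)
		}
		return err
	})

	// Close the results channel once every directory worker has finished.
	go func() {
		wg.Wait()
		close(results)
	}()

	for line := range results {
		fmt.Println(line)
	}
}
```

Whether a goroutine per directory or a fixed worker pool wins in practice would need benchmarking, since hashing is largely I/O-bound.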