
dorkbot's Introduction

dorkbot

Scan Google (or other) search results for vulnerabilities.

dorkbot is a modular command-line tool for performing vulnerability scans against sets of webpages returned by Google search queries or other supported sources. It is broken up into two sets of modules:

  • Indexers - modules that return a list of targets
  • Scanners - modules that perform a vulnerability scan against each target

Targets are stored in a database as they are indexed. Once scanned, a standard JSON report is produced containing any vulnerabilities found. Indexing and scanning processes can be run separately or combined in a single command (up to one of each).

Quickstart

$ pip3 install dorkbot wapiti3
$ dorkbot -i google_api -o key=your_api_credential_here -o engine=your_engine_id_here -o query="filetype:php inurl:id"
$ dorkbot -s wapiti

Help

  -h, --help            Show program (or specified module) help
  --show-defaults       Show default values in help output

Usage

usage: dorkbot.py [-c CONFIG] [-r DIRECTORY] [--source [SOURCE]]
                  [--show-defaults] [--count COUNT] [--random] [-h]
                  [--log LOG] [-v] [-V] [-d DATABASE] [-u] [-l]
                  [--list-unscanned] [--add-target TARGET]
                  [--delete-target TARGET] [--flush-targets] [-i INDEXER]
                  [-o INDEXER_ARG] [-s SCANNER] [-p SCANNER_ARG] [-f]
                  [--list-blocklist] [--add-blocklist-item ITEM]
                  [--delete-blocklist-item ITEM] [--flush-blocklist]
                  [-b EXTERNAL_BLOCKLIST]

options:
  -c CONFIG, --config CONFIG
                        Configuration file
  -r DIRECTORY, --directory DIRECTORY
                        Dorkbot directory (default location of db, tools,
                        reports)
  --source [SOURCE]     Label associated with targets
  --show-defaults       Show default values in help output
  -h, --help            Show program (or specified module) help
  --log LOG             Path to log file
  -v, --verbose         Enable verbose logging (DEBUG output)
  -V, --version         Print version

global scanner options:
  --count COUNT         number of urls to scan, or -1 to scan all urls
  --random              retrieve urls in random order

database:
  -d DATABASE, --database DATABASE
                        Database file/uri
  -u, --prune           Apply fingerprinting and blocklist without scanning

targets:
  -l, --list-targets    List targets in database
  --list-unscanned      List unscanned targets in database
  --add-target TARGET   Add a url to the target database
  --delete-target TARGET
                        Delete a url from the target database
  --flush-targets       Delete all targets

indexing:
  -i INDEXER, --indexer INDEXER
                        Indexer module to use
  -o INDEXER_ARG, --indexer-arg INDEXER_ARG
                        Pass an argument to the indexer module (can be used
                        multiple times)

scanning:
  -s SCANNER, --scanner SCANNER
                        Scanner module to use
  -p SCANNER_ARG, --scanner-arg SCANNER_ARG
                        Pass an argument to the scanner module (can be used
                        multiple times)

fingerprints:
  -f, --flush-fingerprints
                        Delete all fingerprints of previously-scanned items

blocklist:
  --list-blocklist      List internal blocklist entries
  --add-blocklist-item ITEM
                        Add an ip/host/regex pattern to the internal blocklist
  --delete-blocklist-item ITEM
                        Delete an item from the internal blocklist
  --flush-blocklist     Delete all internal blocklist items
  -b EXTERNAL_BLOCKLIST, --external-blocklist EXTERNAL_BLOCKLIST
                        Supplemental external blocklist file/db (can be used
                        multiple times)

Tools / Dependencies

As needed, dorkbot will search for tools in the following order:

  • Directory specified via relevant module option
  • Located in tools directory (within current directory, by default), with the subdirectory named after the tool
  • Available in the user's PATH (e.g. installed system-wide)
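The search order above can be sketched in Python. This is only an illustration, not dorkbot's actual code: `find_tool` is a hypothetical helper, and the `bin/<name>` layout inside the tools subdirectory is an assumption carried over from the module options (for example "arachni base dir containing bin/arachni").

```python
import os
import shutil
from typing import Optional


def find_tool(name: str, module_dir: Optional[str] = None,
              tools_dir: str = "tools") -> Optional[str]:
    """Return the path of the executable `name`, or None if it is not found.

    Hypothetical helper that follows the documented search order.
    """
    candidates = []
    # 1. Directory passed through the relevant module option (e.g. arachni_dir).
    if module_dir:
        candidates.append(os.path.join(module_dir, "bin", name))
    # 2. Subdirectory of the tools directory named after the tool
    #    (assumed to use the same bin/<name> layout).
    candidates.append(os.path.join(tools_dir, name, "bin", name))
    for path in candidates:
        if os.path.isfile(path) and os.access(path, os.X_OK):
            return path
    # 3. Fall back to the user's PATH.
    return shutil.which(name)
```

A call such as `find_tool("wapiti", module_dir="/opt/wapiti")` would check `/opt/wapiti/bin/wapiti` first, then `tools/wapiti/bin/wapiti`, and finally the PATH.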

Files

All SQLite3 databases, tools, and reports are saved in the dorkbot directory, which by default is the current directory. You can force a specific directory with the --directory flag. Default file paths within this directory are as follows:

  • SQLite3 database file: dorkbot.db
  • External tools directory: tools/
  • Scan report output directory: reports/

Configuration files are read by default from ~/.config/dorkbot/ (Linux / macOS) or from the Application Data folder (Windows), honoring $XDG_CONFIG_HOME / %APPDATA%. Default file paths within this directory are as follows:

  • Dorkbot configuration file: dorkbot.ini
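As a rough illustration of how that location could be resolved, here is a minimal Python sketch. `default_config_path` is a hypothetical helper, and both the `dorkbot` subfolder under %APPDATA% and the home-directory fallback when %APPDATA% is unset are assumptions rather than documented behavior.

```python
import os
import sys


def default_config_path(env=None, platform=None):
    """Return the assumed default location of dorkbot.ini.

    `env` and `platform` default to the real environment and can be
    overridden for testing.
    """
    env = os.environ if env is None else env
    platform = sys.platform if platform is None else platform
    if platform.startswith("win"):
        # Windows: Application Data folder (fallback to home is an assumption).
        base = env.get("APPDATA") or os.path.expanduser("~")
    else:
        # Linux / macOS: honor $XDG_CONFIG_HOME, else ~/.config.
        base = env.get("XDG_CONFIG_HOME") or os.path.join(
            os.path.expanduser("~"), ".config")
    return os.path.join(base, "dorkbot", "dorkbot.ini")


print(default_config_path())
```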

Config File

The configuration file (dorkbot.ini) can be used to prepopulate certain command-line flags.

Example dorkbot.ini:

[dorkbot]
database=/opt/dorkbot/dorkbot.db
[dorkbot.indexers.wayback]
domain=example.com
[dorkbot.scanners.arachni]
arachni_dir=/opt/arachni
report_dir=/tmp/reports
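To see how such a file breaks down into sections, the example can be read with Python's standard configparser. This is only a sketch of the file's structure: `module_options` is a hypothetical helper, and how dorkbot itself maps these sections onto flags is not shown here.

```python
import configparser

EXAMPLE_INI = """\
[dorkbot]
database=/opt/dorkbot/dorkbot.db
[dorkbot.indexers.wayback]
domain=example.com
[dorkbot.scanners.arachni]
arachni_dir=/opt/arachni
report_dir=/tmp/reports
"""

config = configparser.ConfigParser()
config.read_string(EXAMPLE_INI)


def module_options(config, kind, name):
    """Return the options of section dorkbot.<kind>.<name>, or {} if absent."""
    section = f"dorkbot.{kind}.{name}"
    return dict(config[section]) if config.has_section(section) else {}


print(config["dorkbot"]["database"])                 # global option
print(module_options(config, "indexers", "wayback"))  # indexer options
print(module_options(config, "scanners", "arachni"))  # scanner options
```

Each section name follows the pattern dorkbot.indexers.NAME or dorkbot.scanners.NAME, and the keys inside it use the underscore form of the module options.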

Blocklist

The blocklist is a list of IP addresses, hostnames, or regular expressions of url patterns that should not be scanned. If a target url matches any item in this list, it will be skipped and removed from the database. The internal blocklist is maintained in the dorkbot database, but a separate file or database can be specified by passing the appropriate file path or connection uri to --external-blocklist. Targets are matched first against the internal blocklist and then optionally against any provided external blocklists.

Supported external blocklists:

  • postgresql://[server info]
  • phoenixdb://[server info]
  • sqlite3:///path/to/blocklist.db
  • /path/to/blocklist.txt

Example blocklist items:

regex:^[^\?]+$
regex:.*login.*
regex:^https?://[^.]*.example.com/.*
host:www.google.com
ip:127.0.0.1

The first item will remove any target that doesn't contain a question mark, in other words any url that doesn't contain any GET parameters to test. The second attempts to avoid login functions, and the third blocklists all target urls on example.com. The fourth excludes targets with a hostname of www.google.com and the fifth excludes targets whose host resolves to 127.0.0.1.
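The matching described above can be approximated in a few lines of Python. This is a sketch under assumptions, not dorkbot's implementation: `is_blocked` is a hypothetical function, regex items are assumed to be matched from the start of the url with `re.match`, and the `resolve` parameter is added only so the example runs without DNS.

```python
import re
import socket
from urllib.parse import urlparse

BLOCKLIST = [
    r"regex:^[^\?]+$",
    r"regex:.*login.*",
    r"regex:^https?://[^.]*.example.com/.*",
    "host:www.google.com",
    "ip:127.0.0.1",
]


def is_blocked(url, items, resolve=socket.gethostbyname):
    """Return True if `url` matches any regex:, host: or ip: item."""
    host = urlparse(url).hostname or ""
    for item in items:
        kind, _, value = item.partition(":")
        if kind == "regex" and re.match(value, url):
            return True
        if kind == "host" and host == value:
            return True
        if kind == "ip":
            try:
                if resolve(host) == value:
                    return True
            except OSError:
                pass  # unresolvable host: treat as no match
    return False


def fake_resolve(host):
    """Stand-in resolver so the example does not need network access."""
    return "127.0.0.1" if host == "localhost" else "203.0.113.5"


for url in ("http://site.test/page.php",
            "http://site.test/page.php?id=1",
            "http://site.test/login.php?u=1",
            "http://localhost/x?y=1"):
    print(url, is_blocked(url, BLOCKLIST, resolve=fake_resolve))
```

Only the second url survives: the first has no question mark, the third contains "login", and the fourth resolves to 127.0.0.1 under the stand-in resolver.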

Prune

The prune flag iterates through all targets, computes the fingerprints in memory, and marks subsequent matching targets as scanned. Additionally it deletes any target matching a blocklist item. The result is a database where --list-unscanned returns only scannable urls. It honors the random flag to compute fingerprints in random order.
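The idea can be illustrated with a small Python sketch. How dorkbot actually computes a fingerprint is not described here, so the `fingerprint` function below is purely an assumption (scheme, host, path, and sorted parameter names, ignoring values), and `prune` is a hypothetical in-memory stand-in for the database pass.

```python
from urllib.parse import urlparse, parse_qsl


def fingerprint(url):
    """Assumed fingerprint: url with parameter values dropped, names sorted."""
    parts = urlparse(url)
    names = sorted({name for name, _ in
                    parse_qsl(parts.query, keep_blank_values=True)})
    return f"{parts.scheme}://{parts.netloc}{parts.path}?{'&'.join(names)}"


def prune(urls):
    """Split urls into (still unscanned, marked as scanned).

    The first url seen for each fingerprint stays unscanned; later urls
    with the same fingerprint are marked as scanned.
    """
    seen = set()
    unscanned, marked = [], []
    for url in urls:
        fp = fingerprint(url)
        if fp in seen:
            marked.append(url)
        else:
            seen.add(fp)
            unscanned.append(url)
    return unscanned, marked


targets = [
    "http://a.test/p.php?id=1",
    "http://a.test/p.php?id=2",
    "http://a.test/q.php?id=1",
]
print(prune(targets))
```

Under this assumed fingerprint, the second target duplicates the first, so only the first and third remain unscanned.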

General Options

These options are applicable regardless of module chosen:

  --source [SOURCE]     Label associated with targets
  --count COUNT         number of urls to scan, or -1 to scan all urls
  --random              retrieve urls in random order

Indexer Modules

google

  Searches google.com via scraping

  --engine ENGINE       CSE id
  --query QUERY         search query
  --phantomjs-dir PHANTOMJS_DIR
                        phantomjs base dir containing bin/phantomjs
  --domain DOMAIN       limit searches to specified domain

google_api

  Searches google.com

  --key KEY             API key
  --engine ENGINE       CSE id
  --query QUERY         search query
  --domain DOMAIN       limit searches to specified domain

pywb

  Searches a given pywb server's crawl data

  --server SERVER       pywb server url
  --domain DOMAIN       pull all results for given domain or subdomain
  --cdx-api-suffix CDX_API_SUFFIX
                        suffix after index for index api
  --index INDEX         search a specific index
  --filter FILTER       query filter to apply to the search
  --retries RETRIES     number of times to retry fetching results on error
  --threads THREADS     number of concurrent requests to wayback.org

commoncrawl

  Searches commoncrawl.org crawl data

  --domain DOMAIN       pull all results for given domain or subdomain
  --index INDEX         search a specific index, e.g. CC-MAIN-2019-22 (default: latest)
  --filter FILTER       query filter to apply to the search
  --retries RETRIES     number of times to retry fetching results on error
  --threads THREADS     number of concurrent requests to commoncrawl.org

wayback

  Searches archive.org crawl data

  --domain DOMAIN       pull all results for given domain or subdomain
  --filter FILTER       query filter to apply to the search
  --from FROM           beginning timestamp
  --to TO               end timestamp
  --retries RETRIES     number of times to retry fetching results on error
  --threads THREADS     number of concurrent requests to wayback.org

bing_api

  Searches bing.com

  --key KEY             API key
  --query QUERY         search query

stdin

  Accepts urls from stdin, one per line
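  Since the stdin indexer expects one url per line, a list gathered elsewhere may need tidying first. The Python sketch below shows one way to do that; `clean_urls` is a hypothetical helper, and the choice to drop blank lines, "#" comments, non-http(s) entries, and duplicates is an assumption about what is useful, not something dorkbot requires.

```python
from urllib.parse import urlparse


def clean_urls(lines):
    """Yield unique http(s) urls from an iterable of text lines."""
    seen = set()
    for line in lines:
        url = line.strip()
        if not url or url.startswith("#"):
            continue  # skip blank lines and comments
        parts = urlparse(url)
        if parts.scheme in ("http", "https") and parts.netloc and url not in seen:
            seen.add(url)
            yield url


sample = [
    "http://a.test/x.php?id=1\n",
    "\n",
    "# a comment\n",
    "ftp://a.test/file\n",
    "http://a.test/x.php?id=1\n",
]
for url in clean_urls(sample):
    print(url)
```

  Replacing `sample` with `sys.stdin` (after `import sys`) would turn this into a filter whose output can be piped into `dorkbot -i stdin`.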

Scanner Modules

arachni

  Scans with the arachni command-line scanner

  --arachni-dir ARACHNI_DIR
                        arachni base dir containing bin/arachni and bin/arachni_reporter
  --args ARGS           space-delimited list of additional arguments
  --report-dir REPORT_DIR
                        directory to save vulnerability report
  --label LABEL         friendly name field to include in vulnerability report

wapiti

  Scans with the wapiti3 command-line scanner

  --wapiti-dir WAPITI_DIR
                        wapiti base dir containing bin/wapiti
  --args ARGS           space-delimited list of additional arguments
  --report-dir REPORT_DIR
                        directory to save vulnerability report
  --label LABEL         friendly name field to include in vulnerability report

dorkbot's Issues

Keyerror: total result

I just installed the script and get this error:

Traceback (most recent call last):
  File "/usr/local/bin/dorkbot", line 8, in <module>
    sys.exit(main())
  File "/usr/local/lib/python3.8/dist-packages/dorkbot/dorkbot.py", line 73, in main
    index(db, blacklist, load_module("indexers", args.indexer), args, indexer_options)
  File "/usr/local/lib/python3.8/dist-packages/dorkbot/dorkbot.py", line 212, in index
    urls = indexer.run(options)
  File "/usr/local/lib/python3.8/dist-packages/dorkbot/indexers/google_api.py", line 17, in run
    results = get_results(options)
  File "/usr/local/lib/python3.8/dist-packages/dorkbot/indexers/google_api.py", line 34, in get_results
    items = issue_request(data)
  File "/usr/local/lib/python3.8/dist-packages/dorkbot/indexers/google_api.py", line 78, in issue_request
    if int(request["totalResults"]) == 0:
KeyError: 'totalResults'

Did something change in the Google API?

Query must be set

Hello, I'm trying to run dorkbot with the following command:

python ./dorkbot.py -i google -o engine=GoogleCSE,query="filetype:php inurl:id" -s arachni

And I get this error:

-ERROR - query must be set

Why does this error occur?

CommonCrawl indexer

Great work! This looks like it has a lot of potential.

How can I run just the commoncrawl indexer and search for something in a url, much like using Google with inurl:index.html?
I see commoncrawl has a filter option, and I'm guessing this is what I'm looking for, though it asks for a domain as a required arg? I'm a bit confused.

My ultimate goal is simply to grab a listing of sites that match given criteria. Creating a custom search on Google isn't quite as anonymous as I'm looking for.

Cheers and thank you!

i got some errors

./dorkbot.py -i google -o engine=012345678901234567891:abc12defg3h,query="filetype:php inurl:id" -s arachni
Traceback (most recent call last):
  File "./dorkbot.py", line 174, in <module>
    main()
  File "./dorkbot.py", line 69, in main
    index(db, args.indexer, args.indexer_options)
  File "./dorkbot.py", line 126, in index
    results = indexer_module.run(options)
  File "/root/Desktop/git/dorkbot/indexers/google.py", line 31, in run
    results = get_results(phantomjs_path, options["engine"], options["query"], domain)
  File "/root/Desktop/git/dorkbot/indexers/google.py", line 46, in get_results
    output = subprocess.check_output(index_cmd)
  File "/usr/lib/python2.7/subprocess.py", line 212, in check_output
    process = Popen(stdout=PIPE, *popenargs, **kwargs)
  File "/usr/lib/python2.7/subprocess.py", line 390, in __init__
    errread, errwrite)
  File "/usr/lib/python2.7/subprocess.py", line 1024, in _execute_child
    raise child_exception
OSError: [Errno 2] No such file or directory

CSV support

If possible, can you please add CSV file support as an alternative to the sqlite db?

We could then easily edit the CSV file on the go to modify certain links or remove similar ones.

two errors

1. When I run the google indexer, which needs PhantomJS, I get this error:
PhantomJS has crashed. Please read the bug reporting guide at
http://phantomjs.org/bug-reporting.html and file a bug report.
2020-01-11T19:41:09-0500 - ERROR - Failed to execute phantomjs command
Traceback (most recent call last):
  File "dorkbot.py", line 674, in <module>
    main()
  File "dorkbot.py", line 69, in main
    index(db, blacklist, load_module("indexers", args.indexer), args, indexer_options)
  File "dorkbot.py", line 194, in index
    for url in urls:
TypeError: 'bool' object is not iterable

2. When I run google_api, I also get an error:
Traceback (most recent call last):
  File "dorkbot.py", line 674, in <module>
    main()
  File "dorkbot.py", line 69, in main
    index(db, blacklist, load_module("indexers", args.indexer), args, indexer_options)
  File "dorkbot.py", line 195, in index
    if not blacklist.match(Target(url)): targets.append(url)
  File "dorkbot.py", line 466, in __init__
    self.starttime = generate_timestamp()
  File "dorkbot.py", line 291, in generate_timestamp
    return datetime.datetime.now().astimezone().isoformat()
ValueError: astimezone() cannot be applied to a naive datetime

How can I get this to run successfully?

Searching for Multiple Index Queries

I was wondering if there is a way of searching for multiple index queries at a time.

Say I have a list of 100 popular Google dorking search terms and I would like to feed all of them into dorkbot at once. Is there an option to feed in a text file line by line via the -o query= option? If not, what would be the best way to achieve this? Ideally, I want to avoid manually performing 100 scans.
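One possible workaround, sketched here under assumptions rather than as a built-in feature, is a small Python wrapper that runs the documented google_api indexing command once per line of a text file. `build_index_command` and `index_queries` are hypothetical helper names, and the file is assumed to hold one query per line.

```python
import subprocess


def build_index_command(query, key, engine):
    """Build the documented google_api indexing command for one query."""
    return ["dorkbot", "-i", "google_api",
            "-o", f"key={key}",
            "-o", f"engine={engine}",
            "-o", f"query={query}"]


def index_queries(path, key, engine, run=subprocess.run):
    """Run one indexing command per non-empty line in `path`.

    Returns the number of queries issued. `run` can be swapped out
    for testing.
    """
    with open(path, encoding="utf-8") as handle:
        queries = [line.strip() for line in handle if line.strip()]
    for query in queries:
        run(build_index_command(query, key, engine), check=False)
    return len(queries)
```

Calling `index_queries("dorks.txt", "your_api_credential_here", "your_engine_id_here")` would index each query in turn, after which a single `dorkbot -s wapiti` run could scan the accumulated targets.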

Add instructions to README to run headless

Hi!

A common problem when running this tool in a completely headless environment, such as a VPS, is that PhantomJS fails with the following errors:

qt.qpa.screen: QXcbConnection: Could not connect to display
Could not connect to any X display.
Traceback (most recent call last):
  File "/usr/local/bin/dorkbot", line 11, in <module>
(...)

You may want to add the following to the README:

export QT_QPA_PLATFORM=offscreen

This will allow PhantomJS to run without the need to be attached to a display :D

slow if scanning in bulk

Hello, it sometimes takes 2-5 seconds to scan each dork, which is fairly slow if I want to scan 100 dorks. Is there any way to speed this up?

config file

Hi, I'm just unsure on how to use the correct format for the config file.

The configuration file (dorkbot.ini) can be used to prepopulate certain command-line flags.

Suppose I want to auto fill these options

-i google_api -o key=your_api_credential_here -o engine=your_engine_id_here

How do I put it in the config file?

about search result numbers

I find that when a search finishes, only a few results are displayed.
How can I get more results?

Issues running dorkbot

All requirements are met but when I run ./dorkbot.py -s arachni nothing happens. Thanks.

Import List of Target URLs?

Hi,

Instead of having to manually add 1 URL at a time, is it possible to just have it ingest a .txt file list of bulk URLS?

"Query" Errors

robot:/dorkbot# ./dorkbot.py -i google -o engine=0099604576279661038117:vquegqft89g,query=filetype:php -s arachni
Traceback (most recent call last):
  File "./dorkbot.py", line 177, in <module>
    main()
  File "./dorkbot.py", line 69, in main
    index(db, args.indexer, args.indexer_options)
  File "./dorkbot.py", line 129, in index
    results = indexer_module.run(options)
  File "/root/dorkbot/indexers/google.py", line 46, in run
    return results
UnboundLocalError: local variable 'results' referenced before assignment
robot:/dorkbot#

Hello, I am getting the above error. How can I resolve it? :(
