Giter Club home page Giter Club logo

theharvester's Introduction

************************************
*theHarvester 2.3                  *
*Coded by Christian Martorella     *
*[email protected]     *
************************************

What is this?
-------------

theHarvester is a tool for gathering e-mail accounts, subdomain names, virtual hosts, open ports/ banners, and employee names from different public sources (search engines, pgp key servers). 

Is a really simple tool, but very effective for the early stages of a penetration test or just to know the visibility of your company in the Internet.

The actual sources are:

Passive:
--------
-google: google search engine  - www.google.com

-google-profiles: google search engine, specific search for Google profiles

-bing: microsoft search engine  - www.bing.com

-bingapi: microsoft search engine, through the API (you need to add your Key in the discovery/bingsearch.py file)

-pgp: pgp key server - pgp.rediris.es 

-linkedin: google search engine, specific search for Linkedin users

-dogpile: Dogpile search engine

-shodan: Shodan Computer search engine, will search for ports and banner of the discovered hosts  (http://www.shodanhq.com/)

-vhost: Bing virtual hosts search

Active:
-------
-DNS brute force: this plugin will run a dictionary brute force enumeration
-DNS reverse lookup: reverse lookup of ip´s discovered in order to find hostnames
-DNS TLD expansion: TLD dictionary brute force enumeration using the public suffix list

Dependencies:
------------:
none

Changelog 2.3:
-Added DogPile
-Refactored throughout so that new search engines can be added with minimal change of the core code.
-Refactoring allows us to perform further searches on output (email search on domains found through TLD enumeration?)
-Explicit XML output with filename (-X or output-xml parameters)
-Explicit HTML output with filename (-H or outout-html parameters)
-HTML/XML now include more data
-Shodan search returns the time the data was updated. This is especially useful when it returns multiple results
   for the same host/port that were scanned several time.
-Fixed some bugs/irregularities in the DNS Lookup
-Exalead and Yandex are faulty and have been commented out (see engine_list in theharvester.py)

Changelog 2.2a:
---------------
-Fixed Linkedin parser (thanks Alton Johnson and Francesco Stillavato)
-New banner with superpowers

Changelogin 2.2:
----------------
-Added Jigsaw (www.jigsaw.com)
-Added 123People (www.123people.com)
-Added limit to google searches as the maximum results we can obtain is 1000
-Removed SET, as service was discontinued by Google
-Fixed parser to remove wrong results like emails starting with @


Changelog in 2.1:
----------------
-DNS Bruteforcer
-DNS Reverse lookups
-DNS TDL Expansion
-SHODAN DB integration
-HTML report
-DNS server selection 


Changelog in 2.0:
----------------
-Complete rewrite, more modular and easy to maintain
-New sources (Exalead, Google-Profiles, Bing-Api)
-Time delay between request, to prevent search engines from blocking our IP´s
-You can start the search from the results page that you want, hence you can *resume* a search 
-Export to xml
-All search engines harvesting


TODO:
----
See TODO file.

Comments? Bugs? requests?
------------------------
[email protected]

Updates:
--------
http://code.google.com/p/theharvester/

Thanks:
-------
John Matherly -  SHODAN project
Lee Baird for suggestions and bugs reporting

theharvester's People

Contributors

ethicalhack3r avatar mariovilas avatar

Watchers

 avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.