detectem is a specialized software detector. Let's see it in action.
$ det http://domain.tld
[{'name': 'phusion-passenger', 'version': '4.0.10'},
{'name': 'apache-mod_bwlimited', 'version': '1.4'},
{'name': 'apache-mod_fcgid', 'version': '2.3.9'},
{'name': 'jquery', 'version': '1.11.3'},
{'name': 'crayon-syntax-highlighter', 'version': '_2.7.2_beta'}]
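The output shown above is formatted like a Python list of dictionaries, so it can be read back into a script. The following is a minimal sketch, assuming the text is captured exactly as printed; the helper name `parse_det_output` is made up for this example and is not part of detectem.

```python
import ast

# Output copied from the example run above.
raw_output = """[{'name': 'phusion-passenger', 'version': '4.0.10'},
{'name': 'apache-mod_bwlimited', 'version': '1.4'},
{'name': 'apache-mod_fcgid', 'version': '2.3.9'},
{'name': 'jquery', 'version': '1.11.3'},
{'name': 'crayon-syntax-highlighter', 'version': '_2.7.2_beta'}]"""


def parse_det_output(text):
    """Hypothetical helper: turn the printed list into a {name: version} dict."""
    # literal_eval only accepts Python literals, so it is safe on untrusted text.
    results = ast.literal_eval(text)
    return {item['name']: item['version'] for item in results}


versions = parse_det_output(raw_output)
print(versions['jquery'])
```

`ast.literal_eval` is used instead of `eval` because it never executes code found in the input.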
Using a series of indicators, it detects the software running on a site and accurately extracts its version information. It uses the Splash API to render the website and start the detection routine, and it performs a full analysis of requests, responses and even the DOM!
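To give an idea of what rendering through Splash can look like, here is a rough sketch that builds a request for Splash's `render.har` endpoint and pulls the requested URLs out of a HAR document. It assumes a Splash instance listening on `localhost:8050`; the function names and the sample data are invented for illustration, and detectem itself may talk to Splash differently.

```python
from urllib.parse import urlencode

# Assumption: Splash is reachable on its usual local port.
SPLASH_URL = 'http://localhost:8050'


def build_har_request(target_url, wait=2):
    """Build the URL that asks Splash to render a page and return a HAR log."""
    query = urlencode([('url', target_url), ('wait', wait)])
    return '{}/render.har?{}'.format(SPLASH_URL, query)


def request_urls(har):
    """Collect every request URL recorded in a HAR document."""
    return [entry['request']['url'] for entry in har['log']['entries']]


# Hand-written HAR fragment standing in for a real Splash response.
sample_har = {
    'log': {
        'entries': [
            {'request': {'url': 'http://domain.tld/'}},
            {'request': {'url': 'http://domain.tld/js/jquery-1.11.3.min.js'}},
        ]
    }
}

print(build_har_request('http://domain.tld'))
print(request_urls(sample_har))
```

The list of request URLs is the kind of raw material a detection routine could then inspect for software names and versions.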
There are two important articles to read:
- Detect software in modern web technologies.
- Browser support provided by Splash.
- Analysis on requests made and responses received by the browser.
- Get software information from the DOM.
- Great performance (less than 10 seconds to get a fingerprint).
- Plugin system to add new software easily.
- Test suite to ensure plugin result integrity.
- Continuous development to support new features.
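The request analysis and plugin ideas listed above can be illustrated with a small sketch. It is not detectem's plugin API: the `PLUGINS` table, the `register` and `detect` functions and the regular expressions are all hypothetical, and only show how a version could be extracted from a requested URL.

```python
import re

# Hypothetical plugin table: (software name, regex capturing the version).
PLUGINS = [
    ('jquery', re.compile(r'/jquery-(\d+(?:\.\d+)+)(?:\.min)?\.js')),
]


def register(name, pattern):
    """Add a new software matcher to the table."""
    PLUGINS.append((name, re.compile(pattern)))


def detect(urls):
    """Return one {'name', 'version'} dict per plugin that matches any URL."""
    found = []
    for name, pattern in PLUGINS:
        for url in urls:
            match = pattern.search(url)
            if match:
                found.append({'name': name, 'version': match.group(1)})
                break  # first match is enough for this plugin
    return found


# Supporting another piece of software is a single call in this sketch.
register('bootstrap', r'/bootstrap-(\d+(?:\.\d+)+)(?:\.min)?\.js')

urls = [
    'http://domain.tld/css/site.css',
    'http://domain.tld/js/jquery-1.11.3.min.js',
]
print(detect(urls))
```

A test suite for such plugins could assert that known URLs yield the expected name and version, which is the kind of integrity check the list mentions.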
Install Docker and add your user to the docker group so that you don't need to use sudo.
Pull the image:
$ docker pull scrapinghub/splash
Create a virtual environment with Python >= 3.5.
Install detectem:
$ pip install detectem
Run it against some URL:
$ det http://domain.tld
The documentation is at ReadTheDocs.