🤡 headful puppet

A headful chromium in a docker container; waiting to have your strings attached.

Testing and API crawling via the remote debugging API, with

  • 📺 headful chromium; using xvfb
  • 🕶 stealth mode for user-detection or DDOS-prevention evasion
  • 📰 logging console events, page- and request errors to container output

run

⚡ this image is BIG (900M)

Run the container and expose the remote debugging port (default: 9222):

docker run -p 9222:9222 bitmeal/headful-puppet

config

Configure the container using environment variables with the -e flag:

  • STEALTH: enable stealth mode; see below; off by default, set any value to enable
  • PORT: debug interface port to listen on for external connections; default 9222
  • PORT_INTERNAL: internal port; proxied to PORT; default 9992
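On the client side, the address to connect to follows from these settings. The following is a minimal sketch, assuming the container's PORT is published on the same port of the host; the helper name `browserURL` and its `host` and `port` options are hypothetical and not part of the image:

```javascript
// Hypothetical helper: build the browserURL passed to puppeteer.connect()
// from the host running the container and the PORT value set via -e.
function browserURL({ host = 'localhost', port = 9222 } = {}) {
    const n = Number(port);
    if (!Number.isInteger(n) || n < 1 || n > 65535) {
        throw new RangeError(`invalid port: ${port}`);
    }
    return `http://${host}:${n}`;
}

// Assumes a container started with:
//   docker run -p 9333:9333 -e PORT=9333 bitmeal/headful-puppet
console.log(browserURL({ port: 9333 })); // prints http://localhost:9333
```

Without arguments the helper falls back to the documented default, http://localhost:9222.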

use

Use your favourite implementation of the chrome remote debugging API and point it to the address of your container, or to localhost if the port is published. The default config uses the default remote debugging port 9222.

// node.js + puppeteer example

const puppeteer = require('puppeteer-core');

(async () => {
    const browser = await puppeteer.connect({ browserURL: 'http://localhost:9222' });
    const page = await browser.newPage();

    await page.goto('https://github.com/bitmeal', { waitUntil: 'networkidle2' });
    // do something
    await page.close();

    await browser.disconnect();
})();

The example shows the use of puppeteer in Node.js, though any other implementation of the API may be used. When using puppeteer, you may use the puppeteer-core package in your application to skip fetching a local chromium executable.
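Right after the container starts, the browser may not accept connections yet. The following is a minimal sketch of a readiness check, assuming Node.js 18+ (for the global `fetch`) and relying on the `/json/version` endpoint that chromium's remote debugging HTTP interface serves; the helper names `versionEndpoint` and `waitForBrowser` and their options are hypothetical:

```javascript
// Hypothetical helpers; requires Node.js 18+ for the global fetch().

// /json/version is served by chromium's remote debugging HTTP interface.
function versionEndpoint(browserURL) {
    return new URL('/json/version', browserURL).href;
}

// Poll until the browser answers or the attempts are used up.
async function waitForBrowser(browserURL, { attempts = 10, delayMs = 500 } = {}) {
    for (let i = 0; i < attempts; i++) {
        try {
            const res = await fetch(versionEndpoint(browserURL));
            // the JSON body includes, among others, webSocketDebuggerUrl
            if (res.ok) return await res.json();
        } catch (err) {
            // connection refused or reset: the container is probably still starting
        }
        await new Promise((resolve) => setTimeout(resolve, delayMs));
    }
    throw new Error(`no browser at ${browserURL} after ${attempts} attempts`);
}
```

Calling `await waitForBrowser('http://localhost:9222')` before `puppeteer.connect()` avoids a failed connection attempt while the container is still booting.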

stealth

docker run -p 9222:9222 -e STEALTH=1 bitmeal/headful-puppet

For API crawling on endpoints with user-detection or DDOS-prevention mechanisms, the packages puppeteer-extra and puppeteer-extra-plugin-stealth provide chromium with the flags and settings needed to evade them. Relying on these tried community resources allows faster integration and deployment, and is the main motivation for using puppeteer as the provider for chromium.

To use stealth mode to its full capabilities, use Node.js with the puppeteer, puppeteer-extra and puppeteer-extra-plugin-stealth packages as your API client. See the example below, which additionally demonstrates the use of an incognito context:

// node.js + puppeteer STEALTH mode example

// !! install puppeteer-core under the name puppeteer: npm i puppeteer@npm:puppeteer-core
const puppeteer = require('puppeteer-extra');
const StealthPlugin = require('puppeteer-extra-plugin-stealth');
puppeteer.use(StealthPlugin());

(async () => {
    const browser = await puppeteer.connect({ browserURL: 'http://localhost:9222' });
    // puppeteer v22+ renamed this method to createBrowserContext()
    const context = await browser.createIncognitoBrowserContext();
    const page = await context.newPage();

    await page.goto('https://github.com/bitmeal', { waitUntil: 'networkidle2' });
    // do something
    await page.close();

    await browser.disconnect();
})();
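To get a rough idea of whether automation is still visible to a page, one coarse signal is the standard `navigator.webdriver` property, which browsers set to `true` when they are controlled by automation. The following is a minimal sketch; the helper name `reportsWebdriver` is hypothetical, it checks this single property only, and it says nothing about other detection techniques:

```javascript
// Hypothetical check; `page` is a puppeteer Page (or any object with evaluate()).
async function reportsWebdriver(page) {
    // The callback is executed inside the browser, not in Node.js.
    const flag = await page.evaluate(() => navigator.webdriver);
    // Treat a missing or false value as "not flagged".
    return flag === true;
}
```

Inside the stealth example, `console.log(await reportsWebdriver(page))` after `page.goto()` would show what a page sees; the result depends on the container configuration and the plugin versions in use.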

internal

short summary

  • my_init.sh from phusion/baseimage as init
  • xvfb to provide "the head"
  • chromium provided by puppeteer Node.js package
  • socat to proxy the API to outside of the container (without --headless, chromium binds the debugging port on localhost only)
  • puppeteer-extra and puppeteer-extra-plugin-stealth for stealth mode and evasion tactics
  • puppeteer for logging of js-console events and page errors
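The logging item above can be illustrated with puppeteer's page events. The following is a minimal sketch and not the image's actual logging code; the function name `attachLogging` and the line format are hypothetical, while `console`, `pageerror` and `requestfailed` are events emitted by a puppeteer page:

```javascript
// Hypothetical sketch, not the image's actual logging code.
// `page` is a puppeteer Page; `log` receives one formatted line per event.
function attachLogging(page, log = console.log) {
    // console.* calls made by the page
    page.on('console', (msg) => log(`[console:${msg.type()}] ${msg.text()}`));
    // uncaught exceptions thrown in the page
    page.on('pageerror', (err) => log(`[pageerror] ${err.message}`));
    // requests that failed at the network level
    page.on('requestfailed', (req) => {
        const failure = req.failure();
        log(`[requestfailed] ${req.url()} ${failure ? failure.errorText : ''}`);
    });
}
```

Called as `attachLogging(page)` right after `browser.newPage()`, this prints one line per event to standard output, which in a container ends up in the container output.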
