rotating-tor-http-proxy

This Docker image provides a single HTTP proxy endpoint backed by many IP addresses, for use cases such as web crawling.

Behind the scenes, HAProxy sits in front of multiple Privoxy-Tor pairs and dispatches incoming requests to the Privoxy instances using a round-robin strategy.

Usage

This image is multi-platform enabled, currently supporting:

  • amd64 (x86_64)
  • arm64 (aarch64)
  • arm/v7 (armhf)
  • arm/v6 (armel)

Simple case

docker run --rm -it -p 3128:3128 zhaowde/rotating-tor-http-proxy

On the host, 127.0.0.1:3128 is the HTTP/HTTPS proxy address.
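To route command-line tools through this address, one option is to set the conventional `http_proxy` and `https_proxy` environment variables, which curl and many other tools read. The helper below is a minimal sketch; the function name `proxy_env` is hypothetical and not part of the image.

```shell
# proxy_env [HOST] [PORT]
# Prints export statements for the proxy variables.
# Defaults match the "Simple case" command: 127.0.0.1 and 3128.
proxy_env() {
  proxy_url="http://${1:-127.0.0.1}:${2:-3128}"
  printf 'export http_proxy=%s\n' "$proxy_url"
  printf 'export https_proxy=%s\n' "$proxy_url"
}
```

Running `eval "$(proxy_env)"` applies the variables to the current shell session only.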

More options

docker run --rm -it -p 3128:3128 -p 4444:4444 -e "TOR_INSTANCES=5" -e "TOR_REBUILD_INTERVAL=3600" zhaowde/rotating-tor-http-proxy

Map port 4444/TCP to the host if HAProxy statistics are needed. With docker run -p 4444:4444, the HAProxy statistics report is available at http://127.0.0.1:4444. An article on the official HAProxy blog explains in detail how to read this report.

The environment variable TOR_INSTANCES sets the number of concurrent Tor clients (and the associated Privoxy instances). The default is 10, and valid values are deliberately limited to the range 1 to 40.

Each Tor client attempts to build a new circuit, which results in a new outbound IP address, every 30 seconds. Every 30 minutes, this image rebuilds all the circuits. This interval can be changed with the environment variable TOR_REBUILD_INTERVAL. The default value is 1800 seconds, and it can be set to any number greater than 600 seconds.
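Because both variables have documented limits, it can help to check the values before starting the container. The following is a minimal sketch, not part of the image: the function names `is_positive_int` and `build_run_cmd` are hypothetical, and it assumes "greater than 600" means that 600 itself is rejected.

```shell
# Succeeds only for a positive integer without leading zeros.
is_positive_int() {
  case "$1" in
    ''|*[!0-9]*|0*) return 1 ;;
    *) return 0 ;;
  esac
}

# build_run_cmd [TOR_INSTANCES] [TOR_REBUILD_INTERVAL]
# Prints the docker run command when both values are inside the documented
# limits; otherwise prints an error to stderr and returns 1.
build_run_cmd() {
  instances="${1:-10}"    # documented default
  interval="${2:-1800}"   # documented default, in seconds

  if ! is_positive_int "$instances" || [ "$instances" -gt 40 ]; then
    echo "TOR_INSTANCES must be an integer from 1 to 40" >&2
    return 1
  fi
  # Assumption: the lower bound is exclusive, so 600 is not accepted.
  if ! is_positive_int "$interval" || [ "$interval" -le 600 ]; then
    echo "TOR_REBUILD_INTERVAL must be an integer greater than 600" >&2
    return 1
  fi

  printf 'docker run --rm -it -p 3128:3128 -e TOR_INSTANCES=%s -e TOR_REBUILD_INTERVAL=%s zhaowde/rotating-tor-http-proxy\n' \
    "$instances" "$interval"
}
```

For example, `build_run_cmd 5 3600` prints a command line that can be reviewed and then executed, while `build_run_cmd 41 3600` fails with an error message.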

Test the proxy

while :; do curl -sx localhost:3128 ifconfig.io; echo ""; sleep 2; done
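The loop above prints one outbound address per request. To summarize how many distinct addresses appear over a fixed number of requests, the output can be piped into a small counter. This is a sketch only; the function name `count_unique_ips` is hypothetical.

```shell
# Reads one address per line from stdin and prints the number of distinct
# non-empty lines.
count_unique_ips() {
  awk 'NF && !seen[$0]++ { n++ } END { print n + 0 }'
}

# Example (requires the running container and network access):
#   for i in 1 2 3 4 5 6 7 8 9 10; do
#     curl -sx localhost:3128 ifconfig.io; echo ""; sleep 2
#   done | count_unique_ips
```

A count close to the number of requests suggests the rotation is working; a count of 1 suggests that all requests left through the same circuit.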

Credit

On GitHub, many repositories build Docker images that provide an HTTP proxy connected to the Tor network. This project reinvents the wheel based on several of them. Notably:

  • y4ns0l0/docker-multi-tor creates a setup with multiple Privoxy-Tor pairs. Because it has no HAProxy-like dispatcher, each Privoxy instance exposes itself to the host on a different TCP port.
  • mattes/rotating-proxy does exactly the same job as this project. However,
    1. it uses Polipo as the HTTP-SOCKS proxy adapter, and Polipo ceased to be maintained on 6 November 2016
    2. the base image is Ubuntu 14.04, which is too heavy for this case and is also no longer maintained
    3. the main control logic is written in Ruby

Bill of Materials

  • alpine-3.19.1
  • bash-5.2.21
  • curl-8.5.0
  • haproxy-2.8.5
  • privoxy-3.0.34
  • sed-4.9
  • tor-0.4.8.10
