Giter Club home page Giter Club logo

undetected-chromedriver's Introduction

undetected_chromedriver

https://github.com/ultrafunkamsterdam/undetected-chromedriver

Optimized Selenium Chromedriver patch which does not trigger anti-bot services like Distill Network / Imperva / DataDome / Botprotect.io Automatically downloads the driver binary and patches it.

  • Tested from version 80 until current beta
  • Patching also works on MS Edge (chromium-based) webdriver binary
  • Python 3.6++

Installation

pip install undetected-chromedriver

Usage

To prevent unnecessary hair-pulling and issue-raising, please mind the important note at the end of this document .


the easy way (recommended)

import undetected_chromedriver as uc
driver = uc.Chrome()
driver.get('https://distilnetworks.com')

the V2 (beta) way

import undetected_chromedriver.v2 as uc
driver = uc.Chrome()
with driver:
    driver.get('https://coinfaucet.eu')  # known url using cloudflare's "under attack mode"

target specific chrome version

import undetected_chromedriver as uc
uc.TARGET_VERSION = 85
driver = uc.Chrome()

monkeypatch mode

Needs to be done before importing from selenium package

import undetected_chromedriver as uc
uc.install()

from selenium.webdriver import Chrome
driver = Chrome()
driver.get('https://distilnetworks.com')

the customized way

import undetected_chromedriver as uc

#specify chromedriver version to download and patch
uc.TARGET_VERSION = 78    

# or specify your own chromedriver binary (why you would need this, i don't know)

uc.install(
    executable_path='c:/users/user1/chromedriver.exe',
)

opts = uc.ChromeOptions()
opts.add_argument(f'--proxy-server=socks5://127.0.0.1:9050')
driver = uc.Chrome(options=opts)
driver.get('https://distilnetworks.com')

datadome.co example

These guys have actually a powerful product, and a link to this repo, which makes me wanna test their product. Make sure you use a "clean" ip for this one.

#
# STANDARD selenium Chromedriver
#
from selenium import webdriver
chrome = webdriver.Chrome()
chrome.get('https://datadome.co/customers-stories/toppreise-ends-web-scraping-and-content-theft-with-datadome/')
chrome.save_screenshot('datadome_regular_webdriver.png')
True   # it caused my ip to be flagged, unfortunately


#
# UNDETECTED chromedriver (headless,even)
#
import undetected_chromedriver as uc
options = uc.ChromeOptions()
options.headless=True
options.add_argument('--headless')
chrome = uc.Chrome(options=options)
chrome.get('https://datadome.co/customers-stories/toppreise-ends-web-scraping-and-content-theft-with-datadome/')
chrome.save_screenshot('datadome_undetected_webddriver.png')

Check both saved screenhots here

important note

Due to the inner workings of the module, it is needed to browse programmatically (ie: using .get(url) ). Never use the gui to navigate. Using your keybord and mouse for navigation causes possible detection! New Tabs: same story. If you really need multi-tabs, then open the tab with the blank page (hint: url is data:, including comma, and yes, driver accepts it) and do your thing as usual. If you follow these "rules" (actually its default behaviour), then you will have a great time for now.

TL;DR and for the visual-minded:

In [1]: import undetected_chromedriver as uc
In [2]: driver = uc.Chrome()
In [3]: driver.execute_script('return navigator.webdriver')
Out[3]: True  # Detectable
In [4]: driver.get('https://distilnetworks.com') # starts magic
In [4]: driver.execute_script('return navigator.webdriver')
In [5]: None  # Undetectable!

end important note

undetected-chromedriver's People

Contributors

ultrafunkamsterdam avatar harryyc avatar thomas-milburn avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.