https://github.com/oxylabs/selenium-bypass-captcha

See how to easily bypass CAPTCHA tests using Selenium in Python.

bypass-captcha captcha captcha-bypass python selenium selenium-python web-scraping

Host: GitHub
URL: https://github.com/oxylabs/selenium-bypass-captcha
Owner: oxylabs
Created: 2023-10-11T12:30:50.000Z (about 2 years ago)
Default Branch: main
Last Pushed: 2024-04-19T10:59:25.000Z (over 1 year ago)
Last Synced: 2024-11-17T02:09:28.458Z (about 1 year ago)
Topics: bypass-captcha, captcha, captcha-bypass, python, selenium, selenium-python, web-scraping
Language: Python
Homepage: https://oxylabs.io/blog/selenium-bypass-captcha
Size: 699 KB
Stars: 7
Watchers: 1
Forks: 2
Open Issues: 0
Metadata Files:
- Readme: README.md

# How to Bypass CAPTCHA With Selenium & Python

[![mubeng](https://raw.githubusercontent.com/oxylabs/product-integrations/refs/heads/master/Affiliate-Universal-1090x275.png)](https://github.com/oxylabs/web-unblocker)

[![](https://dcbadge.limes.pink/api/server/Pds3gBmKMH?style=for-the-badge&theme=discord)](https://discord.gg/Pds3gBmKMH) [![YouTube](https://img.shields.io/badge/YouTube-Oxylabs-red?style=for-the-badge&logo=youtube&logoColor=white)](https://www.youtube.com/@oxylabs)

- [Bypass CAPTCHA with Selenium and Python](#bypass-captcha-with-selenium-and-python)
  * [**Step 1 - Install dependencies**](#step-1---install-dependencies)
  * [**Step 2 - Import libraries**](#step-2---import-libraries)
  * [**Step 3 - Navigate to webpage**](#step-3---navigate-to-webpage)
  * [**Step 4 - Take a screenshot**](#step-4---take-a-screenshot)
- [Bypass CAPTCHA with Web Unblocker](#bypass-captcha-with-web-unblocker)
  * [**Step 1 - Import libraries**](#step-1---import-libraries)
  * [**Step 2 - Get Web Unblocker credentials**](#step-2---get-web-unblocker-credentials)
  * [**Step 3 - Prepare Web Unblocker**](#step-3---prepare-web-unblocker)
  * [**Step 4 - Fetch content**](#step-4---fetch-content)

In this tutorial, you’ll learn how to handle CAPTCHA tests in Selenium and Python using [undetected-chromedriver](https://github.com/ultrafunkamsterdam/undetected-chromedriver) and [Oxylabs’ Web Unblocker](https://oxylabs.io/products/web-unblocker). See the [full blog post](https://oxylabs.io/blog/selenium-bypass-captcha) for more details and tips.

## Bypass CAPTCHA with Selenium and Python

The first step is to install Python if you haven’t already. You can download it from the [official website](https://python.org/download). Use the latest release, or at least version 3.6; otherwise, undetected-chromedriver won’t work properly.
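
If you’re unsure which interpreter a script will run under, a small guard can check the version before anything else is imported. This is only a sketch: it assumes 3.6 is the minimum, as stated above, and the names `MIN_PYTHON` and `python_is_supported` are made up for illustration.

```python
import sys

# Assumed minimum version, taken from the requirement stated above.
MIN_PYTHON = (3, 6)


def python_is_supported(version_info=sys.version_info, minimum=MIN_PYTHON):
    """Return True if the (major, minor) version meets the minimum."""
    return tuple(version_info[:2]) >= minimum


if __name__ == "__main__":
    if not python_is_supported():
        raise SystemExit("Python 3.6 or newer is required.")
    print("Python version OK:", sys.version.split()[0])
```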

### Step 1 - Install dependencies

Install the **undetected-chromedriver** and **requests** modules with the `pip` command below:

```bash
pip install undetected-chromedriver requests
```

### Step 2 - Import libraries

Now that undetected-chromedriver is installed, import it and create a `browser` instance:

```python
import undetected_chromedriver as webdriver

chrome_options = webdriver.ChromeOptions()
chrome_options.add_argument("--headless")  # run Chrome without a visible window

# use_subprocess is an argument of Chrome(), not a Chrome command-line flag
browser = webdriver.Chrome(options=chrome_options, use_subprocess=True)
```

The `browser` instance runs Chrome in `headless` mode, that is, in the background without a visible window.

### Step 3 - Navigate to webpage

Use the `browser` instance to navigate to your target website. For this tutorial, let’s use [https://sandbox.oxylabs.io/products](https://sandbox.oxylabs.io/products) as the target.

```python
browser.get("https://sandbox.oxylabs.io/products")
```

### Step 4 - Take a screenshot

Take a screenshot to verify that the page loaded properly without showing a CAPTCHA or bot protection screen. You can use Selenium’s `save_screenshot` method.

```python
browser.save_screenshot("screenshot.png")
```

Your screenshot might vary slightly due to screen size, but it’ll look similar to the one below:

![Screenshot](images/screenshot.png)

The page has loaded properly without showing any CAPTCHA, and undetected-chromedriver has rendered the JavaScript files.
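
A screenshot has to be inspected by eye. For unattended runs, a rough text check on the page source can flag pages that look like a challenge screen. The sketch below is a heuristic, not a reliable detector: the marker strings and the `looks_like_captcha` helper are assumptions for illustration, and real challenge pages vary by site.

```python
# Hypothetical marker phrases; adjust them to the sites you work with.
CAPTCHA_MARKERS = ("captcha", "are you a robot", "verify you are human")


def looks_like_captcha(page_source, markers=CAPTCHA_MARKERS):
    """Return True if any marker phrase appears in the page HTML."""
    text = page_source.lower()
    return any(marker in text for marker in markers)


if __name__ == "__main__":
    # With the browser from the steps above, you would pass browser.page_source.
    sample = "<html><body><h1>Products</h1></body></html>"
    print(looks_like_captcha(sample))
```

With Selenium, `browser.page_source` returns the current HTML as a string, so it can be passed to the helper directly. Call `browser.quit()` when you’re done to close Chrome.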

## Bypass CAPTCHA with Web Unblocker

To perform large-scale web scraping while bypassing CAPTCHAs, you’ll need a more robust tool. [Web Unblocker](https://oxylabs.io/products/web-unblocker), an AI-powered proxy solution for bypassing IP blocks and CAPTCHAs, automatically rotates proxies for you, so you don’t have to manage a proxy list for your bots manually.

### Step 1 - Import libraries

Let’s use the `requests` module to set up Web Unblocker.

```python
import requests
```

### Step 2 - Get Web Unblocker credentials

[Create an account](https://dashboard.oxylabs.io/en/) to get the Web Unblocker credentials. Within a few clicks, you can sign up and get a **1-week free trial** to develop and thoroughly test the solution.

### Step 3 - Prepare Web Unblocker

Web Unblocker’s host and port are `unblock.oxylabs.io` and `60000`, respectively. Replace `USERNAME` and `PASSWORD` with your own credentials.

```python
# Proxy URL in the form http://USERNAME:PASSWORD@host:port
proxy = 'http://{}:{}@unblock.oxylabs.io:60000'.format("USERNAME", "PASSWORD")

# Route both HTTP and HTTPS traffic through Web Unblocker
proxies = {
    'http': proxy,
    'https': proxy
}
```

If you get any authentication-related errors in the later steps, check the Web Unblocker [response codes](https://developers.oxylabs.io/advanced-proxy-solutions/web-unblocker/response-codes).
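
Hard-coding credentials is fine for a quick test, but a password containing characters such as `@` or `:` would break the proxy URL. The sketch below reads the credentials from environment variables and percent-encodes them. The variable names `OXYLABS_USERNAME` and `OXYLABS_PASSWORD` and the `build_proxies` helper are hypothetical and not part of any Oxylabs tooling.

```python
import os
from urllib.parse import quote

# Host and port as given in this step.
UNBLOCKER_HOST = "unblock.oxylabs.io"
UNBLOCKER_PORT = 60000


def build_proxies(username, password, host=UNBLOCKER_HOST, port=UNBLOCKER_PORT):
    """Build a proxies dict for requests with percent-encoded credentials."""
    user = quote(username, safe="")
    pwd = quote(password, safe="")
    proxy = f"http://{user}:{pwd}@{host}:{port}"
    return {"http": proxy, "https": proxy}


if __name__ == "__main__":
    # Hypothetical environment variable names; placeholders are used if unset.
    proxies = build_proxies(
        os.environ.get("OXYLABS_USERNAME", "USERNAME"),
        os.environ.get("OXYLABS_PASSWORD", "PASSWORD"),
    )
    print(proxies["https"])
```

The returned dict has the same shape as the `proxies` dict above, so it can be passed to `requests.get` unchanged.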

### Step 4 - Fetch content

Now you can pass the `proxies` dict you created to the `get` method of the `requests` module. Web Unblocker also requires an extra parameter, `verify=False`, which turns off TLS certificate verification for the request.

```python
page = "https://sandbox.oxylabs.io/products"

# Send the request through Web Unblocker
response = requests.get(page, proxies=proxies, verify=False)
print(response.status_code)

# Raw response body as bytes
content = response.content
```

You should see the status code `200` if everything works as expected. The page content is stored in the `content` variable, which you can process later with HTML parsing libraries such as [Beautiful Soup](https://www.crummy.com/software/BeautifulSoup/bs4/doc/) or parse using the [Custom Parser](https://developers.oxylabs.io/scraper-apis/custom-parser).

Web Unblocker also renders JavaScript for you, so you can use this method for dynamic websites as well.
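
As a quick sanity check before bringing in a full parsing library, you can pull the page title out of the fetched bytes with Python’s standard `html.parser` module. This is a minimal stand-in, not a replacement for Beautiful Soup or Custom Parser. The `TitleParser` class and `extract_title` helper are illustrative names, and UTF-8 encoding is assumed.

```python
from html.parser import HTMLParser


class TitleParser(HTMLParser):
    """Collect the text inside the <title> element."""

    def __init__(self):
        super().__init__()
        self._in_title = False
        self.title = ""

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def extract_title(content, encoding="utf-8"):
    """Return the page title from raw bytes (or a string) of HTML."""
    if isinstance(content, bytes):
        content = content.decode(encoding, errors="replace")
    parser = TitleParser()
    parser.feed(content)
    return parser.title.strip()


if __name__ == "__main__":
    # Stand-in for the `content` bytes fetched in the step above.
    sample = b"<html><head><title> Demo page </title></head><body></body></html>"
    print(extract_title(sample))
```

In the flow above, you would call `extract_title(content)` on the bytes returned by `response.content`.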
