Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/oxylabs/selenium-bypass-captcha
See how to easily bypass CAPTCHA tests using Selenium in Python.
https://github.com/oxylabs/selenium-bypass-captcha
bypass-captcha captcha captcha-bypass python selenium selenium-python web-scraping
Last synced: 24 days ago
JSON representation
See how to easily bypass CAPTCHA tests using Selenium in Python.
- Host: GitHub
- URL: https://github.com/oxylabs/selenium-bypass-captcha
- Owner: oxylabs
- Created: 2023-10-11T12:30:50.000Z (over 1 year ago)
- Default Branch: main
- Last Pushed: 2024-04-19T10:59:25.000Z (10 months ago)
- Last Synced: 2024-11-17T02:09:28.458Z (3 months ago)
- Topics: bypass-captcha, captcha, captcha-bypass, python, selenium, selenium-python, web-scraping
- Language: Python
- Homepage: https://oxylabs.io/blog/selenium-bypass-captcha
- Size: 699 KB
- Stars: 7
- Watchers: 1
- Forks: 2
- Open Issues: 0
-
Metadata Files:
- Readme: README.md
Awesome Lists containing this project
README
# How to Bypass CAPTCHA With Selenium & Python
[![mubeng](https://user-images.githubusercontent.com/129506779/250792357-8289e25e-9c36-4dc0-a5e2-2706db797bb5.png)](https://github.com/oxylabs/web-unblocker)
[![](https://dcbadge.vercel.app/api/server/eWsVUJrnG5)](https://discord.gg/GbxmdGhZjq)
- [Bypass CAPTCHA with Selenium and Python](#bypass-captcha-with-selenium-and-python)
* [**Step 1 - Install dependencies**](#step-1---install-dependencies)
* [**Step 2 - Import libraries**](#step-2---import-libraries)
* [**Step 3 - Navigate to webpage**](#step-3---navigate-to-webpage)
* [**Step 4 - Take a screenshot**](#step-4---take-a-screenshot)
- [**Bypass CAPTCHA with Web Unblocker**](#bypass-captcha-with-web-unblocker)
* [**Step 1 - Import libraries**](#step-1---import-libraries)
* [**Step 2 - Get Web Unblocker credentials**](#step-2---get-web-unblocker-credentials)
* [**Step 3 - Prepare Web Unblocker**](#step-3---prepare-web-unblocker)
* [**Step 4 - Fetch content**](#step-4---fetch-content)In this tutorial, you’ll learn how to handle CAPTCHA tests in Selenium
and Python using
[undetected-chromedriver](https://github.com/ultrafunkamsterdam/undetected-chromedriver)
and [Oxylabs’ Web
Unblocker](https://oxylabs.io/products/web-unblocker). See the
[full blog post](https://oxylabs.io/blog/selenium-bypass-captcha)
for more details and tips.## Bypass CAPTCHA with Selenium and Python
The first step is to install Python if you haven't installed it already.
You can download it from the [official
website](https://python.org/download). Download the latest version
or a version greater than 3.6; otherwise, undetected-chromedriver won’t
work properly.### Step 1 - Install dependencies
Install the **undetected-chromedriver** and **requests** module. You can use the
`pip` command given below:```bash
pip install undetected-chromedriver requests
```### Step 2 - Import libraries
Now that you’ve installed undetected-chromedriver, you can import it as
shown below:```python
import undetected_chromedriver as webdriverchrome_options = webdriver.ChromeOptions()
chrome_options.add_argument("--headless")
chrome_options.add_argument("--use_subprocess")browser = webdriver.Chrome(options=chrome_options)
```Notice, you’ve also created a `browser` instance. This will open a
Chrome window in the background in `headless` mode.### Step 3 - Navigate to webpage
Use the `browser` instance to navigate to your target website. For
this tutorial, let’s use
[https://sandbox.oxylabs.io/products](https://sandbox.oxylabs.io/products)
as the target.```python
browser.get("https://sandbox.oxylabs.io/products")
```### Step 4 - Take a screenshot
Take a screenshot to verify the page is loaded properly without showing
any CAPTCHA or bot protection screen. You can use the
`save_screenshot` method of Selenium.```python
browser.save_screenshot("screenshot.png")
```Your screenshot might vary slightly due to screen size, but it’ll look
similar to the one given below:![Screenshot](images/screenshot.png)
The page has loaded properly without showing any CAPTCHA and the
undetected-chromedriver has rendered the Javascript files.## Bypass CAPTCHA with Web Unblocker
To perform large-scale web scraping while bypassing CAPTCHA, you’ll need
a strong tool. [Web
Unblocker](https://oxylabs.io/products/web-unblocker), an AI–powered
proxy solution for bypassing IP blocks and CAPTCHAs, will automatically
rotate proxies for you, so you don’t have to worry about manually
managing a list of proxies for your bots.### Step 1 - Import libraries
Let’s use the `requests` module to set up Web Unblocker.
```python
import requests
```### Step 2 - Get Web Unblocker credentials
[Create an account](https://dashboard.oxylabs.io/en/) to get the Web Unblocker credentials. Within a few
clicks, you can sign up and get a **1-week free trial** to develop and
thoroughly test the solution.### Step 3 - Prepare Web Unblocker
Web Unblocker’s host and port are `unblock.oxylabs.io` and `60000`
respectively. Additionally, don’t forget to replace the `USERNAME` and
`PASSWORD` with the correct credentials.```python
proxy = 'http://{}:{}@unblock.oxylabs.io:60000'.format("USERNAME", "PASSWORD")proxies = {
'http': proxy,
'https': proxy
}
```If you get any authentication-related errors in the later steps, don’t
forget to check the Web Unblocker response codes
[here](https://developers.oxylabs.io/advanced-proxy-solutions/web-unblocker/response-codes).### Step 4 - Fetch content
Now, you can use the `proxies` dict you created with the `get`
method of the `requests` module. Web Unblocker also requires you to
pass an extra parameter, `verify=False`, to the get method.```python
page = "https://sandbox.oxylabs.io/products"
response = requests.get(page, proxies=proxies, verify=False)print(response.status_code)
content = response.content
```You should see the status code `200` if everything works as expected.
The content of the page will be stored in the `content` object, which
you can process later with HTML Parser libraries such as [Beautiful
Soup](https://www.crummy.com/software/BeautifulSoup/bs4/doc/) or
parse using the [Custom
Parser](https://developers.oxylabs.io/scraper-apis/custom-parser).
Web Unblocker also renders JavaScript for you, so you can use this
method for dynamic websites as well.