Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/oxylabs/mechanicalsoup-proxy-integration
Python tutorial for integrating Oxylabs' Residential Proxies with MechanicalSoup library
beautifulsoup bs4 github-python mechanicalsoup proxy-list proxy-list-github proxy-rotator proxy-site python requests rotating-proxy
Last synced: about 2 months ago
- Host: GitHub
- URL: https://github.com/oxylabs/mechanicalsoup-proxy-integration
- Owner: oxylabs
- Created: 2021-10-07T06:45:49.000Z (about 3 years ago)
- Default Branch: main
- Last Pushed: 2024-04-19T08:25:31.000Z (9 months ago)
- Last Synced: 2024-04-21T02:04:44.196Z (9 months ago)
- Topics: beautifulsoup, bs4, github-python, mechanicalsoup, proxy-list, proxy-list-github, proxy-rotator, proxy-site, python, requests, rotating-proxy
- Language: Python
- Homepage:
- Size: 38.1 KB
- Stars: 3
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
# Oxylabs’ Residential Proxies integration with MechanicalSoup
[![Oxylabs promo code](https://user-images.githubusercontent.com/129506779/250792357-8289e25e-9c36-4dc0-a5e2-2706db797bb5.png)](https://oxylabs.go2cloud.org/aff_c?offer_id=7&aff_id=877&url_id=112)
[![](https://dcbadge.vercel.app/api/server/eWsVUJrnG5)](https://discord.gg/GbxmdGhZjq)
[MechanicalSoup](https://github.com/MechanicalSoup/MechanicalSoup) is a Python library designed
for automating web interactions, such as submitting forms and following links and redirects. Because it
is built on the Python `requests` and `BeautifulSoup` libraries, `MechanicalSoup`
is also often used for web-scraping tasks, such as image extraction,
thanks to the functions those libraries provide. In this tutorial, we'll
cover how to integrate [Oxylabs' Residential Proxies](https://oxy.yt/VakT) with
MechanicalSoup and share a code sample for submitting an HTML form while using proxies.

## Requirements
For the integration to work, you'll need to install MechanicalSoup on your system.
You can do so using the `pip` command:
```bash
pip install mechanicalsoup
```

You'll also need `Python 3` or higher and access to Residential Proxies: https://oxy.yt/urSrl
## Proxy Authentication
For proxies to work, you'll need to specify your Oxylabs Residential Proxy access credentials inside the
[main.py](https://github.com/oxylabs/mechanicalsoup-proxy-integration/blob/main/main.py) file.

```python
USERNAME = "your_username"
PASSWORD = "your_password"
ENDPOINT = "pr.oxylabs.io:7777"
```
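
The constants above are later combined into the proxy URLs that MechanicalSoup passes to `requests`. The following is a minimal sketch of that step, assuming the credentials may contain reserved characters such as `@` or `:` that need percent-encoding before they are placed in a URL. The helper name `build_proxies` is hypothetical and is not part of `main.py`.

```python
from urllib.parse import quote


def build_proxies(username, password, endpoint):
    """Return a requests-style proxies dict for an authenticated HTTP proxy.

    `build_proxies` is an illustrative helper, not part of the tutorial code.
    """
    # Percent-encode the credentials so characters such as "@" or ":"
    # cannot be mistaken for URL delimiters.
    user = quote(username, safe="")
    secret = quote(password, safe="")
    proxy_url = f"http://{user}:{secret}@{endpoint}"
    # The same HTTP proxy URL handles both plain HTTP and HTTPS traffic.
    return {"http": proxy_url, "https": proxy_url}
```

The resulting dict can be assigned to `browser.session.proxies` in the same way as the hand-written `proxies` dict used later in this tutorial. For credentials made up only of letters and digits, the output is identical to the plain f-string version.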
Replace the `your_username` and `your_password` values with the username and password
of your Oxylabs Residential Proxy access credentials.

## Testing Proxy Connection
To check whether the proxy is working, try visiting [ip.oxylabs.io/location](https://ip.oxylabs.io/location).
If everything is working correctly, it will return the IP address of the proxy that you're using.

## Locating an HTML Form
Locating an HTML form in MechanicalSoup is relatively easy: all you have to do is select it
with a CSS selector using the `select_form` method. It returns the selected form, which can later be
retrieved using the browser's `form` attribute. Here's an example of locating a form and printing a
summary of its input fields.

```python
import mechanicalsoup

# Credentials of Oxylabs' Residential Proxy access.
USER = "your_username"
PASSWORD = "your_password"
ENDPOINT = "pr.oxylabs.io:7777"

proxies = {
    "http": f"http://{USER}:{PASSWORD}@{ENDPOINT}",
    "https": f"http://{USER}:{PASSWORD}@{ENDPOINT}",
}


def get_html_form(proxies):
    # Initiate a MechanicalSoup object.
    browser = mechanicalsoup.StatefulBrowser()
    browser.session.proxies = proxies
    browser.open("https://httpbin.org/forms/post")
    # Select a form in HTML using a CSS Selector.
    form = browser.select_form('form[action="/post"]')
    # Print the form field data. `print_summary` prints directly and returns None.
    form.print_summary()


if __name__ == "__main__":
    get_html_form(proxies)
```

## Full Code: Submitting an HTML Form with Proxies
```python
import mechanicalsoup

# Credentials for Oxylabs' Residential Proxy access.
USER = "your_username"
PASSWORD = "your_password"
ENDPOINT = "pr.oxylabs.io:7777"

proxies = {
    "http": f"http://{USER}:{PASSWORD}@{ENDPOINT}",
    "https": f"http://{USER}:{PASSWORD}@{ENDPOINT}",
}


def get_html_form(proxies):
    # Initiate a MechanicalSoup object.
    browser = mechanicalsoup.StatefulBrowser()
    browser.session.proxies = proxies
    browser.open("https://httpbin.org/forms/post")

    # Select a form in HTML using a CSS Selector.
    form = browser.select_form('form[action="/post"]')

    form_info = {
        "custname": "John",
        "custtel": "123",
        "custemail": "[email protected]",
        "size": "small",
        "topping": ("bacon", "cheese", "onion"),
        "delivery": "18:30",
        "comments": "I like pizza",
    }

    # Populate the form with values from the `form_info` dict.
    for key, value in form_info.items():
        form.set(key, value)

    # Launch a browser to view the current page with the filled-in form.
    browser.launch_browser()
    response = browser.submit_selected()
    return response.text


if __name__ == "__main__":
    print(get_html_form(proxies))
```
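
The check described in the Testing Proxy Connection section can also be run from Python before submitting any forms. The following is a minimal sketch, assuming a `proxies` dict like the one above. The helper name `check_proxy` is hypothetical, and the browser is passed in as an argument so the function can be exercised with any object that exposes `session.proxies` and `open()`.

```python
def check_proxy(browser, proxies, url="https://ip.oxylabs.io/location"):
    """Open `url` through `proxies` and return the response body as text.

    `check_proxy` is an illustrative helper, not part of the tutorial code.
    `browser` is expected to be a `mechanicalsoup.StatefulBrowser` instance.
    """
    # Route every request made by this browser through the proxy.
    browser.session.proxies = proxies
    response = browser.open(url)
    # Fail loudly on HTTP errors, such as rejected proxy credentials.
    response.raise_for_status()
    return response.text
```

Calling `print(check_proxy(mechanicalsoup.StatefulBrowser(), proxies))` should print the details of the proxy IP address if the credentials and endpoint are valid.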
If you're having any trouble integrating proxies with MechanicalSoup and this guide didn't help
you, feel free to contact Oxylabs customer support at [email protected].