https://github.com/MechanicalSoup/MechanicalSoup
A Python library for automating interaction with websites.
beautifulsoup mechanicalsoup pypi python python-library requests web
- Host: GitHub
- URL: https://github.com/MechanicalSoup/MechanicalSoup
- Owner: MechanicalSoup
- License: mit
- Created: 2014-05-26T09:06:11.000Z (over 11 years ago)
- Default Branch: main
- Last Pushed: 2024-05-21T09:03:33.000Z (over 1 year ago)
- Last Synced: 2024-05-21T21:57:04.677Z (over 1 year ago)
- Topics: beautifulsoup, mechanicalsoup, pypi, python, python-library, requests, web
- Language: Python
- Homepage: http://mechanicalsoup.readthedocs.io/en/stable/
- Size: 670 KB
- Stars: 4,565
- Watchers: 108
- Forks: 374
- Open Issues: 39
Metadata Files:
- Readme: README.rst
- Contributing: CONTRIBUTING.rst
- License: LICENSE
README
.. image:: /assets/mechanical-soup-logo.png
   :alt: MechanicalSoup. A Python library for automating website interaction.

Home page
---------
https://mechanicalsoup.readthedocs.io/
Overview
--------
A Python library for automating interaction with websites.
MechanicalSoup automatically stores and sends cookies, follows
redirects, and can follow links and submit forms. It does not
execute JavaScript.

MechanicalSoup was created by M Hickford, who was a fond user of the
Mechanize library. Unfortunately, Mechanize was incompatible with
Python 3 until 2019 and its development stalled for several years.
MechanicalSoup provides a similar API, built on Python giants Requests
(for HTTP sessions) and BeautifulSoup (for document navigation). Since
2017 it has been actively maintained by a small team including
@hemberger and @moy.

|Gitter Chat|
Installation
------------
|Latest Version| |Supported Versions|
PyPy3 is also supported (and tested against).
Download and install the latest released version from PyPI::

  pip install MechanicalSoup

Download and install the development version from GitHub::

  pip install git+https://github.com/MechanicalSoup/MechanicalSoup

Installing from source (installs the version in the current working directory)::

  python setup.py install

(In all cases, add ``--user`` to the ``install`` command to
install in the current user's home directory.)
Documentation
-------------
The full documentation is available at
https://mechanicalsoup.readthedocs.io/. You may want to jump directly to
the automatically generated API documentation.
Example
-------
The following example script gets the results from a Qwant search:

.. code:: python

    """Example usage of MechanicalSoup to get the results from the Qwant
    search engine.
    """

    import re
    import urllib.parse

    import mechanicalsoup

    # Connect to Qwant
    browser = mechanicalsoup.StatefulBrowser(user_agent='MechanicalSoup')
    browser.open("https://lite.qwant.com/")

    # Fill-in the search form
    browser.select_form('#search-form')
    browser["q"] = "MechanicalSoup"
    browser.submit_selected()

    # Display the results
    for link in browser.page.select('.result a'):
        # Qwant shows redirection links, not the actual URL, so extract
        # the actual URL from the redirect link:
        href = link.attrs['href']
        m = re.match(r"^/redirect/[^/]*/(.*)$", href)
        if m:
            href = urllib.parse.unquote(m.group(1))
        print(link.text, '->', href)

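
The redirect-decoding step of the example above can be tried on its own,
without any network access. The sketch below reuses the same regular
expression; the helper name ``resolve_href`` and the sample links are
made up for illustration, and the real format of Qwant's redirect links
may differ or change over time.

```python
import re
import urllib.parse

# Same pattern as in the example: "/redirect/<token>/<percent-encoded URL>"
REDIRECT_RE = re.compile(r"^/redirect/[^/]*/(.*)$")


def resolve_href(href):
    """Return the URL hidden in a redirect link, or href unchanged.

    Hypothetical helper, not part of MechanicalSoup.
    """
    m = REDIRECT_RE.match(href)
    if m:
        # The target URL is percent-encoded inside the redirect path
        return urllib.parse.unquote(m.group(1))
    return href


# Made-up sample values, for illustration only
sample = "/redirect/abc123/https%3A%2F%2Fexample.org%2Fdocs%3Fq%3D1"
print(resolve_href(sample))
print(resolve_href("https://example.org/plain"))
```

Links that do not match the redirect pattern are returned untouched, which
mirrors the ``if m:`` guard in the original loop.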
More examples are available in the project repository, including
examples with a more complex form (checkboxes, radio buttons and
textareas).
Development
-----------
|Build Status|
|Coverage Status|
|Documentation Status|
|CII Best Practices|
Instructions for building, testing and contributing to MechanicalSoup:
see ``CONTRIBUTING.rst``.
Common problems
---------------
Read the FAQ in the full documentation.

.. |Latest Version| image:: https://img.shields.io/pypi/v/MechanicalSoup.svg
   :target: https://pypi.python.org/pypi/MechanicalSoup/

.. |Supported Versions| image:: https://img.shields.io/pypi/pyversions/mechanicalsoup.svg
   :target: https://pypi.python.org/pypi/MechanicalSoup/

.. |Build Status| image:: https://github.com/MechanicalSoup/MechanicalSoup/actions/workflows/python-package.yml/badge.svg?branch=main
   :target: https://github.com/MechanicalSoup/MechanicalSoup/actions/workflows/python-package.yml?query=branch%3Amain

.. |Coverage Status| image:: https://codecov.io/gh/MechanicalSoup/MechanicalSoup/branch/main/graph/badge.svg
   :target: https://codecov.io/gh/MechanicalSoup/MechanicalSoup

.. |Documentation Status| image:: https://readthedocs.org/projects/mechanicalsoup/badge/?version=latest
   :target: https://mechanicalsoup.readthedocs.io/en/latest/?badge=latest

.. |CII Best Practices| image:: https://bestpractices.coreinfrastructure.org/projects/1334/badge
   :target: https://bestpractices.coreinfrastructure.org/projects/1334

.. |Gitter Chat| image:: https://badges.gitter.im/MechanicalSoup/MechanicalSoup.svg
   :target: https://gitter.im/MechanicalSoup/Lobby