https://github.com/oxylabs/web-scraping-selenium-python
Web Scraping with Python Selenium: Tutorial for Beginners
- Host: GitHub
- URL: https://github.com/oxylabs/web-scraping-selenium-python
- Owner: oxylabs
- Created: 2022-11-04T13:45:33.000Z (over 2 years ago)
- Default Branch: main
- Last Pushed: 2025-02-11T12:57:49.000Z (3 months ago)
- Last Synced: 2025-03-27T18:03:18.741Z (about 1 month ago)
- Topics: github-python, json-database-python, python-ecommerce, python-web-crawler, scraper-python, selenium-web-scraper, serp-api-python, web-scraping, web-scraping-python
- Language: Python
- Homepage:
- Size: 11.7 KB
- Stars: 2
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
# Web Scraping with Python Selenium: Tutorial for Beginners
- [Installing Selenium](#installing-selenium)
- [Testing](#testing)
- [Scraping with Selenium](#scraping-with-selenium)

In this article, we’ll cover an overview of web scraping with Selenium using a real-life example.
For a detailed tutorial on Selenium, see [our blog](https://oxylabs.io/blog/selenium-web-scraping).
## Installing Selenium
1. Create a virtual environment:
```sh
python3 -m venv .env
```
2. Install Selenium using pip:
```sh
pip install selenium
```
3. Install the WebDriver for your browser (this tutorial uses `chromedriver` for Chrome). See [this page](https://www.selenium.dev/documentation/webdriver/getting_started/install_drivers/) for details.
## Testing
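With the virtual environment activated, start the Python interactive interpreter by typing `python3`. Enter the following command at the prompt:
```python
>>> from selenium.webdriver import Chrome
```
If there are no errors, move on to the next step. If the import fails, check that Selenium is installed in the active virtual environment. If an error appears later, when the browser is started, ensure that `chromedriver` is added to the PATH.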
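The same check can be scripted. The following is a minimal sketch that uses only the standard library; the helper name `check_setup` is hypothetical, and it only reports what it finds rather than proving that a browser session can start.

```python
import shutil
from importlib.metadata import PackageNotFoundError, version


def check_setup() -> dict:
    """Report the installed Selenium version and the chromedriver location.

    Each value is None when the item cannot be found.
    """
    try:
        selenium_version = version("selenium")
    except PackageNotFoundError:
        selenium_version = None

    return {
        "selenium": selenium_version,
        # shutil.which searches the directories listed in PATH
        "chromedriver": shutil.which("chromedriver"),
    }


if __name__ == "__main__":
    for name, value in check_setup().items():
        print(f"{name}: {value or 'not found'}")
```

Run it inside the activated virtual environment; a `not found` entry points to the step that still needs attention.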
## Scraping with Selenium
Import required modules as follows:
```python
from selenium.webdriver import Chrome, ChromeOptions
from selenium.webdriver.common.by import By
```
Add the skeleton of the script as follows:
```python
def get_data(url) -> list:
    ...


def main():
    ...


if __name__ == '__main__':
    main()
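```
Create a `ChromeOptions` object and enable headless mode, then use it to create an instance of `Chrome`. Recent Selenium releases no longer support the `headless` property, so pass the `--headless=new` argument instead:
```python
browser_options = ChromeOptions()
# Run Chrome without a visible window
browser_options.add_argument("--headless=new")
driver = Chrome(options=browser_options)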
```
Call the `driver.get` method to load a URL. After that, locate the link for the Humor section by link text and click it:
```python
driver.get(url)
element = driver.find_element(By.LINK_TEXT, "Humor")
element.click()
```
Create a CSS selector to find all books on this page. Then loop over the books and find each book's title, price, and stock availability. Store each book's information in a dictionary and add these dictionaries to a list. See the code below:
```python
books = driver.find_elements(By.CSS_SELECTOR, ".product_pod")
data = []
for book in books:
    title = book.find_element(By.CSS_SELECTOR, "h3 > a")
    price = book.find_element(By.CSS_SELECTOR, ".price_color")
    stock = book.find_element(By.CSS_SELECTOR, ".instock.availability")
    book_item = {
        'title': title.get_attribute("title"),
        'price': price.text,
        'stock': stock.text
    }
    data.append(book_item)
```
Lastly, return the `data` list from this function.
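The scraped values are plain strings, so a small post-processing step can help before storing them. The sketch below shows one way to do that with the standard library. The helper `clean_book`, the sample record, and the assumption that the price text is a currency symbol followed by a decimal number are illustrative and are not taken from `main.py`.

```python
import json
import re


def clean_book(item: dict) -> dict:
    """Return a copy of one scraped record with a numeric price.

    Assumes the price text contains a decimal number such as '£12.34'.
    The price becomes None when no number is found.
    """
    match = re.search(r"\d+(?:\.\d+)?", item["price"])
    return {
        "title": item["title"].strip(),
        "price": float(match.group()) if match else None,
        "stock": item["stock"].strip(),
    }


# Hypothetical record in the shape produced by the loop above
sample = [{"title": "Example Book", "price": "£12.34", "stock": "  In stock  "}]

cleaned = [clean_book(item) for item in sample]
print(json.dumps(cleaned, indent=2))
```

In a real run, `sample` would be replaced by the list returned from `get_data`.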
For the complete code, see [main.py](src/main.py).