A PyChromeDevTools based WebScraper and selenium like syntax.
- Host: GitHub
- URL: https://github.com/jakbin/pcdt-scraper
- Owner: jakbin
- Created: 2025-02-03T03:58:17.000Z (9 months ago)
- Default Branch: main
- Last Pushed: 2025-02-26T17:19:00.000Z (8 months ago)
- Last Synced: 2025-06-02T16:41:48.553Z (5 months ago)
- Topics: pychromedevtools, python-chrome, web-scraper, web-scraping-python, webscraper, webscraping
- Language: Python
- Homepage: https://jakbin.github.io/pcdt-scraper/
- Size: 6.84 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
- Metadata Files:
  - Readme: README.md
 
README
# pcdt-scraper
A PyChromeDevTools based WebScraper and selenium like syntax.
[Publish workflow](https://github.com/jakbin/pcdt-scraper/actions/workflows/publish.yml)
[PyPI](https://pypi.org/project/pcdt-scraper)
[Downloads](https://pepy.tech/project/pcdt-scraper)


## Introduction
Sometimes a website blocks plain `requests` or `aiohttp` traffic but does not block requests coming from a real Chrome browser. pcdt-scraper works around this by driving Chrome through PyChromeDevTools while exposing a Selenium-like API.
## Compatibility
Python 3.6+ is required.
## Requirements
- Python 3.6 or higher
- Chrome or Chromium browser
- `bs4` (BeautifulSoup4)
- `PyChromeDevTools`
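Element lookups are ultimately HTML parsing with `bs4`, so that step can be tried on its own without a browser. A minimal sketch, using made-up markup and a made-up class name in place of a real page:

```python
from bs4 import BeautifulSoup

# Static HTML standing in for what a fetched page might look like
html = """
<html><body>
  <div class="headline">Hello, world</div>
  <div class="headline">Second headline</div>
</body></html>
"""

soup = BeautifulSoup(html, "html.parser")

# First element with the given class, like a Selenium-style
# find_element_by_class_name lookup
first = soup.find("div", class_="headline")
print(first.get_text(strip=True))  # Hello, world
```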
## Installation
```sh
pip install pcdt-scraper
```
or 
```sh
pip3 install pcdt-scraper
```
## Usage
1. First, start a Chrome or Chromium instance with remote debugging enabled:
```sh
chromium --remote-debugging-port=9222 --remote-allow-origins=*
```
Or run it in headless mode:
```sh
chromium --remote-debugging-port=9222 --remote-allow-origins=* --headless
```
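After starting the browser, it can help to confirm the debugging port is actually accepting connections before constructing a scraper. A small helper (not part of pcdt-scraper) using only the standard library:

```python
import socket

def devtools_ready(host: str = "127.0.0.1", port: int = 9222) -> bool:
    """Return True if something is listening on the DevTools port.

    This is a quick TCP probe: it does not verify that the listener
    is actually Chrome, only that the port accepts connections.
    """
    try:
        with socket.create_connection((host, port), timeout=1):
            return True
    except OSError:
        return False

# Example: fail fast with a clear message instead of a connection error
# if not devtools_ready():
#     raise RuntimeError("Start Chrome with --remote-debugging-port=9222 first")
```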
2. Then run your Python code:
```py
from pcdt_scraper import WebScraper
scraper = WebScraper()
url = "https://www.example.com/"
try:
    # Navigate to a page
    if scraper.get(url):
        # Get page content
        content = scraper.get_page_content()
        # find element by class name
        text = scraper.find_element_by_class_name('class_name').text()
        print(text)
except Exception as e:
    print(f"An error occurred: {str(e)}")
finally:
    scraper.close()
```
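If you need every matching element rather than just the first, you can also parse the HTML returned by `get_page_content()` directly with BeautifulSoup. The markup and class name below are stand-ins for a real page:

```python
from bs4 import BeautifulSoup

# Stand-in for the HTML that scraper.get_page_content() would return
content = """
<html><body>
  <a class="result" href="/a">First</a>
  <a class="result" href="/b">Second</a>
</body></html>
"""

soup = BeautifulSoup(content, "html.parser")

# Collect (text, href) pairs for every matching element
links = [(a.get_text(strip=True), a["href"])
         for a in soup.find_all("a", class_="result")]
print(links)  # [('First', '/a'), ('Second', '/b')]
```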