https://github.com/nirantak/scraper

Python web scrapers

Host: GitHub
URL: https://github.com/nirantak/scraper
Owner: nirantak
License: gpl-3.0
Created: 2017-08-09T01:06:31.000Z
Default Branch: main
Last Pushed: 2022-07-15T11:52:24.000Z
Last Synced: 2024-05-01T16:23:49.803Z
Topics: beautifulsoup, playwright, python, python-web-scraper, scraper, scraping, selenium
Language: Python
Homepage:
Size: 16 MB
Stars: 16
Watchers: 2
Forks: 6
Open Issues: 0
Metadata Files:
- Readme: README.md
- Funding: .github/FUNDING.yml
- License: LICENSE


# Scraper

> _Python web scrapers built using Selenium, BS4 and Playwright_

## Table of Contents

- [Scraper](#scraper)
  - [Table of Contents](#table-of-contents)
  - [Installation](#installation)
  - [Usage](#usage)
  - [Requirements](#requirements)

## Installation

Clone the git repository:

```bash
git clone https://github.com/nirantak/scraper.git && cd scraper
cp -nv .env.sample .env  # copy and update the env variables
```
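Once `.env` has been copied and edited, the scrapers need some way to read it. How this project loads the file is not documented here, so the following is only a minimal standard-library sketch of reading `KEY=VALUE` pairs; the helper names `parse_env` and `load_env` are hypothetical and are not part of the repository.

```python
from pathlib import Path


def parse_env(text: str) -> dict[str, str]:
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    values: dict[str, str] = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip("'\"")
    return values


def load_env(path: str = ".env") -> dict[str, str]:
    """Read the env file if it exists; return an empty dict otherwise."""
    env_file = Path(path)
    return parse_env(env_file.read_text()) if env_file.exists() else {}
```

Calling `load_env()` from the repository root returns the variables as a plain dictionary, which makes it easy to confirm that every value copied from `.env.sample` has been filled in.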

Install the necessary dependencies:

```bash
python3 -m venv .venv
source .venv/bin/activate
pip install -U pip wheel setuptools
pip install -U -r requirements.txt
playwright install
```
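To confirm that the installation worked, a quick check from inside the virtual environment can help. The sketch below assumes that `requirements.txt` installs the three libraries named in the project description under their usual import names (`bs4`, `selenium`, `playwright`); the function name `check_modules` is hypothetical.

```python
from importlib.util import find_spec

# Assumed import names for BS4, Selenium and Playwright.
EXPECTED_MODULES = ["bs4", "selenium", "playwright"]


def check_modules(names: list[str]) -> dict[str, bool]:
    """Map each top-level module name to whether it can be imported."""
    return {name: find_spec(name) is not None for name in names}


if __name__ == "__main__":
    for module, available in check_modules(EXPECTED_MODULES).items():
        print(f"{module}: {'ok' if available else 'MISSING'}")
```

This only verifies that the Python packages are importable. It does not check that the browser binaries downloaded by `playwright install` are present.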

## Usage

See [scrapers/README.md](scrapers/) for usage instructions.

Samples are available in [demo/](demo/).

## Requirements

1. [Python 3.10](https://www.python.org/downloads/)
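
The interpreter version can be checked before creating the virtual environment. The sketch below treats Python 3.10 as a minimum version, which is an assumption; the README does not say whether newer releases are supported. The name `meets_requirement` is hypothetical.

```python
import sys

REQUIRED = (3, 10)  # from the Requirements section; assumed to be a minimum


def meets_requirement(version=None, required=REQUIRED) -> bool:
    """Return True if the (major, minor) version is at least `required`."""
    if version is None:
        version = sys.version_info
    return tuple(version[:2]) >= tuple(required)


if __name__ == "__main__":
    if not meets_requirement():
        found = ".".join(map(str, sys.version_info[:3]))
        raise SystemExit(f"Python 3.10 or newer is expected, found {found}")
```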
