https://github.com/polycatdev/pyscraper
A quick HTML scraper I built in Python for learning purposes.
- Host: GitHub
- URL: https://github.com/polycatdev/pyscraper
- Owner: PolyCatDev
- License: mit
- Created: 2024-03-04T07:50:50.000Z (almost 2 years ago)
- Default Branch: main
- Last Pushed: 2025-07-25T13:06:13.000Z (5 months ago)
- Last Synced: 2025-07-25T20:06:48.490Z (5 months ago)
- Topics: learning, pyhton3, scraper
- Language: Python
- Homepage:
- Size: 26.4 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# pyScarper
A simple HTML web scraper made in Python that I built for learning purposes.
# Contributing
If anyone wants to take this further, bug reports, feature requests and PRs are welcome.
# System Dependencies
1. `python-tkinter`
> [!WARNING]
> Tk must be installed through your system package manager (apt, pacman, dnf, etc.) if it didn't come with your Python installation.
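If you are not sure whether Tk came with your Python installation, a quick check like the one below may help. This is only a sketch: the helper name `tk_available` is made up for illustration and is not part of the project.

```python
def tk_available() -> bool:
    """Return True if the tkinter module can be imported."""
    try:
        # The import fails with ImportError when Tk support is missing.
        import tkinter  # noqa: F401
    except ImportError:
        return False
    return True


print("Tk available:", tk_available())
```

If this prints `False`, install the Tk package for your distribution before running the scraper.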
# How to run it?
### Pull down the code
```bash
git clone https://github.com/PolyCatDev/pyScarper && \
cd pyScarper
```
### Create a Python virtual environment and activate it
```bash
python3 -m venv .venv && \
source .venv/bin/activate
```
### Install the dependencies
```bash
pip3 install poetry && poetry install
```
### Run the main.py file
```bash
python3 main.py
```
# How to use it?
1. Paste your link in the text box.
2. Press the download button.
3. A file called `website.html` will appear in the project folder.
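As a rough illustration of what the download step amounts to, the sketch below fetches a page and writes it to `website.html`. It uses only the standard library's `urllib`. The function name `save_page`, the timeout, and the User-Agent header are assumptions for illustration and may not match what `main.py` actually does.

```python
from pathlib import Path
from urllib.request import Request, urlopen


def save_page(url: str, out_path: str = "website.html") -> Path:
    """Download the raw content at `url` and write it to `out_path`."""
    # Assumption: a browser-like User-Agent, since some sites reject
    # requests that do not send one.
    request = Request(url, headers={"User-Agent": "Mozilla/5.0"})
    with urlopen(request, timeout=10) as response:
        html = response.read()  # raw bytes, written unchanged

    target = Path(out_path)
    target.write_bytes(html)
    return target
```

Calling `save_page("https://quotes.toscrape.com/")` from the project folder should leave a `website.html` file next to `main.py`, provided the site is reachable.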
# What websites can I scrape?
Any website, really. Here is one to try: https://quotes.toscrape.com/