https://github.com/gitw1n/web-scrape-wave
This project is a web scraper written in Python using the BeautifulSoup library. It is designed to collect data from websites based on user-provided URLs.
- Host: GitHub
- URL: https://github.com/gitw1n/web-scrape-wave
- Owner: GitW1n
- License: mit
- Created: 2024-12-04T14:19:44.000Z (over 1 year ago)
- Default Branch: main
- Last Pushed: 2024-12-07T11:25:38.000Z (over 1 year ago)
- Last Synced: 2025-02-07T15:12:05.709Z (over 1 year ago)
- Topics: bs4, parser, python, python3, requests, webscraper, webscraping
- Language: Python
- Homepage:
- Size: 21.5 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
## Web Scraper
This project is a web scraper written in Python using the BeautifulSoup library. It is designed to collect data from websites based on user-provided URLs.
Main features:
- Accepts a list of links to parse.
- Supports several types of objects for parsing, such as URLs and headers.
- Checks the connection status of each site.
- Saves results in different formats.
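As an illustration of the "saving results in different formats" feature, here is a minimal sketch that uses only the Python standard library. The function names `to_json` and `to_csv` and the result fields (`url`, `status`, `title`) are assumptions made for this example; they are not taken from the project's code and may differ from what `main.py` actually produces.

```python
import csv
import io
import json


def to_json(results):
    """Serialize a list of result dicts to a JSON string."""
    return json.dumps(results, ensure_ascii=False, indent=2)


def to_csv(results, fieldnames=("url", "status", "title")):
    """Serialize a list of result dicts to CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=fieldnames,
        extrasaction="ignore",  # silently drop keys not listed in fieldnames
        lineterminator="\n",
    )
    writer.writeheader()
    writer.writerows(results)
    return buffer.getvalue()


# Hypothetical scrape results; the real field names depend on the project.
results = [
    {"url": "https://example.com", "status": 200, "title": "Example Domain"},
    {"url": "https://example.org", "status": 404, "title": ""},
]

json_text = to_json(results)
csv_text = to_csv(results)
```

Both helpers return text instead of writing files, so the caller decides where the output goes (a file, standard output, or another service).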
Currently in development:
- Parsing of various types of objects.
- Support for multitasking and asynchronous parsing.
- Mechanisms for error handling and recovery from failures.
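Since concurrent parsing and failure recovery are still in development, the following is only a sketch of one possible approach, not the project's implementation. It uses a thread pool and a simple retry loop from the standard library, with `urllib` standing in for `requests` so the example runs without extra dependencies. The names `fetch_status`, `fetch_with_retry`, and `fetch_all` are hypothetical.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from urllib.request import urlopen


def fetch_status(url, timeout=10):
    """Return the HTTP status code for url.

    urlopen raises on network errors and on 4xx/5xx responses.
    """
    with urlopen(url, timeout=timeout) as response:
        return response.status


def fetch_with_retry(url, fetch=fetch_status, retries=3, delay=0.5):
    """Call fetch(url), making at most `retries` attempts before giving up."""
    last_error = None
    for attempt in range(retries):
        try:
            return {"url": url, "status": fetch(url), "error": None}
        except Exception as exc:  # broad on purpose: this is only a sketch
            last_error = exc
            if attempt < retries - 1:
                time.sleep(delay * (attempt + 1))  # linear backoff
    return {"url": url, "status": None, "error": str(last_error)}


def fetch_all(urls, fetch=fetch_status, max_workers=5, retries=3, delay=0.5):
    """Check many URLs in parallel threads, preserving input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(
            pool.map(lambda u: fetch_with_retry(u, fetch, retries, delay), urls)
        )
```

A call such as `fetch_all(["https://example.com", "https://example.org"])` would return one dict per URL, in input order. A URL that keeps failing yields `status` set to `None` and the last error message instead of stopping the whole run. The `fetch` parameter can be swapped for any callable, for example one built on `requests`.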
## Installation and launch
1. Clone the repository:
```
git clone https://github.com/GitW1n/Web-Scrape-Wave.git
```
2. Install dependencies:
```
pip install -r requirements.txt
```
3. Run the project:
```
python main.py
```
## Notes
The project is not yet finished, and its functionality may change.
To propose new features or report bugs, please open an issue or submit a pull request.