## Web Scraper
This project is a web scraper written in Python using the BeautifulSoup library. It is designed to collect data from websites based on user-provided URLs.
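
As a rough illustration of the kind of extraction described above, the following sketch parses an HTML string with BeautifulSoup and collects link URLs and heading texts. It is not taken from the project's code: the function name `extract_links_and_headings`, the sample HTML, and the reading of "headers" as the HTML heading tags `h1` to `h6` are all assumptions made for this example.

```python
from urllib.parse import urljoin

from bs4 import BeautifulSoup


def extract_links_and_headings(html, base_url):
    """Return absolute link URLs and heading texts found in an HTML string.

    Hypothetical helper for illustration; not part of the project.
    """
    soup = BeautifulSoup(html, "html.parser")
    # Resolve relative hrefs against the page URL so every link is absolute.
    links = [urljoin(base_url, a["href"]) for a in soup.find_all("a", href=True)]
    # Assumption: "headers" means the HTML heading tags h1 to h6.
    headings = [
        tag.get_text(strip=True)
        for tag in soup.find_all(["h1", "h2", "h3", "h4", "h5", "h6"])
    ]
    return {"url": base_url, "links": links, "headings": headings}


# Inline sample page, so the sketch runs without network access.
SAMPLE_HTML = """
<html><body>
  <h1>Example page</h1>
  <a href="/about">About</a>
  <a href="https://example.org/docs">Docs</a>
  <h2>Details</h2>
</body></html>
"""

result = extract_links_and_headings(SAMPLE_HTML, "https://example.com/")
print(result["links"])
print(result["headings"])
```

In a real run the HTML would come from an HTTP response, for example `requests.get(url, timeout=10).text`, rather than from an inline string.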

Main features:
- Accepts a list of links to parse.
- Parses several types of objects, such as URLs and headers.
- Checks the connection status of each site.
- Saves results in different formats.

Currently in development:
- Parsing of further object types.
- Multitasking and asynchronous parsing.
- Error handling and recovery from failures.
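
The README does not say which output formats are supported. As a minimal sketch of saving results in different formats, the example below writes the same records to JSON or CSV, depending on the file extension. The function name `save_results` and the record fields `url`, `status`, and `title` are hypothetical and chosen only for this example; the project may store different data.

```python
import csv
import json
import tempfile
from pathlib import Path

# Hypothetical record layout; the project's real fields are not documented.
FIELDS = ["url", "status", "title"]


def save_results(results, path):
    """Write scrape results to JSON or CSV, chosen by the file extension."""
    target = Path(path)
    suffix = target.suffix.lower()
    if suffix == ".json":
        text = json.dumps(results, indent=2, ensure_ascii=False)
        target.write_text(text, encoding="utf-8")
    elif suffix == ".csv":
        # newline="" lets the csv module control line endings itself.
        with target.open("w", newline="", encoding="utf-8") as handle:
            writer = csv.DictWriter(handle, fieldnames=FIELDS)
            writer.writeheader()
            writer.writerows(results)
    else:
        raise ValueError(f"Unsupported format: {suffix}")


sample = [
    {"url": "https://example.com/", "status": 200, "title": "Example Domain"},
    {"url": "https://example.org/", "status": 404, "title": ""},
]

# Write both formats to a temporary directory and read them back.
with tempfile.TemporaryDirectory() as tmp:
    json_path = Path(tmp) / "results.json"
    csv_path = Path(tmp) / "results.csv"
    save_results(sample, json_path)
    save_results(sample, csv_path)
    loaded = json.loads(json_path.read_text(encoding="utf-8"))
    csv_lines = csv_path.read_text(encoding="utf-8").splitlines()

print(loaded == sample)
print(csv_lines[0])
```

Choosing the format from the file extension keeps the call site simple; an explicit `format` argument would work equally well.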

## Installation and launch
1. Clone the repository and enter its directory:
```
git clone https://github.com/GitW1n/Web-Scrape-Wave.git
cd Web-Scrape-Wave
```
2. Install dependencies:
```
pip install -r requirements.txt
```
3. Run the project:
```
python main.py
```

## Notes
The project is still in development, and its functionality may change.
To propose a new feature or report a bug, please open an issue or submit a pull request.