https://github.com/0memo07/web-crawler

# Python Web Crawler

This Python project is a web crawler that starts from a URL you specify. It follows the links it finds while navigating from that URL and keeps a detailed log entry for each URL visited.

## Features

- Crawls a large number of HTML pages starting from a single URL.
- Finds links in the fetched HTML content and queues those URLs to be visited.
- Lets you set the maximum crawl depth.
- Keeps a list of visited URLs and does not revisit the same URL.
- Handles errors and exceptions with appropriate error messages.
- Uses colored log output.
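
The features above can be illustrated with a minimal sketch. This is not the project's actual `crawler.py`, whose structure may differ. The names `LinkParser`, `fetch`, and `crawl` are hypothetical, and the sketch uses only the standard library (`html.parser`, `urllib`) so that it runs without extra dependencies. It shows a breadth-first crawl with a depth limit, a visited set, and basic error handling; colored logging is left out and replaced by plain `print` calls.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin
from urllib.request import urlopen


class LinkParser(HTMLParser):
    """Collects the href value of every <a> tag (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def fetch(url):
    """Download a page and return its text (hypothetical helper)."""
    with urlopen(url, timeout=10) as response:
        return response.read().decode("utf-8", errors="replace")


def crawl(start_url, max_depth=2, fetch=fetch):
    """Breadth-first crawl from start_url, at most max_depth links away."""
    visited = set()
    queue = deque([(start_url, 0)])
    while queue:
        url, depth = queue.popleft()
        # Skip URLs already seen and anything beyond the depth limit.
        if url in visited or depth > max_depth:
            continue
        visited.add(url)
        try:
            html = fetch(url)
        except Exception as exc:
            # Report the failure and keep crawling the remaining URLs.
            print(f"[ERROR] {url}: {exc}")
            continue
        print(f"[depth {depth}] {url}")
        parser = LinkParser()
        parser.feed(html)
        for href in parser.links:
            # Resolve relative links and drop "#fragment" parts.
            absolute, _fragment = urldefrag(urljoin(url, href))
            if absolute.startswith(("http://", "https://")) and absolute not in visited:
                queue.append((absolute, depth + 1))
    return visited
```

Calling `crawl("https://example.com", max_depth=1)` would fetch the start page and the pages it links to directly, then return the set of URLs it attempted. Passing a different `fetch` callable makes it possible to exercise the crawl logic without network access.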

## Usage
1. Clone the project:

```bash
git clone https://github.com/0MeMo07/Web-Crawler.git
```
2. Go to the project directory:
```bash
cd Web-Crawler
```
3. Install the required dependencies:
```bash
pip install -r requirements.txt
```
4. Run the crawler:
```bash
python crawler.py
```

## Support me

Buy Me A Coffee