
https://github.com/carloocchiena/python_url_crawler

A script that, starting from a webpage, iterates through all its links and appends them to a list. A sort of proxy for getting all the pages in a website.

beautifulsoup crawler python python3




# python_url_crawler
A script that, starting from a webpage, iterates through all its links and appends them to a list. A sort of proxy for getting all the pages in a website.
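
To give an idea of the approach, here is a minimal sketch of such a crawl. It is not the code in main.py: it uses the standard library's `html.parser` instead of BeautifulSoup so that it runs without extra dependencies, and the names `LinkParser`, `fetch`, `crawl` and `max_pages` are assumptions made up for this example.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin, urlparse
from urllib.request import urlopen


class LinkParser(HTMLParser):
    """Collect the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def fetch(url):
    """Download a page and return its HTML as text (hypothetical helper)."""
    with urlopen(url, timeout=10) as response:
        return response.read().decode("utf-8", errors="replace")


def crawl(start_url, fetch_page=fetch, max_pages=100):
    """Breadth-first walk over same-domain links, returning the URLs found."""
    domain = urlparse(start_url).netloc
    found = [start_url]
    seen = {start_url}
    queue = deque([start_url])
    visited = 0
    while queue and visited < max_pages:
        page = queue.popleft()
        visited += 1
        try:
            html = fetch_page(page)
        except (OSError, ValueError):
            continue  # skip pages that cannot be downloaded
        parser = LinkParser()
        parser.feed(html)
        for href in parser.links:
            # Resolve relative links and drop the #fragment part.
            url, _fragment = urldefrag(urljoin(page, href))
            if urlparse(url).netloc == domain and url not in seen:
                seen.add(url)
                found.append(url)
                queue.append(url)
    return found
```

Calling `crawl("https://example.com/")` would return the list of URLs discovered within that domain. The `max_pages` cap is only there to keep the sketch from running forever on a large site.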

old_main is a rough first version I put together in about an hour, starting from a Stack Overflow question.

main.py is a considerably better version that I wrote from scratch, with less code entropy. It seems to work decently.

Note that the script aims to find only URLs within the starting domain, but this can easily be changed by tweaking the "cleaner" function.
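
As a rough illustration of what such a tweak could look like, here is a hypothetical stand-in for a "cleaner" step. The signature and the `same_domain_only` flag are assumptions made for this example and may not match the function in main.py.

```python
from urllib.parse import urljoin, urlparse


def cleaner(links, base_url, same_domain_only=True):
    """Resolve raw hrefs against base_url and drop duplicates.

    Hypothetical stand-in for the repository's "cleaner" function; the
    same_domain_only flag is an assumption made for this example.
    """
    base_domain = urlparse(base_url).netloc
    cleaned = []
    for href in links:
        url = urljoin(base_url, href)
        parts = urlparse(url)
        if parts.scheme not in ("http", "https"):
            continue  # skip mailto:, javascript:, and similar links
        if same_domain_only and parts.netloc != base_domain:
            continue  # external link, outside the starting domain
        if url not in cleaned:
            cleaned.append(url)
    return cleaned


links = ["/about", "https://example.com/about", "https://other.org/"]
print(cleaner(links, "https://example.com/"))
# ['https://example.com/about']
print(cleaner(links, "https://example.com/", same_domain_only=False))
# ['https://example.com/about', 'https://other.org/']
```

With the flag set to `False`, external links are kept as well, which is the kind of change the note above refers to.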