Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/wisehackermonkey/webscraper
simple script that scrapes top 1 million websites
Last synced: 1 day ago
- Host: GitHub
- URL: https://github.com/wisehackermonkey/webscraper
- Owner: wisehackermonkey
- License: mit
- Created: 2020-10-11T00:15:49.000Z (about 4 years ago)
- Default Branch: main
- Last Pushed: 2021-05-18T03:41:11.000Z (over 3 years ago)
- Last Synced: 2024-11-10T12:44:13.505Z (about 2 months ago)
- Language: Python
- Size: 6.13 MB
- Stars: 0
- Watchers: 2
- Forks: 1
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# webscraper
A simple script that scrapes the top 1 million websites.

![Screenshot_3](/assets/Screenshot_3_1bddlytot.png)
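
The repository's own code is not reproduced here, so the following is only a minimal sketch of the first step such a script needs: turning the Alexa `top-1m.csv.zip` archive listed under sources into a list of sites to visit. It assumes the archive holds a single member named `top-1m.csv` with `rank,domain` rows and no header. The function names `parse_top_sites`, `read_top_sites_zip`, and `to_urls` are hypothetical and are not taken from the project.

```python
import csv
import io
import zipfile


def parse_top_sites(csv_text, limit=None):
    """Parse 'rank,domain' rows into a list of (rank, domain) tuples.

    Assumption: each valid row has exactly two fields and an integer rank.
    Blank or malformed rows are skipped instead of raising.
    """
    rows = []
    for record in csv.reader(io.StringIO(csv_text)):
        if limit is not None and len(rows) >= limit:
            break
        if len(record) != 2:
            continue  # blank line or unexpected column count
        rank, domain = record
        try:
            rows.append((int(rank), domain.strip().lower()))
        except ValueError:
            continue  # rank was not a number (for example a header row)
    return rows


def read_top_sites_zip(zip_bytes, member="top-1m.csv", limit=None):
    """Read the CSV member out of an in-memory zip archive and parse it.

    Assumption: the member name is 'top-1m.csv'.
    """
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        csv_text = archive.read(member).decode("utf-8")
    return parse_top_sites(csv_text, limit=limit)


def to_urls(sites, scheme="http"):
    """Build one start URL per (rank, domain) tuple."""
    return [f"{scheme}://{domain}" for _, domain in sites]
```

The archive bytes can come from any HTTP client or from a file already on disk. Passing a small `limit` keeps a first test run short before attempting the full list.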
Introduction
============
TODO

Usage
============
TODO

Future Feature
========
- [Scrapy | A Fast and Powerful Scraping and Web Crawling Framework](https://scrapy.org/)

# sources
- [How to Download a List of All Registered Domain Names | Hacker News](https://news.ycombinator.com/item?id=10367342)
- http://s3.amazonaws.com/alexa-static/top-1m.csv.zip

Author
======
- wisehackermonkey