https://github.com/sidhantpanda/wiki-scraper
A wikipedia category scraper
Last synced: 23 days ago
- Host: GitHub
- URL: https://github.com/sidhantpanda/wiki-scraper
- Owner: sidhantpanda
- Created: 2014-05-02T00:51:31.000Z (over 10 years ago)
- Default Branch: master
- Last Pushed: 2014-05-03T16:09:03.000Z (over 10 years ago)
- Last Synced: 2024-10-03T22:41:50.864Z (about 1 month ago)
- Language: Python
- Size: 168 KB
- Stars: 0
- Watchers: 3
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# Wiki Scraper
This is an implementation of the wiki tools for fetching the articles in a specific Wikipedia category.
Edit the categories list in poc.py to add or remove categories of your choice.
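The README does not show how poc.py looks up the articles in a category. As a rough sketch only, the public MediaWiki API can list the members of a category through its `categorymembers` query. The `categories` values and the `build_category_query` helper below are hypothetical and are not taken from poc.py.

```python
from urllib.parse import urlencode

# Public MediaWiki API endpoint for English Wikipedia.
API_ENDPOINT = "https://en.wikipedia.org/w/api.php"

# Hypothetical example values; poc.py defines its own categories list.
categories = ["Physics", "Chemistry"]


def build_category_query(category, limit=50):
    """Build a URL that asks the MediaWiki API for the pages in a category."""
    params = {
        "action": "query",
        "list": "categorymembers",
        "cmtitle": "Category:" + category,  # the API expects the "Category:" prefix
        "cmlimit": limit,                   # maximum number of members per request
        "format": "json",
    }
    return API_ENDPOINT + "?" + urlencode(params)


if __name__ == "__main__":
    for name in categories:
        print(build_category_query(name))
```

Fetching each printed URL returns a JSON document whose `query.categorymembers` entries carry the page titles. Downloading the article text is a separate step and is not shown here.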
## How to run
In the terminal, navigate to the repo and run:
$ python poc.py
This will create a directory called "data" (ignored in .gitignore), create one subdirectory per category inside it, and save the articles in a text file in the respective directory.
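The output layout described above could be produced along these lines. This is a minimal Python 3 sketch, not the code in poc.py: the `save_articles` helper, the `articles.txt` file name, and the one-file-per-category layout are all assumptions made for illustration.

```python
import os


def save_articles(category, articles, base_dir="data"):
    """Write (title, text) pairs to <base_dir>/<category>/articles.txt.

    The file name "articles.txt" is an assumption; the README only says
    that articles are saved in a text file inside each category directory.
    """
    category_dir = os.path.join(base_dir, category)
    # Create data/<category>/ if it does not exist yet.
    os.makedirs(category_dir, exist_ok=True)

    out_path = os.path.join(category_dir, "articles.txt")
    with open(out_path, "w", encoding="utf-8") as handle:
        for title, text in articles:
            # Separate articles with a blank line.
            handle.write(title + "\n" + text + "\n\n")
    return out_path


if __name__ == "__main__":
    sample = [("Example title", "Example article text.")]
    print(save_articles("Physics", sample))
```

Running the sketch leaves a `data/Physics/articles.txt` file in the current directory, mirroring the structure the README describes.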