https://github.com/eklem/wikipedia-stopword-crawler

# wikipedia-stopword-crawler

A Wikipedia text crawler to create [stopword](https://github.com/fergiemcdowall/stopword) lists for any language in the world.

The crawler visits all main pages and extracts their text, which [stopword-trainer](https://github.com/eklem/stopword-trainer) then analyzes to produce a stopword list.
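
To illustrate the general idea of the analysis step, here is a minimal sketch in Python. It is not the code used by this project or by stopword-trainer. It assumes that the extracted page text is already available as a list of strings, and that simple frequency counting is a reasonable stand-in for the analysis. The function name `stopword_candidates`, the `top_n` parameter, and the sample texts are all hypothetical.

```python
import re
from collections import Counter


def stopword_candidates(texts, top_n=10):
    """Return the top_n most frequent words across all texts.

    Hypothetical helper for illustration only; it is not part of
    wikipedia-stopword-crawler or stopword-trainer.
    """
    counts = Counter()
    for text in texts:
        # Keep runs of Unicode letters only (no digits, no underscores),
        # so the same pattern works for non-English text.
        words = re.findall(r"[^\W\d_]+", text.lower())
        counts.update(words)
    return [word for word, _ in counts.most_common(top_n)]


# Made-up sample data standing in for text extracted from crawled pages.
pages = [
    "The cat sat on the mat.",
    "The dog chased the cat.",
    "A bird sang in the tree.",
]

print(stopword_candidates(pages, top_n=2))  # ['the', 'cat']
```

Words that occur very often across many pages, such as "the" in the sample above, are the likeliest stopword candidates. A real analysis would need far more text and probably a more careful ranking than raw counts.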

Developed by [Espen Klem](mailto:[email protected])