Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/eklem/wikipedia-stopword-crawler
A Wikipedia text crawler to create stopword lists for any language in the world.
- Host: GitHub
- URL: https://github.com/eklem/wikipedia-stopword-crawler
- Owner: eklem
- License: MIT
- Created: 2018-04-11T07:11:03.000Z (over 6 years ago)
- Default Branch: master
- Last Pushed: 2022-12-07T10:15:27.000Z (almost 2 years ago)
- Last Synced: 2023-02-26T12:26:06.708Z (over 1 year ago)
- Language: JavaScript
- Size: 3.61 MB
- Stars: 0
- Watchers: 2
- Forks: 0
- Open Issues: 10
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# wikipedia-stopword-crawler
A Wikipedia text crawler to create [stopword](https://github.com/fergiemcdowall/stopword) lists for any language in the world.
It crawls all main pages and extracts their text, which can then be analyzed for stopwords with [stopword-trainer](https://github.com/eklem/stopword-trainer).
Developed by Espen Klem.
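
To illustrate the idea of extracting text from a crawled page for stopword analysis, here is a minimal sketch. It is not the crawler's actual code: the function names `extractWords` and `countWords` are hypothetical, and the tag stripping is deliberately naive (it does not decode HTML entities). It assumes the page HTML has already been fetched as a string.

```javascript
// Hypothetical helpers for illustration; not taken from this repository.

// Strip markup from an HTML string and return lowercase word tokens.
function extractWords(html) {
  const text = html
    .replace(/<(script|style)[^>]*>[\s\S]*?<\/\1>/gi, ' ') // drop script/style bodies
    .replace(/<[^>]+>/g, ' ');                              // drop remaining tags
  // \p{L} and \p{M} match letters and combining marks in any script,
  // so the tokenizer is not limited to ASCII text.
  return text.toLowerCase().match(/[\p{L}\p{M}]+/gu) || [];
}

// Count occurrences of each word; the most frequent words are
// the usual candidates for a stopword list.
function countWords(words) {
  const counts = new Map();
  for (const word of words) {
    counts.set(word, (counts.get(word) || 0) + 1);
  }
  return counts;
}

const words = extractWords('<p>Det er <b>en</b> bil. Det er rødt.</p>');
const counts = countWords(words);
console.log([...counts.entries()]);
```

In a real pipeline the counts would be accumulated over many pages before any cut-off is chosen, since a single page is too small a sample to separate stopwords from topic words.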