https://github.com/andrewrporter/wikipedia-stoplist

Build robust StopLists from Wikipedia articles

natural-language-processing nlp python stoplist wikipedia wikipedia-api


wikipedia-stoplist
==================

This project explores building robust stoplists from the contents of Wikipedia articles.

Usage
=====

`$ python main.py --num-pages 50 --term-freq 0.6 --limit 200`

This generates a CSV file called `output.csv`. You can pass that file back in to be analyzed again
with:

`$ python main.py --input output.csv --term-freq 0.6 --limit 200`
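
The README does not say how `--term-freq` and `--limit` are interpreted, so the following is only a minimal sketch of one plausible reading: treat `term_freq` as the minimum fraction of documents a term must appear in, and `limit` as the maximum size of the stoplist. The function name `build_stoplist`, the tokenizer, and the sample documents are hypothetical and are not taken from `main.py`.

```python
import re
from collections import Counter


def build_stoplist(documents, term_freq=0.6, limit=200):
    """Return up to `limit` terms found in at least `term_freq` of the documents.

    Assumption: `term_freq` is a document-frequency threshold in [0, 1].
    This is an illustrative guess, not the project's actual logic.
    """
    n_docs = len(documents)
    if n_docs == 0:
        return []

    doc_counts = Counter()
    for text in documents:
        # Count each term once per document (document frequency).
        doc_counts.update(set(re.findall(r"[a-z']+", text.lower())))

    frequent = [
        (term, count / n_docs)
        for term, count in doc_counts.items()
        if count / n_docs >= term_freq
    ]
    # Most widespread terms first; ties broken alphabetically for stable output.
    frequent.sort(key=lambda pair: (-pair[1], pair[0]))
    return [term for term, _ in frequent[:limit]]


# Stand-in texts; the real tool works on Wikipedia article contents.
docs = [
    "The cat sat on the mat.",
    "The dog chased the cat.",
    "A bird sang in the tree.",
]
print(build_stoplist(docs, term_freq=0.6, limit=5))  # ['the', 'cat']
```

Under this reading, raising `term_freq` keeps only words that appear in nearly every article, while `limit` caps the list regardless of how many terms pass the threshold.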