Indexes Word documents after stopword removal and lemmatization, and supports simple Boolean conjunctive (AND) queries over the index.
https://github.com/clydedacruz/worddoc-indexer-py


# worddoc-indexer-py

## Dependencies
Python version: use Python 3.5 or later.

To install the dependencies, run: `pip install nltk`

Then download the NLTK WordNet data by running the following at the Python prompt:
```
import nltk
nltk.download('wordnet')
```

## Usage

To create the index: `python create_index.py data`

To query the index: `python query_index.py ...`
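
## Example: conjunctive query over an inverted index

The following is a minimal sketch of the general technique the project description names (stopword removal, term normalization, and a Boolean AND query over an inverted index). It is not the code from `create_index.py` or `query_index.py`. The stopword list, the `normalize` function (a crude stand-in for lemmatization), the function names, and the sample documents are all assumptions made for illustration, and the documents are plain strings rather than parsed Word files.

```python
import re
from collections import defaultdict

# Hypothetical, deliberately tiny stopword list (an assumption for illustration).
STOPWORDS = {"a", "an", "and", "the", "of", "in", "on", "is", "are", "to"}


def normalize(token):
    """Stand-in for lemmatization: lowercase and strip a trailing plural 's'.

    A real pipeline would call a proper lemmatizer here instead.
    """
    token = token.lower()
    if len(token) > 3 and token.endswith("s") and not token.endswith("ss"):
        token = token[:-1]
    return token


def tokenize(text):
    """Split text into normalized terms, dropping stopwords."""
    words = re.findall(r"[A-Za-z]+", text)
    return [normalize(w) for w in words if w.lower() not in STOPWORDS]


def build_index(docs):
    """Map each term to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in tokenize(text):
            index[term].add(doc_id)
    return index


def conjunctive_query(index, query):
    """Return the ids of documents that contain every query term (AND)."""
    terms = tokenize(query)
    if not terms:
        return set()
    postings = [index.get(term, set()) for term in terms]
    return set.intersection(*postings)


# Hypothetical sample documents, keyed by file name.
docs = {
    "report.docx": "The cats are sleeping in the garden",
    "notes.docx": "A cat and a dog",
    "memo.docx": "Dogs in the garden",
}
index = build_index(docs)
print(sorted(conjunctive_query(index, "cat garden")))  # prints ['report.docx']
```

Because the query is conjunctive, a document is returned only if it contains every remaining query term after stopwords are dropped, so a single term that is missing from the index yields an empty result.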