Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/sambhav/ir-system
An information retrieval system for a comparative analysis of TF-IDF and BM25 ranking mechanisms
bm25 comparative-analysis information-retrieval reddit scraper tf-idf whoosh
- Host: GitHub
- URL: https://github.com/sambhav/ir-system
- Owner: sambhav
- License: mit
- Created: 2017-08-14T11:31:34.000Z (over 7 years ago)
- Default Branch: master
- Last Pushed: 2017-08-23T14:00:08.000Z (over 7 years ago)
- Last Synced: 2024-11-24T01:41:26.243Z (2 months ago)
- Topics: bm25, comparative-analysis, information-retrieval, reddit, scraper, tf-idf, whoosh
- Language: Python
- Size: 1.35 MB
- Stars: 1
- Watchers: 6
- Forks: 2
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# IR-system
An information retrieval system for a comparative analysis of TF-IDF and BM25 ranking mechanisms

## Setting up the repo
* Clone the repo
* Create a new virtual environment using `virtualenv -p python3 venv`
* Activate the virtual environment via `source venv/bin/activate`
* Install the repo requirements via `python setup.py install`
* To scrape documents use `irs scrape`
* To create an index use `irs create_index`
* To index dumped data use `irs index_documents $JSON_PATH`
* To show results use `irs run`
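
The README does not spell out the two ranking functions being compared. As a rough illustration only, and not the repository's own code (which, judging by its topics, builds on Whoosh), the sketch below scores a hypothetical toy corpus with a plain TF-IDF sum and with the commonly cited Okapi BM25 formula. The corpus, the function names, the smoothed IDF variant, and the defaults `k1 = 1.2` and `b = 0.75` are all assumptions made for this example.

```python
import math
from collections import Counter

# Hypothetical toy corpus; not data from the repository.
DOCS = [
    "python search engine with whoosh",
    "reddit scraper written in python",
    "ranking documents with bm25 and tf idf ranking ranking",
]


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately naive tokenizer)."""
    return text.lower().split()


def idf(term, docs_tokens):
    """Smoothed inverse document frequency; always positive."""
    n = len(docs_tokens)
    df = sum(1 for tokens in docs_tokens if term in tokens)
    return math.log(1 + (n - df + 0.5) / (df + 0.5))


def tf_idf_score(query, tokens, docs_tokens):
    """Sum of raw term frequency times IDF; grows linearly with repetition."""
    counts = Counter(tokens)
    return sum(counts[t] * idf(t, docs_tokens) for t in tokenize(query))


def bm25_score(query, tokens, docs_tokens, k1=1.2, b=0.75):
    """Okapi BM25: term frequency saturates and is length-normalized."""
    counts = Counter(tokens)
    avg_len = sum(len(d) for d in docs_tokens) / len(docs_tokens)
    norm = k1 * (1 - b + b * len(tokens) / avg_len)
    score = 0.0
    for t in tokenize(query):
        tf = counts[t]
        score += idf(t, docs_tokens) * tf * (k1 + 1) / (tf + norm)
    return score


def rank(query, scorer):
    """Return document indices ordered from best to worst score."""
    docs_tokens = [tokenize(d) for d in DOCS]
    scores = [scorer(query, tokens, docs_tokens) for tokens in docs_tokens]
    return sorted(range(len(DOCS)), key=lambda i: scores[i], reverse=True)


if __name__ == "__main__":
    for name, scorer in (("TF-IDF", tf_idf_score), ("BM25", bm25_score)):
        print(name, rank("python ranking", scorer))
```

The practical difference shows up when a term repeats: the TF-IDF sum here keeps growing with each occurrence, while the BM25 contribution of a single term is bounded by `idf * (k1 + 1)` and is further dampened for documents longer than average.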