https://github.com/dizys/nyu-nlp-homework-4

NYU NLP Homework 4: Ad Hoc Information Retrieval system using TF-IDF weights and cosine similarity scores

cosine-similarity information-retrieval nlp nyu tf-idf

NYU NLP Homework 4: Implement an ad hoc information retrieval system
using TF-IDF weights and cosine similarity scores.
Improved with word stemming and stop-word removal.
by Ziyang Zeng (zz2960)
Spring 2022
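
To make the approach concrete, here is a minimal sketch of TF-IDF ranking with cosine similarity. It is not the code from this repository: the function names, the tiny stop-word list, and the sample documents are all hypothetical, and the weighting (raw term counts times log(N / df)) is just one common choice. Stemming is omitted to keep the sketch dependency-free.

```python
import math
import re
from collections import Counter

# Tiny illustrative stop list (an assumption); a real system would use a fuller one.
STOP_WORDS = {"a", "an", "and", "in", "is", "of", "on", "the", "to"}


def tokenize(text):
    """Lowercase, keep alphabetic runs, and drop stop words."""
    return [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOP_WORDS]


def build_idf(docs_tokens):
    """Inverse document frequency: log(N / df) for each term in the corpus."""
    n_docs = len(docs_tokens)
    df = Counter()
    for tokens in docs_tokens:
        df.update(set(tokens))  # count each term once per document
    return {term: math.log(n_docs / count) for term, count in df.items()}


def tfidf_vector(tokens, idf):
    """Sparse TF-IDF vector (dict), using raw term counts as TF."""
    tf = Counter(tokens)
    # Terms unseen in the corpus have no IDF and are ignored.
    return {term: count * idf[term] for term, count in tf.items() if term in idf}


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v[t] for t, w in u.items() if t in v)
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank(query, docs):
    """Return (doc_index, score) pairs sorted by descending similarity."""
    docs_tokens = [tokenize(d) for d in docs]
    idf = build_idf(docs_tokens)
    doc_vecs = [tfidf_vector(t, idf) for t in docs_tokens]
    q_vec = tfidf_vector(tokenize(query), idf)
    scores = [(i, cosine(q_vec, dv)) for i, dv in enumerate(doc_vecs)]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Hypothetical sample documents, not taken from the Cranfield data.
docs = [
    "the boundary layer in supersonic flow",
    "heat transfer on a flat plate",
    "supersonic flow over a wing",
]
ranking = rank("boundary layer flow", docs)
```

With these three sample documents, the first one shares the most heavily weighted terms with the query and ranks first, while the second shares none and scores zero.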

Prerequisites:
- Python 3.8+

Install dependencies:
`pip3 install -r requirements.txt`

How to run:
`python3 main_zz2960_HW4.py --help` prints:
usage: main_zz2960_HW4.py [-h] [-o OUTPUTFILE] articlefile queryfile

An ad hoc information retrieval system using TF-IDF weights and cosine similarity scores.

positional arguments:
  articlefile    input article corpus file
  queryfile      input query file

optional arguments:
  -h, --help     show this help message and exit
  -o OUTPUTFILE  path for storing output file, default is output.txt
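
For reference, a command-line interface with this shape could be declared with `argparse` roughly as follows. This is a sketch reconstructed from the help text above, not the script's actual source; `build_parser` and the `outputfile` attribute name are assumptions.

```python
import argparse


def build_parser():
    """Build a parser matching the help text shown above (hypothetical reconstruction)."""
    parser = argparse.ArgumentParser(
        prog="main_zz2960_HW4.py",
        description=(
            "An ad hoc information retrieval system using TF-IDF weights "
            "and cosine similarity scores."
        ),
    )
    parser.add_argument("articlefile", help="input article corpus file")
    parser.add_argument("queryfile", help="input query file")
    parser.add_argument(
        "-o",
        dest="outputfile",  # attribute name is an assumption
        metavar="OUTPUTFILE",
        default="output.txt",
        help="path for storing output file, default is output.txt",
    )
    return parser


# Parse a sample argument list instead of sys.argv so the sketch runs anywhere.
args = build_parser().parse_args(["data/cran.all.1400", "data/cran.qry"])
```

Note that on Python 3.10 and later, `argparse` labels the second group `options:` rather than `optional arguments:`, so the help output may differ slightly from the listing above.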

Example:
`python3 main_zz2960_HW4.py data/cran.all.1400 data/cran.qry -o output.txt`