https://github.com/dizys/nyu-nlp-homework-4
NYU NLP Homework 4: Ad Hoc Information Retrieval system using TF-IDF weights and cosine similarity scores
- Host: GitHub
- URL: https://github.com/dizys/nyu-nlp-homework-4
- Owner: dizys
- License: mit
- Created: 2022-02-18T02:23:18.000Z (almost 3 years ago)
- Default Branch: main
- Last Pushed: 2022-03-01T01:08:47.000Z (almost 3 years ago)
- Last Synced: 2024-12-20T20:03:11.457Z (21 days ago)
- Topics: cosine-similarity, information-retrieval, nlp, nyu, tf-idf
- Language: Jupyter Notebook
- Homepage:
- Size: 1.8 MB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.txt
- License: LICENSE
README
NYU NLP Homework 4: Implement an ad hoc information retrieval system
using TF-IDF weights and cosine similarity scores.
Improved with word stemming and stop-word removal.
by Ziyang Zeng (zz2960)
Spring 2022

Pre-requisites:
- Python 3.8+

Install dependencies:
`pip3 install -r requirements.txt`

How to run:
`python3 main_zz2960_HW4.py --help` will give you:

usage: main_zz2960_HW4.py [-h] [-o OUTPUTFILE] articlefile queryfile

An ad hoc information retrieval system using TF-IDF weights and cosine similarity scores.

positional arguments:
  articlefile    input article corpus file
  queryfile      input query file

optional arguments:
  -h, --help     show this help message and exit
  -o OUTPUTFILE  path for storing output file, default is output.txt

Example:
`python3 main_zz2960_HW4.py data/cran.all.1400 data/cran.qry -o output.txt`
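
For readers unfamiliar with the technique named above, here is a minimal sketch of TF-IDF weighting with cosine-similarity ranking, including stop-word removal and stemming. It is not the code in main_zz2960_HW4.py. Every name in it (`STOP_WORDS`, `crude_stem`, `tokenize`, `idf_table`, `tfidf_vector`, `cosine`, `rank`) is hypothetical, the stop-word list is deliberately tiny, the stemmer is a toy suffix stripper standing in for a real one, and the weighting (raw term count times log(N / df)) is one common variant chosen here as an assumption.

```python
import math
import re
from collections import Counter

# Hypothetical, deliberately tiny stop-word list; a real system would use a fuller one.
STOP_WORDS = {"a", "an", "the", "of", "in", "on", "and", "or", "is", "are", "to", "for"}


def crude_stem(token):
    """Toy suffix stripper standing in for a real stemmer (assumption, not the author's)."""
    for suffix in ("ing", "es", "s"):
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def tokenize(text):
    """Lowercase, keep alphabetic runs, drop stop words, then stem."""
    words = re.findall(r"[a-z]+", text.lower())
    return [crude_stem(w) for w in words if w not in STOP_WORDS]


def idf_table(docs_tokens):
    """IDF(t) = log(N / df(t)), where df(t) counts the documents containing t."""
    n_docs = len(docs_tokens)
    df = Counter()
    for tokens in docs_tokens:
        df.update(set(tokens))
    return {term: math.log(n_docs / count) for term, count in df.items()}


def tfidf_vector(tokens, idf):
    """Sparse TF-IDF vector: raw term count times IDF; terms unseen in the corpus are skipped."""
    counts = Counter(tokens)
    return {t: c * idf[t] for t, c in counts.items() if t in idf}


def cosine(u, v):
    """Cosine similarity of two sparse vectors stored as dicts."""
    dot = sum(w * v[t] for t, w in u.items() if t in v)
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank(query, docs):
    """Return (doc_index, score) pairs sorted by descending cosine similarity."""
    docs_tokens = [tokenize(d) for d in docs]
    idf = idf_table(docs_tokens)
    doc_vecs = [tfidf_vector(t, idf) for t in docs_tokens]
    query_vec = tfidf_vector(tokenize(query), idf)
    scores = [(i, cosine(query_vec, vec)) for i, vec in enumerate(doc_vecs)]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    # Made-up miniature corpus, only to exercise the functions.
    docs = [
        "boundary layer flow over a flat plate",
        "heat transfer in supersonic flow",
        "structural vibration of wings",
    ]
    for doc_id, score in rank("supersonic heat transfer", docs):
        print(doc_id, round(score, 3))
```

Storing vectors as dicts keeps them sparse, so the dot product only touches terms the query and document share. A term that appears in every document gets an IDF of zero and therefore contributes nothing to the score.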