{"id":21429581,"url":"https://github.com/mayank-02/vector-space-model","last_synced_at":"2025-07-14T11:30:42.174Z","repository":{"id":188402709,"uuid":"319626614","full_name":"mayank-02/vector-space-model","owner":"mayank-02","description":"Implementation of Vector Space Model using TF-IDF and Cosine Similarity","archived":false,"fork":false,"pushed_at":"2024-08-27T13:29:48.000Z","size":11,"stargazers_count":5,"open_issues_count":0,"forks_count":2,"subscribers_count":1,"default_branch":"main","last_synced_at":"2024-08-27T14:53:59.755Z","etag":null,"topics":["cosine-similarity","information-retrieval","python","ranked-retrieval","tf-idf","vector-space-model"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/mayank-02.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null}},"created_at":"2020-12-08T12:06:15.000Z","updated_at":"2024-08-27T13:29:45.000Z","dependencies_parsed_at":null,"dependency_job_id":"c8726576-06bf-4ac4-93c7-0b809a0f7bf0","html_url":"https://github.com/mayank-02/vector-space-model","commit_stats":null,"previous_names":["mayank-02/vector-space-model"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mayank-02%2Fvector-space-model","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mayank-02%2Fvector-space-model/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mayank-02%2Fvector-space-model/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mayank-02%2Fvector-space-model/manifests","owne
r_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/mayank-02","download_url":"https://codeload.github.com/mayank-02/vector-space-model/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":225971244,"owners_count":17553461,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["cosine-similarity","information-retrieval","python","ranked-retrieval","tf-idf","vector-space-model"],"created_at":"2024-11-22T22:18:22.575Z","updated_at":"2024-11-22T22:18:22.995Z","avatar_url":"https://github.com/mayank-02.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Vector Space Model\n\nThe program `vsm.py` implements a toy search engine to illustrate the vector space model using TF-IDF for documents.\n\nThe program asks you to enter a search query, and then returns all documents from the corpus matching the query, in decreasing order of cosine similarity, according to the vector space model.\n\nThe document corpus consists of just four documents, which are product descriptions of popular books, taken from amazon.com.\n\n## Getting Started\n\n- Install Python 3.6+\n- Install all pip requirements from the `requirements.txt`:\n\n```bash\n$ python3 -m pip install -r requirements.txt\n```\n\n- To download stopwords used for the model, open your terminal or command prompt and enter following commands:\n\n```bash\n$ python3\n\u003e\u003e\u003e import nltk\n\u003e\u003e\u003e nltk.download('stopwords')\n```\n\n## Usage\n\n```bash\n$ python3 vsm.py\nSearch query \u003e\u003e lord of the 
ring\n------------------------------------------\n| Score | Document                       |\n------------------------------------------\n| 0.731 | corpus/lotr.txt                |\n| 0.118 | corpus/the_hobbit.txt          |\n------------------------------------------\n```\n\n### Queries\n\nThe program supports free-form text queries.\n\n### Corpus\n\nYou can point `CORPUS` in `vsm.py` to your own corpus to run the vector space model on it.\n\n## Authors\n\n[Mayank Jain](https://github.com/mayank-02)\n\n## License\n\nMIT\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmayank-02%2Fvector-space-model","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fmayank-02%2Fvector-space-model","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmayank-02%2Fvector-space-model/lists"}