https://github.com/tomaston1996/doc-similarity-api
📄 Document similarity matcher using NLP | FastAPI | SpaCy
docker fastapi nlp postgresql redis
- Host: GitHub
- URL: https://github.com/tomaston1996/doc-similarity-api
- Owner: TomAston1996
- Created: 2024-12-14T21:25:51.000Z (over 1 year ago)
- Default Branch: main
- Last Pushed: 2025-02-13T14:04:12.000Z (over 1 year ago)
- Last Synced: 2025-09-02T00:40:01.933Z (10 months ago)
- Topics: docker, fastapi, nlp, postgresql, redis
- Language: Python
- Homepage:
- Size: 67.4 KB
- Stars: 1
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
[![Contributors][contributors-shield]][contributors-url]
[![Forks][forks-shield]][forks-url]
[![Stargazers][stars-shield]][stars-url]
[![Issues][issues-shield]][issues-url]
[![MIT License][license-shield]][license-url]
[![LinkedIn][linkedin-shield]][linkedin-url]
# 📄 Document Similarity API
The goal of Document Similarity API is to use Natural Language Processing (NLP) to find similar documents based on the cosine similarity score of each document's title and its textual contents.
The problem it aims to solve is duplicated work in the workplace. Work is often repeated, and the context of historical work could help deliver new work more quickly.
The plan is to use the SpaCy library to preprocess all documents uploaded to a Documents table and calculate vector values for them.
When a user submits a new document for a similarity search, the API should return any stored documents that match it.
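The matching step described above can be sketched in plain Python. This is only an illustration under stated assumptions, not the project's implementation: the function names `cosine_similarity` and `rank_matches`, the `threshold` parameter, and the document IDs are all hypothetical, and the short lists stand in for the vectors that SpaCy would produce (for example via `Doc.vector`).

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        # A zero vector has no direction, so treat it as "no match".
        return 0.0
    return dot / (norm_a * norm_b)


def rank_matches(
    query: Sequence[float],
    stored: Dict[str, Sequence[float]],
    threshold: float = 0.8,  # hypothetical cut-off, not taken from the project
) -> List[Tuple[str, float]]:
    """Score every stored vector against the query, best match first."""
    scored = [(doc_id, cosine_similarity(query, vec)) for doc_id, vec in stored.items()]
    matches = [(doc_id, score) for doc_id, score in scored if score >= threshold]
    return sorted(matches, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    # Toy two-dimensional vectors standing in for real document vectors.
    stored_vectors = {
        "report-a": [1.0, 0.0],
        "report-b": [0.0, 1.0],
        "report-c": [1.0, 1.0],
    }
    for doc_id, score in rank_matches([1.0, 0.0], stored_vectors, threshold=0.5):
        print(f"{doc_id}: {score:.3f}")
```

Cosine similarity depends only on the direction of the vectors, not their length, which is why it is a common choice for comparing documents of different sizes.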
## 🧑‍💻 Tech Stack
![Python]
![FastAPI]
![Postgres]
![Docker]
## 🔧 Setup
### 📋 Dependencies
Run the command ```pip install -r requirements.txt``` to install dependencies.
### 🐋 Docker
Docker Engine is required to run the PostgreSQL database.
Download Docker Desktop [here](https://www.docker.com/products/docker-desktop/).
Run ```docker-compose --env-file .env up --build``` from the project's root directory to build the Docker image from the Dockerfile and start the containers.
### ⚙️ Environment
Set up your environment variables in a ```.env``` file, which should look similar to the following:
```
POSTGRES_PASSWORD=
POSTGRES_DB=
POSTGRES_USER=
POSTGRES_HOST_PORT=
POSTGRES_HOST_NAME=
```
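As a rough sketch of how these variables might be consumed, the snippet below reads them from a mapping such as `os.environ` and assembles a PostgreSQL connection URL. The helper name `build_database_url` and the `postgresql://` URL layout are assumptions made for illustration; the project may read its settings differently.

```python
import os
from typing import Mapping
from urllib.parse import quote

# The variable names below come from the .env template above.
REQUIRED_VARS = (
    "POSTGRES_USER",
    "POSTGRES_PASSWORD",
    "POSTGRES_HOST_NAME",
    "POSTGRES_HOST_PORT",
    "POSTGRES_DB",
)


def build_database_url(env: Mapping[str, str]) -> str:
    """Assemble a connection URL from the variables (hypothetical helper)."""
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise KeyError(f"missing environment variables: {', '.join(missing)}")
    # Percent-encode the credentials so characters such as '@' or '/'
    # cannot be mistaken for URL delimiters.
    user = quote(env["POSTGRES_USER"], safe="")
    password = quote(env["POSTGRES_PASSWORD"], safe="")
    host = env["POSTGRES_HOST_NAME"]
    port = env["POSTGRES_HOST_PORT"]
    database = env["POSTGRES_DB"]
    return f"postgresql://{user}:{password}@{host}:{port}/{database}"


if __name__ == "__main__":
    try:
        print(build_database_url(os.environ))
    except KeyError as error:
        print(error)
```

Failing early with the full list of missing names tends to be easier to debug than a connection error raised later at startup.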
## 🧑‍🤝‍🧑 Developers
| Name | Email |
| -------------- | -------------------------- |
| Tom Aston      | mail@tomaston.dev          |
[contributors-shield]: https://img.shields.io/github/contributors/TomAston1996/doc-similarity-api.svg?style=for-the-badge
[contributors-url]: https://github.com/TomAston1996/doc-similarity-api/graphs/contributors
[forks-shield]: https://img.shields.io/github/forks/TomAston1996/doc-similarity-api.svg?style=for-the-badge
[forks-url]: https://github.com/TomAston1996/doc-similarity-api/network/members
[stars-shield]: https://img.shields.io/github/stars/TomAston1996/doc-similarity-api.svg?style=for-the-badge
[stars-url]: https://github.com/TomAston1996/doc-similarity-api/stargazers
[issues-shield]: https://img.shields.io/github/issues/TomAston1996/doc-similarity-api.svg?style=for-the-badge
[issues-url]: https://github.com/TomAston1996/doc-similarity-api/issues
[license-shield]: https://img.shields.io/github/license/TomAston1996/doc-similarity-api.svg?style=for-the-badge
[license-url]: https://github.com/TomAston1996/doc-similarity-api/blob/main/LICENSE.txt
[linkedin-shield]: https://img.shields.io/badge/-LinkedIn-black.svg?style=for-the-badge&logo=linkedin&colorB=555
[linkedin-url]: https://linkedin.com/in/tomaston96
[Python]: https://img.shields.io/badge/python-3670A0?style=for-the-badge&logo=python&logoColor=ffdd54
[FastAPI]: https://img.shields.io/badge/FastAPI-005571?style=for-the-badge&logo=fastapi
[Postgres]: https://img.shields.io/badge/postgres-%23316192.svg?style=for-the-badge&logo=postgresql&logoColor=white
[Docker]: https://img.shields.io/badge/docker-%230db7ed.svg?style=for-the-badge&logo=docker&logoColor=white
[Redis]: https://img.shields.io/badge/redis-%23DD0031.svg?style=for-the-badge&logo=redis&logoColor=white