https://github.com/abhigyan126/similaritymap

bert-embeddings cosine-similarity embeddings plagarism-detection


### **SimilarityMAP**
This application calculates document similarities using BERT embeddings. It reads text files from a folder, computes pairwise similarities, and generates a heatmap for visualizing the results. The similarity matrix can be saved as a CSV file, and the heatmap as an image.
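
The pairwise-similarity and CSV steps described above can be sketched roughly as follows. This is a minimal illustration, not the repository's code: it assumes cosine similarity as the metric and assumes each document has already been turned into an embedding vector by a BERT model. The function names (`cosine_similarity`, `similarity_matrix`, `save_matrix_csv`) and the toy vectors are hypothetical.

```python
import csv
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # convention chosen for this sketch when a vector is all zeros
    return dot / (norm_u * norm_v)


def similarity_matrix(embeddings):
    """Symmetric matrix of pairwise cosine similarities."""
    n = len(embeddings)
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            score = cosine_similarity(embeddings[i], embeddings[j])
            matrix[i][j] = matrix[j][i] = score
    return matrix


def save_matrix_csv(path, names, matrix):
    """Write the matrix with document names as header row and first column."""
    with open(path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        writer.writerow([""] + list(names))
        for name, row in zip(names, matrix):
            writer.writerow([name] + [f"{value:.4f}" for value in row])


# Toy 3-dimensional vectors standing in for real BERT embeddings (assumption).
names = ["a.txt", "b.txt", "c.txt"]
embeddings = [[1.0, 0.0, 0.0], [1.0, 1.0, 0.0], [0.0, 0.0, 2.0]]
matrix = similarity_matrix(embeddings)
```

Calling `save_matrix_csv("similarity.csv", names, matrix)` would then write one labelled row per document. The same matrix is what a heatmap would visualise, with one cell per document pair.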

---

### **Files**
- **App.py**: A Tkinter GUI application that takes a folder of text files and creates a similarity heatmap across all of them.
- **get_map.py**: A non-GUI script that performs the same task as `App.py`.
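
Both files start from a folder of text files. A minimal sketch of that input step might look like the following. It is an illustration only, not the code in `App.py` or `get_map.py`: the `load_documents` helper is hypothetical, and the `.txt` extension, UTF-8 encoding, and non-recursive search are all assumptions.

```python
from pathlib import Path


def load_documents(folder):
    """Return {file name: text} for every .txt file directly inside folder."""
    documents = {}
    # Sorting keeps the row/column order of the later matrix stable.
    for path in sorted(Path(folder).glob("*.txt")):
        # Undecodable bytes are replaced rather than raising an error.
        documents[path.name] = path.read_text(encoding="utf-8", errors="replace")
    return documents
```

The returned names can then label the rows and columns of the similarity matrix and heatmap.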

---

### **Screenshots**


[Screenshot of the application, 2024-10-16]

---

### **Usage**
- Can be used to **automatically grade** documents by identifying the best match and comparing similarity scores.
  - **Limitation**: The scores only show how far a document deviates from its best match, not whether its content is correct.
- Can also be used to **detect copied content and plagiarism**.
  - **Limitation**: Plagiarism is identified only among the text files provided; sources outside that set are not checked.
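
Both uses above amount to reading the similarity matrix in a particular way. The sketch below shows one possible reading, under stated assumptions: `best_matches` and `flag_pairs` are hypothetical helpers, not functions from this project; the scores are made up; and the `0.9` threshold is arbitrary and would need tuning on real embeddings.

```python
def best_matches(names, matrix):
    """For each document, return (most similar other document, score)."""
    result = {}
    for i, name in enumerate(names):
        candidates = [(matrix[i][j], names[j]) for j in range(len(names)) if j != i]
        if candidates:
            score, other = max(candidates)
            result[name] = (other, score)
    return result


def flag_pairs(names, matrix, threshold=0.9):
    """Return (name_a, name_b, score) for each pair scoring at or above threshold."""
    flagged = []
    for i in range(len(names)):
        for j in range(i + 1, len(names)):
            if matrix[i][j] >= threshold:
                flagged.append((names[i], names[j], matrix[i][j]))
    return flagged


# Made-up similarity scores for illustration only.
names = ["key.txt", "s1.txt", "s2.txt"]
matrix = [
    [1.00, 0.95, 0.40],
    [0.95, 1.00, 0.35],
    [0.40, 0.35, 1.00],
]
matches = best_matches(names, matrix)
flagged = flag_pairs(names, matrix, threshold=0.9)
```

With these toy scores, `s1.txt` is closest to `key.txt` and that pair is the only one flagged. As the limitations note, a high score signals overlap with another provided file, not correctness or a confirmed case of copying.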