Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/jbahire/semantic-similarity
This project provides implementations of semantic similarity using various text embeddings, and lets you easily compare results via the provided API. Go ahead and build your own API to integrate it into your use case.
bert elmo machine-learning natural-language-processing semantic-similarity spacy word2vec
Last synced: 6 days ago
JSON representation
- Host: GitHub
- URL: https://github.com/jbahire/semantic-similarity
- Owner: JBAhire
- License: apache-2.0
- Created: 2019-10-01T05:04:23.000Z (about 5 years ago)
- Default Branch: master
- Last Pushed: 2020-01-01T18:26:46.000Z (almost 5 years ago)
- Last Synced: 2024-10-31T22:24:49.460Z (about 2 months ago)
- Topics: bert, elmo, machine-learning, natural-language-processing, semantic-similarity, spacy, word2vec
- Language: Python
- Size: 551 KB
- Stars: 3
- Watchers: 3
- Forks: 1
- Open Issues: 1
Metadata Files:
- Readme: README.md
- License: LICENSE
Awesome Lists containing this project
README
# What is Semantic Similarity?
Semantic similarity is an application of a technology called text embedding. One of the most useful recent technologies for natural language processing, text embedding transforms words into a numerical representation (vectors) that approximates the conceptual distance between word meanings.

Many NLP applications need to compute the similarity in meaning between two short texts. Search engines, for example, need to model the relevance of a document to a query beyond the overlap in words between the two. Similarly, question-and-answer sites such as Quora need to determine whether a question has already been asked. This type of text similarity is often computed by first embedding the two short texts and then calculating the cosine similarity between them.
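The embed-then-compare idea above can be sketched in plain Python. The vectors here are tiny toy "embeddings" (real ones from BERT or spaCy have hundreds of dimensions); only the cosine formula itself is the real technique:

```python
from math import sqrt

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors:
    dot(a, b) / (||a|| * ||b||), ranging from -1 to 1."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Toy 3-dimensional "embeddings" standing in for two short texts.
doc1 = [0.2, 0.7, 0.1]
doc2 = [0.25, 0.6, 0.2]
print(cosine_similarity(doc1, doc2))  # close to 1.0: the vectors point the same way
```

Identical vectors score 1.0 and orthogonal vectors score 0.0, which is why cosine similarity is a natural fit for comparing embeddings regardless of their magnitude.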
# Which embeddings are we using?
We're using the following embeddings:
1. BERT
2. ELMo
3. spaCy
4. word2vec

# Requirements
1. Python (3.0 or above)
2. Flask
3. TensorFlow
4. BERT pre-trained model, downloadable from [here](https://github.com/google-research/bert#pre-trained-models)
5. AllenNLP
6. spaCy

# Steps
```
pip install -r requirements.txt
```
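The README's pitch is comparing similarity scores across several embedding backends through one API. A minimal sketch of that comparison logic, with a hypothetical stand-in embedding function (`embed_bag_of_chars` is an illustration only; the actual project wires in BERT, ELMo, spaCy, and word2vec):

```python
from math import sqrt

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b)))

def embed_bag_of_chars(text):
    # Hypothetical stand-in embedding: a 26-dim letter-frequency vector.
    # A real backend would return a BERT, ELMo, spaCy, or word2vec vector.
    vec = [0.0] * 26
    for ch in text.lower():
        if "a" <= ch <= "z":
            vec[ord(ch) - ord("a")] += 1.0
    return vec

# Registry of embedding backends; a comparison API would expose all of them.
EMBEDDINGS = {"bag_of_chars": embed_bag_of_chars}

def compare(text1, text2):
    """Return {embedding_name: similarity} for every registered backend."""
    return {name: cosine(fn(text1), fn(text2)) for name, fn in EMBEDDINGS.items()}

print(compare("semantic similarity", "similar semantics"))
```

Wrapping `compare` in a Flask route (one of the listed requirements) is then a matter of reading the two texts from the request and returning this dict as JSON.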