https://github.com/rid17pawar/semantic-search-model-experiments

Experiments in semantic search using the BM-25 algorithm and Mean of Word Vectors, along with state-of-the-art Transformer-based models, namely USE and SBERT.

# Semantic-Search-Model-Experiments

## Dataset Used for Semantic Search / Information Retrieval:
[CISI Dataset - Kaggle](https://www.kaggle.com/datasets/dmaso01dsta/cisi-a-dataset-for-information-retrieval)

## Experiments:
#### Experiment-1. Using BM-25 Algorithm and Parameter Tuning For Semantic Search

*BM-25 Algorithm variations used:*
- BM25Okapi
- BM25L
- BM25Plus
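
To make the tuned parameters concrete, here is a minimal from-scratch sketch of Okapi BM25 scoring. It is not the code used in this experiment: the variant names above match classes in the `rank_bm25` package, but which library was used is not stated here. The function name `bm25_okapi_scores`, the defaults `k1=1.5` and `b=0.75`, the smoothed idf form, and the toy corpus are all assumptions for illustration.

```python
import math
from collections import Counter
from typing import List


def bm25_okapi_scores(
    query: List[str],
    corpus: List[List[str]],
    k1: float = 1.5,
    b: float = 0.75,
) -> List[float]:
    """Score every tokenized document in `corpus` against a tokenized query."""
    n_docs = len(corpus)
    avg_len = sum(len(doc) for doc in corpus) / n_docs

    # Document frequency: number of documents that contain each term.
    doc_freq = Counter()
    for doc in corpus:
        doc_freq.update(set(doc))

    scores = []
    for doc in corpus:
        term_freq = Counter(doc)
        # Length normalization: b=0 ignores document length, b=1 applies it fully.
        length_norm = k1 * (1 - b + b * len(doc) / avg_len)
        score = 0.0
        for term in query:
            tf = term_freq.get(term, 0)
            if tf == 0:
                continue
            n_t = doc_freq[term]
            # Smoothed idf that stays non-negative (an assumed variant).
            idf = math.log((n_docs - n_t + 0.5) / (n_t + 0.5) + 1.0)
            # k1 controls how quickly repeated occurrences of a term saturate.
            score += idf * tf * (k1 + 1) / (tf + length_norm)
        scores.append(score)
    return scores


# Toy corpus for illustration only (not taken from CISI).
corpus = [
    ["information", "retrieval", "systems"],
    ["library", "science", "catalog"],
    ["retrieval", "of", "library", "records"],
]
query = ["information", "retrieval"]

# A simple sweep over k1, in the spirit of parameter tuning.
for k1 in (1.2, 1.5, 2.0):
    print(k1, bm25_okapi_scores(query, corpus, k1=k1))
```

BM25L and BM25Plus keep the same overall structure and change how term frequency is normalized for long documents, so a sweep like the one above would apply to them as well.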

#### Result:
![image](https://github.com/rid17pawar/Semantic-Search-Model-Experiments/assets/47048717/4b061f8d-8626-4f5e-8270-1aaefecb5ad7)

*BEST MODEL: BM25Plus*

#### Experiment-2. Using Mean of Word Vectors (MWV) with Pretrained Embeddings For Semantic Search

*Pretrained embeddings used:*
- word2vec
- GloVe
- FastText
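
The Mean of Word Vectors idea can be sketched as follows: average the vectors of the words in a text, then rank documents by cosine similarity to the averaged query vector. The 3-dimensional `toy_embeddings` table below is made up for illustration; in practice the vectors would come from a pretrained word2vec, GloVe, or FastText model. The helper names `mean_vector`, `cosine`, and `rank_documents` are hypothetical.

```python
import math
from typing import Dict, List, Sequence


def mean_vector(
    tokens: List[str], embeddings: Dict[str, List[float]], dim: int
) -> List[float]:
    """Average the vectors of all in-vocabulary tokens; zeros if none are known."""
    vectors = [embeddings[t] for t in tokens if t in embeddings]
    if not vectors:
        return [0.0] * dim
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity, defined as 0.0 when either vector has zero norm."""
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(x * x for x in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return sum(x * y for x, y in zip(u, v)) / (norm_u * norm_v)


def rank_documents(
    query: List[str],
    docs: List[List[str]],
    embeddings: Dict[str, List[float]],
    dim: int,
) -> List[int]:
    """Return document indices sorted from most to least similar to the query."""
    query_vec = mean_vector(query, embeddings, dim)
    sims = [cosine(query_vec, mean_vector(d, embeddings, dim)) for d in docs]
    return sorted(range(len(docs)), key=lambda i: sims[i], reverse=True)


# Made-up 3-d vectors standing in for pretrained embeddings.
toy_embeddings = {
    "library": [1.0, 0.0, 0.0],
    "catalog": [0.9, 0.1, 0.0],
    "retrieval": [0.0, 1.0, 0.0],
    "search": [0.1, 0.9, 0.0],
    "recipe": [0.0, 0.0, 1.0],
}
docs = [["library", "catalog"], ["retrieval", "search"], ["recipe"]]
print(rank_documents(["search"], docs, toy_embeddings, dim=3))
```

Out-of-vocabulary words are simply skipped here; that is one possible choice, and FastText in particular can instead build vectors for unseen words from subword units.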

#### Result:
![image](https://github.com/rid17pawar/Semantic-Search-Model-Experiments/assets/47048717/885b4171-b6d7-4176-8226-c92c4828597a)

*BEST MODEL: word2vec*

#### Experiment-3. Using LDA Topic Modelling For Semantic Search

#### Result:
*Performs worse than BM-25*

#### Experiment-4. Using Universal Sentence Encoder (USE) For Semantic Search

*USE Model variations used:*
- Transformer Encoder
- Deep Averaging Network (DAN) Encoder

#### Result:
![image](https://github.com/rid17pawar/Semantic-Search-Model-Experiments/assets/47048717/30f4a784-f121-403c-8dfe-95ad4c638a01)

*BEST MODEL: USE-Transformer*

#### Experiment-5. Using Pretrained and Finetuned Sentence Transformers (SBERT) For Semantic Search
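
Retrieval with a sentence encoder usually follows one pattern: embed every document once, embed the query, and return the top-k documents by cosine similarity. The sketch below shows that pattern with a pluggable `encode` callable. `toy_encode` is a hypothetical stand-in (word counts over a tiny fixed vocabulary) so the example runs without downloading a model; with the `sentence-transformers` package, the `encode` method of a loaded `SentenceTransformer` has the same list-in, vectors-out shape. Which checkpoint was used, and how it was finetuned, is not shown here.

```python
import math
from typing import Callable, List, Sequence, Tuple

Encoder = Callable[[List[str]], List[Sequence[float]]]


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity, defined as 0.0 when either vector has zero norm."""
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(x * x for x in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return sum(x * y for x, y in zip(u, v)) / (norm_u * norm_v)


def semantic_search(
    query: str, docs: List[str], encode: Encoder, top_k: int = 3
) -> List[Tuple[int, float]]:
    """Return (document index, similarity) pairs for the top_k closest documents."""
    doc_vecs = encode(docs)           # in a real system, computed once and cached
    query_vec = encode([query])[0]
    scored = [(i, cosine(query_vec, vec)) for i, vec in enumerate(doc_vecs)]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


def toy_encode(texts: List[str]) -> List[List[float]]:
    """Hypothetical stand-in encoder: counts of a tiny fixed vocabulary."""
    vocab = ["library", "catalog", "retrieval", "index"]
    return [[float(t.lower().split().count(w)) for w in vocab] for t in texts]


docs = [
    "library catalog systems",
    "index structures for retrieval",
    "cooking recipes",
]
print(semantic_search("retrieval index", docs, toy_encode, top_k=2))
```

Keeping the encoder as a parameter makes it straightforward to compare a pretrained and a finetuned model on the same queries, since only the `encode` argument changes.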

#### Result:
![image](https://github.com/rid17pawar/Semantic-Search-Model-Experiments/assets/47048717/0d96bc97-6620-490d-ac3d-46b41021eb46)

*BEST MODEL: Finetuned SBERT*

#### Final Result:
![image](https://github.com/rid17pawar/Semantic-Search-Model-Experiments/assets/47048717/3d3094a6-0234-45ee-9ef1-b86ad68dc37e)

![image](https://github.com/rid17pawar/Semantic-Search-Model-Experiments/assets/47048717/2d00cdf3-a0d5-4cd7-8945-4df601138813)

**Overall Best Model: Finetuned SBERT**