https://github.com/vectifyai/model-augmented-fine-tuning

Fine-tuning black-box OpenAI embedding models

# Enhancing Black-Box Embeddings with Model Augmented Fine-Tuning

A method for fine-tuning black-box OpenAI embedding models to improve retrieval performance 📈.

📝 See our paper [Mafin: Enhancing Black-Box Embeddings with Model Augmented Fine-Tuning](https://arxiv.org/pdf/2402.12177) for a detailed introduction to this method.

Further training details and the dataset construction procedure are described in [LlamaIndex's blog post](https://medium.com/llamaindex-blog/fine-tuning-embeddings-for-rag-with-synthetic-data-e534409a3971).
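
As a rough intuition for what "model augmentation" can mean when the base embedding is only reachable through an API, the sketch below concatenates a frozen black-box embedding with the output of a small trainable model and rescales the result to unit norm. This is an assumption made for illustration, not code from this repository; the paper is the authoritative description. The helper names `l2_normalize` and `augment` are hypothetical, and the random vectors are stand-ins for real embeddings.

```python
import numpy as np

def l2_normalize(x: np.ndarray) -> np.ndarray:
    """Scale a vector (or each row of a matrix) to unit Euclidean length."""
    return x / np.linalg.norm(x, axis=-1, keepdims=True)

def augment(black_box: np.ndarray, trainable: np.ndarray) -> np.ndarray:
    """Concatenate two unit-norm embeddings and rescale to unit norm.

    Hypothetical helper: `black_box` stands in for frozen API embeddings,
    `trainable` for the output of a small open model being fine-tuned.
    """
    combined = np.concatenate(
        [l2_normalize(black_box), l2_normalize(trainable)], axis=-1
    )
    # Each half has norm 1, so the concatenation has norm sqrt(2).
    return combined / np.sqrt(2.0)

rng = np.random.default_rng(0)
query_bb, doc_bb = rng.normal(size=(2, 1536))  # stand-ins for black-box embeddings
query_tr, doc_tr = rng.normal(size=(2, 384))   # stand-ins for trainable-model embeddings

query_aug = augment(query_bb, query_tr)
doc_aug = augment(doc_bb, doc_tr)

# Both augmented vectors are unit-norm, so their dot product is a cosine
# similarity, and it equals the mean of the two component cosine similarities.
sim_aug = float(query_aug @ doc_aug)
sim_bb = float(l2_normalize(query_bb) @ l2_normalize(doc_bb))
sim_tr = float(l2_normalize(query_tr) @ l2_normalize(doc_tr))
print(sim_aug, 0.5 * (sim_bb + sim_tr))
```

Under this assumption only the trainable half receives gradients during fine-tuning, while the black-box half stays fixed and can be cached once per document.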

## Install 🛠️
```bash
pip install -r requirements.txt
```

## Fine-tune the BAAI/bge-small-en model
```bash
cd ./finetune
python train.py
```

## Fine-tune the FAE model
```bash
cd ./finetune
python fae_train.py
```

A comparison of the evaluation results is available in the [Jupyter notebook](https://github.com/VectifyAI/FAE/blob/main/finetune/eval.ipynb) 📊.
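
For readers who want to compare embedding models on their own data, a common retrieval metric is hit rate at k: the fraction of queries whose relevant document appears among the top k results. The sketch below is a minimal illustration under that assumption; whether the notebook reports exactly this metric is not confirmed here, and `hit_rate_at_k` and the toy arrays are hypothetical.

```python
import numpy as np

def hit_rate_at_k(query_emb, doc_emb, relevant_idx, k=5):
    """Fraction of queries whose relevant document is ranked in the top k.

    query_emb:    (num_queries, dim) array of query embeddings
    doc_emb:      (num_docs, dim) array of document embeddings
    relevant_idx: for each query, the index of its single relevant document
    """
    scores = query_emb @ doc_emb.T                 # similarity of every query to every doc
    top_k = np.argsort(-scores, axis=1)[:, :k]     # indices of the k best docs per query
    hits = (top_k == np.asarray(relevant_idx)[:, None]).any(axis=1)
    return float(hits.mean())

# Toy data: four one-hot "documents" and three queries.
docs = np.eye(4)
queries = np.array([
    [0.9, 0.1, 0.0, 0.0],   # closest to doc 0
    [0.0, 0.2, 0.8, 0.0],   # closest to doc 2, second closest to doc 1
    [0.1, 0.0, 0.0, 0.7],   # closest to doc 3
])
relevant = [0, 1, 3]

print(hit_rate_at_k(queries, docs, relevant, k=1))
print(hit_rate_at_k(queries, docs, relevant, k=2))
```

Running the same function on embeddings from the base model and from a fine-tuned model gives a like-for-like comparison, provided both use the same queries, corpus, and k.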