https://github.com/vectifyai/model-augmented-fine-tuning
Fine-tuning black-box OpenAI embedding models
- Host: GitHub
- URL: https://github.com/vectifyai/model-augmented-fine-tuning
- Owner: VectifyAI
- License: MIT
- Created: 2023-09-21T16:45:08.000Z (over 2 years ago)
- Default Branch: main
- Last Pushed: 2025-08-04T08:50:40.000Z (11 months ago)
- Last Synced: 2025-08-04T12:12:59.047Z (11 months ago)
- Topics: embedding-models, embeddings, fine-tuning, openai, rag, retrieval, retrieval-augmented-generation, vector-database
- Language: Jupyter Notebook
- Homepage: https://arxiv.org/pdf/2402.12177
- Size: 16 MB
- Stars: 11
- Watchers: 2
- Forks: 2
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# Enhancing Black-Box Embeddings with Model Augmented Fine-Tuning
A method for fine-tuning black-box OpenAI embedding models to improve retrieval performance 📈.
📝 See our paper [Mafin: Enhancing Black-Box Embeddings with Model Augmented Fine-Tuning](https://arxiv.org/pdf/2402.12177) for a detailed introduction to this method.
Further training details and the dataset construction are described in [LlamaIndex's blog](https://medium.com/llamaindex-blog/fine-tuning-embeddings-for-rag-with-synthetic-data-e534409a3971).
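As a rough illustration of the model-augmentation idea, the sketch below assumes the black-box embedding stays frozen and is combined with the output of a small trainable model by normalised concatenation. This is an assumption made for illustration; the paper is the authoritative description of the method. The function names (`l2_normalize`, `augment_embedding`, `dot`) and the toy vectors are hypothetical and are not taken from this repository.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        return list(vec)
    return [x / norm for x in vec]


def augment_embedding(black_box_vec, trainable_vec):
    """Hypothetical helper: concatenate a frozen black-box embedding with a
    trainable one. Each half is unit-normalised and scaled by 1/sqrt(2),
    so the combined vector also has unit length."""
    scale = 1.0 / math.sqrt(2.0)
    combined = l2_normalize(black_box_vec) + l2_normalize(trainable_vec)
    return [scale * x for x in combined]


def dot(a, b):
    """Plain dot product; equals cosine similarity for unit-length vectors."""
    return sum(x * y for x, y in zip(a, b))


# Toy vectors standing in for real model outputs (assumed values).
query = augment_embedding([3.0, 4.0], [1.0, 0.0])
document = augment_embedding([4.0, 3.0], [0.0, 1.0])
print(dot(query, document))
```

Under this assumption, the similarity between two augmented embeddings is the average of the black-box cosine similarity and the trainable model's cosine similarity, so training the small model shifts the ranking without touching the black-box model.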
## Install 🛠️
```bash
pip install -r requirements.txt
```
## Fine-tune the BAAI/bge-small-en model
```bash
cd ./finetune
python train.py
```
## Fine-tune the FAE model
```bash
cd ./finetune
python fae_train.py
```
A comparison of the evaluation results is available in the [Jupyter notebook](https://github.com/VectifyAI/FAE/blob/main/finetune/eval.ipynb) 📊.
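For readers unfamiliar with retrieval evaluation, here is a minimal sketch of one common metric, hit rate at k. It is not taken from the notebook, which may report different metrics; the function name `hit_rate_at_k` and the toy data are hypothetical.

```python
def hit_rate_at_k(ranked_ids_per_query, relevant_id_per_query, k=5):
    """Fraction of queries whose relevant document appears in the top-k results.

    ranked_ids_per_query: dict mapping query id -> list of document ids,
        ordered from most to least similar.
    relevant_id_per_query: dict mapping query id -> the single relevant document id.
    """
    if not ranked_ids_per_query:
        raise ValueError("no queries to evaluate")
    hits = 0
    for query_id, ranked_ids in ranked_ids_per_query.items():
        if relevant_id_per_query[query_id] in ranked_ids[:k]:
            hits += 1
    return hits / len(ranked_ids_per_query)


# Toy rankings (assumed values) for two queries.
rankings = {
    "q1": ["d3", "d1", "d7"],
    "q2": ["d2", "d9", "d4"],
}
relevant = {"q1": "d1", "q2": "d4"}
print(hit_rate_at_k(rankings, relevant, k=2))
```

Computing the same metric for the base model and each fine-tuned model on a held-out query set gives a like-for-like comparison.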