https://github.com/qdrant/demo-colpali-optimized
- Host: GitHub
- URL: https://github.com/qdrant/demo-colpali-optimized
- Owner: qdrant
- Created: 2024-11-21T14:44:56.000Z (over 1 year ago)
- Default Branch: master
- Last Pushed: 2024-11-21T15:39:33.000Z (over 1 year ago)
- Last Synced: 2025-10-14T13:17:24.814Z (7 months ago)
- Language: Jupyter Notebook
- Size: 15.2 MB
- Stars: 35
- Watchers: 2
- Forks: 4
- Open Issues: 2
Metadata Files:
- Readme: README.md
# ColPali Retrieval Optimization Demo
This repository demonstrates how to optimize **ColPali-based PDF retrieval** using **Qdrant**, with a focus on improving retrieval speed while maintaining quality. The project includes two Jupyter notebooks with experiments and results.
## Repository Structure
- **`ColPali as a reranker I.ipynb`**
  Covers the initial setup and data preparation:
  - Combining ViDoRe, UFO, and DocVQA datasets into a single collection.
  - Uploading vectors to a Qdrant collection.
- **`ColPali as a reranker II.ipynb`**
  Focuses on retrieval experiments:
  - Using pooled vectors for first-stage retrieval and ColPali for reranking.
  - Comparing retrieval quality and speed between pooled and original vectors.
  - Exploring optimizations such as cutting tokens and binary quantization.
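
The two-stage idea from the second notebook (cheap retrieval on pooled vectors, then reranking with the original multivectors) can be illustrated with a minimal pure-Python sketch. This is not the notebooks' code: the function names, the toy 2-dimensional vectors, and the choice of mean pooling over all vectors of a document are assumptions made for illustration, and the notebooks may pool differently and rely on Qdrant to do the search. The reranking score is the late-interaction MaxSim used by ColBERT-style models such as ColPali.

```python
# Illustrative sketch only: names, data, and pooling strategy are hypothetical.
Vector = list[float]
MultiVector = list[Vector]  # one vector per token or image patch


def dot(a: Vector, b: Vector) -> float:
    """Plain dot product between two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def mean_pool(multivector: MultiVector) -> Vector:
    """Collapse a multivector into one vector by averaging each dimension.

    Assumption: pooling over all vectors at once; other pooling schemes exist.
    """
    n = len(multivector)
    return [sum(column) / n for column in zip(*multivector)]


def max_sim(query: MultiVector, doc: MultiVector) -> float:
    """Late-interaction score: for each query vector, take its best-matching
    document vector, then sum those maxima."""
    return sum(max(dot(q, d) for d in doc) for q in query)


def two_stage_search(
    query: MultiVector,
    docs: dict[str, MultiVector],
    prefetch_limit: int,
    limit: int,
) -> list[str]:
    """Stage 1: rank all documents by pooled-vector similarity (cheap).
    Stage 2: rerank only the top `prefetch_limit` candidates with MaxSim."""
    pooled_query = mean_pool(query)
    candidates = sorted(
        docs,
        key=lambda doc_id: dot(pooled_query, mean_pool(docs[doc_id])),
        reverse=True,
    )[:prefetch_limit]
    reranked = sorted(
        candidates,
        key=lambda doc_id: max_sim(query, docs[doc_id]),
        reverse=True,
    )
    return reranked[:limit]


# Toy data: "b" wins on pooled similarity, "a" wins after MaxSim reranking.
docs = {
    "a": [[1.0, 0.0], [0.0, 1.0]],
    "b": [[0.6, 0.6], [0.6, 0.6]],
    "c": [[-1.0, 0.0], [0.0, -1.0]],
}
query = [[1.0, 0.0], [0.0, 1.0]]

print(two_stage_search(query, docs, prefetch_limit=2, limit=2))  # → ['a', 'b']
```

In this toy example the pooled stage drops document `c` and ranks `b` ahead of `a`, while the MaxSim stage restores `a` to the top. That is the trade-off the experiments measure: the first stage only needs to keep the right documents among the candidates, not to order them perfectly.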