{"id":50029386,"url":"https://github.com/AnswerDotAI/byaldi","last_synced_at":"2026-06-06T09:01:02.588Z","repository":{"id":255446030,"uuid":"852492838","full_name":"AnswerDotAI/byaldi","owner":"AnswerDotAI","description":"Use late-interaction multi-modal models such as ColPali in just a few lines of code.","archived":false,"fork":false,"pushed_at":"2025-01-28T20:47:40.000Z","size":2038,"stargazers_count":848,"open_issues_count":44,"forks_count":92,"subscribers_count":20,"default_branch":"main","last_synced_at":"2026-06-03T00:00:05.860Z","etag":null,"topics":["colbert","colpali","multi-modal","nlp","rag","reranking","retrieval"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/AnswerDotAI.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2024-09-04T22:44:04.000Z","updated_at":"2026-05-30T06:57:41.000Z","dependencies_parsed_at":"2024-09-16T05:00:45.288Z","dependency_job_id":"661d113c-9672-4c98-a90a-14302a6a549b","html_url":"https://github.com/AnswerDotAI/byaldi","commit_stats":null,"previous_names":["answerdotai/byaldi"],"tags_count":5,"template":false,"template_full_name":null,"purl":"pkg:github/AnswerDotAI/byaldi","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/AnswerDotAI%2Fbyaldi","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/AnswerDotAI%2Fbyaldi/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/AnswerDotAI%2Fbyaldi/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/Gi
tHub/repositories/AnswerDotAI%2Fbyaldi/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/AnswerDotAI","download_url":"https://codeload.github.com/AnswerDotAI/byaldi/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/AnswerDotAI%2Fbyaldi/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":33975476,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-26T15:22:16.424Z","status":"online","status_checked_at":"2026-06-06T02:00:07.033Z","response_time":107,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["colbert","colpali","multi-modal","nlp","rag","reranking","retrieval"],"created_at":"2026-05-20T19:00:36.009Z","updated_at":"2026-06-06T09:01:02.572Z","avatar_url":"https://github.com/AnswerDotAI.png","language":"Python","funding_links":[],"categories":["Multimodal RAG"],"sub_categories":["Frameworks \u0026 Tools"],"readme":"# Welcome to Byaldi\n_Did you know? In the movie RAGatouille, the dish Remy makes is not actually a ratatouille, but a refined version of the dish called \"Confit Byaldi\"._\n\n\u003cp align=\"center\"\u003e\u003cimg width=350 alt=\"The Byaldi logo, it's a cheerful rat using a magnifying glass to look at a complex document. It says 'byaldi' in the middle of a circle around the rat.\" src=\"byaldi.webp\"/\u003e\u003c/p\u003e\n\n⚠️ This is the pre-release version of Byaldi. 
Please report any issue you encounter, there will likely be quite a few quirks to iron out!\n\nByaldi is [RAGatouille](https://github.com/answerdotai/ragatouille)'s mini sister project. It is a simple wrapper around the [ColPali](https://github.com/illuin-tech/colpali) repository to make it easy to use late-interaction multi-modal models such as ColPALI with a familiar API.\n\n## Getting started\n\nFirst, a warning: This is a pre-release library, using uncompressed indexes and lacking other kinds of refinements.\n\nCurrently, we support all models supported by the underlying [colpali-engine](https://github.com/illuin-tech/colpali), including the newer, and better, ColQwen2 checkpoints, such as `vidore/colqwen2-v1.0`.  Broadly, the aim is for byaldi to support all ColVLM models.\n\nAdditional backends will be supported in future updates. As byaldi exists to facilitate the adoption of multi-modal retrievers, we intend to also add support for models such as [VisRAG](https://github.com/openbmb/visrag).\n\nEventually, we'll add an HNSW indexing mechanism, pooling, and, who knows, maybe 2-bit quantization?\n\nIt will get updated as the multi-modal ecosystem develops further!\n\n### Pre-requisites\n\n#### Poppler\n\nTo convert pdf to images with a friendly license, we use the `pdf2image` library. This library requires `poppler` to be installed on your system. Poppler is very easy to install by following the instructions [on their website](https://poppler.freedesktop.org/). The tl;dr is:\n\n__MacOS with homebrew__\n\n```bash\nbrew install poppler\n```\n\n__Debian/Ubuntu__\n\n```bash\nsudo apt-get install -y poppler-utils\n```\n\n#### Flash-Attention\n\nGemma uses a recent version of flash attention. To make things run as smoothly as possible, we'd recommend that you install it after installing the library:\n\n```bash\npip install --upgrade byaldi\npip install flash-attn\n```\n\n\n#### Hardware\n\nColPali uses multi-billion parameter models to encode documents. 
We recommend using a GPU for smooth operations, though weak/older GPUs are perfectly fine! Encoding your collection would suffer from poor performance on CPU or MPS.\n\n## Using `byaldi`\n\nByaldi is largely modeled after RAGatouille, meaning that everything is designed to take the fewest lines of code possible, so you can very quickly build on top of it rather than spending time figuring out how to create a retrieval pipeline.\n\n### Loading a model\n\nLoading a model with `byaldi` is extremely straightforward:\n\n```python3\nfrom byaldi import RAGMultiModalModel\n# Optionally, you can specify an `index_root`, which is where it'll save the index. It defaults to \".byaldi/\".\nRAG = RAGMultiModalModel.from_pretrained(\"vidore/colqwen2-v1.0\")\n```\n\nIf you've already got an index, and wish to load it along with the model necessary to query it, you can do so just as easily:\n\n```python3\nfrom byaldi import RAGMultiModalModel\n# Optionally, you can specify an `index_root`, which is where it'll look for the index. It defaults to \".byaldi/\".\nRAG = RAGMultiModalModel.from_index(\"your_index_name\")\n```\n\n### Creating an index\nCreating an index with `byaldi` is simple and flexible. **You can index a single PDF file, a single image file, or a directory containing multiple of those**. Here's how to create an index:\n\n```python3\nfrom byaldi import RAGMultiModalModel\n# Optionally, you can specify an `index_root`, which is where it'll save the index. It defaults to \".byaldi/\".\nRAG = RAGMultiModalModel.from_pretrained(\"vidore/colqwen2-v1.0\")\nRAG.index(\n    input_path=\"docs/\", # The path to your documents\n    index_name=index_name, # The name you want to give to your index. It'll be saved at `index_root/index_name/`.\n    store_collection_with_index=False, # Whether the index should store the base64 encoded documents.\n    doc_ids=[0, 1, 2], # Optionally, you can specify a list of document IDs. 
They must be integers and match the number of documents you're passing. Otherwise, doc_ids will be automatically created.\n    metadata=[{\"author\": \"John Doe\", \"date\": \"2021-01-01\"}], # Optionally, you can specify a list of metadata for each document. They must be a list of dictionaries, with the same length as the number of documents you're passing.\n    overwrite=True # Whether to overwrite an index if it already exists. If False, it'll return None and do nothing if `index_root/index_name` exists.\n)\n```\n\nAnd that's it! The model will start spinning and create your index, exporting all the necessary information to disk when it's done. You can then use the `RAGMultiModalModel.from_index(\"your_index_name\")` method presented above to load it whenever needed (you don't need to do this right after creating it -- it's already loaded in memory and ready to go!).\n\nThe main decision you'll have to make here is whether you want to set `store_collection_with_index` to True or not. If set to true, it greatly simplifies your workflow: the base64-encoded version of relevant documents will be returned as part of the query results, so you can immediately pipe it to your LLM. However, it adds considerable memory and storage requirements to your index, so you might want to set it to False (the default setting) if you're short on those resources, and create the base64 encoded versions yourself whenever needed.\n\n\n### Searching\n\nOnce you've created or loaded an index, you can start searching for relevant documents. Again, it's a single, very straightforward command:\n\n```python3\nresults = RAG.search(query, k=3)\n```\n\nResults will be a list of `Result` objects, which you can also treat as normal dictionaries. 
Each result will be in this format:\n```python3\n[\n    {\n        \"doc_id\": 0,\n        \"page_num\": 10,\n        \"score\": 12.875,\n        \"metadata\": {},\n        \"base64\": None\n    },\n    ...\n]\n```\n\n`page_num` values are 1-indexed, while `doc_id` values are 0-indexed. This makes it simpler to work with other PDF manipulation tools, where the first page is generally page 1. `page_num` for images and single-page PDFs will always be 1; it's only useful for longer PDFs.\n\nIf you've passed metadata or encoded with the flag to store the base64 versions, these fields will be populated. Results are sorted by score, so item 0 from the list will always be the most relevant document.\n\n### Adding documents to an existing index\n\nSince indexes are in-memory, they're addition-friendly! If you need to ingest some new PDFs, just load your index with `from_index`, and then call `add_to_index`, with similar parameters to the original `index()` method:\n\n```python3\nRAG.add_to_index(\"path_to_new_docs\",\n        store_collection_with_index=False,\n        ...\n    )\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2FAnswerDotAI%2Fbyaldi","html_url":"https://awesome.ecosyste.ms/projects/github.com%2FAnswerDotAI%2Fbyaldi","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2FAnswerDotAI%2Fbyaldi/lists"}