{"id":20313484,"url":"https://github.com/autogluon/autogluon-rag","last_synced_at":"2025-04-11T17:10:11.923Z","repository":{"id":251154876,"uuid":"808293701","full_name":"autogluon/autogluon-rag","owner":"autogluon","description":"Retrieval-Augmented Generation in 3 Lines of Code!","archived":false,"fork":false,"pushed_at":"2024-10-23T21:40:41.000Z","size":386,"stargazers_count":26,"open_issues_count":7,"forks_count":5,"subscribers_count":7,"default_branch":"main","last_synced_at":"2024-10-24T10:12:35.451Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"https://auto.gluon.ai/rag/dev/index.html","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/autogluon.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":"CODE_OF_CONDUCT.md","threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":"SECURITY.md","support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2024-05-30T19:10:57.000Z","updated_at":"2024-10-23T21:40:45.000Z","dependencies_parsed_at":"2024-08-01T07:29:04.668Z","dependency_job_id":"a0de8bfc-12a8-4669-ab00-3836d8054137","html_url":"https://github.com/autogluon/autogluon-rag","commit_stats":null,"previous_names":["autogluon/autogluon-rag"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/autogluon%2Fautogluon-rag","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/autogluon%2Fautogluon-rag/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/autogluon%2Fautogluon-rag/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/autogluon%2Fautogluo
n-rag/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/autogluon","download_url":"https://codeload.github.com/autogluon/autogluon-rag/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":235519884,"owners_count":19003201,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-11-14T18:11:16.060Z","updated_at":"2025-01-25T00:09:21.249Z","avatar_url":"https://github.com/autogluon.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"\u003cdiv align=\"left\"\u003e\n  \u003cimg src=\"https://user-images.githubusercontent.com/16392542/77208906-224aa500-6aba-11ea-96bd-e81806074030.png\" width=\"350\"\u003e\n\u003c/div\u003e\n\n# AutoGluon-RAG\n\n## Overview\nAutoGluon-RAG is a framework designed to streamline the development of RAG (Retrieval-Augmented Generation) pipelines. RAG has emerged as a crucial approach for tailoring large language models (LLMs) to address domain-specific queries. However, constructing RAG pipelines traditionally involves navigating through a complex array of modules and functionalities, including retrievers, generators, vector database construction, fast semantic search, and handling long-context inputs, among others.\n\nAutoGluon-RAG allows users to create customized RAG pipelines seamlessly, eliminating the need to delve into any technical complexities. 
Following the AutoML (Automated Machine Learning) philosophy of simplifying model development with minimal code, as exemplified by AutoGluon, AutoGluon-RAG enables users to create a RAG pipeline with just a few lines of code. The framework provides a user-friendly interface and abstracts away the underlying modules, allowing users to focus on their domain-specific requirements and leverage the power of RAG pipelines without extensive technical expertise.\n\n## Goal\nIn line with the AutoGluon team's commitment to meeting user requirements and expanding its user base, the team aims to develop a new feature that simplifies the creation and deployment of end-to-end RAG (Retrieval-Augmented Generation) pipelines. Given a set of user-provided data or documents, this feature will enable users to develop and deploy a RAG pipeline with minimal coding effort, following the AutoML (Automated Machine Learning) philosophy of three-line solutions.\n\n## Usage\nTo use this framework, you must first install AutoGluon-RAG:\n```bash\ngit clone https://github.com/autogluon/autogluon-rag\ncd autogluon-rag\n\n# Create a virtual environment (using Python, or conda if you prefer)\npython3 -m virtualenv venv\nsource venv/bin/activate\n\n# Install the package\npip install -e .\n```\nYou can now use the package in two ways.
\n\n### Use AutoGluon-RAG through the command line as `agrag`:\n\n```text\nAutoGluon-RAG\n\n\nusage: agrag [-h] --config_file\n\nAutoGluon-RAG - Retrieval-Augmented Generation Pipeline\n\noptions:\n  -h, --help        show this help message and exit\n  --config_file        Path to the configuration file\n```\n\n### Use AutoGluon-RAG through code:\n```python\nfrom agrag.agrag import AutoGluonRAG\n\n\ndef ag_rag():\n    agrag = AutoGluonRAG(\n        preset_quality=\"medium_quality\",  # or path to config file\n        web_urls=[\"https://auto.gluon.ai/stable/index.html\"],\n        base_urls=[\"https://auto.gluon.ai/stable/\"],\n        parse_urls_recursive=True,\n        data_dir=\"s3://autogluon-rag-github-dev/autogluon_docs/\"\n    )\n    agrag.initialize_rag_pipeline()\n    agrag.generate_response(\"What is AutoGluon?\")\n\n\nif __name__ == \"__main__\":\n    ag_rag()\n```\n\nFor a list of configurable parameters that can be passed into the `AutoGluonRAG` class, refer to the tutorial [here](https://github.com/autogluon/autogluon-rag/tree/main/documentation/tutorials/general/code_parameteres.md).\n\nYou can also use a configuration file with `AutoGluonRAG`.\nThe configuration file contains the specific parameters to use for each module in the RAG pipeline. For an example of a config file, please refer to `example_config.yaml` in `src/agrag/configs/`. For specific details about the parameters in each individual module, refer to the `README` files in each module in `src/agrag/modules/`.\n\nThere is also a `shared` section in the config file for parameters that do not belong to a specific module. Currently, the parameters in `shared` are:\n```text\npipeline_batch_size: Optional batch size to use for the pre-processing stage (Data Processing, Embedding, Vector DB Module). This represents the number of files in each batch. 
The default value is 20.\n```\n\n## Evaluation\nFor more information about the evaluation module, refer to the code in `src/agrag/evaluation` and the instructions [here](https://github.com/autogluon/autogluon-rag/tree/main/src/agrag/evaluation/README.md).\n\n## Tutorials\nFor a list of tutorials on using AutoGluon-RAG in different scenarios, refer to the documentation [here](https://github.com/autogluon/autogluon-rag/tree/main/documentation/tutorial.md).\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fautogluon%2Fautogluon-rag","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fautogluon%2Fautogluon-rag","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fautogluon%2Fautogluon-rag/lists"}