{"id":15054346,"url":"https://github.com/curiousily/ragbase","last_synced_at":"2025-04-15T02:40:58.846Z","repository":{"id":250057958,"uuid":"827267401","full_name":"curiousily/ragbase","owner":"curiousily","description":"Completely local RAG. Chat with your PDF documents (with open LLM) and UI to that uses LangChain, Streamlit, Ollama (Llama 3.1), Qdrant and advanced methods like reranking and semantic chunking.","archived":false,"fork":false,"pushed_at":"2024-07-26T12:01:41.000Z","size":419,"stargazers_count":91,"open_issues_count":9,"forks_count":32,"subscribers_count":2,"default_branch":"master","last_synced_at":"2025-04-05T17:15:31.354Z","etag":null,"topics":["langchain","llama3","llm","pdf","rag","retrieval-augmented-generation","streamlit"],"latest_commit_sha":null,"homepage":"https://www.mlexpert.io/bootcamp/ragbase-local-rag","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/curiousily.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2024-07-11T10:12:30.000Z","updated_at":"2025-04-05T17:07:20.000Z","dependencies_parsed_at":"2024-08-14T08:18:24.549Z","dependency_job_id":null,"html_url":"https://github.com/curiousily/ragbase","commit_stats":{"total_commits":3,"total_committers":1,"mean_commits":3.0,"dds":0.0,"last_synced_commit":"5af7b79162329fa0b725b39d1aeeea7eb1e4428b"},"previous_names":["curiousily/ragbase"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/curiousily%2Fragbase","tags_url":"https://repos.ecosyste.ms/api/v
1/hosts/GitHub/repositories/curiousily%2Fragbase/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/curiousily%2Fragbase/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/curiousily%2Fragbase/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/curiousily","download_url":"https://codeload.github.com/curiousily/ragbase/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":248996752,"owners_count":21195774,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["langchain","llama3","llm","pdf","rag","retrieval-augmented-generation","streamlit"],"created_at":"2024-09-24T21:38:41.776Z","updated_at":"2025-04-15T02:40:58.828Z","avatar_url":"https://github.com/curiousily.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# RagBase - Private Chat with Your Documents\n\n\u003e Completely local RAG with chat UI\n\n\u003ca href=\"https://www.mlexpert.io/bootcamp\" target=\"_blank\"\u003e\n  \u003cimg src=\"https://raw.githubusercontent.com/curiousily/ragbase/master/.github/ui.png\"\u003e\n\u003c/a\u003e\n\n## Demo\n\nCheck out the [RagBase on Streamlit Cloud](https://ragbase.streamlit.app/). 
Runs with Groq API.\n\n## Installation\n\nClone the repo:\n\n```sh\ngit clone git@github.com:curiousily/ragbase.git\ncd ragbase\n```\n\nInstall the dependencies (requires Poetry):\n\n```sh\npoetry install\n```\n\nFetch your LLM (gemma2:9b by default):\n\n```sh\nollama pull gemma2:9b\n```\n\nRun the Ollama server:\n\n```sh\nollama serve\n```\n\nStart RagBase:\n\n```sh\npoetry run streamlit run app.py\n```\n\n## Architecture\n\n\u003ca href=\"https://www.mlexpert.io/bootcamp\" target=\"_blank\"\u003e\n  \u003cimg src=\"https://raw.githubusercontent.com/curiousily/ragbase/master/.github/architecture.png\"\u003e\n\u003c/a\u003e\n\n### Ingestor\n\nExtracts text from PDF documents and creates chunks (using semantic and character splitters) that are stored in a vector database.\n\n### Retriever\n\nGiven a query, searches for similar documents, reranks the results, and applies an LLM chain filter before returning the response.\n\n### QA Chain\n\nCombines the LLM with the retriever to answer a given user question.\n\n## Tech Stack\n\n- [Ollama](https://ollama.com/) - run local LLMs\n- [Groq API](https://groq.com/) - fast inference for multiple LLMs\n- [LangChain](https://www.langchain.com/) - build LLM-powered apps\n- [Qdrant](https://qdrant.tech/) - vector search/database\n- [FlashRank](https://github.com/PrithivirajDamodaran/FlashRank) - fast reranking\n- [FastEmbed](https://qdrant.github.io/fastembed/) - lightweight and fast embedding generation\n- [Streamlit](https://streamlit.io/) - build UI for data apps\n- [PDFium](https://pdfium.googlesource.com/pdfium/) - PDF processing and text extraction\n\n## Add Groq API Key (Optional)\n\nYou can also use the Groq API in place of the local LLM. For that, you'll need a `.env` file with your Groq API key:\n\n```sh\nGROQ_API_KEY=YOUR API 
KEY\n```","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fcuriousily%2Fragbase","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fcuriousily%2Fragbase","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fcuriousily%2Fragbase/lists"}