{"id":28559889,"url":"https://github.com/ghostpack/ragnarok","last_synced_at":"2025-06-17T15:33:58.037Z","repository":{"id":227503575,"uuid":"771250771","full_name":"GhostPack/RAGnarok","owner":"GhostPack","description":"A Nemesis powered Retrieval-Augmented Generation (RAG) chatbot proof-of-concept.","archived":false,"fork":false,"pushed_at":"2024-03-13T21:24:55.000Z","size":1745,"stargazers_count":61,"open_issues_count":2,"forks_count":8,"subscribers_count":4,"default_branch":"main","last_synced_at":"2025-06-10T09:06:36.076Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"bsd-3-clause","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/GhostPack.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2024-03-13T00:34:14.000Z","updated_at":"2025-06-07T14:52:20.000Z","dependencies_parsed_at":"2025-06-10T09:06:35.955Z","dependency_job_id":"366912a5-6e33-4f5c-bf30-7c1abc939184","html_url":"https://github.com/GhostPack/RAGnarok","commit_stats":null,"previous_names":["ghostpack/ragnarok"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/GhostPack/RAGnarok","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/GhostPack%2FRAGnarok","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/GhostPack%2FRAGnarok/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/GhostPack%2FRAGnarok/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/GhostPack%2FRAGnarok/manifests","owner_url":"
https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/GhostPack","download_url":"https://codeload.github.com/GhostPack/RAGnarok/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/GhostPack%2FRAGnarok/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":260388463,"owners_count":23001529,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2025-06-10T09:06:32.909Z","updated_at":"2025-06-17T15:33:53.004Z","avatar_url":"https://github.com/GhostPack.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# RAGnarok\n\nRAGnarok is a [Retrieval-Augmented Generation](https://blogs.nvidia.com/blog/what-is-retrieval-augmented-generation/) chatbot frontend for [Nemesis](https://github.com/SpecterOps/Nemesis). It allows you to ask questions about text extracted from compatible documents processed by Nemesis.\n\n## RAG\n\n**Short explanation:** The general idea with Retrieval-Augmented Generation (RAG) is to allow a large language model (LLM) to answer questions about documents you've indexed.\n\n**Medium explanation:** RAG involves processing and turning text inputs into set-length vectors via an embedding model, which are then stored in a backend vector database. 
Questions to the LLM are then used to look up the \"most similar\" chunks of text, which are then fed into the context prompt for an LLM.\n\n![](https://miro.medium.com/v2/resize:fit:1400/format:webp/1*kSkeaXRvRzbJ9SrFZaMoOg.png)\n[*Source*](https://towardsdatascience.com/retrieval-augmented-generation-rag-from-theory-to-langchain-implementation-4e9bd5f6a4f2)\n\n***Longer explanation in the rest of the section :)***\n\n***Even Longer explanation in [this blog post](https://posts.specterops.io/summoning-ragnarok-with-your-nemesis-7c4f0577c93b).***\n\n#### Indexing\n\nRetrieval-augmented generation is an architecture in which incoming documents go through the following steps:\n\n1. Plaintext is extracted from any incoming documents.\n   - Nemesis uses [Apache Tika](https://tika.apache.org/) to extract text from compatible documents.\n2. The text is tokenized into chunks of up to X tokens, where X depends on the *context window* of the embedding model used.\n   - Nemesis uses LangChain's [TokenTextSplitter](https://api.python.langchain.com/en/latest/text_splitter/langchain.text_splitter.TokenTextSplitter.html), a chunk size of 510 tokens, and a 15% overlap between chunks.\n3. Each chunk of text is processed by an [embedding model](https://huggingface.co/spaces/mteb/leaderboard) which turns the input text into a fixed-length vector of floats.\n   - As [Pinecone explains](https://www.pinecone.io/learn/vector-embeddings/), what's cool about embedding models is that the vector representations they produce preserve \"semantic similarity\", meaning that more similar chunks of text will have more similar vectors.\n   - Nemesis currently uses the [TaylorAI/gte-tiny](https://huggingface.co/TaylorAI/gte-tiny) embedding model as it's fast, but others are possible.\n4. Each vector and associated snippet of text is stored in a vector database.\n   - Nemesis uses Elasticsearch for vector storage. 
\n\n#### Semantic Search\n\nThe steps above are the initial indexing process that Nemesis has been performing for a while. However, in order to complete a RAG pipeline, the next steps are:\n\n5. Take an input prompt, such as \"*What is a certificate?*\", and run it through the same embedding model the files were indexed with.\n6. Query the vector database (e.g., Elasticsearch) for the nearest **k** vectors + associated text chunks that are \"closest\" to the prompt input vector.\n   - This will return the **k** chunks of text that are the most similar to the input query.\n7. We also use Elasticsearch's traditional(-ish) BM25 text search over the text for each chunk.\n   - These two lists of results are combined with [Reciprocal Rank Fusion](https://learn.microsoft.com/en-us/azure/search/hybrid-search-ranking), and the top results from the fused list are returned.\n   - **Note:** steps 6 and 7 happen in the `nlp` container in Nemesis. This is exposed at http://\\\u003cnemesis\\\u003e/nlp/hybrid_search \n\n#### Reranking\n\nWe now have the **k** chunks of text most similar to our input query. If we want to get a bit fancier, we can execute what's called [reranking](https://www.pinecone.io/learn/series/rag/rerankers/).\n\n8. With reranking, the prompt question and text results are paired up (question, text) and fed into a more powerful model (well, more powerful than the embedding model) that is tuned for this purpose and known as a reranker. The reranker generates a similarity score for the input prompt and text chunk.\n   - RAGnarok uses an adapted version of [BAAI/bge-reranker-base](https://huggingface.co/BAAI/bge-reranker-base) for reranking.\n9. The results are then **reranked** and the top X results are selected.\n\n#### LLM Processing\n\n10. Finally, the resulting texts are combined with a prompt to the (local) LLM. 
Think something along the lines of \"Given these chunks of text {X}, answer this question {Y}\".\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fghostpack%2Fragnarok","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fghostpack%2Fragnarok","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fghostpack%2Fragnarok/lists"}