{"id":20115392,"url":"https://github.com/amperecomputingai/llm_app_frameworks","last_synced_at":"2026-06-07T04:31:16.050Z","repository":{"id":237564793,"uuid":"764312292","full_name":"AmpereComputingAI/llm_app_frameworks","owner":"AmpereComputingAI","description":"Integrating Ampere's high performance LLM inference with popular application building frameworks in the industry","archived":false,"fork":false,"pushed_at":"2024-05-22T17:56:33.000Z","size":9021,"stargazers_count":0,"open_issues_count":2,"forks_count":0,"subscribers_count":2,"default_branch":"master","last_synced_at":"2024-05-22T18:59:12.540Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/AmpereComputingAI.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2024-02-27T21:15:20.000Z","updated_at":"2024-05-22T17:56:36.000Z","dependencies_parsed_at":null,"dependency_job_id":"f2e900f2-92d9-4e03-8cc5-935dbf8b157f","html_url":"https://github.com/AmpereComputingAI/llm_app_frameworks","commit_stats":null,"previous_names":["amperecomputingai/llm_app_frameworks"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/AmpereComputingAI%2Fllm_app_frameworks","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/AmpereComputingAI%2Fllm_app_frameworks/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/AmpereComputingAI%2Fllm_app_frameworks/releases","manifests_url":"https:
//repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/AmpereComputingAI%2Fllm_app_frameworks/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/AmpereComputingAI","download_url":"https://codeload.github.com/AmpereComputingAI/llm_app_frameworks/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":241557319,"owners_count":19981905,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-11-13T18:35:04.518Z","updated_at":"2026-06-07T04:31:16.019Z","avatar_url":"https://github.com/AmpereComputingAI.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# LLM Application Frameworks \nContains scripts that integrate an LLM with a vector DB (ChromaDB) to test Retrieval Augmented Generation (RAG) use cases. 
Two commonly used open-source frameworks are potential candidates for integration with llama-cpp/llama-cpp-python:\n\nCLI-based RAG applications for \n[**LangChain**](https://github.com/AmpereComputingAI/llm_app_frameworks/tree/master/langchain) and \n[**LlamaIndex**](https://github.com/AmpereComputingAI/llm_app_frameworks/tree/master/llamaindex)\n\nPrebuilt Docker image (based on Ampere Optimized PyTorch and llama-cpp):\n\n```\n# docker pull ghcr.io/amperecomputingai/local-rag:v0.0.1\n# docker run -it --rm ghcr.io/amperecomputingai/local-rag:v0.0.1\n```\n\nThe first step is to download the Llama model:\n```\nwget https://huggingface.co/TheBloke/Llama-2-7B-Chat-GGUF/resolve/main/llama-2-7b-chat.Q4_K_M.gguf\n```\n\nRunning the LangChain RAG Application\n```\n# python langchain-cli.py\n\nWelcome to llama CLI. Reserved first words help, upload and quit\n               Type \"help\" for available commands.\nllama QA \u003e\u003e Who is president of India?\nThe current President of India is Ram Nath Kovind.\nllama QA \u003e\u003e upload news_india.txt\nUploading  news_india.txt [Done]\nllama QA \u003e\u003e Who is president of India?\nDroupadi Murmu\nllama QA \u003e\u003e quit\n\n```\n\nWhen \"llama-2-7b-chat.Q4_K_M.gguf\" was trained, the president of India was \"Ram Nath Kovind\". The command \"upload news_india.txt\" uploads the latest news regarding the president of India to RAG. Asking the same question again then returns the correct answer, \"Droupadi Murmu\".\n\nRunning the LlamaIndex RAG Application\n```\n# python llamaindex-cli.py\n\nWelcome to llama CLI. Reserved first words help, upload and quit\n               Type \"help\" for available commands.\nllama QA \u003e\u003e who is president of USA?\n president of the United States is Joe Biden.\nllama QA \u003e\u003e upload news_usa.txt\nUploading  news_usa.txt [Done]\nllama QA \u003e\u003e who is president of USA?\n answer to the query is \"Tom Cruise\".\nllama QA \u003e\u003e quit\n```\n\nFake news (news_usa.txt) is uploaded to RAG. 
This changes the answer from \"Joe Biden\" to \"Tom Cruise\".\n\nFor best results, use the Ampere Optimized local-rag Docker image on an **OCI A1 instance**.\n\nSystem requirements: 64 OCPUs + 128GB memory.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Famperecomputingai%2Fllm_app_frameworks","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Famperecomputingai%2Fllm_app_frameworks","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Famperecomputingai%2Fllm_app_frameworks/lists"}