{"id":13709302,"url":"https://github.com/sroecker/LLM_AppDev-HandsOn","last_synced_at":"2025-05-06T15:32:37.134Z","repository":{"id":207656802,"uuid":"719729045","full_name":"sroecker/LLM_AppDev-HandsOn","owner":"sroecker","description":"Repository and hands-on workshop on how to develop applications with local LLMs","archived":false,"fork":false,"pushed_at":"2024-07-03T15:38:01.000Z","size":5329,"stargazers_count":389,"open_issues_count":1,"forks_count":67,"subscribers_count":8,"default_branch":"main","last_synced_at":"2024-11-13T19:39:49.599Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"","language":"Jupyter Notebook","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/sroecker.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2023-11-16T19:24:49.000Z","updated_at":"2024-10-31T14:52:50.000Z","dependencies_parsed_at":"2023-12-25T21:34:06.641Z","dependency_job_id":"9913d6f8-c225-4133-bb5f-1f418dc593bc","html_url":"https://github.com/sroecker/LLM_AppDev-HandsOn","commit_stats":null,"previous_names":["sroecker/llm_appdev-handson"],"tags_count":3,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sroecker%2FLLM_AppDev-HandsOn","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sroecker%2FLLM_AppDev-HandsOn/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sroecker%2FLLM_AppDev-HandsOn/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sroecker%2FLLM_AppDev-HandsOn/manifests","o
wner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/sroecker","download_url":"https://codeload.github.com/sroecker/LLM_AppDev-HandsOn/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":252713028,"owners_count":21792416,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-08-02T23:00:37.812Z","updated_at":"2025-05-06T15:32:32.125Z","avatar_url":"https://github.com/sroecker.png","language":"Jupyter Notebook","readme":"# LLM App Dev Workshop\n\n## Introduction\n\n\u003cimg src=\"localllamas.png\" alt=\"a bunch of happy local llamas\" width=\"256\"\u003e\n\n**2024-07-03: Streamlit app changes:**\nThe chatbot app code now uses Ollama embeddings and has a configurable system prompt.\n\nThis repository demonstrates how to build a simple LLM-based chatbot that can answer questions based on your documents (retrieval augmented generation - RAG) and how to deploy it using [Podman](https://podman.io) or on the [OpenShift](https://www.openshift.com) Container Platform (k8s).\n\nThe corresponding [workshop](workshop/Darmstadt_v1.md) - first run at [Red Hat Developers Hands-On Day 2023](https://events.redhat.com/profile/form/index.cfm?PKformID=0x900962abcd\u0026sc_cid=7013a000003SlFvAAK) in Darmstadt, Germany - teaches participants the basic concepts of LLMs \u0026 RAG, and how to adapt this example implementation to their own specific purpose GPT.\n\nThe software stack only uses open source tools [streamlit](https://streamlit.io), [LlamaIndex](https://llamaindex.ai) and local open LLMs via 
[Ollama](https://ollama.ai). Real open AI for the GPU poor.\n\nEveryone is invited to fork this repository, create their own specific purpose chatbot based on their documents, improve the setup or even hold their own workshop.\n\n## Setup\n\nFor the local setup a Mac M1 with 16GB of unified memory or more is recommended. First download Ollama from [ollama.ai](https://ollama.ai) and install it.\n\nOn Linux you can disable the Ollama service for better debugging:\n\n```\nsudo systemctl disable ollama\nsudo systemctl stop ollama\n```\n\nand then manually run `ollama serve`.\n\nFor the local example have a look at the folder `streamlit` and install the requirements.\n\nCreate a virtual environment first:\n```\npython -m venv venv\nsource venv/bin/activate\n```\n\nInstall the requirements:\n```\npip install -r requirements.txt\n```\n\nThen start streamlit with:\n```\nstreamlit run app.py\n```\n\nModify the system prompt and copy different data sources to `docs/` in order to create your own version of the chatbot.\nYou can set the Ollama host via the environment variable `OLLAMA_HOST`.\n\nYou can download models locally with `ollama pull zephyr` or via API:\n\n```\ncurl -X POST http://ollama:11434/api/pull -d '{\"name\": \"zephyr\"}'\n```\n\nFirst start the ollama service as described and download the [Zephyr model](https://ollama.ai/library/zephyr).\nTo test the ollama server you can call the generate API:\n\n```\ncurl -X POST http://ollama:11434/api/generate -d '{\"model\": \"zephyr\", \"prompt\": \"Why is the sky blue?\"}'\n```\n\nAll of these commands are also documented in our [cheat sheet](cheatsheet.txt).\n\n![](linuxbot.gif)\n\n## Deployment\n\n\n### Podman\n\nBuild the container based on [UBI9 Python 3.11](https://catalog.redhat.com/software/containers/ubi9/python-311/63f764b03f0b02a2e2d63fff?architecture=amd64\u0026image=654d1ee47c3bfba06c9c59ea):\n\n```\npodman build -t linuxbot-app .\n```\nIf you're building on an arm64 Mac and deploying on amd64, then generally don't 
forget to add `--platform` (in this case our base image is amd64 anyways):\n\n```\npodman build --platform=\"linux/amd64\" -t linuxbot-app .\n```\n\nWe will create a network for our linuxbot and ollama:\n\n```\npodman network create linuxbot\n```\n\nCheck if DNS is enabled (it's not on the default net):\n\n```\npodman network inspect linuxbot\n```\n\nNow you can either start Ollama locally with `ollama serve` or start an Ollama container with\n\n```\npodman run --net linuxbot --name ollama -p 11434:11434 --rm docker.io/ollama/ollama:latest\n```\n\nNote: We just forward the port so we can curl it more easily locally as well.\n\n\u003cdetails\u003e\u003csummary\u003eClick to unfold the details for GPU support\u003c/summary\u003e\n\nThis ollama service won't have GPU support enabled and will be much slower compared to running Ollama locally on a Mac M1, for example.\nIn order to run this container with NVIDIA GPU support we recommend using the [NVIDIA Container Toolkit with Container Device Interface (CDI)](https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/latest/cdi-support.html). 
Follow the instructions from NVIDIA then run podman with:\n\n```\npodman run --rm --net linuxbot --name ollama --device nvidia.com/gpu=all --security-opt=label=disable ollama\n```\n\nTo test whether your graphics card is recognized, you can use a base image that contains `nvidia-smi`, e.g.:\n\n```\npodman run --rm --device nvidia.com/gpu=all --security-opt=label=disable ubuntu nvidia-smi -L\n```\n\nFor AMD graphics cards you need to forward the Kernel Fusion Driver (KFD) and Direct Rendering Infrastructure (DRI) to the container:\n\n```\npodman run -it --device=/dev/kfd --device=/dev/dri --security-opt=label=disable docker.io/ollama/ollama\n```\n\n\u003c/details\u003e\n\nSince we create the embeddings locally in the streamlit app we need to increase shared memory for PyTorch in order to get it running:\n\n```\npodman run --net linuxbot --name linuxbot-app -p 8080:8080 --shm-size=2gb -e OLLAMA_HOST=ollama -it --rm localhost/linuxbot-app\n```\n\nYou can set the Ollama server via the environment variable `OLLAMA_HOST`; the default is `localhost`.\n\nNOTE: It would be much better to generate the embeddings with the ollama service, but this is not yet supported in LlamaIndex.\n\n### OpenShift\n\nCreate a new project (namespace) for your workshop and deploy the ollama service in it:\n\n```\noc new-project my-workshop\noc apply -f deployments/ollama.yaml\n```\n\nIf you want to enable GPU support you have to install and instantiate the NVIDIA GPU Operator and Node Feature Discovery (NFD) Operator as described on the [AI on OpenShift](https://ai-on-openshift.io/odh-rhoai/nvidia-gpus/) page, then deploy `ollama-gpu.yaml` instead.\n\n```\noc apply -f deployments/ollama-gpu.yaml\n```\n\nThe streamlit application (linuxbot) can be deployed with:\n\n```\noc apply -f deployments/linuxbot.yaml\n```\n\nWe have published a preconfigured container image on [quay.io/sroecker](https://quay.io/sroecker/linuxbot-app) that is used in this deployment.\n\nIn order to debug 
your application and ollama service you can deploy a curl image like this:\n\n```\noc run mycurl --image=curlimages/curl -it -- sh\noc attach mycurl -c mycurl -i -t\noc delete pod mycurl\n```\n\n## References\n\n- [Build a chatbot with custom data sources, powered by LlamaIndex](https://blog.streamlit.io/build-a-chatbot-with-custom-data-sources-powered-by-llamaindex/)\n- [SQL Query Engine with LlamaIndex + DuckDB](https://gpt-index.readthedocs.io/en/latest/examples/index_structs/struct_indices/duckdb_sql_query.html)\n- [AI on Openshift - LLMs, Chatbots, Talk with your Documentation](https://ai-on-openshift.io/demos/llm-chat-doc/llm-chat-doc/)\n- [Open Sourcerers - A personal AI assistant for developers that doesn't phone home](https://www.opensourcerers.org/2023/11/06/a-personal-ai-assistant-for-developers-that-doesnt-phone-home/)\n\n","funding_links":[],"categories":["Jupyter Notebook"],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsroecker%2FLLM_AppDev-HandsOn","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fsroecker%2FLLM_AppDev-HandsOn","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsroecker%2FLLM_AppDev-HandsOn/lists"}