{"id":21810929,"url":"https://github.com/avatsaev/av-local-llm-api","last_synced_at":"2026-04-15T19:34:09.709Z","repository":{"id":218741319,"uuid":"747225989","full_name":"avatsaev/av-local-llm-api","owner":"avatsaev","description":"Allows to easily run local REST API with a custom LLM, running locally or remotely, with user defined system instructions.  Useful for quick local autmations that require problem solving with large langague models and interaction via a REST API.","archived":false,"fork":false,"pushed_at":"2024-01-23T17:08:55.000Z","size":18,"stargazers_count":2,"open_issues_count":0,"forks_count":1,"subscribers_count":3,"default_branch":"main","last_synced_at":"2024-10-18T23:15:55.085Z","etag":null,"topics":["fastapi","llama","llamacpp","llm","machine-learning","mistral","openai","transformers"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/avatsaev.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2024-01-23T14:17:51.000Z","updated_at":"2024-07-21T10:58:46.000Z","dependencies_parsed_at":"2024-07-27T20:05:45.002Z","dependency_job_id":null,"html_url":"https://github.com/avatsaev/av-local-llm-api","commit_stats":{"total_commits":11,"total_committers":1,"mean_commits":11.0,"dds":0.0,"last_synced_commit":"fc5a7b937e50b8f40fa564d7adc1d72bab0dd7a7"},"previous_names":["avatsaev/av-local-llm-api"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/avatsaev%2Fav-local-llm-api","tags_url"
:"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/avatsaev%2Fav-local-llm-api/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/avatsaev%2Fav-local-llm-api/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/avatsaev%2Fav-local-llm-api/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/avatsaev","download_url":"https://codeload.github.com/avatsaev/av-local-llm-api/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":244767702,"owners_count":20507110,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["fastapi","llama","llamacpp","llm","machine-learning","mistral","openai","transformers"],"created_at":"2024-11-27T13:39:02.567Z","updated_at":"2026-04-15T19:34:04.655Z","avatar_url":"https://github.com/avatsaev.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Local LLM API\n\nRuns a local REST API backed by a custom LLM, running locally or remotely, with user-defined system instructions.\n\nUseful for quick local automations that require problem solving with large language models and interaction via a REST API.\n\n## Installation with local inference\n\nRequirements:\n\n- Apple Silicon machine with at least 16GB RAM (M1/M2/M3)\n  - OR an Nvidia CUDA GPU\n- Locally downloaded LLAMA-compatible GGUF model\n  - [Download Mistral-7B-Instruct Q8.0](https://huggingface.co/TheBloke/Mistral-7B-Instruct-v0.2-GGUF/tree/main)\n\n### Setup\n\n#### Run the following in the 
terminal\n\nFor Apple Silicon:\n\n```bash\nCMAKE_ARGS=\"-DLLAMA_METAL=on\" pip install -r requirements.txt\n```\n\nFor CUDA:\n\n```bash\nCMAKE_ARGS=\"-DLLAMA_CUBLAS=on\" pip install -r requirements.txt\n```\n\n#### Set up the env config:\n\nUpdate `system_prompt.txt` with instructions of your choice.\n\nRename `.env.example` to `.env`.\n\nSet the `MODEL_PATH` variable to the path of your locally downloaded `gguf` file.\n\n#### Run the web server:\n\n```bash\nuvicorn main:app\n```\n\nThe inference endpoint is available at: `http://127.0.0.1:8000/inference`\n\nMake a `POST` request with a JSON body containing your input.\n\ncURL example:\n\n```bash\ncurl --location 'http://127.0.0.1:8000/inference' \\\n--header 'Content-Type: application/json' \\\n--data-raw '{\n  \"user_input\": \"YOUR INPUT HERE\"\n}'\n\n```\n\n## Remote inference with an OpenAI-compatible API\n\nIn `.env`, set inference mode to `remote`:\n\n```\nLLM_INFERENCE_MODE = 'remote'\n```\n\nDefine the remote server configuration:\n\n```\nLLM_API_URL = 'https://api.openai.com/v1/'\nLLM_API_KEY = 'YOUR_KEY'\nLLM_MODEL = 'gpt-3.5-turbo'\n```\n\n## Test example\n\n`system_prompt.txt`:\n\n```\nYou are a code refactoring expert that takes user input code in any language, and refactors all function parameters called `user` to DEPRECATED_user.\n---EXAMPLES---\nExample input:\nUpdateUser(user: newUserParams)\n\nExample output:\nUpdateUser(DEPRECATED_user: newUserParams)\n---\nRespond only with refactored code output and nothing else.\n\n```\n\n```bash\ncurl --location 'http://127.0.0.1:8000/inference' \\\n--header 'Content-Type: application/json' \\\n--data-raw '{\n  \"user_input\": \"UpdateUser(user: newUserParams)\\ndeleteUser(user: user)\\ncreateUser({user: {...userParams}})\"\n}'\n```\n\nResponse JSON:\n\n```json\n{\n  \"inference_output\": \" UpdateUser(DEPRECATED_user: newUserParams)\\ndeleteUser(DEPRECATED_user: DEPRECATED_user)\\ncreateUser({DEPRECATED_user: {...newUserParams}})\"\n}\n```\n\n---\n\nAuthor: Vatsaev 
Aslan (@avatsaev)\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Favatsaev%2Fav-local-llm-api","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Favatsaev%2Fav-local-llm-api","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Favatsaev%2Fav-local-llm-api/lists"}