{"id":27904822,"url":"https://github.com/runpod/langchain-runpod","last_synced_at":"2025-05-05T23:33:43.708Z","repository":{"id":282014947,"uuid":"946941489","full_name":"runpod/langchain-runpod","owner":"runpod","description":null,"archived":false,"fork":false,"pushed_at":"2025-04-03T15:00:55.000Z","size":153,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-05-05T14:55:54.702Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/runpod.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2025-03-11T23:12:03.000Z","updated_at":"2025-04-03T15:00:59.000Z","dependencies_parsed_at":null,"dependency_job_id":"108cb258-0a4b-4ae8-bff2-160d38d3832d","html_url":"https://github.com/runpod/langchain-runpod","commit_stats":null,"previous_names":["max4c/langchain-runpod","runpod/langchain-runpod"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/runpod%2Flangchain-runpod","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/runpod%2Flangchain-runpod/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/runpod%2Flangchain-runpod/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/runpod%2Flangchain-runpod/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/runpod","download_url":"https://codeload.github.com/runpod/langchain-runpod/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":252594390,"owners_count":21773657,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2025-05-05T23:33:43.101Z","updated_at":"2025-05-05T23:33:43.700Z","avatar_url":"https://github.com/runpod.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# langchain-runpod\n\n`langchain-runpod` integrates [RunPod Serverless](https://www.runpod.io/serverless-gpu) endpoints with LangChain.\n\nIt allows you to interact with custom large language models (LLMs) and chat models deployed on RunPod's cost-effective and scalable GPU infrastructure directly within your LangChain applications.\n\nThis package provides:\n- `RunPod`: For interacting with standard text-completion models.\n- `ChatRunPod`: For interacting with conversational chat models.\n\n## Installation\n\n```bash\npip install -U langchain-runpod\n```\n\n## Authentication\n\nTo use this integration, you need a RunPod API key.\n\n1.  Obtain your API key from the [RunPod API Keys page](https://www.runpod.io/console/user/settings).\n2.  Set it as an environment variable:\n\n```bash\nexport RUNPOD_API_KEY=\"your-runpod-api-key\"\n```\n\nAlternatively, you can pass the `api_key` directly when initializing the `RunPod` or `ChatRunPod` classes, though using environment variables is recommended for security.\n\n## Basic Usage\n\nYou will also need the **Endpoint ID** for your deployed RunPod Serverless endpoint. Find this in the RunPod console under Serverless -\u003e Endpoints.\n\n### LLM (`RunPod`)\n\nUse the `RunPod` class for standard LLM interactions (text completion).\n\n```python\nimport os\nfrom langchain_runpod import RunPod\n\n# Ensure API key is set (or pass it as api_key=\"...\")\n# os.environ[\"RUNPOD_API_KEY\"] = \"your-runpod-api-key\"\n\nllm = RunPod(\n    endpoint_id=\"your-endpoint-id\", # Replace with your actual Endpoint ID\n    model_name=\"runpod-llm\", # Optional: For metadata\n    temperature=0.7,\n    max_tokens=100,\n)\n\n# Synchronous call\nprompt = \"What is the capital of France?\"\nresponse = llm.invoke(prompt)\nprint(f\"Sync Response: {response}\")\n\n# Async call\n# response_async = await llm.ainvoke(prompt)\n# print(f\"Async Response: {response_async}\")\n\n# Streaming (Simulated)\n# print(\"Streaming Response:\")\n# for chunk in llm.stream(prompt):\n#     print(chunk, end=\"\", flush=True)\n# print()\n```\n\n### Chat Model (`ChatRunPod`)\n\nUse the `ChatRunPod` class for conversational interactions.\n\n```python\nimport os\nfrom langchain_core.messages import HumanMessage, SystemMessage\nfrom langchain_runpod import ChatRunPod\n\n# Ensure API key is set (or pass it as api_key=\"...\")\n# os.environ[\"RUNPOD_API_KEY\"] = \"your-runpod-api-key\"\n\nchat = ChatRunPod(\n    endpoint_id=\"your-endpoint-id\", # Replace with your actual Endpoint ID\n    model_name=\"runpod-chat\", # Optional: For metadata\n    temperature=0.7,\n    max_tokens=256,\n)\n\nmessages = [\n    SystemMessage(content=\"You are a helpful assistant.\"),\n    HumanMessage(content=\"What are the planets in our solar system?\"),\n]\n\n# Synchronous call\nresponse = chat.invoke(messages)\nprint(f\"Sync Response:\\n{response.content}\")\n\n# Async call\n# response_async = await chat.ainvoke(messages)\n# print(f\"Async Response:\\n{response_async.content}\")\n\n# Streaming (Simulated)\n# print(\"Streaming Response:\")\n# for chunk in chat.stream(messages):\n#     print(chunk.content, end=\"\", flush=True)\n# print()\n```\n\n## Features and Limitations\n\n### API Interaction\n- **Asynchronous Execution**: RunPod Serverless endpoints are inherently asynchronous. This integration handles the underlying polling mechanism for the `/run` and `/status/{job_id}` endpoints automatically for both `RunPod` and `ChatRunPod` classes.\n- **Synchronous Endpoint**: While RunPod offers a `/runsync` endpoint, this integration primarily uses the asynchronous `/run` -\u003e `/status` flow for better compatibility and handling of potentially long-running jobs. Polling parameters (`poll_interval`, `max_polling_attempts`) can be configured during initialization.\n\n### Feature Support\n\nThe level of support for advanced LLM features depends heavily on the **specific model and handler** deployed on your RunPod endpoint. The RunPod API itself provides a generic interface.\n\n| Feature               | Support Level                                                                                               | Notes                                                                                                                                                                                                |\n|-----------------------|-------------------------------------------------------------------------------------------------------------|------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|\n| **Core Invoke/Gen**   | ✅ Supported                                                                                                 | Basic text generation and chat conversations work as expected (sync \u0026 async).                                                                                                                        |\n| **Streaming**         | ⚠️ Simulated                                                                                                | The `.stream()` and `.astream()` methods work by getting the full response first and then yielding it chunk by chunk. True token-level streaming requires a WebSocket-enabled RunPod endpoint handler. |\n| **Tool Calling**      | ↔️ Endpoint Dependent                                                                                       | No built-in support via standardized RunPod API parameters. Depends entirely on the endpoint handler interpreting tool descriptions/schemas passed in the `input`. Standard tests skipped.       |\n| **Structured Output** | ↔️ Endpoint Dependent                                                                                       | No built-in support via standardized RunPod API parameters. Depends on the endpoint handler's ability to generate structured formats (e.g., JSON) based on input instructions. Standard tests skipped. |\n| **JSON Mode**         | ↔️ Endpoint Dependent                                                                                       | No dedicated `response_format` parameter at the RunPod API level. Depends on the endpoint handler. Standard tests skipped.                                                                       |\n| **Token Usage**       | ❌ Not Available                                                                                            | The RunPod API does not provide standardized token usage fields. Usage metadata tests are marked `xfail`. Any token info must come from the endpoint handler's custom output.                        |\n| **Logprobs**          | ❌ Not Available                                                                                            | The RunPod API does not provide logprobs.                                                                                                                                                            |\n| **Image Input**       | ↔️ Endpoint Dependent                                                                                       | Standard tests pass, likely by adapting image URLs/data. Actual support depends on the endpoint handler.                                                                                             |\n\n### Important Notes\n\n1. **Endpoint Handler**: Ensure your RunPod endpoint runs a compatible LLM server (e.g., vLLM, TGI, FastChat, text-generation-webui) that accepts standard inputs (like `prompt` or `messages`) and returns text output in a common format (direct string, or a dictionary containing keys like `text`, `content`, `output`, `choices`, etc.). The integration attempts to parse common formats, but custom handlers might require modifications to the parsing logic (e.g., overriding `_process_response`).\n\n## Setting Up a RunPod Endpoint\n\n1. Go to [RunPod Serverless](https://www.runpod.io/console/serverless) in your RunPod console.\n2. Click \"New Endpoint\".\n3. Select a GPU and a suitable template (e.g., a template running vLLM, TGI, FastChat, or text-generation-webui with your desired model).\n4. Configure settings (like FlashInfer, custom container image if needed) and deploy.\n5. Once active, copy the **Endpoint ID** for use with this library.\n\nFor more details, refer to the [RunPod Serverless Documentation](https://docs.runpod.io/serverless/overview).\n\n## Future Enhancements\n\n- **Native Streaming via RunPod SDK:** While the current implementation simulates streaming, future versions could potentially integrate the official [`runpod` Python SDK](https://docs.runpod.io/sdks/python/endpoints/#streaming). This would enable true, low-latency token streaming *if* the target RunPod endpoint handler is configured with `\"return_aggregate_stream\": True`.\n- **Standardized Feature Handling:** Explore ways to better handle or document patterns for features like Tool Calling or JSON Mode if common conventions emerge for RunPod endpoint handlers.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Frunpod%2Flangchain-runpod","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Frunpod%2Flangchain-runpod","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Frunpod%2Flangchain-runpod/lists"}