{"id":16164007,"url":"https://github.com/cyclenerd/google-cloud-gcp-openai-api","last_synced_at":"2025-04-06T07:13:46.519Z","repository":{"id":188530444,"uuid":"678919406","full_name":"Cyclenerd/google-cloud-gcp-openai-api","owner":"Cyclenerd","description":"🌴 Drop-in replacement REST API for Vertex AI (PaLM 2, Codey, Gemini) that is compatible with the OpenAI API specifications","archived":false,"fork":false,"pushed_at":"2025-02-11T16:39:06.000Z","size":2905,"stargazers_count":94,"open_issues_count":2,"forks_count":21,"subscribers_count":2,"default_branch":"master","last_synced_at":"2025-03-30T06:06:48.589Z","etag":null,"topics":["bard-api","chatbot","chatgpt","chatgpt-api","fastapi","gemini","google","google-cloud","google-cloud-platform","langchain","openai","openai-api","openai-chatgpt","palm","palm2","vertex-ai"],"latest_commit_sha":null,"homepage":"","language":"Jupyter Notebook","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/Cyclenerd.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":"CONTRIBUTING.md","funding":".github/FUNDING.yml","license":"LICENSE","code_of_conduct":"CODE_OF_CONDUCT.md","threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null},"funding":{"github":"Cyclenerd"}},"created_at":"2023-08-15T17:29:31.000Z","updated_at":"2025-03-06T20:12:25.000Z","dependencies_parsed_at":"2023-08-15T19:27:52.209Z","dependency_job_id":"4be76a8f-f690-4d06-91c3-ea438e4d770e","html_url":"https://github.com/Cyclenerd/google-cloud-gcp-openai-api","commit_stats":{"total_commits":57,"total_committers":3,"mean_commits":19.0,"dds":"0.45614035087719296","last_synced_commit":"eb5eaf0cb19f6efd14ccec8bc30a489965fae9d3"},"previous_names":["cyclenerd/google-cloud-gcp-open
ai-api"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Cyclenerd%2Fgoogle-cloud-gcp-openai-api","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Cyclenerd%2Fgoogle-cloud-gcp-openai-api/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Cyclenerd%2Fgoogle-cloud-gcp-openai-api/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Cyclenerd%2Fgoogle-cloud-gcp-openai-api/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/Cyclenerd","download_url":"https://codeload.github.com/Cyclenerd/google-cloud-gcp-openai-api/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":247445671,"owners_count":20939958,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["bard-api","chatbot","chatgpt","chatgpt-api","fastapi","gemini","google","google-cloud","google-cloud-platform","langchain","openai","openai-api","openai-chatgpt","palm","palm2","vertex-ai"],"created_at":"2024-10-10T02:44:58.508Z","updated_at":"2025-04-06T07:13:46.500Z","avatar_url":"https://github.com/Cyclenerd.png","language":"Jupyter Notebook","funding_links":["https://github.com/sponsors/Cyclenerd"],"categories":[],"sub_categories":[],"readme":"# OpenAI API for Google Cloud Vertex AI\n\n[![Badge: Google Cloud](https://img.shields.io/badge/Google%20Cloud-%234285F4.svg?logo=google-cloud\u0026logoColor=white)](#readme)\n[![Badge: 
OpenAI](https://img.shields.io/badge/OpenAI-%23412991.svg?logo=openai\u0026logoColor=white)](#readme)\n[![Badge: Python](https://img.shields.io/badge/Python-3670A0?logo=python\u0026logoColor=ffdd54)](#readme)\n\nThis project is a drop-in replacement REST API for Vertex AI (**PaLM 2, Codey, Gemini**) that is compatible with the OpenAI API specifications.\n\nExamples:\n\n| Chat with Gemini in Chatbot UI                            | Get help from Gemini in VSCode                    |\n|-----------------------------------------------------------|---------------------------------------------------|\n| ![Screenshot: Chatbot UI chat](./img/chatbot-ui-chat.png) | ![Screenshot: VSCode chat](./img/vscode-chat.png) |\n\nThis project is inspired by the idea of [LocalAI](https://github.com/go-skynet/LocalAI)\nbut with the focus on making [Google Cloud Platform Vertex AI PaLM](https://ai.google/) more accessible to anyone.\n\nA Google Cloud Run service is installed that translates the OpenAI API calls to Vertex AI (PaLM 2, Codey, Gemini).\n\n\u003cp align=\"center\"\u003e\n  \u003cpicture\u003e\n    \u003csource media=\"(prefers-color-scheme: dark)\" srcset=\"img/openai-api-cloud-run-vertex-dark.png\"\u003e\n    \u003cimg src=\"img/openai-api-cloud-run-vertex.png\" alt=\"Diagram: OpenAI, Google Cloud Run and Vertex AI\"\u003e\n  \u003c/picture\u003e\n\u003c/p\u003e\n\nSupported OpenAI API services:\n\n| OpenAI               | API                    | Supported |\n|----------------------|------------------------|-----------|\n| List models          | `/v1/models`           | ✅        |\n| Chat Completions     | `/v1/chat/completions` | ✅        |\n| Completions (Legacy) | `/v1/completions`      | ❌        |\n| Embeddings           | `/v1/embeddings`       | ❌        |\n\nThe software is developed in [Python](https://www.python.org/)\nand based on [FastAPI](https://fastapi.tiangolo.com/)\nand [LangChain](https://docs.langchain.com/docs/).\n\nEverything is designed to be very 
simple,\nso you can easily adjust the source code to your individual needs.\n\n\n## Step by Step Guide\n\nA Jupyter notebook [`Vertex_AI_Chat.ipynb`](./Vertex_AI_Chat.ipynb) with step-by-step instructions is provided.\nIt helps you deploy the API backend and the [Chatbot UI](https://github.com/mckaywrigley/chatbot-ui) frontend as Google Cloud Run services.\n\n* [Open in Colab](https://colab.research.google.com/github/Cyclenerd/google-cloud-gcp-openai-api/blob/master/Vertex_AI_Chat.ipynb)\n* [Open in Vertex AI Workbench](https://console.cloud.google.com/vertex-ai/workbench/deploy-notebook?download_url=https://raw.githubusercontent.com/Cyclenerd/google-cloud-gcp-openai-api/master/Vertex_AI_Chat.ipynb)\n\n\n## Deploying to Cloud Run\n\nRequirements:\n\nYour user (the one used for deployment) must have proper permissions in the project.\nFor a fast and hassle-free deployment, the \"Owner\" role is recommended.\n\nIn addition, the default compute service account (`[PROJECT_NR]-compute@developer.gserviceaccount.com`)\nmust have the \"Vertex AI User\" role (`roles/aiplatform.user`).\n\n\nAuthenticate:\n\n```bash\ngcloud auth login\n```\n\nSet default project:\n\n```bash\ngcloud config set project [PROJECT_ID]\n```\n\nRun the following script to create a container image\nand deploy that container as a public API (which allows unauthenticated calls) in Google Cloud Run:\n\n```bash\nbash deploy.sh\n```\n\n\u003e Note: You can change the generated *fake* OpenAI API key and Google Cloud region with environment variables:\n\u003e \n\u003e ```bash\n\u003e export OPENAI_API_KEY=\"sk-XYZ\"\n\u003e export GOOGLE_CLOUD_LOCATION=\"europe-west1\"\n\u003e bash deploy.sh\n\u003e ```\n\n\n## Running Locally\n\nThe software was tested on GNU/Linux and macOS with Python 3.11 and 3.12.3 ([3.12.4](https://github.com/pydantic/pydantic/issues/9609) currently not working).\nIf you want to use the software under Windows, you must set the environment variables with `set` instead of 
`export`.\n\nYou should also create a [virtual environment](https://docs.python.org/3/library/venv.html) with the version of Python you want to use, and activate it before proceeding.\n\nYou also need the [Google Cloud CLI](https://cloud.google.com/sdk/docs/install).\nThe Google Cloud CLI includes the `gcloud` command-line tool.\n\nCreate a Python virtual environment and install the requirements:\n\n```bash\npython3 -m venv .venv \u0026\u0026 \\\nsource .venv/bin/activate \u0026\u0026 \\\npip install -r requirements.txt\n```\n\nAuthenticate:\n\n```bash\ngcloud auth application-default login\n```\n\nSet the quota project:\n\n```bash\ngcloud auth application-default set-quota-project [PROJECT_ID]\n```\n\nRun with default model:\n\n```bash\nexport DEBUG=\"True\"\nexport OPENAI_API_KEY=\"sk-XYZ\"\nuvicorn vertex:app --reload\n```\n\nExample for Windows (Command Prompt):\n\n```bat\nset DEBUG=True\nset OPENAI_API_KEY=sk-XYZ\nuvicorn vertex:app --reload\n```\n\nRun with Gemini `gemini-pro` model:\n\n```bash\nexport DEBUG=\"True\"\nexport OPENAI_API_KEY=\"sk-XYZ\"\nexport MODEL_NAME=\"gemini-pro\"\nuvicorn vertex:app --reload\n```\n\nRun with Codey `codechat-bison-32k` model:\n\n```bash\nexport DEBUG=\"True\"\nexport OPENAI_API_KEY=\"sk-XYZ\"\nexport MODEL_NAME=\"codechat-bison-32k\"\nexport MAX_OUTPUT_TOKENS=\"16000\"\nuvicorn vertex:app --reload\n```\n\nThe application will now be running on your local computer.\nYou can access it by opening a web browser and navigating to the following address:\n\n```text\nhttp://localhost:8000/\n```\n\n## Usage\n\nHTTP request and response formats are consistent with the [OpenAI API](https://platform.openai.com/docs/api-reference/chat/object).\n\nFor example, to generate a chat completion, you can send a POST request to the `/v1/chat/completions` endpoint with the instruction as the request body:\n\n```bash\ncurl --location 'http://[ENDPOINT]/v1/chat/completions' \\\n--header 'Content-Type: application/json' \\\n--header 'Authorization: Bearer 
[API-KEY]' \\\n--data '{\n    \"model\": \"gpt-3.5-turbo\",\n    \"messages\": [\n      {\n        \"role\": \"user\",\n        \"content\": \"Say this is a test!\"\n      }\n    ]\n  }'\n```\n\nResponse:\n\n```json\n{\n  \"id\": \"cmpl-efccdeb3d2a6cfe144fdde11\",\n  \"created\": 1691577522,\n  \"object\": \"chat.completion\",\n  \"model\": \"gpt-3.5-turbo\",\n  \"usage\": {\n    \"prompt_tokens\": 0,\n    \"completion_tokens\": 0,\n    \"total_tokens\": 0\n  },\n  \"choices\": [\n    {\n      \"message\": {\n        \"role\": \"assistant\",\n        \"content\": \"Sure, this is a test.\"\n      },\n      \"finish_reason\": \"stop\",\n      \"index\": 0\n    }\n  ]\n}\n```\n\n### Bruno API client\n\n![Screenshot: Bruno API client](./img/bruno.png)\n\nDownload export for [Bruno](https://www.usebruno.com/) API client: [`bruno-export.json`](./bruno-export.json)\n\n## Configuration\n\nThe configuration of the software can be done with environment variables.\n\n![Screenshot: Google Cloud run](./img/cloud-run-env.png)\n\nThe following variables with default values exist:\n\n| Variable                | Default                | Description |\n|-------------------------|------------------------|-------------|\n| DEBUG                   | False                  | Show debug messages that help during development. |\n| GOOGLE_CLOUD_LOCATION   | us-central1            | [Google Cloud Platform region](https://gcloud-compute.com/regions.html) for API calls. |\n| GOOGLE_CLOUD_PROJECT_ID | [DEFAULT_AUTH_PROJECT] | Identifier for your project. If not specified, the project of authentication is used. |\n| HOST                    | 0.0.0.0                | Bind socket to this host. |\n| MAX_OUTPUT_TOKENS       | 512                    | Token limit determines the maximum amount of text output from one prompt. Can be overridden by the end user as required by the OpenAI API specification. 
|\n| MODEL_NAME              | chat-bison             | One of the [foundation models](https://cloud.google.com/vertex-ai/docs/generative-ai/learn/models#foundation_models) that are available in Vertex AI. |\n| OPENAI_API_KEY          | sk-[RANDOM_HEX]        | Self-generated *fake* OpenAI API key used for authentication against the application. |\n| PORT                    | 8000                   | Bind socket to this port. |\n| TEMPERATURE             | 0.2                    | Sampling temperature; it controls the degree of randomness in token selection. Can be overridden by the end user as required by the OpenAI API specification. |\n| TOP_K                   | 40                     | How the model selects tokens for output: the next token is selected from among the `TOP_K` most probable tokens. | \n| TOP_P                   | 0.8                    | Tokens are selected from most probable to least until the sum of their probabilities equals this value. Can be overridden by the end user as required by the OpenAI API specification. |\n\n### OpenAI Client Library\n\nIf your application uses [client libraries](https://github.com/openai/openai-python) provided by OpenAI,\nyou only need to modify the `OPENAI_API_BASE` environment variable to match your Google Cloud Run endpoint URL:\n\n```bash\nexport OPENAI_API_BASE=\"https://openai-api-vertex-XYZ.a.run.app/v1\"\npython your_openai_app.py\n```\n\n### Chatbot UI\n\nWhen deploying the [Chatbot UI](https://github.com/mckaywrigley/chatbot-ui) application,\nthe following environment variables must be set:\n\n| Variable        | Value                               |\n|-----------------|-------------------------------------|\n| OPENAI_API_KEY  | API key generated during deployment |\n| OPENAI_API_HOST | Google Cloud Run URL                |\n\n![Screenshot: Chatbot UI container](./img/chatbot-ui-env.png)\n\n#### Deploying Chatbot UI to Cloud Run\n\nRun the following script to create a container image from the GitHub source code\nand deploy that container as a public website (which allows 
unauthenticated calls) in Google Cloud Run:\n\n```bash\nexport OPENAI_API_KEY=\"sk-XYZ\"\nexport OPENAI_API_HOST=\"https://openai-api-vertex-XYZ.a.run.app\"\nbash chatbot-ui.sh\n```\n\n### Chatbox\n\nSet the following [Chatbox](https://chatboxai.app/) settings:\n\n| Setting        | Value                               |\n|----------------|-------------------------------------|\n| AI Provider    | OpenAI API                          |\n| OpenAI API Key | API key generated during deployment |\n| API Host       | Google Cloud Run URL                |\n\n![Screenshot: Chatbox settings](./img/chatbox-settings.png)\n\n### VSCode-OpenAI\n\nThe [VSCode-OpenAI extension](https://marketplace.visualstudio.com/items?itemName=AndrewButson.vscode-openai) is a powerful and versatile tool designed to integrate OpenAI features seamlessly into your code editor.\n\nTo activate the setup, you have two options:\n\n* either use the command \"vscode-openai.configuration.show.quickpick\" or\n* access it through the vscode-openai Status Bar located at the bottom left corner of VSCode.\n\n![Screenshot: VSCode settings](./img/vscode-settings.png)\n\nSelect `openai.com` and enter the Google Cloud Run URL with `/v1` during setup.\n\n### ChatGPT Discord Bot\n\nWhen deploying the [Discord Bot](https://github.com/openai/gpt-discord-bot) application,\nthe following environment variables must be set:\n\n| Variable        | Value                               |\n|-----------------|-------------------------------------|\n| OPENAI_API_KEY  | API key generated during deployment |\n| OPENAI_API_BASE | Google Cloud Run URL with `/v1`     |\n\n### ChatGPT in Slack\n\nWhen deploying the [ChatGPT in Slack](https://github.com/seratch/ChatGPT-in-Slack) application,\nthe following environment variables must be set:\n\n| Variable        | Value                               |\n|-----------------|-------------------------------------|\n| OPENAI_API_KEY  | API key generated during deployment |\n| 
OPENAI_API_BASE | Google Cloud Run URL with `/v1`     |\n\n### ChatGPT Telegram Bot\n\nWhen deploying the [ChatGPT Telegram Bot](https://github.com/karfly/chatgpt_telegram_bot) application,\nthe following environment variables must be set:\n\n| Variable        | Value                               |\n|-----------------|-------------------------------------|\n| OPENAI_API_KEY  | API key generated during deployment |\n| OPENAI_API_BASE | Google Cloud Run URL with `/v1`     |\n\n## Contributing\n\nHave a patch that will benefit this project?\nAwesome! Follow these steps to have it accepted.\n\n1. Please read [how to contribute](CONTRIBUTING.md).\n1. Fork this Git repository and make your changes.\n1. Create a Pull Request.\n1. Incorporate review feedback to your changes.\n1. Accepted!\n\n\n## License\n\nAll files in this repository are under the [Apache License, Version 2.0](LICENSE) unless noted otherwise.","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fcyclenerd%2Fgoogle-cloud-gcp-openai-api","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fcyclenerd%2Fgoogle-cloud-gcp-openai-api","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fcyclenerd%2Fgoogle-cloud-gcp-openai-api/lists"}