{"id":13456231,"url":"https://github.com/yohasebe/openai-chat-api-workflow","last_synced_at":"2025-05-16T17:08:23.027Z","repository":{"id":136203640,"uuid":"561133658","full_name":"yohasebe/openai-chat-api-workflow","owner":"yohasebe","description":"🎩 An Alfred 5 Workflow for using OpenAI Chat API to interact with GPT models 🤖💬 It also allows image generation/editing/understanding 🖼️, speech-to-text conversion 🎤, and text-to-speech synthesis 🔈","archived":false,"fork":false,"pushed_at":"2025-04-27T06:00:02.000Z","size":118848,"stargazers_count":315,"open_issues_count":22,"forks_count":9,"subscribers_count":7,"default_branch":"main","last_synced_at":"2025-05-10T16:09:24.891Z","etag":null,"topics":["ai","alfred","chatbot","dall-e","gpt","image-generation","image-understanding","openai","speech-to-text","text-to-speech","whisper","workflow"],"latest_commit_sha":null,"homepage":"https://github.com/yohasebe/openai-text-completion-workflow","language":null,"has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/yohasebe.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2022-11-03T02:42:02.000Z","updated_at":"2025-04-27T06:00:06.000Z","dependencies_parsed_at":"2023-10-16T03:47:47.810Z","dependency_job_id":"4c51066c-97d7-4dbe-89ef-68145ed25fb8","html_url":"https://github.com/yohasebe/openai-chat-api-workflow","commit_stats":{"total_commits":271,"total_committers":2,"mean_commits":135.5,"dds":"0.0073800738007380184","last_synced_commit":"8acec7dbab347ba5e43ba476dfef4eed37cae815"},"previous_names":[],"tags_count":0,"template":f
alse,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/yohasebe%2Fopenai-chat-api-workflow","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/yohasebe%2Fopenai-chat-api-workflow/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/yohasebe%2Fopenai-chat-api-workflow/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/yohasebe%2Fopenai-chat-api-workflow/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/yohasebe","download_url":"https://codeload.github.com/yohasebe/openai-chat-api-workflow/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":254573589,"owners_count":22093731,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["ai","alfred","chatbot","dall-e","gpt","image-generation","image-understanding","openai","speech-to-text","text-to-speech","whisper","workflow"],"created_at":"2024-07-31T08:01:18.206Z","updated_at":"2025-05-16T17:08:23.013Z","avatar_url":"https://github.com/yohasebe.png","language":null,"funding_links":[],"categories":["Others","Chatbots"],"sub_categories":[],"readme":"# OpenAI Chat API Workflow for Alfred\n\n\u003cimg src='./icons/openai.png' style='height:120px;'/\u003e\n\n🎩 An [Alfred 5](https://www.alfredapp.com/) Workflow for using the [OpenAI](https://beta.openai.com/) Chat API to interact with GPT models 🤖💬. 
It also allows image generation 🖼️, image understanding 👀, speech-to-text conversion 🎤, and text-to-speech synthesis 🔈.\n\n📦 Download [**OpenAI Chat API Workflow**](https://github.com/yohasebe/openai-chat-api-workflow/raw/main/openai-chat-api.alfredworkflow) (version `3.7.0`)\n\nYou can execute all the above features using:\n\n- Alfred UI 🖥️\n- Selected text 📝\n- A dedicated web UI 🌐\n\nThe web UI is constructed by the workflow and runs locally on your Mac 💻. The API call is made directly between the workflow and OpenAI, ensuring your chat messages are not shared online with anyone other than OpenAI 🔒. Furthermore, OpenAI does not use the data from the API Platform for training 🚫.\n\nYou can export the chat data to an external file in simple JSON format 📄, and it is possible to continue the chat by importing it later 🔄.\n\n\u003cimg src=\"./docs/img/OpenAI-Alfred-Workflow.png\" width=\"600\" /\u003e\n\n\u003ckbd\u003e\u003cimg src=\"./docs/img/web-interface.png\" width=\"700\"\u003e\u003c/kbd\u003e\n\n\u003ckbd\u003e\u003cimg src=\"./docs/img/openai-chat-api-workflow.gif\" width=\"700\" /\u003e\u003c/kbd\u003e\n\n## Installation\n\n1. Install [Homebrew](https://brew.sh/)\n2. Run the following command in a terminal: `brew install pandoc mpv sox jq duti`\n3. Download and run [**OpenAI Chat API Workflow**](https://github.com/yohasebe/openai-chat-api-workflow/raw/main/openai-chat-api.alfredworkflow)\n4. Set your [OpenAI API key](https://platform.openai.com/account/api-keys)\n5. Enable accessibility settings for Alfred in `System Preferences` → `Security \u0026 Privacy` → `Privacy` → `Accessibility`\n\n\u003ckbd\u003e\u003cimg src=\"./docs/img/accessibility.png\" width=\"600\"\u003e\u003c/kbd\u003e\n\n**Setup Hotkeys**\n\nYou can set up hotkeys in the settings screen of the workflow. To set up hotkeys, double-click on the light purple workflow elements.\n\n\u003ckbd\u003e\u003cimg width=\"700\" src=\"./docs/img/openai-workflow-overview.png\"\u003e\u003c/kbd\u003e\n\n1. 
Open Web UI (Recommended)\n2. Direct Query\n3. Send Selected Text\n4. Screen Capture for Image Editing \n5. Screen Capture for Image Understanding\n6. Speech to Text\n7. Text to Speech (Selected text)\n\n**Dependencies**\n\n- Alfred 5 [Powerpack](https://www.alfredapp.com/shop/)\n- OpenAI [API key](https://platform.openai.com/account/api-keys)\n- [Pandoc](https://pandoc.org/): to convert Markdown to HTML\n- [MPV](https://mpv.io/): to play text-to-speech audio stream\n- [Sox](https://sox.sourceforge.net/sox.html): to record voice input\n- [jq](https://jqlang.github.io/jq/): to handle chat history in JSON\n- [duti](https://github.com/moretension/duti): to detect the default web browser\n\nTo start using this workflow, you must set the environment variable `apikey`, which you can obtain by creating a new [OpenAI account](https://platform.openai.com/account/api-keys). See also the [Configuration](#configuration) section below.\n\nYou will also need to install the `pandoc` and `sox` programs. Pandoc allows this workflow to convert the Markdown response from OpenAI to HTML and display the result in your default web browser with syntax highlighting enabled (especially useful when using this workflow to generate program code). 
Sox allows you to record voice audio to convert to text using the speech-to-text API.\n\nTo set up dependencies (`pandoc`, `mpv`, `sox`, `jq`, and `duti`), first install [Homebrew](https://brew.sh/) and run the following command:\n\n```shell\nbrew install pandoc mpv sox jq duti\n```\n\n**Recent Changelog**\n\n- 3.7.0:\n  - Image editing capability added\n  - Image generation model `gpt-image-1` supported\n  - Icon image is hosted inside the workflow\n  - Web UI color scheme improved\n  - `o4-mini` and `o3` models supported\n  - `gpt-4.1` series models supported and `gpt-4.1-mini` set as default\n  - Image recognition and understanding features from web UI improved\n- 3.6.4:\n  - New speech-to-text models `gpt-4o-mini-transcribe` (default) and `gpt-4o-transcribe` supported\n  - New text-to-speech model `gpt-4o-mini-tts` supported (default)\n  - Added \"TTS instruction\" feature for character and speaking style control\n  - Added web search capability (triggered by \"search\" at the beginning of prompt)\n  - New TTS voice `ballad` added\n\n[Complete Change Log](https://github.com/yohasebe/openai-chat-api-workflow/blob/main/CHANGELOG.md)\n\n## Methods of Execution\n\nHere are three methods to run the workflow: 1) Using commands within the Alfred UI, 2) Passing selected text to the workflow, 3) Utilizing the Web UI. 
Additionally, there's a convenient method for making brief inquiries to GPT.\n\n**Commands within the Alfred UI**\n\nYou can enter a query directly into Alfred's textbox:\n\n- Method 1: Alfred textbox → keyword (`openai`) → space/tab → input query → select a command (see below)\n- Method 2: Alfred textbox → input query → select fallback search (`OpenAI Query`)\n\n**Passing Selected Text**\n\nYou can select any text on your Mac and send it to the workflow:\n\n- Method 1: Select text → universal action hotkey → select `OpenAI Query`\n- Method 2: Set up a custom hotkey to `Send selected text to OpenAI`\n\n**Using Web Interface**\n\nYou can open the web interface:\n\n- Method 1: Alfred textbox → keyword (`openai-webui`)\n- Method 2: Set up a custom hotkey to `Open web interface`\n\n**Using the Default Browser**\n\nIf your default browser is set to one of the following and the duti command is installed on your system, the web interface will automatically open in your chosen browser. If not, Safari will be used as the default.\n\n- Google Chrome (Stable, Beta, Dev, etc.)\n- Microsoft Edge (Stable, Beta, Dev, etc.)\n- Brave Browser\n\nRestart the OpenAI Workflow server by executing `openai-restart-server` if the web UI does not work as expected after changing the default browser.\n\n**Web UI Modes**\n\nSwitch modes (`light`/`dark`/`auto`) with the `Web UI Mode` selector in the settings.\n\n\u003ckbd\u003e\u003cimg width=\"700\" src=\"./docs/img/web-interface-dark.png\"\u003e\u003c/kbd\u003e\n\n**Simple Direct Query/Chat**\n\nTo quickly chat with GPT:\n\n- Method 1: Type keyword `gpt` → space/tab → input query (e.g., \"**gpt** what is a large language model?\")\n- Method 2: Set up a custom hotkey to `OpenAI Direct Query`\n\n\u003cimg src='./docs/img/direct-query.png' style='width:700px;'/\u003e\n \n## Basic Commands\n\nWith `Direct Query`, the input text is sent directly to the OpenAI Chat API as a prompt. 
You can also create a query by prepending or appending text to the input.\n\n\u003cspan\u003e\u003cimg src='./icons/patch-question.png' style='height:1em;'/\u003e\u003c/span\u003e **Direct Query**\n\nThe input text is directly sent as a prompt to the OpenAI Chat API.\n\n\u003ckbd\u003e\u003cimg src='./docs/img/direct-query.gif' style='width:700px;'/\u003e\u003c/kbd\u003e\n\n\u003cspan\u003e\u003cimg src='./icons/arrow-bar-down.png' style='height:1em;'/\u003e\u003c/span\u003e **Prepend Text + Query**\n\nAfter entering the initial text, you are prompted for additional text. The additional text is added *before* the initial text, and the resulting text is used as the query.\n\n\u003ckbd\u003e\u003cimg src='./docs/img/prepend.gif' style='width:700px;'/\u003e\u003c/kbd\u003e\n\n\u003cspan\u003e\u003cimg src='./icons/arrow-bar-up.png' style='height:1em;'/\u003e\u003c/span\u003e **Append Text + Query**\n\nAfter entering the initial text, you are prompted for additional text. The additional text is added *after* the initial text, and the resulting text is used as the query.\n\n\u003cspan\u003e\u003cimg src='./icons/picture.png' style='height:1em;'/\u003e\u003c/span\u003e **Generate Image**\n\nOpenAI's image generation API (`gpt-image-1`, `dall-e-3`, or `dall-e-2`) is used to generate images based on the entered prompts. See [Image Generation](#image-generation) below.\n\n## Commands for Specific Purposes\n\nSome of the examples shown on [OpenAI's Examples page](https://platform.openai.com/examples) are incorporated into this Workflow as commands. For tasks that have no dedicated command, give an appropriate prompt to one of the [Basic Commands](#basic-commands) above.\n\n\u003cspan\u003e\u003cimg src='./icons/code-square.png' style='height:1em;'/\u003e\u003c/span\u003e **Write Program Code**\n\nGPT generates program code and example output based on the entered text. 
You can specify the purpose of the program, its function, the language, and the technology to be used, etc.\n\n**Example Input**\n\n\u003e Create a command line program that takes an English sentence and returns syntactically parsed output. Provide program code in Python and example usage.\n\n**Example Output**\n\n\u003ckbd\u003e\u003cimg width=\"700\" src=\"./docs/img/code.png\"\u003e\u003c/kbd\u003e\n\n\u003cspan\u003e\u003cimg src='./icons/quora.png' style='height:1em;'/\u003e\u003c/span\u003e **Ask in Your Language**\n\nYou can ask questions in the language set to the variable `first_language`.\n\n**Note**: If the value of `first_language` is not `English` (e.g., `Japanese`), the query may result in a less accurate response.\n\n\u003cspan\u003e\u003cimg src='./icons/translate.png' style='height:1em;'/\u003e\u003c/span\u003e **Translate L1 to L2**\n\nGPT translates text from the language specified in the variable `first_language` to the language specified in `second_language`.\n\n\u003cspan\u003e\u003cimg src='./icons/translate.png' style='height:1em;'/\u003e\u003c/span\u003e **Translate L2 to L1**\n\nGPT translates text from the language specified in the variable `second_language` to the language specified in `first_language`.\n\n\u003cspan\u003e\u003cimg src='./icons/pencil.png' style='height:1em;'/\u003e\u003c/span\u003e **Grammar Correction**\n\nGPT corrects sentences that may contain grammatical errors. See OpenAI's [description](https://beta.openai.com/examples/default-grammar).\n\n\u003cspan\u003e\u003cimg src='./icons/lightbulb.png' style='height:1em;'/\u003e\u003c/span\u003e **Brainstorm**\n\nGPT assists you in brainstorming innovative ideas based on any given text.\n\n\u003cspan\u003e\u003cimg src='./icons/book.png' style='height:1em;'/\u003e\u003c/span\u003e **Create Study Notes**\n\nGPT provides study notes on a given topic. 
See OpenAI's [description](https://beta.openai.com/examples/default-study-notes) for this example.\n\n\u003cspan\u003e\u003cimg src='./icons/arrow-left-right.png' style='height:1em;'/\u003e\u003c/span\u003e **Analogy Maker**\n\nGPT creates analogies. See OpenAI's [description](https://beta.openai.com/examples/default-analogy-maker) for this example.\n\n\u003cspan\u003e\u003cimg src='./icons/list-ul.png' style='height:1em;'/\u003e\u003c/span\u003e **Essay Outline**\n\nGPT generates an outline for a research topic. See OpenAI's [description](https://beta.openai.com/examples/default-essay-outline) for this example.\n\n\u003cspan\u003e\u003cimg src='./icons/chat-left-quote.png' style='height:1em;'/\u003e\u003c/span\u003e **TL;DR Summarization**\n\nGPT summarizes a given text. See OpenAI's [description](https://beta.openai.com/examples/default-tldr-summary) for this example.\n\n\u003cspan\u003e\u003cimg src='./icons/emoji-smile.png' style='height:1em;'/\u003e\u003c/span\u003e **Summarize for a 2nd Grader**\n\nGPT translates complex text into more straightforward concepts. See OpenAI's [description](https://beta.openai.com/examples/default-summarize) for this example.\n\n\u003cspan\u003e\u003cimg src='./icons/key.png' style='height:1em;'/\u003e\u003c/span\u003e **Keywords**\n\nGPT extracts keywords from a block of text. See OpenAI's [description](https://beta.openai.com/examples/default-keywords) for this example.\n\n## Image Generation\n\nImage generation can be executed through one of the above commands. It is also possible to use the web UI. 
By using the web UI, you can interactively change the prompt to get closer to the desired image.\n\n\u003ckbd\u003e\u003cimg width=\"700\" src=\"./docs/img/image-generation-1.png\"\u003e\u003c/kbd\u003e\n\nTo use the image generation mode with the `gpt-image-1` model, you may need to complete the \u003ca href=\"https://help.openai.com/en/articles/10910291-api-organization-verification\"\u003eAPI Organization Verification\u003c/a\u003e from your \u003ca href=\"https://platform.openai.com/settings/organization/general\"\u003edeveloper console\u003c/a\u003e.\n\n\u003ckbd\u003e\u003cimg width=\"700\" src=\"./docs/img/image-generation-2.png\"\u003e\u003c/kbd\u003e\n\n\u003ckbd\u003e\u003cimg width=\"700\" src=\"./docs/img/image-generation-3.png\"\u003e\u003c/kbd\u003e\n\n\n## Image Editing\n\nYou can edit images with the `gpt-image-1` model through the Universal Action command `OpenAI Image Edit`. You can also use the web UI to upload an image file for editing. The image file is sent to the OpenAI Image Editing API, and the result is displayed once processing finishes (this can take up to 2 minutes).\n\n\u003ckbd\u003e\u003cimg width=\"700\" src=\"./docs/img/image-editing-1.png\"\u003e\u003c/kbd\u003e\n\u003ckbd\u003e\u003cimg width=\"700\" src=\"./docs/img/image-editing-2.png\"\u003e\u003c/kbd\u003e\n\n## Image/PDF Understanding\n\nImage understanding can be executed through the `openai-vision` command. It starts capture mode and lets you specify a part of the screen to be analyzed. Alternatively, you can specify an image file (jpg, jpeg, png, gif) using the \"OpenAI Vision\" file action.\n\n\u003ckbd\u003e\u003cimg src=\"./docs/img/openai-workflow-vision.gif\" width=\"700\"\u003e\u003c/kbd\u003e\n\nAlternatively, you can use the web UI to upload an image file for analysis. 
The image file is sent to the OpenAI Vision API, and the result is displayed in the web UI.\n\n\u003ckbd\u003e\u003cimg src=\"./docs/img/openai-vision-web-ui.png\" width=\"700\"\u003e\u003c/kbd\u003e\n\nYou can also specify an image file using the universal action hotkey on the file in Finder. With this method you can not only analyze image files (jpg, jpeg, png, gif) but also PDF files.\n\n## Speech Synthesis and Speech Recognition\n\nMost text-to-speech and speech-to-text features are available on the web UI. However, there are certain specific features provided as commands, such as audio file to text conversion and transcription with timestamps.\n\n\u003ckbd\u003e\u003cimg width=\"700\" src=\"./docs/img/speech-to-text-web.png\"\u003e\u003c/kbd\u003e\n\n**Text-to-Speech Synthesis**\n\nText entered or response text from GPT can be read out in a natural voice using OpenAI's text-to-speech API.\n\n- Method 1: Press the `Play TTS` button on the web UI\n- Method 2: Select text → universal action hotkey → select `OpenAI Text-to-Speech`\n\n**Speech-to-Text Conversion**\n\n- Method 1: Press the `Voice Input` button on the web UI\n- Method 2: Alfred textbox → keyword (`openai-speech`)\n\n**Audio File to Text**\n\nYou can select an audio file in `mp3`, `mp4`, `flac`, `webm`, `wav`, or `m4a` format (under 25MB) and send it to the workflow:\n\n- Select the file → universal action hotkey → select `OpenAI Speech-to-Text`\n\n**Record Voice Audio and Transcribe**\n\nYou can record voice audio and send it to the Workflow for transcription using the speech-to-text API. 
Recording time is limited to 30 minutes and will automatically stop after this duration.\n\n\u003ckbd\u003e\u003cimg width=\"600\" alt=\"transcript-srt\" src=\"./docs/img/speech-to-text.png\"\u003e\u003c/kbd\u003e\n\n- Alfred textbox → keyword (`openai-speech`) → Terminal window opens and recording starts\n- Speak into the internal or external microphone → Press Enter to finish recording\n- Choose processes to apply to the recorded audio:\n\n  - Transcribe (+ delete recording)\n  - Transcribe (+ save recording to desktop)\n  - Transcribe and query (+ delete recording)\n  - Transcribe and query (+ save recording to desktop)\n  - Exit (+ delete recording)\n  - Exit (+ save recording to desktop)\n\nYou can choose the format of the transcribed text as `text`, `srt`, or `vtt` in the workflow's settings. Below are examples in the `text` and `srt` formats:\n\n\u003ckbd\u003e\u003cimg width=\"700\" alt=\"transcript-text\" src=\"./docs/img/transcript-text.png\"\u003e\u003c/kbd\u003e\n\n\u003ckbd\u003e\u003cimg width=\"700\" alt=\"transcript-srt\" src=\"./docs/img/transcript-srt.png\"\u003e\u003c/kbd\u003e\n\n
- **Reasoning Effort**: Set the reasoning effort to `low`, `medium`, or `high`. (default: `medium`). It gives reasoning models (`o1` and `o3-mini`) a guidance on how many reasoning tokens it should generate before creating a response to the prompt. See OpenAI's [documentation](https://platform.openai.com/docs/guides/reasoning#reasoning-effort).\n- **Max Tokens**: Maximum number of tokens to be generated upon completion (default: `2048`). If this parameter is set to `0`, `null` is sent to the API as the default value (the maximum number of tokens is not specified). See OpenAI's [documentation](https://beta.openai.com/docs/api-reference/chats/create#chats/create-max_tokens).\n- **Temperature**: See OpenAI's [documentation](https://beta.openai.com/docs/api-reference/chats/create#chats/create-temperature). (default: `0.3`)\n- **Top P**: See OpenAI's [documentation](https://beta.openai.com/docs/api-reference/chats/create#chats/create-top_p). (default: `1.0`)\n- **Frequency Penalty**: See OpenAI's [documentation](https://beta.openai.com/docs/api-reference/chats/create#chats/create-frequency_penalty). (default: `0.0`)\n- **Presence Penalty**: See OpenAI's [documentation](https://beta.openai.com/docs/api-reference/chats/create#chats/create-presence_penalty). (default: `0.0`)\n- **Memory Span**: Set the number of past utterances sent to the API as context. Setting `4` for this parameter means 2 conversation turns (user → assistant → user → assistant) will be sent as context for a new query. The larger the value, the more tokens will be consumed. 
(default: `10`)\n- **Max Characters**: Maximum number of characters that can be included in a query (default: `50000`).\n- **Timeout**: The number of seconds (default: `10`) to wait before opening the socket and connecting to the API. If the connection fails, reconnection (up to 20 times) will be attempted after 1 second.\n- **Add Emoji**: If enabled, the response text from GPT will contain emoji characters appropriate for the content. This is realized by adding the following sentence at the end of the system content. (default: `enabled`)\n  \n  \u003e Add emojis that are appropriate to the content of the response.\n  \n- **System Content**: Text to send with every query sent to the API as general information about the specification of the chat. The default value is as follows:\n  \n  \u003e You are a friendly but professional consultant who answers various questions, makes decent suggestions, and gives helpful advice in response to a prompt from the user. Your response must be concise, suggestive, and accurate.\n\n**Web Search Parameters**\n\n- **Web Search Model**: One of the available web search models: `gpt-4o-mini-search-preview` or `gpt-4o-search-preview`. (default: `gpt-4o-mini-search-preview`)\n\nNote: Web search is triggered when the prompt begins with the word \"search\" (case-insensitive). You can use various formats:\n- `Search what is quantum computing`\n- `search: latest developments in AI`\n- `[search] climate change effects`\n\nWhen such formats are detected, the web search model will be automatically used instead of the standard chat model.\n\n**Image Understanding Parameters**\n\n- **Max Size for Image Understanding**: The maximum pixel value (`512` to `2000`) of the larger side of the image data sent to the image understanding API. Larger images will be resized accordingly. 
(Default: `512`)\n\n**Image Generation/Editing Parameters**\n\nImage editing feature is available only for the `gpt-image-1` model.\n\n- **Image Generation Model**: `gpt-image-1`, `dall-e-3`, and `dall-e-2` are available. (default: `dall-e-3`)\n- **Style** (for `dall-e-3`): Choose the style of the image from `vivid` and `natural`. (default: `vivid`)\n- **Number of Images** (for `dall-e-2` image generation): Set the number of images to generate in image generation mode from `1` to `10`. (default: `1`)\n- **Image Size** (for image generation/editing): Set the size of images to generate\n- **Quality** (for image generation/editing): Choose the quality of the image\n- **Content Moderation** (for `gpt-image-1` image generation): `auto` or `low` (default: `auto`)\n- **Background** (for `gpt-image-1` image generation/editing): `auto`, `transparent`, or `opaque` (default: `auto`)\n\n**Speech-to-Text Parameters**\n\n- **Transcription Model**: One of the available transcription models: `whisper-1`, `gpt-4o-mini-transcribe`, or `gpt-4o-transcribe`. (default: `gpt-4o-mini-transcribe`)\n- **Transcription Format**: Set the format of the text transcribed from the microphone input or audio files to `text`, `srt`, or `vtt` (default: `text`). Since `srt` and `vtt` formats are supported by `whisper-1` only, the workflow will automatically switch to `whisper-1` when these formats are selected.\n- **Processes after Recording**: Set the default choice of what processes follow after audio recording finishes. (default: `Transcribe [+ delete recording]`).\n  \n  - Transcribe [+ delete recording]\n  - Transcribe [+ save recording to desktop]\n  - Transcribe and query [+ delete recording]\n  - Transcribe and query [+ save recording to desktop]\n  \n- **Audio to English**: When enabled, the speech-to-text (STT) API will transcribe the input audio and output text translated into English. 
(default: `disabled`)\n\n**Text-to-Speech Parameters**\n\n- **Text-to-Speech Model**: One of the available TTS models: `tts-1`, `tts-1-hd`, or `gpt-4o-mini-tts`. (default: `gpt-4o-mini-tts`)\n- **Text-to-Speech Voice**: The voice to use when generating the audio. Supported voices are: `alloy`, `ash`, `ballad`, `coral`, `echo`, `fable`, `onyx`, `nova`, `sage`, and `shimmer`. (default: `alloy`)\n- **Text-to-Speech Speed**: The speed of the generated audio. Select a value from 0.25 to 4.0. (default: `1.0`)\n- **TTS Instruction**: Specify character or speaking style instructions for text-to-speech synthesis.\n- **Automatic Text to Speech**: If enabled, the results will be read aloud using the system's default text-to-speech language and voice. (default: `disabled`)\n- **Text-to-Speech Replacement CSV Path**: Set the path to the CSV file containing text-to-speech replacement pairs in the format `original_text, replacement_text`.\n\n**Other Settings**\n\n- **Your First Language**: Set your first language. This language is used when using GPT for translation. (default: `English`)\n- **Your Second Language**: Set your second language. This language is used when using GPT for translation. (default: `Japanese`)\n- **Sound**: If checked, a notification sound will play when the response is returned. (default: `disabled`)\n- **Save File Path**: If set, the results will be saved in the specified path as a markdown file. (default: `not set`)\n\n**Environment Variables**\n\nEnvironment variables can be accessed by clicking the `[x]` button located at the top right of the workflow settings screen. Normally, there is no need to change the values of the environment variables.\n\n- `http_keep_alive`: This workflow starts an HTTP server when the web UI is first displayed. After that, if the web UI is not used for the time (in seconds) set by this environment variable, the server will stop. (default: `7200` = 2 hours)\n- `http_port`: Specifies the port number for the web UI. 
(default: `80`)\n- `http_server_wait`: Specifies the wait time from when the HTTP server is started until the page is displayed in the browser. (default: `2.5`)\n- `websocket_port`: Specifies the port number for websocket communication used to display responses in streaming on the web UI. (default: `8080`)\n\n## Author\n\nYoichiro Hasebe (\u003cyohasebe@gmail.com\u003e)\n\n## License\n\nThe MIT License\n\n## Disclaimer\n\nThe author assumes no responsibility for any potential damages arising from the use of this software.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fyohasebe%2Fopenai-chat-api-workflow","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fyohasebe%2Fopenai-chat-api-workflow","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fyohasebe%2Fopenai-chat-api-workflow/lists"}