{"id":13423914,"url":"https://github.com/rksm/org-ai","last_synced_at":"2025-05-15T04:06:51.434Z","repository":{"id":135434661,"uuid":"610071879","full_name":"rksm/org-ai","owner":"rksm","description":"Emacs as your personal AI assistant. Use LLMs such as ChatGPT or LLaMA for text generation or DALL-E and Stable Diffusion for image generation. Also supports speech input / output.","archived":false,"fork":false,"pushed_at":"2025-01-31T22:36:46.000Z","size":13571,"stargazers_count":758,"open_issues_count":45,"forks_count":61,"subscribers_count":21,"default_branch":"master","last_synced_at":"2025-04-14T05:56:22.267Z","etag":null,"topics":["ai","chatgpt","emacs","generative-models","gpt","llms"],"latest_commit_sha":null,"homepage":"","language":"Emacs Lisp","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"gpl-3.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/rksm.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":null,"funding":".github/FUNDING.yml","license":"COPYING","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null},"funding":{"github":["rksm"]}},"created_at":"2023-03-06T02:53:44.000Z","updated_at":"2025-04-14T04:53:51.000Z","dependencies_parsed_at":null,"dependency_job_id":"a57d1a45-8364-4d69-81ee-6885e32a4395","html_url":"https://github.com/rksm/org-ai","commit_stats":null,"previous_names":[],"tags_count":41,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/rksm%2Forg-ai","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/rksm%2Forg-ai/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/rksm%2Forg-ai/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/h
osts/GitHub/repositories/rksm%2Forg-ai/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/rksm","download_url":"https://codeload.github.com/rksm/org-ai/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":254270646,"owners_count":22042859,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["ai","chatgpt","emacs","generative-models","gpt","llms"],"created_at":"2024-07-31T00:00:44.915Z","updated_at":"2025-05-15T04:06:46.410Z","avatar_url":"https://github.com/rksm.png","language":"Emacs Lisp","funding_links":["https://github.com/sponsors/rksm"],"categories":["Uncategorized","Thanks to all the contributors!","Integrations","Chatbots","Tools/Products","ChatGPT in your editor","插件和扩展","Editors","🔌 ChatGPT in Your Editor","Emacs Lisp","Projects","Addons, extensions, plug-ins for integrating LLM into third-party applications"],"sub_categories":["Uncategorized","Access ChatGPT from other platforms","Examples","ChatGPT in your editor","Emacs","Other user interfaces"],"readme":"# org-ai [![MELPA](https://melpa.org/packages/org-ai-badge.svg)](https://melpa.org/#/org-ai)\n\n[![org-ai video](doc/org-ai-youtube-thumb-github.png)](https://www.youtube.com/watch?v=fvBDxiFPG6I)\n\nMinor mode for Emacs org-mode that provides access to generative AI models. 
Currently supported are\n- OpenAI API (ChatGPT, DALL-E, other text models), optionally run against Azure API instead of OpenAI\n- Stable Diffusion through [stable-diffusion-webui](https://github.com/AUTOMATIC1111/stable-diffusion-webui)\n\nInside an org-mode buffer you can\n- use ChatGPT to generate text, having full control over system and user prompts ([demo](#chatgpt-in-org-mode))\n- Speech input and output! Talk with your AI!\n- generate images and image variations with a text prompt using Stable Diffusion or DALL-E ([demo 1](#dall-e-in-org-mode), [demo 2](#image-variations))\n- org-ai everywhere: Various commands usable outside org-mode for prompting using the selected text or multiple files.\n\n_Note: In order to use the OpenAI API you'll need an [OpenAI account](https://platform.openai.com/) and you need to get an API token. As far as I can tell, the current usage limits for the free tier get you pretty far._\n\n------------------------------\n\n## Table of Contents\n\n- [Demos](#demos)\n    - [ChatGPT in org-mode](#chatgpt-in-org-mode)\n    - [DALL-E in org-mode](#dall-e-in-org-mode)\n    - [Image variations](#image-variations)\n- [Features and Usage](#features-and-usage)\n    - [`#+begin_ai...#+end_ai` special blocks](#begin_aiend_ai-special-blocks)\n        - [Syntax highlighting in ai blocks](#syntax-highlighting-in-ai-blocks)\n        - [Jump to the end of the block after completion](#jump-to-the-end-of-the-block-after-completion)\n        - [Auto-fill paragraphs on insertion](#auto-fill-paragraphs-on-insertion)\n        - [Block Options](#block-options)\n            - [For ChatGPT](#for-chatgpt)\n            - [For DALL-E](#for-dall-e)\n            - [Other text models](#other-text-models)\n    - [Image variation](#image-variation)\n    - [Global Commands](#global-commands)\n        - [org-ai-on-project](#org-ai-on-project)\n    - [Noweb Support](#noweb-support)\n- [Installation](#installation)\n    - [Melpa](#melpa)\n    - [Straight.el](#straightel)\n 
   - [Manual](#manual)\n    - [OpenAI API key](#openai-api-key)\n        - [Using other services than OpenAI](#using-other-services-than-openai)\n            - [Azure](#azure)\n            - [perplexity.ai](#perplexityai)\n            - [Anthropic / Claude](#anthropic--claude)\n    - [Setting up speech input / output](#setting-up-speech-input--output)\n        - [Whisper](#whisper)\n            - [macOS specific steps](#macos-specific-steps)\n                - [macOS alternative: Siri dictation](#macos-alternative-siri-dictation)\n            - [Windows specific steps](#windows-specific-steps)\n        - [espeak / greader](#espeak--greader)\n    - [Setting up Stable Diffusion](#setting-up-stable-diffusion)\n    - [Using local LLMs with oobabooga/text-generation-webui](#using-local-llms-with-oobaboogatext-generation-webui)\n- [FAQ](#faq)\n- [Sponsoring](#sponsoring)\n\n## Demos\n\n### ChatGPT in org-mode\n\n```org\n#+begin_ai\nIs Emacs the greatest editor?\n#+end_ai\n```\n\n![chat-gpt in org-mode](doc/org-ai-demo-1.gif)\n\nYou can continue to type and press `C-c C-c` to create a conversation. `C-g` will interrupt a running request.\n\n\n### DALL-E in org-mode\n\nUse the `:image` keyword to generate an image. 
This uses DALL·E-3 by default.\n\n```org\n#+begin_ai :image :size 1024x1024\nHyper realistic sci-fi rendering of super complicated technical machine.\n#+end_ai\n```\n\n![dall-e in org-mode](doc/org-ai-demo-2.gif)\n\nYou can use the following keywords to control the image generation:\n- `:size \u003cwidth\u003ex\u003cheight\u003e` - the size of the image to generate (default: 1024x1024)\n- `:model \u003cmodel\u003e` - the model to use (default: `\"dall-e-3\"`)\n- `:quality \u003cquality\u003e` - the quality of the image (choices: `hd`, `standard`)\n- `:style \u003cstyle\u003e` - the style to use (choices: `vivid`, `natural`)\n- `:n \u003ccount\u003e` - the number of images to generate (default: 1)\n\n(For more information about those settings see [this OpenAI blog post](https://cookbook.openai.com/articles/what_is_new_with_dalle_3).)\n\nYou can customize the defaults for those variables with `customize-variable` or by setting them in your config:\n\n```elisp\n(setq org-ai-image-model \"dall-e-3\")\n(setq org-ai-image-default-size \"1792x1024\")\n(setq org-ai-image-default-count 2)\n(setq org-ai-image-default-style 'vivid)\n(setq org-ai-image-default-quality 'hd)\n(setq org-ai-image-directory (expand-file-name \"org-ai-images/\" org-directory))\n```\n\n\n### Image variations\n\n![dall-e image generation in org-mode](doc/org-ai-demo-3.gif)\n\n\n\n## Features and Usage\n### `#+begin_ai...#+end_ai` special blocks\n\nSimilar to org-babel, these blocks demarcate input (and for ChatGPT also output) for the AI model. You can use them for AI chat, text completion and text -\u003e image generation. See [options](#block-options) below for more information.\n\nCreate a block like\n\n```org\n#+begin_ai\nIs Emacs the greatest editor?\n#+end_ai\n```\n\nand press `C-c C-c`. The chat input will appear inline and once the response is complete, you can enter your reply and so on. See [the demo](#chatgpt-in-org-mode) above. 
You can press `C-g` while the ai request is running to cancel it.\n\nYou can also modify the _system_ prompt and other parameters used. The system prompt is injected before the user's input and \"primes\" the model to answer in a certain style. For example you can do:\n\n```org\n#+begin_ai :max-tokens 250\n[SYS]: Act as if you are a powerful medieval king.\n[ME]: What will you eat today?\n#+end_ai\n```\n\nThis will result in an API payload like\n\n```json\n{\n  \"messages\": [\n    {\n      \"role\": \"system\",\n      \"content\": \"Act as if you are a powerful medieval king.\"\n    },\n    {\n      \"role\": \"user\",\n      \"content\": \"What will you eat today?\"\n    }\n  ],\n  \"model\": \"gpt-4o-mini\",\n  \"stream\": true,\n  \"max_tokens\": 250,\n  \"temperature\": 1.2\n}\n```\n\nFor some prompt ideas see for example [Awesome ChatGPT Prompts](https://github.com/f/awesome-chatgpt-prompts).\n\nWhen generating images using the `:image` flag, images will appear underneath the ai block inline. Images will be stored (together with their prompt) inside `org-ai-image-directory` which defaults to `~/org/org-ai-images/`.\n\nYou can also use speech input to transcribe the input. Press `C-c r` for `org-ai-talk-capture-in-org` to start recording. Note that this will require you to set up [speech recognition](#setting-up-speech-input--output) (see below). Speech output can be enabled with `org-ai-talk-output-enable`.\n\nInside an `#+begin_ai...#+end_ai` block you can modify and select the parts of the chat with these commands:\n- Press `C-c \u003cbackspace\u003e` (`org-ai-kill-region-at-point`) to remove the chat part under point.\n- `org-ai-mark-region-at-point` will mark the region at point.\n- `org-ai-mark-last-region` will mark the last chat part.\n\n#### Syntax highlighting in ai blocks\n\nTo apply syntax highlighting to your `#+begin_ai ...` blocks just add a language major-mode name after `_ai`. E.g. `#+begin_ai markdown`. 
For markdown in particular, to then also correctly highlight code in backticks, you can set `(setq markdown-fontify-code-blocks-natively t)`. Make sure that you also have the [markdown-mode package](https://melpa.org/#/markdown-mode) installed. Thanks @tavisrudd for this trick!\n\n#### Jump to the end of the block after completion\n\nThis behavior is enabled by default so that the interaction is more similar to a chat. It can be annoying when long output is present and the buffer scrolls while you are reading. So you can disable this with:\n\n```elisp\n(setq org-ai-jump-to-end-of-block nil)\n```\n\n#### Auto-fill paragraphs on insertion\n\nSet `(setq org-ai-auto-fill t)` to \"fill\" (automatically wrap lines according to `fill-column`) the inserted text. Basically like `auto-fill-mode` but for the AI.\n\n#### Block Options\n\nThe `#+begin_ai...#+end_ai` block can take the following options.\n\n##### For ChatGPT\nBy default, the content of ai blocks is interpreted as messages for ChatGPT. Text following `[ME]:` is associated with the user, text following `[AI]:` is treated as the model's response. 
Optionally you can start the block with a `[SYS]: \u003cbehavior\u003e` input to prime the model (see `org-ai-default-chat-system-prompt` below).\n\n- `:max-tokens number` - maximum number of tokens to generate (default: nil, use OpenAI's default)\n- `:temperature number` - temperature of the model (default: 1)\n- `:top-p number` - top_p of the model (default: 1)\n- `:frequency-penalty number` - frequency penalty of the model (default: 0)\n- `:presence-penalty number` - presence penalty of the model (default: 0)\n- `:sys-everywhere` - repeat the system prompt for every user message (default: nil)\n\nIf you have a lot of different threads of conversation regarding the same topic and settings (system prompt, temperature, etc) and you don't want to repeat all the options, you can set org file scope properties or create an org heading with a property drawer, such that all `#+begin_ai...#+end_ai` blocks under that heading will inherit the settings.\n\nExamples:\n```org\n* Emacs (multiple conversations re emacs continue in this subtree)\n:PROPERTIES:\n:SYS: You are an Emacs expert. You can help me by answering my questions. You can also ask me questions to clarify my intention.\n:temperature: 0.5\n:model: gpt-4o-mini\n:END:\n\n** Web programming via elisp\n#+begin_ai\nHow to call a REST API and parse its JSON response?\n#+end_ai\n\n** Other emacs tasks\n#+begin_ai...#+end_ai\n\n* Python (multiple conversations re python continue in this subtree)\n:PROPERTIES:\n:SYS: You are a python programmer. Respond to the task with detailed step by step instructions and code.\n:temperature: 0.1\n:model: gpt-4\n:END:\n\n** Learning QUIC\n#+begin_ai\nHow to set up a webserver with http3 support?\n#+end_ai\n\n** Other python tasks\n#+begin_ai...#+end_ai\n```\n\nThe following custom variables can be used to configure the chat:\n\n- `org-ai-default-chat-model` (default: `\"gpt-4o-mini\"`)\n- `org-ai-default-max-tokens` How long the response should be. Currently cannot exceed 4096. 
If this value is too small an answer might be cut off (default: nil)\n- `org-ai-default-chat-system-prompt` How to \"prime\" the model. This is a prompt that is injected before the user's input. (default: `\"You are a helpful assistant inside Emacs.\"`)\n- `org-ai-default-inject-sys-prompt-for-all-messages` Whether to repeat the system prompt for every user message. Sometimes the model \"forgets\" how it was primed. This can help remind it. (default: `nil`)\n\n##### For DALL-E\n\nWhen you add an `:image` option to the ai block, the prompt will be used for image generation.\n\n- `:image` - generate an image instead of text\n- `:size` - size of the image to generate (default: 256x256, can be 512x512 or 1024x1024)\n- `:n` - the number of images to generate (default: 1)\n\nThe following custom variables can be used to configure the image generation:\n- `org-ai-image-directory` - where to store the generated images (default: `~/org/org-ai-images/`)\n\n##### For Stable Diffusion\n\nSimilar to DALL-E but use\n\n```\n#+begin_ai :sd-image\n\u003cPROMPT\u003e\n#+end_ai\n```\n\nYou can run img2img by labeling your org-mode image with `#+name` and\nreferencing it with `:image-ref` from your org-ai block.\n\n```\n#+begin_ai :sd-image :image-ref label1\nforest, Gogh style\n#+end_ai\n```\n\n`M-x org-ai-sd-clip` uses the CLIP interrogator to guess the prompt of the\nprevious image in the org-mode buffer and saves it in the kill ring.\n\n`M-x org-ai-sd-deepdanbooru` uses the DeepDanbooru interrogator to guess the\nprompt of the previous image in the org-mode buffer and saves it in the kill\nring.\n\n##### For local models\nFor requesting completions from a local model served with [oobabooga/text-generation-webui](https://github.com/oobabooga/text-generation-webui), go through the setup steps described [below](#using-local-llms-with-oobaboogatext-generation-webui).\n\nThen start an API server:\n\n``` sh\ncd ~/.emacs.d/org-ai/text-generation-webui\nconda activate org-ai\npython server.py --api --model SOME-MODEL\n```\n\nWhen you 
add a `:local` key to an org-ai block and request completions with `C-c C-c`, the block will be sent to the local API server instead of the OpenAI API. For example:\n\n```\n#+begin_ai :local\n...\n#+end_ai\n```\n\nThis will send a request to `org-ai-oobabooga-websocket-url` and stream the response into the org buffer.\n\n##### Other text models\n\nThe older completion models can also be prompted by adding the `:completion` option to the ai block.\n\n- `:completion` - instead of using the ChatGPT model, use the completion model\n- `:model` - which model to use, see https://platform.openai.com/docs/models for a list of models\n\nFor the detailed meaning of those parameters see the [OpenAI API documentation](https://platform.openai.com/docs/api-reference/chat).\n\nThe following custom variables can be used to configure the text generation:\n\n- `org-ai-default-completion-model` (default: `\"text-davinci-003\"`)\n\n\n\n### Image variation\n\nYou can also use an existing image as input to generate more similar looking images. The `org-ai-image-variation` command will prompt for a file path to an image, a size and a count and will then generate that many images and insert links to them inside the current `org-mode` buffer. Images will be stored inside `org-ai-image-directory`. See the [demo](#image-variations) above.\n\n[For more information see the OpenAI documentation](https://platform.openai.com/docs/guides/images/variations). The input image needs to be square and its size needs to be less than 4MB. You also currently need curl available as a command line tool[^1].\n\n[^1]: __Note:__ Currently the image variation implementation requires a command line curl to be installed. The reason is that the OpenAI API expects multipart/form-data requests and the Emacs built-in `url-retrieve` does not support that (at least I haven't figured out how). Switching to `request.el` might be a better alternative. 
If you're interested in contributing, PRs are very welcome!\n\n\n\n### Global Commands\n\n`org-ai` can be used outside of `org-mode` buffers as well. When you enable `org-ai-global-mode`, the prefix `C-c M-a` will be bound to a number of commands:\n\n| command                          | keybinding  | description                                                                                                                                                                                                                                                                                                                                                    |\n|:---------------------------------|:------------|:---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|\n| `org-ai-on-region`               | `C-c M-a r` | Ask a question about the selected text or tell the AI to do something with it. The response will be opened in an org-mode buffer so that you can continue the conversation. Setting the variable `org-ai-on-region-file` (e.g. `(setq org-ai-on-region-file (expand-file-name \"org-ai-on-region.org\" org-directory))`) will associate a file with that buffer. |\n| `org-ai-summarize`               | `C-c M-a s` | Summarize the selected text.                                                                                                                                                                                                                                                                                                                                   
|\n| `org-ai-refactor-code`           | `C-c M-a c` | Tell the AI how to change the selected code, a diff buffer will appear with the changes.                                                                                                                                                                                                                                                                       |\n| `org-ai-on-project`              | `C-c M-a p` | Run prompts and modify / refactor multiple files at once. Will use [projectile](https://github.com/bbatsov/projectile) if available, falls back to the current directory if not.                                                                                                                                                                               |\n| `org-ai-prompt`                  | `C-c M-a P` | Prompt the user for a text and then print the AI's response in current buffer.                                                                                                                                                                                                                                                                                 |\n| `org-ai-switch-chat-model`       | `C-c M-a m` | Interactively change `org-ai-default-chat-model`                                                                                                                                                                                                                                                                                                               |\n| `org-ai-open-account-usage-page` | `C-c M-a $` | Opens https://platform.openai.com/account/usage to see how much money you have burned.                                                                                                                                                                                                                                                
                         |\n| `org-ai-open-request-buffer`     | `C-c M-a !` | Opens the `url` request buffer. If something doesn't work it can be helpful to take a look.                                                                                                                                                                                                                                                                    |\n| `org-ai-talk-input-toggle`       | `C-c M-a t` | Generally enable speech input for the different prompt commands.                                                                                                                                                                                                                                                                                               |\n| `org-ai-talk-output-toggle`      | `C-c M-a T` | Generally enable speech output.                                                                                                                                                                                                                                                                                                                                |\n\n#### org-ai-on-project\n\nUsing the org-ai-on-project buffer allows you to run commands on files in a project, alternatively also just on selected text in those files. You can e.g. select the readme of a project and ask \"what is it all about?\" or have code explained to you. You can also ask for code changes, which will generate a diff. 
If you know someone who thinks only VS Code with Copilot enabled can do that, point them here.\n\nRunning the `org-ai-on-project` command will open a separate buffer that allows you to select multiple files (and optionally select a sub-region inside a file) and then run a prompt on them.\n\n![org-ai-on-project](doc/org-ai-on-project-buffer.png)\n\nIf you deactivate \"modify code\", the effect is similar to running `org-ai-on-region` just that the file contents all appear in the prompt.\n\nWith \"modify code\" activated, you can ask the AI to modify or refactor the code. By default (\"Request diffs\" deactivated), we will prompt to generate the new code for all selected files/regions and you can then see a diff per file and decide to apply it or not. With \"Request diffs\" active, the AI will be asked to directly create a unified diff that can then be applied.\n\n\n### Noweb Support\n\nGiven a named source block\n```\n#+name: sayhi\n#+begin_src shell\necho \"Hello there\"\n#+end_src\n```\nWe can try to reference it by name, but by default the reference is not expanded:\n```\n#+begin_ai\n[SYS]: You are a mimic. Whenever I say something, repeat back what I say to you. Say exactly what I said, do not add anything.\n\n[ME]: \u003c\u003csayhi()\u003e\u003e\n\n\n[AI]: \u003c\u003csayhi()\u003e\u003e\n\n[ME]:\n#+end_ai\n```\nWith `:noweb yes` the reference is expanded before the prompt is sent:\n\n```\n#+begin_ai :noweb yes\n[SYS]: You are a mimic. Whenever I say something, repeat back what I say to you. 
Say exactly what I said, do not add anything.\n\n[ME]: \u003c\u003csayhi()\u003e\u003e\n\n\n[AI]: Hello there.\n\n[ME]:\n#+end_ai\n```\n\nYou can also trigger noweb expansion with an `org-ai-noweb: yes` heading property anywhere in the parent headings (header args take precedence).\n\nTo see what your document will expand to when sent to the API, run `org-ai-expand-block`.\n\n#### Run arbitrary lisp inline\n\nThis is a hack but it works really well.\n\nCreate a block\n\n```\n#+name: identity\n#+begin_src emacs-lisp :var x=\"fill me in\"\n(format \"%s\" x)\n#+end_src\n```\n\nWe can invoke it and let noweb parameters (which support lisp) evaluate as code\n\n```\n#+begin_ai :noweb yes\nTell me some 3, simple ways to improve this dockerfile\n\n\u003c\u003cidentity(x=(quelpa-slurp-file \"~/code/ibr-api/Dockerfile\"))\u003e\u003e\n\n\n\n[AI]: 1. Use a more specific version of Python, such as \"python:3.9.6-buster\" instead of \"python:3.9-buster\", to ensure compatibility with future updates.\n\n2. Add a cleanup step after installing poetry to remove any unnecessary files or dependencies, thus reducing the size of the final image.\n\n3. Use multi-stage builds to separate the build environment from the production environment, thus reducing the size of the final image and increasing security. For example, the first stage can be used to install dependencies and build the code, while the second stage can contain only the final artifacts and be used for deployment.\n\n[ME]:\n#+end_ai\n```\n\n\n## Installation\n\n### Melpa\n\norg-ai is on Melpa: https://melpa.org/#/org-ai. If you have added Melpa to your package archives with\n\n```elisp\n(require 'package)\n(add-to-list 'package-archives '(\"melpa\" . 
\"https://melpa.org/packages/\") t)\n(package-initialize)\n```\n\nyou can install it with:\n\n```elisp\n(use-package org-ai\n  :ensure t\n  :commands (org-ai-mode\n             org-ai-global-mode)\n  :init\n  (add-hook 'org-mode-hook #'org-ai-mode) ; enable org-ai in org-mode\n  (org-ai-global-mode) ; installs global keybindings on C-c M-a\n  :config\n  (setq org-ai-default-chat-model \"gpt-4\") ; if you are on the gpt-4 beta\n  (org-ai-install-yasnippets)) ; if you are using yasnippet and want `ai` snippets\n\n```\n\n### Straight.el\n\n```elisp\n(straight-use-package\n '(org-ai :type git :host github :repo \"rksm/org-ai\"\n          :local-repo \"org-ai\"\n          :files (\"*.el\" \"README.md\" \"snippets\")))\n```\n\n### Manual\n\nCheck out this repository.\n\n```sh\ngit clone https://github.com/rksm/org-ai\n```\n\nThen, if you use `use-package`:\n\n```elisp\n(use-package org-ai\n  :ensure t\n  :load-path (lambda () \"path/to/org-ai\"))\n  ;; ...rest as above...\n\n```\n\nor just with `require`:\n\n```elisp\n(package-install 'websocket)\n(add-to-list 'load-path \"path/to/org-ai\")\n(require 'org)\n(require 'org-ai)\n(add-hook 'org-mode-hook #'org-ai-mode)\n(org-ai-global-mode)\n(setq org-ai-default-chat-model \"gpt-4\") ; if you are on the gpt-4 beta\n(org-ai-install-yasnippets) ; if you are using yasnippet and want `ai` snippets\n```\n\n### OpenAI API key\n\nYou can either directly set your API token in your config:\n\n```elisp\n(setq org-ai-openai-api-token \"\u003cENTER YOUR API TOKEN HERE\u003e\")\n\n```\n\nAlternatively, `org-ai` supports `auth-source` for retrieving your API key. You can store a secret in the format\n\n```\nmachine api.openai.com login org-ai password \u003cyour-api-key\u003e\n```\n\nin your `~/.authinfo.gpg` file. If this is present, org-ai will use this mechanism to retrieve the token when a request is made. 
If you do not want `org-ai` to try to retrieve the key from `auth-source`, you can set `org-ai-use-auth-source` to `nil` before loading `org-ai`.\n\n#### Using other services than OpenAI\n\n##### Azure\n\nYou can switch to Azure by customizing these variables, either interactively with `M-x customize-variable` or by adding them to your config:\n\n```elisp\n(setq org-ai-service 'azure-openai\n      org-ai-azure-openai-api-base \"https://your-instance.openai.azure.com\"\n      org-ai-azure-openai-deployment \"azure-openai-deployment-name\"\n      org-ai-azure-openai-api-version \"2023-07-01-preview\")\n```\n\nTo store the API credentials, follow the authinfo instructions above but use `org-ai-azure-openai-api-base` as the machine name.\n\n##### perplexity.ai\n\nFor a list of available models see the [perplexity.ai documentation](https://docs.perplexity.ai/docs/model-cards).\n\nEither switch the default service in your config:\n\n```elisp\n(setq org-ai-service 'perplexity.ai)\n(setq org-ai-default-chat-model \"llama-3-sonar-large-32k-online\")\n```\n\nor per block:\n\n```org\n#+begin_ai :service perplexity.ai :model llama-3-sonar-large-32k-online\n[ME]: Tell me fun facts about Emacs.\n#+end_ai\n```\n\nFor authentication, add an entry like `machine api.perplexity.ai login org-ai password pplx-***` to your `authinfo.gpg` or set `org-ai-openai-api-token`.\n\n__Note:__ Currently perplexity.ai does not give access to references/links via the API so Emacs will not be able to display references. They have a beta program for that running and I sure hope that this will be available generally soon.\n\n##### Anthropic / Claude\n\nSimilar to the above. For example:\n\n```org\n#+begin_ai :service anthropic :model claude-3-opus-20240229\n[ME]: Tell me fun facts about Emacs.\n#+end_ai\n```\n\nAnthropic models are listed [here](https://docs.anthropic.com/claude/docs/models-overview).\nThere is currently only one API version that is set via `org-ai-anthropic-api-version`. 
If other versions come out, you can find them [here](https://docs.anthropic.com/claude/reference/versions).\n\nFor the API token use `machine api.anthropic.com login org-ai password sk-ant-***` in your `authinfo.gpg`.\n\n### Setting up speech input / output\n\n#### Whisper\n\nThese setup steps are optional. If you don't want to use speech input / output, you can skip this section.\n\n_Note: My personal config for org-ai can be found in [this gist](https://gist.github.com/rksm/04be012be07671cd5e1dc6ec5b077e34). It contains a working whisper setup._\n\nThis has been tested on macOS and Linux. Someone with a Windows computer, please test this and let me know what needs to be done to make it work (Thank You!).\n\nThe speech input uses [whisper.el](https://github.com/natrys/whisper.el) and `ffmpeg`. You need to clone the repo directly or use [straight.el](https://github.com/radian-software/straight.el) to install it.\n\n1. Install ffmpeg (e.g. `brew install ffmpeg` on macOS or `sudo apt install ffmpeg` on Linux).\n2. Clone whisper.el: `git clone https://github.com/natrys/whisper.el path/to/whisper.el`\n\nYou should now be able to load it inside Emacs:\n\n```elisp\n(use-package whisper\n  :load-path \"path/to/whisper.el\"\n  :bind (\"M-s-r\" . whisper-run))\n```\n\nNow also load:\n\n```elisp\n(use-package greader :ensure)\n(require 'whisper)\n(require 'org-ai-talk)\n\n;; macOS speech settings, optional\n(setq org-ai-talk-say-words-per-minute 210)\n(setq org-ai-talk-say-voice \"Karen\")\n```\n\n##### macOS specific steps\n\nOn macOS you will need to do two more things:\n1. Allow Emacs to record audio\n2. Tell whisper.el what microphone to use\n\n###### 1. 
Allow Emacs to record audio\n\nYou can use the [tccutil helper](https://github.com/DocSystem/tccutil):\n\n```sh\ngit clone https://github.com/DocSystem/tccutil\ncd tccutil\nsudo python ./tccutil.py -p /Applications/Emacs.app -e --microphone\n```\n\nWhen you now run `ffmpeg -f avfoundation -i :0 output.mp3` from within an Emacs shell, there should be no `abort trap: 6` error.\n\n(As an alternative to tccutil.py see the method mentioned in [this issue](https://github.com/rksm/org-ai/issues/86).)\n\n###### 2. Tell whisper.el what microphone to use\n\nRun `ffmpeg -f avfoundation -list_devices true -i \"\"` to list the audio input devices, then tell whisper.el which one to use: `(setq whisper--ffmpeg-input-device \":0\")`. `:0` is the microphone index; see the output of the command above to use another one.\n\nI've created an Emacs helper that lets you select the microphone interactively. See [this gist](https://gist.github.com/rksm/04be012be07671cd5e1dc6ec5b077e34#file-init-org-ai-el-L6).\n\nMy full speech-enabled config then looks like:\n\n```elisp\n(use-package whisper\n  :load-path (lambda () (expand-file-name \"lisp/other-libs/whisper.el\" user-emacs-directory))\n  :config\n  (setq whisper-model \"base\"\n        whisper-language \"en\"\n        whisper-translate nil)\n  (when *is-a-mac*\n    (rk/select-default-audio-device \"Macbook Pro Microphone\")\n    ;; Only set the input device if one was actually selected.\n    (when rk/default-audio-device\n      (setq whisper--ffmpeg-input-device (format \":%s\" rk/default-audio-device)))))\n```\n\n###### macOS alternative: Siri dictation\n\nOn macOS, instead of whisper, you can also use the built-in Siri dictation. To enable that, go to `Preferences -\u003e Keyboard -\u003e Dictation`, enable it and set up a shortcut. 
The default is ctrl-ctrl.\n\n##### Windows specific steps\n\nThe way `whisper--check-install-and-run` is implemented does not work on Windows 10 (see https://github.com/rksm/org-ai/issues/66).\n\nA workaround is to install whisper.cpp and a model manually and to patch the function:\n\n``` elisp\n(defun whisper--check-install-and-run (buffer status)\n  (whisper--record-audio))\n```\n\n#### espeak / greader\n\nSpeech output on non-macOS systems defaults to using the [greader](http://elpa.gnu.org/packages/greader.html) package which uses [espeak](https://espeak.sourceforge.net/) underneath to synthesize speech. You will need to install greader manually (e.g. via `M-x package-install`). From that point on it should \"just work\". You can test it by selecting some text and calling `M-x org-ai-talk-read-region`.\n\n### Setting up Stable Diffusion\n\nAn API for Stable Diffusion can be hosted with the [stable-diffusion-webui](https://github.com/AUTOMATIC1111/stable-diffusion-webui) project. Go through the [install steps for your platform](https://github.com/AUTOMATIC1111/stable-diffusion-webui#installation-and-running), then start an API-only server:\n\n```sh\ncd path/to/stable-diffusion-webui\n./webui.sh --nowebui\n```\n\nThis will start a server on http://127.0.0.1:7861 by default. In order to use it with org-ai, you need to set `org-ai-sd-endpoint-base`:\n\n```elisp\n(setq org-ai-sd-endpoint-base \"http://localhost:7861/sdapi/v1/\")\n```\n\nIf you use a server hosted elsewhere, change that URL accordingly.\n\n### Using local LLMs with oobabooga/text-generation-webui\n\nSince version 0.4 org-ai supports local models served with [oobabooga/text-generation-webui](https://github.com/oobabooga/text-generation-webui). See the [installation instructions](https://github.com/oobabooga/text-generation-webui#installation) to set it up for your system.\n\nHere is a setup walk-through that was tested on Ubuntu 22.04. 
It assumes that [miniconda or Anaconda](https://docs.conda.io/projects/conda/en/stable/user-guide/install/download.html#anaconda-or-miniconda) and [git-lfs](https://git-lfs.com/) are installed.\n\n#### Step 1: Set up conda env and install pytorch\n\n```sh\nconda create -n org-ai python=3.10.9\nconda activate org-ai\npip3 install torch torchvision torchaudio\n```\n\n#### Step 2: Install oobabooga/text-generation-webui\n\n```sh\nmkdir -p ~/.emacs.d/org-ai/\ncd ~/.emacs.d/org-ai/\ngit clone https://github.com/oobabooga/text-generation-webui\ncd text-generation-webui\npip install -r requirements.txt\n```\n\n#### Step 3: Install a language model\n\noobabooga/text-generation-webui supports [a number of language models](https://github.com/oobabooga/text-generation-webui#downloading-models). Normally, you would install them from [huggingface](https://huggingface.co/models?pipeline_tag=text-generation\u0026sort=downloads). For example, to install the `CodeLlama-7b-Instruct` model:\n\n```sh\ncd ~/.emacs.d/org-ai/text-generation-webui/models\ngit clone git@hf.co:codellama/CodeLlama-7b-Instruct-hf\n```\n\n#### Step 4: Start the API server\n\n```sh\ncd ~/.emacs.d/org-ai/text-generation-webui\nconda activate org-ai\npython server.py --api --model CodeLlama-7b-Instruct-hf\n```\n\nDepending on your hardware and the model used you might need to adjust the server parameters, e.g. use `--load-in-8bit` to reduce memory usage or `--cpu` if you don't have a suitable GPU.\n\nYou should now be able to use the local model with org-ai by adding the `:local` option to the `#+begin_ai` block:\n\n```\n#+begin_ai :local\nHello CodeLlama!\n#+end_ai\n```\n\n## FAQ\n\n### Is this OpenAI specific?\n\nNo, OpenAI is the easiest to set up (you only need an API key) but you can use local models as well. See how to use Stable Diffusion and local LLMs with oobabooga/text-generation-webui above. Anthropic Claude and perplexity.ai are also supported. 
Please open an issue or PR for other services you'd like to see supported. I can be slow to respond but will add support if there is enough interest.\n\n### Are there similar projects around?\n\nThe gptel package provides an alternative interface to the OpenAI ChatGPT API: https://github.com/karthink/gptel\n\n\n## Sponsoring\n\nIf you find this project useful please consider [sponsoring](https://github.com/sponsors/rksm). Thank you!\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Frksm%2Forg-ai","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Frksm%2Forg-ai","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Frksm%2Forg-ai/lists"}