{"id":28192930,"url":"https://github.com/shipclojure/simulflow","last_synced_at":"2025-05-16T12:16:05.604Z","repository":{"id":269611540,"uuid":"907985917","full_name":"shipclojure/simulflow","owner":"shipclojure","description":"A Clojure library for building real-time voice-enabled AI pipelines. Simulflow handles the orchestration of speech recognition, audio processing, and AI service integration with the elegance of functional programming.","archived":false,"fork":false,"pushed_at":"2025-05-11T07:47:09.000Z","size":2910,"stargazers_count":63,"open_issues_count":1,"forks_count":5,"subscribers_count":3,"default_branch":"main","last_synced_at":"2025-05-11T08:29:05.640Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"","language":"Clojure","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"epl-1.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/shipclojure.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2024-12-24T19:33:06.000Z","updated_at":"2025-05-11T07:47:12.000Z","dependencies_parsed_at":"2025-01-14T06:39:30.147Z","dependency_job_id":"2606fbd1-b5b4-4ada-b255-7f38965e546a","html_url":"https://github.com/shipclojure/simulflow","commit_stats":null,"previous_names":["ovistoica/voice-fn","shipclojure/voice-fn","shipclojure/simulflow"],"tags_count":1,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/shipclojure%2Fsimulflow","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/shipclojure%2Fsimulflow/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitH
ub/repositories/shipclojure%2Fsimulflow/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/shipclojure%2Fsimulflow/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/shipclojure","download_url":"https://codeload.github.com/shipclojure/simulflow/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":253904180,"owners_count":21981819,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2025-05-16T12:15:47.333Z","updated_at":"2025-05-16T12:16:05.578Z","avatar_url":"https://github.com/shipclojure.png","language":"Clojure","funding_links":[],"categories":[],"sub_categories":[],"readme":"# simulflow - Real-time Data-Driven AI Pipeline Framework\n\n\u003cimg src=\"./resources/simulflow.png\"  align=\"right\" height=\"250\" /\u003e\n\n\n\u003e Daydreaming is the first awakening of what we call simulflow. It is\n\u003e an essential tool of rational thought. With it you can clear the mind for\n\u003e better thinking.\n– Frank Herbert, _Heretics of Dune_\n\n\n\u003e Bene Gesserit also have the ability to practice simulflow, literally the\n\u003e simultaneous flow of several threads of consciousness at any given time; mental\n\u003e multitasking, as it were. 
The combination of simulflow with their analytical\n\u003e abilities and Other Memory is responsible for the frightening intelligence of\n\u003e the average Bene Gesserit.\nSimulflow, [Dune Wiki](https://dune.fandom.com/wiki/Bene_Gesserit_Training#Simulflow)\n\n[![Clojars Project](https://img.shields.io/clojars/v/com.shipclojure/simulflow.svg)](https://clojars.org/com.shipclojure/simulflow)\n\u003cbr\u003e\n\n**simulflow** is a Clojure framework for building real-time multimodal AI applications using a data-driven, functional approach. Built on top of `clojure.core.async.flow`, it provides a composable pipeline architecture for processing audio, text, video and AI interactions with built-in support for major AI providers.\n\n\n\u003e [!WARNING]\n\u003e While Simulflow has been used in live, production applications, it is still under *active* development.\n\u003e Expect breaking changes to support new use cases.\n\n## Installation\n\n### Clojure CLI/deps.edn\n\n```clojure\n;; Add to your deps.edn\n{:deps {com.shipclojure/simulflow {:mvn/version \"0.1.4-alpha\"}}}\n```\n\n### Leiningen/Boot\n\n```clojure\n;; Add to your project.clj\n[com.shipclojure/simulflow \"0.1.4-alpha\"]\n```\n\n### Maven\n\n```xml\n\u003cdependency\u003e\n  \u003cgroupId\u003ecom.shipclojure\u003c/groupId\u003e\n  \u003cartifactId\u003esimulflow\u003c/artifactId\u003e\n  \u003cversion\u003e0.1.4-alpha\u003c/version\u003e\n\u003c/dependency\u003e\n```\n\n## Video presentation:\n[![Watch the video](https://img.youtube.com/vi/HwoGMhIx5w0/0.jpg)](https://youtu.be/HwoGMhIx5w0?t=345)\n\n## Core Features\n\n-   **Flow-Based Architecture:** Built on `core.async.flow` for robust concurrent processing\n-   **Data-First Design:** Define AI pipelines as data structures for easy configuration and modification\n-   **Streaming Architecture:** Efficient real-time audio and text processing\n-   **Extensible:** Easy to add new processors and embed them in AI flows\n-   **Flexible Frame System:** Type-safe message 
passing between pipeline components\n-   **Built-in Services:** Ready-to-use integrations with major AI providers\n\n\n## Quick Start: Local example\n\nFirst, create a `resources/secrets.edn`:\n\n```edn\n{:deepgram {:api-key \"\"}\n :elevenlabs {:api-key \"\"\n              :voice-id \"\"}\n :groq {:api-key \"\"}\n :openai {:new-api-sk \"\"}}\n```\n\nObtain the API keys from the respective providers and fill in the blank values.\n\nStart a REPL and evaluate the snippets in the `(comment ...)` blocks to start the flows.\nAllow Microphone access when prompted.\n\n```clojure\n(ns simulflow-examples.local\n  (:require\n   [clojure.core.async :as a]\n   [clojure.core.async.flow :as flow]\n   [taoensso.telemere :as t]\n   [simulflow.processors.deepgram :as asr]\n   [simulflow.processors.elevenlabs :as tts]\n   [simulflow.processors.llm-context-aggregator :as context]\n   [simulflow.processors.openai :as llm]\n   [simulflow.secrets :refer [secret]]\n   [simulflow.transport :as transport]\n   [simulflow.utils.core :as u]))\n\n(defn make-local-flow\n  \"This example showcases a voice AI agent for the local computer.  Audio is\n  usually encoded as PCM at 16kHz frequency (sample rate) and it is mono (1\n  channel).\n    \"\n  ([] (make-local-flow {}))\n  ([{:keys [llm-context extra-procs extra-conns encoding debug?\n            sample-rate language sample-size-bits channels chunk-duration-ms]\n     :or {llm-context {:messages [{:role \"system\"\n                                   :content \"You are a helpful assistant \"}]}\n          encoding :pcm-signed\n          sample-rate 16000\n          sample-size-bits 16\n          channels 1\n          chunk-duration-ms 20\n          language :en\n          debug? 
false\n          extra-procs {}\n          extra-conns []}}]\n\n   (flow/create-flow\n     {:procs\n      (u/deep-merge\n        {;; Capture audio from microphone and send raw-audio-input frames further in the pipeline\n         :transport-in {:proc transport/microphone-transport-in\n                        :args {:audio-in/sample-rate sample-rate\n                               :audio-in/channels channels\n                               :audio-in/sample-size-bits sample-size-bits}}\n         ;; raw-audio-input -\u003e transcription frames\n         :transcriptor {:proc asr/deepgram-processor\n                        :args {:transcription/api-key (secret [:deepgram :api-key])\n                               :transcription/interim-results? true\n                               :transcription/punctuate? false\n                               :transcription/vad-events? true\n                               :transcription/smart-format? true\n                               :transcription/model :nova-2\n                               :transcription/utterance-end-ms 1000\n                               :transcription/language language\n                               :transcription/encoding encoding\n                               :transcription/sample-rate sample-rate}}\n\n         ;; user transcription \u0026 llm message frames -\u003e llm-context frames\n         ;; responsible for keeping the full conversation history\n         :context-aggregator  {:proc context/context-aggregator\n                               :args {:llm/context llm-context\n                                      :aggregator/debug? 
debug?}}\n\n         ;; Takes llm-context frames and produces new llm-text-chunk \u0026 llm-tool-call-chunk frames\n         :llm {:proc llm/openai-llm-process\n               :args {:openai/api-key (secret [:openai :new-api-sk])\n                      :llm/model \"gpt-4o-mini\"}}\n\n         ;; llm-text-chunk \u0026 llm-tool-call-chunk -\u003e llm-context-messages-append frames\n         :assistant-context-assembler {:proc context/assistant-context-assembler\n                                       :args {:debug? debug?}}\n\n         ;; llm-text-chunk -\u003e sentence speak frames (faster for text to speech)\n         :llm-sentence-assembler {:proc context/llm-sentence-assembler}\n\n         ;; speak-frames -\u003e audio-output-raw frames\n         :tts {:proc tts/elevenlabs-tts-process\n               :args {:elevenlabs/api-key (secret [:elevenlabs :api-key])\n                      :elevenlabs/model-id \"eleven_flash_v2_5\"\n                      :elevenlabs/voice-id (secret [:elevenlabs :voice-id])\n                      :voice/stability 0.5\n                      :voice/similarity-boost 0.8\n                      :voice/use-speaker-boost? 
true\n                      :flow/language language\n                      :audio.out/encoding encoding\n                      :audio.out/sample-rate sample-rate}}\n\n         ;; audio-output-raw -\u003e smaller audio-output-raw frames (used for sending audio in realtime)\n         :audio-splitter {:proc transport/audio-splitter\n                          :args {:audio.out/sample-rate sample-rate\n                                 :audio.out/sample-size-bits sample-size-bits\n                                 :audio.out/channels channels\n                                 :audio.out/duration-ms chunk-duration-ms}}\n\n         ;; speakers out\n         :transport-out {:proc transport/realtime-speakers-out-processor\n                         :args {:audio.out/sample-rate sample-rate\n                                :audio.out/sample-size-bits sample-size-bits\n                                :audio.out/channels channels\n                                :audio.out/duration-ms chunk-duration-ms}}}\n        extra-procs)\n      :conns (concat\n               [[[:transport-in :out] [:transcriptor :in]]\n\n                [[:transcriptor :out] [:context-aggregator :in]]\n                [[:context-aggregator :out] [:llm :in]]\n\n                ;; Aggregate full context\n                [[:llm :out] [:assistant-context-assembler :in]]\n                [[:assistant-context-assembler :out] [:context-aggregator :in]]\n\n                ;; Assemble sentence by sentence for fast speech\n                [[:llm :out] [:llm-sentence-assembler :in]]\n                [[:llm-sentence-assembler :out] [:tts :in]]\n\n                [[:tts :out] [:audio-splitter :in]]\n                [[:audio-splitter :out] [:transport-out :in]]]\n               extra-conns)})))\n\n(def local-ai (make-local-flow))\n\n(comment\n\n  ;; Start local ai flow - starts paused\n  (let [{:keys [report-chan error-chan]} (flow/start local-ai)]\n    (a/go-loop []\n      (when-let [[msg c] (a/alts! 
[report-chan error-chan])]\n        (when (map? msg)\n          (t/log! {:level :debug :id (if (= c error-chan) :error :report)} msg))\n        (recur))))\n\n  ;; Resume local ai -\u003e you can now speak with the AI\n  (flow/resume local-ai)\n\n  ;; Stop the conversation\n  (flow/stop local-ai)\n\n  ,)\n```\n\nWhich roughly translates to:\n\n![Flow Diagram](./resources/flow.png)\n\n\nSee [examples](./examples/src/voice_fn_examples/) for more usages.\n\n\n## Supported Providers\n\n\n\n### Text-to-Speech (TTS)\n\n-   **ElevenLabs**\n    -   Models: `eleven_multilingual_v2`, `eleven_turbo_v2`, `eleven_flash_v2` and more.\n    -   Features: Real-time streaming, multiple voices, multilingual support\n\n\n### Speech-to-Text (STT)\n\n-   **Deepgram**\n    -   Models: `nova-2`, `nova-2-general`, `nova-2-meeting` and more.\n    -   Features: Real-time transcription, punctuation, smart formatting\n\n\n### Text Based Large Language Models (LLM)\n\n-   **OpenAI**\n    -   Models: `gpt-4o-mini`(fastest, cheapest),  `gpt-4`, `gpt-3.5-turbo` and more\n    -   Features: Function calling, streaming responses\n-   **Google**\n    -   Models: `gemini-2.0-flash`(fastest, cheapest),  `gemini-2.5-flash`,  and more\n    -   Features: Function calling, streaming responses, thinking\n-   **Groq**\n    -   Models: `llama-3.2-3b-preview` `llama-3.1-8b-instant` `llama-3.3-70b-versatile` etc\n    -   Features: Function calling, streaming responses, thinking\n\n\n## Key Concepts\n\n### Flows\n\nThe core building block of simulflow pipelines:\n\n-   Composed of processes connected by channels\n-   Processes can be:\n    -   Input/output handlers\n    -   AI service integrations\n    -   Data transformers\n-   Managed by `core.async.flow` for lifecycle control\n\n### Transport\n\nThe modality through which audio comes and goes from the voice ai pipeline. 
Example transport modalities:\n\n- local (microphone + speakers)\n- telephony (Twilio through WebSocket)\n- WebRTC (browser support) - TODO\n- async (through in \u0026 out `core.async` channels)\n\nYou will see processors like `:transport-in` \u0026 `:transport-out`\n\n### Frames\n\nThe basic unit of data flow, representing typed messages like:\n\n-   `:audio/input-raw` - Raw audio data\n-   `:transcription/result` - Transcribed text\n-   `:llm/text-chunk` - LLM response chunks\n-   `:system/start`, `:system/stop` - Control signals\n\nEach frame has a type and optionally a schema for the data contained in it.\n\nSee [frame.clj](./src/voice_fn/frame.clj) for all possible frames.\n\n\n### Processes\n\nComponents that transform frames:\n\n-   Define input/output requirements\n-   Can maintain state\n-   Use core.async for async processing\n-   Implement the `flow/process` protocol\n\n\n## Adding Custom Processes\n\n```clojure\n    (defn custom-processor []\n      (flow/process\n        {:describe (fn [] {:ins {:in \"Input channel\"}\n                           :outs {:out \"Output channel\"}})\n         :init identity\n         :transform (fn [state in msg]\n                      [state {:out [(process-message msg)]}])}))\n```\n\n\nRead the `core.async.flow` docs for more information about flow processes.\n\n\n## Built With\n\n-   [core.async](\u003chttps://github.com/clojure/core.async\u003e) - Concurrent processing\n-   [core.async.flow](\u003chttps://clojure.github.io/core.async/#clojure.core.async.flow\u003e) - Flow control\n-   [Hato](\u003chttps://github.com/gnarroway/hato\u003e) - WebSocket support\n-   [Malli](\u003chttps://github.com/metosin/malli\u003e) - Schema validation\n\n\n## Acknowledgements\n\nsimulflow takes heavy inspiration from [pipecat](https://github.com/pipecat-ai/pipecat). Differences:\n- simulflow uses a graph instead of a bidirectional queue for frame transport\n- simulflow has a data-centric implementation. 
The processors in simulflow are\n  pure functions in the `core.async.flow` transform syntax\n\n## License\n\nMIT\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fshipclojure%2Fsimulflow","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fshipclojure%2Fsimulflow","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fshipclojure%2Fsimulflow/lists"}