{"id":24841587,"url":"https://github.com/drankush/voxrad","last_synced_at":"2025-10-27T19:09:28.181Z","repository":{"id":274397764,"uuid":"831532008","full_name":"drankush/VoxRad","owner":"drankush","description":"VOXRAD is a voice transcription application for radiologists leveraging locally deployed ASR and LLM models.","archived":false,"fork":false,"pushed_at":"2025-02-03T15:02:02.000Z","size":22301,"stargazers_count":1,"open_issues_count":4,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-02-03T15:39:20.330Z","etag":null,"topics":["desktop-app","ffmpeg","gemini","gpt","llm","macos","medical-informatics","multimodal","natural-language-processing","nlp","openai","openai-api","productivity","python","radiology","reporting","transcription","voice-recognition","whisper","windows"],"latest_commit_sha":null,"homepage":"https://voxrad.gitbook.io/voxrad","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"gpl-3.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/drankush.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":"contributing.md","funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2024-07-20T20:49:14.000Z","updated_at":"2025-02-03T15:02:06.000Z","dependencies_parsed_at":"2025-01-27T04:34:59.647Z","dependency_job_id":null,"html_url":"https://github.com/drankush/VoxRad","commit_stats":null,"previous_names":["drankush/voxrad"],"tags_count":4,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/drankush%2FVoxRad","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/drankush%2FVoxRad/tags","releases_url":"http
s://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/drankush%2FVoxRad/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/drankush%2FVoxRad/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/drankush","download_url":"https://codeload.github.com/drankush/VoxRad/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":245598314,"owners_count":20641884,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["desktop-app","ffmpeg","gemini","gpt","llm","macos","medical-informatics","multimodal","natural-language-processing","nlp","openai","openai-api","productivity","python","radiology","reporting","transcription","voice-recognition","whisper","windows"],"created_at":"2025-01-31T07:18:56.457Z","updated_at":"2025-10-27T19:09:28.114Z","avatar_url":"https://github.com/drankush.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"\u003cp align=\"center\"\u003e\n  \u003cimg src=\"images/voxrad_logo.jpg\" alt=\"VOXRAD Logo\" /\u003e\n\u003c/p\u003e\n\n\u003cdiv align=\"center\"\u003e\n  \n[![Python Badge](https://img.shields.io/badge/Python-3776AB?logo=python\u0026logoColor=fff\u0026style=for-the-badge)](#)\n[![FFmpeg Badge](https://img.shields.io/badge/OpenAI%20API-eee?style=for-the-badge\u0026logo=openai\u0026logoColor=412991)]()\n[![GitBook 
Badge](https://img.shields.io/badge/GitBook-BBDDE5?logo=gitbook\u0026logoColor=000\u0026style=for-the-badge)](https://voxrad.gitbook.io/voxrad)\n\n[![Release](https://img.shields.io/github/v/release/drankush/voxrad?include_prereleases\u0026color=blue)](https://github.com/drankush/voxrad/releases)\n[![License](https://flat.badgen.net/badge/license/GPLv3/green?icon=github)](https://github.com/drankush/voxrad/blob/main/LICENSE)\n[![Python Version](https://flat.badgen.net/badge/python/3.11%20|%203.12/blue?icon=github)](#)\n\n[![Open Issues](https://img.shields.io/github/issues/drankush/voxrad.svg?color=orange)](https://github.com/drankush/voxrad/issues)\n[![Closed Issues](https://img.shields.io/github/issues-closed/drankush/voxrad.svg?color=red)](https://github.com/drankush/voxrad/issues?q=is%3Aissue+is%3Aclosed)\n\n\n[![Apple](https://flat.badgen.net/badge/icon/apple?icon=apple\u0026label)](https://github.com/drankush/VoxRad/releases/download/v0.3.0-beta/VoxRad_macOS_v0.3.0-beta.zip)\n[![Windows](https://flat.badgen.net/badge/icon/windows?icon=windows\u0026label)](https://github.com/drankush/VoxRad/releases/download/v0.1.0-alpha/VoxRad_winOS_v0.1.0-alpha.zip)\n\n\n\u003c/div\u003e\n\n# 🚀 VOXRAD \n\nVOXRAD is a voice transcription application for radiologists leveraging voice transcription and large language models to restructure and format reports as per predefined user instruction templates.\n\n**Welcome to The VOXRAD App! 🌟 🎙**\n\nThis application leverages the power of generative AI to efficiently transcribe and format radiology reports from audio inputs. Designed for radiologists and radiology residents, it transforms spoken content into structured, readable reports.\n\n**Etymology:**\n\n-  **VoxRad** /vɒks-ræd/ *noun*\n\n1. A portmanteau derived from **Vox** (Latin for *voice*) and **Rad** (*radiology*), symbolizing the fusion of voice recognition with radiology. 
Represents the integration of voice recognition technology with radiological imaging and reporting.\n\n2. An AI-driven app transforming radiology reporting through voice transcription, enhancing accuracy in medical documentation.\n\n## ✨ Features \n\n- 🎤 Voice transcription\n- 📝 Report formatting\n- 🤖 Integration with large language models\n- ⚙️ Customizable templates\n- 📈 Potential to extend the application for dictating other structured notes (discharge notes, OT notes or legal paperwork)\n\n## 🏗️ Architecture\n\n\u003cp align=\"center\"\u003e\n\u003cimg src=\"images/voxrad_architecture.png\" alt=\"VOXRAD Architecture\" /\u003e\n\u003c/p\u003e\n\u003cp align=\"center\"\u003e\n\u003ci\u003eModified figure from Ankush et al. for v0.4.0-beta [1]\u003c/i\u003e\n\u003c/p\u003e\n\n## 🛠️ Getting Set Up\n\n### 💻 Installation \n\n- Download the `.app` file for Mac or the `.exe` file for Windows from the [releases](https://github.com/drankush/voxrad/releases).\n\n### 🔄 Understanding Workflow\nVOXRAD can turn dictated audio into a report in two ways:\n\n- Use a transcription model to transcribe the audio first, then format and restructure the transcript with an instruction template.\n- Pass the audio and the instruction template directly to a multimodal model, which produces the report in one step (experimental).\n\nRead more about the supported models [here](https://voxrad.gitbook.io/voxrad/fundamentals/getting-set-up/understanding-workflow#supported-llms).\n\n### 📄 Customizing Templates and Guidelines\n\n- Click the ⚙️ Settings button at the bottom-right corner of the application interface.\n\n  -  In the first tab, 🛠 General, click Browse and select your desired working directory. \n\n  -  Your template files (predefined CoT-like systematic instructions such as HRCT_Thorax.txt, CT_Head.txt etc.) and guidelines (such as BIRADS.md, TIRADS.md, PIRADS.md etc.) 
are kept in this directory.\n\nRead more about [Customizing templates and guidelines](https://voxrad.gitbook.io/voxrad/fundamentals/getting-set-up/customizing-templates).\n\n\n### 🔐 Managing Keys\n\n- You can encrypt the keys for the transcription, text and multimodal models with a password, and lock and unlock them while the application is in use. If encrypted keys are stored, the application asks for this password every time you start it.\n- In the \"Base URL\" field, enter the base URL in OpenAI-compatible format. Enter the API key in the \"API Key\" field.\n- You can use any OpenAI-compatible API key and Base URL, including locally deployed models that expose OpenAI-compatible endpoints.\n- Click **Fetch Model** to see the available models and choose one.\n- Click **Save Settings** to save your selected model and Base URL (these are not encrypted).\n\nRead more about managing keys, best practices and troubleshooting [here](https://voxrad.gitbook.io/voxrad/fundamentals/getting-set-up/managing-keys).\n\n### 🖥️ Running Models Locally\n\n- There are [various ways](https://voxrad.gitbook.io/voxrad/running-models-locally) to run models locally and create OpenAI-compatible endpoints, which can then be used with this application.\n- You can also enter the OpenAI-compatible Base URL and API key of [any remotely hosted service](https://voxrad.gitbook.io/voxrad/running-models-locally#remotely-hosted-models); however, this is not recommended for sensitive data. For example, Groq: https://api.groq.com/openai/v1\n\n## 🖱️ Usage \n\n### 🎙 Main App Window \n\n\u003c!--\n\u003cp align=\"center\"\u003e\n  \u003cimg src=\"images/voxrad_gui.jpg\" alt=\"VOXRAD Logo\" /\u003e\n\u003c/p\u003e\n--\u003e\n\n\n\n- Press the **Record 🔴** button and start dictating your report. Keep recordings to about 15 minutes at most, because the upload limit is 25 MB (for longer recordings, the application tries to reduce the bitrate to fit this size). 
You will see a waveform while the audio is recorded.\n\n- Press **Stop ⬜️** to stop recording. Your audio will be processed.\n\n- The final formatted and structured report is automatically copied to your clipboard. You can then paste it directly into your application, word processor, or PACS using the secure paste shortcut key defined in the General Settings (on macOS) or Ctrl + V (in the Windows application).\n\nRead detailed documentation of generating a report [here](https://voxrad.gitbook.io/voxrad/user-guide/generating-a-report).\n\n## 📚 Documentation \n\nRead comprehensive VOXRAD documentation [here](http://voxrad.gitbook.io/voxrad).\n\n## 🌟 Contributing \n\nVOXRAD is a community-driven project, and we're grateful for the contributions of our team members. Read about the [key contributors](https://voxrad.gitbook.io/voxrad/support-and-contact/contributors). Please read the [contributing guidelines](CONTRIBUTING.md) before getting started.\n\n## 📜 License \n\nThis project is licensed under the GPLv3 License - see the [LICENSE](LICENSE) file for details. Up to v0.3.0-beta, the application uses FFmpeg, which is licensed under the GNU General Public License (GPL) version 2 or later. For more details, please refer to the [documentation](https://github.com/drankush/voxrad/docs/FFmpeg.md/) in the repository.\n\n## 🐞 Support \n\nTo report bugs or issues, please follow [this guide](https://github.com/drankush/voxrad/blob/main/contributing.md#reporting-bugs).\n\n### 📧 Contact \n\nFor any other questions, support or appreciation, please get in touch [here](mailto:voxrad@drankush.com).\n\n## 🚨 Disclaimer \n\nThis is a purely demonstrative application of the capabilities of AI and may not comply with local regulations on handling sensitive and private data. It is not intended for any diagnostic or clinical use. 
Please read the terms of use of the API keys that you will be using.\n\n- The application is not intended to replace professional medical advice, diagnosis, or treatment.\n- Users must ensure they comply with all relevant local laws and regulations when using the application, especially concerning data privacy and security.\n- Users are advised to locally host voice transcription and text models and use their endpoints for sensitive data.\n- The developers are not responsible for any misuse of the application or any data breaches that may occur.\n- The application does not encrypt data by default; users must take additional steps to secure their data.\n- Always verify the accuracy of the transcriptions and generated reports manually.\n\n## 🔖 Cite\n```\n@article{ankush_voxrad_2025,\n\ttitle = {{VoxRad}: {Building} an open-source locally-hosted radiology reporting system},\n\tvolume = {119},\n\tissn = {0899-7071, 1873-4499},\n\tshorttitle = {{VoxRad}},\n\turl = {https://www.clinicalimaging.org/article/S0899-7071(25)00014-2/abstract},\n\tdoi = {10.1016/j.clinimag.2025.110414},\n\tlanguage = {English},\n\turldate = {2025-02-01},\n\tjournal = {Clinical Imaging},\n\tauthor = {Ankush, Ankush},\n\tmonth = mar,\n\tyear = {2025},\n\tpmid = {39884167},\n\tnote = {Publisher: Elsevier},\n\tkeywords = {Artificial intelligence, Efficiency, Informatics, Natural language processing, Speech recognition software},\n}\n```\n[1] Ankush A. (2025). VoxRad: Building an open-source locally-hosted radiology reporting system. Clinical Imaging, 119, 110414. Advance online publication. https://doi.org/10.1016/j.clinimag.2025.110414 PMID:[39884167](https://pubmed.ncbi.nlm.nih.gov/39884167/)\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdrankush%2Fvoxrad","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fdrankush%2Fvoxrad","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdrankush%2Fvoxrad/lists"}