{"id":28347440,"url":"https://github.com/rhinodevel/mt_stt","last_synced_at":"2026-05-20T05:31:28.380Z","repository":{"id":295227113,"uuid":"985815994","full_name":"RhinoDevel/mt_stt","owner":"RhinoDevel","description":"Pure C wrapper library to use Whisper.cpp with Linux and Windows as simple as possible.","archived":false,"fork":false,"pushed_at":"2025-05-24T11:39:08.000Z","size":19,"stargazers_count":1,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-07-09T23:28:11.248Z","etag":null,"topics":["speech-to-text","stt","whisper","whisper-cpp"],"latest_commit_sha":null,"homepage":"","language":"C++","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/RhinoDevel.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE.txt","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2025-05-18T15:31:53.000Z","updated_at":"2025-06-02T23:09:07.000Z","dependencies_parsed_at":"2025-05-24T10:36:23.134Z","dependency_job_id":"227e45e2-01ab-43d6-a740-fca09ad74d4e","html_url":"https://github.com/RhinoDevel/mt_stt","commit_stats":null,"previous_names":["rhinodevel/mt_stt"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/RhinoDevel/mt_stt","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/RhinoDevel%2Fmt_stt","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/RhinoDevel%2Fmt_stt/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/RhinoDevel%2Fmt_stt/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/RhinoDevel%
2Fmt_stt/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/RhinoDevel","download_url":"https://codeload.github.com/RhinoDevel/mt_stt/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/RhinoDevel%2Fmt_stt/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":272816157,"owners_count":24997719,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-08-30T02:00:09.474Z","response_time":77,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["speech-to-text","stt","whisper","whisper-cpp"],"created_at":"2025-05-27T17:08:19.904Z","updated_at":"2026-05-20T05:31:28.373Z","avatar_url":"https://github.com/RhinoDevel.png","language":"C++","funding_links":[],"categories":[],"sub_categories":[],"readme":"# mt_stt\n\n*Marcel Timm, RhinoDevel, 2025*\n\n**mt_stt** is a C++ library for Linux and Windows that offers a pure C interface\nto the great speech-to-text inference engine\n[Whisper.cpp](https://github.com/ggml-org/whisper.cpp) by Georgi Gerganov that\nitself runs [OpenAI Whisper](https://github.com/openai/whisper) models.\n\nWith **mt_stt** you can:\n- Transcribe from raw audio in memory to a string.\n- Use a model to be loaded from file or already held in memory.\n- Translate to English.\n- Add an optional initial prompt (to bias/help the transcription process).\n- Use a progress callback and 
the cancel option.\n- Optionally transcribe only a specific part of the audio data.\n- Output probabilities of the transcribed words (how confident the model is\n  that each word is correct).\n\n## STT -\u003e LLM -\u003e TTS pipeline example in C\n\nTake a look at the [example](https://github.com/RhinoDevel/mt_llm/tree/main/stt_llm_tts-pipeline-example)\nshowing a simple **S**peech-**T**o-**T**ext, **L**arge-**L**anguage-**M**odel,\n**T**ext-**T**o-**S**peech pipeline via\n[mt_stt](./),\n[mt_llm](https://github.com/RhinoDevel/mt_llm)\nand [mt_tts](https://github.com/RhinoDevel/mt_tts)!\n\n## How To\n\nClone the **mt_stt** repository:\n\n`git clone https://github.com/RhinoDevel/mt_stt.git`\n\nEnter the created folder:\n\n`cd mt_stt`\n\nGet the [Whisper.cpp](https://github.com/ggml-org/whisper.cpp) submodule\ncontent:\n\n`git submodule update --init --recursive`\n\n## Linux\n\nThere are no Linux-specific instructions here yet, but you can take a look at\nthe Windows instructions below and at the [Makefile](./mt_stt/Makefile).\n\n## Windows\n\n#### Note:\n\nAll of the following examples build static libraries; there may be use cases\nwhere dynamically linked libraries are sufficient, too.\n\n### Build [Whisper.cpp](https://github.com/ggml-org/whisper.cpp)\n\n#### Compile `whisper.lib` and `ggml.lib` as static libraries\n\nCompile the necessary `whisper.lib` and `ggml.lib` libraries via Visual Studio\nand `mt_stt/whisper.cpp/CMakeLists.txt` as static libraries.\n\nTo do that, modify the file `mt_stt/whisper.cpp/CMakePresets.json`, which is\ncreated by Visual Studio:\n\nIf the `git` binary is not in your path, modify the `\"configurePresets\"` entry\nwith `\"name\"` `\"windows-base\"` by adding the following entry to\n`\"cacheVariables\"`:\n\n`\"GIT_EXE\": \"C:\\\\Program Files\\\\Git\\\\bin\\\\git.exe\"`\n\nAdd entry\n\n```\n{\n  \"name\": \"mt-x64-release-static\",\n  \"displayName\": \"MT x64 Release Static (native)\",\n  \"description\": \"MT: Target Windows 
(64-bit), static, with the Visual Studio development environment. (RelWithDebInfo)\",\n  \"inherits\": \"x64-release\",\n  \"cacheVariables\": {\n    \"BUILD_SHARED_LIBS\": \"OFF\"\n  }\n}\n```\n\nto `mt_stt/whisper.cpp/CMakePresets.json`'s `configurePresets` array.\n\n#### OpenBLAS build\n\nDownload [OpenBLAS](http://www.openmathlib.org/OpenBLAS/) (e.g.\n`OpenBLAS-0.3.29-x64.zip`) and unpack the content to `C:\\openblas`.\n\nAdditionally add entry\n\n```\n{\n  \"name\": \"mt-x64-release-static-blas\",\n  \"displayName\": \"MT x64 Release Static BLAS\",\n  \"description\": \"MT: Target Windows (64-bit), static, BLAS, with the Visual Studio development environment. (RelWithDebInfo)\",\n  \"inherits\": \"mt-x64-release-static\",\n  \"cacheVariables\": {\n    \"GGML_BLAS\": \"ON\",\n    \"BLAS_LIBRARIES\": \"C:/openblas/lib/libopenblas.lib\",\n    \"BLAS_INCLUDE_DIRS\": \"C:/openblas/include\"\n  }\n}\n```\n\nto `mt_stt/whisper.cpp/CMakePresets.json`'s `configurePresets` array.\n\nPut the `libopenblas.dll` (from `C:\\openblas\\bin\\libopenblas.dll`) into the\nfolder of the executable file that will be linked with **this** project's\nresulting DLL.\n\n#### CUDA build\n\nTested with (e.g.) CUDA 12.4.131 and Whisper.cpp v1.7.5.\n\nAdditionally add entry\n\n```\n{\n  \"name\": \"mt-x64-release-static-cuda\",\n  \"displayName\": \"MT x64 Release Static CUDA (native)\",\n  \"description\": \"MT: Target Windows (64-bit), static, CUDA, with the Visual Studio development environment. (RelWithDebInfo)\",\n  \"inherits\": \"mt-x64-release-static\",\n  \"cacheVariables\": {\n    \"GGML_CUDA\": \"ON\"\n  }\n}\n```\n\nto `mt_stt/whisper.cpp/CMakePresets.json`'s `configurePresets` array.\n\nIn **mt_stt**, link with these libraries (e.g. from `C:\\cuda\\lib\\`):\n\n- `x64\\cublas.lib`\n- `x64\\cuda.lib`\n- `x64\\cudart.lib`\n\nPut the following files (e.g. 
from `C:\\cuda\\bin`) into the folder of the\nexecutable file that will be linked with **this** project's resulting DLL:\n\n- `cublas64_12.dll`\n- `cublasLt64_12.dll`\n- `cudart64_12.dll`\n\nOn a non-development PC, make sure that the most recent Nvidia drivers are\ninstalled (they include CUDA support).\n\n#### Build for non-AVX processors (e.g. Celeron)\n\nAdditionally add entry\n\n```\n{\n  \"name\": \"mt-x64-release-static-sse\",\n  \"displayName\": \"MT x64 Release Static SSE\",\n  \"description\": \"MT: Target Windows (64-bit), static, SSE, with the Visual Studio development environment. (RelWithDebInfo)\",\n  \"inherits\": \"mt-x64-release-static\",\n  \"cacheVariables\": {\n    \"GGML_NATIVE\": \"OFF\",\n    \"GGML_AVX\": \"OFF\",\n    \"GGML_AVX2\": \"OFF\"\n  }\n}\n```\n\nto `mt_stt/whisper.cpp/CMakePresets.json`'s configurePresets array.\n\n**and** change the line\n\n`#if defined(_MSC_VER) \u0026\u0026 (defined(__AVX__) || defined(__AVX2__) || defined(__AVX512F__))`\n\nto\n\n`#if defined(_MSC_VER)// \u0026\u0026 (defined(__AVX__) || defined(__AVX2__) || defined(__AVX512F__))`\n\nin the file\n\n`mt_stt/whisper.cpp/ggml/src/ggml-cpu/ggml-cpu-impl.h`\n\nbefore the line\n\n`#ifndef __SSE3__`\n\nto enable SSE3 and SSSE3.\n\n#### Build for non-AVX processors (e.g. Celeron), with OpenBLAS\n\nDownload [OpenBLAS](http://www.openmathlib.org/OpenBLAS/) (e.g.\n`OpenBLAS-0.3.29-x64.zip`) and unpack the content to `C:\\openblas`.\n\nAdditionally add entry (also don't forget `ggml-cpu-impl.h` - see above)\n\n```\n{\n  \"name\": \"mt-x64-release-static-sse-blas\",\n  \"displayName\": \"MT x64 Release Static SSE and BLAS\",\n  \"description\": \"MT: Target Windows (64-bit), static, SSE, BLAS, with the Visual Studio development environment. 
(RelWithDebInfo)\",\n  \"inherits\": \"mt-x64-release-static-sse\",\n  \"cacheVariables\": {\n    \"GGML_BLAS\": \"ON\",\n    \"BLAS_LIBRARIES\": \"C:/openblas/lib/libopenblas.lib\",\n    \"BLAS_INCLUDE_DIRS\": \"C:/openblas/include\"\n  }\n}\n```\n\nPut the `libopenblas.dll` (from `C:\\openblas\\bin\\libopenblas.dll`) into the folder\nof the executable file that will be linked with **this** project's resulting DLL.\n\n### Build mt_stt\n\n- Open solution `mt_stt.sln` with Visual Studio (tested with 2022).\n- Compile in release or debug mode.\n\n### Test mt_stt\n\n- The sample code below uses [mt_tts](https://github.com/RhinoDevel/mt_tts),\n  which is the text-to-speech counterpart of **this** project.\n- Follow [Test mt_tts](https://github.com/RhinoDevel/mt_tts?tab=readme-ov-file#test-mt_tts)\n  first.\n- Take the DLL and LIB files resulting from building **this** project (e.g. for\n  release mode `x64\\Release\\mt_stt.dll` and `x64\\Release\\mt_stt.lib`) and copy\n  them to the folder from [Test mt_tts](https://github.com/RhinoDevel/mt_tts?tab=readme-ov-file#test-mt_tts).\n- Also copy the file `mt_stt\\mt_stt.h` to that folder.\n- Copy a [Whisper(.cpp) model file](https://huggingface.co/ggerganov/whisper.cpp/resolve/main/ggml-small-q5_1.bin) that supports translation to English to the same folder.\n- Open the `x64 Native Tools Command Prompt for VS 2022` command line.\n- Go to the example folder and put the following code into the existing file `main.c`:\n\n```\n#include \u003cstdbool.h\u003e\n#include \u003cstdint.h\u003e\n#include \u003cstdio.h\u003e\n#include \u003cstdlib.h\u003e\n\n#include \"mt_tts.h\"\n#include \"mt_stt.h\"\n\n/** Example use of mt_stt transcribing \u0026 translating German language audio to\n *  text in English.\n *\n *  The audio is generated first with mt_tts.\n */\nint main(void)\n{\n    int16_t* tts_result = NULL;\n    int sample_count = -1;\n    float* stt_input = NULL;\n    char* stt_result = NULL;\n\n    // *************************************************************************\n    // *** TTS: 
Create raw audio data from a text given in German:           ***\n    // *************************************************************************\n\n    // Initialize TTS system with a model/voice for output in German:\n    mt_tts_reinit(\"de_DE-thorsten-high.onnx\", \"de_DE-thorsten-high.onnx.json\");\n\n    // Get the actual raw audio data:\n    tts_result = mt_tts_to_raw(\n        \"Hallo! Dies ist ein Text in deutscher Sprache. Erst wird er in ein Tonsignal umgewandelt, welches dann wiederum in Text transkribiert wird, jedoch nun auf Englisch.\",\n        \u0026sample_count);\n\n    // Convert the audio data into normalized floating-point representation:\n\n    stt_input = malloc(sample_count * sizeof *stt_input);\n\n    for(int i = 0; i \u003c sample_count; ++i)\n    {\n        // Map the signed 16-bit range [-32768, 32767] to [-1.0, 1.0):\n        stt_input[i] = (float)tts_result[i] / 32768.0f;\n    }\n\n    // Free memory and de-initialize TTS system:\n\n    mt_tts_free_raw(tts_result);\n    tts_result = NULL;\n\n    mt_tts_deinit();\n\n    // *************************************************************************\n    // *** STT: Transcribe the audio while also translating it to English:   ***\n    // *************************************************************************\n\n    stt_result = mt_stt_transcribe_with_file(\n        false,\n        4,\n        NULL,\n        true,\n        NULL,\n        \"ggml-small-q5_1.bin\",\n        stt_input,\n        sample_count,\n        NULL,\n        NULL,\n        NULL,\n        NULL,\n        NULL,\n        NULL,\n        0);\n\n    // The audio input is no longer needed:\n    free(stt_input);\n    stt_input = NULL;\n\n    // Output the translated transcription of the spoken text (a NULL result is\n    // assumed here to indicate failure):\n    if(stt_result != NULL)\n    {\n        printf(\"%s\\n\", stt_result);\n    }\n\n    // Free memory and exit:\n    free(stt_result);\n    stt_result = NULL;\n    return 0;\n}\n```\n\n- Compile via `cl main.c mt_tts.lib mt_stt.lib`.\n- Run `main.exe`, which should show the transcription/translation result.\n\n### Notes\n\n- Install the Microsoft Visual C++ Redistributable for Visual Studio 2015,\n  2017, 2019, and 2022 (e.g. 
version 14.42.34433.0).\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Frhinodevel%2Fmt_stt","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Frhinodevel%2Fmt_stt","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Frhinodevel%2Fmt_stt/lists"}