{"id":47882723,"url":"https://github.com/robacarp/whisper-cry","last_synced_at":"2026-04-04T02:00:14.094Z","repository":{"id":342850310,"uuid":"1175033777","full_name":"robacarp/whisper-cry","owner":"robacarp","description":"A Crystal wrapper for Whisper.CPP","archived":false,"fork":false,"pushed_at":"2026-03-08T22:54:33.000Z","size":36,"stargazers_count":1,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"master","last_synced_at":"2026-04-04T01:59:39.625Z","etag":null,"topics":["whisper-cpp"],"latest_commit_sha":null,"homepage":"","language":"Crystal","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/robacarp.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2026-03-07T06:11:21.000Z","updated_at":"2026-03-28T04:58:44.000Z","dependencies_parsed_at":null,"dependency_job_id":"31c80eb4-3c5a-4275-acee-c7d714a26060","html_url":"https://github.com/robacarp/whisper-cry","commit_stats":null,"previous_names":["robacarp/whisper-cry"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/robacarp/whisper-cry","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/robacarp%2Fwhisper-cry","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/robacarp%2Fwhisper-cry/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/robacarp%2Fwhisper-cry/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/robacarp%2F
whisper-cry/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/robacarp","download_url":"https://codeload.github.com/robacarp/whisper-cry/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/robacarp%2Fwhisper-cry/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":31384847,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-04-04T01:22:39.193Z","status":"online","status_checked_at":"2026-04-04T02:00:07.569Z","response_time":60,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["whisper-cpp"],"created_at":"2026-04-04T02:00:13.293Z","updated_at":"2026-04-04T02:00:14.069Z","avatar_url":"https://github.com/robacarp.png","language":"Crystal","readme":"# whisper-cry\n\nCrystal bindings for [whisper.cpp](https://github.com/ggml-org/whisper.cpp), providing local speech-to-text transcription using OpenAI's Whisper models. Version tracks whisper.cpp releases (currently v1.8.3).\n\n## Installation\n\n1. Add the dependency to your `shard.yml`:\n\n   ```yaml\n   dependencies:\n     whisper-cry:\n       github: robacarp/whisper-cry\n   ```\n\n2. Run `shards install`\n\n3. Build the native libraries:\n\n   ```sh\n   cd lib/whisper-cry \u0026\u0026 make\n   ```\n\n   This clones whisper.cpp v1.8.3, builds it as a static library, and copies the `.a` files into `vendor/lib/`. Requires `cmake` and a C++ compiler. 
See the [whisper.cpp build documentation](https://github.com/ggml-org/whisper.cpp#building-the-project) for platform-specific details and options.\n\n4. Download a Whisper model (e.g. the base English model):\n\n   ```sh\n   curl -L -o ggml-base.en.bin https://huggingface.co/ggerganov/whisper.cpp/resolve/main/ggml-base.en.bin\n   ```\n\n   See the [whisper.cpp models directory](https://github.com/ggml-org/whisper.cpp/tree/master/models) for all available models.\n\n5. Optimize the model for your hardware (optional but recommended):\n\n   The whisper.cpp project provides documentation and scripts for optimizing models for specific hardware, including quantization:\n\n   - macOS: [Core ML](https://github.com/ggml-org/whisper.cpp?tab=readme-ov-file#core-ml-support)\n   - [OpenVINO](https://github.com/ggml-org/whisper.cpp?tab=readme-ov-file#openvino-support)\n   - [NVIDIA GPU](https://github.com/ggml-org/whisper.cpp?tab=readme-ov-file#nvidia-gpu-support)\n\n## Usage\n\n```crystal\nrequire \"whisper-cry\"\n\nwhisper = Whisper.new(\"/path/to/ggml-base.en.bin\")\nsegments = whisper.transcribe_file(\"audio.wav\")\n\nsegments.each do |segment|\n  puts \"#{segment.start_timestamp} --\u003e #{segment.end_timestamp}\"\n  puts segment.text\nend\n\nwhisper.close\n```\n\nAudio files must be 16-bit PCM WAV, mono, 16kHz. Convert with ffmpeg:\n\n```sh\nffmpeg -i input.mp3 -ar 16000 -ac 1 -f wav output.wav\n```\n\n### API\n\n#### `Whisper.new(model_path, use_gpu = false)`\n\nLoads a GGML-format model file and initializes the inference context. Set `use_gpu: true` to enable Metal acceleration on macOS. Raises `Whisper::Error` if the model file is missing or fails to load.\n\n#### `#transcribe_file(path, language = \"en\", n_threads = 4, translate = false)`\n\nTranscribes a WAV file and returns an `Array(Whisper::Segment)`. 
The file must be 16-bit signed PCM, mono, 16kHz.\n\n#### `#transcribe(samples, language = \"en\", n_threads = 4, translate = false)`\n\nTranscribes pre-loaded `Float32` audio samples (normalized to `[-1.0, 1.0]`, mono, 16kHz). Useful when you already have audio data in memory.\n\nOptions:\n- **language**: ISO 639-1 code (e.g. `\"en\"`, `\"es\"`), or `nil` for auto-detection\n- **n_threads**: CPU threads for inference\n- **translate**: when `true`, translates to English regardless of source language\n\n#### `#close`\n\nFrees the underlying whisper context. Safe to call multiple times. Also called automatically by `#finalize`.\n\n#### `#version`, `#model_type`, `#multilingual?`, `#system_info`\n\nQuery the whisper.cpp version string, loaded model type (e.g. `\"base\"`), multilingual support, and available CPU features.\n\n### `Whisper::Segment`\n\nEach segment represents a span of recognized speech:\n\n| Method | Returns |\n|---|---|\n| `#text` | Transcribed text |\n| `#start_ms` / `#end_ms` | Timing in milliseconds |\n| `#start_seconds` / `#end_seconds` | Timing in seconds |\n| `#duration_ms` | Segment duration in milliseconds |\n| `#start_timestamp` / `#end_timestamp` | Formatted as `\"HH:MM:SS.mmm\"` |\n| `#no_speech_probability` | `Float32` (0.0-1.0); higher values mean the segment is likely not speech |\n| `#speaker_turn_next` | `true` if the next segment is from a different speaker |\n\n## Development\n\nRun tests:\n\n```sh\ncrystal spec\n```\n\nTests cover `Segment` formatting/conversion, WAV file parsing and validation, and `Whisper` initialization error handling. No model file is needed to run the test suite.\n\n## License\n\n[MIT](LICENSE)\n","funding_links":[],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Frobacarp%2Fwhisper-cry","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Frobacarp%2Fwhisper-cry","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Frobacarp%2Fwhisper-cry/lists"}