{"id":50595011,"url":"https://github.com/zserge/govad","last_synced_at":"2026-06-05T13:31:00.589Z","repository":{"id":348059963,"uuid":"1196335637","full_name":"zserge/govad","owner":"zserge","description":"Silero VAD (Voice Activity Detector) in Pure Go","archived":false,"fork":false,"pushed_at":"2026-03-30T15:54:05.000Z","size":1084,"stargazers_count":1,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2026-03-30T17:39:12.948Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Go","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/zserge.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2026-03-30T15:46:18.000Z","updated_at":"2026-03-30T16:03:08.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/zserge/govad","commit_stats":null,"previous_names":["zserge/govad"],"tags_count":null,"template":false,"template_full_name":null,"purl":"pkg:github/zserge/govad","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/zserge%2Fgovad","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/zserge%2Fgovad/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/zserge%2Fgovad/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/zserge%2Fgovad/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/zserge","download_url":"https://codeload.github.com/zserge/govad/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/zserge%2Fgovad/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":33944671,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-26T15:22:16.424Z","status":"online","status_checked_at":"2026-06-05T02:00:06.157Z","response_time":120,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2026-06-05T13:30:59.603Z","updated_at":"2026-06-05T13:31:00.580Z","avatar_url":"https://github.com/zserge.png","language":"Go","funding_links":[],"categories":[],"sub_categories":[],"readme":"# govad\n\n[![CI](https://github.com/zserge/govad/actions/workflows/ci.yml/badge.svg)](https://github.com/zserge/govad/actions/workflows/ci.yml)\n[![GoDoc](https://pkg.go.dev/badge/github.com/zserge/govad.svg)](https://pkg.go.dev/github.com/zserge/govad)\n[![Go Report Card](https://goreportcard.com/badge/github.com/zserge/govad)](https://goreportcard.com/report/github.com/zserge/govad)\n\nPure Go voice activity detection using the [Silero VAD](https://github.com/snakers4/silero-vad) neural network.\n\nNo CGo. No ONNX runtime. No external dependencies.\n\nThe model weights are embedded in the binary.\n\n## Features\n\n- Pure Go inference (~300 lines), zero C dependencies\n- Processes 512-sample frames (32 ms at 16 kHz)\n- Stateful LSTM — feed frames sequentially, get speech probabilities\n- Embedded model weights — no extra files to ship\n- Validated against the ONNX reference (max diff \u003c 0.001)\n\n## Installation\n\n```\ngo get github.com/zserge/govad@latest\n```\n\n## Usage\n\n```go\npackage main\n\nimport (\n\t\"fmt\"\n\t\"github.com/zserge/govad\"\n)\n\nfunc main() {\n\t// Create a VAD detector (uses embedded weights)\n\tv, err := govad.New()\n\tif err != nil {\n\t\tpanic(err)\n\t}\n\n\t// Feed 512 float32 samples at 16 kHz per call\n\tsamples := make([]float32, govad.SamplesPerFrame)\n\t// ... fill samples from your audio source ...\n\n\tprob := v.Process(samples)\n\tif prob \u003e 0.5 {\n\t\tfmt.Println(\"Speech detected!\")\n\t}\n\n\t// Call Reset() between unrelated audio streams\n\tv.Reset()\n}\n```\n\n## Live microphone example\n\nThe `examples/live-vad` directory contains a complete real-time VAD demo\nusing [malgo](https://github.com/gen2brain/malgo) (miniaudio bindings):\n\n```\ncd examples/live-vad\ngo run . -threshold 0.5\n```\n\nIt captures audio from your default microphone and prints speech/silence\ntransitions in real time.\n\n## API\n\n| Function | Description |\n|----------|-------------|\n| `govad.New()` | Create a detector with embedded default weights |\n| `govad.NewFromFile(path)` | Load weights from a file |\n| `govad.NewFromReader(r)` | Load weights from an `io.Reader` |\n| `v.Process(samples)` | Run inference on 512 samples, returns probability `[0, 1]` |\n| `v.Reset()` | Clear LSTM state for a new audio stream |\n\n## Performance\n\nOn Apple M1:\n\n```\nBenchmarkProcess-8    1911    632370 ns/op    10112 B/op    7 allocs/op\n```\n\n~632 µs per 32 ms frame — roughly 50× faster than real time.\n\n## Model\n\nThe weights are exported from `silero_vad_half.onnx` (Silero VAD v5, 16 kHz only).\nThe architecture is:\n\n```\nAudio (512 samples, 16 kHz)\n  → Reflect pad (64 right)\n  → Conv-STFT (n_fft=256, hop=128)\n  → Magnitude spectrum\n  → Conv1d(129→128, k=3) + ReLU\n  → Conv1d(128→64,  k=3, stride=2) + ReLU\n  → Conv1d(64→64,   k=3, stride=2) + ReLU\n  → Conv1d(64→128,  k=3) + ReLU\n  → LSTMCell(128)\n  → ReLU → Linear(128→1) → Sigmoid\n  → Speech probability\n```\n\n## License\n\nThe Go code is MIT licensed. The model weights are from [Silero VAD](https://github.com/snakers4/silero-vad), also MIT licensed.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fzserge%2Fgovad","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fzserge%2Fgovad","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fzserge%2Fgovad/lists"}