https://github.com/k2-fsa/sherpa-ncnn

Real-time speech recognition and voice activity detection (VAD) using next-gen Kaldi with ncnn, without an Internet connection. Supports iOS, Android, Linux, macOS, Windows, Raspberry Pi, VisionFive2, LicheePi4A, etc.

asr c cpp csharp go kotlin python speech-recognition vad voice-activity-detection


Host: GitHub
URL: https://github.com/k2-fsa/sherpa-ncnn
Owner: k2-fsa
License: apache-2.0
Created: 2022-09-04T11:26:54.000Z (over 3 years ago)
Default Branch: master
Last Pushed: 2025-05-09T13:08:17.000Z (7 months ago)
Last Synced: 2025-05-09T13:58:57.488Z (7 months ago)
Topics: asr, c, cpp, csharp, go, kotlin, python, speech-recognition, vad, voice-activity-detection
Language: C++
Homepage: https://k2-fsa.github.io/sherpa/ncnn/index.html
Size: 2.06 MB
Stars: 1,317
Watchers: 34
Forks: 178
Open Issues: 51
Metadata Files:
- Readme: README.md
- License: LICENSE

Awesome Lists containing this project

StarryDivineSky - k2-fsa/sherpa-ncnn

README

### Supported functions

| Real-time speech recognition | Voice activity detection |
|------------------------------|--------------------------|
|   ✔️                          |         ✔️                |

### Supported platforms

|Architecture| Android          | iOS           | Windows    | macOS | Linux |
|------------|------------------|---------------|------------|-------|-------|
|   x64      |  ✔️               |               |   ✔️        | ✔️     |  ✔️    |
|   x86      |  ✔️               |               |   ✔️        |       |       |
|   arm64    |  ✔️               | ✔️             |   ✔️        | ✔️     |  ✔️    |
|   arm32    |  ✔️               |               |            |       |  ✔️    |
|   riscv64  |                  |               |            |       |  ✔️    |

### Supported programming languages

| 1. C++ | 2. C  | 3. Python | 4. JavaScript |
|--------|-------|-----------|---------------|
|   ✔️    | ✔️     | ✔️         |    ✔️          |

|5. Go   | 6. C# | 7. Kotlin | 8. Swift |
|--------|-------|-----------|----------|
| ✔️      |  ✔️    | ✔️         |  ✔️       |

It also supports WebAssembly.

## Introduction

This repository supports running the following functions **locally**

  - Streaming speech-to-text (i.e., real-time speech recognition)

  - VAD (e.g., [silero-vad](https://github.com/snakers4/silero-vad))

on the following platforms and operating systems:

  - x86, ``x86_64``, 32-bit ARM, 64-bit ARM (arm64, aarch64), RISC-V (riscv64)

  - Linux, macOS, Windows, openKylin

  - Android, WearOS

  - iOS

  - NodeJS

  - WebAssembly

  - [Raspberry Pi](https://www.raspberrypi.com/)

  - [RV1126](https://www.rock-chips.com/uploads/pdf/2022.8.26/191/RV1126%20Brief%20Datasheet.pdf)

  - [LicheePi4A](https://sipeed.com/licheepi4a)

  - [VisionFive 2](https://www.starfivetech.com/en/site/boards)

  - [Horizon Sunrise X3 Pi (旭日X3派)](https://developer.horizon.ai/api/v1/fileData/documents_pi/index.html)

  - etc

with the following APIs

  - C++, C, Python, Go, ``C#``

  - Kotlin

  - JavaScript

  - Swift

We support all platforms that [ncnn](https://github.com/tencent/ncnn) supports.

Everything can be compiled from source with static linking. The generated
executable depends only on system libraries.

**HINT**: It does not depend on PyTorch or any inference framework
other than [ncnn](https://github.com/tencent/ncnn).
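To make the idea of voice activity detection concrete, here is a minimal, illustrative sketch in pure Python. It is a simple energy-threshold detector, not silero-vad and not the `sherpa-ncnn` API; the function names, the frame length of 160 samples, and the threshold of 0.1 are all assumptions chosen for illustration.

```python
import math


def frame_energies(samples, frame_len):
    """Return the RMS energy of each full frame of `frame_len` samples."""
    energies = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        energies.append(math.sqrt(sum(x * x for x in frame) / frame_len))
    return energies


def detect_speech(samples, frame_len=160, threshold=0.1):
    """Return (start, end) sample ranges whose frame energy reaches `threshold`.

    Hypothetical helper: a toy stand-in for a real VAD model.
    """
    segments = []
    seg_start = None
    energies = frame_energies(samples, frame_len)
    for i, energy in enumerate(energies):
        if energy >= threshold and seg_start is None:
            # A loud frame opens a new speech segment.
            seg_start = i * frame_len
        elif energy < threshold and seg_start is not None:
            # A quiet frame closes the current segment.
            segments.append((seg_start, i * frame_len))
            seg_start = None
    if seg_start is not None:
        # The audio ended while a segment was still open.
        segments.append((seg_start, len(energies) * frame_len))
    return segments
```

A model-based VAD replaces the fixed energy threshold with a learned speech probability, which tends to hold up better against background noise and music; the surrounding segment bookkeeping stays similar.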

Please see the [documentation](https://k2-fsa.github.io/sherpa/ncnn/)
for installation and usage, e.g.,

  - How to build an Android app

  - How to download and use pre-trained models

We provide a few YouTube videos demonstrating real-time speech recognition
with `sherpa-ncnn` using a microphone:

  - `English`: 

  - `Chinese`: 

  - Multilingual (Chinese + English) with endpointing Python demo : 

  - **Android demos**

  - Multilingual (Chinese + English) Android demo 1: 

  - Multilingual (Chinese + English) Android demo 2: 

  - `Chinese (with background noise)` Android demo : 

  - `Chinese` Android demo : 

  - `Chinese poem with background music` Android demo : 

### Links for pre-built Android APKs

| Description                    | URL                                                       |
|--------------------------------|-----------------------------------------------------------|
| Streaming speech recognition   | [Address](https://github.com/k2-fsa/sherpa-ncnn/releases) |

### Links for pre-trained models

https://github.com/k2-fsa/sherpa-ncnn/releases/tag/models

### Useful links

- Documentation: https://k2-fsa.github.io/sherpa/ncnn/

- Bilibili demo videos: https://search.bilibili.com/all?keyword=%E6%96%B0%E4%B8%80%E4%BB%A3Kaldi

### How to reach us

Please see
https://k2-fsa.github.io/sherpa/social-groups.html
for the Next-gen Kaldi (新一代 Kaldi) **WeChat group** and **QQ group**.

## See also

  - 

  -
