# SwiftWhisper
> The easiest way to use Whisper in Swift
Easily add transcription to your app or package. Powered by [whisper.cpp](https://github.com/ggerganov/whisper.cpp).
[![](https://img.shields.io/endpoint?url=https%3A%2F%2Fswiftpackageindex.com%2Fapi%2Fpackages%2FexPHAT%2FSwiftWhisper%2Fbadge%3Ftype%3Dswift-versions)](https://swiftpackageindex.com/exPHAT/SwiftWhisper)
[![](https://img.shields.io/endpoint?url=https%3A%2F%2Fswiftpackageindex.com%2Fapi%2Fpackages%2FexPHAT%2FSwiftWhisper%2Fbadge%3Ftype%3Dplatforms)](https://swiftpackageindex.com/exPHAT/SwiftWhisper)

## Install
#### Swift Package Manager
Add SwiftWhisper as a dependency in your `Package.swift` file:
```swift
let package = Package(
    ...
    dependencies: [
        // Add the package to your dependencies
        .package(url: "https://github.com/exPHAT/SwiftWhisper.git", branch: "master"),
    ],
    ...
    targets: [
        // Add SwiftWhisper as a dependency on any target you want to use it in
        .target(name: "MyTarget",
                dependencies: [.byName(name: "SwiftWhisper")])
    ]
    ...
)
```

#### Xcode
Add `https://github.com/exPHAT/SwiftWhisper.git` in the ["Swift Package Manager" tab.](https://developer.apple.com/documentation/xcode/adding-package-dependencies-to-your-app)
## Usage
[API Documentation.](https://swiftpackageindex.com/exPHAT/SwiftWhisper/1.0.1/documentation/)
```swift
import SwiftWhisper

let whisper = Whisper(fromFileURL: /* Model file URL */)
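// The model file URL should point at a ggml Whisper model on disk;
// for example (hypothetical bundled resource, not part of the library):
// let modelFileURL = Bundle.main.url(forResource: "ggml-tiny", withExtension: "bin")!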
let segments = try await whisper.transcribe(audioFrames: /* 16kHz PCM audio frames */)

print("Transcribed audio:", segments.map(\.text).joined())
```

#### Delegate methods
You can subscribe to segments, transcription progress, and errors by implementing `WhisperDelegate` and setting `whisper.delegate = ...`
```swift
protocol WhisperDelegate {
    // Progress updates as a percentage from 0-1
    func whisper(_ aWhisper: Whisper, didUpdateProgress progress: Double)

    // Any time new segments of text have been transcribed
    func whisper(_ aWhisper: Whisper, didProcessNewSegments segments: [Segment], atIndex index: Int)

    // Finished transcribing, includes all transcribed segments of text
    func whisper(_ aWhisper: Whisper, didCompleteWithSegments segments: [Segment])

    // Error with transcription
    func whisper(_ aWhisper: Whisper, didErrorWith error: Error)
}
```
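
For example, a minimal delegate conformer might look like the following sketch; the `TranscriptionObserver` class and its print statements are illustrative, not part of the library:

```swift
import SwiftWhisper

class TranscriptionObserver: WhisperDelegate {
    func whisper(_ aWhisper: Whisper, didUpdateProgress progress: Double) {
        print("Progress: \(Int(progress * 100))%")
    }

    func whisper(_ aWhisper: Whisper, didProcessNewSegments segments: [Segment], atIndex index: Int) {
        print("New segments starting at index \(index):", segments.map(\.text).joined())
    }

    func whisper(_ aWhisper: Whisper, didCompleteWithSegments segments: [Segment]) {
        print("Finished:", segments.map(\.text).joined())
    }

    func whisper(_ aWhisper: Whisper, didErrorWith error: Error) {
        print("Transcription error:", error)
    }
}

let observer = TranscriptionObserver()
whisper.delegate = observer // keep a strong reference in case the delegate property is weak
```

## Misc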
### Downloading Models :inbox_tray:
You can find the pre-trained models [here](https://huggingface.co/ggerganov/whisper.cpp) for download.
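If you prefer to fetch a model at runtime rather than bundling one, a minimal sketch using `URLSession` might look like this; the `ggml-tiny.bin` file name is just one example from that repository, and the destination path is arbitrary:

```swift
import Foundation

// Download a ggml Whisper model into the caches directory (illustrative sketch).
func downloadWhisperModel() async throws -> URL {
    // Example model file from the whisper.cpp Hugging Face repository
    let remote = URL(string: "https://huggingface.co/ggerganov/whisper.cpp/resolve/main/ggml-tiny.bin")!
    let destination = FileManager.default
        .urls(for: .cachesDirectory, in: .userDomainMask)[0]
        .appendingPathComponent("ggml-tiny.bin")

    let (downloaded, _) = try await URLSession.shared.download(from: remote)
    try? FileManager.default.removeItem(at: destination) // replace any previous copy
    try FileManager.default.moveItem(at: downloaded, to: destination)
    return destination
}
```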
### CoreML Support :brain:
To use CoreML, you'll need to include a CoreML model file with the suffix `-encoder.mlmodelc` under the same name as the Whisper model (example: `tiny.bin` would sit beside a `tiny-encoder.mlmodelc` file). In addition to the additional model file, you will also need to use the `Whisper(fromFileURL:)` initializer. You can verify CoreML is active by checking the console output during transcription.
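As a sketch of that layout, assuming `modelURL` is the file URL you pass to `Whisper(fromFileURL:)`, you could derive and check the expected encoder location like this (the helper name is illustrative):

```swift
import Foundation

// Given tiny.bin, the CoreML encoder is expected at tiny-encoder.mlmodelc
// in the same directory (naming convention only; illustrative check).
func coreMLEncoderURL(for modelURL: URL) -> URL {
    let base = modelURL.deletingPathExtension().lastPathComponent // e.g. "tiny"
    return modelURL
        .deletingLastPathComponent()
        .appendingPathComponent("\(base)-encoder.mlmodelc")
}

let encoderURL = coreMLEncoderURL(for: modelURL)
if !FileManager.default.fileExists(atPath: encoderURL.path) {
    print("CoreML encoder not found; transcription will run without CoreML")
}
```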
### Converting audio to 16kHz PCM :wrench:
The easiest way to get audio frames into SwiftWhisper is to use [AudioKit](https://github.com/AudioKit/AudioKit). The following example takes an input audio file, converts and resamples it, and returns an array of 16kHz PCM floats.
```swift
import AudioKit

func convertAudioFileToPCMArray(fileURL: URL, completionHandler: @escaping (Result<[Float], Error>) -> Void) {
    var options = FormatConverter.Options()
    options.format = .wav
    options.sampleRate = 16000
    options.bitDepth = 16
    options.channels = 1
    options.isInterleaved = false

    let tempURL = URL(fileURLWithPath: NSTemporaryDirectory()).appendingPathComponent(UUID().uuidString)
    let converter = FormatConverter(inputURL: fileURL, outputURL: tempURL, options: options)
    converter.start { error in
        if let error {
            completionHandler(.failure(error))
            return
        }

        let data: Data
        do {
            data = try Data(contentsOf: tempURL)
        } catch {
            completionHandler(.failure(error))
            return
        }
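        // Skip the 44-byte WAV header, then read little-endian 16-bit samples
        // and scale them to the range [-1, 1].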
        let floats = stride(from: 44, to: data.count, by: 2).map {
            return data[$0..<$0 + 2].withUnsafeBytes {
                let short = Int16(littleEndian: $0.load(as: Int16.self))
                return max(-1.0, min(Float(short) / 32767.0, 1.0))
            }
        }

        try? FileManager.default.removeItem(at: tempURL)
        completionHandler(.success(floats))
    }
}
```
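
Putting the pieces together, here is a sketch of converting a file and transcribing it, assuming the helper above plus `modelFileURL` and `recordingURL` values you provide:

```swift
let whisper = Whisper(fromFileURL: modelFileURL)

convertAudioFileToPCMArray(fileURL: recordingURL) { result in
    switch result {
    case .success(let frames):
        Task {
            do {
                let segments = try await whisper.transcribe(audioFrames: frames)
                print("Transcribed audio:", segments.map(\.text).joined())
            } catch {
                print("Transcription failed:", error)
            }
        }
    case .failure(let error):
        print("Conversion failed:", error)
    }
}
```

### Development speed boost :rocket: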
You may find transcription performance slow when compiling your app in the `Debug` build configuration. This is because the compiler doesn't fully optimize SwiftWhisper unless the build configuration is set to `Release`.
You can get around this by installing a version of SwiftWhisper that uses `.unsafeFlags(["-O3"])` to force maximum optimization. The easiest way to do this is to use the latest commit on the [`fast`](https://github.com/exPHAT/SwiftWhisper/tree/fast) branch. Alternatively, you can configure your scheme to build in the `Release` configuration.
```swift
...
dependencies: [
    // Using the latest commit hash for the `fast` branch:
    .package(url: "https://github.com/exPHAT/SwiftWhisper.git", revision: "deb1cb6a27256c7b01f5d3d2e7dc1dcc330b5d01"),
],
...
```