# SwiftWhisper

> The easiest way to use Whisper in Swift

Easily add transcription to your app or package. Powered by [whisper.cpp](https://github.com/ggerganov/whisper.cpp).

[![](https://img.shields.io/endpoint?url=https%3A%2F%2Fswiftpackageindex.com%2Fapi%2Fpackages%2FexPHAT%2FSwiftWhisper%2Fbadge%3Ftype%3Dswift-versions)](https://swiftpackageindex.com/exPHAT/SwiftWhisper)
[![](https://img.shields.io/endpoint?url=https%3A%2F%2Fswiftpackageindex.com%2Fapi%2Fpackages%2FexPHAT%2FSwiftWhisper%2Fbadge%3Ftype%3Dplatforms)](https://swiftpackageindex.com/exPHAT/SwiftWhisper)

## Install

#### Swift Package Manager

Add SwiftWhisper as a dependency in your `Package.swift` file:

```swift
let package = Package(
    ...
    dependencies: [
        // Add the package to your dependencies
        .package(url: "https://github.com/exPHAT/SwiftWhisper.git", branch: "master"),
    ],
    ...
    targets: [
        // Add SwiftWhisper as a dependency on any target you want to use it in
        .target(name: "MyTarget",
                dependencies: [.byName(name: "SwiftWhisper")])
    ]
    ...
)
```

#### Xcode

Add `https://github.com/exPHAT/SwiftWhisper.git` in the ["Swift Package Manager" tab.](https://developer.apple.com/documentation/xcode/adding-package-dependencies-to-your-app)

## Usage

[API Documentation.](https://swiftpackageindex.com/exPHAT/SwiftWhisper/1.0.1/documentation/)

```swift
import SwiftWhisper

let whisper = Whisper(fromFileURL: /* Model file URL */)
let segments = try await whisper.transcribe(audioFrames: /* 16kHz PCM audio frames */)

print("Transcribed audio:", segments.map(\.text).joined())
```
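For example, here is a minimal sketch that loads a model bundled with the app. The `ggml-tiny.bin` file name is illustrative, and the audio frames are assumed to be pre-converted (see "Converting audio to 16kHz PCM" below):

```swift
import Foundation
import SwiftWhisper

// Hedged sketch: transcribe pre-converted 16kHz mono PCM frames with a bundled model.
func transcribe(frames: [Float]) async throws -> String {
    // "ggml-tiny" is an illustrative file name for a model added to the app bundle.
    guard let modelURL = Bundle.main.url(forResource: "ggml-tiny", withExtension: "bin") else {
        throw CocoaError(.fileNoSuchFile)
    }

    let whisper = Whisper(fromFileURL: modelURL)
    let segments = try await whisper.transcribe(audioFrames: frames)
    return segments.map(\.text).joined()
}
```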

#### Delegate methods

You can subscribe to segments, transcription progress, and errors by implementing `WhisperDelegate` and setting `whisper.delegate = ...`

```swift
protocol WhisperDelegate {
    // Progress updates as a percentage from 0-1
    func whisper(_ aWhisper: Whisper, didUpdateProgress progress: Double)

    // Called any time new segments of text have been transcribed
    func whisper(_ aWhisper: Whisper, didProcessNewSegments segments: [Segment], atIndex index: Int)

    // Finished transcribing, includes all transcribed segments of text
    func whisper(_ aWhisper: Whisper, didCompleteWithSegments segments: [Segment])

    // Error with transcription
    func whisper(_ aWhisper: Whisper, didErrorWith error: Error)
}
```
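As a rough sketch, a conforming type might look like the following. The class name and print statements are illustrative; only the `WhisperDelegate` methods come from SwiftWhisper:

```swift
import SwiftWhisper

// Illustrative delegate that logs each callback.
class TranscriptionObserver: WhisperDelegate {
    func whisper(_ aWhisper: Whisper, didUpdateProgress progress: Double) {
        print("Progress: \(Int(progress * 100))%")
    }

    func whisper(_ aWhisper: Whisper, didProcessNewSegments segments: [Segment], atIndex index: Int) {
        print("New segments starting at \(index):", segments.map(\.text).joined())
    }

    func whisper(_ aWhisper: Whisper, didCompleteWithSegments segments: [Segment]) {
        print("Finished:", segments.map(\.text).joined())
    }

    func whisper(_ aWhisper: Whisper, didErrorWith error: Error) {
        print("Transcription failed:", error)
    }
}

// Attach before starting transcription:
// whisper.delegate = TranscriptionObserver()
```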

## Misc

### Downloading Models :inbox_tray:

Pre-trained models are available for download from the [whisper.cpp repository on Hugging Face](https://huggingface.co/ggerganov/whisper.cpp).
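If you'd rather fetch a model at runtime than bundle it, a minimal sketch using `URLSession` might look like this. The `ggml-tiny.bin` file name and `resolve/main` URL layout are assumptions based on the repo linked above:

```swift
import Foundation

// Hedged sketch: download a model file and move it into Application Support.
func downloadTinyModel() async throws -> URL {
    let remote = URL(string: "https://huggingface.co/ggerganov/whisper.cpp/resolve/main/ggml-tiny.bin")!
    let (downloadedURL, _) = try await URLSession.shared.download(from: remote)

    let supportDirectory = FileManager.default.urls(for: .applicationSupportDirectory, in: .userDomainMask)[0]
    try FileManager.default.createDirectory(at: supportDirectory, withIntermediateDirectories: true)

    let destination = supportDirectory.appendingPathComponent("ggml-tiny.bin")
    try? FileManager.default.removeItem(at: destination)
    try FileManager.default.moveItem(at: downloadedURL, to: destination)
    return destination
}
```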

### CoreML Support :brain:

To use CoreML, you'll need to include a CoreML model file with the suffix `-encoder.mlmodelc` under the same name as the Whisper model (for example, `tiny.bin` would sit beside a `tiny-encoder.mlmodelc` file). In addition to this extra model file, you will also need to use the `Whisper(fromFileURL:)` initializer. You can verify CoreML is active by checking the console output during transcription.
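As a sketch, you could verify that layout before initializing. The helper below is illustrative and not part of SwiftWhisper:

```swift
import Foundation

// Illustrative helper: given tiny.bin, check that a tiny-encoder.mlmodelc
// sits beside it in the same directory.
func coreMLEncoderURL(besides modelURL: URL) -> URL? {
    let baseName = modelURL.deletingPathExtension().lastPathComponent // e.g. "tiny"
    let encoderURL = modelURL.deletingLastPathComponent()
        .appendingPathComponent("\(baseName)-encoder.mlmodelc")
    return FileManager.default.fileExists(atPath: encoderURL.path) ? encoderURL : nil
}
```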

### Converting audio to 16kHz PCM :wrench:

The easiest way to get audio frames into SwiftWhisper is to use [AudioKit](https://github.com/AudioKit/AudioKit). The following example takes an input audio file, converts and resamples it, and returns an array of 16kHz PCM floats.

```swift
import AudioKit

func convertAudioFileToPCMArray(fileURL: URL, completionHandler: @escaping (Result<[Float], Error>) -> Void) {
    var options = FormatConverter.Options()
    options.format = .wav
    options.sampleRate = 16000
    options.bitDepth = 16
    options.channels = 1
    options.isInterleaved = false

    let tempURL = URL(fileURLWithPath: NSTemporaryDirectory()).appendingPathComponent(UUID().uuidString)
    let converter = FormatConverter(inputURL: fileURL, outputURL: tempURL, options: options)
    converter.start { error in
        if let error {
            completionHandler(.failure(error))
            return
        }

        let data: Data
        do {
            data = try Data(contentsOf: tempURL)
        } catch {
            completionHandler(.failure(error))
            return
        }

        // Skip the 44-byte WAV header, then read each little-endian Int16 sample
        // and normalize it to a Float in the range -1...1.
        let floats = stride(from: 44, to: data.count, by: 2).map {
            data[$0..<$0 + 2].withUnsafeBytes {
                let short = Int16(littleEndian: $0.load(as: Int16.self))
                return max(-1.0, min(Float(short) / 32767.0, 1.0))
            }
        }

        try? FileManager.default.removeItem(at: tempURL)

        completionHandler(.success(floats))
    }
}
```
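Putting it together, here is a sketch of converting a file and then transcribing it. The function below is illustrative; `modelURL` and `audioURL` are assumed to point at a Whisper model file and an input audio file:

```swift
import SwiftWhisper

// Illustrative end-to-end flow using the converter above.
func transcribeFile(at audioURL: URL, modelURL: URL) {
    convertAudioFileToPCMArray(fileURL: audioURL) { result in
        switch result {
        case .success(let frames):
            Task {
                do {
                    let whisper = Whisper(fromFileURL: modelURL)
                    let segments = try await whisper.transcribe(audioFrames: frames)
                    print("Transcription:", segments.map(\.text).joined())
                } catch {
                    print("Transcription failed:", error)
                }
            }
        case .failure(let error):
            print("Conversion failed:", error)
        }
    }
}
```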

### Development speed boost :rocket:

You may find transcription performance slow when compiling your app with the `Debug` build configuration. This is because the compiler doesn't fully optimize SwiftWhisper unless the build configuration is set to `Release`.

You can get around this by installing a version of SwiftWhisper that uses `.unsafeFlags(["-O3"])` to force maximum optimization. The easiest way to do this is to use the latest commit on the [`fast`](https://github.com/exPHAT/SwiftWhisper/tree/fast) branch. Alternatively, you can configure your scheme to build in the `Release` configuration.

```swift
...
dependencies: [
    // Using latest commit hash for `fast` branch:
    .package(url: "https://github.com/exPHAT/SwiftWhisper.git", revision: "deb1cb6a27256c7b01f5d3d2e7dc1dcc330b5d01"),
],
...
```