# SwiftMistralCoreML

An open-source Swift library that enables macOS and iOS projects to use the Mistral-7B-Instruct models (INT4, with FP16 coming soon) in chat mode. The library includes a complete Swift implementation of the tokenizer and Byte Pair Encoding (BPE) encoder, providing an out-of-the-box solution for integrating advanced language models into your Swift applications.

## Features

- **Full Swift Implementation**: Includes tokenizer and BPE encoder written entirely in Swift.
- **CoreML Integration**: Leverages Apple's CoreML framework to run Mistral-7B-Instruct models efficiently.
- **Multiple Decoding Strategies**: Supports Greedy and Top-K sampling, with plans to add more strategies.
- **Chat Functionality**: Designed to work in chat mode for interactive applications.
- **FP16 Support (Coming Soon)**: A future version will support FP16 models for improved performance.
- **iOS Compatibility**: Aiming for compatibility with iOS 18 and newer.
- **Extensible Interface**: Future improvements will focus on enhancing flexibility and adding more features.

## Important Note

**The Mistral-7B-Instruct CoreML model is not included in this library.** You must download the model from [Hugging Face](https://huggingface.co/apple/mistral-coreml) and add it to your project manually.

[Demo App](https://github.com/cardona/SwiftMistralCoreML-DemoApp)

![Demo of SwiftMistralCoreML App](demo.gif)

## Installation

Add `SwiftMistralCoreML` to your project using Swift Package Manager:

1. In Xcode, go to **File > Add Package Dependencies…**.
2. Enter the repository URL:
```
https://github.com/cardona/SwiftMistralCoreML.git
```
3. Choose the latest version and follow the prompts to add the package.
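
Alternatively, if your project uses a `Package.swift` manifest, you can declare the dependency there. A minimal sketch, assuming the package's product is named `SwiftMistralCoreML` and using an illustrative version requirement:

```swift
// swift-tools-version:5.9
import PackageDescription

let package = Package(
    name: "MyApp",
    dependencies: [
        // The version shown is illustrative; use the latest tagged release.
        .package(url: "https://github.com/cardona/SwiftMistralCoreML.git", from: "1.0.0")
    ],
    targets: [
        .executableTarget(
            name: "MyApp",
            dependencies: ["SwiftMistralCoreML"]
        )
    ]
)
```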

## Model Setup

1. Download the Mistral-7B-Instruct CoreML model (INT4 or FP16) from [Hugging Face](https://huggingface.co/apple/mistral-coreml).
2. Drag and drop the downloaded `.mlpackage` file into your Xcode project and make sure it is added to your app target.
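
If the model fails to load at runtime, a quick sanity check is to confirm that Xcode compiled it into your app bundle (an added `.mlpackage` becomes an `.mlmodelc` resource). The resource name below is an assumption based on the Hugging Face repository's file naming; adjust it to match the file you downloaded:

```swift
import Foundation

// Xcode compiles an added model package into an .mlmodelc resource.
// "StatefulMistral7BInstructInt4" is an assumed name; adjust it to
// match the model file you actually added.
if let url = Bundle.main.url(forResource: "StatefulMistral7BInstructInt4",
                             withExtension: "mlmodelc") {
    print("Compiled model found at \(url.path)")
} else {
    print("Model not found - check that it was added to the app target")
}
```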

## Usage

Here's how to use `SwiftMistralCoreML` in your project:

### Import the Library

```swift
import SwiftMistralCoreML
```

### Define the Parameters

Create an instance of `MistralParameters` to specify the model's behavior:

```swift
let parameters = MistralParameters(
    userInput: "Hello, how are you?",
    systemPrompt: "You are a helpful assistant.",
    algorithm: .greedy,
    maxTokens: 500,
    topK: nil // Not required for Greedy decoding
)
```

### Generate Text

Use the `TextGenerator` class to generate responses:

```swift
let tokenizerParser = try TokenizerParser()
let bpeEncoder = BPEEncoder(tokenizerParser: tokenizerParser)
let textGenerator = try TextGenerator(bpeEncoder: bpeEncoder, tokenizerParser: tokenizerParser)

let messages = [
    Message(role: .system, content: parameters.systemPrompt),
    Message(role: .user, content: parameters.userInput)
]

let mistralInput = try MistralInput(messages: messages, bpeEncoder: bpeEncoder, tokenizer: tokenizerParser)

let generatedText = try await textGenerator.generateText(
    from: mistralInput.inputTokens,
    using: parameters,
    progressHandler: { generatedWord in
        print(generatedWord)
    }
)

print("Assistant: \(generatedText)")
```
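
If you prefer to consume the streamed words as an `AsyncStream` (for example, to drive a SwiftUI view), you can bridge the `progressHandler` callback yourself. A minimal sketch, reusing the `textGenerator`, `mistralInput`, and `parameters` values from above:

```swift
// Bridge the callback-based progressHandler into an AsyncStream of words.
let wordStream = AsyncStream<String> { continuation in
    Task {
        _ = try? await textGenerator.generateText(
            from: mistralInput.inputTokens,
            using: parameters,
            progressHandler: { generatedWord in
                continuation.yield(generatedWord)
            }
        )
        continuation.finish()
    }
}

for await word in wordStream {
    print(word, terminator: "")
}
```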

## Example

Below is a complete example of how to integrate and use the library:

```swift
import SwiftMistralCoreML

final class MistralChat {
    private var messages: [Message] = []

    func startChat() async {
        do {
            let parameters = MistralParameters(
                userInput: "Tell me a joke.",
                systemPrompt: "You are a witty assistant.",
                algorithm: .greedy,
                maxTokens: 50,
                topK: nil
            )

            let tokenizerParser = try TokenizerParser()
            let bpeEncoder = BPEEncoder(tokenizerParser: tokenizerParser)
            let textGenerator = try TextGenerator(bpeEncoder: bpeEncoder, tokenizerParser: tokenizerParser)

            messages.append(Message(role: .system, content: parameters.systemPrompt))
            messages.append(Message(role: .user, content: parameters.userInput))

            let mistralInput = try MistralInput(messages: messages, bpeEncoder: bpeEncoder, tokenizer: tokenizerParser)

            let generatedText = try await textGenerator.generateText(
                from: mistralInput.inputTokens,
                using: parameters,
                progressHandler: { generatedWord in
                    print(generatedWord, terminator: "")
                }
            )

            messages.append(Message(role: .assistant, content: generatedText))
            print("\nAssistant: \(generatedText)")
        } catch {
            print("Error: \(error.localizedDescription)")
        }
    }
}

let chat = MistralChat()
Task {
    await chat.startChat()
}
```

## Decoding Algorithms

The library supports multiple decoding algorithms:

- **Greedy Decoding**: Selects the token with the highest probability at each step.
- **Top-K Sampling**: Samples the next token from the top K most probable tokens.

Specify the algorithm when creating `MistralParameters`:

```swift
let parameters = MistralParameters(
    userInput: "What's the weather like today?",
    systemPrompt: "",
    algorithm: .topK,
    maxTokens: 500,
    topK: 10 // Required for Top-K decoding
)
```
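
For intuition, here is a minimal, self-contained sketch of what Top-K sampling does with a logits array. It is illustrative only and not the library's internal implementation:

```swift
import Foundation

// Illustrative Top-K sampler: keep the K highest-scoring tokens,
// renormalize their probabilities via softmax, and draw one at random.
func sampleTopK(logits: [Double], k: Int) -> Int {
    // Rank token indices by logit and keep the top K.
    let topK = logits.enumerated()
        .sorted { $0.element > $1.element }
        .prefix(k)

    // Softmax over the kept logits (subtract the max for numerical stability).
    let maxLogit = topK.first!.element
    let weights = topK.map { exp($0.element - maxLogit) }
    let total = weights.reduce(0, +)

    // Draw a token from the renormalized distribution.
    var threshold = Double.random(in: 0..<total)
    for (candidate, weight) in zip(topK, weights) {
        threshold -= weight
        if threshold <= 0 { return candidate.offset }
    }
    return topK.last!.offset
}

// Greedy decoding is the k == 1 special case: always the argmax.
print(sampleTopK(logits: [1.2, 3.4, 0.5, 2.8], k: 2)) // prints 1 or 3
```

Smaller K values behave more like greedy decoding; larger values add diversity at the cost of coherence.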

## Async Prediction

For better performance in scenarios where you need to process multiple requests concurrently, you can use the async prediction interface:

```swift
let textGenerator = try TextGenerator()

// Example questions to answer in parallel.
let questions = [
    "What is Swift?",
    "What is CoreML?"
]

// Process multiple questions concurrently
try await withThrowingTaskGroup(of: (String, String).self) { group in
    for question in questions {
        group.addTask {
            let messages = [
                Message(role: .system, content: "You are a helpful assistant."),
                Message(role: .user, content: question)
            ]

            let parameters = MistralParameters(
                modelType: .int4,
                userInput: question,
                systemPrompt: "You are a helpful assistant.",
                algorithm: .greedy,
                maxTokens: 16,
                topK: nil
            )

            // Use generateTextAsync for concurrent processing
            let result = try await textGenerator.generateTextAsync(
                messages: messages,
                using: parameters,
                progressHandler: nil
            )

            return (question, result)
        }
    }

    // Collect results as they complete
    for try await (question, answer) in group {
        print("Q: \(question)")
        print("A: \(answer)\n")
    }
}
```

The async interface is particularly useful when:
- Processing multiple requests concurrently
- Integrating with async/await workflows
- Handling batch processing scenarios
- Building responsive UIs that need to remain interactive during processing

## Planned Features

- **FP16 Model Support**: Upcoming support for FP16 models, providing better accuracy and performance.
- **More Decoding Strategies**: Planned addition of new decoding strategies such as Beam Search and Temperature Sampling.
- **Enhanced Compatibility**: Ensuring the library is fully compatible with iOS 18 and supports iPhone devices.
- **Interface Improvements**: Ongoing improvements to the API for better ease of use and flexibility.

## Customization

You can extend the library to support additional features or models. The modular design allows you to replace components like the decoding strategy or tokenizer as needed.
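
As one illustration of that pattern, a swappable decoding strategy often reduces to a small protocol. The protocol below is hypothetical (the library's actual extension points may differ) and only sketches the idea:

```swift
// Hypothetical protocol sketching a swappable decoding strategy;
// the library's real extension points may differ.
protocol DecodingStrategy {
    /// Chooses the next token ID from the model's raw scores.
    func nextToken(logits: [Float]) -> Int
}

struct GreedyStrategy: DecodingStrategy {
    func nextToken(logits: [Float]) -> Int {
        // Argmax: the index of the highest-scoring token.
        logits.indices.max { logits[$0] < logits[$1] }!
    }
}
```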

## Contributing

Contributions are welcome! Please open an issue or submit a pull request on the [GitHub repository](https://github.com/cardona/SwiftMistralCoreML).

## License

This project is licensed under the Apache 2.0 License.

---

*Created because there were no existing examples on how to use Mistral CoreML models with Swift.*