![Lumina](media/lumina.png)
[![Chat on Discord](https://img.shields.io/discord/754884471324672040?style=for-the-badge)](https://discord.gg/tPWjMwK)
[![Follow on Bluesky](https://img.shields.io/badge/Bluesky-tinyBigGAMES-blue?style=for-the-badge&logo=bluesky)](https://bsky.app/profile/tinybiggames.com)

# Lumina: Advanced Local Generative AI for Delphi Developers

Lumina offers a cutting-edge toolkit for Delphi developers to seamlessly integrate advanced generative AI capabilities into their applications. Built on the computational backbone of **llama.cpp**, Lumina prioritizes data privacy, performance, and a user-friendly API, making it a powerful tool for local AI inference.

## ๐Ÿง Why Choose Lumina?

- **Localized Processing**: Operates entirely offline, ensuring sensitive data remains confidential while offering complete computational control.
- **Broad Model Compatibility**: Supports **GGUF models** compliant with llama.cpp standards, granting access to diverse AI architectures.
- **Intuitive Development Interface**: A concise, flexible API simplifies model management, inference execution, and callback customization, minimizing implementation complexity.
- **Future-Ready Scalability**: This release emphasizes stability and foundational features, with plans for multi-turn conversation and retrieval-augmented generation (RAG) in future updates.

## ๐Ÿ› ๏ธ Key Functionalities

### ๐Ÿค– Advanced AI Integration

Lumina expands your development toolkit ๐ŸŽ’ with capabilities such as:
- Dynamic chatbot creation ๐Ÿ’ฌ.
- Automated text generation ๐Ÿ“ and summarization ๐Ÿ“ฐ.
- Context-sensitive content generation โœ๏ธ.
- Real-time inference for adaptive processes โšก.

### Privacy-Driven AI Execution

- Operates independently of external networks, guaranteeing data security.
- Uses Vulkan for optional GPU acceleration to enhance performance.

### โš™๏ธ Performance Optimization

- Configurable GPU utilization through the `AGPULayers` parameter ๐Ÿงฉ.
- Dynamic thread allocation based on hardware capabilities ๐Ÿ–ฅ๏ธ via `AMaxThreads`.
- Comprehensive performance metrics ๐Ÿ“Š, offering insights into throughput ๐Ÿ“ˆ and efficiency.

### Streamlined Integration

- Embedded dependencies eliminate the need for external libraries.
- Lightweight architecture (~2.5 MB overhead) ensures broad deployment compatibility.

## Installation

1. **Download the Repository**
   - [Download here](https://github.com/tinyBigGAMES/Lumina/archive/refs/heads/main.zip) and extract the files to your preferred directory.

2. **Acquire a GGUF Model**
   - Obtain a model from [Hugging Face](https://huggingface.co), such as [Gemma 2 2B GGUF (Q8_0)](https://huggingface.co/bartowski/gemma-2-2b-it-abliterated-GGUF/resolve/main/gemma-2-2b-it-abliterated-Q8_0.gguf?download=true). Save it to a directory accessible to your application (e.g., `C:\LLM\GGUF`).

3. **Ensure GPU Compatibility**
   - Verify Vulkan compatibility for enhanced performance. Adjust `AGPULayers` as needed to accommodate VRAM limitations.

4. **Use the TLumina Class**
   - Add `Lumina` to your `uses` clause.
   - Create an instance of `TLumina`.
   - All functionality is then at your disposal. That simple!

5. **Explore Examples**
   - Check the `examples` directory for detailed usage demonstrations.

## ๐Ÿ› ๏ธ Usage

### ๐Ÿ”ง Basic Setup

Integrate Lumina into your Delphi project ๐Ÿ–ฅ๏ธ:

```delphi
var
  Lumina: TLumina;
begin
  Lumina := TLumina.Create;
  try
    // Model path, template ('' = use the model's built-in template),
    // max context, GPU layers (-1 = all), max threads
    if Lumina.LoadModel('C:\LLM\GGUF\gemma-2-2b-it-abliterated-Q8_0.gguf',
      '', 8192, -1, 8) then
    begin
      if Lumina.SimpleInference('What is the capital of Italy?') then
        WriteLn('Inference completed successfully.')
      else
        WriteLn('Error: ', Lumina.GetError);
    end;
  finally
    Lumina.Free;
  end;
end;
```
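The `try..finally` block guarantees the `TLumina` instance is freed even if loading or inference fails, which matters because a loaded model can hold a substantial amount of memory.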

### ๐ŸŽš๏ธ Customizing Callbacks

Define custom behavior using Luminaโ€™s callback functions ๐Ÿ› ๏ธ:

```delphi
procedure NextTokenCallback(const AToken: string; const AUserData: Pointer);
begin
  Write(AToken);
end;

Lumina.SetNextTokenCallback(NextTokenCallback, nil);
```
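The second argument to `SetNextTokenCallback` is handed back to your handler as `AUserData`, so you can carry state without globals. Here is a minimal sketch, assuming only the callback signature shown above; the record type and counter are illustrative, not part of Lumina's API:

```delphi
type
  PTokenStats = ^TTokenStats;
  TTokenStats = record
    Count: Integer;  // number of tokens streamed so far
  end;

procedure CountingTokenCallback(const AToken: string; const AUserData: Pointer);
begin
  Write(AToken);
  Inc(PTokenStats(AUserData)^.Count);  // recover the state registered below
end;

var
  Stats: TTokenStats;
begin
  Stats.Count := 0;
  Lumina.SetNextTokenCallback(CountingTokenCallback, @Stats);
end;
```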

## API Reference

### Core Methods

A short sketch after this list shows the methods used together.

- **LoadModel**
  - Parameters:
    - `AModelFilename`: Path to the GGUF model file.
    - `ATemplate`: Optional inference template.
    - `AMaxContext`: Maximum context size (default: 512).
    - `AGPULayers`: GPU layer configuration (-1 for maximum).
    - `AMaxThreads`: Number of CPU threads allocated.
  - Returns a boolean indicating success.

- **SimpleInference**
  - Accepts a single query for immediate processing.
  - Returns a boolean indicating success.

- **SetNextTokenCallback**
  - Assigns a handler to process tokens during inference.

- **UnloadModel**
  - Frees resources allocated during model loading.

- **GetPerformanceResult**
  - Provides metrics, including token generation rates.
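Putting these together, here is a minimal end-to-end sketch based on the signatures above; the prompt text is illustrative, and the `PerformanceResult` fields match the metrics example later in this README:

```delphi
uses
  Lumina;

var
  LLM: TLumina;
  Perf: TLumina.PerformanceResult;
begin
  LLM := TLumina.Create;
  try
    // '' template = use the model's metadata template; -1 = all GPU layers
    if not LLM.LoadModel('C:\LLM\GGUF\gemma-2-2b-it-abliterated-Q8_0.gguf',
      '', 8192, -1, 8) then
    begin
      WriteLn('Error: ', LLM.GetError);
      Exit;
    end;

    if LLM.SimpleInference('Explain GGUF in one sentence.') then
    begin
      Perf := LLM.GetPerformanceResult;
      WriteLn('Tokens/Sec: ', Perf.TokensPerSecond);
    end
    else
      WriteLn('Error: ', LLM.GetError);

    LLM.UnloadModel;  // release the model explicitly before freeing the instance
  finally
    LLM.Free;
  end;
end;
```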

## ๐Ÿ› ๏ธ Advanced Configurations

### ๐Ÿง  Custom Inference Templates

Lumina will use the template defined in the model's meta data by default, but you can also define custom templates to match your modelโ€™s requirements or change its behavor. These are some common model templates โœ๏ธ:

```delphi
const
  CHATML_TEMPLATE = '<|im_start|>{role} {content}<|im_end|><|im_start|>assistant';
  GEMMA_TEMPLATE  = '{role} {content}';
  PHI_TEMPLATE    = '<|{role}|> {content}<|end|><|assistant|>';
```

- **{role}** - will be replaced with the role (user, assistant, etc.)
- **{content}** - will be replaced with the content sent to the model
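To apply a custom template, pass it as the `ATemplate` argument to `LoadModel`. A minimal sketch, assuming the `LoadModel` signature from the API reference (the model filename is illustrative):

```delphi
// Force the ChatML template instead of the one in the model's metadata.
if not Lumina.LoadModel('C:\LLM\GGUF\my-chatml-model.gguf',
  CHATML_TEMPLATE, 8192, -1, 8) then
  WriteLn('Error: ', Lumina.GetError);
```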

### GPU Optimization

- `AGPULayers` values:
  - `-1`: Utilize all available layers (default).
  - `0`: CPU-only processing.
  - Custom values for partial GPU utilization (see the sketch below).
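For example, on a GPU with limited VRAM you might offload only some layers and leave the rest on the CPU. A minimal sketch; the layer count of 16 is illustrative and should be tuned to your model and VRAM budget:

```delphi
// Offload 16 layers to the GPU via Vulkan; remaining layers run on the CPU.
if not Lumina.LoadModel('C:\LLM\GGUF\gemma-2-2b-it-abliterated-Q8_0.gguf',
  '', 8192, 16, 8) then
  WriteLn('Error: ', Lumina.GetError);
```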

### Performance Metrics

Retrieve detailed operational metrics:

```delphi
var
  Perf: TLumina.PerformanceResult;
begin
  Perf := Lumina.GetPerformanceResult;
  WriteLn('Tokens/Sec: ', Perf.TokensPerSecond);
  WriteLn('Input Tokens: ', Perf.TotalInputTokens);
  WriteLn('Output Tokens: ', Perf.TotalOutputTokens);
end;
```

## ๐ŸŽ™๏ธ Media

### ๐ŸŒŠ Deep Dive Podcast
Discover in-depth discussions and insights about **Lumina** and its innovative features. ๐Ÿš€โœจ

https://github.com/user-attachments/assets/165e3dee-b29f-4478-b9ef-4fb6d2df2485

### ๐Ÿ› ๏ธ Support and Resources

- Report issues via the [Issue Tracker](https://github.com/tinyBigGAMES/Lumina/issues) ๐Ÿž.
- Engage in discussions on the [Forum](https://github.com/tinyBigGAMES/Lumina/discussions) and [Discord](https://discord.gg/tPWjMwK) ๐Ÿ’ฌ.
- Learn more at [Learn Delphi](https://learndelphi.org) ๐Ÿ“š.

### ๐Ÿค Contributing

Contributions to **โœจ Lumina** are highly encouraged! ๐ŸŒŸ
- ๐Ÿ› **Report Issues:** Submit issues if you encounter bugs or need help.
- ๐Ÿ’ก **Suggest Features:** Share your ideas to make **Lumina** even better.
- ๐Ÿ”ง **Create Pull Requests:** Help expand the capabilities and robustness of the library.

Your contributions make a difference! ๐Ÿ™Œโœจ

#### Contributors ๐Ÿ‘ฅ๐Ÿค



### Licensing

**Lumina** is distributed under the **BSD-3-Clause License**, allowing for redistribution and use in both source and binary forms, with or without modification, under specific conditions. See the [LICENSE](https://github.com/tinyBigGAMES/Lumina?tab=BSD-3-Clause-1-ov-file#BSD-3-Clause-1-ov-file) file for more details.

---

Advance your Delphi applications with Lumina, a sophisticated solution for integrating local generative AI.


Made with :heart: in Delphi