https://github.com/llmrb/llm
Ruby adapter for various LLM providers
- Host: GitHub
- URL: https://github.com/llmrb/llm
- Owner: llmrb
- License: other
- Created: 2024-10-03T14:12:21.000Z (over 1 year ago)
- Default Branch: main
- Last Pushed: 2025-05-09T01:08:38.000Z (10 months ago)
- Last Synced: 2025-05-09T02:25:44.953Z (10 months ago)
- Topics: ai, llm, llm-agents, llm-framework, llm-frameworks, llms, ruby-lib, ruby-library
- Language: Ruby
- Size: 36.9 MB
- Stars: 39
- Watchers: 3
- Forks: 2
- Open Issues: 7
Metadata Files:
- Readme: README.md
- License: LICENSE
## About
llm.rb is a zero-dependency Ruby toolkit for Large Language Models that
supports OpenAI, Gemini, Anthropic, Ollama, and LlamaCpp. It's fast, simple
and composable – with full support for chat, tool calling, audio,
images, files, and JSON Schema generation.
## Features
#### General
- ✅ A single unified interface for multiple providers
- 📦 Zero dependencies outside Ruby's standard library
- 🚀 Optimized for performance and low memory usage
- 🔌 Retrieve models dynamically for introspection and selection
#### Chat, Agents
- 🧠 Stateless and stateful chat via completions and responses API
- 🤖 Tool calling and function execution
- 🗂️ JSON Schema support for structured, validated responses
#### Media
- 🗣️ Text-to-speech, transcription, and translation
- 🖼️ Image generation, editing, and variation support
- 📎 File uploads and prompt-aware file interaction
- 💡 Multimodal prompts (text, images, PDFs, URLs, files)
#### Embeddings
- 🧮 Text embeddings and vector support
## Demos
1. Tools: "system" function

2. Files: import at boot time

3. Files: import at runtime

## Examples
### Providers
#### LLM::Provider
All providers inherit from [LLM::Provider](https://0x1eef.github.io/x/llm.rb/LLM/Provider.html) –
they share a common interface and set of functionality. Each provider can be instantiated
using an API key (if required) and an optional set of configuration options via
[the singleton methods of LLM](https://0x1eef.github.io/x/llm.rb/LLM.html). For example:
```ruby
#!/usr/bin/env ruby
require "llm"
llm = LLM.openai(key: "yourapikey")
llm = LLM.gemini(key: "yourapikey")
llm = LLM.anthropic(key: "yourapikey")
llm = LLM.ollama(key: nil)
llm = LLM.llamacpp(key: nil)
llm = LLM.voyageai(key: "yourapikey")
```
### Conversations
#### Completions
> This example uses the stateless chat completions API that all
> providers support. A similar example for OpenAI's stateful
> responses API is available in the [docs/](docs/OPENAI.md)
> directory.
The following example enables lazy mode for an
[LLM::Chat](https://0x1eef.github.io/x/llm.rb/LLM/Chat.html)
object by entering into a conversation where messages are buffered and
sent to the provider only when necessary. Both lazy and non-lazy conversations
maintain a message thread that can be reused as context throughout a conversation.
The example captures the spirit of llm.rb by demonstrating how objects cooperate
through composition, and it uses the stateless chat completions API that
all LLM providers support:
```ruby
#!/usr/bin/env ruby
require "llm"
llm = LLM.openai(key: ENV["KEY"])
bot = LLM::Chat.new(llm).lazy
msgs = bot.chat do |prompt|
  prompt.system File.read("./share/llm/prompts/system.txt")
  prompt.user "Tell me the answer to 5 + 15"
  prompt.user "Tell me the answer to (5 + 15) * 2"
  prompt.user "Tell me the answer to ((5 + 15) * 2) / 10"
end
# At this point, we execute a single request
msgs.each { print "[#{_1.role}] ", _1.content, "\n" }
##
# [system] You are my math assistant.
# I will provide you with (simple) equations.
# You will provide answers in the format "The answer to <expression> is <answer>".
# I will provide you a set of messages. Reply to all of them.
# A message is considered unanswered if there is no corresponding assistant response.
#
# [user] Tell me the answer to 5 + 15
# [user] Tell me the answer to (5 + 15) * 2
# [user] Tell me the answer to ((5 + 15) * 2) / 10
#
# [assistant] The answer to 5 + 15 is 20.
# The answer to (5 + 15) * 2 is 40.
# The answer to ((5 + 15) * 2) / 10 is 4.
```
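For comparison, a non-lazy conversation sends each message to the provider
as soon as it is given. A minimal sketch, assuming the same constructor
without the `.lazy` call:
```ruby
#!/usr/bin/env ruby
require "llm"
llm = LLM.openai(key: ENV["KEY"])
bot = LLM::Chat.new(llm) # non-lazy: no buffering, each message is sent immediately
bot.chat "Tell me the answer to 5 + 15", role: :user       # one request
bot.chat "Tell me the answer to (5 + 15) * 2", role: :user # another request
bot.messages.select(&:assistant?).each { print "[#{_1.role}] ", _1.content, "\n" }
```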
### Schema
#### Structured
All LLM providers except Anthropic allow a client to describe the structure
of a response that an LLM emits according to a schema described with JSON.
The schema lets a client specify what JSON object (or value) an LLM should emit,
and the LLM will abide by the schema. See also: [JSON Schema website](https://json-schema.org/overview/what-is-jsonschema).
We will use the
[llmrb/json-schema](https://github.com/llmrb/json-schema)
library for the sake of the examples – the interface is designed so you
could drop in any other library in its place:
```ruby
#!/usr/bin/env ruby
require "llm"
llm = LLM.openai(key: ENV["KEY"])

##
# Constrain the response to an enum of strings
schema = llm.schema.object({fruit: llm.schema.string.enum("Apple", "Orange", "Pineapple")})
bot = LLM::Chat.new(llm, schema:).lazy
bot.chat "Your favorite fruit is Pineapple", role: :system
bot.chat "What fruit is your favorite?", role: :user
bot.messages.find(&:assistant?).content! # => {fruit: "Pineapple"}

##
# Require an integer answer
schema = llm.schema.object({answer: llm.schema.integer.required})
bot = LLM::Chat.new(llm, schema:).lazy
bot.chat "Tell me the answer to ((5 + 5) / 2)", role: :user
bot.messages.find(&:assistant?).content! # => {answer: 5}

##
# Require a number (a probability) in the response
schema = llm.schema.object({probability: llm.schema.number.required})
bot = LLM::Chat.new(llm, schema:).lazy
bot.chat "Does the earth orbit the sun?", role: :user
bot.messages.find(&:assistant?).content! # => {probability: 1}
```
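Since `content!` returns a plain Ruby hash (as the inline comments above
suggest), the structured response can be consumed like any other hash.
A brief sketch that reuses the first schema from the example above:
```ruby
res = bot.messages.find(&:assistant?).content!
if res[:fruit] == "Pineapple"
  puts "The LLM picked Pineapple, as instructed."
else
  puts "Unexpected fruit: #{res[:fruit]}"
end
```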
### Tools
#### Functions
The OpenAI, Anthropic, Gemini and Ollama providers support a feature known as
tool calling. Although it is a little complex to understand at first,
it can be powerful for building agents. The following example demonstrates how we
can define a local function (which happens to be a tool), and how OpenAI can
then detect when we should call the function.
The
[LLM::Chat#functions](https://0x1eef.github.io/x/llm.rb/LLM/Chat.html#functions-instance_method)
method returns an array of functions that can be called after sending a message,
and it will only be populated if the LLM detects that a function should be called. Each function
corresponds to an element in the "tools" array. The array is emptied after a function call,
and potentially repopulated on the next message – which allows pending calls to be drained
in a loop, as sketched below.
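A rough sketch of that drain-the-queue pattern, assuming a `bot` constructed
with tools as in the example below:
```ruby
bot.chat "Summarize the contents of the current directory", role: :user
# Call each pending function and report the return values to the LLM;
# the reply may request further calls, which repopulates the array.
while bot.functions.any?
  bot.chat bot.functions.map(&:call)
end
```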
The following example defines an agent that can run system commands based on natural language,
and it is only intended to be a fun demo of tool calling – it is not recommended to run
arbitrary commands from an LLM without sanitizing the input first :) Without further ado:
```ruby
#!/usr/bin/env ruby
require "llm"
llm = LLM.openai(key: ENV["KEY"])
tool = LLM.function(:system) do |fn|
  fn.description "Run a shell command"
  fn.params do |schema|
    schema.object(command: schema.string.required)
  end
  fn.define do |params|
    ro, wo = IO.pipe
    re, we = IO.pipe
    Process.wait Process.spawn(params.command, out: wo, err: we)
    [wo, we].each(&:close)
    {stderr: re.read, stdout: ro.read}
  end
end

bot = LLM::Chat.new(llm, tools: [tool]).lazy
bot.chat "Your task is to run shell commands via a tool.", role: :system
bot.chat "What is the current date?", role: :user
bot.chat bot.functions.map(&:call) # report return value to the LLM
bot.chat "What operating system am I running? (short version please!)", role: :user
bot.chat bot.functions.map(&:call) # report return value to the LLM
##
# {stderr: "", stdout: "Thu May 1 10:01:02 UTC 2025"}
# {stderr: "", stdout: "FreeBSD"}
```
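In the spirit of that warning, one way to sanitize the input is to check the
requested command against an allowlist before running it. A minimal sketch;
the `ALLOWED` constant is hypothetical and not part of llm.rb:
```ruby
# Hypothetical allowlist: only commands that appear verbatim here will run.
ALLOWED = %w[date uname uptime].freeze

tool = LLM.function(:system) do |fn|
  fn.description "Run an allowlisted shell command"
  fn.params do |schema|
    schema.object(command: schema.string.required)
  end
  fn.define do |params|
    # Reject anything not on the allowlist before spawning a process
    next {stderr: "command not allowed", stdout: ""} unless ALLOWED.include?(params.command)
    ro, wo = IO.pipe
    re, we = IO.pipe
    Process.wait Process.spawn(params.command, out: wo, err: we)
    [wo, we].each(&:close)
    {stderr: re.read, stdout: ro.read}
  end
end
```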
### Audio
#### Speech
Some but not all providers implement audio generation capabilities that
can create speech from text, transcribe audio to text, or translate
audio to text (usually English). The following example uses the OpenAI provider
to create an audio file from a text prompt. The audio is then moved to
`${HOME}/hello.mp3` as the final step:
```ruby
#!/usr/bin/env ruby
require "llm"
llm = LLM.openai(key: ENV["KEY"])
res = llm.audio.create_speech(input: "Hello world")
IO.copy_stream res.audio, File.join(Dir.home, "hello.mp3")
```
#### Transcribe
The following example transcribes an audio file to text. The audio file
(`${HOME}/hello.mp3`) was theoretically created in the previous example,
and the result is printed to the console. The example uses the OpenAI
provider to transcribe the audio file:
```ruby
#!/usr/bin/env ruby
require "llm"
llm = LLM.openai(key: ENV["KEY"])
res = llm.audio.create_transcription(
  file: File.join(Dir.home, "hello.mp3")
)
print res.text, "\n" # => "Hello world."
```
#### Translate
The following example translates an audio file to text. In this example
the audio file (`${HOME}/bomdia.mp3`) is theoretically in Portuguese,
and it is translated to English. The example uses the OpenAI provider,
and at the time of writing, it can only translate to English:
```ruby
#!/usr/bin/env ruby
require "llm"
llm = LLM.openai(key: ENV["KEY"])
res = llm.audio.create_translation(
  file: File.join(Dir.home, "bomdia.mp3")
)
print res.text, "\n" # => "Good morning."
```
### Images
#### Create
Some but not all LLM providers implement image generation capabilities that
can create new images from a prompt, or edit an existing image with a
prompt. The following example uses the OpenAI provider to create an
image of a dog on a rocket to the moon. The image is then moved to
`${HOME}/dogonrocket.png` as the final step:
```ruby
#!/usr/bin/env ruby
require "llm"
require "open-uri"
require "fileutils"
llm = LLM.openai(key: ENV["KEY"])
res = llm.images.create(prompt: "a dog on a rocket to the moon")
res.urls.each do |url|
  FileUtils.mv OpenURI.open_uri(url).path,
               File.join(Dir.home, "dogonrocket.png")
end
```
#### Edit
The following example is focused on editing a local image with the aid
of a prompt. The image (`/images/cat.png`) is returned to us with the cat
now wearing a hat. The image is then moved to `${HOME}/catwithhat.png` as
the final step:
```ruby
#!/usr/bin/env ruby
require "llm"
require "open-uri"
require "fileutils"
llm = LLM.openai(key: ENV["KEY"])
res = llm.images.edit(
  image: "/images/cat.png",
  prompt: "a cat with a hat"
)
res.urls.each do |url|
  FileUtils.mv OpenURI.open_uri(url).path,
               File.join(Dir.home, "catwithhat.png")
end
```
#### Variations
The following example is focused on creating variations of a local image.
The image (`/images/cat.png`) is returned to us with five different variations.
The images are then moved to `${HOME}/catvariation0.png`, `${HOME}/catvariation1.png`
and so on as the final step:
```ruby
#!/usr/bin/env ruby
require "llm"
require "open-uri"
require "fileutils"
llm = LLM.openai(key: ENV["KEY"])
res = llm.images.create_variation(
  image: "/images/cat.png",
  n: 5
)
res.urls.each.with_index do |url, index|
  FileUtils.mv OpenURI.open_uri(url).path,
               File.join(Dir.home, "catvariation#{index}.png")
end
```
### Files
#### Create
Most LLM providers offer a Files API where you can upload files
that can be referenced from a prompt, and llm.rb has first-class support
for this feature. The following example uses the OpenAI provider to describe
the contents of a PDF file after it has been uploaded. The file (an instance
of [LLM::Response::File](https://0x1eef.github.io/x/llm.rb/LLM/Response/File.html))
is passed directly to the chat method, and generally any object a prompt supports
can be given to the chat method:
```ruby
#!/usr/bin/env ruby
require "llm"
llm = LLM.openai(key: ENV["KEY"])
bot = LLM::Chat.new(llm).lazy
file = llm.files.create(file: "/documents/openbsd_is_awesome.pdf")
bot.chat(file)
bot.chat("What is this file about?")
bot.messages.select(&:assistant?).each { print "[#{_1.role}] ", _1.content, "\n" }
##
# [assistant] This file is about OpenBSD, a free and open-source Unix-like operating system
# based on the Berkeley Software Distribution (BSD). It is known for its
# emphasis on security, code correctness, and code simplicity. The file
# contains information about the features, installation, and usage of OpenBSD.
```
### Prompts
#### Multimodal
Generally all providers accept text prompts, but some providers can
also understand URLs and various file types (e.g. images, audio, video).
The llm.rb approach to multimodal prompts is to let you pass `URI`
objects to describe links, `LLM::File` or `LLM::Response::File` objects
to describe files, `String` objects to describe text blobs, or an array
of the aforementioned objects to describe multiple objects in a single
prompt. Each object is a first-class citizen that can be passed directly
to a prompt:
```ruby
#!/usr/bin/env ruby
require "llm"
llm = LLM.openai(key: ENV["KEY"])
bot = LLM::Chat.new(llm).lazy
bot.chat [URI("https://example.com/path/to/image.png"), "Describe the image in the link"]
bot.messages.select(&:assistant?).each { print "[#{_1.role}] ", _1.content, "\n" }
file = llm.files.create(file: "/documents/openbsd_is_awesome.pdf")
bot.chat [file, "What is this file about?"]
bot.messages.select(&:assistant?).each { print "[#{_1.role}] ", _1.content, "\n" }
bot.chat [LLM.File("/images/puffy.png"), "What is this image about?"]
bot.messages.select(&:assistant?).each { print "[#{_1.role}] ", _1.content, "\n" }
bot.chat [LLM.File("/images/beastie.png"), "What is this image about?"]
bot.messages.select(&:assistant?).each { print "[#{_1.role}] ", _1.content, "\n" }
```
### Embeddings
#### Text
The
[`LLM::Provider#embed`](https://0x1eef.github.io/x/llm.rb/LLM/Provider.html#embed-instance_method)
method generates a vector representation of one or more chunks
of text. Embeddings capture the semantic meaning of text –
a common use-case for them is to store chunks of text in a
vector database, and then to query the database for *semantically
similar* text. These chunks of similar text can then support the
generation of a prompt that is used to query a large language model,
which will go on to generate a response:
```ruby
#!/usr/bin/env ruby
require "llm"
llm = LLM.openai(key: ENV["KEY"])
res = llm.embed(["programming is fun", "ruby is a programming language", "sushi is art"])
print res.class, "\n"
print res.embeddings.size, "\n"
print res.embeddings[0].size, "\n"
##
# LLM::Response::Embedding
# 3
# 1536
```
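To make the *semantically similar* use-case concrete, cosine similarity
between two embedding vectors can be computed in plain Ruby. A sketch built
on the `res` object from the example above:
```ruby
# Cosine similarity: dot(a, b) / (|a| * |b|)
def cosine_similarity(a, b)
  dot = a.zip(b).sum { |x, y| x * y }
  dot / (Math.sqrt(a.sum { _1 ** 2 }) * Math.sqrt(b.sum { _1 ** 2 }))
end

v1, v2, v3 = res.embeddings
puts cosine_similarity(v1, v2) # "programming is fun" vs "ruby is a programming language"
puts cosine_similarity(v1, v3) # "programming is fun" vs "sushi is art" (expected to be lower)
```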
### Models
#### List
Almost all LLM providers offer a models endpoint that allows a client to
query the list of models that are available to use. The list is dynamic,
maintained by the providers, and independent of a specific llm.rb release.
[LLM::Model](https://0x1eef.github.io/x/llm.rb/LLM/Model.html)
objects can be used instead of a string that describes a model name (although
either works). Let's take a look at an example:
```ruby
#!/usr/bin/env ruby
require "llm"
##
# List all models
llm = LLM.openai(key: ENV["KEY"])
llm.models.all.each do |model|
  print "model: ", model.id, "\n"
end
##
# Select a model
model = llm.models.all.find { |m| m.id == "gpt-3.5-turbo" }
bot = LLM::Chat.new(llm, model:)
bot.chat "Hello #{model.id} :)"
bot.messages.select(&:assistant?).each { print "[#{_1.role}] ", _1.content, "\n" }
```
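And since either works, a plain string can be passed in place of an
LLM::Model object. A one-line sketch using the same model name:
```ruby
bot = LLM::Chat.new(llm, model: "gpt-3.5-turbo")
```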
## Documentation
### API
The README tries to provide a high-level overview of the library. For everything
else there's the API reference. It covers classes and methods that the README glances
over or doesn't cover at all. The API reference is available at
[0x1eef.github.io/x/llm.rb](https://0x1eef.github.io/x/llm.rb).
### Guides
The [docs/](docs/) directory contains some additional documentation that
didn't quite make it into the README. It covers the design guidelines that
the library follows, some strategies for memory management, and other
provider-specific features.
## See also
**[llmrb/llm-shell](https://github.com/llmrb/llm-shell)**
An extensible, developer-oriented command-line utility that is powered by
llm.rb and serves as a demonstration of the library's capabilities. The
[demo](https://github.com/llmrb/llm-shell#demos) section has a number of GIF
previews that might be especially interesting.
## Install
llm.rb can be installed via rubygems.org:
```
gem install llm.rb
```
## License
[BSD Zero Clause](https://choosealicense.com/licenses/0bsd/)
See [LICENSE](./LICENSE)