https://github.com/noizu-labs-ml/ex_llama
Elixir NIFs for interacting with llama_cpp.rust managed GGUF models.
- Host: GitHub
- URL: https://github.com/noizu-labs-ml/ex_llama
- Owner: noizu-labs-ml
- License: MIT
- Created: 2024-04-11T18:27:35.000Z (about 1 year ago)
- Default Branch: main
- Last Pushed: 2024-04-12T09:30:22.000Z (about 1 year ago)
- Last Synced: 2024-08-04T01:09:52.289Z (9 months ago)
- Language: Elixir
- Size: 45.9 KB
- Stars: 7
- Watchers: 2
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
Awesome Lists containing this project
- awesome-elixir - ExLLama - LlamaCpp Nif Extensions for Elixir/Erlang. ([Docs](https://hexdocs.pm/ex_llama/ExLLama.html)). (Artificial Intelligence)
README
ExLLama: LlamaCpp.rs NIF wrapper for Elixir/Erlang
===================================================

This is an alpha library for loading and interacting with models via the llama_cpp Rust client, exposed as NIF extensions.
Inspired by [llama_cpp_ex](https://github.com/jeregrine/llama_cpp_ex).

## Getting Started
1. Add the `ex_llama` dependency to your `mix.exs` file:

```elixir
def deps do
  [
    {:ex_llama, "~> 0.0.1"}
  ]
end
```
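
2. Fetch the dependency with `mix deps.get`.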

## Chat Completion

As of this build, only the `<|role|>message` chat-completion format is supported, such as that used by TinyLlama.

```elixir
{:ok, llama} = ExLLama.load_model("./test/models/tinyllama-1.1b-chat-v1.0.Q4_K_M.gguf")
thread = [
%{role: :user, content: "Say Hello. And only hello. Example \"Hello\"."},
%{role: :assistant, content: "Hello"},
%{role: :user, content: "Repeat what you just said."},
%{role: :assistant, content: "Hello"},
%{role: :user, content: "Say Goodbye."},
%{role: :assistant, content: "Goodbye"},
%{role: :user, content: "Say Apple."},
%{role: :assistant, content: "Apple"},
%{role: :user, content: "What did you just say?."},
]{:ok, response} = ExLLama.chat_completion(llama, thread, %{seed: 2})
# response = %{
#   choices: [
#     %{reason: :end, role: "assistant", content: "Apple"},
#     %{reason: :end, role: "assistant", content: "Apple"},
#     %{reason: :end, role: "assistant", content: "Apple"}
#   ]
# }
```
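
The `<|role|>message` layout corresponds to the raw prompt strings shown in the sections below. A minimal sketch of that mapping, for illustration only (`PromptSketch` is a hypothetical helper, not the library's internal serializer):

```elixir
defmodule PromptSketch do
  # Illustration only: flatten a chat thread into the <|role|>message
  # layout used by TinyLlama-style chat models, ending with an open
  # <|assistant|> turn for the model to complete.
  def to_prompt(thread) do
    turns =
      Enum.map(thread, fn %{role: role, content: content} ->
        "<|#{role}|>\n#{content}\n"
      end)

    Enum.join(turns) <> "<|assistant|>\n"
  end
end
```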
## Simple Completion (direct)
```elixir
{:ok, llama} = ExLLama.load_model("./test/models/tinyllama-1.1b-chat-v1.0.Q4_K_M.gguf")
{:ok, options} = ExLLama.Session.default_options()
{:ok, session} = ExLLama.create_session(llama, %{options | seed: 2})
ExLLama.advance_context(session, "<|user|>\n Say Hello. And only hello. Example \"Hello\".\n<|assistant|>\n Hello\n<|user|>\n Repeat what you just said.\n<|assistant|>\n Hello\n<|user|>\n Say Goodbye.\n<|assistant|>\n")
{:ok, response} = ExLLama.completion(session, 512, "\n*")
response = String.trim_leading(response)
# "Goodbye."
```

## Streaming Completion (the final mechanism will be replaced with a `Stream`)
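
`ExLLama.Session.start_completing_with/2` sends generated text back to the calling process as messages; the helper below collects them until a terminal message arrives.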
```elixir
def receive_text(acc \\ []) do
  receive do
    # Chunks arrive as plain messages and are accumulated; a terminal
    # {:ok, _} or {:error, _} ends the run and is included in the result,
    # while :fin ends the run without a payload.
    x = {:ok, _} -> Enum.reverse([x | acc])
    x = {:error, _} -> Enum.reverse([x | acc])
    :fin -> Enum.reverse(acc)
    x -> receive_text([x | acc])
  end
end

# ...
{:ok, llama} = ExLLama.load_model("./test/models/tinyllama-1.1b-chat-v1.0.Q4_K_M.gguf")
{:ok, options} = ExLLama.Session.default_options()
{:ok, session} = ExLLama.create_session(llama, %{options | seed: 2})
ExLLama.advance_context(session, "<|user|>\n Say Hello. And only hello. Example \"Hello\".\n<|assistant|>\n Hello\n<|user|>\n Repeat what you just said.\n<|assistant|>\n Hello\n<|user|>\n Say Goodbye.\n<|assistant|>\n")
ExLLama.Session.start_completing_with(session, %{max_tokens: 512})
receive_text()
```
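
A possible `Stream`-based wrapper over the current message protocol, sketched under the assumption that chunks arrive as plain messages and that `{:ok, _}`, `{:error, _}`, or `:fin` terminates the run (as in `receive_text/1` above); `StreamSketch` is hypothetical, not part of the library:

```elixir
defmodule StreamSketch do
  # Sketch only: expose the message-based completion protocol as a lazy
  # Elixir Stream. The terminal {:ok, _} payload is dropped here for
  # simplicity; receive_text/1 above keeps it.
  def stream_completion(session, opts \\ %{max_tokens: 512}) do
    Stream.resource(
      fn -> ExLLama.Session.start_completing_with(session, opts) end,
      fn state ->
        receive do
          {:ok, _} -> {:halt, state}
          {:error, _} -> {:halt, state}
          :fin -> {:halt, state}
          chunk -> {[chunk], state}
        end
      end,
      fn _state -> :ok end
    )
  end
end

# Usage sketch:
# StreamSketch.stream_completion(session) |> Enum.join()
```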