# cria

Cria, use Python to run LLMs with as little friction as possible.

Cria is a library for programmatically running Large Language Models through Python. Cria is built so you need as little configuration as possible — even with more advanced features.

- **Easy**: No configuration is required out of the box. Getting started takes just five lines of code.
- **Concise**: Write less code to save time and avoid duplication.
- **Local**: Free and unobstructed by rate limits; running LLMs requires no internet connection.
- **Efficient**: Use advanced features with your own `ollama` instance, or a subprocess.

## Guide

- [Quick Start](#quickstart)
- [Installation](#installation)
  - [Windows](#windows)
  - [Mac](#mac)
  - [Linux](#linux)
- [Advanced Usage](#advanced-usage)
  - [Custom Models](#custom-models)
  - [Streams](#streams)
  - [Closing](#closing)
  - [Message History](#message-history)
    - [Follow-Up](#follow-up)
    - [Clear Message History](#clear-message-history)
    - [Passing In Custom Context](#passing-in-custom-context)
  - [Interrupting](#interrupting)
    - [With Message History](#with-message-history)
    - [Without Message History](#without-message-history)
  - [Multiple Models and Parallel Conversations](#multiple-models-and-parallel-conversations)
    - [Models](#models)
    - [With](#with-model)
    - [Standalone](#standalone-model)
  - [Generate](#generate)
  - [Running Standalone](#running-standalone)
  - [Formatting](#formatting)
- [Contributing](#contributing)
- [License](#license)

## Quickstart

Running Cria is easy. After installation, you need just five lines of code — no configurations, no manual downloads, no API keys, and no servers to worry about.

```python
import cria

ai = cria.Cria()

prompt = "Who is the CEO of OpenAI?"
for chunk in ai.chat(prompt):
    print(chunk, end="")
```

```
>>> The CEO of OpenAI is Sam Altman!
```

Or, you can run this more configurable example:

```python
import cria

with cria.Model() as ai:
    prompt = "Who is the CEO of OpenAI?"
    response = ai.chat(prompt, stream=False)
    print(response)
```

```
>>> The CEO of OpenAI is Sam Altman!
```

> [!WARNING]
> If no model is configured, Cria automatically installs and runs the default model: `llama3.1:8b` (4.7GB).

## Installation

1. Cria uses [`ollama`](https://ollama.com/). To install it, follow the instructions for your platform.

### Windows

[Download](https://ollama.com/download/windows)

### Mac

[Download](https://ollama.com/download/mac)

### Linux

```bash
curl -fsSL https://ollama.com/install.sh | sh
```

2. Install Cria with `pip`.

```bash
pip install cria
```

## Advanced Usage

### Custom Models

To run other LLMs, pass their names in when you create your `ai` instance.

```python
import cria

ai = cria.Cria("llama2")

prompt = "Who is the CEO of OpenAI?"
for chunk in ai.chat(prompt):
    print(chunk, end="")  # The CEO of OpenAI is Sam Altman. He co-founded OpenAI in 2015 with...
```

You can find available models [here](https://ollama.com/library).

### Streams

Streams are used by default in Cria, but you can turn them off by passing `stream=False`.

```python
prompt = "Who is the CEO of OpenAI?"
response = ai.chat(prompt, stream=False)
print(response) # The CEO of OpenAI is Sam Altman!
```

### Closing

By default, models are closed when you exit the Python program, but closing them manually is a best practice.

```python
ai.close()
```

You can also use [`with`](#with-model) statements to close models automatically (recommended).

### Message History

#### Follow-Up

Message history is automatically saved in Cria, so asking follow-up questions is easy.

```python
prompt = "Who is the CEO of OpenAI?"
response = ai.chat(prompt, stream=False)
print(response) # The CEO of OpenAI is Sam Altman.

prompt = "Tell me more about him."
response = ai.chat(prompt, stream=False)
print(response) # Sam Altman is an American entrepreneur and technologist who serves as the CEO of OpenAI...
```

#### Clear Message History

You can reset message history by running the `clear` method.

```python
prompt = "Who is the CEO of OpenAI?"
response = ai.chat(prompt, stream=False)
print(response) # Sam Altman is an American entrepreneur and technologist who serves as the CEO of OpenAI...

ai.clear()

prompt = "Tell me more about him."
response = ai.chat(prompt, stream=False)
print(response) # I apologize, but I don't have any information about "him" because the conversation just started...
```

#### Passing In Custom Context

You can also create a custom message history, and pass in your own context.

```python
context = "Our AI system employed a hybrid approach combining reinforcement learning and generative adversarial networks (GANs) to optimize the decision-making..."
messages = [
    {"role": "system", "content": "You are a technical documentation writer"},
    {"role": "user", "content": context},
]

prompt = "Write some documentation using the text I gave you."
for chunk in ai.chat(messages=messages, prompt=prompt):
    print(chunk, end="")  # AI System Optimization: Hybrid Approach Combining Reinforcement Learning and...
```

In the example, instructions are given to the LLM under the `system` role. Then, extra context is supplied under the `user` role. Finally, the prompt is entered (also as a `user`). You can use any mixture of roles to steer the LLM to your liking.

The available roles for messages are:

- `user` - Pass prompts as the user.
- `system` - Give instructions as the system.
- `assistant` - Act as the AI assistant yourself, and give the LLM lines.

The `prompt` parameter is always appended to `messages` under the `user` role. To override this, you can choose to pass in nothing for `prompt`.
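
For instance, a minimal sketch (reusing the `ai` instance from the earlier examples) that supplies the whole conversation itself, including an `assistant` turn, and passes nothing for `prompt`:

```python
messages = [
    {"role": "system", "content": "You are a concise assistant."},
    {"role": "user", "content": "Who is the CEO of OpenAI?"},
    {"role": "assistant", "content": "The CEO of OpenAI is Sam Altman."},
    {"role": "user", "content": "When did he become CEO?"},
]

# With no prompt passed, nothing extra is appended under the user role.
response = ai.chat(messages=messages, stream=False)
print(response)
```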

### Interrupting

#### With Message History

If you are streaming messages with Cria, you can interrupt the response midway through generation.

```python
response = ""
max_token_length = 5

prompt = "Who is the CEO of OpenAI?"
for i, chunk in enumerate(ai.chat(prompt)):
    if i >= max_token_length:
        ai.stop()
    response += chunk

print(response) # The CEO of OpenAI is
```

```python
response = ""
max_token_length = 5

prompt = "Who is the CEO of OpenAI?"
for i, chunk in enumerate(ai.generate(prompt)):
    if i >= max_token_length:
        ai.stop()
    response += chunk

print(response) # The CEO of OpenAI is
```

In the examples, after the AI generates five tokens (units of text that are usually a couple of characters long), text generation is stopped via the `stop` method. After `stop` is called, you can safely `break` out of the `for` loop.
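
The examples above call `stop` but keep consuming the stream; a minimal sketch that breaks out immediately instead (by default, the partial response is still saved to message history):

```python
response = ""
max_token_length = 5

prompt = "Who is the CEO of OpenAI?"
for i, chunk in enumerate(ai.chat(prompt)):
    if i >= max_token_length:
        ai.stop()
        break
    response += chunk

print(response)  # The CEO of OpenAI is
```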

#### Without Message History

By default, Cria automatically saves responses in message history, even if the stream is interrupted. To prevent this behaviour, pass in `allow_interruption=False`.

```python
ai = cria.Cria(allow_interruption=False)

response = ""
max_token_length = 5

prompt = "Who is the CEO of OpenAI?"
for i, chunk in enumerate(ai.chat(prompt)):

    if i >= max_token_length:
        ai.stop()
        break

    print(chunk, end="")  # The CEO of OpenAI is

prompt = "Tell me more about him."
for chunk in ai.chat(prompt):
    print(chunk, end="")  # I apologize, but I don't have any information about "him" because the conversation just started...
```

### Multiple Models and Parallel Conversations

#### Models

If you are running multiple models or parallel conversations, the `Model` class is also available. This is recommended for most use cases.

```python
import cria

ai = cria.Model()

prompt = "Who is the CEO of OpenAI?"
response = ai.chat(prompt, stream=False)
print(response) # The CEO of OpenAI is Sam Altman.
```

_All methods that apply to the `Cria` class also apply to `Model`._
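
For instance, a minimal sketch (continuing with the `ai` model above) using the same `clear` and `close` methods shown in earlier sections:

```python
ai.clear()  # Reset message history, just as with cria.Cria.
ai.close()  # Close the model manually when you are done.
```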

#### With Model

Multiple models can be run through a `with` statement. This automatically closes them after use.

```python
import cria

prompt = "Who is the CEO of OpenAI?"

with cria.Model("llama3") as ai:
    response = ai.chat(prompt, stream=False)
    print(response)  # OpenAI's CEO is Sam Altman, who also...

with cria.Model("llama2") as ai:
    response = ai.chat(prompt, stream=False)
    print(response)  # The CEO of OpenAI is Sam Altman.
```

#### Standalone Model

Or, models can be run traditionally.

```python
import cria

prompt = "Who is the CEO of OpenAI?"

llama3 = cria.Model("llama3")
response = llama3.chat(prompt, stream=False)
print(response) # OpenAI's CEO is Sam Altman, who also...

llama2 = cria.Model("llama2")
response = llama2.chat(prompt, stream=False)
print(response) # The CEO of OpenAI is Sam Altman.

# Not required, but best practice.
llama3.close()
llama2.close()
```

### Generate

Cria also has a `generate` method for completing a single prompt.

```python
prompt = "Who is the CEO of OpenAI?"
for chunk in ai.generate(prompt):
    print(chunk, end="")  # The CEO of OpenAI (Open-source Artificial Intelligence) is Sam Altman.

prompt = "Tell me more about him."
response = ai.generate(prompt, stream=False)
print(response)  # I apologize, but I think there may have been some confusion earlier. As this...
```

### Running Standalone

When you run `cria.Cria()`, an `ollama` instance will start up if one is not already running. When the program exits, this instance will terminate.

However, if you want to save resources by not exiting `ollama`, either run your own `ollama` instance in another terminal, or run a managed subprocess.

#### Running Your Own Ollama Instance

```bash
ollama serve
```

```python
prompt = "Who is the CEO of OpenAI?"
with cria.Model() as ai:
    response = ai.generate(prompt, stream=False)
    print(response)
```

#### Running A Managed Subprocess (Recommended)

```python
# The first time you run the program, ollama starts automatically.
# On subsequent runs, ollama will already be running.

ai = cria.Cria(standalone=True, close_on_exit=False)
prompt = "Who is the CEO of OpenAI?"

with cria.Model("llama2") as llama2:
    response = llama2.generate(prompt, stream=False)
    print(response)

with cria.Model("llama3") as llama3:
    response = llama3.generate(prompt, stream=False)
    print(response)

quit()
# Despite exiting, ollama will keep running, and will be used the next time this program starts.
```

### Formatting

To format the output of the LLM, pass in the `format` keyword argument.

```python
ai = cria.Cria()

prompt = "Return a JSON array of AI companies."
response = ai.chat(prompt, stream=False, format="json")
print(response) # ["OpenAI", "Anthropic", "Meta", "Google", "Cohere", ...].
```

The current supported formats are:

* JSON
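
Since the formatted response comes back as a JSON string, you can parse it with Python's standard `json` module; a minimal sketch, continuing from the example above:

```python
import json

companies = json.loads(response)  # Parse the JSON string into a Python list.
print(companies[0])  # OpenAI
```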

## Contributing

If you have a feature request, feel free to open an issue!

Contributions are highly appreciated.

## License

[MIT](./LICENSE.md)