Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/leftmove/cria
Run LLMs locally with as little friction as possible.
collaborate github github-codespaces github-copilot llama llm ollama python
Last synced: 8 days ago
JSON representation
Run LLMs locally with as little friction as possible.
- Host: GitHub
- URL: https://github.com/leftmove/cria
- Owner: leftmove
- License: mit
- Created: 2024-04-20T21:02:08.000Z (8 months ago)
- Default Branch: main
- Last Pushed: 2024-10-08T02:58:08.000Z (2 months ago)
- Last Synced: 2024-11-30T00:08:14.081Z (12 days ago)
- Topics: collaborate, github, github-codespaces, github-copilot, llama, llm, ollama, python
- Language: Python
- Homepage: https://pypi.org/project/cria/
- Size: 73.2 KB
- Stars: 116
- Watchers: 4
- Forks: 5
- Open Issues: 3
- Metadata Files:
- Readme: README.md
- Contributing: CONTRIBUTING.md
- Funding: .github/FUNDING.yml
- License: LICENSE.md
- Code of conduct: CODE_OF_CONDUCT.md
Awesome Lists containing this project
- jimsghstars - leftmove/cria - Run LLMs locally with as little friction as possible. (Python)
README
Cria, use Python to run LLMs with as little friction as possible.

Cria is a library for programmatically running Large Language Models through Python. Cria is built so you need as little configuration as possible — even with more advanced features.
- **Easy**: No configuration is required out of the box. Getting started takes just five lines of code.
- **Concise**: Write less code to save time and avoid duplication.
- **Local**: Free and unobstructed by rate limits; running LLMs requires no internet connection.
- **Efficient**: Use advanced features with your own `ollama` instance, or a subprocess.

## Guide
- [Quick Start](#quickstart)
- [Installation](#installation)
- [Windows](#windows)
- [Mac](#mac)
- [Linux](#linux)
- [Advanced Usage](#advanced-usage)
- [Custom Models](#custom-models)
- [Streams](#streams)
- [Closing](#closing)
- [Message History](#message-history)
- [Follow-Up](#follow-up)
- [Clear Message History](#clear-message-history)
- [Passing In Custom Context](#passing-in-custom-context)
- [Interrupting](#interrupting)
- [With Message History](#with-message-history)
- [Without Message History](#without-message-history)
- [Multiple Models and Parallel Conversations](#multiple-models-and-parallel-conversations)
- [Models](#models)
- [With](#with-model)
- [Standalone](#standalone-model)
- [Running Standalone](#running-standalone)
- [Formatting](#formatting)
- [Contributing](#contributing)
- [License](#license)

## Quickstart
Running Cria is easy. After installation, you need just five lines of code — no configurations, no manual downloads, no API keys, and no servers to worry about.
```python
import cria

ai = cria.Cria()

prompt = "Who is the CEO of OpenAI?"
for chunk in ai.chat(prompt):
    print(chunk, end="")
```

```
>>> The CEO of OpenAI is Sam Altman!
```

Or, you can run this more configurable example.
```python
import cria

with cria.Model() as ai:
    prompt = "Who is the CEO of OpenAI?"
    response = ai.chat(prompt, stream=False)
    print(response)
```

```
>>> The CEO of OpenAI is Sam Altman!
```

> [!WARNING]
> If no model is configured, Cria automatically installs and runs the default model: `llama3.1:8b` (4.7GB).

## Installation
1. Cria uses [`ollama`](https://ollama.com/). To install it, run the following.
### Windows
[Download](https://ollama.com/download/windows)
### Mac
[Download](https://ollama.com/download/mac)
### Linux
```
curl -fsSL https://ollama.com/install.sh | sh
```

2. Install Cria with `pip`.
```
pip install cria
```

## Advanced Usage
### Custom Models
To run other LLMs, pass the model name into your `ai` variable.
```python
import cria

ai = cria.Cria("llama2")
prompt = "Who is the CEO of OpenAI?"
for chunk in ai.chat(prompt):
    print(chunk, end="") # The CEO of OpenAI is Sam Altman. He co-founded OpenAI in 2015 with...
```

You can find available models [here](https://ollama.com/library).
### Streams
Streams are used by default in Cria, but you can turn them off by passing `stream=False`.
```python
prompt = "Who is the CEO of OpenAI?"
response = ai.chat(prompt, stream=False)
print(response) # The CEO of OpenAI is Sam Altman!
```

### Closing
By default, models are closed when you exit the Python program, but closing them manually is a best practice.
```python
ai.close()
```

You can also use [`with`](#with-model) statements to close models automatically (recommended).
### Message History
#### Follow-Up
Message history is automatically saved in Cria, so asking follow-up questions is easy.
```python
prompt = "Who is the CEO of OpenAI?"
response = ai.chat(prompt, stream=False)
print(response) # The CEO of OpenAI is Sam Altman.

prompt = "Tell me more about him."
response = ai.chat(prompt, stream=False)
print(response) # Sam Altman is an American entrepreneur and technologist who serves as the CEO of OpenAI...
```

#### Clear Message History
You can reset the message history by calling the `clear` method.
```python
prompt = "Who is the CEO of OpenAI?"
response = ai.chat(prompt, stream=False)
print(response) # Sam Altman is an American entrepreneur and technologist who serves as the CEO of OpenAI...

ai.clear()
prompt = "Tell me more about him."
response = ai.chat(prompt, stream=False)
print(response) # I apologize, but I don't have any information about "him" because the conversation just started...
```

#### Passing In Custom Context
You can also create a custom message history and pass in your own context.
```python
context = "Our AI system employed a hybrid approach combining reinforcement learning and generative adversarial networks (GANs) to optimize the decision-making..."
messages = [
{"role": "system", "content": "You are a technical documentation writer"},
{"role": "user", "content": context},
]

prompt = "Write some documentation using the text I gave you."
for chunk in ai.chat(messages=messages, prompt=prompt):
    print(chunk, end="") # AI System Optimization: Hybrid Approach Combining Reinforcement Learning and...
```

In the example, instructions are given to the LLM as the `system` role. Then, extra context is given as the `user`. Finally, the prompt is entered (also as a `user`). You can use any mixture of roles to steer the LLM to your liking.
The available roles for messages are:
- `user` - Pass prompts as the user.
- `system` - Give instructions as the system.
- `assistant` - Act as the AI assistant yourself, and give the LLM lines.

The `prompt` parameter will always be appended to `messages` under the `user` role. To override this, you can pass in nothing for `prompt`, as in the sketch below.
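For example, a minimal sketch of relying on `messages` alone (assuming `ai` is an existing `cria.Cria` or `cria.Model` instance; the message contents are placeholders):

```python
messages = [
    {"role": "system", "content": "You are a technical documentation writer"},
    {"role": "assistant", "content": "Understood. Send over the text you want documented."},
    {"role": "user", "content": "Write some documentation using the text I gave you."},
]

# No `prompt` argument is passed, so nothing extra is appended as a `user` message.
for chunk in ai.chat(messages=messages):
    print(chunk, end="")
```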
### Interrupting
#### With Message History
If you are streaming messages with Cria, you can interrupt the response midway.
```python
response = ""
max_token_length = 5

prompt = "Who is the CEO of OpenAI?"
for i, chunk in enumerate(ai.chat(prompt)):
    if i >= max_token_length:
        ai.stop()
    response += chunk

print(response) # The CEO of OpenAI is
```

```python
response = ""
max_token_length = 5

prompt = "Who is the CEO of OpenAI?"
for i, chunk in enumerate(ai.generate(prompt)):
    if i >= max_token_length:
        ai.stop()
    response += chunk

print(response) # The CEO of OpenAI is
```

In the examples, after the AI generates five tokens (units of text that are usually a couple of characters long), text generation is stopped via the `stop` method. After `stop` is called, you can safely `break` out of the `for` loop, as in the sketch below.
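A minimal sketch of that pattern, using the same names as the examples above:

```python
response = ""
max_token_length = 5

prompt = "Who is the CEO of OpenAI?"
for i, chunk in enumerate(ai.chat(prompt)):
    if i >= max_token_length:
        ai.stop()
        break # Safe to exit the loop once generation has stopped.
    response += chunk

print(response) # The CEO of OpenAI is
```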
#### Without Message History
By default, Cria automatically saves responses in message history, even if the stream is interrupted. To prevent this behaviour, pass in `allow_interruption=False`.
```python
ai = cria.Cria(allow_interruption=False)

response = ""
max_token_length = 5

prompt = "Who is the CEO of OpenAI?"
for i, chunk in enumerate(ai.chat(prompt)):
    if i >= max_token_length:
        ai.stop()
        break
    print(chunk, end="") # The CEO of OpenAI is

prompt = "Tell me more about him."
for chunk in ai.chat(prompt):
    print(chunk, end="") # I apologize, but I don't have any information about "him" because the conversation just started...
```

### Multiple Models and Parallel Conversations
#### Models
If you are running multiple models or parallel conversations, the `Model` class is also available. This is recommended for most use cases.
```python
import cria

ai = cria.Model()
prompt = "Who is the CEO of OpenAI?"
response = ai.chat(prompt, stream=False)
print(response) # The CEO of OpenAI is Sam Altman.
```

_All methods that apply to the `Cria` class also apply to `Model`._
#### With Model
Multiple models can be run through a `with` statement. This automatically closes them after use.
```python
import cria

prompt = "Who is the CEO of OpenAI?"
with cria.Model("llama3") as ai:
    response = ai.chat(prompt, stream=False)
    print(response) # OpenAI's CEO is Sam Altman, who also...

with cria.Model("llama2") as ai:
    response = ai.chat(prompt, stream=False)
    print(response) # The CEO of OpenAI is Sam Altman.
```

#### Standalone Model
Or, models can be run traditionally.
```python
import cria

prompt = "Who is the CEO of OpenAI?"
llama3 = cria.Model("llama3")
response = llama3.chat(prompt, stream=False)
print(response) # OpenAI's CEO is Sam Altman, who also...

llama2 = cria.Model("llama2")
response = llama2.chat(prompt, stream=False)
print(response) # The CEO of OpenAI is Sam Altman.

# Not required, but best practice.
llama3.close()
llama2.close()
```
### Generate
Cria also has a `generate` method.
```python
prompt = "Who is the CEO of OpenAI?"
for chunk in ai.generate(prompt):
    print(chunk, end="") # The CEO of OpenAI (Open-source Artificial Intelligence) is Sam Altman.

prompt = "Tell me more about him."
response = ai.generate(prompt, stream=False)
print(response) # I apologize, but I think there may have been some confusion earlier. As this...
```

### Running Standalone
When you run `cria.Cria()`, an `ollama` instance will start up if one is not already running. When the program exits, this instance will terminate.
However, if you want to save resources by not exiting `ollama`, either run your own `ollama` instance in another terminal, or run a managed subprocess.
#### Running Your Own Ollama Instance
```bash
ollama serve
```

```python
prompt = "Who is the CEO of OpenAI?"
with cria.Model() as ai:
    response = ai.generate("Who is the CEO of OpenAI?", stream=False)
    print(response)
```

#### Running A Managed Subprocess (Recommended)
```python
import cria

# If it is the first time you start the program, ollama will start automatically.
# On subsequent runs, ollama will already be running.
ai = cria.Cria(standalone=True, close_on_exit=False)

prompt = "Who is the CEO of OpenAI?"

with cria.Model("llama2") as llama2:
    response = llama2.generate("Who is the CEO of OpenAI?", stream=False)
    print(response)

with cria.Model("llama3") as llama3:
    response = llama3.generate("Who is the CEO of OpenAI?", stream=False)
    print(response)

quit()
# Despite exiting, ollama will keep running, and will be used the next time this program starts.
```

### Formatting
To format the output of the LLM, pass in the `format` keyword.
```python
ai = cria.Cria()

prompt = "Return a JSON array of AI companies."
response = ai.chat(prompt, stream=False, format="json")
print(response) # ["OpenAI", "Anthropic", "Meta", "Google", "Cohere", ...].
```

The currently supported formats are:
* JSON
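Since `format="json"` makes the model return a JSON-formatted string (as the example output above suggests), you can parse it with Python's standard `json` module. A minimal sketch, assuming the response parses cleanly:

```python
import json
import cria

ai = cria.Cria()

prompt = "Return a JSON array of AI companies."
response = ai.chat(prompt, stream=False, format="json")

companies = json.loads(response) # Parse the JSON string into a Python list.
print(companies[0]) # e.g. OpenAI
```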
## Contributing
If you have a feature request, feel free to make an issue!
Contributions are highly appreciated.
## License
[MIT](./LICENSE.md)