https://github.com/soldni/tokreate
A minimal library to create tokens using LLMs.
- Host: GitHub
- URL: https://github.com/soldni/tokreate
- Owner: soldni
- Created: 2023-09-25T18:16:46.000Z (about 1 year ago)
- Default Branch: main
- Last Pushed: 2023-11-07T22:41:02.000Z (about 1 year ago)
- Last Synced: 2024-11-03T09:25:16.164Z (12 days ago)
- Topics: llm, nlp, python
- Language: Python
- Homepage: https://tokreate.soldaini.net
- Size: 41.7 MB
- Stars: 6
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
Tokreate
A minimal library to create tokens using LLMs.
**I refuse to write more than 200 lines of code for this.**
No [LangChain](https://www.langchain.com) shenanigans.
See [tokreate.py](https://github.com/soldni/tokreate/blob/main/src/tokreate/tokreate.py) to see how well I'm sticking to this.

Tokreate is [available on PyPI](https://pypi.org/project/tokreate/)! Simply run:
```shell
pip install tokreate
```
to get started.

Use the following shortcuts to navigate to a section and learn more about tokreate:
- [**Usage**](#usage)
- [**Design**](#design)
- [**Supported APIs**](#supported-apis)
- [**GitHub**](https://github.com/soldni/tokreate)

## Usage
Below, I show some ways tokreate can be used to prompt LLMs.
### A Simple Call Action
```python
from tokreate import CallAction

greetings = CallAction(
    prompt="Hello, my name is {{ name }}; what's yours?",
    model="gpt-3.5-turbo"
)

# tokreate always returns a history of the conversation
history = greetings.run(name="Alice")

for turn in history:
    print(repr(turn))

# user>>> Hello, my name is Alice; what's yours?
# assistant>>> Hello Alice, nice to meet you! I'm ChatGPT. How can I assist you today?
```

### Chaining Actions
```python
from tokreate import CallAction

greetings = CallAction(
    prompt="Hello, my name is {{ name }}; what's yours?",
    model="gpt-3.5-turbo"
)
tell_joke = CallAction(
    prompt="Can you finish this joke: What do you call a {{ animal }} with no legs?",
    model="gpt-3.5-turbo"
)

history = (greetings >> tell_joke).run(name="Alice", animal="snake")
for turn in history:
    print(repr(turn))

# user>>> Hello, my name is Alice; what's yours?
# assistant>>> Hello Alice, nice to meet you! I'm ChatGPT. How can I assist you today?
# user>>> Can you finish this joke: What do you call a snake with no legs?
# assistant>>> What do you call a snake with no legs? A "legless" reptile!
```

You can use parsers to extract information from any call:
```python
import json
from tokreate import CallAction, ParseAction

random_numbers = CallAction(
    prompt="Return a list of {{ random_count }} random numbers.",
    system="You are a helpful assistant who responds by always returning a valid JSON object.",
    model="gpt-3.5-turbo"
)
json_parser = ParseAction(
    parser=json.loads
)

*_, last_turn = (random_numbers >> json_parser).run(random_count='five')

parsed_content = last_turn.state["json.loads"]
print(f"Response {parsed_content} (type: {type(parsed_content)})")

# Response {'numbers': [4, 9, 2, 7, 1]} (type: <class 'dict'>)
```

### Using Different Models
Here is an example of how to use Anthropic's models:
```python
from tokreate import CallAction

greetings = CallAction(
    prompt="Hello, my name is {{ name }}; what's yours?",
    model="claude-instant-v1.1"
)

history = greetings.run(name="Alice")
for turn in history:
    print(repr(turn))

# user>>> Hello, my name is Alice; what's yours?
# assistant>>> I'm Claude, an AI assistant created by Anthropic.
```

Let's try with one of the TogetherAI models:
```python
from tokreate import CallAction

greetings = CallAction(
    prompt="Hello, my name is {{ name }}; what's yours?",
    model="togethercomputer/llama-2-70b-chat",
    history=False  # TogetherAI models don't support history at the API level yet :( you can build your own in the prompt
)

history = greetings.run(name="Alice")
for turn in history:
    print(repr(turn))

# user>>> Hello, my name is Alice; what's yours?
# assistant>>> Hello Alice! My name is ChatBot, it's nice to meet you. How can I assist you today? Is there a specific topic you'd like to discuss or ask me a question about?
```

### Async Support
You can make calls to LLMs asynchronously thanks to async support in tokreate. Note that some APIs (e.g., Anthropic) might support very few concurrent calls by default.
```python
import asyncio
from tokreate import CallAction

async def main():
    greetings = CallAction(
        prompt="Generate a very short description of the color {{ color }}.",
        model="gpt-3.5-turbo"
    )

    futures = []
    for color in ["red", "green", "blue"]:
        futures.append(greetings.arun(color=color))

    results = await asyncio.gather(*futures)
    for (request, response) in results:
        print(f"{request.state['color']}:", response)

if __name__ == "__main__":
    asyncio.run(main())

# red: Passionate and intense, red is a vibrant hue that evokes feelings of love, power, and energy.
# green: Green is the vibrant hue of nature, symbolizing growth, freshness, and harmony.
# blue: Blue is a cool and calming color that evokes feelings of tranquility and serenity.
```
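If you do run into one of those concurrency limits, one option is to throttle your own calls. The snippet below is a minimal sketch that wraps `arun` in an `asyncio.Semaphore`; the limit of 2 is an illustrative value, not a tokreate setting.

```python
import asyncio
from tokreate import CallAction

MAX_CONCURRENT = 2  # illustrative limit, not a tokreate default

async def main():
    describe = CallAction(
        prompt="Generate a very short description of the color {{ color }}.",
        model="gpt-3.5-turbo"
    )
    semaphore = asyncio.Semaphore(MAX_CONCURRENT)

    async def bounded_call(color: str):
        # At most MAX_CONCURRENT calls are in flight at any time.
        async with semaphore:
            return await describe.arun(color=color)

    results = await asyncio.gather(*(bounded_call(c) for c in ["red", "green", "blue"]))
    for request, response in results:
        print(f"{request.state['color']}:", response)

if __name__ == "__main__":
    asyncio.run(main())
```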
## Design

I tried to keep the design of tokreate as simple as I possibly could. The library is composed of two main classes: `Turn` and `Action`.
- **Turns** are serializable objects that represent a single turn in a conversation.
- **Actions** are objects that use prompts to call an LLM. Actions generate turns.

### Turn APIs
Under the hood, turns are Python dataclasses. Each turn has the following fields:
- `Turn.role`: this can be either `user` or `assistant`, depending on who generated the turn.
- `Turn.content`: the output of the LLM if the turn was generated by an action. Otherwise, this is the prompt from an action.
- `Turn.state`: a dictionary that can be used to store any information. Typically, this holds either the variables provided when calling the `run` method of an action or the output of a `ParseAction`.
- `Turn.meta`: a dictionary that stores metadata, typically from calling an LLM API.

For example, the following code produces turns with the following content:
```python
from tokreate import CallAction
greetings = CallAction(
    prompt="Hello, my name is {{ name }}; what's yours?",
    model="gpt-3.5-turbo"
)

user_turn, assistant_turn = greetings.run(name="Alice")

print("User turn:")
print('\trole:', user_turn.role)
print('\tcontent:', user_turn.content)
print('\tstate:', user_turn.state)
print('\tmeta:', user_turn.meta)
print('\n')
print("Assistant turn:")
print('\trole:', assistant_turn.role)
print('\tcontent:', assistant_turn.content)
print('\tstate:', assistant_turn.state)
print('\tmeta:', assistant_turn.meta)

# User turn:
# role: user
# content: Hello, my name is Alice; what's yours?
# state: {'name': 'Alice'}
# meta: {}
# Assistant turn:
# role: assistant
# content: Hello Alice, nice to meet you! I'm ChatGPT. How can I assist you today?
# state: {'name': 'Alice'}
# meta: {'tokens_prompt': 18, 'tokens_completion': 21, 'latency': 0.47, 'messages': [{'role': 'user', 'content': "Hello, my name is Alice; what's yours?"}], 'temperature': 0, 'max_tokens': 300}
```

Note that `str(turn)` returns the `content` of the turn, and `repr(turn)` returns `'{role}>>> {content}'`.
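As a quick sketch of that behavior, you can construct a `Turn` by hand (the same constructor appears in the ParseAction example further down); `dataclasses.asdict` is only a generic way to serialize a dataclass, not a tokreate-specific helper:

```python
import dataclasses
from tokreate.tokreate import Turn

turn = Turn(role="assistant", content="Hello Alice, nice to meet you!")

print(str(turn))    # Hello Alice, nice to meet you!
print(repr(turn))   # assistant>>> Hello Alice, nice to meet you!

# Turns are dataclasses, so generic serialization also works:
print(dataclasses.asdict(turn))
```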
### Action APIs
Actions are objects that use prompts to call an LLM. Actions are run through the `run()` method, and always return a list of turns (i.e., a history). They can be chained together using the `>>` and `<<` operators. A chain is itself an action.
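Because a chain is itself an action, it can be stored, reused, and extended like any other action. The snippet below is a minimal sketch of that idea; the prompts and model name are illustrative:

```python
from tokreate import CallAction

ask_name = CallAction(
    prompt="Hello, my name is {{ name }}; what's yours?",
    model="gpt-3.5-turbo"
)
ask_joke = CallAction(
    prompt="Can you tell me a short joke about {{ topic }}?",
    model="gpt-3.5-turbo"
)

# A chain built with >> is itself an action...
intro_chain = ask_name >> ask_joke

# ...so it exposes run() just like a single CallAction does.
history = intro_chain.run(name="Alice", topic="snakes")
for turn in history:
    print(repr(turn))
```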
The library supports two types of actions: `CallAction` and `ParseAction`. In theory, one could add actions for retrieval, planning, etc. In practice, I don't think those are in scope for this library; they can be added by users.
#### CallAction
A `CallAction` is an action that calls an LLM. It requires two parameters:
- `prompt`: the prompt to use when calling the LLM. Prompts are rendered using [jinja2](https://jinja.palletsprojects.com/); each prompt has access to all variables passed by the user when calling the `run()` method of the action, the `state` of the previous turn, and the `history` of the conversation so far.
- `model`: the name of the model to use. This is a string that can be found by running `python -m tokreate.registry`.

For example, the following chain of actions uses several advanced features of the jinja2 templating language:
```python
from tokreate import CallAction

system = "You are a conversational assistant. You like to talk, but prefer giving short answers."

greetings_action = CallAction(
    prompt="Hello, my name is {{ name }}; what's yours?",
    model="gpt-3.5-turbo"
)
hobbies_action = CallAction(
    prompt="What are your hobbies? Mine are {% for hobby in hobbies %}{{ hobby }}{% if not loop.last %}, {% else %}.{% endif %}{% endfor %}",
    model="gpt-3.5-turbo"
)
color_action = CallAction(
    prompt="What's your favorite color? Mine is {{ favorite_color }}",
    model="gpt-3.5-turbo"
)

reflect_prompt = """
So far, you told me that:
{% for turn in history %}{% if turn.role == "assistant" %}
- "{{ turn.content.strip() }}"{% endif %}{% endfor %}Please reflect on what you told me, and what it says about you.
""".strip()
reflect_action = CallAction(
    prompt=reflect_prompt,
    model="gpt-3.5-turbo"
)

history = (
    greetings_action
    >> hobbies_action
    >> color_action
    >> reflect_action
).run(
    name="Alice",
    hobbies=["reading", "writing", "running"],
    favorite_color="blue"
)

for turn in history:
    print(repr(turn))
```

#### ParseAction
A `ParseAction` transforms the content of the last turn in a conversation using a function. The output of the transformation is saved in the state of the turn.
For example, a parse action can be used to convert the output of an LLM into a JSON object:
```python
import json
from tokreate import ParseAction
from tokreate.tokreate import Turn

turn = Turn(
    role="assistant",
    content='{"numbers": [4, 9, 2]}'
)

json_parser = ParseAction(
    parser=json.loads,
    name="structured_output"
)

*_, parsed_turn = json_parser.step(history=[turn])
assert parsed_turn.state["structured_output"] == {"numbers": [4, 9, 2]}
```

Note how `name` is used to specify the key in the state where the output of the parser should be saved. If `name` is not specified, the output of the parser is saved at `{function_module}.{function_name}` (in this case, it would be at `parsed_turn.state["json.loads"]`).
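As a minimal sketch of that default, the same turn can be parsed without passing a `name`:

```python
import json
from tokreate import ParseAction
from tokreate.tokreate import Turn

turn = Turn(role="assistant", content='{"numbers": [4, 9, 2]}')

# No `name` given: the parsed value is stored under "{module}.{function}".
default_parser = ParseAction(parser=json.loads)
*_, parsed_turn = default_parser.step(history=[turn])

assert parsed_turn.state["json.loads"] == {"numbers": [4, 9, 2]}
```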
## Supported APIs
This library currently supports OpenAI, Anthropic, and TogetherAI. For a list of all models, run `python -m tokreate.registry`. Please open an [issue on GitHub](https://github.com/soldni/tokreate/issues) to request support for a new API.
## GPT JSON Output (New! 11/06)
Tokreate has been updated to take advantage of the newest OpenAI API features. For example, using the `-1106` models, we can ensure that the output of the API is a valid JSON object:
```python
from tokreate import CallAction

random_action = CallAction(
    model="gpt-3.5-turbo-1106",
    prompt="Give me three random numbers as a JSON array",
    parameters={'response_format': {'type': 'json_object'}}
)

*_, response = random_action.run()
print(response)
# {
# "numbers": [5, 23, 17]
# }
```