https://github.com/thetokencompany/the-token-company-python

Python SDK for The Token Company. Compress LLM prompts to reduce costs and latency
ai compression llm middleware optimization pip python

Host: GitHub
URL: https://github.com/thetokencompany/the-token-company-python
Owner: TheTokenCompany
License: mit
Created: 2026-06-03T00:32:04.000Z (20 days ago)
Default Branch: main
Last Pushed: 2026-06-05T05:41:56.000Z (17 days ago)
Last Synced: 2026-06-08T18:28:14.776Z (14 days ago)
Topics: ai, compression, llm, middleware, optimization, pip, python
Language: Python
Homepage: https://thetokencompany.com/docs
Size: 35.2 KB
Stars: 5
Watchers: 0
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE

# The Token Company Python SDK

Compress LLM prompts to reduce costs and latency. 100K tokens compressed in ~85ms.

[![CI](https://github.com/TheTokenCompany/the-token-company-python/actions/workflows/ci.yml/badge.svg)](https://github.com/TheTokenCompany/the-token-company-python/actions/workflows/ci.yml)

[![PyPI version](https://img.shields.io/pypi/v/the-token-company)](https://pypi.org/project/the-token-company/)

[![Python versions](https://img.shields.io/pypi/pyversions/the-token-company)](https://pypi.org/project/the-token-company/)

[![License: MIT](https://img.shields.io/badge/license-MIT-blue.svg)](https://github.com/TheTokenCompany/the-token-company-python/blob/main/LICENSE)

[Docs](https://thetokencompany.com/docs) · [Website](https://thetokencompany.com) · [Dashboard](https://app.thetokencompany.com) · [Node.js SDK](https://github.com/TheTokenCompany/the-token-company-node)



## Install

```bash
pip install the-token-company
```

## Quick start

```python
from thetokencompany import TheTokenCompany

client = TheTokenCompany(api_key="ttc-...")

result = client.compress("Your long prompt text here...", model="bear-2")

print(result.output)             # compressed text
print(result.tokens_saved)       # tokens removed
print(result.compression_ratio)  # e.g. 1.8
```

## SDK wrappers

Drop-in wrappers that auto-compress all non-assistant messages before sending to your LLM. Assistant messages pass through unchanged so the provider's KV cache stays warm.
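
To make the pass-through rule concrete, here is a minimal sketch of the idea, not the SDK's actual implementation. The helper `compress_messages` and the stand-in `toy_compress` are hypothetical names introduced for illustration; a real compressor would call the API instead of collapsing whitespace.

```python
from typing import Callable


def compress_messages(messages: list[dict], compress: Callable[[str], str]) -> list[dict]:
    """Return a copy of `messages` with non-assistant string content compressed."""
    out = []
    for msg in messages:
        content = msg.get("content")
        if msg.get("role") != "assistant" and isinstance(content, str):
            # system / user / tool text is rewritten by the compressor
            out.append({**msg, "content": compress(content)})
        else:
            # assistant turns (and non-string content) are copied unchanged
            out.append(dict(msg))
    return out


def toy_compress(text: str) -> str:
    """Stand-in compressor for illustration only: collapses runs of whitespace."""
    return " ".join(text.split())


messages = [
    {"role": "system", "content": "You   are a helpful   assistant."},
    {"role": "assistant", "content": "Earlier   reply, left   untouched."},
    {"role": "user", "content": "Summarize   these results."},
]
compressed = compress_messages(messages, toy_compress)
```

Because assistant turns are returned byte-for-byte identical, any prefix of the conversation that the provider has already cached remains valid.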

### OpenAI / OpenRouter

```python
from openai import OpenAI
from thetokencompany.openai import with_compression

client = with_compression(OpenAI(), compression_api_key="ttc-...")

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {"role": "system", "content": "You are a helpful assistant..."},
        {"role": "user", "content": "Summarize these results..."},
    ],
)
```

Works with `AsyncOpenAI` too — the wrapper detects async automatically.

### Anthropic

```python
from anthropic import Anthropic
from thetokencompany.anthropic import with_compression

client = with_compression(Anthropic(), compression_api_key="ttc-...")

response = client.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=1024,
    system="You are a helpful assistant...",
    messages=[{"role": "user", "content": "Summarize these results..."}],
)
```

Both `messages` and the `system` parameter are compressed.

## Async

```python
import asyncio

from thetokencompany import AsyncTheTokenCompany


async def main():
    # `async with` must run inside a coroutine
    async with AsyncTheTokenCompany(api_key="ttc-...") as client:
        result = await client.compress("Your long prompt text...")


asyncio.run(main())
```
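
The async client is most useful when several prompts are compressed at once. The sketch below shows one way to do that with a concurrency cap. It assumes only that the client exposes an awaitable `compress(text)` as shown above; `compress_many`, its `limit` parameter, and the `StubClient` stand-in are hypothetical and exist here so the example runs without an API key.

```python
import asyncio


async def compress_many(client, prompts, limit=5):
    """Compress prompts concurrently with at most `limit` requests in flight."""
    semaphore = asyncio.Semaphore(limit)

    async def one(prompt):
        async with semaphore:
            return await client.compress(prompt)

    # gather() returns results in the same order as the input prompts
    return await asyncio.gather(*(one(p) for p in prompts))


class StubClient:
    """Stand-in for the real async client; collapses whitespace instead of calling the API."""

    async def compress(self, text):
        await asyncio.sleep(0)  # yield control, as a network call would
        return " ".join(text.split())


results = asyncio.run(compress_many(StubClient(), ["a   b", "c    d", "e f"], limit=2))
```

With the real client, pass the `AsyncTheTokenCompany` instance in place of `StubClient()`.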

## Models

| Model      | Description         |
|------------|---------------------|
| `bear-2`   | Latest, recommended |
| `bear-1.2` | Previous generation |

## Aggressiveness

Control compression intensity with `aggressiveness` (0.0 – 1.0, default 0.5):

```python
result = client.compress(text, model="bear-2", aggressiveness=0.8)
```

## App ID

Tag compression requests with an application identifier for usage tracking:

```python
# Set on the client: applies to all requests
client = TheTokenCompany(api_key="ttc-...", app_id="my-chatbot")

# Or per-request (overrides the client-level value)
result = client.compress(text, model="bear-2", app_id="my-chatbot")
```

Also supported in wrappers:

```python
client = with_compression(OpenAI(), compression_api_key="ttc-...", app_id="my-chatbot")
```

## Gzip

Enable gzip compression of request payloads for better performance on large inputs (up to 2.2x faster on 1M+ tokens):

```python
client = TheTokenCompany(api_key="ttc-...", gzip=True)
```
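
For intuition about why this helps on large inputs, the sketch below gzips a JSON request body using only the standard library. It illustrates the general technique, not the SDK's internals: the `input` field name and the header set are assumptions, and the repeated sample text compresses far better than typical prompts would.

```python
import gzip
import json

# Hypothetical request body; the real wire format is not documented here.
payload = json.dumps({"input": "lorem ipsum dolor sit amet " * 2000}).encode("utf-8")

# Compress the body and tell the server how it was encoded.
body = gzip.compress(payload)
headers = {"Content-Type": "application/json", "Content-Encoding": "gzip"}

print(f"raw: {len(payload)} bytes, gzipped: {len(body)} bytes")
```

Fewer bytes on the wire means less upload time, which is where the speedup on very large prompts would come from.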

## Protect text from compression

Use `protect()` to wrap content in protection tags. Protected text passes through unchanged:

```python
from thetokencompany import protect

prompt = f"{protect('system:')} You are a helpful assistant.\n{protect('user:')} Hello!"

result = client.compress(prompt, model="bear-2")
```

## Response

`CompressResponse` fields:

| Field               | Type    | Description                    |
|---------------------|---------|--------------------------------|
| `output`            | `str`   | Compressed text                |
| `output_tokens`     | `int`   | Token count after compression  |
| `input_tokens`      | `int`   | Token count before compression |
| `tokens_saved`      | `int`   | Tokens removed                 |
| `compression_ratio` | `float` | Ratio (e.g. 1.8x)              |
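
The fields above are presumably related by simple arithmetic. The sketch below spells out one plausible reading, assuming `tokens_saved = input_tokens - output_tokens` and `compression_ratio = input_tokens / output_tokens`; the README does not state these formulas, and `ResponseSketch` is a hypothetical stand-in rather than the SDK's `CompressResponse`.

```python
from dataclasses import dataclass


@dataclass
class ResponseSketch:
    """Hypothetical stand-in used to illustrate how the fields could relate."""

    output: str
    output_tokens: int
    input_tokens: int

    @property
    def tokens_saved(self) -> int:
        # Assumed: tokens removed is the difference between input and output counts
        return self.input_tokens - self.output_tokens

    @property
    def compression_ratio(self) -> float:
        # Assumed: ratio is input over output, so values above 1.0 mean savings
        if self.output_tokens == 0:
            return float("inf")
        return self.input_tokens / self.output_tokens


r = ResponseSketch(output="...", output_tokens=1000, input_tokens=1800)
print(r.tokens_saved, r.compression_ratio)
```

Under these assumptions, an 1,800-token prompt compressed to 1,000 tokens saves 800 tokens at a ratio of 1.8, matching the example value in the table.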

## License

MIT
