Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.

aidapter: Adapter / facade for language models (OpenAI, Anthropic, Cohere, local transformers, etc.)
https://github.com/mobarski/aidapter
- Host: GitHub
- URL: https://github.com/mobarski/aidapter
- Owner: mobarski
- License: MIT
- Created: 2023-05-23T22:49:35.000Z (over 1 year ago)
- Default Branch: main
- Last Pushed: 2023-09-21T19:15:24.000Z (about 1 year ago)
- Last Synced: 2024-10-06T09:52:32.121Z (about 1 month ago)
- Topics: adapter, ai, anthropic, cohere, facade, llm, llms, openai, python, transformers
- Language: Python
- Size: 104 KB
- Stars: 18
- Watchers: 3
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# aidapter
Simple adapter for many language models, both remote (Hugging Face, OpenAI, Anthropic, Cohere) and local (the transformers library).
Facilitates loading many new models (Guanaco, Falcon, Vicuna, etc.) in 16/8/4-bit modes.
It also supports embedding models (OpenAI, Cohere, Sentence Transformers).
## Installation
:construction: This is experimental software. Anything can change without notice.
```
pip install git+https://github.com/mobarski/aidapter.git
```

**Note**: each vendor API requires manual installation of its dependencies.
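The README does not pin the vendor packages; a plausible set of installs (an assumption; pick only the vendors you use):
```
pip install openai anthropic cohere transformers sentence-transformers
```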
## Features
- simple, unified API to many models (remote and local)
- batching
- parallel calls
- caching
- usage tracking
- automatic retries
- response priming (demonstrated below)
## Usage examples
**completion:**
```python
>>> import aidapter
>>> model = aidapter.model('openai:gpt-3.5-turbo') # uses OPENAI_API_KEY env variable
>>> model.complete('2+2=')
'4'
```

```python
>>> model.complete(['2+2=','7*6=']) # parallel calls
['4', '42']
```
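Response priming via the documented `start` argument, as a minimal sketch (the exact continuation varies by model):
```python
>>> model.complete('Name three primary colors.', start='1. ')  # response begins with the primed '1. '
```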
**embeddings:**
```python
>>> model = aidapter.model('sentence-transformers:multi-qa-mpnet-base-dot-v1')
>>> vector = model.embed('mighty indeed')
>>> vector[:5]
[-0.07946087, -0.2150347, -0.33358946, 0.18340564, 0.16403404]
```

```python
>>> vectors = model.embed(['this is the way', 'so say we all']) # parallel / batch processing
>>> [x[:5] for x in vectors]
[[0.037638217, -0.30608281, -0.3064257, -0.46715638, -0.2608084],
[-0.063842215, -0.16669855, -0.22363697, -0.2893797, 0.060464755]]
```
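As a usage note (plain NumPy, not part of aidapter): multi-qa-mpnet-base-dot-v1 is trained for dot-product scoring, so the returned vectors can be compared directly:
```python
>>> import numpy as np
>>> a, b = model.embed(['this is the way', 'so say we all'])
>>> float(np.dot(a, b))  # dot-product relevance; higher means more similar
```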
**multiple models:**
```python
>>> m1 = aidapter.model('transformers:ehartford/Wizard-Vicuna-13B-Uncensored:4bit') # 4 bit mode
>>> m2 = aidapter.model('anthropic:claude-instant-v1') # uses ANTHROPIC_API_KEY env variable
```

**persistent cache and usage tracking:**
```python
>>> import shelve
>>> model.cache = shelve.open('/tmp/aidapter.cache') # persistent disk cache
>>> model.usage = shelve.open('/tmp/aidapter.usage') # persistent usage tracking
```

```python
>>> import diskcache as dc
>>> model.cache = dc.Cache('/tmp/aidapter.cache') # persistent disk cache
>>> model.usage = dc.Cache('/tmp/aidapter.usage') # persistent usage tracking
```
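The accumulated counters can then be read back as a mapping (an assumption based on the stores above; key names such as tokens, characters and time come from the changelog):
```python
>>> {k: model.usage[k] for k in model.usage}  # e.g. token / character / time counters
```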
**function calling interface\*:**
```python
>>> def get_weather(city):
...     "get weather info for a city; city must be all caps after ISO country code and a : separator (e.g. FR:PARIS)"
...     ...
>>> model = aidapter.model('openai:gpt-3.5-turbo-0613')
>>> model.complete("What's the weather in the capital of Poland?", functions=[get_weather])
{'function_name': 'get_weather', 'arguments': {'city': 'PL:WARSAW'}}
```

\* currently, it works only with selected OpenAI models
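aidapter never executes the functions (only their signatures are passed to the model), so dispatching the returned call is left to the caller; a minimal sketch:
```python
>>> result = model.complete("What's the weather in the capital of Poland?",
...                         functions=[get_weather])
>>> if result['function_name'] == 'get_weather':
...     get_weather(**result['arguments'])
```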
**use last_hidden_state from any transformer as an embedding\*:**
```python
>>> model = aidapter.model('transformers:RWKV/rwkv-raven-1b5')
>>> model.raw_embed_one('mighty indeed')[:5]
[0.14850381016731262, -0.021324729546904564, 0.09214707463979721, 0.34308338165283203, -0.11288302391767502]
```

\* requires additional normalization over a corpus; the API will change
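The normalization is not specified here; one plausible approach (an assumption, plain NumPy) is to standardize each dimension over a reference corpus:
```python
>>> import numpy as np
>>> corpus = ['first text', 'second text', 'third text']       # hypothetical reference corpus
>>> M = np.array([model.raw_embed_one(t) for t in corpus])
>>> mu, sigma = M.mean(axis=0), M.std(axis=0) + 1e-8
>>> (np.array(model.raw_embed_one('mighty indeed')) - mu) / sigma  # standardized embedding
```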
## API
aidapter.**model**(model_id, \*\*api_kwargs) **-> model**
- `model_id` - model identifier in the format `vendor:model[:mode]` (e.g. `openai:gpt-4`, `transformers:tiiuae/falcon-7b:8bit`)
- `api_kwargs` - default API arguments

model.**complete**(prompt, system='', start='', stop=[], limit=100, temperature=0, functions=[], cache='use', debug=False) **-> str | list | dict**
- `prompt` - main prompt or list of prompts
- `system` - system prompt
- `start` - the text that will be appended to the start of the response and to the end of the prompt (aka response priming)
- `stop` - list of strings upon which to stop generating
- `limit` - maximum number of tokens to generate before stopping (aka max_new_tokens, max_tokens_to_sample)
- `temperature` - amount of randomness
- `functions` - list of functions available to the model (none of them will be executed - only the signatures are used)
- `cache` - cache usage:
- `use` - use the cache if the temperature is 0 (default)
- `skip` - don't use the cache
- `force` - use the cache even if the temperature is not 0
- `debug` - if True, the function will return a dictionary (or a list of dictionaries) containing internal objects / values
**FULL_PROMPT** = `system` + `prompt` + `start`
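For example, combining these arguments (a sketch; the exact prompt formatting is vendor-specific):
```python
>>> model.complete('2+2=',
...                system='You are a terse calculator.',  # prepended to the prompt
...                start='The answer is ',                # primes the response
...                temperature=0.7,
...                cache='force')                         # cache despite temperature > 0
```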
model.**embed**(input, limit=None) -> **list | list[list]**
- `input` - text or list of texts
- `limit` - limit the vector length to the first n dimensions (default = None = no limit)

**model configuration:**
- `model.workers` - number of concurrent workers for parallel completion (default=4)
- `model.show_progress` - show progress bar when performing parallel completion (default=False)
- `model.retry_tries` - maximum number of retry attempts (default=5)
- `model.retry_delay` - initial delay between retry attempts (default=0.1)
- `model.retry_backoff` - multiplier applied to the delay between retry attempts (default=3)
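Putting the retry settings together (a sketch; with these values the waits grow as 0.1 s, 0.3 s, 0.9 s, ...):
```python
>>> model.workers = 8            # more concurrent workers for parallel completion
>>> model.show_progress = True   # progress bar for parallel calls
>>> model.retry_tries = 5        # up to 5 attempts per call
>>> model.retry_delay = 0.1      # first wait: 0.1 s
>>> model.retry_backoff = 3      # each subsequent wait is 3x longer
```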
## Supported models
### OpenAI
- `openai:gpt-4`
- `openai:gpt-4-32k`
- `openai:gpt-3.5-turbo`
- `openai:text-davinci-003`
- `openai:code-davinci-002`
- ...

API key env. variable: **OPENAI_API_KEY**
### Anthropic
- `anthropic:claude-v1`
- `anthropic:claude-instant-v1`
- `anthropic:claude-v1-100k`
- `anthropic:claude-instant-v1-100k`
- ...

API key env. variable: **ANTHROPIC_API_KEY**
### Cohere
- `cohere:command`
- `cohere:command-light`
- ...

API key env. variable: **CO_API_KEY**
### Transformers
- `transformers:TheBloke/guanaco-7B-HF`
- `transformers:tiiuae/falcon-7b`
- `transformers:RWKV/rwkv-raven-3b`
- `transformers:ehartford/Wizard-Vicuna-13B-Uncensored`
- `transformers:roneneldan/TinyStories-33M`
- ...
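Any of these can be loaded quantized by appending the mode suffix introduced in 0.4 (`:16bit`, `:8bit`, `:4bit`), for example:
```python
>>> model = aidapter.model('transformers:tiiuae/falcon-7b:8bit')  # load_in_8bit
```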
## Change log
### 0.6.4
- initial support for HF API models
- removed old HF implementation
### 0.6.3
- OpenAI's embeddings now use BaseModelV2
### 0.6.2
- as_iter option in BaseModelV2.transform
- removed BaseModelV2.register_progress
### 0.6.1
- handle cache=False in BaseModelV2.transform_many
- hf2 brand renamed to huggingface
### 0.6
- initial support for HF API embeddings
- BaseModelV2
- cleaner code
- diskcache support
- batch + threads support
- retry configuration
- progress update
### 0.5.4
- initial support for the functions argument (works only with selected OpenAI models)
### 0.5.3
- initial support for raw_embed_one in transformers (for creating embeddings from ANY transformer models)
### 0.5.2
- fix: kw handling in get_cache_key
### 0.5.1
- `limit` option for embedding models
### 0.5
- initial support for embedding models (requires more work with batch / parallel processing):
- OpenAI
- Cohere
- Sentence Transformers
### 0.4.4
- response priming (`start` option)
### 0.4.3
- `stop` option for transformers
### 0.4.2
- anthropic usage: tokens, characters
- transformers usage: tokens, characters
### 0.4.1
- remove prompt from transformers output
- removed kvdb
- usage['time']
- fixed pad_token_id
- fixed limit in transformer models
### 0.4
- initial support for local transformers models
- float16 (add ":16bit" to the model name)
- load_in_8bit (add ":8bit" to the model name)
- load_in_4bit (add ":4bit" to the model name)
- cache = use | skip | force
- shelve based persistence (for cache and usage)
### 0.3.2
- kvdb import fix
### 0.3
- Cohere models
- disk cache
### 0.2
- OpenAI instruct models
- Anthropic models (ANTHROPIC_API_KEY env variable)
- complete: debug option
- BaseModel.RENAME_KWARGS
- pip install
- limit handling
### 0.1
- parallel calls / cache / usage tracking / retries
- OpenAI chat models
### Next
- HF API text generation
- llama.cpp models (GGML!)
- strangulate BaseModel with BaseModelV2
## Reference Materials
- https://github.com/kagisearch/pyllms
- https://chat.lmsys.org/?leaderboard
- https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard
- https://huggingface.co/spaces/mteb/leaderboard
- https://huggingface.co/blog/getting-started-with-embeddings