Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.

aidapter: Adapter / facade for language models (OpenAI, Anthropic, Cohere, local transformers, etc.)
https://github.com/mobarski/aidapter
- Host: GitHub
- URL: https://github.com/mobarski/aidapter
- Owner: mobarski
- License: MIT
- Created: 2023-05-23T22:49:35.000Z (over 1 year ago)
- Default Branch: main
- Last Pushed: 2023-09-21T19:15:24.000Z (about 1 year ago)
- Last Synced: 2024-10-06T09:52:32.121Z (about 1 month ago)
- Topics: adapter, ai, anthropic, cohere, facade, llm, llms, openai, python, transformers
- Language: Python
- Size: 104 KB
- Stars: 18
- Watchers: 3
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# aidapter
Simple adapter for many language models, both remote (Hugging Face, OpenAI, Anthropic, Cohere) and local (the transformers library).
Facilitates loading many new models (Guanaco, Falcon, Vicuna, etc.) in 16/8/4-bit modes.
It also supports embedding models (OpenAI, Cohere, Sentence Transformers).
## Installation
:construction: This is experimental software. Anything can change without notice.
```
pip install git+https://github.com/mobarski/aidapter.git
```

**Note**: each vendor API requires manual installation of its dependencies.
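The README does not pin the vendor packages; a plausible set of installs (an assumption; pick only the vendors you use):
```
pip install openai anthropic cohere transformers sentence-transformers
```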
## Features
- simple, unified API to many models (remote and local)
- batching
- parallel calls
- caching
- usage tracking
- automatic retries
- response priming (demonstrated below)
## Usage examples
**completion:**
```python
>>> import aidapter
>>> model = aidapter.model('openai:gpt-3.5-turbo') # uses OPENAI_API_KEY env variable
>>> model.complete('2+2=')
'4'
```

```python
>>> model.complete(['2+2=','7*6=']) # parallel calls
['4', '42']
```
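Response priming via the documented `start` argument, as a minimal sketch (the exact continuation varies by model):
```python
>>> model.complete('Name three primary colors.', start='1. ')  # response begins with the primed '1. '
```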
**embeddings:**
```python
>>> model = aidapter.model('sentence-transformers:multi-qa-mpnet-base-dot-v1')
>>> vector = model.embed('mighty indeed')
>>> vector[:5]
[-0.07946087, -0.2150347, -0.33358946, 0.18340564, 0.16403404]
```

```python
>>> vectors = model.embed(['this is the way', 'so say we all']) # parallel / batch processing
>>> [x[:5] for x in vectors]
[[0.037638217, -0.30608281, -0.3064257, -0.46715638, -0.2608084],
[-0.063842215, -0.16669855, -0.22363697, -0.2893797, 0.060464755]]
```
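As a usage note (plain NumPy, not part of aidapter): multi-qa-mpnet-base-dot-v1 is trained for dot-product scoring, so the returned vectors can be compared directly:
```python
>>> import numpy as np
>>> a, b = model.embed(['this is the way', 'so say we all'])
>>> float(np.dot(a, b))  # dot-product relevance; higher means more similar
```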
**multiple models:**
```python
>>> m1 = aidapter.model('transformers:ehartford/Wizard-Vicuna-13B-Uncensored:4bit') # 4 bit mode
>>> m2 = aidapter.model('anthropic:claude-instant-v1') # uses ANTHROPIC_API_KEY env variable
```

**persistent cache and usage tracking:**
```python
>>> import shelve
>>> model.cache = shelve.open('/tmp/aidapter.cache') # persistent disk cache
>>> model.usage = shelve.open('/tmp/aidapter.usage') # persistent usage tracking
```

```python
>>> import diskcache as dc
>>> model.cache = dc.Cache('/tmp/aidapter.cache') # persistent disk cache
>>> model.usage = dc.Cache('/tmp/aidapter.usage') # persistent usage tracking
```
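The accumulated counters can then be read back as a mapping (an assumption based on the stores above; key names such as tokens, characters and time come from the changelog):
```python
>>> {k: model.usage[k] for k in model.usage}  # e.g. token / character / time counters
```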
**function calling interface\*:**
```python
>>> def get_weather(city):
...     "get weather info for a city; city must be all caps after ISO country code and a : separator (e.g. FR:PARIS)"
...     ...
>>> model = aidapter.model('openai:gpt-3.5-turbo-0613')
>>> model.complete("What's the weather in the capital of Poland?", functions=[get_weather])
{'function_name': 'get_weather', 'arguments': {'city': 'PL:WARSAW'}}
```

\* currently, it works only with selected OpenAI models
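aidapter never executes the functions (only their signatures are passed to the model), so dispatching the returned call is left to the caller; a minimal sketch:
```python
>>> result = model.complete("What's the weather in the capital of Poland?",
...                         functions=[get_weather])
>>> if result['function_name'] == 'get_weather':
...     get_weather(**result['arguments'])
```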
**use last_hidden_state from any transformer as an embedding\*:**
```python
>>> model = aidapter.model('transformers:RWKV/rwkv-raven-1b5')
>>> model.raw_embed_one('mighty indeed')[:5]
[0.14850381016731262, -0.021324729546904564, 0.09214707463979721, 0.34308338165283203, -0.11288302391767502]
```

\* requires additional normalization over a corpus; the API will change
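The normalization is not specified here; one plausible approach (an assumption, plain NumPy) is to standardize each dimension over a reference corpus:
```python
>>> import numpy as np
>>> corpus = ['first text', 'second text', 'third text']       # hypothetical reference corpus
>>> M = np.array([model.raw_embed_one(t) for t in corpus])
>>> mu, sigma = M.mean(axis=0), M.std(axis=0) + 1e-8
>>> (np.array(model.raw_embed_one('mighty indeed')) - mu) / sigma  # standardized embedding
```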
## API
aidapter.**model**(model_id, \*\*api_kwargs) **-> model**
- `model_id` - model identifier in the format `vendor:model[:mode]` (e.g. `openai:gpt-4`, `transformers:tiiuae/falcon-7b:8bit`)
- `api_kwargs` - default API arguments

model.**complete**(prompt, system='', start='', stop=[], limit=100, temperature=0, functions=[], cache='use', debug=False) **-> str | list | dict**
- `prompt` - main prompt or list of prompts
- `system` - system prompt
- `start` - the text that will be appended to the start of the response and to the end of the prompt (aka response priming)
- `stop` - list of strings upon which to stop generating
- `limit` - maximum number of tokens to generate before stopping (aka max_new_tokens, max_tokens_to_sample)
- `temperature` - amount of randomness
- `functions` - list of functions available to the model (none of them will be executed - only the signatures are used)
- `cache` - cache usage:
- `use` - use the cache if the temperature is 0 (default)
- `skip` - don't use the cache
- `force` - use the cache even if the temperature is not 0
- `debug` - if True, the function will return a dictionary (or a list of dictionaries) containing internal objects / values
**FULL_PROMPT** = `system` + `prompt` + `start`
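For example, combining these arguments (a sketch; the exact prompt formatting is vendor-specific):
```python
>>> model.complete('2+2=',
...                system='You are a terse calculator.',  # prepended to the prompt
...                start='The answer is ',                # primes the response
...                temperature=0.7,
...                cache='force')                         # cache despite temperature > 0
```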
model.**embed**(input, limit=None) -> **list | list[list]**
- `input` - text or list of texts
- `limit` - limit the vector length to the first n dimensions (default = None = no limit)

**model configuration:**
- `model.workers` - number of concurrent workers for parallel completion (default=4)
- `model.show_progress` - show progress bar when performing parallel completion (default=False)
- `model.retry_tries` - maximum number of retry attempts (default=5)
- `model.retry_delay` - initial delay between retry attempts (default=0.1)
- `model.retry_backoff` - multiplier applied to the delay between retry attempts (default=3)
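Putting the retry settings together (a sketch; with these values the waits grow as 0.1 s, 0.3 s, 0.9 s, ...):
```python
>>> model.workers = 8            # more concurrent workers for parallel completion
>>> model.show_progress = True   # progress bar for parallel calls
>>> model.retry_tries = 5        # up to 5 attempts per call
>>> model.retry_delay = 0.1      # first wait: 0.1 s
>>> model.retry_backoff = 3      # each subsequent wait is 3x longer
```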
## Supported models
### OpenAI
- `openai:gpt-4`
- `openai:gpt-4-32k`
- `openai:gpt-3.5-turbo`
- `openai:text-davinci-003`
- `openai:code-davinci-002`
- ...

API key env. variable: **OPENAI_API_KEY**
### Anthropic
- `anthropic:claude-v1`
- `anthropic:claude-instant-v1`
- `anthropic:claude-v1-100k`
- `anthropic:claude-instant-v1-100k`
- ...

API key env. variable: **ANTHROPIC_API_KEY**
### Cohere
- `cohere:command`
- `cohere:command-light`
- ...

API key env. variable: **CO_API_KEY**
### Transformers
- `transformers:TheBloke/guanaco-7B-HF`
- `transformers:tiiuae/falcon-7b`
- `transformers:RWKV/rwkv-raven-3b`
- `transformers:ehartford/Wizard-Vicuna-13B-Uncensored`
- `transformers:roneneldan/TinyStories-33M`
- ...
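Any of these can be loaded quantized by appending the mode suffix introduced in 0.4 (`:16bit`, `:8bit`, `:4bit`), for example:
```python
>>> model = aidapter.model('transformers:tiiuae/falcon-7b:8bit')  # load_in_8bit
```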
## Change log
### 0.6.4
- initial support for HF API models
- removed old HF implementation
### 0.6.3
- OpenAI's embeddings now use BaseModelV2
### 0.6.2
- as_iter option in BaseModelV2.transform
- removed BaseModelV2.register_progress
### 0.6.1
- handle cache=False in BaseModelV2.transform_many
- hf2 brand renamed to huggingface
### 0.6
- initial support for HF API embeddings
- BaseModelV2
- cleaner code
- diskcache support
- batch + threads support
- retry configuration
- progress update
### 0.5.4
- initial support for the functions argument (works only with selected OpenAI models)
### 0.5.3
- initial support for raw_embed_one in transformers (for creating embeddings from ANY transformer models)
### 0.5.2
- fix: kw handling in get_cache_key
### 0.5.1
- `limit` option for embedding models
### 0.5
- initial support for embedding models (requires more work with batch / parallel processing):
- OpenAI
- Cohere
- Sentence Transformers
### 0.4.4
- response priming (`start` option)
### 0.4.3
- `stop` option for transformers
### 0.4.2
- anthropic usage: tokens, characters
- transformers usage: tokens, characters
### 0.4.1
- remove prompt from transformers output
- removed kvdb
- usage['time']
- fixed pad_token_id
- fixed limit in transformer models
### 0.4
- initial support for local transformers models
- float16 (add ":16bit" to the model name)
- load_in_8bit (add ":8bit" to the model name)
- load_in_4bit (add ":4bit" to the model name)
- cache = use | skip | force
- shelve based persistence (for cache and usage)
### 0.3.2
- kvdb import fix
### 0.3
- Cohere models
- disk cache
### 0.2
- OpenAI instruct models
- Anthropic models (ANTHROPIC_API_KEY env variable)
- complete: debug option
- BaseModel.RENAME_KWARGS
- pip install
- limit handling
### 0.1
- parallel calls / cache / usage tracking / retries
- OpenAI chat models
### Next
- HF API text generation
- llama.cpp models (GGML!)
- strangulate BaseModel with BaseModelV2
## Reference Materials
- https://github.com/kagisearch/pyllms
- https://chat.lmsys.org/?leaderboard
- https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard
- https://huggingface.co/spaces/mteb/leaderboard
- https://huggingface.co/blog/getting-started-with-embeddings