https://github.com/arcee-ai/arcee-python
The Arcee client for executing domain-adapted language model routines https://pypi.org/project/arcee-py/
- Host: GitHub
- URL: https://github.com/arcee-ai/arcee-python
- Owner: arcee-ai
- Created: 2023-06-08T21:52:18.000Z (about 2 years ago)
- Default Branch: main
- Last Pushed: 2024-10-08T17:24:18.000Z (8 months ago)
- Last Synced: 2025-04-14T12:57:38.032Z (about 2 months ago)
- Topics: ai, llm, llm-inference, llm-training, llmops
- Language: Python
- Homepage: https://www.arcee.ai
- Size: 257 KB
- Stars: 27
- Watchers: 4
- Forks: 4
- Open Issues: 9
Metadata Files:
- Readme: README.md
- Codeowners: .github/CODEOWNERS
README
# Arcee Python Client
> The Arcee Python client allows you to manage CPT, SFT, DPO, and Merge models on the Arcee Platform.
This client may be used as a CLI by invoking `arcee` from the terminal, or as an SDK by running `import arcee` in Python.
Learn more at https://docs.arcee.ai
## Installation
```
pip install --upgrade arcee-py
```
## Authenticating
You can obtain your Arcee API key at https://app.arcee.ai
In bash:
```
export ARCEE_API_KEY=********
```
In notebook:
```
import os
os.environ["ARCEE_API_KEY"] = "********"
```
(Optional) To customize the URL of the Arcee platform:
```
export ARCEE_API_URL="https://your-url.arcee.ai"
```
(Optional) To specify an organization to issue requests for:
```
export ARCEE_ORG="my-organization"
```
If you do not specify an organization, your default organization will be used. You can change the default in your Arcee account settings.
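Before calling the SDK, it can help to confirm which of these variables are actually set. The following is a minimal sketch, not part of arcee-py: `arcee_env_summary` is a hypothetical helper that only reads the three variable names documented above and never prints the key itself.

```python
import os


def arcee_env_summary(env=None):
    """Report which Arcee variables are set, without exposing the key (hypothetical helper)."""
    env = os.environ if env is None else env
    if not env.get("ARCEE_API_KEY"):
        raise RuntimeError("ARCEE_API_KEY is not set; export it or set os.environ first")
    return {
        "api_key_set": True,
        # None means the optional variable was not set
        "api_url": env.get("ARCEE_API_URL"),
        "org": env.get("ARCEE_ORG"),
    }


# Demonstration with an explicit mapping instead of the real environment
summary = arcee_env_summary({"ARCEE_API_KEY": "********", "ARCEE_ORG": "my-organization"})
print(summary)
```

Calling `arcee_env_summary()` with no argument inspects the real process environment instead.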
## Upload Context
Upload context for retriever training:
```
import arcee
arcee.upload_docs("pubmed", docs=[{"doc_name": "doc1", "doc_text": "foo"}, {"doc_name": "doc2", "doc_text": "bar"}])
```
## Upload Finetuning Dataset
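Method 1 below reads a CSV with one column of prompts and one of completions. As a minimal sketch of preparing such a file with the standard library, assuming the column names `prompt` and `completion` used in that example: `write_finetuning_csv` is a hypothetical helper, not part of arcee-py, and the sample rows are placeholders.

```python
import csv
import os
import tempfile


def write_finetuning_csv(rows, csv_path, prompt_column="prompt", completion_column="completion"):
    """Write (prompt, completion) pairs to a CSV with a header row (hypothetical helper)."""
    with open(csv_path, "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=[prompt_column, completion_column])
        writer.writeheader()
        for prompt, completion in rows:
            writer.writerow({prompt_column: prompt, completion_column: completion})
    return csv_path


# Placeholder data; replace with your own pairs
rows = [
    ("What is a DALM?", "A domain-adapted language model."),
    ("Name one CLI command.", "arcee train"),
]
csv_path = write_finetuning_csv(rows, os.path.join(tempfile.mkdtemp(), "your_data.csv"))
print(csv_path)
```

The resulting path can then be passed as `csv_path` in Method 1.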
### Method 1: Via CSV
```
import arcee
arcee.upload_instructions_from_csv(
    "finetuning-dataset-name",
    csv_path="./your_data.csv",
    prompt_column="prompt",
    completion_column="completion"
)
```
### Method 2: Via HF Dataset
NOTE: you will need to set `HUGGINGFACE_TOKEN` in your environment to use this function.
```
import arcee
arcee.api.upload_hugging_face_dataset_qa_pairs(
    "my_qa_pairs",
    hf_dataset_id="org/dataset",
    dataset_split="train",
    data_format="chatml"
)
```
## Using the Arcee CLI
You can train and use your Domain-Adapted Language Model (DALM) from the Arcee CLI. After installation, follow these steps:
### Upload Context
Upload a context file for your DALM like,
```shell
arcee upload context pubmed --file doc1
```
Upload all files in a directory like,
```shell
arcee upload context pubmed --directory docs
```
Upload any combination of files and directories with,
```shell
arcee upload context pubmed --directory some_docs --file doc1 --directory more_docs --file doc2
```
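For SDK users, a similar mix of files and directories can be turned into the `docs` list that `arcee.upload_docs` takes (a list of `{"doc_name", "doc_text"}` dicts, as in the SDK example earlier). This is a minimal sketch under assumptions: `collect_docs` is a hypothetical helper, and its rules (skip empty files, skip paths already seen) are illustrative rather than a description of what the CLI checks.

```python
import tempfile
from pathlib import Path


def collect_docs(files=(), directories=()):
    """Build a docs payload from files and top-level directory entries (hypothetical helper)."""
    candidates = [Path(f) for f in files]
    for directory in directories:
        candidates.extend(sorted(p for p in Path(directory).iterdir() if p.is_file()))
    seen = set()
    docs = []
    for path in candidates:
        resolved = path.resolve()
        if resolved in seen:  # assumption: the same path is only uploaded once
            continue
        seen.add(resolved)
        text = path.read_text(encoding="utf-8")
        if not text.strip():  # assumption: empty files are skipped
            continue
        docs.append({"doc_name": path.name, "doc_text": text})
    return docs


# Demonstration on a throwaway directory
tmp = Path(tempfile.mkdtemp())
(tmp / "doc1.txt").write_text("foo", encoding="utf-8")
(tmp / "doc2.txt").write_text("bar", encoding="utf-8")
(tmp / "empty.txt").write_text("", encoding="utf-8")

docs = collect_docs(files=[tmp / "doc1.txt"], directories=[tmp])
print(docs)
# The payload could then be passed to the SDK call shown earlier:
# arcee.upload_docs("pubmed", docs=docs)
```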
*Note: The upload command ensures only valid and unique files are uploaded.*
### Train your DALM:
Train your DALM with any uploaded context like,
```shell
arcee train medical_dalm --context pubmed
# wait for training to complete...
```
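The upload and train steps can also be scripted. The sketch below only assembles the command lines shown above and prints them; `upload_context_cmd`, `train_cmd`, and `run_all` are hypothetical helpers, and whether `arcee train` blocks until training completes is not stated here, so treat `dry_run=False` as something to verify yourself.

```python
import shlex
import subprocess


def upload_context_cmd(context, files=(), directories=()):
    """Build `arcee upload context ...` with the --directory and --file flags shown above."""
    cmd = ["arcee", "upload", "context", context]
    for directory in directories:
        cmd += ["--directory", str(directory)]
    for file in files:
        cmd += ["--file", str(file)]
    return cmd


def train_cmd(model, context):
    """Build `arcee train <model> --context <context>`."""
    return ["arcee", "train", model, "--context", context]


def run_all(commands, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for cmd in commands:
        print(shlex.join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)


commands = [
    upload_context_cmd("pubmed", directories=["docs"]),
    train_cmd("medical_dalm", "pubmed"),
]
run_all(commands)  # dry run: prints the two commands without executing them
```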
### DALM Generation:
Generate text completions from a model like,
```shell
arcee generate medical_dalm --query "Can AI-driven music therapy contribute to the rehabilitation of patients with disorders of consciousness?"
```
### DALM Retrieval:
Retrieve documents for a given query, either to view them or to pass them to a different LLM, like,
```shell
arcee retrieve medical_dalm --query "Can AI-driven music therapy contribute to the rehabilitation of patients with disorders of consciousness?"
```
# Contributing
We use `invoke` to manage this repo. You don't need to use it, but it simplifies the workflow.
## Set up the repo
```shell
git clone https://github.com/arcee-ai/arcee-python && cd arcee-python
# optionally set up your virtual environment (recommended)
python -m venv .venv && source .venv/bin/activate
# install repo
pip install invoke
inv install
```
## Format, lint, test
```shell
inv format # run black and ruff
inv lint # black check, ruff check, mypy
inv test # pytest
```
## Publishing
We publish from this repo by creating a new release/tag on GitHub. On release, a GitHub Action will
publish the `__version__` of arcee-py that is in `arcee/__init__.py`.
**So you need to increase that version before releasing, otherwise it will fail.**
### To create a new release
1. Open a PR increasing the `__version__` of arcee-py. You can manually edit it or run `inv version`
2. Create a new release, with the name being the `__version__` of arcee-py
### Manual release [not recommended]
We do not recommend this. If you need to, please make the version number an alpha or beta release.
If you need to create a manual release, you can run `inv build && inv publish`
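Before a manual release, you may want to confirm that the version really is an alpha or beta. A minimal sketch, assuming `__version__` is assigned as a plain string literal in `arcee/__init__.py`: `read_version` and `is_alpha_or_beta` are hypothetical helpers, the pattern covers only the simple `X.Y.ZaN` / `X.Y.ZbN` forms rather than the full PEP 440 grammar, and the sample version is made up.

```python
import re

# Matches a line such as: __version__ = "1.2.0b1"
VERSION_RE = re.compile(r"""^__version__\s*=\s*["']([^"']+)["']""", re.MULTILINE)
# Simplified check: release digits followed by aN (alpha) or bN (beta)
PRERELEASE_RE = re.compile(r"^\d+(\.\d+)*(a|b)\d+$")


def read_version(init_source):
    """Extract the __version__ string from the text of an __init__.py file."""
    match = VERSION_RE.search(init_source)
    if match is None:
        raise ValueError("no __version__ assignment found")
    return match.group(1)


def is_alpha_or_beta(version):
    """Return True for versions like 1.2.0a1 or 1.2.0b3 (simplified subset of PEP 440)."""
    return PRERELEASE_RE.match(version) is not None


# Hypothetical file contents; in the repo you would read arcee/__init__.py instead
sample = '__version__ = "1.2.0b1"\n'
version = read_version(sample)
print(version, is_alpha_or_beta(version))
```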