Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/olivierduchenne/llm_json_schema
Guarantee that the output of an LLM follows a JSON schema.
JSON representation
- Host: GitHub
- URL: https://github.com/olivierduchenne/llm_json_schema
- Owner: olivierDuchenne
- License: mit
- Created: 2023-11-19T15:53:25.000Z (about 1 year ago)
- Default Branch: main
- Last Pushed: 2023-12-06T16:00:01.000Z (about 1 year ago)
- Last Synced: 2024-11-07T10:03:35.468Z (about 2 months ago)
- Topics: ai, generative-ai, jsonschema, large-language-models, llamacpp, llm
- Language: Python
- Homepage:
- Size: 30.3 KB
- Stars: 22
- Watchers: 2
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# What is LLM_json_schema?
LLM_json_schema forces the output of an LLM to follow a given JSON schema. The following types are supported: string, number, boolean, array, object.
The output is guaranteed to be valid against the schema.
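To illustrate what that guarantee means, here is a minimal pure-Python check (not part of this project; the type names follow the list above) that verifies a produced value against one of the five supported schema types:

```python
def matches_schema(value, schema):
    """Minimal recursive check for the five supported schema types."""
    t = schema.get("type")
    if t == "string":
        return isinstance(value, str)
    if t == "number":
        # bool is a subclass of int in Python, so exclude it explicitly
        return isinstance(value, (int, float)) and not isinstance(value, bool)
    if t == "boolean":
        return isinstance(value, bool)
    if t == "array":
        item_schema = schema.get("items", {})
        return isinstance(value, list) and all(
            matches_schema(v, item_schema) for v in value)
    if t == "object":
        props = schema.get("properties", {})
        return isinstance(value, dict) and all(
            k in value and matches_schema(value[k], s) for k, s in props.items())
    return False

schema = {"type": "object", "properties": {"country": {"type": "string"},
                                           "capital": {"type": "string"}}}
print(matches_schema({"country": "France", "capital": "Paris"}, schema))  # True
```

Every output produced under the schema constraint should pass such a check by construction.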
# Examples
```bash
python3 LLM_json_schema.py \
--model-path models/Mistral-7B-Instruct-v0.1.gguf \
--json-schema '{"type":"object", "properties":{"country":{"type":"string"}, "capital":{"type":"string"}}}' \
--prompt "What is the capital of France?\n\n"
```
Output:
```json
{"country":"France", "capital":"Paris"}
```

```bash
python3 LLM_json_schema.py \
--model-path models/Mistral-7B-Instruct-v0.1.gguf \
--json-schema '{"type":"array", "items":{"type":"number"}}' \
--prompt "Count until 20.\n\n"
```
Output:
```json
[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20]
```
# How does it work?
It adds biases to the logits produced by the LLM so that, at each decoding step, only tokens consistent with the JSON schema can be chosen.
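The biasing step can be sketched as follows (a hypothetical illustration, not the project's actual code): tokens that would break the schema at the current position receive an infinitely negative bias, so after softmax they have zero probability and can never be sampled.

```python
import math

def bias_logits(logits, allowed_token_ids):
    # Tokens outside the schema-valid set get -inf, so softmax assigns
    # them zero probability and neither greedy nor sampled decoding
    # can ever pick them.
    return [logit if i in allowed_token_ids else -math.inf
            for i, logit in enumerate(logits)]

# Example: suppose only token ids 1 and 3 are valid at this decoding step.
print(bias_logits([0.5, 2.0, 1.0, 0.1], {1, 3}))  # [-inf, 2.0, -inf, 0.1]
```

The set of allowed token ids is recomputed at every step from the schema and the tokens generated so far.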
# Installation
## Install LLM_json_schema
```bash
git clone https://github.com/olivierDuchenne/LLM_json_schema
cd LLM_json_schema
pip3 install -r requirements.txt
```
## Download and convert an LLM model
Download an LLM model, and convert it to the gguf format.
Example:
```bash
mkdir models
cd models
git clone https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.1
git clone https://github.com/ggerganov/llama.cpp.git
pip install -r llama.cpp/requirements.txt
python3 llama.cpp/convert.py Mistral-7B-Instruct-v0.1 \
--outfile Mistral-7B-Instruct-v0.1.gguf \
--outtype q8_0
cd ..
```
# Usage from CLI
```
usage: LLM_json_schema.py [-h] --model-path MODEL_PATH --prompt PROMPT [--json-schema JSON_SCHEMA]

options:
-h, --help show this help message and exit
--model-path MODEL_PATH
Path to the LLM model in gguf format
--prompt PROMPT Input prompt
--json-schema JSON_SCHEMA
JSON schema to enforce
```

```bash
python3 LLM_json_schema.py --model-path models/Mistral-7B-Instruct-v0.1.gguf --json-schema '{"type":"object", "properties":{"country":{"type":"string"}, "capital":{"type":"string"}}}' --prompt "What is the capital of France?\n\n"
```
# Usage from Python
```python
from LLM_json_schema import run_inference_constrained_by_json_schema
import os
script_path = os.path.dirname(os.path.realpath(__file__))
model_path = os.environ.get('MODEL_PATH', os.path.join(script_path, "models/Mistral-7B-Instruct-v0.1.gguf"))
prompt = "\n\n### Instruction:\nWhat is the capital of France?\n\n### Response:\n"
json_schema = {"type":"object", "properties":{"country":{"type":"string"}, "capital":{"type":"string"}}}
for chunk in run_inference_constrained_by_json_schema(model_path=model_path, json_schema=json_schema, prompt=prompt):
print(chunk, end="", flush=True)
print("")
```
# Citation
If you use this work, please cite the following:
```
@article{duchenne2023llm_json_schema,
title={LLM Json Schema},
author={Olivier Duchenne},
journal={Github},
url={https://github.com/olivierDuchenne/LLM_json_schema},
year={2023}
}
```