https://github.com/probcomp/gengpt3.jl
GPT-3 as a generative function in Gen.
- Host: GitHub
- URL: https://github.com/probcomp/gengpt3.jl
- Owner: probcomp
- License: apache-2.0
- Created: 2022-11-28T22:23:01.000Z (over 3 years ago)
- Default Branch: main
- Last Pushed: 2024-07-14T23:07:52.000Z (over 1 year ago)
- Last Synced: 2025-09-11T20:28:55.766Z (6 months ago)
- Language: Julia
- Size: 617 KB
- Stars: 19
- Watchers: 8
- Forks: 2
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# GenGPT3.jl
[GPT-3](https://en.wikipedia.org/wiki/GPT-3) as a generative function in [Gen.jl](https://www.gen.dev/), implemented by wrapping the [OpenAI API](https://openai.com/api/) in Gen's interface.
## Usage
Install both Gen and this package via the Julia Pkg REPL:
```julia-repl
pkg> add Gen
pkg> add GenGPT3
```
Add your OpenAI API key as an environment variable named `OPENAI_API_KEY`. You can follow [this guide](https://help.openai.com/en/articles/5112595-best-practices-for-api-key-safety), or set `ENV["OPENAI_API_KEY"]` to the value of your API key in the Julia REPL.
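For example, in the Julia REPL (the key below is a placeholder, not a real key):
```julia-repl
julia> ENV["OPENAI_API_KEY"] = "sk-your-key-here";  # placeholder value
```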
Now you can construct GPT-3 as a generative function and call generative function interface (GFI) methods on it:
```julia
using Gen, GenGPT3
# Construct GPT3GenerativeFunction
gpt3 = GPT3GF(model="davinci-002", max_tokens=256)
# Untraced execution
prompt = "What is the tallest mountain on Mars?"
output = gpt3(prompt)
# Traced execution
trace = simulate(gpt3, (prompt,))
# Constrained generation
constraints = choicemap((:output, "Olympus Mons."))
trace, weight = generate(gpt3, (prompt,), constraints)
```
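Once you have a trace, the sampled completion and its log probability can be read off with Gen's standard trace accessors (a brief sketch continuing from the snippet above):
```julia
# Read the sampled completion from the trace's `:output` address
completion = trace[:output]
# `get_retval` returns the same string as the function's return value
retval = get_retval(trace)
# `get_score` returns the log probability of the completion
logprob = get_score(trace)
```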
## Configuration
The constructor for a `GPT3GenerativeFunction` (or `GPT3GF` for short) can be used to configure a variety of options, documented below:
```julia
GPT3GenerativeFunction(;
    model = "davinci-002",
    temperature = 1.0,
    max_tokens = 1024,
    stop = nothing,
    encoding = GenGPT3.MODEL_ENCODINGS[model],
    api_key_lookup = () -> ENV["OPENAI_API_KEY"],
    organization_lookup = () -> ENV["OPENAI_ORGANIZATION"]
)
```
Constructs GPT-3 as a generative function, where sampling and scoring of completions are performed via calls to the OpenAI API.
The generative function takes a prompt as an (optional) argument, then samples and returns a completion. This represents a distribution over strings (up to `max_tokens` long) that end in the `stop` sequence. The completion is stored at the `:output` address of the resulting trace.
### Arguments
- `model::String`:
The pretrained model to query. Defaults to `"davinci-002"`.
- `temperature::Float64 = 1.0`:
The softmax temperature. Values between `0.0` and `2.0` are allowed.
Higher temperatures increase randomness. Note that if this is not set
to `1.0`, then the resulting log probabilities will no longer be normalized.
- `max_tokens::Int = 1024`:
The maximum number of output tokens generated (including the stop sequence).
- `stop::Union{String,Nothing} = nothing`:
  The stop sequence as a string. Defaults to the `<|endoftext|>` token if
  unspecified. If specified, the model is prevented from generating any
  `<|endoftext|>` tokens (to avoid multiple termination possibilities).
- `encoding::String = GenGPT3.MODEL_ENCODINGS[model]`:
  The tokenizer encoding for the model. For most models, this is `"cl100k_base"`.
- `api_key_lookup::Function`:
A zero-argument function that returns the OpenAI API key. Defaults to
looking up the `"OPENAI_API_KEY"` environment variable.
- `organization_lookup::Function`:
A zero-argument function that returns the OpenAI organization ID to use.
Defaults to the `"OPENAI_ORGANIZATION"` environment variable, if specified.
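Putting these options together, a more constrained configuration might look as follows (a sketch; the temperature, token budget, and newline stop sequence are illustrative choices, not defaults):
```julia
using Gen, GenGPT3

# Illustrative configuration: completions are capped at 64 tokens and
# terminate at the first newline. Because temperature != 1.0, the
# resulting log probabilities are no longer normalized.
gpt3 = GPT3GenerativeFunction(
    model = "davinci-002",
    temperature = 0.7,
    max_tokens = 64,
    stop = "\n"
)

output = gpt3("Q: What is the tallest mountain on Mars?\nA:")
```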
### Variants
GenGPT3.jl also provides support for batched LLM calls with [`MultiGPT3GF`](src/multi_fn.jl) and a "mixture-of-prompts" generative function called [`GPT3Mixture`](src/mixture.jl). `MultiGPT3GF` reduces the latency of making multiple API calls, and `GPT3Mixture` (which uses `MultiGPT3GF` under the hood) can be used to marginalize over uncertainty about the prompt.
Support for batched importance sampling of LLM generations is provided by [`GPT3ImportanceSampler`](src/importance.jl), which can be configured to have
separate model and proposal LLM generative functions.
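As a rough usage sketch (the exact argument signatures live in [`src/multi_fn.jl`](src/multi_fn.jl); here we assume `MultiGPT3GF` accepts the same keyword options as `GPT3GF` and is applied to a vector of prompts):
```julia
using Gen, GenGPT3

# Assumption: MultiGPT3GF is constructed like GPT3GF and applied to a
# vector of prompts, issuing the API calls as one batch.
multi_gpt3 = MultiGPT3GF(model="davinci-002", max_tokens=64)

prompts = [
    "What is the tallest mountain on Mars?",
    "What is the deepest canyon on Mars?"
]

# Assumption: the return value is a vector of completions, one per prompt.
trace = simulate(multi_gpt3, (prompts,))
outputs = get_retval(trace)
```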
## Utilities
Utilities for converting between strings and tokens are also included as part of this package (using functionality provided by [BytePairEncoding.jl](https://github.com/chengchingwen/BytePairEncoding.jl) and [TextEncodeBase.jl](https://github.com/chengchingwen/TextEncodeBase.jl)):
```julia-repl
julia> tokens = GenGPT3.tokenize("cl100k_base", "What is the tallest mountain on Mars?")
["What", " is", " the", " tallest", " mountain", " on", " Mars", "?"]
julia> ids = GenGPT3.encode("cl100k_base", tokens)
[3923, 374, 279, 82717, 16700, 389, 21725, 30]
julia> text = GenGPT3.id_detokenize("cl100k_base", ids)
"What is the tallest mountain on Mars?"
julia> ids = GenGPT3.id_tokenize("cl100k_base", text)
[3923, 374, 279, 82717, 16700, 389, 21725, 30]
julia> tokens = GenGPT3.decode("cl100k_base", ids)
["What", " is", " the", " tallest", " mountain", " on", " Mars", "?"]
```
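One common use of these utilities is checking how many tokens a prompt will consume before sending it to the API (a small example built only on the functions shown above):
```julia
using GenGPT3

prompt = "What is the tallest mountain on Mars?"

# Count the tokens a prompt occupies under the cl100k_base encoding,
# e.g. to stay within a model's context window.
n_tokens = length(GenGPT3.id_tokenize("cl100k_base", prompt))
println("Prompt uses $n_tokens tokens")
```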
Support for calling the OpenAI Embeddings API is also provided:
```julia-repl
julia> embedder = GenGPT3.Embedder(model="text-embedding-ada-002");
julia> embedder("What is the tallest mountain on Mars?")
1536-element Vector{Float64}:
0.009133313
-0.0017141271
-0.010894737
⋮
-0.0018739601
-0.013621684
-0.037864104
```
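The resulting vectors can be compared with standard linear algebra; for example, cosine similarity via Julia's `LinearAlgebra` standard library (a sketch using the `Embedder` shown above; the second sentence is an illustrative paraphrase):
```julia
using GenGPT3, LinearAlgebra

embedder = GenGPT3.Embedder(model="text-embedding-ada-002")

# Embed two sentences and compare them via cosine similarity,
# which is close to 1.0 for semantically similar text.
e1 = embedder("What is the tallest mountain on Mars?")
e2 = embedder("Which Martian mountain is the highest?")
similarity = dot(e1, e2) / (norm(e1) * norm(e2))
```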