
# LLM powered development for Neovim

**llm.nvim** is a plugin for all things LLM. It uses **llm-ls** as a backend.

This project is influenced by copilot.vim and tabnine-nvim.

Formerly **hfcc.nvim**.

![demonstration use of llm.nvim](assets/llm_nvim_demo.gif)

> [!NOTE]
> When using the Inference API, you will probably encounter some limitations. Subscribe to the *PRO* plan to avoid getting rate limited in the free tier.

## Features

### Code completion

This plugin supports "ghost-text" code completion, à la Copilot.

### Choose your model

Requests for code generation are made via an HTTP request.

You can use the Hugging Face Inference API or your own HTTP endpoint, provided it adheres to the APIs listed in [backend](#backend).

### Always fit within the context window

The prompt sent to the model will always be sized to fit within the context window, with the number of tokens determined using tokenizers.
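
As a rough illustration of what fitting the context window means, here is a minimal sketch in plain Lua. It is not the algorithm used by **llm-ls**: `count_tokens` and `fit_prefix` are hypothetical helpers, and the whitespace-based token count only stands in for a real tokenizer.

```lua
-- Hypothetical token counter: splits on whitespace.
-- A real setup would rely on the model's tokenizer instead.
local function count_tokens(text)
  local n = 0
  for _ in text:gmatch("%S+") do
    n = n + 1
  end
  return n
end

-- Keep the lines closest to the cursor (the end of `lines_before`)
-- until the token budget `context_window` would be exceeded.
local function fit_prefix(lines_before, context_window)
  local kept, used = {}, 0
  for i = #lines_before, 1, -1 do
    local cost = count_tokens(lines_before[i])
    if used + cost > context_window then
      break
    end
    table.insert(kept, 1, lines_before[i])
    used = used + cost
  end
  return table.concat(kept, "\n"), used
end

local prompt, used = fit_prefix({ "local a = 1", "local b = 2", "return a + b" }, 8)
print(prompt, used)
```

With a budget of 8 tokens, only the two lines nearest the cursor are kept and the oldest line is dropped.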

## Configuration

### Backend

**llm.nvim** can interface with multiple backends hosting models.

You can override the `url` of the backend with the `LLM_NVIM_URL` environment variable. If `url` is `nil`, it will default to the Inference API's default URL.

When `api_token` is set, it will be passed as a header: `Authorization: Bearer <api_token>`.
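
The two rules above can be sketched in plain Lua as follows. This is an illustration under assumptions, not the plugin's code: `resolve_url` and `build_headers` are hypothetical names, `DEFAULT_URL` is a placeholder rather than the real default, and the sketch assumes the environment variable wins over the configured value.

```lua
-- Placeholder only: the real default URL is not spelled out here.
local DEFAULT_URL = "https://inference.example/models"

-- Assumed precedence: LLM_NVIM_URL, then the configured url, then the default.
-- `getenv` can be injected for testing and defaults to os.getenv.
local function resolve_url(configured_url, getenv)
  getenv = getenv or os.getenv
  return getenv("LLM_NVIM_URL") or configured_url or DEFAULT_URL
end

-- Add the Authorization header only when a token is available.
local function build_headers(api_token)
  local headers = {}
  if api_token ~= nil then
    headers["Authorization"] = "Bearer " .. api_token
  end
  return headers
end
```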

#### Inference API

##### **backend = "huggingface"**


1. Create an API token in your Hugging Face account settings.

2. Define how the plugin will read your token. You have multiple options, in order of precedence:
   1. Pass `api_token = <your token>` in plugin opts. This is not recommended if you keep your configuration files under version control
   2. Set the `LLM_NVIM_HF_API_TOKEN` environment variable
   3. Define your `HF_HOME` environment variable and create a file containing your token at `$HF_HOME/token`
   4. Install the huggingface-cli and run `huggingface-cli login`. This will prompt you to enter your token and store it at the right path

3. Choose your model on the Hugging Face Hub and, in order of precedence, either:
   1. Set the `LLM_NVIM_MODEL` environment variable
   2. Pass `model = <model identifier>` in plugin opts

Note: the `model`'s value will be appended to the url like so: `{url}/{model}`, as this is how requests are routed to the right model.
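
The precedence rules above can be sketched in plain Lua. This is a hedged illustration, not the plugin's implementation: `resolve_api_token`, `resolve_model_url` and `read_first_line` are hypothetical names, and the `huggingface-cli login` option is not modeled because it only writes the token file for you.

```lua
-- Read the first line of a file, or return nil if it cannot be opened.
local function read_first_line(path)
  local file = io.open(path, "r")
  if not file then
    return nil
  end
  local line = file:read("*l")
  file:close()
  return line
end

-- Token precedence: plugin opts, then LLM_NVIM_HF_API_TOKEN, then $HF_HOME/token.
-- `getenv` and `read_file` can be injected for testing.
local function resolve_api_token(opts, getenv, read_file)
  getenv = getenv or os.getenv
  read_file = read_file or read_first_line
  if opts.api_token then
    return opts.api_token
  end
  local from_env = getenv("LLM_NVIM_HF_API_TOKEN")
  if from_env then
    return from_env
  end
  local hf_home = getenv("HF_HOME")
  if hf_home then
    return read_file(hf_home .. "/token")
  end
  return nil
end

-- Model precedence: LLM_NVIM_MODEL, then plugin opts. The result is `{url}/{model}`.
local function resolve_model_url(url, opts, getenv)
  getenv = getenv or os.getenv
  local model = getenv("LLM_NVIM_MODEL") or opts.model
  return url .. "/" .. model
end
```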

#### Ollama

##### **backend = "ollama"**


Refer to Ollama's documentation on how to run Ollama. Here is an example configuration:

    {
      model = "codellama:7b",
      url = "http://localhost:11434/api/generate",
      -- cf Ollama's API documentation for the supported parameters
      request_body = {
        -- Modelfile options for the model you use
        options = {
          temperature = 0.2,
          top_p = 0.95,
        },
      },
    }

Note: `model`'s value will be added to the request body.
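
To make the note concrete, here is a minimal sketch of how a request body could be assembled from `request_body`, the model and the prompt. `build_request_body` is a hypothetical helper, not part of the plugin, and the real backend code may add further fields.

```lua
-- Shallow-copy the user-provided request_body, then add the model and the prompt.
local function build_request_body(model, prompt, request_body)
  local body = {}
  for key, value in pairs(request_body or {}) do
    body[key] = value
  end
  body.model = model
  body.prompt = prompt
  return body
end

local body = build_request_body(
  "codellama:7b",
  "def add(a, b):",
  { options = { temperature = 0.2, top_p = 0.95 } }
)
```

The fields set in `request_body` are passed through untouched, so anything the backend accepts can be placed there.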

#### OpenAI

##### **backend = "openai"**

Refer to the documentation of your OpenAI-compatible server on how to run it. Here is an example configuration:

    {
      model = "codellama",
      url = "http://localhost:8000/v1/completions",
      -- cf your server's documentation for the supported parameters
      request_body = {}
    }

Note: `model`'s value will be added to the request body.

#### TGI

##### **backend = "tgi"**


Refer to TGI's documentation on how to run TGI. Here is an example configuration:

    {
      model = "bigcode/starcoder",
      url = "http://localhost:8080/generate",
      -- cf TGI's API documentation for the supported parameters
      request_body = {
        parameters = {
          temperature = 0.2,
          top_p = 0.95,
        },
      },
    }

### Models

#### Starcoder

    {
      tokens_to_clear = { "<|endoftext|>" },
      fim = {
        enabled = true,
        prefix = "<fim_prefix>",
        middle = "<fim_middle>",
        suffix = "<fim_suffix>",
      },
      model = "bigcode/starcoder",
      context_window = 8192,
      tokenizer = {
        repository = "bigcode/starcoder",
      },
    }

> [!NOTE]
> These are the default config values

#### CodeLlama

    {
      tokens_to_clear = { "<EOT>" },
      fim = {
        enabled = true,
        prefix = "<PRE> ",
        middle = " <MID>",
        suffix = " <SUF>",
      },
      model = "codellama/CodeLlama-13b-hf",
      context_window = 4096,
      tokenizer = {
        repository = "codellama/CodeLlama-13b-hf",
      },
    }

> [!NOTE]
> Spaces are important here
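
To show why the `fim` tokens (and their spacing) matter, here is a minimal sketch of how a fill-in-the-middle prompt is commonly assembled. `build_fim_prompt` is a hypothetical helper; the prefix, suffix, middle ordering is the usual layout for StarCoder-style models, and it is an assumption that the backend assembles the prompt exactly this way.

```lua
-- Wrap the text before and after the cursor with the model's FIM tokens.
-- The tokens are concatenated verbatim, so any space inside them is preserved.
local function build_fim_prompt(fim, before_cursor, after_cursor)
  if not fim.enabled then
    return before_cursor
  end
  return fim.prefix .. before_cursor .. fim.suffix .. after_cursor .. fim.middle
end

-- StarCoder's FIM tokens, used here as example values.
local starcoder_fim = {
  enabled = true,
  prefix = "<fim_prefix>",
  middle = "<fim_middle>",
  suffix = "<fim_suffix>",
}

local prompt = build_fim_prompt(
  starcoder_fim,
  "def add(a, b):\n    return ",
  "\n\nprint(add(1, 2))"
)
```

The model is then expected to generate the text that belongs between the two code fragments.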

### **llm-ls**

By default, **llm-ls** is installed by **llm.nvim** the first time it is loaded. The binary is downloaded from the release page and stored in:

    vim.api.nvim_call_function("stdpath", { "data" }) .. "/llm_nvim/bin"

If you develop locally, use mason, or built your own binary because your platform is not supported, set `lsp.bin_path` to the path of the binary. You can also start **llm-ls** over TCP using the `--port [PORT]` option, which is useful when using a debugger.

`lsp.version` is used only when **llm.nvim** downloads **llm-ls** from the release page.
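
As an illustration of the TCP setup, the fragment below points the plugin at an **llm-ls** process that was started separately. The `lsp.host` and `lsp.port` fields are listed in the Setup section; the host and port values are assumptions made for this example.

```lua
-- Assumes llm-ls was started beforehand with `--port 12345` on the same machine.
require('llm').setup({
  lsp = {
    host = "localhost", -- assumed host
    port = 12345,       -- assumed port, must match the value passed to --port
  },
})
```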

#### Mason

You can install **llm-ls** via mason.nvim. To do so, run the following command:

    :MasonInstall llm-ls

Then reference **llm-ls**'s path in your configuration:

    {
      -- ...
      lsp = {
        bin_path = vim.api.nvim_call_function("stdpath", { "data" }) .. "/mason/bin/llm-ls",
      },
      -- ...
    }

### Tokenizer

**llm-ls** uses **tokenizers** to make sure the prompt fits the `context_window`.

To configure it, you have a few options:

* No tokenization, **llm-ls** will count the number of characters instead:

      tokenizer = nil,

* From a local file on your disk:

      tokenizer = {
        path = "/path/to/my/tokenizer.json"
      }

* From a Hugging Face repository, **llm-ls** will attempt to download `tokenizer.json` at the root of the repository:

      tokenizer = {
        repository = "myusername/myrepo",
        api_token = nil -- optional, in case the API token used for the backend is not the same
      }

* From an HTTP endpoint, **llm-ls** will attempt to download a file via an HTTP GET request:

      tokenizer = {
        url = "", -- URL of the tokenizer file
        to = "/download/path/of/mytokenizer.json"
      }

### Suggestion behavior

You can tune the way the suggestions behave:
- `enable_suggestions_on_startup` lets you enable or disable "suggest-as-you-type" suggestions on Neovim startup. You can then toggle auto suggest with `LLMToggleAutoSuggest` (see [Commands](#commands))
- `enable_suggestions_on_files` lets you enable suggestions only on files that match the patterns you provide. It can either be a string or a list of strings, for example:
  - to match on all types of buffers: `enable_suggestions_on_files = "*"`
  - to match on all files in `my_project/`: `enable_suggestions_on_files = "/path/to/my_project/*"`
  - to match on all Python and Rust files: `enable_suggestions_on_files = { "*.py", "*.rs" }`
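
Put together, the two options could be combined as in the fragment below. The values are only an example: suggestions are off at startup and limited to Python and Rust files once enabled.

```lua
require('llm').setup({
  -- Start with automatic suggestions disabled; toggle them later on demand.
  enable_suggestions_on_startup = false,
  -- Only offer suggestions in Python and Rust files.
  enable_suggestions_on_files = { "*.py", "*.rs" },
})
```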

### Commands

**llm.nvim** provides the following commands:

- `LLMToggleAutoSuggest` enables/disables automatic "suggest-as-you-type" suggestions
- `LLMSuggestion` is used to manually request a suggestion
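
If you prefer key mappings over typing the commands, something like the following sketch could work. The key choices are arbitrary assumptions, and mapping `LLMSuggestion` in insert mode is a guess at the most convenient place to request a suggestion.

```lua
-- Toggle automatic suggestions from normal mode (key choice is arbitrary).
vim.keymap.set("n", "<leader>lt", "<cmd>LLMToggleAutoSuggest<CR>", { desc = "Toggle llm.nvim auto suggest" })
-- Manually request a suggestion while typing (key choice is arbitrary).
vim.keymap.set("i", "<C-l>", "<cmd>LLMSuggestion<CR>", { desc = "Request an llm.nvim suggestion" })
```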

### Package manager

#### Using packer

    use {
      'huggingface/llm.nvim',
      config = function()
        require('llm').setup({}) -- cf Setup
      end
    }

#### Using lazy.nvim

    {
      'huggingface/llm.nvim',
      opts = {}, -- cf Setup
    }

#### Using vim-plug

    Plug 'huggingface/llm.nvim'

Then, in your Lua configuration:

    require('llm').setup({
      -- cf Setup
    })

### Setup

    local llm = require('llm')

    llm.setup({
      api_token = nil, -- cf the Inference API section
      model = "bigcode/starcoder", -- the model ID, behavior depends on backend
      backend = "huggingface", -- backend ID, "huggingface" | "ollama" | "openai" | "tgi"
      url = nil, -- the http url of the backend
      tokens_to_clear = { "<|endoftext|>" }, -- tokens to remove from the model's output
      -- parameters added to the request body; values are arbitrary, any field = value pair set here is passed as is to the backend
      request_body = {
        parameters = {
          max_new_tokens = 60,
          temperature = 0.2,
          top_p = 0.95,
        },
      },
      -- set this if the model supports fill in the middle
      fim = {
        enabled = true,
        prefix = "<fim_prefix>",
        middle = "<fim_middle>",
        suffix = "<fim_suffix>",
      },
      debounce_ms = 150,
      accept_keymap = "<Tab>",
      dismiss_keymap = "<S-Tab>",
      tls_skip_verify_insecure = false,
      -- llm-ls configuration, cf llm-ls section
      lsp = {
        bin_path = nil,
        host = nil,
        port = nil,
        version = "0.5.2",
      },
      tokenizer = nil, -- cf Tokenizer paragraph
      context_window = 8192, -- max number of tokens for the context window
      enable_suggestions_on_startup = true,
      enable_suggestions_on_files = "*", -- pattern matching syntax to enable suggestions on specific files, either a string or a list of strings
    })
