https://github.com/imagewize/ollama-opencode-setup

Ollama + Open Code Setup
https://github.com/imagewize/ollama-opencode-setup
Last synced: 13 days ago
JSON representation
Ollama + Open Code Setup
Host: GitHub
URL: https://github.com/imagewize/ollama-opencode-setup
Owner: imagewize
License: mit
Created: 2025-11-18T02:56:14.000Z (7 months ago)
Default Branch: main
Last Pushed: 2025-11-18T03:50:06.000Z (7 months ago)
Last Synced: 2025-11-18T05:29:48.440Z (7 months ago)
Size: 46.9 KB
Stars: 0
Watchers: 0
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE.md
- Agents: docs/AGENTS.md
Awesome Lists containing this project

README

          # Ollama + Open Code Setup

Complete configuration and documentation for running Open Code CLI with local Ollama models.

## Table of Contents

- [Quick Start](#quick-start)

  - [Prerequisites](#prerequisites)

  - [Setup](#setup)

- [What's Included](#whats-included)

- [⚠️ Important: Tool Usage Discovery](#️-important-tool-usage-discovery)

- [Available Models](#available-models)

- [Common Commands](#common-commands)

  - [Ollama Management](#ollama-management)

  - [Creating Custom Models](#creating-custom-models)

  - [Open Code Usage](#open-code-usage)

- [Performance Tips](#performance-tips)

- [When to Use Local vs Cloud Models](#when-to-use-local-vs-cloud-models)

- [Documentation](#documentation)

- [Examples](#examples)

- [Troubleshooting](#troubleshooting)

- [Resources](#resources)

- [Contributing](#contributing)

- [License](#license)

---

## Quick Start

### Prerequisites

1. **Install Ollama**: [ollama.ai](https://ollama.ai)

2. **Install Open Code CLI**: [opencode.ai](https://opencode.ai)

### Setup

1. **Clone this repository:**

   ```bash

   git clone https://github.com/YOUR-USERNAME/ollama-opencode-setup.git ~/code/ollama-opencode-setup

   ```

2. **Start Ollama:**

   ```bash

   ollama serve

   ```

3. **Pull your first model (and build the recommended 16k variant):**

   ```bash

   ollama pull ministral-3:8b   # fast, reliable tool calling

   # Open Code can't set Ollama's num_ctx, so bake a 16k context variant:

   ollama create ministral-3:8b-16k -f modelfiles/ministral-3-8b-16k.Modelfile

   ```

4. **Use the configuration in your project:**

   ```bash

   # Option 1: Symlink into your project

   ln -s ~/code/ollama-opencode-setup/opencode.json ~/code/your-project/opencode.json

   # Option 2: Copy into your project

   cp ~/code/ollama-opencode-setup/opencode.json ~/code/your-project/opencode.json

   ```

5. **Run Open Code:**

   ```bash

   cd ~/code/your-project

   opencode

   ```

## What's Included

- **[opencode.json](opencode.json)** - Open Code configuration for Ollama models

- **[docs/LOCALLLMS.md](docs/LOCALLLMS.md)** - Complete documentation on local LLM setup

- **[docs/AGENTS.md](docs/AGENTS.md)** - Guide to using Open Code CLI agent modes

- **[examples/](examples/)** - Example workflows and prompts

- **[test-opencode.md](test-opencode.md)** - Test suite for validating Open Code CLI setup

## ⚠️ Important: Tool Usage Discovery

**Tool calling requires a model trained for it — and that capability is not tied to size or recency.**

All models in this config have been tested. Results (M1 16GB, 2026-05-31):

- ✅ **Ministral 3 8B** — full tool usage, **fastest tool-caller (~4s warm)**, no think-mode tax — **recommended daily driver**

- ✅ **Qwen3 models** (qwen3:8b-16k, qwen3:8b, qwen3:4b) — full tool usage confirmed, but verbose think mode (~26s)

- ❌ **DeepSeek-Coder-V2-Lite 16B** — Ollama reports `does not support tools`; fits RAM and is fast, but it's a code-completion/FIM model with no tool calling

- ❌ **Qwen3.5 9B / 4B** — outputs bash commands instead of invoking write tool

- ❌ **Phi-4** — Open Code CLI explicitly reports "does not support tools"

- ❌ **Gemma 4 E4B** — attempts tool call but sends malformed/incompatible call format

- ❌ **Mistral Nemo & Granite** — analysis only, cannot create files

Tool-call support is verified with [`scripts/tool-call-test.sh`](scripts/tool-call-test.sh). See [docs/LOCALLLMS.md](docs/LOCALLLMS.md) for full test details.

## Available Models

| Model | Size | Context | Tool Usage | Best For |

|-------|------|---------|------------|----------|

| `ministral-3:8b-16k` ⭐ | 6.0 GB | 16k | ✅ YES | **Recommended for Open Code** — 16k variant (num_ctx baked in) |

| `ministral-3:8b` | 6.0 GB | ~4k default | ✅ YES | Base model — fast tool use (~4s); runs at Ollama's small default context in Open Code |

| `qwen3:8b-16k` | 5.2 GB | 16k | ✅ YES | Multi-file analysis (larger context) |

| `qwen3:8b` | 5.2 GB | 8k | ✅ YES | General file operations (~26s) |

| `qwen3:4b` | 2.5 GB | 8k | ✅ YES | Quick file edits |

| `deepseek-coder-v2:16b` | 8.9 GB | 128k | ❌ NO | No tool support (`does not support tools`) — FIM/completion only |

| `qwen3.5:9b` | 6.6 GB | 32k | ❌ NO | Read-only analysis (too slow, 13+ min) |

| `qwen3.5:4b` | ~2.5 GB | 32k | ❌ NO | Read-only analysis only |

| `phi4:latest` | ~5 GB | 16k | ❌ NO | Read-only analysis only |

| `gemma4:e4b` | ~5.5 GB | 32k | ❌ NO | Read-only analysis only |

| `mistral-nemo:12b-instruct-2407-q4_K_M` | 7.5 GB | 8k | ❌ NO | Code review (read-only) |

| `granite3.1-moe` | 2.0 GB | 8k | ❌ NO | Fast analysis (read-only) |

## Common Commands

### Ollama Management

```bash

# List installed models

ollama list

# Run a model interactively

ollama run qwen3:8b

# Pull a new model

ollama pull mistral-nemo:12b-instruct-2407-q4_K_M

# Remove a model

ollama rm qwen3:4b

```

### Creating Custom Models

Open Code talks to Ollama via the OpenAI-compatible endpoint, which does **not** pass Ollama's `num_ctx`. To get a usable context window, bake it into a custom variant.

**Recommended — from a committed Modelfile (reproducible):**

```bash

ollama create ministral-3:8b-16k -f modelfiles/ministral-3-8b-16k.Modelfile

# Verify the context is baked in

ollama show ministral-3:8b-16k --modelfile | grep num_ctx

# PARAMETER num_ctx 16384

```

The Modelfile is just `FROM ministral-3:8b` + `PARAMETER num_ctx 16384`.

**Alternative — interactive `/save` (used for `qwen3:8b-16k`):**

```bash

# Start interactive session

ollama run qwen3:8b

# Set extended context

>>> /set parameter num_ctx 16384

Set parameter 'num_ctx' to '16384'

# Save as new model

>>> /save qwen3:8b-16k

Created new model 'qwen3:8b-16k'

# Exit

>>> /bye

```

### Open Code Usage

```bash

# Run with default model

opencode run "create a todo.md file"

# Specify model

opencode run "analyze this codebase" --model ollama/qwen3:8b-16k

# Interactive session

opencode

```

## Performance Tips

**Use the right model for the task:**

**File Creation/Modification (use a tool-capable model):**

- **Default / fastest tool use** → `ministral-3:8b-16k` (~4s, no think-mode tax, 16k context for Open Code) ⭐

- **Multi-file changes (larger context)** → `qwen3:8b-16k` (extended context + tool usage)

- **Standard file operations** → `qwen3:8b` (balanced, ~26s)

- **Quick file edits** → `qwen3:4b` (fastest Qwen3 model)

**Code Review/Analysis (read-only — any model works):**

- **Best quality review** → `mistral-nemo:12b-instruct-2407-q4_K_M` (excellent analysis)

- **Fast analysis** → `granite3.1-moe` (quickest)

- **Large context analysis** → `qwen3.5:4b` (32k context, read-only) — avoid `qwen3.5:9b` (too slow)

**Performance expectations (write tool call):**

| Task | ministral-3:8b-16k ⭐ | qwen3:8b | qwen3:8b-16k | Claude Sonnet 4 |

|------|-----------------------|----------|--------------|-----------------|

| Simple file write | **~4s** | 15-30s | 45-90s | 2-5s |

| Multi-file analysis | fast | 40-90s | 90-180s | 10-30s |

**Notes:**

- `ministral-3:8b` is the fastest tool-caller tested — no `` overhead

- Qwen3 models enter verbose "thinking mode" before execution (slower but successful)

- A model must be trained/templated for tools — fitting in RAM is not enough (e.g. DeepSeek-Coder-V2-Lite fits but has no tool calling)

## When to Use Local vs Cloud Models

### Use Local Models (Ollama) When:

- ✅ Working offline

- ✅ Processing sensitive/proprietary code

- ✅ Running batch operations overnight

- ✅ Learning/experimenting without API costs

- ✅ Privacy requirements mandate local processing

- ✅ Code review that doesn't require changes (any model)

- ⚠️ **File operations (use a tool-capable model — Ministral 3 8B or Qwen3)**

### Use Cloud Models (Claude API) When:

- ⏱️ Real-time interactive development

- ⚡ Complex multi-file operations requiring fast iteration

- 🚀 Time-sensitive tasks

- 📚 Working with very large codebases (200k+ context)

- 💰 Speed is more important than cost

- 🎯 Best code quality is critical

## Documentation

### [docs/OPENCODE-COMMANDS.md](docs/OPENCODE-COMMANDS.md)

**Complete Open Code CLI commands reference:**

- All built-in slash commands (15 commands documented)

- Bash command integration (`!command`)

- Agent switching (Tab key for build/plan agents)

- Custom command creation (file-based and config-based)

- Navigation and workflows

- Troubleshooting command issues

### [docs/LOCALLLMS.md](docs/LOCALLLMS.md)

Comprehensive guide to local LLM setup:

- Custom model creation

- Context window comparison (4k vs 8k vs 16k vs 200k)

- Ollama commands reference

- Model selection guidelines

- Troubleshooting guide

- Performance optimization

### [docs/AGENTS.md](docs/AGENTS.md)

Guide to using Open Code CLI agent modes:

- Build and plan agents (Tab key switching)

- Model capabilities for agent workflows

- Agent workflow patterns

- Controlling agent behavior

- Performance benchmarks by model

- Best practices and troubleshooting

### [test-opencode.md](test-opencode.md) & [RECOMMENDATIONS.md](RECOMMENDATIONS.md)

**Critical testing results:**

- ✅ Qwen3 models have full tool usage (file creation works)

- ❌ Mistral Nemo & Granite lack tool usage (analysis only)

- Model-by-model test results and recommendations

## Examples

Check the [examples/](examples/) directory for:

- Code review workflows

- Refactoring prompts

- Multi-file analysis examples

- Batch processing scripts

## Troubleshooting

### Ollama Not Running

```bash

# Check if Ollama is running

curl http://localhost:11434/v1/models

# Start Ollama

ollama serve

```

### Model Not Found

```bash

# Verify model exists

ollama list

# Pull model if missing

ollama pull qwen3:8b

```

### Slow Performance

- Use smaller models for simple tasks (`qwen3:4b`)

- Use standard context when extended context isn't needed (`qwen3:8b` instead of `qwen3:8b-16k`)

- Consider cloud models for time-sensitive work

See [docs/LOCALLLMS.md#troubleshooting](docs/LOCALLLMS.md#troubleshooting) for more details.

## Resources

- [Ollama Documentation](https://github.com/ollama/ollama)

- [Open Code Documentation](https://opencode.ai/docs)

- [Qwen3 Model Card](https://huggingface.co/Qwen/Qwen3-8B)

- [Mistral Nemo Documentation](https://mistral.ai/news/mistral-nemo/)

## Contributing

Contributions welcome! Please feel free to submit issues or pull requests with:

- New model configurations

- Performance optimizations

- Example workflows

- Documentation improvements

## License

MIT
ecosyste.ms

Data

Tools

Indexes

Applications

Experiments

Awesome

https://github.com/imagewize/ollama-opencode-setup

Awesome Lists containing this project

README