Self-host Ollama, point Claude Code at it

Prereq

Mac M-series or Linux with 16GB RAM minimum; 32GB recommended, since the models in step 02 weigh 15-19GB
NVMe disk with 30GB free
Claude Code or OpenCode already installed
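
The RAM and disk figures above can be checked from a terminal before installing anything. A minimal sketch, assuming `sysctl -n hw.memsize` on macOS and `/proc/meminfo` on Linux; the function names are illustrative, not part of any tool.

```shell
# Sketch: report total RAM and free disk space in whole GB.

# Convert kilobytes to whole gigabytes (rounded down).
kb_to_gb() {
  echo $(( ${1:-0} / 1024 / 1024 ))
}

# Total physical RAM in GB: sysctl on macOS, /proc/meminfo on Linux.
total_ram_gb() {
  if [ "$(uname -s)" = "Darwin" ]; then
    echo $(( $(sysctl -n hw.memsize) / 1024 / 1024 / 1024 ))
  else
    kb_to_gb "$(awk '/^MemTotal:/ {print $2}' /proc/meminfo)"
  fi
}

# Free space in GB on the filesystem that holds the given path.
free_disk_gb() {
  kb_to_gb "$(df -Pk "$1" | awk 'NR==2 {print $4}')"
}

echo "RAM: $(total_ram_gb)GB, free disk: $(free_disk_gb "${HOME:-/}")GB"
```

Linux reports slightly less than the installed RAM, so a 16GB machine may print 15GB here.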

Steps

01. Install Ollama

bash

# macOS
brew install ollama
# Linux
curl -fsSL https://ollama.com/install.sh | sh

ollama --version  # expect 0.5.4 or newer
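
The version floor can also be checked in a script. This is a sketch: `version_ge` is a hypothetical helper, and it assumes `ollama --version` prints a dotted version number somewhere in its output.

```shell
# Sketch: succeed if dotted version $1 is greater than or equal to $2.
version_ge() {
  awk -v a="$1" -v b="$2" 'BEGIN {
    split(a, x, "."); split(b, y, ".")
    for (i = 1; i <= 3; i++) {
      # Missing components compare as 0.
      if (x[i] + 0 > y[i] + 0) exit 0
      if (x[i] + 0 < y[i] + 0) exit 1
    }
    exit 0
  }'
}

# Only run the check when ollama is actually on PATH.
if command -v ollama >/dev/null 2>&1; then
  installed=$(ollama --version 2>/dev/null | grep -Eo '[0-9]+\.[0-9]+\.[0-9]+' | head -n 1)
  if version_ge "${installed:-0}" 0.5.4; then
    echo "ok: ${installed}"
  else
    echo "too old: ${installed:-unknown}"
  fi
fi
```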

02. Pull a coder model

bash

ollama pull qwen3-coder:32b-q4_K_M   # ~19GB, fast
# or
ollama pull deepseek-coder-v3:14b-q8  # ~15GB, higher precision
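
Since each pull is a multi-gigabyte download, it can help to skip models that are already present. A minimal sketch, assuming `ollama list` prints a header row followed by one model per line with the tag in the first column; `has_model` is an illustrative name.

```shell
# Sketch: read `ollama list` output on stdin and succeed if the tag is present.
has_model() {
  awk -v want="$1" 'NR > 1 && $1 == want { found = 1 } END { exit !found }'
}

# Example (not executed here):
#   ollama list | has_model qwen3-coder:32b-q4_K_M || ollama pull qwen3-coder:32b-q4_K_M
```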

03. Serve the OpenAI-compat endpoint

bash

ollama serve  # binds to 127.0.0.1:11434 by default; set OLLAMA_HOST to listen elsewhere
# OpenAI-compatible routes live at /v1/*
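
Before wiring up a client, it is worth confirming the server answers. A sketch, assuming `curl` is installed and that the installed Ollama version exposes the OpenAI-style model listing at `/v1/models`; the helper names are illustrative, and 127.0.0.1:11434 is assumed as the address.

```shell
# Build the base URL from an optional host:port argument.
ollama_base_url() {
  echo "http://${1:-127.0.0.1:11434}"
}

# Succeed if the OpenAI-compatible model listing answers within 2 seconds.
ollama_up() {
  curl -fsS --max-time 2 "$(ollama_base_url "$1")/v1/models" >/dev/null 2>&1
}

if ollama_up; then
  echo "Ollama is answering on $(ollama_base_url)"
else
  echo "No answer; is 'ollama serve' running?"
fi
```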

04. Point Claude Code at it

bash

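# Claude Code sends Anthropic Messages API requests, not OpenAI ones, so this
# only works if the server at this URL answers Anthropic-style requests.
export ANTHROPIC_BASE_URL="http://localhost:11434/v1"
export ANTHROPIC_API_KEY="ollama"   # value is ignored, but the variable must be set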
claude --model qwen3-coder:32b-q4_K_M
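
Exporting these variables redirects every later `claude` call in that shell. A wrapper that sets them for a single invocation keeps the default setup untouched. This is a sketch: `claude_local` and the `CLAUDE_BIN` override are hypothetical names introduced here, the latter only so the wrapper can be dry-run.

```shell
# Sketch: run Claude Code against the local endpoint for one invocation only.
# Usage: claude_local MODEL [extra claude arguments...]
claude_local() {
  model=$1
  shift
  # Prefix assignments apply to this one command and are not exported to the shell.
  ANTHROPIC_BASE_URL="http://localhost:11434/v1" \
  ANTHROPIC_API_KEY="ollama" \
  "${CLAUDE_BIN:-claude}" --model "$model" "$@"
}

# Example (not executed here):
#   claude_local qwen3-coder:32b-q4_K_M
```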

For OpenCode, set the ollama provider in opencode.json as shown in the multi-provider tutorial.

05. Verify with a real task

bash

cd ~/projects/your-app
claude "read src/index.ts and suggest one optimization"

First response in 2-4s on M3 Max. No cloud, no telemetry, no bill.
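
To compare your own hardware against that figure, a rough timer is enough. The sketch below measures wall-clock time for a whole command at one-second resolution, so it reports total completion time rather than time to first token; `elapsed_seconds` is an illustrative name.

```shell
# Sketch: print how many whole seconds a command takes to finish.
elapsed_seconds() {
  start=$(date +%s)
  "$@" >/dev/null 2>&1
  end=$(date +%s)
  echo $(( end - start ))
}

# Example (not executed here), using Claude Code's non-interactive print mode:
#   elapsed_seconds claude -p "read src/index.ts and suggest one optimization"
```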

Q4_K_M: roughly 4 to 5 bits per weight. Fits bigger models in RAM. Expect ~2-3% quality loss on code tasks. For 90% of refactors you won’t notice.
Q8: 8 bits per weight. Nearly indistinguishable from fp16. Roughly twice the memory and twice the memory bandwidth of Q4. The difference shows on tricky reasoning (math, logic puzzles, rare idioms).
The honest test: run both on your actual codebase for a week. Q4 wins on speed and memory, Q8 wins when you catch Q4 hallucinating API signatures.
Unquantized fp16 only matters if you’re benchmarking against the paper. In daily dev work it’s overkill.
GGUF is the de facto format for local inference: llama.cpp and Ollama read the same file, and vLLM can load GGUF too, though its support is experimental.
Real rule: pick the largest model your RAM fits at Q4. 32B-Q4 beats 14B-Q8 for coding nine times out of ten.
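
The "largest model your RAM fits" rule comes down to simple arithmetic: weight size is roughly parameters times bits per weight, divided by 8. A sketch of that estimate follows; the effective bits-per-weight values (about 4.85 for Q4_K_M, 8.5 for Q8_0) are approximations assumed here, and `est_weights_gb` is an illustrative name.

```shell
# Sketch: estimate weight size in GB from parameters (billions) and bits per weight.
est_weights_gb() {
  awk -v p="$1" -v b="$2" 'BEGIN { printf "%.1f\n", p * b / 8 }'
}

# Bits-per-weight figures are approximate, not exact.
echo "32B at Q4_K_M (~4.85 bpw): $(est_weights_gb 32 4.85) GB"
echo "14B at Q8_0   (~8.5 bpw):  $(est_weights_gb 14 8.5) GB"
echo "32B at fp16   (16 bpw):    $(est_weights_gb 32 16) GB"
```

The first two results land close to the ~19GB and ~15GB download sizes quoted in step 02. Context cache and runtime overhead come on top of the weights, so leave a few GB of headroom.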

Next

Self-hosted Ollama on a home server
When to graduate from Ollama to vLLM
