LLM Backend Setup (Optional)¶
An LLM backend unlocks receipt OCR, recipe suggestions at levels L3–L4, and style auto-classification. Everything else works without one.
You can use any OpenAI-compatible inference server: Ollama, vLLM, LM Studio, a local llama.cpp server, or a commercial API.
BYOK — Bring Your Own Key¶
BYOK means you provide your own LLM backend. Paid AI features are unlocked at any tier when a valid backend is configured. You pay for your own inference; Kiwi just uses it.
Choosing a backend¶
| Backend | Best for | Notes |
|---|---|---|
| Ollama | Local, easy setup | Recommended for getting started |
| vLLM | Local, high throughput | Better for faster hardware |
| OpenAI API | No local GPU | Requires paid API key |
| Anthropic API | No local GPU | Requires paid API key |
Ollama setup (recommended)¶
```bash
# Install Ollama
curl -fsSL https://ollama.ai/install.sh | sh

# Pull a model — llama3.1 8B works well for recipe tasks
ollama pull llama3.1

# Verify it's running
ollama list
```
In your Kiwi .env:
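A minimal sketch, assuming the same `LLM_*` variable names as the OpenAI example later on this page and Ollama's default OpenAI-compatible endpoint (`http://localhost:11434/v1`); the exact `LLM_BACKEND` value may differ in your Kiwi version:

```
LLM_BACKEND=ollama
LLM_BASE_URL=http://localhost:11434/v1
LLM_API_KEY=ollama   # Ollama ignores the key, but some clients require a non-empty value
LLM_MODEL=llama3.1
```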
**Docker networking:** use `host.docker.internal` instead of `localhost` when Ollama runs on your host and Kiwi runs in Docker.
OpenAI-compatible API¶
Any OpenAI-compatible endpoint works; point Kiwi at it in your .env:

```
LLM_BACKEND=openai
LLM_BASE_URL=https://api.openai.com/v1
LLM_API_KEY=sk-your-key-here
LLM_MODEL=gpt-4o-mini
```
Verify the connection¶
In the Kiwi Settings page, the LLM status indicator shows whether the backend is reachable. A green checkmark means OCR and L3–L4 recipe suggestions are active.
What LLM is used for¶
| Feature | LLM required |
|---|---|
| Receipt OCR (line-item extraction) | Yes |
| Recipe suggestions L1 (pantry match) | No |
| Recipe suggestions L2 (substitution) | No |
| Recipe suggestions L3 (style templates) | Yes |
| Recipe suggestions L4 (full generation) | Yes |
| Style auto-classifier | Yes |
L1 and L2 suggestions use deterministic matching — they work without any LLM configured. See Recipe Engine for the full algorithm breakdown.
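For intuition, an L1-style pantry match can be sketched as plain set arithmetic; no model is involved. The function name and coverage threshold below are illustrative assumptions, not Kiwi's actual implementation:

```python
def pantry_match(recipe_ingredients: set[str], pantry: set[str],
                 min_coverage: float = 0.8) -> tuple[bool, set[str]]:
    """L1-style match: does the pantry cover enough of the recipe?

    Returns (matched, missing_ingredients). Purely deterministic:
    the same inputs always give the same answer, so no LLM is needed.
    """
    missing = recipe_ingredients - pantry
    coverage = 1 - len(missing) / len(recipe_ingredients)
    return coverage >= min_coverage, missing
```

With 4 of 5 ingredients on hand, `pantry_match({"pasta", "tomato", "garlic", "oil", "basil"}, {"pasta", "tomato", "garlic", "oil", "salt"})` reports a match and names `basil` as the gap.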
Model recommendations¶
- Receipt OCR: any model with vision capability (LLaVA, GPT-4o, etc.)
- Recipe suggestions: 7B–13B instruction-tuned models work well; larger models produce more creative L4 output
- Style classification: small models handle this fine (3B+)