Ollama Setup
Hybrid vector search and RAG chat are gated behind a local Ollama instance. This is the BYOK (bring your own key) unlock for the Free tier — no paid subscription required.
Install Ollama
Pull the required models
# Embedding model — converts pages into vectors
ollama pull nomic-embed-text
# Chat model — answers questions using retrieved page excerpts
ollama pull mistral:7b
nomic-embed-text produces 768-dimensional vectors and runs comfortably on 8 GB of VRAM.
mistral:7b requires roughly 5 GB of VRAM. Any other Ollama chat model can be substituted: pull it and set PAGEPIPER_CHAT_MODEL to its name.
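To confirm the vector size an embedding model actually produces, you can ask Ollama to embed a short string and count the values it returns. The sketch below is only an illustration: it assumes Ollama is listening on its default address and uses Ollama's `/api/embed` endpoint. The helper names `embedding_dimension` and `fetch_embedding_dimension` are made up for this example and are not part of Pagepiper.

```python
import json
import urllib.request

# Assumed default Ollama address; use the same value as PAGEPIPER_OLLAMA_URL.
OLLAMA_URL = "http://localhost:11434"


def embedding_dimension(response_body: dict) -> int:
    """Return the vector length from an Ollama /api/embed response body."""
    vectors = response_body.get("embeddings") or []
    if not vectors:
        raise ValueError("response contains no embeddings")
    return len(vectors[0])


def fetch_embedding_dimension(model: str, base_url: str = OLLAMA_URL) -> int:
    """Embed a short probe string and report the resulting vector length."""
    payload = json.dumps({"model": model, "input": "dimension probe"}).encode("utf-8")
    request = urllib.request.Request(
        f"{base_url}/api/embed",
        data=payload,  # supplying a body makes this a POST request
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=60) as response:
        return embedding_dimension(json.load(response))
```

With Ollama running and the model pulled, `fetch_embedding_dimension("nomic-embed-text")` returns the dimension of that model's vectors. This is useful to know before switching embedding models.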
Configure Pagepiper
In your .env:
PAGEPIPER_OLLAMA_URL=http://localhost:11434
PAGEPIPER_EMBED_MODEL=nomic-embed-text
PAGEPIPER_CHAT_MODEL=mistral:7b
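A quick sanity check can catch a typo in a model name or a model that was never pulled. The following is a minimal sketch, not part of Pagepiper: it parses the `.env` text with a deliberately simple, hypothetical `parse_env` helper (no quoting or export handling) and compares the configured models against the list returned by Ollama's `/api/tags` endpoint. It assumes Ollama reports untagged pulls with a `:latest` suffix.

```python
import json
import urllib.request


def parse_env(text: str) -> dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


def normalize(name: str) -> str:
    """Apply Ollama's default tag so 'nomic-embed-text' matches 'nomic-embed-text:latest'."""
    return name if ":" in name else f"{name}:latest"


def missing_models(required: list[str], installed: list[str]) -> list[str]:
    """Return the required model names that are absent from the installed list."""
    installed_set = {normalize(name) for name in installed}
    return [model for model in required if normalize(model) not in installed_set]


def installed_models(base_url: str) -> list[str]:
    """List locally available model names via GET /api/tags."""
    with urllib.request.urlopen(f"{base_url}/api/tags", timeout=10) as response:
        body = json.load(response)
    return [entry["name"] for entry in body.get("models", [])]
```

A typical use reads the file with `parse_env(open(".env").read())`, then passes the two model values and `installed_models(env["PAGEPIPER_OLLAMA_URL"])` to `missing_models`. An empty result means both models are available.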
Then restart Pagepiper so it picks up the new settings.
Verify
Upload or re-index a document. The document card should show Embedding N / M pages during ingest. Once complete, the Chat tab becomes active.
Changing embedding models
If you switch PAGEPIPER_EMBED_MODEL, Pagepiper detects the dimension mismatch at startup, deletes the old vector database, and automatically re-embeds all indexed documents in the background. BM25 search remains available throughout.
Note
Re-embedding a large library can take 30-60 minutes depending on hardware.
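The startup behaviour described in this section comes down to comparing two numbers: the dimension recorded with the existing vectors and the dimension the configured model produces. The sketch below illustrates that decision only. Pagepiper's actual implementation is not shown in this guide, and `StartupPlan` and `plan_startup` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class StartupPlan:
    reset_vector_store: bool  # delete the old vectors
    reembed_documents: bool   # queue every indexed document for embedding again


def plan_startup(stored_dimension: Optional[int], model_dimension: int) -> StartupPlan:
    """Decide what to do at startup.

    stored_dimension is None when no vectors exist yet (hypothetical convention).
    """
    mismatch = stored_dimension is not None and stored_dimension != model_dimension
    # Vectors of different lengths cannot share an index, so a mismatch means
    # the old store is discarded and all documents are embedded again.
    return StartupPlan(reset_vector_store=mismatch, reembed_documents=mismatch)
```

Because the check depends only on vector length, switching between two models that share a dimension would pass unnoticed in this sketch. A real implementation might also record the model name.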