Architecture

This page describes Peregrine's system structure, layer boundaries, and key design decisions.


System Overview

┌─────────────────────────────────────────────────────────────┐
│                     Docker Compose                           │
│                                                             │
│  ┌──────────┐  ┌──────────┐  ┌───────┐  ┌───────────────┐ │
│  │  app     │  │  ollama  │  │ vllm  │  │ vision        │ │
│  │ :8501    │  │ :11434   │  │ :8000 │  │ :8002         │ │
│  │Streamlit │  │ Local LLM│  │ vLLM  │  │ Moondream2    │ │
│  └────┬─────┘  └──────────┘  └───────┘  └───────────────┘ │
│       │                                                      │
│  ┌────┴───────┐  ┌─────────────┐                           │
│  │ searxng    │  │  staging.db │                           │
│  │ :8888      │  │  (SQLite)   │                           │
│  └────────────┘  └─────────────┘                           │
└─────────────────────────────────────────────────────────────┘

┌─────────────────────────────────────────────────────────────┐
│                   Streamlit App Layer                        │
│                                                             │
│  app/app.py (entry point, navigation, sidebar task badge)  │
│                                                             │
│  app/pages/                                                 │
│    0_Setup.py       First-run wizard (gates everything)    │
│    1_Job_Review.py  Approve / reject queue                 │
│    2_Settings.py    All user configuration                 │
│    4_Apply.py       Cover letter gen + PDF export          │
│    5_Interviews.py  Kanban: phone_screen → hired           │
│    6_Interview_Prep.py  Research brief + practice Q&A     │
│    7_Survey.py      Culture-fit survey assistant           │
│                                                             │
│  app/wizard/                                                │
│    step_hardware.py ... step_integrations.py               │
│    tiers.py         Feature gate definitions               │
└─────────────────────────────────────────────────────────────┘

┌─────────────────────────────────────────────────────────────┐
│                     Scripts Layer                            │
│   (framework-independent — could be called by FastAPI)      │
│                                                             │
│  discover.py          JobSpy + custom board orchestration   │
│  match.py             Resume keyword scoring                │
│  db.py                All SQLite helpers (single source)    │
│  llm_router.py        LLM fallback chain                   │
│  generate_cover_letter.py  Cover letter generation         │
│  company_research.py  Pre-interview research brief         │
│  task_runner.py       Background daemon thread executor    │
│  imap_sync.py         IMAP email fetch + classify          │
│  sync.py              Push to external integrations        │
│  user_profile.py      UserProfile wrapper for user.yaml   │
│  preflight.py         Port + resource check                │
│                                                             │
│  custom_boards/       Per-board scrapers                   │
│  integrations/        Per-service integration drivers      │
│  vision_service/      FastAPI Moondream2 inference server  │
└─────────────────────────────────────────────────────────────┘

┌─────────────────────────────────────────────────────────────┐
│                      Config Layer                            │
│                                                             │
│  config/user.yaml          Personal data + wizard state    │
│  config/llm.yaml           LLM backends + fallback chains  │
│  config/search_profiles.yaml   Job search configuration    │
│  config/resume_keywords.yaml   Scoring keywords            │
│  config/blocklist.yaml         Excluded companies/domains  │
│  config/email.yaml             IMAP credentials            │
│  config/integrations/          Per-integration credentials │
└─────────────────────────────────────────────────────────────┘

┌─────────────────────────────────────────────────────────────┐
│                      Database Layer                          │
│                                                             │
│  staging.db (SQLite, local, gitignored)                     │
│                                                             │
│  jobs              Core pipeline — all job data            │
│  job_contacts      Email thread log per job                │
│  company_research  LLM-generated research briefs           │
│  background_tasks  Async task queue state                  │
│  survey_responses  Culture-fit survey Q&A pairs            │
└─────────────────────────────────────────────────────────────┘

Layer Boundaries

App layer (app/)

The Streamlit UI layer. Its only responsibilities are:

  • Reading from scripts/db.py helpers
  • Calling scripts/ functions directly or via task_runner.submit_task()
  • Rendering results to the browser

The app layer does not contain business logic. Database queries, LLM calls, and integrations all live in scripts/.

Scripts layer (scripts/)

This is the stable public API of Peregrine. Scripts are designed to be framework-independent — they do not import Streamlit and can be called from a CLI, FastAPI endpoint, or background thread without modification.

All personal data access goes through scripts/user_profile.py (UserProfile class). Scripts never read config/user.yaml directly.

All database access goes through scripts/db.py. No other script issues raw SQLite queries.

Config layer (config/)

Plain YAML files. Gitignored files contain secrets; .example files are committed as templates.


Background Tasks

scripts/task_runner.py provides a simple background thread executor for long-running LLM tasks.

from scripts.task_runner import submit_task

# db_path is the path to the SQLite database (staging.db)
# Queue a cover letter generation task
submit_task(db_path, task_type="cover_letter", job_id=42)

# Queue a company research task
submit_task(db_path, task_type="company_research", job_id=42)

Each task is recorded in the background_tasks table and moves through the statuses queued → running → completed or failed.

Dedup rule: Only one queued or running task per (task_type, job_id) pair is allowed at a time. Submitting a duplicate is a silent no-op.
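
The dedup rule can be illustrated with a minimal sketch using the standard sqlite3 module. The table columns and the helper name insert_task_if_absent are assumptions made for illustration; the actual schema and the internals of submit_task() may differ.

```python
import sqlite3

# Hypothetical schema: the real background_tasks table may have other columns.
SCHEMA = """
CREATE TABLE IF NOT EXISTS background_tasks (
    id        INTEGER PRIMARY KEY AUTOINCREMENT,
    task_type TEXT NOT NULL,
    job_id    INTEGER NOT NULL,
    status    TEXT NOT NULL DEFAULT 'queued'
)
"""


def insert_task_if_absent(conn, task_type, job_id):
    """Insert a queued row unless an active one exists. Returns True if inserted."""
    with conn:  # commits on success, rolls back on error
        active = conn.execute(
            "SELECT 1 FROM background_tasks "
            "WHERE task_type = ? AND job_id = ? "
            "AND status IN ('queued', 'running')",
            (task_type, job_id),
        ).fetchone()
        if active:
            return False  # duplicate submission: silent no-op
        conn.execute(
            "INSERT INTO background_tasks (task_type, job_id) VALUES (?, ?)",
            (task_type, job_id),
        )
        return True


conn = sqlite3.connect(":memory:")
conn.execute(SCHEMA)
first = insert_task_if_absent(conn, "cover_letter", 42)   # inserted
second = insert_task_if_absent(conn, "cover_letter", 42)  # skipped as duplicate
```

Completed and failed rows do not block a new submission, so a failed task can be retried by submitting it again.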

On startup: app/app.py marks any rows still in the queued or running state as failed. These rows belong to tasks that were interrupted by a server restart and will never finish.
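
A startup reset of this kind could look roughly like the following. The function name reset_interrupted_tasks and the table columns are hypothetical; only the table name and status values come from this page.

```python
import sqlite3


def reset_interrupted_tasks(conn):
    """Mark every queued or running row as failed; return the number of rows changed."""
    with conn:  # commits on success, rolls back on error
        cur = conn.execute(
            "UPDATE background_tasks SET status = 'failed' "
            "WHERE status IN ('queued', 'running')"
        )
    return cur.rowcount


# Demo against an in-memory database with an assumed, simplified schema.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE background_tasks ("
    "id INTEGER PRIMARY KEY, task_type TEXT, job_id INTEGER, status TEXT)"
)
conn.executemany(
    "INSERT INTO background_tasks (task_type, job_id, status) VALUES (?, ?, ?)",
    [
        ("cover_letter", 1, "queued"),
        ("company_research", 2, "running"),
        ("cover_letter", 3, "completed"),
    ],
)
reset_count = reset_interrupted_tasks(conn)
```

Because the worker threads live inside the app process, no queued or running row can still be making progress after a restart, which is why a blanket update is safe here.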

Sidebar indicator: app/app.py polls the background_tasks table every 3 seconds via a Streamlit fragment and displays a badge in the sidebar.


LLM Router

scripts/llm_router.py provides a single complete() call that tries backends in priority order and falls back transparently. See LLM Router for full documentation.
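
As a rough illustration of the fallback idea, the sketch below tries a list of backends in order and returns the first successful response. It is not the actual llm_router implementation: the function name complete_with_fallback, the AllBackendsFailed exception, and the callable-based backend interface are all assumptions.

```python
from typing import Callable, List, Tuple

# Hypothetical backend interface: a callable that takes a prompt and returns text.
Backend = Callable[[str], str]


class AllBackendsFailed(RuntimeError):
    """Raised when every backend in the chain has failed."""


def complete_with_fallback(prompt: str, backends: List[Tuple[str, Backend]]) -> str:
    """Try each (name, backend) pair in priority order; return the first result."""
    errors = []
    for name, call in backends:
        try:
            return call(prompt)
        except Exception as exc:  # any failure moves on to the next backend
            errors.append(f"{name}: {exc}")
    raise AllBackendsFailed("; ".join(errors))


# Stand-in backends for the demo; real ones would call a model server.
def offline_backend(prompt: str) -> str:
    raise ConnectionError("connection refused")


def echo_backend(prompt: str) -> str:
    return f"echo: {prompt}"


result = complete_with_fallback(
    "hello",
    [("primary", offline_backend), ("secondary", echo_backend)],
)
```

Collecting the per-backend errors keeps the fallback transparent to the caller while still leaving a trail to debug when the whole chain fails.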


Key Design Decisions

scripts/ is framework-independent

The scripts layer was deliberately kept free of Streamlit imports. This means the full pipeline can be migrated to a FastAPI or Celery backend without rewriting business logic.

All personal data via UserProfile

scripts/user_profile.py is the single source of truth for all user data. This makes it easy to swap the storage backend (e.g. from YAML to a database) without touching every script.
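
To show why a single wrapper makes the storage backend swappable, here is a minimal sketch. The constructor argument, the get() method, and the sample keys are hypothetical and are not taken from the real UserProfile class.

```python
from typing import Any, Callable, Dict


class UserProfile:
    """Hypothetical sketch; the real class in scripts/user_profile.py may differ."""

    def __init__(self, load: Callable[[], Dict[str, Any]]):
        # The loader hides where the data lives (YAML file, database, ...).
        self._data = load()

    def get(self, key: str, default: Any = None) -> Any:
        return self._data.get(key, default)


def load_from_memory() -> Dict[str, Any]:
    # Stand-in for a loader that would parse config/user.yaml.
    return {"name": "Example User", "wizard_complete": True}


profile = UserProfile(load_from_memory)
name = profile.get("name")
```

Scripts that only call profile.get(...) stay unchanged if the loader is later replaced with one that reads from a database.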

SQLite as staging layer

staging.db acts as the staging layer between discovery and external integrations. This lets discovery, matching, and the UI all run independently without network dependencies. External integrations (Notion, Airtable, etc.) are push-only and optional.

Tier system in app/wizard/tiers.py

FEATURES is a single dict that maps feature key → minimum tier. can_use(tier, feature) is the single gating function. New features are added to FEATURES in one place.
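
The gating pattern can be sketched as follows. The tier names, their ordering, and the feature keys are placeholders; only the names FEATURES and can_use come from this page. Treating features that are missing from FEATURES as ungated is also an assumption.

```python
# Hypothetical tiers, ordered from lowest to highest.
TIERS = ["free", "paid", "premium"]

# Feature key -> minimum tier. The keys here are placeholders.
FEATURES = {
    "example_basic": "free",
    "example_advanced": "paid",
}


def can_use(tier: str, feature: str) -> bool:
    """Return True if `tier` meets the minimum tier required for `feature`."""
    required = FEATURES.get(feature)
    if required is None:
        return True  # assumption: features not listed are ungated
    return TIERS.index(tier) >= TIERS.index(required)


free_gets_advanced = can_use("free", "example_advanced")
premium_gets_advanced = can_use("premium", "example_advanced")
```

With this shape, gating a new feature is a one-line change to FEATURES, and every call site keeps using the same can_use() check.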

Vision service is a separate process

Moondream2 requires torch and transformers, which are incompatible with the lightweight main conda environment. The vision service runs as a separate FastAPI process in a separate conda environment (job-seeker-vision), keeping the main env free of GPU dependencies.