Operator Guide · Chapter 08

AI integration

The harness assumes AI is part of your operation, not separate from it. This chapter covers how to integrate AI without becoming dependent on a single provider, without leaking sensitive data, and without surrendering decisions you should be making yourself.

The split: local vs. cloud

Two tiers of AI live inside the harness:

Local models (Ollama)

Models running on your own hardware. Free at the margin. Always available. Qualitatively weaker than cloud models — but for many tasks, weaker is fine.

What local models are good at:

Classification ("is this email a vendor invoice?")
Structured extraction ("pull the date and amount from this PDF text")
Summarization
Daily morning brief synthesis
Transcription (with faster-whisper)
Anything you run 100+ times per day

What local models are bad at:

Long-context reasoning
Code generation
Subtle writing
Anything where the cost of being wrong is high

Cloud models (Claude, GPT, Gemini)

API-based models. Cost real money per call. Higher quality. Latency from network.

What cloud models are good at:

Heavy reasoning
Code generation and debugging
Long-context analysis (reading multiple files)
Production-quality writing
Customer-facing AI

What cloud models are bad at:

High-frequency cheap tasks (the cost adds up)
Anything where you'd be uncomfortable sending the data over the network
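
The split above can be written down as a small routing rule. This is a minimal sketch, not part of the harness: the Task fields, the choose_tier name, and the 100-runs-per-day cutoff (borrowed from the "100+ times per day" guideline) are illustrative assumptions.

```python
from dataclasses import dataclass


@dataclass
class Task:
    """Hypothetical description of a job you want an AI model to do."""
    name: str
    runs_per_day: int = 1
    contains_sensitive_data: bool = False
    needs_heavy_reasoning: bool = False  # code, long context, subtle writing
    high_cost_of_error: bool = False


def choose_tier(task: Task, high_frequency: int = 100) -> str:
    """Return "local" or "cloud" for a task, following the chapter's split."""
    # Data you would not send over the network stays local, no exceptions.
    if task.contains_sensitive_data:
        return "local"
    # High-frequency work is where per-call cloud cost adds up.
    if task.runs_per_day >= high_frequency:
        return "local"
    # Heavy reasoning and high-stakes output are where local models fall short.
    if task.needs_heavy_reasoning or task.high_cost_of_error:
        return "cloud"
    # Everything else defaults to local: free at the margin, always available.
    return "local"
```

Sensitivity is checked first on purpose: a task that needs heavy reasoning but touches sensitive data still stays local, and you accept the weaker result.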

Setting up Ollama

If you don't have Ollama installed yet, install it. On macOS:

brew install ollama
ollama serve &
ollama pull qwen3:4b   # 2.5GB, fast
ollama pull qwen3:8b   # 5.2GB, better quality, slower

Transcription is separate: faster-whisper is a Python package, not an Ollama model. Install it on its own (pip install faster-whisper) if you need transcription.
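
Once a model is pulled, a classification call against the local server might look like the sketch below. It uses Ollama's /api/generate endpoint with streaming turned off. The function names, the prompt wording, and the handling of a leading <think> block (which some reasoning models emit) are assumptions for illustration, not harness code.

```python
import json
import re
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default port


def build_request(prompt: str, model: str = "qwen3:4b") -> urllib.request.Request:
    """Build a non-streaming generate request for the local Ollama server."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def parse_yes_no(text: str) -> bool:
    """Treat the reply as YES only if the visible answer starts with it."""
    # Assumption: drop any <think>...</think> block before reading the answer.
    visible = re.sub(r"<think>.*?</think>", "", text, flags=re.DOTALL)
    return visible.strip().upper().startswith("YES")


def is_vendor_invoice(email_text: str, timeout: float = 60.0) -> bool:
    """Ask the local model one narrow question about one email."""
    prompt = (
        "Answer with exactly YES or NO. Is this email a vendor invoice?\n\n"
        + email_text
    )
    with urllib.request.urlopen(build_request(prompt), timeout=timeout) as resp:
        body = json.loads(resp.read().decode("utf-8"))
    return parse_yes_no(body.get("response", ""))
```

Narrow yes/no questions like this are the kind of task where a 4B model is usually good enough, and a wrong answer is cheap to correct.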

Model storage location

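By default, Ollama stores models in ~/.ollama/models. If you're tight on internal disk, set OLLAMA_MODELS to a directory on your external drive. The variable has to be set in the environment of whatever process runs ollama serve, so a line in ~/.zshrc covers servers you start from a terminal but not ones started by cron or launchd: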

# In ~/.zshrc or wherever you set env vars
export OLLAMA_MODELS=/Volumes/your-drive/ollama/models

Keeping Ollama running

Ollama needs to run as a background service. Two options:

Manual: ollama serve & in a terminal. Lasts until you reboot or, in most shell setups, until you close that terminal.
Launchd (macOS): A .plist file in ~/Library/LaunchAgents/ that starts Ollama at login.

The vault template doesn't ship a launchd plist (varies too much by user). This guide includes a sample at the end of the chapter if you want to wire it up.

Watchdog

If Ollama dies, your cron jobs that depend on it will fail silently. Add a watchdog:

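# Every 5 minutes. cron runs with a minimal PATH, so call ollama by absolute path
# (/opt/homebrew/bin on Apple Silicon, /usr/local/bin on Intel Homebrew).
*/5 * * * * curl -sf http://localhost:11434/api/version >/dev/null || /opt/homebrew/bin/ollama serve >/dev/null 2>&1 &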

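This restarts Ollama within 5 minutes of it going down. cron does not read ~/.zshrc, so if you moved model storage, set OLLAMA_MODELS in the crontab as well. If you run Ollama under launchd with KeepAlive, launchd already restarts it and you can skip the watchdog.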
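
For jobs that depend on Ollama, a pre-flight check turns a silent failure into a loud one. The sketch below polls the same /api/version endpoint the watchdog uses. The function names and the opener parameter (there so the check can be exercised without a running server) are assumptions, not harness code.

```python
import urllib.request

VERSION_URL = "http://localhost:11434/api/version"


def ollama_is_up(url: str = VERSION_URL, timeout: float = 3.0,
                 opener=urllib.request.urlopen) -> bool:
    """Return True if the Ollama server answers its version endpoint."""
    try:
        with opener(url, timeout=timeout) as resp:
            return resp.status == 200
    except OSError:
        # URLError, connection refused, and timeouts are all OSError subclasses.
        return False


def require_ollama(check=ollama_is_up) -> None:
    """Abort the calling job with a clear message if Ollama is down."""
    if not check():
        raise SystemExit("Ollama is not responding on localhost:11434; aborting job")
```

Call require_ollama() at the top of any scheduled script that needs a local model, so the failure shows up in the job's output instead of as a missing brief.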

Cloud model setup

Claude and GPT both have CLI tools and APIs. Pick one as your primary, but set up at least two so you have a fallback.

For Claude, export your Anthropic API key in your shell config:

export ANTHROPIC_API_KEY=sk-...

Same pattern for OpenAI:

export OPENAI_API_KEY=sk-...

Don't put API keys in files committed to git. Don't put them in workspace/MEMORY.md. Use shell env or a credentials manager.
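
Scripts can then read keys from the environment and fail clearly when one is missing. A minimal sketch: the two variable names come from the exports above, while the function names and the provider labels are assumptions for illustration.

```python
import os

# Provider label -> environment variable that holds its key.
PROVIDER_ENV_VARS = {
    "anthropic": "ANTHROPIC_API_KEY",
    "openai": "OPENAI_API_KEY",
}


def available_providers(env=None) -> list[str]:
    """List providers whose API key is present in the environment."""
    env = os.environ if env is None else env
    return [name for name, var in PROVIDER_ENV_VARS.items()
            if env.get(var, "").strip()]


def get_key(provider: str, env=None) -> str:
    """Return the key for a provider, or raise without leaking anything."""
    env = os.environ if env is None else env
    var = PROVIDER_ENV_VARS[provider]
    key = env.get(var, "").strip()
    if not key:
        # Name the variable in the error; never echo a key value into logs.
        raise RuntimeError(f"{var} is not set; export it in your shell config")
    return key
```

Running available_providers() at the start of a session is a quick way to confirm your fallback is actually wired up.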

The AI tool: Claude Code (or Cursor, or similar)

Beyond raw API calls, you'll use an AI tool that can read your filesystem, edit files, run shell commands. Claude Code, Cursor, Continue, Aider — all in this category.

Whichever you pick, it reads an instructions file at session start to orient itself: CLAUDE.md for Claude Code, and an equivalent rules or conventions file for the other tools. The vault template includes a CLAUDE.md you can adapt.

The key configuration: tell the AI tool that your vault is the primary working directory. Everything else (downloads folder, home directory, scratch directories) is exceptional.

Slash commands

Slash commands are short prompts mapped to common workflows. They turn "I want to close out the session" into a single short command (/logs).

The vault template includes 5 starter slash commands:

/logs — session-closing protocol
/file — categorize an inbox item
/weekly-retro — Sunday-night retrospective
/daily-brief — generate today's brief on demand
/audit — surface stale state in the vault

These live in vault-template/_System/claude-commands/. To make Claude Code find them, symlink:

ln -s /path/to/your-vault/_System/claude-commands ~/.claude/commands

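After the symlink, typing /logs in Claude Code reads your-vault/_System/claude-commands/logs.md and runs the prompt inside. If ~/.claude/commands already exists as a directory, ln -s creates the link inside it instead of replacing it, so move the existing directory aside first.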
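
To confirm the link resolves, you can list what the commands directory exposes. A small sketch, assuming each command is a .md file named after the command as described above; list_slash_commands is a hypothetical helper, not part of the template.

```python
from pathlib import Path


def list_slash_commands(commands_dir: Path) -> list[str]:
    """Return the slash commands defined by the .md files in commands_dir."""
    # is_dir() follows symlinks, so a missing or dangling link yields [].
    if not commands_dir.is_dir():
        return []
    return sorted(f"/{p.stem}" for p in commands_dir.glob("*.md"))


if __name__ == "__main__":
    for command in list_slash_commands(Path.home() / ".claude" / "commands"):
        print(command)
```

An empty result usually means the symlink is missing or points at a path that no longer exists.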

Writing your own slash commands

After running the starter five for a couple of weeks, you'll notice tasks you do repeatedly. Each is a candidate for a new slash command.

Slash command format:

# /command-name — Short description

You are doing X. Walk through the steps.

## Step 1: ...
## Step 2: ...
## Step 3: ...

## Final report
[What to tell the user]

Be specific about what files to read, what to update, what to report back. AI tools follow these instructions literally.
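
Because the instructions are followed literally, it helps to check that a new command file has the parts the format calls for. The linter below is a hypothetical sketch that looks only for the title line, at least one step heading, and the final report section; the function name and the messages are assumptions.

```python
import re


def lint_command(markdown: str) -> list[str]:
    """Return a list of structural problems in a slash command file."""
    problems = []
    lines = markdown.strip().splitlines()
    # The first line should name the command, e.g. "# /logs ...".
    if not lines or not re.match(r"^# /[a-z0-9-]+", lines[0]):
        problems.append("first line should be a '# /command-name' title")
    # At least one numbered step heading.
    if not any(re.match(r"^## Step \d+", line) for line in lines):
        problems.append("no '## Step N' sections")
    # A closing section that says what to report back.
    if not any(line.strip().lower() == "## final report" for line in lines):
        problems.append("missing '## Final report' section")
    return problems
```

A structural check cannot tell you whether the steps are specific enough; that still takes a trial run and a read of what the tool actually did.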

The morning brief

The most important AI integration. Cron at 6 AM. Reads from BRAIN.md, yesterday's daily log, loop signals, weather, inbox count. Calls local Ollama for synthesis. Sends iMessage. Writes markdown file.

The script ships with the template (_System/scripts/morning-brief.sh and morning-brief-synth.py). You'll adapt the INPUTS section to your domain — adding queries to your POS, bank, CRM, or whatever else you need surfaced.

Tuning the prompt is where most operators spend the most time. The default prompt is grounded (no hallucination) but conservative (says "Nothing pressing" often). Adjust toward whatever signal you need.
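
One way to keep a synthesis prompt grounded is to hand the model labeled inputs and tell it to use nothing else. The sketch below is not the shipped morning-brief-synth.py; the function name, the rule wording, and the section labels are assumptions that show the general shape.

```python
def build_brief_prompt(inputs: dict[str, str]) -> str:
    """Assemble a grounded prompt from labeled input sections."""
    sections = []
    for label, text in inputs.items():
        # Mark empty inputs explicitly so the model does not fill the gap.
        body = text.strip() or "(no data)"
        sections.append(f"### {label}\n{body}")
    rules = (
        "You are writing a morning brief for an operator.\n"
        "Use ONLY the facts in the sections below. "
        "Do not invent numbers, names, or events.\n"
        "If nothing needs attention, say 'Nothing pressing.'\n"
        "Keep it under 150 words."
    )
    return rules + "\n\n" + "\n\n".join(sections)
```

Loosening the "Nothing pressing" rule or adding a section for a new data source are the two adjustments that change the brief's behavior most directly.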

Cross-session memory

Claude Code, Cursor, and similar tools have memory systems that persist across conversations. Use them.

What to save in memory:

Your role and primary domain (so AI doesn't re-ask each session)
Your communication preferences (terse vs. detailed, emojis or not)
Stable facts about your operation (vault location, key paths, recurring people)
Lessons from past sessions ("we tried X, it didn't work because Y")

What NOT to save:

Ephemeral state (today's priorities — that's BRAIN.md)
Anything that's already in the vault (don't duplicate)
Secrets, credentials, PII

The vault has long-term memory in workspace/MEMORY.md. The AI tool has memory at the OS/account level. They serve different purposes — the vault is portable across AI tools; the AI tool memory is portable across vaults.

The PII boundary

Some inputs should NEVER reach a cloud AI:

SSNs, full credit card numbers, bank account numbers, routing numbers
Passport / ID numbers
Health records (HIPAA)
Employee compensation (depending on agreements)
Customer payment details

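The harness enforces this with a pii-scan.py hook that runs before any file write. The hook detects common PII patterns and blocks the write, which keeps PII out of the files your AI tool reads. The hook only sees file writes, so what you type or paste into a prompt is still yours to police.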

Set up the hook before your first session writes to a file. The hook ships with the template at _System/hooks/pii-scan.py.
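
To give a sense of what "common PII patterns" can mean, here is a minimal detector for two of them: US Social Security numbers in dashed form, and card numbers validated with the Luhn checksum. This is an illustrative sketch, not the shipped pii-scan.py, and the patterns and names are assumptions.

```python
import re

SSN_RE = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")
# 13 to 19 digits, optionally separated by single spaces or hyphens.
CARD_RE = re.compile(r"\b(?:\d[ -]?){13,19}\b")


def luhn_ok(digits: str) -> bool:
    """Luhn checksum: double every second digit from the right."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def find_pii(text: str) -> list[str]:
    """Return the kinds of PII found in text (empty list if none)."""
    findings = []
    if SSN_RE.search(text):
        findings.append("ssn")
    for match in CARD_RE.finditer(text):
        digits = re.sub(r"\D", "", match.group())
        # The Luhn check filters out order numbers and other long digit runs.
        if 13 <= len(digits) <= 19 and luhn_ok(digits):
            findings.append("card_number")
            break
    return findings
```

A hook wrapper would exit non-zero when find_pii returns anything; the exact contract for blocking a write depends on the AI tool you use. Pattern matching catches common formats only, so treat it as a backstop, not a guarantee.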

For more on document filing with PII, see chapter 10.

Concentration risk

If a single AI provider supplies more than 60% of your AI capability, you have concentration risk. Anthropic could change pricing, OpenAI could deprecate a model, Ollama could have a security issue.

Plan a fallback. The cheap pattern: keep API keys for two cloud providers and one local model. If your primary breaks, you can swap in 5 minutes.

Don't pay for the fallback in advance. Just have it ready.
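
Both halves of this section can be sketched in a few lines: a fallback chain that tries providers in order, and a check for any provider over the 60% line. Measuring "capability" by call count is an assumption made here for simplicity, and the function names are hypothetical.

```python
from typing import Callable, Sequence


def ask_with_fallback(
    prompt: str,
    providers: Sequence[tuple[str, Callable[[str], str]]],
) -> tuple[str, str]:
    """Try each (name, call) pair in order; return (name, answer)."""
    errors = []
    for name, call in providers:
        try:
            return name, call(prompt)
        except Exception as exc:  # any failure moves on to the next provider
            errors.append(f"{name}: {exc}")
    raise RuntimeError("all providers failed: " + "; ".join(errors))


def concentrated(calls_by_provider: dict[str, int],
                 threshold: float = 0.60) -> list[str]:
    """Return providers whose share of calls exceeds the threshold."""
    total = sum(calls_by_provider.values())
    if total == 0:
        return []
    return [name for name, count in calls_by_provider.items()
            if count / total > threshold]
```

Each call in the providers list is whatever thin wrapper you already have around a provider's API or CLI; the chain itself does not care which.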

What AI should NOT do

The harness uses AI heavily but holds a hard line:

AI does not make decisions for you. It surfaces information; you decide.
AI does not act autonomously on irreversible operations without your explicit confirmation. No auto-sending emails, no auto-paying invoices, no auto-deleting files.
AI does not "manage your life." It does the bookkeeping; you do the operating.

Operators who hand decision-making to AI lose the muscle. The harness is built to keep the operator in the loop.
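
In code, the hard line amounts to a confirmation gate in front of anything irreversible. A minimal sketch, assuming a hypothetical set of action names and a confirm callback that asks the operator:

```python
from typing import Callable

# Hypothetical labels for operations that cannot be undone.
IRREVERSIBLE = {"send_email", "pay_invoice", "delete_file"}


def run_action(kind: str, action: Callable[[], str],
               confirm: Callable[[str], bool]) -> str:
    """Run action, but only after explicit approval if it is irreversible."""
    if kind in IRREVERSIBLE and not confirm(f"AI proposes '{kind}'. Proceed?"):
        return "skipped"
    return action()


def ask_operator(question: str) -> bool:
    """Interactive confirm: anything other than 'y' counts as no."""
    return input(f"{question} [y/N] ").strip().lower() == "y"
```

The default answer is no, so a missed prompt or an unattended run leaves the email unsent and the invoice unpaid.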

Sample launchd plist for Ollama (macOS)

Save to ~/Library/LaunchAgents/com.ollama.server.plist:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN"
  "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
    <key>Label</key>
    <string>com.ollama.server</string>
    <key>ProgramArguments</key>
    <array>
        <string>/opt/homebrew/bin/ollama</string>
        <string>serve</string>
    </array>
    <key>EnvironmentVariables</key>
    <dict>
        <key>OLLAMA_MODELS</key>
        <string>/Volumes/your-drive/ollama/models</string>
    </dict>
    <key>RunAtLoad</key>
    <true/>
    <key>KeepAlive</key>
    <true/>
</dict>
</plist>

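The path /opt/homebrew/bin/ollama is the Homebrew location on Apple Silicon; on Intel Macs it is /usr/local/bin/ollama (check with which ollama). Drop the EnvironmentVariables block if you keep models in the default location. Load it with launchctl load ~/Library/LaunchAgents/com.ollama.server.plist. Ollama now starts at login and restarts if it crashes.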

Next chapter: cron and automation.