AI integration
The harness assumes AI is part of your operation, not separate from it. This chapter covers how to integrate AI without becoming dependent on a single provider, without leaking sensitive data, and without surrendering decisions you should be making yourself.
The split: local vs. cloud
Two tiers of AI live inside the harness:
Local models (Ollama)
Models running on your own hardware. Free at the margin. Always available. Qualitatively weaker than cloud models — but for many tasks, weaker is fine.
What local models are good at:
- Classification ("is this email a vendor invoice?")
- Structured extraction ("pull the date and amount from this PDF text")
- Summarization
- Daily morning brief synthesis
- Transcription (with faster-whisper)
- Anything you run 100+ times per day
What local models are bad at:
- Long-context reasoning
- Code generation
- Subtle writing
- Anything where the cost of being wrong is high
Cloud models (Claude, GPT, Gemini)
API-based models. Cost real money per call. Higher quality. Latency from network.
What cloud models are good at:
- Heavy reasoning
- Code generation and debugging
- Long-context analysis (reading multiple files)
- Production-quality writing
- Customer-facing AI
What cloud models are bad at:
- High-frequency cheap tasks (the cost adds up)
- Anything where you'd be uncomfortable sending the data over the network
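The split above can be written down as a small routing rule. The sketch below is one hypothetical way to express it in shell; the function name pick_tier and the task labels are invented for illustration and are not part of the harness.

```shell
# pick_tier TASK [SENSITIVE]
# Prints "local" or "cloud". Task labels are hypothetical; rename to taste.
pick_tier() {
  task="$1"
  sensitive="${2:-no}"

  # Data you would not send over the network never leaves the machine.
  if [ "$sensitive" = "yes" ]; then
    echo "local"
    return
  fi

  case "$task" in
    classify|extract|summarize|brief|transcribe)
      echo "local" ;;   # cheap, high-frequency, low-stakes
    reason|code|long-context|writing|customer)
      echo "cloud" ;;   # quality matters more than cost
    *)
      echo "cloud" ;;   # assumption: unknown tasks get the stronger tier
  esac
}
```

For example, pick_tier classify prints local, and pick_tier code yes prints local because the sensitivity flag wins over the task type.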
Setting up Ollama
If you don't have Ollama installed yet, install it. On macOS:
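A minimal sketch, assuming Homebrew is already installed. The model name is only an example, not a recommendation from the template.

```shell
# Install the Ollama CLI and server (assumes Homebrew).
brew install ollama

# Pulling needs the Ollama server running; for a first try, start it with:
#   ollama serve &
# llama3.2 is an example model name; pick one that fits your hardware.
ollama pull llama3.2

# Optional, only if you need transcription (a Python package, not an Ollama model):
#   pip install faster-whisper
```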
Optionally, set up transcription. Note that faster-whisper is a Python package, not an Ollama model, so install it separately if you need it.
Model storage location
By default, Ollama stores models in ~/.ollama/. If you're tight on internal disk, set OLLAMA_MODELS to a directory on your external drive:
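For example, assuming the external drive is mounted at a path like the one below (the path is hypothetical):

```shell
# Add to ~/.zshrc (or ~/.bashrc).
# Hypothetical mount point; substitute the real path of your external drive.
export OLLAMA_MODELS="/Volumes/External/ollama-models"
```

Restart Ollama after changing this so the server picks up the new location. If Ollama is started by launchd rather than from your shell, the variable has to be set in that environment as well, since launchd does not read your shell config.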
Keeping Ollama running
Ollama needs to run as a background service. Two options:
- Manual: ollama serve & in a terminal. Survives until reboot.
- Launchd (macOS): A .plist file in ~/Library/LaunchAgents/ that starts Ollama at login.
The vault template doesn't ship a launchd plist (varies too much by user). This guide includes a sample at the end of the chapter if you want to wire it up.
Watchdog
If Ollama dies, your cron jobs that depend on it will fail silently. Add a watchdog:
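A minimal watchdog sketch, assuming Ollama listens on its default port (11434) and that the script is scheduled from cron. The script name and function names are hypothetical.

```shell
#!/bin/sh
# ollama-watchdog.sh (hypothetical name): restart Ollama if it stops answering.
# Schedule it from cron, for example:
#   */5 * * * * /path/to/ollama-watchdog.sh

OLLAMA_URL="${OLLAMA_URL:-http://localhost:11434}"

# Succeeds only if the server answers an HTTP request.
ollama_is_up() {
  curl --silent --fail --max-time 5 "$OLLAMA_URL/api/tags" > /dev/null 2>&1
}

# Prints what it did, so cron mail or a log file shows every restart.
check_and_restart() {
  if ollama_is_up; then
    echo "ollama: up"
  else
    echo "ollama: down, restarting"
    nohup ollama serve > /dev/null 2>&1 &
  fi
}

check_and_restart
```

Cron runs with a minimal PATH, so you may need to replace ollama with the full path to the binary.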
Scheduled every 5 minutes, the watchdog restarts Ollama within 5 minutes of a crash.
Cloud model setup
Claude and GPT both have CLI tools and APIs. Pick one as your primary, but install at least two so you have a fallback.
For Claude, set the Claude API key in your shell config:
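A sketch, assuming the variable name ANTHROPIC_API_KEY, which is what Anthropic's official SDKs read by default. The value is a placeholder.

```shell
# Add to ~/.zshrc (or ~/.bashrc). Placeholder value, not a real key.
export ANTHROPIC_API_KEY="replace-with-your-key"
```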
Same pattern for OpenAI:
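Again a sketch with a placeholder value; OPENAI_API_KEY is the name OpenAI's official SDKs read by default.

```shell
# Add to ~/.zshrc (or ~/.bashrc). Placeholder value, not a real key.
export OPENAI_API_KEY="replace-with-your-key"
```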
Don't put API keys in files committed to git. Don't put them in workspace/MEMORY.md. Use shell env or a credentials manager.
The AI tool: Claude Code (or Cursor, or similar)
Beyond raw API calls, you'll use an AI tool that can read your filesystem, edit files, run shell commands. Claude Code, Cursor, Continue, Aider — all in this category.
Whichever you pick, it reads CLAUDE.md (or similar) at session start to orient. The vault template includes a CLAUDE.md you can adapt.
The key configuration: tell the AI tool that your vault is the primary working directory. Everything else (downloads folder, home directory, scratch directories) is exceptional.
Slash commands
Slash commands are short prompts mapped to common workflows. They turn "I want to close out the session" into a single short command (/logs).
The vault template includes 5 starter slash commands:
- /logs: session-closing protocol
- /file: categorize an inbox item
- /weekly-retro: Sunday-night retrospective
- /daily-brief: generate today's brief on demand
- /audit: surface stale state in the vault
These live in vault-template/_System/claude-commands/. To make Claude Code find them, symlink:
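One way to create the link, sketched as a small function. It assumes Claude Code looks for personal commands in ~/.claude/commands, which may differ in your setup; the function name link_commands is hypothetical.

```shell
# link_commands VAULT_DIR [CLAUDE_DIR]
# Points CLAUDE_DIR/commands at the vault's command folder.
# CLAUDE_DIR defaults to ~/.claude (an assumption about your Claude Code setup).
link_commands() {
  vault_dir="$1"
  claude_dir="${2:-$HOME/.claude}"
  mkdir -p "$claude_dir"

  # Never clobber a real directory of commands you already have.
  if [ -e "$claude_dir/commands" ] && [ ! -L "$claude_dir/commands" ]; then
    echo "refusing to replace existing $claude_dir/commands" >&2
    return 1
  fi

  ln -sfn "$vault_dir/_System/claude-commands" "$claude_dir/commands"
}
```

Call it with your vault path, for example link_commands "$HOME/your-vault".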
After the symlink, typing /logs in Claude Code reads your-vault/_System/claude-commands/logs.md and runs the prompt inside.
Writing your own slash commands
After running the starter five for a couple of weeks, you'll notice tasks you do repeatedly. Each is a candidate for a new slash command.
Slash command format:
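The template's own files are the reference for the exact format. As a hedged illustration, the sketch below assumes the usual Claude Code convention: a markdown file whose name is the command and whose body is the prompt. The /stale-todos command and the new_command helper are hypothetical.

```shell
# new_command COMMANDS_DIR
# Writes a hypothetical /stale-todos command file into COMMANDS_DIR.
new_command() {
  cat > "$1/stale-todos.md" <<'EOF'
Find stale to-do items in the vault.

Read:
- BRAIN.md

Do:
- List every unchecked item dated more than 14 days ago.
- Do not edit or delete anything.

Report back:
- One line per stale item: the item text and the file it lives in.
EOF
}
```

Running new_command "$HOME/your-vault/_System/claude-commands" would make /stale-todos available alongside the starter commands.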
Be specific about what files to read, what to update, what to report back. AI tools follow these instructions literally.
The morning brief
The most important AI integration. Cron at 6 AM. Reads from BRAIN.md, yesterday's daily log, loop signals, weather, inbox count. Calls local Ollama for synthesis. Sends iMessage. Writes markdown file.
The script ships with the template (_System/scripts/morning-brief.sh and morning-brief-synth.py). You'll adapt the INPUTS section to your domain — adding queries to your POS, bank, CRM, or whatever else you need surfaced.
Tuning the prompt is where most operators spend the most time. The default prompt is grounded (no hallucination) but conservative (says "Nothing pressing" often). Adjust toward whatever signal you need.
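To make the shape of the brief concrete, here is a simplified sketch of the input-gathering step. It is not the shipped morning-brief.sh: the function name, argument layout, and prompt wording are assumptions, and weather, loop signals, and iMessage delivery are left out.

```shell
# build_brief_prompt BRAIN_FILE LOG_FILE INBOX_DIR
# Prints a prompt that restricts the model to the facts it is given.
build_brief_prompt() {
  brain_file="$1"
  log_file="$2"
  inbox_dir="$3"
  inbox_count="$(find "$inbox_dir" -type f | wc -l | tr -d ' ')"

  echo "Write a short morning brief using ONLY the facts below."
  echo "If nothing needs attention, say: Nothing pressing."
  echo
  echo "## Priorities"
  cat "$brain_file"
  echo
  echo "## Yesterday"
  cat "$log_file"
  echo
  echo "## Inbox"
  echo "$inbox_count items waiting"
}
```

The output can be handed to a local model with ollama run llama3.2 "$(build_brief_prompt BRAIN.md yesterday.md inbox)", where the model name and file paths are examples. A crontab entry of 0 6 * * * runs the script at 6 AM.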
Cross-session memory
Claude Code, Cursor, and similar tools have memory systems that persist across conversations. Use them.
What to save in memory:
- Your role and primary domain (so AI doesn't re-ask each session)
- Your communication preferences (terse vs. detailed, emojis or not)
- Stable facts about your operation (vault location, key paths, recurring people)
- Lessons from past sessions ("we tried X, it didn't work because Y")
What NOT to save:
- Ephemeral state (today's priorities belong in BRAIN.md)
- Anything that's already in the vault (don't duplicate)
- Secrets, credentials, PII
The vault has long-term memory in workspace/MEMORY.md. The AI tool has memory at the OS/account level. They serve different purposes — the vault is portable across AI tools; the AI tool memory is portable across vaults.
The PII boundary
Some inputs should NEVER reach a cloud AI:
- SSNs, full credit card numbers, bank account numbers, routing numbers
- Passport / ID numbers
- Health records (HIPAA)
- Employee compensation (depending on agreements)
- Customer payment details
The harness enforces this with a pii-scan.py hook that runs before any file write. The hook detects common PII patterns and blocks the write.
Set up the hook before your first session writes to a file. The hook ships with the template at _System/hooks/pii-scan.py.
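The shipped hook is a Python script and its patterns are not reproduced here. To illustrate the idea of pattern-based blocking, the sketch below checks text for two patterns only (US SSNs and 13 to 16 digit card numbers); the function name has_pii and both patterns are assumptions, and real coverage needs far more than this.

```shell
# has_pii: read text on stdin; exit 0 if a PII-looking pattern is found.
# Illustrative patterns only: US SSNs and 13-16 digit card numbers.
has_pii() {
  grep -E -q \
    -e '(^|[^0-9])[0-9]{3}-[0-9]{2}-[0-9]{4}([^0-9]|$)' \
    -e '(^|[^0-9])[0-9]{4}[ -]?[0-9]{4}[ -]?[0-9]{4}[ -]?[0-9]{1,4}([^0-9]|$)'
}
```

A write hook built on this would run the candidate content through has_pii and refuse the write when it exits 0.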
For more on document filing with PII, see chapter 10.
Concentration risk
If a single AI provider supplies more than 60% of your AI capability, you have concentration risk. Anthropic could change pricing, OpenAI could deprecate a model, Ollama could have a security issue.
Plan a fallback. The cheap pattern: keep API keys for two cloud providers and one local model. If your primary breaks, you can swap in 5 minutes.
Don't pay for the fallback in advance. Just have it ready.
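A fallback order can be as simple as checking what is configured. The sketch below is hypothetical: pick_provider only checks whether a key or binary is present, not whether the provider is healthy, and the ordering is an example.

```shell
# pick_provider: print the first configured provider, in priority order.
# Presence check only; it does not test whether the provider is reachable.
pick_provider() {
  if [ -n "${ANTHROPIC_API_KEY:-}" ]; then
    echo "claude"
  elif [ -n "${OPENAI_API_KEY:-}" ]; then
    echo "openai"
  elif command -v ollama > /dev/null 2>&1; then
    echo "ollama"
  else
    echo "none"
    return 1
  fi
}
```

Swapping your primary then means changing the order of the checks, or unsetting one variable.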
What AI should NOT do
The harness uses AI heavily but holds a hard line:
- AI does not make decisions for you. It surfaces information; you decide.
- AI does not act autonomously on irreversible operations without your explicit confirmation. No auto-sending emails, no auto-paying invoices, no auto-deleting files.
- AI does not "manage your life." It does the bookkeeping; you do the operating.
Operators who hand decision-making to AI lose the muscle. The harness is built to keep the operator in the loop.
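The confirmation rule for irreversible operations can be enforced mechanically. The sketch below is one hypothetical way to do it in shell: a wrapper that shows the command and runs it only after an explicit yes. The name confirm_then is invented for illustration.

```shell
# confirm_then CMD [ARGS...]
# Shows the command, reads one line from stdin, and runs it only on y/Y/yes.
confirm_then() {
  printf 'About to run: %s\nProceed? [y/N] ' "$*" >&2
  read -r reply
  case "$reply" in
    y|Y|yes)
      "$@" ;;
    *)
      echo "Skipped." >&2
      return 1 ;;
  esac
}
```

Any script step that sends, pays, or deletes would be wrapped this way, for example confirm_then rm old-export.csv. Anything other than an explicit yes, including an empty answer, skips the command.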
Sample launchd plist for Ollama (macOS)
Save to ~/Library/LaunchAgents/com.ollama.server.plist:
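A sample plist, written out by a small helper so the binary path is easy to change. This is a sketch under assumptions: the default binary path is the Apple Silicon Homebrew location (check yours with command -v ollama), and the helper name write_ollama_plist is hypothetical.

```shell
# write_ollama_plist DEST [OLLAMA_BIN]
# OLLAMA_BIN defaults to the Apple Silicon Homebrew path (an assumption).
write_ollama_plist() {
  dest="$1"
  ollama_bin="${2:-/opt/homebrew/bin/ollama}"
  cat > "$dest" <<EOF
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
  <key>Label</key>
  <string>com.ollama.server</string>
  <key>ProgramArguments</key>
  <array>
    <string>$ollama_bin</string>
    <string>serve</string>
  </array>
  <key>RunAtLoad</key>
  <true/>
  <key>KeepAlive</key>
  <true/>
</dict>
</plist>
EOF
}
```

Run write_ollama_plist ~/Library/LaunchAgents/com.ollama.server.plist to create the file. RunAtLoad starts the server at login and KeepAlive asks launchd to restart it if it exits.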
Load it with launchctl load ~/Library/LaunchAgents/com.ollama.server.plist. Ollama now starts at login and restarts if it crashes.
Next chapter: cron and automation.