
repoman v0.3 — Setup wizard + LLM stack integration

Scope reduction (2026-05-08)

The --hermes/--no-hermes/--purge-hermes flag-based provisioning described in
this spec was removed during smoke testing and will not ship in v0.3.

Root cause: the bind-mount-the-host-runtime architecture does not survive Python venv
portability constraints. Hermes' venv pins to a uv-vendored host-only path;
bind-mounting it into a container where that path doesn't exist fails at import time. Uid
mapping for the bind also does not generalize cleanly. Copying the venv breaks shebang
paths.

v0.4 will revisit per-container hermes provisioning via pre-built incus images that
embed a self-contained hermes install rather than sharing the host runtime.

v0.3 still ships: repoman setup wizard, llm-share profile (ollama client
wiring), schema-2 migration, and the hermes module helpers as a library for v0.4.


Status: v0.3 design, under review
Date: 2026-05-06
Implementation language: reef-lang 0.5.20 (no new stdlib requirements vs v0.2)
Origin: VISION.md §4 (repoman setup), v0.1 spec, conversation 2026-05-06 with hermes Docker docs at https://hermes-agent.nousresearch.com/docs/user-guide/docker
Outcome: the contract for v0.3 — the first version of repoman that productizes the host-side bootstrap and bundles local-LLM tooling.


0. What's new vs v0.2

v0.2 shipped new/sync/list/status/remove/shell. It assumes the host is already prepared (Incus project exists, claude-share profile authored, ZFS/NFS available). v0.3 closes that gap by introducing the host-bootstrap subcommand and adds first-class support for local LLM tooling (ollama + hermes) since that's the most common reason a fresh host needs more than just claude-share.

Two threads, one release:

  1. repoman setup: idempotent host bootstrap. Replaces the README's manual incus-project-create / profile-edit walkthrough with a guided wizard.
  2. LLM stack integration: llm-share profile, ollama client wiring, per-container hermes data dirs with selective seeding.

The two are coupled because the wizard is the natural place to offer LLM-stack setup as an option, and the per-container hermes seeding adds new behavior to repoman new.


1. Scope

In scope:

  • repoman setup — interactive + flag-driven (--non-interactive, --with-llm, --without-llm).
  • llm-share Incus profile, repoman-managed (created/refreshed by setup).
  • repoman new --hermes — opt-in flag that provisions a per-container hermes data dir.
  • repoman new --no-hermes (and --llm/--no-llm umbrella) for explicit opt-out when defaults change.
  • Selective hermes seeding from host's ~/.hermes/ into ~/.local/share/repoman/hermes/<name>/.
  • repoman remove --purge-hermes — delete the per-container hermes data dir (default: leave it for safety).
  • Host LAN-IP detection for OLLAMA_HOST (read once at setup, written into profile).
  • [defaults].llm = { enabled, hermes_default, ollama_url, hermes_seed } block in repoman.toml.

Out of scope (deferred to v0.4 or later):

  • Adopting an existing host hermes install whose ~/.hermes should be one of the project dirs (no migration tool yet — user can mv manually).
  • Re-keying / rotating .env API keys across N seeded containers.
  • Bind-mounting /opt/ollama/imports into containers for in-container ollama create. Will be added if real demand surfaces; for v0.3, model imports stay a host operation.
  • claude-share lifecycle. v0.3 checks its existence in setup and tells the user how to create it if missing, but does not author or edit it. (Same boundary as v0.2.)
  • Hermes server-side gateway (port 8642) exposure to the LAN as a shared service. The hermes docs explicitly reject the daemon model; we don't fight it.
  • YAML rewriting of seeded config.yaml. v0.3 issues a warning if localhost:11434 is detected in the seed source and asks the user to fix once on the host; we do not parse YAML.

2. Architecture additions

Two new modules, plus targeted edits to cli.reef, config.reef, incus.reef:

| File | Module | Responsibility | Pure? |
|---|---|---|---|
| src/setup.reef | setup | The wizard: detect state (incus project, profiles, ollama, hermes binary, host LAN IP), print summary, prompt yes/no per stage, apply changes. Composes incus.* and hermes.*. | wrappers thin |
| src/hermes.reef | hermes | Per-container data-dir management: seed_data_dir(name, source, dest, seed_list), purge_data_dir(dest). Pure helpers default_seed_list() (returns the allow-list), state_dir_for(name) (resolves ~/.local/share/repoman/hermes/<name>/). | helpers pure; copy effectful |

Edits:

  • cli.reef — add cmd_setup dispatch, extend cmd_new to honor --hermes / --no-hermes, extend cmd_remove to honor --purge-hermes.
  • config.reef — add [defaults].llm substructure, schema bump to 2, migration path for schema = 1 registries.
  • incus.reef — add profile_exists(name, project), profile_create_or_edit(name, project, yaml), container_add_disk_device(name, device, source, path, opts). The disk-device add is needed because hermes-state is a per-container device, not a profile-level one.

Dependency graph addition: cli → setup → {incus, hermes, config, path}. hermes → {path, io.file, io.dir}. No cycles.

Non-goal: a new abstraction over Incus profiles. setup constructs llm-share from a string-template embedded in the binary — we don't build a profile-modeling layer in reef just yet.


3. The llm-share profile

Created and maintained by repoman setup, in the repoman Incus project:

name: llm-share
description: |
  Local LLM client tools (ollama client + hermes runtime) and host-daemon wiring.
  Created by repoman setup; do not hand-edit (changes will be overwritten).
config:
  environment.OLLAMA_HOST: "http://<HOST_LAN_IP>:11434"
devices:
  ollama-bin:
    type: disk
    source: /usr/local/bin/ollama
    path: /usr/local/bin/ollama
    readonly: "true"
  ollama-state:
    type: disk
    source: /home/<USER>/.ollama
    path: /home/<USER>/.ollama
    shift: "true"

<HOST_LAN_IP> is the address bound to br0 on the host, resolved at setup time by parsing ip -4 addr show br0. <USER> is the invoking user. Both are baked into the YAML at write time — repoman setup is the single source of truth.
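
The detection step can be sketched as follows. This is an illustrative Python sketch, not the reef implementation: the function names parse_lan_ip and detect_lan_ip are hypothetical, and it assumes the first inet line reported for br0 is the address to publish.

```python
import re
import subprocess
from typing import Optional

# Matches the "inet 192.168.1.10/24 ..." line printed by `ip -4 addr show`.
_INET_RE = re.compile(r"^\s*inet\s+(\d{1,3}(?:\.\d{1,3}){3})/\d{1,2}\b", re.MULTILINE)


def parse_lan_ip(ip_output: str) -> Optional[str]:
    """Return the first IPv4 address found in `ip -4 addr show <iface>` output."""
    match = _INET_RE.search(ip_output)
    return match.group(1) if match else None


def detect_lan_ip(iface: str = "br0") -> Optional[str]:
    """Run `ip -4 addr show <iface>`; return None if the tool or interface is missing."""
    try:
        result = subprocess.run(
            ["ip", "-4", "addr", "show", iface],
            capture_output=True, text=True, check=False,
        )
    except FileNotFoundError:
        return None  # `ip` is not installed on this host
    if result.returncode != 0:
        return None  # e.g. the interface does not exist
    return parse_lan_ip(result.stdout)
```

Keeping the parser separate from the subprocess call lets the parsing be unit-tested against captured output, in line with the pure/effectful split in §7.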

Notably absent: any hermes bind-mount. Hermes data is per-container (see §4).

Refresh policy. Re-running repoman setup rewrites llm-share from template if and only if the on-disk content differs. setup tells the user what it changed and why.


4. Per-container hermes data dirs

Host hermes' ~/.hermes/ is left untouched. For each container that opts into hermes, repoman provisions:

  • A host directory at ~/.local/share/repoman/hermes/<container-name>/, owned by the invoking user.
  • Selective seed from ~/.hermes/ into that directory (see §4.2).
  • An Incus disk device on the container itself (not a profile) bind-mounting that host dir to /home/<USER>/.hermes inside.

The hermes binary at ~/.local/bin/hermes is reachable inside the container via the existing claude-share profile's bind on ~/.local/bin/. (The wizard verifies this assumption in §5.1, stage 3; if claude-share doesn't share ~/.local/bin, it tells the user to add it.)

4.1 Why per-container, not shared

Per the hermes Docker guide:

Never run two Hermes gateway containers against the same data directory simultaneously — session files and memory stores are not designed for concurrent write access.

state.db is SQLite + WAL. Sharing the data dir between host and N containers risks corruption. Per-container dirs eliminate the risk entirely and align with the hermes team's recommended pattern (one data dir per profile/container).

4.2 Seed list

Default [defaults].llm.hermes_seed:

hermes_seed = [
  ".env",                 # API keys
  "config.yaml",          # model defaults, daemon URL
  "SOUL.md",              # persona
  "skills/",              # user-authored skills (recursive)
  "hooks/",               # user hooks (recursive)
  "hermes-agent/",        # vendored runtime — symlink (see §4.3)
  "node/",                # vendored node — symlink
  "bin/",                 # extra binaries — symlink
]

Not seeded (per-instance state, must be fresh): sessions/, memories/, logs/, state.db, state.db-shm, state.db-wal, audio_cache/, image_cache/, sandboxes/, cron/, pairing/, models_dev_cache.json, ollama_cloud_models_cache.json, context_length_cache.yaml, .skills_prompt_snapshot.json, .update_check, .hermes_history, auth.lock.

The seed list is in [defaults].llm.hermes_seed so users can adjust it without rebuilding.

4.3 Runtime dirs: symlink vs copy

hermes-agent/, node/, bin/ are runtime, not user data. By default we symlink them from the host's ~/.hermes/ (so a hermes upgrade on the host applies to every container with no rebuild), and copy the credential/config files. A future --hermes-isolate-runtime flag can flip everything to copy for users who want hermes versions to diverge per container.

Symlinks must point at the host path as visible from inside the container — i.e., the symlink target must already be reachable through some other bind. Since ~/ is bind-mounted in or mappable, this works as long as the user paths align. Open question O-3 (§9) owns the cross-mount-namespace symlink correctness check.
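
The seed rules in §4.2 and §4.3 amount to a three-way partition of the top-level entries in ~/.hermes/. A minimal Python sketch, assuming entries are given as names with a trailing "/" for directories; plan_seed and RUNTIME_DIRS are hypothetical names, not part of the hermes module's documented surface:

```python
from typing import Dict, Iterable, List, Sequence

# Allow-list from §4.2; a trailing "/" marks a directory entry.
DEFAULT_SEED = (".env", "config.yaml", "SOUL.md", "skills/", "hooks/",
                "hermes-agent/", "node/", "bin/")
# Runtime dirs that are symlinked to the host copy instead of being copied.
RUNTIME_DIRS = frozenset({"hermes-agent/", "node/", "bin/"})


def plan_seed(entries: Iterable[str],
              seed_list: Sequence[str] = DEFAULT_SEED) -> Dict[str, List[str]]:
    """Partition top-level ~/.hermes/ entries into copy / symlink / skip."""
    allowed = set(seed_list)
    plan: Dict[str, List[str]] = {"copy": [], "symlink": [], "skip": []}
    for entry in sorted(entries):
        if entry not in allowed:
            plan["skip"].append(entry)      # per-instance state stays behind
        elif entry in RUNTIME_DIRS:
            plan["symlink"].append(entry)   # vendored runtime, shared with host
        else:
            plan["copy"].append(entry)      # credentials, config, user content
    return plan
```

Because the function only computes a plan, the seeded versus not-seeded sets can be checked against a fixture directory listing without touching the filesystem.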

4.4 Storage location: why ~/.local/share/repoman/hermes/<name>/

Considered: ~/.hermes-<name>/ (parallel to ~/.hermes).

Chose ~/.local/share/repoman/hermes/<name>/ because:

  • All repoman-owned per-project state ends up under one tree (~/.local/share/repoman/), which matters for backups and for users who want to know "what does repoman own?"
  • ~/.hermes-* pollutes $HOME and risks collision with hypothetical future hermes profile features.
  • XDG Base Directory convention.
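
For illustration, the path resolution could look like the Python sketch below. The name validation is an assumption added here so that a container name cannot escape the repoman tree; the spec itself only fixes the resulting path.

```python
from pathlib import Path
from typing import Optional


def state_dir_for(name: str, home: Optional[Path] = None) -> Path:
    """Resolve the per-container hermes data dir under ~/.local/share/repoman/."""
    # Assumption: reject names that would traverse out of the hermes/ subtree.
    if not name or name in {".", ".."} or "/" in name:
        raise ValueError(f"invalid container name: {name!r}")
    base = home if home is not None else Path.home()
    return base / ".local" / "share" / "repoman" / "hermes" / name
```

Passing home explicitly keeps the helper pure, so it can be tested without depending on the invoking user's environment.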

5. Subcommand flows

5.1 repoman setup [--non-interactive] [--with-llm | --without-llm]

Stages, run sequentially. Each stage prints what it found, what it'd change, and (interactive mode) waits for [Y/n]. --non-interactive accepts every default; --with-llm/--without-llm non-interactively pin the LLM stage.

  1. Detect environment. incus reachable, current user, host LAN IP via br0, ollama binary, hermes binary, ~/.hermes/ presence, ZFS/NFS roots from registry defaults.
  2. Incus project repoman. Create if missing. (No-op if v0.1 already created it.)
  3. claude-share profile. Verify it exists in the repoman project and bind-mounts ~/.local/bin/. If missing or doesn't bind that path, print the recommended incus profile edit snippet and exit non-zero with a clear message — we do not author claude-share.
  4. LLM stack (gated on --with-llm or interactive yes).
    1. Verify ollama daemon is reachable on <host-lan-ip>:11434. If only on loopback, print the systemd-override snippet to make it LAN-listen and exit non-zero.
    2. Write/refresh the llm-share profile in the repoman project from template (§3).
    3. Verify hermes binary at ~/.local/bin/hermes and host data dir at ~/.hermes/. If absent, print install pointer (link to hermes user guide) and skip per-container hermes seeding default.
  5. Registry defaults. Write [defaults].profiles = ["default", "claude-share", "llm-share"] if user said yes to LLM stack; write [defaults].llm.{enabled = true, hermes_default = false, ollama_url, hermes_seed = [...]} block. Schema bumps to 2.
  6. Summary. Print setup complete summary with three follow-on hints: repoman new <name>, repoman new <name> --hermes, repoman list.

Exit codes: 0 success, 2 bad usage, 3 environment (incus unreachable, ollama not LAN-bound, br0 missing), 4 user said no to a required stage in non-interactive mode.

Idempotency: every stage is rerunnable. Re-running setup after a hermes upgrade refreshes the llm-share profile if its content changed and is otherwise a no-op.
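
One way to keep the stages idempotent is to separate planning from applying: a pure planner turns the detected host state into a list of actions, and a rerun on a settled host yields only no-ops. The Python sketch below covers stages 2 to 4 only; HostState, plan_setup, and the action labels are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class HostState:
    project_exists: bool     # Incus project "repoman" is present
    claude_share_ok: bool    # profile exists and binds ~/.local/bin/
    ollama_on_lan: bool      # daemon answers on <host-lan-ip>:11434
    llm_share_matches: bool  # existing profile equals the rendered template
    with_llm: bool           # --with-llm, or an interactive "yes"


def plan_setup(state: HostState) -> List[Tuple[str, str]]:
    """Return (stage, action) pairs; planning stops at the first blocking error."""
    plan = [("incus-project", "no-op" if state.project_exists else "create")]
    if not state.claude_share_ok:
        plan.append(("claude-share", "error"))  # setup never authors claude-share
        return plan
    plan.append(("claude-share", "no-op"))
    if not state.with_llm:
        plan.append(("llm-stack", "skipped"))
        return plan
    if not state.ollama_on_lan:
        plan.append(("ollama", "error"))        # user must make ollama LAN-listen
        return plan
    plan.append(("ollama", "no-op"))
    plan.append(("llm-share", "no-op" if state.llm_share_matches else "rewrite"))
    return plan
```

The wizard would print this plan as its per-stage summary and only then apply the entries that are not no-ops.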

5.2 repoman new <name> [...] [--hermes | --no-hermes]

Existing v0.2 flow plus:

  • After the container launch, if --hermes (explicit) or [defaults].llm.hermes_default = true and not --no-hermes:
    1. Compute dest = ~/.local/share/repoman/hermes/<name>/.
    2. Refuse if dest already exists and is non-empty (exit 4 with hint to repoman remove --purge-hermes <name> first).
    3. Run the seed (§4.2): copy the credential/config files, symlink the runtime dirs.
    4. incus.container_add_disk_device(<name>, "hermes-state", source=dest, path=/home/<USER>/.hermes, shift=true).
    5. Restart the container so the device takes effect.
  • The registry's [[project]] entry gains a hermes = true|false field so list/status can show it.

If the LLM stack wasn't enabled at setup time, --hermes errors out with a hint to repoman setup --with-llm.
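
The opt-in precedence can be written down as a small pure function. This Python sketch folds in the per-project override from §6.2 (flag wins over override, override wins over the registry default). The names wants_hermes and SetupRequired are hypothetical, and raising when hermes is wanted only by default while the LLM stack is disabled is an assumption; the spec defines the error only for an explicit --hermes.

```python
from typing import Optional


class SetupRequired(Exception):
    """Raised when hermes is requested but the LLM stack was never set up."""


def wants_hermes(flag: Optional[bool], override: Optional[bool],
                 hermes_default: bool, llm_enabled: bool) -> bool:
    """Decide whether `repoman new` should provision a hermes data dir.

    flag:     True for --hermes, False for --no-hermes, None if neither was given.
    override: [hermes].enabled from repos.d/<name>.toml, or None if absent.
    """
    if flag is not None:
        wanted = flag              # the command-line flag always wins
    elif override is not None:
        wanted = override          # then the per-project override file
    else:
        wanted = hermes_default    # then [defaults].llm.hermes_default
    if wanted and not llm_enabled:
        raise SetupRequired("LLM stack not enabled; run: repoman setup --with-llm")
    return wanted
```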

5.3 repoman remove <name> [--purge-hermes]

Existing v0.2 flow plus:

  • Container removal proceeds as today.
  • If the project had hermes = true, the per-container data dir is left in place by default. The user's reauthorized .env and skills survive the container teardown.
  • --purge-hermes (or --purge umbrella, see open question O-1) deletes ~/.local/share/repoman/hermes/<name>/. Logged loudly because this destroys session/memory state.

5.4 repoman status <name> / repoman list

Show hermes: yes/no per project. No new flags.


6. Data shapes

6.1 Registry schema bump (1 → 2)

[repoman]
schema = 2

[defaults]
repos_root     = "~/repos"
backup_root    = "/nfs/repos"
incus_project  = "repoman"
default_image  = "images:ubuntu/26.04/cloud"
profiles       = ["default", "claude-share", "llm-share"]   # llm-share added if user opted in

[defaults.llm]
enabled        = true
hermes_default = false                                       # false → opt-in via --hermes
ollama_url     = "http://192.168.168.42:11434"               # LAN IP captured at setup
hermes_seed    = [
  ".env", "config.yaml", "SOUL.md",
  "skills/", "hooks/",
  "hermes-agent/", "node/", "bin/",
]

[[project]]
name        = "isurus"
repo        = "isurus-project"
image       = "images:ubuntu/26.04/cloud"
profiles    = ["default", "claude-share", "llm-share"]
created     = "2026-04-28T15:00:00Z"
last_sync   = ""
backup      = true
hermes      = true                                           # NEW; defaults false

Migration from schema 1: config.load_or_init recognizes schema = 1, prints a one-line note, populates [defaults].llm with enabled = false (i.e., user must opt in via setup), sets hermes = false on every existing [[project]], writes back as schema = 2. Idempotent. No data loss.
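
A sketch of the migration in Python, operating on the dict shape a TOML parser would produce. The function name migrate_1_to_2 is hypothetical, and the values written for hermes_default, ollama_url, and hermes_seed are assumptions; the spec only fixes enabled = false and hermes = false.

```python
import copy
from typing import Any, Dict

DEFAULT_SEED = [".env", "config.yaml", "SOUL.md", "skills/", "hooks/",
                "hermes-agent/", "node/", "bin/"]


def migrate_1_to_2(registry: Dict[str, Any]) -> Dict[str, Any]:
    """Return a schema-2 copy of a schema-1 registry; schema-2 input is returned as-is."""
    if registry.get("repoman", {}).get("schema", 1) >= 2:
        return registry  # already migrated, so a second pass changes nothing
    out = copy.deepcopy(registry)  # never mutate the caller's data
    out.setdefault("repoman", {})["schema"] = 2
    out.setdefault("defaults", {}).setdefault("llm", {
        "enabled": False,          # user must opt in via `repoman setup`
        "hermes_default": False,   # assumption: mirrors the documented default
        "ollama_url": "",          # assumption: left empty until setup detects it
        "hermes_seed": list(DEFAULT_SEED),
    })
    for project in out.get("project", []):
        project.setdefault("hermes", False)
    return out
```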

6.2 Per-project override addition

Override files (~/.config/repoman/repos.d/<name>.toml) gain an optional field:

[hermes]
enabled = true   # equivalent to passing --hermes; flag wins if both specified

Unknown to v0.2; harmless to v0.2 since override parser ignores unknown sections.

6.3 Profile YAML template

Embedded as a string constant in setup.reef:

let LLM_SHARE_TEMPLATE: string =
  "name: llm-share\n" ++
  "description: |\n" ++
  "  Local LLM client tools (ollama client + hermes runtime) and host-daemon wiring.\n" ++
  "  Created by repoman setup; do not hand-edit (changes will be overwritten).\n" ++
  "config:\n" ++
  "  environment.OLLAMA_HOST: \"http://{HOST_LAN_IP}:11434\"\n" ++
  "devices:\n" ++
  "  ollama-bin:\n" ++
  "    type: disk\n" ++
  "    source: /usr/local/bin/ollama\n" ++
  "    path: /usr/local/bin/ollama\n" ++
  "    readonly: \"true\"\n" ++
  "  ollama-state:\n" ++
  "    type: disk\n" ++
  "    source: /home/{USER}/.ollama\n" ++
  "    path: /home/{USER}/.ollama\n" ++
  "    shift: \"true\"\n"

Substitutions are literal {HOST_LAN_IP} / {USER} replacement — no template engine. Validated with a roundtrip test (substitution → YAML parse via the bundled toolchain → assert structure).
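
The same substitution, sketched in Python for illustration together with the refresh check from §3. The helper names render_llm_share and needs_refresh are hypothetical, and the leftover-placeholder guard is an addition of this sketch, not something the spec requires.

```python
import re
from typing import Optional

LLM_SHARE_TEMPLATE = (
    "name: llm-share\n"
    "description: |\n"
    "  Local LLM client tools (ollama client + hermes runtime) and host-daemon wiring.\n"
    "  Created by repoman setup; do not hand-edit (changes will be overwritten).\n"
    "config:\n"
    "  environment.OLLAMA_HOST: \"http://{HOST_LAN_IP}:11434\"\n"
    "devices:\n"
    "  ollama-bin:\n"
    "    type: disk\n"
    "    source: /usr/local/bin/ollama\n"
    "    path: /usr/local/bin/ollama\n"
    "    readonly: \"true\"\n"
    "  ollama-state:\n"
    "    type: disk\n"
    "    source: /home/{USER}/.ollama\n"
    "    path: /home/{USER}/.ollama\n"
    "    shift: \"true\"\n"
)


def render_llm_share(host_lan_ip: str, user: str) -> str:
    """Literal placeholder replacement; no template engine involved."""
    rendered = (LLM_SHARE_TEMPLATE
                .replace("{HOST_LAN_IP}", host_lan_ip)
                .replace("{USER}", user))
    leftover = re.search(r"\{[A-Z_]+\}", rendered)
    if leftover:  # guard against a placeholder added to the template but not here
        raise ValueError(f"unsubstituted placeholder: {leftover.group(0)}")
    return rendered


def needs_refresh(current: Optional[str], rendered: str) -> bool:
    """Rewrite the profile only if it is missing or its content differs."""
    return current is None or current != rendered
```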


7. Testing

Mirrors the v0.1 boundary: pure logic gets unit tests; effectful wrappers get smoke-tested via integration. Specifically:

Pure tests (run on every build):

  • hermes.default_seed_list() returns the documented allow-list.
  • hermes.state_dir_for("foo") resolves to <expand_home("~")>/.local/share/repoman/hermes/foo/.
  • Profile template substitution: given known HOST_LAN_IP/USER, the rendered YAML matches a golden string.
  • Schema migration: load schema = 1 toml, get back schema = 2 toml with expected defaults populated and existing [[project]] entries unchanged except for hermes = false.
  • Seed-list partition: given the documented hermes dir contents (test fixture), the seeded vs not-seeded sets match §4.2.
  • setup stage planner: given a fixture environment description (profile present, ollama on LAN, hermes installed), the planner returns the expected list of "would change" actions and "no-op" actions.

Smoke tests (require an Incus host, gated on REPOMAN_SMOKE=1):

  • repoman setup --non-interactive --with-llm on a fresh-ish host produces a working baseline.
  • repoman new foo --hermes then incus exec foo -- ls /home/$USER/.hermes shows the seeded layout, and incus exec foo -- hermes --version runs.
  • repoman remove foo leaves the data dir; repoman remove foo --purge-hermes removes it.
  • Symlink correctness: from inside the container, readlink ~/.hermes/hermes-agent resolves to a real path (no broken link).

8. Risks / mitigations

| Risk | Mitigation |
|---|---|
| Symlinks for runtime dirs break across mount namespaces. | Smoke test 4 above. If broken, fall back to copy for runtime dirs and surface as O-3. |
| Host LAN IP changes (DHCP renewal). | setup --refresh re-detects and rewrites llm-share. Documented as the recovery step in repoman status when ollama health-check fails. |
| Hermes upgrades change the on-disk layout (hermes-agent/ schema, etc.). | Symlinking the runtime dirs means upgrades flow through automatically. Seed list is config ([defaults].llm.hermes_seed) so users can adjust without recompiling. |
| .env shared across N containers means a leaked container key is the user's main key. | Same trust model as claude-share. Documented in README. Future v0.4 work on rotation is explicitly out of scope here. |
| User has hermes installed in a non-default location (not ~/.hermes/). | setup reads HERMES_HOME-equivalent if set; otherwise hardcoded default. Open question O-2. |
| setup partial-fails midway (e.g., wrote llm-share but couldn't restart a container). | Each stage is independently idempotent. setup is safe to rerun. No transactional rollback in v0.3; same posture as v0.1's container-create. |

9. Open questions

  • O-1: --purge-hermes vs --purge. Should repoman remove have a single --purge flag that removes both the hermes data dir and any future per-container repoman state, or stay tool-specific (--purge-hermes, --purge-claude, …)? Defaulting to --purge-hermes for v0.3.
  • O-2: Hermes home env var. Does the hermes CLI honor a HERMES_HOME (or similar) env var to redirect from ~/.hermes/? If yes, the per-container path could be set via env rather than bind-mount-overlay. Needs probe.
  • O-3: Symlink correctness across the bind boundary. Validate empirically that ~/.local/share/repoman/hermes/<name>/hermes-agent → /home/<USER>/.hermes/hermes-agent resolves correctly inside the container when only the per-container dir is bind-mounted. If not, fall back to copy and document.
  • O-4: claude-share baseline check. v0.3 currently checks that claude-share bind-mounts ~/.local/bin/. If that's a fragile assumption (the user may name their profile differently), should setup accept a --claude-share-profile=<name> override?
  • O-5: Authoring claude-share itself. Out of scope for v0.3 (§1), but eventually repoman should manage claude-share the same way it manages llm-share. v0.4 candidate.
  • O-6: Multi-host repoman. If a user runs repoman on two hosts and their LAN IPs differ, the registry's ollama_url is host-specific. v0.3 treats repoman.toml as host-local; document.

10. Build sequence

Suggested order so each step produces a working binary:

  1. Schema bump. config.reef adds [defaults].llm parsing + schema 1→2 migration. Tests pass; v0.2 behavior unchanged.
  2. incus.profile_* and container_add_disk_device wrappers. Pure plumbing; no caller yet.
  3. hermes.reef module. default_seed_list, state_dir_for, seed_data_dir, purge_data_dir. Unit-tested.
  4. setup.reef module. Detection + planner + applier. The interactive prompt scaffolding is io.console-based; matches the wizard pattern named in VISION.
  5. cli.cmd_setup — wire it up.
  6. cmd_new extensions: --hermes/--no-hermes, registry write of hermes field, post-launch seed + device-add + restart.
  7. cmd_remove extensions: --purge-hermes.
  8. list/status display — show hermes: yes/no.
  9. README + VISION updates — document setup, document --hermes, link to hermes Docker docs as the source for the per-container-data-dir decision.
  10. Smoke run on a fresh host — fold findings back as bug fixes, then tag v0.3.0.