Decision #5 — agenKic-orKistrator

Terminal Substrate

Why tmux over WezTerm Lua, raw PTY, or X11/Wayland positioning

The Question

Each AI agent needs its own isolated terminal environment for executing commands, capturing output, and displaying real-time work. The substrate must allow the orchestrator to spawn sessions on demand, inject commands, read back output, split panes for side-by-side views, and tear down sessions cleanly — all from Go code without blocking the agent's own goroutine.

Options Considered

tmux CLI wrapper
Send-keys / capture-pane over a Go subprocess bridge
Chosen
Pros
  • In production use since 2007, with a long track record of battle-hardened session multiplexing; the common edge cases are already handled
  • send-keys, capture-pane, split-window, kill-session map 1:1 to Go helper calls
  • Sessions persist across orchestrator restarts — agents survive a crash
  • No GUI dependency; works headless in CI, containers, and SSH
  • Built-in scrollback buffer (bounded by the history-limit option) retains past output without extra plumbing
Cons
  • CLI subprocess per operation adds ~5–15 ms overhead per call
  • No native Windows support; requires WSL or a separate path
WezTerm Lua
GPU-rendered terminal with a first-class Lua scripting API
Partial — future
Pros
  • Cross-platform (macOS, Linux, Windows) from a single binary
  • GPU-accelerated rendering supports rich agent UIs out of the box
  • Lua event API allows deep integration without forking the binary
Cons
  • Relatively new; fewer production deployments at orchestration scale
  • Lua-to-Go boundary adds an IPC hop for every control action
  • Requires a display server; headless operation is limited
Raw PTY spawning
os/exec + golang.org/x/sys/unix pty primitives
Rejected
Pros
  • Pure Go (standard library plus golang.org/x/sys); no external runtime dependency
  • Full control over the PTY lifecycle and I/O framing
Cons
  • No multiplexing — each session requires its own goroutine + PTY pair
  • Must build pane management, scrollback, and session persistence from scratch
  • ANSI escape sequence parsing for output capture is non-trivial
  • Sessions disappear on orchestrator restart — no persistence story
X11/Wayland positioning
Spawn terminal windows and tile them via EWMH/wlr-foreign-toplevel
Rejected
Pros
  • True window isolation — each agent occupies its own native window
  • Familiar desktop layout for human observation
Cons
  • Extremely platform-specific; X11 and Wayland APIs differ significantly
  • Requires a live display; breaks headless, CI, and SSH deployments entirely
  • Window management state is held by the compositor, not the orchestrator
  • No programmatic output capture — scraping requires additional tooling

Interface Mapping

Every orchestrator operation maps to exactly one tmux subcommand. A single execTmux helper wraps os/exec and surfaces errors; no other abstraction is needed. The table below shows the full surface area.

Go Interface      tmux Command
SpawnSession      tmux new-session -d -s {name}
SendCommand       tmux send-keys -t {session} {cmd} Enter
CaptureOutput     tmux capture-pane -t {session} -p -S -{lines}
SplitPane         tmux split-window {-h|-v} -t {session}
DestroySession    tmux kill-session -t {name}
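
The mapping above can be sketched as a set of pure argument builders, one per operation. This is only an illustration: the function names, the Direction type, and the reading of {lines} as a negative start offset for -S are assumptions, not the project's actual code.

```go
package main

import "strconv"

// Direction selects how a split divides the target pane (hypothetical type).
type Direction int

const (
	Horizontal Direction = iota // tmux -h: new pane beside the current one
	Vertical                    // tmux -v: new pane below the current one
)

// Each builder returns the argument list that would follow the tmux binary name.

func spawnArgs(name string) []string {
	return []string{"new-session", "-d", "-s", name}
}

func sendArgs(session, cmd string) []string {
	// "Enter" is a tmux key name that submits the typed command.
	return []string{"send-keys", "-t", session, cmd, "Enter"}
}

func captureArgs(session string, lines int) []string {
	// A negative -S value starts the capture that many lines back in history.
	return []string{"capture-pane", "-t", session, "-p", "-S", "-" + strconv.Itoa(lines)}
}

func splitArgs(session string, d Direction) []string {
	flag := "-h"
	if d == Vertical {
		flag = "-v"
	}
	return []string{"split-window", flag, "-t", session}
}

func destroyArgs(name string) []string {
	return []string{"kill-session", "-t", name}
}
```

Keeping argument construction separate from process execution means the mapping can be unit-tested without a tmux binary on the machine.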

The Decision

tmux is the primary terminal substrate. A thin Go wrapper named execTmux shells out to the tmux binary for all session operations. WezTerm remains a named future alternative for cross-platform deployments where GPU rendering or Windows support becomes a requirement.


Primary — tmux via execTmux helper

The orchestrator calls execTmux(args ...string) for every terminal operation. The helper constructs the tmux subprocess, captures stdout/stderr, and returns structured errors. No session state is held in memory — tmux owns the session tree. This means the orchestrator can restart without killing any agent’s work-in-progress.
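
A minimal sketch of what such a helper might look like, assuming a TmuxError type and a tmuxBin variable that are both hypothetical names introduced here for illustration. It holds no session state and only wraps os/exec.

```go
package main

import (
	"bytes"
	"fmt"
	"os/exec"
	"strings"
)

// TmuxError is a hypothetical structured error carrying the failed
// argument list and whatever tmux wrote to stderr.
type TmuxError struct {
	Args   []string
	Stderr string
	Err    error
}

func (e *TmuxError) Error() string {
	return fmt.Sprintf("tmux %s: %v: %s", strings.Join(e.Args, " "), e.Err, e.Stderr)
}

// Unwrap exposes the underlying exec error to errors.Is / errors.As.
func (e *TmuxError) Unwrap() error { return e.Err }

// tmuxBin is the binary to invoke. It is a variable (an assumption of this
// sketch) so tests can point it at a stub.
var tmuxBin = "tmux"

// execTmux runs one tmux subcommand and returns its stdout.
// No session state is kept here; tmux itself owns the session tree.
func execTmux(args ...string) (string, error) {
	var stdout, stderr bytes.Buffer
	cmd := exec.Command(tmuxBin, args...)
	cmd.Stdout = &stdout
	cmd.Stderr = &stderr
	if err := cmd.Run(); err != nil {
		return "", &TmuxError{
			Args:   args,
			Stderr: strings.TrimSpace(stderr.String()),
			Err:    err,
		}
	}
	return stdout.String(), nil
}
```

A call such as execTmux("new-session", "-d", "-s", "agk-7") would then either return tmux's stdout or a *TmuxError whose Stderr field can be inspected by the caller.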

Session names follow the pattern agk-{agentID} to allow the orchestrator to enumerate all live sessions via tmux list-sessions and reconcile against its own registry on startup. Sessions that exist in tmux but not in the registry are treated as orphans and killed after a grace period.
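
The reconciliation step could look roughly like the following. The sketch assumes the orchestrator lists sessions with tmux list-sessions -F '#{session_name}' (one name per line) and that the registry is a set of agent IDs; findOrphans and the choice to ignore sessions without the agk- prefix are illustrative assumptions. The grace period before killing is left to the caller.

```go
package main

import (
	"sort"
	"strings"
)

const sessionPrefix = "agk-"

// findOrphans takes the output of
//
//	tmux list-sessions -F '#{session_name}'
//
// and returns the agk-* session names whose agent ID is not in the registry.
// Sessions without the prefix are skipped on the assumption that they belong
// to a human user rather than the orchestrator.
func findOrphans(listOutput string, registry map[string]bool) []string {
	var orphans []string
	for _, line := range strings.Split(listOutput, "\n") {
		name := strings.TrimSpace(line)
		if !strings.HasPrefix(name, sessionPrefix) {
			continue // blank line or a session the orchestrator does not own
		}
		agentID := strings.TrimPrefix(name, sessionPrefix)
		if !registry[agentID] {
			orphans = append(orphans, name)
		}
	}
	sort.Strings(orphans) // stable order makes logs and tests predictable
	return orphans
}
```

Each returned name would then be scheduled for tmux kill-session -t {name} once the grace period has elapsed.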


Future Alternative — WezTerm Lua API

If the system expands to Windows or requires rich GPU-rendered agent UIs, WezTerm provides a Lua scripting surface that exposes pane, tab, and window management. The Go orchestrator would send JSON commands over a local socket to a WezTerm Lua listener, preserving the same SpawnSession / SendCommand / CaptureOutput interface contract.

No WezTerm code is written today. The interface boundary is kept clean precisely so this swap is a single-file change in the terminal adapter layer.
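
One way to picture that boundary is a small Go interface that callers depend on, with the backend chosen at construction time. The sketch below is an assumption throughout: the Terminal name, the method signatures, and the in-memory memTerminal stand-in are illustrative and do not reflect existing project code.

```go
package main

import (
	"errors"
	"strings"
)

// Terminal is a hypothetical adapter interface. A tmux-backed and a
// WezTerm-backed type could both satisfy it.
type Terminal interface {
	SpawnSession(name string) error
	SendCommand(session, cmd string) error
	CaptureOutput(session string, lines int) (string, error)
	SplitPane(session string, horizontal bool) error
	DestroySession(name string) error
}

var errNoSession = errors.New("no such session")

// memTerminal is an in-memory stand-in used only to show that callers
// never see which backend is in use. It "captures" the commands sent.
type memTerminal struct{ sent map[string][]string }

func newMemTerminal() *memTerminal { return &memTerminal{sent: map[string][]string{}} }

func (m *memTerminal) SpawnSession(name string) error {
	if _, ok := m.sent[name]; ok {
		return errors.New("duplicate session")
	}
	m.sent[name] = []string{}
	return nil
}

func (m *memTerminal) SendCommand(session, cmd string) error {
	if _, ok := m.sent[session]; !ok {
		return errNoSession
	}
	m.sent[session] = append(m.sent[session], cmd)
	return nil
}

func (m *memTerminal) CaptureOutput(session string, lines int) (string, error) {
	cmds, ok := m.sent[session]
	if !ok {
		return "", errNoSession
	}
	if lines >= 0 && lines < len(cmds) {
		cmds = cmds[len(cmds)-lines:] // keep only the most recent entries
	}
	return strings.Join(cmds, "\n"), nil
}

func (m *memTerminal) SplitPane(session string, horizontal bool) error {
	if _, ok := m.sent[session]; !ok {
		return errNoSession
	}
	return nil
}

func (m *memTerminal) DestroySession(name string) error {
	if _, ok := m.sent[name]; !ok {
		return errNoSession
	}
	delete(m.sent, name)
	return nil
}

// runStep depends only on the interface, so swapping backends
// does not touch orchestration code.
func runStep(t Terminal, agentID, cmd string) (string, error) {
	session := "agk-" + agentID
	if err := t.SpawnSession(session); err != nil {
		return "", err
	}
	if err := t.SendCommand(session, cmd); err != nil {
		return "", err
	}
	return t.CaptureOutput(session, 50)
}
```

An in-memory implementation like this also doubles as a test fake for orchestration logic that should not require a running tmux server.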

Why not raw PTY? The 1:1 mapping between the orchestrator interface and tmux subcommands is the deciding factor. Raw PTY would require implementing session persistence, scrollback, and pane splitting, all of which tmux already provides and has hardened in production use since its 2007 release. The execTmux wrapper is under 60 lines of Go; a raw PTY equivalent would run to several hundred lines and carry ongoing maintenance.

Trade-offs Accepted

No native Windows support. tmux does not run on Windows without WSL. Any deployment targeting bare Windows requires the WezTerm path or a separate PTY backend. Documented as a known gap; WSL is the accepted workaround for development environments.
CLI subprocess overhead per operation. Each execTmux call forks a process and parses output. At typical orchestration frequency (one command per agent task step) the 5–15 ms overhead is acceptable. High-frequency polling patterns (e.g., streaming output at <100 ms intervals) must batch capture-pane calls or use a polling goroutine rather than issuing per-line subprocesses.
stderr pattern matching for error detection. tmux writes error messages to stderr without machine-readable codes. The execTmux helper must pattern-match known error strings (e.g., no server running, can't find session) to surface typed Go errors. New tmux versions may change these strings, so the matcher needs a test fixture that runs against the pinned tmux version.
Enter as a separate argument in send-keys. tmux send-keys only submits a command when the Enter key name is passed as a final argument. Every SendCommand call therefore appends "Enter" after the command string. This is invisible to callers, so it must be documented clearly: a caller that also appends its own newline will trigger a double Enter.
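
The last two trade-offs lend themselves to small, testable helpers. The sketch below is illustrative only: the sentinel error names, classifyStderr, and sendKeysArgs are hypothetical, and the substrings matched are the two examples named above rather than an exhaustive list.

```go
package main

import (
	"errors"
	"strings"
)

// Hypothetical typed errors surfaced from tmux's free-form stderr text.
var (
	ErrNoServer        = errors.New("tmux: no server running")
	ErrSessionNotFound = errors.New("tmux: session not found")
)

// classifyStderr maps known stderr substrings to typed errors.
// Unrecognized text is wrapped as a generic error; empty text is not an error.
func classifyStderr(stderr string) error {
	msg := strings.ToLower(strings.TrimSpace(stderr))
	switch {
	case msg == "":
		return nil
	case strings.Contains(msg, "no server running"):
		return ErrNoServer
	case strings.Contains(msg, "can't find session"):
		return ErrSessionNotFound
	default:
		return errors.New("tmux: " + strings.TrimSpace(stderr))
	}
}

// sendKeysArgs strips trailing newlines supplied by the caller before
// appending the single "Enter" key name, which avoids a double Enter.
func sendKeysArgs(session, cmd string) []string {
	cmd = strings.TrimRight(cmd, "\r\n")
	return []string{"send-keys", "-t", session, cmd, "Enter"}
}
```

Because both functions are pure, the test fixture for a pinned tmux version can feed them recorded stderr samples and compare the resulting errors with errors.Is.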