Decision #5 — agenKic-orKistrator

Terminal Substrate

Why tmux over WezTerm Lua, raw PTY, or X11/Wayland positioning

The Question

Each AI agent needs its own isolated terminal environment for executing commands, capturing output, and displaying real-time work. The substrate must allow the orchestrator to spawn sessions on demand, inject commands, read back output, split panes for side-by-side views, and tear down sessions cleanly — all from Go code without blocking the agent's own goroutine.

Options Considered

tmux CLI wrapper

Send-keys / capture-pane over a Go subprocess bridge

Chosen

Pros

Mature, widely deployed session multiplexer (first released in 2007); most edge cases are already handled upstream
send-keys, capture-pane, split-window, kill-session map 1:1 to Go helper calls
Sessions persist across orchestrator restarts — agents survive a crash
No GUI dependency; works headless in CI, containers, and SSH
Built-in scrollback buffer captures full historical output without extra plumbing

Cons

CLI subprocess per operation adds ~5–15 ms overhead per call
No native Windows support; requires WSL or a separate path

WezTerm Lua

GPU-rendered terminal with a first-class Lua scripting API

Partial — future

Pros

Cross-platform (macOS, Linux, Windows) from a single binary
GPU-accelerated rendering supports rich agent UIs out of the box
Lua event API allows deep integration without forking the binary

Cons

Relatively newer; fewer production deployments at orchestration scale
Lua-to-Go boundary adds an IPC hop for every control action
Requires a display server; headless operation is limited

Raw PTY spawning

os/exec + golang.org/x/sys/unix pty primitives

Rejected

Pros

Pure Go (standard library plus golang.org/x/sys); no external runtime dependency
Full control over the PTY lifecycle and I/O framing

Cons

No multiplexing — each session requires its own goroutine + PTY pair
Must build pane management, scrollback, and session persistence from scratch
ANSI escape sequence parsing for output capture is non-trivial
Sessions disappear on orchestrator restart — no persistence story

X11/Wayland positioning

Spawn terminal windows and tile them via EWMH/wlr-foreign-toplevel

Rejected

Pros

True window isolation — each agent occupies its own native window
Familiar desktop layout for human observation

Cons

Extremely platform-specific; X11 and Wayland APIs differ significantly
Requires a live display; breaks headless, CI, and SSH deployments entirely
Window management state is held by the compositor, not the orchestrator
No programmatic output capture — scraping requires additional tooling

Interface Mapping

Every orchestrator operation maps to exactly one tmux subcommand. A single execTmux helper wraps os/exec and surfaces errors; no other abstraction is needed. The table below shows the full surface area.

Go Interface      tmux Command
SpawnSession      tmux new-session -d -s {name}
SendCommand       tmux send-keys -t {session} {cmd} Enter
CaptureOutput     tmux capture-pane -t {session} -p -S -{lines}
SplitPane         tmux split-window {-h|-v} -t {session}
DestroySession    tmux kill-session -t {name}
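The mapping above could be captured as one Go interface plus a small argument builder per operation. This is a minimal sketch, not the project's actual code: the interface name Terminal, the method signatures, and the builder names (spawnArgs, sendArgs, and so on) are assumptions made for illustration. Only the tmux subcommands and flags come from the table. Note that -S takes a single line offset, so the capture builder passes "-" and the line count as one argument.

```go
package main

import (
	"fmt"
	"strconv"
)

// Terminal is a hypothetical name for the orchestrator-facing interface.
// The method names come from the table; the signatures are assumptions.
type Terminal interface {
	SpawnSession(name string) error
	SendCommand(session, cmd string) error
	CaptureOutput(session string, lines int) (string, error)
	SplitPane(session string, horizontal bool) error
	DestroySession(name string) error
}

// Each builder returns the argument list for exactly one tmux subcommand.

func spawnArgs(name string) []string {
	return []string{"new-session", "-d", "-s", name}
}

func sendArgs(session, cmd string) []string {
	return []string{"send-keys", "-t", session, cmd, "Enter"}
}

// captureArgs starts the capture `lines` lines back in the scrollback history.
func captureArgs(session string, lines int) []string {
	return []string{"capture-pane", "-t", session, "-p", "-S", "-" + strconv.Itoa(lines)}
}

func splitArgs(session string, horizontal bool) []string {
	dir := "-v"
	if horizontal {
		dir = "-h"
	}
	return []string{"split-window", dir, "-t", session}
}

func destroyArgs(name string) []string {
	return []string{"kill-session", "-t", name}
}

func main() {
	fmt.Println(captureArgs("agk-42", 200)) // prints [capture-pane -t agk-42 -p -S -200]
}
```

Keeping the builders as pure functions means the mapping can be unit-tested without a tmux server.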

The Decision

tmux is the primary terminal substrate. A thin Go wrapper named execTmux shells out to the tmux binary for all session operations. WezTerm remains a named future alternative for cross-platform deployments where GPU rendering or Windows support becomes a requirement.

tmux

Primary — tmux via execTmux helper

The orchestrator calls execTmux(args ...string) for every terminal operation. The helper constructs the tmux subprocess, captures stdout/stderr, and returns structured errors. No session state is held in memory — tmux owns the session tree. This means the orchestrator can restart without killing any agent’s work-in-progress.
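A helper of this shape might look like the sketch below. The signature execTmux(args ...string) follows the text; the TmuxError type, its fields, and the tmuxBin variable are hypothetical names introduced here, and the real helper may structure its errors differently.

```go
package main

import (
	"bytes"
	"fmt"
	"os/exec"
	"strings"
)

// tmuxBin is a variable (an assumption of this sketch) so tests can
// point the helper at a stub binary.
var tmuxBin = "tmux"

// TmuxError is a hypothetical structured error carrying the failed
// arguments and whatever tmux wrote to stderr.
type TmuxError struct {
	Args   []string
	Stderr string
	Err    error
}

func (e *TmuxError) Error() string {
	return fmt.Sprintf("tmux %s: %v: %s", strings.Join(e.Args, " "), e.Err, e.Stderr)
}

func (e *TmuxError) Unwrap() error { return e.Err }

// execTmux runs one tmux subcommand and returns its stdout.
// It keeps no session state; tmux itself owns the session tree.
func execTmux(args ...string) (string, error) {
	var stdout, stderr bytes.Buffer
	cmd := exec.Command(tmuxBin, args...)
	cmd.Stdout = &stdout
	cmd.Stderr = &stderr
	if err := cmd.Run(); err != nil {
		return "", &TmuxError{
			Args:   args,
			Stderr: strings.TrimSpace(stderr.String()),
			Err:    err,
		}
	}
	return stdout.String(), nil
}

func main() {
	out, err := execTmux("list-sessions")
	if err != nil {
		fmt.Println("tmux call failed:", err)
		return
	}
	fmt.Print(out)
}
```

The sketch omits timeouts; a production helper would likely use exec.CommandContext so a hung tmux call cannot stall the caller.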

Session names follow the pattern agk-{agentID} to allow the orchestrator to enumerate all live sessions via tmux list-sessions and reconcile against its own registry on startup. Sessions that exist in tmux but not in the registry are treated as orphans and killed after a grace period.
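The startup reconciliation could be sketched as a pure function over the output of tmux list-sessions. The example assumes the orchestrator requests one name per line with -F '#{session_name}' and that the registry is a set of agent IDs; the function name findOrphans and the registry shape are hypothetical. The grace period before killing is left to the caller.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

const sessionPrefix = "agk-"

// findOrphans takes the raw output of
//   tmux list-sessions -F '#{session_name}'
// and returns the agk-* sessions whose agent ID is absent from the registry.
// Sessions without the prefix are not owned by the orchestrator and are ignored.
func findOrphans(listOutput string, registry map[string]bool) []string {
	var orphans []string
	for _, line := range strings.Split(listOutput, "\n") {
		name := strings.TrimSpace(line)
		if !strings.HasPrefix(name, sessionPrefix) {
			continue // blank line or a session someone else created
		}
		agentID := strings.TrimPrefix(name, sessionPrefix)
		if !registry[agentID] {
			orphans = append(orphans, name)
		}
	}
	sort.Strings(orphans) // stable order for logging and tests
	return orphans
}

func main() {
	out := "agk-a1\nagk-b2\nscratch\nagk-c3\n"
	registry := map[string]bool{"a1": true, "c3": true}
	fmt.Println(findOrphans(out, registry)) // prints [agk-b2]
}
```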

Future Alternative — WezTerm Lua API

If the system expands to Windows or requires rich GPU-rendered agent UIs, WezTerm provides a Lua scripting surface that exposes pane, tab, and window management. The Go orchestrator would send JSON commands over a local socket to a WezTerm Lua listener, preserving the same SpawnSession / SendCommand / CaptureOutput interface contract.

No WezTerm code is written today. The interface boundary is kept clean precisely so this swap is a single-file change in the terminal adapter layer.

Why not raw PTY? The 1:1 mapping between the orchestrator interface and tmux subcommands is the deciding factor. Raw PTY would require implementing session persistence, scrollback, and pane splitting, all of which tmux already provides and has exercised in production use since its first release in 2007. The execTmux wrapper is under 60 lines of Go; a raw PTY equivalent would run to several hundred lines and carry ongoing maintenance.

Trade-offs Accepted

No native Windows support. tmux does not run on Windows without WSL. Any deployment targeting bare Windows requires the WezTerm path or a separate PTY backend. Documented as a known gap; WSL is the accepted workaround for development environments.

CLI subprocess overhead per operation. Each execTmux call forks a process and parses output. At typical orchestration frequency (one command per agent task step) the 5–15 ms overhead is acceptable. High-frequency polling patterns (e.g., streaming output at <100 ms intervals) must batch capture-pane calls or use a polling goroutine rather than issuing per-line subprocesses.
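One way to keep subprocess count bounded is a single polling goroutine per session that captures once per tick and forwards only the lines it has not seen before. The sketch below is an assumption-heavy illustration: pollOutput, newLines, and captureFunc are hypothetical names, the capture function stands in for one capture-pane call, and the line-count diff assumes the captured history only grows (it would miss output once tmux's history limit starts discarding old lines).

```go
package main

import (
	"context"
	"fmt"
	"strings"
	"time"
)

// captureFunc stands in for one capture-pane subprocess call.
type captureFunc func() (string, error)

// newLines returns the lines of snapshot beyond the first `seen` lines,
// together with the updated line count.
func newLines(seen int, snapshot string) ([]string, int) {
	trimmed := strings.TrimRight(snapshot, "\n")
	if trimmed == "" {
		return nil, seen
	}
	lines := strings.Split(trimmed, "\n")
	if len(lines) <= seen {
		return nil, seen
	}
	return lines[seen:], len(lines)
}

// pollOutput issues one capture per tick and sends each batch of new
// lines on out. It closes out when ctx is cancelled.
func pollOutput(ctx context.Context, interval time.Duration, capture captureFunc, out chan<- []string) {
	defer close(out)
	ticker := time.NewTicker(interval)
	defer ticker.Stop()
	seen := 0
	for {
		select {
		case <-ctx.Done():
			return
		case <-ticker.C:
			snap, err := capture()
			if err != nil {
				continue // a real implementation would surface this error
			}
			fresh, n := newLines(seen, snap)
			seen = n
			if len(fresh) == 0 {
				continue
			}
			select {
			case out <- fresh:
			case <-ctx.Done():
				return
			}
		}
	}
}

func main() {
	// Fake capture source; only the polling goroutine calls it.
	snapshots := []string{"$ make\n", "$ make\ncompiling\n", "$ make\ncompiling\ndone\n"}
	i := 0
	capture := func() (string, error) {
		s := snapshots[i]
		if i < len(snapshots)-1 {
			i++
		}
		return s, nil
	}
	ctx, cancel := context.WithTimeout(context.Background(), 100*time.Millisecond)
	defer cancel()
	out := make(chan []string)
	go pollOutput(ctx, 10*time.Millisecond, capture, out)
	for batch := range out {
		fmt.Println(batch)
	}
}
```

With this shape the cost is one subprocess per tick per session, regardless of how many lines the agent prints between ticks.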

stderr pattern matching for error detection. tmux writes error messages to stderr without machine-readable codes. The execTmux helper must pattern-match known error strings (e.g., no server running, can't find session) to surface typed Go errors. New tmux versions may change these strings, so the wrapper needs a test fixture that runs against the pinned tmux version.
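The pattern matching could be isolated in one small function so the fixture has a single place to exercise. This is a sketch: the sentinel names ErrNoServer and ErrSessionNotFound and the function classifyStderr are hypothetical, and the matched substrings are the two examples named above rather than an exhaustive list.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// Hypothetical sentinel errors for the two failure modes named in the text.
var (
	ErrNoServer        = errors.New("tmux: no server running")
	ErrSessionNotFound = errors.New("tmux: session not found")
)

// classifyStderr maps tmux's free-form stderr text to a typed error.
// Unknown messages are preserved verbatim instead of being dropped.
func classifyStderr(stderr string) error {
	msg := strings.TrimSpace(stderr)
	switch {
	case msg == "":
		return nil
	case strings.Contains(msg, "no server running"):
		return ErrNoServer
	case strings.Contains(msg, "can't find session"):
		return ErrSessionNotFound
	default:
		return fmt.Errorf("tmux: unrecognized error: %s", msg)
	}
}

func main() {
	err := classifyStderr("can't find session: agk-42\n")
	fmt.Println(errors.Is(err, ErrSessionNotFound)) // prints true
}
```

Callers can then branch with errors.Is instead of inspecting strings themselves.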

Enter as a separate argument in send-keys. tmux send-keys submits a command only when the key name Enter is passed as a separate final argument. Every SendCommand call therefore appends "Enter" after the command string. Callers never see this, so it must be documented clearly: a caller that pre-appends a newline to the command would otherwise trigger a double Enter.
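One defensive option is to strip trailing newlines inside the wrapper so a caller-supplied newline cannot cause a second submit. The sketch below assumes that stripping is the desired policy; the function name sendKeysArgs is hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// sendKeysArgs builds the argument list for tmux send-keys.
// Trailing CR/LF characters are removed so that a caller who already
// appended a newline does not end up submitting the command twice.
func sendKeysArgs(session, cmd string) []string {
	cmd = strings.TrimRight(cmd, "\r\n")
	return []string{"send-keys", "-t", session, cmd, "Enter"}
}

func main() {
	fmt.Printf("%q\n", sendKeysArgs("agk-42", "go test ./...\n"))
	// prints ["send-keys" "-t" "agk-42" "go test ./..." "Enter"]
}
```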