External YAML configuration for tier-to-model mappings, provider settings, and ordered fallback chains with context-aware abort and transparent error unwrapping.
A completion request enters the gateway, resolves to a tier, walks the fallback chain, and returns the first successful response — or a structured error aggregating every attempt.
flowchart TD
REQ["CompletionRequest\n{Model, Messages, Tier}"]
FC["FallbackCompleter.Complete()"]
CFM["chainForModel(model)"]
CWT["CompleteWithTier(tier, req)"]
TC["tryChain(ctx, req, chain)"]
CTX{"ctx.Err() != nil?"}
LOOK{"completer exists?"}
CALL["c.Complete(ctx, r)"]
OK{"success?"}
RESP["CompletionResponse\n{FallbackUsed, ProviderName}"]
SKIP["append ProviderError\ncontinue"]
NEXT["next model in chain"]
FAIL["*FallbackError\n{Errors: []ProviderError}"]
REQ --> FC
FC -->|"by model"| CFM
FC -->|"by tier"| CWT
CFM --> TC
CWT --> TC
TC --> CTX
CTX -->|"yes"| FAIL
CTX -->|"no"| LOOK
LOOK -->|"no"| SKIP
LOOK -->|"yes"| CALL
CALL --> OK
OK -->|"yes"| RESP
OK -->|"no"| SKIP
SKIP --> NEXT
NEXT --> CTX
style REQ fill:#1e3a5f,stroke:#4a8fd4,color:#e8dcc8
style RESP fill:#1a3528,stroke:#5a9e6f,color:#e8dcc8
style FAIL fill:#2a1a1a,stroke:#c2574a,color:#e8dcc8
style CTX fill:#2a2210,stroke:#d4a73a,color:#e8dcc8
Two-phase parsing separates YAML-specific concerns from domain types. Raw intermediary structs prevent YAML tags from leaking into the domain model.
The raw structs use map[string]rawTierConfig and map[string]rawProviderConfig. String-keyed tiers let yaml.Unmarshal parse without a custom unmarshaler. There is no APIKey field; keys come from environment variables only.
config.go:43-48
The domain types use map[ModelTier]TierConfig and map[string]ProviderConfig. ModelTier is a typed string with an UnmarshalText method, so invalid tier names fail at parse time.
config.go:23-28
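The second phase described above can be sketched roughly as follows. This is a minimal illustration, not the code in config.go: the field names, the `convertTiers` helper, and the choice to call `UnmarshalText` during conversion are all assumptions made for the example.

```go
package main

import "fmt"

// ModelTier is a typed string; only known tier names are accepted.
type ModelTier string

const (
	TierCheap    ModelTier = "cheap"
	TierMid      ModelTier = "mid"
	TierFrontier ModelTier = "frontier"
)

// Valid reports whether t is one of the known tiers.
func (t ModelTier) Valid() bool {
	switch t {
	case TierCheap, TierMid, TierFrontier:
		return true
	}
	return false
}

// UnmarshalText implements encoding.TextUnmarshaler and rejects unknown names.
func (t *ModelTier) UnmarshalText(text []byte) error {
	candidate := ModelTier(text)
	if !candidate.Valid() {
		return fmt.Errorf("unknown tier %q", string(text))
	}
	*t = candidate
	return nil
}

// rawTierConfig mirrors the YAML shape; YAML tags live only here.
// Field names are assumed for this sketch.
type rawTierConfig struct {
	PrimaryModel  string   `yaml:"primary_model"`
	FallbackChain []string `yaml:"fallback_chain"`
}

// TierConfig is the tag-free domain type.
type TierConfig struct {
	PrimaryModel  string
	FallbackChain []string
}

// convertTiers (hypothetical helper) turns string keys into typed ModelTier keys.
func convertTiers(raw map[string]rawTierConfig) (map[ModelTier]TierConfig, error) {
	out := make(map[ModelTier]TierConfig, len(raw))
	for name, rt := range raw {
		var tier ModelTier
		if err := tier.UnmarshalText([]byte(name)); err != nil {
			return nil, err
		}
		out[tier] = TierConfig{PrimaryModel: rt.PrimaryModel, FallbackChain: rt.FallbackChain}
	}
	return out, nil
}

func main() {
	raw := map[string]rawTierConfig{"cheap": {PrimaryModel: "claude-haiku-4-5-20251001"}}
	tiers, err := convertTiers(raw)
	fmt.Println(tiers[TierCheap].PrimaryModel, err)
}
```

Keeping the tags on the raw struct means the domain type can change shape without touching the YAML schema, and vice versa.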
LoadConfig and ValidateConfig are separate public functions by design. Callers compose them: load alone for inspection or migration, load then validate for production. This follows the Go idiom of treating parsing and validation as distinct steps.
Structural validation at config load time. All checks return ErrConfigInvalid
with context identifying the failing field.
| Check | Location | Error message | Category |
|---|---|---|---|
| LiteLLMBaseURL == "" | config.go:99 | gateway.litellm_base_url is required | required |
| TimeoutSeconds <= 0 | config.go:102 | gateway.timeout_seconds must be positive | required |
| len(Tiers) == 0 | config.go:105 | at least one tier must be defined | required |
| !tier.Valid() | config.go:110 | unknown tier "X" | structural |
| PrimaryModel == "" | config.go:113 | tier "X" has no primary_model | structural |
| FallbackChain[i] == "" | config.go:116 | tier "X" fallback_chain[i] is empty | structural |
● Added by council fix — commit 0c6bab0
The core fallback loop iterates through [primary, fallback-1, fallback-2, ...],
returning on the first success. A context guard at the top of each iteration provides
cancellation-aware abort with accumulated error preservation.
At i=0, the context guard catches pre-cancelled contexts. At i>0, it stops the chain after a slow provider if the context timed out during the call. In both cases it preserves all accumulated errs from prior iterations.
fallback.go:80-83
When no completer is registered for a model, the loop records ErrNoProvider and continues. This handles empty-string model names from malformed YAML gracefully: nothing crashes, and the chain moves on to the next model.
fallback.go:85-92
If a failed call already returns a *ProviderError, it is recorded verbatim. Generic errors are wrapped with c.Provider(). Position in the ordered error slice maps to position in the chain, which disambiguates repeated providers.
fallback.go:97-104
FallbackError.Unwrap() returns []error (Go 1.20+ multi-error),
surfacing both ErrAllProvidersFailed and any context.Canceled /
context.DeadlineExceeded found in the chain. This enables
errors.Is(err, context.Canceled) to work transparently.
The prior design's Unwrap() returned a single error. It now returns []error, so errors.Is(err, context.Canceled) works without manually iterating fe.Errors (commit 0c6bab0).
Three tiers map to primary models with ordered fallback chains. Each model resolves to a registered completer at runtime.
| Tier | Primary Model | Fallback Chain | Provider |
|---|---|---|---|
| cheap | claude-haiku-4-5-20251001 | gpt-4o-mini → ollama/llama3 | Anthropic → OpenAI → Ollama |
| mid | claude-sonnet-4-6 | gpt-4o | Anthropic → OpenAI |
| frontier | claude-opus-4-6 | gpt-4o → claude-sonnet-4-6 | Anthropic → OpenAI → Anthropic |
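To make the table concrete, the sketch below encodes the same three tiers as Go data and builds the ordered chain [primary, fallback-1, ...] for a tier or a model. The function names `chainForTier` and `chainForModel` here are illustrative, and the rule that a model which is not a tier's primary gets a single-attempt chain is inferred from the FallbackOnlyModel_SingleAttempt test name rather than confirmed by the source.

```go
package main

import "fmt"

type TierConfig struct {
	PrimaryModel  string
	FallbackChain []string
}

// tiers mirrors the mapping table above.
var tiers = map[string]TierConfig{
	"cheap": {
		PrimaryModel:  "claude-haiku-4-5-20251001",
		FallbackChain: []string{"gpt-4o-mini", "ollama/llama3"},
	},
	"mid": {
		PrimaryModel:  "claude-sonnet-4-6",
		FallbackChain: []string{"gpt-4o"},
	},
	"frontier": {
		PrimaryModel:  "claude-opus-4-6",
		FallbackChain: []string{"gpt-4o", "claude-sonnet-4-6"},
	},
}

// chainForTier returns the primary model followed by its fallbacks, in order.
func chainForTier(tier string) ([]string, bool) {
	tc, ok := tiers[tier]
	if !ok {
		return nil, false
	}
	chain := make([]string, 0, 1+len(tc.FallbackChain))
	chain = append(chain, tc.PrimaryModel)
	chain = append(chain, tc.FallbackChain...)
	return chain, true
}

// chainForModel returns the full tier chain when model is some tier's primary,
// and a single-element chain otherwise (assumed behavior).
func chainForModel(model string) []string {
	for name, tc := range tiers {
		if tc.PrimaryModel == model {
			chain, _ := chainForTier(name)
			return chain
		}
	}
	return []string{model}
}

func main() {
	chain, _ := chainForTier("frontier")
	fmt.Println(chain)
	fmt.Println(chainForModel("gpt-4o-mini"))
}
```

Copying into a fresh slice, rather than appending to tc.FallbackChain directly, avoids accidentally mutating the shared configuration.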
flowchart LR
subgraph Tiers
CHEAP["cheap"]
MID["mid"]
FRONT["frontier"]
end
subgraph Models
H["haiku-4-5"]
S["sonnet-4-6"]
O["opus-4-6"]
G4["gpt-4o"]
GM["gpt-4o-mini"]
LL["ollama/llama3"]
end
subgraph Providers
ANT["Anthropic"]
OAI["OpenAI"]
OLL["Ollama"]
end
CHEAP --> H
CHEAP -.-> GM
CHEAP -.-> LL
MID --> S
MID -.-> G4
FRONT --> O
FRONT -.-> G4
FRONT -.-> S
H --> ANT
S --> ANT
O --> ANT
G4 --> OAI
GM --> OAI
LL --> OLL
style CHEAP fill:#1a3528,stroke:#5a9e6f,color:#e8dcc8
style MID fill:#1e3a5f,stroke:#4a8fd4,color:#e8dcc8
style FRONT fill:#2a2210,stroke:#d4a73a,color:#e8dcc8
Solid arrows = primary model • Dashed arrows = fallback chain
| Test | File | Scenario | Council |
|---|---|---|---|
| PrimarySucceeds | fallback_test.go:50 | Happy path, no fallback | original |
| PrimaryFails_FirstFallbackSucceeds | fallback_test.go:70 | Primary fails, first fallback serves | original |
| PrimaryAndFirstFail_SecondSucceeds | fallback_test.go:91 | Two failures, second fallback serves | original |
| AllFail_ReturnsFallbackError | fallback_test.go:112 | Full chain exhaustion | original |
| UnknownModel_ReturnsErrNoProvider | fallback_test.go:138 | Model not in any tier or completer | original |
| CompleteWithTier_PrimarySucceeds | fallback_test.go:151 | Tier-based routing happy path | original |
| CompleteWithTier_InvalidTier | fallback_test.go:170 | Invalid tier name | original |
| FallbackOnlyModel_SingleAttempt | fallback_test.go:183 | Model in chain but not primary | council 1 |
| MidChainCancellation_PreservesAccumulatedErrors | fallback_test.go:203 | Primary fails, ctx cancelled, guard fires at i=1 | council 3 |
| CancelledContext_StopsChain | fallback_test.go:264 | Pre-cancelled context, guard at i=0 | council 2 |
| FallbackError_UnwrapExposesContextCanceled | fallback_test.go:290 | errors.Is(err, context.Canceled) transparency | council 3 |
Commits: d0022cc, 6ea715c, 0c6bab0.