Orchestration Cheat Sheet

Coordinating autonomous workers — containers or agents — is the same problem at different levels of determinism.

Orchestration experience builds instincts that apply directly to agent systems.

| K8s Concept | Agent Equivalent | Why It Transfers |
| --- | --- | --- |
| Declarative state | Intent-driven prompts | Describe the outcome, not the procedure |
| Orchestration != exec | Delegation != doing | The scheduler doesn’t run your code |
| Service discovery | Registry / capability ad | Workers find each other through a central catalog |
| Resource limits | Context window budgets | Finite capacity requires explicit allocation |
| ConfigMaps / Secrets | CLAUDE.md / context | Configuration travels alongside the workload |
| Liveness probes | Output validation | Verify workers are producing useful results |
| Eventual consistency | Async coordination | Not everything settles immediately |
| Rolling updates | Progressive rollout | Change gradually, watch for regressions |
| Labels and selectors | Metadata and routing | Route work by properties, not names |
| Namespaces | Session / project scope | Isolation prevents cross-contamination |

The core transfer: Think in desired state, not step sequences.
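The desired-state habit can be made concrete: a reconcile loop compares declared outcomes to observed reality and dispatches only the missing work. A minimal sketch, with all names hypothetical and the observation step stubbed:

```python
# Minimal reconcile loop for agent work: declare the desired outcome,
# observe current state, and dispatch only the work that closes the gap.
# All names are illustrative, not from any real framework.

desired = {"tests": "passing", "docs": "updated", "lint": "clean"}

def observe() -> dict:
    # A real system would inspect the repo / CI here; stubbed for the sketch.
    return {"tests": "passing", "docs": "stale", "lint": "clean"}

def reconcile(desired: dict, observed: dict) -> list[str]:
    """Return the tasks needed to move observed state toward desired state."""
    return [key for key, want in desired.items() if observed.get(key) != want]

tasks = reconcile(desired, observe())
print(tasks)  # only the stale docs need work
```

The point is the shape, not the stub: workers receive "make `docs` be `updated`", never a step-by-step procedure.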

K8s instincts that break when applied to agents without adjustment.

| K8s Assumption | Agent Reality | The Gap |
| --- | --- | --- |
| Deterministic execution | Stochastic output | Same input rarely yields identical results |
| Clean failure (exit codes) | Semantic failure (confident nonsense) | The worker says “done” but the answer is wrong |
| Horizontal scaling | Context doesn’t shard | You can’t split a reasoning task across 10 pods |
| Strong contracts (schemas) | Fuzzy interfaces (natural language) | Input/output validation is probabilistic |
| Stateless workers | Context-dependent reasoning | Agent output depends on what it has seen |
| Fast restart | Expensive cold start (context rebuild) | Losing state costs minutes, not seconds |
| Observable metrics | Hard-to-measure quality | Latency and throughput miss the point |
| Idempotent operations | Non-repeatable reasoning | Re-running a prompt gives different output |

The core trap: Expecting mechanical reliability from cognitive workers.

K8s solved resource scheduling — bin-packing CPU and memory across nodes. Nobody has solved context scheduling for agents.

```
K8s scheduler:   "This pod needs 2 CPU and 4GB. Node 3 has room."
                 Solved. Measurable. Well-understood.

Agent scheduler: "This task needs the auth codebase context, the API
                 design doc, and awareness of last week's decisions."
                 Unsolved. Unmeasurable. Currently manual.
```
| Resource Scheduling (Solved) | Context Scheduling (Unsolved) |
| --- | --- |
| CPU/memory are fungible | Context is semantic, non-fungible |
| Usage is measurable | Relevance is subjective |
| Capacity is fixed per node | Window size is fixed, density varies |
| Bin-packing is well-studied | No algorithm for “what matters” |
| Overcommit = OOM kill | Overcommit = degraded reasoning |
The workarounds in use today:

  • Manual curation — CLAUDE.md files, explicit context injection
  • Convention over discovery — standard file locations, naming patterns
  • Progressive disclosure — start small, load more on demand
  • Handoff protocols — structured summaries for session transfer

None of these scale. The team that solves automatic context routing wins the orchestration layer.
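Mechanically, even the manual workarounds reduce to the same shape: rank candidate context by some relevance estimate and pack greedily under a token budget. A hedged sketch — the relevance scores are exactly the unsolved part, so here they are simply assumed as inputs:

```python
# Greedy context packing under a token budget. The relevance scores are
# the unsolved part of context scheduling -- assumed here, not computed.

def pack_context(candidates: list[dict], budget: int) -> list[str]:
    """candidates: [{"name": str, "tokens": int, "relevance": float}]"""
    chosen, used = [], 0
    for c in sorted(candidates, key=lambda c: c["relevance"], reverse=True):
        if used + c["tokens"] <= budget:
            chosen.append(c["name"])
            used += c["tokens"]
    return chosen

candidates = [
    {"name": "auth codebase summary", "tokens": 3000, "relevance": 0.9},
    {"name": "API design doc",        "tokens": 2000, "relevance": 0.8},
    {"name": "last week's decisions", "tokens": 1500, "relevance": 0.7},
    {"name": "full git history",      "tokens": 9000, "relevance": 0.2},
]
print(pack_context(candidates, budget=6000))
```

Everything hard hides in those relevance numbers — which is the point of the table above.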

K8s health checks assume binary state: healthy or not. Agent failures are gradient.

| Failure Type | K8s Detection | Agent Detection |
| --- | --- | --- |
| Crash | Process exit code | Exception / timeout |
| Hang | Liveness probe | Token stream stops |
| Wrong answer | N/A | Output validation (hard) |
| Subtle drift | N/A | Semantic comparison (harder) |
| Confident error | N/A | Currently undetectable |

The gap: K8s never has to ask “is this output correct?” It only asks “is the process alive?” Agent orchestration must answer both.

Real agent fleets reveal patterns that theory alone misses.

Each agent in a pipeline starts with a fresh context window, receiving only explicit inputs from the previous step. This directly counters drift: without the reset, the agent at step 7 is reasoning from a different implicit model than the agent at step 1.

The tradeoff: you must explicitly design what context passes between steps. This forces clarity about what actually matters at each stage.

Source: Antfarm
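The fresh-context discipline can be sketched as each step being a pure function of its explicit handoff — nothing else crosses the boundary. `run_agent` below is a stand-in for a real model call, and all names are illustrative:

```python
# Each step gets a fresh context: the ONLY thing it sees is the explicit
# handoff dict. run_agent is a stand-in for starting a new model session.

def run_agent(role: str, handoff: dict) -> dict:
    # A real implementation would open a new session with only `handoff`
    # injected -- never the previous step's full transcript.
    return {"role": role, "inputs_seen": sorted(handoff)}

def pipeline(task: str) -> list[dict]:
    plan   = run_agent("planner",  {"task": task})
    impl   = run_agent("builder",  {"task": task, "plan": plan})
    review = run_agent("verifier", {"task": task, "impl": impl})
    return [plan, impl, review]

for step in pipeline("add rate limiting"):
    print(step["role"], step["inputs_seen"])
```

The design cost the text mentions is visible here: every key in each handoff dict is a deliberate decision about what the next stage needs.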

The developer doesn’t mark their own homework. A separate agent verifies implementation against acceptance criteria.

This catches a class of errors that self-review cannot: rationalization, satisfied-by-construction failures, and blind spots from having written the code. Simple to implement. Most systems skip it.

Source: Antfarm

Routing decisions — which agent handles a task — should be structured artifacts, not prose. Pydantic models or equivalent typed schemas prevent the orchestration layer from becoming the weakest link.

Source: AgenticFleet (via DSPy signatures)
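The same idea works with any typed schema. A hedged equivalent using stdlib dataclasses rather than Pydantic or DSPy (the agent registry and field names are assumptions for the sketch): the routing decision is a validated object that either constructs cleanly or fails loudly, never free prose.

```python
from dataclasses import dataclass

KNOWN_AGENTS = {"planner", "builder", "verifier"}  # assumed registry

@dataclass(frozen=True)
class RoutingDecision:
    """A routing decision as a structured artifact, not prose."""
    agent: str
    reason: str
    confidence: float

    def __post_init__(self):
        if self.agent not in KNOWN_AGENTS:
            raise ValueError(f"unknown agent: {self.agent}")
        if not 0.0 <= self.confidence <= 1.0:
            raise ValueError("confidence must be in [0, 1]")

decision = RoutingDecision(
    agent="builder", reason="task is implementation work", confidence=0.8
)
print(decision.agent)
```

A malformed decision (unknown agent, out-of-range confidence) raises at the boundary, instead of silently routing work to nowhere.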

Name your modes: sequential, parallel, delegated, handoff, discussion. The vocabulary itself improves reasoning about agent coordination, regardless of which framework you use.

“Discussion” as a first-class mode — multi-agent deliberation — acknowledges that some problems need deliberation, not just delegation.

Source: AgenticFleet

Checkpoint workflow state so you can rewind and replay from any point. Without it, debugging a failed multi-agent run means replaying from scratch.

Rare in practice. High engineering cost. Invaluable when you need it.

Source: Microsoft Agent Framework
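A minimal sketch of the mechanism — not the Microsoft Agent Framework's actual API, just the shape of checkpoint-and-rewind: snapshot state before each step so a failed run can resume from any boundary instead of restarting from scratch.

```python
import copy

# Checkpoint-and-replay sketch: snapshot workflow state before each step
# so a failed run can be rewound to any boundary, not restarted from zero.

class Workflow:
    def __init__(self):
        self.state = {"step": 0, "artifacts": []}
        self.checkpoints = []

    def run_step(self, name: str):
        self.checkpoints.append(copy.deepcopy(self.state))  # snapshot first
        self.state["step"] += 1
        self.state["artifacts"].append(name)

    def rewind(self, to_step: int):
        """Restore the snapshot taken just before `to_step` ran."""
        self.state = copy.deepcopy(self.checkpoints[to_step])
        self.checkpoints = self.checkpoints[:to_step]

wf = Workflow()
for name in ["plan", "implement", "verify"]:
    wf.run_step(name)
wf.rewind(to_step=1)          # replay from just before "implement"
print(wf.state["artifacts"])  # ["plan"]
```

The real engineering cost is serializing agent state — conversation history, tool results, intermediate artifacts — not the rewind logic itself.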

Not every step needs an LLM. Mixing deterministic functions with agent calls in the same orchestration graph — with the same interface — prevents the common failure of using an LLM where json.loads() suffices.

Source: Microsoft Agent Framework
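The uniform interface can be sketched as follows — both node types are just functions from state to state, so the orchestration graph never needs to know which is which (`summarize` stands in for a real model call):

```python
import json
from typing import Callable

# One interface for every node: a step maps dict -> dict, whether it is
# deterministic code or (a stand-in for) an agent call.
Step = Callable[[dict], dict]

def parse_payload(state: dict) -> dict:
    # Deterministic: json.loads suffices, no LLM needed.
    return {**state, "payload": json.loads(state["raw"])}

def summarize(state: dict) -> dict:
    # Stand-in for an agent call; same signature as the function above.
    return {**state, "summary": f"{len(state['payload'])} fields parsed"}

def run(steps: list[Step], state: dict) -> dict:
    for step in steps:
        state = step(state)
    return state

result = run([parse_payload, summarize], {"raw": '{"a": 1, "b": 2}'})
print(result["summary"])  # 2 fields parsed
```

Because both kinds of step share one signature, swapping an LLM call for a deterministic function (or back) is a one-line change in the graph.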

When orchestrating:

  1. State your intent, not your steps — declarative beats imperative
  2. Budget context like memory — every token has an opportunity cost
  3. Validate outputs, not just completion — “done” isn’t “correct”
  4. Design for cold start — assume every session begins from zero
  5. Make handoffs explicit — structured summaries, not “it’s in the chat”
  6. Registry before wiring — know what exists before connecting it
When debugging a failed run:

  1. Check what the agent saw — bad input explains bad output
  2. Reproduce the context, not just the prompt — same prompt, different window, different result
  3. Look for semantic failure — the agent completed successfully and produced garbage
  4. Suspect the handoff — most failures happen at boundaries
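"Check what the agent saw" is only possible if you recorded it. A sketch of a tracing wrapper that logs the exact context each worker received at the handoff boundary (all names hypothetical):

```python
import hashlib
import json

# Record the exact context each worker received, so "check what the agent
# saw" is a lookup, not archaeology. Names are hypothetical.

trace: list[dict] = []

def call_with_trace(agent: str, context: dict) -> dict:
    blob = json.dumps(context, sort_keys=True)
    trace.append({
        "agent": agent,
        "context_hash": hashlib.sha256(blob.encode()).hexdigest()[:12],
        "context": context,  # store the full window, not just the prompt
    })
    return {"status": "done"}  # stand-in for the real agent call

call_with_trace("builder", {
    "prompt": "add rate limiting",
    "plan": "use a token bucket",
})
print(trace[0]["agent"], trace[0]["context_hash"])
```

The hash makes "same prompt, different window" diagnosable at a glance: two runs with identical prompts but different hashes had different contexts.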

Where are you on the orchestration spectrum?

| Level | K8s Equivalent | Agent Equivalent | Indicator |
| --- | --- | --- | --- |
| 0 | Manual deployment | Copy-paste prompts | “I’ll just run it myself” |
| 1 | Shell scripts | Single agent, manual review | “Claude handles the simple stuff” |
| 2 | Docker Compose | Multi-agent, structured | “Agents coordinate through files” |
| 3 | K8s with operators | Orchestrated with registry | “The system routes work automatically” |
| 4 | Service mesh | Context-aware routing | Nobody is here yet |

Most teams are at level 1-2. Level 3 requires a registry and contracts. Level 4 requires solving context routing.

You don’t need K8s experience to orchestrate agents — but if you have it, recognize which instincts transfer and which deceive. The API and data model matter more than the scheduler. Complexity must be earned by scale, not borrowed from ambition.