Workers¶
Target Architecture — Final-State Design
Workers are the asynchronous engine of the Agent Mesh. They consume commands and events from Azure Service Bus (via MassTransit) and drive the task pipeline forward without blocking API callers. Every worker is idempotent, tenant-aware, and retry-safe, deriving its idempotency key per the metadata schema.
All workers honor the consumer rules of the canonical event envelope: deduplicate on eventId, assert tenantId before touching a store, propagate traceId/correlationId into telemetry, and dead-letter unprocessable messages with the full envelope preserved.
Worker Catalog¶
| Worker | Trigger | Purpose | Input | Output | Retry | Idempotency |
|---|---|---|---|---|---|---|
AgentTaskDispatchWorker |
AgentTaskAssigned event |
Match an assigned task to a warm agent instance and start the runtime pipeline. | AgentTaskAssigned envelope |
Claimed task + AgentExecutionStarted |
Exponential backoff, 5 attempts; then dead-letter | sha256(eventId + ":AgentTaskDispatch:" + taskId) — skips if execution already exists for task |
ExecutionTimeoutWorker |
Scheduled timer / task deadline |
Detect executions exceeding deadline or heartbeat and fail or requeue them. | Active AgentExecution records |
AgentTaskFailed or re-dispatch |
3 attempts on transient store errors | Keyed on executionId + timeout epoch; transition guarded by current state |
CorrectionRetryWorker |
ValidationFailed event |
Construct correction feedback and re-run the skill within the attempt budget. | ValidationFailed envelope + ValidationResult |
CorrectionAttempted + new skill run |
Bounded by maxCorrectionAttempts (no extra messaging retry) |
sha256(eventId + ":CorrectionRetry:" + executionId); attempt counter is monotonic |
ModelInvocationWorker |
InvokeModel command |
Execute a model call via the ModelRouterService and record the result. |
Model request (prompt, policy) | ModelInvoked + ModelInvocation record |
Provider-aware backoff; failover to alternate provider per policy | sha256(eventId + ":ModelInvocation:" + executionId + ":" + stepId) |
ToolInvocationWorker |
InvokeTool command |
Execute an MCP tool call via the ToolAdapterService and record the result. |
Tool request (tool, args, scope) | ToolInvoked + ToolInvocation record |
3 attempts for transient tool errors; non-idempotent tools run once | sha256(eventId + ":ToolInvocation:" + executionId + ":" + stepId) |
AgentHealthProbeWorker |
Scheduled timer | Probe pooled agent instances and update AgentHealthStatus. |
Pool roster | AgentHealthChanged on transition |
2 attempts; missed probe marks degraded | Keyed on agentId + probe epoch; last-writer-wins on status |
TelemetryFlushWorker |
Scheduled timer / batch threshold | Flush buffered model/tool telemetry and metrics to App Insights / OTEL. | Buffered ModelInvocation / ToolInvocation |
Exported spans + metrics | At-least-once export; consumers dedupe | Keyed on batch id; exporters are idempotent on span id |
PoisonTaskWorker |
Dead-letter subqueue | Quarantine and triage repeatedly failing messages for replay or human review. | Dead-lettered envelope | Quarantine record + operator alert | No auto-retry; manual replay | Keyed on original eventId; quarantine record is upserted |
Dispatch and Correction Flow¶
flowchart TB
Assigned["AgentTaskAssigned"] --> Dispatch["AgentTaskDispatchWorker"]
Dispatch --> Exec["Runtime execution"]
Exec --> ModelW["ModelInvocationWorker"]
Exec --> ToolW["ToolInvocationWorker"]
Exec --> Validate["Validation"]
Validate -->|"failed"| CorrW["CorrectionRetryWorker"]
CorrW --> Exec
Exec --> Timeout["ExecutionTimeoutWorker watches deadline"]
Exec --> Telemetry["TelemetryFlushWorker"]
Dispatch -.->|"unprocessable"| Poison["PoisonTaskWorker"]
Hold "Alt" / "Option" to enable pan & zoom
Worker Guarantees¶
- Idempotency — each worker stores its outcome under the derived key so retries and duplicate deliveries return the original result rather than re-executing side effects.
- Tenant isolation — handlers assert
tenantIdagainst the operation scope before any store access. - Bounded correction —
CorrectionRetryWorkernever exceeds the task'smaxCorrectionAttempts; exhaustion triggersAgentTaskFailedand human escalation. - Poison safety —
PoisonTaskWorkerpreserves the full envelope for replay; nothing is silently dropped.