Workers

Target Architecture — Final-State Design

The 7 workers below run the continuous, asynchronous operations of the platform. They are deployed as Azure Container Apps / Azure Functions, consume the canonical event envelope from Azure Service Bus, and deduplicate on eventId.
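Deduplication on eventId can be sketched as follows. This is a minimal illustration, not the platform's consumer: the `DedupingConsumer` class, the in-memory `seen` set, and the dict-shaped envelope are assumptions made for the example, and a real worker would need a durable, shared store instead of process memory.

```python
from dataclasses import dataclass, field
from typing import Callable, Set


@dataclass
class DedupingConsumer:
    """Hypothetical wrapper that drops envelopes whose eventId was already handled."""

    handler: Callable[[dict], None]
    seen: Set[str] = field(default_factory=set)  # stand-in for a durable dedupe store

    def receive(self, envelope: dict) -> bool:
        """Return True if the handler ran, False if the envelope was a duplicate."""
        event_id = envelope["eventId"]
        if event_id in self.seen:
            return False  # redelivery of an already handled event: skip it
        self.handler(envelope)
        # Record the id only after the handler succeeds, so that a failed
        # attempt is still eligible for redelivery and retry.
        self.seen.add(event_id)
        return True
```

Recording the id after the handler returns gives at-least-once handling; that is why the per-worker idempotency keys in the catalog are still needed on top of eventId deduplication.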

Workers carry the platform's autonomous behaviour: they react to runtime events, poll Azure control planes, and reconcile state without human intervention. Each worker is idempotent, tenant-aware, and emits enveloped events correlated by traceId.
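To illustrate what "correlated by traceId" could look like in practice, the sketch below derives an outgoing envelope from the triggering one. Only eventId, tenantId, traceId, and correlationId are named in this document; the `derive_event` helper and the `eventType`, `occurredAt`, and `payload` fields are assumptions made for the example.

```python
import uuid
from datetime import datetime, timezone


def derive_event(trigger: dict, event_type: str, payload: dict) -> dict:
    """Build a follow-up envelope that inherits correlation fields from its trigger."""
    return {
        "eventId": str(uuid.uuid4()),               # fresh id: consumers dedupe on it
        "eventType": event_type,                    # hypothetical field name
        "tenantId": trigger["tenantId"],            # tenant scope is carried over
        "traceId": trigger["traceId"],              # same trace as the triggering event
        "correlationId": trigger["correlationId"],
        "occurredAt": datetime.now(timezone.utc).isoformat(),  # hypothetical field
        "payload": payload,                         # hypothetical field
    }
```

For example, a worker handling a RuntimeDeploymentRequested envelope could call `derive_event(trigger, "RuntimeDeploymentCompleted", {...})` so that both events share one traceId while carrying distinct eventIds.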

Worker Catalog

| Worker | Trigger | Purpose | Input | Output | Retry | Idempotency |
| --- | --- | --- | --- | --- | --- | --- |
| RuntimeDeploymentWorker | Deploy command / RuntimeDeploymentRequested event | Execute the rollout of generated workloads to Azure compute and drive the deployment state machine. | RuntimeDeployment spec, config + secret bindings, images | Rollout to AKS/ACA/Functions/App Service; RuntimeDeploymentCompleted | Exponential backoff (max 5), then dead-letter | Keyed on deploymentId + step; re-applying a completed step is a no-op |
| RuntimeHealthWorker | Timer (every 30s) + dependency events | Evaluate liveness/readiness/dependency health of every running component. | Service inventory, health probe endpoints | HealthCheckResult; HealthCheckCompleted | Per-probe retry (3), degraded after threshold | Keyed on serviceId + evaluation window; latest result wins |
| ScalingPolicyWorker | Timer (every 60s) + HealthCheckCompleted | Evaluate scaling policies against live telemetry and apply replica/throughput changes. | ScalingPolicy, App Insights metrics | Scale action; ScalingPolicyApplied | Backoff (max 4); cooldown between actions | Keyed on serviceId + desired replica count; no-op if already at target |
| DriftDetectionWorker | Timer (every 5m) + RuntimeDeploymentCompleted | Compare actual runtime state to Git/Pulumi desired state and raise findings. | Live inventory snapshot, desired-state manifests | RuntimeDriftFinding; RuntimeDriftDetected | Backoff (max 3) on transient read errors | Keyed on environmentId + finding hash; duplicate findings merged |
| ConfigurationSyncWorker | ConfigurationPublished event | Propagate versioned runtime configuration to running workloads. | RuntimeConfiguration version | Applied config; ConfigurationSynced | Backoff (max 5), then dead-letter | Keyed on configurationId + serviceId; re-apply of same version is a no-op |
| SecretRotationWorker | Timer (rotation schedule) + SecretRotationRequested | Rotate Key Vault secrets and refresh workload bindings via managed identity. | SecretBinding, Key Vault references | New secret version; SecretRotated | Backoff (max 5); alert on repeated failure | Keyed on secretBindingId + Key Vault version; idempotent on version |
| RuntimeInventoryWorker | Timer (every 2m) + deployment/scale events | Reconcile the live inventory of generated components from Azure control planes. | AKS/ACA/Functions/App Service APIs | Updated RuntimeService records; RuntimeInventoryUpdated | Backoff (max 3) on throttling | Keyed on serviceId; converges to observed state |
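The "exponential backoff, then dead-letter" retry policy in the catalog can be sketched as below. This is an illustration under stated assumptions: the `run_with_backoff` helper, the one-second base delay, the doubling factor, and the reading of "max 5" as five total attempts are all choices made for the example, not values taken from the platform.

```python
import time
from typing import Any, Callable, Optional, Tuple


def run_with_backoff(
    action: Callable[[], Any],
    max_attempts: int = 5,            # assumption: "max 5" read as five total attempts
    base_delay: float = 1.0,          # assumption: first wait of one second
    sleep: Callable[[float], None] = time.sleep,
    dead_letter: Optional[Callable[[Exception], None]] = None,
) -> Tuple[str, Any]:
    """Try the action, doubling the wait after each failure, then give up."""
    last_error: Optional[Exception] = None
    for attempt in range(max_attempts):
        try:
            return ("completed", action())
        except Exception as exc:
            last_error = exc
            if attempt < max_attempts - 1:
                sleep(base_delay * 2 ** attempt)  # waits of 1, 2, 4, 8 x base_delay
    if dead_letter is not None:
        dead_letter(last_error)  # hand the failure over for later inspection or replay
    return ("dead-lettered", last_error)
```

Passing `sleep` as a parameter keeps the schedule testable without real waiting. Because a step may be retried after partially succeeding, this pattern is only safe alongside the step-level idempotency keys listed in the last column.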

Event-Flow Diagram

flowchart LR
    Deploy["Deploy command"] --> DepW["RuntimeDeploymentWorker"]
    DepW -->|"RuntimeDeploymentCompleted"| InvW["RuntimeInventoryWorker"]
    DepW -->|"RuntimeDeploymentCompleted"| DriftW["DriftDetectionWorker"]
    Timer30["Timer 30s"] --> HealthW["RuntimeHealthWorker"]
    HealthW -->|"HealthCheckCompleted"| ScaleW["ScalingPolicyWorker"]
    HealthW -->|"HealthCheckCompleted"| DriftW
    ScaleW -->|"ScalingPolicyApplied"| InvW
    CfgPub["ConfigurationPublished"] --> CfgW["ConfigurationSyncWorker"]
    CfgW -->|"ConfigurationSynced"| DepW
    RotSched["Rotation schedule"] --> SecW["SecretRotationWorker"]
    SecW -->|"SecretRotated"| DepW
    InvW -->|"RuntimeInventoryUpdated"| DriftW
    DriftW -->|"RuntimeDriftDetected"| DepW
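The event-labelled edges of the diagram can also be read as data, which makes it easy to ask which workers consume a given event or which workers sit downstream of another. The sketch below transcribes those edges by hand; the `EDGES` list and the `consumers_of` and `downstream` helpers are hypothetical names introduced for illustration.

```python
from collections import deque
from typing import List, Set

# (emitting worker, event, consuming worker), transcribed from the diagram above.
EDGES = [
    ("RuntimeDeploymentWorker", "RuntimeDeploymentCompleted", "RuntimeInventoryWorker"),
    ("RuntimeDeploymentWorker", "RuntimeDeploymentCompleted", "DriftDetectionWorker"),
    ("RuntimeHealthWorker", "HealthCheckCompleted", "ScalingPolicyWorker"),
    ("RuntimeHealthWorker", "HealthCheckCompleted", "DriftDetectionWorker"),
    ("ScalingPolicyWorker", "ScalingPolicyApplied", "RuntimeInventoryWorker"),
    ("ConfigurationSyncWorker", "ConfigurationSynced", "RuntimeDeploymentWorker"),
    ("SecretRotationWorker", "SecretRotated", "RuntimeDeploymentWorker"),
    ("RuntimeInventoryWorker", "RuntimeInventoryUpdated", "DriftDetectionWorker"),
    ("DriftDetectionWorker", "RuntimeDriftDetected", "RuntimeDeploymentWorker"),
]


def consumers_of(event: str) -> List[str]:
    """Workers that the diagram shows receiving the given event, in diagram order."""
    return [target for _source, name, target in EDGES if name == event]


def downstream(worker: str) -> Set[str]:
    """Workers reachable from `worker` by following emitted events (breadth-first)."""
    reached: Set[str] = set()
    queue = deque([worker])
    while queue:
        current = queue.popleft()
        for source, _event, target in EDGES:
            if source == current and target not in reached:
                reached.add(target)
                queue.append(target)
    return reached
```

In this transcription `downstream("RuntimeDeploymentWorker")` contains RuntimeDeploymentWorker itself, because deployment, inventory, and drift detection form a loop. Handlers on that loop therefore need to treat repeated events against unchanged state as no-ops.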

Reliability Patterns

  • Idempotency — every worker deduplicates inbound messages on eventId and applies a handler-scoped idempotency key (see Event Envelope — Consumer rules).
  • Poison handling — unprocessable messages move to a dead-letter subqueue with the full envelope preserved for replay.
  • Tenant guard — each handler asserts tenantId against the target environment's RuntimeTenantBinding before acting.
  • Cooldowns & convergence — control-loop workers (health, scaling, drift, inventory) are convergent: repeated runs against unchanged state are no-ops, and scaling enforces cooldowns to avoid flapping.
  • Trace propagation — traceId and correlationId flow from the triggering event into every emitted event and into OpenTelemetry spans and Serilog context.
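Two of the patterns above, the tenant guard and the cooldown-gated convergent loop, can be sketched together. Everything here is a simplified assumption for illustration: the `assert_tenant` helper, an `environmentId` field on the envelope, a plain dict standing in for RuntimeTenantBinding lookups, and the 300-second cooldown are not taken from the platform.

```python
from dataclasses import dataclass, field
from typing import Dict


class TenantMismatch(Exception):
    """Raised when an envelope's tenant does not own the target environment."""


def assert_tenant(envelope: dict, bindings: Dict[str, str]) -> None:
    """Refuse to act unless the environment is bound to the envelope's tenant.

    `bindings` maps environmentId to tenantId and stands in for a
    RuntimeTenantBinding lookup (hypothetical shape).
    """
    environment_id = envelope["environmentId"]  # hypothetical envelope field
    if bindings.get(environment_id) != envelope["tenantId"]:
        raise TenantMismatch(environment_id)


@dataclass
class ScalingLoop:
    """Convergent scaling decision with a per-service cooldown."""

    cooldown_seconds: float = 300.0  # assumption: illustrative value only
    last_action_at: Dict[str, float] = field(default_factory=dict)

    def reconcile(self, service_id: str, current: int, desired: int, now: float) -> str:
        if current == desired:
            return "noop"       # already at target: repeated runs change nothing
        last = self.last_action_at.get(service_id)
        if last is not None and now - last < self.cooldown_seconds:
            return "cooldown"   # too soon after the previous action: avoid flapping
        self.last_action_at[service_id] = now
        return "scaled"
```

Taking `now` as an argument rather than reading the clock inside `reconcile` keeps the decision deterministic, so the same inputs always yield the same outcome.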