Workers
Target Architecture — Final-State Design
This page enumerates the final-state background workers of the Integration Platform. Each worker is a MassTransit consumer or scheduled host running in the same .NET 10 process family as its owning service, deduplicating on eventId, and emitting canonical events for every meaningful state change.
The Integration Platform runs 5 background workers that handle the asynchronous, resilient, and scheduled work that must not block a synchronous API call: delivering webhooks with retry, syncing external state, rotating credentials, probing external health, and retrying failed integration runs. Every worker is idempotent, tenant-scoped, and traceable — it carries the traceId from the triggering envelope into all downstream work.
Worker Catalog
| Worker | Trigger | Purpose | Input | Output | Retry | Idempotency |
|---|---|---|---|---|---|---|
| `WebhookDeliveryWorker` | `WebhookDelivery` queued (event/command) | Deliver outbound factory events to subscribed endpoints with signed payloads | `WebhookDelivery` (pending) | HTTP POST to endpoint; `WebhookDelivered` or `IntegrationFailed` | Exponential backoff, max 8 attempts, then dead-letter | Dedup on `deliveryId`; delivery state machine guards re-send |
| `IntegrationSyncWorker` | Schedule (per connection cadence) + on-demand command | Pull external state (repos, tickets, contacts) into the factory and reconcile | `IntegrationConnection` + sync cursor | Normalised events; updated `IntegrationRun` | Backoff per provider rate limit; resume from cursor | Cursor/`syncToken` + `contentHash` skip unchanged |
| `CredentialRotationWorker` | Schedule (rotation policy) + `POST /integrations/credentials/rotate` | Rotate Key Vault secret versions and re-test the connection | `IntegrationCredential` + rotation reason | New Key Vault version; verification `IntegrationRun`; `CredentialRotated` | Retry rotation up to 3×; on verify failure, keep prior version active | Rotation token per `credentialId`; no-op if current version is target |
| `ExternalApiHealthWorker` | Schedule (probe interval) | Probe vendor endpoints, update connection health, gate routing | `IntegrationConnection`, `IntegrationProvider` | Health status; `IntegrationFailed` on outage; metrics | Probe retried within window; circuit opens on sustained failure | Probe keyed by `connectionId` + window; latest-wins health write |
| `IntegrationRetryWorker` | `IntegrationFailed` event + schedule sweep | Retry transient integration-run failures per policy; escalate poison | `IntegrationFailure`, `IntegrationRun` | Re-executed run; `IntegrationRunCompleted` or escalation | Policy-driven backoff; max attempts from `RetryPolicy` | Dedup on `failureId` + attempt counter; poison after cap |
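
The retry column for `WebhookDeliveryWorker` (exponential backoff, at most 8 attempts, then dead-letter) can be sketched roughly as follows. This is an illustrative Python sketch, not the platform's .NET code: the base delay, growth factor and ceiling are assumed values that the catalog does not specify, and `BackoffPolicy` and `next_step` are hypothetical names.

```python
from dataclasses import dataclass

MAX_ATTEMPTS = 8  # from the catalog: max 8 attempts, then dead-letter


@dataclass(frozen=True)
class BackoffPolicy:
    # Assumed illustrative values; the catalog does not state them.
    base_seconds: float = 5.0
    factor: float = 2.0
    cap_seconds: float = 600.0

    def delay_before_retry(self, failed_attempts: int) -> float:
        """Delay to wait after `failed_attempts` consecutive failures (>= 1)."""
        if failed_attempts < 1:
            raise ValueError("failed_attempts must be >= 1")
        raw = self.base_seconds * self.factor ** (failed_attempts - 1)
        # Clamp so late retries never wait longer than the ceiling.
        return min(raw, self.cap_seconds)


def next_step(failed_attempts: int, policy: BackoffPolicy = BackoffPolicy()):
    """Return ("retry", delay_seconds), or ("dead_letter", None) at the cap."""
    if failed_attempts >= MAX_ATTEMPTS:
        return ("dead_letter", None)
    return ("retry", policy.delay_before_retry(failed_attempts))
```

Under these assumed defaults the delay doubles from 5 seconds after the first failure, and the eighth failure dead-letters the delivery instead of scheduling another attempt. A real deployment would probably also add jitter so that retries from many deliveries do not align.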
Event Flow
```mermaid
flowchart TB
    subgraph Outbound["Outbound Webhook Delivery"]
        Evt["Factory event<br/>(e.g. DeploymentPromoted)"] --> Match["Match subscriptions"]
        Match --> Queue["Enqueue WebhookDelivery"]
        Queue --> WDW["WebhookDeliveryWorker"]
        WDW -->|2xx| Done["WebhookDelivered"]
        WDW -->|"non-2xx / timeout"| Backoff["Backoff + retry"]
        Backoff --> WDW
        Backoff -->|"cap reached"| DLQ["Dead-letter +<br/>IntegrationFailure"]
    end
    subgraph Resilience["Failure & Retry"]
        Fail["IntegrationFailed"] --> IRW["IntegrationRetryWorker"]
        IRW -->|transient| Retry["Re-execute run"]
        Retry --> Ok["IntegrationRunCompleted"]
        IRW -->|"poison / cap"| Escalate["Escalate to Observability"]
    end
    subgraph Scheduled["Scheduled Maintenance"]
        Clock["Scheduler"] --> Sync["IntegrationSyncWorker"]
        Clock --> Rot["CredentialRotationWorker"]
        Clock --> Health["ExternalApiHealthWorker"]
        Rot --> Rotated["CredentialRotated"]
        Health -->|outage| Fail
    end
```
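
The "Failure & Retry" branch of the flow amounts to a triage decision: transient failures under the attempt cap are re-executed, everything else is escalated. The Python sketch below illustrates that decision only. The `transient` flag and the field names are assumptions, since the page does not say how a failure is classified, and `triage` is a hypothetical helper rather than part of the platform.

```python
from dataclasses import dataclass
from enum import Enum


class Outcome(Enum):
    RETRY = "retry"        # re-execute the integration run
    ESCALATE = "escalate"  # hand off to Observability


@dataclass(frozen=True)
class IntegrationFailure:
    failure_id: str
    attempt: int      # attempts already made for this failure
    transient: bool   # assumed flag; classification rules are not in the source


@dataclass(frozen=True)
class RetryPolicy:
    max_attempts: int


def triage(failure: IntegrationFailure, policy: RetryPolicy) -> Outcome:
    """Decide whether a failed run is retried or escalated."""
    if not failure.transient:
        # Poison: retrying cannot succeed, so escalate immediately.
        return Outcome.ESCALATE
    if failure.attempt >= policy.max_attempts:
        # Cap reached: stop retrying and escalate.
        return Outcome.ESCALATE
    return Outcome.RETRY
```

Keeping the decision a pure function of the failure and the policy makes it straightforward to unit-test separately from the message transport.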
Worker Concerns
- At-least-once + idempotent. All workers assume redelivery. Each derives an idempotency key from `eventId` (or `deliveryId`/`failureId`) plus the handler name, per the event envelope consumer rules.
- Tenant guard. Every handler asserts `tenantId` before touching a store or vendor endpoint.
- Poison handling. Unprocessable messages move to a dead-letter subqueue with the full envelope preserved for replay; an `IntegrationFailure` aggregate records the cause. See Workflows.
- Backpressure & rate limits. `IntegrationSyncWorker` and `WebhookDeliveryWorker` respect per-provider rate-limit buckets surfaced by the Vendor Registry, backing off rather than overwhelming a vendor.
- Circuit breaking. `ExternalApiHealthWorker` opens a circuit on sustained failure so dependent services fail fast and stop sending traffic to a degraded vendor.
- Trace propagation. `traceId` and `correlationId` flow into OpenTelemetry spans and Serilog context for end-to-end correlation in Observability.
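
The first two concerns (idempotency under redelivery and the tenant guard) can be illustrated with a minimal Python sketch. It is not the platform's MassTransit code: the `Envelope` fields, the `handler:eventId` key format, the non-empty check used as the tenant guard, and the in-memory set standing in for a durable dedup store are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Envelope:
    event_id: str
    tenant_id: str
    trace_id: str


def idempotency_key(envelope: Envelope, handler_name: str) -> str:
    # Assumed key format: handler name plus eventId, as the list describes.
    return f"{handler_name}:{envelope.event_id}"


class IdempotentHandler:
    def __init__(self, name: str, action: Callable[[Envelope], None]):
        self.name = name
        self._action = action
        self._seen: set[str] = set()  # stand-in for a durable dedup store

    def handle(self, envelope: Envelope) -> bool:
        """Run the action once per key; return False for a redelivery."""
        # Tenant guard (assumed form): refuse to proceed without a tenantId.
        if not envelope.tenant_id:
            raise ValueError("envelope is missing tenantId")
        key = idempotency_key(envelope, self.name)
        if key in self._seen:
            return False  # already processed; redelivery is a no-op
        self._action(envelope)
        # Record the key only after success, so a failed action is retried
        # when the message is redelivered.
        self._seen.add(key)
        return True
```

Because the key includes the handler name, two different handlers can each process the same event once without blocking one another.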