Workers

Target Architecture — Final-State Design

This page enumerates the final-state background workers of the Integration Platform. Each worker is a MassTransit consumer or scheduled host that runs in the same .NET 10 process family as its owning service, deduplicates on eventId, and emits a canonical event for every meaningful state change.

The Integration Platform runs five background workers for the asynchronous, resilient, and scheduled work that must not block a synchronous API call: delivering webhooks with retry, syncing external state, rotating credentials, probing external health, and retrying failed integration runs. Every worker is idempotent, tenant-scoped, and traceable: it carries the traceId from the triggering envelope into all downstream work.

Worker Catalog

| Worker | Trigger | Purpose | Input | Output | Retry | Idempotency |
|---|---|---|---|---|---|---|
| WebhookDeliveryWorker | WebhookDelivery queued (event/command) | Deliver outbound factory events to subscribed endpoints with signed payloads | WebhookDelivery (pending) | HTTP POST to endpoint; WebhookDelivered or IntegrationFailed | Exponential backoff, max 8 attempts, then dead-letter | Dedup on deliveryId; delivery state machine guards re-send |
| IntegrationSyncWorker | Schedule (per connection cadence) + on-demand command | Pull external state (repos, tickets, contacts) into the factory and reconcile | IntegrationConnection + sync cursor | Normalised events; updated IntegrationRun | Backoff per provider rate limit; resume from cursor | Cursor/syncToken + contentHash skip unchanged |
| CredentialRotationWorker | Schedule (rotation policy) + POST /integrations/credentials/rotate | Rotate Key Vault secret versions and re-test the connection | IntegrationCredential + rotation reason | New Key Vault version; verification IntegrationRun; CredentialRotated | Retry rotation up to 3×; on verify failure, keep prior version active | Rotation token per credentialId; no-op if current version is target |
| ExternalApiHealthWorker | Schedule (probe interval) | Probe vendor endpoints, update connection health, gate routing | IntegrationConnection, IntegrationProvider | Health status; IntegrationFailed on outage; metrics | Probe retried within window; circuit opens on sustained failure | Probe keyed by connectionId + window; latest-wins health write |
| IntegrationRetryWorker | IntegrationFailed event + schedule sweep | Retry transient integration-run failures per policy; escalate poison | IntegrationFailure, IntegrationRun | Re-executed run; IntegrationRunCompleted or escalation | Policy-driven backoff; max attempts from RetryPolicy | Dedup on failureId + attempt counter; poison after cap |
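
The WebhookDeliveryWorker retry rule in the catalog (exponential backoff, at most 8 attempts, then dead-letter) can be sketched as a small scheduling function. This is an illustrative Python sketch, not the platform's .NET implementation: the base delay, the delay ceiling, the jitter parameter, and the name `next_delay` are assumptions, since this page only fixes the attempt cap.

```python
import random
from typing import Optional

MAX_ATTEMPTS = 8      # from the catalog: "max 8 attempts, then dead-letter"
BASE_DELAY_S = 2.0    # assumption: the base delay is not specified on this page
MAX_DELAY_S = 900.0   # assumption: ceiling so late attempts never wait unboundedly


def next_delay(attempt: int, jitter: float = 0.0) -> Optional[float]:
    """Return the seconds to wait before the next attempt, or None to dead-letter.

    `attempt` is the number of delivery attempts already made (1-based).
    `jitter` is a fraction of the delay added at random to spread out retries.
    """
    if attempt >= MAX_ATTEMPTS:
        return None  # cap reached: the caller dead-letters the delivery
    delay = min(BASE_DELAY_S * 2 ** (attempt - 1), MAX_DELAY_S)
    return delay + random.uniform(0, jitter * delay)


# Without jitter, the wait doubles after each failed attempt until the cap.
schedule = [next_delay(n) for n in range(1, MAX_ATTEMPTS + 1)]
```

Under these assumed constants the waits are 2, 4, 8, 16, 32, 64, and 128 seconds, and the eighth failed attempt returns `None`, which is the point where the delivery would be dead-lettered. A non-zero `jitter` keeps many failing deliveries to the same endpoint from retrying in lockstep.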

Event Flow

flowchart TB
    subgraph Outbound["Outbound Webhook Delivery"]
        Evt["Factory event<br/>(e.g. DeploymentPromoted)"] --> Match["Match subscriptions"]
        Match --> Queue["Enqueue WebhookDelivery"]
        Queue --> WDW["WebhookDeliveryWorker"]
        WDW -->|2xx| Done["WebhookDelivered"]
        WDW -->|"non-2xx / timeout"| Backoff["Backoff + retry"]
        Backoff --> WDW
        Backoff -->|"cap reached"| DLQ["Dead-letter +<br/>IntegrationFailure"]
    end

    subgraph Resilience["Failure & Retry"]
        Fail["IntegrationFailed"] --> IRW["IntegrationRetryWorker"]
        IRW -->|transient| Retry["Re-execute run"]
        Retry --> Ok["IntegrationRunCompleted"]
        IRW -->|"poison / cap"| Escalate["Escalate to Observability"]
    end

    subgraph Scheduled["Scheduled Maintenance"]
        Clock["Scheduler"] --> Sync["IntegrationSyncWorker"]
        Clock --> Rot["CredentialRotationWorker"]
        Clock --> Health["ExternalApiHealthWorker"]
        Rot --> Rotated["CredentialRotated"]
        Health -->|outage| Fail
    end
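
The "Failure & Retry" branch of the flowchart reduces to one decision per failure: re-execute a transient failure, or escalate when the failure is poison or the attempt cap is reached. The sketch below is a minimal Python illustration under stated assumptions: the `transient` flag, the field names, and the `decide` function are hypothetical, and how the platform actually classifies a failure as transient is not described on this page.

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    RETRY = "retry"        # re-execute the integration run
    ESCALATE = "escalate"  # hand the failure to Observability


@dataclass(frozen=True)
class Failure:
    failure_id: str
    attempt: int      # attempts already made for this failure
    transient: bool   # hypothetical flag, e.g. set for a timeout or HTTP 503


@dataclass(frozen=True)
class RetryPolicy:
    max_attempts: int  # the cap that the catalog says comes from RetryPolicy


def decide(failure: Failure, policy: RetryPolicy) -> Action:
    """Map one failure to the branch taken in the flowchart."""
    if not failure.transient:
        return Action.ESCALATE  # poison: retrying cannot succeed
    if failure.attempt >= policy.max_attempts:
        return Action.ESCALATE  # cap reached
    return Action.RETRY
```

Keeping the decision a pure function of the failure and the policy makes it easy to test separately from the scheduling and messaging code that acts on the result.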

Worker Concerns

  • At-least-once + idempotent. All workers assume redelivery. Each derives an idempotency key from eventId (or deliveryId/failureId) plus the handler name, per the event envelope consumer rules.
  • Tenant guard. Every handler asserts tenantId before touching a store or vendor endpoint.
  • Poison handling. Unprocessable messages move to a dead-letter subqueue with the full envelope preserved for replay; an IntegrationFailure aggregate records the cause. See Workflows.
  • Backpressure & rate limits. IntegrationSyncWorker and WebhookDeliveryWorker respect per-provider rate-limit buckets surfaced by the Vendor Registry, backing off rather than overwhelming a vendor.
  • Circuit breaking. ExternalApiHealthWorker opens a circuit on sustained failure so dependent services fail fast and stop sending traffic to a degraded vendor.
  • Trace propagation. traceId and correlationId flow into OpenTelemetry spans and Serilog context for end-to-end correlation in Observability.
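
The first two concerns above (idempotency under redelivery and the tenant guard) can be combined in one handler wrapper. This is a hedged Python sketch rather than the platform's MassTransit code: the `Envelope` fields mirror names used on this page, but the `IdempotentHandler` class, the `handler:eventId` key format, and the in-memory set standing in for a durable dedup store are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Set


@dataclass(frozen=True)
class Envelope:
    event_id: str
    tenant_id: str
    trace_id: str
    payload: dict


class TenantMismatch(Exception):
    """Raised when an envelope targets a tenant the handler does not serve."""


class IdempotentHandler:
    def __init__(self, name: str, expected_tenant_id: str,
                 handle: Callable[[Envelope], None]) -> None:
        self.name = name
        self.expected_tenant_id = expected_tenant_id
        self._handle = handle
        self._seen: Set[str] = set()  # stand-in for a durable dedup store

    def key_for(self, env: Envelope) -> str:
        # Idempotency key: event identifier plus handler name (format assumed).
        return f"{self.name}:{env.event_id}"

    def process(self, env: Envelope) -> bool:
        """Return True if the work ran, False if it was a duplicate."""
        # Tenant guard runs before any store or vendor access.
        if env.tenant_id != self.expected_tenant_id:
            raise TenantMismatch(env.tenant_id)
        key = self.key_for(env)
        if key in self._seen:
            return False  # redelivery: already handled, skip side effects
        self._handle(env)
        self._seen.add(key)  # record only after success so failures are retried
        return True
```

Recording the key only after the handler succeeds means a crash mid-handler leads to a retry rather than a silently dropped message, so the wrapped work itself still has to tolerate being run twice. A production version would also need the check and the write to be atomic across worker instances.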