
Workflows

Target Architecture — Final-State Design

These workflows describe the runtime control loops in their final state. Every step emits an enveloped event correlated by traceId, and every workload health check uses ConnectSoft.Extensions.Diagnostics.HealthChecks.
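As a rough illustration of what "enveloped event correlated by traceId" could look like, the sketch below wraps each step result in a small envelope and filters a log by trace. The `EventEnvelope` class, its field names, and the `emit` / `events_for_trace` helpers are assumptions made for this example; they are not the platform's actual schema or API.

```python
import uuid
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Any


@dataclass(frozen=True)
class EventEnvelope:
    """Hypothetical envelope; field names are illustrative assumptions."""
    event_type: str
    trace_id: str
    payload: dict[str, Any]
    event_id: str = field(default_factory=lambda: str(uuid.uuid4()))
    occurred_at: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )


def emit(log: list[EventEnvelope], event_type: str, trace_id: str, **payload: Any) -> EventEnvelope:
    """Wrap a step result in an envelope and append it to an in-memory log."""
    envelope = EventEnvelope(event_type=event_type, trace_id=trace_id, payload=payload)
    log.append(envelope)
    return envelope


def events_for_trace(log: list[EventEnvelope], trace_id: str) -> list[EventEnvelope]:
    """Return every event that belongs to one workflow run, in emission order."""
    return [e for e in log if e.trace_id == trace_id]
```

Carrying the same `trace_id` through every step is what lets a single deployment, drift remediation, or scaling decision be reconstructed end to end from the event log.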

Deployment Workflow

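A deployment begins with a delivery manifest from the DevOps / GitOps Platform. The DeploymentService provisions infrastructure with Pulumi if needed, binds configuration and secrets, rolls workloads onto Azure compute, and gates on health. If the gate passes it promotes the rollout; if the gate fails it rolls back to the previous deployment.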

sequenceDiagram
    participant DevOps as DevOps / GitOps
    participant Dep as DeploymentService
    participant Env as RuntimeEnvironmentService
    participant Pulumi as Pulumi
    participant Compute as AKS / Container Apps
    participant Health as RuntimeHealthService
    participant Cat as ServiceCatalogRuntimeService

    DevOps->>Dep: Deploy (release manifest + images)
    Dep->>Env: Ensure environment provisioned
    Env->>Pulumi: Plan + apply infrastructure
    Pulumi-->>Env: Stack outputs (resource ids)
    Env-->>Dep: RuntimeEnvironmentProvisioned
    Dep->>Dep: Bind RuntimeConfiguration + SecretBinding
    Dep->>Compute: Roll out workloads
    Compute-->>Dep: Rollout progressing
    Dep->>Health: Run health gate
    Health-->>Dep: HealthCheckCompleted (Healthy)
    alt Health gate passed
        Dep->>Compute: Promote (shift traffic)
        Dep->>Cat: Update inventory
        Dep-->>DevOps: RuntimeDeploymentCompleted
    else Health gate failed
        Dep->>Compute: Roll back to previous deployment
        Dep-->>DevOps: RuntimeDeploymentRolledBack
    end
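The control flow in the sequence diagram can be sketched as a single orchestration function. This is a minimal sketch, not the DeploymentService implementation: each collaborator (environment provisioning, rollout, health gate, and so on) is passed in as a hypothetical callable so the example stays self-contained.

```python
from typing import Callable


def run_deployment(
    ensure_environment: Callable[[], None],
    bind_configuration: Callable[[], None],
    roll_out: Callable[[], None],
    health_gate: Callable[[], bool],
    promote: Callable[[], None],
    update_inventory: Callable[[], None],
    roll_back: Callable[[], None],
) -> str:
    """Run the deployment steps in diagram order and return the outcome event name."""
    ensure_environment()      # Pulumi plan + apply, if the environment is missing
    bind_configuration()      # RuntimeConfiguration + SecretBinding
    roll_out()                # start workloads on the compute target
    if health_gate():         # True means the gate reported Healthy
        promote()             # shift traffic to the new workloads
        update_inventory()    # record the result in the service catalog
        return "RuntimeDeploymentCompleted"
    roll_back()               # restore the previous deployment
    return "RuntimeDeploymentRolledBack"
```

Note that promotion and the inventory update happen only on the passing branch, so a failed gate never changes the recorded inventory.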

RuntimeDeployment State Machine

stateDiagram-v2
    [*] --> Requested
    Requested --> Provisioning : ensure environment
    Provisioning --> Configuring : infra ready
    Configuring --> RollingOut : config + secrets bound
    RollingOut --> HealthGating : workloads started
    HealthGating --> Promoting : gates passed
    HealthGating --> RollingBack : gates failed
    Promoting --> Completed : traffic shifted
    RollingBack --> RolledBack : previous restored
    Completed --> [*]
    RolledBack --> [*]
    RollingOut --> RollingBack : rollout error
    Provisioning --> Failed : provisioning error
    Failed --> [*]
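One way to make the state machine enforceable is a transition table keyed by (state, trigger). The table below mirrors the diagram's edges and labels; the `advance` and `replay` helpers are hypothetical names introduced for this sketch.

```python
# Transition table copied from the state diagram: (state, trigger) -> next state.
TRANSITIONS = {
    ("Requested", "ensure environment"): "Provisioning",
    ("Provisioning", "infra ready"): "Configuring",
    ("Provisioning", "provisioning error"): "Failed",
    ("Configuring", "config + secrets bound"): "RollingOut",
    ("RollingOut", "workloads started"): "HealthGating",
    ("RollingOut", "rollout error"): "RollingBack",
    ("HealthGating", "gates passed"): "Promoting",
    ("HealthGating", "gates failed"): "RollingBack",
    ("Promoting", "traffic shifted"): "Completed",
    ("RollingBack", "previous restored"): "RolledBack",
}

# States with an edge to [*] in the diagram.
TERMINAL_STATES = {"Completed", "RolledBack", "Failed"}


def advance(state: str, trigger: str) -> str:
    """Return the next state, or raise ValueError for a transition the diagram does not allow."""
    try:
        return TRANSITIONS[(state, trigger)]
    except KeyError:
        raise ValueError(f"illegal transition from {state!r} on {trigger!r}") from None


def replay(triggers: list[str], start: str = "Requested") -> str:
    """Apply a sequence of triggers and return the resulting state."""
    state = start
    for trigger in triggers:
        state = advance(state, trigger)
    return state
```

Rejecting unknown (state, trigger) pairs keeps terminal states terminal: nothing can leave `Completed`, `RolledBack`, or `Failed`.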

Drift Detection & Remediation

The DriftDetectionWorker continuously compares the live inventory against Git/Pulumi desired state and triggers corrective deployments.

sequenceDiagram
    participant Drift as RuntimeDriftDetectionService
    participant Cat as ServiceCatalogRuntimeService
    participant DevOps as DevOps / GitOps (desired state)
    participant Dep as DeploymentService
    participant Gov as Governance

    Drift->>Cat: Read actual inventory snapshot
    Drift->>DevOps: Read desired state (Git/Pulumi)
    Drift->>Drift: Diff actual vs desired
    alt Divergence found
        Drift-->>Gov: RuntimeDriftDetected (audit)
        Drift->>Dep: RemediateDrift (corrective deployment)
        Dep->>Dep: Re-apply / redeploy / roll back
        Dep-->>Drift: RuntimeDeploymentCompleted
        Drift->>Drift: Mark RuntimeDriftRemediated
    else In sync
        Drift->>Drift: No action
    end

Remediation policy distinguishes two classes of drift. Auto-remediable drift (an image, replica, or config mismatch) is corrected automatically in non-prod, or with policy approval in prod. Review-required drift (unmanaged or missing resources) is surfaced to the Runtime Center for operator decision.
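The diff and the remediation policy can be sketched as two small functions. This is an illustrative sketch only: the inventory is modelled as a plain dictionary of resource name to spec, and the `Drift` record, the `"prod"` environment label, and the decision strings are assumptions, not the service's real data model.

```python
from dataclasses import dataclass

# Fields whose mismatch counts as auto-remediable drift in this sketch.
AUTO_FIELDS = ("image", "replicas", "config")


@dataclass(frozen=True)
class Drift:
    resource: str
    kind: str    # "mismatch", "missing", or "unmanaged"
    detail: str


def diff_inventory(actual: dict[str, dict], desired: dict[str, dict]) -> list[Drift]:
    """Compare the live inventory against desired state and list every divergence."""
    drifts = []
    for name in sorted(desired.keys() | actual.keys()):
        if name not in actual:
            drifts.append(Drift(name, "missing", "declared but not running"))
        elif name not in desired:
            drifts.append(Drift(name, "unmanaged", "running but not declared"))
        else:
            for key in AUTO_FIELDS:
                if actual[name].get(key) != desired[name].get(key):
                    drifts.append(Drift(name, "mismatch", key))
    return drifts


def decide(drift: Drift, environment: str, approved: bool = False) -> str:
    """Apply the remediation policy to one drift record."""
    if drift.kind != "mismatch":
        return "review"              # unmanaged or missing: operator decision
    if environment != "prod" or approved:
        return "auto-remediate"      # non-prod, or prod with policy approval
    return "await-approval"          # prod mismatch without approval yet
```

An empty result from `diff_inventory` corresponds to the "In sync" branch of the diagram, where no action is taken.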

Scaling Workflow

sequenceDiagram
    participant Health as RuntimeHealthService
    participant Scale as RuntimeScalingService
    participant AppI as Application Insights
    participant Compute as AKS / Container Apps

    Health-->>Scale: HealthCheckCompleted
    Scale->>AppI: Query live metrics (cpu / queue / rps)
    Scale->>Scale: Evaluate ScalingPolicy rules
    alt Threshold breached and cooldown elapsed
        Scale->>Compute: Apply replica / throughput change
        Compute-->>Scale: Applied
        Scale-->>Health: ScalingPolicyApplied
    else Within bounds or in cooldown
        Scale->>Scale: No action
    end

Scaling honours minReplicas/maxReplicas, scaleToZero (event-driven targets only), and per-policy cooldowns to prevent flapping. Sustained inability to meet targets emits ScalingPolicyViolated for incident creation.
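The interplay of replica bounds, scale-to-zero, and cooldown can be sketched as follows. The `ScalingPolicy` fields, the single load-per-replica target, and the outcome strings are simplifying assumptions for this example; real policies may combine several metrics (cpu, queue depth, rps).

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class ScalingPolicy:
    min_replicas: int
    max_replicas: int
    target_per_replica: float   # hypothetical: load one replica should handle
    cooldown_seconds: float
    scale_to_zero: bool = False


def desired_replicas(policy: ScalingPolicy, load: float) -> int:
    """Replica count needed for the load, clamped to the policy bounds."""
    raw = math.ceil(load / policy.target_per_replica)
    # Only scale-to-zero policies may drop below min_replicas, and only when idle.
    floor = 0 if (policy.scale_to_zero and load == 0) else policy.min_replicas
    return max(floor, min(raw, policy.max_replicas))


def evaluate(policy: ScalingPolicy, current: int, load: float,
             now: float, last_scaled_at: float) -> tuple[int, str]:
    """Return (replica count to run, outcome) for one evaluation cycle."""
    target = desired_replicas(policy, load)
    if target == current:
        return current, "within-bounds"
    if now - last_scaled_at < policy.cooldown_seconds:
        return current, "in-cooldown"   # suppress the change to prevent flapping
    return target, "applied"
```

In this sketch, demand above `max_replicas` is silently clamped; a fuller version would track how long the clamp has been in effect and use that as the trigger for ScalingPolicyViolated.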

Failure & Rollback

stateDiagram-v2
    [*] --> Healthy
    Healthy --> Degraded : health checks fail
    Degraded --> Healthy : self-recovers
    Degraded --> Rollback : threshold exceeded
    Rollback --> Restoring : redeploy previous good
    Restoring --> Healthy : health gate passes
    Restoring --> Incident : restore fails
    Incident --> [*]
  • Automatic rollback — a failed health gate during deployment, or a sustained Degraded state post-deployment, triggers rollback to the last Completed RuntimeDeployment.
  • Blast-radius containment — rollouts use RollingHealthGated/Canary/BlueGreen strategies so failures affect a bounded fraction of traffic before promotion.
  • Incident escalation — when automatic restore fails, the platform emits a signal consumed by Observability & Feedback to open an incident, preserving the full envelope and trace for replay.
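The rollback rules above can be sketched as two helpers: one decides when a Degraded state has lasted long enough to roll back, and one picks the last Completed deployment as the restore target. The `DeploymentRecord` type, the use of consecutive failed checks as the "sustained" measure, and the function names are assumptions for illustration.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class DeploymentRecord:
    deployment_id: str
    state: str          # terminal RuntimeDeployment state, e.g. "Completed"
    finished_at: float  # completion time as a Unix timestamp


def should_roll_back(consecutive_failures: int, threshold: int) -> bool:
    """Treat Degraded as sustained once failed checks exceed the threshold."""
    return consecutive_failures > threshold


def rollback_target(history: Sequence[DeploymentRecord],
                    failing_id: str) -> Optional[DeploymentRecord]:
    """Pick the most recent Completed deployment other than the failing one."""
    candidates = [
        d for d in history
        if d.state == "Completed" and d.deployment_id != failing_id
    ]
    # None means there is nothing to restore, so the caller should escalate.
    return max(candidates, key=lambda d: d.finished_at, default=None)
```

Excluding the failing deployment matters for post-deployment degradation, where the deployment being rolled back is itself already in the Completed state.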