Workflows
Target Architecture — Final-State Design
These workflows describe the runtime control loops in their final state. Every step emits an enveloped event correlated by traceId, and every workload health check uses ConnectSoft.Extensions.Diagnostics.HealthChecks.
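As a minimal sketch of what an "enveloped event correlated by traceId" could look like: the envelope shape below is hypothetical. Only the traceId correlation comes from the text; `EventEnvelope`, `start_trace`, `follow_up`, and every other field name are illustrative assumptions.

```python
import uuid
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Any


@dataclass(frozen=True)
class EventEnvelope:
    # Hypothetical shape: only the traceId correlation is stated in the text.
    event_type: str
    trace_id: str
    payload: dict[str, Any]
    event_id: str = field(default_factory=lambda: uuid.uuid4().hex)
    occurred_at: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )


def start_trace(event_type: str, payload: dict[str, Any]) -> EventEnvelope:
    """Create the first event of a workflow with a fresh trace id."""
    return EventEnvelope(event_type, uuid.uuid4().hex, payload)


def follow_up(parent: EventEnvelope, event_type: str, payload: dict[str, Any]) -> EventEnvelope:
    """Create a later event that reuses the parent's trace id."""
    return EventEnvelope(event_type, parent.trace_id, payload)


provisioned = start_trace("RuntimeEnvironmentProvisioned", {"environment": "example"})
completed = follow_up(provisioned, "RuntimeDeploymentCompleted", {"environment": "example"})
```

Each event gets its own `event_id`, while every event in one workflow run shares a `trace_id`, so a consumer can rebuild the full sequence by filtering on that one value.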
Deployment Workflow
A deployment begins with a delivery manifest from the DevOps / GitOps Platform, provisions infrastructure with Pulumi if needed, rolls workloads onto Azure compute, gates on health, and promotes.
sequenceDiagram
participant DevOps as DevOps / GitOps
participant Dep as DeploymentService
participant Env as RuntimeEnvironmentService
participant Pulumi as Pulumi
participant Compute as AKS / Container Apps
participant Health as RuntimeHealthService
participant Cat as ServiceCatalogRuntimeService
DevOps->>Dep: Deploy (release manifest + images)
Dep->>Env: Ensure environment provisioned
Env->>Pulumi: Plan + apply infrastructure
Pulumi-->>Env: Stack outputs (resource ids)
Env-->>Dep: RuntimeEnvironmentProvisioned
Dep->>Dep: Bind RuntimeConfiguration + SecretBinding
Dep->>Compute: Roll out workloads
Compute-->>Dep: Rollout progressing
Dep->>Health: Run health gate
Health-->>Dep: HealthCheckCompleted (Healthy)
alt Health gate passed
Dep->>Compute: Promote (shift traffic)
Dep->>Cat: Update inventory
Dep-->>DevOps: RuntimeDeploymentCompleted
else Health gate failed
Dep->>Compute: Roll back to previous deployment
Dep-->>DevOps: RuntimeDeploymentRolledBack
end
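The alt branch at the end of the sequence above can be sketched as a small decision function. The rule that the gate passes only when every workload reports Healthy, and that an empty result set fails, is an assumption, not something the document states. `HealthStatus` and `resolve_health_gate` are hypothetical names.

```python
from enum import Enum


class HealthStatus(Enum):
    # Healthy and Degraded appear in the document; Unhealthy is an assumed extra value.
    HEALTHY = "Healthy"
    DEGRADED = "Degraded"
    UNHEALTHY = "Unhealthy"


def resolve_health_gate(statuses: list[HealthStatus]) -> tuple[str, str]:
    """Return (action sent to compute, event emitted back to DevOps / GitOps)."""
    # Assumption: no results counts as a failed gate, so a silent workload never promotes.
    passed = bool(statuses) and all(s is HealthStatus.HEALTHY for s in statuses)
    if passed:
        return ("Promote", "RuntimeDeploymentCompleted")
    return ("RollBack", "RuntimeDeploymentRolledBack")
```

For example, `resolve_health_gate([HealthStatus.HEALTHY, HealthStatus.DEGRADED])` takes the rollback branch, because a single non-healthy workload fails the gate under this assumed rule.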
RuntimeDeployment State Machine
stateDiagram-v2
[*] --> Requested
Requested --> Provisioning : ensure environment
Provisioning --> Configuring : infra ready
Configuring --> RollingOut : config + secrets bound
RollingOut --> HealthGating : workloads started
HealthGating --> Promoting : gates passed
HealthGating --> RollingBack : gates failed
Promoting --> Completed : traffic shifted
RollingBack --> RolledBack : previous restored
Completed --> [*]
RolledBack --> [*]
RollingOut --> RollingBack : rollout error
Provisioning --> Failed : provisioning error
Failed --> [*]
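One way to enforce the state machine above is a transition table that rejects any move the diagram does not list. The state names are taken from the diagram; `TRANSITIONS`, `advance`, and `InvalidTransition` are illustrative names, not part of the platform.

```python
# Allowed transitions, copied from the state diagram above.
TRANSITIONS: dict[str, set[str]] = {
    "Requested": {"Provisioning"},
    "Provisioning": {"Configuring", "Failed"},
    "Configuring": {"RollingOut"},
    "RollingOut": {"HealthGating", "RollingBack"},
    "HealthGating": {"Promoting", "RollingBack"},
    "Promoting": {"Completed"},
    "RollingBack": {"RolledBack"},
    # Terminal states have no outgoing transitions.
    "Completed": set(),
    "RolledBack": set(),
    "Failed": set(),
}


class InvalidTransition(Exception):
    """Raised when a requested transition is not in the diagram."""


def advance(current: str, target: str) -> str:
    """Return the target state if the diagram allows the move, else raise."""
    if target not in TRANSITIONS.get(current, set()):
        raise InvalidTransition(f"{current} -> {target} is not allowed")
    return target


def run(path: list[str]) -> str:
    """Walk a full path of states and return the final one."""
    state = path[0]
    for nxt in path[1:]:
        state = advance(state, nxt)
    return state
```

With this table, `run(["Requested", "Provisioning", "Configuring", "RollingOut", "HealthGating", "Promoting", "Completed"])` returns `"Completed"`, while skipping the health gate (for example `advance("RollingOut", "Promoting")`) raises `InvalidTransition`.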
Drift Detection & Remediation
The DriftDetectionWorker continuously compares the live inventory against Git/Pulumi desired state and triggers corrective deployments.
sequenceDiagram
participant Drift as RuntimeDriftDetectionService
participant Cat as ServiceCatalogRuntimeService
participant DevOps as DevOps / GitOps (desired state)
participant Dep as DeploymentService
participant Gov as Governance
Drift->>Cat: Read actual inventory snapshot
Drift->>DevOps: Read desired state (Git/Pulumi)
Drift->>Drift: Diff actual vs desired
alt Divergence found
Drift-->>Gov: RuntimeDriftDetected (audit)
Drift->>Dep: RemediateDrift (corrective deployment)
Dep->>Dep: Re-apply / redeploy / roll back
Dep-->>Drift: RuntimeDeploymentCompleted
Drift->>Drift: Mark RuntimeDriftRemediated
else In sync
Drift->>Drift: No action
end
Remediation policy distinguishes auto-remediable drift (image/replica/config mismatch — corrected automatically in non-prod, or with policy approval in prod) from review-required drift (unmanaged or missing resources — surfaced to the Runtime Center for operator decision).
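The diff and the remediation policy described above can be sketched as follows. It assumes inventory and desired state are both plain dictionaries keyed by resource name; that data shape, the field names `image`, `replicas`, and `config`, and the returned action strings are hypothetical.

```python
from dataclasses import dataclass

# Fields whose mismatch is treated as auto-remediable (image/replica/config).
AUTO_REMEDIABLE_FIELDS = ("image", "replicas", "config")


@dataclass(frozen=True)
class DriftFinding:
    resource: str
    kind: str  # "unmanaged", "missing", or "mismatch:<field>"
    review_required: bool


def diff_inventory(actual: dict[str, dict], desired: dict[str, dict]) -> list[DriftFinding]:
    """Compare actual inventory with desired state and classify each divergence."""
    findings: list[DriftFinding] = []
    for name in sorted(actual.keys() - desired.keys()):
        findings.append(DriftFinding(name, "unmanaged", True))
    for name in sorted(desired.keys() - actual.keys()):
        findings.append(DriftFinding(name, "missing", True))
    for name in sorted(actual.keys() & desired.keys()):
        for fld in AUTO_REMEDIABLE_FIELDS:
            if actual[name].get(fld) != desired[name].get(fld):
                findings.append(DriftFinding(name, f"mismatch:{fld}", False))
    return findings


def remediation_action(finding: DriftFinding, environment: str, policy_approved: bool = False) -> str:
    """Map a finding to an action, following the policy in the paragraph above."""
    if finding.review_required:
        return "surface-to-runtime-center"
    if environment != "prod" or policy_approved:
        return "auto-remediate"
    return "await-policy-approval"
```

An empty list from `diff_inventory` corresponds to the "In sync" branch of the sequence; any non-empty result corresponds to `RuntimeDriftDetected`.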
Scaling Workflow
sequenceDiagram
participant Health as RuntimeHealthService
participant Scale as RuntimeScalingService
participant AppI as Application Insights
participant Compute as AKS / Container Apps
Health-->>Scale: HealthCheckCompleted
Scale->>AppI: Query live metrics (cpu / queue / rps)
Scale->>Scale: Evaluate ScalingPolicy rules
alt Threshold breached and cooldown elapsed
Scale->>Compute: Apply replica / throughput change
Compute-->>Scale: Applied
Scale-->>Health: ScalingPolicyApplied
else Within bounds or in cooldown
Scale->>Scale: No action
end
Scaling honours minReplicas/maxReplicas, scaleToZero (event-driven targets only), and per-policy cooldowns to prevent flapping. Sustained inability to meet targets emits ScalingPolicyViolated for incident creation.
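A minimal sketch of one policy evaluation, assuming a proportional CPU rule (desired = ceil(current × observed / target)) as the threshold logic. The document does not specify the rule, so the formula, the `ScalingPolicy` fields beyond min/max replicas, scale-to-zero, and cooldown, and the function names are all assumptions.

```python
import math
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ScalingPolicy:
    min_replicas: int
    max_replicas: int
    target_cpu_percent: float   # assumed metric; the doc also lists queue and rps
    cooldown_seconds: float
    scale_to_zero: bool = False
    event_driven: bool = False  # scale-to-zero is honoured only for event-driven targets


def desired_replicas(policy: ScalingPolicy, current: int, cpu_percent: float, queue_length: int = 0) -> int:
    """Compute the replica count the policy asks for, clamped to its bounds."""
    if policy.scale_to_zero and policy.event_driven and queue_length == 0:
        return 0
    floor = max(policy.min_replicas, 1)
    if current == 0:
        return floor  # waking up from zero: start at the floor
    raw = math.ceil(current * cpu_percent / policy.target_cpu_percent)
    return min(max(raw, floor), policy.max_replicas)


def evaluate(policy: ScalingPolicy, current: int, cpu_percent: float, now: float,
             last_scaled_at: Optional[float], queue_length: int = 0) -> Optional[int]:
    """Return the new replica count, or None when no action should be taken."""
    in_cooldown = last_scaled_at is not None and now - last_scaled_at < policy.cooldown_seconds
    if in_cooldown:
        return None
    target = desired_replicas(policy, current, cpu_percent, queue_length)
    return None if target == current else target
```

Returning `None` covers both "within bounds" and "in cooldown", matching the single no-action branch in the sequence. A caller that repeatedly sees the count clamped at `max_replicas` while the metric stays above target would be the natural place to emit `ScalingPolicyViolated`.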
Failure & Rollback
stateDiagram-v2
[*] --> Healthy
Healthy --> Degraded : health checks fail
Degraded --> Healthy : self-recovers
Degraded --> Rollback : threshold exceeded
Rollback --> Restoring : redeploy previous good
Restoring --> Healthy : health gate passes
Restoring --> Incident : restore fails
Incident --> [*]
- Automatic rollback: a failed health gate during deployment, or a sustained Degraded state post-deployment, triggers rollback to the last Completed RuntimeDeployment.
- Blast-radius containment: rollouts use RollingHealthGated / Canary / BlueGreen strategies so failures affect a bounded fraction of traffic before promotion.
- Incident escalation: when automatic restore fails, the platform emits a signal consumed by Observability & Feedback to open an incident, preserving the full envelope and trace for replay.
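Selecting the rollback target and escalating on failure can be sketched as below. The `DeploymentRecord` shape, the `sequence` ordering field, and the choice to escalate straight to an incident when no earlier Completed deployment exists are assumptions; the document only states that rollback goes to the last Completed RuntimeDeployment and that a failed restore opens an incident.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class DeploymentRecord:
    deployment_id: str
    state: str      # a RuntimeDeployment state such as "Completed" or "RolledBack"
    sequence: int   # hypothetical ordering key; higher means more recent


def rollback_target(history: list[DeploymentRecord], failing_id: str) -> Optional[DeploymentRecord]:
    """Pick the most recent Completed deployment other than the failing one."""
    candidates = [
        d for d in history
        if d.state == "Completed" and d.deployment_id != failing_id
    ]
    return max(candidates, key=lambda d: d.sequence, default=None)


def recover(history: list[DeploymentRecord], failing_id: str,
            restore: Callable[[DeploymentRecord], bool]) -> tuple[str, Optional[str]]:
    """Return (resulting state, deployment id restored or attempted)."""
    target = rollback_target(history, failing_id)
    if target is None:
        return ("Incident", None)  # assumption: nothing to restore means escalate
    if restore(target):
        return ("Healthy", target.deployment_id)
    return ("Incident", target.deployment_id)
```

The `restore` callback stands in for the redeploy plus health gate (the Restoring state in the diagram), which keeps the selection logic testable without touching compute.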