Control Plane — Workflows
Workflow orchestration is the core domain of the Control Plane. A WorkflowInstance is a durable, event-sourced state machine that turns business intent into a governed, traceable sequence of agent tasks, validations, approvals, artifacts, and a release. WorkflowOrchestrator (implemented as MassTransit sagas) drives orchestration from versioned WorkflowDefinition templates. The design is grounded in the existing coordinators (Project Bootstrap, Sprint Execution, Milestone Lifecycle, Microservice Assembly, Release) and the orchestration domain.
Target Architecture — Final-State Design
Every transition emits a canonical event, so the full lifecycle is observable and replayable. Workflows advance autonomously by default and pause only at defined approval gates or on failure (human escalation).
Main Lifecycle
The end-to-end factory lifecycle for producing a module/service:
flowchart LR
Intent["Project intent<br/>(Factory Studio)"] --> Bootstrap["Project Bootstrap"]
Bootstrap --> Blueprint["Blueprint design & validation"]
Blueprint --> Workflow["Workflow instance per module"]
Workflow --> Tasks["Agent tasks<br/>(Agent Mesh)"]
Tasks --> Artifacts["Artifacts produced & registered"]
Artifacts --> Assembly["Microservice assembly"]
Assembly --> Gate["Release approval gate"]
Gate --> Release["Release / promotion<br/>(DevOps & GitOps)"]
Release --> Running["Running SaaS"]
Running --> Feedback["Runtime feedback"]
Feedback --> Intent
Each stage is a workflow (or step) instantiated from a definition: Project Bootstrap creates project/environments/modules; Blueprint validates the specification; per-module Workflow instances assign agent tasks; Microservice Assembly integrates outputs; Release promotes through environments behind an approval gate.
WorkflowInstance State Machine
stateDiagram-v2
[*] --> Created
Created --> Running: WorkflowInstanceStarted
Running --> AssigningTask: step ready
AssigningTask --> AwaitingAgent: AgentTaskAssigned
AwaitingAgent --> Running: AgentTaskCompleted
AwaitingAgent --> Correcting: AgentTaskFailed (retryable)
Correcting --> AwaitingAgent: reassigned (attempt <= max)
Correcting --> Failed: max attempts exceeded
Running --> AwaitingApproval: approval gate reached
AwaitingApproval --> Running: ApprovalGranted
AwaitingApproval --> Cancelled: ApprovalRejected
AwaitingApproval --> Escalated: ApprovalExpired
Running --> Compensating: step failed (compensation required)
Compensating --> Failed: compensation complete
Escalated --> Running: human resumes
Escalated --> Cancelled: human cancels
Running --> Completed: WorkflowInstanceCompleted
Failed --> [*]
Cancelled --> [*]
Completed --> [*]
| State | Meaning | Emits |
|---|---|---|
| Created | Instance materialized from a definition. | WorkflowInstanceStarted (on start) |
| Running | Advancing through steps. | WorkflowStepCompleted |
| AssigningTask | Translating a step into an agent task. | (none) |
| AwaitingAgent | Agent task placed; awaiting execution result. | AgentTaskAssigned |
| Correcting | Retryable failure; reassigning with feedback. | AgentTaskReassigned |
| AwaitingApproval | Paused at a human gate. | ApprovalRequested |
| Compensating | Running compensating actions for a failed step. | CompensationStarted |
| Escalated | Handed to a human (timeout/expiry). | WorkflowEscalated |
| Completed | All steps done, no open gates. | WorkflowInstanceCompleted |
| Failed / Cancelled | Terminal failure / human cancellation. | WorkflowInstanceFailed / WorkflowInstanceCancelled |
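
The state diagram above can be read as a lookup table from (state, trigger) to next state. The following is a minimal Python sketch of that reading, not the MassTransit saga itself. State names and canonical event names are taken from the diagram; the snake_case trigger labels (`step_ready`, `approval_gate_reached`, and so on) and the `advance` helper are hypothetical names introduced for illustration.

```python
from enum import Enum


class State(Enum):
    CREATED = "Created"
    RUNNING = "Running"
    ASSIGNING_TASK = "AssigningTask"
    AWAITING_AGENT = "AwaitingAgent"
    CORRECTING = "Correcting"
    AWAITING_APPROVAL = "AwaitingApproval"
    COMPENSATING = "Compensating"
    ESCALATED = "Escalated"
    COMPLETED = "Completed"
    FAILED = "Failed"
    CANCELLED = "Cancelled"


# (current state, trigger) -> next state, transcribed from the diagram.
# Triggers the diagram describes only in words use hypothetical
# snake_case labels.
TRANSITIONS = {
    (State.CREATED, "WorkflowInstanceStarted"): State.RUNNING,
    (State.RUNNING, "step_ready"): State.ASSIGNING_TASK,
    (State.ASSIGNING_TASK, "AgentTaskAssigned"): State.AWAITING_AGENT,
    (State.AWAITING_AGENT, "AgentTaskCompleted"): State.RUNNING,
    (State.AWAITING_AGENT, "AgentTaskFailed"): State.CORRECTING,
    (State.CORRECTING, "reassigned"): State.AWAITING_AGENT,
    (State.CORRECTING, "max_attempts_exceeded"): State.FAILED,
    (State.RUNNING, "approval_gate_reached"): State.AWAITING_APPROVAL,
    (State.AWAITING_APPROVAL, "ApprovalGranted"): State.RUNNING,
    (State.AWAITING_APPROVAL, "ApprovalRejected"): State.CANCELLED,
    (State.AWAITING_APPROVAL, "ApprovalExpired"): State.ESCALATED,
    (State.RUNNING, "step_failed"): State.COMPENSATING,
    (State.COMPENSATING, "compensation_complete"): State.FAILED,
    (State.ESCALATED, "human_resumes"): State.RUNNING,
    (State.ESCALATED, "human_cancels"): State.CANCELLED,
    (State.RUNNING, "WorkflowInstanceCompleted"): State.COMPLETED,
}

# States with an edge to [*] in the diagram.
TERMINAL = {State.COMPLETED, State.FAILED, State.CANCELLED}


def advance(state: State, trigger: str) -> State:
    """Return the next state, or raise if the diagram has no such edge."""
    try:
        return TRANSITIONS[(state, trigger)]
    except KeyError:
        raise ValueError(f"illegal transition: {state.value} on {trigger}") from None
```

Keeping the edges in one table makes illegal transitions (for example, any trigger applied to a terminal state) fail loudly instead of silently changing state.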
Task Assignment Sequence
How a ready workflow step becomes an executed agent task:
sequenceDiagram
participant Orchestrator as WorkflowOrchestrator
participant TaskSvc as TaskAssignmentService
participant Pool as AgentPoolManager
participant Policy as ModelPolicyService
participant Mesh as Agent Mesh
participant Cost as CostUsageService
Orchestrator->>TaskSvc: AssignAgentTask(step, role, skill)
TaskSvc->>Policy: resolve model policy
Policy-->>TaskSvc: modelPolicyId
TaskSvc->>Pool: acquire lease(role, tenant)
Pool-->>TaskSvc: lease granted
TaskSvc-->>Orchestrator: AgentTaskAssigned
TaskSvc->>Mesh: dispatch task (Agent Task Contract)
Mesh->>Mesh: load context, execute skill, validate
Mesh-->>TaskSvc: execution result (artifacts, tokens)
TaskSvc->>Cost: RecordUsage(tokens, task)
TaskSvc-->>Orchestrator: AgentTaskCompleted
Orchestrator->>Pool: release lease
If no capacity is available, AgentPoolManager defers the lease and the TaskAssignmentWorker requeues with back-off; assignment remains idempotent on (workflowInstanceId, stepId).
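
The idempotency and back-off behaviour described above can be sketched as follows. This is an illustrative Python sketch under stated assumptions: `AssignmentLedger` is a hypothetical in-memory stand-in for whatever durable store backs TaskAssignmentService, and the base delay and cap in `requeue_delay` are invented values, not settings from the source.

```python
from dataclasses import dataclass, field


@dataclass
class AssignmentLedger:
    """Hypothetical in-memory stand-in for a durable assignment store."""

    _tasks: dict = field(default_factory=dict)
    _next_id: int = 1

    def assign(self, workflow_instance_id: str, step_id: str) -> str:
        # Idempotency key from the text: (workflowInstanceId, stepId).
        key = (workflow_instance_id, step_id)
        if key not in self._tasks:
            self._tasks[key] = f"task-{self._next_id}"
            self._next_id += 1
        # A redelivered request returns the existing task id.
        return self._tasks[key]


def requeue_delay(attempt: int, base_seconds: float = 2.0,
                  cap_seconds: float = 60.0) -> float:
    """Exponential back-off for a deferred lease (assumed parameters)."""
    return min(cap_seconds, base_seconds * 2 ** attempt)
```

With this shape, a requeued assignment that is delivered twice yields the same task rather than a duplicate, which is what makes requeueing with back-off safe.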
Approval Gate Sequence
How a sensitive transition (e.g. production release) passes a policy and human gate:
sequenceDiagram
participant Orchestrator as WorkflowOrchestrator
participant Policy as PolicyEngineService
participant Approval as ApprovalService
participant Studio as Factory Studio
participant Reviewer as Human Reviewer
participant Audit as AuditService
Orchestrator->>Policy: EvaluatePolicy(action=release:promote, env=prod)
Policy->>Audit: PolicyDecisionRecorded(effect=RequireApproval)
Policy-->>Orchestrator: RequireApproval(role=ReleaseManager)
Orchestrator->>Approval: RequestApproval(role=ReleaseManager)
Approval-->>Studio: ApprovalRequested
Studio->>Reviewer: surface gate in Human Review Center
Reviewer->>Studio: Grant (comment)
Studio->>Approval: GrantApproval(decidedBy)
Approval->>Audit: AuditEntryRecorded(Granted)
Approval-->>Orchestrator: ApprovalGranted
Orchestrator->>Orchestrator: resume workflow (promote)
A policy Deny decision fails the step immediately. A rejected approval cancels the instance, and an expired approval moves it to Escalated.
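
The gate outcomes can be summarized as a small decision function. This is a sketch only: the `RequireApproval` and `Deny` effects and the Granted/Expired outcomes come from this page, the Rejected outcome follows the ApprovalRejected edge in the state diagram, and the `Allow` effect, the returned action labels, and the `gate_outcome` name are assumptions made for illustration.

```python
from typing import Optional


def gate_outcome(policy_effect: str, approval: Optional[str] = None) -> str:
    """Map a policy effect and an optional approval result to the next move."""
    if policy_effect == "Deny":
        return "fail_step"
    if policy_effect == "RequireApproval":
        if approval is None:
            return "await_approval"  # instance sits in AwaitingApproval
        outcomes = {
            "Granted": "resume",     # ApprovalGranted -> Running
            "Rejected": "cancel",    # ApprovalRejected -> Cancelled
            "Expired": "escalate",   # ApprovalExpired -> Escalated
        }
        return outcomes[approval]
    if policy_effect == "Allow":     # assumed effect, not named in the source
        return "resume"
    raise ValueError(f"unknown policy effect: {policy_effect}")
```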
Failure Handling
- Classification: failures are transient (retryable — network, throttling), validation (correctable — agent output failed checks), or terminal (unrecoverable — bad input, policy deny).
- Transient failures use MassTransit exponential back-off at the message level.
- Validation failures route to Correcting: the task is reassigned with validator feedback, bounded by maxCorrectionAttempts (per the Agent Task Contract).
- Terminal failures trigger compensation and/or human escalation.
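
The classification rules above can be sketched as a single dispatch function. This is a minimal illustration, assuming that `attempt` counts the correction attempt about to be made (starting at 1); the action labels and the `next_action` name are hypothetical.

```python
from enum import Enum


class FailureKind(Enum):
    TRANSIENT = "transient"    # network, throttling
    VALIDATION = "validation"  # agent output failed checks
    TERMINAL = "terminal"      # bad input, policy deny


def next_action(kind: FailureKind, attempt: int,
                max_correction_attempts: int) -> str:
    """Pick the handling path for a failed step."""
    if kind is FailureKind.TRANSIENT:
        return "retry_with_backoff"
    if kind is FailureKind.VALIDATION:
        # Reassignment is bounded by maxCorrectionAttempts.
        if attempt <= max_correction_attempts:
            return "reassign_with_feedback"
        return "compensate"
    return "compensate"  # terminal failures go straight to compensation
```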
Retry & Compensation
flowchart TB
StepFailed["Step failed"] --> Classify{Failure type}
Classify -->|Transient| Retry["Retry with back-off"]
Classify -->|Validation| Correct["Reassign with feedback<br/>(attempt <= max)"]
Classify -->|Terminal| Compensate["Run compensating actions"]
Retry --> Resume["Resume step"]
Correct --> Resume
Correct -->|max exceeded| Compensate
Compensate --> Escalate["Escalate to human"]
Escalate --> Decision{Human decision}
Decision -->|Resume| Resume
Decision -->|Cancel| Cancelled["WorkflowInstanceCancelled"]
Compensation is forward-recovery via compensating actions, not distributed rollback. Each WorkflowStepDefinition may declare a CompensationDefinition (e.g. retract a provisioned environment, mark an artifact superseded). Compensation actions are themselves idempotent and emit events.
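
One way to make a compensating action idempotent is to record which (instance, step) pairs have already been compensated and to skip repeats. The sketch below assumes an in-memory record; `CompensationRunner` and the `CompensationCompleted` event name are hypothetical, while `CompensationStarted` appears in the state table.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class CompensationRunner:
    """Runs each compensating action at most once per (instance, step)."""

    applied: set = field(default_factory=set)
    events: list = field(default_factory=list)

    def run(self, instance_id: str, step_id: str,
            action: Callable[[], None]) -> bool:
        key = (instance_id, step_id)
        if key in self.applied:
            return False  # already compensated; a redelivery is a no-op
        self.events.append(("CompensationStarted", key))
        action()  # e.g. retract a provisioned environment
        self.applied.add(key)
        # Hypothetical completion event name, not taken from the source.
        self.events.append(("CompensationCompleted", key))
        return True
```

If `action` raises, the key is not recorded, so a later retry runs the action again; the action itself therefore still needs to tolerate partial execution.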
Human Escalation
When a step exceeds its deadline (detected by WorkflowTimeoutWorker) or an approval expires, the instance enters Escalated. The escalation surfaces in Factory Studio's Human Review Center with full context (trace, failing step, prior attempts, policy decision). A human can resume, reassign, or cancel; the decision is audited.
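
The deadline check that triggers escalation can be sketched as a periodic scan. This is an assumption about how a worker such as WorkflowTimeoutWorker might detect overdue steps, not a description of its implementation; `StepDeadline` and `find_overdue` are hypothetical names.

```python
from datetime import datetime, timedelta, timezone
from typing import Iterable, List, NamedTuple


class StepDeadline(NamedTuple):
    workflow_instance_id: str
    step_id: str
    deadline: datetime  # timezone-aware, UTC


def find_overdue(steps: Iterable[StepDeadline],
                 now: datetime) -> List[StepDeadline]:
    """Return the steps whose deadline has passed as of `now`."""
    return [s for s in steps if now > s.deadline]


# Usage: one step is five minutes late, the other still has time.
now = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
steps = [
    StepDeadline("wf-1", "build", now - timedelta(minutes=5)),
    StepDeadline("wf-1", "test", now + timedelta(minutes=5)),
]
overdue = find_overdue(steps, now)  # contains only the "build" step
```

Passing `now` in as a parameter, rather than reading the clock inside the function, keeps the check deterministic and easy to test.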
Replay
Because the WorkflowInstance event store is append-only and immutable, the WorkflowReplayService can deterministically reconstruct any instance:
flowchart LR
History["Append-only event store"] --> Replay["WorkflowReplayService"]
Replay --> Shadow["Shadow instance / projection"]
Shadow --> Inspect["Inspect state at any point"]
Shadow --> Rederive["Re-derive outcome with new definition"]
Replay is used to debug a failure (reconstruct exact state at a step), re-derive an outcome under a newer workflow/agent definition, or rebuild the ProcessStateService projection. Replays write to a shadow stream and never mutate the source history, preserving the audit trail. This depends on envelopes being immutable once published (see Event Envelope).
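
Reconstructing state from an append-only history amounts to folding the events in order. The sketch below is a simplified illustration rather than WorkflowReplayService itself: it models the history as a tuple of event names, uses a reduced mapping from event to resulting state based on the state table, and copies events into a shadow list so the source is never mutated. The `replay` function and `STATE_AFTER` table are hypothetical names.

```python
from typing import List, Optional, Sequence, Tuple

# Event name -> state the instance is in after the event (reduced subset).
STATE_AFTER = {
    "WorkflowInstanceStarted": "Running",
    "AgentTaskAssigned": "AwaitingAgent",
    "AgentTaskReassigned": "AwaitingAgent",
    "AgentTaskCompleted": "Running",
    "ApprovalRequested": "AwaitingApproval",
    "ApprovalGranted": "Running",
    "CompensationStarted": "Compensating",
    "WorkflowEscalated": "Escalated",
    "WorkflowInstanceCompleted": "Completed",
    "WorkflowInstanceFailed": "Failed",
    "WorkflowInstanceCancelled": "Cancelled",
}


def replay(history: Sequence[str],
           upto: Optional[int] = None) -> Tuple[str, List[str]]:
    """Fold the first `upto` events (all if None) into a state.

    Returns the reconstructed state and the shadow copy that was replayed.
    """
    shadow = list(history[:upto])  # copy; the source history is untouched
    state = "Created"
    for name in shadow:
        # Events that do not change state (e.g. WorkflowStepCompleted)
        # leave it as is.
        state = STATE_AFTER.get(name, state)
    return state, shadow


history = (
    "WorkflowInstanceStarted",
    "AgentTaskAssigned",
    "AgentTaskCompleted",
    "ApprovalRequested",
    "ApprovalGranted",
    "WorkflowInstanceCompleted",
)
state_at_2, _ = replay(history, upto=2)  # state after the second event
final_state, _ = replay(history)         # state after the full history
```

Because the same history always folds to the same state, replaying up to a given event reproduces the exact state the instance had at that point.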
Related
- Events · Workers · Microservices · Aggregate Roots
- Implementation grounding: Coordinators · Orchestration Domain · Orchestration Layer · Projects Management
- Reference: Agent Task Contract · Event Envelope