Control Plane vs Data Plane¶
Overview¶
The ConnectSoft AI Software Factory runtime uses a control plane / data plane separation to achieve scalability, reliability, and operational isolation. This architectural pattern separates orchestration and coordination (control plane) from work execution (data plane), enabling independent scaling, deployment, and failure isolation.
Control Plane Responsibilities¶
The Control Plane is the central nervous system of the Factory, responsible for:
Orchestration¶
- Workflow Coordination — Manages multi-step agent workflows (vision → architecture → engineering → QA → DevOps)
- Dependency Management — Ensures steps execute in the correct order and handles dependencies between jobs
- Agent Assignment — Routes tasks to appropriate agents based on capabilities and availability
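As an illustration of the dependency management described above, the sketch below orders workflow steps so that each one runs only after its prerequisites. The step names mirror the workflow listed here, but the `WORKFLOW` mapping and the `execution_order` helper are hypothetical and not part of the Factory's actual API.

```python
from graphlib import TopologicalSorter

# Hypothetical step graph: each step maps to the set of steps it depends on.
WORKFLOW = {
    "vision": set(),
    "architecture": {"vision"},
    "engineering": {"architecture"},
    "qa": {"engineering"},
    "devops": {"qa"},
}


def execution_order(steps: dict[str, set[str]]) -> list[str]:
    """Return the steps in an order that respects every dependency.

    Raises graphlib.CycleError if the dependencies form a cycle.
    """
    return list(TopologicalSorter(steps).static_order())


if __name__ == "__main__":
    print(execution_order(WORKFLOW))
```

A real orchestrator would likely dispatch independent steps in parallel rather than flatten them into one list; the flat order is only the simplest way to show the ordering guarantee.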
Scheduling¶
- Job Scheduling — Determines when jobs should execute based on priority, resource availability, and constraints
- Queue Management — Manages job queues, prioritization, and rate limiting
- Resource Allocation — Allocates compute resources to data plane workers based on workload
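One way to picture the prioritization described above is a priority queue that falls back to first-in, first-out order for jobs of equal priority. This is a minimal in-memory sketch; `QueuedJob`, `JobQueue`, and the "lower number is more urgent" convention are assumptions made for the example, not the Factory's scheduler.

```python
import heapq
import itertools
from dataclasses import dataclass, field
from typing import Optional


@dataclass(order=True)
class QueuedJob:
    priority: int  # assumed convention: lower number = more urgent
    sequence: int  # tie-breaker that keeps FIFO order within a priority
    job_id: str = field(compare=False)


class JobQueue:
    """In-memory priority queue; a real scheduler would use a durable broker."""

    def __init__(self) -> None:
        self._heap: list[QueuedJob] = []
        self._counter = itertools.count()

    def enqueue(self, job_id: str, priority: int) -> None:
        heapq.heappush(self._heap, QueuedJob(priority, next(self._counter), job_id))

    def dequeue(self) -> Optional[str]:
        """Return the most urgent job ID, or None when the queue is empty."""
        return heapq.heappop(self._heap).job_id if self._heap else None
```

The sequence counter matters: without it, two jobs with the same priority would be compared by nothing at all, and their relative order would depend on heap internals.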
Run Lifecycle Management¶
- Run Creation — Validates run requests and creates run records
- State Tracking — Tracks run progress through states (Requested → Validated → Queued → Running → Succeeded/Failed/Cancelled)
- Lifecycle Transitions — Manages state transitions and enforces lifecycle rules
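The lifecycle above can be modelled as a small state machine. The states come from the list in this section; the transition table is an assumption, since the exact rules (for example, whether a run can fail during validation) are not spelled out here.

```python
from enum import Enum


class RunState(Enum):
    REQUESTED = "Requested"
    VALIDATED = "Validated"
    QUEUED = "Queued"
    RUNNING = "Running"
    SUCCEEDED = "Succeeded"
    FAILED = "Failed"
    CANCELLED = "Cancelled"


# Assumed transition rules; terminal states allow no further transitions.
ALLOWED_TRANSITIONS = {
    RunState.REQUESTED: {RunState.VALIDATED, RunState.FAILED, RunState.CANCELLED},
    RunState.VALIDATED: {RunState.QUEUED, RunState.CANCELLED},
    RunState.QUEUED: {RunState.RUNNING, RunState.CANCELLED},
    RunState.RUNNING: {RunState.SUCCEEDED, RunState.FAILED, RunState.CANCELLED},
    RunState.SUCCEEDED: set(),
    RunState.FAILED: set(),
    RunState.CANCELLED: set(),
}


def transition(current: RunState, target: RunState) -> RunState:
    """Return the new state, or raise ValueError if the move is not allowed."""
    if target not in ALLOWED_TRANSITIONS[current]:
        raise ValueError(f"illegal transition: {current.value} -> {target.value}")
    return target
```

Keeping the rules in one table makes it easy to enforce them in a single place before any state change is persisted.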
Validation and Policy Enforcement¶
- Input Validation — Validates run requests, templates, and configurations before execution
- Policy Enforcement — Enforces security policies, resource quotas, and compliance rules
- Pre-flight Checks — Verifies prerequisites (e.g., Azure DevOps access, template availability)
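The checks above might be combined into a single validation pass that collects every problem instead of stopping at the first one. The `RunRequest` fields, the quota parameter, and the error messages below are hypothetical and chosen only to illustrate input validation, quota enforcement, and a template availability check.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RunRequest:
    # Hypothetical request shape for the example.
    project_id: str
    template: str
    requested_workers: int


def validate(request: RunRequest, known_templates: set[str], worker_quota: int) -> list[str]:
    """Return a list of validation errors; an empty list means the run may proceed."""
    errors: list[str] = []
    if not request.project_id.strip():
        errors.append("project_id is required")
    if request.template not in known_templates:
        errors.append(f"unknown template: {request.template}")
    if request.requested_workers < 1:
        errors.append("requested_workers must be at least 1")
    elif request.requested_workers > worker_quota:
        errors.append(f"requested_workers exceeds quota of {worker_quota}")
    return errors
```

Returning all errors at once lets a caller fix a rejected request in one round trip.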
State Store¶
- Run Metadata — Stores run configuration, status, timestamps, and metadata
- Step Status — Tracks individual job/step status within runs
- Artifact References — Maintains references to generated artifacts (repo URLs, pipeline IDs, etc.)
- History — Retains run history for auditing, debugging, and analytics
Audit Logging¶
- Action Logging — Records all control plane actions (run creation, state changes, policy decisions)
- Compliance — Provides audit trails for compliance and governance
- Traceability — Links all actions to users, projects, and trace IDs
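A structured audit record makes the traceability requirement above concrete: every action carries the user, project, and trace ID it belongs to. The field names and the JSON line format below are assumptions for the sake of the example.

```python
import json
import uuid
from dataclasses import asdict, dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class AuditEvent:
    # Hypothetical audit record shape.
    action: str
    user_id: str
    project_id: str
    trace_id: str
    timestamp: str  # ISO 8601, UTC


def make_audit_event(
    action: str,
    user_id: str,
    project_id: str,
    trace_id: Optional[str] = None,
    now: Optional[datetime] = None,
) -> AuditEvent:
    """Build an immutable audit record, generating a trace ID if none is given."""
    moment = now or datetime.now(timezone.utc)
    return AuditEvent(action, user_id, project_id, trace_id or uuid.uuid4().hex, moment.isoformat())


def to_log_line(event: AuditEvent) -> str:
    """Serialize the event as one JSON line with stable key order."""
    return json.dumps(asdict(event), sort_keys=True)
```

Accepting `now` and `trace_id` as optional parameters keeps the function deterministic in tests while still working with no arguments in normal use.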
API Surface¶
- REST API — Exposes REST endpoints for run management, status queries, and configuration
- GraphQL API — Provides flexible querying for complex data requirements
- gRPC API — High-performance API for internal service-to-service communication
- WebSocket — Real-time updates for run status and progress
Data Plane Responsibilities¶
The Data Plane is responsible for executing work units that produce artifacts:
Repository Generation¶
- Git Repository Creation — Creates repositories in Azure DevOps or GitHub
- Code Generation — Generates code files, tests, and documentation
- Commit Management — Creates commits, branches, and pull requests
- Template Application — Applies templates and overlays to generate codebases
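Template application with overlays can be sketched as variable substitution, where overlay values take precedence over the template's base values. The `${name}` placeholder syntax and the `apply_template` helper are assumptions; the Factory's real template engine is not described in this section.

```python
from string import Template


def apply_template(
    files: dict[str, str],
    base: dict[str, str],
    overlay: dict[str, str],
) -> dict[str, str]:
    """Render template paths and bodies; overlay values override base values.

    Raises KeyError if a placeholder has no value in either mapping.
    """
    variables = {**base, **overlay}
    return {
        Template(path).substitute(variables): Template(body).substitute(variables)
        for path, body in files.items()
    }


if __name__ == "__main__":
    rendered = apply_template(
        {"src/${service}/README.md": "# ${service}\nOwner: ${owner}\n"},
        base={"service": "billing", "owner": "platform"},
        overlay={"owner": "payments"},
    )
    print(rendered)
```

Using `substitute` rather than `safe_substitute` means a missing variable fails loudly instead of leaving a stray placeholder in generated code.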
Pipeline Generation¶
- CI/CD Pipeline Creation — Generates Azure DevOps pipelines or GitHub Actions workflows
- Infrastructure-as-Code — Generates Bicep, Terraform, or Pulumi scripts
- Deployment Configurations — Creates Kubernetes manifests, Helm charts, and deployment configs
SaaS Scaffolding¶
- Microservice Generation — Scaffolds complete microservices with Clean Architecture structure
- Library Generation — Creates reusable libraries following ConnectSoft patterns
- Frontend Generation — Generates frontend applications and components
- Documentation Generation — Creates ADRs, runbooks, API documentation, and architecture diagrams
External System Integration¶
- Azure DevOps Integration — Creates repos, work items, pipelines, and artifacts
- Git Provider Integration — Interacts with GitHub, GitLab, or other Git providers
- Cloud Service Integration — Provisions and configures Azure resources (Key Vault, Service Bus, etc.)
- Notification Systems — Sends notifications via email, SMS, or webhooks
Deployment & Isolation¶
Control Plane Deployment¶
The Control Plane is deployed as long-running, stateful services:
- Orchestrator Service — Main orchestration service (typically 2-3 instances for HA)
- Scheduler Services — Job scheduling and queue management (can scale horizontally)
- API Gateway — Routes external requests to internal services
- State Database — Relational or document database for run state (with replication)
- Message Queue/Bus — Job queue and event bus (e.g., Azure Service Bus, RabbitMQ)
Characteristics:
- High Availability — Deployed with redundancy and failover
- Stateful — Maintains run state and coordination state
- Persistent Storage — Requires durable storage for state and audit logs
- Network Isolation — Typically deployed in a dedicated namespace or cluster
Data Plane Deployment¶
The Data Plane is deployed as stateless, autoscaled worker pools:
- Worker Pools — Separate pools for different job types (Repo Gen, Pipeline Gen, SaaS Gen)
- Horizontal Scaling — Workers scale up/down based on queue depth and workload
- Stateless — Workers don't maintain local state; all state lives in the Control Plane
- Isolation — Workers can be deployed in separate namespaces, node pools, or clusters
Characteristics:
- Stateless — No local state; workers fetch job context from Control Plane
- Autoscaled — Automatically scales based on queue depth and metrics
- Isolated — Can be isolated by job type, tenant, or security requirements
- Ephemeral — Workers can be terminated and replaced without affecting runs
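Queue-depth autoscaling can be reduced to a small calculation: divide the backlog by the number of jobs one worker should handle, round up, and clamp to the pool limits. The function name, the parameters, and the default limits below are assumptions for illustration only.

```python
import math


def desired_workers(
    queue_depth: int,
    jobs_per_worker: int,
    min_workers: int = 0,
    max_workers: int = 50,
) -> int:
    """Return the worker count for the current backlog, clamped to the pool limits."""
    if jobs_per_worker < 1:
        raise ValueError("jobs_per_worker must be at least 1")
    wanted = math.ceil(queue_depth / jobs_per_worker)
    return max(min_workers, min(max_workers, wanted))
```

With `min_workers=0` an empty queue yields zero workers, which corresponds to scale-to-zero; a production autoscaler would normally also add a cooldown period to avoid rapid scale-up and scale-down cycles.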
Isolation Strategies¶
The Factory supports multiple isolation strategies for the Data Plane:
- Namespace Isolation — Different worker pools in separate Kubernetes namespaces
- Node Pool Isolation — Dedicated node pools for different job types or tenants
- Cluster Isolation — Separate clusters for high-security or multi-tenant scenarios
- Network Isolation — Network policies and service meshes for traffic isolation
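To show how these strategies might combine, the sketch below maps a job to a namespace and node pool by job type, and gives selected tenants dedicated placement. All namespace names, pool names, and tenant IDs are invented for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Placement:
    namespace: str
    node_pool: str


# Hypothetical defaults: one namespace per job type on a shared node pool.
DEFAULT_PLACEMENTS = {
    "repo-gen": Placement("factory-repo-gen", "general"),
    "pipeline-gen": Placement("factory-pipeline-gen", "general"),
    "saas-gen": Placement("factory-saas-gen", "general"),
}

# Hypothetical tenants that require dedicated namespaces and node pools.
ISOLATED_TENANTS = {"tenant-secure"}


def place(job_type: str, tenant_id: str) -> Placement:
    """Pick where a job runs; raises KeyError for an unknown job type."""
    base = DEFAULT_PLACEMENTS[job_type]
    if tenant_id in ISOLATED_TENANTS:
        return Placement(f"{base.namespace}-{tenant_id}", f"dedicated-{tenant_id}")
    return base
```

Cluster-level and network-level isolation would sit outside a function like this, in infrastructure configuration rather than in routing code.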
Architecture Diagram¶
graph LR
    subgraph ControlPlane["Control Plane"]
        API[Factory API]
        Orchestrator[Orchestrator]
        Scheduler[Schedulers]
        RunStore[Run State Store]
        AuditSink[Audit Sink]
    end
    subgraph DataPlane["Data Plane"]
        WorkerA[Worker Pool A<br/>Repo Gen]
        WorkerB[Worker Pool B<br/>Pipeline Gen]
        WorkerC[Worker Pool C<br/>SaaS Gen]
    end
    API --> Orchestrator
    Orchestrator --> RunStore
    Orchestrator --> Scheduler
    Orchestrator --> AuditSink
    Orchestrator --> Queue[(Job Queue)]
    Queue --> WorkerA
    Queue --> WorkerB
    Queue --> WorkerC
    WorkerA --> RunStore
    WorkerB --> RunStore
    WorkerC --> RunStore
Key Interactions:
- Client → API → Orchestrator — Run requests flow through the API to the Orchestrator
- Orchestrator → RunStore — The Orchestrator persists run state
- Orchestrator → Scheduler — The Orchestrator delegates scheduling decisions
- Orchestrator → Queue — The Orchestrator enqueues jobs for execution
- Queue → Workers — Workers consume jobs from the queue
- Workers → RunStore — Workers update run state with progress and results
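The interactions above can be walked through with an in-memory stand-in for each component: a dictionary plays the run state store and `queue.Queue` plays the job queue. The function names, record fields, and handler mapping are hypothetical; a real deployment would use a database and a message broker.

```python
import queue
from typing import Callable


def submit_run(run_id: str, job_type: str, store: dict, jobs: queue.Queue) -> None:
    """Orchestrator side: persist the run, then enqueue its job."""
    store[run_id] = {"state": "Queued", "job_type": job_type, "result": None}
    jobs.put({"run_id": run_id, "job_type": job_type})


def worker_step(store: dict, jobs: queue.Queue, handlers: dict[str, Callable[[dict], str]]) -> bool:
    """Worker side: take one job, run it, and write the outcome back to the store.

    Returns False when the queue is empty.
    """
    try:
        job = jobs.get_nowait()
    except queue.Empty:
        return False
    run = store[job["run_id"]]
    run["state"] = "Running"
    try:
        run["result"] = handlers[job["job_type"]](job)
        run["state"] = "Succeeded"
    except Exception as exc:  # any handler failure marks the run as failed
        run["result"] = str(exc)
        run["state"] = "Failed"
    return True
```

Note that the worker keeps nothing between calls: everything it needs arrives with the job, and everything it produces goes back to the store.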
Benefits of Separation¶
Scalability¶
- Independent Scaling — Control plane and data plane scale independently based on different metrics
- Horizontal Scaling — Data plane workers can scale to thousands of instances
- Resource Optimization — Control plane (CPU-light) and data plane (CPU-heavy) can use different instance types
Reliability¶
- Failure Isolation — Worker failures don't affect the control plane; control plane failures can pause work without losing state
- Graceful Degradation — Control plane can continue operating even if some worker pools are unavailable
- Recovery — Failed workers can be replaced without losing in-flight runs: because run state lives in the control plane, an interrupted job can be picked up again by another worker
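One common way to get this kind of recovery is a lease: a worker holds a job for a limited time, and if it neither completes nor renews the lease, the job returns to the queue. The `LeasedQueue` class below is a hypothetical in-memory sketch of that idea, not the Factory's mechanism; many message brokers provide a comparable lock or visibility timeout.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class LeasedQueue:
    lease_seconds: float
    pending: list[str] = field(default_factory=list)
    leases: dict[str, float] = field(default_factory=dict)  # job_id -> lease expiry time

    def lease(self, now: float) -> Optional[str]:
        """Hand the next pending job to a worker and start its lease."""
        if not self.pending:
            return None
        job_id = self.pending.pop(0)
        self.leases[job_id] = now + self.lease_seconds
        return job_id

    def complete(self, job_id: str) -> None:
        """Called by a worker that finished the job; the lease is released."""
        self.leases.pop(job_id, None)

    def requeue_expired(self, now: float) -> list[str]:
        """Return jobs whose worker went silent and make them pending again."""
        expired = [job_id for job_id, deadline in self.leases.items() if deadline <= now]
        for job_id in expired:
            del self.leases[job_id]
            self.pending.append(job_id)
        return expired
```

Because a job can be delivered more than once under this scheme, job handlers should be idempotent or check the run state before repeating side effects.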
Security¶
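- Network Isolation — Data plane workers can run in network segments isolated from the control plane and external networks, with only the connections they need (such as the job queue and run state store) allowed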
- Access Control — Different security policies for control plane (admin access) vs data plane (limited access)
- Compliance — Control plane can enforce compliance policies before work execution
Operational Excellence¶
- Deployment Independence — Control plane and data plane can be deployed and updated independently
- Monitoring Separation — Different monitoring and alerting for control plane (availability) vs data plane (throughput)
- Cost Optimization — Control plane (always-on) vs data plane (scale-to-zero) enables cost optimization
Related Documentation¶
- Execution Engine — How runs and jobs are executed through the queueing model
- State & Memory — How run state is stored and integrated with AI memory
- Observability — How control plane and data plane are monitored and observed
- Overall Platform Architecture — High-level Factory architecture
- Orchestration Layer — Orchestration design and patterns