Infrastructure Architect Agent Specification
Purpose
The Infrastructure Architect Agent is responsible for designing and generating the foundational cloud infrastructure required to operate the ConnectSoft AI Software Factory at scale, using multi-cloud, tenant-aware, IaC-first principles.
Its mission is to translate high-level architectural intent (compute needs, data isolation, security boundaries, etc.) into declarative, reproducible infrastructure modules, using tools like Bicep, Terraform, and Pulumi (C#).
Why This Agent Matters
Without this agent:
- Infrastructure setup would be manual, inconsistent, and error-prone
- Multi-tenant environments would suffer from cross-tenant leakage or drift
- Developers would waste time managing infrastructure instead of building logic
- Observability and identity boundaries wouldn't be enforced from the ground up
- Services couldn't scale or operate securely across cloud regions/providers
With the Infrastructure Architect Agent:
- ✅ All infrastructure is code-generated, version-controlled, and composable
- ✅ Teams get ready-to-deploy, secure environments with no human provisioning
- ✅ Each tenant and microservice has proper isolation, cost tagging, and access boundaries
- ✅ Infrastructure definitions are portable across Azure, AWS, GCP, or hybrid setups
- ✅ Provisioning supports Bicep, Terraform, and now Pulumi (C#), enabling .NET-native IaC
What This Agent Enables
| Capability | Impact |
|---|---|
| Cloud-native networking | VNet, subnets, NAT, firewalls, private DNS |
| Compute orchestration | AKS/EKS clusters, node pools, autoscaling |
| Identity & access | Managed Identity, IAM roles, service principals |
| Storage provisioning | Azure Blob, S3, databases, backup vaults |
| Environment scaffolding | Per-tenant or per-stage logical groupings |
| Policy-as-code | RBAC, NSGs, audit logging, telemetry agents |
| Reusability & portability | Bicep + Terraform + Pulumi (C#) bundles |
Position in the Factory Pipeline
flowchart TD
SolutionArchitect --> InfrastructureArchitect
InfrastructureArchitect --> DevOpsArchitect
InfrastructureArchitect --> CloudArchitectureAgent
InfrastructureArchitect --> SecurityArchitect
InfrastructureArchitect --> ObservabilityAgent
Reusable Infrastructure Scenarios
| Scenario | Provisioned By Agent |
|---|---|
| New microservice with DB + Key Vault | ✅ VNet, AKS node pool, Azure SQL, KV |
| Staging environment per tenant | ✅ Subnet, node pool, traffic filtering |
| Multi-cloud deployment (Azure + AWS) | ✅ Terraform + Pulumi dual-mode templates |
| Cluster auto-scaling with cost budget tags | ✅ Autoscaler + tagging in IaC |
Vision Alignment
This agent upholds ConnectSoft's values:
- Cloud-Native
- Security-First
- Event-Driven & Observable
- Multi-Tenant Ready
- Automated, Reproducible, and Composable
Scope of Influence
The Infrastructure Architect Agent governs the infrastructure control plane across all environments and tenants.
Its outputs shape the physical, logical, and virtual boundaries for every microservice, data flow, identity relationship, and cloud resource used by the ConnectSoft AI Software Factory.
This scope includes multi-region, multi-cloud, and multi-tenant infrastructure definitions.
System Layers Controlled by the Agent
| Layer | Components Controlled |
|---|---|
| Networking | VNet, Subnet, Private DNS, Peering, NAT, NSGs, Public IP, Private Endpoints |
| Compute | AKS/EKS/GKE clusters, node pools, autoscaling groups, container runtimes |
| Storage | Azure Blob, S3 Buckets, PostgreSQL/MySQL, File Shares, Disks, Backups |
| Identity & Access | Managed Identity, IAM Roles, RBAC bindings, service principal outputs |
| Secrets & Key Management | Azure Key Vault, AWS KMS, GCP Secrets, integration paths |
| Environment Isolation | Per-tenant staging spaces, dedicated compute/storage profiles |
| Provisioning Mode | Bicep, Terraform, and Pulumi (C#) IaC backends for every artifact |
| Observability Layer | Log routing (Log Analytics, CloudWatch), OpenTelemetry endpoints |
| DNS & Traffic Control | Zones, Ingress Controllers, Route Tables, Load Balancers |
Target Environments Provisioned
| Environment | Resources Created |
|---|---|
| Development | Shared AKS pool, dev-only DNS, reduced autoscaling |
| Staging | Cluster with cost tags, tracing on, mTLS across microservices |
| Production | Hardened AKS/EKS, regional failover, Key Vault with HSM |
| Per-Tenant | Optional isolated subnet, DNS, key ring, secret namespace |
| Hybrid (Local + Cloud) | Self-hosted runner + cloud ingress, VPN/peering |
Agent Output Mode Coverage
| Output Type | Description |
|---|---|
| Bicep | Azure-native IaC with parameterized modules |
| Terraform | Multi-cloud abstraction supporting Azure, AWS, GCP |
| Pulumi (C#) | .NET-native, developer-friendly infrastructure declarations |
| Mermaid Topology Diagram | Visual VNet/cluster/service layout |
| Provisioning Events | Emits InfraProvisioningPlanPublished and optional diff events |
Scope Summary Diagram
graph TD
A[VNet/Subnet/DNS]
B[AKS Node Pools]
C[Key Vault + IAM]
D[Storage Accounts/DBs]
E[Pulumi + Terraform + Bicep]
F[OpenTelemetry Collector]
G[Ingress + Load Balancer]
A --> E
B --> E
C --> E
D --> E
F --> E
G --> E
Key Impact Zones
| Impacted Layer | Agent Enforcement |
|---|---|
| Cost-aware autoscaling | Node pool budget tags and usage quotas |
| Zero Trust networking | NSGs, private links, enforced mTLS |
| Least privilege IAM | MSI roles, scoped secrets, tenant RBAC |
| Modular IaC | Every component pluggable and reusable |
| Reusable patterns | Shared modules across tenants, clouds, services |
| Observability-on-provision | OpenTelemetry spans built into infra creation |
Core Responsibilities
The Infrastructure Architect Agent is accountable for designing, provisioning, securing, and documenting the infrastructure foundation that powers every microservice, gateway, and subsystem in the ConnectSoft AI Software Factory.
Its responsibilities ensure that infrastructure is:
- Declaratively defined
- Multi-cloud and multi-tenant ready
- Secure by default
- Environment-aware
- Observability-aligned
- Compatible with DevOps and Cloud Architecture agents
1. Provisioning Blueprint Generation
| Task | Output |
|---|---|
| Generate per-environment infrastructure definitions | infra.bicep, main.tf, PulumiInfra.cs |
| Structure modules for reusability (e.g., VNet, AKS, KV) | Modularized IaC blocks |
| Define naming conventions, tags, location strategies | Global infra naming policy module |
| Emit variant manifests (e.g., Dev, Staging, Prod, TenantX) | infra.{env}.bicep, infra.{tenant}.tf |
2. Identity and Access Layer Definition
| Task | Output |
|---|---|
| Define system-assigned or user-assigned Managed Identities | identity-map.yaml |
| Bind IAM roles to service scopes (e.g., read, contributor, custom roles) | IAM policy definitions per cloud |
| Map identity outputs to agents for later use (DevOps, Security) | Shared output map with resource_id, client_id, secret_ref |
3. Network Topology Planning
| Task | Output |
|---|---|
| Define VNet + Subnet per environment or tenant | network-topology.yaml |
| Configure NSGs, DNS, peering, and egress rules | Enforced through PulumiInfra.cs or network.tf |
| Control ingress exposure and service mesh compatibility | Mesh annotation support for Istio/Linkerd/Kuma |
4. Storage and Data Service Definition
| Task | Output |
|---|---|
| Define Blob/S3 buckets, access tiers, replication, encryption | storage-definition.yaml |
| Provision PostgreSQL, SQL, or CosmosDB | Configurable DB-as-a-service modules |
| Define backup vaults, data retention, TTL settings | Storage compliance blueprint injected from policy |
5. Observability and Monitoring Setup
| Task | Output |
|---|---|
| Deploy OpenTelemetry Collector as sidecar or managed infra | Optional toggle via policy |
| Route logs to Azure Monitor, CloudWatch, or custom endpoints | otel-agent.yaml output |
| Enable service health metrics via K8s | Auto-wired probes into deployment specs |
| Emit InfraProvisioningSpans for all provisioning actions | All agent actions are observable |
6. Reusability and Policy Propagation
| Task | Output |
|---|---|
| Reuse core templates across projects | Parameterized Bicep, Pulumi classes, Terraform modules |
| Integrate organizational naming/tagging/security standards | Tag injection logic with owner, env, billing_code |
| Emit global infra-policy-map.yaml to downstream agents | Shared policy alignment for Security and DevOps Architect Agents |
7. Lifecycle & Compliance Hooks
| Task | Output |
|---|---|
| Emit InfraProvisioningPlanPublished with trace_id and artifact list | Triggers downstream agent coordination |
| Emit drift detection events if previous state differs | Optional in audit mode |
| Output infra-metadata.json per run | Includes all provisioned resource IDs and traceability fields |
| Provide rollback plan (infra-rollback.yaml) when enabled | Describes deletions, safe resource tear-down steps |
Summary
| Responsibility Area | Delivered By Agent |
|---|---|
| IaC output (Bicep, Terraform, Pulumi) | ✅ |
| Networking & DNS topology | ✅ |
| Identity, RBAC, and IAM config | ✅ |
| Storage definition and protection | ✅ |
| Observability layer pre-wired | ✅ |
| Reusability, modularity, compliance | ✅ |
| Traceable provisioning lifecycle | ✅ |
Core Inputs
The Infrastructure Architect Agent consumes architectural intent and operational metadata from upstream agents and policy sources to generate IaC-compatible infrastructure blueprints.
These inputs drive the agent's ability to:
- Choose the right cloud services
- Structure the network and security model
- Enforce cost and compliance constraints
- Output per-environment or per-tenant variations
- Align with platform observability and deployment expectations
Required Input Artifacts
| Artifact | Source Agent | Purpose |
|---|---|---|
| solution-architecture.md | Solution Architect Agent | Defines global cloud targets, service composition, zones |
| application-architecture.md | Application Architect Agent | Describes logical services, access groups, service mesh topology |
| deployment-strategy.yaml | DevOps Architect Agent | Determines target environments and staging separation |
| identity-policy.yaml | Security Architect Agent | Defines IAM/RBAC/MSI rules for cloud resources |
| resource-configuration.yaml | Cloud Architecture Agent | Specifies node pool sizes, DNS rules, ingress exposure |
| field-retention-map.yaml | Data Architect Agent | Drives storage encryption and replication flags |
| observability-policy.yaml | Observability Agent | Ensures metrics, logs, and tracing are pre-integrated |
| previous-infra-state.json (optional) | From repo | Enables drift detection and versioned rollback |
Sample Input: resource-configuration.yaml
resources:
  compute:
    provider: Azure
    cluster_type: AKS
    node_pool:
      size: Standard_D4s_v3
      autoscale: true
      max_nodes: 10
  dns:
    zone: "internal.connectsoft.dev"
    private_dns_enabled: true
  storage:
    account_tier: Standard_LRS
    replication: GRS
Sample Input: identity-policy.yaml
iam:
  enable_managed_identity: true
  role_bindings:
    - role: Contributor
      principal: microservice@connectsoft
    - role: Key Vault Reader
      principal: devops-agent@connectsoft
tenant_isolation:
  enabled: true
  mode: soft
Sample Input: observability-policy.yaml
otel:
  collector:
    enabled: true
    type: sidecar
  spans:
    provision_infra: true
    storage_attached: true
  exporters:
    - type: otlp
      endpoint: https://otel.connectsoft.dev
Input Validation Rules
| Rule | Description |
|---|---|
| Must specify at least one compute cluster (AKS, EKS, GKE) | ✅ |
| IAM roles must reference valid service principals or MSIs | ✅ |
| DNS zone must follow platform naming convention | ✅ |
| Missing observability policy → fallback to basic collector + OTLP spans | ✅ |
| Storage configuration must match retention + compliance map | ✅ |
| Identity policies must align with tenant model (shared/isolated) | ✅ |
| If no resource config provided → suggest default AKS dev config | ✅ |
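A few of these rules could be checked mechanically before any IaC is generated. The following is a minimal sketch, not the agent's actual validator: the function name `validate_resource_config`, the suffix-based DNS check, and the contents of `DEFAULT_DEV_CONFIG` are assumptions made for illustration.

```python
ALLOWED_CLUSTERS = {"AKS", "EKS", "GKE"}

# Hypothetical default; the rules only say a default AKS dev config is suggested.
DEFAULT_DEV_CONFIG = {
    "resources": {"compute": {"provider": "Azure", "cluster_type": "AKS"}}
}


def validate_resource_config(config, dns_suffix="connectsoft.dev"):
    """Return (config, errors); fall back to the dev default when config is missing."""
    if not config:
        return DEFAULT_DEV_CONFIG, []
    errors = []
    resources = config.get("resources", {})
    cluster = resources.get("compute", {}).get("cluster_type")
    if cluster not in ALLOWED_CLUSTERS:
        errors.append("compute.cluster_type must be one of AKS, EKS, GKE")
    zone = resources.get("dns", {}).get("zone")
    # Assumed naming convention: zones live under the platform domain.
    if zone is not None and not zone.endswith(dns_suffix):
        errors.append(f"dns.zone must end with {dns_suffix}")
    return config, errors
```

Returning the error list instead of raising lets a caller report every violation in one pass.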
Input Prompt Snippet Example
assignment: provision-infrastructure
project:
  trace_id: trace-infra-44188
  service_name: NotificationService
  environment: Staging
  tenant_id: tenant-42
inputs:
  solution_architecture_url: https://.../solution-architecture.md
  resource_configuration_url: https://.../notification/resource-configuration.yaml
  identity_policy_url: https://.../notification/identity-policy.yaml
  observability_policy_url: https://.../notification/observability.yaml
  previous_infra_state_url: https://.../notification/infra-v1.1.0.json
settings:
  output_format: [bicep, terraform, pulumi]
  multi_cloud_enabled: false
  rollback_safe: true
  emit_diagram: true
Metadata Propagation
Every input maps to:
- trace_id: present in every output file and event
- tenant_id, environment: used in naming, resource grouping, tagging
- agent_version: helps downstream validation
- cloud_provider: decides Bicep (Azure), Terraform (multi-cloud), or Pulumi (C#/.NET)
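As an illustration of how cloud_provider might narrow the requested output formats, here is a small sketch. The support matrix is an assumption: Bicep targets Azure only, while Terraform and Pulumi are treated as available on every provider. The function name `select_iac_formats` is hypothetical.

```python
# Assumed support matrix; Bicep is Azure-only, the other two are multi-cloud.
SUPPORTED_FORMATS = {
    "azure": ["bicep", "terraform", "pulumi"],
    "aws": ["terraform", "pulumi"],
    "gcp": ["terraform", "pulumi"],
}


def select_iac_formats(cloud_provider, requested):
    """Keep only the requested output formats that the provider supports."""
    supported = SUPPORTED_FORMATS.get(cloud_provider.lower())
    if supported is None:
        raise ValueError(f"unknown cloud provider: {cloud_provider}")
    return [fmt for fmt in requested if fmt in supported]
```

With the `output_format` list from the prompt snippet, an Azure target keeps all three formats and an AWS target drops `bicep`.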
Core Outputs
The Infrastructure Architect Agent emits a complete infrastructure-as-code (IaC) blueprint for each service, tenant, and environment, using multi-mode outputs:
- Bicep for Azure-native deployments
- Terraform for multi-cloud compatibility
- Pulumi (C#) for .NET-native teams
Each output is environment-specific, versioned, and traceable, and includes metadata for rollback, observability, and security review.
Artifact Output Summary
| Artifact | Format | Purpose |
|---|---|---|
| infra.bicep | Bicep | Azure-native definition (VNets, AKS, KV, etc.) |
| main.tf | Terraform | Multi-cloud support (AWS, Azure, GCP) |
| PulumiInfra.cs | C# (Pulumi) | .NET-based IaC, same topology as Bicep/TF |
| network-topology.yaml | YAML | Declares subnets, IP ranges, NSGs, DNS |
| identity-map.yaml | YAML | Maps service identities, roles, scopes |
| storage-definition.yaml | YAML | Blob buckets, database plans, backup policies |
| infra-metadata.json | JSON | Traceable outputs: IDs, trace_id, environment, version |
| infra-rollback.yaml | YAML | Describes how to safely undo/replace resources |
| infra-policy-map.yaml | YAML | Injected tags, constraints, quotas, compliance requirements |
| infra-topology.mmd | Mermaid | Visual network and compute diagram |
| InfraProvisioningPlanPublished | Event | Lifecycle event for CI/CD and Cloud Architecture Agent |
Output Example: infra.bicep
// Parameter declarations are assumed here; the resource below references them.
param serviceName string
param environment string
param location string = resourceGroup().location

resource aks 'Microsoft.ContainerService/managedClusters@2022-03-01' = {
  name: '${serviceName}-aks-${environment}'
  location: location
  identity: {
    type: 'SystemAssigned'
  }
  properties: {
    dnsPrefix: '${serviceName}-${environment}'
    agentPoolProfiles: [
      {
        name: 'default'
        count: 2
        vmSize: 'Standard_DS2_v2'
        osType: 'Linux'
        mode: 'System'
      }
    ]
  }
}
Output Example: PulumiInfra.cs
var resourceGroup = new ResourceGroup($"{project}-rg");
var vnet = new VirtualNetwork("vnet", new VirtualNetworkArgs {
    AddressSpaces = { "10.0.0.0/16" },
    ResourceGroupName = resourceGroup.Name,
    Location = resourceGroup.Location
});
var aks = new KubernetesCluster("aks", new KubernetesClusterArgs {
    ResourceGroupName = resourceGroup.Name,
    AgentPoolProfiles = {
        new KubernetesClusterAgentPoolProfileArgs {
            Name = "agentpool",
            Count = 2,
            VmSize = "Standard_D2_v2",
            Mode = "System"
        }
    },
    DnsPrefix = $"{project}-k8s",
    Identity = new KubernetesClusterIdentityArgs {
        Type = "SystemAssigned"
    }
});
Output Example: network-topology.yaml
vnet:
  name: vnet-notification-staging
  address_space: 10.1.0.0/16
  subnets:
    - name: public
      address_prefix: 10.1.1.0/24
    - name: private
      address_prefix: 10.1.2.0/24
dns_zone: internal.connectsoft.dev
peering:
  enabled: true
  targets:
    - core-services
    - telemetry-network
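Because the Mermaid diagram has to match the declared topology, one option is to derive it directly from the parsed topology document. The sketch below assumes the YAML has already been loaded into a dict with `vnet` and `peering` as top-level keys; the function name `render_topology_mermaid` and the node naming are illustrative only.

```python
def render_topology_mermaid(topology):
    """Render a parsed network-topology mapping as Mermaid 'graph TD' text."""
    vnet = topology["vnet"]
    name, space = vnet["name"], vnet["address_space"]
    lines = ["graph TD", f'    vnet["{name} ({space})"]']
    for i, subnet in enumerate(vnet.get("subnets", [])):
        label = f'{subnet["name"]} ({subnet["address_prefix"]})'
        lines.append(f'    subnet{i}["{label}"]')
        lines.append(f"    vnet --> subnet{i}")
    peering = topology.get("peering", {})
    # Peering targets are drawn only when peering is enabled.
    targets = peering.get("targets", []) if peering.get("enabled") else []
    for j, target in enumerate(targets):
        lines.append(f'    peer{j}["{target}"]')
        lines.append(f"    vnet -.-> peer{j}")
    return "\n".join(lines)
```

Generating the diagram from the same data as the IaC keeps the two from diverging, which is what the completeness rule on diagrams asks for.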
Output Example: infra-metadata.json
{
  "trace_id": "trace-infra-44411",
  "service": "NotificationService",
  "environment": "Staging",
  "provisioned_at": "2025-05-02T01:34:00Z",
  "agent_version": "1.3.0",
  "artifacts": {
    "bicep": "infra.bicep",
    "terraform": "main.tf",
    "pulumi": "PulumiInfra.cs",
    "metadata": "infra-metadata.json"
  }
}
Output Completeness Rules
| Requirement | Status |
|---|---|
| All outputs must include trace_id, agent_version, and environment | ✅ |
| Must emit at least one IaC format (Bicep, TF, Pulumi) | ✅ |
| All compute and network resources must be grouped per environment | ✅ |
| Storage, identity, and DNS outputs must be traceable via IDs | ✅ |
| Mermaid diagram must match declared topology | ✅ |
| Lifecycle event InfraProvisioningPlanPublished must be emitted | ✅ |
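The first two rules lend themselves to an automated check against infra-metadata.json. This is a sketch under the assumption that the metadata has the shape shown in the example above; `check_output_completeness` is a hypothetical helper name.

```python
REQUIRED_FIELDS = ("trace_id", "agent_version", "environment")
IAC_FORMATS = ("bicep", "terraform", "pulumi")


def check_output_completeness(metadata):
    """Return a list of problems; an empty list means both rules are met."""
    problems = [f"missing {field}" for field in REQUIRED_FIELDS if not metadata.get(field)]
    artifacts = metadata.get("artifacts", {})
    if not any(fmt in artifacts for fmt in IAC_FORMATS):
        problems.append("no IaC artifact (bicep, terraform, or pulumi)")
    return problems
```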
Event Emission Example
{
  "event": "InfraProvisioningPlanPublished",
  "trace_id": "trace-infra-44411",
  "service": "NotificationService",
  "environment": "Staging",
  "outputs": {
    "pulumi": "PulumiInfra.cs",
    "terraform": "main.tf",
    "bicep": "infra.bicep",
    "metadata": "infra-metadata.json"
  },
  "timestamp": "2025-05-02T01:34:00Z"
}
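The event payload can be derived from the metadata manifest so that the two never disagree. A minimal sketch follows, assuming the metadata dict carries the fields shown in the infra-metadata.json example; the function name `build_plan_published_event` and its optional `now` parameter are illustrative.

```python
from datetime import datetime, timezone


def build_plan_published_event(metadata, now=None):
    """Build the lifecycle event payload from an infra-metadata mapping."""
    moment = now or datetime.now(timezone.utc)
    return {
        "event": "InfraProvisioningPlanPublished",
        "trace_id": metadata["trace_id"],
        "service": metadata["service"],
        "environment": metadata["environment"],
        # Copy so later changes to the event do not alter the manifest.
        "outputs": dict(metadata["artifacts"]),
        "timestamp": moment.strftime("%Y-%m-%dT%H:%M:%SZ"),
    }
```

Passing `now` explicitly makes the timestamp deterministic in tests.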
Agent Knowledge Base Overview
The Infrastructure Architect Agent leverages a curated infrastructure knowledge base that includes:
- Reusable IaC templates and modules
- Multi-cloud best practices
- Resource naming conventions and tagging strategies
- Known patterns for tenant isolation, zone redundancy, and cost optimization
- Policy-driven configurations for storage, identity, telemetry, and security
This knowledge base is applied uniformly across all IaC output modes: Bicep, Terraform, and Pulumi (C#).
1. Infrastructure Modules Library
| Module | Description |
|---|---|
| aks_cluster | Parametric node pool with MSI, DNS, autoscaling |
| vnet_base | VNet + subnets + NSG with peer routing |
| keyvault_module | Vault + access policy bindings + purge protection |
| blob_storage | Secure container with private endpoint, encryption |
| dns_zone | Azure Private DNS or Route53 zone per tenant/environment |
| otel_collector | Injected as deployment or managed agent with routing |
| role_assignments | RBAC/IAM bindings for services and managed identities |
2. Cloud-Specific Knowledge
| Cloud | Agent Behavior |
|---|---|
| Azure | Uses Bicep templates + ARM native API ID mappings |
| AWS | Terraform modules for VPC, IAM, EKS, S3, CloudWatch |
| GCP | Terraform modules for VPC, GKE, IAM, KMS, Stackdriver |
| Hybrid/Local | Defaults to self-hosted Kubernetes + DNS stub zones |
| Pulumi (C#) | Mirrors Bicep/TF topologies using Pulumi SDKs + .NET |
3. Identity and Access Policies
| Pattern | Rule |
|---|---|
| Default identity type = SystemAssigned MSI or Service Principal | |
| Least privilege: services get only minimal roles (Reader, KV Reader) | |
| Tenant isolation: per-tenant resource group or role segregation | |
| Access to secrets: always via vault reference, not hardcoded | |
| identity-map.yaml reused across DevOps and Security Agents | |
4. Naming and Tagging Conventions
| Convention | Example |
|---|---|
| env-service-region | prod-notification-eus |
| Tags: project, owner, env, trace_id | Used across all provisioned resources |
| Vault secrets: connectsoft/service/env/key | Always prefixed with connectsoft |
| DNS zones: internal.connectsoft.dev + tenant subdomains | Used for service resolution and peering |
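These conventions are simple enough to express as small helper functions. The sketch below follows the patterns in the table; the helper names `resource_name`, `standard_tags`, and `secret_path` are hypothetical, and lowercasing the name parts is an assumption.

```python
def resource_name(env, service, region):
    """Build a name following the env-service-region convention."""
    return "-".join(part.lower() for part in (env, service, region))


def standard_tags(project, owner, env, trace_id):
    """Return the tag set applied to every provisioned resource."""
    return {"project": project, "owner": owner, "env": env, "trace_id": trace_id}


def secret_path(service, env, key):
    """Build a vault secret path, always prefixed with connectsoft."""
    return f"connectsoft/{service}/{env}/{key}"
```

For example, `resource_name("prod", "notification", "eus")` yields the `prod-notification-eus` name shown in the table.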
5. Reuse & Template Inheritance Rules
| Component | Reuse Scope |
|---|---|
| Subnet definitions | Shared across services in same VNet |
| AKS node pool sizes | Reused per environment class (dev/staging/prod) |
| OpenTelemetry agent config | Injected if observability-policy.yaml is missing |
| Bicep, Terraform, Pulumi modules | Always versioned and imported from platform IaC registry |
Semantic Memory Lookup Example
query: "provision AKS for production service with secure vault"
match:
  aks_template: aks_cluster_v3
  storage_encryption: enabled
  key_vault_binding: connectsoft-vault/prod/notification
  observability: otel_collector + log analytics
Benefits of the Knowledge Base
| Benefit | Outcome |
|---|---|
| Accelerated infra generation | Plug-and-play templates in Bicep/TF/Pulumi |
| Policy consistency | Same IAM, DNS, security posture across environments |
| Cloud abstraction | Terraform & Pulumi ensure portability |
| Reusable and testable modules | All IaC artifacts inherit tested structure |
| Infrastructure observability by default | Pre-integrated OTEL, logs, metrics |
| Semantic alignment | All infra outputs traceable and driven by shared intent |
End-to-End Process Flow
The Infrastructure Architect Agent executes a deterministic, traceable, and reusable process to generate secure, environment-ready infrastructure blueprints across Azure, AWS, GCP, and Pulumi (.NET) ecosystems.
It coordinates input parsing, module resolution, IaC generation, observability injection, and lifecycle event emission, ensuring every microservice or environment is cloud-ready.
Step-by-Step Flow
| Step | Description |
|---|---|
| 1. Input Collection & Validation | Load the required input files and validate the essential parameters. The files are solution-architecture.md, resource-configuration.yaml, identity-policy.yaml, field-retention-map.yaml, observability-policy.yaml, and optionally previous-infra-state.json for drift detection. Validation covers cloud provider selection, environment tags, and mandatory elements such as compute, DNS, and vault configurations. |
| 2. Module Resolution | Select base templates from the module library based on the inputs (e.g., aks_cluster, keyvault_module). Determine which IaC format(s) to generate (Bicep, Terraform, Pulumi C#). Check for custom overrides or tenant-specific constraints that affect the generated infrastructure. |
| 3. IaC Generation | Emit IaC artifacts for: Network (VNet, subnets, NSGs, peering); Compute (AKS, node pools, autoscaling groups); Storage (Blob/S3, databases, vaults); Identity (MSI, roles, bindings); DNS (zones, resolution paths, private endpoints). Output formats are infra.bicep, main.tf, and PulumiInfra.cs. |
| 4. Policy Injection & Observability Wiring | Apply the naming/tagging strategy, embed trace metadata (trace_id, service_name, environment) for distributed tracing, create RBAC/IAM bindings from identity-policy.yaml, and wire the OpenTelemetry collector and spans. Validate that all resources carry billing and governance tags and that tracing spans and log sinks are configured. |
| 5. Topology & Rollback Generation | Generate network-topology.yaml (network architecture), storage-definition.yaml (storage configuration), identity-map.yaml (identity configuration), infra-topology.mmd (Mermaid diagram), and, if rollback_safe: true, infra-rollback.yaml. Optionally diff against previous-infra-state.json and emit infra-drift-detected.yaml if resources changed or are missing. |
| 6. Lifecycle Metadata + Event Publishing | Compose infra-metadata.json containing agent version, build ID, resource count, regions used, trace ID, and the output map (links to Bicep/Terraform/Pulumi). Emit the InfraProvisioningPlanPublished lifecycle event. Optionally emit InfraDriftDetected, InfraRollbackReady, or InfraProvisioningFailed, depending on the outcome. |
1. Input Collection & Validation
- Load:
  - solution-architecture.md
  - resource-configuration.yaml
  - identity-policy.yaml
  - field-retention-map.yaml
  - observability-policy.yaml
  - previous-infra-state.json (optional, for drift detection)
- Validate:
  - Cloud provider selection
  - Environment tags and tenant mappings
  - Mandatory elements (e.g., compute + DNS + vault)
2. Module Resolution
- Based on inputs:
  - Select base templates from module library (aks_cluster, keyvault_module, etc.)
  - Determine cloud-specific IaC format(s) to generate (Bicep, Terraform, Pulumi C#)
  - Check for custom overrides or tenant-specific constraints
3. IaC Generation
- Emit IaC artifacts for:
  - Network: VNet, subnets, NSGs, peering
  - Compute: AKS, node pools, autoscaling groups
  - Storage: Blob/S3, DBs, vaults
  - Identity: MSI, roles, bindings
  - DNS: zones, resolution paths, private endpoints
- Formats:
  - infra.bicep
  - main.tf
  - PulumiInfra.cs
4. Policy Injection & Observability Wiring
- Apply:
  - Naming/tagging strategy
  - Trace ID embedding (trace_id, service_name, environment)
  - RBAC/IAM bindings based on identity-policy.yaml
  - OpenTelemetry collector & span wiring
- Validate:
  - All resources tagged with billing and governance labels
  - At least one tracing span and log sink configured
5. Topology & Rollback Generation
- Emit:
  - network-topology.yaml
  - storage-definition.yaml
  - identity-map.yaml
  - infra-topology.mmd (Mermaid)
  - infra-rollback.yaml if rollback_safe: true
- Optionally:
  - Diff from previous-infra-state.json
  - Emit infra-drift-detected.yaml if resources changed or missing
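The optional drift check amounts to comparing two resource maps. The sketch below assumes both the previous state and the new plan can be reduced to a mapping from resource ID to a properties dict; the schema of previous-infra-state.json is not specified here, so that shape and the name `detect_drift` are assumptions.

```python
def detect_drift(previous_resources, current_resources):
    """Compare two {resource_id: properties} maps and report differences."""
    prev_ids, curr_ids = set(previous_resources), set(current_resources)
    return {
        "added": sorted(curr_ids - prev_ids),
        "missing": sorted(prev_ids - curr_ids),
        "changed": sorted(
            rid for rid in prev_ids & curr_ids
            if previous_resources[rid] != current_resources[rid]
        ),
    }
```

A drift report could then be emitted whenever `missing` or `changed` is non-empty.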
6. Lifecycle Metadata + Event Publishing
- Compose infra-metadata.json with:
  - Agent version
  - Build ID
  - Resource count
  - Regions used
  - Trace ID
  - Output map (bicep/terraform/pulumi links)
- Emit:
  - InfraProvisioningPlanPublished lifecycle event
  - (Optional) InfraDriftDetected, InfraRollbackReady, InfraProvisioningFailed
Mermaid Process Diagram
flowchart TD
A[Input Parsing] --> B[Template Resolution]
B --> C[IaC Generation - Bicep, TF, Pulumi]
C --> D[Policy + Observability Injection]
D --> E[Topology + Rollback Output]
E --> F[Emit Metadata + Events]
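The linear flow in the diagram can be pictured as an ordered list of steps that pass a shared context along. The following is an illustrative sketch, not the agent's orchestration code: the `run_pipeline` name, the `(name, function)` step shape, and the fields of the returned dict are assumptions. Only the two event names come from this specification.

```python
def run_pipeline(context, steps):
    """Run (name, function) steps in order; each takes and returns the context."""
    completed = []
    for name, step in steps:
        try:
            context = step(context)
        except Exception as exc:
            # Stop at the first failing step and report which one failed.
            return {
                "event": "InfraProvisioningFailed",
                "failed_step": name,
                "error": str(exc),
                "completed": completed,
            }
        completed.append(name)
    return {
        "event": "InfraProvisioningPlanPublished",
        "completed": completed,
        "context": context,
    }
```

Keeping the steps as plain callables makes each stage testable on its own.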
Auto-Correction & Retry
| Condition | Action |
|---|---|
| Missing cloud type | Default to Azure |
| DNS zone undefined | Inject fallback internal.connectsoft.dev |
| No OTEL config | Apply default OTLP + collector template |
| Storage class mismatch | Use default Standard_LRS or gp2 |
| Secret naming violation | Normalize and log fix in output trace |
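Three of these corrections can be sketched as a function that fills in defaults and records each fix for the output trace. The flat config keys (`cloud_provider`, `dns_zone`, `otel`) and the shape of the default OTEL block are assumptions for illustration; the default values themselves come from the table.

```python
def apply_auto_corrections(config):
    """Return (corrected_config, fixes) without mutating the input."""
    corrected = dict(config)
    fixes = []
    if not corrected.get("cloud_provider"):
        corrected["cloud_provider"] = "Azure"
        fixes.append("cloud_provider defaulted to Azure")
    if not corrected.get("dns_zone"):
        corrected["dns_zone"] = "internal.connectsoft.dev"
        fixes.append("dns_zone defaulted to internal.connectsoft.dev")
    if not corrected.get("otel"):
        # Assumed shape, mirroring the sample observability policy.
        corrected["otel"] = {"collector": {"enabled": True}, "exporters": [{"type": "otlp"}]}
        fixes.append("otel defaulted to OTLP collector template")
    return corrected, fixes
```

Returning the list of fixes keeps every auto-correction visible to downstream reviewers.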
Observability Trace Spans
| Span Name | Description |
|---|---|
| infra_inputs_validated | Initial agent activation success |
| iac_generated | When IaC is written to disk/output directory |
| identity_mapped | When MSI + RBAC bindings are computed |
| topology_rendered | Network + DNS plan finalized |
| infra_provisioning_event_published | Agent lifecycle marker sent |
Semantic Kernel Skills Overview
The Infrastructure Architect Agent is powered by a modular set of Semantic Kernel (.NET) skills.
Each skill handles a focused infrastructure concern, from input parsing and IaC generation to security, observability, and rollback management.
All skills are reusable, composable, and support trace-based observability and drift correction.
1. Core Infra Composition Skills
| Skill | Purpose |
|---|---|
| IaCScaffolderSkill | Generates Bicep, Terraform, and Pulumi (C#) files from resolved templates |
| CloudTopologyBuilderSkill | Produces network-topology.yaml, infra-topology.mmd |
| NodePoolPlannerSkill | Determines AKS/EKS node sizes, autoscale rules, zones |
| TaggingPolicyEnforcerSkill | Injects owner, project, env, trace_id into all resources |
2. Identity and Access Control Skills
| Skill | Purpose |
|---|---|
| IdentityMapGeneratorSkill | Produces identity-map.yaml with MSI, SP, roles |
| IAMRoleBinderSkill | Creates IAM bindings in Bicep, TF, or Pulumi |
| TenantIsolationPlannerSkill | Applies per-tenant scoping rules for RBAC, resource groups |
| VaultIntegrationSkill | Validates and links vault access with identity bindings |
3. Multi-Cloud IaC Generator Skills
| Skill | Purpose |
|---|---|
| BicepEmitterSkill | Generates infra.bicep for Azure deployments |
| TerraformEmitterSkill | Outputs main.tf with providers, modules, locals |
| PulumiCSharpEmitterSkill | Converts resolved infra map into PulumiInfra.cs using .NET SDKs |
| IaCDiffCheckerSkill | Compares new IaC with previous-infra-state.json and emits drift report |
4. Observability and Topology Skills
| Skill | Purpose |
|---|---|
| OpenTelemetryWiringSkill | Injects OTLP collector config, exporter URLs, and trace IDs |
| TopologyDiagramWriterSkill | Renders Mermaid diagram for VNet, Subnets, AKS, Vaults |
| SpanEmitterSkill | Emits infra lifecycle spans to OpenTelemetry and DevOps trace |
| StorageDefinitionWriterSkill | Builds storage-definition.yaml from retention, schema, and access tier policies |
5. Rollback, Metadata, and Event Publication Skills
| Skill | Purpose |
|---|---|
| RollbackPlanBuilderSkill | Produces infra-rollback.yaml based on diff and rollback flags |
| MetadataManifestWriterSkill | Outputs infra-metadata.json with trace and artifact map |
| LifecycleEventPublisherSkill | Emits InfraProvisioningPlanPublished and other lifecycle signals |
| DriftAlertEmitterSkill | Triggers InfraDriftDetected event and changelog if topology changed unexpectedly |
Skill Observability Example
{
  "trace_id": "trace-infra-44219",
  "skill": "PulumiCSharpEmitterSkill",
  "output": "PulumiInfra.cs",
  "cloud_provider": "Azure",
  "status": "success",
  "duration_ms": 134
}
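A record like this can be produced by wrapping each skill invocation in a small timing helper. The sketch below is an assumption about how that might look: `run_skill` is a hypothetical name, the skill is modeled as a plain callable that returns its output file name, and the `"failed"` status value is invented for illustration since only `"success"` appears in this specification.

```python
import time


def run_skill(skill_name, trace_id, func, *args, **kwargs):
    """Invoke a skill callable and return an observability record for it."""
    started = time.perf_counter()
    try:
        output = func(*args, **kwargs)
        status = "success"
    except Exception:
        # Assumed failure marker; not defined by the specification.
        output, status = None, "failed"
    return {
        "trace_id": trace_id,
        "skill": skill_name,
        "output": output,
        "status": status,
        "duration_ms": int((time.perf_counter() - started) * 1000),
    }
```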
Retry & Auto-Correction Behaviors
| Skill | Condition | Auto-Correction |
|---|---|---|
| VaultIntegrationSkill | Secret missing reference | Injects fallback vault path per service/environment |
| TenantIsolationPlannerSkill | No tenant ID provided | Assumes shared infra mode |
| OpenTelemetryWiringSkill | Missing exporter config | Defaults to http://otel.default:4317 OTLP |
| PulumiCSharpEmitterSkill | Unsupported resource | Downgrades to Terraform if fallback module available |
Metrics Emitted per Skill
| Metric | Purpose |
|---|---|
| iac_files_generated_total | Total IaC templates produced |
| drift_detected_total | Number of diffs vs prior infra state |
| span_injection_success | Whether OTEL spans were injected |
| rollback_plan_ready | Boolean indicating rollback plan completeness |
| event_publish_success | Confirmed lifecycle notification sent |
Core Technologies and Platforms
The Infrastructure Architect Agent uses a modular, cloud-native, and automation-first tech stack to generate, validate, and export infrastructure blueprints across cloud environments.
It supports multi-IaC, multi-cloud, and developer-native (Pulumi C#) workflows, ensuring seamless integration with ConnectSoft's broader platform and DevOps agents.
Supported Cloud Platforms
| Cloud Provider | Role in Agent |
|---|---|
| Azure | Primary cloud for production environments: AKS, Key Vault, Storage, Monitor |
| AWS | Optional deployments via Terraform: EKS, S3, IAM, CloudWatch |
| GCP | Supported via Terraform: GKE, Cloud Storage, IAM, Stackdriver |
| Hybrid (Self-hosted) | DNS stub zones, internal IP ranges, K3s, OpenTelemetry only |
Infrastructure-as-Code Engines
| Tool | Purpose |
|---|---|
| Bicep | Azure-native IaC (AKS, VNets, Key Vault, RBAC) |
| Terraform | Cloud-agnostic IaC across Azure, AWS, GCP |
| Pulumi (C#) | .NET-native IaC for developer-centric infrastructure teams |
| ARM Templates (fallback only) | Legacy support for Azure services where Bicep isn't available |
| Cloud SDKs | Used via Pulumi C# for resource orchestration |
Semantic Kernel (.NET)
| Use | Details |
|---|---|
| Agent orchestration | Composes IaC generation flow from skills |
| Prompt interpretation | Parses and applies intent from YAML-based prompt definitions |
| Skill injection | Loads IaC emitter skills based on output target (Bicep, TF, Pulumi) |
| Trace integration | Embeds trace_id, agent_version, service_name into every generated file and span |
Identity and Access
| Technology | Purpose |
|---|---|
| Azure Managed Identity (MSI) | Used by AKS, Key Vault, Storage access |
| Terraform IAM modules | AWS/GCP support for minimal role policies |
| Pulumi.AzureNative.Authorization | C#-based identity provisioning |
| Key Vault & KMS | Securely manage tokens, connection strings, keys |
| RBAC Scopes | Automatically mapped via identity-map.yaml |
Networking and DNS
| Stack | Role |
|---|---|
| Azure Private DNS | Used for internal .connectsoft.dev subdomains |
| Route53 / Cloud DNS | Used for tenant-bound or public zones |
| VNet / VPC | Created per environment or tenant |
| NSGs / Security Groups | Applied per subnet or cluster node pool |
| Peering & Transit Gateway | Supported where cross-region communication is required |
Observability & Logging Stack
| Component | Purpose |
|---|---|
| OpenTelemetry Collector | Injected into infra stack to emit spans and metrics |
| Azure Monitor / Log Analytics | Default observability sink for Azure-based infra |
| Prometheus + Grafana (optional) | Can be configured via Pulumi or Helm |
| CloudWatch / Stackdriver | Used on AWS/GCP via Terraform log routing modules |
| otel-agent-config.yaml | Emitted to standardize span schema per resource type |
Template Registry and Reuse
| Format | Reuse Mechanism |
|---|---|
| .bicep | Imported from connectsoft/iac/modules/*.bicep |
| .tf | Uses Terraform module blocks with source ref |
| .cs (Pulumi) | Generated from TemplateLibrary/Modules/*.cs |
| Shared Tags/Locals | Used across all formats: env, project, trace_id, region |
CI/CD and DevOps Integration
| Tool | Integration Point |
|---|---|
| Azure DevOps Pipelines | Pulls IaC artifacts from infra-out folder |
| GitOps | Optional: sync from generated output into Flux/ArgoCD |
| InfraProvisioningPlanPublished | Triggers downstream deploy/test flows |
| Rollback Triggers | Based on infra-rollback.yaml, linked via trace and git ref |
System Prompt (Bootstrapping Instruction)
This system prompt governs how the Infrastructure Architect Agent initializes, interprets architectural intent, and generates IaC artifacts in Bicep, Terraform, and Pulumi (C#) formats.
It enforces ConnectSoft's standards for security, observability, naming, versioning, and tenant-aware resource allocation.
Full System Prompt (Plain Text)
You are the **Infrastructure Architect Agent** in the ConnectSoft AI Software Factory.
Your purpose is to take architectural definitions and environment metadata, and generate a complete, secure, and reusable infrastructure blueprint that can be deployed using Bicep, Terraform, or Pulumi (C#).
---
## Your Responsibilities:
1. Load inputs:
   - solution-architecture.md
   - application-architecture.md
   - identity-policy.yaml
   - resource-configuration.yaml
   - field-retention-map.yaml
   - observability-policy.yaml
   - previous-infra-state.json (optional)
2. Generate infrastructure outputs:
   - Bicep (`infra.bicep`)
   - Terraform (`main.tf`)
   - Pulumi C# (`PulumiInfra.cs`)
   - YAML metadata files: `network-topology.yaml`, `storage-definition.yaml`, `identity-map.yaml`
   - Mermaid diagram: `infra-topology.mmd`
   - Rollback and changelog files: `infra-rollback.yaml`, `infra-drift-detected.yaml`
3. Apply policies:
   - Use ConnectSoft's naming, tagging, and RBAC conventions
   - Inject traceability metadata (`trace_id`, `agent_version`, `service_name`, `environment`)
   - Wire in OpenTelemetry collector and span templates
4. Emit lifecycle events:
   - `InfraProvisioningPlanPublished`
   - `InfraDriftDetected` (if previous infra state differs)
   - `InfraRollbackReady` (if rollback plan successfully created)
---
## Output Requirements:
- All artifacts must be consistent, versioned, and environment-specific
- At least one IaC format (Bicep, TF, or Pulumi) must be generated
- All infrastructure must include trace metadata and environment tags
- Resources must be deployable via automation pipelines (Azure DevOps, GitOps)
- No hardcoded secrets or identity strings; use vaults and RBAC
---
## Observability:
- Inject OTEL collector or sidecar per cluster or host
- Emit spans for: input parsing, template selection, IaC generation, rollback readiness
- Use `otel-agent-config.yaml` to standardize export targets and trace structure
---
## Fallbacks and Safety:
- If cloud provider not specified → default to Azure
- If observability config missing → inject default OTLP + stdout
- If tenant_id missing → assume shared multi-tenant mode
- If rollback is disabled but drift is detected → emit warning and skip promotion
Policy-Driven Agent Behavior
- All DNS zones must resolve within *.connectsoft.dev
- All secrets and config must be sourced via Azure Key Vault or equivalent
- All generated outputs must conform to ConnectSoft's environment-tagging, traceability, and modular reuse conventions
Input Prompt Template
The Infrastructure Architect Agent is activated by a structured YAML input prompt provided by the ConnectSoft orchestrator or Solution Architect Agent.
This prompt defines the service context, environment, tenant scope, and configuration sources needed to generate infrastructure artifacts.
Sample Input Prompt (YAML)
assignment: provision-infrastructure
project:
  trace_id: trace-infra-99881
  service_name: NotificationService
  environment: Staging
  tenant_id: tenant-42
  agent_version: 1.3.0
inputs:
  solution_architecture_url: https://.../solution-architecture.md
  application_architecture_url: https://.../application-architecture.md
  resource_configuration_url: https://.../notification/resource-config.yaml
  identity_policy_url: https://.../notification/identity-policy.yaml
  field_retention_map_url: https://.../notification/field-retention.yaml
  observability_policy_url: https://.../notification/observability.yaml
  previous_infra_state_url: https://.../notification/infra-v1.1.0.json
settings:
  output_format: [bicep, terraform, pulumi]
  inject_otel: true
  rollback_safe: true
  emit_diagram: true
  cloud_provider: Azure
  enable_tenant_isolation: true
Required Input Fields
project
| Field | Description |
|---|---|
| trace_id | Unique ID for traceability, reused across all outputs |
| service_name | The logical microservice name this infra supports |
| environment | Deployment stage (Dev, Staging, Production) |
| tenant_id | Tenant scope (optional; if omitted, shared mode is assumed) |
| agent_version | Infrastructure Architect Agent version used to generate artifacts |
inputs
| Key | Description |
|---|---|
| solution_architecture_url | High-level system blueprint |
| application_architecture_url | Logical service and zone breakdown |
| resource_configuration_url | Resource size, DNS, storage class, compute settings |
| identity_policy_url | RBAC, MSI, tenant roles, access constraints |
| field_retention_map_url | Storage requirements (encryption, backup) |
| observability_policy_url | Span injection and OTEL configuration |
| previous_infra_state_url | Optional; used for drift detection and rollback planning |
settings
| Field | Description |
|---|---|
| output_format | Which IaC targets to generate (bicep, terraform, pulumi) |
| inject_otel | Whether to inject OTEL collector and trace wiring |
| rollback_safe | If true, infra-rollback.yaml will be generated |
| emit_diagram | If true, outputs infra-topology.mmd (Mermaid) |
| cloud_provider | Forces IaC resolution to Azure, AWS, or GCP |
| enable_tenant_isolation | Enforces subnet, DNS, or RG separation for tenant-specific infra |
Validation Rules
| Rule | Enforced |
|---|---|
| service_name, trace_id, and environment must be present | ✅ |
| At least one of output_format must be valid | ✅ |
| If rollback_safe: true, previous state should be provided (otherwise the agent warns and falls back to snapshot-only rollback) | ✅ |
| Identity policies must define at least one principal binding | ✅ |
| DNS zone and subnet must resolve to valid topology unless emit_diagram: false | ✅ |
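The prompt-level rules above can be checked before any IaC is generated. The following is a minimal sketch, assuming the prompt has already been parsed into a dictionary; the function name `validate_prompt` and the wording of the messages are illustrative and not part of the agent. The identity-binding and topology rules are omitted because they depend on the contents of the referenced files.

```python
VALID_FORMATS = {"bicep", "terraform", "pulumi"}


def validate_prompt(prompt: dict) -> list[str]:
    """Return a list of rule violations; an empty list means the prompt passes."""
    errors = []
    project = prompt.get("project", {})
    inputs = prompt.get("inputs", {})
    settings = prompt.get("settings", {})

    # Rule: service_name, trace_id, and environment must be present.
    for field in ("service_name", "trace_id", "environment"):
        if not project.get(field):
            errors.append(f"project.{field} is required")

    # Rule: at least one requested output format must be valid.
    formats = settings.get("output_format", [])
    if not any(fmt in VALID_FORMATS for fmt in formats):
        errors.append("settings.output_format needs at least one valid target")

    # Rule: rollback_safe expects a previous state reference.
    if settings.get("rollback_safe") and not inputs.get("previous_infra_state_url"):
        errors.append("rollback_safe is true but previous_infra_state_url is missing")

    return errors
```

A caller could treat the rollback message as a warning instead of an error, which would match the snapshot-only fallback described in the drift and rollback safety checks.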
Minimal Prompt Example
assignment: provision-infrastructure
project:
  trace_id: trace-infra-44419
  service_name: AuditService
  environment: Dev
inputs:
  resource_configuration_url: https://.../audit/resource-config.yaml
  identity_policy_url: https://.../audit/identity-policy.yaml
settings:
  output_format: [pulumi]
  rollback_safe: false
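The minimal prompt leaves out tenant_id, cloud_provider, and the observability policy, so the fallback rules from the system prompt apply. A rough sketch of how those defaults might be resolved follows; `apply_fallbacks`, the `default_exporters` key, and the `"isolated"`/`"shared"` values are assumptions made for illustration.

```python
import copy


def apply_fallbacks(prompt: dict) -> dict:
    """Return a copy of the prompt with the documented defaults filled in."""
    resolved = copy.deepcopy(prompt)  # never mutate the caller's prompt
    project = resolved.setdefault("project", {})
    inputs = resolved.setdefault("inputs", {})
    settings = resolved.setdefault("settings", {})

    # Cloud provider not specified: default to Azure.
    settings.setdefault("cloud_provider", "Azure")

    # tenant_id missing: assume shared multi-tenant mode.
    resolved["tenant_mode"] = "isolated" if project.get("tenant_id") else "shared"

    # Observability config missing: inject default OTLP + stdout exporters.
    if not inputs.get("observability_policy_url"):
        resolved["default_exporters"] = ["otlp", "stdout"]

    return resolved
```

Working on a deep copy keeps the original prompt available for audit, so the difference between what was requested and what was assumed stays visible.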
Output Expectations Overview
The Infrastructure Architect Agent emits a complete, environment-aware, and IaC-compatible infrastructure bundle for every service, tenant, and environment combination.
These outputs are used by DevOps, Observability, Security, and Cloud Architecture Agents to provision, validate, and monitor real-world infrastructure across Azure, AWS, GCP, or hybrid environments.
Artifact Summary Table
| Artifact | Format | Purpose |
|---|---|---|
| infra.bicep | Bicep | Azure-native infrastructure script |
| main.tf | Terraform | Multi-cloud IaC abstraction |
| PulumiInfra.cs | C# (.NET Pulumi) | Developer-native infrastructure declaration |
| network-topology.yaml | YAML | Defines VNet, subnets, DNS zones, NSGs |
| identity-map.yaml | YAML | Describes service-managed identities, role bindings |
| storage-definition.yaml | YAML | Blob buckets, DB definitions, backup policies |
| infra-metadata.json | JSON | Metadata: trace ID, version, region, cloud provider |
| infra-rollback.yaml | YAML | Reversion and resource safe-destruction map |
| infra-policy-map.yaml | YAML | Resource tags, region policies, identity zones |
| infra-topology.mmd | Mermaid | Visual architecture map of network + services |
| otel-agent-config.yaml | YAML | Telemetry collectors, exporters, OTLP endpoints |
| InfraProvisioningPlanPublished | JSON (Event) | Published lifecycle metadata for downstream use |
Example: infra-policy-map.yaml
tags:
  owner: infra-team
  project: connectsoft
  trace_id: trace-infra-88288
  billing_code: cc123
resource_group_strategy: per-environment
tenant_isolation: subnet
observability:
  otel_enabled: true
  collector_url: http://otel.default:4317
Example: infra-metadata.json
{
"trace_id": "trace-infra-99122",
"service": "NotificationService",
"environment": "Production",
"cloud_provider": "Azure",
"agent_version": "1.3.0",
"generated_at": "2025-05-02T01:55:00Z",
"output_artifacts": {
"bicep": "infra.bicep",
"terraform": "main.tf",
"pulumi": "PulumiInfra.cs"
},
"region": "eastus2",
"rollback_ready": true
}
Example: InfraProvisioningPlanPublished (Event)
{
"event": "InfraProvisioningPlanPublished",
"trace_id": "trace-infra-99122",
"service": "NotificationService",
"environment": "Production",
"outputs": {
"Pulumi": "PulumiInfra.cs",
"metadata": "infra-metadata.json",
"topology": "infra-topology.mmd"
},
"timestamp": "2025-05-02T01:55:00Z"
}
Output Constraints & Validation Rules
| Requirement | Enforced |
|---|---|
| All outputs include trace_id, environment, agent_version | ✅ |
| At least one IaC format (bicep, tf, or pulumi) must be present | ✅ |
| DNS + network must be represented in network-topology.yaml | ✅ |
| Observability outputs must be injected or defaulted | ✅ |
| If rollback_safe: true, infra-rollback.yaml must be emitted | ✅ |
| Mermaid diagram emitted if emit_diagram: true | ✅ |
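Most of these constraints reduce to checking which files are present in the emitted bundle. The sketch below assumes the bundle is available as a set of file names; `check_bundle` is a hypothetical helper, and the trace metadata rule is left out because it requires reading each file.

```python
IAC_FILES = {"infra.bicep", "main.tf", "PulumiInfra.cs"}


def check_bundle(artifacts: set[str], settings: dict) -> list[str]:
    """List the output constraints that the emitted file set violates."""
    problems = []
    if not artifacts & IAC_FILES:
        problems.append("no IaC file (infra.bicep, main.tf, PulumiInfra.cs) present")
    if "network-topology.yaml" not in artifacts:
        problems.append("network-topology.yaml missing")
    if "otel-agent-config.yaml" not in artifacts:
        problems.append("otel-agent-config.yaml missing (inject or default it)")
    if settings.get("rollback_safe") and "infra-rollback.yaml" not in artifacts:
        problems.append("rollback_safe is true but infra-rollback.yaml missing")
    if settings.get("emit_diagram") and "infra-topology.mmd" not in artifacts:
        problems.append("emit_diagram is true but infra-topology.mmd missing")
    return problems
```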
Optional: deployment-docs.md
For downstream use in Developer Portal or Audit Trail.
# Infrastructure Deployment: NotificationService
**Environment:** Production
**Cloud:** Azure
**Version:** v1.3.0
**Trace ID:** trace-infra-99122
## Components
- AKS Cluster with autoscaling
- Private VNet with DNS
- Azure Blob Storage with GRS
- Key Vault with managed identity
- OTEL Collector with OTLP export
## Rollback
Rollback enabled → `infra-rollback.yaml`
Memory Strategy Overview
The Infrastructure Architect Agent maintains a short-term session memory and long-term semantic memory to ensure:
- Infrastructure consistency across environments and agents
- Reuse of known patterns and resource identifiers
- Rollback awareness and safe reversion
- Drift detection and proactive remediation
- Cross-agent alignment (DevOps, Security, Observability, Cloud)
Short-Term (Session) Memory
| Key | Purpose |
|---|---|
| trace_id | Tracks all outputs, spans, and events for current run |
| service_name | Used in naming conventions, tags, identities |
| environment | Affects DNS, tags, scaling strategy, vaults |
| output_format[] | Controls which IaC formats to emit (Bicep, TF, Pulumi) |
| generated_artifacts[] | List of created files for summary + event emission |
| drift_detected[] | Captures differences from previous-infra-state.json |
| tenant_mode | Shared vs. isolated infrastructure toggles |
Long-Term Semantic Memory
1. Provisioned Resources History
| Data | Used For |
|---|---|
| Resource group names and locations | Prevent collisions, support idempotency |
| Vault references and secret names | Enforce secure naming + reuse |
| AKS/EKS cluster names and configurations | Auto-resolve for service scaleouts |
| DNS zones, subnets, and peering policies | Maintain cross-region resolution and tenancy compliance |
2. Identity Mapping History
| Tracked | Use Case |
|---|---|
| Previously bound MSIs, SPs, and RBAC roles | Re-apply correct scopes to new services |
| Tenant-based role templates | Enforce least privilege per tenant pattern |
| Vault identity linkage | Validate that all resources have secure vault access only |
3. Observability and OTEL Tracing Memory
| Retained | Purpose |
|---|---|
| Previously emitted otel-agent-config.yaml | Reuse collector endpoint + exporter rules |
| Known span naming conventions | Ensure trace compatibility across deployments |
| Log forwarding sinks | Automatically bind to Azure Monitor, CloudWatch, or Stackdriver per platform |
4. Template and Module Reuse
| Element | Behavior |
|---|---|
| AKS/EKS node pool configs | Match by environment and region |
| VNet modules | Reuse across environments if shared_vnet: true |
| Vault templates | Loaded per team/project/tenant class |
| infra-policy-map.yaml | Maintains cost, tag, and scaling standards |
Memory Snapshot Output Example
{
"trace_id": "trace-infra-55488",
"version": "1.3.0",
"rollback_ready": true,
"generated_artifacts": [
"PulumiInfra.cs",
"network-topology.yaml",
"identity-map.yaml",
"infra-rollback.yaml"
],
"reuse_policies": {
"vnet": "shared",
"dns": "per-tenant",
"vault": "env-isolated"
},
"drift_detected": false
}
Memory Benefits
| Value | Result |
|---|---|
| Prevent redundant redeployment | Avoid resource re-creation when no change is needed |
| Secure identity reuse | Roles and bindings consistently applied |
| Span and telemetry continuity | Standard OTEL tracing maintained over versions |
| Consistent IaC structure | Modules and naming reused across hundreds of microservices |
| Rollback safety | Links each version to previous working state with traceability |
Validation and Correction Overview
The Infrastructure Architect Agent includes a comprehensive, multi-layered validation pipeline to ensure that all emitted IaC artifacts:
- Are syntactically correct
- Follow security, naming, and tagging conventions
- Are traceable
- Support safe provisioning and rollback
- Conform to ConnectSoft's platform policies across clouds and tenants
When validation fails, the agent applies auto-corrections, issues warnings, or blocks output publication.
Validation Phases
1. Schema & Structure Validation
| Artifact | Rule |
|---|---|
| infra.bicep / main.tf / PulumiInfra.cs | Valid IaC syntax, no unresolved references |
| network-topology.yaml | Must include VNet, at least one subnet, DNS zone |
| identity-map.yaml | Must include at least one MSI or principal |
| storage-definition.yaml | Must define replication, encryption for each store |
| infra-policy-map.yaml | Required tags: owner, project, env, trace_id |
| infra-topology.mmd | Valid Mermaid syntax and referenced resource IDs exist |
2. Security & Naming Enforcement
| Rule | Correction |
|---|---|
| DNS zone must end in .connectsoft.dev | Append suffix if missing |
| Vault secrets must be reference-based (not inline) | Convert inline to key_ref: pattern |
| Service name must be kebab-case | Auto-format and update in IaC files |
| Tenant resources must include tenant_id in name/tag | Append suffix or tag if absent |
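The naming corrections in this table are simple string transformations. The following sketch shows one way they might look; the helper names are hypothetical, and the exact kebab-case rules the agent applies (for example, how acronyms are split) are not specified in this document.

```python
import re

DNS_SUFFIX = ".connectsoft.dev"


def to_kebab_case(name: str) -> str:
    """Convert a PascalCase or snake_case service name to kebab-case."""
    # Insert a hyphen wherever a lowercase letter or digit is followed by an uppercase letter.
    split = re.sub(r"(?<=[a-z0-9])(?=[A-Z])", "-", name)
    # Collapse spaces and underscores into hyphens, then lowercase everything.
    return re.sub(r"[\s_]+", "-", split).lower()


def ensure_dns_suffix(zone: str) -> str:
    """Append the required zone suffix when it is missing."""
    return zone if zone.endswith(DNS_SUFFIX) else zone.rstrip(".") + DNS_SUFFIX


def ensure_tenant_suffix(resource_name: str, tenant_id: str) -> str:
    """Append the tenant ID to a resource name that does not already carry it."""
    if tenant_id in resource_name:
        return resource_name
    return f"{resource_name}-{tenant_id}"
```

For example, `to_kebab_case("NotificationService")` yields `notification-service`, and `ensure_dns_suffix("internal")` yields `internal.connectsoft.dev`.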
3. Observability & Traceability
| Rule | Correction |
|---|---|
| OTEL config must exist if inject_otel: true | Inject default otel-agent-config.yaml |
| Each IaC output must include trace_id, agent_version, and environment | Inject automatically if missing |
| Must emit OpenTelemetry spans for generation lifecycle | Auto-emit if span config is present or fallback allowed |
4. Drift and Rollback Safety Checks
| Rule | Behavior |
|---|---|
| If rollback_safe: true, but no previous-infra-state.json provided | Emit warning, fall back to snapshot-only rollback |
| If significant drift from previous state detected | Emit infra-drift-detected.yaml, block auto-promotion |
| If no drift and version unchanged | Block emission of duplicate infra.bicep or PulumiInfra.cs |
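Drift detection amounts to diffing the previous state against the newly generated one. The sketch below assumes both states can be flattened to a map of resource name to configuration, and it treats any difference as "significant", since this document does not define a threshold. The function names and the returned action strings are illustrative.

```python
def detect_drift(previous: dict, current: dict) -> dict:
    """Group the differences between two resource maps (name -> config)."""
    shared = previous.keys() & current.keys()
    return {
        "added": sorted(current.keys() - previous.keys()),
        "removed": sorted(previous.keys() - current.keys()),
        "changed": sorted(name for name in shared if previous[name] != current[name]),
    }


def drift_action(drift: dict, version_changed: bool) -> str:
    """Map a drift report to the behavior listed in the table above."""
    if any(drift.values()):
        return "block-promotion"   # also emit infra-drift-detected.yaml
    if not version_changed:
        return "skip-duplicate"    # nothing new to emit
    return "emit"
```

Sorting the three lists keeps the resulting drift report stable between runs, which makes it easier to diff and review.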
Auto-Correction Behavior Matrix
| Input Error | Correction Applied |
|---|---|
| DNS missing .connectsoft.dev | Append suffix, log to changelog |
| Vault secret defined inline | Convert to key_ref and generate binding |
| AKS region undefined | Default to eastus2 |
| Missing identity-policy.yaml | Inject system-assigned MSI + Reader role |
| Missing tags | Inject default tags: owner, project, trace_id |
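Several of these corrections fill in defaults and record what changed. A minimal sketch follows, assuming a flat configuration dictionary; `apply_auto_corrections`, the `identity` shape, and the default owner and project values (borrowed from the sample infra-policy-map.yaml) are assumptions, not the agent's actual schema.

```python
DEFAULT_REGION = "eastus2"


def apply_auto_corrections(config: dict, trace_id: str) -> tuple[dict, list[str]]:
    """Fill in documented defaults and return the corrected config plus a changelog."""
    corrected = dict(config)
    changelog: list[str] = []

    # AKS region undefined: default to eastus2.
    if not corrected.get("region"):
        corrected["region"] = DEFAULT_REGION
        changelog.append("region defaulted to eastus2")

    # Missing tags: inject owner, project, trace_id (default values are assumed).
    tags = dict(corrected.get("tags") or {})
    defaults = {"owner": "infra-team", "project": "connectsoft", "trace_id": trace_id}
    for key, value in defaults.items():
        if key not in tags:
            tags[key] = value
            changelog.append(f"tag '{key}' injected")
    corrected["tags"] = tags

    # Missing identity policy: inject system-assigned MSI + Reader role (hypothetical shape).
    if not corrected.get("identity"):
        corrected["identity"] = {"type": "SystemAssigned", "role": "Reader"}
        changelog.append("system-assigned MSI with Reader role injected")

    return corrected, changelog
```

Returning the changelog alongside the corrected configuration lets the caller decide whether the run counts as a clean pass or a pass with corrections.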
Blocking Conditions
| Condition | Outcome |
|---|---|
| Invalid IaC syntax | Block artifact, emit InfraProvisioningFailed |
| Inline secrets with no fallback | Fail validation |
| Identity map missing required bindings | Fail validation |
| Drift detected with rollback disabled | Block publish, emit alert |
| Trace ID missing | Block all outputs until traceability is ensured |
Observability Spans Emitted
| Span Name | Trigger |
|---|---|
| iac_validation_passed | All IaC outputs validated |
| corrections_applied | One or more auto-corrections were made |
| rollback_plan_generated | infra-rollback.yaml emitted successfully |
| infra_drift_detected | Infrastructure change from previous version |
| publish_blocked_due_to_drift | Promotion blocked, manual review required |
Validation Outcome Summary
| Outcome | Action |
|---|---|
| Pass | Emit InfraProvisioningPlanPublished |
| Partial Pass (with auto-corrections) | Emit warning, mark trace |
| Fail | Emit InfraProvisioningFailed or block downstream publication |
| Drift Detected | Emit changelog + infra-drift-detected.yaml, mark rollback required |
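The outcome table can be read as a small decision function over the validation results. The sketch below is one possible reading: the precedence order (fail, then drift, then partial pass) and the choice to still publish the plan on a partial pass are assumptions, since the table does not state them.

```python
def validation_outcome(blocking: list[str], corrections: list[str], drift: bool) -> dict:
    """Pick the outcome and lifecycle event for one validation run."""
    if blocking:
        # Any blocking condition fails the run outright.
        return {"outcome": "fail", "event": "InfraProvisioningFailed"}
    if drift:
        # Drift is reported and the run is marked as needing rollback review.
        return {"outcome": "drift", "event": "InfraDriftDetected", "rollback_required": True}
    if corrections:
        # Assumed: a corrected run is still published, with a warning on the trace.
        return {"outcome": "partial-pass", "event": "InfraProvisioningPlanPublished", "warning": True}
    return {"outcome": "pass", "event": "InfraProvisioningPlanPublished"}
```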
Agent Collaboration Overview
The Infrastructure Architect Agent plays a central role in the ConnectSoft AI Software Factory by producing foundational infrastructure on which all other services and agents depend.
It acts as both a consumer of architectural intent and a provider of deployable environments, identity scaffolding, and cloud boundaries to downstream agents.
Upstream Providers (Input Sources)
| Agent | Input Artifact |
|---|---|
| Solution Architect Agent | solution-architecture.md: defines zones, environments, regions |
| Application Architect Agent | application-architecture.md: lists logical services and zone groupings |
| Cloud Architecture Agent | resource-configuration.yaml: compute type, DNS, region, zones |
| Security Architect Agent | identity-policy.yaml: MSI, role bindings, secret access policies |
| Observability Agent | observability-policy.yaml: OTEL injection and span propagation |
| Data Architect Agent | field-retention-map.yaml: drives encryption, backup vault generation |
Downstream Consumers
| Agent | Consumes |
|---|---|
| DevOps Architect Agent | Pulls identity-map.yaml, infra.bicep, PulumiInfra.cs to enable secure pipelines and environments |
| Cloud Architecture Agent | Uses network-topology.yaml, infra-policy-map.yaml for region scaling, CDN planning |
| Security Architect Agent | Reads infra-policy-map.yaml and identity definitions for role enforcement |
| Observability Agent | Reuses otel-agent-config.yaml, emits spans from infra provisioning lifecycle |
| Rollback Executor Agent | Consumes infra-rollback.yaml when a downgrade or revert is required |
| Developer Portal Generator Agent | Displays topology (infra-topology.mmd) and status of infra per service/environment |
Published Events
| Event | Trigger |
|---|---|
| InfraProvisioningPlanPublished | Emitted when generation completes with success |
| InfraDriftDetected | Published when current state differs from previous-infra-state.json |
| InfraRollbackReady | Emitted when rollback file is prepared and safe to apply |
| InfraProvisioningFailed | Published if IaC generation or validation fails |
Artifact Linkage to Other Agents
| Artifact | Consumed By |
|---|---|
| infra.bicep, main.tf, PulumiInfra.cs | DevOps Agent, CI/CD Orchestrator |
| network-topology.yaml | API Gateway Configurator, Cloud Scaling Agent |
| identity-map.yaml | Security, DevOps, Vault Injector Agents |
| storage-definition.yaml | Backup, Data Migration, and Cost Monitor Agents |
| infra-topology.mmd | Developer Portal, Audit Dashboard |
Traceability and Coordination
All outputs contain:
metadata:
  trace_id: trace-infra-11889
  agent_version: 1.3.0
  service_name: NotificationService
  environment: Staging
  generated_on: 2025-05-02T02:10:00Z
These values are used to cross-reference deployments, rollback chains, and audit trails across DevOps, Security, and Monitoring agents.
Integration Flow
flowchart TD
SolutionArchitect --> InfrastructureArchitect
ApplicationArchitect --> InfrastructureArchitect
CloudArchitect --> InfrastructureArchitect
SecurityArchitect --> InfrastructureArchitect
ObservabilityAgent --> InfrastructureArchitect
InfrastructureArchitect --> DevOpsArchitect
InfrastructureArchitect --> CloudArchitect
InfrastructureArchitect --> RollbackExecutor
InfrastructureArchitect --> DeveloperPortal
InfrastructureArchitect --> ObservabilityAgent
Collaboration Summary
| Agent | Relationship |
|---|---|
| Architects & Planners | Provide policies, inputs, and intent |
| Executors | Consume infra, deploy, monitor, rollback, or visualize |
| Cyclical Feedback | Drift → security/governance validation → rollback or regenerate |
| Spans + Events | Provide observability hooks to DevOps + platform telemetry |
Observability & Oversight Strategy
The Infrastructure Architect Agent is fully instrumented to support:
- Real-time span emission
- Lifecycle event publishing
- Provisioning rollback awareness
- Topology visualization
- Audit traceability across clouds, tenants, and environments
These mechanisms ensure that every infrastructure generation cycle is observable, versioned, and governance-ready.
OpenTelemetry Span Emission
| Span Name | Trigger |
|---|---|
| infra_inputs_parsed | After validating incoming prompt and configs |
| iac_generation_completed | When all IaC formats are successfully written |
| topology_rendered | When infra-topology.mmd is generated |
| rollback_plan_ready | When infra-rollback.yaml is complete |
| infra_lifecycle_event_emitted | When InfraProvisioningPlanPublished is dispatched |
Each span includes:
- trace_id
- agent_version
- service_name
- environment
- duration_ms
- status
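To make the span attributes concrete, here is a standard-library-only sketch that records the same fields around a unit of work. It is a stand-in for illustration and does not use the OpenTelemetry SDK; `span` and `RECORDED_SPANS` are hypothetical names, and a real agent would hand the record to an exporter instead of a list.

```python
import time
from contextlib import contextmanager

RECORDED_SPANS: list[dict] = []  # stand-in for a real span exporter


@contextmanager
def span(name, trace_id, agent_version, service_name, environment):
    """Time a block of work and record it with the attributes listed above."""
    record = {
        "name": name,
        "trace_id": trace_id,
        "agent_version": agent_version,
        "service_name": service_name,
        "environment": environment,
        "status": "ok",
    }
    start = time.perf_counter()
    try:
        yield record
    except Exception:
        record["status"] = "error"  # mark the span failed, then re-raise
        raise
    finally:
        record["duration_ms"] = round((time.perf_counter() - start) * 1000, 3)
        RECORDED_SPANS.append(record)
```

A generation step would then be wrapped as `with span("infra_inputs_parsed", trace_id, "1.3.0", "NotificationService", "Staging"): ...`.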
Lifecycle Event Streams
| Event | Purpose |
|---|---|
| InfraProvisioningPlanPublished | Main lifecycle signal with full output metadata |
| InfraRollbackReady | Indicates safe rollback plan was generated |
| InfraDriftDetected | Signals divergence from previously known infra state |
| InfraProvisioningFailed | Indicates generation or validation failure (used by DevOps + Observability) |
Dashboards & Audit Integration
| Dashboard Tile | Description |
|---|---|
| IaC Generation Timeline | Visual trace of infra generation per service |
| Drift Detection Status | Last time drift was seen and what changed |
| Resource Count & Cost Tags | Breakdown of provisioned units and billing labels |
| RBAC & MSI Audit | Which identities were generated or reused |
| Topology Viewer | Mermaid-rendered overlay of DNS, VNet, zones, vaults, and compute |
Rollback Awareness & Safety
If rollback_safe: true:
- infra-rollback.yaml is always generated
- Last successful output snapshot is stored
- Changes are diffed against previous-infra-state.json
- Downstream rollback agents and DevOps are notified
Rollback artifacts include:
rollback:
  to_version: 1.2.0
  generated_on: 2025-05-02T02:13:00Z
  trace_id: trace-infra-99321
  files: [infra.bicep, PulumiInfra.cs, identity-map.yaml]
  approved_by: system
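A rollback artifact of this shape could be assembled as follows. This is a sketch under stated assumptions: `build_rollback_plan` and `render_yaml` are hypothetical helpers, and the hand-written renderer only handles this flat structure (a real implementation would more likely use a YAML library).

```python
from datetime import datetime, timezone


def build_rollback_plan(to_version, trace_id, files, now=None):
    """Build the rollback mapping shown above; `now` is injectable for testing."""
    now = (now or datetime.now(timezone.utc)).astimezone(timezone.utc)
    return {
        "rollback": {
            "to_version": to_version,
            "generated_on": now.strftime("%Y-%m-%dT%H:%M:%SZ"),
            "trace_id": trace_id,
            "files": list(files),
            "approved_by": "system",
        }
    }


def render_yaml(plan: dict) -> str:
    """Render the flat rollback mapping as YAML text (lists use flow style)."""
    lines = ["rollback:"]
    for key, value in plan["rollback"].items():
        if isinstance(value, list):
            value = "[" + ", ".join(value) + "]"
        lines.append(f"  {key}: {value}")
    return "\n".join(lines)
```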
Final Output Bundle (Recap)
| Artifact | Format |
|---|---|
| infra.bicep / main.tf / PulumiInfra.cs | IaC files |
| identity-map.yaml / network-topology.yaml / storage-definition.yaml | Infra blueprints |
| infra-topology.mmd | Mermaid diagram |
| infra-metadata.json | Versioned metadata |
| infra-rollback.yaml | Rollback spec |
| infra-policy-map.yaml | Tag + policy map |
| otel-agent-config.yaml | Telemetry routing |
| InfraProvisioningPlanPublished | Lifecycle event |
Agent Outcome Summary
| Capability | Delivered |
|---|---|
| Multi-format IaC generation | ✅ Bicep, Terraform, Pulumi (C#) |
| Modular infra blueprints | ✅ Per environment, region, tenant |
| Secure RBAC and secrets | ✅ MSI, Vault, scoped identity maps |
| DNS, VNet, AKS, Storage | ✅ All core infra defined |
| Observability spans | ✅ OTEL and span-ready outputs |
| Drift + rollback coverage | ✅ With lifecycle events + changelogs |