
🏗️ Infrastructure Architect Agent Specification

🎯 Purpose

The Infrastructure Architect Agent is responsible for designing and generating the foundational cloud infrastructure required to operate the ConnectSoft AI Software Factory at scale, using multi-cloud, tenant-aware, IaC-first principles.

Its mission is to translate high-level architectural intent (compute needs, data isolation, security boundaries, etc.) into declarative, reproducible infrastructure modules, using tools like Bicep, Terraform, and Pulumi (C#).


💡 Why This Agent Matters

Without this agent:

  • Infrastructure setup would be manual, inconsistent, and error-prone
  • Multi-tenant environments would suffer from cross-tenant leakage or drift
  • Developers would waste time managing infrastructure instead of building logic
  • Observability and identity boundaries wouldn’t be enforced from the ground up
  • Services couldn’t scale or operate securely across cloud regions/providers

With the Infrastructure Architect Agent:

✅ All infrastructure is code-generated, version-controlled, and composable
✅ Teams get ready-to-deploy, secure environments with no human provisioning
✅ Each tenant and microservice has proper isolation, cost tagging, and access boundaries
✅ Infrastructure definitions are portable across Azure, AWS, GCP, or hybrid setups
✅ Provisioning supports Bicep, Terraform, and now Pulumi (C#), enabling .NET-native IaC


🧱 What This Agent Enables

| Capability | Impact |
| --- | --- |
| 🔌 Cloud-native networking | VNet, subnets, NAT, firewalls, private DNS |
| 🏗️ Compute orchestration | AKS/EKS clusters, node pools, autoscaling |
| 🔐 Identity & access | Managed Identity, IAM roles, service principals |
| 💾 Storage provisioning | Azure Blob, S3, databases, backup vaults |
| 🌍 Environment scaffolding | Per-tenant or per-stage logical groupings |
| 💬 Policy-as-code | RBAC, NSGs, audit logging, telemetry agents |
| ♻️ Reusability & portability | Bicep + Terraform + Pulumi (C#) bundles |

🧭 Position in the Factory Pipeline

flowchart TD
    SolutionArchitect --> InfrastructureArchitect
    InfrastructureArchitect --> DevOpsArchitect
    InfrastructureArchitect --> CloudArchitectureAgent
    InfrastructureArchitect --> SecurityArchitect
    InfrastructureArchitect --> ObservabilityAgent

🧠 Reusable Infrastructure Scenarios

| Scenario | Provisioned By Agent |
| --- | --- |
| New microservice with DB + Key Vault | ✅ VNet, AKS node pool, Azure SQL, KV |
| Staging environment per tenant | ✅ Subnet, node pool, traffic filtering |
| Multi-cloud deployment (Azure + AWS) | ✅ Terraform + Pulumi dual-mode templates |
| Cluster auto-scaling with cost budget tags | ✅ Autoscaler + tagging in IaC |

📈 Vision Alignment

This agent upholds ConnectSoft’s values:

  • Cloud-Native
  • Security-First
  • Event-Driven & Observable
  • Multi-Tenant Ready
  • Automated, Reproducible, and Composable

📡 Scope of Influence

The Infrastructure Architect Agent governs the infrastructure control plane across all environments and tenants.
Its outputs shape the physical, logical, and virtual boundaries for every microservice, data flow, identity relationship, and cloud resource used by the ConnectSoft AI Software Factory.

This scope includes multi-region, multi-cloud, and multi-tenant infrastructure definitions.


🔧 System Layers Controlled by the Agent

| Layer | Components Controlled |
| --- | --- |
| Networking | VNet, Subnet, Private DNS, Peering, NAT, NSGs, Public IP, Private Endpoints |
| Compute | AKS/EKS/GKE clusters, node pools, autoscaling groups, container runtimes |
| Storage | Azure Blob, S3 Buckets, PostgreSQL/MySQL, File Shares, Disks, Backups |
| Identity & Access | Managed Identity, IAM Roles, RBAC bindings, service principal outputs |
| Secrets & Key Management | Azure Key Vault, AWS KMS, GCP Secrets, integration paths |
| Environment Isolation | Per-tenant staging spaces, dedicated compute/storage profiles |
| Provisioning Mode | Bicep, Terraform, and Pulumi (C#) IaC backends for every artifact |
| Observability Layer | Log routing (Log Analytics, CloudWatch), OpenTelemetry endpoints |
| DNS & Traffic Control | Zones, Ingress Controllers, Route Tables, Load Balancers |

🌍 Target Environments Provisioned

| Environment | Resources Created |
| --- | --- |
| Development | Shared AKS pool, dev-only DNS, reduced autoscaling |
| Staging | Cluster with cost tags, tracing on, mTLS across microservices |
| Production | Hardened AKS/EKS, regional failover, Key Vault with HSM |
| Per-Tenant | Optional isolated subnet, DNS, key ring, secret namespace |
| Hybrid (Local + Cloud) | Self-hosted runner + cloud ingress, VPN/peering |

📦 Agent Output Mode Coverage

| Output Type | Description |
| --- | --- |
| Bicep | Azure-native IaC with parameterized modules |
| Terraform | Multi-cloud abstraction supporting Azure, AWS, GCP |
| Pulumi (C#) | .NET-native, developer-friendly infrastructure declarations |
| Mermaid Topology Diagram | Visual VNet/cluster/service layout |
| Provisioning Events | Emits InfraProvisioningPlanPublished and optional diff events |

🧠 Scope Summary Diagram

graph TD
    A[VNet/Subnet/DNS]
    B[AKS Node Pools]
    C[Key Vault + IAM]
    D[Storage Accounts/DBs]
    E[Pulumi + Terraform + Bicep]
    F[OpenTelemetry Collector]
    G[Ingress + Load Balancer]

    A --> E
    B --> E
    C --> E
    D --> E
    F --> E
    G --> E

✅ Key Impact Zones

| Impacted Layer | Agent Enforcement |
| --- | --- |
| ❗ Cost-aware autoscaling | Node pool budget tags and usage quotas |
| 🔐 Zero Trust networking | NSGs, private links, enforced mTLS |
| 🔑 Least privilege IAM | MSI roles, scoped secrets, tenant RBAC |
| 📦 Modular IaC | Every component pluggable and reusable |
| ♻️ Reusable patterns | Shared modules across tenants, clouds, services |
| 📈 Observability-on-provision | OpenTelemetry spans built into infra creation |

📋 Core Responsibilities

The Infrastructure Architect Agent is accountable for designing, provisioning, securing, and documenting the infrastructure foundation that powers every microservice, gateway, and subsystem in the ConnectSoft AI Software Factory.

Its responsibilities ensure that infrastructure is:

  • Declaratively defined
  • Multi-cloud and multi-tenant ready
  • Secure by default
  • Environment-aware
  • Observability-aligned
  • Compatible with DevOps and Cloud Architecture agents

🧱 1. Provisioning Blueprint Generation

| Task | Output |
| --- | --- |
| Generate per-environment infrastructure definitions | infra.bicep, main.tf, PulumiInfra.cs |
| Structure modules for reusability (e.g., VNet, AKS, KV) | Modularized IaC blocks |
| Define naming conventions, tags, location strategies | Global infra naming policy module |
| Emit variant manifests (e.g., Dev, Staging, Prod, TenantX) | infra.{env}.bicep, infra.{tenant}.tf |

🔐 2. Identity and Access Layer Definition

| Task | Output |
| --- | --- |
| Define system-assigned or user-assigned Managed Identities | identity-map.yaml |
| Bind IAM roles to service scopes (e.g., read, contributor, custom roles) | IAM policy definitions per cloud |
| Map identity outputs to agents for later use (DevOps, Security) | Shared output map with resource_id, client_id, secret_ref |

🌐 3. Network Topology Planning

| Task | Output |
| --- | --- |
| Define VNet + Subnet per environment or tenant | network-topology.yaml |
| Configure NSGs, DNS, peering, and egress rules | Enforced through PulumiInfra.cs or network.tf |
| Control ingress exposure and service mesh compatibility | Mesh annotation support for Istio/Linkerd/Kuma |

💾 4. Storage and Data Service Definition

| Task | Output |
| --- | --- |
| Define Blob/S3 buckets, access tiers, replication, encryption | storage-definition.yaml |
| Provision PostgreSQL, SQL, or CosmosDB | Configurable DB-as-a-service modules |
| Define backup vaults, data retention, TTL settings | Storage compliance blueprint injected from policy |

📈 5. Observability and Monitoring Setup

| Task | Output |
| --- | --- |
| Deploy OpenTelemetry Collector as sidecar or managed infra | Optional toggle via policy |
| Route logs to Azure Monitor, CloudWatch, or custom endpoints | otel-agent.yaml output |
| Enable service health metrics via K8s | Auto-wired probes into deployment specs |
| Emit InfraProvisioningSpans for all provisioning actions | All agent actions are observable |

♻️ 6. Reusability and Policy Propagation

| Task | Output |
| --- | --- |
| Reuse core templates across projects | Parameterized Bicep, Pulumi classes, Terraform modules |
| Integrate organizational naming/tagging/security standards | Tag injection logic with owner, env, billing_code |
| Emit global infra-policy-map.yaml to downstream agents | Shared policy alignment for Security and DevOps Architect Agents |

📢 7. Lifecycle & Compliance Hooks

| Task | Output |
| --- | --- |
| Emit InfraProvisioningPlanPublished with trace_id and artifact list | Triggers downstream agent coordination |
| Emit drift detection events if previous state differs | Optional in audit mode |
| Output infra-metadata.json per run | Includes all provisioned resource IDs and traceability fields |
| Provide rollback plan (infra-rollback.yaml) when enabled | Describes deletions, safe resource tear-down steps |

✅ Summary

| Responsibility Area | Delivered By Agent |
| --- | --- |
| IaC output (Bicep, Terraform, Pulumi) | ✅ |
| Networking & DNS topology | ✅ |
| Identity, RBAC, and IAM config | ✅ |
| Storage definition and protection | ✅ |
| Observability layer pre-wired | ✅ |
| Reusability, modularity, compliance | ✅ |
| Traceable provisioning lifecycle | ✅ |

📥 Core Inputs

The Infrastructure Architect Agent consumes architectural intent and operational metadata from upstream agents and policy sources to generate IaC-compatible infrastructure blueprints.

These inputs drive the agent’s ability to:

  • Choose the right cloud services
  • Structure the network and security model
  • Enforce cost and compliance constraints
  • Output per-environment or per-tenant variations
  • Align with platform observability and deployment expectations

📂 Required Input Artifacts

| Artifact | Source Agent | Purpose |
| --- | --- | --- |
| solution-architecture.md | Solution Architect Agent | Defines global cloud targets, service composition, zones |
| application-architecture.md | Application Architect Agent | Describes logical services, access groups, service mesh topology |
| deployment-strategy.yaml | DevOps Architect Agent | Determines target environments and staging separation |
| identity-policy.yaml | Security Architect Agent | Defines IAM/RBAC/MSI rules for cloud resources |
| resource-configuration.yaml | Cloud Architecture Agent | Specifies node pool sizes, DNS rules, ingress exposure |
| field-retention-map.yaml | Data Architect Agent | Drives storage encryption and replication flags |
| observability-policy.yaml | Observability Agent | Ensures metrics, logs, and tracing are pre-integrated |
| previous-infra-state.json (optional) | From repo | Enables drift detection and versioned rollback |

📘 Sample Input: resource-configuration.yaml

resources:
  compute:
    provider: Azure
    cluster_type: AKS
    node_pool:
      size: Standard_D4s_v3
      autoscale: true
      max_nodes: 10
  dns:
    zone: "internal.connectsoft.dev"
    private_dns_enabled: true
  storage:
    account_tier: Standard  # tier only; redundancy is set by the replication key
    replication: GRS

📘 Sample Input: identity-policy.yaml

iam:
  enable_managed_identity: true
  role_bindings:
    - role: Contributor
      principal: microservice@connectsoft
    - role: Key Vault Reader
      principal: devops-agent@connectsoft
  tenant_isolation:
    enabled: true
    mode: soft

📘 Sample Input: observability-policy.yaml

otel:
  collector:
    enabled: true
    type: sidecar
  spans:
    provision_infra: true
    storage_attached: true
  exporters:
    - type: otlp
      endpoint: https://otel.connectsoft.dev

🧩 Input Validation Rules

| Rule | Enforced |
| --- | --- |
| Must specify at least one compute cluster (AKS, EKS, GKE) | ✅ |
| IAM roles must reference valid service principals or MSIs | ✅ |
| DNS zone must follow platform naming convention | ✅ |
| Missing observability policy → fallback to basic collector + OTLP spans | ✅ |
| Storage configuration must match retention + compliance map | ✅ |
| Identity policies must align with tenant model (shared/isolated) | ✅ |
| If no resource config provided → suggest default AKS dev config | ✅ |
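
A few of the rules above could be checked along the following lines. This is a minimal sketch, assuming the YAML inputs have already been parsed into dictionaries shaped like the samples; `validate_inputs`, the `connectsoft.dev` suffix check, and the message strings are illustrative assumptions, not the agent's actual implementation.

```python
# Hypothetical validator; inputs are assumed to be pre-parsed YAML dictionaries.
SUPPORTED_CLUSTERS = {"AKS", "EKS", "GKE"}
PLATFORM_DNS_SUFFIX = "connectsoft.dev"  # assumed platform naming convention


def validate_inputs(resource_cfg, identity_policy, observability_policy=None):
    """Return (errors, warnings) for the supplied input documents."""
    errors, warnings = [], []
    resources = resource_cfg.get("resources", {})

    # Rule: at least one compute cluster (AKS, EKS, GKE) must be specified.
    if resources.get("compute", {}).get("cluster_type") not in SUPPORTED_CLUSTERS:
        errors.append("at least one compute cluster (AKS, EKS, GKE) is required")

    # Rule: the DNS zone must follow the platform naming convention.
    zone = resources.get("dns", {}).get("zone", "")
    if not zone.endswith(PLATFORM_DNS_SUFFIX):
        errors.append(f"DNS zone '{zone}' does not follow the platform convention")

    # Rule: every role binding must reference a principal.
    for binding in identity_policy.get("iam", {}).get("role_bindings", []):
        if not binding.get("principal"):
            errors.append(f"role '{binding.get('role')}' has no principal")

    # Rule: a missing observability policy is a fallback case, not an error.
    if observability_policy is None:
        warnings.append("observability policy missing: using basic collector + OTLP spans")

    return errors, warnings
```

Returning warnings separately from errors keeps the fallback rules (such as the missing observability policy) from blocking a run.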

🔍 Input Prompt Snippet Example

assignment: provision-infrastructure

project:
  trace_id: trace-infra-44188
  service_name: NotificationService
  environment: Staging
  tenant_id: tenant-42

inputs:
  solution_architecture_url: https://.../solution-architecture.md
  resource_configuration_url: https://.../notification/resource-configuration.yaml
  identity_policy_url: https://.../notification/identity-policy.yaml
  observability_policy_url: https://.../notification/observability.yaml
  previous_infra_state_url: https://.../notification/infra-v1.1.0.json

settings:
  output_format: [bicep, terraform, pulumi]
  multi_cloud_enabled: false
  rollback_safe: true
  emit_diagram: true

🧠 Metadata Propagation

Every input maps to:

  • trace_id → present in every output file and event
  • tenant_id, environment → used in naming, resource grouping, tagging
  • agent_version → helps downstream validation
  • cloud_provider → decides Bicep (Azure), Terraform (multi-cloud), or Pulumi (C#/.NET)
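
How `cloud_provider` and the requested `output_format` might combine can be sketched as follows. The function name, the `requested` parameter, and the rule that Bicep is dropped for non-Azure or multi-cloud targets are assumptions for illustration; the source only states that the provider influences the choice.

```python
def select_iac_formats(cloud_provider, multi_cloud_enabled=False, requested=None):
    """Pick IaC output formats for a target cloud (illustrative rule set)."""
    available = ["terraform", "pulumi"]
    # Bicep targets Azure only, so it is offered for single-cloud Azure runs.
    if cloud_provider.lower() == "azure" and not multi_cloud_enabled:
        available.insert(0, "bicep")
    if requested is None:
        return available[:1]  # default to the first (preferred) format
    # Keep the caller's order, dropping formats that do not apply.
    return [fmt for fmt in requested if fmt in available]
```

With the settings from the prompt snippet (`output_format: [bicep, terraform, pulumi]`, `multi_cloud_enabled: false`) and an Azure target, all three formats would be kept.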

📤 Core Outputs

The Infrastructure Architect Agent emits a complete infrastructure-as-code (IaC) blueprint for each service, tenant, and environment, using multi-mode outputs including:

  • Bicep for Azure-native deployments
  • Terraform for multi-cloud compatibility
  • Pulumi (C#) for .NET-native teams

Each output is environment-specific, versioned, and traceable, and includes metadata for rollback, observability, and security review.


📦 Artifact Output Summary

| Artifact | Format | Purpose |
| --- | --- | --- |
| infra.bicep | Bicep | Azure-native definition (VNets, AKS, KV, etc.) |
| main.tf | Terraform | Multi-cloud support (AWS, Azure, GCP) |
| PulumiInfra.cs | C# (Pulumi) | .NET-based IaC, same topology as Bicep/TF |
| network-topology.yaml | YAML | Declares subnets, IP ranges, NSGs, DNS |
| identity-map.yaml | YAML | Maps service identities, roles, scopes |
| storage-definition.yaml | YAML | Blob buckets, database plans, backup policies |
| infra-metadata.json | JSON | Traceable outputs: IDs, trace_id, environment, version |
| infra-rollback.yaml | YAML | Describes how to safely undo/replace resources |
| infra-policy-map.yaml | YAML | Injected tags, constraints, quotas, compliance requirements |
| infra-topology.mmd | Mermaid | Visual network and compute diagram |
| InfraProvisioningPlanPublished | Event | Lifecycle event for CI/CD and Cloud Architecture Agent |

🧾 Output Example: infra.bicep

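// Parameters referenced below; assumed to be supplied by the caller or a parameter file
param serviceName string
param environment string
param location string = resourceGroup().location

resource aks 'Microsoft.ContainerService/managedClusters@2022-03-01' = {
  name: '${serviceName}-aks-${environment}'
  location: location
  identity: {
    type: 'SystemAssigned'
  }
  properties: {
    dnsPrefix: '${serviceName}-${environment}'
    agentPoolProfiles: [
      {
        name: 'default'
        count: 2
        vmSize: 'Standard_DS2_v2'
        osType: 'Linux'
        mode: 'System'
      }
    ]
  }
}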

🧾 Output Example: PulumiInfra.cs

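// Fragment: assumes the classic Pulumi.Azure provider and a `project` string defined by the enclosing stack.
using Pulumi.Azure.ContainerService;
using Pulumi.Azure.ContainerService.Inputs;
using Pulumi.Azure.Core;
using Pulumi.Azure.Network;

var resourceGroup = new ResourceGroup($"{project}-rg");

var vnet = new VirtualNetwork("vnet", new VirtualNetworkArgs {
    AddressSpaces = { "10.0.0.0/16" },
    ResourceGroupName = resourceGroup.Name,
    Location = resourceGroup.Location
});

var aks = new KubernetesCluster("aks", new KubernetesClusterArgs {
    ResourceGroupName = resourceGroup.Name,
    Location = resourceGroup.Location,
    DnsPrefix = $"{project}-k8s",
    // Recent provider releases take a single default (system) node pool
    // in place of the older AgentPoolProfiles list.
    DefaultNodePool = new KubernetesClusterDefaultNodePoolArgs {
        Name = "agentpool",
        NodeCount = 2,
        VmSize = "Standard_D2_v2"
    },
    Identity = new KubernetesClusterIdentityArgs {
        Type = "SystemAssigned"
    }
});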

🧾 Output Example: network-topology.yaml

vnet:
  name: vnet-notification-staging
  address_space: 10.1.0.0/16
  subnets:
    - name: public
      address_prefix: 10.1.1.0/24
    - name: private
      address_prefix: 10.1.2.0/24
  dns_zone: internal.connectsoft.dev
  peering:
    enabled: true
    targets:
      - core-services
      - telemetry-network
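
A topology like the one above can be sanity-checked before any IaC is emitted. The sketch below assumes the `vnet` mapping has been parsed into a dictionary; `check_topology` is a hypothetical helper, and it covers only subnet containment and overlap.

```python
import ipaddress


def check_topology(vnet):
    """Return a list of problems found in a parsed network-topology 'vnet' mapping."""
    problems = []
    space = ipaddress.ip_network(vnet["address_space"])
    subnets = [(s["name"], ipaddress.ip_network(s["address_prefix"]))
               for s in vnet["subnets"]]

    # Every subnet must sit inside the VNet address space.
    for name, net in subnets:
        if not net.subnet_of(space):
            problems.append(f"subnet '{name}' ({net}) is outside {space}")

    # No two subnets may overlap.
    for i, (name_a, net_a) in enumerate(subnets):
        for name_b, net_b in subnets[i + 1:]:
            if net_a.overlaps(net_b):
                problems.append(f"subnets '{name_a}' and '{name_b}' overlap")
    return problems
```

For the sample topology (10.1.1.0/24 and 10.1.2.0/24 inside 10.1.0.0/16) the function returns an empty list.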

🧾 Output Example: infra-metadata.json

{
  "trace_id": "trace-infra-44411",
  "service": "NotificationService",
  "environment": "Staging",
  "provisioned_at": "2025-05-02T01:34:00Z",
  "agent_version": "1.3.0",
  "artifacts": {
    "bicep": "infra.bicep",
    "terraform": "main.tf",
    "pulumi": "PulumiInfra.cs",
    "metadata": "infra-metadata.json"
  }
}

✅ Output Completeness Rules

| Requirement | Status |
| --- | --- |
| All outputs must include trace_id, agent_version, and environment | ✅ |
| Must emit at least one IaC format (Bicep, TF, Pulumi) | ✅ |
| All compute and network resources must be grouped per environment | ✅ |
| Storage, identity, and DNS outputs must be traceable via IDs | ✅ |
| Mermaid diagram must match declared topology | ✅ |
| Lifecycle event InfraProvisioningPlanPublished must be emitted | ✅ |
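
The first two requirements lend themselves to a mechanical check against infra-metadata.json. The sketch below assumes the metadata has been loaded into a dictionary with the fields shown in the example; `completeness_issues` is a hypothetical name.

```python
REQUIRED_FIELDS = ("trace_id", "agent_version", "environment")
IAC_KEYS = ("bicep", "terraform", "pulumi")


def completeness_issues(metadata):
    """Return a list of completeness problems for a parsed infra-metadata.json."""
    issues = [f"missing field: {field}"
              for field in REQUIRED_FIELDS if not metadata.get(field)]
    artifacts = metadata.get("artifacts", {})
    # At least one IaC format must be listed among the artifacts.
    if not any(artifacts.get(key) for key in IAC_KEYS):
        issues.append("no IaC artifact (Bicep, Terraform, or Pulumi) listed")
    return issues
```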

📈 Event Emission Example

{
  "event": "InfraProvisioningPlanPublished",
  "trace_id": "trace-infra-44411",
  "service": "NotificationService",
  "environment": "Staging",
  "outputs": {
    "pulumi": "PulumiInfra.cs",
    "terraform": "main.tf",
    "bicep": "infra.bicep",
    "metadata": "infra-metadata.json"
  },
  "timestamp": "2025-05-02T01:34:00Z"
}
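
An event with this shape could be derived directly from the run metadata. This is a minimal sketch; `build_plan_published_event` and its `now` parameter are assumptions added for illustration, and publishing the event to a bus is left out.

```python
from datetime import datetime, timezone


def build_plan_published_event(metadata, now=None):
    """Build an InfraProvisioningPlanPublished payload from parsed infra-metadata."""
    moment = now or datetime.now(timezone.utc)
    return {
        "event": "InfraProvisioningPlanPublished",
        "trace_id": metadata["trace_id"],
        "service": metadata["service"],
        "environment": metadata["environment"],
        "outputs": dict(metadata["artifacts"]),  # copy so the event owns its map
        "timestamp": moment.strftime("%Y-%m-%dT%H:%M:%SZ"),
    }
```

Passing `now` explicitly keeps the function deterministic in tests; in normal use it falls back to the current UTC time.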

📚 Agent Knowledge Base Overview

The Infrastructure Architect Agent leverages a curated infrastructure knowledge base that includes:

  • Reusable IaC templates and modules
  • Multi-cloud best practices
  • Resource naming conventions and tagging strategies
  • Known patterns for tenant isolation, zone redundancy, and cost optimization
  • Policy-driven configurations for storage, identity, telemetry, and security

This knowledge base is applied uniformly across all IaC output modes: Bicep, Terraform, and Pulumi (C#).


🧱 1. Infrastructure Modules Library

| Module | Description |
| --- | --- |
| aks_cluster | Parametric node pool with MSI, DNS, autoscaling |
| vnet_base | VNet + subnets + NSG with peer routing |
| keyvault_module | Vault + access policy bindings + purge protection |
| blob_storage | Secure container with private endpoint, encryption |
| dns_zone | Azure Private DNS or Route53 zone per tenant/environment |
| otel_collector | Injected as deployment or managed agent with routing |
| role_assignments | RBAC/IAM bindings for services and managed identities |

🌍 2. Cloud-Specific Knowledge

| Cloud | Agent Behavior |
| --- | --- |
| Azure | Uses Bicep templates + ARM native API ID mappings |
| AWS | Terraform modules for VPC, IAM, EKS, S3, CloudWatch |
| GCP | Terraform modules for VPC, GKE, IAM, KMS, Stackdriver |
| Hybrid/Local | Defaults to self-hosted Kubernetes + DNS stub zones |
| Pulumi (C#) | Mirrors Bicep/TF topologies using Pulumi SDKs + .NET |

🔐 3. Identity and Access Policies

Pattern Rule
Default identity type = SystemAssigned MSI or Service Principal
Least privilege: services get only minimal roles (Reader, KV Reader)
Tenant isolation: per-tenant resource group or role segregation
Access to secrets: always via vault reference, not hardcoded
identity-map.yaml reused across DevOps and Security Agents

📦 4. Naming and Tagging Conventions

| Convention | Example |
| --- | --- |
| env-service-region | prod-notification-eus |
| Tags: project, owner, env, trace_id | Used across all provisioned resources |
| Vault secrets: connectsoft/service/env/key | Always prefixed with connectsoft |
| DNS zones: internal.connectsoft.dev + tenant subdomains | Used for service resolution and peering |
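
These conventions are simple enough to express as small helpers. The sketch below is illustrative only: the function names are hypothetical, and lower-casing the name parts is an assumption inferred from the `prod-notification-eus` example.

```python
def resource_name(env, service, region):
    """Build an env-service-region name, e.g. prod-notification-eus."""
    return "-".join(part.strip().lower() for part in (env, service, region))


def secret_path(service, env, key):
    """Build a vault secret path of the form connectsoft/service/env/key."""
    return "/".join(["connectsoft", service.lower(), env.lower(), key])


def standard_tags(project, owner, env, trace_id):
    """Return the tag set applied to every provisioned resource."""
    return {"project": project, "owner": owner, "env": env, "trace_id": trace_id}
```

For example, `resource_name("Prod", "Notification", "EUS")` yields `prod-notification-eus`, matching the table.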

♻️ 5. Reuse & Template Inheritance Rules

| Component | Reuse Scope |
| --- | --- |
| Subnet definitions | Shared across services in same VNet |
| AKS node pool sizes | Reused per environment class (dev/staging/prod) |
| OpenTelemetry agent config | Injected if observability-policy.yaml is missing |
| Bicep, Terraform, Pulumi modules | Always versioned and imported from platform IaC registry |

🧠 Semantic Memory Lookup Example

query: "provision AKS for production service with secure vault"
match:
  aks_template: aks_cluster_v3
  storage_encryption: enabled
  key_vault_binding: connectsoft-vault/prod/notification
  observability: otel_collector + log analytics

✅ Benefits of the Knowledge Base

| Benefit | Outcome |
| --- | --- |
| 🚀 Accelerated infra generation | Plug-and-play templates in Bicep/TF/Pulumi |
| 🔐 Policy consistency | Same IAM, DNS, security posture across environments |
| 🌍 Cloud abstraction | Terraform & Pulumi ensure portability |
| 💾 Reusable and testable modules | All IaC artifacts inherit tested structure |
| 📈 Infrastructure observability by default | Pre-integrated OTEL, logs, metrics |
| 🧠 Semantic alignment | All infra outputs traceable and driven by shared intent |

🔄 End-to-End Process Flow

The Infrastructure Architect Agent executes a deterministic, traceable, and reusable process to generate secure, environment-ready infrastructure blueprints for Azure, AWS, and GCP, including Pulumi (.NET) workflows.

It coordinates input parsing, module resolution, IaC generation, observability injection, and lifecycle event emission, ensuring every microservice or environment is cloud-ready.


📋 Step-by-Step Flow

| Step | Description |
| --- | --- |
| 1️⃣ Input Collection & Validation | Load the required input files and validate the essential parameters. The files are solution-architecture.md, resource-configuration.yaml, identity-policy.yaml, field-retention-map.yaml, observability-policy.yaml, and optionally previous-infra-state.json for drift detection. Validation covers cloud provider selection, environment tags, and mandatory elements such as compute, DNS, and vault configurations. |
| 2️⃣ Module Resolution | Select the appropriate base templates from the module library based on the input files (e.g., aks_cluster, keyvault_module). Determine the cloud-specific IaC format to generate (Bicep, Terraform, Pulumi C#). Check for custom overrides or tenant-specific constraints that may affect the generated infrastructure. |
| 3️⃣ IaC Generation | Emit IaC artifacts for: Network (VNet, subnets, NSGs, peering); Compute (AKS, node pools, autoscaling groups); Storage (Blob/S3, databases, vaults); Identity (MSI, roles, bindings); DNS (zones, resolution paths, private endpoints). Output formats include infra.bicep, main.tf, and PulumiInfra.cs. |
| 4️⃣ Policy Injection & Observability Wiring | Apply the naming/tagging strategy; embed trace fields (trace_id, service_name, environment) for distributed tracing; create RBAC/IAM bindings based on identity-policy.yaml; wire the OpenTelemetry collector and spans. Validate that all resources are tagged with billing and governance labels and that tracing spans and log sinks are configured. |
| 5️⃣ Topology & Rollback Generation | Generate network-topology.yaml (network architecture), storage-definition.yaml (storage configuration), identity-map.yaml (identity configuration), infra-topology.mmd (Mermaid diagram), and, if rollback_safe: true, infra-rollback.yaml. Optionally diff against previous-infra-state.json and emit infra-drift-detected.yaml if resources changed or are missing. |
| 6️⃣ Lifecycle Metadata + Event Publishing | Compose infra-metadata.json containing the agent version, build ID, resource count, regions used, trace ID, and output map (links to Bicep/Terraform/Pulumi). Emit the InfraProvisioningPlanPublished lifecycle event and, depending on the outcome, optionally InfraDriftDetected, InfraRollbackReady, or InfraProvisioningFailed. |

1️⃣ Input Collection & Validation

  • Load:

    • solution-architecture.md
    • resource-configuration.yaml
    • identity-policy.yaml
    • field-retention-map.yaml
    • observability-policy.yaml
    • previous-infra-state.json (optional for drift detection)
  • Validate:

    • Cloud provider selection
    • Environment tags and tenant mappings
    • Mandatory elements (e.g., compute + DNS + vault)

2️⃣ Module Resolution

  • Based on inputs:
    • Select base templates from module library (aks_cluster, keyvault_module, etc.)
    • Determine cloud-specific IaC format(s) to generate (Bicep, Terraform, Pulumi C#)
    • Check for custom overrides or tenant-specific constraints

3️⃣ IaC Generation

  • Emit IaC artifacts for:

    • Network: VNet, subnets, NSGs, peering
    • Compute: AKS, node pools, autoscaling groups
    • Storage: Blob/S3, DBs, vaults
    • Identity: MSI, roles, bindings
    • DNS: zones, resolution paths, private endpoints
  • Formats:

    • infra.bicep
    • main.tf
    • PulumiInfra.cs

4️⃣ Policy Injection & Observability Wiring

  • Apply:

    • Naming/tagging strategy
    • Trace ID embedding (trace_id, service_name, environment)
    • RBAC/IAM bindings based on identity-policy.yaml
    • OpenTelemetry collector & span wiring
  • Validate:

    • All resources tagged with billing and governance labels
    • At least one tracing span and log sink configured

5️⃣ Topology & Rollback Generation

  • Emit:

    • network-topology.yaml
    • storage-definition.yaml
    • identity-map.yaml
    • infra-topology.mmd (Mermaid)
    • infra-rollback.yaml if rollback_safe: true
  • Optionally:

    • Diff from previous-infra-state.json
    • Emit infra-drift-detected.yaml if resources changed or missing
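
The optional diff step could look roughly like this. The sketch assumes both states are dictionaries keyed by resource ID with a property mapping as the value; the actual layout of previous-infra-state.json is not specified in this document, so treat that shape, and the names `detect_drift` and `has_drift`, as assumptions.

```python
def detect_drift(previous, current):
    """Compare two resource maps keyed by resource ID and report differences."""
    prev_ids, curr_ids = set(previous), set(current)
    return {
        "missing": sorted(prev_ids - curr_ids),   # in old state, gone now
        "added": sorted(curr_ids - prev_ids),     # new since the old state
        "changed": sorted(rid for rid in prev_ids & curr_ids
                          if previous[rid] != current[rid]),
    }


def has_drift(report):
    """True if any category of the drift report is non-empty."""
    return any(report.values())
```

A non-empty report would be the trigger for writing infra-drift-detected.yaml and emitting InfraDriftDetected.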

6️⃣ Lifecycle Metadata + Event Publishing

  • Compose infra-metadata.json with:

    • Agent version
    • Build ID
    • Resource count
    • Regions used
    • Trace ID
    • Output map (bicep/terraform/pulumi links)
  • Emit:

    • InfraProvisioningPlanPublished lifecycle event
    • (Optional) InfraDriftDetected, InfraRollbackReady, InfraProvisioningFailed

🧠 Mermaid Process Diagram

flowchart TD
    A[Input Parsing] --> B[Template Resolution]
    B --> C[IaC Generation - Bicep, TF, Pulumi]
    C --> D[Policy + Observability Injection]
    D --> E[Topology + Rollback Output]
    E --> F[Emit Metadata + Events]
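
The six stages in the diagram run strictly in sequence, which can be sketched as a small pipeline runner. Everything here is illustrative: `run_pipeline`, the context dictionary, and the event payloads are assumptions, and real stages would call the corresponding skills.

```python
def run_pipeline(context, stages):
    """Run named stages in order; each stage takes and returns the context dict."""
    context = dict(context, completed=[], events=[])
    for name, stage in stages:
        try:
            context = stage(context)
        except Exception as exc:
            # Stop at the first failing stage and record a failure event.
            context["events"].append(
                {"event": "InfraProvisioningFailed", "stage": name, "error": str(exc)})
            return context
        context["completed"].append(name)
    context["events"].append({"event": "InfraProvisioningPlanPublished"})
    return context
```

A stage is any callable that returns the (possibly extended) context, for example `lambda ctx: {**ctx, "validated": True}`.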

🔁 Auto-Correction & Retry

| Condition | Action |
| --- | --- |
| Missing cloud type | Default to Azure |
| DNS zone undefined | Inject fallback internal.connectsoft.dev |
| No OTEL config | Apply default OTLP + collector template |
| Storage class mismatch | Use default Standard_LRS or gp2 |
| Secret naming violation | Normalize and log fix in output trace |
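
The first three corrections amount to filling in defaults and recording what was changed. The sketch below assumes a flat configuration dictionary with `cloud_provider`, `dns_zone`, and `otel` keys; those key names and the `auto_correct` helper are hypothetical.

```python
import copy

# Defaults taken from the table above; the flat key names are an assumption.
DEFAULTS = {
    "cloud_provider": "Azure",
    "dns_zone": "internal.connectsoft.dev",
    "otel": {"collector": {"enabled": True}, "exporters": [{"type": "otlp"}]},
}


def auto_correct(config):
    """Return (corrected_config, fixes) without mutating the input."""
    corrected = dict(config)
    fixes = []
    for key, default in DEFAULTS.items():
        if not corrected.get(key):
            corrected[key] = copy.deepcopy(default)
            fixes.append(f"{key}: applied default")
    return corrected, fixes
```

Returning the list of fixes alongside the corrected configuration makes it possible to log each correction in the output trace.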

📈 Observability Trace Spans

| Span Name | Description |
| --- | --- |
| infra_inputs_validated | Initial agent activation success |
| iac_generated | When IaC is written to disk/output directory |
| identity_mapped | When MSI + RBAC bindings are computed |
| topology_rendered | Network + DNS plan finalized |
| infra_provisioning_event_published | Agent lifecycle marker sent |

🛠️ Semantic Kernel Skills Overview

The Infrastructure Architect Agent is powered by a modular set of Semantic Kernel (.NET) skills.
Each skill handles a focused infrastructure concern, from input parsing and IaC generation to security, observability, and rollback management.

All skills are reusable, composable, and support trace-based observability and drift correction.


🔧 1. Core Infra Composition Skills

| Skill | Purpose |
| --- | --- |
| IaCScaffolderSkill | Generates Bicep, Terraform, and Pulumi (C#) files from resolved templates |
| CloudTopologyBuilderSkill | Produces network-topology.yaml, infra-topology.mmd |
| NodePoolPlannerSkill | Determines AKS/EKS node sizes, autoscale rules, zones |
| TaggingPolicyEnforcerSkill | Injects owner, project, env, trace_id into all resources |

🔐 2. Identity and Access Control Skills

| Skill | Purpose |
| --- | --- |
| IdentityMapGeneratorSkill | Produces identity-map.yaml with MSI, SP, roles |
| IAMRoleBinderSkill | Creates IAM bindings in Bicep, TF, or Pulumi |
| TenantIsolationPlannerSkill | Applies per-tenant scoping rules for RBAC, resource groups |
| VaultIntegrationSkill | Validates and links vault access with identity bindings |

🌍 3. Multi-Cloud IaC Generator Skills

| Skill | Purpose |
| --- | --- |
| BicepEmitterSkill | Generates infra.bicep for Azure deployments |
| TerraformEmitterSkill | Outputs main.tf with providers, modules, locals |
| PulumiCSharpEmitterSkill | Converts resolved infra map into PulumiInfra.cs using .NET SDKs |
| IaCDiffCheckerSkill | Compares new IaC with previous-infra-state.json and emits drift report |

📈 4. Observability and Topology Skills

| Skill | Purpose |
| --- | --- |
| OpenTelemetryWiringSkill | Injects OTLP collector config, exporter URLs, and trace IDs |
| TopologyDiagramWriterSkill | Renders Mermaid diagram for VNet, Subnets, AKS, Vaults |
| SpanEmitterSkill | Emits infra lifecycle spans to OpenTelemetry and DevOps trace |
| StorageDefinitionWriterSkill | Builds storage-definition.yaml from retention, schema, and access tier policies |

🔁 5. Rollback, Metadata, and Event Publication Skills

| Skill | Purpose |
| --- | --- |
| RollbackPlanBuilderSkill | Produces infra-rollback.yaml based on diff and rollback flags |
| MetadataManifestWriterSkill | Outputs infra-metadata.json with trace and artifact map |
| LifecycleEventPublisherSkill | Emits InfraProvisioningPlanPublished and other lifecycle signals |
| DriftAlertEmitterSkill | Triggers InfraDriftDetected event and changelog if topology changed unexpectedly |

📈 Skill Observability Example

{
  "trace_id": "trace-infra-44219",
  "skill": "PulumiCSharpEmitterSkill",
  "output": "PulumiInfra.cs",
  "cloud_provider": "Azure",
  "status": "success",
  "duration_ms": 134
}
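
A record of this shape can be produced by wrapping each skill invocation in a timing helper. The sketch below is an assumption about how that might be done; `run_skill` is a hypothetical name, and the `cloud_provider` field is omitted for brevity.

```python
import time


def run_skill(skill_name, trace_id, func, *args, **kwargs):
    """Invoke a skill callable and return an observability record for the call."""
    start = time.perf_counter()
    status, output = "success", None
    try:
        output = func(*args, **kwargs)
    except Exception:
        status = "failed"  # the record still carries trace_id and duration
    duration_ms = int((time.perf_counter() - start) * 1000)
    return {
        "trace_id": trace_id,
        "skill": skill_name,
        "output": output,
        "status": status,
        "duration_ms": duration_ms,
    }
```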

🔁 Retry & Auto-Correction Behaviors

| Skill | Condition | Auto-Correction |
| --- | --- | --- |
| VaultIntegrationSkill | Secret missing reference | Injects fallback vault path per service/environment |
| TenantIsolationPlannerSkill | No tenant ID provided | Assumes shared infra mode |
| OpenTelemetryWiringSkill | Missing exporter config | Defaults to http://otel.default:4317 OTLP |
| PulumiCSharpEmitterSkill | Unsupported resource | Downgrades to Terraform if fallback module available |

📊 Metrics Emitted per Skill

| Metric | Purpose |
| --- | --- |
| iac_files_generated_total | Total IaC templates produced |
| drift_detected_total | Number of diffs vs prior infra state |
| span_injection_success | Whether OTEL spans were injected |
| rollback_plan_ready | Boolean indicating rollback plan completeness |
| event_publish_success | Confirmed lifecycle notification sent |

🧰 Core Technologies and Platforms

The Infrastructure Architect Agent uses a modular, cloud-native, and automation-first tech stack to generate, validate, and export infrastructure blueprints across cloud environments.
It supports multi-IaC, multi-cloud, and developer-native (Pulumi C#) workflows, ensuring seamless integration with ConnectSoft’s broader platform and DevOps agents.


☁️ Supported Cloud Platforms

| Cloud Provider | Role in Agent |
| --- | --- |
| Azure | Primary cloud for production environments: AKS, Key Vault, Storage, Monitor |
| AWS | Optional deployments via Terraform: EKS, S3, IAM, CloudWatch |
| GCP | Supported via Terraform: GKE, Cloud Storage, IAM, Stackdriver |
| Hybrid (Self-hosted) | DNS stub zones, internal IP ranges, K3s, OpenTelemetry only |

📦 Infrastructure-as-Code Engines

| Tool | Purpose |
| --- | --- |
| Bicep | Azure-native IaC (AKS, VNets, Key Vault, RBAC) |
| Terraform | Cloud-agnostic IaC across Azure, AWS, GCP |
| Pulumi (C#) | .NET-native IaC for developer-centric infrastructure teams |
| ARM Templates (fallback only) | Legacy support for Azure services where Bicep isn’t available |
| Cloud SDKs | Used via Pulumi C# for resource orchestration |

🛠️ Semantic Kernel (.NET)

| Use | Details |
| --- | --- |
| Agent orchestration | Composes IaC generation flow from skills |
| Prompt interpretation | Parses and applies intent from YAML-based prompt definitions |
| Skill injection | Loads IaC emitter skills based on output target (Bicep, TF, Pulumi) |
| Trace integration | Embeds trace_id, agent_version, service_name into every generated file and span |

🔐 Identity and Access

| Technology | Purpose |
| --- | --- |
| Azure Managed Identity (MSI) | Used by AKS, Key Vault, Storage access |
| Terraform IAM modules | AWS/GCP support for minimal role policies |
| Pulumi.AzureNative.Authorization | C#-based identity provisioning |
| Key Vault & KMS | Securely manage tokens, connection strings, keys |
| RBAC Scopes | Automatically mapped via identity-map.yaml |

🌍 Networking and DNS

| Stack | Role |
| --- | --- |
| Azure Private DNS | Used for internal .connectsoft.dev subdomains |
| Route53 / Cloud DNS | Used for tenant-bound or public zones |
| VNet / VPC | Created per environment or tenant |
| NSGs / Security Groups | Applied per subnet or cluster node pool |
| Peering & Transit Gateway | Supported where cross-region communication is required |

📊 Observability & Logging Stack

| Component | Purpose |
| --- | --- |
| OpenTelemetry Collector | Injected into infra stack to emit spans and metrics |
| Azure Monitor / Log Analytics | Default observability sink for Azure-based infra |
| Prometheus + Grafana (optional) | Can be configured via Pulumi or Helm |
| CloudWatch / Stackdriver | Used on AWS/GCP via Terraform log routing modules |
| otel-agent-config.yaml | Emitted to standardize span schema per resource type |

📁 Template Registry and Reuse

| Format | Reuse Mechanism |
| --- | --- |
| .bicep | Imported from connectsoft/iac/modules/*.bicep |
| .tf | Uses Terraform module blocks with source ref |
| .cs (Pulumi) | Generated from TemplateLibrary/Modules/*.cs |
| Shared Tags/Locals | Used across all formats: env, project, trace_id, region |

🔁 CI/CD and DevOps Integration

| Tool | Integration Point |
| --- | --- |
| Azure DevOps Pipelines | Pulls IaC artifacts from infra-out folder |
| GitOps | Optional: sync from generated output into Flux/ArgoCD |
| InfraProvisioningPlanPublished | Triggers downstream deploy/test flows |
| Rollback Triggers | Based on infra-rollback.yaml, linked via trace and git ref |

📜 System Prompt (Bootstrapping Instruction)

This system prompt governs how the Infrastructure Architect Agent initializes, interprets architectural intent, and generates IaC artifacts in Bicep, Terraform, and Pulumi (C#) formats.

It enforces ConnectSoft's standards for security, observability, naming, versioning, and tenant-aware resource allocation.


✅ Full System Prompt (Plain Text)

You are the **Infrastructure Architect Agent** in the ConnectSoft AI Software Factory.

Your purpose is to take architectural definitions and environment metadata, and generate a complete, secure, and reusable infrastructure blueprint that can be deployed using Bicep, Terraform, or Pulumi (C#).

---

## Your Responsibilities:

1. Load inputs:
   - solution-architecture.md
   - application-architecture.md
   - identity-policy.yaml
   - resource-configuration.yaml
   - field-retention-map.yaml
   - observability-policy.yaml
   - previous-infra-state.json (optional)

2. Generate infrastructure outputs:
   - Bicep (`infra.bicep`)
   - Terraform (`main.tf`)
   - Pulumi C# (`PulumiInfra.cs`)
   - YAML metadata files: `network-topology.yaml`, `storage-definition.yaml`, `identity-map.yaml`
   - Mermaid diagram: `infra-topology.mmd`
   - Rollback and changelog files: `infra-rollback.yaml`, `infra-drift-detected.yaml`

3. Apply policies:
   - Use ConnectSoft's naming, tagging, and RBAC conventions
   - Inject traceability metadata (`trace_id`, `agent_version`, `service_name`, `environment`)
   - Wire in OpenTelemetry collector and span templates

4. Emit lifecycle events:
   - `InfraProvisioningPlanPublished`
   - `InfraDriftDetected` (if previous infra state differs)
   - `InfraRollbackReady` (if rollback plan successfully created)

---

## Output Requirements:

- All artifacts must be consistent, versioned, and environment-specific
- At least one IaC format (Bicep, TF, or Pulumi) must be generated
- All infrastructure must include trace metadata and environment tags
- Resources must be deployable via automation pipelines (Azure DevOps, GitOps)
- No hardcoded secrets or identity strings; use vaults and RBAC

---

## Observability:

- Inject OTEL collector or sidecar per cluster or host
- Emit spans for: input parsing, template selection, IaC generation, rollback readiness
- Use `otel-agent-config.yaml` to standardize export targets and trace structure

---

## Fallbacks and Safety:

- If cloud provider not specified → default to Azure
- If observability config missing → inject default OTLP + stdout
- If tenant_id missing → assume shared multi-tenant mode
- If rollback disabled but detected drift → emit warning and skip promotion

🔐 Policy-Driven Agent Behavior

  • All DNS zones must resolve within *.connectsoft.dev
  • All secrets and config must be sourced via Azure Key Vault or equivalent
  • All generated outputs must conform to ConnectSoft's environment-tagging, traceability, and modular reuse conventions
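
The "Fallbacks and Safety" rules in the system prompt can be read as a small resolution step that runs before generation. The following is a minimal Python sketch of that reading, not the agent's actual implementation. The function name `resolve_fallbacks`, the `promote` flag, and the shape of the default observability config are assumptions made for illustration.

```python
def resolve_fallbacks(project, settings, observability_config=None, drift_detected=False):
    """Apply the documented defaults and return (resolved, warnings)."""
    warnings = []
    resolved = {
        # Cloud provider not specified -> default to Azure.
        "cloud_provider": settings.get("cloud_provider") or "Azure",
        # tenant_id missing -> assume shared multi-tenant mode.
        "tenant_mode": "isolated" if project.get("tenant_id") else "shared",
        # Observability config missing -> default OTLP + stdout (hypothetical shape).
        "observability": observability_config or {"exporters": ["otlp", "stdout"]},
        # Hypothetical flag: whether the plan may be promoted downstream.
        "promote": True,
    }
    # Rollback disabled but drift detected -> warn and skip promotion.
    if drift_detected and not settings.get("rollback_safe", False):
        warnings.append("drift detected while rollback is disabled; skipping promotion")
        resolved["promote"] = False
    return resolved, warnings


resolved, warnings = resolve_fallbacks(
    project={"service_name": "AuditService"},
    settings={"rollback_safe": False},
    drift_detected=True,
)
print(resolved["cloud_provider"], resolved["tenant_mode"], resolved["promote"])
# prints: Azure shared False
```

Returning the warnings alongside the resolved values keeps the defaults visible to whichever component decides whether to publish.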

📥 Input Prompt Template

The Infrastructure Architect Agent is activated by a structured YAML input prompt provided by the ConnectSoft orchestrator or Solution Architect Agent.
This prompt defines the service context, environment, tenant scope, and configuration sources needed to generate infrastructure artifacts.


✅ Sample Input Prompt (YAML)

assignment: provision-infrastructure

project:
  trace_id: trace-infra-99881
  service_name: NotificationService
  environment: Staging
  tenant_id: tenant-42
  agent_version: 1.3.0

inputs:
  solution_architecture_url: https://.../solution-architecture.md
  application_architecture_url: https://.../application-architecture.md
  resource_configuration_url: https://.../notification/resource-config.yaml
  identity_policy_url: https://.../notification/identity-policy.yaml
  field_retention_map_url: https://.../notification/field-retention.yaml
  observability_policy_url: https://.../notification/observability.yaml
  previous_infra_state_url: https://.../notification/infra-v1.1.0.json

settings:
  output_format: [bicep, terraform, pulumi]
  inject_otel: true
  rollback_safe: true
  emit_diagram: true
  cloud_provider: Azure
  enable_tenant_isolation: true

🧩 Required Input Fields

🔷 project

| Field | Description |
| --- | --- |
| trace_id | Unique ID for traceability, reused across all outputs |
| service_name | The logical microservice name this infra supports |
| environment | Deployment stage (Dev, Staging, Production) |
| tenant_id | Tenant scope (optional; if omitted, shared mode is assumed) |
| agent_version | Infrastructure Architect Agent version used to generate artifacts |

📁 inputs

| Key | Description |
| --- | --- |
| solution_architecture_url | High-level system blueprint |
| application_architecture_url | Logical service and zone breakdown |
| resource_configuration_url | Resource size, DNS, storage class, compute settings |
| identity_policy_url | RBAC, MSI, tenant roles, access constraints |
| field_retention_map_url | Storage requirements (encryption, backup) |
| observability_policy_url | Span injection and OTEL configuration |
| previous_infra_state_url | Optional; used for drift detection and rollback planning |

⚙️ settings

| Field | Description |
| --- | --- |
| output_format | Which IaC targets to generate (bicep, terraform, pulumi) |
| inject_otel | Whether to inject OTEL collector and trace wiring |
| rollback_safe | If true, infra-rollback.yaml will be generated |
| emit_diagram | If true, outputs infra-topology.mmd (Mermaid) |
| cloud_provider | Forces IaC resolution to Azure, AWS, or GCP |
| enable_tenant_isolation | Enforces subnet, DNS, or RG separation for tenant-specific infra |

✅ Validation Rules

| Rule | Enforced |
| --- | --- |
| service_name, trace_id, and environment must be present | ✅ |
| At least one entry in output_format must be valid | ✅ |
| If rollback_safe: true, previous state must be provided | ✅ |
| Identity policies must define at least one principal binding | ✅ |
| DNS zone and subnet must resolve to valid topology unless emit_diagram: false | ✅ |
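
The first three rules above can be checked against the prompt alone, before any referenced file is downloaded. The sketch below illustrates that check in Python; the function name `validate_prompt` and the error wording are hypothetical, and the identity-policy and topology rules are left out because they depend on the contents of the linked files.

```python
VALID_FORMATS = {"bicep", "terraform", "pulumi"}


def validate_prompt(prompt: dict) -> list:
    """Return a list of rule violations; an empty list means the prompt passes."""
    errors = []
    project = prompt.get("project", {})
    inputs = prompt.get("inputs", {})
    settings = prompt.get("settings", {})

    # Rule 1: service_name, trace_id, and environment must be present.
    for field in ("service_name", "trace_id", "environment"):
        if not project.get(field):
            errors.append(f"project.{field} is required")

    # Rule 2: at least one requested output format must be valid.
    formats = settings.get("output_format", [])
    if not any(fmt in VALID_FORMATS for fmt in formats):
        errors.append("settings.output_format needs bicep, terraform, or pulumi")

    # Rule 3: rollback_safe requires a previous infrastructure state.
    if settings.get("rollback_safe") and not inputs.get("previous_infra_state_url"):
        errors.append("rollback_safe is true but previous_infra_state_url is missing")

    return errors


minimal_prompt = {
    "assignment": "provision-infrastructure",
    "project": {
        "trace_id": "trace-infra-44419",
        "service_name": "AuditService",
        "environment": "Dev",
    },
    "settings": {"output_format": ["pulumi"], "rollback_safe": False},
}
print(validate_prompt(minimal_prompt))  # prints: []
```

Collecting every violation instead of stopping at the first one lets the orchestrator report all problems in a single pass.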

📦 Minimal Prompt Example

assignment: provision-infrastructure

project:
  trace_id: trace-infra-44419
  service_name: AuditService
  environment: Dev

inputs:
  resource_configuration_url: https://.../audit/resource-config.yaml
  identity_policy_url: https://.../audit/identity-policy.yaml

settings:
  output_format: [pulumi]
  rollback_safe: false

📤 Output Expectations Overview

The Infrastructure Architect Agent emits a complete, environment-aware, and IaC-compatible infrastructure bundle for every service, tenant, and environment combination.

These outputs are used by DevOps, Observability, Security, and Cloud Architecture Agents to provision, validate, and monitor real-world infrastructure across Azure, AWS, GCP, or hybrid environments.


📦 Artifact Summary Table

| Artifact | Format | Purpose |
| --- | --- | --- |
| infra.bicep | Bicep | Azure-native infrastructure script |
| main.tf | Terraform | Multi-cloud IaC abstraction |
| PulumiInfra.cs | C# (.NET Pulumi) | Developer-native infrastructure declaration |
| network-topology.yaml | YAML | Defines VNet, subnets, DNS zones, NSGs |
| identity-map.yaml | YAML | Describes service-managed identities, role bindings |
| storage-definition.yaml | YAML | Blob buckets, DB definitions, backup policies |
| infra-metadata.json | JSON | Metadata: trace ID, version, region, cloud provider |
| infra-rollback.yaml | YAML | Reversion and resource safe-destruction map |
| infra-policy-map.yaml | YAML | Resource tags, region policies, identity zones |
| infra-topology.mmd | Mermaid | Visual architecture map of network + services |
| otel-agent-config.yaml | YAML | Telemetry collectors, exporters, OTLP endpoints |
| InfraProvisioningPlanPublished | JSON (Event) | Published lifecycle metadata for downstream use |

🧾 Example: infra-policy-map.yaml

tags:
  owner: infra-team
  project: connectsoft
  trace_id: trace-infra-88288
  billing_code: cc123
resource_group_strategy: per-environment
tenant_isolation: subnet
observability:
  otel_enabled: true
  collector_url: http://otel.default:4317

📊 Example: infra-metadata.json

{
  "trace_id": "trace-infra-99122",
  "service": "NotificationService",
  "environment": "Production",
  "cloud_provider": "Azure",
  "agent_version": "1.3.0",
  "generated_at": "2025-05-02T01:55:00Z",
  "output_artifacts": {
    "bicep": "infra.bicep",
    "terraform": "main.tf",
    "pulumi": "PulumiInfra.cs"
  },
  "region": "eastus2",
  "rollback_ready": true
}

📈 Example: InfraProvisioningPlanPublished (Event)

{
  "event": "InfraProvisioningPlanPublished",
  "trace_id": "trace-infra-99122",
  "service": "NotificationService",
  "environment": "Production",
  "outputs": {
    "Pulumi": "PulumiInfra.cs",
    "metadata": "infra-metadata.json",
    "topology": "infra-topology.mmd"
  },
  "timestamp": "2025-05-02T01:55:00Z"
}

📌 Output Constraints & Validation Rules

| Requirement | Enforced |
| --- | --- |
| All outputs include trace_id, environment, agent_version | ✅ |
| At least one IaC format (bicep, tf, or pulumi) must be present | ✅ |
| DNS + network must be represented in network-topology.yaml | ✅ |
| Observability outputs must be injected or defaulted | ✅ |
| If rollback_safe: true, infra-rollback.yaml must be emitted | ✅ |
| Mermaid diagram emitted if emit_diagram: true | ✅ |
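
Most of these constraints are presence checks on the emitted bundle. The sketch below shows one way such a check could look in Python, assuming the bundle is available as a list of file names. `check_bundle` is a hypothetical helper, and treating `otel-agent-config.yaml` as the observability output is an interpretation of the table rather than something the specification states. The trace metadata rule is omitted because it requires reading file contents.

```python
IAC_FILES = {"infra.bicep", "main.tf", "PulumiInfra.cs"}


def check_bundle(artifacts, settings):
    """Return a list of missing-artifact problems for an output bundle."""
    present = set(artifacts)
    problems = []
    # At least one IaC format must be present.
    if not present & IAC_FILES:
        problems.append("no IaC file (infra.bicep, main.tf, or PulumiInfra.cs)")
    # DNS + network must be represented in network-topology.yaml.
    if "network-topology.yaml" not in present:
        problems.append("network-topology.yaml is missing")
    # Observability outputs must be injected or defaulted (assumed artifact).
    if "otel-agent-config.yaml" not in present:
        problems.append("otel-agent-config.yaml is missing")
    # Conditional artifacts driven by the prompt settings.
    if settings.get("rollback_safe") and "infra-rollback.yaml" not in present:
        problems.append("rollback_safe is true but infra-rollback.yaml is missing")
    if settings.get("emit_diagram") and "infra-topology.mmd" not in present:
        problems.append("emit_diagram is true but infra-topology.mmd is missing")
    return problems


bundle = ["PulumiInfra.cs", "network-topology.yaml", "identity-map.yaml", "infra-rollback.yaml"]
print(check_bundle(bundle, {"rollback_safe": True, "emit_diagram": False}))
# prints: ['otel-agent-config.yaml is missing']
```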

📘 Optional: deployment-docs.md

For downstream use in Developer Portal or Audit Trail.

# Infrastructure Deployment: NotificationService

**Environment:** Production  
**Cloud:** Azure  
**Version:** v1.3.0  
**Trace ID:** trace-infra-99122

## Components
- AKS Cluster with autoscaling
- Private VNet with DNS
- Azure Blob Storage with GRS
- Key Vault with managed identity
- OTEL Collector with OTLP export

## Rollback
Rollback enabled → `infra-rollback.yaml`

🧠 Memory Strategy Overview

The Infrastructure Architect Agent maintains a short-term session memory and long-term semantic memory to ensure:

  • Infrastructure consistency across environments and agents
  • Reuse of known patterns and resource identifiers
  • Rollback awareness and safe reversion
  • Drift detection and proactive remediation
  • Cross-agent alignment (DevOps, Security, Observability, Cloud)

🕐 Short-Term (Session) Memory

| Key | Purpose |
| --- | --- |
| trace_id | Tracks all outputs, spans, and events for current run |
| service_name | Used in naming conventions, tags, identities |
| environment | Affects DNS, tags, scaling strategy, vaults |
| output_format[] | Controls which IaC formats to emit (Bicep, TF, Pulumi) |
| generated_artifacts[] | List of created files for summary + event emission |
| drift_detected[] | Captures differences from previous-infra-state.json |
| tenant_mode | Shared vs. isolated infrastructure toggles |

🧠 Long-Term Semantic Memory

🔁 1. Provisioned Resources History

| Data | Used For |
| --- | --- |
| Resource group names and locations | Prevent collisions, support idempotency |
| Vault references and secret names | Enforce secure naming + reuse |
| AKS/EKS cluster names and configurations | Auto-resolve for service scaleouts |
| DNS zones, subnets, and peering policies | Maintain cross-region resolution and tenancy compliance |

🔐 2. Identity Mapping History

| Tracked | Use Case |
| --- | --- |
| Previously bound MSIs, SPs, and RBAC roles | Re-apply correct scopes to new services |
| Tenant-based role templates | Enforce least privilege per tenant pattern |
| Vault identity linkage | Validate that all resources have secure vault access only |

🌍 3. Observability and OTEL Tracing Memory

| Retained | Purpose |
| --- | --- |
| Previously emitted otel-agent-config.yaml | Reuse collector endpoint + exporter rules |
| Known span naming conventions | Ensure trace compatibility across deployments |
| Log forwarding sinks | Automatically bind to Azure Monitor, CloudWatch, or Stackdriver per platform |

♻️ 4. Template and Module Reuse

| Element | Behavior |
| --- | --- |
| AKS/EKS node pool configs | Match by environment and region |
| VNet modules | Reuse across environments if shared_vnet: true |
| Vault templates | Loaded per team/project/tenant class |
| infra-policy-map.yaml | Maintains cost, tag, and scaling standards |

📑 Memory Snapshot Output Example

{
  "trace_id": "trace-infra-55488",
  "version": "1.3.0",
  "rollback_ready": true,
  "generated_artifacts": [
    "PulumiInfra.cs",
    "network-topology.yaml",
    "identity-map.yaml",
    "infra-rollback.yaml"
  ],
  "reuse_policies": {
    "vnet": "shared",
    "dns": "per-tenant",
    "vault": "env-isolated"
  },
  "drift_detected": false
}

✅ Memory Benefits

| Value | Result |
| --- | --- |
| 🔁 Prevent redundant redeployment | Avoid resource re-creation when no change is needed |
| 🔒 Secure identity reuse | Roles and bindings consistently applied |
| 📈 Span and telemetry continuity | Standard OTEL tracing maintained over versions |
| 📦 Consistent IaC structure | Modules and naming reused across hundreds of microservices |
| 🧭 Rollback safety | Links each version to previous working state with traceability |

✅ Validation and Correction Overview

The Infrastructure Architect Agent includes a comprehensive, multi-layered validation pipeline to ensure that all emitted IaC artifacts:

  • Are syntactically correct
  • Follow security, naming, and tagging conventions
  • Are traceable
  • Support safe provisioning and rollback
  • Conform to ConnectSoft's platform policies across clouds and tenants

When validation fails, the agent applies auto-corrections, issues warnings, or blocks output publication.


🔍 Validation Phases

1️⃣ Schema & Structure Validation

| Artifact | Rule |
| --- | --- |
| infra.bicep / main.tf / PulumiInfra.cs | Valid IaC syntax, no unresolved references |
| network-topology.yaml | Must include VNet, at least one subnet, DNS zone |
| identity-map.yaml | Must include at least one MSI or principal |
| storage-definition.yaml | Must define replication, encryption for each store |
| infra-policy-map.yaml | Required tags: owner, project, env, trace_id |
| infra-topology.mmd | Valid Mermaid syntax and referenced resource IDs exist |

2️⃣ Security & Naming Enforcement

| Rule | Correction |
| --- | --- |
| DNS zone must end in .connectsoft.dev | Append suffix if missing |
| Vault secrets must be reference-based (not inline) | Convert inline to key_ref: pattern |
| Service name must be kebab-case | Auto-format and update in IaC files |
| Tenant resources must include tenant_id in name/tag | Append suffix or tag if absent |
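
Three of these corrections are simple string or dictionary transformations. The Python sketch below illustrates them under stated assumptions: the helper names are hypothetical, and the kebab-case rule splits only at a lowercase-or-digit to uppercase boundary, so acronym-heavy names would need extra handling. The vault secret conversion is not shown because the specification does not define the `key_ref` structure.

```python
import re

DNS_SUFFIX = ".connectsoft.dev"


def to_kebab_case(name: str) -> str:
    """Convert PascalCase, camelCase, or snake_case names to kebab-case."""
    # Insert a hyphen at each lowercase/digit -> uppercase boundary.
    spaced = re.sub(r"(?<=[a-z0-9])(?=[A-Z])", "-", name)
    # Collapse any run of non-alphanumeric characters into a single hyphen.
    return re.sub(r"[^A-Za-z0-9]+", "-", spaced).strip("-").lower()


def ensure_dns_suffix(zone: str) -> str:
    """Append .connectsoft.dev unless the zone already resolves under it."""
    zone = zone.rstrip(".").lower()
    if zone == DNS_SUFFIX.lstrip(".") or zone.endswith(DNS_SUFFIX):
        return zone
    return zone + DNS_SUFFIX


def ensure_tenant_tag(tags: dict, tenant_id: str) -> dict:
    """Return a copy of tags with tenant_id added; an existing value wins."""
    return {"tenant_id": tenant_id, **tags}


print(to_kebab_case("NotificationService"))  # prints: notification-service
print(ensure_dns_suffix("notify.staging"))   # prints: notify.staging.connectsoft.dev
```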

3️⃣ Observability & Traceability

| Rule | Correction |
| --- | --- |
| OTEL config must exist if inject_otel: true | Inject default otel-agent-config.yaml |
| Each IaC output must include trace_id, agent_version, and environment | Inject automatically if missing |
| Must emit OpenTelemetry spans for generation lifecycle | Auto-emit if span config is present or fallback allowed |

4️⃣ Drift and Rollback Safety Checks

| Rule | Behavior |
| --- | --- |
| If rollback_safe: true, but no previous-infra-state.json provided | Emit warning, fall back to snapshot-only rollback |
| If significant drift from previous state detected | Emit infra-drift-detected.yaml, block auto-promotion |
| If no drift and version unchanged | Block emission of duplicate infra.bicep or PulumiInfra.cs |
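
Drift detection comes down to comparing the previous state with the newly planned one. The sketch below shows a simple version of that comparison in Python. It assumes both states are flat mappings from resource name to properties, which the specification does not define, and it treats any difference as significant. `detect_drift`, `plan_action`, and the returned action strings are hypothetical.

```python
def detect_drift(previous: dict, current: dict) -> dict:
    """Compare two resource maps and report added, removed, and changed names."""
    shared = previous.keys() & current.keys()
    return {
        "added": sorted(current.keys() - previous.keys()),
        "removed": sorted(previous.keys() - current.keys()),
        "changed": sorted(name for name in shared if previous[name] != current[name]),
    }


def plan_action(drift: dict, version_changed: bool) -> str:
    """Map the drift report onto the behaviors listed in the table above."""
    if any(drift.values()):
        # Drift found: write infra-drift-detected.yaml and block auto-promotion.
        return "emit-drift-report-and-block-promotion"
    if not version_changed:
        # Nothing changed: avoid emitting duplicate IaC files.
        return "skip-duplicate-emission"
    return "emit"


previous = {"vnet": {"cidr": "10.0.0.0/16"}, "aks": {"nodes": 3}}
current = {"vnet": {"cidr": "10.0.0.0/16"}, "aks": {"nodes": 5}, "vault": {}}
drift = detect_drift(previous, current)
print(drift)  # prints: {'added': ['vault'], 'removed': [], 'changed': ['aks']}
print(plan_action(drift, version_changed=True))
# prints: emit-drift-report-and-block-promotion
```

A real implementation would likely need a threshold or allow-list to decide which differences count as significant.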

🔁 Auto-Correction Behavior Matrix

| Input Error | Correction Applied |
| --- | --- |
| DNS missing .connectsoft.dev | Append suffix, log to changelog |
| Vault secret defined inline | Convert to key_ref and generate binding |
| AKS region undefined | Default to eastus2 |
| Missing identity-policy.yaml | Inject system-assigned MSI + Reader role |
| Missing tags | Inject default tags: owner, project, trace_id |

📒 Blocking Conditions

| Condition | Outcome |
| --- | --- |
| Invalid IaC syntax | Block artifact, emit InfraProvisioningFailed |
| Inline secrets with no fallback | Fail validation |
| Identity map missing required bindings | Fail validation |
| Drift detected with rollback disabled | Block publish, emit alert |
| Trace ID missing | Block all outputs until traceability is ensured |

📈 Observability Spans Emitted

| Span Name | Trigger |
| --- | --- |
| iac_validation_passed | All IaC outputs validated |
| corrections_applied | One or more auto-corrections were made |
| rollback_plan_generated | infra-rollback.yaml emitted successfully |
| infra_drift_detected | Infrastructure change from previous version |
| publish_blocked_due_to_drift | Promotion blocked, manual review required |

✅ Validation Outcome Summary

| Outcome | Action |
| --- | --- |
| ✅ Pass | Emit InfraProvisioningPlanPublished |
| ⚠️ Partial Pass (with auto-corrections) | Emit warning, mark trace |
| ❌ Fail | Emit InfraProvisioningFailed or block downstream publication |
| 🔁 Drift Detected | Emit changelog + infra-drift-detected.yaml, mark rollback required |

🤝 Agent Collaboration Overview

The Infrastructure Architect Agent plays a central role in the ConnectSoft AI Software Factory by producing foundational infrastructure on which all other services and agents depend.
It acts as both a consumer of architectural intent and a provider of deployable environments, identity scaffolding, and cloud boundaries to downstream agents.


🔼 Upstream Providers (Input Sources)

| Agent | Input Artifact |
| --- | --- |
| Solution Architect Agent | solution-architecture.md: defines zones, environments, regions |
| Application Architect Agent | application-architecture.md: lists logical services and zone groupings |
| Cloud Architecture Agent | resource-configuration.yaml: compute type, DNS, region, zones |
| Security Architect Agent | identity-policy.yaml: MSI, role bindings, secret access policies |
| Observability Agent | observability-policy.yaml: OTEL injection and span propagation |
| Data Architect Agent | field-retention-map.yaml: drives encryption, backup vault generation |

🔽 Downstream Consumers

| Agent | Consumes |
| --- | --- |
| DevOps Architect Agent | Pulls identity-map.yaml, infra.bicep, PulumiInfra.cs to enable secure pipelines and environments |
| Cloud Architecture Agent | Uses network-topology.yaml, infra-policy-map.yaml for region scaling, CDN planning |
| Security Architect Agent | Reads infra-policy-map.yaml and identity definitions for role enforcement |
| Observability Agent | Reuses otel-agent-config.yaml, emits spans from infra provisioning lifecycle |
| Rollback Executor Agent | Consumes infra-rollback.yaml when a downgrade or revert is required |
| Developer Portal Generator Agent | Displays topology (infra-topology.mmd) and status of infra per service/environment |

📦 Published Events

| Event | Trigger |
| --- | --- |
| InfraProvisioningPlanPublished | Emitted when generation completes successfully |
| InfraDriftDetected | Published when current state differs from previous-infra-state.json |
| InfraRollbackReady | Emitted when rollback file is prepared and safe to apply |
| InfraProvisioningFailed | Published if IaC generation or validation fails |

🔗 Artifact Linkage to Other Agents

| Artifact | Consumed By |
| --- | --- |
| infra.bicep, main.tf, PulumiInfra.cs | DevOps Agent, CI/CD Orchestrator |
| network-topology.yaml | API Gateway Configurator, Cloud Scaling Agent |
| identity-map.yaml | Security, DevOps, Vault Injector Agents |
| storage-definition.yaml | Backup, Data Migration, and Cost Monitor Agents |
| infra-topology.mmd | Developer Portal, Audit Dashboard |

🧠 Traceability and Coordination

All outputs contain:

metadata:
  trace_id: trace-infra-11889
  agent_version: 1.3.0
  service_name: NotificationService
  environment: Staging
  generated_on: 2025-05-02T02:10:00Z

These values are used to cross-reference deployments, rollback chains, and audit trails across DevOps, Security, and Monitoring agents.
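
One way to carry this metadata in generated IaC files is a comment header at the top of each file. The specification does not say how the values are embedded, so the Python sketch below is only an illustration. `metadata_header` is a hypothetical helper, and the prefix map relies on `//` being a line comment in Bicep and C# and `#` being one in Terraform.

```python
# Line-comment prefix per output format (Bicep and C# use //, Terraform uses #).
COMMENT_PREFIX = {"bicep": "//", "terraform": "#", "pulumi": "//"}


def metadata_header(metadata: dict, output_format: str) -> str:
    """Render trace metadata as a comment header for the given IaC format."""
    prefix = COMMENT_PREFIX[output_format]
    return "\n".join(f"{prefix} {key}: {value}" for key, value in metadata.items())


metadata = {
    "trace_id": "trace-infra-11889",
    "agent_version": "1.3.0",
    "service_name": "NotificationService",
    "environment": "Staging",
}
print(metadata_header(metadata, "terraform"))
# prints:
# # trace_id: trace-infra-11889
# # agent_version: 1.3.0
# # service_name: NotificationService
# # environment: Staging
```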


📑 Integration Flow

flowchart TD
    SolutionArchitect --> InfrastructureArchitect
    ApplicationArchitect --> InfrastructureArchitect
    CloudArchitect --> InfrastructureArchitect
    SecurityArchitect --> InfrastructureArchitect
    ObservabilityAgent --> InfrastructureArchitect
    DataArchitect --> InfrastructureArchitect

    InfrastructureArchitect --> DevOpsArchitect
    InfrastructureArchitect --> CloudArchitect
    InfrastructureArchitect --> RollbackExecutor
    InfrastructureArchitect --> DeveloperPortal
    InfrastructureArchitect --> ObservabilityAgent

✅ Collaboration Summary

| Agent | Relationship |
| --- | --- |
| 🔼 Architects & Planners | Provide policies, inputs, and intent |
| 🔽 Executors | Consume infra, deploy, monitor, rollback, or visualize |
| 🔁 Cyclical Feedback | Drift → security/governance validation → rollback or regenerate |
| 📈 Spans + Events | Provide observability hooks to DevOps + platform telemetry |

📑 Observability & Oversight Strategy

The Infrastructure Architect Agent is fully instrumented to support:

  • Real-time span emission
  • Lifecycle event publishing
  • Provisioning rollback awareness
  • Topology visualization
  • Audit traceability across clouds, tenants, and environments

These mechanisms ensure that every infrastructure generation cycle is observable, versioned, and governance-ready.


📈 OpenTelemetry Span Emission

| Span Name | Trigger |
| --- | --- |
| infra_inputs_parsed | After validating incoming prompt and configs |
| iac_generation_completed | When all IaC formats are successfully written |
| topology_rendered | When infra-topology.mmd is generated |
| rollback_plan_ready | When infra-rollback.yaml is complete |
| infra_lifecycle_event_emitted | When InfraProvisioningPlanPublished is dispatched |

Each span includes:

  • trace_id
  • agent_version
  • service_name
  • environment
  • duration_ms
  • status
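
The span fields listed above can be illustrated with a small timing wrapper. The Python sketch below is a plain standard-library stand-in, not the OpenTelemetry SDK. `record_span`, the `sink` list, and the `ok`/`error` status values are assumptions chosen for the example.

```python
import time
from contextlib import contextmanager


@contextmanager
def record_span(name, context, sink):
    """Time a generation step and append a span record with the documented fields."""
    start = time.perf_counter()
    span = {"name": name, **context, "status": "ok"}
    try:
        yield span
    except Exception:
        # Mark the span as failed, then let the error propagate.
        span["status"] = "error"
        raise
    finally:
        span["duration_ms"] = round((time.perf_counter() - start) * 1000, 3)
        sink.append(span)


context = {
    "trace_id": "trace-infra-11889",
    "agent_version": "1.3.0",
    "service_name": "NotificationService",
    "environment": "Staging",
}
spans = []
with record_span("infra_inputs_parsed", context, spans):
    pass  # parsing and validation of the prompt would happen here

print(spans[0]["name"], spans[0]["status"])  # prints: infra_inputs_parsed ok
```

Recording the span in the `finally` branch means a failed step still leaves a trace entry, which matches the intent of publishing InfraProvisioningFailed for failed runs.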


📒 Lifecycle Event Streams

| Event | Purpose |
| --- | --- |
| InfraProvisioningPlanPublished | Main lifecycle signal with full output metadata |
| InfraRollbackReady | Indicates safe rollback plan was generated |
| InfraDriftDetected | Signals divergence from previously known infra state |
| InfraProvisioningFailed | Indicates generation or validation failure (used by DevOps + Observability) |

📊 Dashboards & Audit Integration

| Dashboard Tile | Description |
| --- | --- |
| IaC Generation Timeline | Visual trace of infra generation per service |
| Drift Detection Status | Last time drift was seen and what changed |
| Resource Count & Cost Tags | Breakdown of provisioned units and billing labels |
| RBAC & MSI Audit | Which identities were generated or reused |
| Topology Viewer | Mermaid-rendered overlay of DNS, VNet, zones, vaults, and compute |

🧭 Rollback Awareness & Safety

If rollback_safe: true:

  • infra-rollback.yaml is always generated
  • Last successful output snapshot is stored
  • Changes are diffed against previous-infra-state.json
  • Downstream rollback agents and DevOps are notified

Rollback artifacts include:

rollback:
  to_version: 1.2.0
  generated_on: 2025-05-02T02:13:00Z
  trace_id: trace-infra-99321
  files: [infra.bicep, PulumiInfra.cs, identity-map.yaml]
  approved_by: system

📦 Final Output Bundle (Recap)

| Artifact | Format |
| --- | --- |
| infra.bicep / main.tf / PulumiInfra.cs | IaC files |
| identity-map.yaml / network-topology.yaml / storage-definition.yaml | Infra blueprints |
| infra-topology.mmd | Mermaid diagram |
| infra-metadata.json | Versioned metadata |
| infra-rollback.yaml | Rollback spec |
| infra-policy-map.yaml | Tag + policy map |
| otel-agent-config.yaml | Telemetry routing |
| InfraProvisioningPlanPublished | Lifecycle event |

✅ Agent Outcome Summary

| Capability | Delivered |
| --- | --- |
| 🔁 Multi-format IaC generation | ✅ Bicep, Terraform, Pulumi (C#) |
| 🧱 Modular infra blueprints | ✅ Per environment, region, tenant |
| 🔐 Secure RBAC and secrets | ✅ MSI, Vault, scoped identity maps |
| 🌐 DNS, VNet, AKS, Storage | ✅ All core infra defined |
| 📈 Observability spans | ✅ OTEL and span-ready outputs |
| ♻️ Drift + rollback coverage | ✅ With lifecycle events + changelogs |