Skip to content

๐ŸŒ Environment Manager Agent Specification

๐ŸŽฏ Purpose

The Environment Manager Agent owns the full environment provisioning lifecycle within the ConnectSoft AI Software Factory. It ensures that every target runtime environment โ€” dev, staging, UAT, and production โ€” is correctly provisioned, configured, validated, and healthy before any deployment proceeds.

It operates as the authoritative source of truth for environment state, bridging the gap between infrastructure definitions and live deployment targets.

It guarantees that no service is deployed into an unprepared, drifted, or unhealthy environment.


๐Ÿงญ Role in the Platform

The Environment Manager Agent sits between infrastructure provisioning and deployment orchestration, ensuring that environments are ready, consistent, and policy-compliant before workloads arrive.

Factory Layer Agent Role
Infrastructure Consumes provisioning outputs from Cloud Provisioner Agent
DevOps & Delivery Validates environment readiness before deployment proceeds
Configuration Coordinates with Configuration Manager Agent for env-specific vars
Observability Emits health reports and drift detection events
Security Enforces environment isolation policies and access boundaries

๐Ÿ“Š Position Diagram

flowchart TD

  subgraph Infrastructure
    A[Cloud Provisioner Agent]
    B[Infrastructure Architect Agent]
  end

  subgraph DevOps & Delivery
    C[Environment Manager Agent]
    D[Deployment Orchestrator Agent]
    E[DevOps Engineer Agent]
  end

  subgraph Configuration
    F[Configuration Manager Agent]
  end

  subgraph Observability
    G[Observability Engineer Agent]
  end

  B --> A
  A --> C
  F --> C
  C --> D
  C --> E
  C --> G
  D --> G
Hold "Alt" / "Option" to enable pan & zoom

The Environment Manager Agent ensures environments are provisioned, validated, and ready before the Deployment Orchestrator begins rollout.


๐Ÿ“‹ Triggering Events

The Environment Manager Agent is activated by the following events:

Event Source Description
project_initiated Orchestration Agent New project requires environment bootstrapping across all tiers
deployment_requested DevOps Engineer Agent Pre-deployment environment readiness check
environment_drift_detected Observability / Monitoring Environment configuration has diverged from desired state
environment_health_degraded Health Check Monitor Probes indicate unhealthy environment components
environment_promotion_needed Release Manager Agent Promotion from staging to production requires target validation

๐Ÿ“Œ Responsibilities

๐Ÿ”ง Core Responsibilities

โœ… 1. Environment Provisioning Lifecycle

  • Bootstrap new environments on project initiation
  • Coordinate with Cloud Provisioner Agent for Azure resource creation
  • Track environment state across dev, staging, UAT, production
  • Maintain environment registry with metadata and health status

โœ… 2. Parity Validation

  • Ensure configuration, resource topology, and policies are consistent across tiers
  • Detect and report parity drift between staging and production
  • Validate that environment overlays match expected baselines
parity_check:
  reference: staging
  target: production
  compare:
    - resource_topology
    - networking_rules
    - secret_mounts
    - scaling_policies
    - rbac_bindings

โœ… 3. Environment-Specific Configuration

  • Apply namespace, resource quota, and network policy overlays per environment
  • Coordinate with Configuration Manager Agent for runtime variables
  • Inject environment labels and metadata into Kubernetes namespaces

โœ… 4. Health Checks and Readiness

  • Run pre-deployment health probes on environment infrastructure
  • Validate DNS, ingress controllers, service mesh, and storage availability
  • Emit EnvironmentReady or EnvironmentUnhealthy events

โœ… 5. Drift Detection and Remediation

  • Continuously compare desired state against actual infrastructure state
  • Alert on unauthorized changes or configuration tampering
  • Trigger auto-remediation or escalation to HumanOpsAgent

โœ… 6. Environment Teardown and Recycling

  • Deprovision ephemeral environments (e.g., PR-based preview environments)
  • Archive environment metadata before teardown
  • Emit EnvironmentDeprovisioned event with trace linkage

๐Ÿ“Š Responsibilities and Deliverables

Responsibility Deliverable
Environment bootstrapping Provisioned namespace, quotas, network policies
Parity validation parity-report.json comparing environment tiers
Health monitoring environment-health-report.json with probe results
Configuration coordination environment-config.yaml with resolved variables
Drift detection drift-report.json with delta and remediation steps
Teardown EnvironmentDeprovisioned event and archived state

๐Ÿ“ค Output Types

Output Type Format Description
environment-config YAML Resolved environment configuration including namespaces, quotas
environment-health-report JSON Health probe results for all environment components
parity-report JSON Cross-environment comparison results
drift-report JSON Detected deviations from desired state

๐Ÿงพ Example environment-config Output

environment:
  name: staging
  namespace: cs-staging-orderservice
  cluster: aks-connectsoft-west
  resource_quotas:
    cpu: "8"
    memory: "16Gi"
    pods: "50"
  network_policy: restricted
  ingress_class: nginx
  tls_enabled: true
  secret_store: azure-keyvault-staging
  labels:
    trace_id: trace-env-4421
    managed_by: environment-manager-agent
    tier: staging

๐Ÿงพ Example environment-health-report Output

{
  "environment": "staging",
  "cluster": "aks-connectsoft-west",
  "trace_id": "trace-env-4421",
  "timestamp": "2025-06-10T14:22:00Z",
  "probes": [
    { "component": "ingress-controller", "status": "healthy", "latency_ms": 12 },
    { "component": "dns-resolution", "status": "healthy", "latency_ms": 3 },
    { "component": "keyvault-access", "status": "healthy", "latency_ms": 45 },
    { "component": "storage-class", "status": "healthy", "latency_ms": 8 }
  ],
  "overall_status": "healthy",
  "agent": "environment-manager-agent"
}

๐Ÿ”„ Process Flow

flowchart TD
    A[Trigger Received] --> B[Resolve Target Environment]
    B --> C[Load Environment Blueprint]
    C --> D[Provision or Validate Infrastructure]
    D --> E[Apply Configuration Overlays]
    E --> F[Run Health Probes]
    F --> G{All Probes Healthy?}
    G -- Yes --> H[Emit EnvironmentReady + Config]
    G -- No --> I{Retryable?}
    I -- Yes --> D
    I -- No --> J[Emit EnvironmentUnhealthy + Notify Ops]
Hold "Alt" / "Option" to enable pan & zoom

๐Ÿชœ Step-by-Step Breakdown

Step Action
1 Receive trigger event (project_initiated, deployment_requested, or drift_detected)
2 Resolve target environment from event context and trace metadata
3 Load environment blueprint defining expected topology, quotas, and policies
4 Provision new resources or validate existing infrastructure against blueprint
5 Apply environment-specific configuration overlays (namespace labels, network policies, secrets)
6 Execute health probes against DNS, ingress, storage, vault, and mesh components
7 If healthy: emit EnvironmentReady event with config artifacts
8 If unhealthy: retry provisioning or escalate to HumanOpsAgent

๐Ÿค Collaboration Patterns

๐Ÿ“ฅ Upstream Inputs From

Agent Input
Cloud Provisioner Agent Base infrastructure (AKS cluster, networking, storage)
Infrastructure Architect Agent Environment blueprints, topology definitions
Configuration Manager Agent Runtime configuration, secrets references, feature flags
Release Manager Agent Promotion requests requiring target environment validation

๐Ÿ“ค Downstream Consumers

Agent Output Consumed
Deployment Orchestrator Agent EnvironmentReady event, environment-config.yaml
DevOps Engineer Agent Environment metadata for pipeline generation
Observability Engineer Agent Health reports and drift alerts for dashboards
HumanOpsAgent Escalation on EnvironmentUnhealthy or unrecoverable drift

๐Ÿ” Event-Based Communication

Event Trigger Consumed By
EnvironmentReady Successful provisioning and health validation Deployment Orchestrator, DevOps Engineer
EnvironmentUnhealthy Failed health probes after retries HumanOpsAgent, Observability Agent
EnvironmentDriftDetected Desired vs actual state mismatch HumanOpsAgent, Configuration Manager
EnvironmentDeprovisioned Teardown of ephemeral environment Audit Agent, Release Manager

๐Ÿงฉ Collaboration Sequence

sequenceDiagram
    participant CloudProv as Cloud Provisioner Agent
    participant EnvMgr as Environment Manager Agent
    participant ConfigMgr as Configuration Manager Agent
    participant DeployOrch as Deployment Orchestrator Agent
    participant HumanOps as HumanOpsAgent

    CloudProv->>EnvMgr: Infrastructure Provisioned
    ConfigMgr->>EnvMgr: Environment Config Resolved
    EnvMgr->>EnvMgr: Validate + Health Check
    EnvMgr->>DeployOrch: Emit EnvironmentReady
    EnvMgr->>HumanOps: (On Failure) Emit EnvironmentUnhealthy
Hold "Alt" / "Option" to enable pan & zoom

๐Ÿง  Memory and Knowledge

๐Ÿ“Œ Short-Term Memory (Execution Scope)

Field Purpose
trace_id Links environment operations to originating blueprint
environment_name Target environment being managed
health_probe_results Current probe status during validation
provisioning_state Tracks whether provisioning is in-progress or complete

๐Ÿ’พ Long-Term Memory (Persistent)

Memory Type Purpose
Environment Registry Tracks all environments, their status, and last-known config
Parity Baseline Cache Stores reference configurations for cross-tier comparison
Drift History Log Records all detected drift events with timestamps and deltas
Health Probe History Trends health status over time for SLA tracking
Teardown Archive Preserves metadata of deprovisioned environments for audit

๐Ÿ“š Knowledge Base

Knowledge Area Description
Environment Blueprints Topology definitions per tier (dev/staging/UAT/prod)
Provisioning Templates Namespace, quota, network policy, and RBAC templates
Health Probe Definitions Standard probes for DNS, ingress, vault, storage, mesh
Parity Rules Which attributes must match across tiers
Drift Remediation Playbooks Auto-fix strategies for common drift scenarios

โœ… Validation

The Environment Manager Agent validates every managed environment against:

๐Ÿงช Validation Categories

Category Checks Performed
Infrastructure Readiness Cluster accessible, namespace exists, quotas applied
Network Connectivity DNS resolves, ingress controller responds, egress rules in place
Secret Store Access Azure Key Vault or Kubernetes secrets accessible from pods
Configuration Completeness All required env vars, config maps, and feature flags present
Parity Compliance Target environment matches reference tier within tolerance
Security Boundaries RBAC bindings, network policies, and pod security standards enforced

โŒ Failure Actions

Failure Type Action
Cluster unreachable Abort and emit EnvironmentUnhealthy
Missing namespace or quotas Auto-create if policy allows, else escalate
Secret store inaccessible Retry with backoff, then escalate to HumanOpsAgent
Parity drift beyond threshold Emit EnvironmentDriftDetected and block deployment
Health probe timeout Retry up to 3 times, then emit failure event

๐Ÿ“Š Validation Output Format

{
  "trace_id": "trace-env-4421",
  "environment": "staging",
  "validation_status": "passed",
  "checks": [
    { "category": "infrastructure", "status": "passed" },
    { "category": "networking", "status": "passed" },
    { "category": "secrets", "status": "passed" },
    { "category": "parity", "status": "passed", "reference": "production" }
  ],
  "agent": "environment-manager-agent",
  "timestamp": "2025-06-10T14:25:00Z"
}

๐Ÿงฉ Skills and Kernel Functions

Skill Purpose
EnvironmentProvisionerSkill Bootstrap namespaces, quotas, and network policies
ParityValidatorSkill Compare environment tiers and report deviations
HealthProbeRunnerSkill Execute health checks against infrastructure components
DriftDetectorSkill Compare desired vs actual state and emit drift reports
ConfigOverlayApplierSkill Merge environment-specific configuration overlays
EnvironmentTeardownSkill Safely deprovision ephemeral environments with archival
EventEmitterSkill Emit lifecycle events (EnvironmentReady, EnvironmentUnhealthy)
TraceMetadataInjectorSkill Attach trace_id and blueprint references to all outputs

๐Ÿ“ˆ Observability Hooks

Span Name Description
envmgr.provision.start Start of environment provisioning
envmgr.healthcheck.run Execution of health probes
envmgr.parity.validate Cross-tier parity comparison
envmgr.drift.detect Drift detection scan
envmgr.complete Successful environment readiness
envmgr.failed Environment provisioning or validation failure

Span Tags

  • trace_id, environment, cluster, agent: environment-manager-agent
  • status: ready | unhealthy | drifted
  • probe_count, drift_delta_count

๐Ÿง  Summary

The Environment Manager Agent is the gatekeeper of deployment readiness in the ConnectSoft AI Software Factory. It ensures that:

  • ๐ŸŒ Every environment is provisioned, validated, and healthy before workloads arrive
  • ๐Ÿ”„ Parity is enforced across tiers to prevent promotion surprises
  • ๐Ÿ” Drift is detected and remediated proactively
  • ๐Ÿ“Š Health status is continuously monitored and reported
  • ๐Ÿงฉ Configuration is coordinated with the Configuration Manager Agent

It transforms environment management from a manual, error-prone task into an autonomous, trace-linked, policy-driven operation โ€” ensuring that the platform's deployment foundation is always trustworthy and ready.