
Storage

Target Architecture — Final-State Design

The Agent Mesh follows a polyglot persistence strategy that matches each store to its access pattern: Azure SQL / PostgreSQL (via NHibernate) for durable definitions and execution records, Redis for hot execution and pool state, Azure Blob Storage for large prompt/response and artifact payloads, and Application Insights / OpenTelemetry (OTEL) for telemetry. Every store enforces the cross-cutting metadata schema, with tenantId as a first-class index for multi-tenant isolation.
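As a rough illustration of the cross-cutting metadata idea, the sketch below models a metadata envelope and a kind-to-store lookup. The field set is borrowed from the identifiers named under Persistence Principles (tenantId, traceId, correlationId, projectId, moduleId); the class name `RecordMetadata`, the `STORE_FOR_KIND` table, and the kind labels are assumptions made for illustration, not the platform's actual schema.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class RecordMetadata:
    """Hypothetical cross-cutting metadata attached to every stored record."""

    tenant_id: str
    trace_id: str
    correlation_id: str
    project_id: str
    module_id: str

    def __post_init__(self) -> None:
        # Reject missing or blank identifiers so no record is stored unscoped.
        missing = []
        for f in fields(self):
            value = getattr(self, f.name)
            if not isinstance(value, str) or not value.strip():
                missing.append(f.name)
        if missing:
            raise ValueError(f"missing metadata fields: {', '.join(missing)}")


# Hypothetical routing table: data kind -> store family (labels are assumptions).
STORE_FOR_KIND = {
    "definition": "relational",
    "execution_record": "relational",
    "hot_state": "redis",
    "payload": "blob",
    "telemetry": "otel",
}


def store_for(kind: str) -> str:
    """Return the store family for a data kind; unknown kinds raise KeyError."""
    return STORE_FOR_KIND[kind]
```

Validating the envelope at construction time means a record with a blank tenantId fails before it reaches any store, rather than being caught by a query later.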

Storage Map

flowchart TB
    subgraph Durable["Azure SQL / PostgreSQL (NHibernate)"]
        Defs["Agent & Skill definitions/versions"]
        Execs["Tasks, executions, validations, corrections"]
    end
    subgraph Hot["Redis"]
        State["Execution state + context cache"]
        Health["Pool health"]
    end
    subgraph Large["Azure Blob Storage"]
        Payloads["Prompts, responses, large artifacts"]
    end
    subgraph Telemetry["App Insights / OTEL"]
        Metrics["Metrics, traces, spans"]
    end
    Defs --> Execs
    Execs --> State
    Execs --> Payloads
    State --> Health
    Execs --> Metrics

Data Placement

| Data | Store | Owner Service | Access Pattern | Retention | Notes |
|---|---|---|---|---|---|
| AgentDefinition / AgentVersion | Azure SQL / PostgreSQL | AgentRegistryService | Read-heavy lookups by agentId/version | Indefinite (versioned, immutable versions) | Tenant-scoped; seeded from Platform/registry and Agents. |
| SkillDefinition / SkillVersion | Azure SQL / PostgreSQL | SkillRegistryService | Read-heavy lookups by skillId/version | Indefinite | Immutable versions; contracts queried at execution. |
| AgentTask | Azure SQL / PostgreSQL | AgentTaskService | Write on assign/transition; read by taskId/traceId | 18 months hot, then archive | Indexed by taskId, traceId, tenantId. |
| AgentExecution / SkillExecution | Azure SQL / PostgreSQL | AgentExecutionService | Append + status update | 18 months hot, then archive | Hot status mirrored to Redis. |
| ModelInvocation / ToolInvocation | Azure SQL / PostgreSQL (metadata) + Blob (payloads) | AgentTelemetryService | High-volume append | 90 days detailed, aggregates retained | Token/cost/latency in relational; prompt/response bodies in Blob. |
| ValidationResult / CorrectionAttempt | Azure SQL / PostgreSQL | AgentValidationService / AgentCorrectionService | Append; read on correction | 18 months | Feedback messages retained for improvement attribution. |
| Hot execution state | Redis | AgentRuntimeService | Low-latency read/write during a run | TTL (run + short grace) | Working state, step cursors, claimed-task locks. |
| Context package cache | Redis | AgentRuntimeService | Cache read by contextPackageId | ttlSeconds from package | Mirrors Knowledge context package freshness window. |
| AgentHealthStatus | Redis (hot) + relational (history) | AgentPoolManager | Frequent probe writes; read for scheduling | Hot: live; history: 90 days | Powers GET /agents/{agentId}/health. |
| Large prompt/response & artifact payloads | Azure Blob Storage | AgentTelemetryService / AgentExecutionService | Write-once, read on demand | Per classification policy | Referenced by ref from relational records; classified & optionally redacted. |
| Telemetry (metrics, traces, spans) | Application Insights / OTEL | AgentTelemetryService | Stream export | Per observability retention policy | Stitched on traceId; feeds Observability & Feedback. |
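The two Redis rows define their lifetimes differently: hot execution state lives for the run plus a short grace period, while a cached context package takes its TTL from the package's own ttlSeconds. A minimal sketch of both calculations follows. The key layout, the 300-second default grace, and all function names are assumptions for illustration; the source does not specify them.

```python
def hot_state_key(tenant_id: str, execution_id: str) -> str:
    """Build a tenant-prefixed key (hypothetical layout) for hot execution state."""
    if not tenant_id or not execution_id:
        raise ValueError("tenant_id and execution_id are required")
    return f"tenant:{tenant_id}:exec:{execution_id}:state"


def hot_state_ttl_seconds(max_run_seconds: int, grace_seconds: int = 300) -> int:
    """TTL for hot execution state: the run budget plus a short grace period.

    The 300-second default grace is an assumed value, not one from the design.
    """
    if max_run_seconds <= 0 or grace_seconds < 0:
        raise ValueError("max_run_seconds must be > 0 and grace_seconds >= 0")
    return max_run_seconds + grace_seconds


def context_cache_ttl_seconds(package: dict) -> int:
    """TTL for a cached context package, read from its own ttlSeconds field."""
    ttl = int(package["ttlSeconds"])
    if ttl <= 0:
        raise ValueError("ttlSeconds must be positive")
    return ttl
```

With a Redis client, the computed value would typically be passed as the expiry when the key is written, so abandoned runs clean themselves up without a sweeper job.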

Persistence Principles

  • Access-pattern fit: durable system-of-record in relational stores; ephemeral, latency-sensitive state in Redis; bulky bodies in Blob; signals in the telemetry backend.
  • Tenant isolation: tenantId participates in every index used for tenant-scoped queries; no cross-tenant reads.
  • Traceability: traceId, correlationId, projectId, and moduleId are indexed columns, never buried in blobs, so any artifact is correlatable end to end.
  • Classification: large payloads are classified; sensitive content may be redacted before storage per Knowledge governance and Security.
  • Reference, don't duplicate: relational records hold ref pointers into Blob rather than inlining large content.
  • Infrastructure as code: all stores are provisioned with Pulumi.
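The "reference, don't duplicate" and tenant-isolation principles can be sketched together: the relational row keeps small, indexable metadata plus a ref, and the prompt body goes to blob storage under a tenant-scoped path. Everything below is illustrative. `InMemoryBlobStore` stands in for Azure Blob Storage, and the row shape, path layout, and function names are assumptions rather than the actual data model.

```python
import hashlib
from dataclasses import dataclass


class InMemoryBlobStore:
    """Stand-in for blob storage: maps a path to bytes, write-once."""

    def __init__(self) -> None:
        self._blobs: dict[str, bytes] = {}

    def put(self, path: str, body: bytes) -> str:
        # Write-once semantics: refuse to overwrite an existing payload.
        if path in self._blobs:
            raise FileExistsError(path)
        self._blobs[path] = body
        return path

    def get(self, path: str) -> bytes:
        return self._blobs[path]


@dataclass(frozen=True)
class ModelInvocationRow:
    """Hypothetical relational row: indexed metadata plus a pointer to the body."""

    tenant_id: str
    trace_id: str
    total_tokens: int
    latency_ms: int
    prompt_ref: str  # path into blob storage, never the prompt text itself


def record_invocation(blobs: InMemoryBlobStore, tenant_id: str, trace_id: str,
                      prompt: str, total_tokens: int, latency_ms: int) -> ModelInvocationRow:
    """Store the prompt body as a blob and return a row that only references it."""
    body = prompt.encode("utf-8")
    digest = hashlib.sha256(body).hexdigest()
    # Tenant-scoped path (assumed layout) keeps payloads partitioned per tenant.
    path = f"{tenant_id}/{trace_id}/prompt-{digest[:16]}.txt"
    ref = blobs.put(path, body)
    return ModelInvocationRow(tenant_id, trace_id, total_tokens, latency_ms, ref)


def rows_for_tenant(rows: list[ModelInvocationRow], tenant_id: str) -> list[ModelInvocationRow]:
    """Tenant-scoped read: only rows belonging to the given tenant are returned."""
    return [row for row in rows if row.tenant_id == tenant_id]
```

Because the row carries tenantId and traceId as plain columns, it stays queryable and correlatable even if the referenced payload is later redacted or expired under its classification policy.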