Security

Target Architecture — Final-State Design

The Agent Mesh runs autonomous agents that read knowledge, call models, invoke tools, and produce code, which makes it a high-value attack surface. Its security model is least-privilege, tenant-isolated, fully audited, and policy-enforced: every agent acts only within an explicit permission scope, every model call obeys a tenant model policy, and every action is an auditable event. The mesh defers cross-cutting policy to the Governance, Security & Compliance Platform.

Authentication

  • Workload identity — every mesh service and every caller authenticates with an Azure AD / Entra managed identity; there are no shared static credentials between services.
  • Service-to-service — internal calls (runtime → registry, runtime → router) use mutual workload identity over the internal network; internal endpoints are not exposed through the public gateway.
  • External callers — Control Plane, Factory Studio, and operators reach public endpoints through the factory API gateway with token-based authentication.
  • Provider auth — model provider credentials (Azure OpenAI / OpenAI / Ollama) are held by the Integration Platform, never by individual agents.

Authorization

Authorization is layered and re-evaluated on every operation — registration never grants standing access.

| Layer | Enforced where | Checks |
|---|---|---|
| Tenant scope | All services | tenantId claim matches resource tenant. |
| Operation scope | API + handlers | Caller role (registrar, orchestrator, runtime, operator) permits the action. |
| Agent permission scope | ToolAdapterService, ModelRouterService, Knowledge | The acting AgentVersion's scope permits the tool, model policy, and classification. |
| Model policy | ModelRouterService | Provider/model selection complies with the task modelPolicyId and tenant routing. |
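
The first two layers can be sketched as a guard that runs on every operation. This is a minimal illustration under stated assumptions, not the mesh's actual implementation: the `Caller` type, the `authorize` function, and the action names are hypothetical; only the four role names come from the table.

```python
from dataclasses import dataclass

# Hypothetical role-to-action map. The role names come from the table above;
# the action names are illustrative assumptions.
ROLE_ACTIONS = {
    "registrar": {"agent.register"},
    "orchestrator": {"task.dispatch"},
    "runtime": {"tool.invoke", "model.invoke"},
    "operator": {"task.cancel", "agent.disable"},
}


@dataclass(frozen=True)
class Caller:
    tenant_id: str  # taken from the caller's tenantId claim
    role: str


class AuthorizationError(Exception):
    """Raised when any authorization layer denies the operation."""


def authorize(caller: Caller, resource_tenant_id: str, action: str) -> None:
    """Evaluate tenant scope, then operation scope, for a single operation."""
    # Layer 1: tenant scope. The claim must match the resource's tenant.
    if caller.tenant_id != resource_tenant_id:
        raise AuthorizationError("tenant scope: tenantId claim does not match resource tenant")
    # Layer 2: operation scope. Unknown roles get an empty action set (deny).
    if action not in ROLE_ACTIONS.get(caller.role, set()):
        raise AuthorizationError(
            f"operation scope: role {caller.role!r} may not perform {action!r}"
        )
```

The guard takes no cached decision as input, so a caller that was allowed a moment ago is checked again on the next call, consistent with the rule that registration never grants standing access.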

Agent permission scoping

Every AgentVersion carries a least-privilege PermissionScope (tools, model policy, knowledge classification ceiling, resource actions). The mesh re-checks scope at each tool and model invocation, so a compromised or misbehaving agent cannot exceed its grant. Scopes are authorized by Governance at registration and audited on use. See Agent Registry.
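
A per-invocation scope check might look like the sketch below. The four fields mirror the scope contents listed above, but the field names, the classification level names, and the `check_*` helpers are assumptions made for illustration.

```python
from dataclasses import dataclass

# Classification levels ordered from least to most sensitive.
# These level names are assumptions, not taken from the mesh.
CLASSIFICATION_ORDER = ["public", "internal", "confidential", "restricted"]


@dataclass(frozen=True)
class PermissionScope:
    tools: frozenset
    model_policy_id: str
    classification_ceiling: str
    resource_actions: frozenset


def check_tool(scope: PermissionScope, tool: str) -> None:
    """Deny any tool that is not explicitly listed in the scope."""
    if tool not in scope.tools:
        raise PermissionError(f"tool {tool!r} is outside the agent's scope")


def check_model_policy(scope: PermissionScope, model_policy_id: str) -> None:
    """Deny a model call made under a policy the scope does not name."""
    if model_policy_id != scope.model_policy_id:
        raise PermissionError(f"model policy {model_policy_id!r} is outside the agent's scope")


def check_classification(scope: PermissionScope, level: str) -> None:
    """Deny knowledge classified above the agent's ceiling."""
    ceiling = CLASSIFICATION_ORDER.index(scope.classification_ceiling)
    if CLASSIFICATION_ORDER.index(level) > ceiling:
        raise PermissionError(f"classification {level!r} exceeds ceiling")
```

Each helper is called at the point of use (tool call, model call, context delivery) rather than once at task start, which is what keeps a misbehaving agent inside its grant.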

Model policy enforcement

A modelPolicyId binds a task to allowed providers, models, regions, and cost ceilings. The ModelRouterService rejects any selection outside the policy, supports per-tenant model routing (e.g. a tenant restricted to in-region Azure OpenAI or to self-hosted Ollama for data-residency), and records the policy decision on every ModelInvocation.
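
As a rough sketch of that check, the function below evaluates a candidate selection against a policy and returns a decision record that could be attached to a ModelInvocation. The `ModelPolicy` and `Selection` shapes, their field names, and the decision format are assumptions; the source only states that a policy covers providers, models, regions, and cost ceilings.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelPolicy:
    policy_id: str
    allowed_providers: frozenset
    allowed_models: frozenset
    allowed_regions: frozenset
    max_cost_usd: float  # hypothetical per-call cost ceiling


@dataclass(frozen=True)
class Selection:
    provider: str
    model: str
    region: str
    estimated_cost_usd: float


def evaluate(policy: ModelPolicy, selection: Selection) -> dict:
    """Return a policy decision; the selection is allowed only with zero violations."""
    violations = []
    if selection.provider not in policy.allowed_providers:
        violations.append("provider")
    if selection.model not in policy.allowed_models:
        violations.append("model")
    if selection.region not in policy.allowed_regions:
        violations.append("region")
    if selection.estimated_cost_usd > policy.max_cost_usd:
        violations.append("cost")
    return {
        "modelPolicyId": policy.policy_id,
        "allowed": not violations,
        "violations": violations,
    }
```

Returning every violation rather than stopping at the first makes the recorded decision more useful for audit, at the cost of a slightly larger record.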

Tenant Isolation

  • tenantId is part of every primary/secondary index used for tenant-scoped queries; cross-tenant reads are rejected at the handler.
  • Agent pools, context caches (Redis), and Blob payload containers are partitioned per tenant.
  • Events carry tenantId and broker subscriptions are tenant-filtered via the cs-tenant-id application property.
  • Context packages are access-checked per tenant by Knowledge governance before delivery.
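
Two of the partitioning rules above can be illustrated with a short sketch: a cache key that always embeds the tenant, and a subscription filter on the `cs-tenant-id` application property. The key layout, the message shape (a dict with an `application_properties` field), and the function names are assumptions; a real broker would apply the filter server-side.

```python
def cache_key(tenant_id: str, key: str) -> str:
    """Build a tenant-partitioned cache key; a missing tenant is an error, never a default."""
    if not tenant_id:
        raise ValueError("tenantId is required for tenant-scoped cache keys")
    return f"tenant:{tenant_id}:{key}"


def matches_subscription(message: dict, tenant_id: str) -> bool:
    """True only when the message carries a cs-tenant-id equal to the subscriber's tenant."""
    props = message.get("application_properties", {})
    return props.get("cs-tenant-id") == tenant_id


def deliver(messages: list, tenant_id: str) -> list:
    """Keep only messages for this tenant; untagged messages are dropped (fail closed)."""
    return [m for m in messages if matches_subscription(m, tenant_id)]
```

Dropping messages that lack the property, instead of delivering them to everyone, is the conservative choice for this sketch: a tagging bug then causes a missed event rather than a cross-tenant leak.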

Secret Handling

  • Secrets and provider credentials live in a managed secret store (Azure Key Vault), referenced by identity — never embedded in agent definitions, skills, prompts, or logs.
  • Configuration wiring is handled by the configuration/secret pipeline; the mesh resolves secret references at runtime via managed identity.
  • Prompt and response payloads are classified before storage; secrets detected in payloads are redacted (see Threat Considerations).
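
Payload redaction before storage could be approximated with pattern matching, as in the sketch below. The two patterns (an `sk-` prefixed key and a bearer token) are illustrative assumptions only; the source does not say how the mesh detects secrets, and a production detector would need a much broader rule set.

```python
import re
from typing import Tuple

# Illustrative patterns only; these are assumptions, not the mesh's detection rules.
SECRET_PATTERNS = [
    re.compile(r"sk-[A-Za-z0-9]{20,}"),                     # API-key-like token
    re.compile(r"(?i)bearer\s+[A-Za-z0-9._~+/=-]{20,}"),    # bearer token in a header
]

REDACTED = "[REDACTED]"


def redact(payload: str) -> Tuple[str, int]:
    """Replace every pattern match with a marker and return the match count."""
    total = 0
    for pattern in SECRET_PATTERNS:
        payload, count = pattern.subn(REDACTED, payload)
        total += count
    return payload, total
```

Returning the match count lets the caller emit an audit event whenever a payload contained something that looked like a secret, without logging the secret itself.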

Audit

  • Every state transition and every model/tool invocation is an event in the canonical envelope, giving a complete, replayable audit trail keyed on traceId.
  • ModelInvocation and ToolInvocation records (with policy decisions) provide a per-call audit of what an agent did, with which model/tool, under which policy.
  • Audit events flow to the Observability & Feedback Platform and Governance for compliance reporting.
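
To make the trail concrete, the sketch below builds an audit event and reconstructs the trail for one trace. The canonical envelope's real schema is not given here, so every field other than `eventId`, `traceId`, and `tenantId` is an assumption, as are the function names and the example event types.

```python
import uuid
from datetime import datetime, timezone


def make_audit_event(event_type: str, tenant_id: str, trace_id: str, data: dict) -> dict:
    """Wrap an action in an envelope-like dict (hypothetical field layout)."""
    return {
        "eventId": str(uuid.uuid4()),
        "traceId": trace_id,
        "tenantId": tenant_id,
        "type": event_type,                                     # assumed field
        "occurredAt": datetime.now(timezone.utc).isoformat(),   # assumed field
        "data": data,                                           # assumed field
    }


def trail_for(events: list, trace_id: str) -> list:
    """Return the events of one trace in the order they were appended."""
    return [e for e in events if e["traceId"] == trace_id]


log = []
log.append(make_audit_event("ModelInvocation", "t1", "trace-a", {"modelPolicyId": "mp-1", "allowed": True}))
log.append(make_audit_event("ToolInvocation", "t1", "trace-b", {"tool": "git.clone"}))
log.append(make_audit_event("ToolInvocation", "t1", "trace-a", {"tool": "git.clone"}))
```

Calling `trail_for(log, "trace-a")` yields the model call followed by the tool call for that trace, each still carrying the policy decision recorded at the time.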

Threat Considerations

| Threat | Mitigation |
|---|---|
| Prompt injection via retrieved context or tool output | Context is governed and classification-filtered by Knowledge; tool outputs are treated as untrusted and validated; skills bound prompt construction. |
| Prompt / output safety | Inputs and outputs pass safety screening; validation includes policy rules; unsafe content fails validation and escalates. |
| Excessive agency (agent over-reaching) | Least-privilege PermissionScope re-checked per call; bounded correction attempts; non-idempotent tools run once. |
| Data exfiltration | Classification ceilings per agent; per-tenant model routing and data residency; redaction of sensitive payloads before storage. |
| Cost / token abuse | Token budgets in context packages; cost ceilings in model policy; telemetry-driven anomaly detection. |
| Cross-tenant leakage | tenantId guards on every store and subscription; partitioned caches and payload containers. |
| Poison / replay | Idempotency on eventId; dead-letter quarantine via the PoisonTaskWorker with full envelope preservation. |
| Model provider compromise | Provider credentials isolated in the Integration Platform; failover and policy-constrained provider selection. |
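
The poison / replay mitigation combines two mechanisms that are easy to show together: deduplication on `eventId` and quarantine after repeated failure. The class below is a simplified in-memory sketch, not the PoisonTaskWorker itself; the class name, the attempt limit, and the return values are assumptions.

```python
import copy


class IdempotentConsumer:
    """Process each eventId at most once; quarantine envelopes that keep failing."""

    def __init__(self, handler, max_attempts: int = 3):
        self.handler = handler
        self.max_attempts = max_attempts  # hypothetical retry limit
        self.seen = set()                 # eventIds already settled
        self.attempts = {}                # eventId -> failed attempt count
        self.dead_letter = []             # quarantined envelopes, stored whole

    def deliver(self, envelope: dict) -> str:
        event_id = envelope["eventId"]
        if event_id in self.seen:
            return "duplicate"            # replayed event: ignore, do not re-run
        try:
            self.handler(envelope)
        except Exception:
            count = self.attempts.get(event_id, 0) + 1
            self.attempts[event_id] = count
            if count >= self.max_attempts:
                # Preserve the full envelope so the failure can be investigated.
                self.dead_letter.append(copy.deepcopy(envelope))
                self.seen.add(event_id)
                return "dead-lettered"
            return "retry"
        self.seen.add(event_id)
        return "processed"
```

A durable deployment would keep the seen-set and attempt counts in a store shared by all consumers; the in-memory sets here only demonstrate the control flow.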