Bounded Contexts

Target Architecture — Final-State Design

This page describes the final-state bounded-context decomposition of the Observability & Feedback Platform. Each context owns its aggregate roots, services, and stores, and communicates with the others only through the canonical event envelope.

The Observability & Feedback Platform is decomposed into seven bounded contexts. Each owns a clear slice of the observability and feedback domain, persists its own aggregates, and integrates with the rest of the factory through events rather than shared state. This lets groups of contexts scale independently: the high-volume ingestion contexts (Tracing, Logs, Metrics & SLO) scale separately from the transactional contexts (Dashboards & Alerts, Incidents, Feedback & Quality, Cost).

Context Map

flowchart TB
    subgraph Ingestion["High-volume ingestion"]
        Tracing["Tracing<br/>TraceRecord, TelemetryCorrelation"]
        Logs["Logs<br/>LogRecordReference"]
        Metrics["Metrics &amp; SLO<br/>MetricSeries, SloDefinition"]
    end

    subgraph Reaction["Detection &amp; reaction"]
        DashAlerts["Dashboards &amp; Alerts<br/>DashboardDefinition, AlertRule"]
        Incidents["Incidents<br/>Incident"]
    end

    subgraph Learning["Learning &amp; economics"]
        FeedbackQuality["Feedback &amp; Quality<br/>FeedbackItem, QualityScore"]
        Cost["Cost<br/>CostSignal"]
    end

    Tracing -->|"correlated signals"| Metrics
    Logs -->|"log-derived metrics"| Metrics
    Metrics -->|"metric series"| DashAlerts
    Metrics -->|"SloBreached"| Incidents
    DashAlerts -->|"AlertTriggered"| Incidents
    Incidents -->|"IncidentResolved"| FeedbackQuality
    Tracing -->|"runtime signals"| FeedbackQuality
    Metrics -->|"usage"| Cost
    Cost -->|"CostAnomalyDetected"| FeedbackQuality
    FeedbackQuality -->|"FeedbackItemCreated"| KP["Knowledge Platform"]

Contexts

Tracing

Owns end-to-end distributed traces. Ingests OpenTelemetry spans from factory services, agents, and generated SaaS products and reconstructs the full causal path of a request by traceId. The TelemetryCorrelation aggregate stitches traces, logs, metrics, and feedback into one correlated view, making it the join point for the whole platform.
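
As an illustration of reconstructing a causal path by traceId, the sketch below groups spans by trace and walks them from the root span down. The `Span` fields and the `causal_path` helper are hypothetical names chosen for this example; they are not the platform's actual schema or API, and real OpenTelemetry spans carry many more attributes.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Span:
    # Hypothetical, minimal span shape used only for this sketch.
    trace_id: str
    span_id: str
    parent_span_id: Optional[str]  # None marks the root span
    name: str


def causal_path(spans: list[Span], trace_id: str) -> list[tuple[int, str]]:
    """Return (depth, span name) pairs for one trace, parents before children."""
    children: dict[Optional[str], list[Span]] = defaultdict(list)
    for span in spans:
        if span.trace_id == trace_id:
            children[span.parent_span_id].append(span)

    ordered: list[tuple[int, str]] = []

    def walk(parent_id: Optional[str], depth: int) -> None:
        # Spans whose parent never arrived are not reachable from the root
        # and are therefore left out of the result.
        for span in children.get(parent_id, []):
            ordered.append((depth, span.name))
            walk(span.span_id, depth + 1)

    walk(None, 0)
    return ordered
```

Spans can arrive in any order and interleaved with other traces; the grouping step is what makes traceId usable as the join key.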

Logs

Owns references to structured Serilog logs. The platform does not duplicate log bodies — Log Analytics is the system of record — but it maintains LogRecordReference aggregates that index and govern access to logs by traceId, tenant, project, and module so queries stay tenant-isolated and traceable.
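
A minimal sketch of the reference-only idea, assuming an in-memory index in place of the real store: the reference holds a pointer to the log record rather than its body, and every lookup is keyed by tenant. The class names, the `log_pointer` field, and its format are assumptions made for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class LogRecordReference:
    tenant_id: str
    project: str
    module: str
    trace_id: str
    log_pointer: str  # hypothetical opaque pointer to the record in Log Analytics


class LogReferenceIndex:
    """In-memory stand-in for the reference store; log bodies are never copied."""

    def __init__(self) -> None:
        self._by_tenant_trace = defaultdict(list)

    def add(self, ref: LogRecordReference) -> None:
        self._by_tenant_trace[(ref.tenant_id, ref.trace_id)].append(ref)

    def find(self, tenant_id: str, trace_id: str) -> list[LogRecordReference]:
        # The tenant is part of the key, so a lookup cannot cross tenants
        # even when two tenants happen to share a trace identifier.
        return list(self._by_tenant_trace.get((tenant_id, trace_id), []))
```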

Metrics & SLO

Owns aggregated time-series (MetricSeries) and service-level objectives (SloDefinition). Rolls raw counters and histograms into queryable series, tracks error budgets, and detects SLO breaches that feed alerting and incidents.
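
Error-budget tracking can be sketched as follows, assuming a simple request-based SLO (a target success ratio over a window). The field names and the breach rule here are illustrative assumptions; the platform's actual SloDefinition may be defined differently, for example over latency or time windows.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SloDefinition:
    name: str
    target: float  # e.g. 0.99 means 99% of requests in the window must succeed


def error_budget_remaining(slo: SloDefinition, total: int, failed: int) -> float:
    """Fraction of the error budget still unspent; negative once overspent."""
    if total == 0:
        return 1.0  # no traffic, nothing spent
    allowed_failures = (1.0 - slo.target) * total
    if allowed_failures == 0:
        # A 100% target leaves no budget: any failure overspends it.
        return 1.0 if failed == 0 else float("-inf")
    return 1.0 - failed / allowed_failures


def is_breached(slo: SloDefinition, total: int, failed: int) -> bool:
    """In this sketch an SLO is breached once the budget is overspent."""
    return error_budget_remaining(slo, total, failed) < 0
```

With a 99% target and 1,000 requests, roughly 10 failures are allowed, so 5 failures leave about half the budget and 20 failures would be the point at which an SloBreached event is raised.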

Dashboards & Alerts

Owns reusable, multi-tenant DashboardDefinition views and AlertRule definitions. Dashboards visualise metrics, traces, cost, and quality; alert rules evaluate metrics and signals and raise triggers that can open incidents.
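
One way an alert rule might evaluate a metric series is a threshold that must hold for several consecutive points. This is only a sketch: the `AlertRule` fields, the comparator set, and the `should_trigger` helper are assumptions for illustration, not the platform's rule model.

```python
import operator
from dataclasses import dataclass

_COMPARATORS = {
    ">": operator.gt,
    ">=": operator.ge,
    "<": operator.lt,
    "<=": operator.le,
}


@dataclass(frozen=True)
class AlertRule:
    rule_id: str
    tenant_id: str
    metric: str
    comparator: str     # one of the keys in _COMPARATORS
    threshold: float
    for_points: int = 1  # consecutive most-recent points that must violate


def should_trigger(rule: AlertRule, series: list[float]) -> bool:
    """True when the last `for_points` values all violate the threshold."""
    if rule.for_points < 1:
        raise ValueError("for_points must be at least 1")
    if len(series) < rule.for_points:
        return False  # not enough data to decide
    compare = _COMPARATORS[rule.comparator]
    return all(compare(v, rule.threshold) for v in series[-rule.for_points:])
```

Requiring several consecutive violations is a common way to avoid raising AlertTriggered on a single noisy data point.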

Incidents

Owns the Incident lifecycle — open, analyse, escalate, mitigate, resolve — with full trace lineage. Incidents are the platform's unit of operational reaction and the bridge between detection (alerts, SLO breaches) and learning (feedback).
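
The lifecycle above can be modelled as a small state machine. The state names follow the stages listed in the text, but the allowed transitions below are an assumption made for this sketch (for example, that escalation is optional and that resolution is terminal).

```python
from enum import Enum


class IncidentState(Enum):
    OPEN = "open"
    ANALYSING = "analysing"
    ESCALATED = "escalated"
    MITIGATED = "mitigated"
    RESOLVED = "resolved"


# Assumed transition table; the real lifecycle rules may differ.
ALLOWED_TRANSITIONS = {
    IncidentState.OPEN: {IncidentState.ANALYSING},
    IncidentState.ANALYSING: {IncidentState.ESCALATED, IncidentState.MITIGATED},
    IncidentState.ESCALATED: {IncidentState.MITIGATED},
    IncidentState.MITIGATED: {IncidentState.RESOLVED},
    IncidentState.RESOLVED: set(),
}


class Incident:
    def __init__(self, incident_id: str, tenant_id: str, trace_id: str) -> None:
        self.incident_id = incident_id
        self.tenant_id = tenant_id
        self.trace_id = trace_id  # keeps the incident linked to its trace lineage
        self.state = IncidentState.OPEN
        self.history = [IncidentState.OPEN]

    def transition(self, new_state: IncidentState) -> None:
        """Move to new_state, rejecting moves the table does not allow."""
        if new_state not in ALLOWED_TRANSITIONS[self.state]:
            raise ValueError(f"cannot go from {self.state.value} to {new_state.value}")
        self.state = new_state
        self.history.append(new_state)
```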

Feedback & Quality

Owns FeedbackItem and QualityScore aggregates. This is the learning context: it distils runtime signals, incidents, and human/agent feedback into durable feedback items, computes quality scores per project and artifact, and publishes them to the Knowledge Platform to close the improvement loop.

Cost

Owns CostSignal aggregates: per-tenant, per-project attribution of model, compute, and infrastructure cost, with anomaly detection. Provides the economic feedback the factory needs to keep generation cost-effective.
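
The source does not say which anomaly-detection method the Cost context uses. As one possible sketch, a z-score against a trailing window of per-tenant daily cost can flag unusual values; the function name, the window, and the threshold of 3 standard deviations are all assumptions.

```python
from statistics import mean, pstdev


def is_cost_anomaly(history: list[float], latest: float, threshold: float = 3.0) -> bool:
    """Flag `latest` when it lies more than `threshold` standard deviations
    from the mean of `history` (in either direction)."""
    if len(history) < 2:
        return False  # too little history to judge
    mu = mean(history)
    sigma = pstdev(history)
    if sigma == 0:
        # Perfectly flat history: any change at all is unusual.
        return latest != mu
    return abs(latest - mu) / sigma > threshold
```

A check like this would typically run per tenant and per project, so that one tenant's spike cannot be masked by another tenant's normal spend.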

Context-to-Aggregate Map

| Bounded Context | Aggregate Roots | Primary Store | Key Events | Scaling Profile |
| --- | --- | --- | --- | --- |
| Tracing | TraceRecord, TelemetryCorrelation | Application Insights | TraceRecorded | High-volume ingestion |
| Logs | LogRecordReference | Log Analytics | None (query-only) | High-volume ingestion |
| Metrics & SLO | MetricSeries, SloDefinition | Application Insights + Azure SQL | MetricAggregated, SloBreached | High-volume aggregation |
| Dashboards & Alerts | DashboardDefinition, AlertRule | Azure SQL / PostgreSQL | AlertTriggered | Transactional |
| Incidents | Incident | Azure SQL / PostgreSQL | IncidentOpened, IncidentResolved | Transactional |
| Feedback & Quality | FeedbackItem, QualityScore | Azure SQL / PostgreSQL + Blob | FeedbackItemCreated, QualityScoreComputed | Transactional |
| Cost | CostSignal | Azure SQL / PostgreSQL + Blob | CostAnomalyDetected | Transactional + batch |

Design Principles

  • Ingestion separated from reaction. Tracing, Logs, and Metrics & SLO absorb factory-wide telemetry volume; Dashboards & Alerts, Incidents, Feedback & Quality, and Cost run as transactional services that react to distilled signals. This separation is what lets the platform scale ingestion independently of the learning loop.
  • Correlation is a first-class concern. The TelemetryCorrelation aggregate, owned by the Tracing context, exists specifically to make traceId the universal join key across stores that are otherwise heterogeneous (Application Insights, Log Analytics, SQL).
  • Events are the only cross-context contract. Contexts never read each other's stores. All cross-context flow uses the canonical envelope (see Events).
  • Multi-tenant by construction. Every aggregate carries tenantId; it is an isolation boundary in every store, query, dashboard, and alert.
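
To make the last two principles concrete, the sketch below shows one possible shape for an event envelope and an in-process bus. The envelope fields other than tenantId and traceId, the `EventBus` class, and the handler wiring are assumptions for illustration; the canonical envelope itself is defined on the Events page.

```python
import uuid
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Any, Callable


@dataclass(frozen=True)
class EventEnvelope:
    # Hypothetical envelope shape; only tenant_id and trace_id come from the text.
    event_type: str
    tenant_id: str
    trace_id: str
    payload: dict[str, Any]
    event_id: str = field(default_factory=lambda: str(uuid.uuid4()))
    occurred_at: datetime = field(default_factory=lambda: datetime.now(timezone.utc))


class EventBus:
    """In-process stand-in for the messaging infrastructure."""

    def __init__(self) -> None:
        self._handlers: dict[str, list[Callable[[EventEnvelope], None]]] = {}

    def subscribe(self, event_type: str, handler: Callable[[EventEnvelope], None]) -> None:
        self._handlers.setdefault(event_type, []).append(handler)

    def publish(self, event: EventEnvelope) -> None:
        # Reject events without a tenant so isolation cannot be bypassed.
        if not event.tenant_id:
            raise ValueError("every event must carry a tenant_id")
        for handler in self._handlers.get(event.event_type, []):
            handler(event)
```

In this arrangement the Incidents context would subscribe to SloBreached and AlertTriggered and build its own state from the payload, rather than querying the Metrics & SLO store directly.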