
Aggregate Roots

Target Architecture — Final-State Design

This page describes the final-state aggregate roots of the Observability & Feedback Platform. Each aggregate enforces its own invariants, emits domain events through the canonical event envelope, and is persisted by a single owning service. Every aggregate carries tenantId as an isolation boundary.

The platform owns eleven aggregate roots across its seven bounded contexts. Aggregates are the consistency boundaries of the domain: each is loaded, mutated, and saved as a whole by exactly one service through a single repository.

TraceRecord

Purpose — A correlated, queryable projection of an end-to-end distributed trace, anchored by traceId, spanning factory services, agents, and generated SaaS.

Fields — traceId (identity), tenantId, projectId, moduleId, rootSpanName, startedAt, durationMs, status, dimensions (required telemetry dimensions), spanCount.

Entities — Span (spanId, parentSpanId, name, moduleId, durationMs, attributes), SpanEvent (timestamp, name, attributes).

Value Objects — TraceStatus (ok | error | unset), TelemetryDimensions (the eleven required dimensions), TimeRange.

Invariants — exactly one root span (null parentSpanId); every span shares the aggregate's traceId and tenantId; durationMs ≥ 0; spans form a valid parent/child tree.

Domain Events — TraceRecorded.

Repository — ITraceRecordRepository (query by traceId, tenant, time range). Read-optimised over Application Insights.

Persistence — Application Insights (span store); correlated projection cached for query. No relational mutation path.
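The TraceRecord invariants can be checked mechanically before a projection is accepted. The following is a minimal Python sketch, assuming spans arrive as a flat list; `Span` is trimmed to the fields the checks need, and `trace_violations` is a hypothetical helper name rather than part of the platform's API.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Span:
    span_id: str
    parent_span_id: Optional[str]  # None marks the root span
    trace_id: str
    tenant_id: str
    duration_ms: int


def trace_violations(trace_id: str, tenant_id: str, spans: list) -> list:
    """Return a list of invariant violations; an empty list means the trace is valid."""
    problems = []
    by_id = {s.span_id: s for s in spans}
    if len(by_id) != len(spans):
        problems.append("duplicate spanId")

    roots = [s for s in spans if s.parent_span_id is None]
    if len(roots) != 1:
        problems.append(f"expected exactly one root span, found {len(roots)}")

    for s in spans:
        if s.trace_id != trace_id or s.tenant_id != tenant_id:
            problems.append(f"span {s.span_id} has a foreign traceId or tenantId")
        if s.duration_ms < 0:
            problems.append(f"span {s.span_id} has negative durationMs")
        if s.parent_span_id is not None and s.parent_span_id not in by_id:
            problems.append(f"span {s.span_id} has unknown parent")

    # Tree check: walking up the parent chain must never revisit a span.
    for s in spans:
        seen = set()
        cur = s
        while cur is not None and cur.parent_span_id is not None:
            if cur.span_id in seen:
                problems.append(f"cycle through span {s.span_id}")
                break
            seen.add(cur.span_id)
            cur = by_id.get(cur.parent_span_id)
    return problems
```

Returning a list of violations rather than raising on the first one is a design choice of this sketch; it lets a caller report every problem in a malformed trace at once.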

LogRecordReference

Purpose — A governed, tenant-scoped index reference to structured Serilog log entries stored in Log Analytics — the platform does not duplicate log bodies.

Fields — referenceId, tenantId, projectId, moduleId, traceId, executionId, level, timestamp, logAnalyticsRef (workspace + query handle).

Entities — none (reference aggregate).

Value Objects — LogLevel (Verbose | Debug | Information | Warning | Error | Fatal), LogAnalyticsRef.

Invariants — every reference is tenant-scoped; traceId is required for correlation; references are immutable once created.

Domain Events — none (query-only context; emits no integration events).

Repository — ILogRecordReferenceRepository (resolve and authorize Log Analytics queries by tenant/trace).

Persistence — Log Analytics is the system of record; reference metadata in Azure SQL / PostgreSQL for governance and access control.
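A reference aggregate of this kind maps naturally onto an immutable record plus a tenant check at resolution time. The sketch below is in Python and is illustrative only: the field subset, the string form of `log_analytics_ref`, and the `authorize` function are assumptions, not the platform's actual types.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)  # frozen: references are immutable once created
class LogRecordReference:
    reference_id: str
    tenant_id: str
    trace_id: str
    level: str
    timestamp: datetime
    log_analytics_ref: str  # opaque workspace + query handle (assumed to be a string here)

    def __post_init__(self):
        if not self.tenant_id:
            raise ValueError("every reference must be tenant-scoped")
        if not self.trace_id:
            raise ValueError("traceId is required for correlation")


def authorize(ref: LogRecordReference, caller_tenant_id: str) -> str:
    """Hand out the Log Analytics handle only to the owning tenant."""
    if ref.tenant_id != caller_tenant_id:
        raise PermissionError("reference belongs to another tenant")
    return ref.log_analytics_ref
```

Because the dataclass is frozen, any attempt to assign to a field after construction raises an error, which is how this sketch expresses the immutability invariant.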

MetricSeries

Purpose — An aggregated time-series of a single metric, grouped by required dimensions, used by dashboards, alerts, SLOs, and anomaly detection.

Fields — seriesId, tenantId, metricName, unit, groupKey (dimension key), window, points (timestamp/value), aggregation (sum | avg | p50 | p95 | p99 | count).

Entities — MetricPoint (timestamp, value).

Value Objects — MetricKey (metric name + dimension key), AggregationKind, Window.

Invariants — points are ordered and non-overlapping within a window; aggregation is deterministic per (metricName, window, groupKey); values consistent with unit.

Domain Events — MetricAggregated.

Repository — IMetricSeriesRepository (upsert by deterministic key; query by metric/time range/group).

Persistence — Application Insights / metric store for points; series metadata in Azure SQL.
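An upsert by deterministic key needs a key that is stable regardless of the order in which dimensions are supplied. One way to sketch this in Python is to hash a canonical serialisation of (metricName, window, groupKey). Treating groupKey as a dict of string dimensions, including tenantId in the key, and using SHA-256 are all assumptions of this example.

```python
import hashlib
import json


def series_key(tenant_id: str, metric_name: str, window: str, group_key: dict) -> str:
    """Derive a stable series key; dimension order in group_key does not matter."""
    canonical = json.dumps(
        [tenant_id, metric_name, window, sorted(group_key.items())],
        separators=(",", ":"),
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def points_are_ordered(points: list) -> bool:
    """Points are (timestamp, value) pairs; timestamps must strictly increase."""
    timestamps = [ts for ts, _ in points]
    return all(a < b for a, b in zip(timestamps, timestamps[1:]))
```

Sorting the dimension pairs before serialising is what makes the key independent of insertion order, so two writers that aggregate the same series arrive at the same row.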

DashboardDefinition

Purpose — A reusable, multi-tenant dashboard composed of panels over metrics, traces, cost, and quality.

Fields — dashboardId, tenantId, name, scope (project/module), panels, version, createdAt, updatedAt.

Entities — Panel (type, metric/sloId, groupBy, layout).

Value Objects — DashboardScope, PanelType (timeseries | stat | table | trace | heatmap).

Invariants — unique name per tenant+scope; at least one panel; panel metric/SLO references must resolve within the tenant.

Domain Events — DashboardDefined, DashboardUpdated (internal).

Repository — IDashboardDefinitionRepository.

Persistence — Azure SQL / PostgreSQL (NHibernate); optionally projected to App Insights workbooks / Grafana.
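The three DashboardDefinition invariants can be expressed as a single validation pass. This Python sketch assumes the caller supplies the set of names already taken and the set of references that resolve; `dashboard_violations`, `taken_names`, and `known_refs` are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass

PANEL_TYPES = {"timeseries", "stat", "table", "trace", "heatmap"}


@dataclass(frozen=True)
class Panel:
    panel_type: str
    ref: str  # metric name or sloId the panel reads from


def dashboard_violations(tenant_id, scope, name, panels, taken_names, known_refs):
    """taken_names holds (tenantId, scope, name); known_refs holds (tenantId, ref)."""
    problems = []
    if (tenant_id, scope, name) in taken_names:
        problems.append("name already used in this tenant and scope")
    if not panels:
        problems.append("a dashboard needs at least one panel")
    for panel in panels:
        if panel.panel_type not in PANEL_TYPES:
            problems.append(f"unknown panel type {panel.panel_type}")
        # Resolution is keyed by tenant, so another tenant's metric never matches.
        if (tenant_id, panel.ref) not in known_refs:
            problems.append(f"panel reference {panel.ref} does not resolve within the tenant")
    return problems
```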

AlertRule

Purpose — A condition over metric series (or signals) that, when met, raises an alert trigger and optionally opens an incident.

Fields — alertRuleId, tenantId, name, scope, condition (metric, operator, threshold, forMinutes), severity, actions, enabled, version.

Entities — AlertAction (type, target).

Value Objects — AlertCondition, Severity (info | warning | high | critical), AlertScope.

Invariants — condition references a resolvable metric; forMinutes ≥ 0; a disabled rule never triggers; unique name per tenant+scope.

Domain Events — AlertTriggered.

Repository — IAlertRuleRepository.

Persistence — Azure SQL / PostgreSQL (NHibernate).
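To make the condition fields concrete, here is a minimal Python sketch of rule evaluation. It assumes one metric sample per minute and reads forMinutes as "the comparison must hold for every minute in the trailing window"; both are assumptions, since the page does not define the evaluation semantics. The operator symbols and `should_trigger` are likewise illustrative.

```python
import operator
from dataclasses import dataclass

OPERATORS = {">": operator.gt, ">=": operator.ge, "<": operator.lt, "<=": operator.le}


@dataclass(frozen=True)
class AlertCondition:
    metric: str
    op: str
    threshold: float
    for_minutes: int

    def __post_init__(self):
        if self.op not in OPERATORS:
            raise ValueError(f"unsupported operator {self.op}")
        if self.for_minutes < 0:
            raise ValueError("forMinutes must be >= 0")


def should_trigger(enabled: bool, condition: AlertCondition, points: list, now_minute: int) -> bool:
    """points is a list of (minute, value) pairs for condition.metric."""
    if not enabled:
        return False  # a disabled rule never triggers
    breach = OPERATORS[condition.op]
    by_minute = dict(points)
    for minute in range(now_minute - condition.for_minutes, now_minute + 1):
        value = by_minute.get(minute)
        # A missing sample counts as "not breaching" in this sketch.
        if value is None or not breach(value, condition.threshold):
            return False
    return True
```

Treating a missing sample as non-breaching avoids alerting on gaps in the data; a real evaluator might instead raise a separate "no data" signal.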

Incident

Purpose — The unit of operational reaction: a tracked problem with lifecycle, severity, trace lineage, and resolution, bridging detection and learning.

Fields — incidentId, tenantId, projectId, moduleId, title, severity, status, source (alert/SLO/trace/manual), traceId, openedAt, acknowledgedAt, resolvedAt, rootCause, dimensions.

Entities — IncidentEvent (timestamp, type, actor, note), EscalationStep (level, target, at).

Value Objects — IncidentStatus (open | acknowledged | investigating | mitigated | resolved), Severity, IncidentSource.

Invariants — status transitions follow the allowed lifecycle (see Workflows); resolvedAt requires rootCause; a resolved incident is immutable except for post-mortem notes; tenantId matches source.

Domain Events — IncidentOpened, IncidentResolved (plus internal IncidentAcknowledged, IncidentEscalated).

Repository — IIncidentRepository.

Persistence — Azure SQL / PostgreSQL (NHibernate); event log appended per transition.
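The lifecycle invariants lend themselves to a small state machine. The Python sketch below is illustrative only: the allowed-transition table is an assumption made for the example (the authoritative lifecycle is defined on the Workflows page), and the event tuple is a simplified stand-in for IncidentEvent.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import Optional

# ASSUMPTION: a plausible transition table, not the one from the Workflows page.
ALLOWED = {
    "open": {"acknowledged"},
    "acknowledged": {"investigating"},
    "investigating": {"mitigated", "resolved"},
    "mitigated": {"resolved"},
    "resolved": set(),  # terminal: a resolved incident accepts no further transitions
}


@dataclass
class Incident:
    incident_id: str
    tenant_id: str
    status: str = "open"
    root_cause: Optional[str] = None
    resolved_at: Optional[datetime] = None
    events: list = field(default_factory=list)

    def transition(self, to: str, actor: str, at: datetime,
                   root_cause: Optional[str] = None) -> None:
        if to not in ALLOWED.get(self.status, set()):
            raise ValueError(f"illegal transition {self.status} -> {to}")
        if to == "resolved":
            if not root_cause:
                raise ValueError("resolvedAt requires rootCause")
            self.root_cause = root_cause
            self.resolved_at = at
        # One event is appended per transition, mirroring the persisted event log.
        self.events.append((at, f"{self.status}->{to}", actor))
        self.status = to
```

All checks run before any field is mutated, so a rejected transition leaves the aggregate unchanged.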

FeedbackItem

Purpose — A durable feedback record distilled from a runtime signal, incident, human, or agent — the platform's primary contribution to the improvement loop.

Fields — feedbackItemId, tenantId, projectId, artifactId, source, sourceId, category, sentiment, summary, detail, traceId, status, createdAt.

Entities — FeedbackAttachment (blob ref, kind).

Value Objects — FeedbackSource (incident | runtime-signal | human | agent), FeedbackCategory (reliability | performance | cost | maintainability | correctness), Sentiment (positive | neutral | negative), FeedbackStatus (captured | routed | applied).

Invariants — summary is required; sourceId required when source is incident/runtime-signal; artifactId or projectId present for attribution; immutable summary once routed.

Domain Events — FeedbackItemCreated.

Repository — IFeedbackItemRepository.

Persistence — Azure SQL / PostgreSQL (metadata) + Azure Blob (attachments, exports).
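The FeedbackItem invariants split into construction-time checks and a guard on later edits. A minimal Python sketch follows, with a reduced field set; the `revise_summary` and `route` method names are hypothetical, chosen only to show where each invariant would be enforced.

```python
from dataclasses import dataclass
from typing import Optional

SOURCES_NEEDING_ID = {"incident", "runtime-signal"}


@dataclass
class FeedbackItem:
    feedback_item_id: str
    tenant_id: str
    source: str
    summary: str
    source_id: Optional[str] = None
    project_id: Optional[str] = None
    artifact_id: Optional[str] = None
    status: str = "captured"

    def __post_init__(self):
        if not self.summary.strip():
            raise ValueError("summary is required")
        if self.source in SOURCES_NEEDING_ID and not self.source_id:
            raise ValueError(f"sourceId is required when source is {self.source}")
        if not (self.artifact_id or self.project_id):
            raise ValueError("artifactId or projectId is required for attribution")

    def revise_summary(self, text: str) -> None:
        if self.status != "captured":
            raise ValueError("summary is immutable once routed")
        if not text.strip():
            raise ValueError("summary is required")
        self.summary = text

    def route(self) -> None:
        if self.status != "captured":
            raise ValueError(f"cannot route from status {self.status}")
        self.status = "routed"
```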

QualityScore

Purpose — A computed, multi-dimensional quality score for a project (and its artifacts), derived from feedback, incidents, and SLO adherence.

Fields — qualityScoreId, tenantId, projectId, computedAt, overall, dimensions (reliability, performance, cost_efficiency, maintainability, correctness), artifactScores, window.

Entities — ArtifactScore (artifactId, score, openFeedbackCount).

Value Objects — ScoreVector (dimension → 0..1), Window.

Invariants — all scores in [0,1]; overall is a deterministic function of dimensions; recomputation for the same (project, window) is idempotent.

Domain Events — QualityScoreComputed.

Repository — IQualityScoreRepository.

Persistence — Azure SQL / PostgreSQL (NHibernate).
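The page states that overall is a deterministic function of the dimension scores but does not give the function. The Python sketch below uses an equal-weight mean purely as a placeholder, and shows idempotent recomputation by keying stored scores on (tenant, project, window). `overall_score` and `QualityScoreStore` are hypothetical names.

```python
DIMENSIONS = ("reliability", "performance", "cost_efficiency", "maintainability", "correctness")


def overall_score(dimensions: dict) -> float:
    """ASSUMPTION: equal-weight mean; the real weighting is not specified here."""
    missing = [name for name in DIMENSIONS if name not in dimensions]
    if missing:
        raise ValueError(f"missing dimensions: {missing}")
    for name in DIMENSIONS:
        if not 0.0 <= dimensions[name] <= 1.0:
            raise ValueError(f"{name} must be in [0, 1]")
    # Summing in a fixed order keeps the floating-point result reproducible.
    return sum(dimensions[name] for name in DIMENSIONS) / len(DIMENSIONS)


class QualityScoreStore:
    """In-memory stand-in for a repository keyed by (tenant, project, window)."""

    def __init__(self):
        self._rows = {}

    def save(self, tenant_id: str, project_id: str, window: str, dimensions: dict) -> dict:
        key = (tenant_id, project_id, window)
        # Overwriting by key makes recomputation for the same window idempotent.
        self._rows[key] = {"overall": overall_score(dimensions), "dimensions": dict(dimensions)}
        return self._rows[key]

    def __len__(self):
        return len(self._rows)
```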

CostSignal

Purpose — Attribution of model, compute, and infrastructure cost to a tenant/project, with anomaly detection feeding the economic side of the loop.

Fields — costSignalId, tenantId, projectId, period, currency, total, breakdown (category → amount), anomalies, computedAt.

Entities — CostBreakdownLine (category, amount), CostAnomaly (category, detectedAt, deltaPct, baseline).

Value Objects — Money (amount + currency), CostCategory (model_inference | compute | storage | network), Period.

Invariants — total equals the sum of breakdown lines; amounts ≥ 0; anomalies reference an existing category; deterministic per (project, period).

Domain Events — CostAnomalyDetected.

Repository — ICostSignalRepository.

Persistence — Azure SQL / PostgreSQL (metadata) + Azure Blob (detailed cost exports).
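The sum invariant and the deltaPct field can be illustrated with exact decimal arithmetic. In this Python sketch, deltaPct is taken to mean the percentage change against the baseline, and the 50 percent threshold is an arbitrary value chosen for the example; neither is specified on this page.

```python
from decimal import Decimal


def cost_violations(total: Decimal, breakdown: dict) -> list:
    """breakdown maps category -> Decimal amount."""
    problems = []
    for category, amount in breakdown.items():
        if amount < 0:
            problems.append(f"negative amount for {category}")
    if sum(breakdown.values(), Decimal("0")) != total:
        problems.append("total does not equal the sum of breakdown lines")
    return problems


def delta_pct(amount: Decimal, baseline: Decimal) -> Decimal:
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (amount - baseline) / baseline * 100


def detect_anomalies(breakdown: dict, baselines: dict,
                     threshold_pct: Decimal = Decimal("50")) -> list:
    """Return (category, deltaPct) pairs; iterating sorted categories keeps output deterministic."""
    anomalies = []
    for category in sorted(breakdown):
        baseline = baselines.get(category)
        if baseline is None or baseline <= 0:
            continue  # no usable baseline, nothing to compare against
        change = delta_pct(breakdown[category], baseline)
        if abs(change) >= threshold_pct:
            anomalies.append((category, change))
    return anomalies
```

Using `Decimal` rather than floats means the equality check on the total is exact; only categories present in the breakdown can be reported, so every anomaly references an existing category.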

SloDefinition

Purpose — A service-level objective with target, window, and error budget, evaluated continuously to detect breaches.

Fields — sloId, tenantId, projectId, name, indicator (SLI metric), target (e.g. 99.9), window (rolling), errorBudget, budgetRemaining, status.

Entities — BudgetBurnEvent (timestamp, burnedPct).

Value Objects — Sli (metric + good/total definition), SloTarget, SloStatus (healthy | at-risk | breached).

Invariants — target in (0,100]; budgetRemaining in [0, errorBudget]; status breached requires budgetRemaining = 0; window is positive.

Domain Events — SloBreached.

Repository — ISloDefinitionRepository.

Persistence — Azure SQL / PostgreSQL (NHibernate); budget burn computed from MetricSeries.
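A worked example helps relate target, errorBudget, and budgetRemaining. This Python sketch assumes an event-based SLI (good events over total events) and expresses the budget as the number of bad events allowed in the window. The at-risk cut-off of 25 percent of budget remaining is an assumption made for illustration, as is the `evaluate_slo` name.

```python
from decimal import Decimal

AT_RISK_FRACTION = Decimal("0.25")  # ASSUMPTION: at-risk below 25% of budget remaining


def evaluate_slo(target_pct: Decimal, good: int, total: int):
    """Return (error_budget, budget_remaining, status) for one window."""
    if not (Decimal(0) < target_pct <= Decimal(100)):
        raise ValueError("target must be in (0, 100]")
    if not (0 <= good <= total):
        raise ValueError("good must be between 0 and total")

    # Budget = bad events the target tolerates, e.g. 0.1% of total for a 99.9 target.
    error_budget = Decimal(total) * (Decimal(100) - target_pct) / Decimal(100)
    bad = Decimal(total - good)
    # Clamp so that budget_remaining always stays within [0, error_budget].
    remaining = min(max(error_budget - bad, Decimal(0)), error_budget)

    if bad > 0 and remaining == 0:
        status = "breached"
    elif remaining < error_budget * AT_RISK_FRACTION:
        status = "at-risk"
    else:
        status = "healthy"
    return error_budget, remaining, status
```

For a 99.9 target over 10,000 events the budget is 10 bad events: 5 bad events leave half the budget and the SLO stays healthy, 9 leave a tenth and it becomes at-risk, and 10 exhaust it.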

TelemetryCorrelation

Purpose — The join aggregate that stitches traces, logs, metrics, incidents, and feedback into a single correlated view per traceId — the universal correlation key of the platform.

Fields — traceId (identity), tenantId, projectId, dimensions, traceRef, logRefs, metricRefs, incidentRefs, feedbackRefs, updatedAt.

Entities — CorrelationLink (kind, targetId, store).

Value Objects — TelemetryDimensions, CorrelationKind (trace | log | metric | incident | feedback).

Invariants — exactly one correlation per traceId per tenant (upsert); all linked refs share the same tenantId; links reference resolvable records.

Domain Events — emits internal correlation snapshots; no public integration event.

Repository — ITelemetryCorrelationRepository (upsert by traceId).

Persistence — Application Insights (trace anchor) + Azure SQL (link index).
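The upsert semantics can be sketched with an in-memory repository keyed by (tenantId, traceId). This Python example is an illustration, not the platform's repository: `InMemoryCorrelationRepository` is a hypothetical class, and the `tenant_id` field on the link is added here only so the cross-tenant check has something to compare.

```python
from dataclasses import dataclass, field

KINDS = {"trace", "log", "metric", "incident", "feedback"}


@dataclass(frozen=True)
class CorrelationLink:
    kind: str
    target_id: str
    store: str
    tenant_id: str  # ASSUMPTION: carried on the link to check tenant isolation


@dataclass
class TelemetryCorrelation:
    trace_id: str
    tenant_id: str
    links: set = field(default_factory=set)


class InMemoryCorrelationRepository:
    def __init__(self):
        self._by_key = {}

    def upsert(self, tenant_id: str, trace_id: str, link: CorrelationLink) -> TelemetryCorrelation:
        if link.kind not in KINDS:
            raise ValueError(f"unknown correlation kind {link.kind}")
        if link.tenant_id != tenant_id:
            raise ValueError("linked record belongs to another tenant")
        key = (tenant_id, trace_id)
        # Create on first sight, extend afterwards: one correlation per traceId per tenant.
        correlation = self._by_key.setdefault(key, TelemetryCorrelation(trace_id, tenant_id))
        correlation.links.add(link)  # links are hashable, so repeats collapse
        return correlation

    def count(self) -> int:
        return len(self._by_key)
```

Storing links in a set makes repeated upserts of the same link a no-op, which keeps replayed telemetry from inflating the correlated view.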

Aggregate Summary

| Aggregate Root | Bounded Context | Owning Service | Key Events | Primary Store |
|---|---|---|---|---|
| TraceRecord | Tracing | TraceService | TraceRecorded | Application Insights |
| LogRecordReference | Logs | LogQueryService | none | Log Analytics |
| MetricSeries | Metrics & SLO | MetricAggregationService | MetricAggregated | App Insights + SQL |
| DashboardDefinition | Dashboards & Alerts | DashboardService | DashboardDefined | Azure SQL / PostgreSQL |
| AlertRule | Dashboards & Alerts | AlertRuleService | AlertTriggered | Azure SQL / PostgreSQL |
| Incident | Incidents | IncidentService | IncidentOpened, IncidentResolved | Azure SQL / PostgreSQL |
| FeedbackItem | Feedback & Quality | FeedbackService | FeedbackItemCreated | Azure SQL + Blob |
| QualityScore | Feedback & Quality | QualityScoreService | QualityScoreComputed | Azure SQL / PostgreSQL |
| CostSignal | Cost | CostTelemetryService | CostAnomalyDetected | Azure SQL + Blob |
| SloDefinition | Metrics & SLO | SloService | SloBreached | Azure SQL / PostgreSQL |
| TelemetryCorrelation | Tracing | TelemetryCorrelationService | (internal) | App Insights + SQL |