Generated SaaS Observability

Target Architecture — Final-State Design

This page describes the observability architecture generated into every SaaS Product. Instrumentation is generated from the standard stack — Serilog for structured logs, OpenTelemetry for metrics and traces, and Azure Application Insights as the backend — and dashboards/alerts are generated as artifacts. The product's signals feed the factory's Observability & Feedback platform.

Every generated product is observable by construction. Each service and worker emits structured logs, metrics, and distributed traces, all correlated by the traceId/correlationId carried in the canonical event envelope. Generated dashboards and alerts make the running product legible to operators, and the same signals flow back to the factory to close the feedback loop.

Logs

Structured logging via Serilog with a consistent JSON schema; no unstructured string logs.
Log context carries the required dimensions on every entry (tenantId, traceId, correlationId, service, operation).
Sinks — Application Insights / Azure Monitor in production; console in development. PII is redacted at the sink per data-classification rules.
Correlation — log scopes enrich entries with the ambient traceId so logs join traces and metrics.
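
The logging rules above can be illustrated with a minimal, language-neutral sketch. It uses only the Python standard library and is not the Serilog configuration that the factory generates. The e-mail pattern standing in for a data-classification rule, the `log_context` variable, and the sample values are all assumptions made for illustration.

```python
import contextvars
import io
import json
import logging
import re

# Ambient log context; the keys mirror the log-context dimensions listed above.
log_context = contextvars.ContextVar("log_context", default={})

# Hypothetical PII rule: treat e-mail addresses as redactable.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")


class JsonFormatter(logging.Formatter):
    """Render each record as one JSON object with the ambient context merged in."""

    def format(self, record):
        entry = {
            "level": record.levelname,
            # Redaction happens on the way out, i.e. on the sink side.
            "message": EMAIL.sub("[REDACTED]", record.getMessage()),
            **log_context.get(),
        }
        return json.dumps(entry, sort_keys=True)


def build_logger(stream):
    """Create a logger whose only sink writes JSON lines to `stream`."""
    logger = logging.getLogger("product")
    logger.handlers.clear()
    handler = logging.StreamHandler(stream)
    handler.setFormatter(JsonFormatter())
    logger.addHandler(handler)
    logger.setLevel(logging.INFO)
    logger.propagate = False
    return logger


def log_demo():
    """Emit one entry inside a scope and return it as a parsed dict."""
    stream = io.StringIO()
    logger = build_logger(stream)
    token = log_context.set({
        "tenantId": "t-1", "traceId": "abc", "correlationId": "c-9",
        "service": "billing", "operation": "CreateInvoice",
    })
    try:
        logger.info("invoice sent to %s", "jane@example.com")
    finally:
        log_context.reset(token)  # leave the scope
    return json.loads(stream.getvalue())
```

Calling `log_demo()` returns a single structured entry that carries the scope's dimensions and a redacted message. The point of the scope pattern is that call sites never pass `traceId` or `tenantId` by hand.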

Metrics

| Metric category | Examples |
| --- | --- |
| Request | request rate, latency (p50/p95/p99), error rate per route + tenant |
| Messaging | events published/consumed, consumer lag, retry count, dead-letter count |
| Worker | messages processed, processing duration, failures per worker |
| Domain | tenants provisioned, subscriptions activated, notifications sent, reports generated |
| Resource | CPU, memory, connection-pool usage, cache hit ratio |
| Business | active tenants, active users, edition distribution, metered usage |

Metrics are emitted via OpenTelemetry instruments and tagged with the required dimensions below.
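
To make the Request category concrete, here is a small in-memory sketch of how latency percentiles and error rate per route and tenant could be derived from raw samples. It is only an illustration: in the generated product these values would come from OpenTelemetry instruments and be aggregated by the backend. The `RequestMetrics` class and the nearest-rank percentile method are assumptions, not part of the source design.

```python
import math
from collections import defaultdict


def percentile(sorted_values, p):
    """Nearest-rank percentile: smallest value with at least p% of samples at or below it."""
    rank = max(1, math.ceil(p * len(sorted_values) / 100))
    return sorted_values[rank - 1]


class RequestMetrics:
    """Accumulates request samples keyed by (route, tenantId)."""

    def __init__(self):
        self._samples = defaultdict(list)  # key -> [(latency_ms, is_error)]

    def record(self, route, tenant_id, latency_ms, is_error=False):
        self._samples[(route, tenant_id)].append((latency_ms, is_error))

    def summary(self, route, tenant_id):
        samples = self._samples.get((route, tenant_id))
        if not samples:
            raise ValueError("no samples recorded for this route and tenant")
        latencies = sorted(latency for latency, _ in samples)
        errors = sum(1 for _, is_error in samples if is_error)
        return {
            "count": len(samples),
            "error_rate": errors / len(samples),
            "p50": percentile(latencies, 50),
            "p95": percentile(latencies, 95),
            "p99": percentile(latencies, 99),
        }
```

Keying the samples by `(route, tenant_id)` is what allows the per-tenant slicing that the dashboards and alerts rely on.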

Traces

Distributed tracing via OpenTelemetry spans across the gateway, services, message consumers, and external calls.
Trace propagation — traceId/correlationId flow from the inbound request through synchronous hops and into published events, so an async side effect (e.g. a notification) shares the originating trace.
Span attributes include tenantId, eventType, operation, and outcome.
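
The propagation rule above (identifiers flow from the inbound request into published events, so asynchronous consumers join the same trace) can be sketched as follows. This is a simplified model that uses only the Python standard library rather than the OpenTelemetry SDK. The function names, the `payload` field, and the envelope layout beyond `traceId`, `correlationId`, `eventType`, and `tenantId` are assumptions.

```python
import contextvars
import uuid

# Ambient trace context for the current unit of work.
trace_context = contextvars.ContextVar("trace_context", default=None)


def begin_request(trace_id=None, correlation_id=None):
    """Start a trace at the inbound edge, or continue one supplied by the caller."""
    ctx = {
        "traceId": trace_id or uuid.uuid4().hex,
        "correlationId": correlation_id or uuid.uuid4().hex,
    }
    trace_context.set(ctx)
    return ctx


def publish(event_type, payload, tenant_id):
    """Wrap a payload in an envelope that carries the ambient trace identifiers."""
    ctx = trace_context.get()
    if ctx is None:
        raise RuntimeError("publish() called outside a traced request")
    return {
        "eventType": event_type,
        "tenantId": tenant_id,
        "traceId": ctx["traceId"],
        "correlationId": ctx["correlationId"],
        "payload": payload,
    }


def consume(envelope, handler):
    """Restore the trace context from the envelope before running the handler."""
    token = trace_context.set({
        "traceId": envelope["traceId"],
        "correlationId": envelope["correlationId"],
    })
    try:
        return handler(envelope["payload"])
    finally:
        trace_context.reset(token)  # do not leak context into the next message
```

Because the consumer restores its context from the envelope rather than from process state, a worker running much later, or on another host, still reports under the originating `traceId`.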

Required dimensions

Every signal (log, metric, trace) carries these dimensions so the factory can correlate and slice consistently:

| Dimension | Source | Purpose |
| --- | --- | --- |
| `tenantId` | token / envelope | Per-tenant slicing and isolation analysis |
| `traceId` | envelope / OTEL | End-to-end correlation |
| `correlationId` | envelope | Workflow/saga correlation |
| `service` / `moduleId` | runtime | Service attribution |
| `eventType` | envelope | Event-level analysis |
| `environment` | runtime | Env separation (dev/staging/prod) |
| `version` | build | Release correlation |
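
A conformance check for the table above might look like the sketch below, for example as part of a test that inspects emitted signals. Treating `service` and `moduleId` as alternatives for one dimension is an assumption about how the table should be read, and `missing_dimensions` is a hypothetical helper name.

```python
# Each entry lists the acceptable keys for one required dimension.
# Accepting either `service` or `moduleId` is an assumption, not a stated rule.
REQUIRED_DIMENSIONS = (
    ("tenantId",),
    ("traceId",),
    ("correlationId",),
    ("service", "moduleId"),
    ("eventType",),
    ("environment",),
    ("version",),
)


def missing_dimensions(signal):
    """Return the required dimensions that are absent or empty on a signal."""
    return [
        "/".join(keys)
        for keys in REQUIRED_DIMENSIONS
        if not any(signal.get(key) for key in keys)
    ]
```

An empty result means the signal can be correlated and sliced on every required axis; anything else names the dimensions to fix.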

Dashboards

Generated dashboards (from observability dashboard templates) give operators a ready-made view per product:

Product Health — request rate, error rate, latency, dependency health, by service and tenant.
Messaging & Workers — topic throughput, consumer lag, retries, dead-letter trends.
Tenant Activity — active tenants/users, onboarding funnel, edition distribution.
SaaS Spine — subscriptions, billing health, feature-flag rollout, notification delivery, report generation.
Security — auth failures, authorization denials, rate-limit hits, audit volume.

These dashboards are the product-side counterpart to the factory's Observability & Feedback dashboards; product signals are exported upstream.

Alerts

| Alert | Condition | Action |
| --- | --- | --- |
| High error rate | error rate > threshold over window, per service/tenant | Page on-call; link to trace |
| Latency SLO breach | p99 latency > SLO | Notify; evaluate autoscaling |
| Consumer lag | Service Bus lag > threshold | Scale workers; investigate poison messages |
| Dead-letter spike | DLQ growth > threshold | Operator replay workflow |
| Provisioning failure | `TenantProvisioningTimedOut` / failure rate | Notify onboarding owner |
| Auth failure spike | auth failures > baseline | Security review |
| Resource saturation | CPU/memory/connection pool > threshold | Autoscale; capacity review |
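
As an illustration of the first rule in the table (error rate above a threshold over a window, per service and tenant), the following sketch evaluates the condition over raw request records. It is a simplified model, not the generated alert definition. The `ErrorRateRule` fields, including the `min_requests` guard, and the tuple layout of `requests` are assumptions.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class ErrorRateRule:
    threshold: float       # e.g. 0.05 for 5 %
    window_seconds: int    # look-back window ending at `now`
    min_requests: int = 1  # hypothetical guard against firing on very low traffic


def evaluate(rule, requests, now):
    """Return the (service, tenantId) pairs whose error rate exceeds the threshold.

    `requests` is an iterable of (timestamp, service, tenant_id, is_error) tuples.
    """
    totals = defaultdict(lambda: [0, 0])  # key -> [request count, error count]
    for timestamp, service, tenant_id, is_error in requests:
        if now - rule.window_seconds < timestamp <= now:
            counts = totals[(service, tenant_id)]
            counts[0] += 1
            counts[1] += int(is_error)
    return sorted(
        key
        for key, (count, errors) in totals.items()
        if count >= rule.min_requests and errors / count > rule.threshold
    )
```

Evaluating per `(service, tenantId)` rather than globally keeps one noisy tenant from masking, or being masked by, the rest of the fleet.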

How observability contributes to the pillars

Traceability — traceId/correlationId on every signal make any request reconstructable end to end and back to factory intent.
Reusability — instrumentation, dimensions, dashboards, and alerts are generated identically across products.
Autonomy — agents generate the observability artifacts; SRE-style agents can act on alerts.
Governance — security metrics and audit volume are observable; PII redaction enforces classification.
Observability — this is the observability pillar realized in the generated product and fed back to the factory.
Multi-tenant scale — tenantId on every signal enables per-tenant SLOs, cost attribution, and noisy-neighbor detection.