Observability Architecture

Observability is designed in, not bolted on. Every action across every platform emits structured logs, metrics, and traces correlated by traceId, feeding the Observability & Feedback Platform and ultimately the knowledge improvement loop.

Telemetry stack

| Signal | Technology |
| --- | --- |
| Logs | Serilog (`ConnectSoft.Extensions.Logging.Serilog`) → Application Insights / Log Analytics |
| Traces | OpenTelemetry (`ConnectSoft.Extensions.Observability`, `Extensions.Telemetry`) |
| Metrics | OpenTelemetry / `ConnectSoft.Extensions.Diagnostics.Metrics` → Application Insights; optional Prometheus/Grafana |
| Correlation | `traceId` propagated through logs, spans, and the event envelope |
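
The correlation row can be illustrated with a small sketch. It is written in Python with only the standard library rather than the Serilog and OpenTelemetry packages listed above, and every name in it (`current_trace_id`, `TraceIdFilter`, `build_envelope`, the envelope field names) is a hypothetical stand-in, not part of the ConnectSoft libraries. It only shows the idea that one trace id, set once per call chain, appears in both the log line and the event envelope.

```python
import contextvars
import io
import json
import logging
import uuid

# Hypothetical context variable holding the trace id of the current call chain.
current_trace_id = contextvars.ContextVar("current_trace_id", default=None)


class TraceIdFilter(logging.Filter):
    """Copy the active trace id onto every log record."""

    def filter(self, record):
        record.traceId = current_trace_id.get() or "-"
        return True


def build_envelope(event_type, payload):
    """Wrap a payload in an illustrative event envelope carrying the trace id."""
    return {
        "eventId": str(uuid.uuid4()),
        "eventType": event_type,
        "traceId": current_trace_id.get(),
        "payload": payload,
    }


def make_logger(stream):
    """Create a logger that writes one JSON object per line to `stream`."""
    handler = logging.StreamHandler(stream)
    handler.setFormatter(logging.Formatter(
        '{"level": "%(levelname)s", "traceId": "%(traceId)s", "message": "%(message)s"}'
    ))
    handler.addFilter(TraceIdFilter())
    logger = logging.getLogger("correlation-sketch")
    logger.handlers = [handler]
    logger.setLevel(logging.INFO)
    logger.propagate = False
    return logger


def run_demo():
    """Emit a log line and build an envelope under the same trace id."""
    stream = io.StringIO()
    logger = make_logger(stream)
    # 32 hex characters, the same width as a W3C trace id.
    token = current_trace_id.set(uuid.uuid4().hex)
    try:
        logger.info("artifact generated")
        envelope = build_envelope("ArtifactGenerated", {"artifactId": "a-1"})
    finally:
        current_trace_id.reset(token)
    log_record = json.loads(stream.getvalue())
    return log_record, envelope
```

Calling `run_demo()` returns a parsed log record and an envelope whose `traceId` values match, which is the property that lets a log line, a span, and an event be joined later.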

Telemetry pipeline

```mermaid
flowchart LR
    services["Platform & Generated Services"] -->|OTEL SDK| collector["OTEL Collector"]
    collector --> appinsights["Application Insights / Log Analytics"]
    collector --> prom["Prometheus (optional)"]
    appinsights --> traceSvc["TraceService / MetricAggregationService"]
    traceSvc --> dashboards["DashboardService"]
    traceSvc --> alerts["AlertRuleService"]
    alerts --> incidents["IncidentService"]
    incidents --> feedback["FeedbackService"]
    feedback --> knowledge["Knowledge Platform<br/>Runtime Memory"]
    knowledge --> generation["Better Future Generation"]
```

Required dimensions

Every telemetry record carries the standard dimensions so any signal can be sliced by lifecycle entity:

`traceId`, `executionId`, `tenantId`, `projectId`, `moduleId`, `agentId`, `skillId`, `artifactId`, `workflowId`, `environment`, `version`.

These are the same fields defined in the Metadata Schema, ensuring telemetry joins cleanly to artifacts, tasks, and deployments.
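
A minimal sketch of how the required dimensions might be enforced and used, assuming telemetry records are plain dictionaries. The helper names `missing_dimensions` and `slice_by` are hypothetical, and treating an empty value as missing is an assumption rather than something the Metadata Schema is known to require.

```python
# The eleven standard dimensions listed above.
REQUIRED_DIMENSIONS = (
    "traceId", "executionId", "tenantId", "projectId", "moduleId",
    "agentId", "skillId", "artifactId", "workflowId", "environment", "version",
)


def missing_dimensions(record):
    """Return the required dimensions that are absent or empty in `record`.

    Assumption: an empty string or None counts as missing.
    """
    return [name for name in REQUIRED_DIMENSIONS if not record.get(name)]


def slice_by(records, dimension):
    """Group records by the value of one dimension, for example `tenantId`."""
    groups = {}
    for record in records:
        groups.setdefault(record[dimension], []).append(record)
    return groups
```

A record that fails `missing_dimensions` could be rejected or flagged at ingestion, so that every stored signal stays sliceable by any lifecycle entity.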

The feedback loop

Runtime telemetry is not just for operators; it is also an input to generation. The Observability & Feedback Platform converts signals and incidents into FeedbackItems and quality scores, which the Knowledge Platform stores as runtime memory and turns into improvement candidates for future runs. This closes the loop described in the Final Reference Flow.
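
The conversion from incidents to feedback and quality scores could look roughly like the sketch below. The `Incident` and `FeedbackItem` shapes, the severity penalties, and the scoring formula are all assumptions made for illustration; the source does not describe the platform's actual data model or scoring.

```python
from dataclasses import dataclass

# Illustrative severity penalties; the real scoring model is not specified.
SEVERITY_PENALTY = {"low": 0.05, "medium": 0.15, "high": 0.4}


@dataclass(frozen=True)
class Incident:
    trace_id: str
    artifact_id: str
    severity: str
    summary: str


@dataclass(frozen=True)
class FeedbackItem:
    trace_id: str
    artifact_id: str
    penalty: float
    note: str


def to_feedback(incident):
    """Turn one incident into a feedback item that keeps the trace id."""
    return FeedbackItem(
        trace_id=incident.trace_id,
        artifact_id=incident.artifact_id,
        penalty=SEVERITY_PENALTY[incident.severity],
        note=incident.summary,
    )


def quality_scores(items):
    """Score each artifact as 1.0 minus its summed penalties, floored at 0.0."""
    totals = {}
    for item in items:
        totals[item.artifact_id] = totals.get(item.artifact_id, 0.0) + item.penalty
    return {
        artifact_id: max(0.0, round(1.0 - total, 2))
        for artifact_id, total in totals.items()
    }
```

Because each feedback item keeps the originating `trace_id` and `artifact_id`, a low score can be traced back to the run and artifact that caused it before it is stored as runtime memory.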

Observing the factory itself

The factory observes both the generated runtimes and its own platforms and agents: agent executions emit spans (AgentTelemetryService), workflows emit correlation events, and cost telemetry tracks LLM and infrastructure spend per project and tenant.
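
Cost telemetry per project and tenant can be pictured as a simple roll-up. This is a sketch under stated assumptions: the record fields `category` and `costUsd` and the function `aggregate_cost` are hypothetical, while `tenantId` and `projectId` are the standard dimensions named earlier.

```python
from collections import defaultdict


def aggregate_cost(records):
    """Sum LLM and infrastructure spend per (tenantId, projectId) pair.

    Assumption: each record has a `category` of "llm" or "infrastructure"
    and a numeric `costUsd`.
    """
    totals = defaultdict(lambda: {"llm": 0.0, "infrastructure": 0.0})
    for record in records:
        key = (record["tenantId"], record["projectId"])
        totals[key][record["category"]] += record["costUsd"]
    return dict(totals)
```

Keying on the pair rather than on either field alone keeps spend for the same project name in two tenants separate.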