π§ Data Architect Agent Specification
π― Purpose
The Data Architect Agent is responsible for designing, evolving, and governing the data storage layer of every service in the ConnectSoft AI Software Factory β while ensuring:
Scalability (partitioned & cloud-native storage)
Integrity (strong constraints, referential safety)
Compliance (PII classification, retention rules, auditability)
Extensibility (schema versioning, event rehydration, projections)
Zero-downtime evolution (migration automation, backward compatibility)
It produces platform-ready, validated, and observable database blueprints and migration artifacts , aligned with Clean Architecture, DDD, and ConnectSoft's security-first and observability-first principles.
π‘ Why This Agent Exists
Without a dedicated Data Architect Agent:
Tables and schemas drift from the domain model
Indexes, constraints, and nullability rules are inconsistently applied
Data becomes fragmented across incompatible formats
Migrations are unsafe, untraceable, and break CI/CD flows
No unified view of multitenancy, soft-deletion, or retention timelines
Event-sourced services lack structured projections for querying
With this agent:
β
Schemas are derived directly from the domain model
β
Migrations are tracked and environment-specific
β
Data is partitioned and scoped per tenant
β
PII fields and retention rules are codified
β
All artifacts are versioned, diffable, and CI/CD-integrated
π§± Design Scope
β
DDL generation (SQL, YAML, FluentMigrator)
β
Migrations (create, alter, drop, index, rename, seed)
β
NoSQL collections & document design (MongoDB, Cosmos)
β
Projection modeling for event-driven views
β
Multitenant partitioning rules
β
TTL-based storage (for logs, sessions, events)
β
Data retention & redaction policy tagging
β
Observability : schema span emissions, table access tracing
β
Migration versioning via FluentMigrator or SQL tags
π Multi-Migration Strategy Support
The agent supports three primary types of migrations per service:
Migration Type
Format
Purpose
Declarative
schema.sql, schema.yaml
For schema-first services; output = full source of truth
Imperative
*.cs using FluentMigrator
For .NET-based microservices using versioned migrations
Composable
migration-*.sql scripts
For raw SQL runners, cross-dialect compatibility, rollback-friendly
β
The agent can emit all three types depending on settings.migration_strategy in the input prompt.
ποΈ Example Output Bundle
/schema
schema.yaml
schema.sql
migration-20250422-initialize-db.sql
migration-20250423-add-index-on-email.sql
fluent/20250423_AddAuditColumns.cs
/metadata
field-retention-map.yaml
er-diagram.mmd
schema-observability.yaml
DatabaseSchemaPublished.json
π§ Role in the Agentic Lifecycle
flowchart TD
DomainModeler --> DataArchitect
ApplicationArchitect --> DataArchitect
DataArchitect --> MigrationEngine
DataArchitect --> TestGenerator
DataArchitect --> ObservabilityAgent
DataArchitect --> DevOpsAgent
Hold "Alt" / "Option" to enable pan & zoom
π§ Summary
Feature
Value
Domain-aligned schemas
β
DDD mapping to storage
Migration strategy
β
FluentMigrator, SQL, or YAML
Audit & retention awareness
β
PII tagging, TTL injection
Multitenant design
β
Partition, filter, and access rules per tenant
CI/CD readiness
β
Outputs versioned, published, and validated
Observability
β
Tables and fields mapped to telemetry templates
ποΈ Scope of Influence
The Data Architect Agent governs all aspects of a serviceβs persistent storage architecture , ensuring every decision is:
Domain-aligned
Cloud-native
Multi-tenant ready
Compliant
Migration-safe
It acts as the authoritative designer of relational, NoSQL, event-sourced, and cache-backed storage layers for microservices, modules, and shared infrastructure components.
π Logical Scope
Area
Included
β
Table/collection design
CREATE TABLE / collection.definition
β
Field types, nullability, defaults
e.g., amount NUMERIC(10,2) NOT NULL DEFAULT 0.0
β
Indexes, keys, constraints
Primary, unique, foreign keys, composite indexes
β
Value object flattening
Embed VOs inside parent entities (e.g., Address, Money)
β
Enum mapping
Stored as integer or string (configurable per enum)
β
Tenant scoping
Partition key, row filters, access policies
β
History/versioning fields
CreatedAt, UpdatedAt, DeletedAt, Version, etc.
β
Observability tagging
Span templates for table operations, audit log hooks
β
TTL & retention
Policies for deletion, redaction, expiration
π§± Physical Scope
Storage Type
Responsibilities
Relational DB (Postgres, SQL Server)
Normalized schema, constraints, indexes, foreign keys
NoSQL DB (MongoDB, Cosmos DB)
Document design, nested field constraints, TTL indexing
Event Store / Projection DB
Denormalized tables/views from event replay
Blob Storage
Data classification (e.g., x-sensitive, file naming, metadata tagging)
Cache/Read Replicas
Declare what fields can be projected or cached
Embedded DB (LiteDB, SQLite for mobile/edge)
Strip down schema for constrained environments
π Temporal Scope
Aspect
Controlled
Schema versioning
schema.yaml.version, migrationId, fluentMigrator version
Historical data rules
Audit logs, soft-deletion, row snapshots
Retention policies
Time-based (e.g., 30d logs), legal holds
TTL logic
Auto-purge of sessions, messages, etc.
π Compliance and Security Scope
Feature
Role
PII classification
Tags fields in field-retention-map.yaml
Redactable fields
Emits policy flags (x-redactable: true)
Encrypted at rest
Identifies fields requiring AES/GCM or external KMS
Row-level security (Postgres)
Maps tenant access conditions to RLS rules
Data masking
Emits masking metadata for tools like SQL Server Dynamic Data Masking or Cosmos DB Confidential LSM
π« Out of Scope
Not Owned
Delegated To
Migration execution (runtime)
β
DevOps Architect Agent
ORM entity generation (e.g., NHibernate)
β
Adapter Generator Agent
Data analytics or BI schema
β (handled by future Analytics Architect Agent)
Application-layer queries (LINQ, etc.)
β
Developer responsibility
Full-text search design
β
Search Architect Agent (future role)
π¦ Output Influence Summary
Artifact
Owned
schema.sql
β
schema.yaml
β
migration-*.sql
β
migration/*.cs (FluentMigrator)
β
er-diagram.mmd
β
field-retention-map.yaml
β
DatabaseSchemaPublished.json
β
Runtime index management
β (influences DevOps but not runtime schema sync)
π§ ConnectSoft Scope Matrix
Layer
Data Architect Agent
Domain
Reads (VOs, Entities, Enums)
Application
Reads use case flows for read/write frequency
Infrastructure
Owns physical schema + migrations
Observability
Emits table-span templates
Security
Flags PII, tenant, RLS, retention fields
Governance
Outputs versioning + changelogs
π Core Responsibilities
The Data Architect Agent owns the definition and evolution of persistent storage blueprints
for ConnectSoft microservices, ensuring they are:
Domain-driven
Observable
Secure
Versioned
Composable
It produces schemas, migrations, diagrams, and compliance maps ready for CI/CD, deployment, and governance pipelines.
π 1. Schema Design Responsibilities
Task
Output
Translate domain aggregates and value objects into tables/collections
schema.yaml, schema.sql, or schema.mongo.json
Flatten embedded value objects intelligently
E.g., Address β address_line1, city, zip
Handle enums as int, string, or lookup table (based on config)
enum-strategy.yaml
Define primary keys , foreign keys , unique indexes , nullable/required flags
schema.sql, migration-*.sql
Generate observability fields
created_at, updated_at, trace_id, version, tenant_id
π 2. Migration and Evolution Responsibilities
Task
Output
Generate forward-only SQL migrations
migration-<timestamp>-*.sql
Generate FluentMigrator C# classes
Migrations/20250501_AddFieldToClient.cs
Track schema version and emit diffs
schema.version, migration.log
Suggest rollback plan or up/down separation
Optional down.sql or Revert... migration
Handle safe field changes: nullable β non-null, type widening, enum expansion
Auto-patch and document
π§ͺ 3. Validation and Consistency Enforcement
Task
Validated
Naming conventions
snake_case for SQL, camelCase for NoSQL
Type consistency with domain model
Match DTOs and VOs (or explain mismatch)
Field constraints
From validation-map.yaml: minLength, max, enum, etc.
Nullability rules
Optionality mapped to nullable column/type
Index redundancy
Avoid overlapping composite/unique/index sets
RLS & tenant fields
Ensure tenant_id included where required
π 4. Retention, Compliance, and Security
Task
Output
PII/retention annotation
field-retention-map.yaml
Masking/redaction strategy
x-redactable, x-sensitive tags
TTL declaration for short-lived entities
ttl-policy.yaml or inline schema flags
Data encryption flagging
Fields marked x-encrypted: true
Compliance scope linking
Maps to GDPR/SOC2 classification tags
π§ 5. Observability and Metrics Integration
Task
Output
Emit schema-span templates
schema-observability.yaml
Mark fields as observable (e.g., invoice.total, client.status)
Used by Observability Agent
Inject audit trail structure
created_by, modified_by, trace_id, correlation_id
Emit DatabaseSchemaPublished event with trace_id, schema version, and changelog URL
π 6. Documentation and Diagrams
Task
Output
Generate ER diagram from schema
er-diagram.mmd
Describe relationships in Markdown
data-relationships.md
Auto-link schema objects to glossary terms
Enables rich developer portal linking
β
Summary of Responsibilities
Area
Description
π§± Physical schema design
β
Tables, fields, keys, indexes
π§© Domain-aligned translation
β
Aggregates, VOs, enums β storage
π¦ Migration authoring
β
FluentMigrator, SQL, DDL diffs
π Retention and compliance tagging
β
PII, TTL, masking, audit trail
π Observability tagging
β
Span templates, access fields
π Documentation and traceability
β
ERD, Markdown, lifecycle event
The Data Architect Agent consumes structured, traceable, and semantically meaningful artifacts
from upstream agents to model and evolve the persistent storage layer.
These inputs define:
The entities , value objects , and enums to persist
The field-level constraints and validations
The retention , PII , and security policies
The observability and multitenancy boundaries
Any previously generated schema snapshots for diffing and migration generation
Artifact
Source Agent
Purpose
domain-model.yaml
Domain Modeler Agent
Defines aggregates, value objects, identifiers
validation-map.yaml
Domain Modeler Agent
Field-level constraints (e.g., not null, length, enum)
use-cases.yaml
Application Architect Agent
Determines access patterns (write-heavy, read projections)
field-retention-policy.yaml
Compliance Architect Agent
Flags fields as pii, ttl, audit_required, redactable
application-architecture.md
Application Architect Agent
Maps services to tenancy rules and cross-entity flows
previous-schema.yaml (optional)
CI/CD or prior agent run
Used for diffing and migration detection
trace_id, bounded_context, service_name
Orchestrator
Enables traceability and governance audit linkage
πΆ domain-model.yaml
aggregates :
- name : Client
fields :
- name : clientId
type : Guid
required : true
- name : email
type : Email
- name : address
type : Address
- name : status
type : ClientStatus
value_objects :
- Address
- Email
enums :
- ClientStatus : [ Active , Inactive , Suspended ]
πΆ validation-map.yaml
Client :
email :
format : email
maxLength : 255
address.zip :
minLength : 5
maxLength : 10
status :
enum : [ Active , Inactive , Suspended ]
πΆ field-retention-policy.yaml
fields :
Client.email :
pii : true
redactable : true
encrypted : true
Client.status :
retention : 5y
Client.createdAt :
audit_required : true
πΆ application-architecture.md (Excerpt)
This service is multi-tenant. All persistent objects must include a tenant_id column.
The Client aggregate supports soft deletion (i.e., deleted_at should be modeled).
The Address VO is reused across four services β prefer embedded rather than normalized form.
Rule
Enforced
domain-model.yaml and validation-map.yaml must be present
β
Each field must resolve to a scalar or embeddable VO
β
Enum values must be matched to definition (for type safety)
β
If multi-tenancy is enabled, tenant_id must be generated and included
β
Previous schema required if migration_strategy = diff
β
Traceability fields must be resolved
β
(trace_id, service_name, bounded_context)
π§ Semantic Memory Integration (Optional)
Query
Result
"Invoice.CreatedAt"
Suggest timestamp with time zone, index on created_at, add to audit trail
"Status" enum reuse
Suggest shared enum or foreign key to status_types table
"Address" VO reuse
Suggest embedded pattern with naming template address_city, address_zip, etc.
assignment : "generate-schema"
project :
service_name : ClientService
trace_id : trace-data-112938
bounded_context : CRM
inputs :
domain_model_url : https://.../domain-model.yaml
validation_rules_url : https://.../validation-map.yaml
retention_policy_url : https://.../field-retention-policy.yaml
app_architecture_url : https://.../application-architecture.md
previous_schema_url : https://.../schema-v1.yaml
settings :
migration_strategy : fluentmigrator
output_format : [ sql , yaml , erd ]
π€ Core Outputs
The Data Architect Agent produces a complete, versioned, and traceable data schema bundle ,
ready for:
CI/CD deployment
FluentMigrator-based evolution
Event-sourced projection building
Observability, compliance, and retention auditing
Developer portal documentation
π¦ Primary Output Artifacts
Artifact
Format
Purpose
schema.sql
SQL DDL
Declarative schema for relational DBs
schema.yaml
YAML DSL
Platform-native representation of entity/table structure
migration-*.sql
SQL
Forward-only or reversible migrations
Migrations/*.cs
C# FluentMigrator classes
Versioned imperative migrations
field-retention-map.yaml
YAML
Field-level metadata: PII, TTL, encryption
er-diagram.mmd
Mermaid
Entity-relationship diagram
data-schema-observability.yaml
YAML
Span templates for DB-level tracing
DatabaseSchemaPublished
JSON
Event that notifies schema is ready for use
π 1. schema.sql
| Format | PostgreSQL or SQL Server |
| Example |
CREATE TABLE client (
client_id UUID PRIMARY KEY ,
email VARCHAR ( 255 ) NOT NULL ,
status VARCHAR ( 20 ) NOT NULL ,
created_at TIMESTAMPTZ DEFAULT now (),
tenant_id UUID NOT NULL ,
trace_id UUID ,
deleted_at TIMESTAMPTZ
);
β
Used in schema-first environments or as a reference baseline.
π 2. schema.yaml
| Format | ConnectSoft DSL |
| Purpose | Declarative schema for agents, dev portals, and governance |
tables :
- name : client
fields :
- name : client_id
type : uuid
primary_key : true
- name : email
type : varchar(255)
required : true
pii : true
encrypted : true
- name : status
type : varchar(20)
enum : ClientStatus
- name : tenant_id
type : uuid
partition_key : true
- name : deleted_at
type : timestamp
soft_delete : true
observability :
traceable : true
span_name : db.crm.client.insert
π¦ 3. Migrations
πΉ migration-<timestamp>-add-column.sql
ALTER TABLE client ADD COLUMN marketing_opt_in BOOLEAN DEFAULT false ;
πΉ Migrations/20250428_AddMarketingOptIn.cs
[Migration(2025042801)]
public class AddMarketingOptIn : Migration
{
public override void Up () =>
Alter . Table ( "client" ). AddColumn ( "marketing_opt_in" ). AsBoolean (). WithDefaultValue ( false );
public override void Down () =>
Delete . Column ( "marketing_opt_in" ). FromTable ( "client" );
}
β
Used by FluentMigrator runners or SQL CI scripts
β
Tags each migration with trace_id, service_name, version
π 4. field-retention-map.yaml
| Format | YAML |
| Purpose | Compliance annotation for field-level audit, redaction, encryption
client :
email :
pii : true
redactable : true
encrypted : true
status :
retention : 5y
created_at :
audit_required : true
marketing_opt_in :
retention : 1y
πΊοΈ 5. er-diagram.mmd
erDiagram
client {
UUID client_id PK
STRING email
STRING status
UUID tenant_id
TIMESTAMP created_at
TIMESTAMP deleted_at
}
invoice {
UUID invoice_id PK
UUID client_id FK
NUMERIC amount
}
client ||--o{ invoice : has
Hold "Alt" / "Option" to enable pan & zoom
β
Used for developer portal embedding and visual CI diffs
π 6. data-schema-observability.yaml
client :
operations :
- insert :
span_name : db.crm.client.insert
trace_id_field : trace_id
- delete :
soft_delete : true
span_name : db.crm.client.delete
retention_policy : 7y
β
Enables Observability Agent to link spans to table operations
π’ 7. DatabaseSchemaPublished (Event)
{
"event" : "DatabaseSchemaPublished" ,
"service" : "ClientService" ,
"boundedContext" : "CRM" ,
"schemaVersion" : "v2.0" ,
"traceId" : "trace-data-112938" ,
"artifactUrls" : {
"ddl" : "https://schemas.connectsoft.dev/crm/schema-v2.sql" ,
"migrations" : "https://schemas.connectsoft.dev/crm/migrations/" ,
"observability" : "https://schemas.connectsoft.dev/crm/data-schema-observability.yaml"
},
"timestamp" : "2025-05-01T23:00:00Z"
}
β
Consumed by CI, Test Generator, DevOps Agent, Dev Portal
β
Output Checklist
Output
Required
Declarative schema (schema.sql, schema.yaml)
β
Migration strategy (SQL or FluentMigrator)
β
Retention/compliance metadata
β
Observability fields + spans
β
Event emission
β
with links and trace_id
Diagram output (optional but default)
β
Mermaid preferred
π Knowledge Base Overview
The Data Architect Agent relies on a curated knowledge base of:
Schema design best practices
Domain β storage translation patterns
Column naming conventions
Migration strategies
Retention policies
Polyglot persistence heuristics
This enables the agent to make domain-informed, platform-aligned decisions while producing predictable, reusable data designs.
π§± 1. Naming Conventions & Translation Rules
Pattern
Output
Address VO β flatten
address_line1, address_city, address_zip
Enums stored as string
Column type β varchar(20), constraint on values
Timestamps
Always use TIMESTAMPTZ with DEFAULT now()
Boolean fields
Use is_* or _enabled suffix: is_active, email_verified
UUIDs
Use uuid type, primary keys are snake_case_id (e.g., client_id)
Soft deletion
Field = deleted_at TIMESTAMPTZ NULL, queried via RLS or adapter policy
Multi-tenancy
Field = tenant_id UUID NOT NULL, partitioning enabled
π¦ 2. Schema Normalization & Denormalization Rules
Pattern
Applied When
Flatten VOs
If reused only once or shallow (e.g., Address)
Reference VOs
If reused across >3 aggregates or contains sub-VOs
Lookup tables for enums
If enum is user-extensible or over 10 values
Denormalized projections
For CQRS read models and Materialized Views
Array-of-VOs
Use JSONB or link-table (depending on access/read pattern)
π 3. Migration Strategy Heuristics
Migration Type
Rule
SQL Script
Preferred when schema is stable and cross-language
FluentMigrator
Required for .NET-first services using NHibernate or Dapper
Diff-based generation
Triggered when previous-schema.yaml is provided
Rollback support
Enabled only if down.sql or Revert*.cs migration is explicitly requested
Data seeding
Separate class or SQL script if requires_seed_data: true in schema.yaml
π 4. Retention & PII Enforcement Patterns
Field Type
Tag
email, phone, ssn, dob
x-sensitive: true, encrypted: true
user_id, device_id, ip_address
audit_required: true
session, token, log tables
ttl: 30d or custom expiration index
tenant_id
Always partition key and indexed
deleted_at
Required for soft-deletable entities
π§ 5. Semantic Memory β Schema Reuse Suggestions
Input
Suggested Output
"Client aggregate with Address VO"
Flatten Address inline unless reused by 3+ other aggregates
"Status = [Active, Suspended, Inactive]"
Enum β status VARCHAR(20) NOT NULL + constraint check
"Invoice.Amount"
Suggest NUMERIC(10,2) with NOT NULL DEFAULT 0.00
"CreateAt"
Suggest TIMESTAMPTZ DEFAULT now() and include in audit metadata
"EventStore"
Suggest table with stream_id, sequence_number, event_type, payload JSONB
π 6. Optimization Patterns
Pattern
Used For
Composite Index (email, status)
Email-based filtering or dashboard search
Partial Index (deleted_at IS NULL)
Faster active-only queries
Covering Index (client_id, created_at, amount)
Optimized dashboards
FTS (Full-text search)
Only for designated searchable_fields in schema.yaml
GIN Index on JSONB
Used for polymorphic settings or tags in event_metadata column
π§ͺ 7. Validation-Aware Templates
Validation
Injected Constraint
minLength: 5
VARCHAR(255) CHECK (LENGTH(field) >= 5)
enum
CHECK (status IN ('Active', 'Suspended'))
range: 0-100
CHECK (score BETWEEN 0 AND 100)
pattern: email
Optional CHECK or rely on app-level + observability tags
π Changelog Awareness
Change Detected
Action
Field added β not required
Minor migration with NULL default
Field added β required
Migration with backfill or default
Field removed
Flag as breaking; require schema bump
Type changed
Flag as breaking; recommend DROP ADD or shadow copy
Constraint tightened
Suggest ALTER ... CHECK + validation audit
π Process Flow Overview
The Data Architect Agent executes a structured, deterministic flow to translate domain models, validation rules, and compliance metadata into:
Schema files (.sql, .yaml)
Migrations (.sql, FluentMigrator .cs)
Diagrams (.mmd)
Metadata maps for observability and retention
The agent supports multiple backends (PostgreSQL, Cosmos, MongoDB, etc.) and migration styles based on input prompt strategy .
π Step-by-Step Lifecycle
Load and validate:
domain-model.yaml
validation-map.yaml
retention-policy.yaml
application-architecture.md
(Optional) previous-schema.yaml for diff-based migrations
Identify:
Aggregates, VOs, Enums
Table/entity boundaries
Multitenant scoping rules
Compliance and observability annotations
2οΈβ£ Table & Field Mapping
Map each aggregate β table
Flatten value objects into fields (unless reused widely)
Enum handling strategy (varchar, int, lookup table)
Resolve field types and nullability
Add tenant_id, trace_id, and created_at if required
3οΈβ£ Index, Constraint, and Retention Policy Generation
4οΈβ£ Migration Strategy Resolution
π Retry if type mismatch or unsafe drop detected (e.g., backfill or copy strategy required)
5οΈβ£ Schema File and DSL Output
Emit:
schema.sql β full DDL snapshot
schema.yaml β normalized data blueprint
schema.version (for governance and trace)
field-retention-map.yaml β used by compliance and governance agents
6οΈβ£ Observability Injection
Auto-inject:
trace_id, correlation_id fields
created_by, updated_by, deleted_by
Span templates per operation: insert, update, delete
Output: data-schema-observability.yaml
7οΈβ£ Documentation and Diagram Generation
Generate:
er-diagram.mmd β Mermaid-based ERD
data-relationships.md β Plaintext description of keys and joins
Optional: export to PlantUML or SVG if output_format requires
8οΈβ£ Schema Publication and Event Emission
Emit:
DatabaseSchemaPublished event
β contains: service, version, trace_id, artifact URLs
Archive:
Generated artifacts under /schema/v{X.Y.Z}/
π Telemetry & Span Injection
Span
Description
schema_generation.duration_ms
Total schema file generation time
migration_generation.retry_count
Retry count on invalid field evolution
observability_injection.status
Pass/fail on trace field mapping
compliance_annotation.coverage_percent
Percent of fields tagged as PII or retention-bound
schema_diff.breaking_change_detected
Boolean flag for version escalator
π§ High-Level Flow
flowchart TD
A[Parse Inputs] --> B[Map Aggregates to Tables]
B --> C[Inject Validation + Constraints]
C --> D[Generate Migrations]
D --> E[Emit schema.sql / schema.yaml]
E --> F[Inject Observability + PII]
F --> G[Generate Diagram + Metadata]
G --> H[Publish Event + Archive]
Hold "Alt" / "Option" to enable pan & zoom
π οΈ Skills and Kernel Functions
The Data Architect Agent is implemented as a modular, skill-driven pipeline using Semantic Kernel (.NET) .
Each skill is independently observable, retry-safe, and designed to operate over domain-aligned inputs with structured traceability .
π Core Skills Used
Skill
Purpose
DomainModelParserSkill
Parses domain-model.yaml, identifies aggregates, VOs, enums
ValidationMapNormalizerSkill
Loads and restructures field-level constraints
RetentionPolicyMapperSkill
Maps compliance metadata (e.g., pii, ttl) to fields
TenantScoperSkill
Ensures tenant_id is injected where needed
π§± 2. Schema and Field Modeling
Skill
Purpose
TableBuilderSkill
Converts aggregates β tables, maps primary keys, VOs
FieldTypeInfererSkill
Resolves types (uuid, varchar, jsonb, timestampz) from domain model
VOFlattenerSkill
Embeds VOs inline unless explicitly configured as reference
EnumStrategySelectorSkill
Determines how enums are stored (int, varchar, or lookup FK)
ConstraintInjectorSkill
Applies CHECK, NOT NULL, DEFAULT, and other column constraints
π 3. Migration Strategy and Evolution
Skill
Purpose
SchemaDiffDetectorSkill
Diffs current schema vs. previous-schema.yaml to detect changes
SqlMigrationGeneratorSkill
Emits ALTER TABLE, ADD COLUMN, etc. as .sql files
FluentMigratorGeneratorSkill
Emits C# migration class templates (Up, Down) with version tags
MigrationSafetyCheckerSkill
Flags unsafe ops (e.g., dropping columns, type narrowing) and injects fallback
π 4. Compliance and Retention Enforcement
Skill
Purpose
PIITaggerSkill
Tags fields with x-sensitive, encrypted, redactable
RetentionPolicyPlannerSkill
Generates field TTLs, table-level expirations
RLSPolicyInjectorSkill
Emits CREATE POLICY for Postgres-based RLS enforcement (if enabled)
MaskingStrategySkill
Suggests dynamic masking for SQL Server / Cosmos DB use cases
π 5. Observability Injection
Skill
Purpose
TraceFieldEnforcerSkill
Ensures trace_id, correlation_id, created_by fields exist
AuditColumnInjectorSkill
Adds created_at, updated_at, deleted_at, version fields
SpanTemplateEmitterSkill
Outputs data-schema-observability.yaml with span templates
TelemetryTaggerSkill
Annotates key business fields (e.g., invoice.amount) for tracing agents
πΌοΈ 6. Documentation and Visualization
Skill
Purpose
ERDiagramGeneratorSkill
Generates Mermaid ER diagram (er-diagram.mmd)
RelationshipMarkdownWriterSkill
Produces data-relationships.md for DevPortal
GlossaryLinkerSkill
Cross-links fields to known terms in glossary.yaml (optional)
π€ 7. Publishing and Events
Skill
Purpose
SchemaFileWriterSkill
Outputs schema.sql, schema.yaml, field-retention-map.yaml
MigrationFileWriterSkill
Emits .sql or .cs files based on strategy
DatabaseSchemaPublishedEmitterSkill
Emits contract event with trace_id and artifact URLs
SchemaChangelogEmitterSkill (optional)
Outputs version diffs if enabled via settings
π§ Skill Composition Flow (Simplified)
flowchart TD
A[DomainModelParserSkill] --> B[TableBuilderSkill]
B --> C[ConstraintInjectorSkill]
C --> D[MigrationStrategySkill]
D --> E[Observability & Retention Injection]
E --> F[SchemaFileWriterSkill]
F --> G[DatabaseSchemaPublishedEmitterSkill]
Hold "Alt" / "Option" to enable pan & zoom
π Retry-Safe Behaviors
Skill
Retry Trigger
Resolution
FieldTypeInfererSkill
Unknown or ambiguous type
Fallback to text, flag warning
MigrationSafetyCheckerSkill
Unsafe change detected
Suggest DROP/ADD pair with backfill
PIITaggerSkill
Unmapped sensitive field
Suggest tag from memory or glossary
FluentMigratorGeneratorSkill
Name conflict in class
Auto-rename or suffix with timestamp
π Metrics Emitted (Per Skill)
Metric
Example
skill_execution_duration_ms{name}
ConstraintInjectorSkill: 24ms
retry_count{name}
FieldTypeInfererSkill: 1 retry
sensitive_fields_detected_total
email, phone, ssn
migration_artifact_emitted_total
3 .sql, 1 .cs
er_diagram_generated: true
Mermaid flow verified
π οΈ Technologies Used
The Data Architect Agent leverages a robust, cloud-native, and polyglot-compatible toolchain to generate, validate, and publish data schemas and migrations across multiple database engines and migration systems β fully aligned with ConnectSoftβs automation-first strategy.
Technology
Purpose
Semantic Kernel (.NET)
Skill orchestration, input coordination, retry logic
FluentMigrator (.NET)
C#-based declarative migration generation (Up()/Down() methods)
PostgreSQL
Default SQL target for multi-tenant SaaS environments
SQL Server
Optional support for legacy and enterprise integrations
Mermaid.js
Visual generation of ER diagrams from schema structure
OpenTelemetry SDK
Trace/span emission per skill and contract generation step
Format
Purpose
schema.sql
Declarative PostgreSQL/SQL Server DDL
schema.yaml
Platform-native schema description used across agents
migration-*.sql
Forward-only or versioned migrations
*.cs (FluentMigrator)
.NET-first imperative migrations
er-diagram.mmd
Graphical entity relationships (for DevPortal + docs)
field-retention-map.yaml
Compliance metadata for security/governance agents
data-schema-observability.yaml
Spans and trace fields for runtime agents
π 3. Migration Strategies Supported
Strategy
Tools
Description
Declarative DDL
schema.sql, schema.yaml
Single-file representation; used as baseline
FluentMigrator
.cs C# classes
Used in CI/CD pipeline for .NET microservices
Diff-based SQL
migration-*.sql
Generated from previous schema snapshots (ALTER TABLE, etc.)
Rollback support
FluentMigrator or optional down.sql
Optional if configured in prompt or CI policy
βοΈ 4. Cloud-Native Support Targets
Provider
Component
Azure SQL Database
Target for production relational workloads
Azure Cosmos DB (Mongo API)
NoSQL collections with TTL/index rules
Azure Storage Blobs
Used for uploading schema bundles + changelogs
Azure Event Grid
Publishes DatabaseSchemaPublished event
Azure DevOps
CI pipeline triggers + artifact deployment
π§ 5. Semantic Memory & Validation Helpers
Tool
Purpose
ConnectSoft Schema Registry
Stores historical schema snapshots and changelogs
SQLFluff / pgFormatter
Optional formatting and linting of generated .sql files
DiffEngine
Computes field-level diffs and detects breaking schema changes
Governance Rule Engine
Ensures compliance with ConnectSoft standards (naming, retention, versioning)
π 6. Security & PII Tagging Compatibility
System
Integration
Microsoft Purview
Tag fields with pii, sensitive for catalog sync
Dynamic Data Masking (SQL Server)
Emit mask metadata (MASKED WITH FUNCTION)
PostgreSQL RLS
Inject CREATE POLICY and ALTER TABLE ENABLE RLS
Field-level encryption hints
Optional x-encrypted: true metadata for KMS-aware adapters
π 7. Observability + Telemetry Infrastructure
Tool
Output
OpenTelemetry
Span for each step of schema generation and migration writing
Prometheus (via Observability Agent)
Emits metrics: number of fields, retry count, PII coverage
Application Insights
Track errors, latency, and field-level validation failures
β
Output Delivery Channels
Artifact
Destination
schema.sql, migration.sql
Artifact Storage API
.cs (FluentMigrator)
.NET Service Repositories
DatabaseSchemaPublished
Event Grid + CI/CD trigger
Diagrams
Developer Portal UI
Observability Map
Observability Agent and tracing systems
π System Prompt (Bootstrapping Instruction)
The System Prompt defines the agentβs core instructions, design rules, and execution logic when activated by the ConnectSoft AI Factory.
It ensures consistency, governance compliance, and repeatable schema generation across all services and domains.
β
Full System Prompt (Plain Text)
You are the **Data Architect Agent** in the ConnectSoft AI Software Factory.
Your responsibility is to translate domain models, value objects, enums, validation rules,
and retention/compliance policies into versioned, cloud-ready **data schemas and migrations**.
---
## Your Core Goals:
1. Produce:
- `schema.sql` (relational DDL)
- `schema.yaml` (ConnectSoft platform DSL)
- `migration-*.sql` or `*.cs` (FluentMigrator classes)
- `er-diagram.mmd` (visual documentation)
- `field-retention-map.yaml` (compliance tags)
- `data-schema-observability.yaml` (telemetry)
- `DatabaseSchemaPublished` (contract event)
2. Map:
- Aggregates β Tables or Collections
- Value Objects β Embedded or Normalized fields
- Enums β Varchar, int, or lookup tables (based on reuse)
- Field constraints (`required`, `minLength`, etc.) β DB constraints
- Retention/PII fields β masking/encryption/TTL metadata
3. Ensure:
- Multitenancy (`tenant_id` field is included and indexed)
- Traceability (`trace_id`, `correlation_id`, `created_at`)
- Referential integrity (primary, foreign, unique constraints)
- Observability span mapping per table operation
- Compatibility with FluentMigrator or SQL-first CI/CD pipelines
---
## Validation Rules:
- Every table must have a primary key
- Indexes must not overlap or duplicate constraints
- If RLS is enabled, inject `CREATE POLICY`
- If migration strategy = diff, compare `previous-schema.yaml`
- Soft-deletable entities must include `deleted_at`
- Sensitive fields (email, ssn, dob) must be tagged with `x-sensitive: true`
- Enum fields must have value validation (`CHECK (status IN (...))`)
- All schema artifacts must include `trace_id`, `service_name`, `version`
---
## Migration Strategy Modes:
- `fluentmigrator` β emit `.cs` files using `Migration(YyyyMMdd_HHmm)` format
- `sql` β emit timestamped forward-only `.sql` files
- `diff` β compare against previous schema and generate additive diffs
- All strategies must support rollback or changelog tagging (if configured)
---
## Publication:
When schema generation is complete:
- Emit a `DatabaseSchemaPublished` event with artifact links, version, and `trace_id`
- Archive schema bundle for CI/CD and downstream test/migration runners
---
## Output Format and Structure:
- Use snake_case for table and column names
- Prefer `uuid`, `timestampz`, `varchar`, `jsonb` types
- Enforce DDD boundaries β no DTOs, only domain-derived structures
- Avoid circular foreign keys unless explicitly declared
- Each field in `schema.yaml` must contain `type`, `required`, and `observability` attributes
---
## Versioning:
- Changes to field type, constraints, or deletion = **breaking** β bump major version
- New nullable fields or optional indexes = **non-breaking** β bump minor version
- Deprecations must be marked and included in changelog
π§ Summary of Behavior
The system prompt ensures the Data Architect Agent:
Builds clean, domain-aligned relational schemas
Emits machine- and human-readable contracts
Complies with observability, retention, and security standards
Publishes results for immediate downstream use in DevOps and governance
The Data Architect Agent receives a structured YAML prompt from the ConnectSoft orchestration layer.
This defines:
The service context
Source artifacts
Desired migration strategy
Compliance and observability enforcement options
Output formatting rules
assignment : "generate-schema"
project :
trace_id : "trace-data-55623"
project_id : "connectsoft-saas"
bounded_context : "Billing"
service_name : "InvoiceService"
agent_version : "1.3.0"
inputs :
domain_model_url : "https://artifacts.connectsoft.dev/billing/domain-model.yaml"
validation_rules_url : "https://artifacts.connectsoft.dev/billing/validation-map.yaml"
retention_policy_url : "https://artifacts.connectsoft.dev/billing/field-retention-policy.yaml"
app_architecture_url : "https://artifacts.connectsoft.dev/billing/application-architecture.md"
glossary_url : "https://artifacts.connectsoft.dev/glossary/billing.yaml"
previous_schema_url : "https://artifacts.connectsoft.dev/billing/schema-v1.yaml" # optional
settings :
migration_strategy : "fluentmigrator" # options: fluentmigrator, sql, diff
output_format : [ "sql" , "yaml" , "erd" ]
include_observability_map : true
publish_contract : true
generate_down_migrations : false
π§Ύ Field Descriptions
π§ project
Field
Description
trace_id
Correlation trace for full execution lineage
project_id
Root-level SaaS/project group
bounded_context
Domain boundary this service belongs to
service_name
Name of the microservice
agent_version
Which version of the Data Architect Agent is executing this spec
Artifact
Description
domain_model_url
Aggregates, value objects, and enums
validation_rules_url
Field-level rules: min/max, format, nullable
retention_policy_url
PII, TTL, encryption, audit settings
app_architecture_url
Multitenancy, soft-deletion, RLS enforcement
glossary_url
Optional β aligns naming and reused concepts
previous_schema_url
Optional β enables diffing, changelog, and migration strategy derivation
βοΈ settings
Field
Purpose
migration_strategy
Determines output: .sql, .cs, or auto-diff
output_format
Declares desired formats (can include diagrams, schema DSL)
include_observability_map
Emits data-schema-observability.yaml
publish_contract
Emits DatabaseSchemaPublished event
generate_down_migrations
Whether to emit Down() in FluentMigrator (default false)
π§ Minimal Example Prompt
assignment : generate-schema
project :
service_name : ClientService
trace_id : trace-client-774
inputs :
domain_model_url : https://.../client/domain-model.yaml
validation_rules_url : https://.../client/validation.yaml
settings :
migration_strategy : sql
output_format : [ yaml ]
β
Validation Rules
Rule
Description
Must include trace_id, service_name, and at least domain_model_url
β
validation_rules_url and retention_policy_url recommended for compliance coverage
β
If migration_strategy = diff, previous_schema_url is required
β
If publish_contract: true, all outputs must pass validation before event emission
β
π Compatibility Notes
All inputs are trace-linked and version-aware
Fully compatible with:
Vision Architect Agent
Application Architect Agent
Domain Modeler Agent
Compliance Agent
Governance Dashboards
DevOps Pipelines
π€ Output Expectations
The Data Architect Agent produces a complete, traceable, and deployable data definition package.
These outputs are consumed by:
π§± Adapter Generator Agent (for ORM/entity generation)
π DevOps Architect Agent (for pipeline execution)
π§ͺ Test Generator Agent (for fixture bootstrapping)
π§ Governance Dashboards (for audit and compliance validation)
All artifacts include traceability , versioning , retention mapping , and observability hooks .
π¦ Output Artifact Summary
Artifact
Format
Description
schema.sql
SQL
Declarative schema DDL for relational DBs
schema.yaml
YAML
Agent-readable DSL version of schema
migration-*.sql
SQL
Forward-only or reversible migrations
Migrations/*.cs
C#
FluentMigrator migrations (imperative)
field-retention-map.yaml
YAML
Field-level retention/PII classification
data-schema-observability.yaml
YAML
Field-span mapping, trace injection, audit coverage
er-diagram.mmd
Mermaid
ER diagram for dev portal visualization
DatabaseSchemaPublished
JSON
CI/CD-triggered contract event
π§Ύ Detailed Artifact Descriptions
π¦ 1. schema.sql
A canonical DDL export aligned with domain structure and constraints
Includes:
Table definitions
Primary/foreign keys
Indexes, CHECK constraints
Nullable/required rules
Default values
Compatible with PostgreSQL and SQL Server dialects
π¨ 2. schema.yaml
Abstract schema model for platform-wide reuse
tables :
- name : invoice
fields :
- name : invoice_id
type : uuid
primary_key : true
required : true
- name : status
type : varchar(20)
enum : InvoiceStatus
pii : false
- name : tenant_id
type : uuid
required : true
indexed : true
retention :
status : 5y
observability :
trace_id : true
created_at : true
πͺ 3. Migrations
Type
Output
SQL
migration-20250429-add-table-client.sql
FluentMigrator
Migrations/20250429_AddTableClient.cs
Includes Up() (always), Down() (if enabled), with trace annotations
π₯ 4. field-retention-map.yaml
Compliance-oriented metadata extracted from policy inputs
invoice.status :
pii : false
retention : 7y
encrypted : false
client.email :
pii : true
encrypted : true
redactable : true
π© 5. data-schema-observability.yaml
Tells the Observability Agent what to trace at the DB level
tables :
client :
spans :
- insert :
span_name : db.crm.client.insert
trace_fields : [ trace_id , correlation_id ]
- update :
span_name : db.crm.client.update
π« 6. er-diagram.mmd
erDiagram
invoice {
UUID invoice_id PK
UUID client_id FK
NUMERIC amount
TIMESTAMP created_at
}
client {
UUID client_id PK
STRING email
TIMESTAMP deleted_at
}
client ||--o{ invoice : has
Hold "Alt" / "Option" to enable pan & zoom
π© 7. DatabaseSchemaPublished Event
Field
Description
event
DatabaseSchemaPublished
service
e.g., InvoiceService
schemaVersion
e.g., v2.0.1
traceId
Correlation ID
artifactUrls
DDL, DSL, diagrams, retention, observability
timestamp
UTC time of publication
Enables downstream jobs and dashboards to sync and validate schema artifacts
β
Output Compliance Criteria
Rule
Enforced
All tables have primary_key, created_at, tenant_id
β
Sensitive fields include x-sensitive, pii, encrypted tags
β
Span templates exist for insert, update, delete ops
β
Version bump triggered on any breaking change
β
Traceability fields present in all outputs
β
trace_id, project_id, service_name
π§ Memory Strategy Overview
The Data Architect Agent uses structured and semantic memory to:
Reuse known field structures, enum mappings, and constraints
Apply naming, compliance, and index patterns consistently
Detect historical schema versions and generate intelligent diffs
Prevent unsafe or duplicated migrations
Assist in automated versioning and changelog generation
This memory system operates in two scopes :
π Short-term (session memory) and π Long-term (knowledge base) .
π Short-Term Memory
Memory Area
Purpose
session.trace_id
Tracks lineage of current generation run
field_definitions[]
Captures all resolved fields + inferred types/constraints
aggregates[]
List of processed aggregates and their mappings
value_object_usage[]
Tracks flattened vs referenced VOs
generated_migrations[]
Avoids duplication and helps deduplicate FluentMigrator class names
schema_diff_cache
Temporarily holds change list between old and new schema for rollback/scoping logic
π Long-Term Semantic Memory
1. π¦ Reusable Schema Fragments
Pattern
Description
Money
{ amount NUMERIC(10,2), currency VARCHAR(3) }
Email
{ email VARCHAR(255) NOT NULL CHECK (email LIKE '%@%') }
AuditTrailFields
created_at, updated_at, deleted_at, created_by, trace_id
Multitenancy
tenant_id UUID NOT NULL, indexed, included in all queries
2. π Compliance & PII Tagging
Trigger
Output
Field name matches "email", "ssn", "dob"
Auto-tag with x-sensitive, encrypted, audit_required
Field flagged as TTL
Inject x-ttl-days and mark for storage pruning or indexing
Service is in a regulated domain (e.g., Insurance, Healthcare)
Enforce retention mapping and trace coverage
3. π Version History Ledger
Tracked
Use
schema@v1, schema@v2
Enables diff-based migration generation
field_changes[]
Auto-generates changelogs: added, removed, modified
breaking_changes[]
Triggers version bump logic and rollback plan (if allowed)
last_seen_field_states[]
Prevents downgrades or regressions in schema type evolution
4. π§± Migration Pattern Matching
Detection
Memory Action
Field type widening (e.g., varchar(50) β varchar(255))
Mark as safe
Enum extension
Log, suggest adding CHECK with extended IN clause
Field drop attempt
Flag as unsafe unless wrapped in reversible strategy
New VO used
Suggest reuse pattern based on prior flatten/reference ratio
Concept
Used For
"Address"
Suggest common VO flattening structure (address_line1, city, zip)
"Status"
Enum field with shared value set (Active, Suspended, Pending)
"AuditTrail"
Standard field group injection + x-audit-required: true
π§ͺ Memory-Driven Changelog Generation
When previous_schema.yaml is present, memory diff engine emits:
changelog :
version : v2.0.0
breaking_changes :
- field_removed : client.middle_name
- constraint_changed : invoice.amount now NOT NULL
additions :
- table : payment_method
- field_added : invoice.due_date
π§ Semantic Memory Match Example
query : "field: invoice.amount"
match :
reuse_pattern : Money
type : NUMERIC(10,2)
constraint : NOT NULL DEFAULT 0.00
traceability : true
audit_required : true
last_used_by : InvoiceService
β
Benefits of Memory Strategy
Capability
Value
Schema reuse
Faster, consistent generation across services
Compliance tagging
Reduced errors, governance integration
Safe migration evolution
Intelligent diffs and rollback logic
Version diffing
Enables changelog and CI/CD auditability
Observability propagation
Preserves trace span coverage across evolutions
β
Validation & Correction Framework
The Data Architect Agent executes multi-phase schema validation , correction, and recovery logic across all artifacts it generates.
It ensures data designs are:
Structurally valid
Compliant with domain and validation rules
Backward-compatible (when possible)
Safe for CI/CD deployment and migration
π Validation Phases
1οΈβ£ Schema Structure & Syntax Validation
Target
Check
schema.sql
Valid syntax per SQL dialect (PostgreSQL, SQL Server)
schema.yaml
Valid against ConnectSoft schema DSL schema
migration.sql
Executes in dry-run parser or mock DB
fluentmigrator.cs
Compiles with no semantic errors
enum definitions
Ensure values match usage and constraints
2οΈβ£ Domain β Storage Consistency Validation
Rule
Enforced
Every aggregate maps to one table or collection
β
Every required field must be NOT NULL
β
Field constraints match validation-map.yaml
β
(e.g., length, min/max, format)
Enum fields have correct constraint (CHECK or FK)
β
Field types are semantically consistent (Email β varchar)
β
3οΈβ£ Migration Safety Validation
Rule
Check
Dropping a column triggers warning + audit event
β οΈ
Changing nullable β non-null requires default or backfill
β
Field renames must include mapping
β
Duplicated migrations by name β flagged
β
New table already exists β abort or suggest suffixing
4οΈβ£ Compliance & Observability Validation
Rule
Action
Missing trace_id, created_at, tenant_id β inject
π
PII field detected but untagged β tag as x-sensitive
π
No observability span for insert/update/delete β warn and generate defaults
π
Retention policy undefined for ttl fields β assign fallback or escalate
β οΈ
π Retry & Auto-Correction Flow
Skill
Error
Retry Action
FieldTypeInfererSkill
Unknown domain type
Fallback to text or jsonb with comment
MigrationSafetyCheckerSkill
Unsafe drop
Remove drop, inject deprecation comment
PIITaggerSkill
Detected untagged field
Search glossary or pattern match
EnumStrategySelectorSkill
Enum mismatch
Convert to varchar(20) or FK lookup table
TraceFieldEnforcerSkill
trace_id missing
Add field + observability span
π Correction Example
{
"trace_id" : "data-trace-44192" ,
"skill" : "MigrationSafetyCheckerSkill" ,
"retry" : 1 ,
"issue" : "Unsafe type change: amount NUMERIC β INTEGER" ,
"resolution" : "Suggested add new column 'amount_int' and backfill from 'amount'"
}
π Metrics Collected
Metric
Description
schema_validation_passed_total
Count of successful full schema validations
migration_validation_failed_total
Count of rejected or unsafe migration attempts
pii_fields_tagged_total
Number of x-sensitive fields detected
observability_span_injected_total
Operations auto-covered
retry_count_by_skill
Diagnostic metric for agent behavior tuning
π¨ Escalation Conditions
Condition
Action
3+ retries in same skill
Emit SchemaGenerationWarning
Unsafe column deletion without --SAFE_DROP tag
Emit blocking error, abort pipeline
Trace fields missing from multiple tables
Block observability publication
Version mismatch with schema.version
Warn CI and halt DatabaseSchemaPublished event
β
Recovery Guarantee
Most failures are resolved via:
Pattern-based correction
Semantic fallback injection
Memory-guided type resolution
Retry with normalized constraint variants
Final output is only published if:
β
All artifacts pass validation
β
Changelog is emitted (if breaking)
β
Observability and compliance coverage β₯ 90%
π€ Collaboration Interfaces
The Data Architect Agent integrates seamlessly with both upstream semantic agents and downstream operational agents to maintain data-design integrity across the ConnectSoft ecosystem.
It consumes domain and application structure from upstream agents and produces schema, migration, and metadata artifacts for others to consume in deployment, validation, and visualization flows.
Agent
Artifact
Domain Modeler Agent
domain-model.yaml, validation-map.yaml, value-objects.yaml
Application Architect Agent
use-cases.yaml, application-architecture.md (multi-tenancy, soft delete, access patterns)
Compliance Architect Agent
field-retention-policy.yaml, PII rules
Solution Architect Agent
Overall platform schema strategy (shared DB, RLS, etc.)
Glossary Agent (optional)
Term-to-field mappings and reusable VO field sets
π½ Downstream Consumers
Agent
Artifacts Consumed
Adapter Generator Agent
schema.yaml, schema.sql, field-retention-map.yaml
β Generates ORM models, query layer, value mappers
| DevOps Architect Agent | migration-*.sql, Migrations/*.cs, schema.version, DatabaseSchemaPublished
β Integrates schema deployment into CI/CD pipelines |
| Test Generator Agent | schema.yaml, schema.sql
β Creates fixtures, seed data, integration DB tests |
| Observability Agent | data-schema-observability.yaml
β Generates span mapping and access telemetry rules |
| Compliance Dashboard Agent | field-retention-map.yaml, schema lineage
β Ensures traceable PII/redaction coverage |
| Developer Portal Generator | er-diagram.mmd, schema.yaml, data-relationships.md
β Builds visual API-to-DB relationship diagrams and schema browser |
π’ Event-Driven Interface
Event
Description
DatabaseSchemaPublished
Published once all validation and artifact generation passes
β Contains artifact links, trace_id, schemaVersion, service, timestamp
SchemaGenerationWarning (optional)
Emitted when unsafe migrations, missing PII tags, or trace fields are detected but auto-corrected
π§ Inter-Agent Traceability
All generated artifacts include trace metadata:
trace_id : data-trace-000112
agent_version : 1.3.0
service_name : InvoiceService
bounded_context : Billing
schema_version : v2.1.0
generated_on : 2025-05-01T23:15:00Z
β
Enables end-to-end tracing across modeling, schema, tests, and gateway configuration
π Collaboration Flow Diagram
flowchart TD
DomainModeler -->|domain-model.yaml| DataArchitect
ApplicationArchitect -->|use-cases.yaml| DataArchitect
ComplianceArchitect -->|retention-policies.yaml| DataArchitect
DataArchitect --> AdapterGenerator
DataArchitect --> DevOpsArchitect
DataArchitect --> TestGenerator
DataArchitect --> ObservabilityAgent
DataArchitect --> DeveloperPortal
Hold "Alt" / "Option" to enable pan & zoom
π Contract Compliance for Consumers
Constraint
Applied
Migration consumers must use provided version tag (schema.version)
β
ORM codegen must respect nullable and encrypted flags
β
PII fields must be consumed only via compliant field mappers
β
Tests must use fixture-safe schema versions (schema.yaml@vN)
β
Observability spans must align with table/action trace templates
β
β
Outcomes of Agent Collaboration
Capability
Benefit
Unified source of truth for DB schema
β
Every agent reads from schema.yaml
Safe migration rollout via CI/CD
β
DevOps Agent applies only validated .sql / .cs
Traceable PII and retention policies
β
Compliance & Security Dashboards stay aligned
Full-stack visibility
β
API calls trace down to fields and rows in DB
π‘ Observability Hooks
The Data Architect Agent is fully observable and auditable via:
OpenTelemetry spans emitted during each skill's execution
Lifecycle events for schema generation and migration publication
Semantic trace metadata embedded in all output files
These ensure complete transparency into what was generated , why , and how it evolved .
π Key Telemetry Metrics
Metric
Description
schema_generation_duration_ms
Total time to parse, generate, and validate all artifacts
pii_fields_tagged_total
Number of fields identified as sensitive/PII
migration_retry_count_total
Count of fallback/correction attempts during migration creation
observability_field_injection_rate
% of tables with required span fields (trace_id, created_at, etc.)
breaking_change_detection_count
How often the agent identifies and manages schema-breaking diffs
π§ Span Emission Model
Each skill emits a span:
{
"span_name" : "SchemaBuilderSkill" ,
"trace_id" : "data-trace-11722" ,
"duration_ms" : 42 ,
"agent_version" : "1.3.0" ,
"service_name" : "InvoiceService" ,
"status" : "success"
}
These spans are collected by the Observability Agent and visible in ConnectSoftβs trace dashboards.
π§ Human Oversight Triggers
Although autonomous, the agent flags and escalates edge cases such as:
Trigger
Escalation
PII fields without tagging β 3+ retries
SchemaGenerationWarning emitted
Unsafe DROP COLUMN without rollback support
Migration halted, escalated to Governance Officer
Traceability fields (trace_id, tenant_id) missing on multiple entities
Event suppressed until corrected
Schema version mismatch or illegal downgrade
Deployment blocked, audit note added
π Compliance and Governance Integration
All output files are tagged with:
agent_version, trace_id, schema_version, bounded_context, service_name
field-retention-map.yaml and data-schema-observability.yaml are synced with:
Compliance Dashboards
Data Loss Prevention Auditors
Governance Agents
π’ Lifecycle Event: DatabaseSchemaPublished
Field
Description
event
DatabaseSchemaPublished
schemaVersion
Auto-computed via semver diff engine
artifactUrls
Links to schema, migrations, diagram, retention map
generated_by
Data Architect Agent v1.3.0
trace_id
Traceable execution lineage
timestamp
UTC timestamp
β
Enables downstream CI/CD pipelines, test agents, and artifact promotion workflows
π Audit Trail Support
Each generated migration or schema file includes:
metadata :
trace_id : "data-trace-11233"
schema_version : "v2.1.0"
service : "InvoiceService"
generated_on : "2025-05-01T23:42:00Z"
generated_by : "Data Architect Agent v1.3.0"
β
Enables long-term schema evolution lineage and rollback validation
β
Final Outcomes
The Data Architect Agent enables ConnectSoft to:
Operate with governed, reusable, traceable schema designs
Automate migration generation with confidence and auditability
Align field-level data design with domain, compliance, and observability
Support multi-agent cooperation across platform, infrastructure, and business domains
π¦ Output Summary
Artifact
Purpose
schema.sql, schema.yaml
Declarative schema
migration.sql, .cs
Forward-compatible evolutions
field-retention-map.yaml
PII, TTL, compliance tracking
observability.yaml
Span coverage and trace field injection
er-diagram.mmd
Visual schema overview
DatabaseSchemaPublished
Event trigger for all downstream agents