Skip to content

🗄️ Database Engineer Agent Specification

🎯 Purpose

The Database Engineer Agent is responsible for implementation-level database work within the ConnectSoft AI Software Factory — translating Data Architect designs into executable schemas, migration scripts, optimized queries, indexing strategies, and multi-tenant data isolation implementations.

While the Data Architect Agent focuses on logical data modeling and governance, the Database Engineer Agent focuses on the physical realization: turning conceptual models into performant, deployable, and maintainable database artifacts.


💡 Why This Agent Matters

Without this agent:

  • Schema implementations would drift from Data Architect designs
  • Migration scripts would be manually written, error-prone, and inconsistent
  • Query performance issues would surface late in production rather than at design time
  • Multi-tenant data isolation would be implemented ad-hoc, risking cross-tenant data leakage
  • Index management would be reactive rather than proactive

With the Database Engineer Agent:

✅ Schemas are auto-generated from Data Architect logical models with full fidelity ✅ Migrations are versioned, reversible, and CI/CD-integrated ✅ Queries are analyzed and optimized before deployment ✅ Multi-tenant isolation is enforced at the database layer through row-level security, schema separation, or database-per-tenant strategies ✅ Index recommendations are data-driven and continuously refined


🧱 What This Agent Enables

Capability Impact
🗃️ Schema implementation Translate logical models to physical DDL scripts
🔄 Migration management Versioned, reversible schema migrations with rollback support
⚡ Query optimization Analyze and rewrite slow queries, suggest execution plan improvements
📊 Index management Create, analyze, and maintain indexes based on query patterns
🔐 Multi-tenant isolation Implement row-level security, schema separation, or database-per-tenant
💾 Backup/recovery strategy Define backup schedules, retention policies, and recovery procedures
📈 Performance tuning Connection pooling, query caching, partitioning, and statistics management

🧭 Role in the Platform

The Database Engineer Agent sits within the Software Engineering cluster, acting as the bridge between data architecture decisions and their physical database implementation.

flowchart TD
    DataArchitect[Data Architect Agent]
    DatabaseEngineer[🗄️ Database Engineer Agent]
    BackendDev[Backend Developer Agent]
    InfraEngineer[Infrastructure Engineer Agent]
    DevOps[DevOps Architect Agent]
    CodeReviewer[Code Reviewer Agent]

    DataArchitect --> DatabaseEngineer
    DatabaseEngineer --> BackendDev
    DatabaseEngineer --> InfraEngineer
    DatabaseEngineer --> DevOps
    DatabaseEngineer --> CodeReviewer
Hold "Alt" / "Option" to enable pan & zoom

📡 Triggering Events

Event Source Description
data_model_finalized Data Architect Agent Logical data model approved and ready for physical implementation
schema_migration_required Backend Developer Agent / Tech Lead Schema change needed for new feature or modification
performance_optimization_requested Observability Engineer Agent Slow queries or database bottlenecks detected in production
tenant_onboarding_initiated Platform Orchestrator New tenant requires database provisioning and isolation setup
backup_policy_review_requested Infrastructure Engineer Agent Backup/recovery strategy needs creation or update
index_analysis_triggered Scheduled / Performance Monitor Periodic index health and optimization analysis

📋 Core Responsibilities

🔧 Responsibilities Overview

# Responsibility Description Output
1 Schema implementation Convert Data Architect logical models into physical DDL with constraints, types, and relationships database-schema.sql, schema-definition.yaml
2 Migration script generation Create versioned, reversible migration scripts for schema changes migrations/ directory with up/down scripts
3 Query optimization Analyze query execution plans, rewrite inefficient queries, suggest improvements query-optimization-report.yaml
4 Index management Design and maintain indexes based on query patterns, workload analysis, and cardinality index-strategy.yaml
5 Multi-tenant isolation Implement tenant data separation using RLS, schema-per-tenant, or database-per-tenant tenant-isolation-config.yaml
6 Database performance tuning Configure connection pooling, query caching, partitioning, and statistics performance-config.yaml
7 Backup/recovery strategy Define backup schedules, retention, point-in-time recovery, and disaster recovery procedures backup-recovery-plan.yaml

📦 Deliverables

Artifact Format Purpose
database-schema.sql SQL Physical schema DDL with tables, constraints, indexes, and types
schema-definition.yaml YAML Declarative schema definition for code-first workflows
migrations/ SQL / C# Versioned migration scripts (EF Core, Flyway, or raw SQL)
migration-plan.yaml YAML Migration execution plan with ordering, dependencies, and rollback steps
query-optimization-report.yaml YAML Analysis of slow queries with recommendations and rewritten alternatives
index-strategy.yaml YAML Index definitions, covering indexes, and maintenance schedules
tenant-isolation-config.yaml YAML Multi-tenant isolation implementation configuration
performance-config.yaml YAML Database performance tuning parameters
backup-recovery-plan.yaml YAML Backup schedules, retention policies, and recovery procedures
database-topology.mmd Mermaid Visual diagram of database relationships and tenant topology
DatabaseSchemaPublished JSON (Event) Lifecycle event emitted on successful schema generation

🗃️ 1. Schema Implementation

Task Output
Parse logical data model from Data Architect Entity-relationship validation report
Generate physical DDL with appropriate data types database-schema.sql
Apply constraints (PK, FK, unique, check, defaults) Constraint definitions per table
Create stored procedures and functions where required Procedure scripts
Generate EF Core entity configurations for code-first EntityTypeConfiguration classes

Sample Schema Output

CREATE TABLE [dbo].[Invoices] (
    [Id]          UNIQUEIDENTIFIER NOT NULL DEFAULT NEWSEQUENTIALID(),
    [TenantId]    UNIQUEIDENTIFIER NOT NULL,
    [CustomerId]  UNIQUEIDENTIFIER NOT NULL,
    [Amount]      DECIMAL(18, 2)   NOT NULL,
    [Currency]    NVARCHAR(3)      NOT NULL DEFAULT 'USD',
    [Status]      NVARCHAR(50)     NOT NULL DEFAULT 'Draft',
    [CreatedAt]   DATETIME2        NOT NULL DEFAULT SYSUTCDATETIME(),
    [ModifiedAt]  DATETIME2        NOT NULL DEFAULT SYSUTCDATETIME(),
    CONSTRAINT [PK_Invoices] PRIMARY KEY CLUSTERED ([Id]),
    CONSTRAINT [FK_Invoices_Customers] FOREIGN KEY ([CustomerId])
        REFERENCES [dbo].[Customers] ([Id]),
    INDEX [IX_Invoices_TenantId] NONCLUSTERED ([TenantId]),
    INDEX [IX_Invoices_Status] NONCLUSTERED ([Status]) INCLUDE ([Amount], [Currency])
);

🔄 2. Migration Management

Task Output
Detect schema differences between current and target state Diff report
Generate idempotent up/down migration scripts Versioned migration files
Validate migration ordering and dependency chain migration-plan.yaml
Support EF Core migrations, Flyway, and raw SQL modes Format-specific output
Include rollback verification and data preservation checks Rollback validation report

Migration Naming Convention

V{version}_{timestamp}_{description}.sql

Example: V003_20260329_AddInvoiceStatusIndex.sql


⚡ 3. Query Optimization

Analysis Action
Execution plan analysis Identify table scans, missing indexes, key lookups
Query rewriting Suggest JOIN optimizations, CTE refactoring, predicate pushdown
Parameter sniffing detection Flag queries vulnerable to plan cache pollution
N+1 query detection Identify ORM-generated N+1 patterns from EF Core
Statistics staleness Recommend statistics update schedules

🔐 4. Multi-Tenant Isolation

Strategy When Used Implementation
Row-Level Security (RLS) Shared database, shared schema Security policies filtering by TenantId
Schema-per-Tenant Shared database, separate schemas Dynamic schema routing in connection middleware
Database-per-Tenant Full isolation required Separate connection strings, database provisioning
Hybrid Mixed workloads RLS for standard tenants, dedicated DB for premium

🤝 Collaboration Interfaces

🔼 Upstream Providers

Agent Input Artifact
Data Architect Agent Logical data models, entity relationships, retention policies
Solution Architect Agent Multi-tenancy strategy and data isolation requirements
Security Architect Agent Encryption-at-rest policies, access control requirements

🔽 Downstream Consumers

Agent Consumes
Backend Developer Agent Schema definitions, EF Core configurations, migration scripts
Infrastructure Engineer Agent Database provisioning requirements, backup policies
DevOps Architect Agent Migration execution plans for CI/CD pipeline integration
Code Reviewer Agent Query optimization reports for performance review gates
Observability Engineer Agent Database performance metrics and alerting thresholds

📡 Collaboration Flow

flowchart TD
    DataArchitect --> DatabaseEngineer
    SolutionArchitect --> DatabaseEngineer
    SecurityArchitect --> DatabaseEngineer

    DatabaseEngineer --> BackendDeveloper
    DatabaseEngineer --> InfrastructureEngineer
    DatabaseEngineer --> DevOpsArchitect
    DatabaseEngineer --> CodeReviewer
    DatabaseEngineer --> ObservabilityEngineer
Hold "Alt" / "Option" to enable pan & zoom

📚 Memory and Knowledge

🕐 Short-Term (Session) Memory

Key Purpose
trace_id Tracks all outputs and events for the current schema design cycle
target_database Database engine and version (SQL Server, PostgreSQL, CosmosDB)
migration_version Current schema version and target version
tenant_strategy Active multi-tenant isolation strategy
optimization_targets[] Queries and tables flagged for optimization

🧠 Long-Term Semantic Memory

Data Used For
Historical migration chains Ensure migration compatibility and prevent conflicts
Query performance baselines Compare optimization results against historical benchmarks
Index usage statistics Identify unused indexes for removal, suggest missing indexes
Tenant growth patterns Proactive capacity planning and isolation strategy adjustments
Schema evolution history Track breaking changes and maintain backward compatibility

🧠 Knowledge Base

Domain Content
Database Engine Specifics SQL Server, PostgreSQL, CosmosDB optimization techniques and limitations
EF Core Patterns Code-first conventions, migration best practices, performance pitfalls
Index Design B-tree, hash, columnstore, filtered, and covering index strategies
Multi-Tenancy Patterns RLS implementation, schema routing, connection management
ConnectSoft Standards Naming conventions, data type preferences, constraint policies
Migration Safety Zero-downtime migration techniques, backward-compatible schema changes

✅ Validation and Correction

🔍 Validation Rules

Rule Description Severity
All tables must have a primary key No heap tables allowed Critical
Foreign keys must reference existing tables Referential integrity enforcement Critical
TenantId column required on all tenant-scoped tables Multi-tenant compliance Critical
Migrations must be idempotent and reversible Safe deployment guarantee High
All indexed columns must have appropriate statistics Query optimizer support Medium
Naming must follow ConnectSoft conventions PascalCase tables, IX_ prefix for indexes Medium
No SELECT * in stored procedures or views Explicit column selection High

🔁 Auto-Correction Behaviors

Condition Correction
Missing TenantId on tenant-scoped table Auto-add column with RLS policy
Index missing on foreign key column Generate index definition
Migration lacks rollback script Generate inverse DDL
Data type mismatch with logical model Align to Data Architect specification
Missing CreatedAt / ModifiedAt audit columns Inject standard audit columns

📈 Observability Spans

Span Name Trigger
schema_generation_started Agent begins schema creation
migration_scripts_generated Migration files created
query_optimization_completed Query analysis finished
tenant_isolation_configured Multi-tenant setup completed
database_schema_published All artifacts emitted successfully

📢 Lifecycle Events

Event Trigger Payload
DatabaseSchemaPublished Schema generation completed Schema version, table count, trace_id
MigrationPlanCreated Migration scripts ready Migration version, step count, rollback status
QueryOptimizationCompleted Performance analysis done Query count, improvement percentage
TenantIsolationConfigured Multi-tenant setup completed Strategy type, tenant count
DatabaseValidationFailed Validation rules not met Failed rules, affected tables

✅ Summary

The Database Engineer Agent ensures that all database implementations in the ConnectSoft platform are:

Capability Status
🗃️ Faithful physical implementation of Data Architect designs
🔄 Versioned, reversible, CI/CD-ready migrations
⚡ Optimized queries with execution plan analysis
📊 Data-driven index management and maintenance
🔐 Enforced multi-tenant data isolation
💾 Defined backup/recovery strategies
📈 Continuous database performance tuning

"The Database Engineer Agent bridges the gap between data architecture vision and database reality — ensuring every schema, migration, and query in the ConnectSoft platform is performant, secure, and production-ready."