🗄️ Database Engineer Agent Specification
🎯 Purpose
The Database Engineer Agent is responsible for implementation-level database work within the ConnectSoft AI Software Factory — translating Data Architect designs into executable schemas, migration scripts, optimized queries, indexing strategies, and multi-tenant data isolation implementations .
While the Data Architect Agent focuses on logical data modeling and governance, the Database Engineer Agent focuses on the physical realization : turning conceptual models into performant, deployable, and maintainable database artifacts.
💡 Why This Agent Matters
Without this agent:
Schema implementations would drift from Data Architect designs
Migration scripts would be manually written, error-prone, and inconsistent
Query performance issues would surface late in production rather than at design time
Multi-tenant data isolation would be implemented ad-hoc, risking cross-tenant data leakage
Index management would be reactive rather than proactive
With the Database Engineer Agent:
✅ Schemas are auto-generated from Data Architect logical models with full fidelity
✅ Migrations are versioned, reversible, and CI/CD-integrated
✅ Queries are analyzed and optimized before deployment
✅ Multi-tenant isolation is enforced at the database layer through row-level security, schema separation, or database-per-tenant strategies
✅ Index recommendations are data-driven and continuously refined
🧱 What This Agent Enables
Capability
Impact
🗃️ Schema implementation
Translate logical models to physical DDL scripts
🔄 Migration management
Versioned, reversible schema migrations with rollback support
⚡ Query optimization
Analyze and rewrite slow queries, suggest execution plan improvements
📊 Index management
Create, analyze, and maintain indexes based on query patterns
🔐 Multi-tenant isolation
Implement row-level security, schema separation, or database-per-tenant
💾 Backup/recovery strategy
Define backup schedules, retention policies, and recovery procedures
📈 Performance tuning
Connection pooling, query caching, partitioning, and statistics management
The Database Engineer Agent sits within the Software Engineering cluster, acting as the bridge between data architecture decisions and their physical database implementation.
flowchart TD
DataArchitect[Data Architect Agent]
DatabaseEngineer[🗄️ Database Engineer Agent]
BackendDev[Backend Developer Agent]
InfraEngineer[Infrastructure Engineer Agent]
DevOps[DevOps Architect Agent]
CodeReviewer[Code Reviewer Agent]
DataArchitect --> DatabaseEngineer
DatabaseEngineer --> BackendDev
DatabaseEngineer --> InfraEngineer
DatabaseEngineer --> DevOps
DatabaseEngineer --> CodeReviewer
Hold "Alt" / "Option" to enable pan & zoom
📡 Triggering Events
Event
Source
Description
data_model_finalized
Data Architect Agent
Logical data model approved and ready for physical implementation
schema_migration_required
Backend Developer Agent / Tech Lead
Schema change needed for new feature or modification
performance_optimization_requested
Observability Engineer Agent
Slow queries or database bottlenecks detected in production
tenant_onboarding_initiated
Platform Orchestrator
New tenant requires database provisioning and isolation setup
backup_policy_review_requested
Infrastructure Engineer Agent
Backup/recovery strategy needs creation or update
index_analysis_triggered
Scheduled / Performance Monitor
Periodic index health and optimization analysis
📋 Core Responsibilities
🔧 Responsibilities Overview
#
Responsibility
Description
Output
1
Schema implementation
Convert Data Architect logical models into physical DDL with constraints, types, and relationships
database-schema.sql, schema-definition.yaml
2
Migration script generation
Create versioned, reversible migration scripts for schema changes
migrations/ directory with up/down scripts
3
Query optimization
Analyze query execution plans, rewrite inefficient queries, suggest improvements
query-optimization-report.yaml
4
Index management
Design and maintain indexes based on query patterns, workload analysis, and cardinality
index-strategy.yaml
5
Multi-tenant isolation
Implement tenant data separation using RLS, schema-per-tenant, or database-per-tenant
tenant-isolation-config.yaml
6
Database performance tuning
Configure connection pooling, query caching, partitioning, and statistics
performance-config.yaml
7
Backup/recovery strategy
Define backup schedules, retention, point-in-time recovery, and disaster recovery procedures
backup-recovery-plan.yaml
📦 Deliverables
Artifact
Format
Purpose
database-schema.sql
SQL
Physical schema DDL with tables, constraints, indexes, and types
schema-definition.yaml
YAML
Declarative schema definition for code-first workflows
migrations/
SQL / C#
Versioned migration scripts (EF Core, Flyway, or raw SQL)
migration-plan.yaml
YAML
Migration execution plan with ordering, dependencies, and rollback steps
query-optimization-report.yaml
YAML
Analysis of slow queries with recommendations and rewritten alternatives
index-strategy.yaml
YAML
Index definitions, covering indexes, and maintenance schedules
tenant-isolation-config.yaml
YAML
Multi-tenant isolation implementation configuration
performance-config.yaml
YAML
Database performance tuning parameters
backup-recovery-plan.yaml
YAML
Backup schedules, retention policies, and recovery procedures
database-topology.mmd
Mermaid
Visual diagram of database relationships and tenant topology
DatabaseSchemaPublished
JSON (Event)
Lifecycle event emitted on successful schema generation
🗃️ 1. Schema Implementation
Task
Output
Parse logical data model from Data Architect
Entity-relationship validation report
Generate physical DDL with appropriate data types
database-schema.sql
Apply constraints (PK, FK, unique, check, defaults)
Constraint definitions per table
Create stored procedures and functions where required
Procedure scripts
Generate EF Core entity configurations for code-first
EntityTypeConfiguration classes
Sample Schema Output
CREATE TABLE [ dbo ].[ Invoices ] (
[ Id ] UNIQUEIDENTIFIER NOT NULL DEFAULT NEWSEQUENTIALID (),
[ TenantId ] UNIQUEIDENTIFIER NOT NULL ,
[ CustomerId ] UNIQUEIDENTIFIER NOT NULL ,
[ Amount ] DECIMAL ( 18 , 2 ) NOT NULL ,
[ Currency ] NVARCHAR ( 3 ) NOT NULL DEFAULT 'USD' ,
[ Status ] NVARCHAR ( 50 ) NOT NULL DEFAULT 'Draft' ,
[ CreatedAt ] DATETIME2 NOT NULL DEFAULT SYSUTCDATETIME (),
[ ModifiedAt ] DATETIME2 NOT NULL DEFAULT SYSUTCDATETIME (),
CONSTRAINT [ PK_Invoices ] PRIMARY KEY CLUSTERED ([ Id ]),
CONSTRAINT [ FK_Invoices_Customers ] FOREIGN KEY ([ CustomerId ])
REFERENCES [ dbo ].[ Customers ] ([ Id ]),
INDEX [ IX_Invoices_TenantId ] NONCLUSTERED ([ TenantId ]),
INDEX [ IX_Invoices_Status ] NONCLUSTERED ([ Status ]) INCLUDE ([ Amount ], [ Currency ])
);
🔄 2. Migration Management
Task
Output
Detect schema differences between current and target state
Diff report
Generate idempotent up/down migration scripts
Versioned migration files
Validate migration ordering and dependency chain
migration-plan.yaml
Support EF Core migrations, Flyway, and raw SQL modes
Format-specific output
Include rollback verification and data preservation checks
Rollback validation report
Migration Naming Convention
V{version}_{timestamp}_{description}.sql
Example: V003_20260329_AddInvoiceStatusIndex.sql
⚡ 3. Query Optimization
Analysis
Action
Execution plan analysis
Identify table scans, missing indexes, key lookups
Query rewriting
Suggest JOIN optimizations, CTE refactoring, predicate pushdown
Parameter sniffing detection
Flag queries vulnerable to plan cache pollution
N+1 query detection
Identify ORM-generated N+1 patterns from EF Core
Statistics staleness
Recommend statistics update schedules
🔐 4. Multi-Tenant Isolation
Strategy
When Used
Implementation
Row-Level Security (RLS)
Shared database, shared schema
Security policies filtering by TenantId
Schema-per-Tenant
Shared database, separate schemas
Dynamic schema routing in connection middleware
Database-per-Tenant
Full isolation required
Separate connection strings, database provisioning
Hybrid
Mixed workloads
RLS for standard tenants, dedicated DB for premium
🤝 Collaboration Interfaces
🔼 Upstream Providers
Agent
Input Artifact
Data Architect Agent
Logical data models, entity relationships, retention policies
Solution Architect Agent
Multi-tenancy strategy and data isolation requirements
Security Architect Agent
Encryption-at-rest policies, access control requirements
🔽 Downstream Consumers
Agent
Consumes
Backend Developer Agent
Schema definitions, EF Core configurations, migration scripts
Infrastructure Engineer Agent
Database provisioning requirements, backup policies
DevOps Architect Agent
Migration execution plans for CI/CD pipeline integration
Code Reviewer Agent
Query optimization reports for performance review gates
Observability Engineer Agent
Database performance metrics and alerting thresholds
📡 Collaboration Flow
flowchart TD
DataArchitect --> DatabaseEngineer
SolutionArchitect --> DatabaseEngineer
SecurityArchitect --> DatabaseEngineer
DatabaseEngineer --> BackendDeveloper
DatabaseEngineer --> InfrastructureEngineer
DatabaseEngineer --> DevOpsArchitect
DatabaseEngineer --> CodeReviewer
DatabaseEngineer --> ObservabilityEngineer
Hold "Alt" / "Option" to enable pan & zoom
📚 Memory and Knowledge
🕐 Short-Term (Session) Memory
Key
Purpose
trace_id
Tracks all outputs and events for the current schema design cycle
target_database
Database engine and version (SQL Server, PostgreSQL, CosmosDB)
migration_version
Current schema version and target version
tenant_strategy
Active multi-tenant isolation strategy
optimization_targets[]
Queries and tables flagged for optimization
🧠 Long-Term Semantic Memory
Data
Used For
Historical migration chains
Ensure migration compatibility and prevent conflicts
Query performance baselines
Compare optimization results against historical benchmarks
Index usage statistics
Identify unused indexes for removal, suggest missing indexes
Tenant growth patterns
Proactive capacity planning and isolation strategy adjustments
Schema evolution history
Track breaking changes and maintain backward compatibility
🧠 Knowledge Base
Domain
Content
Database Engine Specifics
SQL Server, PostgreSQL, CosmosDB optimization techniques and limitations
EF Core Patterns
Code-first conventions, migration best practices, performance pitfalls
Index Design
B-tree, hash, columnstore, filtered, and covering index strategies
Multi-Tenancy Patterns
RLS implementation, schema routing, connection management
ConnectSoft Standards
Naming conventions, data type preferences, constraint policies
Migration Safety
Zero-downtime migration techniques, backward-compatible schema changes
✅ Validation and Correction
🔍 Validation Rules
Rule
Description
Severity
All tables must have a primary key
No heap tables allowed
Critical
Foreign keys must reference existing tables
Referential integrity enforcement
Critical
TenantId column required on all tenant-scoped tables
Multi-tenant compliance
Critical
Migrations must be idempotent and reversible
Safe deployment guarantee
High
All indexed columns must have appropriate statistics
Query optimizer support
Medium
Naming must follow ConnectSoft conventions
PascalCase tables, IX_ prefix for indexes
Medium
No SELECT * in stored procedures or views
Explicit column selection
High
🔁 Auto-Correction Behaviors
Condition
Correction
Missing TenantId on tenant-scoped table
Auto-add column with RLS policy
Index missing on foreign key column
Generate index definition
Migration lacks rollback script
Generate inverse DDL
Data type mismatch with logical model
Align to Data Architect specification
Missing CreatedAt / ModifiedAt audit columns
Inject standard audit columns
📈 Observability Spans
Span Name
Trigger
schema_generation_started
Agent begins schema creation
migration_scripts_generated
Migration files created
query_optimization_completed
Query analysis finished
tenant_isolation_configured
Multi-tenant setup completed
database_schema_published
All artifacts emitted successfully
📢 Lifecycle Events
Event
Trigger
Payload
DatabaseSchemaPublished
Schema generation completed
Schema version, table count, trace_id
MigrationPlanCreated
Migration scripts ready
Migration version, step count, rollback status
QueryOptimizationCompleted
Performance analysis done
Query count, improvement percentage
TenantIsolationConfigured
Multi-tenant setup completed
Strategy type, tenant count
DatabaseValidationFailed
Validation rules not met
Failed rules, affected tables
✅ Summary
The Database Engineer Agent ensures that all database implementations in the ConnectSoft platform are:
Capability
Status
🗃️ Faithful physical implementation of Data Architect designs
✅
🔄 Versioned, reversible, CI/CD-ready migrations
✅
⚡ Optimized queries with execution plan analysis
✅
📊 Data-driven index management and maintenance
✅
🔐 Enforced multi-tenant data isolation
✅
💾 Defined backup/recovery strategies
✅
📈 Continuous database performance tuning
✅
"The Database Engineer Agent bridges the gap between data architecture vision and database reality — ensuring every schema, migration, and query in the ConnectSoft platform is performant, secure, and production-ready."