State & Memory¶
Overview¶
The Factory runtime manages two distinct but related concepts:
- Run State — Operational state for active and recent runs (ephemeral, per-run)
- AI Memory — Long-term knowledge stored in the Knowledge & Memory System (persistent, cross-project)
This separation enables efficient operational execution while building a persistent knowledge base that improves over time.
Run State Store¶
What Is Stored¶
The Run State Store maintains operational state for Factory runs:
Run Metadata¶
- Run Identifiers — runId, tenantId, projectId, templateRecipeId
- Run Status — Current state (Requested, Validated, Queued, Running, Succeeded, Failed, Cancelled)
- Timestamps — requestedAt, startedAt, completedAt, updatedAt
- Request Context — User who requested, request parameters, configuration
Step/Job Status¶
- Job Identifiers — jobId, stepName, attempt number
- Job Status — Current state (Pending, Running, Succeeded, Failed, Cancelled)
- Job Results — Success/failure status, error messages, execution duration
- Artifact References — Links to generated artifacts (repo URLs, pipeline IDs, etc.)
Execution Context¶
- Correlation IDs — traceId, spanId for distributed tracing
- External System IDs — Azure DevOps buildId, repoId, pipelineId, workItemId
- Execution Metadata — Worker instance, execution environment, resource usage
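The fields listed above can be pictured as a pair of records. The following is a minimal sketch, not the Factory's actual schema: the class names, the snake_case field names, and the choice of which statuses count as terminal are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from enum import Enum
from typing import Optional


class RunStatus(str, Enum):
    REQUESTED = "Requested"
    VALIDATED = "Validated"
    QUEUED = "Queued"
    RUNNING = "Running"
    SUCCEEDED = "Succeeded"
    FAILED = "Failed"
    CANCELLED = "Cancelled"


# Assumption: these three statuses are treated as final.
TERMINAL_STATUSES = {RunStatus.SUCCEEDED, RunStatus.FAILED, RunStatus.CANCELLED}


@dataclass
class JobRecord:
    job_id: str
    step_name: str
    attempt: int = 1
    status: str = "Pending"
    error_message: Optional[str] = None
    duration_seconds: Optional[float] = None
    artifact_refs: list[str] = field(default_factory=list)


@dataclass
class RunRecord:
    run_id: str
    tenant_id: str
    project_id: str
    template_recipe_id: str
    status: RunStatus = RunStatus.REQUESTED
    requested_at: datetime = field(default_factory=lambda: datetime.now(timezone.utc))
    started_at: Optional[datetime] = None
    completed_at: Optional[datetime] = None
    trace_id: Optional[str] = None
    # External system IDs, e.g. {"buildId": "...", "repoId": "..."}
    external_ids: dict[str, str] = field(default_factory=dict)
    jobs: list[JobRecord] = field(default_factory=list)

    def is_terminal(self) -> bool:
        """Return True once the run has reached a final status."""
        return self.status in TERMINAL_STATUSES
```

Keeping job records nested under the run mirrors a document-store layout; in a relational store the same data would more likely be split into a runs table and a jobs table joined on the run identifier.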
Storage Choice¶
The Run State Store is typically implemented as:
- Relational Database (SQL) — For structured queries, joins, and transactional consistency
  - Examples: PostgreSQL, SQL Server, Azure SQL Database
  - Benefits: ACID transactions, complex queries, referential integrity
- Document Database — For flexible schema and horizontal scaling
  - Examples: Cosmos DB, MongoDB
  - Benefits: Schema flexibility, horizontal scaling, JSON-native storage
Considerations:
- Query Patterns — Relational DB for complex queries (e.g., "all runs for project X in last 30 days")
- Scale Requirements — Document DB for high-scale, multi-tenant scenarios
- Consistency Needs — Relational DB for strong consistency requirements
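To make the relational query pattern concrete, here is a small sketch of the "all runs for project X in last 30 days" query. It uses Python's built-in `sqlite3` module as a stand-in for a production database. The `runs` table, its columns, and the function names are hypothetical, and timestamps are assumed to be stored as ISO 8601 strings in UTC so that string comparison matches chronological order.

```python
import sqlite3
from datetime import datetime, timedelta, timezone


def create_schema(conn: sqlite3.Connection) -> None:
    # Hypothetical minimal table; a real Run State Store would hold more columns.
    conn.execute(
        """
        CREATE TABLE runs (
            run_id       TEXT PRIMARY KEY,
            project_id   TEXT NOT NULL,
            status       TEXT NOT NULL,
            requested_at TEXT NOT NULL
        )
        """
    )


def recent_runs(
    conn: sqlite3.Connection, project_id: str, now: datetime, days: int = 30
) -> list[str]:
    """Return run IDs for a project requested within the last `days` days, newest first."""
    cutoff = (now - timedelta(days=days)).isoformat()
    rows = conn.execute(
        "SELECT run_id FROM runs "
        "WHERE project_id = ? AND requested_at >= ? "
        "ORDER BY requested_at DESC",
        (project_id, cutoff),
    ).fetchall()
    return [row[0] for row in rows]
```

Passing `now` in explicitly rather than reading the clock inside the function keeps the query easy to test.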
Job State & Idempotency Keys¶
Idempotency Key Structure¶
Jobs use structured idempotency keys to ensure safe retries:
- Format: {runId}:{stepName}:{attempt}
- Example: run-abc123:generate-repo:1
- Purpose: Uniquely identifies a job execution attempt
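A key in this format can be built and parsed with a few lines of code. The sketch below is illustrative: the `IdempotencyKey` type and `parse_key` helper are hypothetical names, and it assumes that neither the run ID nor the step name contains a colon.

```python
from typing import NamedTuple


class IdempotencyKey(NamedTuple):
    run_id: str
    step_name: str
    attempt: int

    def __str__(self) -> str:
        # Render as {runId}:{stepName}:{attempt}
        return f"{self.run_id}:{self.step_name}:{self.attempt}"


def parse_key(raw: str) -> IdempotencyKey:
    """Split a key string back into its three components."""
    parts = raw.split(":")
    if len(parts) != 3:
        raise ValueError(f"expected 3 colon-separated parts, got {len(parts)}: {raw!r}")
    run_id, step_name, attempt = parts
    # int() raises ValueError if the attempt component is not numeric.
    return IdempotencyKey(run_id, step_name, int(attempt))
```

Because the attempt number is part of the key, each retry of the same step gets a distinct key, while a duplicate delivery of the same attempt maps to the same one.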
State Fields for Safe Retries¶
Job state includes fields that enable safe retries:
- idempotencyKey — Unique key for deduplication
- status — Current job status (Pending, Running, Succeeded, Failed, Cancelled)
- attemptNumber — Current retry attempt (1, 2, 3, ...)
- lastAttemptAt — Timestamp of last execution attempt
- result — Execution result (success/failure, error details)
- checkpoint — Progress checkpoint for resumable jobs
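One way these fields support safe retries is deduplication on the idempotency key: if the same attempt is delivered twice, the second delivery returns the recorded state instead of running the work again. The sketch below uses an in-memory dictionary purely for illustration. `JobState`, `InMemoryJobStore`, and `run_once` are hypothetical names, and a real store would persist this state and guard it against concurrent workers.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional


@dataclass
class JobState:
    idempotency_key: str
    status: str = "Pending"
    result: Optional[str] = None


class InMemoryJobStore:
    """Toy, single-process store keyed by idempotency key."""

    def __init__(self) -> None:
        self._jobs: Dict[str, JobState] = {}

    def run_once(self, key: str, work: Callable[[], str]) -> JobState:
        existing = self._jobs.get(key)
        if existing is not None and existing.status in ("Succeeded", "Failed"):
            # This attempt already finished: return the recorded outcome
            # instead of executing the work a second time.
            return existing

        state = existing or JobState(idempotency_key=key)
        self._jobs[key] = state
        state.status = "Running"
        try:
            state.result = work()
            state.status = "Succeeded"
        except Exception as exc:
            # Record the failure; a retry would use a new attempt number,
            # and therefore a new key.
            state.result = str(exc)
            state.status = "Failed"
        return state
```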
Atomic State Updates¶
State updates are atomic to prevent race conditions:
- Optimistic Locking — Use version numbers or timestamps to detect concurrent updates
- Transactional Updates — Use database transactions for multi-field updates
- Idempotent Operations — State updates are idempotent (applying the same update twice has no additional effect)
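Optimistic locking with a version number can be sketched as a conditional update: the write only succeeds if the row still carries the version the caller last read. The example below uses Python's built-in `sqlite3` module as a stand-in database. The `jobs` table, its `version` column, and the function names are assumptions for illustration.

```python
import sqlite3


def create_jobs_table(conn: sqlite3.Connection) -> None:
    # Hypothetical minimal table with a version column for optimistic locking.
    conn.execute(
        """
        CREATE TABLE jobs (
            idempotency_key TEXT PRIMARY KEY,
            status          TEXT NOT NULL,
            version         INTEGER NOT NULL DEFAULT 0
        )
        """
    )


def update_status(
    conn: sqlite3.Connection, key: str, new_status: str, expected_version: int
) -> bool:
    """Set the status only if nobody else has updated the row in the meantime.

    Returns True if the update was applied, False if the version was stale.
    """
    with conn:  # commits on success, rolls back if an exception is raised
        cursor = conn.execute(
            "UPDATE jobs SET status = ?, version = version + 1 "
            "WHERE idempotency_key = ? AND version = ?",
            (new_status, key, expected_version),
        )
    return cursor.rowcount == 1
```

A caller that gets `False` back would typically re-read the row, decide whether its update still applies, and retry with the fresh version.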
Artifacts & Metadata¶
Artifact Storage¶
Generated artifacts are stored in external systems, not in the Run State Store:
- Git Repositories — Code, tests, documentation stored in Azure DevOps or GitHub
- Azure DevOps — Pipelines, work items, artifacts stored in Azure DevOps
- Blob Storage — Large artifacts (diagrams, models) stored in Azure Blob Storage
- Container Registries — Docker images stored in Azure Container Registry
Artifact References¶
The Run State Store maintains references to artifacts:
- Repository URLs — Links to generated Git repositories
- Pipeline IDs — References to generated CI/CD pipelines
- Work Item IDs — References to created Azure DevOps work items
- Blob URLs — Links to stored artifacts in blob storage
- Artifact Metadata — Size, type, creation timestamp, checksums
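As an illustration of what a stored reference might carry, the sketch below builds a small metadata record for an artifact, including a SHA-256 checksum. The `ArtifactRef` type, its field names, and the choice of SHA-256 are assumptions. The source only states that size, type, creation timestamp, and checksums are recorded.

```python
import hashlib
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class ArtifactRef:
    url: str            # where the artifact lives (repo URL, blob URL, ...)
    artifact_type: str  # e.g. "diagram", "repository"
    size_bytes: int
    sha256: str
    created_at: datetime


def describe_artifact(url: str, artifact_type: str, content: bytes) -> ArtifactRef:
    """Build the reference stored in the Run State Store; the content itself stays external."""
    return ArtifactRef(
        url=url,
        artifact_type=artifact_type,
        size_bytes=len(content),
        sha256=hashlib.sha256(content).hexdigest(),
        created_at=datetime.now(timezone.utc),
    )
```

Storing a checksum alongside the URL lets a later reader verify that the external artifact has not changed since the run recorded it.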
Artifact Lifecycle¶
Artifacts follow a lifecycle managed by the Factory:
- Generation — Artifacts are generated by workers
- Storage — Artifacts are stored in external systems (Git, Azure DevOps, etc.)
- Reference — Artifact references are stored in Run State Store
- Indexing — Selected artifacts are indexed in Knowledge & Memory System
- Retention — Artifacts are retained according to retention policies
AI Memory & Knowledge System Integration¶
Operational State vs. Long-Term Memory¶
The Factory maintains a clear separation:
| Aspect | Operational State (Run State Store) | Long-Term Memory (Knowledge System) |
|---|---|---|
| Purpose | Track active runs, enable execution | Learn patterns, enable reuse |
| Lifetime | Ephemeral (weeks/months) | Persistent (years) |
| Scope | Per-run, per-project | Cross-project, cross-tenant |
| Query Pattern | Structured queries (SQL) | Semantic search (vector) |
| Update Frequency | High (real-time) | Low (batch/indexing) |
How Factory Interacts with Knowledge System¶
During Execution¶
- Pattern Retrieval — Agents query Knowledge System for similar past solutions
- Template Lookup — Look up templates and patterns from Knowledge System
- Context Enrichment — Retrieve relevant historical context for agents
After Execution¶
- Run Summaries — Store run summaries, outcomes, and learnings
- Pattern Extraction — Extract reusable patterns from generated artifacts
- Failure Analysis — Store failure patterns and resolutions for future reference
- Success Patterns — Index successful solutions for reuse
Vector Indexes¶
The Knowledge System uses vector indexes for semantic search:
- Template Knowledge — Vector embeddings of templates, blueprints, and patterns
- Past Runs — Vector embeddings of run summaries, decisions, and outcomes
- Code Patterns — Vector embeddings of code snippets and architectural patterns
- Domain Knowledge — Vector embeddings of domain-specific solutions
Example Query:
Query: "multi-tenant user management with role-based access"
→ Vector search finds:
- Past runs that generated similar solutions
- Templates for multi-tenant patterns
- Code patterns for RBAC implementation
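The ranking step behind such a query can be sketched with cosine similarity over embedding vectors. This is a toy illustration only: the three-dimensional vectors are hand-made rather than produced by an embedding model, the entry names are invented, and a real Knowledge System would delegate this to a vector database rather than a linear scan.

```python
import math
from typing import Dict, List, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query_vec: Sequence[float], index: Dict[str, Sequence[float]], k: int = 2) -> List[str]:
    """Return the names of the k indexed entries most similar to the query."""
    ranked = sorted(
        index.items(),
        key=lambda item: cosine_similarity(query_vec, item[1]),
        reverse=True,
    )
    return [name for name, _ in ranked[:k]]


# Hand-made vectors standing in for real embeddings (assumption).
toy_index = {
    "multi-tenant template": [0.9, 0.1, 0.0],
    "rbac code pattern": [0.7, 0.6, 0.1],
    "billing report": [0.0, 0.1, 0.9],
}
toy_query = [0.8, 0.4, 0.0]  # stands in for the embedded query text
nearest = top_k(toy_query, toy_index, k=2)
```

With these toy vectors, the two entries pointing in roughly the same direction as the query rank ahead of the unrelated one.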
State & Memory Architecture¶
graph TD
RunStore[(Run State DB)]
Queue[(Job Queue)]
Artifacts[(Repos, Pipelines, Docs)]
Memory[Knowledge & Memory System<br/>Vector DB, search]
Worker --> RunStore
Worker --> Queue
Worker --> Artifacts
Orchestrator --> RunStore
Orchestrator --> Queue
RunStore --> Memory
Artifacts --> Memory
Worker --> Memory
Orchestrator --> Memory
Data Flows:
- Operational Flow — Workers update RunStore with execution state
- Artifact Flow — Workers create artifacts in external systems, store references in RunStore
- Indexing Flow — Selected artifacts and run summaries are indexed in Memory System
- Query Flow — Agents query Memory System for patterns and context
State Retention and Cleanup¶
Run State Retention¶
Run state is retained for operational and audit purposes:
- Active Runs — Retained for as long as the run remains active
- Completed Runs — Retained for a configurable period (e.g., 90 days, 1 year)
- Failed Runs — Retained longer for debugging and analysis (e.g., 1 year)
- Archived Runs — Old runs can be archived to cold storage
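A retention rule of this kind reduces to comparing a run's completion time against a per-status period. The sketch below is one possible shape, using the example periods mentioned above (90 days for completed runs, 1 year for failed runs). The `RETENTION` mapping, the treatment of cancelled runs, and the `should_archive` name are assumptions, and in practice the periods would come from configuration.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

# Example periods only; assumed to be configurable in a real deployment.
RETENTION = {
    "Succeeded": timedelta(days=90),
    "Cancelled": timedelta(days=90),  # assumption: treated like completed runs
    "Failed": timedelta(days=365),
}


def should_archive(status: str, completed_at: Optional[datetime], now: datetime) -> bool:
    """Return True when a finished run has outlived its retention period."""
    period = RETENTION.get(status)
    if period is None or completed_at is None:
        # Active runs (no retention period, or not yet completed) are always kept.
        return False
    return now - completed_at > period
```

A cleanup job could call this for each run and move the matching records to cold storage rather than deleting them outright.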
Memory System Retention¶
The Knowledge & Memory System retains data indefinitely:
- Patterns — Retained permanently for pattern reuse
- Run Summaries — Retained for historical context and learning
- Artifact Indexes — Retained for semantic search and retrieval
- Failure Patterns — Retained for failure analysis and prevention
Related Documentation¶
- Knowledge and Memory System — Comprehensive guide to the Knowledge & Memory System
- Knowledge Indices — Vector search and semantic retrieval
- Knowledge Graph — Graph-based knowledge representation
- Execution Engine — How runs and jobs use state during execution
- Control Plane — How control plane manages state