State & Memory

Overview

The Factory runtime manages two distinct but related concepts:

  1. Run State — Operational state for active and recent runs (ephemeral, per-run)
  2. AI Memory — Long-term knowledge stored in the Knowledge & Memory System (persistent, cross-project)

This separation enables efficient operational execution while building a persistent knowledge base that improves over time.


Run State Store

What Is Stored

The Run State Store maintains operational state for Factory runs:

Run Metadata

  • Run Identifiers — runId, tenantId, projectId, templateRecipeId
  • Run Status — Current state (Requested, Validated, Queued, Running, Succeeded, Failed, Cancelled)
  • Timestamps — requestedAt, startedAt, completedAt, updatedAt
  • Request Context — User who requested, request parameters, configuration

Step/Job Status

  • Job Identifiers — jobId, stepName, attempt number
  • Job Status — Current state (Pending, Running, Succeeded, Failed, Cancelled)
  • Job Results — Success/failure status, error messages, execution duration
  • Artifact References — Links to generated artifacts (repo URLs, pipeline IDs, etc.)

Execution Context

  • Correlation IDs — traceId, spanId for distributed tracing
  • External System IDs — Azure DevOps buildId, repoId, pipelineId, workItemId
  • Execution Metadata — Worker instance, execution environment, resource usage
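The fields listed above could be modelled roughly as follows. This is a minimal sketch, not the Factory's actual schema: the class names, the snake_case field names, and the set of terminal statuses are assumptions made for illustration, and only a subset of the documented fields is shown.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from enum import Enum
from typing import List, Optional


class RunStatus(str, Enum):
    """Run statuses as listed under Run Metadata."""
    REQUESTED = "Requested"
    VALIDATED = "Validated"
    QUEUED = "Queued"
    RUNNING = "Running"
    SUCCEEDED = "Succeeded"
    FAILED = "Failed"
    CANCELLED = "Cancelled"


# Assumption: these three statuses are terminal. The source lists the
# statuses but does not spell out the allowed transitions.
TERMINAL_STATUSES = {RunStatus.SUCCEEDED, RunStatus.FAILED, RunStatus.CANCELLED}


@dataclass
class JobRecord:
    """Step/job status for one execution attempt (hypothetical shape)."""
    job_id: str
    step_name: str
    attempt: int
    status: str = "Pending"
    error_message: Optional[str] = None
    duration_seconds: Optional[float] = None
    artifact_refs: List[str] = field(default_factory=list)


@dataclass
class RunRecord:
    """Run metadata plus execution context (hypothetical shape)."""
    run_id: str
    tenant_id: str
    project_id: str
    template_recipe_id: str
    status: RunStatus = RunStatus.REQUESTED
    requested_at: datetime = field(default_factory=lambda: datetime.now(timezone.utc))
    started_at: Optional[datetime] = None
    completed_at: Optional[datetime] = None
    trace_id: Optional[str] = None
    jobs: List[JobRecord] = field(default_factory=list)

    def is_finished(self) -> bool:
        """True once the run has reached a terminal status."""
        return self.status in TERMINAL_STATUSES
```

A record created with only its identifiers starts in the Requested status, and `is_finished()` stays false until the status is set to one of the terminal values.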

Storage Choice

The Run State Store is typically implemented as:

  • Relational Database (SQL) — For structured queries, joins, and transactional consistency
      • Examples: PostgreSQL, SQL Server, Azure SQL Database
      • Benefits: ACID transactions, complex queries, referential integrity
  • Document Database — For flexible schema and horizontal scaling
      • Examples: Cosmos DB, MongoDB
      • Benefits: Schema flexibility, horizontal scaling, JSON-native storage

Considerations:

  • Query Patterns — Relational DB for complex queries (e.g., "all runs for project X in last 30 days")
  • Scale Requirements — Document DB for high-scale, multi-tenant scenarios
  • Consistency Needs — Relational DB for strong consistency requirements


Job State & Idempotency Keys

Idempotency Key Structure

Jobs use structured idempotency keys to ensure safe retries:

  • Format: {runId}:{stepName}:{attempt}
  • Example: run-abc123:generate-repo:1
  • Purpose: Uniquely identifies a job execution attempt
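As an illustration of the key format above, the sketch below builds and parses a key. The `IdempotencyKey` type and `parse_idempotency_key` helper are hypothetical names, and the parser assumes that step names never contain a colon.

```python
from typing import NamedTuple


class IdempotencyKey(NamedTuple):
    """One job execution attempt, rendered as {runId}:{stepName}:{attempt}."""
    run_id: str
    step_name: str
    attempt: int

    def __str__(self) -> str:
        return f"{self.run_id}:{self.step_name}:{self.attempt}"


def parse_idempotency_key(raw: str) -> IdempotencyKey:
    """Parse a key string; raises ValueError if it is malformed."""
    # Split from the right so that only the last two colons act as
    # separators (assumption: step names contain no colon).
    run_id, step_name, attempt = raw.rsplit(":", 2)
    if not run_id or not step_name:
        raise ValueError(f"malformed idempotency key: {raw!r}")
    return IdempotencyKey(run_id, step_name, int(attempt))
```

Formatting `IdempotencyKey("run-abc123", "generate-repo", 1)` with `str()` yields the documented example key, and parsing that string returns an equal tuple.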

State Fields for Safe Retries

Job state includes fields that enable safe retries:

  • idempotencyKey — Unique key for deduplication
  • status — Current job status (Pending, Running, Succeeded, Failed, Cancelled)
  • attemptNumber — Current retry attempt (1, 2, 3, ...)
  • lastAttemptAt — Timestamp of last execution attempt
  • result — Execution result (success/failure, error details)
  • checkpoint — Progress checkpoint for resumable jobs
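One way these fields can support deduplication is sketched below with an in-memory store. The `InMemoryJobStore` class and its `run_once` method are hypothetical, and the sketch assumes that a key which already reached a terminal status is never executed again, so a retry must arrive under a new key with a higher attempt number.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Any, Callable, Dict, Optional

TERMINAL = {"Succeeded", "Failed", "Cancelled"}


@dataclass
class JobState:
    idempotency_key: str
    attempt_number: int
    status: str = "Pending"
    last_attempt_at: Optional[datetime] = None
    result: Optional[Any] = None


class InMemoryJobStore:
    """Toy stand-in for the Run State Store, keyed by idempotency key."""

    def __init__(self) -> None:
        self._jobs: Dict[str, JobState] = {}

    def run_once(self, key: str, work: Callable[[], Any]) -> JobState:
        existing = self._jobs.get(key)
        if existing is not None and existing.status in TERMINAL:
            # Duplicate delivery of a finished attempt: return the
            # recorded state instead of executing the work again.
            return existing
        if existing is None:
            # The attempt number is the last segment of the key.
            attempt = int(key.rsplit(":", 1)[1])
            existing = JobState(idempotency_key=key, attempt_number=attempt)
            self._jobs[key] = existing
        existing.status = "Running"
        existing.last_attempt_at = datetime.now(timezone.utc)
        try:
            existing.result = work()
            existing.status = "Succeeded"
        except Exception as exc:
            existing.result = f"error: {exc}"
            existing.status = "Failed"
        return existing
```

Calling `run_once` twice with the same key runs the work only the first time; the second call returns the stored state. A real store would also need the atomic update guarantees described in the next section, which this single-process sketch does not provide.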

Atomic State Updates

State updates are atomic to prevent race conditions:

  • Optimistic Locking — Use version numbers or timestamps to detect concurrent updates
  • Transactional Updates — Use database transactions for multi-field updates
  • Idempotent Operations — State updates are idempotent (applying the same update a second time leaves the state unchanged)
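Optimistic locking with a version number can be sketched as a compare-and-set update. The example uses Python's built-in `sqlite3` module purely as a stand-in for whichever database backs the Run State Store; the `job_state` table, its columns, and the `update_status` helper are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE job_state ("
    " idempotency_key TEXT PRIMARY KEY,"
    " status TEXT NOT NULL,"
    " version INTEGER NOT NULL)"
)
conn.execute(
    "INSERT INTO job_state VALUES (?, 'Pending', 1)",
    ("run-abc123:generate-repo:1",),
)
conn.commit()


def update_status(conn: sqlite3.Connection, key: str,
                  new_status: str, expected_version: int) -> bool:
    """Update the row only if its version still matches what the caller read.

    Returns False when another writer changed the row first, in which
    case the caller should re-read the state and decide again.
    """
    with conn:  # commits on success, rolls back if an exception is raised
        cur = conn.execute(
            "UPDATE job_state SET status = ?, version = version + 1 "
            "WHERE idempotency_key = ? AND version = ?",
            (new_status, key, expected_version),
        )
    return cur.rowcount == 1
```

A writer that read version 1 succeeds and bumps the row to version 2. A second writer still holding version 1 matches no rows, so its update is rejected rather than silently overwriting the first.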

Artifacts & Metadata

Artifact Storage

Generated artifacts are stored in external systems, not in the Run State Store:

  • Git Repositories — Code, tests, documentation stored in Azure DevOps or GitHub
  • Azure DevOps — Pipelines, work items, artifacts stored in Azure DevOps
  • Blob Storage — Large artifacts (diagrams, models) stored in Azure Blob Storage
  • Container Registries — Docker images stored in Azure Container Registry

Artifact References

The Run State Store maintains references to artifacts:

  • Repository URLs — Links to generated Git repositories
  • Pipeline IDs — References to generated CI/CD pipelines
  • Work Item IDs — References to created Azure DevOps work items
  • Blob URLs — Links to stored artifacts in blob storage
  • Artifact Metadata — Size, type, creation timestamp, checksums
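A reference record carrying the metadata listed above might look like the following sketch. The `ArtifactRef` type and `make_artifact_ref` helper are hypothetical, and SHA-256 is an assumed checksum algorithm since the source does not name one.

```python
import hashlib
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class ArtifactRef:
    """Pointer to an artifact held in an external system, plus metadata."""
    url: str
    artifact_type: str
    size_bytes: int
    sha256: str
    created_at: datetime


def make_artifact_ref(url: str, artifact_type: str, content: bytes) -> ArtifactRef:
    """Build the reference stored in the Run State Store.

    Only the reference is kept; the content itself stays in the
    external system (Git, Blob Storage, and so on).
    """
    return ArtifactRef(
        url=url,
        artifact_type=artifact_type,
        size_bytes=len(content),
        sha256=hashlib.sha256(content).hexdigest(),
        created_at=datetime.now(timezone.utc),
    )
```

Storing the checksum alongside the URL lets a later reader verify that the external artifact has not changed since the run recorded it.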

Artifact Lifecycle

Artifacts follow a lifecycle managed by the Factory:

  1. Generation — Artifacts are generated by workers
  2. Storage — Artifacts are stored in external systems (Git, Azure DevOps, etc.)
  3. Reference — Artifact references are stored in Run State Store
  4. Indexing — Selected artifacts are indexed in Knowledge & Memory System
  5. Retention — Artifacts are retained according to retention policies

AI Memory & Knowledge System Integration

Operational State vs. Long-Term Memory

The Factory maintains a clear separation:

| Aspect | Operational State (Run State Store) | Long-Term Memory (Knowledge System) |
|---|---|---|
| Purpose | Track active runs, enable execution | Learn patterns, enable reuse |
| Lifetime | Ephemeral (weeks/months) | Persistent (years) |
| Scope | Per-run, per-project | Cross-project, cross-tenant |
| Query Pattern | Structured queries (SQL) | Semantic search (vector) |
| Update Frequency | High (real-time) | Low (batch/indexing) |

How Factory Interacts with Knowledge System

During Execution

  • Pattern Retrieval — Agents query Knowledge System for similar past solutions
  • Template Lookup — Look up templates and patterns from Knowledge System
  • Context Enrichment — Retrieve relevant historical context for agents

After Execution

  • Run Summaries — Store run summaries, outcomes, and learnings
  • Pattern Extraction — Extract reusable patterns from generated artifacts
  • Failure Analysis — Store failure patterns and resolutions for future reference
  • Success Patterns — Index successful solutions for reuse

Vector Indexes

The Knowledge System uses vector indexes for semantic search:

  • Template Knowledge — Vector embeddings of templates, blueprints, and patterns
  • Past Runs — Vector embeddings of run summaries, decisions, and outcomes
  • Code Patterns — Vector embeddings of code snippets and architectural patterns
  • Domain Knowledge — Vector embeddings of domain-specific solutions

Example Query:

Query: "multi-tenant user management with role-based access"
→ Vector search finds:
  - Past runs that generated similar solutions
  - Templates for multi-tenant patterns
  - Code patterns for RBAC implementation
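The ranking step behind such a query can be illustrated with cosine similarity over embeddings. This is a toy sketch: the three-dimensional vectors and document IDs are made up, a real system would obtain embeddings from a model and query a vector database, and the source does not say which similarity metric the Knowledge System uses.

```python
import math
from typing import Dict, List


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Cosine of the angle between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def search(query_vec: List[float], index: Dict[str, List[float]],
           top_k: int = 2) -> List[str]:
    """Return the IDs of the top_k entries most similar to the query."""
    scored = [(cosine_similarity(query_vec, vec), doc_id)
              for doc_id, vec in index.items()]
    scored.sort(reverse=True)
    return [doc_id for _, doc_id in scored[:top_k]]


# Made-up embeddings standing in for indexed templates, patterns, and runs.
index = {
    "template:multi-tenant": [0.9, 0.1, 0.0],
    "pattern:rbac": [0.7, 0.6, 0.1],
    "run:billing-export": [0.0, 0.1, 0.9],
}
# Made-up embedding of the query text.
query = [0.8, 0.4, 0.0]
```

With these vectors, `search(query, index)` ranks the RBAC pattern first and the multi-tenant template second, while the unrelated billing run scores near zero and is left out.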


State & Memory Architecture

graph TD
    RunStore[(Run State DB)]
    Queue[(Job Queue)]
    Artifacts[(Repos, Pipelines, Docs)]
    Memory[Knowledge & Memory System<br/>Vector DB, search]

    Worker --> RunStore
    Worker --> Queue
    Worker --> Artifacts
    Orchestrator --> RunStore
    Orchestrator --> Queue

    RunStore --> Memory
    Artifacts --> Memory
    Worker --> Memory
    Orchestrator --> Memory

Data Flows:

  1. Operational Flow — Workers update RunStore with execution state
  2. Artifact Flow — Workers create artifacts in external systems, store references in RunStore
  3. Indexing Flow — Selected artifacts and run summaries are indexed in Memory System
  4. Query Flow — Agents query Memory System for patterns and context

State Retention and Cleanup

Run State Retention

Run state is retained for operational and audit purposes:

  • Active Runs — Retained for as long as the run is active
  • Completed Runs — Retained for configurable period (e.g., 90 days, 1 year)
  • Failed Runs — Retained longer for debugging and analysis (e.g., 1 year)
  • Archived Runs — Old runs can be archived to cold storage
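A cleanup job could apply these rules with a check like the one sketched below. The retention periods reuse the example values above (90 days, 1 year) and would be configurable in practice; the `RETENTION` mapping, the `is_expired` helper, and the choice to treat Cancelled runs like completed ones are assumptions.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

# Example periods only; real values would come from configuration.
RETENTION = {
    "Succeeded": timedelta(days=90),
    "Cancelled": timedelta(days=90),   # assumption: treated like completed runs
    "Failed": timedelta(days=365),     # kept longer for debugging and analysis
}


def is_expired(status: str, completed_at: Optional[datetime],
               now: datetime) -> bool:
    """True if a run is past its retention period and can be archived or deleted."""
    if completed_at is None:
        return False  # run is still active, so it is always retained
    period = RETENTION.get(status)
    if period is None:
        return False  # non-terminal or unknown status: keep the run
    return now - completed_at > period
```

For a run that completed 120 days ago, the check reports a Succeeded run as expired but a Failed run as still within its retention period.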

Memory System Retention

The Knowledge & Memory System retains data indefinitely:

  • Patterns — Retained permanently for pattern reuse
  • Run Summaries — Retained for historical context and learning
  • Artifact Indexes — Retained for semantic search and retrieval
  • Failure Patterns — Retained for failure analysis and prevention