Context Builder

Target Architecture — Final-State Design

This page is a deep dive on the ContextBuilderService, the heart of how the Knowledge Platform grounds autonomous agents. It assembles a governed, ranked, token-budgeted Context Package for every agent task.

The ContextBuilderService (ConnectSoft.Factory.Knowledge.ContextBuilderService) is the platform's synthesis engine. When the Agent Mesh is about to execute an agent task, it does not hand the model a raw prompt — it requests a Context Package. The builder fuses graph, vector, and metadata retrieval, applies governance, fits the result to a token budget, and returns a single ranked, auditable bundle. This is the mechanism that turns the factory's accumulated memory into grounded agent reasoning rather than parametric guesswork.

Responsibilities

  • Accept a ContextBuildRequest and produce a ContextPackage within a hard token budget.
  • Retrieve candidate knowledge from three complementary sources: the knowledge graph (structural neighbourhood), vector memory (semantic similarity), and the metadata index (structured filters).
  • Fuse and re-rank candidates, then select the highest-value subset that fits the budget.
  • Enforce governance: evaluate access policy, redact or exclude disallowed sources, and record both the access decision and its audit entry.
  • Cache the package in Redis for fast retrieval and emit ContextPackageBuilt for traceability and improvement attribution.

Context Builder Sequence

sequenceDiagram
    participant AM as Agent Mesh
    participant CB as ContextBuilderService
    participant KG as KnowledgeGraphService
    participant VM as VectorMemoryService
    participant MI as MetadataIndexService
    participant POL as MemoryPolicyService
    participant RED as MemoryRedactionService
    participant Cache as Redis

    AM->>CB: POST /knowledge/context/build (request)
    CB->>CB: persist ContextBuildRequest; emit ContextBuildRequested
    par Parallel retrieval
        CB->>KG: graph neighbourhood (project, module, decisions)
        KG-->>CB: KnowledgeNodes + edges (GraphProjection)
        CB->>VM: semantic / hybrid search (intent)
        VM-->>CB: ranked VectorDocuments
        CB->>MI: structured filter (artifacts, tasks)
        MI-->>CB: metadata records
    end
    CB->>CB: fuse + re-rank candidates
    CB->>POL: evaluate access (candidates)
    POL-->>CB: decisions (allow / redact / deny)
    opt redaction required
        CB->>RED: redact restricted sources
        RED-->>CB: redacted projections
    end
    CB->>CB: budget-fit selection (token budget)
    CB->>Cache: store ContextPackage (ttlSeconds)
    CB-->>AM: 200 ContextPackage (sources, budget, ttl)
    CB->>CB: emit ContextPackageBuilt

Build Pipeline Stages

  1. Request capture: the ContextBuildRequest (with intent, scope, retrieval spec, and optional pinned sources) is persisted and ContextBuildRequested is emitted.
  2. Parallel retrieval: graph, vector, and metadata retrieval run concurrently over internal gRPC for low latency:
     • Graph: neighbourhood projection around the project/module, covering prior decisions, derived artifacts, contracts, dependencies, and related runtime signals.
     • Semantic/hybrid: dense-vector (and hybrid) search over the intent across blueprints, docs, code, and patterns.
     • Metadata: structured filters (artifact type, status, module) to pull exact, high-precision matches.
  3. Fusion & re-ranking: candidates from all three origins are merged, de-duplicated by ref, and re-ranked using a blended score (semantic similarity, graph proximity, recency, quality score, and reuse signal). Pinned sources are always included.
  4. Governance: MemoryPolicyService evaluates a MemoryAccessPolicy over the candidate set; disallowed sources are dropped and Confidential sources are redacted via MemoryRedactionService. A MemoryAccessDecision (policyDecisionId) and a MemoryAccessAudit are recorded.
  5. Budget fitting: the builder greedily selects the highest-value allowed sources until the tokenBudget is reached, tracking tokensUsed. Lower-value or oversized sources are summarised or excluded.
  6. Assembly & cache: the ContextPackage (with ContextSource[], graphProjectionId, policyDecisionId) is cached in Redis under contextPackageId for ttlSeconds, and ContextPackageBuilt is emitted.
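
The parallel retrieval and de-duplication steps above can be sketched as follows. This is a minimal illustration, not the service's implementation: the `Candidate` type, the three `fetch_*` coroutines, and the sample refs are hypothetical stand-ins for the gRPC calls, and keeping the best-scoring copy of each ref is an assumed tie-breaking rule.

```python
import asyncio
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    ref: str      # stable reference used for de-duplication
    origin: str   # "graph", "vector", or "metadata"
    score: float  # origin-local relevance, assumed to lie in [0, 1]


# Hypothetical stand-ins for the three retrieval calls; the real services
# are reached over internal gRPC and are not modelled here.
async def fetch_graph(intent: str) -> list[Candidate]:
    return [Candidate("adr-12", "graph", 0.70),
            Candidate("api-contract-3", "graph", 0.55)]


async def fetch_vector(intent: str) -> list[Candidate]:
    return [Candidate("adr-12", "vector", 0.90),
            Candidate("blueprint-7", "vector", 0.80)]


async def fetch_metadata(intent: str) -> list[Candidate]:
    return [Candidate("api-contract-3", "metadata", 0.60)]


async def retrieve_and_fuse(intent: str) -> list[Candidate]:
    # Run the three retrievals concurrently rather than one after another.
    batches = await asyncio.gather(
        fetch_graph(intent), fetch_vector(intent), fetch_metadata(intent)
    )
    # De-duplicate by ref, keeping the best-scoring copy (an assumption).
    best: dict[str, Candidate] = {}
    for batch in batches:
        for cand in batch:
            if cand.ref not in best or cand.score > best[cand.ref].score:
                best[cand.ref] = cand
    return sorted(best.values(), key=lambda c: c.score, reverse=True)


fused = asyncio.run(retrieve_and_fuse("add tenant onboarding endpoint"))
```

Here `fused` holds one entry per distinct ref, ordered by score; a real builder would pass this list on to the blended re-ranking step rather than sorting on the origin-local score.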

Ranking Signals

| Signal | Source | Effect |
| --- | --- | --- |
| Semantic similarity | Vector memory | Core relevance to the intent |
| Graph proximity | Knowledge graph | Boost structurally related artifacts/decisions |
| Recency | Metadata | Prefer current versions; penalise stale/superseded |
| Quality score | KnowledgeQualityAssessment | Down-rank low-quality knowledge |
| Reuse signal | Pattern Catalog | Boost proven, reusable patterns/blueprints |
| Pinned | Request | Force inclusion of explicitly required sources |
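
One way to combine these signals is a weighted sum, with pinned sources sorted ahead of everything else. The sketch below assumes each signal is normalised to [0, 1]; the weights, the `Scored` type, and the sample values are illustrative assumptions, since this page does not state how the blend is computed.

```python
from dataclasses import dataclass

# Illustrative weights only; the actual blend is not specified on this page.
WEIGHTS = {
    "semantic": 0.40,
    "graph": 0.20,
    "recency": 0.15,
    "quality": 0.15,
    "reuse": 0.10,
}


@dataclass
class Scored:
    ref: str
    signals: dict          # signal name -> value in [0, 1]; missing means 0
    pinned: bool = False   # pinned sources bypass score ordering


def blended_score(item: Scored) -> float:
    # Weighted sum over the five ranking signals.
    return sum(w * item.signals.get(name, 0.0) for name, w in WEIGHTS.items())


def rank(items: list) -> list:
    # Pinned sources first, then descending blended score.
    return sorted(items, key=lambda i: (not i.pinned, -blended_score(i)))


blueprint = Scored("blueprint-7", {"semantic": 0.9, "graph": 0.2,
                                   "recency": 0.5, "quality": 0.8, "reuse": 0.9})
decision = Scored("adr-12", {"semantic": 0.6, "graph": 0.9,
                             "recency": 0.9, "quality": 0.9, "reuse": 0.1})
runbook = Scored("runbook-2", {"semantic": 0.1}, pinned=True)

ranked = rank([blueprint, decision, runbook])
```

With these sample values the pinned runbook comes first despite its low score, and the decision record edges out the blueprint because graph proximity and recency offset its lower semantic similarity.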

Token Budgeting

The builder honours the tokenBudget as a hard constraint so prompts never exceed model limits:

  • Each ContextSource carries a tokens estimate; the running total never exceeds tokenBudget.
  • When a high-value source is too large, the builder includes a summarised projection (and links the full artifact) rather than dropping it.
  • The package records tokensUsed so the Agent Mesh and Observability & Feedback Platform can correlate context size with outcome quality.
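
The greedy selection with a summarisation fallback can be sketched as below. The `Source` and `Selected` types and the `summary_tokens` field are hypothetical; the sketch assumes a token estimate for the summarised projection is already known, which the page does not confirm.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Source:
    ref: str
    value: float                          # blended rank score
    tokens: int                           # token estimate for the full source
    summary_tokens: Optional[int] = None  # hypothetical: size of a summary


@dataclass
class Selected:
    ref: str
    tokens: int
    summarised: bool


def fit_to_budget(sources: list, token_budget: int) -> tuple:
    selected = []
    tokens_used = 0
    # Visit sources from highest to lowest value.
    for src in sorted(sources, key=lambda s: s.value, reverse=True):
        remaining = token_budget - tokens_used
        if src.tokens <= remaining:
            selected.append(Selected(src.ref, src.tokens, summarised=False))
            tokens_used += src.tokens
        elif src.summary_tokens is not None and src.summary_tokens <= remaining:
            # Too large in full: fall back to the summarised projection.
            selected.append(Selected(src.ref, src.summary_tokens, summarised=True))
            tokens_used += src.summary_tokens
        # Otherwise the source is excluded; the budget is never exceeded.
    return selected, tokens_used


candidates = [
    Source("adr-12", value=0.9, tokens=600),
    Source("blueprint-7", value=0.8, tokens=900, summary_tokens=300),
    Source("api-contract-3", value=0.5, tokens=200),
    Source("task-4", value=0.4, tokens=50),
]
package_sources, tokens_used = fit_to_budget(candidates, token_budget=1000)
```

In this run the blueprint is included as a summary, the contract is excluded because it no longer fits, and the small task record still slots into the remaining space.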

Caching & Freshness

  • Built packages live in Redis keyed by contextPackageId with ttlSeconds (default 1800s).
  • GET /knowledge/context/{contextPackageId} serves from cache; an expired package is transparently rebuilt from the original contextBuildRequestId.
  • A durable record of every package is also kept in SQL for 90 days for audit and improvement attribution.
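
The cache-then-rebuild behaviour can be illustrated with an in-memory stand-in for Redis. Everything here is a sketch under assumptions: `PackageCache`, `get_package`, and the `rebuild` callback are hypothetical names, and the real service would resolve the original contextBuildRequestId before rebuilding.

```python
import time
from typing import Callable, Optional


class PackageCache:
    """In-memory stand-in for a TTL-based key-value store such as Redis."""

    def __init__(self, clock: Callable[[], float] = time.monotonic):
        self._clock = clock
        self._entries = {}  # package id -> (expiry time, package)

    def put(self, package_id: str, package: dict, ttl_seconds: int = 1800) -> None:
        self._entries[package_id] = (self._clock() + ttl_seconds, package)

    def get(self, package_id: str) -> Optional[dict]:
        entry = self._entries.get(package_id)
        if entry is None:
            return None
        expires_at, package = entry
        if self._clock() >= expires_at:
            # Expired: evict so the caller sees a miss.
            del self._entries[package_id]
            return None
        return package


def get_package(cache: PackageCache, package_id: str,
                rebuild: Callable[[str], dict]) -> dict:
    package = cache.get(package_id)
    if package is None:
        # Cache miss or expiry: rebuild, then cache under the same id.
        package = rebuild(package_id)
        cache.put(package_id, package, package.get("ttlSeconds", 1800))
    return package


# Usage with a controllable clock so expiry can be shown without waiting.
now = [0.0]
cache = PackageCache(clock=lambda: now[0])
rebuilt = []


def rebuild(package_id: str) -> dict:
    rebuilt.append(package_id)
    return {"contextPackageId": package_id, "ttlSeconds": 1800}


cache.put("pkg-1", {"contextPackageId": "pkg-1", "ttlSeconds": 1800})
first = get_package(cache, "pkg-1", rebuild)    # served from cache
now[0] = 1801.0                                 # move past the TTL
second = get_package(cache, "pkg-1", rebuild)   # rebuilt and re-cached
```

Injecting the clock keeps the expiry path testable; against a real Redis instance the TTL would be enforced by the server instead.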

Governance & Traceability

Every package is fully auditable: it links its taskId, traceId, policyDecisionId, and graphProjectionId, so the factory can later attribute an artifact's quality or a runtime incident back to the exact context that produced it — the foundation of the self-improvement loop. See Context Package Schema and Governance.
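
The allow / redact / deny outcomes behind a policyDecisionId could be applied to candidates roughly as follows. This is a sketch only: the `Source` type, the shape of the audit record, and the fail-closed handling of missing or unknown verdicts are assumptions, not the documented behaviour of MemoryPolicyService.

```python
from dataclasses import dataclass, replace
from typing import Callable


@dataclass(frozen=True)
class Source:
    ref: str
    content: str


def apply_decisions(sources: list, decisions: dict, policy_decision_id: str,
                    redact: Callable[[str], str]) -> tuple:
    """Return (kept sources, audit record) for one policy evaluation."""
    kept = []
    entries = []
    for src in sources:
        # Assumption: a source with no recorded verdict is denied (fail closed).
        verdict = decisions.get(src.ref, "deny")
        if verdict == "allow":
            kept.append(src)
        elif verdict == "redact":
            kept.append(replace(src, content=redact(src.content)))
        else:
            verdict = "deny"  # unknown verdicts are also treated as deny
        entries.append((src.ref, verdict))
    audit = {"policyDecisionId": policy_decision_id, "entries": entries}
    return kept, audit


sources = [
    Source("adr-12", "Use event sourcing for billing."),
    Source("runbook-2", "Database password rotation steps."),
    Source("incident-9", "Customer data exposure timeline."),
    Source("note-1", "Unclassified scratch note."),
]
decisions = {"adr-12": "allow", "runbook-2": "redact", "incident-9": "deny"}

kept, audit = apply_decisions(sources, decisions, "pd-001",
                              redact=lambda text: "[REDACTED]")
```

Recording a verdict for every candidate, including denied ones, is what lets a later investigation reconstruct exactly what the agent was and was not shown.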

Failure Handling

  • If a retrieval source is unavailable, the builder degrades gracefully (e.g. graph-only) and flags reduced coverage on the package.
  • If governance denies all candidates, an empty-but-valid package is returned with the denial recorded, so the agent fails safely rather than reasoning ungrounded.
  • Every error carries its traceId into OpenTelemetry (OTEL) so failures can be diagnosed against the originating task.
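
Graceful degradation when a retrieval source is down can be sketched with `asyncio.gather(..., return_exceptions=True)`, which collects failures alongside results instead of aborting the whole build. The `reducedCoverage` and `unavailableSources` field names, and the sample retrievers, are hypothetical.

```python
import asyncio


async def retrieve_with_degradation(retrievers: dict) -> dict:
    names = list(retrievers)
    # return_exceptions=True yields exceptions as results instead of raising.
    results = await asyncio.gather(
        *(retrievers[name]() for name in names), return_exceptions=True
    )
    candidates = []
    unavailable = []
    for name, result in zip(names, results):
        if isinstance(result, BaseException):
            unavailable.append(name)   # source failed: note it and carry on
        else:
            candidates.extend(result)
    return {
        "candidates": candidates,
        "reducedCoverage": bool(unavailable),   # hypothetical flag name
        "unavailableSources": unavailable,      # hypothetical field name
    }


# Hypothetical retrievers: the vector source is down in this example.
async def graph() -> list:
    return ["adr-12"]


async def vector() -> list:
    raise ConnectionError("vector memory unavailable")


async def metadata() -> list:
    return ["task-4"]


package = asyncio.run(retrieve_with_degradation(
    {"graph": graph, "vector": vector, "metadata": metadata}
))
```

The build still returns the graph and metadata candidates, and the flag tells the consumer that coverage was reduced.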