
🧠 Test Generator Agent Specification

🎯 Purpose

The Test Generator Agent is an AI-first, behavior-aware, exploratory test generation agent that:

Detects gaps, anomalies, or incomplete paths in the system and generates hypothetical, behavioral, or runtime-inspired test cases, even when code or handlers do not directly define them.

Unlike the Test Case Generator Agent, which strictly scaffolds test files from static inputs (handlers, DTOs, validators), this agent acts as an intelligent tester, simulating what a skilled QA engineer or SDET would do when:

  • Reviewing event logs or traces
  • Exploring undocumented edge cases
  • Validating role-based behavior
  • Challenging the system's assumptions
  • Predicting future failure modes

🧠 What This Agent Focuses On

  • Gap Discovery: Finds missing test cases not covered by scaffolds
  • Behavior Inference: Infers test scenarios from logs, telemetry, and event streams
  • AI-Prompted Exploration: Uses OpenAI prompts to ask "What could go wrong?"
  • Runtime Context Awareness: Uses production logs, synthetic data, or simulations
  • Augmented .feature Generation: Extends test libraries with scenario-driven Gherkin cases
  • Security & Role Testing: Generates privilege escalation and unauthorized access tests
  • Studio Scenario Builder: Suggests missing cases or test chains in the Studio interface
  • Negative Path Hypotheses: "What if X fails?" → emits a test for simulation and validation

🧠 Test Generator Agent in Action

Example Blueprint:

  • Use Case: CapturePayment
  • Role: Cashier
  • Coverage: 92%, but missing fraud validation

Trigger:

  • Studio detects a lack of fraud simulation test
  • QA engineer submits exploratory prompt:

"What happens if currency is changed mid-transaction?"

Output:

  • Adds Scenario: Unexpected currency switch to capture_payment.feature
  • Suggests assertion: Should reject mixed currency payments
  • Links the test with trace payments-2025-0429 and gap ID fraud-path-003

📌 How It Extends the Factory

  • 🔍 Post-hoc validation: Adds depth beyond code structure
  • 🧠 Promptable reasoning: Explores untested flows via AI
  • 🔐 Security coverage: Challenges role definitions and abuse paths
  • 📊 Studio augmentation: Powers the "what's missing?" UX for QA engineers
  • 🔁 Regression reinforcement: Adds tests based on recent bugs or telemetry warnings
  • 🌍 Multi-edition awareness: Extends coverage by injecting region-, locale-, or tenant-specific edge tests

✅ Summary

The Test Generator Agent is:

  • An adaptive testing thinker, not just a test scaffolder
  • A prompt-based scenario designer capable of inferring unknowns
  • A complement to the Test Case Generator Agent; together they form a closed loop of test coverage

📣 It is ConnectSoft's QA assistant for AI-powered exploratory testing, extending trust, security, and resilience across every service.


🧭 Strategic Position in the QA Cluster and Platform Flow

The Test Generator Agent resides in the QA Engineering Cluster, working as a behavioral test expansion engine alongside:

  • 🧪 Test Case Generator Agent
  • 📋 QA Engineer Agent
  • 📈 Test Coverage Validator Agent
  • 🐛 Bug Investigator Agent
  • 🛠️ Bug Resolver Agent

It complements the Test Case Generator Agent by offering runtime-aware, user-behavior-simulated, and exploratory test generation.


🧬 Full Factory Flow Positioning

flowchart TD
    A[Blueprint Finalized] --> B[Handlers/Controllers Scaffolded]
    B --> C[TestCaseGeneratorAgent]
    C --> D[TestArtifactsGenerated]
    D --> E[TestCoverageValidatorAgent]
    E -->|Gaps Found| F[TestGeneratorAgent]

    F --> G["Augmented Tests (.feature, hypothesis)"]
    G --> H[QAEngineerAgent]
    H --> I[Studio Coverage Preview]

    subgraph QA Cluster
        C
        E
        F
        H
    end
Hold "Alt" / "Option" to enable pan & zoom

✅ Trigger: Detected gap, Studio prompt, telemetry event, or QA action
✅ Output: Augmented test cases, exploratory .feature files, Markdown hypothesis reports


🤝 Strategic Collaborators

  • QA Engineer Agent: Accepts test augmentation requests via Studio prompts
  • Test Coverage Validator Agent: Triggers this agent on low coverage or a missing role/scenario
  • Bug Investigator Agent: Submits failed event traces or logs for hypothesis testing
  • Studio: Embeds the agent as a "Suggest test" helper and "What-if analyzer"
  • Test Memory Service: Provides embeddings, historical test knowledge, or case-similarity retrievals

📦 Factory Roles

  • 🔧 Generation: (Not involved)
  • 🧪 Validation: Triggered after initial test generation
  • 📊 Augmentation: Adds BDD scenarios and unexpected-flow coverage
  • 📘 Documentation: Outputs hypothesis-driven summaries
  • 🛠️ Bug Handling: Suggests reproduction or defense tests
  • 👀 Studio UX: Embedded as a QA-side assistant for test brainstorming

🧠 Example Activation Scenarios

  • "Guest user caused null ref in production" → Agent generates an auth bypass test
  • QA asks "What if amount is changed while approved?" → Agent emits Scenario: Changing amount after approval
  • Edition "lite" lacks a currency test → Agent adds Scenario: Lite edition currency mismatch
  • Telemetry shows high 400 errors for invalid format → Agent adds edge-case tests for known malformed inputs

📌 Platform Cluster Inclusion

cluster: qa-engineering
agent: test-generator-agent
position:
  - post-test-validation
  - pre-pr-check
  - pre-release-scenario-expansion
  - on-demand via Studio
activation_modes:
  - trace-aware
  - event-based
  - prompt-triggered
  - gap-fill

✅ Summary

The Test Generator Agent holds a flexible, reactive position in the QA flow:

  • Triggered after standard test generation
  • Supports human-in-the-loop or auto-augmentation
  • Integrates across Studio, QA, bug handling, and telemetry analysis
  • Strengthens ConnectSoft's commitment to observability-first and validation-enforced automation

📋 Responsibilities

The Test Generator Agent is responsible for augmenting, extending, or inventing tests in response to:

  • Coverage gaps
  • Hypothetical failure modes
  • Observability triggers
  • QA prompts or Studio workflows
  • Bug patterns or edge behaviors

It does not replace the Test Case Generator Agent, but complements it with AI-powered reasoning and behavioral expansion.


✅ Key Responsibilities Breakdown

  1. Scenario-Based Test Synthesis: Generate .feature files or assertions from human prompts, blueprint intent, or gaps
  2. Telemetry-Informed Test Creation: Use runtime metrics or logs to generate edge-case tests
  3. Security Flow Simulation: Generate tests for role misuse, unauthorized access, and injection paths
  4. Hypothesis-Based Coverage: Propose test cases based on inferred missing logic (e.g., "what if input is corrupted?")
  5. Prompt-Based QA Test Expansion: Turn freeform QA questions ("What happens if...") into test logic
  6. Role-Conditional Test Variants: Add missing test scenarios per role → action pairing across editions
  7. Studio Integration for Suggestions: Populate "missing scenario" panels in Studio or markdown summaries
  8. Negative Path Discovery: Proactively generate failure and edge-case sequences
  9. Trace Replay or Test Recovery: Rebuild test cases from logs, bug reports, or failed executions
  10. Coverage Rebalancing: Fill gaps where no unit, validator, or BDD tests were previously generated

🧪 What It Does Not Do

  • Standard handler/unit test scaffolding: Covered by the Test Case Generator Agent
  • Basic DTO validator test creation: Already handled from RuleFor(...) patterns
  • Static test folder scaffolding: Not its role; it works on runtime signals, prompts, and feedback
  • Snapshotting full service flows: Done by the QA Agent in the workflow validation stage

πŸ” Responsibility Examples

Example 1 β€” Observability-Based Trigger:

Telemetry shows 500 errors during partial refund flow.

  • βœ… Agent simulates conditions
  • βœ… Emits test: Scenario: Partial refund over max limit
  • βœ… Suggests BDD step definition + validator injection

Example 2 β€” Prompt-Based Expansion:

QA enters in Studio:

β€œWhat happens if payment is submitted twice?”

  • βœ… Agent proposes test:

  • Handle_ShouldReject_WhenPaymentIsDuplicate()

  • Scenario: Duplicate payment request is rejected
  • βœ… Suggests extending integration test and .feature

Example 3 β€” Role-Missing Test:

Coverage shows CFO not tested in enterprise edition

  • βœ… Agent adds:
Scenario: CFO approves high-value invoice
  Given a CFO user
  When they approve an invoice of 10,000
  Then the approval succeeds

πŸ“˜ Artifact-Level Responsibilities

Artifact Action
*.feature Create, enrich, or patch new BDD scenarios
*.cs test files Append hypothesis-based test methods
test-metadata.yaml Emit augmented_by: test-generator-agent
observability-events.jsonl Add logs tagged with source=telemetry, trigger=prompt
qa-feedback.md Summarize new suggestions in Studio format

✅ Summary

The Test Generator Agent's responsibilities are centered around:

  • 🤖 Intelligence (inferred scenarios, gaps, behaviors)
  • 📘 Flexibility (prompt- or telemetry-based activation)
  • 🧪 Completeness (role paths, failure modes, missed conditions)
  • 📦 Usability (artifacts feed directly into CI/CD, Studio, PRs)

It serves as the creative and critical thinking arm of ConnectSoft's QA cluster, inventing the tests that others miss.


📥 Inputs

Unlike the Test Case Generator Agent (which consumes static artifacts like handlers and DTOs), the Test Generator Agent takes in a rich, dynamic, multi-modal input set that allows it to:

  • Simulate real-world test gaps
  • Respond to behavior deviations or failures
  • Enrich test coverage based on runtime observations and prompts

📦 Primary Input Categories

  • Blueprint & Trace Metadata: Contextual reference for feature, roles, edition (e.g., trace_id: payment-2025-0147, blueprint_id: usecase-9342)
  • Execution Gaps from the Test Coverage Validator Agent: Identified missing tests by type, role, or scenario (e.g., "No role test found for CFO in CancelInvoiceHandler")
  • Telemetry, Logs, and Spans: Runtime events, logs, traces, or error spikes (e.g., 500 errors during refund, event: UnexpectedCurrencyFormat)
  • QA Prompts from Studio: Free-form or structured queries initiated by QA (e.g., "What if payment is cancelled after capture?")
  • Edition-Specific Overrides: Per-edition roles, behaviors, toggles, or constraints (e.g., enterprise → emit InvoiceAuditLogged, lite → skip tax validation)
  • Failed Test Executions or Bug Reports: History of test failures or uncovered bugs (e.g., "Duplicate invoice bug, no scenario covers it")
  • Memory / Similar Test Lookups: Historical tests for similar handlers or scenarios (from MicroserviceMemoryIndex or vector embeddings)
  • DTO / Domain Object Snapshots: Structural references for test input generation (e.g., CreateInvoiceInput, RefundRequest)
  • Existing .feature Files (for extension): Used to enrich or patch scenarios (e.g., capture_payment.feature)
  • Authorization Role Matrix: Map of role-to-action coverage by edition (used to trigger 403, escalation, or bypass tests)

🧠 Sample Enriched Input

trace_id: invoice-2025-0143
blueprint_id: usecase-9241
existing_coverage:
  unit_tests: 3
  bdd_scenarios: 2
  roles_tested: [FinanceManager]
  roles_missing: [CFO, Guest]
telemetry:
  recent_errors:
    - event: "NullReference in RefundProcessor"
    - code: 500
    - payload: "RefundAmount: null"
qa_prompt: "What if refund is issued twice?"
dto_structure:
  RefundRequest:
    - RefundAmount: decimal
    - Reason: string

→ Result: Generate

  • Scenario: Duplicate refund prevention
  • Handle_ShouldThrow_WhenRefundIsDuplicate
  • .feature augmentation with CFO role

💡 Human-Centered Inputs

  • QA Studio Prompt: "What if currency is changed post approval?"
  • Bug Resolver Comment: Bug #4812: missing test for customer type = corporate
  • Manual Edition Override: "Add auth scenario for Guest user in lite edition"

These are structured by the agent's skill planner into actionable test-generation sequences.
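
As an illustration, the planner's output for the first prompt above might be serialized along these lines (a hypothetical sketch; the field names and exact schema are illustrative, not defined by this spec):

{
  "prompt": "What if currency is changed post approval?",
  "planned_skills": [
    { "skill": "LoadTestContext", "args": { "trace_id": "payments-2025-0321" } },
    { "skill": "ExpandPromptIntoPaths" },
    { "skill": "GenerateFeatureScenario" },
    { "skill": "EmitTestArtifacts" }
  ]
}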


πŸ” Semantic Inputs from Logs

{
  "event": "InvoiceApprovalFailed",
  "role": "Analyst",
  "message": "Unauthorized access attempt",
  "trace_id": "invoice-2025-0311"
}

→ Agent infers:

  • Missing test case for unauthorized Analyst attempting approval
  • Proposes .feature scenario + integration test

📘 Prompt Template Input

{
  "prompt": "What happens if refund amount is negative?",
  "context": {
    "handler": "IssueRefundHandler",
    "validator": "RefundRequestValidator",
    "roles": ["SupportAgent"],
    "blueprint_id": "usecase-8014"
  }
}

→ Skill invoked: ProposeEdgeCaseFromPrompt
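
The result shape of this skill is not specified above; a plausible sketch, reusing the error message and naming convention from the output examples later in this document, could be:

{
  "scenario": "Negative refund amount is rejected",
  "test_method": "Handle_ShouldThrow_WhenRefundAmountIsNegative",
  "expected_error": "Refund amount must be positive"
}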


✅ Summary

The Test Generator Agent consumes:

  • 🧠 Blueprint context
  • 📉 Test gap metadata
  • 📊 Runtime telemetry
  • 🧪 Human prompts
  • 🔁 Memory and prior test embeddings

This multi-source fusion of design-time, runtime, and interactive inputs makes the agent capable of intelligent, behavioral test expansion unmatched by static test generation tools.


📤 Outputs

The Test Generator Agent produces intelligent, augmented, and behavior-focused test artifacts that extend the standard test suite. These outputs are:

  • 📘 Expressive and traceable
  • 🧪 Hypothetical and exploratory
  • 🧩 Aligned to observed gaps or QA prompts
  • 📄 Structured for CI/CD, QA agents, and Studio consumption

📦 Primary Output Artifacts

  • Augmented .feature Scenarios: New or patched BDD scenarios derived from prompts, logs, or test gaps (e.g., refund_flow.feature)
  • Hypothesis Test Cases: Test files or methods appended to existing unit/integration tests (e.g., Handle_ShouldReject_WhenRefundExceedsLimit())
  • Studio Markdown Summary: Human-readable insights for QA, trace viewers, and prompt responses (qa-augmented-tests.md)
  • Test Augmentation Metadata: JSON or YAML structured trace → gap → test mapping (test-augmentation-metadata.yaml)
  • Observability Event Logs: Trace-linked reasoning, AI inferences, and scenario triggers (testgen-observability.jsonl)
  • Retrospective Gap Patch Commits: Optional Git-based diff patches for test extension (patch-test-trace-0427.diff)
  • Prompt-to-Test Response Bundles: Structured records of the QA query → generated test flow (used by the Studio feedback panel)

📘 Output Example: Augmented .feature File

@trace_id: refund-2025-0182
@augmented_by: test-generator-agent
@source: prompt
Feature: Refund flow edge cases

Scenario: Refund is issued twice
  Given a support agent issued a refund
  When they try to issue it again
  Then the system should reject it with status "DuplicateRefund"

Scenario: Negative refund amount
  Given a refund request with amount = -100
  Then the request is rejected

🧪 Output Example: Hypothetical Test Method

Appended to IssueRefundHandlerTests.cs:

[TestMethod]
[TraceId("refund-2025-0182")]
[AugmentedBy("test-generator-agent")]
public async Task Handle_ShouldThrow_WhenRefundAmountIsNegative()
{
    var input = new RefundRequest { RefundAmount = -100 };
    var result = await handler.Handle(input);
    Assert.IsFalse(result.IsSuccess);
    Assert.AreEqual("Refund amount must be positive", result.Error.Message);
}

📄 QA Summary Output (Markdown)

### 📌 QA Scenario Augmentation – Trace: refund-2025-0182

✅ Added 2 BDD scenarios to `refund_flow.feature`
✅ Appended 1 unit test to `IssueRefundHandlerTests.cs`

- Scenario: Duplicate refund attempt → status: handled
- Scenario: Negative refund → validator failed as expected

🧠 Reasoning: Triggered by QA prompt "What if refund is issued twice?"

📎 test-augmentation-metadata.yaml

trace_id: refund-2025-0182
augmented_by: test-generator-agent
source: studio-prompt
test_type: bdd + unit
new_scenarios:
  - Duplicate refund attempt
  - Negative refund
linked_artifacts:
  - refund_flow.feature
  - IssueRefundHandlerTests.cs
roles_covered: [SupportAgent]

📊 Observability Events

Emitted to testgen-observability.jsonl:

{
  "event": "TestAugmented",
  "trace_id": "refund-2025-0182",
  "source": "qa_prompt",
  "scenario": "Refund is issued twice",
  "roles_covered": ["SupportAgent"],
  "edition": "lite"
}

📤 Git Patch (Optional)

git apply patch-test-trace-0427.diff
# Adds missing test for RefundAmount < 0

✅ Summary

The Test Generator Agent emits:

  • 📘 Test files and .feature enhancements
  • 🧠 Reasoned Markdown outputs for QA and Studio
  • 📦 Metadata linking test → trace → prompt
  • 📊 Observability events for audit and trace replay
  • 🔁 Optional patch artifacts for review pipelines

These outputs empower QA, Studio, CI/CD, and trace analytics with dynamic, intelligent, explainable test expansions.


🚦 Reactive Triggers – When and Why the Agent Is Invoked

Unlike statically triggered agents (e.g., Test Case Generator), the Test Generator Agent is activated reactively or on-demand, when existing tests are insufficient or complex behaviors demand exploration.

It responds to intelligent signals, not just code artifacts.


πŸ” Triggering Modes

Mode Description Example
πŸ§ͺ Coverage Gap Trigger Fired when Test Coverage Validator Agent detects untested paths β€œNo BDD scenario for CFO role in ApproveInvoice”
🧠 QA Prompt Trigger Triggered via Studio input or prompt request β€œWhat happens if a refund is re-issued after cancellation?”
πŸ“‰ Observability Trigger Based on runtime telemetry (logs, errors, anomalies) β€œSpike in 400 errors on POST /api/refund”
πŸ› Bug Pattern Trigger Initiated when Bug Resolver Agent detects untested fix area β€œBug #4201 lacks test reproduction for invalid tax exemption”
🎭 Edition/Role Test Gap Detected when edition-specific paths aren’t tested β€œEnterprise edition lacks Guest user rejection case”
🧱 New Business Rule Trigger Agent re-scans blueprint after major rule/validation update β€œAdded late fee logic to InvoiceDue β†’ generate overdue scenario”
πŸ’¬ Prompt Simulation Mode Interactive QA request: simulate variants, override parameters β€œSimulate invalid dates across timezones”
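
Of these modes, only the QA prompt payload is shown in full below; a coverage-gap trigger from the Test Coverage Validator Agent might look roughly like this (a hedged sketch; the exact event schema is not defined in this spec):

{
  "trigger": "coverage_gap",
  "trace_id": "invoice-2025-0147",
  "handler": "ApproveInvoiceHandler",
  "missing": "No BDD scenario for CFO role in ApproveInvoice"
}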

🧠 Reactive Lifecycle Example

sequenceDiagram
    participant CoverageValidator
    participant TestGenerator
    participant QAEngineer
    participant Studio

    CoverageValidator->>TestGenerator: Trigger on missing scenario (CFO approval)
    TestGenerator->>QAEngineer: Propose new `.feature` scenario
    QAEngineer->>Studio: Approve test addition
    TestGenerator->>Studio: Emit markdown summary + test metadata
Hold "Alt" / "Option" to enable pan & zoom

📘 Sample Trigger: QA Prompt

Input:

{
  "prompt": "What if refund is denied twice?",
  "context": {
    "handler": "IssueRefundHandler",
    "blueprint_id": "usecase-8041"
  }
}

Agent Action:

  • Generates Scenario: Repeat refund denial handling
  • Emits markdown summary for Studio
  • Appends new test to IssueRefundHandlerTests.cs

📊 Studio Trigger UX

Agent embedded in Studio under:

  • ✅ "Suggest Missing Test"
  • ✅ "Simulate Alternate Path"
  • ✅ "Test Role or Edition Variant"
  • ✅ "Cover Observability Gap"

Each action produces previewable suggestions.


πŸ” Trigger Types Recap

Type Triggered By Result
πŸ“‰ Observability Telemetry alert, error log Edge case test inferred from log
πŸ§ͺ Coverage Validator agent reports missing scenario Add .feature or unit test
🧠 QA Prompt in Studio Prompt-based scenario simulation
πŸ› Bug Resolver agent signals reproduction required Add test to capture defect case
🧾 Edition Gap in edition-specific role or config Emit conditional .feature or test branch
🧱 Business Rule Blueprint update Add test reflecting new rule path

✅ Summary

The Test Generator Agent is never passive; it is responsive, contextual, and AI-augmented:

  • Activated when something is missing, ambiguous, or questioned
  • Bridges the gap between human QA, production observations, and test logic
  • Enables continuous, adaptive quality assurance beyond static coverage

πŸ” Process Flow (High-Level)

The Test Generator Agent executes in a reactive, prompt-augmented flow, designed to transform incomplete test coverage, ambiguous behaviors, or human questions into concrete, testable outputs.

Unlike deterministic generators, it operates like a QA researcher with memory and observability tools β€” combining design, runtime, and human input into a creative testing loop.


🧬 High-Level Execution Diagram

flowchart TD
    Trigger["📥 Trigger Received<br>(Prompt, Gap, Log, Edition)"] --> Analyze["🧠 Analyze Context"]
    Analyze --> Plan["📋 Plan Test Generation Path"]
    Plan --> Generate["⚙️ Execute Scenario Simulation & Prompt Skills"]
    Generate --> Emit["📤 Emit Test Artifacts"]
    Emit --> Validate["🧪 Run Lint + Structure Validators"]
    Validate --> Report["📊 Emit Metadata + Studio Summary"]
    Report --> Done["✅ Done"]
Hold "Alt" / "Option" to enable pan & zoom

🧱 Phase Descriptions

  • 📥 Trigger Received: Activated by a Studio prompt, the Test Coverage Validator, a bug trace, or telemetry
  • 🧠 Analyze Context: Loads handler, DTO, existing tests, blueprint, roles, and test history
  • 📋 Plan Test Path: Determines which types of tests to generate: BDD, unit, edge, edition-specific
  • ⚙️ Execute Scenario Simulation: Uses prompt skills and embeddings to create test content (steps, assertions, inputs)
  • 📤 Emit Artifacts: Outputs .feature, .cs, Markdown, metadata, and logs
  • 🧪 Validate Structure: Ensures format, structure, trace metadata, naming, and consistency
  • 📊 Emit Metadata: Saves test-augmentation-metadata.yaml, emits spans, and notifies Studio/QA agents

πŸ” Execution Characteristics

Trait Detail
Reentrant Can be retriggered for the same trace to add more cases
Traceable Every output is trace_id and source tagged
Edition-Aware Scenarios vary based on edition_id and roles_allowed
Prompt-Centric User questions or QA notes become execution contexts
Memory-Augmented Reuses previous patterns, tests, and coverage snapshots for alignment

📘 Example Flow (Prompt-Driven)

  1. Trigger: QA in Studio asks:

"What if refund is attempted twice for the same transaction?"

  2. Plan: Agent checks:
     • Handler: IssueRefundHandler
     • Existing test coverage
     • DTO: RefundRequest
     • Prior bug trace: #REFD-2203

  3. Generate:
     • .feature scenario: Scenario: Duplicate refund is rejected
     • Appends a unit test method to the handler test
     • Adds a markdown explanation

  4. Validate: Passes lint, snapshot, and coverage inclusion

  5. Emit:
     • Saves to repo/memory
     • Updates the Studio dashboard
     • Notifies the QA agent

✅ Summary

The Test Generator Agent runs a reactive, intelligent, trace-backed pipeline that enables:

  • 🧠 Context-driven test generation
  • 📘 Scenario expansion based on real user questions and observability
  • 📎 Outputs tied directly to trace, edition, and role
  • 💬 QA collaboration and Markdown reporting
  • 🧪 Validation-integrated, repeatable, and audit-safe processes

🔬 Process Flow (Detailed Skill-Orchestrated Flow)

The Test Generator Agent operates as a skill-orchestrated AI agent, executing a series of modular and reactive skills, each one handling a stage in the prompt-to-test translation pipeline.

This architecture supports:

  • 🔁 Reusability across test types
  • 📎 Trace alignment
  • 🧠 Prompt + telemetry understanding
  • 🔍 Memory + embedding retrieval
  • 📄 Markdown storytelling + Studio visibility

🧬 Detailed Skill Flow

flowchart LR
    A[Trigger Event] --> B[LoadTestContext]
    B --> C[IdentifyTestGap]
    C --> D[PlanScenarioTemplates]
    D --> E[GenerateTestScenarios]
    E --> F[EmitTestArtifacts]
    F --> G[ValidateGeneratedTests]
    G --> H[EmitObservability]
    H --> I[StudioSummaryMarkdown]
Hold "Alt" / "Option" to enable pan & zoom

🧠 Core Skills Used

  • LoadTestContext: Loads the trace ID, blueprint, handler, DTO, edition, and QA prompt
  • IdentifyTestGap: Queries test coverage memory and recent observability logs
  • PlanScenarioTemplates: Decides what kinds of tests to generate (unit, BDD, edition-specific)
  • GenerateTestScenarios: Uses OpenAI-backed planners to simulate steps, inputs, and outcomes
  • EmitTestArtifacts: Writes .feature, .cs, YAML metadata, and Markdown
  • ValidateGeneratedTests: Runs naming, formatting, trace-tag, and structure linting
  • EmitObservability: Emits OpenTelemetry spans + JSONL trace logs
  • StudioSummaryMarkdown: Generates a human-readable report for QA and the Studio dashboard
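
Since these are later described as modular Semantic Kernel skills, a skill such as GenerateTestScenarios might be exposed roughly as follows (a minimal C# sketch assuming Semantic Kernel's [KernelFunction] attribute; the class name, parameters, and output shape are illustrative, not the actual implementation):

using System.ComponentModel;
using Microsoft.SemanticKernel;

public sealed class TestGenerationSkills
{
    // Illustrative sketch: turns a QA prompt plus handler context into a Gherkin scenario.
    // A real implementation would delegate to an OpenAI-backed planner; this only shapes the output.
    [KernelFunction, Description("Generates a Gherkin scenario from a QA prompt and handler context.")]
    public string GenerateTestScenarios(string prompt, string handler, string edition)
    {
        return $"""
            @edition:{edition}
            Scenario: {prompt}
              Given the {handler} context
              When the described condition occurs
              Then the system responds with the expected outcome
            """;
    }
}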

🧪 Internal Skill Handlers and Sub-Skills

🔍 Test Discovery + Planning

  • ScanCoverageByTrace(trace_id)
  • FetchMissingScenariosByHandler()
  • DetectUncoveredRoles(trace_id, edition)

✍️ Prompt-Based Scenario Generation

  • GenerateFeatureScenario(prompt)
  • SuggestTestMethod(prompt, handler, dto)
  • InferAssertionsFromDTO(field_rules)

📘 Artifact Generation

  • CreateFeatureFile(handler, scenarios)
  • AppendTestMethod(test_class, method_code)
  • EmitTestMetadata(trace_id, test_type, role, edition)

πŸ“ Output Coordination

All skills write into structured paths like:

Tests/
├── PaymentsService.Specs/
│   └── Features/
│       └── refund_flow.feature
├── PaymentsService.UnitTests/
│   └── IssueRefundHandlerTests.cs
├── test-augmentation-metadata.yaml
└── qa-report.md

📊 Observability Skill Output

Span Example:

{
  "span": "testgen.GenerateFeatureScenario",
  "trace_id": "refund-2025-0142",
  "handler": "IssueRefundHandler",
  "edition": "lite",
  "source": "qa_prompt",
  "scenario_title": "Refund denied twice"
}

Validation Result:

{
  "status": "success",
  "issues_found": 0,
  "methods_added": 1,
  "feature_scenarios_added": 2
}

🧠 Retriable and Self-Healing

All skills:

  • Support retry on failure or invalid output
  • Are idempotent for deterministic prompts
  • Auto-tag retry_count, source, last_modified_by in metadata
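
Those auto-tags might surface in the emitted metadata roughly as follows (a sketch; only the three field names mentioned above are taken from this spec, the values are illustrative):

retry_count: 1
source: qa_prompt
last_modified_by: test-generator-agent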

✅ Summary

The Test Generator Agent's core intelligence is delivered through modular Semantic Kernel skills, enabling it to:

  • React to signals and prompts
  • Simulate human testing logic
  • Produce trace-aligned, validated artifacts
  • Emit reasoning and coverage metadata

This skill-based execution allows for fine-grained control, modular upgrades, and full AI-driven collaboration with QA teams and Studio.


🧩 Core Skills

The Test Generator Agent's core skills power its ability to:

  • Think like a QA engineer
  • Generate complete test flows from partial prompts
  • Simulate business logic without static code structure
  • Convert system knowledge into BDD and executable tests
  • Fill testing blind spots using pattern inference, prompt expansion, and flow simulation

🧠 Core AI-Driven Skills

  • GenerateFeatureScenario(prompt): Converts a prompt like "What if a refund is issued twice?" into a Gherkin-compliant scenario
  • SuggestTestMethod(prompt, handler, dto): Creates MSTest method names, signatures, and assertions
  • ExpandPromptIntoPaths(prompt): Breaks vague QA prompts into multiple edge cases
  • InferAssertionsFromDTO(dto_rules): Suggests validation rules and expected outcomes
  • SimulateRoleActions(trace, edition): Generates role-based test variants (e.g., Guest, CFO, SupportAgent)
  • PlanFailureFlowScenarios(): Proposes behavioral fallbacks or constraint-violation tests
  • MapTelemetryToTestInput(log_entry): Turns runtime logs into structured inputs for test simulation
  • GenerateScenarioMatrix(handler, roles, inputs): Maps permutations of scenario candidates for dynamic .feature generation
  • PredictRegressionTestImpact(bug_metadata): Suggests new test scenarios based on historical bug-pattern embeddings

💬 Prompt → Scenario Skill Chain

Input Prompt:

"What if refund is issued while invoice is locked?"

Skill Execution Chain:

  1. GenerateFeatureScenario → Scenario: Refund rejected while invoice is locked
  2. SuggestTestMethod → Handle_ShouldReject_WhenInvoiceIsLocked()
  3. InferAssertionsFromDTO → Expected error: "InvoiceLockedException"
  4. EmitTestArtifacts → Outputs .feature, .cs, and trace metadata


🧪 Sample Skill: ExpandPromptIntoPaths

Input Prompt:

"What if the amount is too high?"

Output:

- Scenario: Amount equals max allowed
- Scenario: Amount exceeds max allowed
- Scenario: Amount is null
- Scenario: Amount is string
- Scenario: Amount submitted twice

→ Used by downstream skills to generate .feature and test method templates.


🧠 DTO-Aware Assertion Inference

Given:

public class RefundRequest {
    [Required]
    public decimal Amount { get; set; }

    [MaxLength(200)]
    public string Reason { get; set; }
}

→ InferAssertionsFromDTO() generates:

  • Amount = 0 → IsFailure("Amount must be greater than 0")
  • Reason = 300 chars → IsFailure("Reason too long")

🎯 Use of Embeddings

Skills like PredictRegressionTestImpact() and SuggestTestMethod() use:

  • Vector similarity with past test descriptions
  • Stored embeddings from bug traces, feature summaries, .feature titles
  • Memory of past DTOs and known domain actions

→ Improves the precision and recall of suggested test cases.
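
As a concrete illustration of the retrieval step, the vector-similarity lookup typically reduces to a cosine-similarity ranking over stored embeddings; a minimal C# sketch (the types, names, and top-k cutoff are all illustrative assumptions) might be:

using System;
using System.Collections.Generic;
using System.Linq;

public record StoredScenario(string Title, float[] Embedding);

public static class ScenarioSimilarity
{
    // Cosine similarity between two embedding vectors of equal length.
    public static double Cosine(float[] a, float[] b)
    {
        double dot = 0, na = 0, nb = 0;
        for (int i = 0; i < a.Length; i++)
        {
            dot += a[i] * b[i];
            na += a[i] * a[i];
            nb += b[i] * b[i];
        }
        return dot / (Math.Sqrt(na) * Math.Sqrt(nb));
    }

    // Returns the k stored scenarios most similar to the prompt embedding.
    public static IEnumerable<StoredScenario> TopMatches(
        float[] promptEmbedding, IEnumerable<StoredScenario> memory, int k = 3) =>
        memory.OrderByDescending(s => Cosine(promptEmbedding, s.Embedding)).Take(k);
}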


📘 Markdown Summary Skill

  • StudioSummaryMarkdown(): Human-readable test reasoning summary
  • ExplainWhyScenarioMatters(): QA-facing commentary attached to the .feature preview

✅ Summary

The Test Generator Agent's core skill system allows it to:

  • Understand ambiguous prompts
  • Simulate diverse paths with confidence
  • Write tests from language and reasoning, not code alone
  • Collaborate with QA agents and Studio via explainable, testable artifacts

These skills make it an intelligent QA engineer in software form.


📡 Observability-Aware Test Inference

The Test Generator Agent integrates with the observability fabric of the platform, leveraging telemetry, spans, logs, and runtime event data to:

  • 📉 Detect real-world behavior gaps
  • 🧪 Infer scenarios that have not been exercised in tests
  • ⚠️ Proactively propose tests to prevent repeat issues
  • 🔗 Link tests directly to production symptoms

This makes the agent not just QA-aligned but production-informed.


📊 Key Observability Inputs Used

  • OpenTelemetry Spans: High failure rate in RefundService.Handle()
  • Error Logs: Repeated NullReferenceException on CustomerId
  • HTTP Metrics: Surge in 400 BadRequest on POST /invoice
  • Trace Snapshots: Slow response time with specific input patterns
  • AppInsights / Logs: User retries on a specific flow = behavioral anti-pattern
  • Service Events: event: PaymentMismatchDetected (never tested in a .feature)

πŸ” Skill: MapTelemetryToTestInput

This core skill takes in runtime telemetry (e.g., from logs or spans) and:

  • Identifies which handler or controller was involved
  • Parses any payload (request/response) structures
  • Extracts failure conditions, error messages, or paths
  • Reconstructs an inferred test case

🧪 Example Input: Observability-Driven Trigger

{
  "trace_id": "refund-2025-0143",
  "handler": "IssueRefundHandler",
  "error_message": "Amount cannot be null",
  "event": "NullReferenceException",
  "log_payload": {
    "RefundAmount": null,
    "CustomerId": "8d2..."
  }
}

→ Agent generates:

  • 🔨 Test Method: Handle_ShouldFail_WhenRefundAmountIsNull()
  • 📘 .feature Scenario: Scenario: Refund with missing amount

📈 Metrics-Aware Skills

  • AnalyzeErrorFrequencySpans(): Identifies handlers with recurring issues
  • InferGapsFromUnhandledSpans(): Finds spans that lack test trace coverage
  • GenerateEdgeScenarioFromLog(log): Creates a structured test case from runtime data
  • SuggestAssertionFromError(error_msg): Turns logs into test expectations
  • AttachTestToTelemetrySource(): Tags the generated test with an observability correlation ID
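
For example, SuggestAssertionFromError could start as a simple mapping from a runtime error message to a test expectation (a hedged sketch; the real skill is AI-backed, and this mapping is purely illustrative):

public static class AssertionSuggester
{
    // Illustrative mapping from a runtime error message to an expected test assertion.
    public static string SuggestAssertionFromError(string errorMessage) =>
        errorMessage switch
        {
            "Amount cannot be null" => "Assert.IsFalse(result.IsSuccess);",
            _ => $"Assert.AreEqual(\"{errorMessage}\", result.Error.Message);"
        };
}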

📎 Traceability and Metadata Output

generated_by: test-generator-agent
trigger: observability
trace_id: refund-2025-0143
span_id: a4e12d78e
origin: AppInsights
test_artifact: IssueRefundHandlerTests.cs
scenario: Refund with null amount
asserts: ["Amount must not be null"]

→ Used by Studio, QA, PRs, and observability dashboards


🛠 Usage in Continuous QA

  • 🚨 Observability alert → test not found: Agent adds the scenario
  • 🧪 Spike in retry rate: Generates scenario "Retry rejected if payment already processed"
  • 🧠 Missing span → test map: Agent auto-fills the .feature gap
  • 🧾 Event observed but not validated: Scenario added: "event: InvoiceApproved → assert in BDD"

🧘 Integration with QA and Bug Resolver Agents

These agents can flag test absence for observability-based issues; the Test Generator Agent:

  • Backfills the .feature
  • Suggests a new test method
  • Links observability trace → handler → test

✅ Summary

The Test Generator Agent turns live runtime observations into concrete, testable artifacts, ensuring:

  • 🔍 Nothing observed in production is left untested
  • 🧪 QA feedback loops close automatically
  • 📎 Tests are tagged with trace → span → error metadata
  • 📊 Studio and CI gain insight into real-world coverage, not just design coverage

This is a core differentiator of ConnectSoft's Observability-First QA Architecture.


πŸ” Security-Aware Scenario Generation

Security-related bugs are often:

  • 🚫 Undetected by static test generation
  • πŸ” Role-dependent or permission-specific
  • 🧭 Configuration-based (edition, tenant, policy)
  • 🧨 Triggered by unauthorized access or incorrect access control

The Test Generator Agent proactively generates security-focused tests to ensure all authorization paths, privilege boundaries, and denial-of-access flows are covered and tested.


🧩 Security Test Types the Agent Generates

  • Unauthorized Role Access: Ensures roles without permission are blocked
  • Anonymous Access Scenarios: Verifies [AllowAnonymous] or Unauthenticated → 401 behavior
  • Role Escalation Attempt: Detects missing guards when a lower-privilege role tries a privileged action
  • Edition-Specific Permission Cases: Varies access rules based on edition config
  • Token or Claim Manipulation: Explores behavior with corrupted, malformed, or missing claims
  • Restricted State Transition: Prevents forbidden state actions (e.g., closing an already-paid invoice)

🧠 Inputs for Security Scenario Generation

  • roles_allowed from blueprint or port config
  • AuthorizationMap.yaml per edition
  • Controller annotations like [Authorize(Roles = "FinanceManager")]
  • Previous test coverage: roles tested vs. roles missing
  • Studio prompts (e.g., "What happens if Guest user tries to approve invoice?")
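
The AuthorizationMap.yaml referenced above is not shown in this spec; based on the edition blueprint example later in this section, its per-edition shape might look like this (an assumed sketch):

edition_id: enterprise
roles_allowed:
  approve_invoice: [CFO, FinanceManager]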

🧪 Example: Unauthorized Role Test

[TestMethod]
[TraceId("invoice-2025-0147")]
[Edition("enterprise")]
[AugmentedBy("test-generator-agent")]
public async Task Post_ShouldReturn403_WhenGuestTriesToApproveInvoice()
{
    var client = factory.CreateClientWithRole("Guest");
    var response = await client.PostAsJsonAsync("/api/invoice/approve", validPayload);
    Assert.AreEqual(HttpStatusCode.Forbidden, response.StatusCode);
}

📘 Example BDD Scenario

@edition:enterprise
@trace_id:invoice-2025-0147
@source:security-inference
Feature: Invoice approval access control

Scenario: Guest user attempts approval
  Given a user with role Guest
  When they send an approval request
  Then access is denied with status 403

πŸ” Skill-Based Flow

Skill Action
EnumerateRoleVariants(handler) Detects all roles not yet tested
SimulateUnauthorizedAccess(role) Proposes tests to enforce denial
InferSecurityPolicyFromEdition(edition) Adjusts tests for edition-specific rules
GenerateAuthorizationAssertions() Converts 403, 401, or redirect outcomes into test assertions
PatchMissingAuthScenarios() Fills BDD/test files with missing security cases

🧬 Edition Example

Blueprint:

edition_id: enterprise
roles_allowed:
  approve_invoice: [CFO, FinanceManager]

Output:

  • ✅ Validates approval for CFO
  • ❌ Rejects Guest, Analyst, SupportAgent
  • 📎 Adds a test + .feature for all undefined roles

πŸ” Traceability

All security scenarios are:

  • Tagged with security_test: true
  • Linked to trace_id, edition, and handler
  • Indexed in test-metadata.yaml and Studio dashboards

✅ Summary

The Test Generator Agent proactively defends against security regressions by:

  • 🧠 Discovering untested access paths
  • 🔐 Enforcing the principle of least privilege through test scenarios
  • 📎 Tagging tests for edition-, trace-, and role-specific enforcement
  • 📘 Generating assertions for expected denials and edge paths

Security-aware scenario generation ensures the factory doesn't just produce working software; it produces safe software.


📘 Scenario Enrichment for BDD and Studio

One of the Test Generator Agent's most user-facing roles is its ability to enrich .feature files and support Studio-based exploratory QA workflows by:

  • 📖 Adding role-, state-, and failure-path scenarios
  • 🧩 Filling in edge cases and prompts into existing .feature files
  • ✨ Enhancing readability, clarity, and auditability of QA test specs
  • 🔁 Keeping BDD specs aligned with trace + prompt context

This bridges developer-generated tests with human-readable QA artifacts.


🧩 Key BDD Enrichment Functions

  • Scenario Injection: Appends new Scenario: blocks to existing .feature files
  • Prompt-to-Gherkin Expansion: Converts Studio prompts into full Gherkin test narratives
  • Role Variant Enrichment: Adds missing role paths (e.g., Guest, Admin)
  • Condition Branch Scenarios: Adds Given/When/Then for alternate flows (e.g., empty input, retry)
  • Security/Access Markers: Adds @auth, @denied, @edition:lite tags
  • Studio Preview Markdown: Generates QA- and product-friendly descriptions for visual dashboards

🧪 BDD Example Before Enrichment

Feature: Capture payment

Scenario: Successful payment
  Given a cashier submits a valid payment
  Then the payment is recorded

🧠 After Enrichment

Feature: Capture payment

Scenario: Successful payment
  Given a cashier submits a valid payment
  Then the payment is recorded

Scenario: Duplicate payment submission
  Given a payment was already processed
  When a second request is sent
  Then the request is rejected with status DuplicatePayment

Scenario: Guest user attempts payment
  Given a user with role Guest
  When they submit a payment
  Then access is denied with 403

🔧 BDD Enrichment Skills Used

  • DetectEnrichableFeature(trace_id): Loads and parses the current .feature file
  • ProposeNewScenarios(prompt, gaps): Suggests 1–N new Gherkin scenarios
  • ApplyRoleMatrixToFeature(): Adds one scenario per uncovered role
  • InsertScenarioTags(): Adds @edition, @prompt, @security
  • WriteFeaturePatch(): Appends new scenarios into the correct file while preserving existing content
  • GenerateStudioMarkdownSummary(): Outputs readable descriptions for QA & PM dashboards

📋 Output: Studio Markdown Summary

### Test Scenario Expansion for Capture Payment

📎 Trace: payments-2025-0321
🔁 Trigger: Studio prompt "What if payment is submitted twice?"

✅ Appended to `capture_payment.feature`:
- Scenario: Duplicate payment submission
- Scenario: Guest user access denied

@tags: security, regression, prompt-driven

🔄 Scenario Enrichment Metadata

All new .feature scenarios are traced with:

scenario_source: test-generator-agent
trace_id: payments-2025-0321
trigger: qa_prompt
augmented_roles: [Guest]
edition: enterprise

Stored in test-augmentation-metadata.yaml and indexed for Studio navigation.


📊 Studio Integration

  • Test Preview Panel: Shows enriched scenarios with trace context
  • Missing Roles Dashboard: Lists roles not yet tested (used by ApplyRoleMatrixToFeature())
  • Prompt History Panel: Matches generated scenarios with QA questions
  • Audit Logs: Shows scenario origin, version, and rationale

✅ Summary

The Test Generator Agent enriches BDD and Studio workflows by:

  • ✍️ Appending intelligent scenarios to .feature specs
  • 🧠 Translating QA prompts into structured, Gherkin-based tests
  • 📎 Embedding trace, edition, and security metadata
  • 📘 Supporting QA engineers, product managers, and test reviewers with markdown insights

This creates a seamless QA experience: from test prompt → to scenario → to Studio visibility.


🤝 Integration with QA Engineer Agent and Studio

The Test Generator Agent is designed to work as a direct augmentation partner for the QA Engineer Agent and the Studio UX.

  • 📋 The QA Engineer Agent owns the QA strategy, gap tracking, test run validation, and regression feedback.
  • 🧠 The Test Generator Agent enhances QA workflows with intelligent, promptable, and observability-aware test generation.
  • 🖥️ Studio is the shared interface, where both agents are visible to QA engineers, test designers, and product managers.

🧩 Integration Points with QA Engineer Agent

  • Scenario Suggestion Sync: Agent suggests test cases for uncovered flows → QA Engineer Agent decides inclusion
  • Gap Auto-Filling: QA Agent flags missing paths → Test Generator Agent emits a .feature patch
  • Prompt Collaboration: QA prompt → Test Generator Agent simulates → QA Agent evaluates & stores
  • Feedback Loop: QA Agent responds with ✅ accepted, 🔁 needs correction, or ❌ not relevant
  • Studio Summary Indexing: Test Generator outputs markdown + tags for the QA Agent's test plan reporting
  • Role Map Expansion: QA Agent tracks role coverage; Test Generator fills missing paths per handler or edition

🧬 Skill-Based Collaboration

sequenceDiagram
    participant QAEngineerAgent
    participant TestGeneratorAgent
    participant Studio

    Studio->>QAEngineerAgent: Detect uncovered refund flow
    QAEngineerAgent->>TestGeneratorAgent: Prompt: "What if refund is rejected twice?"
    TestGeneratorAgent->>QAEngineerAgent: Emit `.feature` + `.cs` test case
    QAEngineerAgent->>Studio: Approve, reject, or comment
    Studio->>TestGeneratorAgent: Feedback logged
Hold "Alt" / "Option" to enable pan & zoom

🧠 Shared Artifacts

  • test-augmentation-metadata.yaml (used by both): Stores the trace → scenario link and origin
  • qa-report.md (QA Agent): Combines test summaries for visibility and coverage tracking
  • Enriched .feature files (QA Agent): Updated BDD specs consumed by QA and CI
  • observability-events.jsonl (QA Agent): QA traceability and status tracking
  • StudioPromptContext.json (QA Agent + Studio): Tracks the prompt → test suggestion → review cycle

📘 Example: QA Prompt Collaboration

Prompt: "What if CFO tries to cancel a refund after it was approved?"

  1. Agent proposes:
     • Scenario: CFO cannot cancel approved refund
     • Test method: Cancel_ShouldFail_WhenAlreadyApproved_ByCFO()

  2. QA Agent accepts → triggers a Git commit & Studio UI patch

  3. Agent logs:

accepted_by: qa-engineer-agent
status: enriched
scenario_type: access-control

🧾 QA Workflow in Studio

  • Prompt → Preview Panel: Shows the suggested .feature block
  • Missing Roles View: Triggers the enrichment skill for each uncovered role × handler
  • Approved Scenarios: Tracked via qa-feedback.md
  • Rejected or Needs Fix: Sent back to the agent for retry or rephrasing
  • Trace View: Visual chain: handler → test → QA prompt → scenario

🔄 Collaboration Cycle Summary

  1. QA Prompt: Studio or QA Agent trigger
  2. Agent Simulates: Scenario, test file, and trace metadata emitted
  3. QA Approves/Rejects: Through Studio
  4. Studio Annotates: Adds visual test coverage and changelogs
  5. Retry if Needed: Agent regenerates the adjusted test

✅ Summary

The Test Generator Agent integrates deeply with the QA Engineer Agent and Studio by:

  • 💡 Co-creating tests from prompts and gaps
  • 🧪 Suggesting intelligent .feature expansions
  • 📋 Tracking traceability, edition, roles, and human review
  • 📘 Powering Studio's "what if" testing UX
  • 🔁 Supporting iterative QA–AI collaboration with trace-safe retries

Together, they enable a human-AI hybrid testing workflow that is both scalable and explainable.


🎯 Self-Evaluation and Test Gap Identification

To maintain test quality autonomously, the Test Generator Agent includes a self-evaluation loop that allows it to:

  • πŸ” Identify missing or insufficient tests
  • 🧠 Reason about the impact of not testing a specific path
  • πŸ“‰ Detect discrepancies between real-world signals and generated scenarios
  • πŸ§ͺ Suggest tests based on observed gaps without relying solely on external triggers

This empowers the agent to continuously optimize test completeness even in the absence of explicit QA prompts.


🧩 Key Self-Evaluation Responsibilities

  • Trace-Scenario Coverage Review: Compares blueprint/handler definitions vs. current test artifacts
  • Role Path Validation: Detects which role × action combinations are not tested
  • Edition-Specific Variant Check: Confirms whether tests for lite, pro, enterprise, etc. exist
  • Telemetry Trace Audit: Looks for runtime spans/events that have no test match
  • Bug Replay Coverage: Compares the test set with known fixed bugs to ensure a permanent defense
  • Validator-Rule Crosscheck: Flags missing negative tests for RuleFor(...) combinations
  • Gherkin Completeness Scans: Ensures all business paths appear in .feature files with proper assertions

πŸ” Skill Set for Test Gap Identification

Skill Function
ScanTraceTestCompleteness(trace_id) Loads metadata, compares declared vs. tested
EnumerateUntestedRoles(handler, edition) Generates list of role-action gaps
CheckMissingFeaturePaths(handler) Parses .feature and detects missing states or transitions
ValidateEdgeCoverage(dto) Ensures range, null, and invalid cases are tested
CompareBugFixToTestSet(bug_id) Checks if bug conditions are replicated in current tests
IdentifyEditionTestGaps() Detects missing .feature for one or more editions
SuggestMissingAssertions() Detects scenarios without Then clauses or outcome checks
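
Conceptually, a skill like EnumerateUntestedRoles reduces to a set difference between declared and tested roles; a minimal C# sketch (type and parameter names are illustrative) follows:

using System.Collections.Generic;
using System.Linq;

public static class RoleGapScanner
{
    // Roles declared for the handler that have no corresponding test yet.
    public static IReadOnlyList<string> EnumerateUntestedRoles(
        IEnumerable<string> rolesAllowed, IEnumerable<string> rolesTested) =>
        rolesAllowed.Except(rolesTested).ToList();
}

// e.g., rolesAllowed = [FinanceManager, CFO], rolesTested = [FinanceManager] → returns [CFO]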

📘 Gap Evaluation Example

trace_id: invoice-2025-0142
blueprint_id: usecase-9241
handler: CreateInvoiceHandler
roles_allowed: [FinanceManager, CFO]
tested_roles: [FinanceManager]
gap:
  - missing_role: CFO
  - no test for zero amount
  - no feature for duplicate invoice error
  - no edition-specific `.feature` for `enterprise`

→ Agent triggers:

  • .feature addition for CFO path
  • Validator test: Amount = 0
  • Unit test: Handle_ShouldReject_WhenInvoiceExists

🧠 Self-Evaluation Feedback Formats

Markdown Summary:

📊 Self-Evaluation Summary: CreateInvoiceHandler (Trace: invoice-2025-0142)

❌ Missing:
- [ ] Role: CFO
- [ ] Negative validator test: ZeroAmount
- [ ] Scenario: Duplicate invoice error
- [ ] Edition-specific: `enterprise` test case

✅ Planned augmentation steps: 3

Metadata Log:

self_check_result: failed
missing_roles:
  - CFO
untested_conditions:
  - ZeroAmount
  - DuplicateInvoice
edition_variants_missing:
  - enterprise
next_actions:
  - trigger GenerateFeatureScenario(prompt="What if invoice already exists?")
  - trigger GenerateValidatorTest(field=Amount, condition=Zero)

🔄 Gap Feedback Loop

  1. Agent executes ScanTraceTestCompleteness()
  2. Missing elements logged into Studio trace dashboard
  3. Agent autonomously or upon QA approval generates augmented tests
  4. Metadata updated: gap_resolved = true

📎 Traceability for Test Gap Detection

Every scenario added via gap detection includes:

augmented_by: test-generator-agent
trigger: self-evaluation
gap_id: auto-detected
source_blueprint: usecase-9241

✅ Summary

Through self-evaluation, the Test Generator Agent becomes proactive, not reactive:

  • 🔍 Automatically identifies test holes
  • 📘 Generates patches to secure QA coverage
  • 📊 Feeds insights to Studio dashboards and QA checklists
  • 🔁 Enables closed-loop validation, continuously improving coverage without human instruction

🤝 Human-in-the-Loop Augmentation Mode

While the Test Generator Agent excels at autonomous test generation, it's designed to work collaboratively with human QA engineers, developers, and product managers, enabling a "human-in-the-loop" mode to:

  • ✅ Accept QA prompts
  • 🧪 Provide test previews for confirmation
  • ✍️ Accept manual edits and inject them back into the test set
  • 🔁 Support iterative refinement of .feature files, test methods, or Markdown explanations
  • 🧠 Learn from accept/reject patterns over time

This mode ensures that test generation remains auditable, adaptable, and alignable with human expertise.


🧩 Key Capabilities in Human-in-the-Loop Mode

  • Prompt-Driven Scenario Suggestion: QA types a "What if…" in Studio → the agent generates a proposal
  • Scenario Previews: Proposed Gherkin shown in a read-only or editable preview
  • Editable Markdown Summaries: QA can revise the scenario description before acceptance
  • Comment & Correction Loop: QA rejects/edits a suggestion → the agent retries with the feedback applied
  • Feedback Learning: Rejected patterns are down-ranked in embeddings and prompt planners
  • Tagged Trace Update: Marks the test as human_verified, qa_adjusted, or manual_override
  • Studio Live Collaboration: QA can submit batch prompts or comment inline on test proposals

🧠 Example Flow

sequenceDiagram
    participant QA
    participant Studio
    participant TestGen

    QA->>Studio: "What if the CFO cancels a locked invoice?"
    Studio->>TestGen: Prompt context submitted
    TestGen->>Studio: Preview scenario + test method
    QA->>Studio: Edit Gherkin + approve
    Studio->>TestGen: Submit revised version
    TestGen->>Repo: Finalize test, metadata β†’ update trace
Hold "Alt" / "Option" to enable pan & zoom

✍️ Editable Scenario Preview Example

# Suggested by agent:
Scenario: CFO cancels a locked invoice
  Given a CFO user
  When they cancel an already locked invoice
  Then the system rejects the request

# QA edits:
Scenario: CFO cannot cancel locked invoice
  Given the invoice is in "Locked" state
  And the user is CFO
  When they try to cancel it
  Then they receive a "ForbiddenOperation" error

✅ Final version submitted with tag:

qa_modified: true
original_source: test-generator-agent
reviewer: alex.k@qa-team

📘 Markdown Summary Edits

Before:

Scenario: CFO cancels locked invoice – Generated via prompt

After QA edit:

Scenario: CFO attempts restricted cancellation – Reviewed by QA

πŸ” Feedback Loop + Retry Logic

Input Result
QA clicks β€œReject: Not Relevant” Agent suppresses scenario pattern for similar prompts
QA selects β€œTry Again (Better Assertion)” Agent re-runs with stricter validation rules or alternate outcome phrasing
QA marks scenario as β€œDuplicate” Agent removes and updates metadata to avoid future duplication
QA types β€œAdd edge case for negative amount too” Triggers child prompt expansion

🧠 Learning + Memory from Human Feedback

The agent stores:

  • Rejected scenario titles
  • Accepted assertions
  • QA phrasing and tags
  • Editions and roles that consistently require expansion
  • QA reviewer preferences and behavior patterns (optional)

→ Improves generation quality over time across the platform.


📎 Metadata Example with Human Hooks

scenario: CFO cannot cancel locked invoice
source: test-generator-agent
prompt: QA prompt from Studio
review_status: accepted_with_edits
edited_by: qa.alex.k
feedback_applied: true

✅ Summary

The Test Generator Agent in human-in-the-loop mode:

  • 🤝 Enhances QA creativity with structured AI suggestions
  • 🧠 Learns from corrections to improve future outputs
  • 🖥️ Integrates directly with Studio for real-time review
  • 📄 Produces editable .feature, .cs, and Markdown assets
  • 🔁 Supports iterative, explainable, and human-verifiable QA workflows

💡 This is where AI + QA collaboration becomes seamless and scalable.


🧠 Memory, Vector Embeddings, and Similarity Prompts

To generate contextually relevant, non-redundant, and intelligently suggested tests, the Test Generator Agent relies on:

  • πŸ” Memory: Persistent trace-aligned records of what was tested, why, and how
  • 🧠 Vector embeddings: Semantic similarity across prompts, scenarios, test cases, bugs, and DTOs
  • πŸ“š Example reuse: Drawing from similar service domains to enrich test quality and coverage

This enables the agent to avoid duplication, recommend consistent patterns, and expand intelligently using learned context.


📦 What the Agent Stores and Embeds

  • Test Scenarios (.feature titles and steps): To cluster and suggest similar test ideas
  • QA Prompts (vector embeddings): To answer similar future questions better
  • DTO Structures (parsed DTOs as embeddings): To infer test conditions from similar DTO fields
  • Bug Metadata (bug_id, failure symptoms, event names): To suggest regression-preventing test logic
  • Roles-to-Handlers Maps (Role × Edition × Action): For complete role coverage suggestions
  • Handler-to-Test Mappings (from test-metadata.yaml): To detect structural test gaps
  • Blueprints & Use Cases (embeddings of domain flow text): To auto-extend coverage across domains

🧬 Embedding-Based Prompt Expansion

Prompt:

"What if an Analyst tries to approve a payment?"

→ Embedding similarity finds:

  • Guest cannot approve invoice
  • Unauthorized SupportAgent tries to cancel refund
  • Non-admin user denies large transaction

→ Suggests:

  • Scenario: Analyst user access denied
  • Method: Post_ShouldReturn403_WhenUserIsAnalyst

🧠 Skills That Use Memory & Embeddings

  • FindSimilarScenarios(trace_id, prompt): Pulls reusable .feature structures from related traces
  • InferTestsFromRelatedDTOs(dto): Suggests edge cases seen in similar DTOs (e.g., Amount, Currency)
  • PredictTestGapsFromPastBugs(): Uses bugs with similar symptoms to generate regression tests
  • ClusterUncoveredRoles(): Uses role embeddings to suggest test coverage plans
  • LearnFromQAEdits(): Stores accepted/rejected scenarios and avoids generating similar rejected paths
  • SuggestAssertionsBasedOnMemory(): Suggests Then: and Assert statements that match domain-specific expectations

📘 Example: Memory Entry

{
  "trace_id": "invoice-2025-0131",
  "handler": "CreateInvoiceHandler",
  "prompt_embedding": [0.1, 0.42, ..., 0.08],
  "scenario": "Submit invoice with null customer ID",
  "assertion": "Fails with 'CustomerId is required'",
  "roles_tested": ["FinanceManager"],
  "roles_missing": ["CFO", "Guest"]
}

πŸ” Reuse Across Microservices

If a test exists in CreateOrderHandlerTests.cs (e-commerce domain) β†’ Can suggest similar edge case tests in CreateInvoiceHandlerTests.cs (finance domain)

This supports domain-informed reuse, especially useful in high-scale factory-wide generation.


🧪 DTO Similarity Expansion Example

public class PaymentRequest {
    public decimal Amount { get; set; }
    public string Currency { get; set; }
}

→ Memory shows:

  • Amount = 0 → test for zero boundary
  • Currency = "" → test for invalid format

→ Recommends:

  • Handle_ShouldFail_WhenAmountIsZero
  • Validate_ShouldReject_WhenCurrencyIsEmpty

🧠 Prompt Learning from QA Feedback

Rejected:

"What if Guest submits a refund?" → Feedback: "Already covered by CFO test; redundant."

→ The agent embeds this and avoids similar redundant scenarios for Guest/CFO unless the roles differ materially


📎 Metadata Tracked with Embedding-Driven Suggestions

embedding_source: "prompt: unauthorized refund"
suggested_by: similarity_from_trace[invoice-2025-0123]
feedback_history: accepted_by_qa.alex.k
semantic_similarity: 0.86

✅ Summary

By using memory and vector embeddings, the Test Generator Agent:

  • 🧠 Thinks across modules, services, and past QA feedback
  • 🔁 Reuses relevant test logic without hardcoding
  • ✍️ Suggests smarter assertions, test names, and BDD flows
  • 📊 Learns and improves coverage over time, autonomously

This ensures test generation is context-aware, non-repetitive, and knowledge-enriched.


πŸ” Retry, Correction, and Trace-Driven Enhancements

The Test Generator Agent is equipped with a resilient and trace-safe retry & correction mechanism that ensures:

  • πŸ§ͺ Broken, incomplete, or rejected tests are reprocessed
  • 🧠 Prompt-based or AI-generated scenarios are refined upon feedback
  • πŸ”„ Observability-driven augmentations are trace-aware and retryable
  • πŸ‘€ Human corrections via Studio can trigger enhanced regeneration cycles

This enables self-healing test generation and QA feedback incorporation with auditability.


🧬 Retry Triggers

  • ❌ Test Lint/Validation Failure: Scenario is generated but fails structure/format rules
  • 📉 Missing Assertion Detected: Test or .feature lacks a valid outcome check
  • 🔁 QA Rejection in Studio: Scenario marked as "Inaccurate", "Not needed", or "Duplicate"
  • 📎 Bug Regression Trace Replay: Agent is asked to regenerate test coverage for the same trace ID with updated bug inputs
  • 🧠 Memory Suggestion Conflict: Duplicate scenario detected against an existing .feature

🛠 Retry Flow

sequenceDiagram
    participant Studio
    participant QAEngineer
    participant TestGen
    participant Memory

    QAEngineer->>Studio: Reject test scenario from prompt
    Studio->>TestGen: RetryScenario(trace_id, feedback="unclear THEN clause")
    TestGen->>Memory: Lookup prior attempt
    TestGen->>TestGen: Rerun GenerateTestScenarios with revised constraints
    TestGen->>Studio: Emit revised scenario for approval
Hold "Alt" / "Option" to enable pan & zoom

🧠 Retry Modes

| Mode | Behavior |
|---|---|
| `patch-only` | Adds missing methods/scenarios without regenerating the whole test class |
| `regenerate-with-feedback` | Reruns the prompt with the attached feedback (e.g., assertion too vague, role mismatch) |
| `semantic-deduplication` | Regenerates using prompt embeddings while filtering out previously generated ideas |
| `qa-intervention-loop` | Holds the test until QA explicitly reviews and resubmits the final wording |
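
A small sketch of how a retry trigger might map onto these modes; the enum, the reason strings, and the mapping are assumptions for illustration only:

```csharp
public enum RetryMode
{
    PatchOnly,
    RegenerateWithFeedback,
    SemanticDeduplication,
    QaInterventionLoop
}

public static class RetryPlanner
{
    // Picks a retry mode from the recorded retry reason; unknown reasons
    // fall back to holding the test for explicit QA review.
    public static RetryMode SelectMode(string retryReason) => retryReason switch
    {
        "missing_assertion"  => RetryMode.PatchOnly,
        "qa_rejection"       => RetryMode.RegenerateWithFeedback,
        "duplicate_scenario" => RetryMode.SemanticDeduplication,
        _                    => RetryMode.QaInterventionLoop
    };
}
```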

📘 Retry Metadata Example

```yaml
trace_id: payments-2025-0471
scenario_id: refund_duplicate
retry_count: 2
retry_reason: "Missing Then clause"
last_feedback: "Scenario lacks concrete outcome"
status: resolved
regenerated_by: test-generator-agent
```

✍️ Human Correction Loop (Studio-Driven)

| Action | Effect |
|---|---|
| 📝 “Needs better Then clause” | Agent reconstructs the test output with stronger assertion logic |
| ❌ “Duplicate of CFO approval test” | Agent suppresses this scenario in current and future trace ID contexts |
| ➕ “Add variation for 'InvalidCurrency' too” | Triggers child prompt expansion and batch generation |

🔎 Bug Trace Replay Correction

| Input | Action |
|---|---|
| Bug #4871: “Refund allowed after approval” | Agent verifies coverage in `.feature` + handler test |
| ❌ No test found | Agent re-executes the test plan for the trace ID |
| ✅ | Outputs a regression test for the scenario “Prevent refund after approval” |

πŸ” Observability-Driven Retry

When a telemetry span or log indicates:

  • 5xx errors
  • Timeout loops
  • Malformed DTO inputs

Agent:

  1. Queries memory for trace ID
  2. Evaluates: β€œIs this issue test-covered?”
  3. If not, retries scenario generation for that path
  4. Adds metadata: trigger=retry:observability
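
A control-flow sketch of those four steps; `IMemoryStore`, `IScenarioGenerator`, and both record types are hypothetical seams introduced only for illustration:

```csharp
using System.Collections.Generic;

public sealed record CoverageEntry(bool HasCoveringTest);

public sealed record GeneratedScenario(
    string TraceId, Dictionary<string, string> Metadata);

public interface IMemoryStore
{
    CoverageEntry? FindByTrace(string traceId);
}

public interface IScenarioGenerator
{
    GeneratedScenario GenerateForTrace(string traceId);
    void Emit(GeneratedScenario scenario);
}

public sealed class ObservabilityRetry
{
    private readonly IMemoryStore _memory;
    private readonly IScenarioGenerator _generator;

    public ObservabilityRetry(IMemoryStore memory, IScenarioGenerator generator)
        => (_memory, _generator) = (memory, generator);

    public void OnAnomalousSpan(string traceId)
    {
        var entry = _memory.FindByTrace(traceId);             // 1. query memory
        if (entry is { HasCoveringTest: true })               // 2. already covered?
            return;

        var scenario = _generator.GenerateForTrace(traceId);  // 3. retry generation
        scenario.Metadata["trigger"] = "retry:observability"; // 4. tag the artifact
        _generator.Emit(scenario);
    }
}
```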

📎 Tracking Retry History in Artifacts

Each `.feature` and test method carries retry annotations, for example:

```csharp
[RetryTrace(Count = 2, LastReason = "QA feedback: unclear output")]
```

```gherkin
@retry_count:2
@feedback_source:studio.qa-review.alex.k
```
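
`RetryTrace` is not a built-in .NET attribute; a minimal hypothetical definition consistent with the usage above might be:

```csharp
using System;

// Hypothetical marker attribute recording retry provenance on generated tests.
[AttributeUsage(AttributeTargets.Class | AttributeTargets.Method)]
public sealed class RetryTraceAttribute : Attribute
{
    public int Count { get; set; }
    public string? LastReason { get; set; }
}
```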

✅ Summary

The Test Generator Agent uses trace-aware, intelligent retry strategies to:

  • 🔁 Improve test precision via QA feedback
  • 🧠 Refine generative logic using memory and embeddings
  • 📊 Ensure no prompt or runtime gap goes untested
  • 📘 Maintain audit-safe artifacts with retry tags and feedback context

This ensures that test generation is never final — only validated through continuous collaboration and reasoning.


🎯 Studio Hooks and Markdown Test Storytelling

The Test Generator Agent is deeply integrated with the Studio UI, serving both:

  • 🧠 As a behind-the-scenes test generator, and
  • 📘 As a human-facing storyteller that explains what it generated, why, and how.

It uses Studio hooks and markdown-based summaries to make test artifacts:

  • Understandable to QA engineers
  • Reviewable by product stakeholders
  • Traceable by developers and test leads

📦 Outputs Connected to Studio

| Output | Purpose |
|---|---|
| `.feature` scenarios | Visualized in Studio test coverage dashboards |
| Markdown scenario summaries | Displayed in the “Suggested Tests” or “Scenario Details” panels |
| Feedback metadata | Drives accept/reject flows and traceability |
| Retry context | Enables inline regeneration with human guidance |
| Coverage links | “Test created by agent” → click to view scenario and trace |
| Prompt logs | Shows QA prompts and associated generated scenarios |

🧩 Markdown Test Storytelling Format

Agent emits a QA-readable summary for each test it generates, including:

  • ✅ Scenario title
  • 🧠 Source reasoning
  • 🔁 Retry or enrichment history
  • 🔐 Role and edition tested
  • 📘 Assertion summary
  • 📎 Trace metadata

📘 Example Output: Markdown Story

```markdown
### 🔁 Scenario: Refund is issued twice

📎 Trace: refund-2025-0143
🧠 Source: QA prompt — “What if refund is attempted more than once?”

✅ Test generated:
- Type: BDD + Unit
- Edition: `lite`
- Roles tested: `SupportAgent`

📄 Scenario Summary:

Given a support agent has already issued a refund
When they try to issue it again
Then the system rejects it with error "DuplicateRefund"

🔁 Feedback:
- Attempt #1: Missing THEN clause → resolved
- QA Comment: "Consider rephrasing expected outcome"

📘 Status: ✅ Approved and committed by QA
```

🔄 Studio UX Integration Points

| Studio Component | Agent Behavior |
|---|---|
| Prompt Console | Receives QA test idea → generates suggestions |
| Scenario Preview Panel | Displays formatted `.feature` with QA feedback tools |
| Trace View | Connects `.feature` and `.cs` output to the originating handler |
| Test Gaps Dashboard | Highlights untested paths; the agent fills them in |
| Retry Request Button | Re-runs scenario generation for a single trace, role, or edition |
| Markdown Viewer | Shows the scenario story in human-friendly form |

🧠 Enhanced QA Review Flow

  1. QA types: “What happens if CFO cancels after approval?”
  2. Agent generates:
     • A `.feature` scenario
     • A markdown story with reasoning
     • A suggested test name: `Cancel_ShouldFail_WhenAlreadyApproved_ByCFO()`
  3. QA sees:
     • The scenario
     • The markdown explanation
     • “Accept / Edit / Retry” controls
  4. Approval → commits to the repo and marks the trace as “QA-verified”


📎 Trace-Aware Tags in Markdown

All story outputs include:

```yaml
trace_id: refund-2025-0143
scenario_id: refund_duplicate
prompt_source: studio.prompt.qa.alex.k
edition: lite
role: SupportAgent
retry_count: 1
source_skill: GenerateFeatureScenario
```

→ These enable filtering, search, and trace-to-test mapping inside Studio.


✅ Summary

The Test Generator Agent enhances the Studio UX by:

  • 📘 Emitting clear, role/edition-aware scenario summaries
  • 🧠 Explaining its AI reasoning and augmentation history
  • 🔁 Supporting in-place review, edit, and regeneration
  • 🧪 Closing the gap between prompt → code → QA validation
  • 📎 Maintaining trace-safe, auditable test metadata in human-readable form

This turns Studio into an AI-augmented test design surface — not just a dashboard.


📊 Metrics, Coverage Impact, and Validation Reports

As a key QA automation agent, the Test Generator Agent must not only generate test assets — it must also:

  • 🔍 Measure what it covers
  • 📈 Track its impact on system-wide coverage
  • 🧾 Provide clear, reportable outputs for QA, CI/CD, and Studio dashboards

Its metrics and validation system makes test augmentation quantifiable, auditable, and optimizable over time.


📦 Core Metrics Emitted

| Metric | Description | Format |
|---|---|---|
| `testgen.scenario.count` | Total scenarios generated per trace/session | Integer |
| `testgen.methods.appended` | Number of unit/integration test methods added | Integer |
| `testgen.coverage.delta` | % increase in test coverage after augmentation | Float (0–100) |
| `testgen.role.variants.tested` | New role-action paths added | List of role × action |
| `testgen.retry.count` | Total retries performed per trace ID | Integer |
| `testgen.qa.acceptance_rate` | Ratio of QA-accepted to proposed scenarios | Percentage |
| `testgen.enrichment.tags` | Scenario tags emitted (e.g., `@security`, `@edge`) | Count per tag |
| `testgen.prompt.success_rate` | % of successful test generations per prompt | Percentage |
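
A brief sketch of emitting two of these counters with .NET's `System.Diagnostics.Metrics` API; the meter name and the choice of this API are assumptions rather than a documented part of the agent:

```csharp
using System.Collections.Generic;
using System.Diagnostics.Metrics;

public static class TestGenMetrics
{
    // Hypothetical meter name for the agent's instruments.
    private static readonly Meter Meter = new("ConnectSoft.TestGeneratorAgent");

    private static readonly Counter<int> ScenarioCount =
        Meter.CreateCounter<int>("testgen.scenario.count");

    private static readonly Counter<int> RetryCount =
        Meter.CreateCounter<int>("testgen.retry.count");

    public static void RecordScenarios(string traceId, int count) =>
        ScenarioCount.Add(count,
            new KeyValuePair<string, object?>("trace_id", traceId));

    public static void RecordRetry(string traceId) =>
        RetryCount.Add(1,
            new KeyValuePair<string, object?>("trace_id", traceId));
}
```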

📈 Coverage Impact Calculation

The agent hooks into the Test Coverage Validator Agent, comparing:

  1. Pre-augmentation state
  2. Post-augmentation state

Using metrics such as:

  • Handlers with ≥1 unit test
  • DTO fields with negative test cases
  • Roles × editions covered
  • Gherkin `.feature` step completeness
  • Scenarios with real assertions (vs. placeholders)

→ It then emits a delta report, for example:

```yaml
trace_id: invoice-2025-0147
coverage_before:
  unit_tests: 3
  feature_scenarios: 1
  roles_tested: [FinanceManager]
coverage_after:
  unit_tests: 5
  feature_scenarios: 3
  roles_tested: [FinanceManager, Guest, CFO]
delta:
  feature_scenarios: +2
  unit_tests: +2
  roles_tested: +2
```
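
The delta itself is simple count and set arithmetic; a sketch, assuming a snapshot shape that mirrors the YAML report above:

```csharp
using System.Collections.Generic;
using System.Linq;

// Assumed snapshot shape mirroring the delta report above.
public sealed record CoverageSnapshot(
    int UnitTests,
    int FeatureScenarios,
    IReadOnlyList<string> RolesTested);

public static class CoverageDeltaCalculator
{
    public static (int UnitTests, int FeatureScenarios, int NewRoles) Compute(
        CoverageSnapshot before, CoverageSnapshot after) =>
        (after.UnitTests - before.UnitTests,
         after.FeatureScenarios - before.FeatureScenarios,
         // Roles present after augmentation but not before.
         after.RolesTested.Except(before.RolesTested).Count());
}
```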

📘 Validation Report Example

```json
{
  "trace_id": "refund-2025-0143",
  "status": "success",
  "scenarios_added": 2,
  "unit_tests_appended": 1,
  "qa_feedback": {
    "accepted": 2,
    "rejected": 0,
    "requires_followup": false
  },
  "assertions_present": true,
  "tags": ["edge", "security", "edition:lite"]
}
```

✅ This metadata is pushed to:

  • 📊 Studio dashboards
  • 📋 QA Engineer Agent reports
  • 📤 PR annotations (via Pull Request Creator Agent)

🧪 Metrics for Studio Dashboards

| Dashboard | Tracked Metrics |
|---|---|
| Test Coverage Heatmap | Trace ID × scenario count, role coverage, edition paths |
| Prompt Coverage Map | % of QA prompts that produced accepted scenarios |
| Security Validation Grid | Role escalation/denial paths added via the Test Generator |
| Regression Readiness | Bug-to-test trace validation reports |
| Edition Completeness Matrix | `lite`, `pro`, `enterprise` scenario count per use case |

📄 Markdown Summary Metrics (QA-Friendly)

```markdown
### 🧾 Test Augmentation Summary — CreateInvoiceHandler

📎 Trace: invoice-2025-0142
📘 Prompt: “What if invoice already exists?”

✅ Tests Added:
- Scenarios: 2
- Unit Tests: 1
- Roles Added: [CFO, Guest]

🔐 Tags:
- @security
- @edge
- @edition:enterprise

🧠 QA Feedback:
- ✅ Accepted: 2
- ❌ Rejected: 0
- 🟡 Retry: 0

📈 Coverage Delta: +22%
```

✅ Summary

The Test Generator Agent produces clear, actionable QA metrics, including:

  • 📊 Scenario count, retry count, role path coverage
  • 📈 Coverage delta with before/after snapshots
  • 🧾 QA acceptance and reasoning logs
  • 🔁 Retry efficiency and feedback loop summaries
  • 📘 Studio and markdown integration for visibility and planning

This makes the agent not just a test creator, but a test strategist with measurable value.


✅ Final Summary

The Test Generator Agent is the AI-first, prompt-aware, and observability-augmented testing assistant within the ConnectSoft QA Engineering Cluster. It exists to:

Proactively augment test coverage using human prompts, trace analysis, telemetry, and domain understanding — filling the gaps left by static test generation.

It complements the Test Case Generator Agent by operating where intelligence, behavior, and reasoning are required to suggest:

  • ✍️ Exploratory and edge-case tests
  • 📘 Rich BDD scenarios
  • 🔁 Test expansions based on feedback, bug reports, and edition paths
  • 📊 Structured, traceable QA metadata
  • 🤝 Collaboration cycles with QA and Studio

🧩 Feature Recap

| Area | Capabilities |
|---|---|
| Prompt-to-Test | QA enters “what if” → agent emits `.feature`, test methods, markdown |
| Gap Closure | Detects missing roles, editions, assertion cases |
| Security Testing | Suggests 401/403, escalation, abuse prevention tests |
| Edition Awareness | Handles `lite`, `pro`, `enterprise` test variants |
| Self-Evaluation | Auto-detects test gaps via metadata, DTO rules, or bug history |
| Studio Integration | Preview + accept/reject cycle, prompt tracing, markdown summaries |
| Memory & Embeddings | Learns from previous prompts, test structures, DTO patterns |
| Retry & Corrections | Human feedback loop, retry metadata, audit tags |
| Observability-Aware | Generates tests based on telemetry, logs, and event traces |
| Test Impact Reports | Delta analysis for QA coverage, trace enrichment, edition completeness |

πŸ” Test Generator vs. Test Case Generator β€” Final Comparison

Feature Test Case Generator Test Generator Agent
🎯 Trigger Static artifact (handler, controller, blueprint) Prompt, test gap, QA input, telemetry, bug
🧱 Input Code structure, DTO, blueprint Observability, prompts, bugs, gaps, QA reviews
πŸ“€ Output .cs, .feature, test-metadata.yaml .feature, augmented tests, markdown summaries
πŸ” Retry On failure or missing test On QA rejection, prompt retry, gap detection
πŸ‘€ Human-In-Loop Rare Core interaction pattern (Studio QA loop)
πŸ“Š Coverage Role Baseline test scaffolding Strategic augmentation, behavioral completeness
πŸ” Security & Role Paths Limited Robust role variant & access denial coverage
🧠 Skills Deterministic, rule-based generation AI prompt planners, OpenAI-driven scenario simulation
πŸ“˜ BDD Integration Generated from handler/ports Enriched from human prompts and coverage analysis
🧠 Observability Not integrated Uses spans, logs, runtime feedback
πŸ“Ž Trace Tagging Static alignment Dynamic + revision-aware + feedback-tagged
πŸ§ͺ Use Case β€œGenerate initial test set” β€œClose gaps, explore untested paths, simulate user behavior”

🔚 Closing Notes

The Test Generator Agent is:

  • 🧠 An intelligent QA collaborator
  • 📘 A scenario inventor and test storyteller
  • 🔁 A self-correcting system enhancer
  • 📊 A measurable contributor to QA success

It helps ConnectSoft achieve a full-stack, AI-augmented, and observability-driven software testing platform — at scale.