# Test Generator Agent Specification

## Purpose

The Test Generator Agent is an AI-first, behavior-aware, exploratory test generation agent that detects gaps, anomalies, or incomplete paths in the system and generates hypothetical, behavioral, or runtime-inspired test cases, even when code or handlers do not directly define them.
Unlike the Test Case Generator Agent, which strictly scaffolds test files from static inputs (handlers, DTOs, validators), this agent acts as an intelligent tester, simulating what a skilled QA engineer or SDET would do when:
- Reviewing event logs or traces
- Exploring undocumented edge cases
- Validating role-based behavior
- Challenging the system's assumptions
- Predicting future failure modes
## What This Agent Focuses On

| Focus Area | Description |
|---|---|
| Gap Discovery | Finds missing test cases not covered by scaffolds |
| Behavior Inference | Infers test scenarios from logs, telemetry, and event streams |
| AI-Prompted Exploration | Uses OpenAI prompts to ask "What could go wrong?" |
| Runtime Context Awareness | Uses production logs, synthetic data, or simulations |
| Augmented .feature Generation | Extends test libraries with scenario-driven Gherkin cases |
| Security & Role Testing | Generates privilege escalation and unauthorized access tests |
| Studio Scenario Builder | Suggests missing cases or test chains in the Studio interface |
| Negative Path Hypotheses | "What if X fails?" prompts that emit tests for simulation and validation |
## Test Generator Agent in Action

### Example Blueprint

- Use Case: CapturePayment
- Role: Cashier
- Coverage: 92%, but missing fraud validation

### Trigger

- Studio detects a lack of fraud simulation tests
- QA engineer submits an exploratory prompt: "What happens if currency is changed mid-transaction?"

### Output

- Adds Scenario: Unexpected currency switch to capture_payment.feature
- Suggests assertion: Should reject mixed currency payments
- Links the test with trace payments-2025-0429 and gap ID fraud-path-003
## How It Extends the Factory

| Impact Area | Value |
|---|---|
| Post-hoc validation | Adds depth beyond code structure |
| Promptable reasoning | Explores untested flows via AI |
| Security coverage | Challenges role definitions and abuse paths |
| Studio augmentation | Powers "what's missing?" UX for QA engineers |
| Regression reinforcement | Adds tests based on recent bugs or telemetry warnings |
| Multi-edition awareness | Extends coverage by injecting region-, locale-, or tenant-specific edge tests |
## Summary

The Test Generator Agent is:
- An adaptive testing thinker, not just a test scaffolder
- A prompt-based scenario designer capable of inferring unknowns
- A complement to the Test Case Generator Agent; together they form a closed loop of test coverage

It is ConnectSoft's QA assistant for AI-powered exploratory testing, extending trust, security, and resilience across every service.

## Strategic Position in the QA Cluster and Platform Flow

The Test Generator Agent resides in the QA Engineering Cluster, working as a behavioral test expansion engine alongside:
- Test Case Generator Agent
- QA Engineer Agent
- Test Coverage Validator Agent
- Bug Investigator Agent
- Bug Resolver Agent

It complements the Test Case Generator Agent by offering runtime-aware, user-behavior-simulated, and exploratory test generation.
## Full Factory Flow Positioning

flowchart TD
    A[Blueprint Finalized] --> B[Handlers/Controllers Scaffolded]
    B --> C[TestCaseGeneratorAgent]
    C --> D[TestArtifactsGenerated]
    D --> E[TestCoverageValidatorAgent]
    E -->|Gaps Found| F[TestGeneratorAgent]
    F --> G["Augmented Tests (.feature, hypothesis)"]
    G --> H[QAEngineerAgent]
    H --> I[Studio Coverage Preview]
    subgraph QA Cluster
        C
        E
        F
        H
    end

Trigger: Detected gap, Studio prompt, telemetry event, or QA action
Output: Augmented test cases, exploratory .feature files, Markdown hypothesis reports
## Strategic Collaborators

| Agent | Relationship |
|---|---|
| QA Engineer Agent | Accepts test augmentation requests via Studio prompts |
| Test Coverage Validator Agent | Triggers this agent on low coverage or a missing role/scenario |
| Bug Investigator Agent | Submits failed event traces or logs for hypothesis testing |
| Studio | Embeds the agent as a "Suggest test" helper and "What-if analyzer" |
| Test Memory Service | Provides embeddings, historical test knowledge, or case similarity retrievals |
## Factory Roles

| Phase | Test Generator Agent's Role |
|---|---|
| Generation | Not involved |
| Validation | Triggered after initial test generation |
| Augmentation | Adds BDD scenarios and unexpected-flow coverage |
| Documentation | Outputs hypothesis-driven summaries |
| Bug Handling | Suggests reproduction or defense tests |
| Studio UX | Embedded as a QA-side assistant for test brainstorming |
## Example Activation Scenarios

| Trigger | Result |
|---|---|
| "Guest user caused null ref in production" | Agent generates an auth bypass test |
| QA asks "What if amount is changed while approved?" | Agent emits Scenario: Changing amount after approval |
| Edition "lite" lacks a currency test | Agent adds Scenario: Lite edition currency mismatch |
| Telemetry shows high 400 errors for invalid format | Agent adds edge-case tests for known malformed inputs |

## Platform Cluster Inclusion
cluster: qa-engineering
agent: test-generator-agent
position:
- post-test-validation
- pre-pr-check
- pre-release-scenario-expansion
- on-demand via Studio
activation_modes:
- trace-aware
- event-based
- prompt-triggered
- gap-fill
## Summary

The Test Generator Agent holds a flexible, reactive position in the QA flow:
- Triggered after standard test generation
- Supports human-in-the-loop or automatic augmentation
- Integrates across Studio, QA, bug handling, and telemetry analysis
- Strengthens ConnectSoft's commitment to observability-first and validation-enforced automation

## Responsibilities

The Test Generator Agent is responsible for augmenting, extending, or inventing tests in response to:
- Coverage gaps
- Hypothetical failure modes
- Observability triggers
- QA prompts or Studio workflows
- Bug patterns or edge behaviors

It does not replace the Test Case Generator Agent; it complements it with AI-powered reasoning and behavioral expansion.
## Key Responsibilities Breakdown

| Responsibility | Description |
|---|---|
| 1. Scenario-Based Test Synthesis | Generate .feature files or assertions from human prompts, blueprint intent, or gaps |
| 2. Telemetry-Informed Test Creation | Use runtime metrics or logs to generate edge-case tests |
| 3. Security Flow Simulation | Generate tests for role misuse, unauthorized access, and injection paths |
| 4. Hypothesis-Based Coverage | Propose test cases based on inferred missing logic (e.g., "what if input is corrupted?") |
| 5. Prompt-Based QA Test Expansion | Turn freeform QA questions ("What happens if...") into test logic |
| 6. Role-Conditional Test Variants | Add missing test scenarios per role-action pairing across editions |
| 7. Studio Integration for Suggestions | Populate "missing scenario" panels in Studio or Markdown summaries |
| 8. Negative Path Discovery | Proactively generate failure and edge-case sequences |
| 9. Trace Replay or Test Recovery | Rebuild test cases based on logs, bug reports, or failed executions |
| 10. Coverage Rebalancing | Fill gaps where no unit, validator, or BDD tests were previously generated |
## What It Does Not Do

| Out of Scope | Reason |
|---|---|
| Standard handler/unit test scaffolding | Covered by the Test Case Generator Agent |
| Basic DTO validator test creation | Already handled from RuleFor(...) patterns |
| Static test folder scaffolding | Not its role; it works on runtime signals, prompts, and feedback |
| Snapshotting full service flows | Done by the QA Agent in the workflow validation stage |
## Responsibility Examples

### Example 1 - Observability-Based Trigger

Telemetry shows 500 errors during the partial refund flow.
- Agent simulates the conditions
- Emits test: Scenario: Partial refund over max limit
- Suggests a BDD step definition plus validator injection

### Example 2 - Prompt-Based Expansion

QA enters in Studio: "What happens if payment is submitted twice?"
- Agent proposes tests: Handle_ShouldReject_WhenPaymentIsDuplicate() and Scenario: Duplicate payment request is rejected
- Suggests extending the integration test and the .feature file

### Example 3 - Role-Missing Test

Coverage shows the CFO role is not tested in the enterprise edition.
- Agent adds:

Scenario: CFO approves high-value invoice
  Given a CFO user
  When they approve an invoice of 10,000
  Then the approval succeeds
## Artifact-Level Responsibilities

| Artifact | Action |
|---|---|
| *.feature | Create, enrich, or patch new BDD scenarios |
| *.cs test files | Append hypothesis-based test methods |
| test-metadata.yaml | Emit augmented_by: test-generator-agent |
| observability-events.jsonl | Add logs tagged with source=telemetry, trigger=prompt |
| qa-feedback.md | Summarize new suggestions in Studio format |
## Summary

The Test Generator Agent's responsibilities are centered around:
- Intelligence (inferred scenarios, gaps, behaviors)
- Flexibility (prompt- or telemetry-based activation)
- Completeness (role paths, failure modes, missed conditions)
- Usability (artifacts feed directly into CI/CD, Studio, and PRs)

It serves as the creative and critical-thinking arm of ConnectSoft's QA cluster, inventing the tests that others miss.

## Inputs

Unlike the Test Case Generator Agent (which consumes static artifacts like handlers and DTOs), the Test Generator Agent takes in a rich, dynamic, multi-modal input set that allows it to:
- Simulate real-world test gaps
- Respond to behavior deviations or failures
- Enrich test coverage based on runtime observations and prompts
## Primary Input Categories

| Input Type | Description | Example |
|---|---|---|
| Blueprint & Trace Metadata | Contextual reference for feature, roles, edition | trace_id: payment-2025-0147, blueprint_id: usecase-9342 |
| Execution Gaps from Test Coverage Validator Agent | Identified missing tests by type, role, or scenario | "No role test found for CFO in CancelInvoiceHandler" |
| Telemetry, Logs, and Spans | Runtime events, logs, traces, or error spikes | 500 errors during refund, event: UnexpectedCurrencyFormat |
| QA Prompts from Studio | Free-form or structured queries initiated by QA | "What if payment is cancelled after capture?" |
| Edition-Specific Overrides | Per-edition roles, behaviors, toggles, or constraints | enterprise emits InvoiceAuditLogged, lite skips tax validation |
| Failed Test Executions or Bug Reports | History of test failures or uncovered bugs | "Duplicate invoice bug, no scenario covers it" |
| Memory / Similar Test Lookups | Historical tests for similar handlers or scenarios | From MicroserviceMemoryIndex or vector embeddings |
| DTO / Domain Object Snapshots | Structural references for test input generation | CreateInvoiceInput, RefundRequest |
| Existing .feature Files (for extension) | Used to enrich or patch scenarios | capture_payment.feature |
| Authorization Role Matrix | Map of role-to-action coverage by edition | Used to trigger 403, escalation, or bypass tests |

## Sample Enriched Input
trace_id: invoice-2025-0143
blueprint_id: usecase-9241
existing_coverage:
unit_tests: 3
bdd_scenarios: 2
roles_tested: [FinanceManager]
roles_missing: [CFO, Guest]
telemetry:
recent_errors:
- event: "NullReference in RefundProcessor"
- code: 500
- payload: "RefundAmount: null"
qa_prompt: "What if refund is issued twice?"
dto_structure:
RefundRequest:
- RefundAmount: decimal
- Reason: string
Result: the agent generates Scenario: Duplicate refund prevention, a Handle_ShouldThrow_WhenRefundIsDuplicate unit test, and a .feature augmentation with the CFO role.

## Human-Centered Inputs

| Source | Format |
|---|---|
| QA Studio Prompt | "What if currency is changed post approval?" |
| Bug Resolver Comment | Bug #4812: missing test for customer type = corporate |
| Manual Edition Override | "Add auth scenario for Guest user in lite edition" |

These are structured by the agent's skill planner into actionable test-generation sequences.
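As an illustration of how such human-centered inputs could be normalized, the sketch below shows a hypothetical prompt planner turning a Studio prompt into an ordered list of skill invocations. The StudioPrompt and TestGenerationStep types, and the exact step sequence, are assumptions made for illustration; only the skill names come from this specification.

```csharp
// Hypothetical sketch of normalizing a Studio prompt into a test-generation plan.
// Type names and the step ordering are illustrative only.
using System.Collections.Generic;

public record StudioPrompt(string Text, string Handler, string Edition, string[] Roles);

public record TestGenerationStep(string Skill, IDictionary<string, string> Arguments);

public static class PromptPlanner
{
    public static IReadOnlyList<TestGenerationStep> Plan(StudioPrompt prompt)
    {
        return new List<TestGenerationStep>
        {
            // Always start by loading trace, blueprint, and existing coverage.
            new("LoadTestContext", new Dictionary<string, string> { ["handler"] = prompt.Handler }),
            // Turn the free-form question into one or more concrete edge cases.
            new("ExpandPromptIntoPaths", new Dictionary<string, string> { ["prompt"] = prompt.Text }),
            // Emit Gherkin and unit-test artifacts for each planned path.
            new("GenerateTestScenarios", new Dictionary<string, string> { ["edition"] = prompt.Edition }),
            new("EmitTestArtifacts", new Dictionary<string, string>())
        };
    }
}
```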
## Semantic Inputs from Logs
{
"event": "InvoiceApprovalFailed",
"role": "Analyst",
"message": "Unauthorized access attempt",
"trace_id": "invoice-2025-0311"
}
The agent infers:
- A missing test case for an unauthorized Analyst attempting approval
- A proposed .feature scenario plus an integration test

## Prompt Template Input
{
"prompt": "What happens if refund amount is negative?",
"context": {
"handler": "IssueRefundHandler",
"validator": "RefundRequestValidator",
"roles": ["SupportAgent"],
"blueprint_id": "usecase-8014"
}
}
Skill invoked: ProposeEdgeCaseFromPrompt
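A minimal sketch of what a ProposeEdgeCaseFromPrompt skill could look like follows. The skill name and the prompt context fields come from the example above; the IScenarioModel abstraction, the method signature, and the EdgeCaseProposal shape are assumptions, not the platform's actual API.

```csharp
// Illustrative sketch only; the real skill interface may differ.
using System.Threading.Tasks;

public record EdgeCaseProposal(string ScenarioTitle, string SuggestedTestMethod, string ExpectedOutcome);

public interface IScenarioModel
{
    // Wraps whatever LLM completion service the agent uses (e.g., an OpenAI-backed planner).
    Task<string> CompleteAsync(string prompt);
}

public class ProposeEdgeCaseFromPromptSkill
{
    private readonly IScenarioModel _model;
    public ProposeEdgeCaseFromPromptSkill(IScenarioModel model) => _model = model;

    public async Task<EdgeCaseProposal> ExecuteAsync(string qaPrompt, string handler, string validator)
    {
        // Ground the model in the handler/validator context before asking for an edge case.
        var instruction =
            $"Handler: {handler}\nValidator: {validator}\nQA question: {qaPrompt}\n" +
            "Propose one edge-case scenario title, a test method name, and the expected outcome.";
        var raw = await _model.CompleteAsync(instruction);

        // A real implementation would parse structured output; here the raw text stands in for the title.
        return new EdgeCaseProposal(raw, $"Handle_ShouldFail_When{handler}EdgeCase", "Request rejected");
    }
}
```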
## Summary

The Test Generator Agent consumes:
- Blueprint context
- Test gap metadata
- Runtime telemetry
- Human prompts
- Memory and prior test embeddings

This multi-source fusion of design-time, runtime, and interactive inputs makes the agent capable of intelligent, behavioral test expansion unmatched by static test generation tools.

## Outputs

The Test Generator Agent produces intelligent, augmented, and behavior-focused test artifacts that extend the standard test suite. These outputs are:
- Expressive and traceable
- Hypothetical and exploratory
- Aligned to observed gaps or QA prompts
- Structured for CI/CD, QA agents, and Studio consumption

## Primary Output Artifacts

| Output Type | Description | Format/Example |
|---|---|---|
| Augmented .feature Scenarios | New or patched BDD scenarios derived from prompts, logs, or test gaps | refund_flow.feature |
| Hypothesis Test Cases | Test files or methods appended to existing unit/integration tests | Handle_ShouldReject_WhenRefundExceedsLimit() |
| Studio Markdown Summary | Human-readable insights for QA, trace viewers, and prompt responses | qa-augmented-tests.md |
| Test Augmentation Metadata | JSON or YAML mapping of trace to gap to test | test-augmentation-metadata.yaml |
| Observability Event Logs | Trace-linked reasoning, AI inferences, scenario triggers | testgen-observability.jsonl |
| Retrospective Gap Patch Commits | Optional Git-based diff patches for test extension | patch-test-trace-0427.diff |
| Prompt-to-Test Response Bundles | Structured records of the QA-query-to-generated-test flow | Used by the Studio feedback panel |
## Output Example: Augmented .feature File
@trace_id: refund-2025-0182
@augmented_by: test-generator-agent
@source: prompt
Feature: Refund flow edge cases
Scenario: Refund is issued twice
Given a support agent issued a refund
When they try to issue it again
Then the system should reject it with status "DuplicateRefund"
Scenario: Negative refund amount
Given a refund request with amount = -100
Then the request is rejected
## Output Example: Hypothetical Test Method

Appended to IssueRefundHandlerTests.cs:
[TestMethod]
[TraceId("refund-2025-0182")]
[AugmentedBy("test-generator-agent")]
public async Task Handle_ShouldThrow_WhenRefundAmountIsNegative()
{
var input = new RefundRequest { RefundAmount = -100 };
var result = await handler.Handle(input);
Assert.IsFalse(result.IsSuccess);
Assert.AreEqual("Refund amount must be positive", result.Error.Message);
}
## QA Summary Output (Markdown)

### QA Scenario Augmentation - Trace: refund-2025-0182

- Added 2 BDD scenarios to `refund_flow.feature`
- Appended 1 unit test to `IssueRefundHandlerTests.cs`
- Scenario: Duplicate refund attempt - status: handled
- Scenario: Negative refund - validator failed as expected

Reasoning: Triggered by QA prompt "What if refund is issued twice?"

## test-augmentation-metadata.yaml
trace_id: refund-2025-0182
augmented_by: test-generator-agent
source: studio-prompt
test_type: bdd + unit
new_scenarios:
- Duplicate refund attempt
- Negative refund
linked_artifacts:
- refund_flow.feature
- IssueRefundHandlerTests.cs
roles_covered: [SupportAgent]
## Observability Events

Emitted to testgen-observability.jsonl:
{
"event": "TestAugmented",
"trace_id": "refund-2025-0182",
"source": "qa_prompt",
"scenario": "Refund is issued twice",
"roles_covered": ["SupportAgent"],
"edition": "lite"
}
## Git Patch (Optional)

When enabled, the agent also emits a Git patch (e.g., patch-test-trace-0427.diff) containing the added scenarios and test methods, so review pipelines can inspect or apply the change as a diff.
## Summary

The Test Generator Agent emits:
- Test files and .feature enhancements
- Reasoned Markdown outputs for QA and Studio
- Metadata linking test, trace, and prompt
- Observability events for audit and trace replay
- Optional patch artifacts for review pipelines

These outputs empower QA, Studio, CI/CD, and trace analytics with dynamic, intelligent, explainable test expansions.

## Reactive Triggers: When and Why the Agent Is Invoked

Unlike statically triggered agents (e.g., the Test Case Generator), the Test Generator Agent is activated reactively or on demand, when existing tests are insufficient or complex behaviors demand exploration.
It responds to intelligent signals, not just code artifacts.

## Triggering Modes

| Mode | Description | Example |
|---|---|---|
| Coverage Gap Trigger | Fired when the Test Coverage Validator Agent detects untested paths | "No BDD scenario for CFO role in ApproveInvoice" |
| QA Prompt Trigger | Triggered via Studio input or a prompt request | "What happens if a refund is re-issued after cancellation?" |
| Observability Trigger | Based on runtime telemetry (logs, errors, anomalies) | "Spike in 400 errors on POST /api/refund" |
| Bug Pattern Trigger | Initiated when the Bug Resolver Agent detects an untested fix area | "Bug #4201 lacks test reproduction for invalid tax exemption" |
| Edition/Role Test Gap | Detected when edition-specific paths aren't tested | "Enterprise edition lacks Guest user rejection case" |
| New Business Rule Trigger | Agent re-scans the blueprint after a major rule/validation update | "Added late fee logic to InvoiceDue; generate overdue scenario" |
| Prompt Simulation Mode | Interactive QA request to simulate variants or override parameters | "Simulate invalid dates across timezones" |

## Reactive Lifecycle Example
sequenceDiagram
participant CoverageValidator
participant TestGenerator
participant QAEngineer
participant Studio
CoverageValidator->>TestGenerator: Trigger on missing scenario (CFO approval)
TestGenerator->>QAEngineer: Propose new `.feature` scenario
QAEngineer->>Studio: Approve test addition
TestGenerator->>Studio: Emit markdown summary + test metadata
## Sample Trigger: QA Prompt

Input:
{
"prompt": "What if refund is denied twice?",
"context": {
"handler": "IssueRefundHandler",
"blueprint_id": "usecase-8041"
}
}
Agent Action:
- Generates Scenario: Repeat refund denial handling
- Emits a Markdown summary for Studio
- Appends a new test to IssueRefundHandlerTests.cs

## Studio Trigger UX

The agent is embedded in Studio under:
- "Suggest Missing Test"
- "Simulate Alternate Path"
- "Test Role or Edition Variant"
- "Cover Observability Gap"

Each action produces previewable suggestions.

## Trigger Types Recap

| Type | Triggered By | Result |
|---|---|---|
| Observability | Telemetry alert, error log | Edge-case test inferred from the log |
| Coverage | Validator agent reports a missing scenario | Add a .feature or unit test |
| QA | Prompt in Studio | Prompt-based scenario simulation |
| Bug | Resolver agent signals that reproduction is required | Add a test to capture the defect case |
| Edition | Gap in an edition-specific role or config | Emit a conditional .feature or test branch |
| Business Rule | Blueprint update | Add a test reflecting the new rule path |

## Summary

The Test Generator Agent is never passive; it is responsive, contextual, and AI-augmented:
- Activated when something is missing, ambiguous, or questioned
- Bridges the gap between human QA, production observations, and test logic
- Enables continuous, adaptive quality assurance beyond static coverage

## Process Flow (High-Level)

The Test Generator Agent executes a reactive, prompt-augmented flow designed to transform incomplete test coverage, ambiguous behaviors, or human questions into concrete, testable outputs.
Unlike deterministic generators, it operates like a QA researcher with memory and observability tools, combining design, runtime, and human input into a creative testing loop.
## High-Level Execution Diagram

flowchart TD
    Trigger["Trigger Received<br>(Prompt, Gap, Log, Edition)"] --> Analyze[Analyze Context]
    Analyze --> Plan[Plan Test Generation Path]
    Plan --> Generate[Execute Scenario Simulation & Prompt Skills]
    Generate --> Emit[Emit Test Artifacts]
    Emit --> Validate[Run Lint + Structure Validators]
    Validate --> Report[Emit Metadata + Studio Summary]
    Report --> Done[Done]
## Phase Descriptions

| Phase | Description |
|---|---|
| Trigger Received | Activated by a Studio prompt, the Test Coverage Validator, a bug trace, or telemetry |
| Analyze Context | Loads handler, DTO, existing tests, blueprint, roles, and test history |
| Plan Test Path | Determines which types of tests to generate: BDD, unit, edge, edition-specific |
| Execute Scenario Simulation | Uses prompt skills and embeddings to create test content (steps, assertions, inputs) |
| Emit Artifacts | Outputs .feature, .cs, Markdown, metadata, and logs |
| Validate Structure | Ensures format, structure, trace metadata, naming, and consistency |
| Emit Metadata | Saves test-augmentation-metadata.yaml, emits spans, and notifies Studio/QA agents |

## Execution Characteristics

| Trait | Detail |
|---|---|
| Reentrant | Can be retriggered for the same trace to add more cases |
| Traceable | Every output is tagged with trace_id and source |
| Edition-Aware | Scenarios vary based on edition_id and roles_allowed |
| Prompt-Centric | User questions or QA notes become execution contexts |
| Memory-Augmented | Reuses previous patterns, tests, and coverage snapshots for alignment |
## Example Flow (Prompt-Driven)

1. Trigger: QA in Studio asks "What if refund is attempted twice for the same transaction?"
2. Plan: the agent checks:
   - Handler: IssueRefundHandler
   - Existing test coverage
   - DTO: RefundRequest
   - Prior bug trace: #REFD-2203
3. Generate:
   - Adds a .feature scenario
   - Appends a unit test method to the handler test class
   - Adds a Markdown explanation
4. Validate:
   - Passes lint, snapshot, and coverage-inclusion checks
5. Emit:
   - Saves to repo/memory
   - Updates the Studio dashboard
   - Notifies the QA agent
## Summary

The Test Generator Agent runs a reactive, intelligent, trace-backed pipeline that enables:
- Context-driven test generation
- Scenario expansion based on real user questions and observability
- Outputs tied directly to trace, edition, and role
- QA collaboration and Markdown reporting
- Validation-integrated, repeatable, and audit-safe processes

## Process Flow (Detailed Skill-Orchestrated Flow)

The Test Generator Agent operates as a skill-orchestrated AI agent, executing a series of modular and reactive skills, each one handling a stage in the prompt-to-test translation pipeline.
This architecture supports:
- Reusability across test types
- Trace alignment
- Prompt and telemetry understanding
- Memory and embedding retrieval
- Markdown storytelling and Studio visibility

## Detailed Skill Flow
flowchart LR
A[Trigger Event] --> B[LoadTestContext]
B --> C[IdentifyTestGap]
C --> D[PlanScenarioTemplates]
D --> E[GenerateTestScenarios]
E --> F[EmitTestArtifacts]
F --> G[ValidateGeneratedTests]
G --> H[EmitObservability]
H --> I[StudioSummaryMarkdown]
## Core Skills Used

| Skill | Purpose |
|---|---|
| LoadTestContext | Loads trace ID, blueprint, handler, DTO, edition, and QA prompt |
| IdentifyTestGap | Queries test coverage memory and recent observability logs |
| PlanScenarioTemplates | Decides what kinds of tests to generate (unit, BDD, edition-specific) |
| GenerateTestScenarios | Uses OpenAI-backed planners to simulate steps, inputs, outcomes |
| EmitTestArtifacts | Writes .feature, .cs, YAML metadata, and Markdown |
| ValidateGeneratedTests | Runs naming, formatting, trace-tag, and structure linting |
| EmitObservability | Emits OpenTelemetry spans and JSONL trace logs |
| StudioSummaryMarkdown | Generates a human-readable report for QA and the Studio dashboard |
## Internal Skill Handlers and Sub-Skills

### Test Discovery + Planning
- ScanCoverageByTrace(trace_id)
- FetchMissingScenariosByHandler()
- DetectUncoveredRoles(trace_id, edition)

### Prompt-Based Scenario Generation
- GenerateFeatureScenario(prompt)
- SuggestTestMethod(prompt, handler, dto)
- InferAssertionsFromDTO(field_rules)

### Artifact Generation
- CreateFeatureFile(handler, scenarios)
- AppendTestMethod(test_class, method_code)
- EmitTestMetadata(trace_id, test_type, role, edition)
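To show how these sub-skills might be chained, here is a minimal orchestration sketch. The ITestGenSkills interface, the TestContext record, and the control flow are illustrative assumptions; only the skill names are taken from the lists above.

```csharp
// Minimal orchestration sketch over the sub-skills above. Shapes are assumptions.
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

public record TestContext(string TraceId, string Handler, string Edition, string[] RolesMissing);

public interface ITestGenSkills
{
    Task<TestContext> LoadTestContext(string traceId);
    Task<IReadOnlyList<string>> FetchMissingScenariosByHandler(string handler);
    Task<string> GenerateFeatureScenario(string prompt);
    Task EmitTestMetadata(string traceId, string testType, string role, string edition);
}

public class TestGeneratorOrchestrator
{
    private readonly ITestGenSkills _skills;
    public TestGeneratorOrchestrator(ITestGenSkills skills) => _skills = skills;

    public async Task RunAsync(string traceId, string qaPrompt)
    {
        var context = await _skills.LoadTestContext(traceId);
        var gaps = await _skills.FetchMissingScenariosByHandler(context.Handler);

        // Generate one Gherkin scenario per detected gap, plus one for the QA prompt itself.
        foreach (var gap in gaps.Append(qaPrompt))
        {
            var scenario = await _skills.GenerateFeatureScenario(gap);
            // A real implementation would pass the scenario to CreateFeatureFile / AppendTestMethod here.
            await _skills.EmitTestMetadata(traceId, "bdd", context.RolesMissing.FirstOrDefault() ?? "unknown", context.Edition);
        }
    }
}
```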
## Output Coordination

All skills write into structured paths like:

Tests/
├── PaymentsService.Specs/
│   └── Features/
│       └── refund_flow.feature
├── PaymentsService.UnitTests/
│   └── IssueRefundHandlerTests.cs
├── test-augmentation-metadata.yaml
└── qa-report.md
## Observability Skill Output

Span Example:
{
"span": "testgen.GenerateFeatureScenario",
"trace_id": "refund-2025-0142",
"handler": "IssueRefundHandler",
"edition": "lite",
"source": "qa_prompt",
"scenario_title": "Refund denied twice"
}
A corresponding validation result from ValidateGeneratedTests is recorded alongside each span.
## Retriable and Self-Healing

All skills:
- Support retry on failure or invalid output
- Are idempotent for deterministic prompts
- Auto-tag retry_count, source, and last_modified_by in metadata

## Summary

The Test Generator Agent's core intelligence is delivered through modular Semantic Kernel skills, enabling it to:
- React to signals and prompts
- Simulate human testing logic
- Produce trace-aligned, validated artifacts
- Emit reasoning and coverage metadata

This skill-based execution allows for fine-grained control, modular upgrades, and full AI-driven collaboration with QA teams and Studio.

## Core Skills

The Test Generator Agent's core skills power its ability to:
- Think like a QA engineer
- Generate complete test flows from partial prompts
- Simulate business logic without static code structure
- Convert system knowledge into BDD and executable tests
- Fill testing blind spots using pattern inference, prompt expansion, and flow simulation
## Core AI-Driven Skills

| Skill Name | Description |
|---|---|
| GenerateFeatureScenario(prompt) | Converts a prompt like "What if a refund is issued twice?" into a Gherkin-compliant scenario |
| SuggestTestMethod(prompt, handler, dto) | Creates MSTest method names, signatures, and assertions |
| ExpandPromptIntoPaths(prompt) | Breaks vague QA prompts into multiple edge cases |
| InferAssertionsFromDTO(dto_rules) | Suggests validation rules and expected outcomes |
| SimulateRoleActions(trace, edition) | Generates role-based test variants (e.g., Guest, CFO, SupportAgent) |
| PlanFailureFlowScenarios() | Proposes behavioral fallbacks or constraint-violation tests |
| MapTelemetryToTestInput(log_entry) | Turns runtime logs into structured inputs for test simulation |
| GenerateScenarioMatrix(handler, roles, inputs) | Maps permutations of scenario candidates for dynamic .feature generation |
| PredictRegressionTestImpact(bug_metadata) | Suggests new test scenarios based on historical bug pattern embeddings |
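As a concrete illustration of GenerateScenarioMatrix(handler, roles, inputs), the sketch below enumerates role-by-input permutations as scenario candidates. The ScenarioCandidate shape is an assumption; a real implementation would also filter candidates against existing coverage.

```csharp
// Sketch of scenario-matrix generation: one candidate per role x input-variant combination.
using System.Collections.Generic;

public record ScenarioCandidate(string Handler, string Role, string InputVariant);

public static class ScenarioMatrix
{
    public static IEnumerable<ScenarioCandidate> GenerateScenarioMatrix(
        string handler, IEnumerable<string> roles, IEnumerable<string> inputVariants)
    {
        // Downstream planners decide which permutations become .feature scenarios.
        foreach (var role in roles)
            foreach (var input in inputVariants)
                yield return new ScenarioCandidate(handler, role, input);
    }
}
```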
## Prompt-to-Scenario Skill Chain

Input Prompt: "What if refund is issued while invoice is locked?"

Skill Execution Chain:
1. GenerateFeatureScenario
2. SuggestTestMethod
3. InferAssertionsFromDTO (expected error: "InvoiceLockedException")
4. EmitTestArtifacts (outputs .feature, .cs, and trace metadata)
## Sample Skill: ExpandPromptIntoPaths

Input Prompt: "What if the amount is too high?"

Output:
- Scenario: Amount equals max allowed
- Scenario: Amount exceeds max allowed
- Scenario: Amount is null
- Scenario: Amount is string
- Scenario: Amount submitted twice

These paths are used by downstream skills to generate .feature and test method templates.

## DTO-Aware Assertion Inference

Given:
public class RefundRequest {
[Required]
public decimal Amount { get; set; }
[MaxLength(200)]
public string Reason { get; set; }
}
InferAssertionsFromDTO() generates:
- Amount = 0 results in IsFailure("Amount must be greater than 0")
- Reason = 300 chars results in IsFailure("Reason too long")
## Use of Embeddings

Skills like PredictRegressionTestImpact() and SuggestTestMethod() use:
- Vector similarity with past test descriptions
- Stored embeddings from bug traces, feature summaries, and .feature titles
- Memory of past DTOs and known domain actions

This improves the precision and recall of suggested test cases.

## Markdown Summary Skill

| Skill | Output |
|---|---|
| StudioSummaryMarkdown() | Human-readable test reasoning summary |
| ExplainWhyScenarioMatters() | QA-facing commentary attached to the .feature preview |

## Summary

The Test Generator Agent's core skill system allows it to:
- Understand ambiguous prompts
- Simulate diverse paths with confidence
- Write tests from language and reasoning, not code alone
- Collaborate with QA agents and Studio via explainable, testable artifacts

These skills make it an intelligent QA engineer in software form.
## Observability-Aware Test Inference

The Test Generator Agent integrates with the observability fabric of the platform, leveraging telemetry, spans, logs, and runtime event data to:
- Detect real-world behavior gaps
- Infer scenarios that have not been exercised in tests
- Proactively propose tests to prevent repeat issues
- Link tests directly to production symptoms

This makes the agent not just QA-aligned but production-informed.

## Key Observability Inputs Used

| Source | Example |
|---|---|
| OpenTelemetry Spans | High failure rate in RefundService.Handle() |
| Error Logs | Repeated NullReferenceException on CustomerId |
| HTTP Metrics | Surge in 400 BadRequest on POST /invoice |
| Trace Snapshots | Slow response time with specific input patterns |
| AppInsights / Logs | User retries on a specific flow = behavioral anti-pattern |
| Service Events | event: PaymentMismatchDetected (never tested in a .feature) |

## Skill: MapTelemetryToTestInput

This core skill takes in runtime telemetry (e.g., from logs or spans) and:
- Identifies which handler or controller was involved
- Parses any payload (request/response) structures
- Extracts failure conditions, error messages, or paths
- Reconstructs an inferred test case
## Example Input: Observability-Driven Trigger
{
"trace_id": "refund-2025-0143",
"handler": "IssueRefundHandler",
"error_message": "Amount cannot be null",
"event": "NullReferenceException",
"log_payload": {
"RefundAmount": null,
"CustomerId": "8d2..."
}
}
From this log, the agent generates:
- A test method appended to IssueRefundHandlerTests.cs
- A .feature scenario: Refund with null amount
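A simplified sketch of the MapTelemetryToTestInput step for a record like the one above might look as follows; the TelemetryEvent and InferredTestInput shapes and the null-field heuristic are assumptions.

```csharp
// Sketch of mapping a telemetry record into a structured test-input candidate.
using System.Text.Json;

public record TelemetryEvent(string TraceId, string Handler, string ErrorMessage, JsonElement LogPayload);

public record InferredTestInput(string Handler, string FailingField, string ExpectedError);

public static class TelemetryMapper
{
    public static InferredTestInput MapTelemetryToTestInput(TelemetryEvent evt)
    {
        // Find the first null field in the captured payload; that is the likely failure condition.
        string failingField = "unknown";
        foreach (var field in evt.LogPayload.EnumerateObject())
        {
            if (field.Value.ValueKind == JsonValueKind.Null)
            {
                failingField = field.Name;
                break;
            }
        }

        // The runtime error message becomes the expected assertion for the regression test.
        return new InferredTestInput(evt.Handler, failingField, evt.ErrorMessage);
    }
}
```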
## Metrics-Aware Skills

| Skill | Behavior |
|---|---|
| AnalyzeErrorFrequencySpans() | Identifies handlers with recurring issues |
| InferGapsFromUnhandledSpans() | Finds spans that lack test trace coverage |
| GenerateEdgeScenarioFromLog(log) | Creates a structured test case from runtime data |
| SuggestAssertionFromError(error_msg) | Turns logs into test expectations |
| AttachTestToTelemetrySource() | Tags the generated test with an observability correlation ID |

## Traceability and Metadata Output
generated_by: test-generator-agent
trigger: observability
trace_id: refund-2025-0143
span_id: a4e12d78e
origin: AppInsights
test_artifact: IssueRefundHandlerTests.cs
scenario: Refund with null amount
asserts: ["Amount must not be null"]
Used by Studio, QA, PRs, and observability dashboards.

## Usage in Continuous QA

| Use Case | Result |
|---|---|
| Observability alert with no matching test | Agent adds a scenario |
| Spike in retry rate | Generates scenario "Retry rejected if payment already processed" |
| Missing span-to-test mapping | Agent auto-fills the .feature gap |
| Event observed but not validated | Scenario added asserting event: InvoiceApproved in BDD |

## Integration with QA and Bug Resolver Agents

These agents can flag a missing test for observability-based issues; the Test Generator Agent then:
- Backfills the .feature file
- Suggests a new test method
- Links the observability trace to the handler and its test
## Summary

The Test Generator Agent turns live runtime observations into concrete, testable artifacts, ensuring:
- Nothing observed in production is left untested
- QA feedback loops close automatically
- Tests are tagged with trace, span, and error metadata
- Studio and CI gain insight into real-world coverage, not just design coverage

This is a core differentiator of ConnectSoft's Observability-First QA Architecture.

## Security-Aware Scenario Generation

Security-related bugs are often:
- Undetected by static test generation
- Role-dependent or permission-specific
- Configuration-based (edition, tenant, policy)
- Triggered by unauthorized access or incorrect access control

The Test Generator Agent proactively generates security-focused tests to ensure all authorization paths, privilege boundaries, and denial-of-access flows are covered and tested.

## Security Test Types the Agent Generates

| Test Type | Description |
|---|---|
| Unauthorized Role Access | Ensures roles without permission are blocked |
| Anonymous Access Scenarios | Verifies [AllowAnonymous] behavior and that unauthenticated requests return 401 |
| Role Escalation Attempt | Detects missing guards when a lower-privilege role attempts a privileged action |
| Edition-Specific Permission Cases | Varies access rules based on edition config |
| Token or Claim Manipulation | Explores behavior with corrupted, malformed, or missing claims |
| Restricted State Transition | Prevents forbidden state actions (e.g., closing an already-paid invoice) |

## Inputs for Security Scenario Generation

- roles_allowed from the blueprint or port config
- AuthorizationMap.yaml per edition
- Controller annotations like [Authorize(Roles = "FinanceManager")]
- Previous test coverage: roles tested vs. roles missing
- Studio prompts (e.g., "What happens if a Guest user tries to approve an invoice?")

## Example: Unauthorized Role Test
[TestMethod]
[TraceId("invoice-2025-0147")]
[Edition("enterprise")]
[AugmentedBy("test-generator-agent")]
public async Task Post_ShouldReturn403_WhenGuestTriesToApproveInvoice()
{
var client = factory.CreateClientWithRole("Guest");
var response = await client.PostAsJsonAsync("/api/invoice/approve", validPayload);
Assert.AreEqual(HttpStatusCode.Forbidden, response.StatusCode);
}
## Example BDD Scenario
@edition:enterprise
@trace_id:invoice-2025-0147
@source:security-inference
Feature: Invoice approval access control
Scenario: Guest user attempts approval
Given a user with role Guest
When they send an approval request
Then access is denied with status 403
## Skill-Based Flow

| Skill | Action |
|---|---|
| EnumerateRoleVariants(handler) | Detects all roles not yet tested |
| SimulateUnauthorizedAccess(role) | Proposes tests to enforce denial |
| InferSecurityPolicyFromEdition(edition) | Adjusts tests for edition-specific rules |
| GenerateAuthorizationAssertions() | Converts 403, 401, or redirect outcomes into test assertions |
| PatchMissingAuthScenarios() | Fills BDD/test files with missing security cases |
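The sketch below illustrates how EnumerateRoleVariants and SimulateUnauthorizedAccess could cooperate to propose one access-control test per untested role. The RoleGap record and the test-name convention are assumptions made for the example.

```csharp
// Sketch of pairing untested roles with access-control test proposals; shapes are illustrative.
using System.Collections.Generic;
using System.Linq;

public record RoleGap(string Handler, string Role, bool IsAllowed);

public static class SecurityScenarioPlanner
{
    public static IEnumerable<RoleGap> EnumerateRoleVariants(
        string handler, IEnumerable<string> allRoles, IEnumerable<string> rolesAllowed, IEnumerable<string> rolesTested)
    {
        // Every role not yet tested is a candidate; record whether it should be allowed or denied.
        var allowed = rolesAllowed.ToHashSet();
        return allRoles.Except(rolesTested).Select(role => new RoleGap(handler, role, allowed.Contains(role)));
    }

    public static string SimulateUnauthorizedAccess(RoleGap gap)
    {
        // Denied roles get a 403 assertion; allowed-but-untested roles get a success-path test instead.
        return gap.IsAllowed
            ? $"Post_ShouldSucceed_When{gap.Role}Calls{gap.Handler}()"
            : $"Post_ShouldReturn403_When{gap.Role}Calls{gap.Handler}()";
    }
}
```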
## Edition Example

### Blueprint

### Output

- Validates approval for the CFO
- Rejects Guest, Analyst, and SupportAgent
- Adds a test plus a .feature scenario for all undefined roles

## Traceability

All security scenarios are:
- Tagged with security_test: true
- Linked to trace_id, edition, and handler
- Indexed in test-metadata.yaml and Studio dashboards

## Summary

The Test Generator Agent proactively defends against security regressions by:
- Discovering untested access paths
- Enforcing the principle of least privilege through test scenarios
- Tagging tests for edition-, trace-, and role-specific enforcement
- Generating assertions for expected denials and edge paths

Security-aware scenario generation ensures the factory doesn't just produce working software; it produces safe software.
## Scenario Enrichment for BDD and Studio

One of the Test Generator Agent's most user-facing roles is its ability to enrich .feature files and support Studio-based exploratory QA workflows by:
- Adding role-, state-, and failure-path scenarios
- Filling in edge cases and prompts into existing .feature files
- Enhancing readability, clarity, and auditability of QA test specs
- Keeping BDD specs aligned with trace and prompt context

This bridges developer-generated tests with human-readable QA artifacts.

## Key BDD Enrichment Functions

| Enrichment Type | Purpose |
|---|---|
| Scenario Injection | Appends new Scenario: blocks to existing .feature files |
| Prompt-to-Gherkin Expansion | Converts Studio prompts into full Gherkin test narratives |
| Role Variant Enrichment | Adds missing role paths (e.g., Guest, Admin) |
| Condition Branch Scenarios | Adds Given/When/Then for alternate flows (e.g., empty input, retry) |
| Security/Access Markers | Adds @auth, @denied, @edition:lite tags |
| Studio Preview Markdown | Generates QA- and product-friendly descriptions for visual dashboards |

## BDD Example Before Enrichment
Feature: Capture payment
Scenario: Successful payment
Given a cashier submits a valid payment
Then the payment is recorded
## After Enrichment
Feature: Capture payment
Scenario: Successful payment
Given a cashier submits a valid payment
Then the payment is recorded
Scenario: Duplicate payment submission
Given a payment was already processed
When a second request is sent
Then the request is rejected with status DuplicatePayment
Scenario: Guest user attempts payment
Given a user with role Guest
When they submit a payment
Then access is denied with 403
## BDD Enrichment Skills Used

| Skill | Description |
|---|---|
| DetectEnrichableFeature(trace_id) | Loads and parses the current .feature file |
| ProposeNewScenarios(prompt, gaps) | Suggests 1-N new Gherkin scenarios |
| ApplyRoleMatrixToFeature() | Adds one scenario per uncovered role |
| InsertScenarioTags() | Adds @edition, @prompt, @security tags |
| WriteFeaturePatch() | Appends new scenarios into the correct file while preserving existing content |
| GenerateStudioMarkdownSummary() | Outputs readable descriptions for QA & PM dashboards |
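As one possible shape for WriteFeaturePatch(), the sketch below appends tagged scenario blocks to an existing .feature file without rewriting what is already there. File layout and tag placement are assumptions.

```csharp
// Sketch of appending enriched scenarios to an existing feature file; details are illustrative.
using System.Collections.Generic;
using System.IO;
using System.Text;

public static class FeaturePatcher
{
    public static void WriteFeaturePatch(string featurePath, IEnumerable<string> newScenarioBlocks, string traceId)
    {
        var builder = new StringBuilder(File.ReadAllText(featurePath));

        foreach (var block in newScenarioBlocks)
        {
            // Tag each appended scenario so Studio and metadata can trace its origin.
            builder.AppendLine();
            builder.AppendLine($"  @trace_id:{traceId} @augmented_by:test-generator-agent");
            builder.AppendLine(block.TrimEnd());
        }

        File.WriteAllText(featurePath, builder.ToString());
    }
}
```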
## Output: Studio Markdown Summary

### Test Scenario Expansion for Capture Payment

Trace: payments-2025-0321
Trigger: Studio prompt "What if payment is submitted twice?"

Appended to `capture_payment.feature`:
- Scenario: Duplicate payment submission
- Scenario: Guest user access denied

@tags: security, regression, prompt-driven

## Scenario Enrichment Metadata

All new .feature scenarios are traced with:
scenario_source: test-generator-agent
trace_id: payments-2025-0321
trigger: qa_prompt
augmented_roles: [Guest]
edition: enterprise
Stored in test-augmentation-metadata.yaml and indexed for Studio navigation.
## Studio Integration

| Section | Use |
|---|---|
| Test Preview Panel | Shows enriched scenarios with trace context |
| Missing Roles Dashboard | Lists roles not yet tested (used by ApplyRoleMatrixToFeature()) |
| Prompt History Panel | Matches generated scenarios with QA questions |
| Audit Logs | Show scenario origin, version, and rationale |

## Summary

The Test Generator Agent enriches BDD and Studio workflows by:
- Appending intelligent scenarios to .feature specs
- Translating QA prompts into structured, Gherkin-based tests
- Embedding trace, edition, and security metadata
- Supporting QA engineers, product managers, and test reviewers with Markdown insights

This creates a seamless QA experience: from test prompt, to scenario, to Studio visibility.

## Integration with QA Engineer Agent and Studio

The Test Generator Agent is designed to work as a direct augmentation partner for the QA Engineer Agent and the Studio UX.
- The QA Engineer Agent owns the QA strategy, gap tracking, test run validation, and regression feedback.
- The Test Generator Agent enhances QA workflows with intelligent, promptable, and observability-aware test generation.
- Studio is the shared interface, where both agents are visible to QA engineers, test designers, and product managers.
## Integration Points with QA Engineer Agent

| Function | Description |
|---|---|
| Scenario Suggestion Sync | Agent suggests test cases for uncovered flows; the QA Engineer Agent decides inclusion |
| Gap Auto-Filling | QA Agent flags missing paths; the Test Generator Agent emits a .feature patch |
| Prompt Collaboration | QA prompt goes to the Test Generator Agent for simulation; the QA Agent evaluates and stores the result |
| Feedback Loop | QA Agent marks each suggestion as accepted, needs correction, or not relevant |
| Studio Summary Indexing | Test Generator outputs Markdown and tags for the QA Agent's test plan reporting |
| Role Map Expansion | QA Agent tracks role coverage; Test Generator fills missing paths per handler or edition |

## Skill-Based Collaboration
sequenceDiagram
participant QAEngineerAgent
participant TestGeneratorAgent
participant Studio
Studio->>QAEngineerAgent: Detect uncovered refund flow
QAEngineerAgent->>TestGeneratorAgent: Prompt: "What if refund is rejected twice?"
TestGeneratorAgent->>QAEngineerAgent: Emit `.feature` + `.cs` test case
QAEngineerAgent->>Studio: Approve, reject, or comment
Studio->>TestGeneratorAgent: Feedback logged
## Shared Artifacts

| Artifact | Used By | Description |
|---|---|---|
| test-augmentation-metadata.yaml | Both | Stores the trace-to-scenario link and origin |
| qa-report.md | QA Agent | Combines test summaries for visibility and coverage tracking |
| Enriched .feature files | QA Agent | Updated BDD specs consumed by QA and CI |
| observability-events.jsonl | QA Agent | QA traceability and status tracking |
| StudioPromptContext.json | QA Agent + Studio | Tracks the prompt, test suggestion, and review cycle |
## Example: QA Prompt Collaboration

Prompt: "What if CFO tries to cancel a refund after it was approved?"

1. Agent proposes:
   - Scenario: CFO cannot cancel approved refund
   - Test method: Cancel_ShouldFail_WhenAlreadyApproved_ByCFO()
2. QA Agent accepts, triggering a Git commit and a Studio UI patch
3. Agent logs the acceptance in trace metadata and observability events
## QA Workflow in Studio

| Section | Test Generator Agent Role |
|---|---|
| Prompt-to-Preview Panel | Shows the suggested .feature block |
| Missing Roles View | Triggers the enrichment skill for each uncovered role and handler |
| Approved Scenarios | Tracked via qa-feedback.md |
| Rejected or Needs Fix | Sent back to the agent for retry or rephrasing |
| Trace View | Visual chain: handler, test, QA prompt, scenario |

## Collaboration Cycle Summary

| Step | Action |
|---|---|
| 1. QA Prompt | Studio or QA Agent trigger |
| 2. Agent Simulates | Scenario, test file, and trace metadata emitted |
| 3. QA Approves/Rejects | Through Studio |
| 4. Studio Annotates | Adds visual test coverage and changelogs |
| 5. Retry if Needed | Agent regenerates the adjusted test |

## Summary

The Test Generator Agent integrates deeply with the QA Engineer Agent and Studio by:
- Co-creating tests from prompts and gaps
- Suggesting intelligent .feature expansions
- Tracking traceability, edition, roles, and human review
- Powering Studio's "what if" testing UX
- Supporting iterative QA-AI collaboration with trace-safe retries

Together, they enable a human-AI hybrid testing workflow that is both scalable and explainable.
## Self-Evaluation and Test Gap Identification

To maintain test quality autonomously, the Test Generator Agent includes a self-evaluation loop that allows it to:
- Identify missing or insufficient tests
- Reason about the impact of not testing a specific path
- Detect discrepancies between real-world signals and generated scenarios
- Suggest tests based on observed gaps without relying solely on external triggers

This empowers the agent to continuously optimize test completeness even in the absence of explicit QA prompts.

## Key Self-Evaluation Responsibilities

| Responsibility | Description |
|---|---|
| Trace-Scenario Coverage Review | Compares blueprint/handler definitions vs. current test artifacts |
| Role Path Validation | Detects which role-action combinations are not tested |
| Edition-Specific Variant Check | Confirms whether tests for lite, pro, enterprise, etc. exist |
| Telemetry Trace Audit | Looks for runtime spans/events that have no test match |
| Bug Replay Coverage | Compares the test set with known fixed bugs to ensure permanent defense |
| Validator-Rule Crosscheck | Flags missing negative tests for RuleFor(...) combinations |
| Gherkin Completeness Scans | Ensures all business paths appear in .feature files with proper assertions |
## Skill Set for Test Gap Identification

| Skill | Function |
|---|---|
| ScanTraceTestCompleteness(trace_id) | Loads metadata and compares declared vs. tested |
| EnumerateUntestedRoles(handler, edition) | Generates a list of role-action gaps |
| CheckMissingFeaturePaths(handler) | Parses the .feature file and detects missing states or transitions |
| ValidateEdgeCoverage(dto) | Ensures range, null, and invalid cases are tested |
| CompareBugFixToTestSet(bug_id) | Checks whether bug conditions are replicated in current tests |
| IdentifyEditionTestGaps() | Detects a missing .feature for one or more editions |
| SuggestMissingAssertions() | Detects scenarios without Then clauses or outcome checks |
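A minimal sketch of ScanTraceTestCompleteness, comparing what a trace declares against what is already tested, could look like this; the record shapes are assumptions, and a real implementation would read them from test-augmentation-metadata.yaml and the blueprint.

```csharp
// Sketch of declared-vs-tested comparison for a single trace; shapes are illustrative.
using System.Collections.Generic;
using System.Linq;

public record TraceDeclaration(string TraceId, string Handler, string[] RolesAllowed, string[] Editions);
public record CurrentCoverage(string[] RolesTested, string[] EditionsTested, string[] ScenarioTitles);
public record GapReport(string TraceId, string[] MissingRoles, string[] MissingEditions);

public static class SelfEvaluation
{
    public static GapReport ScanTraceTestCompleteness(TraceDeclaration declared, CurrentCoverage tested)
    {
        // Any declared role or edition with no matching test becomes a gap entry.
        var missingRoles = declared.RolesAllowed.Except(tested.RolesTested).ToArray();
        var missingEditions = declared.Editions.Except(tested.EditionsTested).ToArray();
        return new GapReport(declared.TraceId, missingRoles, missingEditions);
    }
}
```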
## Gap Evaluation Example
trace_id: invoice-2025-0142
blueprint_id: usecase-9241
handler: CreateInvoiceHandler
roles_allowed: [FinanceManager, CFO]
tested_roles: [FinanceManager]
gap:
- missing_role: CFO
- no test for zero amount
- no feature for duplicate invoice error
- no edition-specific `.feature` for `enterprise`
The agent then triggers:
- A .feature addition for the CFO path
- Validator test: Amount = 0
- Unit test: Handle_ShouldReject_WhenInvoiceExists

## Self-Evaluation Feedback Formats

### Markdown Summary

Self-Evaluation Summary: CreateInvoiceHandler (Trace: invoice-2025-0142)

Missing:
- [ ] Role: CFO
- [ ] Negative validator test: ZeroAmount
- [ ] Scenario: Duplicate invoice error
- [ ] Edition-specific: `enterprise` test case

Planned augmentation steps: 3

### Metadata Log
self_check_result: failed
missing_roles:
- CFO
untested_conditions:
- ZeroAmount
- DuplicateInvoice
edition_variants_missing:
- enterprise
next_actions:
- trigger GenerateFeatureScenario(prompt="What if invoice already exists?")
- trigger GenerateValidatorTest(field=Amount, condition=Zero)
## Gap Feedback Loop

1. Agent executes ScanTraceTestCompleteness()
2. Missing elements are logged into the Studio trace dashboard
3. The agent, autonomously or upon QA approval, generates the augmented tests
4. Metadata is updated: gap_resolved = true

## Traceability for Test Gap Detection

Every scenario added via gap detection includes:
augmented_by: test-generator-agent
trigger: self-evaluation
gap_id: auto-detected
source_blueprint: usecase-9241
## Summary

Through self-evaluation, the Test Generator Agent becomes proactive, not reactive:
- Automatically identifies test holes
- Generates patches to secure QA coverage
- Feeds insights to Studio dashboards and QA checklists
- Enables closed-loop validation, continuously improving coverage without human instruction

## Human-in-the-Loop Augmentation Mode

While the Test Generator Agent excels at autonomous test generation, it is designed to work collaboratively with human QA engineers, developers, and product managers, enabling a "human-in-the-loop" mode to:
- Accept QA prompts
- Provide test previews for confirmation
- Accept manual edits and inject them back into the test set
- Support iterative refinement of .feature files, test methods, or Markdown explanations
- Learn from accept/reject patterns over time

This mode ensures that test generation remains auditable, adaptable, and alignable with human expertise.

## Key Capabilities in Human-in-the-Loop Mode

| Capability | Description |
|---|---|
| Prompt-Driven Scenario Suggestion | QA types a "What if..." in Studio and the agent generates a proposal |
| Scenario Previews | Proposed Gherkin shown in a read-only or editable preview |
| Editable Markdown Summaries | QA can revise the scenario description before acceptance |
| Comment & Correction Loop | QA rejects or edits a suggestion; the agent retries with the feedback applied |
| Feedback Learning | Rejected patterns are down-ranked in embeddings and prompt planners |
| Tagged Trace Update | Marks tests as human_verified, qa_adjusted, or manual_override |
| Studio Live Collaboration | QA can submit batch prompts or comment inline on test proposals |
## Example Flow
sequenceDiagram
participant QA
participant Studio
participant TestGen
QA->>Studio: "What if the CFO cancels a locked invoice?"
Studio->>TestGen: Prompt context submitted
TestGen->>Studio: Preview scenario + test method
QA->>Studio: Edit Gherkin + approve
Studio->>TestGen: Submit revised version
TestGen->>Repo: Finalize test and metadata, update trace
## Editable Scenario Preview Example
# Suggested by agent:
Scenario: CFO cancels a locked invoice
Given a CFO user
When they cancel an already locked invoice
Then the system rejects the request
# QA edits:
Scenario: CFO cannot cancel locked invoice
Given the invoice is in "Locked" state
And the user is CFO
When they try to cancel it
Then they receive a "ForbiddenOperation" error
The final version is submitted with a review tag such as review_status: accepted_with_edits (see the metadata example below).
## Markdown Summary Edits

QA can likewise edit the generated Markdown summary before acceptance; the agent keeps both the original and the QA-edited version.
## Feedback Loop and Retry Logic

| Input | Result |
|---|---|
| QA clicks "Reject: Not Relevant" | Agent suppresses the scenario pattern for similar prompts |
| QA selects "Try Again (Better Assertion)" | Agent re-runs with stricter validation rules or alternate outcome phrasing |
| QA marks the scenario as "Duplicate" | Agent removes it and updates metadata to avoid future duplication |
| QA types "Add edge case for negative amount too" | Triggers a child prompt expansion |

## Learning and Memory from Human Feedback

The agent stores:
- Rejected scenario titles
- Accepted assertions
- QA phrasing and tags
- Editions and roles that consistently require expansion
- QA reviewer preferences and behavior patterns (optional)

This improves generation quality over time across the platform.

## Metadata Example with Human Hooks
scenario: CFO cannot cancel locked invoice
source: test-generator-agent
prompt: QA prompt from Studio
review_status: accepted_with_edits
edited_by: qa.alex.k
feedback_applied: true
## Summary

The Test Generator Agent in human-in-the-loop mode:
- Enhances QA creativity with structured AI suggestions
- Learns from corrections to improve future outputs
- Integrates directly with Studio for real-time review
- Produces editable .feature, .cs, and Markdown assets
- Supports iterative, explainable, and human-verifiable QA workflows

This is where AI + QA collaboration becomes seamless and scalable.

## Memory, Vector Embeddings, and Similarity Prompts

To generate contextually relevant, non-redundant, and intelligently suggested tests, the Test Generator Agent relies on:
- Memory: persistent, trace-aligned records of what was tested, why, and how
- Vector embeddings: semantic similarity across prompts, scenarios, test cases, bugs, and DTOs
- Example reuse: drawing from similar service domains to enrich test quality and coverage

This enables the agent to avoid duplication, recommend consistent patterns, and expand intelligently using learned context.
## What the Agent Stores and Embeds

| Knowledge Type | Format | Purpose |
|---|---|---|
| Test Scenarios | .feature titles and steps | To cluster and suggest similar test ideas |
| QA Prompts | Vector embeddings | To answer similar future questions better |
| DTO Structures | Parsed DTOs as embeddings | To infer test conditions from similar DTO fields |
| Bug Metadata | bug_id, failure symptoms, event names | To suggest regression-preventing test logic |
| Roles-to-Handlers Maps | Role × Edition × Action | For complete role coverage suggestions |
| Handler-to-Test Mappings | From test-metadata.yaml | To detect structural test gaps |
| Blueprints & Use Cases | Embeddings of domain flow text | To auto-extend coverage across domains |
## Embedding-Based Prompt Expansion

Prompt: "What if an Analyst tries to approve a payment?"

Embedding similarity finds:
- Guest cannot approve invoice
- Unauthorized SupportAgent tries to cancel refund
- Non-admin user denies large transaction

The agent then suggests:
- Scenario: Analyst user access denied
- Method: Post_ShouldReturn403_WhenUserIsAnalyst
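Conceptually, this expansion rests on a nearest-neighbor lookup over stored scenario embeddings. The sketch below shows a plain cosine-similarity search; the storage model, embedding source, and threshold are assumptions.

```csharp
// Cosine-similarity search over stored scenario embeddings; shapes and threshold are illustrative.
using System;
using System.Collections.Generic;
using System.Linq;

public record StoredScenario(string Title, float[] Embedding);

public static class ScenarioSimilarity
{
    public static IEnumerable<StoredScenario> FindSimilarScenarios(
        float[] promptEmbedding, IEnumerable<StoredScenario> memory, double threshold = 0.8)
    {
        // Return stored scenarios whose embeddings are close enough to the prompt's embedding.
        return memory
            .Select(s => (Scenario: s, Score: Cosine(promptEmbedding, s.Embedding)))
            .Where(x => x.Score >= threshold)
            .OrderByDescending(x => x.Score)
            .Select(x => x.Scenario);
    }

    private static double Cosine(float[] a, float[] b)
    {
        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.Length; i++)
        {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        return dot / (Math.Sqrt(normA) * Math.Sqrt(normB) + 1e-9);
    }
}
```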
## Skills That Use Memory & Embeddings

| Skill | Usage |
|---|---|
| FindSimilarScenarios(trace_id, prompt) | Pulls reusable .feature structures from related traces |
| InferTestsFromRelatedDTOs(dto) | Suggests edge cases seen in similar DTOs (e.g., Amount, Currency) |
| PredictTestGapsFromPastBugs() | Uses bugs with similar symptoms to generate regression tests |
| ClusterUncoveredRoles() | Uses role embeddings to suggest test coverage plans |
| LearnFromQAEdits() | Stores accepted/rejected scenarios and avoids generating similar rejected paths |
| SuggestAssertionsBasedOnMemory() | Suggests Then: and Assert statements that match domain-specific expectations |

## Example: Memory Entry
{
"trace_id": "invoice-2025-0131",
"handler": "CreateInvoiceHandler",
"prompt_embedding": [0.1, 0.42, ..., 0.08],
"scenario": "Submit invoice with null customer ID",
"assertion": "Fails with 'CustomerId is required'",
"roles_tested": ["FinanceManager"],
"roles_missing": ["CFO", "Guest"]
}
## Reuse Across Microservices

If a test exists in CreateOrderHandlerTests.cs (e-commerce domain), the agent can suggest similar edge-case tests in CreateInvoiceHandlerTests.cs (finance domain).
This supports domain-informed reuse, which is especially useful in high-scale, factory-wide generation.

## DTO Similarity Expansion Example
public class PaymentRequest {
public decimal Amount { get; set; }
public string Currency { get; set; }
}
Memory shows:
- Amount = 0 has a known zero-boundary test
- Currency = "" has a known invalid-format test

So the agent recommends:
- Handle_ShouldFail_WhenAmountIsZero
- Validate_ShouldReject_WhenCurrencyIsEmpty

## Prompt Learning from QA Feedback

Rejected: "What if Guest submits a refund?" with feedback "Already covered by CFO test; redundant."

The agent embeds this feedback and avoids similar redundant scenarios for Guest/CFO unless the roles differ materially.

## Metadata Tracked with Embedding-Driven Suggestions
embedding_source: "prompt: unauthorized refund"
suggested_by: similarity_from_trace[invoice-2025-0123]
feedback_history: accepted_by_qa.alex.k
semantic_similarity: 0.86
## Summary

By using memory and vector embeddings, the Test Generator Agent:
- Thinks across modules, services, and past QA feedback
- Reuses relevant test logic without hardcoding
- Suggests smarter assertions, test names, and BDD flows
- Learns and improves coverage over time, autonomously

This ensures test generation is context-aware, non-repetitive, and knowledge-enriched.

## Retry, Correction, and Trace-Driven Enhancements

The Test Generator Agent is equipped with a resilient and trace-safe retry and correction mechanism that ensures:
- Broken, incomplete, or rejected tests are reprocessed
- Prompt-based or AI-generated scenarios are refined upon feedback
- Observability-driven augmentations are trace-aware and retryable
- Human corrections via Studio can trigger enhanced regeneration cycles

This enables self-healing test generation and QA feedback incorporation with auditability.
## Retry Triggers

| Trigger Type | Description |
|---|---|
| Test Lint/Validation Failure | A scenario is generated but fails structure/format rules |
| Missing Assertion Detected | A test or .feature lacks a valid outcome check |
| QA Rejection in Studio | Scenario marked as "Inaccurate", "Not needed", or "Duplicate" |
| Bug Regression Trace Replay | Agent is asked to regenerate test coverage for the same trace ID with updated bug inputs |
| Memory Suggestion Conflict | A duplicate scenario is detected against an existing .feature |

## Retry Flow
sequenceDiagram
participant Studio
participant QAEngineer
participant TestGen
participant Memory
QAEngineer->>Studio: Reject test scenario from prompt
Studio->>TestGen: RetryScenario(trace_id, feedback="unclear THEN clause")
TestGen->>Memory: Lookup prior attempt
TestGen->>TestGen: Rerun GenerateTestScenarios with revised constraints
TestGen->>Studio: Emit revised scenario for approval
## Retry Modes

| Mode | Behavior |
|---|---|
| patch-only | Adds missing methods/scenarios without regenerating the whole test class |
| regenerate-with-feedback | Reruns the prompt with the attached feedback (assertion too vague, role mismatch) |
| semantic-deduplication | Regenerates using prompt embeddings while removing previously generated ideas |
| qa-intervention-loop | The test is held until QA explicitly reviews and resubmits the final wording |
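For the regenerate-with-feedback mode specifically, a minimal retry loop might look like the sketch below, where validator or QA feedback is folded back into the prompt before the next attempt. The delegate signatures and retry cap are assumptions.

```csharp
// Sketch of a regenerate-with-feedback retry loop; delegates and cap are illustrative.
using System;
using System.Threading.Tasks;

public static class RetryWithFeedback
{
    public static async Task<string> GenerateAsync(
        string prompt,
        Func<string, Task<string>> generateScenario,
        Func<string, (bool IsValid, string Feedback)> validate,
        int maxRetries = 3)
    {
        string effectivePrompt = prompt;
        for (int attempt = 0; attempt <= maxRetries; attempt++)
        {
            var scenario = await generateScenario(effectivePrompt);
            var (isValid, feedback) = validate(scenario);
            if (isValid)
                return scenario;

            // Fold the rejection reason back into the prompt before the next attempt.
            effectivePrompt = $"{prompt}\nPrevious attempt was rejected because: {feedback}";
        }
        throw new InvalidOperationException($"Scenario generation failed after {maxRetries + 1} attempts.");
    }
}
```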
## Retry Metadata Example
trace_id: payments-2025-0471
scenario_id: refund_duplicate
retry_count: 2
retry_reason: "Missing Then clause"
last_feedback: "Scenario lacks concrete outcome"
status: resolved
regenerated_by: test-generator-agent
## Human Correction Loop (Studio-Driven)

| Action | Effect |
|---|---|
| "Needs better Then clause" | Agent reconstructs the test output with stronger assertion logic |
| "Duplicate of CFO approval test" | Agent suppresses this scenario in current and future trace ID contexts |
| "Add variation for 'InvalidCurrency' too" | Triggers child prompt expansion and batch generation |
π Bug Trace Replay Correction¶
| Input | Action |
|---|---|
| Bug #4871: βRefund allowed after approvalβ | Agent verifies coverage in .feature + handler test |
| β No test found | Agent re-executes test plan for trace ID |
| β Result | Outputs regression test for scenario: βPrevent refund after approvalβ |
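The regression test emitted for βPrevent refund after approvalβ could look roughly like the sketch below; the `RefundWorkflow` type, method names, and error message are illustrative stand-ins rather than the service's real contract:

```csharp
using System;
using Xunit;

// Compilable stand-in for the refund workflow; the real aggregate/handler comes from the service.
public class RefundWorkflow
{
    public bool Approved { get; private set; }

    public void Approve() => Approved = true;

    public void IssueRefund()
    {
        if (Approved)
            throw new InvalidOperationException("RefundNotAllowedAfterApproval");
    }
}

public class RefundRegressionTests
{
    [Fact] // Regression for Bug #4871: refund must be rejected once approval has happened.
    public void Refund_ShouldBeRejected_WhenAlreadyApproved()
    {
        var workflow = new RefundWorkflow();
        workflow.Approve();

        var ex = Assert.Throws<InvalidOperationException>(() => workflow.IssueRefund());
        Assert.Equal("RefundNotAllowedAfterApproval", ex.Message);
    }
}
```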
π Observability-Driven Retry¶
When a telemetry span or log indicates:
- 5xx errors
- Timeout loops
- Malformed DTO inputs
Agent:
- Queries memory for trace ID
- Evaluates: βIs this issue test-covered?β
- If not, retries scenario generation for that path
- Adds metadata:
trigger=retry:observability
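A condensed sketch of that check-and-retry loop is shown below; the interfaces are assumptions standing in for the agent's memory and generation skills:

```csharp
using System.Threading.Tasks;

// Hypothetical collaborators; the real agent resolves these through its skills and memory layer.
public interface ITestMemory
{
    Task<bool> HasCoverageForTraceAsync(string traceId);
}

public interface IScenarioGenerator
{
    Task GenerateForTraceAsync(string traceId, string triggerTag);
}

public class ObservabilityRetryHook
{
    private readonly ITestMemory _memory;
    private readonly IScenarioGenerator _generator;

    public ObservabilityRetryHook(ITestMemory memory, IScenarioGenerator generator)
    {
        _memory = memory;
        _generator = generator;
    }

    // Invoked when a span or log signals 5xx errors, timeout loops, or malformed DTO inputs.
    public async Task OnSuspiciousTelemetryAsync(string traceId)
    {
        // "Is this issue test-covered?" If yes, nothing to do.
        if (await _memory.HasCoverageForTraceAsync(traceId))
            return;

        // If not, retry scenario generation for that path and tag the attempt.
        await _generator.GenerateForTraceAsync(traceId, triggerTag: "retry:observability");
    }
}
```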
π Tracking Retry History in Artifacts¶
Each .feature and test method includes:
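For example, retry history can travel with the artifact itself. On the C# side this might surface as xUnit traits; this is an illustrative sketch in which the tag names simply mirror the retry metadata fields shown earlier, and the same values would appear as tags or comments on the corresponding `.feature` scenario:

```csharp
using System.Threading.Tasks;
using Xunit;

public class RefundDuplicateTests
{
    [Fact]
    [Trait("trace_id", "payments-2025-0471")]
    [Trait("scenario_id", "refund_duplicate")]
    [Trait("retry_count", "2")]
    [Trait("retry_reason", "Missing Then clause")]
    public Task Refund_ShouldFail_WhenIssuedTwice()
    {
        // Body kept trivial: the point is that retry history rides along as queryable metadata.
        return Task.CompletedTask;
    }
}
```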
β Summary¶
The Test Generator Agent uses trace-aware, intelligent retry strategies to:
- π Improve test precision via QA feedback
- π§ Refine generative logic using memory and embeddings
- π Ensure no prompt or runtime gap goes untested
- π Maintain audit-safe artifacts with retry tags and feedback context
This ensures that test generation is never final β only validated through continuous collaboration and reasoning.
π― Studio Hooks and Markdown Test Storytelling¶
The Test Generator Agent is deeply integrated with the Studio UI, serving both:
- π§ As a behind-the-scenes test generator, and
- π As a human-facing storyteller that explains what it generated, why, and how.
It uses Studio hooks and markdown-based summaries to make test artifacts:
- Understandable to QA engineers
- Reviewable by product stakeholders
- Traceable by developers and test leads
π¦ Outputs Connected to Studio¶
| Output | Purpose |
|---|---|
| `.feature` scenarios | Visualized in Studio test coverage dashboards |
| Markdown scenario summaries | Displayed in the βSuggested Testsβ or βScenario Detailsβ panels |
| Feedback metadata | Drives accept/reject flows and traceability |
| Retry context | Enables inline regeneration with human guidance |
| Coverage links | βTest created by agentβ β click to view scenario and trace |
| Prompt logs | Shows QA prompts and associated generated scenarios |
π§© Markdown Test Storytelling Format¶
Agent emits a QA-readable summary for each test it generates, including:
- β Scenario title
- π§ Source reasoning
- π Retry or enrichment history
- π Role and edition tested
- π Assertion summary
- π Trace metadata
π Example Output: Markdown Story¶
### π Scenario: Refund is issued twice
π Trace: refund-2025-0143
π§ Source: QA prompt β βWhat if refund is attempted more than once?β
β Test generated:
- Type: BDD + Unit
- Edition: `lite`
- Roles tested: `SupportAgent`
π Scenario Summary:
Given a support agent has already issued a refund
When they try to issue it again
Then the system rejects it with error "DuplicateRefund"
π Feedback:
- Attempt #1: Missing THEN clause β resolved
- QA Comment: "Consider rephrasing expected outcome"
π Status: β Approved and committed by QA
π Studio UX Integration Points¶
| Studio Component | Agent Behavior |
|---|---|
| Prompt Console | Receives QA test idea β generates suggestions |
| Scenario Preview Panel | Displays formatted .feature with QA feedback tools |
| Trace View | Connects .feature and .cs output to originating handler |
| Test Gaps Dashboard | Highlights untested paths β agent fills them in |
| Retry Request Button | Re-runs scenario generation for a single trace, role, or edition |
| Markdown Viewer | Shows scenario story in human-friendly form |
π§ Enhanced QA Review Flow¶
- QA types: βWhat happens if CFO cancels after approval?β
- Agent generates:
    - `.feature` scenario
    - Markdown story with reasoning
    - Suggested test name: `Cancel_ShouldFail_WhenAlreadyApproved_ByCFO()` (see the sketch after this list)
- QA sees:
    - Scenario
    - Markdown explanation
    - βAccept / Edit / Retryβ controls
- Approval β Commits to repo and marks trace as βQA-verifiedβ
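For illustration, the suggested test from this flow might land in the repo shaped like the sketch below; `ApprovalWorkflow` and its members are hypothetical stand-ins for the real handler:

```csharp
using System;
using Xunit;

// Compilable stand-in for the approval workflow touched by the QA prompt above.
public class ApprovalWorkflow
{
    public string? ApprovedBy { get; private set; }

    public void Approve(string role) => ApprovedBy = role;

    public void Cancel(string role)
    {
        if (ApprovedBy is not null)
            throw new InvalidOperationException("CannotCancelAfterApproval");
    }
}

public class ApprovalCancellationTests
{
    [Fact]
    public void Cancel_ShouldFail_WhenAlreadyApproved_ByCFO()
    {
        var workflow = new ApprovalWorkflow();
        workflow.Approve("CFO");

        Assert.Throws<InvalidOperationException>(() => workflow.Cancel("CFO"));
    }
}
```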
π Trace-Aware Tags in Markdown¶
All story outputs include:
trace_id: refund-2025-0143
scenario_id: refund_duplicate
prompt_source: studio.prompt.qa.alex.k
edition: lite
role: SupportAgent
retry_count: 1
source_skill: GenerateFeatureScenario
β These enable filtering, search, and trace-to-test mapping inside Studio.
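For instance, a Studio view could filter stories on those tags directly; the record and query below are a minimal sketch, not Studio's actual data model:

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical flattened view of the tag block attached to each generated story.
public record ScenarioStory(
    string TraceId, string ScenarioId, string Edition, string Role, int RetryCount);

public static class StudioFilters
{
    // "Show everything the agent generated for this trace in the lite edition."
    public static IEnumerable<ScenarioStory> ForTraceAndEdition(
        IEnumerable<ScenarioStory> stories, string traceId, string edition) =>
        stories.Where(s => s.TraceId == traceId && s.Edition == edition);
}
```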
β Summary¶
The Test Generator Agent enhances the Studio UX by:
- π Emitting clear, role/edition-aware scenario summaries
- π§ Explaining its AI reasoning and augmentation history
- π Supporting in-place review, edit, and regeneration
- π§ͺ Closing the gap between prompt β code β QA validation
- π Maintaining trace-safe, auditable test metadata in human-readable form
This turns Studio into an AI-augmented test design surface β not just a dashboard.
π Metrics, Coverage Impact, and Validation Reports¶
As a key QA automation agent, the Test Generator Agent must not only generate test assets β it must also:
- π Measure what it covers
- π Track its impact on system-wide coverage
- π§Ύ Provide clear, reportable outputs for QA, CI/CD, and Studio dashboards
Its metrics and validation system makes test augmentation quantifiable, auditable, and optimizable over time.
π¦ Core Metrics Emitted¶
| Metric | Description | Format |
|---|---|---|
| `testgen.scenario.count` | Total scenarios generated per trace/session | Integer |
| `testgen.methods.appended` | Number of unit/integration test methods added | Integer |
| `testgen.coverage.delta` | % increase in test coverage after augmentation | Float (0β100) |
| `testgen.role.variants.tested` | New role-action paths added | List of roleΓaction |
| `testgen.retry.count` | Total retries performed per trace ID | Integer |
| `testgen.qa.acceptance_rate` | Ratio of QA-accepted to proposed scenarios | Percentage |
| `testgen.enrichment.tags` | Scenario tags emitted (e.g., @security, @edge) | Count per tag |
| `testgen.prompt.success_rate` | % of successful test generations per prompt | Percentage |
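How these metrics are emitted is up to the hosting runtime; a minimal sketch using .NET's System.Diagnostics.Metrics API (the meter name and tag choice are assumptions) could look like this:

```csharp
using System.Collections.Generic;
using System.Diagnostics.Metrics;

public static class TestGenMetrics
{
    // Meter name is an assumption; the instrument names mirror the table above.
    private static readonly Meter Meter = new("connectsoft.testgen");

    private static readonly Counter<int> ScenarioCount =
        Meter.CreateCounter<int>("testgen.scenario.count");
    private static readonly Counter<int> RetryCount =
        Meter.CreateCounter<int>("testgen.retry.count");
    private static readonly Histogram<double> CoverageDelta =
        Meter.CreateHistogram<double>("testgen.coverage.delta");

    public static void RecordSession(string traceId, int scenarios, int retries, double coverageDeltaPct)
    {
        var traceTag = new KeyValuePair<string, object?>("trace_id", traceId);

        ScenarioCount.Add(scenarios, traceTag);
        RetryCount.Add(retries, traceTag);
        CoverageDelta.Record(coverageDeltaPct, traceTag);
    }
}
```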
π Coverage Impact Calculation¶
The agent hooks into the Test Coverage Validator Agent, comparing:
- Pre-augmentation state
- Post-augmentation state
Using metrics such as:
- Handlers with β₯1 unit test
- DTO fields with negative test cases
- Roles Γ Edition covered
- Gherkin `.feature` step completeness
- Scenarios with real assertions (vs. placeholders)
β It then emits a delta report, like:
trace_id: invoice-2025-0147
coverage_before:
unit_tests: 3
feature_scenarios: 1
roles_tested: [FinanceManager]
coverage_after:
unit_tests: 5
feature_scenarios: 3
roles_tested: [FinanceManager, Guest, CFO]
delta:
feature_scenarios: +2
unit_tests: +2
roles_tested: +2
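The delta report above is essentially a field-by-field comparison of the two snapshots; a minimal sketch with hypothetical record shapes:

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical snapshot shape mirroring the coverage_before / coverage_after blocks above.
public record CoverageSnapshot(int UnitTests, int FeatureScenarios, IReadOnlyList<string> RolesTested);

public record CoverageDeltaReport(int UnitTests, int FeatureScenarios, int RolesTested);

public static class CoverageDeltaCalculator
{
    public static CoverageDeltaReport Compute(CoverageSnapshot before, CoverageSnapshot after) =>
        new(
            UnitTests: after.UnitTests - before.UnitTests,
            FeatureScenarios: after.FeatureScenarios - before.FeatureScenarios,
            // Only count roles that are newly covered after augmentation.
            RolesTested: after.RolesTested.Except(before.RolesTested).Count());
}
```

For the `invoice-2025-0147` example above, this yields +2 unit tests, +2 feature scenarios, and +2 newly tested roles (Guest, CFO).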
π Validation Report Example¶
{
"trace_id": "refund-2025-0143",
"status": "success",
"scenarios_added": 2,
"unit_tests_appended": 1,
"qa_feedback": {
"accepted": 2,
"rejected": 0,
"requires_followup": false
},
"assertions_present": true,
"tags": ["edge", "security", "edition:lite"]
}
β This metadata is pushed to:
- π Studio dashboards
- π QA Engineer Agent reports
- π€ PR annotations (via Pull Request Creator Agent)
π§ͺ Metrics for Studio Dashboards¶
| Dashboard | Tracked Metrics |
|---|---|
| Test Coverage Heatmap | Trace ID Γ Scenario count, role coverage, edition paths |
| Prompt Coverage Map | % of QA prompts that produced accepted scenarios |
| Security Validation Grid | Role escalation/denial paths added via Test Generator |
| Regression Readiness | Bug-to-test trace validation reports |
| Edition Completeness Matrix | lite, pro, enterprise scenario count per use case |
π Markdown Summary Metrics (QA-Friendly)¶
### π§Ύ Test Augmentation Summary β CreateInvoiceHandler
π Trace: invoice-2025-0142
π Prompt: βWhat if invoice already exists?β
β Tests Added:
- Scenarios: 2
- Unit Tests: 1
- Roles Added: [CFO, Guest]
π Tags:
- @security
- @edge
- @edition:enterprise
π§ QA Feedback:
- β Accepted: 2
- β Rejected: 0
- π‘ Retry: 0
π Coverage Delta: +22%
β Summary¶
The Test Generator Agent produces clear, actionable QA metrics including:
- π Scenario count, retry count, role path coverage
- π Coverage delta with before/after snapshots
- π§Ύ QA acceptance and reasoning logs
- π Retry efficiency and feedback loop summaries
- π Studio and markdown integration for visibility and planning
This makes the agent not just a test creator, but a test strategist with measurable value.
β Final Summary¶
The Test Generator Agent is the AI-first, prompt-aware, and observability-augmented testing assistant within the ConnectSoft QA Engineering Cluster. It exists to:
Proactively augment test coverage using human prompts, trace analysis, telemetry, and domain understanding β filling the gaps left by static test generation.
It complements the Test Case Generator Agent by operating where intelligence, behavior, and reasoning are required to suggest:
- βοΈ Exploratory and edge-case tests
- π Rich BDD scenarios
- π Test expansions based on feedback, bug reports, and edition paths
- π Structured, traceable QA metadata
- π€ Collaboration cycles with QA and Studio
π§© Feature Recap¶
| Area | Capabilities |
|---|---|
| Prompt-to-Test | QA enters βwhat ifβ β agent emits .feature, test methods, markdown |
| Gap Closure | Detects missing roles, editions, assertion cases |
| Security Testing | Suggests 401/403, escalation, abuse prevention tests |
| Edition Awareness | Handles lite, pro, enterprise test variants |
| Self-Evaluation | Auto-detects test gaps via metadata, DTO rules, or bug history |
| Studio Integration | Preview + accept/reject cycle, prompt tracing, markdown summaries |
| Memory & Embeddings | Learns from previous prompts, test structures, DTO patterns |
| Retry & Corrections | Human feedback loop, retry metadata, audit tags |
| Observability-Aware | Generates tests based on telemetry, logs, and event traces |
| Test Impact Reports | Delta analysis for QA coverage, trace enrichment, edition completeness |
π Test Generator vs. Test Case Generator β Final Comparison¶
| Feature | Test Case Generator | Test Generator Agent |
|---|---|---|
| π― Trigger | Static artifact (handler, controller, blueprint) | Prompt, test gap, QA input, telemetry, bug |
| π§± Input | Code structure, DTO, blueprint | Observability, prompts, bugs, gaps, QA reviews |
| π€ Output | `.cs`, `.feature`, `test-metadata.yaml` | `.feature`, augmented tests, markdown summaries |
| π Retry | On failure or missing test | On QA rejection, prompt retry, gap detection |
| π€ Human-In-Loop | Rare | Core interaction pattern (Studio QA loop) |
| π Coverage Role | Baseline test scaffolding | Strategic augmentation, behavioral completeness |
| π Security & Role Paths | Limited | Robust role variant & access denial coverage |
| π§ Skills | Deterministic, rule-based generation | AI prompt planners, OpenAI-driven scenario simulation |
| π BDD Integration | Generated from handler/ports | Enriched from human prompts and coverage analysis |
| π§ Observability | Not integrated | Uses spans, logs, runtime feedback |
| π Trace Tagging | Static alignment | Dynamic + revision-aware + feedback-tagged |
| π§ͺ Use Case | βGenerate initial test setβ | βClose gaps, explore untested paths, simulate user behaviorβ |
π Closing Notes¶
The Test Generator Agent is:
- π§ An intelligent QA collaborator
- π A scenario inventor and test storyteller
- π A self-correcting system enhancer
- π A measurable contributor to QA success
It helps ConnectSoft achieve a full-stack, AI-augmented, and observability-driven software testing platform β at scale.