# Test Generator Agent Specification

## Purpose

The Test Generator Agent is an AI-first, behavior-aware, exploratory test generation agent that detects gaps, anomalies, or incomplete paths in the system and generates hypothetical, behavioral, or runtime-inspired test cases, even when code or handlers do not directly define them.
Unlike the Test Case Generator Agent, which strictly scaffolds test files from static inputs (handlers, DTOs, validators), this agent acts as an intelligent tester, simulating what a skilled QA engineer or SDET would do when:
- Reviewing event logs or traces
- Exploring undocumented edge cases
- Validating role-based behavior
- Challenging the system's assumptions
- Predicting future failure modes
## What This Agent Focuses On

| Focus Area | Description |
|---|---|
| Gap Discovery | Finds missing test cases not covered by scaffolds |
| Behavior Inference | Infers test scenarios from logs, telemetry, and event streams |
| AI-Prompted Exploration | Uses OpenAI prompts to ask "What could go wrong?" |
| Runtime Context Awareness | Uses production logs, synthetic data, or simulations |
| Augmented .feature Generation | Extends test libraries with scenario-driven Gherkin cases |
| Security & Role Testing | Generates privilege escalation and unauthorized access tests |
| Studio Scenario Builder | Suggests missing cases or test chains in the Studio interface |
| Negative Path Hypotheses | "What if X fails?" prompts that emit tests for simulation and validation |
## Test Generator Agent in Action

### Example Blueprint

- Use Case: CapturePayment
- Role: Cashier
- Coverage: 92%, but missing fraud validation

### Trigger

- Studio detects a lack of fraud simulation tests
- QA engineer submits an exploratory prompt: "What happens if currency is changed mid-transaction?"

### Output

- Adds Scenario: Unexpected currency switch to capture_payment.feature
- Suggests assertion: Should reject mixed currency payments
- Links the test with trace payments-2025-0429 and gap ID fraud-path-003
## How It Extends the Factory

| Impact Area | Value |
|---|---|
| Post-hoc validation | Adds depth beyond code structure |
| Promptable reasoning | Explores untested flows via AI |
| Security coverage | Challenges role definitions and abuse paths |
| Studio augmentation | Powers "what's missing?" UX for QA engineers |
| Regression reinforcement | Adds tests based on recent bugs or telemetry warnings |
| Multi-edition awareness | Extends coverage by injecting region-, locale-, or tenant-specific edge tests |
## Summary

The Test Generator Agent is:
- An adaptive testing thinker, not just a test scaffolder
- A prompt-based scenario designer capable of inferring unknowns
- A complement to the Test Case Generator Agent; together they form a closed loop of test coverage

It is ConnectSoft's QA assistant for AI-powered exploratory testing, extending trust, security, and resilience across every service.

## Strategic Position in the QA Cluster and Platform Flow

The Test Generator Agent resides in the QA Engineering Cluster, working as a behavioral test expansion engine alongside:
- Test Case Generator Agent
- QA Engineer Agent
- Test Coverage Validator Agent
- Bug Investigator Agent
- Bug Resolver Agent

It complements the Test Case Generator Agent by offering runtime-aware, user-behavior-simulated, and exploratory test generation.
## Full Factory Flow Positioning

flowchart TD
    A[Blueprint Finalized] --> B[Handlers/Controllers Scaffolded]
    B --> C[TestCaseGeneratorAgent]
    C --> D[TestArtifactsGenerated]
    D --> E[TestCoverageValidatorAgent]
    E -->|Gaps Found| F[TestGeneratorAgent]
    F --> G["Augmented Tests (.feature, hypothesis)"]
    G --> H[QAEngineerAgent]
    H --> I[Studio Coverage Preview]
    subgraph QA Cluster
        C
        E
        F
        H
    end

Trigger: Detected gap, Studio prompt, telemetry event, or QA action
Output: Augmented test cases, exploratory .feature files, Markdown hypothesis reports
## Strategic Collaborators

| Agent | Relationship |
|---|---|
| QA Engineer Agent | Accepts test augmentation requests via Studio prompts |
| Test Coverage Validator Agent | Triggers this agent on low coverage or a missing role/scenario |
| Bug Investigator Agent | Submits failed event traces or logs for hypothesis testing |
| Studio | Embeds the agent as a "Suggest test" helper and "What-if analyzer" |
| Test Memory Service | Provides embeddings, historical test knowledge, or case similarity retrievals |
## Factory Roles

| Phase | Test Generator Agent's Role |
|---|---|
| Generation | Not involved |
| Validation | Triggered after initial test generation |
| Augmentation | Adds BDD scenarios and unexpected-flow coverage |
| Documentation | Outputs hypothesis-driven summaries |
| Bug Handling | Suggests reproduction or defense tests |
| Studio UX | Embedded as a QA-side assistant for test brainstorming |
## Example Activation Scenarios

| Trigger | Result |
|---|---|
| "Guest user caused null ref in production" | Agent generates an auth bypass test |
| QA asks "What if amount is changed while approved?" | Agent emits Scenario: Changing amount after approval |
| Edition "lite" lacks a currency test | Agent adds Scenario: Lite edition currency mismatch |
| Telemetry shows high 400 errors for invalid format | Agent adds edge-case tests for known malformed inputs |

## Platform Cluster Inclusion
cluster: qa-engineering
agent: test-generator-agent
position:
- post-test-validation
- pre-pr-check
- pre-release-scenario-expansion
- on-demand via Studio
activation_modes:
- trace-aware
- event-based
- prompt-triggered
- gap-fill
## Summary

The Test Generator Agent holds a flexible, reactive position in the QA flow:
- Triggered after standard test generation
- Supports human-in-the-loop or automatic augmentation
- Integrates across Studio, QA, bug handling, and telemetry analysis
- Strengthens ConnectSoft's commitment to observability-first and validation-enforced automation

## Responsibilities

The Test Generator Agent is responsible for augmenting, extending, or inventing tests in response to:
- Coverage gaps
- Hypothetical failure modes
- Observability triggers
- QA prompts or Studio workflows
- Bug patterns or edge behaviors

It does not replace the Test Case Generator Agent; it complements it with AI-powered reasoning and behavioral expansion.
## Key Responsibilities Breakdown

| Responsibility | Description |
|---|---|
| 1. Scenario-Based Test Synthesis | Generate .feature files or assertions from human prompts, blueprint intent, or gaps |
| 2. Telemetry-Informed Test Creation | Use runtime metrics or logs to generate edge-case tests |
| 3. Security Flow Simulation | Generate tests for role misuse, unauthorized access, and injection paths |
| 4. Hypothesis-Based Coverage | Propose test cases based on inferred missing logic (e.g., "what if input is corrupted?") |
| 5. Prompt-Based QA Test Expansion | Turn freeform QA questions ("What happens if...") into test logic |
| 6. Role-Conditional Test Variants | Add missing test scenarios per role-action pairing across editions |
| 7. Studio Integration for Suggestions | Populate "missing scenario" panels in Studio or Markdown summaries |
| 8. Negative Path Discovery | Proactively generate failure and edge-case sequences |
| 9. Trace Replay or Test Recovery | Rebuild test cases based on logs, bug reports, or failed executions |
| 10. Coverage Rebalancing | Fill gaps where no unit, validator, or BDD tests were previously generated |
## What It Does Not Do

| Out of Scope | Reason |
|---|---|
| Standard handler/unit test scaffolding | Covered by the Test Case Generator Agent |
| Basic DTO validator test creation | Already handled from RuleFor(...) patterns |
| Static test folder scaffolding | Not its role; it works on runtime signals, prompts, and feedback |
| Snapshotting full service flows | Done by the QA Agent in the workflow validation stage |
## Responsibility Examples

### Example 1 - Observability-Based Trigger

Telemetry shows 500 errors during the partial refund flow.
- Agent simulates the conditions
- Emits test: Scenario: Partial refund over max limit
- Suggests a BDD step definition plus validator injection

### Example 2 - Prompt-Based Expansion

QA enters in Studio: "What happens if payment is submitted twice?"
- Agent proposes tests: Handle_ShouldReject_WhenPaymentIsDuplicate() and Scenario: Duplicate payment request is rejected
- Suggests extending the integration test and the .feature file

### Example 3 - Role-Missing Test

Coverage shows the CFO role is not tested in the enterprise edition.
- Agent adds:

Scenario: CFO approves high-value invoice
  Given a CFO user
  When they approve an invoice of 10,000
  Then the approval succeeds
## Artifact-Level Responsibilities

| Artifact | Action |
|---|---|
| *.feature | Create, enrich, or patch new BDD scenarios |
| *.cs test files | Append hypothesis-based test methods |
| test-metadata.yaml | Emit augmented_by: test-generator-agent |
| observability-events.jsonl | Add logs tagged with source=telemetry, trigger=prompt |
| qa-feedback.md | Summarize new suggestions in Studio format |
## Summary

The Test Generator Agent's responsibilities are centered around:
- Intelligence (inferred scenarios, gaps, behaviors)
- Flexibility (prompt- or telemetry-based activation)
- Completeness (role paths, failure modes, missed conditions)
- Usability (artifacts feed directly into CI/CD, Studio, and PRs)

It serves as the creative and critical-thinking arm of ConnectSoft's QA cluster, inventing the tests that others miss.

## Inputs

Unlike the Test Case Generator Agent (which consumes static artifacts like handlers and DTOs), the Test Generator Agent takes in a rich, dynamic, multi-modal input set that allows it to:
- Simulate real-world test gaps
- Respond to behavior deviations or failures
- Enrich test coverage based on runtime observations and prompts
## Primary Input Categories

| Input Type | Description | Example |
|---|---|---|
| Blueprint & Trace Metadata | Contextual reference for feature, roles, edition | trace_id: payment-2025-0147, blueprint_id: usecase-9342 |
| Execution Gaps from Test Coverage Validator Agent | Identified missing tests by type, role, or scenario | "No role test found for CFO in CancelInvoiceHandler" |
| Telemetry, Logs, and Spans | Runtime events, logs, traces, or error spikes | 500 errors during refund, event: UnexpectedCurrencyFormat |
| QA Prompts from Studio | Free-form or structured queries initiated by QA | "What if payment is cancelled after capture?" |
| Edition-Specific Overrides | Per-edition roles, behaviors, toggles, or constraints | enterprise emits InvoiceAuditLogged, lite skips tax validation |
| Failed Test Executions or Bug Reports | History of test failures or uncovered bugs | "Duplicate invoice bug, no scenario covers it" |
| Memory / Similar Test Lookups | Historical tests for similar handlers or scenarios | From MicroserviceMemoryIndex or vector embeddings |
| DTO / Domain Object Snapshots | Structural references for test input generation | CreateInvoiceInput, RefundRequest |
| Existing .feature Files (for extension) | Used to enrich or patch scenarios | capture_payment.feature |
| Authorization Role Matrix | Map of role-to-action coverage by edition | Used to trigger 403, escalation, or bypass tests |

## Sample Enriched Input
trace_id: invoice-2025-0143
blueprint_id: usecase-9241
existing_coverage:
unit_tests: 3
bdd_scenarios: 2
roles_tested: [FinanceManager]
roles_missing: [CFO, Guest]
telemetry:
recent_errors:
- event: "NullReference in RefundProcessor"
- code: 500
- payload: "RefundAmount: null"
qa_prompt: "What if refund is issued twice?"
dto_structure:
RefundRequest:
- RefundAmount: decimal
- Reason: string
Result: the agent generates Scenario: Duplicate refund prevention, a Handle_ShouldThrow_WhenRefundIsDuplicate unit test, and a .feature augmentation with the CFO role.

## Human-Centered Inputs

| Source | Format |
|---|---|
| QA Studio Prompt | "What if currency is changed post approval?" |
| Bug Resolver Comment | Bug #4812: missing test for customer type = corporate |
| Manual Edition Override | "Add auth scenario for Guest user in lite edition" |

These are structured by the agent's skill planner into actionable test-generation sequences.
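As an illustration of how such human-centered inputs could be normalized, the sketch below shows a hypothetical prompt planner turning a Studio prompt into an ordered list of skill invocations. The StudioPrompt and TestGenerationStep types, and the exact step sequence, are assumptions made for illustration; only the skill names come from this specification.

```csharp
// Hypothetical sketch of normalizing a Studio prompt into a test-generation plan.
// Type names and the step ordering are illustrative only.
using System.Collections.Generic;

public record StudioPrompt(string Text, string Handler, string Edition, string[] Roles);

public record TestGenerationStep(string Skill, IDictionary<string, string> Arguments);

public static class PromptPlanner
{
    public static IReadOnlyList<TestGenerationStep> Plan(StudioPrompt prompt)
    {
        return new List<TestGenerationStep>
        {
            // Always start by loading trace, blueprint, and existing coverage.
            new("LoadTestContext", new Dictionary<string, string> { ["handler"] = prompt.Handler }),
            // Turn the free-form question into one or more concrete edge cases.
            new("ExpandPromptIntoPaths", new Dictionary<string, string> { ["prompt"] = prompt.Text }),
            // Emit Gherkin and unit-test artifacts for each planned path.
            new("GenerateTestScenarios", new Dictionary<string, string> { ["edition"] = prompt.Edition }),
            new("EmitTestArtifacts", new Dictionary<string, string>())
        };
    }
}
```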
## Semantic Inputs from Logs
{
"event": "InvoiceApprovalFailed",
"role": "Analyst",
"message": "Unauthorized access attempt",
"trace_id": "invoice-2025-0311"
}
The agent infers:
- A missing test case for an unauthorized Analyst attempting approval
- A proposed .feature scenario plus an integration test

## Prompt Template Input
{
"prompt": "What happens if refund amount is negative?",
"context": {
"handler": "IssueRefundHandler",
"validator": "RefundRequestValidator",
"roles": ["SupportAgent"],
"blueprint_id": "usecase-8014"
}
}
Skill invoked: ProposeEdgeCaseFromPrompt
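A minimal sketch of what a ProposeEdgeCaseFromPrompt skill could look like follows. The skill name and the prompt context fields come from the example above; the IScenarioModel abstraction, the method signature, and the EdgeCaseProposal shape are assumptions, not the platform's actual API.

```csharp
// Illustrative sketch only; the real skill interface may differ.
using System.Threading.Tasks;

public record EdgeCaseProposal(string ScenarioTitle, string SuggestedTestMethod, string ExpectedOutcome);

public interface IScenarioModel
{
    // Wraps whatever LLM completion service the agent uses (e.g., an OpenAI-backed planner).
    Task<string> CompleteAsync(string prompt);
}

public class ProposeEdgeCaseFromPromptSkill
{
    private readonly IScenarioModel _model;
    public ProposeEdgeCaseFromPromptSkill(IScenarioModel model) => _model = model;

    public async Task<EdgeCaseProposal> ExecuteAsync(string qaPrompt, string handler, string validator)
    {
        // Ground the model in the handler/validator context before asking for an edge case.
        var instruction =
            $"Handler: {handler}\nValidator: {validator}\nQA question: {qaPrompt}\n" +
            "Propose one edge-case scenario title, a test method name, and the expected outcome.";
        var raw = await _model.CompleteAsync(instruction);

        // A real implementation would parse structured output; here the raw text stands in for the title.
        return new EdgeCaseProposal(raw, $"Handle_ShouldFail_When{handler}EdgeCase", "Request rejected");
    }
}
```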
## Summary

The Test Generator Agent consumes:
- Blueprint context
- Test gap metadata
- Runtime telemetry
- Human prompts
- Memory and prior test embeddings

This multi-source fusion of design-time, runtime, and interactive inputs makes the agent capable of intelligent, behavioral test expansion unmatched by static test generation tools.

## Outputs

The Test Generator Agent produces intelligent, augmented, and behavior-focused test artifacts that extend the standard test suite. These outputs are:
- Expressive and traceable
- Hypothetical and exploratory
- Aligned to observed gaps or QA prompts
- Structured for CI/CD, QA agents, and Studio consumption

## Primary Output Artifacts

| Output Type | Description | Format/Example |
|---|---|---|
| Augmented .feature Scenarios | New or patched BDD scenarios derived from prompts, logs, or test gaps | refund_flow.feature |
| Hypothesis Test Cases | Test files or methods appended to existing unit/integration tests | Handle_ShouldReject_WhenRefundExceedsLimit() |
| Studio Markdown Summary | Human-readable insights for QA, trace viewers, and prompt responses | qa-augmented-tests.md |
| Test Augmentation Metadata | JSON or YAML mapping of trace to gap to test | test-augmentation-metadata.yaml |
| Observability Event Logs | Trace-linked reasoning, AI inferences, scenario triggers | testgen-observability.jsonl |
| Retrospective Gap Patch Commits | Optional Git-based diff patches for test extension | patch-test-trace-0427.diff |
| Prompt-to-Test Response Bundles | Structured records of the QA-query-to-generated-test flow | Used by the Studio feedback panel |
## Output Example: Augmented .feature File
@trace_id: refund-2025-0182
@augmented_by: test-generator-agent
@source: prompt
Feature: Refund flow edge cases
Scenario: Refund is issued twice
Given a support agent issued a refund
When they try to issue it again
Then the system should reject it with status "DuplicateRefund"
Scenario: Negative refund amount
Given a refund request with amount = -100
Then the request is rejected
## Output Example: Hypothetical Test Method

Appended to IssueRefundHandlerTests.cs:
[TestMethod]
[TraceId("refund-2025-0182")]
[AugmentedBy("test-generator-agent")]
public async Task Handle_ShouldThrow_WhenRefundAmountIsNegative()
{
var input = new RefundRequest { RefundAmount = -100 };
var result = await handler.Handle(input);
Assert.IsFalse(result.IsSuccess);
Assert.AreEqual("Refund amount must be positive", result.Error.Message);
}
## QA Summary Output (Markdown)

### QA Scenario Augmentation - Trace: refund-2025-0182

- Added 2 BDD scenarios to `refund_flow.feature`
- Appended 1 unit test to `IssueRefundHandlerTests.cs`
- Scenario: Duplicate refund attempt - status: handled
- Scenario: Negative refund - validator failed as expected

Reasoning: Triggered by QA prompt "What if refund is issued twice?"

## test-augmentation-metadata.yaml
trace_id: refund-2025-0182
augmented_by: test-generator-agent
source: studio-prompt
test_type: bdd + unit
new_scenarios:
- Duplicate refund attempt
- Negative refund
linked_artifacts:
- refund_flow.feature
- IssueRefundHandlerTests.cs
roles_covered: [SupportAgent]
## Observability Events

Emitted to testgen-observability.jsonl:
{
"event": "TestAugmented",
"trace_id": "refund-2025-0182",
"source": "qa_prompt",
"scenario": "Refund is issued twice",
"roles_covered": ["SupportAgent"],
"edition": "lite"
}
## Git Patch (Optional)

When enabled, the agent also emits a Git patch (e.g., patch-test-trace-0427.diff) containing the added scenarios and test methods, so review pipelines can inspect or apply the change as a diff.
## Summary

The Test Generator Agent emits:
- Test files and .feature enhancements
- Reasoned Markdown outputs for QA and Studio
- Metadata linking test, trace, and prompt
- Observability events for audit and trace replay
- Optional patch artifacts for review pipelines

These outputs empower QA, Studio, CI/CD, and trace analytics with dynamic, intelligent, explainable test expansions.

## Reactive Triggers: When and Why the Agent Is Invoked

Unlike statically triggered agents (e.g., the Test Case Generator), the Test Generator Agent is activated reactively or on demand, when existing tests are insufficient or complex behaviors demand exploration.
It responds to intelligent signals, not just code artifacts.

## Triggering Modes

| Mode | Description | Example |
|---|---|---|
| Coverage Gap Trigger | Fired when the Test Coverage Validator Agent detects untested paths | "No BDD scenario for CFO role in ApproveInvoice" |
| QA Prompt Trigger | Triggered via Studio input or a prompt request | "What happens if a refund is re-issued after cancellation?" |
| Observability Trigger | Based on runtime telemetry (logs, errors, anomalies) | "Spike in 400 errors on POST /api/refund" |
| Bug Pattern Trigger | Initiated when the Bug Resolver Agent detects an untested fix area | "Bug #4201 lacks test reproduction for invalid tax exemption" |
| Edition/Role Test Gap | Detected when edition-specific paths aren't tested | "Enterprise edition lacks Guest user rejection case" |
| New Business Rule Trigger | Agent re-scans the blueprint after a major rule/validation update | "Added late fee logic to InvoiceDue; generate overdue scenario" |
| Prompt Simulation Mode | Interactive QA request to simulate variants or override parameters | "Simulate invalid dates across timezones" |

## Reactive Lifecycle Example
sequenceDiagram
participant CoverageValidator
participant TestGenerator
participant QAEngineer
participant Studio
CoverageValidator->>TestGenerator: Trigger on missing scenario (CFO approval)
TestGenerator->>QAEngineer: Propose new `.feature` scenario
QAEngineer->>Studio: Approve test addition
TestGenerator->>Studio: Emit markdown summary + test metadata
## Sample Trigger: QA Prompt

Input:
{
"prompt": "What if refund is denied twice?",
"context": {
"handler": "IssueRefundHandler",
"blueprint_id": "usecase-8041"
}
}
Agent Action:
- Generates Scenario: Repeat refund denial handling
- Emits a Markdown summary for Studio
- Appends a new test to IssueRefundHandlerTests.cs

## Studio Trigger UX

The agent is embedded in Studio under:
- "Suggest Missing Test"
- "Simulate Alternate Path"
- "Test Role or Edition Variant"
- "Cover Observability Gap"

Each action produces previewable suggestions.

## Trigger Types Recap

| Type | Triggered By | Result |
|---|---|---|
| Observability | Telemetry alert, error log | Edge-case test inferred from the log |
| Coverage | Validator agent reports a missing scenario | Add a .feature or unit test |
| QA | Prompt in Studio | Prompt-based scenario simulation |
| Bug | Resolver agent signals that reproduction is required | Add a test to capture the defect case |
| Edition | Gap in an edition-specific role or config | Emit a conditional .feature or test branch |
| Business Rule | Blueprint update | Add a test reflecting the new rule path |

## Summary

The Test Generator Agent is never passive; it is responsive, contextual, and AI-augmented:
- Activated when something is missing, ambiguous, or questioned
- Bridges the gap between human QA, production observations, and test logic
- Enables continuous, adaptive quality assurance beyond static coverage

## Process Flow (High-Level)

The Test Generator Agent executes a reactive, prompt-augmented flow designed to transform incomplete test coverage, ambiguous behaviors, or human questions into concrete, testable outputs.
Unlike deterministic generators, it operates like a QA researcher with memory and observability tools, combining design, runtime, and human input into a creative testing loop.
## High-Level Execution Diagram

flowchart TD
    Trigger["Trigger Received<br>(Prompt, Gap, Log, Edition)"] --> Analyze[Analyze Context]
    Analyze --> Plan[Plan Test Generation Path]
    Plan --> Generate[Execute Scenario Simulation & Prompt Skills]
    Generate --> Emit[Emit Test Artifacts]
    Emit --> Validate[Run Lint + Structure Validators]
    Validate --> Report[Emit Metadata + Studio Summary]
    Report --> Done[Done]
## Phase Descriptions

| Phase | Description |
|---|---|
| Trigger Received | Activated by a Studio prompt, the Test Coverage Validator, a bug trace, or telemetry |
| Analyze Context | Loads handler, DTO, existing tests, blueprint, roles, and test history |
| Plan Test Path | Determines which types of tests to generate: BDD, unit, edge, edition-specific |
| Execute Scenario Simulation | Uses prompt skills and embeddings to create test content (steps, assertions, inputs) |
| Emit Artifacts | Outputs .feature, .cs, Markdown, metadata, and logs |
| Validate Structure | Ensures format, structure, trace metadata, naming, and consistency |
| Emit Metadata | Saves test-augmentation-metadata.yaml, emits spans, and notifies Studio/QA agents |

## Execution Characteristics

| Trait | Detail |
|---|---|
| Reentrant | Can be retriggered for the same trace to add more cases |
| Traceable | Every output is tagged with trace_id and source |
| Edition-Aware | Scenarios vary based on edition_id and roles_allowed |
| Prompt-Centric | User questions or QA notes become execution contexts |
| Memory-Augmented | Reuses previous patterns, tests, and coverage snapshots for alignment |
## Example Flow (Prompt-Driven)

1. Trigger: QA in Studio asks "What if refund is attempted twice for the same transaction?"
2. Plan: the agent checks:
   - Handler: IssueRefundHandler
   - Existing test coverage
   - DTO: RefundRequest
   - Prior bug trace: #REFD-2203
3. Generate:
   - Adds a .feature scenario
   - Appends a unit test method to the handler test class
   - Adds a Markdown explanation
4. Validate:
   - Passes lint, snapshot, and coverage-inclusion checks
5. Emit:
   - Saves to repo/memory
   - Updates the Studio dashboard
   - Notifies the QA agent
## Summary

The Test Generator Agent runs a reactive, intelligent, trace-backed pipeline that enables:
- Context-driven test generation
- Scenario expansion based on real user questions and observability
- Outputs tied directly to trace, edition, and role
- QA collaboration and Markdown reporting
- Validation-integrated, repeatable, and audit-safe processes

## Process Flow (Detailed Skill-Orchestrated Flow)

The Test Generator Agent operates as a skill-orchestrated AI agent, executing a series of modular and reactive skills, each one handling a stage in the prompt-to-test translation pipeline.
This architecture supports:
- Reusability across test types
- Trace alignment
- Prompt and telemetry understanding
- Memory and embedding retrieval
- Markdown storytelling and Studio visibility

## Detailed Skill Flow
flowchart LR
A[Trigger Event] --> B[LoadTestContext]
B --> C[IdentifyTestGap]
C --> D[PlanScenarioTemplates]
D --> E[GenerateTestScenarios]
E --> F[EmitTestArtifacts]
F --> G[ValidateGeneratedTests]
G --> H[EmitObservability]
H --> I[StudioSummaryMarkdown]
## Core Skills Used

| Skill | Purpose |
|---|---|
| LoadTestContext | Loads trace ID, blueprint, handler, DTO, edition, and QA prompt |
| IdentifyTestGap | Queries test coverage memory and recent observability logs |
| PlanScenarioTemplates | Decides what kinds of tests to generate (unit, BDD, edition-specific) |
| GenerateTestScenarios | Uses OpenAI-backed planners to simulate steps, inputs, outcomes |
| EmitTestArtifacts | Writes .feature, .cs, YAML metadata, and Markdown |
| ValidateGeneratedTests | Runs naming, formatting, trace-tag, and structure linting |
| EmitObservability | Emits OpenTelemetry spans and JSONL trace logs |
| StudioSummaryMarkdown | Generates a human-readable report for QA and the Studio dashboard |
## Internal Skill Handlers and Sub-Skills

### Test Discovery + Planning
- ScanCoverageByTrace(trace_id)
- FetchMissingScenariosByHandler()
- DetectUncoveredRoles(trace_id, edition)

### Prompt-Based Scenario Generation
- GenerateFeatureScenario(prompt)
- SuggestTestMethod(prompt, handler, dto)
- InferAssertionsFromDTO(field_rules)

### Artifact Generation
- CreateFeatureFile(handler, scenarios)
- AppendTestMethod(test_class, method_code)
- EmitTestMetadata(trace_id, test_type, role, edition)
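To show how these sub-skills might be chained, here is a minimal orchestration sketch. The ITestGenSkills interface, the TestContext record, and the control flow are illustrative assumptions; only the skill names are taken from the lists above.

```csharp
// Minimal orchestration sketch over the sub-skills above. Shapes are assumptions.
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

public record TestContext(string TraceId, string Handler, string Edition, string[] RolesMissing);

public interface ITestGenSkills
{
    Task<TestContext> LoadTestContext(string traceId);
    Task<IReadOnlyList<string>> FetchMissingScenariosByHandler(string handler);
    Task<string> GenerateFeatureScenario(string prompt);
    Task EmitTestMetadata(string traceId, string testType, string role, string edition);
}

public class TestGeneratorOrchestrator
{
    private readonly ITestGenSkills _skills;
    public TestGeneratorOrchestrator(ITestGenSkills skills) => _skills = skills;

    public async Task RunAsync(string traceId, string qaPrompt)
    {
        var context = await _skills.LoadTestContext(traceId);
        var gaps = await _skills.FetchMissingScenariosByHandler(context.Handler);

        // Generate one Gherkin scenario per detected gap, plus one for the QA prompt itself.
        foreach (var gap in gaps.Append(qaPrompt))
        {
            var scenario = await _skills.GenerateFeatureScenario(gap);
            // A real implementation would pass the scenario to CreateFeatureFile / AppendTestMethod here.
            await _skills.EmitTestMetadata(traceId, "bdd", context.RolesMissing.FirstOrDefault() ?? "unknown", context.Edition);
        }
    }
}
```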
## Output Coordination

All skills write into structured paths like:

Tests/
├── PaymentsService.Specs/
│   └── Features/
│       └── refund_flow.feature
├── PaymentsService.UnitTests/
│   └── IssueRefundHandlerTests.cs
├── test-augmentation-metadata.yaml
└── qa-report.md
## Observability Skill Output

Span Example:
{
"span": "testgen.GenerateFeatureScenario",
"trace_id": "refund-2025-0142",
"handler": "IssueRefundHandler",
"edition": "lite",
"source": "qa_prompt",
"scenario_title": "Refund denied twice"
}
A corresponding validation result from ValidateGeneratedTests is recorded alongside each span.
## Retriable and Self-Healing

All skills:
- Support retry on failure or invalid output
- Are idempotent for deterministic prompts
- Auto-tag retry_count, source, and last_modified_by in metadata

## Summary

The Test Generator Agent's core intelligence is delivered through modular Semantic Kernel skills, enabling it to:
- React to signals and prompts
- Simulate human testing logic
- Produce trace-aligned, validated artifacts
- Emit reasoning and coverage metadata

This skill-based execution allows for fine-grained control, modular upgrades, and full AI-driven collaboration with QA teams and Studio.

## Core Skills

The Test Generator Agent's core skills power its ability to:
- Think like a QA engineer
- Generate complete test flows from partial prompts
- Simulate business logic without static code structure
- Convert system knowledge into BDD and executable tests
- Fill testing blind spots using pattern inference, prompt expansion, and flow simulation
## Core AI-Driven Skills

| Skill Name | Description |
|---|---|
| GenerateFeatureScenario(prompt) | Converts a prompt like "What if a refund is issued twice?" into a Gherkin-compliant scenario |
| SuggestTestMethod(prompt, handler, dto) | Creates MSTest method names, signatures, and assertions |
| ExpandPromptIntoPaths(prompt) | Breaks vague QA prompts into multiple edge cases |
| InferAssertionsFromDTO(dto_rules) | Suggests validation rules and expected outcomes |
| SimulateRoleActions(trace, edition) | Generates role-based test variants (e.g., Guest, CFO, SupportAgent) |
| PlanFailureFlowScenarios() | Proposes behavioral fallbacks or constraint-violation tests |
| MapTelemetryToTestInput(log_entry) | Turns runtime logs into structured inputs for test simulation |
| GenerateScenarioMatrix(handler, roles, inputs) | Maps permutations of scenario candidates for dynamic .feature generation |
| PredictRegressionTestImpact(bug_metadata) | Suggests new test scenarios based on historical bug pattern embeddings |
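As a concrete illustration of GenerateScenarioMatrix(handler, roles, inputs), the sketch below enumerates role-by-input permutations as scenario candidates. The ScenarioCandidate shape is an assumption; a real implementation would also filter candidates against existing coverage.

```csharp
// Sketch of scenario-matrix generation: one candidate per role x input-variant combination.
using System.Collections.Generic;

public record ScenarioCandidate(string Handler, string Role, string InputVariant);

public static class ScenarioMatrix
{
    public static IEnumerable<ScenarioCandidate> GenerateScenarioMatrix(
        string handler, IEnumerable<string> roles, IEnumerable<string> inputVariants)
    {
        // Downstream planners decide which permutations become .feature scenarios.
        foreach (var role in roles)
            foreach (var input in inputVariants)
                yield return new ScenarioCandidate(handler, role, input);
    }
}
```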
## Prompt-to-Scenario Skill Chain

Input Prompt: "What if refund is issued while invoice is locked?"

Skill Execution Chain:
1. GenerateFeatureScenario
2. SuggestTestMethod
3. InferAssertionsFromDTO (expected error: "InvoiceLockedException")
4. EmitTestArtifacts (outputs .feature, .cs, and trace metadata)
## Sample Skill: ExpandPromptIntoPaths

Input Prompt: "What if the amount is too high?"

Output:
- Scenario: Amount equals max allowed
- Scenario: Amount exceeds max allowed
- Scenario: Amount is null
- Scenario: Amount is string
- Scenario: Amount submitted twice

These paths are used by downstream skills to generate .feature and test method templates.

## DTO-Aware Assertion Inference

Given:
public class RefundRequest {
[Required]
public decimal Amount { get; set; }
[MaxLength(200)]
public string Reason { get; set; }
}
InferAssertionsFromDTO() generates:
- Amount = 0 results in IsFailure("Amount must be greater than 0")
- Reason = 300 chars results in IsFailure("Reason too long")
## Use of Embeddings

Skills like PredictRegressionTestImpact() and SuggestTestMethod() use:
- Vector similarity with past test descriptions
- Stored embeddings from bug traces, feature summaries, and .feature titles
- Memory of past DTOs and known domain actions

This improves the precision and recall of suggested test cases.

## Markdown Summary Skill

| Skill | Output |
|---|---|
| StudioSummaryMarkdown() | Human-readable test reasoning summary |
| ExplainWhyScenarioMatters() | QA-facing commentary attached to the .feature preview |

## Summary

The Test Generator Agent's core skill system allows it to:
- Understand ambiguous prompts
- Simulate diverse paths with confidence
- Write tests from language and reasoning, not code alone
- Collaborate with QA agents and Studio via explainable, testable artifacts

These skills make it an intelligent QA engineer in software form.
## Observability-Aware Test Inference

The Test Generator Agent integrates with the observability fabric of the platform, leveraging telemetry, spans, logs, and runtime event data to:
- Detect real-world behavior gaps
- Infer scenarios that have not been exercised in tests
- Proactively propose tests to prevent repeat issues
- Link tests directly to production symptoms

This makes the agent not just QA-aligned but production-informed.

## Key Observability Inputs Used

| Source | Example |
|---|---|
| OpenTelemetry Spans | High failure rate in RefundService.Handle() |
| Error Logs | Repeated NullReferenceException on CustomerId |
| HTTP Metrics | Surge in 400 BadRequest on POST /invoice |
| Trace Snapshots | Slow response time with specific input patterns |
| AppInsights / Logs | User retries on a specific flow = behavioral anti-pattern |
| Service Events | event: PaymentMismatchDetected (never tested in a .feature) |

## Skill: MapTelemetryToTestInput

This core skill takes in runtime telemetry (e.g., from logs or spans) and:
- Identifies which handler or controller was involved
- Parses any payload (request/response) structures
- Extracts failure conditions, error messages, or paths
- Reconstructs an inferred test case
## Example Input: Observability-Driven Trigger
{
"trace_id": "refund-2025-0143",
"handler": "IssueRefundHandler",
"error_message": "Amount cannot be null",
"event": "NullReferenceException",
"log_payload": {
"RefundAmount": null,
"CustomerId": "8d2..."
}
}
From this log, the agent generates:
- A test method appended to IssueRefundHandlerTests.cs
- A .feature scenario: Refund with null amount
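A simplified sketch of the MapTelemetryToTestInput step for a record like the one above might look as follows; the TelemetryEvent and InferredTestInput shapes and the null-field heuristic are assumptions.

```csharp
// Sketch of mapping a telemetry record into a structured test-input candidate.
using System.Text.Json;

public record TelemetryEvent(string TraceId, string Handler, string ErrorMessage, JsonElement LogPayload);

public record InferredTestInput(string Handler, string FailingField, string ExpectedError);

public static class TelemetryMapper
{
    public static InferredTestInput MapTelemetryToTestInput(TelemetryEvent evt)
    {
        // Find the first null field in the captured payload; that is the likely failure condition.
        string failingField = "unknown";
        foreach (var field in evt.LogPayload.EnumerateObject())
        {
            if (field.Value.ValueKind == JsonValueKind.Null)
            {
                failingField = field.Name;
                break;
            }
        }

        // The runtime error message becomes the expected assertion for the regression test.
        return new InferredTestInput(evt.Handler, failingField, evt.ErrorMessage);
    }
}
```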
## Metrics-Aware Skills

| Skill | Behavior |
|---|---|
| AnalyzeErrorFrequencySpans() | Identifies handlers with recurring issues |
| InferGapsFromUnhandledSpans() | Finds spans that lack test trace coverage |
| GenerateEdgeScenarioFromLog(log) | Creates a structured test case from runtime data |
| SuggestAssertionFromError(error_msg) | Turns logs into test expectations |
| AttachTestToTelemetrySource() | Tags the generated test with an observability correlation ID |

## Traceability and Metadata Output
generated_by: test-generator-agent
trigger: observability
trace_id: refund-2025-0143
span_id: a4e12d78e
origin: AppInsights
test_artifact: IssueRefundHandlerTests.cs
scenario: Refund with null amount
asserts: ["Amount must not be null"]
Used by Studio, QA, PRs, and observability dashboards.

## Usage in Continuous QA

| Use Case | Result |
|---|---|
| Observability alert with no matching test | Agent adds a scenario |
| Spike in retry rate | Generates scenario "Retry rejected if payment already processed" |
| Missing span-to-test mapping | Agent auto-fills the .feature gap |
| Event observed but not validated | Scenario added asserting event: InvoiceApproved in BDD |

## Integration with QA and Bug Resolver Agents

These agents can flag a missing test for observability-based issues; the Test Generator Agent then:
- Backfills the .feature file
- Suggests a new test method
- Links the observability trace to the handler and its test
## Summary

The Test Generator Agent turns live runtime observations into concrete, testable artifacts, ensuring:
- Nothing observed in production is left untested
- QA feedback loops close automatically
- Tests are tagged with trace, span, and error metadata
- Studio and CI gain insight into real-world coverage, not just design coverage

This is a core differentiator of ConnectSoft's Observability-First QA Architecture.

## Security-Aware Scenario Generation

Security-related bugs are often:
- Undetected by static test generation
- Role-dependent or permission-specific
- Configuration-based (edition, tenant, policy)
- Triggered by unauthorized access or incorrect access control

The Test Generator Agent proactively generates security-focused tests to ensure all authorization paths, privilege boundaries, and denial-of-access flows are covered and tested.

## Security Test Types the Agent Generates

| Test Type | Description |
|---|---|
| Unauthorized Role Access | Ensures roles without permission are blocked |
| Anonymous Access Scenarios | Verifies [AllowAnonymous] behavior and that unauthenticated requests return 401 |
| Role Escalation Attempt | Detects missing guards when a lower-privilege role attempts a privileged action |
| Edition-Specific Permission Cases | Varies access rules based on edition config |
| Token or Claim Manipulation | Explores behavior with corrupted, malformed, or missing claims |
| Restricted State Transition | Prevents forbidden state actions (e.g., closing an already-paid invoice) |

## Inputs for Security Scenario Generation

- roles_allowed from the blueprint or port config
- AuthorizationMap.yaml per edition
- Controller annotations like [Authorize(Roles = "FinanceManager")]
- Previous test coverage: roles tested vs. roles missing
- Studio prompts (e.g., "What happens if a Guest user tries to approve an invoice?")

## Example: Unauthorized Role Test
[TestMethod]
[TraceId("invoice-2025-0147")]
[Edition("enterprise")]
[AugmentedBy("test-generator-agent")]
public async Task Post_ShouldReturn403_WhenGuestTriesToApproveInvoice()
{
var client = factory.CreateClientWithRole("Guest");
var response = await client.PostAsJsonAsync("/api/invoice/approve", validPayload);
Assert.AreEqual(HttpStatusCode.Forbidden, response.StatusCode);
}
## Example BDD Scenario
@edition:enterprise
@trace_id:invoice-2025-0147
@source:security-inference
Feature: Invoice approval access control
Scenario: Guest user attempts approval
Given a user with role Guest
When they send an approval request
Then access is denied with status 403
## Skill-Based Flow

| Skill | Action |
|---|---|
| EnumerateRoleVariants(handler) | Detects all roles not yet tested |
| SimulateUnauthorizedAccess(role) | Proposes tests to enforce denial |
| InferSecurityPolicyFromEdition(edition) | Adjusts tests for edition-specific rules |
| GenerateAuthorizationAssertions() | Converts 403, 401, or redirect outcomes into test assertions |
| PatchMissingAuthScenarios() | Fills BDD/test files with missing security cases |
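The sketch below illustrates how EnumerateRoleVariants and SimulateUnauthorizedAccess could cooperate to propose one access-control test per untested role. The RoleGap record and the test-name convention are assumptions made for the example.

```csharp
// Sketch of pairing untested roles with access-control test proposals; shapes are illustrative.
using System.Collections.Generic;
using System.Linq;

public record RoleGap(string Handler, string Role, bool IsAllowed);

public static class SecurityScenarioPlanner
{
    public static IEnumerable<RoleGap> EnumerateRoleVariants(
        string handler, IEnumerable<string> allRoles, IEnumerable<string> rolesAllowed, IEnumerable<string> rolesTested)
    {
        // Every role not yet tested is a candidate; record whether it should be allowed or denied.
        var allowed = rolesAllowed.ToHashSet();
        return allRoles.Except(rolesTested).Select(role => new RoleGap(handler, role, allowed.Contains(role)));
    }

    public static string SimulateUnauthorizedAccess(RoleGap gap)
    {
        // Denied roles get a 403 assertion; allowed-but-untested roles get a success-path test instead.
        return gap.IsAllowed
            ? $"Post_ShouldSucceed_When{gap.Role}Calls{gap.Handler}()"
            : $"Post_ShouldReturn403_When{gap.Role}Calls{gap.Handler}()";
    }
}
```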
## Edition Example

### Blueprint

### Output

- Validates approval for the CFO
- Rejects Guest, Analyst, and SupportAgent
- Adds a test plus a .feature scenario for all undefined roles

## Traceability

All security scenarios are:
- Tagged with security_test: true
- Linked to trace_id, edition, and handler
- Indexed in test-metadata.yaml and Studio dashboards

## Summary

The Test Generator Agent proactively defends against security regressions by:
- Discovering untested access paths
- Enforcing the principle of least privilege through test scenarios
- Tagging tests for edition-, trace-, and role-specific enforcement
- Generating assertions for expected denials and edge paths

Security-aware scenario generation ensures the factory doesn't just produce working software; it produces safe software.
## Scenario Enrichment for BDD and Studio

One of the Test Generator Agent's most user-facing roles is its ability to enrich .feature files and support Studio-based exploratory QA workflows by:
- Adding role-, state-, and failure-path scenarios
- Filling in edge cases and prompts into existing .feature files
- Enhancing readability, clarity, and auditability of QA test specs
- Keeping BDD specs aligned with trace and prompt context

This bridges developer-generated tests with human-readable QA artifacts.

## Key BDD Enrichment Functions

| Enrichment Type | Purpose |
|---|---|
| Scenario Injection | Appends new Scenario: blocks to existing .feature files |
| Prompt-to-Gherkin Expansion | Converts Studio prompts into full Gherkin test narratives |
| Role Variant Enrichment | Adds missing role paths (e.g., Guest, Admin) |
| Condition Branch Scenarios | Adds Given/When/Then for alternate flows (e.g., empty input, retry) |
| Security/Access Markers | Adds @auth, @denied, @edition:lite tags |
| Studio Preview Markdown | Generates QA- and product-friendly descriptions for visual dashboards |

## BDD Example Before Enrichment
Feature: Capture payment
Scenario: Successful payment
Given a cashier submits a valid payment
Then the payment is recorded
## After Enrichment
Feature: Capture payment
Scenario: Successful payment
Given a cashier submits a valid payment
Then the payment is recorded
Scenario: Duplicate payment submission
Given a payment was already processed
When a second request is sent
Then the request is rejected with status DuplicatePayment
Scenario: Guest user attempts payment
Given a user with role Guest
When they submit a payment
Then access is denied with 403
## BDD Enrichment Skills Used

| Skill | Description |
|---|---|
| DetectEnrichableFeature(trace_id) | Loads and parses the current .feature file |
| ProposeNewScenarios(prompt, gaps) | Suggests 1-N new Gherkin scenarios |
| ApplyRoleMatrixToFeature() | Adds one scenario per uncovered role |
| InsertScenarioTags() | Adds @edition, @prompt, @security tags |
| WriteFeaturePatch() | Appends new scenarios into the correct file while preserving existing content |
| GenerateStudioMarkdownSummary() | Outputs readable descriptions for QA & PM dashboards |
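As one possible shape for WriteFeaturePatch(), the sketch below appends tagged scenario blocks to an existing .feature file without rewriting what is already there. File layout and tag placement are assumptions.

```csharp
// Sketch of appending enriched scenarios to an existing feature file; details are illustrative.
using System.Collections.Generic;
using System.IO;
using System.Text;

public static class FeaturePatcher
{
    public static void WriteFeaturePatch(string featurePath, IEnumerable<string> newScenarioBlocks, string traceId)
    {
        var builder = new StringBuilder(File.ReadAllText(featurePath));

        foreach (var block in newScenarioBlocks)
        {
            // Tag each appended scenario so Studio and metadata can trace its origin.
            builder.AppendLine();
            builder.AppendLine($"  @trace_id:{traceId} @augmented_by:test-generator-agent");
            builder.AppendLine(block.TrimEnd());
        }

        File.WriteAllText(featurePath, builder.ToString());
    }
}
```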
## Output: Studio Markdown Summary

### Test Scenario Expansion for Capture Payment

Trace: payments-2025-0321
Trigger: Studio prompt "What if payment is submitted twice?"

Appended to `capture_payment.feature`:
- Scenario: Duplicate payment submission
- Scenario: Guest user access denied

@tags: security, regression, prompt-driven

## Scenario Enrichment Metadata

All new .feature scenarios are traced with:
scenario_source: test-generator-agent
trace_id: payments-2025-0321
trigger: qa_prompt
augmented_roles: [Guest]
edition: enterprise
Stored in test-augmentation-metadata.yaml and indexed for Studio navigation.
## Studio Integration

| Section | Use |
|---|---|
| Test Preview Panel | Shows enriched scenarios with trace context |
| Missing Roles Dashboard | Lists roles not yet tested (used by ApplyRoleMatrixToFeature()) |
| Prompt History Panel | Matches generated scenarios with QA questions |
| Audit Logs | Show scenario origin, version, and rationale |

## Summary

The Test Generator Agent enriches BDD and Studio workflows by:
- Appending intelligent scenarios to .feature specs
- Translating QA prompts into structured, Gherkin-based tests
- Embedding trace, edition, and security metadata
- Supporting QA engineers, product managers, and test reviewers with Markdown insights

This creates a seamless QA experience: from test prompt, to scenario, to Studio visibility.

## Integration with QA Engineer Agent and Studio

The Test Generator Agent is designed to work as a direct augmentation partner for the QA Engineer Agent and the Studio UX.
- The QA Engineer Agent owns the QA strategy, gap tracking, test run validation, and regression feedback.
- The Test Generator Agent enhances QA workflows with intelligent, promptable, and observability-aware test generation.
- Studio is the shared interface, where both agents are visible to QA engineers, test designers, and product managers.
## Integration Points with QA Engineer Agent

| Function | Description |
|---|---|
| Scenario Suggestion Sync | Agent suggests test cases for uncovered flows; the QA Engineer Agent decides inclusion |
| Gap Auto-Filling | QA Agent flags missing paths; the Test Generator Agent emits a .feature patch |
| Prompt Collaboration | QA prompt goes to the Test Generator Agent for simulation; the QA Agent evaluates and stores the result |
| Feedback Loop | QA Agent marks each suggestion as accepted, needs correction, or not relevant |
| Studio Summary Indexing | Test Generator outputs Markdown and tags for the QA Agent's test plan reporting |
| Role Map Expansion | QA Agent tracks role coverage; Test Generator fills missing paths per handler or edition |

## Skill-Based Collaboration
sequenceDiagram
participant QAEngineerAgent
participant TestGeneratorAgent
participant Studio
Studio->>QAEngineerAgent: Detect uncovered refund flow
QAEngineerAgent->>TestGeneratorAgent: Prompt: "What if refund is rejected twice?"
TestGeneratorAgent->>QAEngineerAgent: Emit `.feature` + `.cs` test case
QAEngineerAgent->>Studio: Approve, reject, or comment
Studio->>TestGeneratorAgent: Feedback logged
## Shared Artifacts

| Artifact | Used By | Description |
|---|---|---|
| test-augmentation-metadata.yaml | Both | Stores the trace-to-scenario link and origin |
| qa-report.md | QA Agent | Combines test summaries for visibility and coverage tracking |
| Enriched .feature files | QA Agent | Updated BDD specs consumed by QA and CI |
| observability-events.jsonl | QA Agent | QA traceability and status tracking |
| StudioPromptContext.json | QA Agent + Studio | Tracks the prompt, test suggestion, and review cycle |
## Example: QA Prompt Collaboration

Prompt: "What if CFO tries to cancel a refund after it was approved?"

1. Agent proposes:
   - Scenario: CFO cannot cancel approved refund
   - Test method: Cancel_ShouldFail_WhenAlreadyApproved_ByCFO()
2. QA Agent accepts, triggering a Git commit and a Studio UI patch
3. Agent logs the acceptance in trace metadata and observability events
## QA Workflow in Studio

| Section | Test Generator Agent Role |
|---|---|
| Prompt-to-Preview Panel | Shows the suggested .feature block |
| Missing Roles View | Triggers the enrichment skill for each uncovered role and handler |
| Approved Scenarios | Tracked via qa-feedback.md |
| Rejected or Needs Fix | Sent back to the agent for retry or rephrasing |
| Trace View | Visual chain: handler, test, QA prompt, scenario |

## Collaboration Cycle Summary

| Step | Action |
|---|---|
| 1. QA Prompt | Studio or QA Agent trigger |
| 2. Agent Simulates | Scenario, test file, and trace metadata emitted |
| 3. QA Approves/Rejects | Through Studio |
| 4. Studio Annotates | Adds visual test coverage and changelogs |
| 5. Retry if Needed | Agent regenerates the adjusted test |

## Summary

The Test Generator Agent integrates deeply with the QA Engineer Agent and Studio by:
- Co-creating tests from prompts and gaps
- Suggesting intelligent .feature expansions
- Tracking traceability, edition, roles, and human review
- Powering Studio's "what if" testing UX
- Supporting iterative QA-AI collaboration with trace-safe retries

Together, they enable a human-AI hybrid testing workflow that is both scalable and explainable.
## Self-Evaluation and Test Gap Identification

To maintain test quality autonomously, the Test Generator Agent includes a self-evaluation loop that allows it to:
- Identify missing or insufficient tests
- Reason about the impact of not testing a specific path
- Detect discrepancies between real-world signals and generated scenarios
- Suggest tests based on observed gaps without relying solely on external triggers

This empowers the agent to continuously optimize test completeness even in the absence of explicit QA prompts.

## Key Self-Evaluation Responsibilities

| Responsibility | Description |
|---|---|
| Trace-Scenario Coverage Review | Compares blueprint/handler definitions vs. current test artifacts |
| Role Path Validation | Detects which role-action combinations are not tested |
| Edition-Specific Variant Check | Confirms whether tests for lite, pro, enterprise, etc. exist |
| Telemetry Trace Audit | Looks for runtime spans/events that have no test match |
| Bug Replay Coverage | Compares the test set with known fixed bugs to ensure permanent defense |
| Validator-Rule Crosscheck | Flags missing negative tests for RuleFor(...) combinations |
| Gherkin Completeness Scans | Ensures all business paths appear in .feature files with proper assertions |
## Skill Set for Test Gap Identification

| Skill | Function |
|---|---|
| ScanTraceTestCompleteness(trace_id) | Loads metadata and compares declared vs. tested |
| EnumerateUntestedRoles(handler, edition) | Generates a list of role-action gaps |
| CheckMissingFeaturePaths(handler) | Parses the .feature file and detects missing states or transitions |
| ValidateEdgeCoverage(dto) | Ensures range, null, and invalid cases are tested |
| CompareBugFixToTestSet(bug_id) | Checks whether bug conditions are replicated in current tests |
| IdentifyEditionTestGaps() | Detects a missing .feature for one or more editions |
| SuggestMissingAssertions() | Detects scenarios without Then clauses or outcome checks |
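A minimal sketch of ScanTraceTestCompleteness, comparing what a trace declares against what is already tested, could look like this; the record shapes are assumptions, and a real implementation would read them from test-augmentation-metadata.yaml and the blueprint.

```csharp
// Sketch of declared-vs-tested comparison for a single trace; shapes are illustrative.
using System.Collections.Generic;
using System.Linq;

public record TraceDeclaration(string TraceId, string Handler, string[] RolesAllowed, string[] Editions);
public record CurrentCoverage(string[] RolesTested, string[] EditionsTested, string[] ScenarioTitles);
public record GapReport(string TraceId, string[] MissingRoles, string[] MissingEditions);

public static class SelfEvaluation
{
    public static GapReport ScanTraceTestCompleteness(TraceDeclaration declared, CurrentCoverage tested)
    {
        // Any declared role or edition with no matching test becomes a gap entry.
        var missingRoles = declared.RolesAllowed.Except(tested.RolesTested).ToArray();
        var missingEditions = declared.Editions.Except(tested.EditionsTested).ToArray();
        return new GapReport(declared.TraceId, missingRoles, missingEditions);
    }
}
```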
## Gap Evaluation Example
trace_id: invoice-2025-0142
blueprint_id: usecase-9241
handler: CreateInvoiceHandler
roles_allowed: [FinanceManager, CFO]
tested_roles: [FinanceManager]
gap:
- missing_role: CFO
- no test for zero amount
- no feature for duplicate invoice error
- no edition-specific `.feature` for `enterprise`
The agent then triggers:
- A .feature addition for the CFO path
- Validator test: Amount = 0
- Unit test: Handle_ShouldReject_WhenInvoiceExists

## Self-Evaluation Feedback Formats

### Markdown Summary

Self-Evaluation Summary: CreateInvoiceHandler (Trace: invoice-2025-0142)

Missing:
- [ ] Role: CFO
- [ ] Negative validator test: ZeroAmount
- [ ] Scenario: Duplicate invoice error
- [ ] Edition-specific: `enterprise` test case

Planned augmentation steps: 3

### Metadata Log
self_check_result: failed
missing_roles:
- CFO
untested_conditions:
- ZeroAmount
- DuplicateInvoice
edition_variants_missing:
- enterprise
next_actions:
- trigger GenerateFeatureScenario(prompt="What if invoice already exists?")
- trigger GenerateValidatorTest(field=Amount, condition=Zero)
## Gap Feedback Loop

1. Agent executes ScanTraceTestCompleteness()
2. Missing elements are logged into the Studio trace dashboard
3. The agent, autonomously or upon QA approval, generates the augmented tests
4. Metadata is updated: gap_resolved = true

## Traceability for Test Gap Detection

Every scenario added via gap detection includes:
augmented_by: test-generator-agent
trigger: self-evaluation
gap_id: auto-detected
source_blueprint: usecase-9241
## Summary

Through self-evaluation, the Test Generator Agent becomes proactive, not reactive:
- Automatically identifies test holes
- Generates patches to secure QA coverage
- Feeds insights to Studio dashboards and QA checklists
- Enables closed-loop validation, continuously improving coverage without human instruction

## Human-in-the-Loop Augmentation Mode

While the Test Generator Agent excels at autonomous test generation, it is designed to work collaboratively with human QA engineers, developers, and product managers, enabling a "human-in-the-loop" mode to:
- Accept QA prompts
- Provide test previews for confirmation
- Accept manual edits and inject them back into the test set
- Support iterative refinement of .feature files, test methods, or Markdown explanations
- Learn from accept/reject patterns over time

This mode ensures that test generation remains auditable, adaptable, and alignable with human expertise.

## Key Capabilities in Human-in-the-Loop Mode

| Capability | Description |
|---|---|
| Prompt-Driven Scenario Suggestion | QA types a "What if..." in Studio and the agent generates a proposal |
| Scenario Previews | Proposed Gherkin shown in a read-only or editable preview |
| Editable Markdown Summaries | QA can revise the scenario description before acceptance |
| Comment & Correction Loop | QA rejects or edits a suggestion; the agent retries with the feedback applied |
| Feedback Learning | Rejected patterns are down-ranked in embeddings and prompt planners |
| Tagged Trace Update | Marks tests as human_verified, qa_adjusted, or manual_override |
| Studio Live Collaboration | QA can submit batch prompts or comment inline on test proposals |
## Example Flow
sequenceDiagram
participant QA
participant Studio
participant TestGen
QA->>Studio: "What if the CFO cancels a locked invoice?"
Studio->>TestGen: Prompt context submitted
TestGen->>Studio: Preview scenario + test method
QA->>Studio: Edit Gherkin + approve
Studio->>TestGen: Submit revised version
TestGen->>Repo: Finalize test and metadata, update trace
## Editable Scenario Preview Example
# Suggested by agent:
Scenario: CFO cancels a locked invoice
Given a CFO user
When they cancel an already locked invoice
Then the system rejects the request
# QA edits:
Scenario: CFO cannot cancel locked invoice
Given the invoice is in "Locked" state
And the user is CFO
When they try to cancel it
Then they receive a "ForbiddenOperation" error
The final version is submitted with a review tag such as review_status: accepted_with_edits (see the metadata example below).
## Markdown Summary Edits

QA can likewise edit the generated Markdown summary before acceptance; the agent keeps both the original and the QA-edited version.
## Feedback Loop and Retry Logic

| Input | Result |
|---|---|
| QA clicks "Reject: Not Relevant" | Agent suppresses the scenario pattern for similar prompts |
| QA selects "Try Again (Better Assertion)" | Agent re-runs with stricter validation rules or alternate outcome phrasing |
| QA marks the scenario as "Duplicate" | Agent removes it and updates metadata to avoid future duplication |
| QA types "Add edge case for negative amount too" | Triggers a child prompt expansion |

## Learning and Memory from Human Feedback

The agent stores:
- Rejected scenario titles
- Accepted assertions
- QA phrasing and tags
- Editions and roles that consistently require expansion
- QA reviewer preferences and behavior patterns (optional)

This improves generation quality over time across the platform.

## Metadata Example with Human Hooks
scenario: CFO cannot cancel locked invoice
source: test-generator-agent
prompt: QA prompt from Studio
review_status: accepted_with_edits
edited_by: qa.alex.k
feedback_applied: true
## Summary

The Test Generator Agent in human-in-the-loop mode:
- Enhances QA creativity with structured AI suggestions
- Learns from corrections to improve future outputs
- Integrates directly with Studio for real-time review
- Produces editable .feature, .cs, and Markdown assets
- Supports iterative, explainable, and human-verifiable QA workflows

This is where AI + QA collaboration becomes seamless and scalable.

## Memory, Vector Embeddings, and Similarity Prompts

To generate contextually relevant, non-redundant, and intelligently suggested tests, the Test Generator Agent relies on:
- Memory: persistent, trace-aligned records of what was tested, why, and how
- Vector embeddings: semantic similarity across prompts, scenarios, test cases, bugs, and DTOs
- Example reuse: drawing from similar service domains to enrich test quality and coverage

This enables the agent to avoid duplication, recommend consistent patterns, and expand intelligently using learned context.
## What the Agent Stores and Embeds

| Knowledge Type | Format | Purpose |
|---|---|---|
| Test Scenarios | .feature titles and steps | To cluster and suggest similar test ideas |
| QA Prompts | Vector embeddings | To answer similar future questions better |
| DTO Structures | Parsed DTOs as embeddings | To infer test conditions from similar DTO fields |
| Bug Metadata | bug_id, failure symptoms, event names | To suggest regression-preventing test logic |
| Roles-to-Handlers Maps | Role × Edition × Action | For complete role coverage suggestions |
| Handler-to-Test Mappings | From test-metadata.yaml | To detect structural test gaps |
| Blueprints & Use Cases | Embeddings of domain flow text | To auto-extend coverage across domains |
## Embedding-Based Prompt Expansion

Prompt: "What if an Analyst tries to approve a payment?"

Embedding similarity finds:
- Guest cannot approve invoice
- Unauthorized SupportAgent tries to cancel refund
- Non-admin user denies large transaction

The agent then suggests:
- Scenario: Analyst user access denied
- Method: Post_ShouldReturn403_WhenUserIsAnalyst
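Conceptually, this expansion rests on a nearest-neighbor lookup over stored scenario embeddings. The sketch below shows a plain cosine-similarity search; the storage model, embedding source, and threshold are assumptions.

```csharp
// Cosine-similarity search over stored scenario embeddings; shapes and threshold are illustrative.
using System;
using System.Collections.Generic;
using System.Linq;

public record StoredScenario(string Title, float[] Embedding);

public static class ScenarioSimilarity
{
    public static IEnumerable<StoredScenario> FindSimilarScenarios(
        float[] promptEmbedding, IEnumerable<StoredScenario> memory, double threshold = 0.8)
    {
        // Return stored scenarios whose embeddings are close enough to the prompt's embedding.
        return memory
            .Select(s => (Scenario: s, Score: Cosine(promptEmbedding, s.Embedding)))
            .Where(x => x.Score >= threshold)
            .OrderByDescending(x => x.Score)
            .Select(x => x.Scenario);
    }

    private static double Cosine(float[] a, float[] b)
    {
        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.Length; i++)
        {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        return dot / (Math.Sqrt(normA) * Math.Sqrt(normB) + 1e-9);
    }
}
```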
## Skills That Use Memory & Embeddings

| Skill | Usage |
|---|---|
| FindSimilarScenarios(trace_id, prompt) | Pulls reusable .feature structures from related traces |
| InferTestsFromRelatedDTOs(dto) | Suggests edge cases seen in similar DTOs (e.g., Amount, Currency) |
| PredictTestGapsFromPastBugs() | Uses bugs with similar symptoms to generate regression tests |
| ClusterUncoveredRoles() | Uses role embeddings to suggest test coverage plans |
| LearnFromQAEdits() | Stores accepted/rejected scenarios and avoids generating similar rejected paths |
| SuggestAssertionsBasedOnMemory() | Suggests Then: and Assert statements that match domain-specific expectations |

## Example: Memory Entry
{
"trace_id": "invoice-2025-0131",
"handler": "CreateInvoiceHandler",
"prompt_embedding": [0.1, 0.42, ..., 0.08],
"scenario": "Submit invoice with null customer ID",
"assertion": "Fails with 'CustomerId is required'",
"roles_tested": ["FinanceManager"],
"roles_missing": ["CFO", "Guest"]
}
## Reuse Across Microservices

If a test exists in CreateOrderHandlerTests.cs (e-commerce domain), the agent can suggest similar edge-case tests in CreateInvoiceHandlerTests.cs (finance domain).
This supports domain-informed reuse, which is especially useful in high-scale, factory-wide generation.

## DTO Similarity Expansion Example
public class PaymentRequest {
public decimal Amount { get; set; }
public string Currency { get; set; }
}
Memory shows:
- Amount = 0 has a known zero-boundary test
- Currency = "" has a known invalid-format test

So the agent recommends:
- Handle_ShouldFail_WhenAmountIsZero
- Validate_ShouldReject_WhenCurrencyIsEmpty

## Prompt Learning from QA Feedback

Rejected: "What if Guest submits a refund?" with feedback "Already covered by CFO test; redundant."

The agent embeds this feedback and avoids similar redundant scenarios for Guest/CFO unless the roles differ materially.

## Metadata Tracked with Embedding-Driven Suggestions
embedding_source: "prompt: unauthorized refund"
suggested_by: similarity_from_trace[invoice-2025-0123]
feedback_history: accepted_by_qa.alex.k
semantic_similarity: 0.86
## Summary

By using memory and vector embeddings, the Test Generator Agent:
- Thinks across modules, services, and past QA feedback
- Reuses relevant test logic without hardcoding
- Suggests smarter assertions, test names, and BDD flows
- Learns and improves coverage over time, autonomously

This ensures test generation is context-aware, non-repetitive, and knowledge-enriched.

## Retry, Correction, and Trace-Driven Enhancements

The Test Generator Agent is equipped with a resilient and trace-safe retry and correction mechanism that ensures:
- Broken, incomplete, or rejected tests are reprocessed
- Prompt-based or AI-generated scenarios are refined upon feedback
- Observability-driven augmentations are trace-aware and retryable
- Human corrections via Studio can trigger enhanced regeneration cycles

This enables self-healing test generation and QA feedback incorporation with auditability.
## Retry Triggers

| Trigger Type | Description |
|---|---|
| Test Lint/Validation Failure | A scenario is generated but fails structure/format rules |
| Missing Assertion Detected | A test or .feature lacks a valid outcome check |
| QA Rejection in Studio | Scenario marked as "Inaccurate", "Not needed", or "Duplicate" |
| Bug Regression Trace Replay | Agent is asked to regenerate test coverage for the same trace ID with updated bug inputs |
| Memory Suggestion Conflict | A duplicate scenario is detected against an existing .feature |

## Retry Flow
sequenceDiagram
participant Studio
participant QAEngineer
participant TestGen
participant Memory
QAEngineer->>Studio: Reject test scenario from prompt
Studio->>TestGen: RetryScenario(trace_id, feedback="unclear THEN clause")
TestGen->>Memory: Lookup prior attempt
TestGen->>TestGen: Rerun GenerateTestScenarios with revised constraints
TestGen->>Studio: Emit revised scenario for approval
## Retry Modes

| Mode | Behavior |
|---|---|
| patch-only | Adds missing methods/scenarios without regenerating the whole test class |
| regenerate-with-feedback | Reruns the prompt with the attached feedback (assertion too vague, role mismatch) |
| semantic-deduplication | Regenerates using prompt embeddings while removing previously generated ideas |
| qa-intervention-loop | The test is held until QA explicitly reviews and resubmits the final wording |
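For the regenerate-with-feedback mode specifically, a minimal retry loop might look like the sketch below, where validator or QA feedback is folded back into the prompt before the next attempt. The delegate signatures and retry cap are assumptions.

```csharp
// Sketch of a regenerate-with-feedback retry loop; delegates and cap are illustrative.
using System;
using System.Threading.Tasks;

public static class RetryWithFeedback
{
    public static async Task<string> GenerateAsync(
        string prompt,
        Func<string, Task<string>> generateScenario,
        Func<string, (bool IsValid, string Feedback)> validate,
        int maxRetries = 3)
    {
        string effectivePrompt = prompt;
        for (int attempt = 0; attempt <= maxRetries; attempt++)
        {
            var scenario = await generateScenario(effectivePrompt);
            var (isValid, feedback) = validate(scenario);
            if (isValid)
                return scenario;

            // Fold the rejection reason back into the prompt before the next attempt.
            effectivePrompt = $"{prompt}\nPrevious attempt was rejected because: {feedback}";
        }
        throw new InvalidOperationException($"Scenario generation failed after {maxRetries + 1} attempts.");
    }
}
```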
## Retry Metadata Example
trace_id: payments-2025-0471
scenario_id: refund_duplicate
retry_count: 2
retry_reason: "Missing Then clause"
last_feedback: "Scenario lacks concrete outcome"
status: resolved
regenerated_by: test-generator-agent
## Human Correction Loop (Studio-Driven)

| Action | Effect |
|---|---|
| "Needs better Then clause" | Agent reconstructs the test output with stronger assertion logic |
| "Duplicate of CFO approval test" | Agent suppresses this scenario in current and future trace ID contexts |
| "Add variation for 'InvalidCurrency' too" | Triggers child prompt expansion and batch generation |
π Bug Trace Replay Correction¶
| Input | Action |
|---|---|
| Bug #4871: βRefund allowed after approvalβ | Agent verifies coverage in .feature + handler test |
| β No test found | Agent re-executes test plan for trace ID |
| β Result | Outputs regression test for scenario: βPrevent refund after approvalβ |
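The regression test emitted for βPrevent refund after approvalβ could look roughly like the sketch below; the `RefundWorkflow` type, method names, and error message are illustrative stand-ins rather than the service's real contract:

```csharp
using System;
using Xunit;

// Compilable stand-in for the refund workflow; the real aggregate/handler comes from the service.
public class RefundWorkflow
{
    public bool Approved { get; private set; }

    public void Approve() => Approved = true;

    public void IssueRefund()
    {
        if (Approved)
            throw new InvalidOperationException("RefundNotAllowedAfterApproval");
    }
}

public class RefundRegressionTests
{
    [Fact] // Regression for Bug #4871: refund must be rejected once approval has happened.
    public void Refund_ShouldBeRejected_WhenAlreadyApproved()
    {
        var workflow = new RefundWorkflow();
        workflow.Approve();

        var ex = Assert.Throws<InvalidOperationException>(() => workflow.IssueRefund());
        Assert.Equal("RefundNotAllowedAfterApproval", ex.Message);
    }
}
```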
π Observability-Driven Retry¶
When a telemetry span or log indicates:
- 5xx errors
- Timeout loops
- Malformed DTO inputs
Agent:
- Queries memory for trace ID
- Evaluates: βIs this issue test-covered?β
- If not, retries scenario generation for that path
- Adds metadata:
trigger=retry:observability
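A condensed sketch of that check-and-retry loop is shown below; the interfaces are assumptions standing in for the agent's memory and generation skills:

```csharp
using System.Threading.Tasks;

// Hypothetical collaborators; the real agent resolves these through its skills and memory layer.
public interface ITestMemory
{
    Task<bool> HasCoverageForTraceAsync(string traceId);
}

public interface IScenarioGenerator
{
    Task GenerateForTraceAsync(string traceId, string triggerTag);
}

public class ObservabilityRetryHook
{
    private readonly ITestMemory _memory;
    private readonly IScenarioGenerator _generator;

    public ObservabilityRetryHook(ITestMemory memory, IScenarioGenerator generator)
    {
        _memory = memory;
        _generator = generator;
    }

    // Invoked when a span or log signals 5xx errors, timeout loops, or malformed DTO inputs.
    public async Task OnSuspiciousTelemetryAsync(string traceId)
    {
        // "Is this issue test-covered?" If yes, nothing to do.
        if (await _memory.HasCoverageForTraceAsync(traceId))
            return;

        // If not, retry scenario generation for that path and tag the attempt.
        await _generator.GenerateForTraceAsync(traceId, triggerTag: "retry:observability");
    }
}
```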
π Tracking Retry History in Artifacts¶
Each .feature and test method includes:
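For example, retry history can travel with the artifact itself. On the C# side this might surface as xUnit traits; this is an illustrative sketch in which the tag names simply mirror the retry metadata fields shown earlier, and the same values would appear as tags or comments on the corresponding `.feature` scenario:

```csharp
using System.Threading.Tasks;
using Xunit;

public class RefundDuplicateTests
{
    [Fact]
    [Trait("trace_id", "payments-2025-0471")]
    [Trait("scenario_id", "refund_duplicate")]
    [Trait("retry_count", "2")]
    [Trait("retry_reason", "Missing Then clause")]
    public Task Refund_ShouldFail_WhenIssuedTwice()
    {
        // Body kept trivial: the point is that retry history rides along as queryable metadata.
        return Task.CompletedTask;
    }
}
```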
β Summary¶
The Test Generator Agent uses trace-aware, intelligent retry strategies to:
- π Improve test precision via QA feedback
- π§ Refine generative logic using memory and embeddings
- π Ensure no prompt or runtime gap goes untested
- π Maintain audit-safe artifacts with retry tags and feedback context
This ensures that test generation is never final β only validated through continuous collaboration and reasoning.
π― Studio Hooks and Markdown Test Storytelling¶
The Test Generator Agent is deeply integrated with the Studio UI, serving both:
- π§ As a behind-the-scenes test generator, and
- π As a human-facing storyteller that explains what it generated, why, and how.
It uses Studio hooks and markdown-based summaries to make test artifacts:
- Understandable to QA engineers
- Reviewable by product stakeholders
- Traceable by developers and test leads
π¦ Outputs Connected to Studio¶
| Output | Purpose |
|---|---|
| `.feature` scenarios | Visualized in Studio test coverage dashboards |
| Markdown scenario summaries | Displayed in the βSuggested Testsβ or βScenario Detailsβ panels |
| Feedback metadata | Drives accept/reject flows and traceability |
| Retry context | Enables inline regeneration with human guidance |
| Coverage links | βTest created by agentβ β click to view scenario and trace |
| Prompt logs | Shows QA prompts and associated generated scenarios |
π§© Markdown Test Storytelling Format¶
Agent emits a QA-readable summary for each test it generates, including:
- β Scenario title
- π§ Source reasoning
- π Retry or enrichment history
- π Role and edition tested
- π Assertion summary
- π Trace metadata
π Example Output: Markdown Story¶
### π Scenario: Refund is issued twice
π Trace: refund-2025-0143
π§ Source: QA prompt β βWhat if refund is attempted more than once?β
β Test generated:
- Type: BDD + Unit
- Edition: `lite`
- Roles tested: `SupportAgent`
π Scenario Summary:
Given a support agent has already issued a refund
When they try to issue it again
Then the system rejects it with error "DuplicateRefund"
π Feedback:
- Attempt #1: Missing THEN clause β resolved
- QA Comment: "Consider rephrasing expected outcome"
π Status: β Approved and committed by QA
π Studio UX Integration Points¶
| Studio Component | Agent Behavior |
|---|---|
| Prompt Console | Receives QA test idea β generates suggestions |
| Scenario Preview Panel | Displays formatted .feature with QA feedback tools |
| Trace View | Connects .feature and .cs output to originating handler |
| Test Gaps Dashboard | Highlights untested paths β agent fills them in |
| Retry Request Button | Re-runs scenario generation for a single trace, role, or edition |
| Markdown Viewer | Shows scenario story in human-friendly form |
π§ Enhanced QA Review Flow¶
- QA types: βWhat happens if CFO cancels after approval?β
- Agent generates:
    - `.feature` scenario
    - Markdown story with reasoning
    - Suggested test name: `Cancel_ShouldFail_WhenAlreadyApproved_ByCFO()` (see the sketch after this list)
- QA sees:
    - Scenario
    - Markdown explanation
    - βAccept / Edit / Retryβ controls
- Approval β Commits to repo and marks trace as βQA-verifiedβ
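For illustration, the suggested test from this flow might land in the repo shaped like the sketch below; `ApprovalWorkflow` and its members are hypothetical stand-ins for the real handler:

```csharp
using System;
using Xunit;

// Compilable stand-in for the approval workflow touched by the QA prompt above.
public class ApprovalWorkflow
{
    public string? ApprovedBy { get; private set; }

    public void Approve(string role) => ApprovedBy = role;

    public void Cancel(string role)
    {
        if (ApprovedBy is not null)
            throw new InvalidOperationException("CannotCancelAfterApproval");
    }
}

public class ApprovalCancellationTests
{
    [Fact]
    public void Cancel_ShouldFail_WhenAlreadyApproved_ByCFO()
    {
        var workflow = new ApprovalWorkflow();
        workflow.Approve("CFO");

        Assert.Throws<InvalidOperationException>(() => workflow.Cancel("CFO"));
    }
}
```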
π Trace-Aware Tags in Markdown¶
All story outputs include:
trace_id: refund-2025-0143
scenario_id: refund_duplicate
prompt_source: studio.prompt.qa.alex.k
edition: lite
role: SupportAgent
retry_count: 1
source_skill: GenerateFeatureScenario
β These enable filtering, search, and trace-to-test mapping inside Studio.
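For instance, a Studio view could filter stories on those tags directly; the record and query below are a minimal sketch, not Studio's actual data model:

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical flattened view of the tag block attached to each generated story.
public record ScenarioStory(
    string TraceId, string ScenarioId, string Edition, string Role, int RetryCount);

public static class StudioFilters
{
    // "Show everything the agent generated for this trace in the lite edition."
    public static IEnumerable<ScenarioStory> ForTraceAndEdition(
        IEnumerable<ScenarioStory> stories, string traceId, string edition) =>
        stories.Where(s => s.TraceId == traceId && s.Edition == edition);
}
```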
β Summary¶
The Test Generator Agent enhances the Studio UX by:
- π Emitting clear, role/edition-aware scenario summaries
- π§ Explaining its AI reasoning and augmentation history
- π Supporting in-place review, edit, and regeneration
- π§ͺ Closing the gap between prompt β code β QA validation
- π Maintaining trace-safe, auditable test metadata in human-readable form
This turns Studio into an AI-augmented test design surface β not just a dashboard.
π Metrics, Coverage Impact, and Validation Reports¶
As a key QA automation agent, the Test Generator Agent must not only generate test assets β it must also:
- π Measure what it covers
- π Track its impact on system-wide coverage
- π§Ύ Provide clear, reportable outputs for QA, CI/CD, and Studio dashboards
Its metrics and validation system makes test augmentation quantifiable, auditable, and optimizable over time.
π¦ Core Metrics Emitted¶
| Metric | Description | Format |
|---|---|---|
| `testgen.scenario.count` | Total scenarios generated per trace/session | Integer |
| `testgen.methods.appended` | Number of unit/integration test methods added | Integer |
| `testgen.coverage.delta` | % increase in test coverage after augmentation | Float (0β100) |
| `testgen.role.variants.tested` | New role-action paths added | List of roleΓaction |
| `testgen.retry.count` | Total retries performed per trace ID | Integer |
| `testgen.qa.acceptance_rate` | Ratio of QA-accepted to proposed scenarios | Percentage |
| `testgen.enrichment.tags` | Scenario tags emitted (e.g., @security, @edge) | Count per tag |
| `testgen.prompt.success_rate` | % of successful test generations per prompt | Percentage |
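How these metrics are emitted is up to the hosting runtime; a minimal sketch using .NET's System.Diagnostics.Metrics API (the meter name and tag choice are assumptions) could look like this:

```csharp
using System.Collections.Generic;
using System.Diagnostics.Metrics;

public static class TestGenMetrics
{
    // Meter name is an assumption; the instrument names mirror the table above.
    private static readonly Meter Meter = new("connectsoft.testgen");

    private static readonly Counter<int> ScenarioCount =
        Meter.CreateCounter<int>("testgen.scenario.count");
    private static readonly Counter<int> RetryCount =
        Meter.CreateCounter<int>("testgen.retry.count");
    private static readonly Histogram<double> CoverageDelta =
        Meter.CreateHistogram<double>("testgen.coverage.delta");

    public static void RecordSession(string traceId, int scenarios, int retries, double coverageDeltaPct)
    {
        var traceTag = new KeyValuePair<string, object?>("trace_id", traceId);

        ScenarioCount.Add(scenarios, traceTag);
        RetryCount.Add(retries, traceTag);
        CoverageDelta.Record(coverageDeltaPct, traceTag);
    }
}
```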
π Coverage Impact Calculation¶
The agent hooks into the Test Coverage Validator Agent, comparing:
- Pre-augmentation state
- Post-augmentation state
Using metrics such as:
- Handlers with β₯1 unit test
- DTO fields with negative test cases
- Roles Γ Edition covered
- Gherkin `.feature` step completeness
- Scenarios with real assertions (vs. placeholders)
β It then emits a delta report, like:
trace_id: invoice-2025-0147
coverage_before:
unit_tests: 3
feature_scenarios: 1
roles_tested: [FinanceManager]
coverage_after:
unit_tests: 5
feature_scenarios: 3
roles_tested: [FinanceManager, Guest, CFO]
delta:
feature_scenarios: +2
unit_tests: +2
roles_tested: +2
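The delta report above is essentially a field-by-field comparison of the two snapshots; a minimal sketch with hypothetical record shapes:

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical snapshot shape mirroring the coverage_before / coverage_after blocks above.
public record CoverageSnapshot(int UnitTests, int FeatureScenarios, IReadOnlyList<string> RolesTested);

public record CoverageDeltaReport(int UnitTests, int FeatureScenarios, int RolesTested);

public static class CoverageDeltaCalculator
{
    public static CoverageDeltaReport Compute(CoverageSnapshot before, CoverageSnapshot after) =>
        new(
            UnitTests: after.UnitTests - before.UnitTests,
            FeatureScenarios: after.FeatureScenarios - before.FeatureScenarios,
            // Only count roles that are newly covered after augmentation.
            RolesTested: after.RolesTested.Except(before.RolesTested).Count());
}
```

For the `invoice-2025-0147` example above, this yields +2 unit tests, +2 feature scenarios, and +2 newly tested roles (Guest, CFO).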
π Validation Report Example¶
{
"trace_id": "refund-2025-0143",
"status": "success",
"scenarios_added": 2,
"unit_tests_appended": 1,
"qa_feedback": {
"accepted": 2,
"rejected": 0,
"requires_followup": false
},
"assertions_present": true,
"tags": ["edge", "security", "edition:lite"]
}
β This metadata is pushed to:
- π Studio dashboards
- π QA Engineer Agent reports
- π€ PR annotations (via Pull Request Creator Agent)
π§ͺ Metrics for Studio Dashboards¶
| Dashboard | Tracked Metrics |
|---|---|
| Test Coverage Heatmap | Trace ID Γ Scenario count, role coverage, edition paths |
| Prompt Coverage Map | % of QA prompts that produced accepted scenarios |
| Security Validation Grid | Role escalation/denial paths added via Test Generator |
| Regression Readiness | Bug-to-test trace validation reports |
| Edition Completeness Matrix | lite, pro, enterprise scenario count per use case |
π Markdown Summary Metrics (QA-Friendly)¶
### π§Ύ Test Augmentation Summary β CreateInvoiceHandler
π Trace: invoice-2025-0142
π Prompt: βWhat if invoice already exists?β
β Tests Added:
- Scenarios: 2
- Unit Tests: 1
- Roles Added: [CFO, Guest]
π Tags:
- @security
- @edge
- @edition:enterprise
π§ QA Feedback:
- β Accepted: 2
- β Rejected: 0
- π‘ Retry: 0
π Coverage Delta: +22%
β Summary¶
The Test Generator Agent produces clear, actionable QA metrics including:
- π Scenario count, retry count, role path coverage
- π Coverage delta with before/after snapshots
- π§Ύ QA acceptance and reasoning logs
- π Retry efficiency and feedback loop summaries
- π Studio and markdown integration for visibility and planning
This makes the agent not just a test creator, but a test strategist with measurable value.
β Final Summary¶
The Test Generator Agent is the AI-first, prompt-aware, and observability-augmented testing assistant within the ConnectSoft QA Engineering Cluster. It exists to:
Proactively augment test coverage using human prompts, trace analysis, telemetry, and domain understanding β filling the gaps left by static test generation.
It complements the Test Case Generator Agent by operating where intelligence, behavior, and reasoning are required to suggest:
- βοΈ Exploratory and edge-case tests
- π Rich BDD scenarios
- π Test expansions based on feedback, bug reports, and edition paths
- π Structured, traceable QA metadata
- π€ Collaboration cycles with QA and Studio
π§© Feature Recap¶
| Area | Capabilities |
|---|---|
| Prompt-to-Test | QA enters βwhat ifβ β agent emits .feature, test methods, markdown |
| Gap Closure | Detects missing roles, editions, assertion cases |
| Security Testing | Suggests 401/403, escalation, abuse prevention tests |
| Edition Awareness | Handles lite, pro, enterprise test variants |
| Self-Evaluation | Auto-detects test gaps via metadata, DTO rules, or bug history |
| Studio Integration | Preview + accept/reject cycle, prompt tracing, markdown summaries |
| Memory & Embeddings | Learns from previous prompts, test structures, DTO patterns |
| Retry & Corrections | Human feedback loop, retry metadata, audit tags |
| Observability-Aware | Generates tests based on telemetry, logs, and event traces |
| Test Impact Reports | Delta analysis for QA coverage, trace enrichment, edition completeness |
π Test Generator vs. Test Case Generator β Final Comparison¶
| Feature | Test Case Generator | Test Generator Agent |
|---|---|---|
| π― Trigger | Static artifact (handler, controller, blueprint) | Prompt, test gap, QA input, telemetry, bug |
| π§± Input | Code structure, DTO, blueprint | Observability, prompts, bugs, gaps, QA reviews |
| π€ Output | `.cs`, `.feature`, `test-metadata.yaml` | `.feature`, augmented tests, markdown summaries |
| π Retry | On failure or missing test | On QA rejection, prompt retry, gap detection |
| π€ Human-In-Loop | Rare | Core interaction pattern (Studio QA loop) |
| π Coverage Role | Baseline test scaffolding | Strategic augmentation, behavioral completeness |
| π Security & Role Paths | Limited | Robust role variant & access denial coverage |
| π§ Skills | Deterministic, rule-based generation | AI prompt planners, OpenAI-driven scenario simulation |
| π BDD Integration | Generated from handler/ports | Enriched from human prompts and coverage analysis |
| π§ Observability | Not integrated | Uses spans, logs, runtime feedback |
| π Trace Tagging | Static alignment | Dynamic + revision-aware + feedback-tagged |
| π§ͺ Use Case | βGenerate initial test setβ | βClose gaps, explore untested paths, simulate user behaviorβ |
π Closing Notes¶
The Test Generator Agent is:
- π§ An intelligent QA collaborator
- π A scenario inventor and test storyteller
- π A self-correcting system enhancer
- π A measurable contributor to QA success
It helps ConnectSoft achieve a full-stack, AI-augmented, and observability-driven software testing platform β at scale.