Skip to content

πŸ§ͺ Usability Testing Agent

🎯 Purpose

The Usability Testing Agent is an evaluation-focused, quality-assurance agent within the ConnectSoft AI Software Factory responsible for post-design usability testing, heuristic evaluation, cognitive walkthrough, accessibility compliance testing, and design validation against established usability standards.

It evaluates every design artifact β€” from wireframes and prototypes to implemented UI components β€” against recognized usability heuristics and accessibility guidelines, producing structured reports with actionable improvement recommendations.

It doesn't just find problems β€” it quantifies usability quality, maps issues to established heuristics, and provides prioritized, evidence-based recommendations that drive measurable design improvement.


🧠 Core Role in the Factory

The Usability Testing Agent serves as the quality gate for user experience in the Research and UX/UI Design cluster. It ensures that designs are not just visually appealing but functionally usable, cognitively efficient, and accessible before they reach engineering implementation.


🧩 Position in the Research and UX/UI Design Cluster

Layer Cluster Description
πŸ§ͺ Design Evaluator Research and UX/UI Design Validates designs against usability heuristics and standards
β™Ώ Accessibility Auditor Research and UX/UI Design Tests design compliance with WCAG and accessibility guidelines
πŸ“Š Quality Scorer Research and UX/UI Design Produces quantitative usability scores for comparison and tracking
flowchart TD
    UXD[UX Designer Agent] -->|ui_design_completed| UTA[Usability Testing Agent]
    UID[UI Designer Agent] -->|prototype_ready| UTA
    UR[User Researcher Agent] -->|research_insights_available| UTA
    AEA[Accessibility Engineer Agent] -->|accessibility_audit_requested| UTA
    UTA --> UXD
    UTA --> UID
    UTA --> AEA
    UTA --> UR
Hold "Alt" / "Option" to enable pan & zoom

πŸ”„ Triggering Events

Event Trigger Description
ui_design_completed Completed design requires usability evaluation before handoff to engineering
prototype_ready Interactive prototype available for cognitive walkthrough and task analysis
accessibility_audit_requested Explicit request for accessibility compliance testing of design artifacts
design_iteration_completed Updated design iteration needs re-evaluation against previous findings
usability_regression_detected Monitoring indicates usability score degradation in a product area
competitive_benchmark_requested Request to evaluate design against competitor usability benchmarks

⏱ Trigger Frequency and Schedule

Mode Description
πŸ“₯ Event-driven Primary mode β€” activates on design completion or prototype readiness
πŸ•’ Scheduled Monthly usability score trending analysis across all product editions
🚨 Regression-driven Immediate activation when usability scores drop below established thresholds

πŸ’‘ Trigger Payload Example

{
  "trigger": "ui_design_completed",
  "design_id": "appointment-booking-flow-v3",
  "design_tool": "figma",
  "screens": ["search", "calendar", "confirmation", "error"],
  "persona_context": "clinic_admin",
  "edition": "pro",
  "evaluation_scope": ["heuristic", "cognitive_walkthrough", "accessibility"]
}

πŸ“¦ Responsibilities and Deliverables

🧰 Key Responsibilities

Responsibility Description
πŸ” Heuristic Evaluation Systematic evaluation against Nielsen's 10 usability heuristics and additional domain-specific heuristics
🧠 Cognitive Walkthrough Step-by-step task analysis simulating user goal completion to identify friction points
β™Ώ Accessibility Compliance Testing Evaluate designs against WCAG 2.1 AA guidelines at the design stage (pre-implementation)
πŸ“Š Usability Scoring Produce quantitative usability scores (SUS-inspired) for comparison, tracking, and benchmarking
βœ… Design Validation Verify designs meet established usability patterns and platform conventions
πŸ’‘ Improvement Recommendations Generate prioritized, actionable recommendations with severity, effort, and impact scoring
πŸ“ˆ Trend Analysis Track usability quality over time across design iterations and product editions
πŸ† Competitive Benchmarking Compare design usability against competitor products and industry standards

πŸ“€ Deliverables

Deliverable Type Description
πŸ“Š Usability Report Comprehensive evaluation with heuristic findings, severity ratings, and usability scores
πŸ” Heuristic Evaluation Matrix Structured matrix mapping findings to specific heuristics with evidence and recommendations
🧠 Cognitive Walkthrough Report Step-by-step task analysis with success/failure predictions and friction point identification
β™Ώ Accessibility Evaluation Design-stage accessibility findings mapped to WCAG success criteria
πŸ’‘ Improvement Recommendations Prioritized list of design changes ranked by severity, effort, and expected impact
πŸ“ˆ Usability Trend Dashboard Data Historical scores for longitudinal tracking and regression detection
πŸ“š Usability Memory Index Historical store of evaluations for pattern recognition and recall

🧩 Example Output (YAML)

usability_report_id: ur-appointment-booking-v3-202606
design_id: appointment-booking-flow-v3
edition: pro
persona: clinic_admin
evaluation_date: "2026-06-15"

usability_score:
  overall: 78
  learnability: 82
  efficiency: 71
  memorability: 80
  error_tolerance: 68
  satisfaction: 85

heuristic_findings:
  - id: hf-001
    heuristic: "H1 - Visibility of system status"
    screen: calendar
    severity: major
    finding: "No loading indicator when fetching available slots"
    evidence: "Calendar transitions without feedback for 2-3 seconds"
    recommendation: "Add skeleton loader or spinner during slot retrieval"
    effort: low
    impact: high

  - id: hf-002
    heuristic: "H5 - Error prevention"
    screen: confirmation
    severity: critical
    finding: "Double-booking prevention not visible until submission"
    evidence: "Users can select conflicting time slots without warning"
    recommendation: "Show real-time conflict detection inline during slot selection"
    effort: medium
    impact: critical

  - id: hf-003
    heuristic: "H9 - Help users recognize, diagnose, and recover from errors"
    screen: error
    severity: moderate
    finding: "Error page lacks specific guidance for resolution"
    evidence: "Generic 'Something went wrong' with no actionable next steps"
    recommendation: "Provide contextual error messages with specific recovery actions"
    effort: low
    impact: high

cognitive_walkthrough:
  task: "Book an appointment for a new patient"
  steps:
    - step: 1
      action: "Search for available providers"
      success_prediction: high
      friction: none
    - step: 2
      action: "Select a time slot from the calendar"
      success_prediction: medium
      friction: "Calendar does not indicate provider availability at a glance"
    - step: 3
      action: "Confirm the booking"
      success_prediction: high
      friction: "Confirmation button not visually prominent"

accessibility_findings:
  - criterion: "1.4.3 Contrast (Minimum)"
    status: fail
    element: "Calendar inactive dates"
    contrast_ratio: 2.8
    required_ratio: 4.5
    recommendation: "Increase text contrast on inactive calendar dates"
  - criterion: "2.4.7 Focus Visible"
    status: fail
    element: "Time slot selection buttons"
    recommendation: "Add visible focus indicator for keyboard navigation"

improvement_priority:
  critical:
    - "Add real-time conflict detection during slot selection"
  high:
    - "Add loading indicator for calendar slot retrieval"
    - "Provide contextual error messages with recovery actions"
    - "Fix contrast on inactive calendar dates"
  medium:
    - "Add visible focus indicators on time slot buttons"
    - "Improve confirmation button visual prominence"
  low:
    - "Add provider availability indicators to calendar view"

🀝 Collaboration Interfaces

The Usability Testing Agent operates as a quality feedback loop, consuming design artifacts and returning structured evaluations that drive design improvement.

πŸ”„ Inbound Interfaces (Receives Data From)

Source Agent / System Interface Type Purpose
🎨 UX Designer Agent Event: ui_design_completed Provides completed designs for usability evaluation
πŸ–ΌοΈ UI Designer Agent Event: prototype_ready Provides interactive prototypes for cognitive walkthrough
πŸ”¬ User Researcher Agent Event: research_insights_available Provides user research context to inform evaluation focus
β™Ώ Accessibility Engineer Agent Event: accessibility_audit_requested Requests design-stage accessibility evaluation

πŸ“€ Outbound Interfaces (Sends Data To)

Target Agent / System Interface Type Purpose
🎨 UX Designer Agent Event: usability_report_ready Returns evaluation findings for design iteration
πŸ–ΌοΈ UI Designer Agent Event: design_improvements_identified Provides specific visual and interaction improvement recommendations
β™Ώ Accessibility Engineer Agent Event: design_a11y_findings_ready Shares design-stage accessibility issues for tracking
πŸ”¬ User Researcher Agent Event: usability_insights_generated Feeds usability patterns back into research knowledge
πŸ“₯ Memory Indexing System Internal Save Event Stores evaluation history for trend analysis and recall

πŸ•ΈοΈ Agent Interaction Graph

flowchart LR
    UXD[UX Designer] --> UTA[Usability Testing Agent]
    UID[UI Designer] --> UTA
    UR[User Researcher] --> UTA
    UTA --> UXD
    UTA --> UID
    UTA --> AEA[Accessibility Engineer]
    UTA --> UR
Hold "Alt" / "Option" to enable pan & zoom

🧠 Memory and Knowledge

πŸ“š Preloaded Knowledge

Knowledge Domain Description
πŸ” Nielsen's 10 Usability Heuristics Visibility of system status, match with real world, user control, consistency, error prevention, recognition, flexibility, aesthetic design, error recovery, help/documentation
🧠 Cognitive Walkthrough Methodology Goal-action-feedback analysis, learnability assessment, exploration vs. instruction paradigms
β™Ώ WCAG 2.1 Guidelines All Level A and AA success criteria applicable at the design stage
πŸ“Š System Usability Scale (SUS) Standardized usability scoring methodology and benchmarking scales
🎯 Platform Conventions Material Design, Human Interface Guidelines, Fluent Design usability patterns
πŸ“‹ Severity Rating Frameworks Nielsen severity scale (cosmetic β†’ catastrophic), effort/impact matrices
πŸ† Competitive Usability Benchmarks Industry-standard usability scores by product category and vertical

🧩 Dynamic Knowledge (Updated During Execution)

Source Type of Knowledge
UX Designer Agent Current design patterns, interaction models, and user flow decisions
UI Designer Agent Visual hierarchy, component library, and platform target
User Researcher Agent User behavior patterns, pain points, and task completion data
Accessibility Engineer Agent Implementation-stage a11y findings for correlation
Memory Store Historical usability evaluations, trend data, and recurring issue patterns

🧬 Semantic Memory Embeddings

The agent stores and retrieves:

  • Past usability evaluations by design type and component category
  • Recurring usability patterns for proactive detection in new designs
  • Improvement recommendation effectiveness based on before/after score comparisons
  • Accessibility findings at the design stage for correlation with implementation findings

πŸ” Knowledge Update Policies

Type Update Frequency Notes
Usability Heuristics Manual or infrequent Core heuristics are stable; domain-specific additions version-controlled
Platform Conventions On platform guideline updates Material Design, HIG, Fluent updates trigger knowledge refresh
Evaluation History Continuous Updated after every evaluation cycle
Memory Embeddings Continuous Updated after every usability report generation

βœ… Validation

πŸ” Validation Objectives

  • Confirm that all design screens are evaluated (no blind spots in coverage)
  • Ensure heuristic findings are evidence-backed with specific screen references
  • Verify accessibility findings map to specific WCAG success criteria
  • Validate recommendations include severity, effort, and impact ratings
  • Ensure usability scores are calculated consistently for trend comparability

πŸ§ͺ Types of Validation Checks

Layer Validation Logic
πŸ“Š Coverage Completeness All screens in the design are included in the evaluation
πŸ” Evidence Backing Every finding has a specific screen reference and observable evidence
β™Ώ WCAG Mapping Accessibility findings are mapped to specific WCAG success criteria
πŸ’‘ Recommendation Quality Recommendations include severity, effort estimate, and impact prediction
πŸ“ˆ Score Consistency Usability scores use consistent methodology for cross-evaluation comparison
🧾 Output Schema Compliance Report structure validated against expected YAML schema

⚠️ Flagging Risky Outputs

Scenario Action Taken
Screen not evaluated Flag as incomplete_coverage: true
Finding without evidence Flag as unsubstantiated_finding: true and request clarification
Critical accessibility failure Escalate to Accessibility Engineer Agent immediately
Usability score below threshold Flag as usability_regression: true and notify UX Designer
Recommendation missing effort/impact Flag as incomplete_recommendation: true

πŸ§ͺ Validation Result Tags

validation:
  status: passed
  screen_coverage: 100%
  evidence_backing: 98%
  wcag_mapping: complete
  recommendation_quality: valid
  score_methodology: SUS_v2
  trace_id: "evt-design-eval-appointment-v3"

πŸ”„ Process Flow

βš™οΈ High-Level Execution Phases

flowchart TD
    A[Start: Design Evaluation Triggered] --> B[Design Artifact Intake]
    B --> C[Heuristic Evaluation]
    C --> D[Cognitive Walkthrough]
    D --> E[Accessibility Compliance Check]
    E --> F[Usability Scoring]
    F --> G[Improvement Recommendation Generation]
    G --> H[Report Assembly + Memory Indexing]
Hold "Alt" / "Option" to enable pan & zoom

🧩 Detailed Process Breakdown

Step Name Description
1 Design Artifact Intake Ingest design files, prototype links, screen inventory, and persona context
2 Heuristic Evaluation Systematically evaluate each screen against all applicable heuristics
3 Cognitive Walkthrough Simulate key user tasks step-by-step, predicting success/failure at each action
4 Accessibility Compliance Check Evaluate design against WCAG 2.1 AA criteria applicable at design stage
5 Usability Scoring Calculate composite and dimension-specific usability scores
6 Recommendation Generation Prioritize improvements by severity, effort, and expected impact
7 Report Assembly Compile all findings into structured report and store in memory

βœ… Summary

The Usability Testing Agent is the design quality guardian of the ConnectSoft AI Software Factory β€” ensuring that every design is evaluated against recognized standards before reaching engineering.

It answers:

  • "Does this design follow established usability heuristics?"
  • "Where will users get confused or stuck?"
  • "Does this design meet accessibility standards at the design stage?"
  • "How does our usability compare across iterations and competitors?"
  • "What are the highest-impact improvements we can make?"

Without this agent, usability issues are discovered late β€” in testing or production. With it, design quality is measured, tracked, and improved before a single line of code is written.