🛡️ Privacy Compliance Agent Specification¶

🧠 Purpose¶

The Privacy Compliance Agent is a dedicated agent in the Security & Governance Cluster of the ConnectSoft AI Software Factory. Its mission is to autonomously audit and enforce privacy compliance across every SaaS product, service, and microservice generated by the platform.

It ensures that personal, sensitive, and health-related data is processed in full alignment with laws like GDPR, CCPA, HIPAA, and any applicable ConnectSoft internal privacy policies.

"You can't build 3,000+ SaaS systems and stay compliant by chance — this agent makes privacy-by-design automatic and traceable."

🧭 Role in the Platform¶

The agent operates in the post-architecture, post-generation phase, once entity models, APIs, and storage plans are defined — and before production deployment.

It is triggered:

After Data Architect Agent finalizes schemas
After Solution Architect Agent emits domain models and service boundaries
After DevOps Architect Agent confirms data stores and regions
Before any SaaS instance is marked deployable

🔄 Position in the Agent Lifecycle¶

flowchart LR
    DA[📦 Data Architect Agent]
    SA[🧩 Solution Architect Agent]
    DEV[⚙️ DevOps Architect Agent]
    PC[🛡️ Privacy Compliance Agent]
    LEGAL[📑 LegalOps Agent]

    DA --> PC
    SA --> PC
    DEV --> PC
    PC --> LEGAL

🔐 What This Agent Ensures¶

Principle	Enforcement
Lawful Basis for Processing	All data access must map to a declared legal basis (e.g., consent, contract).
PII/PHI Classification	All fields are scanned and marked (e.g., `email`, `healthData`, `geoLocation`).
Cross-Border Awareness	Warn if data is stored or accessed outside allowed regions (e.g., EU-only tenants).
Retention Policy Boundaries	Validates TTL, auto-deletion logic, and data aging policies.
Opt-out & Erasure Support	Confirms API support for data subject rights (e.g., `DELETE /user/:id`).
Consent Path Tracing	Ensures consent is explicit, timestamped, revocable, and linked to processing reason.
Vendor / Processor Tracking	Flags any untracked data processors (e.g., third-party APIs without a DPA).

📎 What It Audits¶

Layer	Focus
Entity Models	PII, sensitive types, retention tags, metadata annotations
API Specs	Parameter leakage, unsafe methods, missing erasure/delete endpoints
Storage Plans	Region-mapping, encryption-at-rest, shared tenancy risk
Auth Schemas	Role visibility, over-permissioned access to PII
External Integrations	Unlisted processors, unverified consent pathways

📋 Strategic Capabilities¶

Capability	Description
🔍 Privacy Field Detection	Uses NLP + schema analysis to tag personal and health data types
📜 Regulatory Mapping	Links violations directly to GDPR/CCPA/HIPAA article numbers
📊 Risk Quantification	Scores risk per tenant, region, and module with CVSS-style output
📤 Legal Artifacts	Generates `privacy-audit-report.md`, `data-classification.json`, `compliance-risk-matrix.yaml`
✅ Auto-Validation	Marks services ready for deployment if they meet all scoped compliance checks

✅ Summary¶

The Privacy Compliance Agent makes privacy-by-design automatic
It is the gatekeeper between architecture and compliant deployment
It ensures that every generated system is legally safe, globally aligned, and ethically responsible
It protects both user rights and ConnectSoft’s regulatory standing in every market

📋 Responsibilities¶

The Privacy Compliance Agent is entrusted with the formal, autonomous enforcement of data privacy rules across every system generated by the ConnectSoft AI Software Factory. Its responsibilities span from detection to classification, violation analysis, and compliance certification for legal and deployment readiness.

✅ Core Responsibilities¶

Category	Responsibility
Data Classification	Identify PII, PHI, SPI (Sensitive Personal Information), and other regulated fields in data models.
Regulatory Mapping	Map each detected field, usage pattern, or violation to relevant articles in GDPR, CCPA, HIPAA, etc.
Retention Policy Validation	Analyze whether time-to-live (TTL), expiration, or deletion policies align with legal data minimization principles.
Cross-Border Data Flow Review	Detect and flag cross-region data transfers that violate data residency or localization rules (e.g., GDPR Article 44).
Consent Verification	Ensure every personal data use is backed by an appropriate and traceable legal basis — including consent logs, scopes, and revocation.
Erasure & Data Rights APIs	Validate support for user rights: access, portability, correction, objection, and erasure (GDPR Articles 15–21, CCPA §1798.105).
Processor/Vendor Auditing	Verify that any third-party data processors (e.g., analytics, marketing, AI APIs) are declared and governed.
API Endpoint Audit	Confirm endpoints don’t leak personal data unintentionally (e.g., via logs, URLs, or overexposed GET params).
Authorization Scope Checks	Detect whether roles (e.g., admin, support) have excessive access to sensitive data not required for their function.
Compliance Readiness Report Generation	Generate full audit documentation, suitable for legal review, GitOps publishing, or regulatory requests.

📤 Expected Outputs (Preview)¶

Artifact	Description
`data-classification.json`	JSON map of all detected PII/PHI fields, classification types, sensitivity score.
`privacy-audit-report.md`	Markdown audit summary: compliance gaps, severity, remediation tips, and legal mapping.
`compliance-risk-matrix.yaml`	Heatmap per service/module/tenant: risk level, categories, and failing controls.
`compliance-signed.yaml`	Certificate artifact indicating policy-aligned components for legal approval and DevOps gate checks.

📋 Regulatory Responsibility Matrix¶

Regulation	Agent Responsibility
GDPR	Article 5 (Principles), Article 6 (Legal Basis), Article 15–20 (Rights), Article 32 (Security), Article 44 (Transfers)
CCPA / CPRA	§§ 1798.100–199: Disclosure, access, opt-out, data sale, deletion, minimization
HIPAA (where applicable)	Privacy Rule – PHI classification, access auditing, retention
ISO 27701 / SOC 2	Documentation, traceability, incident readiness (non-binding, optional modules)

📎 Role in the SaaS Factory Lifecycle¶

Lifecycle Stage	Responsibility
Post-entity modeling	Classify and annotate all data fields with risk and legal flags
Post-API definition	Verify endpoint safety, consent path coverage, and access control alignment
Pre-deployment	Generate compliance readiness report and emit `PrivacyComplianceReady` event
Post-patch/retest	Rerun compliance checks on changed models, APIs, or processor usage (after security or legal change)

✅ Summary¶

The agent is responsible for:

Scanning data surfaces
Detecting privacy scope issues
Validating retention, consent, and rights support
Documenting compliance and risks
Blocking deployment when violations are found

This makes it a critical compliance gatekeeper in every ConnectSoft SaaS product flow.

📥 Inputs Consumed¶

The Privacy Compliance Agent requires a multi-layered, contextual input set to perform accurate, regulation-aligned privacy analysis. It consumes artifacts from upstream architectural agents, environment configuration files, data modeling sources, and tenant metadata to generate a comprehensive privacy map.

📂 Core Inputs¶

Artifact	Source Agent	Description
`entity-models.yaml`	Data Architect Agent	Full schema of entities, fields, types, and ownership (includes annotations like `isSensitive`, `isEncrypted`).
`openapi.yaml`	API Gateway Architect Agent	Public/external API surface, parameter visibility, and response schema.
`storage-map.yaml`	DevOps Architect Agent	Details of all data storage: region, encryption, tenancy mode, TTL, and access policies.
`security-zones.yaml`	Security Architect Agent	Defines trust boundaries, admin scopes, and processor separation.
`tenant-profiles.json`	Solution Architect Agent	Indicates data residency constraints, tenant geography, legal jurisdiction, and privacy modes (e.g., "strict", "standard").
`consent-config.yaml`	Identity Agent or Product Owner	Describes how and when user consent is captured, revoked, and tracked.
`third-party-integrations.yaml`	Application Architect Agent	List of all vendor integrations, whether they process PII, and if DPAs are in place.
`retention-policies.yaml`	Infrastructure Architect Agent	Declared data aging logic per table/entity/zone.

🧠 Semantic Inputs from Orchestration¶

Semantic Parameter	Purpose
`regulatoryScope`	Defines applicable laws (e.g., GDPR+CCPA, or HIPAA only) for multi-tenant/multi-region SaaS.
`enforceStrictMode`	Forces high-sensitivity mode (fail on any unclear classification or missing consent log).
`sessionId`, `traceId`	Traceability for all compliance checks and violations found.
`scanScope`	Can be limited to a specific tenant, module, or feature (e.g., `notifications`, `checkout`, etc.).

🧾 Example: `entity-models.yaml`¶

User:
  fields:
    id: { type: uuid }
    name: { type: string }
    email: { type: string, isSensitive: true, pii: true }
    phone: { type: string, pii: true, regionRestricted: true }
    dateOfBirth: { type: date, pii: true, isSensitive: true }
    isMarketingOptIn: { type: boolean, consentTracked: true }

📘 Example: `tenant-profiles.json`¶

{
  "vetclinic-premium": {
    "region": "EU",
    "jurisdiction": ["GDPR"],
    "privacyMode": "strict",
    "dataRetention": "5y",
    "dpaRequired": true
  }
}

📎 Real-Time Constraints¶

If privacyMode: strict, agent enforces:
✅ Opt-in consent must exist for marketing or sensitive processing
❌ No data may flow outside defined region
✅ All PII must be encrypted at rest
If sandbox: true, findings are marked non-blocking but logged and visualized
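
The strict-mode rules above could be evaluated roughly as follows. This is a minimal sketch, not the agent's actual implementation: the function name `check_strict_mode` and the per-field keys (`usedForMarketing`, `consentTracked`, `storageRegion`, `encryptedAtRest`) are hypothetical and only loosely follow the annotations shown in `entity-models.yaml`.

```python
from typing import Any


def check_strict_mode(tenant: dict[str, Any], fields: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Return findings for the strict-mode rules; record shapes are hypothetical."""
    findings: list[dict[str, Any]] = []
    if tenant.get("privacyMode") != "strict":
        return findings
    # In sandbox mode findings are still reported, but marked non-blocking.
    blocking = not tenant.get("sandbox", False)
    for f in fields:
        if f.get("usedForMarketing") and not f.get("consentTracked"):
            findings.append({"field": f["name"], "rule": "opt-in-consent", "blocking": blocking})
        if f.get("storageRegion") != tenant.get("region"):
            findings.append({"field": f["name"], "rule": "region", "blocking": blocking})
        if f.get("pii") and not f.get("encryptedAtRest"):
            findings.append({"field": f["name"], "rule": "encryption-at-rest", "blocking": blocking})
    return findings
```

A tenant whose `privacyMode` is not `strict` yields no findings in this sketch; how standard mode is handled is not specified here.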

🧠 Orchestrator Prompt Input (Extract)¶

regulatoryScope:
  - GDPR
  - CCPA
  - HIPAA
scanScope: full
enforceStrictMode: true
sessionId: compliance-sess-0427
traceId: proj-788-v1

✅ Summary¶

The agent consumes a cross-agent, regulation-scoped set of structured artifacts, including:

📦 Data schemas
🌍 Regional tenant metadata
🔒 Security zone mappings
📜 Consent and retention declarations
🌐 API parameter flows and PII leaks

These inputs enable precise privacy violation detection, traceable findings, and region-aware risk analysis.

📤 Outputs Produced¶

The Privacy Compliance Agent emits a suite of structured, reviewable, and traceable compliance artifacts designed for consumption by:

🧑‍⚖️ LegalOps Agents
🛡️ Security & Governance Dashboards
⚙️ DevOps approval gates
🧠 Studio visualizations
📁 GitOps workflows

Each output is tied to traceId, sessionId, and regulatoryScope.

📁 Output Directory Structure¶

/compliance-results/
  data-classification.json
  privacy-audit-report.md
  compliance-risk-matrix.yaml
  compliance-summary.md
  remediation-recommendations.json
  privacy-execution-metadata.json

📦 Output Artifact Details¶

📘 `data-classification.json`¶

Maps all fields from entity models and APIs with sensitivity tags.

{
  "User": {
    "email": { "type": "string", "classification": "PII", "regionRestricted": true },
    "phone": { "type": "string", "classification": "PII", "retention": "5y" },
    "dob": { "type": "date", "classification": "Sensitive", "legalBasis": "consent" }
  }
}

📘 `privacy-audit-report.md`¶

Markdown summary of all detected violations and compliance coverage.

# ConnectSoft Privacy Audit Report

**Trace:** proj-788-v1  
**Session:** compliance-sess-0427  
**Tenant:** vetclinic-premium  
**Regulations Applied:** GDPR, CCPA

---

## 🔐 Critical Violations
- **Missing Legal Basis:** `User.email` has no consent or contract scope
- **Cross-Border Risk:** BookingService stores data outside tenant region (EU → US)

---

## ✅ Coverage Summary

- Classified Entities: 14  
- PII Fields: 22  
- Verified Consents: 18  
- Outstanding Remediations: 3

📘 `compliance-risk-matrix.yaml`¶

Heatmap of violations by service/module/tenant.

services:
  booking-api:
    critical: 1
    medium: 2
    low: 1
    compliant: false
tenants:
  vetclinic-premium:
    riskScore: 8.4
    violations: 3
    compliant: false
global:
  totalViolations: 6
  totalCompliantServices: 4 of 6
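
A matrix of this shape could be aggregated from a flat list of findings. The sketch below assumes each finding carries hypothetical `service` and `severity` keys, and it assumes a service counts as compliant only when it has no findings at all; the real compliance rule may be more lenient.

```python
from collections import Counter
from typing import Any


def build_risk_matrix(findings: list[dict[str, str]], services: list[str]) -> dict[str, Any]:
    """Aggregate findings (each with 'service' and 'severity') per service."""
    matrix: dict[str, Any] = {"services": {}, "global": {}}
    for svc in services:
        counts = Counter(f["severity"] for f in findings if f["service"] == svc)
        # Assumption: a service is compliant only when it has zero findings.
        matrix["services"][svc] = {**counts, "compliant": not counts}
    compliant = sum(1 for s in matrix["services"].values() if s["compliant"])
    matrix["global"] = {
        "totalViolations": len(findings),
        "totalCompliantServices": f"{compliant} of {len(services)}",
    }
    return matrix
```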

📘 `remediation-recommendations.json`¶

Machine-readable suggestions for auto-fix workflows.

[
  {
    "entity": "User",
    "field": "email",
    "issue": "Missing legal basis",
    "suggestedFix": "Add explicit consent flow on registration"
  },
  {
    "api": "/api/invoices",
    "issue": "GET request exposes PII in URL",
    "suggestedFix": "Move sensitive identifiers to POST body or headers"
  }
]

📘 `privacy-execution-metadata.json`¶

Execution trace block, used by orchestrator and dashboards.

{
  "traceId": "proj-788-v1",
  "sessionId": "compliance-sess-0427",
  "agentId": "privacy-compliance-agent@1.3",
  "regulatoryScope": ["GDPR", "CCPA"],
  "scanCoverage": "89%",
  "confidence": 0.97,
  "violationsFound": 6,
  "completedAt": "2025-05-14T19:32:12Z"
}

🧩 Used By¶

Output	Consumed By
`privacy-audit-report.md`	LegalOps Agent, Studio UI
`data-classification.json`	Data Architect Agent, Audit Log Manager
`compliance-risk-matrix.yaml`	Governance dashboards
`remediation-recommendations.json`	Security Engineer Agent, FixBot
`privacy-execution-metadata.json`	Orchestrator, Audit pipeline

✅ Summary¶

The Privacy Compliance Agent produces:

📜 Legal-grade audit reports
📊 Structured risk matrices
🛠️ Actionable remediation guidance
🔁 Trace-linked metadata for audit traceability and GitOps storage

🧠 Knowledge Base¶

The Privacy Compliance Agent relies on a structured, curated, and extensible knowledge base that combines:

🧾 Legal frameworks (GDPR, CCPA, HIPAA, ISO 27701)
📦 Data classification standards (PII, PHI, SPI, financial, behavioral)
📘 Consent models, retention policies, and jurisdictional rules
🧠 Prior audit learnings via semantic memory (via MCP or vector DB)

📚 Embedded Legal Frameworks¶

Regulation	Articles/Sections Used
GDPR	Articles 5–6 (Lawfulness), 15–20 (Data Subject Rights), 25 (Privacy by Design), 32 (Security), 44–49 (Transfers)
CCPA/CPRA	§§ 1798.100–199: Disclosure, access, opt-out, deletion, sale definition
HIPAA	PHI classifications, access and disclosure controls, auditability, retention
ISO/IEC 27701	Mapping for privacy information management (PIMS) controls
SOC 2 Privacy TSC	Optional enhancements for observability-first governance

🧠 Domain Classifiers¶

Classifier	Examples
`PII`	`email`, `phone`, `address`, `IP`, `UUID`
`PHI`	`diagnosisCode`, `treatmentPlan`, `insuranceId`
`SPI` (Sensitive)	`race`, `religion`, `sexualOrientation`, `politicalViews`
`Financial`	`creditCardNumber`, `iban`, `taxId`
`Behavioral`	`clickPath`, `purchaseHistory`, `referringSource`

All classifiers include embedded pattern matchers, semantic field matchers, and schema type context (e.g., string, date, geo).

Basis	Usage
`Explicit Consent`	Must be logged, revocable, user-aware (e.g., marketing opt-in)
`Contract`	Account creation, user relationship
`Legal Obligation`	Tax reporting, anti-fraud
`Vital Interest`	Healthcare and safety exemptions
`Legitimate Interest`	Risk-based justification (with opt-out)

🔐 Retention Rule Models¶

Rule	Description
`Fixed TTL`	`User.email` retained for 5y post-deletion
`Event-Linked`	`Invoice.pdf` retained until `payment.status == settled`
`Jurisdictional Constraint`	`EU` data must be purged after `3y` inactivity, `US` after `7y`
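
The Jurisdictional Constraint row can be read as a per-region inactivity limit. The sketch below is illustrative only: `MAX_INACTIVE_YEARS` and `must_purge` are hypothetical names, and the limits are taken directly from the example row rather than from any statute.

```python
from datetime import date

# Illustrative limits from the Jurisdictional Constraint row (years of inactivity).
MAX_INACTIVE_YEARS = {"EU": 3, "US": 7}


def must_purge(region: str, last_active: date, today: date) -> bool:
    """True when a record has been inactive longer than its region allows."""
    limit = MAX_INACTIVE_YEARS[region]
    try:
        cutoff = last_active.replace(year=last_active.year + limit)
    except ValueError:
        # last_active is Feb 29 and the target year is not a leap year.
        cutoff = last_active.replace(year=last_active.year + limit, day=28)
    return today > cutoff
```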

🧠 MCP-Backed Semantic Memory¶

Use Case	Example Recall
Prior field classification	`"dob"` in `UserProfile` was flagged as sensitive under HIPAA in project `petcare-452`
Common violations	`"userId"` passed in URL path violated Article 5(1)(c) – data minimization
Consent gap reuse	`"optInTracking"` field in `AnalyticsConfig` lacked revocation in 3 past audits

🛠️ Fuzzable Compliance Anti-Patterns¶

These patterns are preloaded for detection:

GET /api/users?email=user@example.com → PII in URL
Missing DELETE /api/user/:id → Erasure failure
X-Tenant-Id without validation → Potential multi-tenant data leak
Region mismatch between storage.region and tenant.region
Fields marked required: true without consent tracking

📎 Trace-Linked Legal Mapping¶

Every output includes violationId → lawReference:

{
  "violationId": "VIO-8843",
  "classification": "email",
  "lawReference": "GDPR Article 6 – Lawfulness of Processing",
  "riskScore": 7.2
}

✅ Summary¶

The agent’s knowledge base spans:
📚 Legal codes
🔍 Data classification patterns
📜 Consent and retention logic
🧠 MCP-backed memory from prior audits
It enables accurate field classification, risk assignment, and regulatory traceability

🔁 Process Flow Overview¶

The Privacy Compliance Agent follows a multi-phase compliance verification pipeline that mimics how a human privacy analyst would inspect a SaaS system — only faster, deeper, and fully traceable. Its flow integrates detection, classification, legal mapping, and report generation across tenants, services, and regions.

🔄 High-Level Execution Phases¶

flowchart TD
    START["🚀 Trigger: StartAgentSession(PrivacyComplianceAgent)"]
    LOAD["📥 Load Inputs & Legal Scope"]
    CLASSIFY["🔍 Data Field Classification"]
    SCAN["🕵️ Privacy Violation Detection"]
    VALIDATE["🧪 Legal Mapping & Impact Scoring"]
    REPORT["📤 Generate Audit Reports"]
    ENDPOINTS["🔁 API + Consent Verification"]
    EXPORT["📁 Emit Artifacts + Trace Metadata"]

    START --> LOAD --> CLASSIFY --> SCAN --> VALIDATE --> ENDPOINTS --> REPORT --> EXPORT

🪜 Phase Descriptions¶

Phase	Description
1. Load Inputs	Ingests `entity-models.yaml`, `openapi.yaml`, `storage-map.yaml`, `tenant-profiles.json`, etc.
2. Classify Fields	Uses NLP, field heuristics, and semantic memory to tag PII, PHI, SPI, financial fields.
3. Violation Detection	Cross-checks storage, region, API parameters, roles, and zones against legal rules.
4. Legal Mapping	Assigns violations to GDPR, CCPA, HIPAA articles with CVSS-style risk scoring.
5. API Rights Checks	Detects whether endpoints support opt-out, erasure, consent flow, and access rights.
6. Report Generation	Builds audit summary, `data-classification.json`, `compliance-risk-matrix.yaml`, and PoC-style fix suggestions.
7. Artifact Emission	Emits all trace-tagged outputs for GitOps use, legal review, and CI/CD blocking.

🔁 Optional Flow Branches¶

Condition	Outcome
`retestMode = true`	Re-analyzes changed entities or endpoints since last scan
`enforceStrictMode = true`	Escalates warnings to errors; auto-fails if any classification or basis is missing
`scanScope = limited`	Runs only on specified tenants/modules/services

🧠 Metadata Tracking Throughout¶

Metadata Field	Attached To
`traceId`	Every log line, artifact, event
`sessionId`	Current execution scope
`tenantId`	Every data classification and violation
`regulatoryScope`	Guides what laws and articles are checked
`scanConfidence`	Result of ML-based field classification (`0.0–1.0`)

📋 Example Execution Log Summary¶

{
  "sessionId": "compliance-sess-0427",
  "traceId": "proj-788-v1",
  "totalFieldsScanned": 64,
  "piiDetected": 17,
  "regionViolations": 2,
  "consentMissing": 3,
  "reportGenerated": true,
  "status": "Non-compliant",
  "complianceScore": 72.5
}

📢 Events Emitted¶

Event	When
`PrivacyScanStarted`	Agent begins session
`PrivacyViolationDiscovered`	Finding severity ≥ medium
`PrivacyComplianceReady`	No violations or all in `low`/`informational` range
`PrivacyComplianceFailed`	Blocking errors found (region, consent, access gaps)

✅ Summary¶

The agent follows a 7-phase pipeline that:
🔍 Classifies fields
⚖️ Maps to law
🧪 Detects violations
📄 Emits complete legal-grade artifacts
Execution is fully traceable, retryable, and scope-limited for fast feedback

🧠 Classification Engine (Step-by-Step)¶

The Classification Engine is the first critical analysis phase of the Privacy Compliance Agent. It’s responsible for identifying and tagging personal, sensitive, financial, and health-related data fields from entity models, API inputs, and storage schemas — forming the foundation for privacy enforcement and legal mapping.

🔍 Step-by-Step Classification Pipeline¶

flowchart TD
    START["Start Classification"]
    EXTRACT["1️⃣ Extract Fields from Models + APIs"]
    MATCH["2️⃣ Apply Regex + NLP-based Classifier"]
    SEMMEM["3️⃣ Semantic Recall (MCP Memory)"]
    RANK["4️⃣ Confidence Scoring + Override Rules"]
    SCOPE["5️⃣ Apply Tenant/Region Constraints"]
    OUTPUT["6️⃣ Emit Classified Fields"]

    START --> EXTRACT --> MATCH --> SEMMEM --> RANK --> SCOPE --> OUTPUT

1️⃣ Extract Fields from Models¶

The agent processes:

entity-models.yaml
openapi.yaml (parameters + response types)
storage-map.yaml (region, TTL, access)

Example fields extracted:

User:
  - name
  - email
  - phone
  - dateOfBirth
  - isMarketingOptIn

2️⃣ Pattern Matching + Field Name NLP¶

Applies classification using:

Detector	Description
`RegexMatcher`	Detects fields like `email`, `phone`, `iban`, `ssn`, `ipAddress`
`NameVectorClassifier`	Uses word embeddings (e.g., `dateOfBirth` → `Sensitive`)
`Metadata Hint Analyzer`	Checks if fields include `isSensitive`, `requiresConsent`, `regionRestricted` flags
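
As a rough illustration of the first and third detectors, the sketch below matches field names against a few regular expressions and collects metadata hint flags. The patterns and the `match_field` function are hypothetical; a production `RegexMatcher` would carry a far larger pattern set, and the embedding-based `NameVectorClassifier` is not shown.

```python
import re
from typing import Optional

# Hypothetical name patterns, for illustration only.
NAME_PATTERNS = {
    "PII": re.compile(r"(e[-_]?mail|phone|ssn|ip[-_]?address)", re.IGNORECASE),
    "Financial": re.compile(r"(iban|tax[-_]?id|credit[-_]?card)", re.IGNORECASE),
}
HINT_FLAGS = ("isSensitive", "requiresConsent", "regionRestricted")


def match_field(name: str, metadata: Optional[dict[str, bool]] = None) -> dict[str, object]:
    """Classify a field by name pattern and list any metadata hint flags set on it."""
    metadata = metadata or {}
    label = next((k for k, p in NAME_PATTERNS.items() if p.search(name)), None)
    hints = [flag for flag in HINT_FLAGS if metadata.get(flag)]
    return {"field": name, "classification": label, "hints": hints}
```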

3️⃣ Semantic Recall (via MCP)¶

The agent queries the vector DB to:

Retrieve prior matches of similar schemas (e.g., User, Patient, ClientInfo)
Reinforce field context (e.g., patientBirthDate is likely PHI)

Response:

{
  "match": "petco-internal#UserProfile",
  "field": "email",
  "classification": "PII",
  "confidence": 0.96
}

4️⃣ Confidence Scoring + Overrides¶

Rule	Behavior
`> 0.90`	Auto-classify as PII/PHI
`0.70–0.90`	Tentative tag, flagged for LegalOps Agent review
`< 0.70`	Skip unless `strictMode: true` or field appears in known schema set
Overrides	Honor any `@PrivacyType(PHI)` or `field.metadata.classification` tags from `entity-models.yaml`
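
The threshold table can be expressed as a single decision function. This is a sketch: `decide` and its return labels are hypothetical, and the table does not say what happens to a low-confidence field that is kept because of strict mode or a known schema, so the sketch assumes it is queued for review like a tentative tag.

```python
from typing import Optional


def decide(confidence: float, strict_mode: bool = False,
           in_known_schema: bool = False, override: Optional[str] = None) -> str:
    """Map a classifier confidence to an action, honoring explicit overrides."""
    if override is not None:
        return f"override:{override}"  # explicit annotation wins over the score
    if confidence > 0.90:
        return "auto-classify"
    if confidence >= 0.70:
        return "tentative-review"
    # Assumption: low-confidence fields that are kept go to review.
    return "tentative-review" if (strict_mode or in_known_schema) else "skip"
```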

5️⃣ Apply Tenant/Region Constraints¶

Field-level classification is region-scoped:

If tenant.region = EU, and field is email, classify under GDPR Article 6
If tenant.region = US, and field is diagnosisCode, flag as HIPAA PHI
If regionRestricted: true → mark for cross-border transfer validation
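
These three rules might be applied as follows. The sketch generalizes the `email` and `diagnosisCode` examples to their classifications (`PII` and `PHI`), which is an assumption; `regional_tags` and the tag strings are hypothetical.

```python
def regional_tags(classification: str, tenant_region: str,
                  region_restricted: bool = False) -> list[str]:
    """Attach region-scoped legal tags to an already classified field."""
    tags: list[str] = []
    if tenant_region == "EU" and classification == "PII":
        tags.append("GDPR Article 6")
    if tenant_region == "US" and classification == "PHI":
        tags.append("HIPAA PHI")
    if region_restricted:
        tags.append("cross-border-transfer-check")
    return tags
```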

6️⃣ Emit Structured Output¶

All results stored in data-classification.json:

{
  "User": {
    "email": { "classification": "PII", "confidence": 0.98, "regionRestricted": true },
    "dateOfBirth": { "classification": "Sensitive", "requiresConsent": true, "confidence": 0.93 },
    "phone": { "classification": "PII", "confidence": 0.87 }
  }
}

🧪 NLP Classifier Examples¶

Field Name	Classification	Reason
`taxId`	Financial (PII)	Matches regex and known pattern
`optInToTracking`	Consent-bound	Matches NLP embedding and prior audits
`allergies`	PHI	Health-related content pattern
`utmSource`	Non-sensitive	Behavioral, not PII

✅ Summary¶

The classification engine runs field-by-field tagging using:
Regex, NLP, prior memory, region constraints
Every field is annotated with:
📌 Classification
🔐 Consent/retention info
📈 Confidence
Outputs form the basis for legal rule mapping, risk scoring, and deployment readiness checks

🧩 Skills & Kernel Functions¶

The Privacy Compliance Agent is built as a skill-composed Semantic Kernel agent, where each phase of its audit and enforcement lifecycle is executed via modular, traceable kernel skills. These skills operate on structured artifacts and are scoped to privacy, regulatory mapping, and remediation logic.

🧠 Core Kernel Skills¶

Skill Name	Purpose
`ClassifyFieldsSkill`	Identifies PII, PHI, SPI, behavioral and financial fields in entity models and API specs.
`CrossRegionRiskSkill`	Validates if any sensitive fields are stored or processed outside allowed regional boundaries.
`RetentionPolicyAnalyzer`	Analyzes declared or inferred TTLs against jurisdictional retention requirements.
`ConsentPathValidator`	Ensures processing of personal data is legally based (e.g., via consent, contract) and revocable.
`ApiRightsScannerSkill`	Validates presence of erasure (`DELETE`), access, portability, and opt-out support in APIs.
`DataProcessorAuditSkill`	Verifies declared vs. undeclared third-party data processors and checks DPA presence.
`LegalBasisMapper`	Maps violations to GDPR, CCPA, HIPAA references with article-level precision.
`ComplianceRiskScorerSkill`	Computes CVSS-style risk and legal severity scores for each finding.
`RemediationPlannerSkill`	Generates fix suggestions (e.g., add retention rule, consent flag, move region).
`ComplianceReportEmitter`	Emits `privacy-audit-report.md`, `compliance-risk-matrix.yaml`, and `data-classification.json`.

🔁 Skill Execution Chain (Simplified)¶

ClassifyFieldsSkill
→ ConsentPathValidator
→ CrossRegionRiskSkill
→ RetentionPolicyAnalyzer
→ DataProcessorAuditSkill
→ ApiRightsScannerSkill
→ LegalBasisMapper
→ ComplianceRiskScorerSkill
→ RemediationPlannerSkill
→ ComplianceReportEmitter

Each skill emits structured telemetry (traceId, skillId, riskScore, target).
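
The chain and its telemetry can be pictured as a sequential pipeline over a shared context. The sketch below is plain Python and does not use the Semantic Kernel API; `run_chain`, the `Skill` type alias, and the context keys are hypothetical, with only the telemetry field names taken from the line above.

```python
from typing import Any, Callable

Skill = Callable[[dict[str, Any]], dict[str, Any]]


def run_chain(skills: list[tuple[str, Skill]], context: dict[str, Any]) -> list[dict[str, Any]]:
    """Run skills in order; each receives and returns the shared context."""
    telemetry: list[dict[str, Any]] = []
    for skill_id, skill in skills:
        context = skill(context)
        # One telemetry record per skill, tagged with the session's traceId.
        telemetry.append({
            "traceId": context["traceId"],
            "skillId": skill_id,
            "riskScore": context.get("riskScore"),
            "target": context.get("target"),
        })
    return telemetry
```

Each skill here is just a function from context to context, so a chain can be assembled or reordered without changing `run_chain` itself.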

🧠 Skill Execution Example: `ConsentPathValidator`¶

{
  "field": "email",
  "source": "User",
  "legalBasis": "missing",
  "recommendedBasis": "explicitConsent",
  "traceId": "proj-788-v1",
  "violationId": "GDPR-A6-EMAIL-MISSING"
}

📦 Skill Outputs → Artifact Mapping¶

Skill	Output	Destination
`ClassifyFieldsSkill`	Classified field records	→ `data-classification.json`
`ApiRightsScannerSkill`	Rights endpoint flags	→ `privacy-audit-report.md`
`LegalBasisMapper`	Law-linked violations	→ `compliance-risk-matrix.yaml`, `remediation-recommendations.json`
`RemediationPlannerSkill`	Suggested code/config changes	→ `remediation-recommendations.json`
`ComplianceReportEmitter`	Full report set	→ `compliance-results/` folder

📎 Skill Constraints & Safeguards¶

Constraint	Behavior
`enforceStrictMode: true`	Escalates medium-confidence classifications to blocking violations
`regulatoryScope`	Each skill activates only for selected law sets (e.g., `GDPR+CCPA`, or `HIPAA` only)
`sandboxMode: true`	Allows reporting but disables GitOps block events or deployment rejections

🧠 AI Prompt Injection in Skills (Simplified)¶

You are a privacy compliance validator. Analyze this field: `email`. It belongs to a `User` entity and appears in OpenAPI. Determine:
- Classification (PII, Sensitive, PHI)
- Consent required?
- Legal basis?
- Regional restriction?

✅ Summary¶

The Privacy Compliance Agent uses task-scoped semantic skills to execute its audits
Skills are composable, traceable, and compliance-aware
Outputs from each skill flow into compliance reports, dashboards, or blocking signals

🗣️ System Prompt¶

The System Prompt defines the agent’s role, boundaries, execution rules, and regulatory obligations. It is injected by the Orchestrator at session initialization and governs how the agent interprets laws, handles uncertainty, and emits artifacts.

This prompt ensures that the agent consistently enforces privacy-by-design principles across all ConnectSoft-generated SaaS systems.

🧠 System Prompt (v1.0)¶

You are the Privacy Compliance Agent within the ConnectSoft AI Software Factory.

Your role is to verify that all SaaS systems generated by the platform comply with
data privacy regulations including GDPR, CCPA, HIPAA, and internal ConnectSoft privacy policies.

You must:
- Analyze all entity models, APIs, storage plans, and integration maps
- Identify personal, sensitive, financial, and health-related fields
- Determine whether each field is classified appropriately and processed lawfully
- Validate the presence of consent, lawful basis, retention policies, and opt-out/erasure APIs
- Flag and rank any compliance risks based on severity and regulatory references
- Generate structured reports: `data-classification.json`, `privacy-audit-report.md`, `risk-matrix.yaml`
- Tag every artifact and action with `traceId`, `sessionId`, `tenantId`, and `regulatoryScope`

You must always operate under the following constraints:
- If `strictMode: true`, treat all medium-risk issues as deployment-blocking
- If `sandboxMode: true`, you may report but must not trigger any blocking events
- All memory recall must respect tenant boundaries and legal jurisdictions
- When confidence is below threshold, escalate violations for human or LegalOps Agent review

Output must be:
- Accurate and legally mapped (include law citations)
- GitOps-ready and CI/CD-compatible
- Machine-readable and human-auditable

Collaborate with:
- Data Architect Agent (for schemas)
- API Gateway Architect (for surface analysis)
- Security Architect Agent (for trust zone enforcement)
- LegalOps Agent (for remediation and exception handling)

You are the enforcement engine for privacy-by-design.
Be assertive, precise, and legally grounded.

📋 Key Prompt Principles¶

Principle	Effect
Zero Trust Privacy Enforcement	Assumes no field is safe until proven classified, consented, and retained legally
Policy-Aware Behavior	Behavior adapts based on `regulatoryScope`, `tenant.region`, `scanMode`, etc.
Multi-Agent Integration	Expects upstream context and emits downstream outputs for security, legal, and compliance flows
Traceable by Design	All outputs and actions must support auditability (`traceId`, `riskScore`, `violationId`)

📌 Prompt-Scoped Behavior Overrides¶

Flag	Influence
`enforceStrictMode: true`	Escalates warnings to violations, blocks output until resolved
`scanScope: [tenantA]`	Limits focus to only affected services/modules
`regulatoryScope: [GDPR, HIPAA]`	Activates articles 5, 6, 15–20, 32, 44; HIPAA privacy rules
`sandboxMode: true`	Allows simulation of findings without triggering CI/CD blocks

✅ Summary¶

The system prompt governs the agent’s:
🎯 Mission (enforce privacy laws)
🛡️ Execution constraints
📤 Output expectations
🤝 Inter-agent behavior
It guarantees repeatability, legal traceability, and factory-wide policy alignment

🧾 Input Prompt Template¶

The Input Prompt Template provides the Privacy Compliance Agent with the structured context it needs to perform scoped, regulation-aware audits across entities, tenants, and services.

This prompt is delivered as YAML or JSON by the Orchestrator, enriched with legal metadata and project-specific constraints.

📥 Standard YAML Input Format¶

traceId: proj-788-v1
sessionId: compliance-sess-0427
projectId: proj-788
environment: staging
tenantScope:
  - vetclinic-premium
  - petco-enterprise
scanScope: full
regulatoryScope:
  - GDPR
  - CCPA
enforceStrictMode: true
sandboxMode: false
retryOnViolation: true
memoryRecallEnabled: true
inputs:
  entityModelsPath: ./models/entity-models.yaml
  openApiSpecPath: ./apis/openapi.yaml
  storageMapPath: ./infra/storage-map.yaml
  tenantProfilesPath: ./compliance/tenant-profiles.json
  consentConfigPath: ./consent/consent-config.yaml
  retentionPoliciesPath: ./compliance/retention-policies.yaml
  thirdPartyMapPath: ./integrations/third-party-integrations.yaml

🧠 Key Input Fields Explained¶

Field	Description
`traceId`, `sessionId`	Full traceability across logs and reports
`tenantScope[]`	Targets tenants with custom jurisdiction or strict residency
`regulatoryScope[]`	Specifies which laws to enforce (GDPR, CCPA, HIPAA, or combined)
`enforceStrictMode`	If `true`, medium-confidence violations block deployment
`sandboxMode`	Enables dry-run audits without triggering failure conditions
`inputs.*Path`	File paths or object references to schema inputs (can be S3/Git URIs)

tenantScope:
  - vetclinic-premium
regulatoryScope:
  - GDPR
  - CCPA
scanScope: limited
enforceStrictMode: false
sandboxMode: true

→ This runs a scan limited to one tenant under two laws without blocking deployments (audit only).
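
How these flags combine into run behavior could be resolved up front, before any skill executes. The sketch below assumes the prompt has already been parsed into a dict and that `sandboxMode` takes precedence over `enforceStrictMode` (findings are reported but never block); `run_mode` and its output keys are hypothetical.

```python
from typing import Any


def run_mode(prompt: dict[str, Any]) -> dict[str, Any]:
    """Derive scan behavior from the behavior flags in the input prompt."""
    sandbox = bool(prompt.get("sandboxMode", False))
    strict = bool(prompt.get("enforceStrictMode", False))
    return {
        "tenants": list(prompt.get("tenantScope", [])),
        "laws": list(prompt.get("regulatoryScope", [])),
        # Assumption: sandbox wins over strict mode, so nothing blocks.
        "blocking": not sandbox,
        "escalateMediumConfidence": strict,
    }
```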

📎 Optional Fields for Targeted Execution¶

Field	Use
`targetService`	Scan one microservice (e.g., `billing-api`)
`skipRetiredFields`	Skip fields marked `deprecated: true`
`triggerOnDeploy`	Link scan to deployment pipeline or Git tag

🧠 Semantic Overrides¶

In advanced factory pipelines, the Orchestrator may inject natural-language prompts:

Run a privacy compliance scan on EU tenants using GDPR + HIPAA rules.  
Focus on cross-border data, consent compliance, and PHI exposure.  
Include full classification and recommendations, strict mode off.

→ Translated into YAML/JSON input before execution.

✅ Summary¶

The input prompt template:

Defines project, tenant, regulation, and scan boundaries
Specifies file paths or memory object references to needed artifacts
Controls behavior via `enforceStrictMode`, `sandboxMode`, and `retryOnViolation`
Ensures traceable, reproducible audit execution per ConnectSoft’s governance model

📤 Output Format and Structure¶

The Privacy Compliance Agent emits a bundle of structured and human-readable outputs. These artifacts serve as:

📜 Legal documentation for internal and external audits
🔒 Privacy enforcement checkpoints in CI/CD
📊 Input to dashboards and policy decisioning
🛠️ Remediation targets for Security, LegalOps, or Data Architect Agents

All outputs are trace-linked, GitOps-compatible, and machine-processable.

📁 Output Directory Layout¶

/compliance-results/
  data-classification.json
  privacy-audit-report.md
  compliance-risk-matrix.yaml
  remediation-recommendations.json
  privacy-execution-metadata.json
  compliance-findings.json

📘 Output Artifacts Explained¶

✅ `data-classification.json`¶

Contains detected field classifications:

{
  "User": {
    "email": { "classification": "PII", "confidence": 0.96, "regionRestricted": true },
    "dateOfBirth": { "classification": "Sensitive", "requiresConsent": true },
    "marketingOptIn": { "classification": "Behavioral", "consentLinked": true }
  }
}

📘 `privacy-audit-report.md`¶

Human-readable summary:

# Privacy Compliance Report

**Tenant:** vetclinic-premium  
**Trace:** proj-788-v1  
**Scope:** GDPR, CCPA  
**Result:** Non-Compliant (2 High, 1 Medium, 1 Low)

---

## 🔐 High-Risk Findings
- `User.email` missing lawful basis
- `Booking.invoicePdf` stored in US for EU tenant

## 🛠️ Recommendations
- Add explicit consent to `email` field
- Migrate invoice storage to EU region or redact

📘 `compliance-risk-matrix.yaml`¶

Structured risk scoring:

services:
  booking-api:
    high: 2
    medium: 1
    low: 1
    compliant: false
tenants:
  vetclinic-premium:
    region: EU
    violations: 4
    riskScore: 8.7
global:
  totalFindings: 6
  coverage: 91%

📘 `remediation-recommendations.json`¶

Fix suggestions (used by FixBot or LegalOps):

[
  {
    "field": "email",
    "issue": "Missing consent",
    "suggestedFix": "Add opt-in and lawful basis on collection"
  },
  {
    "storage": "invoicePdf",
    "issue": "Stored in US for EU tenant",
    "suggestedFix": "Relocate to EU blob storage or redact content"
  }
]

📘 `compliance-findings.json`¶

Canonical, structured findings list:

[
  {
    "id": "VIO-GDPR-A6-001",
    "severity": "high",
    "category": "Lawful Basis",
    "tenant": "vetclinic-premium",
    "field": "email",
    "law": "GDPR Article 6",
    "traceId": "proj-788-v1"
  }
]
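
A consumer such as a CI step could roll this list up by severity. The sketch below assumes only the `severity` and `tenant` keys shown in the sample; `summarize_findings` is a hypothetical helper, not part of the agent.

```python
import json
from collections import Counter


def summarize_findings(findings_json: str) -> dict:
    """Count findings by severity and list the affected tenants."""
    findings = json.loads(findings_json)
    by_severity = Counter(item["severity"] for item in findings)
    return {
        "total": len(findings),
        # Counter returns 0 for severities that never occur.
        "high": by_severity["high"],
        "medium": by_severity["medium"],
        "low": by_severity["low"],
        "tenants": sorted({item["tenant"] for item in findings}),
    }


sample = json.dumps([
    {"id": "VIO-GDPR-A6-001", "severity": "high", "tenant": "vetclinic-premium"},
])
print(summarize_findings(sample))
```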

📘 `privacy-execution-metadata.json`¶

Execution trace block:

{
  "traceId": "proj-788-v1",
  "sessionId": "compliance-sess-0427",
  "regulatoryScope": ["GDPR", "CCPA"],
  "completedAt": "2025-05-14T19:44:30Z",
  "scanCoverage": 94,
  "confidenceScore": 0.91,
  "violationsFound": 6
}

📎 Artifact Consumers¶

Output	Consumed By
`*.md`	LegalOps, Studio viewer
`*.json`	CI/CD pipeline, FixBot, Data Architect Agent
`compliance-risk-matrix.yaml`	Governance dashboard
`privacy-execution-metadata.json`	Orchestrator and compliance pipeline

✅ Summary¶

The Privacy Compliance Agent emits a complete audit bundle including:

✅ Field-level classification (data-classification.json)
✅ Legal-grade reports (privacy-audit-report.md)
✅ Risk/violation matrices (compliance-risk-matrix.yaml, compliance-findings.json)
✅ Fix suggestions and execution metadata

These outputs provide regulatory traceability, GitOps validation, and legal sign-off readiness.

🧠 Memory¶

The Privacy Compliance Agent uses a multi-layered memory architecture to enhance precision, recall prior violations, and track compliance resolution across sessions. It combines:

💾 Short-term session memory (per scan)
🧠 Long-term semantic memory (via MCP vector DB)
🗂️ Historical audit tracking (for regression, retest, drift detection)

🧱 Memory Layers¶

Layer	Purpose	Backing
Session Memory	Tracks in-scan field classification, confidence, and tenant context	In-memory (Semantic Kernel)
Semantic Recall	Recalls prior violations, fix patterns, and data classification outcomes	MCP + Vector DB
Audit History	Stores previous findings and report metadata for diffing or retesting	GitOps or Audit Log store
Retention State Cache	Matches entity TTLs and purging logic to jurisdictional minimums	Rulebook with temporal diffs

🧠 Semantic Memory Retrieval (via MCP)¶

Query	Retrieved Pattern
`"email field with missing consent"`	Previously tagged in project `petco-hrms`, GDPR A6 violation
`"storage location for EU tenant in US"`	Recurring violation pattern seen in 3 other audits, matched with 0.91 confidence
`"User entity with dob field"`	Pattern matched as `Sensitive` in 6 prior projects (recall → auto-classify)

📘 Example Recall Record (Vector Match)¶

{
  "matchId": "csa-2024-userprofile",
  "field": "email",
  "classification": "PII",
  "source": "User",
  "lawReference": "GDPR Article 6",
  "suggestedFix": "Add explicit consent flow"
}

🧾 Memory Influence on Skill Execution¶

Skill	Memory Role
`ClassifyFieldsSkill`	Boosts or overrides classification from known schema recall
`RetentionPolicyAnalyzer`	Compares current retention config to known best practices
`LegalBasisMapper`	Recalls violations and citation matches across prior audits
`RemediationPlannerSkill`	Suggests fixes previously used for similar issues
`ComplianceRiskScorerSkill`	Adjusts risk level if recurring issue with same service or tenant is found

🔁 Regression Memory (Retest Support)¶

If a previous scan marked an issue as fixed:

{
  "violationId": "VIO-GDPR-A6-001",
  "previousStatus": "fixed",
  "lastSeen": "2025-03-21",
  "retestResult": "reappeared",
  "escalate": true
}

→ The agent will emit a PrivacyRegressionDetected event and raise severity.
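
The retest check can be sketched as a comparison between stored statuses and the current scan. This is an illustration only: `detect_regressions` and the shape of `history` (a mapping from `violationId` to its last known status) are assumptions, while the record keys mirror the example above.

```python
def detect_regressions(history: dict, current_ids: set) -> list:
    """Return one regression record per previously fixed violation seen again.

    history: hypothetical mapping of violationId -> last known status.
    current_ids: violation IDs found by the current scan.
    """
    events = []
    for violation_id in sorted(current_ids):
        if history.get(violation_id) == "fixed":
            events.append({
                "event": "PrivacyRegressionDetected",
                "violationId": violation_id,
                "previousStatus": "fixed",
                "retestResult": "reappeared",
                "escalate": True,
            })
    return events
```

Violations that were never marked fixed, or that are new in this scan, produce no regression record and flow through the normal finding path.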

📊 Memory-Enhanced Reporting¶

The agent includes historical markers:

violationPreviouslySeen: true
remediationPreviouslySuggested: true
hasReappeared: true
regressionRiskScoreBoost: +1.2

→ These affect sorting, risk scoring, and attention flags.

🔒 Tenant-Scoped Memory¶

Memory Rule	Enforcement
Cross-tenant recall	Allowed only for general patterns (e.g., "email fields often require consent")
Specific field history	Never shared across tenants unless anonymized
Fix suggestions	Shared by category, not project identity, unless in same legal context and approved

✅ Summary¶

Memory improves:
📌 Field classification accuracy
🔁 Violation re-detection
📤 Fix recommendation quality
🧾 Compliance history traceability
All memory access is:
🧠 Semantic and relevance-based
🔐 Tenant-scoped and regulation-aware
📚 Versioned per legal framework

✅ Validation Rules¶

Validation ensures that the Privacy Compliance Agent emits only legally grounded, traceable, and actionable findings. Each rule checks whether discovered issues are:

⚖️ Legitimate (mapped to regulation)
🔄 Reproducible (based on static input or memory-backed recall)
📈 Severity-ranked (based on data type, tenant scope, and legal context)
🔎 Not false positives (via scoring thresholds and human hooks)

📏 Validation Dimensions¶

Rule Domain	Purpose
Field Classification Validity	Ensure NLP+regex+semantic results exceed minimum confidence thresholds
Lawful Basis Mapping	Every personal data field must be linked to a lawful basis (e.g., consent, contract)
Retention Policy Alignment	Retention settings must not exceed or violate the defined legal max/min durations
Consent Traceability	If field requires consent, ensure it's declared in `consent-config.yaml` or enforced in flow
API Erasure Support	For GDPR/CCPA compliance, APIs must support `DELETE`, opt-out, or access on applicable resources
Cross-Region Safety	Flag any sensitive data stored or processed in unauthorized geographies
Processor Declarations	All third-party vendors handling PII must appear in `third-party-integrations.yaml`
Multi-Tenant Boundaries	No shared PII between tenants unless explicitly allowed or isolated via zones

🔬 Validation Confidence Thresholds¶

Confidence Score	Behavior
`> 0.90`	Auto-confirmed classification/violation
`0.75–0.90`	Marked `tentative`, requires human or LegalOps review (unless `enforceStrictMode: true`, in which case it blocks deployment)
`< 0.75`	Suppressed unless matched by prior memory or forced by config
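
The threshold table can be expressed as a small triage function. This is a sketch under stated assumptions rather than the agent's implementation: the function name `triage` and its return keys are hypothetical, a score of exactly 0.90 is placed in the tentative band, and low-confidence items rescued by memory or config are assumed to be treated as tentative.

```python
def triage(confidence: float, strict_mode: bool = False,
           memory_match: bool = False, forced: bool = False) -> dict:
    """Map a confidence score to the behavior described in the table above."""
    if confidence > 0.90:
        status = "confirmed"
    elif confidence >= 0.75:
        status = "tentative"
    elif memory_match or forced:
        # Assumption: a rescued low-confidence item is handled like a tentative one.
        status = "tentative"
    else:
        status = "suppressed"
    return {
        "status": status,
        # Tentative items go to human or LegalOps review unless strict mode is on.
        "needs_review": status == "tentative" and not strict_mode,
        # In strict mode, tentative items block deployment instead of waiting for review.
        "blocks_in_strict_mode": status == "tentative" and strict_mode,
    }
```

For example, `triage(0.79)` yields a tentative finding that needs review, while `triage(0.79, strict_mode=True)` marks the same finding as blocking.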

📘 Example Validation Block (per Finding)¶

{
  "violationId": "VIO-GDPR-A6-001",
  "field": "email",
  "lawReference": "GDPR Article 6",
  "validated": true,
  "confidence": 0.94,
  "reproducible": true,
  "source": "User.email",
  "traceId": "proj-788-v1"
}

🔁 Retest & Regression Validation¶

If a finding reappears:
Validate input model has reverted
Compare the reappeared finding to the prior violation (structure, flow)
Raise regressionFlag: true and increase severity by +1

🔐 Region-Aware Validation Rules¶

Condition	Enforced Rule
EU tenant with US storage	❌ `GDPR Article 44` violation
CCPA tenant with no `DELETE /user/:id`	❌ Missing erasure right
Marketing opt-in not logged	❌ `GDPR Article 7` breach (consent not stored)
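
The first rule in the table, storage outside a tenant's permitted regions, can be sketched as a simple lookup. The input shapes here are assumptions: `allowed_regions` stands in for whatever the tenant profile declares, and `storage_map` is a hypothetical mapping from a field or resource name to the region where it is stored.

```python
def check_residency(tenant_id: str, allowed_regions: set, storage_map: dict) -> list:
    """Flag every resource stored outside the tenant's allowed regions."""
    findings = []
    for resource, region in sorted(storage_map.items()):
        if region not in allowed_regions:
            findings.append({
                "tenant": tenant_id,
                "field": resource,
                "issue": f"Stored in {region} for tenant restricted to {sorted(allowed_regions)}",
                # Article 44 is the rule the table above cites for EU tenants;
                # other jurisdictions are left unmapped in this sketch.
                "law": "GDPR Article 44" if "EU" in allowed_regions else None,
            })
    return findings


print(check_residency("vetclinic-premium", {"EU"},
                      {"Booking.invoicePdf": "US", "User.email": "EU"}))
```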

📊 Validation Rule Scoring Matrix¶

Type	Weight
PII without lawful basis	High (8.0–10.0)
Cross-region sensitive data	High (7.5–9.5)
PHI missing retention	Medium–High (6.5–8.0)
Missing erasure API	Medium (5.5–7.0)
Behavioral data without opt-out	Low–Medium (3.0–6.0)
Incorrect classification (false positive)	Eliminated from output
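
The matrix gives a weight range per finding type but does not say how a single score is chosen within a range. The sketch below assumes linear interpolation by confidence, which is an illustrative choice and not something the specification states. The `+1.2` regression boost is the `regressionRiskScoreBoost` value from the Memory section, capping at 10.0 is an assumption, and the dictionary keys and the name `risk_score` are hypothetical.

```python
# Weight ranges copied from the matrix above; the keys are hypothetical identifiers.
WEIGHT_RANGES = {
    "pii_without_lawful_basis": (8.0, 10.0),
    "cross_region_sensitive_data": (7.5, 9.5),
    "phi_missing_retention": (6.5, 8.0),
    "missing_erasure_api": (5.5, 7.0),
    "behavioral_without_opt_out": (3.0, 6.0),
}
REGRESSION_BOOST = 1.2  # regressionRiskScoreBoost from the Memory section
MAX_SCORE = 10.0        # assumption: scores are capped at the top of the scale


def risk_score(finding_type: str, confidence: float, regression: bool = False) -> float:
    """Pick a score inside the type's range, then apply the regression boost."""
    low, high = WEIGHT_RANGES[finding_type]
    # Assumption: higher confidence moves the score toward the top of the range.
    score = low + (high - low) * confidence
    if regression:
        score += REGRESSION_BOOST
    return round(min(score, MAX_SCORE), 2)
```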

🔎 Human Intervention Hook (Optional)¶

{
  "event": "ComplianceFindingRequiresReview",
  "reason": "Classification confidence 0.79 below strict threshold",
  "field": "phone",
  "source": "User",
  "law": "CCPA §1798.105",
  "recommendedAction": "Review manually before blocking"
}

✅ Summary¶

All findings are law-linked, confidence-scored, and validation-checked
Invalid, unclear, or low-confidence items are:
❌ Suppressed
⚠️ Escalated
🧑‍⚖️ Routed for human review
Validation rules support deployment gating, regression detection, and audit consistency

🔄 Retry / Correction Flow¶

The Privacy Compliance Agent includes a self-healing, correction-aware execution path. If findings are uncertain, validations fail, or configurations change mid-pipeline, the agent enters a targeted retry or correction loop — either automatically or as part of a post-remediation retest.

This ensures that the platform is not blocked due to transient schema errors, false positives, or pending data policy updates.

🔁 Retry Triggers¶

Condition	Triggered Action
New or updated `entity-models.yaml`	Partial rescan and reclassification of affected fields
Updated `consent-config.yaml`	Re-evaluation of consent-linked violations
Missing legal basis for PII	Auto-invoke `RemediationPlannerSkill` and revalidate
Confidence < threshold	Retry classification with boosted memory + updated NLP context
Retention mismatch	Re-check storage-map and TTL annotations after infrastructure sync
Regression suspected	Re-analyze matching previous `violationId` or risk signature

🔁 Retry Flow¶

flowchart TD
    INIT[Initial Finding]
    VALIDATE{Valid?}
    VALIDATE -- No --> RETRY_CHECK{Retry Permitted?}
    RETRY_CHECK -- Yes --> FIX_APPLY[Auto-correct + Rerun]
    RETRY_CHECK -- No --> ESCALATE[Emit Human Review Event]
    FIX_APPLY --> REVALIDATE{Validation Passes?}
    REVALIDATE -- Yes --> FINALIZE
    REVALIDATE -- No --> ESCALATE

🛠️ Correction Strategies¶

Finding	Correction Attempt
`Missing consent`	Suggest insertion of opt-in flag and legal basis annotation
`Region mismatch`	Recommend field relocation, masking, or retention shortening
`Erasure API not found`	Suggest adding `DELETE` endpoint for affected resource
`Storage TTL too long`	Recommend override to meet jurisdictional limits (e.g., 3y → 1y for EU data)

🧠 Retry-Aware Skills¶

Skill	Retry Role
`ClassifyFieldsSkill`	Retry classification with boosted semantic recall
`LegalBasisMapper`	Re-map to updated regulatory scope or revised tenant profile
`RemediationPlannerSkill`	Suggest and simulate correction with diff check
`ComplianceRiskScorerSkill`	Adjust severity after a fix or a failed retry

📋 Retry Metadata (in `privacy-execution-metadata.json`)¶

{
  "retryAttempted": true,
  "retryReason": "Consent basis missing for field `email`",
  "retryOutcome": "Fixed + validated",
  "confidenceBefore": 0.71,
  "confidenceAfter": 0.92
}

🚫 Retry Limits¶

Control	Default
`maxRetriesPerField`	2
`retriesAllowedPerSession`	10
`retryConfidenceBoostThreshold`	+0.10 minimum gain to accept correction
`retryOnRegionViolation`	false by default (manual fix preferred)
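
Taken together, the retry flow and these limits suggest a bounded loop. The following sketch is an interpretation under assumptions: `retry_classification` and its parameters are hypothetical, `reclassify` stands in for whatever re-runs classification with boosted recall, and a retry is accepted purely on the minimum confidence gain from the table.

```python
from typing import Callable

MAX_RETRIES_PER_FIELD = 2   # maxRetriesPerField default
MIN_CONFIDENCE_GAIN = 0.10  # retryConfidenceBoostThreshold default


def retry_classification(field_name: str, confidence_before: float,
                         reclassify: Callable[[str], float],
                         session_retries_left: int) -> dict:
    """Retry a low-confidence classification within the per-field and session limits."""
    attempts = 0
    while attempts < MAX_RETRIES_PER_FIELD and attempts < session_retries_left:
        attempts += 1
        confidence_after = reclassify(field_name)
        if confidence_after - confidence_before >= MIN_CONFIDENCE_GAIN:
            return {
                "event": "ComplianceRetrySucceeded",
                "retryAttempted": True,
                "attempts": attempts,
                "confidenceBefore": confidence_before,
                "confidenceAfter": confidence_after,
            }
    # Limits exhausted without enough gain: the finding stays open and is escalated.
    return {
        "event": "ComplianceRetryFailed",
        "retryAttempted": attempts > 0,
        "attempts": attempts,
        "confidenceBefore": confidence_before,
        "confidenceAfter": confidence_before,
    }
```

The caller is expected to subtract `attempts` from the session budget (`retriesAllowedPerSession`, 10 by default) after each call.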

📢 Retry Events Emitted¶

Event	Use
`ComplianceRetryAttempted`	Logged on retry start
`ComplianceRetrySucceeded`	Resolution validated
`ComplianceRetryFailed`	Unresolved issue remains, escalation triggered
`ComplianceFixSimulated`	Fix generated and tested, ready for commit by FixBot or LegalOps

✅ Summary¶

Retries are intentional, trace-scoped, and legally constrained
Correctable violations are retried automatically within the retry limits
Agent never hides or downgrades unresolved findings:
🛠️ Fix → revalidate
❌ Fail → escalate
⚠️ Inconclusive → human review

🤝 Collaboration Interfaces¶

The Privacy Compliance Agent integrates with key ConnectSoft agents, legal systems, and governance workflows to enforce end-to-end privacy compliance orchestration. It consumes artifacts from architecture agents and emits structured reports and events for remediation, gating, audit, and legal sign-off.

🔗 Collaborating Agents and Interfaces¶

Agent	Interaction	Description
Data Architect Agent	Input Provider	Supplies `entity-models.yaml`, annotations (`isSensitive`, `regionRestricted`, `retentionPolicy`).
API Gateway Architect Agent	Input Provider	Supplies `openapi.yaml`, identifies parameter exposure and RESTful rights support.
Security Architect Agent	Trust Context	Enforces cross-tenant separation, zone-based visibility for PII/PHI.
DevOps Architect Agent	Storage Context	Provides storage-region maps, encryption status, deployment constraints.
LegalOps Agent	Output Consumer + Reviewer	Reviews flagged high-risk findings, confirms regulation mappings, approves exceptions or waivers.
FixBot / Security Engineer Agent	Remediation Consumer	Uses `remediation-recommendations.json` to implement TTLs, consent enforcement, API adjustments.
Studio Governance Dashboard	Viewer	Visualizes compliance status, risk heatmaps, tenant readiness.
HumanOps Agent	Escalation Fallback	Handles ambiguous or disputed violations with manual verification prompts.

📤 Emitted Events¶

Event Name	Consumed By	Purpose
`PrivacyComplianceReady`	DevOps Agent, Orchestrator	Signals readiness to deploy under compliance envelope
`PrivacyComplianceFailed`	Studio Dashboard, DevOps Agent	Blocks deployment or alerts CI gate
`ComplianceFindingReported`	LegalOps Agent	Invoked for every high-risk confirmed violation
`ComplianceRetryAttempted`	Orchestrator or FixBot	Indicates fix logic attempted within policy window
`ComplianceFindingRequiresReview`	HumanOps Agent	Escalates ambiguous or low-confidence items
`PrivacyRegressionDetected`	Studio, Audit Logs	Signals a previously fixed issue has reappeared in current build

🔁 Collaboration Workflow Example¶

sequenceDiagram
    participant DA as Data Architect
    participant GW as API Gateway Agent
    participant PC as Privacy Compliance Agent
    participant SE as Security Engineer Agent
    participant LEGAL as LegalOps Agent
    participant STUDIO as Studio Dashboard

    DA->>PC: entity-models.yaml
    GW->>PC: openapi.yaml
    PC->>LEGAL: privacy-audit-report.md (for review)
    PC->>SE: remediation-recommendations.json
    PC->>STUDIO: compliance-risk-matrix.yaml

Output File	Shared With	Purpose
`data-classification.json`	Data Architect Agent, Studio	Field-level tagging
`privacy-audit-report.md`	LegalOps, HumanOps	Human-readable summary and law mapping
`compliance-risk-matrix.yaml`	Studio, DevOps Agent	Deployment gating, severity visual
`remediation-recommendations.json`	FixBot, SE Agent	Automatable fix planning
`compliance-findings.json`	Orchestrator, Audit pipeline	Central risk index
`privacy-execution-metadata.json`	All	Trace, coverage, session status

🔐 Governance Integration Rules¶

All outputs tagged with:
traceId, tenantId, sessionId, lawReference[]
Role-based sharing:
LegalOps sees all findings
Studio sees tenant-level summaries only
FixBot only sees fixable violations (non-policy-exempt)
Remediation approval may be required before update is accepted (via LegalOpsConfirmRequired event)
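
The role-based sharing rules can be sketched as a per-role view over the findings list. The role strings follow the rules above; the function name `view_for` and the finding keys `fixable` and `policyExempt` are hypothetical and used only for illustration.

```python
from collections import Counter


def view_for(role: str, findings: list) -> list:
    """Return the slice of findings a given consumer is allowed to see."""
    if role == "LegalOps":
        # LegalOps sees every finding in full.
        return list(findings)
    if role == "Studio":
        # Studio gets tenant-level summaries only, never field-level detail.
        counts = Counter(item["tenant"] for item in findings)
        return [{"tenant": tenant, "violations": n} for tenant, n in sorted(counts.items())]
    if role == "FixBot":
        # FixBot sees only fixable findings that are not policy-exempt.
        return [item for item in findings
                if item.get("fixable") and not item.get("policyExempt")]
    raise ValueError(f"unknown role: {role}")
```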

✅ Summary¶

The Privacy Compliance Agent:
Consumes architectural + legal context
Emits findings to legal, engineering, and CI/CD actors
Triggers gating, dashboard updates, and human escalation
It operates as a regulatory coordination hub, ensuring privacy compliance is collaborative, traceable, and enforceable

🔎 Observability & Human Intervention Hooks¶

The Privacy Compliance Agent is fully observable and designed for auditable traceability, dashboard integration, and human oversight. Every compliance action — from field classification to violation validation — is trace-tagged, logged, and exportable.

The agent also supports human-in-the-loop controls for ambiguous cases, sensitive edge conditions, and legal approval workflows.

📊 Observability Features¶

Feature	Description
OpenTelemetry Spans	Each classification, mapping, and validation step emits spans (`traceId`, `skill`, `targetField`, `riskScore`).
Structured Logging	All findings, retries, and overrides are logged with metadata: `violationId`, `tenantId`, `lawReference`, `confidence`.
Prometheus Metrics Export	Emits metrics for dashboards and pipelines: `compliance.violations.total`, `compliance.risk.high.count`, `compliance.fields.classified.percent`, `compliance.score.avg`
Execution Metadata	Stored in `privacy-execution-metadata.json`, used for audit trails and pipeline gates.
Studio Dashboard Feed	Feeds risk matrix, tenant readiness, and compliance trend charts to governance dashboards.

📈 Sample Metrics Output¶

compliance.violations.total: 6
compliance.violations.high: 2
compliance.fields.classified.percent: 94.6
compliance.retention.passed: 87.0
compliance.regression.detected: 1

🧭 Studio Dashboard Views¶

🟢 Compliance status per environment/tenant
🗺️ Risk matrix by service and severity
🔁 Historical trend of regressions vs. remediated issues
📜 Legal article coverage progress (e.g., GDPR Article 6, 15, 20)
🧾 Execution logs, last scan trace, and result summaries

🧑‍⚖️ Human Intervention Hooks¶

Scenario	Hook Triggered
Classification confidence 0.75–0.90 (tentative)	`ComplianceFindingRequiresReview`
Field is ambiguous in meaning (e.g., `customerCode`)	Manual override option
Fix suggestion may break downstream system	`RemediationConfirmationRequired`
Legal justification unclear (e.g., `legitimate interest` used on sensitive data)	`LegalOpsReviewRequired`
Regressed issue not clearly resolved	`PrivacyRegressionDetected`
Field marked as exempt but unverified	`ComplianceOverrideRequest`

🧠 Human Action Interface (Studio or GitOps PR Comment)¶

{
  "action": "approve_violation_override",
  "violationId": "VIO-GDPR-A6-001",
  "justification": "Lawful basis confirmed offline",
  "approvedBy": "LegalOps Agent",
  "timestamp": "2025-05-14T20:12:15Z"
}

→ Audit log updated, override accepted, deployment unblocked.
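
How an override record might be checked and applied is sketched below. This is a hedged illustration: the function `apply_override`, the in-memory `audit_log` list, and the rule that all five keys must be non-empty are assumptions; only the record keys and the `approve_violation_override` action come from the example above.

```python
REQUIRED_KEYS = ("action", "violationId", "justification", "approvedBy", "timestamp")


def apply_override(action: dict, open_violations: set, audit_log: list) -> set:
    """Validate an override record, log it, and return the still-open violations."""
    missing = [key for key in REQUIRED_KEYS if not action.get(key)]
    if missing:
        raise ValueError(f"override record incomplete: {missing}")
    if action["action"] != "approve_violation_override":
        raise ValueError(f"unsupported action: {action['action']}")
    if action["violationId"] not in open_violations:
        raise KeyError(action["violationId"])
    # Record the decision before changing state so the audit trail is never behind.
    audit_log.append(dict(action))
    return open_violations - {action["violationId"]}
```

In this sketch, deployment would be unblocked once the returned set no longer contains any blocking violation.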

📘 Escalation & Gate Control¶

PrivacyComplianceFailed blocks build/release
ComplianceFindingRequiresReview surfaces Studio prompt + LegalOps agent
PrivacyRegressionDetected raises red alert in tenant dashboard

✅ Summary¶

The agent is observability-first, emitting:
Metrics, spans, logs, dashboards, and events
Human oversight is built-in:
🧠 Low-confidence decisions
⚖️ Legal exemptions
🔁 Regression checks
Governance, LegalOps, and CI/CD teams can fully audit and interact with the agent
