PII Redaction at the Protocol Level: Privacy by Design for AI Workflows

Here is something that happens every day in enterprise AI deployments and that almost no one is talking about.

An AI agent, assisting a support representative, calls a tool to look up a customer's account. The tool queries the CRM database and returns a JSON object containing the customer's name, email address, phone number, billing address, and the last four digits of their credit card. The agent receives this data, incorporates it into its context, and uses it to draft a response.

The customer's PII is now in the model context. Depending on the provider and configuration, it may be logged for debugging. It may persist in the conversation history. It may influence completions in other sessions. It may be included in training data. And it is entirely invisible to the compliance team, because the tool invocation produced no audit record and no filtering was applied.

This is not a hypothetical. It is the default behaviour of every major AI agent framework today. And it is incompatible with every privacy regulation that applies to your customer data.

The Problem with Post-Hoc Privacy

The typical enterprise response to data privacy concerns is to add controls around the data at rest and in transit: encrypt the database, encrypt the API connection, restrict who can query the CRM, log every access. These are necessary controls. They are also insufficient when AI agents enter the picture.

AI agents create a new data flow that bypasses traditional controls. The tool queries the database (a controlled operation) and returns the data to the agent (an uncontrolled operation). The agent is not a user with an identity and permissions — it is a process that ingests data into a probabilistic model. The data is no longer in your controlled systems. It is in the model context, and from there, it can go anywhere.

Some teams attempt to address this by instructing tool authors to redact PII in their tool code. This is the "privacy as a feature" approach, and it fails for three reasons:

It's opt-in. Every tool author must implement redaction independently. If one tool forgets, the PII leaks. The privacy guarantee is only as strong as the weakest tool in the chain.
It's inconsistent. Different tools implement different patterns, with different regexes, different thresholds for what counts as PII, and different approaches to edge cases such as 16-digit order numbers that look like credit card numbers.
It's unverifiable. There is no way for the host or the compliance team to confirm that a tool actually redacts PII without reading its source code. And even then, the redaction might be conditional or bypassed by certain input patterns.

Privacy controls need to be structural, not behavioural. They need to operate at the infrastructure layer, consistently and transparently, regardless of what individual tools do.

How MPP's Privacy Filter Works

MPP moves PII redaction from the tool layer to the host layer. The privacy filter engine sits between the tool's output and the agent's input. Every tool response passes through the filter before delivery. The tool never knows its output was filtered — redaction happens transparently.

Tool output → Privacy Filter Engine → Redacted output → Agent

"Contact john@example.com for details"
       ↓
"Contact [REDACTED:email] for details"

The architecture is deliberate. By filtering at the host layer:

Every tool is covered. The filter applies to all tool responses, regardless of whether the tool author implemented their own redaction. There is no opt-in.
Consistency is guaranteed. The same patterns and the same redaction format are applied uniformly across all tools.
The tool can't bypass it. The filter runs in the host process, outside the WASM sandbox. The tool has no mechanism to skip, disable, or circumvent the filter.
It's auditable. Every redaction is recorded in the response metadata, creating a verifiable record of what was filtered and which patterns matched.
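
The host-layer placement can be sketched in a few lines of Python. This is a minimal illustration, not MPP's implementation: the `call_tool_filtered` wrapper, the `crm_lookup` tool, and the email regex are all hypothetical names and patterns chosen for the example.

```python
import re
from typing import Callable

# Illustrative email regex (an assumption; MPP's built-in pattern is not shown here).
EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def redact_text(text: str) -> tuple[str, list[str]]:
    """Return the redacted text plus the names of the patterns that matched."""
    redacted, hits = EMAIL.subn("[REDACTED:email]", text)
    return redacted, (["email"] if hits else [])


def call_tool_filtered(tool: Callable[[str], str], request: str) -> dict:
    """Hypothetical host-side wrapper: the tool runs unmodified, and the
    host filters its output before anything is handed to the agent."""
    raw = tool(request)
    content, applied = redact_text(raw)
    return {"content": content, "metadata": {"privacy_filters_applied": applied}}


def crm_lookup(_request: str) -> str:
    """Stand-in tool that knows nothing about redaction."""
    return "Contact john@example.com for details"


response = call_tool_filtered(crm_lookup, "lookup 42")
print(response["content"])
```

The tool function contains no privacy logic at all. Because the wrapper is the only path from tool to agent, coverage does not depend on what the tool author did.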

Built-In Patterns

MPP ships with six built-in patterns that cover the most common PII types:

| Pattern | Example Input | Redacted Output |
|---------|--------------|----------------|
| email | jane.doe@acme.com | [REDACTED:email] |
| phone | (555) 867-5309 | [REDACTED:phone] |
| ssn | 123-45-6789 | [REDACTED:ssn] |
| credit_card | 4111 1111 1111 1111 | [REDACTED:credit_card] |
| ip_address | 192.168.1.100 | [REDACTED:ip_address] |
| ipv6_address | 2001:db8::1 | [REDACTED:ipv6_address] |

The credit card pattern deserves special attention. Matching 13–16 digit sequences would produce massive false positives — order numbers, timestamps, and internal IDs frequently hit that range. MPP applies the Luhn algorithm after the regex match. Only sequences that pass the Luhn check (which all valid credit card numbers do, by specification) are redacted. A 16-digit order number that doesn't satisfy Luhn is left untouched.
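
A regex match followed by a Luhn check might look like the sketch below. The candidate regex (13 to 16 digits with optional space or hyphen separators) is an assumption made for illustration; only the Luhn step itself is standard.

```python
import re

# Assumed candidate regex: 13-16 digits, optionally separated by spaces or hyphens.
CARD_CANDIDATE = re.compile(r"\b\d(?:[ -]?\d){12,15}\b")


def luhn_valid(digits: str) -> bool:
    """Standard Luhn checksum: double every second digit from the right,
    subtract 9 from any result above 9, and require a sum divisible by 10."""
    total = 0
    for index, char in enumerate(reversed(digits)):
        value = int(char)
        if index % 2 == 1:
            value *= 2
            if value > 9:
                value -= 9
        total += value
    return total % 10 == 0


def redact_cards(text: str) -> str:
    """Redact only those candidate sequences that pass the Luhn check."""
    def replace(match: re.Match) -> str:
        digits = re.sub(r"\D", "", match.group())
        return "[REDACTED:credit_card]" if luhn_valid(digits) else match.group()

    return CARD_CANDIDATE.sub(replace, text)


print(redact_cards("Card 4111 1111 1111 1111 on file"))
print(redact_cards("Order 1234567890123456 shipped"))
```

The first line is redacted because the well-known test number 4111 1111 1111 1111 satisfies Luhn. The second is left alone because its checksum is not divisible by 10.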

Custom Patterns

Built-in patterns cover common PII. But enterprises have their own sensitive identifiers — employee IDs, internal project codes, patient record numbers, account identifiers — that are specific to their domain.

MPP supports custom patterns declared in the tool's manifest:

{
  "privacy_filters": {
    "enabled": true,
    "patterns": ["email", "phone", "ssn", "credit_card"],
    "custom_patterns": [
      {
        "id": "employee_id",
        "regex": "EMP-\\d{6}",
        "description": "Internal employee identifier"
      },
      {
        "id": "patient_mrn",
        "regex": "MRN-[A-Z]{2}\\d{8}",
        "description": "Patient medical record number"
      },
      {
        "id": "project_code",
        "regex": "PROJ-[A-Z]{3}-\\d{4}",
        "description": "Project tracking code"
      }
    ]
  }
}

Custom patterns are applied alongside built-in patterns. A tool response that contains both an email address and an employee ID will have both redacted:

Input:  "Contact jane.doe@acme.com (EMP-384726) for approval"
Output: "Contact [REDACTED:email] ([REDACTED:employee_id]) for approval"
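
As a rough sketch of how a host could turn such a manifest into active filters, the Python below compiles the declared built-in names and custom patterns into one list and applies them in order. The `BUILT_IN` table, `compile_filters`, and `redact` are hypothetical names for this example, and the email regex is a stand-in rather than MPP's actual pattern.

```python
import json
import re

# A trimmed manifest in the format shown above (raw string keeps the JSON escapes intact).
MANIFEST = json.loads(r"""
{
  "privacy_filters": {
    "enabled": true,
    "patterns": ["email"],
    "custom_patterns": [
      {"id": "employee_id", "regex": "EMP-\\d{6}", "description": "Internal employee identifier"}
    ]
  }
}
""")

# Illustrative stand-in for the host's built-in pattern table (assumption).
BUILT_IN = {"email": r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"}


def compile_filters(manifest: dict) -> list[tuple[str, re.Pattern]]:
    """Build the ordered list of (name, compiled regex) pairs for one tool."""
    config = manifest.get("privacy_filters", {})
    if not config.get("enabled", False):
        return []
    filters = [(name, re.compile(BUILT_IN[name]))
               for name in config.get("patterns", []) if name in BUILT_IN]
    filters += [(entry["id"], re.compile(entry["regex"]))
                for entry in config.get("custom_patterns", [])]
    return filters


def redact(text: str, filters: list[tuple[str, re.Pattern]]) -> str:
    """Apply every filter in turn, replacing matches with a typed token."""
    for name, pattern in filters:
        text = pattern.sub(f"[REDACTED:{name}]", text)
    return text


filters = compile_filters(MANIFEST)
print(redact("Contact jane.doe@acme.com (EMP-384726) for approval", filters))
```

Because filters run sequentially, a later pattern sees the tokens inserted by earlier ones. A production engine would need to make sure no pattern can match the `[REDACTED:...]` token text itself.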

ReDoS Protection

Custom regex patterns introduce a risk: a carelessly or maliciously constructed pattern can cause Regular Expression Denial of Service (ReDoS), where the regex engine enters catastrophic backtracking and hangs for seconds or minutes on certain inputs.

MPP mitigates this with two constraints:

| Constraint | Limit |
|-----------|-------|
| Maximum regex length | 512 characters |
| Execution timeout | 100ms per pattern |

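If a custom pattern exceeds the length limit, it is rejected at load time. If a pattern match exceeds 100ms on any input, the match is aborted and the input passes through unredacted. This is a fail-open default: if the filter can't complete, the data is delivered unmodified, so a pattern that times out provides no protection for that response. Custom patterns should therefore be tested against the timeout before deployment.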
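
The load-time half of this protection is straightforward to sketch. The example below covers only the length limit; the `PatternRejected` exception and `load_custom_pattern` function are hypothetical, and rejecting patterns that fail to compile is an added assumption. Python's standard `re` module has no timeout parameter, so the 100ms execution limit is not shown and would need engine-level support.

```python
import re

MAX_REGEX_LENGTH = 512  # the length limit stated in the constraints table


class PatternRejected(ValueError):
    """Raised when a custom pattern fails load-time validation."""


def load_custom_pattern(pattern_id: str, regex: str) -> re.Pattern:
    """Validate and compile one custom pattern at load time."""
    if len(regex) > MAX_REGEX_LENGTH:
        raise PatternRejected(
            f"{pattern_id}: regex is {len(regex)} characters, limit is {MAX_REGEX_LENGTH}"
        )
    try:
        return re.compile(regex)
    except re.error as exc:
        # Assumption: a pattern that does not compile is also rejected.
        raise PatternRejected(f"{pattern_id}: invalid regex ({exc})") from exc


employee_id = load_custom_pattern("employee_id", r"EMP-\d{6}")

try:
    load_custom_pattern("too_long", "a" * 513)
except PatternRejected as err:
    print(err)
```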

Recursive JSON Redaction

Tool responses are typically JSON objects, not flat strings. A CRM lookup might return:

{
  "customer": {
    "name": "Jane Doe",
    "email": "jane@example.com",
    "phone": "(555) 867-5309",
    "addresses": [
      {
        "type": "billing",
        "line1": "123 Main St",
        "city": "Springfield"
      }
    ],
    "notes": "Customer called from 192.168.1.50 regarding SSN issue (ref: 123-45-6789)"
  }
}

MPP's filter engine recursively processes the entire JSON structure:

String values are matched against all active patterns and redacted if a match is found.
Objects are recursed into — every value is processed. Keys are not redacted (they are structural, not data).
Arrays are recursed into — every element is processed.
Numbers, booleans, and nulls pass through unchanged.

After filtering:

{
  "customer": {
    "name": "Jane Doe",
    "email": "[REDACTED:email]",
    "phone": "[REDACTED:phone]",
    "addresses": [
      {
        "type": "billing",
        "line1": "123 Main St",
        "city": "Springfield"
      }
    ],
    "notes": "Customer called from [REDACTED:ip_address] regarding SSN issue (ref: [REDACTED:ssn])"
  }
}

The agent receives the customer's name and billing address, neither of which matches an active pattern (whether they also require redaction depends on policy), but not the email, phone, IP address, or SSN. The structural context is preserved: the agent knows there is a customer with contact information, but the sensitive values are replaced with tokens.
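
The traversal rules above map directly onto a short recursive function. The sketch below is an illustration under assumptions: the four regexes are simplified stand-ins for the built-in patterns, and `redact_value` is a hypothetical name.

```python
import re
from typing import Any

# Simplified stand-ins for four built-in patterns (assumptions, not MPP's regexes).
PATTERNS = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "phone": re.compile(r"\(\d{3}\)\s?\d{3}-\d{4}"),
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "ip_address": re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b"),
}


def redact_value(value: Any) -> Any:
    """Recursively redact string values; keys and non-string scalars are untouched."""
    if isinstance(value, str):
        for name, pattern in PATTERNS.items():
            value = pattern.sub(f"[REDACTED:{name}]", value)
        return value
    if isinstance(value, dict):
        return {key: redact_value(item) for key, item in value.items()}
    if isinstance(value, list):
        return [redact_value(item) for item in value]
    return value  # numbers, booleans, and None pass through unchanged


record = {
    "customer": {
        "name": "Jane Doe",
        "email": "jane@example.com",
        "phone": "(555) 867-5309",
        "addresses": [{"type": "billing", "line1": "123 Main St", "city": "Springfield"}],
        "notes": "Customer called from 192.168.1.50 regarding SSN issue (ref: 123-45-6789)",
    }
}
filtered = redact_value(record)
print(filtered["customer"]["notes"])
```

The function builds a new structure rather than mutating the input, so the host still holds the original tool output if it needs it for its own audit trail.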

Response Metadata

After filtering, MPP records which patterns were applied in the response metadata:

{
  "result": {
    "content": { "..." },
    "metadata": {
      "execution_time_ms": 45,
      "memory_used_bytes": 2048,
      "privacy_filters_applied": ["email", "phone", "ssn", "ip_address"]
    }
  }
}

This metadata serves two purposes:

Auditability. The compliance team can see exactly which PII types were detected and redacted in each tool response. This creates a demonstrable record that privacy controls were applied.
Debugging. If the agent's output seems incomplete (because key data was redacted), the developer can check the metadata to understand what was filtered and adjust the tool's privacy configuration if the redaction was too aggressive.
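
One way to produce this metadata is to have the redaction pass collect the names of the patterns that actually matched. The sketch below assumes simplified regexes and hypothetical function names (`redact_tracking`, `build_result`); sorting the pattern names is a choice made here for deterministic output, not something the format above requires.

```python
import re
from typing import Any

# Simplified stand-in patterns (assumptions).
PATTERNS = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}


def redact_tracking(value: Any, applied: set) -> Any:
    """Redact recursively and record each pattern name that matched at least once."""
    if isinstance(value, str):
        for name, pattern in PATTERNS.items():
            value, hits = pattern.subn(f"[REDACTED:{name}]", value)
            if hits:
                applied.add(name)
        return value
    if isinstance(value, dict):
        return {key: redact_tracking(item, applied) for key, item in value.items()}
    if isinstance(value, list):
        return [redact_tracking(item, applied) for item in value]
    return value


def build_result(content: Any, execution_time_ms: int) -> dict:
    """Wrap filtered content in a response envelope shaped like the one above."""
    applied: set = set()
    filtered = redact_tracking(content, applied)
    return {
        "result": {
            "content": filtered,
            "metadata": {
                "execution_time_ms": execution_time_ms,
                "privacy_filters_applied": sorted(applied),
            },
        }
    }


result = build_result(
    {"email": "jane@example.com", "notes": "ref: 123-45-6789", "age": 41}, 45
)
print(result["result"]["metadata"])
```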

The Compliance Impact

GDPR: Data Minimisation (Article 5(1)(c))

GDPR requires that personal data collected and processed be limited to what is necessary for the purposes for which it is processed. When an AI agent calls a tool that returns customer records, the full record often contains far more PII than the agent actually needs.

MPP's privacy filters enforce data minimisation at the protocol level. Even if a tool returns a full customer record, the agent only receives the non-PII portions. The minimisation is not dependent on the tool author's good intentions — it is enforced by the host runtime.

HIPAA: Access Controls (§ 164.312(a)(1))

HIPAA requires that covered entities implement technical policies and procedures for electronic information systems that maintain electronic protected health information, to allow access only to those persons or software programs that have been granted access rights.

MPP's privacy filters ensure that even when a tool has been granted access to a health system (through an approved capability), the protected health information in its response is redacted before reaching the AI model. The PHI never enters the model context.

CCPA: Information Security (§ 1798.150)

CCPA gives consumers a private right of action when their personal information is exposed due to a failure to maintain reasonable security procedures. Privacy filtering at the protocol level is a demonstrable "reasonable security procedure" — it prevents PII from reaching the model context regardless of tool behaviour.

EU AI Act: Transparency and Oversight

The EU AI Act requires transparency in AI system operations and meaningful human oversight. MPP's privacy filter metadata — recording which PII types were detected and redacted in each tool invocation — provides a transparent, auditable record of how personal data is handled in every AI agent interaction.

Why Protocol-Level Beats Application-Level

The fundamental argument for protocol-level privacy filtering is the same argument for HTTPS. If encryption is implemented at the application level, every application must implement it correctly, and a single mistake exposes data. When encryption is implemented at the protocol level (TLS), every application benefits from it automatically, and the guarantee is uniform.

Privacy filtering for AI tools is the same pattern:

| Approach | Guarantee | Coverage | Consistency | Verifiability |
|----------|----------|----------|-------------|--------------|
| Tool-level redaction | Per-tool, opt-in | Only tools that implement it | Varies by implementation | Requires source code review |
| Protocol-level filtering | Universal, automatic | Every tool response | Uniform patterns and format | Metadata records every filter application |

MPP chose protocol-level filtering because it is the only approach that provides a guarantee to the compliance team. Not a promise that tool authors will do the right thing — a structural guarantee that PII will be redacted from every tool response, applied consistently, recorded in metadata, and enforced by the host runtime.

Configuration for Enterprise Teams

For organisations deploying MPP, privacy filter configuration is part of the host runtime setup:

Minimum patterns. Set a baseline of required patterns (email, phone, SSN, credit card) that apply to all tools regardless of their manifest declarations. This ensures that even tools that don't declare privacy filters are filtered at the host level.
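
A host-side baseline could be merged with each tool's manifest along the following lines. This is a sketch under assumptions: `REQUIRED_PATTERNS` and `effective_patterns` are hypothetical names, and the rule that a disabled or missing manifest section still receives the baseline is inferred from the description above.

```python
# Host-wide baseline, using the four pattern names suggested above.
REQUIRED_PATTERNS = ("email", "phone", "ssn", "credit_card")


def effective_patterns(manifest: dict) -> list[str]:
    """Return the baseline patterns plus anything extra the tool declares.

    A tool can add patterns but cannot remove a required one, even if its
    manifest omits the privacy_filters section or sets enabled to false.
    """
    config = manifest.get("privacy_filters") or {}
    declared = config.get("patterns", []) if config.get("enabled", False) else []
    merged = list(REQUIRED_PATTERNS)
    for name in declared:
        if name not in merged:
            merged.append(name)
    return merged


print(effective_patterns({}))
print(effective_patterns(
    {"privacy_filters": {"enabled": True, "patterns": ["email", "ip_address"]}}
))
```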

Domain-specific patterns. Add custom patterns for your organisation's internal identifiers. Employee IDs, patient record numbers, account numbers, and internal codes should be redacted with the same rigour as standard PII.

Pattern testing. Test custom patterns against representative data before deploying. Verify that false positive rates are acceptable and that the 100ms timeout is not triggered by your patterns on typical inputs.
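
A small harness is usually enough for this kind of pre-deployment check. The sketch below is an assumption about how such a test might be written, not part of MPP: `evaluate_pattern` is a hypothetical helper that runs a regex over labelled samples, counts false positives and false negatives, and records the slowest match against a 100ms budget.

```python
import re
import time


def evaluate_pattern(regex: str, samples: list[tuple[str, bool]],
                     budget_ms: float = 100.0) -> dict:
    """Run a pattern over (text, should_match) samples and summarise the results."""
    pattern = re.compile(regex)
    false_positives = false_negatives = 0
    slowest_ms = 0.0
    for text, should_match in samples:
        start = time.perf_counter()
        matched = pattern.search(text) is not None
        slowest_ms = max(slowest_ms, (time.perf_counter() - start) * 1000)
        if matched and not should_match:
            false_positives += 1
        elif should_match and not matched:
            false_negatives += 1
    return {
        "false_positives": false_positives,
        "false_negatives": false_negatives,
        "slowest_ms": slowest_ms,
        "within_budget": slowest_ms < budget_ms,
    }


samples = [
    ("Approved by EMP-384726", True),
    ("Ticket TEMP-1234567 reopened", False),  # contains "EMP-123456" as a substring
]
loose = evaluate_pattern(r"EMP-\d{6}", samples)
strict = evaluate_pattern(r"\bEMP-\d{6}\b", samples)
print(loose["false_positives"], strict["false_positives"])
```

The unanchored pattern redacts part of an unrelated ticket number, while the version with word boundaries does not. Representative samples surface this kind of false positive before the pattern reaches production.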

Metadata integration. Feed the privacy_filters_applied metadata from tool responses into your compliance reporting pipeline. This creates an automatic, continuous record of PII handling across all AI agent interactions.
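
As an illustration of what that pipeline's first stage might do, the sketch below tallies redactions per tool and pattern from a batch of responses. Pairing each response with a tool name is an assumption made for this example, as is the `summarize_redactions` helper.

```python
from collections import Counter


def summarize_redactions(responses: list[tuple[str, dict]]) -> Counter:
    """Count how often each pattern fired, keyed by (tool name, pattern name)."""
    counts: Counter = Counter()
    for tool_name, response in responses:
        metadata = response["result"]["metadata"]
        for pattern in metadata.get("privacy_filters_applied", []):
            counts[(tool_name, pattern)] += 1
    return counts


batch = [
    ("crm_lookup", {"result": {"metadata": {"privacy_filters_applied": ["email", "phone"]}}}),
    ("crm_lookup", {"result": {"metadata": {"privacy_filters_applied": ["email"]}}}),
    ("ticket_search", {"result": {"metadata": {"privacy_filters_applied": []}}}),
]
summary = summarize_redactions(batch)
for (tool, pattern), count in sorted(summary.items()):
    print(tool, pattern, count)
```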

The privacy filter is not a feature you turn on and forget. It is a control that requires the same attention as any other data protection mechanism in your environment — initial configuration, periodic validation, and integration into your compliance reporting workflow.

For the full privacy filter API and pattern reference, see the Privacy Filters documentation. For how privacy filtering fits into the broader security model, read The Enterprise Case for AI Tool Governance.