MPP vs. Running Tools in Docker: Why Containers Aren't Enough

When enterprise security teams first encounter the AI tool execution problem — untrusted code running with ambient authority and no verification — the immediate reflex is to reach for Docker. It makes sense. Containers are the default isolation mechanism for modern infrastructure. The tooling is mature. The teams know how to operate it.

But Docker was designed to isolate long-running services, not to sandbox per-invocation function calls. The mismatch between the problem (secure, fast, lightweight tool execution) and the solution (a container runtime designed for service deployment) shows up across every dimension that matters for AI agent workflows.

This is not an argument that Docker is bad. Docker is excellent at what it does. This is an argument that AI tool execution is a different problem — and MPP is purpose-built to solve it.

The Comparison

1. Startup Latency

Docker: Hundreds of milliseconds to seconds, depending on image size, the runtime, and whether the image is cached. A minimal Alpine-based container takes 200–500ms. A typical application container takes 1–3 seconds.

MPP: Single-digit milliseconds for cold start. Sub-millisecond for warm start (pre-compiled WASM module).

Why it matters: An AI agent might invoke 5–20 tools in a single conversation. At 500ms per Docker container startup, you add 2.5–10 seconds of pure overhead to every interaction — before the tool even starts executing. Users notice. At 5ms per MPP invocation, the overhead is invisible.
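The overhead arithmetic is easy to reproduce. The sketch below treats the per-invocation figures quoted in this section (500 ms for a container start, 5 ms for an MPP invocation) as assumptions rather than measurements, and the function name is purely illustrative.

```python
def startup_overhead_seconds(tool_calls: int, startup_ms: float) -> float:
    """Total startup overhead added to one conversation, in seconds."""
    return tool_calls * startup_ms / 1000.0


# 5 and 20 tool calls are the bounds quoted in the text.
for calls in (5, 20):
    docker = startup_overhead_seconds(calls, 500)  # assumed container start time
    mpp = startup_overhead_seconds(calls, 5)       # assumed MPP start time
    print(f"{calls} calls: Docker {docker:.2f} s, MPP {mpp:.3f} s")
```

Swapping in your own measured startup times gives the overhead for your environment.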

The difference is structural, not optimisational. Starting a container means setting up a set of Linux namespaces (PID, mount, network), control groups, and a filesystem overlay before the process can run. Instantiating a WASM module happens inside an existing process: the runtime allocates linear memory, initialises the module's globals and tables, and calls the entry point. The amount of work differs by orders of magnitude.

2. Package Size

Docker: Minimal images (Alpine-based, single binary) are 5–50 MB. Typical application images with dependencies are 100–500 MB. Images with language runtimes (Node.js, Python) regularly exceed 1 GB.

MPP: A typical .mpp package containing a WASM binary, manifest, and resources is 100–500 KB.

Why it matters: Size affects three things:

Distribution speed. Downloading a 100 KB MPP package from a registry is effectively instantaneous on any modern network. Pulling a 500 MB Docker image, even with layer caching, takes seconds to minutes.
Storage at scale. An enterprise registry hosting 1,000 Docker images might consume 500 GB. The same 1,000 tools as MPP packages consume 500 MB.
Edge deployment. Devices with constrained storage (IoT gateways, embedded systems, air-gapped workstations) can hold thousands of MPP packages but only a handful of Docker images.
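The storage comparison follows from the per-package averages implied above: roughly 500 MB per Docker image and 500 KB per MPP package. Both are assumptions for illustration, and the sketch uses decimal units (1 GB = 1,000 MB).

```python
def registry_size_mb(package_count: int, avg_package_mb: float) -> float:
    """Approximate registry footprint in megabytes."""
    return package_count * avg_package_mb


docker_mb = registry_size_mb(1000, 500)  # assumed 500 MB per image
mpp_mb = registry_size_mb(1000, 0.5)     # assumed 500 KB per package

print(f"Docker: {docker_mb / 1000:.0f} GB")
print(f"MPP:    {mpp_mb:.0f} MB")
print(f"Ratio:  {docker_mb / mpp_mb:.0f}x")
```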

3. Cryptographic Verification

Docker: Not built in. Docker Content Trust (DCT) / Notary exists but is opt-in and rarely enabled. It signs the mapping from an image tag to a manifest digest rather than signing individual content layers. Most Docker deployments pull and run images without verifying any publisher signature.

MPP: Mandatory by design. Every .mpp package is signed with Ed25519. The Gatekeeper verifies the signature against the SHA-256 content hash before any tool code executes. Unsigned packages are flagged as unverified. Verification takes ~60μs.

Why it matters: Supply-chain attacks succeed when there is no integrity check between "the publisher built this artifact" and "the consumer runs this artifact." Docker's verification story is bolted on and opt-in. MPP's is built into the execution pipeline and always-on.
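To make the "verify before execute" ordering concrete, here is a minimal sketch of the content-hash half of that check using only the Python standard library. It is not the Gatekeeper's implementation. The function names and input shapes are hypothetical, and the Ed25519 signature check over the digest is left as a comment because it needs a third-party library.

```python
import hashlib
import hmac


def content_hash(package_bytes: bytes) -> str:
    """SHA-256 digest of the package contents, hex-encoded."""
    return hashlib.sha256(package_bytes).hexdigest()


def hash_matches(package_bytes: bytes, signed_hash_hex: str) -> bool:
    """True only if the bytes we hold hash to the value the publisher signed.

    Assumption: a real verifier would first check the Ed25519 signature
    over signed_hash_hex. That step is omitted in this sketch.
    """
    # compare_digest avoids leaking the mismatch position through timing.
    return hmac.compare_digest(content_hash(package_bytes), signed_hash_hex)


def run_if_verified(package_bytes: bytes, signed_hash_hex: str, run):
    """Refuse to invoke `run` unless the content hash check passes."""
    if not hash_matches(package_bytes, signed_hash_hex):
        raise PermissionError("content hash mismatch: refusing to execute")
    return run(package_bytes)


original = b"example wasm bytes"   # stand-in for a package payload
signed = content_hash(original)    # stand-in for the publisher's signed hash

run_if_verified(original, signed, len)  # passes the check, then runs
try:
    run_if_verified(original + b"tampered", signed, len)
except PermissionError as err:
    print(err)
```

The point of the structure is that the tool code is only reachable through the check, so there is no code path that executes an unverified payload.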

4. Capability Granularity

Docker: Isolation is at the container level. A container either has network access or it doesn't. A container either has a volume mounted or it doesn't. Fine-grained domain-level network control requires external tooling: iptables rules, network policies (Calico, Cilium), or a service mesh (Istio, Linkerd). Fine-grained filesystem control is limited to volume mount points.

MPP: The capability model operates at the per-domain, per-path, per-variable level:

| Capability | Docker | MPP |
|-----------|--------|-----|
| Allow api.github.com but block api.stripe.com | Requires iptables + DNS-level rules | Declared in manifest, enforced by sandbox |
| Read /data/inputs but not /data/secrets | Requires separate volume mounts with careful configuration | Separate read declarations per path |
| Expose GITHUB_TOKEN but not AWS_SECRET_KEY | Requires selective --env or entrypoint filtering | Declared per variable in manifest |

Why it matters: AI tools have diverse capability profiles. A PR analysis tool needs one API domain and one credential. A database migration tool needs different domains, filesystem write access, and multiple credentials. With Docker, achieving this granularity requires bespoke container configurations and external network policies for every tool. With MPP, the tool's manifest declares what it needs, and the runtime enforces it with no external tooling.
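A rough sketch of what per-domain, per-path, per-variable checks can look like is shown below. The `Manifest` class and its field names are hypothetical and do not reflect the real MPP manifest schema. The path check only normalises the string and does not resolve symlinks, which a real sandbox would need to handle.

```python
import posixpath
from dataclasses import dataclass
from urllib.parse import urlparse


@dataclass(frozen=True)
class Manifest:
    """Hypothetical capability declaration (not the real MPP schema)."""
    network_domains: frozenset = frozenset()
    read_paths: tuple = ()
    env_vars: frozenset = frozenset()


def may_connect(manifest: Manifest, url: str) -> bool:
    """Allow a request only if its host is declared exactly."""
    host = urlparse(url).hostname
    return host is not None and host in manifest.network_domains


def may_read(manifest: Manifest, path: str) -> bool:
    """Allow a read only if the normalised path sits inside a declared directory."""
    target = posixpath.normpath(path)  # collapses ".." segments
    for allowed in manifest.read_paths:
        root = posixpath.normpath(allowed)
        if target == root or target.startswith(root + "/"):
            return True
    return False


def visible_env(manifest: Manifest, env: dict) -> dict:
    """Return only the declared variables; everything else is dropped."""
    return {k: v for k, v in env.items() if k in manifest.env_vars}


pr_tool = Manifest(
    network_domains=frozenset({"api.github.com"}),
    read_paths=("/data/inputs",),
    env_vars=frozenset({"GITHUB_TOKEN"}),
)

print(may_connect(pr_tool, "https://api.github.com/repos"))  # True
print(may_connect(pr_tool, "https://api.stripe.com/v1"))     # False
print(may_read(pr_tool, "/data/inputs/../secrets/key"))      # False
print(visible_env(pr_tool, {"GITHUB_TOKEN": "x", "AWS_SECRET_KEY": "y"}))
```

Everything is deny-by-default: a tool with an empty manifest can reach no host, read no path, and see no variable.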

5. Privacy Filtering

Docker: None. Data flows in and out of containers without any filtering or redaction. Implementing PII redaction requires either modifying the tool itself (application-level) or adding a separate proxy/sidecar container with custom logic.

MPP: Built in. The privacy filter engine processes every tool response against configurable patterns (email, phone, SSN, credit card, IP addresses) and custom patterns, redacting matches before the data reaches the agent. No additional infrastructure required.

Why it matters: When an AI agent invokes a tool inside a Docker container, the raw response — including any PII — flows directly back to the agent. The container provides no data-layer protection. MPP's privacy filter is a structural control: it operates at the host layer, applies to every tool, and cannot be bypassed by the tool code.
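As an illustration of host-side redaction, the sketch below runs a tool response through a small set of regular expressions before it would be handed to the agent. The patterns and the `[REDACTED:...]` placeholder format are assumptions made for this example. They are deliberately simple and are not MPP's actual filter rules.

```python
import re

# Illustrative patterns only; production-grade PII detection needs more care.
PATTERNS = {
    "EMAIL": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "IPV4": re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b"),
}


def redact(text: str, patterns=PATTERNS) -> str:
    """Replace every match of every pattern with a labelled placeholder."""
    for label, pattern in patterns.items():
        text = pattern.sub(f"[REDACTED:{label}]", text)
    return text


response = "Contact jane@example.com from 10.0.0.7, SSN 123-45-6789."
print(redact(response))
# Contact [REDACTED:EMAIL] from [REDACTED:IPV4], SSN [REDACTED:SSN].
```

Because the filter runs in the host after the tool returns, the tool code never gets a chance to skip it.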

6. Audit Logging

Docker: Container-level logging (stdout/stderr capture) is available through Docker's logging drivers, but it captures application output, not a structured record of what the container accessed, what permissions it used, or what data flowed through it. Structured audit logging for AI tool invocations requires custom implementation.

MPP: Every invocation writes a structured record: package identity, capabilities granted, input received, output produced (post-filtering), sensitivity score, HITL decision, and attestation token details. Entries are hash-chained for tamper evidence.

Why it matters: When an auditor asks "what tools did your AI agent use, what could they access, and what data did they return?", Docker's logging gives you stdout output from containers. MPP's audit log gives you a tamper-evident record of every permission decision, every capability grant, and every filtered response.
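Hash chaining is what makes such a log tamper-evident: each entry's hash covers the previous entry's hash, so changing any earlier record invalidates everything after it. The following is a minimal sketch of the general technique. The record fields and the all-zero genesis value are assumptions for illustration, not MPP's log format.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry


def entry_hash(prev_hash: str, record: dict) -> str:
    """Hash the previous entry's hash together with a canonical record encoding."""
    payload = json.dumps(record, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + payload).encode("utf-8")).hexdigest()


def append(log: list, record: dict) -> None:
    """Append a record, linking it to the current tail of the chain."""
    prev_hash = log[-1]["hash"] if log else GENESIS
    log.append({"record": record, "prev": prev_hash,
                "hash": entry_hash(prev_hash, record)})


def verify(log: list) -> bool:
    """Recompute every link; any edited or reordered entry breaks the chain."""
    prev_hash = GENESIS
    for entry in log:
        if entry["prev"] != prev_hash:
            return False
        if entry["hash"] != entry_hash(prev_hash, entry["record"]):
            return False
        prev_hash = entry["hash"]
    return True


log = []
append(log, {"package": "pr-analyzer", "capabilities": ["net:api.github.com"]})
append(log, {"package": "db-migrate", "capabilities": ["fs:write"]})
print(verify(log))   # True

log[0]["record"]["package"] = "something-else"  # simulate tampering
print(verify(log))   # False
```

A chain like this detects modification but does not prevent it. Anchoring the latest hash somewhere the log writer cannot rewrite is what turns detection into a useful audit guarantee.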

7. Operational Complexity

Docker: Running tools in Docker containers requires:

A container runtime (Docker, containerd, Podman)
An orchestrator for managing multiple tool containers (Kubernetes, Docker Compose)
Image registry infrastructure (Docker Hub, ECR, internal registry)
Network policies for per-tool isolation
Volume management for filesystem access
Monitoring and restart policies for long-running tool servers
Secret management integration (Vault, AWS Secrets Manager)

MPP: Running tools as MPP packages requires:

The MPP host runtime (a Rust library or the CLI)
An MPP registry (or direct .mpp file distribution)

The capability model, network filtering, environment variable management, filesystem isolation, privacy filtering, and audit logging are all handled by the MPP runtime. There is no orchestrator, no network policy engine, no sidecar, and no separate secret management integration.

Where Docker Wins

This comparison would be dishonest without acknowledging where Docker is the better choice.

Long-running services. If your AI tool is a persistent server that maintains connections, caches state, and serves multiple requests concurrently, Docker is the right tool. MPP is designed for per-invocation execution, not long-running processes.

Complex language runtimes. MPP tools compile to WebAssembly. The WASM ecosystem has excellent support for Rust, C, and C++, and growing support for Go, Python (via Pyodide), and other languages. But if your tool depends on a specific language runtime with native extensions that don't compile to WASM, a container is your only option today.

GPU access. WASM sandboxes do not have GPU access. Tools that require CUDA, ROCm, or other GPU runtimes need containers or direct host execution. This is relevant for tools that perform inference, image processing, or numerical computation.

Existing infrastructure investment. If your organisation already operates a mature Kubernetes platform with established security policies, adding AI tools as containers may be straightforward operationally — even if it's suboptimal technically.

The Right Frame

The framing is not "MPP or Docker." It is "what problem are you solving?"

If you are deploying a persistent service that needs process isolation, network namespacing, and an orchestrated lifecycle — use a container.

If you are executing a function on behalf of an AI agent that takes input, produces output, needs fine-grained capability control, should not see PII, and must complete in milliseconds — use MPP.

| Dimension | Docker | MPP |
|-----------|--------|-----|
| Designed for | Service deployment | Function execution |
| Startup latency | 200 ms–3 s | Sub-millisecond (warm) to single-digit ms (cold) |
| Package size | 5 MB–1 GB+ | 100–500 KB |
| Crypto verification | Opt-in (DCT/Notary) | Built-in (Ed25519) |
| Capability granularity | Container-level | Per-domain, per-path, per-variable |
| Privacy filtering | None built-in | Built-in PII redaction |
| Audit logging | Application stdout | Structured, hash-chained records |
| Operational overhead | High (orchestrator, policies, monitoring) | Low (single runtime) |

Most enterprise AI deployments will use both. Containers for the infrastructure layer — the agent framework, the database, the API gateway. MPP for the tool execution layer — the individual functions that agents invoke dozens of times per conversation.

They are complementary, not competing. But the tool execution layer needs a tool execution solution, not a service deployment solution repurposed for the job.

For a deep dive into how MPP's WASM sandbox works, read WASM Sandboxing Explained. For the broader security model, see Inside the Gatekeeper.