Ghost-Browser
Overview
Ghost-Browser is a sandboxed headless browsing agent that performs web scraping and navigation within a strict content-addressed boundary. It demonstrates how MPP enforces network-scoped capabilities — the tool can fetch URLs but cannot persist cookies, access stored credentials, or leak DOM content outside the sandbox.
Give an AI agent eyes on the web — without handing over your browser session.
Manifest
[package]
name = "ghost-browser"
version = "0.2.1"
description = "Sandboxed headless browsing agent"
authors = ["MPP Reference Team"]
license = "Apache-2.0"
[runtime]
target = "wasm32-wasi"
memory = "128MB"
[permissions]
network = "fetch-only"
fs = "deny"
credentials = "deny"
[permissions.network.allowlist]
domains = ["*"]
schemes = ["https"]
[sandbox]
csp = "strict"
[signing]
algorithm = "Ed25519"
key_id = "mpp-reference-2025"
Architecture
Ghost-Browser runs a lightweight HTML parser and HTTP client inside the WASM sandbox. Every outbound request passes through the MPP runtime's network capability gate, which enforces the fetch-only permission and the strict CSP.
Execution Flow
- Agent Request: The host AI agent sends a URL or navigation instruction to the tool via the MPP invoke interface.
- URL Validation: The WASM module validates the target URL against the declared allowlist. Only https:// schemes are permitted.
- Capability Gate: The runtime intercepts the outbound fetch, verifies the network = "fetch-only" permission, and strips any ambient credentials, cookies, or auth headers.
- Content Parsing: The raw HTML response is parsed in-sandbox. The tool extracts structured data (text, links, metadata) according to the agent's request.
- Response: Extracted content is returned as JSON through the MPP result channel. No raw HTML is passed to the agent unless explicitly requested.
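The URL validation step above can be sketched in a few lines. This is a minimal illustration, not the tool's actual implementation: the function name `is_url_allowed` and the shell-style wildcard matching of domain patterns are assumptions; only the `domains` and `schemes` values come from the manifest.

```python
from fnmatch import fnmatchcase
from urllib.parse import urlsplit

# Values mirror [permissions.network.allowlist] in the manifest.
ALLOWED_DOMAINS = ["*"]
ALLOWED_SCHEMES = ["https"]


def is_url_allowed(url, domains=ALLOWED_DOMAINS, schemes=ALLOWED_SCHEMES):
    """Return True if the URL's scheme and host both match the allowlist.

    Assumption: domain patterns use shell-style wildcards ("*", "*.example.com").
    """
    parts = urlsplit(url)
    if parts.scheme not in schemes:
        return False  # e.g. plain http:// or wss://
    host = parts.hostname  # lowercased by urlsplit; None if missing
    if not host:
        return False
    return any(fnmatchcase(host, pattern) for pattern in domains)
```

With the reference manifest's `domains = ["*"]`, every HTTPS host passes; a host that narrows the list to, say, `["*.example.com"]` would reject anything outside that suffix.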
Security Boundaries
| Layer | Control |
|---|---|
| WASM sandbox | Linear memory isolation — no access to host browser state or DOM |
| Network: fetch-only | Can make outbound HTTPS GET/POST requests; cannot open WebSockets, SSE streams, or raw TCP |
| Credential deny | No access to cookies, localStorage, session tokens, or auth headers |
| Strict CSP | No inline scripts, no eval, no dynamic imports — prevents code injection in parsed content |
| FS deny | No file-system access — no caching, no persistent state between invocations |
| Ed25519 signature | Package integrity is verified before any code is loaded |
Permissions Detail
- network (fetch-only): The tool can issue HTTP GET and POST requests to any domain over HTTPS. Plain HTTP is blocked. WebSocket and streaming connections are denied.
- credentials (deny): The runtime strips all ambient authentication from outbound requests. The tool cannot access cookies, bearer tokens, or API keys from the host environment.
- fs (deny): No file-system reads or writes. Downloaded content exists only in WASM linear memory for the duration of the invocation.
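To make the interaction of the network and credentials permissions concrete, here is a hedged sketch of what a capability gate might do to a request before it leaves the sandbox. The function `gate_request` and the exact set of header names treated as credentials are hypothetical; the runtime's real rules are not specified in this document.

```python
# fetch-only permits only these HTTP methods, per the permissions above.
ALLOWED_METHODS = {"GET", "POST"}

# Assumption: headers treated as ambient credentials (lowercase for comparison).
CREDENTIAL_HEADERS = {"authorization", "cookie", "proxy-authorization", "x-api-key"}


def gate_request(method, headers):
    """Reject non-fetch methods and return a copy of headers without credentials."""
    if method.upper() not in ALLOWED_METHODS:
        raise PermissionError(f"method {method!r} is not permitted by fetch-only")
    return {
        name: value
        for name, value in headers.items()
        if name.lower() not in CREDENTIAL_HEADERS
    }
```

Stripping by header name, rather than trusting the tool to omit credentials, keeps the guarantee on the runtime side of the boundary.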
Usage Example
# Install from registry
mpp install ghost-browser@0.2.1
# Verify signature before first run
mpp verify ghost-browser
# ✓ Ed25519 signature valid (key: mpp-reference-2025)
# ✓ Manifest hash matches archive
# ✓ Permissions: network(fetch-only), credentials(deny), fs(deny)
# Fetch and extract page content
mpp run ghost-browser --input '{
"url": "https://example.com/blog/ai-safety",
"extract": ["title", "body_text", "links"]
}'
# Example response
{
"status": "ok",
"data": {
"title": "AI Safety in Production Systems",
"body_text": "As AI agents become more capable...",
"links": [
{ "text": "Research Paper", "href": "https://example.com/papers/safety.pdf" },
{ "text": "GitHub Repo", "href": "https://github.com/example/ai-safety" }
]
},
"meta": { "status_code": 200, "content_length": 24810, "elapsed_ms": 340 }
}
Extraction Modes
Ghost-Browser supports several structured extraction modes, defined in the extract field of the input:
| Mode | Description |
|---|---|
| title | Page title from <title> or og:title meta tag |
| body_text | Cleaned plaintext from the main content area (nav/footer stripped) |
| links | All anchor tags with text and href, deduplicated |
| meta | OpenGraph, Twitter Card, and standard meta tags |
| tables | Structured table data as arrays of row objects |
| raw_html | Full sanitised HTML response (stripped of scripts and event handlers) |
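As a rough illustration of how two of these modes (title and links) could be produced, the sketch below uses Python's standard `html.parser`. It is not the tool's parser, which runs as WASM; the class `PageExtractor` and the function `extract` are hypothetical names, and the og:title fallback is omitted for brevity.

```python
from html.parser import HTMLParser


class PageExtractor(HTMLParser):
    """Collect the <title> text and deduplicated anchor links."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.links = []
        self._seen = set()      # (text, href) pairs already emitted
        self._in_title = False
        self._href = None       # href of the anchor currently open, if any
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True
        elif tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._in_title:
            self.title += data
        elif self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False
        elif tag == "a" and self._href is not None:
            link = ("".join(self._text).strip(), self._href)
            if link not in self._seen:
                self._seen.add(link)
                self.links.append({"text": link[0], "href": link[1]})
            self._href = None


def extract(html, modes):
    """Return only the requested modes, shaped like the "data" object above."""
    parser = PageExtractor()
    parser.feed(html)
    parser.close()
    available = {"title": parser.title.strip(), "links": parser.links}
    return {mode: available[mode] for mode in modes if mode in available}
```

Calling `extract(page_html, ["title", "links"])` returns a dictionary with just those two keys, matching the shape of the example response.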
Threat Mitigations
- Session Hijacking: Credentials are never attached to requests. The tool has no access to the user's browser sessions or auth tokens.
- DOM Exfiltration: The tool operates on fetched HTML, not the host's live DOM. There is a complete isolation boundary between the agent's browsing and the user's browser.
- XSS via Parsed Content: The strict CSP sandbox prevents execution of any script content encountered in fetched pages.
- SSRF: While the tool can fetch arbitrary HTTPS URLs, it operates in a fully isolated sandbox with no access to internal network resources. Hosts can further restrict the domain allowlist, which the reference manifest leaves open as domains = ["*"].
- Supply-Chain Tampering: The Ed25519 signature covers the entire archive. Any modification invalidates the signature and prevents execution.
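For the SSRF item, a host that wants an extra check on top of the allowlist could reject URLs whose host is a literal internal IP address. The sketch below is an assumption about how such a host-side filter might look, not something the MPP runtime is documented to do; `targets_internal_address` is a hypothetical name, and DNS names that resolve to internal addresses are deliberately out of scope.

```python
import ipaddress
from urllib.parse import urlsplit


def targets_internal_address(url):
    """Return True if the URL's host is a literal private, loopback, or link-local IP."""
    host = urlsplit(url).hostname
    if host is None:
        return False
    try:
        addr = ipaddress.ip_address(host)
    except ValueError:
        # Not an IP literal (a DNS name); resolving it is outside this sketch.
        return False
    return addr.is_private or addr.is_loopback or addr.is_link_local
```

A complete defence would also need to check the address a hostname resolves to at connection time, which this example does not attempt.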
Source & Build
# Clone the reference repo
git clone https://github.com/mpp-protocol/reference-tools.git
cd reference-tools/ghost-browser
# Build the WASM module
cargo build --target wasm32-wasi --release
# Package as .mpp artifact
mpp pack --sign --key ~/.mpp/keys/mpp-reference-2025.key
The resulting ghost-browser-0.2.1.mpp artifact can be published to any MPP-compatible registry or shared directly as a signed file.