Reference Tool · v0.2.1

Ghost-Browser

Overview

Ghost-Browser is a sandboxed headless browsing agent that performs web scraping and navigation within a strict content-addressed boundary. It demonstrates how MPP enforces network-scoped capabilities — the tool can fetch URLs but cannot persist cookies, access stored credentials, or leak DOM content outside the sandbox.

Give an AI agent eyes on the web — without handing over your browser session.

Manifest

[package]
name        = "ghost-browser"
version     = "0.2.1"
description = "Sandboxed headless browsing agent"
authors     = ["MPP Reference Team"]
license     = "Apache-2.0"

[runtime]
target = "wasm32-wasi"
memory = "128MB"

[permissions]
network     = "fetch-only"
fs          = "deny"
credentials = "deny"

[permissions.network.allowlist]
domains = ["*"]
schemes = ["https"]

[sandbox]
csp = "strict"

[signing]
algorithm = "Ed25519"
key_id    = "mpp-reference-2025"
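
To illustrate how a host might act on the declared permissions, the following sketch compares them against a local policy before allowing installation. The `DECLARED` values are copied from the `[permissions]` table above. The `HOST_POLICY` structure, the `"deny"` value for `network`, and the `policy_violations` helper are hypothetical and not part of MPP.

```python
# Hypothetical host-side policy check; not part of the MPP runtime.

# Permission values copied from the [permissions] table of the manifest.
DECLARED = {"network": "fetch-only", "fs": "deny", "credentials": "deny"}

# Assumed host policy: the values this host tolerates for each permission.
# "deny" for network is an assumed value; the manifest only shows "fetch-only".
HOST_POLICY = {
    "network": {"deny", "fetch-only"},
    "fs": {"deny"},
    "credentials": {"deny"},
}


def policy_violations(declared, policy):
    """Return a sorted list of permissions whose declared value the policy rejects."""
    violations = []
    for name, value in declared.items():
        allowed = policy.get(name)
        # Permissions the policy does not know about are rejected, not ignored.
        if allowed is None or value not in allowed:
            violations.append(name)
    return sorted(violations)
```

An empty list means the declared permissions fit the policy. Rejecting unknown permission names by default keeps the check conservative if a manifest declares a capability the host has never seen.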

Architecture

Ghost-Browser runs a lightweight HTML parser and HTTP client inside the WASM sandbox. Every outbound request passes through the MPP runtime's network capability gate, which enforces the fetch-only permission and the strict CSP policy.

Execution Flow

  1. Agent Request: The host AI agent sends a URL or navigation instruction to the tool via the MPP invoke interface.
  2. URL Validation: The WASM module validates the target URL against the declared allowlist. Only https:// schemes are permitted.
  3. Capability Gate: The runtime intercepts the outbound fetch, verifies the network = "fetch-only" permission, and strips any ambient credentials, cookies, or auth headers.
  4. Content Parsing: The raw HTML response is parsed in-sandbox. The tool extracts structured data (text, links, metadata) according to the agent's request.
  5. Response: Extracted content is returned as JSON through the MPP result channel. No raw HTML is passed to the agent unless explicitly requested.
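
The URL validation in step 2 can be sketched as follows. This is a minimal illustration, not the module's actual code: it assumes allowlist entries are shell-style patterns (for example `*.example.com`), and the `is_url_allowed` helper is hypothetical. The default values mirror `[permissions.network.allowlist]` in the manifest.

```python
from fnmatch import fnmatchcase
from urllib.parse import urlsplit

# Values taken from [permissions.network.allowlist] in the manifest.
ALLOWED_DOMAINS = ["*"]
ALLOWED_SCHEMES = ["https"]


def is_url_allowed(url, domains=ALLOWED_DOMAINS, schemes=ALLOWED_SCHEMES):
    """Return True if the URL's scheme and host both match the allowlist."""
    try:
        parts = urlsplit(url)
    except ValueError:
        # Malformed URLs (for example a broken IPv6 literal) are rejected.
        return False
    host = parts.hostname  # lower-cased by urlsplit; None when no host is present
    if parts.scheme not in schemes or not host:
        return False
    # Assumption: allowlist entries are shell-style patterns such as "*.example.com".
    return any(fnmatchcase(host, pattern) for pattern in domains)
```

With the default `["*"]` domain list, only the scheme check narrows anything. A host that passes a tighter list, such as `["*.example.com"]`, would reject everything outside that domain.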

Security Boundaries

Layer                 Control
WASM sandbox          Linear memory isolation; no access to host browser state or DOM
Network: fetch-only   Can make outbound HTTPS GET/POST requests; cannot open WebSockets, SSE streams, or raw TCP
Credential deny       No access to cookies, localStorage, session tokens, or auth headers
Strict CSP            No inline scripts, no eval, no dynamic imports; prevents code injection in parsed content
FS deny               No file-system access; no caching, no persistent state between invocations
Ed25519 signature     Package integrity is verified before any code is loaded

Permissions Detail

  • network — fetch-only: The tool can issue HTTP GET and POST requests to any domain over HTTPS. Plain HTTP is blocked. WebSocket and streaming connections are denied.
  • credentials — deny: The runtime strips all ambient authentication from outbound requests. The tool cannot access cookies, bearer tokens, or API keys from the host environment.
  • fs — deny: No file-system reads or writes. Downloaded content exists only in WASM linear memory for the duration of the invocation.
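
The combined effect of the network and credentials permissions on a single request can be sketched as below. This is an assumption-laden illustration, not the runtime's gate: the `CREDENTIAL_HEADERS` set and the `gate_request` helper are hypothetical, and the actual list of headers the runtime strips is not given here.

```python
# Hypothetical model of the capability gate; not the MPP runtime's code.

# Assumed set of credential-bearing request headers (lower-cased for comparison).
CREDENTIAL_HEADERS = {"authorization", "cookie", "proxy-authorization"}

# The permissions text names GET and POST as the permitted methods.
ALLOWED_METHODS = {"GET", "POST"}


def gate_request(method, headers):
    """Return headers with credentials removed, or raise for a disallowed method."""
    if method.upper() not in ALLOWED_METHODS:
        raise PermissionError(f"method {method!r} is not permitted under fetch-only")
    # Header names are case-insensitive, so compare in lower case.
    return {
        name: value
        for name, value in headers.items()
        if name.lower() not in CREDENTIAL_HEADERS
    }
```

Stripping at the gate, rather than trusting the module to omit credentials, means the guarantee holds even if the module itself is buggy.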

Usage Example

# Install from registry
mpp install ghost-browser@0.2.1

# Verify signature before first run
mpp verify ghost-browser
# ✓ Ed25519 signature valid (key: mpp-reference-2025)
# ✓ Manifest hash matches archive
# ✓ Permissions: network(fetch-only), credentials(deny), fs(deny)

# Fetch and extract page content
mpp run ghost-browser --input '{
  "url": "https://example.com/blog/ai-safety",
  "extract": ["title", "body_text", "links"]
}'

# Example response
{
  "status": "ok",
  "data": {
    "title": "AI Safety in Production Systems",
    "body_text": "As AI agents become more capable...",
    "links": [
      { "text": "Research Paper", "href": "https://example.com/papers/safety.pdf" },
      { "text": "GitHub Repo", "href": "https://github.com/example/ai-safety" }
    ]
  },
  "meta": { "status_code": 200, "content_length": 24810, "elapsed_ms": 340 }
}
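
A host agent consuming this response might handle it along these lines. The sketch relies only on the fields shown in the example response; the `parse_result` and `link_hrefs` helpers are hypothetical, and treating any status other than "ok" as a failure is an assumption, since the error format is not documented here.

```python
import json


def parse_result(raw):
    """Parse a tool response string and return (data, meta)."""
    result = json.loads(raw)
    # Assumption: any status other than "ok" indicates a failed invocation.
    if result.get("status") != "ok":
        raise ValueError(f"tool reported status {result.get('status')!r}")
    return result.get("data", {}), result.get("meta", {})


def link_hrefs(data):
    """Return the href of every extracted link, in document order."""
    return [link["href"] for link in data.get("links", [])]
```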

Extraction Modes

Ghost-Browser supports several structured extraction modes, defined in the extract field of the input:

Mode        Description
title       Page title from <title> or og:title meta tag
body_text   Cleaned plaintext from the main content area (nav/footer stripped)
links       All anchor tags with text and href, deduplicated
meta        OpenGraph, Twitter Card, and standard meta tags
tables      Structured table data as arrays of row objects
raw_html    Full sanitised HTML response (stripped of scripts and event handlers)
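
To give a feel for what the title and links modes involve, here is a minimal sketch using Python's standard `html.parser`. It is not the tool's parser: the `TitleAndLinks` class and `extract` function are hypothetical, the og:title fallback is omitted, and deduplicating on the (text, href) pair is an assumption, since the table does not specify the key.

```python
from html.parser import HTMLParser


class TitleAndLinks(HTMLParser):
    """Collect the <title> text and deduplicated anchors from an HTML document."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.links = []
        self._seen = set()      # (text, href) pairs already emitted
        self._in_title = False
        self._href = None       # href of the anchor currently open, if any
        self._text = []         # text fragments inside the open anchor

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True
        elif tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._in_title:
            self.title += data
        elif self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False
        elif tag == "a" and self._href is not None:
            entry = ("".join(self._text).strip(), self._href)
            # Assumption: two links are duplicates when text and href both match.
            if entry not in self._seen:
                self._seen.add(entry)
                self.links.append({"text": entry[0], "href": entry[1]})
            self._href = None


def extract(html):
    """Return the title and links of an HTML string in the response's shape."""
    parser = TitleAndLinks()
    parser.feed(html)
    parser.close()
    return {"title": parser.title.strip(), "links": parser.links}
```

Anchors without an href attribute are skipped, and nested inline markup inside a link contributes its text to the link's label.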

Threat Mitigations

  • Session Hijacking: Credentials are never attached to requests. The tool has no access to the user's browser sessions or auth tokens.
  • DOM Exfiltration: The tool operates on fetched HTML, not the host's live DOM. There is a complete isolation boundary between the agent's browsing and the user's browser.
  • XSS via Parsed Content: The strict CSP sandbox prevents execution of any script content encountered in fetched pages.
  • SSRF: While the tool can fetch arbitrary HTTPS URLs, it operates in a fully isolated sandbox with no access to internal network resources. Hosts can further restrict the domain allowlist.
  • Supply-Chain Tampering: The Ed25519 signature covers the entire archive. Any modification invalidates the signature and prevents execution.
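
As one example of how a host might further restrict targets beyond the domain allowlist, the sketch below rejects URLs whose host is an IP literal in a non-public range, or a localhost name, before the request reaches the tool. The `targets_internal_address` helper is hypothetical and not part of MPP. It covers only literal addresses; hostnames that resolve to internal addresses would need a check at DNS resolution time, which this sketch does not attempt.

```python
import ipaddress
from urllib.parse import urlsplit


def targets_internal_address(url):
    """Return True if the URL's host is 'localhost' or a non-public IP literal."""
    host = urlsplit(url).hostname or ""
    if host == "localhost" or host.endswith(".localhost"):
        return True
    try:
        addr = ipaddress.ip_address(host)
    except ValueError:
        # Not an IP literal. Resolving hostnames is out of scope for this sketch.
        return False
    # is_global is False for private, loopback, link-local and similar ranges.
    return not addr.is_global
```

A host could call this before invoking the tool and refuse any URL for which it returns True, for instance the link-local metadata address `169.254.169.254`.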

Source & Build

# Clone the reference repo
git clone https://github.com/mpp-protocol/reference-tools.git
cd reference-tools/ghost-browser

# Build the WASM module
cargo build --target wasm32-wasi --release

# Package as .mpp artifact
mpp pack --sign --key ~/.mpp/keys/mpp-reference-2025.key

The resulting ghost-browser-0.2.1.mpp artifact can be published to any MPP-compatible registry or shared directly as a signed file.