The Security Challenge of Third-Party UI

Integrating third-party UI into conversational agents like ChatGPT requires balancing extensibility with strict security. The primary goal is to allow developers to render custom HTML/JS (views) while preventing those views from accessing the host's sensitive data (cookies, local storage) or executing malicious scripts.

Injecting third-party content directly into an iframe via srcdoc fails because a srcdoc iframe inherits the parent's origin and Content Security Policy (CSP). Under a strict CSP, all of the application's scripts are blocked; if the CSP is relaxed, the application runs in the host's origin and can read the host's localStorage and cookies. Conversely, pointing the iframe's src attribute at each developer's own domain requires the host to maintain an ever-growing allowlist of third-party domains in its CSP, which does not scale.

The Double Iframe Solution

To solve this, ChatGPT employs a double-iframe architecture, a pattern pioneered by Facebook for their app marketplace.

  1. The Outer Iframe: This frame is served from a controlled, unique subdomain (e.g., app-unique-id.openai-usercontent.com). Using unique subdomains per app prevents cross-app storage collisions, ensuring that one app cannot access the localStorage or cookies of another.
  2. The Inner Iframe: This frame uses the srcdoc attribute to render the actual application content. Because it is nested within the outer iframe, it inherits the outer frame's origin rather than the host's, so it cannot reach the host's cookies or localStorage.

This structure allows the host to enforce a specific CSP on the inner frame via meta tags, preventing the execution of unauthorized scripts or the rendering of nested iframes, while maintaining a secure, isolated browsing context.
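
The two layers described above can be sketched as a few string-building helpers. This is only an illustration under stated assumptions, not ChatGPT's actual implementation: the function names are hypothetical, the domain is taken from the example subdomain above, and the app HTML is assumed to be a fragment without its own doctype.

```typescript
// Assumption: sandbox domain copied from the example above, not a confirmed production value.
const SANDBOX_DOMAIN = "openai-usercontent.com";

/** Origin of the outer frame: one subdomain per app keeps storage separate. */
function outerFrameOrigin(appId: string): string {
  // DNS labels allow only letters, digits, and hyphens.
  const label = appId.toLowerCase().replace(/[^a-z0-9-]/g, "-");
  return `https://${label}.${SANDBOX_DOMAIN}`;
}

/** Escape text for use inside a double-quoted HTML attribute. */
function escapeAttribute(value: string): string {
  return value
    .replace(/&/g, "&amp;")
    .replace(/"/g, "&quot;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

/** Prepend a CSP meta tag so the policy is parsed before any app markup. */
function withCspMeta(appHtml: string, csp: string): string {
  const meta = `<meta http-equiv="Content-Security-Policy" content="${escapeAttribute(csp)}">`;
  return meta + appHtml;
}

/** Markup the outer frame would render: an inner iframe carrying the app via srcdoc. */
function innerFrameMarkup(appHtml: string, csp: string): string {
  return `<iframe srcdoc="${escapeAttribute(withCspMeta(appHtml, csp))}"></iframe>`;
}
```

The host would point an iframe's src at outerFrameOrigin(appId), and the page served there would render innerFrameMarkup(appHtml, csp). The policy is escaped twice on purpose: once for the meta tag's content attribute and once more for the srcdoc attribute that wraps the whole document.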

Practical Implications for Developers

For developers building MCP (Model Context Protocol) apps, this architecture requires careful metadata management. Because the host environment enforces a strict CSP, developers must explicitly declare every external domain their application interacts with (e.g., for API calls, images, or scripts) in the MCP app metadata. Requests to undeclared domains fail at runtime, and the app is rejected from the store.
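
To make the link between declared domains and the enforced policy concrete, here is a minimal sketch of how a host might turn declarations into a CSP string. The DeclaredDomains shape and its field names are placeholders rather than the actual MCP metadata schema, and the chosen directives (including 'unsafe-inline' for the app's own inline scripts) are assumptions for illustration.

```typescript
// Hypothetical shape for an app's declared domains; not the real metadata schema.
interface DeclaredDomains {
  connectDomains: string[];  // origins the app may call (fetch, XHR, WebSocket)
  resourceDomains: string[]; // origins the app may load scripts and images from
}

/** Build a CSP string that allows only the declared origins. */
function buildCsp(declared: DeclaredDomains): string {
  const connect = declared.connectDomains.join(" ");
  const resources = declared.resourceDomains.join(" ");
  const directives = [
    "default-src 'none'",                              // deny anything not listed below
    `connect-src ${connect || "'none'"}`,              // API calls
    `script-src 'unsafe-inline' ${resources}`.trim(),  // inline app code plus declared hosts
    `img-src ${resources || "'none'"}`,                // images
    "frame-src 'none'",                                // no nested iframes
  ];
  return directives.join("; ");
}
```

Under this sketch, an app that calls https://api.example.com but declares nothing would get connect-src 'none', and the browser would block the request at runtime.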

To mitigate these issues, tools like Skybridge provide a "CSP Inspector." This tool diffs the domains declared in an app's metadata against the actual network calls observed during development, allowing developers to identify missing domains before submission. This addresses a common pain point where apps function correctly in local development but fail in production because of the restrictive CSP.
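
The diffing idea can be sketched in a few lines. This is not Skybridge's implementation; the function name and inputs are hypothetical, and the sketch assumes declared domains are written as full origins such as https://api.example.com.

```typescript
/**
 * Return the origins that were contacted during development but never declared.
 * `declaredOrigins` and `observedUrls` are hypothetical inputs for this sketch.
 */
function findUndeclaredOrigins(declaredOrigins: string[], observedUrls: string[]): string[] {
  const missing: string[] = [];
  for (const url of observedUrls) {
    // The standard URL API reduces a full URL to scheme + host (+ port).
    const origin = new URL(url).origin;
    const isDeclared = declaredOrigins.indexOf(origin) !== -1;
    const alreadyReported = missing.indexOf(origin) !== -1;
    if (!isDeclared && !alreadyReported) {
      missing.push(origin);
    }
  }
  return missing;
}

const undeclared = findUndeclaredOrigins(
  ["https://api.example.com"],
  [
    "https://api.example.com/v1/forecast",
    "https://cdn.example.net/icons/sun.png",
    "https://cdn.example.net/icons/rain.png",
  ],
);
// `undeclared` now holds the single origin "https://cdn.example.net".
```

Each origin is reported once regardless of how many requests hit it, which keeps the list short enough to copy straight into the app's metadata.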