Build MCP Servers to Connect ChatGPT to Private Data

Create remote MCP servers using Python and FastMCP to expose vector store data to ChatGPT apps and deep research via standardized search and fetch tools.

MCP as the Standard for AI Tool Extensions

Model Context Protocol (MCP) is an open protocol emerging as the industry standard for connecting AI models to external tools and knowledge sources over the internet. Remote MCP servers power ChatGPT apps (formerly connectors), deep research, company knowledge, and API integrations by exposing private data sources such as vector stores. The approach prioritizes read-only access for compatibility, avoiding mutable operations that could conflict with model reasoning.

The core opportunity: bridge proprietary data sources to LLMs without rebuilding retrieval pipelines from scratch. OpenAI recommends MCP for data-only apps, where you expose search and fetch tools; no custom UI is required if you focus purely on data. Tradeoffs include strict schema adherence for tool outputs (JSON-encoded in text content items) to ensure model compatibility, and a reliance on vector stores for simplicity, though any data source works.

"Remote MCP servers can be used to connect models over the Internet to new data sources and capabilities." This highlights MCP's role in scalable, standardized integrations beyond one-off prompts.

Vector Stores as the Starting Data Source

Start with OpenAI's vector stores for retrieval-augmented generation (RAG) functionality. Upload files via the dashboard (platform.openai.com/storage/vector_stores) or the API, using examples like the public-domain cats.pdf (a 19th-century book on cats: https://cdn.openai.com/API/docs/cats.pdf). Note the vector store ID for server integration.
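
If you script the upload instead, here is a minimal sketch, assuming the current openai Python SDK (where vector stores have moved out of beta) and an OPENAI_API_KEY in the environment; the store name "cats-docs" is an arbitrary placeholder:

import requests
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Create an empty vector store for the documents.
store = client.vector_stores.create(name="cats-docs")

# Download the public-domain sample file to disk.
with open("cats.pdf", "wb") as f:
    f.write(requests.get("https://cdn.openai.com/API/docs/cats.pdf").content)

# Upload the file and block until embedding/indexing completes.
with open("cats.pdf", "rb") as f:
    client.vector_stores.files.upload_and_poll(vector_store_id=store.id, file=f)

print(store.id)  # note this ID for the MCP server

The printed ID is the value you later plug into the server.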

Why vector stores? They handle embedding, indexing, and similarity search out of the box, reducing boilerplate. Alternatives like custom databases were possible but rejected here for speed, since vector stores integrate directly with OpenAI APIs. Once set up, the store becomes queryable via the MCP tools, letting ChatGPT perform semantic search on private docs.

Tradeoffs: Vector stores incur storage/query costs (check OpenAI pricing), and file limits apply (e.g., PDF size caps). For production, monitor token counts and compaction to manage context windows.

Essential Tools: Search and Fetch Schemas

MCP servers for ChatGPT compatibility must implement two read-only tools: search (find relevant results) and fetch (retrieve full content). These follow precise schemas to match model expectations, using MCP's content array format where results are JSON strings in type: "text" items.

Search tool:

  • Input: Single query string.
  • Output: {"results": [{ "id": "unique-id", "title": "human-readable", "url": "canonical-url" }]} as JSON-encoded text in one content item.

Example response:

{
  "content": [
    {
      "type": "text",
      "text": "{\"results\":[{\"id\":\"doc-1\",\"title\":\"...\",\"url\":\"...\"}]}"
    }
  ]
}

Fetch tool:

  • Input: Document id string.
  • Output: {"id": "...", "title": "...", "text": "full content", "url": "...", "metadata": {}} as JSON-encoded text.

Example:

{
  "content": [
    {
      "type": "text",
      "text": "{\"id\":\"doc-1\",\"title\":\"...\",\"text\":\"full text...\",\"url\":\"https://example.com/doc\",\"metadata\":{\"source\":\"vector_store\"}}",
    }
  ]
}

Reasoning: search provides lightweight previews for relevance ranking, while fetch delivers the full payload for the model to reason over. Deviating from these schemas risks model parsing failures. Non-obvious details: URLs enable citations in research outputs, and metadata adds provenance without bloating the text field.

"For ChatGPT deep research and company knowledge... your MCP server should implement two read-only tools: search and fetch, using the compatibility schema." This enforces minimal viable integration.

FastMCP Implementation in Python

Use FastMCP (GitHub: https://github.com/jlowin/fastmcp) for a lightweight Python server. The full implementation wires the OpenAI client into the two tools to query the vector store (see the sketch after these steps):

  1. Install: pip install fastmcp openai.
  2. Define search and fetch tools that query the store by its ID.
  3. Run the server and expose its endpoint over HTTP.
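
A minimal server sketch, assuming FastMCP 2.x and the current openai Python SDK; VECTOR_STORE_ID comes from the environment, the example.com URLs are placeholders, and the exact shapes of the search and file-content responses should be verified against the SDK version you install:

import json
import os

from fastmcp import FastMCP
from openai import OpenAI

VECTOR_STORE_ID = os.environ["VECTOR_STORE_ID"]  # e.g. "vs_..."

client = OpenAI()
mcp = FastMCP("cats-mcp")

@mcp.tool
def search(query: str) -> str:
    """Return lightweight previews of documents matching the query."""
    page = client.vector_stores.search(vector_store_id=VECTOR_STORE_ID, query=query)
    results = [
        # The file ID doubles as the document ID that fetch will receive.
        {"id": r.file_id, "title": r.filename, "url": f"https://example.com/docs/{r.file_id}"}
        for r in page.data
    ]
    return json.dumps({"results": results})

@mcp.tool
def fetch(id: str) -> str:
    """Return the full text of one document, looked up by ID."""
    content = client.vector_stores.files.content(id, vector_store_id=VECTOR_STORE_ID)
    text = "\n".join(part.text for part in content.data)
    return json.dumps({
        "id": id,
        "title": id,  # a real server would keep a filename lookup
        "text": text,
        "url": f"https://example.com/docs/{id}",
        "metadata": {"source": "vector_store"},
    })

if __name__ == "__main__":
    # Streamable HTTP exposes the server as a remote MCP endpoint.
    mcp.run(transport="http", host="0.0.0.0", port=8000)

Since FastMCP wraps each returned string in a single type:"text" content item, these json.dumps payloads line up with the compatibility schemas above.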

Replit demo (https://replit.com/@kwhinnery-oai/DeepResearchServer) allows instant testing: remix, add API key/vector ID, connect to ChatGPT.

Other frameworks exist across languages, but all must match the MCP tool specs. Tradeoffs: FastMCP is simple for prototypes but may need hardening for scale (e.g., async handlers for high QPS). Authentication via the Apps SDK handles user sessions.

"In this example, we are going to build our MCP server using Python and FastMCP." Practical choice for rapid iteration.

Deployment and ChatGPT Integration

Post-server build:

  • Follow the Apps SDK docs: quickstart, MCP server build, then connect in ChatGPT developer mode.
  • For data-only apps: skip the UI and focus on the tools.
  • Supports chat, deep research, and the API (via the Responses API).
  • Terminology: Connectors → apps (Dec 17, 2025 update).

Production tips: Secure the server with auth (see the Apps SDK guide) and review the submission guidelines before shipping. Use it for company knowledge in Business/Enterprise workspaces. Evolution: from legacy Assistants to MCP for better scalability.

"Note: For ChatGPT app setup (developer mode, connecting your MCP server, and optional UI), start with the Apps SDK docs."

Key Takeaways

  • Use vector stores for quick private data setup; upload via dashboard/API and note ID.
  • Implement exactly search (query → results list) and fetch (ID → full doc) with JSON-in-text MCP format.
  • Build with Python FastMCP for simplicity; test on Replit before deploying.
  • Prioritize read-only tools for ChatGPT/deep research compatibility; add metadata/URLs for citations.
  • Integrate via Apps SDK: auth, connect in developer mode, submit for production.
  • Scale tradeoffs: Monitor costs, ensure schema precision to avoid model errors.
  • Extend beyond vectors to any data source following MCP specs.
