The Shift to the Agentic Data Cloud

Traditional enterprise data architectures are often "walled gardens" of siloed databases that force developers to move data to AI, creating latency and losing real-time business context. Google’s strategy is to "move AI to the data" by creating an Agentic Data Cloud. This platform integrates AI at every layer of the stack—from TPU-accelerated inference to model-integrated SQL—transforming databases from passive systems of insight into active systems of action.

AI-Native Databases: Beyond Storage

An AI-native database understands data through built-in AI primitives rather than just storing it.

  • AlloyDB for PostgreSQL: Combines relational storage with high-performance vector search. It uses Google’s proprietary ScaNN index (used in YouTube and Search) to support 10 billion+ vectors and provides columnar indexing to boost HNSW performance by up to 4x.
  • Hybrid Search & Re-ranking: By combining vector search with BM25-based full-text search, AlloyDB lets a single query match on both meaning and exact keywords. Developers can use AI functions like AI.rank to re-rank search candidates using Gemini’s world knowledge, allowing for nuanced intent matching (e.g., understanding that "Santorini" implies specific weather and clothing needs).
  • Multimodal Capabilities: The platform supports time-series forecasting via the TimesFM model, allowing agents to perform complex predictions in seconds that previously required days of manual processing.
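
The source does not say how AlloyDB merges the vector and BM25 result lists, so the following is only a minimal Python sketch of the general idea, using reciprocal rank fusion (RRF) as a stand-in. The toy documents, the two-dimensional "embeddings," and every function name are assumptions made for illustration. A Gemini-based re-ranker such as AI.rank would then run over the fused candidates and is not modeled here.

```python
import math
from collections import Counter

# Toy catalog: id -> (tokens, embedding). All values are made up.
DOCS = {
    "a": (["linen", "shirt", "summer"], [0.7, 0.5]),
    "b": (["wool", "coat", "winter"], [0.1, 0.9]),
    "c": (["beach", "sandals", "summer"], [1.0, 0.05]),
    "d": (["cotton", "shirt"], [0.5, 0.6]),
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0


def vector_ranking(query_vec):
    """Doc ids ordered by descending cosine similarity to the query."""
    scores = {doc_id: cosine(query_vec, emb) for doc_id, (_, emb) in DOCS.items()}
    return sorted(scores, key=lambda d: (-scores[d], d))


def bm25_ranking(query_terms, k1=1.5, b=0.75):
    """Doc ids with a non-zero BM25 score, best first."""
    n = len(DOCS)
    avg_len = sum(len(toks) for toks, _ in DOCS.values()) / n
    doc_freq = Counter(t for toks, _ in DOCS.values() for t in set(toks))
    scores = {}
    for doc_id, (toks, _) in DOCS.items():
        tf = Counter(toks)
        score = 0.0
        for term in query_terms:
            if tf[term] == 0:
                continue
            idf = math.log(1 + (n - doc_freq[term] + 0.5) / (doc_freq[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(toks) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        if score > 0:
            scores[doc_id] = score
    return sorted(scores, key=lambda d: (-scores[d], d))


def rrf(rankings, k=60):
    """Reciprocal rank fusion: sum 1 / (k + rank) over every list a doc appears in."""
    fused = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            fused[doc_id] = fused.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(fused, key=lambda d: (-fused[d], d))


by_meaning = vector_ranking([1.0, 0.0])        # hypothetical query embedding
by_keyword = bm25_ranking(["summer", "shirt"])
print(rrf([by_meaning, by_keyword]))
```

In this toy data the two rankings disagree about the top result, and the fused list favors the document that places well in both. That is the behavior hybrid search is meant to provide.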

Bridging the Gap: The Data Agent Platform

Moving from a simple demo to a production-ready agent requires closing two gaps: the "accuracy gap" and the "security gap." Google’s Data Agent Platform addresses them through:

  • Contextual Accuracy: The platform uses schema ontologies, query blueprints, and value searches to guide LLMs toward near 100% accuracy in text-to-SQL tasks.
  • Deterministic Security: Instead of relying on the agent to be "safe," the platform uses parameterized secure views. These act as deterministic guardrails, ensuring that even if an agent is manipulated, it can only access data authorized for the specific end-user.
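
One way to picture a deterministic guardrail is sketched below in Python with the standard-library sqlite3 module. It is not AlloyDB's parameterized secure view syntax, which the source does not show. The orders table, the SECURE_ORDERS_SQL statement, and the query_orders_for helper are hypothetical. The point it illustrates is that the end-user filter is bound by the application from the authenticated session, so nothing the agent sends can widen it.

```python
import sqlite3


def make_connection():
    """In-memory database with a made-up orders table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE orders ("
        "id INTEGER PRIMARY KEY, customer_id TEXT, status TEXT, total REAL)"
    )
    conn.executemany(
        "INSERT INTO orders (customer_id, status, total) VALUES (?, ?, ?)",
        [("alice", "shipped", 40.0), ("alice", "pending", 15.5), ("bob", "shipped", 99.0)],
    )
    return conn


# Fixed statement standing in for a secure view (an assumption, not AlloyDB syntax).
# The customer_id filter is part of the definition, so the agent cannot remove it.
SECURE_ORDERS_SQL = """
    SELECT id, status, total
    FROM orders
    WHERE customer_id = :end_user_id
      AND (:status IS NULL OR status = :status)
    ORDER BY id
"""


def query_orders_for(conn, end_user_id, agent_args):
    """Run the fixed query; end_user_id comes from the session, never from the agent."""
    params = {
        "end_user_id": end_user_id,          # trusted: set by the application
        "status": agent_args.get("status"),  # untrusted: bound as a plain value
    }
    return conn.execute(SECURE_ORDERS_SQL, params).fetchall()


conn = make_connection()
print(query_orders_for(conn, "alice", {}))
# A manipulated agent tries to widen the scope; both attempts are inert.
print(query_orders_for(conn, "alice", {"status": "x' OR '1'='1", "end_user_id": "bob"}))
```

The first call returns only alice's rows. The second returns nothing: the injected string is compared as a literal status value, and the agent-supplied end_user_id is never read.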

Open Standards and MCP

Google is heavily invested in the Model Context Protocol (MCP) to ensure interoperability. By providing managed MCP servers, Google allows agents to interact with the entire Google Cloud ecosystem—provisioning databases, executing SQL, and performing observability tasks—without custom scaffolding. The open-source MCP Toolbox has reached 1.0 status, supporting over 40 data sources and fostering a community-driven approach to agentic connectivity.
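
MCP messages are JSON-RPC 2.0, with tools discovered via tools/list and invoked via tools/call. The Python sketch below only builds those request payloads. The tool name execute_sql and its sql argument are hypothetical placeholders, not the names exposed by Google's managed servers or by MCP Toolbox, and no transport or session handshake is shown.

```python
import json
from itertools import count

_request_ids = count(1)  # JSON-RPC requests carry a unique id per session


def jsonrpc_request(method, params=None):
    """Build a JSON-RPC 2.0 request object as a plain dict."""
    message = {"jsonrpc": "2.0", "id": next(_request_ids), "method": method}
    if params is not None:
        message["params"] = params
    return message


def list_tools_request():
    """Ask an MCP server which tools it offers."""
    return jsonrpc_request("tools/list")


def call_tool_request(name, arguments):
    """Invoke one tool by name with a dict of arguments."""
    return jsonrpc_request("tools/call", {"name": name, "arguments": arguments})


# "execute_sql" and its "sql" argument are hypothetical, for illustration only.
request = call_tool_request("execute_sql", {"sql": "SELECT 1"})
print(json.dumps(request, indent=2))
```

Because every server speaks the same message shape, an agent needs one client for many data sources, which is where the reduction in custom scaffolding comes from.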

Key Takeaways

  • Move AI to the data: Avoid the latency of moving enterprise data to external AI models by using databases that have AI primitives (vector, graph, forecasting) built into the SQL layer.
  • Use hybrid search: Don't rely on keyword search alone. Combine vector search with full-text search (BM25) to capture user intent accurately.
  • Prioritize deterministic security: Use parameterized secure views rather than relying on LLM prompts to enforce access control; even if a prompt injection succeeds, the agent still cannot reach data the end-user is not authorized to see.
  • Leverage MCP: Adopt the Model Context Protocol to standardize how your agents discover and interact with your infrastructure, reducing the need for custom integration code.
  • Focus on context: To approach 100% accuracy in text-to-SQL, provide the model with schema ontologies and query blueprints that define the specific business logic of your data.
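
As a rough illustration of the last takeaway, the Python sketch below assembles a text-to-SQL prompt from three kinds of context. The ontology entries, the blueprint, the city values, and the function names are all invented for the example, and difflib-based fuzzy matching merely stands in for whatever value search the platform performs.

```python
import difflib

# Hypothetical schema ontology: business meaning attached to columns.
ONTOLOGY = {
    "orders.total": "Order value in USD, tax included",
    "orders.city": "Shipping city, stored with canonical spelling",
}

# Hypothetical query blueprint: a vetted SQL shape for a known question type.
BLUEPRINTS = {
    "revenue_by_city": "SELECT SUM(total) FROM orders WHERE city = :city",
}

# Distinct values assumed to be present in orders.city (made up).
CITY_VALUES = ["San Francisco", "San Diego", "Seattle"]


def value_search(term, values, cutoff=0.8):
    """Map a user-typed term to the closest stored value, or None."""
    lowered = {v.lower(): v for v in values}
    hits = difflib.get_close_matches(term.lower(), list(lowered), n=1, cutoff=cutoff)
    return lowered[hits[0]] if hits else None


def build_prompt(question, city_term):
    """Combine ontology, blueprint, and resolved values into one prompt string."""
    city = value_search(city_term, CITY_VALUES)
    lines = ["Schema notes:"]
    lines += [f"- {col}: {meaning}" for col, meaning in ONTOLOGY.items()]
    lines.append("Blueprint: " + BLUEPRINTS["revenue_by_city"])
    if city is not None:
        lines.append(f"Resolved value: city = '{city}'")
    lines.append("Question: " + question)
    return "\n".join(lines)


print(build_prompt("What is total revenue in san fransisco?", "san fransisco"))
```

The model then sees the canonical spelling and a vetted query shape instead of having to guess either from the raw question.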

Notable Quotes

  • "What if instead, you could move AI to data and break down those walls?" — Amit Ganesh, on the core philosophy of the Agentic Data Cloud.
  • "The graph model is virtual and layered over your SQL model... You don't need to create a graph copy." — Yannis Papakonstantinou, explaining how Spanner handles graph RAG without data duplication.
  • "The promise of generative AI clashes with the unforgiving nature of production-ready database agentic applications." — Yannis Papakonstantinou, on the necessity of moving beyond simple demos to robust, secure infrastructure.
  • "Natural language is the new protocol for human-to-agent and agent-to-agent communication." — Amit Ganesh, on the shift in how systems interact with data.