Bringing LLM Intelligence to SQL Workflows

AlloyDB AI functions bridge the gap between static database storage and generative AI by letting developers invoke foundation models such as Gemini directly within SQL queries. This makes complex operations possible inside the database, including reranking hybrid search results based on external knowledge, filtering transactions for fraud, and converting unstructured text into structured JSON, without extracting data to an external application layer.

Key generally available functions include:

  • Ranking & Filtering: ai.rank (including semantic ranking) and ai.if for intelligent, context-aware filtering.
  • Generation: ai.generate for transforming unstructured data into structured formats.
  • Forecasting: ai.forecast, powered by the TimesFM model, for predictive analytics on historical data.
  • Insights: ai.analyze_sentiment, ai.summarize, and ai.agg_summarize for distilling large volumes of text or multi-row data into actionable insights.

Optimizing Performance and Cost

A primary barrier to using LLMs in databases is the latency and cost of row-by-row API calls. AlloyDB addresses this through "Optimized AI Functions." Instead of calling a remote LLM for every row, the system trains a local model on your specific embeddings and the corresponding LLM outputs. When a query runs, the database invokes this local model, which can process up to 100,000 rows per second. Benchmarks indicate this method is up to 23,000 times faster and 6,000 times cheaper than traditional row-at-a-time LLM calls, costing less than one-tenth of a cent per operation. Further speedups come from asynchronous bulk prompting, AI function acceleration, and array-based processing.
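
The idea of training a local model on embeddings and LLM outputs can be illustrated with a small, generic sketch. This is not AlloyDB's implementation, whose internals are not described here; it only shows the general pattern of labeling a sample of rows with an expensive model and then fitting a cheap classifier that answers for the remaining rows. The names slow_llm_label, train_proxy, and proxy_predict are hypothetical, the "LLM" is a stand-in rule, and the two-dimensional embeddings are synthetic.

```python
import math
import random


def slow_llm_label(embedding):
    """Hypothetical stand-in for a remote LLM call that answers yes/no.

    Assumption for illustration only: the "LLM" says yes when the first
    embedding component is larger than the second.
    """
    return embedding[0] > embedding[1]


def train_proxy(embeddings, labels, epochs=200, lr=0.5):
    """Fit a logistic-regression proxy on (embedding, LLM label) pairs
    using plain batch gradient descent."""
    dim = len(embeddings[0])
    weights = [0.0] * dim
    bias = 0.0
    n = len(embeddings)
    for _ in range(epochs):
        grad_w = [0.0] * dim
        grad_b = 0.0
        for x, y in zip(embeddings, labels):
            z = sum(w * xi for w, xi in zip(weights, x)) + bias
            p = 1.0 / (1.0 + math.exp(-z))
            err = p - (1.0 if y else 0.0)
            for i in range(dim):
                grad_w[i] += err * x[i]
            grad_b += err
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


def proxy_predict(model, embedding):
    """Answer locally with the proxy instead of calling the LLM."""
    weights, bias = model
    return sum(w * xi for w, xi in zip(weights, embedding)) + bias > 0


if __name__ == "__main__":
    rng = random.Random(0)
    # Label a small sample with the expensive model, then train the proxy.
    sample = [[rng.uniform(-1, 1), rng.uniform(-1, 1)] for _ in range(200)]
    model = train_proxy(sample, [slow_llm_label(e) for e in sample])
    # Every remaining row is answered by the proxy, with no remote call.
    rest = [[rng.uniform(-1, 1), rng.uniform(-1, 1)] for _ in range(1000)]
    agreement = sum(
        proxy_predict(model, e) == slow_llm_label(e) for e in rest
    ) / len(rest)
    print(f"agreement with LLM labels: {agreement:.2%}")
```

In this sketch the expensive call happens only for the training sample, so per-row cost at query time is a dot product. A real system would also need to measure how well the proxy agrees with the LLM and fall back to the LLM when agreement is too low; that part is omitted here.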