Why AI Agents Fail: Shubham Saboo on Simple Fixes via ADK

From Prompt Engineering to User-Centric Agents

Shubham Saboo traces the evolution of AI agents from GPT-3's janky prompt loops to today's sophisticated systems. Back then, success meant spending whole afternoons on trial and error to find the 'magic words' that would coax structured output such as JSON out of the model. "Previously, the art was just making sure you get something out of the model that you want," Saboo says. Now structured outputs defined with Pydantic schemas make that trivial. Models are 'table stakes,' a universal function anyone can access.
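To make the idea of schema-checked output concrete, here is a minimal sketch using only the Python standard library. It is a stand-in for what a Pydantic model would do, not Pydantic itself and not anything shown in the interview; the `Summary` schema, its fields, and `parse_reply` are hypothetical names chosen for illustration.

```python
import json
from dataclasses import dataclass, fields


@dataclass
class Summary:
    """Hypothetical schema for a model's structured reply."""
    title: str
    word_count: int
    tags: list


def parse_reply(raw: str) -> Summary:
    """Parse a JSON reply and check it against the Summary schema."""
    data = json.loads(raw)  # raises json.JSONDecodeError on malformed JSON
    if not isinstance(data, dict):
        raise ValueError("reply must be a JSON object")

    expected = {f.name: f.type for f in fields(Summary)}
    missing = expected.keys() - data.keys()
    extra = data.keys() - expected.keys()
    if missing or extra:
        raise ValueError(f"missing={sorted(missing)} extra={sorted(extra)}")

    # Shallow type check only; a real schema library validates much more.
    for name, typ in expected.items():
        if not isinstance(data[name], typ):
            raise ValueError(f"{name} should be {typ.__name__}")
    return Summary(**data)
```

With a schema like this, a malformed reply fails loudly at the boundary instead of being patched by another round of prompt tweaking.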

What endures is that the builders who ship winners understand their users and problems deeply. "The people who are shipping the most useful apps are the ones who understand their users and the problems," Saboo emphasizes. In the GPT-3 era, prompt quality reflected how well the builder understood the problem; now that agents are everywhere, clarity of communication separates the winners. Treat agents like interns: shape the problem clearly to get the best output. Saboo's Awesome LLM Apps repo (105k stars, 15k forks) started as a personal organizer for his GPT-3 experiments, with samples structured to run locally for his own sanity. It has since grown into one of GitHub's top 100 repositories of all time, a sign that developers crave runnable examples.

Smitha Kolan asks how this led to his product manager role at Google. Saboo built in public to keep up with the fast pace of AI, and the work unexpectedly helped millions. His two books, one written when GPT-3 was new and another on neural search (the technology behind RAG and embeddings), cement his credentials, but hands-on building drives his views.

Agent CLI: Terminal-Based Agent Factory

Google's Agent CLI bundles command-line tools and skills that any coding agent (Gemini CLI, Claude, etc.) can use, with the aim of eliminating hallucinated ADK agent code. It installs with a single uvx command, then scaffolds, evaluates, and deploys agents on Agent Platform, handling YAML, environment config, and cloud setup from plain-English prompts.

In his demo, Saboo prompts Gemini CLI to "build a caveman style agent that compresses verbose text into technical grunts." The CLI scaffolds the files, installs dependencies, and spins up the ADK web UI (localhost:8080) for testing, all in under a minute. The chatbot replies in grunts such as "Me strong. Words too many. Fire big. Hunt now." A sidebar logs events, state, and artifacts for debugging.

For deployment, Saboo prompts for options (Agent Engine for serverless scaling, at a cost of about $0.01/hour), and explicit approval steps prevent surprises. The agent deploys in 5-10 minutes to a cloud console dashboard with traces, a playground, and a shareable endpoint, with no console hopping or doc pasting.

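Evals are generated and run automatically: Saboo prompts for 20 criteria and all of them pass (any failure would be flagged for fixing). Agents are extended through prompts as well, for example by adding Google Search for internet access, multi-agent workflows, or RAG. It works "99% of the time in one shot," Saboo claims, covering the whole development lifecycle without leaving the terminal.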
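The eval step can be pictured as a set of named checks run against each agent reply, with failures collected for fixing. The sketch below is an assumption about the general shape only: it is not the format Agent CLI generates, `caveman_stub` is a rule-based stand-in rather than a model call, and the three criteria are invented for illustration.

```python
from typing import Callable

# Each criterion is a name plus a check over (source text, agent output).
Criterion = tuple[str, Callable[[str, str], bool]]

FILLER = {"the", "a", "an", "is", "are", "very", "really", "that", "of", "to"}


def caveman_stub(text: str) -> str:
    """Stand-in for the agent: drop filler words, keep the first six."""
    kept = [w for w in text.lower().split() if w not in FILLER]
    return " ".join(kept[:6]).capitalize() + "."


CRITERIA: list[Criterion] = [
    ("shorter than input", lambda src, out: len(out) < len(src)),
    ("at most 8 words", lambda src, out: len(out.split()) <= 8),
    ("non-empty", lambda src, out: bool(out.strip())),
]


def run_evals(agent, cases, criteria):
    """Return (case, criterion name) pairs for every failed check."""
    failures = []
    for case in cases:
        out = agent(case)
        for name, check in criteria:
            if not check(case, out):
                failures.append((case, name))
    return failures
```

An empty result from `run_evals` corresponds to the "all pass" outcome in the demo; any returned pair names the input and the criterion to hand back to the coding agent.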

Kolan notes that before Agent CLI she would paste ADK docs into Gemini; the prepackaged skills now make that unnecessary.

Multi-Agent Mastery and Production Resilience

Saboo builds a "PR Roaster," a multi-agent system that critiques GitHub PRs. It uses ADK 2.0's graph-based workflows rather than pure prompts, with nodes for planning, analysis, and roasting. The live demo roasts Kolan's code: "This function is like a caveman trying to invent the wheel... but ending up with a square rock."
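The difference between a graph workflow and a prompt chain is easier to see in code. The following is a plain-Python sketch of the idea, not ADK 2.0's API: the node names follow the planning, analysis, and roasting steps mentioned above, while the functions, the `NODES` and `EDGES` tables, and the toy heuristics are hypothetical.

```python
State = dict  # shared state handed from node to node


def plan(state: State) -> State:
    state["plan"] = ["check naming", "check length"]
    return state


def analyze(state: State) -> State:
    diff = state["diff"]
    findings = []
    if "def f(" in diff:
        findings.append("function name 'f' says nothing")
    if len(diff.splitlines()) > 3:
        findings.append("diff is long for a single change")
    state["findings"] = findings
    return state


def roast(state: State) -> State:
    findings = state["findings"]
    state["roast"] = "; ".join(findings) if findings else "Nothing to roast."
    return state


# The graph is explicit data: which nodes exist and which node follows which.
NODES = {"plan": plan, "analyze": analyze, "roast": roast}
EDGES = {"plan": "analyze", "analyze": "roast", "roast": None}


def run_graph(start: str, state: State) -> State:
    """Walk the graph from `start` until a node has no successor."""
    node = start
    while node is not None:
        state = NODES[node](state)
        node = EDGES[node]
    return state
```

Because the control flow lives in `EDGES` rather than inside a prompt, each node can be tested, logged, or swapped on its own.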

On production pitfalls, Saboo's figure is that 99% fail because they ignore realities such as dropped connections. ADK's resumable agents checkpoint their state and retry seamlessly. Ambient agents run 24/7 on cron schedules and handle long tasks autonomously.
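Checkpoint-and-retry can be sketched in a few lines. This is an illustration of the general pattern under simple assumptions (steps are zero-argument callables, results are JSON-serializable, and failures surface as `ConnectionError`); it does not show how ADK implements resumable agents, and `run_resumable` is a hypothetical name.

```python
import json
import os


def run_resumable(steps, checkpoint_path, max_retries=3):
    """Run steps in order, saving progress so a rerun skips finished steps."""
    done = 0
    results = []
    if os.path.exists(checkpoint_path):
        with open(checkpoint_path) as fh:
            saved = json.load(fh)
        done, results = saved["done"], saved["results"]

    for index in range(done, len(steps)):
        for attempt in range(1, max_retries + 1):
            try:
                results.append(steps[index]())
                break
            except ConnectionError:
                if attempt == max_retries:
                    raise  # give up; earlier progress is already on disk
        # Persist after every completed step, not just at the end.
        with open(checkpoint_path, "w") as fh:
            json.dump({"done": index + 1, "results": results}, fh)
    return results
```

A cron-scheduled job that calls `run_resumable` with the same checkpoint path would pick up where the previous run stopped, which is the behavior the ambient-agent description relies on.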

Agent Engine supports multiple languages (Python, TypeScript, Go, Java). Tools integrate natively, including Google Search, Storage, and MCP servers. Observability is built in, with traces and metrics available from deployment onward.

Saboo contrasts the two eras: old agents were a loop plus output parsing, whereas six cron-scheduled agents now automate his own work. His advice is not to chase hype but to focus on simple architectures and clear communication.

Soft Skills Trump Code; Embeddings Remain Vital

Technical skills keep evolving, but 'soft skills' such as problem shaping and user empathy dominate. "How do you communicate with your agent? Do you understand the users?" Saboo asks. The limiting factor now is creativity, not the models.

In the rapid-fire round, Saboo says RAG is not dead; it lives on through better retrieval. Embeddings remain essential, and every developer building agents or RAG should understand them. The tool he cannot live without is Gemini CLI, which he uses for daily building.
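For readers new to embeddings, retrieval comes down to comparing vectors. The sketch below ranks documents by cosine similarity to a query vector. The three-dimensional vectors and document names are hand-written toy values for illustration; real embeddings come from an embedding model and have hundreds or thousands of dimensions.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


# Toy "embeddings": hypothetical documents with made-up vectors.
DOCS = {
    "deploy guide": [0.9, 0.1, 0.0],
    "eval checklist": [0.1, 0.9, 0.1],
    "roast examples": [0.0, 0.2, 0.9],
}


def retrieve(query_vec, docs, k=1):
    """Return the names of the k documents most similar to the query."""
    ranked = sorted(
        docs, key=lambda name: cosine(query_vec, docs[name]), reverse=True
    )
    return ranked[:k]
```

In a RAG pipeline, the retrieved documents would then be passed to the model as context; better retrieval in this sense means better vectors and better ranking, not a different basic mechanism.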

Kolan highlights Saboo's arc from solo experimenter to Google PM, driven by the value he created in open source.

"The model is a universal function now... Your only job now is to shape the problem."

Key Takeaways

Start with user and problem understanding: models are commoditized, so clarity wins.
Install Agent CLI with uvx, then prompt a coding agent to scaffold, evaluate, and deploy; Saboo claims it works 99% of the time in one shot.
Use the ADK web UI locally to inspect event logs before deploying to Agent Engine.
Add tools (Search, RAG) via single prompts; auto-generated evals flag issues to fix.
Build resumable and ambient agents for 24/7 reliability, using checkpointed state and cron jobs.
Prefer graph workflows over prompt chains for multi-agent systems; Python, TypeScript, Go, and Java are supported.
Generate 20+ evals automatically and fix failures iteratively with coding agents.
Treat agents as interns: simple English shapes output better than complex code.
Master embeddings for RAG and agents; skip the hype and ship runnable examples like Awesome LLM Apps.
