60-Min Fix: Hardcoded Agent to Scalable RAG Beast
Luis Sala and Jacob Badish refactor Jacob's 'vibe-coded' outreach agent from hardcoded case studies to a production RAG system using ADK, Vertex AI Vector Search, and Gemini in 60 minutes.
Jacob's Vibe-Coded Prototype: Strengths and Hardcoded Limits
Jacob Badish, a non-technical executive, built 'Project Titanium' solo during evenings and weekends. The agent targets executives at customer companies, researches their pain points via Google Search grounding, verifies facts to curb hallucinations, and drafts personalized outreach emails tying each issue to Google solutions. It uses parallel 'fan-out' tasks for multi-company research, exponential backoff for reliability, low temperature for factual outputs, and direct Gemini SDK calls.
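A minimal sketch of that backoff-plus-low-temperature pattern, assuming the google-genai SDK; the model name, retry count, and wait times are illustrative:

```python
import random
import time

from google import genai
from google.genai import types

client = genai.Client()  # reads the API key from the environment

def generate_with_backoff(prompt: str, max_retries: int = 5) -> str:
    """Call Gemini with low temperature and retry transient failures."""
    for attempt in range(max_retries):
        try:
            response = client.models.generate_content(
                model="gemini-2.0-flash",  # illustrative model name
                contents=prompt,
                config=types.GenerateContentConfig(temperature=0.1),  # favor factual output
            )
            return response.text
        except Exception:
            # Exponential backoff with jitter: roughly 1s, 2s, 4s, ... plus noise.
            time.sleep((2 ** attempt) + random.random())
    raise RuntimeError("Gemini call failed after retries")
```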
"I vibe coded this in the evenings and weekends. I was blown away at how doable it was. If I could do it, anyone can do it," Jacob says. Key wins: Reduced runtime from 15 minutes via parallelism; self-taught robustness like backoff from Gemini prompting. But limits: Hardcoded 10-12 case studies lead to repetitive, non-scalable outputs. With 1,600+ public Google case studies available, dynamism is needed for team rollout.
Luis Sala praises the structure: "You are actually doing what's called a fan out... Essentially we can think of them as sub-agents each one responsible for processing a single account." Yet hardcoded data and raw SDK plumbing make scaling brittle.
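A sketch of that fan-out, using a thread pool as a stand-in for Jacob's parallel tasks; the company list and research stub are illustrative:

```python
from concurrent.futures import ThreadPoolExecutor

def research_account(company: str) -> dict:
    """Stand-in for the per-account work: search, verify facts, draft the email."""
    return {"company": company, "email_draft": f"(draft for {company})"}

companies = ["Acme Corp", "Globex", "Initech"]  # illustrative account list

# Fan out: each account is processed independently, like a sub-agent per account.
with ThreadPoolExecutor(max_workers=len(companies)) as pool:
    results = list(pool.map(research_account, companies))
```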
Migrating to ADK: Maintainable Agent Foundations
Luis recommends shifting from the raw Gemini SDK to the Agent Development Kit (ADK) for production. ADK enables modular agents (e.g., sequential pipelines); the plan is to replicate the v1 workflow first, then iterate. They prompt a coding agent (using Antigravity skills) to port the code: generate a plan, verify it, then build root and sequential agents for research, verification, and email drafting.
"When developing an agent, it is absolutely perfectly a legitimate tactic to use the native SDK... however... shift over to a specialized SDK specifically designed for agents," Luis advises. Baby steps preserve fan-out parallelism. ADK code stays Python, adds env vars for GCP creds. Hiccups like freezes are debugged live, emphasizing iterative planning: "The idea of creating a plan is vital. We don't want to just start coding without having at least an idea."
Jacob concurs: "I always now add in then reverify your work. Make it go back a second time cuz it catches things that it misses."
Dynamic Case Studies via Crawler and Vertex AI Vector Search
Core upgrade: Replace hardcoded studies with RAG. Luis' coding agent builds a Playwright crawler for Google's case study site (https://cloud.google.com/customers). Phase 1: Load pages, click 'show more' repeatedly, extract 1,600+ URLs. Phase 2: Fetch HTML, use Gemini to reformat as markdown JSON, chunk and embed into Vertex AI Vector Search 2.0.
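A sketch of Phase 1 with Playwright's sync API; the 'show more' selector and the link filter are assumptions about the page markup:

```python
from playwright.sync_api import sync_playwright

CUSTOMERS_URL = "https://cloud.google.com/customers"

def discover_case_study_urls() -> list[str]:
    """Phase 1: load the listing page, click 'show more' until exhausted, collect links."""
    with sync_playwright() as p:
        browser = p.chromium.launch(headless=True)
        page = browser.new_page()
        page.goto(CUSTOMERS_URL)

        # Keep clicking the 'show more' control until it disappears.
        # The selector is illustrative; the real one depends on the page markup.
        while page.locator("button:has-text('Show more')").count() > 0:
            page.locator("button:has-text('Show more')").first.click()
            page.wait_for_timeout(1000)  # give newly loaded cards time to render

        # Collect every case-study link on the fully expanded page.
        hrefs = page.eval_on_selector_all("a", "els => els.map(e => e.href)")
        browser.close()

    return sorted({h for h in hrefs if "/customers/" in h})
```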
No local ChromaDB or Pinecone: the managed Vertex service is chosen for scalability. The query function combines semantic and text search: "We're going to execute a semantic search... and then... a text search and we're going to combine those results... sometimes you might need to do a hybrid search."
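A hedged sketch of such a hybrid query, assuming google-cloud-aiplatform's MatchingEngineIndexEndpoint for the semantic leg and a plain keyword pass over locally stored markdown for the text leg; resource names and the deployed index id are placeholders:

```python
from google.cloud import aiplatform
from vertexai.language_models import TextEmbeddingModel

# Assumes aiplatform.init()/vertexai.init() have been called with your project and region.
endpoint = aiplatform.MatchingEngineIndexEndpoint(
    index_endpoint_name="projects/PROJECT/locations/us-central1/indexEndpoints/ENDPOINT_ID"
)
embedder = TextEmbeddingModel.from_pretrained("text-embedding-004")

def hybrid_search(query: str, case_studies: dict[str, str], k: int = 5) -> list[str]:
    """Combine semantic neighbors from Vector Search with a plain keyword pass."""
    # Semantic leg: embed the query and ask the index for nearest neighbors.
    vector = embedder.get_embeddings([query])[0].values
    neighbors = endpoint.find_neighbors(
        deployed_index_id="case_studies_deployed",  # illustrative deployed index id
        queries=[vector],
        num_neighbors=k,
    )
    semantic_ids = [n.id for n in neighbors[0]]

    # Text leg: a simple keyword match over stored markdown, standing in for
    # whatever text search the production query function uses.
    terms = query.lower().split()
    text_ids = [cid for cid, doc in case_studies.items()
                if any(t in doc.lower() for t in terms)][:k]

    # Combine both result sets, semantic matches first, deduplicated.
    return list(dict.fromkeys(semantic_ids + text_ids))
```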
Ingestion outputs a massive JSON file; agent queries tie a company's pain points to the top-matching case studies. In the live demo, entering a company and role triggers the searches, vector retrieval, consolidated intel, relevant case studies, and a punchy email.
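A sketch of the Phase 2 ingestion under those assumptions (requests for fetching, Gemini for markdown cleanup, a Vertex embedding model), writing the JSON that Vector Search would then ingest; the model names and output path are illustrative:

```python
import json

import requests
from google import genai
from vertexai.language_models import TextEmbeddingModel

# Assumes vertexai.init() has been called and a Gemini API key is in the environment.
client = genai.Client()
embedder = TextEmbeddingModel.from_pretrained("text-embedding-004")

def ingest(urls: list[str], out_path: str = "case_studies.json") -> None:
    """Phase 2: fetch each page, have Gemini distill it to markdown, embed, write JSON."""
    records = []
    for url in urls:
        html = requests.get(url, timeout=30).text

        # Ask Gemini to strip navigation and boilerplate and return clean markdown.
        markdown = client.models.generate_content(
            model="gemini-2.0-flash",  # illustrative model name
            contents=f"Rewrite this case study page as concise markdown:\n\n{html}",
        ).text

        # One embedding per case study; finer-grained chunking could slot in here.
        vector = embedder.get_embeddings([markdown])[0].values
        records.append({"id": url, "markdown": markdown, "embedding": vector})

    # The resulting JSON is what gets loaded into Vertex AI Vector Search.
    with open(out_path, "w") as f:
        json.dump(records, f)
```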
"The accuracy and value that this can bring now having a vector database really up and running... it's incredible," Jacob exclaims post-demo.
Production Polish: UI, Deployment, and Trade-offs
They add a simple Firebase UI for company/role input and copy-paste outputs. Next steps: code cleanup and a blog post (links in the description). Jacob critiques that Luis spent 20 minutes explaining before coding, "trying to get to the production building phase quicker." Luis admits: "I think the problem is I talked too much."
Trade-offs: ADK boosts maintainability but introduces agent-specific patterns; Vertex scales but ties the stack to GCP. The crawler handles dynamic sites but breaks when the page changes (e.g., button selectors). Hybrid search balances precision and recall.
Q&A Deep Dive: Vector Architecture and Realities
The post-video chat unpacks the architecture: the crawler uses the Playwright CLI for headless browsing; Gemini structures content before embedding; chunking is one markdown document per case study, with Gemini extracting the relevant content; alternatives like Chroma are viable locally, but Vertex is used for production.
"We first needed to create a crawler that could crawl a specific website... extract the URLs... then... content extraction process," Luis details. Exposure via ADK functions. Jacob notes chat energy mirrored his questions, validating common pains.
Key Takeaways
- Start agents with raw SDKs like Gemini for prototypes, migrate to ADK for modularity and sequential pipelines.
- Build reliability early: Fan-out parallelism, exponential backoff, low temperature, verify prompts.
- For RAG on web data, chain Playwright crawler (URL discovery + content fetch) with LLM markdown extraction before vector ingest.
- Use hybrid semantic + text search in Vertex AI for robust retrieval; combine results explicitly.
- Plan before coding: Prompt coding agents (e.g., Antigravity) for step-by-step ports, always reverify.
- Add UI last (e.g., Firebase) for usability; polish code readability post-MVP.
- Non-technical builders: Iterate via Gemini collaboration—"go back and forth... rewrite the code redeploy the code test it."
- Scale prototypes with dynamic data: crawl once, query forever vs. hardcoding.
- Timebox fixes: 60 minutes forces focus, exposes real hiccups like freezes or rate limits.