Build Graph RAG Multi-Agents for Multimodal Data

A step-by-step workshop: ingest images, videos, and text into a Cloud Spanner graph database, add embeddings for Graph RAG search, orchestrate multi-agents with ADK, and enable long-term memory, all on Google Cloud for real-time survivor matching.

Streamline Setup for Hands-On Google Cloud AI Lab

This workshop assumes basic familiarity with the Google Cloud Console and terminals, targeting developers new to agentic AI pipelines but comfortable with Python and APIs. Prerequisites: a Gmail account (avoid edu/corporate accounts, which may carry restrictions) and free lab credits via the provided link. Start by redeeming credits at the lab's credit link: click to associate them with your account, then confirm in Console > Billing > Credits (ignore transient warnings, but check that a billing account is associated).

Open Google Cloud Shell (a VS Code-like managed VM, persistent across sessions), then:

  1. Authenticate: gcloud auth login.
  2. Clone the repo: git clone <way-back-home-repo>.
  3. Create the project via the setup script (./create-project.sh): it auto-generates a project ID (e.g., way-back-home-XXXX), attaches billing, and enables APIs.
  4. Set the project (gcloud config set project <ID>) and enable any remaining APIs (gcloud services enable ... for AI Platform, Cloud Build, Cloud Run, Spanner, etc.; no costs accrue until usage).
  5. Execute the setup script (./setup.sh) to generate .env with variables (API keys, project ID; toggle hidden files in the editor if it seems missing). A verification sketch follows below.
  6. Sync dependencies with uv sync (Rust-based, faster than pip/venv for pyproject.toml deps like the Gemini SDK).
  7. Load initial survivor data (python load_initial_data.py), which creates the Spanner instance; click the output link to verify.
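
A minimal sketch for checking that the generated .env actually loads, assuming python-dotenv is among the synced dependencies; the variable names here are hypothetical, so match them to whatever ./setup.sh writes:

```python
import os

from dotenv import load_dotenv  # assumed available after `uv sync`

# Load variables from the generated .env in the repo root.
load_dotenv()

# GOOGLE_CLOUD_PROJECT and GOOGLE_API_KEY are hypothetical names; check the
# keys that ./setup.sh actually writes into your .env.
for var in ("GOOGLE_CLOUD_PROJECT", "GOOGLE_API_KEY"):
    print(f"{var}: {'set' if os.environ.get(var) else 'MISSING'}")
```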

Common pitfalls: mismatched accounts/projects (double-check the yellow project indicator), an empty .env (rerun setup), and API enablement delays (expected warnings). Use the virtual TA bot or chat for blockers. Rocket-emoji steps are core; coffee-emoji steps are optional depth (e.g., deploying to Cloud Run later).

Quote: "UV sync essentially behind the scenes we're using Rust and it creates a virtual environment for you with all the packages um kind of managed. So it's a more modern way of managing your virtual environments."

Model Complex Relationships in Spanner as Unified Graph+Vector Store

Spanner unifies profiles, graphs, and vectors in one place, avoiding separate Postgres/Neo4j/vector-DB systems plus ETL between them. The schema models survivors (blue nodes), locations/biomes (red), skills, needs, and resources as interconnected entities. Initial data: 4 survivors with skills (e.g., engineering, medical), needs (food, treatment), and locations.
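
Spanner Graph defines a property graph over ordinary relational tables. A minimal sketch of what such a definition could look like for this schema; table, column, label, and instance/database names are assumptions, not the repo's actual DDL:

```python
from google.cloud import spanner

# Hypothetical instance/database names; substitute your own.
client = spanner.Client()
database = client.instance("survivor-instance").database("survivor-db")

# Define a property graph over existing relational tables. Table and key
# names here are illustrative guesses at the workshop schema.
ddl = """
CREATE PROPERTY GRAPH SurvivorGraph
  NODE TABLES (
    survivors LABEL Survivor,
    skills LABEL Skill
  )
  EDGE TABLES (
    survivor_skills
      SOURCE KEY (survivor_id) REFERENCES survivors (survivor_id)
      DESTINATION KEY (skill_id) REFERENCES skills (skill_id)
      LABEL HAS_SKILL
  )
"""

operation = database.update_ddl([ddl])
operation.result()  # wait for the schema change to finish
```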

The load script populates tables with relations (e.g., Survivor → HAS_SKILL → Skill, Need → REQUIRES → Skill). Visualize in Spanner Studio (Console > Spanner > Instances > your-instance > Spanner Studio); a Python query sketch follows the list:

  1. Query survivors/locations: SELECT * FROM survivors JOIN biomes... → graph view shows nodes/edges.
  2. Skills: Hover blue (survivor) → red (biome); some have 2-3 skills.
  3. Needs: Filter crises (medical, technical, science).
  4. Matches: Edges between skills/needs show potentials (e.g., doctor → injury).
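
Outside Studio, the same kind of graph query can run from Python with the Spanner client. A minimal sketch, assuming the graph, labels, and properties from the earlier DDL sketch:

```python
from google.cloud import spanner

client = spanner.Client()
database = client.instance("survivor-instance").database("survivor-db")

# GQL: match survivors to the skills they hold and return the pairs.
query = """
GRAPH SurvivorGraph
MATCH (s:Survivor)-[:HAS_SKILL]->(k:Skill)
RETURN s.name AS survivor, k.name AS skill
"""

with database.snapshot() as snapshot:
    for survivor, skill in snapshot.execute_sql(query):
        print(f"{survivor} --HAS_SKILL--> {skill}")
```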

This powers the 3D graph UI (which updates in real time after ingestion). Quality criteria: clear node colors/types, hover details, and edge labels for relations. Before: flat tables; after: an interactive graph that reveals matches (e.g., a hungry engineer near a medic).

Quote: "With Spanner, you can kind of unify all that all one place. You can do vector um u management in one place. You can do draft management in one place. So it's a really good production um you know standard um for you know one unified database that kind of simplifies data governance."

Generate Embeddings In-Database for Efficient RAG Foundations

Embeddings (numeric vectors for text/images) convert unstructured multimodal data (selfie + speech: "I'm Annie, engineer, hungry") into structured entities, skills, and needs. Avoid the Python-side pattern (spin up notebooks, import SDKs, query the DB, embed serially): it is slow and unscalable.

Create models directly in Spanner (bottom-up stack):

  1. Model layer: CREATE MODEL text_embedding over a remote endpoint such as gemini-embedding-001 (or a multimodal model like Gemma); a hedged DDL sketch follows this list.
  2. Service layer: RAG/keyword/hybrid search calls the model in SQL (parallelized, DB-native).
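
A sketch of what that DDL can look like, issued through the Python client. The endpoint path, OUTPUT schema, and instance/database names are assumptions; verify them against the codelab and the embedding model's documented schema:

```python
from google.cloud import spanner

client = spanner.Client()
database = client.instance("survivor-instance").database("survivor-db")

# Register a remote Vertex AI embedding model inside Spanner. Once created,
# SQL queries can call it directly, with no client-side embedding roundtrips.
ddl = """
CREATE MODEL text_embedding
INPUT (content STRING(MAX))
OUTPUT (embeddings STRUCT<values ARRAY<FLOAT64>>)
REMOTE OPTIONS (
  endpoint = '//aiplatform.googleapis.com/projects/PROJECT_ID/locations/us-central1/publishers/google/models/gemini-embedding-001'
)
"""

database.update_ddl([ddl]).result()
```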

Benefits: Faster (no client roundtrips), scalable (vectors stored alongside graph). Later: Tooling layer connects services; agent layer reasons over tools (model=brain, tools=toolbox).
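
Once the model is registered, the service layer can embed text entirely in SQL. A minimal sketch, assuming the text_embedding model and database names from the previous sketch:

```python
from google.cloud import spanner

database = spanner.Client().instance("survivor-instance").database("survivor-db")

# Generate an embedding in-database: ML.PREDICT calls the registered model
# as part of the query, with no separate embedding client in the app.
sql = """
SELECT content, embeddings.values AS vector
FROM ML.PREDICT(
  MODEL text_embedding,
  (SELECT 'hungry engineer near the coast' AS content)
)
"""

with database.snapshot() as snapshot:
    for content, vector in snapshot.execute_sql(sql):
        print(content, len(vector), "dimensions")
```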

Quote: "Spanner is that you can essentially uh create the model definition directly within the DB. You can create the embedding definition for the embedding model directly in DB and then you can call it directly in the in the DB. So it's parallelized, it's more efficient, faster outputs."

Ingest Multimodal Data via Multi-Agent Pipeline to Update Graph

Challenge: Time-critical matching (e.g., injury → medic) from SOS signals (photo/video/text). Process: Capture → Extract entities/relations (skills: engineer; needs: food/medical; location) → Embed → Insert/update Spanner graph.
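
A hedged sketch of the extraction step using the google-genai SDK with structured output; the model name, schema fields, and image file are illustrative, not the repo's actual code:

```python
from google import genai
from google.genai import types
from pydantic import BaseModel


class SurvivorProfile(BaseModel):
    name: str
    skills: list[str]
    needs: list[str]
    location: str


client = genai.Client()  # reads API key / project settings from the environment

with open("sos_selfie.jpg", "rb") as f:  # hypothetical upload
    image = types.Part.from_bytes(data=f.read(), mime_type="image/jpeg")

response = client.models.generate_content(
    model="gemini-2.5-flash",
    contents=["Extract the survivor's profile from this SOS.", image],
    config=types.GenerateContentConfig(
        response_mime_type="application/json",
        response_schema=SurvivorProfile,
    ),
)

profile = response.parsed  # a SurvivorProfile instance, ready for graph insertion
print(profile)
```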

Use the Google Agent Development Kit (ADK, open source) to orchestrate the multi-agent pipeline below; a minimal agent sketch follows the list.

  1. Multimodal agent: Gemini processes image/video/text → structured JSON (name, skills, needs).
  2. Graph updater agent: Embeds entities, queries Spanner for matches, inserts relations (e.g., NEW_SURVIVOR → HAS_NEED → Hunger).
  3. Pipeline: Upload UI → ADK agents → Real-time graph/3D viz update + search.
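
A minimal ADK sketch of the graph-updater side; the tool body and names are assumptions, and the repo's level2/ code is the real reference:

```python
from google.adk.agents import Agent


def update_graph(name: str, skills: list[str], needs: list[str]) -> str:
    """Tool: upsert a survivor and their skill/need edges into Spanner.

    The real implementation would call the Spanner client; stubbed here.
    """
    return f"Inserted {name} with {len(skills)} skills and {len(needs)} needs."


# The model is the brain; the attached functions are the toolbox it picks from.
graph_updater = Agent(
    name="graph_updater",
    model="gemini-2.5-flash",
    instruction=(
        "Given a structured survivor profile, call update_graph to persist "
        "the entities and relations."
    ),
    tools=[update_graph],
)
```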

Code in the repo (level2/): agents as ADK classes with tools (Spanner client, Gemini). Run locally first (python app.py); optionally deploy to Cloud Run or Agent Engine. Pitfalls: unstructured extraction fails → use structured outputs/prompts; duplicates → make inserts idempotent (see the upsert sketch below).
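
For the duplicate pitfall, Spanner's insert_or_update mutation makes writes idempotent on the primary key. A sketch with assumed table and column names:

```python
from google.cloud import spanner

database = spanner.Client().instance("survivor-instance").database("survivor-db")

# insert_or_update overwrites the row if the key already exists, so replaying
# the same extraction result does not create duplicate survivors.
with database.batch() as batch:
    batch.insert_or_update(
        table="survivors",
        columns=("survivor_id", "name", "status"),
        values=[("survivor-annie", "Annie", "hungry")],
    )
```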

Before: Siloed data; after: Dynamic graph (e.g., new medic matches injury).

Quote: "agent is like we have model as a brain and we have a tool uh tools in the toolbox and model as a brain going to reasoning and pick the tool uh to solve your problem right so you need to attach the tools to the agent."

Enable Graph RAG Search and Long-Term Agent Memory

Graph RAG: Hybrid semantic+graph traversal over Spanner (vectors for similarity, edges for relations). Query: "Who can treat injuries near biome X?" → Embed query → KNN vectors → Traverse skills/needs → Ranked matches.

Implementation:

  1. Tool: SQL that embeds the query in-database via the registered model and ranks by vector similarity; see the sketch after this list.
  2. Agent reasons: Pick graph_search tool → Hybrid results.
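
A hedged sketch of the search tool, combining an in-database query embedding with a cosine-distance ranking. The schema, column names, and model carry over from the earlier sketches as assumptions; a full Graph RAG tool would layer the skill/need traversal on top of this:

```python
from google.cloud import spanner
from google.cloud.spanner_v1 import param_types

database = spanner.Client().instance("survivor-instance").database("survivor-db")

# Embed the query text in-database, then rank stored survivor vectors by
# cosine distance. The s.embedding column is an assumed ARRAY<FLOAT64>.
sql = """
SELECT s.name,
       COSINE_DISTANCE(
         s.embedding,
         (SELECT embeddings.values
          FROM ML.PREDICT(MODEL text_embedding,
               (SELECT @query AS content)))
       ) AS distance
FROM survivors AS s
ORDER BY distance
LIMIT 5
"""

with database.snapshot() as snapshot:
    rows = snapshot.execute_sql(
        sql,
        params={"query": "who can treat injuries near biome X"},
        param_types={"query": param_types.STRING},
    )
    for name, distance in rows:
        print(f"{name}: {distance:.4f}")
```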

Memory: long-term personalization. Store uploads/searches in Spanner (e.g., UserSession → History). Agents query history for context ("Remember the last match? Update it with new data."). Avoid stateless designs: persist vectors/graphs for recall. A minimal persistence sketch follows.
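
A minimal sketch of the persistence side; table and column names are hypothetical, and Vertex AI Memory Bank is the managed alternative mentioned for the optional track:

```python
from google.cloud import spanner

database = spanner.Client().instance("survivor-instance").database("survivor-db")


def remember(session_id: str, event: str) -> None:
    """Append one interaction to the session history table."""
    with database.batch() as batch:
        batch.insert(
            table="user_sessions",
            columns=("session_id", "created_at", "event"),
            values=[(session_id, spanner.COMMIT_TIMESTAMP, event)],
        )


def recall(session_id: str, limit: int = 10) -> list[str]:
    """Fetch recent history so an agent can ground its next answer."""
    with database.snapshot() as snapshot:
        rows = snapshot.execute_sql(
            "SELECT event FROM user_sessions "
            "WHERE session_id = @sid ORDER BY created_at DESC LIMIT @lim",
            params={"sid": session_id, "lim": limit},
            param_types={
                "sid": spanner.param_types.STRING,
                "lim": spanner.param_types.INT64,
            },
        )
        return [event for (event,) in rows]
```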

Quality: Precise matches (graph > vector alone); real-time (in-DB). Practice: Upload mock video ("Injured, need doctor"), search → See graph evolve.

Quote: "we need to make this time critical decision that we want to match the helper uh need in real time. what if they have this emergency need for the medical need right and then lastly we we want to cover like long-term memory how do we remember the data that all the survival upload."

Key Takeaways

  • Redeem lab credits with Gmail; use Cloud Shell for persistent dev env—run uv sync for modern Python deps.
  • Unify graph+vector in Spanner: Schema survivors → skills/needs; visualize queries in Studio for insights.
  • Embed in-DB (CREATE MODEL Gemini): Parallel, efficient vs. app-side—call in SQL for RAG.
  • Multi-agent with ADK: Multimodal ingest → entity extraction → graph update; attach Spanner/Gemini tools.
  • Graph RAG: Hybrid vector+traversal for relation-aware search (e.g., skills matching needs).
  • Add memory: Persist sessions/searches in Spanner for personalized, stateful agents.
  • Test iteratively: Load data, query graphs, upload multimodal—watch 3D UI update.
  • Scale: Deploy agents to Cloud Run; optional Vertex AI Memory Bank for advanced persistence.
  • Avoid: Account mismatches, unenabled APIs—TA/chat for help; focus rocket steps first.
Video description

Code along with us! Click on the links below:

  • Codelab link → https://goo.gle/graph-rag
  • Claim your credits here → https://goo.gle/creditclaim
  • World map link → https://waybackhome.dev/e/handsonai
  • Virtual TA link → https://vta-p0rfcnu7.gca-americas.dev
  • Watch the first session here → https://goo.gle/4dm40aA

The mission continues. In part 1, we built agents that could process images and video. In part 2, we're building the infrastructure that lets those agents solve complex, large-scale problems. Join Ayo Adedeji and Annie Wang for a 90-minute live build.

We are integrating:

  • Cloud Spanner Graph: to map complex relationships between survivors and resources
  • Graph RAG: to perform hybrid searches that are more accurate than standard RAG
  • Vertex AI Memory Bank: to give your agents a long-term memory for personalized workflows
  • ADK (Agent Development Kit): to orchestrate the entire multi-agent system, from an empty folder to a 3D visualized rescue network

Let's get to work. This livestream originally aired on March 31, 2026 at 9:00 A.M. PST / 12:00 P.M. EST.

#GoogleCloud #ADK

Speakers: Ayo Adedeji, Annie Wang
Products Mentioned: Cloud Spanner Graph, Graph RAG, Agent Development Kit
