#cloud
Ditch preferred_username for Azure AD Guest Auth
Using preferred_username as the identity anchor worked for employees but failed silently for all B2B guests, causing 403 errors post-launch. Anchor on the oid claim instead for reliable identification.
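A minimal sketch of the fix, assuming the token has already been validated and its claims arrive as a dict; the user_key helper and the tenant-qualified key format are illustrative, not from the article:

```python
# Derive a stable user key from validated Entra ID (Azure AD) claims.
# preferred_username can be absent or mutable for B2B guests; oid is the
# immutable object ID of the user within the tenant.
def user_key(claims: dict) -> str:
    oid = claims.get("oid")
    if not oid:
        raise ValueError("token is missing the oid claim")
    # Qualify with tid so the key stays unique across tenants (illustrative).
    return f"{claims.get('tid', 'unknown')}:{oid}"
```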
Secure AI Agents via MCP Toolbox Custom Tools
MCP Toolbox prevents confused-deputy attacks by letting developers pre-write constrained SQL tools with bound parameters, keeping agent flexibility separate from app-controlled security for runtime agents.
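The Toolbox itself is configured declaratively, but the underlying pattern is plain parameterized SQL. A hedged Python illustration of the idea (table and column names invented, not the Toolbox API):

```python
import sqlite3

# The app author fixes the SQL shape; the agent may only supply values.
# Bound parameters mean agent input can never rewrite the query itself.
def get_order_status(conn: sqlite3.Connection, user_id: str, order_id: int):
    sql = "SELECT status FROM orders WHERE user_id = ? AND order_id = ?"
    return conn.execute(sql, (user_id, order_id)).fetchone()
```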
SageMaker Fine-Tuning: LoRA Beats QLoRA on Cost-Perf Balance
LoRA cuts trainable params by 96% vs full fine-tuning, balancing cost savings and accuracy on Llama2-7B/Mistral7B; QLoRA cuts memory a further 8x but trains slower due to dequantization overhead.
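For reference, attaching LoRA adapters with Hugging Face PEFT looks roughly like this; rank, alpha, and target modules are illustrative, not the article's exact hyperparameters:

```python
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("mistralai/Mistral-7B-v0.1")
lora = LoraConfig(
    r=8,                                  # low-rank dimension
    lora_alpha=16,
    target_modules=["q_proj", "v_proj"],  # attention projections only
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora)
model.print_trainable_parameters()  # prints the small trainable fraction
```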
Bigtable Scales Petabytes for Real-Time NoSQL Workloads
Bigtable auto-scales to hundreds of petabytes and millions of ops/sec with low latency, powering Google Search/YouTube/Maps; ideal for time series, ML features, and streaming via Flink/Kafka integrations.
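A hedged sketch of the time-series pattern with the official Python client; project, instance, table, and column-family names are placeholders:

```python
import time
from google.cloud import bigtable

client = bigtable.Client(project="my-project")
table = client.instance("my-instance").table("metrics")

def write_point(sensor_id: str, value: bytes) -> None:
    # Entity-first row key avoids hotspotting on time; the reversed
    # timestamp makes the newest samples sort first within each sensor.
    reversed_ts = (1 << 63) - time.time_ns()
    row = table.direct_row(f"{sensor_id}#{reversed_ts:020d}".encode())
    row.set_cell("cf1", b"value", value)
    row.commit()
```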
Scale PyTorch DDP Multi-Node on AWS EC2: Infra-First Guide
Multi-node DDP demands identical environments, shared data access, and open security groups across EC2 instances; use the torchrun launcher with DDPManager for minimal code changes and reliable gradient sync via NCCL.
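The guide's DDPManager wraps this boilerplate; underneath, a minimal multi-node script launched with torchrun looks like the sketch below (model and rendezvous endpoint are placeholders):

```python
# Launch on every node, e.g.:
#   torchrun --nnodes=2 --nproc_per_node=8 \
#            --rdzv_backend=c10d --rdzv_endpoint=$HEAD_IP:29500 train.py
import os
import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP

def main():
    dist.init_process_group(backend="nccl")  # torchrun sets RANK/WORLD_SIZE
    local_rank = int(os.environ["LOCAL_RANK"])
    torch.cuda.set_device(local_rank)
    model = DDP(torch.nn.Linear(10, 1).to(local_rank), device_ids=[local_rank])
    # ... training loop; backward() all-reduces gradients over NCCL ...
    dist.destroy_process_group()

if __name__ == "__main__":
    main()
```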
TPUs Dominate at Infrastructure Scale Over Per-Chip GPU Wins
Google's TPU v8t (training) and v8i (inference) lag Nvidia GPUs per chip but deliver superior performance at scale: 9,600-chip superpods hit 121 exaFLOPS of FP4 via cube topology and Virgo networking, optimized for AI's bandwidth-heavy workloads.
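Taking the quoted superpod figures at face value, the per-chip implication is easy to check:

```python
# Back-of-envelope from the quoted figures (illustrative only).
chips = 9600
pod_exaflops_fp4 = 121
per_chip_pflops = pod_exaflops_fp4 * 1e18 / chips / 1e15
print(f"~{per_chip_pflops:.1f} PFLOPS FP4 per chip")  # ~12.6
```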
Next '26: Build Agents with ADK, Skills, and Gemini
Google Cloud Next '26 demos production multi-agent systems using the open-source ADK (any language or model), modular skills for efficient context, and tools like MCP servers; the marathon-planning demo is open-sourced as the Race Condition repo.
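A hedged sketch of the tool-using agent pattern in the ADK's Python API; the agent name, model id, and pace tool are invented for illustration, not taken from the Race Condition repo:

```python
from google.adk.agents import Agent

def pace_per_km(distance_km: float, target_minutes: float) -> str:
    """Return the required pace in min/km for a target finish time."""
    return f"{target_minutes / distance_km:.2f} min/km"

race_planner = Agent(
    name="race_planner",
    model="gemini-2.0-flash",
    instruction="Help plan a marathon; use tools for any pace arithmetic.",
    tools=[pace_per_km],  # plain Python functions become callable tools
)
```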
Batch Size Unlocks 1000x LLM Inference Efficiency
Reiner Pope deduces frontier LLM training and serving mechanics from roofline analysis, revealing batch size as the core driver of latency-cost tradeoffs: optimal batches of roughly 2,000 tokens amortize weight loads for massive gains.
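The roofline logic behind the batch-size claim can be sketched generically; the accelerator numbers below are illustrative (H100-class), and the exact crossover batch depends on hardware, precision, and interconnect:

```python
# Decode-time matmuls do ~2*B FLOPs per weight parameter while loading each
# parameter once, so arithmetic intensity grows linearly with batch size B.
peak_flops = 989e12       # dense BF16 FLOP/s (illustrative)
mem_bw = 3.35e12          # HBM bytes/s (illustrative)
bytes_per_param = 2       # BF16 weights

ridge = peak_flops / mem_bw                   # FLOPs/byte to saturate compute
critical_batch = ridge * bytes_per_param / 2  # tokens needed to amortize weights
print(f"compute-bound above ~{critical_batch:.0f} tokens per batch")
```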