№ 02 / SUMMARIES

#devops-cloud

Every summary, chronological. Filter by category, tag, or source from the rail.

Tag · #devops-cloud
DAY 01Sunday MAY 10 · 20261 SUMMARIES
AI EngineerAI Automation

Replay Logs Fail Agents: Use VM Snapshots Instead

Replay durability constrains agent code with growing logs; split into context logs (DB durable) and execution snapshots (14MB Firecracker VMs, <1s save/100ms restore) for multi-day sessions.

AI Engineer
DAY 02Friday MAY 8 · 20262 SUMMARIES
AI RevolutionAI News & Trends

OpenAI's Real-Time Voice AI Powers Agents, Backed by MRC Networking

OpenAI's GPT-Realtime-2 enables live voice agents with GPT-4o reasoning, 128k context, parallel tools, and 96.6% audio accuracy; MRC networking spreads data across paths for 131k-GPU clusters with microsecond failure recovery.

AI Revolution
Level Up CodingSoftware Engineering

Token Bucket Fails at Window Boundaries—Use Sliding Window

Token bucket rate limiting lets clients burst 40 requests across a minute boundary despite 100/min limit; sliding window counters prevent this by tracking requests in the last N seconds from now, enforcing even distribution.

DAY 03Thursday MAY 7 · 20261 SUMMARIES
Data and BeyondDevOps & Cloud

AI Agents Expose IDP Flaws Built for Humans

Internal Developer Platforms (IDPs) assume human interpreters for ambiguities like unclear errors and tribal knowledge; AI agents fail because they execute exactly as interfaces allow, demanding explicit, machine-readable contracts to avoid disasters like deleting entire databases.

Data and Beyond
DAY 04May 6, 2026 MAY 6 · 20263 SUMMARIES
Level Up CodingDevOps & Cloud

Manual Deployment Unlocks Foundry Hosted Agents

Deploy Foundry hosted agents by building container images in ACR, setting up Foundry Project with RBAC, creating via Azure SDK with env vars and resources (cpu=0.25, mem=0.5Gi), then assigning Azure AI User RBAC to Agent ID—avoids azd preview failures.

Level Up Coding
Google Cloud TechDevOps & Cloud

Migrate MongoDB to Firestore Serverless Seamlessly

Firestore's MongoDB-compatible API lets you reuse existing code, drivers, and aggregation pipelines on a serverless DB with real-time queries for AI agents and five-nines availability.

Towards AIAI & LLMs

Agent 365: Govern Sprawling AI Agents Securely

Microsoft Agent 365 acts as a control plane to observe, govern, and secure AI agents across Microsoft tools, local devices, multi-cloud platforms, and SaaS partners, addressing agent sprawl with discovery, policy controls, and runtime blocking—now generally available at $15/user/month.

DAY 05May 5, 2026 MAY 5 · 20261 SUMMARIES
KodeKloudAI Automation

Claude Managed Agents: Infra-Free Deployment at $0.08/Hour

Anthropic's Claude Managed Agents offloads agent infra, security, and scaling to their cloud for $0.08 per session-hour + tokens, letting you build via API—but vendor lock-in and costs demand ROI checks.

KodeKloud
DAY 06May 4, 2026 MAY 4 · 20261 SUMMARIES
Google Cloud TechAI & LLMs

Scale GenAI to Billions of Rows in BigQuery at 94% Less Cost

BigQuery's optimized mode distills LLMs into lightweight models using embeddings, slashing token use by 94% (55M to 3M) and query time from 16min to 2min on 34k images or 50k voice commands, scaling to billions of rows.

Google Cloud Tech
DAY 07May 3, 2026 MAY 3 · 20261 SUMMARIES
AI Engineer

Engineer AI Context Like Code: Full Lifecycle

Treat AI agent context as code with a Context Development Lifecycle—Generate, Evaluate, Distribute, Observe—to create reliable, scalable prompts that drive better agent outputs via testing, sharing, and feedback loops.

AI Engineer
DAY 08May 1, 2026 MAY 1 · 20263 SUMMARIES
Level Up CodingSoftware Engineering

Flink Treats Batch as Streaming for Unified Low-Latency Processing

Apache Flink processes unbounded streams and bounded batches with one engine using operators, state, windows, and exactly-once guarantees, eliminating dual codebases for real-time apps like recommendation engines handling millions of events.

Level Up Coding
The AI Daily BriefAI News & Trends

Harness-as-a-Service Fuels Reliable AI Agents

Big tech earnings reveal explosive AI cloud growth amid compute shortages. Harness-as-a-Service platforms like Cursor SDK and managed agents provide sandboxed runtimes, shifting agent building from DIY harnesses to scalable infrastructure.

Vercel BlogDevOps & Cloud

Vercel Sandbox Firewall Enables Postgres Connections

Vercel Sandbox now supports outbound Postgres connections to hosted DBs like Neon and Supabase by detecting TLS upgrades during negotiation—no code changes required, just add DB host to allowed domains.

DAY 09April 16, 2026 APR 16 · 20261 SUMMARIES
Google Cloud TechDevOps & Cloud

Scale 60M req/mo solo on Cloud Run for $180

Solo builder scales feature flag SaaS RocketFlag to 60M requests/month across regions using Go on Cloud Run, batch DB writes to Firestore/BigQuery, and Cloud Armor—total Dec bill $180 USD (252 AUD) with zero SRE time.

Google Cloud Tech
DAY 10April 15, 2026 APR 15 · 20262 SUMMARIES
Towards AI

Scaling LLM Inference: KV Cache, Batching, Spec Decoding & Multi-LoRA

Production LLM serving shifts from training's throughput focus to inference's memory-bound latency challenges, solved by PagedAttention (96% util), continuous batching, EAGLE-3 (up to 6.5x speedup), and FastLibra for multi-LoRA (63% TTFT cut).

Towards AI
Towards AIAI & LLMs

Healthcare LLM Rate Limits: 2 Fail, 1 Works

Simple per-user rate limits on LLM APIs fail to stop credential stuffing attacks (causing $47K bills) and block critical clinical workflows; context-aware throttling with priority and anomaly detection is the only production-ready solution.

DAY 11April 14, 2026 APR 14 · 20261 SUMMARIES
Better StackDevOps & Cloud

Zrok: Open-Source ngrok Fix for Secure Localhost Sharing

Zrok enables one-command sharing of localhost apps, files, TCP/UDP services publicly or privately via tokens—zero-trust on OpenZiti beats ngrok's limits, random URLs, and public exposure without port forwarding.

Better Stack
DAY 12April 13, 2026 APR 13 · 20261 SUMMARIES
Towards AIData Science & Visualization

Snowflake-Native Fraud ML Pipeline: Train to Monitor

Build end-to-end fraud detection with XGBoost in Snowflake ML—data loading to drift monitoring—avoiding data gravity, handling 0.5-2% imbalance via scale_pos_weight=27.6, achieving ROC-AUC=0.7275 and optimal F1=0.5874 at threshold=0.58.

Towards AI
DAY 13April 11, 2026 APR 11 · 20261 SUMMARIES
Better StackDevOps & Cloud

Run S3-Compatible MinIO Locally to Cut Dev Costs

Deploy MinIO via Docker on your laptop for S3-compatible object storage using unchanged boto3 Python code, solving AWS S3 cost, latency, and lock-in issues for local dev and AI/RAG pipelines.

Better Stack
DAY 14April 10, 2026 APR 10 · 20261 SUMMARIES
AI EngineerAI Automation

Enterprise Registry Unifies MCP & A2A Agents at Scale

Build private MCP and A2A registries enriched with enterprise metadata to enable discovery, governance, lineage, and standardized deployment across global teams building AI agents.

AI Engineer
DAY 15April 8, 2026 APR 8 · 20266 SUMMARIES
Towards AI

Scale RAG to Production: Fix 8 Anti-Patterns with 5 Pillars

RAG fails in production due to 8 anti-patterns like vector-only retrieval and stateful pods; counter them with 5 pillars—governance, core hardening, retrieval smarts, agent actions/memory, and security/FinOps—for reliable, observable systems.

Towards AI
AI SupremacyAI News & Trends

AI Chokepoints: Chips, Power Reshape Global Race

Frontier AI shifts from diffusible software to physical chokepoints in chips, helium, HBM/DRAM, power delivery, concentrating capability in few geographies like the US.

Learning DataData Science & Visualization

Fixing ML Pipelines for Databricks Constraints

Databricks free workspaces block public DBFS, continuous triggers, and large models—use Unity Catalog volumes, micro-batch streaming, vector_to_array for probs, and top-50k user subsets to ship reliably.

Your Average Tech BroBusiness & SaaS

Scaled SaaS to 25K Users/$8K MRR: Social + Tech Fixes

Grew Yorby.ai to 25K users/$8K MRR using 40 organic creator accounts; fixed Supabase connection pooling with a dedicated GCE server; pivoted ICP from prosumer creators to low-churn agencies via PostHog analysis.

AI EngineerAI & LLMs

Sandbox AI-Generated Code with Capability Security

Run untrusted LLM-generated code in isolates or containers using capability-based security: explicitly allow only needed access to block hallucinations, leaks, and injections.

AI EngineerAI & LLMs

Secure MCP Servers for Production with 5 Principles

Design MCP servers for agents using 5 principles to shrink attack surface and block OASP top 10 threats; deploy remotely via HTTP with OAuth 2.1, preferring CIMD over DCR for dynamic client auth.

DAY 16April 4, 2026 APR 4 · 20262 SUMMARIES
Google Cloud Tech

Gemini CLI: Context to CI/CD for Production AI Agents

Gemini CLI turns natural language 'vibe coding' into full ADK agents with context engineering, skills, hooks, tests, and automated Cloud Run deployment—proving AI can handle end-to-end dev without manual coding.

Google Cloud Tech
IBM Technology

Secure Agentic AI with Tokens & Delegation

Prevent credential replay, rogue agents, and overpermissioning in agentic flows using verifiable agent identities, delegation tokens, token exchanges at each hop, scoped permissions, and secure vaults for last-mile access.

DAY 17April 3, 2026 APR 3 · 20261 SUMMARIES
The PrimeTimeAI News & Trends

Copilot Injects Ads into 11K GitHub PRs

Microsoft's GitHub Copilot added ad-like promotions for Raycast to 11,400 pull requests, prioritizing AI usage over fixing GitHub's 90 incidents in 90 days and 90.84% uptime.

The PrimeTime
DAY 18March 31, 2026 MAR 31 · 20261 SUMMARIES
Google Cloud Tech

Build Graph RAG Multi-Agents for Multimodal Data

Step-by-step workshop to ingest images/videos/text into Cloud Spanner graph DB, add embeddings for Graph RAG search, orchestrate multi-agents with ADK, and enable long-term memory—all using Google Cloud for real-time survivor matching.

Google Cloud Tech

Showing 30 of 40