#deep-learning
Every summary, chronological.
Generative AI: Prediction to Creation via Scale
Generative AI shifts machines from analyzing data (traditional AI's strength) to creating new content such as text or images, powered by Markov chains, deep learning, and massive datasets and compute; the shift drew $33.9B in investment in 2024.
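A minimal sketch of that prediction-to-creation step using the simplest generative model the summary names, a word-level Markov chain; the corpus and sampling settings are illustrative assumptions, not from the source.

```python
# Illustrative only: a first-order word-level Markov chain that "creates"
# new text by sampling from learned next-word transitions.
import random
from collections import defaultdict

corpus = ("the model learns patterns and the model creates new text "
          "from the patterns it learns").split()

# Count word -> next-word transitions (first-order chain).
transitions = defaultdict(list)
for cur, nxt in zip(corpus, corpus[1:]):
    transitions[cur].append(nxt)

def generate(start="the", length=8, seed=0):
    random.seed(seed)
    word, out = start, [start]
    for _ in range(length - 1):
        followers = transitions.get(word)
        word = random.choice(followers) if followers else random.choice(corpus)
        out.append(word)
    return " ".join(out)

print(generate())  # a short word sequence sampled from the chain
```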
GPU Bandwidth Limits LLM Speed, Not FLOPS
Generating one token from a 70B-parameter model on an H100 requires reading ~140GB of weights, roughly one operation per byte, making memory bandwidth the inference bottleneck rather than compute throughput.
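A back-of-envelope check of that claim; the H100 bandwidth and FLOP figures below are approximate public specs, assumed for illustration.

```python
# Rough arithmetic: single-stream decoding is bandwidth-bound, not compute-bound.
params = 70e9
bytes_per_param = 2              # fp16/bf16 weights -> 140 GB total
hbm_bandwidth = 3.35e12          # bytes/s, approximate H100 SXM HBM3 figure (assumption)
h100_fp16_flops = 1e15           # ~1 PFLOP/s dense fp16 (approximate, assumption)

bytes_per_token = params * bytes_per_param            # 140 GB of weight reads per token
bandwidth_bound = hbm_bandwidth / bytes_per_token     # ~24 tokens/s upper bound
compute_bound = h100_fp16_flops / (2 * params)        # ~2 FLOPs per parameter -> ~7000 tokens/s

print(f"bandwidth-bound: {bandwidth_bound:.0f} tok/s, compute-bound: {compute_bound:.0f} tok/s")
```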
Diffusion: Data-Efficient Framework Outshining Autoregressives on Scarce Data
Diffusion is a training framework, not an architecture: it manufactures extra training views by gradually noising clean data over ~1,000 steps, and it outperforms autoregressive models on 25-100M-token datasets where data is scarce but compute is abundant; it still lags in text because of slow inference and immature infrastructure.
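A minimal sketch of the forward noising process that produces those extra training views; the 1,000-step linear beta schedule is the classic DDPM choice, assumed here rather than taken from the article.

```python
# Forward noising: x_t = sqrt(abar_t) * x_0 + sqrt(1 - abar_t) * eps
import numpy as np

T = 1000
betas = np.linspace(1e-4, 0.02, T)          # classic DDPM linear schedule (assumed)
alphas_bar = np.cumprod(1.0 - betas)

def noise(x0, t, rng=np.random.default_rng(0)):
    """Return a noised view of x0 at timestep t (0-indexed)."""
    eps = rng.standard_normal(x0.shape)
    return np.sqrt(alphas_bar[t]) * x0 + np.sqrt(1.0 - alphas_bar[t]) * eps

x0 = np.ones(4)                              # toy "clean" sample
for t in [0, 250, 500, 999]:
    print(t, noise(x0, t).round(2))          # progressively closer to pure noise
```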
Caleb Writes Code · Karpathy's 200-Line Pure Python AI Builds
Train GPT and RNNs, play RL Pong, and craft a Bitcoin transaction in pure Python with zero dependencies, distilling neural nets to their essentials in under 200 lines.
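In the same zero-dependency spirit, here is a tiny pure-Python scalar autograd sketch; it is illustrative only, not Karpathy's actual code, and it assumes a simple expression graph with no topological sort.

```python
# Illustrative sketch: scalar autograd in pure Python, no dependencies.
class Value:
    def __init__(self, data, parents=(), grad_fn=None):
        self.data, self.grad = data, 0.0
        self.parents, self.grad_fn = parents, grad_fn

    def __add__(self, other):
        out = Value(self.data + other.data, (self, other))
        out.grad_fn = lambda: (out.grad, out.grad)                    # d(out)/d(each parent)
        return out

    def __mul__(self, other):
        out = Value(self.data * other.data, (self, other))
        out.grad_fn = lambda: (out.grad * other.data, out.grad * self.data)
        return out

    def backward(self):
        # Simple stack-based sweep; fine for small tree-like graphs.
        self.grad = 1.0
        stack = [self]
        while stack:
            node = stack.pop()
            if node.grad_fn:
                for parent, g in zip(node.parents, node.grad_fn()):
                    parent.grad += g
                    stack.append(parent)

# y = w*x + b, then dy/dw and dy/db via backprop
w, x, b = Value(2.0), Value(3.0), Value(1.0)
y = w * x + b
y.backward()
print(y.data, w.grad, b.grad)   # 7.0 3.0 1.0
```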
DeepMind's Diffusion Model Training Secrets
Sander Dieleman of DeepMind explains that data curation matters more than model tweaks, that latent autoencoders are what make diffusion scale, and that diffusion denoising amounts to spectral autoregression, together driving superior audiovisual generation.
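A small numpy sketch of the spectral-autoregression framing: natural signals have roughly 1/f power spectra, so a fixed dose of Gaussian noise buries high frequencies first and denoising recovers them last, coarse-to-fine across the spectrum; the toy 1/f signal and noise level are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4096
freqs = np.fft.rfftfreq(n, d=1.0)

# Build a toy 1/f ("pink") signal in the frequency domain, then add white noise.
spectrum = rng.standard_normal(len(freqs)) / np.maximum(freqs, freqs[1])
signal = np.fft.irfft(spectrum, n)
noisy = signal + 0.5 * signal.std() * rng.standard_normal(n)

def band_power(x, lo, hi):
    p = np.abs(np.fft.rfft(x)) ** 2
    mask = (freqs >= lo) & (freqs < hi)
    return p[mask].mean()

# White noise is flat, the signal is ~1/f^2, so SNR collapses at high frequency:
# exactly the coarse-to-fine ordering diffusion denoising follows.
for lo, hi in [(0.001, 0.01), (0.01, 0.1), (0.1, 0.5)]:
    snr = band_power(signal, lo, hi) / max(band_power(noisy - signal, lo, hi), 1e-12)
    print(f"band {lo}-{hi}: SNR ~ {snr:.1f}")
```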
AI Engineer · PCL: Confidence RL for Dynamic LLM Environments
The PCL algorithm folds predictive confidence scores into LLM RL rewards via ensembles and blended token- and sequence-level signals, enabling adaptation to nonstationary environments without retraining.
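The summary doesn't give PCL's internals, so this is only a generic sketch of the stated idea: ensemble disagreement as a confidence signal, blended from token- and sequence-level scores into the reward. Every name, weight, and shape below is hypothetical.

```python
# Hypothetical sketch, not the actual PCL implementation.
import numpy as np

def blended_reward(token_scores_ensemble, seq_score_ensemble,
                   w_token=0.5, w_seq=0.5, conf_weight=0.3):
    """token_scores_ensemble: (n_models, n_tokens); seq_score_ensemble: (n_models,)."""
    token_mean = token_scores_ensemble.mean(axis=0)        # per-token signal
    seq_mean = seq_score_ensemble.mean()                    # whole-sequence signal

    # Confidence = low disagreement across ensemble members.
    token_conf = 1.0 / (1.0 + token_scores_ensemble.std(axis=0))
    seq_conf = 1.0 / (1.0 + seq_score_ensemble.std())

    base = w_token * token_mean.mean() + w_seq * seq_mean
    confidence = w_token * token_conf.mean() + w_seq * seq_conf
    return base + conf_weight * confidence                  # reward fed to the RL update

rng = np.random.default_rng(0)
r = blended_reward(rng.normal(0.8, 0.1, size=(4, 16)), rng.normal(0.7, 0.05, size=4))
print(round(r, 3))
```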
Sentences Define Word Meanings via Self-Attention
Transformers ended three decades of sequential-processing bottlenecks with self-attention, in which every word weighs its relevance against the entire sentence context, the mechanism behind GPT and all modern LLMs.
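A minimal numpy sketch of scaled dot-product self-attention, where each position attends over the whole sentence; dimensions and the random projection weights are placeholders.

```python
import numpy as np

def self_attention(x):
    """x: (seq_len, d_model) token embeddings; projections are random placeholders."""
    d = x.shape[-1]
    rng = np.random.default_rng(0)
    Wq, Wk, Wv = (rng.standard_normal((d, d)) / np.sqrt(d) for _ in range(3))

    Q, K, V = x @ Wq, x @ Wk, x @ Wv
    scores = Q @ K.T / np.sqrt(d)                       # (seq_len, seq_len) pairwise relevance
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)      # softmax over the whole sentence
    return weights @ V                                   # each output mixes all positions

x = np.random.default_rng(1).standard_normal((5, 8))    # 5 "words", 8-dim embeddings
print(self_attention(x).shape)                           # (5, 8)
```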
LLM Inference: mmap Loading & Quantization Deep Dive
Efficient LLM inference hinges on mmap for lazy memory loading (e.g., <10s startup on llama.cpp) and quantization like GGUF K-Quants or AWQ/EXL2 to shrink 15GB models while preserving quality via salient weights and mixed precision.
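A sketch of both ideas: np.memmap for lazy, page-on-demand weight loading, and simple per-block 4-bit quantization to illustrate the principle behind K-Quants (not the actual GGUF layout); file names, block size, and the demo file are assumptions.

```python
import numpy as np

# --- lazy loading: np.memmap faults pages in only when sliced ---
np.arange(16, dtype=np.float16).tofile("demo_weights.f16.bin")   # stand-in for a real checkpoint
weights = np.memmap("demo_weights.f16.bin", dtype=np.float16, mode="r")
print(weights[:4])                                                # only touched pages are read

# --- per-block 4-bit quantization (principle only; byte-packing omitted) ---
def quantize_blocks(w, block=32):
    w = w.astype(np.float32).reshape(-1, block)
    scale = np.abs(w).max(axis=1, keepdims=True) / 7.0            # map to int4 range -8..7
    q = np.clip(np.round(w / np.maximum(scale, 1e-12)), -8, 7).astype(np.int8)
    return q, scale

def dequantize_blocks(q, scale):
    return (q.astype(np.float32) * scale).reshape(-1)

w = np.random.default_rng(0).standard_normal(1024).astype(np.float32)
q, s = quantize_blocks(w)
print("mean abs error:", float(np.abs(dequantize_blocks(q, s) - w).mean()))
```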
Caleb Writes Code · Karpathy's Blog: Pure Python AI From Scratch
Andrej Karpathy distills neural nets into minimal Python code—200 lines for GPT training/inference—plus RL, RNNs, and human baselines on vision tasks.
Preprocessing Swings CNN Accuracy from 65% to 87% on CIFAR-10
Raw CIFAR-10 pixels yield 65% test accuracy; normalization and standardization lift it to 69%; geometric augmentation holds ~67%; photometric brightness/contrast alone crashes it to 20%; the combined pipeline with a deeper CNN hits 87%.
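A sketch of that kind of pipeline in torchvision; the CIFAR-10 channel statistics are the commonly cited values, and the augmentation strengths are assumptions rather than the article's exact settings.

```python
from torchvision import datasets, transforms

cifar_mean = (0.4914, 0.4822, 0.4465)   # commonly cited CIFAR-10 channel stats
cifar_std = (0.2470, 0.2435, 0.2616)

train_tf = transforms.Compose([
    transforms.RandomCrop(32, padding=4),                   # geometric augmentation
    transforms.RandomHorizontalFlip(),
    transforms.ColorJitter(brightness=0.2, contrast=0.2),   # photometric, kept gentle
    transforms.ToTensor(),
    transforms.Normalize(cifar_mean, cifar_std),            # standardization
])

test_tf = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize(cifar_mean, cifar_std),
])

train_set = datasets.CIFAR10("./data", train=True, download=True, transform=train_tf)
test_set = datasets.CIFAR10("./data", train=False, download=True, transform=test_tf)
```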