#deep-learning
Every summary, chronological.
Generative AI: Prediction to Creation via Scale
Generative AI shifts machines from analyzing data (traditional AI's strength) to creating new content such as text or images, powered by Markov chains, deep learning, and massive datasets and compute; the shift drew $33.9B in investment in 2024.
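A minimal sketch of that prediction-to-creation step using the simplest generative model the summary names, a word-level Markov chain; the corpus and sampling settings are illustrative assumptions, not from the source.

```python
# Illustrative only: a first-order word-level Markov chain that "creates"
# new text by sampling from learned next-word transitions.
import random
from collections import defaultdict

corpus = ("the model learns patterns and the model creates new text "
          "from the patterns it learns").split()

# Count word -> next-word transitions (first-order chain).
transitions = defaultdict(list)
for cur, nxt in zip(corpus, corpus[1:]):
    transitions[cur].append(nxt)

def generate(start="the", length=8, seed=0):
    random.seed(seed)
    word, out = start, [start]
    for _ in range(length - 1):
        followers = transitions.get(word)
        word = random.choice(followers) if followers else random.choice(corpus)
        out.append(word)
    return " ".join(out)

print(generate())  # a short word sequence sampled from the chain
```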
GPU Bandwidth Limits LLM Speed, Not FLOPS
Generating one token from a 70B-parameter model on an H100 requires reading ~140GB of weights, roughly one operation per byte, making memory bandwidth the inference bottleneck rather than compute throughput.
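A back-of-envelope check of that claim; the H100 bandwidth and FLOP figures below are approximate public specs, assumed for illustration.

```python
# Rough arithmetic: single-stream decoding is bandwidth-bound, not compute-bound.
params = 70e9
bytes_per_param = 2              # fp16/bf16 weights -> 140 GB total
hbm_bandwidth = 3.35e12          # bytes/s, approximate H100 SXM HBM3 figure (assumption)
h100_fp16_flops = 1e15           # ~1 PFLOP/s dense fp16 (approximate, assumption)

bytes_per_token = params * bytes_per_param            # 140 GB of weight reads per token
bandwidth_bound = hbm_bandwidth / bytes_per_token     # ~24 tokens/s upper bound
compute_bound = h100_fp16_flops / (2 * params)        # ~2 FLOPs per parameter -> ~7000 tokens/s

print(f"bandwidth-bound: {bandwidth_bound:.0f} tok/s, compute-bound: {compute_bound:.0f} tok/s")
```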
Diffusion: Data-Efficient Framework Outshining Autoregressives on Scarce Data
Diffusion is a training framework, not an architecture: it manufactures extra training views by gradually noising clean data over ~1,000 steps, and it outperforms autoregressive models on 25-100M-token datasets where data is scarce but compute is abundant; it still lags in text because of slow inference and immature infrastructure.
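A minimal sketch of the forward noising process that produces those extra training views; the 1,000-step linear beta schedule is the classic DDPM choice, assumed here rather than taken from the article.

```python
# Forward noising: x_t = sqrt(abar_t) * x_0 + sqrt(1 - abar_t) * eps
import numpy as np

T = 1000
betas = np.linspace(1e-4, 0.02, T)          # classic DDPM linear schedule (assumed)
alphas_bar = np.cumprod(1.0 - betas)

def noise(x0, t, rng=np.random.default_rng(0)):
    """Return a noised view of x0 at timestep t (0-indexed)."""
    eps = rng.standard_normal(x0.shape)
    return np.sqrt(alphas_bar[t]) * x0 + np.sqrt(1.0 - alphas_bar[t]) * eps

x0 = np.ones(4)                              # toy "clean" sample
for t in [0, 250, 500, 999]:
    print(t, noise(x0, t).round(2))          # progressively closer to pure noise
```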
Caleb Writes Code · Karpathy's 200-Line Pure Python AI Builds
Train GPT and RNNs, play RL Pong, and craft a Bitcoin transaction in pure Python with zero dependencies, distilling neural nets to their essentials in under 200 lines.
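In the same zero-dependency spirit, here is a tiny pure-Python scalar autograd sketch; it is illustrative only, not Karpathy's actual code, and it assumes a simple expression graph with no topological sort.

```python
# Illustrative sketch: scalar autograd in pure Python, no dependencies.
class Value:
    def __init__(self, data, parents=(), grad_fn=None):
        self.data, self.grad = data, 0.0
        self.parents, self.grad_fn = parents, grad_fn

    def __add__(self, other):
        out = Value(self.data + other.data, (self, other))
        out.grad_fn = lambda: (out.grad, out.grad)                    # d(out)/d(each parent)
        return out

    def __mul__(self, other):
        out = Value(self.data * other.data, (self, other))
        out.grad_fn = lambda: (out.grad * other.data, out.grad * self.data)
        return out

    def backward(self):
        # Simple stack-based sweep; fine for small tree-like graphs.
        self.grad = 1.0
        stack = [self]
        while stack:
            node = stack.pop()
            if node.grad_fn:
                for parent, g in zip(node.parents, node.grad_fn()):
                    parent.grad += g
                    stack.append(parent)

# y = w*x + b, then dy/dw and dy/db via backprop
w, x, b = Value(2.0), Value(3.0), Value(1.0)
y = w * x + b
y.backward()
print(y.data, w.grad, b.grad)   # 7.0 3.0 1.0
```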
DeepMind's Diffusion Model Training Secrets
Sander Dieleman of DeepMind explains that data curation matters more than model tweaks, that latent autoencoders are what make diffusion scale, and that diffusion denoising amounts to spectral autoregression, together driving superior audiovisual generation.
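A small numpy sketch of the spectral-autoregression framing: natural signals have roughly 1/f power spectra, so a fixed dose of Gaussian noise buries high frequencies first and denoising recovers them last, coarse-to-fine across the spectrum; the toy 1/f signal and noise level are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4096
freqs = np.fft.rfftfreq(n, d=1.0)

# Build a toy 1/f ("pink") signal in the frequency domain, then add white noise.
spectrum = rng.standard_normal(len(freqs)) / np.maximum(freqs, freqs[1])
signal = np.fft.irfft(spectrum, n)
noisy = signal + 0.5 * signal.std() * rng.standard_normal(n)

def band_power(x, lo, hi):
    p = np.abs(np.fft.rfft(x)) ** 2
    mask = (freqs >= lo) & (freqs < hi)
    return p[mask].mean()

# White noise is flat, the signal is ~1/f^2, so SNR collapses at high frequency:
# exactly the coarse-to-fine ordering diffusion denoising follows.
for lo, hi in [(0.001, 0.01), (0.01, 0.1), (0.1, 0.5)]:
    snr = band_power(signal, lo, hi) / max(band_power(noisy - signal, lo, hi), 1e-12)
    print(f"band {lo}-{hi}: SNR ~ {snr:.1f}")
```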
AI Engineer · PCL: Confidence RL for Dynamic LLM Environments
The PCL algorithm folds predictive confidence scores into LLM RL rewards via ensembles and blended token- and sequence-level signals, enabling adaptation to nonstationary environments without retraining.
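The summary doesn't give PCL's internals, so this is only a generic sketch of the stated idea: ensemble disagreement as a confidence signal, blended from token- and sequence-level scores into the reward. Every name, weight, and shape below is hypothetical.

```python
# Hypothetical sketch, not the actual PCL implementation.
import numpy as np

def blended_reward(token_scores_ensemble, seq_score_ensemble,
                   w_token=0.5, w_seq=0.5, conf_weight=0.3):
    """token_scores_ensemble: (n_models, n_tokens); seq_score_ensemble: (n_models,)."""
    token_mean = token_scores_ensemble.mean(axis=0)        # per-token signal
    seq_mean = seq_score_ensemble.mean()                    # whole-sequence signal

    # Confidence = low disagreement across ensemble members.
    token_conf = 1.0 / (1.0 + token_scores_ensemble.std(axis=0))
    seq_conf = 1.0 / (1.0 + seq_score_ensemble.std())

    base = w_token * token_mean.mean() + w_seq * seq_mean
    confidence = w_token * token_conf.mean() + w_seq * seq_conf
    return base + conf_weight * confidence                  # reward fed to the RL update

rng = np.random.default_rng(0)
r = blended_reward(rng.normal(0.8, 0.1, size=(4, 16)), rng.normal(0.7, 0.05, size=4))
print(round(r, 3))
```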
Sentences Define Word Meanings via Self-Attention
Transformers ended three decades of sequential-processing bottlenecks with self-attention, in which every word weighs its relevance against the entire sentence context, the mechanism behind GPT and all modern LLMs.
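A minimal numpy sketch of scaled dot-product self-attention, where each position attends over the whole sentence; dimensions and the random projection weights are placeholders.

```python
import numpy as np

def self_attention(x):
    """x: (seq_len, d_model) token embeddings; projections are random placeholders."""
    d = x.shape[-1]
    rng = np.random.default_rng(0)
    Wq, Wk, Wv = (rng.standard_normal((d, d)) / np.sqrt(d) for _ in range(3))

    Q, K, V = x @ Wq, x @ Wk, x @ Wv
    scores = Q @ K.T / np.sqrt(d)                       # (seq_len, seq_len) pairwise relevance
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)      # softmax over the whole sentence
    return weights @ V                                   # each output mixes all positions

x = np.random.default_rng(1).standard_normal((5, 8))    # 5 "words", 8-dim embeddings
print(self_attention(x).shape)                           # (5, 8)
```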
LLM Inference: mmap Loading & Quantization Deep Dive
Efficient LLM inference hinges on mmap for lazy memory loading (e.g., <10s startup on llama.cpp) and quantization like GGUF K-Quants or AWQ/EXL2 to shrink 15GB models while preserving quality via salient weights and mixed precision.
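A sketch of both ideas: np.memmap for lazy, page-on-demand weight loading, and simple per-block 4-bit quantization to illustrate the principle behind K-Quants (not the actual GGUF layout); file names, block size, and the demo file are assumptions.

```python
import numpy as np

# --- lazy loading: np.memmap faults pages in only when sliced ---
np.arange(16, dtype=np.float16).tofile("demo_weights.f16.bin")   # stand-in for a real checkpoint
weights = np.memmap("demo_weights.f16.bin", dtype=np.float16, mode="r")
print(weights[:4])                                                # only touched pages are read

# --- per-block 4-bit quantization (principle only; byte-packing omitted) ---
def quantize_blocks(w, block=32):
    w = w.astype(np.float32).reshape(-1, block)
    scale = np.abs(w).max(axis=1, keepdims=True) / 7.0            # map to int4 range -8..7
    q = np.clip(np.round(w / np.maximum(scale, 1e-12)), -8, 7).astype(np.int8)
    return q, scale

def dequantize_blocks(q, scale):
    return (q.astype(np.float32) * scale).reshape(-1)

w = np.random.default_rng(0).standard_normal(1024).astype(np.float32)
q, s = quantize_blocks(w)
print("mean abs error:", float(np.abs(dequantize_blocks(q, s) - w).mean()))
```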
Caleb Writes Code · Karpathy's Blog: Pure Python AI From Scratch
Andrej Karpathy distills neural nets into minimal Python code—200 lines for GPT training/inference—plus RL, RNNs, and human baselines on vision tasks.
Preprocessing Swings CNN Accuracy from 65% to 87% on CIFAR-10
Raw CIFAR-10 pixels yield 65% test accuracy; normalization and standardization lift it to 69%; geometric augmentation holds ~67%; photometric brightness/contrast alone crashes it to 20%; the combined pipeline with a deeper CNN hits 87%.
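A sketch of that kind of pipeline in torchvision; the CIFAR-10 channel statistics are the commonly cited values, and the augmentation strengths are assumptions rather than the article's exact settings.

```python
from torchvision import datasets, transforms

cifar_mean = (0.4914, 0.4822, 0.4465)   # commonly cited CIFAR-10 channel stats
cifar_std = (0.2470, 0.2435, 0.2616)

train_tf = transforms.Compose([
    transforms.RandomCrop(32, padding=4),                   # geometric augmentation
    transforms.RandomHorizontalFlip(),
    transforms.ColorJitter(brightness=0.2, contrast=0.2),   # photometric, kept gentle
    transforms.ToTensor(),
    transforms.Normalize(cifar_mean, cifar_std),            # standardization
])

test_tf = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize(cifar_mean, cifar_std),
])

train_set = datasets.CIFAR10("./data", train=True, download=True, transform=train_tf)
test_set = datasets.CIFAR10("./data", train=False, download=True, transform=test_tf)
```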