Edge
№ 02 / SUMMARIES

#deep-learning

Every summary, chronological. Filter by category, tag, or source from the rail.

Tag · #deep-learning
DAY 01 · MAY 6, 2026 · 2 SUMMARIES
Generative AI

Generative AI: Prediction to Creation via Scale

Generative AI shifts machines from analyzing data (traditional AI's strength) to creating new content such as text and images, building on techniques from Markov chains to deep learning, fueled by massive datasets and compute, and drawing $33.9B of investment in 2024.
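
A toy illustration (mine, not the article's) of the oldest technique it names: even a first-order Markov chain already "creates" new word sequences by sampling the next word from observed transitions.

```python
import random
from collections import defaultdict

corpus = "the model predicts the next word and the next word becomes the context".split()

# First-order transition table: word -> list of words observed to follow it.
transitions = defaultdict(list)
for a, b in zip(corpus, corpus[1:]):
    transitions[a].append(b)

def generate(start="the", length=8, seed=0):
    """Sample a new sequence by walking the transition table."""
    random.seed(seed)
    out = [start]
    for _ in range(length - 1):
        choices = transitions.get(out[-1])
        if not choices:
            break
        out.append(random.choice(choices))
    return " ".join(out)

print(generate())
```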

Towards AI · AI & LLMs

GPU Bandwidth Limits LLM Speed, Not FLOPS

Generating one token from a 70B model on an H100 requires reading roughly 140GB of weights, about one operation per byte read, so memory bandwidth, not compute throughput, is the inference bottleneck.
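
A back-of-the-envelope check of that claim; the FP16 weight size, ~3.35 TB/s of H100 HBM bandwidth, and ~1 PFLOP/s of dense FP16 compute are assumed round numbers, not figures from the article.

```python
# Roofline-style estimate for single-stream decoding of a dense 70B model.
params = 70e9
weight_bytes = params * 2          # assumed FP16: ~140 GB streamed per generated token
hbm_bandwidth = 3.35e12            # assumed H100 HBM3 bandwidth, bytes/s

bandwidth_ceiling = hbm_bandwidth / weight_bytes   # tokens/s if purely bandwidth-bound
compute_ceiling = 1e15 / (2 * params)              # tokens/s if purely compute-bound (~2 FLOPs/param)

print(f"bandwidth-bound ceiling: {bandwidth_ceiling:.1f} tokens/s")   # ~24
print(f"compute-bound ceiling:   {compute_ceiling:.0f} tokens/s")     # thousands: compute is not the limit
```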

DAY 02 · APR 28, 2026 · 1 SUMMARY
Caleb Writes Code

Diffusion: Data-Efficient Framework Outshining Autoregressives on Scarce Data

Diffusion is a training framework, not an architecture: it creates extra training samples by gradually noising clean data over 1,000 steps, and it outperforms autoregressive models in the 25-100M-token regime where data is limited but compute is abundant; it still lags in text because of slow inference and immature infrastructure.
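
A minimal sketch of the forward (noising) process described above, assuming a standard DDPM-style linear beta schedule over 1,000 steps (the schedule is an assumption; the summary only says the data is gradually noised). Each re-noised version of the same clean example is a fresh training pair, which is where the "extra samples" framing comes from.

```python
import numpy as np

# DDPM forward process: q(x_t | x_0) = N(sqrt(alpha_bar_t) * x_0, (1 - alpha_bar_t) * I)
T = 1000
betas = np.linspace(1e-4, 0.02, T)        # assumed linear noise schedule
alphas_bar = np.cumprod(1.0 - betas)

def noise_sample(x0, t, rng):
    """Corrupt the clean sample x0 to diffusion step t in closed form."""
    eps = rng.standard_normal(x0.shape)
    return np.sqrt(alphas_bar[t]) * x0 + np.sqrt(1.0 - alphas_bar[t]) * eps

rng = np.random.default_rng(0)
x0 = rng.standard_normal(16)              # stand-in for one clean data point
for t in (0, 250, 999):
    xt = noise_sample(x0, t, rng)
    print(t, round(float(np.corrcoef(x0, xt)[0, 1]), 3))  # correlation with x0 decays toward 0
```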

DAY 03 · APR 26, 2026 · 1 SUMMARY
Andrej Karpathy Blog · AI & LLMs

Karpathy's 200-Line Pure Python AI Builds

Train a GPT, RNNs, and an RL Pong agent, and even build a Bitcoin transaction, in pure Python with zero dependencies, distilling neural nets to their essentials in under 200 lines.
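
In the same spirit (but not Karpathy's actual code), a dependency-free scalar autograd node shows how little machinery gradient-based training really needs; everything here is plain Python written for illustration.

```python
# Minimal reverse-mode autodiff on scalars, zero dependencies.
class Value:
    def __init__(self, data, parents=()):
        self.data, self.grad = data, 0.0
        self._parents, self._grad_fn = parents, None

    def __add__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        out = Value(self.data + other.data, (self, other))
        def grad_fn():
            self.grad += out.grad
            other.grad += out.grad
        out._grad_fn = grad_fn
        return out

    def __mul__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        out = Value(self.data * other.data, (self, other))
        def grad_fn():
            self.grad += other.data * out.grad
            other.grad += self.data * out.grad
        out._grad_fn = grad_fn
        return out

    def backward(self):
        # Topological order, then apply the chain rule from the output back to the leaves.
        order, seen = [], set()
        def visit(v):
            if v not in seen:
                seen.add(v)
                for p in v._parents:
                    visit(p)
                order.append(v)
        visit(self)
        self.grad = 1.0
        for v in reversed(order):
            if v._grad_fn:
                v._grad_fn()

# One gradient-descent step on loss = (w * x - y)^2
w, x, y = Value(0.5), Value(2.0), 3.0
diff = w * x + (-y)
loss = diff * diff
loss.backward()
w.data -= 0.1 * w.grad
print(loss.data, w.grad)   # 4.0, -8.0
```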

DAY 04 · APR 21, 2026 · 3 SUMMARIES
AI Engineer · AI & LLMs

DeepMind's Diffusion Model Training Secrets

Sander from DeepMind argues that data curation trumps model tweaks, that latent autoencoders are what let diffusion scale, and that denoising behaves like spectral autoregression, together yielding superior audiovisual generation.
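
A toy numeric illustration (mine, not from the talk) of the "spectral autoregression" view: natural-signal spectra fall off with frequency while Gaussian noise is flat, so as noise grows the high-frequency bands drown first and denoising effectively reveals content coarse to fine.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1024
freqs = np.fft.rfftfreq(n)[1:]

# Synthesize a "natural-ish" signal with an assumed 1/f power spectrum.
spectrum = (1.0 / freqs) * np.exp(2j * np.pi * rng.random(freqs.size))
signal = np.fft.irfft(np.concatenate(([0.0], spectrum)), n=n)
signal /= signal.std()

sig_power = np.abs(np.fft.rfft(signal))[1:] ** 2
half = len(freqs) // 2
for noise_std in (0.1, 0.5, 2.0):
    noise_power = np.abs(np.fft.rfft(noise_std * rng.standard_normal(n)))[1:] ** 2
    low = (sig_power[:half] > noise_power[:half]).mean()
    high = (sig_power[half:] > noise_power[half:]).mean()
    # High frequencies fall below the noise floor long before the low ones do.
    print(f"noise_std={noise_std}: low-freq bins above noise {low:.0%}, high-freq {high:.0%}")
```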

Towards AI

PCL: Confidence RL for Dynamic LLM Environments

The PCL algorithm integrates predictive confidence scores into LLM RL rewards via model ensembles and blended token- and sequence-level signals, enabling adaptation to nonstationary environments without retraining.
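
The paper's exact update rule isn't reproduced here, but a hypothetical sketch of the shape it describes, where an ensemble's per-token probabilities are blended into token-level and sequence-level confidence and mixed into the scalar reward, might look like this (all names, weightings, and numbers are made up):

```python
import numpy as np

def blended_confidence(ensemble_token_probs, alpha=0.5):
    """ensemble_token_probs: (n_models, n_tokens) probabilities of the emitted tokens."""
    mean_probs = ensemble_token_probs.mean(axis=0)   # ensemble agreement per token
    token_conf = mean_probs.mean()                   # token-level signal
    seq_conf = np.exp(np.log(mean_probs).mean())     # sequence-level signal (geometric mean)
    return alpha * token_conf + (1 - alpha) * seq_conf

def shaped_reward(task_reward, ensemble_token_probs, beta=0.2):
    """Mix the environment's reward with the confidence signal before the RL update."""
    return task_reward + beta * blended_confidence(ensemble_token_probs)

probs = np.array([[0.9, 0.7, 0.8],    # model 1's probability of each emitted token
                  [0.8, 0.6, 0.9]])   # model 2
print(shaped_reward(task_reward=1.0, ensemble_token_probs=probs))
```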

Generative AI · AI & LLMs

Sentences Define Word Meanings via Self-Attention

Transformers ended 30 years of sequential processing flaws by using self-attention, where every word weighs relevance from the entire sentence context, powering GPT and all modern LLMs.
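
A minimal single-head version of that mechanism, with toy sizes and random weights purely for illustration (the scaled dot-product form follows the standard Transformer).

```python
import numpy as np

def self_attention(x, Wq, Wk, Wv):
    """x: (seq_len, d_model) word embeddings -> context-mixed vectors of the same shape."""
    q, k, v = x @ Wq, x @ Wk, x @ Wv
    scores = q @ k.T / np.sqrt(k.shape[-1])            # relevance of every word to every word
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)     # softmax over the whole sentence
    return weights @ v                                 # each word's new, context-aware representation

rng = np.random.default_rng(0)
seq_len, d_model = 5, 8                                # e.g. a 5-word sentence
x = rng.standard_normal((seq_len, d_model))
Wq, Wk, Wv = (rng.standard_normal((d_model, d_model)) for _ in range(3))
print(self_attention(x, Wq, Wk, Wv).shape)             # (5, 8)
```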

DAY 05 · APR 20, 2026 · 3 SUMMARIES
Caleb Writes Code · AI & LLMs

LLM Inference: mmap Loading & Quantization Deep Dive

Efficient LLM inference hinges on mmap for lazy memory loading (e.g., <10s startup on llama.cpp) and quantization like GGUF K-Quants or AWQ/EXL2 to shrink 15GB models while preserving quality via salient weights and mixed precision.
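
A small sketch of both ideas using numpy stand-ins; the file, shapes, and row-wise int8 scheme are illustrative only (GGUF K-Quants and AWQ/EXL2 are considerably more elaborate).

```python
import numpy as np

# Create a small dummy weight file so the mmap demo is self-contained.
rng = np.random.default_rng(0)
rng.standard_normal((4096, 512)).astype(np.float16).tofile("weights.bin")

# 1) mmap-style lazy loading: the OS pages data in only when a slice is actually touched.
weights = np.memmap("weights.bin", dtype=np.float16, mode="r", shape=(4096, 512))
block = np.asarray(weights[:64], dtype=np.float32)      # only these pages get read from disk

# 2) Toy row-wise int8 quantization: half the size of fp16, with bounded rounding error.
scale = np.abs(block).max(axis=1, keepdims=True) / 127.0
q = np.round(block / scale).astype(np.int8)
dequant = q.astype(np.float32) * scale

print(q.nbytes / block.astype(np.float16).nbytes)       # 0.5: int8 storage vs fp16
print(float(np.abs(dequant - block).max()))             # worst-case per-weight error
```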

Andrej Karpathy Blog · AI & LLMs

Karpathy's Blog: Pure Python AI From Scratch

Andrej Karpathy distills neural nets into minimal Python code—200 lines for GPT training/inference—plus RL, RNNs, and human baselines on vision tasks.

Level Up Coding · Data Science & Visualization

Preprocessing Swings CNN Accuracy from 65% to 87% on CIFAR-10

Raw CIFAR-10 pixels yield 65% test accuracy; normalization/standardization lifts it to 69%; geometric augmentation holds around 67%; photometric brightness/contrast jitter crashes it to 20%; a combined pipeline with a deeper CNN hits 87%.
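
A typical pipeline along those lines, assuming torchvision is available; the channel statistics are the commonly used CIFAR-10 values rather than necessarily the article's, and aggressive brightness/contrast jitter is deliberately omitted given the reported collapse to 20%.

```python
from torchvision import transforms

CIFAR10_MEAN = (0.4914, 0.4822, 0.4465)   # commonly cited CIFAR-10 channel statistics
CIFAR10_STD = (0.2470, 0.2435, 0.2616)

train_transform = transforms.Compose([
    transforms.RandomCrop(32, padding=4),              # geometric augmentation
    transforms.RandomHorizontalFlip(),
    transforms.ToTensor(),                             # raw uint8 pixels -> [0, 1] floats
    transforms.Normalize(CIFAR10_MEAN, CIFAR10_STD),   # the normalization/standardization step
])

test_transform = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize(CIFAR10_MEAN, CIFAR10_STD),   # no augmentation at evaluation time
])
```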