Automate YouTube Shorts with Claude Code & Remotion

With Claude Code, a full YouTube clipping agent can be built in 15-30 minutes. The agent analyzes transcripts for high-tension moments, generates HeyGen avatar hooks from a library of 1000+ viral templates, trims footage with FFmpeg, adds captions via Remotion, and outputs 9:16 shorts.

Pipeline Planning Delivers Full Architecture in Minutes

Start Claude Code in Antigravity (antigravity.google) in /plan mode to outline the entire system before writing any code. Provide a detailed prompt that specifies the inputs (a folder of MP4 videos plus transcripts), the goal (extract 5 high-tension moments per video via Claude analysis, an approach inspired by Alex Hormozi's advice), and the outputs (vertical 9:16 MP4s with HeyGen avatar intro hooks, Remotion animated captions, and FFmpeg trims). Include a viral-hooks.md file with 1000+ templates such as "This represents your X before, during, and after mind-blowing method." Claude produces a step-by-step plan in 5-6 minutes: Claude for transcript analysis, FFmpeg for fast trimming, HeyGen for 5-10s avatar hooks (select the best template, fill its variables from context, end with "Watch this"), Remotion for TikTok-style captions (one word at a time), and FFmpeg for concatenation. The plan also lays out a project structure with folders for inputs and outputs. Front-loading the planning this way limits scope creep and makes a 15-minute initial build feasible.
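The "fast trimming" step in the plan can be sketched as a small helper that turns one selected moment into an FFmpeg argument list. This is a minimal sketch, not the code Claude generated in the source: the `ClipPlan` shape and the `buildTrimArgs` name are hypothetical, and it assumes stream copy (`-c copy`) is acceptable, which is quick but snaps cuts to keyframes.

```typescript
// Hypothetical shape for one "high-tension moment" chosen from a transcript.
interface ClipPlan {
  sourcePath: string; // input MP4
  startSec: number;   // clip start, in seconds
  endSec: number;     // clip end, in seconds
  outputPath: string; // where the trimmed clip is written
}

// Build FFmpeg arguments that trim [startSec, endSec) without re-encoding.
// Assumption: stream copy is good enough here. It is fast, but cuts land on
// keyframes; drop "-c copy" to re-encode for frame-accurate cuts.
function buildTrimArgs(clip: ClipPlan): string[] {
  if (clip.endSec <= clip.startSec) {
    throw new Error(`Clip ends before it starts: ${clip.startSec} -> ${clip.endSec}`);
  }
  const duration = clip.endSec - clip.startSec;
  return [
    "-y",                            // overwrite the output if it exists
    "-ss", clip.startSec.toFixed(3), // seek before opening the input (fast)
    "-i", clip.sourcePath,
    "-t", duration.toFixed(3),       // keep this many seconds
    "-c", "copy",                    // copy streams instead of re-encoding
    clip.outputPath,
  ];
}
```

The returned array can be passed to Node's `child_process.spawn("ffmpeg", args)`. Keeping the arguments as an array rather than a shell string avoids quoting problems with file names that contain spaces.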

Tool Setup Unlocks Programmatic Video Editing

Install Claude Code with its quick-start command in the Antigravity terminal. Add the Remotion skill globally (npx @remotion/mcp@latest install) for code-based video rendering; Remotion is free and handles captions, motion graphics, and compositing. Create a .env file containing the Anthropic API key (platform.anthropic.com) and the HeyGen API key, avatar ID, and voice ID (heygen.com; clone your voice in HeyGen or import one from ElevenLabs). HeyGen generates the hooks: a wide shot for the hook line, a 30% punch-in zoom on "Watch this," and no gaps in speech. FFmpeg handles picture-in-picture reformatting (screen recording in the top half, facecam filling the bottom half). Enable auto-accept edits so Claude Code can build index.ts and its utility modules in one pass. Compared with manual editors, Remotion is well suited to deterministic overlays, and the pipeline scales to batches of 40+ videos without per-clip tweaks.
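The picture-in-picture reformatting described above (screen on top, facecam below) maps onto an FFmpeg filter graph built from the `split`, `crop`, `scale`, and `vstack` filters. The following is a rough sketch under stated assumptions: the `Rect` type, both function names, and the 1080x1920 default output size are illustrative, and the crop rectangles have to be measured for each recording layout.

```typescript
// A crop rectangle in source-video pixels (hypothetical helper type).
interface Rect { x: number; y: number; w: number; h: number; }

// One branch of the graph: crop a region, then scale it to the target size.
// Note: scale stretches the region if its aspect ratio differs from
// outW:outH, so choose crop rectangles with a matching aspect ratio.
function cropScale(input: string, r: Rect, outW: number, outH: number, label: string): string {
  return `[${input}]crop=${r.w}:${r.h}:${r.x}:${r.y},scale=${outW}:${outH}[${label}]`;
}

// Build a -filter_complex string that stacks the screen region above the
// facecam region in a 9:16 frame. Each region gets half the output height.
function buildStackedFilter(screen: Rect, facecam: Rect, outW = 1080, outH = 1920): string {
  const half = outH / 2;
  return [
    "[0:v]split=2[s][f]",                    // duplicate the source video
    cropScale("s", screen, outW, half, "top"),
    cropScale("f", facecam, outW, half, "bottom"),
    "[top][bottom]vstack=inputs=2[v]",       // top half over bottom half
  ].join(";");
}
```

The resulting string would be passed as `-filter_complex <graph> -map "[v]" -map 0:a?` so the stacked video is paired with the original audio when present. Cropping each region before scaling means each region is upscaled only as far as its own half of the frame requires.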

Iterative Refinement Yields Usable Clips Fast

Run the pipeline on a single video first: it processes one folder's MP4 and transcript into 5 clips. The initial output is already impressive (avatar hook, trimmed content, and captions), but it needs fixes, delivered to Claude Code as notes: redirect outputs to the current project folder rather than a previous project's, raise the captions by 100-150px, center them on the split-screen layout, and add a text hook in the top third that supports the spoken one. A rerun applies the refinements: blurry upscales are fixed by stacking the sources vertically, and captions over the HeyGen hook are synced word by word. In total, it takes 20-30 minutes to reach production-ready shorts, despite first-pass issues such as wrong folder paths or running out of credits. Scale up by batching folders; the next step is auto-publishing. The trade-off is that AI avatars need credits and tuning to look polished, but the pipeline automates roughly 80% of the work, leaving manual effort for polishing hooks. The result repurposes long-form videos into viral-ready TikTok, Reels, and Shorts clips, boosting distribution without daily editing.
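Word-by-word caption sync comes down to mapping the current frame to the word being spoken at that moment. Here is a minimal sketch of that lookup, assuming word-level timestamps are available from the transcript or the HeyGen response. The `TimedWord` shape and the `activeWordAtFrame` name are hypothetical, not taken from the source project.

```typescript
// One caption word with its spoken time range (hypothetical shape; assumes
// word-level timestamps exist for the clip).
interface TimedWord { text: string; startSec: number; endSec: number; }

// Return the word to display at a given frame, or null during silence.
// The range is half-open: a word shows from startSec up to, but not
// including, endSec.
function activeWordAtFrame(words: TimedWord[], frame: number, fps: number): TimedWord | null {
  const t = frame / fps; // convert the frame index to seconds
  for (const w of words) {
    if (t >= w.startSec && t < w.endSec) return w;
  }
  return null;
}
```

Inside a Remotion component, `frame` would come from `useCurrentFrame()` and `fps` from `useVideoConfig()`, with the returned word rendered in an absolutely positioned element. A caption-height note like "raise by 100-150px" then becomes a one-line change to that element's `bottom` style.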

Summarized by x-ai/grok-4.1-fast via openrouter

© 2026 Edge