AI Agent Skills: Procedural Knowledge via Markdown
Skills add procedural knowledge to AI agents through simple skill.md files. YAML frontmatter supplies the name and description that trigger a skill, three-tier progressive disclosure keeps token usage within limits, and an open Apache 2.0 standard makes skills portable across platforms such as Claude Code and OpenAI Codex.
Skill Structure Delivers Step-by-Step Workflows
AI agents excel at reasoning and recall (e.g., Kubernetes architecture, SQL history, the airspeed of an unladen swallow) but struggle with procedural tasks, such as producing a 47-step compliant financial report, without exhaustive prompting or risky guessing. Skills solve this with a skill.md markdown file in a folder. The YAML frontmatter requires two fields: 'name' (e.g., PDF Builder) and 'description', which acts as the trigger condition (e.g., "use when the user asks to extract a PDF") that the LLM matches via reasoning. Below the frontmatter, plain markdown provides instructions, rules, step-by-step workflows, and input/output examples. Optional folders extend a skill: /scripts for executable JS/Python/Bash, /references for extra docs loaded on demand, and /assets for templates and data files. This teaches agents exact sequences and judgment calls; skills are versionable via Git and portable between platforms.
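A minimal skill.md following this layout might look like the sketch below. Only the frontmatter fields (name, description) come from the structure described above; the workflow steps and the scripts/extract.py helper are hypothetical illustrations.

```markdown
---
name: PDF Builder
description: Use when the user asks to extract text or tables from a PDF.
---

# PDF extraction workflow

1. Confirm the input file exists and is a PDF.
2. Run scripts/extract.py on the file (hypothetical helper in /scripts).
3. If tables are requested, also consult references/table-tips.md.

## Example
Input: "Pull the tables out of report.pdf"
Output: one CSV file per extracted table.
```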
Impact: Replaces per-task 47-step prompts with reusable, trigger-activated procedures, enabling agents to handle repeatable tasks reliably without context bloat.
Progressive Disclosure Scales to Hundreds of Skills
Loading full details from hundreds of skills at startup would exhaust LLM token budgets. Progressive disclosure uses three tiers: Tier 1 loads only name/description metadata (a handful of tokens per skill, like a table of contents). Tier 2 pulls in the full skill.md body once the LLM's reasoning matches a task to a description. Tier 3 loads the optional /scripts, /references, and /assets only when a specific step needs them. The agent starts lightweight and expands its context just-in-time.
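The first two tiers can be sketched in a few lines of Python. This is a minimal illustration, not a real agent runtime: it assumes each skill is a string containing "---"-delimited frontmatter, and the function names are invented for this sketch.

```python
import re

def parse_frontmatter(text):
    """Split a skill.md string into (metadata dict, markdown body)."""
    m = re.match(r"---\n(.*?)\n---\n?(.*)", text, re.S)
    if not m:
        return {}, text
    meta = {}
    for line in m.group(1).splitlines():
        key, _, value = line.partition(":")
        meta[key.strip()] = value.strip()
    return meta, m.group(2)

def tier1_catalog(skill_files):
    """Tier 1: expose only name/description per skill (a table of contents)."""
    return [
        {"name": meta.get("name"), "description": meta.get("description")}
        for meta, _ in map(parse_frontmatter, skill_files.values())
    ]

def tier2_body(skill_files, name):
    """Tier 2: load the full body only once the LLM has picked a skill."""
    for text in skill_files.values():
        meta, body = parse_frontmatter(text)
        if meta.get("name") == name:
            return body
    return None
```

At startup the agent injects only `tier1_catalog(...)` into context; `tier2_body(...)` runs later, when the model decides a description matches the task.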
Impact: Handles 100+ skills without overwhelming context windows, ensuring fast startup and relevant knowledge injection, critical for production agents.
Skills Provide Procedural Memory, Complementing Other Methods
Skills target procedural knowledge (how-to sequences with judgment), mirroring human procedural memory (e.g., riding a scooter in Rome). Compare:
- MCP (Model Context Protocol): Grants tool/API access but not when/how to use.
- RAG: Supplies factual chunks from knowledge bases but no workflows.
- Fine-tuning: Bakes knowledge into weights (permanent but costly, breaks on model updates).
Skills complement these rather than replace them: MCP grants the capability to invoke a tool while skills supply the timing and judgment, and RAG supplies facts during execution. Because skills are plain files, they update easily with no retraining. The open standard at agentskills.io (Apache 2.0) is adopted by Claude Code, OpenAI Codex, and others: build once, run anywhere.
Impact: Agents gain human-like memory types (semantic=RAG, episodic=chat logs, procedural=skills), turning general LLMs into task specialists.
Security: Audit Scripts Like Dependencies
Skills derive much of their power from executable scripts, which can touch the file system, environment variables, and API keys; this creates risks of prompt injection, tool poisoning, and malware in publicly shared skills. Treat skills like software dependencies: review the code before executing it locally.
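As one illustration of that review step, a naive static pass can flag lines in a skill's scripts that deserve a closer human look. The pattern list below is an assumption of this sketch, not an official audit standard, and is no substitute for actually reading the code.

```python
import re

# Illustrative patterns that warrant human review (assumed list, not
# an official checklist): network calls, secret access, shell execution.
RISKY_PATTERNS = {
    "network access": r"requests\.|urllib|fetch\(|curl |wget ",
    "env/secret access": r"os\.environ|process\.env|API_KEY",
    "shell execution": r"subprocess|os\.system|child_process|eval\(",
}

def flag_risky_lines(script_text):
    """Return (line_no, category, line) tuples that deserve human review."""
    findings = []
    for i, line in enumerate(script_text.splitlines(), 1):
        for category, pattern in RISKY_PATTERNS.items():
            if re.search(pattern, line):
                findings.append((i, category, line.strip()))
    return findings
```

A flagged line is not necessarily malicious; the point is to direct the reviewer's attention before any script from a public skill runs locally.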
Impact: Enables safe, powerful automation while avoiding common open ecosystem pitfalls.