Reveal Agent Execution Details in Debug Logs

Open Agent Debug Logs from any session's three-dot menu to view session-specific details: which instructions, agents, hooks, and custom skills were loaded, including their file sources. Logs capture your input message, each tool call, and every LLM call with token counts for optimization. Session summaries show the session type (local), status, model, turn count, tool calls, total tokens, errors, and events. Click the agent flowchart for a visual step-by-step breakdown; expand complex calls to trace execution flow and diagnose why an output differs from expectations.
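To make the log contents concrete, the sketch below tallies token counts by event type from a session log. The JSON schema here is a hypothetical illustration for the example, not the actual VS Code log format.

```python
# Sketch: tallying token usage from an exported agent debug log.
# NOTE: this event schema is an assumption for illustration; it is
# not the real VS Code / Copilot log format.
import json

sample_log = """
[
  {"event": "llm_call", "model": "gpt-4o", "tokens": 1200},
  {"event": "tool_call", "name": "read_file", "tokens": 300},
  {"event": "llm_call", "model": "gpt-4o", "tokens": 2500}
]
"""

def summarize(events):
    """Group token counts by event type."""
    totals = {}
    for e in events:
        totals[e["event"]] = totals.get(e["event"], 0) + e.get("tokens", 0)
    return totals

print(summarize(json.loads(sample_log)))
# -> {'llm_call': 3700, 'tool_call': 300}
```

A tally like this is what lets you spot, at a glance, which tool calls or LLM turns dominate a session's token budget.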

For example, if a skill or instruction fails to load, the logs pinpoint the loading path (user-level or workspace-level), giving you transparency beyond the visible tool outputs thanks to VS Code and GitHub Copilot's open-source nature.

Inspect Raw LLM Interactions in Chat Debug View

Toggle Chat Debug View from the session list to access the unfiltered data sent to the LLM: system messages (with your customizations), user messages (including memory and preferences), context such as the current date and location, and full request/response chains. Hover over or expand an entry to see the model used, duration, and total tokens for that call.
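As a rough sketch of the kind of request chain the view exposes, the structure below mimics a single LLM call with its per-call metadata. All field names and values are illustrative assumptions, not the actual wire format.

```python
# Hypothetical shape of one raw LLM request, for illustration only;
# the real payload format is internal to VS Code / Copilot.
request = {
    "model": "gpt-4o",
    "messages": [
        {"role": "system", "content": "Agent instructions plus your customizations"},
        {"role": "user", "content": "User message, memory, and preferences"},
        {"role": "user", "content": "Context: current date, location, open files"},
    ],
}

# Per-call metadata shown on hover: model, duration, total tokens
# (numbers here are made up for the sketch).
metadata = {"model": request["model"], "duration_ms": 1840, "total_tokens": 5320}

for msg in request["messages"]:
    print(f'{msg["role"]:>8}: {msg["content"]}')
```

Seeing the system and context messages spelled out like this is usually how you catch a missing customization or an unexpectedly large context block.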

In a refactor session, you can view the progression from the initial request (e.g., "build a base62 encoder/decoder using Python 3.13") through each tool call to the final response summary. This granularity exposes why an agent underperforms, such as missing context or inefficient prompts.
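For reference, a base62 encoder/decoder like the one requested in that example session can be written in a few lines. This is one plausible implementation, not the session's actual output.

```python
# Minimal base62 encoder/decoder for non-negative integers.
import string

# 62 symbols: 0-9, A-Z, a-z
ALPHABET = string.digits + string.ascii_uppercase + string.ascii_lowercase

def encode(n: int) -> str:
    """Encode a non-negative integer as a base62 string."""
    if n == 0:
        return ALPHABET[0]
    digits = []
    while n:
        n, rem = divmod(n, 62)
        digits.append(ALPHABET[rem])
    return "".join(reversed(digits))

def decode(s: str) -> int:
    """Decode a base62 string back to an integer."""
    n = 0
    for ch in s:
        n = n * 62 + ALPHABET.index(ch)
    return n

print(encode(123456789))          # -> 8M0kX
print(decode(encode(123456789)))  # -> 123456789
```

Tracing a session like this in the debug view shows each intermediate tool call (file reads, edits, test runs) between the prompt and this final code.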

Troubleshoot Behavior and Optimize Token Usage

Query token usage mid-session (e.g., "how many tokens did I use?") for a breakdown: totals such as 214,000 tokens used against the compacted context window, with user context at roughly 1.6-6% and tool results and files growing over time but intelligently summarized by VS Code and Copilot to retain only key implementation details.
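The percentage breakdown the agent reports is simple arithmetic over the context window. The sketch below shows the calculation; the window size and category figures are illustrative assumptions, not real session numbers.

```python
# Sketch: each category's share of the context window.
# All figures are made up for illustration.
CONTEXT_WINDOW = 200_000  # assumed context window size for the model

usage = {
    "system_instructions": 3_200,
    "user_context": 5_800,    # memory and preferences
    "tool_results": 48_000,   # compacted/summarized over time
    "files": 31_000,
}

for category, tokens in usage.items():
    pct = tokens / CONTEXT_WINDOW * 100
    print(f"{category:>20}: {tokens:>7,} tokens ({pct:.1f}%)")

print(f"{'total':>20}: {sum(usage.values()):>7,} tokens")
```

With illustrative numbers like these, user context sits at about 2.9% of the window, in the same ballpark as the 1.6-6% range mentioned above.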

Invoke the /troubleshoot skill for issues like undetected skills; for example, ask "where are you loading skills from?" to confirm sources and fix loading errors. Unread sessions display badges, and you can allow all commands if needed. These tools ensure agents behave as expected, surfacing granular insights for setup tweaks before production builds.