How to Use Agent Skills

March 18, 2026

27 min episode · 2 min read

AI-Generated Summary

Published Mar 19, 2026

Key Takeaways

✓ Progressive Disclosure Architecture: Skills load in three layers: a metadata description of roughly 100 tokens, the full SKILL.md body, and linked supplementary files. The agent reads only the context it needs at each decision point, which prevents the system-prompt bloat that made earlier agents slower, more expensive, and less reliable as their capabilities expanded.
✓ The Gotcha Section: The highest-signal content in any skill is a dedicated section documenting the common failure points Claude hits when executing that skill. Update this section each time the agent makes a mistake, so the skill becomes a living document that accumulates institutional knowledge and prevents repeated errors.
✓ Two Skill Categories for Testing Strategy: Skills fall into two categories: capability uplift (Claude can't do the task reliably without the skill) and encoded preference (Claude can do each step, but the skill sequences the steps to match a team's workflow). Capability uplift skills may become obsolete as models improve; encoded preference skills are more durable, but only as valuable as their fidelity to the actual workflow.
✓ Skill Creator Tool for Non-Engineers: Anthropic updated its skill creator so that subject matter experts, not just engineers, can test and benchmark skills without writing code. It runs evals against multiple prompts, scores performance, runs A/B tests against base Claude, and automatically rewrites vague descriptions. Anthropic tested it on its own skills and saw improved triggering in five out of six cases.
✓ Skills as Cross-Platform Reusable Capabilities: Skills are supported across Claude Code, OpenAI Codex, GitHub Copilot, Cursor, and now Notion AI, so a skill authored once works across ecosystems. Notion's implementation lets users convert any page into a skill with one click, a sign that the reusable-capability model is converging across the AI stack, from consumer to enterprise.
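
The trigger testing mentioned in the skill creator takeaway can be imitated in miniature. The sketch below is not Anthropic's tool: it compares two candidate descriptions against hand-labelled prompts and reports how often each one "triggers" correctly. The names `TriggerCase`, `naive_trigger`, and `trigger_accuracy`, the sample descriptions, and the prompts are all hypothetical, and the word-overlap check is only an offline stand-in for asking the model whether it would load the skill.

```python
import re
from dataclasses import dataclass

# Small stopword list so that filler words do not count as matches.
STOPWORDS = {"a", "an", "and", "from", "in", "my", "or", "the", "this", "use", "when", "with"}


@dataclass(frozen=True)
class TriggerCase:
    prompt: str
    should_trigger: bool  # hand-labelled expectation for this prompt


def keywords(text: str) -> set[str]:
    """Lowercase alphabetic words in the text, minus stopwords."""
    return {w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS}


def naive_trigger(description: str, prompt: str, min_overlap: int = 2) -> bool:
    """Stand-in for the model's decision to load a skill (an assumption).

    A real eval would query the model; this counts shared keywords instead.
    """
    return len(keywords(description) & keywords(prompt)) >= min_overlap


def trigger_accuracy(description: str, cases: list[TriggerCase]) -> float:
    """Fraction of cases where the trigger decision matches the label."""
    correct = sum(naive_trigger(description, c.prompt) == c.should_trigger for c in cases)
    return correct / len(cases)


# Hypothetical A/B pair: a vague description versus a rewritten, specific one.
VAGUE = "Helps with documents."
SPECIFIC = (
    "Fill in PDF forms and extract tables from PDF files. "
    "Use when the user mentions a PDF, a form, or a scanned invoice."
)

CASES = [
    TriggerCase("Extract the tables from this PDF invoice", True),
    TriggerCase("Fill in this scanned tax form", True),
    TriggerCase("Write a haiku about autumn", False),
    TriggerCase("Summarize my meeting notes", False),
]

if __name__ == "__main__":
    for label, description in (("vague", VAGUE), ("specific", SPECIFIC)):
        print(f"{label}: {trigger_accuracy(description, CASES):.2f}")
```

In this toy setup the vague description never triggers, so it is right only on the prompts that should be ignored, while the specific one matches every label. Swapping `naive_trigger` for a real model call would keep the rest of the harness unchanged.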

What It Covers

The Claude Code team at Anthropic shares how they build and use agent skills: reusable folders of instructions, scripts, and resources that load contextually instead of bloating the system prompt. The episode covers skill architecture, nine key skill categories, best practices from Tariq's post, and how the concept applies to users at every level.
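
The three-layer loading described in the takeaways can be sketched as a small loader. This is a minimal illustration, not the actual Claude Code implementation. It assumes each skill lives in its own folder containing a `SKILL.md` file whose frontmatter has flat `name` and `description` fields. The `SkillLibrary` class, the `pdf-forms` skill, and its contents are hypothetical.

```python
import tempfile
from pathlib import Path


def parse_skill(text: str) -> tuple[dict[str, str], str]:
    """Split a skill file into (frontmatter fields, markdown body).

    Handles only flat `key: value` frontmatter between two `---` lines;
    a real loader would use a proper YAML parser.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}, text
    end = next((i for i, line in enumerate(lines[1:], start=1) if line.strip() == "---"), None)
    if end is None:  # opening delimiter without a closing one
        return {}, text
    meta: dict[str, str] = {}
    for line in lines[1:end]:
        if ":" in line:
            key, value = line.split(":", 1)
            meta[key.strip()] = value.strip()
    return meta, "\n".join(lines[end + 1:]).strip()


class SkillLibrary:
    """Hypothetical loader that exposes each layer through a separate call."""

    def __init__(self, root: Path) -> None:
        self.root = root

    def index(self) -> str:
        """Layer 1: one short line per skill, cheap enough for every prompt."""
        entries = []
        for skill_file in sorted(self.root.glob("*/SKILL.md")):
            meta, _ = parse_skill(skill_file.read_text(encoding="utf-8"))
            name = meta.get("name", skill_file.parent.name)
            entries.append(f"- {name}: {meta.get('description', '')}")
        return "\n".join(entries)

    def body(self, name: str) -> str:
        """Layer 2: full instructions, read only once the skill is chosen."""
        _, body = parse_skill((self.root / name / "SKILL.md").read_text(encoding="utf-8"))
        return body

    def resource(self, name: str, relative_path: str) -> str:
        """Layer 3: a linked file, read only when the body points to it."""
        return (self.root / name / relative_path).read_text(encoding="utf-8")


def demo() -> tuple[str, str, str]:
    """Build a throwaway skill folder and return the three layers in load order."""
    with tempfile.TemporaryDirectory() as tmp:
        skill_dir = Path(tmp) / "pdf-forms"
        (skill_dir / "reference").mkdir(parents=True)
        (skill_dir / "SKILL.md").write_text(
            "---\n"
            "name: pdf-forms\n"
            "description: Fill in PDF forms. Use when the user mentions a PDF form.\n"
            "---\n"
            "# PDF forms\n"
            "\n"
            "See reference/fields.md for field naming rules.\n"
            "\n"
            "## Gotchas\n"
            "- Checkbox values are case-sensitive.\n",
            encoding="utf-8",
        )
        (skill_dir / "reference" / "fields.md").write_text(
            "Field names use snake_case.\n", encoding="utf-8"
        )
        library = SkillLibrary(Path(tmp))
        return (
            library.index(),
            library.body("pdf-forms"),
            library.resource("pdf-forms", "reference/fields.md"),
        )


if __name__ == "__main__":
    for layer in demo():
        print(layer, end="\n\n")
```

Only the string returned by `index()` would sit in the prompt permanently. The body and the linked file cost tokens only when the agent asks for them, which is the point of the layering.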

Notable Moment

Anthropic found that although roughly 28,000 skills exist on ClawHub, the vast majority fit into just nine categories, a surprisingly narrow taxonomy given the volume. The team concluded that most agent work clusters around a predictable set of recurring task types, regardless of organization or industry.
