Agent Skills: Production-Grade Engineering Skills for AI Coding Agents

Introduction

In the rapidly evolving landscape of AI-powered software development, one challenge stands out: ensuring that AI coding agents follow the same rigorous engineering practices that senior developers apply to production code. Enter Agent Skills by Addy Osmani - a comprehensive collection of production-grade engineering skills specifically designed for AI coding agents.

With over 16,000 stars on GitHub, this project has captured the attention of developers worldwide. Agent Skills encodes the workflows, quality gates, and best practices that senior engineers use when building software, packaging them so AI agents follow them consistently across every phase of development.

The Problem Agent Skills Solves

AI coding agents default to the shortest path - which often means skipping specs, tests, security reviews, and the practices that make software reliable. Without structured guidance, these agents can produce code that works but lacks the quality attributes needed for production environments.

Agent Skills addresses this by providing structured workflows that enforce the same discipline senior engineers bring to production code. Each skill encodes hard-won engineering judgment: when to write a spec, what to test, how to review, and when to ship.

The Development Lifecycle

Development Lifecycle Workflow

The Agent Skills framework organizes development into six distinct phases, each with specific commands and skills that activate automatically:

DEFINE - The first phase focuses on clarifying what to build. The /spec command activates skills that help turn vague ideas into concrete proposals with clear acceptance criteria.

PLAN - Once requirements are clear, the /plan command breaks down specifications into small, atomic tasks with proper dependency ordering.

BUILD - The implementation phase uses /build to deliver changes incrementally, one slice at a time, ensuring each piece is tested and verified.

VERIFY - Testing and debugging happen with /test, following the principle that tests are proof of correctness.

REVIEW - Before merging, /review ensures code quality through systematic review processes.

SHIP - Finally, /ship deploys to production with proper safeguards and monitoring.

This lifecycle approach ensures that every phase of development receives appropriate attention, preventing the common pattern of rushing to implementation without proper planning.

Skills Architecture: 20 Production-Grade Skills

Skills Architecture

The framework includes 20 carefully crafted skills, organized by development phase:

Define Phase Skills

idea-refine - Structured divergent/convergent thinking to turn vague ideas into concrete proposals. Use when you have a rough concept that needs exploration and refinement.

spec-driven-development - Write a PRD covering objectives, commands, structure, code style, testing, and boundaries before any code. Essential when starting a new project, feature, or significant change.

Plan Phase Skills

planning-and-task-breakdown - Decompose specs into small, verifiable tasks with acceptance criteria and dependency ordering. Transforms specifications into implementable units.

Build Phase Skills

incremental-implementation - Thin vertical slices with implement, test, verify, commit cycles. Feature flags, safe defaults, and rollback-friendly changes for any multi-file modification.

test-driven-development - Red-Green-Refactor methodology with test pyramid (80/15/5), test sizes, DAMP over DRY principles, and the Beyonce Rule for comprehensive testing.

context-engineering - Feed agents the right information at the right time through rules files, context packing, and MCP integrations. Critical for maintaining output quality.

source-driven-development - Ground every framework decision in official documentation with verification, source citations, and clear flagging of unverified information.

frontend-ui-engineering - Component architecture, design systems, state management, responsive design, and WCAG 2.1 AA accessibility compliance for user-facing interfaces.

api-and-interface-design - Contract-first design incorporating Hyrum’s Law, One-Version Rule, error semantics, and boundary validation for robust interfaces.

Verify Phase Skills

browser-testing-with-devtools - Chrome DevTools MCP integration for live runtime data including DOM inspection, console logs, network traces, and performance profiling.

debugging-and-error-recovery - Five-step triage process: reproduce, localize, reduce, fix, guard. Includes stop-the-line rule and safe fallbacks for robust error handling.

Review Phase Skills

code-review-and-quality - Five-axis review with change sizing (~100 lines), severity labels (Nit/Optional/FYI), review speed norms, and splitting strategies for effective code reviews.

code-simplification - Chesterton’s Fence principle and Rule of 500 to reduce complexity while preserving exact behavior. Essential when code becomes harder to maintain.

security-and-hardening - OWASP Top 10 prevention, authentication patterns, secrets management, dependency auditing, and three-tier boundary system for security.

performance-optimization - Measure-first approach with Core Web Vitals targets, profiling workflows, bundle analysis, and anti-pattern detection for performance requirements.

Ship Phase Skills

git-workflow-and-versioning - Trunk-based development, atomic commits, change sizing (~100 lines), and the commit-as-save-point pattern for reliable version control.

ci-cd-and-automation - Shift Left philosophy, Faster is Safer principle, feature flags, quality gate pipelines, and failure feedback loops for continuous integration.

deprecation-and-migration - Code-as-liability mindset, compulsory vs advisory deprecation, migration patterns, and zombie code removal for managing technical debt.

documentation-and-adrs - Architecture Decision Records, API documentation, inline documentation standards - documenting the why behind decisions.

shipping-and-launch - Pre-launch checklists, feature flag lifecycle, staged rollouts, rollback procedures, and monitoring setup for confident deployments.

Skill Anatomy: How Each Skill Works

Skill Anatomy

Every skill follows a consistent anatomy designed for agent comprehension and execution:

Frontmatter - Contains the skill name and description, which agents use for discovery and activation decisions.

Overview - A concise explanation of what the skill does and why it matters, providing the elevator pitch for the workflow.

When to Use - Clear triggering conditions that help agents decide when to activate the skill, including both positive triggers and negative exclusions.

Process - The heart of the skill - a step-by-step workflow that agents follow. Must be specific and actionable, not vague advice.

Rationalizations - The most distinctive feature of well-crafted skills. These are common excuses agents use to skip important steps, paired with factual counter-arguments. They prevent agents from rationalizing their way out of following the process.

Red Flags - Observable signs that the skill is being violated, useful during code review and self-monitoring.

Verification - Exit criteria with evidence requirements. Every checkbox must be verifiable with proof - test output, build results, screenshots, etc.

Agent Integration: How Skills Work with AI Agents

Agent Integration Flow

Agent Skills integrates with popular AI coding tools through multiple mechanisms:

Claude Code Integration

The recommended approach uses the marketplace install:

      
        /plugin marketplace add addyosmani/agent-skills
/plugin install agent-skills@addy-agent-skills

For local development, clone the repository and point Claude to the plugin directory. Skills activate automatically based on the task context.

Cursor Integration

Copy any SKILL.md file into .cursor/rules/ or reference the full skills/ directory. Cursor’s rule system picks up the skill content and applies it during code generation.

Gemini CLI Integration

Install as native skills for auto-discovery:

      
        gemini skills install https://github.com/addyosmani/agent-skills.git --path skills

Or install from a local clone for development and customization.

GitHub Copilot Integration

Use agent definitions from agents/ as Copilot personas and skill content in .github/copilot-instructions.md for persistent context.

Other Agents

Skills are plain Markdown - they work with any agent that accepts system prompts or instruction files. The format is designed for maximum compatibility across different AI coding tools.

Slash Commands: Quick Access to Skills

Slash Commands

Seven slash commands map directly to the development lifecycle, each activating the appropriate skills automatically:

Command	Purpose	Key Principle
`/spec`	Define what to build	Spec before code
`/plan`	Plan how to build it	Small, atomic tasks
`/build`	Build incrementally	One slice at a time
`/test`	Prove it works	Tests are proof
`/review`	Review before merge	Improve code health
`/code-simplify`	Simplify the code	Clarity over cleverness
`/ship`	Ship to production	Faster is safer

Skills also activate automatically based on context - designing an API triggers api-and-interface-design, building UI triggers frontend-ui-engineering, and so on.

Agent Personas: Specialist Reviewers

Beyond skills, the framework includes pre-configured specialist personas for targeted reviews:

code-reviewer - Senior Staff Engineer perspective with five-axis code review using the “would a staff engineer approve this?” standard.

test-engineer - QA Specialist focused on test strategy, coverage analysis, and the Prove-It pattern for comprehensive testing.

security-auditor - Security Engineer conducting vulnerability detection, threat modeling, and OWASP assessment.

These personas provide specialized expertise when specific types of review are needed.

Reference Checklists

Quick-reference material that skills pull in when needed:

testing-patterns.md - Test structure, naming, mocking, React/API/E2E examples, and anti-patterns
security-checklist.md - Pre-commit checks, authentication, input validation, headers, CORS, OWASP Top 10
performance-checklist.md - Core Web Vitals targets, frontend/backend checklists, measurement commands
accessibility-checklist.md - Keyboard navigation, screen readers, visual design, ARIA, testing tools

Key Design Principles

Agent Skills embodies several key design choices that distinguish it from generic prompts:

Process, not prose - Skills are workflows agents follow, not reference docs they read. Each has steps, checkpoints, and exit criteria.

Anti-rationalization - Every skill includes a table of common excuses agents use to skip steps with documented counter-arguments.

Verification is non-negotiable - Every skill ends with evidence requirements. “Seems right” is never sufficient.

Progressive disclosure - The SKILL.md is the entry point. Supporting references load only when needed, keeping token usage minimal.

Google Engineering Practices

Skills bake in best practices from Google’s engineering culture, including concepts from “Software Engineering at Google” and Google’s engineering practices guide:

Hyrum’s Law in API design
The Beyonce Rule and test pyramid in testing
Change sizing and review speed norms in code review
Chesterton’s Fence in simplification
Trunk-based development in git workflow
Shift Left and feature flags in CI/CD
Dedicated deprecation skill treating code as a liability

These aren’t abstract principles - they’re embedded directly into the step-by-step workflows agents follow.

Getting Started

To start using Agent Skills with your AI coding workflow:

Choose your AI coding tool (Claude Code, Cursor, Gemini CLI, etc.)
Follow the setup instructions for your tool in the docs/ directory
Use slash commands to activate skills during development
Let skills auto-activate based on task context

The framework is designed to be immediately useful with minimal setup, while providing depth for complex projects.

Conclusion

Agent Skills represents a significant advancement in AI-assisted software development. By encoding production-grade engineering practices into structured workflows, it helps ensure that AI-generated code meets the same quality standards we expect from senior engineers.

Whether you’re building a new feature, refactoring existing code, or managing a complex migration, Agent Skills provides the guardrails and guidance needed for production-quality results. The combination of clear processes, anti-rationalization tables, and verification requirements creates a framework that truly helps AI agents think and work like experienced engineers.

With support for multiple AI coding tools and a comprehensive set of skills covering every phase of development, Agent Skills is an essential addition to any modern development workflow.