DSPy Agent Skills: Production-Grade DSPy 3.2 Skills for Coding Agents

DSPy Agent Skills is a synthesized, spec-compliant pack of five agent skills that turns Claude Code, Codex CLI, and any agentskills.io-compatible agent into a DSPy expert. Validated against DSPy 3.2.0 (the real API, not inferred from stale docs), it provides progressive disclosure from short references to deep documentation, runnable example scripts with offline dry-run mode, and committed baseline vs. GEPA-optimized performance numbers.

[Figure: Five Skills Overview]

Understanding the Five-Skill Pack

The overview diagram shows the five skills and how they relate to each other:

dspy-fundamentals: The foundation skill covers Signatures, Modules, Predict/ChainOfThought/ReAct, and save/load. It auto-invokes whenever you write any new DSPy code, providing the core API signatures and patterns you need to get started.

dspy-evaluation-harness: The evaluation skill covers writing metrics, splitting dev/val sets, and calling dspy.Evaluate. It auto-invokes when you need to measure DSPy program performance, providing the evaluation infrastructure that optimization depends on.

dspy-gepa-optimizer: The optimization skill covers optimizing and compiling DSPy programs with dspy.GEPA. It auto-invokes when you want to improve your DSPy program’s performance, providing guidance on the GEPA optimizer, which evolves a program’s prompts by reflecting on metric feedback and compiles an optimized program.

dspy-rlm-module: The long-context skill covers codebase QA, recursive exploration, and long-context reasoning via dspy.RLM. It auto-invokes when you need to work with large codebases or documents that exceed standard context windows.

dspy-advanced-workflow: The orchestration skill ties everything together. It auto-invokes for end-to-end builds, chaining the other four skills to produce a complete baseline-to-optimized pipeline. When you say “Build a DSPy sentiment classifier, optimize it with GEPA, and save the artifact,” this skill coordinates the entire workflow.
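
As a rough illustration of what the fundamentals skill covers, the sentiment classifier from the example prompt might start from a baseline like the sketch below. The signature name, field names, model string, and file name are assumptions made for illustration and are not taken from the pack. DSPy is imported inside the functions only so the file can be read and parsed on a machine without DSPy installed.

```python
from typing import Literal


def build_sentiment_classifier():
    """Return an unoptimized ChainOfThought classifier (illustrative names only)."""
    import dspy  # lazy import; requires DSPy to be installed

    class ClassifySentiment(dspy.Signature):
        """Classify the sentiment of a short review."""

        text: str = dspy.InputField()
        sentiment: Literal["positive", "negative", "neutral"] = dspy.OutputField()

    return dspy.ChainOfThought(ClassifySentiment)


def run_demo(model: str = "openai/gpt-4o-mini") -> None:
    """Configure an LM, run one prediction, and save the baseline program."""
    import dspy

    # Assumed model name; this call needs a valid API key at run time.
    dspy.configure(lm=dspy.LM(model))
    classifier = build_sentiment_classifier()
    prediction = classifier(text="The battery died after two days.")
    print(prediction.sentiment)
    classifier.save("sentiment_baseline.json")  # hypothetical file name
```

Calling `run_demo()` makes a real LLM request, so it is left uncalled here.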

Auto-Invocation Flow

[Figure: Invocation Flow]

Understanding the Auto-Invocation Flow

The invocation flow diagram shows how DSPy Agent Skills integrates with coding agents:

User Prompt: You describe what you want in natural language: “Build a DSPy sentiment classifier, optimize it with GEPA, and save the artifact.” No special syntax or commands are required.

Coding Agent (Claude Code / Codex CLI): The coding agent receives your prompt and checks its loaded skills for matching triggers. Each skill defines when it should auto-invoke based on the content of your request.

Skill Loader: The skill loader matches your prompt against the trigger patterns defined in each SKILL.md. For the sentiment classifier example, it matches dspy-advanced-workflow (which orchestrates the other four skills).

Skills Loaded: Each skill provides three layers of information: the short SKILL.md (triggers, API signatures, quick-start examples), the deep reference.md (full API details, edge cases, version-specific notes), and runnable example_*.py scripts with offline --dry-run mode.

Generated Code: The agent uses the loaded skill knowledge to generate a complete pipeline: baseline definition, evaluation, GEPA optimization, and artifact export. No further prompting is needed.
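
How an agent decides which skill to load is internal to the agent and is not specified here. The sketch below is only a toy model of the matching step, using keyword overlap. The trigger keywords are invented for illustration and are not the contents of the real SKILL.md files.

```python
# Toy model of trigger matching; the keyword sets below are invented.
TRIGGERS = {
    "dspy-fundamentals": {"signature", "module", "predict", "chainofthought"},
    "dspy-evaluation-harness": {"metric", "evaluate", "devset"},
    "dspy-gepa-optimizer": {"gepa", "optimize", "compile"},
    "dspy-rlm-module": {"rlm", "codebase", "long-context"},
    "dspy-advanced-workflow": {"build", "pipeline", "end-to-end"},
}


def match_skills(prompt: str) -> list[str]:
    """Return the skill names whose trigger keywords appear in the prompt."""
    words = {word.strip(".,:;!?\"'").lower() for word in prompt.split()}
    return [name for name, keywords in TRIGGERS.items() if words & keywords]


prompt = "Build a DSPy sentiment classifier, optimize it with GEPA, and save the artifact."
print(match_skills(prompt))
```

In this toy model the example prompt matches both dspy-gepa-optimizer and dspy-advanced-workflow. A real agent is likely to weigh the skill descriptions more flexibly than a keyword lookup does.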

Progressive Disclosure

Understanding the Progressive Disclosure Architecture

The progressive disclosure diagram shows the three-layer documentation architecture that each skill provides:

SKILL.md – Short Reference: The SKILL.md file is the entry point. It contains triggers (when the skill auto-invokes), API signatures (the function and class names you need), and quick-start examples. This is what the agent reads first to understand the skill’s scope and capabilities.

reference.md – Deep Documentation: When the agent needs more detail, it reads the reference.md file. This contains full API documentation, edge cases, version-specific notes (especially important for DSPy 3.1.3 vs 3.2.0 differences), and detailed usage patterns. This layer prevents the agent from hallucinating API details that changed between versions.

example_*.py – Runnable Scripts: Each skill includes runnable example scripts that can be executed in --dry-run mode (no API key needed) or with real LLM calls. These scripts serve as both documentation and validation: they demonstrate the correct API usage and can be run to verify that the generated code actually works.

80 Validation Tests: All three layers are validated by 80 tests that check frontmatter spec compliance, JSON schema correctness, Python AST validity, and skill-document correctness guards. This helps the skills remain accurate as DSPy evolves.
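
Two of the check types listed above, frontmatter compliance and Python AST validity, can be approximated with the standard library. This sketch is not the pack's test suite. The required keys (`name`, `description`) and the sample SKILL.md text are assumptions, and the parser handles only flat `key: value` lines rather than full YAML.

```python
import ast


def parse_frontmatter(text: str) -> dict[str, str]:
    """Parse a flat `key: value` frontmatter block delimited by `---` lines."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        raise ValueError("missing opening '---'")
    fields: dict[str, str] = {}
    for line in lines[1:]:
        if line.strip() == "---":
            return fields
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    raise ValueError("missing closing '---'")


def is_valid_python(source: str) -> bool:
    """Return True if the source parses to a Python AST (nothing is executed)."""
    try:
        ast.parse(source)
    except SyntaxError:
        return False
    return True


# Hypothetical SKILL.md content, for illustration only.
SAMPLE_SKILL_MD = """---
name: dspy-fundamentals
description: Use when writing new DSPy code.
---
# Body goes here
"""

fields = parse_frontmatter(SAMPLE_SKILL_MD)
assert {"name", "description"} <= set(fields)
assert is_valid_python("import dspy\nprint('ok')")
assert not is_valid_python("def broken(:")
```

Because `ast.parse` only parses, the example scripts can be checked this way without DSPy installed and without running any LLM call.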

BetterTogether Pipeline

[Figure: Pipeline]

Understanding the BetterTogether Pipeline

The pipeline diagram shows the four-stage workflow that dspy-advanced-workflow orchestrates:

1. Baseline – Define Signature + Module: You start by defining a DSPy Signature (input/output specification) and wrapping it in a Module (Predict, ChainOfThought, or ReAct). This is your unoptimized baseline – the raw LLM performance before any prompt engineering or optimization.

2. Evaluate – Define Metric + Split Data: Next, you define a metric function that measures your program’s performance, split your data into dev/val sets, and run dspy.Evaluate to establish baseline scores. This step is critical: you cannot improve what you cannot measure.

3. Optimize – GEPA Optimizer + Compile: The GEPA optimizer takes your baseline program, metric, and training data, then compiles an optimized version. GEPA reflects on metric feedback from program runs, proposes revised prompt instructions, and keeps the candidates that score best, with the goal of a program that outperforms the baseline on your metric.

4. Export – Save Artifact + Reuse: The optimized program is saved as an artifact that can be loaded with .load() for reuse in production. This creates a reproducible, version-controlled optimization result.
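
The four stages above can be sketched roughly as follows. The split helper and the metric are plain Python and run offline. The `run_pipeline` function is an assumption about how the DSPy calls fit together: the `auto="light"` budget, the reflection model name, and the output path are illustrative choices, and the skill's reference.md remains the place to confirm exact signatures for your DSPy version.

```python
import random
from types import SimpleNamespace


def split_examples(examples, val_fraction=0.5, seed=0):
    """Shuffle deterministically and split into (first_part, second_part)."""
    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * (1 - val_fraction))
    return shuffled[:cut], shuffled[cut:]


def sentiment_metric(gold, pred, trace=None, pred_name=None, pred_trace=None):
    """Score 1.0 for a case-insensitive label match, else 0.0.

    The trailing parameters are accepted because GEPA passes them;
    dspy.Evaluate calls the metric with only (gold, pred).
    """
    return float(gold.sentiment.strip().lower() == pred.sentiment.strip().lower())


def run_pipeline(program, trainset, valset, devset, out_path="optimized_program.json"):
    """Evaluate a baseline, optimize it with GEPA, and save the result."""
    import dspy  # lazy import; needs DSPy and a configured LM

    # Score on a devset that GEPA never sees, so the comparison stays fair.
    evaluate = dspy.Evaluate(devset=devset, metric=sentiment_metric, num_threads=4)
    baseline_result = evaluate(program)

    optimizer = dspy.GEPA(
        metric=sentiment_metric,
        auto="light",  # assumed budget preset
        reflection_lm=dspy.LM("openai/gpt-4o"),  # assumed reflection model
    )
    optimized = optimizer.compile(program, trainset=trainset, valset=valset)
    optimized.save(out_path)
    return baseline_result, evaluate(optimized)


# The metric needs no LM, so it can be checked offline.
gold = SimpleNamespace(sentiment="positive")
print(sentiment_metric(gold, SimpleNamespace(sentiment="Positive ")))  # 1.0
```

GEPA metrics may also return textual feedback alongside the score, which gives the reflection step more to work with. This sketch returns only a float to keep the example small.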

Verified Results: The pipeline includes committed baseline vs. optimized numbers from three real examples:

RAG QA: 75.77 -> 100.00 (+24.23 points)
Math Reasoning: 85.00 -> 93.33 (+8.33 points)
Invoice Extraction: 0.833 -> 0.931 (+0.098 F1)
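
The gains above are absolute differences, and the third result is on a 0 to 1 F1 scale while the first two are on a 0 to 100 scale. A short calculation, using only the numbers reported above, makes the three comparable as relative improvements:

```python
# Committed (baseline, optimized) scores as reported above.
RESULTS = {
    "RAG QA": (75.77, 100.00),
    "Math Reasoning": (85.00, 93.33),
    "Invoice Extraction": (0.833, 0.931),
}

for task, (baseline, optimized) in RESULTS.items():
    absolute = optimized - baseline
    relative = 100 * absolute / baseline
    print(f"{task}: +{absolute:.3f} absolute, +{relative:.1f}% relative")
```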

Installation

Claude Code (via marketplace)

      
    /plugin marketplace add intertwine/dspy-agent-skills
    /plugin install dspy-agent-skills@dspy-agent-skills

Agent Skills CLI (npx skills)

      
    npx skills add intertwine/dspy-agent-skills --list
    npx skills add intertwine/dspy-agent-skills --skill '*' -a codex -y

Repo Checkout (both Claude Code and Codex CLI)

      
    git clone https://github.com/intertwine/dspy-agent-skills
    cd dspy-agent-skills
    ./scripts/install.sh    # symlinks into ~/.claude/skills/ and ~/.agents/skills/

Flags: --claude-only, --codex-only, --copy (copy instead of symlink), --uninstall, --dry-run.

Manual

Drop skills/* into ~/.claude/skills/ (Claude Code) or ~/.agents/skills/ (Codex CLI).
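
For readers who want to see what a symlink or copy install amounts to, here is a minimal Python sketch. It is not the logic of scripts/install.sh, which is not shown in this post. The function name and its behavior of skipping existing entries are assumptions, and the demo writes to a temporary directory instead of your real skills folder.

```python
import shutil
import tempfile
from pathlib import Path


def install_skills(source: Path, target: Path, copy: bool = False) -> list[Path]:
    """Symlink (or copy) each skill directory from source into target."""
    target.mkdir(parents=True, exist_ok=True)
    installed = []
    for skill_dir in sorted(p for p in source.iterdir() if p.is_dir()):
        destination = target / skill_dir.name
        if destination.exists() or destination.is_symlink():
            continue  # leave existing installs untouched
        if copy:
            shutil.copytree(skill_dir, destination)
        else:
            destination.symlink_to(skill_dir.resolve(), target_is_directory=True)
        installed.append(destination)
    return installed


# Demo against a throwaway directory rather than the real ~/.claude/skills/.
with tempfile.TemporaryDirectory() as tmp:
    source = Path(tmp) / "skills"
    (source / "dspy-fundamentals").mkdir(parents=True)
    (source / "dspy-fundamentals" / "SKILL.md").write_text("---\nname: dspy-fundamentals\n---\n")
    installed = install_skills(source, Path(tmp) / "home" / ".claude" / "skills", copy=True)
    print([p.name for p in installed])
```

The demo uses `copy=True` because creating symlinks can require extra privileges on Windows.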

Key Features

Feature	Description
Five Spec-Compliant Skills	Fundamentals, Evaluation, GEPA Optimizer, RLM Module, Advanced Workflow
DSPy 3.2.0 Validated	Tested against the real API, not inferred from stale docs
Progressive Disclosure	Short SKILL.md + deep reference.md + runnable example scripts
Offline Dry-Run	All examples run with --dry-run (no API key needed)
BetterTogether Pipeline	End-to-end baseline -> evaluate -> optimize -> export workflow
80 Validation Tests	Frontmatter spec, JSON schema, Python AST, skill-doc guards
Dual-Agent Support	Single source of truth for both Claude Code and Codex CLI
Plugin Manifest	Marketplace manifest for one-click install
Committed Results	Baseline vs. GEPA-optimized numbers from real LLM runs
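
The offline dry-run feature in the table above can be pictured as a flag that returns before any LLM call is made. The sketch below shows that pattern with argparse. The step names and the output format are invented, and the pack's own example scripts may structure their dry-run mode differently.

```python
import argparse


def main(argv=None) -> str:
    """Parse flags and return a plan description when --dry-run is set."""
    parser = argparse.ArgumentParser(description="Example script with an offline mode.")
    parser.add_argument(
        "--dry-run",
        action="store_true",
        help="describe the steps without calling an LLM",
    )
    args = parser.parse_args(argv)

    steps = ["define signature", "build module", "run prediction"]
    if args.dry_run:
        # Offline path: report the plan and stop before any network call.
        return "DRY RUN: " + " -> ".join(steps)

    # Live path: a real script would configure an LM and call it here.
    raise SystemExit("live mode is omitted from this sketch")


print(main(["--dry-run"]))
```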

Troubleshooting

Issue	Solution
Skills not auto-invoking	Verify skills are installed in `~/.claude/skills/` or `~/.agents/skills/`
DSPy version mismatch	Use `env -u UV_EXCLUDE_NEWER uv run --with dspy==3.2.0` to force 3.2.0
`--dry-run` fails	Ensure DSPy is installed: `pip install dspy==3.2.0`
GEPA optimization stalls	Check that your OPENAI_API_KEY is set and has sufficient quota
Validation tests fail	Run `uv run --with pytest python -m pytest tests/ -v` for details
Symlink issues on Windows	Use `--copy` flag instead of symlinks: `./scripts/install.sh --copy`

Conclusion

DSPy Agent Skills fills a critical gap in the coding agent ecosystem: giving agents deep, validated knowledge of a complex framework (DSPy 3.2.x) through a structured skill system. The five-skill pack covers the full DSPy lifecycle from fundamentals to advanced optimization, with progressive disclosure that lets agents start quickly and go deep when needed.

The most impressive aspect is the rigor: every API claim is validated against the real DSPy 3.2.0 API (not inferred from stale documentation), 80 validation tests ensure spec compliance, and committed baseline vs. optimized numbers provide ground-truth performance data. This is what production-grade agent skills look like – not just prompts, but a complete, tested, version-aware knowledge system.

Links:

GitHub: https://github.com/intertwine/dspy-agent-skills
DSPy Documentation: https://dspy.ai/
