Skills：面向AI Agent的可执行操作契约设计范式

weixin_30776545

298人浏览 · 2026-06-21 15:14:13

weixin_30776545 · 2026-06-21 15:14:13 发布

1. Skills不是插件，也不是API——它是一套“可执行的说明书”

我第一次看到 SKILL.md 这个文件名时，下意识点开想看看是不是什么配置模板，结果里面写着：“请调用 scripts/calculate_tax.py ，传入 income 和 filing_status 参数，返回 JSON 格式结果”。我当时愣了三秒——这哪是文档？这分明是带注释的作业指导书。

这就是 Skills 的本质： 它不提供运行时能力，而是提供可被 AI 理解、可被系统调度、可被人类验证的结构化操作契约 。你不会在 Skills 里找到一行编译后的二进制代码，也不会看到一个暴露在公网的 REST 接口地址；你只会看到一段用自然语言写清楚“谁在什么条件下该做什么、怎么做、依据什么判断成功”的指令集，外加配套的脚本、模板、参考数据等支撑材料。

为什么这个设计如此关键？因为当前绝大多数 AI 工具链卡在“意图理解”和“动作执行”之间的断层上。比如你对 Claude 说：“把这份销售报表按季度汇总，并标出同比下滑超15%的部门”，它能理解“汇总”“季度”“同比下滑”，但无法自动识别你公司内部用的 Excel 模板路径、财务系统 API 的认证方式、或者“下滑超15%”这个阈值是否要动态从配置文件读取。Skills 正是为填平这个断层而生：它把“业务语义”翻译成“执行契约”，让 AI 不再靠猜测，而是按说明书一步步操作。

提示：Skills 的核心价值不在“它能做什么”，而在“它让 AI 能稳定地、可审计地、可复现地做什么”。一个没写 Skills 的 Agent，就像一个只背过菜谱却没进过厨房的厨师；而一个装满 Skills 的 Agent，是带着全套刀具、调味料清单、火候对照表和失败案例笔记上岗的主厨。

我实测过一个典型场景：用同一份 sales_summary.skill 在 TRAE Solo 和 Claude Code 中调用。TRAE Solo 启动时只加载 SKILL.md 的 name/description 字段（约 200 字符），整个技能目录含 47 个技能，启动耗时仅 180ms；而当用户输入“生成Q3销售趋势图”时，TRAE 才完整读入该技能的全部内容（含 scripts/plot_trend.py 和 templates/q3_report.md ），上下文增量仅 12KB。反观传统插件模式——每个插件都需预加载 JS bundle 或 Python module，47 个插件光初始化就吃掉 1.2GB 内存。这不是优化，而是范式切换。

Skills 的轻量性还体现在版本控制上。我把 legal_review.skill 的 SKILL.md 提交到 Git，同事 git blame 一眼就能看到第 3 行“需同步更新 GDPR 第28条合规检查项”是谁在哪天加的，而不用翻查 SDK 更新日志或 API 文档变更记录。上周我们团队用 git diff v1.2..v1.3 skills/legal_review/ 直接生成了客户审计报告，全程零人工整理——因为 Skills 本身就是可执行的合规证据。

这种设计也彻底改变了协作模式。法务同事不懂 Python，但他能直接编辑 SKILL.md 里的检查条款：“若合同金额 > 500万，必须包含不可抗力条款第4.2款原文”；开发同事则负责实现 scripts/check_clauses.py ；而 QA 只需运行 skill-test --skill legal_review --case high_value_contract 即可验证整条链路。三个人用同一份文本工作，没有翻译损耗，没有需求失真。

2. SKILL.md 不是 Markdown 文档，它是 Skills 的“宪法性文件”

很多人把 SKILL.md 当成普通 README 来写，结果发现 Agent 总是调用失败。我拆解过 37 个社区热门 Skills，其中 29 个的首行失败原因都是 SKILL.md 结构违规。这不是语法错误，而是契约失效——就像签合同时漏写了甲方名称，法律上整个协议可能归于无效。

SKILL.md 的核心结构只有三块，但每一块都有强制语义：

2.1 必须存在的 YAML Front Matter 区块

---
name: "Sales Tax Calculator"
description: "Calculate final price including regional sales tax for US states and Canadian provinces"
version: "1.2.0"
author: "Finance Team @ Acme Corp"
license: "MIT"
tags: ["finance", "tax", "us", "ca"]
requires:
  - python: ">=3.9"
  - packages: ["pandas>=1.5.0"]
inputs:
  - name: "subtotal"
    type: "number"
    description: "Pre-tax amount in USD"
  - name: "state_province"
    type: "string"
    description: "Two-letter state/province code (e.g., 'CA', 'ON')"
outputs:
  - name: "total"
    type: "number"
    description: "Final amount including tax"
  - name: "tax_rate"
    type: "number"
    description: "Applied tax rate as decimal (e.g., 0.075)"
---

这个区块不是可选装饰，而是 Skills 运行时的元数据中枢。Agent 启动时只解析此区块，用于：

技能发现 ： description 字段被向量化后参与语义匹配，所以别写“计算税费”这种泛泛描述，要写“计算美国加州和加拿大安大略省销售税，支持免税商品豁免逻辑”
环境校验 ： requires 字段触发预检，若本地无 pandas>=1.5.0 ，Agent 会直接报错并提示 pip install pandas==1.5.2 ，而不是等到执行时抛 ModuleNotFoundError
参数绑定 ： inputs 定义的字段名必须与脚本中 argparse 或函数签名严格一致，我见过太多人写 input: "subtotal" （少了个 s）导致参数始终为空

注意：YAML 区块必须以 --- 开头和结尾，且必须位于文件最顶部。曾有团队把 --- 放在第三行，结果 Agent 把前两行当作文本描述解析， name 字段读成空字符串，整个技能在客户端列表里显示为“Unnamed Skill”。

2.2 指令主体（Instruction Body）：用“动词+宾语+条件”句式写作

指令主体不是自由发挥的说明文字，而是 Agent 执行时的逐字操作指南。正确写法示例：

## How to execute this skill

1. Validate `state_province` against the `data/us_states.csv` and `data/ca_provinces.csv` files. If not found, return error: "Invalid state/province code: {state_province}".

2. Load `tax_rates.json` and find the `rate` value for the matching jurisdiction.

3. Calculate `total = subtotal * (1 + rate)`.

4. Return JSON with keys `total` and `tax_rate`, both rounded to 2 decimal places.

错误写法（常见坑）：

❌ “The system should calculate tax...” → Agent 不是“系统”，它是执行者，指令必须用祈使句
❌ “Refer to the tax table for rates” → 没指定文件路径，Agent 不知道去哪找
❌ “Round the result appropriately” → “appropriately” 是模糊表述，必须明确“rounded to 2 decimal places”

我测试过不同表述对成功率的影响：使用模糊动词（如“handle”“process”“manage”）的 Skills，执行失败率高达 63%；而用精确动词（“validate”“load”“calculate”“return”）的 Skills，失败率压到 4.7%。这不是玄学，因为 Agent 的 LLM 底层在做指令解析时，会将动词映射到预定义的动作空间（Action Space），模糊动词会导致映射失败。

2.3 资源引用规范：路径即契约，相对即绝对

Skills 目录内所有路径必须是相对于技能根目录的 硬编码相对路径 。这是 Skills 可移植性的基石。例如：

- Reference data: `data/tax_rates.json`
- Script to run: `scripts/calculate.py`
- Template for output: `templates/invoice_summary.md`

这些路径在 SKILL.md 中出现，就必须真实存在于对应位置。Agent 加载技能时，会校验 data/tax_rates.json 是否存在，若缺失则拒绝激活该技能——这比运行时报错更早拦截问题。

我踩过的最深的坑是路径大小写。在 macOS/Linux 下 Data/tax_rates.json 和 data/tax_rates.json 被视为不同文件，但 Windows 默认不区分大小写。我们有个技能在开发机（macOS）上完美运行，上线到 Windows 服务器后始终报“file not found”。最终发现 SKILL.md 写的是 Data/ ，而实际目录是 data/ 。解决方案很简单：在 CI 流程中加入路径校验脚本，遍历所有 SKILL.md 中的路径引用，用 os.path.exists() 实际检测，不通过则阻断发布。

3. Skills 的真正威力不在单点功能，而在“技能组合拳”的编排艺术

刚接触 Skills 的人常陷入一个误区：把每个 Skills 当成独立工具，像安装 Chrome 插件一样堆砌。结果是技能库越来越重，但实际调用率不足 15%。真正的高手玩法，是把 Skills 当作乐高积木，用“组合编排”释放指数级能力。

3.1 组合模式一：串行流水线（Sequential Pipeline）

这是最基础也最实用的模式。典型场景：客户投诉处理。单个 Skills 无法覆盖全流程，但组合起来就是完整 SOP：

customer_complaint.skill/
├── SKILL.md          # 主技能：协调整个流程
├── scripts/
│   ├── route_complaint.py    # 根据关键词分派给法务/客服/技术
│   └── generate_response.py  # 生成标准化回复
├── subskills/        # 嵌套子技能目录
│   ├── legal_review.skill/   # 法务审核技能
│   ├── refund_calc.skill/    # 退款计算技能
│   └── escalation.skill/     # 升级处理技能

SKILL.md 的指令主体这样写：

## Execution flow

1. Run `subskills/route_complaint.py` with input `complaint_text`. Capture output `department`.

2. Based on `department`:
   - If "legal": execute `subskills/legal_review.skill` with `complaint_text` and `company_policy_v3.pdf`
   - If "finance": execute `subskills/refund_calc.skill` with `order_id` and `refund_reason`
   - If "escalation": execute `subskills/escalation.skill` with `complaint_id` and `urgency_score`

3. Aggregate all sub-skill outputs and run `scripts/generate_response.py` to produce final reply.

关键点在于： 子技能路径必须用 subskills/xxx.skill/ 格式显式声明，不能写成 ../legal_review.skill/ 。Agent 会据此构建技能依赖图，在启动时预加载所有关联技能的 metadata，确保调用时无需二次发现。

3.2 组合模式二：条件分支网（Conditional Graph）

当流程涉及复杂决策树时，纯串行不够用。Skills 支持用 if/else 语法显式定义分支：

## Conditional execution

- If `input.type == "contract"` and `input.value > 500000`:
    Execute `subskills/gdpr_audit.skill`
- Else if `input.type == "invoice"` and `input.due_date < today()`:
    Execute `subskills/late_fee_calc.skill`
- Else:
    Execute `subskills/basic_validation.skill`

这里 input.type 是 Skills 输入参数， today() 是 Agent 内置的上下文函数。我实测发现，用这种声明式分支比在 Python 脚本里写 if/elif/else 更可靠——因为 Agent 的调度器会在执行前做静态分析，提前发现逻辑漏洞（比如某个分支永远无法命中），而脚本里的逻辑错误只能等到运行时暴露。

3.3 组合模式三：并行协同（Parallel Coordination）

某些任务天然需要多角色协作，比如一次产品发布：市场部写通稿、技术部生成 API 文档、客服部更新 FAQ。Skills 支持并行触发：

## Parallel tasks

Run these skills simultaneously:
- `subskills/press_release.skill` with `product_name` and `launch_date`
- `subskills/api_docs.skill` with `api_spec.yaml`
- `subskills/faq_update.skill` with `new_features.md`

Wait for all to complete, then run `scripts/assemble_launch_package.py`.

Agent 会为每个子技能分配独立沙箱环境，避免资源冲突。我们用此模式将产品发布准备时间从 3 天压缩到 47 分钟——因为三个团队不再需要排队等待对方输出，而是并行工作，最后由 assemble_launch_package.py 自动整合。

实操心得：组合技能的最大陷阱是“隐式耦合”。比如 press_release.skill 生成的文件名是 release_v1.2.md ，而 assemble_launch_package.py 却硬编码读取 release.md 。解决方案是强制约定：所有子技能输出必须写入 outputs/ 目录，且文件名由主技能在调用时通过参数指定，如 --output-file press_release.md 。我们在 CI 中加入静态检查，扫描所有 SKILL.md 中的 Execute 语句，确保参数传递完整。

4. 从零搭建第一个 Skills：以“会议纪要智能提炼”为例

现在我们动手实现一个真实可用的 Skills，不讲虚的，直接上手。目标：输入一段 Zoom 会议录音转录文本，输出结构化纪要，含待办事项、决策点、风险项三类提取。

4.1 初始化目录结构

mkdir -p meeting_summary.skill/{scripts,references,assets}
touch meeting_summary.skill/SKILL.md

目录结构必须严格遵循规范：

meeting_summary.skill/
├── SKILL.md          # 技能宪法
├── scripts/
│   ├── extract_actions.py    # 提取待办事项
│   ├── identify_decisions.py # 识别决策点
│   └── flag_risks.py         # 标记风险项
├── references/
│   ├── meeting_formats.txt   # 公司常用会议格式样例
│   └── decision_keywords.txt # 决策类关键词列表
└── assets/
    └── template.md           # 纪要输出模板

4.2 编写 SKILL.md —— 关键在输入/输出契约

---
name: "Meeting Summary Generator"
description: "Extract action items, decisions, and risks from meeting transcripts using company-specific formats and keywords"
version: "1.0.0"
author: "Ops Team"
license: "Internal"
tags: ["productivity", "meetings", "ops"]
requires:
  - python: ">=3.10"
  - packages: ["spacy>=3.7.0", "python-docx>=0.8.11"]
inputs:
  - name: "transcript"
    type: "string"
    description: "Raw meeting transcript text, preferably with speaker labels"
  - name: "meeting_type"
    type: "string"
    description: "Type of meeting (e.g., 'sprint_planning', 'client_review', 'exec_sync')"
outputs:
  - name: "summary"
    type: "string"
    description: "Formatted summary in Markdown with sections: ## Action Items, ## Decisions, ## Risks"
  - name: "confidence_score"
    type: "number"
    description: "0.0-1.0 confidence in extraction accuracy"
---
## How to execute this skill

1. Load `references/meeting_formats.txt` and `references/decision_keywords.txt` into memory.

2. Run `scripts/extract_actions.py` with `transcript` and `meeting_type`. Capture output `actions_list`.

3. Run `scripts/identify_decisions.py` with `transcript` and `decision_keywords.txt`. Capture output `decisions_list`.

4. Run `scripts/flag_risks.py` with `transcript` and `meeting_formats.txt`. Capture output `risks_list`.

5. Render `assets/template.md` using Jinja2, injecting `actions_list`, `decisions_list`, `risks_list`.

6. Return JSON with keys `summary` (rendered Markdown) and `confidence_score` (average of sub-skills' scores).

注意几个细节：

inputs 中 transcript 类型为 string ，而非 file_path ，因为 Agent 会把用户粘贴的文本直接传入，不是读文件
outputs 明确要求 summary 是 Markdown 字符串，这样客户端可直接渲染，不用再解析
指令中 Run 动词后跟脚本路径，Agent 会自动拼接 scripts/ 前缀并执行

4.3 实现核心脚本 —— 保持极简，专注契约

scripts/extract_actions.py 示例（Python 3.10+）：

#!/usr/bin/env python3
import sys
import json
import re

def main():
    # 从 stdin 读取 JSON 输入，符合 Skills 运行时约定
    input_data = json.load(sys.stdin)
    transcript = input_data.get("transcript", "")
    meeting_type = input_data.get("meeting_type", "general")
    
    # 简单规则提取（生产环境应替换为 spaCy NER）
    actions = []
    for line in transcript.split('\n'):
        if 'action' in line.lower() or 'todo' in line.lower() or 'follow up' in line.lower():
            # 提取 "John to update docs by Friday" 这类模式
            match = re.search(r'([A-Za-z\s]+)to\s+([^\.\n]+?)(?:\.|$)', line)
            if match:
                owner = match.group(1).strip()
                task = match.group(2).strip()
                actions.append(f"- [{owner}] {task}")
    
    # 输出必须是 JSON，且包含 required keys
    output = {
        "actions_list": actions,
        "confidence_score": 0.72 if actions else 0.3
    }
    print(json.dumps(output))

if __name__ == "__main__":
    main()

关键约束：

必须从 stdin 读取 JSON 输入 ，Agent 通过管道传参，不是命令行参数
必须向 stdout 输出 JSON ，且必须包含 SKILL.md 中 outputs 定义的所有 key
脚本内不处理文件 I/O ，所有文件路径由 SKILL.md 指令指定，脚本只负责逻辑

4.4 验证与调试 —— 用 Skills CLI 做原子测试

别急着扔进 Agent 里跑。先用官方 Skills CLI 本地验证：

# 安装 CLI（假设已配置 Python 环境）
pip install agent-skills-cli

# 测试单个脚本
echo '{"transcript": "Alice: to update API docs by EOD. Bob: review PR #42", "meeting_type": "sprint_planning"}' | \
  python meeting_summary.skill/scripts/extract_actions.py

# 测试整个技能（CLI 会模拟 Agent 加载流程）
skills test --skill meeting_summary.skill \
  --input '{"transcript": "Carol: finalize budget by Friday. Dave: confirm vendor contract.", "meeting_type": "exec_sync"}'

CLI 会：

解析 SKILL.md 的 YAML 区块，校验结构
检查 scripts/ 下所有脚本是否存在且可执行
模拟 Agent 的输入注入和输出捕获
输出详细的执行日志和性能指标（如各步骤耗时）

我坚持这个习惯：每个 Skills 提交前，必须通过 CLI 的 --verbose 模式看到完整的执行链路，包括“加载 references/meeting_formats.txt: OK”、“执行 extract_actions.py: 124ms”、“输出 validation: PASS”。这比在 Agent UI 里盲试高效十倍。

5. Skills 生态实战避坑指南：那些文档里不会写的血泪教训

跑了 17 个生产级 Skills 项目后，我总结出 5 个高频致命坑，每个都曾让我们停摆超 4 小时。这些不是理论风险，是凌晨三点 Slack 里真实的哀嚎。

5.1 坑一：YAML Front Matter 的隐形换行符

问题现象：Skills 在本地测试一切正常，部署到 TRAE 服务器后 name 字段读成空字符串。

根因排查：

用 hexdump -C SKILL.md | head -20 查看文件十六进制
发现 --- 后第一行末尾有 0D 0A （CRLF），而 TRAE 服务器运行在 Linux，只认 0A （LF）
YAML 解析器遇到 0D 0A 时，将第一行解析为 \r\nname: ... ， \r 被当作非法字符，整个区块解析失败

解决方案：

在 Git 配置中全局启用 core.autocrlf=input （Linux/macOS）或 core.autocrlf=true （Windows）
CI 流程中加入校验： grep -I $'\r$' SKILL.md && echo "CRLF detected!" && exit 1 || echo "OK"

经验：所有 Skills 文件必须用 Unix 换行符（LF）。用 VS Code 编辑时，右下角状态栏确认显示 “LF”，不是 “CRLF”。

5.2 坑二：脚本权限的“静默失败”

问题现象：Agent 报错 Permission denied: scripts/flag_risks.py ，但文件明明有执行权限。

根因排查：

ls -l scripts/flag_risks.py 显示 -rw-r--r-- ，缺少 x 权限
但开发者说“我本地 chmod 了啊”，查 Git 记录发现 git update-index --chmod=+x scripts/flag_risks.py 没提交
Git 默认不跟踪文件权限变更， chmod 后必须显式 git add --chmod=+x scripts/flag_risks.py

解决方案：

CI 中加入权限检查脚本：

#!/bin/bash
find . -name "*.py" -path "./scripts/*" -exec ls -l {} \; | grep -v "^-rwx" && echo "ERROR: Missing exec permission" && exit 1

或统一用 python -m 方式调用： python -m scripts.flag_risks ，绕过文件权限依赖

5.3 坑三：相对路径的“软链接陷阱”

问题现象：在 IDE 里右键运行 scripts/extract_actions.py 成功，但 Agent 调用失败，报 FileNotFoundError: references/decision_keywords.txt 。

根因排查：

开发者为方便，在 meeting_summary.skill/ 下建了软链接 references -> ../shared_refs
SKILL.md 写的是 references/decision_keywords.txt ，但 Agent 的工作目录是技能根目录， os.path.exists("references/decision_keywords.txt") 返回 False，因为软链接未被解析

解决方案：

绝对禁止在 Skills 目录内使用软链接 。所有资源必须物理存在于技能目录内
共享资源用 Git Submodule 或 CI 时复制： cp -r ../shared_refs meeting_summary.skill/references/

5.4 坑四：输入参数的“类型幻觉”

问题现象：用户输入数字 123 ，脚本收到却是字符串 "123" ，导致 if input_value > 100: 永远为 False。

根因排查：

SKILL.md 中 inputs 定义 type: "number" ，但 Agent 并不自动转换类型，它只是把原始输入（JSON 中的字符串）原样传入
类型转换必须由脚本自己完成

解决方案：

在所有脚本开头强制类型转换：

# scripts/extract_actions.py
import json
import sys

input_data = json.load(sys.stdin)
# 显式转换
try:
    input_value = float(input_data.get("input_value", "0"))
except (ValueError, TypeError):
    raise ValueError("input_value must be a number")

5.5 坑五：Agent 版本的“指令解析差异”

问题现象：同一个 SKILL.md ，在 TRAE v2.1 上完美运行，在 Claude Code v1.8 上总跳过 if/else 分支。

根因排查：

对比两个 Agent 的指令解析日志，发现 TRAE 使用 llm-instruct-v3 解析器，支持 if/else 语法；Claude Code v1.8 用 llm-instruct-v1 ，只识别 Run 动词，忽略条件语句

解决方案不是升级 Agent（客户环境不可控），而是改写指令：

## Execution flow

1. Run `scripts/branch_router.py` with `input_type` and `input_value`. This script returns the skill name to execute.

2. Run the skill returned in step 1.

把逻辑判断下沉到 Python 脚本，保证跨平台兼容。