feat(#1309): add mandatory finding for agent behavior definition changes#2133
fullsend-ai-coder[bot] wants to merge 1 commit into
Conversation
Add a new Platform Security rule to the code-review skill that instructs the review agent to always produce a finding when a diff modifies files that define agent behavior. This closes a governance gap where the review agent could silently approve changes to its own skill definitions, agent definitions, harness configs, or policy files without flagging them for human review.

The rule covers four file patterns: `skills/*/SKILL.md`, `agents/*.md`, `harness/*.yaml`, and `policies/*.yaml`. Severity is `info` for additive changes and `medium` for modifications that relax existing checks or remove constraints.

Note: `make lint` could not run due to Go module cache permission errors in the sandbox (infrastructure issue, not code-related). This is a Markdown-only change with no Go code modifications.

Closes #1309
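As a rough illustration of what the rule asks the reviewer to check, the sketch below matches changed paths against the four patterns and picks a severity. The rule itself is a prose instruction in SKILL.md, not code; the function names are hypothetical, and the sketch assumes paths are relative to the repository root and that `*` does not cross directory boundaries.

```python
from fnmatch import fnmatchcase

# The four patterns named in the PR description.
AGENT_BEHAVIOR_PATTERNS = (
    "skills/*/SKILL.md",
    "agents/*.md",
    "harness/*.yaml",
    "policies/*.yaml",
)


def matches_pattern(path: str, pattern: str) -> bool:
    """Match segment by segment so `*` never spans a `/` (an assumption)."""
    path_parts = path.split("/")
    pattern_parts = pattern.split("/")
    if len(path_parts) != len(pattern_parts):
        return False
    return all(fnmatchcase(p, q) for p, q in zip(path_parts, pattern_parts))


def agent_behavior_files(changed_paths):
    """Return the changed paths that define agent behavior, in input order."""
    return [
        path
        for path in changed_paths
        if any(matches_pattern(path, pat) for pat in AGENT_BEHAVIOR_PATTERNS)
    ]


def finding_severity(relaxes_or_removes_checks: bool) -> str:
    """info for additive changes, medium when checks are relaxed or removed."""
    return "medium" if relaxes_or_removes_checks else "info"


if __name__ == "__main__":
    changed = ["skills/code-review/SKILL.md", "README.md", "policies/p.yaml"]
    for path in agent_behavior_files(changed):
        print(f"[{finding_severity(False)}] agent behavior definition changed: {path}")
```

Deciding whether a change relaxes a check still requires reading the diff, so `relaxes_or_removes_checks` is left as an input rather than computed.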
Site preview: https://31dbed02-site.fullsend-ai.workers.dev
🤖 Finished Review · ✅ Success · Started 5:13 PM UTC · Completed 5:22 PM UTC
Review findings: Low, Info
Examples of permission-declaring files: GitHub App manifest JSON,
`permissions:` blocks in `.github/workflows/*.yml`, token scoping
maps, IAM policy JSON/YAML, Kubernetes `Role`/`ClusterRole` YAML.
- **Agent behavior definition changes:** If the diff modifies any file
[low] logic-error
The new rule does not specify an explicit category name for findings. Without corresponding categories in the security sub-agent definition and the pr-review routing table, re-review severity anchoring will route findings for this rule to the correctness dimension instead of security. Consider adding category directives (e.g., agent-behavior-modification) to sub-agents/security.md and the routing table.
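The routing concern can be sketched as a category lookup with a fallback. The table format, the existing entry, and the fallback to `correctness` are assumptions made for illustration; the real pr-review routing table is not shown in this PR.

```python
# Hypothetical routing table: finding category -> review dimension.
ROUTING = {
    "logic-error": "correctness",  # assumed existing entry
}

DEFAULT_DIMENSION = "correctness"  # assumed fallback for unknown categories


def route(category: str, table: dict) -> str:
    """Return the dimension for a category, falling back when it is unlisted."""
    return table.get(category, DEFAULT_DIMENSION)


# Without an explicit entry, the new rule's findings land in correctness.
before = route("agent-behavior-modification", ROUTING)

# Adding the category suggested in the comment routes them to security.
updated = {**ROUTING, "agent-behavior-modification": "security"}
after = route("agent-behavior-modification", updated)

print(before, "->", after)
```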
Examples of permission-declaring files: GitHub App manifest JSON,
`permissions:` blocks in `.github/workflows/*.yml`, token scoping
maps, IAM policy JSON/YAML, Kubernetes `Role`/`ClusterRole` YAML.
- **Agent behavior definition changes:** If the diff modifies any file
[low] logic-error
The security sub-agent's Own section does not explicitly list agent behavior definition files. Adding them would improve clarity and keep it consistent with the mandatory-finding directive in SKILL.md.
files control how agents review, triage, and implement changes;
modifications warrant careful human review because agents have a
structural blind spot when evaluating changes to their own
instructions. Severity: **info** for additive changes (new rules, new
[low] formatting-consistency
The rule writes the severity as Medium, with an uppercase M. The canonical severity list and existing rules use all-lowercase names, so this should be medium for consistency.
Post-script verification
Branch: agent/1309-agent-behavior-review-rule
Commit range: eda1e12306a7a4393c077260bcd5acde491b8649..HEAD