Add performance gates for incremental Codex cost scans #1434
Codex review: needs maintainer review before merge.
Reviewed June 11, 2026, 6:18 PM ET / 22:18 UTC.
Summary
Reproducibility: unclear. The review failed before ClawSweeper could establish a reproduction path.
Review metrics: none identified.
Merge readiness
Overall follows the weaker of proof and patch quality, so missing proof can cap an otherwise strong patch.
Risk before merge
Maintainer options:
Next step before merge
Review details
Best possible solution: Retry the Codex review after fixing the execution failure.
Do we have a high-confidence way to reproduce the issue? Unclear. The review failed before ClawSweeper could establish a reproduction path.
Is this the best way to solve the issue? Unclear. Retry the review first so ClawSweeper can evaluate the actual issue and fix direction.
AGENTS.md: unclear because the file could not be read completely.
Codex review notes: model internal, reasoning high; reviewed against 1912f75f4962.
Label changes:
Label justifications:
Evidence reviewed
What I checked:
Likely related people:
What the crustacean ranks mean
Shiny media proof means a screenshot, video, or linked artifact directly shows the changed behavior. Runtime, network, CSP, and security claims still need visible diagnostics.
How this review workflow works
|
82f5852 to a9acafa
|
Rebased onto current
The original 200k-row fixture produced only a 4x cold/incremental ratio on this machine. The fixture now uses 500k rows while retaining the 5x requirement; two consecutive runs passed with roughly 70 ms full scans and sub-millisecond incremental scans.
Proof:
|
a9acafa to 16ae060
|
Updated the PR head to
Fresh exact-head CI is now running.
16ae060 to da0c456
|
Rebased on current
Autoreview found a CI-flakiness blocker, which I fixed: the gates no longer assert wall-clock ratios. They now prove cache reuse by changing old content while preserving cache metadata, and prove row-cursor behavior by changing an old SQLite row before appending a new one.
Proof:
|
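The cache-reuse gate described above can be illustrated with a minimal in-memory sketch. It is not the PR's test code: `FileStamp`, `FakeFile`, `ParseCache`, and `demoCacheReuse` are hypothetical names, and the sketch assumes the cache is keyed on path, byte size, and modification time. The content is changed while that metadata stays identical, so a stale total and a parse count of one show that the second call was served from the cache rather than from a fast re-parse.

```swift
// Hypothetical types for illustration only; not the project's actual cache.
struct FileStamp: Hashable {
    let path: String
    let size: Int
    let modifiedAt: Int   // simplified timestamp (assumption)
}

struct FakeFile {
    let path: String
    var content: String
    var modifiedAt: Int

    // Metadata the cache uses as its key (assumed: path, byte size, mtime).
    var stamp: FileStamp {
        FileStamp(path: path, size: content.utf8.count, modifiedAt: modifiedAt)
    }
}

final class ParseCache {
    private var results: [FileStamp: Int] = [:]
    private(set) var parseCount = 0

    // Returns the cached total when the stamp matches; otherwise parses and stores.
    func total(for file: FakeFile) -> Int {
        let key = file.stamp
        if let cached = results[key] {
            return cached
        }
        parseCount += 1
        let parsed = file.content
            .split(separator: "\n")
            .compactMap { Int($0) }
            .reduce(0, +)
        results[key] = parsed
        return parsed
    }
}

func demoCacheReuse() -> (first: Int, second: Int, parseCount: Int) {
    let cache = ParseCache()
    var file = FakeFile(path: "session.jsonl", content: "10\n20\n", modifiedAt: 1_000)
    let first = cache.total(for: file)

    // Same byte length and timestamp, different content: only a real
    // cache hit can return the old total here.
    file.content = "99\n99\n"
    let second = cache.total(for: file)
    return (first, second, cache.parseCount)
}
```

Because the assertion is about which value comes back and how many parses ran, the outcome does not depend on machine speed, which is the property a wall-clock ratio could not guarantee in CI.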
da0c456 to 69b2818
|
Rebased onto current
Proof on
Exact-head CI is now running.
69b2818 to 3df376e
Co-authored-by: pickaxe <54486432+ProspectOre@users.noreply.github.com>
3df376e to c70bb0d
|
Validated exact head
The new deterministic gates prove that unchanged Codex session files retain cached parse results and that priority-turn refreshes process appended SQLite rows without replaying mutated historical rows.
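As a rough illustration of the row-cursor half of that claim, the sketch below models the table as an in-memory array rather than a real SQLite database. `Row`, `CursorScanner`, and `demoRowCursor` are hypothetical names, and the cursor is assumed to be a simple "highest row id seen" value. A historical row is mutated before a new row is appended; if the refresh replayed old rows, the running total would include the mutated value.

```swift
// Hypothetical model of a cursor-based incremental scan; the real code
// reads SQLite rows, which this sketch replaces with an array.
struct Row {
    let id: Int
    var tokens: Int
}

struct CursorScanner {
    private(set) var lastSeenID = 0
    private(set) var totalTokens = 0

    // Processes only rows whose id is greater than the stored cursor
    // and returns how many rows were processed.
    mutating func refresh(from rows: [Row]) -> Int {
        let cursor = lastSeenID
        let fresh = rows.filter { $0.id > cursor }
        for row in fresh {
            totalTokens += row.tokens
            lastSeenID = max(lastSeenID, row.id)
        }
        return fresh.count
    }
}

func demoRowCursor() -> (processed: Int, total: Int) {
    var table = [Row(id: 1, tokens: 100), Row(id: 2, tokens: 200)]
    var scanner = CursorScanner()
    _ = scanner.refresh(from: table)       // initial scan covers ids 1 and 2

    table[0].tokens = 9_999                // mutate a historical row
    table.append(Row(id: 3, tokens: 50))   // then append a new row

    let processed = scanner.refresh(from: table)
    return (processed, scanner.totalTokens)
}
```

Under these assumptions the second refresh processes exactly one row and the total reflects only the original values plus the appended row, so the mutated historical row is observably not replayed.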
Summary
Supersedes #1423 because its fork branch does not allow maintainer edits.
Proof
swift test --filter CostUsagePerformanceGateTests twice: 2 tests passed on both runs
make check: clean
git diff --check: clean