perf: reduce Codex cost refresh metadata work by steipete · Pull Request #1430 · steipete/CodexBar

steipete · 2026-06-11T10:16:49Z

Summary

replace separate Foundation metadata/resource-identifier reads with one portable fstatat pass
skip repeated existence/root validation for Codex session paths already found during the current enumeration
accept the previous 0.33 cache producer key because parsing semantics did not change, then rewrite with the current key
add regressions for cache compatibility plus append, truncation, replacement, and symlink-target metadata

Closes #1392.

Root cause

Profiling current main against a real 365-day archive showed the proposed SQLite priority memo was aimed at a minority cost. The archive contains 60,607 JSONL files (~43 GB); logs_2.sqlite is ~713 MB with ~430k rows. In the expired refresh trace, priority SQLite work was about 4% while per-file Foundation metadata/resource-identifier reads and repeated cached-path validation dominated.

A direct 60,607-file benchmark measured:

Metadata path	Time
Foundation attributes + resource identifier	14.16s
One POSIX file-stat pass	0.23s

The replacement keeps current refresh policy and cache correctness. It adds no cursor, persisted memo, overlap state, or new unbounded collection.
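
To make the single-pass idea concrete, here is a minimal sketch of reading size, mtime, and a dev:inode identity with one fstatat call. The `FileStamp` type and the `fileStamp(atPath:)` helper are hypothetical names for illustration, not the helpers in this PR; only the use of fstatat with flags 0 and the Darwin/Glibc timestamp split come from the description.

```swift
#if canImport(Darwin)
import Darwin
#elseif canImport(Glibc)
import Glibc
#endif

/// Hypothetical value type; the field names are illustrative only.
struct FileStamp: Equatable {
    let size: Int64
    let mtimeNanos: Int64
    let identity: String  // "dev:inode"
}

/// Reads size, mtime, and file identity with a single fstatat call.
/// Flags 0 means symlinks are followed, so a link reports its target.
func fileStamp(atPath path: String) -> FileStamp? {
    var st = stat()
    guard fstatat(AT_FDCWD, path, &st, 0) == 0 else { return nil }
    // The timestamp field name differs by platform and is chosen at compile time.
    #if canImport(Darwin)
    let ts = st.st_mtimespec
    #else
    let ts = st.st_mtim
    #endif
    let nanos = Int64(ts.tv_sec) * 1_000_000_000 + Int64(ts.tv_nsec)
    return FileStamp(
        size: Int64(st.st_size),
        mtimeNanos: nanos,
        identity: "\(st.st_dev):\(st.st_ino)")
}
```

A caller would invoke `fileStamp(atPath:)` once per enumerated session file and compare the result with the stamp stored in the cache; a `nil` result means the path could not be stat'ed.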

Exact candidate proof

All runs used the exact branch build and the same real archive. Candidate cache output was normalized by removing updatedAt; expired and immediate-refresh JSON were byte-identical.
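
The normalization step can be approximated as follows. This is a sketch that assumes `updatedAt` is a top-level key of the cache JSON, which the PR body does not state; `normalizedCacheData` and `NotAJSONObject` are hypothetical names.

```swift
import Foundation

/// Hypothetical error for input that is not a JSON object at the top level.
struct NotAJSONObject: Error {}

/// Drops the (assumed top-level) `updatedAt` key and re-encodes with sorted
/// keys, so two cache files can be compared byte for byte.
func normalizedCacheData(_ data: Data) throws -> Data {
    guard var object = try JSONSerialization.jsonObject(with: data) as? [String: Any] else {
        throw NotAJSONObject()
    }
    object.removeValue(forKey: "updatedAt")
    return try JSONSerialization.data(withJSONObject: object, options: [.sortedKeys])
}
```

Two runs count as equivalent when `normalizedCacheData` returns equal `Data` for both cache files.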

Exact head / state	Changed lines	Wall time
Current `main`, expired cache	baseline	28.79s
Current `main`, forced refresh	baseline	31.46-37.57s
#1404 exact head, expired	+952/-48	37.43s
#1421 exact head, persisted relaunch	+623/-26	20.22s
#1422 exact head, persisted relaunch	+681/-45	24.80s
#1423 exact head, expired	+546/-26	37.03s
#1430, compatible-cache migration refresh	+124/-24	12.00s
#1430, warm expired refresh	+124/-24	6.74s
#1430, immediate relaunch/cache hit	+124/-24	1.45s

The candidate is roughly 4-5.5x faster on the recurring expired-refresh path than current main or the stacked alternatives, with no extra persisted artifact. Peak footprint remained ~736-737 MB; the remaining dominant work is the existing ~39 MB JSON cache decode/encode.
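
The speedup range follows directly from the wall times in the table. A small sketch of the arithmetic, using only the expired-refresh rows (the `speedup` helper is a hypothetical name):

```swift
/// Ratio of a baseline wall time to the candidate wall time.
func speedup(baseline: Double, candidate: Double) -> Double {
    baseline / candidate
}

let warmExpired = 6.74  // #1430, warm expired refresh (seconds)
let expiredBaselines: [(label: String, seconds: Double)] = [
    ("main, expired cache", 28.79),
    ("#1404, expired", 37.43),
    ("#1423, expired", 37.03),
]

for row in expiredBaselines {
    let ratio = speedup(baseline: row.seconds, candidate: warmExpired)
    // Round to two decimals for display.
    print(row.label, (ratio * 100).rounded() / 100)
}
```

The three ratios come out at roughly 4.3x, 5.6x, and 5.5x, which matches the stated range.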

Correctness contracts

append: size/mtime changes still select incremental parsing
truncation or replacement: size/mtime/inode changes still force the existing fallback path
symlinks: fstatat(..., 0) follows the target, matching previous Foundation behavior
window changes, late completion/model attribution, pricing changes, parser invalidation, and forced refreshes: existing scanner behavior unchanged
overlapping refreshes and bounded memory: no new mutable memo or retained state
Linux CLI: Darwin/Glibc timestamp fields are selected at compile time; required x64 and arm64 CI must pass before merge
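
The append and truncation/replacement contracts above can be pictured as a small decision function. This is a sketch under assumptions: `StatSnapshot`, `RefreshAction`, and `refreshAction` are hypothetical names, and sending a same-size file with a changed mtime to the fallback path is an illustrative choice, not something the PR states.

```swift
/// Hypothetical snapshot of the fields being compared: size, mtime, dev:inode.
struct StatSnapshot: Equatable {
    var size: Int64
    var mtimeNanos: Int64
    var identity: String
}

enum RefreshAction: Equatable {
    case reuseCachedResult
    case parseAppendedBytes(fromOffset: Int64)
    case useFallbackPath
}

func refreshAction(cached: StatSnapshot, current: StatSnapshot) -> RefreshAction {
    // Nothing changed: the cached result is still valid.
    if current == cached { return .reuseCachedResult }
    // A different dev:inode means the file was replaced.
    if current.identity != cached.identity { return .useFallbackPath }
    // Same file, larger size: treat as an append and parse only the new bytes.
    if current.size > cached.size {
        return .parseAppendedBytes(fromOffset: cached.size)
    }
    // Truncation, or (assumption) an in-place rewrite with unchanged size.
    return .useFallbackPath
}
```

The identity check comes before the size check so that a replaced file that happens to be larger is not mistaken for an append.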

Validation

swift test --filter 'CostUsageCacheTests|CostUsageScannerTests' - 21/21 pass
focused priority/cache/metadata regressions - 16/16 pass
make check - SwiftFormat, SwiftLint, and generated parser hash clean
full swift test run twice - only existing parallel app-server timeout flakes; each affected suite passes alone
branch autoreview (gpt-5.5) - clean, no accepted/actionable findings, confidence 0.82
git diff --check - clean

This supersedes the larger stateful stack while retaining contributor credit from #1404, #1421, #1422, and #1423.

clawsweeper · 2026-06-11T10:17:47Z

Codex review: needs maintainer review before merge. Reviewed June 11, 2026, 6:32 AM ET / 10:32 UTC.

Summary
The PR replaces Codex cost-cache Foundation metadata/root checks with a POSIX stat-based path, accepts one compatible prior Codex cache producer key, updates the generated parser hash and changelog, and adds focused cache/metadata tests.

Reproducibility: not applicable as a bug reproduction path for this PR review. The PR body reports profiling current main and the exact branch build on a real 60,607-file archive, but this read-only review did not rerun that benchmark.

Review metrics: 3 noteworthy metrics.

Diff Scope: 7 files, +124/-24. The performance change is bounded to cost-cache/scanner code, generated hash, tests, and one changelog entry.
Focused Tests Added: 2 @Test methods added. The new tests cover the compatibility key and the metadata cases most likely to regress cache correctness.
Compatibility Key: 1 prior producer key accepted. Existing Codex cache reuse after upgrade is the main semantic choice maintainers should notice before merge.

Merge readiness
Overall: 🦞 diamond lobster
Proof: 🦞 diamond lobster
Patch quality: 🦞 diamond lobster
Result: ready for maintainer review.

Overall follows the weaker of proof and patch quality, so missing proof can cap an otherwise strong patch.

Risk before merge

[P1] This read-only review did not execute Swift tests or the large-archive benchmark; CI should confirm the reported focused tests, full tests where stable, make check, and parser-hash check before merge.
[P1] The compatible v0.33 Codex producer key is an intentional upgrade decision while the generated parser hash changes; maintainers should be comfortable that the parser semantics did not change before relying on the old cache.

Maintainer options:

Accept the compatible-cache migration (recommended)
Keep the prior producer-key compatibility after CI passes, provided maintainers agree that the parser hash changed only because of metadata/cache helper code and not parser semantics.
Force a cold cache on doubt
Remove the compatible producer key before merge if maintainers want every existing Codex cache rebuilt whenever the generated parser hash changes.
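
The difference between the two options can be sketched as a cache-load decision. Everything here is illustrative: `CacheLoadDecision`, `cacheLoadDecision`, and the key strings in the usage note are hypothetical, and the real cache code may structure this differently. The sketch models only what the PR and review describe: a stored key that matches a known compatible key is accepted when no explicit producer override is in play, and the cache is then rewritten with the current key.

```swift
enum CacheLoadDecision: Equatable {
    case useAsIs          // stored key is current
    case useThenRewrite   // stored key is a known compatible predecessor
    case rebuild          // anything else forces a cold cache
}

func cacheLoadDecision(
    storedKey: String,
    currentKey: String,
    compatibleKeys: Set<String>,
    hasExplicitProducerOverride: Bool
) -> CacheLoadDecision {
    if storedKey == currentKey { return .useAsIs }
    // Compatibility applies only to the default cache, never to an override.
    if !hasExplicitProducerOverride && compatibleKeys.contains(storedKey) {
        return .useThenRewrite
    }
    return .rebuild
}
```

Passing an empty `compatibleKeys` set corresponds to the second option: every cache written under an older key is rebuilt.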

Next step before merge

[P2] No narrow automated repair target is present; maintainers should review CI and decide whether to accept the compatible producer-key migration before merge.

Security
Cleared: No concrete security or supply-chain concern was found; the diff does not touch workflows, dependencies, package resolution, install/build scripts, secrets, or publishing automation.

Review details

Best possible solution:

Land the focused stat-based metadata/cache-validation patch after CI confirms the added regressions and maintainers accept the compatible-cache migration semantics.

Do we have a high-confidence way to reproduce the issue?

Not applicable as a bug reproduction path for this PR review. The PR body reports profiling current main and the exact branch build on a real 60,607-file archive, but this read-only review did not rerun that benchmark.

Is this the best way to solve the issue?

Yes, based on source inspection this is a narrow maintainable solution: it targets the measured metadata/root-validation cost without adding a persisted cursor, memo, or new stateful artifact. The main remaining decision is whether to accept the prior producer key as parser-compatible.

AGENTS.md: found and applied where relevant.

Codex review notes: model internal, reasoning high; reviewed against 159d03ceb318.

Label changes

Label changes:

add merge-risk: 🚨 compatibility: The PR deliberately reuses a previous Codex cache producer key despite a generated parser-hash change, so upgrade cache semantics need maintainer acceptance.
add rating: 🦞 diamond lobster: Overall readiness is 🦞 diamond lobster; proof is 🦞 diamond lobster and patch quality is 🦞 diamond lobster.
remove rating: 🐚 platinum hermit: Current PR rating is rating: 🦞 diamond lobster, so this older rating label is no longer current.

Label justifications:

P2: This is a normal-priority performance/cache correctness improvement for large Codex session archives with limited product blast radius.
merge-risk: 🚨 compatibility: The PR deliberately reuses a previous Codex cache producer key despite a generated parser-hash change, so upgrade cache semantics need maintainer acceptance.
rating: 🦞 diamond lobster: Overall readiness is 🦞 diamond lobster; proof is 🦞 diamond lobster and patch quality is 🦞 diamond lobster.
status: 👀 ready for maintainer look: ClawSweeper has no concrete contributor-facing blocker left for this PR.
Not applicable: The external-contributor proof gate does not apply to this OWNER-authored PR; the PR body still reports exact-branch real-archive benchmark results and focused validation commands.

Evidence reviewed

What I checked:

Repository policy read: AGENTS.md was present and fully read; the review applied its read-only, focused-test, cache/provider, and make check guidance without running tests or modifying the checkout. (AGENTS.md:1, 159d03ceb318)
Diff scope: The exact local PR head changes 7 files with 124 additions and 24 deletions across changelog, generated hash, cost cache/scanner code, and tests. (e3b0597e02c5)
Cache compatibility path: The PR adds one compatible Codex producer key and only applies that compatibility when loading the default Codex cache without an explicit producer override. (Sources/CodexBarCore/Vendored/CostUsage/CostUsageCache.swift:3, e3b0597e02c5)
Stat-based metadata source: The metadata helper now uses fstatat, platform-specific timestamp fields, size, and dev:inode file identity while following symlink targets with flags 0. (Sources/CodexBarCore/Vendored/CostUsage/CostUsageScanner+CacheHelpers.swift:623, e3b0597e02c5)
Duplicate path validation skip: The scan plan builds a seen path set, excludes already enumerated cached files, and passes known scan paths into the session index to avoid repeating existence/root checks. (Sources/CodexBarCore/Vendored/CostUsage/CostUsageScanner.swift:2268, e3b0597e02c5)
Focused regressions: The patch adds regression coverage for accepting the prior Codex producer key and for append, truncation, replacement, and symlink-target metadata behavior. (Tests/CodexBarTests/CostUsageScannerTests.swift:7, e3b0597e02c5)
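
The duplicate path validation skip noted above can be pictured as a set difference. This is a minimal sketch with hypothetical names (`ScanPlan`, `makeScanPlan`); the actual scan plan and session index in the PR carry more state than this.

```swift
/// Hypothetical plan: paths already proven to exist by the current
/// enumeration, plus cached paths that still need their own checks.
struct ScanPlan: Equatable {
    var knownPaths: Set<String>
    var cachedPathsToValidate: [String]
}

func makeScanPlan(enumerated: [String], cached: [String]) -> ScanPlan {
    // Every path the enumeration just returned is known to exist under the root.
    let seen = Set(enumerated)
    // Only cached paths the enumeration did not return need existence/root checks.
    let remaining = cached.filter { !seen.contains($0) }
    return ScanPlan(knownPaths: seen, cachedPathsToValidate: remaining)
}
```

With a large archive where most cached paths are re-enumerated on each refresh, `cachedPathsToValidate` stays small, which is where the saving comes from.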

Likely related people:

steipete: Blame and log history show Peter Steinberger on the current cached-session helpers, metadata helper, Codex cost scanner performance work, generated producer-key cache invalidation, and this PR head. (role: recent area contributor; confidence: high; commits: 6cf422512061, 69de57b85de7, 5aded294c234; files: Sources/CodexBarCore/Vendored/CostUsage/CostUsageScanner.swift, Sources/CodexBarCore/Vendored/CostUsage/CostUsageScanner+CacheHelpers.swift, Sources/CodexBarCore/Vendored/CostUsage/CostUsageCache.swift)
ratulsarna: ratulsarna authored the earlier Codex cost scanner overcounting/cross-day fix that touched CostUsageCache.swift and CostUsageScanner.swift, which are central to this patch's cache correctness surface. (role: adjacent cache correctness contributor; confidence: medium; commits: fd19445056a1; files: Sources/CodexBarCore/Vendored/CostUsage/CostUsageCache.swift, Sources/CodexBarCore/Vendored/CostUsage/CostUsageScanner.swift, Tests/CodexBarTests/CostUsageCacheTests.swift)
hhh2210: hhh2210 co-authored the generated parser-hash producer-key cache invalidation work that this PR's compatible-producer-key migration adjusts. (role: adjacent cache invalidation coauthor; confidence: medium; commits: 5aded294c234; files: Scripts/regenerate-codex-parser-hash.sh, Sources/CodexBarCore/Generated/CodexParserHash.generated.swift, Sources/CodexBarCore/Vendored/CostUsage/CostUsageCache.swift)
ProspectOre: The PR head has a Co-authored-by trailer mapping to ProspectOre, and the PR body/changelog credit ProspectOre for the related Codex cost-scan performance context. (role: PR coauthor and adjacent performance contributor; confidence: medium; commits: e3b0597e02c5; files: CHANGELOG.md, Sources/CodexBarCore/Vendored/CostUsage/CostUsageScanner+CacheHelpers.swift, Sources/CodexBarCore/Vendored/CostUsage/CostUsageScanner.swift)

What the crustacean ranks mean

🦀 challenger crab: rare, exceptional readiness with strong proof, clean implementation, and convincing validation.
🦞 diamond lobster: very strong readiness with only minor maintainer review expected.
🐚 platinum hermit: good normal PR, likely mergeable with ordinary maintainer review.
🦐 gold shrimp: useful signal, but proof or patch confidence is still limited.
🦪 silver shellfish: thin signal; proof, validation, or implementation needs work.
🧂 unranked krab: not merge-ready because proof is missing/unusable or there are serious correctness or safety concerns.
🌊 off-meta tidepool: rating does not apply to this item.

Shiny media proof means a screenshot, video, or linked artifact directly shows the changed behavior. Runtime, network, CSP, and security claims still need visible diagnostics.

How this review workflow works

ClawSweeper keeps one durable marker-backed review comment per issue or PR.
Re-runs edit this comment so the latest verdict, findings, and automation markers stay together instead of adding duplicate bot comments.
A fresh review can be triggered by eligible @clawsweeper re-review comments, exact-item GitHub events, scheduled/background review runs, or manual workflow dispatch.
PR/issue authors and users with repository write access can comment @clawsweeper re-review or @clawsweeper re-run on an open PR or issue to request a fresh review only.
Maintainers can also comment @clawsweeper review to request a fresh review only.
Fresh-review commands do not start repair, autofix, rebase, CI repair, or automerge.
Maintainer-only repair and merge flows require explicit commands such as @clawsweeper autofix, @clawsweeper automerge, @clawsweeper fix ci, or @clawsweeper address review.
Maintainers can comment @clawsweeper explain to ask for more context, or @clawsweeper stop to stop active automation.

Co-authored-by: pickaxe <54486432+ProspectOre@users.noreply.github.com>

perf: reduce Codex cost refresh metadata work

e3b0597

steipete force-pushed the perf/cost-usage-file-stat branch from 7dceb10 to e3b0597 Compare June 11, 2026 10:25

steipete merged commit dd8cf8b into main Jun 11, 2026
7 checks passed

steipete deleted the perf/cost-usage-file-stat branch June 11, 2026 10:36

steipete mentioned this pull request Jun 11, 2026

Resolve codex priority turns incrementally per refresh #1404

Merged

clawsweeper Bot mentioned this pull request Jun 11, 2026

Persist the codex priority-turns memo across launches #1421

Closed
