feat(storage): prune stale aggregated payloads on finalization by MegaRedHand · Pull Request #389 · lambdaclass/ethlambda

MegaRedHand · 2026-05-21T15:00:02Z

Summary

Adds PayloadBuffer::prune(finalized_slot) and Store::prune_stale_aggregated_payloads(finalized_slot), mirroring leanSpec's prune_stale_attestation_data for the two aggregated payload pools.
Wires the new pruner into Store::update_checkpoints next to the existing prune_live_chain / prune_gossip_signatures calls, and includes pruned_payloads in the "Pruned finalized data" log line.
Adds three unit tests: buffer-level prune (drop + noop) and a store-level test covering both new_payloads and known_payloads.

Motivation

ethlambda was missing the equivalent of leanSpec's prune_stale_attestation_data. The new and known aggregated payload buffers were only bounded by FIFO eviction at 64/512 entries, so post-finalization entries (target.slot ≤ finalized.slot) could linger for ~33 min at 1 attestation/slot. Lingering known proofs pollute existing_proofs_for_data lookups: when a fresh gossip-signature batch happens to collide on the exact stale data_root, build_job selects the stale entry as a child and routes through aggregate_mixed recursive aggregation instead of plain XMSS aggregation. Criterion now matches leanSpec exactly: keep iff target.slot > finalized.slot.

Test plan

cargo test -p ethlambda-storage — 38 tests pass (including the 3 new ones)
cargo clippy -p ethlambda-storage --all-targets -- -D warnings clean
Verify on devnet that the "Pruned finalized data" log line surfaces non-zero pruned_payloads at finalization boundaries

The new and known aggregated payload buffers were only bounded by FIFO eviction at 64/512 entries: post-finalization entries lingered until naturally pushed out, polluting `existing_proofs_for_data` lookups and occasionally forcing recursive XMSS aggregation when plain aggregation would suffice. Mirrors leanSpec's `prune_stale_attestation_data`: drop entries whose `target.slot <= finalized.slot` from both buffers when finalization advances, alongside the existing gossip-signature prune.

github-actions · 2026-05-21T15:01:12Z

🤖 Kimi Code Review

The PR implements pruning for aggregated attestation payloads on finalization, correctly mirroring the leanSpec's prune_stale_attestation_data. The logic is sound: once a checkpoint is finalized, attestations targeting it cannot influence fork choice and are safe to discard.

Code Quality & Safety

Defensive arithmetic (line 274)
Consider using saturating_sub to prevent potential panics if total_proofs ever desynchronizes from the actual entry counts:
```
*total_proofs = total_proofs.saturating_sub(entry.proofs.len());
```
While the current logic should maintain consistency, this guards against future refactoring bugs.
Lock ordering (lines 862-863)
The method acquires new_payloads then known_payloads. Ensure this ordering is consistent with other methods in Store that might acquire both locks to avoid deadlocks. If elsewhere the order is reversed, standardize on one direction.
Documentation
The doc comments accurately describe the consensus rationale (fork choice irrelevance after finalization) and correctly reference the leanSpec.

Correctness

The pruning condition entry.data.target.slot > finalized_slot (keep if greater) correctly implements "at or below" removal semantics as specified.
total_proofs accounting is consistent with the push implementation (increment on add, decrement on remove).
The order VecDeque is correctly filtered to match the HashMap retention.

Testing

The unit tests cover:

Pruning of entries at the exact finalized slot boundary (item 1)
No-op behavior when all entries are fresh (item 2)
Integration across both new_payloads and known_payloads buffers (item 3)

Recommendation: Approve after considering the saturating_sub suggestion for defensive programming.

Automated review by Kimi (Moonshot AI) · kimi-k2.5 · custom prompt

github-actions · 2026-05-21T15:02:02Z

🤖 Claude Code Review

PR #389 — `feat(storage): prune stale aggregated payloads on finalization`

Author: Tomás Grüner | Files changed: 1 (crates/storage/src/store.rs) | +130 / -2

Overview

Adds the storage-side equivalent of leanSpec's prune_stale_attestation_data. The two aggregated payload buffers (new_payloads, known_payloads) previously relied only on FIFO capacity caps (64/512 entries); post-finalization entries with target.slot ≤ finalized_slot could linger and corrupt existing_proofs_for_data lookups, causing stale proofs to be selected as aggregation parents. This PR plugs that hole and wires the pruner into update_checkpoints alongside the existing prune_live_chain/prune_gossip_signatures calls.

Correctness

Boundary condition (target.slot > finalized_slot): correct. Entries at exactly finalized_slot are pruned, matching leanSpec's criterion.

total_proofs accounting: the decrement *total_proofs -= entry.proofs.len() inside the retain closure is correct and maintains the invariant.

Borrow-checker workaround (let total_proofs = &mut self.total_proofs): this is the standard Rust pattern for mutating a field while retain borrows the map. It's valid and correct, though a brief comment explaining why the extraction is needed would aid future readers.

prune_stale_aggregated_payloads visibility: marked pub, consistent with prune_gossip_signatures. No concern.

Code Quality

Unnecessary HashSet allocation — lines 273–290

The pruned_roots: HashSet<H256> is only used to filter self.order. Once data.retain has run, the pruned roots are no longer in self.data. The order cleanup can be simplified:

// Current (two allocations: HashSet + retain scan)
let mut pruned_roots: HashSet<H256> = HashSet::new();
let total_proofs = &mut self.total_proofs;
self.data.retain(|root, entry| {
    if entry.data.target.slot > finalized_slot {
        true
    } else {
        *total_proofs -= entry.proofs.len();
        pruned_roots.insert(*root);
        false
    }
});
if !pruned_roots.is_empty() {
    self.order.retain(|r| !pruned_roots.contains(r));
}
pruned_roots.len()

// Simpler (no HashSet; check data membership directly)
let mut pruned = 0;
let total_proofs = &mut self.total_proofs;
self.data.retain(|_, entry| {
    if entry.data.target.slot > finalized_slot {
        true
    } else {
        *total_proofs -= entry.proofs.len();
        pruned += 1;
        false
    }
});
if pruned > 0 {
    self.order.retain(|r| self.data.contains_key(r));
}
pruned

After data.retain completes, self.data contains exactly the entries to keep, so self.data.contains_key(r) is a clean O(1) check per order element. For buffers capped at 64–512 entries the difference is negligible, but the code becomes easier to read and reason about.

Tests

payload_buffer_prune_drops_entries_with_finalized_target: covers the boundary (target 3 and 5 pruned at finalized=5, target 7 kept), verifies total_proofs and order are both updated. Good.
payload_buffer_prune_noop_when_nothing_stale: covers the no-op path. Good.
store_prune_stale_aggregated_payloads_clears_both_buffers: verifies pruning across both buffers and that the combined return value is correct. Good.

Minor gap: no test for prune on an empty buffer (trivially safe given the implementation, but a one-liner assertion could serve as documentation). Also no test for the all-stale case where total_proofs should drop to zero and order should be empty; this would guard against future regressions in the accounting.

Summary

The implementation is correct, well-motivated, and consistent with the leanSpec reference. The only actionable suggestion is replacing the intermediate HashSet<H256> with a counter and a post-retain data.contains_key check (see above) — it reduces allocations and simplifies the logic without changing semantics. The tests are solid; adding an all-stale and/or empty-buffer case would round them out but is not blocking.

Automated review by Claude (Anthropic) · sonnet · custom prompt

github-actions · 2026-05-21T15:02:52Z

🤖 Codex Code Review

High: crates/storage/src/store.rs:272-287 and :751-756 prune aggregated payloads purely on entry.data.target.slot <= finalized_slot, but this client’s fork choice still counts latest messages based on attestation_data.head.root, not target.slot (crates/blockchain/src/store.rs:53-61, crates/storage/src/store.rs:298-315, crates/blockchain/fork_choice/src/lib.rs:16-23). Late attestations with a finalized target are still accepted and stored by gossip processing (crates/blockchain/src/store.rs:140-184, :279-315, :342-391), so this pruning can remove a validator’s latest vote and reduce weight on the live subtree, potentially changing LMD GHOST head selection after finalization. The new tests only cover the simplified slot == target.slot case (crates/storage/src/store.rs:2118-2203, helper at :2208-2216), so they don’t exercise the realistic slot > target.slot path where this shows up. I’d block on either proving that finalized-target votes are never meant to feed fork choice in this implementation, or adding a regression test showing head selection is unchanged for late-valid attestations before keeping this prune.

I couldn’t run the Rust tests here because cargo/rustup tried to write under the read-only path /home/runner/.rustup/tmp.

Automated review by OpenAI Codex · gpt-5.4 · custom prompt

greptile-apps · 2026-05-21T15:03:47Z

Greptile Summary

This PR introduces PayloadBuffer::prune(finalized_slot) and Store::prune_stale_aggregated_payloads(finalized_slot), wiring them into update_checkpoints alongside the existing live-chain and gossip-signature pruners. It also refactors GossipSignatureBuffer::prune to drop the intermediate HashSet in favour of a before/after length comparison and a contains_key guard on order.

PayloadBuffer::prune: retains entries whose target.slot > finalized_slot, correctly decrements total_proofs for each removed entry, and purges stale roots from the VecDeque order in a single O(n) retain pass.
GossipSignatureBuffer::prune refactor: replaces the HashSet<H256> accumulation with the same before - after pattern, reducing one allocation per prune call.
Tests: three new unit tests cover the buffer-level prune (drop + no-op) and the store-level combined prune across both new_payloads and known_payloads.

Confidence Score: 5/5

Safe to merge — the changes are additive pruning logic with no risk of incorrectly dropping live data, and all accounting invariants (total_proofs, order) are correctly maintained.

The new PayloadBuffer::prune correctly tracks the before/after count, decrements total_proofs for each evicted entry, and scrubs stale roots from order in a single retain pass. The GossipSignatureBuffer refactor is a straightforward drop of an unnecessary HashSet with no behavioural change. The store-level method acquires locks sequentially and sums results safely. All three new tests exercise the prune paths including the no-op and combined-buffer cases.

No files require special attention.

Important Files Changed

Filename	Overview
crates/storage/src/store.rs	Adds PayloadBuffer::prune, Store::prune_stale_aggregated_payloads, and wires them into update_checkpoints; also cleans up GossipSignatureBuffer::prune to remove an unnecessary HashSet allocation. Logic, accounting (total_proofs), and order cleanup all look correct. Three tests validate the new paths.

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[update_checkpoints] -->|finalized.slot > old_finalized_slot| B[prune_live_chain]
    A --> C[prune_gossip_signatures]
    A --> D[prune_stale_aggregated_payloads]

    D --> E[new_payloads.lock.prune]
    D --> F[known_payloads.lock.prune]

    E --> G[PayloadBuffer::prune]
    F --> G

    G -->|data.retain| H{target.slot > finalized_slot?}
    H -->|yes| I[keep entry]
    H -->|no| J[total_proofs -= proofs.len
remove entry]
    G --> K[order.retain: keep if data.contains_key]

    C --> L[GossipSignatureBuffer::prune]
    L -->|data.retain| M{slot > finalized_slot?}
    M -->|yes| N[keep entry]
    M -->|no| O[total_signatures -= sigs.len
remove entry]
    L --> P[order.retain: keep if data.contains_key]

_{Reviews (2): Last reviewed commit: "Merge branch 'main' into feat/prune-stal..." | Re-trigger Greptile}

Replace the pruned_roots HashSet with a size-delta counter; filter self.order via self.data.contains_key after retain. Same semantics, one fewer allocation per prune, and matches the disjoint-field-borrow pattern used elsewhere in the file.

…::prune Mirror the same size-delta + data.contains_key pattern applied to PayloadBuffer::prune in the prior commit, so both buffers share a single prune idiom.

Merge branch 'main' into feat/prune-stale-aggregated-payloads

dadbd30

greptile-apps Bot reviewed May 21, 2026

View reviewed changes

Comment thread crates/storage/src/store.rs

MegaRedHand marked this pull request as draft May 21, 2026 15:29

MegaRedHand added 3 commits May 21, 2026 13:00

refactor(storage): drop intermediate HashSet in GossipSignatureBuffer…

1439aac

…::prune Mirror the same size-delta + data.contains_key pattern applied to PayloadBuffer::prune in the prior commit, so both buffers share a single prune idiom.

Merge branch 'main' into feat/prune-stale-aggregated-payloads

4c8cd6f

MegaRedHand marked this pull request as ready for review May 21, 2026 16:26

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat(storage): prune stale aggregated payloads on finalization#389

feat(storage): prune stale aggregated payloads on finalization#389
MegaRedHand wants to merge 5 commits into
mainfrom
feat/prune-stale-aggregated-payloads

MegaRedHand commented May 21, 2026

Uh oh!

github-actions Bot commented May 21, 2026

Uh oh!

github-actions Bot commented May 21, 2026

Uh oh!

github-actions Bot commented May 21, 2026

Uh oh!

greptile-apps Bot commented May 21, 2026 •

edited

Loading

Important Files Changed

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

MegaRedHand commented May 21, 2026

Summary

Motivation

Test plan

Uh oh!

github-actions Bot commented May 21, 2026

🤖 Kimi Code Review

Uh oh!

github-actions Bot commented May 21, 2026

🤖 Claude Code Review

PR #389 — feat(storage): prune stale aggregated payloads on finalization

Overview

Correctness

Code Quality

Tests

Summary

Uh oh!

github-actions Bot commented May 21, 2026

🤖 Codex Code Review

Uh oh!

greptile-apps Bot commented May 21, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Greptile Summary

Confidence Score: 5/5

Important Files Changed

Flowchart

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

PR #389 — `feat(storage): prune stale aggregated payloads on finalization`

greptile-apps Bot commented May 21, 2026 •

edited

Loading