Skip to content

feat(storage): prune stale aggregated payloads on finalization#389

Open
MegaRedHand wants to merge 5 commits into
mainfrom
feat/prune-stale-aggregated-payloads
Open

feat(storage): prune stale aggregated payloads on finalization#389
MegaRedHand wants to merge 5 commits into
mainfrom
feat/prune-stale-aggregated-payloads

Conversation

@MegaRedHand
Copy link
Copy Markdown
Collaborator

Summary

  • Adds PayloadBuffer::prune(finalized_slot) and Store::prune_stale_aggregated_payloads(finalized_slot), mirroring leanSpec's prune_stale_attestation_data for the two aggregated payload pools.
  • Wires the new pruner into Store::update_checkpoints next to the existing prune_live_chain / prune_gossip_signatures calls, and includes pruned_payloads in the "Pruned finalized data" log line.
  • Adds three unit tests: buffer-level prune (drop + noop) and a store-level test covering both new_payloads and known_payloads.

Motivation

ethlambda was missing the equivalent of leanSpec's prune_stale_attestation_data. The new and known aggregated payload buffers were only bounded by FIFO eviction at 64/512 entries, so post-finalization entries (target.slot ≤ finalized.slot) could linger for ~33 min at 1 attestation/slot. Lingering known proofs pollute existing_proofs_for_data lookups: when a fresh gossip-signature batch happens to collide on the exact stale data_root, build_job selects the stale entry as a child and routes through aggregate_mixed recursive aggregation instead of plain XMSS aggregation. Criterion now matches leanSpec exactly: keep iff target.slot > finalized.slot.

Test plan

  • cargo test -p ethlambda-storage — 38 tests pass (including the 3 new ones)
  • cargo clippy -p ethlambda-storage --all-targets -- -D warnings clean
  • Verify on devnet that the "Pruned finalized data" log line surfaces non-zero pruned_payloads at finalization boundaries

The new and known aggregated payload buffers were only bounded by FIFO
eviction at 64/512 entries: post-finalization entries lingered until
naturally pushed out, polluting `existing_proofs_for_data` lookups and
occasionally forcing recursive XMSS aggregation when plain aggregation
would suffice. Mirrors leanSpec's `prune_stale_attestation_data`: drop
entries whose `target.slot <= finalized.slot` from both buffers when
finalization advances, alongside the existing gossip-signature prune.
@github-actions
Copy link
Copy Markdown

🤖 Kimi Code Review

The PR implements pruning for aggregated attestation payloads on finalization, correctly mirroring the leanSpec's prune_stale_attestation_data. The logic is sound: once a checkpoint is finalized, attestations targeting it cannot influence fork choice and are safe to discard.

Code Quality & Safety

  1. Defensive arithmetic (line 274)
    Consider using saturating_sub to prevent potential panics if total_proofs ever desynchronizes from the actual entry counts:

    *total_proofs = total_proofs.saturating_sub(entry.proofs.len());

    While the current logic should maintain consistency, this guards against future refactoring bugs.

  2. Lock ordering (lines 862-863)
    The method acquires new_payloads then known_payloads. Ensure this ordering is consistent with other methods in Store that might acquire both locks to avoid deadlocks. If elsewhere the order is reversed, standardize on one direction.

  3. Documentation
    The doc comments accurately describe the consensus rationale (fork choice irrelevance after finalization) and correctly reference the leanSpec.

Correctness

  • The pruning condition entry.data.target.slot > finalized_slot (keep if greater) correctly implements "at or below" removal semantics as specified.
  • total_proofs accounting is consistent with the push implementation (increment on add, decrement on remove).
  • The order VecDeque is correctly filtered to match the HashMap retention.

Testing

The unit tests cover:

  • Pruning of entries at the exact finalized slot boundary (item 1)
  • No-op behavior when all entries are fresh (item 2)
  • Integration across both new_payloads and known_payloads buffers (item 3)

Recommendation: Approve after considering the saturating_sub suggestion for defensive programming.


Automated review by Kimi (Moonshot AI) · kimi-k2.5 · custom prompt

@github-actions
Copy link
Copy Markdown

🤖 Claude Code Review

PR #389feat(storage): prune stale aggregated payloads on finalization

Author: Tomás Grüner | Files changed: 1 (crates/storage/src/store.rs) | +130 / -2


Overview

Adds the storage-side equivalent of leanSpec's prune_stale_attestation_data. The two aggregated payload buffers (new_payloads, known_payloads) previously relied only on FIFO capacity caps (64/512 entries); post-finalization entries with target.slot ≤ finalized_slot could linger and corrupt existing_proofs_for_data lookups, causing stale proofs to be selected as aggregation parents. This PR plugs that hole and wires the pruner into update_checkpoints alongside the existing prune_live_chain/prune_gossip_signatures calls.


Correctness

Boundary condition (target.slot > finalized_slot): correct. Entries at exactly finalized_slot are pruned, matching leanSpec's criterion.

total_proofs accounting: the decrement *total_proofs -= entry.proofs.len() inside the retain closure is correct and maintains the invariant.

Borrow-checker workaround (let total_proofs = &mut self.total_proofs): this is the standard Rust pattern for mutating a field while retain borrows the map. It's valid and correct, though a brief comment explaining why the extraction is needed would aid future readers.

prune_stale_aggregated_payloads visibility: marked pub, consistent with prune_gossip_signatures. No concern.


Code Quality

Unnecessary HashSet allocation — lines 273–290

The pruned_roots: HashSet<H256> is only used to filter self.order. Once data.retain has run, the pruned roots are no longer in self.data. The order cleanup can be simplified:

// Current (two allocations: HashSet + retain scan)
let mut pruned_roots: HashSet<H256> = HashSet::new();
let total_proofs = &mut self.total_proofs;
self.data.retain(|root, entry| {
    if entry.data.target.slot > finalized_slot {
        true
    } else {
        *total_proofs -= entry.proofs.len();
        pruned_roots.insert(*root);
        false
    }
});
if !pruned_roots.is_empty() {
    self.order.retain(|r| !pruned_roots.contains(r));
}
pruned_roots.len()

// Simpler (no HashSet; check data membership directly)
let mut pruned = 0;
let total_proofs = &mut self.total_proofs;
self.data.retain(|_, entry| {
    if entry.data.target.slot > finalized_slot {
        true
    } else {
        *total_proofs -= entry.proofs.len();
        pruned += 1;
        false
    }
});
if pruned > 0 {
    self.order.retain(|r| self.data.contains_key(r));
}
pruned

After data.retain completes, self.data contains exactly the entries to keep, so self.data.contains_key(r) is a clean O(1) check per order element. For buffers capped at 64–512 entries the difference is negligible, but the code becomes easier to read and reason about.


Tests

  • payload_buffer_prune_drops_entries_with_finalized_target: covers the boundary (target 3 and 5 pruned at finalized=5, target 7 kept), verifies total_proofs and order are both updated. Good.
  • payload_buffer_prune_noop_when_nothing_stale: covers the no-op path. Good.
  • store_prune_stale_aggregated_payloads_clears_both_buffers: verifies pruning across both buffers and that the combined return value is correct. Good.

Minor gap: no test for prune on an empty buffer (trivially safe given the implementation, but a one-liner assertion could serve as documentation). Also no test for the all-stale case where total_proofs should drop to zero and order should be empty; this would guard against future regressions in the accounting.


Summary

The implementation is correct, well-motivated, and consistent with the leanSpec reference. The only actionable suggestion is replacing the intermediate HashSet<H256> with a counter and a post-retain data.contains_key check (see above) — it reduces allocations and simplifies the logic without changing semantics. The tests are solid; adding an all-stale and/or empty-buffer case would round them out but is not blocking.


Automated review by Claude (Anthropic) · sonnet · custom prompt

@github-actions
Copy link
Copy Markdown

🤖 Codex Code Review

  1. High: crates/storage/src/store.rs:272-287 and :751-756 prune aggregated payloads purely on entry.data.target.slot <= finalized_slot, but this client’s fork choice still counts latest messages based on attestation_data.head.root, not target.slot (crates/blockchain/src/store.rs:53-61, crates/storage/src/store.rs:298-315, crates/blockchain/fork_choice/src/lib.rs:16-23). Late attestations with a finalized target are still accepted and stored by gossip processing (crates/blockchain/src/store.rs:140-184, :279-315, :342-391), so this pruning can remove a validator’s latest vote and reduce weight on the live subtree, potentially changing LMD GHOST head selection after finalization. The new tests only cover the simplified slot == target.slot case (crates/storage/src/store.rs:2118-2203, helper at :2208-2216), so they don’t exercise the realistic slot > target.slot path where this shows up. I’d block on either proving that finalized-target votes are never meant to feed fork choice in this implementation, or adding a regression test showing head selection is unchanged for late-valid attestations before keeping this prune.

I couldn’t run the Rust tests here because cargo/rustup tried to write under the read-only path /home/runner/.rustup/tmp.


Automated review by OpenAI Codex · gpt-5.4 · custom prompt

@greptile-apps
Copy link
Copy Markdown
Contributor

greptile-apps Bot commented May 21, 2026

Greptile Summary

This PR introduces PayloadBuffer::prune(finalized_slot) and Store::prune_stale_aggregated_payloads(finalized_slot), wiring them into update_checkpoints alongside the existing live-chain and gossip-signature pruners. It also refactors GossipSignatureBuffer::prune to drop the intermediate HashSet in favour of a before/after length comparison and a contains_key guard on order.

  • PayloadBuffer::prune: retains entries whose target.slot > finalized_slot, correctly decrements total_proofs for each removed entry, and purges stale roots from the VecDeque order in a single O(n) retain pass.
  • GossipSignatureBuffer::prune refactor: replaces the HashSet<H256> accumulation with the same before - after pattern, reducing one allocation per prune call.
  • Tests: three new unit tests cover the buffer-level prune (drop + no-op) and the store-level combined prune across both new_payloads and known_payloads.

Confidence Score: 5/5

Safe to merge — the changes are additive pruning logic with no risk of incorrectly dropping live data, and all accounting invariants (total_proofs, order) are correctly maintained.

The new PayloadBuffer::prune correctly tracks the before/after count, decrements total_proofs for each evicted entry, and scrubs stale roots from order in a single retain pass. The GossipSignatureBuffer refactor is a straightforward drop of an unnecessary HashSet with no behavioural change. The store-level method acquires locks sequentially and sums results safely. All three new tests exercise the prune paths including the no-op and combined-buffer cases.

No files require special attention.

Important Files Changed

Filename Overview
crates/storage/src/store.rs Adds PayloadBuffer::prune, Store::prune_stale_aggregated_payloads, and wires them into update_checkpoints; also cleans up GossipSignatureBuffer::prune to remove an unnecessary HashSet allocation. Logic, accounting (total_proofs), and order cleanup all look correct. Three tests validate the new paths.

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[update_checkpoints] -->|finalized.slot > old_finalized_slot| B[prune_live_chain]
    A --> C[prune_gossip_signatures]
    A --> D[prune_stale_aggregated_payloads]

    D --> E[new_payloads.lock.prune]
    D --> F[known_payloads.lock.prune]

    E --> G[PayloadBuffer::prune]
    F --> G

    G -->|data.retain| H{target.slot > finalized_slot?}
    H -->|yes| I[keep entry]
    H -->|no| J[total_proofs -= proofs.len
remove entry]
    G --> K[order.retain: keep if data.contains_key]

    C --> L[GossipSignatureBuffer::prune]
    L -->|data.retain| M{slot > finalized_slot?}
    M -->|yes| N[keep entry]
    M -->|no| O[total_signatures -= sigs.len
remove entry]
    L --> P[order.retain: keep if data.contains_key]
Loading

Reviews (2): Last reviewed commit: "Merge branch 'main' into feat/prune-stal..." | Re-trigger Greptile

Comment thread crates/storage/src/store.rs
@MegaRedHand MegaRedHand marked this pull request as draft May 21, 2026 15:29
Replace the pruned_roots HashSet with a size-delta counter; filter
self.order via self.data.contains_key after retain. Same semantics,
one fewer allocation per prune, and matches the disjoint-field-borrow
pattern used elsewhere in the file.
…::prune

Mirror the same size-delta + data.contains_key pattern applied to
PayloadBuffer::prune in the prior commit, so both buffers share a
single prune idiom.
@MegaRedHand MegaRedHand marked this pull request as ready for review May 21, 2026 16:26
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant