[improve][pip] PIP-429: Optimize Handling of Compacted Last Entry by Skipping Payload Buffer Parsing #24439
I prefer to make changes on the server side for several reasons.
- The broker can handle this issue well and provide a backward-compatible solution.
- Adding a flag for the valid compacted position avoids deserializing and decompressing the entire batch message metadata header.
- Making this change on the client side increases client complexity, and every language client would need to change.
- It's weird to return EntryBuffer when getting the last message ID.
- Actually, Pulsar only needs to retain valid message data in a compacted entry, but it retains all compacted messages with "empty header" and "empty payload".
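
To make the flag idea in the list above concrete, here is a minimal sketch. Everything in it is an assumption for illustration: `EntryHeader`, `lastValidBatchIndex`, and `lastBatchIndex` are hypothetical names, not Pulsar APIs or the field the PIP actually adds. It only shows how a header-level flag could answer the "last message ID" question without touching the payload, with a fallback for entries written before the flag existed.

```java
import java.util.function.IntSupplier;

public class LastPositionSketch {
    /** Hypothetical entry header; lastValidBatchIndex is an assumed field, not a Pulsar API. */
    public static final class EntryHeader {
        final int numMessagesInBatch;
        final int lastValidBatchIndex; // -1 means the flag is absent (older entry)

        public EntryHeader(int numMessagesInBatch, int lastValidBatchIndex) {
            this.numMessagesInBatch = numMessagesInBatch;
            this.lastValidBatchIndex = lastValidBatchIndex;
        }
    }

    /**
     * Returns the batch index to use for the last message ID.
     * Uses the header flag when present; otherwise invokes the (expensive)
     * payload-parsing fallback supplied by the caller.
     */
    public static int lastBatchIndex(EntryHeader header, IntSupplier parsePayloadFallback) {
        if (header.lastValidBatchIndex >= 0) {
            return header.lastValidBatchIndex; // payload is never read on this path
        }
        return parsePayloadFallback.getAsInt(); // compatibility path for old entries
    }
}
```

The fallback is passed in as a supplier so that the payload is parsed only when the flag is missing, which is the compatibility behavior the first bullet asks for.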
Please check the
|
Yes. Unfortunately, this is constrained by the client-side logic. The default logic for parsing the payload buffer looks like:

    int batchSize = msgMetadata.getNumMessagesInBatch();
    for (int i = 0; i < batchSize; i++) {
        int batchIndex = i;
        final var singleMessageMetadata = parse(payload);
        if (singleMessageMetadata.isCompactedOut()) {
            break;
        }
        // Create a message, whose batch index is i, from the payload buffer
    }
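
As a rough, self-contained illustration of the loop quoted above, the sketch below replaces the payload buffer with a plain `boolean[]` of compacted-out flags. `BatchWalkSketch` and `walk` are hypothetical names introduced here, and reading an array element stands in for `parse(payload)`; it is not the client's actual code.

```java
import java.util.ArrayList;
import java.util.List;

public class BatchWalkSketch {
    /**
     * Mirrors the quoted loop: visit entries in order and stop at the first
     * one marked compacted out. Returns the batch indexes for which a
     * message would be created.
     */
    public static List<Integer> walk(boolean[] compactedOut) {
        List<Integer> indexes = new ArrayList<>();
        for (int i = 0; i < compactedOut.length; i++) {
            boolean isCompactedOut = compactedOut[i]; // stands in for parse(payload)
            if (isCompactedOut) {
                break;
            }
            indexes.add(i); // a message with batch index i would be created here
        }
        return indexes;
    }

    public static void main(String[] args) {
        System.out.println(walk(new boolean[] {false, false, true, false}));
    }
}
```

Each index costs one metadata parse, so the work grows with the number of entries visited in the batch.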
This makes sense to me. Adding a new field to |
If our purpose is to ensure that the compaction task succeeds, we only need to check If we need to make the Pulsar reader read Kafka-format data, then we need this change. |
No. We need to ensure the The consumer is able to configure a |
Co-authored-by: Penghui Li <penghui@apache.org>
…Skipping Payload Buffer Parsing (apache#24439)