Misc. vuln fixes by jmecom · Pull Request #19 · eerimoq/detools

jmecom · 2026-04-07T20:02:56Z

Summary

This PR fixes multiple vulnerabilities across the C apply path, the Python apply path, and the native Python extensions used during patch creation.

The common issue behind most of the real findings was that malformed patch metadata or caller-provided native buffers were not being rejected early enough. In practice that meant negative decoded sizes, impossible in-place patch geometry, oversized logical CRLE segments, and unchecked native buffers could reach arithmetic, reads, or writes that assumed the inputs were already valid.

This change hardens those boundaries. Valid patches should behave the same as before, but malformed or adversarial inputs now fail early and predictably instead of being normalized by casts, propagated into invalid geometry, or passed through to unsafe native code.
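
As a rough illustration of the "normalized by casts" problem (a sketch, not code from this PR): the snippet below uses Python's ctypes to mimic an implicit signed-to-size_t conversion, then shows the kind of early rejection this change aims for. `PatchError`, `as_size_t`, and `checked_size` are hypothetical names chosen for the example and are not part of detools.

```python
import ctypes


class PatchError(Exception):
    """Hypothetical error type for malformed patch input."""


def as_size_t(value):
    """Mimic a C implicit conversion of a signed integer to size_t."""
    return ctypes.c_size_t(value).value


def checked_size(value, limit):
    """Reject negative or oversized decoded sizes before they are used."""
    if value < 0:
        raise PatchError(f"negative size {value}")
    if value > limit:
        raise PatchError(f"size {value} exceeds limit {limit}")
    return value


SIZE_MAX = 2 ** (8 * ctypes.sizeof(ctypes.c_size_t)) - 1

# A decoded -1 becomes SIZE_MAX once it is treated as a size_t.
print(as_size_t(-1) == SIZE_MAX)  # prints True
```

Checking the signed value first, as `checked_size` does, means the cast never sees a negative number.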

Breakdown

- C library: negative decoded chunk sizes bypassed bounds validation
  In c/detools.c, decoded sizes from the patch stream could be negative and still pass through common_process_size() after an implicit cast to size_t.
- C library: in-place patches could encode segment_size = 0
  In the in-place apply header, segment_size was not rejected before being used in segment math.
- C library: in-place patches could encode shift_size > memory_size
  The in-place shift calculation assumed a valid layout and did not reject impossible memory geometry before using it.
- Python apply path: negative sequential / in-place / bsdiff sizes were not rejected consistently
  detools/apply.py trusted several signed size fields from patch headers and chunk metadata for longer than it should have.
- Python apply path: negative from_size could reach file reads
  In the in-place Python apply path, a negative decoded from_size could become read(-N) behavior on file-like objects.
- Python apply path: short diff reads failed too late
  During in-place segment application, a short source read could go unnoticed until bsdiff.add_bytes() tripped on a length mismatch, so the failure surfaced deep in the apply flow instead of at the read itself.
- CRLE decompressor buffered attacker-controlled logical output
  In detools/compression/crle.py, repeated and scattered segments could expand into large _outdata buffers even when the caller only requested a small amount of output.
- bsdiff native wrapper trusted caller-provided diff buffers
  detools/bsdiff.c did not verify that the scratch diff buffer was large enough for to_data.
- bsdiff native wrapper trusted oversized inputs
  Inputs larger than the algorithm’s int32_t limits could be truncated before reaching the core bsdiff logic.
- bsdiff native wrapper trusted suffix-array contents
  Caller-provided suffix-array entries were used as offsets without validating that they were in range.
- suffix_array native wrapper did not validate output buffer sizing
  detools.suffix_array.sais() and divsufsort() assumed the caller had provided a large enough output buffer.
- bsdiff.add_bytes() leaked Python buffer exports on one error path
  The length-mismatch path returned without releasing acquired buffers.
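
To make the native-wrapper items more concrete, here is a minimal sketch of the kind of pre-flight checks described above, written in Python for readability. It is not the code in detools/bsdiff.c: `validate_bsdiff_inputs` is a hypothetical name, and the assumption that suffix-array entries must lie in 0..len(from_data) is mine.

```python
INT32_MAX = 2**31 - 1  # the int32_t limit mentioned above


def validate_bsdiff_inputs(suffix_array, from_data, to_data, diff_buffer):
    """Hypothetical pre-flight checks; none of these names are detools API."""
    for name, data in (("from_data", from_data), ("to_data", to_data)):
        if len(data) > INT32_MAX:
            raise ValueError(f"{name} is larger than INT32_MAX")

    # The scratch diff buffer must be able to hold one byte per to_data byte.
    if len(diff_buffer) < len(to_data):
        raise ValueError("diff buffer is smaller than to_data")

    # Assumption: every suffix-array entry is an offset into from_data,
    # so it must lie in the range 0..len(from_data) inclusive.
    for index, offset in enumerate(suffix_array):
        if not 0 <= offset <= len(from_data):
            raise ValueError(
                f"suffix array entry {index} out of range: {offset}")


# A well-formed call passes silently.
validate_bsdiff_inputs([3, 0, 1, 2], b"abc", b"abcd", bytearray(4))
```

Doing all of these checks before any native read or write keeps the core algorithm free of defensive code.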

Impact

- Negative decoded chunk sizes in the C apply path
  A crafted patch could bypass the intended size bound checks and drive invalid apply behavior in the low-level C patch path.
- segment_size = 0 in in-place C apply
  A crafted in-place patch could drive divide-by-zero-style behavior and invalid segment processing.
- shift_size > memory_size in in-place C apply
  A crafted in-place patch could push the memory-shift logic into invalid geometry and nonsensical offsets.
- Negative sizes in the Python apply path
  Corrupt patches could slip past expected bounds checks and fail later in less predictable ways.
- Negative from_size in Python in-place apply
  A crafted patch could trigger oversized reads from the source object, which is especially bad for constrained environments or unusual file-like wrappers.
- Short reads in Python in-place diff application
  Corrupt patches could fail mid-apply with a low-level mismatch instead of being rejected as invalid patch input up front.
- CRLE output buffering
  A crafted CRLE stream could force the decompressor to buffer far more logical output than the caller requested, creating an avoidable resource-exhaustion vector.
- Unchecked diff scratch buffers in bsdiff
  Callers of the native patch-creation helper could trigger out-of-bounds native writes with undersized buffers.
- Oversized native inputs
  Inputs above the underlying algorithm limits could truncate into invalid negative or wrapped values before reaching the search logic.
- Unchecked suffix-array contents
  Malformed suffix arrays could drive out-of-range native reads and undefined behavior in patch creation.
- Unchecked suffix-array output buffers
  Callers could hand the wrapper too-small output buffers and trigger native memory corruption.
- Leaked buffer exports in add_bytes()
  Repeated error-path use could accumulate Python-side resource leakage and leave mutable buffers pinned unexpectedly.
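
The CRLE buffering item can be sketched as follows. This is an illustration of lazy expansion in general, not the actual crle.py implementation: `RepeatedSegment` is a hypothetical class, and a single repeated byte stands in for whatever the real segment format holds.

```python
class RepeatedSegment:
    """Hypothetical lazily expanded run: `count` copies of one byte."""

    def __init__(self, value, count):
        if count < 0:
            raise ValueError("negative repeat count")
        self._value = bytes([value])
        self._remaining = count

    @property
    def remaining(self):
        """Number of logical bytes not yet handed out."""
        return self._remaining

    def read(self, size):
        """Return at most `size` bytes without expanding the whole run."""
        if size < 0:
            raise ValueError("negative read size")
        length = min(size, self._remaining)
        self._remaining -= length
        return self._value * length


# A huge logical run costs only as much memory as the caller asks for.
segment = RepeatedSegment(0x00, 10**12)
chunk = segment.read(4)
```

Here the memory used per call is bounded by the requested size, regardless of the repeat count encoded in the stream.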

Testing

Added direct regression tests for the C-library malformed patch cases in c/tst/test_fuzzer.c.
Added Python security regressions in tests/test_apply_security.py covering:
- negative diff / extra / data-format sizes
- invalid in-place header values
- short source reads during in-place diff application
- invalid heatshrink headers
Added CRLE regressions in tests/test_crle.py to verify repeated and scattered segments now decompress incrementally instead of over-buffering.
Added native wrapper regressions in tests/test_bsdiff.py covering:
- undersized diff buffers
- invalid suffix-array entries
- oversized inputs past INT32_MAX
- the add_bytes() buffer-release error path
Added native wrapper regressions in tests/test_suffix_array.py covering:
- undersized output buffers
- oversized inputs past INT32_MAX
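
For readers who want a feel for the shape of these regressions, below is a self-contained sketch in the same spirit. It does not reproduce the tests in this PR: `PatchError`, `validate_in_place_header`, and `read_exact` are hypothetical stand-ins for the checks described in the Breakdown section.

```python
import io
import unittest


class PatchError(Exception):
    """Hypothetical error type for malformed patch input."""


def validate_in_place_header(memory_size, segment_size, shift_size, from_size):
    """Reject impossible in-place geometry before any segment math runs."""
    if min(memory_size, shift_size, from_size) < 0:
        raise PatchError("negative size in in-place header")
    if segment_size <= 0:
        raise PatchError("segment_size must be positive")
    if shift_size > memory_size:
        raise PatchError("shift_size exceeds memory_size")


def read_exact(fin, size):
    """Read exactly `size` bytes, failing at the read rather than later."""
    if size < 0:
        raise PatchError("negative read size")
    data = fin.read(size)
    if len(data) != size:
        raise PatchError("short read from source")
    return data


class MalformedPatchTest(unittest.TestCase):

    def test_valid_header_is_accepted(self):
        validate_in_place_header(4096, 512, 1024, 2048)

    def test_zero_segment_size(self):
        with self.assertRaises(PatchError):
            validate_in_place_header(4096, 0, 1024, 2048)

    def test_shift_size_larger_than_memory_size(self):
        with self.assertRaises(PatchError):
            validate_in_place_header(4096, 512, 8192, 2048)

    def test_negative_from_size(self):
        with self.assertRaises(PatchError):
            validate_in_place_header(4096, 512, 1024, -1)

    def test_short_source_read(self):
        with self.assertRaises(PatchError):
            read_exact(io.BytesIO(b"abc"), 4)
```

Saved to a file, this can be run with `python -m unittest`.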

eerimoq · 2026-04-07T20:42:09Z

Looks way too complicated to be correct =)

jmecom · 2026-04-07T21:42:56Z

> Looks way too complicated to be correct =)

Happy to break it down into separate commits if you like? Pretty confident the findings are correct.

jmecom · 2026-04-10T18:18:10Z

> > Looks way too complicated to be correct =)
>
> Happy to break it down into separate commits if you like? Pretty confident the findings are correct.

I went ahead and did this. Eyes appreciated. Given that embedded devices may use this library for firmware update, this presents a notable attack surface unless the device requires, for example, the delta patches be signed.
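
As a sketch of the "require authenticated patches" idea: the snippet below checks a patch against an authentication tag before it would be handed to any apply routine. It uses HMAC-SHA256 from the Python standard library as a stand-in, which is a shared-key MAC rather than a true asymmetric signature; a real device would more likely verify a public-key signature. `patch_is_authentic` and the key handling are hypothetical.

```python
import hashlib
import hmac


def patch_is_authentic(patch, tag, key):
    """Return True only if `tag` is the HMAC-SHA256 of `patch` under `key`."""
    expected = hmac.new(key, patch, hashlib.sha256).digest()
    # compare_digest avoids leaking how many leading bytes matched.
    return hmac.compare_digest(expected, tag)


key = b"device-provisioned-secret"  # placeholder value for the example
patch = b"example patch bytes"
tag = hmac.new(key, patch, hashlib.sha256).digest()

print(patch_is_authentic(patch, tag, key))         # prints True
print(patch_is_authentic(patch + b"x", tag, key))  # prints False
```

With a check like this in front of the apply step, malformed-patch handling only matters for patches that already passed authentication.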

eerimoq · 2026-04-10T18:57:35Z

If you allow anyone to upload delta firmware to your device, you probably have bigger problems as it seems your device is not protected by any authentication =)

eerimoq · 2026-04-10T19:12:18Z

Split into separate PRs and I might review some. And if possible, do the corrupt-patch checks in helper functions to reduce the lines of code.

jmecom added 4 commits April 10, 2026 10:57

- 2182850 apply: validate malformed patch sizes
- 082318b suffix_array: validate buffer sizes
- f30a597 bsdiff: validate buffers and inputs
- 648cd41 crle: handle incremental decompression correctly

jmecom force-pushed the jm/fix-vulns branch from cbb3d39 to 648cd41 · April 10, 2026 18:00

jmecom wants to merge 4 commits into eerimoq:master from jmecom:jm/fix-vulns
