Misc. vuln fixes #19

Open
jmecom wants to merge 4 commits into eerimoq:master from jmecom:jm/fix-vulns


jmecom commented Apr 7, 2026

Summary

This PR fixes multiple vulnerabilities across the C apply path, the Python apply path, and the native Python extensions used during patch creation.

The common issue behind most of the real findings was that malformed patch metadata or caller-provided native buffers were not being rejected early enough. In practice that meant negative decoded sizes, impossible in-place patch geometry, oversized logical CRLE segments, and unchecked native buffers could reach arithmetic, reads, or writes that assumed the inputs were already valid.

This change hardens those boundaries. Valid patches should behave the same as before, but malformed or adversarial inputs now fail early and predictably instead of being normalized by casts, propagated into invalid geometry, or passed through to unsafe native code.
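
As a rough illustration of what "fail early" can look like for the in-place header, the sketch below checks the fields named in this description before any segment math runs. `PatchError` and `validate_in_place_header` are hypothetical names chosen for the example; they are not the helpers added by this PR.

```python
class PatchError(Exception):
    """Hypothetical error type for malformed patch input."""


def validate_in_place_header(memory_size, segment_size, shift_size, from_size):
    """Reject impossible in-place geometry before it is used in arithmetic."""
    fields = (
        ("memory_size", memory_size),
        ("segment_size", segment_size),
        ("shift_size", shift_size),
        ("from_size", from_size),
    )
    # Check signs while the values are still signed integers.
    for name, value in fields:
        if value < 0:
            raise PatchError(f"{name} is negative: {value}")
    # segment_size is used as a divisor/stride, so zero is never valid.
    if segment_size == 0:
        raise PatchError("segment_size must be greater than zero")
    # A shift larger than the memory region cannot describe a real layout.
    if shift_size > memory_size:
        raise PatchError("shift_size exceeds memory_size")


# A plausible header passes silently; a malformed one raises PatchError.
validate_in_place_header(4096, 512, 1024, 3000)
```

Collecting the checks in one place keeps each call site short and gives every malformed header the same failure type.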

Breakdown

  • C library: negative decoded chunk sizes bypassed bounds validation
    In c/detools.c, decoded sizes from the patch stream could be negative and still pass through common_process_size() after an implicit cast to size_t.

  • C library: in-place patches could encode segment_size = 0
    In the in-place apply header, segment_size was not rejected before being used in segment math.

  • C library: in-place patches could encode shift_size > memory_size
    The in-place shift calculation assumed a valid layout and did not reject impossible memory geometry before using it.

  • Python apply path: negative sequential / in-place / bsdiff sizes were not rejected consistently
    detools/apply.py did not validate several signed size fields from patch headers and chunk metadata until late in the apply flow, if at all.

  • Python apply path: negative from_size could reach file reads
    In the in-place Python apply path, a negative decoded from_size could be passed straight to read() on a file-like object, where a negative size typically means "read until EOF".

  • Python apply path: short diff reads failed too late
    During in-place segment application, a short source read went unnoticed until bsdiff.add_bytes() tripped on a length mismatch, so the failure surfaced deep in the apply flow rather than at the read itself.

  • CRLE decompressor buffered attacker-controlled logical output
    In detools/compression/crle.py, repeated and scattered segments could expand into large _outdata buffers even when the caller only requested a small amount of output.

  • bsdiff native wrapper trusted caller-provided diff buffers
    detools/bsdiff.c did not verify that the scratch diff buffer was large enough for to_data.

  • bsdiff native wrapper trusted oversized inputs
    Inputs larger than the algorithm’s int32_t limits could be truncated before reaching the core bsdiff logic.

  • bsdiff native wrapper trusted suffix-array contents
    Caller-provided suffix-array entries were used as offsets without validating that they were in range.

  • suffix_array native wrapper did not validate output buffer sizing
    detools.suffix_array.sais() and divsufsort() assumed the caller had provided a large enough output buffer.

  • bsdiff.add_bytes() leaked Python buffer exports on one error path
    The length-mismatch path returned without releasing acquired buffers.
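
The first item in this list can be hard to picture, so here is a generic model of how a negative signed size can slip past a bounds check once it has been converted to an unsigned type. This is not the code in common_process_size(); the function names are hypothetical, and a 64-bit size_t is assumed. Python integers do not wrap, so the conversion is modelled with a mask.

```python
SIZE_BITS = 64                    # assumption: size_t is 64 bits wide
SIZE_MASK = (1 << SIZE_BITS) - 1


def as_size_t(value):
    """Model a C conversion to size_t (reduction modulo 2**SIZE_BITS)."""
    return value & SIZE_MASK


def unsafe_fits(offset, decoded_size, limit):
    """Unsafe pattern: convert first, then bounds-check a sum that wraps."""
    end = as_size_t(offset + as_size_t(decoded_size))
    return end <= limit


def safe_fits(offset, decoded_size, limit):
    """Reject negative sizes while the value is still signed."""
    if decoded_size < 0 or offset > limit:
        return False
    # Compare against the remaining space so no addition can overflow.
    return decoded_size <= limit - offset


print(unsafe_fits(8, -4, 16))  # True: the wrapped end offset is 4
print(safe_fits(8, -4, 16))    # False
```

The safe variant does two things differently: it tests the sign before any conversion, and it compares the size against the remaining space instead of adding it to the offset.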

Impact

  • Negative decoded chunk sizes in the C apply path
    A crafted patch could bypass the intended size bound checks and drive invalid apply behavior in the low-level C patch path.

  • segment_size = 0 in in-place C apply
    A crafted in-place patch could drive divide-by-zero style behavior and invalid segment processing.

  • shift_size > memory_size in in-place C apply
    A crafted in-place patch could push the memory-shift logic into invalid geometry and nonsensical offsets.

  • Negative sizes in the Python apply path
    Corrupt patches could slip past expected bounds checks and fail later in less predictable ways.

  • Negative from_size in Python in-place apply
    A crafted patch could trigger oversized reads from the source object, which is especially bad for constrained environments or unusual file-like wrappers.

  • Short reads in Python in-place diff application
    Corrupt patches could fail mid-apply with a low-level mismatch instead of being rejected as invalid patch input up front.

  • CRLE output buffering
    A crafted CRLE stream could force the decompressor to buffer far more logical output than the caller requested, creating an avoidable resource-exhaustion vector.

  • Unchecked diff scratch buffers in bsdiff
    Callers of the native patch-creation helper could trigger out-of-bounds native writes with undersized buffers.

  • Oversized native inputs
    Inputs above the underlying algorithm limits could truncate into invalid negative or wrapped values before reaching the search logic.

  • Unchecked suffix-array contents
    Malformed suffix arrays could drive out-of-range native reads and undefined behavior in patch creation.

  • Unchecked suffix-array output buffers
    Callers could hand the wrapper too-small output buffers and trigger native memory corruption.

  • Leaked buffer exports in add_bytes()
    Repeated error-path use could accumulate Python-side resource leakage and leave mutable buffers pinned unexpectedly.
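
To make the negative from_size and short-read items concrete: Python's in-memory and buffered file objects treat a negative read size as "read until EOF". The sketch below shows that behavior with io.BytesIO and one possible guard. `read_exact` is a hypothetical helper for illustration, not a function from detools.

```python
import io


def read_exact(fin, size):
    """Read exactly `size` bytes or raise; never forward a negative size."""
    if size < 0:
        raise ValueError(f"negative read size: {size}")
    data = fin.read(size)
    # Fail at the read itself instead of letting a short buffer travel on.
    if len(data) != size:
        raise ValueError(f"short read: wanted {size} bytes, got {len(data)}")
    return data


source = io.BytesIO(b"firmware-image")
print(len(source.read(-3)))   # 14: a negative size reads to EOF

source.seek(0)
print(read_exact(source, 8))  # b'firmware'
```

With a guard like this, both a negative size and a truncated source are reported at the read, before any diff bytes are combined with the data.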

Testing

  • Added direct regression tests for the C-library malformed patch cases in c/tst/test_fuzzer.c.
  • Added Python security regressions in tests/test_apply_security.py covering:
    • negative diff / extra / data-format sizes
    • invalid in-place header values
    • short source reads during in-place diff application
    • invalid heatshrink headers
  • Added CRLE regressions in tests/test_crle.py to verify repeated and scattered segments now decompress incrementally instead of over-buffering.
  • Added native wrapper regressions in tests/test_bsdiff.py covering:
    • undersized diff buffers
    • invalid suffix-array entries
    • oversized inputs past INT32_MAX
    • the add_bytes() buffer-release error path
  • Added native wrapper regressions in tests/test_suffix_array.py covering:
    • undersized output buffers
    • oversized inputs past INT32_MAX
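
For readers who want a feel for the incremental-decompression regressions, the following self-contained toy shows the general shape such a test might take. It does not reproduce tests/test_crle.py or the real CRLE format: `LazyRun` is a hypothetical stand-in for a decompressor that handles one repeated segment.

```python
import unittest


class LazyRun:
    """Toy decoder for one repeated segment: `count` copies of one byte."""

    def __init__(self, byte_value, count):
        if count < 0:
            raise ValueError(f"negative run length: {count}")
        self._byte = bytes([byte_value])
        self.remaining = count

    def read(self, size):
        """Return at most `size` bytes without materialising the whole run."""
        if size < 0:
            raise ValueError(f"negative read size: {size}")
        produced = min(size, self.remaining)
        self.remaining -= produced
        return self._byte * produced


class LazyRunTest(unittest.TestCase):

    def test_huge_run_is_not_materialised(self):
        # A 1 TiB logical run must still serve a 4-byte request cheaply.
        run = LazyRun(0x41, 2 ** 40)
        self.assertEqual(run.read(4), b"AAAA")
        self.assertEqual(run.remaining, 2 ** 40 - 4)

    def test_run_ends_cleanly(self):
        run = LazyRun(0x42, 3)
        self.assertEqual(run.read(10), b"BBB")
        self.assertEqual(run.read(10), b"")

    def test_negative_count_is_rejected(self):
        with self.assertRaises(ValueError):
            LazyRun(0x41, -1)
```

Saved as a module, the cases can be run with `python -m unittest`. The key assertion is the first one: the amount of data produced is bounded by what the caller asked for, not by the run length stored in the stream.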

Owner

eerimoq commented Apr 7, 2026

Looks way too complicated to be correct =)

Author

jmecom commented Apr 7, 2026

Looks way too complicated to be correct =)

Happy to break it down into separate commits if you like? Pretty confident the findings are correct.

Author

jmecom commented Apr 10, 2026

Looks way too complicated to be correct =)

Happy to break it down into separate commits if you like? Pretty confident the findings are correct.

I went ahead and did this. Eyes appreciated. Given that embedded devices may use this library for firmware updates, this is a notable attack surface unless the device requires, for example, that delta patches be signed.

Owner

eerimoq commented Apr 10, 2026

If you allow anyone to upload delta firmware to your device, you probably have bigger problems as it seems your device is not protected by any authentication =)

Owner

eerimoq commented Apr 10, 2026

Split into separate PRs and I might review some. And if possible, put the corrupt-patch checks in helper functions to reduce the lines of code.
