BUG: seqfold 0.10.0 panics when 0.9.0 would return -inf #31

@maxtidev

Since version 0.10.0 (the Rust rewrite), seqfold panics for inputs that returned -inf in version 0.9.0 (among other differences, see below).

Version 0.9.0

$ python --version
Python 3.12.12
$ seqfold --version
seqfold 0.9.0
$ python -c "from seqfold import dg; print(dg('CC'))"
-inf

Version 0.10.0

$ python --version
Python 3.12.12
$ seqfold --version
seqfold 0.10.0
$ python -c "from seqfold import dg; print(dg('CC'))" 

thread '<unnamed>' (252846) panicked at src/core/fold.rs:767:22:
index out of bounds: the len is 2 but the index is 2
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
Traceback (most recent call last):
  File "<string>", line 1, in <module>
pyo3_runtime.PanicException: index out of bounds: the len is 2 but the index is 2

I built a simple fuzzer to get a deeper understanding of the issue. Feel free to also use this in your CI etc. if you'd like:

fuzz.py

import itertools
from seqfold import dg

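ALPHABET = "ACTG"
MIN_LENGTH = 1   # length-1 inputs are included; they account for the IndexError cases in the results
MAX_LENGTH = 10  # exclusive upper bound, so the longest sequence has 9 bases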

def dna_fuzzer():
    for i in range(MIN_LENGTH, MAX_LENGTH):
        for comb in itertools.product(ALPHABET, repeat=i):
            yield "".join(comb)

for seq in dna_fuzzer():
    try:
        res = dg(seq)
    except BaseException as e:  # pyo3's PanicException derives from BaseException, not Exception
        print(type(e))
    else:
        print(res)
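
Because fuzz.py prints only the outcome, the sequence behind each differing line has to be recovered by position. A minimal sketch of a variant that keeps the sequence next to its outcome follows; `fuzz_outcomes` and its parameters are hypothetical names introduced here, not part of seqfold, and the function under test is passed in as an argument so the sketch does not depend on any particular seqfold version.

```python
import itertools
from typing import Callable, Iterator, Tuple


def fuzz_outcomes(
    fn: Callable[[str], float],
    alphabet: str = "ACTG",
    min_length: int = 1,
    max_length: int = 3,
) -> Iterator[Tuple[str, str]]:
    """Yield (sequence, outcome) for every sequence of min_length..max_length bases.

    Unlike range() in fuzz.py, max_length is inclusive here (an assumption of
    this sketch). The outcome is repr(result) or "<ExceptionName>".
    """
    for n in range(min_length, max_length + 1):
        for comb in itertools.product(alphabet, repeat=n):
            seq = "".join(comb)
            try:
                outcome = repr(fn(seq))
            except KeyboardInterrupt:
                # Let Ctrl-C through so a long run can still be stopped.
                raise
            except BaseException as e:
                # BaseException is needed to catch pyo3's PanicException.
                outcome = f"<{type(e).__name__}>"
            yield seq, outcome
```

With `dg` imported from seqfold, something like `for seq, out in fuzz_outcomes(dg, max_length=9): print(seq, out, sep="\t")` would produce one tab-separated line per input, which makes the two output files diffable by sequence rather than by line number.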

With version 0.9.0

$ time python fuzz.py > fuzz-out-0.9.0.txt
python fuzz.py > fuzz-out-0.9.0.txt  14,59s user 0,01s system 99% cpu 14,677 total

With version 0.10.0

$ time python fuzz.py > fuzz-out-0.10.0.txt
[...]
thread '<unnamed>' (282237) panicked at src/core/fold.rs:767:22:
index out of bounds: the len is 9 but the index is 9
python fuzz.py > fuzz-out-0.10.0.txt  1,84s user 0,05s system 98% cpu 1,920 total

result.py

from collections import Counter

FUZZ1_FILE = "fuzz-out-0.9.0.txt"
FUZZ2_FILE = "fuzz-out-0.10.0.txt"

with open(FUZZ1_FILE) as f1, open(FUZZ2_FILE) as f2:
    fuzz1 = f1.readlines()
    fuzz2 = f2.readlines()

assert len(fuzz1) == len(fuzz2)

equal = [r1 for r1, r2 in zip(fuzz1, fuzz2) if r1 == r2]
unequal = [(r1, r2) for r1, r2 in zip(fuzz1, fuzz2) if r1 != r2]
print(f"{len(equal)=}\n{len(unequal)=}\n")

unequal_counts = Counter(unequal)
print(f"{unequal_counts=}")

$ python result.py
len(equal)=333096
len(unequal)=16428

unequal_counts=Counter({('-inf\n', "<class 'pyo3_runtime.PanicException'>\n"): 16304, ('-0.0\n', '0.0\n'): 120, ("<class 'IndexError'>\n", "<class 'pyo3_runtime.PanicException'>\n"): 4})

As the counts show, the output differs between version 0.9.0 and 0.10.0 in three ways:

  • Inputs that returned -inf in 0.9.0 now panic (16304 cases).
  • Inputs that returned -0.0 in 0.9.0 now return 0.0 (120 cases; a positive change in my mind, but still a difference).
  • Inputs of length one (e.g. dg("A")) previously raised an IndexError and now cause a Rust panic (4 cases).
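
To separate the cosmetic difference (sign of zero) from the behavioural ones, the comparison could parse numeric lines as floats before comparing them. The following is a rough sketch under that assumption; `classify` and `summarize` are hypothetical helper names, and the input is assumed to be the line-per-input format that fuzz.py writes.

```python
from collections import Counter
from typing import Iterable


def classify(old: str, new: str) -> str:
    """Label one pair of output lines from the two fuzz runs."""
    old, new = old.strip(), new.strip()
    if old == new:
        return "equal"
    try:
        a, b = float(old), float(new)
    except ValueError:
        # At least one side is an exception marker such as "<class 'IndexError'>".
        return f"{old} -> {new}"
    if a == b:
        # Covers -0.0 vs 0.0, which compare equal as floats.
        return "numerically equal"
    return f"{old} -> {new}"


def summarize(old_lines: Iterable[str], new_lines: Iterable[str]) -> Counter:
    """Count each kind of difference across two runs of equal length."""
    return Counter(classify(o, n) for o, n in zip(old_lines, new_lines))
```

Applied to the two output files, the signed-zero cases would then land in a single "numerically equal" bucket, leaving only the panics as real regressions to look at.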
