Foundation fixes: sum (core+symbolic), string-keyed dicts, recursion/enabled/changes/partial-dict errors, fmt safety, soundness oracle by danwt · Pull Request #93 · danwt/specl

danwt · 2026-06-26T15:38:35Z

Foundation fixes for the core language, symbolic backend, and tooling. Each commit is atomic with tests; all touched-crate suites and the example corpus pass.

Features

sum() builtin over a Seq/Set of numbers or a dict's values (lexer→parser→types→IR→eval→VM). Conservation invariants (total == sum(balances)) were previously unwritable in core. Adds examples/other/conservation.specl.
Symbolic sum() encoding (symbolic: encode sum() over collections #92): the symbolic backend now unrolls sum into a chained Z3 Add (dict values, set members, comprehensions), so conservation invariants over unbounded Nat are proven inductively, not just BFS-checked.
String-keyed dicts (feat: support string-keyed maps #50): Dict[String, V] now auto-routes to BFS (keys are fixed by init; only values vary) instead of being forced to symbolic and rejected. String action parameters are rejected (no finite domain — previously mis-enumerated as ints). Adds examples/other/string-keys.specl.

Correctness / trust

Reject recursive functions instead of stack-overflow crash (functions are inlined). Includes a fix so nested non-recursive Min(Min(a,b),c) is not misflagged.
Reject enabled()/changes() at type-check (changes silently returned true, enabled errored internally).
Reject partial dict init (bug: partial dict init panics with out-of-bounds index instead of erroring #95) with a clear DictDomainMismatch instead of an out-of-bounds panic under symmetry reduction.
Subrange widening in unify (Range-typed const won't widen into a wider range type #88): 1..4 fits where 0..8 is expected; disjoint ranges still error.
Honest OK verdict: ignored liveness/fairness now shows OK (safety invariants only; N ... not checked).

Tooling

fmt never deletes comments (fmt deletes all comments #87, data-loss half): format_or_keep() leaves any commented file unchanged (CLI note; LSP makes no edit). Full comment formatting stays open in fmt deletes all comments #87.
Cross-tool soundness oracle (soundness: add a cross-tool differential oracle (vs TLC) #94): specl/tools/soundness/oracle.sh runs the same TLA+ spec through both TLC and specl and asserts verdict agreement. It immediately found a false-proof bug: the default symbolic path (IC3) reports OK on DieHard while BFS, BMC, and TLC all find the violation — filed as soundness: IC3 reports OK on a spec with a real violation (DieHard) #96 and allowlisted so the oracle stays green while guarding regressions.
Docs: parenthesizing membership in in let value position (let x = EXPR in BODY: membership in in value position requires parentheses #71).

Bugs found, filed (not fixed here — deeper investigations)

soundness: IC3 reports OK on a spec with a real violation (DieHard) #96 IC3 false proof (critical, soundness) — found by the oracle.
symbolic: support Set[Seq[T]] (needed for message-passing protocols) #35 symbolic Set[Seq] — left as-is: BFS handles it correctly and symbolic errors cleanly (no false proof). The issue itself documents this as fundamentally hard (exponential); adding new symbolic encoding given soundness: IC3 reports OK on a spec with a real violation (DieHard) #96 would be reckless.

Out of scope (per request)

Liveness implementation (#90) and module composition (#91) — not pursued.

Testing

New unit/integration tests for: sum (type + symbolic), subrange widening, enabled/changes rejection, recursion (direct/mutual/nested-ok), format_or_keep, partial-dict error, string-keyed dict + String-param rejection, symbolic conservation. Example corpus passes (incl. new conservation + string-keys specs). cargo test green across syntax/types/eval/ir/symbolic/mc/cli; clippy clean on changed lines.

🤖 Generated with Claude Code

Aggregates a Seq/Set of numbers, or the values of a dict, into an Int. Conservation invariants like `totalSupply == sum(balances)` were previously unwritable in core specl (only union_all existed; sums needed explicit per-index expansion). Wired through lexer, parser, pretty-printer, type checker, IR, tree-walk evaluator, and bytecode VM. The symbolic backend returns a clear error and auto-routing falls back to BFS (full Z3 sum encoding is a follow-up). Adds examples/other/conservation.specl and type-checker unit tests.

A range-typed const (e.g. `1..4`) failed to unify where a wider range was expected (e.g. a `Dict[_, 0..8]` value), though an integer literal in the same position widened fine. unify now accepts two ranges when one contains the other, matching the existing Int<->Range looseness; disjoint ranges remain an error. Fixes #88.

…e stack Functions are inlined at call sites, so a self- or mutually-recursive function inlined forever and aborted the process with a stack overflow. The compiler now tracks the inlining stack and returns a clear compile error on re-entry. Adds direct- and mutual-recursion regression tests.

…g value Neither is evaluated by the model checker yet: changes() silently returned true at runtime and enabled() raised an internal error. Both are now rejected at type-check time with a clear "not yet supported" message, so an invariant can't quietly hold for the wrong reason.

The recursion guard pushed the function name before compiling its arguments, so a nested non-recursive call like Min(Min(a, b), c) was wrongly rejected (caught by raft.specl in the example corpus). Compile the arguments in the caller's context first, then mark the function as being inlined only around its body.

…nored The checker warned that liveness/fairness declarations are ignored but still printed a bare "Result: OK", so a skimmer could read an ignored property as a passing one. The OK line now reads "OK (safety invariants only; N liveness/fairness declaration(s) not checked)" when any were ignored.

The AST-based formatter dropped every comment because the parser discards them as trivia, so `fmt --write` and the LSP format-on-save silently deleted all comments. Until comments are reattached to the syntax tree, format_or_keep() compares the comment set before and after formatting and returns the original source unchanged if any comment would be lost. The CLI prints a note; the LSP makes no edit. Comment-free files format as before. Addresses the data-loss half of #87 (full comment formatting remains open).

…position Document the workaround for the genuine `let x = EXPR in BODY` ambiguity (#71): parenthesize a membership test used as the bound value.

…ions The symbolic backend previously errored on sum() and fell back to BFS. It now unrolls sum into a chained Z3 Add: a dict sums its per-key value vars, a set sums member values, and a comprehension sums the element value where the filter holds. Conservation invariants (total == sum(balances)) over unbounded Nat are now proven inductively. Closes #92.

A dict whose init did not cover its declared key domain (e.g. Dict[0..5, _] filled only over 0..3) caused an out-of-bounds panic downstream — notably under symmetry reduction, which permutes over the full declared domain. The explorer now validates each dict variable's initial size against its declared key range and returns a clear DictDomainMismatch error. Closes #95.

String-keyed dicts (Dict[String, V]) were forced to the symbolic backend, which rejects non-range keys, so they could not be checked by default. Their keys are fixed by init and only the values vary, so BFS can enumerate them: analyze() no longer flags a non-range-key dict with bounded values as unbounded, and such specs auto-route to BFS. String-typed action parameters are rejected (no finite domain — previously mis-enumerated as integers); iterate keys(d) instead. Adds examples/other/string-keys.specl and unit tests. Closes #50.

Adds specl/tools/soundness/oracle.sh: runs the same TLA+ spec through both TLC and specl (auto-translating the .tla) and asserts the verdicts agree, catching unsound results that internal-consistency tests cannot. Ships a known-OK Counter pair and reuses DieHard from the comparison benchmarks. The oracle immediately found a false-proof bug: specl's default symbolic path (IC3) reports OK on DieHard while BFS, BMC, and TLC all find the violation (filed as #96, allowlisted so the oracle stays green while guarding regressions). Closes #94.

The cross-tool oracle lives in the private repo's soundness harness (which has a more comprehensive parser/minispec/e2e differential vs TLC). Keep the public repo free of the TLC/Java dependency.

danwt added 12 commits June 26, 2026 17:10

claude: docs(lang): note parenthesizing membership in in let value …

c03c5f5

…position Document the workaround for the genuine `let x = EXPR in BODY` ambiguity (#71): parenthesize a membership test used as the bound value.

danwt changed the title ~~Foundation fixes: sum() builtin, recursion/enabled/changes errors, range widening, fmt comment safety~~ Foundation fixes: sum (core+symbolic), string-keyed dicts, recursion/enabled/changes/partial-dict errors, fmt safety, soundness oracle Jun 26, 2026

claude: chore: remove TLC soundness oracle from public repo

2731a74

The cross-tool oracle lives in the private repo's soundness harness (which has a more comprehensive parser/minispec/e2e differential vs TLC). Keep the public repo free of the TLC/Java dependency.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Foundation fixes: sum (core+symbolic), string-keyed dicts, recursion/enabled/changes/partial-dict errors, fmt safety, soundness oracle#93

Foundation fixes: sum (core+symbolic), string-keyed dicts, recursion/enabled/changes/partial-dict errors, fmt safety, soundness oracle#93
danwt wants to merge 13 commits into
mainfrom
danwt/claude/foundation-fixes

danwt commented Jun 26, 2026 •

edited

Loading

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

danwt commented Jun 26, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Features

Correctness / trust

Tooling

Bugs found, filed (not fixed here — deeper investigations)

Out of scope (per request)

Testing

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

danwt commented Jun 26, 2026 •

edited

Loading