Complete architecture roadmap improvements #131

Merged: mxsm merged 1 commit into main from mxsm/architecture-roadmap-v21 on Jun 24, 2026

@mxsm (Owner) commented Jun 24, 2026

Summary

Closes #130.

This PR completes the architecture reassessment roadmap:

  • aligns SIMD docs and public API behavior by documenting the actual memchr/memmem search path and adding try_substring error reporting
  • adds serde support for CheetahStr
  • extracts shared inline string storage and splits focused CheetahString internals into representation and pattern modules
  • removes unconnected SIMD search dead code
  • adds packed validation evidence, packed benchmarks, and repeatable validation tooling
  • documents the current x86_64 SSE2-only SIMD strategy and the gates required before adding AArch64/NEON support
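
To make the try_substring error reporting concrete, here is a minimal std-only sketch. It is not the crate's code: the free function `try_substring`, the type name `SubstringError`, the fields of `IndexOutOfBounds` and `InvalidCharBoundary`, the message wording, and the exact order of the checks are all assumptions made for illustration. Only the three variant names and `InvalidRange { start, end }` come from this PR.

```rust
use std::fmt;

/// Hypothetical error type. Only `InvalidRange { start, end }` is spelled out
/// in this PR; the fields of the other two variants are assumed.
#[derive(Debug, PartialEq, Eq)]
enum SubstringError {
    InvalidRange { start: usize, end: usize },
    IndexOutOfBounds { index: usize, len: usize },
    InvalidCharBoundary { index: usize },
}

impl fmt::Display for SubstringError {
    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
        match self {
            SubstringError::InvalidRange { start, end } => {
                write!(f, "invalid range: start {} is greater than end {}", start, end)
            }
            SubstringError::IndexOutOfBounds { index, len } => {
                write!(f, "index {} is out of bounds for length {}", index, len)
            }
            SubstringError::InvalidCharBoundary { index } => {
                write!(f, "index {} is not on a UTF-8 char boundary", index)
            }
        }
    }
}

/// Validates in a fixed order (assumed here): range shape, then bounds,
/// then UTF-8 char boundaries. Offsets are byte offsets.
fn try_substring(s: &str, start: usize, end: usize) -> Result<&str, SubstringError> {
    if start > end {
        return Err(SubstringError::InvalidRange { start, end });
    }
    if end > s.len() {
        return Err(SubstringError::IndexOutOfBounds { index: end, len: s.len() });
    }
    if !s.is_char_boundary(start) {
        return Err(SubstringError::InvalidCharBoundary { index: start });
    }
    if !s.is_char_boundary(end) {
        return Err(SubstringError::InvalidCharBoundary { index: end });
    }
    Ok(&s[start..end])
}

fn main() {
    let s = "héllo"; // 'é' occupies bytes 1..3, so the string is 6 bytes long
    println!("{:?}", try_substring(s, 1, 3));
    println!("{:?}", try_substring(s, 3, 1));
    println!("{:?}", try_substring(s, 0, 9));
    println!("{:?}", try_substring(s, 0, 2));
}
```

Because each check returns early, a caller sees exactly one error per call, and a reversed range is reported before any bounds or boundary problem. A panicking substring could be layered on top by unwrapping this result, which is roughly the relationship the PR describes between substring and try_substring.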

Validation

Local validation passed:

  • cargo fmt --all
  • cargo check --all-features --all-targets
  • cargo test --all-features
  • cargo clippy --all-features --all-targets -- -D warnings
  • cargo check --no-default-features
  • scripts/verify-packed.ps1 -RunMiri -RunSanitizer -RunFuzz -RunBench -FuzzRuns 100

Packed evidence is archived at bench-results/packed-evidence/20260624-130416/summary.md with total failing required gates: 0.

Release follow-up

After this PR merges, bump Cargo.toml and README examples to 2.1.0, then trigger the existing Release workflow with version 2.1.0.

Summary by CodeRabbit

  • New Features

    • Added support for an experimental packed-string mode, including a new benchmark and broader evidence capture for testing, fuzzing, and performance checks.
    • Added serde serialization and deserialization support for CheetahStr values.
  • Bug Fixes

    • Improved substring error handling with clearer failures for invalid ranges and boundaries.
    • Updated string handling to better manage short inline values and conversions.
  • Documentation

    • Added guidance for the packed evidence workflow and clarified SIMD/search behavior.

@coderabbitai (Bot) commented Jun 24, 2026

Caution

Review failed

Pull request was closed or merged during review

Walkthrough

This PR implements the v2.1.0 architecture roadmap: extracts a shared InlineStr type for fixed-capacity inline storage, splits CheetahString internals into repr and pattern submodules, adds Error::InvalidRange and stepwise try_substring validation, removes SIMD substring search, adds CheetahStr serde support, adds a packed Criterion benchmark suite, and introduces a PowerShell verification script with archived evidence.
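
The shared inline storage described above might look roughly like the following sketch. It assumes a length byte plus a 23-byte buffer; the field layout, the `Option` and `bool` return types, and the safe `from_utf8` check are illustrative choices, not the signatures in src/inline.rs.

```rust
/// Capacity stated in the walkthrough: 23 bytes of inline UTF-8.
const INLINE_CAPACITY: usize = 23;

/// Hypothetical layout: one length byte plus a fixed buffer (24 bytes total).
#[derive(Clone, Copy, Debug)]
struct InlineStr {
    len: u8,
    data: [u8; INLINE_CAPACITY],
}

impl InlineStr {
    fn empty() -> Self {
        InlineStr { len: 0, data: [0; INLINE_CAPACITY] }
    }

    /// Returns `None` when `s` does not fit inline (assumed signature).
    fn from_str(s: &str) -> Option<Self> {
        let mut out = Self::empty();
        if out.push_str(s) { Some(out) } else { None }
    }

    /// Appends `s` if it fits; otherwise returns `false` and leaves `self` unchanged.
    fn push_str(&mut self, s: &str) -> bool {
        let old = self.len as usize;
        let new = old + s.len();
        if new > INLINE_CAPACITY {
            return false;
        }
        self.data[old..new].copy_from_slice(s.as_bytes());
        self.len = new as u8; // new <= 23, so the cast cannot truncate
        true
    }

    fn len(&self) -> usize { self.len as usize }
    fn is_empty(&self) -> bool { self.len == 0 }
    fn as_bytes(&self) -> &[u8] { &self.data[..self.len as usize] }

    fn as_str(&self) -> &str {
        // Only whole `&str` values are ever copied in, so the prefix is valid UTF-8.
        std::str::from_utf8(self.as_bytes()).expect("inline bytes are valid UTF-8")
    }

    fn into_string(self) -> String { self.as_str().to_owned() }
}

fn main() {
    let mut s = InlineStr::from_str("topic").expect("fits inline");
    println!("pushed: {}", s.push_str("-a"));
    println!("{} ({} bytes, empty: {})", s.as_str(), s.len(), s.is_empty());
    // A caller would fall back to a heap-backed variant when this returns false.
    println!("pushed: {}", s.push_str("this suffix is far too long to fit"));
    println!("{}", s.into_string());
}
```

Having the inline variants of both string types delegate to one such type keeps the capacity check and the UTF-8 invariant in a single place, which appears to be the motivation for extracting it.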

Changes

v2.1.0 Architecture Improvements

| Layer | File(s) | Summary |
|---|---|---|
| InlineStr type and Error::InvalidRange contract | src/inline.rs, src/error.rs, src/lib.rs | Introduces InlineStr (23-byte fixed-capacity inline UTF-8 storage with empty, from_str, push_str, into_string, and read accessors) and adds Error::InvalidRange { start, end } with a matching Display arm. Registers mod inline; in the crate module tree. |
| repr.rs and pattern.rs internal modules | src/cheetah_string/repr.rs, src/cheetah_string/pattern.rs | repr.rs defines InnerString (Inline/Static/Shared/Owned variants) and re-exports INLINE_CAPACITY. pattern.rs defines sealed StrPattern/SplitPattern traits, StrPatternImpl, SplitWrapper, and SplitStr with char-optimised split paths and a runtime panic for reverse string-pattern iteration. |
| CheetahStr and CheetahString refactored to use InlineStr | src/cheetah_str.rs, src/cheetah_string.rs | Repr::Inline and InnerString::Inline variants now wrap InlineStr instead of raw { len, data } fields. All construction (empty, from_slice, from_string, from_builder_string, etc.) and read paths (as_str, as_bytes, len, is_empty, push_str_internal, reserve) delegate to InlineStr methods. CheetahString imports InnerString/INLINE_CAPACITY from the new repr submodule. |
| try_substring validation and substring panic update | src/cheetah_string.rs, tests/serde.rs | try_substring gains explicit ordered validation returning InvalidRange, IndexOutOfBounds, and InvalidCharBoundary error variants. substring docs reference try_substring for recoverable errors. Serde tests verify the new error variants and cover inline/shared CheetahStr JSON round-trips. |
| SIMD substring search removal and scope clarification | src/simd.rs, src/cheetah_string.rs, tests/simd.rs, docs/simd-portability.md | Removes find_bytes, find_bytes_sse2, find_byte_sse2, and their tests from simd.rs. Updates contains/find doc comments to state the memchr/memmem backend. Renames SIMD test functions to include an explicit feature qualifier. Adds docs/simd-portability.md documenting x86_64 SSE2-only scope and AArch64/NEON evidence gates. |
| Serde support for CheetahStr | src/serde.rs | Adds Serialize and Deserialize impls for CheetahStr, a cheetah_str deserializer function handling &str/String/bytes/Vec<u8>, and changes CheetahString deserialization to use the DeError alias consistently. |
| Packed benchmark and Cargo wiring | Cargo.toml, benches/packed.rs | Adds benches/packed.rs with bench_packed_construction, bench_packed_push_str, and bench_packed_topic_insert Criterion groups comparing CheetahString and PackedCheetahString, gated behind experimental-packed. Registers the bench target in Cargo.toml. |
| Verification script, evidence artifacts, and docs | scripts/verify-packed.ps1, bench-results/..., src/unsafe_proof.md, bench-results/README.md | Adds scripts/verify-packed.ps1 with Invoke-Gate, Add-AsanRuntimeToPath, and conditional Miri/ASan/fuzz/bench gate runners. Archives the 20260624-130416 evidence run (test, Miri, ASan, fuzz, bench logs and summary) under bench-results/packed-evidence/. Updates README.md and unsafe_proof.md to document the evidence layout and script invocation. |
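
The sealed StrPattern/SplitPattern traits listed for pattern.rs follow a common Rust idiom, sketched below with std only. This is a simplified illustration under stated assumptions: the method names `find_in` and `split_all` and the helper `split_on` are hypothetical, and the crate's StrPatternImpl, SplitWrapper, and SplitStr types are not modelled.

```rust
mod sealed {
    /// Supertrait hidden in a private module: downstream crates cannot name
    /// it, so they cannot implement `StrPattern` for their own types.
    pub trait Sealed {}
    impl Sealed for char {}
    impl<'a> Sealed for &'a str {}
}

/// Hypothetical stand-in for a sealed pattern trait.
pub trait StrPattern: sealed::Sealed {
    /// Byte offset of the first match in `haystack`, if any.
    fn find_in(&self, haystack: &str) -> Option<usize>;
    /// Splits `haystack` on every match of the pattern.
    fn split_all<'h>(&self, haystack: &'h str) -> Vec<&'h str>;
}

impl StrPattern for char {
    fn find_in(&self, haystack: &str) -> Option<usize> {
        haystack.find(*self)
    }
    fn split_all<'h>(&self, haystack: &'h str) -> Vec<&'h str> {
        // Single-char path: std's char splitter, with no substring search involved.
        haystack.split(*self).collect()
    }
}

impl<'a> StrPattern for &'a str {
    fn find_in(&self, haystack: &str) -> Option<usize> {
        haystack.find(*self)
    }
    fn split_all<'h>(&self, haystack: &'h str) -> Vec<&'h str> {
        haystack.split(*self).collect()
    }
}

/// Generic entry point: accepts only the pattern types sealed above.
fn split_on<'h, P: StrPattern>(haystack: &'h str, pat: P) -> Vec<&'h str> {
    pat.split_all(haystack)
}

fn main() {
    println!("{:?}", split_on("a,b,,c", ','));
    println!("{:?}", split_on("a::b::c", "::"));
    println!("{:?}", ','.find_in("a,b"));
}
```

Sealing lets the crate dispatch statically on a closed set of pattern types, so a dedicated char path can be added or changed later without it being a breaking change for external implementors.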

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

  • mxsm/cheetah-string#97: Tightens SIMD conditional compilation for contains/find to x86_64-only, overlapping directly with this PR's removal of the SIMD substring search helpers and scope documentation.
  • mxsm/cheetah-string#101: Refactors the same core src/cheetah_string.rs inline/representation strategy and touches src/simd.rs, so changes overlap at the function and variant level with this PR.
  • mxsm/cheetah-string#103: Overhauled string-pattern splitting logic and added tests/split_edge_cases coverage, directly related to this PR's new pattern.rs module with SplitPattern/SplitStr and reverse-iteration panic behavior.

Suggested reviewers

  • TeslaRustor

Poem

🐇 Hoppity-hop through inline bytes so neat,
InlineStr now stands on its own two feet!
The SIMD search hath hopped away,
memchr and memmem are here to stay.
With evidence archived in tidy rows,
The packed bench data merrily flows~ 🎉

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 inconclusive)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Title check | ❓ Inconclusive | The title is related to the PR, but it is too generic to convey the main change clearly. | Use a concise, specific title that names the main change, such as packed validation, serde, or SIMD cleanup. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Linked Issues check | ✅ Passed | The PR covers the listed goals: SIMD docs, CheetahStr serde, substring errors, module refactors, dead-code removal, and packed evidence tooling. |
| Out of Scope Changes check | ✅ Passed | The diff appears focused on the roadmap objectives, with no clear unrelated feature or refactor additions. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%. |

@mxsm mxsm merged commit 6862eae into main Jun 24, 2026
6 of 7 checks passed
@mxsm mxsm mentioned this pull request Jun 24, 2026

Development

Successfully merging this pull request may close these issues.

Complete architecture roadmap improvements and release v2.1.0

1 participant