docs: add offline quickstart smoke proof #235

Open
Gradata wants to merge 3 commits into main from gra-1783-offline-smoke

Conversation

Gradata (Owner) commented May 29, 2026

Summary

  • add scripts/smoke_quickstart.py, an offline SDK/CLI smoke proof for Show HN readers
  • cover the smoke path with tests/test_quickstart_smoke.py
  • document the pre-cloud smoke command in README and docs quickstart

Verification

  • python3 scripts/smoke_quickstart.py → PASS; created temp local brain, recorded correction, recalled rules, rendered manifest with sessions_trained: 1
  • python3 -m pytest tests/test_quickstart_smoke.py → 1 passed
  • uv run --extra dev ruff check scripts/smoke_quickstart.py tests/test_quickstart_smoke.py → All checks passed

Paperclip: GRA-1783

coderabbitai Bot commented May 29, 2026

Review Change Stack

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 174efc37-6497-43d1-97e3-cdfb1c71f7df

📥 Commits

Reviewing files that changed from the base of the PR and between e18e9f9 and 49ff246.

📒 Files selected for processing (1)
  • Gradata/src/gradata/cli.py
📜 Recent review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (8)
  • GitHub Check: pytest windows-latest / py3.11
  • GitHub Check: pytest windows-latest / py3.12
  • GitHub Check: pytest ubuntu-latest / py3.11
  • GitHub Check: pytest ubuntu-latest / py3.12
  • GitHub Check: pytest macos-latest / py3.11
  • GitHub Check: pytest macos-latest / py3.12
  • GitHub Check: pytest (py3.11)
  • GitHub Check: pytest (py3.12)
🧰 Additional context used
📓 Path-based instructions (1)
Gradata/src/**/*.py

📄 CodeRabbit inference engine (Gradata/AGENTS.md)

Gradata/src/**/*.py: Prefer sentence-transformers for local embeddings, google-genai for Gemini embeddings, cryptography for AES-GCM encrypted system.db, bm25s for BM25 rule ranking, and mem0ai for external memory adapters — guard all optional dependency imports with try / except ImportError at the call site, never at module level
Maintain strict layering: Layer 0 (Primitives: _types.py, _db.py, _events.py, _paths.py, _file_lock.py; Patterns: contrib/patterns/) must never import from Layer 1 (Enhancements: enhancements/, rules/) or Layer 2 (Public API: brain.py, cli.py, daemon.py, mcp_server.py)
Never use bare except: pass — use typed exceptions or at minimum logger.warning(...) with exc_info=True to avoid silent failure in a memory product
Never import from out-of-scope sibling directories ../Sprites/ or ../Hausgem/ within gradata/* code — that is a layering bug
Never leak private-sibling paths into public docs/code — no references to ../Sprites/, ../Hausgem/, email addresses, OneDrive paths, or Sprites-specific examples from inside gradata/*
Use atomic-write helper when writing JSON files to prevent corruption from mid-write crashes

Files:

  • Gradata/src/gradata/cli.py
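
The atomic-write rule in the path instructions above can be illustrated with a minimal sketch. The repository's own helper is not shown in this thread, so the name `atomic_write_json` and its signature are hypothetical; the sketch only assumes the usual "write to a temp file in the same directory, then rename" approach.

```python
import contextlib
import json
import os
import tempfile
from pathlib import Path


def atomic_write_json(path: Path, payload: dict) -> None:
    """Hypothetical helper: write JSON to a sibling temp file, then rename it into place."""
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    # The temp file lives in the target directory so the rename stays on one filesystem.
    fd, tmp_name = tempfile.mkstemp(dir=path.parent, prefix=path.name + ".", suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            json.dump(payload, handle, indent=2)
            handle.flush()
            os.fsync(handle.fileno())  # push bytes to disk before the swap
        # os.replace overwrites the destination; the rename is atomic on POSIX.
        os.replace(tmp_name, path)
    except BaseException:
        # Remove the partial temp file, then re-raise so the failure is never silent.
        with contextlib.suppress(FileNotFoundError):
            os.unlink(tmp_name)
        raise
```

A crash before `os.replace` leaves the previous file untouched, which is the property the rule is after: readers see either the old JSON or the new JSON, never a half-written one.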
🔇 Additional comments (2)
Gradata/src/gradata/cli.py (2)

517-548: LGTM!


885-885: LGTM!


📝 Walkthrough
  • Added offline smoke test script: Gradata/scripts/smoke_quickstart.py — end-to-end offline SDK/CLI smoke proof that creates a temporary brain, records a correction, performs a recall, and renders a manifest using only stdlib + local package; no network or credentials required.
  • Public/test helpers: smoke(tmp_root: Path) -> dict[str, object] (test helper) and main() -> int (CLI entrypoint) added.
  • New test: Gradata/tests/test_quickstart_smoke.py — pytest validates DB creation, sessions_trained, and expected CLI invocations.
  • Documentation: README.md and docs/getting-started/quickstart.md include an optional “prove the local quickstart works offline” step instructing users to run the smoke script before configuring cloud sync/API keys.
  • .gitignore: added an exception so Gradata/scripts/smoke_quickstart.py is not ignored and is tracked in the repository.
  • CLI behavior change (install verification): _cmd_install_agent now verifies installs by querying recent CORRECTION events (verification_brain.query_events(...)) and checking serialized event data for the verification marker instead of searching rule search results.
  • Minor CLI numeric computation tweak: cmd_prove adjusted a linear-regression numerator to use zip(..., strict=False).
  • Verification performed: smoke script run passed, pytest passed, and ruff checks passed.
  • No breaking changes or security fixes introduced.
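
For readers who have not opened the script, the shape described above (isolated temp directories, a subprocess wrapper, a structured result) might look roughly like the sketch below. It is not the PR's code: `run_isolated` and `smoke_sketch` are hypothetical names, and a trivial `python -c` command stands in for the real `python -m gradata.cli` sequence, whose exact arguments are not shown in this thread.

```python
import os
import subprocess
import sys
from pathlib import Path


def run_isolated(args: list[str], home_dir: Path, timeout: int = 30) -> subprocess.CompletedProcess:
    """Run one command with HOME pointed at a throwaway directory."""
    env = os.environ.copy()
    env["HOME"] = str(home_dir)
    return subprocess.run(
        args,
        env=env,
        text=True,
        capture_output=True,
        timeout=timeout,
        check=True,  # a non-zero exit raises CalledProcessError
    )


def smoke_sketch(tmp_root: Path) -> dict[str, object]:
    """Hypothetical stand-in for smoke(tmp_root): run commands, return structured results."""
    home_dir = tmp_root / "home"
    home_dir.mkdir(parents=True, exist_ok=True)
    # Placeholder command; the real script runs a sequence of CLI invocations here.
    cmd = [sys.executable, "-c", "import os; print(os.environ['HOME'])"]
    proc = run_isolated(cmd, home_dir)
    return {"commands": [" ".join(cmd)], "stdout": proc.stdout.strip()}
```

Returning the executed command strings is what lets a test assert on which invocations ran, as the new pytest does.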

Walkthrough

This PR adds an offline smoke test script with a helper and a pytest, documents the optional offline quickstart step, adds a .gitignore exception so the script is tracked, and changes the CLI install verification to inspect recent CORRECTION events instead of searching rule text.

Changes

Offline quickstart smoke test

| Layer / File(s) | Summary |
| --- | --- |
| Smoke test script implementation: Gradata/scripts/smoke_quickstart.py | Script creates isolated temp dirs/env, runs a sequence of python -m gradata.cli commands (init, correct draft+final, recall, manifest JSON, stats), validates system.db creation, parses manifest JSON, and returns structured result data. Includes _run() subprocess wrapper and main() entrypoint. |
| Documentation and distribution: .gitignore, Gradata/README.md, Gradata/docs/getting-started/quickstart.md | Adds a .gitignore exception so Gradata/scripts/smoke_quickstart.py is included; README and quickstart docs add an optional offline verification step with the exact python3 scripts/smoke_quickstart.py command and a brief description of what it does. |
| Test coverage: Gradata/tests/test_quickstart_smoke.py | Adds a pytest that invokes the smoke helper against a temporary path and asserts that a database was created, sessions_trained is present, and expected CLI command strings were executed. |
| CLI install verification tweak: Gradata/src/gradata/cli.py | Changes the post-install verification check to query recent CORRECTION events (query_events(event_type="CORRECTION", ...)) and scan event data for the verification marker instead of searching rules via verification_brain.search(...); updates warning wording accordingly and adjusts a zip(..., strict=False) usage in cmd_prove. |
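
The install-verification change amounts to "serialize each recent CORRECTION event and look for the marker". A minimal sketch of that idea follows; the event shape (a dict with `type` and `data` keys) and the function name `marker_in_events` are assumptions for illustration, since the return type of `query_events` is not shown here. The marker is assumed to be plain ASCII without quotes or backslashes, so JSON escaping cannot hide it.

```python
import json


def marker_in_events(events: list[dict], marker: str) -> bool:
    """Return True if any CORRECTION event's serialized data contains the marker."""
    for event in events:
        # Assumed shape: {"type": "...", "data": {...}}; other event types are ignored.
        if event.get("type") != "CORRECTION":
            continue
        serialized = json.dumps(event.get("data", {}), sort_keys=True, default=str)
        if marker in serialized:
            return True
    return False
```

Scanning the serialized payload means the check does not depend on which field the marker ended up in, and it no longer depends on rule search ranking.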

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

  • Gradata/gradata#189: Also modifies Gradata/src/gradata/cli.py install verification/readback path, related to marker readability logic.
  • Gradata/gradata#210: Overlaps in cmd_prove changes and install verification/readback adjustments in Gradata/src/gradata/cli.py.
  • Gradata/gradata#124: Updates .gitignore patterns for publish exceptions and may overlap with the .gitignore change in this PR.

Suggested labels

docs

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 50.00%, which is below the required threshold of 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title accurately summarizes the main change: adding an offline quickstart smoke proof with documentation and tests. |
| Description check | ✅ Passed | The description is directly related to the changeset, clearly outlining the addition of the smoke test script, test coverage, and documentation updates. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |


Warning

There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure.

🔧 OpenGrep (1.22.0)

OpenGrep fatal error (exit code 2):
┌──────────────┐
│ Opengrep CLI │
└──────────────┘

✔ Opengrep OSS
✔ Basic security coverage for first-party code vulnerabilities.

Loading rules from local config...
[00.18][ERROR]: Error: exception Glob.Lexer.Syntax_error("malformed glob pattern: missing ']'")
Raised at Glob__Lexer.syntax_error in file "libs/glob/Lexer.mll", line 8, characters 2-26
Called from Glob__Lexer.__ocaml_lex_token_rec in file "libs/glob/Lexer.mll", line 29, characters 26-53
Cal


Comment @coderabbitai help to get the list of available commands and usage tips.

@coderabbitai coderabbitai Bot added the docs label May 29, 2026

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@Gradata/scripts/smoke_quickstart.py`:
- Line 35: The hard-coded timeout=30 in smoke_quickstart.py makes the smoke run
flaky on slow CI; make the timeout configurable by replacing the literal with a
variable (e.g., cmd_timeout) that is populated from a config source such as an
environment variable (SMOKE_CMD_TIMEOUT) or a CLI argument with a default of 30,
parse it to int, and use that variable wherever timeout=30 is currently passed
(search for the exact token "timeout=30" in smoke_quickstart.py and update the
surrounding function that runs subprocesses or commands to accept/consume the
configurable timeout).
- Line 57: The env construction currently overwrites PYTHONPATH by setting
"PYTHONPATH": str(SRC); instead prepend SRC to any existing PYTHONPATH instead
of replacing it: read the current value via os.environ.get("PYTHONPATH", ""),
join SRC and the existing value with os.pathsep (handling empty existing value
so you don't add a trailing separator), and set that combined string as the
"PYTHONPATH" entry in the env dict where SRC is referenced.

In `@Gradata/tests/test_quickstart_smoke.py`:
- Line 14: The assertion for result["sessions_trained"] only checks for not None
and would accept 0 (a no-op run); change the check on the test that contains
assert result["sessions_trained"] is not None to assert that
result["sessions_trained"] is a positive integer (e.g., assert
isinstance(result["sessions_trained"], int) and result["sessions_trained"] > 0
or simply assert result["sessions_trained"] >= 1) so the test fails when no
training sessions were actually run.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 484650c1-c1f6-4c47-b7f2-42a56603c65a

📥 Commits

Reviewing files that changed from the base of the PR and between a197bff and c5cc810.

📒 Files selected for processing (5)
  • .gitignore
  • Gradata/README.md
  • Gradata/docs/getting-started/quickstart.md
  • Gradata/scripts/smoke_quickstart.py
  • Gradata/tests/test_quickstart_smoke.py
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (8)
  • GitHub Check: pytest windows-latest / py3.12
  • GitHub Check: pytest ubuntu-latest / py3.12
  • GitHub Check: pytest macos-latest / py3.11
  • GitHub Check: pytest windows-latest / py3.11
  • GitHub Check: pytest macos-latest / py3.12
  • GitHub Check: pytest ubuntu-latest / py3.11
  • GitHub Check: pytest (py3.12)
  • GitHub Check: pytest (py3.11)
🧰 Additional context used
📓 Path-based instructions (1)
Gradata/tests/**/*.py

📄 CodeRabbit inference engine (Gradata/AGENTS.md)

Gradata/tests/**/*.py: Set BRAIN_DIR environment variable via tmp_path in conftest.py for test isolation — ensure _paths.py module cache refreshes when calling Brain.init() directly inside tests
Add unit tests in tests/test_*.py for every CI push without LLM calls (deterministic); mark integration tests with @pytest.mark.integration and skip them by default (they hit real LLM APIs)

Files:

  • Gradata/tests/test_quickstart_smoke.py
🔇 Additional comments (3)
.gitignore (1)

178-178: LGTM!

Gradata/README.md (1)

109-116: LGTM!

Gradata/docs/getting-started/quickstart.md (1)

5-14: LGTM!

env=env,
text=True,
capture_output=True,
timeout=30,

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Make command timeout configurable to reduce flaky failures.

A fixed 30s timeout is brittle for slower CI/dev hosts; one slow command will fail the full smoke run.

Proposed fix
-        timeout=30,
+        timeout=int(env.get("GRADATA_SMOKE_TIMEOUT_SEC", "60")),
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@Gradata/scripts/smoke_quickstart.py` at line 35, The hard-coded timeout=30 in
smoke_quickstart.py makes the smoke run flaky on slow CI; make the timeout
configurable by replacing the literal with a variable (e.g., cmd_timeout) that
is populated from a config source such as an environment variable
(SMOKE_CMD_TIMEOUT) or a CLI argument with a default of 30, parse it to int, and
use that variable wherever timeout=30 is currently passed (search for the exact
token "timeout=30" in smoke_quickstart.py and update the surrounding function
that runs subprocesses or commands to accept/consume the configurable timeout).
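
One caveat with the one-line fix above: a bare `int(...)` raises ValueError when the variable is set to something like "abc". A slightly more defensive sketch is below. The environment variable name comes from the proposed fix; the helper name `resolve_timeout` and the fall-back-to-default behaviour are assumptions, not part of the PR.

```python
from collections.abc import Mapping

DEFAULT_TIMEOUT_SEC = 60  # default taken from the proposed fix above


def resolve_timeout(env: Mapping[str, str], default: int = DEFAULT_TIMEOUT_SEC) -> int:
    """Read GRADATA_SMOKE_TIMEOUT_SEC from env, falling back to the default when unusable."""
    raw = env.get("GRADATA_SMOKE_TIMEOUT_SEC", "").strip()
    if not raw:
        return default
    try:
        value = int(raw)
    except ValueError:
        # Non-numeric input: keep the smoke run going with the default.
        return default
    # Zero or negative timeouts would fail every command, so treat them as unset.
    return value if value > 0 else default
```

The wrapper would then pass `timeout=resolve_timeout(env)` to `subprocess.run`. Whether a bad value should fall back quietly or fail loudly is a judgment call; failing loudly is equally defensible for a smoke script.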

env.update(
{
"HOME": str(home_dir),
"PYTHONPATH": str(SRC),

🧹 Nitpick | 🔵 Trivial | ⚡ Quick win

Preserve existing PYTHONPATH instead of overwriting it.

Overwriting can break environments that already depend on PYTHONPATH. Prepend SRC and keep existing entries.

Proposed fix
-            "PYTHONPATH": str(SRC),
+            "PYTHONPATH": (
+                f"{SRC}{os.pathsep}{env['PYTHONPATH']}"
+                if env.get("PYTHONPATH")
+                else str(SRC)
+            ),
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
-            "PYTHONPATH": str(SRC),
+            "PYTHONPATH": (
+                f"{SRC}{os.pathsep}{env['PYTHONPATH']}"
+                if env.get("PYTHONPATH")
+                else str(SRC)
+            ),
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@Gradata/scripts/smoke_quickstart.py` at line 57, The env construction
currently overwrites PYTHONPATH by setting "PYTHONPATH": str(SRC); instead
prepend SRC to any existing PYTHONPATH instead of replacing it: read the current
value via os.environ.get("PYTHONPATH", ""), join SRC and the existing value with
os.pathsep (handling empty existing value so you don't add a trailing
separator), and set that combined string as the "PYTHONPATH" entry in the env
dict where SRC is referenced.
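
The same prepend logic can be pulled into a small function so it is testable on its own. This is a sketch, not the PR's code: `with_src_on_pythonpath` is a hypothetical name, and it works on a copy of the mapping so the caller's environment is left alone.

```python
import os
from collections.abc import Mapping
from pathlib import Path


def with_src_on_pythonpath(src: Path, base_env: Mapping[str, str]) -> dict[str, str]:
    """Return a copy of base_env with src first on PYTHONPATH, keeping existing entries."""
    env = dict(base_env)
    existing = env.get("PYTHONPATH", "")
    # Skip the separator when nothing was set, so there is no trailing os.pathsep.
    env["PYTHONPATH"] = f"{src}{os.pathsep}{existing}" if existing else str(src)
    return env
```

Putting `src` first keeps the local checkout ahead of any installed copy of the package, which is what a smoke run of the working tree wants.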

commands = cast("list[str]", result["commands"])

assert result["database_created"] is True
assert result["sessions_trained"] is not None

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Strengthen sessions_trained assertion to catch no-op runs.

is not None still passes for 0; this can miss regressions where training doesn’t happen.

Proposed fix
-    assert result["sessions_trained"] is not None
+    assert isinstance(result["sessions_trained"], int)
+    assert result["sessions_trained"] >= 1
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
-    assert result["sessions_trained"] is not None
+    assert isinstance(result["sessions_trained"], int)
+    assert result["sessions_trained"] >= 1
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@Gradata/tests/test_quickstart_smoke.py` at line 14, The assertion for
result["sessions_trained"] only checks for not None and would accept 0 (a no-op
run); change the check on the test that contains assert
result["sessions_trained"] is not None to assert that result["sessions_trained"]
is a positive integer (e.g., assert isinstance(result["sessions_trained"], int)
and result["sessions_trained"] > 0 or simply assert result["sessions_trained"]
>= 1) so the test fails when no training sessions were actually run.
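
A small note on the suggested check: in Python, `bool` is a subclass of `int`, so `isinstance(True, int)` is true and `True >= 1` also holds. If the manifest could ever carry a boolean there, a stricter predicate is one option. The sketch below uses a hypothetical helper name, `trained_at_least_once`, and is not part of the PR.

```python
def trained_at_least_once(value: object) -> bool:
    """True only for a real integer count of one or more sessions."""
    # Exclude bool explicitly: it subclasses int and would otherwise slip through.
    if isinstance(value, bool) or not isinstance(value, int):
        return False
    return value >= 1
```

In the test this would read `assert trained_at_least_once(result["sessions_trained"])`, which rejects `None`, `0`, `True`, and string values such as `"1"`.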
