fix(reads): skip lazy SELECTs for absent tables via listTables pre-check#209

Merged
khustup2 merged 3 commits into main from fix/skip-missing-table-reads
May 27, 2026

khustup2 (Contributor) commented May 26, 2026

Summary

SessionStart reads hivemind_rules / hivemind_goals (context renderer) and skills (auto-pull) on every start, and relies on catching the "relation does not exist" error when a table is missing. The client handled the error, but the SELECT still failed server-side, so pg-deeplake logged a 42P01 per read on every fresh/empty workspace. That was the single largest cluster in the prod error digest (~1,060/24h, dominated by hivemind_goals/hivemind_rules/skills).

This gates those reads on a trusted table list so we never issue the SELECT for a table that isn't there.

  • DeeplakeApi.knownTablesOrNull() — returns the cached table list, or null when the lookup is untrusted (lets callers tell an empty workspace apart from a failed fetch).
  • renderContextBlock and runPull take an optional tableExists predicate; when it reports a table absent, the SELECT is skipped (section/rows empty).
  • Wired from the read entrypoints: hooks/session-start.ts (claude), hooks/hermes, hooks/cursor, commands/context.ts, and skillify/auto-pull.ts. (codex doesn't render the block; it only auto-pulls, covered centrally.)

Fallback preserved: the existing isMissingTableError catch stays. When the list is unavailable (predicate omitted), behavior is unchanged — a transient lookup blip never drops a read of a table that exists. The pre-check only fast-skips when we have a confident list showing the table is absent.
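The gating and fallback described above can be sketched roughly as follows. This is an illustration under assumptions, not the code in this PR: Row, QueryFn, toTableExists, and selectIfPresent are hypothetical names, and the body of isMissingTableError is a guess at how a missing-table failure might be recognized.

```typescript
// Hypothetical sketch: only knownTablesOrNull, tableExists and
// isMissingTableError are names from the PR; everything else is illustrative.
type Row = Record<string, unknown>;
type QueryFn = (sql: string) => Promise<Row[]>;
type TableExists = (name: string) => boolean;

// Build a predicate only from a trusted list. `null` means the lookup was
// untrusted, so no predicate is produced and callers keep the catch-based path.
function toTableExists(known: string[] | null): TableExists | undefined {
  if (known === null) return undefined;
  const names = new Set(known);
  return (name) => names.has(name);
}

// Assumed detection rule: SQLSTATE 42P01 or the Postgres message text.
function isMissingTableError(err: unknown): boolean {
  const msg = err instanceof Error ? err.message : String(err);
  return msg.includes("42P01") || /relation .* does not exist/i.test(msg);
}

async function selectIfPresent(
  query: QueryFn,
  table: string,
  tableExists?: TableExists,
): Promise<Row[]> {
  // Fast path: a confident "absent" answer skips the SELECT entirely.
  if (tableExists && !tableExists(table)) return [];
  try {
    return await query(`SELECT * FROM "${table}"`);
  } catch (err) {
    // Fallback path: same behavior as before when the predicate is omitted.
    if (isMissingTableError(err)) return [];
    throw err;
  }
}
```

Passing undefined for the predicate leaves the old try/catch as the only guard, which is why an untrusted lookup cannot cause a read of an existing table to be dropped.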

listTables() is cached and already warmed by the write paths each session, so this is typically zero extra HTTP calls.
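To illustrate why a warmed cache makes the pre-check free, and why null and an empty list have to stay distinct, here is a minimal sketch. TableListCache and FetchTables are hypothetical names; the real DeeplakeApi caching may be structured differently.

```typescript
// Hypothetical sketch of a table-list cache; not the DeeplakeApi internals.
type FetchTables = () => Promise<string[]>;

class TableListCache {
  private cached: string[] | null = null;
  private readonly fetchTables: FetchTables;

  constructor(fetchTables: FetchTables) {
    this.fetchTables = fetchTables;
  }

  // Returns the trusted list, or null when the lookup failed.
  // A failure is never cached, so the next call retries the fetch.
  async knownTablesOrNull(): Promise<string[] | null> {
    if (this.cached !== null) return this.cached; // warm cache: no HTTP call
    try {
      const tables = await this.fetchTables();
      this.cached = tables; // an empty array is a valid, trusted answer
      return tables;
    } catch {
      return null; // untrusted: callers fall back to the catch-based path
    }
  }
}
```

An empty array here means "the workspace really has no tables", which is the only case in which a caller may skip a SELECT.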

Scope: kills the bare-name reads (hivemind_rules/hivemind_goals/skills). Schema-qualified stragglers (default.sessions, etc.) come from SDK/shell paths and are out of scope.

Test plan

  • npm run typecheck clean
  • New regressions: renderer + pull skip the SELECT (assert no query fired) when tableExists reports absent, and still SELECT when present
  • Updated DeeplakeApi test mocks with knownTablesOrNull()
  • Touched-module suites green; full suite green except 5 pre-existing failures unrelated to this change (cli-index, install-consent flow, standalone-embed-client defaults; none import these modules)
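The "assert no query fired" regressions in the test plan could take roughly this shape. It is a framework-free sketch: recordingQuery, readGated, and regressionCheck are hypothetical stand-ins, not the actual test helpers or the real renderContextBlock.

```typescript
type Row = Record<string, unknown>;

// Returns a canned result and records every SQL string it receives.
function recordingQuery(rows: Row[]) {
  const seen: string[] = [];
  const query = async (sql: string): Promise<Row[]> => {
    seen.push(sql);
    return rows;
  };
  return { query, seen };
}

// Hypothetical stand-in for the renderer's per-table gating.
async function readGated(
  tables: string[],
  query: (sql: string) => Promise<Row[]>,
  tableExists?: (name: string) => boolean,
): Promise<Map<string, Row[]>> {
  const out = new Map<string, Row[]>();
  for (const table of tables) {
    const skip = tableExists !== undefined && !tableExists(table);
    out.set(table, skip ? [] : await query(`SELECT * FROM "${table}"`));
  }
  return out;
}

// Rules present, goals absent: only the rules SELECT should be recorded.
async function regressionCheck(): Promise<string[]> {
  const { query, seen } = recordingQuery([{ id: 1 }]);
  await readGated(
    ["hivemind_rules", "hivemind_goals"],
    query,
    (name) => name === "hivemind_rules",
  );
  return seen;
}
```

Asserting on the recorded SQL rather than on the rendered output is what distinguishes "the SELECT was skipped" from "the SELECT failed and was caught".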

Summary by CodeRabbit

  • New Features

    • Added table existence checking to prevent errors when querying non-existent database tables during initial setup.
  • Bug Fixes

    • Improved handling of missing database tables in context rendering and data synchronization to gracefully skip unavailable resources instead of failing.
  • Tests

    • Added test coverage for table existence validation logic across multiple features.

SessionStart reads hivemind_rules / hivemind_goals (context renderer) and
skills (auto-pull) on every start, relying on catching "relation does not
exist". The client handled it, but the SELECT still failed server-side, so
pg-deeplake logged a 42P01 per read on every fresh/empty workspace — the
single largest cluster in the prod error digest.

Gate these reads on a trusted table list:
  - DeeplakeApi.knownTablesOrNull() returns the cached table list, or null
    when the lookup is untrusted (so callers tell "empty workspace" from
    "fetch failed").
  - renderContextBlock and runPull take an optional tableExists predicate;
    when it reports a table absent we skip the SELECT entirely.
  - The existing isMissingTableError catch stays as a fallback: when the
    list is unavailable (predicate omitted) behavior is unchanged, so a
    transient lookup blip never drops a read of a table that exists.

listTables() is cached and already warmed by the write paths each session,
so this is typically zero extra HTTP calls.

coderabbitai Bot commented May 26, 2026

Important

Review skipped

Auto incremental reviews are disabled on this repository.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro Plus

Run ID: 6d19a58e-ab56-4fff-9e1e-094c0e5eab45

Walkthrough

DeeplakeApi adds a knownTablesOrNull() method returning cached table names or null. Context renderer and multiple session-start hooks (Cursor, Hermes, base) use this to derive a conditional tableExists predicate, passing it to renderContextBlock to skip rules/goals SQL queries when tables haven't been created. Skillify pull applies the same pattern to skip skills table queries. All tests mock the new method and verify lazy query behavior.

Changes

Lazy table existence checks via cached known-tables predicate

| Layer / File(s) | Summary |
| --- | --- |
| Deeplake API known-tables discovery: src/deeplake-api.ts | DeeplakeApi.knownTablesOrNull() returns cached table names when available, or null when the fetch is non-cacheable or failed, distinguishing "definitely empty workspace" from "lookup unavailable" without changing existing listTables() behavior. |
| Context renderer table-existence predicate: src/hooks/shared/context-renderer.ts | RenderOptions gains an optional tableExists(name) predicate. When the predicate is present, renderContextBlock uses it to skip the rules and goals SQL reads per table and logs each skip; when it is omitted, the prior try/catch behavior applies. |
| Session-start context rendering: src/hooks/cursor/session-start.ts, src/hooks/hermes/session-start.ts, src/hooks/session-start.ts, src/commands/context.ts | All three session-start hooks and the CLI context command fetch known tables via api.knownTablesOrNull(), derive a tableExists predicate, and pass it into renderContextBlock to skip rules/goals queries for tables that have not been created yet. |
| Skillify skills table pull: src/skillify/pull.ts, src/skillify/auto-pull.ts | PullOptions gains an optional tableExists predicate. runPull checks it before executing the skills table SELECT; when the predicate is present and returns false, runPull treats the result as an empty array, avoiding the server-side error. autoPullSkills computes the predicate from knownTablesOrNull() and passes it to runPull. |
| Test coverage for known-tables predicate: tests/claude-code/*, tests/cursor/*, tests/hermes/*, tests/shared/context-renderer.test.ts | All DeeplakeApi mocks updated to include knownTablesOrNull() returning null. New tests verify that renderContextBlock skips both rules/goals queries when both tables are absent, and skips only goals when only rules exists. skillify-pull tests verify lazy execution: no query when the table is absent, one query when it is present. |

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related PRs

  • activeloopai/hivemind#193: Introduces the renderContextBlock/rules-tasks flow that this PR's tableExists predicate is meant to optimize and guard.
  • activeloopai/hivemind#203: Modifies SessionStart context rendering from rules+tasks to rules+goals; this PR adds the tableExists predicate (via knownTablesOrNull) to skip missing-table SQL reads in that same pipeline.
  • activeloopai/hivemind#112: Introduces auto-pull SessionStart behavior; this PR extends auto-pull.ts to compute and pass the tableExists predicate into runPull.

Suggested reviewers

  • kaghni

Poem

🐰 A rabbit hops through empty tables fast,
Caching known tables so queries don't last—
Predicates guard where data may fail,
Lazy reads skip when tables are pale.

🚥 Pre-merge checks | ✅ 3 | ❌ 2

❌ Failed checks (2 warnings)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Description check | ⚠️ Warning | The PR description provides a comprehensive summary of changes with clear rationale, test results, and technical details, though the required 'Version Bump' section from the template is missing. | Add the 'Version Bump' section from the template and specify the appropriate version bump (patch/minor/major) or indicate if no release is needed. |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 42.86% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (3 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title 'fix(reads): skip lazy SELECTs for absent tables via listTables pre-check' accurately and specifically summarizes the main change: preventing unnecessary SQL queries for non-existent tables by pre-checking against a known table list. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |


@coderabbitai coderabbitai Bot requested a review from kaghni May 26, 2026 23:54

github-actions Bot commented May 26, 2026

Coverage Report

Scope: files changed in this PR. Enforced threshold: 90% per metric (per file via vitest.config.ts).

| Status | Category | Percentage | Covered / Total |
| --- | --- | --- | --- |
| 🟢 | Lines | 99.04% (🎯 90%) | 720 / 727 |
| 🟢 | Statements | 97.86% (🎯 90%) | 824 / 842 |
| 🟢 | Functions | 97.25% (🎯 90%) | 106 / 109 |
| 🔴 | Branches | 86.56% (🎯 90%) | 483 / 558 |

File Coverage (8 files changed)

| File | Stmts | Branches | Functions | Lines |
| --- | --- | --- | --- | --- |
| src/commands/context.ts | 🟢 100.0% | 🟢 100.0% | 🟢 100.0% | 🟢 100.0% |
| src/deeplake-api.ts | 🟢 98.9% | 🔴 89.0% | 🟢 100.0% | 🟢 99.6% |
| src/hooks/cursor/session-start.ts | 🟢 98.6% | 🟢 91.7% | 🔴 87.5% | 🟢 100.0% |
| src/hooks/hermes/session-start.ts | 🟢 98.4% | 🟢 90.0% | 🔴 83.3% | 🟢 100.0% |
| src/hooks/session-start.ts | 🟢 100.0% | 🟢 96.0% | 🟢 100.0% | 🟢 100.0% |
| src/hooks/shared/context-renderer.ts | 🟢 95.1% | 🔴 82.0% | 🟢 100.0% | 🟢 94.9% |
| src/skillify/auto-pull.ts | 🟢 97.6% | 🔴 76.0% | 🟢 90.9% | 🟢 100.0% |
| src/skillify/pull.ts | 🟢 96.1% | 🔴 81.7% | 🟢 100.0% | 🟢 98.8% |

Generated for commit 1d2262e.


@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/skillify/auto-pull.ts`:
- Around line 108-109: The call to api.knownTablesOrNull() happens outside the
withTimeout(...) block, so SessionStart can exceed timeoutMs; move the table
discovery into the same timeout boundary, e.g.
withTimeout(api.knownTablesOrNull(), timeoutMs) (or call withTimeout separately
and treat failures/timeouts as null), then set tableExists = (name: string) =>
known.includes(name) only if known is returned; apply the same change to the
other discovery usage around lines 116-128 so both discovery paths degrade to
null on timeout and don't block SessionStart.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro Plus

Run ID: 030b2931-4f87-4734-ad76-aea78ac5eb05

📥 Commits

Reviewing files that changed from the base of the PR and between f408a40 and 8b57685.

📒 Files selected for processing (15)
  • src/commands/context.ts
  • src/deeplake-api.ts
  • src/hooks/cursor/session-start.ts
  • src/hooks/hermes/session-start.ts
  • src/hooks/session-start.ts
  • src/hooks/shared/context-renderer.ts
  • src/skillify/auto-pull.ts
  • src/skillify/pull.ts
  • tests/claude-code/cli-context.test.ts
  • tests/claude-code/session-start-hook.test.ts
  • tests/claude-code/skillify-auto-pull.test.ts
  • tests/claude-code/skillify-pull.test.ts
  • tests/cursor/cursor-session-start-hook.test.ts
  • tests/hermes/hermes-session-start-hook.test.ts
  • tests/shared/context-renderer.test.ts

Comment thread src/skillify/auto-pull.ts Outdated
Comment on lines +108 to +109
const known = await api.knownTablesOrNull();
if (known) tableExists = (name: string) => known.includes(name);

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Bound table discovery inside the same timeout budget.

Line 108 calls await api.knownTablesOrNull() before withTimeout(...), so SessionStart can still block past timeoutMs when table discovery is slow. Keep table discovery inside the timed block (or timeout it separately and degrade to null).

⏱️ Proposed fix
-  let query: QueryFn;
+  let query: QueryFn;
+  let api: DeeplakeApi | null = null;
   // Existence predicate from a trusted table list — lets runPull skip the
   // SELECT (and the server-side 42P01) when `skills` doesn't exist yet on a
   // fresh workspace. Undefined when the list can't be fetched or a test
   // injects its own query, in which case runPull falls back to its catch.
   let tableExists: ((name: string) => boolean) | undefined;
   if (deps.queryFn) {
     query = deps.queryFn;
   } else {
-    const api = new DeeplakeApi(
+    api = new DeeplakeApi(
       config.token,
       config.apiUrl,
       config.orgId,
       config.workspaceId,
       config.skillsTableName,
     );
     query = (sql: string) => api.query(sql) as Promise<Record<string, unknown>[]>;
-    const known = await api.knownTablesOrNull();
-    if (known) tableExists = (name: string) => known.includes(name);
   }

@@
-  try {
-    const summary = await withTimeout(
-      runPull({
+  try {
+    const summary = await withTimeout((async () => {
+      let resolvedTableExists = tableExists;
+      if (api) {
+        const known = await api.knownTablesOrNull();
+        if (known) {
+          const knownSet = new Set(known);
+          resolvedTableExists = (name: string) => knownSet.has(name);
+        }
+      }
+      return runPull({
         query,
         tableName: config.skillsTableName,
         install,
         cwd: install === "project" ? (deps.cwd ?? process.cwd()) : undefined,
         users: [],
         dryRun: false,
         force: false,
-        tableExists,
-      }),
-      timeoutMs,
-    );
+        tableExists: resolvedTableExists,
+      });
+    })(), timeoutMs);

Also applies to: 116-128

@khustup2 khustup2 requested a review from efenocchi May 27, 2026 00:04
…dget

knownTablesOrNull() (a live GET /tables on a cold cache) was awaited before
withTimeout(), so a slow table lookup could block SessionStart past timeoutMs —
defeating the timeout's purpose. Defer discovery into the timed block so the
table fetch + pull share one budget and degrade together on timeout.
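
A rough sketch of the shared budget this commit describes. The withTimeout below is a hypothetical reimplementation that matches only the (promise, timeoutMs) argument order visible in the proposed diff; rejecting on timeout is an assumption, and pullWithinBudget, discover, and pull are illustrative names.

```typescript
// Hypothetical helper: rejects if `work` has not settled within timeoutMs.
function withTimeout<T>(work: Promise<T>, timeoutMs: number): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    const timer = setTimeout(
      () => reject(new Error(`timed out after ${timeoutMs}ms`)),
      timeoutMs,
    );
    work.then(
      (value) => { clearTimeout(timer); resolve(value); },
      (err) => { clearTimeout(timer); reject(err); },
    );
  });
}

async function pullWithinBudget(
  discover: () => Promise<string[] | null>,
  pull: (tableExists?: (name: string) => boolean) => Promise<number>,
  timeoutMs: number,
): Promise<number> {
  // Discovery and pull run inside one async function, so a single timer
  // bounds both; a slow table lookup can no longer run past the budget.
  return withTimeout(
    (async () => {
      const known = await discover();
      const names = known ? new Set(known) : null;
      return pull(names ? (name) => names.has(name) : undefined);
    })(),
    timeoutMs,
  );
}
```

Awaiting discover() before calling withTimeout, as the earlier revision did, would leave the lookup unbounded, which is the behavior the review comment flagged.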
…e thresholds

The fix/skip-missing-table-reads changes added DeeplakeApi.knownTablesOrNull
and a tableExists gating predicate in session-start.ts and context.ts that
the existing tests never exercised (mocks returned null, so the predicate
closure was never constructed), dropping three per-file coverage thresholds:
deeplake-api branches 86.36% < 87%, session-start functions 83.33% < 90%,
context functions 66.66% < 90%.

- deeplake-api.test.ts: add knownTablesOrNull block covering all branch arms
  (cached hit, fresh [] / populated success, non-retryable + network-exhaust
  failures returning null without caching).
- cli-context + session-start-hook tests: make the knownTablesOrNull mock
  configurable and add cases with a trusted table list so the tableExists
  predicate runs (both-present -> SELECTs fire; empty -> SELECTs skipped).
- standalone-embed-client.test.ts: fix a test-isolation bug surfaced by the
  added tests shifting parallel scheduling. The skip-guard probed
  SHARED_DAEMON_PATH (the daemon entry .js), but a live daemon answers on a
  per-user socket whose liveness is independent of that file, so a dev box
  with embeddings installed returned a vector instead of null. Assert the
  result shape (null or numeric vector) per the test's documented intent
  instead of probing the filesystem.
@khustup2 khustup2 merged commit 1ea3f2a into main May 27, 2026
6 checks passed