Fix workspace chat isolation and conversation streaming #11
Increase maxContextTokens from 4096→16384 default, raise chunk allocation from 50%→65%. Profile updates: fast 8K, balanced 16K, precise 32K. This is the single biggest accuracy fix — prevents good retrieval results from being truncated before the LLM sees them. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
When a chunk is retrieved, also fetch its neighboring chunks (window=1) from the same document. Siblings get a discounted score (0.6x parent). This provides surrounding context that prevents the 'found the right file but incomplete answer' problem. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
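The sibling expansion described above can be sketched roughly as follows. This is a minimal illustration, not the project's code: the `Chunk` and `ScoredChunk` shapes, the in-memory `allChunks` lookup, and the function name `expandWithSiblings` are assumptions; only the window of 1 and the 0.6x discount come from the commit message.

```typescript
// Hypothetical shapes; the project's real types are not shown in this PR.
interface Chunk {
  documentId: string;
  index: number; // position of the chunk within its document
  text: string;
}

interface ScoredChunk {
  chunk: Chunk;
  score: number;
}

const SIBLING_DISCOUNT = 0.6; // siblings score 0.6x their parent, per the commit message

function expandWithSiblings(
  hits: ScoredChunk[],
  allChunks: Chunk[],
  windowSize: number = 1,
): ScoredChunk[] {
  // Keep the best score seen for each chunk, keyed by "documentId#index".
  const best: Record<string, ScoredChunk> = Object.create(null);

  const keep = (candidate: ScoredChunk): void => {
    const key = candidate.chunk.documentId + "#" + candidate.chunk.index;
    const existing = best[key];
    if (existing === undefined || candidate.score > existing.score) {
      best[key] = candidate;
    }
  };

  for (const hit of hits) {
    keep(hit);
    for (const candidate of allChunks) {
      const distance = Math.abs(candidate.index - hit.chunk.index);
      const sameDocument = candidate.documentId === hit.chunk.documentId;
      if (sameDocument && distance >= 1 && distance <= windowSize) {
        keep({ chunk: candidate, score: hit.score * SIBLING_DISCOUNT });
      }
    }
  }

  return Object.keys(best)
    .map((key) => best[key])
    .sort((a, b) => b.score - a.score);
}
```

A chunk that is both a direct hit and a sibling of another hit keeps its higher score, so the discount never demotes a real match.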
Add explicit RULES section (source-only, cite everything, handle conflicts) and RESPONSE FORMAT (direct answer first, then details). Small models produce significantly better answers with structured constraints. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
When merging dense + sparse results, RRF now multiplies rank score by original similarity score. A 0.95-similarity result at rank 2 now outscores a 0.50-similarity result at rank 1. Enabled by default in the Retriever's dense+sparse merge. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
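A small sketch of score-weighted RRF under stated assumptions: the contribution of each hit is taken to be `similarity / (k + rank)`, contributions are summed across lists, and `k = 60` is the conventional RRF constant rather than a value confirmed by this PR. The names `RankedHit` and `scoreWeightedRrf` are hypothetical.

```typescript
interface RankedHit {
  id: string;
  similarity: number; // original similarity score in [0, 1]
}

const RRF_K = 60; // conventional RRF constant; the project's actual value is an assumption

function scoreWeightedRrf(
  lists: RankedHit[][],
  k: number = RRF_K,
): { id: string; score: number }[] {
  const totals: Record<string, number> = Object.create(null);

  for (const list of lists) {
    list.forEach((hit, position) => {
      const rank = position + 1; // each list is assumed to be in rank order already
      const contribution = hit.similarity / (k + rank);
      totals[hit.id] = (totals[hit.id] || 0) + contribution;
    });
  }

  return Object.keys(totals)
    .map((id) => ({ id: id, score: totals[id] }))
    .sort((a, b) => b.score - a.score);
}
```

With `k = 60`, a 0.95-similarity hit at rank 2 contributes 0.95 / 62 ≈ 0.0153, while a 0.50-similarity hit at rank 1 contributes 0.50 / 61 ≈ 0.0082, which matches the ordering the commit message describes.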
Replace exact word matching with substring matching (handles Korean agglutination and English prefixes like auth→authentication). Add heading hierarchy boost (0.2 weight) — query words in headings are strong relevance signals. Rebalance weights: original 0.5, content 0.3, heading 0.2. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Boost retrieval scores by 15% when query keywords appear in heading hierarchy, and by 10% when chunk type aligns with query intent (e.g., code-ast chunks for code queries, table chunks for data queries). Lightweight post-retrieval step, no additional DB queries. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
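As an illustration of this post-retrieval step, the sketch below applies the two boosts to an in-memory hit. Whether the real implementation combines the boosts multiplicatively or additively is not stated, so the multiplication here is an assumption, as are the `BoostableHit` shape, the intent names `code` and `data`, and the substring check against headings.

```typescript
interface BoostableHit {
  score: number;
  headings: string[]; // heading hierarchy of the chunk, outermost first
  chunkType: string;  // e.g. "code-ast" or "table", as named in the commit message
}

// Assumed mapping from query intent to the chunk type it prefers.
const PREFERRED_CHUNK_TYPE: Record<string, string> = {
  code: "code-ast",
  data: "table",
};

function boostScore(hit: BoostableHit, queryKeywords: string[], intent: string): number {
  let score = hit.score;

  const headingText = hit.headings.join(" ").toLowerCase();
  const headingMatch = queryKeywords.some(
    (keyword) => headingText.indexOf(keyword.toLowerCase()) !== -1,
  );
  if (headingMatch) score *= 1.15; // +15% for a keyword in the heading hierarchy

  if (PREFERRED_CHUNK_TYPE[intent] === hit.chunkType) score *= 1.1; // +10% for type/intent alignment

  return score;
}
```

Because it only touches fields already present on the retrieved hits, a step like this needs no additional database queries.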
Replace heuristic token estimation (~85% English, ~70% Korean accuracy) with tiktoken-based counting using cl100k_base encoding. Includes sampling-based extrapolation for long texts to maintain performance, and falls back to the heuristic if tiktoken is unavailable. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…eranker Replace substring matching with word-boundary regex to prevent false positives (e.g. "auth" matching "author", "log" matching "login"). Add n-gram phrase scoring for consecutive bigrams/trigrams. Update scoring weights to original*0.4 + wordMatch*0.25 + ngram*0.15 + heading*0.2. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
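The word-boundary matching and the new weights can be sketched as below. The helper names and the way each signal is normalised to the range 0 to 1 are assumptions for illustration; the weights (0.4, 0.25, 0.15, 0.2) are taken from the commit message. Only bigrams are shown, and note that `\b` in JavaScript regular expressions is defined in terms of ASCII word characters unless extra handling is added.

```typescript
// Escape regex metacharacters so query words are matched literally.
function escapeRegExp(text: string): string {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// True only when `word` appears as a whole word, so "auth" does not match "author".
function matchesWholeWord(text: string, word: string): boolean {
  return new RegExp("\\b" + escapeRegExp(word) + "\\b", "i").test(text);
}

// Fraction of query words that appear as whole words in the text.
function wordMatchScore(text: string, queryWords: string[]): number {
  if (queryWords.length === 0) return 0;
  const matched = queryWords.filter((word) => matchesWholeWord(text, word)).length;
  return matched / queryWords.length;
}

// Fraction of consecutive query bigrams that appear as a phrase in the text.
function bigramScore(text: string, queryWords: string[]): number {
  if (queryWords.length < 2) return 0;
  let matched = 0;
  for (let i = 0; i + 1 < queryWords.length; i++) {
    const phrase = escapeRegExp(queryWords[i]) + "\\s+" + escapeRegExp(queryWords[i + 1]);
    if (new RegExp("\\b" + phrase + "\\b", "i").test(text)) matched++;
  }
  return matched / (queryWords.length - 1);
}

interface RerankSignals {
  original: number;
  wordMatch: number;
  ngram: number;
  heading: number;
}

// Weights from the commit message: original*0.4 + wordMatch*0.25 + ngram*0.15 + heading*0.2.
function combineSignals(signals: RerankSignals): number {
  return (
    signals.original * 0.4 +
    signals.wordMatch * 0.25 +
    signals.ngram * 0.15 +
    signals.heading * 0.2
  );
}
```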
Introduce per-intent weight profiles for fallback reranking so code queries favor code-ast chunks, concept queries favor semantic chunks, etc. The engine now passes classifyIntent(query) to rerankResults. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Code queries get 75% chunk space (less history), concept queries get more history (55% chunks), so each intent type gets an optimized context window layout. Falls back to default allocation for unknown intents. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
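To make the arithmetic concrete, here is a minimal sketch of an intent-aware split. It assumes the default chunk share is the 65% introduced earlier in this PR and that everything outside the chunk share (history and any other prompt parts) is reported as a single remainder; the function names are hypothetical.

```typescript
// Chunk share per intent; 0.75 and 0.55 come from the commit message,
// 0.65 is assumed to still be the default allocation.
function chunkShareFor(intent: string): number {
  switch (intent) {
    case "code":
      return 0.75;
    case "concept":
      return 0.55;
    default:
      return 0.65;
  }
}

interface ContextAllocation {
  chunkTokens: number;     // budget for retrieved chunks
  remainingTokens: number; // budget left for history and everything else
}

function allocateContext(maxContextTokens: number, intent: string): ContextAllocation {
  const chunkTokens = Math.floor(maxContextTokens * chunkShareFor(intent));
  return { chunkTokens: chunkTokens, remainingTokens: maxContextTokens - chunkTokens };
}
```

With the 16384-token default, a code query would get 12288 tokens of chunk space under these assumptions, a concept query 9011, and an unknown intent 10649.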
Use the intent parameter already passed to retrieveWithFeatures instead of calling classifyIntent() again when invoking rerankResults. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…dling
- Expand Korean-English dictionary to 300+ technical terms with custom dictionary support
- Add fallback minScore (0.15) to prevent returning irrelevant results
- Fix path traversal vulnerability in static file serving
- Hide internal error details from streaming chat responses
- Add actionable error messages for CLI index command (ENOENT, EACCES)
- Read CLI version dynamically from package.json
- Log unexpected plugin loading errors instead of silently swallowing them
- Add CLAUDE.md project instructions and opendocuments.config.ts
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…arity boundaries Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…detection Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…Ollama integration
- Add non-root user (opendocs) with correct /data and /app ownership
- Add OCI image labels (title, description, source, license)
- Set ENV defaults for NODE_ENV, OPENDOCUMENTS_DATA_DIR, PORT
- Add HEALTHCHECK via wget against /health endpoint
- docker-compose: add healthcheck probes for both services
- docker-compose: add env var documentation comments for model providers
- docker-compose: add restart: unless-stopped to both services
- docker-compose: add depends_on ollama with required: false
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Add GET /api/v1/healthz (liveness) and GET /api/v1/readyz (readiness) endpoints. Readiness checks SQLite, VectorDB, and all model plugin health, returning 200 when ready or 503 with per-check details when not. Also bumps version in /api/v1/health to '0.2.0'. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
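The readiness behaviour described above (200 when every check passes, 503 with per-check details otherwise) might look like the following sketch. The `ReadinessCheck` interface and `evaluateReadiness` function are illustrative assumptions; the actual route handler and its response shape are not shown in this PR.

```typescript
interface ReadinessCheck {
  name: string;             // e.g. "sqlite", "vectordb", or a model plugin name
  run: () => Promise<void>; // resolves when healthy, rejects otherwise
}

interface CheckOutcome {
  ok: boolean;
  error?: string;
}

interface ReadinessReport {
  status: 200 | 503;
  checks: Record<string, CheckOutcome>;
}

async function evaluateReadiness(checks: ReadinessCheck[]): Promise<ReadinessReport> {
  // Run all checks concurrently; a failing check must not prevent the others from reporting.
  const outcomes = await Promise.all(
    checks.map(async (check): Promise<[string, CheckOutcome]> => {
      try {
        await check.run();
        return [check.name, { ok: true }];
      } catch (err) {
        const message = err instanceof Error ? err.message : String(err);
        return [check.name, { ok: false, error: message }];
      }
    }),
  );

  const report: ReadinessReport = { status: 200, checks: {} };
  for (const [name, outcome] of outcomes) {
    report.checks[name] = outcome;
    if (!outcome.ok) report.status = 503;
  }
  return report;
}
```

A liveness endpoint, by contrast, would typically skip these dependency checks and only confirm that the process is responding.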
Adds two new CLI commands:
- `backup` copies opendocuments.db, the WAL file, the vectors/ dir, and the current-workspace file to a timestamped directory under ~/.opendocuments/backups/ (or a custom path via -o/--output).
- `restore` copies the same artifacts back from a given backup directory, requiring --force when target files already exist.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Adds update-checker.ts with checkForUpdates(), getCachedUpdateInfo(), and compareVersions() utilities. Results are cached 24 hours; fetch failures return a safe fallback and never throw. The stats endpoint includes the cached update info when available, and bootstrap fires a non-blocking check after startup, logging a message via log.info() if a newer version exists. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
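For readers unfamiliar with this kind of check, a version comparison and a 24-hour cache freshness test could be sketched as below. These are not the functions in update-checker.ts: `compareVersionStrings` and `isCacheFresh` are hypothetical names, and the comparison only handles dotted numeric versions, ignoring pre-release tags.

```typescript
// Returns 1 if a > b, -1 if a < b, 0 if equal. Numeric dotted versions only.
function compareVersionStrings(a: string, b: string): number {
  const parse = (version: string): number[] =>
    version
      .replace(/^v/, "")
      .split(".")
      .map((part) => parseInt(part, 10) || 0);

  const partsA = parse(a);
  const partsB = parse(b);
  const length = Math.max(partsA.length, partsB.length);

  for (let i = 0; i < length; i++) {
    const x = partsA[i] || 0; // missing segments count as 0, so "1.0" equals "1.0.0"
    const y = partsB[i] || 0;
    if (x !== y) return x > y ? 1 : -1;
  }
  return 0;
}

const CACHE_TTL_MS = 24 * 60 * 60 * 1000; // results are cached for 24 hours

function isCacheFresh(checkedAtMs: number, nowMs: number): boolean {
  return nowMs - checkedAtMs < CACHE_TTL_MS;
}
```

Comparing segments numerically rather than as strings matters here: `0.10.0` is newer than `0.2.0`, which a plain string comparison would get wrong.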
…ion enforcement Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Introduce buildConfigFromEnv() that maps OPENDOCUMENTS_* env vars to OpenDocumentsConfig fields, falling back to well-known provider API key env vars (OPENAI_API_KEY etc.). loadConfig() now calls this as a fallback when no config file exists, before returning hard-coded defaults. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…isting Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Add GET/POST/DELETE plugin API routes (npm search, install, uninstall via execSync), mount them in app.ts, expose four api.ts helper functions, and add a tabbed Marketplace UI to PluginsPage with a new PluginMarketplace component. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Implements detectHardware() using node:os and recommendModels() that selects qwen2.5 LLM + embedding model based on effective available memory (GPU VRAM when present, else 60% of system RAM). Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…ed metrics Add GET /api/v1/admin/benchmark endpoint that measures generation speed (tok/s + latency) and embedding speed (texts/s + latency) for each registered model plugin, with per-model try/catch for resilience. Add getModelBenchmarks() to the web API client and BenchmarkDashboard React component with auto-fetch on mount, Run Benchmark button, health indicator dots, and capability badges. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…chunk fitting Implements compressContext() with three progressive strategies: drop lowest-scoring chunks, deduplicate repeated sentences across chunks, and proportionally truncate at sentence boundaries to fit retrieved context within a token budget. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
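The three progressive strategies can be illustrated with the sketch below. It is not the project's `compressContext()`: the chars/4 token estimate, the naive sentence splitter, the `minChunks` floor that stops the first stage from dropping everything, and the name `compressContextSketch` are all assumptions made so the example is self-contained.

```typescript
interface ContextChunk {
  text: string;
  score: number;
}

// Rough token estimate (about 4 characters per token); an assumption for this sketch.
const estimateTokens = (text: string): number => Math.ceil(text.length / 4);

const totalTokens = (chunks: ContextChunk[]): number =>
  chunks.reduce((sum, chunk) => sum + estimateTokens(chunk.text), 0);

// Naive splitter: a sentence ends at '.', '!' or '?' plus trailing whitespace.
const splitSentences = (text: string): string[] => text.match(/[^.!?]+[.!?]*\s*/g) || [];

function compressContextSketch(
  chunks: ContextChunk[],
  budget: number,
  minChunks: number = 2, // hypothetical floor for stage 1
): ContextChunk[] {
  // Stage 1: drop the lowest-scoring chunks while over budget.
  let kept = chunks.slice().sort((a, b) => b.score - a.score);
  while (kept.length > minChunks && totalTokens(kept) > budget) kept.pop();
  if (totalTokens(kept) <= budget) return kept;

  // Stage 2: remove sentences already seen in a higher-scoring chunk.
  const seen: Record<string, boolean> = Object.create(null);
  kept = kept
    .map((chunk) => {
      const unique = splitSentences(chunk.text).filter((sentence) => {
        const key = sentence.trim().toLowerCase();
        if (key === "" || seen[key]) return false;
        seen[key] = true;
        return true;
      });
      return { text: unique.join("").trim(), score: chunk.score };
    })
    .filter((chunk) => chunk.text.length > 0);
  if (totalTokens(kept) <= budget) return kept;

  // Stage 3: give each chunk a proportional share and cut at sentence boundaries.
  const ratio = budget / totalTokens(kept);
  return kept
    .map((chunk) => {
      const allowance = Math.floor(estimateTokens(chunk.text) * ratio);
      let out = "";
      for (const sentence of splitSentences(chunk.text)) {
        if (estimateTokens(out + sentence) > allowance) break;
        out += sentence;
      }
      return { text: out.trim(), score: chunk.score };
    })
    .filter((chunk) => chunk.text.length > 0);
}
```

Each stage runs only if the previous one left the context over budget, so chunks that already fit are returned untouched.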
Implements @opendocuments/connector-slack that indexes Slack channels by thread. Supports Bearer token auth, channel filtering, paginated history fetching, user name resolution, and sourcePath format slack://channel-name/thread-ts. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Implements DiscordConnector that indexes guild text channels by grouping messages into per-day batches (YYYY-MM-DD), with Bot token auth and Discord API v10. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Implements LinearConnector with paginated GraphQL issue discovery (cursor-based, 50/page), team/status filtering, and markdown formatting of issue details and comments. Auth uses raw API key header per Linear's spec. 10 tests cover metadata, healthCheck, pagination, cursor forwarding, fetch formatting, and error paths. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Implements JiraConnector using Jira REST API v3 with Basic auth (email:apiToken base64). Supports paginated JQL search filtered by project and statuses (50/page). Fetches issues with summary, description, status, assignee, and comments. ADF (Atlassian Document Format) is converted to plain text via a recursive node walker handling text/paragraph types. 12 tests cover metadata, auth, healthCheck, pagination, JQL construction, ADF rendering, and error paths. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
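A recursive walker over ADF of the kind described above can be sketched in a few lines. This is an illustration, not the connector's code: the `AdfNode` interface covers only the fields used here, and `adfToPlainText` is a hypothetical name. ADF documents are trees whose nodes carry a `type`, optional `content` children, and `text` on text nodes.

```typescript
// Minimal subset of an ADF node; real nodes carry more fields (marks, attrs, version).
interface AdfNode {
  type: string;
  text?: string;
  content?: AdfNode[];
}

function adfToPlainText(node: AdfNode): string {
  // Leaf text nodes carry the actual characters.
  if (node.type === "text") return node.text || "";

  // Every other node is walked recursively through its children.
  const inner = (node.content || []).map((child) => adfToPlainText(child)).join("");

  // Paragraphs end with a newline so consecutive paragraphs stay separated.
  return node.type === "paragraph" ? inner + "\n" : inner;
}
```

Node types the walker does not recognise fall through to the generic branch, so their text children are still collected even though their formatting is lost.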
…ry logic Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
… fix
The prior test passed even on the pre-fix paragraph chunker because it only required 'any embedded input ends with punctuation', which a whole paragraph ending in a period satisfied. Tighten the test to require observable signals of semantic chunking:
1. embed() is invoked at least twice (sentence-boundary discovery + chunk embedding)
2. at least one batch contains multiple sentences (the boundary-discovery call over the sentence array)
3. at least one input is a single sentence ending in punctuation, not a whole paragraph
Verified the tightened assertions fail on the pre-fix pipeline and pass on the current one.
…heading+first paragraph atomic)
Makes model setup and switching a first-class CLI flow instead of requiring hand-editing opendocuments.config.ts. Also extends init to cover the three new providers added in the previous commit (DeepSeek, Mistral, openai-compatible).
New command: opendocuments model
- model list [--suggestions]: current config + installed Ollama models with per-model disk footprint, plus a curated catalog of local/cloud options.
- model pull <name>: streams /api/pull progress, checks Ollama reachability, estimates disk footprint, and refuses to clobber a low-disk machine silently. Offers the official install command when Ollama is missing.
- model rm <name>: deletes a local Ollama model.
- model test [-p prompt]: round-trips a short prompt against the configured LLM and embedder, reports latency/chunks/embedding-dim. Surfaces degraded-mode issues immediately.
- model switch: interactive provider swap. Rewrites only the `model:` block of opendocuments.config.ts (preserves the rest of the config). Supports all 8 providers including openai-compatible with baseUrl.
init improvements
- Cloud menu now includes DeepSeek and Mistral.
- Third backend option: "OpenAI-compatible endpoint" (vLLM / LM Studio / Groq / Together / Fireworks / OpenRouter) with baseUrl prompt.
- API key validation extended to grok, deepseek, mistral (previously openai / anthropic / google only).
- "Secondary embedding provider" flow generalised from anthropic-only to any provider that lacks an embeddings API (deepseek joins anthropic).
- Ollama auto-install: on macOS/Linux, offers to run the official install script and waits for the daemon to come up.
- Pre-pull disk-space check with per-model size estimates (1.5GB headroom).
- Progress line updates in place instead of spamming stdio.inherit.
- Local model recommendations refreshed for April 2026 (Gemma 3 / Qwen 3.5 / Llama 4 / DeepSeek R1 distilled).
Supporting utilities
- packages/cli/src/utils/ollama.ts: shared Ollama client (isRunning, listModels, pullModel w/ progress, deleteModel, disk-space + size estimator, install-command selector).
- 4 unit tests for rewriteModelBlock (cloud swap, openai-compatible with baseUrl, round-trip back to ollama, missing-block error).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This reverts commit 946045b.
The 'balanced' profile now fans out to multi-query + contextual retrieval +
parent-doc which pushes the test past the 5s default even with stub LLMs.
This test only verifies routing ('rag' vs 'direct') so fast profile is
sufficient. 120s timeout accommodates real-LLM environments too.
Address code review findings:
- Pipeline now reads contextualRetrieval and chunkAugmentation from the active RAG profile, so balanced/precise actually activate them without every caller passing options explicitly. Resolution order: explicit options > config.rag.custom.features > profile.
- Profile tests now assert every new feature flag per profile (hyde, multiQuery, multiQueryN, parentDocRetrieval, crossEncoder, contextualRetrieval, chunkAugmentation), gating profile drift.
- Eval integration test raises floor from 'valid range' to 'hitAt5 > 0 and MRR > 0', catching a silent regression if retrieval ever stops finding any relevant doc.
- Added a doc comment on SearchResult.contextualPrefix clarifying it is for debugging/eval tooling and is never prepended to generator-facing content.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 79c459dfb8
} catch (err) {
  console.error('[conversation] Failed to persist:', err instanceof Error ? err.message : String(err))
}
Always emit done event after persistence failure
If message persistence throws in this try block (for example, a transient DB failure or a conversation row disappearing between validation and write), the catch only logs and never sends a done SSE. In this path the stream then ends normally, so the web client never receives done/error and can stay stuck in streaming state without finalizing the assistant message. Emit a fallback done (or an explicit error) from the catch to keep the stream protocol consistent.
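One way to address this suggestion is sketched below, under stated assumptions: the `StreamEvent` shapes, the `persist` callback returning the conversation id, and the `finalizeStream` helper are hypothetical stand-ins for the route's actual SSE plumbing, which is not shown in this excerpt. The point is only that both the success path and the catch path end with exactly one terminal event.

```typescript
// Hypothetical terminal events; the real SSE payloads may differ.
type StreamEvent =
  | { type: "done"; conversationId?: string }
  | { type: "error"; message: string };

async function finalizeStream(
  persist: () => Promise<string>,       // assumed to resolve with the conversation id
  send: (event: StreamEvent) => void,   // assumed to write one SSE event to the client
): Promise<void> {
  try {
    const conversationId = await persist();
    send({ type: "done", conversationId: conversationId });
  } catch (err) {
    console.error(
      "[conversation] Failed to persist:",
      err instanceof Error ? err.message : String(err),
    );
    // Fallback: still tell the client the stream is finished, without an id.
    send({ type: "done" });
  }
}
```

Sending an explicit `error` event from the catch would also keep the protocol consistent; which one fits better depends on whether the client should treat an unsaved conversation as a failure.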
Summary
Fixes the workspace isolation issues found during review:
- `RAGEngine` instances so HTTP chat and streaming chat query the authenticated/requested workspace instead of the default workspace
- `conversationId` in the streaming `done` event and sends the stored conversation id from the web client on subsequent streamed messages
Root Cause
The server created workspace-scoped stores, pipelines, and managers, but chat routes still used the default `ctx.ragEngine`. Tag and collection managers also accepted raw ids without verifying that both sides of a relationship belonged to the active workspace. Streaming chat persisted new conversations but did not return the generated id to the client.
Validation
- `npx vitest run tests/document/managers.test.ts` in `packages/core` -> 13 tests passed
- `npx vitest run tests/http/workspace.test.ts tests/http/chat.test.ts` in `packages/server` -> 10 tests passed
- `npm run typecheck` -> 33 tasks successful
Note: `better-sqlite3` was rebuilt against the package test runtime Node 20 before running Vitest.