Fix workspace chat isolation and conversation streaming by joungminsung · Pull Request #11 · joungminsung/OpenDocuments

joungminsung · 2026-05-13T00:27:38Z

Summary

Fixes the workspace isolation issues found during review:

- Adds workspace-scoped RAGEngine instances so HTTP chat and streaming chat query the authenticated/requested workspace instead of the default workspace.
- Constrains tag and collection document mutations to the manager workspace to prevent cross-workspace metadata changes.
- Returns conversationId in the streaming done event and sends the stored conversation id from the web client on subsequent streamed messages.

Root Cause

The server created workspace-scoped stores, pipelines, and managers, but chat routes still used the default ctx.ragEngine. Tag and collection managers also accepted raw ids without verifying that both sides of a relationship belonged to the active workspace. Streaming chat persisted new conversations but did not return the generated id to the client.
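The two isolation fixes described above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: `WorkspaceEngineRegistry`, `assertSameWorkspace`, and the `workspaceId` field are hypothetical names, since the real RAGEngine and manager types are not shown on this page.

```typescript
// Hypothetical registry: one engine per workspace instead of a single default ctx.ragEngine.
class WorkspaceEngineRegistry<E> {
  private readonly engines = new Map<string, E>();
  private readonly create: (workspaceId: string) => E;

  constructor(create: (workspaceId: string) => E) {
    this.create = create;
  }

  // Return the engine bound to this workspace, creating it on first use.
  get(workspaceId: string): E {
    let engine = this.engines.get(workspaceId);
    if (engine === undefined) {
      engine = this.create(workspaceId);
      this.engines.set(workspaceId, engine);
    }
    return engine;
  }
}

// Hypothetical guard for tag/collection mutations: every side of the
// relationship must belong to the active workspace before anything is written.
function assertSameWorkspace(
  activeWorkspaceId: string,
  ...owners: Array<{ workspaceId: string }>
): void {
  for (const owner of owners) {
    if (owner.workspaceId !== activeWorkspaceId) {
      throw new Error("Resource does not belong to the active workspace");
    }
  }
}
```

A chat route would then resolve its engine with something like `registry.get(requestedWorkspaceId)` rather than reading a shared default.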

Validation

- `npx vitest run tests/document/managers.test.ts` in packages/core -> 13 tests passed
- `npx vitest run tests/http/workspace.test.ts tests/http/chat.test.ts` in packages/server -> 10 tests passed
- `npm run typecheck` -> 33 tasks successful

Note: better-sqlite3 was rebuilt against the package test runtime Node 20 before running Vitest.

Increase maxContextTokens from 4096→16384 default, raise chunk allocation from 50%→65%. Profile updates: fast 8K, balanced 16K, precise 32K. This is the single biggest accuracy fix — prevents good retrieval results from being truncated before the LLM sees them. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

When a chunk is retrieved, also fetch its neighboring chunks (window=1) from the same document. Siblings get a discounted score (0.6x parent). This provides surrounding context that prevents the 'found the right file but incomplete answer' problem. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
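A rough sketch of the sibling expansion described above, assuming chunks are addressed by a document id and a position within the document. The `ScoredChunk` shape and `expandWithSiblings` are illustrative names; the actual implementation fetches neighbors from the store, which is not shown here.

```typescript
interface ScoredChunk {
  documentId: string;
  index: number; // position of the chunk within its document
  score: number;
}

function expandWithSiblings(
  hits: ScoredChunk[],
  chunkCountByDocument: Map<string, number>,
  windowSize = 1,
  siblingDiscount = 0.6,
): ScoredChunk[] {
  const best = new Map<string, ScoredChunk>();
  // Keep the highest score seen for each (document, index) pair.
  const keep = (chunk: ScoredChunk): void => {
    const key = `${chunk.documentId}#${chunk.index}`;
    const existing = best.get(key);
    if (existing === undefined || chunk.score > existing.score) best.set(key, chunk);
  };

  for (const hit of hits) {
    keep(hit);
    const total = chunkCountByDocument.get(hit.documentId) ?? 0;
    for (let offset = -windowSize; offset <= windowSize; offset++) {
      const index = hit.index + offset;
      if (offset === 0 || index < 0 || index >= total) continue;
      // Siblings inherit a discounted copy of the parent's score.
      keep({ documentId: hit.documentId, index, score: hit.score * siblingDiscount });
    }
  }
  return [...best.values()].sort((a, b) => b.score - a.score);
}
```

Keeping the maximum score per chunk means a chunk that is both a direct hit and a sibling of another hit is never demoted by the discount.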

Add explicit RULES section (source-only, cite everything, handle conflicts) and RESPONSE FORMAT (direct answer first, then details). Small models produce significantly better answers with structured constraints. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

When merging dense + sparse results, RRF now multiplies rank score by original similarity score. A 0.95-similarity result at rank 2 now outscores a 0.50-similarity result at rank 1. Enabled by default in the Retriever's dense+sparse merge. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
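The score-weighted fusion described above might look like the sketch below. It assumes the usual reciprocal-rank term 1/(k + rank) with k = 60, a common default that this commit message does not confirm; `RankedHit` and `weightedRrf` are illustrative names.

```typescript
interface RankedHit {
  id: string;
  similarity: number; // original similarity score from the dense or sparse retriever
}

// Each list is assumed to be sorted best-first; rank is 1-based.
function weightedRrf(lists: RankedHit[][], k = 60): Map<string, number> {
  const fused = new Map<string, number>();
  for (const list of lists) {
    list.forEach((hit, i) => {
      const rank = i + 1;
      // Plain RRF would add 1 / (k + rank); here the term is scaled by similarity.
      const contribution = hit.similarity / (k + rank);
      fused.set(hit.id, (fused.get(hit.id) ?? 0) + contribution);
    });
  }
  return fused;
}
```

Under these assumptions a 0.95-similarity hit at rank 2 contributes 0.95/62, which is larger than the 0.50/61 contributed by a 0.50-similarity hit at rank 1, matching the behavior the commit describes.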

Replace exact word matching with substring matching (handles Korean agglutination and English prefixes like auth→authentication). Add heading hierarchy boost (0.2 weight) — query words in headings are strong relevance signals. Rebalance weights: original 0.5, content 0.3, heading 0.2. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Boost retrieval scores by 15% when query keywords appear in heading hierarchy, and by 10% when chunk type aligns with query intent (e.g., code-ast chunks for code queries, table chunks for data queries). Lightweight post-retrieval step, no additional DB queries. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
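One way to read the boost step above, as a sketch only: it assumes the two boosts are multiplicative and that a `table` chunk type exists alongside the `code-ast` type named in the commit. `Candidate`, `PREFERRED_TYPE`, and `boostScore` are hypothetical names.

```typescript
type Intent = "code" | "data" | "concept";

interface Candidate {
  score: number;
  headings: string[]; // heading hierarchy the chunk sits under
  chunkType: string;
}

// Assumed mapping from query intent to the chunk type that should be favored.
const PREFERRED_TYPE: Partial<Record<Intent, string>> = {
  code: "code-ast",
  data: "table",
};

function boostScore(candidate: Candidate, keywords: string[], intent: Intent): number {
  let score = candidate.score;
  const headingText = candidate.headings.join(" ").toLowerCase();
  const headingHit = keywords.some(
    (keyword) => keyword.length > 0 && headingText.includes(keyword.toLowerCase()),
  );
  if (headingHit) score *= 1.15; // +15% when a query keyword appears in the headings
  if (PREFERRED_TYPE[intent] === candidate.chunkType) score *= 1.1; // +10% for matching type
  return score;
}
```

Because the function works only on fields already attached to each candidate, it needs no extra database queries, in line with the commit's description.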

Replace heuristic token estimation (~85% English, ~70% Korean accuracy) with tiktoken-based counting using cl100k_base encoding. Includes sampling-based extrapolation for long texts to maintain performance, and falls back to the heuristic if tiktoken is unavailable. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
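The sampling and fallback behavior described above can be sketched without committing to a specific tokenizer API by injecting the encoder as a function. Everything here is an assumption for illustration: the `Encode` type, the thresholds, the three-window sampling scheme, and the characters-per-token fallback are not taken from the commit.

```typescript
// Injected encoder, e.g. a thin wrapper around a cl100k_base tokenizer.
type Encode = (text: string) => ArrayLike<number>;

// Assumed fallback heuristic: roughly four characters per token.
function heuristicTokenCount(text: string): number {
  return Math.ceil(text.length / 4);
}

function countTokens(
  text: string,
  encode?: Encode,
  sampleThreshold = 20_000,
  sampleSize = 2_000,
): number {
  if (encode === undefined) return heuristicTokenCount(text); // tokenizer unavailable
  if (text.length <= sampleThreshold) return encode(text).length; // exact count for short text

  // Long text: encode windows from the start, middle, and end, then extrapolate
  // the observed tokens-per-character ratio to the full length.
  const starts = [0, Math.floor((text.length - sampleSize) / 2), text.length - sampleSize];
  let sampledChars = 0;
  let sampledTokens = 0;
  for (const start of starts) {
    const sample = text.slice(start, start + sampleSize);
    sampledChars += sample.length;
    sampledTokens += encode(sample).length;
  }
  return Math.round((sampledTokens / sampledChars) * text.length);
}
```

The extrapolated count is an estimate, so callers that enforce a hard token limit may want a small safety margin on long inputs.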

…eranker Replace substring matching with word-boundary regex to prevent false positives (e.g. "auth" matching "author", "log" matching "login"). Add n-gram phrase scoring for consecutive bigrams/trigrams. Update scoring weights to original*0.4 + wordMatch*0.25 + ngram*0.15 + heading*0.2. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
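A minimal sketch of the word-boundary matching and the weight combination described above. `wordMatchRatio` and `combinedScore` are illustrative names, and the n-gram and heading components are taken as precomputed inputs. Note that `\b` in JavaScript regular expressions is defined in terms of ASCII word characters, so non-Latin text such as Korean would need a different boundary rule than the one shown.

```typescript
// Escape regex metacharacters so query words are matched literally.
function escapeRegExp(text: string): string {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Fraction of query words that occur as whole words in the content.
function wordMatchRatio(queryWords: string[], content: string): number {
  if (queryWords.length === 0) return 0;
  const matched = queryWords.filter((word) =>
    new RegExp(`\\b${escapeRegExp(word)}\\b`, "i").test(content),
  );
  return matched.length / queryWords.length;
}

// Weights from the commit message: original*0.4 + wordMatch*0.25 + ngram*0.15 + heading*0.2.
function combinedScore(original: number, wordMatch: number, ngram: number, heading: number): number {
  return original * 0.4 + wordMatch * 0.25 + ngram * 0.15 + heading * 0.2;
}
```

With whole-word matching, `auth` no longer matches `author` and `log` no longer matches `login`, which are the false positives the commit calls out.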

Introduce per-intent weight profiles for fallback reranking so code queries favor code-ast chunks, concept queries favor semantic chunks, etc. The engine now passes classifyIntent(query) to rerankResults. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Code queries get 75% chunk space (less history), concept queries get more history (55% chunks), so each intent type gets an optimized context window layout. Falls back to default allocation for unknown intents. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
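The intent-specific layout above amounts to a small lookup with a fallback. In this sketch the 0.75 and 0.55 shares come from the commit message, while the 0.65 default is assumed from the earlier context-window commit; `allocateContext` and the split into "chunk" and "remaining" tokens are illustrative.

```typescript
const DEFAULT_CHUNK_SHARE = 0.65; // assumed default allocation
const CHUNK_SHARE_BY_INTENT = new Map<string, number>([
  ["code", 0.75], // code queries: more chunk space, less history
  ["concept", 0.55], // concept queries: more room for history
]);

function allocateContext(
  intent: string,
  maxContextTokens: number,
): { chunkTokens: number; remainingTokens: number } {
  // Unknown intents fall back to the default allocation.
  const share = CHUNK_SHARE_BY_INTENT.get(intent) ?? DEFAULT_CHUNK_SHARE;
  const chunkTokens = Math.floor(maxContextTokens * share);
  return { chunkTokens, remainingTokens: maxContextTokens - chunkTokens };
}
```

For a 16384-token window this gives 12288 chunk tokens for a code query and 9011 for a concept query.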

…dling

- Expand Korean-English dictionary to 300+ technical terms with custom dictionary support
- Add fallback minScore (0.15) to prevent returning irrelevant results
- Fix path traversal vulnerability in static file serving
- Hide internal error details from streaming chat responses
- Add actionable error messages for CLI index command (ENOENT, EACCES)
- Read CLI version dynamically from package.json
- Log unexpected plugin loading errors instead of silently swallowing
- Add CLAUDE.md project instructions and opendocuments.config.ts

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

…Ollama integration

- Add non-root user (opendocs) with correct /data and /app ownership
- Add OCI image labels (title, description, source, license)
- Set ENV defaults for NODE_ENV, OPENDOCUMENTS_DATA_DIR, PORT
- Add HEALTHCHECK via wget against /health endpoint
- docker-compose: add healthcheck probes for both services
- docker-compose: add env var documentation comments for model providers
- docker-compose: add restart: unless-stopped to both services
- docker-compose: add depends_on ollama with required: false

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Add GET /api/v1/healthz (liveness) and GET /api/v1/readyz (readiness) endpoints. Readiness checks SQLite, VectorDB, and all model plugin health, returning 200 when ready or 503 with per-check details when not. Also bumps version in /api/v1/health to '0.2.0'. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
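The readiness behavior described above (200 when every check passes, 503 with per-check details otherwise) can be sketched as a pure aggregation function. `runReadiness`, `CheckResult`, and the check names are hypothetical; the actual route handlers and plugin health API are not shown on this page.

```typescript
interface CheckResult {
  name: string;
  ok: boolean;
  detail?: string;
}

// A check resolves when the dependency is healthy and rejects otherwise.
type Check = () => Promise<void>;

async function runReadiness(
  checks: Record<string, Check>,
): Promise<{ status: 200 | 503; checks: CheckResult[] }> {
  const results = await Promise.all(
    Object.entries(checks).map(async ([name, check]): Promise<CheckResult> => {
      try {
        await check();
        return { name, ok: true };
      } catch (err) {
        // One failing dependency must not prevent the others from being reported.
        return { name, ok: false, detail: err instanceof Error ? err.message : String(err) };
      }
    }),
  );
  return { status: results.every((r) => r.ok) ? 200 : 503, checks: results };
}
```

A liveness endpoint, by contrast, would typically return 200 without touching any dependency, so that a slow database does not cause the process to be restarted.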

Adds two new CLI commands:

- `backup` copies opendocuments.db, WAL file, vectors/ dir, and current-workspace file to a timestamped directory under ~/.opendocuments/backups/ (or a custom path via -o/--output).
- `restore` copies the same artifacts back from a given backup directory, requiring --force when target files already exist.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Adds update-checker.ts with checkForUpdates(), getCachedUpdateInfo(), and compareVersions() utilities. Results are cached 24 hours; fetch failures return a safe fallback and never throw. The stats endpoint includes the cached update info when available, and bootstrap fires a non-blocking check after startup, logging a message via log.info() if a newer version exists. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
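For illustration, a version comparison along the lines of the compareVersions() utility named above could look like this. The real signature and return convention are not shown in the commit message, so the -1/0/1 result and the handling of a leading `v` are assumptions; pre-release suffixes are ignored in this sketch.

```typescript
// Returns 1 if a is newer than b, -1 if older, 0 if equal (assumed convention).
function compareVersions(a: string, b: string): number {
  const parse = (version: string): number[] =>
    version
      .replace(/^v/, "")
      .split(".")
      .map((part) => Number.parseInt(part, 10) || 0); // non-numeric parts count as 0

  const partsA = parse(a);
  const partsB = parse(b);
  for (let i = 0; i < Math.max(partsA.length, partsB.length); i++) {
    const diff = (partsA[i] ?? 0) - (partsB[i] ?? 0); // missing parts count as 0
    if (diff !== 0) return diff > 0 ? 1 : -1;
  }
  return 0;
}
```

Comparing numerically per segment avoids the string-comparison trap where "0.10.0" would sort before "0.9.0".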

Introduce buildConfigFromEnv() that maps OPENDOCUMENTS_* env vars to OpenDocumentsConfig fields, falling back to well-known provider API key env vars (OPENAI_API_KEY etc.). loadConfig() now calls this as a fallback when no config file exists, before returning hard-coded defaults. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Add GET/POST/DELETE plugin API routes (npm search, install, uninstall via execSync), mount them in app.ts, expose four api.ts helper functions, and add a tabbed Marketplace UI to PluginsPage with a new PluginMarketplace component. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Implements detectHardware() using node:os and recommendModels() that selects qwen2.5 LLM + embedding model based on effective available memory (GPU VRAM when present, else 60% of system RAM). Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
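The "effective available memory" rule above can be written as a one-line selection. This sketch covers only that rule, with a hypothetical `HardwareInfo` shape; how detectHardware() obtains GPU VRAM and how recommendModels() maps memory to specific model sizes are not described here and are not reproduced.

```typescript
interface HardwareInfo {
  totalRamBytes: number;
  gpuVramBytes?: number; // absent or 0 when no GPU is detected
}

// GPU VRAM when present, otherwise 60% of system RAM.
function effectiveMemoryBytes(hardware: HardwareInfo): number {
  if (hardware.gpuVramBytes !== undefined && hardware.gpuVramBytes > 0) {
    return hardware.gpuVramBytes;
  }
  return hardware.totalRamBytes * 0.6;
}
```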

…ed metrics Add GET /api/v1/admin/benchmark endpoint that measures generation speed (tok/s + latency) and embedding speed (texts/s + latency) for each registered model plugin, with per-model try/catch for resilience. Add getModelBenchmarks() to the web API client and BenchmarkDashboard React component with auto-fetch on mount, Run Benchmark button, health indicator dots, and capability badges. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

…chunk fitting Implements compressContext() with three progressive strategies: drop lowest-scoring chunks, deduplicate repeated sentences across chunks, and proportionally truncate at sentence boundaries to fit retrieved context within a token budget. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
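As a sketch of the first of the three strategies above (dropping the lowest-scoring chunks), assuming a token counter is supplied by the caller. `dropLowestScoring` and `ContextChunk` are illustrative names; the deduplication and sentence-boundary truncation strategies are not shown.

```typescript
interface ContextChunk {
  text: string;
  score: number;
}

function dropLowestScoring(
  chunks: ContextChunk[],
  tokenBudget: number,
  countTokens: (text: string) => number,
): ContextChunk[] {
  // Work on a copy sorted best-first so the weakest chunk is always last.
  const kept = [...chunks].sort((a, b) => b.score - a.score);
  let total = kept.reduce((sum, chunk) => sum + countTokens(chunk.text), 0);

  // Always keep at least one chunk; a later strategy can truncate it if needed.
  while (kept.length > 1 && total > tokenBudget) {
    const dropped = kept.pop()!;
    total -= countTokens(dropped.text);
  }
  return kept;
}
```

Trying the cheapest, least lossy strategy first means truncation only happens when dropping and deduplicating are not enough.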

Implements @opendocuments/connector-slack that indexes Slack channels by thread. Supports Bearer token auth, channel filtering, paginated history fetching, user name resolution, and sourcePath format slack://channel-name/thread-ts. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Implements LinearConnector with paginated GraphQL issue discovery (cursor-based, 50/page), team/status filtering, and markdown formatting of issue details and comments. Auth uses raw API key header per Linear's spec. 10 tests cover metadata, healthCheck, pagination, cursor forwarding, fetch formatting, and error paths. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Implements JiraConnector using Jira REST API v3 with Basic auth (email:apiToken base64). Supports paginated JQL search filtered by project and statuses (50/page). Fetches issues with summary, description, status, assignee, and comments. ADF (Atlassian Document Format) is converted to plain text via a recursive node walker handling text/paragraph types. 12 tests cover metadata, auth, healthCheck, pagination, JQL construction, ADF rendering, and error paths. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
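A recursive walker of the kind described above might look like the following. It handles only the `text` and `paragraph` node types mentioned in the commit and passes through any other container; `AdfNode` and `adfToText` are illustrative names, and ending each paragraph with a newline is an assumption.

```typescript
// Minimal view of an Atlassian Document Format node.
interface AdfNode {
  type: string;
  text?: string;
  content?: AdfNode[];
}

function adfToText(node: AdfNode): string {
  // Leaf text nodes carry the actual characters.
  if (node.type === "text") return node.text ?? "";
  // Containers (doc, paragraph, and anything unrecognized) recurse into children.
  const inner = (node.content ?? []).map(adfToText).join("");
  // Assumed convention: terminate each paragraph with a newline.
  return node.type === "paragraph" ? `${inner}\n` : inner;
}
```

For example, a `doc` node containing two paragraphs, "Hello world" and "Bye", would be rendered as two newline-terminated lines.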

… fix

The prior test passed even on the pre-fix paragraph chunker because it only required 'any embedded input ends with punctuation', which a whole paragraph ending in a period satisfied. Tighten the test to require observable signals of semantic chunking:

1. embed() is invoked at least twice (sentence-boundary discovery + chunk embedding)
2. at least one batch contains multiple sentences (the boundary-discovery call over the sentence array)
3. at least one input is a single sentence ending in punctuation, not a whole paragraph

Verified the tightened assertions fail on the pre-fix pipeline and pass on the current one.

Makes model setup and switching a first-class CLI flow instead of requiring hand-editing opendocuments.config.ts. Also extends init to cover the three new providers added in the previous commit (DeepSeek, Mistral, openai-compatible).

New command: opendocuments model

- model list [--suggestions]: current config + installed Ollama models with per-model disk footprint, plus a curated catalog of local/cloud options.
- model pull <name>: streams /api/pull progress, checks Ollama reachability, estimates disk footprint, and refuses to clobber a low-disk machine silently. Offers the official install command when Ollama is missing.
- model rm <name>: deletes a local Ollama model.
- model test [-p prompt]: round-trips a short prompt against the configured LLM and embedder, reports latency/chunks/embedding-dim. Surfaces degraded-mode issues immediately.
- model switch: interactive provider swap. Rewrites only the `model:` block of opendocuments.config.ts (preserves the rest of the config). Supports all 8 providers including openai-compatible with baseUrl.

init improvements

- Cloud menu now includes DeepSeek and Mistral.
- Third backend option: "OpenAI-compatible endpoint" (vLLM / LM Studio / Groq / Together / Fireworks / OpenRouter) with baseUrl prompt.
- API key validation extended to grok, deepseek, mistral (previously openai / anthropic / google only).
- "Secondary embedding provider" flow generalised from anthropic-only to any provider that lacks an embeddings API (deepseek joins anthropic).
- Ollama auto-install: on macOS/Linux, offers to run the official install script and waits for the daemon to come up.
- Pre-pull disk-space check with per-model size estimates (1.5GB headroom).
- Progress line updates in place instead of spamming stdio.inherit.
- Local model recommendations refreshed for April 2026 (Gemma 3/Qwen 3.5/Llama 4/DeepSeek R1 distilled).

Supporting utilities

- packages/cli/src/utils/ollama.ts: shared Ollama client (isRunning, listModels, pullModel w/ progress, deleteModel, disk-space + size estimator, install-command selector).
- 4 unit tests for rewriteModelBlock (cloud swap, openai-compatible with baseUrl, round-trip back to ollama, missing-block error).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

The 'balanced' profile now fans out to multi-query + contextual retrieval + parent-doc which pushes the test past the 5s default even with stub LLMs. This test only verifies routing ('rag' vs 'direct') so fast profile is sufficient. 120s timeout accommodates real-LLM environments too.

Address code review findings:

- Pipeline now reads contextualRetrieval and chunkAugmentation from the active RAG profile, so balanced/precise actually activate them without every caller passing options explicitly. Resolution order: explicit options > config.rag.custom.features > profile.
- Profile tests now assert every new feature flag per profile (hyde, multiQuery, multiQueryN, parentDocRetrieval, crossEncoder, contextualRetrieval, chunkAugmentation), gating profile drift.
- Eval integration test raises floor from 'valid range' to 'hitAt5 > 0 and MRR > 0', catching a silent regression if retrieval ever stops finding any relevant doc.
- Added a doc comment on SearchResult.contextualPrefix clarifying it is for debugging/eval tooling and is never prepended to generator-facing content.
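The resolution order stated above (explicit options > config.rag.custom.features > profile) maps naturally onto nullish coalescing. This is a sketch with hypothetical names (`FeatureFlags`, `resolveFeature`); the final `false` default for a flag that no layer defines is an assumption.

```typescript
interface FeatureFlags {
  contextualRetrieval?: boolean;
  chunkAugmentation?: boolean;
}

function resolveFeature(
  name: keyof FeatureFlags,
  explicit: FeatureFlags | undefined, // options passed by the caller
  custom: FeatureFlags | undefined, // config.rag.custom.features
  profile: FeatureFlags, // flags from the active RAG profile
): boolean {
  // `??` only falls through on undefined/null, so an explicit `false` is respected.
  return explicit?.[name] ?? custom?.[name] ?? profile[name] ?? false;
}
```

Using `??` rather than `||` matters here: with `||`, a caller explicitly disabling a feature would be silently overridden by a profile that enables it.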

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 79c459dfb8

chatgpt-codex-connector · 2026-05-21T03:54:11Z

        } catch (err) {
          console.error('[conversation] Failed to persist:', err instanceof Error ? err.message : String(err))
        }


Always emit done event after persistence failure

If message persistence throws in this try block (for example, a transient DB failure or a conversation row disappearing between validation and write), the catch only logs and never sends a done SSE. In this path the stream then ends normally, so the web client never receives done/error and can stay stuck in streaming state without finalizing the assistant message. Emit a fallback done (or an explicit error) from the catch to keep the stream protocol consistent.
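One possible shape for the fix suggested above is to move the terminal event into a `finally` block so it is sent on both paths. This is a sketch under assumptions, not the handler from this PR: `finishStream`, the `StreamEvent` union, and the `persisted` flag are hypothetical, and the real SSE writer is abstracted as a `send` callback.

```typescript
type StreamEvent =
  | { type: "chunk"; text: string }
  | { type: "done"; conversationId: string | undefined; persisted: boolean };

async function finishStream(
  send: (event: StreamEvent) => void,
  persist: () => Promise<string>, // resolves to the stored conversation id
): Promise<void> {
  let conversationId: string | undefined;
  let persisted = false;
  try {
    conversationId = await persist();
    persisted = true;
  } catch (err) {
    console.error("[conversation] Failed to persist:", err instanceof Error ? err.message : String(err));
  } finally {
    // Always terminate the stream protocol, even when persistence failed,
    // so the client can leave its streaming state.
    send({ type: "done", conversationId, persisted });
  }
}
```

Sending an explicit error event from the catch block instead would also satisfy the review comment; the `persisted` flag is one way to let the client tell the two outcomes apart while still finalizing the assistant message.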

joungminsung and others added 30 commits March 31, 2026 15:55

fix: remove duplicate classifyIntent call in retrieveWithFeatures

b4c4255

Use the intent parameter already passed to retrieveWithFeatures instead of calling classifyIntent() again when invoking rerankResults. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

feat(core): add semantic sentence-level chunking with embedding simil…

183e87b

…arity boundaries Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

feat(core): add embedding-based semantic grounding for hallucination …

f5c74c2

…detection Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

feat(core): add plugin sandboxing with network and filesystem permiss…

7bbf0f5

…ion enforcement Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

feat(core): add community plugin registry with GitHub-hosted plugin l…

54c73dd

…isting Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

feat(core): add Ollama model auto-download with progress streaming

0808cf1

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

feat: add Discord connector plugin with day-based message batching

1f5e1e0

Implements DiscordConnector that indexes guild text channels by grouping messages into per-day batches (YYYY-MM-DD), with Bot token auth and Discord API v10. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

joungminsung and others added 23 commits April 1, 2026 12:20

feat(core): add webhook event dispatcher with HMAC signatures and ret…

7a0dc68

…ry logic Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

fix(core): add webhooks field to default config

cef5a14

fix(core): wire semantic chunking into ingest pipeline

1c9e14a

feat(core): structure-preserving chunking (fenced code, pipe tables, …

bfabce8

…heading+first paragraph atomic)

test(core): gate structure-preserving chunker with code-block-with-bl…

15f5854

…ank-lines case

feat(core): document-type-aware chunking dispatcher

e145270

feat(core): add LLM-authored chunk contextual prefix generator

774d5aa

Revert "feat(cli): add model management command and improve init UX"

3818b2a

This reverts commit 946045b.

feat(core): contextual retrieval — prepend LLM-authored prefix before…

569d9fc

… embedding

feat(core): add HyDE hypothetical answer generator

c396672

feat(core): add multi-query paraphrase expansion

4bfd297

feat(core): parent-document retrieval — chunks carry enclosing sectio…

fd0403e

…n text

feat(core): orchestrate HyDE + multi-query + parent-doc retrieval in …

36f4692

…engine

feat(core): proposition + hypothetical-question augmentation for FTS …

7d04de0

…recall

feat(core): add cross-encoder LLM reranker behind profile flag

f22afa1

feat(core): RAG evaluation harness (hit@K, MRR, nDCG) with gold dataset

a99da52

chore: changeset + docs + eval runner for RAG accuracy redesign

4100683

chore: release v0.3.0

0dccdff

fix: isolate workspace chat and document metadata

fd94184

joungminsung changed the base branch from feat/rag-accuracy-improvements to main May 13, 2026 00:54

merge main into workspace isolation branch

79c459d

joungminsung marked this pull request as ready for review May 21, 2026 03:50

merge main into workspace isolation branch

7f1bfc6

chatgpt-codex-connector Bot reviewed May 21, 2026

joungminsung changed the title ~~[codex] Fix workspace chat isolation and conversation streaming~~ Fix workspace chat isolation and conversation streaming May 21, 2026

joungminsung merged commit 6ea5635 into main May 21, 2026
2 checks passed

Fix workspace chat isolation and conversation streaming #11
joungminsung merged 56 commits into main from codex/workspace-rag-isolation-fixes