fix(mcp): dedupe one repo's two path spellings onto one DB connection (#1057) #1082
Open
inth3shadows wants to merge 1 commit into
…ath (colbymchenry#1057) Two spellings of one repo — a symlinked checkout, or upper/lowercase variants of a path on a case-insensitive mount (Windows NTFS, WSL DrvFs /mnt) — resolved to two different cache keys, so the MCP server opened a second SQLite connection to the same .codegraph/codegraph.db; concurrent writes then corrupted the index. Key projectCache (and the default-instance reuse check) on (dev,ino) filesystem identity, which is identical for every spelling. realpath alone is insufficient: on a case-insensitive, case-preserving filesystem it returns the caller's casing and cannot dedupe case-variants. Mirrors the existing inode-identity pattern in DatabaseConnection.openedInode. Adds __tests__/root-identity.test.ts (symlink case is a deterministic, FS-agnostic proxy for the case-insensitive-mount scenario).
Problem (#1057)
On WSL, opening a repo under `/mnt` via two different path spellings (most concretely an upper- vs lowercase variant of the same path) corrupts `.codegraph`, even with `CODEGRAPH_NO_DAEMON` set. The same class of bug hits a symlinked checkout on any platform.

Root cause
`ToolHandler.getCodeGraph` caches each open `CodeGraph` (a live SQLite connection) in `projectCache`, and reuses the default instance, keyed on the resolved-root path string. Two spellings of one physical directory resolve to two different strings → two cache entries → two SQLite connections to the same `.codegraph/codegraph.db`. Concurrent writes across the two connections corrupt the index. (This is the same second-connection hazard already documented at the #238 comment in that method.)

Why not just `realpathSync` the root

That was my first attempt, and it is insufficient, as verified on a real WSL DrvFs mount: `realpathSync` resolves symlinks and `.`/`..`, but on a case-insensitive, case-preserving filesystem it returns the caller's casing, so it cannot dedupe case-variants. Filesystem identity `(dev, ino)` is identical for every spelling and is the robust key.

Fix
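To make the identity-key idea concrete, here is a minimal sketch of what a `(dev, ino)` key and a cache built on it could look like. It is not the helper from this PR, whose body is not reproduced on this page: the fallback to a resolved path string, the `getOrOpen` function, and the `{ root }` entry shape are all assumptions made for illustration.

```typescript
import { statSync } from "node:fs";
import { resolve } from "node:path";

// Hypothetical sketch of an identity-based key; the PR's real helper may differ.
export function canonicalRootKey(root: string): string {
  try {
    // stat follows symlinks, so every spelling of one directory reports the
    // same (dev, ino). bigint: true avoids precision loss on large inode numbers.
    const st = statSync(root, { bigint: true });
    return `${st.dev}:${st.ino}`;
  } catch {
    // Assumption: if the root cannot be stat'ed, fall back to the path string.
    return `path:${resolve(root)}`;
  }
}

// Hypothetical stand-in for a cache of open connections, keyed on identity
// rather than on the path string.
const cache = new Map<string, { root: string }>();

export function getOrOpen(root: string): { root: string } {
  const key = canonicalRootKey(root);
  let entry = cache.get(key);
  if (!entry) {
    entry = { root }; // a real implementation would open the database here
    cache.set(key, entry);
  }
  return entry;
}
```

With a key like this, two spellings of the same directory land on one cache entry, so only one connection would ever be opened for it.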
Key `projectCache` and the default-instance reuse check on `(dev, ino)` via a new `canonicalRootKey()` helper. This mirrors the inode-identity pattern this codebase already uses in `DatabaseConnection.openedInode` (`statInode`) for replace-on-disk detection, so it's idiomatic rather than novel. Minimal blast radius: `findNearestCodeGraphRoot` itself is unchanged.

Reproduced (WSL2 / Ubuntu)
Before fix, against the built resolver:
After fix, `canonicalRootKey` converges:

Tests
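As a rough illustration of why a symlink can stand in for a case-variant path, the sketch below creates a directory and a symlink to it, then compares path strings and filesystem identity. It is not taken from the PR's test file; `identityOf` and `checkSymlinkSharesIdentity` are hypothetical names, and creating symlinks on Windows may require extra privileges.

```typescript
import { mkdirSync, mkdtempSync, rmSync, statSync, symlinkSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

// Hypothetical helper: identity of whatever the path resolves to.
function identityOf(p: string): string {
  const st = statSync(p, { bigint: true }); // stat follows symlinks
  return `${st.dev}:${st.ino}`;
}

export function checkSymlinkSharesIdentity(): {
  real: string;
  link: string;
  sameIdentity: boolean;
} {
  const base = mkdtempSync(join(tmpdir(), "root-identity-"));
  try {
    const real = join(base, "repo");
    const link = join(base, "repo-link");
    mkdirSync(real);
    symlinkSync(real, link, "dir"); // second spelling of the same directory
    // Two different path strings, one underlying directory.
    return { real, link, sameIdentity: identityOf(real) === identityOf(link) };
  } finally {
    // Clean up the temporary tree; the symlink itself is removed, not followed.
    rmSync(base, { recursive: true, force: true });
  }
}
```

Because the check depends only on symlink resolution, it behaves the same on case-sensitive and case-insensitive filesystems.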
`__tests__/root-identity.test.ts` (4 tests). The symlink case is a deterministic, filesystem-agnostic proxy for the case-insensitive-mount scenario (both produce two path strings for one inode), so it runs on case-sensitive CI. Full suite green locally (the only failures were two pre-existing CPU-contention timing flakes, `query-pool` and `mcp-daemon`, that pass in isolation).

Scope / follow-up
This fixes the in-process connection cache, which is the reported `CODEGRAPH_NO_DAEMON` path. The daemon registry (`registerDaemon`, `daemon-registry.ts`) keys daemons by path too, so the with-daemon case has the same class of issue; flagging it as a separate follow-up rather than expanding this PR.

Validated on WSL2 (Ubuntu). Happy to adjust naming/placement to your preference.