Skip to content

Fix #1360: Neo4j vector search returns empty results in shared-database multi-tenant mode#1974

Open
Memtensor-AI wants to merge 1 commit into
dev-20260624-v2.0.22from
bugfix/autodev-1360
Open

Fix #1360: Neo4j vector search returns empty results in shared-database multi-tenant mode#1974
Memtensor-AI wants to merge 1 commit into
dev-20260624-v2.0.22from
bugfix/autodev-1360

Conversation

@Memtensor-AI

Copy link
Copy Markdown
Collaborator

Description

Fixes issue #1360 by removing a leftover regression in the Neo4j sources serialization path. PR #1359 had already addressed the four root causes listed in the issue (post-filter rewrite, sources KeyError guard, _parse_node bracket check, missing-embedding warn-instead-of-raise), but a close re-read showed sources were still being double-encoded: _prepare_node_metadata JSON-dumps each element once, and then add_node / add_nodes_batch in neo4j.py plus add_node in neo4j_community.py repeated the same json.dumps step. The resulting values came back from Neo4j with a leading " instead of {, so _parse_node's [0] == "{" guard skipped json.loads and callers received escaped JSON strings instead of dicts — the exact deserialization-skip symptom listed as bug #3 of the original issue, just one layer upstream.

The change removes the redundant serialization at all three sites, leaving the single canonical encode inside _prepare_node_metadata (matching the already-correct pattern in import_graph and neo4j_community.add_nodes_batch). Existing rows written with the double-encoded value remain readable as escaped strings (no new exception), and new writes round-trip correctly.

Tests: added TestSourcesDoubleSerializationRegression to tests/graph_dbs/test_neo4j_vector_search.py with three cases — add_node single-encode invariant, add_nodes_batch single-encode invariant, and full add_node → _parse_node round-trip. Results: python3 -m pytest tests/graph_dbs/ → 32 passed / 3 skipped (skipped cases are live-Neo4j 5.18+ integration tests). ruff check and ruff format --check are clean on all three touched files.

Related Issue (Required): Fixes #1360

Type of change

Please delete options that are not relevant.

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • Refactor (does not change functionality, e.g. code style improvements, linting)
  • Documentation update

How Has This Been Tested?

Automated tests are pending.

  • Unit Test
  • Test Script Or Test Steps (please provide)
  • Pipeline Automated API Test (please provide)

Checklist

  • I have performed a self-review of my own code
  • I have commented my code in hard-to-understand areas
  • I have added tests that prove my fix is effective or that my feature works
  • I have created related documentation issue/PR in MemOS-Docs (if applicable)
  • I have linked the issue to this PR (if applicable)
  • I have mentioned the person who will review this PR

@MatthewZhuang, @CarltonXiang, @syzsunshine219, @World-controller please review this PR.

Reviewer Checklist

_prepare_node_metadata already JSON-encodes metadata['sources']; the
write paths (add_node / add_nodes_batch in neo4j.py and add_node in
neo4j_community.py) repeated the same json.dumps step, producing escaped
JSON strings that _parse_node could no longer decode (its [0]=='{' guard
saw a leading '"' and skipped json.loads), so callers received encoded
strings instead of dicts.

Removes the redundant serialization, leaving the single canonical encode
inside _prepare_node_metadata. Adds three regression tests covering
add_node, add_nodes_batch and the add_node->_parse_node round-trip.

Refs #1360
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

ai-generated bug Something isn't working | 功能异常

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants