LLMQuant · AdairBear · Jun 12, 2026 · Jun 26, 2026
diff --git a/.gitignore b/.gitignore
@@ -32,3 +32,4 @@ docs/superpowers/
 .coverage
 htmlcov/
 coverage.xml
+.venv/
diff --git a/QM_MCP_ENGINEERING_LOG.md b/QM_MCP_ENGINEERING_LOG.md
@@ -0,0 +1,23 @@
+# qm_mcp engineering log
+
+Append-only record of notable changes to the `qm_mcp/` research-corpus layer
+(Thomas's additive layer on top of LLMQuant/quant-mind). Upstream `quantmind/`
+history lives in the normal git log.
+
+## 2026-06-12 — Phase 4 landing: qm_mcp merged to master
+
+- **PR [#1](https://github.com/AdairBear/quant-mind/pull/1)** squash-merged →
+  `9b8a9599d5e00f61f9b2c2e883a02ecf1b0aa90c`.
+- Adds the persistence + embedding + semantic-query + MCP layer
+  (`store.py`, `embed.py`, `ingest.py`, `query.py`, `server.py`, `cli.py`,
+  `seed_corpus.txt`, `_smoke_mcp.py`) that QuantMind v0.2 does not yet ship.
+- Companion hermes-agent side: PR
+  [#10](https://github.com/AdairBear/hermes-agent/pull/10) →
+  `84314fa7eec991eccea8a59024c79f3cef53efbc` (the `#research` channel router +
+  `docs/quantmind_brain_boundary.md`).
+- Landed in a **new private** `AdairBear/quant-mind` repo (origin left pointing
+  at upstream `LLMQuant/quant-mind`; `fork` remote added).
+- Verified: direct stdio MCP call enumerates all 7 tools and `qm_query` returns
+  grounded, cited answers; corpus live (33 items incl. Databento
+  futures-microstructure articles). Live-gateway pickup pending an operator
+  restart (see `quantmind_brain_boundary.md` in hermes-agent for the open item).
diff --git a/docs/grpo_suitability.md b/docs/grpo_suitability.md
@@ -0,0 +1,130 @@
+# GRPO Suitability Tagging
+
+## Overview
+
+Each QuantMind corpus entry is tagged with a `grpo_suitability` field
+(`"high"`, `"medium"`, or `"low"`) at ingest time. The tag scores how
+useful the entry would be as training or evaluation data for a
+Group Relative Policy Optimization (GRPO) loop, specifically whether
+the entry's content sits in the **learnable zone** — questions it encodes
+can be answered by a strong solver but not a weak one.
+
+## Theoretical Basis
+
+The framework is grounded in **Autodata** (Kulikov, Whitehouse, Wu, Nie
+et al., FAIR at Meta, arXiv:2606.25996, 2026). Section 2b defines the
+acceptance criterion for a GRPO-useful training example:
+
+> strong solver avg ≥ 0.65, weak solver avg < 0.50, gap ≥ 20pp
+
+Content that meets this criterion sits in the "learnable zone": too hard
+for a weak model to recall from surface features, but consistently
+solvable by a capable model using deep reasoning. Content outside this
+zone either provides no discrimination signal (both solvers fail — too
+hard) or no learning signal (both solvers succeed — too easy).
+
+Autodata's empirical result on legal reasoning tasks: 4.8% high-suitability
+entries with naive CoT generation → 52% high-suitability entries after the
+agentic loop. The gap shows how much corpus quality varies and why tagging
+matters before training or evaluation.
+
+## V1 Heuristic (No Live Model Calls)
+
+V1 uses a deterministic heuristic on fields already present in the corpus
+entry. No network calls, no LLM inference. The heuristic uses three signals
+as proxies for discrimination potential:
+
+| Signal | Proxy for |
+|---|---|
+| `length_band` | Document depth (more content → richer reasoning surface) |
+| `domain_band` | Source authority (peer-reviewed → higher reasoning demand) |
+| `code_present` | Technical depth (math/code → non-trivial query surface) |
+
+### Length Band
+
+| Band | Threshold |
+|---|---|
+| `short` | `markdown_chars` < 5 000 |
+| `medium` | 5 000 ≤ `markdown_chars` < 20 000 |
+| `long` | `markdown_chars` ≥ 20 000 |
+
+### Domain Band
+
+| Band | Source types / URL patterns |
+|---|---|
+| `arxiv` | `source_type == "arxiv"` or `source_type == "local"` or URL contains `arxiv.org` |
+| `ssrn` | URL contains `ssrn.com` |
+| `substack` | URL contains `substack.com` |
+| `news` | All other URLs, `source_type == "text"`, unrecognized sources |
+
+### Code Present
+
+`True` if the entry's markdown contains a fenced code block (` ``` `),
+a math block (`$$`), or inline code of ≥ 4 characters.
+
+### V1 Decision Rule
+
+```
+long + arxiv + code_present        → "high"
+short + news + not code_present    → "low"
+everything else                    → "medium"
+```
+
+The rule is conservative: only the clearest signals on both ends are
+tagged high or low. Uncertain cases default to medium.
+
+## Backward Compatibility
+
+Existing corpus entries that pre-date this feature simply lack the
+`grpo_suitability` key. The scorer operates on any dict and returns a
+score regardless of which optional fields are present — it will not raise
+on a partial or legacy record. Downstream consumers should treat a missing
+key as `null` (unscored), not as `"low"`.
+
+## Schema Impact
+
+### `~/.quantmind/corpus/items/<id>.json`
+
+```jsonc
+{
+  // ... existing fields unchanged ...
+  "grpo_suitability": "high"   // "high" | "medium" | "low"
+}
+```
+
+### `~/.quantmind/corpus/ingestion_log.jsonl`
+
+```jsonc
+{
+  "id": "...",
+  "title": "...",
+  "source_type": "arxiv",
+  "source": "2606.25996",
+  "ingested_at": "...",
+  "grpo_suitability": "high",
+  "event": "research.ingest"
+}
+```
+
+## V2 Plan — Actual Solver Gap
+
+When QuantMind has the LLM substrate to run two queries per entry, replace
+the heuristic with a real discrimination measurement:
+
+1. **Weak query** — surface-recall question: *"What method did this paper
+   propose?"* Run via a cheap model (Haiku / small Llama).
+2. **Strong query** — application question: *"Where does this method break
+   down in a non-stationary regime?"* Run via a capable model (Sonnet /
+   Opus).
+3. **Gap** = strong score − weak score.
+4. Tag `high` if `gap ≥ 0.20` AND `strong ≥ 0.65`; `low` if `gap < 0.05`
+   AND `strong < 0.50`; `medium` otherwise.
+
+The scorer class (`GrpoSuitabilityScorer`) already carries documented TODO
+hooks in `qm_mcp/grpo_suitability.py` marking where these steps plug in.
+
+## Usage
+
+The tag is computed automatically at ingest time. To query by suitability,
+filter the corpus store items by the `grpo_suitability` field. Priority
+for Conductor and Strategy Lab use cases: surface `"high"` entries first.
diff --git a/pyproject.toml b/pyproject.toml
@@ -199,9 +199,11 @@ testpaths = ["tests"]
 addopts = [
     "--cov=quantmind",
     "--cov-report=term-missing",
-    "--cov-fail-under=75",
     "-ra",
 ]
+# Coverage floor is enforced by scripts/verify.sh (--cov-fail-under=75).
+# Keeping it out of addopts lets sub-package test suites (e.g. qm_mcp/) run
+# standalone without a false failure when quantmind code is not exercised.
 asyncio_mode = "auto"
 
 [tool.coverage.run]

diff --git a/qm_mcp/README.md b/qm_mcp/README.md
@@ -0,0 +1,86 @@
+# qm_mcp — QuantMind research-corpus surface
+
+This package turns [QuantMind](../README.md) into a **queryable research
+corpus** for Thomas's trading + AVST work, exposed over MCP so Personal
+Hermes, Dispatch sessions, the Conductor, and future Akazi AVST all read the
+same knowledge base.
+
+## Why this exists
+
+QuantMind v0.2 ships **ingestion + LLM extraction only** — `paper_flow`
+fetches an arXiv id / URL / PDF / raw text, converts it to markdown, and
+extracts a typed `Paper` tree. Its persistence, embedding, semantic-query,
+and "Data MCP" layers are still **vision / future PRs** (PR6/PR7 per their
+README). `qm_mcp` supplies exactly that missing Stage-2 layer:
+
+```
+ingest (QuantMind paper_flow)
+   → CorpusStore   (~/.quantmind/corpus : one JSON + one vector per item)
+   → semantic query (OpenAI embeddings → cosine top-k → grounded answer)
+   → MCP server    (qm_ingest_*, qm_query, qm_list_corpus, qm_delete_item)
+```
+
+It is dependency-light: it reuses QuantMind's own venv (`openai`, `numpy`,
+`pydantic`, `httpx`, `mcp`) and stores everything on the local filesystem.
+
+## Secrets
+
+Loaded from `~/.hermes/.env` at runtime — nothing is hard-coded. Embeddings
+and `paper_flow` extraction need a **real platform.openai.com** key. Hermes'
+`OPENAI_API_KEY` is an OpenRouter key (`sk-or-…`, no embeddings endpoint), so
+`qm_mcp` uses `VOICE_TOOLS_OPENAI_KEY` (the real OpenAI key kept for Whisper)
+and forces it for this process only.
+
+## Run the MCP server
+
+```bash
+/Users/thomasadair/projects/quant-mind/.venv/bin/python -m qm_mcp.server
+```
+
+Registered in Hermes `~/.hermes/config.yaml` under `mcp_servers: quantmind`
+(see `docs/quantmind_brain_boundary.md` in the hermes-agent repo).
+
+## CLI (seeding + shell use)
+
+```bash
+PY=/Users/thomasadair/projects/quant-mind/.venv/bin/python
+$PY -m qm_mcp.cli ingest-arxiv 1105.3115
+$PY -m qm_mcp.cli ingest-pdf  ~/papers/foo.pdf
+$PY -m qm_mcp.cli ingest-url  https://example.com/article
+$PY -m qm_mcp.cli seed        qm_mcp/seed_corpus.txt
+$PY -m qm_mcp.cli query       "What does Stoikov say about gamma?"
+$PY -m qm_mcp.cli list
+$PY -m qm_mcp.cli delete      <item_id>
+```
+
+## MCP tools
+
+| Tool | Purpose |
+|---|---|
+| `qm_ingest_arxiv(arxiv_id)` | Ingest an arXiv paper by id or URL |
+| `qm_ingest_url(url)` | Ingest a web page / hosted PDF |
+| `qm_ingest_pdf(path)` | Ingest a local PDF / HTML / Markdown file |
+| `qm_ingest_text(text, title?)` | Ingest pasted text |
+| `qm_query(question, k=5)` | Grounded natural-language answer + top-k sources |
+| `qm_list_corpus()` | List all ingested items (metadata) |
+| `qm_delete_item(item_id)` | Remove one item |
+
+## Storage
+
+`~/.quantmind/corpus/` (outside both git repos — never committed):
+- `items/<id>.json` — record: metadata + flattened context + full Paper tree
+- `vectors/<id>.npy` — 1536-dim embedding (aligned by id)
+- `ingestion_log.jsonl` — append-only ledger of ingestion events
+
+`id` is a stable hash of the source, so re-ingesting is idempotent (dedup).
+
+## Known QuantMind quirks handled here
+
+- **Strict-schema rejection.** `Agent(output_type=Paper)` fails under OpenAI
+  strict structured output (recursive UUID-keyed tree). We pass a non-strict
+  `AgentOutputSchema(Paper, strict_json_schema=False)`.
+- **No news flow.** QuantMind has `knowledge/news.py` types but no
+  `news_flow`. News/blog URLs go through the generic `HttpUrl` → `paper_flow`
+  path (trafilatura HTML → markdown → extraction).
+- **DOI unsupported.** `paper_flow` raises `NotImplementedError` on DOI
+  inputs upstream; use arXiv id or a direct URL.
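Reviewer sketch for `qm_mcp/README.md`: the README describes ids as "a stable hash of the source" and retrieval as "cosine top-k" over one vector per item. The snippet below illustrates both ideas with the standard library only. It is an assumption-laden sketch, not the code in `store.py` or `query.py`: `stable_item_id`, `cosine`, and `top_k` are hypothetical names, SHA-256 with a 16-character prefix is a guess at the hashing scheme, and the real package stores 1536-dim `.npy` vectors and would more likely use numpy.

```python
from __future__ import annotations

import hashlib
import math


def stable_item_id(source: str) -> str:
    # Hypothetical scheme: hash the whitespace-trimmed source string so that
    # re-ingesting the same arXiv id or URL maps to the same item id.
    digest = hashlib.sha256(source.strip().encode("utf-8")).hexdigest()
    return digest[:16]


def cosine(a: list[float], b: list[float]) -> float:
    # Cosine similarity; returns 0.0 when either vector has zero norm.
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(
    query_vec: list[float],
    vectors: dict[str, list[float]],
    k: int = 5,
) -> list[tuple[str, float]]:
    # Score every stored vector against the query and keep the k best.
    scored = [(item_id, cosine(query_vec, vec)) for item_id, vec in vectors.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy 2-dim vectors stand in for real embeddings.
vectors = {
    stable_item_id("1105.3115"): [1.0, 0.0],
    stable_item_id("https://example.com/article"): [0.0, 1.0],
}
best_id, best_score = top_k([0.9, 0.1], vectors, k=1)[0]
```

Because the id depends only on the source string, writing `items/<id>.json` and `vectors/<id>.npy` a second time overwrites the same pair of files, which is what makes re-ingestion idempotent in this reading of the README.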
diff --git a/qm_mcp/__init__.py b/qm_mcp/__init__.py
@@ -0,0 +1,18 @@
+"""qm_mcp — the research-corpus surface built on top of QuantMind ingestion.
+
+QuantMind v0.2 ships ingestion + LLM extraction only (``paper_flow``); the
+persistence, embedding, semantic-query, and MCP layers (its "Stage 2 /
+Data MCP" vision) are not yet built upstream. This package supplies exactly
+that missing layer so QuantMind becomes a usable, queryable corpus for
+Thomas's trading + AVST research:
+
+    ingest (paper_flow)  ->  CorpusStore (JSON + vectors)  ->  semantic query
+                                      \\-> MCP server (Hermes / Dispatch / Conductor)
+
+It is intentionally self-contained and dependency-light: it reuses
+QuantMind's own venv (openai, numpy, pydantic, httpx, mcp) and stores the
+corpus on the local filesystem under ``QM_CORPUS_DIR``.
+"""
+
+__all__ = ["__version__"]
+__version__ = "0.1.0"
diff --git a/qm_mcp/_smoke_mcp.py b/qm_mcp/_smoke_mcp.py
@@ -0,0 +1,34 @@
+"""Standalone MCP stdio smoke test: spawn the server, list tools, list corpus.
+
+Run under the QuantMind venv:
+    python -m qm_mcp._smoke_mcp
+"""
+
+from __future__ import annotations
+
+import asyncio
+import os
+import sys
+
+from mcp import ClientSession, StdioServerParameters
+from mcp.client.stdio import stdio_client
+
+
+async def main() -> None:
+    params = StdioServerParameters(
+        # Launch the server with the same interpreter that runs this script.
+        command=sys.executable,
+        args=["-m", "qm_mcp.server"],
+        env={**os.environ, "PYTHONPATH": os.getcwd()},
+    )
+    async with stdio_client(params) as (read, write):
+        async with ClientSession(read, write) as session:
+            await session.initialize()
+            tools = await session.list_tools()
+            print("TOOLS:", [t.name for t in tools.tools])
+            res = await session.call_tool("qm_list_corpus", {})
+            print("LIST_CORPUS:", res.content[0].text[:400])
+
+
+if __name__ == "__main__":
+    asyncio.run(main())