feat(grpo): GRPO-suitability tagging per Autodata 2606.25996 by AdairBear · Pull Request #85 · LLMQuant/quant-mind

AdairBear · 2026-06-26T00:40:13Z

Summary

- Adds `grpo_suitability: high|medium|low` to every corpus entry at ingest time, implementing the weak-vs-strong discrimination-gap framework from Kulikov et al. (FAIR at Meta, arXiv:2606.25996).
- V1 is a pure deterministic heuristic with no live model calls; the infrastructure is ready for V2 solver-gap measurement once QuantMind has the LLM substrate.
- Backward-compatible: no existing entries are modified; the new field is absent (null) on legacy records.
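Because legacy records lack the field, any consumer has to treat a missing or null `grpo_suitability` as "not yet scored". The following is a minimal sketch of such a reader; `read_suitability`, `tally`, and the `"untagged"` bucket are hypothetical names for illustration and are not part of this PR.

```python
from collections import Counter
from typing import Iterable, Mapping, Optional

VALID_TAGS = {"high", "medium", "low"}


def read_suitability(record: Mapping[str, object]) -> Optional[str]:
    """Return the tag, or None for legacy records that lack the field."""
    tag = record.get("grpo_suitability")
    # Missing key, explicit null, and unexpected values all map to None.
    if isinstance(tag, str) and tag in VALID_TAGS:
        return tag
    return None


def tally(records: Iterable[Mapping[str, object]]) -> Counter:
    """Count records per tag, grouping unscored ones under 'untagged'."""
    return Counter(read_suitability(r) or "untagged" for r in records)
```

A caller could run `tally()` over the item JSON files to see how much of the corpus predates the tagging.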

Changes

| File | What |
| --- | --- |
| `qm_mcp/grpo_suitability.py` | `GrpoSuitabilityScorer` with `score_entry()`, length/domain/code helpers; V2 TODO hooks |
| `qm_mcp/ingest.py` | Score computed in `_persist()`, persisted to item JSON + ingestion log |
| `qm_mcp/test_grpo_suitability.py` | 22 pytest cases: heuristic correctness, domain band edges, backward compat, idempotency |
| `docs/grpo_suitability.md` | Framework reference, V1 rule table, schema impact, V2 plan |

V1 Heuristic Rule

long (≥20k chars) + arxiv source + code/math block present → "high"
short (<5k chars) + news/unknown source + no code            → "low"
everything else                                               → "medium"
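The rule above can be sketched as a small pure function. This is an illustration under stated assumptions, not the code in `qm_mcp/grpo_suitability.py`: the function names, the source labels as plain strings, and the regex used to detect a code or math block are all hypothetical. Only the two character thresholds and the three outcomes come from the rule itself.

```python
import re
from typing import Optional

LONG_MIN_CHARS = 20_000  # "long" means at least this many characters
SHORT_MAX_CHARS = 5_000  # "short" means strictly fewer than this

# Assumed detector: a Markdown code fence, a $$ display-math marker,
# or a LaTeX equation/align environment.
_FENCE = "`" * 3
_CODE_OR_MATH = re.compile(
    re.escape(_FENCE) + r"|\$\$|\\begin\{(?:equation|align)\*?\}"
)


def has_code_or_math(text: str) -> bool:
    return _CODE_OR_MATH.search(text) is not None


def score_v1(text: str, source: Optional[str]) -> str:
    """Apply the three-way V1 rule; deterministic, no model calls."""
    src = (source or "unknown").lower()
    has_block = has_code_or_math(text)
    if len(text) >= LONG_MIN_CHARS and src == "arxiv" and has_block:
        return "high"
    if len(text) < SHORT_MAX_CHARS and src in {"news", "unknown"} and not has_block:
        return "low"
    return "medium"
```

Since the function depends only on its inputs, scoring the same entry twice gives the same tag, which is the property the idempotency tests rely on.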

Test Plan

- `pytest qm_mcp/test_grpo_suitability.py -v`: 22/22 passed
- No live model calls (pure heuristic)
- No new dependencies
- Existing ingestion tests not broken (additive-only change)

🤖 Generated with Claude Code

QuantMind v0.2 ships ingestion + LLM extraction only; its persistence, embedding, semantic-query, and Data-MCP layers are unbuilt future PRs. This adds that missing Stage-2 layer as a self-contained package that reuses QuantMind's own venv and fetch+format layer:

- store.py: filesystem CorpusStore (JSON + .npy vectors, stable-hash dedup)
- embed.py: OpenAI embeddings + grounded answer synthesis + summarizer
- ingest.py: fetch_arxiv/url/local -> markdown -> summarize -> embed -> store (skips the brittle paper_flow Paper-tree: gpt-4o-mini emits non-UUID node ids that the Paper schema rejects)
- query.py: embed question -> cosine top-k -> grounded, cited answer
- server.py: FastMCP stdio server exposing qm_ingest_arxiv/url/pdf/text, qm_query, qm_list_corpus, qm_delete_item
- cli.py: seeding + shell use; seed_corpus.txt; _smoke_mcp.py handshake test

Secrets load from ~/.hermes/.env; uses VOICE_TOOLS_OPENAI_KEY (real OpenAI) since Hermes OPENAI_API_KEY is an OpenRouter key with no embeddings endpoint.

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
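The "cosine top-k" retrieval step named for query.py can be illustrated as follows. This is a dependency-free sketch, not the package's code: the real layer stores `.npy` vectors and presumably uses NumPy, and the names `cosine` and `top_k` as well as the dict-based corpus are assumptions made for the example.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(
    query: Sequence[float],
    corpus: Dict[str, Sequence[float]],
    k: int = 3,
) -> List[Tuple[str, float]]:
    """Return the k (item_id, similarity) pairs most similar to the query."""
    scored = [(item_id, cosine(query, vec)) for item_id, vec in corpus.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]
```

The ids returned by `top_k()` would then select the stored items passed to the answer-synthesis step as citations.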


Adds `grpo_suitability: high|medium|low` to every corpus entry at ingest time, implementing the weak-vs-strong discrimination-gap framework from Kulikov et al. (FAIR at Meta, arXiv:2606.25996).

V1 is a pure deterministic heuristic (no live model calls):

- long + arxiv source + code present → high
- short + news/unknown source + no code → low
- everything else → medium

Changes:

- qm_mcp/grpo_suitability.py: GrpoSuitabilityScorer with score_entry(), length_band, domain_band, code_present helpers; V2 solver-gap hooks documented as TODOs
- qm_mcp/ingest.py: score computed in _persist() and persisted to both items/<id>.json and ingestion_log.jsonl; backward-compatible (existing entries not touched)
- qm_mcp/test_grpo_suitability.py: 22 pytest cases covering heuristic correctness, domain-band edge cases, backward compat, idempotency
- docs/grpo_suitability.md: framework reference, V1 rule table, V2 plan

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
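The commit describes writing the score to two places: the per-item JSON file and an append-only ingestion log. The sketch below shows one way that could look. The `items/<id>.json` and `ingestion_log.jsonl` paths come from the commit message, but the function name, its signature, and the log-line fields are hypothetical and do not reflect the actual `_persist()` in `qm_mcp/ingest.py`.

```python
import json
from pathlib import Path


def persist_with_score(root: Path, item_id: str, item: dict, score: str) -> None:
    """Write the item JSON with its tag and append one line to the log."""
    items_dir = root / "items"
    items_dir.mkdir(parents=True, exist_ok=True)

    # Additive change: the original item fields are copied unchanged.
    record = {**item, "grpo_suitability": score}
    (items_dir / f"{item_id}.json").write_text(
        json.dumps(record, indent=2), encoding="utf-8"
    )

    # Assumed log shape: one JSON object per line.
    log_line = {"id": item_id, "grpo_suitability": score}
    with (root / "ingestion_log.jsonl").open("a", encoding="utf-8") as log:
        log.write(json.dumps(log_line) + "\n")
```

Only newly ingested items pass through this path, so files already on disk are left without the field, matching the backward-compatibility claim.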

Keeps the coverage floor enforced by CI (scripts/verify.sh) while allowing sub-package test suites (e.g. qm_mcp/) to run standalone without a false failure when quantmind code is not exercised.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

AdairBear and others added 4 commits June 12, 2026 10:52

docs: qm_mcp engineering log — record Phase 4 merge

615c4fb

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

AdairBear wants to merge 4 commits into LLMQuant:master from AdairBear:lifts/grpo-suitability-tagging
