v1.2.9#48
Merged
Merged
Conversation
There was a problem hiding this comment.
Pull request overview
This PR updates the evaluation pipeline to skip N8 LLM debate when there are no non-guardrail prompt turns, filters guardrail turns out of N8 context, modularizes N4 turn-evaluation prompts, and adds local DB schema search-path configuration.
Changes:
- Adds N8 skip routing/node output with placeholder holistic debate results.
- Filters guardrail
turn_logsandturn_scoresbefore holistic debate. - Splits N4 prompt content into modular YAML parts and adds supporting comparison scripts/config.
Reviewed changes
Copilot reviewed 30 out of 30 changed files in this pull request and generated 3 comments.
Show a summary per file
| File | Description |
|---|---|
app/domain/langgraph/graph.py |
Adds holistic_debate_skipped node and routing edge. |
app/domain/langgraph/nodes/eval/__init__.py |
Exports the N8 skipped flow. |
app/domain/langgraph/nodes/eval/routers.py |
Routes no non-guardrail prompt evaluations to N8 skip. |
app/domain/langgraph/nodes/eval/eval_turn_targets.py |
Adds non-guardrail turn-score detection. |
app/domain/langgraph/nodes/eval/n8_code_execution.py |
Filters guardrail material and implements skipped debate flow. |
app/domain/langgraph/nodes/eval/n8_holistic_debate.py |
Applies guardrail filtering to duplicate N8 flow. |
app/domain/langgraph/nodes/eval/holistic_debate_skip.py |
Adds placeholder skipped debate result builder. |
app/domain/langgraph/utils/guardrail_turns.py |
Adds shared guardrail filtering for logs and scores. |
app/domain/langgraph/prompts/eval_intent_analysis.yaml |
Clarifies JAILBREAK vs safe session-setting boundary. |
app/domain/langgraph/prompts/eval_turn_compose.py |
Adds modular N4 prompt composer. |
app/domain/langgraph/prompts/eval_turn/base_context.yaml |
Adds modular N4 base context prompt. |
app/domain/langgraph/prompts/eval_turn/common_scale.yaml |
Adds shared scoring scale prompt. |
app/domain/langgraph/prompts/eval_turn/cot_output.yaml |
Adds CoT and JSON output prompt section. |
app/domain/langgraph/prompts/eval_turn/intent_matrix.yaml |
Adds intent-to-rubric mapping prompt section. |
app/domain/langgraph/prompts/eval_turn/gates/default.yaml |
Adds default N4 gate prompt. |
app/domain/langgraph/prompts/eval_turn/gates/follow_up_ref.yaml |
Adds follow-up reference gate prompt. |
app/domain/langgraph/prompts/eval_turn/gates/mixed_spec_code.yaml |
Adds mixed spec/code gate prompt. |
app/domain/langgraph/prompts/eval_turn/gates/request_only.yaml |
Adds request-only gate prompt. |
app/domain/langgraph/prompts/eval_turn/gates/spec_turn.yaml |
Adds spec-turn gate prompt. |
app/domain/langgraph/prompts/eval_turn/rubrics/r1.yaml |
Adds R1 rubric prompt. |
app/domain/langgraph/prompts/eval_turn/rubrics/r2.yaml |
Adds R2 rubric prompt. |
app/domain/langgraph/prompts/eval_turn/rubrics/r3.yaml |
Adds R3 rubric prompt. |
app/domain/langgraph/prompts/eval_turn/rubrics/r4.yaml |
Adds R4 rubric prompt. |
app/core/config.py |
Adds POSTGRES_SEARCH_PATH setting. |
app/infrastructure/persistence/session.py |
Applies configured PostgreSQL search path. |
env.example |
Documents local search-path override. |
scripts/compare_rubric_1_4_1_5.py |
Adds rubric score comparison helper. |
scripts/compare_n4_participants.py |
Adds N4 participant export comparison helper. |
tests/test_holistic_debate_router.py |
Updates/adds routing and skip-result tests. |
tests/test_guardrail_turns.py |
Adds guardrail log/score filtering test. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Comment on lines
+126
to
+128
| if not turn_logs: | ||
| logger.info( | ||
| "[N8] 비가드레일 turn_logs 없음 — %s", HOLISTIC_DEBATE_SKIP_MESSAGE |
Comment on lines
+84
to
+88
| def render_eval_turn_prompt(state: Dict[str, Any], **variables) -> str: | ||
| """ | ||
| 분할 YAML을 순서대로 렌더링해 하나의 system 프롬프트로 합칩니다. | ||
| """ | ||
| gate = resolve_turn_archetype_gate(state) |
| sc = det.get("score") or det.get("turn_score") | ||
| if bd: | ||
| r1, r2, r3 = bd.get("R1"), bd.get("R2"), bd.get("R3") | ||
| calc_l, calc_f = score("CREATION", r1 or 3, r2 or 3, r3 or 3) |
ydking0911
approved these changes
May 27, 2026
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
변경 요약
QA CHAT-02: 가드레일 차단·meta (Negative)
eval_intent_analysis.yaml— 세션 규칙(SETTING) vs 내부 지시 노출(JAILBREAK) 경계 보강QA EVAL-04: 제출 시 가드레일 턴 0점·N8 제외 (Negative)
holistic_debate_skip), holistic 0점 +debate_log플레이스홀더turn_logs/turn_scores에서 제외 후 토론·코드 실행 노드에 전달QA 외 사항
eval_turn/YAML +eval_turn_compose.py로 분리(게이트·루브릭·CoT)동작
debate_log로컬 개발용 설정 추가 (배포 기본값과 분리)
Spring과 같은 DB·스키마를 쓰기 위한 설정입니다. 연결/테이블 not found 오류 시 아래부터 확인하세요.
app/core/config.pyPOSTGRES_SEARCH_PATH추가 (기본public, 로컬은.env로ai_vibe_coding_test등)app/infrastructure/persistence/session.pyconnect_args+ 세션마다SET search_path TO …env.example# POSTGRES_SEARCH_PATH=ai_vibe_coding_test예시 주석테스트
tests/test_holistic_debate_router.py— 스킵 라우팅tests/test_guardrail_turns.py— N8용 턴 필터