diff --git a/.sot/03-PROTOCOLS/CTO-CASE-AGENT-PROTOCOL-BLOCKER.md b/.sot/03-PROTOCOLS/CTO-CASE-AGENT-PROTOCOL-BLOCKER.md
index b636770..fea02a1 100644
--- a/.sot/03-PROTOCOLS/CTO-CASE-AGENT-PROTOCOL-BLOCKER.md
+++ b/.sot/03-PROTOCOLS/CTO-CASE-AGENT-PROTOCOL-BLOCKER.md
@@ -258,6 +258,90 @@ Latest evidence:
 
 - Run artifact directory: `/home/svrnty/.hermes/profiles/cto-planb/harness-runs/20260601T023119Z-r1-string-slugify-2759949`.
 - Report status: `blocked`.
+
+## Response Probe Budget Correction - 2026-06-01
+
+Hermes commit `bbe7c72 Use realistic Case local response probe` corrected the
+local response-shape probe budget.
+
+Observed:
+
+- Spark vLLM returned reasoning-only output at a very small probe budget.
+- The same route returned assistant content at realistic Case-sized budgets.
+- The Hermes probe now uses a larger budget before classifying a provider as reasoning-only.
+
+Evidence:
+
+- Focused validator passed with `local_provider_delayed_content_allows_case`.
+- Real Case Qwen loop artifact: `/home/svrnty/.hermes/profiles/cto-planb/harness-runs/20260601T023532Z-r1-string-slugify-2776187`.
+- Report status: `fail`.
+- Failure reason: `case agent result protocol failed`.
+- Case process started: `true`.
+- Case model provider: `qwen-local`.
+- Case model: `qwen3.6-35b-a3b`.
+- Result: response shape is no longer the active blocker for this route.
+
+## OpenAI-Compatible Runtime Bridge Evidence - 2026-06-01
+
+Hermes commit `5c5448b Bridge Case Qwen through OpenAI-compatible runtime`
+adds a non-vendor compatibility route for Case/Pi local model execution.
+
+The CTO admission identity remains `qwen-local` / `qwen3.6-35b-a3b`. The Case
+runtime provider identity is mapped to Pi's built-in `openai` provider only
+inside the harness-owned Case process environment. The Spark endpoint value is
+supplied only at runtime and is not recorded in SOT.
+
+Evidence:
+
+- Focused validator passed: `python3 harness/runner/validate-case-provider-adapter.py --harness-root harness --json`.
+- Focused validator artifact: `/home/svrnty/.hermes/profiles/cto-planb/harness-runs/20260601T024443Z-r1-string-slugify-2817037`.
+- Focused validator includes `qwen_local_openai_compat_allows_case`.
+- Post-merge aggregate validator passed: `harness/evals/health.sh --json`.
+- Post-merge provider-adapter artifact: `/home/svrnty/.hermes/profiles/cto-planb/harness-runs/20260601T024714Z-r1-string-slugify-2832755`.
+- Post-merge matrix artifact: `/home/svrnty/.hermes/profiles/cto-planb/harness-runs/20260601T024712Z-run-all-fake-2832397`.
+- Real Case Qwen loop artifact: `/home/svrnty/.hermes/profiles/cto-planb/harness-runs/20260601T024456Z-r1-string-slugify-2819659`.
+- Real run report status: `fail`.
+- Real run failure reason: `case engine failed with exit code 124`.
+- Case process started: `true`.
+- Case model provider: `qwen-local`.
+- Case model: `qwen3.6-35b-a3b`.
+- Case runtime model provider: `openai`.
+- Case model admission status: `admitted`.
+- Source admission status: `not_admitted`.
+- Tests command: `python3 -m pytest -q`.
+- Tests passed: `true`.
+- Patch artifact: `patch.diff`.
+- Patch digest: `4706a667d3e66f3a9a00da37d274263c5ab776b0cce0971f7ac4efc5f341da54`.
+- The committed fixture diff was captured after the harness learned to diff from the fixture baseline commit.
+- Result: Case can reach Spark through the compatibility route and produce a valid artificial-fixture patch, but Stage 2 is not validated because Case timed out before a clean Harness Evidence Interface pass.
+
+## CTO-WORK-032 - Case Lifecycle Timeout After Valid Patch
+
+Status: blocked.
+
+The current active blocker is no longer provider admission, endpoint reachability,
+response shape, or absence of a patch. The active blocker is lifecycle completion
+after Case has produced a valid patch and passing tests.
+
+Acceptance:
+
+- Real Case Stage 2 remains blocked until Case exits cleanly and the harness emits a pass report.
+- Evidence must show a non-empty allowed diff from the artificial fixture baseline.
+- Evidence must show the fixture tests pass.
+- Evidence must show required events pass through the Harness Evidence Interface.
+- Evidence must show no Target Repository path was inspected or copied.
+- Evidence must preserve admitted provider identity as `qwen-local` / `qwen3.6-35b-a3b`.
+- Evidence may use the harness-owned OpenAI-compatible runtime bridge, but must not promote `openai` as CTO admission identity.
+- Timeout-after-valid-patch evidence must remain fail evidence, not pass evidence.
+- No copied-repo, sandbox-repo, owned-repo, default-candidate, or Core promotion stage may use timeout evidence as pass evidence.
+
+Required next route:
+
+- Keep the OpenAI-compatible bridge behind the Hermes CTO Harness seam.
+- Add or adjust only harness-side lifecycle control outside vendor Case source.
+- Prefer a minimal fix that makes Case stop after the required patch, test, commit, and `AGENT_RESULT` envelope.
+- If timeout persists after valid patch/tests, classify it explicitly as lifecycle timeout after valid patch.
+- Do not mark Stage 2 validated without a clean pass report.
 - Failure reason: `provider response shape unavailable`.
 - Marker: `backend/provider-reasoning-only.txt`.
 - Case process started: `false`.
diff --git a/.sot/03-PROTOCOLS/CTO-CASE-PROVIDER-BUILD-ISSUES.md b/.sot/03-PROTOCOLS/CTO-CASE-PROVIDER-BUILD-ISSUES.md
index c4556d6..e592db0 100644
--- a/.sot/03-PROTOCOLS/CTO-CASE-PROVIDER-BUILD-ISSUES.md
+++ b/.sot/03-PROTOCOLS/CTO-CASE-PROVIDER-BUILD-ISSUES.md
@@ -157,6 +157,27 @@ Validation Evidence:
 - `CTO-WORK-016` remains blocked because no real Case Stage 2 pass report exists.
 - Current downstream blocker returns to `CTO-WORK-028`.
 
+## OpenAI-Compatible Runtime Bridge Evidence - 2026-06-01
+
+- Hermes commit: `5c5448b Bridge Case Qwen through OpenAI-compatible runtime`.
+- The harness preserves CTO admission identity as `qwen-local` / `qwen3.6-35b-a3b`.
+- The harness maps the Case runtime provider to Pi's built-in `openai` provider only inside the harness-owned Case process.
+- The Spark endpoint value was supplied only through runtime environment and is not recorded in SOT.
+- Focused validator passed with `qwen_local_openai_compat_allows_case`.
+- Focused validator artifact: `/home/svrnty/.hermes/profiles/cto-planb/harness-runs/20260601T024443Z-r1-string-slugify-2817037`.
+- Post-merge Hermes health passed.
+- Post-merge provider-adapter artifact: `/home/svrnty/.hermes/profiles/cto-planb/harness-runs/20260601T024714Z-r1-string-slugify-2832755`.
+- Real Case Stage 2 retry artifact: `/home/svrnty/.hermes/profiles/cto-planb/harness-runs/20260601T024456Z-r1-string-slugify-2819659`.
+- Report status was `fail`.
+- Failure reason was `case engine failed with exit code 124`.
+- Case process started was `true`.
+- Case runtime model provider was `openai`.
+- Tests passed.
+- Patch artifact was non-empty.
+- Patch digest was `4706a667d3e66f3a9a00da37d274263c5ab776b0cce0971f7ac4efc5f341da54`.
+- `CTO-WORK-016` remains blocked because no clean real Case Stage 2 pass report exists.
+- Current downstream blocker is `CTO-WORK-032`.
+
 ## Isolated Pi Config Runtime Evidence - 2026-06-01
 
 - Hermes commit: `09b5851 Isolate Case Pi provider config`.
diff --git a/.sot/03-PROTOCOLS/CTO-CASE-SPARK-ENDPOINT-CONFIG-ISSUES.md b/.sot/03-PROTOCOLS/CTO-CASE-SPARK-ENDPOINT-CONFIG-ISSUES.md
index 0521101..7116a8a 100644
--- a/.sot/03-PROTOCOLS/CTO-CASE-SPARK-ENDPOINT-CONFIG-ISSUES.md
+++ b/.sot/03-PROTOCOLS/CTO-CASE-SPARK-ENDPOINT-CONFIG-ISSUES.md
@@ -104,6 +104,24 @@ Current evidence:
 - Case model admission status: `admitted`.
 - Result: Spark endpoint availability is no longer the current unknown; Stage 2 remains blocked by the Case agent-result protocol seam.
 
+## Spark OpenAI-Compatible Runtime Bridge Evidence - 2026-06-01
+
+- Hermes commit: `5c5448b Bridge Case Qwen through OpenAI-compatible runtime`.
+- The harness reached Spark through an OpenAI-compatible Case runtime provider bridge.
+- CTO admission identity stayed `qwen-local` / `qwen3.6-35b-a3b`.
+- Runtime endpoint value was supplied only through environment and is not recorded in SOT.
+- Focused validator artifact: `/home/svrnty/.hermes/profiles/cto-planb/harness-runs/20260601T024443Z-r1-string-slugify-2817037`.
+- Real Case Qwen loop artifact: `/home/svrnty/.hermes/profiles/cto-planb/harness-runs/20260601T024456Z-r1-string-slugify-2819659`.
+- Report status: `fail`.
+- Failure reason: `case engine failed with exit code 124`.
+- Case process started: `true`.
+- Case runtime model provider: `openai`.
+- Tests passed: `true`.
+- Patch artifact was non-empty.
+- This proves endpoint config and runtime provider bridging are sufficient for Case to produce a fixture patch.
+- This does not validate `CTO-WORK-016`, `CTO-WORK-020`, `CTO-WORK-022`, `CTO-WORK-028`, or `CTO-WORK-032`.
+- Current active blocker is Case lifecycle timeout after valid patch evidence.
+
 ## Hermes Case Qwen Loop Evidence - 2026-06-01
 
 - Hermes commit: `6c453ee Add Case Qwen loop entrypoint`.
diff --git a/WORKBOARD.yaml b/WORKBOARD.yaml
index 62da33d..6e15add 100644
--- a/WORKBOARD.yaml
+++ b/WORKBOARD.yaml
@@ -155,3 +155,8 @@ items:
     status: blocked
     source: .sot/03-PROTOCOLS/CTO-CASE-AGENT-PROTOCOL-BLOCKER.md
     owner: jp
+  - id: CTO-WORK-032
+    title: Case Lifecycle Timeout After Valid Patch
+    status: blocked
+    source: .sot/03-PROTOCOLS/CTO-CASE-AGENT-PROTOCOL-BLOCKER.md
+    owner: jp
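
Review note on `CTO-WORK-032`: the acceptance rules in this patch say that timeout-after-valid-patch evidence must stay fail evidence and must be classified explicitly. The following is a minimal sketch of how harness-side classification could encode that rule. It is not taken from the Hermes harness: `RunEvidence`, its field names, and `classify` are hypothetical, and treating exit code 124 as a timeout assumes the Case process is wrapped in something that follows the GNU `timeout` convention.

```python
from dataclasses import dataclass

# Assumption: 124 means "timed out", as with GNU timeout(1). The evidence in
# this patch reports "case engine failed with exit code 124".
TIMEOUT_EXIT_CODE = 124


@dataclass(frozen=True)
class RunEvidence:
    """Hypothetical summary of one harness run; field names are illustrative."""
    exit_code: int
    tests_passed: bool
    patch_nonempty: bool
    events_passed: bool


def classify(run: RunEvidence) -> tuple[str, str]:
    """Return (status, reason). Only a clean exit with full evidence passes."""
    valid_patch = run.tests_passed and run.patch_nonempty
    if run.exit_code == TIMEOUT_EXIT_CODE and valid_patch:
        # A valid patch never upgrades a timeout to a pass.
        return ("fail", "lifecycle timeout after valid patch")
    if run.exit_code == TIMEOUT_EXIT_CODE:
        return ("fail", "case engine timed out")
    if run.exit_code != 0:
        return ("fail", f"case engine failed with exit code {run.exit_code}")
    if not (valid_patch and run.events_passed):
        return ("fail", "incomplete evidence")
    return ("pass", "clean pass report")
```

Under these assumptions, the real run recorded above (exit code 124, tests passed, non-empty patch) would classify as `("fail", "lifecycle timeout after valid patch")`, which keeps Stage 2 blocked until a clean pass report exists.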