Record Case OpenAI compatibility evidence
parent b4d2ca2709
commit 9d3d988983
@@ -258,6 +258,90 @@ Latest evidence:

- Run artifact directory: `/home/svrnty/.hermes/profiles/cto-planb/harness-runs/20260601T023119Z-r1-string-slugify-2759949`.
- Report status: `blocked`.

## Response Probe Budget Correction - 2026-06-01

Hermes commit `bbe7c72 Use realistic Case local response probe` corrected the
local response-shape probe budget.

Observed:

- Spark vLLM returned reasoning-only output at a very small probe budget.
- The same route returned assistant content at realistic Case-sized budgets.
- The Hermes probe now uses a larger budget before classifying a provider as reasoning-only.
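
The observations above can be sketched as a budget-aware classification rule. This is a minimal illustration only, not the Hermes probe itself: `ProbeResponse`, `classify_provider`, the `min_budget` parameter, and the label strings are all hypothetical names chosen for the sketch.

```python
from dataclasses import dataclass


@dataclass
class ProbeResponse:
    """One probe reply; field names are hypothetical, not the Hermes schema."""
    content: str    # assistant-visible text
    reasoning: str  # reasoning text reported separately from content


def classify_provider(responses_by_budget, min_budget):
    """Classify a provider from probe replies keyed by token budget.

    Replies gathered below `min_budget` are ignored, so a tiny budget that
    only leaves room for reasoning cannot mark the provider reasoning-only.
    """
    considered = [
        reply for budget, reply in responses_by_budget.items()
        if budget >= min_budget
    ]
    if not considered:
        return "inconclusive"
    if any(reply.content.strip() for reply in considered):
        return "content"
    if any(reply.reasoning.strip() for reply in considered):
        return "reasoning-only"
    return "empty"
```

Under this rule a reasoning-only reply at a very small budget is treated as inconclusive, and the provider is classified only from replies at or above the chosen minimum budget.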

Evidence:

- Focused validator passed with `local_provider_delayed_content_allows_case`.
- Real Case Qwen loop artifact: `/home/svrnty/.hermes/profiles/cto-planb/harness-runs/20260601T023532Z-r1-string-slugify-2776187`.
- Report status: `fail`.
- Failure reason: `case agent result protocol failed`.
- Case process started: `true`.
- Case model provider: `qwen-local`.
- Case model: `qwen3.6-35b-a3b`.
- Result: response shape is no longer the active blocker for this route.

## OpenAI-Compatible Runtime Bridge Evidence - 2026-06-01

Hermes commit `5c5448b Bridge Case Qwen through OpenAI-compatible runtime`
adds a non-vendor compatibility route for Case/Pi local model execution.

The CTO admission identity remains `qwen-local` / `qwen3.6-35b-a3b`. The Case
runtime provider identity is mapped to Pi's built-in `openai` provider only
inside the harness-owned Case process environment. The Spark endpoint value is
supplied only at runtime and is not recorded in SOT.
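
The separation between admission identity and runtime provider might look roughly like the sketch below. It is an assumption-laden illustration, not the code in `5c5448b`: the environment variable names `CASE_RUNTIME_PROVIDER` and `CASE_RUNTIME_BASE_URL`, the function names, and the report keys are hypothetical.

```python
ADMISSION_PROVIDER = "qwen-local"
ADMISSION_MODEL = "qwen3.6-35b-a3b"
RUNTIME_PROVIDER = "openai"  # Pi's built-in provider, used only at runtime


def build_case_env(parent_env, endpoint):
    """Return a copy of the environment for the harness-owned Case process.

    The variable names are hypothetical. The endpoint is passed in by the
    caller at runtime and is never written to any report or SOT file here.
    """
    env = dict(parent_env)  # copy, so the parent environment is untouched
    env["CASE_RUNTIME_PROVIDER"] = RUNTIME_PROVIDER
    env["CASE_RUNTIME_BASE_URL"] = endpoint
    return env


def build_report_identity():
    """Identity fields for the report; the endpoint is deliberately absent."""
    return {
        "case_model_provider": ADMISSION_PROVIDER,
        "case_model": ADMISSION_MODEL,
        "case_runtime_model_provider": RUNTIME_PROVIDER,
    }
```

Keeping the mapping in the child environment means the report can still state `qwen-local` as the admitted provider while recording `openai` only as the runtime provider.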

Evidence:

- Focused validator passed: `python3 harness/runner/validate-case-provider-adapter.py --harness-root harness --json`.
- Focused validator artifact: `/home/svrnty/.hermes/profiles/cto-planb/harness-runs/20260601T024443Z-r1-string-slugify-2817037`.
- Focused validator includes `qwen_local_openai_compat_allows_case`.
- Post-merge aggregate validator passed: `harness/evals/health.sh --json`.
- Post-merge provider-adapter artifact: `/home/svrnty/.hermes/profiles/cto-planb/harness-runs/20260601T024714Z-r1-string-slugify-2832755`.
- Post-merge matrix artifact: `/home/svrnty/.hermes/profiles/cto-planb/harness-runs/20260601T024712Z-run-all-fake-2832397`.
- Real Case Qwen loop artifact: `/home/svrnty/.hermes/profiles/cto-planb/harness-runs/20260601T024456Z-r1-string-slugify-2819659`.
- Real run report status: `fail`.
- Real run failure reason: `case engine failed with exit code 124`.
- Case process started: `true`.
- Case model provider: `qwen-local`.
- Case model: `qwen3.6-35b-a3b`.
- Case runtime model provider: `openai`.
- Case model admission status: `admitted`.
- Source admission status: `not_admitted`.
- Tests command: `python3 -m pytest -q`.
- Tests passed: `true`.
- Patch artifact: `patch.diff`.
- Patch digest: `4706a667d3e66f3a9a00da37d274263c5ab776b0cce0971f7ac4efc5f341da54`.
- The committed fixture diff was captured after the harness learned to diff from the fixture baseline commit.
- Result: Case can reach Spark through the compatibility route and produce a valid artificial-fixture patch, but Stage 2 is not validated because Case timed out before a clean Harness Evidence Interface pass.
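
For re-checking a recorded patch digest, a sketch like the following may help. It assumes the digest is SHA-256 over the raw bytes of `patch.diff`; the evidence does not name the algorithm, and the 64-hex-character length is the only hint. `patch_digest` is a hypothetical helper name.

```python
import hashlib
from pathlib import Path


def patch_digest(path):
    """Return the hex SHA-256 of a file, read in chunks.

    Assumption: the recorded patch digest is SHA-256 over the raw file bytes.
    """
    hasher = hashlib.sha256()
    with Path(path).open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            hasher.update(chunk)
    return hasher.hexdigest()
```

If the assumption holds, calling `patch_digest` on the `patch.diff` inside the run artifact directory should reproduce the recorded value; a mismatch would mean either a different algorithm or a changed artifact.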

## CTO-WORK-032 - Case Lifecycle Timeout After Valid Patch

Status: blocked.

The current active blocker is no longer provider admission, endpoint reachability,
response shape, or absence of a patch. The active blocker is lifecycle completion
after Case has produced a valid patch and passing tests.

Acceptance:

- Real Case Stage 2 remains blocked until Case exits cleanly and the harness emits a pass report.
- Evidence must show a non-empty allowed diff from the artificial fixture baseline.
- Evidence must show the fixture tests pass.
- Evidence must show required events pass through the Harness Evidence Interface.
- Evidence must show no Target Repository path was inspected or copied.
- Evidence must preserve admitted provider identity as `qwen-local` / `qwen3.6-35b-a3b`.
- Evidence may use the harness-owned OpenAI-compatible runtime bridge, but must not promote `openai` as CTO admission identity.
- Timeout-after-valid-patch evidence must remain fail evidence, not pass evidence.
- No copied-repo, sandbox-repo, owned-repo, default-candidate, or Core promotion stage may use timeout evidence as pass evidence.
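
The acceptance list above could be expressed as a single check that returns every unmet criterion. This is a sketch under stated assumptions: `Stage2Evidence`, its field names, and `unmet_criteria` are hypothetical and do not reflect the actual report schema.

```python
from dataclasses import dataclass


@dataclass
class Stage2Evidence:
    """Hypothetical summary of one real Case Stage 2 run."""
    case_exit_code: int
    report_status: str
    diff_non_empty: bool
    tests_passed: bool
    required_events_passed: bool
    target_repo_touched: bool
    admission_provider: str
    admission_model: str


def unmet_criteria(evidence):
    """Return the acceptance criteria this evidence fails; empty means met."""
    unmet = []
    if evidence.case_exit_code != 0:
        unmet.append("Case did not exit cleanly")
    if evidence.report_status != "pass":
        unmet.append("harness did not emit a pass report")
    if not evidence.diff_non_empty:
        unmet.append("no non-empty allowed diff from the fixture baseline")
    if not evidence.tests_passed:
        unmet.append("fixture tests did not pass")
    if not evidence.required_events_passed:
        unmet.append("required events did not pass the Harness Evidence Interface")
    if evidence.target_repo_touched:
        unmet.append("a Target Repository path was inspected or copied")
    if (evidence.admission_provider, evidence.admission_model) != (
        "qwen-local", "qwen3.6-35b-a3b"
    ):
        unmet.append("admitted provider identity was not preserved")
    return unmet
```

Because a non-zero exit code and a non-pass report are checked independently of the patch and test results, a run that times out after producing a valid patch still returns a non-empty list and stays fail evidence.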

Required next route:

- Keep the OpenAI-compatible bridge behind the Hermes CTO Harness seam.
- Add or adjust only harness-side lifecycle control outside vendor Case source.
- Prefer a minimal fix that makes Case stop after the required patch, test, commit, and `AGENT_RESULT` envelope.
- If timeout persists after valid patch/tests, classify it explicitly as lifecycle timeout after valid patch.
- Do not mark Stage 2 validated without a clean pass report.
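
One way to keep lifecycle control on the harness side is to wrap the Case process in a deadline and classify the outcome afterwards. The sketch below is illustrative only and is not the harness runner: `run_with_deadline` and `classify_lifecycle` are hypothetical names, and exit code 124 is used because it is the status GNU `timeout` reports when a command times out, which matches the recorded failure reason.

```python
import subprocess
import sys

TIMEOUT_EXIT_CODE = 124  # status GNU `timeout` reports on a timed-out command


def run_with_deadline(cmd, seconds):
    """Run `cmd`; return its exit code, or 124 if the deadline expires."""
    try:
        return subprocess.run(cmd, timeout=seconds).returncode
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child before raising TimeoutExpired.
        return TIMEOUT_EXIT_CODE


def classify_lifecycle(exit_code, patch_valid, tests_passed):
    """Return (status, reason); a timeout is never upgraded to a pass."""
    if exit_code == 0 and patch_valid and tests_passed:
        return ("pass", "")
    if exit_code == TIMEOUT_EXIT_CODE and patch_valid and tests_passed:
        return ("fail", "lifecycle timeout after valid patch")
    return ("fail", f"case engine failed with exit code {exit_code}")


if __name__ == "__main__":
    # Stand-in child process that outlives a short deadline.
    code = run_with_deadline(
        [sys.executable, "-c", "import time; time.sleep(5)"], 0.5
    )
    print(classify_lifecycle(code, patch_valid=True, tests_passed=True))
```

The `pass` branch here covers only the lifecycle dimension; a real pass report would still need the other acceptance evidence.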
- Failure reason: `provider response shape unavailable`.
- Marker: `backend/provider-reasoning-only.txt`.
- Case process started: `false`.

@@ -157,6 +157,27 @@ Validation Evidence:
- `CTO-WORK-016` remains blocked because no real Case Stage 2 pass report exists.
- Current downstream blocker returns to `CTO-WORK-028`.

## OpenAI-Compatible Runtime Bridge Evidence - 2026-06-01

- Hermes commit: `5c5448b Bridge Case Qwen through OpenAI-compatible runtime`.
- The harness preserves CTO admission identity as `qwen-local` / `qwen3.6-35b-a3b`.
- The harness maps the Case runtime provider to Pi's built-in `openai` provider only inside the harness-owned Case process.
- The Spark endpoint value was supplied only through runtime environment and is not recorded in SOT.
- Focused validator passed with `qwen_local_openai_compat_allows_case`.
- Focused validator artifact: `/home/svrnty/.hermes/profiles/cto-planb/harness-runs/20260601T024443Z-r1-string-slugify-2817037`.
- Post-merge Hermes health passed.
- Post-merge provider-adapter artifact: `/home/svrnty/.hermes/profiles/cto-planb/harness-runs/20260601T024714Z-r1-string-slugify-2832755`.
- Real Case Stage 2 retry artifact: `/home/svrnty/.hermes/profiles/cto-planb/harness-runs/20260601T024456Z-r1-string-slugify-2819659`.
- Report status was `fail`.
- Failure reason was `case engine failed with exit code 124`.
- Case process started was `true`.
- Case runtime model provider was `openai`.
- Tests passed.
- Patch artifact was non-empty.
- Patch digest was `4706a667d3e66f3a9a00da37d274263c5ab776b0cce0971f7ac4efc5f341da54`.
- `CTO-WORK-016` remains blocked because no clean real Case Stage 2 pass report exists.
- Current downstream blocker is `CTO-WORK-032`.

## Isolated Pi Config Runtime Evidence - 2026-06-01

- Hermes commit: `09b5851 Isolate Case Pi provider config`.

@@ -104,6 +104,24 @@ Current evidence:
- Case model admission status: `admitted`.
- Result: Spark endpoint availability is no longer the current unknown; Stage 2 remains blocked by the Case agent-result protocol seam.

## Spark OpenAI-Compatible Runtime Bridge Evidence - 2026-06-01

- Hermes commit: `5c5448b Bridge Case Qwen through OpenAI-compatible runtime`.
- The harness reached Spark through an OpenAI-compatible Case runtime provider bridge.
- CTO admission identity stayed `qwen-local` / `qwen3.6-35b-a3b`.
- Runtime endpoint value was supplied only through environment and is not recorded in SOT.
- Focused validator artifact: `/home/svrnty/.hermes/profiles/cto-planb/harness-runs/20260601T024443Z-r1-string-slugify-2817037`.
- Real Case Qwen loop artifact: `/home/svrnty/.hermes/profiles/cto-planb/harness-runs/20260601T024456Z-r1-string-slugify-2819659`.
- Report status: `fail`.
- Failure reason: `case engine failed with exit code 124`.
- Case process started: `true`.
- Case runtime model provider: `openai`.
- Tests passed: `true`.
- Patch artifact was non-empty.
- This proves endpoint config and runtime provider bridging are sufficient for Case to produce a fixture patch.
- This does not validate `CTO-WORK-016`, `CTO-WORK-020`, `CTO-WORK-022`, `CTO-WORK-028`, or `CTO-WORK-032`.
- Current active blocker is Case lifecycle timeout after valid patch evidence.

## Hermes Case Qwen Loop Evidence - 2026-06-01

- Hermes commit: `6c453ee Add Case Qwen loop entrypoint`.

@@ -155,3 +155,8 @@ items:
    status: blocked
    source: .sot/03-PROTOCOLS/CTO-CASE-AGENT-PROTOCOL-BLOCKER.md
    owner: jp
  - id: CTO-WORK-032
    title: Case Lifecycle Timeout After Valid Patch
    status: blocked
    source: .sot/03-PROTOCOLS/CTO-CASE-AGENT-PROTOCOL-BLOCKER.md
    owner: jp