Record Case response probe correction
parent fca300afa7
commit b4d2ca2709
@@ -297,3 +297,43 @@ Forbidden routes:
- Do not patch `/tmp/workos-case` as the durable fix.
- Do not make Case default before Stage 2 pass evidence.
- Do not treat reasoning text as a completed `AGENT_RESULT` unless a governed adapter proves the result envelope and file diff.

## Response Probe Budget Correction - 2026-06-01

Hermes commit `bbe7c72 Use realistic Case local response probe` corrects the
first response-shape gate.

Direct Spark evidence showed:

- `qwen3.6-35b-a3b` can return reasoning only when the probe uses `max_tokens=32`.
- The same model route returns assistant content when the probe uses `max_tokens=256` or more.
- A 32-token probe can therefore create a false response-shape blocker.
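
The false blocker can be illustrated with a small simulation. This is only a sketch under stated assumptions: the model is assumed to emit its reasoning tokens before any assistant content, and the names `simulate_completion` and `classify_shape`, the `reasoning` and `content` keys, and the 100-token reasoning length are all hypothetical, not taken from Hermes or from the Spark endpoint.

```python
def simulate_completion(reasoning_tokens, content_tokens, max_tokens):
    """Truncate a reasoning-then-content token stream at max_tokens.

    Assumption: the model emits all reasoning tokens before any content.
    """
    stream = [("reasoning", t) for t in reasoning_tokens]
    stream += [("content", t) for t in content_tokens]
    emitted = stream[:max_tokens]
    return {
        "reasoning": " ".join(t for kind, t in emitted if kind == "reasoning"),
        "content": " ".join(t for kind, t in emitted if kind == "content"),
    }


def classify_shape(response):
    """Classify a response as 'content', 'reasoning_only', or 'empty'."""
    if response["content"].strip():
        return "content"
    if response["reasoning"].strip():
        return "reasoning_only"
    return "empty"


# Hypothetical model that spends 100 tokens reasoning before it answers.
reasoning = ["think"] * 100
content = ["answer"] * 20

for budget in (32, 256, 1024):
    shape = classify_shape(simulate_completion(reasoning, content, budget))
    print(budget, shape)
```

With these assumed lengths, the 32-token budget is exhausted during reasoning, so the probe sees no assistant content even though the model would have produced some with a larger budget.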

Harness effect:

- The Case local response-shape probe now uses `max_tokens=1024`.
- Focused validator added `local_provider_delayed_content_allows_case`.
- True reasoning-only responses still block before Case process start.
- Delayed-content local providers can pass the response-shape probe and reach the Case stub in validation.
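
The gate behaviour listed above might look roughly like the following. This is a hedged sketch, not the Hermes implementation: `response_shape_gate`, the stub providers, the `reasoning` and `content` keys, and the 200-token threshold are hypothetical. Only the `max_tokens=1024` budget comes from the notes.

```python
PROBE_MAX_TOKENS = 1024  # probe budget stated in the notes


def response_shape_gate(probe, max_tokens=PROBE_MAX_TOKENS):
    """Return (allowed, reason) for starting the Case process.

    `probe` is any callable that takes max_tokens and returns a dict with
    optional 'content' and 'reasoning' strings (hypothetical shape).
    """
    response = probe(max_tokens)
    if (response.get("content") or "").strip():
        return True, "assistant content present"
    if (response.get("reasoning") or "").strip():
        return False, "reasoning-only response"
    return False, "empty response"


def delayed_content_provider(max_tokens):
    """Stub provider that needs more than 200 tokens before content appears."""
    if max_tokens <= 200:
        return {"reasoning": "thinking...", "content": ""}
    return {"reasoning": "thinking...", "content": "ok"}


def reasoning_only_provider(max_tokens):
    """Stub provider that never returns assistant content."""
    return {"reasoning": "thinking...", "content": None}


print(response_shape_gate(delayed_content_provider))
print(response_shape_gate(delayed_content_provider, max_tokens=32))
print(response_shape_gate(reasoning_only_provider))
```

Under these assumptions, the delayed-content stub passes at the default budget and fails at 32 tokens, while the reasoning-only stub is blocked at any budget.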

Latest real evidence:

- Run artifact directory: `/home/svrnty/.hermes/profiles/cto-planb/harness-runs/20260601T023532Z-r1-string-slugify-2776187`.
- Report status: `fail`.
- Failure reason: `case agent result protocol failed`.
- Protocol marker: `backend/provider-agent-protocol.txt`.
- Case process started: `true`.
- Case model provider: `qwen-local`.
- Case model: `qwen3.6-35b-a3b`.
- Case model admission status: `admitted`.
- Changed files: none.
- Patch artifact: `patch.diff`.
- Tests passed: `false`.
- Required events passed: `false`.
- Result: Stage 2 is still blocked.
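
One way to read the evidence list as a pass/fail check is sketched below. The key names (`status`, `tests_passed`, `required_events_passed`, `changed_files`) are hypothetical and may not match the real `report.json` schema; the values mirror the bullets above.

```python
def stage2_blockers(report: dict) -> list[str]:
    """List the reasons a run report cannot count as Stage 2 pass evidence."""
    blockers = []
    if report.get("status") != "pass":
        blockers.append(f"status is {report.get('status')!r}")
    if not report.get("tests_passed"):
        blockers.append("tests did not pass")
    if not report.get("required_events_passed"):
        blockers.append("required events did not pass")
    if not report.get("changed_files"):
        blockers.append("no changed files")
    return blockers


# Values copied from the evidence list; key names are assumptions.
report = {
    "status": "fail",
    "failure_reason": "case agent result protocol failed",
    "case_process_started": True,
    "changed_files": [],
    "tests_passed": False,
    "required_events_passed": False,
}

for reason in stage2_blockers(report):
    print(reason)
```

An empty list would mean the report could serve as pass evidence under these assumed criteria; the recorded run fails every one of them.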

Current interpretation:

- `CTO-WORK-031` remains useful as a guardrail for true reasoning-only local provider responses.
- `CTO-WORK-031` is not the current primary blocker for the Spark Qwen route.
- The active blocker returns to `CTO-WORK-028`: Case reaches execution but does not produce the required `AGENT_RESULT` envelope or workspace diff.
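
The triage in this interpretation can be written down as a small decision function. This is a sketch only: `active_blocker` and its three boolean inputs are hypothetical, and how the harness actually detects the `AGENT_RESULT` envelope or the workspace diff is not described in these notes.

```python
from typing import Optional


def active_blocker(probe_passed: bool, envelope_found: bool,
                   diff_nonempty: bool) -> Optional[str]:
    """Map run observations to the work item that currently blocks Stage 2."""
    if not probe_passed:
        # Reasoning-only provider: blocked before the Case process starts.
        return "CTO-WORK-031"
    if not (envelope_found and diff_nonempty):
        # Case ran but produced no AGENT_RESULT envelope or workspace diff.
        return "CTO-WORK-028"
    return None


# Latest run: the probe passed, but no envelope and no changed files were seen.
print(active_blocker(probe_passed=True, envelope_found=False, diff_nonempty=False))
```

Requiring both the envelope and a non-empty diff reflects the forbidden-route rule that reasoning text alone is not a completed `AGENT_RESULT`.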

@@ -140,6 +140,23 @@ Validation Evidence:
- `CTO-WORK-016` remains blocked because no real Case Stage 2 pass report exists.
- Current downstream blocker is `CTO-WORK-031`.

## Response Probe Budget Correction Evidence - 2026-06-01

- Hermes commit: `bbe7c72 Use realistic Case local response probe`.
- The local provider response-shape probe now allows delayed assistant content before classifying a provider as reasoning-only.
- Focused validator passed with `local_provider_delayed_content_allows_case`.
- Aggregate Hermes health passed after merge.
- Real Case Stage 2 retry with admitted `qwen-local` / `qwen3.6-35b-a3b` produced report `/home/svrnty/.hermes/profiles/cto-planb/harness-runs/20260601T023532Z-r1-string-slugify-2776187/report.json`.
- Case process started after admission and response-shape probe passed.
- Backend exit code was `1`.
- Failure reason was `case agent result protocol failed`.
- Protocol marker was recorded at `backend/provider-agent-protocol.txt`.
- The harness recorded no changed files.
- The patch artifact was empty.
- Tests failed because the artificial fixture bug remained unchanged.
- `CTO-WORK-016` remains blocked because no real Case Stage 2 pass report exists.
- Current downstream blocker returns to `CTO-WORK-028`.

## Isolated Pi Config Runtime Evidence - 2026-06-01

- Hermes commit: `09b5851 Isolate Case Pi provider config`.

@@ -87,6 +87,23 @@ Current evidence:
- `CTO-WORK-030` remains blocked until a configured endpoint can support a real Stage 2 pass.
- Current downstream blocker is `CTO-WORK-031`.

## Spark Response Probe Budget Correction - 2026-06-01

- Hermes commit: `bbe7c72 Use realistic Case local response probe`.
- Direct Spark probe showed `qwen3.6-35b-a3b` returns reasoning only at `max_tokens=32`.
- Direct Spark probe showed the same route returns assistant content at `max_tokens=256` and `max_tokens=1024`.
- The Hermes response-shape probe now uses `max_tokens=1024`.
- Focused validator added `local_provider_delayed_content_allows_case`.
- Runtime endpoint value was supplied only through environment and is not recorded in SOT.
- Real Case Qwen loop artifact after the correction: `/home/svrnty/.hermes/profiles/cto-planb/harness-runs/20260601T023532Z-r1-string-slugify-2776187`.
- Report status: `fail`.
- Failure reason: `case agent result protocol failed`.
- Case process started: `true`.
- Case model provider: `qwen-local`.
- Case model: `qwen3.6-35b-a3b`.
- Case model admission status: `admitted`.
- Result: Spark endpoint availability is no longer the current unknown; Stage 2 remains blocked by the Case agent-result protocol seam.

## Hermes Case Qwen Loop Evidence - 2026-06-01

- Hermes commit: `6c453ee Add Case Qwen loop entrypoint`.