21 KiB
| title | status | lifecycle_classification | owner | created | last_reviewed | core_promotion_status | route |
|---|---|---|---|---|---|---|---|
| CTO Case Agent Protocol Blocker | draft | sot | jp | 2026-06-01 | 2026-06-01 | not-promoted | cto |
CTO Case Agent Protocol Blocker
Local planning SOT only. Not a Core Protocol. Not active Core authority.
CTO-WORK-028 - Case Agent Result Protocol Blocker
Status: blocked.
Record the first admitted real Case Stage 2 run after OpenAI Codex model admission. The run proves that provider/model admission now reaches Case execution, but does not prove Stage 2. Case failed before producing a workspace diff because its implementer agent result did not satisfy the Case result-envelope contract.
The later admitted Qwen local run reproduced the same result-envelope failure after Case process start. This makes the active blocker the Case agent-result protocol seam, not model admission.
Acceptance:
- Real Case Stage 2 remains blocked until Case produces a Harness Evidence Interface pass report.
- The admitted provider/model pair evidence includes
openai-codex/gpt-5.5andqwen-local/qwen3.6-35b-a3b. - The admission files remain
.sot/03-PROTOCOLS/CTO-CASE-MODEL-PROVIDER-ADMISSION.openai-codex-gpt-5.5.jsonand.sot/03-PROTOCOLS/CTO-CASE-MODEL-PROVIDER-ADMISSION.qwen-local-qwen3.6-35b-a3b.json. - Evidence must show
case_process_started: truebefore this blocker is accepted as the current blocker. - Evidence must show
case_model_admission_status: admitted. - Evidence must show no target repository path was inspected or copied.
- Evidence must show no workspace patch was produced.
- Evidence must show tests did not pass.
- The next implementation route must happen through the Hermes CTO harness seam, a Case-compatible provider adapter seam, or an external compatibility layer.
- The next implementation route must not mutate Cortex Core, vendor Case source, or external developer repositories.
- No real-repo, copied-repo, sandbox-repo, owned-repo, default-candidate, or Core promotion stage may use this failed run as pass evidence.
Evidence - 2026-06-01
- Harness command class: real Case Stage 2 artificial fixture.
- Run artifact directory:
/home/svrnty/.hermes/profiles/cto-planb/harness-runs/20260601T013918Z-r1-string-slugify-2381028. - Case binary path used by harness:
/tmp/workos-case/dist/ca. - Case source pin for the built binary:
7959ac917cdeb0983b4aaa20bb9f42021747fed8. - Report status:
fail. - Backend:
case. - Backend exit code:
1. - Case process started:
true. - Case model provider:
openai-codex. - Case model:
gpt-5.5. - Case model admission status:
admitted. - Source admission status:
not_admitted. - No target inspection proof:
stage2-no-target-inspection.json. - Changed files: none.
- Patch artifact:
patch.diff. - Patch digest:
e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855. - Tests command:
python3 -m pytest -q. - Tests passed:
false. - Required events passed:
false. - Report blocker:
case engine failed with exit code 1. - Case stderr evidence: implementer failed with
AGENT_RESULT start delimiter not found. - Case stderr evidence: retry classified the failure as
agent-protocol-error. - Case stdout evidence: unattended mode auto-selected
Abort. - Result: Stage 2 is still blocked.
Qwen Local Evidence - 2026-06-01
- Harness command class: real Case Stage 2 artificial fixture.
- Run artifact directory:
/home/svrnty/.hermes/profiles/cto-planb/harness-runs/20260601T015208Z-r1-string-slugify-2478256. - Case binary path used by harness:
/tmp/workos-case/dist/ca. - Case source pin for the built binary:
7959ac917cdeb0983b4aaa20bb9f42021747fed8. - Report status:
fail. - Backend:
case. - Backend exit code:
1. - Case process started:
true. - Case model provider:
qwen-local. - Case model:
qwen3.6-35b-a3b. - Case model admission status:
admitted. - Source admission status:
not_admitted. - No target inspection proof:
stage2-no-target-inspection.json. - Changed files: none.
- Patch artifact:
patch.diff. - Patch digest:
e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855. - Tests command:
python3 -m pytest -q. - Tests passed:
false. - Required events passed:
false. - Report blocker:
case engine failed with exit code 1. - Case stderr evidence: implementer failed with
AGENT_RESULT start delimiter not found. - Case stderr evidence: retry classified the failure as
agent-protocol-error. - Case stdout evidence: unattended mode auto-selected
Abort. - Result: Stage 2 is still blocked.
Current Interpretation
This is a protocol compatibility blocker, not a provider approval or model admission blocker.
Two admitted provider/model paths reached Case. Case then failed because the
implementer agent did not return output framed by the Case AGENT_RESULT
delimiter contract. The evidence does not prove whether the defect is Case
provider configuration, provider adapter behavior, model output framing, or
harness invocation shape.
Narrowed Interpretation - 2026-06-01
Hermes commit 5db23c7 Fail closed on Case Codex auth gap narrows the current
OpenAI Codex path blocker.
The current known OpenAI Codex blocker is an auth-bridge gap:
- Case's pipeline SDK path constructs its Pi Agent runtime directly.
- That path does not pass Pi AuthStorage OAuth headers into
streamSimple. - Pi env API-key lookup does not map
openai-codexto an environment API key. - Hermes now blocks
openai-codexbeforecase_process_startedunless an explicit non-vendor auth bridge is proven.
This does not prove that any local provider path passes Stage 2. It only prevents
repeating the misleading malformed-output run for the openai-codex path.
Required Next Route
The next useful route is a small Case agent protocol compatibility investigation. It should answer only this question:
What minimal non-vendor seam makes admitted Case execution return the required
AGENT_RESULT envelope and produce a Stage 2 artificial fixture diff?
Allowed next actions:
- Inspect Case provider adapter behavior read-only.
- Inspect Hermes CTO Case invocation behavior.
- Add fail-closed classification in Hermes CTO harness if needed.
- Add a compatibility shim only outside vendor Case source.
- Admit and test the existing Pi local provider id
qwen-localonly through the Harness Evidence Interface. - Re-run real Case Stage 2 only after a specific protocol compatibility change exists.
Forbidden next actions:
- Do not edit
/tmp/workos-caseas the durable solution. - Do not mark Stage 2 validated from this run.
- Do not promote Case to copied repo, sandbox repo, owned repo, or default candidate.
- Do not write provider secrets to SOT, argv, task files, backend logs, reports, traces, or commits.
Hermes Classifier Evidence - 2026-06-01
- Hermes commit:
48d487a Classify Case agent protocol failures. - Hermes commit:
798fb5a Harden Case protocol failure marker. 48d487aadds a CTO HarnessAGENT_RESULTprotocol appendix to the Case task markdown.48d487aadds a fail-closed protocol marker path:backend/provider-agent-protocol.txt.798fb5ahardens report-time classification so missingAGENT_RESULTevidence is recorded even if backend marker creation is missed.- Focused validator passed:
python3 harness/runner/validate-case-provider-adapter.py --harness-root harness --json. - Aggregate validator passed:
harness/evals/health.sh --json. - Post-merge aggregate validator passed:
harness/evals/health.sh --json.
Qwen Local Classified Runtime Evidence - 2026-06-01
- Run artifact directory:
/home/svrnty/.hermes/profiles/cto-planb/harness-runs/20260601T020117Z-r1-string-slugify-2566310. - Report status:
fail. - Backend:
case. - Backend exit code:
1. - Case process started:
true. - Case model provider:
qwen-local. - Case model:
qwen3.6-35b-a3b. - Case model admission status:
admitted. - Failure reason:
case agent result protocol failed. - Protocol marker:
backend/provider-agent-protocol.txt. - Changed files: none.
- Patch artifact:
patch.diff. - Patch digest:
e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855. - Tests passed:
false. - Required events passed:
false. - Result: Stage 2 is still blocked.
- Current next route remains a Case/Pi runtime protocol compatibility fix, not another admission record.
Isolated Pi Config Evidence - 2026-06-01
- Hermes commit:
09b5851 Isolate Case Pi provider config. - The Hermes CTO Case harness now sets
PI_CODING_AGENT_DIRunder the run artifact directory before invoking Case. - The harness writes isolated Pi auth state under
backend/case-data/pi-agent/auth.json. - Local Case providers now require explicit
CTO_HARNESS_CASE_LOCAL_BASE_URL. - Missing local provider config writes
backend/provider-local-config-unavailable.txt. - Missing local provider config blocks before
case_process_started. - Focused validator passed:
python3 harness/runner/validate-case-provider-adapter.py --harness-root harness --json. - Focused validator artifact:
/home/svrnty/.hermes/profiles/cto-planb/harness-runs/20260601T020732Z-r1-string-slugify-2609546. - Aggregate validator passed before merge:
harness/evals/health.sh --json. - Aggregate validator artifact before merge:
/home/svrnty/.hermes/profiles/cto-planb/harness-runs/20260601T020741Z-r1-string-slugify-2611203. - Post-merge aggregate validator passed:
harness/evals/health.sh --json. - Post-merge aggregate validator artifact:
/home/svrnty/.hermes/profiles/cto-planb/harness-runs/20260601T020801Z-r1-string-slugify-2613843. - Real Qwen local config-gate proof artifact:
/home/svrnty/.hermes/profiles/cto-planb/harness-runs/20260601T020847Z-r1-string-slugify-2619644. - Real Qwen local config-gate status:
blocked. - Real Qwen local config-gate failure reason:
provider unavailable. - Real Qwen local config-gate Case process started:
false. - This removes the ambient
~/.pi/agentdependency from the harness path. - This does not prove Case can produce the required
AGENT_RESULTenvelope. CTO-WORK-028remains blocked until a configured local provider or another admitted provider returns a valid result envelope and produces a Stage 2 artificial fixture diff.
Spark Endpoint Config Dependency - 2026-06-01
CTO-WORK-030must be resolved before another configured Qwen local run can retest the CaseAGENT_RESULTprotocol path.- Until
CTO_HARNESS_CASE_LOCAL_BASE_URLis supplied, the harness blocks before Case starts. - The agent protocol blocker remains unproven for the isolated Spark endpoint path until Case reaches execution again and returns or fails the required result envelope.
Case Qwen Loop Entrypoint Evidence - 2026-06-01
- Hermes commit:
6c453ee Add Case Qwen loop entrypoint. - The new
harness/evals/case-qwen-loop.sh --jsoncommand is the next standard path for retesting the CaseAGENT_RESULTprotocol after Spark endpoint config is supplied. - This does not resolve the protocol blocker because Case has not reached execution through the configured isolated Spark endpoint path yet.
Spark vLLM Qwen Loop Evidence - 2026-06-01
- Spark1 exposed a reachable OpenAI-compatible vLLM model route for
qwen3.6-35b-a3b. - The endpoint value was supplied only at runtime and is not recorded in SOT.
- Standard command class:
harness/evals/case-qwen-loop.sh --json. - Run artifact directory:
/home/svrnty/.hermes/profiles/cto-planb/harness-runs/20260601T022535Z-r1-string-slugify-2731603. - Report status:
fail. - Backend:
case. - Backend exit code:
1. - Case process started:
true. - Case model provider:
qwen-local. - Case model:
qwen3.6-35b-a3b. - Case model admission status:
admitted. - Source admission status:
not_admitted. - Failure reason:
case agent result protocol failed. - Protocol marker:
backend/provider-agent-protocol.txt. - Changed files: none.
- Patch artifact:
patch.diff. - Patch digest:
e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855. - Tests passed:
false. - Required events passed:
false. - Result: Stage 2 is still blocked.
- Current next route is a Case/Pi result-envelope compatibility fix outside vendor Case source.
Narrowed Response Shape Interpretation - 2026-06-01
Hermes commit 974813b Block Case on reasoning-only local provider narrows the
Qwen local blocker.
Observed:
- Spark vLLM accepts OpenAI-compatible chat-completions requests for
qwen3.6-35b-a3b. - The response can contain a
reasoningfield withcontent: null. - Case/Pi only turns assistant text deltas into the raw text parsed by
AGENT_RESULT. - A reasoning-only response therefore reaches no valid
AGENT_RESULTenvelope.
Harness effect:
- The Case Qwen loop now probes local provider response shape before Case process start.
- Reasoning-only local responses write
backend/provider-reasoning-only.txt. - Reasoning-only local responses report
failure_reason: provider response shape unavailable. - Reasoning-only local responses block with
case_process_started: false.
Latest evidence:
- Run artifact directory:
/home/svrnty/.hermes/profiles/cto-planb/harness-runs/20260601T023119Z-r1-string-slugify-2759949. - Report status:
blocked.
Response Probe Budget Correction - 2026-06-01
Hermes commit bbe7c72 Use realistic Case local response probe corrected the
local response-shape probe budget.
Observed:
- Spark vLLM returned reasoning-only output at a very small probe budget.
- The same route returned assistant content at realistic Case-sized budgets.
- The Hermes probe now uses a larger budget before classifying a provider as reasoning-only.
Evidence:
- Focused validator passed with
local_provider_delayed_content_allows_case. - Real Case Qwen loop artifact:
/home/svrnty/.hermes/profiles/cto-planb/harness-runs/20260601T023532Z-r1-string-slugify-2776187. - Report status:
fail. - Failure reason:
case agent result protocol failed. - Case process started:
true. - Case model provider:
qwen-local. - Case model:
qwen3.6-35b-a3b. - Result: response shape is no longer the active blocker for this route.
OpenAI-Compatible Runtime Bridge Evidence - 2026-06-01
Hermes commit 5c5448b Bridge Case Qwen through OpenAI-compatible runtime
adds a non-vendor compatibility route for Case/Pi local model execution.
The CTO admission identity remains qwen-local / qwen3.6-35b-a3b. The Case
runtime provider identity is mapped to Pi's built-in openai provider only
inside the harness-owned Case process environment. The Spark endpoint value is
supplied only at runtime and is not recorded in SOT.
Evidence:
- Focused validator passed:
python3 harness/runner/validate-case-provider-adapter.py --harness-root harness --json. - Focused validator artifact:
/home/svrnty/.hermes/profiles/cto-planb/harness-runs/20260601T024443Z-r1-string-slugify-2817037. - Focused validator includes
qwen_local_openai_compat_allows_case. - Post-merge aggregate validator passed:
harness/evals/health.sh --json. - Post-merge provider-adapter artifact:
/home/svrnty/.hermes/profiles/cto-planb/harness-runs/20260601T024714Z-r1-string-slugify-2832755. - Post-merge matrix artifact:
/home/svrnty/.hermes/profiles/cto-planb/harness-runs/20260601T024712Z-run-all-fake-2832397. - Real Case Qwen loop artifact:
/home/svrnty/.hermes/profiles/cto-planb/harness-runs/20260601T024456Z-r1-string-slugify-2819659. - Real run report status:
fail. - Real run failure reason:
case engine failed with exit code 124. - Case process started:
true. - Case model provider:
qwen-local. - Case model:
qwen3.6-35b-a3b. - Case runtime model provider:
openai. - Case model admission status:
admitted. - Source admission status:
not_admitted. - Tests command:
python3 -m pytest -q. - Tests passed:
true. - Patch artifact:
patch.diff. - Patch digest:
4706a667d3e66f3a9a00da37d274263c5ab776b0cce0971f7ac4efc5f341da54. - The committed fixture diff was captured after the harness learned to diff from the fixture baseline commit.
- Result: Case can reach Spark through the compatibility route and produce a valid artificial-fixture patch, but Stage 2 is not validated because Case timed out before a clean Harness Evidence Interface pass.
CTO-WORK-032 - Case Lifecycle Timeout After Valid Patch
Status: blocked.
The current active blocker is no longer provider admission, endpoint reachability, response shape, or absence of a patch. The active blocker is lifecycle completion after Case has produced a valid patch and passing tests.
Acceptance:
- Real Case Stage 2 remains blocked until Case exits cleanly and the harness emits a pass report.
- Evidence must show a non-empty allowed diff from the artificial fixture baseline.
- Evidence must show the fixture tests pass.
- Evidence must show required events pass through the Harness Evidence Interface.
- Evidence must show no Target Repository path was inspected or copied.
- Evidence must preserve admitted provider identity as
qwen-local/qwen3.6-35b-a3b. - Evidence may use the harness-owned OpenAI-compatible runtime bridge, but must not promote
openaias CTO admission identity. - Timeout-after-valid-patch evidence must remain fail evidence, not pass evidence.
- No copied-repo, sandbox-repo, owned-repo, default-candidate, or Core promotion stage may use timeout evidence as pass evidence.
Required next route:
- Keep the OpenAI-compatible bridge behind the Hermes CTO Harness seam.
- Add or adjust only harness-side lifecycle control outside vendor Case source.
- Prefer a minimal fix that makes Case stop after the required patch, test, commit, and
AGENT_RESULTenvelope. - If timeout persists after valid patch/tests, classify it explicitly as lifecycle timeout after valid patch.
- Do not mark Stage 2 validated without a clean pass report.
- Failure reason:
provider response shape unavailable. - Marker:
backend/provider-reasoning-only.txt. - Case process started:
false. - Case model provider:
qwen-local. - Case model:
qwen3.6-35b-a3b. - Case model admission status:
admitted. - Result: Stage 2 is still blocked.
CTO-WORK-031 - Case Local Provider Response Shape Shim
Status: blocked.
Create a non-vendor compatibility route that makes the admitted local Qwen path return assistant content usable by Case/Pi result parsing, without weakening Harness evidence gates.
Acceptance:
- No vendor Case source is mutated as the durable solution.
- No endpoint value or credential value is recorded in SOT, argv examples, task files, backend logs, reports, traces, generated config, or commits.
- The shim is outside Cortex Core and outside target repositories.
- Reasoning-only responses remain fail-closed before Case process start.
- A configured local provider can produce assistant content, not only reasoning.
- Case reaches execution only after admission and response-shape checks pass.
- Real Case Stage 2 pass evidence exists through the Harness Evidence Interface.
- Same-run fake baseline comparison remains required for any pass claim.
- No copied-repo, sandbox-repo, owned-repo, default-candidate, or Core promotion stage uses a response-shape blocked run as pass evidence.
Allowed routes:
- vLLM serving configuration that disables reasoning-only output for this model.
- A local OpenAI-compatible proxy that converts or requests usable assistant content.
- A Hermes CTO harness adapter setting that is proven by focused validator and aggregate health.
Forbidden routes:
- Do not patch
/tmp/workos-caseas the durable fix. - Do not make Case default before Stage 2 pass evidence.
- Do not treat reasoning text as a completed
AGENT_RESULTunless a governed adapter proves the result envelope and file diff.
Response Probe Budget Correction - 2026-06-01
Hermes commit bbe7c72 Use realistic Case local response probe corrects the
first response-shape gate.
Direct Spark evidence showed:
qwen3.6-35b-a3bcan return reasoning only when the probe usesmax_tokens=32.- The same model route returns assistant content when the probe uses
max_tokens=256or more. - A 32-token probe can therefore create a false response-shape blocker.
Harness effect:
- The Case local response-shape probe now uses
max_tokens=1024. - Focused validator added
local_provider_delayed_content_allows_case. - True reasoning-only responses still block before Case process start.
- Delayed-content local providers can pass the response-shape probe and reach the Case stub in validation.
Latest real evidence:
- Run artifact directory:
/home/svrnty/.hermes/profiles/cto-planb/harness-runs/20260601T023532Z-r1-string-slugify-2776187. - Report status:
fail. - Failure reason:
case agent result protocol failed. - Protocol marker:
backend/provider-agent-protocol.txt. - Case process started:
true. - Case model provider:
qwen-local. - Case model:
qwen3.6-35b-a3b. - Case model admission status:
admitted. - Changed files: none.
- Patch artifact:
patch.diff. - Tests passed:
false. - Required events passed:
false. - Result: Stage 2 is still blocked.
Current interpretation:
CTO-WORK-031remains useful as a guardrail for true reasoning-only local provider responses.CTO-WORK-031is not the current primary blocker for the Spark Qwen route.- The active blocker returns to
CTO-WORK-028: Case reaches execution but does not produce the requiredAGENT_RESULTenvelope or workspace diff.