---
name: cto-case-stage2-artificial-fixture-prd
tier: local
status: draft
owner: jp
source: sot/03-PROTOCOLS/CTO-CASE-STAGED-PROOF-GATES.md
created: 2026-05-31
last_reviewed: 2026-05-31
lifecycle_classification: planning
core_promotion_status: not-promoted
description: Child-local PRD for Stage 2 Case artificial fixture execution proof.
---

# CTO Case Stage 2 Artificial Fixture PRD

Local planning SOT only. Not a Core Protocol. Not active Core authority.

## Problem Statement

Stage 1 proves that the Hermes CTO harness can recognize `case`, deny it by default, and emit evidence without running Case. The next risk is enabling Case too broadly. Stage 2 must prove only one narrow executable behavior: Case may run against a copied artificial fixture, through the existing harness engine seam, while preserving evidence shape, allowed-write control, fake default behavior, and fail-closed fixture outcomes.

## Solution

Define a Stage 2 implementation slice for `/home/svrnty/workspaces/hermes/cto/harness`. The slice may execute Case only against copied artificial fixture inputs already owned by the harness. It must not inspect or mutate a Target Repository, copied local repo fixture, disposable repo, owned noncritical repo, vendor source, Case source, or Cortex Core.

Stage 2 keeps the `case` backend gated by `CTO_HARNESS_ALLOW_CASE=1` and adds a stricter execution gate named `CTO_HARNESS_CASE_STAGE=2`. Missing Stage 2 gate means blocked, not warning. The fake lane remains the default validation lane and comparison baseline.

Allowed roots are explicit: Stage 2 may write only to `runtime_workspace_root` and `run_artifact_dir`. `runtime_workspace_root` is the copied artificial case workspace created under the harness runtime root. `run_artifact_dir` is the single run evidence directory. All other writes are forbidden, including harness source checkout, Target Repository paths, Case source, vendor source, external developer repositories, and Cortex Core.
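
The two-gate rule and the allowed-root rule described in the Solution can be sketched roughly as follows. This is only an illustrative sketch, not the harness implementation: the function names `case_gate_status` and `is_allowed_write` and the shape of the returned dictionary are hypothetical. Only the environment variable names, the `backend.gate.blocked` event name, and the two root names come from this PRD.

```python
from pathlib import Path
from typing import Mapping

# Gate names taken from this PRD.
ALLOW_VAR = "CTO_HARNESS_ALLOW_CASE"
STAGE_VAR = "CTO_HARNESS_CASE_STAGE"


def case_gate_status(env: Mapping[str, str]) -> dict:
    """Hypothetical preflight: both gates must be set, otherwise blocked."""
    if env.get(ALLOW_VAR) != "1":
        return {
            "status": "blocked",
            "reason": f"{ALLOW_VAR}=1 is not set",
            "event": "backend.gate.blocked",
        }
    if env.get(STAGE_VAR) != "2":
        # A missing Stage 2 gate is a block, never a warning.
        return {
            "status": "blocked",
            "reason": f"{STAGE_VAR}=2 is not set",
            "event": "backend.gate.blocked",
        }
    return {"status": "allowed", "reason": None, "event": None}


def is_allowed_write(path: str, runtime_workspace_root: str, run_artifact_dir: str) -> bool:
    """Hypothetical check: a write is allowed only under one of the two roots."""
    # resolve() normalizes ".." segments and symlinks before comparing.
    target = Path(path).resolve()
    for root in (runtime_workspace_root, run_artifact_dir):
        resolved_root = Path(root).resolve()
        if target == resolved_root or resolved_root in target.parents:
            return True
    return False
```

In this sketch the caller would pass `os.environ` to `case_gate_status` and refuse to start the Case process unless the status is `allowed`. Resolving paths before comparison is one way to keep a `..` segment from escaping the copied workspace; the real harness may enforce this differently.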

## Scope

- Add one artificial fixture task contract for Case adapter proof.
- Allow mutation only inside the copied artificial runtime workspace.
- Preserve the Harness Evidence Interface artifact set.
- Compare Case report shape, event order, allowed writes, verification output, blockers, artifact digests, and freshness proof against fake fixture expectations.
- Use a same-run fake baseline on the same artificial fixture as the comparison artifact.
- Add focused Stage 2 validator checks before broader harness health validation.
- Add failure fixture expectations for no diff, disallowed file, failed tests, missing test command, missing required event, and provider unavailable.
- Keep implementation in the Hermes CTO harness route.

## Non-Goals

- Do not run Case on a Target Repository.
- Do not run Case on a copied local repository fixture.
- Do not create, push, merge, deploy, or close a pull request.
- Do not mutate Case source, vendor source, external developer repositories, Cortex Core, or Hermes WebUI product surfaces.
- Do not resolve Case source admission for real-repo execution.
- Do not promote any artifact into Core.
- Do not make Case a default backend.

## Acceptance Criteria

- Stage 2 entry requires Stage 1 validated.
- Stage 2 allowed mutation scope is `copied artificial case only`.
- Stage 2 allowed roots are `runtime_workspace_root` and `run_artifact_dir`.
- `case` remains disabled by default.
- `CTO_HARNESS_ALLOW_CASE=1` remains required.
- `CTO_HARNESS_CASE_STAGE=2` is required before Case adapter execution.
- Missing Stage 2 gate produces blocked status, not warning.
- Fake remains the default validation lane.
- Case runs only through the same engine dispatch path as fake, Codex, and Pi.
- Case receives only artificial fixture input, allowed paths, forbidden actions, verification command, and evidence expectations.
- No Target Repository path is inspected or copied.
- The focused validator proves the task contract contains no Target Repository path and that runtime inputs, command arguments, environment, and config expose only `runtime_workspace_root` and `run_artifact_dir` as writable roots.
- No files under harness source checkout, target repo, Case source, vendor source, external developer repositories, or Cortex Core are changed by Stage 2 execution.
- Runtime writes are limited to the copied artificial workspace and the run artifact directory.
- `report.json` records `backend: case`, `status`, `backend_exit_code`, `allowed_writes_passed`, `changed_files`, `blockers`, `source_admission_status: not_admitted`, `case_process_started`, artifact paths, artifact digests, and freshness proof.
- Required artifacts include `report.json`, `report.md`, `events.normalized.jsonl`, `trace.jsonl`, `patch.diff`, `test.log`, and backend raw logs under `backend/`.
- Normalized events include `run.started`, `task.contract.created`, `plan.updated`, `patch.applied`, `git.diff.checked`, `verification.completed`, and `run.completed` for pass cases.
- Blocked preflight emits `backend.gate.blocked`.
- Failure fixtures cover no diff, disallowed file, failed tests, missing test command, missing required event, and provider unavailable.
- Failure fixture reports fail closed with blocker reason, nonzero exit where executable, and complete evidence artifacts.
- Same-run fake baseline evidence records the fake run artifact path, the Case run artifact path, and whether report shape, event order, allowed writes, verification output, blockers, artifact digests, and freshness proof match expectations.

## Validation

- Focused Stage 2 validator command is `python3 harness/runner/validate-case-stage2.py --harness-root harness --json`.
- Focused Stage 2 validator output must include `ok`, `checked`, `errors`, pass report paths, failure report paths, same-run fake baseline path, and no-target-inspection proof.
- Focused Stage 2 validator must run `python3 harness/runner/validate-case-stage1.py --harness-root harness --json` first and require it to pass.
- Focused Stage 2 validator must prove fake remains default.
- Focused Stage 2 validator must prove `--engine case` without `CTO_HARNESS_ALLOW_CASE=1` remains blocked.
- Focused Stage 2 validator must prove `CTO_HARNESS_ALLOW_CASE=1` without `CTO_HARNESS_CASE_STAGE=2` remains blocked for execution.
- Focused Stage 2 validator must run one passing artificial fixture through `case` only when both gates are set.
- Focused Stage 2 validator must inspect artifact shape, event order, allowed writes, verification log, digest fields, and freshness proof.
- Focused Stage 2 validator must compare the Case run to a same-run fake baseline on the same fixture.
- Focused Stage 2 validator must run failure fixture checks for no diff, disallowed file, failed tests, missing test command, missing required event, and provider unavailable.
- Focused Stage 2 validator must prove no Target Repository path is present in task contract, command arguments, runtime inputs, environment, or harness config used by the run.
- Broader harness health validation must run after the focused Stage 2 validator is green.
- Cortex CTO child validator must require this PRD and issue artifact before Stage 2 implementation is considered governed.

## Risks

- Stage 2 can be mistaken for real-repo readiness.
- A live Case invocation can leak beyond the artificial fixture if workspace and allowed paths are weak.
- A passing fixture can hide missing failure closure.
- Stage 2 can accidentally make `case` default or bypass fake comparison.
- Missing Case credentials or provider availability can block executable proof.

## Dependencies

- Stage 1 Gated Case Engine is validated.
- Harness Evidence Interface Contract is validated.
- Case Adapter Contract is validated.
- Case Failure Fixture Matrix is validated.
- Stage 1 focused validator exists in the Hermes CTO harness.
- Human-provided credentials or local provider configuration may be required for real Case execution; missing credentials must produce a blocked provider-unavailable result, not a fake pass.

## Success Definition

Stage 2 is successful when Case can execute one copied artificial fixture through the CTO harness, produce the same evidence interface expected from fake fixtures, and fail closed for required artificial failure classes. Stage 2 does not authorize copied repo, sandbox repo, owned repo, default backend, WebUI product, or Core promotion behavior.
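
As a rough illustration of the evidence-interface part of this success definition, a pass-case check over one run artifact directory might look like the sketch below. It is not the focused validator itself. The helper names `events_in_order` and `check_pass_evidence` are hypothetical, and two details are assumptions not stated in this PRD: that each line of `events.normalized.jsonl` is a JSON object with a `type` field, and that the pass-case events must appear in the order they are listed in the Acceptance Criteria.

```python
import json
from pathlib import Path
from typing import List, Sequence

# Artifact and event names taken from the Acceptance Criteria.
REQUIRED_ARTIFACTS = (
    "report.json", "report.md", "events.normalized.jsonl",
    "trace.jsonl", "patch.diff", "test.log",
)
PASS_EVENTS = (
    "run.started", "task.contract.created", "plan.updated", "patch.applied",
    "git.diff.checked", "verification.completed", "run.completed",
)


def events_in_order(seen: Sequence[str], required: Sequence[str]) -> bool:
    """True if every required event appears in `seen`, in the required order."""
    position = 0
    for event in seen:
        if position < len(required) and event == required[position]:
            position += 1
    return position == len(required)


def check_pass_evidence(run_artifact_dir: str) -> List[str]:
    """Return a list of errors; an empty list means the evidence shape is complete."""
    run_dir = Path(run_artifact_dir)
    errors = []
    for name in REQUIRED_ARTIFACTS:
        if not (run_dir / name).is_file():
            errors.append(f"missing artifact: {name}")
    if not (run_dir / "backend").is_dir():
        errors.append("missing backend/ raw log directory")

    events_path = run_dir / "events.normalized.jsonl"
    if events_path.is_file():
        # Assumption: one JSON object per line, event name under "type".
        seen = [
            json.loads(line).get("type")
            for line in events_path.read_text().splitlines()
            if line.strip()
        ]
        if not events_in_order(seen, PASS_EVENTS):
            errors.append("required pass events missing or out of order")
    return errors
```

A caller following the fail-closed rule would treat any non-empty error list as a blocked result with a nonzero exit, rather than downgrading it to a warning.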