Add CTO Case Stage 2 artificial fixture PRD

Svrnty 2026-05-31 19:14:07 -04:00
parent 4fe2e1092e
commit 64a0d3f2e9
5 changed files with 315 additions and 9 deletions


@@ -39,7 +39,9 @@ This workspace is registered as a child-local planning workspace. Registration d
| |-- CTO-CASE-FAILURE-FIXTURE-MATRIX.md
| |-- CTO-CASE-STAGED-PROOF-GATES.md
| |-- CTO-CASE-STAGE1-GATED-ENGINE-PRD.md
-| `-- CTO-CASE-STAGE1-GATED-ENGINE-ISSUES.md
+| |-- CTO-CASE-STAGE1-GATED-ENGINE-ISSUES.md
+| |-- CTO-CASE-STAGE2-ARTIFICIAL-FIXTURE-PRD.md
+| `-- CTO-CASE-STAGE2-ARTIFICIAL-FIXTURE-ISSUES.md
`-- tools/
`-- validate_cto_child.py
```


@@ -8,44 +8,55 @@ items:
title: CTO Case Candidate Backend PRD
status: validated
source: sot/03-PROTOCOLS/CTO-CASE-CANDIDATE-BACKEND-PRD.md
-owner: ""
+owner: jp
- id: CTO-WORK-003
title: Planning Validator Coverage
status: validated
source: sot/03-PROTOCOLS/CTO-CASE-CANDIDATE-BACKEND-ISSUES.md
-owner: ""
+owner: jp
- id: CTO-WORK-004
title: Harness Evidence Interface Contract
status: validated
source: sot/03-PROTOCOLS/CTO-HARNESS-EVIDENCE-INTERFACE-CONTRACT.md
-owner: ""
+owner: jp
- id: CTO-WORK-005
title: Case Source Admission Record
status: validated
source: sot/03-PROTOCOLS/CTO-CASE-SOURCE-ADMISSION-RECORD.md
-owner: ""
+owner: jp
- id: CTO-WORK-006
title: Case Adapter Contract And Eligibility Decision
status: validated
source: sot/03-PROTOCOLS/CTO-CASE-ADAPTER-CONTRACT.md
-owner: ""
+owner: jp
- id: CTO-WORK-007
title: Case Failure Fixture Matrix
status: validated
source: sot/03-PROTOCOLS/CTO-CASE-FAILURE-FIXTURE-MATRIX.md
-owner: ""
+owner: jp
- id: CTO-WORK-008
title: Staged Proof Gate Records
status: validated
source: sot/03-PROTOCOLS/CTO-CASE-STAGED-PROOF-GATES.md
-owner: ""
+owner: jp
- id: CTO-WORK-009
title: Stage 1 Gated Case Engine PRD
status: validated
source: sot/03-PROTOCOLS/CTO-CASE-STAGE1-GATED-ENGINE-PRD.md
-owner: ""
+owner: jp
- id: CTO-WORK-010
title: Stage 1 Harness Implementation Route
status: validated
source: sot/03-PROTOCOLS/CTO-CASE-STAGE1-GATED-ENGINE-ISSUES.md
+owner: jp
+- id: CTO-WORK-011
+title: Stage 2 Artificial Fixture PRD
+status: candidate
+source: sot/03-PROTOCOLS/CTO-CASE-STAGE2-ARTIFICIAL-FIXTURE-PRD.md
+owner: jp
+- id: CTO-WORK-012
+title: Stage 2 Harness Artificial Fixture Route
+status: blocked
+source: sot/03-PROTOCOLS/CTO-CASE-STAGE2-ARTIFICIAL-FIXTURE-ISSUES.md
+owner: jp


@@ -0,0 +1,86 @@
---
name: cto-case-stage2-artificial-fixture-issues
tier: local
status: draft
owner: jp
source: sot/03-PROTOCOLS/CTO-CASE-STAGE2-ARTIFICIAL-FIXTURE-PRD.md
created: 2026-05-31
last_reviewed: 2026-05-31
lifecycle_classification: planning
core_promotion_status: not-promoted
description: Child-local issue sequence for Stage 2 Case artificial fixture proof.
---
# CTO Case Stage 2 Artificial Fixture Issues
Local planning SOT only. Not a Core Protocol. Not active Core authority.
## Issue Sequence
### CTO-WORK-011 - Stage 2 Artificial Fixture PRD
Type: AFK
Blocked by: CTO-WORK-010
User stories covered: CTO Case Candidate Backend PRD stories 4, 5, 7, 8, 9, 10, 11, 13.
What to build: Define the Stage 2 artificial fixture proof before implementation starts.
Acceptance criteria:
- [ ] PRD states Stage 2 allowed mutation scope is `copied artificial case only`.
- [ ] PRD requires Stage 1 validation before Stage 2.
- [ ] PRD requires `CTO_HARNESS_ALLOW_CASE=1` and `CTO_HARNESS_CASE_STAGE=2`.
- [ ] PRD defines allowed roots as `runtime_workspace_root` and `run_artifact_dir`.
- [ ] PRD keeps fake as the default validation lane.
- [ ] PRD forbids Target Repository, Case source, vendor source, external developer repo, Hermes WebUI, and Cortex Core mutation.
- [ ] PRD requires no-target-inspection proof from task contract, command arguments, runtime inputs, environment, and config.
- [ ] PRD requires full Harness Evidence Interface artifacts, digests, freshness proof, and allowed-write proof.
- [ ] PRD requires same-run fake baseline comparison.
- [ ] PRD requires no-diff, disallowed-file, failed-tests, missing-test-command, missing-required-event, and provider-unavailable failure fixtures.
- [ ] Local CTO validator checks Stage 2 PRD and issue artifact.
Allowed files: CTO child workspace planning docs and local validator only.
Validator: `python3 tools/validate_cto_child.py`
Done evidence: PRD, issue artifact, validator JSON, clean worktree, commit.
### CTO-WORK-012 - Stage 2 Harness Artificial Fixture Route
Type: AFK
Blocked by: CTO-WORK-011
User stories covered: CTO Case Candidate Backend PRD stories 4, 5, 7, 8, 9, 10, 11, 13.
What to build: In `/home/svrnty/workspaces/hermes/cto/harness`, implement the Stage 2 Case artificial fixture route behind the existing `case` engine seam.
Acceptance criteria:
- [ ] `case` remains disabled by default.
- [ ] `CTO_HARNESS_ALLOW_CASE=1` remains required.
- [ ] `CTO_HARNESS_CASE_STAGE=2` is required before artificial fixture Case execution.
- [ ] Missing Stage 2 gate emits blocked evidence and does not run Case.
- [ ] Passing Stage 2 run mutates only the copied artificial runtime workspace.
- [ ] No Target Repository path is inspected or copied.
- [ ] Task contract, command arguments, runtime inputs, environment, and config expose no Target Repository path.
- [ ] Writable roots are limited to `runtime_workspace_root` and `run_artifact_dir`.
- [ ] No files under harness source checkout, target repo, Case source, vendor source, external developer repositories, or Cortex Core are changed during execution.
- [ ] Required artifacts include `report.json`, `report.md`, `events.normalized.jsonl`, `trace.jsonl`, `patch.diff`, `test.log`, and backend raw logs under `backend/`.
- [ ] `report.json` records `backend: case`, `source_admission_status: not_admitted`, `case_process_started`, `allowed_writes_passed`, changed files, blockers, artifact digests, and freshness proof.
- [ ] Pass events include `run.started`, `task.contract.created`, `plan.updated`, `patch.applied`, `git.diff.checked`, `verification.completed`, and `run.completed`.
- [ ] Same-run fake baseline comparison records fake and Case artifact paths.
- [ ] Failure fixtures fail closed for no diff, disallowed file, failed tests, missing test command, missing required event, and provider unavailable.
- [ ] Fake remains the default validation lane and broad health remains green after focused Stage 2 validation.
Allowed files: Hermes CTO harness engine, artificial fixtures, focused Stage 2 validator, harness docs, and tests. WebUI, Core, Case source, vendor source, and Target Repositories are forbidden.
Validator: `python3 harness/runner/validate-case-stage2.py --harness-root harness --json`, then `harness/evals/health.sh --json`.
Done evidence: Stage 2 pass report, failure fixture reports, artifact digests, no-mutation proof, clean worktree, commit.
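The artifact-digest evidence named above could be collected along these lines. This is a minimal sketch, not the harness implementation: the `sha256:` prefix, the function name `artifact_digests`, and the flat file layout inside `run_artifact_dir` are assumptions, and backend raw logs under `backend/` are left out for brevity.

```python
from __future__ import annotations

import hashlib
from pathlib import Path

# Artifact names taken from the acceptance criteria; backend/ raw logs omitted.
REQUIRED_ARTIFACTS = [
    "report.json",
    "report.md",
    "events.normalized.jsonl",
    "trace.jsonl",
    "patch.diff",
    "test.log",
]


def artifact_digests(run_artifact_dir: Path) -> tuple[dict[str, str], list[str]]:
    """Return (digests, missing) for the required artifacts in one run directory.

    The "sha256:<hex>" digest format is an assumption for illustration.
    """
    digests: dict[str, str] = {}
    missing: list[str] = []
    for name in REQUIRED_ARTIFACTS:
        path = run_artifact_dir / name
        if path.is_file():
            digests[name] = "sha256:" + hashlib.sha256(path.read_bytes()).hexdigest()
        else:
            # A missing artifact is reported, never silently skipped (fail closed).
            missing.append(name)
    return digests, missing
```

A run would only count as complete when `missing` is empty; any non-empty list should surface as a blocker rather than a warning.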
## Granularity Check
This is intentionally two slices: one planning route and one executable harness route. It is not over-granular because Stage 2 is the first Case execution boundary and must be reviewed before any copied-repo work.


@@ -0,0 +1,110 @@
---
name: cto-case-stage2-artificial-fixture-prd
tier: local
status: draft
owner: jp
source: sot/03-PROTOCOLS/CTO-CASE-STAGED-PROOF-GATES.md
created: 2026-05-31
last_reviewed: 2026-05-31
lifecycle_classification: planning
core_promotion_status: not-promoted
description: Child-local PRD for Stage 2 Case artificial fixture execution proof.
---
# CTO Case Stage 2 Artificial Fixture PRD
Local planning SOT only. Not a Core Protocol. Not active Core authority.
## Problem Statement
Stage 1 proves that the Hermes CTO harness can recognize `case`, deny it by default, and emit evidence without running Case. The next risk is enabling Case too broadly. Stage 2 must prove only one narrow executable behavior: Case may run against a copied artificial fixture, through the existing harness engine seam, while preserving evidence shape, allowed-write control, fake default behavior, and fail-closed fixture outcomes.
## Solution
Define a Stage 2 implementation slice for `/home/svrnty/workspaces/hermes/cto/harness`. The slice may execute Case only against copied artificial fixture inputs already owned by the harness. It must not inspect or mutate a Target Repository, copied local repo fixture, disposable repo, owned noncritical repo, vendor source, Case source, or Cortex Core.
Stage 2 keeps the `case` backend gated by `CTO_HARNESS_ALLOW_CASE=1` and adds a stricter execution gate named `CTO_HARNESS_CASE_STAGE=2`. Missing Stage 2 gate means blocked, not warning. The fake lane remains the default validation lane and comparison baseline.
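The two-gate rule in the paragraph above can be sketched as a small preflight check. The environment variable names come from this PRD; the function name `case_gate_status` and the blocker strings are hypothetical and only illustrate the "blocked, not warning" behavior.

```python
from __future__ import annotations

from typing import Mapping


def case_gate_status(env: Mapping[str, str]) -> tuple[str, str | None]:
    """Return (status, blocker) for a Stage 2 Case execution request.

    Blocker names are illustrative assumptions, not harness identifiers.
    """
    # Gate 1: Case stays disabled unless explicitly allowed.
    if env.get("CTO_HARNESS_ALLOW_CASE") != "1":
        return "blocked", "case_not_allowed"
    # Gate 2: allowing Case is not enough; Stage 2 must be selected explicitly.
    if env.get("CTO_HARNESS_CASE_STAGE") != "2":
        return "blocked", "stage2_gate_missing"
    return "allowed", None
```

In use, a caller would pass `os.environ` and emit blocked evidence whenever the status is not `allowed`, without starting a Case process.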
Allowed roots are explicit: Stage 2 may write only to `runtime_workspace_root` and `run_artifact_dir`. `runtime_workspace_root` is the copied artificial case workspace created under the harness runtime root. `run_artifact_dir` is the single run evidence directory. All other writes are forbidden, including harness source checkout, Target Repository paths, Case source, vendor source, external developer repositories, and Cortex Core.
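One way to check the allowed-roots rule is to resolve every changed path and require it to sit under one of the two roots. The sketch below assumes the harness can list changed paths after a run; the helper names are hypothetical, and `Path.is_relative_to` needs Python 3.9 or later.

```python
from __future__ import annotations

from pathlib import Path


def is_allowed_write(path: Path, allowed_roots: list[Path]) -> bool:
    """True if the resolved path lies inside one of the resolved allowed roots."""
    resolved = path.resolve()  # collapses ".." segments and follows symlinks
    return any(resolved.is_relative_to(root.resolve()) for root in allowed_roots)


def disallowed_writes(
    changed: list[Path],
    runtime_workspace_root: Path,
    run_artifact_dir: Path,
) -> list[Path]:
    """Return every changed path outside the two Stage 2 writable roots."""
    roots = [runtime_workspace_root, run_artifact_dir]
    return [path for path in changed if not is_allowed_write(path, roots)]
```

Resolving before comparing matters here: a plain string-prefix check would accept `ws/../../elsewhere` or a sibling directory such as `ws-other`, while the path-component comparison rejects both. An empty result would correspond to `allowed_writes_passed` being true.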
## Scope
- Add one artificial fixture task contract for Case adapter proof.
- Allow mutation only inside the copied artificial runtime workspace.
- Preserve the Harness Evidence Interface artifact set.
- Compare Case report shape, event order, allowed writes, verification output, blockers, artifact digests, and freshness proof against fake fixture expectations.
- Use a same-run fake baseline on the same artificial fixture as the comparison artifact.
- Add focused Stage 2 validator checks before broader harness health validation.
- Add failure fixture expectations for no diff, disallowed file, failed tests, missing test command, missing required event, and provider unavailable.
- Keep implementation in the Hermes CTO harness route.
## Non-Goals
- Do not run Case on a Target Repository.
- Do not run Case on a copied local repository fixture.
- Do not create, push, merge, deploy, or close a pull request.
- Do not mutate Case source, vendor source, external developer repositories, Cortex Core, or Hermes WebUI product surfaces.
- Do not resolve Case source admission for real-repo execution.
- Do not promote any artifact into Core.
- Do not make Case a default backend.
## Acceptance Criteria
- Stage 2 entry requires Stage 1 to be validated.
- Stage 2 allowed mutation scope is `copied artificial case only`.
- Stage 2 allowed roots are `runtime_workspace_root` and `run_artifact_dir`.
- `case` remains disabled by default.
- `CTO_HARNESS_ALLOW_CASE=1` remains required.
- `CTO_HARNESS_CASE_STAGE=2` is required before Case adapter execution.
- Missing Stage 2 gate produces blocked status, not warning.
- Fake remains the default validation lane.
- Case runs only through the same engine dispatch path as fake, Codex, and Pi.
- Case receives only artificial fixture input, allowed paths, forbidden actions, verification command, and evidence expectations.
- No Target Repository path is inspected or copied.
- The focused validator proves the task contract contains no Target Repository path and that runtime inputs, command arguments, environment, and config expose only `runtime_workspace_root` and `run_artifact_dir` as writable roots.
- No files under harness source checkout, target repo, Case source, vendor source, external developer repositories, or Cortex Core are changed by Stage 2 execution.
- Runtime writes are limited to the copied artificial workspace and the run artifact directory.
- `report.json` records `backend: case`, `status`, `backend_exit_code`, `allowed_writes_passed`, `changed_files`, `blockers`, `source_admission_status: not_admitted`, `case_process_started`, artifact paths, artifact digests, and freshness proof.
- Required artifacts include `report.json`, `report.md`, `events.normalized.jsonl`, `trace.jsonl`, `patch.diff`, `test.log`, and backend raw logs under `backend/`.
- Normalized events include `run.started`, `task.contract.created`, `plan.updated`, `patch.applied`, `git.diff.checked`, `verification.completed`, and `run.completed` for pass cases.
- Blocked preflight emits `backend.gate.blocked`.
- Failure fixtures cover no diff, disallowed file, failed tests, missing test command, missing required event, and provider unavailable.
- Failure fixture reports fail closed with blocker reason, nonzero exit where executable, and complete evidence artifacts.
- Same-run fake baseline evidence records the fake run artifact path, the Case run artifact path, and whether report shape, event order, allowed writes, verification output, blockers, artifact digests, and freshness proof match expectations.
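The pass-case event requirement above could be checked as an ordered-subsequence test over `events.normalized.jsonl`. This is a sketch under two assumptions that the PRD does not state: each JSONL line is an object whose event name lives in a `type` field, and the required order is the order in which the events are listed above.

```python
from __future__ import annotations

import json

# Pass-case events from the acceptance criteria, in the listed order.
REQUIRED_PASS_EVENTS = [
    "run.started",
    "task.contract.created",
    "plan.updated",
    "patch.applied",
    "git.diff.checked",
    "verification.completed",
    "run.completed",
]


def missing_or_misordered(jsonl_text: str, required: list[str] = REQUIRED_PASS_EVENTS) -> list[str]:
    """Return required events that are absent or appear out of order.

    An empty list means every required event was found in sequence.
    The "type" field name is an assumption for illustration.
    """
    seen = [json.loads(line).get("type") for line in jsonl_text.splitlines() if line.strip()]
    position = 0
    problems: list[str] = []
    for name in required:
        try:
            # Search only after the previous match so order is enforced.
            position = seen.index(name, position) + 1
        except ValueError:
            problems.append(name)
    return problems
```

Other events may appear between the required ones; only the relative order of the required names is enforced.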
## Validation
- Focused Stage 2 validator command is `python3 harness/runner/validate-case-stage2.py --harness-root harness --json`.
- Focused Stage 2 validator output must include `ok`, `checked`, `errors`, pass report paths, failure report paths, same-run fake baseline path, and no-target-inspection proof.
- Focused Stage 2 validator must run `python3 harness/runner/validate-case-stage1.py --harness-root harness --json` first and require it to pass.
- Focused Stage 2 validator must prove fake remains default.
- Focused Stage 2 validator must prove `--engine case` without `CTO_HARNESS_ALLOW_CASE=1` remains blocked.
- Focused Stage 2 validator must prove `CTO_HARNESS_ALLOW_CASE=1` without `CTO_HARNESS_CASE_STAGE=2` remains blocked for execution.
- Focused Stage 2 validator must run one passing artificial fixture through `case` only when both gates are set.
- Focused Stage 2 validator must inspect artifact shape, event order, allowed writes, verification log, digest fields, and freshness proof.
- Focused Stage 2 validator must compare the Case run to a same-run fake baseline on the same fixture.
- Focused Stage 2 validator must run failure fixture checks for no diff, disallowed file, failed tests, missing test command, missing required event, and provider unavailable.
- Focused Stage 2 validator must prove no Target Repository path is present in task contract, command arguments, runtime inputs, environment, or harness config used by the run.
- Broader harness health validation must run after the focused Stage 2 validator is green.
- Cortex CTO child validator must require this PRD and issue artifact before Stage 2 implementation is considered governed.
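The no-target-inspection proof listed above might be approximated by scanning each input surface for known Target Repository paths. The sketch assumes the validator is handed an explicit list of forbidden path strings, which this PRD does not specify; the function name and the surface keys are hypothetical, and substring matching is a deliberate simplification.

```python
from __future__ import annotations

import json
from typing import Any, Mapping, Sequence


def target_path_leaks(
    forbidden_paths: Sequence[str],
    task_contract: Mapping[str, Any],
    argv: Sequence[str],
    runtime_inputs: Mapping[str, Any],
    env: Mapping[str, str],
    config: Mapping[str, Any],
) -> dict[str, list[str]]:
    """Map each surface name to the forbidden paths found in it.

    An empty dict is the no-target-inspection proof for this run.
    """
    # Serialize every surface to text so nested values are searched too.
    surfaces = {
        "task_contract": json.dumps(task_contract, sort_keys=True, default=str),
        "command_arguments": " ".join(argv),
        "runtime_inputs": json.dumps(runtime_inputs, sort_keys=True, default=str),
        "environment": json.dumps(dict(env), sort_keys=True),
        "config": json.dumps(config, sort_keys=True, default=str),
    }
    leaks: dict[str, list[str]] = {}
    for name, text in surfaces.items():
        hits = [path for path in forbidden_paths if path in text]
        if hits:
            leaks[name] = hits
    return leaks
```

Because the match is textual, it only catches POSIX-style paths written literally; relative paths, symlinks, and escaped separators would need resolution before comparison.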
## Risks
- Stage 2 can be mistaken for real-repo readiness.
- A live Case invocation can leak beyond the artificial fixture if workspace isolation and allowed-path enforcement are weak.
- A passing fixture can hide missing failure closure.
- Stage 2 can accidentally make `case` default or bypass fake comparison.
- Missing Case credentials or an unavailable provider can block executable proof.
## Dependencies
- Stage 1 Gated Case Engine is validated.
- Harness Evidence Interface Contract is validated.
- Case Adapter Contract is validated.
- Case Failure Fixture Matrix is validated.
- Stage 1 focused validator exists in the Hermes CTO harness.
- Human-provided credentials or local provider configuration may be required for real Case execution; missing credentials must produce a blocked provider-unavailable result, not a fake pass.
## Success Definition
Stage 2 is successful when Case can execute one copied artificial fixture through the CTO harness, produce the same evidence interface expected from fake fixtures, and fail closed for required artificial failure classes. Stage 2 does not authorize copied repo, sandbox repo, owned repo, default backend, WebUI product, or Core promotion behavior.


@@ -4,6 +4,7 @@
from __future__ import annotations

import json
import re
from pathlib import Path
@@ -25,6 +26,8 @@ REQUIRED_FILES = [
    "sot/03-PROTOCOLS/CTO-CASE-STAGED-PROOF-GATES.md",
    "sot/03-PROTOCOLS/CTO-CASE-STAGE1-GATED-ENGINE-PRD.md",
    "sot/03-PROTOCOLS/CTO-CASE-STAGE1-GATED-ENGINE-ISSUES.md",
    "sot/03-PROTOCOLS/CTO-CASE-STAGE2-ARTIFICIAL-FIXTURE-PRD.md",
    "sot/03-PROTOCOLS/CTO-CASE-STAGE2-ARTIFICIAL-FIXTURE-ISSUES.md",
]

REQUIRED_BRIEF_PHRASES = [
@@ -209,6 +212,52 @@ REQUIRED_STAGE1_ISSUE_IDS = [
    "CTO-WORK-010",
]

REQUIRED_STAGE2_PRD_PHRASES = [
    "Local planning SOT only. Not a Core Protocol. Not active Core authority.",
    "Stage 2 must prove only one narrow executable behavior",
    "copied artificial fixture",
    "Stage 2 allowed mutation scope is `copied artificial case only`.",
    "CTO_HARNESS_ALLOW_CASE=1",
    "CTO_HARNESS_CASE_STAGE=2",
    "Missing Stage 2 gate means blocked, not warning.",
    "runtime_workspace_root",
    "run_artifact_dir",
    "Fake remains the default validation lane.",
    "No Target Repository path is inspected or copied.",
    "task contract contains no Target Repository path",
    "source_admission_status: not_admitted",
    "allowed_writes_passed",
    "report.md",
    "backend raw logs under `backend/`",
    "backend: case",
    "case_process_started",
    "changed_files",
    "blockers",
    "artifact digests",
    "freshness proof",
    "same-run fake baseline",
    "no diff",
    "disallowed file",
    "failed tests",
    "missing test command",
    "missing required event",
    "provider unavailable",
    "python3 harness/runner/validate-case-stage2.py --harness-root harness --json",
    "python3 harness/runner/validate-case-stage1.py --harness-root harness --json",
    "Stage 2 does not authorize copied repo, sandbox repo, owned repo, default backend, WebUI product, or Core promotion behavior.",
]

REQUIRED_STAGE2_ISSUE_IDS = [
    "CTO-WORK-011",
    "CTO-WORK-012",
]


def workboard_status(text: str, issue_id: str) -> str | None:
    pattern = rf"- id: {re.escape(issue_id)}\n(?: .+\n)*? status: ([^\n]+)"
    match = re.search(pattern, text)
    return match.group(1).strip() if match else None


def main() -> int:
    checked: list[str] = []
@@ -328,6 +377,28 @@ def main() -> int:
            if issue_id not in text:
                errors.append(f"missing_stage1_issue_id:{issue_id}")

    stage2_prd = ROOT / "sot/03-PROTOCOLS/CTO-CASE-STAGE2-ARTIFICIAL-FIXTURE-PRD.md"
    if stage2_prd.is_file():
        text = stage2_prd.read_text(encoding="utf-8")
        if "core_promotion_status: not-promoted" not in text:
            errors.append("stage2_prd_missing_not_promoted_frontmatter")
        for phrase in REQUIRED_STAGE2_PRD_PHRASES:
            checked.append(f"stage2_prd_phrase:{phrase}")
            if phrase not in text:
                errors.append(f"missing_stage2_prd_phrase:{phrase}")

    stage2_issues = ROOT / "sot/03-PROTOCOLS/CTO-CASE-STAGE2-ARTIFICIAL-FIXTURE-ISSUES.md"
    if stage2_issues.is_file():
        text = stage2_issues.read_text(encoding="utf-8")
        if "core_promotion_status: not-promoted" not in text:
            errors.append("stage2_issues_missing_not_promoted_frontmatter")
        if "Local planning SOT only. Not a Core Protocol. Not active Core authority." not in text:
            errors.append("stage2_issues_missing_local_planning_notice")
        for issue_id in REQUIRED_STAGE2_ISSUE_IDS:
            checked.append(f"stage2_issue_id:{issue_id}")
            if issue_id not in text:
                errors.append(f"missing_stage2_issue_id:{issue_id}")

    board = ROOT / "WORKBOARD.yaml"
    if board.is_file():
        text = board.read_text(encoding="utf-8")
@@ -339,6 +410,28 @@ def main() -> int:
            checked.append(f"workboard_id:{issue_id}")
            if issue_id not in text:
                errors.append(f"missing_workboard_id:{issue_id}")
        for issue_id in REQUIRED_STAGE2_ISSUE_IDS:
            checked.append(f"workboard_id:{issue_id}")
            if issue_id not in text:
                errors.append(f"missing_workboard_id:{issue_id}")
        expected_statuses = {
            "CTO-WORK-002": "validated",
            "CTO-WORK-003": "validated",
            "CTO-WORK-004": "validated",
            "CTO-WORK-005": "validated",
            "CTO-WORK-006": "validated",
            "CTO-WORK-007": "validated",
            "CTO-WORK-008": "validated",
            "CTO-WORK-009": "validated",
            "CTO-WORK-010": "validated",
            "CTO-WORK-011": "candidate",
            "CTO-WORK-012": "blocked",
        }
        for issue_id, expected in expected_statuses.items():
            checked.append(f"workboard_status:{issue_id}:{expected}")
            actual = workboard_status(text, issue_id)
            if actual != expected:
                errors.append(f"workboard_status_mismatch:{issue_id}:expected_{expected}:actual_{actual}")
        if "CTO-HARNESS-EVIDENCE-INTERFACE-CONTRACT.md" not in text:
            errors.append("workboard_missing_evidence_interface_contract_source")
        if "CTO-WORK-004" in text and "status: validated" not in text:
@@ -355,6 +448,10 @@ def main() -> int:
            errors.append("workboard_missing_stage1_prd_source")
        if "CTO-CASE-STAGE1-GATED-ENGINE-ISSUES.md" not in text:
            errors.append("workboard_missing_stage1_issues_source")
        if "CTO-CASE-STAGE2-ARTIFICIAL-FIXTURE-PRD.md" not in text:
            errors.append("workboard_missing_stage2_prd_source")
        if "CTO-CASE-STAGE2-ARTIFICIAL-FIXTURE-ISSUES.md" not in text:
            errors.append("workboard_missing_stage2_issues_source")

    payload = {
        "ok": not errors,