---
name: cto-case-stage6-candidate-default-prd
tier: local
status: draft
owner: jp
source: .sot/03-PROTOCOLS/CTO-CASE-STAGED-PROOF-GATES.md
created: 2026-06-01
last_reviewed: 2026-06-01
lifecycle_classification: planning
core_promotion_status: not-promoted
description: Child-local PRD for Stage 6 Case candidate-default comparison proof.
---

# CTO Case Stage 6 Candidate Default PRD

Local planning SOT only. Not a Core Protocol. Not active Core authority.

## Problem Statement

Stage 5 proves Case can change an explicitly owned low-risk Target Repository under Harness control. That is not enough to make Case the default backend. The remaining problem is comparison: Case can be discussed as candidate default only if it matches or beats the existing lanes on evidence completeness, failure closure, and bounded execution behavior.

## Solution

Define Stage 6 as a comparison and default-candidacy gate behind the existing CTO Harness seam. Stage 6 allowed mutation scope is `scoped real-repo use only`. Case remains disabled by default until comparison evidence exists against fake, Codex, and Pi where applicable. The CTO Product Surface may record candidate-default eligibility only after the Harness proves report shape, event validity, allowed-path compliance, failure closure, artifact completeness, source freshness, and operator acceptance.

## Scope

- Define Stage 6 entry gates, non-goals, acceptance criteria, validation, risks, dependencies, and success definition.
- Require Stage 5 validation before Stage 6.
- Require comparison fixtures for fake, Codex, and Pi where applicable.
- Require current Case source admission before any candidate-default claim.
- Require complete failure matrix coverage or explicit blocked rationale for each uncovered row.
- Require candidate-default evidence to use the Harness Evidence Interface.
- Require operator acceptance after comparison proof.
- Keep Core promotion, runtime default switching, merge, deploy, push, and close out of scope.

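
The "covered or explicitly blocked with rationale" rule in the Scope list applies to both failure matrix rows and comparison lanes. A minimal sketch of that check follows. `MatrixRow`, its fields, and `unresolved_rows` are hypothetical names chosen for illustration; they are not part of the Harness Evidence Interface or of any existing validator.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class MatrixRow:
    """One failure matrix row or comparison lane (hypothetical shape)."""
    name: str
    covered: bool
    blocked_rationale: Optional[str] = None


def unresolved_rows(rows: list[MatrixRow]) -> list[str]:
    """Return names of rows that are neither covered nor explicitly blocked.

    An empty or whitespace-only rationale counts as silent omission.
    """
    unresolved = []
    for row in rows:
        has_rationale = bool(row.blocked_rationale and row.blocked_rationale.strip())
        if not row.covered and not has_rationale:
            unresolved.append(row.name)
    return unresolved


# Illustrative data only: lane names come from this PRD, statuses are made up.
rows = [
    MatrixRow("fake", covered=True),
    MatrixRow("codex", covered=False, blocked_rationale="unavailable in this environment"),
    MatrixRow("pi", covered=False),  # no rationale: should be flagged
]
print(unresolved_rows(rows))
```

Under these assumptions, a gate would refuse to proceed whenever the returned list is non-empty, so an unavailable lane has to be recorded with a rationale rather than dropped.
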
## Non-Goals

- Do not make Case the default backend.
- Do not authorize production, critical, customer, vendor, external developer, or unowned repository mutation.
- Do not promote any CTO artifact into Core.
- Do not add Hermes WebUI Runtime behavior.
- Do not remove fake, Codex, or Pi comparison lanes.
- Do not treat one passing owned-repo run as candidate-default proof.
- Do not let Case choose its own eligibility, authority, target, model, or approval.

## User Stories

1. As JP, I want Case compared against existing lanes before default candidacy, so that stronger coding behavior does not bypass evidence discipline.
2. As Cortex, I want candidate-default claims to remain child-local until Core promotion, so that execution backend choice cannot create authority.
3. As Hermes, I want comparison proof to be replayable through Harness artifacts, so that operator control and visualization remain grounded in evidence.
4. As CTO, I want backend eligibility separated from backend execution, so that Case cannot approve itself.
5. As CTO Harness, I want the same evidence interface across fake, Codex, Pi, and Case, so that comparison is mechanical rather than conversational.
6. As a Target Repository owner, I want allowed-path and forbidden-action proof preserved in Stage 6, so that scoped real-repo use stays bounded.
7. As a reviewer, I want failure matrix closure before default candidacy, so that known bad states fail closed.

## Acceptance Criteria

- [ ] Stage 6 requires Stage 5 validation evidence before candidate-default comparison.
- [ ] Stage 6 allowed mutation scope is `scoped real-repo use only`.
- [ ] Case remains disabled by default until Stage 6 evidence is recorded.
- [ ] Candidate-default proof compares Case against fake, Codex, and Pi where applicable.
- [ ] Comparison evidence uses the Harness Evidence Interface, not raw backend logs alone.
- [ ] Case matches or beats existing lanes on report shape.
- [ ] Case matches or beats existing lanes on event validity.
- [ ] Case matches or beats existing lanes on allowed-path compliance.
- [ ] Case matches or beats existing lanes on failure closure.
- [ ] Case matches or beats existing lanes on artifact completeness.
- [ ] Case source admission freshness is recorded.
- [ ] Failure matrix rows are covered or explicitly blocked with rationale.
- [ ] Operator acceptance is recorded after comparison proof.
- [ ] No push, merge, deploy, close, PR open, issue close, public publication, Core mutation, vendor-source mutation, or unowned repository mutation is authorized.
- [ ] Local CTO validator checks Stage 6 PRD and issue artifacts.

## Validation

Planning validator: `python3 tools/validate_cto_child.py`.

Implementation validator planned for Hermes: `python3 harness/runner/validate-case-stage6.py --harness-root harness --json`, then `harness/evals/health.sh --json` after focused Stage 6 validation passes.

## Risks

- Candidate-default language may be mistaken for default backend activation.
- Comparison may be too weak if it checks only pass status and not evidence completeness.
- Codex or Pi may be unavailable in a specific environment; that must become explicit `where applicable` rationale, not silent omission.
- Failure matrix gaps may be hidden by green happy-path runs.
- Source admission may become stale after earlier Stage 5 proof.

## Dependencies

- Stage 5 owned noncritical repository proof is validated.
- Harness Evidence Interface contract remains active.
- Case source admission record remains current.
- Case adapter contract remains active.
- Failure fixture matrix remains active.
- Existing fake lane remains the default validation lane.
- Codex and Pi comparison lanes are used where applicable or explicitly blocked with rationale.

## Success Definition

Stage 6 is successful when the CTO Product Surface can record Case as a candidate-default backend only after Harness comparison evidence proves Case matches or beats fake, Codex, and Pi where applicable on report shape, event validity, allowed-path compliance, failure closure, and artifact completeness, with current source admission, failure matrix coverage, operator acceptance, and no authority drift.
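
The success definition can be read as a fail-closed gate: every comparison dimension and every precondition must hold before eligibility is recorded. The sketch below illustrates that reading. It assumes each dimension can be reduced to a comparable numeric score, which this PRD does not specify; `Stage6Evidence`, its fields, and `candidate_default_blockers` are hypothetical and do not describe the planned `validate-case-stage6.py`.

```python
from dataclasses import dataclass

# Comparison dimensions named in the success definition.
DIMENSIONS = (
    "report_shape",
    "event_validity",
    "allowed_path_compliance",
    "failure_closure",
    "artifact_completeness",
)


@dataclass
class Stage6Evidence:
    """Hypothetical evidence record; numeric scores are an assumption."""
    case_scores: dict[str, float]
    lane_scores: dict[str, dict[str, float]]  # applicable lanes only
    # Preconditions default to False so missing evidence fails closed.
    source_admission_current: bool = False
    failure_matrix_closed: bool = False
    operator_accepted: bool = False
    no_authority_drift: bool = False


def candidate_default_blockers(ev: Stage6Evidence) -> list[str]:
    """Return every reason eligibility cannot be recorded; empty means eligible."""
    blockers = []
    for dim in DIMENSIONS:
        case_score = ev.case_scores.get(dim)
        if case_score is None:
            blockers.append(f"missing Case evidence: {dim}")
            continue
        for lane, scores in ev.lane_scores.items():
            lane_score = scores.get(dim)
            if lane_score is None:
                blockers.append(f"missing {lane} evidence: {dim}")
            elif case_score < lane_score:  # "matches or beats" means >=
                blockers.append(f"Case below {lane} on {dim}")
    if not ev.source_admission_current:
        blockers.append("source admission not current")
    if not ev.failure_matrix_closed:
        blockers.append("failure matrix not closed")
    if not ev.operator_accepted:
        blockers.append("operator acceptance missing")
    if not ev.no_authority_drift:
        blockers.append("authority drift not ruled out")
    return blockers
```

Returning the full list of blockers, rather than a single boolean, keeps the comparison mechanical and leaves a reviewable record of why candidacy was withheld. Recording eligibility in this sketch does not switch the default backend; that remains out of scope for Stage 6.
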