Allowlist deep-research MCP for CTO

Svrnty 2026-05-25 10:01:53 -04:00
parent 6548e2ffaa
commit 0ca5ffc8ed
3 changed files with 116 additions and 10 deletions


@@ -44,7 +44,7 @@ auto_regen_cmd: "yq '.disclosure' manifest.yaml | <renderer-script>"
| Field | Value | Rationale |
|---|---|---|
| `inherit_builtins` | `false` | cto has zero builtins enabled — deny-by-default. Locks in clean posture. |
| `inherit_mcp_toolsets` | `false` | cto has zero MCP — deny-by-default. Closes potential bte-MCP-leak risk that hit ceo/steev. |
| `inherit_mcp_toolsets` | `false` | Deny-by-default. CTO has one explicitly allowlisted MCP server (`deep-research`); no inherited or global MCP bleed. |
| `inherit_dirs` | none | no external_dirs — no bundled-skill exposure |
| `sovereign_only` | `false` | INTENTIONAL. cto-agent itself runs sovereign `qwen3.6-35b-a3b`. The `claudeCode('claude-opus-4-7')` literal in sandcastle invocations names the AGENT INSIDE THE SANDBOX — hosted Claude lives behind sandcastle's isolation boundary (CONTRACT.md §5 + AUDIT §6 sovereignty note). Setting `true` would block the valid v1 design. |
@@ -60,9 +60,22 @@ Per `disclosure.skills` enum. Pre-push check 6.a enforces declared == live `herm
**Totals.** 3 skills total. Source breakdown: 3 local, 0 hub, 0 builtin, 0 external_dir.
## §4 MCP servers (0)
## §4 MCP servers (1)
No MCP servers exposed — deny-by-default allowlist is empty. cto orchestrates via sandcastle + shell, not MCP. Matches PROFILE-CATALOG §cto-planb. Closes the bte-MCP-leak risk that hit ceo/steev.
Per `disclosure.mcp_servers` allowlist. Deny-by-default; explicit tool enum (no `all`). `deep-research` is exposed for CTO source-grounding and current research per `CTO-WEBUI-CODING-AGENT-PRD.md` §8 and §23.
| Server | Transport | Endpoint | Tools | Hosted API | Data boundary |
|---|---|---|---:|---|---|
| `deep-research` | http | `http://127.0.0.1:3010/mcp` | 4 selected | conditional: hosted only when deep-research `INFERENCE_URL` routes through `llm-gateway` | Tailnet HTTP MCP; search/fetch reaches public web sources; LLM route disclosed by deep-research inference mode |
### §4.1 `deep-research` tool allowlist
| Tool | Mode | Justification |
|---|---|---|
| `mcp_deep_research_deep_research` | read | Full source-grounded research artifact for architecture, standards, vendor behavior, dependency choices, and PRD work. |
| `mcp_deep_research_web_search` | read | Granular current-source search for CTO investigations when a full artifact is too heavy. |
| `mcp_deep_research_fetch_page` | read | Fetch source pages selected during CTO research; browsing/fetch capability disclosed explicitly. |
| `mcp_deep_research_extract_pdf` | read | Extract standards papers, vendor PDFs, and architecture docs during CTO research. |
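
The tool IDs in the table above follow an `mcp_<server>_<tool>` pattern, and the F4 step added to the installer in this commit strips that prefix before writing native tool names into the profile config. A minimal sketch of the mapping, assuming the same prefix rule; the helper name `native_tool_name` is hypothetical and not part of the commit:

```python
from typing import Optional


def native_tool_name(server: str, tool_id: str) -> Optional[str]:
    """Strip the mcp_<server>_ prefix from a disclosed tool ID.

    Hyphens in the server name become underscores, mirroring the prefix
    built in the F4 step. Returns None when the ID does not belong to
    the given server or when nothing is left after the prefix.
    """
    prefix = "mcp_" + server.replace("-", "_") + "_"
    if not tool_id.startswith(prefix):
        return None
    return tool_id[len(prefix):] or None


for tool_id in (
    "mcp_deep_research_deep_research",
    "mcp_deep_research_web_search",
    "mcp_deep_research_fetch_page",
    "mcp_deep_research_extract_pdf",
):
    print(tool_id, "->", native_tool_name("deep-research", tool_id))
```

An ID that lacks the server's prefix yields `None`, which corresponds to the `continue` branch in F4: such entries are skipped rather than exposed.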
## §5 Sovereign APIs (1)
@@ -122,8 +135,8 @@ No cron jobs. cto runs on-demand or on kanban tick (CONTRACT.md §3 + manifest `
| Surface | Declared | Live | Status |
|---|---|---|---|
| Skills | 3 | 3 | in-sync (live verified by AUDIT-cto-2026-05-24.md §1) |
| MCP servers | 0 | 0 | in-sync (live verified by AUDIT §2) |
| MCP tools (total) | 0 | 0 | in-sync |
| MCP servers | 1 | 1 | in-sync (`deep-research`, 4 selected; verified 2026-05-25) |
| MCP tools (total) | 4 | 4 | in-sync (`deep_research`, `web_search`, `fetch_page`, `extract_pdf`) |
| External orchestrators | 1 (sandcastle) | 1 (sandcastle invoked by `lib/cto-worker.sh:50-62`) | in-sync (Wave-7 D2) |
| Credentials | 0 | 1 vault-absent declared in legacy block | acceptable (Pending JP — see §12) |
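
The Declared and Live columns above amount to a set comparison per surface. A minimal sketch of such a drift check, under the assumption that both sides can be reduced to lists of names; `surface_drift` and its result keys are hypothetical and do not describe how the pre-push gate is actually implemented:

```python
from typing import Dict, Iterable, List


def surface_drift(declared: Iterable[str], live: Iterable[str]) -> Dict[str, List[str]]:
    """Return names present on only one side of a declared/live pair."""
    declared_set, live_set = set(declared), set(live)
    return {
        "declared_not_live": sorted(declared_set - live_set),
        "live_not_declared": sorted(live_set - declared_set),
    }


def in_sync(declared: Iterable[str], live: Iterable[str]) -> bool:
    """True when declared and live name sets match exactly."""
    drift = surface_drift(declared, live)
    return not drift["declared_not_live"] and not drift["live_not_declared"]


declared_tools = ["deep_research", "web_search", "fetch_page", "extract_pdf"]
live_tools = ["web_search", "deep_research", "extract_pdf", "fetch_page"]
print(in_sync(declared_tools, live_tools))
print(surface_drift(declared_tools, live_tools + ["list_resources"]))
```

Comparing names rather than counts catches the case where the totals agree but the members differ.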


@@ -88,7 +88,8 @@ echo ""
# F1 resolve $HERMES_WORKSPACE in inherit_dirs → skills.external_dirs
# F2 compute denylist from disclosure.skills → skills.disabled
# F3 propagate inherit_mcp_toolsets → agent.inherit_mcp_toolsets
# F4 install subrepo pre-push disclosure-drift gate
# F4 materialize disclosure.mcp_servers → profile mcp_servers
# F4b install subrepo pre-push disclosure-drift gate
# F5 (D4) write sovereign vllm model block → model.{default,provider,base_url,…}
# Per-profile config lives at ~/.hermes/profiles/$PROFILE_NAME/config.yaml.
# cto inherit_dirs is empty by design (CONTRACT.md §1, §9) — F1 stays for template
@@ -189,6 +190,79 @@ else
echo " WARN: F3 yq not on PATH — skipping inherit_mcp_toolsets"
fi
# F4 — materialize explicit MCP allowlist from disclosure.mcp_servers
if [ "$DRY_RUN" -eq 1 ]; then
echo "DRY: F4 materialize disclosure.mcp_servers → $PROFILE_CFG"
else
mkdir -p "$(dirname "$PROFILE_CFG")"
[ -f "$PROFILE_CFG" ] || : > "$PROFILE_CFG"
python3 - "$GLOBAL_CFG" "$PROFILE_CFG" "$REPO/manifest.yaml" <<'PY'
import copy
import os
import sys
import yaml

global_cfg, profile_cfg, manifest_path = sys.argv[1:4]

def load_yaml(path):
    if not os.path.exists(path):
        return {}
    with open(path) as f:
        return yaml.safe_load(f) or {}

g = load_yaml(global_cfg)
p = load_yaml(profile_cfg)
m = load_yaml(manifest_path)
global_servers = g.get("mcp_servers") or {}
declared = ((m.get("disclosure") or {}).get("mcp_servers") or [])
new_servers = {}
missing = []

for item in declared:
    name = item.get("name")
    if not name:
        continue
    src = global_servers.get(name)
    if not src:
        missing.append(name)
        continue
    server = copy.deepcopy(src)
    prefix = "mcp_" + name.replace("-", "_") + "_"
    native_tools = []
    resource_tools = set()
    for tool in item.get("tools") or []:
        tid = tool.get("id") if isinstance(tool, dict) else str(tool)
        if not tid or not tid.startswith(prefix):
            continue
        native = tid[len(prefix):]
        if native in {"list_resources", "read_resource", "list_prompts", "get_prompt"}:
            resource_tools.add(native)
        else:
            native_tools.append(native)
    tools_cfg = server.setdefault("tools", {})
    if native_tools:
        tools_cfg["include"] = native_tools
    tools_cfg["resources"] = bool({"list_resources", "read_resource"} & resource_tools)
    tools_cfg["prompts"] = bool({"list_prompts", "get_prompt"} & resource_tools)
    server["enabled"] = True
    new_servers[name] = server

previous = set((p.get("mcp_servers") or {}).keys())
p["mcp_servers"] = new_servers
with open(profile_cfg, "w") as f:
    yaml.safe_dump(p, f, sort_keys=False, allow_unicode=True)

for name in sorted(set(new_servers) - previous):
    print(f" F4 + {name}")
for name in sorted(previous - set(new_servers)):
    print(f" F4 - {name} (not declared)")
for name in missing:
    print(f" F4 WARN: {name} declared but missing from global mcp_servers")
print(f" F4 wrote mcp_servers: {len(new_servers)}")
PY
fi
# F5 (D4) — sovereign vllm model block → per-profile config.yaml
# Per CONTRACT.md §5: cto-agent runs sovereign qwen3.6 (this block).
# claudeCode hosted is constrained INSIDE sandcastle isolation only.
@@ -216,12 +290,12 @@ else
echo " WARN: F5 yq not on PATH — skipping model block (cto-agent will fall back to global default)"
fi
# F4 — install subrepo pre-push hook (disclosure-drift gate)
# F4b — install subrepo pre-push hook (disclosure-drift gate)
HOOK_DST="$REPO/.git/hooks/pre-push"
if [ "$DRY_RUN" -eq 1 ]; then
echo "DRY: F4 install pre-push hook → $HOOK_DST"
echo "DRY: F4b install pre-push hook → $HOOK_DST"
elif [ ! -d "$REPO/.git" ]; then
echo " WARN: F4 $REPO/.git missing — not a git checkout, skip"
echo " WARN: F4b $REPO/.git missing — not a git checkout, skip"
else
cat > "$HOOK_DST" <<'HOOK_EOF'
#!/usr/bin/env bash


@@ -162,7 +162,26 @@ disclosure:
      role: toolkit
      justification: "Angular stack patterns — closes CONTRACT.md §6 'Angular = skill-only' gap; anchored to adwright/adwright-console"
  mcp_servers: [] # cto orchestrates via sandcastle + shell, not MCP
  mcp_servers:
    - name: deep-research
      transport: http
      endpoint: "http://127.0.0.1:3010/mcp"
      tools:
        - id: mcp_deep_research_deep_research
          mode: read
          justification: "Full source-grounded research artifact for architecture, standards, vendor behavior, dependency choices, and PRD work."
        - id: mcp_deep_research_web_search
          mode: read
          justification: "Granular current-source search for CTO investigations when a full artifact is too heavy."
        - id: mcp_deep_research_fetch_page
          mode: read
          justification: "Fetch source pages selected during CTO research; browsing/fetch capability disclosed explicitly."
        - id: mcp_deep_research_extract_pdf
          mode: read
          justification: "Extract standards papers, vendor PDFs, and architecture docs during CTO research."
      hosted_api: "conditional: hosted only when deep-research INFERENCE_URL routes through llm-gateway"
      data_boundary: "Tailnet HTTP MCP; search/fetch reaches public web sources; LLM route disclosed by deep-research inference mode."
      approval_required: false
  sovereign_apis:
    - name: bte-rest
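
The `mcp_servers` entry in this hunk carries an explicit tool list in which every tool has an `id`, a `mode`, and a `justification`. A minimal sketch of a shape check for one such entry, written against a plain dict so it runs without a YAML parser; `validate_mcp_entry` and its rules are assumptions drawn from the fields visible in this diff, not the project's actual validator:

```python
from typing import Any, Dict, List

# Fields every disclosed tool carries in the manifest hunk.
REQUIRED_TOOL_FIELDS = ("id", "mode", "justification")


def validate_mcp_entry(entry: Dict[str, Any]) -> List[str]:
    """Return human-readable problems for one mcp_servers entry."""
    errors: List[str] = []
    name = entry.get("name") or "<unnamed>"
    if not entry.get("name"):
        errors.append("server entry has no name")
    tools = entry.get("tools")
    # Reject wildcards such as `all`: only a non-empty explicit list passes.
    if not isinstance(tools, list) or not tools:
        errors.append(f"{name}: tools must be a non-empty explicit list")
        return errors
    for tool in tools:
        if not isinstance(tool, dict):
            errors.append(f"{name}: tool {tool!r} is not a mapping")
            continue
        for field in REQUIRED_TOOL_FIELDS:
            if not tool.get(field):
                errors.append(f"{name}: tool {tool.get('id', '?')} missing {field}")
    return errors


entry = {
    "name": "deep-research",
    "transport": "http",
    "endpoint": "http://127.0.0.1:3010/mcp",
    "tools": [
        {"id": "mcp_deep_research_web_search", "mode": "read",
         "justification": "Granular current-source search."},
    ],
}
print(validate_mcp_entry(entry))
print(validate_mcp_entry({"name": "deep-research", "tools": "all"}))
```

Returning a list of problems rather than raising on the first one lets a caller report every gap in a single pass.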