Agent MCP tooling (icn-ops)
The ops/mcp server is a portable agent operations layer: one TypeScript stdio MCP that Cursor, Claude Code, Codex-style hosts, and future MCP clients can share. It is not editor-specific glue; it centralizes read-mostly diagnostics, cached cluster/git health, and safe discovery so agents can orient before changing the ICN monorepo.
Launch (all clients)
From the repository root, register icn-ops with:
- Command: npm
- Args: ["--prefix", "./ops/mcp", "run", "start:stdio"]
Repo files:
- .mcp.json: Claude-compatible project MCP
- .cursor/mcp.json: Cursor project MCP
These must stay identical for icn-ops (enforced by python3 scripts/check-mcp-portability.py). Do not point MCP configs at ops/mcp/dist/index.js directly; start:stdio runs tsc and then node under a single Node installation, which avoids the ABI drift that a direct dist/index.js launch can cause.
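The parity rule can be illustrated with a short TypeScript sketch. This is not the logic of scripts/check-mcp-portability.py, which this document does not show. The function name `sameServerEntry`, the interfaces, and the assumption that both files keep servers under a top-level `mcpServers` key are illustrative only.

```typescript
// Hypothetical sketch: compare one server entry across two MCP config objects.
// Assumes both configs use a top-level "mcpServers" map (an assumption, not
// something this document confirms).
interface McpServerEntry {
  command: string;
  args: string[];
}

interface McpConfig {
  mcpServers?: Record<string, McpServerEntry>;
}

function sameServerEntry(a: McpConfig, b: McpConfig, name: string): boolean {
  const left = a.mcpServers?.[name];
  const right = b.mcpServers?.[name];
  if (!left || !right) return false; // missing in either file counts as drift
  return (
    left.command === right.command &&
    left.args.length === right.args.length &&
    left.args.every((arg, i) => arg === right.args[i])
  );
}

// The canonical launch from this document.
const canonical: McpConfig = {
  mcpServers: {
    "icn-ops": { command: "npm", args: ["--prefix", "./ops/mcp", "run", "start:stdio"] },
  },
};

// A config that points at dist/index.js directly, which this document forbids.
const drifted: McpConfig = {
  mcpServers: {
    "icn-ops": { command: "node", args: ["./ops/mcp/dist/index.js"] },
  },
};

console.log(sameServerEntry(canonical, canonical, "icn-ops")); // true
console.log(sameServerEntry(canonical, drifted, "icn-ops")); // false
```

Comparing argv element by element, rather than comparing joined strings, keeps an argument that contains spaces from being mistaken for two arguments.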
Install once per checkout (or after changing Node major):
cd ops/mcp && npm ci
Native module note: better-sqlite3 is rebuilt in postinstall. If the MCP host uses a different Node than the one used for npm ci, run npm rebuild better-sqlite3 (or npm ci again) under the host’s Node.
Tools (agent-facing)
| Tool | Purpose |
|---|---|
| icn_ops_environment_report | JSON snapshot: repo root, git branch/commit/dirty, Node ABI, npm/rust/python versions, optional gh/kubectl, MCP config inspection, node_modules/dist presence, better-sqlite3 load probe. Missing optional CLIs are warnings, not hard failures. |
| icn_ops_doctor | Read-only diagnosis: severity (ok / warn / error), per-check results, suggested shell repair commands (not executed). Covers MCP parity script, native module, dirty tree, optional tools, key ops/state files, and lightweight CLI runner probes (git / npm / python3). |
| icn_ops_agent_brief | Compact structured briefing: docs to read first, safe vs forbidden vocabulary, verification commands by area, PR hygiene, completeness warning, MCP troubleshooting bullets. |
| icn_ops_command_catalog | Catalog only: grouped commands with working_directory, safety (read_only / modifies_local / destructive / external_side_effect), runtime hint, when_to_use, and optional caution (e.g. long cargo test, npm ci lockfile behavior, gh network). Never runs commands. |
| icn_ops_state_index | Lists canonical state/architecture paths with present: true/false (filesystem stat); does not invent missing files. Optional arg include_absent (default true). |
| icn_ops_next_steps | Read-only workflow guidance as small JSON: severity (ok / warn / error), summary, recommended_steps[] (title, reason, optional command, working_directory, safety, priority, blocks_agent_work), plus diagnosis_digest counts (not full doctor dumps). Never executes repair commands. |
| icn_ops_verification_plan | Ordered checklist only. Input: area (mcp / docs / rust / website / vocabulary / pr / full) and optional risk_level (quick / standard / thorough). Returns steps with command, purpose, expected_success_signal, safety, estimated_runtime, notes. Does not run commands. full layers MCP checks with docs, Rust, vocabulary, optional website, and (when thorough) PR API checks. |
| icn_ops_repo_map | Layout map: key repo paths (docs/, ops/mcp, icn, scripts, ADR/RFC dirs, SDKs, web/pilot-ui, deploy, …) with present, description, agent_use, optional caution. Absent paths stay present: false. |
Existing tools (cluster_health, sessions, tasks, decisions, etc.) remain available. Poller and health paths use execFile-style argv (via runCommand in ops/mcp/src/utils/commands.ts): no shell, bounded stdout/stderr, timeouts, and structured { ok, exitCode, stderr, timedOut } results. kubectl get pods -o json is parsed in-process (no jq pipeline). External JSON goes through safeJsonParse so malformed output becomes { error, preview } instead of throwing through the MCP boundary.
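As a rough illustration of the parsing behavior described above, the sketch below wraps JSON.parse so that malformed tool output becomes data instead of an exception, then counts pod phases from kubectl-shaped JSON in-process. The names `safeJsonParseSketch` and `countPodPhases` and the `ok` discriminant are assumptions made for this example; the real safeJsonParse may have a different signature.

```typescript
// Hypothetical sketch of a safeJsonParse-style helper; the real helper may differ.
type ParseResult =
  | { ok: true; value: unknown }
  | { ok: false; error: string; preview: string };

function safeJsonParseSketch(raw: string, previewChars = 80): ParseResult {
  try {
    return { ok: true, value: JSON.parse(raw) };
  } catch (err) {
    const error = err instanceof Error ? err.message : String(err);
    // Keep only a short preview so a huge malformed payload stays bounded.
    return { ok: false, error, preview: raw.slice(0, previewChars) };
  }
}

type PhaseCounts =
  | { ok: true; counts: Record<string, number> }
  | { ok: false; error: string; preview: string };

// Count pods by status.phase from `kubectl get pods -o json`-shaped text,
// without shelling out to jq.
function countPodPhases(raw: string): PhaseCounts {
  const parsed = safeJsonParseSketch(raw);
  if (!parsed.ok) return { ok: false, error: parsed.error, preview: parsed.preview };

  const items = (parsed.value as { items?: unknown } | null)?.items;
  const counts: Record<string, number> = {};
  if (!Array.isArray(items)) return { ok: true, counts };

  for (const item of items) {
    const phase = (item as { status?: { phase?: unknown } } | null)?.status?.phase;
    const key = typeof phase === "string" ? phase : "Unknown";
    counts[key] = (counts[key] ?? 0) + 1;
  }
  return { ok: true, counts };
}

const sample = JSON.stringify({
  items: [
    { status: { phase: "Running" } },
    { status: { phase: "Running" } },
    { status: { phase: "Pending" } },
  ],
});
console.log(countPodPhases(sample));
console.log(countPodPhases("error: connection refused"));
```

Returning a tagged result forces each caller to handle the malformed case explicitly, which is one way to keep parse failures from crossing the MCP boundary as thrown errors.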
Safe command execution policy
- Default: runCommand / runCommandQuick / runCommandJson use argv only, no /bin/sh -c, predictable quoting, output truncation.
- Why avoid shell pipelines: they inject quoting bugs, hide exit codes, and make “optional tool missing” look like opaque script failures.
- Failures: represented as ok: false plus stderr/exit/timedOut; pollers write those into health_cache or doctor checks, and optional probes never call process.exit.
- Warnings vs errors: icn_ops_doctor uses warn for optional CLIs and dirty trees; error for portability failures, native module load, or missing node_modules. Suggested repairs are strings only and are not executed by MCP.
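The policy above can be sketched with Node's child_process module. This is a minimal illustration under stated assumptions, not the code in ops/mcp/src/utils/commands.ts: it uses the synchronous spawnSync for brevity (the real runner may be asynchronous), and `runCommandSketch`, `CommandResult`, and the default limits are hypothetical.

```typescript
import { spawnSync } from "node:child_process";

// Hypothetical result shape modeled on the fields named in this document.
interface CommandResult {
  ok: boolean;
  exitCode: number | null;
  stdout: string;
  stderr: string;
  timedOut: boolean;
}

function truncate(text: string, maxChars: number): string {
  return text.length > maxChars ? text.slice(0, maxChars) + "...[truncated]" : text;
}

function runCommandSketch(
  file: string,
  args: string[],
  timeoutMs = 5000,
  maxChars = 4000,
): CommandResult {
  // argv only: shell stays disabled, so arguments are never re-parsed by /bin/sh.
  const result = spawnSync(file, args, { shell: false, timeout: timeoutMs, encoding: "utf8" });

  const errorCode = (result.error as NodeJS.ErrnoException | undefined)?.code;
  const stdout = typeof result.stdout === "string" ? result.stdout : "";
  const stderr = typeof result.stderr === "string" ? result.stderr : "";

  return {
    ok: !result.error && result.status === 0,
    exitCode: result.status,
    stdout: truncate(stdout, maxChars),
    // A spawn failure such as a missing binary produces no stderr,
    // so surface the error message instead.
    stderr: truncate(stderr || (result.error?.message ?? ""), maxChars),
    timedOut: errorCode === "ETIMEDOUT",
  };
}

// A missing optional tool becomes a structured failure, not a thrown error.
const probe = runCommandSketch("git", ["rev-parse", "--abbrev-ref", "HEAD"]);
console.log(probe.ok ? `branch: ${probe.stdout.trim()}` : `git probe failed: ${probe.stderr}`);
```

Because every outcome is returned as data, a poller or doctor check can record the failure and continue, which matches the rule that optional probes never terminate the process.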
Diagnostics vs recommendations vs execution
| Layer | Tools | Agent behavior |
|---|---|---|
| Diagnostics | icn_ops_environment_report, icn_ops_doctor, icn_ops_state_index | Inspect facts; large payloads possible on doctor. |
| Recommendations | icn_ops_next_steps, icn_ops_command_catalog, icn_ops_verification_plan, icn_ops_repo_map, icn_ops_agent_brief | Choose what to run locally; commands are strings only. |
| Execution | Your shell / CI / editor | Only the human or agent host runs commands. MCP does not auto-fix, remediate, or open arbitrary shells. |
Human review: MCP guidance does not replace human judgment before destructive or external_side_effect actions (e.g. git reset --hard, git clean, gh pr create, production deploys). Treat catalog and verification plans as checklists, not autopilots.
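One way an agent host might apply this rule is a small gate keyed on the safety vocabulary. The sketch below is illustrative: `CatalogCommand` and `needsHumanReview` are hypothetical names, and the safety labels assigned to the sample commands are assumptions for the example, not values read from the real catalog.

```typescript
// Safety vocabulary as listed for icn_ops_command_catalog.
type Safety = "read_only" | "modifies_local" | "destructive" | "external_side_effect";

// Hypothetical subset of a catalog entry.
interface CatalogCommand {
  command: string;
  working_directory: string;
  safety: Safety;
}

// Destructive and externally visible actions wait for a human decision.
function needsHumanReview(entry: CatalogCommand): boolean {
  return entry.safety === "destructive" || entry.safety === "external_side_effect";
}

// Sample entries; the safety labels here are assumed for illustration.
const samples: CatalogCommand[] = [
  { command: "python3 scripts/check-mcp-portability.py", working_directory: ".", safety: "read_only" },
  { command: "git reset --hard", working_directory: ".", safety: "destructive" },
  { command: "gh pr create", working_directory: ".", safety: "external_side_effect" },
];

for (const entry of samples) {
  const label = needsHumanReview(entry) ? "ask a human first" : "ok to run";
  console.log(`${entry.command}: ${label}`);
}
```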
Choosing checks before editing
1. icn_ops_next_steps: see whether the worktree or MCP layer blocks progress (blocks_agent_work).
2. icn_ops_repo_map: confirm which subtrees exist for this checkout.
3. icn_ops_verification_plan with the right area and risk_level: copy commands into your terminal; adjust scope (cargo test -p …, filters) as needed.
4. icn_ops_command_catalog: deeper command lookup with the same safety vocabulary as verification plans.
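The first step can be sketched as a filter over an icn_ops_next_steps response. The field names follow the tool description in this document, but their types are assumptions: in particular, this sketch treats priority as a number where lower means more urgent. The sample response is invented for illustration.

```typescript
type StepSeverity = "ok" | "warn" | "error";

// Field names from the tool description; types are assumed.
interface RecommendedStep {
  title: string;
  reason: string;
  command?: string;
  working_directory: string;
  safety: string;
  priority: number; // assumption: lower number = more urgent
  blocks_agent_work: boolean;
}

interface NextStepsResponse {
  severity: StepSeverity;
  summary: string;
  recommended_steps: RecommendedStep[];
}

// Return only the steps that block agent work, most urgent first.
function blockingSteps(response: NextStepsResponse): RecommendedStep[] {
  return response.recommended_steps
    .filter((step) => step.blocks_agent_work)
    .sort((a, b) => a.priority - b.priority);
}

// Invented sample response for illustration.
const sampleResponse: NextStepsResponse = {
  severity: "error",
  summary: "Native module failed to load",
  recommended_steps: [
    {
      title: "Review dirty worktree",
      reason: "Uncommitted changes present",
      working_directory: ".",
      safety: "read_only",
      priority: 2,
      blocks_agent_work: false,
    },
    {
      title: "Rebuild better-sqlite3",
      reason: "Node ABI mismatch",
      command: "npm rebuild better-sqlite3",
      working_directory: "ops/mcp",
      safety: "modifies_local",
      priority: 1,
      blocks_agent_work: true,
    },
  ],
};

for (const step of blockingSteps(sampleResponse)) {
  console.log(`BLOCKER: ${step.title} (${step.command ?? "no command"})`);
}
```

The host still decides whether to run any returned command; the filter only orders what to look at first.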
Common failure modes
| Symptom | Likely cause | What to do |
|---|---|---|
| MCP host shows native module / NODE_MODULE_VERSION | Node ABI mismatch vs npm ci | cd ops/mcp && npm rebuild better-sqlite3 or reinstall with the host’s Node |
| icn_ops_doctor → portability error | .mcp.json ≠ .cursor/mcp.json | Align args to canonical npm --prefix ./ops/mcp run start:stdio |
| Doctor warns on missing ops/state/... | Fresh clone or ops state not checked in | Confirm paths; some files are environment-specific |
| kubectl / gh warnings | Optional tooling absent | Expected on dev laptops; not required for MCP core |
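For the first row, a host-side helper could extract the two ABI numbers from Node's native-module load error so the mismatch is stated explicitly. This sketch assumes the usual wording of that error, which mentions NODE_MODULE_VERSION twice (the version the module was compiled against, then the version the running Node requires). `parseAbiMismatch` is a hypothetical name, and the sample message and its path are illustrative.

```typescript
interface AbiMismatch {
  compiledFor: number;
  required: number;
}

// Pull both ABI numbers out of a native-module load error message.
// Returns null when the message does not mention two versions.
function parseAbiMismatch(message: string): AbiMismatch | null {
  const found = message.match(/NODE_MODULE_VERSION \d+/g);
  if (!found || found.length < 2) return null;
  const toNumber = (text: string): number => Number(text.replace("NODE_MODULE_VERSION ", ""));
  return { compiledFor: toNumber(found[0]), required: toNumber(found[1]) };
}

// Illustrative message; the path is a placeholder.
const sampleMessage =
  "The module '/path/to/better_sqlite3.node' was compiled against a different " +
  "Node.js version using NODE_MODULE_VERSION 108. This version of Node.js " +
  "requires NODE_MODULE_VERSION 115.";

const mismatch = parseAbiMismatch(sampleMessage);
if (mismatch) {
  console.log(`built for ABI ${mismatch.compiledFor}, host needs ABI ${mismatch.required}`);
}
```

The running Node's own ABI is available as process.versions.modules, which can be compared against the `required` value when deciding which Node to rebuild under.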
Validation before edits
- Call icn_ops_next_steps for a compact gate, or icn_ops_doctor / icn_ops_environment_report for full detail after branch or Node changes.
- Follow icn_ops_agent_brief + AGENTS.md change routing for the area you touch.
- Use icn_ops_verification_plan and/or icn_ops_command_catalog to pick checks; run them in your terminal (MCP does not auto-run them).
Ship checklist (ops MCP)
Launch (Cursor + Claude + any MCP host): repo root, command = npm, args = ["--prefix", "./ops/mcp", "run", "start:stdio"]. Do not point hosts at node ./ops/mcp/dist/index.js (ABI drift). After changing Node major: cd ops/mcp && npm ci, then reload the MCP session / Cursor window.
Merge order to main: land #1716 (stdio unify) → #1717 (diagnostics) → #1718 (execFile runner) → #1719 (workflow guidance). Later PRs may need git fetch && git rebase origin/main after earlier merges.
Local verification (repo root):
npm --prefix ./ops/mcp ci
npm --prefix ./ops/mcp run build
npm --prefix ./ops/mcp test
python3 scripts/check-mcp-portability.py
timeout 5 npm --prefix ./ops/mcp run start:stdio # exit 124 while server runs is OK
Agents: use icn_ops_doctor / icn_ops_next_steps for “what’s wrong?” and “what should I run?”. All suggested commands are strings only; MCP neither runs shells nor auto-remediates.
Warnings vs blockers: missing kubectl / gh / dirty tree → usually warn. Missing ops/mcp/node_modules, failed better-sqlite3 load, portability script failure → treat as blockers for MCP until fixed.
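The warn-versus-blocker split amounts to a worst-severity rollup over individual checks. The sketch below shows that idea only; `CheckResult`, `overallSeverity`, and the check names are hypothetical, and the real icn_ops_doctor may aggregate differently.

```typescript
type CheckSeverity = "ok" | "warn" | "error";

// Hypothetical per-check result.
interface CheckResult {
  name: string;
  severity: CheckSeverity;
}

// The overall severity is the worst severity among the checks.
function overallSeverity(checks: CheckResult[]): CheckSeverity {
  if (checks.some((check) => check.severity === "error")) return "error";
  if (checks.some((check) => check.severity === "warn")) return "warn";
  return "ok";
}

// Optional tooling and a dirty tree only warn.
const laptopChecks: CheckResult[] = [
  { name: "kubectl_present", severity: "warn" },
  { name: "dirty_tree", severity: "warn" },
  { name: "mcp_portability", severity: "ok" },
];

// A failed native module load is a blocker.
const brokenChecks: CheckResult[] = [
  ...laptopChecks,
  { name: "better_sqlite3_load", severity: "error" },
];

console.log(overallSeverity(laptopChecks)); // warn
console.log(overallSeverity(brokenChecks)); // error
```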
Related docs
- cursor-mcp-setup.md: Cursor vs Claude wiring and smoke-test commands
- AGENTS.md (repo root): invariants and verification matrix
- ops/CLAUDE.md: orchestration plane layout