feat(agent-platform): custom-tool authoring foundations — types, docs, AST validators#66584
feat(agent-platform): custom-tool authoring foundations — types, docs, AST validators#66584dmarticus wants to merge 1 commit into
Conversation
…, AST validators
Static half of the custom-tool authoring loop. Adds the author-facing
type contract (`CustomToolContext`, `CustomTool`), a docs file explaining
the compute-only sandbox model, and two new AST passes in the compile
pipeline: a banned-construct check that rejects network/process modules
and dynamic-code constructs with friendly errors at PUT time, and a
capability extractor that surfaces declared secret refs for the
upcoming authoring UI. Shape + banned errors merge into one response
so authors see every diagnostic in a single round-trip. Capability
extractor is conservative: only literal `ctx.secrets.ref('NAME')` calls
are collected; aliases / destructures / computed access flip a
`dynamic_secret_refs` flag so the UI knows the static list is advisory.
The runtime sandbox (`--network=none` / `blockNetwork:true`) remains
the real boundary; the AST passes are a friendly-error layer on top.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
👀 Auto-assigned reviewersThese soft owners were skipped because they only have minor changes here. Nothing blocks merge, so self-assign if you'd like a look:
Soft owners come from |
|
Reviews (1): Last reviewed commit: "feat(agent-platform): custom-tool author..." | Re-trigger Greptile |
|
Size Change: 0 B Total Size: 64.8 MB ℹ️ View Unchanged
|
| await writeToolSourceAndSchema(req.params.id, opts.bundles, tool) | ||
| await opts.bundles.write(req.params.id, toolCompiledPath(id), compile.compiled_js!) | ||
| res.json({ ok: true, tool_id: id }) | ||
| res.json({ ok: true, tool_id: id, capabilities: compile.capabilities }) |
There was a problem hiding this comment.
Capabilities not persisted — authoring UI will be unable to retrieve them
capabilities is returned in the PUT response but never written to the bundle alongside compiled.js. The PR description explicitly lists "capability summaries on each tool card" as a coming feature; without a persisted artifact (e.g. tools/<id>/capabilities.json) the GET /revisions/:id/bundle endpoint has no way to surface them, and any UI that isn't holding the original PUT response in memory will silently show nothing. The source is available in source.ts and could be re-parsed, but that would duplicate the compile pipeline in the GET path.
| function isBannedModuleSpecifier(spec: string): boolean { | ||
| const stripped = spec.startsWith('node:') ? spec.slice('node:'.length) : spec | ||
| return BANNED_NODE_MODULES.has(stripped) | ||
| } |
There was a problem hiding this comment.
'node:' literal repeated in the same expression (OnceAndOnlyOnce)
spec.startsWith('node:') and spec.slice('node:'.length) encode the same prefix independently. Extracting the prefix to a constant makes the coupling explicit and eliminates the subtle divergence risk if one is updated without the other.
| function isBannedModuleSpecifier(spec: string): boolean { | |
| const stripped = spec.startsWith('node:') ? spec.slice('node:'.length) : spec | |
| return BANNED_NODE_MODULES.has(stripped) | |
| } | |
| const NODE_PREFIX = 'node:' | |
| function isBannedModuleSpecifier(spec: string): boolean { | |
| const stripped = spec.startsWith(NODE_PREFIX) ? spec.slice(NODE_PREFIX.length) : spec | |
| return BANNED_NODE_MODULES.has(stripped) | |
| } |
Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!
Problem
The custom-tool runtime on the agent platform is production-grade — sandboxed dispatch, secret nonces, identity gate, approval queue, audit via session conversation. But there's no authoring loop a customer can actually use: no published type contract for editor intellisense, no docs explaining the compute-only model, no friendly errors at PUT time when source is malformed or imports a forbidden module.
This PR is the static half of that loop: the contract authors import, the docs that explain the contract, and the AST-level validation that runs at
PUT /revisions/:id/tools/:idso authors get structured diagnostics before anything reaches the sandbox.Changes
Types —
agent-shared/src/spec/custom-tool.tsaddsCustomToolContext,CustomTool,CustomToolAction. Mirrors the actual ctx built byagent-sandbox-host/src/dispatch.js(deliberately minimal:secrets.ref(name)+log). Type-only — esbuild strips at compile so nothing reaches the sandbox runtime. Authors usesatisfies CustomToolfor inference.Docs —
docs/custom-tools.mdcovers the compute-only model and why (no network reach is what makes custom tools safe on attacker-influenced input — the lethal trifecta is impossible because the third leg doesn't exist), theexport default { actions: { default } }contract, the minimal ctx, the egress story (use@posthog/http-requestwith host-pinned secrets), and the identity / approval gating.Compile pipeline —
compile-custom-tools.tsgains two AST passes layered after the existing shape check:http,https,net,child_process,worker_threads,vm, plus escape hatches likeinspector,module,repl,wasi) and rejectseval(...)/Function(...)/new Function(...). Coversimport,require, and dynamicimport()forms. Useful stdlib stays allowed (fs,crypto,path,util,stream,url, …).ctx.secrets.ref('NAME')calls and emits{secret_refs, dynamic_secret_refs}for the future authoring UI. Receiver-tight (onlyctx, not<anyIdent>.secrets.ref) and conservative about completeness — aliases, destructures, computed access, and barectx.secretsreads flipdynamic_secret_refs: trueso consumers know the static list is advisory.Shape and banned errors merge into one response so authors fix everything in one round-trip. The bundle is left untouched on any failure. The PUT response gains a
capabilitiesfield.Tests — 63 unit cases in
compile-custom-tools.test.ts(parameterized: banned modules, allowed modules, dynamic-flag aliases, etc.) plus 2 wire-level e2e cases intyped-bundle-authoring.test.ts.The runtime sandbox (
--network=nonein Docker,blockNetwork:truein Modal,--cap-drop=ALL) remains the real security boundary. The AST passes are a friendly-error layer on top of that — they exist to fail at PUT time with an actionable message instead of at session start with a cryptic runtime error.How did you test this code?
Agent-authored — no manual testing claimed.
Automated tests run:
pnpm --filter @posthog/agent-janitor vitest run src/compile-custom-tools.test.ts— 63/63 pass (47 pre-existing, 16 new).pnpm --filter @posthog/agent-tests vitest run src/cases/typed-bundle-authoring.test.ts— 31/31 pass.pnpm --filter @posthog/agent-janitor typescript:check— clean.pnpm --filter @posthog/agent-janitor lint— clean.pnpm --filter @posthog/agent-shared typescript:check— clean.What the new tests catch that nothing existing did:
Function('payload')()(withoutnew) is rejected — earlier code only matchednew Function(...).<non-ctx>.secrets.ref('NAME')is not over-collected into the capability summary.dynamic_secret_refs: trueinstead of silently under-reporting.inspector,inspector/promises,module,repl,wasi) are rejected at PUT time.Automatic notifications
Docs update
The doc in this PR (
products/agent_platform/docs/custom-tools.md) is the internal contract. Customer-facing docs (posthog.com) come later, after Phase 4's authoring UI lands.🤖 Agent context
Autonomy: Human-driven (agent-assisted)
Tool: Claude Code (Opus 4.7). Skills invoked during this PR:
/qa-team(multi-agent QA review at end of Phase 1; report atQAREPORT.mdin the working tree — five reviewers converged on capability extractor scope, banned-list completeness, and error-reporting consistency, all addressed before this commit).Scope choices made along the way that are worth flagging:
products/agent_platform/packages/agent-tool-types/; once it became clear that the actualctxis tiny and esbuild doesn't bundle imports anyway, the right place was a single file insideagent-sharedthat authorsimport typefrom. Extracting to a published@posthog/agent-tool-typeslater is mechanical when GA arrives.kind: "custom_template"spec slot is still there from the pre-cutover registry that got deleted ineb2d9ef855; reviving it stays out of scope until there's cross-agent reuse demand.Next: where does dry-run live?
This PR is the static half of custom-tool authoring. The runtime half — an interactive "test this tool against synthetic args" endpoint authors hit from the UI — is the next thing to build, and it has an architectural question I'd value pushback on before I open the Phase 2 PR.
Stuck: dry-run combines authoring the agent spec (janitor's job) with executing untrusted code (the tool can be arbitrary JS) in a sandbox (only the runner does this today). Three options I considered, with their costs:
agent-janitor— pro: natural URL shape, sits next toPUT /tools/:id. Con: janitor is a pure CRUD service today; adding a sandbox pool doubles its blast radius (Modal SDK init, OOM handling, orphan reaping, lifecycle bugs) for one feature.agent-runnerand let it do CRUD ops — pro: smallest code change, runner already owns the sandbox pool. Con: violates the queue-driven invariant the runner is sized and scaled around; the/healthzexception is the only one we want.IMO none of the existing services is the right home — each ends up either growing a responsibility it shouldn't have or running on a lifecycle that's wrong for the use case.
Pitch: a small new service
agent-execwhose single job is "execute this compiled JS with these args + ctx stub, return result, terminate." Routing is Django → janitor → agent-exec, so janitor stays the authoring authority and does all the CRUD-shaped checks itself (verify JWT, assertDraft, fetch revision, read compiled.js + schema, validate args against the schema, check secret mocks againstspec.secrets[]); agent-exec only sees the bytes it needs to run, with a single-shot lifecycle and warm-pool-capable for sub-second feedback. Cost is one extra intra-cluster hop (5-20ms, small against multi-second sandbox cold starts) plus a new k8s deployment / Helm chart / on-call surface. Implementation is ~400 lines mostly reusingselectSandboxPool+ JWT verify fromagent-shared.Input requested:
agent-execservice for the execution leg, fronted by janitor) feel right, or am I over-rotating from "just put it on the runner / janitor / queue"?agent-execshouldn't reuseselectSandboxPoolwith a single-shot lifecycle and a warm pool?