feat: plumb task-configured MCP servers through ACP session/new#589
Open
bingran-you wants to merge 1 commit into
Open
feat: plumb task-configured MCP servers through ACP session/new#589bingran-you wants to merge 1 commit into
bingran-you wants to merge 1 commit into
Conversation
task.toml [[environment.mcp_servers]] was parsed into SandboxConfig.mcp_servers but never reached the agent: both ACP session entry points hardcoded an empty MCP list and the vendored McpServerSpec was never constructed. This wires the full path so a task can attach MCP servers (e.g. a Playwright MCP) to the agent's ACP session. - McpServerSpec gains the wire-correct fields (name, headers) and a to_new_session_param() projection onto the SDK's discriminated per-transport union (stdio omits type; sse/http carry url+headers). - rollout._task_mcp_specs() owns the MCPServerConfig -> McpServerSpec translation (incl. streamable-http -> http naming) at the composition root, keeping acp/ free of any task-config dependency. - ACPClient.session_new/session_load and connect_acp accept the spec list; None preserves the historical empty-list default. - tests/test_acp_mcp_servers.py covers config->spec mapping, the per-transport wire shape, and the rollout threading. Framework-only change; unblocks Playwright-MCP variants of research tasks.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
task.toml[[environment.mcp_servers]]was parsed intoSandboxConfig.mcp_serversbut never reached the agent: both ACP session entry points hardcoded an empty MCP list (mcp_servers=[]), and the vendoredMcpServerSpecwas never constructed anywhere. So a task could not attach any MCP server (e.g. a Playwright MCP) to the agent.This PR wires the full path task config → ACP
session/newso task-declared MCP servers are attached to the agent's session.Changes
acp/types.py—McpServerSpecbecomes wire-correct: addsname(required — every ACP variant carries it) andheaders, plusto_new_session_param()which projects the flat spec onto the SDK's discriminated per-transport union (stdio omitstypeand carriescommand/args/env; sse/http carryurl/headersand keeptype).rollout.py—_task_mcp_specs()owns theMCPServerConfig → McpServerSpectranslation (including thestreamable-http → httpnaming bridge) at the composition root, soacp/keeps zero dependency on the task-config layer. Threaded into bothconnect_acpcall sites (connect()andconnect_as()).acp/client.py/acp/runtime.py—session_new,session_load, andconnect_acpaccept the spec list.None→ empty list, preserving the historical benchmark default.tests/test_acp_mcp_servers.py— covers the per-transport wire projection, the_task_mcp_specsmapping (all three transports + absent config), thesession_new/session_loadwire output, and the rollout threading intoconnect_acp.Design notes
McpServerSpec(not raw dicts orMCPServerConfigpassed down): it's the ACP-layer DTO that decouples the protocol layer from both the task-config shape and the SDK's splitMcpServerStdio/SseMcpServer/HttpMcpServerunion. Passing typed specs throughconnect_acp/session_newkeeps those signatures honest; doing the translation in the rollout keepsacp/free of anybenchflow.taskimport.session_newusesNewSessionParams.model_validate(...)so the SDK's discriminated union validates each per-transport dict (the constructor's static type rejects alist[dict]).Scope / limitations
env: [](MCPServerConfighas noenvfield) — fine for Playwright; a stdio MCP needing secrets would require adding anenvfield toMCPServerConfig(follow-up).session/newmcpServersvs only its own config still needs a smoke test before relying on it end-to-end.Testing
uv run --extra dev python -m pytest tests/— full suite greenuv run ty check src/— cleanuv run ruff check .— clean