Outbound A2A: support long-running tasks and multi-turn interaction #267
Merged
rockfordlhotka merged 24 commits into main on Apr 12, 2026
…rn follow-up (#265)
Enable two RockBot instances to collaborate on behalf of their users (e.g., negotiating a meeting time) by supporting the full A2A task lifecycle instead of treating the first response as final.
- Poll GetTask with exponential backoff when HTTP agents return Working/Submitted
- Handle InputRequired via trust-gated follow-up loop (both HTTP and queue transports)
- Use existing inbound trust levels to decide autonomous vs user-surfaced responses
- Add contextId-based conversation continuation on the inbound RockBotTaskHandler
- Loop protection: max 20 rounds + repeated Q/A detection (threshold 3)
- OTel metrics (polling_attempts, input_required_rounds, input_required_breaks) and spans with cross-container correlation tags (task_id, context_id, session_id)
- 17 new tests covering repetition detector, response mapping, and continuation
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
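A minimal Python sketch of two mechanisms from the list above: capped exponential backoff while polling, and the repeated Q/A check used for loop protection. The 2s start and 30s cap come from the PR summary; the doubling factor, all function and class names, the attempt limit, and the way rounds are compared (exact match on a normalized question/answer pair) are assumptions for illustration. The actual RockBot code is C# and is not shown in this PR conversation.

```python
from collections import Counter


def backoff_delays(initial=2.0, cap=30.0, factor=2.0):
    """Yield poll delays in seconds: initial, initial*factor, ... capped at `cap`."""
    delay = initial
    while True:
        yield min(delay, cap)
        delay = min(delay * factor, cap)


def poll_until_terminal(get_task, sleep, max_attempts=50):
    """Call get_task() until the state is no longer Working/Submitted.

    get_task and sleep are injected so the loop can run without a network
    or real waiting. Returns (state, attempts).
    """
    delays = backoff_delays()
    for attempt in range(1, max_attempts + 1):
        state = get_task()
        if state not in ("Working", "Submitted"):
            return state, attempt
        sleep(next(delays))
    raise TimeoutError("task did not leave Working/Submitted")


class RepetitionDetector:
    """Flags a loop when the same question/answer pair recurs `threshold` times."""

    def __init__(self, threshold=3, max_rounds=20):
        self.threshold = threshold
        self.max_rounds = max_rounds
        self.rounds = 0
        self.seen = Counter()

    def should_break(self, question, answer):
        # Assumption: round 21 is the first one rejected by the 20-round limit.
        self.rounds += 1
        key = (question.strip().lower(), answer.strip().lower())
        self.seen[key] += 1
        return self.rounds > self.max_rounds or self.seen[key] >= self.threshold
```

Injecting `get_task` and `sleep` keeps the polling loop testable; a caller would pass a function that fetches the remote task state and `time.sleep`.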
Two isolated RockBot instances (Alice and Bob), each with their own RabbitMQ, communicating via HTTP A2A gateways. Enables integration testing of the multi-turn InputRequired flow from #265.
Setup:
- rabbitmq-alice/bob: separate message bus instances
- agent-alice/bob: RockBot agents with per-instance seed data
- gateway-alice/bob: HTTP A2A endpoints (ports 5201/5202)
- blazor-alice/bob: Blazor UIs (ports 8081/8082)
Each agent's well-known-agents.json points to the other's gateway with pre-configured API key auth. Trust stores pre-seed Act-level trust so agents can collaborate autonomously.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Without LLM credentials agents use EchoChatClient and can't reason about tools; they just echo input back. Updated docker-compose header with prerequisites and --env-file usage.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
New Act-level inbound skill that exercises the full InputRequired flow: returns InputRequired on first call (proposing meeting times), Completed on follow-up (confirming the selected time). Uses contextId and conversation memory for multi-turn state tracking.
- Register skill in agent card (Program.cs), gateway appsettings
- Update peer seed data: well-known-agents + trust stores include negotiate-meeting with Act-level approval
- Expand .env.example with Azure OpenAI option
Test from Alice's Blazor UI (http://localhost:8081): "Negotiate a meeting with Bob for tomorrow"
Expected flow:
  Alice → invoke_agent(Bob, negotiate-meeting, ...)
  Bob returns InputRequired: "Available at 10am, 2pm, or 4pm"
  Alice's LLM picks a time → sends follow-up with contextId
  Bob returns Completed: "Meeting confirmed"
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Use the existing agent naming mechanism (agent-name.md on data volume, hot-reloaded by AgentProfileLoader) to give each peer instance a distinct identity. The LLM was calling invoke_agent(agent_name=RockBot) (itself) instead of "Bob" because both agents shared the default name. With distinct display names, the Blazor UI also reflects the correct agent identity per instance.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
AgentDirectory.StartAsync returned early when known-agents.json didn't exist, skipping the well-known agent seeding loop below. On a fresh data volume (no prior directory file), well-known agents from config were never added to the directory, so list_known_agents returned only the agent's own self-announcement, not the configured peers. Move the early return into a scoped block so well-known seeding always runs regardless of whether the persisted file exists.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…pic)
The gateway was hardcoding "agent.task.RockBot" for RabbitMQ topic routing. Now uses GatewayOptions.RoutingName (from InternalAgentName config, falls back to AgentName). This separates the external agent card identity (Alice/Bob) from the internal routing name (RockBot) that must match the agent's WithIdentity() subscription topic.
In the peer docker-compose:
  Gateway__AgentName: Alice (external: what callers see)
  Gateway__InternalAgentName: RockBot (internal: matches agent sub)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
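The fallback described in this commit can be sketched as follows (Python, illustrative only). The option names follow the commit message; the function names are hypothetical, and the topic format `agent.task.{name}` is inferred from the previously hardcoded "agent.task.RockBot" value, so treat it as an assumption.

```python
def routing_name(agent_name, internal_agent_name=None):
    """Internal routing name: InternalAgentName if set, otherwise AgentName."""
    return internal_agent_name or agent_name


def task_topic(agent_name, internal_agent_name=None):
    """Topic the gateway publishes to; it must match the agent's subscription."""
    return f"agent.task.{routing_name(agent_name, internal_agent_name)}"
```

With the peer docker-compose values, `task_topic("Alice", "RockBot")` yields the same topic the gateway used before the change, while callers still see "Alice" on the agent card.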
Each gateway must accept the key that the OTHER agent sends:
- gateway-alice accepts bob-calls-alice (sent by Bob)
- gateway-bob accepts alice-calls-bob (sent by Alice)
The keys were backwards, causing 401 Unauthorized on every cross-agent HTTP A2A call.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The preview bubble published under the target agent's name used IsFinal=false, but no corresponding IsFinal=true from that agent name ever followed; the final synthesis comes under the primary agent's name. The Blazor UI tracks spinners per agent name, so the target agent's spinner never stopped. Mark the preview bubble IsFinal=true since it IS the target agent's final output for this task.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
All completed inbound A2A tasks now store a searchable outcome in
working memory under a2a-outcomes/{skill}/{contextId} with an 8-hour
TTL and category "a2a-outcome". This lets the agent recall recent
inter-agent interactions when asked.
- negotiate-meeting: stores full exchange transcript + confirmation
- notify-user: stores notification text + sender
- Observe-level tasks: stores request + LLM summary
Tagged with caller name and skill for SearchWorkingMemory discovery.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
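As a rough illustration of the outcome record this commit describes, here is a Python sketch. The key pattern, category, 8-hour TTL, and caller/skill tags come from the commit message; the `OutcomeEntry` type and `build_outcome` helper are hypothetical stand-ins, not the RockBot working-memory API. A later commit in this PR moves the key under a session prefix.

```python
from dataclasses import dataclass, field
from datetime import timedelta


@dataclass
class OutcomeEntry:
    """Hypothetical shape of a working-memory entry for an A2A outcome."""
    key: str
    value: str
    category: str = "a2a-outcome"
    ttl: timedelta = timedelta(hours=8)
    tags: list = field(default_factory=list)


def build_outcome(skill, context_id, caller, summary):
    """Build the entry stored when an inbound A2A task completes."""
    return OutcomeEntry(
        key=f"a2a-outcomes/{skill}/{context_id}",
        value=summary,
        tags=[caller, skill],  # lets a memory search find it by caller or skill
    )
```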
New section in docs/a2a.md covering:
- Skill registration (agent card, gateway, handler dispatch)
- Outcome persistence requirements (working memory, key pattern, category, TTL, tags)
- Multi-turn InputRequired pattern (contextId, conversation memory, turn storage, outcome on completion only)
- Trust and approval model
References HandleNegotiateMeetingAsync as the canonical example.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Two issues found during live testing:
1. A2A outcomes stored under global namespace (a2a-outcomes/) were
invisible to SearchWorkingMemory which defaults to session scope.
Move outcomes under session/{WellKnownSessions.Primary}/a2a-outcomes/
so the user's LLM finds them naturally.
2. The LLM sometimes calls invoke_agent with its own identity name
("RockBot") instead of the target agent ("Bob"), creating a
self-invocation loop. Add a guard that rejects self-invocation
with a helpful error pointing to list_known_agents.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
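Both fixes are small enough to sketch in Python. The key layout and the pointer to list_known_agents come from the commit message; the function names are hypothetical, the value of WellKnownSessions.Primary is not given in this PR so it is passed in as a parameter, and comparing against every name the agent answers to (case-insensitively) is an assumption.

```python
def session_outcome_key(primary_session, skill, context_id):
    """Session-scoped key, so session-scoped memory searches find the outcome."""
    return f"session/{primary_session}/a2a-outcomes/{skill}/{context_id}"


def check_invoke_target(own_names, target):
    """Return an error string for self-invocation, or None if the call may proceed.

    own_names holds every name this agent answers to (identity and display name).
    """
    if target.strip().lower() in {name.strip().lower() for name in own_names}:
        return (
            f"Cannot invoke '{target}': that is this agent. "
            "Call list_known_agents to find the intended target."
        )
    return None
```

Returning the error as tool output, rather than raising, gives the LLM a message it can act on in its next step.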
The negotiate-meeting skill now asks the caller (Alice) for purpose, duration, and time preference in a single InputRequired round. The notification to the receiving user (Bob) is purely informational: no questions, just confirmed details. Previously the skill only asked for a time, and Bob's Observe-level notification would generate LLM questions about purpose/duration that should have been directed at Alice during the negotiation.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
LLMs commonly paraphrase skill names: Alice used "schedule-meeting" instead of "negotiate-meeting", causing the request to fall to Observe level (read-only summary with questions) instead of the Act-level multi-turn handler.
- Add "schedule-meeting" as dispatch alias in RockBotTaskHandler
- Add to approved skills in both trust stores
- Note the alias in well-known-agents descriptions
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The Observe path returned a polite "your request has been received" which the caller's LLM interpreted as success, hallucinating that a meeting was confirmed when it was only queued for human review. Replace with an unambiguous "IMPORTANT: This request was NOT completed" message that explicitly states nothing was scheduled, confirmed, or executed. The caller's LLM should relay that the request is pending the other party's manual review.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The LLM was calling query-availability before negotiate-meeting (unprompted by the user) and paraphrasing skill IDs. Add explicit guidance:
- Call the skill that matches the request directly
- Don't call query-availability as a prerequisite
- Use exact skill IDs from list_known_agents, don't paraphrase
- One invoke_agent per user request unless asked otherwise
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Callers may paraphrase skill IDs (e.g. "schedule-meeting" instead of "negotiate-meeting"). Instead of hardcoded aliases, use BM25 ranking against skill IDs, names, descriptions, and known aliases.
- InboundSkillMatcher: exact ID → exact alias → BM25 fuzzy match
- RockBotTaskHandler: match requested skill before dispatch
- Logs matched skill for debugging
- Remove hardcoded schedule-meeting alias and trust store entries
- 15 new tests covering exact, alias, fuzzy, and no-match cases
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
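The three-stage lookup (exact ID, then exact alias, then BM25) might look roughly like the Python sketch below. It is not the InboundSkillMatcher source, which this PR conversation does not show: the class name, the skill dictionary shape, the tokenizer, the k1/b values, and the "any positive score wins" cutoff are all assumptions. The idf uses the common variant with +1 inside the logarithm, which keeps it positive.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase and split on anything that is not a letter or digit."""
    return re.findall(r"[a-z0-9]+", text.lower())


class SkillMatcher:
    """Resolve a requested skill: exact ID, then exact alias, then BM25."""

    def __init__(self, skills, k1=1.2, b=0.75):
        # skills: non-empty list of dicts with "id", "name", "description", "aliases".
        self.skills = skills
        self.k1, self.b = k1, b
        self.docs = []
        for s in skills:
            text = " ".join([s["id"], s["name"], s["description"], *s["aliases"]])
            self.docs.append(Counter(tokenize(text)))
        self.avg_len = sum(sum(d.values()) for d in self.docs) / len(self.docs)

    def _idf(self, term):
        n = sum(1 for d in self.docs if term in d)
        return math.log((len(self.docs) - n + 0.5) / (n + 0.5) + 1.0)

    def _score(self, terms, doc):
        length = sum(doc.values())
        score = 0.0
        for term in terms:
            tf = doc[term]  # Counter returns 0 for missing terms
            if tf == 0:
                continue
            norm = tf + self.k1 * (1 - self.b + self.b * length / self.avg_len)
            score += self._idf(term) * tf * (self.k1 + 1) / norm
        return score

    def match(self, requested):
        wanted = requested.strip().lower()
        for s in self.skills:
            if s["id"].lower() == wanted:
                return s["id"]
        for s in self.skills:
            if wanted in (a.lower() for a in s["aliases"]):
                return s["id"]
        terms = tokenize(requested)
        scores = [self._score(terms, d) for d in self.docs]
        best = max(range(len(scores)), key=scores.__getitem__)
        return self.skills[best]["id"] if scores[best] > 0 else None
```

A request for "schedule-meeting" shares the token "meeting" with a negotiate-meeting skill and so resolves to it, while a request with no overlapping tokens returns `None`. A production matcher would likely want a minimum score rather than accepting any positive overlap.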
All four A2A handlers (result, error, status, InputRequired) used
agent.Name (identity = "RockBot") for AgentReply.AgentName, causing
chat bubbles to show "RockBot" instead of the display name ("Alice").
Add AgentNameHolder to each handler and use DisplayName for all
user-facing fields (AgentReply.AgentName, conversation turns,
progress context). Envelope source field stays as agent.Name for
message routing.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The gateway always returned a Message response via EnqueueMessageAsync, losing the task state. When Bob's agent returned InputRequired, the caller saw Completed (because the SDK maps Message as Completed). Now returns a Task response via EnqueueTaskAsync for non-terminal states (InputRequired, Working, Submitted) so the caller's SDK preserves the state and the InputRequired multi-turn loop fires. Terminal states still return Message for backward compatibility.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
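The selection rule in this commit reduces to a small mapping, sketched here in Python. The three non-terminal states are the ones the commit names; the function name and the string return values are hypothetical stand-ins for the choice between EnqueueTaskAsync and EnqueueMessageAsync.

```python
# States for which the gateway must preserve task state for the caller.
NON_TERMINAL_STATES = frozenset({"InputRequired", "Working", "Submitted"})


def response_kind(state):
    """Which response shape the gateway returns for a given task state."""
    return "Task" if state in NON_TERMINAL_STATES else "Message"
```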
Two issues causing the InputRequired loop to never complete:
1. Gateway didn't pass contextId from SDK request to the RabbitMQ AgentTaskRequest, so Bob's handler never saw a contextId and treated every follow-up as a fresh conversation (completedRounds=0).
2. Gateway used the caller's contextId for the response instead of the agent's contextId. Bob generates a contextId from the taskId on the first call; the gateway now forwards that back so the caller uses it for follow-ups.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The result handler runs the LLM to synthesize a response after an A2A task completes. With invoke_agent in the tool set, the LLM would call it again, creating an infinite loop where each Completed result triggered a new agent call. Filter out A2A caller tools (invoke_agent, register/unregister_agent, list_known_agents, get_agent_details) from the result handler's ChatOptions. The result synthesis should only present the outcome to the user, not initiate new agent interactions.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
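A sketch of the filter in Python, assuming tools are identified by name. The blocked names are taken from the commit message, with "register/unregister_agent" expanded to two tool names as an assumption; the function name is hypothetical and the real change operates on the handler's ChatOptions.

```python
# Tools that start or manage agent-to-agent calls; excluded during synthesis.
A2A_CALLER_TOOLS = frozenset({
    "invoke_agent",
    "register_agent",
    "unregister_agent",
    "list_known_agents",
    "get_agent_details",
})


def tools_for_result_synthesis(tool_names):
    """Drop A2A caller tools, preserving the order of everything else."""
    return [name for name in tool_names if name not in A2A_CALLER_TOOLS]
```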
The invoke_agent tool response was too mild: the LLM kept iterating and called negotiate-meeting a second time before the first result arrived. Strengthen the response to explicitly say STOP and present a status to the user while waiting.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The LLM ignored the "STOP" instruction and called invoke_agent twice in the same loop iteration, creating duplicate negotiate-meeting round-trips. Now checks if the session already has a pending A2A task in the tracker and returns an error telling the LLM to wait.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
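The guard amounts to a per-session check before dispatch. The Python sketch below assumes at most one in-flight A2A task per session and uses a hypothetical `PendingTaskTracker`; the tracker in the RockBot codebase is not shown in this PR conversation and may track more than this.

```python
class PendingTaskTracker:
    """Tracks at most one in-flight A2A task per session (an assumption)."""

    def __init__(self):
        self._pending = {}  # session_id -> task_id

    def try_start(self, session_id, task_id):
        """Register the task, or return an error string if one is already pending."""
        current = self._pending.get(session_id)
        if current is not None:
            return (
                f"A2A task {current} is still pending for this session; "
                "wait for its result before calling invoke_agent again."
            )
        self._pending[session_id] = task_id
        return None

    def complete(self, session_id, task_id):
        """Clear the session's pending task once its result has arrived."""
        if self._pending.get(session_id) == task_id:
            del self._pending[session_id]
```

Enforcing the limit in code rather than in the prompt is the point of the commit: the earlier "STOP" wording depended on the LLM complying.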
Two sources of noise in the Blazor UI during A2A interactions:
1. The result handler published a preview bubble AND ran an LLM synthesis; both showed as separate messages, creating a double-confirmation. Remove the preview; the synthesis is the single user-facing message.
2. The InputRequired handler published a question bubble for each round. For autonomous follow-ups (Act-level trust), this is noise: the user only needs the final outcome. Remove the intermediate bubble.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Summary
Closes #265
- On Working/Submitted, poll GetTask with exponential backoff (2s → 30s cap) until terminal, forwarding intermediate status updates to the user
- RockBotTaskHandler uses contextId to maintain conversation history across multi-turn exchanges, enabling two RockBot instances to collaborate (e.g., negotiating a meeting time)
- OTel metrics (polling_attempts, input_required_rounds, input_required_breaks), activity spans, and cross-container correlation tags (task_id, context_id, correlation_id, session_id)
- docs/a2a.md with new section covering polling, InputRequired, trust model, loop protection, and observability
Key new files
- InputRequiredHandler: shared service used by both HTTP dispatch and queue result handler
- InputRequiredRepetitionDetector: modeled on RepetitiveToolCallDetector
Deferred to future issues
- SendStreamingMessageAsync
- SubscribeToTask as alternative to polling
Test plan
🤖 Generated with Claude Code