[DOCS] Document the Agent tool for local subagents #283

## Summary

The CLI shipped a new first-class `Agent` tool that lets the main agent (chat or headless) spawn one or
more local subagents, run their work in parallel, and fold the results back into the main context, without
standing up an A2A agent server. Each subagent is an `infer agent` subprocess with its own session, so it
is cheap to start, isolated from the parent, and session-persisted.

This is the lightweight, local complement to the existing A2A trio (`A2A_SubmitTask` / `A2A_QueryTask` /
`A2A_QueryAgent`), which targets external A2A servers.

- Implemented in CLI: inference-gateway/cli#653 (PR inference-gateway/cli#658).
- This ticket tracks the user-facing documentation for it.

## What needs documenting (user-facing)

### 1. The `Agent` tool

A config-gated tool (enabled by default) that fans out local subagents. Document:

- What it is and when to reach for it vs. the A2A tools (local short-lived helpers vs. external A2A servers).
- The tool parameters the model sees:
  - `tasks`: an array of subagent tasks run in parallel, each with:
    - `description` (required) - the task for the subagent
    - `label` (optional) - short label shown in progress output / tmux panes
    - `model` (optional) - per-subagent model override
    - `system_prompt` (optional) - gives that subagent a specialized role/persona
  - `description` (optional) - shorthand for a single-task call (alternative to `tasks`)
  - `system_prompt` (optional) - system prompt for the single-`description` form
- Each subagent runs in its own isolated session id of the form `subagent-<parentSession>-<uuid>`.
- Parallel fan-out is capped by `max_parallel` (default 4) concurrent subagents per call.
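
To make the parameter shapes concrete, here is a minimal sketch of the two call forms as plain Python dicts. The field names come from the list above; the task text, the labels, the placeholder model id, and the `subagent_session_id` helper are made up for illustration, and the UUID flavor is an assumption (the ticket only states the `subagent-<parentSession>-<uuid>` form).

```python
import json
import uuid

# Multi-task form: every entry in `tasks` becomes one subagent.
multi_task_call = {
    "tasks": [
        {
            "description": "Audit the README for broken links",  # required
            "label": "docs-audit",                               # optional
            "system_prompt": "You are a meticulous documentation reviewer.",
        },
        {
            "description": "List TODO comments under ./internal",
            "label": "todo-scan",
            "model": "example-provider/example-model",  # placeholder model id
        },
    ]
}

# Single-task shorthand: top-level `description` instead of `tasks`.
single_task_call = {
    "description": "Summarize the open changes in this repository",
    "system_prompt": "You are a concise release-notes writer.",
}


def subagent_session_id(parent_session: str) -> str:
    """Hypothetical helper: format an id as subagent-<parentSession>-<uuid>."""
    return f"subagent-{parent_session}-{uuid.uuid4()}"


if __name__ == "__main__":
    print(json.dumps(multi_task_call, indent=2))
    print(subagent_session_id("chat-1234"))
```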

### 2. Result modes (async and blocking wait-all)

- Async (`wait: false`): the call returns immediately with subagent ids; when each subagent finishes, its
  final result is injected back into the main conversation (mirrors `A2A_SubmitTask` notify behavior). In
  chat, running/completed status is surfaced in the sticky progress area.
- Wait-all (`wait: true`): the call blocks until every spawned subagent reaches a terminal state, then
  returns the aggregated results in one tool result. This is the default in the shipped config.
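
The difference between the two result modes can be sketched with a thread pool. This is an illustration only, not the CLI's implementation: `run_subagent` is a stand-in that echoes its task, the `sub-N` ids are invented, and the real tool spawns `infer agent` subprocesses rather than threads.

```python
from concurrent.futures import ThreadPoolExecutor

MAX_PARALLEL = 4  # mirrors the documented max_parallel default


def run_subagent(description: str) -> str:
    """Stand-in for a real subagent run; it only echoes the task."""
    return f"done: {description}"


def spawn_wait_all(descriptions):
    """Wait-all: block until every subagent finishes, return results in task order."""
    with ThreadPoolExecutor(max_workers=MAX_PARALLEL) as pool:
        return list(pool.map(run_subagent, descriptions))


def spawn_async(descriptions, on_result):
    """Async: return ids right away; on_result fires as each subagent finishes."""
    pool = ThreadPoolExecutor(max_workers=MAX_PARALLEL)
    ids = []
    for index, description in enumerate(descriptions):
        sub_id = f"sub-{index}"  # invented id format for this sketch
        future = pool.submit(run_subagent, description)
        # Bind sub_id as a default argument so each callback keeps its own id.
        future.add_done_callback(
            lambda done, sub_id=sub_id: on_result(sub_id, done.result())
        )
        ids.append(sub_id)
    pool.shutdown(wait=False)  # queued work still runs; the caller is not blocked
    return ids, pool


if __name__ == "__main__":
    print(spawn_wait_all(["lint", "test"]))
    ids, pool = spawn_async(["lint", "test"], lambda i, r: print(i, r))
    print("spawned:", ids)
    pool.shutdown(wait=True)  # only so the demo exits after the callbacks ran
```

In both forms, at most `MAX_PARALLEL` tasks run at once; additional tasks queue until a worker is free.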

### 3. Execution surfaces: headless vs. interactive (tmux)

- `headless`: subagents run in the background; results aggregate back into the main context.
- `interactive`: each subagent runs in a live tmux pane/window you can watch while it works; the result
  still aggregates back exactly as in headless mode (interactive is "headless plus a tmux pane attached to
  the live process").
- tmux is an optional runtime dependency, required only for interactive mode (not for headless). Document
  that interactive mode must be run from inside tmux (`$TMUX` set).
- Graceful degradation when not inside tmux (or tmux not installed) is configurable: `fallback: headless`
  (warn and run headless) or `fallback: error`.
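
The fallback rule above can be written down as a small decision function. This is a sketch under stated assumptions: `resolve_mode` and `detect_tmux` are hypothetical names, and checking `$TMUX` plus a `tmux` binary on `PATH` is one plausible way to detect the environment, not necessarily what the CLI does.

```python
import os
import shutil
import warnings


def detect_tmux():
    """Return (inside_tmux, tmux_installed) for the current process."""
    return bool(os.environ.get("TMUX")), shutil.which("tmux") is not None


def resolve_mode(mode: str, fallback: str, inside_tmux: bool, tmux_installed: bool) -> str:
    """Pick the effective execution surface for a subagent call."""
    if mode not in ("headless", "interactive"):
        raise ValueError(f"unknown mode: {mode!r}")
    if mode == "headless":
        return "headless"  # tmux is never needed for headless runs
    if inside_tmux and tmux_installed:
        return "interactive"
    if fallback == "headless":
        warnings.warn("interactive mode needs tmux; falling back to headless")
        return "headless"
    raise RuntimeError("interactive mode requires running inside tmux")


if __name__ == "__main__":
    inside, installed = detect_tmux()
    print(resolve_mode("interactive", "headless", inside, installed))
```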

### 4. Configuration block: `tools.agent.*`

Document the new config block with the shipped defaults (note: the default `mode` shipped as
`interactive`):

    tools:
      agent:
        enabled: true
        require_approval: true     # spawning work that can edit files is a mutating action
        mode: interactive          # headless | interactive (default when a call omits it)
        wait: true                 # block and return aggregated results by default
        max_parallel: 4            # cap on concurrent subagents per call
        max_depth: 1               # recursion guard; a subagent is itself an `infer agent`
        model: ""                  # default subagent model (inherits parent if blank)
        interactive:
          multiplexer: tmux        # tmux only
          layout: vertical         # vertical | horizontal | window
          fallback: headless       # headless | error (when not inside tmux)

Also document the corresponding env-var overrides (`INFER_TOOLS_AGENT_*`) consistent with the rest of the
config, and that this block is regenerated by `infer init`.
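
For the env-var section of the docs, a naming sketch may help. It assumes the usual convention of upper-casing the config path and joining segments with underscores under an `INFER` prefix; only the `INFER_TOOLS_AGENT_*` prefix is confirmed by this ticket, so the exact variable names (especially for the nested `interactive` keys) should be checked against the CLI before publishing. `env_var_for` is a hypothetical helper.

```python
def env_var_for(key_path: str, prefix: str = "INFER") -> str:
    """Map a dotted config path to an env-var name (assumed convention)."""
    return prefix + "_" + key_path.replace(".", "_").upper()


if __name__ == "__main__":
    for key in (
        "tools.agent.enabled",
        "tools.agent.mode",
        "tools.agent.max_parallel",
        "tools.agent.interactive.fallback",
    ):
        print(f"{key} -> {env_var_for(key)}")
```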

### 5. New flag: `infer agent --result-file <path>`

`infer agent` gained a `--result-file` flag that atomically writes the final assistant message and the
run outcome as JSON to the given path on exit. It is used by the Agent tool to harvest the result of a
detached (tmux pane) subagent. Worth a short mention in the `infer agent` command reference.
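
"Atomically" matters here because the parent may poll for the file while the subagent is still exiting. A common way to get that guarantee is to write to a temporary file in the same directory and rename it into place. The sketch below shows that pattern; it is not the CLI's code, and the JSON field names (`final_message`, `outcome`) are placeholders, since the ticket does not give the actual schema.

```python
import json
import os
import tempfile


def write_result_atomically(path: str, final_message: str, outcome: str) -> None:
    """Write the result JSON so readers never observe a half-written file."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temp file must live on the same filesystem for the rename to be atomic.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".result-", suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            json.dump({"final_message": final_message, "outcome": outcome}, handle)
            handle.flush()
            os.fsync(handle.fileno())
        os.replace(tmp_path, path)  # atomic rename over the destination
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise


def read_result(path: str):
    """Return the parsed result, or None if the subagent has not written it yet."""
    try:
        with open(path, encoding="utf-8") as handle:
            return json.load(handle)
    except FileNotFoundError:
        return None
```

With this pattern the reader sees either no file or a complete one, which is what lets a parent harvest the result of a detached pane without extra locking.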

### 6. Approval and security behavior

- Subagents run in standard bash mode (the restricted allow-list), exactly like every other headless run.
- The `Agent` tool is in the approval policy and requires approval by default (`require_approval: true`),
  with a per-tool override - consistent with `A2A_SubmitTask`.
- Subagents honor the existing approval-delivery resolution (`prompt` / `ipc` / `block`), so they remain
  secure by default: an off-list or mutating action is blocked in CI/heartbeat runs (no approver reachable)
  and is routed for IPC approval when running under a channel (e.g. Telegram).
- A depth guard (`max_depth`, default 1) prevents subagent fork-bombs: a subagent cannot itself spawn
  further subagents at the default cap.
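
The depth guard reduces to a single comparison. The sketch below assumes the main agent runs at depth 0 and each subagent at its parent's depth plus one; how the CLI actually tracks depth is not stated in this ticket, and `can_spawn` is a hypothetical name.

```python
def can_spawn(current_depth: int, max_depth: int = 1) -> bool:
    """A process may spawn subagents only while it is below the depth cap."""
    return current_depth < max_depth


if __name__ == "__main__":
    print(can_spawn(0))  # main agent at the default cap
    print(can_spawn(1))  # a subagent at the default cap
```

Under these assumptions the main agent may spawn subagents at the default cap, while a subagent may not.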

### 7. Note on the dangling "Task tool" references

`prompts.yaml` previously referenced a non-existent "Task tool"; those references now point at the real
`Agent` tool. If the docs mirror any of those prompt descriptions, they should be updated to "Agent" too.

## Suggested affected docs pages

- The CLI tools reference (the page that documents the built-in tools alongside the A2A tools).
- The CLI configuration reference (add the `tools.agent.*` block and its env-var overrides).
- The `infer agent` command reference (the `--result-file` flag).
- Any "delegation / multi-agent" or A2A overview page, to contrast local subagents with A2A servers and to
  mention tmux as an optional dependency for interactive mode.

## Out of scope (matches the v1 implementation)

- Nested subagents (depth capped at 1 for v1).
- Routing a subagent's tool-approval prompt back to the main chat TUI.
- Non-tmux multiplexers (screen, zellij) and a `/agent` chat shortcut.

## References

- CLI feature issue: inference-gateway/cli#653
- CLI implementation PR: inference-gateway/cli#658

