Skip to content

Cache terminal risk assessment by command only#315663

Merged
chrmarti merged 1 commit into
mainfrom
chrmarti/risk-badge-cache-key
May 11, 2026
Merged

Cache terminal risk assessment by command only#315663
chrmarti merged 1 commit into
mainfrom
chrmarti/risk-badge-cache-key

Conversation

@chrmarti
Copy link
Copy Markdown
Collaborator

@chrmarti chrmarti commented May 11, 2026

#315616

Summary

Fixes the terminal confirmation badge showing Assessing risk… with a spinner every time, instead of using the cached assessment, when re-running the same command outside the sandbox.

The risk assessment cache key was tool.id + stableStringify(parameters). For run_in_terminal the parameter set includes model-generated explanation, goal, and requestUnsandboxedExecutionReason fields, which differ on every invocation even when the underlying command is identical — so the cache never hit on the second run.

This narrows the cache key for run_in_terminal to just `{ command }`. Other tools are unaffected and still key on their full parameter set.

Repro

  1. Set `chat.tools.riskAssessment.enabled: true` and `chat.agent.sandbox.enabled: true`
  2. In agent mode, run a command that requires unsandboxed access (e.g. `curl https://google.com\`)
  3. Trigger the same command again — badge now renders instantly from cache.
Session Context

Key decisions from the development session:

  • Cache key scope: Only narrowed for `run_in_terminal` (via the `TerminalToolId.RunInTerminal` id). The previous full-parameter key is preserved for all other tools, since their parameters are typically the deterministic inputs of the call.
  • Normalized fields: Only `command` is used. Fields like `explanation`, `goal`, `requestUnsandboxedExecutionReason`, `mode`, `timeout`, and `requestUnsandboxedExecution` are excluded because they're either model-generated descriptive text or execution-mode controls that don't change the risk semantics of the command itself.
  • Shared helper: `getCached` and `assess` both go through a single `_cacheKey` helper so the lookup and insertion keys can't drift.

The cache key for run_in_terminal previously included the model-generated
explanation, goal, and unsandboxed-execution reason fields, which differ
on every invocation even for the same command. As a result, repeating the
same command showed the 'Assessing risk...' spinner instead of using the
cached assessment.

Narrow the cache key for run_in_terminal to just the command string.
Copilot AI review requested due to automatic review settings May 11, 2026 13:01
@chrmarti chrmarti marked this pull request as ready for review May 11, 2026 13:02
@chrmarti chrmarti enabled auto-merge (rebase) May 11, 2026 13:02
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR fixes risk assessment cache misses for terminal tool invocations by narrowing the cache key used for run_in_terminal from “tool id + full parameter set” to “tool id + normalized parameters (command only)”. This prevents repeated terminal confirmations from showing “Assessing risk…” on every re-run when only model-generated descriptive fields changed.

Changes:

  • Introduced a shared _cacheKey(...) helper so getCached(...) and assess(...) use identical cache key logic.
  • Added parameter normalization for TerminalToolId.RunInTerminal so only { command } contributes to the cache key.
  • Left all other tools’ cache key behavior unchanged (still based on full parameter set via stableStringify).
Show a summary per file
File Description
src/vs/workbench/contrib/chat/browser/tools/chatToolRiskAssessmentService.ts Normalizes run_in_terminal cache keys to be command-based and centralizes cache key computation to restore cache hits on repeated runs.

Copilot's findings

  • Files reviewed: 1/1 changed files
  • Comments generated: 0

@chrmarti chrmarti merged commit 3f0be37 into main May 11, 2026
29 checks passed
@chrmarti chrmarti deleted the chrmarti/risk-badge-cache-key branch May 11, 2026 13:50
@vs-code-engineering vs-code-engineering Bot added this to the 1.121.0 milestone May 11, 2026
@chrmarti chrmarti added the ~release-cherry-pick Trigger: cherry-pick this PR to the latest release branch label May 11, 2026
@vs-code-engineering vs-code-engineering Bot added release-cherry-pick Automated cherry-pick between release and main branches and removed ~release-cherry-pick Trigger: cherry-pick this PR to the latest release branch labels May 11, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

release-cherry-pick Automated cherry-pick between release and main branches

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants