Skip to content

fix: isolate mirror tests from host env to prevent API key leak (#522)#573

Merged
bingran-you merged 1 commit into
mainfrom
fix/522-test-env-leak
Jun 3, 2026
Merged

fix: isolate mirror tests from host env to prevent API key leak (#522)#573
bingran-you merged 1 commit into
mainfrom
fix/522-test-env-leak

Conversation

@ElegantLin
Copy link
Copy Markdown
Contributor

@ElegantLin ElegantLin commented May 27, 2026

Summary

  • TestAutoInheritEnvBidirectionalMirror called auto_inherit_env() without clearing os.environ, so real API keys (e.g. GEMINI_API_KEY) on the host leaked into the test dict
  • Assertion failures then printed the real key value in pytest output — a secret exposure risk
  • Added an autouse fixture using the existing _clear_keys() helper, matching the isolation pattern already used by TestResolveAgentEnvGemini in the same file

Test plan

  • All 7 tests in test_agent_env_resolution.py pass (including the previously failing test_gemini_only_mirrors_to_google)
  • Verified the fix works with real GEMINI_API_KEY / GOOGLE_API_KEY exported in the environment
  • ruff clean

Closes #522

🤖 Generated with Claude Code


Open in Devin Review

Copy link
Copy Markdown
Contributor

@devin-ai-integration devin-ai-integration Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

✅ Devin Review: No Issues Found

Devin Review analyzed this PR and found no potential bugs to report.

View in Devin Review to see 1 additional finding.

Open in Devin Review

Copy link
Copy Markdown
Collaborator

@bingran-you bingran-you left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thermo-nuclear review verdict: request changes.

The branch fixes one mirror-test class, but the same host-env leak still exists in sibling tests in tests/test_resolve_env_helpers.py.

I reproduced both leaks locally:

GOOGLE_API_KEY=host-dummy-secret BENCHFLOW_DOTENV_PATH=/tmp/no-such-benchflow-env uv run python -m pytest tests/test_resolve_env_helpers.py::TestAutoInheritEnv::test_gemini_mirrored_to_google -q --tb=short
AZURE_RESOURCE=host-dummy-resource BENCHFLOW_DOTENV_PATH=/tmp/no-such-benchflow-env uv run python -m pytest tests/test_resolve_env_helpers.py::TestAutoInheritEnv::test_azure_endpoint_derives_resource -q --tb=short

Both failures print the host value in the assertion diff. The narrow class-level cleanup means future mirror/derive tests can still accidentally inherit real host credentials.

Recommended fix: pure mirror/derive tests should call auto_inherit_env(env, source_env={}); inheritance-specific tests should pass an explicit source_env={...} instead of depending on or partially clearing the real process env.

Verification notes:

  • Branch-head checks passed: uv sync, full pytest, ty, ruff.
  • Current-merge pytest and ty passed.
  • Current-merge ruff failed only on existing main baseline UP041 in src/benchflow/sandbox/docker.py:408, not this PR.
  • I also attempted real ACP E2E; provider/auth setup prevented a healthy full run for this PR, so I am not treating E2E as passed here.

@ElegantLin ElegantLin requested a review from bingran-you June 1, 2026 00:36
Rebased onto main (picks up #587's CLAUDE_OAUTH/KIMI additions). Pure
mirror/derive tests call auto_inherit_env(env, source_env={}); inheritance
tests pass an explicit source_env={...} instead of monkeypatch-ing the real
process env. #587's new test_claude_oauth_alias test is made hermetic too.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@ElegantLin ElegantLin force-pushed the fix/522-test-env-leak branch from 775347b to e9f7082 Compare June 1, 2026 01:17
@bingran-you
Copy link
Copy Markdown
Collaborator

Thermo-nuclear code quality review

What this PR is trying to solve

This PR closes #522 by making auto_inherit_env tests hermetic. Several tests called auto_inherit_env(env) directly, which falls back to os.environ; if a developer or CI host has real credentials like GEMINI_API_KEY, GOOGLE_API_KEY, or related provider vars exported, those values can enter the test dict and then appear in pytest assertion output.

The implementation makes the unit tests explicit about their input source: inheritance tests now pass source_env={...}, and pure mirror/derive tests pass source_env={}. That keeps the tests focused on the contract under test while cutting off accidental host-env reads.

Findings

No blocking findings.

The previous host-env leakage pattern appears fixed in the touched test surfaces:

  • tests/test_agent_env_resolution.py:57-75 now isolates Gemini/Google mirror tests with source_env={}.
  • tests/test_resolve_env_helpers.py:50-173 now separates inheritance tests from mirror/derive tests by passing explicit source_env mappings.
  • This is the right abstraction boundary for these unit tests: they exercise auto_inherit_env deterministically without reaching into process-global state.

I do not see a maintainability regression. The PR does not add production branching, new helper layers, type looseness, abstraction leakage, or file-size risk. tests/test_resolve_env_helpers.py is still large, but this PR reduces net lines and does not push it over the 1k threshold.

Residual risk / follow-up

The PR body says it added an autouse fixture using _clear_keys(), but the actual implementation uses explicit source_env arguments instead. I think the code is cleaner than the description, but the PR body should be updated if the description is meant to be exact.

This is test-only and does not touch SkillsBench, trajectories, verifiers, skill loading, Daytona, or Docker behavior. I do not think a Daytona/Docker canary is required for this merge; the relevant risk is unit-test hermeticity.

@bingran-you bingran-you merged commit 183fd73 into main Jun 3, 2026
3 checks passed
@bingran-you bingran-you deleted the fix/522-test-env-leak branch June 3, 2026 00:10
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

test_google_only_mirrors_to_gemini leaks GEMINI_API_KEY in assertion output and breaks §15 regression flow

2 participants