Skip to content

Wire OBO dispatch arms and reconciler branch#5345

Open
tgrunnagle wants to merge 4 commits into
mainfrom
oss-obo-3_issue_5328
Open

Wire OBO dispatch arms and reconciler branch#5345
tgrunnagle wants to merge 4 commits into
mainfrom
oss-obo-3_issue_5328

Conversation

@tgrunnagle
Copy link
Copy Markdown
Contributor

@tgrunnagle tgrunnagle commented May 20, 2026

Summary

Wires the new obo external auth type into the three operator dispatch sites and the MCPExternalAuthConfig reconciler, then proves the wiring with unit-level tests against the registered default handler.

  • Why: Before this change, the obo arms added in Add default OBO handler hooks and vMCP/proxy converter stubs #5327 existed only to satisfy the exhaustive linter — they returned nil (silent no-op) or a stub error that did not flow through the reconciler's status path. As a result, applying an obo-typed MCPExternalAuthConfig either disappeared silently or surfaced a generic unsupported external auth type / unknown middleware type error in a consumer reconciler downstream. This PR routes obo through the registered handler at every dispatch site so the upstream-only default handler can return obo.ErrEnterpriseRequired and the reconciler can translate that sentinel into a uniform status.conditions[Valid] = False / Reason: EnterpriseRequired outcome — the contract external consumers and downstream tooling pattern-match on.
  • What:
    • AddExternalAuthConfigOptions (cmd/thv-operator/pkg/controllerutil/tokenexchange.go): replaced the placeholder obo arm with a call into the existing OBOApplyRunConfig wrapper.
    • (*VirtualMCPServerReconciler).getExternalAuthConfigSecretEnvVar (cmd/thv-operator/controllers/virtualmcpserver_deployment.go): the obo arm now dispatches through controllerutil.OBOSecretEnvVars via a small extracted helper (firstOBOSecretEnvVar) so the function stays under the cyclomatic-complexity lint threshold.
    • MCPExternalAuthConfigReconciler.Reconcile (cmd/thv-operator/controllers/mcpexternalauthconfig_controller.go): added an ExternalAuthTypeOBO branch that calls controllerutil.OBOValidate, maps obo.ErrEnterpriseRequired to Reason: EnterpriseRequired (and any other validation error to Reason: InvalidConfig), and writes the condition through a new setInvalid helper that uses MutateAndPatchStatus per .claude/rules/k8s-controller.md.
    • Added exported reason constants ConditionReasonEnterpriseRequired and ConditionReasonInvalidConfig in mcpexternalauthconfig_types.go so the user-facing literals live in one place.
    • Added unit-level test coverage for each of the three dispatch sites against the registered default handler, plus reconciler-level coverage for the new setInvalid path (including the wrapped-sentinel case to confirm errors.Is matching).

Closes #5328

Type of change

  • Bug fix
  • New feature
  • Refactoring (no behavior change)
  • Dependency update
  • Documentation
  • Other (describe):

Test plan

  • Unit tests (task test)
  • E2E tests (task test-e2e)
  • Linting (task lint-fix)
  • Manual testing (describe below)

New tests cover each new code path:

  • TestAddExternalAuthConfigOptions_OBO — proves the obo arm of AddExternalAuthConfigOptions returns an error wrapping obo.ErrEnterpriseRequired and never reaches the default: arm's unsupported external auth type formatter. Also asserts neither "unsupported external auth type" nor "unknown middleware type" appears in the error message.
  • TestGetExternalAuthConfigSecretEnvVar_OBO — proves the obo arm of the vMCP deployment's secret-env-var switch dispatches through the registered handler instead of silently returning nil, nil. Asserts both generic-error guard substrings as well.
  • TestMCPExternalAuthConfigReconciler_OBO_DefaultHandler_SetsEnterpriseRequired — drives the reconciler with a fake client against an obo-typed MCPExternalAuthConfig, asserts Valid=False / Reason=EnterpriseRequired lands on the resource, asserts the message equals obo.ErrEnterpriseRequired.Error(), and asserts neither generic substring leaks into the message.
  • TestMCPExternalAuthConfigReconciler_OBO_ClearsStaleIdentitySynthesizedregression guard: when a user switches an existing MCPExternalAuthConfig from embeddedAuthServer (which set IdentitySynthesized=True) to obo, the stale IdentitySynthesized condition must be removed alongside the new Valid=False/EnterpriseRequired write. Defends against the MutateAndPatchStatus snapshot/diff trap where the pre-mutate snapshot already contained the in-memory applyIdentitySynthesizedCondition write — which would silently disappear from the merge patch.
  • TestMCPExternalAuthConfigReconciler_OBO_ReasonSelectionInReconcile — table-driven test that drives the production reason-selection branch through Reconcile by registering a stub OBO handler whose Validate returns each test's error. Covers (a) bare sentinel → EnterpriseRequired, (b) wrapped sentinel through fmt.Errorf("...: %w", ...)EnterpriseRequired (proves the errors.Is matching that lets the vMCP pkg/vmcp/auth/converters/interface.go wrap survive), (c) any other error → InvalidConfig.
  • TestDiscoverAndResolveAuth_OBO_SentinelSurvivesWrap (pkg/vmcp/auth/converters/) — proves errors.Is matches obo.ErrEnterpriseRequired through the vMCP "failed to convert to strategy: %w" wrap site at interface.go:191.

API Compatibility

  • This PR does not break the v1beta1 API, OR the api-break-allowed label is applied and the migration guidance is described above.

Only additive changes to the v1beta1 API: two new exported ConditionReasonXxx string constants in mcpexternalauthconfig_types.go. No CRD schema change in this PR (the CRD enum still rejects "obo" at the apiserver level until #5329 lands).

Does this introduce a user-facing change?

Yes. Applying an obo-typed MCPExternalAuthConfig to a cluster running an upstream-only build now produces a deterministic, machine-readable status:

status.conditions[Valid].status  = False
status.conditions[Valid].reason  = "EnterpriseRequired"
status.conditions[Valid].message = "on-behalf-of (OBO) external auth type requires an enterprise build"

External consumers and downstream tooling can pattern-match on the literal EnterpriseRequired reason string — it is part of the user-facing contract (codified by the new ConditionReasonEnterpriseRequired constant). The message text comes from obo.ErrEnterpriseRequired.Error() at pkg/auth/obo/errors.go. Out-of-tree builds that register a real handler via controllerutil.RegisterOBOHandler bypass this path entirely.

(Note: the CRD enum currently rejects "obo" at the apiserver layer, so this status surface is only reachable today through paths that bypass apiserver enum validation — see "Special notes for reviewers" below. The CRD admission change is the next OSS task, #5329.)

Implementation plan

Approved implementation plan

The implementation followed the recommended approach in issue #5328 verbatim — the issue body already contains the full implementation plan (code pointers, exact switch-arm placements, the setInvalid helper sketch, the reason-string contract, and the integration-test strategy). No separate plan document was generated for this task; the issue is the design doc.

One bounded deviation from the issue's integration-test plan: the issue offered three options for the CRD-enum bypass and recommended picking whichever fit the existing envtest harness best. We chose option 3 — drive the reconciler directly at the Go level with a fake client, because the CRD enum does not admit "obo" until #5329 lands and direct-reconciler invocation kept the test surface contained to this PR. The integration tests under cmd/thv-operator/test-integration/mcp-external-auth/ are unchanged; the per-site coverage lives in standard *_test.go files instead.

Special notes for reviewers

  • Stacked PR — target main, but review against the post-Add obo external auth type and function-pointer override hooks #5325/Add default OBO handler hooks and vMCP/proxy converter stubs #5327 tree. This is task 3 of 4 in the OSS-OBO stack. The branch contains commits from Add obo external auth type and function-pointer override hooks #5325 (operator/proxyrunner Run factor) and Add default OBO handler hooks and vMCP/proxy converter stubs #5327 (default OBO handler hooks + vMCP/proxy converter stubs) because of the local stack workflow. Per .claude/rules/pr-creation.md, this PR should be reviewed and merged after the two upstream PRs land on main; once they do, the diff against main will reduce to just the Wire OBO dispatch arms and reconciler branch; add integration tests #5328-scoped changes (the three commits prefixed Wire OBO dispatch arms, Address code review feedback on OBO dispatch wiring, Consolidate OBO test coverage). A scoped diff at git diff 30953187..HEAD previews exactly that surface (~467 lines, 8 files).
  • CRD-enum sequencing — chose option 3 from the issue. The CRD enum does not admit "obo" until Admit obo in MCPExternalAuthConfig CRD enum and regenerate manifests #5329 lands, so apiserver-routed integration tests would require either an in-test CRD mutation or relying on envtest's looser admission. We picked the issue's option 3 — drive the reconciler directly at the Go level with a fake client — because it kept the test surface contained to this PR and avoided coupling test infrastructure to the next task. All three dispatch sites have direct unit-level coverage as a result; the existing integration suite under cmd/thv-operator/test-integration/mcp-external-auth/ is unchanged.
  • Consumer-CR Status Condition Parity gap — tracked in Mirror MCPExternalAuthConfig.Valid=False/EnterpriseRequired to consumer CRs #5347. Issue Wire OBO dispatch arms and reconciler branch; add integration tests #5328 AC4 calls for the consumer CRDs (MCPServer / MCPRemoteProxy / VirtualMCPServer) to surface a parallel Valid=False / EnterpriseRequired condition. The codebase does not mirror MCPExternalAuthConfig conditions onto consumer CR statuses today — there is no existing condition-mirror machinery to extend. This PR satisfies the AC via the dispatch error path (each consumer reconciler now receives obo.ErrEnterpriseRequired rather than the generic "unsupported" error and surfaces it through its existing failure path), but the literal condition-mirror is deferred to Mirror MCPExternalAuthConfig.Valid=False/EnterpriseRequired to consumer CRs #5347 so it can land before Admit obo in MCPExternalAuthConfig CRD enum and regenerate manifests #5329 makes obo reachable in production. Per .claude/rules/operator.md's Status Condition Parity rule, the mirror is genuine parity work and must merge before Admit obo in MCPExternalAuthConfig CRD enum and regenerate manifests #5329.
  • Pre-existing r.Status().Update call sites are intentionally untouched. The reconciler still calls r.Status().Update(ctx, externalAuthConfig) at the ValidationFailed return and the conditionChanged write. The issue explicitly lists refactoring those to MutateAndPatchStatus as out of scope; the new setInvalid helper uses MutateAndPatchStatus as required for new status writes.
  • setInvalid re-fetches before patching. This is deliberate. MutateAndPatchStatus snapshots the object at the start of the call and diffs the post-mutate state against that snapshot; any mutation already present on cfg.Status.Conditions (notably the in-memory applyIdentitySynthesizedCondition write earlier in Reconcile) would land in both halves of the diff and silently disappear from the merge patch. Re-fetching and re-applying the advisory inside the patch closure folds both the advisory transition and the Valid=False write into the same patch. applyIdentitySynthesizedCondition is idempotent on the same spec, so the double computation is benign. The TestMCPExternalAuthConfigReconciler_OBO_ClearsStaleIdentitySynthesized regression test guards this invariant — temporarily remove the r.applyIdentitySynthesizedCondition(c) call inside setInvalid's closure and the test should fail.

@github-actions github-actions Bot added the size/XL Extra large PR: 1000+ lines changed label May 20, 2026
Copy link
Copy Markdown
Contributor

@github-actions github-actions Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Large PR Detected

This PR exceeds 1000 lines of changes and requires justification before it can be reviewed.

How to unblock this PR:

Add a section to your PR description with the following format:

## Large PR Justification

[Explain why this PR must be large, such as:]
- Generated code that cannot be split
- Large refactoring that must be atomic
- Multiple related changes that would break if separated
- Migration or data transformation

Alternative:

Consider splitting this PR into smaller, focused changes (< 1000 lines each) for easier review and reduced risk.

See our Contributing Guidelines for more details.


This review will be automatically dismissed once you add the justification section.

Implements changes for issue #5328:
- Replace the placeholder OBO arms in AddExternalAuthConfigOptions and
  getExternalAuthConfigSecretEnvVar with calls into the registered OBO
  handler (OBOApplyRunConfig / OBOSecretEnvVars). In upstream-only
  builds the default handler returns obo.ErrEnterpriseRequired, so the
  dispatch never falls through to the default arm's generic
  "unsupported external auth type" error.
- Add an OBO branch to MCPExternalAuthConfigReconciler.Reconcile that
  calls controllerutil.OBOValidate after the existing Validate()
  succeeds. errors.Is(err, obo.ErrEnterpriseRequired) maps to
  Reason=EnterpriseRequired; any other error maps to
  Reason=InvalidConfig. The branch writes status via
  controllerutil.MutateAndPatchStatus per the operator status-write
  convention; the existing r.Status().Update call sites are untouched
  (refactor is out of scope for this task).
- Cover each dispatch arm with a fake-client unit test that asserts
  errors.Is matches obo.ErrEnterpriseRequired and that no
  "unsupported external auth type" string leaks. A table-driven
  setInvalid test verifies the reason-string selection handles the
  sentinel directly, the sentinel through a fmt.Errorf wrap (the
  vMCP converter path), and any other error.
Fixed issues from code review:
- CRITICAL: setInvalid now re-fetches the object via r.Get and re-applies
  applyIdentitySynthesizedCondition inside the MutateAndPatchStatus
  closure. Previously the helper diffed against an already-mutated
  snapshot, so a user switching a config from embeddedAuthServer to obo
  would leave a stale IdentitySynthesized=True condition lingering on
  the server. Regression test
  TestMCPExternalAuthConfigReconciler_OBO_ClearsStaleIdentitySynthesized
  constructs that scenario and asserts the stale condition is removed
  alongside the new Valid=False/EnterpriseRequired write.
- CRITICAL: Added integration tests under
  cmd/thv-operator/test-integration/mcp-external-auth/ for the three
  consumer-CR paths (MCPServer, MCPRemoteProxy, VirtualMCPServer) plus
  a shared generic-error guard. The VirtualMCPServer test exercises
  the vMCP DiscoverAndResolveAuth path at
  pkg/vmcp/auth/converters/interface.go:191 and asserts errors.Is
  matches the sentinel through the "failed to convert to strategy: %w"
  wrap, proving the recognition is wrap-safe and not string-prefixed.
- MEDIUM: setInvalid's doc comment now describes the re-fetch behavior
  and why it is required (the snapshot-vs-mutation drop trap).
Fixed issues from second-round code review:
- CRITICAL: removed the misnamed fake-client tests from
  test-integration/mcp-external-auth/ (envtest directory by convention).
  The unique vMCP wrap-site assertion moved to pkg/vmcp/auth/converters/
  as TestDiscoverAndResolveAuth_OBO_SentinelSurvivesWrap — it lives next
  to the "failed to convert to strategy: %w" wrap site it guards. The
  dispatch arms remain individually covered by the existing unit tests.
- MEDIUM: setInvalid now treats errors.IsNotFound on its re-fetch as a
  clean no-op exit so a deletion race between the initial Get and the
  re-fetch does not produce spurious "Reconciler error" log noise.
- MEDIUM: added ConditionReasonEnterpriseRequired /
  ConditionReasonInvalidConfig constants in v1beta1, documented as
  user-facing contract. The controller and tests now reference the
  named constants instead of bare string literals.
- MEDIUM: documented the double-call between applyIdentitySynthesized
  Condition at line 95 and setInvalid's patch closure — the in-memory
  mutation is load-bearing only on the validation-failure and Valid=
  True paths; setInvalid re-applies the advisory inside its own patch
  because MutateAndPatchStatus would otherwise drop it.
@tgrunnagle tgrunnagle force-pushed the oss-obo-3_issue_5328 branch from 1e42003 to 0326144 Compare May 20, 2026 15:46
@github-actions github-actions Bot added size/M Medium PR: 300-599 lines changed and removed size/XL Extra large PR: 1000+ lines changed labels May 20, 2026
@github-actions github-actions Bot dismissed their stale review May 20, 2026 15:47

PR size has been reduced below the XL threshold. Thank you for splitting this up!

@github-actions
Copy link
Copy Markdown
Contributor

✅ PR size has been reduced below the XL threshold. The size review has been dismissed and this PR can now proceed with normal review. Thank you for splitting this up!

@codecov
Copy link
Copy Markdown

codecov Bot commented May 20, 2026

Codecov Report

❌ Patch coverage is 77.41935% with 7 lines in your changes missing coverage. Please review.
✅ Project coverage is 68.41%. Comparing base (79c08bd) to head (edea978).
⚠️ Report is 1 commits behind head on main.

Files with missing lines Patch % Lines
...or/controllers/mcpexternalauthconfig_controller.go 75.00% 4 Missing and 2 partials ⚠️
...perator/controllers/virtualmcpserver_deployment.go 83.33% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #5345      +/-   ##
==========================================
+ Coverage   68.38%   68.41%   +0.02%     
==========================================
  Files         624      624              
  Lines       63432    63461      +29     
==========================================
+ Hits        43377    43415      +38     
+ Misses      16816    16808       -8     
+ Partials     3239     3238       -1     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Copy link
Copy Markdown
Contributor Author

@tgrunnagle tgrunnagle left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Multi-agent review summary

No HIGH-severity findings; 2 MEDIUM (F1, F2) and 8 LOW. Recommended action: COMMENT — the MEDIUM findings are worth addressing before merge but neither is a hard blocker. (event=COMMENT because the reviewer is also the PR author; nothing else to read into.)

Consensus-scored findings

# Title Severity Score Raised by
F1 Consumer-CR Status Condition Parity gap (issue #5328 AC4 unimplemented) MEDIUM 8 operator, intent
F2 setInvalid_ReasonSelection test duplicates production logic — tautological MEDIUM 9 error-handling, test-coverage, general
F3 Stale line-number references in the new Reconcile comment block LOW 8 operator, style
F4 PR body's enumerated test list is stale vs the diff (5 vs 6) LOW 7 test-coverage, intent
F5 TODO(#5328) marker left behind at mcpexternalauthconfig_types.go:1118 LOW 7 style (verified)
F6 OBO_DefaultHandler...EnterpriseRequired lacks positive message assertion LOW 7 test-coverage, operator
F7 Generic-error guard missing "unknown middleware type" in 2 of 4 tests LOW 7 error-handling
F8 PR body's promised user-facing message ≠ actual obo.ErrEnterpriseRequired text LOW 7 general (verified)
F9 OBOApplyRunConfig doc comment references caller (style rule violation) LOW 6 style
F10 ClearsStaleIdentitySynthesized guards an implicit invariant without naming it LOW 7 error-handling, test-coverage

Findings below threshold (score < 7) are listed under §11 of the local review at .pr-review/pr-5345/findings.md. They're not included here.

Executive summary

This PR wires the OBO external-auth type through the three operator dispatch sites (AddExternalAuthConfigOptions, getExternalAuthConfigSecretEnvVar, the MCPExternalAuthConfig reconcile loop) and adds a setInvalid helper that uses MutateAndPatchStatus to write Valid=False / Reason=EnterpriseRequired when the default OBO handler returns obo.ErrEnterpriseRequired. The approach is structurally sound: it follows #5328's recommended implementation almost verbatim, removes the two pre-existing TODO(#5328) no-op stubs at the dispatch sites, and uses errors.Is so the sentinel survives the vMCP "failed to convert to strategy: %w" wrap. Roughly half the diff is targeted unit tests, including a load-bearing regression guard (OBO_ClearsStaleIdentitySynthesized) for the setInvalid re-Get + re-apply pattern that defends against MutateAndPatchStatus's snapshot/diff trap.

Pros / Cons

Pros

  • Tight implementation that mirrors #5328's recommended approach; dispatch arms placed correctly before default: so the "unsupported external auth type" path can never fire for obo.
  • setInvalid's re-Get + re-apply pattern is genuinely subtle and correct — the regression test makes the invariant testable.
  • Reason strings (EnterpriseRequired / InvalidConfig) extracted to typed constants with doc comments declaring them part of the user-facing contract.
  • Generic-error guard assertions in three of the four new tests directly enforce the no-leak-of-"unsupported"/"unknown-middleware" criterion.
  • stderrors "errors" alias is consistent with the existing convention in sibling controller files.

Cons

  • AC4 not implemented: consumer CRDs don't surface a parallel EnterpriseRequired condition (see F1).
  • The setInvalid_ReasonSelection table-test asserts against its own re-implementation of the production reason-selection logic, not the production code (see F2).
  • Two comments cite absolute line numbers (110, 139) that don't quite match the post-PR file (see F3).
  • A third TODO(#5328) marker is left behind at mcpexternalauthconfig_types.go:1118 after the issue this PR closes (see F5).
  • The PR body promises a user-facing message string that doesn't match the actual obo.ErrEnterpriseRequired text (see F8).

How to test

git fetch origin pull/5345/head:pr-5345 && git checkout pr-5345
task lint-fix
task test

Load-bearing-invariant check: temporarily remove the r.applyIdentitySynthesizedCondition(c) call inside setInvalid's MutateAndPatchStatus closure (controller.go:~225). TestMCPExternalAuthConfigReconciler_OBO_ClearsStaleIdentitySynthesized should fail; if it doesn't, the regression guard is not load-bearing.

Out-of-diff findings (not posted inline)

F4 — PR body test enumeration is stale. The PR description lists five tests, but the diff actually adds six. The two setInvalid_WrappedSentinel/NonSentinelError entries are consolidated into one table-driven setInvalid_ReasonSelection, and the regression-guard OBO_ClearsStaleIdentitySynthesized is not mentioned at all. Update the body's enumeration to match the diff and add a one-line note for ClearsStaleIdentitySynthesized explaining what bug it guards.

F5 — Leftover TODO(#5328) at mcpexternalauthconfig_types.go:1118. The PR removes the TODO(#5328) markers from tokenexchange.go:228 and virtualmcpserver_deployment.go:835, but a third marker at mcpexternalauthconfig_types.go:1118 (in the Validate switch's OBO arm — not touched by this diff) is left behind. The comment body is load-bearing WHY content and should stay; just strip the TODO(#5328): prefix so the marker doesn't outlive the issue it references.

F8 — User-facing message in PR body doesn't match the actual sentinel. The PR body's "Does this introduce a user-facing change?" section quotes condition.Message as "OBO external auth requires an enterprise build with a registered OBO handler". The actual obo.ErrEnterpriseRequired.Error() is "on-behalf-of (OBO) external auth type requires an enterprise build" (pkg/auth/obo/errors.go:19-20) — the phrase "with a registered OBO handler" does not appear. External consumers should pattern-match the reason string (not the message), but downstream tooling docs built against the PR body will be wrong. Update either the PR body or the sentinel text.

Agents consulted

6 specialist pr-reviewers: operator/k8s, error-handling, Go-style, test-coverage, intent-alignment, general. Codex cross-review skipped (CLI not installed locally).


🤖 Generated with Claude Code

Comment thread cmd/thv-operator/api/v1beta1/mcpexternalauthconfig_types.go
Comment thread cmd/thv-operator/controllers/mcpexternalauthconfig_controller.go Outdated
Comment thread cmd/thv-operator/controllers/mcpexternalauthconfig_controller_test.go Outdated
Comment thread cmd/thv-operator/controllers/virtualmcpserver_externalauth_test.go
Comment thread cmd/thv-operator/pkg/controllerutil/tokenexchange.go Outdated
Comment thread cmd/thv-operator/pkg/controllerutil/tokenexchange_test.go
…ction test

Addresses #5345 review comments:
- MEDIUM cmd/thv-operator/controllers/mcpexternalauthconfig_controller_test.go (3275486199 / F2):
  rework setInvalid_ReasonSelection to drive Reconcile via RegisterOBOHandler
  so the production errors.Is decision is exercised, not the test's own copy.
- LOW cmd/thv-operator/controllers/mcpexternalauthconfig_controller.go (3275486175 / F3):
  replace stale line-number prose in the Reconcile comment block with
  behavior-named references.
- LOW cmd/thv-operator/api/v1beta1/mcpexternalauthconfig_types.go (F5):
  strip the leftover TODO(#5328) prefix from the Validate() OBO arm;
  keep the load-bearing WHY content.
- LOW cmd/thv-operator/controllers/mcpexternalauthconfig_controller_test.go (3275486186 / F6):
  add a positive condition.Message assertion so a refactor that empties
  the sentinel message would fail the test.
- LOW cmd/thv-operator/pkg/controllerutil/tokenexchange_test.go (3275486215 / F7a):
  add the "unknown middleware type" generic-error guard.
- LOW cmd/thv-operator/controllers/virtualmcpserver_externalauth_test.go (3275486203 / F7b):
  add both generic-error guards to bring the test into uniformity with
  the AC and the sibling tests.
- LOW cmd/thv-operator/pkg/controllerutil/tokenexchange.go (3275486211 / F9):
  drop the "Called from ..." caller-naming clause from OBOApplyRunConfig's
  doc comment per go-style.md.
- LOW cmd/thv-operator/controllers/mcpexternalauthconfig_controller.go (3275486192 / F10):
  document the load-bearing nature of the re-invoked
  applyIdentitySynthesizedCondition inside setInvalid's closure, naming
  the regression test that guards it.
@tgrunnagle
Copy link
Copy Markdown
Contributor Author

Addressed out-of-diff findings from the review summary:

  • F4 (PR body test enumeration stale): updated the body to list all 6 new tests, with the consolidated table-driven OBO_ReasonSelectionInReconcile and the previously-unmentioned OBO_ClearsStaleIdentitySynthesized regression guard each described.
  • F8 (PR body user-facing message string mismatched the sentinel): the body now quotes the actual obo.ErrEnterpriseRequired.Error() text ("on-behalf-of (OBO) external auth type requires an enterprise build") and points at pkg/auth/obo/errors.go as the source.
  • F1 (consumer-CR condition parity gap): tracked in Mirror MCPExternalAuthConfig.Valid=False/EnterpriseRequired to consumer CRs #5347, called out under "Special notes for reviewers" with the must-merge-before-Admit obo in MCPExternalAuthConfig CRD enum and regenerate manifests #5329 constraint.

@github-actions github-actions Bot added size/M Medium PR: 300-599 lines changed and removed size/M Medium PR: 300-599 lines changed labels May 20, 2026
@tgrunnagle tgrunnagle marked this pull request as ready for review May 20, 2026 16:55
Copy link
Copy Markdown
Contributor

@jhrozek jhrozek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Two non-blocking comments inline — both about the future enterprise handler contract / release sequencing, neither needs to land in this PR.

if stderrors.Is(err, obo.ErrEnterpriseRequired) {
reason = mcpv1beta1.ConditionReasonEnterpriseRequired
}
return r.setInvalid(ctx, externalAuthConfig, err, reason)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Non-blocking thought on the future enterprise handler — setInvalid always returns (ctrl.Result{}, nil), so any non-sentinel error from OBOValidate lands Reason: InvalidConfig permanently until the user touches the spec. For the upstream stub that's fine (it only ever returns ErrEnterpriseRequired), but a future out-of-tree Validate doing I/O (Secret read, JWKS, webhook) would get the same treatment on a transient blip — no self-healing.

One way to nail the contract down before the enterprise handler PR — typed error matching the existing registryapi.Error / SpecValidationError precedent:

// pkg/auth/obo/errors.go
type OBOValidationError struct {
    Message string // verbatim into condition.Message; must be safe to expose
}
func (e *OBOValidationError) Error() string { return e.Message }
// the OBO branch
switch {
case stderrors.Is(err, obo.ErrEnterpriseRequired):
    return r.setInvalid(ctx, externalAuthConfig, err, mcpv1beta1.ConditionReasonEnterpriseRequired)
default:
    var valErr *obo.OBOValidationError
    if stderrors.As(err, &valErr) {
        return r.setInvalid(ctx, externalAuthConfig, valErr, mcpv1beta1.ConditionReasonInvalidConfig)
    }
    return ctrl.Result{}, fmt.Errorf("OBO handler validation failed: %w", err)
}

Three-way contract: ErrEnterpriseRequired → not licensed, *OBOValidationError → permanent (user-fix), anything else → transient (requeue). The Message field also doubles as a 'this is what shows in kubectl describe' hint, so handler authors don't accidentally leak Secret names or internal addressing via fmt.Errorf chains.

Same contract should apply to ApplyRunConfig and SecretEnvVars. No migration concern since no enterprise handler exists yet. Happy to defer to a follow-up issue if you'd rather keep #5328 scoped tight.

// obo.ErrEnterpriseRequired in upstream-only builds), invoked from
// the reconcile loop. The CRD-level Validate() stays a no-op for OBO
// and exists only to keep the `exhaustive` linter happy now that
// ExternalAuthTypeOBO is defined. The CRD enum currently rejects
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just double-checking #5329 (the CRD enum update) is on the same release train — wanted to make sure it didn't slip through. As written, this whole dispatch surface is unreachable in production until obo lands in the enum at line 69 (the apiserver rejects every obo-typed config first), so an admin trying to apply one between this PR and #5329 would get spec.type: Unsupported value with no CRD status to explain that an enterprise build would have served them better.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

size/M Medium PR: 300-599 lines changed

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Wire OBO dispatch arms and reconciler branch; add integration tests

2 participants