Add default OBO handler hooks and vMCP/proxy converter stubs#5338
Conversation
There was a problem hiding this comment.
Large PR Detected
This PR exceeds 1000 lines of changes and requires justification before it can be reviewed.
How to unblock this PR:
Add a section to your PR description with the following format:
## Large PR Justification
[Explain why this PR must be large, such as:]
- Generated code that cannot be split
- Large refactoring that must be atomic
- Multiple related changes that would break if separated
- Migration or data transformationAlternative:
Consider splitting this PR into smaller, focused changes (< 1000 lines each) for easier review and reduced risk.
See our Contributing Guidelines for more details.
This review will be automatically dismissed once you add the justification section.
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## main #5338 +/- ##
==========================================
- Coverage 68.41% 68.39% -0.03%
==========================================
Files 621 624 +3
Lines 63278 63432 +154
==========================================
+ Hits 43293 43385 +92
- Misses 16757 16809 +52
- Partials 3228 3238 +10 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
PR size has been reduced below the XL threshold. Thank you for splitting this up!
|
✅ PR size has been reduced below the XL threshold. The size review has been dismissed and this PR can now proceed with normal review. Thank you for splitting this up! |
tgrunnagle
left a comment
There was a problem hiding this comment.
Multi-Agent Consensus Review
Agents consulted: operator/CRD, vMCP/auth, Go style, security, tests, general quality (codex skipped — CLI not installed)
Consensus Summary
| # | Finding | Consensus | Severity | Action |
|---|---|---|---|---|
| 1 | Validate() returns nil for OBO — defense-in-depth fail-open | 9/10 | MEDIUM | Fix |
| 2 | Three OBO switch arms behave inconsistently (error vs nil) | 9/10 | MEDIUM | Fix |
| 3 | Missing unit tests for the three new OBO switch arms | 8/10 | MEDIUM | Fix |
| 4 | Package-level OBO state mutated without sync — race risk | 8/10 | MEDIUM | Discuss |
| 5 | stubMessage comment claims exported but is unexported | 8/10 | MEDIUM | Fix |
| 6 | pkg/vmcp -> cmd/thv-operator/pkg/controllerutil cross-layer import | 8/10 | MEDIUM | Discuss |
| 7 | Init-order comment imprecise (function-call time, not runner-construction) | 7/10 | MEDIUM | Fix |
| 8 | validateTypeConfigConsistency missing OBO biconditional | 7/10 | MEDIUM | Discuss |
| 9 | OBOHandler partial registration risks nil-deref at dispatch | 7/10 | MEDIUM | Discuss |
| 10 | PR body "Changes" table lists 4 files not in diff (see note below) | 7/10 | MEDIUM | Fix |
| 11 | OBOValidate / OBOSecretEnvVars / OBOApplyRunConfig wrappers have zero in-tree callers | 7/10 | LOW | Discuss |
| 12 | OBOConverterStub "Stub" suffix inconsistent with sibling naming | 7/10 | LOW | Discuss |
| 13 | TestStubHandler body substring OR is redundant | 7/10 | LOW | Fix |
| 14 | ErrEnterpriseRequired message leaks Go package path into user-visible Status | 7/10 | LOW | Fix |
| 15 | 503 stub body discloses OBO type to unauthenticated callers | 7/10 | LOW | Discuss |
| 16 | TODO(#5328) format inconsistency hurts discoverability | 7/10 | LOW | Fix |
Overall
The PR does exactly what the body advertises: three independent single-slot last-write-wins registration points across operator, vMCP converter, and proxy middleware, with a sentinel ErrEnterpriseRequired everywhere the default fires. Architecture is sound; CRD enum is correctly left in place so nothing user-facing changes today.
The central design choice — every dispatch site fails closed with ErrEnterpriseRequired — is undermined in three places where the new OBO arm returns nil instead of routing through the registered handler: MCPExternalAuthConfig.Validate(), getExternalAuthConfigSecretEnvVar, and the wrappers OBOValidate/OBOSecretEnvVars/OBOApplyRunConfig themselves have zero in-tree callers. The comments justify this with "the CRD enum rejects 'obo'" — but the defense-in-depth Go-level Validate() exists exactly to catch what the enum misses, and once #5329 widens the enum the gate is gone. The contract should be self-enforcing now so #5328 doesn't have to remember to rewire three independent sites correctly.
Two cross-cutting items: a new cross-layer import pkg/vmcp/auth/converters/obo.go -> cmd/thv-operator/pkg/controllerutil (no other converter in the package depends on operator internals — moving ErrEnterpriseRequired to a leaf package keeps direction clean), and a comment on obo.CreateMiddleware that says values are captured "at runner-construction time" when GetSupportedMiddlewareFactories() actually re-reads on every call (capture-once comes from NewRunner).
Test coverage is solid for the wrappers, absent for the actual new switch arms. A regression there would not break a test.
Meta finding (not inline because it's about the PR body)
Finding #10: The PR body's "Changes" table includes rows for cmd/thv-operator/app/app.go, main.go, and the proxyrunner equivalents described as "mechanical previously-landed refactor". gh pr view shows 14 files in the diff; those four are not among them. Remove the stale row — it invites confusion for reviewers.
Generated with Claude Code — multi-agent consensus review.
|
Addressed Finding #10 (and gave the PR body a broader refresh while I was in there): removed the four stale Two commits in this round, addressing the review:
Deferred to #5328 with rationale on each thread: F1, F2 (both arms), F3, F11. Deferred F15 (information disclosure) with a follow-up note. F10 (PR body) handled by the body edit above.
|
Addresses #5338 review comments: - MEDIUM pkg/auth/obo/middleware.go (3270092569): clarify stubMessage doc comment — the constant is unexported and shared with same-package tests. - MEDIUM pkg/vmcp/auth/converters/obo.go (3270092576): move ErrEnterpriseRequired from cmd/thv-operator/pkg/controllerutil to pkg/auth/obo so vmcp converters no longer import operator internals. - MEDIUM cmd/thv-operator/api/v1beta1/mcpexternalauthconfig_types.go (3270092583): existing TODO already covers the biconditional + the unauthenticated check; refine the OBO Validate() arm wording to make the contract (delegated to the controllerutil layer, CRD-level Validate stays a no-op) explicit. - LOW pkg/vmcp/auth/converters/obo.go (3270092595): rename OBOConverterStub -> OBOConverter for parity with sibling converters (TokenExchangeConverter, AwsStsConverter, ...). - LOW pkg/auth/obo/middleware_test.go (3270092599): replace the body-substring OR with an exact-equal check against stubMessage+"\n", locking the cross-file contract. - LOW cmd/thv-operator/pkg/controllerutil/tokenexchange.go (3270092604): drop the developer-instruction prose from ErrEnterpriseRequired's user-visible message; the registration instruction now lives in the Go doc comment, where it belongs. - LOW cmd/thv-operator/pkg/controllerutil/tokenexchange.go (3270092611): normalize TODO(#5328) format across the three sites referencing the dispatch-wiring follow-up so `git grep TODO\(#5328\)` finds them all. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Addresses #5338 review comments: - MEDIUM cmd/thv-operator/pkg/controllerutil/tokenexchange.go (3270092563): guard the package-level OBO handler with a sync.RWMutex so a future hot-reload or admission-webhook re-registration cannot race in-flight dispatch reads. All wrapper reads now snapshot the handler under RLock; RegisterOBOHandler takes the write lock. - MEDIUM pkg/auth/obo/middleware.go (3270092580): turn CreateMiddleware into a stable redirector backed by a mutex-protected currentFactory. The "captured at runner-construction time" caveat no longer applies — every CreateMiddleware call reads the current factory under RLock — and the doc comment is rewritten to match the new semantics. Adds obo.DefaultFactory so external test code can restore the package default in a t.Cleanup. The pkg/runner contract test is updated to exercise the call-after-register path that was previously unreachable. - MEDIUM cmd/thv-operator/pkg/controllerutil/tokenexchange.go (3270092585): RegisterOBOHandler now panics if any of the three function fields is nil. A partial registration would have nil-derefed deep inside dispatch at reconcile time, far from the call site; surfacing the problem at init() is easier to diagnose. Sibling fix in obo.RegisterFactory rejects a nil factory the same way. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
jhrozek
left a comment
There was a problem hiding this comment.
Clean scaffolding PR — dispatch surface for OBO looks solid, tests cover the security-critical paths (503 status locked, downstream never invoked, last-write-wins, panic-on-nil), and the CRD enum genuinely stays closed until #5329. Approving. Two non-blocking questions inline.
There was a problem hiding this comment.
Large PR Detected
This PR exceeds 1000 lines of changes and requires justification before it can be reviewed.
How to unblock this PR:
Add a section to your PR description with the following format:
## Large PR Justification
[Explain why this PR must be large, such as:]
- Generated code that cannot be split
- Large refactoring that must be atomic
- Multiple related changes that would break if separated
- Migration or data transformationAlternative:
Consider splitting this PR into smaller, focused changes (< 1000 lines each) for easier review and reduced risk.
See our Contributing Guidelines for more details.
This review will be automatically dismissed once you add the justification section.
Stand up the upstream default stub infrastructure for the new `obo` external auth type at three dispatch layers so an out-of-tree build can register a real OBO implementation by calling three setter functions while an upstream-only build hits the defaults and surfaces a sentinel error at every dispatch point. Implements changes for issue #5327: - Add ExternalAuthTypeOBO ExternalAuthType = "obo" constant to the CRD types file. CRD enum admission, OBOConfig struct and CEL rule remain out of scope and land in the follow-up CRD-admission task. - Add controllerutil.OBOHandler struct, ErrEnterpriseRequired sentinel, RegisterOBOHandler setter and OBOValidate / OBOSecretEnvVars wrappers to cmd/thv-operator/pkg/controllerutil/tokenexchange.go. Default handler returns ErrEnterpriseRequired from every method. - Add pkg/auth/obo package exporting MiddlewareType = "obo", CreateMiddleware factory variable and RegisterFactory setter. Default factory installs a stub Middleware whose Handler responds 503 with a body explaining the OBO middleware factory has not been registered. - Wire obo.MiddlewareType -> obo.CreateMiddleware into the 17-entry middleware-factory map in pkg/runner/middleware.go. - Add OBOConverterStub to pkg/vmcp/auth/converters and register it as the built-in StrategyConverter for ExternalAuthTypeOBO inside NewRegistry. Every method returns an error wrapping ErrEnterpriseRequired so callers can match it via errors.Is. - Add no-op case ExternalAuthTypeOBO arms to existing switches on ExternalAuthType (Validate, AddExternalAuthConfigOptions, getExternalAuthConfigSecretEnvVar) to keep the exhaustive linter green. These arms preserve the existing default behavior; the actual dispatch wiring lives in the follow-up task. - Add unit tests covering every default stub behavior, last-write-wins semantics for both setters and the registered-converter lookup.
Fixed issues from code review: - HIGH: Lead each new case ExternalAuthTypeOBO arm comment with an explicit "Minimal no-op arm added to satisfy the exhaustive linter; dispatch wiring lands in follow-up task #5328" so reviewers see the scope rationale next to the code rather than only in the PR body. - HIGH: Wrap ErrEnterpriseRequired from the OBO arm of AddExternalAuthConfigOptions ("obo dispatch not yet wired: %w") so callers can use errors.Is to distinguish "type known but not wired" from a genuinely unknown auth type. - MEDIUM: Add a TODO(#5329) marker on validateTypeConfigConsistency pointing at the biconditional row and the unauthenticated check that must be updated when the CRD admission task introduces OBOConfig. - MEDIUM: Add OBOApplyRunConfig wrapper to symmetrize the public surface alongside OBOValidate and OBOSecretEnvVars, plus a sibling unit test covering the default sentinel path and dispatch through a registered handler. - MEDIUM: Add TestGetSupportedMiddlewareFactories_OBORegisterBeforeConstruction in pkg/runner/middleware_test.go that locks down the "register before construction" contract for obo.CreateMiddleware — a factory swapped before GetSupportedMiddlewareFactories runs must be the one captured in the returned map.
Adds the new "obo" external auth type entry to docs/operator/crd-api.md so the docs match the constant introduced in cmd/thv-operator/api/v1beta1/mcpexternalauthconfig_types.go. Output of "task crdref-gen". Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Addresses #5338 review comments: - MEDIUM pkg/auth/obo/middleware.go (3270092569): clarify stubMessage doc comment — the constant is unexported and shared with same-package tests. - MEDIUM pkg/vmcp/auth/converters/obo.go (3270092576): move ErrEnterpriseRequired from cmd/thv-operator/pkg/controllerutil to pkg/auth/obo so vmcp converters no longer import operator internals. - MEDIUM cmd/thv-operator/api/v1beta1/mcpexternalauthconfig_types.go (3270092583): existing TODO already covers the biconditional + the unauthenticated check; refine the OBO Validate() arm wording to make the contract (delegated to the controllerutil layer, CRD-level Validate stays a no-op) explicit. - LOW pkg/vmcp/auth/converters/obo.go (3270092595): rename OBOConverterStub -> OBOConverter for parity with sibling converters (TokenExchangeConverter, AwsStsConverter, ...). - LOW pkg/auth/obo/middleware_test.go (3270092599): replace the body-substring OR with an exact-equal check against stubMessage+"\n", locking the cross-file contract. - LOW cmd/thv-operator/pkg/controllerutil/tokenexchange.go (3270092604): drop the developer-instruction prose from ErrEnterpriseRequired's user-visible message; the registration instruction now lives in the Go doc comment, where it belongs. - LOW cmd/thv-operator/pkg/controllerutil/tokenexchange.go (3270092611): normalize TODO(#5328) format across the three sites referencing the dispatch-wiring follow-up so `git grep TODO\(#5328\)` finds them all. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Addresses #5338 review comments: - MEDIUM cmd/thv-operator/pkg/controllerutil/tokenexchange.go (3270092563): guard the package-level OBO handler with a sync.RWMutex so a future hot-reload or admission-webhook re-registration cannot race in-flight dispatch reads. All wrapper reads now snapshot the handler under RLock; RegisterOBOHandler takes the write lock. - MEDIUM pkg/auth/obo/middleware.go (3270092580): turn CreateMiddleware into a stable redirector backed by a mutex-protected currentFactory. The "captured at runner-construction time" caveat no longer applies — every CreateMiddleware call reads the current factory under RLock — and the doc comment is rewritten to match the new semantics. Adds obo.DefaultFactory so external test code can restore the package default in a t.Cleanup. The pkg/runner contract test is updated to exercise the call-after-register path that was previously unreachable. - MEDIUM cmd/thv-operator/pkg/controllerutil/tokenexchange.go (3270092585): RegisterOBOHandler now panics if any of the three function fields is nil. A partial registration would have nil-derefed deep inside dispatch at reconcile time, far from the call site; surfacing the problem at init() is easier to diagnose. Sibling fix in obo.RegisterFactory rejects a nil factory the same way. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Picks up the BackendAuthStrategy.Type doc string change from b8d97ce (added "obo" to the strategy list). Output of "task operator-manifests" with no other edits. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
ce21a06 to
f6972f2
Compare
PR size has been reduced below the XL threshold. Thank you for splitting this up!
|
✅ PR size has been reduced below the XL threshold. The size review has been dismissed and this PR can now proceed with normal review. Thank you for splitting this up! |
Matches the sibling middleware packages (awssts, upstreamswap, oauthproto/tokenexchange) and makes RegisterFactory the single mutation path — direct assignment to CreateMiddleware is no longer possible. The redirector body is unchanged: every call still reads currentFactory under the package RWMutex, so out-of-tree builds that registered a factory continue to see their factory dispatched to. Addresses #5338 (discussion_r3274513656). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Picks up the BackendAuthStrategy.Type doc string change from 8680d3c that added "obo" to the strategy list. Output of "task crdref-gen" with no other edits. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Summary
This is the second of four flat technical tasks for the OSS on-behalf-of (OBO) work. It stands up the default stub infrastructure for a new
oboexternal auth type at the three dispatch layers (operator, vMCP converter, proxy runtime), plus the Go constant the other layers reference. After this change, an out-of-tree build can register a real OBO implementation by calling three setter functions; an upstream-only build hits the defaults and surfaces a sentinelobo.ErrEnterpriseRequirederror at every dispatch point.The CRD enum is not widened in this PR —
spec.type: obois still rejected at the apiserver layer. The constant, handler hooks, middleware-factory entry, and converter stub live as dark infrastructure that activates the moment the CRD admission task (#5329) merges.ExternalAuthTypeOBOconstant inmcpexternalauthconfig_types.goand minimal no-opexhaustivelinter arms.pkg/auth/obo/package — leaf home forErrEnterpriseRequired,MiddlewareType, the defaultCreateMiddlewareredirector (503 stub),DefaultFactory, andRegisterFactory. Wires one new entry intopkg/runner/middleware.go.OBOHandlerstruct,RegisterOBOHandlersetter (panics on partial registration), andOBOValidate/OBOSecretEnvVars/OBOApplyRunConfigwrappers incontrollerutil/tokenexchange.go. Reads/writes guarded by async.RWMutex.OBOConverterinpkg/vmcp/auth/converters/obo.goand registers it as a built-in inNewRegistry().StrategyTypeOBOconstant inpkg/vmcp/auth/types/types.go.Closes #5327
Type of change
Test plan
task test)task lint-fix)go test -race ./pkg/auth/obo/... ./cmd/thv-operator/pkg/controllerutil/... ./pkg/runner/...)API Compatibility
v1beta1API, OR theapi-break-allowedlabel is applied and the migration guidance is described above.The new
ExternalAuthTypeOBOGo constant is purely additive and is not yet present in the CRD enum, so the apiserver behavior is unchanged.Changes
cmd/thv-operator/api/v1beta1/mcpexternalauthconfig_types.goExternalAuthTypeOBOconstant; add no-opValidate()switch arm withTODO(#5328)pointer for the reconcile-loop dispatch site andTODO(#5329)pointer for the biconditional/unauthenticated checkspkg/auth/obo/errors.goErrEnterpriseRequired(the cross-layer sentinel imported by both operator and vMCP code paths)pkg/auth/obo/middleware.goMiddlewareTypeconstant,CreateMiddlewareas a stable redirector overcurrentFactory(guarded bysync.RWMutex),DefaultFactory(exported so external test code can restore default behavior), andRegisterFactory(panics on nil factory)pkg/auth/obo/middleware_test.gostubMessage), default factory's runner registration, last-write-wins re-registration, panic-on-nilpkg/auth/obo/errors_test.goerrors.Isworks throughfmt.Errorf("...: %w", ...)cmd/thv-operator/pkg/controllerutil/tokenexchange.goOBOHandlerstruct,RegisterOBOHandler(panics on any nil func field),OBOValidate/OBOSecretEnvVars/OBOApplyRunConfigwrappers;sync.RWMutex-guardedoboHandlerpackage state;ExternalAuthTypeOBOarm inAddExternalAuthConfigOptionsreturnsobo.ErrEnterpriseRequired(dispatch through the handler hook lands in #5328)cmd/thv-operator/pkg/controllerutil/tokenexchange_test.gocmd/thv-operator/controllers/virtualmcpserver_deployment.goExternalAuthTypeOBOarm togetExternalAuthConfigSecretEnvVarswitch withTODO(#5328)pointer for the dispatch rewritepkg/runner/middleware.goobo.MiddlewareType: obo.CreateMiddlewareentry to the factory mappkg/runner/middleware_test.gooboentry; lock the call-after-register contract forobo.CreateMiddlewarepkg/vmcp/auth/converters/obo.goOBOConverterimplementingStrategyConverterwithobo.ErrEnterpriseRequired-wrapped errorspkg/vmcp/auth/converters/obo_test.goerrors.Ispropagationpkg/vmcp/auth/converters/interface.goOBOConverteras built-in inNewRegistry()pkg/vmcp/auth/converters/registry_test.goOBOConverteris registeredpkg/vmcp/auth/types/types.goStrategyTypeOBOconstant; extendBackendAuthStrategy.Typedoc to list"obo"docs/operator/crd-api.mdoboenum valueDoes this introduce a user-facing change?
No. The CRD enum still rejects
spec.type: obo, so no user-visible behavior changes in upstream-only builds. Once the CRD admission task (#5329) lands, an upstream-only build will surfaceobo.ErrEnterpriseRequiredasstatus.conditions[Valid] = FalsewithReason: EnterpriseRequiredforobo-typed resources.Special notes for reviewers
case ExternalAuthTypeOBOarms inAddExternalAuthConfigOptionsandgetExternalAuthConfigSecretEnvVarare minimal placeholders to satisfy theexhaustivelinter. The next task (Wire OBO dispatch arms and reconciler branch; add integration tests #5328) replaces them with real dispatch intoOBOApplyRunConfig/OBOSecretEnvVars, and adds the reconcile-loop branch that mapsobo.ErrEnterpriseRequiredtoReason: EnterpriseRequired. Every site that Wire OBO dispatch arms and reconciler branch; add integration tests #5328 needs to touch is tagged withTODO(#5328):for grep-discoverability.obo.CreateMiddlewareis a stable redirector: calls dispatch throughcurrentFactoryunder anRWMutexon every invocation, soRegisterFactoryafter thepkg/runnermiddleware map is built still takes effect. This removes the captured-at-construction-time fragility that an earlier revision of this PR carried (review feedback).RegisterOBOHandlerpanics if any ofValidate/ApplyRunConfig/SecretEnvVarsis nil;obo.RegisterFactorypanics on a nil factory. Both panic at init() time in real builds, so a misconfigured out-of-tree wiring surfaces at process start, not at reconcile time.RegisterOBOHandler,obo.RegisterFactory, and theRegistry.Registermap for the converter) accept double-registration without panic, matching the existingpkg/config.RegisterProviderFactoryprecedent. Tests cover this.Generated with Claude Code