Mirror MCPExternalAuthConfig Valid=False onto consumer CR conditions#5354
Mirror MCPExternalAuthConfig Valid=False onto consumer CR conditions#5354tgrunnagle wants to merge 5 commits into
Conversation
There was a problem hiding this comment.
Large PR Detected
This PR exceeds 1000 lines of changes and requires justification before it can be reviewed.
How to unblock this PR:
Add a section to your PR description with the following format:
## Large PR Justification
[Explain why this PR must be large, such as:]
- Generated code that cannot be split
- Large refactoring that must be atomic
- Multiple related changes that would break if separated
- Migration or data transformationAlternative:
Consider splitting this PR into smaller, focused changes (< 1000 lines each) for easier review and reduced risk.
See our Contributing Guidelines for more details.
This review will be automatically dismissed once you add the justification section.
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## main #5354 +/- ##
==========================================
- Coverage 68.68% 68.65% -0.03%
==========================================
Files 625 626 +1
Lines 63298 63365 +67
==========================================
+ Hits 43474 43505 +31
- Misses 16583 16619 +36
Partials 3241 3241 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
PR size has been reduced below the XL threshold. Thank you for splitting this up!
|
✅ PR size has been reduced below the XL threshold. The size review has been dismissed and this PR can now proceed with normal review. Thank you for splitting this up! |
tgrunnagle
left a comment
There was a problem hiding this comment.
Multi-agent review summary
Four specialist agents (k8s-conditions, go-style, tests, general-quality) reviewed the diff. Codex cross-review skipped — codex CLI not installed on the reviewer host.
Consensus summary
| # | Severity | File | Title |
|---|---|---|---|
| F1 | HIGH | mcpserver_controller.go |
MCPServer leaks mirrored False when ExternalAuthConfigRef is removed from spec |
| F2 | MEDIUM | mcpremoteproxy_controller.go |
MCPRemoteProxy mirror lacks defensive clear-on-heal |
| F3 | MEDIUM | virtualmcpserver_externalauth_test.go |
Wrapped-error round-trip through buildOutgoingAuthConfig not exercised |
| F4 | MEDIUM | mcpserver_controller.go |
Asymmetric receiver vs. free-function design between parallel mirror helpers |
| F5 | LOW | mcpserver_controller.go |
Doubled MCPExternalAuthConfig prefix in error message |
| F6 | LOW | external_auth_mirror.go |
Empty source Reason copied verbatim onto consumer condition |
| F7 | LOW | external_auth_mirror.go |
Probe doc comment is verbose; rationale duplicated across file |
| F8 | LOW | mcpremoteproxy_controller_test.go |
MCPRemoteProxy mirror test does not validate message propagation |
| F9 | LOW | mcpserver_externalauth_test.go |
ObservedGeneration is set but not asserted in tests |
| F10 | LOW | mcpserver_externalauth_test.go |
Remove ref and source NotFound heal edges not tested (related to F1) |
Holistic assessment
What the PR does well
- Closes a real Status Condition Parity gap targeted by
.claude/rules/operator.md. - Shared probe + typed error in
external_auth_mirror.gokeeps the per-reconciler call sites small and uniform. - Good unit coverage on the three consumer reconcilers (AC1–AC3), including a regression guard for MCPServer heal-via-source-flip.
errors.Asround-trip throughmirroredReasonFromErroris the right idiom for threading the source reason across the vMCPAuthConfigErroraggregator.ObservedGenerationis set correctly on every new condition write.
Risks / trade-offs to weigh
- MCPServer.handleExternalAuthConfig returns early when the ref is removed without clearing the mirrored False, so the condition outlives its cause in that healing scenario. This is the one issue that should be addressed before merge — it directly contradicts the PR's stated heal-path intent.
- MCPRemoteProxy heal correctness depends on a downstream True-writer being reachable on every code path, vs. MCPServer's defensive clear-on-heal. Pick one model and apply it to both.
- Two near-identical helpers have asymmetric shapes — method on
*MCPServerReconcilerwith unused receiver vs. free function forMCPRemoteProxy. - The
mirroredReasonFromErrortest does not validate theerrors.Aswalk through thefmt.Errorf("... %w", err)wrap that production code actually applies inbuildOutgoingAuthConfig— a future%w→%vregression would not be caught.
Recommendation
Fix F1 (one-line meta.RemoveStatusCondition in the no-ref branch of MCPServerReconciler.handleExternalAuthConfig plus a regression test). The MEDIUM findings (F2–F4) are worth addressing in the same iteration; the LOW findings are quality follow-ups the author can accept or defer.
AC traceability vs. #5347
- AC1 (MCPServer) — addressed, but heal-path-on-ref-removal is incomplete (F1).
- AC2 (MCPRemoteProxy) — addressed; heal contract is fragile (F2).
- AC3 (VirtualMCPServer) — addressed; reason-propagation correctness depends on wrapped-error contract that lacks a test (F3).
- AC4 (envtest integration) — explicitly deferred to #5329 per PR body; out of scope.
- AC5 (merge ordering) — process item; out of code-review scope.
Generated with Claude Code.
Addresses #5354 review comments: - HIGH cmd/thv-operator/controllers/mcpserver_controller.go (F1): the previous mirror added in this branch handled the source-Valid-flips-True heal path, but the no-ref early-return in handleExternalAuthConfig never reached the helper, so removing the spec.externalAuthConfigRef left a stale Valid=False/EnterpriseRequired on the consumer indefinitely. Clear the condition in the no-ref branch and in both NotFound branches so the mirror does not outlive its cause. - MEDIUM cmd/thv-operator/controllers/mcpremoteproxy_controller.go (F2): mirror helper now defensively clears its own condition on the healed path, matching MCPServer. The downstream True-writer at the end of handleExternalAuthConfig still sets the ConditionReasonMCPRemoteProxyExternalAuthConfigValid reason on success; the defensive clear closes the gap if a future control-flow change adds an early return between this helper and that writer. - LOW cmd/thv-operator/controllers/mcpserver_externalauth_test.go (F10): two new regression-guard tests for the heal paths uncovered by F1 — spec.externalAuthConfigRef removed, referenced source NotFound. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Addresses #5354 review comments: - MEDIUM cmd/thv-operator/controllers/virtualmcpserver_externalauth_test.go (F3): add a wrapped-error round-trip assertion so a future %w -> %v regression in buildOutgoingAuthConfig can't silently break per-backend reason propagation. - MEDIUM cmd/thv-operator/controllers/mcpserver_controller.go (F4): move both consumer mirror functions to free functions in external_auth_mirror.go (mirrorInvalidOnMCPServer / mirrorInvalidOnRemoteProxy) instead of one method with an unused receiver and one suffixed free function. Single canonical home. - LOW cmd/thv-operator/controllers/mcpserver_controller.go (F5): drop the "referenced MCPExternalAuthConfig is invalid" prefix from the typed error's Error(); the outer wrap on each consumer carries the namespaced name. Rendered messages now read "MCPExternalAuthConfig ns/name: invalid (Reason): Message". - LOW cmd/thv-operator/controllers/external_auth_mirror.go (F6): substitute a fallback reason when the source surfaces Valid=False with an empty Reason so the apiserver patch cannot fail validation on a corrupt/partial source status. Add a TestMirroredExternalAuthConfigInvalid subtest for the empty-Reason input. - LOW cmd/thv-operator/controllers/external_auth_mirror.go (F7): trim the probe doc; package-level comment carries the #5347 motivation once, the per-function docs are now contract-only. - LOW cmd/thv-operator/controllers/mcpremoteproxy_controller_test.go (F8): add expectedCondMessage to the test table and assert it on the OBO mirror row. - LOW cmd/thv-operator/controllers/mcpserver_externalauth_test.go (F9): set non-zero Generation on the test fixtures and assert ObservedGeneration on the mirrored condition. Same gap covered on the MCPRemoteProxy row. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Closes the Status Condition Parity gap noted in #5347. Before this change, an obo-typed MCPExternalAuthConfig surfaced Valid=False/EnterpriseRequired only on the referenced resource. The consumer CR (MCPServer, MCPRemoteProxy, VirtualMCPServer) showed only the generic dispatch failure, leaving users to inspect the referenced config to understand why their workload would not deploy. Each consumer reconciler now inspects the referenced config's Valid condition. When it is False, the reconciler mirrors the source's reason and message onto its own ExternalAuthConfigValidated condition (or per-backend DiscoveredAuthConfig/BackendAuthConfig condition for vMCP). This must merge before #5329 lights up apiserver-level admission of obo-typed configs so production users see the failure on the resource they applied. The envtest integration test for this propagation is deferred to #5329 to avoid adding a setup-envtest dependency here; once #5329 admits "obo" at the CRD layer, the test no longer needs a CRD-enum bypass. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Two changes from the code review on the previous commit: 1. (correctness) MCPServer.handleExternalAuthConfig now clears any stale ExternalAuthConfigValidated=False the mirror previously wrote when the referenced source has healed (Valid=True or absent). Without this, the False condition outlived its cause: the user fixed their spec but the mirror stayed sticky because MCPServer has no downstream True setter on this condition. Adds a regression-guard test case. 2. (parity) The mirror probe, typed error, and reason-extraction helper now live in a shared external_auth_mirror.go file instead of being duplicated across the three consumer reconcilers. All three reconcilers (MCPServer, MCPRemoteProxy, VirtualMCPServer) call the same primitive and surface the same typed error so downstream consumers like buildOutgoingAuthConfig can react without string-matching. MCPRemoteProxy already sets ExternalAuthConfigValidated=True downstream and so does not need the heal-clearing branch — its existing True writer overwrites any stale mirror. VirtualMCPServer heals naturally because setAuthConfigConditions removes stale per-backend conditions on each reconcile. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Two changes from the second round of code review feedback on the consumer-CR mirror: 1. mirroredExternalAuthConfigInvalid now returns the typed *mirroredInvalidExternalAuthConfigError directly (nil when the source is healed) instead of a (reason, error) tuple. Callers no longer need errors.As to recover the message field, and the pointer-or-nil shape removes the awkward "discard the bool" pattern at the MCPServer and MCPRemoteProxy call sites. The typed value still satisfies the error interface so vMCP's convertBackendAuthConfigToVMCP can keep passing it through opaque error-returning APIs and recover the reason later via mirroredReasonFromError. 2. MCPServer and MCPRemoteProxy's outer error wrap no longer repeats the typed error's "referenced MCPExternalAuthConfig is invalid" prefix, so messages render as a single sentence instead of a doubled phrase. The namespaced name is preserved through %w. Updates the mirror probe's unit test to the new return shape and adds a round-trip assertion that the typed pointer flows through mirroredReasonFromError. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Addresses #5354 review comments: - HIGH cmd/thv-operator/controllers/mcpserver_controller.go (F1): the previous mirror added in this branch handled the source-Valid-flips-True heal path, but the no-ref early-return in handleExternalAuthConfig never reached the helper, so removing the spec.externalAuthConfigRef left a stale Valid=False/EnterpriseRequired on the consumer indefinitely. Clear the condition in the no-ref branch and in both NotFound branches so the mirror does not outlive its cause. - MEDIUM cmd/thv-operator/controllers/mcpremoteproxy_controller.go (F2): mirror helper now defensively clears its own condition on the healed path, matching MCPServer. The downstream True-writer at the end of handleExternalAuthConfig still sets the ConditionReasonMCPRemoteProxyExternalAuthConfigValid reason on success; the defensive clear closes the gap if a future control-flow change adds an early return between this helper and that writer. - LOW cmd/thv-operator/controllers/mcpserver_externalauth_test.go (F10): two new regression-guard tests for the heal paths uncovered by F1 — spec.externalAuthConfigRef removed, referenced source NotFound. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Addresses #5354 review comments: - MEDIUM cmd/thv-operator/controllers/virtualmcpserver_externalauth_test.go (F3): add a wrapped-error round-trip assertion so a future %w -> %v regression in buildOutgoingAuthConfig can't silently break per-backend reason propagation. - MEDIUM cmd/thv-operator/controllers/mcpserver_controller.go (F4): move both consumer mirror functions to free functions in external_auth_mirror.go (mirrorInvalidOnMCPServer / mirrorInvalidOnRemoteProxy) instead of one method with an unused receiver and one suffixed free function. Single canonical home. - LOW cmd/thv-operator/controllers/mcpserver_controller.go (F5): drop the "referenced MCPExternalAuthConfig is invalid" prefix from the typed error's Error(); the outer wrap on each consumer carries the namespaced name. Rendered messages now read "MCPExternalAuthConfig ns/name: invalid (Reason): Message". - LOW cmd/thv-operator/controllers/external_auth_mirror.go (F6): substitute a fallback reason when the source surfaces Valid=False with an empty Reason so the apiserver patch cannot fail validation on a corrupt/partial source status. Add a TestMirroredExternalAuthConfigInvalid subtest for the empty-Reason input. - LOW cmd/thv-operator/controllers/external_auth_mirror.go (F7): trim the probe doc; package-level comment carries the #5347 motivation once, the per-function docs are now contract-only. - LOW cmd/thv-operator/controllers/mcpremoteproxy_controller_test.go (F8): add expectedCondMessage to the test table and assert it on the OBO mirror row. - LOW cmd/thv-operator/controllers/mcpserver_externalauth_test.go (F9): set non-zero Generation on the test fixtures and assert ObservedGeneration on the mirrored condition. Same gap covered on the MCPRemoteProxy row. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
01aae22 to
c9d0a70
Compare
Summary
Closes #5347.
Today, when an
MCPExternalAuthConfigsurfacesValid=False(for example,an
obo-typed config that the default OBO handler rejects withReason=EnterpriseRequiredin upstream-only builds), the failure is visibleonly on the referenced
MCPExternalAuthConfigresource. The consumer CR(
MCPServer,MCPRemoteProxy,VirtualMCPServer) reports only a genericdispatch error, so users must inspect a different object to learn why their
workload will not deploy. This violates the Status Condition Parity rule in
.claude/rules/operator.mdand was deferred from #5345 as cross-cutting work.This PR closes that parity gap:
external_auth_mirror.goprimitive(
mirroredExternalAuthConfigInvalid,mirroredInvalidExternalAuthConfigError,mirroredReasonFromError) so all three consumer reconcilers probe the sourcethe same way and emit the same typed error.
MCPServermirrors the source's reason and message ontoExternalAuthConfigValidated=False, and clears any stale mirror once thesource heals (it has no downstream
Truesetter, so without the heal paththe
Falsewould persist after the user fixed the spec).MCPRemoteProxymirrors the same condition; it has an existing downstreamExternalAuthConfigValidated=Truewriter that overwrites any staleFalsenaturally, so no separate heal branch is needed.
VirtualMCPServermirrors the source reason onto each per-backendDiscoveredAuthConfig-<name>/BackendAuthConfig-<name>condition via anew
AuthConfigError.Reasonfield.setAuthConfigConditionsfalls back toConversionFailedwhen no source reason is present, preserving existingbehavior for non-mirror failures. vMCP heals naturally because
setAuthConfigConditionsrebuilds per-backend conditions on every reconcile.Type of change
Test plan
task test)task lint-fix)Added or extended unit coverage for every site touched:
mcpserver_externalauth_test.go— table-driven mirror behavior on theMCPServerreconciler coveringValid=False/EnterpriseRequired,Valid=False/InvalidConfig,Valid=True(no mirror), noValidcondition(no mirror), and the heal-clearing regression guard for a stale mirror left
by a prior reconcile.
mcpremoteproxy_controller_test.go— adds theValid=Falsemirror case tothe existing
TestHandleExternalAuthConfigtable on theMCPRemoteProxyreconciler.
virtualmcpserver_externalauth_test.go— exercises the discovered andinline backend paths, verifying the per-backend condition surfaces the
mirrored reason and that the
AuthConfigError.Reasonfield round-tripsthrough
convertBackendAuthConfigToVMCP->buildOutgoingAuthConfig->setAuthConfigConditions.virtualmcpserver_vmcpconfig_test.go— direct coverage ofsetAuthConfigConditionsreason-selection: mirrored reason when present,ConversionFailedfallback otherwise, across the default, discovered, andinline backend code paths.
API Compatibility
v1beta1API, OR theapi-break-allowedlabel is applied and the migration guidance is described above.
This PR adds an unexported
Reasonfield to the in-processAuthConfigErrorstruct and reads existing
Status.Conditionson consumer CRs. No CRD schemachanges.
Changes
cmd/thv-operator/controllers/external_auth_mirror.gocmd/thv-operator/controllers/mcpserver_controller.goValid=FalseontoExternalAuthConfigValidated; clear stale mirror when the source heals.cmd/thv-operator/controllers/mcpremoteproxy_controller.goValid=FalseontoExternalAuthConfigValidated; downstreamTruesetter handles healing.cmd/thv-operator/controllers/virtualmcpserver_controller.goAuthConfigError.Reason; mirror in discover/convert paths; propagate viabuildOutgoingAuthConfig; surface insetAuthConfigConditions.cmd/thv-operator/controllers/mcpserver_externalauth_test.gocmd/thv-operator/controllers/mcpremoteproxy_controller_test.goValid=Falsemirror case.cmd/thv-operator/controllers/virtualmcpserver_externalauth_test.gocmd/thv-operator/controllers/virtualmcpserver_vmcpconfig_test.gosetAuthConfigConditions.Does this introduce a user-facing change?
Yes. When a consumer CR references an
MCPExternalAuthConfigwhoseValidcondition is
False, the consumer's ownStatus.Conditionsnow carries aparallel
Falsecondition with the same reason (for exampleEnterpriseRequired) and message. Users no longer need to inspect adifferent object to diagnose the failure on the workload they applied.
Special notes for reviewers
Stacked PR. This PR is the next in a stacked series. The upstream PRs in
the stack are #5332, #5338, and #5345 — PR #5345 is the immediate predecessor
and shares branch history. Until #5345 merges to
main, the diff againstmainwill include those earlier commits. The in-scope commits for this PRare the three HEAD-most:
Mirror MCPExternalAuthConfig Valid=False onto consumer CR conditionsAddress code review feedback on consumer-CR mirrorTighten mirror probe API; trim duplicated error prefixPlease review after #5345 merges so the diff narrows to just this PR's scope.
Deferred AC4 (envtest integration tests). Issue #5347's AC4 calls for
integration tests under
cmd/thv-operator/test-integration/mcp-external-auth/that verify propagation end-to-end against envtest with a CRD-enum bypass.
Per in-session direction we deferred this to #5329 to avoid adding a
setup-envtestdependency here. Once #5329 admitsoboat the CRD layer,the integration test no longer needs a CRD-enum bypass and is substantially
simpler to write. Unit-level coverage in this PR exercises every consumer
reconciler's mirror behavior, the heal path on
MCPServer, the sharedprobe, the reason-propagation helpers, and the per-backend condition wiring
in
setAuthConfigConditions.Known follow-up:
MCPServerEntry.MCPServerEntry(the per-backendentry inside
VirtualMCPServer) has the same parity gap, but it isout-of-scope for #5347's explicit AC list
(
MCPServer/MCPRemoteProxy/VirtualMCPServer). Suggest filing a separateissue to track the entry-level mirror; not filing from this PR.
Merge ordering. This PR must merge before #5329 so production users see
the
EnterpriseRequiredfailure on the resource they applied (the consumerCR) the moment apiserver-level admission of
obo-typed configs is lit up.