Skip to content

make imageproxyserver handle other registry names#276

Open
atd9876 wants to merge 9 commits intomainfrom
275-make-imageproxyserver-handle-other-registry-names
Open

make imageproxyserver handle other registry names#276
atd9876 wants to merge 9 commits intomainfrom
275-make-imageproxyserver-handle-other-registry-names

Conversation

@atd9876
Copy link

@atd9876 atd9876 commented Feb 23, 2026

Proposed Changes

Replace registry-specific handlers (handleGHCR, handleKeppel) with a generic handleDockerRegistry() that works with any OCI-compliant registry

Add multi-layer registry validation to prevent unauthorised registry access before expensive OCI operations

Layer 1: Controller validation before image inspection (early failure with clear errors)
Layer 2: ImageProxy runtime validation (security boundary at HTTP request time)

Create shared registry validation package

  • ExtractRegistryDomain() - Extracts registry from OCI image references (handles Docker Hub implicit refs)
  • IsRegistryAllowed() - Checks against ALLOWED_REGISTRIES/BLOCKED_REGISTRIES environment variables
  • ValidateImageRegistry() - Validates image refs with descriptive error messages

Add early validation to PXE and HTTP Boot controllers before calling docker.NewResolver().Resolve() to provide clear error messages

Fix missing error state patching in HTTP Boot controller to match PXE patterns

Implement fail-closed security model - deny all registries if neither ALLOWED_REGISTRIES nor BLOCKED_REGISTRIES is configured

Fixes #275

Summary by CodeRabbit

  • New Features

    • Registry allowlist/blocklist enforcement for image access with explicit rejection for disallowed registries.
    • Unified, registry-aware image proxy with authentication detection, token handling, caching, and improved redirect/header handling.
    • Improved image reference parsing and more reliable cross-architecture image resolution with clearer error status on failures.
  • Tests

    • Comprehensive registry validation unit tests, an in-memory mock OCI registry for test isolation, and test helpers for image references.

@atd9876 atd9876 linked an issue Feb 23, 2026 that may be closed by this pull request
@atd9876 atd9876 self-assigned this Feb 23, 2026
@atd9876 atd9876 force-pushed the 275-make-imageproxyserver-handle-other-registry-names branch from 6dd7302 to 789a6c3 Compare February 23, 2026 15:26
@atd9876 atd9876 force-pushed the 275-make-imageproxyserver-handle-other-registry-names branch from b24adab to 18ae968 Compare February 24, 2026 16:35
@atd9876 atd9876 force-pushed the 275-make-imageproxyserver-handle-other-registry-names branch from d8f6bd1 to d83197e Compare February 24, 2026 16:55
@github-actions github-actions bot added size/XXL and removed size/XL labels Feb 24, 2026
@atd9876 atd9876 force-pushed the 275-make-imageproxyserver-handle-other-registry-names branch from f5650e7 to b446367 Compare February 25, 2026 08:45
@atd9876 atd9876 requested a review from a team February 25, 2026 08:54
@atd9876 atd9876 moved this to In Progress in Roadmap Feb 25, 2026
@atd9876 atd9876 marked this pull request as ready for review February 25, 2026 08:59
@coderabbitai
Copy link
Contributor

coderabbitai bot commented Feb 25, 2026

Walkthrough

Adds a registry Validator, multi-registry-aware image proxy with auth detection and caching, controller helpers for parsing/resolving OCI images, an in-memory test registry, and wires the Validator into PXE/HTTP reconcilers and the image proxy. Tests and Dockerfile updated to include the new registry package.

Changes

Cohort / File(s) Summary
Build
Dockerfile
Include internal/registry/ in image via additional COPY.
CLI / bootstrap
cmd/main.go
Instantiate registry.Validator, log config, wire into PXE/HTTP reconcilers, pass to image proxy server.
Controller helpers
internal/controller/serverbootconfig_helpers.go, internal/controller/serverbootconfig_helpers_test.go
Add Parse/Build image ref, FindManifestByArchitecture, ExtractServerNetworkIDs, EnqueueServerBootConfigsReferencingSecret, fetchContent; tests for BuildImageReference.
PXE reconciler
internal/controller/serverbootconfiguration_pxe_controller.go, internal/controller/serverbootconfiguration_pxe_controller_test.go
Add RegistryValidator field; switch to ParseImageReference/FindManifestByArchitecture for image resolution; remove nested-manifest traversal; tests use MockImageRef.
HTTP reconciler
internal/controller/serverbootconfiguration_http_controller.go, internal/controller/serverbootconfiguration_http_controller_test.go
Add RegistryValidator field; use ParseImageReference/FindManifestByArchitecture for UKI resolution; patch status to Error on UKI URL failures; tests use MockImageRef.
Controller suite / tests
internal/controller/suite_test.go
Add mock registry, set ALLOWED_REGISTRIES, add MockImageRef, disable metrics bind, wire RegistryValidator into reconcilers, use runtime.GOARCH.
Registry validation
internal/registry/validation.go, internal/registry/validation_test.go
New Validator (NewValidator, ExtractRegistryDomain, IsRegistryAllowed, ValidateImageRegistry) reading ALLOWED_/BLOCKED_REGISTRIES with comprehensive tests.
Image proxy server
server/imageproxyserver.go
Refactor to unified Docker-registry proxy with auth detection/caching (AuthMethod, RegistryInfo), bearer-token support, request director/response modifier, include RegistryDomain in ImageDetails, enforce validator.IsRegistryAllowed; remove GHCR/Keppel-specific branches.
Test registry mock
test/registry/registry.go
Add in-memory OCI registry MockRegistry with manifests/blobs/indexes, httptest server, and PushPXEImage/PushHTTPImage helpers.
Dependencies
go.mod
Add github.com/distribution/reference and make github.com/opencontainers/go-digest a direct dependency.

Sequence Diagram(s)

sequenceDiagram
    participant Client
    participant Controller as PXE/HTTP<br/>Reconciler
    participant Validator as Registry<br/>Validator
    participant Registry as OCI<br/>Registry

    Client->>Controller: Trigger reconcile with ServerBootConfig (imageRef)
    Controller->>Validator: ValidateImageRegistry(imageRef)
    Validator->>Validator: ExtractRegistryDomain & check ALLOWED/BLOCKED
    alt Registry allowed
        Validator-->>Controller: nil
        Controller->>Registry: Resolve manifests (FindManifestByArchitecture)
        Registry-->>Controller: Manifest / blobs
        Controller-->>Client: Update status / boot details
    else Registry denied
        Validator-->>Controller: error
        Controller-->>Client: Reconcile fails (validation error)
    end
Loading
sequenceDiagram
    participant HTTPClient as Client
    participant ProxyServer as ImageProxy
    participant Cache as RegistryCache
    participant TargetRegistry as OCI<br/>Registry

    HTTPClient->>ProxyServer: Request image resource (imageRef)
    ProxyServer->>ProxyServer: Extract RegistryDomain & Repository
    ProxyServer->>ProxyServer: Validator.IsRegistryAllowed(registry)
    alt Registry denied
        ProxyServer-->>HTTPClient: 403 Forbidden
    else Registry allowed
        ProxyServer->>Cache: getOrDetectRegistry(registry, repo)
        alt cached
            Cache-->>ProxyServer: RegistryInfo
        else not cached
            ProxyServer->>TargetRegistry: Probe /v2/ (HEAD/GET)
            TargetRegistry-->>ProxyServer: 401 + WWW-Authenticate
            ProxyServer->>ProxyServer: parse auth (Bearer/None)
            alt Bearer
                ProxyServer->>TargetRegistry: Request token
                TargetRegistry-->>ProxyServer: Bearer token
            end
            Cache->>Cache: store RegistryInfo
            Cache-->>ProxyServer: RegistryInfo
        end
        ProxyServer->>TargetRegistry: Proxy request (with token if any)
        TargetRegistry-->>ProxyServer: manifest/blob
        ProxyServer->>ProxyServer: modify response (rewrite headers, follow redirects)
        ProxyServer-->>HTTPClient: proxied image data
    end
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Suggested labels

enhancement, size/L

Suggested reviewers

  • hardikdr
🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 55.26% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Title check ✅ Passed The title concisely describes the main change: making imageproxyserver handle registry names beyond GHCR and Keppel through a generic handler approach.
Description check ✅ Passed The description fully details proposed changes, implementation layers, new components, and clearly links to issue #275, following the template structure with comprehensive coverage.
Linked Issues check ✅ Passed The PR implementation directly addresses issue #275 by replacing registry-specific handlers (handleGHCR, handleKeppel) with generic handleDockerRegistry() supporting any OCI-compliant registry.
Out of Scope Changes check ✅ Passed All changes align with the stated objectives: registry validation package, controller validations, imageproxy refactoring, test infrastructure, and helper utilities—no unrelated modifications detected.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch 275-make-imageproxyserver-handle-other-registry-names

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 10

🧹 Nitpick comments (3)
internal/controller/suite_test.go (1)

109-113: Duplicate AddToScheme call — likely a merge artifact.

bootv1alpha1.AddToScheme(scheme.Scheme) is called on line 109 and again on lines 112–113. The second call is redundant (AddToScheme is idempotent) but should be removed to avoid confusion.

Suggested fix
 	Expect(bootv1alpha1.AddToScheme(scheme.Scheme)).NotTo(HaveOccurred())
 	Expect(metalv1alpha1.AddToScheme(scheme.Scheme)).NotTo(HaveOccurred())
-
-	err = bootv1alpha1.AddToScheme(scheme.Scheme)
-	Expect(err).NotTo(HaveOccurred())
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/controller/suite_test.go` around lines 109 - 113, Remove the
redundant bootv1alpha1.AddToScheme call and its associated err/Expect block: the
earlier Expect(bootv1alpha1.AddToScheme(scheme.Scheme)).NotTo(HaveOccurred())
already registers the scheme, so delete the later err =
bootv1alpha1.AddToScheme(scheme.Scheme) and Expect(err).NotTo(HaveOccurred())
lines to avoid the duplicate call and unused err variable while keeping the
initial AddToScheme checks (and the metalv1alpha1.AddToScheme) intact.
test/registry/registry.go (2)

134-147: architecture parameter is accepted but unused — no multi-arch index is created.

Both PushPXEImage and PushPXEImageOldFormat accept an architecture string but pushPXEManifest stores only a flat manifest, not an ocispec.Index with platform entries. This means the image index / multi-arch code path in getUKIDigestFromNestedManifest (checking MediaTypeImageIndex and platform.Architecture) is never exercised by tests.

If multi-arch testing is planned, consider creating an ocispec.Index wrapping per-arch manifests. If not needed now, at minimum drop the unused parameter to avoid confusion.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/registry/registry.go` around lines 134 - 147, PushPXEImage and
PushPXEImageOldFormat accept an unused architecture parameter which prevents
exercising the multi-arch code path in getUKIDigestFromNestedManifest; either
remove the architecture parameter from these methods and all callers, or modify
MockRegistry.pushPXEManifest (called by PushPXEImage/PushPXEImageOldFormat) to
create an ocispec.Index that wraps per-architecture manifests instead of a flat
manifest: for each architecture build a manifest for
MediaTypeKernel/MediaTypeInitrd (or old types), add them to an ocispec.Index
with platform entries setting platform.Architecture to the provided architecture
string, and store the index so getUKIDigestFromNestedManifest can detect
MediaTypeImageIndex and platform.Architecture.

215-228: Digest is computed for a zero-value manifest on tag-miss before the existence check.

When the tag lookup on line 218 misses, manifest is a zero-value ocispec.Manifest. Lines 220–222 marshal and hash it before the !exists check on line 225. Moving the existence check earlier avoids unnecessary work and makes the logic clearer.

Suggested fix
 	} else {
 		// Look up by tag
 		imageRef := fmt.Sprintf("%s:%s", name, reference)
 		manifest, exists = r.manifests[imageRef]
+		if !exists {
+			http.Error(w, "manifest not found", http.StatusNotFound)
+			return
+		}
 		// Calculate digest for Content-Digest header
 		manifestBytes, _ := json.Marshal(manifest)
 		dgst := digest.FromBytes(manifestBytes)
 		manifestDigest = dgst.String()
 	}
 
-	if !exists {
-		http.Error(w, "manifest not found", http.StatusNotFound)
-		return
-	}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/registry/registry.go` around lines 215 - 228, Move the existence check
above the manifest marshaling and digest computation to avoid hashing a
zero-value manifest when the tag lookup fails: after resolving manifest and
exists from r.manifests (the imageRef branch in the registry handler),
immediately return the 404 if !exists, and only then perform
json.Marshal(manifest), digest.FromBytes(manifestBytes) and set manifestDigest;
update references to manifestBytes/manifestDigest accordingly so they are only
computed when manifest actually exists.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@internal/controller/serverbootconfiguration_http_controller.go`:
- Around line 221-227: The current nil-check in getUKIDigestFromNestedManifest
silently skips registry validation; change this to a fail-closed behavior by
returning an error if r.RegistryValidator is nil instead of proceeding, and
otherwise call r.RegistryValidator.ValidateImageRegistry(imageRef) and propagate
any error (use fmt.Errorf to wrap it); reference the
ServerBootConfigurationHTTPReconciler type, its RegistryValidator field, and the
ValidateImageRegistry method for locating the code; if you intentionally want
backward compatibility instead, add a clear comment above the nil check
explaining the rationale and intended rollout plan.
- Around line 202-218: The image ref parsing in constructUKIURL mis-splits
references with registry ports; replace the fragile LastIndex(":") logic with a
proper OCI reference parser (e.g., github.com/distribution/reference's
ParseNormalizedNamed / Reference.Parse) to reliably extract name and tag/digest,
or at minimum validate that the substring after the last ':' does not contain
'/' (treating it as a port) before using it as the tag; update constructUKIURL
to produce correct imageName and imageVersion values used when calling
getUKIDigestFromNestedManifest and when formatting ukiURL with r.ImageServerURL.

In `@internal/controller/serverbootconfiguration_pxe_controller.go`:
- Around line 226-233: The current tag parsing uses only strings.LastIndex on
config.Spec.Image which misparses registry-port forms like
"localhost:5000/repo"; update the logic to require the tag-colon to appear after
the last '/' (i.e. find lastSlash := strings.LastIndex(config.Spec.Image, "/")
and then find a colon after that position), and only split into imageName and
imageVersion when that colon exists and is after lastSlash; otherwise return the
"invalid image format: missing tag" error. Reference the symbols
config.Spec.Image, lastColon, imageName, imageVersion (and add lastSlash) when
implementing the fix.

In `@internal/controller/suite_test.go`:
- Around line 164-169: The HTTP reconciler instantiation is missing the
Architecture field, so set Architecture: runtime.GOARCH on the
ServerBootConfigurationHTTPReconciler (the same way it's set for the PXE
reconciler) before calling SetupWithManager; update the object literal that
constructs ServerBootConfigurationHTTPReconciler (fields: Client, Scheme,
ImageServerURL, RegistryValidator) to include Architecture: runtime.GOARCH so
r.Architecture is populated and multi-arch image index matching works.

In `@internal/registry/validation_test.go`:
- Around line 151-158: The subtests are currently only calling
t.Setenv("ALLOWED_REGISTRIES", tt.allowList) and t.Setenv("BLOCKED_REGISTRIES",
tt.blockList) when those values are non-empty, which lets ambient env values
leak into “empty config” cases; always call t.Setenv for both keys inside each
t.Run regardless of tt.allowList/tt.blockList so empty strings are explicitly
set, and apply the same unconditional t.Setenv changes to the other loop (the
one around lines 250–257) to prevent flakiness.

In `@internal/registry/validation.go`:
- Around line 27-41: The registry extraction functions (notably
ExtractRegistryDomain) currently return different strings for Docker Hub
equivalents (e.g., "registry-1.docker.io" vs "docker.io"), which breaks
exact-match policy checks; update ExtractRegistryDomain and the other
registry-extraction helpers in this file (the blocks around lines 44-55 and
59-66) to normalize aliases to a single canonical value (choose one, e.g.,
"docker.io") and apply normalization after computing potentialRegistry (also
handle hostless references by returning the canonical docker.io), ensure you
lowercase the result and map any known aliases like "registry-1.docker.io" and
"index.docker.io" to the chosen canonical name before returning so policy
evaluation uses a single consistent registry identifier.

In `@server/imageproxyserver.go`:
- Around line 277-281: Replace the current placeholder in the AuthBasic branch
(the "case AuthBasic" block in server/imageproxyserver.go) so the proxy reads
credentials from the configured Kubernetes Secret for the registry, builds an
Authorization header with "Basic " + base64(username:password), injects that
header into the outbound registry request, and proceeds instead of returning
StatusNotImplemented; log the registry domain and only return an error (401/502)
if the secret is missing or malformed. Ensure you reference the same secret
name/config used elsewhere in the service, decode the secret values safely,
create the header before the HTTP request is made, and preserve existing logging
semantics (use log.Info/log.Error) when credentials are missing or invalid.
- Around line 89-93: The registry auth cache is keyed only by registry so
different repository-scoped token URLs (built from realm, service, scope
variables) can collide; update the caching logic to key tokens by the full token
URL or include the repository/service/scope in the cache key instead of just the
registry. Locate the code that constructs the token request URL (uses realm,
service, scope) and the registry auth cache lookup/insert and change the key to
either the formatted URL returned by the token URL builder or a composite key of
registry+service+scope to ensure tokens are cached per scope.
- Around line 315-319: The buildModifyResponse function is leaking upstream
registry bearer tokens by copying bearerToken into response headers; remove the
resp.Header.Set("Authorization", "Bearer "+bearerToken) behavior and ensure any
Authorization header from the upstream response is removed
(resp.Header.Del("Authorization")) so client responses never contain registry
credentials; update buildModifyResponse to stop injecting the bearerToken and
strip Authorization before returning.
- Around line 97-103: The outbound HTTP calls in detectRegistryAuth (the initial
http.Get to targetURL), the subsequent http.Get(tokenURL), and the ad-hoc
&http.Client{} used before client.Do() must use explicit timeouts; replace plain
http.Get usages and the bare http.Client with a shared configured *http.Client
that has a sensible Timeout (or use a request-specific context with deadline)
and use that client for GET and Do calls (reference detectRegistryAuth, tokenURL
usage, and the client variable) so all external HTTP requests cannot hang
indefinitely.

---

Nitpick comments:
In `@internal/controller/suite_test.go`:
- Around line 109-113: Remove the redundant bootv1alpha1.AddToScheme call and
its associated err/Expect block: the earlier
Expect(bootv1alpha1.AddToScheme(scheme.Scheme)).NotTo(HaveOccurred()) already
registers the scheme, so delete the later err =
bootv1alpha1.AddToScheme(scheme.Scheme) and Expect(err).NotTo(HaveOccurred())
lines to avoid the duplicate call and unused err variable while keeping the
initial AddToScheme checks (and the metalv1alpha1.AddToScheme) intact.

In `@test/registry/registry.go`:
- Around line 134-147: PushPXEImage and PushPXEImageOldFormat accept an unused
architecture parameter which prevents exercising the multi-arch code path in
getUKIDigestFromNestedManifest; either remove the architecture parameter from
these methods and all callers, or modify MockRegistry.pushPXEManifest (called by
PushPXEImage/PushPXEImageOldFormat) to create an ocispec.Index that wraps
per-architecture manifests instead of a flat manifest: for each architecture
build a manifest for MediaTypeKernel/MediaTypeInitrd (or old types), add them to
an ocispec.Index with platform entries setting platform.Architecture to the
provided architecture string, and store the index so
getUKIDigestFromNestedManifest can detect MediaTypeImageIndex and
platform.Architecture.
- Around line 215-228: Move the existence check above the manifest marshaling
and digest computation to avoid hashing a zero-value manifest when the tag
lookup fails: after resolving manifest and exists from r.manifests (the imageRef
branch in the registry handler), immediately return the 404 if !exists, and only
then perform json.Marshal(manifest), digest.FromBytes(manifestBytes) and set
manifestDigest; update references to manifestBytes/manifestDigest accordingly so
they are only computed when manifest actually exists.

ℹ️ Review info

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 63d11f6 and b446367.

📒 Files selected for processing (11)
  • Dockerfile
  • cmd/main.go
  • internal/controller/serverbootconfiguration_http_controller.go
  • internal/controller/serverbootconfiguration_http_controller_test.go
  • internal/controller/serverbootconfiguration_pxe_controller.go
  • internal/controller/serverbootconfiguration_pxe_controller_test.go
  • internal/controller/suite_test.go
  • internal/registry/validation.go
  • internal/registry/validation_test.go
  • server/imageproxyserver.go
  • test/registry/registry.go

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
server/imageproxyserver.go (2)

361-369: ⚠️ Potential issue | 🟡 Minor

replaceResponse overwrites headers with Set instead of Add, losing multi-valued redirect response headers.

Line 364 uses originalResp.Header.Set(name, value) inside the inner loop over values. Since Set replaces all previous values for that key, only the last value of each multi-valued header survives. This is likely a bug carried over from before, but it's worth noting.

🐛 Proposed fix
 func replaceResponse(originalResp, redirectResp *http.Response) {
+	// Clear original headers before copying redirect headers
+	for name := range originalResp.Header {
+		originalResp.Header.Del(name)
+	}
 	for name, values := range redirectResp.Header {
 		for _, value := range values {
-			originalResp.Header.Set(name, value)
+			originalResp.Header.Add(name, value)
 		}
 	}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@server/imageproxyserver.go` around lines 361 - 369, In replaceResponse,
current header copying uses originalResp.Header.Set(name, value) inside the loop
which overwrites previous values and drops multi-valued headers; change the
header copy to use originalResp.Header.Add(name, value) when iterating
redirectResp.Header values so all header entries are preserved, while still
assigning originalResp.Body = redirectResp.Body and originalResp.StatusCode =
redirectResp.StatusCode as before.

321-326: ⚠️ Potential issue | 🟠 Major

Bearer token leaked to redirect targets via copyHeaders.

When the registry returns a 307 redirect (typically to S3/GCS/CDN for blob storage), copyHeaders(resp.Request.Header, redirectReq.Header) on line 325 copies all headers from the outbound proxy request—including Authorization: Bearer <token> set by buildDirector. This leaks registry credentials to third-party storage backends.

Standard OCI clients (e.g., Docker, containerd) strip the Authorization header when following cross-domain redirects. This proxy should do the same.

🔒 Proposed fix — strip Authorization on cross-host redirects
 			redirectReq, err := http.NewRequest("GET", location.String(), nil)
 			if err != nil {
 				return err
 			}
 			copyHeaders(resp.Request.Header, redirectReq.Header)
+			// Strip auth header when redirecting to a different host (e.g., CDN/S3)
+			if location.Host != resp.Request.URL.Host {
+				redirectReq.Header.Del("Authorization")
+			}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@server/imageproxyserver.go` around lines 321 - 326, The redirect handling
currently copies all headers from resp.Request to redirectReq via copyHeaders,
leaking Authorization set by buildDirector to third-party redirect targets;
modify the logic around redirect handling (the block that creates redirectReq
and calls copyHeaders) to detect cross-host redirects by comparing
resp.Request.URL.Host (or resp.Request.Host) with redirectReq.URL.Host (or the
parsed location.Host) and, when they differ, remove or avoid copying the
"Authorization" header (and other sensitive auth headers if present) before
making the redirected request; keep Authorization intact for same-host
redirects.
♻️ Duplicate comments (1)
internal/controller/serverbootconfiguration_http_controller.go (1)

205-209: Same fix needed here as the PXE digest/validator path.

Line 206-Line 208 has the same name:version reconstruction and nil-dereference pattern; apply the same @digest handling and fail-closed nil guard here too.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/controller/serverbootconfiguration_http_controller.go` around lines
205 - 209, In getUKIDigestFromNestedManifest, avoid reconstructing imageRef as
"name:version" that can drop an `@digest` and risk nil deref; instead detect and
preserve an existing digest (image@sha256:...) when building imageRef, and add a
nil-guard / fail-closed check after any registry manifest lookup so you return
an error if the manifest or digest is nil. Update logic around
RegistryValidator.ValidateImageRegistry and subsequent manifest handling to
prefer the `@digest` form and propagate errors when digest/manifest is missing
(mirror the fix applied in the PXE digest/validator path).
🧹 Nitpick comments (8)
test/registry/registry.go (1)

141-145: architecture parameter is unused in mock push helpers.

Line 141 and Line 149 accept architecture but ignore it, which can hide architecture-path test gaps. Either use it for index/platform fixtures or explicitly mark it unused.

♻️ Minimal clarity fix
-func (r *MockRegistry) PushPXEImage(name, tag, architecture string) error {
+func (r *MockRegistry) PushPXEImage(name, tag, _ string) error {
-func (r *MockRegistry) PushPXEImageOldFormat(name, tag, architecture string) error {
+func (r *MockRegistry) PushPXEImageOldFormat(name, tag, _ string) error {

Also applies to: 149-153

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/registry/registry.go` around lines 141 - 145, The MockRegistry push
helpers (e.g., MockRegistry.PushPXEImage) accept an architecture parameter but
never use it; either thread the architecture into the mock manifest/index
creation or explicitly mark it unused. Fix by updating MockRegistry.PushPXEImage
(and the other mock push helpers around the same area) to pass architecture into
pushPXEManifest (or alter pushPXEManifest to accept an architecture and set the
manifest/index Platform.Architecture accordingly) so test fixtures reflect the
platform, or if you prefer no behavioral change, explicitly consume the param
(e.g., _ = architecture) in each method to signal intentional omission.
internal/registry/validation_test.go (1)

68-149: Add a case-variant registry test to prevent policy regressions.

Please add a DOCKER.IO/... case (block-list mode) so case-sensitivity behavior is explicitly locked by tests.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/registry/validation_test.go` around lines 68 - 149, Add a test in
TestIsRegistryAllowed to verify case-variant matching for block-list behavior:
insert a test case into the tests slice (e.g. name "blocked with block list -
DOCKER.IO", registry "DOCKER.IO", blockList "docker.io,registry-1.docker.io",
allowList "", want false) so the test asserts that registry matching is
case-insensitive for Registry validation; modify only the tests slice in
TestIsRegistryAllowed and keep the rest of the table-driven test intact.
internal/controller/serverbootconfig_helpers.go (3)

63-77: Unnecessary double-unmarshal of manifest data.

The data is first unmarshaled into ocispec.Manifest (line 70), and if the descriptor is an index, it's immediately re-unmarshaled into ocispec.Index (line 81). Since ocispec.Index and ocispec.Manifest are different types, the first unmarshal is wasted work for the index path.

Consider deferring the unmarshal until after the media type check:

♻️ Suggested refactor
 func FindManifestByArchitecture(ctx context.Context, resolver remotes.Resolver, name string, desc ocispec.Descriptor, architecture string, enableCNAMECompat bool) (ocispec.Manifest, error) {
 	manifestData, err := fetchContent(ctx, resolver, name, desc)
 	if err != nil {
 		return ocispec.Manifest{}, fmt.Errorf("failed to fetch manifest data: %w", err)
 	}
 
-	var manifest ocispec.Manifest
-	if err := json.Unmarshal(manifestData, &manifest); err != nil {
-		return ocispec.Manifest{}, fmt.Errorf("failed to unmarshal manifest: %w", err)
-	}
-
 	// If not an index, return the manifest directly
 	if desc.MediaType != ocispec.MediaTypeImageIndex {
+		var manifest ocispec.Manifest
+		if err := json.Unmarshal(manifestData, &manifest); err != nil {
+			return ocispec.Manifest{}, fmt.Errorf("failed to unmarshal manifest: %w", err)
+		}
 		return manifest, nil
 	}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/controller/serverbootconfig_helpers.go` around lines 63 - 77,
FindManifestByArchitecture currently unmarshals manifestData into
ocispec.Manifest before checking desc.MediaType, causing a wasted unmarshal when
the descriptor is an index; change the logic to first check desc.MediaType
against ocispec.MediaTypeImageIndex (using the existing manifestData fetched by
fetchContent), and only unmarshal into ocispec.Index when it is an index and
into ocispec.Manifest when it is not; update subsequent uses of the unmarshaled
value to refer to the correctly typed variable (ocispec.Index or
ocispec.Manifest) instead of doing a double-unmarshal.

194-198: Use structured logging instead of fmt.Printf for the close error.

This is the only place in the file that uses fmt.Printf for error reporting. It bypasses the structured logger used elsewhere in the codebase. Consider accepting a logger or using slog, or at minimum switching to fmt.Fprintf(os.Stderr, ...) so it doesn't get lost in stdout.

♻️ Suggested refactor — propagate via context logger or just ignore

Since close errors on read-only streams are almost always benign, the simplest fix is to just drop the printf:

 	defer func() {
-		if cerr := reader.Close(); cerr != nil {
-			fmt.Printf("failed to close reader: %v\n", cerr)
-		}
+		_ = reader.Close()
 	}()
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/controller/serverbootconfig_helpers.go` around lines 194 - 198, The
defer that closes reader uses fmt.Printf which breaks structured logging;
replace that printf with the project's structured logger (e.g., use slog or the
function's context/logger parameter) or at minimum write to stderr—capture cerr
from reader.Close() and log it via the structured logger (e.g.,
slog.Error("failed to close reader", "err", cerr)) inside the existing defer
anonymous function that checks cerr; alternatively, if close errors are benign
here, simply drop the log call.

155-159: Passing nil as the error argument to log.Error.

log.Error(nil, "Failed to decode object into Secret", ...) violates the logr convention where the first argument should be a non-nil error or the call should use log.Info instead. Using nil makes structured log consumers (e.g., JSON loggers) emit a null error field.

♻️ Suggested fix
-	log.Error(nil, "Failed to decode object into Secret", "object", secret)
+	log.Info("Failed to decode object into Secret", "object", secret)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/controller/serverbootconfig_helpers.go` around lines 155 - 159, The
call log.Error(nil, "Failed to decode object into Secret", "object", secret) is
passing nil as the error; change it so the logging follows logr conventions:
either construct a real error (e.g., fmt.Errorf with the unexpected type) and
pass it as the first argument to log.Error, or switch to log.Info with the same
message and context. Update the call around secret/secretObj handling (the block
that does secretObj, ok := secret.(*corev1.Secret)) to use one of these two
approaches and include the offending value (secret) in the structured context.
server/imageproxyserver.go (3)

96-101: New http.Client allocated per call to detectRegistryAuth and getBearerToken.

Lines 99 and 163 each create a fresh http.Client. Since http.Client manages an internal connection pool via its Transport, creating one per call defeats connection reuse. Extract a shared package-level client with the desired timeout.

♻️ Suggested refactor
+var httpClient = &http.Client{Timeout: 30 * time.Second}
+
 func detectRegistryAuth(registryDomain, repository string) (*RegistryInfo, error) {
 	targetURL := fmt.Sprintf("https://%s/v2/", registryDomain)
-	client := &http.Client{
-		Timeout: 30 * time.Second,
-	}
-	resp, err := client.Get(targetURL)
+	resp, err := httpClient.Get(targetURL)

Apply the same change in getBearerToken and buildModifyResponse.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@server/imageproxyserver.go` around lines 96 - 101, detectRegistryAuth,
getBearerToken and buildModifyResponse each allocate a new http.Client which
prevents connection reuse; instead declare a package-level shared *http.Client
with the desired Timeout and reuse it in those functions. Replace local client
:= &http.Client{Timeout: ...} in detectRegistryAuth, getBearerToken and
buildModifyResponse with references to the shared client (e.g. reuseHttpClient
or defaultHTTPClient), initialize that variable once (with Timeout set) at
package scope (or in init()) and ensure all HTTP calls use it so the
Transport/connection pool is reused.

64-76: extractParam may match substrings of other parameter names.

If the header contains a parameter whose name ends with the search term (e.g., a hypothetical "somerealm"), the strings.Index approach would produce a false match. In practice, standard WWW-Authenticate headers use well-defined parameter names (realm, service, scope), so this is low risk—but a more robust approach would prefix-match with a delimiter (comma/space).

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@server/imageproxyserver.go` around lines 64 - 76, The extractParam function
can false-match parameter names that end with the search term; update
extractParam to only match param when it's a full name by checking the character
before the match is either start-of-string or a delimiter (comma, space) and
that the match is followed by =" (i.e., look for patterns like 'param="' with a
proper delimiter prefix), then extract the quoted value as before; reference the
extractParam(header, param string) function and ensure you still handle missing
end-quote gracefully.

55-57: Registry cache grows unboundedly — no TTL or size limit.

registryCache is a plain map with no eviction. In long-running deployments serving many distinct registry/repository combinations, this will grow indefinitely. Consider adding a TTL-based expiry (e.g., using a simple timestamp check) or switching to an LRU cache. A TTL also ensures that auth method changes at the registry are eventually picked up.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@server/imageproxyserver.go` around lines 55 - 57, The registryCache map
currently grows unbounded; modify the caching logic around
registryCache/registryCacheMutex and RegistryInfo to enforce eviction—either add
a lastChecked/expiry timestamp field to RegistryInfo and on Get/Set check and
remove expired entries (TTL-based), or replace registryCache with an LRU
structure (e.g., container/list + map) to cap size and evict least-recently-used
entries; ensure all accesses still use registryCacheMutex for concurrency and
that on cache miss or expired entry you re-probe and update the cache with a
refreshed timestamp.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@internal/controller/serverbootconfiguration_pxe_controller.go`:
- Around line 226-230: The code in getLayerDigestsFromNestedManifest currently
builds imageRef with fmt.Sprintf("%s:%s", imageName, imageVersion) which breaks
when imageVersion is a digest (must use '@' for canonical digests) and also
calls r.RegistryValidator.ValidateImageRegistry without checking
r.RegistryValidator for nil; update getLayerDigestsFromNestedManifest to detect
whether imageVersion is a digest (e.g., strings.HasPrefix(imageVersion,
"sha256:") or using a ParseImageReference helper) and build imageRef using
"%s@%s" for digests and "%s:%s" for tags, and guard the validator call by first
checking if r.RegistryValidator != nil before invoking ValidateImageRegistry
(return a clear error if nil); apply the same pattern to the analogous function
in serverbootconfiguration_http_controller.go that performs registry validation.

In `@internal/registry/validation.go`:
- Around line 49-55: Normalize the input to lowercase before doing the switch
comparison in normalizeDockerHubDomain: call strings.ToLower on the incoming
domain at the start of the function, then switch on that lowercase value and
return the normalized "docker.io" as before; also apply the same lowercase
normalization to the other domain/registry normalization helper in this file
(the other function that normalizes registry domains) so all policy checks use
case-insensitive matching.

In `@server/imageproxyserver.go`:
- Around line 283-284: The code ignores url.Parse errors when creating proxyURL
from targetURL, which can yield a nil proxyURL and cause a panic in
buildDirector(proxyURL, ...) and proxy.ServeHTTP; update the call that sets
proxyURL (currently "proxyURL, _ := url.Parse(targetURL)") to handle the error:
call url.Parse and if it returns a non-nil error, log or return an HTTP error
(or propagate) and avoid calling buildDirector/proxy.ServeHTTP with a nil URL;
ensure you reference and check proxyURL and the error before using buildDirector
and proxy.ServeHTTP so malformed registryDomain/targetURL cannot trigger a
nil-pointer panic.
- Line 310: The bearerToken parameter on buildModifyResponse is now unused;
remove it from the function signature and all call sites to eliminate dead code.
Update the function declaration buildModifyResponse(...) to take no parameters
and adjust where it's invoked (e.g., the call at the former line 288) to stop
passing bearerToken; ensure any closures or referenced variables inside
buildModifyResponse (if any) still compile after the removal. Run a quick
build/test to confirm no remaining references to bearerToken in
buildModifyResponse or its callers.

---

Outside diff comments:
In `@server/imageproxyserver.go`:
- Around line 361-369: In replaceResponse, current header copying uses
originalResp.Header.Set(name, value) inside the loop which overwrites previous
values and drops multi-valued headers; change the header copy to use
originalResp.Header.Add(name, value) when iterating redirectResp.Header values
so all header entries are preserved, while still assigning originalResp.Body =
redirectResp.Body and originalResp.StatusCode = redirectResp.StatusCode as
before.
- Around line 321-326: The redirect handling currently copies all headers from
resp.Request to redirectReq via copyHeaders, leaking Authorization set by
buildDirector to third-party redirect targets; modify the logic around redirect
handling (the block that creates redirectReq and calls copyHeaders) to detect
cross-host redirects by comparing resp.Request.URL.Host (or resp.Request.Host)
with redirectReq.URL.Host (or the parsed location.Host) and, when they differ,
remove or avoid copying the "Authorization" header (and other sensitive auth
headers if present) before making the redirected request; keep Authorization
intact for same-host redirects.

---

Duplicate comments:
In `@internal/controller/serverbootconfiguration_http_controller.go`:
- Around line 205-209: In getUKIDigestFromNestedManifest, avoid reconstructing
imageRef as "name:version" that can drop an `@digest` and risk nil deref; instead
detect and preserve an existing digest (image@sha256:...) when building
imageRef, and add a nil-guard / fail-closed check after any registry manifest
lookup so you return an error if the manifest or digest is nil. Update logic
around RegistryValidator.ValidateImageRegistry and subsequent manifest handling
to prefer the `@digest` form and propagate errors when digest/manifest is missing
(mirror the fix applied in the PXE digest/validator path).

---

Nitpick comments:
In `@internal/controller/serverbootconfig_helpers.go`:
- Around line 63-77: FindManifestByArchitecture currently unmarshals
manifestData into ocispec.Manifest before checking desc.MediaType, causing a
wasted unmarshal when the descriptor is an index; change the logic to first
check desc.MediaType against ocispec.MediaTypeImageIndex (using the existing
manifestData fetched by fetchContent), and only unmarshal into ocispec.Index
when it is an index and into ocispec.Manifest when it is not; update subsequent
uses of the unmarshaled value to refer to the correctly typed variable
(ocispec.Index or ocispec.Manifest) instead of doing a double-unmarshal.
- Around line 194-198: The defer that closes reader uses fmt.Printf which breaks
structured logging; replace that printf with the project's structured logger
(e.g., use slog or the function's context/logger parameter) or at minimum write
to stderr—capture cerr from reader.Close() and log it via the structured logger
(e.g., slog.Error("failed to close reader", "err", cerr)) inside the existing
defer anonymous function that checks cerr; alternatively, if close errors are
benign here, simply drop the log call.
- Around line 155-159: The call log.Error(nil, "Failed to decode object into
Secret", "object", secret) is passing nil as the error; change it so the logging
follows logr conventions: either construct a real error (e.g., fmt.Errorf with
the unexpected type) and pass it as the first argument to log.Error, or switch
to log.Info with the same message and context. Update the call around
secret/secretObj handling (the block that does secretObj, ok :=
secret.(*corev1.Secret)) to use one of these two approaches and include the
offending value (secret) in the structured context.

In `@internal/registry/validation_test.go`:
- Around line 68-149: Add a test in TestIsRegistryAllowed to verify case-variant
matching for block-list behavior: insert a test case into the tests slice (e.g.
name "blocked with block list - DOCKER.IO", registry "DOCKER.IO", blockList
"docker.io,registry-1.docker.io", allowList "", want false) so the test asserts
that registry matching is case-insensitive for Registry validation; modify only
the tests slice in TestIsRegistryAllowed and keep the rest of the table-driven
test intact.

In `@server/imageproxyserver.go`:
- Around line 96-101: detectRegistryAuth, getBearerToken and buildModifyResponse
each allocate a new http.Client which prevents connection reuse; instead declare
a package-level shared *http.Client with the desired Timeout and reuse it in
those functions. Replace local client := &http.Client{Timeout: ...} in
detectRegistryAuth, getBearerToken and buildModifyResponse with references to
the shared client (e.g. reuseHttpClient or defaultHTTPClient), initialize that
variable once (with Timeout set) at package scope (or in init()) and ensure all
HTTP calls use it so the Transport/connection pool is reused.
- Around line 64-76: The extractParam function can false-match parameter names
that end with the search term; update extractParam to only match param when it's
a full name by checking the character before the match is either start-of-string
or a delimiter (comma, space) and that the match is followed by =" (i.e., look
for patterns like 'param="' with a proper delimiter prefix), then extract the
quoted value as before; reference the extractParam(header, param string)
function and ensure you still handle missing end-quote gracefully.
- Around line 55-57: The registryCache map currently grows unbounded; modify the
caching logic around registryCache/registryCacheMutex and RegistryInfo to
enforce eviction—either add a lastChecked/expiry timestamp field to RegistryInfo
and on Get/Set check and remove expired entries (TTL-based), or replace
registryCache with an LRU structure (e.g., container/list + map) to cap size and
evict least-recently-used entries; ensure all accesses still use
registryCacheMutex for concurrency and that on cache miss or expired entry you
re-probe and update the cache with a refreshed timestamp.

In `@test/registry/registry.go`:
- Around line 141-145: The MockRegistry push helpers (e.g.,
MockRegistry.PushPXEImage) accept an architecture parameter but never use it;
either thread the architecture into the mock manifest/index creation or
explicitly mark it unused. Fix by updating MockRegistry.PushPXEImage (and the
other mock push helpers around the same area) to pass architecture into
pushPXEManifest (or alter pushPXEManifest to accept an architecture and set the
manifest/index Platform.Architecture accordingly) so test fixtures reflect the
platform, or if you prefer no behavioral change, explicitly consume the param
(e.g., _ = architecture) in each method to signal intentional omission.

ℹ️ Review info

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between b446367 and bc5bd40.

⛔ Files ignored due to path filters (1)
  • go.sum is excluded by !**/*.sum
📒 Files selected for processing (10)
  • cmd/main.go
  • go.mod
  • internal/controller/serverbootconfig_helpers.go
  • internal/controller/serverbootconfiguration_http_controller.go
  • internal/controller/serverbootconfiguration_pxe_controller.go
  • internal/controller/suite_test.go
  • internal/registry/validation.go
  • internal/registry/validation_test.go
  • server/imageproxyserver.go
  • test/registry/registry.go

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
server/imageproxyserver.go (1)

311-333: ⚠️ Potential issue | 🟠 Major

Authorization header leaked to redirect targets.

When a registry returns a 307 redirect (commonly to cloud storage like S3/GCS/Azure Blob), copyHeaders on line 325 forwards all original request headers — including Authorization: Bearer <token> — to the storage backend. Standard HTTP clients strip the Authorization header on cross-origin redirects for this reason.

This leaks registry credentials to third-party storage providers.

🔒 Proposed fix: strip Authorization before following redirect
 			redirectReq, err := http.NewRequest("GET", location.String(), nil)
 			if err != nil {
 				return err
 			}
 			copyHeaders(resp.Request.Header, redirectReq.Header)
+			// Strip auth header on cross-origin redirect to prevent credential leakage
+			if redirectReq.URL.Host != resp.Request.URL.Host {
+				redirectReq.Header.Del("Authorization")
+			}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@server/imageproxyserver.go` around lines 311 - 333, The redirect handler
forwards all original headers via copyHeaders, leaking credentials; after
creating redirectReq and calling copyHeaders(resp.Request.Header,
redirectReq.Header) in the anonymous redirect function, explicitly remove
sensitive auth headers (e.g., redirectReq.Header.Del("Authorization") and
redirectReq.Header.Del("Proxy-Authorization")) before client.Do, so that
replaceResponse and copyHeaders remain unchanged but no credentials are sent to
the redirect target.
♻️ Duplicate comments (1)
server/imageproxyserver.go (1)

283-284: Ignored error from url.Parse can cause a nil-pointer panic.

proxyURL, _ := url.Parse(targetURL) silently discards the error. If parsing fails, proxyURL is nil and the subsequent buildDirector / proxy.ServeHTTP calls will panic.

🐛 Proposed fix
-	proxyURL, _ := url.Parse(targetURL)
+	proxyURL, err := url.Parse(targetURL)
+	if err != nil {
+		http.Error(w, "Internal Server Error", http.StatusInternalServerError)
+		log.Error(err, "Failed to parse target URL", "targetURL", targetURL)
+		return
+	}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@server/imageproxyserver.go` around lines 283 - 284, The code currently
ignores the error returned by url.Parse when creating proxyURL from targetURL,
which can leave proxyURL nil and cause a panic in buildDirector /
proxy.ServeHTTP; update the block around targetURL and url.Parse to capture the
error, check if err != nil, log or write an appropriate HTTP error response (or
return) and avoid calling buildDirector or proxy.ServeHTTP with a nil proxyURL;
reference the variables and functions targetURL, url.Parse, proxyURL,
buildDirector, and proxy.ServeHTTP when making the change.
🧹 Nitpick comments (6)
server/imageproxyserver.go (4)

185-215: RunImageProxyServer uses http.DefaultServeMux, which is a global.

Using http.HandleFunc registers on the default mux. If any other code in the process also registers handlers (e.g., health checks, metrics), they'll share the same mux, which can cause routing conflicts. Consider using a dedicated http.ServeMux:

♻️ Proposed refactor
 func RunImageProxyServer(imageProxyServerAddr string, k8sClient client.Client, validator *registry.Validator, log logr.Logger) {
+	mux := http.NewServeMux()
-	http.HandleFunc("/image", func(w http.ResponseWriter, r *http.Request) {
+	mux.HandleFunc("/image", func(w http.ResponseWriter, r *http.Request) {
 		// ...
 	})
-	http.HandleFunc("/httpboot/", func(w http.ResponseWriter, r *http.Request) {
+	mux.HandleFunc("/httpboot/", func(w http.ResponseWriter, r *http.Request) {
 		// ...
 	})
 
 	log.Info("Starting image proxy server", "address", imageProxyServerAddr)
-	if err := http.ListenAndServe(imageProxyServerAddr, nil); err != nil {
+	if err := http.ListenAndServe(imageProxyServerAddr, mux); err != nil {
 		// ...
 	}
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@server/imageproxyserver.go` around lines 185 - 215, RunImageProxyServer
currently registers routes with the global http.DefaultServeMux via
http.HandleFunc which can lead to conflicts; instead create a dedicated
http.ServeMux inside RunImageProxyServer, register the "/image" and "/httpboot/"
handlers on that mux (replacing http.HandleFunc calls), and pass the mux to
http.ListenAndServe(imageProxyServerAddr, mux) so the server uses the local
ServeMux and avoids interfering with other global handlers.

96-106: Consider sharing a single http.Client instance across functions.

Three separate http.Client instances are created (lines 99, 163, 318) with the same 30s timeout. Allocating per-call prevents connection reuse and increases GC pressure. A package-level shared client would be more efficient.

♻️ Proposed refactor
+var httpClient = &http.Client{Timeout: 30 * time.Second}
+
 func detectRegistryAuth(registryDomain, repository string) (*RegistryInfo, error) {
 	targetURL := fmt.Sprintf("https://%s/v2/", registryDomain)
-	client := &http.Client{
-		Timeout: 30 * time.Second,
-	}
-	resp, err := client.Get(targetURL)
+	resp, err := httpClient.Get(targetURL)

Apply the same change to getBearerToken and buildModifyResponse.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@server/imageproxyserver.go` around lines 96 - 106, The code creates separate
http.Client instances in detectRegistryAuth (client := &http.Client{Timeout: 30
* time.Second}) and similarly in getBearerToken and buildModifyResponse; replace
these per-call clients with a single package-level reusable http.Client (e.g.,
httpClient) configured with the 30s Timeout (and optionally a shared Transport
to enable keep-alives) and update detectRegistryAuth, getBearerToken, and
buildModifyResponse to use that shared httpClient to allow connection reuse and
reduce allocations/GC pressure.

55-57: Unbounded registry cache with no TTL or eviction.

registryCache grows indefinitely. For a long-running server handling many distinct image references, this leaks memory. Additionally, cached TokenURL values become permanently stale if a registry's auth configuration changes.

Consider adding a TTL-based eviction (e.g., using a simple timestamp check or a library like patrickmn/go-cache), or at minimum cap the cache size.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@server/imageproxyserver.go` around lines 55 - 57, The registryCache map (and
registryCacheMutex) is unbounded and leaks memory and can serve stale TokenURL
values; update the caching to include TTL-based eviction or a max size: add a
timestamp field (e.g., LastSeen or ExpiresAt) to RegistryInfo, check that
timestamp in every lookup path (where registryCache is read) and treat expired
entries as missing so they are refreshed, and implement either a simple cleanup
goroutine that periodically removes expired entries from registryCache or
replace the map with a TTL cache (e.g., patrickmn/go-cache) and use that in
place of registryCache while still protecting concurrent access with
registryCacheMutex if required. Ensure TokenURL is refreshed on cache
miss/expiry and that additions set the timestamp/TTL when storing in
registryCache.

65-76: extractParam does not handle edge cases in WWW-Authenticate parsing.

If the header contains multiple parameters with overlapping names (e.g., realm appearing in a value string), or uses single quotes, or has URL-encoded quotes inside values, this parser could extract incorrect values. For the common case of well-behaved OCI registries, this works fine, but be aware of the fragility.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@server/imageproxyserver.go` around lines 65 - 76, The extractParam function
is fragile: it naively searches for param+"=\"" and can be fooled by the param
name appearing inside a quoted value, single quotes, escaped/URL-encoded quotes,
or overlapping names. Replace the simple index-based logic in extractParam with
a robust token parser that scans the header left-to-right, splits parameters on
unquoted commas, recognizes both double- and single-quoted values, handles
escaped quotes inside quoted values (and URL-decoded contents if necessary), and
returns the matching parameter's unescaped value (first match). Ensure the
parser ignores occurrences of the param name inside quoted values and supports
names that are substrings of other names. Use the function name extractParam to
locate and update the implementation.
internal/controller/serverbootconfig_helpers.go (2)

203-207: Use structured logging instead of fmt.Printf for the close error.

The rest of the codebase uses logr-based structured logging. fmt.Printf here is inconsistent and the output won't be captured by the logging framework.

Since fetchContent doesn't receive a logger, consider either passing a context-derived logger, or simply ignoring the close error (common for read-only streams):

♻️ Simpler alternative: silently discard close error
 	defer func() {
-		if cerr := reader.Close(); cerr != nil {
-			fmt.Printf("failed to close reader: %v\n", cerr)
-		}
+		_ = reader.Close()
 	}()
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/controller/serverbootconfig_helpers.go` around lines 203 - 207, The
defer currently prints close errors with fmt.Printf in fetchContent (defer
calling reader.Close()), which is inconsistent with the logr-based logging;
replace that fmt.Printf by either (a) removing/logically ignoring the close
error for read-only streams (i.e., drop the cerr check and just call
reader.Close()), or (b) if you prefer to surface the error, accept a
context-derived logger or a logr.Logger into fetchContent and call
logger.Error(cerr, "failed to close reader") instead; update the defer around
reader.Close() and any function signature changes accordingly.

72-134: FindManifestByArchitecture cleanly consolidates manifest resolution logic.

Good extraction of previously duplicated manifest traversal code from both PXE and HTTP controllers. The CNAME compatibility path is gated behind enableCNAMECompat, keeping HTTP and PXE behavior distinct.

One observation: the first json.Unmarshal into ocispec.Manifest (line 79) for index descriptors will succeed but produce a partially meaningless struct (since index JSON doesn't match manifest schema). This is harmless because the result is discarded when desc.MediaType == ocispec.MediaTypeImageIndex, but a brief comment would clarify intent.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/controller/serverbootconfig_helpers.go` around lines 72 - 134, Add a
short clarifying comment in FindManifestByArchitecture explaining why we first
unmarshal manifestData into an ocispec.Manifest even when desc.MediaType may be
ocispec.MediaTypeImageIndex: note that the initial json.Unmarshal into the
variable manifest is safe and may yield a partial/unused struct for index
payloads, and that the real index parsing happens later into indexManifest when
desc.MediaType == ocispec.MediaTypeImageIndex; place the comment immediately
above the first json.Unmarshal call (the manifest variable) so future readers
understand the intentional, harmless extra unmarshal.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@internal/controller/serverbootconfiguration_pxe_controller.go`:
- Around line 219-221: The URL query values for kernelURL, initrdURL, and
squashFSURL are built by interpolating imageName and imageVersion (and the
digests) without escaping, which can produce malformed URLs; update the
construction to escape query parameter values (e.g., use url.QueryEscape or
build the query with url.Values / url.URL.Query()) so r.IPXEServiceURL remains
the base and imageName, imageVersion, kernelDigest, initrdDigest, squashFSDigest
are added as properly encoded query parameters when creating kernelURL,
initrdURL and squashFSURL.

---

Outside diff comments:
In `@server/imageproxyserver.go`:
- Around line 311-333: The redirect handler forwards all original headers via
copyHeaders, leaking credentials; after creating redirectReq and calling
copyHeaders(resp.Request.Header, redirectReq.Header) in the anonymous redirect
function, explicitly remove sensitive auth headers (e.g.,
redirectReq.Header.Del("Authorization") and
redirectReq.Header.Del("Proxy-Authorization")) before client.Do, so that
replaceResponse and copyHeaders remain unchanged but no credentials are sent to
the redirect target.

---

Duplicate comments:
In `@server/imageproxyserver.go`:
- Around line 283-284: The code currently ignores the error returned by
url.Parse when creating proxyURL from targetURL, which can leave proxyURL nil
and cause a panic in buildDirector / proxy.ServeHTTP; update the block around
targetURL and url.Parse to capture the error, check if err != nil, log or write
an appropriate HTTP error response (or return) and avoid calling buildDirector
or proxy.ServeHTTP with a nil proxyURL; reference the variables and functions
targetURL, url.Parse, proxyURL, buildDirector, and proxy.ServeHTTP when making
the change.

---

Nitpick comments:
In `@internal/controller/serverbootconfig_helpers.go`:
- Around line 203-207: The defer currently prints close errors with fmt.Printf
in fetchContent (defer calling reader.Close()), which is inconsistent with the
logr-based logging; replace that fmt.Printf by either (a) removing/logically
ignoring the close error for read-only streams (i.e., drop the cerr check and
just call reader.Close()), or (b) if you prefer to surface the error, accept a
context-derived logger or a logr.Logger into fetchContent and call
logger.Error(cerr, "failed to close reader") instead; update the defer around
reader.Close() and any function signature changes accordingly.
- Around line 72-134: Add a short clarifying comment in
FindManifestByArchitecture explaining why we first unmarshal manifestData into
an ocispec.Manifest even when desc.MediaType may be ocispec.MediaTypeImageIndex:
note that the initial json.Unmarshal into the variable manifest is safe and may
yield a partial/unused struct for index payloads, and that the real index
parsing happens later into indexManifest when desc.MediaType ==
ocispec.MediaTypeImageIndex; place the comment immediately above the first
json.Unmarshal call (the manifest variable) so future readers understand the
intentional, harmless extra unmarshal.

In `@server/imageproxyserver.go`:
- Around line 185-215: RunImageProxyServer currently registers routes with the
global http.DefaultServeMux via http.HandleFunc which can lead to conflicts;
instead create a dedicated http.ServeMux inside RunImageProxyServer, register
the "/image" and "/httpboot/" handlers on that mux (replacing http.HandleFunc
calls), and pass the mux to http.ListenAndServe(imageProxyServerAddr, mux) so
the server uses the local ServeMux and avoids interfering with other global
handlers.
- Around line 96-106: The code creates separate http.Client instances in
detectRegistryAuth (client := &http.Client{Timeout: 30 * time.Second}) and
similarly in getBearerToken and buildModifyResponse; replace these per-call
clients with a single package-level reusable http.Client (e.g., httpClient)
configured with the 30s Timeout (and optionally a shared Transport to enable
keep-alives) and update detectRegistryAuth, getBearerToken, and
buildModifyResponse to use that shared httpClient to allow connection reuse and
reduce allocations/GC pressure.
- Around line 55-57: The registryCache map (and registryCacheMutex) is unbounded
and leaks memory and can serve stale TokenURL values; update the caching to
include TTL-based eviction or a max size: add a timestamp field (e.g., LastSeen
or ExpiresAt) to RegistryInfo, check that timestamp in every lookup path (where
registryCache is read) and treat expired entries as missing so they are
refreshed, and implement either a simple cleanup goroutine that periodically
removes expired entries from registryCache or replace the map with a TTL cache
(e.g., patrickmn/go-cache) and use that in place of registryCache while still
protecting concurrent access with registryCacheMutex if required. Ensure
TokenURL is refreshed on cache miss/expiry and that additions set the
timestamp/TTL when storing in registryCache.
- Around line 65-76: The extractParam function is fragile: it naively searches
for param+"=\"" and can be fooled by the param name appearing inside a quoted
value, single quotes, escaped/URL-encoded quotes, or overlapping names. Replace
the simple index-based logic in extractParam with a robust token parser that
scans the header left-to-right, splits parameters on unquoted commas, recognizes
both double- and single-quoted values, handles escaped quotes inside quoted
values (and URL-decoded contents if necessary), and returns the matching
parameter's unescaped value (first match). Ensure the parser ignores occurrences
of the param name inside quoted values and supports names that are substrings of
other names. Use the function name extractParam to locate and update the
implementation.

ℹ️ Review info

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between bc5bd40 and 7cf6f7a.

📒 Files selected for processing (7)
  • internal/controller/serverbootconfig_helpers.go
  • internal/controller/serverbootconfig_helpers_test.go
  • internal/controller/serverbootconfiguration_http_controller.go
  • internal/controller/serverbootconfiguration_pxe_controller.go
  • internal/registry/validation.go
  • internal/registry/validation_test.go
  • server/imageproxyserver.go
🚧 Files skipped from review as they are similar to previous changes (2)
  • internal/registry/validation.go
  • internal/registry/validation_test.go

Comment on lines +219 to +221
kernelURL := fmt.Sprintf("%s/image?imageName=%s&version=%s&layerDigest=%s", r.IPXEServiceURL, imageName, imageVersion, kernelDigest)
initrdURL := fmt.Sprintf("%s/image?imageName=%s&version=%s&layerDigest=%s", r.IPXEServiceURL, imageName, imageVersion, initrdDigest)
squashFSURL := fmt.Sprintf("%s/image?imageName=%s&version=%s&layerDigest=%s", r.IPXEServiceURL, imageName, imageVersion, squashFSDigest)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

URL query parameter values are not escaped.

imageName and imageVersion are interpolated directly into the URL string. If imageName contains characters like @ (for digest refs passed through) or other special characters, the resulting URL may be malformed or misinterpreted by the image proxy server.

Use url.QueryEscape or build the URL with url.Values:

🛠️ Proposed fix using url.Values
+	params := url.Values{}
+	params.Set("imageName", imageName)
+	params.Set("version", imageVersion)
+
-	kernelURL := fmt.Sprintf("%s/image?imageName=%s&version=%s&layerDigest=%s", r.IPXEServiceURL, imageName, imageVersion, kernelDigest)
-	initrdURL := fmt.Sprintf("%s/image?imageName=%s&version=%s&layerDigest=%s", r.IPXEServiceURL, imageName, imageVersion, initrdDigest)
-	squashFSURL := fmt.Sprintf("%s/image?imageName=%s&version=%s&layerDigest=%s", r.IPXEServiceURL, imageName, imageVersion, squashFSDigest)
+	params.Set("layerDigest", kernelDigest)
+	kernelURL := fmt.Sprintf("%s/image?%s", r.IPXEServiceURL, params.Encode())
+	params.Set("layerDigest", initrdDigest)
+	initrdURL := fmt.Sprintf("%s/image?%s", r.IPXEServiceURL, params.Encode())
+	params.Set("layerDigest", squashFSDigest)
+	squashFSURL := fmt.Sprintf("%s/image?%s", r.IPXEServiceURL, params.Encode())
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/controller/serverbootconfiguration_pxe_controller.go` around lines
219 - 221, The URL query values for kernelURL, initrdURL, and squashFSURL are
built by interpolating imageName and imageVersion (and the digests) without
escaping, which can produce malformed URLs; update the construction to escape
query parameter values (e.g., use url.QueryEscape or build the query with
url.Values / url.URL.Query()) so r.IPXEServiceURL remains the base and
imageName, imageVersion, kernelDigest, initrdDigest, squashFSDigest are added as
properly encoded query parameters when creating kernelURL, initrdURL and
squashFSURL.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
server/imageproxyserver.go (1)

408-417: ⚠️ Potential issue | 🟠 Major

Close original response body before replacing it on redirect.

At Line 416, originalResp.Body is overwritten without closing the prior body. Under frequent redirects, this can leak connections and degrade proxy throughput.

🛠️ Proposed fix
 func replaceResponse(originalResp, redirectResp *http.Response) {
+	if originalResp.Body != nil {
+		_ = originalResp.Body.Close()
+	}
+
 	// Preserve all values for multi-valued headers (e.g., Set-Cookie)
 	for name, values := range redirectResp.Header {
 		originalResp.Header.Del(name) // Clear existing values first
 		for _, value := range values {
 			originalResp.Header.Add(name, value) // Add all values
 		}
 	}
 	originalResp.Body = redirectResp.Body
 	originalResp.StatusCode = redirectResp.StatusCode
+	originalResp.Status = redirectResp.Status
+	originalResp.ContentLength = redirectResp.ContentLength
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@server/imageproxyserver.go` around lines 408 - 417, In replaceResponse, close
the previous response body to avoid connection leaks by checking if
originalResp.Body != nil and calling Close() on it (ignore or handle the error
as appropriate) before assigning originalResp.Body = redirectResp.Body; ensure
you only close originalResp.Body and do not close redirectResp.Body so the new
body remains usable, and then proceed to copy headers and set
originalResp.StatusCode as currently done.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@server/imageproxyserver.go`:
- Around line 43-45: Update TokenResponse to include both Token and AccessToken
fields (e.g., TokenResponse { Token string `json:"token"`; AccessToken string
`json:"access_token"` }) and change token handling to prefer AccessToken when
set, falling back to Token. After fetching the token HTTP response (the code
path that currently unmarshals into TokenResponse), first check the response
status (only unmarshal on 2xx success); if it's not a success, read the response
body and return or log a clear error including the status code and body instead
of attempting to unmarshal. Ensure the token consumer (where the token string is
used) reads the chosen field (AccessToken || Token).
- Around line 71-73: extractRepository currently only trims the registry prefix
and therefore preserves tag (':tag') and digest ('@sha...') suffixes and fails
to map single-component Docker Hub refs to library/<name>. Fix extractRepository
to: 1) remove any digest by splitting on '@' and taking the left part; 2) remove
a tag by splitting on the last ':' only if that ':' appears after the last '/'
(so you don't strip registry ports); and 3) if the registryDomain corresponds to
Docker Hub (e.g., "docker.io" or the implicit hub case) and the resulting
repository has no '/', prefix it with "library/". Update the extractRepository
function and any other places that replicate this logic (the other extraction
sites mentioned in your review) so they use the corrected behavior.
- Around line 131-133: The code currently checks the WWW-Authenticate header
with a case-sensitive prefix (strings.HasPrefix(authHeader, "Bearer ")), causing
lowercase or mixed-case schemes to be missed; update the check to be
case-insensitive (e.g., compare strings.HasPrefix(strings.ToLower(authHeader),
"bearer ")), then set info.AuthMethod = AuthBearer and call
extractTokenURL(authHeader, repository) as before so token parsing still uses
the original header value.

---

Outside diff comments:
In `@server/imageproxyserver.go`:
- Around line 408-417: In replaceResponse, close the previous response body to
avoid connection leaks by checking if originalResp.Body != nil and calling
Close() on it (ignore or handle the error as appropriate) before assigning
originalResp.Body = redirectResp.Body; ensure you only close originalResp.Body
and do not close redirectResp.Body so the new body remains usable, and then
proceed to copy headers and set originalResp.StatusCode as currently done.

ℹ️ Review info

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 7cf6f7a and 618e3ba.

📒 Files selected for processing (1)
  • server/imageproxyserver.go

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Nitpick comments (3)
server/imageproxyserver.go (3)

251-254: Consider defensive nil check on required validator parameter.

If validator is nil, the server will panic when validator.IsRegistryAllowed() is called at Line 325. While the caller should always provide a valid validator, a defensive check with a clear error message would aid debugging.

🛡️ Optional: Defensive nil check
 func RunImageProxyServer(imageProxyServerAddr string, k8sClient client.Client, validator *registry.Validator, log logr.Logger) {
+	if validator == nil {
+		log.Error(nil, "Registry validator is required but was nil")
+		panic("registry validator is required")
+	}
+
 	// Start background cleanup of expired cache entries
 	go cleanupExpiredCacheEntries(log)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@server/imageproxyserver.go` around lines 251 - 254, RunImageProxyServer
currently assumes validator is non-nil and will panic when calling
validator.IsRegistryAllowed; add a defensive nil check at the start of
RunImageProxyServer that detects if validator == nil and logs a clear error
(using the provided log) and exits/returns early (or panic with a clear message)
so subsequent calls like validator.IsRegistryAllowed do not dereference nil.
Target the RunImageProxyServer function and the validator parameter for this
change.

86-98: Parameter name matching is case-sensitive; RFC 7235 specifies case-insensitivity.

The extractParam function matches parameter names exactly (e.g., "realm"). Per RFC 7235, auth-param names are case-insensitive, so Realm or REALM would be missed. In practice, OCI registries use lowercase, so this is low risk.

♻️ Optional: Case-insensitive parameter extraction
 func extractParam(header, param string) string {
-	start := strings.Index(header, param+"=\"")
+	lowerHeader := strings.ToLower(header)
+	lowerParam := strings.ToLower(param)
+	start := strings.Index(lowerHeader, lowerParam+"=\"")
 	if start == -1 {
 		return ""
 	}
-	start += len(param) + 2
-	end := strings.Index(header[start:], "\"")
+	start += len(lowerParam) + 2
+	end := strings.Index(lowerHeader[start:], "\"")
 	if end == -1 {
 		return ""
 	}
+	// Return value from original header to preserve case
 	return header[start : start+end]
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@server/imageproxyserver.go` around lines 86 - 98, The extractParam function
currently matches parameter names case-sensitively; update extractParam(header,
param string) to perform a case-insensitive match (per RFC 7235) when locating
param+"=\"" in the WWW-Authenticate header — e.g., operate on a lowercase copy
of the header or use a case-insensitive search/regex (or strings.EqualFold when
comparing tokens) so names like "Realm" or "REALM" are recognized, then extract
the value from the original header slice accordingly and return the parameter
value as before.

222-249: Consider adding a cache size limit for production hardening.

The cache cleanup only removes expired entries but doesn't limit total cache size. With many unique registry/repository combinations, the cache could grow unbounded. For most deployments this is fine, but at scale an LRU eviction policy or max-size limit could be beneficial.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@server/imageproxyserver.go` around lines 222 - 249,
cleanupExpiredCacheEntries only prunes by expiration and doesn't prevent
unbounded growth; add a max-size policy (e.g., const maxRegistryCacheEntries)
and enforce it when mutating registryCache: when len(registryCache) exceeds the
limit, evict entries until within limit using either the oldest-by-expiresAt
heuristic or an LRU structure (maintain a doubly-linked list or timestamp on
each registryCache entry). Update places that insert into registryCache (the
code that sets registryCache[key] = entry) to acquire registryCacheMutex, check
len(registryCache) > maxRegistryCacheEntries and perform eviction logic (remove
keys from registryCache and update any LRU metadata), and keep
cleanupExpiredCacheEntries as a periodic expiration pass that also logs
evictions.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@server/imageproxyserver.go`:
- Around line 398-401: The redirect handling creates redirectReq with
http.NewRequest which uses background context and won't be cancelled when the
client disconnects; change the creation to use http.NewRequestWithContext and
pass the original incoming request's context (e.g., r.Context()) so the
redirectReq (the variable named redirectReq created with location.String())
inherits cancellation/timeout from the original request context.
- Around line 195-199: The error handling uses io.ReadAll(resp.Body) without
bounds, which can OOM on large responses; change the read to use a limited
reader (e.g., io.ReadAll(io.LimitReader(resp.Body, maxErrorBodySize))) or an
equivalent copy with io.CopyN and choose a sensible constant (e.g.,
maxErrorBodySize = 64*1024) near the top of the file; update the error message
to indicate the body may be truncated and reference the existing resp and
io.ReadAll usage in the token request handling (the block that checks
resp.StatusCode != http.StatusOK) so you only replace that read with a bounded
read.

---

Nitpick comments:
In `@server/imageproxyserver.go`:
- Around line 251-254: RunImageProxyServer currently assumes validator is
non-nil and will panic when calling validator.IsRegistryAllowed; add a defensive
nil check at the start of RunImageProxyServer that detects if validator == nil
and logs a clear error (using the provided log) and exits/returns early (or
panic with a clear message) so subsequent calls like validator.IsRegistryAllowed
do not dereference nil. Target the RunImageProxyServer function and the
validator parameter for this change.
- Around line 86-98: The extractParam function currently matches parameter names
case-sensitively; update extractParam(header, param string) to perform a
case-insensitive match (per RFC 7235) when locating param+"=\"" in the
WWW-Authenticate header — e.g., operate on a lowercase copy of the header or use
a case-insensitive search/regex (or strings.EqualFold when comparing tokens) so
names like "Realm" or "REALM" are recognized, then extract the value from the
original header slice accordingly and return the parameter value as before.
- Around line 222-249: cleanupExpiredCacheEntries only prunes by expiration and
doesn't prevent unbounded growth; add a max-size policy (e.g., const
maxRegistryCacheEntries) and enforce it when mutating registryCache: when
len(registryCache) exceeds the limit, evict entries until within limit using
either the oldest-by-expiresAt heuristic or an LRU structure (maintain a
doubly-linked list or timestamp on each registryCache entry). Update places that
insert into registryCache (the code that sets registryCache[key] = entry) to
acquire registryCacheMutex, check len(registryCache) > maxRegistryCacheEntries
and perform eviction logic (remove keys from registryCache and update any LRU
metadata), and keep cleanupExpiredCacheEntries as a periodic expiration pass
that also logs evictions.

ℹ️ Review info

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 618e3ba and 2775aae.

📒 Files selected for processing (1)
  • server/imageproxyserver.go

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (1)
server/imageproxyserver.go (1)

232-259: Consider adding a shutdown mechanism for graceful cleanup.

The cleanup goroutine runs indefinitely with no way to stop it. While this works fine for a long-running server (process exit terminates it), adding a context-based cancellation would enable clean shutdown in testing and containerized environments.

♻️ Optional: Add context-based cancellation
-func cleanupExpiredCacheEntries(log logr.Logger) {
+func cleanupExpiredCacheEntries(ctx context.Context, log logr.Logger) {
 	ticker := time.NewTicker(5 * time.Minute)
 	defer ticker.Stop()
 
-	for range ticker.C {
+	for {
+		select {
+		case <-ctx.Done():
+			log.V(1).Info("Cache cleanup stopped")
+			return
+		case <-ticker.C:
+		}
 		now := time.Now()
 		// ... rest of cleanup logic
 	}
 }

Then in RunImageProxyServer, pass a cancellable context.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@server/imageproxyserver.go` around lines 232 - 259, The cleanup goroutine in
cleanupExpiredCacheEntries has no stop mechanism; change its signature to accept
a context (ctx context.Context) and make the for loop select on ctx.Done() and
ticker.C so it returns and stops the ticker when cancelled; ensure
registryCacheMutex is still used as before and that you stop the ticker via
defer ticker.Stop(). Then update RunImageProxyServer to create/accept a
cancellable context (or propagate the server context) and pass that context into
cleanupExpiredCacheEntries so the cleanup goroutine can be cancelled for
graceful shutdown.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In `@server/imageproxyserver.go`:
- Around line 232-259: The cleanup goroutine in cleanupExpiredCacheEntries has
no stop mechanism; change its signature to accept a context (ctx
context.Context) and make the for loop select on ctx.Done() and ticker.C so it
returns and stops the ticker when cancelled; ensure registryCacheMutex is still
used as before and that you stop the ticker via defer ticker.Stop(). Then update
RunImageProxyServer to create/accept a cancellable context (or propagate the
server context) and pass that context into cleanupExpiredCacheEntries so the
cleanup goroutine can be cancelled for graceful shutdown.

ℹ️ Review info

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 2775aae and 7d5eb28.

📒 Files selected for processing (1)
  • server/imageproxyserver.go

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

Status: In Progress

Development

Successfully merging this pull request may close these issues.

Make imageproxyserver handle other registry names

2 participants