perf: stream OCI layers directly to rootfs for local registry images#101

Draft
hiroTamada wants to merge 7 commits into main from perf/streaming-image-unpack


@hiroTamada (Contributor)

Summary

Optimizes image conversion for local registry images by streaming OCI layers directly from the registry to the target directory, bypassing the intermediate OCI cache write/read cycle.

Performance improvement: 1.6-2.8x faster for local image imports (benchmarked with synthetic images of varying sizes).

Key changes

  • streamingUnpack() - Pipes layers from go-containerregistry directly to tar extraction
  • processWhiteouts() - Handles OCI whiteout files (.wh.* deletions, .wh..wh..opq opaque dirs)
  • extractMetadataFromImage() - Extracts container config without OCI cache
  • isLocalRegistry() - Detects local registry patterns (localhost, 127.0.0.1, 10.102.0.1)

When streaming is used

| Scenario | Path used |
| --- | --- |
| Local registry, not cached | Streaming (new) |
| Local registry, already cached | Cached (existing) |
| Remote registry | Cached (existing) |

This ensures local build images (pushed from builder VMs) benefit from streaming, while remote images still use caching for retry scenarios.

Test plan

  • Unit tests for processWhiteouts() (regular files, opaque dirs, nonexistent targets)
  • Unit tests for extractTarStream()
  • Unit tests for extractMetadataFromImage()
  • Unit tests for isLocalRegistry()
  • Micro-benchmarks comparing streaming vs umoci extraction
  • E2E test: building an image from a local registry confirmed that the streaming path was used (no OCI cache entry created)

Made with Cursor

hiroTamada and others added 7 commits February 13, 2026 13:38
Optimize image conversion by streaming layers directly from the registry
to the target directory using tar extraction, bypassing the intermediate
OCI cache write/read cycle. This reduces disk I/O and speeds up local
image imports by 1.6-2.8x depending on image size.

Key changes:
- Add streamingUnpack() that pipes layers from go-containerregistry to tar
- Add processWhiteouts() for proper OCI whiteout file handling
- Add extractMetadataFromImage() to get config without OCI cache
- Add isLocalRegistry() to detect local registry images (localhost, 127.0.0.1, 10.102.0.1)
- Use streaming path for local registry images that aren't already cached

The cached path (pullAndExport + umoci) is still used for:
- Remote registry images (benefits from caching on retry)
- Already-cached images (avoids redundant downloads)

Co-authored-by: Cursor <cursoragent@cursor.com>

Logs which image unpack strategy is used:
- "using streaming unpack for local registry image" - streaming path
- "using cached unpack" - traditional OCI cache + umoci path

This helps verify the streaming optimization is active in production.

Co-authored-by: Cursor <cursoragent@cursor.com>

The isLocalRegistry() function was hardcoded to only recognize
10.102.0.1 (dev gateway), but production uses 172.30.0.1.

Now detects all RFC 1918 private IP ranges:
- 10.0.0.0/8
- 172.16.0.0/12 (172.16.x.x - 172.31.x.x)
- 192.168.0.0/16

This ensures the streaming optimization works in all environments
regardless of the configured gateway IP.
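
A minimal sketch of the detection described in this commit, assuming the registry host is everything before the first "/" of the image reference and may carry a port. The function body is illustrative and not copied from the PR; only the three RFC 1918 blocks, localhost, and loopback come from the text above.

```go
package main

import (
	"fmt"
	"net"
	"strings"
)

// rfc1918 holds the three private IPv4 blocks named in the commit message.
var rfc1918 = func() []*net.IPNet {
	var blocks []*net.IPNet
	for _, cidr := range []string{"10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16"} {
		_, block, err := net.ParseCIDR(cidr)
		if err != nil {
			panic(err)
		}
		blocks = append(blocks, block)
	}
	return blocks
}()

// isLocalRegistry is an illustrative reimplementation: it reports whether the
// registry host of an image reference is localhost, loopback, or RFC 1918.
func isLocalRegistry(ref string) bool {
	// Assumption: the host is the part before the first "/".
	host, _, _ := strings.Cut(ref, "/")
	// Strip an optional ":port"; keep the host unchanged if there is none.
	if h, _, err := net.SplitHostPort(host); err == nil {
		host = h
	}
	if host == "localhost" {
		return true
	}
	ip := net.ParseIP(host)
	if ip == nil {
		return false // a DNS name such as docker.io, or a bare image name
	}
	if ip.IsLoopback() {
		return true
	}
	for _, block := range rfc1918 {
		if block.Contains(ip) {
			return true
		}
	}
	return false
}

func main() {
	for _, ref := range []string{
		"172.30.0.1:8080/builds/app:latest",
		"10.102.0.1/builds/app",
		"docker.io/library/alpine:3",
	} {
		fmt.Println(ref, isLocalRegistry(ref))
	}
}
```

Checking explicit CIDR blocks rather than calling `net.IP.IsPrivate` keeps the behavior limited to the IPv4 ranges the commit lists; `IsPrivate` would also accept IPv6 unique local addresses.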

Co-authored-by: Cursor <cursoragent@cursor.com>

…mages

The registry pre-caches images to the OCI layout upon push, making the
'alreadyCached' check always true for local builds. This prevented the
streaming optimization from ever being used.

Now, for local registries (RFC 1918 IPs, localhost), we always use
streaming because:
1. Image is already local (no network benefit from caching)
2. Direct tar extraction is 1.6-2.8x faster than umoci
3. Registry caching means existsInLayout() is always true anyway

This should reduce the ImportLocalImage time from ~5s (umoci) to ~2s (streaming).

Co-authored-by: Cursor <cursoragent@cursor.com>

The streamingUnpack function requires network auth to pull from the
local registry. Since the registry pre-caches images to the OCI layout
on push, we can read directly from disk without auth.

Streaming is now only used for remote registries when the image isn't
cached yet. For local registries, we use the cached path which:
1. Reads from the pre-cached OCI layout (no network/auth needed)
2. Works reliably with the existing registry setup

TODO: Implement streamingUnpackFromLayout to get streaming speed
benefits for local registry images by reading directly from the
OCI cache instead of through umoci.

Co-authored-by: Cursor <cursoragent@cursor.com>

…version

For local registry images, the image is already cached in the OCI layout
(registry pre-caches on push). This new function streams directly from
the local OCI cache to tar extraction, bypassing slow umoci.

Strategy by registry type:
- Local registry (cached): streamingUnpackFromLayout - fastest, no network
- Local registry (uncached): streamingUnpack - rare case, network pull
- Remote registry: pullAndExport - enables layer deduplication
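
The three-way choice above can be written as a small decision function. This is a sketch: chooseStrategy and the unpackStrategy type are hypothetical names, and the two boolean inputs stand in for whatever isLocalRegistry() and the layout lookup return in the real code.

```go
package main

import "fmt"

// unpackStrategy names the function that would handle the unpack.
// The type and constants are hypothetical; the values are the function
// names used in the commit message.
type unpackStrategy string

const (
	strategyFromLayout unpackStrategy = "streamingUnpackFromLayout"
	strategyStreaming  unpackStrategy = "streamingUnpack"
	strategyCached     unpackStrategy = "pullAndExport"
)

// chooseStrategy maps (local registry?, already in OCI layout?) to a path.
func chooseStrategy(local, cached bool) unpackStrategy {
	switch {
	case local && cached:
		return strategyFromLayout // fastest, no network
	case local:
		return strategyStreaming // rare case, network pull
	default:
		return strategyCached // remote registry, keeps layer deduplication
	}
}

func main() {
	fmt.Println(chooseStrategy(true, true))
	fmt.Println(chooseStrategy(true, false))
	fmt.Println(chooseStrategy(false, true))
}
```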

The streamingUnpackFromLayout function:
1. Opens OCI layout from local cache
2. Gets image by annotation tag
3. Streams each layer through tar command
4. Processes whiteouts between layers
5. Extracts metadata from image config

This should reduce ImportLocalImage time from ~5s (umoci) to ~2s (streaming)
for local builds.
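
For the metadata step, a rough sketch of reading container settings from an OCI image config blob with encoding/json. The field names (`config.Entrypoint`, `config.Cmd`, `config.Env`, `config.WorkingDir`) follow the OCI image configuration format; the containerMetadata type and parseMetadata function are hypothetical and may not match what extractMetadataFromImage() returns in the PR.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// ociImageConfig mirrors the subset of the OCI image configuration JSON
// that a runtime typically needs.
type ociImageConfig struct {
	Config struct {
		Entrypoint []string `json:"Entrypoint"`
		Cmd        []string `json:"Cmd"`
		Env        []string `json:"Env"`
		WorkingDir string   `json:"WorkingDir"`
	} `json:"config"`
}

// containerMetadata is a hypothetical result type for this sketch.
type containerMetadata struct {
	Entrypoint []string
	Cmd        []string
	Env        []string
	WorkingDir string
}

// parseMetadata decodes a raw image config blob into containerMetadata.
func parseMetadata(raw []byte) (containerMetadata, error) {
	var cfg ociImageConfig
	if err := json.Unmarshal(raw, &cfg); err != nil {
		return containerMetadata{}, fmt.Errorf("decode image config: %w", err)
	}
	return containerMetadata{
		Entrypoint: cfg.Config.Entrypoint,
		Cmd:        cfg.Config.Cmd,
		Env:        cfg.Config.Env,
		WorkingDir: cfg.Config.WorkingDir,
	}, nil
}

func main() {
	raw := []byte(`{"architecture":"amd64","os":"linux","config":{"Entrypoint":["/app"],"Cmd":["--serve"],"Env":["PORT=8080"],"WorkingDir":"/srv"}}`)
	meta, err := parseMetadata(raw)
	if err != nil {
		panic(err)
	}
	fmt.Println(meta.Entrypoint, meta.Cmd, meta.Env, meta.WorkingDir)
}
```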

Co-authored-by: Cursor <cursoragent@cursor.com>

…egistry

Verifies the fast path (streamingUnpackFromLayout) is used for local
registry images (172.30.x.x). The test confirms:
1. Local registry reference triggers streaming from layout
2. Image metadata is correctly extracted
3. Disk file is created successfully

Log output shows: "using streaming unpack from layout for local registry image"

Co-authored-by: Cursor <cursoragent@cursor.com>