
Conversation


@yuhuan130 yuhuan130 commented Dec 14, 2025

What This Does

Optimizes Docker build cache by separating heavy ML dependencies (torch, tensorflow) into their own layers and moving Flyte Python wheels to the bottom, so adding lightweight packages doesn't trigger full rebuilds.

Performance

Heavy Benchmark (torch, tensorflow, transformers)

Metric       Without Optimization   With Optimization   Result
Build Time   310.3s (5.2 min)       8.7s (0.14 min)     35.9× faster

Standard Benchmark (torch, numpy, pandas)

Metric       Without Optimization   With Optimization   Result
Build Time   249.7s (4.2 min)       7.7s                32.3× faster

How It Works

BEFORE:  Any change rebuilds everything
┌─────────────────────────────────────┐
│ ALL packages together               │
└─────────────────────────────────────┘

AFTER:  Optimized layer ordering  
┌─────────────────────────────────────┐
│ Heavy:   torch, tensorflow (cached) │ ← Rarely changes
├─────────────────────────────────────┤
│ Other                               │ ← Changes more often
├─────────────────────────────────────┤
│ Flyte PythonWheels (bottom layer)   │ ← Compiled once, fully cached
└─────────────────────────────────────┘

Key Changes

  • heavy_deps.py: New centralized configuration file for heavy dependencies (tensorflow, torch, scikit-learn, etc.)
  • image_builder.py: Enhanced _optimize_image_layers() method with:
    • Heavy dependency extraction to top layer
    • Flyte wheel layer placement at bottom for maximum caching
    • UV script dependency parsing support
  • New flag: optimize_layers=True (enabled by default, can be disabled; see the usage sketch after this list)
  • 3 benchmark examples showing 32-36× speedup
  • Smarter layer ordering: Heavy deps cached separately, Flyte wheels at bottom
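
A minimal usage sketch (not taken from the PR itself; it assumes the flag is accepted by with_pip_packages, as the diff context later in this thread suggests, that Image is importable from the top-level flyte package, and review discussion below also considers moving the flag to clone()):

from flyte import Image

image = (
    Image.from_debian_base(name="ml-image")
    # With optimize_layers on (the default in this PR), heavy deps such as
    # torch and tensorflow are extracted into their own cached layer, and the
    # Flyte wheels are placed in the bottom layer.
    .with_pip_packages("torch", "tensorflow", "transformers", optimize_layers=True)
)

# Adding a lightweight package later should only rebuild the small "Other"
# layer; the heavy layer and the Flyte wheel layer stay cached.
image_v2 = image.with_pip_packages("requests")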

Files Changed

  • src/flyte/_internal/imagebuild/heavy_deps.py (new)
  • src/flyte/_internal/imagebuild/image_builder.py
  • src/flyte/_image.py
  • tests/flyte/test_image.py
  • examples/image_layer_optimize/ (3 benchmark files)

Total: 7 files (1 new, 6 modified)

Screenshots

(Screenshots attached: 2025-12-30 at 17:57:18, 2025-12-30 at 18:13:42, 2026-01-06 at 16:42:45)

Replaces #422 with cleaner git history

@yuhuan130 yuhuan130 marked this pull request as ready for review December 14, 2025 11:31
@yuhuan130 yuhuan130 marked this pull request as draft December 14, 2025 17:33
@yuhuan130 yuhuan130 marked this pull request as ready for review December 24, 2025 06:25
categorized[category].append(pkg)

# Helper function to create a layer
def create_pip_layer(pkgs):
Member commented:

Image.from_debian_base(name="test").with_pip_packages("tensorflow").with_pip_packages("pytorch")

In this case, TensorFlow and PyTorch will be installed in different layers, right? Let's install them in the same layer.

Also, I think it might be better to add an _optimize method to ImageBuildEngine, so we can do

image = ImageBuildEngine._optimize(image)
result = await img_builder.build_image(image, dry_run=dry_run)

It would be cleaner, wdyt?

Author (@yuhuan130) commented Dec 30, 2025:

Great idea. I've applied your suggestion in my latest commit!

pre: bool = False,
extra_args: Optional[str] = None,
secret_mounts: Optional[SecretRequest] = None,
optimize_layers: bool = True,
Member commented:

Could we move it to clone()?

Image.from_debian_base().clone(optimize_layers=True)

"heavy": ("tensorflow", "torch", "torchaudio", "torchvision", "scikit-learn"),
# -----------------[ MIDDLE ]----------------- #
# Layer 1: ~200MB | Rebuild cost: Med | Freq: Low
"core": ("numpy", "pandas", "pydantic", "requests", "httpx", "boto3", "fastapi", "uvicorn"),
Member commented:

I just tried to build these two images; the build times are almost the same. Do we really need this core layer?

image = (
    Image.from_debian_base(install_flyte=False)
    .with_apt_packages("vim", "wget")
    .with_pip_packages("pandas", "numpy")
)
image = (
    Image.from_debian_base(install_flyte=False)
    .with_apt_packages("vim", "wget")
    .with_pip_packages("pandas", "numpy", "ty")
)

Author (@yuhuan130) commented:

I think the similar build times make sense here even without setting optimize_layers=False: both images reuse the cached apt + pandas/numpy layers, so the only real install is ty, which is small. Technically it'll save around 45 seconds to 1 minute.

Author (@yuhuan130) commented:

Hmm, interesting. I tried running it locally once with and once without optimize, and found that the python:3.12-slim-bookworm base actually pre-installs the wheels for pandas, numpy, etc., so it'll be fast for these two packages either way.



@env.task
async def main():
Member commented:

👍

# Step 1: Collect heavy packages and build new layer list
all_heavy_packages: list[str] = []
template_layer: PipPackages | None = None
optimized_layers: list[Layer] = []
Member commented:

nit: could we call it original_layers or other_layers?

Comment on lines +210 to +212
template_layer = PipPackages(
packages=(),
index_url=layer.index_url,
Member commented:

I think we should have heavy_layers: List; each layer can have its own settings.
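
For instance, the suggestion could look roughly like this (a hypothetical sketch, not code from the PR; is_heavy and any PipPackages fields beyond packages/index_url are assumptions):

heavy_layers: list[PipPackages] = []
for layer in original_layers:
    # One heavy layer per source layer, so per-layer settings such as
    # index_url are preserved instead of collapsed into a single template.
    heavy = tuple(pkg for pkg in layer.packages if is_heavy(pkg))
    if heavy:
        heavy_layers.append(PipPackages(packages=heavy, index_url=layer.index_url))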

ImageBuilderType = typing.Literal["local", "remote"]

@staticmethod
def _optimize_image_layers(image: Image) -> Image:
Member commented:

Could we also add some unit tests for it? Thanks!
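
A minimal sketch of what such a test might look like (assumptions: ImageBuildEngine lives in flyte._internal.imagebuild.image_builder as the Files Changed list suggests, _optimize_image_layers is the entry point, and the layer-inspection assertions are left out because that API isn't shown here):

from flyte import Image
from flyte._internal.imagebuild.image_builder import ImageBuildEngine

def test_heavy_deps_split_into_separate_layer():
    # Mix a heavy package ("tensorflow") with a light one ("requests") in one call.
    image = Image.from_debian_base(name="test").with_pip_packages("tensorflow", "requests")
    optimized = ImageBuildEngine._optimize_image_layers(image)
    # Expected: "tensorflow" ends up isolated from "requests", so changing the
    # light package does not invalidate the heavy layer.
    # (Assertions over optimized's layers are omitted; the exact API is internal.)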
