feat: remote build for Python sdk#76
Open
christianalfoni wants to merge 6 commits into
Open
Conversation
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Build snapshots without a local Docker daemon
❌ Current behavior
sdk.snapshots.create(CreateContextSnapshotParams(...))always shelled out to the localdockerdaemon. The Python SDK refused to build at all if Docker wasn't installed, which blocked any environment without a daemon (CI containers, restricted devboxes, web playgrounds, etc.) from creating context-based snapshots.flowchart LR A[snapshots.create<br/>CreateContextSnapshotParams] --> B{docker available?} B -- no --> X[RuntimeError] B -- yes --> C[docker buildx build<br/>locally] C --> D[docker push<br/>to bartender registry] D --> E[create_snapshot API]✅ New behavior
snapshots.createnow defaults to the remote image-builder service. The localdocker buildxpath is still available as an opt-in viaTOGETHER_LOCAL_BUILD=1. Both paths return the same{image, architecture}shape so the registration logic that follows is identical.flowchart LR A[snapshots.create<br/>CreateContextSnapshotParams] --> B{TOGETHER_LOCAL_BUILD=1?} B -- yes --> C[_build_and_register<br/>docker buildx + push] B -- no, default --> D[_build_image_via_builder] D --> E[POST context.tar.gz<br/>to builder.* host] E --> F[SSE log stream<br/>via httpx-sse] F --> G[GET /builds/:id<br/>resolve image_ref] C --> H[create_snapshot API] G --> HThe remote path is implemented by a new
RemoteImageBuilderClientthat:tar.gzof the build context, honoring.dockerignoreviapathspec.{builder_host}/buildswithnydus_convert=true.httpx-sse.aconnect_sse) and retries transient connect failures while the build pod is scheduling.asyncio.CancelledErrorso client-side aborts don't leak running pods.GET /builds/:idonce adonecontrol event is received.Both
_submitand_get_statusare wrapped in_with_retryfrom_utilsfor consistency with the rest of the SDK's retry behavior.🤔 Assumptions
api.bartender.(orapi.) withbuilder..TOGETHER_API_KEYis valid for both the management API and the builder service.ubuntu-latestGitHub runners ship Docker preinstalled, so the local-build e2e test still works without an explicitdocker/setup-buildx-actionstep.TOGETHER_REMOTE_ARCHITECTURE, if set, must be a value accepted byCreateSnapshotBodyArchitecture— otherwise we raise a clearRuntimeErrorrather than silently falling back.🧠 Decisions
_build_and_registerto return{image, architecture}instead of completing the snapshot registration itself. This keepscreate()as the single place that callscreate_snapshot_api/alias_snapshot_api, so the two build paths share registration + alias logic.api_keyexplicitly toSnapshotsNamespace. The builder client needs a bearer token, and reaching into the underlying generatedApiClientwould couple us to generated code. Cleaner to passapi_keyas a kwarg fromTogetherSandbox.__init__.httpx-sse>=0.4.0andpathspec>=0.11.0topyproject.toml. Both are small, well-maintained, and avoid us having to hand-roll SSE parsing or.dockerignorematching.monkeypatch.setenv/delenv(..., raising=False)in the e2e tests. Function-scoped, automatic teardown — the two tests can sit next to each other without leaking env state, and the remote test is robust to a workflow-levelTOGETHER_LOCAL_BUILD=1if someone ever sets one.🔄 Discussions
test_create_from_context_with_aliastest that just lets the SDK pick its path silently. Rejected because the two paths now exercise totally different infrastructure (local Docker vs. builder service over HTTPS+SSE) and a single test would lose the ability to attribute failures._build_and_registeritself. Rejected because the branch should be visible at the call site so future readers see both paths in one place.🧪 Testing
tests/e2e/test_snapshots.py::TestSnapshots::test_create_from_context_local—monkeypatch.setenv("TOGETHER_LOCAL_BUILD", "1"), asserts the localdocker buildxpath still produces a registered snapshot with the expected alias prefix.tests/e2e/test_snapshots.py::TestSnapshots::test_create_from_context_remote—monkeypatch.delenv("TOGETHER_LOCAL_BUILD", raising=False), asserts the remote builder path produces a registered snapshot.origin/mainconfirms_build_and_registerand_build_image_via_buildernow share the same{image, architecture}return shape consumed bycreate().e2e.ymlworkflow run on theCSB-1450branch — the workflow trigger was updated to include this branch so the e2e job will fire on push.📁 References