feat(engine,cli): drawElementImage fast-capture behind --experimental-fast-capture#1295
feat(engine,cli): drawElementImage fast-capture behind --experimental-fast-capture#1295vanceingalls wants to merge 1 commit into
Conversation
This stack of pull requests is managed by Graphite. Learn more about stacking. |
…-fast-capture Add an experimental frame-capture mode that reads DOM paint records directly via Chrome's canvas.drawElementImage API instead of Page.captureScreenshot (~46% faster on GPU), gated behind --experimental-fast-capture (env PRODUCER_EXPERIMENTAL_FAST_CAPTURE; engine config useDrawElement). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
e8cec72 to
d2ef1f2
Compare
miguel-heygen
left a comment
There was a problem hiding this comment.
Solid experimental feature. The fallback routing logic (SwiftShader detection, transparent + GPU vs Docker paths, supersampling guard, page-side compositing auto-disable) is well thought through, and the format-aware JPEG/PNG encode choice is the right call — I can see from the PR description that the PNG-vs-JPEG bug was caught by the e2e run, which is exactly why it matters. A few small gaps worth addressing before this graduates out of --experimental.
P2 — captureDrawElementFrame: split(",") is fragile
packages/engine/src/services/drawElementService.ts, the base64 extraction:
const base64 = dataUrl.split(",")[1];Base64 can't contain commas, so this is practically safe for a toDataURL response — but the intent is to grab everything after the first comma, and split(",")[1] silently truncates if the payload ever has one. Prefer:
const commaIdx = dataUrl.indexOf(",");
if (commaIdx === -1) throw new Error("drawElement: toDataURL returned malformed data URL");
const base64 = dataUrl.slice(commaIdx + 1);Worth fixing before graduation: it's one line and removes any ambiguity.
P3 — BeginFrame response discarded in drawelement branch
packages/engine/src/services/frameCapture.ts, captureFrameCore:
await client.send("HeadlessExperimental.beginFrame", { ... });
// response ignoredDiscarding the response is intentional (we don't rely on BeginFrame for the screenshot here), but beginFrameHasDamageCount/beginFrameNoDamageCount won't increment for drawElement renders. Diagnostics and any tooling that inspects those counters post-render will silently under-report. At minimum log hasDamage so render telemetry stays accurate; ideally update the session counters so the existing diagnostic surface works.
P3 — injectDrawElementCanvas idempotency path uncovered
drawElementService.test.ts mocks page.evaluate but never exercises the early-return branch (document.getElementById("__hf_de_canvas") already exists). Add a mock test: call the function twice and assert page.evaluate was called only once.
P3 — Supersampling fallback untested
The initDrawElementOrTransparentBackground path for useDrawElement=true + deviceScaleFactor>1 (logs the warning and falls back to screenshot capture) has no unit coverage. Since the function is private, this would need a thin integration shim or an indirect test via initializeSession. Minor, but worth noting before promotion.
Cloud/Lambda follow-up is fine — the flag silently no-ops there and the description is clear about it.
→ Approve with comments. Fallback logic and encoding are correct; the gaps above are small and appropriate to address before the flag loses the --experimental prefix.
james-russo-rames-d-jusso
left a comment
There was a problem hiding this comment.
Reviewed at d2ef1f2d. Big experimental capture-mode add (449/23, 10 files). Net: architecture is sound, opt-in plumbing is correct, the SwiftShader and page-side-compositing incompatibility cases are handled with two layers of defense (resolveConfig force-off + per-frame skip in prepareFrameForCapture), and the PSNR=∞ result vs captureScreenshot is strong evidence the visual output is byte-identical on the harnessed compositions. No blockers from my pass.
Where I have non-blocker concerns: coverage shape (CI's regression-shards run with the flag OFF, so drawelement is not exercised across the composition matrix), the Docker env-only path silently no-opping, and the <canvas layoutsubtree> HTML-in-canvas pattern (worth flagging for Miguel given his prior investigation, though this use case is render-output capture rather than interactive UI).
Concerns
-
Regression suite runs with fast-capture OFF. The PSNR=∞ result + e2e
drawElement canvas injectedlog come from a css-spinner → mp4 run plus the SwiftShader T1/T2 harness. The CI'sregression-shardsmatrix is great for verifying nothing breaks when the flag is off but doesn't exercise the newdrawelementmode across the diverse compositions (sub-compositions, iframes, HDR, font-variation, raf-ball, etc.). The composition-root reparenting ininjectDrawElementCanvas(parent.insertBefore(canvas, root); canvas.appendChild(root);) changes the root's layout-parent — compositions withposition: fixeddescendants, viewport-relative units referencing the prior ancestor, or that contain iframes (theiframe-render-compatstyle is the canonical case) could subtly diverge in a way the single-composition harness wouldn't catch. Is the plan to add a fast-capture variant of the regression matrix in a follow-up before promoting beyondEXPERIMENTAL, or is the PSNR=∞ on css-spinner considered representative? Either is defensible — just want to set expectations on coverage. -
Docker render silently drops env-only opt-in.
buildDockerRunArgsbuilds positional argv only — it does not forward host env vars to the container. The CLI plumbing atrender.ts:610usesargs["experimental-fast-capture"] === true, which isfalsefor an absent flag even whenPRODUCER_EXPERIMENTAL_FAST_CAPTURE=trueis set in the host shell. So an operator who sets only the env var and runshyperframes renderagainst the Docker path gets a normal screenshot render. The flag description ("Env: PRODUCER_EXPERIMENTAL_FAST_CAPTURE.") doesn't qualify that the env fallback is in-process only. This is consistent with the--low-memory-modeidiom the PR cites, so it may be intentional — but the symmetric--low-memory-modedescription has the same gap. Two reasonable resolutions: (a) document that the env fallback is in-process only and Docker needs the CLI flag, or (b) read env in the Docker option-prep path and synthesize the flag when present. Either is fine; (a) is cheaper if the existing pattern is the intent. -
<canvas layoutsubtree>is HTML-in-canvas. Surfacing for Miguel given his prior investigation on avoiding the pattern. This use case is render-output capture (no DOM events traverse the canvas, no a11y surface, no interactive UI nested inside), so the concerns from that thread probably don't apply directly — but worth a 👀 from him given the team bias against the pattern shape.
Nits
-
captureDrawElementFramehard-codes the JPEG quality default at 80 (drawElementService.ts:106). The downstream caller iscaptureFrameCorewhich passesoptions.quality ?? 80. Ifoptions.qualityis ever undefined from an alternate entry point, the default could diverge frompageScreenshotCapture's. Probably fine — the integration harness exercises the real call chain — but a one-line "tracks pageScreenshotCapture default" comment would pin the parity. -
logInitPhaseinterpolatessession.captureModeat the time of log (frameCapture.ts:997). Pre-helper phases log[initSession:beginframe]or[initSession:screenshot]; the new helper flips todrawelement, and post-helper phases log[initSession:drawelement]. ThedrawElement canvas injectedlog fires after the flip, which is the clear signal — but a future log-reader chasing a render issue may misread the prefix change mid-stream as a session restart. A one-line comment onlogInitPhasecalling out the mid-init flip would save 10 minutes of confused log-grepping later. -
drawElementService.integration.test.tsis adescribe.skipdocumenting what was validated locally. Discoverable but adds zero CI coverage and can drift silently. Two paths: (a) leave as a validation record (current shape, fine as a doc), (b) gate behind aRUN_BROWSER_INTEGRATION_TESTS=1env and add a nightly workflow that flips it on. Your call — current shape is honest about what's validated, so this is preference territory. -
render.ts:610uses=== truefor the inline option pass;render.ts:408-412uses!= nullfor the env override. Asymmetry is correct (env preserves on silence, option forwardsfalseon silence), and it tracks the--low-memory-modeidiom — but a one-liner on the option-pass line explaining why the two checks differ would help the next person threading a new flag.
Questions
-
Composition coverage of the harness. Was the validation harness exercised against sub-composition (
sub-composition-video), iframe (iframe-render-compat), HDR (hdr-regression), or font-variation compositions, or only css-spinner? The composition-root reparenting is the place I'd want fidelity evidence on those styles before flipping the flag on for any production-ish surface. -
Cloud / Lambda env state. The PR body says
cloud/render.tsandlambda/render.ts"silently no-op" — butresolveConfigreadsPRODUCER_EXPERIMENTAL_FAST_CAPTUREglobally, so if those infra surfaces happen to have the env var set (via deploy config), fast capture does activate. Is the "silently no-op" claim about the CLI flag wiring specifically (env-via-deploy-config is a supported operator escape hatch until the follow-up PR lands), or should those surfaces actively gate the env off until then? -
macOS path. On Mac,
headlessShellis null →preModeresolves to"screenshot"regardless. The new helper then flipscaptureModeto"drawelement"if useDrawElement is on and no SwiftShader. Per-frame branch gates BeginFrame onbeginFrameTimeTicks > 0, which stays 0 in the screenshot-launched path — so the compositor advances naturally and the comment atframeCapture.ts:1402-1404matches the behavior. Has the validation harness exercised macOS GPU, or only the Docker SwiftShader cases + the GPU-host T3?
What I didn't verify
-
Windows render compat.
Render on windows-latest+Tests on windows-latestare still in-flight at review time.drawelementdoesn't gate on platform, so a Windows session with the flag set would land in the same screenshot-launched + drawelement-mode path as macOS. If Windows CI surfaces something, this'll need a look. -
useDrawElement: truewithforceScreenshot: truesimultaneously.resolveConfigdoesn't appear to reconcile these —forceScreenshotis honored atpreModeresolution,useDrawElementis honored at init-time. A session could land in drawelement mode on a screenshot-launched browser. That's actually the design for macOS GPU, so it's not necessarily wrong — flagging as "haven't dug into the operator-confusion surface." -
--enable-features=CanvasDrawElementinteraction with multiple feature-flag groups. Verified the flag is added globally inbuildChromeArgs:554and is the only--enable-features=on the command line (the other flag is--disable-features=…which is a separate category). Chrome's behavior for multiple--enable-features=flags is version-dependent, but since there's only one, this is unambiguous. -
The integration record's PSNR=∞ claims — taken at the integration test file's word.
Clean execution; leaving as a comment.
— Review by Rames D Jusso

Summary
Adds an experimental frame-capture mode that reads DOM paint records directly via Chrome's
canvas.drawElementImageAPI instead ofPage.captureScreenshot. On a GPU it is ~46% faster than screenshot capture and is pixel-identical (PSNR = ∞). It is fully opt-in behind a new CLI flag — default behaviour is byte-unchanged.How it works
"drawelement"(alongside"beginframe"/"screenshot"). A<canvas layoutsubtree>is injected around the composition root after page-ready; each frame clears the canvas, callsdrawElementImage(root, 0, 0), and reads it back.drawElementImageon a transparent canvas is broken on SwiftShader (Chromium bug; promoted sub-layers are dropped). On detecting SwiftShader (Docker), transparent renders automatically fall back toPage.captureScreenshot. Opaque renders use drawElement on SwiftShader fine.resolveConfigauto-disablesenablePageSideCompositingwhen fast-capture is on (shader transitions fall back to the Node-side layered blend rather than dropping).Wiring
CLI flag → env → engine config:
--experimental-fast-capture→PRODUCER_EXPERIMENTAL_FAST_CAPTURE→EngineConfig.useDrawElement→resolveConfig→ capture session. Threaded through the Docker render path (buildDockerRunArgs) as well.Testing
drawElementService.test.ts,config.test.ts): SwiftShader detection, capture-mode routing, env resolution, page-side auto-disable. Full engine suite green (697 pass).--experimental-fast-capture→ mp4) logsdrawElement canvas injectedand produces a valid mp4 — exercises env → resolveConfig → captureCfg → createCaptureSession → drawelement → ffmpeg encode. This e2e caught (and this PR fixes) the PNG-vs-JPEG encode bug above.Scope / follow-up
spikes/(untracked, per repo convention) — not committed.cloud/render.ts) and Lambda (lambda/render.ts) render surfaces do not carry the flag yet — wired in a stacked follow-up PR. The flag silently no-ops there until then.🤖 Generated with Claude Code