Chrome is currently working on making its models work on the CPU, instead of being restricted to devices with GPUs. This will expand the number of devices which can access these APIs. However, it will also open up the possibility of these APIs being slower for those new users, which the web developer might not anticipate or prefer.
@clarkduvall wanted to start a discussion to see if some sort of constraint API would be valuable here. Here is a sketch to get discussion started, although per the reasons below I don't think it's quite right:
// Test whether a GPU model is available:
const availability = await Summarizer.availability({ deviceConstraint: "gpu" }); // or maybe "fast" / "low-latency"?
// Only create the summarizer if it can be done on the GPU:
const summarizer = await Summarizer.create({ deviceConstraint: "gpu" });

Some considerations:
- We should be sensitive to various implementation architectures, and not bake in assumptions like "there are only two modes, CPU and GPU", or "CPUs are always slow, and GPUs are always fast". I'm not sure what this looks like, exactly. One API surface that is relatively architecture-agnostic would be { requiredTokensPerSecond: 10 }, but I'm not sure that kind of information is easily available to implementations...
- This might be best combined with #38 ("availability() and create() patterns should also work for APIs with cloud options"), which envisions a way to exclude or require cloud-based implementations.
- I don't think this is a significant privacy issue. Developers can gather very similar information, with more work, via benchmarking. And basic information like "is a GPU available" can already be gathered using APIs like WebGPU. Nevertheless, all the APIs are async, so the browser could pop open a prompt if it feels necessary.
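To make the benchmarking point concrete, here is a rough sketch of how a page could estimate throughput itself today, without any constraint option. It assumes a summarizer object with an async summarize() method; the helper name measureCharsPerSecond, the injectable clock parameter, and the use of output characters (rather than tokens) as the unit are all assumptions made for illustration, not part of any proposal.

```javascript
// Sketch only: estimates throughput by timing a single summarize() call.
// `measureCharsPerSecond` is a hypothetical helper, not part of any API.
// `now` is injectable so the timing logic can be exercised with a fake clock.
async function measureCharsPerSecond(
  summarizer,
  sampleText,
  now = () => performance.now()
) {
  const start = now();
  const summary = await summarizer.summarize(sampleText);
  const elapsedMs = now() - start;

  // Guard against a zero-length interval on coarse timers.
  const seconds = Math.max(elapsedMs, 1) / 1000;

  // Output characters per second is only a rough proxy for tokens per second.
  return summary.length / seconds;
}
```

A page could run this once on a short sample, cache the result, and fall back to another implementation if the measured rate is too low. A single run includes any model warm-up cost, so the number is noisy, which is part of why a built-in constraint might be more convenient.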
Our goal in opening this issue is to start the discussion, and see if it's appealing to developers. We'll likely learn more after Chrome rolls out its wider device coverage. If nobody complains, then probably it's not necessary :)