`cake`

Join the project community on our server!

Cake is a Rust framework for distributed inference of large language models and image generation models based on Candle. The goal is to run big (70B+) models by repurposing consumer hardware into a heterogeneous cluster of iOS, Android, macOS, Linux and Windows devices, effectively leveraging planned obsolescence as a tool to make AI more accessible and democratic.

This is experimental code that's being actively developed and changed very quickly.

Key Features

Distributed Inference — Shard transformer blocks across multiple devices to run models that don't fit on a single GPU. Learn more.
Multi Model — Support for LLaMA 3.x, Qwen2/2.5, Qwen3.5 and Stable Diffusion.
Multi Platform — CUDA, Metal, and CPU backends across Linux, macOS, Windows, iOS, and Android.
Zero-Config Clustering — mDNS discovery, automatic layer assignment, and model data push with a single --cluster-key flag. Learn more.
OpenAI-Compatible API — REST API with streaming support, plus a built-in web UI and TUI chat client.
Docker — Container builds for Linux/NVIDIA with docker-compose cluster support.

Platform Support

OS	Architectures	Acceleration	Status
GNU/Linux	arm, arm64, x86_64	-	✅
GNU/Linux	arm, arm64, x86_64	CUDA	✅
GNU/Linux	arm, arm64, x86_64	BLAS	✅
Windows	x86_64	BLAS	⚠️
Windows	x86_64	CUDA	✅
macOS	x86_64	-	✅
macOS	aarch64	-	✅
macOS	aarch64	Metal	✅
Android	arm, arm64, x86_64	-	✅
Android	arm, arm64, x86_64	CUDA	⚠️
iOS / iPadOS	aarch64	-	✅
iOS / iPadOS	aarch64	Metal	✅ (A13+ / M-series)

Models

Model	Type	Feature Flag	Status
LLaMA 3.x	Text	`llama` (default)	✅
Qwen2 / Qwen2.5	Text	`qwen2` (default)	✅
Qwen3.5	Text	`qwen3_5` (default)	✅
Stable Diffusion (1.5, 2.1, XL, XL Turbo)	Image	-	✅

Quick Start

cargo build --release --features cuda  # or: --features metal
cake download Qwen/Qwen2.5-Coder-1.5B-Instruct
cake master --model Qwen/Qwen2.5-Coder-1.5B-Instruct --prompt "Hello!"

To start the API server and web UI:

cake master --model Qwen/Qwen2.5-Coder-1.5B-Instruct --api 0.0.0.0:8080

For the full usage guide and API reference, check the project documentation.

Contributors

Star History

License

Released under the GPL 3 license. To see the licenses of the project dependencies, install cargo license with cargo install cargo-license and then run cargo license.

Name		Name	Last commit message	Last commit date
Latest commit History 273 Commits
.cargo		.cargo
.github		.github
cake-cli		cake-cli
cake-core		cake-core
cake-mobile-app		cake-mobile-app
cake-mobile		cake-mobile
cuda-compat		cuda-compat
docs		docs
.dockerignore		.dockerignore
.gitignore		.gitignore
CLAUDE.md		CLAUDE.md
Cargo.lock		Cargo.lock
Cargo.toml		Cargo.toml
Dockerfile		Dockerfile
LICENSE		LICENSE
Makefile		Makefile
README.md		README.md
docker-compose.yml		docker-compose.yml

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Repository files navigation

`cake`

Key Features

Platform Support

Models

Quick Start

Contributors

Star History

License

About

Uh oh!

Releases

Sponsor this project

Uh oh!

Packages

Uh oh!

Uh oh!

Contributors

Uh oh!

Languages

Uh oh!

Folders and files

Latest commit

History

Repository files navigation

cake

Key Features

Platform Support

Models

Quick Start

Contributors

Star History

License

About

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Sponsor this project

Uh oh!

Packages 0

Uh oh!

Uh oh!

Contributors

Uh oh!

Languages

`cake`

Packages