Support GPU Passthrough to VMs #106

# **Summary**
This work item enables **FeOS** to attach one or more physical GPUs directly to a FeOS-managed Virtual Machine (VM) using PCIe passthrough.

This functionality is critical for supporting GPU-accelerated workloads such as Artificial Intelligence (AI), Machine Learning (ML), scientific computing, and high-performance graphics within VMs. The implementation will extend the VM API so that callers can specify each GPU by its host PCIe address (for example, `0000:65:00.0`).

---

## **Scope**

### ✅ In Scope
- Extend the FeOS VM API to allow specifying one or more GPUs via their host PCIe address for attachment to a VM.
- Implement the backend logic for PCIe passthrough of a complete physical GPU (e.g., using IOMMU / `vfio-pci`).
- Ensure the guest VM can recognize the attached GPU and that appropriate vendor drivers (e.g., NVIDIA, AMD) can be installed and utilized.
- Support passing through multiple GPUs to a single VM.

### ❌ Out of Scope
- GPU virtualization technologies like NVIDIA vGPU or AMD MxGPU (SR-IOV). This issue focuses exclusively on **full device passthrough**.
- Live migration of VMs with attached GPUs.
- Dynamic hot-plugging of GPUs. GPUs must be attached when the VM is created or started.
- Host-side GPU driver installation and configuration. This issue assumes the host is correctly prepared for passthrough.

---

## **Responsible Areas**
- FeOS VM Management
- FeOS API

---

## **Contributors**
- @guvenc 
- @MalteJ 

---

## **Acceptance Criteria**

- ### API
  - [ ] The VM API is extended to accept a list of PCIe addresses for GPUs in the VM specification.
  - [ ] The API performs validation to ensure the specified PCIe devices exist and are available for passthrough.
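
To make the API criteria concrete, here is a minimal sketch of the syntactic part of the validation. It is not the FeOS implementation: the function name `validate_gpu_addresses` is hypothetical, and it assumes the API requires the full `domain:bus:device.function` form and rejects duplicates. Checking that a device exists and is available would be a separate, host-side step.

```python
import re

# Full PCI address: 4 hex digits of domain, 2 of bus, a 5-bit device
# number (00-1f) and a 3-bit function number (0-7), e.g. 0000:65:00.0.
_PCI_ADDR = re.compile(r"[0-9a-f]{4}:[0-9a-f]{2}:[01][0-9a-f]\.[0-7]")


def validate_gpu_addresses(addresses):
    """Return the addresses lower-cased, or raise ValueError.

    Hypothetical helper: the name and the exact rules are assumptions
    made for illustration, not part of the FeOS API.
    """
    seen = set()
    normalized = []
    for raw in addresses:
        addr = raw.strip().lower()
        if not _PCI_ADDR.fullmatch(addr):
            raise ValueError(f"invalid PCIe address: {raw!r}")
        if addr in seen:
            raise ValueError(f"duplicate PCIe address: {addr}")
        seen.add(addr)
        normalized.append(addr)
    return normalized
```

Rejecting the short `bus:device.function` form is a design choice in this sketch; it avoids ambiguity on hosts with more than one PCI domain.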

- ### VM Runtime & Guest OS
  - [ ] A VM can be successfully launched with one or more GPUs passed through to it.
  - [ ] The guest operating system correctly identifies the hardware of the passed-through GPU(s) (e.g., visible in `lspci`).
  - [ ] Vendor-specific drivers (e.g., NVIDIA driver) can be installed successfully inside the guest OS.
  - [ ] A GPU-accelerated application or utility (e.g., `nvidia-smi`, a CUDA/OpenCL sample) runs successfully within the VM and can access the GPU's capabilities.
  - [ ] The FeOS host correctly isolates the device, preventing host-level drivers from claiming it while it is assigned to a VM.
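
One way the isolation criterion could be checked on a Linux host is by reading the device's `driver` symlink in sysfs. The sketch below assumes the standard `/sys/bus/pci/devices` layout; the function names `bound_driver` and `is_held_by_vfio` are hypothetical, and the `devices_root` parameter exists only so the logic can be exercised against a fake directory tree.

```python
import os

PCI_DEVICES = "/sys/bus/pci/devices"


def bound_driver(addr, devices_root=PCI_DEVICES):
    """Return the name of the driver bound to a PCI device, or None.

    Raises FileNotFoundError if the device is not present on the host.
    """
    dev = os.path.join(devices_root, addr)
    if not os.path.isdir(dev):
        raise FileNotFoundError(f"no PCI device at {addr}")
    link = os.path.join(dev, "driver")
    if not os.path.islink(link):
        return None  # device exists but no driver has claimed it
    # The symlink points at .../bus/pci/drivers/<driver-name>.
    return os.path.basename(os.readlink(link))


def is_held_by_vfio(addr, devices_root=PCI_DEVICES):
    """True if the device is currently bound to vfio-pci."""
    return bound_driver(addr, devices_root) == "vfio-pci"
```

A test could call `is_held_by_vfio` for every assigned address while the VM is running and fail if a host driver such as `nvidia` or `amdgpu` is reported instead.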

---

## **Action Items**
- [ ] Design the API extension in the VM model for specifying GPU devices.
- [ ] Implement the backend logic to configure the hypervisor for GPU passthrough (e.g., managing IOMMU groups, binding to `vfio-pci`).
- [ ] Ensure that all functions of a GPU (e.g., graphics and audio components on the same PCIe card) are passed through together.
- [ ] Add robust validation and error handling for cases where a GPU is unavailable or passthrough fails.
- [ ] Create integration tests that:
    - [ ] Launch a VM with a single GPU and verify its functionality in the guest.
    - [ ] Launch a VM with multiple GPUs and verify their functionality in the guest.
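
The IOMMU-group and `vfio-pci` action items can be sketched as follows, assuming the standard Linux sysfs layout. All function names are hypothetical. The first two helpers report devices that share an IOMMU group with a requested GPU but were not requested themselves (typically the GPU's audio function); whether FeOS should add them automatically or reject the request is left open here. The third helper only returns the sysfs writes commonly used to rebind a device to `vfio-pci`; it does not perform them.

```python
import os

PCI_ROOT = "/sys/bus/pci"


def iommu_group_members(addr, devices_root=os.path.join(PCI_ROOT, "devices")):
    """List every PCI address in the same IOMMU group as addr (including addr)."""
    group_devices = os.path.join(devices_root, addr, "iommu_group", "devices")
    return sorted(os.listdir(group_devices))


def unrequested_group_members(requested, devices_root=os.path.join(PCI_ROOT, "devices")):
    """Return group members that the caller did not ask for.

    Note: a group can also contain PCIe bridges, which are not bound to
    vfio-pci; filtering those out is omitted from this sketch.
    """
    wanted = set(requested)
    extra = set()
    for addr in requested:
        for member in iommu_group_members(addr, devices_root):
            if member not in wanted:
                extra.add(member)
    return sorted(extra)


def vfio_bind_plan(addr, pci_root=PCI_ROOT):
    """Return ordered (path, value) sysfs writes to hand addr to vfio-pci.

    The unbind step applies only if a driver is currently bound.
    """
    dev = os.path.join(pci_root, "devices", addr)
    return [
        (os.path.join(dev, "driver_override"), "vfio-pci"),
        (os.path.join(dev, "driver", "unbind"), addr),
        (os.path.join(pci_root, "drivers_probe"), addr),
    ]
```

Returning a plan instead of writing to sysfs directly keeps the sketch side-effect free and makes the sequence easy to log or unit-test before it is applied with root privileges.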
