feat: AMD XDNA / Ryzen AI NPU backend #9

Description

@HaleTom

Request

Add an AMD XDNA NPU backend for edge inference, targeting AMD Ryzen AI processors (Ryzen AI 300 series with XDNA 2 NPU, Ryzen 8040 series with XDNA NPU).

  • Architecture reference: AMD XDNA™ Architecture
  • Software stack: AMD Ryzen AI Software (model export via ONNX → MLIR, execution on the AI Engine tile array via the NPU driver)

Why this backend

eddy's mission is "C++ inference library for multi-vendor edge NPUs." Today only Intel OpenVINO NPU is supported. AMD Ryzen AI is the other major PC-class NPU and would meaningfully expand hardware coverage:

  • XDNA is a spatial dataflow NPU: a tiled array of AI Engine processors (VLIW SIMD vector and scalar cores with on-chip memory, clocked at up to 1.3 GHz). Compilation is deterministic and library-based, which suits the fixed-shape transformer workloads eddy runs.
  • XDNA 2 targets Generative AI on PC — AMD explicitly positions it for on-device GenAI, which overlaps with ASR model inference.
  • Same model export pipeline: eddy already exports models to ONNX for OpenVINO; the AMD Ryzen AI stack also consumes ONNX, so model artifacts could be shared with a backend-specific compile step.

Suggested scope

  1. Integrate the AMD Ryzen AI Software SDK (model optimizer + NPU runtime) as an optional CMake backend (EDDY_ENABLE_AMD_XDNA), parallel to the existing OpenVINO path.
  2. Reuse the existing ONNX export pipeline for Parakeet V2/V3 and Whisper; the AMD stack performs its own MLIR-based compilation to the NPU.
  3. Add a --device NPU_AMD selector (or a backend-agnostic abstraction) alongside the existing NPU (Intel), CPU, and AUTO options.
  4. Benchmark on a Ryzen AI 300 series platform and add results to BENCHMARK_RESULTS.md.
  5. Detect the NPU driver and SDK at runtime, and fall back gracefully when AMD hardware is absent.
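
To make items 3 and 5 concrete, here is a minimal sketch of how the device selector and fallback might fit together. All names here (`Device`, `parse_device`, `resolve_device`, `Probe`) are hypothetical and not part of eddy today. The real driver/SDK check is deliberately left out and injected as a callback, and falling back to CPU is an assumption, since this issue only asks for "graceful fallback".

```cpp
#include <functional>
#include <optional>
#include <string_view>

// Hypothetical device kinds. NPU_AMD is the selector proposed in this issue;
// the others mirror the existing NPU (Intel) / CPU / AUTO options.
enum class Device { Cpu, Auto, NpuIntel, NpuAmd };

// Maps a --device argument to a Device; returns nullopt for unknown strings.
inline std::optional<Device> parse_device(std::string_view name) {
    if (name == "CPU") return Device::Cpu;
    if (name == "AUTO") return Device::Auto;
    if (name == "NPU") return Device::NpuIntel;
    if (name == "NPU_AMD") return Device::NpuAmd;
    return std::nullopt;
}

// The availability probe is injected so the real driver/SDK detection
// (not shown here) can be swapped in, and so the logic is testable
// on machines without any NPU.
using Probe = std::function<bool(Device)>;

// Returns the requested device when it is usable. An explicitly requested
// NPU that fails the probe falls back to CPU (assumption). CPU and AUTO are
// passed through unchanged.
inline Device resolve_device(Device requested, const Probe& is_available) {
    const bool is_npu =
        requested == Device::NpuIntel || requested == Device::NpuAmd;
    if (is_npu && !is_available(requested)) {
        return Device::Cpu;
    }
    return requested;
}
```

Whether an unavailable NPU_AMD should silently fall back or fail with a clear error is a design choice worth settling during review; the sketch only shows where that decision would live.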

Notes

  • This is already on the roadmap ("AMD Ryzen AI Software backend"). Filing to capture the specific ask and link the architecture reference for scoping.
  • Worth deciding whether the backend abstraction should be formalized now (a Backend interface with OpenVINO/XDNA/QNN implementations) or kept ad-hoc until a third backend lands. QNN (Qualcomm) is also on the roadmap.
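
For discussion, a formalized abstraction might look roughly like the sketch below. Everything in it is hypothetical: `Backend`, `StubBackend`, and `first_available` are illustrative names, and a real interface would also need model compilation and inference entry points, which are omitted because their shape depends on eddy's existing OpenVINO code.

```cpp
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical interface that OpenVINO, XDNA, and QNN backends would implement.
class Backend {
public:
    virtual ~Backend() = default;
    // Short identifier, e.g. for logs and benchmark tables.
    virtual std::string name() const = 0;
    // True when the driver/SDK this backend needs is usable on this machine.
    virtual bool available() const = 0;
};

// Stand-in implementation with a fixed availability flag, used here only to
// exercise the selection logic without any vendor SDK installed.
class StubBackend final : public Backend {
public:
    StubBackend(std::string name, bool available)
        : name_(std::move(name)), available_(available) {}
    std::string name() const override { return name_; }
    bool available() const override { return available_; }

private:
    std::string name_;
    bool available_;
};

// Returns the first available backend in priority order, or nullptr if none
// is usable. The caller keeps ownership of the backends.
inline const Backend* first_available(
    const std::vector<std::unique_ptr<Backend>>& backends) {
    for (const auto& backend : backends) {
        if (backend->available()) {
            return backend.get();
        }
    }
    return nullptr;
}
```

Keeping the interface this small would let the XDNA backend land first and the QNN backend reuse it later, at the cost of touching the OpenVINO path now rather than when a third backend arrives.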
