Syraxius/TensorflowDockerBuilder

Build TensorFlow 2.21 Wheels for NVIDIA GeForce RTX 5090

Build an unofficial TensorFlow v2.21.0 GPU wheel for the NVIDIA GeForce RTX 5090, RTX 50 Series, and other Blackwell workstation/consumer cards that use CUDA compute capability 12.0. The default output is a Python 3.12, Linux x86_64 TensorFlow wheel built in Docker with CUDA 12.8, cuDNN 9, native sm_120 cubins, and compute_120 PTX fallback code.

This repository is for developers who need a TensorFlow RTX 5090 wheel before or outside the official TensorFlow release matrix, especially for Blackwell CUDA workloads where stock wheels may not include sm_120 support.

The final .whl is exported to dist/ and repaired with relative CUDA RUNPATHs so downstream projects can use TensorFlow with the nvidia-* CUDA pip packages without manually setting LD_LIBRARY_PATH.

What This Builds

Default TensorFlow GPU wheel configuration:

| Setting | Default |
| --- | --- |
| TensorFlow version | v2.21.0 |
| TensorFlow commit pin | a481b10260dfdf833a1b16007eead49c1d7febf3 |
| Python ABI | Ubuntu CPython 3.12 / cp312 |
| Platform | Linux x86_64 |
| CUDA build image | nvidia/cuda:12.8.1-cudnn-devel-ubuntu24.04 |
| Hermetic CUDA/cuDNN | 12.8.1 / 9.8.0 |
| CUDA architectures | sm_120,compute_120 |
| Bazel CUDA config | cuda_nvcc |
| Wheel suffix | +selfbuild |
| Wheel output | dist/ |

sm_120 gives native RTX 5090 and RTX 50 Series Blackwell cubins. compute_120 embeds PTX so the NVIDIA driver has a JIT fallback path for compatible 12.x devices.

Supported GPUs and Architectures

This builder defaults to NVIDIA Blackwell workstation/consumer GPUs with CUDA compute capability 12.0. NVIDIA's CUDA GPU Compute Capability table lists the following 12.0 cards, which are the intended default target for this wheel:

  • GeForce RTX 50 Series: RTX 5090, RTX 5080, RTX 5070 Ti, RTX 5070, RTX 5060 Ti, RTX 5060, and RTX 5050
  • NVIDIA RTX PRO Blackwell: RTX PRO 6000 Blackwell Server Edition, RTX PRO 6000 Blackwell Workstation Edition, RTX PRO 6000 Blackwell Max-Q Workstation Edition, RTX PRO 5000 Blackwell, RTX PRO 4500 Blackwell, RTX PRO 4000 Blackwell, RTX PRO 4000 Blackwell SFF Edition, and RTX PRO 2000 Blackwell

For those cards, sm_120 is the native cubin target and compute_120 is the embedded PTX target.
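
To make the cubin/PTX distinction concrete, here is a minimal Python sketch that splits a target list and classifies how a given device would be served. The function names `split_targets` and `coverage` are hypothetical and not part of main.py. The classification is deliberately simplified: it treats only an exact sm_ match as native and assumes generic PTX can be JIT-compiled for any device whose compute capability is at least the PTX target.

```python
def split_targets(spec: str) -> tuple[list[str], list[str]]:
    """Split a list such as 'sm_120,compute_120' into (cubin, PTX) targets."""
    cubins, ptx = [], []
    for item in spec.split(","):
        item = item.strip()
        if item.startswith("sm_"):
            cubins.append(item)      # native machine code for one architecture
        elif item.startswith("compute_"):
            ptx.append(item)         # embedded PTX, JIT-compiled by the driver
        elif item:
            raise ValueError(f"unrecognized CUDA target: {item!r}")
    return cubins, ptx


def coverage(spec: str, device_cc: str) -> str:
    """Classify a device such as '12.0' as native, ptx-jit, or unsupported.

    Simplified assumption: only an exact sm_ match counts as native.
    """
    cubins, ptx = split_targets(spec)
    major, minor = device_cc.split(".")
    number = int(major) * 10 + int(minor)   # '12.0' -> 120, '8.9' -> 89
    if f"sm_{number}" in cubins:
        return "native"
    if any(int(p.removeprefix("compute_")) <= number for p in ptx):
        return "ptx-jit"
    return "unsupported"
```

With the default `sm_120,compute_120` list, `coverage(..., "12.0")` reports a native match, while an Ada card at `"8.9"` is reported as unsupported, which is why the other GPU families need a rebuild.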

The default wheel is not a universal NVIDIA GPU wheel. It does not include native cubins for older or different compute capabilities such as:

| GPU generation or card family | Common CUDA target | Default support |
| --- | --- | --- |
| NVIDIA GB200/B200 Blackwell data center | sm_100 | Rebuild with an explicit sm_100/compute_100 target |
| NVIDIA GB300/B300 Blackwell data center | sm_103 | Rebuild with an explicit sm_103/compute_103 target |
| NVIDIA GH200, H200, H100 Hopper | sm_90 | Rebuild with sm_90/compute_90 |
| RTX 6000 Ada, RTX 4090, RTX 4080, RTX 4070, RTX 4060 | sm_89 | Rebuild with sm_89/compute_89 |
| RTX A6000/A5000/A4000 and GeForce RTX 30 Series Ampere | sm_86 | Rebuild with sm_86/compute_86 |
| NVIDIA A100/A30 Ampere data center | sm_80 | Rebuild with sm_80/compute_80 |

When adding 10.x, 12.x, or future CUDA targets, use a CUDA image and nvcc version that can compile those architectures.

To build one TensorFlow wheel for several GPU generations, change the compute_capabilities setting through the interactive menu and regenerate the Dockerfile. For example:

sm_89,compute_89,sm_120,compute_120

More architecture targets increase TensorFlow build time and wheel size. For the RTX 5090 specifically, keep sm_120,compute_120 so the wheel contains native Blackwell kernels plus a PTX fallback.
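
If you prefer to derive the multi-architecture string from compute capability numbers rather than typing it by hand, a small helper along these lines may help. `targets_for` is a hypothetical name used only for illustration. It emits an sm_/compute_ pair for each capability, sorted from oldest to newest, and the result is meant to be pasted into the compute_capabilities setting.

```python
def targets_for(capabilities: list[str]) -> str:
    """Build an 'sm_X,compute_X,...' list from capabilities like '8.9'."""
    # Deduplicate and sort numerically so '8.9' comes before '12.0'.
    ordered = sorted(
        set(capabilities),
        key=lambda cc: tuple(int(part) for part in cc.split(".")),
    )
    targets = []
    for cc in ordered:
        major, minor = cc.split(".")
        number = f"{int(major)}{int(minor)}"   # '12.0' -> '120'
        targets.append(f"sm_{number}")         # native cubin target
        targets.append(f"compute_{number}")    # PTX fallback target
    return ",".join(targets)
```

For example, `targets_for(["12.0", "8.9"])` yields the same string as the Ada plus Blackwell example above.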

Platform Support

The default artifact is a CPython cp312 wheel for Linux x86_64. This repository does not build Windows, macOS, Linux aarch64, or Jetson wheels by default. Non-default Python versions are possible, but each wheel should be installed and smoke-tested in a matching Python environment.

Requirements

  • Linux x86_64 host
  • Docker with Buildx and BuildKit
  • Python 3 to run main.py
  • Recent NVIDIA driver for RTX 5090 or RTX 50 Series runtime testing
  • Enough disk space, memory, and time for a TensorFlow source build

TensorFlow source is cloned and compiled inside Docker. The host Python environment is only used to run the build driver.
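
Before starting a long build, it can be worth checking the host against the list above. The following is a rough preflight sketch, not something main.py is known to do. The `preflight` name and the 50 GB free-space threshold are assumptions chosen for illustration, and the check only confirms that a `docker` executable is on PATH, not that Buildx or BuildKit is enabled.

```python
import platform
import shutil


def preflight(path: str = ".", min_free_gb: float = 50) -> dict[str, bool]:
    """Return a pass/fail map for basic host requirements.

    min_free_gb is an illustrative threshold, not a documented requirement.
    """
    free_gb = shutil.disk_usage(path).free / 1024**3
    return {
        "linux_x86_64": platform.system() == "Linux"
        and platform.machine() == "x86_64",
        "docker_on_path": shutil.which("docker") is not None,
        "enough_disk": free_gb >= min_free_gb,
    }


if __name__ == "__main__":
    for name, ok in preflight().items():
        print(f"{name}: {'ok' if ok else 'MISSING'}")
```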

Build TensorFlow 2.21 for RTX 5090

Review the resolved defaults:

python3 main.py --show-config

Generate the Dockerfile and TensorFlow Bazel config:

python3 main.py --generate

Build the TensorFlow v2.21.0 CUDA wheel:

python3 main.py --build

Successful default builds write the wheel to:

dist/tensorflow-2.21.0+selfbuild-cp312-cp312-linux_x86_64.whl
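
The filename follows the standard wheel naming layout of distribution, version, optional build tag, Python tag, ABI tag, and platform tag. As a sketch for scripts that need to inspect the output, the hypothetical `parse_wheel_name` helper below splits a wheel filename into those fields using only the standard library.

```python
from typing import NamedTuple


class WheelName(NamedTuple):
    distribution: str
    version: str
    python_tag: str
    abi_tag: str
    platform_tag: str


def parse_wheel_name(filename: str) -> WheelName:
    """Split a wheel filename into its dash-separated fields."""
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel filename: {filename!r}")
    parts = filename[: -len(".whl")].split("-")
    if len(parts) == 6:
        del parts[2]                 # drop the optional build tag
    if len(parts) != 5:
        raise ValueError(f"unexpected wheel filename: {filename!r}")
    return WheelName(*parts)
```

For the default build, the parsed version is `2.21.0+selfbuild`, the Python and ABI tags are both `cp312`, and the platform tag is `linux_x86_64`.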

Build logs are written to logs/.

You can also run the interactive menu:

python3 main.py

Python Versions

The default build uses Ubuntu's packaged Python 3.12. For other Python ABIs, the interactive menu can switch the Python distribution to one of:

  • ubuntu: Ubuntu package CPython, recommended for the default Python 3.12 build
  • deadsnakes: Deadsnakes PPA CPython, useful for quick experiments
  • source: CPython built from python.org sources inside Docker

The source-built path currently supports Python 3.10, 3.11, and 3.13. It avoids relying on Launchpad during unattended build queues, at the cost of building CPython before TensorFlow. Wheel filenames and virtualenv commands will use the matching Python ABI tag, such as cp311 or cp313.
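
As an illustration of how the ABI tag changes the wheel filename, the sketch below derives the tag from a Python version string. Both function names are hypothetical, and the filename pattern assumes non-default builds keep the same layout and `+selfbuild` suffix as the default wheel.

```python
def cpython_tag(python_version: str) -> str:
    """Map a version such as '3.11' or '3.11.9' to its CPython tag."""
    major, minor = python_version.split(".")[:2]
    return f"cp{int(major)}{int(minor)}"


def expected_wheel(
    python_version: str,
    tf_version: str = "2.21.0",
    suffix: str = "selfbuild",
) -> str:
    """Guess the wheel filename, assuming the default naming pattern."""
    tag = cpython_tag(python_version)
    return f"tensorflow-{tf_version}+{suffix}-{tag}-{tag}-linux_x86_64.whl"
```

`expected_wheel("3.12")` reproduces the default filename, and `expected_wheel("3.13")` swaps in cp313 for both tags.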

Install the TensorFlow RTX 5090 Wheel

Install with TensorFlow's CUDA extras so the required NVIDIA CUDA libraries are pulled in as pip packages. For the default Python 3.12 wheel:

python3.12 -m venv /tmp/tf-5090
. /tmp/tf-5090/bin/activate
python -m pip install --upgrade pip
python -m pip install 'dist/tensorflow-2.21.0+selfbuild-cp312-cp312-linux_x86_64.whl[and-cuda]'

If reinstalling into an environment that already has the same wheel version, force pip to replace the installed files:

python -m pip install --force-reinstall \
  'dist/tensorflow-2.21.0+selfbuild-cp312-cp312-linux_x86_64.whl[and-cuda]'

Verify GPU Support

Run a small TensorFlow import and GPU check:

python - <<'PY'
import json
import tensorflow as tf

print(tf.__version__)
print(json.dumps(tf.sysconfig.get_build_info(), indent=2, sort_keys=True))
print(tf.config.list_physical_devices("GPU"))
PY

Expected highlights:

  • Version: 2.21.0+selfbuild
  • cuda_compute_capabilities: ["sm_120", "compute_120"]
  • One or more RTX 5090, RTX 50 Series, or compatible Blackwell GPUs listed
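
To turn the expected highlights into an automated check, you could compare the values from the smoke test against the defaults. This is a sketch only. `check_build` is a hypothetical helper, and it takes plain values so that it can be exercised without a GPU.

```python
EXPECTED_VERSION = "2.21.0+selfbuild"
EXPECTED_TARGETS = {"sm_120", "compute_120"}


def check_build(version: str, build_info: dict, gpus: list) -> list[str]:
    """Return a list of problems; an empty list means the checks passed."""
    problems = []
    if version != EXPECTED_VERSION:
        problems.append(f"unexpected version: {version}")
    found = set(build_info.get("cuda_compute_capabilities", []))
    missing = EXPECTED_TARGETS - found
    if missing:
        problems.append(f"missing CUDA targets: {sorted(missing)}")
    if not gpus:
        problems.append("no GPU devices listed")
    return problems


# In an environment with the wheel installed, call it like this:
#   import tensorflow as tf
#   print(check_build(tf.__version__,
#                     tf.sysconfig.get_build_info(),
#                     tf.config.list_physical_devices("GPU")))
```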

Leave LD_LIBRARY_PATH unset for the smoke test. The wheel repair step adds relative ELF RUNPATHs into TensorFlow shared objects so TensorFlow can find CUDA libraries installed by the nvidia-* pip packages under site-packages/nvidia/*/lib.
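
The idea behind a relative RUNPATH can be sketched as follows. This is not the repository's repair code, which is not shown here. `relative_runpath` and the example paths are illustrative assumptions that only demonstrate how `$ORIGIN`-relative entries point from a shared object to sibling nvidia/*/lib directories.

```python
import posixpath


def relative_runpath(shared_object: str, lib_dirs: list[str]) -> str:
    """Build a colon-separated RUNPATH relative to the shared object.

    $ORIGIN is expanded by the dynamic linker to the directory that
    contains the shared object, so the entries keep working wherever
    site-packages is located.
    """
    origin = posixpath.dirname(shared_object)
    entries = []
    for lib_dir in lib_dirs:
        rel = posixpath.relpath(lib_dir, start=origin)
        entries.append(f"$ORIGIN/{rel}")
    return ":".join(entries)
```

For a library at `/sp/tensorflow/libtensorflow_framework.so.2` and a CUDA directory at `/sp/nvidia/cudnn/lib`, the computed entry is `$ORIGIN/../nvidia/cudnn/lib`.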

Use manual LD_LIBRARY_PATH only as a diagnostic if you are testing an unrepaired or custom wheel.

Why This Uses cuda_nvcc

TensorFlow v2.21.0 has a cuda_clang build path, but its hermetic LLVM 18 toolchain rejects sm_120. This builder uses TensorFlow's cuda_nvcc Bazel config so CUDA 12.8 nvcc compiles native RTX 5090 and RTX 50 Series kernels.

Project Layout

  • main.py: interactive and non-interactive Docker build driver
  • Dockerfile: generated BuildKit recipe
  • .tf_configure.bazelrc: generated TensorFlow Bazel config
  • dist/: wheel output
  • logs/: Docker build logs
  • .tf5090-build.json: optional local saved build settings

dist/, logs/, and .tf5090-build.json are ignored by git.

Publishing Notes

This is an unofficial downstream TensorFlow build, not an official TensorFlow release. For GitHub releases, keep the wheel filename and release notes clear about the target:

  • TensorFlow v2.21.0
  • Python ABI, for example cp312 for the default build
  • Linux x86_64
  • CUDA 12.8 / cuDNN 9
  • NVIDIA Blackwell compute capability 12.0
  • RTX 5090 / RTX 50 Series sm_120

The local version suffix (+selfbuild) is intentional for a downstream wheel. If you publish multiple variants, use distinct local suffixes such as +rtx5090.cuda128.sm120.
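
When composing distinct suffixes for several variants, it helps to validate them first. The sketch below uses a hypothetical `with_local_suffix` helper and a deliberately strict rule: each dot-separated segment may contain only ASCII letters and digits. That is narrower than what Python packaging tools accept, but it keeps filenames predictable.

```python
import re

# Strict form: alphanumeric segments separated by single periods.
_LOCAL_LABEL = re.compile(r"[A-Za-z0-9]+(?:\.[A-Za-z0-9]+)*")


def with_local_suffix(base_version: str, *parts: str) -> str:
    """Append a local version label such as '+rtx5090.cuda128.sm120'."""
    label = ".".join(parts)
    if not _LOCAL_LABEL.fullmatch(label):
        raise ValueError(f"invalid local version label: {label!r}")
    return f"{base_version}+{label.lower()}"
```

Note that a segment such as `sm_120` is rejected by this check because of the underscore, which is why the example suffix above spells it `sm120`.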

Notes

  • TensorFlow source builds are large. Docker image and BuildKit caches can use tens of gigabytes.
  • The generated wheel depends on NVIDIA CUDA pip packages when installed with [and-cuda]; it does not vendor those libraries directly.
  • Non-default Python builds should be installed and smoke-tested in a matching Python environment.
  • Arbitrary TensorFlow tags may require matching changes to Python, CUDA, cuDNN, Bazel, or TensorFlow build flags.

About

Portable and clean way to build and package TensorFlow with RTX 5090 support as distributable Python wheels
