feat: expand gateway glibc compatibility #1937

@pimlock

Problem Statement

OpenShell gateway release binaries currently do not have a confirmed Rocky/RHEL 8-compatible GNU/Linux ABI floor.

Rocky Linux 8 and RHEL 8 ship glibc 2.28 with enterprise backports. Users running gateway on those hosts need openshell-gateway to start without requiring newer glibc symbols. The v0.0.63 generic GNU x86_64 gateway artifact was dynamically linked and required symbols above that floor:

  • GLIBC_2.29: log, log2, exp, pow, exp2
  • GLIBC_2.29: posix_spawn_file_actions_addchdir_np
  • GLIBC_2.30: pthread_cond_clockwait
  • GLIBC_2.30: gettid

Follow-up local builds showed that newer Rust/std paths can avoid some of these with weak optional lookups or fallbacks, but the CI path is still not settled. In particular, the current cargo-zigbuild gateway build with bundled Z3 has hit C/C++ wrapper/toolchain issues in CI before reaching the final GLIBC symbol verification stage.

Proposed Design

Expand gateway glibc compatibility so the released GNU/Linux gateway can explicitly target Rocky/RHEL/Alma/Oracle Linux 8 class systems with glibc 2.28.

The final build path should:

  • produce openshell-gateway release artifacts for GNU/Linux with no required GLIBC_* symbols above GLIBC_2.28;
  • keep a direct verifier in CI, for example (the objdump check must print nothing for the build to pass):
strings openshell-gateway | rg -o 'GLIBC_[0-9]+\.[0-9]+' | sort -Vu
objdump -T openshell-gateway | rg 'GLIBC_2\.(29|[3-9][0-9])'
  • smoke-test the resulting binary on a Rocky/RHEL 8-compatible runtime, ideally with LD_BIND_NOW=1 so that all symbols are resolved at load time and any missing binding fails at startup rather than on first call;
  • document which release artifacts are expected to support glibc 2.28.
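
A minimal sketch of how the verifier above could gate CI, assuming a POSIX shell with GNU grep, sed, and sort on the runner. The function name glibc_above_floor is hypothetical and not something that exists in the repository. It reads a symbol listing on stdin (for example from objdump -T), prints every GLIBC_* version above the given floor, and returns non-zero if any were found:

```shell
# Hypothetical helper: print GLIBC_* versions on stdin that exceed the floor.
# Exit status is 0 when nothing is above the floor, 1 otherwise.
glibc_above_floor() (
  floor="$1"
  found=0
  # Accept two- and three-component versions (GLIBC_2.28, GLIBC_2.2.5).
  for v in $(grep -oE 'GLIBC_[0-9]+\.[0-9]+(\.[0-9]+)?' | sed 's/^GLIBC_//' | sort -Vu); do
    # sort -V orders version strings; the last line is the higher of the two.
    highest=$(printf '%s\n%s\n' "$floor" "$v" | sort -V | tail -n 1)
    if [ "$v" != "$floor" ] && [ "$highest" = "$v" ]; then
      echo "GLIBC_$v"
      found=1
    fi
  done
  [ "$found" -eq 0 ]
)

# Example usage in a CI step (artifact path assumed):
#   objdump -T openshell-gateway | glibc_above_floor 2.28
```

Comparing with sort -V instead of a fixed regex keeps the check valid for three-component versions such as GLIBC_2.2.5 and for any future glibc release.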

Alternatives Considered

1. Keep cargo-zigbuild with an explicit glibc 2.28 target and unbundle Z3

Build the gateway with an explicit target such as:

cargo zigbuild --release --target x86_64-unknown-linux-gnu.2.28 -p openshell-server --bin openshell-gateway

Then stop bundling Z3 into the gateway binary and provide Z3 as a runtime package dependency instead.

Pros:

  • Keeps the desired glibc floor explicit in the build target.
  • Avoids the current z3-sys bundled C/C++ build path that is interacting poorly with cargo-zigbuild in CI.
  • May align with system package managers for Debian/RPM-style gateway packages.

Cons / open questions:

  • We need to confirm whether Z3 is still required at build time for Rust crate metadata/link discovery, even if it is not bundled.
  • Release package manifests would need to declare the correct Z3 runtime dependency for every package format we produce.
  • Runtime dependency availability and ABI compatibility would vary by distro.
  • Tarball users would need clear instructions or a bundled sidecar strategy, since tarballs do not install system packages.
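
To see which shared libraries an unbundled build actually requires at runtime, and therefore what each package manifest would have to declare, the binary's dynamic section can be inspected. A small sketch, assuming binutils readelf is available; needed_libs is a hypothetical helper name, and the exact Z3 soname depends on the distro's package:

```shell
# Hypothetical helper: extract DT_NEEDED sonames from `readelf -d` output on stdin.
needed_libs() {
  sed -n 's/.*(NEEDED).*Shared library: \[\(.*\)\]$/\1/p'
}

# Example usage (artifact path assumed):
#   readelf -d openshell-gateway | needed_libs
#   readelf -d openshell-gateway | needed_libs | grep '^libz3\.so'
```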

2. Use an older glibc sysroot on the existing CI image

A sysroot is a directory containing the target platform's headers and libraries. During linking, the compiler/linker can be told to use that directory instead of the build host's /usr/include, /lib, and /usr/lib.

This lets us use a newer build container/toolchain while still producing a binary with an older glibc ABI floor. For example, run modern Rust/clang/cargo in an Ubuntu 24.04 or Fedora image, but link against a Rocky/RHEL 8 sysroot containing glibc 2.28:

# export is required; a plain assignment would not reach the cargo process.
export RUSTFLAGS="-C linker=clang \
  -C link-arg=--sysroot=/opt/sysroots/rocky8 \
  -C link-arg=-L/opt/sysroots/rocky8/usr/lib64 \
  -C link-arg=-L/opt/sysroots/rocky8/lib64"
cargo build --release -p openshell-server --bin openshell-gateway --target x86_64-unknown-linux-gnu

Deno uses a related approach in denoland/deno_sysroot_build: the sysroot provides older-versioned libraries to the linker while the build itself does not run inside a chroot/sysroot. See https://github.com/denoland/deno_sysroot_build.

Pros:

  • Makes the ABI floor explicit through the sysroot contents rather than through the ambient CI host.
  • Allows modern build tooling while linking against Rocky/RHEL 8-era glibc.
  • Should fix accidental bindings of old APIs to newer symbol versions, such as log@GLIBC_2.29 or pthread_create@GLIBC_2.34.
  • May avoid a full old-image build while keeping CI maintainability.

Cons / open questions:

  • The sysroot must be built, versioned, checksummed, and refreshed intentionally.
  • C/C++ dependencies such as Z3 still need careful compiler/linker configuration against the sysroot.
  • A sysroot will not invent APIs missing from glibc 2.28. Code still needs fallbacks for newer APIs such as:
posix_spawn_file_actions_addchdir_np  # glibc 2.29
pthread_cond_clockwait                # glibc 2.30
gettid                                # glibc 2.30
  • We may need linker shims for specific libm symbols if they otherwise bind to newer symbol versions.
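
The "versioned, checksummed" requirement could be enforced by checking a pinned digest before the sysroot is unpacked. A sketch, assuming the sysroot ships as a single archive and that sha256sum from GNU coreutils is on the runner; the function name, archive name, and digest variable are all hypothetical:

```shell
# Hypothetical helper: refuse to use a sysroot archive whose SHA-256 does not
# match the pinned value. Returns 0 on match, 1 on mismatch.
verify_sysroot_archive() (
  archive="$1"
  expected="$2"
  actual=$(sha256sum "$archive" | cut -d' ' -f1)
  if [ "$actual" != "$expected" ]; then
    echo "sysroot checksum mismatch for $archive: got $actual" >&2
    exit 1
  fi
)

# Example usage (names assumed):
#   verify_sysroot_archive rocky8-sysroot.tar "$ROCKY8_SYSROOT_SHA256"
```

Pinning the digest in the repository makes a sysroot refresh an explicit, reviewable change rather than a silent one.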

3. Introduce musl as an alternative artifact

Provide an additional musl/static gateway artifact for environments that prefer not to depend on host glibc at all.

Pros:

  • Avoids the host glibc ABI floor problem for users who can run a musl-linked gateway.
  • We already use static musl successfully for the supervisor image path.
  • Could be useful as an escape hatch while GNU/glibc compatibility work continues.

Cons / open questions:

  • Fully switching the primary gateway artifact to musl is not a good default, because musl builds are harder to validate for FIPS than glibc builds.
  • Native TLS, DNS, libc behavior, and dependency behavior need separate validation.
  • This should likely be additive, not a replacement for GNU/glibc artifacts.

Agent Investigation

Context from the glibc compatibility investigation:

  • Rocky/RHEL 8 glibc is based on 2.28, with enterprise backports.
  • The v0.0.63 openshell-gateway-x86_64-unknown-linux-gnu.tar.gz artifact required GLIBC_2.29 and GLIBC_2.30 symbols.
  • Local experiments showed that a 2.28-floor build can start on Rocky 8 with LD_BIND_NOW=1, but the current CI path still needs a robust build strategy.
  • Rust/std weak optional symbol behavior appears acceptable for some symbols:
    • posix_spawn_file_actions_addchdir_np has a fallback path when unavailable.
    • gettid can fall back to the raw syscall when the glibc wrapper is unavailable.
    • pthread_cond_clockwait disappeared from the rebuilt binary; current Rust std Linux condvars use a futex path / older pthread symbols.
  • The bundled Z3 path is the main unresolved build-system complication for the gateway GNU/Linux release build.
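
The local Rocky 8 startup experiment could be reproduced in CI along these lines. This is only a sketch under several assumptions: a Docker-compatible container runtime is available, the rockylinux:8 image is an acceptable stand-in for Rocky/RHEL 8, and the gateway accepts a --version flag that exits after startup. The function name and environment variables are hypothetical:

```shell
# Hypothetical smoke test: start the built gateway inside a Rocky 8 container
# with eager symbol binding, so an unresolved symbol aborts at load time.
smoke_test_rocky8() (
  artifact_dir="$1"                       # directory containing openshell-gateway
  runtime="${CONTAINER_RUNTIME:-docker}"  # podman accepts the same flags
  image="${ROCKY8_IMAGE:-rockylinux:8}"   # assumed image name
  "$runtime" run --rm \
    -e LD_BIND_NOW=1 \
    -v "$artifact_dir:/artifact:ro" \
    "$image" /artifact/openshell-gateway --version  # --version is an assumption
)

# Example usage (path assumed):
#   smoke_test_rocky8 "$PWD/target/x86_64-unknown-linux-gnu/release"
```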

Definition of Done

  • Decide on the primary build strategy for GNU/Linux gateway glibc 2.28 compatibility.
  • Resolve the bundled/unbundled Z3 build and runtime packaging story.
  • Add CI verification that fails if openshell-gateway requires symbols above GLIBC_2.28.
  • Add a Rocky/RHEL 8 startup smoke test for the built artifact.
  • Update release/package documentation to describe the supported glibc floor and any Z3 runtime dependency.
  • Decide whether to add a separate musl gateway artifact as an alternative.
