Bevy 0.19 Panic: index out of bounds in AtomicSparseBufferVec dirty-page tracking via MeshCullingDataBuffer::grow #24730

Description

@sax

Full disclosure: I used Claude to help me track this down, but have gone through a lot of passes to try to reduce the noise and remove hallucinations. I'm going to keep using 0.18 in the near-term, as I'm also blocked on some navmesh crates updating, but I'll save my work in progress on a branch in case you'd like me to do any more investigation. Hopefully this is helpful.

Bevy version and features

  • 0.19.0 (release, crates.io)
  • Non-default features:
    bevy = { version = "0.19", default-features = false, features = [
      "3d", "bevy_dev_tools", "bevy_remote", "exr", "jpeg", "multi_threaded", "ui",
    ] }

Relevant system information

  • Rust: 1.96.0 (stable)
  • OS: macOS (Darwin 25.5.0)
SystemInfo { os: "macOS 26.5.1", kernel: "25.5.0", cpu: "Apple M4 Pro", core_count: "14", memory: "48.0 GiB" }
AdapterInfo { name: "Apple M4 Pro", vendor: 0, device: 0, device_type: IntegratedGpu, device_pci_bus_id: "", driver: "", driver_info: "", backend: Metal, subgroup_min_size: 4, subgroup_max_size: 64, transient_saves_memory: true }

What you did

A 3D app that renders most content (units, foliage, weapons) through custom GPU
instancing, so the number of Bevy-managed mesh instances (terrain chunks, weapon
parts, projectiles) stays under 256, with meshes moving every frame. The 3D
camera carries NoIndirectDrawing, so play runs in CpuCulling mode with no issue.

The panic fires deterministically on opening the pause menu, which spawns a UI
Camera2d that lacks NoIndirectDrawing. extract_meshes_for_gpu_building derives
any_gpu_culling = !gpu_culling_query.is_empty(), where
gpu_culling_query: Extract<Query<(), (With<Camera>, Without<NoIndirectDrawing>)>>
(mesh.rs:1958, 1967):

  1. During play the only camera has NoIndirectDrawing → any_gpu_culling = false
    → the mesh instance queue is CpuCulling → mesh_culling_data_buffer is unused and
    empty, even though current_input_buffer already holds N (< 256) instances.
  2. The menu Camera2d has no NoIndirectDrawing → any_gpu_culling = true →
    RenderMeshInstanceGpuQueue::init(true) switches to GpuCulling (mesh.rs:1294).
  3. That frame, collect_meshes_for_gpu_building calls
    mesh_culling_data_buffer.grow(current_input_buffer.len() = N) for the first time
    at N < 256. The off-by-one (below) zeroes its dirty_pages, and the following
    fast-path set for a moving mesh panics.

Both buffers are monotonic (never cleared/truncated) and grow early-returns unless
new_len > old_len, so the panic occurs on the first GpuCulling frame rather than in
steady state.
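
The camera-driven mode switch in steps 1 and 2 can be modelled without Bevy. The sketch below is only an illustration: Camera, QueueMode, and queue_mode are hypothetical stand-ins, not Bevy types, and the only behaviour it assumes is the one stated above (any camera lacking NoIndirectDrawing makes any_gpu_culling true).

```rust
/// Hypothetical stand-in for a camera entity; only the marker flag matters here.
struct Camera {
    /// True when the camera carries the NoIndirectDrawing marker.
    no_indirect_drawing: bool,
}

#[derive(Debug, Clone, Copy, PartialEq)]
enum QueueMode {
    CpuCulling,
    GpuCulling,
}

/// Models `any_gpu_culling = !gpu_culling_query.is_empty()`, where the query
/// matches cameras *without* NoIndirectDrawing.
fn queue_mode(cameras: &[Camera]) -> QueueMode {
    let any_gpu_culling = cameras.iter().any(|c| !c.no_indirect_drawing);
    if any_gpu_culling {
        QueueMode::GpuCulling
    } else {
        QueueMode::CpuCulling
    }
}

fn main() {
    // During play: a single 3D camera with NoIndirectDrawing.
    let mut cameras = vec![Camera { no_indirect_drawing: true }];
    assert_eq!(queue_mode(&cameras), QueueMode::CpuCulling);

    // Pause menu opens: a UI camera without the marker is spawned.
    cameras.push(Camera { no_indirect_drawing: false });
    assert_eq!(queue_mode(&cameras), QueueMode::GpuCulling);

    println!("mode after opening the menu: {:?}", queue_mode(&cameras));
}
```

A single unmarked camera is enough to flip the whole queue, which is why a UI-only Camera2d can trigger a path the 3D camera deliberately avoids.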

Minimal repro: one camera with GPU culling (no NoIndirectDrawing), a 3D scene
with fewer than 256 mesh instances, at least one moving (so the collection fast
path runs).

What went wrong

Panic from the parallel mesh-collection task pool:

thread 'Compute Task Pool (3)' panicked at
  bevy_render-0.19.0/src/render_resource/sparse_buffer_vec.rs:589:25:
index out of bounds: the len is 0 but the index is 0

  2: core::panicking::panic_bounds_check
  3: AtomicSparseBufferVec<T>::set                       // -> note_changed_index
  4: bevy_pbr::render::mesh::collect_meshes_for_gpu_building::{{closure}}::{{closure}}
        at bevy_pbr-0.19.0/src/render/mesh.rs:2603:38    // mesh_culling_data_buffer.set(...)
 ...
 45: bevy_pbr::render::mesh::collect_meshes_for_gpu_building
        at bevy_pbr-0.19.0/src/render/mesh.rs:2472:32    // ComputeTaskPool scope

Root cause

MeshCullingDataBuffer is an AtomicSparseBufferVec<MeshCullingData> with
page_size_log2 = 8 → page size = 256 elements (mesh.rs:650, 1639-1647). Each
frame collect_meshes_for_gpu_building grows it to the instance count (mesh.rs:2442):

mesh_culling_data_buffer.grow(current_input_buffer.len() as u32);

AtomicSparseBufferVec::grow sizes dirty_pages from the floored page index of
new_len instead of the page count needed for new_len elements
(sparse_buffer_vec.rs, grow ~612-650):

self.values.resize_with(new_len as usize, T::Blob::default);   // values -> new_len
let new_page_count = self.index_to_page(new_len);              // = new_len / 256  (floored!)
self.dirty_pages.resize_with(
    (new_page_count as usize).div_ceil(PAGES_PER_DIRTY_WORD as usize),
    || AtomicU64::new(u64::MAX),
);
fn index_to_page(&self, index: u32) -> u32 { index / self.page_size() }   // floor

For 0 < new_len < 256: new_page_count = 0 → dirty_pages is truncated to length 0,
while values keeps new_len elements (all in page 0, which needs one dirty
word). push does this correctly (index_to_page(index) / PAGES_PER_DIRTY_WORD + 1);
grow does not. set then writes values[index] fine but panics in
note_changed_index (sparse_buffer_vec.rs:564-590):

pub fn set(&self, index: u32, value: T) {
    value.write_to_blob(&self.values[index as usize]);   // ok: values has the index
    self.note_changed_index(index);                       // panics below
}
fn note_changed_index(&self, index: u32) {
    let page = self.index_to_page(index);
    let page_word = page / PAGES_PER_DIRTY_WORD;
    self.dirty_pages[page_word as usize].fetch_or(...);   // :589  dirty_pages empty -> panic
}

The panic at line 589 (not the values[index] write on 565) proves values contains
the index — the inconsistency is purely values (populated) vs dirty_pages (empty).
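
To make the values-versus-dirty_pages mismatch concrete, here is a minimal standalone model of the sizing logic. It is a sketch, not the Bevy implementation: SparseModel, grow_floored, and try_note_changed_index are hypothetical names, PAGES_PER_DIRTY_WORD = 64 is an assumption (one bit per page in an AtomicU64), and the bit passed to fetch_or is a guess because the real argument is elided in the excerpt above.

```rust
use std::sync::atomic::{AtomicU64, Ordering};

const PAGE_SIZE: u32 = 256; // page_size_log2 = 8, as stated in the issue
const PAGES_PER_DIRTY_WORD: u32 = 64; // ASSUMPTION: one dirty bit per page per u64

/// Toy model: only the lengths of `values` and `dirty_pages` matter.
struct SparseModel {
    values: Vec<u32>,
    dirty_pages: Vec<AtomicU64>,
}

impl SparseModel {
    fn new() -> Self {
        Self { values: Vec::new(), dirty_pages: Vec::new() }
    }

    /// Sizes `dirty_pages` the way the issue describes `grow` doing it:
    /// from the floored page index of `new_len`.
    fn grow_floored(&mut self, new_len: u32) {
        if new_len as usize <= self.values.len() {
            return; // early return unless the buffer actually grows
        }
        self.values.resize(new_len as usize, 0);
        let new_page_count = new_len / PAGE_SIZE; // floored
        self.dirty_pages.resize_with(
            new_page_count.div_ceil(PAGES_PER_DIRTY_WORD) as usize,
            || AtomicU64::new(u64::MAX),
        );
    }

    /// Returns false where an indexing `dirty_pages[word]` would panic.
    fn try_note_changed_index(&self, index: u32) -> bool {
        let page = index / PAGE_SIZE;
        let word = (page / PAGES_PER_DIRTY_WORD) as usize;
        match self.dirty_pages.get(word) {
            Some(w) => {
                // ASSUMPTION: the dirty bit is the page's position within its word.
                w.fetch_or(1u64 << (page % PAGES_PER_DIRTY_WORD), Ordering::Relaxed);
                true
            }
            None => false,
        }
    }
}

fn main() {
    let mut buf = SparseModel::new();
    buf.grow_floored(100); // first grow at N < 256
    assert_eq!(buf.values.len(), 100); // index 0 is a valid value slot
    assert_eq!(buf.dirty_pages.len(), 0); // but no dirty word exists for page 0
    assert!(!buf.try_note_changed_index(0)); // this is where the real code panics
    println!("values = {}, dirty words = {}", buf.values.len(), buf.dirty_pages.len());
}
```

The model uses a checked get instead of indexing so the failing case can be asserted rather than aborting the program.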

Additional information

Possible fix

In AtomicSparseBufferVec::grow, size dirty_pages from the page count needed to hold
new_len elements:

let new_page_count = new_len.div_ceil(self.page_size());   // not index_to_page(new_len)

(truncate has the same floor-based count but is only called with len = 0, where it
is harmless. The real defect is grow truncating dirty_pages below what values
requires.)
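
As a sanity check on the proposed div_ceil sizing, the arithmetic can be compared against the number of dirty words the last valid index actually needs. This is a sketch under the assumption that PAGES_PER_DIRTY_WORD is 64; the three helper functions are hypothetical and exist only for this comparison. Under that assumption the floored count also falls short just past each multiple of 64 pages (for example new_len = 16385), which is an inference from the arithmetic and has not been checked against a running Bevy build.

```rust
const PAGE_SIZE: u32 = 256; // page_size_log2 = 8, as stated in the issue
const PAGES_PER_DIRTY_WORD: u32 = 64; // ASSUMPTION: 64 pages tracked per AtomicU64

/// Dirty-word count as sized from the floored page index (reported behaviour).
fn dirty_words_floored(new_len: u32) -> u32 {
    (new_len / PAGE_SIZE).div_ceil(PAGES_PER_DIRTY_WORD)
}

/// Dirty-word count as sized by the proposed fix.
fn dirty_words_ceil(new_len: u32) -> u32 {
    new_len.div_ceil(PAGE_SIZE).div_ceil(PAGES_PER_DIRTY_WORD)
}

/// Dirty words needed so that marking the last valid index stays in bounds.
fn dirty_words_required(new_len: u32) -> u32 {
    if new_len == 0 {
        return 0;
    }
    let last_page = (new_len - 1) / PAGE_SIZE;
    last_page / PAGES_PER_DIRTY_WORD + 1
}

fn main() {
    for new_len in [1, 255, 256, 257, 16_384, 16_385] {
        println!(
            "new_len {:>6}: floored {} | ceil {} | required {}",
            new_len,
            dirty_words_floored(new_len),
            dirty_words_ceil(new_len),
            dirty_words_required(new_len),
        );
    }

    // The ceil-based count matches the requirement across a wide range.
    for new_len in 0..=100_000u32 {
        assert_eq!(dirty_words_ceil(new_len), dirty_words_required(new_len));
    }

    // The floored count is short for any length below one page.
    assert_eq!(dirty_words_floored(255), 0);
    assert_eq!(dirty_words_required(255), 1);
}
```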

Workarounds

  • Put NoIndirectDrawing on every camera, including UI Camera2ds. Keeps
    any_gpu_culling = false (CpuCulling), so MeshCullingDataBuffer is never used.
    Clean and consistent. (The inverse — no NoIndirectDrawing anywhere, i.e. always
    GpuCulling — does not help, since the bug is in the GpuCulling path itself.)
  • PbrPlugin { use_gpu_instance_buffer_builder: false } also avoids this path, but
    the per-view batch-set type is still chosen from the hardware-detected
    GpuPreprocessingMode (Culling), not from that flag. That produces an inconsistent
    state that logs
    "Dynamic uniform batch sets should be used when GPU preprocessing is off"
    and breaks real material draws. Only appropriate when the device lacks GPU
    preprocessing.

Likely-related code / history

Metadata

Assignees

No one assigned

    Labels

    A-Rendering: Drawing game state to the screen
    C-Bug: An unexpected or incorrect behavior
    D-Modest: A "normal" level of difficulty; suitable for simple features or challenging fixes
    P-Crash: A sudden unexpected crash
    P-Regression: Functionality that used to work but no longer does. Add a test for this!
    S-Ready-For-Implementation: This issue is ready for an implementation PR. Go for it!

    Projects

    Status
    Needs SME Triage
