Bevy 0.19 Panic: `index out of bounds` in `AtomicSparseBufferVec` dirty-page tracking via `MeshCullingDataBuffer::grow` (#24730)

Full disclosure: I used Claude to help me track this down, but have gone through a lot of passes to try to reduce the noise and remove hallucinations. I'm going to keep using 0.18 in the near-term, as I'm also blocked on some navmesh crates updating, but I'll save my work in progress on a branch in case you'd like me to do any more investigation. Hopefully this is helpful.

## Bevy version and features

- **0.19.0** (release, crates.io)
- Non-default features:
  ```toml
  bevy = { version = "0.19", default-features = false, features = [
    "3d", "bevy_dev_tools", "bevy_remote", "exr", "jpeg", "multi_threaded", "ui",
  ] }
  ```

## Relevant system information

- Rust: `1.96.0` (stable)
- OS: macOS (Darwin 25.5.0)

```
SystemInfo { os: "macOS 26.5.1", kernel: "25.5.0", cpu: "Apple M4 Pro", core_count: "14", memory: "48.0 GiB" }
```

```
AdapterInfo { name: "Apple M4 Pro", vendor: 0, device: 0, device_type: IntegratedGpu, device_pci_bus_id: "", driver: "", driver_info: "", backend: Metal, subgroup_min_size: 4, subgroup_max_size: 64, transient_saves_memory: true }
```

## What you did

The app is 3D and renders most content (units, foliage, weapons) through **custom GPU
instancing**, so the number of *bevy-managed* mesh instances (terrain chunks, weapon
parts, projectiles) stays **under 256**, with meshes moving every frame. The 3D
camera carries `NoIndirectDrawing`, so normal play runs in CpuCulling mode without issue.

The panic fires **deterministically on opening the pause menu**, which spawns a UI
`Camera2d` that lacks `NoIndirectDrawing`. `extract_meshes_for_gpu_building` derives
`any_gpu_culling = !gpu_culling_query.is_empty()`, where
`gpu_culling_query: Extract<Query<(), (With<Camera>, Without<NoIndirectDrawing>)>>`
(mesh.rs:1958, 1967):

1. **During play** the only camera has `NoIndirectDrawing` → `any_gpu_culling = false`
   → the mesh instance queue is `CpuCulling` → `mesh_culling_data_buffer` is unused and
   empty, even though `current_input_buffer` already holds N (< 256) instances.
2. **The menu `Camera2d` has no `NoIndirectDrawing`** → `any_gpu_culling = true` →
   `RenderMeshInstanceGpuQueue::init(true)` switches to `GpuCulling` (mesh.rs:1294).
3. That frame, `collect_meshes_for_gpu_building` calls
   `mesh_culling_data_buffer.grow(current_input_buffer.len() = N)` for the first time
   at N < 256. The off-by-one (below) zeroes its `dirty_pages`, and the following
   fast-path `set` for a moving mesh panics.

Both buffers are monotonic (never cleared or truncated), and `grow` returns early unless
`new_len > old_len`, so the panic hits on the *first* GpuCulling frame rather than in
steady state.
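The mode switch in the steps above can be sketched as a small standalone model. This is not Bevy code: `CameraModel`, `QueueMode`, and `queue_mode` are hypothetical names used only to illustrate that a single camera without `NoIndirectDrawing` (including a UI `Camera2d`) is enough to flip the whole frame to GpuCulling.

```rust
/// Hypothetical stand-in for a camera entity; only the marker matters here.
#[derive(Clone, Copy)]
struct CameraModel {
    /// True if the camera carries `NoIndirectDrawing`.
    no_indirect_drawing: bool,
}

/// Hypothetical stand-in for the two mesh instance queue modes in the report.
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
enum QueueMode {
    CpuCulling,
    GpuCulling,
}

/// Models `any_gpu_culling = !gpu_culling_query.is_empty()`, where the query
/// matches every camera that lacks `NoIndirectDrawing`.
fn queue_mode(cameras: &[CameraModel]) -> QueueMode {
    let any_gpu_culling = cameras.iter().any(|c| !c.no_indirect_drawing);
    if any_gpu_culling {
        QueueMode::GpuCulling
    } else {
        QueueMode::CpuCulling
    }
}
```

With only the gameplay camera (`no_indirect_drawing: true`) the model yields `CpuCulling`; adding a second camera with `no_indirect_drawing: false` yields `GpuCulling`.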

Minimal repro: one camera **with** GPU culling (no `NoIndirectDrawing`), a 3D scene
with **fewer than 256** mesh instances, at least one moving (so the collection fast
path runs).

## What went wrong

Panic from the parallel mesh-collection task pool:

```
thread 'Compute Task Pool (3)' panicked at
  bevy_render-0.19.0/src/render_resource/sparse_buffer_vec.rs:589:25:
index out of bounds: the len is 0 but the index is 0

  2: core::panicking::panic_bounds_check
  3: AtomicSparseBufferVec<T>::set                       // -> note_changed_index
  4: bevy_pbr::render::mesh::collect_meshes_for_gpu_building::{{closure}}::{{closure}}
        at bevy_pbr-0.19.0/src/render/mesh.rs:2603:38    // mesh_culling_data_buffer.set(...)
 ...
 45: bevy_pbr::render::mesh::collect_meshes_for_gpu_building
        at bevy_pbr-0.19.0/src/render/mesh.rs:2472:32    // ComputeTaskPool scope
```

### Root cause

`MeshCullingDataBuffer` is an `AtomicSparseBufferVec<MeshCullingData>` with
`page_size_log2 = 8` → **page size = 256 elements** (mesh.rs:650, 1639-1647). Each
frame `collect_meshes_for_gpu_building` grows it to the instance count (mesh.rs:2442):

```rust
mesh_culling_data_buffer.grow(current_input_buffer.len() as u32);
```

`AtomicSparseBufferVec::grow` sizes `dirty_pages` from the *floored* page index of
`new_len` instead of the page *count* needed for `new_len` elements
(sparse_buffer_vec.rs, grow ~612-650):

```rust
self.values.resize_with(new_len as usize, T::Blob::default);   // values -> new_len
let new_page_count = self.index_to_page(new_len);              // = new_len / 256  (floored!)
self.dirty_pages.resize_with(
    (new_page_count as usize).div_ceil(PAGES_PER_DIRTY_WORD as usize),
    || AtomicU64::new(u64::MAX),
);
fn index_to_page(&self, index: u32) -> u32 { index / self.page_size() }   // floor
```

For `0 < new_len < 256`: `new_page_count = 0` → `dirty_pages` is truncated to **length
0**, while `values` keeps `new_len` elements (all in page 0, which needs one dirty
word). `push` does this correctly (`index_to_page(index) / PAGES_PER_DIRTY_WORD + 1`);
`grow` does not. `set` then writes `values[index]` fine but panics in
`note_changed_index` (sparse_buffer_vec.rs:564-590):

```rust
pub fn set(&self, index: u32, value: T) {
    value.write_to_blob(&self.values[index as usize]);   // ok: values has the index
    self.note_changed_index(index);                       // panics below
}
fn note_changed_index(&self, index: u32) {
    let page = self.index_to_page(index);
    let page_word = page / PAGES_PER_DIRTY_WORD;
    self.dirty_pages[page_word as usize].fetch_or(...);   // :589  dirty_pages empty -> panic
}
```

The panic at line 589 (not the `values[index]` write on 565) proves `values` contains
the index — the inconsistency is purely `values` (populated) vs `dirty_pages` (empty).
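To make the mismatch easier to see outside of Bevy, here is a minimal, non-atomic model of the two vectors. It is a sketch under stated assumptions, not the actual implementation: `SparseModel`, `grow_as_reported`, and `try_set` are hypothetical names, `u32` stands in for the blob type, and `PAGES_PER_DIRTY_WORD = 64` is assumed from the `AtomicU64` dirty words.

```rust
const PAGE_SIZE: u32 = 256; // page_size_log2 = 8, as in the report
const PAGES_PER_DIRTY_WORD: u32 = 64; // assumption: one bit per page in a 64-bit word

/// Hypothetical, simplified model of the buffer: plain vectors, no atomics.
struct SparseModel {
    values: Vec<u32>,
    dirty_pages: Vec<u64>,
}

impl SparseModel {
    fn new() -> Self {
        Self { values: Vec::new(), dirty_pages: Vec::new() }
    }

    /// Sizes `dirty_pages` the way the report describes `grow`:
    /// the floored page index of `new_len` is used as the page count.
    fn grow_as_reported(&mut self, new_len: u32) {
        if new_len as usize <= self.values.len() {
            return; // early return unless the buffer actually grows
        }
        self.values.resize(new_len as usize, 0);
        let new_page_count = new_len / PAGE_SIZE; // floor
        let words = (new_page_count as usize).div_ceil(PAGES_PER_DIRTY_WORD as usize);
        self.dirty_pages.resize(words, u64::MAX);
    }

    /// Checked variant of `set` that reports which vector is too short
    /// instead of panicking.
    fn try_set(&mut self, index: u32, value: u32) -> Result<(), &'static str> {
        let slot = self
            .values
            .get_mut(index as usize)
            .ok_or("values too short")?;
        *slot = value;

        let page = index / PAGE_SIZE;
        let word = self
            .dirty_pages
            .get_mut((page / PAGES_PER_DIRTY_WORD) as usize)
            .ok_or("dirty_pages too short")?;
        *word |= 1u64 << (page % PAGES_PER_DIRTY_WORD);
        Ok(())
    }
}
```

In this model, `grow_as_reported(100)` leaves `values.len() == 100` and `dirty_pages.len() == 0`, so `try_set(0, 7)` returns `Err("dirty_pages too short")`, which corresponds to the out-of-bounds index in `note_changed_index`.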

## Additional information

### Possible fix

In `AtomicSparseBufferVec::grow`, size `dirty_pages` from the page count needed to hold
`new_len` elements:

```rust
let new_page_count = new_len.div_ceil(self.page_size());   // not index_to_page(new_len)
```

(`truncate` has the same floor-based count but is only called with `len = 0`, where it
is harmless. The real defect is `grow` truncating `dirty_pages` below what `values`
requires.)
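The arithmetic behind the proposed change can be checked in isolation. The sketch below compares the floor-based and ceiling-based dirty-word counts; the function names are hypothetical, and `PAGES_PER_DIRTY_WORD = 64` is again an assumption based on the `AtomicU64` dirty words.

```rust
const PAGE_SIZE: u32 = 256; // page_size_log2 = 8, as in the report
const PAGES_PER_DIRTY_WORD: u32 = 64; // assumption: one bit per page in a 64-bit word

/// Dirty-word count as described for the current `grow`: floored page index.
fn dirty_words_floor(new_len: u32) -> u32 {
    (new_len / PAGE_SIZE).div_ceil(PAGES_PER_DIRTY_WORD)
}

/// Dirty-word count with the proposed fix: pages needed for `new_len` elements.
fn dirty_words_ceil(new_len: u32) -> u32 {
    new_len.div_ceil(PAGE_SIZE).div_ceil(PAGES_PER_DIRTY_WORD)
}

/// Index of the dirty word that `set(index)` would touch.
fn word_for_index(index: u32) -> u32 {
    index / PAGE_SIZE / PAGES_PER_DIRTY_WORD
}
```

For `new_len = 255` the floor variant gives 0 words and the ceiling variant gives 1, while both agree on 1 word at `new_len = 256`. Under the 64-pages-per-word assumption, the floor variant also appears to undercount at `new_len = 16385` (1 word, although index 16384 maps to word 1), so the ceiling form looks like the safer choice in general.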

### Workarounds

- **Put `NoIndirectDrawing` on *every* camera, including UI `Camera2d`s.** Keeps
  `any_gpu_culling = false` (CpuCulling), so `MeshCullingDataBuffer` is never used.
  Clean and consistent. (The inverse — no `NoIndirectDrawing` anywhere, i.e. always
  GpuCulling — does *not* help, since the bug is in the GpuCulling path itself.)
- **`PbrPlugin { use_gpu_instance_buffer_builder: false }`** also avoids this path, but
  the per-view batch-set type is still chosen from the hardware-detected
  `GpuPreprocessingMode` (`Culling`), not from that flag — producing an inconsistent
  state that logs `"Dynamic uniform batch sets should be used when GPU preprocessing is
  off"` and breaks real material draws. Only appropriate when the *device* lacks GPU
  preprocessing.

### Likely-related code / history

- #23242 (sparse buffer uploads for mesh input uniforms — introduced `AtomicSparseBufferVec`)
- #22988 (mesh collection workers update GPU data via shared memory — the fast path)
- #22297 (parallel mesh collection)

