Add GPU-emulator (mirage/rocjitsu) environment axis for AORTA workloads #227

Open
vivekkhandelwal1 wants to merge 5 commits into main from users/vivekkhandelwal1/mirage-emulation

@vivekkhandelwal1

Collaborator

Summary

  • Adds an emulator environment axis (emulator / mirage_profile) so AORTA workloads and triage cells can run on a software-emulated GPU (the mirage control plane + rocjitsu emulator) with no physical GPU, for hardware-free dev / CI / functional-correctness testing. The new fields are peers of docker/venv/buck_target and are threaded into _aorta_environment via the existing dispatcher asdict path (no dispatcher change). The entry-point registry and JSON sidecar loader accept the new keys, and a built-in emulated-rocjitsu environment is added.
  • New aorta.emulation.mirage_launch: turns an emulated environment into mirage run --profile <p> -- <argv>; non-emulated launches are returned byte-for-byte unchanged. Supports $MIRAGE_BIN resolution and fails loudly rather than silently running on real hardware.
  • aorta probe (SubprocessWorkload) opt-in: wraps its argv through mirage when the cell's environment is emulated.
  • New single-process gpu_smoke workload (trivial CUDA kernel + verify; min_world_size=1) + recipes/gpu-smoke-emulated.yaml — a hardware-free emulator/CI smoke test.
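
The argv wrapping described above can be sketched roughly as follows. This is a minimal illustration, not the PR's actual `aorta.emulation.mirage_launch` code: the function names (`is_emulated`, `resolve_mirage_bin`, `wrap_argv`), the error class, the dict-shaped environment, and the fallback to a `mirage` binary on `PATH` are all assumptions made for the example. Only the `mirage run --profile <p> -- <argv>` shape and the `$MIRAGE_BIN` variable come from the description.

```python
import os
import shutil
from typing import List, Mapping, Optional, Sequence


class MirageNotFoundError(RuntimeError):
    """Hypothetical error: emulation was requested but mirage cannot be located."""


def is_emulated(env: Optional[Mapping[str, object]]) -> bool:
    # Assumption: an environment is emulated when either new key is set.
    return bool(env) and bool(env.get("emulator") or env.get("mirage_profile"))


def resolve_mirage_bin(environ: Mapping[str, str] = os.environ) -> str:
    # $MIRAGE_BIN wins; falling back to PATH lookup is an assumption.
    override = environ.get("MIRAGE_BIN")
    if override:
        return override
    found = shutil.which("mirage")
    if found is None:
        # Fail loudly instead of silently running on real hardware.
        raise MirageNotFoundError("mirage not found; set MIRAGE_BIN")
    return found


def wrap_argv(
    argv: Sequence[str],
    env: Optional[Mapping[str, object]],
    environ: Mapping[str, str] = os.environ,
) -> List[str]:
    if not is_emulated(env):
        # Non-emulated launches pass through unchanged.
        return list(argv)
    profile = env.get("mirage_profile")
    if not profile:
        raise ValueError("emulated environment is missing mirage_profile")
    return [resolve_mirage_bin(environ), "run", "--profile", str(profile), "--", *argv]
```

With these assumptions, a non-emulated environment leaves the command alone, while an environment carrying `mirage_profile: rocjitsu-MI350X` produces a command prefixed with `mirage run --profile rocjitsu-MI350X --`.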

Test plan

  • tests/emulation/ (18 tests, no GPU required): Environment round-trips the new keys, built-in emulated-rocjitsu resolves, sidecars accept the keys, emulation detection, argv wrapping + passthrough, $MIRAGE_BIN resolution + error paths, SubprocessWorkload opt-in wrap.
  • Validated end-to-end on an emulated MI350X (no physical GPU): mirage run --profile rocjitsu-MI350X -- aorta triage run --recipe recipes/gpu-smoke-emulated.yaml -> matrix.md, 0% failure.

Notes / limitations

  • rocjitsu is a functional emulator: single-GPU + real GPU kernels work; multi-GPU enumerates. Multi-rank RCCL collectives (torchrun ≥2 ranks) are pending upstream (rocjitsu's daemon is single-client today). No NIC/timing model — this is a functional-correctness + harness/CI substrate, not a substitute for at-scale or timing-sensitive hardware testing.

Made with Cursor

Run AORTA workloads/triage cells on a software-emulated GPU (mirage control
plane + rocjitsu) with no physical GPU, selected via the environment axis -
for hardware-free dev / CI / functional-correctness.

- Environment gains optional `emulator` / `mirage_profile` fields (peers of
  docker/venv/buck_target), threaded into `_aorta_environment` via the existing
  dispatcher asdict path. Allow-lists updated in the entry-point registry and
  JSON sidecar loader; built-in `emulated-rocjitsu` environment added.
- New `aorta.emulation.mirage_launch`: turns an emulated environment into
  `mirage run --profile <p> -- <argv>`; non-emulated argv is returned
  byte-for-byte unchanged. $MIRAGE_BIN resolution; loud errors.
- `SubprocessWorkload` (aorta probe) opt-in: wraps its argv through mirage when
  the cell's environment is emulated.
- New single-process `gpu_smoke` workload (trivial CUDA kernel + verify;
  min_world_size=1) + `recipes/gpu-smoke-emulated.yaml` - a hardware-free
  emulator/CI smoke test.
- docs + tests (no GPU required).

Validated end-to-end on an emulated MI350X (no physical GPU):
`mirage run --profile rocjitsu-MI350X -- aorta triage run
--recipe recipes/gpu-smoke-emulated.yaml` -> matrix.md, 0% failure.

Co-authored-by: Cursor <cursoragent@cursor.com>
Copilot AI review requested due to automatic review settings June 16, 2026 14:51

Copilot AI left a comment

Contributor

Pull request overview

This PR adds a new GPU-emulator environment axis to AORTA so workloads (and probe-mode subprocess launches) can be run under the mirage + rocjitsu software GPU emulator, enabling hardware-free development/CI runs. It also introduces a minimal gpu_smoke workload and an emulation-focused recipe to validate the end-to-end triage path without a physical GPU.

Changes:

  • Extend Environment (and environment registries/sidecars) with emulator and mirage_profile, plus a built-in emulated-rocjitsu environment.
  • Add aorta.emulation.mirage_launch to detect emulated environments and wrap subprocess argv as mirage run --profile … -- <argv>, and integrate this into SubprocessWorkload setup.
  • Add gpu_smoke workload + recipes/gpu-smoke-emulated.yaml and supporting documentation/tests for the new emulation path.

Reviewed changes

Copilot reviewed 11 out of 12 changed files in this pull request and generated 2 comments.

Show a summary per file
| File | Description |
| --- | --- |
| tests/emulation/test_mirage_launch.py | New tests covering Environment round-trip, sidecar keys, emulation detection, mirage bin resolution, argv wrapping, and SubprocessWorkload opt-in behavior. |
| tests/emulation/__init__.py | Adds emulation test package marker. |
| src/aorta/workloads/gpu_smoke.py | New single-process GPU smoke workload for emulator/CI validation. |
| src/aorta/workloads/_subprocess.py | Wraps subprocess argv via mirage when the resolved environment is emulated (also includes an unintended Tier-3 knob regression noted in comments). |
| src/aorta/registry/types.py | Adds emulator and mirage_profile fields to Environment and documents the new axis. |
| src/aorta/registry/sidecar.py | Allows emulator / mirage_profile keys in JSON sidecar environments. |
| src/aorta/registry/environments.py | Allows new keys and adds built-in emulated-rocjitsu environment. |
| src/aorta/emulation/mirage_launch.py | New emulation launch helper module: environment detection, mirage binary resolution, argv wrapping. |
| src/aorta/emulation/__init__.py | Exposes emulation helpers at the package level. |
| recipes/gpu-smoke-emulated.yaml | New recipe demonstrating triage execution under emulation. |
| pyproject.toml | Registers gpu_smoke workload entry point. |
| docs/plans/mirage-aorta-integration.md | Design/usage documentation for emulated GPU execution. |

Comment thread src/aorta/workloads/_subprocess.py
Comment thread src/aorta/workloads/gpu_smoke.py
- _subprocess.py: the prior commit copied an older revision of this file,
  inadvertently reverting newer main features (e.g. `tier3_vram_growth`
  TrialContext wiring, `_terminate_process_tree`). Rebuild on main's current
  file and apply ONLY the emulation opt-in, now guarded so the non-emulated
  path is a true zero-cost no-op (no import / no extra work) and existing
  probes are byte-for-byte unchanged.
- gpu_smoke.py: use a tolerance-based comparison (math.isclose) instead of
  exact float equality, which is brittle for float16/bfloat16 / larger sizes.
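
To illustrate why the tolerance-based check is more robust than exact equality, here is a small sketch using `math.isclose`. The helper name `values_match` and the tolerance values are illustrative assumptions, not the ones used in `gpu_smoke.py`.

```python
import math


def values_match(actual: float, expected: float,
                 rel_tol: float = 1e-2, abs_tol: float = 1e-3) -> bool:
    # Hypothetical tolerances; low-precision dtypes such as float16/bfloat16
    # usually need looser bounds than float32.
    return math.isclose(actual, expected, rel_tol=rel_tol, abs_tol=abs_tol)


# Exact equality fails on ordinary rounding error...
exact_ok = (0.1 + 0.2) == 0.3
# ...while the tolerance check accepts it and still rejects real corruption.
tolerant_ok = values_match(0.1 + 0.2, 0.3)
corrupt_ok = values_match(1.0, 1.5)
```

Passing a nonzero `abs_tol` matters when the expected value can be zero, since a purely relative tolerance can never be satisfied there unless the values are identical.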

Co-authored-by: Cursor <cursoragent@cursor.com>

Copilot AI left a comment

Contributor

Pull request overview

Copilot reviewed 11 out of 12 changed files in this pull request and generated 2 comments.

Comment thread src/aorta/workloads/gpu_smoke.py
Comment thread src/aorta/workloads/gpu_smoke.py
- gpu_smoke: default `steps` via explicit `is None` so an intentional
  `steps: 0` is honored instead of being treated as missing (falsy-0).
- Add dependency-free `tests/workloads/test_gpu_smoke.py` (stubs a minimal
  fake torch): cuda-availability gate, steps defaulting incl. explicit 0,
  pass/fail tolerance, and corruption (out-of-tolerance) detection.
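
The falsy-0 pitfall fixed here can be shown in a few lines. The config shape, the function names, and the default of 10 steps are assumptions for illustration only; the point is the difference between `or` defaulting and an explicit `is None` check.

```python
from typing import Mapping, Optional

DEFAULT_STEPS = 10  # hypothetical default, not taken from gpu_smoke.py


def resolve_steps_falsy(config: Mapping[str, Optional[int]]) -> int:
    # Buggy pattern: `or` treats an explicit 0 as missing.
    return config.get("steps") or DEFAULT_STEPS


def resolve_steps(config: Mapping[str, Optional[int]]) -> int:
    # Fixed pattern: only a genuinely absent/None value triggers the default.
    steps = config.get("steps")
    return DEFAULT_STEPS if steps is None else steps
```

With `{"steps": 0}`, the first function returns the default while the second honors the explicit zero.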

Co-authored-by: Cursor <cursoragent@cursor.com>

Copilot AI left a comment

Contributor

Pull request overview

Copilot reviewed 12 out of 13 changed files in this pull request and generated 1 comment.

Comment thread src/aorta/registry/types.py
…ields

Adding `emulator`/`mirage_profile` to `Environment` adds two keys to
`asdict(Environment(...))`. Update the existing assertions that pin the exact
asdict shape so they include `emulator: None` / `mirage_profile: None`:
- tests/registry/test_environments.py (pure-buck asdict)
- tests/run/test_dispatcher.py (docker/buck round-trip + buck/image override
  preservation: the `_aorta_environment` payload).
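
Why the pinned assertions had to change can be seen with a simplified stand-in for `Environment`. The dataclass below is an assumption that only models the fields named in this PR (the real class in `src/aorta/registry/types.py` may have more fields and different types), and the buck target string is made up.

```python
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass(frozen=True)
class Environment:
    # Simplified stand-in; field set is an assumption based on the PR text.
    docker: Optional[str] = None
    venv: Optional[str] = None
    buck_target: Optional[str] = None
    emulator: Optional[str] = None        # new in this PR
    mirage_profile: Optional[str] = None  # new in this PR


# asdict() emits every field, including unset ones, so any test that pins
# the exact dict shape must now list the two new keys with value None.
payload = asdict(Environment(buck_target="//hypothetical:target"))
```

Because `asdict` always includes defaulted fields, adding optional fields is backward compatible for readers of the payload but not for tests that compare against an exact dict.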

Co-authored-by: Cursor <cursoragent@cursor.com>

Copilot AI left a comment

Contributor

Pull request overview

Copilot reviewed 14 out of 15 changed files in this pull request and generated 3 comments.

Comment thread src/aorta/workloads/gpu_smoke.py Outdated
Comment thread src/aorta/registry/types.py Outdated
Comment thread src/aorta/workloads/_subprocess.py
- gpu_smoke: validate `dtype` and raise on an unknown/typo value (listing
  allowed values) instead of silently defaulting to float32, which could mask
  a misconfigured run as green. Add a unit test for the raise.
- types.py: mention `emulator` (optional hint paired with `mirage_profile`) in
  the Environment docstring's first paragraph so it matches the schema.
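
A dependency-free sketch of the dtype validation described above might look like this. The allowed set and the helper name `validate_dtype` are assumptions; the actual workload presumably maps these names to torch dtypes, which is omitted here so the example runs without torch.

```python
ALLOWED_DTYPES = ("float32", "float16", "bfloat16")  # assumed set of names


def validate_dtype(name: str) -> str:
    # Raise on unknown or misspelled values instead of silently falling back
    # to float32, which could make a misconfigured run look green.
    if name not in ALLOWED_DTYPES:
        allowed = ", ".join(ALLOWED_DTYPES)
        raise ValueError(f"unknown dtype {name!r}; allowed values: {allowed}")
    return name
```

Listing the allowed values in the message means a typo such as `flaot16` is reported together with the valid choices.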

Co-authored-by: Cursor <cursoragent@cursor.com>

2 participants