GPU by mailhexu · Pull Request #85 · mailhexu/TB2J

mailhexu · 2026-04-02T13:19:10Z

Option to use GPU acceleration with JAX.

- Add --spin-conf CLI option and spin_conf TOML parameter for specifying magnetic moments
- Refactor: extract prepare_magnon_from_params to reduce code duplication
- Fix create_plot_script to write to correct output directory
- Add magnon_theory.md documentation
- Add examples directory with scripts and config files
- Add comprehensive tests for magnon functionality

- Set default=True for --no-Jiso, --no-Jani, --no-DMI, --no-SIA CLI args to ensure all interactions are enabled by default
- Remove path prepending for spin_conf_file and uz_file; paths are now relative to current working directory, not TB2J results path
- Add combined J tensor output in exchange.out showing J = Jiso*I + DMI + Jani
- Document combined tensor formula and provide verified example in docs
- Fix type hints for Optional[str] in MagnonParameters
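The combined tensor J = Jiso*I + DMI + Jani can be illustrated with a minimal NumPy sketch. The helper names (`dmi_matrix`, `combined_tensor`, `split_tensor`) are hypothetical, and the sign convention for the DMI matrix (chosen so that S_i^T A S_j = D · (S_i × S_j)) is an assumption; the convention actually written to exchange.out may differ.

```python
import numpy as np

def dmi_matrix(D):
    """Antisymmetric 3x3 matrix A with S_i^T A S_j == D . (S_i x S_j).

    Assumed convention for illustration only.
    """
    Dx, Dy, Dz = D
    return np.array([[0.0,  Dz, -Dy],
                     [-Dz, 0.0,  Dx],
                     [ Dy, -Dx, 0.0]])

def combined_tensor(Jiso, D, Jani):
    """J = Jiso * I + DMI (antisymmetric) + Jani (symmetric)."""
    return Jiso * np.eye(3) + dmi_matrix(D) + np.asarray(Jani, dtype=float)

def split_tensor(J):
    """Recover the DMI vector and the symmetric part from a full tensor."""
    antisym = 0.5 * (J - J.T)
    sym = 0.5 * (J + J.T)
    D = np.array([antisym[1, 2], antisym[2, 0], antisym[0, 1]])
    return D, sym
```

Under this convention the antisymmetric part of J carries the DMI vector, and the symmetric part holds Jiso*I + Jani, so the three contributions can be separated again from the combined output.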

…on_amp

- Create MAEGreenGPU class using JAX for GPU acceleration
- JAX is an optional dependency (lazy loading)
- Implements GPU-accelerated:
  * Green's function computation with vmap
  * Spinor matrix rotation
  * Parallel angle computation
- Add use_gpu option to siesta interface
- Add --use_gpu CLI argument to siesta2J

The API is fully compatible with MAEGreen.
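The "optional dependency (lazy loading)" pattern can be sketched with the standard library alone. The function names `load_optional` and `make_green` are hypothetical and are not the code in this PR; the point is only that JAX is imported when GPU mode is requested, so CPU-only installations never need it.

```python
import importlib

def load_optional(module_name, feature):
    """Import module_name on first use; raise a helpful error if it is missing."""
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        raise ImportError(
            f"{feature} requires the optional package '{module_name}'. "
            "Install it or run without this feature."
        ) from exc

def make_green(use_gpu=False):
    """Hypothetical dispatcher: JAX is only imported when use_gpu is True."""
    if use_gpu:
        jax = load_optional("jax", "GPU acceleration")
        return ("gpu", jax.__name__)
    return ("cpu", None)
```

Calling `make_green()` on a machine without JAX works as before, while `make_green(use_gpu=True)` fails early with a message that names the missing package.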

- Create ExchangeNCLGPU class using JAX for GPU acceleration
- JAX is an optional dependency (lazy loading)
- Implements GPU-accelerated:
  * Pauli block decomposition
  * A tensor computation with einsum
  * Vectorized operations over R vectors and atom pairs
  * Orbital-resolved A tensor computation
- Same API as ExchangeNCL with use_gpu parameter

Features:
- Add --use_gpu flag to enable GPU acceleration (opt-in, no auto-detection)
- GPU-accelerated eigenvalue/eigenvector computation using Cholesky decomposition
- GPU-accelerated Green's function and A-tensor computation
- JIT-compiled kernels for Pauli decomposition and tensor contractions
- Support for non-orthogonal basis (overlap matrix S)
- Separate GreenGPU.py module that inherits from TBGreen

Performance improvements (9x9x9 k-mesh, 50 energy points):
- Eigenvalue preparation: 22s → 5.6s (4x speedup)
- Total: 34s → 22s (1.5x speedup)
- Results match CPU version (J_iso ≈ -26.22 meV)

Key changes:
- TB2J/GreenGPU.py: New GPU-accelerated TBGreen class
- TB2J/gpu/: New module with JAX-based GPU implementations
  - jax_utils.py: Eigenvalue computation, array utilities
  - exchange_ncl_gpu.py: GPU ExchangeNCL implementation
  - exchange_pert2_gpu.py: GPU perturbation implementation
  - mae_green_gpu.py: GPU MAE calculation
- TB2J/green.py: Add use_gpu parameter, delegate to TBGreenGPU
- TB2J/exchange.py: Integrate GPU exchange class
- TB2J/exchange_params.py: Add use_gpu parameter handling
- TB2J/interfaces/siesta_interface.py: Pass use_gpu to exchange calculation
- TB2J/scripts/*.py: Add --use_gpu CLI flag

Deprecated:
- sisl_wrapper.py moved to deprecated/ (use HamiltonIO instead)
- exchangeGPU.py removed (replaced by gpu/ module)
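For the non-orthogonal case, a generalized eigenproblem H c = E S c can be reduced to a standard Hermitian one through a Cholesky factorization of S. The following is a minimal NumPy sketch of that textbook reduction, not the code in TB2J/gpu/jax_utils.py; the function name `eigh_generalized` is hypothetical. A JAX version would presumably use the `jax.numpy.linalg` counterparts of the same calls.

```python
import numpy as np

def eigh_generalized(H, S):
    """Solve H c = E S c for Hermitian H and positive-definite S.

    Steps: S = L L^dagger, H' = L^-1 H L^-dagger, H' y = E y, c = L^-dagger y.
    """
    L = np.linalg.cholesky(S)          # lower-triangular factor of S
    Linv = np.linalg.inv(L)            # explicit inverse, fine for a small sketch
    H_std = Linv @ H @ Linv.conj().T   # standard Hermitian problem
    evals, y = np.linalg.eigh(H_std)
    c = Linv.conj().T @ y              # back-transform the eigenvectors
    return evals, c
```

The returned eigenvectors are S-orthonormal (c^† S c = I), which is the normalization a Green's function built from them would normally rely on. For larger matrices a triangular solve is usually preferable to forming the explicit inverse.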

The y-component of Pauli decomposition had the wrong sign:
- Wrong: (M01 - M10) * (-0.5j)
- Correct: (M01 - M10) * 0.5j

This matches the CPU version in TB2J/pauli.py and fixes the DMI y-component sign difference.
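The sign can be checked directly: writing M = M0*I + Mx*σx + My*σy + Mz*σz gives M01 = Mx - i*My and M10 = Mx + i*My, so My = (M01 - M10) * 0.5j. A minimal NumPy sketch follows; the function name `pauli_decompose` is hypothetical, and the interleaved spin layout (spin index fastest) is an assumption made for this example.

```python
import numpy as np

def pauli_decompose(M):
    """Split a spinor matrix into (M0, Mx, My, Mz) blocks.

    Assumes an interleaved basis: even indices are spin up, odd are spin down.
    """
    M00, M01 = M[::2, ::2], M[::2, 1::2]
    M10, M11 = M[1::2, ::2], M[1::2, 1::2]
    M0 = 0.5 * (M00 + M11)
    Mx = 0.5 * (M01 + M10)
    My = 0.5j * (M01 - M10)   # the corrected sign from this commit
    Mz = 0.5 * (M00 - M11)
    return M0, Mx, My, Mz

def pauli_compose(M0, Mx, My, Mz):
    """Inverse of pauli_decompose for the same interleaved layout."""
    s0 = np.eye(2)
    sx = np.array([[0, 1], [1, 0]], dtype=complex)
    sy = np.array([[0, -1j], [1j, 0]])
    sz = np.array([[1, 0], [0, -1]], dtype=complex)
    return (np.kron(M0, s0) + np.kron(Mx, sx)
            + np.kron(My, sy) + np.kron(Mz, sz))
```

Feeding σy itself through `pauli_decompose` returns My = 1 with the corrected factor, whereas the old factor of -0.5j would return -1.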

- Combined H(k), S(k) and eigenvalue decomposition into single JIT-compiled pipeline
- Uses jax.vmap for batched eigenvalue computation
- Optimized Cholesky decomposition for generalized eigenvalue problem
- Caching of JIT-compiled functions to avoid recompilation overhead

Performance improvements:
- Eigenvalue preparation: 3.8s15s + 0.63s 0.62s (6.2x with JIT) -> 3.46s
- Total: 8.33s -> 6.05s
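The idea behind batched eigenvalue computation can be shown without JAX: instead of looping over k-points, the whole stack of H(k) matrices is diagonalized in one call. This NumPy sketch uses random Hermitian matrices as stand-ins for H(k), and the function names are hypothetical. The PR states that `jax.vmap` is used for the same purpose; this example does not reproduce that code.

```python
import numpy as np

def random_hermitian_stack(nk, n, seed=0):
    """Stand-in for H(k): nk random Hermitian n x n matrices."""
    rng = np.random.default_rng(seed)
    A = rng.normal(size=(nk, n, n)) + 1j * rng.normal(size=(nk, n, n))
    return 0.5 * (A + A.conj().transpose(0, 2, 1))

def eigh_loop(Hk):
    """Reference: diagonalize one k-point at a time."""
    return np.array([np.linalg.eigh(H)[0] for H in Hk])

def eigh_batched(Hk):
    """Diagonalize the full (nk, n, n) stack in a single call."""
    evals, evecs = np.linalg.eigh(Hk)
    return evals, evecs
```

Both routes give the same eigenvalues; the batched form is the shape of computation that maps naturally onto a vectorized, JIT-compiled pipeline.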

- Created ExchangeCL2GPU class for GPU-accelerated collinear calculations
- Uses JAX for tensor operations with JIT compilation
- Added GPU support to Manager class for Wannier90 interface
- Updated siesta_interface.py to use GPU for collinear calculations
- Both collinear and non-collinear calculations now support GPU acceleration

- Removed /2 division in parse_ham to match TB2J_spinphon behavior
- Removed H + H.conj().T symmetrization in gen_ham
- Fixed merge_tbmodels_spin to use interleaved basis order

The Wannier90 _hr.dat file stores H(R) for all R vectors. The previous code incorrectly divided by 2 and then symmetrized, giving incorrect results. Now the behavior matches TB2J_spinphon, which gives correct exchange parameters.
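Why no extra symmetrization is needed when every R vector is stored can be seen from a small sketch: if the data contain both R and -R with H(-R) = H(R)^†, the Fourier sum H(k) = Σ_R H(R) exp(2πi k·R) is already Hermitian. The toy model, the function names, and the phase convention below are assumptions for illustration; this does not reproduce parse_ham or gen_ham.

```python
import numpy as np

def make_hr(seed=0, n=2):
    """Toy H(R) dictionary that stores R = 0 and both +R and -R explicitly."""
    rng = np.random.default_rng(seed)
    A = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    H0 = 0.5 * (A + A.conj().T)                      # on-site block is Hermitian
    H1 = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    return {(0, 0, 0): H0, (1, 0, 0): H1, (-1, 0, 0): H1.conj().T}

def hk(HR, k):
    """H(k) = sum_R H(R) exp(2 pi i k.R), with k in reduced coordinates."""
    return sum(H * np.exp(2j * np.pi * np.dot(k, R)) for R, H in HR.items())
```

With the full R set, `hk` returns a Hermitian matrix as is, so adding its conjugate transpose would simply double it.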

…ivide by 2 in parser

mailhexu added 20 commits February 19, 2026 18:40

refactor magnon code cli interface

ee7e0e7

Merge branch 'main' into fixmagnon_amp

b7980b1

custom magnon qpoints

37814cf

Merge branch 'fixmagnon_amp' of github.com:mailhexu/TB2J into fixmagn…

a5fa8a3

…on_amp

update magnon cli for the use of custom qpoints

059513f

fix magnon sign of B

f502557

merge from main

3bec596

Fix Pauli decomposition sign error in GPU version

bccc4f6

Fix Hamiltonian hermiticity: add conjugate transpose in k-space and d…

b5fccce

…ivide by 2 in parser

debug mae gpu

4475a4b

merge main

bacdd0e

Merge branch 'fixmagnon_amp' into gpu

d8b88eb

mailhexu wants to merge 20 commits into main from gpu

mailhexu commented Apr 2, 2026
