This repo provides an implementation of Bayesian Hierarchical Invariant Prediction (BHIP) and the experimental results from the paper "Bayesian Hierarchical Invariant Prediction" (arXiv:2505.11211 [stat.ML]).
The BHIP algorithm is implemented in src/, using NumPyro for probabilistic modeling.
- src/ — core BHIP algorithm: hierarchical models and invariance tests
- experiments/ — all experiments organized by paper section:
  - bus_dwelling/ — Section 4.1: bus dwelling time case study (synthetic data, generated in notebook)
  - educational_attainment/ — Section 4.2: school achievement case study, including Appendix E sparse priors
  - computational_study/ — Section 4.3: computational complexity comparison (ICP vs BHIP)
  - synthetic_benchmark/ — Section 4.4.1: low-dimensional synthetic benchmark and sensitivity analysis (Appendix H, I)
  - gene_perturbation/ — Section 4.4.2: Kemmeren yeast gene perturbation benchmark
- external/ — git submodule for the BIP baseline
Clone with submodules to include the BIP baseline:
git clone --recurse-submodules https://github.com/fmfsa/bhip
Install dependencies:
pip install -r requirements_cpu.txt
- Bus dwelling (Section 4.1) — synthetic, generated within the notebook.
- Educational attainment (Section 4.2) — CollegeDistance dataset from the R package AER (Rouse, 1995), downloaded automatically by the notebook.
- Synthetic benchmarks (Sections 4.3, 4.4.1) — generated by the experiment scripts using sempler.
- Kemmeren gene perturbation (Section 4.4.2) — yeast gene deletion data from Kemmeren et al. (2014). Download Kemmeren.hdf5 from the deleteome database and run experiments/gene_perturbation/extract_data.py to generate the required CSV and text files.
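The Kemmeren extraction step can be wrapped in a small pre-flight check so that a missing download fails early with a clear message. The sketch below is illustrative only. The script path comes from this README; the helper name `build_extract_command`, the assumption that Kemmeren.hdf5 is placed next to the script, and the assumption that the script takes no command-line arguments are all hypothetical, so check extract_data.py itself for where it expects the file.

```python
import subprocess
import sys
from pathlib import Path

# Script path as given in this README.
EXTRACT_SCRIPT = Path("experiments/gene_perturbation/extract_data.py")


def build_extract_command(repo_root, hdf5_name="Kemmeren.hdf5"):
    """Return (command, working_directory) for the extraction step.

    ASSUMPTIONS (not confirmed by the README): the HDF5 file sits in the
    same directory as the script, and the script needs no arguments.
    """
    script = Path(repo_root) / EXTRACT_SCRIPT
    hdf5 = script.parent / hdf5_name
    if not script.is_file():
        raise FileNotFoundError(f"extraction script not found: {script}")
    if not hdf5.is_file():
        raise FileNotFoundError(
            f"download {hdf5_name} first; expected it at {hdf5}"
        )
    # Run with the current interpreter so the active environment is used.
    return [sys.executable, script.name], script.parent


def run_extraction(repo_root="."):
    """Check the inputs, then run the extraction script."""
    command, workdir = build_extract_command(repo_root)
    subprocess.run(command, cwd=workdir, check=True)
```

Calling `run_extraction(".")` from the repository root would then either run the script or stop with a message naming the missing file.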
If you find this code helpful, please cite:
@InProceedings{madaleno2026,
title = {Bayesian Hierarchical Invariant Prediction},
author = {Francisco Madaleno and Pernille Julie Viuff Sand and Francisco C. Pereira and Sergio Hernan Garrido Mejia},
booktitle = {Proceedings of the Fifth Conference on Causal Learning and Reasoning},
year = {2026},
editor = {Bijan Mazaheri and Niels Richard Hansen},
series = {Proceedings of Machine Learning Research},
month = {Apr},
publisher = {PMLR},
}