GenerativeMIL is a Julia project for generative modeling of multi-instance and set-structured data. It complements GroupAD.jl with reusable building blocks, model implementations, and training utilities for set-based generative and discriminative experiments.
The repository is still under active development, so APIs and model coverage may change.
- Set-based generative models and attention blocks implemented in Julia.
- CPU and GPU training paths for the main research models.
- DrWatson-based project layout for reproducible experiments.
- Support for variable-cardinality set data through masking where available.
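The masking idea behind variable-cardinality support can be sketched in plain Julia. This is only an illustration, not the package's implementation: `pad_sets` and `masked_mean` are hypothetical helpers, and the features × elements × batch layout is an assumption.

```julia
# Hypothetical helpers for illustration only; they are not part of GenerativeMIL.

# Pad sets (each a d×nᵢ matrix) into a d×n_max×B array and build an
# n_max×B Bool mask that marks which positions hold real elements.
function pad_sets(sets::Vector{<:AbstractMatrix{T}}) where {T<:Real}
    d = size(first(sets), 1)
    n_max = maximum(size(s, 2) for s in sets)
    x = zeros(T, d, n_max, length(sets))
    mask = falses(n_max, length(sets))
    for (b, s) in enumerate(sets)
        n = size(s, 2)
        x[:, 1:n, b] .= s        # copy the real elements
        mask[1:n, b] .= true     # remaining positions stay false (padding)
    end
    return x, mask
end

# Mean over set elements that ignores padded positions.
function masked_mean(x::AbstractArray{T,3}, mask::AbstractMatrix{Bool}) where {T<:Real}
    m = reshape(mask, 1, size(mask)...)   # 1×n_max×B, broadcasts over features
    counts = sum(m; dims=2)               # 1×1×B, true cardinality of each set
    return dropdims(sum(x .* m; dims=2) ./ counts; dims=2)   # d×B
end

# Three sets with 5, 2, and 7 elements, each element having 3 features.
sets = [rand(Float32, 3, 5), rand(Float32, 3, 2), rand(Float32, 3, 7)]
x, mask = pad_sets(sets)
pooled = masked_mean(x, mask)
```

Padding to the largest set in the batch keeps the data in one dense array, which is what makes batched CPU and GPU operations straightforward. The mask then keeps padded positions out of any reduction over set elements.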
The codebase is usable, but not all models are finished yet. Some components are research prototypes rather than polished production APIs.
Important note for the current setup:
- Do not import `cuDNN` in this project for now. In the current environment it breaks `softmax`.
- Clone or download the repository.
- Start Julia in the project directory.
- Activate and instantiate the environment:
```julia
using Pkg
Pkg.activate(".")
Pkg.instantiate()
```

If you use DrWatson workflows, you can also rely on `quickactivate` from within the project.
- `src/` contains the package code: building blocks, models, losses, utilities, and evaluation helpers.
- `scripts/` contains runnable experiment and training entry points.
- `experiments/` contains experiment-specific runs and outputs.
- `test/` contains smoke tests for CPU and GPU paths.
| Implemented models | CPU training | GPU training | Variable cardinality[^1] (in/out)[^2] | Note |
|---|---|---|---|---|
| SetVAE | yes | yes | yes/yes | Implementation is close to the original Python code. |
| FoldingNet VAE | yes | yes[^3] | yes/no | Batched training on CPU via broadcasting. |
| PoolModel (ours) | yes | yes[^4] | yes/yes | Masked forward pass for variable cardinality on GPU is still TODO. |
| SetTransformer | yes | yes | yes/no | Classifier version only. |
| Masked Autoencoder for Distribution Estimation (MADE) | yes | yes | possible[^5]/no | Multiple-mask support is still TODO. |
| Masked Autoregressive Flow (MAF) | ? | ? | | Not finished. |
| Inverse Autoregressive Flow (IAF) | ? | ? | | Not finished. |
| SoftPointFlow | ? | ? | yes/yes | Not finished. |
| SetVAEformer (ours) | yes | yes | yes/yes | Work in progress. |
This project uses DrWatson to keep experiments organized and reproducible. Most scripts assume the repository root as the active project directory.
Footnotes
[^1]: Cardinality means the number of elements in a single bag/set. In real data this can differ per sample, which complicates batching.

[^2]: "In" variable cardinality means varying set sizes in the input batch; "out" variable cardinality means the model can generate outputs with a different number of elements than the input.

[^3]: FoldingNet VAE is trainable on GPU via `fit_gpu_ready!`. It is a special case with fixed cardinality and without the KLD term of the reconstructed encoding.

[^4]: PoolModel currently works only for constant cardinality.

[^5]: This model has no cardinality reduction or expansion.