[New Model] iLTM #305
|
Heyho @davidbonet and @salcc, Would be great to hear your thoughts on how we are using your foundation model and whether we should make any changes. Also, any help adding our workaround in the official codebase would be appreciated! If you are okay with how we are using it, and once we have the results, we will add it to the leaderboard. With your blessing, we would add it as a verified implementation. Let me know if anything is missing in the code to use it as intended. |
# Conflicts:
#   tabarena/pyproject.toml
#   tabarena/tabarena/benchmark/models/model_registry.py
#   tabarena/tabarena/models/utils.py
#   tabflow_slurm/run_setup_slurm_jobs.py
|
Hey @LennartPurucker , I'll take a look and let you know, thanks! |
|
Very cool that you ran it on TabArena as well! Here are the results for default on Lite, which I believe match your findings (i.e., roughly above xRFM, plus/minus having more models in the mix). Lite is, in general, a bit noisier than using all splits.
Gotcha, I think TabSTAR uses the same definition! In both cases, I think this is fine and am doing HPO right now, but the compute costs of iLTM are quite high compared to a model like RealMLP. Ideally, if we get more compute grants, this will resolve itself. Or it will take a bit longer to fully integrate the results.
Great, this is what is happening in the wrapper, actually! The wrapper only changes the categorical codes, not the dtype itself. The preprocess call is otherwise a pass-through. The main goal is not to pass strings or objects that consume more RAM/VRAM for features that downstream models treat as categoricals. We usually only disable model-agnostic preprocessing for models that take semantics into account, e.g., TabSTAR.
Do you have any numbers or details on this? Assuming iLTM accounts for an overhead of a few seconds, this should not occur. And as far as I can tell, it does account for such an overhead. |
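The categorical handling described in this comment (the wrapper changes the category codes but not the dtype) can be illustrated with a small sketch. This is not the TabArena wrapper code: the function name `relabel_categories` is hypothetical, and the sketch only assumes that categorical columns arrive as pandas `category` columns whose labels may be strings.

```python
import pandas as pd


def relabel_categories(df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper: replace category labels with small integers.

    The column keeps its pandas `category` dtype; only the labels change,
    so no string/object values are carried into the model.
    """
    out = df.copy()
    for col in out.columns:
        if isinstance(out[col].dtype, pd.CategoricalDtype):
            n_categories = len(out[col].cat.categories)
            # Map the i-th category to the integer i; missing values stay missing.
            out[col] = out[col].cat.rename_categories(list(range(n_categories)))
    return out


# Usage: the "color" column stays categorical, the numeric column is untouched.
frame = pd.DataFrame(
    {"color": pd.Categorical(["red", "blue", "red"]), "x": [1.0, 2.0, 3.0]}
)
relabeled = relabel_categories(frame)
```

Non-categorical columns pass through unchanged, which mirrors the "otherwise a pass-through" behaviour mentioned above.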
|
Heyho @davidbonet, I am running into an issue, and was wondering if your code handles this:
I am running on 40GB VRAM GPUs. |
|
Hey @LennartPurucker, we have some basic safeguards, but probably not enough 🙃 My guess right now is that it could happen at inference time even with small datasets: if it's fast, it can build big ensembles in that time budget, and some layers can grow large depending on the hyperparameters and might not fit on a 40GB GPU. Could you tell me which hyperparameter configuration is giving the OOM so I can look into it? Thank you! |
|
The config is:
The full log is:
I will look into it more once other jobs have finished, if needed. In the worst case, we can increase the VRAM for a few jobs or add some batching. |
|
Hey! It seems this OOM is about the CPU memory limit, as there is no
We now implemented a fix so that the RF weights of the predictors are stored in lower precision, which reduces the amount of RAM needed without significantly impacting performance, and in this particular configuration allows it to run using less than 32 GB of RAM. To avoid crashes on other configurations that could use even more predictors, we also implemented a check that stops creating predictors when the used RAM would exceed the limits defined in
However, we also noticed a separate AutoGluon/TabArena wrapper-side memory issue. In TabArena, this is triggered after fit/evaluation during result metadata collection:
We think it could be fixed by defining the following method in class
This matches AutoGluon’s serialized-size semantics, avoids the extra in-memory pickle copy, and keeps the change scoped to TabArena’s |
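The two safeguards described in this comment (storing weights in lower precision and stopping predictor creation at a RAM limit) can be sketched roughly as below. This is an illustration under stated assumptions, not the iLTM implementation: `build_predictors`, `make_weights`, and `ram_budget_bytes` are hypothetical names, and the sketch only counts the bytes of the stored arrays rather than measuring process RAM.

```python
import numpy as np


def build_predictors(make_weights, max_predictors, ram_budget_bytes, dtype=np.float16):
    """Hypothetical sketch: store weights downcast and stop at a byte budget.

    make_weights(i) is assumed to return a NumPy array of weights for the
    i-th predictor. Only the stored arrays are counted against the budget.
    """
    predictors = []
    used_bytes = 0
    for i in range(max_predictors):
        # Downcast before storing; float16 needs half the bytes of float32.
        weights = make_weights(i).astype(dtype)
        if used_bytes + weights.nbytes > ram_budget_bytes:
            break  # adding this predictor would exceed the budget
        predictors.append(weights)
        used_bytes += weights.nbytes
    return predictors, used_bytes


# Usage: each float32 matrix has 1000 entries, i.e. 2000 bytes once stored as float16.
stored, used = build_predictors(
    lambda i: np.ones((100, 10), dtype=np.float32),
    max_predictors=10,
    ram_budget_bytes=5000,
)
```

With the same budget, storing in `float32` would fit only half as many predictors, which is the trade-off the fix relies on.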
|
Very cool, thank you for the update on the code! Related to
Nevertheless, great to raise this point! We likely want to fix this in the future for all models and actually plot and investigate disk space usage. CC @Innixma
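The serialized-size idea raised in this exchange (measuring how large a pickled model is without first holding the whole pickle in memory) can be sketched with the standard library alone. This is a generic technique, not the method proposed in the PR and not an AutoGluon API; `ByteCounter` and `pickled_size` are hypothetical names.

```python
import pickle


class ByteCounter:
    """File-like sink that discards written data and only counts its size."""

    def __init__(self):
        self.size = 0

    def write(self, data):
        # memoryview handles bytes as well as other buffer objects pickle may pass.
        n = memoryview(data).nbytes
        self.size += n
        return n


def pickled_size(obj, protocol=pickle.HIGHEST_PROTOCOL):
    """Return the number of bytes pickle would write for obj.

    The pickle stream goes to a counter instead of a bytes object, so no
    full in-memory copy of the serialized data is created.
    """
    counter = ByteCounter()
    pickle.dump(obj, counter, protocol=protocol)
    return counter.size


# Usage: size of a small stand-in "model" object.
model_like = {"weights": list(range(1000)), "name": "demo"}
size_in_bytes = pickled_size(model_like)
```

Because `pickle.dump` only requires an object with a `write` method, the same counter could be reused for any model to collect disk-space estimates across a benchmark.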
…ewest iltm version




This PR adds the iLTM model (https://github.com/AI-sandbox/iLTM, https://arxiv.org/abs/2511.15941).
The benchmark is currently running, so more changes might occur if I encounter any bugs.
So far, I have fixed problems in the upstream code to make the model run. Check out the code (iltm_model.py) for details. Two major issues:
- (_isolate_iltm_global_state)
- (_ensure_iltm_logger_patched)
By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.