In recent experiments, I observed that the AutoRecLab workspace (“sandbox”) can grow extremely large; in one ongoing experiment it exceeded 400 GB. When disk space runs out, AutoRecLab terminates, which cuts the experiment short. This happens most often with large datasets, such as the Amazon2018MusicalInstruments dataset.
This growth occurs because AutoRecLab typically trains a model for each created node (e.g., via OmniRec) and stores it in a file for later use. With a large number of nodes, this leads to a correspondingly large number of stored models, causing storage usage to increase very quickly.
To address this, I will introduce a flag in config.toml. Without the flag, AutoRecLab behaves as before. When the flag is enabled, the full workspace of a node is no longer archived/logged after execution; only the relevant results are kept. In particular, the large model files are deleted, so users can run large experiments even with limited storage capacity.
