Describe the bug
Simulating live model deployment of the standard multivariate model DefaultDetector (i.e. a DetectorEnsemble of VAE and RRCF) by means of the TSADEvaluator leads to periodic re-training. Initially, TSADEvaluator's default_retrain_kwargs() method ensures that the train_config for training the DetectorEnsemble is an instance of DetectorEnsembleTrainConfig. However, when re-training is passed down from the DetectorEnsemble to its individual models, nothing ensures that the train_config for the DefaultDetector is an instance of the same class. Instead, the train_config for training the DetectorEnsemble that constitutes the DefaultDetector is of type dict, which leads to the bug reported.
Most likely this is related to a TSADEvaluator mismatch between full_train_kwargs and full_retrain_kwargs, as the lines (Merlion/merlion/evaluate/base.py, lines 191 to 192 in 085ef8a)

    full_train_kwargs = self.default_train_kwargs()
    full_train_kwargs.update(train_kwargs)

do not ensure that (Merlion/merlion/evaluate/base.py, line 202 in 085ef8a)

    train_result = self._train_model(train_vals, **full_train_kwargs)

utilizes the correct train_config.
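One way the suspected mechanism could play out is shown below with plain Python dictionaries. This is a toy sketch, not Merlion code: `DummyTrainConfig`, `default_train_kwargs()` and `merge_train_kwargs()` are hypothetical stand-ins introduced only for illustration. It shows that the defaults-then-`update` pattern lets a caller-supplied dict replace a typed default config wholesale.

```python
from dataclasses import dataclass, field


@dataclass
class DummyTrainConfig:
    """Hypothetical stand-in for a typed train config (not a Merlion class)."""
    per_model_train_configs: list = field(default_factory=list)


def default_train_kwargs() -> dict:
    # Hypothetical defaults: the train_config is a typed object.
    return {"train_config": DummyTrainConfig()}


def merge_train_kwargs(train_kwargs: dict) -> dict:
    # Same pattern as the two quoted lines: start from defaults,
    # then let the caller's kwargs overwrite matching keys.
    full_train_kwargs = default_train_kwargs()
    full_train_kwargs.update(train_kwargs)
    return full_train_kwargs


# A caller passing a plain dict replaces the typed default entirely;
# dict.update() performs no type check or conversion.
merged = merge_train_kwargs({"train_config": {"per_model_train_configs": []}})
print(type(merged["train_config"]).__name__)  # prints: dict
```

If this is indeed what happens, any code downstream that expects attribute access on `train_config` would receive a dict instead.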
To Reproduce
The bug was found while working through the tutorial on "Multivariate Time Series Anomaly Detection" for Merlion v2.0.2, section "Model Inference and Quantitative Evaluation" (see https://opensource.salesforce.com/Merlion/v2.0.2/tutorials/anomaly/2_AnomalyMultivariate.html#Model-Inference-and-Quantitative-Evaluation). When performing "Sliding Window Evaluation" with TSADEvaluator, the ensemble fails to re-train the DefaultDetector model due to the bug reported.
Expected behavior
Successful re-training of the DefaultDetector model as part of DetectorEnsemble models when using TSADEvaluator.
Screenshots
A screenshot of the resulting error stack trace is attached.

Desktop
- OS: Ubuntu 24.04 LTS
- Merlion Version: 2.0.2
- Python Version: 3.9.18
- openjdk-11-jdk installed as per docs.
Additional context
At the re-train trigger (Merlion/merlion/evaluate/base.py, line 230 in 085ef8a)

    if t >= t_next and not cur_train.is_empty() and not cur_test.is_empty():

the following call sequence occurs:
- merlion.evaluate.anomaly.TSADEvaluator's get_predict() invokes merlion.evaluate.base.EvaluatorBase's get_predict(). The latter contains the re-training logic.
- When re-training is initiated, self.model is an instance of merlion.models.ensemble.anomaly.DetectorEnsemble. Consequently, EvaluatorBase's _train_model() invokes DetectorEnsemble's train().
- merlion.models.ensemble.anomaly.DetectorEnsemble inherits from merlion.models.ensemble.base.EnsembleBase and merlion.models.anomaly.base.DetectorBase. Only the latter has a train() method. Therefore, DetectorEnsemble's train() actually calls DetectorBase's train().
- Using call_with_accepted_kwargs, DetectorBase's train() invokes DetectorEnsemble's _train().
- After executing (Merlion/merlion/models/ensemble/anomaly.py, line 139 in 085ef8a)

      train_cfgs = train_config.per_model_train_configs

  train_cfgs becomes List[dict].
- TSADEvaluator's get_predict() is invoked at the first iteration of (Merlion/merlion/models/ensemble/anomaly.py, lines 159 to 164 in 085ef8a)

      for i, (model, cfg, pr_cfg) in enumerate(zip(self.models, train_cfgs, pr_cfgs)):
          try:
              train_kwargs = dict(train_config=cfg, anomaly_labels=anomaly_labels, post_rule_train_config=pr_cfg)
              train_scores, valid_scores = TSADEvaluator(model=model, config=eval_cfg).get_predict(
                  train_vals=train, test_vals=valid, train_kwargs=train_kwargs, post_process=True
              )

  which is responsible for re-training the first ensemble model, an instance of merlion.models.defaults.DefaultDetector. At this moment, train_kwargs['train_config'] is of type dict. Effectively, TSADEvaluator's get_predict() invokes EvaluatorBase's get_predict().
- EvaluatorBase's get_predict() invokes EvaluatorBase's _train_model(). The latter invokes merlion.models.defaults.DefaultDetector's train(). At this moment, train_config is of type dict.
- Within the DefaultDetector, self.model is set to be a DetectorEnsemble of VAE and RRCF, and DefaultDetector's train() invokes LayeredDetector's train(). merlion.models.layers.LayeredDetector inherits from merlion.models.layers.LayeredModel and merlion.models.anomaly.base.DetectorBase. Only the latter has a train() method. Therefore, LayeredDetector's train() invokes DetectorBase's train(). At this moment, train_config is of type dict.
- Using call_with_accepted_kwargs, DetectorBase's train() invokes DetectorEnsemble's _train().
- Here train_config is required to be an instance of DetectorEnsembleTrainConfig, but it is still a dict, so the error occurs.
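The call sequence above can be mimicked with a few toy classes. This is a sketch under stated assumptions, not Merlion's implementation: every class and function name below (`ToyDetectorBase`, `ToyEnsembleBase`, `ToyEnsemble`, `ToyLeaf`, `DummyEnsembleTrainConfig`, `coerce_train_config`) is hypothetical. It only illustrates two points from the trace: `train()` resolves to the one base class that defines it, and a nested ensemble that receives a dict fails on attribute access. The `coerce` flag shows one conceivable mitigation (rebuilding a typed config from the dict); whether that is the right fix for Merlion is an open question.

```python
class DummyEnsembleTrainConfig:
    """Hypothetical stand-in for a typed ensemble train config."""

    def __init__(self, per_model_train_configs=None):
        self.per_model_train_configs = per_model_train_configs


def coerce_train_config(cfg):
    """Hypothetical mitigation: turn None or a dict into a typed config."""
    if cfg is None:
        return DummyEnsembleTrainConfig()
    if isinstance(cfg, dict):
        return DummyEnsembleTrainConfig(**cfg)
    return cfg


class ToyDetectorBase:
    def train(self, train_config=None):
        # The only train() in the hierarchy; it delegates to _train().
        return self._train(train_config=train_config)


class ToyEnsembleBase:
    """Defines no train(), so lookup falls through to ToyDetectorBase."""


class ToyEnsemble(ToyEnsembleBase, ToyDetectorBase):
    def __init__(self, models, coerce=False):
        self.models = models
        self.coerce = coerce

    def _train(self, train_config=None):
        if self.coerce:
            train_config = coerce_train_config(train_config)
        # Attribute access: works on a config object, fails on a dict.
        train_cfgs = train_config.per_model_train_configs
        if train_cfgs is None:
            train_cfgs = [None] * len(self.models)
        return [m.train(train_config=c) for m, c in zip(self.models, train_cfgs)]


class ToyLeaf(ToyDetectorBase):
    def _train(self, train_config=None):
        return "trained"


def run(coerce):
    # Outer ensemble whose first (only) member is itself an ensemble.
    inner = ToyEnsemble([ToyLeaf(), ToyLeaf()], coerce=coerce)
    outer = ToyEnsemble([inner], coerce=coerce)
    # The per-model configs are plain dicts, as in the trace above.
    outer_cfg = DummyEnsembleTrainConfig(per_model_train_configs=[{}])
    return outer.train(train_config=outer_cfg)


try:
    run(coerce=False)
    failure = None
except AttributeError as exc:
    failure = str(exc)

print(failure)           # the inner ensemble could not read the dict as a config
print(run(coerce=True))  # nested training completes once the dict is coerced
```

In this toy setup the outer ensemble trains fine because it receives a typed config. Only the inner ensemble, which is handed one of the per-model dicts, fails.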