OpenSTEF Meta V0.1 #771
base: research/v4.1.0
commit 37089b8
Author: Egor Dmitriev <[email protected]>
Date: Mon Nov 17 15:29:59 2025 +0100
fix(#728): Fixed parallelism stability issues, and gblinear feature pipeline. (#752)
* fix(STEF-2475): Added loky as default option for parallelism since fork causes instabilities for xgboost results.
* fix(STEF-2475): Added better support for flatliners and predicting when data is sparse.
* fix(STEF-2475): Feature handing improvements for gblinear. Like imputation, nan dropping, and checking if features are available.
* fix(#728): Added checks on metrics to gracefully handle empty data. Added flatline filtering during evalution.
* fix(#728): Updated xgboost to skip scaling on empty prediction.
* fix(STEF-2475): Added parallelism parameters.
Signed-off-by: Egor Dmitriev <[email protected]>

commit a85a3f7
Author: Egor Dmitriev <[email protected]>
Date: Fri Nov 14 14:31:34 2025 +0100
fix(STEF-2475): Fixed rolling aggregate adder by adding forward filling and stating support for only one horizon. (#750)
Signed-off-by: Egor Dmitriev <[email protected]>

commit 4f0c664
Author: Egor Dmitriev <[email protected]>
Date: Thu Nov 13 16:54:15 2025 +0100
feature: Disabled data cutoff by default to be consistent with openstef 3. And other minor improvements. (#748)

commit 493126e
Author: Egor Dmitriev <[email protected]>
Date: Thu Nov 13 16:12:35 2025 +0100
fix(STEF-2475) fix and refactor backtesting iction in context of backtestforecasting config for clarity. Added more colors. Fixed data split function to handle 0.0 splits. (#747)
* fix: Fixed data collation during backtesting. Renamed horizon to prediction in context of backtestforecasting config for clarity. Added more colors. Fixed data split function to handle 0.0 splits.
* fix: Formatting.
* fix: Formatting.
Signed-off-by: Egor Dmitriev <[email protected]>

commit 6b1da44
Author: Egor Dmitriev <[email protected]>
Date: Thu Nov 13 16:05:32 2025 +0100
feature: forecaster hyperparams and eval metrics (#746)
* feature(#729) Removed to_state and from_state methods in favor of builtin python state saving functions.
* feature(#729): Fixed issue where generic transform pipeline could not be serialized.
* feature(#729): Added more state saving tests
* feature(#729): Added more state saving tests
* feature(#729): Added more state saving tests
* feature: standardized objective function. Added custom evaluation functions for forecasters.
* fix: Formatting.
Signed-off-by: Egor Dmitriev <[email protected]>
Signed-off-by: Lars van Someren <[email protected]>
…ybridForecaster2.0
Residual Forecaster and Stacking Forecaster can now predict model contributions. Regular forecasters (EXCEPT LGBM Linear) can predict feature contributions
commit 6f88d72
Author: Lars van Someren <[email protected]>
Date: Mon Dec 8 09:46:57 2025 +0100
Bugfixes
Signed-off-by: Lars van Someren <[email protected]>

commit b44fd92
Author: Lars van Someren <[email protected]>
Date: Thu Dec 4 14:39:31 2025 +0100
bug fixes
Signed-off-by: Lars van Someren <[email protected]>

commit e212448
Author: Lars van Someren <[email protected]>
Date: Thu Dec 4 12:38:24 2025 +0100
fixes
Signed-off-by: Lars van Someren <[email protected]>

commit eb775e4
Author: Lars van Someren <[email protected]>
Date: Thu Dec 4 11:40:44 2025 +0100
BugFix
Signed-off-by: Lars van Someren <[email protected]>

commit c33ce93
Author: Lars van Someren <[email protected]>
Date: Wed Dec 3 14:15:06 2025 +0100
Made PR Compliant
Signed-off-by: Lars van Someren <[email protected]>
…utionsWFunctions' into research/HybridForecaster2.0 Signed-off-by: Lars van Someren <[email protected]>
…itting and Model Fit Result. Validation and test data can now be fully used Signed-off-by: Lars van Someren <[email protected]>
egordm
left a comment
Awesome improvements! It looks mostly release-ready.
I have a few small comments / nitpicks, but overall great work!
Quick note, we did make some small changes/fixes in the current release branch. So you might need to rebase.
I think after this is merged, we can open a PR to merge the research branch into release if it's free of research artifacts / temporary scripts.
  if isinstance(context.workflow.model, EnsembleForecastingModel):
      raise NotImplementedError(
          "MLFlowStorageCallback does not yet support EnsembleForecastingWorkflow model storage."
      )

  # Create a new run
  run = self.storage.create_run(
      model_id=context.workflow.model_id,
      tags=context.workflow.model.tags,
-     hyperparams=context.workflow.model.forecaster.hyperparams,
+     hyperparams=context.workflow.model.forecaster.hyperparams,  # type: ignore TODO Make MLFlow compatible with OpenSTEF Meta
We should probably address this before merging.
It's mostly in hyperparams if I understand it correctly? Since the rest is pickling.
Yes, as I understand it, this is required for use in production. I left this integration unresolved because I do not fully understand this part of the package.
The primary issue here is that EnsembleForecastingModel does not have a forecaster attribute, so forecaster.hyperparams is unavailable. Depending on what the hyperparams are used for downstream, we can either:
- Pass combiner.hyperparams
- Pass a dictionary of hyperparams, something like
{
    'forecaster_model_1': Hyperparams(),
    'forecaster_model_2': Hyperparams(),
    'combiner': Hyperparams(),
}
This way everything is neatly saved, but it does require additional changes.
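The dictionary option could look roughly like the sketch below. It is only an illustration: `Hyperparams`, `Component`, `collect_hyperparams` and `flatten_hyperparams` are hypothetical stand-ins, not the actual OpenSTEF classes, and the flattening step assumes the storage backend prefers flat key/value pairs.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class Hyperparams:
    """Hypothetical stand-in for a forecaster's or combiner's hyperparameters."""

    values: dict[str, Any] = field(default_factory=dict)


@dataclass
class Component:
    """Hypothetical stand-in for anything that exposes `.hyperparams`."""

    hyperparams: Hyperparams


def collect_hyperparams(
    forecasters: dict[str, Component], combiner: Component
) -> dict[str, Hyperparams]:
    """Gather the hyperparams of every base forecaster plus the combiner."""
    if "combiner" in forecasters:
        raise ValueError("'combiner' is reserved and cannot be a forecaster name.")
    collected = {name: member.hyperparams for name, member in forecasters.items()}
    collected["combiner"] = combiner.hyperparams
    return collected


def flatten_hyperparams(collected: dict[str, Hyperparams]) -> dict[str, Any]:
    """Flatten to 'component.param' keys (assumed convenient for flat param stores)."""
    return {
        f"{name}.{key}": value
        for name, hyperparams in collected.items()
        for key, value in hyperparams.values.items()
    }
```

With this shape, the storage callback would receive one mapping regardless of how many base forecasters the ensemble holds, at the cost of changing whatever currently expects a single hyperparams object.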
packages/openstef-models/src/openstef_models/models/forecasting/flatliner_forecaster.py
packages/openstef-models/src/openstef_models/models/forecasting/gblinear_forecaster.py
  if isinstance(result, EnsembleModelFitResult):
      self._logger.info("Discarding EnsembleModelFitResult for compatibility.")
      result = result.combiner_fit_result
Nitpick. But I think using log level info might be too verbose. debug would be more appropriate, or no logging at all, since it's not really actionable.
I will change the log level. I do think it would be nice to keep the full EnsembleModelFitResult in the future, to really evaluate the setup.
Current setup:
EnsembleModelFitResults:
    forecaster_results: ModelFitResult[] // Performance of base forecasters
    combiner_results: ModelFitResult // Performance of forecasters + combiner
Right now the combiner fit result gives the total performance after applying the combiner models. Ideally we would separate these, giving something like this:
EnsembleModelFitResults:
    forecaster_results: ModelFitResult[] // Performance of base forecasters
    combiner_results: ModelFitResult // Added performance of combiner
    full_results: ModelFitResult // Performance of forecasters + combiner
This of course implies extra work to make the callbacks etc. compatible, so for now we discard the larger structure and keep only the total result.
packages/openstef-models/src/openstef_models/presets/forecasting_workflow.py
packages/openstef-models/src/openstef_models/transforms/general/selector.py
...stef-beam/src/openstef_beam/backtesting/backtest_forecaster/openstef4_backtest_forecaster.py
...stef-beam/src/openstef_beam/backtesting/backtest_forecaster/openstef4_backtest_forecaster.py
  # Extract quantiles from the workflow's model

  if isinstance(self._workflow.model, EnsembleForecastingModel):
      # Assuming all ensemble members have the same quantiles
Do we also enforce this?
Then we do not have to assume I guess
We do not enforce it explicitly, but if the Forecasting Workflow (Config) is used, it is always the case. I have removed the inline comment. The if statement still applies, as the ensemble forecasting workflow does not have a .forecaster property. Alternatively, we can define a .forecaster property on EnsembleForecastingModel that returns self.forecasters[0], to ensure compatibility with beam etc. without if statements.
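A minimal sketch of both ideas from this thread, enforcing shared quantiles at construction and exposing a `.forecaster` property, could look as follows. The classes are hypothetical, heavily reduced stand-ins; in particular, treating `forecasters` as a list and checking quantiles in `__init__` are assumptions, not the package's actual design.

```python
from dataclasses import dataclass


@dataclass
class Forecaster:
    """Hypothetical base forecaster; only the quantiles matter here."""

    quantiles: tuple[float, ...]


class EnsembleForecastingModel:
    """Hypothetical, reduced ensemble model used only for illustration."""

    def __init__(self, forecasters: list[Forecaster]) -> None:
        if not forecasters:
            raise ValueError("An ensemble needs at least one forecaster.")
        expected = forecasters[0].quantiles
        for index, member in enumerate(forecasters[1:], start=1):
            # Enforce the invariant instead of assuming it downstream.
            if member.quantiles != expected:
                raise ValueError(
                    f"Forecaster {index} has quantiles {member.quantiles}, "
                    f"expected {expected}."
                )
        self.forecasters = forecasters

    @property
    def forecaster(self) -> Forecaster:
        """Compatibility accessor: the first member stands in for the ensemble."""
        return self.forecasters[0]
```

With the check in place, code that reads `model.forecaster.quantiles` would behave the same for single and ensemble models, so the `isinstance` branch in the backtest forecaster might no longer be needed.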
packages/openstef-meta/src/openstef_meta/models/ensemble_forecasting_model.py
packages/openstef-meta/src/openstef_meta/presets/forecasting_workflow.py
packages/openstef-meta/src/openstef_meta/presets/forecasting_workflow.py
Really nice additions to OpenSTEF, I have left some comments. Mostly small nitpicks.
…ybridForecaster2.0 Signed-off-by: Lars van Someren <[email protected]>
Introducing OpenSTEF-Meta, a one-stop shop for all things meta learning.
This sub-package introduces four common meta learning algorithms:
Residual Forecaster
Stacking Forecaster
Learned Weights Forecaster
Rules Forecaster
This is an initial implementation