Skip to content

[MNT] Fix #1039: Eliminate global seeds and replace sklearn.utils.check_random_state#1041

Open
krsatyamthakur-droid wants to merge 2 commits into
sktime:mainfrom
krsatyamthakur-droid:fix/issue-1039-no-sklearn
Open

[MNT] Fix #1039: Eliminate global seeds and replace sklearn.utils.check_random_state#1041
krsatyamthakur-droid wants to merge 2 commits into
sktime:mainfrom
krsatyamthakur-droid:fix/issue-1039-no-sklearn

Conversation

@krsatyamthakur-droid
Copy link
Copy Markdown
Contributor

Reference Issues/PRs

Fixes #1039.

What does this implement/fix? Explain your changes.

Cleaned up global np.random.seed() calls in several regressors to stop them from mutating the global state.

Following the maintainer's suggestion, I also removed the scikit-learn dependency for random state management by adding a pure NumPy check_random_state helper in skpro/utils/random_state.py.

Changes:

  • Added an internal check_random_state utility in random_state.py so we don't have to pull from sklearn.utils.
  • Removed np.random.seed() from bootstrap.py, enbpi.py, _bagging.py, and mdn.py.
  • Updated those estimators to use a local self._random_state for all sampling and noise logic.
  • Switched imports to use our new internal utility.

Does your contribution introduce a new dependency? If yes, which one?

No. This actually helps remove the sklearn dependency for these modules.

What should a reviewer concentrate their feedback on?

Check the check_random_state implementation in skpro/utils/random_state.py to make sure it covers all the edge cases for NumPy seeds. Also, double-check that I didn't miss any sampling steps in the regressors that still use the global np.random.

Did you add any tests for the change?

I ran the existing tests to make sure I didn't break reproducibility:

  • pytest skpro/regression/tests/test_mdn.py (Passed)
  • pytest skpro/regression/tests/test_all_regressors.py (63 tests for the affected regressors all passed).

Any other comments?

This should make these regressors thread-safe and stop them from messing with the user's local environment.

PR checklist

  • I've added myself to the list of contributors.
  • The PR title starts with [MNT].

@krsatyamthakur-droid
Copy link
Copy Markdown
Contributor Author

Hii @fkiraly "I've refactored this to avoid the scikit-learn dependency as you suggested. I added a local NumPy version of check_random_state so we can handle the seeds internally. Should be good now!"

@krsatyamthakur-droid krsatyamthakur-droid force-pushed the fix/issue-1039-no-sklearn branch from 229867a to 89ad843 Compare April 23, 2026 20:20
…or random state

This PR addresses sktime#1039 by:
1. Implementing a pure NumPy check_random_state in skpro/utils/random_state.py to avoid sklearn dependency as requested by the maintainer.
2. Removing np.random.seed() calls in bootstrap.py, enbpi.py, _bagging.py, and mdn.py.
3. Ensuring local random state management using the new utility.
4. Updating all internal references to use the internal check_random_state.

Signed-off-by: satyam kumar <krsatyamthakur@gamil.com>
@krsatyamthakur-droid krsatyamthakur-droid force-pushed the fix/issue-1039-no-sklearn branch from 89ad843 to 5faf681 Compare April 23, 2026 20:42
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[MNT] Eliminate global np.random.seed calls to prevent global state mutation.

1 participant