[MNT] Fix #1039: Eliminate global seeds and replace sklearn.utils.check_random_state#1041
Open
krsatyamthakur-droid wants to merge 2 commits into
Open
Conversation
Contributor
Author
|
Hii @fkiraly "I've refactored this to avoid the scikit-learn dependency as you suggested. I added a local NumPy version of check_random_state so we can handle the seeds internally. Should be good now!" |
229867a to
89ad843
Compare
…or random state This PR addresses sktime#1039 by: 1. Implementing a pure NumPy check_random_state in skpro/utils/random_state.py to avoid sklearn dependency as requested by the maintainer. 2. Removing np.random.seed() calls in bootstrap.py, enbpi.py, _bagging.py, and mdn.py. 3. Ensuring local random state management using the new utility. 4. Updating all internal references to use the internal check_random_state. Signed-off-by: satyam kumar <krsatyamthakur@gamil.com>
89ad843 to
5faf681
Compare
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Reference Issues/PRs
Fixes #1039.
What does this implement/fix? Explain your changes.
Cleaned up global
np.random.seed()calls in several regressors to stop them from mutating the global state.Following the maintainer's suggestion, I also removed the
scikit-learndependency for random state management by adding a pure NumPycheck_random_statehelper inskpro/utils/random_state.py.Changes:
check_random_stateutility inrandom_state.pyso we don't have to pull fromsklearn.utils.np.random.seed()frombootstrap.py,enbpi.py,_bagging.py, andmdn.py.self._random_statefor all sampling and noise logic.Does your contribution introduce a new dependency? If yes, which one?
No. This actually helps remove the
sklearndependency for these modules.What should a reviewer concentrate their feedback on?
Check the
check_random_stateimplementation inskpro/utils/random_state.pyto make sure it covers all the edge cases for NumPy seeds. Also, double-check that I didn't miss any sampling steps in the regressors that still use the globalnp.random.Did you add any tests for the change?
I ran the existing tests to make sure I didn't break reproducibility:
pytest skpro/regression/tests/test_mdn.py(Passed)pytest skpro/regression/tests/test_all_regressors.py(63 tests for the affected regressors all passed).Any other comments?
This should make these regressors thread-safe and stop them from messing with the user's local environment.
PR checklist