-
-
Notifications
You must be signed in to change notification settings - Fork 370
Adding copy_store convenience method for Group
#3612
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #3612 +/- ##
==========================================
+ Coverage 60.90% 60.97% +0.07%
==========================================
Files 86 86
Lines 10174 10199 +25
==========================================
+ Hits 6196 6219 +23
- Misses 3978 3980 +2
🚀 New features to boost your workflow:
|
|
@d-v-b I know right now this won't work when zarr v2 is used. I can add that functionality as well or we say the convenience function is not supported for zarr v2. |
|
added zarr v2 support. If in general the code is approved, I will add documentation. |
src/zarr/core/group.py
Outdated
| store: StoreLike, | ||
| *, | ||
| overwrite: bool = False, | ||
| consolidate_metadata: bool | None = None, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why do we need this here? IMO it's simpler if copying is just copying. If people want to add consolidated metadata, they should do that after copying.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
adjusted it. It now consolidates if the source store was consolidated, but this does not require an argument. One more thing, I only take into account when consolidation happened at the root level. It could be (although the user should not do this) that consolidation happened at the subgroup level. This could be accounted for by checking if member. metadata.consolidated_metadata when looping through members. WDYT should I add this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would copy the original group and its children exactly, including their consolidated metadata.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What would be the purpose of copying over stale consolidated metadata?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it keeps the scope of this function simple -- it just copies the original group exactly. If this tool produces a different group, then it's not really copying.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
regarding the from_store method, I would open a separate PR for this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Consolidated metadata is already in the group metadata document for zarr v3 if "kind" is set to "inline".
👍 but doing something other (more or less) than just copying would require interpreting that which is too far IMO.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FYI @joshmoore the method under discussion here is a method on the Group class, not on stores. maybe this is a point of confusion.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IMO the only consolidated metadata-related parameter we need is whether iterating over the child groups uses their consolidated metadata (if present). See the relevant kwarg in the signature of members.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added both, only thing remaining is probably adding consolidated_metadata in the async group call.
|
@d-v-b @joshmoore I think I have now addressed all comments and I also added documentation, so I think this would be ready for a final round of review. @d-v-b thanks for your patience. |
|
@d-v-b @joshmoore , just bumping in case you have some time to see whether you approve the current state:) |
| self, | ||
| store: StoreLike, | ||
| *, | ||
| overwrite: bool = False, | ||
| use_consolidated_for_children: bool = True, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we should add a test that checks the signature of this function against the signature of the async version. otherwise, they have a tendency to drift. Here's an example of such a test:
zarr-python/tests/test_api/test_synchronous.py
Lines 26 to 41 in ee0e69a
| def test_docstrings_match(callable_name: str) -> None: | |
| """ | |
| Tests that the docstrings for the sync and async define identical parameters. | |
| """ | |
| callable_a = getattr(synchronous, callable_name) | |
| callable_b = getattr(asynchronous, callable_name) | |
| if callable_a.__doc__ is None: | |
| assert callable_b.__doc__ is None | |
| else: | |
| params_a = NumpyDocString(callable_a.__doc__)["Parameters"] | |
| params_b = NumpyDocString(callable_b.__doc__)["Parameters"] | |
| mismatch = [] | |
| for idx, (a, b) in enumerate(zip(params_a, params_b, strict=False)): | |
| if a != b: | |
| mismatch.append((idx, (a, b))) | |
| assert mismatch == [] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
so I did not do the actual docstrings itself, but I did write a test for all methods for classes and their async counterparts. I think this is generalizable to other classes as well, though currently only Group and AsyncGroup are tested. With this, already some mismatches are detected which for now are skipped but could be addressed in a follow up PR. Doing so would require deprecating some parameters. If you agree I can create an issue for it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please see test_class_method_parameters_match
| self, | ||
| store: StoreLike, | ||
| *, | ||
| path: str | None = None, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what does it mean when path is None? is it meaningfully different from defaulting to the empty string?
| ) | ||
|
|
||
|
|
||
| def get_method_names(cls): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| def get_method_names(cls): | |
| def get_method_names(cls) -> str: |
| from zarr.core.common import JSON, ZarrFormat | ||
|
|
||
|
|
||
| def is_deprecated(method): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| def is_deprecated(method): | |
| def is_deprecated(method) -> bool: |
| ] | ||
|
|
||
|
|
||
| def get_method_signature(cls, method_name: str): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| def get_method_signature(cls, method_name: str): | |
| def get_method_signature(cls, method_name: str) -> dict[str, whatever]: |
| return src, arr_data, subgroup_arr_data, zarr_format, consolidate_metadata | ||
|
|
||
|
|
||
| def consolidate(src_store, src, consolidate_metadata, zarr_format): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please add type annotations
| ], | ||
| indirect=True, | ||
| ) | ||
| def test_copy_to(copy_to_test_data) -> None: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| def test_copy_to(copy_to_test_data) -> None: | |
| def test_copy_to(copy_to_test_data: tuple[int, None | tuple[int, ...], bool]) -> None: |
| ], | ||
| indirect=True, | ||
| ) | ||
| def test_copy_to_with_path(copy_to_test_data) -> None: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| def test_copy_to_with_path(copy_to_test_data) -> None: | |
| def test_copy_to_with_path(copy_to_test_data: tuple[int, None | tuple[int, ...], bool]) -> None: |
[Description of PR]
This PR adds a convenience method for copying the store of a
Groupto a destination store.TODO:
docs/user-guide/*.mdchanges/@d-v-b