0.6.25 #416

Merged
joocer merged 4 commits into main from 0.6.25
Sep 4, 2025

Conversation

joocer (Member) commented Sep 4, 2025

No description provided.

joocer merged commit 33ea9be into main on Sep 4, 2025
3 of 12 checks passed
joocer deleted the 0.6.25 branch on September 4, 2025 19:53
joocer requested a review from Copilot on September 4, 2025 19:53

Copilot AI left a comment

Pull Request Overview

This pull request updates the version to 0.6.25 and makes several changes to improve consistency and functionality across the codebase.

  • Updates version from 0.6.24 to 0.6.25
  • Replaces hash functions with xxhash for better performance and consistency
  • Refactors test assertions to use set comparisons for order-independent testing

Reviewed Changes

Copilot reviewed 12 out of 15 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| mabel/version.py | Updates version number to 0.6.25 |
| mabel/data/internals/group_by.py | Replaces siphash with xxhash for group key generation |
| mabel/data/readers/internals/cursor.py | Replaces CityHash64 with xxhash for partition identification |
| tests/test_data_group_by.py | Updates test assertions to use set comparisons and fixes import paths |
| tests/test_data_dictset.py | Updates expected hash value and fixes import path |
| tests/test_reader_cursor.py | Updates partition values, removes unused code, and fixes import path |
| mabel/data/writers/internals/blob_writer.py | Removes zstd compression parameter from parquet writer |
| mabel/utils/dates.py | Adds trailing comma for function parameter |
| mabel/data/validator/__init__.py | Adds trailing comma for function parameter |
| tests/test_utils_common.py | Simplifies docstring format |
| tests/performance/indexing.py | Simplifies docstring format |
| mabel/data/readers/internals/threaded_wrapper.py | Simplifies docstring format |

Comment on lines +80 to +82
group_key: int = xxh3_64_intdigest(
    "".join([f"{record.get(column, '')}" for column in self._columns]),
    HASH_SEED,

Copilot AI Sep 4, 2025

The xxh3_64_intdigest function call has an extra comma after HASH_SEED. This should be removed to match the function signature and the usage pattern on lines 75-77.

Suggested change
  group_key: int = xxh3_64_intdigest(
      "".join([f"{record.get(column, '')}" for column in self._columns]),
-     HASH_SEED,
+     HASH_SEED
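
For reference on the comment above: a trailing comma after the last argument of a call is valid Python syntax and does not change the arguments passed, so the suggested change is stylistic rather than a behavioral fix. The sketch below illustrates this with `hashlib.blake2b` standing in for `xxh3_64_intdigest`. The stand-in `seeded_hash64`, the two `group_key_*` helpers, and the `HASH_SEED` value are assumptions made for illustration and are not mabel's code.

```python
import hashlib

# Hypothetical seed value; the real HASH_SEED is not shown in this PR.
HASH_SEED = 42


def seeded_hash64(data: str, seed: int) -> int:
    """Stand-in for xxh3_64_intdigest: a seeded 64-bit integer digest."""
    digest = hashlib.blake2b(
        data.encode("utf-8"),
        digest_size=8,  # 8 bytes = 64 bits
        key=seed.to_bytes(8, "little"),
    ).digest()
    return int.from_bytes(digest, "little")


def group_key_with_trailing_comma(record: dict, columns: list) -> int:
    # Same shape as the reviewed call: trailing comma after the last argument.
    return seeded_hash64(
        "".join([f"{record.get(column, '')}" for column in columns]),
        HASH_SEED,
    )


def group_key_without_trailing_comma(record: dict, columns: list) -> int:
    # Same call with the trailing comma removed, as in the suggested change.
    return seeded_hash64(
        "".join([f"{record.get(column, '')}" for column in columns]),
        HASH_SEED
    )
```

Both helpers return the same key for the same record and columns, since the two calls pass identical arguments.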

Comment on lines +41 to +42
expected = [{'COUNT(*)': 6, 'user': 'alice'}, {'COUNT(*)': 5, 'user': 'bob'}, {'COUNT(*)': 2, 'user': 'eve'}]
assert set(tuple(sorted(d.items())) for d in ls) == set(tuple(sorted(d.items())) for d in expected)

Copilot AI Sep 4, 2025

[nitpick] The set comparison pattern using tuple(sorted(d.items())) is repeated throughout this file. Consider extracting this into a helper function like assert_dict_sets_equal(actual, expected) to reduce duplication and improve readability.
