@Jacobluke-
Collaborator

No description provided.

JAnns98 and others added 4 commits October 20, 2025 16:01
v2025.10.20 Bingka

Various changes including:
- whorlmaps
- updated slopegraph aesthetics with added group summaries
- updated mini meta delta calculation
- extra custom_palette functionality
…-format data

Remove overly aggressive NaN filtering in _check_errors() that was causing
data truncation when using wide-format paired data with different group sizes.

Problem:
When loading wide-format paired data created by concatenating DataFrames of
different lengths (e.g., 20, 10, and 40 samples), the package was removing
ALL rows with ANY NaN value across ALL columns. This truncated all groups
to the size of the smallest group.

Root Cause:
In the _check_errors() method, the code had:
elif x is None and y is None:
    self.__output_data.dropna(inplace=True)

This removed entire rows if they had NaN in ANY column, affecting all groups
even though NaN values were structural (from DataFrame concatenation) and not
actual missing data points.
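
A minimal sketch of the truncation, using plain pandas rather than the package itself. The group names (Control, Test 1, Test 2) and the random values are hypothetical; only the sizes 20, 10, and 40 come from the description above.

```python
import numpy as np
import pandas as pd

# Hypothetical groups with the sizes from the description: 20, 10, and 40.
rng = np.random.default_rng(0)
sizes = {"Control": 20, "Test 1": 10, "Test 2": 40}

# Concatenating columns of unequal length pads the shorter ones with NaN.
wide = pd.concat(
    [pd.Series(rng.normal(size=n), name=name) for name, n in sizes.items()],
    axis=1,
)
print(wide.shape)  # (40, 3)

# A blanket dropna() removes every row that has NaN in ANY column,
# so only the rows where all three groups have a value survive.
truncated = wide.dropna()
print(len(truncated))  # 10
```

Every group is cut to the 10 rows of the smallest group, although Control and Test 2 have no missing measurements of their own.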

Solution:
Removed the problematic elif block from _check_errors(). The downstream code
in _get_plot_data() already handles NaN values correctly by:
1. Using pd.melt() to reshape to long format, which keeps every measured value
2. Calling dropna(subset=[self.__yvar]), which removes only rows with NaN in
    the measurement column, not rows with NaN in any other column
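
The two steps above can be sketched with plain pandas. This is an illustration under assumptions, not the package's actual _get_plot_data() code: the column names "group" and "value" and the group names are hypothetical stand-ins.

```python
import numpy as np
import pandas as pd

# Hypothetical wide-format data with group sizes 20, 10, and 40.
rng = np.random.default_rng(0)
sizes = {"Control": 20, "Test 1": 10, "Test 2": 40}
wide = pd.concat(
    [pd.Series(rng.normal(size=n), name=name) for name, n in sizes.items()],
    axis=1,
)

# Step 1: melt to long format, one row per (group, value) pair.
# The NaN padding is still present at this point.
long = pd.melt(wide, var_name="group", value_name="value")

# Step 2: drop rows only where the measurement column itself is NaN.
long = long.dropna(subset=["value"])
print(len(long))  # 70

# Each group keeps its original number of samples.
counts = long.groupby("group")["value"].size()
print(counts)
```

Because the NaN padding is removed per measurement after reshaping, a short group no longer limits how many samples the longer groups keep.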

Testing:
- Added test_33_multi_paired_different_sizes() to verify groups with 20, 10,
and 40 samples are preserved correctly