Mitocheck analysis by axiomcura · Pull Request #79 · WayScience/buscar

axiomcura · 2026-03-06T20:12:59Z

This PR adds the buscar results after applying the method to Mitocheck.

This will be merged once #75 is merged.

Note to reviewer

Most of the changes come from the analysis notebooks. The main parts that need review are:

metric.py
the 3.mitocheck-analysis folder

Please ignore the cfret analysis for now. I’m not sure why those results appeared after merging #75. I will submit another PR rerunning the cfret analysis to confirm whether any bugs or errors were introduced after the merge.

Also, please don’t focus too much on the 4.cpjump1-analysis folder, as that analysis is still in progress.

…alysis.py Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

…_base_on_moa.py Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

…alysis.py Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

…otypic-activity.py Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

…ores.py Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

review-notebook-app · 2026-03-06T20:13:05Z

Check out this pull request on

See visual diffs & provide feedback on Jupyter Notebooks.

Powered by ReviewNB

Copilot

Pull request overview

This PR adds Mitocheck Buscar analysis outputs (notebooks/scripts + generated plots/tables) and updates core utility/metric code used by the analysis workflows.

Changes:

Extend shuffle_feature_profiles and add an ENSG→gene-symbol helper for Mitocheck preprocessing.
Refactor/extend buscar.metrics scoring + EMD threading controls used by downstream analyses.
Add Mitocheck analysis runner scripts plus converted notebooks and plot-generation R scripts/artifacts.

Reviewed changes

Copilot reviewed 30 out of 54 changed files in this pull request and generated 7 comments.

Show a summary per file

File	Description
`utils/data_utils.py`	Adds label shuffling option and ENSG→gene symbol API helper used in Mitocheck workflows.
`r_buscar_env.yaml`	Expands environment with Python runtime + core Python deps used in Buscar workflows.
`notebooks/4.cpjump1-analysis/nbconverted/5.cpjump_u2os_MoA_analysis.py`	Removes converted notebook script (analysis reshuffling/cleanup).
`notebooks/4.cpjump1-analysis/nbconverted/3.calculate-on-off-scores.py`	Removes converted notebook script (analysis reshuffling/cleanup).
`notebooks/4.cpjump1-analysis/nbconverted/2.run_buscar_rankings_base_on_moa.py`	Updates MoA ranking workflow to new shuffling + metrics API.
`notebooks/4.cpjump1-analysis/nbconverted/2.assess-heterogeneity.py`	Removes converted notebook script (analysis reshuffling/cleanup).
`notebooks/4.cpjump1-analysis/nbconverted/1.generate-on-off-signatures.py`	Minor notebook export tweaks and output changes.
`notebooks/4.cpjump1-analysis/3.calculate-on-off-scores.ipynb`	Removes notebook.
`notebooks/4.cpjump1-analysis/2.run_buscar_rankings_base_on_moa.ipynb`	Adds updated notebook version of MoA ranking run.
`notebooks/4.cpjump1-analysis/2.assess-heterogeneity.ipynb`	Removes notebook.
`notebooks/4.cpjump1-analysis/1.generate-on-off-signatures.ipynb`	Updates to use `buscar.signatures.get_signatures` and refreshes outputs.
`notebooks/3.mitocheck-analysis/run-mitocheck-buscar.sh`	Adds a shell entrypoint to nbconvert + run Mitocheck Python analysis scripts.
`notebooks/3.mitocheck-analysis/plots/run-mitocheck-plots.sh`	Adds a shell entrypoint to nbconvert + run Mitocheck R plotting scripts.
`notebooks/3.mitocheck-analysis/plots/nbconverted/1.heat-map-on-off-scores.r`	Adds R script to render Mitocheck heatmaps.
`notebooks/3.mitocheck-analysis/plots/nbconverted/2.gene-ranking-relationship.r`	Adds R script to quantify/correlate rank consistency across phenotypic states.
`notebooks/3.mitocheck-analysis/plots/nbconverted/3.linear-modeling-ranking-and-proportion.r`	Adds R script for rank vs. phenotype-proportion modeling/plots.
`notebooks/3.mitocheck-analysis/plots/all-plots/gene-ranking-relationships/profile_specificity_real.csv`	Adds generated summary table artifact.
`notebooks/3.mitocheck-analysis/plots/all-plots/gene-ranking-relationships/profile_specificity_real_vs_shuffled.csv`	Adds generated comparison summary artifact.
`notebooks/3.mitocheck-analysis/plots/all-plots/gene-ranking-relationships/kendall_w_real_permutation_summary.csv`	Adds generated Kendall’s W permutation summary artifact.
`notebooks/3.mitocheck-analysis/plots/all-plots/gene-ranking-relationships/kendall_w_real_shuffled_permutation_summary.csv`	Adds generated real vs shuffled Kendall’s W summary artifact.
`notebooks/3.mitocheck-analysis/nbconverted/0.explore-mitocheck-data.py`	Adds converted Mitocheck EDA script.
`notebooks/3.mitocheck-analysis/nbconverted/1.buscar-analysis.py`	Adds converted Mitocheck Buscar analysis (including LOGO evaluation).
`notebooks/3.mitocheck-analysis/1.buscar-analysis.ipynb`	Adds the Mitocheck analysis notebook source.
`notebooks/2.cfret-analysis/4.assess-signature-significance.ipynb`	Notebook markdown/content adjustments.
`notebooks/2.cfret-analysis/3.generate-pca-umap-components.ipynb`	Notebook output changes (cell counts).
`notebooks/2.cfret-analysis/1.cfret-pilot-buscar-analysis.ipynb`	Notebook metadata version change.
`notebooks/0.download-data/2.preprocessing.ipynb`	Uses ENSG→symbol conversion + stores mapping for Mitocheck.
`buscar/metrics.py`	Refactors score computation and adds EMD threading + optional normalization behavior.
`.pre-commit-config.yaml`	Updates Ruff pre-commit revision.

Comments suppressed due to low confidence (1)

notebooks/4.cpjump1-analysis/nbconverted/2.run_buscar_rankings_base_on_moa.py:377

measure_phenotypic_activity is invoked with emd_n_threads, but the function signature uses n_threads. As written, this will raise TypeError: unexpected keyword argument 'emd_n_threads'. Update the call site to use n_threads (or align the API if emd_n_threads is intended).

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

You can also share your feedback on Copilot code review. Take the survey.

Copilot · 2026-03-06T21:47:07Z

buscar/metrics.py

+        else:
+            raise ValueError(
+                f"Invalid off_calculation '{off_calculation}'. Must be 'ratio_affected'"
+                " or 'emd'."
+            )


In the signature_type == "off" branch, the else: raises for any off_calculation other than "ratio_affected", which makes off_calculation="emd" (allowed by the signature) fail at runtime. Add the EMD off-score branch or remove "emd" from the accepted values.

Copilot · 2026-03-06T21:47:08Z

buscar/metrics.py

        # extract morphological features for reference condition (excluding metadata)
-        ref_profile = profiles.filter(pl.col(treatment_col) == ref_state).drop(
-            meta_cols
-        )
+        ref_profile = profiles.filter(pl.col(state_col) == ref_state).drop(meta_cols)



state_col is allowed to be None, but it is used unconditionally in pl.col(state_col) when building ref_profile. This will fail at runtime for callers that don't pass state_col. Set state_col = treatment_col when state_col is None, and validate state_col exists / has no nulls similar to treatment_col.

Copilot · 2026-03-06T21:47:08Z

buscar/metrics.py

+    signature: list[str],
+    signature_type: Literal["on", "off"],
+    on_calculation: Literal["emd"] = "emd",
+    off_calculation: Literal["ratio_affected", "emd"] = "ratio_affected",


calculate_score (and measure_phenotypic_activity) accept off_calculation/off_method="emd" via the type hints and docs, but there is no implementation path for off-score EMD. Either implement the EMD off-score (mirroring the on-score branch) or remove "emd" from the accepted values to avoid a runtime error.

Suggested change

off_calculation: Literal["ratio_affected", "emd"] = "ratio_affected",

off_calculation: Literal["ratio_affected"] = "ratio_affected",

Copilot · 2026-03-06T21:47:08Z

utils/data_utils.py

+    r = requests.post(
+        f"{server}/lookup/id",
+        headers=headers,
+        data=json.dumps({"ids": ensg_ids}),  # note: "ids" not "id"
+    )


requests.post(...) is made without a timeout. In analysis/utility code this can hang indefinitely on network issues. Consider adding a reasonable timeout (and optionally retries/backoff) so notebook runs fail fast and predictably.

Copilot · 2026-03-06T21:47:08Z

utils/data_utils.py

+        gene_info = decoded.get(eng_gene)
+        if gene_info is not None:
+            decoded_genes[eng_gene] = gene_info.get("display_name", eng_gene)
+        else:
+            decoded_genes[eng_gene] = eng_gene


Docstring claims unresolved ENSG IDs are mapped to "N/A", but the implementation maps missing IDs back to the original ENSG ID (decoded_genes[eng_gene] = eng_gene). Either return "N/A" here or update the docstring to match the behavior (and consider returning {} early when ensg_ids is an empty list).

Copilot · 2026-03-06T21:47:08Z

utils/data_utils.py

 import numpy as np
 import pandas as pd
 import polars as pl
+import requests
 from pycytominer.cyto_utils import infer_cp_features


This module now depends on requests (imported here) for transform_ensg_to_gene_symbol. Ensure requests is declared as a direct project dependency (not just transitively via another package), otherwise fresh installs may fail when the transitive dependency set changes.

Copilot · 2026-03-06T21:47:09Z

utils/data_utils.py

+    if not r.ok:
+        r.raise_for_status()
+        sys.exit(1)
+


Calling sys.exit(1) inside a utility function is a surprising side effect and prevents callers from handling API failures. Prefer raising an exception (after raise_for_status()), or returning an error value that the caller can act on.

axiomcura and others added 27 commits February 12, 2026 11:27

removed cluster refinement module

0094350

removed heterogeneity

7b940d2

updated metrics

a51b553

created buscar module (separated notebook utils and buscar software)

2661eec

updated test module

38d4d33

updated imports to reflect module changs

36abe31

ignore vscode cached files

f6f96ac

updated metrics

5590b4d

Update notebooks/2.cfret-analysis/nbconverted/1.cfret-pilot-buscar-an…

039cd78

…alysis.py Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

Update notebooks/2.cfret-analysis/nbconverted/1.cfret-pilot-buscar-an…

a927cb4

…alysis.py Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

Update buscar/metrics.py

580ec7e

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

Update notebooks/4.cpjump1-analysis/nbconverted/4.run_buscar_rankings…

3a8c5fe

…_base_on_moa.py Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

Update notebooks/2.cfret-analysis/nbconverted/1.cfret-pilot-buscar-an…

71411f1

…alysis.py Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

Update buscar/metrics.py

3a699c4

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

Update notebooks/1.compound-prioritization/nbconverted/4.measure-phen…

77cade9

…otypic-activity.py Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

Update notebooks/4.cpjump1-analysis/nbconverted/3.calculate-on-off-sc…

87a34a9

…ores.py Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

applied copilot's changes

76e3b77

update processing module

fe507bf

reran buscar on cfret data

87a7475

reran nbconvert

286467c

removed pdf

fba3e78

update cpjump1 analysis module

e8e6dc9

updates

965ef28

Merge remote-tracking branch 'origin/main' into mitocheck-analysis

027e7bb

added mitocheck analysis module

406a9da

added label shuffling method

4b7f3c6

mitocheck buscar ranking analysis

1bfa635

Merge remote-tracking branch 'origin/main' into mitocheck-analysis

12d3c42

axiomcura requested a review from Copilot March 6, 2026 21:39

Copilot started reviewing on behalf of axiomcura March 6, 2026 21:40 View session

Copilot AI reviewed Mar 6, 2026

View reviewed changes

removed cpjump analysis folder, this should be in another PR

5addc42

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Mitocheck analysis#79

Mitocheck analysis#79
axiomcura wants to merge 29 commits intoWayScience:mainfrom
axiomcura:mitocheck-analysis

axiomcura commented Mar 6, 2026 •

edited

Loading

Uh oh!

review-notebook-app bot commented Mar 6, 2026

Uh oh!

Copilot AI left a comment

Uh oh!

Copilot AI Mar 6, 2026

Uh oh!

Copilot AI Mar 6, 2026

Uh oh!

Copilot AI Mar 6, 2026

Uh oh!

Copilot AI Mar 6, 2026

Uh oh!

Copilot AI Mar 6, 2026

Uh oh!

Copilot AI Mar 6, 2026

Uh oh!

Copilot AI Mar 6, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

	off_calculation: Literal["ratio_affected", "emd"] = "ratio_affected",
	off_calculation: Literal["ratio_affected"] = "ratio_affected",

Conversation

axiomcura commented Mar 6, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

review-notebook-app bot commented Mar 6, 2026

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Copilot AI Mar 6, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Mar 6, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Mar 6, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Mar 6, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Mar 6, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Mar 6, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Mar 6, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

axiomcura commented Mar 6, 2026 •

edited

Loading