Conversation
Note: Reviews paused. It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior in the settings. Use the following commands to manage reviews, or the checkboxes below for quick actions.
📝 Walkthrough

Adds a development Dockerfile, relaxes the sdrf-pipelines dependency, updates an import path for UnimodDatabase, refactors SDRF validation/CLI to use a unified validator with a minimal mode, and extends tests for DIA-NN unified design parsing and error cases.
Sequence Diagram(s)

```mermaid
sequenceDiagram
    autonumber
    participant CLI as "quantmsutils CLI"
    participant FS as "Filesystem (SDRF / design files)"
    participant Validator as "sdrf_pipelines (read_sdrf / validate_sdrf)"
    participant Parser as "quantmsutils.get_exp_design_dfs"
    CLI->>FS: open --sdrf / --exp_design file
    CLI->>Validator: read_sdrf(path) / validate_sdrf(template or minimal)
    Validator-->>CLI: validation result (ok / errors)
    CLI->>Parser: get_exp_design_dfs(validated_path)
    Parser-->>CLI: DataFrames or raise ValueError
    CLI->>FS: report errors or continue processing
```
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45 minutes

🚥 Pre-merge checks: ✅ 3 passed
Actionable comments posted: 4
🧹 Nitpick comments (1)
tests/test_commands.py (1)
110-118: This only tests the bypass branch.
--skip_sdrf_validation exits before read_sdrf()/validate_sdrf() run, so the refactored validation path in this PR is still untested. Please add at least one non-skip case.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/test_commands.py` around lines 110 - 118: the test only covers the branch that bypasses SDRF validation. Add a complementary test that exercises the real validation path by calling run_cli_command("checksamplesheet", args) without the "--skip_sdrf_validation" flag, using a valid or intentionally invalid SDRF file from TEST_DATA_DIR (e.g., "PXD000001.sdrf.tsv" or another fixture) so that read_sdrf() and validate_sdrf() are invoked. Create a new test function (e.g., test_check_samplesheet_sdrf_validate) or parametrize the existing test to include both skip and non-skip cases, and assert the appropriate exit_code and/or error messages from validate_sdrf().
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@Dockerfile.dev`:
- Around line 1-5: The Dockerfile leaves the container running as root; create
and switch to a non-root user: add user/group creation (e.g., addgroup/adduser
or groupadd/useradd), create the project directory if needed, chown /src to that
user, and add a USER <username> line before installing/running app commands so
subsequent steps run as the non-root user; update relevant RUN/COPY/WORKDIR
sequences in the Dockerfile to ensure ownership is set (references: WORKDIR
/src, COPY . ., RUN pip install --no-cache-dir .).
In `@pyproject.toml`:
- Line 34: The dependency specification for sdrf-pipelines is too open
(sdrf-pipelines = ">=0.1.1") and could allow breaking pre-1.0 API changes that
this PR relies on (UnimodDatabase, read_sdrf); update pyproject.toml to pin a
stable upper bound such as ">=0.1.1,<0.2.0" (or use a compatible spec like
"~=0.1.1") so installs remain reproducible and the expected APIs
(UnimodDatabase, read_sdrf) won't change unexpectedly.
In `@quantmsutils/sdrf/check_samplesheet.py`:
- Around line 1-5: The file references os and errno inside the make_dir() helper
but those modules are not imported, causing F821; either restore the missing
imports (add import os and import errno at the top) or remove/replace make_dir()
usage; specifically update check_samplesheet.py to import os and errno so
make_dir() can call os.path.exists, os.makedirs and errno.EEXIST, or
refactor/inline the functionality where make_dir() is used (look for the
make_dir function and call sites to ensure consistency).
- Around line 46-50: Change the CLI option that supplies the SDRF path to use
click.Path(exists=True, dir_okay=False, readable=True) so the shell/CLI
validates file existence for the parameter (the variable passed into read_sdrf
is input_sdrf / the --exp_design option). Wrap calls to read_sdrf(input_sdrf)
and df.validate_sdrf(...) in a try/except that catches exceptions from
parsing/validation and re-raises them as click.ClickException with a clear
message (include the original exception text) so failures surface as
CLI-friendly errors; apply the same pattern to the second occurrence covering
lines ~63–80.
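On the make_dir() point above: a minimal reconstruction might look like the sketch below. This is illustrative only, using the os/errno calls the comment names; the repository's actual helper may differ, and modern code would simply call os.makedirs(path, exist_ok=True) or pathlib.Path.mkdir(parents=True, exist_ok=True).

```python
import errno
import os


def make_dir(path: str) -> None:
    """Create a directory (and parents) if it does not already exist.

    Hypothetical reconstruction of the helper the review refers to.
    """
    if not os.path.exists(path):
        try:
            os.makedirs(path)
        except OSError as exc:
            # Another process may have created the directory in the meantime;
            # re-raise for any error other than "already exists".
            if exc.errno != errno.EEXIST:
                raise
```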
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 9390029a-3072-4eff-b38f-0d00d472e928
⛔ Files ignored due to path filters (1)

- tests/test_data/diann2msstats/PXD026600_diann_design.tsv is excluded by !**/*.tsv

📒 Files selected for processing (5)

- Dockerfile.dev
- pyproject.toml
- quantmsutils/diann/dianncfg.py
- quantmsutils/sdrf/check_samplesheet.py
- tests/test_commands.py
```dockerfile
FROM python:3.11-slim
RUN apt-get update && apt-get install -y --no-install-recommends git procps libglib2.0-0t64 && rm -rf /var/lib/apt/lists/*
WORKDIR /src
COPY . .
RUN pip install --no-cache-dir .
```
Run the dev container as a non-root user.
The image never switches away from root. That keeps the whole dev session over-privileged and will also leave bind-mounted files owned by root on the host.
Suggested change

```diff
 FROM python:3.11-slim
-RUN apt-get update && apt-get install -y --no-install-recommends git procps libglib2.0-0t64 && rm -rf /var/lib/apt/lists/*
+RUN apt-get update \
+    && apt-get install -y --no-install-recommends git procps libglib2.0-0t64 \
+    && useradd --create-home --shell /bin/bash appuser \
+    && rm -rf /var/lib/apt/lists/*
 WORKDIR /src
 COPY . .
 RUN pip install --no-cache-dir .
+RUN chown -R appuser:appuser /src
+USER appuser
```

📝 Committable suggestion

‼️ IMPORTANT: Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test the code to ensure it meets the requirements.

```dockerfile
FROM python:3.11-slim
RUN apt-get update \
    && apt-get install -y --no-install-recommends git procps libglib2.0-0t64 \
    && useradd --create-home --shell /bin/bash appuser \
    && rm -rf /var/lib/apt/lists/*
WORKDIR /src
COPY . .
RUN pip install --no-cache-dir .
RUN chown -R appuser:appuser /src
USER appuser
```
🧰 Tools
🪛 Trivy (0.69.3)
[error] 1-1: Image user should not be 'root'
Specify at least 1 USER command in Dockerfile with non-root user as argument
Rule: DS-0002
(IaC/Dockerfile)
```diff
 python = "*"
 click = "*"
-sdrf-pipelines = "==0.0.33"
+sdrf-pipelines = ">=0.1.1"
```
🧩 Analysis chain
🌐 Web query:
Which released sdrf-pipelines versions expose both sdrf_pipelines.converters.openms.unimod.UnimodDatabase and sdrf_pipelines.sdrf.sdrf.read_sdrf / validate_sdrf, and what version range should downstream projects pin to in order to preserve those APIs?
💡 Result:
No released versions of sdrf-pipelines explicitly documented as exposing both sdrf_pipelines.converters.openms.unimod.UnimodDatabase and sdrf_pipelines.sdrf.sdrf.read_sdrf / validate_sdrf APIs could be identified from available sources. The package has releases from 0.0.2 to 0.1.0 (latest as of 2026-02-04), primarily CLI-focused (parse_sdrf validate-sdrf, convert-openms). Repository structure confirms sdrf_pipelines/ directory with sdrf/ and converters/, suggesting Python APIs exist, but no public API docs, changelogs, or version-specific mentions found. Downstream projects should test specific versions (e.g., >=0.0.33 <0.1.0) or pin to latest stable 0.1.0 after verifying API presence via code inspection, as no breaking change evidence exists.
Citations:
- 1: https://github.com/bigbio/sdrf-pipelines/releases
- 2: https://pypi.org/project/sdrf-pipelines/
- 3: https://github.com/bigbio/sdrf-pipelines
- 4: https://pypi.org/project/sdrf-pipelines/0.0.33/
- 5: https://pypi.org/project/sdrf-pipelines/0.1.1/
- 6: [ENH] Need for more comprehensive integration testing sdrf-pipelines#112
- 7: https://bioconda.github.io/recipes/sdrf-pipelines/README.html
🏁 Script executed:

```bash
#!/bin/bash
# Check sdrf-pipelines versions available on PyPI and GitHub releases
# via the PyPI JSON API.
echo "=== PyPI sdrf-pipelines releases ==="
curl -s "https://pypi.org/pypi/sdrf-pipelines/json" | jq -r '.releases | keys[]' | sort -V
echo -e "\n=== Latest version info ==="
curl -s "https://pypi.org/pypi/sdrf-pipelines/json" | jq '.info | {version, release_url}'
```

Repository: bigbio/quantms-utils
Length of output: 441
🏁 Script executed:

```bash
#!/bin/bash
# Verify the code uses the mentioned sdrf-pipelines APIs
echo "=== Searching for sdrf_pipelines imports in the codebase ==="
rg "from sdrf_pipelines|import sdrf_pipelines" -n
echo -e "\n=== Files that import sdrf_pipelines modules ==="
rg "sdrf_pipelines\." -l | head -20
```

Repository: bigbio/quantms-utils
Length of output: 433
Bound sdrf-pipelines to a stable version range.
This PR now depends on specific sdrf-pipelines APIs (UnimodDatabase, read_sdrf), but >=0.1.1 allows future pre-1.0 releases where these may change. Add an upper bound to keep installs reproducible.
Suggested change

```diff
-sdrf-pipelines = ">=0.1.1"
+sdrf-pipelines = ">=0.1.1,<0.2.0"
```

📝 Committable suggestion

‼️ IMPORTANT: Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation.
```python
df = read_sdrf(input_sdrf)
errors = df.validate_sdrf(
    template="ms-proteomics",
    use_ols_cache_only=use_ols_cache_only,
)
```
🧩 Analysis chain
🌐 Web query:
In Click, what is the recommended way to validate an input file option so missing paths fail as CLI usage errors rather than uncaught exceptions? Is click.Path(exists=True, dir_okay=False, readable=True) the right fit for this case?
💡 Result:
Yes, click.Path(exists=True, dir_okay=False, readable=True) is the recommended way to validate an input file option in Click. If a missing or invalid path is provided (or the option is required and omitted), Click raises a click.BadParameter exception during parameter validation, which is handled as a proper CLI usage error with formatted help text like "Error: Invalid value for '--input-file': Path 'missing.txt' does not exist." rather than propagating an uncaught exception later in the code.

Key features from the official docs:

- exists=True: requires the path to exist; missing paths fail validation immediately.
- dir_okay=False: rejects directories, accepting only files.
- readable=True: ensures the file is readable (default, but explicit for clarity).

For an input file option, use it like:

@click.option('--input-file', type=click.Path(exists=True, dir_okay=False, readable=True), required=True)

If required=True is omitted, validation only applies to provided values. This catches issues early as CLI errors, preventing unhandled exceptions when accessing the path later.
Citations:
- 1: https://click.palletsprojects.com/en/stable/handling-files/
- 2: https://click.palletsprojects.com/en/stable/parameter-types/
- 3: https://click.palletsprojects.com/en/8.1.x/arguments/
- 4: https://click.palletsprojects.com/en/stable/parameters/
🏁 Script executed:

```bash
# First, locate the file
find . -name "check_samplesheet.py" -type f
```

Repository: bigbio/quantms-utils
Length of output: 104

🏁 Script executed:

```bash
# Once found, examine the file around the target lines (40-90)
cat -n quantmsutils/sdrf/check_samplesheet.py | sed -n '35,85p'
```

Repository: bigbio/quantms-utils
Length of output: 1771

🏁 Script executed:

```bash
# Also check if there are any other similar option decorators in the file
rg -n "click\.option" quantmsutils/sdrf/check_samplesheet.py -A 2
```

Repository: bigbio/quantms-utils
Length of output: 369

🏁 Script executed:

```bash
# Check if there are other commands with click.option decorators for file paths
rg -B 5 "@click.command" quantmsutils/sdrf/check_samplesheet.py | head -50
```

Repository: bigbio/quantms-utils
Length of output: 110

🏁 Script executed:

```bash
# Get the complete file to understand all commands
wc -l quantmsutils/sdrf/check_samplesheet.py
cat -n quantmsutils/sdrf/check_samplesheet.py
```

Repository: bigbio/quantms-utils
Length of output: 2945
Use click.Path() to validate file existence at the CLI boundary, and wrap parser errors as ClickExceptions.
Currently, --exp_design accepts any string without validation, so missing files are only discovered inside read_sdrf() and bubble up as raw exceptions. Validate the path at the CLI boundary using click.Path(exists=True, dir_okay=False, readable=True) to catch missing files early. Additionally, wrap read_sdrf() and validate_sdrf() calls in a try-except block to translate parsing failures to ClickException, ensuring all errors surface as proper CLI usage messages rather than raw tracebacks.
Suggested changes

```diff
-@click.option("--exp_design", help="SDRF file to be validated", required=True)
+@click.option(
+    "--exp_design",
+    type=click.Path(exists=True, dir_okay=False, readable=True, path_type=str),
+    help="SDRF file to be validated",
+    required=True,
+)
 @click.option("--skip_sdrf_validation", help="Skip all SDRF validation", is_flag=True)
@@
-    df = read_sdrf(input_sdrf)
-    errors = df.validate_sdrf(
-        template="ms-proteomics",
-        use_ols_cache_only=use_ols_cache_only,
-    )
+    try:
+        df = read_sdrf(input_sdrf)
+        errors = df.validate_sdrf(
+            template="ms-proteomics",
+            use_ols_cache_only=use_ols_cache_only,
+        )
+    except Exception as exc:
+        raise click.ClickException(str(exc)) from exc
```

Also applies to: 63-80
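The wrap-and-re-raise pattern suggested here can be shown as a dependency-free sketch. CLIError below is a stand-in for click.ClickException, and load_design is a hypothetical helper, so the example runs without Click installed; the point is only the exception-translation shape, not the actual quantms-utils code.

```python
class CLIError(Exception):
    """Stand-in for click.ClickException: carries a user-facing message."""


def load_design(path: str) -> str:
    """Read a design file, translating low-level failures into a CLI-friendly error."""
    try:
        with open(path, encoding="utf-8") as fh:
            return fh.read()
    except Exception as exc:
        # Re-raise with context so the CLI prints one clear message
        # instead of a raw traceback.
        raise CLIError(f"Failed to read design file '{path}': {exc}") from exc
```

With Click, the except block would raise click.ClickException(str(exc)) from exc instead, and Click formats it as a normal error message with a non-zero exit code.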
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@quantmsutils/sdrf/check_samplesheet.py`:
- Around line 5-6: Resolve the sdrf-pipelines version mismatch by choosing the
supported upstream version (either pin to 0.0.33 in pyproject.toml or update
requirements.txt to >=0.1.1), then update the code to match that version's API:
verify and, if necessary, change the import "from sdrf_pipelines.sdrf.sdrf
import read_sdrf" to the correct module path used by the selected version and
update calls to df.validate_sdrf(template=..., use_ols_cache_only=...) to match
the actual parameter names and signature in that version (rename or remove
use_ols_cache_only or template args as required); ensure both dependency files
reference the same version constraint so installs are consistent.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 8630374e-df9f-43fa-b364-685ef327004f
📒 Files selected for processing (1)
quantmsutils/sdrf/check_samplesheet.py
```python
from sdrf_pipelines.sdrf.sdrf import read_sdrf
```
🧩 Analysis chain
🌐 Web query:
sdrf-pipelines 0.0.33 read_sdrf validate_sdrf API
💡 Result:
sdrf-pipelines 0.0.33 is a Python package (https://pypi.org/project/sdrf-pipelines/0.0.33/, released Nov 23, 2025) from the bigbio/sdrf-pipelines GitHub repo (https://github.com/bigbio/sdrf-pipelines). It is the official SDRF (Sample and Data Relationship Format) file validator and converter for proteomics/multiomics, translating SDRF files to pipeline configurations (e.g., OpenMS, MaxQuant, MSstats).

Primary usage is via the CLI tool parse_sdrf:

- validate-sdrf: validates SDRF structure, formatting, uniqueness, and optionally ontology terms (EFO, CL, MS) using YAML schema templates (default: mass spectrometry). Example: parse_sdrf validate-sdrf --sdrf_file path/to/sdrf.tsv --template human --skip-ontology
- Simplified: parse_sdrf validate-sdrf-simple path/to/sdrf.tsv --template default

No explicit Python API documentation for read_sdrf or validate_sdrf functions was found in official sources (PyPI, GitHub README; COMMANDS.md is referenced but not detailed). These may be internal functions called by the CLI (e.g., in sdrf_pipelines.sdrf.validators or parsers), accessible after pip install sdrf-pipelines by importing the module, but primarily designed for CLI use. Check the GitHub source code (sdrf_pipelines/ directory) for programmatic access. Install with pip install sdrf-pipelines==0.0.33 or sdrf-pipelines[ontology] for full validation.
Citations:
- 1: https://pypi.org/project/sdrf-pipelines/0.0.33/
- 2: https://github.com/bigbio/sdrf-pipelines/blob/dev/README.md
- 3: https://github.com/bigbio/sdrf-pipelines/releases
- 4: https://github.com/bigbio/sdrf-pipelines
- 5: https://pypi.org/project/sdrf-pipelines/
🏁 Script executed:

```bash
#!/bin/bash
# Verify the sdrf-pipelines 0.0.33 API structure by fetching the GitHub
# source at the 0.0.33 tag and inspecting the sdrf_pipelines.sdrf.sdrf module.
curl -s "https://raw.githubusercontent.com/bigbio/sdrf-pipelines/0.0.33/sdrf_pipelines/sdrf/sdrf.py" | head -100
echo "---"
# Also check for the SDRF class and the read_sdrf / validate_sdrf definitions.
curl -s "https://raw.githubusercontent.com/bigbio/sdrf-pipelines/0.0.33/sdrf_pipelines/sdrf/sdrf.py" | grep -A 10 "def read_sdrf\|def validate_sdrf\|class.*SDRF" | head -50
```

Repository: bigbio/quantms-utils
Length of output: 81
🏁 Script executed:

```bash
#!/bin/bash
# Check for dependency specifications in the repository
find . -type f \( -name "setup.py" -o -name "requirements*.txt" -o -name "pyproject.toml" -o -name "setup.cfg" \) | head -20
echo "---"
# Check the actual usage in the check_samplesheet.py file
cat -n quantmsutils/sdrf/check_samplesheet.py | head -50
```

Repository: bigbio/quantms-utils
Length of output: 1897
🏁 Script executed:

```bash
#!/bin/bash
# Check requirements.txt and pyproject.toml for the sdrf-pipelines version
echo "=== requirements.txt ==="
cat requirements.txt | grep -i sdrf
echo ""
echo "=== pyproject.toml ==="
cat pyproject.toml | grep -A 5 -B 5 sdrf
```

Repository: bigbio/quantms-utils
Length of output: 285
🏁 Script executed:

```bash
#!/bin/bash
# Check how read_sdrf and validate_sdrf are used in check_samplesheet.py
cat -n quantmsutils/sdrf/check_samplesheet.py
echo ""
echo "=== Checking if there are other files using these imports ==="
rg "read_sdrf|validate_sdrf" --type py
```

Repository: bigbio/quantms-utils
Length of output: 2914
Fix version conflict: requirements.txt (0.0.33) vs pyproject.toml (>=0.1.1).
The requirements.txt pins sdrf-pipelines==0.0.33 while pyproject.toml specifies sdrf-pipelines>=0.1.1. This creates an installation inconsistency. Verify whether the code targets 0.0.33 specifically or can work with 0.1.1+, then align both files. If the import path from sdrf_pipelines.sdrf.sdrf import read_sdrf and the method signature df.validate_sdrf(template=..., use_ols_cache_only=...) changed between versions, adjust the code accordingly.
We are tackling this issue here: bigbio/quantms#646. We are adding a new option called minimal that allows running quantms and quantmsdiann with minimal metadata (no organism, no organism part, etc.), checking only the structure of the SDRF and the parameters needed for the workflow.
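A structure-only check of that kind could be sketched as follows. This is an illustrative sketch, not the quantms-utils implementation: the REQUIRED_COLUMNS list is a guess at the kind of columns a workflow run needs, not the actual set the minimal mode enforces.

```python
import csv

# Hypothetical minimal column set; the real workflow may require others.
REQUIRED_COLUMNS = [
    "source name",
    "comment[data file]",
    "comment[cleavage agent details]",
]


def validate_minimal(sdrf_path: str) -> list[str]:
    """Check only SDRF structure: header presence, required columns, non-empty body."""
    errors: list[str] = []
    with open(sdrf_path, newline="", encoding="utf-8") as fh:
        reader = csv.reader(fh, delimiter="\t")
        try:
            header = [col.strip().lower() for col in next(reader)]
        except StopIteration:
            return ["ERROR: SDRF file is empty."]
        for col in REQUIRED_COLUMNS:
            if col not in header:
                errors.append(f"ERROR: missing required column '{col}'.")
        if next(reader, None) is None:
            errors.append("ERROR: SDRF file contains a header but no data rows.")
    return errors
```

Returning a list of error strings (empty means valid) mirrors how the validate_sdrf calls elsewhere in this review report their results.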
Actionable comments posted: 2
♻️ Duplicate comments (1)
quantmsutils/sdrf/check_samplesheet.py (1)
54-61: ⚠️ Potential issue | 🟠 Major: Turn bad --exp_design inputs into CLI errors.
--exp_design is still a plain string, and failures from read_sdrf()/pd.read_csv() will currently bubble out as raw exceptions. Missing files and malformed SDRFs should fail as Click errors, not Python tracebacks.

Proposed fix
```diff
-@click.option("--exp_design", help="SDRF file to be validated", required=True)
+@click.option(
+    "--exp_design",
+    type=click.Path(exists=True, dir_okay=False, readable=True, path_type=str),
+    help="SDRF file to be validated",
+    required=True,
+)
@@
-    if minimal:
-        errors = _validate_minimal(input_sdrf)
-    else:
-        df = read_sdrf(input_sdrf)
-        errors = df.validate_sdrf(
-            template=template,
-            use_ols_cache_only=use_ols_cache_only,
-        )
+    try:
+        if minimal:
+            errors = _validate_minimal(input_sdrf)
+        else:
+            df = read_sdrf(input_sdrf)
+            errors = df.validate_sdrf(
+                template=template,
+                use_ols_cache_only=use_ols_cache_only,
+            )
+    except Exception as exc:
+        raise click.ClickException(str(exc)) from exc
```

Also applies to: 105, 121-133
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@quantmsutils/sdrf/check_samplesheet.py` around lines 54 - 61, Wrap SDRF parsing/validation so filesystem and CSV parse errors become Click errors: import click and catch exceptions raised by read_sdrf() / pandas (e.g., FileNotFoundError, pd.errors.ParserError, ValueError) and any exceptions from df.validate_sdrf(...) and re-raise them as click.ClickException with a clear message including input_sdrf and the original error; do the same for the minimal-path call to _validate_minimal(input_sdrf) and for the other similar blocks referenced (the code around the calls to read_sdrf, _validate_minimal, and df.validate_sdrf at the other locations) so user-facing errors are emitted instead of raw tracebacks.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@tests/test_commands.py`:
- Around line 141-148: The no-op expression f.name is unused after the
NamedTemporaryFile context; capture the temp file path into a variable and use
that in args. After writing in the with tempfile.NamedTemporaryFile(...) as f
block, assign the filename to a variable (e.g., temp_sdrf_path = f.name) and
then set args = ["--minimal", "--exp_design", temp_sdrf_path]; reference the
tempfile.NamedTemporaryFile usage and the args construction in
tests/test_commands.py to locate where to apply this change.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: c7f062f0-4f30-4dcd-b69a-503e8fa74c3a
📒 Files selected for processing (3)
- quantmsutils/sdrf/check_samplesheet.py
- recipe/meta.yaml
- tests/test_commands.py
🧹 Nitpick comments (1)
tests/test_commands.py (1)
126-149: Clean up temp SDRF files after assertions. Both tests create files with delete=False and never remove them. This can leave residue across repeated runs.

Proposed cleanup pattern

```diff
 with tempfile.NamedTemporaryFile(mode="w", suffix=".sdrf.tsv", delete=False) as f:
     ...
     tmp_path = f.name
-args = ["--minimal", "--exp_design", tmp_path]
-result = run_cli_command("checksamplesheet", args)
-assert result.exit_code == 0
+try:
+    args = ["--minimal", "--exp_design", tmp_path]
+    result = run_cli_command("checksamplesheet", args)
+    assert result.exit_code == 0
+finally:
+    if os.path.exists(tmp_path):
+        os.unlink(tmp_path)
```

```diff
 with tempfile.NamedTemporaryFile(mode="w", suffix=".sdrf.tsv", delete=False) as f:
     ...
     tmp_path = f.name
-args = ["--minimal", "--exp_design", tmp_path]
-result = run_cli_command("checksamplesheet", args)
-assert result.exit_code != 0
-assert "cleavage agent" in result.output.lower()
+try:
+    args = ["--minimal", "--exp_design", tmp_path]
+    result = run_cli_command("checksamplesheet", args)
+    assert result.exit_code != 0
+    assert "cleavage agent" in result.output.lower()
+finally:
+    if os.path.exists(tmp_path):
+        os.unlink(tmp_path)
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/test_commands.py` around lines 126 - 149, The tests create temporary SDRF files with delete=False (tmp_path) and never remove them; update both test_check_samplesheet_minimal and test_check_samplesheet_minimal_missing_column to ensure the temp file is unlinked after the test by wrapping the run/assertions in a try/finally (or use pytest tmp_path fixture) and call os.unlink(tmp_path) in the finally block so the tmp_path is always removed even if the assertion fails; reference the tmp_path variable in the cleanup to locate the file.
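The try/finally cleanup idea in isolation, as a runnable sketch; write_temp_tsv and the file content are illustrative stand-ins for the test fixtures, not the actual test code:

```python
import os
import tempfile


def write_temp_tsv(content: str) -> str:
    """Write content to a temp .tsv file and return its path; caller must delete it."""
    with tempfile.NamedTemporaryFile(
        mode="w", suffix=".sdrf.tsv", delete=False
    ) as f:
        f.write(content)
        return f.name


tmp_path = write_temp_tsv("source name\nsample 1\n")
try:
    # ... run the command under test here and assert on its result ...
    assert os.path.exists(tmp_path)
finally:
    # Always remove the file, even if an assertion above fails.
    if os.path.exists(tmp_path):
        os.unlink(tmp_path)
```

With pytest, the tmp_path fixture avoids the manual cleanup entirely, since pytest deletes its temporary directories for you.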
🧹 Nitpick comments (1)
quantmsutils/sdrf/check_samplesheet.py (1)
75-78: Use df.empty instead of len(df) == 0 for pandas idiom. Minor style suggestion: df_rows.empty is the idiomatic pandas way to check for an empty DataFrame.

Suggested change

```diff
 df_rows = pd.read_csv(input_sdrf, sep="\t", nrows=1)
-if len(df_rows) == 0:
+if df_rows.empty:
     errors.append("ERROR: SDRF file contains a header but no data rows.")
     return errors
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@quantmsutils/sdrf/check_samplesheet.py` around lines 75 - 78, Replace the len(df_rows) == 0 check with the pandas-idiomatic df_rows.empty in the function where df_rows is created (the block reading the SDRF with pd.read_csv in check_samplesheet.py); specifically, update the conditional that appends "ERROR: SDRF file contains a header but no data rows." to use df_rows.empty to detect an empty DataFrame.
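For reference, DataFrame.empty is True whenever the frame has no rows (or no columns), which is exactly the header-but-no-data case here. A small demonstration, assuming pandas is installed; the in-memory TSV mimics the SDRF read above:

```python
import io

import pandas as pd

# A TSV with a header but no data rows, read the same way as the SDRF above.
header_only = io.StringIO("source name\tcomment[data file]\n")
df_rows = pd.read_csv(header_only, sep="\t", nrows=1)

print(df_rows.empty)      # no rows, so True
print(len(df_rows) == 0)  # equivalent check, less idiomatic
```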
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 6cf28c03-b9f1-49e8-b87d-0a43edfdc43a
📒 Files selected for processing (1)
quantmsutils/sdrf/check_samplesheet.py