Skip to content

fix: prevent inventory from overwriting Deleting status for NVLink#2674

Open
obondarev wants to merge 1 commit into
NVIDIA:mainfrom
ya-isakov:nvlink-infinite-deleting-fix
Open

fix: prevent inventory from overwriting Deleting status for NVLink#2674
obondarev wants to merge 1 commit into
NVIDIA:mainfrom
ya-isakov:nvlink-infinite-deleting-fix

Conversation

@obondarev

@obondarev obondarev commented Jun 17, 2026

Copy link
Copy Markdown

Add a guard in the NVLink inventory update loop to skip status updates when the partition is already in Deleting state, matching the existing guard in the InfiniBand Partition processor. Without this, a race between inventory and the delete workflow could overwrite Deleting→Ready, causing the partition to never be cleaned up from the DB.

Related issues

Type of Change

  • Add - New feature or capability
  • Change - Changes in existing functionality
  • Fix - Bug fixes
  • Remove - Removed features or deprecated functionality
  • Internal - Internal changes (refactoring, tests, docs, etc.)

Breaking Changes

  • This PR contains breaking changes

Testing

  • Unit tests added/updated
  • Integration tests added/updated
  • Manual testing performed
  • No testing required (docs, internal refactor, etc.)

Additional Notes

@obondarev obondarev requested a review from a team as a code owner June 17, 2026 12:46
@copy-pr-bot

copy-pr-bot Bot commented Jun 17, 2026

Copy link
Copy Markdown

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@coderabbitai

coderabbitai Bot commented Jun 17, 2026

Copy link
Copy Markdown
Contributor

Review Change Stack

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 9cac6206-3cd1-4f54-ae47-2187d091be56

📥 Commits

Reviewing files that changed from the base of the PR and between 21dee59 and 891d51c.

📒 Files selected for processing (2)
  • rest-api/workflow/pkg/activity/nvlinklogicalpartition/nvlinklogicalpartition.go
  • rest-api/workflow/pkg/activity/nvlinklogicalpartition/nvlinklogicalpartition_test.go
🚧 Files skipped from review as they are similar to previous changes (2)
  • rest-api/workflow/pkg/activity/nvlinklogicalpartition/nvlinklogicalpartition_test.go
  • rest-api/workflow/pkg/activity/nvlinklogicalpartition/nvlinklogicalpartition.go

Summary by CodeRabbit

  • Bug Fixes

    • Enhanced inventory processing to correctly handle NVLink Logical Partitions marked for deletion by skipping further updates and preserving their deleting status, preventing unintended status modifications.
  • Tests

    • Updated test cases to verify proper handling and status preservation of partitions in deleting state.

Walkthrough

UpdateNVLinkLogicalPartitionsInDB now short-circuits status-mapping and status-detail creation for any existing NVLink Logical Partition record whose current DB status is Deleting. The accompanying test suite adds a deletingNVLinkLogicalPartitions expectation slice and verifies that such partitions are not reclassified by inventory processing.

Changes

Deleting Status Guard and Test Coverage

Layer / File(s) Summary
Deleting status guard in UpdateNVLinkLogicalPartitionsInDB
rest-api/workflow/pkg/activity/nvlinklogicalpartition/nvlinklogicalpartition.go
Adds a four-line guard (132–135) that checks whether the existing DB record's status is NVLinkLogicalPartitionStatusDeleting and immediately continues, preventing any downstream status mapping, updates, or status-detail creation for that record.
Test struct extension and Deleting assertions
rest-api/workflow/pkg/activity/nvlinklogicalpartition/nvlinklogicalpartition_test.go
Extends the fields struct with a deletingNVLinkLogicalPartitions slice, reclassifies nvllp3 from updatedNVLinkLogicalPartitions to deletingNVLinkLogicalPartitions in the success case, and adds an assertion loop confirming those partitions retain NVLinkLogicalPartitionStatusDeleting post-processing.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Title check ✅ Passed The title directly describes the core change: preventing inventory from overwriting the Deleting status for NVLink partitions, which is the primary objective of this PR.
Description check ✅ Passed The description is directly related to the changeset, explaining the race condition fix, the guard mechanism, and referencing the matching InfiniBand implementation pattern.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Comment @coderabbitai help to get the list of available commands and usage tips.

Add a guard in the NVLink inventory update loop to skip status updates
when the partition is already in Deleting state, matching the existing
guard in the InfiniBand Partition processor. Without this, a race
between inventory and the delete workflow could overwrite Deleting→Ready,
causing the partition to never be cleaned up from the DB.

Signed-off-by: Oleg Bondarev <obondarev@mirantis.com>
@obondarev obondarev force-pushed the nvlink-infinite-deleting-fix branch from 21dee59 to 891d51c Compare June 17, 2026 12:47
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant