DAOS-18579 vos: reserve_space() include NVMe info in ERR log #17764

Draft
kccain wants to merge 2 commits into master from kccain/daos_18579_debug_nvme_space

kccain (Contributor) commented Mar 23, 2026

When vos_reserve_blocks() fails, log NVMe space information (free, total, and system-reserved bytes) in addition to the allocation request size.

Steps for the author:

  • Commit message follows the guidelines.
  • Appropriate Features or Test-tag pragmas were used.
  • Appropriate Functional Test Stages were run.
  • At least two positive code reviews including at least one code owner from each category referenced in the PR.
  • Testing is complete. If necessary, forced-landing label added and a reason added in a comment.

After all prior steps are complete:

  • Gatekeeper requested (daos-gatekeeper added as a reviewer).

github-actions bot commented Mar 23, 2026

Ticket title is 'Soak Harasser: Rebuild failed after one pool drain on rank 8: status: -1007 (DER_NOSPACE)'
Status is 'In Progress'
Labels: 'Rebuild,soak,test_2.8'
https://daosio.atlassian.net/browse/DAOS-18579

When vos_reserve_blocks() fails, in addition to the allocation
request size, include additional information about NVMe space:
free, total, and system-reserved bytes.

Signed-off-by: Kenneth Cain <kenneth.cain@hpe.com>
kccain force-pushed the kccain/daos_18579_debug_nvme_space branch from cbf0a7a to 9de81e3 on March 23, 2026 20:31
kccain changed the title from "DAOS-18579 reserve_space() include NVMe info in ERR log" to "DAOS-18579 vos: reserve_space() include NVMe info in ERR log" on Mar 23, 2026
so if we get a rebuild -DER_NOSPACE failure, we can see
what the initial space pressure conditions were.

Also, choose the engine to drain by comparing all pool engines
and picking the one with the least free NVMe space, aggregated
across all of its targets.

Signed-off-by: Kenneth Cain <kenneth.cain@hpe.com>
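
The drain-selection rule described in the second commit can be sketched as follows. This is a minimal illustration only, not the code from this PR: the function name `pick_engine_to_drain`, the `(rank, target, nvme_free_bytes)` input shape, and the tie-break on lowest rank are all assumptions made for the example.

```python
from collections import defaultdict


def pick_engine_to_drain(target_free_bytes):
    """Return the rank whose targets have the least total free NVMe space.

    target_free_bytes: iterable of (rank, target_index, nvme_free_bytes)
    tuples, one per target of every engine in the pool (hypothetical
    input shape, chosen for this sketch).
    """
    # Sum free NVMe bytes over all targets of each engine (rank).
    free_by_rank = defaultdict(int)
    for rank, _target, nvme_free in target_free_bytes:
        free_by_rank[rank] += nvme_free

    if not free_by_rank:
        raise ValueError("no pool engines reported")

    # Least aggregate free space wins; ties go to the lowest rank
    # (the tie-break is an assumption, the PR text does not specify one).
    return min(free_by_rank.items(), key=lambda kv: (kv[1], kv[0]))[0]


# Example: rank 1 owns the single emptiest target (100 bytes free), but
# rank 2 has the least free space in aggregate (600), so rank 2 is chosen.
targets = [
    (0, 0, 500), (0, 1, 400),
    (1, 0, 100), (1, 1, 900),
    (2, 0, 300), (2, 1, 300),
]
chosen = pick_engine_to_drain(targets)
```

Aggregating per engine rather than looking at individual targets means one nearly full target does not by itself decide which engine gets drained.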