Fix infrastructure leak in template from volume creation error message#12650
Fix infrastructure leak in template from volume creation error message#12650erikbocks wants to merge 1 commit intoapache:mainfrom
Conversation
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #12650 +/- ##
============================================
+ Coverage 16.26% 17.94% +1.67%
- Complexity 13428 16165 +2737
============================================
Files 5660 5939 +279
Lines 499963 533017 +33054
Branches 60708 65218 +4510
============================================
+ Hits 81330 95650 +14320
- Misses 409559 426639 +17080
- Partials 9074 10728 +1654
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
There was a problem hiding this comment.
Pull request overview
This PR updates error handling around selecting an image (secondary) datastore so that API-facing exceptions no longer expose internal zone IDs, moving the detailed context into server logs instead.
Changes:
- Replace zone-ID-containing exception messages with a sanitized, user-facing
CloudRuntimeExceptionmessage. - Add error logging that retains the detailed context (zone ID) for operators.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| if (imageStore == null) { | ||
| throw new CloudRuntimeException(String.format("Cannot find an image store for zone [%s].", zoneId)); | ||
| logger.error("Cannot find an image store for zone [{}].", zoneId); | ||
| throw new CloudRuntimeException("Failed to create template. Please contact the cloud administrator."); |
There was a problem hiding this comment.
getImageStore is used for both template creation and volume upload (e.g., VolumeApiServiceImpl.uploadVolume calls _tmpltMgr.getImageStore(...)). Throwing "Failed to create template..." here will surface an incorrect error message to volume-upload callers. Consider using an operation-neutral message (e.g., "Failed to find an image store. Please contact the cloud administrator."), or passing a caller-specific context string into getImageStore so each API reports the right high-level failure without exposing the zone ID.
| throw new CloudRuntimeException("Failed to create template. Please contact the cloud administrator."); | |
| throw new CloudRuntimeException("Failed to find an image store. Please contact the cloud administrator."); |
| logger.error("Cannot find an image store for zone [{}].", zoneId); | ||
| throw new CloudRuntimeException("Failed to create template. Please contact the cloud administrator."); |
There was a problem hiding this comment.
There are existing unit tests for getImageStore in TemplateManagerImplTest, but they currently only assert that an exception is thrown. Since this change is specifically about preventing infrastructure/zone ID leakage via exception messages, please add assertions that the thrown CloudRuntimeException message does not include the zone ID (and matches the new sanitized message).
| if (store == null) { | ||
| throw new CloudRuntimeException("cannot find an image store for zone " + zoneId); | ||
| logger.error("Cannot find an image store for zone [{}].", zoneId); | ||
| throw new CloudRuntimeException("Failed to create template. Please contact the cloud administrator."); |
There was a problem hiding this comment.
The same log+throw pattern and user-facing message is now duplicated here and in getImageStore. To avoid the two sites drifting (especially if you later tweak the sanitized message), consider factoring this into a small helper (e.g., throwImageStoreNotFound(zoneId, operation)), which logs the detailed message and throws the sanitized exception.
Description
Currently, if an error occurs when trying to obtain a secondary storage for the creation of a template from a volume, or when uploading a volume, the message from the thrown exception exposes the zone's internal ID. Thus, the exception message was changed, and the descriptive message was moved to the logs.
Types of changes
Feature/Enhancement Scale or Bug Severity
Feature/Enhancement Scale
Screenshots (if appropriate):
How Has This Been Tested?
In an environment with only one secondary storage, I set it as
read-only. Then, I tried to create a template from a volume. An exception was thrown, informing that an error had occurred, but no infrastructure leak was present. I accessed the logs, and validated that the log with more information was shown, as well as the new exception message.