<refactor>[compute]: split VmInstanceManagerImpl 2808L into 6 SubManagers #3400
zstack-robot-1 wants to merge 1 commit into 5.5.12 from
Conversation
…gers Resolves: ZSTAC-82456 Change-Id: Ifecc0b1a31fb5fb2c37c8e9f3893bf2096c51ce3
Overview
This change introduces six new virtual machine (VM) management sub-module classes that split VM creation, expunge cleanup, extension-point handling, API interfaces, workflow-chain management, and quota calculation into separate modules, improving code organization and maintainability. In total, about 2,358 lines of code are added.
Changes
Code review effort estimate: 🎯 4 (complex) | ⏱️ ~60 minutes
Important: Pre-merge checks failed. Please resolve all errors before merging. Addressing warnings is optional. ❌ Failed checks (1 error, 1 warning)
✅ Passed checks (1 passed)
Actionable comments posted: 8
🧹 Nitpick comments (1)
compute/src/main/java/org/zstack/compute/vm/VmExpungeSubManager.java (1)
167-169: Using indexOf in the callback to look up the matching message is O(n²); iterate by index instead
Iterating replies directly by index avoids the repeated lookups and improves readability. ♻️ Suggested refactor

```diff
-for (MessageReply r : replies) {
-    ExpungeVmMsg msg = msgs.get(replies.indexOf(r));
+for (int i = 0; i < replies.size(); i++) {
+    MessageReply r = replies.get(i);
+    ExpungeVmMsg msg = msgs.get(i);
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@compute/src/main/java/org/zstack/compute/vm/VmExpungeSubManager.java` around lines 167 - 169, The loop in VmExpungeSubManager currently finds the matching ExpungeVmMsg by calling msgs.get(replies.indexOf(r)) inside a for-each over replies, causing O(n²); change the loop to iterate by index (for i from 0 to replies.size()-1) or use an indexed for loop so you can directly access replies.get(i) and msgs.get(i), then use MessageReply r = replies.get(i) and ExpungeVmMsg msg = msgs.get(i) to handle the success/failure logic; update any references inside the loop accordingly to avoid repeated indexOf calls.
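The indexed-iteration point generalizes beyond this callback. A minimal standalone sketch, with plain strings standing in for the MessageReply/ExpungeVmMsg types (which are assumptions of this example, not part of the snippet above):

```java
import java.util.Arrays;
import java.util.List;

public class IndexedPairing {
    // Pair two parallel lists by index: O(n), instead of calling
    // replies.indexOf(r) inside the loop, which makes the whole thing O(n^2).
    public static String pair(List<String> msgs, List<String> replies) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < replies.size(); i++) {
            sb.append(msgs.get(i)).append("->").append(replies.get(i));
            if (i < replies.size() - 1) {
                sb.append(",");
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // prints m1->ok,m2->fail
        System.out.println(pair(Arrays.asList("m1", "m2"), Arrays.asList("ok", "fail")));
    }
}
```

The sketch assumes, as the review does, that msgs and replies are parallel lists of the same length.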
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@compute/src/main/java/org/zstack/compute/vm/VmCreationSubManager.java`:
- Around line 1152-1156: The code in VmCreationSubManager initializes result
with Collections.emptyList(), then calls result.add(...) in defensive branches
which throws UnsupportedOperationException; change the initialization in the
affected helper methods (the methods that declare List<ErrorCode> result =
Collections.emptyList()) to return a mutable list (e.g., new ArrayList<>()) so
result.add(...) is safe, and update both occurrences so all subsequent adds to
result succeed; keep the rest of the logic in the same methods (refer to the
result variable and the methods in VmCreationSubManager that perform the
null/msg checks).
- Around line 577-600: Inside VmCreationSubManager (the method containing the
loop over pss) you are causing an N+1 by querying BackupStorageVO inside the for
(PrimaryStorageVO ps : pss) loop; instead, collect all bsUuids from every
PrimaryStorageVO first (use
PrimaryStorageType.valueOf(ps.getType()).findBackupStorage(ps.getUuid()) for
each ps), deduplicate into one Set, run a single query
TypedQuery<BackupStorageVO> with "where bs.uuid in (:uuids)" to fetch all
BackupStorageVOs at once, then build a Map of backupUuid->BackupStorageVO and
aggregate/backfill the original per-primary-storage lists (or create a
Map<PrimaryStorageVO, List<BackupStorageVO>>) so you can replace the per-ps
bq.getResultList() calls with lookups into that map and remove the loop-in-query
pattern.
In `@compute/src/main/java/org/zstack/compute/vm/VmExpungeSubManager.java`:
- Around line 63-65: VmExpungeSubManager.stop() currently returns true without
cancelling the scheduled periodic task, causing a lifecycle leak; update the
stop() method in class VmExpungeSubManager to cancel the expungeVmTask (check
expungeVmTask != null), call its cancel(...) method on the
ScheduledFuture/TimerTask instance, optionally call shutdown on the scheduler if
owned by this class, and null out the expungeVmTask reference so the background
task is terminated when the component stops.
In `@compute/src/main/java/org/zstack/compute/vm/VmExtensionPointAdapter.java`:
- Around line 89-108: The method resourceOwnerAfterChange prematurely returns
when rootVolumeUuid is null or when vmnics is empty, which prevents
changeVmCdRomsOwner(ref.getResourceUuid(), newOwnerUuid) from running; modify
resourceOwnerAfterChange so it does not return early: if rootVolumeUuid is null
skip acntMgr.changeResourceOwner(rootVolumeUuid, ...) but continue, and when
vmnics is empty skip the loop that calls acntMgr.changeResourceOwner(vmnicUuid,
...) but continue to invoke changeVmCdRomsOwner(ref.getResourceUuid(),
newOwnerUuid); update the guards around acntMgr.changeResourceOwner and the
VmNicVO query (dbf.createQuery / VmNicVO_, VmNicVO_.uuid,
VmNicVO_.vmInstanceUuid) to conditionally execute without exiting the method.
- Around line 148-150: Fix the logging typos in VmExtensionPointAdapter: replace
the misspelled "unknowngc" message in the logger.warn call that mentions
creating an unknowngc job for vm[uuid:%s] to a clear English phrase such as
"create an UnknownVmGC job for vm[uuid:%s]" (or "create an unknown GC job for
vm[uuid:%s]") and also correct any possessive misuse in the earlier message
("it's" -> "its") so both logger.warn statements use proper spelling and
grammar.
In `@compute/src/main/java/org/zstack/compute/vm/VmNicApiSubManager.java`:
- Around line 84-108: The retry loop that only updates nicVO.mac leaves nic (the
in-memory representation) stale and still calls trigger.next() even if all
retries fail; fix by (1) after a successful SQLBatch.persist() inside the while
loop, copy the persisted mac back to the in-memory nic (e.g., set
nic.setMac(nicVO.getMac())) so callers see the new MAC, and (2) if the loop
exhausts all attempts (no successful persist), do not call trigger.next() —
instead throw an exception (or rethrow the last PersistenceException) to abort
further processing; update references around nicVO, nic, the SQLBatch block and
the tries loop, and keep using MacOperator.generateMacWithDeviceId(...) when
regenerating macs on Duplicate entry.
In `@compute/src/main/java/org/zstack/compute/vm/VmQuotaSubManager.java`:
- Around line 124-140: The loop in VmQuotaSubManager that iterates
diskOfferingCountMap.keySet() issues a DB query per entry (TypedQuery in the
loop) causing N+1 queries; replace it with a single Q API batch query to fetch
diskSize for all UUIDs in diskOfferingCountMap (use
Q.New(DiskOfferingVO.class).in(DiskOfferingVO_.uuid,
diskOfferingCountMap.keySet()).select(DiskOfferingVO_.diskSize).listValues()),
then locally multiply each returned diskSize by the corresponding count from
diskOfferingCountMap and sum into allVolumeSizeAsked, removing the per-UUID
TypedQuery and looped DB calls.
- Line 144: The current addCheckCondition in VmQuotaSubManager uses
msg.getStrategy().equals(VmCreationStrategy.JustCreate.toString()) which can NPE
if strategy is null; change the condition to a null-safe comparison (e.g., use
Objects.equals or call the constant's equals with msg.getStrategy()) so the
check does not throw when msg.getStrategy() is null, and keep the surrounding
logic in addCheckCondition((msg) -> ...) unchanged.
---
ℹ️ Review info
Configuration used: Path: http://open.zstack.ai:20001/code-reviews/zstack-cloud.yaml (via .coderabbit.yaml)
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)
conf/springConfigXml/VmInstanceManager.xml is excluded by !**/*.xml
📒 Files selected for processing (7)
compute/src/main/java/org/zstack/compute/vm/VmCreationSubManager.java
compute/src/main/java/org/zstack/compute/vm/VmExpungeSubManager.java
compute/src/main/java/org/zstack/compute/vm/VmExtensionPointAdapter.java
compute/src/main/java/org/zstack/compute/vm/VmFlowChainRegistry.java
compute/src/main/java/org/zstack/compute/vm/VmInstanceManagerImpl.java
compute/src/main/java/org/zstack/compute/vm/VmNicApiSubManager.java
compute/src/main/java/org/zstack/compute/vm/VmQuotaSubManager.java
```java
for (String l3uuid : l3s) {
    String sql = "select ps" +
            " from PrimaryStorageVO ps, L2NetworkClusterRefVO l2ref," +
            " L3NetworkVO l3, PrimaryStorageClusterRefVO psref" +
            " where ps.uuid = psref.primaryStorageUuid" +
            " and psref.clusterUuid = l2ref.clusterUuid" +
            " and l2ref.l2NetworkUuid = l3.l2NetworkUuid" +
            " and l3.uuid = :l3uuid";
    TypedQuery<PrimaryStorageVO> psq = dbf.getEntityManager().createQuery(sql, PrimaryStorageVO.class);
    psq.setParameter("l3uuid", l3uuid);
    List<PrimaryStorageVO> pss = psq.getResultList();

    List<BackupStorageVO> lst = new ArrayList<>();
    for (PrimaryStorageVO ps : pss) {
        PrimaryStorageType psType = PrimaryStorageType.valueOf(ps.getType());
        List<String> bsUuids = psType.findBackupStorage(ps.getUuid());

        if (!bsUuids.isEmpty()) {
            // the primary storage has bound backup storage, e.g. ceph
            sql = "select bs from BackupStorageVO bs where bs.uuid in (:uuids)";
            TypedQuery<BackupStorageVO> bq = dbf.getEntityManager().createQuery(sql, BackupStorageVO.class);
            bq.setParameter("uuids", bsUuids);
            lst.addAll(bq.getResultList());
        } else {
```
🧩 Analysis chain
🏁 Script executed:
```shell
cat -n compute/src/main/java/org/zstack/compute/vm/VmCreationSubManager.java | sed -n '570,610p'
```
Repository: MatheMatrix/zstack

🏁 Script executed:
```shell
# search for TypedQuery usage across the whole file
rg -n 'TypedQuery' compute/src/main/java/org/zstack/compute/vm/VmCreationSubManager.java -B2 -A2
```

🏁 Script executed:
```shell
# get the file's total line count and more context
wc -l compute/src/main/java/org/zstack/compute/vm/VmCreationSubManager.java
```
Nested query inside a loop violates the performance guidelines
The for (PrimaryStorageVO ps : pss) loop at lines 590-604 runs a BackupStorageVO query (lines 597-599) for every PrimaryStorageVO record, causing an N+1 problem. Collect all bsUuids in bulk first, then fetch every BackupStorageVO in a single query and aggregate by PrimaryStorage, per the coding guideline that forbids queries inside loops.
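The collect-then-batch pattern the finding describes can be sketched independently of ZStack's persistence layer. Here the in-memory `db` map is a hypothetical stand-in for the result of one `where uuid in (:uuids)` query:

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class BatchLookup {
    // Stand-in for a single batched "in (:uuids)" query: one round trip for all keys.
    public static Map<String, Long> fetchAll(Set<String> uuids, Map<String, Long> db) {
        Map<String, Long> out = new HashMap<>();
        for (String u : uuids) {
            Long v = db.get(u);
            if (v != null) {
                out.put(u, v);
            }
        }
        return out;
    }

    // Collect every key first, run one batched fetch, then resolve per item from the map,
    // instead of issuing one query per item inside the loop (N+1).
    public static long totalSize(List<String> perItemUuids, Map<String, Long> db) {
        Set<String> all = new HashSet<>(perItemUuids); // deduplicate
        Map<String, Long> cache = fetchAll(all, db);   // one round trip, not N
        long total = 0;
        for (String u : perItemUuids) {
            total += cache.getOrDefault(u, 0L);
        }
        return total;
    }

    public static void main(String[] args) {
        Map<String, Long> db = new HashMap<>();
        db.put("a", 10L);
        db.put("b", 5L);
        // prints 25
        System.out.println(totalSize(Arrays.asList("a", "b", "a"), db));
    }
}
```

The same shape applies to the review's suggestion: one query over the deduplicated bsUuids, a uuid→VO map, then per-primary-storage lookups.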
```java
List<ErrorCode> result = Collections.emptyList();
if (msg == null) {
    result.add(operr(ORG_ZSTACK_COMPUTE_VM_10238, "CreateVmInstanceMsg cannot be null"));
    return result;
} else if (cmsg != null && cmsg.getSystemTags() != null && !cmsg.getSystemTags().isEmpty()) {
```
🧩 Analysis chain
🏁 Script executed:
```shell
# confirm the file size and the relevant lines
wc -l compute/src/main/java/org/zstack/compute/vm/VmCreationSubManager.java
```

🏁 Script executed:
```shell
# inspect the code around lines 1152-1156 and 1165-1168
sed -n '1150,1170p' compute/src/main/java/org/zstack/compute/vm/VmCreationSubManager.java | cat -n
```

🏁 Script executed:
```shell
# find every use of Collections.emptyList()
rg -n 'Collections\.emptyList\(\)' compute/src/main/java/org/zstack/compute/vm/VmCreationSubManager.java -B2 -A5
```
Calling add() after Collections.emptyList() throws immediately
Both helper methods initialize result as an immutable list, then call result.add(...) in their defensive branches. Collections.emptyList() returns an immutable collection, so calling add() on it throws UnsupportedOperationException. The normal call chain usually passes a non-null msg, but the defensive branch itself is broken.
Suggested fix
```diff
 private List<ErrorCode> extEmitterHandleSystemTag(final CreateVmInstanceMsg msg, final APICreateMessage cmsg, VmInstanceVO finalVo) {
-    List<ErrorCode> result = Collections.emptyList();
+    List<ErrorCode> result = new ArrayList<>();
     if (msg == null) {
         result.add(operr(ORG_ZSTACK_COMPUTE_VM_10238, "CreateVmInstanceMsg cannot be null"));
         return result;

 private List<ErrorCode> extEmitterHandleSshKeyPair(final CreateVmInstanceMsg msg, final APICreateMessage cmsg, VmInstanceVO finalVo) {
-    List<ErrorCode> result = Collections.emptyList();
+    List<ErrorCode> result = new ArrayList<>();
     if (msg == null) {
         result.add(operr(ORG_ZSTACK_COMPUTE_VM_10239, "CreateVmInstanceMsg cannot be null"));
         return result;
```

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```diff
-List<ErrorCode> result = Collections.emptyList();
+List<ErrorCode> result = new ArrayList<>();
 if (msg == null) {
     result.add(operr(ORG_ZSTACK_COMPUTE_VM_10238, "CreateVmInstanceMsg cannot be null"));
     return result;
 } else if (cmsg != null && cmsg.getSystemTags() != null && !cmsg.getSystemTags().isEmpty()) {
```
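The failure mode is easy to reproduce in isolation; this self-contained sketch shows why the defensive branch throws and why new ArrayList<>() does not:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class EmptyListPitfall {
    // Returns true if add() on the given list throws UnsupportedOperationException.
    public static boolean addThrows(List<String> list) {
        try {
            list.add("err");
            return false;
        } catch (UnsupportedOperationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(addThrows(Collections.emptyList())); // true: immutable singleton
        System.out.println(addThrows(new ArrayList<>()));       // false: mutable
    }
}
```

Collections.emptyList() is fine as a *return value* for the no-error path, but never as the starting point of a list you intend to append to.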
```java
public boolean stop() {
    return true;
}
```
stop() does not cancel the periodic task, risking a lifecycle leak
When the component stops, it should actively cancel expungeVmTask so the background task does not keep running.
🔧 Suggested fix
```diff
 @Override
 public boolean stop() {
+    if (expungeVmTask != null) {
+        expungeVmTask.cancel(true);
+        expungeVmTask = null;
+    }
     return true;
 }
```

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```java
@Override
public boolean stop() {
    if (expungeVmTask != null) {
        expungeVmTask.cancel(true);
        expungeVmTask = null;
    }
    return true;
}
```
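A minimal sketch of the suggested lifecycle, using the JDK's ScheduledExecutorService as a stand-in for ZStack's periodic-task facility (the real expungeVmTask type is an assumption here, not shown by the review):

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;

public class StopLifecycle {
    private final ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
    private ScheduledFuture<?> task;

    public void start() {
        // periodic task, analogous to the expunge scan
        task = scheduler.scheduleAtFixedRate(() -> { /* scan and expunge */ }, 0, 100, TimeUnit.MILLISECONDS);
    }

    // stop() must cancel the periodic task; otherwise it keeps firing after the component "stops".
    public boolean stop() {
        if (task != null) {
            task.cancel(true);
            task = null;
        }
        scheduler.shutdown(); // this class owns the scheduler, so shut it down too
        return true;
    }

    public static void main(String[] args) {
        StopLifecycle s = new StopLifecycle();
        s.start();
        System.out.println(s.stop()); // true, and no background task survives
    }
}
```

The null-check-cancel-null-out sequence mirrors the committable suggestion above; shutting the scheduler down applies only when the component owns it.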
```java
if (rootVolumeUuid == null) {
    return;
}

acntMgr.changeResourceOwner(rootVolumeUuid, newOwnerUuid);

// change vmnic(s)
SimpleQuery<VmNicVO> sq = dbf.createQuery(VmNicVO.class);
sq.select(VmNicVO_.uuid);
sq.add(VmNicVO_.vmInstanceUuid, Op.EQ, ref.getResourceUuid());
List<String> vmnics = sq.listValue();
if (vmnics.isEmpty()) {
    return;
}
for (String vmnicUuid : vmnics) {
    acntMgr.changeResourceOwner(vmnicUuid, newOwnerUuid);
}

changeVmCdRomsOwner(ref.getResourceUuid(), newOwnerUuid);
```
Early returns cut the resource-owner migration short
resourceOwnerAfterChange() returns as soon as the root volume is missing or the NIC list is empty, so changeVmCdRomsOwner() never runs and the CD-ROM owner may stay out of sync. Missing resources should only be skipped; the method should not end the whole flow early.
🔧 Suggested fix
```diff
-String rootVolumeUuid = q.findValue();
-if (rootVolumeUuid == null) {
-    return;
-}
-
-acntMgr.changeResourceOwner(rootVolumeUuid, newOwnerUuid);
+String rootVolumeUuid = q.findValue();
+if (rootVolumeUuid != null) {
+    acntMgr.changeResourceOwner(rootVolumeUuid, newOwnerUuid);
+}

 // change vmnic(s)
 SimpleQuery<VmNicVO> sq = dbf.createQuery(VmNicVO.class);
 sq.select(VmNicVO_.uuid);
 sq.add(VmNicVO_.vmInstanceUuid, Op.EQ, ref.getResourceUuid());
 List<String> vmnics = sq.listValue();
-if (vmnics.isEmpty()) {
-    return;
-}
 for (String vmnicUuid : vmnics) {
     acntMgr.changeResourceOwner(vmnicUuid, newOwnerUuid);
 }

 changeVmCdRomsOwner(ref.getResourceUuid(), newOwnerUuid);
```
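The guard-instead-of-return structure can be illustrated without the ZStack APIs; the recorded step names below are inventions of this sketch:

```java
import java.util.ArrayList;
import java.util.List;

public class OwnerChangeOrder {
    // Records which ownership steps ran. Guards skip missing resources instead of
    // returning early, so the final CD-ROM step always executes.
    public static List<String> changeOwner(String rootVolumeUuid, List<String> nicUuids) {
        List<String> steps = new ArrayList<>();
        if (rootVolumeUuid != null) {      // skip, don't return
            steps.add("volume:" + rootVolumeUuid);
        }
        for (String nic : nicUuids) {      // an empty list simply skips the loop
            steps.add("nic:" + nic);
        }
        steps.add("cdrom");                // always reached
        return steps;
    }

    public static void main(String[] args) {
        // even with no volume and no NICs, the CD-ROM step runs; prints [cdrom]
        System.out.println(changeOwner(null, new ArrayList<>()));
    }
}
```

With the original early returns, a VM that had no root volume would never reach the CD-ROM step at all.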
```java
logger.warn(String.format("the host[uuid:%s] disconnected, but the vm[uuid:%s] fails to " +
        "change it's state to Unknown, %s", hostUuid, vmUuid, reply.getError()));
logger.warn(String.format("create an unknowngc job for vm[uuid:%s]", vmUuid));
```
Fix the log spelling errors to avoid noise when searching alerts
Change unknowngc to UnknownVmGC or unknown GC.
As per the coding guidelines: code must not contain Chinese; error messages, comments, and the like should all be written in correct, typo-free English.
```java
int tries = 5;
while (tries-- > 0) {
    try {
        new SQLBatch() {
            @Override
            protected void scripts() {
                persist(nicVO);
            }
        }.execute();
        break;
    } catch (PersistenceException e) {
        if (ExceptionDSL.isCausedBy(e, SQLIntegrityConstraintViolationException.class, "Duplicate entry")) {
            logger.debug(String.format("Concurrent mac allocation. Mac[%s] has been allocated, try allocating another one. " +
                    "The error[Duplicate entry] printed by jdbc.spi.SqlExceptionHelper is no harm, " +
                    "we will try finding another mac", nicVO.getMac()));
            logger.trace("", e);
            nicVO.setMac(MacOperator.generateMacWithDeviceId((short) nicVO.getDeviceId()));
        } else {
            throw e;
        }
    }
}

trigger.next();
```
The MAC retry branch leaves dirty state, and execution continues even after retries are exhausted
On retry only nicVO.mac is updated, so nic.mac may keep its old value; and when all 5 attempts fail, trigger.next() still runs, carrying a NIC that was never persisted into the subsequent IP-allocation flow.
🔧 Suggested fix
```diff
 int tries = 5;
+boolean persisted = false;
 while (tries-- > 0) {
     try {
         new SQLBatch() {
             @Override
             protected void scripts() {
                 persist(nicVO);
             }
         }.execute();
+        persisted = true;
         break;
     } catch (PersistenceException e) {
         if (ExceptionDSL.isCausedBy(e, SQLIntegrityConstraintViolationException.class, "Duplicate entry")) {
             logger.debug(String.format("Concurrent mac allocation. Mac[%s] has been allocated, try allocating another one. " +
                     "The error[Duplicate entry] printed by jdbc.spi.SqlExceptionHelper is no harm, " +
                     "we will try finding another mac", nicVO.getMac()));
             logger.trace("", e);
             nicVO.setMac(MacOperator.generateMacWithDeviceId((short) nicVO.getDeviceId()));
+            nic.setMac(nicVO.getMac());
         } else {
             throw e;
         }
     }
 }
+if (!persisted) {
+    trigger.fail(Platform.operr("failed to persist vm nic after mac retry exhausted"));
+    return;
+}
+nic.setMac(nicVO.getMac());
 trigger.next();
```
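The persisted-flag pattern the fix relies on, reduced to a standalone sketch. The duplicate-key collision is simulated by a `succeedOn` parameter, which is an invention of this example (0 means the persist never succeeds):

```java
public class RetryPersist {
    // Simulates a persist that keeps hitting a duplicate-key collision until
    // attempt `succeedOn` (1-based); succeedOn == 0 means it never succeeds.
    public static boolean persistWithRetry(int succeedOn, int maxTries) {
        boolean persisted = false;
        int attempt = 0;
        while (attempt++ < maxTries) {
            if (succeedOn != 0 && attempt >= succeedOn) {
                persisted = true; // record success before leaving the loop
                break;
            }
            // on a duplicate-entry failure, regenerate the conflicting key here and retry
        }
        if (!persisted) {
            // do NOT fall through to the next flow step; abort instead
            throw new IllegalStateException("failed to persist after " + maxTries + " tries");
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(persistWithRetry(3, 5)); // true: succeeded on the 3rd attempt
    }
}
```

The key point matches the diff above: success must be recorded explicitly, and loop exhaustion must fail the flow rather than silently continuing.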
```java
HashMap<String, Long> diskOfferingCountMap = new HashMap<>();
if (!diskOfferingUuids.isEmpty()) {
    for (String diskOfferingUuid : diskOfferingUuids) {
        if (diskOfferingCountMap.containsKey(diskOfferingUuid)) {
            diskOfferingCountMap.put(diskOfferingUuid, diskOfferingCountMap.get(diskOfferingUuid) + 1);
        } else {
            diskOfferingCountMap.put(diskOfferingUuid, 1L);
        }
    }
    for (String diskOfferingUuid : diskOfferingCountMap.keySet()) {
        sql = "select diskSize from DiskOfferingVO where uuid = :uuid";
        TypedQuery<Long> dq = dbf.getEntityManager().createQuery(sql, Long.class);
        dq.setParameter("uuid", diskOfferingUuid);
        Long dsize = dq.getSingleResult();
        dsize = dsize == null ? 0 : dsize;
        allVolumeSizeAsked += dsize * diskOfferingCountMap.get(diskOfferingUuid);
    }
```
🧩 Analysis chain
🏁 Script executed:
```shell
# First, let's read the actual file to confirm the code at lines 124-140
wc -l compute/src/main/java/org/zstack/compute/vm/VmQuotaSubManager.java
```

🏁 Script executed:
```shell
# Read the specific lines mentioned in the review
sed -n '124,140p' compute/src/main/java/org/zstack/compute/vm/VmQuotaSubManager.java
```

🏁 Script executed:
```shell
# Search for TypedQuery usage in this file
rg -n 'TypedQuery' compute/src/main/java/org/zstack/compute/vm/VmQuotaSubManager.java
```

🏁 Script executed:
```shell
# Search for batch query patterns in the codebase to understand recommended approaches
rg -l 'findByUuids|listByUuids' --type java compute/src/ storage/src/ | head -5
```

🏁 Script executed:
```shell
# Check how other parts of the codebase handle similar batch DiskOfferingVO queries
rg -n 'DiskOfferingVO' compute/src/main/java/org/zstack/compute/vm/VmQuotaSubManager.java -B2 -A2
```

🏁 Script executed:
```shell
# Check what query methods are available through dbf for batch operations
rg -n 'Q\.' compute/src/main/java/org/zstack/compute/vm/VmQuotaSubManager.java -A1 | head -20
```

🏁 Script executed:
```shell
# Check how DiskOfferingVO is queried elsewhere in the codebase using recommended methods
rg -n 'class\s+Q\b|Q\.select|Q\.findAll' compute/src/main/java/org/zstack/compute/vm/ -A2 | head -30
```

🏁 Script executed:
```shell
# Look at the dbf usage patterns in this file to understand recommended query methods
rg -n 'dbf\.' compute/src/main/java/org/zstack/compute/vm/VmQuotaSubManager.java | head -20
```

🏁 Script executed:
```shell
# Check if there's a Q API method for batch queries by DiskOfferingVO
rg -n 'listByUuids|findByUuids|findByUuidsAndLoadingRequired' storage/src/ compute/src/ --type java -l | head -5
```

🏁 Script executed:
```shell
# Look at the full context around the problematic code to understand better
sed -n '100,180p' compute/src/main/java/org/zstack/compute/vm/VmQuotaSubManager.java
```

🏁 Script executed:
```shell
# Check how to properly query multiple records by UUID using the Q API
rg -n 'Q\.New.*\.in\(|listByUuids|getByUuids' compute/src/ --type java -A3 | head -40
```

🏁 Script executed:
```shell
# Look for patterns of querying DiskOfferingVO by multiple UUIDs
rg -n 'DiskOfferingVO_\.' compute/src/main/java/org/zstack/compute/vm/ -B2 -A2
```

🏁 Script executed:
```shell
# Check if there are any Q.New examples that query by collections/lists
rg -n 'Q\.New.*in\(' compute/src/ --type java -A2 | head -30
```
Use the Q API to batch the query and avoid N+1 queries inside the loop
The loop at lines 133-140 iterates diskOfferingCountMap.keySet() and queries DiskOfferingVO one row at a time, violating the rule against queries inside loops. It also uses TypedQuery where the recommended Q API should be used.
Fetch the diskSize for every diskOfferingUuid in one query:

```java
List<Long> diskSizes = Q.New(DiskOfferingVO.class)
        .in(DiskOfferingVO_.uuid, diskOfferingCountMap.keySet())
        .select(DiskOfferingVO_.diskSize)
        .listValues();
```

Then aggregate locally to compute the total size, avoiding per-row queries in the loop.
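The count-then-batch aggregation can be sketched without a database; `sizeByUuid` below plays the role of the single batched query result (the offering names are hypothetical):

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class QuotaSizing {
    // Count occurrences of each offering uuid, then compute the total size
    // from a single batch of sizes instead of one query per uuid.
    public static long totalAsked(List<String> offeringUuids, Map<String, Long> sizeByUuid) {
        Map<String, Long> counts = new HashMap<>();
        for (String uuid : offeringUuids) {
            counts.merge(uuid, 1L, Long::sum); // replaces the containsKey/put dance
        }
        long total = 0;
        for (Map.Entry<String, Long> e : counts.entrySet()) {
            // sizeByUuid stands in for one batched "uuid in (:uuids)" query result
            total += sizeByUuid.getOrDefault(e.getKey(), 0L) * e.getValue();
        }
        return total;
    }

    public static void main(String[] args) {
        Map<String, Long> sizes = new HashMap<>();
        sizes.put("small", 10L);
        sizes.put("big", 100L);
        // prints 120 (two "small" at 10 each, one "big" at 100)
        System.out.println(totalAsked(Arrays.asList("small", "big", "small"), sizes));
    }
}
```

Note that for the real fix the batch query must keep the uuid-to-size correspondence (e.g. select both columns), since the counts are keyed by uuid.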
```java
    }

    return allVolumeSizeAsked;
}).addCheckCondition((msg) -> !msg.getStrategy().equals(VmCreationStrategy.JustCreate.toString())));
```
🧩 Analysis chain
🏁 Script executed:
```shell
#!/bin/bash
# find where the strategy field is defined in the message classes
echo "=== Searching for APICreateVmInstanceMsg or related message classes ==="
fd -e java | xargs grep -l "getStrategy\|strategy" | grep -i "msg\|message" | head -20
```

🏁 Script executed:
```shell
#!/bin/bash
# locate VmQuotaSubManager.java and view its full contents in context
echo "=== Locating VmQuotaSubManager.java ==="
fd 'VmQuotaSubManager.java'
```

🏁 Script executed:
```shell
#!/bin/bash
# find the VmCreationStrategy definition
echo "=== Locating VmCreationStrategy ==="
fd 'VmCreationStrategy.java'
```

🏁 Script executed:
```shell
#!/bin/bash
# read VmQuotaSubManager.java in full
echo "=== VmQuotaSubManager.java contents ==="
cat -n compute/src/main/java/org/zstack/compute/vm/VmQuotaSubManager.java
```

🏁 Script executed:
```shell
#!/bin/bash
# inspect the strategy field in APICreateVmInstanceMsg.java
echo "=== strategy-related definitions in APICreateVmInstanceMsg.java ==="
cat -n header/src/main/java/org/zstack/header/vm/APICreateVmInstanceMsg.java | grep -A 5 -B 5 -i "strategy"
```

🏁 Script executed:
```shell
#!/bin/bash
# inspect the strategy field in CreateVmInstanceMsg.java
echo "=== strategy-related definitions in CreateVmInstanceMsg.java ==="
cat -n header/src/main/java/org/zstack/header/vm/CreateVmInstanceMsg.java | grep -A 5 -B 5 -i "strategy"
```

🏁 Script executed:
```shell
#!/bin/bash
# read the full VmCreationStrategy definition
echo "=== VmCreationStrategy.java contents ==="
cat -n header/src/main/java/org/zstack/header/vm/VmCreationStrategy.java
```

🏁 Script executed:
```shell
#!/bin/bash
# verify whether strategy can end up null anywhere in the message chain
echo "=== Searching for setStrategy call sites ==="
rg -n "setStrategy|\.strategy\s*=" --type java | head -30
```

🏁 Script executed:
```shell
#!/bin/bash
# inspect how APICreateVmInstanceMsg is created and initialized
echo "=== Searching for APICreateVmInstanceMsg construction ==="
rg -n "new APICreateVmInstanceMsg|APICreateVmInstanceMsg()" --type java | head -20
```

🏁 Script executed:
```shell
#!/bin/bash
# check QuotaMessageHandler's addCheckCondition to understand its usage
echo "=== Locating QuotaMessageHandler and addCheckCondition ==="
fd 'QuotaMessageHandler.java'
```

🏁 Script executed:
```shell
#!/bin/bash
# read the addCheckCondition definition in QuotaMessageHandler
echo "=== addCheckCondition in QuotaMessageHandler.java ==="
cat -n header/src/main/java/org/zstack/header/identity/quota/QuotaMessageHandler.java | head -100
```

🏁 Script executed:
```shell
#!/bin/bash
# look for other addCheckCondition call sites
echo "=== Other addCheckCondition usages ==="
rg -n "addCheckCondition" --type java -A 2 | head -50
```
Strategy comparison lacks a null guard and can NPE during the quota check
At line 144, calling msg.getStrategy().equals() throws if strategy is null, aborting the quota check. Use a null-safe comparison:
🔧 Suggested fix
```diff
-}).addCheckCondition((msg) -> !msg.getStrategy().equals(VmCreationStrategy.JustCreate.toString())));
+}).addCheckCondition((msg) -> !VmCreationStrategy.JustCreate.toString().equals(msg.getStrategy())));
```
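The null-safe comparison is a standard Java idiom; a self-contained illustration, where the JUST_CREATE constant stands in for VmCreationStrategy.JustCreate.toString():

```java
import java.util.Objects;

public class NullSafeCompare {
    static final String JUST_CREATE = "JustCreate";

    // NPE-prone form would be strategy.equals(JUST_CREATE): throws when strategy is null.
    // Constant-first equals never throws and returns false for null.
    public static boolean isJustCreate(String strategy) {
        return JUST_CREATE.equals(strategy);
    }

    // Objects.equals is the other null-safe option: handles null on either side.
    public static boolean isJustCreate2(String strategy) {
        return Objects.equals(strategy, JUST_CREATE);
    }

    public static void main(String[] args) {
        System.out.println(isJustCreate(null));          // false, no exception
        System.out.println(isJustCreate("JustCreate"));  // true
    }
}
```

Both forms match the suggested fix above; constant-first equals reads closest to the original lambda.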
Background
VmInstanceManagerImpl.java had accumulated 2,808 lines, a serious violation of the God Object rule (hard cap: 1,500 lines) and of docs/rules/anti-degradation.md Rule 1. Resolves: ZSTAC-82456
Changes
Split the main class into 6 single-responsibility SubManagers:
- VmCreationSubManager.java
- VmNicApiSubManager.java
- VmQuotaSubManager.java
- VmExtensionPointAdapter.java
- VmExpungeSubManager.java
- VmFlowChainRegistry.java
Result
- VmInstanceManagerImpl.java: 2808 → 809 lines (a 71% reduction)
- VmInstanceManager.xml
Verification
mvn compile -pl compute -am -Dmaven.test.skip → exit 0
sync from gitlab !9248