NVIDIA-554: DPU-host mode: use ConfigMap for OVN feature enablement instead of per-node script gating (#2907)
tsorya wants to merge 1 commit into openshift:master from
Conversation
@tsorya: This pull request references MGMT-21485, which is a valid Jira issue. Warning: the referenced Jira issue has an invalid target version for the branch this PR targets: expected the task to target the "4.22.0" version, but no target version was set.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.
Skipping CI for Draft Pull Request.
Note: Reviews paused. This branch appears to be under active development; to avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review.
Walkthrough: Removed per-node (DPU-host) mode documentation and runtime gating; consolidated feature enablement into cluster-wide ConfigMap flags and simplified ovnkube startup flag wiring across templates, scripts, and tests.
Sequence diagram:

```mermaid
sequenceDiagram
    participant Node as Node startup script (008-script-lib)
    participant KubeAPI as kube-apiserver / ConfigMap
    participant Control as ovnkube-control-plane
    rect rgba(200,230,255,0.5)
    Node->>KubeAPI: Read cluster ConfigMap (--config-file)
    KubeAPI-->>Node: ConfigMap with feature flags (enable-*)
    end
    rect rgba(200,255,200,0.5)
    Node->>Control: Start ovnkube with flags derived from ConfigMap
    Control-->>Node: ovnkube starts with cluster-wide flags
    end
```
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~25 minutes

Important: Pre-merge checks failed. Please resolve all errors before merging; addressing warnings is optional.

❌ Failed checks (1 error, 1 warning)
✅ Passed checks (3 passed)
[APPROVALNOTIFIER] This PR is NOT APPROVED. This pull request has been approved by: tsorya. The full list of commands accepted by this bot can be found here. Needs approval from an approver in each of these files. Approvers can indicate their approval by writing
Force-pushed from 79c2f28 to 9d8ca23.
@tsorya: This pull request references NVIDIA-554, which is a valid Jira issue. Warning: the referenced Jira issue has an invalid target version for the branch this PR targets: expected the task to target the "4.22.0" version, but no target version was set.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In `@bindata/network/ovn-kubernetes/managed/004-config.yaml`:
- Around line 48-54: Remove or balance the orphaned Go template directives:
delete the stray "{{- end }}" that does not close any "{{- if }}" near the block
that references ".OVN_MULTI_NETWORK_POLICY_ENABLE", and remove the two
consecutive orphaned "{{- end }}" occurrences later (or alternatively add the
appropriate matching "{{- if <condition> }}" blocks if conditional behavior was
intended) so that every "{{- end }}" has a corresponding opening "{{- if ... }}"
and the template parses correctly.
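For reference, the balanced shape every conditional block in the template should have looks like this (a sketch; the key shown mirrors the one named above, and the surrounding content is illustrative):

```yaml
# Each {{- if ... }} opens exactly one block that a single {{- end }} closes;
# a stray {{- end }} with no matching open makes the whole template unparsable.
{{- if .OVN_MULTI_NETWORK_POLICY_ENABLE }}
enable-multi-networkpolicy=true
{{- end }}
```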
🧹 Nitpick comments (1)
bindata/network/ovn-kubernetes/managed/ovnkube-control-plane.yaml (1)
196-198: Minor: redundant variable initialization. Line 196 initializes multi_network_enabled_flag to empty, and line 197 immediately sets it to "--enable-multi-network". The empty initialization on line 196 is unnecessary.

♻️ Suggested simplification

```diff
- multi_network_enabled_flag=
  multi_network_enabled_flag="--enable-multi-network"
```
Force-pushed from fc4f0a7 to f8367a3.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
bindata/network/ovn-kubernetes/managed/ovnkube-control-plane.yaml (1)
208-214: ⚠️ Potential issue | 🟠 Major: Remove unused variable definitions that are no longer passed to the ovnkube command.

The variables multi_network_policy_enabled_flag (lines 208-211) and admin_network_policy_enabled_flag (line 214) are computed but never used in the exec command (lines 217-239). These settings are now managed via the ConfigMap (004-config.yaml), making these shell variable definitions dead code. Remove them to improve clarity.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@bindata/network/ovn-kubernetes/managed/ovnkube-control-plane.yaml` around lines 208 - 214, Remove the dead shell variables that are computed but never used: delete the multi_network_policy_enabled_flag definition/block (the conditional setting using {{.OVN_MULTI_NETWORK_POLICY_ENABLE}} that assigns "--enable-multi-networkpolicy") and the admin_network_policy_enabled_flag assignment ("--enable-admin-network-policy"); these flags are not referenced by the ovnkube exec invocation so remove their declarations to clean up the script and rely on the ConfigMap-managed settings instead.
🧹 Nitpick comments (2)
bindata/network/ovn-kubernetes/self-hosted/ovnkube-control-plane.yaml (1)
141-143: Redundant empty initialization of network_segmentation_enabled_flag. Line 141 initializes network_segmentation_enabled_flag to empty, but it is unconditionally overwritten on line 143. The empty initialization can be removed.

```diff
- network_segmentation_enabled_flag=
  multi_network_enabled_flag="--enable-multi-network"
  network_segmentation_enabled_flag="--enable-network-segmentation"
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@bindata/network/ovn-kubernetes/self-hosted/ovnkube-control-plane.yaml` around lines 141 - 143, Remove the redundant empty initialization of the variable network_segmentation_enabled_flag (the standalone network_segmentation_enabled_flag= line); it is immediately overwritten by the later assignment network_segmentation_enabled_flag="--enable-network-segmentation", so delete the empty initialization to avoid confusion and dead code while leaving the later assignment intact.

bindata/network/ovn-kubernetes/managed/ovnkube-control-plane.yaml (1)
196-197: Redundant empty initialization on line 196. The variable multi_network_enabled_flag is initialized to empty and immediately overwritten on the next line. Consider removing the redundant empty assignment.

```diff
- multi_network_enabled_flag=
  multi_network_enabled_flag="--enable-multi-network"
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@bindata/network/ovn-kubernetes/managed/ovnkube-control-plane.yaml` around lines 196 - 197, Remove the redundant empty initialization of the variable multi_network_enabled_flag that immediately gets overwritten; keep only the meaningful assignment multi_network_enabled_flag="--enable-multi-network" (ensure no other code expects an earlier empty declaration before this assignment).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@bindata/network/ovn-kubernetes/common/008-script-lib.yaml`:
- Line 678: The hardcoded CLI flag `--enable-multicast` in the script lib should
be removed and multicast should be managed via the cluster-wide ConfigMap like
other feature flags; update the `[ovnkubernetesfeature]` section in
004-config.yaml to include `enable-multicast=true` (or `false` as appropriate)
and remove the `--enable-multicast` token from the command construction in
008-script-lib.yaml, or if multicast truly requires an exception, add a clear
comment in 008-script-lib.yaml documenting why it cannot be moved to the
ConfigMap and reference the `[ovnkubernetesfeature]` design decision.
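Sketched as config, the ConfigMap-managed form this comment describes would sit in the rendered ovnkube.conf roughly like this (an illustrative fragment; the exact key set should be verified against the real `[ovnkubernetesfeature]` section):

```ini
# Illustrative fragment of the rendered ovnkube.conf feature section;
# enable-multicast joins the other enable-* keys the review discusses.
[ovnkubernetesfeature]
enable-multicast=true
enable-multi-networkpolicy=true
enable-admin-network-policy=true
```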
---
Outside diff comments:
In `@bindata/network/ovn-kubernetes/managed/ovnkube-control-plane.yaml`:
- Around line 208-214: Remove the dead shell variables that are computed but
never used: delete the multi_network_policy_enabled_flag definition/block (the
conditional setting using {{.OVN_MULTI_NETWORK_POLICY_ENABLE}} that assigns
"--enable-multi-networkpolicy") and the admin_network_policy_enabled_flag
assignment ("--enable-admin-network-policy"); these flags are not referenced by
the ovnkube exec invocation so remove their declarations to clean up the script
and rely on the ConfigMap-managed settings instead.
---
Nitpick comments:
In `@bindata/network/ovn-kubernetes/managed/ovnkube-control-plane.yaml`:
- Around line 196-197: Remove the redundant empty initialization of the variable
multi_network_enabled_flag that immediately gets overwritten; keep only the
meaningful assignment multi_network_enabled_flag="--enable-multi-network"
(ensure no other code expects an earlier empty declaration before this
assignment).
In `@bindata/network/ovn-kubernetes/self-hosted/ovnkube-control-plane.yaml`:
- Around line 141-143: Remove the redundant empty initialization of the variable
network_segmentation_enabled_flag (the standalone
network_segmentation_enabled_flag= line); it is immediately overwritten by the
later assignment
network_segmentation_enabled_flag="--enable-network-segmentation", so delete the
empty initialization to avoid confusion and dead code while leaving the later
assignment intact.
Force-pushed from f8367a3 to 1f801f7.
Actionable comments posted: 4
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
bindata/network/ovn-kubernetes/self-hosted/ovnkube-control-plane.yaml (1)
141-144: ⚠️ Potential issue | 🟠 Major: Hardcoded --enable-multi-network flag overrides ConfigMap settings and breaks DisableMultiNetwork behavior.

The unconditional multi_network_enabled_flag="--enable-multi-network" (line 142) is passed to ovnkube at line 192. Since ovn-kubernetes CLI flags take precedence over config-file settings, this hardcoded flag will force multi-network on regardless of whether .OVN_MULTI_NETWORK_ENABLE is false or omitted in the ConfigMap. The 004-config.yaml correctly makes enable-multi-network=true conditional on .OVN_MULTI_NETWORK_ENABLE, but the CLI flag bypasses that control.

The same applies to network_segmentation_enabled_flag (line 143) and admin_network_policy_enabled_flag (line 164), which are similarly hardcoded while other feature flags follow a conditional pattern (e.g., route_advertisements_enable_flag, evpn_enable_flag). Make all three flags conditional to respect ConfigMap configuration:

Fix for multi_network_enabled_flag

```diff
- multi_network_enabled_flag="--enable-multi-network"
+ multi_network_enabled_flag=
+ if [[ "{{.OVN_MULTI_NETWORK_ENABLE}}" == "true" ]]; then
+   multi_network_enabled_flag="--enable-multi-network"
+ fi
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@bindata/network/ovn-kubernetes/self-hosted/ovnkube-control-plane.yaml` around lines 141 - 144, The template hardcodes CLI flags that override ConfigMap settings: change multi_network_enabled_flag, network_segmentation_enabled_flag, and admin_network_policy_enabled_flag so they are set conditionally (like route_advertisements_enable_flag / evpn_enable_flag) based on the corresponding template variables (e.g., .OVN_MULTI_NETWORK_ENABLE, .OVN_NETWORK_SEGMENTATION_ENABLE, .OVN_ADMIN_NETWORK_POLICY_ENABLE) instead of unconditionally assigning "--enable-..." so the ovnkube command-line assembly does not force features on and respects the ConfigMap values.
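The conditional pattern this prompt describes can be sketched as a small shell function (hypothetical name; the "true"/"false" argument stands in for the rendered {{.OVN_MULTI_NETWORK_ENABLE}} value):

```shell
# Emit the CLI flag only when the rendered template value is "true";
# otherwise expand to nothing so the config file stays authoritative.
build_multi_network_flag() {
  if [ "$1" = "true" ]; then
    printf '%s' "--enable-multi-network"
  fi
}

echo "enabled:  '$(build_multi_network_flag true)'"   # enabled:  '--enable-multi-network'
echo "disabled: '$(build_multi_network_flag false)'"  # disabled: ''
```

An empty flag variable then disappears from the unquoted exec line, so the option is only ever passed when the ConfigMap render value enables it.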
🧹 Nitpick comments (4)
bindata/network/ovn-kubernetes/common/008-script-lib.yaml (1)
536-550: OVN_NODE_MODE handling is now cleanly limited to structural differences; good. The dpu-host branch only tweaks gateway_interface, sets --ovnkube-node-mode dpu-host, and disables --init-ovnkube-controller, which matches the PR objective.

Small hardening: initialize ovnkube_node_mode="" before the mode check for readability and to make a future set -u safer.

Optional cleanup

```diff
  init_ovnkube_controller="--init-ovnkube-controller ${K8S_NODE}"
  gateway_interface="br-ex"
+ ovnkube_node_mode=""
  OVN_NODE_MODE=${OVN_NODE_MODE:-full}
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@bindata/network/ovn-kubernetes/common/008-script-lib.yaml` around lines 536 - 550, Initialize the ovnkube_node_mode variable before the OVN_NODE_MODE check to avoid unset-variable issues: add ovnkube_node_mode="" above the if that inspects OVN_NODE_MODE so the script always defines ovnkube_node_mode (it is later set to "--ovnkube-node-mode dpu-host" inside the dpu-host branch), and keep the existing handling of init_ovnkube_controller and gateway_interface as-is.

bindata/network/ovn-kubernetes/self-hosted/ovnkube-control-plane.yaml (1)
155-165: Remove now-dead flag variables to avoid misleading future edits.

preconfigured_udn_addresses_enable_flag, multi_network_policy_enabled_flag, and admin_network_policy_enabled_flag are still defined but no longer used in the exec /usr/bin/ovnkube ... invocation. Keeping them around makes it easy to assume they're still wired.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@bindata/network/ovn-kubernetes/self-hosted/ovnkube-control-plane.yaml` around lines 155 - 165, Remove the now-unused flag variables to avoid confusion: delete the preconfigured_udn_addresses_enable_flag assignment, the multi_network_policy_enabled_flag variable and its if block (which sets --enable-multi-networkpolicy), and the admin_network_policy_enabled_flag declaration from ovnkube-control-plane.yaml; ensure you also remove any related empty assignments or conditional blocks around those symbols so only flags actually passed to the exec /usr/bin/ovnkube invocation remain.

docs/ovn_node_mode.md (1)
7-15: Mode behavior table looks correct; minor: clarify that derive-from-mgmt-port is a literal token consumed by the script.

Since the doc quotes gateway_interface="derive-from-mgmt-port", it'd be good to note it's a script-level sentinel (not an actual Linux interface name) to avoid confusion for operators grepping ip link.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@docs/ovn_node_mode.md` around lines 7 - 15, Clarify that the value gateway_interface="derive-from-mgmt-port" is a literal sentinel token consumed by the provisioning script (not an actual Linux interface name); update the docs entry for the `dpu-host` mode to state that `derive-from-mgmt-port` is parsed by the script/initializer (e.g., the code path that handles `gateway_interface`) and should not be interpreted as an interface to grep with `ip link` so operators aren't confused.

pkg/network/ovn_kubernetes_test.go (1)
259-272: Add a test assertion for control-plane container args when DisableMultiNetwork=true.

The current TestRenderedOVNKubernetesConfig test checks only the rendered config file, which includes enable-multi-network=true in all cases. However, there are test cases with disableMultiNet=true (lines 766, 909) where the control-plane Deployment's container args should not include --enable-multi-network if the setting is properly respected.

Without an assertion on the actual rendered Deployment's container command/args, a regression in which CLI arguments override the config file setting would go undetected. Use findInObjs to retrieve the control-plane Deployment and uns.NestedSlice to inspect the container args, similar to the pattern in checkOVNKubernetesPostStart (line 1090).

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pkg/network/ovn_kubernetes_test.go` around lines 259 - 272, TestRenderedOVNKubernetesConfig currently only asserts the rendered config file but must also assert the control-plane Deployment container args when disableMultiNet=true; locate the control-plane Deployment in TestRenderedOVNKubernetesConfig using findInObjs (same pattern as checkOVNKubernetesPostStart), extract the containers' args with uns.NestedSlice (or nested string slice for containers[0].args) and add assertions that when a testcase has disableMultiNet == true the args do NOT contain "--enable-multi-network" (and conversely that other cases include it), failing the test if the flag is present incorrectly.
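The membership check this prompt calls for amounts to scanning the rendered container args for the flag; a minimal shell sketch (the sample args are illustrative stand-ins, not the real Deployment's):

```shell
# Return success if the exact flag appears among the given args.
args_contain() {
  needle="$1"; shift
  for arg in "$@"; do
    [ "$arg" = "$needle" ] && return 0
  done
  return 1
}

# Stand-in for containers[0].args rendered with DisableMultiNetwork=true.
set -- "--config-file=/run/ovnkube-config/ovnkube.conf" "--metrics-bind-address=127.0.0.1"
if args_contain "--enable-multi-network" "$@"; then
  echo "FAIL: flag present despite DisableMultiNetwork=true"
else
  echo "OK: flag absent"
fi
```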
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@bindata/network/ovn-kubernetes/managed/ovnkube-control-plane.yaml`:
- Around line 195-199: The control-plane manifest is unconditionally setting
multi_network_enabled_flag and network_segmentation_enabled_flag which forces
CLI flags regardless of the ConfigMap render value; update the template in
ovnkube-control-plane.yaml so multi_network_enabled_flag and
network_segmentation_enabled_flag are only populated when the same render value
(OVN_MULTI_NETWORK_ENABLE / DisableMultiNetwork) used by 004-config.yaml is
true, or remove those variables entirely and rely on the config file;
specifically, change the template logic that defines multi_network_enabled_flag
and network_segmentation_enabled_flag to conditionally set
"--enable-multi-network" and "--enable-network-segmentation" (or an empty
string) and ensure the ovnkube invocation only includes those flags when
non-empty.
In `@bindata/network/ovn-kubernetes/self-hosted/004-config.yaml`:
- Around line 46-49: The self-hosted template condition for multi-network is
inverted vs managed and the CLI flag multi_network_enabled_flag is always
appended to the ovnkube command, so disable via DisableMultiNetwork is
ineffective; fix by (1) making the .OVN_MULTI_NETWORK_ENABLE conditional
consistent across managed and self-hosted (use the same if/not expression as
managed) for the enable-multi-network=true block, and (2) only define and append
multi_network_enabled_flag="--enable-multi-network" to the ovnkube invocation
when .OVN_MULTI_NETWORK_ENABLE is true (mirror how OVN_EVPN_ENABLE optional
flags are handled) so the command-line flag is not unconditionally overriding
the config file.
- Around line 35-42: The four egress flags (enable-egress-ip,
enable-egress-firewall, enable-egress-qos, enable-egress-service) are hardcoded
in the OVN templates; change each to use template variables (e.g., {{- if
.EnableEgressIP }}enable-egress-ip=true{{- end }}, etc.) in both the managed and
self-hosted 004-config.yaml templates and preserve the existing
ReachabilityTotalTimeoutSeconds/ReachabilityNodePort pattern for optional
rendering; then expose those bool fields from the data struct used to render
these templates in the Go rendering code (add .EnableEgressIP,
.EnableEgressFirewall, .EnableEgressQoS, .EnableEgressService to the template
data provider so callers/operator configuration can control them).
- Around line 54-57: Unconditionally enabling enable-admin-network-policy,
enable-multicast, and enable-multi-external-gateway can break controller startup
if required CRDs or APIs are absent; either wrap each flag in template guards
(e.g., add {{- if .EnableAdminNetworkPolicy }} ... {{- end }} for
enable-admin-network-policy and similarly for enable-multicast and
enable-multi-external-gateway) so they only appear when explicitly enabled, or
add clear repo documentation that the operator must pre-install the
adminnetworkpolicies / baselineadminnetworkpolicies CRDs and the multicast API
before applying this ConfigMap; also verify whether
enable-multi-external-gateway is a custom flag or a misnamed option (refer to
enable-multi-networkpolicy and AdminPolicyBasedExternalRoute for the expected
external gateway configuration) and update the flag name or docs accordingly.
---
Outside diff comments:
In `@bindata/network/ovn-kubernetes/self-hosted/ovnkube-control-plane.yaml`:
- Around line 141-144: The template hardcodes CLI flags that override ConfigMap
settings: change multi_network_enabled_flag, network_segmentation_enabled_flag,
and admin_network_policy_enabled_flag so they are set conditionally (like
route_advertisements_enable_flag / evpn_enable_flag) based on the corresponding
template variables (e.g., .OVN_MULTI_NETWORK_ENABLE,
.OVN_NETWORK_SEGMENTATION_ENABLE, .OVN_ADMIN_NETWORK_POLICY_ENABLE) instead of
unconditionally assigning "--enable-..." so the ovnkube command-line assembly
does not force features on and respects the ConfigMap values.
---
Nitpick comments:
In `@bindata/network/ovn-kubernetes/common/008-script-lib.yaml`:
- Around line 536-550: Initialize the ovnkube_node_mode variable before the
OVN_NODE_MODE check to avoid unset-variable issues: add ovnkube_node_mode=""
above the if that inspects OVN_NODE_MODE so the script always defines
ovnkube_node_mode (it is later set to "--ovnkube-node-mode dpu-host" inside the
dpu-host branch), and keep the existing handling of init_ovnkube_controller and
gateway_interface as-is.
In `@bindata/network/ovn-kubernetes/self-hosted/ovnkube-control-plane.yaml`:
- Around line 155-165: Remove the now-unused flag variables to avoid confusion:
delete the preconfigured_udn_addresses_enable_flag assignment, the
multi_network_policy_enabled_flag variable and its if block (which sets
--enable-multi-networkpolicy), and the admin_network_policy_enabled_flag
declaration from ovnkube-control-plane.yaml; ensure you also remove any related
empty assignments or conditional blocks around those symbols so only flags
actually passed to the exec /usr/bin/ovnkube invocation remain.
In `@docs/ovn_node_mode.md`:
- Around line 7-15: Clarify that the value
gateway_interface="derive-from-mgmt-port" is a literal sentinel token consumed
by the provisioning script (not an actual Linux interface name); update the docs
entry for the `dpu-host` mode to state that `derive-from-mgmt-port` is parsed by
the script/initializer (e.g., the code path that handles `gateway_interface`)
and should not be interpreted as an interface to grep with `ip link` so
operators aren’t confused.
In `@pkg/network/ovn_kubernetes_test.go`:
- Around line 259-272: TestRenderedOVNKubernetesConfig currently only asserts
the rendered config file but must also assert the control-plane Deployment
container args when disableMultiNet=true; locate the control-plane Deployment in
TestRenderedOVNKubernetesConfig using findInObjs (same pattern as
checkOVNKubernetesPostStart), extract the containers' args with uns.NestedSlice
(or nested string slice for containers[0].args) and add assertions that when a
testcase has disableMultiNet == true the args do NOT contain
"--enable-multi-network" (and conversely that other cases include it), failing
the test if the flag is present incorrectly.
Force-pushed from 1f801f7 to 4c08f43.
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
bindata/network/ovn-kubernetes/self-hosted/ovnkube-control-plane.yaml (1)
139-196: ⚠️ Potential issue | 🟠 Major: Remove unconditional --enable-* flags for multi-network and network-segmentation.

multi_network_enabled_flag and network_segmentation_enabled_flag are hardcoded to --enable-* and always passed to the ovnkube command. In ovn-kubernetes, CLI flags take precedence over config-file values; these unconditional flags will force these features ON even if the ConfigMap (004-config.yaml) explicitly disables them, defeating the intent to make the ConfigMap the source of truth for feature configuration.

Suggested change

```diff
- multi_network_enabled_flag="--enable-multi-network"
- network_segmentation_enabled_flag="--enable-network-segmentation"
  ...
- ${multi_network_enabled_flag} \
- ${network_segmentation_enabled_flag} \
```

Remove these flags entirely and rely on the config file passed via --config-file=/run/ovnkube-config/ovnkube.conf, or conditionally set them using the same render value that drives 004-config.yaml to keep them in sync.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@bindata/network/ovn-kubernetes/self-hosted/ovnkube-control-plane.yaml` around lines 139 - 196, The manifest unconditionally sets multi_network_enabled_flag="--enable-multi-network" and network_segmentation_enabled_flag="--enable-network-segmentation" and always passes them to the ovnkube exec, which overrides the config file; remove those unconditional assignments and stop passing those variables to the ovnkube command (or instead set them conditionally using the same template variables used to render 004-config.yaml, e.g. only set multi_network_enabled_flag when "{{.OVN_MULTI_NETWORK_ENABLE}}" is true and network_segmentation_enabled_flag when "{{.OVN_NETWORK_SEGMENTATION_ENABLE}}" is true) so the --config-file=/run/ovnkube-config/ovnkube.conf remains the source of truth.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Outside diff comments:
In `@bindata/network/ovn-kubernetes/self-hosted/ovnkube-control-plane.yaml`:
- Around line 139-196: The manifest unconditionally sets
multi_network_enabled_flag="--enable-multi-network" and
network_segmentation_enabled_flag="--enable-network-segmentation" and always
passes them to the ovnkube exec, which overrides the config file; remove those
unconditional assignments and stop passing those variables to the ovnkube
command (or instead set them conditionally using the same template variables
used to render 004-config.yaml, e.g. only set multi_network_enabled_flag when
"{{.OVN_MULTI_NETWORK_ENABLE}}" is true and network_segmentation_enabled_flag
when "{{.OVN_NETWORK_SEGMENTATION_ENABLE}}" is true) so the
--config-file=/run/ovnkube-config/ovnkube.conf remains the source of truth.
---
Duplicate comments:
In `@bindata/network/ovn-kubernetes/managed/ovnkube-control-plane.yaml`:
- Around line 193-244: The template unconditionally sets
multi_network_enabled_flag and network_segmentation_enabled_flag to
"--enable-multi-network" and "--enable-network-segmentation", which forces CLI
enablement; remove those unconditional assignments and instead initialize both
flags empty and add conditional if-blocks (like route_advertisements_enable_flag
and evpn_enable_flag) that set
multi_network_enabled_flag="--enable-multi-network" when
"{{.OVN_MULTI_NETWORK_ENABLE}}" == "true" and
network_segmentation_enabled_flag="--enable-network-segmentation" when
"{{.OVN_NETWORK_SEGMENTATION_ENABLE}}" == "true", so the exec of
/usr/bin/ovnkube only includes those flags when the corresponding templated
variables are true.
In `@bindata/network/ovn-kubernetes/self-hosted/004-config.yaml`:
- Around line 52-54: The three feature flags enable-admin-network-policy,
enable-multi-external-gateway, and enable-multicast are turned on
unconditionally; update the template so each flag is either gated by a
corresponding conditional/template switch (e.g., expose variables like
.Values.enableAdminNetworkPolicy, .Values.enableMultiExternalGateway,
.Values.enableMulticast and use those to set each config key) or add inline
comments/documentation in the template explaining platform prerequisites and
required setup steps for each feature (admin-network-policy,
multi-external-gateway, multicast) to prevent rollout surprises.
- Around line 36-39: The egress feature flags (enable-egress-ip,
enable-egress-firewall, enable-egress-qos, enable-egress-service) are hard-coded
to true; make them configurable by replacing the literal true values with
template-driven variables (with sensible defaults) so footprints can
enable/disable each feature, and update any related values.yaml/README to
document the new flags; locate the keys
enable-egress-ip/enable-egress-firewall/enable-egress-qos/enable-egress-service
in the 004-config.yaml and swap to template variables and defaults used
elsewhere in the chart/manifest.
Force-pushed from 4c08f43 to 12a2193.
🧹 Nitpick comments (2)
bindata/network/ovn-kubernetes/self-hosted/ovnkube-control-plane.yaml (1)
155-165: Dead code: three variable assignments are never consumed by the ovnkube exec call.

preconfigured_udn_addresses_enable_flag, multi_network_policy_enabled_flag (and its surrounding conditional), and admin_network_policy_enabled_flag are all assigned here but none appear in the exec /usr/bin/ovnkube invocation at lines 176-196. These features are now handled by the ConfigMap (enable-preconfigured-udn-addresses, enable-multi-networkpolicy, enable-admin-network-policy), making these assignments vestigial. They should be removed.

♻️ Proposed cleanup

```diff
- preconfigured_udn_addresses_enable_flag="--enable-preconfigured-udn-addresses"
-
- # Enable multi-network policy if configured (control-plane always full mode)
- multi_network_policy_enabled_flag=
- if [[ "{{.OVN_MULTI_NETWORK_POLICY_ENABLE}}" == "true" ]]; then
-   multi_network_policy_enabled_flag="--enable-multi-networkpolicy"
- fi
-
- # Enable admin network policy if configured (control-plane always full mode)
- admin_network_policy_enabled_flag="--enable-admin-network-policy"
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@bindata/network/ovn-kubernetes/self-hosted/ovnkube-control-plane.yaml` around lines 155 - 165, Remove the dead flag variables and conditional that are never used by the ovnkube invocation: delete the assignments to preconfigured_udn_addresses_enable_flag, admin_network_policy_enabled_flag, and the multi_network_policy_enabled_flag block (including the if [[ "{{.OVN_MULTI_NETWORK_POLICY_ENABLE}}" == "true" ]] ... fi). These flags are not passed into the exec /usr/bin/ovnkube call (ovnkube invocation) and are now configured via the ConfigMap, so remove the unused variables and conditional to clean up vestigial code.

bindata/network/ovn-kubernetes/common/008-script-lib.yaml (1)
**536-550:** `ovnkube_node_mode` is used without initialization for the non-dpu-host path.

`init_ovnkube_controller` (line 536) and `ovn_advertised_udn_isolation_mode_flag` (line 505) are both explicitly reset before the conditional block. `ovnkube_node_mode` follows a different pattern — it is only assigned inside the `dpu-host` branch and relies on bash's implicit empty-string expansion for all other modes. Since `set -u` is not active this is not a runtime issue, but it is inconsistent with the surrounding code's defensive initialization style.

♻️ Suggested defensive initialization

```diff
           init_ovnkube_controller="--init-ovnkube-controller ${K8S_NODE}"
           gateway_interface="br-ex"
+          ovnkube_node_mode=
           OVN_NODE_MODE=${OVN_NODE_MODE:-full}
```
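For illustration, a minimal standalone sketch of the pattern this comment suggests (the node name fallback and the DPU uplink interface name are hypothetical; the real script takes `${K8S_NODE}` from the environment):

```shell
# Simplified sketch of the defensive-initialization pattern: set every
# mode-dependent variable up front so it exists for all OVN_NODE_MODE values,
# then override only in the dpu-host branch.
init_ovnkube_controller="--init-ovnkube-controller ${K8S_NODE:-node1}"
gateway_interface="br-ex"
ovnkube_node_mode=                      # defensive default: empty unless dpu-host
OVN_NODE_MODE=${OVN_NODE_MODE:-full}

if [ "${OVN_NODE_MODE}" = "dpu-host" ]; then
  ovnkube_node_mode="--ovnkube-node-mode dpu-host"
  gateway_interface="ens1f0"            # hypothetical DPU-host uplink name
fi

echo "mode=${OVN_NODE_MODE} node_mode_flag='${ovnkube_node_mode}' gw=${gateway_interface}"
```

With the default (`full`) mode the flag stays empty but defined, which is exactly what the later `ovnkube` invocation expects.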
🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `bindata/network/ovn-kubernetes/common/008-script-lib.yaml` around lines 536-550: the variable `ovnkube_node_mode` is only set inside the dpu-host branch, which is inconsistent with the defensive initialization used for `init_ovnkube_controller` and `ovn_advertised_udn_isolation_mode_flag`. Initialize `ovnkube_node_mode` to an empty string before the `OVN_NODE_MODE` conditional (alongside the `init_ovnkube_controller` and `gateway_interface` assignments) so it always exists and is predictable for non-dpu-host paths, and keep the existing assignment inside the `if [ "${OVN_NODE_MODE}" == "dpu-host" ]` branch to override it when needed.
ℹ️ Review info
Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml
Review profile: CHILL
Plan: Pro
Cache: Disabled due to data retention organization setting
Knowledge base: Disabled due to data retention organization setting
📒 Files selected for processing (10)
- README.md
- bindata/network/ovn-kubernetes/common/008-script-lib.yaml
- bindata/network/ovn-kubernetes/managed/004-config.yaml
- bindata/network/ovn-kubernetes/managed/ovnkube-control-plane.yaml
- bindata/network/ovn-kubernetes/self-hosted/004-config.yaml
- bindata/network/ovn-kubernetes/self-hosted/ovnkube-control-plane.yaml
- docs/architecture.md
- docs/operands.md
- docs/ovn_node_mode.md
- pkg/network/ovn_kubernetes_test.go
💤 Files with no reviewable changes (3)
- docs/architecture.md
- README.md
- docs/operands.md
🚧 Files skipped from review as they are similar to previous changes (2)
- bindata/network/ovn-kubernetes/managed/ovnkube-control-plane.yaml
- bindata/network/ovn-kubernetes/self-hosted/004-config.yaml
/retest-required

3 similar comments

/retest-required

/retest-required

/retest-required

/retest
dave-tucker
left a comment
took a quick pass over this and had some concerns.
the question for other reviewers more familiar with this repo is why we seem to be mixing passing config via both command-line flags and via config map? it looks like it would be cleaner to do this via cm.
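On the flags-versus-ConfigMap question: the usual convention (a toy sketch, not ovn-kubernetes' actual parser) is that `--config-file` supplies the baseline values and an explicit command-line flag, when present, wins:

```shell
# Toy precedence sketch: the config-file value is the baseline; an explicit
# CLI flag overrides it. All names here are illustrative.
config_value="true"                     # pretend this was read from --config-file
cli_flag=""                             # e.g. "--enable-multicast=false" if passed

effective="${config_value}"
if [ -n "${cli_flag}" ]; then
  effective="${cli_flag#*=}"            # keep only the value after "="
fi

echo "effective=${effective}"
```

Under that convention, passing the same feature both ways is redundant at best and confusing at worst, which supports doing it once via the ConfigMap.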
```
{{- if .ReachabilityNodePort }}
egressip-node-healthcheck-port={{.ReachabilityNodePort}}
{{- end }}
{{- if not .OVN_MULTI_NETWORK_ENABLE }}
```
did you mean to remove this guard?
This is a mistake, adding it back.
```
{{- if .ReachabilityNodePort }}
egressip-node-healthcheck-port={{.ReachabilityNodePort}}
{{- end }}
{{- if .OVN_MULTI_NETWORK_ENABLE }}
```
```
# This is needed so that converting clusters from GA to TP
# will rollout control plane pods as well
network_segmentation_enabled_flag=
multi_network_enabled_flag=
```
this doesn't appear to do anything
I believe it should remain even here, as it is part of https://github.com/openshift/cluster-network-operator/pull/2907/changes#diff-7f341b8bb5f1e78a51b286a4e2c35e8e998e18d9e6aabb9b1ab7ad3fb1032722L124
```
${gateway_mode_flags} \
${route_advertisements_enable_flag} \
${evpn_enable_flag} \
${preconfigured_udn_addresses_enable_flag} \
```
no preconfigured_udn_addresses for anyone?
It is part of the ConfigMap (https://github.com/openshift/cluster-network-operator/blob/master/bindata/network/ovn-kubernetes/managed/004-config.yaml#L43). There is no point in having it in both places, especially since the ConfigMap already provides it.
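As a sketch of what this reply describes (the key name comes from the linked 004-config.yaml; the section name and the `awk` parsing are illustrative stand-ins for ovnkube's real `--config-file` handling):

```shell
# Write a minimal config fragment like the one the ConfigMap mounts, then
# read the feature key back the way a sanity check might.
config_file="$(mktemp)"
cat > "${config_file}" <<'EOF'
[ovnkubernetesfeature]
enable-preconfigured-udn-addresses=true
EOF

feature_enabled=$(awk -F= '/^enable-preconfigured-udn-addresses=/ {print $2}' "${config_file}")
if [ "${feature_enabled}" = "true" ]; then
  echo "preconfigured UDN addresses enabled via --config-file"
fi
rm -f "${config_file}"
```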
@tsorya: The following tests failed, say
Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.
NVIDIA-554: DPU-host mode: use ConfigMap for OVN feature enablement instead of per-node script gating
This reverts commit c17dba5 (MGMT-21485) which moved feature
enablement from the ConfigMap into per-node script logic gated
by OVN_NODE_MODE.
Feature flags (egress IP, multicast, multi-network, network
segmentation, admin network policy, multi-external-gateway, etc.)
are managed via the cluster-wide ConfigMap (004-config.yaml) passed
to ovnkube through --config-file. They do not need per-node gating
in the startup script.
OVN_NODE_MODE remains used only for DPU-host structural differences:
gateway interface, ovnkube-node-mode flag, and init-controller.
Also re-applies the feature gate cleanup from f5b8490 (removal of
OVN_ADMIN_NETWORK_POLICY_ENABLE template references) and removes
redundant CLI flags from 008-script-lib.yaml that duplicate what
the ConfigMap already provides.
Co-authored-by: Cursor cursoragent@cursor.com
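The division of labor the commit message describes can be sketched as follows (flag names per the message; the config path and node name are illustrative):

```shell
# Sketch under assumptions: feature enablement comes only from the ConfigMap
# passed via --config-file, while OVN_NODE_MODE selects structural differences
# (node mode flag vs. init-controller) alone.
OVN_NODE_MODE=${OVN_NODE_MODE:-full}

if [ "${OVN_NODE_MODE}" = "dpu-host" ]; then
  structural_flags="--ovnkube-node-mode dpu-host"
else
  structural_flags="--init-ovnkube-controller ${K8S_NODE:-node1}"
fi

# No per-node feature gating here: features live in the mounted config file.
cmd="ovnkube --config-file /run/ovnkube-config/ovnkube.conf ${structural_flags}"
echo "${cmd}"
```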
Summary by CodeRabbit
Documentation
Chores