Skip to content

BUG: Fire IterationEvent before AdvanceOneStep in v4 gradient optimizers (closes #2570, supersedes #2572)#6196

Merged
hjmjohnson merged 1 commit intoInsightSoftwareConsortium:mainfrom
hjmjohnson:fix-issue-2570-gd-event-order
May 6, 2026
Merged

BUG: Fire IterationEvent before AdvanceOneStep in v4 gradient optimizers (closes #2570, supersedes #2572)#6196
hjmjohnson merged 1 commit intoInsightSoftwareConsortium:mainfrom
hjmjohnson:fix-issue-2570-gd-event-order

Conversation

@hjmjohnson
Copy link
Copy Markdown
Member

@hjmjohnson hjmjohnson commented May 2, 2026

Fixes #2570 and supersedes the WIP draft #2572 by @gregsharp. The 2021 PR stalled on coding-standards review, not on the diagnosis — Greg was right about the bug.

GradientDescentOptimizerv4 (and its subclasses) fired IterationEvent at the end of AdvanceOneStep(), after UpdateTransformParameters had already moved the position from X to X+step, while m_CurrentMetricValue still held v(X). Observers reading GetCurrentMetricValue() and GetCurrentPosition() got an inconsistent pair (old value with new position).

This PR moves the event invocation to the loop in ResumeOptimization(), fired BEFORE AdvanceOneStep(), so observers always see the value at the position where it was actually computed. Six-line change to the base; trailing InvokeEvent removed from each subclass override (RegularStep, QuasiNewton, GradientDescentLineSearch, ConjugateGradientLineSearch).

User-visible symptom (from issue #2570)

The reproducer (a plastimatch v4-registration test program and its AmoebaOptimizerv4 vs RegularStepGradientDescentOptimizerv4 comparison) produces this observer output at iteration 0:

expected:  0   17208.6   [0, 0, 0]
actual:    0   17208.6   [14.999996789911513, -0.006939114154077091, -0.006939116585946947]

The score at iteration 0 is the score evaluated at the initial parameters [0, 0, 0] — but the position the observer reads has already been advanced one step. Same value, different position, no way for the observer to reconstruct what happened. Discussed on Discourse: imageregistrationmethodv4-questions/4117/7.

TDD evidence — failing test pinned the bug, then passes after fix

The new GTest itkGradientDescentOptimizerv4ObserverGTest.cxx attaches an observer at IterationEvent and asserts that metric_at(reported_position) == reported_value. On unfixed upstream/main:

GradientDescentOptimizerv4.ObserverReportsConsistentValueAndPosition (FAILED)
RegularStepGradientDescentOptimizerv4.ObserverReportsConsistentValueAndPosition (FAILED)
  Iteration 0: observer reported value 23999.99... at position [99.998..., -99.992...]
               but metric at that position is 23995.96... (off by ~4)

After the fix:

[ OK ] GradientDescentOptimizerv4.ObserverReportsConsistentValueAndPosition (0 ms)
[ OK ] RegularStepGradientDescentOptimizerv4.ObserverReportsConsistentValueAndPosition (0 ms)
Why this approach over Greg's #2572 restructure

Greg's PR rewrote ResumeOptimization to compute the metric value/derivative after AdvanceOneStep (~150 lines changed across 7 files). It's correct, but it changes the iteration semantics:

Old: zero iterations = no evaluations; one iteration = one evaluation + one position update.
Greg's: zero iterations = no evaluations; one iteration = one evaluation but no position update.

@dzenanz pushed back on that semantic flip in the 2021 review, and the discussion never resolved. This PR is the minimal alternative: keep the existing "1 iteration = 1 step taken" semantic and move the event invocation 6 lines earlier in the loop. No iteration-semantic change, no risk of breaking dependent registration code.

Other items in Greg's branch — explicit zero-init of m_Gradient / m_PreviousGradient in StartOptimization, an extra m_PreviousGradient = m_Gradient after the initial evaluation — were inspected for first-iteration correctness concerns. With the event-firing fix in place, neither is load-bearing: subclasses (RegularStepGradientDescentOptimizerv4) already initialize their own gradient buffers, and the relaxation dot product on iteration 0 is unchanged either way. They're omitted here to keep the surface minimal.

Files changed
  • Modules/Numerics/Optimizersv4/include/itkGradientDescentOptimizerv4.hxx — add InvokeEvent(IterationEvent()) before AdvanceOneStep() in ResumeOptimization; remove the trailing InvokeEvent from AdvanceOneStep
  • Modules/Numerics/Optimizersv4/include/itkRegularStepGradientDescentOptimizerv4.hxx — remove trailing InvokeEvent from AdvanceOneStep
  • Modules/Numerics/Optimizersv4/include/itkQuasiNewtonOptimizerv4.hxx — same
  • Modules/Numerics/Optimizersv4/include/itkGradientDescentLineSearchOptimizerv4.hxx — same
  • Modules/Numerics/Optimizersv4/include/itkConjugateGradientLineSearchOptimizerv4.hxx — same
  • Modules/Numerics/Optimizersv4/test/itkGradientDescentOptimizerv4ObserverGTest.cxx — new (regression guard)
  • Modules/Numerics/Optimizersv4/test/CMakeLists.txt — register the new GTest
Verification
  • All 21 tests in Optimizersv4 + RegistrationMethodsv4 labels pass locally on the fix
  • pre-commit run --all-files exit 0
  • New GTest fails on upstream/main (no fix), passes on this branch — confirms it actually guards against the bug

@github-actions github-actions Bot added type:Bug Inconsistencies or issues which will cause an incorrect result under some or all circumstances type:Infrastructure Infrastructure/ecosystem related changes, such as CMake or buildbots type:Testing Ensure that the purpose of a class is met/the results on a wide set of test cases are correct area:Numerics Issues affecting the Numerics module labels May 2, 2026
@hjmjohnson hjmjohnson force-pushed the fix-issue-2570-gd-event-order branch from 6687ce8 to 4e8804f Compare May 2, 2026 21:21
@github-actions github-actions Bot added the area:Documentation Issues affecting the Documentation module label May 2, 2026
@hjmjohnson hjmjohnson marked this pull request as ready for review May 3, 2026 12:40
@greptile-apps

This comment was marked as resolved.

Comment thread Modules/Numerics/Optimizersv4/include/itkGradientDescentOptimizerv4.hxx Outdated
@hjmjohnson hjmjohnson force-pushed the fix-issue-2570-gd-event-order branch from 4e8804f to 3048603 Compare May 4, 2026 11:21
@hjmjohnson hjmjohnson added this to the ITK 6.0 Release Candidate 1 milestone May 5, 2026
@hjmjohnson
Copy link
Copy Markdown
Member Author

@ntustison @stnava @blowekamp @cookpa

This issue has been outstanding for a long time. The ANTs may have behavior that depends on this, so I'm giving you a chance to review before it is merged in the next few days.

@hjmjohnson hjmjohnson force-pushed the fix-issue-2570-gd-event-order branch from 3048603 to b9e59e9 Compare May 5, 2026 22:34
@github-actions github-actions Bot removed the area:Documentation Issues affecting the Documentation module label May 5, 2026
@cookpa
Copy link
Copy Markdown
Contributor

cookpa commented May 6, 2026

Thanks for the heads up @hjmjohnson. I did a bit of checking with Codex and I think there will be no problems with ANTs. It threw out the suggestion I put in the comment above just as a general point

Issue InsightSoftwareConsortium#2570: an observer attached to IterationEvent on
GradientDescentOptimizerv4 (or its subclasses) was given an internally
inconsistent (value, position) pair. The optimizer was firing the event
at the end of AdvanceOneStep(), after UpdateTransformParameters() had
already moved the position from X to X+step, while m_CurrentMetricValue
still held v(X). Observers reading GetCurrentMetricValue() and
GetCurrentPosition() saw value-at-old-position with the new position.

Move InvokeEvent(IterationEvent()) into the loop in
GradientDescentOptimizerv4Template::ResumeOptimization(), fired BEFORE
AdvanceOneStep() so the reported (value, position) pair always agrees.
Remove the now-redundant InvokeEvent at the end of each subclass's
AdvanceOneStep override (RegularStep, QuasiNewton, GradientDescentLineSearch,
ConjugateGradientLineSearch).

Adds a GTest regression guard
(itkGradientDescentOptimizerv4ObserverGTest.cxx) that fails on the
unfixed code with a ~4-unit value/position mismatch on a quadratic
fixture, and passes on the fixed code.

Closes InsightSoftwareConsortium#2570. Supersedes the WIP draft InsightSoftwareConsortium#2572 by @gregsharp; the test
demonstrates Greg's diagnosis is correct, the fix here is a simpler
one-event-move that does not change iteration semantics.

Co-Authored-By: Gregory C. Sharp <gregsharp.geo@yahoo.com>
@hjmjohnson hjmjohnson force-pushed the fix-issue-2570-gd-event-order branch from b9e59e9 to cf929f2 Compare May 6, 2026 14:56
@hjmjohnson hjmjohnson merged commit d0ffc99 into InsightSoftwareConsortium:main May 6, 2026
19 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area:Numerics Issues affecting the Numerics module type:Bug Inconsistencies or issues which will cause an incorrect result under some or all circumstances type:Infrastructure Infrastructure/ecosystem related changes, such as CMake or buildbots type:Testing Ensure that the purpose of a class is met/the results on a wide set of test cases are correct

Projects

None yet

Development

Successfully merging this pull request may close these issues.

GradientDescentOptimizerv4 can return incorrect positions

3 participants