Skip to content

[essreduce] Fast wavelength lookup table workflow#589

Merged
nvaytet merged 15 commits into
mainfrom
fast-lut
May 22, 2026
Merged

[essreduce] Fast wavelength lookup table workflow#589
nvaytet merged 15 commits into
mainfrom
fast-lut

Conversation

@nvaytet
Copy link
Copy Markdown
Member

@nvaytet nvaytet commented May 18, 2026

Alternative to #557:

Instead of replacing the existing workflow with the one that uses scippneutron.tof.chopper_cascade to compute the wavelength LUT, we simply add a new workflow.

This can then be rolled out to different packages as desired.

The next step will be to use the providers from the new workflow and inject them directly into the GenericUnwrapWorkflow to compute the LUT on-the-fly. We defer that to a later PR.

Additional info (copied from #557)

To approximate the polygons as single lines, we produce a set of vertical lines that represent the bins in event_time_offset.
For each line, we find the intersections with the polygon edges, and then compute the midpoint between the min and max y coordinate of the intersections.

The table below is computed in ~0.7s compared to ~16s before (using 5M neutrons).

To use:

import scipp as sc
from ess.reduce import unwrap
from ess.reduce.nexus.types import AnyRun
from ess.odin.beamline import choppers

source_position = sc.vector([0, 0, 0], unit='m')
disk_choppers = choppers(source_position)

wf = unwrap.FastLookupTableWorkflow()
wf[unwrap.DiskChoppers[AnyRun]] = disk_choppers
wf[unwrap.SourcePosition] = source_position
wf[unwrap.PulseStride] = 2
wf[unwrap.LtotalRange] = sc.scalar(5.0, unit="m"), sc.scalar(65.0, unit="m")
wf[unwrap.DistanceResolution] = sc.scalar(0.1, unit="m")
wf[unwrap.TimeResolution] = sc.scalar(250.0, unit='us')

table = wf.compute(unwrap.LookupTable)
table.plot(vmax=10.0)
Screenshot_20260507_161109

Needs scipp/scippneutron#700

@github-actions github-actions Bot added the essreduce Issues for essreduce. label May 18, 2026
@github-actions
Copy link
Copy Markdown

Hi! Your PR was missing some labels 🔖 so I added them: essreduce

@nvaytet nvaytet requested review from SimonHeybrock and jokasimr May 18, 2026 09:24
and event_time_offset in the lookup table.
"""
distance_unit = "m"
time_unit = "us"
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why not ns, since event_time_offset typically has unit ns?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It was because scientists usually think about time-of-flight in μs, and so when looking at lookup tables, it make more sense visually when the axis is in μs.
That said, you are right that this means we are almost always making a copy of the array of event_time_offset because we comvert it to μs...

I think I will use ns and save the compute 👍

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Aha I see, in that case either unit works for me

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

min_dist = ltotal_range[0].to(unit=distance_unit)
max_dist = ltotal_range[1].to(unit=distance_unit)

# We need to bin the data below, to compute the weighted mean of the wavelength.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What does this refer to? I don't see any binning or weighted mean here. Was the comment duplicated from the other lookup-table-computation function?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, copied over from old computation.

# Find the correct simulation reading
selected_frame = None
for frame in sorted_frames:
if dist.value >= frame.distance.to(unit=dist.unit).value:
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't quite understand this, what is frame here? Why are there multiple of them? Later on in this function we compute "subframes" associated with the selected frame, what are those?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The 'frame' and 'subframe' terminology is from the chopper_cascade module: https://scipp.github.io/scippneutron/user-guide/chopper/chopper-cascade.html

A frame sequence contains frames, which is a description of how the pulse looks like at different stages along the beamline. There is a frame at the source (when the pulse is just a rectangle), one frame at each chopper, and a final frame that was propagated to the detector distance.

Then, each frame consists of one or multiple Subframe which are the polygons on the figure. Basically, one Frame is depicted by one colour on the figure. And there are one or more polygons of the same colour.

pulse_period=pulse_period,
pulse_stride=pulse_stride,
distance_resolution=table.coords["distance"][1] - table.coords["distance"][0],
time_resolution=table.coords["event_time_offset"][1]
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a reason why we are not using the time_resolution and distance_resolution arguments here?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The final time_resolution may not be exactly what the value in the argument was, because we want it to fit exactly into one or more pulse periods. We give the step size which is the closest to what the user requested but fits exactly in the range (resolution will always be at least what the user requested or finer, never coarser).

The distance_resolution is preserved and in principle we could just use the argument. In this case, it is the ltotal_range which is not exactly preserved. But in case we decided later to change that behaviour in the future, I wanted to be sure that bugs were not introduced here (like forgetting to update the resolution in the table).
So I thought the safest is just to take it from the array.

# around the frame period, so we cover two pulses strides.
frequency_for_chopper_rotation = (1.0 / pulse_period.to(unit='s')) / (
pulse_stride * 2
)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd prefer if pulse_stride was not an argument to this function, because essentially the pulse stride is a function of the chopper system, and we already pass the chopper parameters as an argument. Can we derive the pulse_stride from the chopper parameters?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The initial thought was that trying to guess from chopper settings was dangerous.
Were you thinking of guessing from the rotation frequencies?

There may be chopper configurations we did not think of?

For example, can there be a chopper that instead of having one opening and spins at 14 Hz, it has 2 symmetric openings and spins at 7 Hz?

Or maybe during hot commissioning, they may operate the source at 7Hz, meaning that Odin might just decide to park the pulse skipping chopper, meaning that we find only choppers with frequencies of 14 Hz and higher, thus guessing a pulse_stride of 1?

Copy link
Copy Markdown
Contributor

@jokasimr jokasimr May 18, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes I was thinking we could guess from the chopper frequencies.

Is it a problem if we guess a "too large" pulse stride?
My understanding is that it's only a problem if we guess a "too small" pulse stride. So in the case of a 7Hz chopper with two openings wouldn't be a problem, because we would guess pulse_stride=2, while it could have been treated as 1 as well.

I don't really understand the Odin example, if all choppers are operating at 14 Hz then the $\lambda(t_{oa})$ function will also be periodic with a frequency of 14 Hz right?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't really understand the Odin example, if all choppers are operating at 14 Hz then the λ(toa) function will also be periodic with a frequency of 14 Hz right?

The period would indeed be 1/14,which would be half as long as it would need to be, because all other choppers are running as if it was normal pulse-skipping mode.

But it is indeed a bit of an edge case, and I don't even know if it will happen in practice.
I will see if we can make a reliable detection of the pulse stride. Maybe I will keep the option to override the PulseStride if needed?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As the PulseStride is a parameter that enters both the LookupTableWorkflow and the GenericUnwrapWorkflow, would you mind if I defer guessing the PulseStride from chopper params until I merge the two workflow to compute the LUT on-the-fly?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah absolutely, we can do it like this for now.

return 0.5 * (y_min + y_max), 0.5 * (y_max - y_min)


def _compute_mean_wavelength_in_polygons(
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe rename to _estimate_wavelength_by_polygon_centers to be more specific.



@dataclass
class SourcePulse:
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe rename to UniformSourcePulse to make it distinct from the non-uniform source pulse that is used in tof.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about SourceExtent or SourceBounds?
I agree SourcePulse isn't great, but I'm not sure users will immediately understand why Uniform comes into play?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Those work too 👍

return np.concatenate(edges, axis=0) # (E, 2, 2)


def _polygon_intersections(polygons: list[np.ndarray], xs: np.ndarray) -> np.ndarray:
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's an option for doing this in a different way that is probably more efficient, and a bit simpler even.

Instead of converting the polygons to edges, we can convert each of them to two 1D curves representing the lower and upper bounds of the polygon.

Then np.interp can be used to evaluate the interpolation.

# instead of polygon_edges:
def _polygon_bounds(polygons: list[np.ndarray]) -> list[np.ndarray]:
    bounds = []

    for polygon in polygons:
        left = polygon[:, 0].argmin()
        right = polygon[:, 0].argmax()
    
        k = (right - left) % len(polygon)

        p = np.roll(polygon, -left, axis=0)

        bounds.extend((
            p[:k + 1],
            np.concatenate((p[k:], p[:1]))[::-1],
       ))
    return bounds
  

def _polygon_intersection(polygons: list[np.ndarray], xs: np.ndarray) -> np.ndarray:
    bounds = _polygon_bounds(polygons)

    y = np.vstack([
        np.interp(xs, b[:, 0], b[:, 1], left=np.nan, right=np.nan)
        for b in bounds
    ])
    y_min = np.nanmin(y, axis=0)
    y_max = np.nanmax(y, axis=0)

    # Median value and spread estimate
    return 0.5 * (y_min + y_max), 0.5 * (y_max - y_min)

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice idea. One question: can we ever end up in the case where one of the two curves contains both segments that would intersect, and the second curve would contain none?

For example if one edge of the polygons is exactly vertical?
I guess the agrmin/max would pick one of the points, and then one edge of the polygon (say the upper) would contain one of the vertical points, while the lower edge would contain both. Then the np.interp would give you the one vertical point from the upper edge, but it's undefined what the lower edge would give you, because you could get either of the two points?

Copy link
Copy Markdown
Member Author

@nvaytet nvaytet May 19, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed, I tried it with

polygons = [np.array([[0, 1, 1, 0.5],
                      [0, 0.2, 3.2, 3.0]]).T]
xs = np.array([1.0])

_polygon_intersection(polygons, xs)

and it gives 0.2 (returns the bottom point), instead of 1.7 (the mean).

That said, the old method gives NaN...

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice find and test.
I agree that what you get in that case is more or less random, because I guess it depends on how the polygon vertex list was ordered (?).

I think it is fine, because

  1. this case only occurs if the slope of the polygon is exactly vertical, I think that is pretty unusual,
  2. the probability that a neutron arrival time is exactly the edge of a polygon is approximately 1 over 100 million, so I think this is very much dominated by other noise.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. this case only occurs if the slope of the polygon is exactly vertical, I think that is pretty unusual,

This happens:

  • at distance = 0 where the pulse start -> we don't care about it here
  • at every chopper: a monitor would never be placed at the exact same position as a chopper. However, since we are computing a table over the whole range of Ltotal, it is not completely unlikely that one of the positions where we populate the table falls on a chopper. Without fixing, it would lead to glitches/discontinuities in the table, which may not be important but could look weird to users and lead them to not trust the results.
  1. the probability that a neutron arrival time is exactly the edge of a polygon is approximately 1 over 100 million, so I think this is very much dominated by other noise.

We are not finding the intersects for every neutron, but for the event_time_offset bin edges in our lookup table. Those are usually a linspace, and they could coincide with the chopper opening times.

In any case, I fixed it by adding into the list of intersects the vertices themselves, if they are equal to values in xs, and adding nans otherwise to ensure shape is consistent:

y = np.vstack([
        np.interp(xs, b[:, 0], b[:, 1], left=np.nan, right=np.nan)
        for b in bounds
    ] + [np.where(p[:, 0:1] == xs, p[:, 1:2], np.nan) for p in polygons])

It would also add the vertices of non-vertical polygon edges if they coincide with ther vertical line, but it does not hurt to add them in again, because we take half way between min and max.

This potentially adds a lot of nans for nothing, but it should remain small enough to not matter (1000 bin edges * eg 6 vertices per polygon * 6 polygons = 36000).

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We are not finding the intersects for every neutron, but for the event_time_offset bin edges in our lookup table. Those are usually a linspace, and they could coincide with the chopper opening times.

Yes you're right. However I think it's still similarly unlikely that the bin edges fall on top of points where the polygons have vertical edges.

An alternative fix would be to make a change in the polygon bounds function so that, if the x-values of two rightmost or leftmost points are the same, set the y-value of the endpoint equal to that of the point before it. Not sure if that would be simpler or faster, but the intent would probably be clearer.

Copy link
Copy Markdown
Contributor

@jokasimr jokasimr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me!

The pulse skipping stuff will probably need some more work in the future but that seems unrelated to these changes.

BeamlineComponentReading,
ChopperFrameSequence,
DistanceResolution,
FastLookupTableWorkflow,
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bad name. Will we also have SlowLookupTableWorkflow?

)


def FastLookupTableWorkflow():
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this actually need to be a separate workflow? Can we change to

def LookupTableWorkflow(use_simulation: bool = True):  # switch to False after rollout
    ...

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I felt it was natural to use a different workflow because they require different parameters, but I don't really mind.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are the different parameters in conflict, or can a user easily switch the flag but otherwise just set the union of params?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No conflicts, so it should be fine to set the union also in the list of default params.

@nvaytet nvaytet added this pull request to the merge queue May 22, 2026
Merged via the queue into main with commit 868750d May 22, 2026
17 checks passed
@nvaytet nvaytet deleted the fast-lut branch May 22, 2026 08:29
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

essreduce Issues for essreduce.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants