Conversation
|
Hi! Your PR was missing some labels 🔖 so I added them: essreduce |
| and event_time_offset in the lookup table. | ||
| """ | ||
| distance_unit = "m" | ||
| time_unit = "us" |
There was a problem hiding this comment.
why not ns, since event_time_offset typically has unit ns?
There was a problem hiding this comment.
It was because scientists usually think about time-of-flight in μs, and so when looking at lookup tables, it make more sense visually when the axis is in μs.
That said, you are right that this means we are almost always making a copy of the array of event_time_offset because we comvert it to μs...
I think I will use ns and save the compute 👍
There was a problem hiding this comment.
Aha I see, in that case either unit works for me
There was a problem hiding this comment.
Hmm actually it's the other way round:
we take the event_time_offset unit: https://github.com/scipp/ess/blob/fast-lut/packages/essreduce/src/ess/reduce/unwrap/to_wavelength.py#L270
and convert the axis of the LUT to that unit: https://github.com/scipp/ess/blob/fast-lut/packages/essreduce/src/ess/reduce/unwrap/to_wavelength.py#L93
| min_dist = ltotal_range[0].to(unit=distance_unit) | ||
| max_dist = ltotal_range[1].to(unit=distance_unit) | ||
|
|
||
| # We need to bin the data below, to compute the weighted mean of the wavelength. |
There was a problem hiding this comment.
What does this refer to? I don't see any binning or weighted mean here. Was the comment duplicated from the other lookup-table-computation function?
There was a problem hiding this comment.
Yes, copied over from old computation.
| # Find the correct simulation reading | ||
| selected_frame = None | ||
| for frame in sorted_frames: | ||
| if dist.value >= frame.distance.to(unit=dist.unit).value: |
There was a problem hiding this comment.
I don't quite understand this, what is frame here? Why are there multiple of them? Later on in this function we compute "subframes" associated with the selected frame, what are those?
There was a problem hiding this comment.
The 'frame' and 'subframe' terminology is from the chopper_cascade module: https://scipp.github.io/scippneutron/user-guide/chopper/chopper-cascade.html
A frame sequence contains frames, which is a description of how the pulse looks like at different stages along the beamline. There is a frame at the source (when the pulse is just a rectangle), one frame at each chopper, and a final frame that was propagated to the detector distance.
Then, each frame consists of one or multiple Subframe which are the polygons on the figure. Basically, one Frame is depicted by one colour on the figure. And there are one or more polygons of the same colour.
| pulse_period=pulse_period, | ||
| pulse_stride=pulse_stride, | ||
| distance_resolution=table.coords["distance"][1] - table.coords["distance"][0], | ||
| time_resolution=table.coords["event_time_offset"][1] |
There was a problem hiding this comment.
Is there a reason why we are not using the time_resolution and distance_resolution arguments here?
There was a problem hiding this comment.
The final time_resolution may not be exactly what the value in the argument was, because we want it to fit exactly into one or more pulse periods. We give the step size which is the closest to what the user requested but fits exactly in the range (resolution will always be at least what the user requested or finer, never coarser).
The distance_resolution is preserved and in principle we could just use the argument. In this case, it is the ltotal_range which is not exactly preserved. But in case we decided later to change that behaviour in the future, I wanted to be sure that bugs were not introduced here (like forgetting to update the resolution in the table).
So I thought the safest is just to take it from the array.
| # around the frame period, so we cover two pulses strides. | ||
| frequency_for_chopper_rotation = (1.0 / pulse_period.to(unit='s')) / ( | ||
| pulse_stride * 2 | ||
| ) |
There was a problem hiding this comment.
I'd prefer if pulse_stride was not an argument to this function, because essentially the pulse stride is a function of the chopper system, and we already pass the chopper parameters as an argument. Can we derive the pulse_stride from the chopper parameters?
There was a problem hiding this comment.
The initial thought was that trying to guess from chopper settings was dangerous.
Were you thinking of guessing from the rotation frequencies?
There may be chopper configurations we did not think of?
For example, can there be a chopper that instead of having one opening and spins at 14 Hz, it has 2 symmetric openings and spins at 7 Hz?
Or maybe during hot commissioning, they may operate the source at 7Hz, meaning that Odin might just decide to park the pulse skipping chopper, meaning that we find only choppers with frequencies of 14 Hz and higher, thus guessing a pulse_stride of 1?
There was a problem hiding this comment.
Yes I was thinking we could guess from the chopper frequencies.
Is it a problem if we guess a "too large" pulse stride?
My understanding is that it's only a problem if we guess a "too small" pulse stride. So in the case of a 7Hz chopper with two openings wouldn't be a problem, because we would guess pulse_stride=2, while it could have been treated as 1 as well.
I don't really understand the Odin example, if all choppers are operating at 14 Hz then the
There was a problem hiding this comment.
I don't really understand the Odin example, if all choppers are operating at 14 Hz then the λ(toa) function will also be periodic with a frequency of 14 Hz right?
The period would indeed be 1/14,which would be half as long as it would need to be, because all other choppers are running as if it was normal pulse-skipping mode.
But it is indeed a bit of an edge case, and I don't even know if it will happen in practice.
I will see if we can make a reliable detection of the pulse stride. Maybe I will keep the option to override the PulseStride if needed?
There was a problem hiding this comment.
As the PulseStride is a parameter that enters both the LookupTableWorkflow and the GenericUnwrapWorkflow, would you mind if I defer guessing the PulseStride from chopper params until I merge the two workflow to compute the LUT on-the-fly?
There was a problem hiding this comment.
Yeah absolutely, we can do it like this for now.
| return 0.5 * (y_min + y_max), 0.5 * (y_max - y_min) | ||
|
|
||
|
|
||
| def _compute_mean_wavelength_in_polygons( |
There was a problem hiding this comment.
Maybe rename to _estimate_wavelength_by_polygon_centers to be more specific.
|
|
||
|
|
||
| @dataclass | ||
| class SourcePulse: |
There was a problem hiding this comment.
Maybe rename to UniformSourcePulse to make it distinct from the non-uniform source pulse that is used in tof.
There was a problem hiding this comment.
How about SourceExtent or SourceBounds?
I agree SourcePulse isn't great, but I'm not sure users will immediately understand why Uniform comes into play?
| return np.concatenate(edges, axis=0) # (E, 2, 2) | ||
|
|
||
|
|
||
| def _polygon_intersections(polygons: list[np.ndarray], xs: np.ndarray) -> np.ndarray: |
There was a problem hiding this comment.
There's an option for doing this in a different way that is probably more efficient, and a bit simpler even.
Instead of converting the polygons to edges, we can convert each of them to two 1D curves representing the lower and upper bounds of the polygon.
Then np.interp can be used to evaluate the interpolation.
# instead of polygon_edges:
def _polygon_bounds(polygons: list[np.ndarray]) -> list[np.ndarray]:
bounds = []
for polygon in polygons:
left = polygon[:, 0].argmin()
right = polygon[:, 0].argmax()
k = (right - left) % len(polygon)
p = np.roll(polygon, -left, axis=0)
bounds.extend((
p[:k + 1],
np.concatenate((p[k:], p[:1]))[::-1],
))
return bounds
def _polygon_intersection(polygons: list[np.ndarray], xs: np.ndarray) -> np.ndarray:
bounds = _polygon_bounds(polygons)
y = np.vstack([
np.interp(xs, b[:, 0], b[:, 1], left=np.nan, right=np.nan)
for b in bounds
])
y_min = np.nanmin(y, axis=0)
y_max = np.nanmax(y, axis=0)
# Median value and spread estimate
return 0.5 * (y_min + y_max), 0.5 * (y_max - y_min)There was a problem hiding this comment.
Nice idea. One question: can we ever end up in the case where one of the two curves contains both segments that would intersect, and the second curve would contain none?
For example if one edge of the polygons is exactly vertical?
I guess the agrmin/max would pick one of the points, and then one edge of the polygon (say the upper) would contain one of the vertical points, while the lower edge would contain both. Then the np.interp would give you the one vertical point from the upper edge, but it's undefined what the lower edge would give you, because you could get either of the two points?
There was a problem hiding this comment.
Indeed, I tried it with
polygons = [np.array([[0, 1, 1, 0.5],
[0, 0.2, 3.2, 3.0]]).T]
xs = np.array([1.0])
_polygon_intersection(polygons, xs)and it gives 0.2 (returns the bottom point), instead of 1.7 (the mean).
That said, the old method gives NaN...
There was a problem hiding this comment.
Nice find and test.
I agree that what you get in that case is more or less random, because I guess it depends on how the polygon vertex list was ordered (?).
I think it is fine, because
- this case only occurs if the slope of the polygon is exactly vertical, I think that is pretty unusual,
- the probability that a neutron arrival time is exactly the edge of a polygon is approximately 1 over 100 million, so I think this is very much dominated by other noise.
There was a problem hiding this comment.
- this case only occurs if the slope of the polygon is exactly vertical, I think that is pretty unusual,
This happens:
- at distance = 0 where the pulse start -> we don't care about it here
- at every chopper: a monitor would never be placed at the exact same position as a chopper. However, since we are computing a table over the whole range of Ltotal, it is not completely unlikely that one of the positions where we populate the table falls on a chopper. Without fixing, it would lead to glitches/discontinuities in the table, which may not be important but could look weird to users and lead them to not trust the results.
- the probability that a neutron arrival time is exactly the edge of a polygon is approximately 1 over 100 million, so I think this is very much dominated by other noise.
We are not finding the intersects for every neutron, but for the event_time_offset bin edges in our lookup table. Those are usually a linspace, and they could coincide with the chopper opening times.
In any case, I fixed it by adding into the list of intersects the vertices themselves, if they are equal to values in xs, and adding nans otherwise to ensure shape is consistent:
y = np.vstack([
np.interp(xs, b[:, 0], b[:, 1], left=np.nan, right=np.nan)
for b in bounds
] + [np.where(p[:, 0:1] == xs, p[:, 1:2], np.nan) for p in polygons])It would also add the vertices of non-vertical polygon edges if they coincide with ther vertical line, but it does not hurt to add them in again, because we take half way between min and max.
This potentially adds a lot of nans for nothing, but it should remain small enough to not matter (1000 bin edges * eg 6 vertices per polygon * 6 polygons = 36000).
There was a problem hiding this comment.
We are not finding the intersects for every neutron, but for the event_time_offset bin edges in our lookup table. Those are usually a linspace, and they could coincide with the chopper opening times.
Yes you're right. However I think it's still similarly unlikely that the bin edges fall on top of points where the polygons have vertical edges.
An alternative fix would be to make a change in the polygon bounds function so that, if the x-values of two rightmost or leftmost points are the same, set the y-value of the endpoint equal to that of the point before it. Not sure if that would be simpler or faster, but the intent would probably be clearer.
jokasimr
left a comment
There was a problem hiding this comment.
Looks good to me!
The pulse skipping stuff will probably need some more work in the future but that seems unrelated to these changes.
| BeamlineComponentReading, | ||
| ChopperFrameSequence, | ||
| DistanceResolution, | ||
| FastLookupTableWorkflow, |
There was a problem hiding this comment.
Bad name. Will we also have SlowLookupTableWorkflow?
| ) | ||
|
|
||
|
|
||
| def FastLookupTableWorkflow(): |
There was a problem hiding this comment.
Does this actually need to be a separate workflow? Can we change to
def LookupTableWorkflow(use_simulation: bool = True): # switch to False after rollout
...There was a problem hiding this comment.
I felt it was natural to use a different workflow because they require different parameters, but I don't really mind.
There was a problem hiding this comment.
Are the different parameters in conflict, or can a user easily switch the flag but otherwise just set the union of params?
There was a problem hiding this comment.
No conflicts, so it should be fine to set the union also in the list of default params.
Alternative to #557:
Instead of replacing the existing workflow with the one that uses
scippneutron.tof.chopper_cascadeto compute the wavelength LUT, we simply add a new workflow.This can then be rolled out to different packages as desired.
The next step will be to use the providers from the new workflow and inject them directly into the
GenericUnwrapWorkflowto compute the LUT on-the-fly. We defer that to a later PR.Additional info (copied from #557)
To approximate the polygons as single lines, we produce a set of vertical lines that represent the bins in
event_time_offset.For each line, we find the intersections with the polygon edges, and then compute the midpoint between the min and max y coordinate of the intersections.
The table below is computed in ~0.7s compared to ~16s before (using 5M neutrons).
To use:
Needs scipp/scippneutron#700