fix(vi): use Vi-standard word boundary rules for w/b/e motions#1059
fix(vi): use Vi-standard word boundary rules for w/b/e motions#1059sim590 wants to merge 8 commits into
Conversation
Replace unicode_segmentation word boundaries with a Vi-compatible three-class system (keyword, punctuation, whitespace) for w/e/b motions. Separate inclusive Vi cut/copy (de/dE/ce/cE/ye/yE) from exclusive Emacs word commands (Alt+d) by adding dedicated EditCommand variants. Closes: nushell#563, nushell#667 Addresses: nushell#788 (point 2)
…iases Add emacs_word_right_index, emacs_word_right_end_index, emacs_word_right_start_index, emacs_word_left_index, and emacs_current_word_range as the canonical Emacs-path functions. Deprecate the unprefixed word_* variants to encourage callers to choose explicitly between emacs_* and vi_* word boundaries. All internal callers migrated to emacs_* variants.
Cover basic cases (spaces, single word, empty string, edge positions) for vi_word_right_start_index, vi_word_right_end_index, vi_word_left_index, vi_word_right_index, and vi_current_word_range.
Add Vi prefix to CutWordRightEnd, CutBigWordRightEnd, CopyWordRightEnd, and CopyBigWordRightEnd for consistency with other Vi-specific EditCommand variants.
|
What LLM did you use? The description is verbose and repetitive which makes me concerned that the code will be the same. |
|
I'm not sure where the description is repetitive. But the LLM used is Claude Opus 4.6. |
|
I have a thought about this fix, motivated from when I was working on my The Vi/Emacs boundary decoupling in One concern I have on the Same Long-term there's something I see coming anyways, which is decoupling motions from operators entirely. Which might come up naturally in #960 once we bring in the I raise this now since helix will add a lot more to I don't want to block the bug fix. If the consensus is to stay flat, this is fine. |
|
I'd rather wait until the helix changes are in. Marking this draft. |
|
Thanks for the feedback @kronberger-droid and @fdncred. To summarize the situation: this PR fixes Vi word boundary behavior ( The discussion point is how the new Having looked at #960, I note that the Helix mode takes a similar "separate commands per mode" approach — Regarding the longer-term idea of decoupling motions from operators entirely: if that refactor is coming anyway, then whatever approach we pick today (flat or parameterized) is equally temporary — it'll all be reworked when that happens. So there's little benefit in over-engineering the interim solution. That said, we're open to restructuring the This bug has been open for a long time and affects everyday Vi users. We'd like to avoid blocking on #960 if possible, but we're happy to adapt the implementation to your preferences. |
|
@fdncred @schlich @sim590 Concrete shape: EditCommand::MoveWord { flavor: WordFlavor, target: MotionTarget, select: bool }
EditCommand::CutWord { flavor: WordFlavor, target: MotionTarget }
I believe this lines up with where a later motion/operator decoupling would land rather than competing with it, since |
|
ok, sounds good to me |
Add tests for cut_vi_big_word_right_end (dE), copy_vi_word_right_end (ye), and copy_vi_big_word_right_end (yE) which contain non-trivial inclusive-end logic via grapheme_right_index_from_pos.
Replace 28 flat word-related EditCommand variants with 3 parameterized ones: MoveWord, CutWord, CopyWord, each taking WordFlavor, WordSize, and MotionTarget fields. Add WordFlavor (Emacs/Vi), WordSize (Word/BigWord), and MotionTarget (Left/Right/RightStart/RightEnd/RightToNext) enums. BigWord commands use wildcard flavor matching since word/WORD semantics are identical across Vi and Emacs. Invalid flavor/target combinations are caught with unreachable!(). Fix Emacs cut/copy_word_right_to_next incorrectly using vi_word_right_start_index instead of emacs_word_right_start_index.
7fba736 to
799f070
Compare
|
Pushed a new commit that implements the parameterized approach as discussed. The 28 flat word-related MoveWord { flavor: WordFlavor, target: MotionTarget, size: WordSize, select: bool }
CutWord { flavor: WordFlavor, target: MotionTarget, size: WordSize }
CopyWord { flavor: WordFlavor, target: MotionTarget, size: WordSize }
BigWord dispatch uses a wildcard on All tests pass (904/904), clippy clean, fmt clean. @kronberger-droid does this match what you had in mind? |
|
Walking back the parameterization I suggested.
Flat variants fit better, then mode picks the command, editor dispatches 1:1: EditCommand::MoveEmacsWordLeft { select: bool },
EditCommand::MoveViWordLeft { select: bool },
EditCommand::MoveBigWordLeft { select: bool }, // shared~37 statically-valid variants vs 3 with The fully mode-agnostic version ( The Vi word-boundary fix itself ( Sorry for the churn. |
|
Also one thing: |
This reverts commit 799f070.
Rename word-related EditCommand variants to make the mode explicit: MoveWordLeft -> MoveEmacsWordLeft, CutWordRight -> CutEmacsWordRight, CopyWordRightToNext -> CopyEmacsWordRightToNext, etc. BigWord variants remain unprefixed since word/WORD semantics are identical across Vi and Emacs modes. Update all dispatch, keybinding, and call sites accordingly.
|
Thanks for the honest feedback @kronberger-droid — the Reverted the parameterization and went with flat variants instead, with explicit mode prefixes:
All 904 tests pass, clippy clean, fmt clean. Happy to adjust further if anything still looks off. |
Summary
Fixes Vi small-word motions (
w,b,e) and their operator variants (dw,de,db,cw,ce,cb,yw,ye,yb) to use standard Vi word boundary rules instead of Unicode UAX #29 segmentation.Additionally, refactors all 28 flat word-related
EditCommandvariants into 3 parameterized ones (MoveWord,CutWord,CopyWord) usingWordFlavor,WordSize, andMotionTargetenums, as suggested by @kronberger-droid.Problem
The
w,e, andbmotions in Vi mode shared the same word boundary functions as Emacs mode, which useunicode_segmentation::split_word_bound_indices()(UAX #29). This treats punctuation inconsistently — for example, infoo.bar, Vi should see three words (foo,.,bar), but reedline treated it as one.Additionally, inclusive Vi operator commands (
de,dE,ce,cE,ye,yE) sharedEditCommandvariants with exclusive Emacs commands (Alt+d), causing incorrect cut/copy ranges.Solution
Word boundary logic
Keyword,Punctuation,Whitespace) matching the Vim/POSIX standard, and avi_word_segments()iterator that segments text accordingly.vi_word_*functions (vi_word_right_index,vi_word_right_start_index,vi_word_right_end_index,vi_word_left_index,vi_current_word_range) using Vi three-class segmentation.emacs_*aliases (emacs_word_right_index,emacs_word_right_start_index, etc.) for the existing UAX added a slightly-fancy prompt #29-based functions. The unprefixedword_*variants are deprecated to encourage callers to choose explicitly.EditCommand refactoring
Replaced 28 flat word-related
EditCommandvariants with 3 parameterized ones:WordFlavor:EmacsorVi(extensible toHelixlater)WordSize:WordorBigWordMotionTarget:Left,Right,RightStart,RightEnd,RightToNextBigWord dispatch uses a wildcard on
flavorsince word/WORD semantics are identical across Vi and Emacs. Invalid combinations are caught withunreachable!().Other changes
Tests
vi_word_segments()w,e,b)vi_word_right_start_index,vi_word_right_end_index,vi_word_left_index,vi_word_right_index, andvi_current_word_rangede,dE,db,ye,yE,yb)Note on other open PRs
This PR introduces a structural change to how word motions are handled — word-related
EditCommandvariants have been parameterized and Vi/Emacs word boundary paths are now fully decoupled. Open PRs that touch word motions orEditCommandvariants (notably #1016 and #960) may need to rebase after this lands. PRs that only touch other areas are unaffected.Closes #563, closes #667, addresses #788 (point 2)