fix: parse year-before-month-name expressions (e.g. "2024 Aug") #648

Open
chatman-media wants to merge 1 commit into wanasit:master from chatman-media:fix/year-before-month-name

@chatman-media

Closes #639

Problem

Parsing "2024 Aug" returns August 2023 instead of August 2024 (the yyyy MMM order is a common format). Reproduced with a pinned reference date:

chrono.parse("2024 Aug", new Date(2024, 0, 15));
// => 2023-08-20  ❌  (expected 2024-08-01)

Two parsers were involved:

  1. ENMonthNameParser only recognized a trailing year (MMM yyyy), so for "2024 Aug" it matched just "Aug", found no year, and fell back to findYearClosestToRef — which, from a January reference, picks the previous August.
  2. ENMonthNameLittleEndianParser then matched the contiguous 2024 as a bogus day range: 20 + (empty connector) + 24 → "the 20th–24th of August". Because the two results overlapped in the text, overlap resolution kept this wrong one.
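
Both failure modes can be reproduced in isolation. The sketch below is not chrono's code: `yearClosestToRef` and `LOOSE_DAY_RANGE` are hypothetical stand-ins, assuming a closest-year fallback and a day-range pattern whose connector is optional.

```typescript
// Hypothetical stand-in for the closest-year fallback: pick the year
// (previous, same, or next) whose date lies nearest to the reference date.
function yearClosestToRef(ref: Date, month: number, day: number): number {
    const refYear = ref.getFullYear();
    let best = refYear;
    let bestDiff = Infinity;
    for (const year of [refYear - 1, refYear, refYear + 1]) {
        const diff = Math.abs(new Date(year, month, day).getTime() - ref.getTime());
        if (diff < bestDiff) {
            best = year;
            bestDiff = diff;
        }
    }
    return best;
}

// From 15 Jan 2024, the previous August is closer than the next one.
const closestYear = yearClosestToRef(new Date(2024, 0, 15), 7, 1);
console.log(closestYear); // 2023

// Hypothetical day-range pattern with an OPTIONAL connector between the days
// (only "Aug" is listed, for brevity).
const LOOSE_DAY_RANGE = /^(\d{1,2})\s*(?:-|to)?\s*(\d{1,2})\s+(Aug)/i;

// With nothing required between the two numbers, "2024" splits into 20 and 24.
const looseMatch = LOOSE_DAY_RANGE.exec("2024 Aug");
console.log(looseMatch?.[1], looseMatch?.[2]); // 20 24
```

Either behaviour alone produces a wrong year for "2024 Aug"; together they explain the reported 2023-08-20.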

The reverse order ("Aug 2024") already worked, which is why this looked inconsistent.

Fix

  • ENMonthNameParser: recognize an optional leading 4-digit year, so "2024 Aug", "2024-Aug", "2024.Aug", and "2012 January" parse as a single result with a certain year. A 4-digit year is required (not 2-digit) to avoid clashing with the day-month format ("09 Aug" must stay "9 August"). A guard rejects supplying a year on both sides ("2012 January 2013").
  • ENMonthNameLittleEndianParser: require a real separator between the two day numbers of a range, so a contiguous 4-digit number is no longer split into a spurious day-to-day range. Existing ranges are unaffected: "10 - 22 August", "10 to 22 August", "10-22 August", and space-only "10 22 August" all still parse as ranges.
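
As a rough illustration of the two changes, here is a minimal sketch with simplified patterns. `YEAR_MONTH`, `parseYearMonth`, and `STRICT_DAY_RANGE` are hypothetical names, only two month names are listed, and the real parsers in this PR are more involved.

```typescript
// Only two months, for brevity; the real month dictionary is much larger.
const MONTH = "(Jan(?:uary)?|Aug(?:ust)?)";

// Optional leading 4-digit year, month name, optional trailing 4-digit year.
const YEAR_MONTH = new RegExp(
    `^(?:(\\d{4})[\\s.\\-]+)?${MONTH}(?:\\s+(\\d{4}))?$`,
    "i"
);

function parseYearMonth(text: string): { year: number | null; month: string } | null {
    const m = YEAR_MONTH.exec(text);
    if (!m) return null;
    const [, leading, month, trailing] = m;
    // Guard: reject a year on both sides, e.g. "2012 January 2013".
    if (leading && trailing) return null;
    const year = leading ?? trailing;
    return { year: year ? parseInt(year, 10) : null, month };
}

// Day range that requires a real separator: a connector ("-" or "to",
// with optional spaces around it) or at least one whitespace character.
const STRICT_DAY_RANGE = /^(\d{1,2})(?:\s*(?:-|to)\s*|\s+)(\d{1,2})\s+(Aug(?:ust)?)/i;

console.log(parseYearMonth("2024 Aug"));          // { year: 2024, month: 'Aug' }
console.log(parseYearMonth("2012 January 2013")); // null
console.log(parseYearMonth("09 Aug"));            // null (left to the day-month parser)
console.log(STRICT_DAY_RANGE.test("2024 Aug"));     // false
console.log(STRICT_DAY_RANGE.test("10 22 August")); // true
```

Requiring exactly four digits for the leading year is what keeps "09 Aug" out of this pattern, and the separator requirement is what stops "2024" from being read as 20 to 24.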

Tests

Added a Year-Month expression (year before month) block in test/en/en_month.test.ts covering "2024 Aug", "2024 August", "2024-Aug", and "2012 January" with pinned reference dates. These fail on master (Expected: 2024, Received: 2023) and pass with the fix.

Full suite green: 615 passed.

"2024 Aug" was parsed as August 2023 instead of August 2024. The year
token before the month name was not recognized by ENMonthNameParser, so
the month fell back to the year closest to the reference date. Worse,
ENMonthNameLittleEndianParser split the contiguous 4-digit number into a
bogus day range (20-24 Aug), which won overlap resolution.

- ENMonthNameParser: recognize an optional leading 4-digit year so
  "yyyy MMM" ("2024 Aug", "2024-Aug", "2012 January") is parsed as a
  single, year-certain result.
- ENMonthNameLittleEndianParser: require a real separator between the two
  day numbers of a range, so a 4-digit year is no longer split into a
  spurious day-to-day range. Valid ranges ("10 - 22 August",
  "10 to 22 August", "10-22 August", space-only "10 22 August") are
  unaffected.

Development

Successfully merging this pull request may close these issues.

Bug report : Incorrect parsing of yyyy MMM format
