Support scientific-notation float literals (1.5e10, 1e6) #82

Merged
CoreyRDean merged 2 commits into develop from feat/scientific-notation-float-literals on May 30, 2026

@CoreyRDean
Collaborator

Summary (non-technical)

You can now write floating-point numbers in exponent / scientific notation (1.5e10, 2.0e-3, 1e6), and they evaluate to the value you'd expect. Before this change, writing 1.0e30 didn't just fail; it failed with a baffling Function 'e30' not found, sending you hunting for a typo in a function that doesn't exist. A real downstream consumer (rcce2) hit exactly this on Local huge# = 1.0e30 and had to rewrite every constant as a plain decimal and record the gotcha. Exponent literals are table stakes for a language; this closes the gap and makes more legacy/standard BASIC code drop-in compatible.

Technical summary

The value pipeline was already exponent-correct: FloatConstNode converts its token via atof() (parser.cpp:1106, exprnode.cpp:385), and atof("1.5e10") / atof("1e30") parse correctly. The entire gap was in tokenization: toker.cpp's numeric paths scanned only mantissa digits, so 1.5e10 lexed as FLOATCONST(1.5) + IDENT(e10) and the parser read e10 as a call.
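The split described above can be reproduced in isolation. The following is a minimal sketch, not the code in toker.cpp: scanMantissaOnly and oldSplit are hypothetical names that only imitate a digits-and-dot scan, while std::atof is the standard library call the summary refers to.

```cpp
#include <cctype>
#include <cstddef>
#include <cstdlib>
#include <string>

// Hypothetical stand-in for the old numeric scan: digits, then at most one
// '.', then more digits. It never looks at an exponent suffix.
std::size_t scanMantissaOnly(const std::string& s) {
    std::size_t i = 0;
    while (i < s.size() && std::isdigit(static_cast<unsigned char>(s[i]))) ++i;
    if (i < s.size() && s[i] == '.') {
        ++i;
        while (i < s.size() && std::isdigit(static_cast<unsigned char>(s[i]))) ++i;
    }
    return i;
}

// What a mantissa-only lexer hands to the parser: a number plus leftovers.
struct OldSplit {
    std::string number;  // text that became the numeric token
    std::string rest;    // text that was lexed next, as an identifier
};

OldSplit oldSplit(const std::string& src) {
    const std::size_t n = scanMantissaOnly(src);
    return {src.substr(0, n), src.substr(n)};
}

// The value side needs no help: atof already understands exponents.
double literalValue(const std::string& token) {
    return std::atof(token.c_str());
}
```

With these definitions, oldSplit("1.5e10") yields the pieces "1.5" and "e10", whereas literalValue("1.5e10") returns the full value once the whole literal reaches it as a single token.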

  • src/blitzrc/compiler/toker.cpp: add a small scanExp helper that consumes an optional [eE][+-]?<digits> suffix, applied to all three numeric paths (.-leading, digit-with-dot, digit-no-dot). A bare-integer mantissa with an exponent is promoted to FLOATCONST (1e6 is a float, not an int).
  • Conservative by construction: the e/E is consumed only when a full [eE][+-]?<digit> follows. 1.0e, 1.0eq, 1.0e+x keep their current behavior (1.0 + a separate identifier). No legacy program can depend on the old tokenization, because every newly-accepted form was previously a hard compile error.
  • No parser/AST/codegen/runtime change. macOS unaffected (front-end only).
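The conservative suffix rule and the int-to-float promotion can be sketched as follows. This illustrates the behavior listed above under stated assumptions: the signature of scanExp, the NumKind and NumToken types, and the scanNumber wrapper are hypothetical and may not match toker.cpp.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Returns the index one past a complete [eE][+-]?<digits> suffix starting at
// pos, or pos unchanged when no complete suffix is present.
std::size_t scanExp(const std::string& line, std::size_t pos) {
    std::size_t i = pos;
    if (i >= line.size() || (line[i] != 'e' && line[i] != 'E')) return pos;
    ++i;
    if (i < line.size() && (line[i] == '+' || line[i] == '-')) ++i;
    // Commit only if at least one digit follows; otherwise give the 'e' back.
    if (i >= line.size() || !std::isdigit(static_cast<unsigned char>(line[i]))) return pos;
    while (i < line.size() && std::isdigit(static_cast<unsigned char>(line[i]))) ++i;
    return i;
}

enum class NumKind { IntConst, FloatConst };

struct NumToken {
    NumKind kind;
    std::size_t length;  // number of characters consumed from pos
};

// Assumes line[pos] starts a numeric literal (a digit, or '.' before a digit).
NumToken scanNumber(const std::string& line, std::size_t pos) {
    std::size_t i = pos;
    bool sawDot = false;
    while (i < line.size() && std::isdigit(static_cast<unsigned char>(line[i]))) ++i;
    if (i < line.size() && line[i] == '.') {
        sawDot = true;
        ++i;
        while (i < line.size() && std::isdigit(static_cast<unsigned char>(line[i]))) ++i;
    }
    const std::size_t afterExp = scanExp(line, i);
    const bool sawExp = afterExp != i;
    // A dot or an exponent makes the literal a float, so 1e6 is promoted.
    return {(sawDot || sawExp) ? NumKind::FloatConst : NumKind::IntConst,
            afterExp - pos};
}
```

Under this sketch, scanNumber("1e6", 0) reports a 3-character FloatConst, while scanNumber("1.5eq", 0) stops after 1.5 and leaves eq for the identifier path, which matches the guard cases in this PR.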

No breaking changes.

Acceptance criteria & results

  • 1.5e3=1500, 2.0e-3=0.002, 1.0e+2=100, 1e6=1000000, .5e2=50 — pinned by Assert(FloatClose(...)) in tests/MathsTest.bb; blitzcc -t tests/MathsTest.bb passes (all 5 new blocks).
  • Conservative guard: 1.5eq → Function 'eq' not found, 1.5e+q → Function 'e' not found (the e is not swallowed); 1.5e10 compiles. Verified at the CLI and pinned in a Test block.
  • Formerly-failing Local huge# = 1.0e30 now compiles (exit 0).
  • Full test.bat suite green ("Tests passed").
  • Corpus sweep: 352 files, 325 compiled, 27 known-fail, 0 new-fail.
  • lang_ref_basicdatatypes.html documents exponent notation with examples.
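The FloatClose assertions suit the small values listed above, but the test commit notes that an absolute tolerance is meaningless at 1e30. One common alternative is a relative comparison, sketched here. closeRelative and its default tolerance are hypothetical and are not part of the test suite; the sketch assumes 32-bit IEEE floats.

```cpp
#include <algorithm>
#include <cmath>

// Relative closeness: the allowed error scales with the larger magnitude.
// Near 1e30, adjacent 32-bit floats are roughly 7.6e22 apart, so any small
// fixed absolute tolerance only passes when the two values are bit-identical.
bool closeRelative(float a, float b, float relTol = 1e-5f) {
    const float scale = std::max(std::fabs(a), std::fabs(b));
    return std::fabs(a - b) <= relTol * scale;
}
```

A check such as closeRelative(huge, 1.0e30f) then behaves the same way at 1e30 as at 1e3, which is the property a large-magnitude assertion needs.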

Trade-offs / deferred

  • Promoted the no-dot form 1e30 to float deliberately (the common case; leaving it out would be a second sharp edge). Safe: an identifier can't start with a digit, so 1e30 was always INTCONST(1)+IDENT(e30) = a parse error before.
  • Out of scope: hex/binary exponents, float suffixes, the Release/Recast/Reference contextual-keyword work (still the top legacy-compat backlog item).

🤖 Generated with Claude Code

CoreyRDean and others added 2 commits May 30, 2026 04:39
Writing a float in exponent form (1.5e10, 2.0e-3, 1e6) did not produce
the expected value. The tokenizer scanned only mantissa digits, so 1.5e10
lexed as FLOATCONST(1.5) followed by IDENT(e10) -- which the parser then
read as a function call, yielding the misdirecting "Function 'e10' not
found". A real downstream consumer hit this on `Local huge# = 1.0e30` and
had to rewrite every constant as a plain decimal.

The value pipeline was already exponent-correct (FloatConstNode converts
via atof, which parses "1.5e10"/"1e30" fine); the gap was purely
tokenization. Scan an optional [eE][+-]?<digits> suffix on all three
numeric paths and emit FLOATCONST. A bare-integer mantissa with an
exponent is promoted to float (1e6 is a float, not an int).

The scan is conservative: the e/E is consumed only when a full
[eE][+-]?<digit> follows, so 1.0e / 1.0eq / 1.0e+x keep their existing
behavior (float 1.0 followed by a separate identifier) -- no legacy
program can depend on the old tokenization because every newly-accepted
form was previously a hard compile error.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Add acceptance tests to MathsTest.bb (written clean-room from the
criteria) covering the canonical forms (1.5e3, 2.0e-3, 1.0e+2), the
risky no-dot promotion (1e6) and leading-dot (.5e2) paths, exponents in
arithmetic, a large-magnitude case (asserted by magnitude/round-trip
since FloatClose's absolute tolerance is meaningless at 1e30), and a
conservative-guard case proving a bare e is not swallowed.

Document exponent notation on the float-literals reference page, and
record the working note for the change.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@CoreyRDean CoreyRDean requested a review from a team as a code owner May 30, 2026 09:41
@CoreyRDean CoreyRDean merged commit 6b01e40 into develop May 30, 2026
4 checks passed
@CoreyRDean CoreyRDean deleted the feat/scientific-notation-float-literals branch May 30, 2026 09:46