Skip to content

feat: Allow identifiers to begin with _#7243

Open
jbencin-stacks wants to merge 18 commits into
stacks-network:pox-wf-integrationfrom
jbencin-stacks:feat/underscore-prefix
Open

feat: Allow identifiers to begin with _#7243
jbencin-stacks wants to merge 18 commits into
stacks-network:pox-wf-integrationfrom
jbencin-stacks:feat/underscore-prefix

Conversation

@jbencin-stacks
Copy link
Copy Markdown
Contributor

Description

Allow all Clarity identifiers to start with _ as of Clarity 6.

Additionally, allow bare _ name in let and match bindings, which discards the result of the expression and doesn't create a binding that can be referenced later

This will allow users to signal to the Clarinet linter that an identifier is intentionally unused

Applicable issues

Additional info (benefits, drawbacks, caveats)

Checklist

  • Test coverage for new or modified code paths
  • For new Clarity features or consensus changes, add property tests (see docs/property-testing.md)
  • Changelog fragment(s) or "no changelog" label added (see changelog.d/README.md)
  • Required documentation changes (e.g., rpc/openapi.yaml for RPC endpoints, event-dispatcher.md for new events)
  • New clarity functions have corresponding PR in clarity-benchmarking repo

SIP-04x relaxes the `ClarityName` rules so that identifiers may begin with
`_` (and the bare `_` is a valid identifier). The relaxation is wire-level
unconditional but the language-level rule is gated at `ClarityVersion::Clarity6`
via a new AST pass.

* `clarity-types/src/representations.rs` — add a fourth alternation arm to
  `CLARITY_NAME_REGEX_STRING` for `^_([a-zA-Z0-9]|[-_!?+<>=/*])*$`. Tests in
  `clarity-types/src/tests/representations.rs` and the matching proptests in
  `clarity/src/vm/tests/representations.rs` are extended for the new arm and
  the bare-`_` case.
* `clarity/src/vm/ast/parser/v2/lexer/mod.rs` — accept `_` as an identifier
  first-character.
* `clarity/src/vm/ast/underscore_checker.rs` (new) — `BuildASTPass` that
  walks `pre_expressions` and rejects atoms / trait references / field
  identifiers beginning with `_` when `ClarityVersion < Clarity6`. Wired
  into `inner_build_ast` between the depth checks and `ExpressionIdentifier`.
* `clarity/src/vm/ast/errors.rs` — new `UnderscoreIdentifierNotAllowed(String)`
  variant with a precise diagnostic.
* `stackslib/src/chainstate/tests/parse_tests.rs` — exhaustive
  `variant_coverage_report` arm for the new variant, marked `Ignored`
  because it is covered by the clarity-side unit tests rather than by a
  consensus snapshot.
Per SIP-04x: the bare identifier `_` is a discard pattern in `let` and `match`
binding positions. The bound expression is still evaluated (preserving the
short-circuit effects of `try!`/`unwrap!`) but the name is not placed in scope,
and repeated `_` bindings in the same form do not raise `NameAlreadyUsed`.

Both the runtime and the type checker treat `_` as discard only when the
contract's `ClarityVersion >= Clarity6`. Memory and cost accounting are
preserved: a discard binding charges the same as a non-discard one, avoiding
any gas-side-channel asymmetry. The value drops naturally at scope exit
because it is never inserted into `inner_context.variables`.

* `clarity/src/vm/functions/mod.rs` — `special_let` gates the conflict checks
  and the scope insertion on `!is_discard`.
* `clarity/src/vm/functions/options.rs` — `eval_with_new_binding` (used by
  `match` opt/resp arms) gates the same way.
* `clarity/src/vm/analysis/type_checker/v2_1/natives/mod.rs` — `check_special_let`
  skips `check_name_used`/`lookup_variable_type` and the `add_variable_type`
  for discards while still type-checking the bound expression.
* `clarity/src/vm/analysis/type_checker/v2_1/natives/options.rs` — analyzer
  `eval_with_new_binding` gates similarly; new `use crate::vm::ClarityVersion`.
* `clarity/src/vm/tests/simple_apply_eval.rs` — five end-to-end tests:
  - `test_let_discard_bare_underscore`: multiple `_` in one `let` form
    coexist without `NameAlreadyUsed`.
  - `test_let_discard_underscore_not_referenceable`: a `_` binding does
    not place the name in scope; the body referring to `_` fails as
    unbound.
  - `test_let_underscore_prefix_is_regular_binding`: `_admin` (the
    leading-`_` convention from the SIP example) binds normally and is
    readable in the body — the leading `_` is convention only.
  - `test_match_opt_discard_bare_underscore` / `_resp_…`: `_` discard
    semantics extend to `match` on `(some …)` and `(ok …)/(err …)`.
* `changelog.d/underscore-identifiers.added` — release note.
The dedicated `^_(...)$` alternation arm added in bca48790fe duplicated the
body character class from the letter arm. Since the body class `[-_!?+<>=/*]`
already accepts `_`, the cleaner factoring is to widen the leading character
class from `[a-zA-Z]` to `[a-zA-Z_]` and drop the second arm entirely. The
bare-`_` case still works because the body quantifier is `*` (zero or more).

Old:
  ^[a-zA-Z]([a-zA-Z0-9]|[-_!?+<>=/*])*$|^_([a-zA-Z0-9]|[-_!?+<>=/*])*$|^[-+=/*]$|^[<>]=?$
New:
  ^[a-zA-Z_]([a-zA-Z0-9]|[-_!?+<>=/*])*$|^[-+=/*]$|^[<>]=?$

The matched language is identical — purely a regex simplification. Tests
in `clarity-types/src/tests/representations.rs` are unchanged; the
proptest strategies in `clarity/src/vm/tests/representations.rs` collapse
their two identifier branches into one and the `assert_regex_unchanged`
literals are updated.
While auditing coverage on `underscore_checker.rs` I noticed two gaps:

1. The lexer dispatch for `<` only enters `read_trait_identifier` when
   the next char `is_ascii_alphabetic()`. So `<_foo>` was being lexed as
   three tokens (`Less`, the atom `_foo`, `Greater`) rather than as one
   `Token::TraitIdent("_foo")`. The pre-Clarity-6 rejection test for
   `<_foo>` was passing for the wrong reason — the underscore checker
   was catching the `_foo` `Atom` branch, never the `TraitReference`
   branch. Fix: also enter the trait-identifier reader when `<` is
   followed by `_`, matching the relaxation done earlier in the main
   identifier dispatch.

2. Missing coverage in `underscore_checker::tests`:
   - The `Tuple` recursion arm of `check_one` was unexercised. New tests
     `underscore_in_tuple_key_{rejected,accepted}` cover it.
   - The `SugaredFieldIdentifier` and `FieldIdentifier` arms (the
     `.contract.trait` and `'<addr>.contract.trait` desugarings) were
     unexercised. New rejection tests cover both.
   - Symmetric Clarity-6 acceptance tests for `_x` in let bindings,
     `_addr` function arguments, and `<_foo>` trait references were
     missing — added.
   - No test verified the *specific* `ParseErrorKind` emitted. Added
     `rejection_emits_underscore_identifier_not_allowed_kind`, which
     uses `build_ast` (with `error_early=true`) to assert
     `ParseErrorKind::UnderscoreIdentifierNotAllowed("_admin")`.
   - Added `leading_operator_names_unaffected_pre_clarity6` as a
     regression guard: the pass must leave `+`, `<=`, `*` etc. alone.

Test count goes from 10 to 19.
@jbencin-stacks jbencin-stacks changed the base branch from master to pox-wf-integration May 27, 2026 20:39
@jbencin-stacks jbencin-stacks marked this pull request as ready for review May 27, 2026 22:49
Comment thread clarity/src/vm/analysis/type_checker/v2_1/natives/mod.rs Outdated
Comment thread clarity/src/vm/ast/underscore_checker.rs Outdated
// 3) `[<>]=?` — comparison operator name.
pub static ref CLARITY_NAME_REGEX_STRING: String =
"^[a-zA-Z]([a-zA-Z0-9]|[-_!?+<>=/*])*$|^[-+=/*]$|^[<>]=?$".into();
"^[a-zA-Z_]([a-zA-Z0-9]|[-_!?+<>=/*])*$|^[-+=/*]$|^[<>]=?$".into();
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm concerned about changing this without a gate at this level. I think this could cause an accidental hard fork due to its use in wire-level validation. Like could a transaction with a leading _ somewhere get into a block that is accepted by updated nodes and rejected by unupdated nodes, before the epoch transition?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You've got a point here, I can try to tighten this up and make better use the type system to "make invalid states unrepresentable"

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm planning to fix by leaving the current ClarityName alone and creating a new struct, ClarityNameV6, which allows the leading underscore. This should be the safest and most correct way to proceed, as we'll never be able to use a leading _ in the legacy code paths, but will also be a large diff

@brice-stacks Let me know if this sounds right to you

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I think that sounds good. Thanks.

@coveralls
Copy link
Copy Markdown

Coverage Report for CI Build 26543228798

Coverage decreased (-38.4%) to 47.577%

Details

  • Coverage decreased (-38.4%) from the base build.
  • Patch coverage: 179 uncovered changes across 12 files (100 of 279 lines covered, 35.84%).
  • 88607 coverage regressions across 344 files.

Uncovered Changes

Top 10 Files by Coverage Impact Changed Covered %
clarity/src/vm/ast/underscore_checker.rs 171 28 16.37%
clarity/src/vm/functions/options.rs 24 15 62.5%
clarity/src/vm/analysis/type_checker/v2_1/natives/mod.rs 19 14 73.68%
clarity/src/vm/ast/mod.rs 7 2 28.57%
clarity/src/vm/analysis/errors.rs 3 0 0.0%
clarity/src/vm/analysis/type_checker/v2_1/contexts.rs 5 2 40.0%
clarity/src/vm/analysis/type_checker/v2_1/natives/options.rs 8 6 75.0%
clarity/src/vm/ast/errors.rs 2 0 0.0%
clarity/src/vm/functions/define.rs 15 13 86.67%
clarity/src/vm/functions/mod.rs 9 7 77.78%
Total (13 files) 279 100 35.84%

Coverage Regressions

88607 previously-covered lines in 344 files lost coverage.

Top 10 Files by Coverage Loss Lines Losing Coverage Coverage
stackslib/src/chainstate/stacks/db/transactions.rs 8105 7.51%
stackslib/src/chainstate/stacks/db/blocks.rs 4478 35.68%
stackslib/src/chainstate/stacks/transaction.rs 4313 9.87%
stackslib/src/chainstate/burn/db/sortdb.rs 4193 42.89%
stackslib/src/chainstate/stacks/boot/mod.rs 3564 13.75%
stackslib/src/net/chat.rs 3166 27.21%
stackslib/src/chainstate/burn/operations/leader_block_commit.rs 2579 18.81%
stackslib/src/net/db.rs 1960 28.99%
stacks-signer/src/signerdb.rs 1578 35.21%
stackslib/src/chainstate/stacks/index/test/node.rs 1317 0.0%

Coverage Stats

Coverage Status
Relevant Lines: 220788
Covered Lines: 105044
Line Coverage: 47.58%
Coverage Strength: 13061717.97 hits per line

💛 - Coveralls

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Allow variables to start with _

3 participants