Add instruction support for Curve25519 armv7-M optimization tasks#341

Draft
willieyz wants to merge 10 commits into main from add-support-m7-inst

Conversation

@willieyz
Collaborator

@willieyz willieyz commented Oct 9, 2025

  • This PR adds instruction support for Curve25519 armv7-M optimization tasks.
  • The added instructions are as follows:
    • vldm_internal
    • vmov_gpr_dual

@willieyz willieyz force-pushed the add-support-m7-inst branch from 03545b5 to aa81ea6 Compare October 9, 2025 02:55
@willieyz willieyz changed the title armv7-m: add support for vldm_internal Add instruction support for Curve25519 optimization tasks Oct 9, 2025
@willieyz willieyz changed the title Add instruction support for Curve25519 optimization tasks Add instruction support for Curve25519 armv7-M optimization tasks Oct 9, 2025
@willieyz willieyz force-pushed the add-support-m7-inst branch 7 times, most recently from 07df989 to 8144d02 Compare October 14, 2025 06:26
@willieyz willieyz force-pushed the add-support-m7-inst branch 5 times, most recently from 40e56e1 to c545b7f Compare October 22, 2025 03:38
@willieyz willieyz force-pushed the add-support-m7-inst branch 2 times, most recently from 85002ba to 7258b74 Compare October 22, 2025 06:18
Signed-off-by: willieyz <willie.zhao@chelpis.com>
Signed-off-by: willieyz <willie.zhao@chelpis.com>
- This commit adds instruction support for `umull`
- `umull`
  - (reference from `smull` and
    cortex-m85 SWOG: Divide and multiply instructions)
  - latency: 2
  - inverse throughput: 1
  - ExecutionUnit: MAC (Multiply Accumulate)

Signed-off-by: willieyz <willie.zhao@chelpis.com>
- This commit adds instruction support for `umaal`

- `umaal`
  - (reference from PR #160 and
    cortex-m85 SWOG: Divide and multiply instructions)
  - latency: 2
  - inverse throughput: 1
  - ExecutionUnit: MAC (Multiply Accumulate)

Signed-off-by: willieyz <willie.zhao@chelpis.com>
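For readers unfamiliar with the two long-multiply instructions above, here is a minimal reference model of their architectural semantics in plain Python. It is only an illustrative sketch: the function names and the `MASK32` constant are hypothetical and are not part of this PR or of Slothy. It shows why `umaal` is attractive for multi-precision arithmetic such as Curve25519 field multiplication: the sum `Rn * Rm + RdLo + RdHi` always fits in 64 bits, so no carry is lost.

```python
from typing import Tuple

MASK32 = 0xFFFFFFFF  # hypothetical helper constant: 32-bit register mask


def umull(rn: int, rm: int) -> Tuple[int, int]:
    """Model of UMULL: return (RdLo, RdHi) of the unsigned 32x32 -> 64-bit product."""
    product = (rn & MASK32) * (rm & MASK32)
    return product & MASK32, product >> 32


def umaal(rdlo: int, rdhi: int, rn: int, rm: int) -> Tuple[int, int]:
    """Model of UMAAL: return (RdLo, RdHi) of Rn * Rm + RdLo + RdHi, all unsigned."""
    total = (rn & MASK32) * (rm & MASK32) + (rdlo & MASK32) + (rdhi & MASK32)
    return total & MASK32, total >> 32


# Worst case: all four inputs are 2**32 - 1. The sum is exactly 2**64 - 1,
# so the result still fits in the RdHi:RdLo register pair.
lo, hi = umaal(MASK32, MASK32, MASK32, MASK32)
```

In the worst case both result words come out as `0xFFFFFFFF`, which is why a chain of `umaal` instructions can accumulate partial products without separate carry handling.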
- This commit adds instruction support for `movs_imm`

- `movs_imm`

  - (reference from cortex-m85 SWOG: Move and shift instructions)
  - latency: 1
  - inverse throughput: 1
  - ExecutionUnit: ALU (reference from `movw_imm` in cortex-m7)
Signed-off-by: willieyz <willie.zhao@chelpis.com>
This commit adds instruction support for the mov (register) T1 variant
- mov (reference from cortex-m85 SWOG: Move and shift instructions)
  - latency: 1
  - inverse throughput: 1
  - ExecutionUnit: ALU

Signed-off-by: willieyz <willie.zhao@chelpis.com>
- This commit adds instruction support for mov (immediate).

- mov (reference from cortex-m85 SWOG: Move and shift instructions)
  - latency: 1
  - inverse throughput: 1
  - ExecutionUnit: ALU

Signed-off-by: willieyz <willie.zhao@chelpis.com>
- This commit adds instruction support for muls.
- muls (reference from cortex-m85 SWOG: Divide and multiply instructions)
  - latency: 2
  - inverse throughput: 1
  - ExecutionUnit: MAC

Signed-off-by: willieyz <willie.zhao@chelpis.com>
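The timing figures quoted in the commit messages above can be collected in one place. The sketch below is a self-contained illustration only: the `Timing` class, the `TIMINGS` table and `earliest_issue` are hypothetical names and do not reflect how Slothy's Cortex-M7 model stores these values. It records latency, inverse throughput and execution unit per instruction, and computes the earliest cycle at which a dependent instruction could consume a result.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Timing:
    latency: int             # cycles until the result is available
    inverse_throughput: int  # cycles before the unit accepts another instruction
    unit: str                # execution unit named in the commit message


# Values copied from the commit messages in this PR.
TIMINGS = {
    "umull": Timing(2, 1, "MAC"),
    "umaal": Timing(2, 1, "MAC"),
    "movs_imm": Timing(1, 1, "ALU"),
    "mov": Timing(1, 1, "ALU"),
    "muls": Timing(2, 1, "MAC"),
}


def earliest_issue(producer: str, producer_cycle: int) -> int:
    """Earliest cycle at which a consumer of `producer`'s result can issue."""
    return producer_cycle + TIMINGS[producer].latency
```

For example, a consumer of a `umaal` issued in cycle 0 can issue in cycle 2 at the earliest, while independent `umaal` instructions can issue back to back because the inverse throughput is 1.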
…ghput

- This commit resolves the multiple-match error for `vmov_gpr2_dual`
  by removing the entry with the comment: # actually not, just placeholder.

- vmov_gpr2_dual
  - latency: 1
  - inverse_throughput: 1
  - Dual issue: 11
Signed-off-by: willieyz <willie.zhao@chelpis.com>
- This commit removes the restriction in the `make()` function of `ldr`,
  `ldr_with_imm`, `ldrb_with_imm` and `ldrh_with_imm`, namely the line:

  `obj.args_in_out_different = [(0, 0)] # Can't have Rd==Ra`.

- This restriction makes it harder for Slothy to find a solution
  in scenarios where the number of available registers is limited.

Signed-off-by: willieyz <willie.zhao@chelpis.com>
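The effect of the removed restriction can be illustrated with a toy model. The sketch assumes, as the inline comment `Can't have Rd==Ra` suggests, that the `(0, 0)` entry forbids the load's destination register from being the same as its address register. The function `candidate_dest_regs` and its parameters are hypothetical and are not Slothy code. It lists the registers a load could write to when the address register is no longer needed afterwards.

```python
from typing import Iterable, List


def candidate_dest_regs(
    free_regs: Iterable[str],
    addr_reg: str,
    addr_dead_after: bool,
    forbid_same: bool,
) -> List[str]:
    """Registers usable as the destination Rd of `ldr Rd, [Ra, #imm]`.

    free_regs:       registers holding no live value at this point
    addr_reg:        the address register Ra
    addr_dead_after: True if Ra is not read again after the load
    forbid_same:     True models the old Rd != Ra restriction
    """
    candidates = set(free_regs)
    # Without the restriction, a dying address register can be reused as Rd.
    if addr_dead_after and not forbid_same:
        candidates.add(addr_reg)
    return sorted(candidates)


# Under full register pressure (no free registers) the old restriction
# leaves no destination at all, while the relaxed model can reuse r0.
with_restriction = candidate_dest_regs([], "r0", True, forbid_same=True)
without_restriction = candidate_dest_regs([], "r0", True, forbid_same=False)
```

Here `with_restriction` is empty and `without_restriction` contains only `r0`, which matches the scenario described in the commit message: with few registers available, allowing `Rd == Ra` can be the difference between a feasible and an infeasible allocation.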
@willieyz willieyz force-pushed the add-support-m7-inst branch from 7258b74 to 020bbe9 Compare October 31, 2025 04:03
