Using gemv for batched_vec #662

@AntonOresten

batched_vec is currently implemented on top of batched_mul (which calls batched gemm), with only some extra reshapes around it.
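To make the two routes concrete, here is a minimal pure-Python sketch, not the library's code. The helper names (`gemm`, `gemv`, `batched_vec_via_gemm`, `batched_vec_via_gemv`) are hypothetical, and the batch is modelled as a plain list of matrices and vectors rather than a 3-D array. It only illustrates that turning each vector into a one-column matrix, multiplying, and dropping the extra axis gives the same result as a direct matrix-vector product per batch entry.

```python
def gemm(A, B):
    """Matrix product of A (m x k) and B (k x n), both as nested lists."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def gemv(A, x):
    """Matrix-vector product of A (m x k) and x (length k)."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]


def batched_vec_via_gemm(As, xs):
    """Reshape route: vector -> k x 1 matrix, gemm, then drop the extra axis."""
    out = []
    for A, x in zip(As, xs):
        X = [[xi] for xi in x]              # length-k vector as a k x 1 matrix
        C = gemm(A, X)                      # m x 1 result
        out.append([row[0] for row in C])   # back to a length-m vector
    return out


def batched_vec_via_gemv(As, xs):
    """Direct route: one matrix-vector product per batch entry."""
    return [gemv(A, x) for A, x in zip(As, xs)]


# Hypothetical toy batch of two 2 x 2 matrices and two length-2 vectors.
As = [[[1, 2], [3, 4]], [[0, 1], [1, 0]]]
xs = [[1, 1], [2, 3]]

assert batched_vec_via_gemm(As, xs) == batched_vec_via_gemv(As, xs)
```

Both routes do the same arithmetic, so any performance difference comes from which kernel the backend dispatches to, not from the math.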

Some basic benchmarks (on an RTX PRO 6000 Blackwell) suggest batched gemv is sometimes 1-2% faster:

Image Image

and the two converge to similar performance in the limit:

Image

but in some cases is consistently slightly slower:

Image Image

Not sure if this has been considered already. The difference isn't huge, but in the cases where there actually is justification to specifically use gemv, it could be nice to have the option. Maybe this also helps in the backward pass.
