Tasks Improvement: Enhanced Benchmarking and Task Organization by sharareh-y · Pull Request #24 · AMD-AGI/AgentKernelArena

sharareh-y · 2026-03-06T19:18:00Z

Tasks Improvement: Enhanced Benchmarking and Task Organization

Summary

This PR introduces improvements to the task infrastructure, including the addition of 24 new hip2hip tasks, reorganization of existing tasks, and refactoring of benchmarking logic to run performance tests on all test cases with improved configurability.

Key Changes

1. New Tasks Added (24 hip2hip/gpumode tasks)

Added comprehensive task implementations for:

Activation Functions: GELU, SiLU, Sigmoid, TanH, FusedLeakyReLU
Attention Mechanisms: MultiHeadAttention, ItemQueryAttention, NormalAttention_dot, NormalAttention_embedded_gaussian
Neural Network Components: Feedforward, PositionWiseFeedForward, TransformerFFNLayer, MLP_model, GateGRUSelectionLayer
Normalization Layers: layer_normalization
Loss Functions: CrossEntropyLossLabelSmoothing, KDLoss
Other Operations: Gather, InnerProd, Transpose, SoftmaxModule, SimpleMatmulModule, PositionEmbedder, MaskedLanguageModel

2. Task Organization Improvements

Moved hip2hip tasks to others/ subfolder: Reorganized existing hip2hip tasks (assign_score_withk, ball_query, furthest_point_sample, gather_points, knn, matrix_multiplication, points_in_boxes, roiaware_pool3d, roipoint_pool3d, silu, three_interpolate, three_nn) into a dedicated others/ subfolder for better structure and maintainability.

3. Benchmarking Enhancements

vllm Task Runners Refactoring

Removed PERF_SHAPE_IDX: Eliminated the limitation of benchmarking only a single test case
Benchmark all test cases: Performance tests now run across all test shapes/cases automatically
Configurable iterations: Added WARMUP_ITERATIONS and BENCHMARK_ITERATIONS constants for fine-grained control
Parameter reporting: Enhanced reporting to include test case parameters in performance results
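
A minimal sketch of the benchmarking pattern described above, not the PR's actual runner code. Only the constant names WARMUP_ITERATIONS and BENCHMARK_ITERATIONS come from the PR. Their values, the `benchmark_all_cases` helper, the `TEST_CASES` list, the `run_kernel` stand-in, and the CPU wall-clock timing are assumptions made for illustration; a real GPU runner would also need device synchronization or event-based timing.

```python
import statistics
import time

# Constant names come from the PR; the values are assumed for illustration.
WARMUP_ITERATIONS = 3
BENCHMARK_ITERATIONS = 10

# Hypothetical test cases: each dict holds the parameters for one shape.
TEST_CASES = [
    {"rows": 64, "cols": 64},
    {"rows": 128, "cols": 32},
]


def run_kernel(rows, cols):
    """Stand-in CPU workload (hypothetical); a real runner would launch a kernel."""
    return sum(r * c for r in range(rows) for c in range(cols))


def benchmark_all_cases(fn, cases):
    """Time fn on every test case and keep the parameters next to each result."""
    results = []
    for idx, params in enumerate(cases):
        # Warm-up runs are executed but not timed.
        for _ in range(WARMUP_ITERATIONS):
            fn(**params)
        timings_ms = []
        for _ in range(BENCHMARK_ITERATIONS):
            start = time.perf_counter()
            fn(**params)
            timings_ms.append((time.perf_counter() - start) * 1e3)
        results.append(
            {"case": idx, "params": params, "median_ms": statistics.median(timings_ms)}
        )
    return results


if __name__ == "__main__":
    for r in benchmark_all_cases(run_kernel, TEST_CASES):
        print(f"case {r['case']} {r['params']}: {r['median_ms']:.3f} ms")
```

Looping over every case, instead of indexing one shape as PERF_SHAPE_IDX did, lets each result carry its own parameters into the report.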

hip2hip/others Task Runners Improvements

Run performance on all test cases: Modified task runners to benchmark all test shapes instead of a single one
Improved consistency: Unified benchmarking approach across all hip2hip/others tasks
Better performance visibility: Results now show performance metrics for all test configurations
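
When every test shape is benchmarked, one failing shape need not discard the results of the others (a later commit in this PR mentions handling partial perf failures). The sketch below shows one way that could look. It is an assumption, not the PR's code: `run_perf_on_all_cases`, `fake_bench`, and the report layout are hypothetical names chosen for illustration.

```python
def run_perf_on_all_cases(bench_fn, cases):
    """Benchmark every case; record an error for a failing case instead of aborting."""
    report = []
    for idx, params in enumerate(cases):
        try:
            elapsed_ms = bench_fn(**params)
            report.append({"case": idx, "params": params, "ms": elapsed_ms, "error": None})
        except Exception as exc:  # keep going so the other cases still report
            report.append({"case": idx, "params": params, "ms": None, "error": str(exc)})
    return report


def fake_bench(n):
    """Hypothetical benchmark: returns a made-up time, fails on invalid input."""
    if n <= 0:
        raise ValueError("n must be positive")
    return n / 100.0


if __name__ == "__main__":
    for row in run_perf_on_all_cases(fake_bench, [{"n": 100}, {"n": 0}, {"n": 200}]):
        status = f"{row['ms']:.2f} ms" if row["error"] is None else f"FAILED ({row['error']})"
        print(f"case {row['case']} {row['params']}: {status}")
```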

Testing

New hip2hip tasks validation: Ran task_validator (claude) on all 24 newly added hip2hip/gpumode tasks to verify correctness and proper structure
Multi-test-case evaluation testing: Tested evaluation and plotting functionality on examples from:
- hip2hip/gpumode tasks
- hip2hip/others tasks
- triton2triton/vllm tasks
Verified multi-test-case scenario: Confirmed that evaluation and plotting work correctly when performance benchmarks run on multiple test cases.

irvineoy · 2026-03-12T17:36:50Z

LGTM, thank you!

sharareh-y added 6 commits March 6, 2026 16:38

Add 24 new hip2hip tasks

e21957c

Move other hip2hip tasks into others subfolder

7cacdb4

Refactor vllm task runners to benchmark all test cases with parameter reporting. Remove PERF_SHAPE_IDX and add configurable WARMUP_ITERATIONS/BENCHMARK_ITERATIONS constants.

c998212

Rename and fix issues in hip2hip tasks

b545226

Fix indent issue in hip2hip/gpumode tasks

afafc2d

Modify hip2hip/others task_runners to run performance on all test-cases

2417ab2

sharareh-y requested a review from irvineoy March 6, 2026 19:18

fix(hip2hip): handle partial perf failures and patch Softmax/InnerProd ROCm compile issues

6b8d2a1

irvineoy merged commit 97e1f5a into AMD-AGI:main Mar 12, 2026

irvineoy merged 7 commits into AMD-AGI:main from sharareh-y:sharareh/tasks-improvement
