Fix definition from ENABLE_NO_UNDERSCORE_API CMake config option#41
Open
Fix definition from ENABLE_NO_UNDERSCORE_API CMake config option#41
Conversation
|
Hi @imciner2, Thank you for your interest in contributing to AOCL-BLAS and helping improve the project. However, our CI/CD pipeline has detected a build error on Windows OS related to this PR. We've attached the build log and the steps used during the build process for your reference. Kindly review the logs and update the same pull request with a new commit to address the issue. |
vkirangoud
pushed a commit
to vkirangoud/blis
that referenced
this pull request
Nov 15, 2025
* Bug Fixes in FP32 Kernels: - The current implementation lets m=1 tiny cases inside LPGEMV_TINY loop, but the m=1 GEMV kernel call doesn't have the call to GEMV_M_ONE kernels. Added the m=1 path in LPGEMV_TINY loop by handling the pack A/Pack B/reorder B conditions. - Added BF16 support for BIAS, Matrix-Add and Matrix-Mul for AVX512 F32 main and GEMV kernels - Added BF16 Matrix-Add and Matrix-Mul support for AVX512_256 F32 kernels. - Modified the condition check in FP32 Zero point in AVX512 kernels, and fixed few bugs in Col-major Zero point evaluation. AMD Internal: [ CPUPL - 6748 ] * Bug Fixes in FP32 Kernels: - The current implementation lets m=1 tiny cases inside LPGEMV_TINY loop, but doesn't have the call to GEMV_M_ONE kernels. Added the m=1 path in LPGEMV_TINY loop by handling the pack A/Pack B/reorder B conditions. - Added BF16 support for BIAS, Matrix-Add and Matrix-Mul for AVX512 F32 main and GEMV kernels. - Added BF16 Downscale, BIAS, Matrix-Add and Matrix-Mul support in AVX2 GEMV_N and AVX512_256 GEMV kernels. - Added BF16 Matrix-Add and Matrix-Mul support for AVX512_256 F32 kernels. - Modified the condition check in FP32 Zero point in AVX512 kernels, and fixed few bugs in Col-major Zero point evaluation and instruction usage. AMD Internal: [ CPUPL - 6748 ] * Bug Fixes in FP32 Kernels: - The current implementation lets m=1 tiny cases inside LPGEMV_TINY loop, but doesn't have the call to GEMV_M_ONE kernels. Added the m=1 path in LPGEMV_TINY loop by handling the pack A/Pack B/reorder B conditions. - Added BF16 support for BIAS, Matrix-Add and Matrix-Mul for AVX512 F32 main and GEMV kernels. - Added BF16 Downscale, BIAS, Matrix-Add and Matrix-Mul support in AVX2 GEMV_N and AVX512_256 GEMV kernels. - Added BF16 Matrix-Add and Matrix-Mul support for AVX512_256 F32 kernels. - Modified the condition check in FP32 Zero point in AVX512 kernels, and fixed few bugs in Col-major Zero point evaluation and instruction usage. AMD Internal: [ CPUPL - 6748 ] * Bug Fixes in FP32 Kernels: - The current implementation lets m=1 tiny cases inside LPGEMV_TINY loop, but doesn't have the call to GEMV_M_ONE kernels. Added the m=1 path in LPGEMV_TINY loop by handling the pack A/Pack B/reorder B conditions. - Added BF16 support for BIAS, Matrix-Add and Matrix-Mul for AVX512 F32 main and GEMV kernels. - Added BF16 Downscale, BIAS, Matrix-Add and Matrix-Mul support in AVX2 GEMV_N and AVX512_256 GEMV kernels. - Added BF16 Matrix-Add and Matrix-Mul support for AVX512_256 F32 kernels. - Modified the condition check in FP32 Zero point in AVX512 kernels, and fixed few bugs in Col-major Zero point evaluation and instruction usage. AMD Internal: [ CPUPL - 6748 ] --------- Co-authored-by: VarshaV <varshav2@amd.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
The code guarding the use of underscores uses the define
BLIS_ENABLE_NO_UNDERSCORE_API, but the CMake config option was actually definingENABLE_NO_UNDERSCORE_API, so the config option was never propagating into the code. This updates the CMake build system to define the correct thing.