CUDA: add bf16 and i32 to getrows #14529

Merged: 1 commit, Jul 7, 2025

@am17an (Collaborator) commented Jul 4, 2025

Just adds the missing case statements for bf16 and i32 in CUDA getrows.
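For readers unfamiliar with the pattern, here is a minimal host-side C++ sketch of what "adding a case" to a type dispatch for a row-gather operation can look like. It is not the code from this PR: `SketchType`, `bf16_t`, `to_float`, `get_rows_impl`, and `get_rows` are hypothetical names made up for illustration, and the real change lives in the CUDA backend, not in plain C++. The only fact assumed about bf16 is that it stores the upper 16 bits of an IEEE-754 float32.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <stdexcept>
#include <vector>

// Hypothetical type tags for this sketch; not ggml's actual enum.
enum class SketchType { F32, BF16, I32 };

// bf16 keeps the upper 16 bits of a float32 bit pattern.
struct bf16_t { uint16_t bits; };

static float to_float(float v)   { return v; }
static float to_float(int32_t v) { return static_cast<float>(v); }
static float to_float(bf16_t v) {
    // Widen to 32 bits and shift the payload into the high half.
    uint32_t u = static_cast<uint32_t>(v.bits) << 16;
    float f;
    std::memcpy(&f, &u, sizeof f);
    return f;
}

// Gather the selected rows of a row-major matrix, converting each element to float.
template <typename T>
static void get_rows_impl(const void * src, const std::vector<int32_t> & rows,
                          size_t ncols, std::vector<float> & dst) {
    const T * s = static_cast<const T *>(src);
    dst.clear();
    for (int32_t r : rows) {
        assert(r >= 0);  // row indices must be non-negative
        for (size_t c = 0; c < ncols; ++c) {
            dst.push_back(to_float(s[static_cast<size_t>(r) * ncols + c]));
        }
    }
}

// Type dispatch: supporting a new source type means adding one case here.
static void get_rows(SketchType type, const void * src,
                     const std::vector<int32_t> & rows, size_t ncols,
                     std::vector<float> & dst) {
    switch (type) {
        case SketchType::F32:
            get_rows_impl<float>(src, rows, ncols, dst);
            break;
        case SketchType::BF16:  // illustrative "newly added" case
            get_rows_impl<bf16_t>(src, rows, ncols, dst);
            break;
        case SketchType::I32:   // illustrative "newly added" case
            get_rows_impl<int32_t>(src, rows, ncols, dst);
            break;
        default:
            throw std::invalid_argument("get_rows: unsupported source type");
    }
}
```

The point of the sketch is that the templated implementation already handles any element type with a `to_float` overload, so a type is unsupported only because the switch has no case that routes to it.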

@am17an requested a review from JohannesGaessler on Jul 4, 2025 08:53
@github-actions bot added the labels Nvidia GPU (issues specific to Nvidia GPUs) and ggml (changes relating to the ggml tensor library for machine learning) on Jul 4, 2025
@am17an merged commit b9c3eef into ggml-org:master on Jul 7, 2025
48 checks passed
@am17an deleted the cuda_bf16_i32_get_rows branch on Jul 7, 2025 13:45
gabe-l-hart added a commit to gabe-l-hart/llama.cpp that referenced this pull request Jul 7, 2025
* origin/master:
CUDA: add bf16 and i32 to getrows (ggml-org#14529)
vulkan: increase LOAD_VEC_A to 8 (IQ1/IQ2) or 4 (IQ3) (ggml-org#14485)
vulkan: fix rms_norm+mul fusion (ggml-org#14545)
vulkan: Handle updated FA dim2/3 definition (ggml-org#14518)
server : fix assistant prefilling when content is an array (ggml-org#14360)
opencl: add GELU_ERF (ggml-org#14476)
eval-callback : check for empty input (ggml-org#14539)
test-backend-ops: add support for specifying output format (ggml-org#14368)
metal : disable fast math in all quantize kernels (ggml-org#14528)
batch : add optional for sequential equal split (ggml-org#14511)
graph : prepare for 4D mask (ggml-org#14515)
batch : add n_used count (ggml-org#14512)
CANN: Replace aclrtMemsetSync with aclnnInplaceZero operator (ggml-org#14002)
ggml : implement GEGLU_ERF and GEGLU_QUICK ops (ggml-org#14445)
Labels: ggml (changes relating to the ggml tensor library for machine learning), Nvidia GPU (issues specific to Nvidia GPUs)
2 participants