Pull requests: ggml-org/llama.cpp

Pull requests list

llama-quant : overlap compute and write with double buffering
#21507 opened Apr 6, 2026 by nuri-yoo
6 tasks done
[codex] fix LFM2 GGUF conversion fallback
Labels: examples, ggml, python, testing
#21505 opened Apr 6, 2026 by i386
convert : set "add bos" == True for Gemma 4
Labels: python
#21500 opened Apr 6, 2026 by ggerganov
vocab : remove </s> eog token if gemma4
#21492 opened Apr 6, 2026 by aldehir
docs: add hunyuan-ocr gguf, also add test [no ci]
Labels: documentation, examples
#21490 opened Apr 5, 2026 by ngxson
llama-quantize: fix tensor-type logic
#21482 opened Apr 5, 2026 by theo77186
gguf-py: Fix lazy tensor handling for keyword arguments
Labels: python
#21476 opened Apr 5, 2026 by lainon1
CUDA: make cuda graphs props check faster
Labels: ggml, Nvidia GPU
#21472 opened Apr 5, 2026 by am17an
ggml : fix repeat_back assert with non-contiguous gradients
Labels: ggml
#21467 opened Apr 5, 2026 by RealOrko
ggml : add GGML_OP_GATHER for DeepSeek Sparse Attention (DSA) #21149
Labels: ggml, testing
#21458 opened Apr 5, 2026 by LilySu
vulkan: Support GGML_TYPE_NVFP4
Labels: ggml, Vulkan
#21455 opened Apr 5, 2026 by jeffbolznv
metal : add GATED_LINEAR_ATTN op
Labels: Apple Metal, documentation, ggml, testing
#21452 opened Apr 5, 2026 by TheTom
Gemma 4: move some computations to BF16
Labels: examples, ggml, model, Nvidia GPU, python
#21451 opened Apr 5, 2026 by pwilkin Draft
metal: speed up Qwen3-VL image encoding on large images by ~11%
Labels: Apple Metal, ggml
#21443 opened Apr 4, 2026 by Avidanborisov
eagle3: add qwen3.5 4B 9B 35B-A3B support
Labels: examples, model, python, server
#21437 opened Apr 4, 2026 by 36330 Draft
fix(gemma4): handle nullable type arrays
Labels: testing
#21433 opened Apr 4, 2026 by gerstnr
vulkan: Tweak Xe2 warptile configuration
Labels: ggml, Vulkan
#21431 opened Apr 4, 2026 by TheBlueMatt
mtmd: add Gemma 4 audio conformer encoder support
Labels: documentation, examples, ggml, Nvidia GPU, testing
#21421 opened Apr 4, 2026 by stephencox-ict
9 tasks done