ggml : implement GEGLU_ERF and GEGLU_QUICK ops (llama/14445) f798922 Sigbjørn Skjæret committed on Jul 3, 2025
vulkan: support mixed/deepseekR1 FA head sizes (llama/14509) 90cefa0 jeffbolznv committed on Jul 3, 2025
Fix conditional enabling following arch checks for ggml-sycl (llama/14504) 1f15602 Nicolò Scipione committed on Jul 3, 2025
CUDA: add dynamic shared mem to softmax, refactor general usage (llama/14497) 8e1f56c am17an committed on Jul 2, 2025
CUDA: broadcasting for FlashAttention mask (llama/14500) 47e02a8 JohannesGaessler committed on Jul 2, 2025
vulkan: support softmax/FA batch and broadcast (llama/14449) f6b0b76 jeffbolznv committed on Jul 1, 2025
ggml : support bcast ggml_soft_max_ext, ggml_flash_attn_ext (llama/14435) ebacb3e ggerganov committed on Jul 12, 2025
opencl : fix possible buffer overflow in dump_tensor (llama/14490) deb934d jeffzhou2000 committed on Jul 2, 2025
ci : disable fast-math for Metal GHA CI (llama/14478) ec4b1b3 ggerganov committed on Jul 1, 2025
CANN: update aclnnGroupedMatmulV2 to aclnnGroupedMatmulV3 (llama/14411) d8d5b0b Chenguang Li committed on Jul 1, 2025
vulkan: Split large mul_mat_id to fit in shared memory (llama/14451) bf678f0 jeffbolznv committed on Jul 1, 2025
vulkan : implement bilinear interpolation for ggml_upscale/ggml_interpolate (ggml/1291) 666e65b Acly committed on Jul 3, 2025
ggml : add version function to get lib version (ggml/1286) 880f633 danbev ggerganov committed on Jul 2, 2025
server : add dtw.params for v3-large-turbo (#3307) 1250fd1 accessiblepixel committed on Jul 7, 2025
feat: support vad for addon.node (#3301) f795870 Lin Xiaodong linxiaodong committed on Jul 2, 2025
metal : disable fast-math for some cpy kernels (llama/14460) 9d1185a ggerganov committed on Jun 30, 2025
cmake : Remove redundant include path in CMakeLists.txt (llama/14452) 6b59b68 xiaobing318 committed on Jun 30, 2025
scripts : make the shell scripts cross-platform (llama/14341) 9de52c8 Vedran Miletić committed on Jun 30, 2025
ggml : fix unmerged GGML_FPxx_TO_FPxx refactoring (llama/14443) f7995cb Sigbjørn Skjæret committed on Jun 29, 2025
ggml : implement REGLU/GEGLU/SWIGLU ops (llama/14158) add5c0f Sigbjørn Skjæret ggerganov OccamRazor qnixsynapse jeffbolznv committed on Jun 29, 2025
vulkan: Add fusion support for RMS_NORM+MUL (llama/14366) 737f12d jeffbolznv slaren committed on Jun 29, 2025
CUDA: add bf16 and f32 support to cublas_mul_mat_batched (llama/14361) c7936d3 am17an committed on Jun 28, 2025
vulkan: handle noncontig in the final case of ggml_vk_get_cpy_pipeline (llama/14378) 1c3b94c jeffbolznv committed on Jun 28, 2025
vulkan: lock accesses of pinned_memory vector (llama/14333) 59dca4f jeffbolznv committed on Jun 28, 2025
cmake: regen vulkan shaders when shaders-gen sources change (llama/14398) 7988638 bandoti committed on Jun 26, 2025
metal : add special-case mat-vec mul for ne00 == 4 (llama/14385) 724622d ggerganov committed on Jun 26, 2025
metal : batch rows copy in a single threadgroup (llama/14384) b4ff704 ggerganov committed on Jun 26, 2025
musa: enable fp16 mma (all) and cublas on qy2 (llama/13842) e35329b yeahdongcn JohannesGaessler committed on Jun 26, 2025
ggml-cpu: enable IBM NNPA Vector Intrinsics (llama/14317) fea8f94 taronaeo slaren committed on Jun 25, 2025