Commit History
remove unnecessary output 84f8c68
Unified return format e408c7b
fix cmd 6c8a230
fix cmd a1a7aac
fix build 1b9b118
fix build d937c3a
fix dockerfile path 03ff3a5
add meta 36ff0ea
chore: track binaries with git-lfs aa000f7
chore: track binaries with git-lfs f33d63d
add sync task 46ebeba
Handle negative value in padding (#3389) 6e115ac unverified
Treboko committed
models : update `./models/download-ggml-model.cmd` to allow for tdrz download (#3381) 0b65831 unverified
talk-llama : sync llama.cpp 4321600
sync : ggml a0af6fc
ggml: Add initial WebGPU backend (llama/14521) 4b3da1d
Reese Levine committed
ggml : initial zDNN backend (llama/14975) 6dd510c
common : handle mxfp4 enum fd4c0e1
ggml-quants : fix make_qp_quants NANs and IQ1 assertion errors (llama/15379) a575f57
vulkan: disable spirv-opt for bfloat16 shaders (llama/15352) cf24af7
vulkan: Use larger workgroups for mul_mat_vec when M is small (llama/15355) 054584a
vulkan: support sqrt (llama/15370) e5406c0
Dong Won Kim committed
vulkan: Optimize argsort (llama/15354) 80a188c
vulkan: fuse adds (llama/15252) ad199b1
vulkan: Support mul_mat_id with f32 accumulators (llama/15337) 41a76e6
vulkan: Add missing bounds checking to scalar/coopmat1 mul_mat_id (llama/15334) a6fa78e
OpenCL: add initial FA support (llama/14987) 8ece1ee
opencl: add initial mxfp4 support via mv (llama/15270) 1a0281c
lhez and shawngu-quic committed
vulkan : fix out-of-bounds access in argmax kernel (llama/15342) 78a1865
vulkan : fix compile warnings on macos (llama/15340) e3107ff
ggml: initial IBM zDNN backend (llama/14975) 449e1a4
CUDA: fix negative KV_max values in FA (llama/15321) 6e3a7b6
HIP: Cleanup hipification header (llama/15285) 7cdf9cd
vulkan: perf_logger improvements (llama/15246) d48d508
ggml: fix ggml_conv_1d_dw bug (ggml/1323) 4496862
cuda : fix GGML_CUDA_GRAPHS=OFF (llama/15300) 59c694d
Sigbjørn Skjæret committed
finetune: SGD optimizer, more CLI args (llama/13873) f585fe7
HIP: bump requirement to rocm 6.1 (llama/15296) 58a3802
ggml : update `ggml_rope_multi` (llama/12665) b4896dc
ggml : repack block_iq4_nlx8 (llama/14904) db4407f
CUDA: Optimize `reduce_rows_f32` kernel, leading up to 25x perf improvement on kernel-level and 10% perf increase for Gemma3n (llama/15132) c768824
ggml-rpc: chunk send()/recv() to avoid EINVAL for very large tensors over RPC (macOS & others) (llama/15188) c8284f2
HIP: disable sync warp shuffel operators from clr amd_warp_sync_functions.h (llama/15273) 8fca6dd
sycl: Fix and disable more configurations of mul_mat (llama/15151) 7b868ed
Romain Biessy committed
opencl: allow mixed f16/f32 `add` (llama/15140) 345810b
CUDA cmake: add `-lineinfo` for easier debug (llama/15260) 008e169
CANN: GGML_OP_CPY optimization (llama/15070) 73e90ff
Chenguang Li committed