drbh/fp8-weight-only-linear


This is the repository card for drbh/fp8-weight-only-linear, a kernel published on the Hub. It is built for use with the kernels library. This card was automatically generated.

How to use

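# Make sure `kernels` is installed: `pip install -U kernels`
from kernels import get_kernel

# Download the kernel from the Hub and load it as a Python module.
kernel_module = get_kernel("drbh/fp8-weight-only-linear")
ops = kernel_module.ops

# Placeholder call: this card does not document the arguments.
ops(...)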

Available functions

ops
fp8_linear_w8a16
quantize_weight
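
Since this card does not document the signatures of these functions, a cautious first step is to confirm that a loaded module exposes the listed names before calling anything. The sketch below does only that. `missing_exports` and `EXPECTED_EXPORTS` are hypothetical names introduced for illustration and are not part of the kernels library; a stand-in object replaces the real module so the example runs without a GPU or network access.

```python
from types import SimpleNamespace

# Names listed under "Available functions" on this card.
EXPECTED_EXPORTS = ("ops", "fp8_linear_w8a16", "quantize_weight")


def missing_exports(module, expected=EXPECTED_EXPORTS):
    """Return the expected names that `module` does not expose (hypothetical helper)."""
    return [name for name in expected if not hasattr(module, name)]


# Stand-in for a loaded kernel module; it deliberately lacks one function.
fake_module = SimpleNamespace(ops=object(), quantize_weight=lambda w: w)
print(missing_exports(fake_module))  # ['fp8_linear_w8a16']
```

With the real kernel, the same check would presumably be `missing_exports(get_kernel("drbh/fp8-weight-only-linear"))`, which should return an empty list if the card is accurate.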

Benchmarks

No benchmark available yet.

Source code

The source code for this kernel originally comes from https://github.com/drbh/fp8-weight-only-linear and was adapted for compatibility with the kernels library.

Downloads last month: 4

License: MIT

Supported hardware

CUDA compute capabilities: 8.9, 9.0, 10.0, 11.0, 12.0
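
The compute capabilities listed here (8.9, 9.0, 10.0, 11.0, 12.0) can be checked against the local GPU before loading the kernel. The sketch below is a rough pre-flight check, not the card's or the library's own logic. It assumes the usual CUDA binary-compatibility rule that code built for capability X.y also runs on X.z when z >= y; whether this kernel's builds follow that rule is not stated on the card. `is_covered` and `current_device_is_covered` are hypothetical helper names.

```python
# Compute capabilities listed on this card, as (major, minor) pairs.
LISTED_CAPABILITIES = {(8, 9), (9, 0), (10, 0), (11, 0), (12, 0)}


def is_covered(major, minor, listed=LISTED_CAPABILITIES):
    """True if some listed capability has the same major and a minor <= `minor`.

    Assumption: same-major, higher-minor devices can run the listed builds.
    """
    return any(m == major and n <= minor for m, n in listed)


def current_device_is_covered():
    """Check the current CUDA device; return None if it cannot be queried."""
    try:
        import torch  # optional dependency, only needed for this query
    except ImportError:
        return None
    if not torch.cuda.is_available():
        return None
    major, minor = torch.cuda.get_device_capability()
    return is_covered(major, minor)


print(is_covered(8, 9))  # True: listed exactly
print(is_covered(8, 0))  # False: below the lowest listed 8.x capability
```

Calling `current_device_is_covered()` on a machine with PyTorch and a CUDA GPU gives a quick yes/no answer; a `None` result means the device could not be queried.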

GPU: memory

B300: 288GB
B200: 192GB
H200: 141GB
H100: 80GB
H800: 80GB
H20: 96GB
L40s: 48GB
L40: 48GB
L20: 48GB
L4: 24GB
GB10 (DGX Spark): 128GB
RTX PRO 6000 WS: 96GB
RTX PRO 6000 Max-Q: 96GB
RTX PRO 5000: 48GB
RTX PRO 4500 WS: 32GB
RTX PRO 4000: 24GB
RTX PRO 4000 SFF: 24GB
RTX PRO 2000: 16GB
RTX 6000 Ada: 48GB
RTX 5880 Ada: 48GB
RTX 5000 Ada: 32GB
RTX 4500 Ada: 24GB
RTX 4000 Ada: 20GB
RTX 4000 SFF Ada: 20GB
RTX 2000 Ada: 16GB
RTX 5090: 32GB
RTX 5090 D: 32GB
RTX 5090 Mobile: 24GB
RTX 5080: 16GB
RTX 5080 Mobile: 16GB
RTX 5070: 12GB
RTX 5070 Mobile: 8GB
RTX 5070 Ti: 16GB
RTX 5070 Ti Mobile: 12GB
RTX 5060 Ti: 16GB
RTX 5060: 8GB
RTX 5060 Mobile: 8GB
RTX 5050: 8GB
RTX 5050 Mobile: 8GB
RTX 4090: 24GB
RTX 4090D: 24GB
RTX 4090 Mobile: 16GB
RTX 4080 SUPER: 16GB
RTX 4080: 16GB
RTX 4080 Mobile: 12GB
RTX 4070: 12GB
RTX 4070 Mobile: 8GB
RTX 4070 Ti: 12GB
RTX 4070 Super: 12GB
RTX 4070 Ti Super: 16GB
RTX 4060: 8GB
RTX 4060 Ti: 8GB
RTX 4090 Laptop: 16GB
RTX 4080 Laptop: 12GB
RTX 4070 Laptop: 8GB
RTX 4060 Laptop: 8GB
RTX 4050 Laptop: 6GB

OS: Linux

Arch: x86_64