RedHatAI/Meta-Llama-3.1-70B-Instruct-FP8
Text Generation • 71B • 30.5k • 51
OpenSource and AI
SNLP: Layer-Parallel Inference via Structured Newton Corrections
S2D2: Fast Decoding for Diffusion LLMs via Training-Free Self-Speculation