Refusal Tokens: A Simple Way to Calibrate Refusals in Large Language Models Paper • 2412.06748 • Published Dec 9, 2024 • 3
DynaGuard: A Dynamic Guardrail Model With User-Defined Policies Paper • 2509.02563 • Published Sep 2, 2025 • 21
Exploiting Sparsity for Long Context Inference: Million Token Contexts on Commodity GPUs Paper • 2502.06766 • Published Feb 10, 2025
FAST: Factorizable Attention for Speeding up Transformers Paper • 2402.07901 • Published Feb 12, 2024 • 3
Why Are Web AI Agents More Vulnerable Than Standalone LLMs? A Security Analysis Paper • 2502.20383 • Published Feb 27, 2025 • 3
Has My System Prompt Been Used? Large Language Model Prompt Membership Inference Paper • 2502.09974 • Published Feb 14, 2025 • 9
Scaling up Test-Time Compute with Latent Reasoning: A Recurrent Depth Approach Paper • 2502.05171 • Published Feb 7, 2025 • 152
LiveBench: A Challenging, Contamination-Free LLM Benchmark Paper • 2406.19314 • Published Jun 27, 2024 • 23