TPLA: Tensor Parallel Latent Attention for Efficient Disaggregated Prefill & Decode Inference Paper • 2508.15881 • Published Aug 21, 2025 • 10
LooGLE v2: Are LLMs Ready for Real World Long Dependency Challenges? Paper • 2510.22548 • Published Oct 26, 2025 • 1
HISA: Efficient Hierarchical Indexing for Fine-Grained Sparse Attention Paper • 2603.28458 • Published 3 days ago • 21