Enhanced Graph Transformer with Serialized Graph Tokens Paper • 2602.09065 • Published 13 days ago • 1
Flexible Entropy Control in RLVR with Gradient-Preserving Perspective Paper • 2602.09782 • Published 11 days ago • 3
IF-Bench: Benchmarking and Enhancing MLLMs for Infrared Images with Generative Visual Prompting Paper • 2512.09663 • Published Dec 10, 2025 • 4
Knowledge-based Visual Question Answer with Multimodal Processing, Retrieval and Filtering Paper • 2510.14605 • Published Oct 16, 2025 • 5
Taming Modality Entanglement in Continual Audio-Visual Segmentation Paper • 2510.17234 • Published Oct 20, 2025 • 5
LiveStar: Live Streaming Assistant for Real-World Online Video Understanding Paper • 2511.05299 • Published Nov 7, 2025 • 2
MR-Align: Meta-Reasoning Informed Factuality Alignment for Large Reasoning Models Paper • 2510.24794 • Published Oct 27, 2025 • 32
R-4B: Incentivizing General-Purpose Auto-Thinking Capability in MLLMs via Bi-Mode Annealing and Reinforce Learning Paper • 2508.21113 • Published Aug 28, 2025 • 110
SVBench: A Benchmark with Temporal Multi-Turn Dialogues for Streaming Video Understanding Paper • 2502.10810 • Published Feb 15, 2025 • 1
Continuous Speculative Decoding for Autoregressive Image Generation Paper • 2411.11925 • Published Nov 18, 2024 • 16
Hunyuan-TurboS: Advancing Large Language Models through Mamba-Transformer Synergy and Adaptive Chain-of-Thought Paper • 2505.15431 • Published May 21, 2025 • 1
Re-ranking Reasoning Context with Tree Search Makes Large Vision-Language Models Stronger Paper • 2506.07785 • Published Jun 9, 2025 • 1
Faster and Better LLMs via Latency-Aware Test-Time Scaling Paper • 2505.19634 • Published May 26, 2025
Draw an Audio: Leveraging Multi-Instruction for Video-to-Audio Synthesis Paper • 2409.06135 • Published Sep 10, 2024 • 16