Pretraining Decision Transformers with Reward Prediction for In-Context Multi-task Structured Bandit Learning Paper • 2406.05064 • Published Jun 7, 2024
Multi-Objective Alignment of Large Language Models Through Hypervolume Maximization Paper • 2412.05469 • Published Dec 6, 2024
Offline RL by Reward-Weighted Fine-Tuning for Conversation Optimization Paper • 2506.06964 • Published Jun 8, 2025
From Selection to Generation: A Survey of LLM-based Active Learning Paper • 2502.11767 • Published Feb 17, 2025 • 2
A Survey on Long-Video Storytelling Generation: Architectures, Consistency, and Cinematic Quality Paper • 2507.07202 • Published Jul 9, 2025 • 25
A Personalized Conversational Benchmark: Towards Simulating Personalized Conversations Paper • 2505.14106 • Published May 20, 2025
MLLM as a UI Judge: Benchmarking Multimodal LLMs for Predicting Human Perception of User Interfaces Paper • 2510.08783 • Published Oct 9, 2025 • 5
StreamGaze: Gaze-Guided Temporal Reasoning and Proactive Understanding in Streaming Videos Paper • 2512.01707 • Published Dec 1, 2025 • 8
InfinityStory: Unlimited Video Generation with World Consistency and Character-Aware Shot Transitions Paper • 2603.03646 • Published Mar 2026 • 8
Human-Aligned MLLM Judges for Fine-Grained Image Editing Evaluation: A Benchmark, Framework, and Analysis Paper • 2602.13028 • Published Feb 13, 2026
Agentic Planning with Reasoning for Image Styling via Offline RL Paper • 2603.07148 • Published Mar 2026 • 3