SafeSwitch: Steering Unsafe LLM Behavior via Internal Activation Signals Paper • 2502.01042 • Published Feb 3, 2025 • 1
Energy-Based Transformers are Scalable Learners and Thinkers Paper • 2507.02092 • Published Jul 2, 2025 • 69
EscapeBench: Towards Advancing Creative Intelligence of Language Model Agents Paper • 2412.13549 • Published Dec 18, 2024
Self-Aligned Reward: Towards Effective and Efficient Reasoners Paper • 2509.05489 • Published Sep 5, 2025 • 1
DRPG (Decompose, Retrieve, Plan, Generate): An Agentic Framework for Academic Rebuttal Paper • 2601.18081 • Published 2 days ago • 5
DRPG_RebuttalAgent Collection https://arxiv.org/pdf/2601.18081 • 4 items • Updated about 18 hours ago
The Law of Knowledge Overshadowing: Towards Understanding, Predicting, and Preventing LLM Hallucination Paper • 2502.16143 • Published Feb 22, 2025 • 6