VulnLLM-R: Specialized Reasoning LLM with Agent Scaffold for Vulnerability Detection Paper • 2512.07533 • Published 21 days ago • 2
SecCodePLT: A Unified Platform for Evaluating the Security of Code GenAI Paper • 2410.11096 • Published Oct 14, 2024 • 13
OWL: Optimized Workforce Learning for General Multi-Agent Assistance in Real-World Task Automation Paper • 2505.23885 • Published May 29
AgentVigil: Generic Black-Box Red-teaming for Indirect Prompt Injection against LLM Agents Paper • 2505.05849 • Published May 9
Scaling Flaws of Verifier-Guided Search in Mathematical Reasoning Paper • 2502.00271 • Published Feb 1 • 1
GLM-4.5: Agentic, Reasoning, and Coding (ARC) Foundation Models Paper • 2508.06471 • Published Aug 8 • 195
Harnessing Uncertainty: Entropy-Modulated Policy Gradients for Long-Horizon LLM Agents Paper • 2509.09265 • Published Sep 11 • 47
SimpleTIR: End-to-End Reinforcement Learning for Multi-Turn Tool-Integrated Reasoning Paper • 2509.02479 • Published Sep 2 • 83