Where Do Deep-Research Agents Go Wrong? Span-Level Error Localization in Agent Trajectories Paper • 2606.02060 • Published 5 days ago • 49 • 7
FineVerify: Scaling Test-Time Compute with Fine-Grained Self-Verification for Agentic Search Paper • 2606.00660 • Published 7 days ago • 8 • 2
Silent Failures in Physical AI: A Literature Review of Runtime Action Authorization for Autonomous Systems Paper • 2606.00090 • Published 14 days ago • 6 • 3