LongTraceRL: Learning Long-Context Reasoning from Search Agent Trajectories with Rubric Rewards Paper • 2605.31584 • Published 5 days ago • 36
LongTraceRL: Learning Long-Context Reasoning from Search Agent Trajectories with Rubric Rewards Paper • 2605.31584 • Published 5 days ago • 36
CaRR & C-GRPO Collection Data and models for the paper "Chaining the Evidence: Robust Reinforcement Learning for Deep Search Agents with Citation-Aware Rubric Rewards". • 6 items • Updated Mar 25 • 1
IndexCache: Accelerating Sparse Attention via Cross-Layer Index Reuse Paper • 2603.12201 • Published Mar 12 • 53
CaRR & C-GRPO Collection Data and models for the paper "Chaining the Evidence: Robust Reinforcement Learning for Deep Search Agents with Citation-Aware Rubric Rewards". • 6 items • Updated Mar 25 • 1
Chaining the Evidence: Robust Reinforcement Learning for Deep Search Agents with Citation-Aware Rubric Rewards Paper • 2601.06021 • Published Jan 9 • 48
Chaining the Evidence: Robust Reinforcement Learning for Deep Search Agents with Citation-Aware Rubric Rewards Paper • 2601.06021 • Published Jan 9 • 48