TransitLM: A Large-Scale Dataset and Benchmark for Map-Free Transit Route Generation Paper • 2605.22355 • Published 4 days ago • 169
DelTA: Discriminative Token Credit Assignment for Reinforcement Learning from Verifiable Rewards Paper • 2605.21467 • Published 5 days ago • 197
AutoResearchClaw: Self-Reinforcing Autonomous Research with Human-AI Collaboration Paper • 2605.20025 • Published 6 days ago • 172
Video2GUI: Synthesizing Large-Scale Interaction Trajectories for Generalized GUI Agent Pretraining Paper • 2605.14747 • Published 11 days ago • 143
XL-SafetyBench: A Country-Grounded Cross-Cultural Benchmark for LLM Safety and Cultural Sensitivity Paper • 2605.05662 • Published 18 days ago • 11
HERMES++: Toward a Unified Driving World Model for 3D Scene Understanding and Generation Paper • 2604.28196 • Published 25 days ago • 71
From Context to Skills: Can Language Models Learn from Context Skillfully? Paper • 2604.27660 • Published 22 days ago • 162
RationalRewards: Reasoning Rewards Scale Visual Generation Both Training and Test Time Paper • 2604.11626 • Published Apr 13 • 102
Q-Zoom: Query-Aware Adaptive Perception for Efficient Multimodal Large Language Models Paper • 2604.06912 • Published Apr 8 • 8