MMSkills: Towards Multimodal Skills for General Visual Agents Paper • 2605.13527 • Published 26 days ago • 118
MolmoAct2: Action Reasoning Models for Real-world Deployment Paper • 2605.02881 • Published May 4 • 348
ClawGUI: A Unified Framework for Training, Evaluating, and Deploying GUI Agents Paper • 2604.11784 • Published Apr 13 • 143
All Roads Lead to Rome: Incentivizing Divergent Thinking in Vision-Language Models Paper • 2604.00479 • Published Apr 1 • 69