Lens: Rethinking Training Efficiency for Foundational Text-to-Image Models Paper • 2605.21573 • Published 5 days ago • 43
Auto-Rubric as Reward: From Implicit Preferences to Explicit Multimodal Generative Criteria Paper • 2605.08354 • Published 17 days ago • 23
MultiWorld: Scalable Multi-Agent Multi-View Video World Models Paper • 2604.18564 • Published Apr 20 • 46
R-Align: Enhancing Generative Reward Models through Rationale-Centric Meta-Judging Paper • 2602.06763 • Published Feb 6
SpatialEvo: Self-Evolving Spatial Intelligence via Deterministic Geometric Environments Paper • 2604.14144 • Published Apr 15 • 63
WebVR: Benchmarking Multimodal LLMs for WebPage Recreation from Videos via Human-Aligned Visual Rubrics Paper • 2603.13391 • Published Mar 11 • 19
CoCo: Code as CoT for Text-to-Image Preview and Rare Concept Generation Paper • 2603.08652 • Published Mar 9 • 40