From Scale to Speed: Adaptive Test-Time Scaling for Image Editing Paper • 2603.00141 • Published Feb 24 • 138
Spatial-TTT: Streaming Visual-based Spatial Intelligence with Test-Time Training Paper • 2603.12255 • Published 16 days ago • 90
Code2World: A GUI World Model via Renderable Code Generation Paper • 2602.09856 • Published Feb 10 • 201
Olaf-World: Orienting Latent Actions for Video World Modeling Paper • 2602.10104 • Published Feb 10 • 27
ShowUI-π: Flow-based Generative Models as GUI Dexterous Hands Paper • 2512.24965 • Published Dec 31, 2025 • 43
FocusUI: Efficient UI Grounding via Position-Preserving Visual Token Selection Paper • 2601.03928 • Published Jan 7 • 17
VCode: a Multimodal Coding Benchmark with SVG as Symbolic Visual Representation Paper • 2511.02778 • Published Nov 4, 2025 • 103
The Image as Its Own Reward: Reinforcement Learning with Adversarial Reward for Image Generation Paper • 2511.20256 • Published Nov 25, 2025 • 28
Grounding Computer Use Agents on Human Demonstrations Paper • 2511.07332 • Published Nov 10, 2025 • 107
UniLumos: Fast and Unified Image and Video Relighting with Physics-Plausible Feedback Paper • 2511.01678 • Published Nov 3, 2025 • 38