Learning to Reason in 4D: Dynamic Spatial Understanding for Vision Language Models Paper • 2512.20557 • Published 19 days ago • 49
Robust-R1: Degradation-Aware Reasoning for Robust Visual Understanding Paper • 2512.17532 • Published 23 days ago • 65
Both Semantics and Reconstruction Matter: Making Representation Encoders Ready for Text-to-Image Generation and Editing Paper • 2512.17909 • Published 23 days ago • 36
LongLive: Real-time Interactive Long Video Generation Paper • 2509.22622 • Published Sep 26, 2025 • 184
JarvisArt: Liberating Human Artistic Creativity via an Intelligent Photo Retouching Agent Paper • 2506.17612 • Published Jun 21, 2025 • 64
FlashVideo: Flowing Fidelity to Detail for Efficient High-Resolution Video Generation Paper • 2502.05179 • Published Feb 7, 2025 • 24
ControlNeXt: Powerful and Efficient Control for Image and Video Generation Paper • 2408.06070 • Published Aug 12, 2024 • 55