DeepGen 1.0: A Lightweight Unified Multimodal Model for Advancing Image Generation and Editing Paper • 2602.12205 • Published 15 days ago • 79
MOVA: Towards Scalable and Synchronized Video-Audio Generation Paper • 2602.08794 • Published 18 days ago • 154
UniReason 1.0: A Unified Reasoning Framework for World Knowledge Aligned Image Generation and Editing Paper • 2602.02437 • Published 25 days ago • 77
MVGGT: Multimodal Visual Geometry Grounded Transformer for Multiview 3D Referring Expression Segmentation Paper • 2601.06874 • Published Jan 11 • 12
Unified Reward Model for Multimodal Understanding and Generation Paper • 2503.05236 • Published Mar 7, 2025 • 123