DreamDojo: A Generalist Robot World Model from Large-Scale Human Videos Paper • 2602.06949 • Published 4 days ago • 26
THINKSAFE: Self-Generated Safety Alignment for Reasoning Models Paper • 2601.23143 • Published 11 days ago • 38
Avatar Forcing: Real-Time Interactive Head Avatar Generation for Natural Conversation Paper • 2601.00664 • Published Jan 2 • 56
Scaling Text-to-Image Diffusion Transformers with Representation Autoencoders Paper • 2601.16208 • Published 19 days ago • 51
Cosmos Policy: Fine-Tuning Video Models for Visuomotor Control and Planning Paper • 2601.16163 • Published 19 days ago • 13
Goal Force: Teaching Video Models To Accomplish Physics-Conditioned Goals Paper • 2601.05848 • Published Jan 9 • 16
Qwen/Qwen3-VL-235B-A22B-Instruct Image-Text-to-Text • 236B • Updated Nov 26, 2025 • 373k • 367
PAI-Bench: A Comprehensive Benchmark For Physical AI Paper • 2512.01989 • Published Dec 1, 2025 • 6
WorldMM: Dynamic Multimodal Memory Agent for Long Video Reasoning Paper • 2512.02425 • Published Dec 2, 2025 • 25
RedHatAI/Qwen2.5-VL-72B-Instruct-FP8-dynamic Image-Text-to-Text • 73B • Updated Apr 25, 2025 • 48.5k • 15
Time-to-Move: Training-Free Motion Controlled Video Generation via Dual-Clock Denoising Paper • 2511.08633 • Published Nov 9, 2025 • 55