Hide-and-Seek in Trajectories: Discovering Failure Signals for VLA Runtime Monitoring Paper • 2605.30834 • Published May 29 • 10
Neglected Free Lunch from Post-training: Progress Advantage for LLM Agents Paper • 2606.26080 • Published 7 days ago • 10
Qwen-Image-Agent: Bridging the Context Gap in Real-World Image Generation Paper • 2606.26907 • Published 6 days ago • 47
DF3DV-1K: A Large-Scale Dataset and Benchmark for Distractor-Free Novel View Synthesis Paper • 2604.13416 • Published 13 days ago • 32
Vision-OPD: Learning to See Fine Details for Multimodal LLMs via On-Policy Self-Distillation Paper • 2605.18740 • Published May 18 • 5
Retrospective Harness Optimization: Improving LLM Agents via Self-Preference over Trajectory Rollouts Paper • 2606.05922 • Published 26 days ago • 69
Flow-DPPO: Divergence Proximal Policy Optimization for Flow Matching Models Paper • 2606.11025 • Published 22 days ago • 41
SCAIL-2: Unifying Controlled Character Animation with End-to-end In-Context Conditioning Paper • 2606.10804 • Published 22 days ago • 51
RationalRewards: Reasoning Rewards Scale Visual Generation Both Training and Test Time Paper • 2604.11626 • Published Apr 13 • 103
Rethinking On-Policy Distillation of Large Language Models: Phenomenology, Mechanism, and Recipe Paper • 2604.13016 • Published Apr 14 • 113
ShotStream: Streaming Multi-Shot Video Generation for Interactive Storytelling Paper • 2603.25746 • Published Mar 26 • 155
CodeGoat24/UniGenBench-EvalModel-qwen-72b-v1 Image-Text-to-Text • 73B • Updated Oct 25, 2025 • 77 • 4
Evaluating and Steering Modality Preferences in Multimodal Large Language Model Paper • 2505.20977 • Published May 27, 2025 • 10
Article NEO-unify: Building Native Multimodal Unified Models End to End sensenova • Mar 5 • 167
CoCo: Code as CoT for Text-to-Image Preview and Rare Concept Generation Paper • 2603.08652 • Published Mar 9 • 41