GenEnv: Difficulty-Aligned Co-Evolution Between LLM Agents and Environment Simulators Paper • 2512.19682 • Published Dec 22, 2025 • 19
Beyond Binary Preference: Aligning Diffusion Models to Fine-grained Criteria by Decoupling Attributes Paper • 2601.04300 • Published Jan 7 • 3
AutoTool: Dynamic Tool Selection and Integration for Agentic Reasoning Paper • 2512.13278 • Published Dec 15, 2025
RLAnything: Forge Environment, Policy, and Reward Model in Completely Dynamic RL System Paper • 2602.02488 • Published Feb 2 • 33
UltraViCo: Breaking Extrapolation Limits in Video Diffusion Transformers Paper • 2511.20123 • Published Nov 25, 2025 • 18
Demystifying Reinforcement Learning in Agentic Reasoning Paper • 2510.11701 • Published Oct 13, 2025 • 33
Generative Universal Verifier as Multimodal Meta-Reasoner Paper • 2510.13804 • Published Oct 15, 2025 • 27
MMaDA-Parallel: Multimodal Large Diffusion Language Models for Thinking-Aware Editing and Generation Paper • 2511.09611 • Published Nov 12, 2025 • 70
AgentDistill: Training-Free Agent Distillation with Generalizable MCP Boxes Paper • 2506.14728 • Published Jun 17, 2025
ReasonFlux-PRM: Trajectory-Aware PRMs for Long Chain-of-Thought Reasoning in LLMs Paper • 2506.18896 • Published Jun 23, 2025 • 29
Revolutionizing Reinforcement Learning Framework for Diffusion Large Language Models Paper • 2509.06949 • Published Sep 8, 2025 • 56
EmoAgent: Assessing and Safeguarding Human-AI Interaction for Mental Health Safety Paper • 2504.09689 • Published Apr 13, 2025 • 6
Diffusion Distillation With Direct Preference Optimization For Efficient 3D LiDAR Scene Completion Paper • 2504.11447 • Published Apr 15, 2025 • 4
Transformer Copilot: Learning from The Mistake Log in LLM Fine-tuning Paper • 2505.16270 • Published May 22, 2025 • 7
On Path to Multimodal Historical Reasoning: HistBench and HistAgent Paper • 2505.20246 • Published May 26, 2025