Li Dong

unilm

AI & ML interests

Language Model Pre-Training

Recent Activity

authored a paper 14 days ago

Universal YOCO for Efficient Depth Scaling

upvoted a paper 14 days ago

Universal YOCO for Efficient Depth Scaling

submitted a paper 14 days ago

Universal YOCO for Efficient Depth Scaling

authored a paper 14 days ago

Universal YOCO for Efficient Depth Scaling

Paper • 2604.01220 • Published 14 days ago • 18

submitted a paper to Daily Papers 14 days ago

Universal YOCO for Efficient Depth Scaling

Paper • 2604.01220 • Published 14 days ago • 18

authored 5 papers 29 days ago

VIBEVOICE-ASR Technical Report

Paper • 2601.18184 • Published Jan 26 • 23

On-Policy Context Distillation for Language Models

Paper • 2602.12275 • Published Feb 12 • 3

Breaking Training Bottlenecks: Effective and Stable Reinforcement Learning for Coding Models

Paper • 2603.07777 • Published Mar 8 • 5

Sparse-BitNet: 1.58-bit LLMs are Naturally Friendly to Semi-Structured Sparsity

Paper • 2603.05168 • Published Mar 5 • 5

Online Experiential Learning for Language Models

Paper • 2603.16856 • Published 29 days ago • 58

submitted a paper to Daily Papers 29 days ago

Online Experiential Learning for Language Models

Paper • 2603.16856 • Published 29 days ago • 58

authored 5 papers 3 months ago

LLM-in-Sandbox Elicits General Agentic Intelligence

Paper • 2601.16206 • Published Jan 22 • 86

MoE-CAP: Benchmarking Cost, Accuracy and Performance of Sparse Mixture-of-Experts Systems

Paper • 2412.07067 • Published Dec 10, 2024

submitted a paper to Daily Papers 3 months ago

Multiplex Thinking: Reasoning via Token-wise Branch-and-Merge

Paper • 2601.08808 • Published Jan 13 • 39

authored a paper 5 months ago

Black-Box On-Policy Distillation of Large Language Models

Paper • 2511.10643 • Published Nov 13, 2025 • 52

authored 5 papers 6 months ago

Benefits and Pitfalls of Reinforcement Learning for Language Model Planning: A Theoretical Perspective

Paper • 2509.22613 • Published Sep 26, 2025 • 10

DocReward: A Document Reward Model for Structuring and Stylizing

Paper • 2510.11391 • Published Oct 13, 2025 • 27

Information-Preserving Reformulation of Reasoning Traces for Antidistillation

Paper • 2510.11545 • Published Oct 13, 2025 • 2

BitNet Distillation

Paper • 2510.13998 • Published Oct 15, 2025 • 59

Latent Sketchpad: Sketching Visual Thoughts to Elicit Multimodal Reasoning in MLLMs

Paper • 2510.24514 • Published Oct 28, 2025 • 22
