ODA-Scored-Data: datasets annotated with multiple implemented data scores.
AI & ML interests
Data-centric AI, LLM, MLLM
Papers
Closing the Data Loop: Using OpenDataArena to Engineer Superior Training Datasets
OpenDataArena: A Fair and Open Arena for Benchmarking Post-Training Dataset Value
High-quality STEM reasoning dataset for Multimodal LLM post-training.
- MMFineReason: Closing the Multimodal Reasoning Gap via Open Data-Centric Methods
  Paper • 2601.21821 • Published • 60
- OpenDataArena/MMFineReason-1.8M-Qwen3-VL-235B-Thinking
  Viewer • Updated • 1.81M • 4.75k • 118
- OpenDataArena/MMFineReason-SFT-586K-Qwen3-VL-235B-Thinking
  Viewer • Updated • 586k • 720 • 6
- OpenDataArena/MMFineReason-SFT-123K-Qwen3-VL-235B-Thinking
  Viewer • Updated • 123k • 1.08k • 76
High-quality mathematical datasets for post-training.
High-quality mixture datasets for post-training covering multiple domains.