Yofuria/UltraFeedback-ms-swift-hard-1024-v5-Qwen2.5-v2-SFT Viewer • Updated about 20 hours ago • 19.9k • 4
$OneMillion-Bench: How Far are Language Agents from Human Experts? Paper • 2603.07980 • Published 18 days ago • 27
PoliCon: Evaluating LLMs on Achieving Diverse Political Consensus Objectives Paper • 2505.19558 • Published May 26, 2025