RUT-Bench
Benchmark data in "Beyond Ideal Instruction: A Comprehensive Framework for Evaluating LLMs in Realistic Interactions".
Miaow-Lab/RUT-Bench Viewer • Updated 12 days ago • 1.64k • 70
Beyond Ideal Instruction: A Comprehensive Framework for Evaluating LLMs in Realistic Interactions Paper • 2606.03318 • Published 14 days ago
STT-Arena benchmark data, training data, and STT-Agent from our paper "STT-Arena: A More Realistic Environment for Tool-Using with Spatio-Temporal Dynamics"
STT-Arena: A More Realistic Environment for Tool-Using with Spatio-Temporal Dynamics Paper • 2605.18548 • Published 29 days ago • 1
Miaow-Lab/STT-Agent-SFT 196k • Updated 28 days ago • 17 • 1
Miaow-Lab/STT-Agent-RL 196k • Updated 28 days ago • 17 • 1
Miaow-Lab/STT-Arena Preview • Updated 28 days ago • 108 • 2