DR Tulu: Reinforcement Learning with Evolving Rubrics for Deep Research Paper • 2511.19399 • Published Nov 24, 2025 • 60
SciRIFF: A Resource to Enhance Language Model Instruction-Following over Scientific Literature Paper • 2406.07835 • Published Jun 10, 2024 • 2
SPG: Sandwiched Policy Gradient for Masked Diffusion Language Models Paper • 2510.09541 • Published Oct 10, 2025 • 15
RLVE: Scaling Up Reinforcement Learning for Language Models with Adaptive Verifiable Environments Paper • 2511.07317 • Published Nov 10, 2025 • 15
Large Language Models Discriminate Against Speakers of German Dialects Paper • 2509.13835 • Published Sep 17, 2025 • 7
MolmoAct: Action Reasoning Models that can Reason in Space Paper • 2508.07917 • Published Aug 11, 2025 • 44
IssueBench: Millions of Realistic Prompts for Measuring Issue Bias in LLM Writing Assistance Paper • 2502.08395 • Published Feb 12, 2025
Think you have Solved Question Answering? Try ARC, the AI2 Reasoning Challenge Paper • 1803.05457 • Published Mar 14, 2018 • 3
Can a Suit of Armor Conduct Electricity? A New Dataset for Open Book Question Answering Paper • 1809.02789 • Published Sep 8, 2018
From 'F' to 'A' on the N.Y. Regents Science Exams: An Overview of the Aristo Project Paper • 1909.01958 • Published Sep 4, 2019