[ICLR'24 Spotlight] Tool-Augmented Reward Modeling
AI & ML interests
Large Language Models
Papers
models 12
ernie-research/Themis-7b
Updated • 5 • 4
ernie-research/APPS-Gemma-7B-MA-PPO-Fixed10
9B • Updated • 1
ernie-research/APPS-Gemma-2B-MA-PPO-Fixed10
3B • Updated • 5
ernie-research/HH-RLHF-Gemma-2B-MA-PPO-Fixed5
3B • Updated • 5 • 1
ernie-research/HH-RLHF-Gemma-7B-MA-PPO-Fixed5
9B • Updated • 1
ernie-research/TLDR-Gemma-7B-MA-PPO-Fixed5
9B • Updated • 1
ernie-research/TLDR-Gemma-2B-MA-PPO-Fixed5
3B • Updated • 1 • 1
ernie-research/TLDR-Gemma-2-27B-MA-PPO-Fixed5
27B • Updated • 1
ernie-research/ernie-code-560m
Updated • 14 • 10
ernie-research/MonoGPT
Text Generation • 0.4B • Updated • 2 • 2
datasets 7
ernie-research/MEnvData-SWE-Trajectory
Viewer • Updated • 3.92k • 623 • 26
ernie-research/MEnvData-SWE
Preview • Updated • 540 • 3
ernie-research/MEnvBench
Viewer • Updated • 1k • 22 • 2
ernie-research/TARA
Preview • Updated • 109 • 1
ernie-research/GPTDynamics
Preview • Updated • 175 • 1
ernie-research/rendered_xnli
Updated • 20 • 1
ernie-research/rendered_GLUE
Updated • 15 • 1