General Exploratory Bonus for optimistic exploration in RLHF
WendiLi
Windy0822
Models (32)
Windy0822/LAD_b4_js_code
8B • Updated • 3
Windy0822/LAD_b4_js_math
8B • Updated • 2
Windy0822/qwen_s120_rlpr
8B • Updated • 2
Windy0822/lad_js_1_s120_rlpr
8B • Updated • 2
Windy0822/distrl_b1_tv-125-code
8B • Updated
Windy0822/grpo-125-code
8B • Updated • 2
Windy0822/distrl_b1_tv-100-code
8B • Updated • 2
Windy0822/grpo-50-code
8B • Updated • 2
Windy0822/distrl_b1_tv-50-code
8B • Updated • 1
Windy0822/Mistral0.3_geb_tanh_fkl
7B • Updated • 1