ernie-research

community

AI & ML interests

Large Language Models

cyk1337

updated a collection 4 months ago

Macro-Action RLHF

[ICLR'25] [MA-RLHF: Reinforcement Learning from Human Feedback with Macro Actions](https://openreview.net/forum?id=WWXjMYZxfH) • 8 items • Updated Sep 20, 2025

Moyu-hrsun

authored a paper 7 months ago

MA-RLHF: Reinforcement Learning from Human Feedback with Macro Actions

Paper • 2410.02743 • Published Oct 3, 2024 • 7

cyk1337

updated a model 7 months ago

ernie-research/Themis-7b

Updated Jun 6, 2025 • 5 • 4

cyk1337

updated a dataset 7 months ago

ernie-research/TARA

Preview • Updated Jun 6, 2025 • 76 • 1

cyk1337

updated a collection 7 months ago

Macro-Action RLHF

[ICLR'25] [MA-RLHF: Reinforcement Learning from Human Feedback with Macro Actions](https://openreview.net/forum?id=WWXjMYZxfH) • 8 items • Updated Sep 20, 2025

msdyc

updated 4 collections 8 months ago

Tool-Augmented Reward Models

[ICLR'24 Spotlight] Tool-Augmented Reward Modeling • 3 items • Updated May 21, 2025

Multilingual Code Pre-training (ERNIE-Code)

[ACL'23 Findings] ERNIE-Code, the first multilingual text and multilingual code pre-training. • 2 items • Updated May 21, 2025

Pixel-based Pre-training (PixelGPT)

[EMNLP'24] [Autoregressive Pre-Training on Pixels and Texts](https://arxiv.org/pdf/2404.10710) • 6 items • Updated May 21, 2025

Macro-Action RLHF

[ICLR'25] [MA-RLHF: Reinforcement Learning from Human Feedback with Macro Actions](https://openreview.net/forum?id=WWXjMYZxfH) • 8 items • Updated Sep 20, 2025