Official collection for the paper "Reward Modeling from Natural Language Human Feedback".
AI & ML interests
LLM, Conversational AI, Agent
Organization Card
Tongyi-ConvAI: The official repository containing the Alibaba Tongyi Conversational AI models and datasets.
models 13
Tongyi-ConvAI/Cold-Start-MetaRM-RM-NLHF-Qwen-32B
32B • Updated • 4 • 1
Tongyi-ConvAI/Cold-Start-MetaRM-RM-NLHF-Qwen-7B
7B • Updated
Tongyi-ConvAI/Final-MetaRM-RM-NLHF-Qwen-7B
7B • Updated
Tongyi-ConvAI/Final-MetaRM-RM-NLHF-Qwen-32B
32B • Updated
Tongyi-ConvAI/RM-NLHF-Qwen-32B
33B • Updated • 3
Tongyi-ConvAI/Baseline-Outcome-Reward-Qwen-7B
8B • Updated • 1
Tongyi-ConvAI/RM-NLHF-Qwen-7B
8B • Updated • 4 • 2
Tongyi-ConvAI/P-GenRM-8B-ChatbotArena
8B • Updated • 11 • 2
Tongyi-ConvAI/P-GenRM-8B-PRISM
8B • Updated • 1 • 1
Tongyi-ConvAI/OmniCharacter-7B
Updated
datasets 6
Tongyi-ConvAI/RM-NLHF
Viewer • Updated • 49.5k • 25 • 1
Tongyi-ConvAI/OmniCharacter
Viewer • Updated • 10.1k • 223 • 2
Tongyi-ConvAI/EPO-RL-data
Viewer • Updated • 9.38k • 49 • 1
Tongyi-ConvAI/OpenOmni
Preview • Updated • 108 • 4
Tongyi-ConvAI/SDPO
Preview • Updated • 24 • 5
Tongyi-ConvAI/MMEvol
Preview • Updated • 967 • 15