Xiaobo Wang

Yofuria

https://yofuria.github.io/

AI & ML interests

Reward Modeling, Agent Memory, LLM Alignment

Recent Activity

upvoted a paper 20 days ago

PoliCon: Evaluating LLMs on Achieving Diverse Political Consensus Objectives

Paper • 2505.19558 • Published May 26, 2025 • 1

updated 4 collections 20 days ago

ICE

In-Context Editing: Learning Knowledge from Self-Induced Distributions • 2 items • Updated 20 days ago

PoliCon

PoliCon: Evaluating LLMs on Achieving Diverse Political Consensus Objectives • 2 items • Updated 20 days ago

UAPO

Adaptive Preference Optimization with Uncertainty-aware Utility Anchor • 4 items • Updated 20 days ago

SAVE

The Flip Side of RLHF: On-Policy Feedback for Reward Model Self-Supervised Improvement • 4 items • Updated 20 days ago

updated a collection 27 days ago

UAPO

Adaptive Preference Optimization with Uncertainty-aware Utility Anchor • 4 items • Updated 20 days ago

authored a paper 27 days ago

The Flip Side of RLHF: On-Policy Feedback for Reward Model Self-Supervised Improvement

Paper • 2605.30888 • Published about 1 month ago • 10

updated 2 collections 27 days ago

ICE

In-Context Editing: Learning Knowledge from Self-Induced Distributions • 2 items • Updated 20 days ago

SAVE

The Flip Side of RLHF: On-Policy Feedback for Reward Model Self-Supervised Improvement • 4 items • Updated 20 days ago

upvoted a paper 27 days ago

The Flip Side of RLHF: On-Policy Feedback for Reward Model Self-Supervised Improvement

Paper • 2605.30888 • Published about 1 month ago • 10

submitted a paper to Daily Papers 27 days ago

The Flip Side of RLHF: On-Policy Feedback for Reward Model Self-Supervised Improvement

Paper • 2605.30888 • Published about 1 month ago • 10

updated a dataset about 2 months ago

Yofuria/UltraFeedback-binarized-ms-swift-1024

Viewer • Updated May 12 • 38.9k • 13

published a dataset about 2 months ago

Yofuria/UltraFeedback-binarized-ms-swift-1024

Viewer • Updated May 12 • 38.9k • 13

updated a dataset 2 months ago

Yofuria/UltraFeedback-ms-swift-1024

Viewer • Updated Apr 27 • 41k • 64

published a dataset 2 months ago

Yofuria/UltraFeedback-ms-swift-1024

Viewer • Updated Apr 27 • 41k • 64

updated a collection 2 months ago

PoliCon

PoliCon: Evaluating LLMs on Achieving Diverse Political Consensus Objectives • 2 items • Updated 20 days ago