Xuan Xiao

xiaoxuanzi

AI & ML interests

None yet

Recent Activity

upvoted a paper 2 days ago

ProRL: Effective Reinforcement Learning for Proactive Recommendation via Rectified Policy Gradient Estimation

Paper • 2605.28293 • Published 3 days ago • 78

updated 2 models 4 days ago

Kwai-Klear/GoLongRL-4B

Text Generation • 4B • Updated 4 days ago • 205 • 4

Kwai-Klear/GoLongRL-30B-A3B

Text Generation • 31B • Updated 4 days ago • 394 • 10

updated a dataset 4 days ago

Kwai-Klear/GoLongRL

Viewer • Updated 4 days ago • 23k • 896 • 23

upvoted a paper 8 days ago

Kwai Summary Attention Technical Report

Paper • 2604.24432 • Published Apr 27 • 1

authored 2 papers 10 days ago

CartesianMoE: Boosting Knowledge Sharing among Experts via Cartesian Product Routing in Mixture-of-Experts

Paper • 2410.16077 • Published Oct 21, 2024 • 1

Kwai Summary Attention Technical Report

Paper • 2604.24432 • Published Apr 27 • 1

updated a dataset 10 days ago

Kwai-Klear/GoLongRL-Eval

Preview • Updated 10 days ago • 71

published a dataset 10 days ago

Kwai-Klear/GoLongRL-Eval

Preview • Updated 10 days ago • 71

upvoted a collection 10 days ago

GoLongRL

Collection

GoLongRL • 4 items • Updated 10 days ago • 3

upvoted a paper 10 days ago

GoLongRL: Capability-Oriented Long Context Reinforcement Learning with Multitask Alignment

Paper • 2605.19577 • Published 11 days ago • 58

liked a dataset 10 days ago

Kwai-Klear/GoLongRL

Viewer • Updated 4 days ago • 23k • 896 • 23

liked 2 models 10 days ago

Kwai-Klear/GoLongRL-30B-A3B

Text Generation • 31B • Updated 4 days ago • 394 • 10

Kwai-Klear/GoLongRL-4B

Text Generation • 4B • Updated 4 days ago • 205 • 4

published 2 models 11 days ago

Kwai-Klear/GoLongRL-30B-A3B

Text Generation • 31B • Updated 4 days ago • 394 • 10

Kwai-Klear/GoLongRL-4B

Text Generation • 4B • Updated 4 days ago • 205 • 4

published a dataset 11 days ago

Kwai-Klear/GoLongRL

Viewer • Updated 4 days ago • 23k • 896 • 23

upvoted a paper 6 months ago

Entropy Ratio Clipping as a Soft Global Constraint for Stable Reinforcement Learning

Paper • 2512.05591 • Published Dec 5, 2025 • 17

liked a dataset 6 months ago

a-m-team/AM-DeepSeek-R1-Distilled-1.4M

Preview • Updated Mar 30, 2025 • 1.84k • 180

upvoted a collection 7 months ago

SYNTHETIC-1

Collection

A collection of tasks & verifiers for reasoning datasets • 9 items • Updated Oct 7, 2025 • 67
