Submitted by
Xiao Wang
Papers
REDSearcher: A Scalable and Cost-Efficient Framework for Long-Horizon Search Agents
VESPO: Variational Sequence-Level Soft Policy Optimization for Stable Off-Policy LLM Training