sxcasf (ads)

AI & ML interests

None yet

Recent Activity

upvoted a paper 21 days ago

GenEvolve: Self-Evolving Image Generation Agents via Tool-Orchestrated Visual Experience Distillation

Paper • 2605.21605 • Published 23 days ago • 13

upvoted a paper 29 days ago

Multi-Objective and Mixed-Reward Reinforcement Learning via Reward-Decorrelated Policy Optimization

Paper • 2605.13641 • Published about 1 month ago • 50

commented on a paper about 2 months ago

Lightning OPD: Efficient Post-Training for Large Reasoning Models with Offline On-Policy Distillation

Paper • 2604.13010 • Published Apr 14 • 17

upvoted a paper about 2 months ago

Lightning OPD: Efficient Post-Training for Large Reasoning Models with Offline On-Policy Distillation

Paper • 2604.13010 • Published Apr 14 • 17

commented on a paper about 2 months ago

Lightning OPD: Efficient Post-Training for Large Reasoning Models with Offline On-Policy Distillation

Paper • 2604.13010 • Published Apr 14 • 17

upvoted a paper 2 months ago

A Survey of On-Policy Distillation for Large Language Models

Paper • 2604.00626 • Published Apr 1 • 13

upvoted a paper 3 months ago

LongCat-Flash-Prover: Advancing Native Formal Reasoning via Agentic Tool-Integrated Reinforcement Learning

Paper • 2603.21065 • Published Mar 22 • 78

liked a dataset 3 months ago

AudioVisual-Caption/ASID-1M

Viewer • Updated Mar 11 • 241k • 606 • 85

liked 2 models 5 months ago

Alibaba-Apsara/DASD-30B-A3B-Thinking-Preview

Text Generation • 31B • Updated Jan 15 • 92 • 55

Alibaba-Apsara/DASD-4B-Thinking

Text Generation • 4B • Updated Jan 15 • 271 • 236

upvoted a collection 5 months ago

DASD-Thinking

Collection • 6 items • Updated Feb 3 • 25

liked a dataset 5 months ago

Alibaba-Apsara/Superior-Reasoning-SFT-gpt-oss-120b

Viewer • Updated Jan 31 • 306k • 1.56k • 348

upvoted a paper 5 months ago

Distribution-Aligned Sequence Distillation for Superior Long-CoT Reasoning

Paper • 2601.09088 • Published Jan 14 • 64

liked a dataset 5 months ago

Alibaba-Apsara/Superior-Reasoning-SFT-gpt-oss-120b-Logprob

Viewer • Updated Jan 15 • 435k • 7.89k • 62

updated 2 datasets 5 months ago

Alibaba-Apsara/Superior-Reasoning-SFT-gpt-oss-120b

Viewer • Updated Jan 31 • 306k • 1.56k • 348

Alibaba-Apsara/Superior-Reasoning-SFT-gpt-oss-120b-Logprob

Viewer • Updated Jan 15 • 435k • 7.89k • 62

updated 2 models 5 months ago

Alibaba-Apsara/DASD-30B-A3B-Thinking-Preview

Text Generation • 31B • Updated Jan 15 • 92 • 55

Alibaba-Apsara/DASD-4B-Thinking

Text Generation • 4B • Updated Jan 15 • 271 • 236

upvoted a paper 5 months ago

Where Did This Sentence Come From? Tracing Provenance in LLM Reasoning Distillation

Paper • 2512.20908 • Published Dec 24, 2025 • 29

New activity in HuggingFaceTB/Countdown-Task-GOLD 6 months ago

Inconsistent numbers

#1 opened 7 months ago by MysticJay
