A Tale of Tails: Model Collapse as a Change of Scaling Laws
Paper
• 2402.07043
• Published • 15
What Characterizes Effective Reasoning? Revisiting Length, Review, and Structure of CoT
Paper
• 2509.19284
• Published • 23
OnePiece: Bringing Context Engineering and Reasoning to Industrial Cascade Ranking System
Paper
• 2509.18091
• Published • 34
Strategic Dishonesty Can Undermine AI Safety Evaluations of Frontier LLMs
Paper
• 2509.18058
• Published • 12
Igniting Creative Writing in Small Language Models: LLM-as-a-Judge versus Multi-Agent Refined Rewards
Paper
• 2508.21476
• Published • 3
Competition Report: Finding Universal Jailbreak Backdoors in Aligned LLMs
Paper
• 2404.14461
• Published • 3
Universal Jailbreak Backdoors from Poisoned Human Feedback
Paper
• 2311.14455
• Published • 3
StockBench: Can LLM Agents Trade Stocks Profitably In Real-world Markets?
Paper
• 2510.02209
• Published • 57
Tree-based Dialogue Reinforced Policy Optimization for Red-Teaming Attacks
Paper
• 2510.02286
• Published • 29