The Meta-Agent Challenge: Are Current Agents Capable of Autonomous Agent Development? Paper • 2606.04455 • Published 10 days ago • 3
SoFA: Shielded On-the-fly Alignment via Priority Rule Following Paper • 2402.17358 • Published Feb 27, 2024 • 1
Scalable Oversight for Superhuman AI via Recursive Self-Critiquing Paper • 2502.04675 • Published Feb 7, 2025 • 1
On-Policy Self-Alignment with Fine-grained Knowledge Feedback for Hallucination Mitigation Paper • 2406.12221 • Published Jun 18, 2024
Rethinking Reward Model Evaluation: Are We Barking up the Wrong Tree? Paper • 2410.05584 • Published Oct 8, 2024
LiteCoder-Terminal: Scaling Long-Horizon Terminal Environments for Learning Language Agents Paper • 2605.29559 • Published 16 days ago • 17
Terminal-Bench: Benchmarking Agents on Hard, Realistic Tasks in Command Line Interfaces Paper • 2601.11868 • Published Jan 17 • 37
Article Announcing LiteCoder-Terminal: Lightweight Terminal Agents with <1k Synthesized Trajectories Lite-Coder • Dec 18, 2025 • 9
GraphOmni: A Comprehensive and Extendable Benchmark Framework for Large Language Models on Graph-theoretic Tasks Paper • 2504.12764 • Published Apr 17, 2025 • 42