AgentDoG Collection A Diagnostic Guardrail Framework for AI Agent Safety and Security • 12 items • Updated about 4 hours ago • 109
ATBench: A Diverse and Realistic Trajectory Benchmark for Long-Horizon Agent Safety Paper • 2604.02022 • Published Apr 2 • 15
Code2Math: Can Your Code Agent Effectively Evolve Math Problems Through Exploration? Paper • 2603.03202 • Published Mar 3 • 17
Frontier AI Risk Management Framework in Practice: A Risk Analysis Technical Report v1.5 Paper • 2602.14457 • Published Feb 16 • 29