jpy

https://scholar.google.com.hk/citations?user=oPQZpwkAAAAJ&hl=zh-CN

yupeijei1997

AI & ML interests

LLM Agent

Recent Activity

authored a paper 1 day ago

Hunt Instead of Wait: Evaluating Deep Data Research on Large Language Models

Paper • 2602.02039 • Published 4 days ago • 5

liked a Space 3 days ago

thinkwee/DDR_Bench

DDR Bench

🚀

Deep Data Research Benchmark

liked a Space 4 months ago

thinkwee/NOVER

NOVER

🧠

Reasoning on ANYTHING

liked a model 6 months ago

tencent/Hunyuan-1.8B-Instruct

Text Generation • 2B • Updated Aug 6, 2025 • 293 • 228

updated a dataset 7 months ago

tencent/C3-BenchMark

Viewer • Updated Jul 1, 2025 • 256 • 229 • 6

New activity in tencent/C3-BenchMark 7 months ago

update

#4 opened 7 months ago by

jpy

Improve dataset card: Add task category, tags, and comprehensive details from GitHub

#3 opened 7 months ago by

nielsr

liked a dataset 7 months ago

tencent/C3-BenchMark

Viewer • Updated Jul 1, 2025 • 256 • 229 • 6

liked a model 7 months ago

tencent/Hunyuan-A13B-Instruct

Text Generation • 80B • Updated Aug 21, 2025 • 9.27k • 614

published a dataset 7 months ago

tencent/C3-BenchMark

Viewer • Updated Jul 1, 2025 • 256 • 229 • 6

New activity in tencent/C3-BenchMark 7 months ago

Upload 2 files

#1 opened 7 months ago by

jpy

authored a paper 10 months ago

Multi-Mission Tool Bench: Assessing the Robustness of LLM based Agents through Related and Dynamic Missions

Paper • 2504.02623 • Published Apr 3, 2025
