👋 Open to Work

Zhimin Zhao PRO

zhiminy

458 15 340

https://zhimin-z.github.io

AI & ML interests

SE4AI, AI4SE, LLMOps, LLM4Code

Recent Activity

upvoted a paper 5 days ago

From Human-Centric to Agentic Code Review: The Impact of Different Generations of Generative AI Technology on Review Quality

updated a dataset 14 days ago

SWE-Arena/leaderboard_data

updated a dataset 14 days ago

SWE-Arena/vote_data

upvoted a paper 5 days ago

From Human-Centric to Agentic Code Review: The Impact of Different Generations of Generative AI Technology on Review Quality

Paper • 2607.13196 • Published 11 days ago • 29

updated 3 datasets 14 days ago

updated a Space 18 days ago

SWE-Agent-Arena

⚔

Agent arena for software engineering tasks

updated a dataset 18 days ago

SWE-Arena/cli_data

Updated 18 days ago • 105 • 1

liked a model about 1 month ago

SanDiegoDude/Cosmos3-Nano-nf4

Text-to-Image • 8B • Updated Jun 2 • 100 • 2

liked a Space about 1 month ago

Cosmos3 Action Viewer

🤖

Explore interactive visualizations with Viser

upvoted a collection about 1 month ago

Cosmos3

Collection

Omnimodal World Models for Physical AI • 20 items • Updated 1 day ago • 166

liked a model about 2 months ago

nvidia/Cosmos3-Super

65B • Updated 15 days ago • 97k • 213

upvoted 2 papers about 2 months ago

On the Workflows and Smells of Leaderboard Operations (LBOps): An Exploratory Study of Foundation Model Leaderboards

Paper • 2407.04065 • Published Jul 4, 2024 • 10

AgentDoG 1.5: A Lightweight and Scalable Alignment Framework for AI Agent Safety and Security

Paper • 2605.29801 • Published May 28 • 149

authored 2 papers about 2 months ago

On the Workflows and Smells of Leaderboard Operations (LBOps): An Exploratory Study of Foundation Model Leaderboards

Paper • 2407.04065 • Published Jul 4, 2024 • 10

Towards Evaluation Engineering: An Empirical Study of ML Evaluation Harnesses in the Wild

Paper • 2605.24213 • Published May 22 • 15

liked a dataset about 2 months ago

zhiminy/EvalEng

Viewer • Updated May 27 • 19.6k • 48 • 5

updated a dataset about 2 months ago

zhiminy/EvalEng

Viewer • Updated May 27 • 19.6k • 48 • 5

published a dataset about 2 months ago

zhiminy/EvalEng

Viewer • Updated May 27 • 19.6k • 48 • 5

upvoted a paper about 2 months ago

Towards Evaluation Engineering: An Empirical Study of ML Evaluation Harnesses in the Wild

Paper • 2605.24213 • Published May 22 • 15

submitted a paper to Daily Papers about 2 months ago

Towards Evaluation Engineering: An Empirical Study of ML Evaluation Harnesses in the Wild

Paper • 2605.24213 • Published May 22 • 15

upvoted a changelog 3 months ago

Hugging Face Changelog

Introducing Kernels

Apr 15 • 204
