FlagEval

non-profit
https://flageval.baai.ac.cn/

AI & ML interests

None defined yet.

Recent Activity

ec-mo updated a dataset about 23 hours ago
FlagEval/ERQAPlus
philokey updated a dataset 7 months ago
FlagEval/coco_val2014_sampled
philokey authored a paper 7 months ago
Do Vision-Language Models Measure Up? Benchmarking Visual Measurement Reading with MeasureBench

Members: Richeng Xuan, Rowan, Bowen Qin, daiteng01, Gray, lixuejing, Zheqi He, jingshu, makarov, Xuannan Liu, Moyu

spaces 2

Running
6

FlagEval-Arena

🐢

Arena

Mar 18, 2025
Running
12

FlagEval-Debate

🐠

Display a debate interface

Mar 17, 2025

models 1

FlagEval/flageval_judgemodel

Text Generation • 33B • Updated Dec 30, 2024 • 10 • 1

datasets 13

FlagEval/ERQAPlus

Viewer • Updated about 23 hours ago • 800 • 47 • 1

FlagEval/coco_val2014_sampled

Viewer • Updated Nov 6, 2025 • 1k • 52

FlagEval/MeasureBench

Viewer • Updated Nov 3, 2025 • 2.44k • 458 • 1

FlagEval/EmbodiedVerse-Bench

Viewer • Updated Jun 25, 2025 • 2.04k • 162

FlagEval/Where2Place

Viewer • Updated May 29, 2025 • 100 • 315

FlagEval/SAT

Viewer • Updated May 6, 2025 • 150 • 55

FlagEval/HMMT_2025

Viewer • Updated May 6, 2025 • 30 • 192 • 1

FlagEval/ERQA

Viewer • Updated Apr 22, 2025 • 400 • 4.64k • 5

FlagEval/sub_spatial

Viewer • Updated Apr 21, 2025 • 690 • 14

FlagEval/EmbSpatial-Bench

Viewer • Updated Apr 21, 2025 • 3.64k • 3.58k • 5