ai4s-r2
community
AI & ML interests
None defined yet.
Recent Activity
amphora submitted a paper 1 day ago
Soohak: A Mathematician-Curated Benchmark for Evaluating Research-level Math Capabilities of LLMs
amphora submitted a paper 3 months ago
Judging What We Cannot Solve: A Consequence-Based Approach for Oracle-Free Evaluation of Research-Level Math
amphora authored a paper 12 months ago
When AI Co-Scientists Fail: SPOT-a Benchmark for Automated Verification of Scientific Research
Team members: 1
ai4s-r2's models
None public yet