UnstableBaselines

community
https://github.com/LeonGuertler/UnstableBaselines
Recent Activity

tim-grams  updated a model 5 days ago
UnstableBaselines/Qwen3-4B-Base-15-Environments-v0
tim-grams  updated a model 5 days ago
UnstableBaselines/Qwen3-4B-Base-12-Environments-v0

Team members: Tim Grams, Bobby Cheng
Organization Card

An Async, Online, Multi-Turn, Multi-Agent RL library for training reasoning models on TextArena games.

models 23

UnstableBaselines/Qwen3-4B-Base-15-Environments-v0

Updated 5 days ago • 58

UnstableBaselines/Qwen3-4B-Base-12-Environments-v0

Updated 5 days ago • 80

UnstableBaselines/Qwen3-4B-Base-10-Environments-v0

Updated 5 days ago • 73

UnstableBaselines/Qwen3-4B-Base-7-Environments-v0

Updated 5 days ago • 98

UnstableBaselines/Qwen3-4B-Base-7-Environments-v1

Updated 5 days ago • 109

UnstableBaselines/Qwen3-4B-Base-Briscola-v0-train

Updated 5 days ago • 85

UnstableBaselines/Qwen3-4B-Base-Othello-v0-train

Updated 7 days ago • 113

UnstableBaselines/Qwen3-4B-Base-PigDice-v0-train

Updated 7 days ago • 128

UnstableBaselines/Qwen3-4B-Base-KuhnPoker-v0-train

Updated 7 days ago • 120

UnstableBaselines/Qwen3-4B-Base-IndianPoker-v0-train

Updated 8 days ago • 127

datasets 1

UnstableBaselines/trajectories-twodollar-v0-train

Updated Oct 1, 2025 • 41.1k • 10
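
The repository names listed above appear to follow a regular pattern, so a short script can split them into parts and build their Hub URLs. The sketch below is a minimal illustration using only the Python standard library. The naming pattern (`<base model>-<variant>-v<version>` with an optional `-train` suffix) is an assumption inferred from the names on this page, not a documented convention, and `parse_checkpoint`, `hub_url`, and `Checkpoint` are hypothetical helper names.

```python
import re
from typing import NamedTuple, Optional

HUB = "https://huggingface.co"


class Checkpoint(NamedTuple):
    repo_id: str
    base_model: str
    variant: str  # a single environment name, or e.g. "7-Environments"
    version: int


# Assumed pattern, inferred only from the names listed on this page:
#   <org>/<base model>-<variant>-v<version>[-train]
_PATTERN = re.compile(
    r"^(?P<org>[^/]+)/(?P<base>Qwen3-4B-Base)-(?P<variant>.+)"
    r"-v(?P<version>\d+)(?:-train)?$"
)


def parse_checkpoint(repo_id: str) -> Optional[Checkpoint]:
    """Split a model repo ID into its parts, or return None if it does not match."""
    m = _PATTERN.match(repo_id)
    if m is None:
        return None
    return Checkpoint(repo_id, m["base"], m["variant"], int(m["version"]))


def hub_url(repo_id: str, repo_type: str = "model") -> str:
    """Build the Hub page URL; dataset repos live under the /datasets/ prefix."""
    prefix = "datasets/" if repo_type == "dataset" else ""
    return f"{HUB}/{prefix}{repo_id}"


if __name__ == "__main__":
    for rid in (
        "UnstableBaselines/Qwen3-4B-Base-KuhnPoker-v0-train",
        "UnstableBaselines/Qwen3-4B-Base-7-Environments-v1",
    ):
        print(parse_checkpoint(rid), hub_url(rid))
    print(hub_url("UnstableBaselines/trajectories-twodollar-v0-train", "dataset"))
```

Names that do not fit the assumed pattern, such as the dataset repository, make `parse_checkpoint` return `None` rather than a guessed result.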