Add model-index with benchmark evaluations

#4
by davidlms - opened

Added structured evaluation results to the model-index metadata, taken from the benchmark image:

  • SimpleQA: 8.90
  • MUSR: 63.49
  • MMLU (Zero Shot): 84.95
  • Math-500: 92.10
  • GPQA-Diamond: 58.55
  • BFCL V3: 59.67

This enables the model to appear in leaderboards and makes it easier to compare with other models.
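
For reference, here is a minimal sketch of how the scores above could map onto the nested `model-index` structure that the Hub reads from the README.md metadata. The PR does not state the model name, task type, metric type, or dataset identifiers, so `example-model`, `text-generation`, `accuracy`, and the slugified dataset types are placeholder assumptions, and `build_model_index` is a hypothetical helper, not part of any library.

```python
import json
import re

# Scores copied from the list in this PR description.
SCORES = {
    "SimpleQA": 8.90,
    "MUSR": 63.49,
    "MMLU (Zero Shot)": 84.95,
    "Math-500": 92.10,
    "GPQA-Diamond": 58.55,
    "BFCL V3": 59.67,
}


def build_model_index(model_name, scores,
                      task_type="text-generation",  # assumption: not stated in the PR
                      metric_type="accuracy"):      # assumption: not stated in the PR
    """Return a dict shaped like the model-index metadata block."""
    results = []
    for name, value in scores.items():
        # Placeholder dataset identifier derived from the display name.
        slug = re.sub(r"[^a-z0-9]+", "-", name.lower()).strip("-")
        results.append({
            "task": {"type": task_type},
            "dataset": {"name": name, "type": slug},
            "metrics": [{"type": metric_type, "value": value}],
        })
    return {"model-index": [{"name": model_name, "results": results}]}


if __name__ == "__main__":
    # Printed as JSON for inspection; the README front matter itself is YAML.
    print(json.dumps(build_model_index("example-model", SCORES), indent=2))
```

Each benchmark becomes one entry under `results`, which is what lets leaderboards pick up the values individually. The metric type in particular should be checked per benchmark before merging, since not every score listed here is necessarily a plain accuracy.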

Cannot merge
This branch has merge conflicts in the following files:
  • README.md
