---
license: cc-by-nc-4.0
datasets:
- openai/gsm8k
language:
- en
base_model:
- Qwen/Qwen2.5-Math-1.5B
pipeline_tag: text-generation
library_name: transformers
tags:
- math
- qwen
- lora
- mathematics
- gsm8k
---

# OpenMath

**Fine-tuning a Small Language Model (SLM) for Step-by-Step Math Reasoning**

## Overview

OpenMath is an open-source project focused on fine-tuning a small language model for mathematical reasoning using parameter-efficient LoRA training.

This repository contains **only a LoRA adapter** trained on the full GSM8K dataset. Users must load the base model separately and attach the adapter using PEFT.

The latest version of this model was trained on an **AMD MI300X GPU using ROCm**, demonstrating that high-performance non-NVIDIA accelerators can support modern large language model fine-tuning with PyTorch and Hugging Face.

---

## Base Model

**Qwen/Qwen2.5-Math-1.5B**

This repository **does not contain the base model weights**; they must be loaded directly from Hugging Face before applying this LoRA adapter.

---

## Hardware Used (Latest Training Run)

- **GPU:** AMD MI300X (ROCm 7.0)
- **VRAM:** 192 GB
- **OS:** Ubuntu 24.04
- **Framework:** PyTorch + Hugging Face
- **Backend:** ROCm

---

## Dataset

**GSM8K (Grade School Math 8K)**

- **Training samples:** 7,473 (full training split)
- **Evaluation:** full GSM8K test split (1,319 problems)

Only the solution portion of each example was used for loss computation via loss masking, to encourage stronger reasoning behavior (see the loss-masking sketch in the appendix at the end of this card).

---

## Training Configuration

**Method:** LoRA (full precision, bfloat16)

**Precision:** bfloat16 (no 4-bit quantization in this run)

A configuration sketch is included in the appendix at the end of this card.

### LoRA settings

- Rank: 16
- Alpha: 32
- Dropout: 0.05
- Target modules: `q_proj`, `k_proj`, `v_proj`, `o_proj`

### Data & sequence

- Max sequence length: 1024

### Optimization

- Per-device batch size: 2
- Gradient accumulation: 8
- Effective batch size: 16
- Learning rate: 1e-4
- Optimizer: `adamw_torch`
- Scheduler: cosine
- Warmup: 5%

### Training

- **Epochs:** 3

---

## Results

**GSM8K Accuracy (Full Test Set):** 750 / 1,319 correct = **56.86%**

This is a substantial improvement over earlier small-scale Colab experiments and a strong result for a 1.5B model trained with LoRA on the full dataset.

---

## How to Use This Model

1. Load the base model **Qwen/Qwen2.5-Math-1.5B** from Hugging Face.
2. Attach this LoRA adapter using PEFT.
3. Use a structured prompt that includes an instruction, problem, and solution section for best results.

A minimal loading and inference sketch is provided in the appendix at the end of this card.

---

## Why This Matters

- Demonstrates that the **AMD MI300X** can effectively train modern LLMs with Hugging Face + LoRA.
- Shows strong math reasoning at **1.5B parameters** with lightweight fine-tuning.
- Provides a compact adapter instead of requiring users to download a massive full model.

---

## Limitations

- The model can make reasoning mistakes.
- It should not be used for exams, assignments, or professional decisions.
- Performance depends heavily on prompt formatting.

---

## License

**cc-by-nc-4.0**
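---

## Appendix: Code Sketches

### Loading and running the adapter

The snippet below is a minimal sketch of the workflow from "How to Use This Model": load the base model, attach the adapter with PEFT, and prompt with instruction, problem, and solution sections. The adapter ID placeholder and the exact section headers in the prompt are assumptions, not a verified training template; replace them with this repository's ID and your preferred formatting.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

BASE_MODEL = "Qwen/Qwen2.5-Math-1.5B"
ADAPTER_ID = "<this-repo-id>"  # placeholder: replace with this repository's Hub ID or a local path

tokenizer = AutoTokenizer.from_pretrained(BASE_MODEL)
base = AutoModelForCausalLM.from_pretrained(
    BASE_MODEL,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
model = PeftModel.from_pretrained(base, ADAPTER_ID)  # attach the LoRA adapter
model.eval()

# Structured prompt: instruction, problem, and an empty solution section for the model to complete.
# The exact headers are illustrative; the card only specifies that all three sections should be present.
prompt = (
    "### Instruction:\nSolve the following math problem step by step.\n\n"
    "### Problem:\nNatalia sold clips to 48 of her friends in April, and then she sold "
    "half as many clips in May. How many clips did Natalia sell altogether in April and May?\n\n"
    "### Solution:\n"
)

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
with torch.no_grad():
    output = model.generate(**inputs, max_new_tokens=256, do_sample=False)

# Print only the newly generated solution tokens.
print(tokenizer.decode(output[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```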
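### LoRA training configuration

This sketch mirrors the hyperparameters listed under "Training Configuration" using PEFT and `transformers.TrainingArguments`. The hyperparameter values come from this card; the surrounding wiring (output directory, logging settings, the trainer itself) is illustrative and not the exact script used for training.

```python
import torch
from transformers import AutoModelForCausalLM, TrainingArguments
from peft import LoraConfig, get_peft_model

base = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen2.5-Math-1.5B",
    torch_dtype=torch.bfloat16,  # no 4-bit quantization in this run
)

lora_config = LoraConfig(
    r=16,                    # rank
    lora_alpha=32,           # alpha
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(base, lora_config)
model.print_trainable_parameters()

training_args = TrainingArguments(
    output_dir="openmath-lora",       # hypothetical output path
    per_device_train_batch_size=2,
    gradient_accumulation_steps=8,    # effective batch size 2 x 8 = 16
    num_train_epochs=3,
    learning_rate=1e-4,
    lr_scheduler_type="cosine",
    warmup_ratio=0.05,                # 5% warmup
    optim="adamw_torch",
    bf16=True,
    logging_steps=10,
)
```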
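### Solution-only loss masking

The dataset section notes that only the solution portion of each example contributes to the loss. One common way to do this is to set the labels of the prompt tokens to `-100`, which PyTorch's cross-entropy loss ignores. The prompt template below is illustrative; the exact formatting used in training is not specified on this card.

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen2.5-Math-1.5B")

def build_example(problem: str, solution: str, max_length: int = 1024):
    """Tokenize a GSM8K example and mask the prompt so only solution tokens are supervised."""
    prompt = f"### Problem:\n{problem}\n\n### Solution:\n"  # illustrative template
    prompt_ids = tokenizer(prompt, add_special_tokens=False)["input_ids"]
    solution_ids = tokenizer(solution + tokenizer.eos_token, add_special_tokens=False)["input_ids"]

    input_ids = (prompt_ids + solution_ids)[:max_length]
    # -100 labels are ignored by the loss, so the prompt does not contribute to training.
    labels = ([-100] * len(prompt_ids) + solution_ids)[:max_length]
    return {
        "input_ids": input_ids,
        "attention_mask": [1] * len(input_ids),
        "labels": labels,
    }
```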
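### Scoring GSM8K accuracy

GSM8K references place the final answer after `####`, so accuracy is usually computed by comparing the final number in the model's output against that reference value. The exact extraction logic behind the reported 56.86% is not documented here; the helper below is one common convention.

```python
import re

def extract_final_number(text: str):
    """Return the final numeric value in a GSM8K reference or a model's output."""
    if "####" in text:
        text = text.split("####")[-1]
    numbers = re.findall(r"-?\d+(?:\.\d+)?", text.replace(",", ""))
    return numbers[-1] if numbers else None

def is_correct(prediction: str, reference: str) -> bool:
    pred = extract_final_number(prediction)
    ref = extract_final_number(reference)
    return pred is not None and ref is not None and float(pred) == float(ref)
```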