---
license: cc-by-nc-4.0
datasets:
- openai/gsm8k
language:
- en
base_model:
- Qwen/Qwen2.5-Math-1.5B
pipeline_tag: text-generation
library_name: transformers
tags:
- math
- qwen
- lora
- mathematics
- gsm8k
---

# OpenMath  
Fine-tuning a Small Language Model (SLM) for Step-by-Step Math Reasoning  

## Overview  
OpenMath is an open-source project focused on fine-tuning a small language model for mathematical reasoning using parameter-efficient LoRA training.  

This repository contains **only a LoRA adapter** trained on the full GSM8K dataset. Users must load the base model separately and attach the adapter using PEFT.  

The latest version of this model was trained on an **AMD MI300X GPU using ROCm**, demonstrating that high-performance non-NVIDIA accelerators can successfully support modern large language model fine-tuning with PyTorch and Hugging Face.  

---

## Base Model  
**Qwen/Qwen2.5-Math-1.5B**  

This repository **does not contain the base model weights** — they must be loaded directly from Hugging Face before applying this LoRA adapter.  

---

## Hardware Used (Latest Training Run)  

- **GPU:** AMD MI300X (ROCm 7.0)  
- **VRAM:** 192 GB  
- **OS:** Ubuntu 24.04  
- **Framework:** PyTorch + Hugging Face  
- **Backend:** ROCm  

---

## Dataset  

**GSM8K (Grade School Math 8K)**  
- **Training samples:** 7,473 (full training split)  
- **Evaluation:** Full GSM8K test split (1,319 problems)  

Loss was computed only on the solution portion of each example; the prompt tokens were masked out of the loss so that training focuses on the step-by-step reasoning rather than on reproducing the problem statement.  
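
The preprocessing code is not part of this adapter repo. A minimal sketch of this style of loss masking, assuming a simple prompt/solution split and the Hugging Face convention of `-100` for ignored labels, could look like the following (the function name and prompt format are illustrative):

```python
from transformers import AutoTokenizer

# Illustrative only: the actual preprocessing script is not included in this repo.
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen2.5-Math-1.5B")

def build_example(problem: str, solution: str, max_len: int = 1024):
    prompt = f"Problem:\n{problem}\n\nSolution:\n"
    prompt_ids = tokenizer(prompt, add_special_tokens=False)["input_ids"]
    solution_ids = tokenizer(solution + tokenizer.eos_token,
                             add_special_tokens=False)["input_ids"]

    input_ids = (prompt_ids + solution_ids)[:max_len]
    # Mask the prompt tokens: -100 is ignored by the cross-entropy loss,
    # so gradients only flow from the solution (reasoning) tokens.
    labels = ([-100] * len(prompt_ids) + solution_ids)[:max_len]
    return {
        "input_ids": input_ids,
        "attention_mask": [1] * len(input_ids),
        "labels": labels,
    }
```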

---

## Training Configuration  

**Method:** LoRA (parameter-efficient fine-tuning)  
**Precision:** bfloat16 (no 4-bit quantization in this run)  

A configuration sketch reproducing these settings follows the lists below.

### LoRA settings  
- Rank: 16  
- Alpha: 32  
- Dropout: 0.05  
- Target modules: `q_proj`, `k_proj`, `v_proj`, `o_proj`  

### Data & sequence  
- Max sequence length: 1024  

### Optimization  
- Per-device batch size: 2  
- Gradient accumulation: 8  
- Effective batch size: 16  
- Learning rate: 1e-4  
- Optimizer: `adamw_torch`  
- Scheduler: cosine  
- Warmup: 5%  

### Training  
- **Epochs:** 3  
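
The hyperparameters above map directly onto `peft` and `transformers` configuration objects. The sketch below is reconstructed from the listed settings and is not the exact training script; the output directory name is a placeholder:

```python
import torch
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM, TrainingArguments

model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen2.5-Math-1.5B", torch_dtype=torch.bfloat16
)

# LoRA settings listed above: rank 16, alpha 32, dropout 0.05,
# attention projections only.
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)

# Optimization settings listed above (effective batch size 2 x 8 = 16).
training_args = TrainingArguments(
    output_dir="openmath-lora",          # placeholder
    per_device_train_batch_size=2,
    gradient_accumulation_steps=8,
    learning_rate=1e-4,
    optim="adamw_torch",
    lr_scheduler_type="cosine",
    warmup_ratio=0.05,
    num_train_epochs=3,
    bf16=True,
)
```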

---

## Results  

**GSM8K Accuracy (Full Test Set):**  
750 / 1,319 correct = **56.86%**  

This represents a substantial improvement over earlier small-scale Colab experiments and is a strong result for a 1.5B model trained with LoRA on the full dataset.  
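
The evaluation harness is not included in this repository. A minimal sketch of how GSM8K accuracy is commonly scored, comparing the last number in each generated solution against the reference answer after the `####` marker, is shown below; all names are illustrative:

```python
import re

def extract_last_number(text: str):
    """Return the last number appearing in a string, or None."""
    matches = re.findall(r"-?\d[\d,]*(?:\.\d+)?", text)
    return matches[-1].replace(",", "") if matches else None

def gsm8k_accuracy(predictions, references):
    """predictions: generated solutions; references: GSM8K 'answer' fields."""
    correct = 0
    for pred, ref in zip(predictions, references):
        gold = ref.split("####")[-1].strip().replace(",", "")
        correct += int(extract_last_number(pred) == gold)
    return correct / len(references)

# e.g. 750 correct out of 1,319 problems -> 0.5686 (56.86%)
```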

---

## GSM8K Accuracy Comparison

| Model        | Accuracy (%) |
|--------------|--------------|
| Llama2-70B   | 56.8 |
| Gemma-7B     | 46.4 |
| Mistral-7B   | 52.2 |
| Llama-3-8B   | 56.0 |
| Qwen1.5-7B   | 62.5 |
| **OpenMath** | **56.86** |




![image](https://cdn-uploads.huggingface.co/production/uploads/677ac3710c9718b04aac4c1f/T5Gq4euiqRXOZkEKH58_P.png)

---

## How to Use This Model  

1. Load the base model **Qwen/Qwen2.5-Math-1.5B** from Hugging Face.  
2. Attach this LoRA adapter using PEFT.  
3. Use a structured prompt with instruction, problem, and solution sections for best results (see the sketch below).  
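
A minimal loading and generation sketch is shown below. The adapter repo id and the prompt template are illustrative placeholders; the exact prompt format used during training may differ:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "Qwen/Qwen2.5-Math-1.5B"
adapter_id = "path/or/repo-of-this-adapter"  # placeholder: use this repo's id

tokenizer = AutoTokenizer.from_pretrained(base_id)
model = AutoModelForCausalLM.from_pretrained(base_id, torch_dtype=torch.bfloat16)
model = PeftModel.from_pretrained(model, adapter_id)  # attach the LoRA adapter
model.eval()

# Illustrative structured prompt: instruction, problem, and solution sections.
prompt = (
    "Solve the following math problem step by step.\n\n"
    "Problem:\nNatalia sold clips to 48 of her friends in April, and then "
    "she sold half as many clips in May. How many clips did Natalia sell "
    "altogether in April and May?\n\n"
    "Solution:\n"
)

inputs = tokenizer(prompt, return_tensors="pt")
with torch.no_grad():
    output = model.generate(**inputs, max_new_tokens=512)

# Print only the newly generated solution tokens.
print(tokenizer.decode(output[0][inputs["input_ids"].shape[1]:],
                       skip_special_tokens=True))
```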

---

## Why This Matters  

- Demonstrates that **AMD MI300X** can effectively train modern LLMs with Hugging Face + LoRA.  
- Shows strong math reasoning at **1.5B parameters** with lightweight fine-tuning.  
- Provides a compact adapter instead of requiring users to download a massive full model.  

---

## Limitations  

- The model can make reasoning mistakes.  
- It should not be used for exams, assignments, or professional decisions.  
- Performance depends heavily on prompt formatting.  

---

## License  

**cc-by-nc-4.0**