---
license: cc-by-nc-4.0
datasets:
- openai/gsm8k
language:
- en
base_model:
- Qwen/Qwen2.5-Math-1.5B
pipeline_tag: text-generation
library_name: transformers
tags:
- math
- qwen
- lora
- mathematics
- gsm8k
---

# OpenMath

**Fine-tuning a Small Language Model (SLM) for Step-by-Step Math Reasoning**

## Overview

OpenMath is an open-source project focused on fine-tuning a small language model for mathematical reasoning using parameter-efficient LoRA training.

This repository contains **only a LoRA adapter** trained on the full GSM8K dataset. Users must load the base model separately and attach the adapter using PEFT.

The latest version of this model was trained on an **AMD MI300X GPU using ROCm**, demonstrating that high-performance non-NVIDIA accelerators can support modern large language model fine-tuning with PyTorch and Hugging Face.

---

## Base Model

**Qwen/Qwen2.5-Math-1.5B**

This repository **does not contain the base model weights**; they must be loaded directly from Hugging Face before applying this LoRA adapter.

---

## Hardware Used (Latest Training Run)

- **GPU:** AMD MI300X (ROCm 7.0)
- **VRAM:** 192 GB
- **OS:** Ubuntu 24.04
- **Framework:** PyTorch + Hugging Face
- **Backend:** ROCm

---

## Dataset

**GSM8K (Grade School Math 8K)**

- **Training samples:** 7,473 (full training split)
- **Evaluation:** full GSM8K test split (1,319 problems)

Only the solution portion of each example was used for loss computation via loss masking, to encourage stronger reasoning behavior (see the loss-masking sketch in the appendix at the end of this card).

---

## Training Configuration

**Method:** LoRA (full precision, bfloat16)

**Precision:** bfloat16 (no 4-bit quantization in this run)

A configuration sketch is included in the appendix at the end of this card.

### LoRA settings

- Rank: 16
- Alpha: 32
- Dropout: 0.05
- Target modules: `q_proj`, `k_proj`, `v_proj`, `o_proj`

### Data & sequence

- Max sequence length: 1024

### Optimization

- Per-device batch size: 2
- Gradient accumulation: 8
- Effective batch size: 16
- Learning rate: 1e-4
- Optimizer: `adamw_torch`
- Scheduler: cosine
- Warmup: 5%

### Training

- **Epochs:** 3

---

## Results

**GSM8K Accuracy (Full Test Set):** 750 / 1,319 correct = **56.86%**

This is a substantial improvement over earlier small-scale Colab experiments and a strong result for a 1.5B model trained with LoRA on the full dataset.

---

## How to Use This Model

1. Load the base model **Qwen/Qwen2.5-Math-1.5B** from Hugging Face.
2. Attach this LoRA adapter using PEFT.
3. Use a structured prompt that includes an instruction, problem, and solution section for best results.

A minimal loading and inference sketch is provided in the appendix at the end of this card.

---

## Why This Matters

- Demonstrates that the **AMD MI300X** can effectively train modern LLMs with Hugging Face + LoRA.
- Shows strong math reasoning at **1.5B parameters** with lightweight fine-tuning.
- Provides a compact adapter instead of requiring users to download a massive full model.

---

## Limitations

- The model can make reasoning mistakes.
- It should not be used for exams, assignments, or professional decisions.
- Performance depends heavily on prompt formatting.

---

## License

**cc-by-nc-4.0**
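---

## Appendix: Code Sketches

### Loading and running the adapter

The snippet below is a minimal sketch of the workflow from "How to Use This Model": load the base model, attach the adapter with PEFT, and prompt with instruction, problem, and solution sections. The adapter ID placeholder and the exact section headers in the prompt are assumptions, not a verified training template; replace them with this repository's ID and your preferred formatting.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

BASE_MODEL = "Qwen/Qwen2.5-Math-1.5B"
ADAPTER_ID = "<this-repo-id>"  # placeholder: replace with this repository's Hub ID or a local path

tokenizer = AutoTokenizer.from_pretrained(BASE_MODEL)
base = AutoModelForCausalLM.from_pretrained(
    BASE_MODEL,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
model = PeftModel.from_pretrained(base, ADAPTER_ID)  # attach the LoRA adapter
model.eval()

# Structured prompt: instruction, problem, and an empty solution section for the model to complete.
# The exact headers are illustrative; the card only specifies that all three sections should be present.
prompt = (
    "### Instruction:\nSolve the following math problem step by step.\n\n"
    "### Problem:\nNatalia sold clips to 48 of her friends in April, and then she sold "
    "half as many clips in May. How many clips did Natalia sell altogether in April and May?\n\n"
    "### Solution:\n"
)

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
with torch.no_grad():
    output = model.generate(**inputs, max_new_tokens=256, do_sample=False)

# Print only the newly generated solution tokens.
print(tokenizer.decode(output[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```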
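### LoRA training configuration

This sketch mirrors the hyperparameters listed under "Training Configuration" using PEFT and `transformers.TrainingArguments`. The hyperparameter values come from this card; the surrounding wiring (output directory, logging settings, the trainer itself) is illustrative and not the exact script used for training.

```python
import torch
from transformers import AutoModelForCausalLM, TrainingArguments
from peft import LoraConfig, get_peft_model

base = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen2.5-Math-1.5B",
    torch_dtype=torch.bfloat16,  # no 4-bit quantization in this run
)

lora_config = LoraConfig(
    r=16,                    # rank
    lora_alpha=32,           # alpha
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(base, lora_config)
model.print_trainable_parameters()

training_args = TrainingArguments(
    output_dir="openmath-lora",       # hypothetical output path
    per_device_train_batch_size=2,
    gradient_accumulation_steps=8,    # effective batch size 2 x 8 = 16
    num_train_epochs=3,
    learning_rate=1e-4,
    lr_scheduler_type="cosine",
    warmup_ratio=0.05,                # 5% warmup
    optim="adamw_torch",
    bf16=True,
    logging_steps=10,
)
```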
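### Solution-only loss masking

The dataset section notes that only the solution portion of each example contributes to the loss. One common way to do this is to set the labels of the prompt tokens to `-100`, which PyTorch's cross-entropy loss ignores. The prompt template below is illustrative; the exact formatting used in training is not specified on this card.

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen2.5-Math-1.5B")

def build_example(problem: str, solution: str, max_length: int = 1024):
    """Tokenize a GSM8K example and mask the prompt so only solution tokens are supervised."""
    prompt = f"### Problem:\n{problem}\n\n### Solution:\n"  # illustrative template
    prompt_ids = tokenizer(prompt, add_special_tokens=False)["input_ids"]
    solution_ids = tokenizer(solution + tokenizer.eos_token, add_special_tokens=False)["input_ids"]

    input_ids = (prompt_ids + solution_ids)[:max_length]
    # -100 labels are ignored by the loss, so the prompt does not contribute to training.
    labels = ([-100] * len(prompt_ids) + solution_ids)[:max_length]
    return {
        "input_ids": input_ids,
        "attention_mask": [1] * len(input_ids),
        "labels": labels,
    }
```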
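### Scoring GSM8K accuracy

GSM8K references place the final answer after `####`, so accuracy is usually computed by comparing the final number in the model's output against that reference value. The exact extraction logic behind the reported 56.86% is not documented here; the helper below is one common convention.

```python
import re

def extract_final_number(text: str):
    """Return the final numeric value in a GSM8K reference or a model's output."""
    if "####" in text:
        text = text.split("####")[-1]
    numbers = re.findall(r"-?\d+(?:\.\d+)?", text.replace(",", ""))
    return numbers[-1] if numbers else None

def is_correct(prediction: str, reference: str) -> bool:
    pred = extract_final_number(prediction)
    ref = extract_final_number(reference)
    return pred is not None and ref is not None and float(pred) == float(ref)
```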