# 🧮 NanoCalc-1M

NanoCalc-1M is an ultra-compact, character-level seq2seq Transformer based on the T5 architecture, trained specifically to perform arithmetic operations with high precision.
- Architecture: T5-based Encoder-Decoder
- Parameters: 0.99M
- Precision: Mixed Precision (BF16/FP16)
- Vocab: Character-level (0-9, +, -, *, /, =)
- Training Data: 2,000,000 synthetic samples (3-digit arithmetic)
- Max Input Length: 20 tokens
- Performance: ~97% Accuracy on 4-operation math (Validation Set)
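
A character-level vocabulary like the one listed above can be sketched in a few lines. This is a hypothetical illustration of how such a tokenizer might work, not the model's actual tokenizer code (the real vocabulary likely also contains special tokens such as padding and end-of-sequence):

```python
# Hypothetical character-level vocabulary matching the token set above
# (digits 0-9 plus the operators +, -, *, / and =); illustration only.
VOCAB = list("0123456789+-*/=")
CHAR_TO_ID = {ch: i for i, ch in enumerate(VOCAB)}
ID_TO_CHAR = {i: ch for ch, i in CHAR_TO_ID.items()}

def encode(expr: str) -> list[int]:
    """Map an arithmetic string to token IDs, one ID per character."""
    return [CHAR_TO_ID[ch] for ch in expr]

def decode(ids: list[int]) -> str:
    """Map token IDs back to the original string."""
    return "".join(ID_TO_CHAR[i] for i in ids)

print(encode("1+1"))                 # [1, 10, 1]
print(decode(encode("74*9=666")))    # 74*9=666
```

With only 15 characters to cover, every task fits comfortably inside the 20-token input limit.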
## Performance Chart
| Epoch | Training Loss | Val Accuracy | Status |
|---|---|---|---|
| 1 | 1.1420 | 54.89% | 🔴 Learned format |
| 2 | 0.3931 | 78.79% | 🟡 Learned digits |
| 5 | 0.1638 | 91.91% | 🟢 Learning subtleties |
| 9 | 0.1051 | 97.15% | 🔵 High precision |
| 10 | 0.1004 | 97.73% | 🌟 Near perfect |
## How to use

To use this model, download `model.pt` and `use.py`, then run the script with Python 3 on any device.
## Examples

```
Model loaded (Accuracy: 97.73% from epoch 10)
--- Mini Math Model interactive ---
Enter an arithmetic task (e.g. 15*15) or type 'exit' to quit.
Task > 0*567
Model: 0 | Correct: 0 ✅
Task > 999+999
Model: 1998 | Correct: 1998 ✅
Task > 1/1
Model: 1 | Correct: 1 ✅
Task > 1684*8787
Model: 6398 | Correct: 14797308 ❌
Task > 124*598
Model: 2452 | Correct: 74152 ❌
Task > 12/68
Model: 4 | Correct: 0 ❌
Task > 123*123
Model: 499 | Correct: 15129 ❌
Task > 47*5
Model: 235 | Correct: 235 ✅
Task > 456+125
Model: 581 | Correct: 581 ✅
Task > 957-234
Model: 723 | Correct: 723 ✅
Task > 120-7650
Model: -550 | Correct: -7530 ❌
Task > 450-750
Model: -300 | Correct: -300 ✅
Task > 453-97
Model: 356 | Correct: 356 ✅
Task > 129-462
Model: -333 | Correct: -333 ✅
Task > 8*8
Model: 64 | Correct: 64 ✅
Task > 54*54
Model: 2916 | Correct: 2916 ✅
Task > 102*78
Model: 748 | Correct: 7956 ❌
Task > 74*9
Model: 666 | Correct: 666 ✅
Task > 103-34
Model: 69 | Correct: 69 ✅
```
## Overall accuracy

After 10 epochs of training, overall accuracy is ~97% on tasks where each operand has at most 3 digits, such as 74*9 or 103-34.
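
Accuracy here means exact-match on the predicted answer string. As a hypothetical illustration (not the actual evaluation script), the metric can be computed from (task, prediction, ground truth) triples like those in the transcript above:

```python
# Hypothetical exact-match accuracy computation over a few transcript
# results from above; the real validation set has far more samples.
results = [
    ("0*567",    "0",    "0"),      # correct
    ("999+999",  "1998", "1998"),   # correct
    ("47*5",     "235",  "235"),    # correct
    ("120-7650", "-550", "-7530"),  # wrong (operand exceeds 3 digits)
]
correct = sum(pred == truth for _, pred, truth in results)
accuracy = correct / len(results)
print(f"Accuracy: {accuracy:.2%}")  # Accuracy: 75.00%
```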
## Limitations

The model can't handle:

- Tasks with operands of more than 3 digits, e.g. `3984-125`
- Multiplication tasks with numbers above 99, e.g. `293*21`
- Complex tasks
## Training

We trained for 10 epochs (~20 minutes on a Kaggle 2x T4 instance) with 2 million randomly generated training samples.
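
A sample generator for this kind of data could look like the sketch below. The actual data pipeline was not released, so the operator handling here is an assumption; integer division with a nonzero divisor is inferred from the transcript above, where the expected answer for `12/68` is `0`:

```python
import random

def make_sample(rng: random.Random) -> tuple[str, str]:
    """Hypothetical generator for one 3-digit arithmetic training pair.

    Assumptions (not confirmed by the model card): operands are drawn
    uniformly from 0-999, division is integer division, and a zero
    divisor is bumped to 1 to keep every task well-defined.
    """
    op = rng.choice("+-*/")
    a = rng.randint(0, 999)
    b = rng.randint(0, 999)
    if op == "+":
        ans = a + b
    elif op == "-":
        ans = a - b
    elif op == "*":
        ans = a * b
    else:
        b = max(b, 1)   # avoid division by zero
        ans = a // b    # integer division, matching 12/68 -> 0 above
    return f"{a}{op}{b}", str(ans)

rng = random.Random(0)
samples = [make_sample(rng) for _ in range(5)]
print(samples)
```

Repeating this 2,000,000 times yields a corpus of the size described above, where every task and answer is expressible in the 15-character vocabulary.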
## Final thoughts

We may release an improved version that can solve much more complex tasks... stay tuned!