🧮 NanoCalc-1M

NanoCalc-1M is an ultra-compact, character-level Seq2Seq Transformer based on the T5 architecture, trained specifically to perform arithmetic operations with high accuracy.

  • Architecture: T5-based Encoder-Decoder
  • Parameters: 0.99M
  • Precision: Mixed Precision (BF16/FP16)
  • Vocab: Character-level (0-9, +, -, *, /, =); see the vocabulary sketch after this list
  • Training Data: 2,000,000 synthetic samples (3-digit arithmetic)
  • Max Input Length: 20 tokens
  • Performance: ~97% Accuracy on 4-operation math (Validation Set)
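
For illustration, here is a minimal sketch of what such a character-level vocabulary and encoder could look like. The special tokens, the ID assignments, and the encode helper are assumptions for illustration, not the actual mapping used by use.py:

```python
# Hypothetical character-level vocabulary; the real mapping in use.py may differ.
PAD, BOS, EOS = "<pad>", "<bos>", "<eos>"
VOCAB = {tok: i for i, tok in enumerate([PAD, BOS, EOS] + list("0123456789+-*/="))}
ID_TO_CHAR = {i: tok for tok, i in VOCAB.items()}

def encode(expr: str) -> list[int]:
    """Map an expression like '74*9' to token IDs, framed by BOS/EOS."""
    return [VOCAB[BOS]] + [VOCAB[c] for c in expr] + [VOCAB[EOS]]

print(encode("74*9"))  # [1, 10, 7, 15, 12, 2] under this hypothetical mapping
```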

Performance Chart

Epoch | Training Loss | Val Accuracy | Status
------|---------------|--------------|------------------------
1     | 1.1420        | 54.89%       | 🔴 Learnt Format
2     | 0.3931        | 78.79%       | 🟡 Learnt Digits
5     | 0.1638        | 91.91%       | 🟢 Learning subtleties
9     | 0.1051        | 97.15%       | 🔵 High Precision
10    | 0.1004        | 97.73%       | 🚀 Near Perfect

How to use

To use this model, download model.pt and use.py, then run the script with Python 3 on any device.
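
use.py already contains everything needed, but for the curious, here is a rough sketch of the kind of greedy decoding loop such a script might run. The checkpoint layout, the model's forward signature, and the encode/VOCAB/ID_TO_CHAR helpers (from the vocabulary sketch above) are all assumptions:

```python
import torch

# Hypothetical loading code; assumes model.pt pickles the full nn.Module.
device = "cuda" if torch.cuda.is_available() else "cpu"
model = torch.load("model.pt", map_location=device, weights_only=False)
model.eval()

def solve(expr: str, max_len: int = 12) -> str:
    """Greedily decode the answer one character at a time.
    Uses encode/VOCAB/ID_TO_CHAR from the vocabulary sketch above."""
    src = torch.tensor([encode(expr)], device=device)
    out = [VOCAB[BOS]]
    with torch.no_grad():
        for _ in range(max_len):
            tgt = torch.tensor([out], device=device)
            logits = model(src, tgt)  # assumed encoder-decoder forward signature
            next_id = logits[0, -1].argmax().item()
            if next_id == VOCAB[EOS]:
                break
            out.append(next_id)
    return "".join(ID_TO_CHAR[i] for i in out[1:])

print(solve("74*9"))  # expected: 666
```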

Examples

```
Model loaded (Accuracy: 97.73% from epoch 10)

--- Mini Math Model interactive ---
Enter an arithmetic task (e.g. 15*15) or type 'exit' to quit this.

Task > 0*567      Model: 0     | Correct: 0        ✅
Task > 999+999    Model: 1998  | Correct: 1998     ✅
Task > 1/1        Model: 1     | Correct: 1        ✅
Task > 1684*8787  Model: 6398  | Correct: 14797308 ❌
Task > 124*598    Model: 2452  | Correct: 74152    ❌
Task > 12/68      Model: 4     | Correct: 0        ❌
Task > 123*123    Model: 499   | Correct: 15129    ❌
Task > 47*5       Model: 235   | Correct: 235      ✅
Task > 456+125    Model: 581   | Correct: 581      ✅
Task > 957-234    Model: 723   | Correct: 723      ✅
Task > 120-7650   Model: -550  | Correct: -7530    ❌
Task > 450-750    Model: -300  | Correct: -300     ✅
Task > 453-97     Model: 356   | Correct: 356      ✅
Task > 129-462    Model: -333  | Correct: -333     ✅
Task > 8*8        Model: 64    | Correct: 64       ✅
Task > 54*54      Model: 2916  | Correct: 2916     ✅
Task > 102*78     Model: 748   | Correct: 7956     ❌
Task > 74*9       Model: 666   | Correct: 666      ✅
Task > 103-34     Model: 69    | Correct: 69       ✅
```

Overall accuracy

The overall accuracy after 10 epochs of training is ~97% (97.73% on the validation set) for tasks whose operands have at most 3 digits, such as 74*9 or 103-34.
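
A sketch of how such an exact-match accuracy can be measured. The task distribution (uniform operands of up to 3 digits, integer division, as the 12/68 -> 0 example above suggests) is an assumption, and solve() is the hypothetical helper from the usage sketch above:

```python
import random

def random_task() -> tuple[str, str]:
    """Random task with operands of at most 3 digits; '/' is treated as
    integer division, mirroring the 12/68 -> 0 example above."""
    a, b = random.randint(0, 999), random.randint(0, 999)
    op = random.choice("+-*/")
    if op == "/":
        b = max(b, 1)  # avoid division by zero
    answer = {"+": a + b, "-": a - b, "*": a * b, "/": a // b}[op]
    return f"{a}{op}{b}", str(answer)

hits, n = 0, 1000
for _ in range(n):
    task, answer = random_task()
    hits += solve(task) == answer  # solve() from the usage sketch above
print(f"exact-match accuracy: {hits / n:.2%}")
```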

Limitations

The model can't handle:

  • Tasks with operands of more than 3 digits, like 3984-125
  • Multiplication tasks with numbers above 99, like 293*21
  • More complex expressions, such as ones with several operators

Training

We trained for 10 epochs (~20 minutes on 2x T4 GPUs on Kaggle) with 2 million randomly generated training samples.
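
The exact generation scheme is not published; assuming tasks like those produced by the random_task helper sketched above, dumping such a dataset could look like this:

```python
# Hypothetical dataset dump; reuses random_task() from the accuracy sketch above.
with open("train.txt", "w") as f:
    for _ in range(2_000_000):
        task, answer = random_task()
        f.write(f"{task}={answer}\n")
```

At character level each sample is just a short string, so 2 million samples amount to only a few tens of megabytes.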

Final thoughts

We may release an improved version of this model that can solve really complex tasks and much more... stay tuned!
