A small educational transformer model trained from scratch for text generation tasks.
MiniTransformer is a compact transformer architecture designed for educational purposes and experimentation. The model is trained on question-answer pairs with various system prompts to demonstrate fundamental transformer capabilities.
This is an educational model - it's designed to help understand transformer architectures and training processes, not for production use.
torch.compile() optimization
import torch
from transformers import AutoTokenizer
# Load tokenizer
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
# Load model (you'll need to download the checkpoint)
# model = MiniTransformer(...)
# model.load_state_dict(torch.load("checkpoint.pt"))
# Generate text
input_text = "Your prompt here"
input_ids = tokenizer.encode(input_text, return_tensors="pt")
# Generation code here
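
The generation step above is left as a placeholder. As a minimal sketch of what it could look like, the following shows greedy decoding: repeatedly score the next token, append the highest-scoring one, and stop at an end-of-sequence id or a length limit. It is written in plain Python so it runs without a checkpoint. The names `greedy_generate`, `next_token_logits`, and `toy_logits` are hypothetical and not part of MiniTransformer, and the model's actual generation method may differ.

```python
from typing import Callable, List, Optional, Sequence


def greedy_generate(
    next_token_logits: Callable[[Sequence[int]], Sequence[float]],
    prompt_ids: Sequence[int],
    max_new_tokens: int = 20,
    eos_id: Optional[int] = None,
) -> List[int]:
    """Append the highest-scoring token until eos_id or the length limit."""
    ids = list(prompt_ids)
    for _ in range(max_new_tokens):
        # One score per vocabulary entry for the next position.
        logits = next_token_logits(ids)
        # Greedy choice: index of the largest score (first one wins on ties).
        next_id = max(range(len(logits)), key=logits.__getitem__)
        ids.append(next_id)
        if eos_id is not None and next_id == eos_id:
            break
    return ids


# Hypothetical stand-in for a model: always favors (last token + 1) mod vocab size.
def toy_logits(ids: Sequence[int], vocab_size: int = 5) -> List[float]:
    target = (ids[-1] + 1) % vocab_size
    return [1.0 if i == target else 0.0 for i in range(vocab_size)]


generated = greedy_generate(toy_logits, prompt_ids=[0], max_new_tokens=10, eos_id=4)
print(generated)
```

With a real checkpoint, `next_token_logits` would wrap a forward pass, for example `lambda ids: model(torch.tensor([ids]))[0, -1].tolist()`. This assumes the model returns logits of shape (batch, sequence, vocabulary), which the source does not state. The token id lists can be converted back to text with `tokenizer.decode`.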