(v2 train scripts: RoPE Positional Encoding)

SpiceeChat - Train From Scratch via Flash3 Kernels

A custom TinyGPT trained from scratch using a BPE tokenizer and causal LM pipeline.

Model Details

Property          Value
Architecture      TinyGPT (custom GPT-style)
Layers            4
Heads             4
Hidden size       384
Context length    128 tokens
Vocab size        32,768
Attention         Torch (T4 compatible)
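
As a rough sanity check on model size, the sketch below estimates the parameter count from the values in the table. It assumes a standard GPT block (4·d² attention weights plus 8·d² weights for a 4x-wide MLP), tied input/output embeddings by default, and no learned positional table because the v2 scripts use RoPE. None of these assumptions is confirmed by train.py, biases and norm parameters are ignored, and `estimate_params` is a hypothetical helper, not part of the repository.

```python
# Rough parameter estimate from the Model Details table.
# Assumptions (not confirmed by train.py): standard GPT block layout,
# 4x MLP expansion, no learned positional embeddings (RoPE), and
# biases / norm parameters ignored.
def estimate_params(vocab_size, n_layer, n_embd, tied_embeddings=True):
    embed = vocab_size * n_embd        # token embedding matrix
    attn = 4 * n_embd * n_embd         # Q, K, V and output projections
    mlp = 8 * n_embd * n_embd          # up- and down-projection at 4x width
    blocks = n_layer * (attn + mlp)
    # An untied output head adds a second vocab_size x n_embd matrix.
    head = 0 if tied_embeddings else vocab_size * n_embd
    return embed + blocks + head


total = estimate_params(vocab_size=32768, n_layer=4, n_embd=384)
print(f"{total:,} parameters (approx.)")
```

Under these assumptions most of the weights sit in the token embedding, which is typical for a small model with a 32,768-entry vocabulary.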

Files

  • checkpoint_step_*.pt - model weights
  • tokenizer/ - BPE tokenizer trained on the same data
  • config.json - model hyperparameters
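
Since several checkpoint_step_*.pt files may be present, a small helper can pick the most recent one. This is a minimal sketch that assumes the `*` in the filename is an integer training step; `latest_checkpoint` is a hypothetical helper, not part of the repository.

```python
import re
from pathlib import Path


def latest_checkpoint(directory="."):
    """Return the checkpoint_step_<N>.pt path with the largest N, or None."""
    # Assumption: the wildcard part of the filename is an integer step count.
    pattern = re.compile(r"checkpoint_step_(\d+)\.pt$")
    best = None
    for path in Path(directory).glob("checkpoint_step_*.pt"):
        match = pattern.match(path.name)
        if match is None:
            continue  # skip names whose suffix is not a plain integer
        step = int(match.group(1))
        if best is None or step > best[0]:
            best = (step, path)
    return None if best is None else best[1]
```

Comparing the parsed integers avoids the lexicographic trap where checkpoint_step_900.pt would sort after checkpoint_step_2000.pt.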

Load

import torch
import json
from tokenizers import Tokenizer

# Load tokenizer
tok = Tokenizer.from_file("tokenizer/tokenizer.json")

# Load model (requires train.py in the same directory)
from train import TinyGPT, GPTConfig
cfg = GPTConfig(
    vocab_size=32768,
    ctx_len=128,
    n_layer=4,
    n_head=4,
    n_embd=384,
    attention_backend="torch",
)
model = TinyGPT(cfg)
# Point this path at the checkpoint file you downloaded (see Files above)
ckpt = torch.load("latest.pt", map_location="cpu")
model.load_state_dict(ckpt["model"])  # weights are stored under the "model" key
model.eval()  # inference mode
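
The hyperparameters in the GPTConfig call above could also be read from config.json instead of being hard-coded. The sketch below assumes config.json uses the same key names as the GPTConfig arguments, which the source does not confirm; `load_config_kwargs` and `EXPECTED_KEYS` are hypothetical names introduced for illustration.

```python
import json

# Field names copied from the GPTConfig call in the Load section.
# Assumption: config.json stores the same keys (not verified).
EXPECTED_KEYS = (
    "vocab_size", "ctx_len", "n_layer", "n_head", "n_embd", "attention_backend",
)


def load_config_kwargs(path="config.json"):
    """Read config.json and return only the keys GPTConfig is known to take."""
    with open(path, "r", encoding="utf-8") as f:
        raw = json.load(f)
    missing = [k for k in EXPECTED_KEYS if k not in raw]
    if missing:
        raise KeyError(f"config.json is missing keys: {missing}")
    # Drop any extra entries so they are not passed to the constructor.
    return {k: raw[k] for k in EXPECTED_KEYS}


# Usage (requires train.py in the same directory):
# cfg = GPTConfig(**load_config_kwargs("config.json"))
```

Failing loudly on missing keys makes a mismatch between config.json and the checkpoint visible before `load_state_dict` reports a shape error.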