comphys (Computational Physicist)


just published a short article about something that bit me hard while porting PI05’s subtask prediction to PyTorch: left vs right alignment in transformer padding.
turns out the JAX code (what Physical Intelligence used) and Hugging Face use opposite padding conventions, and if you don’t catch it, your model silently produces nonsense instead of crashing. no NaN, no error, just garbled subtasks 🤡
i walk through the full tensor pipeline — images → embeddings → pad masks → attention masks → position IDs — and show exactly where the mismatch corrupts everything. also included the implementation file with the fix.
if you’ve ever ported a model between frameworks or messed with custom attention patterns, i think you will enjoy it
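
to make the mismatch concrete, here is a minimal sketch in plain Python (no framework, so it runs anywhere). it is not the code from the article: `pad`, `position_ids`, and `last_real_index` are hypothetical helpers, the token values are made up, and it makes no claim about which side either codebase actually pads on. it only shows how the same tokens give different pad masks, position IDs, and "last token" slots under the two conventions.

```python
def pad(tokens, length, side, pad_id=0):
    """Pad a token list to `length` on one side; return (ids, pad_mask).

    pad_mask is 1 for real tokens and 0 for padding.
    """
    n_pad = length - len(tokens)
    if n_pad < 0:
        raise ValueError("sequence is longer than the target length")
    if side == "right":
        return tokens + [pad_id] * n_pad, [1] * len(tokens) + [0] * n_pad
    if side == "left":
        return [pad_id] * n_pad + tokens, [0] * n_pad + [1] * len(tokens)
    raise ValueError(f"unknown padding side: {side}")


def position_ids(pad_mask):
    """Positions from a running count of real tokens (pads are set to 0)."""
    ids, count = [], 0
    for m in pad_mask:
        count += m
        ids.append(count - 1 if m else 0)
    return ids


def last_real_index(pad_mask):
    """Index of the last real token, wherever the padding sits."""
    return max(i for i, m in enumerate(pad_mask) if m)


tokens = [11, 12, 13]  # made-up token ids
right_ids, right_mask = pad(tokens, 5, "right")
left_ids, left_mask = pad(tokens, 5, "left")

print(right_ids, right_mask, position_ids(right_mask))
print(left_ids, left_mask, position_ids(left_mask))

# Code that assumes "the last slot is the last real token" only
# holds for left padding; with right padding that slot is a pad.
print(right_ids[-1], left_ids[-1])
```

nothing here raises an error under either convention, which is the point: code that reads slot `-1`, or that builds positions with a plain `range(len(ids))`, runs fine on both layouts and is only correct for one of them. deriving positions and the last-token index from the pad mask, as above, gives the same answer for the real tokens either way.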
