jorgemunozl
posted an update 5 months ago
just published a short article about something that bit me hard while porting PI05's subtask prediction to PyTorch: left vs right alignment in transformer padding.
turns out JAX (what Physical Intelligence used) and Hugging Face use opposite padding conventions, and if you don't catch it, your model silently produces nonsense instead of crashing. no NaN, no error, just garbled subtasks.
i walk through the full tensor pipeline (images → embeddings → pad masks → attention masks → position IDs) and show exactly where the mismatch corrupts everything. also included the implementation file with the fix.
if you've ever ported a model between frameworks or messed with custom attention patterns, i think you will enjoy it
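
to make the failure mode concrete, here is a tiny framework-free sketch (plain Python lists, not the code from the article or from either library). every name in it (`pad`, `position_ids_from_mask`, `naive_position_ids`) is made up for illustration. it assumes a pad mask where 1 marks a real token and 0 marks padding, and compares position IDs derived from the mask against position IDs that just count along the padded length.

```python
# toy sketch: how padding side changes position ids.
# all helper names are hypothetical; nothing here is the PI05 or Hugging Face API.

def pad(tokens, length, side, pad_id=0):
    """Pad a token list to `length` on `side`; return (ids, mask), mask 1 = real token."""
    n_pad = length - len(tokens)
    if n_pad < 0:
        raise ValueError("sequence longer than target length")
    if side == "right":
        return tokens + [pad_id] * n_pad, [1] * len(tokens) + [0] * n_pad
    if side == "left":
        return [pad_id] * n_pad + tokens, [0] * n_pad + [1] * len(tokens)
    raise ValueError(f"unknown side: {side}")


def position_ids_from_mask(mask):
    """Running count of real tokens minus one, clamped at 0 for leading pads."""
    ids, running = [], 0
    for m in mask:
        running += m
        ids.append(max(running - 1, 0))
    return ids


def naive_position_ids(mask):
    """Count along the padded length; ignores where the padding sits."""
    return list(range(len(mask)))


tokens = [11, 12, 13]
for side in ("right", "left"):
    ids, mask = pad(tokens, 5, side)
    print(side, ids, mask, position_ids_from_mask(mask), naive_position_ids(mask))
```

with the mask-based version the three real tokens get positions 0, 1, 2 on either side. with the naive version they get 0, 1, 2 when padded on the right but 2, 3, 4 when padded on the left, and nothing raises an error. that is the kind of silent shift the post is describing, under the assumptions above.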