Stas Bekman
stas
AI & ML interests
Toolmaker. Software creator, optimizer and harmonizer.
Makes things work and fly at Snowflake AI Research
Training LLM/RAG/Generative AI/Machine Learning/Scalability
Recent Activity
updated a model about 1 month ago
stas/ml-engineering-book
posted an update about 1 month ago
Good news! Ulysses Sequence Parallelism, from the Snowflake AI Research and DeepSpeed teams, has been integrated into HuggingFace Trainer, Accelerate and TRL.
For extensive details please see this writeup:
https://huggingface.co/blog/ulysses-sp
Thanks a lot to Kashif Rasul for helping make it happen, and to the others on the HF team who helped with the integration.
published an article about 1 month ago
Ulysses Sequence Parallelism: Training with Million-Token Contexts